Provider pricing guide
Gemini API Pricing Calculator & Flash vs Pro Cost Guide
Gemini Flash and Pro can have materially different input, output, long-context, cache, and batch rates. Compare the same workload before selecting a model tier.
| Provider / model | Input | Output | Cached / batch | Unit price | Example cost | Verified / source |
|---|---|---|---|---|---|---|
| Google Gemini: Gemini 2.5 Flash (cheapest) | $0.30 / 1M | $2.50 / 1M | $0.03 cache | — | $2.80 | Verified Jun 21, 2026; Google Gemini API pricing |
| Google Gemini: Gemini 2.5 Pro (highest) | $1.25 / 1M | $10.00 / 1M | $0.125 cache | — | $11.25 | Verified Jun 21, 2026; Google Gemini API pricing |
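The Example cost figures are consistent with a workload of 1M input tokens plus 1M output tokens at the standard rates, although the page does not state that basis. A minimal Python sketch under that assumption follows; the `RATES` keys and the `token_cost` helper are illustrative names, not part of any Google or TokenMath API.

```python
# Standard per-1M-token rates in USD, copied from the table above.
# The dictionary keys are illustrative labels, not official model IDs.
RATES = {
    "gemini-2.5-flash": {"input": 0.30, "output": 2.50},
    "gemini-2.5-pro": {"input": 1.25, "output": 10.00},
}


def token_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one workload at standard (uncached) rates."""
    rate = RATES[model]
    total = input_tokens * rate["input"] + output_tokens * rate["output"]
    return total / 1_000_000  # rates are quoted per 1M tokens


# Assumed basis for the Example cost column: 1M input + 1M output tokens.
for model in RATES:
    print(f"{model}: ${token_cost(model, 1_000_000, 1_000_000):.2f}")
```

Under this assumption the sketch reproduces the $2.80 and $11.25 shown in the table.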
Pricing FAQ
Is Gemini Flash cheaper than Gemini Pro?
For the catalog rates shown here, Flash has lower standard token rates. Capability, latency, long-context usage, and quality requirements still determine the practical choice.
Does Gemini long-context pricing differ?
Some Gemini models publish higher rates after an input-token threshold. TokenMath shows and applies the long-context tier when it exists in the pricing record.
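As a rough illustration of how a long-context tier can change a request's cost, the Python sketch below switches to a higher rate once the prompt exceeds a threshold. The threshold and the long-context rates are placeholder values chosen for the example, not figures from this page's pricing record. The sketch also assumes the higher rate applies to the whole request rather than only to the tokens above the threshold, so check the provider's pricing page for the actual rule. `TieredRate` and `tiered_cost` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TieredRate:
    """Per-1M-token USD rates with an optional long-context tier."""
    threshold_tokens: int   # prompts larger than this use the long rates
    input_standard: float
    input_long: float
    output_standard: float
    output_long: float


def tiered_cost(rate: TieredRate, input_tokens: int, output_tokens: int) -> float:
    """Cost of one request, assuming the tier applies to the whole request."""
    is_long = input_tokens > rate.threshold_tokens
    in_rate = rate.input_long if is_long else rate.input_standard
    out_rate = rate.output_long if is_long else rate.output_standard
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


# Placeholder numbers for illustration only (not from the pricing table).
example = TieredRate(
    threshold_tokens=200_000,
    input_standard=1.25, input_long=2.50,
    output_standard=10.00, output_long=15.00,
)

print(f"{tiered_cost(example, 100_000, 1_000):.3f}")  # below the threshold
print(f"{tiered_cost(example, 300_000, 1_000):.3f}")  # above the threshold
```

With these placeholder rates, tripling the prompt from 100,000 to 300,000 tokens raises the cost by more than a factor of three, because the request crosses into the higher tier.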
Are Gemini API prices live on this page?
No. This page uses a versioned snapshot with a last-verified date and a direct Google source link.
Related comparisons
Gemini Flash vs Pro API Cost Comparison
Compare Gemini Flash vs Pro API pricing for 1,000 input tokens, 300 output tokens, and 10,000 monthly requests.
Gemini vs Claude API Cost Comparison
Compare representative Gemini and Claude API model costs using a consistent token workload and monthly request count.
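The Flash vs Pro workload named above (1,000 input tokens, 300 output tokens, 10,000 monthly requests) can be estimated by hand. The Python sketch below uses the standard rates from the pricing table and assumes no caching, batch discount, or long-context tier; `monthly_cost` is a hypothetical helper name.

```python
def monthly_cost(input_rate: float, output_rate: float,
                 input_tokens: int, output_tokens: int,
                 requests: int) -> float:
    """Monthly USD cost; rates are USD per 1M tokens, tokens are per request."""
    per_request = (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000
    return per_request * requests


# Workload from the Flash vs Pro comparison description.
INPUT_TOKENS, OUTPUT_TOKENS, REQUESTS = 1_000, 300, 10_000

flash = monthly_cost(0.30, 2.50, INPUT_TOKENS, OUTPUT_TOKENS, REQUESTS)
pro = monthly_cost(1.25, 10.00, INPUT_TOKENS, OUTPUT_TOKENS, REQUESTS)

print(f"Flash: ${flash:.2f} / month")
print(f"Pro:   ${pro:.2f} / month")
print(f"Pro costs about {pro / flash:.1f}x Flash for this workload")
```

At these rates the workload comes to roughly $10.50 per month on Flash and $42.50 on Pro, so the gap is small in absolute terms at this volume but scales linearly with request count.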
Related glossary terms
Input tokens
Input tokens are the tokenized units sent to a model, including instructions, user content, conversation history, retrieved context, and tool definitions.
Output tokens
Output tokens are the tokenized units generated by a language model, including visible responses and any billable reasoning or thinking tokens defined by the provider.
Cached tokens
Cached tokens are repeated prompt tokens that qualify for a provider's prompt-caching mechanism and may use a separate cache-hit rate.
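To show how a cache-hit rate changes input cost, the Python sketch below bills cached prompt tokens at the cache rate and the remainder at the standard input rate. It assumes the "$0.03 cache" figure in the pricing table is a per-1M-token rate for cached input, and it ignores any storage or cache-write charges a provider may add. `input_cost_with_cache` is a hypothetical helper name.

```python
def input_cost_with_cache(input_tokens: int, cached_tokens: int,
                          input_rate: float, cache_rate: float) -> float:
    """USD input cost when part of the prompt is served from cache.

    Rates are USD per 1M tokens. cached_tokens must not exceed input_tokens.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached_tokens = input_tokens - cached_tokens
    return (uncached_tokens * input_rate + cached_tokens * cache_rate) / 1_000_000


# Gemini 2.5 Flash rates from the table: $0.30 input, $0.03 cache (assumed per 1M).
no_cache = input_cost_with_cache(1_000, 0, 0.30, 0.03)
mostly_cached = input_cost_with_cache(1_000, 800, 0.30, 0.03)

print(f"No cache hits:     ${no_cache:.6f}")
print(f"800 tokens cached: ${mostly_cached:.6f}")
```

For a 1,000-token prompt with 800 cached tokens, the input cost under these assumptions drops from $0.000300 to $0.000084.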