Provider pricing guide

Gemini API Pricing Calculator & Flash vs Pro Cost Guide

Gemini Flash and Pro can have materially different input, output, long-context, cache, and batch rates. Compare the same workload before selecting a model tier.

Example comparison basis: 1M input tokens + 1M output tokens. This is a rate comparison, not a quality or capability ranking.

AI model pricing comparison filtered by provider and modality
Provider / model	Input	Output	Cached / batch	Example cost (1M in + 1M out)	Verified / source
Google Gemini 2.5 Flash (cheapest)	$0.30 / 1M	$2.50 / 1M	$0.03 / 1M (cache)	$2.80	Verified Jun 21, 2026; Google Gemini API pricing
Google Gemini 2.5 Pro (highest)	$1.25 / 1M	$10.00 / 1M	$0.125 / 1M (cache)	$11.25	Verified Jun 21, 2026; Google Gemini API pricing
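
The example cost column can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the dictionary keys and the function name `example_cost` are hypothetical labels, and the rates are the standard per-1M-token rates copied from the table above.

```python
# Standard rates in USD per 1M tokens, copied from the table above.
# The dictionary keys are illustrative labels, not official model IDs.
RATES = {
    "gemini-2.5-flash": {"input": 0.30, "output": 2.50},
    "gemini-2.5-pro": {"input": 1.25, "output": 10.00},
}

TOKENS_PER_UNIT = 1_000_000  # rates are quoted per 1M tokens


def example_cost(model: str,
                 input_tokens: int = 1_000_000,
                 output_tokens: int = 1_000_000) -> float:
    """Return the USD cost of a workload at the standard token rates."""
    rate = RATES[model]
    cost = (input_tokens / TOKENS_PER_UNIT) * rate["input"] \
        + (output_tokens / TOKENS_PER_UNIT) * rate["output"]
    return round(cost, 6)


if __name__ == "__main__":
    print(example_cost("gemini-2.5-flash"))  # 2.8
    print(example_cost("gemini-2.5-pro"))    # 11.25
```

With the default 1M input + 1M output basis, this matches the $2.80 and $11.25 figures in the table. Cache, batch, and long-context rates are deliberately left out.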

Pricing FAQ

Is Gemini Flash cheaper than Gemini Pro?

For the catalog rates shown here, Flash has lower standard token rates. Capability, latency, long-context usage, and quality requirements still determine the practical choice.

Does Gemini long-context pricing differ?

Some Gemini models publish higher rates after an input-token threshold. TokenMath shows and applies the long-context tier when it exists in the pricing record.
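
One way a long-context tier might be applied is sketched below. Everything here is an assumption for illustration: the `PricingRecord` fields, the `request_cost` function, the threshold, and the rates are placeholders, not published Google rates or TokenMath's actual implementation. The sketch also assumes that the whole request is billed at the long-context rates once the input exceeds the threshold; check the provider's pricing page for the real rule.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PricingRecord:
    """Hypothetical pricing record; rates are USD per 1M tokens."""
    input_rate: float
    output_rate: float
    # Long-context fields are optional: not every model publishes a tier.
    long_context_threshold: Optional[int] = None
    long_input_rate: Optional[float] = None
    long_output_rate: Optional[float] = None


def request_cost(record: PricingRecord,
                 input_tokens: int,
                 output_tokens: int) -> float:
    """Cost of one request, using the long-context tier only if it exists.

    Assumption: when input_tokens exceeds the threshold, the entire
    request is billed at the long-context rates.
    """
    use_long = (
        record.long_context_threshold is not None
        and input_tokens > record.long_context_threshold
    )
    in_rate = record.long_input_rate if use_long else record.input_rate
    out_rate = record.long_output_rate if use_long else record.output_rate
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


if __name__ == "__main__":
    # Placeholder numbers chosen for easy arithmetic, not real prices.
    record = PricingRecord(
        input_rate=1.0, output_rate=4.0,
        long_context_threshold=100_000,
        long_input_rate=2.0, long_output_rate=6.0,
    )
    print(request_cost(record, 50_000, 1_000))   # standard tier
    print(request_cost(record, 150_000, 1_000))  # long-context tier
```

A record without a threshold simply falls through to the standard rates, which mirrors the "when it exists in the pricing record" behavior described above.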

Are Gemini API prices live on this page?

No. This page uses a versioned snapshot with a last-verified date and a direct Google source link.

Related comparisons

Gemini Flash vs Pro API Cost Comparison

Compare Gemini Flash vs Pro API pricing for 1,000 input tokens, 300 output tokens, and 10,000 monthly requests.
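
As a rough sketch of this workload, the arithmetic can be written out directly. It assumes the standard catalog rates from the table on this page apply to every request, with no caching, batch discount, or long-context tier; the function name `monthly_cost` and the dictionary keys are hypothetical.

```python
# Standard rates in USD per 1M tokens (input, output), taken from the
# table on this page. Keys are illustrative labels, not official model IDs.
RATES = {
    "flash": (0.30, 2.50),
    "pro": (1.25, 10.00),
}


def monthly_cost(model: str,
                 input_tokens: int = 1_000,
                 output_tokens: int = 300,
                 requests_per_month: int = 10_000) -> float:
    """Monthly USD cost for identical requests at standard rates."""
    input_rate, output_rate = RATES[model]
    per_request = (input_tokens * input_rate
                   + output_tokens * output_rate) / 1_000_000
    return round(per_request * requests_per_month, 2)


if __name__ == "__main__":
    for name in RATES:
        print(name, monthly_cost(name))
```

Under these assumptions the workload comes to about $10.50 per month on Flash and $42.50 on Pro.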

Gemini vs Claude API Cost Comparison

Compare representative Gemini and Claude API model costs using a consistent token workload and monthly request count.

Related glossary terms

Input tokens

Input tokens are the tokenized units sent to a model, including instructions, user content, conversation history, retrieved context, and tool definitions.

Output tokens

Output tokens are the tokenized units generated by a language model, including visible responses and any billable reasoning or thinking tokens defined by the provider.

Cached tokens

Cached tokens are repeated prompt tokens that qualify for a provider's prompt-caching mechanism and may use a separate cache-hit rate.
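
The effect of a separate cache-hit rate on input cost can be sketched as follows. This is a simplified illustration, not a provider billing formula: it assumes cached tokens are billed at the cache rate instead of the standard input rate, that both rates are per 1M tokens, and it ignores any cache storage or write fees. The function name `input_cost` is hypothetical.

```python
def input_cost(total_input_tokens: int,
               cached_tokens: int,
               input_rate: float,
               cache_rate: float) -> float:
    """USD input cost when part of the prompt is served from cache.

    Rates are USD per 1M tokens. Assumption: cached tokens replace,
    rather than add to, the standard input charge for those tokens.
    """
    if not 0 <= cached_tokens <= total_input_tokens:
        raise ValueError("cached_tokens must be between 0 and total_input_tokens")
    uncached_tokens = total_input_tokens - cached_tokens
    return (uncached_tokens * input_rate
            + cached_tokens * cache_rate) / 1_000_000


if __name__ == "__main__":
    # Flash rates from the table on this page: $0.30 input, $0.03 cache.
    no_cache = input_cost(1_000_000, 0, 0.30, 0.03)
    mostly_cached = input_cost(1_000_000, 800_000, 0.30, 0.03)
    print(f"no cache:   ${no_cache:.3f}")
    print(f"80% cached: ${mostly_cached:.3f}")
```

With the Flash rates listed on this page, an 80% cache-hit share lowers the cost of 1M input tokens from about $0.30 to about $0.084 under these assumptions.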
