

Gemini Flash vs Pro API Cost Comparison

Gemini Flash and Pro are priced differently, both at standard rates and at long-context rates. This page prices the same short-context workload on both models; it does not evaluate capability or quality.

Standard workload comparison

(1,000 input tokens + 300 output tokens) per request × 10,000 requests per month, with no cache or batch discount.

Monthly cost difference: $32.00 ($42.50 for Gemini 2.5 Pro vs. $10.50 for Gemini 2.5 Flash)
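The arithmetic behind the monthly figures can be reproduced with a short sketch. It assumes simple per-token billing at the listed input and output rates, with no cache, batch, or long-context adjustments. The dictionary keys and the `monthly_cost` helper are illustrative names chosen for this example and are not part of any Google API.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)

# USD per 1M tokens, copied from the rate table on this page.
# The keys are illustrative labels, not identifiers required by any API.
RATES = {
    "gemini-2.5-flash": {"input": Decimal("0.30"), "output": Decimal("2.50")},
    "gemini-2.5-pro": {"input": Decimal("1.25"), "output": Decimal("10.00")},
}


def monthly_cost(model, input_tokens, output_tokens, requests):
    """Return the monthly cost in USD for a fixed per-request token profile."""
    rate = RATES[model]
    total_input = Decimal(input_tokens) * requests
    total_output = Decimal(output_tokens) * requests
    billed = total_input * rate["input"] + total_output * rate["output"]
    return billed / TOKENS_PER_MILLION


# Standard workload: 1,000 input + 300 output tokens, 10,000 requests per month.
flash = monthly_cost("gemini-2.5-flash", 1_000, 300, 10_000)
pro = monthly_cost("gemini-2.5-pro", 1_000, 300, 10_000)
difference = pro - flash

print(f"Flash: ${flash}  Pro: ${pro}  Difference: ${difference}")
```

Under these assumptions the results equal the page's $10.50, $42.50, and $32.00. `Decimal` is used instead of floats so that the currency values compare exactly.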

Compared model rates and standard workload costs

| Provider / model | Input / 1M | Output / 1M | Cached input / 1M | Monthly example | Verification |
| --- | --- | --- | --- | --- | --- |
| Google Gemini / Gemini 2.5 Flash | $0.30 | $2.50 | $0.03 | $10.50 (lowest cost) | Verified Jun 21, 2026 (Google Gemini API pricing) |
| Google Gemini / Gemini 2.5 Pro | $1.25 | $10.00 | $0.125 | $42.50 | Verified Jun 21, 2026 (Google Gemini API pricing) |
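The standard workload excludes caching, but the cached input rates above can be folded into the same calculation. The sketch below is a rough model only: it assumes a fixed fraction of input tokens is billed at the cached rate and ignores any cache storage or other fees. The `cached_fraction` parameter, its 0.5 value, and the function name are hypothetical choices for illustration.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def monthly_cost_with_cache(input_rate, cached_rate, output_rate,
                            input_tokens, output_tokens, requests,
                            cached_fraction):
    """Monthly USD cost when a share of input tokens bills at the cached rate.

    Assumption: cached_fraction (0 to 1) of all input tokens are cache hits,
    and no cache storage fee applies.
    """
    if not Decimal(0) <= cached_fraction <= Decimal(1):
        raise ValueError("cached_fraction must be between 0 and 1")
    total_input = Decimal(input_tokens) * requests
    cached_input = total_input * cached_fraction
    fresh_input = total_input - cached_input
    total_output = Decimal(output_tokens) * requests
    billed = (fresh_input * input_rate
              + cached_input * cached_rate
              + total_output * output_rate)
    return billed / TOKENS_PER_MILLION


# Rates (USD per 1M tokens) from the table; the 50% hit rate is hypothetical.
flash_cached = monthly_cost_with_cache(
    Decimal("0.30"), Decimal("0.03"), Decimal("2.50"),
    1_000, 300, 10_000, Decimal("0.5"))
pro_cached = monthly_cost_with_cache(
    Decimal("1.25"), Decimal("0.125"), Decimal("10.00"),
    1_000, 300, 10_000, Decimal("0.5"))

print(f"Flash with cache: ${flash_cached}  Pro with cache: ${pro_cached}")
```

With `cached_fraction` set to 0, the function reduces to the uncached monthly example in the table. Because output tokens dominate this workload's cost, caching input changes the totals less than the input rate gap alone might suggest.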

When each option may fit

These are decision prompts, not quality rankings. Validate capability, latency, context limits, rate limits, and reliability with your own evaluation set.

When Gemini Flash may fit

  • High request volume makes unit cost a primary constraint.
  • Evaluation results show sufficient quality for the task.
  • Interactive latency and throughput matter.

When Gemini Pro may fit

  • The task benefits from a higher-capability tier.
  • Quality improvements reduce retries or human review.
  • You have modeled the long-context tier where applicable.

Frequently asked questions

Is Gemini Flash always cheaper than Pro?

Not necessarily. Its listed rates are lower, and so is its cost for this standard workload, but retries, context size, and the share of tasks completed successfully can change total product cost.
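One way to reason about retries is cost per successful task. The sketch below assumes each attempt succeeds independently with a fixed probability and that failed attempts are retried until one succeeds, so the expected number of attempts is 1 divided by the success rate. The success rates shown are hypothetical placeholders, not measured results; substitute figures from your own evaluation set.

```python
def cost_per_success(cost_per_request, success_rate):
    """Expected USD cost to obtain one successful result.

    Assumption: attempts are independent and retried until success,
    so expected attempts = 1 / success_rate.
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_request / success_rate


# Per-request costs derived from the monthly examples on this page
# ($10.50 and $42.50 across 10,000 requests).
FLASH_PER_REQUEST = 10.50 / 10_000
PRO_PER_REQUEST = 42.50 / 10_000

# Hypothetical success rates, for illustration only.
flash_effective = cost_per_success(FLASH_PER_REQUEST, 0.80)
pro_effective = cost_per_success(PRO_PER_REQUEST, 0.95)

# Under this model, Pro becomes cheaper per success only if Flash's success
# rate falls below this fraction of Pro's success rate.
break_even_ratio = FLASH_PER_REQUEST / PRO_PER_REQUEST

print(f"Flash per success: ${flash_effective:.6f}")
print(f"Pro per success:   ${pro_effective:.6f}")
print(f"Break-even success-rate ratio: {break_even_ratio:.3f}")
```

This model leaves out human review time, latency, and the downstream cost of a failed task, any of which can outweigh the token bill.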

Does this example use long-context pricing?

No. The standard workload uses 1,000 input tokens, which is below the catalog's long-context threshold.