AI Model Pricing Comparison

Filter versioned rates across text, embedding, image, and audio models. Every row includes its verification date, trust status, and provider source.

Consistent example workloads | Provider source links | No quality ranking implied

Example comparison basis: 1M input tokens + 1M output tokens. This is a rate comparison, not a quality or capability ranking.

AI model pricing comparison filtered by provider and modality
Provider / model	Input	Output	Cached / batch	Unit price	Example cost	Verified	Source
OpenAI GPT-4.1 mini (cheapest)	$0.40 / 1M	$1.60 / 1M	$0.10 cache	n/a	$2.00	Jun 21, 2026	OpenAI model pricing
Google Gemini Gemini 2.5 Flash	$0.30 / 1M	$2.50 / 1M	$0.03 cache	n/a	$2.80	Jun 21, 2026	Google Gemini API pricing
Anthropic Claude Haiku 4.5	$1.00 / 1M	$5.00 / 1M	$0.10 cache	n/a	$6.00	Jun 21, 2026	Anthropic Claude API pricing
OpenAI GPT-4.1	$2.00 / 1M	$8.00 / 1M	$0.50 cache	n/a	$10.00	Jun 21, 2026	OpenAI model pricing
Google Gemini Gemini 2.5 Pro	$1.25 / 1M	$10.00 / 1M	$0.125 cache	n/a	$11.25	Jun 21, 2026	Google Gemini API pricing
Anthropic Claude Sonnet 4.6	$3.00 / 1M	$15.00 / 1M	$0.30 cache	n/a	$18.00	Jun 21, 2026	Anthropic Claude API pricing
Anthropic Claude Opus 4.8 (highest)	$5.00 / 1M	$25.00 / 1M	$0.50 cache	n/a	$30.00	Jun 21, 2026	Anthropic Claude API pricing
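
The example cost column can be reproduced from the input and output rates alone. The sketch below is illustrative only: the function name example_cost and the RATES dictionary are made up for this example, the rates are copied from three rows of the table, and Decimal is used so the dollar amounts do not pick up floating-point noise.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)

# Published rates in USD per 1M tokens, copied from the table: (input, output).
RATES = {
    "GPT-4.1 mini": (Decimal("0.40"), Decimal("1.60")),
    "Gemini 2.5 Flash": (Decimal("0.30"), Decimal("2.50")),
    "Claude Haiku 4.5": (Decimal("1.00"), Decimal("5.00")),
}


def example_cost(input_rate, output_rate,
                 input_tokens=1_000_000, output_tokens=1_000_000):
    """Return the USD cost of a workload, given per-1M-token rates.

    The defaults match the page's comparison basis:
    1M input tokens plus 1M output tokens.
    """
    return (input_rate * input_tokens + output_rate * output_tokens) / MILLION


for model, (input_rate, output_rate) in RATES.items():
    print(f"{model}: ${example_cost(input_rate, output_rate):.2f}")
```

With the default 1M + 1M basis, the example cost is simply the sum of the two listed rates, which is why GPT-4.1 mini comes out at $2.00 and Gemini 2.5 Flash at $2.80.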

How to compare AI API pricing

Start with a representative workload, not a unit rate in isolation. An output-heavy assistant can reverse the ranking suggested by an input-only comparison, while caching and batch processing can materially reduce the cost of eligible workloads.
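
To see how the workload shape changes the ranking, the sketch below prices two models from the table under different input/output mixes. It rests on assumptions not confirmed by the page: the "cache" figure is treated as a USD rate per 1M cached input tokens that replaces the normal input rate, and cache-write surcharges and batch discounts are ignored. The names workload_cost and cheapest are hypothetical.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)

# USD per 1M tokens from the table: (input, output, cached input).
# Treating the "cache" column as a per-1M cached-input rate is an assumption.
RATES = {
    "GPT-4.1 mini": (Decimal("0.40"), Decimal("1.60"), Decimal("0.10")),
    "Gemini 2.5 Flash": (Decimal("0.30"), Decimal("2.50"), Decimal("0.03")),
}


def workload_cost(model, input_tokens, output_tokens, cached_tokens=0):
    """USD cost of a workload; cached_tokens is the cached share of the input."""
    input_rate, output_rate, cache_rate = RATES[model]
    fresh_tokens = input_tokens - cached_tokens
    total = (input_rate * fresh_tokens
             + cache_rate * cached_tokens
             + output_rate * output_tokens)
    return total / MILLION


def cheapest(input_tokens, output_tokens, cached_tokens=0):
    """Name of the lowest-cost model in RATES for the given workload."""
    return min(
        RATES,
        key=lambda m: workload_cost(m, input_tokens, output_tokens, cached_tokens),
    )


print(cheapest(1_000_000, 1_000_000))    # balanced workload
print(cheapest(10_000_000, 1_000_000))   # input-heavy workload
```

Under these rates, GPT-4.1 mini is cheaper on the balanced 1M + 1M workload ($2.00 against $2.80), but Gemini 2.5 Flash is cheaper once input outweighs output by more than about 9 to 1 ($5.50 against $5.60 at 10M input and 1M output).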

Frequently asked questions

What does the example cost compare?

Text model rows use one million input tokens plus one million output tokens. Embedding rows use one million input tokens, audio rows use 60 minutes, and image rows use one generated image at the listed size and quality.

Does the cheapest model provide the best value?

Not necessarily. The table compares published rates only. Model quality, latency, context limits, tool support, regional availability, and reliability can matter more than the lowest unit price.

Are these rates live?

No. TokenMath stores versioned pricing snapshots with source links and last-verified dates. Confirm the provider page before making a production commitment.
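
Because the rates are snapshots, a consumer of this data may want to flag rows whose verification date has aged past some threshold. The sketch below shows one way to do that. The PricingSnapshot fields, the is_stale helper, and the 30-day default are all hypothetical and do not describe TokenMath's actual schema or policy.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class PricingSnapshot:
    """One versioned pricing row (hypothetical shape, for illustration)."""
    model: str
    input_rate: Decimal    # USD per 1M input tokens
    output_rate: Decimal   # USD per 1M output tokens
    verified_on: date      # last-verified date shown in the table
    source: str            # label of the provider pricing page


def is_stale(snapshot, today, max_age_days=30):
    """True if the snapshot was verified more than max_age_days before today."""
    return (today - snapshot.verified_on).days > max_age_days


snapshot = PricingSnapshot(
    model="GPT-4.1 mini",
    input_rate=Decimal("0.40"),
    output_rate=Decimal("1.60"),
    verified_on=date(2026, 6, 21),
    source="OpenAI model pricing",
)

print(is_stale(snapshot, today=date(2026, 7, 1)))   # 10 days old
print(is_stale(snapshot, today=date(2026, 8, 1)))   # 41 days old
```

A stale flag is only a prompt to recheck the provider page; it says nothing about whether the rate has actually changed.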