AI Model Pricing Comparison

Filter versioned rates across text, embedding, image, and audio models. Every row includes its verification date, trust status, and provider source.

Consistent example workloads · Provider source links · No quality ranking implied
Example comparison basis: 1M input tokens + 1M output tokens. This is a rate comparison, not a quality or capability ranking.
AI model pricing comparison filtered by provider and modality
| Provider / model | Input | Output | Cached / batch | Unit price | Example cost | Verified / source |
|---|---|---|---|---|---|---|
| OpenAI GPT-4.1 mini (Cheapest) | $0.40 / 1M | $1.60 / 1M | $0.10 cache | | $2.00 | Verified Jun 21, 2026, OpenAI model pricing |
| Google Gemini, Gemini 2.5 Flash | $0.30 / 1M | $2.50 / 1M | $0.03 cache | | $2.80 | Verified Jun 21, 2026, Google Gemini API pricing |
| Anthropic Claude Haiku 4.5 | $1.00 / 1M | $5.00 / 1M | $0.10 cache | | $6.00 | Verified Jun 21, 2026, Anthropic Claude API pricing |
| OpenAI GPT-4.1 | $2.00 / 1M | $8.00 / 1M | $0.50 cache | | $10.00 | Verified Jun 21, 2026, OpenAI model pricing |
| Google Gemini, Gemini 2.5 Pro | $1.25 / 1M | $10.00 / 1M | $0.125 cache | | $11.25 | Verified Jun 21, 2026, Google Gemini API pricing |
| Anthropic Claude Sonnet 4.6 | $3.00 / 1M | $15.00 / 1M | $0.30 cache | | $18.00 | Verified Jun 21, 2026, Anthropic Claude API pricing |
| Anthropic Claude Opus 4.8 (Highest) | $5.00 / 1M | $25.00 / 1M | $0.50 cache | | $30.00 | Verified Jun 21, 2026, Anthropic Claude API pricing |

How to compare AI API pricing

Start with a representative workload, not a unit rate in isolation. Output-heavy assistants can reverse the apparent ranking from an input-only comparison, while caching and batch processing can materially reduce the cost of eligible workloads.
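As a rough illustration of that reversal, the sketch below prices a workload for three of the rows in the table. The function names, the 2M-input / 1M-output workload, and the `cached_share` parameter are hypothetical, and the code assumes the "cache" figure is a per-1M rate that replaces the normal input rate for cached input tokens. Check each provider's page for how cached and batch pricing actually applies.

```python
# Rates in USD per 1M tokens, copied from the comparison table.
# Assumption: "cached" is the per-1M rate charged for cached input tokens.
RATES = {
    "GPT-4.1 mini": {"input": 0.40, "output": 1.60, "cached": 0.10},
    "Gemini 2.5 Flash": {"input": 0.30, "output": 2.50, "cached": 0.03},
    "Claude Haiku 4.5": {"input": 1.00, "output": 5.00, "cached": 0.10},
}


def workload_cost(model, input_tokens, output_tokens, cached_share=0.0):
    """Estimate the USD cost of a workload for one model.

    cached_share is the hypothetical fraction (0.0 to 1.0) of input
    tokens billed at the cached rate instead of the normal input rate.
    """
    rate = RATES[model]
    cached_tokens = input_tokens * cached_share
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * rate["input"]
        + cached_tokens * rate["cached"]
        + output_tokens * rate["output"]
    )
    return total / 1_000_000


def rank_by_input_rate():
    """Model names ordered by input rate alone, cheapest first."""
    return sorted(RATES, key=lambda name: RATES[name]["input"])


def rank_by_workload(input_tokens, output_tokens, cached_share=0.0):
    """Model names ordered by estimated workload cost, cheapest first."""
    return sorted(
        RATES,
        key=lambda name: workload_cost(
            name, input_tokens, output_tokens, cached_share
        ),
    )


if __name__ == "__main__":
    print("By input rate:", rank_by_input_rate())
    print("By workload:  ", rank_by_workload(2_000_000, 1_000_000))
    for name in RATES:
        cost = workload_cost(name, 2_000_000, 1_000_000)
        print(f"{name}: ${cost:.2f}")
```

On input rate alone, Gemini 2.5 Flash ranks ahead of GPT-4.1 mini, but for this hypothetical workload the higher output rate puts it behind. Changing the token counts or `cached_share` shifts the result, which is why the workload should come first.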

Frequently asked questions

What does the example cost compare?

Text models use one million input tokens plus one million output tokens. Embeddings use one million input tokens, audio uses 60 minutes, and image rows use one generated image for the listed size and quality.
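For text rows, that basis means the example cost is simply the input rate plus the output rate. The short check below recomputes it for the seven text rows listed on this page; the `TEXT_ROWS` structure and function names are illustrative only and not part of any TokenMath API.

```python
# (model, input USD per 1M, output USD per 1M, example cost shown on the page)
TEXT_ROWS = [
    ("GPT-4.1 mini", 0.40, 1.60, 2.00),
    ("Gemini 2.5 Flash", 0.30, 2.50, 2.80),
    ("Claude Haiku 4.5", 1.00, 5.00, 6.00),
    ("GPT-4.1", 2.00, 8.00, 10.00),
    ("Gemini 2.5 Pro", 1.25, 10.00, 11.25),
    ("Claude Sonnet 4.6", 3.00, 15.00, 18.00),
    ("Claude Opus 4.8", 5.00, 25.00, 30.00),
]


def example_cost(input_rate, output_rate):
    """Cost of 1M input tokens plus 1M output tokens at per-1M rates."""
    return input_rate + output_rate


def find_mismatches(rows, tolerance=0.005):
    """Return the models whose listed example cost differs from the sum."""
    return [
        model
        for model, input_rate, output_rate, listed in rows
        if abs(example_cost(input_rate, output_rate) - listed) > tolerance
    ]


if __name__ == "__main__":
    mismatches = find_mismatches(TEXT_ROWS)
    print("Mismatched rows:", mismatches or "none")
```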

Does the cheapest model provide the best value?

Not necessarily. The table compares published rates only. Model quality, latency, context limits, tool support, regional availability, and reliability can matter more than the lowest unit price.

Are these rates live?

No. TokenMath stores versioned pricing snapshots with source links and last-verified dates. Confirm the provider page before making a production commitment.