Claude Sonnet vs Opus Cost: API Pricing Comparison
Claude Sonnet and Opus have different rate profiles and intended workloads. Compare them using the same token mix, then evaluate whether differences in task quality change the number of retries or model calls needed.
Output volume matters
Agentic coding and reasoning requests can produce substantial output and tool-call context. Because output has its own rate, a one-million-input-token comparison alone can understate the real difference.
Prompt caching may lower eligible repeated input, but cache writes and cache hits are different billing events. Model them separately when your application has stable prompt prefixes.
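One way to keep cache writes and cache hits separate in a forecast is sketched below. It assumes the first call writes the stable prefix to the cache and every later call reads it, with no expiry in between, which is a simplification. The function name, its parameters, and the cache rates in the usage lines are hypothetical placeholders, not published prices; only the $3.00 base input rate comes from the pricing table in this article.

```python
def cached_input_cost(
    prefix_tokens: int,
    fresh_tokens_per_call: int,
    calls: int,
    base_rate: float,         # USD per 1M uncached input tokens
    cache_write_rate: float,  # USD per 1M tokens written to the cache
    cache_read_rate: float,   # USD per 1M tokens read from the cache
) -> float:
    """Input cost in USD when call 1 writes the prefix and later calls hit it.

    Assumption: the cache entry never expires between calls.
    """
    if calls < 1:
        raise ValueError("calls must be at least 1")
    write = prefix_tokens * cache_write_rate             # billed once
    reads = prefix_tokens * cache_read_rate * (calls - 1)  # billed per hit
    fresh = fresh_tokens_per_call * base_rate * calls    # never cached
    return (write + reads + fresh) / 1_000_000


# Placeholder cache rates for illustration only; look up the real ones.
with_cache = cached_input_cost(
    prefix_tokens=20_000, fresh_tokens_per_call=500, calls=100,
    base_rate=3.00, cache_write_rate=4.00, cache_read_rate=0.50,
)
# Same traffic with no caching: every token is billed at the base rate.
without_cache = (20_000 + 500) * 100 * 3.00 / 1_000_000
print(f"with cache: ${with_cache:.2f}, without: ${without_cache:.2f}")
```

Keeping the three terms separate makes it easy to see when a prefix is reused too rarely for the write cost to pay off.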
Measure cost per successful task
A more capable model can be economical when it reduces retries, human review, or multi-step calls. Run an evaluation set and compare successful-task cost rather than assuming either tier wins.
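A minimal sketch of that comparison follows. It assumes failed attempts are simply retried until one succeeds and that attempts are independent, so the expected spend per success is the per-attempt cost divided by the success rate. The per-attempt costs use the table rates in this article for a 1,000-input, 1,000-output-token call; the success rates are invented for illustration and should come from your own evaluation set.

```python
def cost_per_successful_task(cost_per_attempt: float, success_rate: float) -> float:
    """Expected USD spent per success when failures are retried.

    Assumption: attempts are independent and retried until one succeeds.
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_attempt / success_rate


# Per-attempt cost for 1,000 input + 1,000 output tokens at the table rates.
sonnet_attempt = (1_000 * 3.00 + 1_000 * 15.00) / 1_000_000
opus_attempt = (1_000 * 5.00 + 1_000 * 25.00) / 1_000_000

# Hypothetical success rates; replace with measured values.
sonnet = cost_per_successful_task(sonnet_attempt, success_rate=0.75)
opus = cost_per_successful_task(opus_attempt, success_rate=0.95)
print(f"Sonnet: ${sonnet:.4f} per success, Opus: ${opus:.4f} per success")
```

The same function can be extended with a human-review cost per failure if review time is part of what a failed attempt costs you.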
Worked example
Claude Sonnet 4.6: 1M input tokens + 1M output tokens
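Using the versioned rates below, this example workload is estimated at $18.00: $3.00 for 1M input tokens plus $15.00 for 1M output tokens. At the Claude Opus 4.8 rates in the same table, the identical token mix comes to $30.00 ($5.00 + $25.00). These figures isolate provider usage only and do not include taxes, regional premiums, retries, storage, network traffic, or unrelated infrastructure.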
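The arithmetic behind this example can be reproduced with a short script. This is a sketch, not an official calculator: the rates are copied from the pricing table in this article, and the dictionary keys are hypothetical labels rather than API model identifiers.

```python
# USD per 1M tokens, copied from this article's pricing table
# (snapshot checked Jun 21, 2026). Verify before using in a forecast.
RATES = {
    "sonnet-4.6": {"input": 3.00, "output": 15.00},
    "opus-4.8": {"input": 5.00, "output": 25.00},
}


def usage_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return provider usage cost in USD for one token mix."""
    rate = RATES[model]
    total = input_tokens * rate["input"] + output_tokens * rate["output"]
    return total / 1_000_000


for model in RATES:
    print(model, f"${usage_cost(model, 1_000_000, 1_000_000):.2f}")
# sonnet-4.6 $18.00
# opus-4.8 $30.00
```

Changing the two token counts to match your own input and output mix shows how quickly the gap between the tiers widens as output volume grows.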
Current pricing references
These versioned records support the examples above. Check the date and provider source before using them in a production forecast.
| Provider / model | Input (per 1M tokens) | Output (per 1M tokens) | Status | Source |
|---|---|---|---|---|
| Anthropic Claude Sonnet 4.6 | $3.00 | $15.00 | Verified | Anthropic Claude API pricing (checked Jun 21, 2026) |
| Anthropic Claude Opus 4.8 | $5.00 | $25.00 | Verified | Anthropic Claude API pricing (checked Jun 21, 2026) |
Frequently asked questions
Which is cheaper, Claude Sonnet or Opus?
Sonnet has lower standard token rates in this snapshot. The better economic choice depends on output volume and success rate for your workload.
Does Claude batch pricing change the comparison?
It can for eligible asynchronous workloads. Compare both models under the same processing mode and confirm current Anthropic terms.
Related glossary terms
Input tokens
Input tokens are the tokenized units sent to a model, including instructions, user content, conversation history, retrieved context, and tool definitions.
Output tokens
Output tokens are the tokenized units generated by a language model, including visible responses and any billable reasoning or thinking tokens defined by the provider.
Cost per request
Cost per request is the sum of all billable usage generated by one API call, commonly input token cost plus output token cost for a text model.