Token planning guide

How to Calculate LLM Tokens and Estimate API Cost

Token counts drive most language-model API bills. A useful estimate separates prompt tokens from generated tokens, applies the correct model rates, and then scales one request across real traffic.

Estimate tokens before you call the API

A provider tokenizer is the source of truth, but a planning estimate of roughly four Unicode characters per token is useful for English prose. Code, tables, non-Latin languages, whitespace, and structured data can produce different ratios.

Include system instructions, retrieved context, conversation history, tool schemas, and formatting wrappers. Counting only the visible user message usually understates production input.

estimated input tokens ≈ total prompt characters ÷ 4
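
The heuristic above can be sketched in a few lines of Python. This is a planning aid under the four-characters-per-token assumption, not a tokenizer; the function name `estimate_tokens` and the sample strings are illustrative, and rounding up is a conservative choice rather than part of the formula.

```python
import math

CHARS_PER_TOKEN = 4  # planning heuristic for English prose, not a tokenizer


def estimate_tokens(*parts: str, chars_per_token: float = CHARS_PER_TOKEN) -> int:
    """Estimate input tokens from the combined character count of all prompt parts."""
    # len() counts Unicode code points, which matches the per-character heuristic.
    total_chars = sum(len(part) for part in parts)
    # Round up so a partial token is never dropped from the estimate.
    return math.ceil(total_chars / chars_per_token)


# Count every part that is actually sent, not only the visible user message.
system_prompt = "You are a helpful assistant."
user_message = "Summarize the attached report in three bullet points."
print(estimate_tokens(system_prompt, user_message))
```

Passing the system prompt, history, retrieved context, and tool schemas as separate arguments keeps the estimate aligned with what is actually sent in production.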

Convert token usage into cost

Input and output tokens usually have different rates. Calculate each side independently, add them for cost per request, then multiply by daily request volume.

A safety buffer is useful when response length and retrieved context vary. TokenMath applies the buffer before pricing so the projection remains internally consistent.

request cost = (input tokens ÷ 1,000,000 × input rate) + (output tokens ÷ 1,000,000 × output rate)
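
A minimal Python sketch of the formula follows. The function names, the 30-day month, and the way the buffer is applied (a single multiplier on both input and output tokens before pricing) are assumptions made for illustration; they are not a description of how TokenMath is implemented. The rates in the usage lines are the GPT-4.1 mini figures from the pricing references in this guide, and the token counts and request volume are made up.

```python
def request_cost(input_tokens: float, output_tokens: float,
                 input_rate: float, output_rate: float,
                 buffer: float = 0.0) -> float:
    """Return the cost of one request in dollars.

    Rates are dollars per 1M tokens. `buffer` is a fraction (0.10 = 10%)
    applied to both token counts before pricing (an assumed convention).
    """
    buffered_input = input_tokens * (1 + buffer)
    buffered_output = output_tokens * (1 + buffer)
    input_cost = buffered_input / 1_000_000 * input_rate
    output_cost = buffered_output / 1_000_000 * output_rate
    return input_cost + output_cost


def projected_cost(cost_per_request: float, requests_per_day: int,
                   days: int = 30) -> float:
    """Scale one request across traffic; a 30-day month is an assumption."""
    return cost_per_request * requests_per_day * days


# Illustrative workload: 2,000 input and 500 output tokens with a 10% buffer.
per_request = request_cost(2_000, 500, input_rate=0.40, output_rate=1.60, buffer=0.10)
print(f"per request: ${per_request:.5f}")
print(f"per 30 days at 10,000 requests/day: ${projected_cost(per_request, 10_000):.2f}")
```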

Worked example

GPT-4.1 mini: 1M input tokens + 1M output tokens

Using the versioned rates below, 1M input tokens cost $0.40 and 1M output tokens cost $1.60, so this example workload is estimated at $2.00. The figure covers provider usage only and excludes taxes, regional premiums, retries, storage, network traffic, and unrelated infrastructure.

Current pricing references

These versioned records support the examples above. Check the date and provider source before using them in a production forecast.

Provider	Model	Input (per 1M tokens)	Output (per 1M tokens)	Status	Source	Checked
OpenAI	GPT-4.1 mini	$0.40	$1.60	Verified	OpenAI model pricing	Jun 21, 2026
Google Gemini	Gemini 2.5 Flash	$0.30	$2.50	Verified	Google Gemini API pricing	Jun 21, 2026
Anthropic	Claude Sonnet 4.6	$3.00	$15.00	Verified	Anthropic Claude API pricing	Jun 21, 2026
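
To compare providers on equal terms, the same workload can be priced against each row of the table above. The sketch below hard-codes the three rate pairs from this snapshot; the dictionary name `RATES_PER_MILLION` and the helper `compare_models` are hypothetical, and the figures should be rechecked against each provider's pricing page before use in a forecast.

```python
# (input rate, output rate) in dollars per 1M tokens, copied from the table above.
RATES_PER_MILLION = {
    "GPT-4.1 mini": (0.40, 1.60),
    "Gemini 2.5 Flash": (0.30, 2.50),
    "Claude Sonnet 4.6": (3.00, 15.00),
}


def compare_models(input_tokens: float, output_tokens: float) -> dict[str, float]:
    """Price one workload against every model in the snapshot."""
    costs = {}
    for model, (input_rate, output_rate) in RATES_PER_MILLION.items():
        costs[model] = (input_tokens / 1_000_000 * input_rate
                        + output_tokens / 1_000_000 * output_rate)
    return costs


# Same workload as the worked example: 1M input tokens plus 1M output tokens.
for model, cost in compare_models(1_000_000, 1_000_000).items():
    print(f"{model}: ${cost:.2f}")
```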

Frequently asked questions

How accurate is four characters per token?

It is a planning heuristic, not a tokenizer replacement. Use provider token counts from representative production requests when precision matters.
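
One way to check the heuristic against real traffic is to compare prompt character counts with the input token counts the provider reports for the same requests. The sketch below assumes those pairs have already been collected; the function name and the sample numbers are placeholders, not measurements.

```python
def observed_chars_per_token(samples: list[tuple[int, int]]) -> float:
    """Return the measured characters-per-token ratio.

    Each sample is (prompt character count, provider-reported input tokens).
    Totals are pooled so that long requests weigh more than short ones.
    """
    total_chars = sum(chars for chars, _ in samples)
    total_tokens = sum(tokens for _, tokens in samples)
    if total_tokens == 0:
        raise ValueError("need at least one sample with a non-zero token count")
    return total_chars / total_tokens


# Placeholder pairs; replace with counts from representative production requests.
samples = [(4_000, 1_000), (2_600, 800), (1_500, 500)]
ratio = observed_chars_per_token(samples)
print(f"observed ratio: {ratio:.2f} characters per token")
```

A ratio well below 4 suggests the planning estimate is understating input tokens for that workload, which is common with code, tables, or non-Latin text.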

Do output tokens usually cost more?

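Many providers publish a higher output rate than input rate for text models, which makes response length an important cost-control variable. In the pricing references above, the output rate is between 4 and about 8.3 times the input rate.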

Related glossary terms

Input tokens

Input tokens are the tokenized units sent to a model, including instructions, user content, conversation history, retrieved context, and tool definitions.

Output tokens

Output tokens are the tokenized units generated by a language model, including visible responses and any billable reasoning or thinking tokens defined by the provider.

Cost per request

Cost per request is the sum of all billable usage generated by one API call, commonly input token cost plus output token cost for a text model.
