LLM token cost estimator
OpenAI, Gemini & Claude API Cost Calculator
Paste a representative prompt, choose a model, and estimate cost per request plus daily, monthly, and yearly spend. No login or data upload is required.
API cost estimate
Estimate: OpenAI · GPT-4.1 mini
- Total requests per day: 1,000 (100 requests × 10 active users)
- Cost per request: $0.000565
- Daily cost: $0.5652
- Monthly cost (30-day projection): $16.96
- Yearly cost (365-day projection): $206.30
Calculation basis
- Input tokens after buffer: 33
- Output tokens after buffer: 345
- Total requests per day: 1,000
- Safety buffer: 15%
- Monthly projection: 30 days
- Yearly projection: 365 days
Selected model pricing
Verified pricing snapshot

| Field | Value |
|---|---|
| Provider | OpenAI |
| Model | GPT-4.1 mini |
| Input / 1M tokens | $0.40 (applied) |
| Output / 1M tokens | $1.60 (applied) |
| Cached input / 1M tokens | $0.10 |
| Last verified | Jun 21, 2026 |
| Source | OpenAI model pricing |

Standard text-token pricing. Batch API input and output rates are listed separately and are not applied by this calculator.
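As a quick check, the headline figures can be reproduced from the buffered token counts in the calculation basis and the two applied rates in the snapshot. This is a minimal sketch of that arithmetic, not the calculator's own code; the variable names are illustrative.

```typescript
// Applied rates from the pricing snapshot (USD per 1M tokens).
const inputPricePerMillion = 0.4;
const outputPricePerMillion = 1.6;

// Buffered token counts and request volume from the calculation basis.
const bufferedInputTokens = 33;
const bufferedOutputTokens = 345;
const totalDailyRequests = 1_000;

const costPerRequest =
  (bufferedInputTokens / 1_000_000) * inputPricePerMillion +
  (bufferedOutputTokens / 1_000_000) * outputPricePerMillion;
const dailyCost = costPerRequest * totalDailyRequests;
const monthlyCost = dailyCost * 30;
const yearlyCost = dailyCost * 365;

console.log(costPerRequest.toFixed(6)); // 0.000565
console.log(dailyCost.toFixed(4)); // 0.5652
console.log(monthlyCost.toFixed(2)); // 16.96
console.log(yearlyCost.toFixed(2)); // 206.30
```

Output tokens account for nearly all of the per-request cost here: 345 output tokens at $1.60 per 1M outweigh 33 input tokens at $0.40 per 1M by roughly 40 to 1.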
Model cost comparison preview
The same buffered workload and request volume applied to other available text models.
| Provider | Model | Monthly estimate | Versus selected |
|---|---|---|---|
| OpenAI | GPT-4.1 mini (selected, cheapest) | $16.96 | Baseline |
| Gemini | Gemini 2.5 Flash | $26.17 | +$9.22 |
| Anthropic | Claude Haiku 4.5 | $52.74 | +$35.78 |
| OpenAI | GPT-4.1 | $84.78 | +$67.82 |
| Gemini | Gemini 2.5 Pro | $104.74 | +$87.78 |
Comparison estimates use standard synchronous token pricing only. Cached input, batch processing, platform fees, and negotiated discounts are not applied.
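The comparison can be sketched as one workload priced against a list of per-model rates. Only the GPT-4.1 mini rates are shown on this page; the other rates below are assumptions chosen because they reproduce the monthly figures in the table, and they should be verified against each provider before use. The `ModelRate` type and both function names are hypothetical.

```typescript
type ModelRate = {
  provider: string;
  model: string;
  inputPricePerMillion: number; // USD per 1M input tokens
  outputPricePerMillion: number; // USD per 1M output tokens
};

// Assumed standard rates. Only the GPT-4.1 mini row comes from this page.
const rates: ModelRate[] = [
  { provider: "OpenAI", model: "GPT-4.1 mini", inputPricePerMillion: 0.4, outputPricePerMillion: 1.6 },
  { provider: "Gemini", model: "Gemini 2.5 Flash", inputPricePerMillion: 0.3, outputPricePerMillion: 2.5 },
  { provider: "Anthropic", model: "Claude Haiku 4.5", inputPricePerMillion: 1.0, outputPricePerMillion: 5.0 },
  { provider: "OpenAI", model: "GPT-4.1", inputPricePerMillion: 2.0, outputPricePerMillion: 8.0 },
  { provider: "Gemini", model: "Gemini 2.5 Pro", inputPricePerMillion: 1.25, outputPricePerMillion: 10.0 },
];

// 30-day cost for already-buffered token counts at one model's rates.
function monthlyEstimate(
  rate: ModelRate,
  bufferedInputTokens: number,
  bufferedOutputTokens: number,
  totalDailyRequests: number,
): number {
  const perRequest =
    (bufferedInputTokens / 1_000_000) * rate.inputPricePerMillion +
    (bufferedOutputTokens / 1_000_000) * rate.outputPricePerMillion;
  return perRequest * totalDailyRequests * 30;
}

// Prices every model for the same workload, cheapest first, with the
// difference from the selected model's monthly estimate.
function compareModels(
  selectedModel: string,
  bufferedInputTokens: number,
  bufferedOutputTokens: number,
  totalDailyRequests: number,
) {
  const rows = rates.map((rate) => ({
    provider: rate.provider,
    model: rate.model,
    monthly: monthlyEstimate(rate, bufferedInputTokens, bufferedOutputTokens, totalDailyRequests),
  }));
  const baseline = rows.find((row) => row.model === selectedModel);
  if (!baseline) throw new Error(`Unknown model: ${selectedModel}`);
  const baselineMonthly = baseline.monthly;
  return rows
    .map((row) => ({ ...row, versusSelected: row.monthly - baselineMonthly }))
    .sort((a, b) => a.monthly - b.monthly);
}

for (const row of compareModels("GPT-4.1 mini", 33, 345, 1_000)) {
  console.log(row.model, row.monthly.toFixed(2), row.versusSelected.toFixed(2));
}
```

Differences are taken from unrounded values and rounded only for display, which is why a row can show +$9.22 even though the rounded monthly figures differ by $9.21.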
Formula
How the API cost estimate is calculated
The safety buffer is applied before input and output token costs are calculated. The combined request cost is then multiplied by requests per active user and active users per day.
Total daily requests = requests per active user per day × active users per day
Request estimate = buffered input cost + buffered output cost
Monthly estimate = request estimate × total daily requests × 30

    // Workload inputs, before the safety buffer is applied.
    type Workload = {
      inputTokens: number;
      outputTokens: number;
      requestsPerActiveUserPerDay: number;
      activeUsersPerDay: number;
      safetyBufferPercent: number; // e.g. 15 for a 15% buffer
    };

    // Standard synchronous rates in USD per 1M tokens.
    type TokenPricing = {
      inputPricePerMillion: number;
      outputPricePerMillion: number;
    };

    export function estimateApiCost(
      workload: Workload,
      pricing: TokenPricing,
    ) {
      // Scale by (100 + percent) / 100 instead of a 1.x multiplier so that
      // floating-point error (100 * 1.1 is 110.00000000000001) cannot make
      // Math.ceil add a spurious token.
      const bufferScale = 100 + workload.safetyBufferPercent;
      const bufferedInputTokens = Math.ceil(
        (workload.inputTokens * bufferScale) / 100,
      );
      const bufferedOutputTokens = Math.ceil(
        (workload.outputTokens * bufferScale) / 100,
      );

      // Price the buffered token counts at the per-1M-token rates.
      const inputCost =
        (bufferedInputTokens / 1_000_000) *
        pricing.inputPricePerMillion;
      const outputCost =
        (bufferedOutputTokens / 1_000_000) *
        pricing.outputPricePerMillion;
      const costPerRequest = inputCost + outputCost;

      // Request and user counts are truncated to whole numbers.
      const totalDailyRequests =
        Math.floor(workload.requestsPerActiveUserPerDay) *
        Math.floor(workload.activeUsersPerDay);
      const dailyCost = costPerRequest * totalDailyRequests;

      return {
        bufferedInputTokens,
        bufferedOutputTokens,
        totalDailyRequests,
        costPerRequest,
        dailyCost,
        monthlyCost: dailyCost * 30,
        yearlyCost: dailyCost * 365,
      };
    }

Example API cost estimate
Suppose a support assistant sends about 1,000 input tokens and generates 300 output tokens. At 500 requests per day, even a small difference in output pricing compounds across a month.
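To make the compounding concrete, the sketch below prices that workload twice: once at the GPT-4.1 mini rates shown on this page, and once with a hypothetical output rate that is $0.40 higher per 1M tokens. It assumes no safety buffer, and `monthlySpend` is an illustrative helper rather than part of the calculator.

```typescript
// Monthly spend for a fixed workload at the given per-1M-token rates (USD).
function monthlySpend(
  inputTokens: number,
  outputTokens: number,
  requestsPerDay: number,
  inputPricePerMillion: number,
  outputPricePerMillion: number,
): number {
  const perRequest =
    (inputTokens / 1_000_000) * inputPricePerMillion +
    (outputTokens / 1_000_000) * outputPricePerMillion;
  return perRequest * requestsPerDay * 30;
}

// Input and output rates from the GPT-4.1 mini snapshot on this page.
const atSnapshotRates = monthlySpend(1_000, 300, 500, 0.4, 1.6);
// Hypothetical: same input rate, output rate $0.40 higher per 1M tokens.
const atHigherOutputRate = monthlySpend(1_000, 300, 500, 0.4, 2.0);

console.log(atSnapshotRates.toFixed(2)); // 13.20
console.log(atHigherOutputRate.toFixed(2)); // 15.00
console.log((atHigherOutputRate - atSnapshotRates).toFixed(2)); // 1.80
```

The workload generates 4.5 million output tokens over 30 days (300 × 500 × 30), so each $0.10 change in the output rate per 1M tokens moves the monthly total by $0.45.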
Use a representative production prompt, include system instructions in the pasted text, and set output tokens near your observed average. Add a safety buffer when traffic or response length is uncertain.
What this estimate includes
- Approximate prompt tokens from pasted text
- Manually specified response tokens
- Separate model input and output rates
- Requests per active user multiplied by active users per day
- Configurable input and output safety buffer
- Daily, 30-day monthly, and 365-day yearly estimates
Frequently asked questions
How does this OpenAI API cost calculator work?
This OpenAI API cost calculator estimates prompt tokens from your text, adds expected output tokens and a safety buffer, then applies the selected model's input and output rates. Request volume converts the per-request estimate into daily, monthly, and yearly projections.
How do I calculate LLM tokens from text?
If you are learning how to calculate LLM tokens, a practical planning estimate is about four Unicode characters per token. Exact counts depend on the provider tokenizer, language, formatting, and code content, so this LLM token cost estimator should be treated as a budget estimate rather than a billing record.
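The four-characters-per-token rule can be sketched in a few lines. The function name and the `charsPerToken` parameter are illustrative assumptions, and the result is a planning figure, not a tokenizer count.

```typescript
// Rough planning estimate: about four Unicode characters per token.
function estimateTokens(text: string, charsPerToken = 4): number {
  if (charsPerToken <= 0) {
    throw new RangeError("charsPerToken must be positive");
  }
  // Spreading a string iterates by Unicode code point, so an emoji counts
  // once instead of as two UTF-16 code units.
  const codePoints = [...text].length;
  // Round up so a partial token is still counted.
  return Math.ceil(codePoints / charsPerToken);
}

console.log(estimateTokens("Summarize this ticket.")); // 6 (22 characters)
console.log(estimateTokens("")); // 0
```

For billing-grade counts, use the provider's own tokenizer or the usage figures returned with each API response, since code and non-English text often diverge from this ratio.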
Can I use this as a Claude API cost calculator?
Yes. Select Anthropic and a Claude model to use the corresponding versioned pricing snapshot. The Claude API cost calculator view uses the same buffered token and request-volume math as every other provider.
Can I compare Gemini API costs with other models?
Yes. Select a Gemini model to calculate its estimate, then review the comparison preview below the calculator. The Gemini API cost calculator applies the same workload to available OpenAI, Gemini, and Anthropic models so the monthly differences are comparable.
What does the safety buffer change?
The safety buffer increases both estimated input tokens and expected output tokens before pricing is applied. It helps account for system instructions, prompt wrappers, tokenizer differences, and responses that run longer than the expected average.
Are these provider prices live?
No. TokenMath uses versioned pricing snapshots with source labels and last-verified dates. Providers can change rates, tiers, regional premiums, and discounts without notice, so verify the linked provider pricing before making a production commitment.
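One way to act on a last-verified date is to flag a snapshot as stale once it passes a chosen age. The sketch below is hypothetical: the `PricingSnapshot` shape, the function names, and the 30-day default threshold are assumptions for illustration, not TokenMath's actual data model.

```typescript
// Hypothetical snapshot shape; field names are illustrative.
type PricingSnapshot = {
  provider: string;
  model: string;
  inputPricePerMillion: number;
  outputPricePerMillion: number;
  lastVerified: string; // ISO date, e.g. "2026-06-21"
  sourceLabel: string;
};

// Whole days between the snapshot's verification date (UTC midnight) and `today`.
function daysSinceVerified(snapshot: PricingSnapshot, today: Date): number {
  const verified = new Date(`${snapshot.lastVerified}T00:00:00Z`);
  const msPerDay = 24 * 60 * 60 * 1000;
  return Math.floor((today.getTime() - verified.getTime()) / msPerDay);
}

// True when the snapshot is older than maxAgeDays (assumed default: 30).
function isStale(snapshot: PricingSnapshot, today: Date, maxAgeDays = 30): boolean {
  return daysSinceVerified(snapshot, today) > maxAgeDays;
}

const snapshot: PricingSnapshot = {
  provider: "OpenAI",
  model: "GPT-4.1 mini",
  inputPricePerMillion: 0.4,
  outputPricePerMillion: 1.6,
  lastVerified: "2026-06-21",
  sourceLabel: "OpenAI model pricing",
};

console.log(isStale(snapshot, new Date("2026-07-01T00:00:00Z"))); // false (10 days old)
console.log(isStale(snapshot, new Date("2026-08-01T00:00:00Z"))); // true (41 days old)
```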
Related calculators
Related comparisons
Gemini Flash vs Pro API Cost Comparison
Compare Gemini Flash vs Pro API pricing for 1,000 input tokens, 300 output tokens, and 10,000 monthly requests.
Claude Sonnet vs Opus API Cost Comparison
Compare Claude Sonnet vs Opus API cost using source-linked pricing and a consistent monthly request workload.
OpenAI vs Gemini API Cost Comparison
Compare representative OpenAI and Gemini API costs for 1,000 input tokens, 300 output tokens, and 10,000 monthly requests.
Related glossary terms
Input tokens
Input tokens are the tokenized units sent to a model, including instructions, user content, conversation history, retrieved context, and tool definitions.
Output tokens
Output tokens are the tokenized units generated by a language model, including visible responses and any billable reasoning or thinking tokens defined by the provider.
Cost per request
Cost per request is the sum of all billable usage generated by one API call, commonly input token cost plus output token cost for a text model.