Embedding API estimator
Embedding Cost Calculator
Estimate document embedding cost from representative chunk size, document count, chunks per document, refresh frequency, and model pricing.
Embedding cost estimate
OpenAI · text-embedding-3-small
176
1,760,000
Per complete embedding batch
$0.0352
$0.0352
1 full refreshes
$0.4224
12-month projection
Calculation basis
- Tokens per chunk
- 20
- Total chunks
- 80,000
- Base tokens per document
- 160
Selected pricing source
Verified pricing snapshot- Provider
- OpenAI
- Model
- text-embedding-3-small
- Price per 1M tokens
- $0.02
- Last verified
- Jun 21, 2026
- Source
- OpenAI embedding model pricing
Standard embedding input estimate per one million tokens. Tokenization and provider billing totals can differ.
View provider pricing source (opens in a new tab)Formula
How embedding cost is calculated
Representative tokens per chunk are multiplied by chunks and documents, then buffered before the model's per-million-token rate is applied.
Tokens per document = tokens per chunk × chunks per document
Batch tokens = buffered tokens per document × documents
Monthly refresh cost = batch cost × full refreshes per month
export function embeddingCost(input: {
tokensPerChunk: number;
documents: number;
chunksPerDocument: number;
refreshesPerMonth: number;
safetyBufferPercent: number;
pricePerMillionTokens: number;
}) {
const tokensPerDocument =
input.tokensPerChunk * input.chunksPerDocument;
const bufferedTokens = Math.ceil(
tokensPerDocument * (1 + input.safetyBufferPercent / 100),
);
const totalTokens = bufferedTokens * input.documents;
const batchCost =
(totalTokens / 1_000_000) * input.pricePerMillionTokens;
return {
batchCost,
monthlyCost: batchCost * input.refreshesPerMonth,
};
}Example embedding workload
A retrieval system with ten thousand documents, eight chunks per document, and roughly two hundred fifty tokens per chunk embeds about twenty million base tokens.
A full monthly refresh re-embeds the complete collection. If only changed documents are refreshed, lower the document count to the expected changed subset.
What this estimate includes
- Estimated tokens per chunk and per document
- Document, chunk, and full-refresh volume
- Cost per complete embedding batch
- Monthly refresh and yearly estimates
Frequently asked questions
How do I estimate embedding tokens?
Paste representative chunk text for a transparent four-characters-per-token estimate, or enter observed tokenizer counts manually. Exact tokenization varies by model.
Does chunking increase embedding cost?
Chunking can add overlap, titles, metadata, and repeated context. Model the average tokens in one final chunk rather than dividing raw document text without accounting for overlap.
What does a full refresh mean?
A full refresh embeds every modeled document again. For incremental indexing, use the number of changed documents instead of the total collection size.
Does this include vector database storage?
No. Use the vector database storage estimator for vector dimensions, metadata, replicas, index overhead, and user-entered storage rates.
Related calculators
Related glossary terms
Input tokens
Input tokens are the tokenized units sent to a model, including instructions, user content, conversation history, retrieved context, and tool definitions.
OpenRequests per day
Requests per day is the number of billable API calls made during a day. TokenMath commonly derives it from requests per active user multiplied by active users.
OpenCost per request
Cost per request is the sum of all billable usage generated by one API call, commonly input token cost plus output token cost for a text model.
Open