Skip to main content

Embedding API estimator

Embedding Cost Calculator

Estimate document embedding cost from representative chunk size, document count, chunks per document, refresh frequency, and model pricing.

Text or manual token inputDocument and chunk volumeRecurring refresh projections

Embedding workload

Estimate recurring embedding volume by representative chunk size.

80 characters · about 20 tokens

One refresh means embedding the complete document set once.

%

Embedding cost estimate

OpenAI · text-embedding-3-small

Estimated tokens per document

176

Total tokens embedded

1,760,000

Per complete embedding batch

Estimated cost per embedding batch

$0.0352

Estimated monthly refresh cost

$0.0352

1 full refreshes

Estimated yearly cost

$0.4224

12-month projection

Calculation basis

Tokens per chunk
20
Total chunks
80,000
Base tokens per document
160

Selected pricing source

Verified pricing snapshot
Provider
OpenAI
Model
text-embedding-3-small
Price per 1M tokens
$0.02
Last verified
Jun 21, 2026
Source
OpenAI embedding model pricing

Standard embedding input estimate per one million tokens. Tokenization and provider billing totals can differ.

View provider pricing source (opens in a new tab)

Formula

How embedding cost is calculated

Representative tokens per chunk are multiplied by chunks and documents, then buffered before the model's per-million-token rate is applied.

Tokens per document = tokens per chunk × chunks per document

Batch tokens = buffered tokens per document × documents

Monthly refresh cost = batch cost × full refreshes per month

embedding-cost.ts
export function embeddingCost(input: {
  tokensPerChunk: number;
  documents: number;
  chunksPerDocument: number;
  refreshesPerMonth: number;
  safetyBufferPercent: number;
  pricePerMillionTokens: number;
}) {
  const tokensPerDocument =
    input.tokensPerChunk * input.chunksPerDocument;
  const bufferedTokens = Math.ceil(
    tokensPerDocument * (1 + input.safetyBufferPercent / 100),
  );
  const totalTokens = bufferedTokens * input.documents;
  const batchCost =
    (totalTokens / 1_000_000) * input.pricePerMillionTokens;

  return {
    batchCost,
    monthlyCost: batchCost * input.refreshesPerMonth,
  };
}

Example embedding workload

A retrieval system with ten thousand documents, eight chunks per document, and roughly two hundred fifty tokens per chunk embeds about twenty million base tokens.

A full monthly refresh re-embeds the complete collection. If only changed documents are refreshed, lower the document count to the expected changed subset.

What this estimate includes

  • Estimated tokens per chunk and per document
  • Document, chunk, and full-refresh volume
  • Cost per complete embedding batch
  • Monthly refresh and yearly estimates

Frequently asked questions

How do I estimate embedding tokens?

Paste representative chunk text for a transparent four-characters-per-token estimate, or enter observed tokenizer counts manually. Exact tokenization varies by model.

Does chunking increase embedding cost?

Chunking can add overlap, titles, metadata, and repeated context. Model the average tokens in one final chunk rather than dividing raw document text without accounting for overlap.

What does a full refresh mean?

A full refresh embeds every modeled document again. For incremental indexing, use the number of changed documents instead of the total collection size.

Does this include vector database storage?

No. Use the vector database storage estimator for vector dimensions, metadata, replicas, index overhead, and user-entered storage rates.