Embedding API estimator

Embedding Cost Calculator

Estimate document embedding cost from representative chunk size, document count, chunks per document, refresh frequency, and model pricing.

Text or manual token inputDocument and chunk volumeRecurring refresh projections

Embedding cost estimate

OpenAI · text-embedding-3-small

Estimated tokens per document

176

Total tokens embedded

1,760,000

Per complete embedding batch

Estimated cost per embedding batch

$0.0352

Estimated monthly refresh cost

$0.0352

1 full refreshes

Estimated yearly cost

$0.4224

12-month projection

Calculation basis

Tokens per chunk: 20
Total chunks: 80,000
Base tokens per document: 160

Selected pricing source

Verified pricing snapshot

Provider: OpenAI
Model: text-embedding-3-small
Price per 1M tokens: $0.02
Last verified: Jun 21, 2026
Source: OpenAI embedding model pricing

Standard embedding input estimate per one million tokens. Tokenization and provider billing totals can differ.

View provider pricing source

Formula

How embedding cost is calculated

Representative tokens per chunk are multiplied by chunks and documents, then buffered before the model's per-million-token rate is applied.

Tokens per document = tokens per chunk × chunks per document

Batch tokens = buffered tokens per document × documents

Monthly refresh cost = batch cost × full refreshes per month

embedding-cost.ts

export function embeddingCost(input: {
  tokensPerChunk: number;
  documents: number;
  chunksPerDocument: number;
  refreshesPerMonth: number;
  safetyBufferPercent: number;
  pricePerMillionTokens: number;
}) {
  const tokensPerDocument =
    input.tokensPerChunk * input.chunksPerDocument;
  const bufferedTokens = Math.ceil(
    tokensPerDocument * (1 + input.safetyBufferPercent / 100),
  );
  const totalTokens = bufferedTokens * input.documents;
  const batchCost =
    (totalTokens / 1_000_000) * input.pricePerMillionTokens;

  return {
    batchCost,
    monthlyCost: batchCost * input.refreshesPerMonth,
  };
}

Example embedding workload

A retrieval system with ten thousand documents, eight chunks per document, and roughly two hundred fifty tokens per chunk embeds about twenty million base tokens.

A full monthly refresh re-embeds the complete collection. If only changed documents are refreshed, lower the document count to the expected changed subset.

What this estimate includes

Estimated tokens per chunk and per document
Document, chunk, and full-refresh volume
Cost per complete embedding batch
Monthly refresh and yearly estimates

Frequently asked questions

How do I estimate embedding tokens?

Paste representative chunk text for a transparent four-characters-per-token estimate, or enter observed tokenizer counts manually. Exact tokenization varies by model.

Does chunking increase embedding cost?

Chunking can add overlap, titles, metadata, and repeated context. Model the average tokens in one final chunk rather than dividing raw document text without accounting for overlap.

What does a full refresh mean?

A full refresh embeds every modeled document again. For incremental indexing, use the number of changed documents instead of the total collection size.

Does this include vector database storage?

No. Use the vector database storage estimator for vector dimensions, metadata, replicas, index overhead, and user-entered storage rates.

Related calculators

Text-to-Token & Cost Estimator

Estimate input tokens and project OpenAI, Gemini, or Claude API spend.

Open

Vector Database Storage Estimator

Estimate vector count, index size, and storage requirements.

Open

Daily API Budget Planner

Turn a fixed monthly AI budget into request and user limits.

Open

Related glossary terms

Input tokens

Input tokens are the tokenized units sent to a model, including instructions, user content, conversation history, retrieved context, and tool definitions.

Open

Requests per day

Requests per day is the number of billable API calls made during a day. TokenMath commonly derives it from requests per active user multiplied by active users.

Open

Cost per request

Cost per request is the sum of all billable usage generated by one API call, commonly input token cost plus output token cost for a text model.

Open

Embedding Cost Calculator

Embedding workload

Embedding cost estimate

Selected pricing source

How embedding cost is calculated

Example embedding workload

What this estimate includes

Frequently asked questions

Related calculators

Related glossary terms