Embedding cost guide

Embedding Cost Explained: Tokens, Chunks, and Re-indexing

Embedding costs are usually small per million tokens, but duplicate chunks, frequent re-indexing, and uncontrolled document growth can compound. Estimate the initial corpus and ongoing refresh workload separately.

Formula-driven examplesSource-linked pricing snapshots

Calculate corpus token volume

Estimate tokens per chunk, multiply by chunk count, then include document updates and overlap. Chunk overlap improves retrieval context but embeds repeated text.

A full refresh reprocesses the entire corpus. Incremental indexing only embeds changed or new content and is often the larger operational saving.

embedding cost = total embedded tokens ÷ 1,000,000 × model input rate

Separate embedding and vector storage

The embedding API charge creates vectors; the vector database has separate storage, indexing, query, replica, and network costs. Budget both layers.

Worked example

text-embedding-3-small: 1M input tokens

Using the versioned rates below, this example workload is estimated at $0.02. This isolates provider usage only and does not include taxes, regional premiums, retries, storage, network traffic, or unrelated infrastructure.

Current pricing references

These versioned records support the examples above. Check the date and provider source before using them in a production forecast.

Provider / model	Input or unit	Output	Status	Source
OpenAI text-embedding-3-small	$0.02 per 1M tokens	—	Verified	OpenAI embedding model pricing Checked Jun 21, 2026
OpenAI text-embedding-3-large	$0.13 per 1M tokens	—	Verified	OpenAI embedding model pricing Checked Jun 21, 2026
Google Gemini Gemini Embedding 001	$0.15 per 1M tokens	—	Verified	Google Gemini Developer API pricing Checked Jun 21, 2026

Provider / model

Input or unit

Output

Status

Source

OpenAI

text-embedding-3-small

$0.02 per 1M tokens

—

Verified

OpenAI embedding model pricing

Checked Jun 21, 2026

OpenAI

text-embedding-3-large

$0.13 per 1M tokens

—

Verified

OpenAI embedding model pricing

Checked Jun 21, 2026

Google Gemini

Gemini Embedding 001

$0.15 per 1M tokens

—

Verified

Google Gemini Developer API pricing

Checked Jun 21, 2026

Frequently asked questions

Do vector dimensions change embedding API cost?

Token-based API pricing is driven by input tokens, though dimensions affect vector storage and may vary by model or configuration.

Should unchanged documents be re-embedded?

Usually not. Content hashes and incremental updates can prevent unnecessary API and indexing work.

Related calculators and guides

Open embedding cost calculator

Estimate corpus and refresh costs.

Open

Estimate vector storage

Project vector count and raw storage.

Open

Related glossary terms

Input tokens

Input tokens are the tokenized units sent to a model, including instructions, user content, conversation history, retrieved context, and tool definitions.

Open

Output tokens

Output tokens are the tokenized units generated by a language model, including visible responses and any billable reasoning or thinking tokens defined by the provider.

Open

Cost per request

Cost per request is the sum of all billable usage generated by one API call, commonly input token cost plus output token cost for a text model.

Open