OpenAI tool-cost detail

OpenAI file search pricing spans storage, tool calls, and model-token exposure.

This page answers the OpenAI file search pricing question directly, keeping the current storage rate, tool-call rate, token exposure, and practical workload patterns in one source-linked view.

Current state

This tool has more than one meter, and each one can dominate a different workload.
Live cost brief
Current OpenAI pricing splits file search across vector-store storage, Responses API tool calls, and model-token exposure when retrieved context enters the response path.

Last checked

March 12, 2026

Storage rate

$0.10 per GB per day

The first 1 GB is free, and OpenAI defines GB here as binary gigabytes.

Tool-call rate

$2.50 per 1K calls

This line item is listed for file search in the Responses API pricing table.

Third meter

Chosen model token rates still apply

OpenAI bills built-in tool tokens at the chosen model's per-token rates, and the file search guide separately notes that limiting results can reduce token usage.

Cost anatomy

A serious file-search estimate needs three meters, not one.

The pricing page gives you the base rates, but file search only becomes predictable once storage footprint, call volume, and token exposure are separated.

Vector-store storage is a standing meter.
OpenAI bills file search storage at $0.10 per GB per day after the first free 1 GB. The vector-store reference exposes `usage_bytes`, which is the direct measurement to watch instead of guessing from raw file size.
Each file-search invocation has its own call meter.
OpenAI lists file search tool calls at $2.50 per 1K calls, and marks that line item as Responses API only. A workload with frequent retrieval can therefore become call-driven even when storage stays small.
Retrieved context can still move model spend.
OpenAI says built-in tool tokens are billed at the chosen model's per-token rates, and the file search guide says lowering `max_num_results` can reduce token usage. Taken together, those two statements mean file-search tuning can move the model-token line as well as latency.

Workload examples

Different workloads get dominated by different meters.

These examples use OpenAI's current rates and simple arithmetic so you can see where file search gets unexpectedly cheap, unexpectedly expensive, or just mis-estimated. They exclude model-token charges because those depend on the chosen model and retrieved context size.

Scenario: Small internal knowledge helper
Workload: 0.8 GB active vector store for 30 days and 20K file-search calls during the month.
Storage meter: stays inside the free 1 GB threshold, so $0.
Tool-call meter: 20K calls x $2.50 per 1K = $50.
Model token exposure: still varies with prompt size and retrieved context, but the call meter is already the main line item.
Decision read: for small knowledge bases, file search is usually a call-volume question before it becomes a storage question.

Scenario: Busy support copilot
Workload: 6 GB of active vector stores for 30 days and 200K file-search calls during the month.
Storage meter: (6 GB - 1 GB free) x $0.10 x 30 days = $15.
Tool-call meter: 200K calls x $2.50 per 1K = $500.
Model token exposure: result count and response shape can still add token pressure on top of the call line.
Decision read: when file search fires often, tool calls can dominate the bill long before storage becomes the biggest cost surface.

Scenario: Stale archive left online
Workload: 50 GB vector-store footprint kept active for 30 days, but only 2K file-search calls during the month.
Storage meter: (50 GB - 1 GB free) x $0.10 x 30 days = $147.
Tool-call meter: 2K calls x $2.50 per 1K = $5.
Model token exposure: token spend is low because calls are low, but storage keeps billing every day until the store expires or is deleted.
Decision read: quiet archives become storage-driven; if the data should cool off, expiration policy matters more than query optimization.
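The scenario arithmetic above can be checked with a small estimator. This is a sketch using the rates quoted on this page; the function name and rounding are illustrative, and the model-token meter is deliberately left out because it depends on the chosen model and retrieved context size.

```python
# Rates as quoted on this page; verify against the live OpenAI pricing page.
STORAGE_PER_GB_DAY = 0.10   # $ per GB per day, after the first free GB
FREE_STORAGE_GB = 1.0
CALLS_PER_1K = 2.50         # $ per 1,000 file-search tool calls

def monthly_cost(store_gb: float, calls: int, days: int = 30) -> dict:
    """Storage and tool-call meters only; model tokens are priced separately."""
    billable_gb = max(store_gb - FREE_STORAGE_GB, 0.0)
    storage = billable_gb * STORAGE_PER_GB_DAY * days
    tool_calls = calls / 1000 * CALLS_PER_1K
    return {
        "storage": round(storage, 2),
        "tool_calls": round(tool_calls, 2),
        "total": round(storage + tool_calls, 2),
    }

print(monthly_cost(0.8, 20_000))    # small internal knowledge helper
print(monthly_cost(6.0, 200_000))   # busy support copilot
print(monthly_cost(50.0, 2_000))    # stale archive left online
```

The same function makes it easy to see which meter flips first as call volume or footprint grows.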

Control levers

The cheapest file-search workload is usually the one you constrain on purpose.

These are the controls OpenAI exposes today that materially change storage or token pressure without requiring a new architecture.

Set expiration on vector stores that should cool off.

The retrieval guide says expired vector stores stop charging, and the vector-store API lets you anchor `expires_after` to `last_active_at`. This is the cleanest control for archives, temporary projects, and bursty analysis jobs.
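A minimal sketch of that control, assuming a recent `openai` Python SDK (older releases nest vector stores under `client.beta`); the store name is a placeholder:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Expire the store 7 days after it was last used, per the vector-store API.
expiry = {"anchor": "last_active_at", "days": 7}

store = client.vector_stores.create(name="temp-analysis", expires_after=expiry)

# Existing stores can be given the same policy after the fact:
client.vector_stores.update(vector_store_id=store.id, expires_after=expiry)
```

Anchoring to `last_active_at` rather than creation time means a store that is still being queried keeps renewing itself, while a quiet one falls off the storage meter.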

Inspect `usage_bytes` instead of guessing from source files.

OpenAI exposes `usage_bytes` directly on the vector-store object. That matters because billing is based on stored bytes, while the retrieval guide separately shows that files are chunked and indexed before search.
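A sketch of that inspection, assuming a recent `openai` Python SDK; the store id is a placeholder, and the daily-rate arithmetic uses the figures quoted on this page:

```python
from openai import OpenAI

client = OpenAI()  # needs OPENAI_API_KEY in the environment

store = client.vector_stores.retrieve("vs_XXXX")   # placeholder id
stored_gb = store.usage_bytes / (1024 ** 3)        # binary GB, matching the billing unit
billable_gb = max(stored_gb - 1.0, 0.0)            # first 1 GB is free

print(f"{stored_gb:.3f} GB stored, "
      f"~${billable_gb * 0.10:.2f}/day at $0.10 per GB per day")
```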

Lower `max_num_results` when answer quality allows it.

The file search guide explicitly says limiting results can reduce token usage and latency. This does not reduce tool-call count, but it can narrow the model-token line and reduce response bloat.
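A sketch of that cap on a Responses API call, assuming a recent `openai` Python SDK; the model name and store id are placeholders:

```python
from openai import OpenAI

client = OpenAI()

resp = client.responses.create(
    model="gpt-5-mini",  # placeholder; use your actual model
    input="Where is the refund policy documented?",
    tools=[{
        "type": "file_search",
        "vector_store_ids": ["vs_XXXX"],  # placeholder id
        "max_num_results": 4,             # fewer retrieved chunks enter the context
    }],
)
print(resp.output_text)
```

The tool call itself still bills at the per-call rate; only the retrieved-token volume narrows.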

Treat chunking as a storage-shape control, then re-measure.

OpenAI documents default chunking at 800 tokens with 400-token overlap and allows chunking changes when files are added. That means indexing behavior is tunable, but the right check after tuning is still the resulting `usage_bytes` rather than an assumption.
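A sketch of overriding the default 800/400 chunking when attaching a file, assuming a recent `openai` Python SDK; both ids are placeholders, and the right follow-up check is the store's resulting `usage_bytes`:

```python
from openai import OpenAI

client = OpenAI()

client.vector_stores.files.create(
    vector_store_id="vs_XXXX",   # placeholder id
    file_id="file-XXXX",         # placeholder id
    chunking_strategy={
        "type": "static",
        "static": {
            "max_chunk_size_tokens": 400,  # default is 800
            "chunk_overlap_tokens": 200,   # must not exceed half the chunk size
        },
    },
)
```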

Decision signals

What usually determines whether file search is worth it.

Use these signals when deciding whether to keep file search in the path, add stronger controls, or price an alternative retrieval design.

Call-heavy workloads are usually tool-meter problems first.

If the application hits file search on most user turns, the $2.50 per 1K call line often grows faster than storage. Start by estimating invocation frequency before debating storage optimization.

Large but quiet datasets are storage-governance problems.

When retrieval volume is low but vector stores stay online, the daily storage meter keeps running. Expiration policy and deletion hygiene matter more than prompt tuning in that case.

Model selection still matters after the tool line is priced.

File search does not replace model pricing. OpenAI bills built-in tool tokens at the chosen model's per-token rates, so the same retrieval pattern can land very differently on a high-end versus low-cost model row.

A good estimate needs all three meters in the same sheet.

Storage, tool calls, and model tokens move independently. If any estimate leaves one of them out, it is probably still a document skim rather than a real operating cost read.

Official sources

Check the OpenAI pages behind these cost lines.

This page keeps the source set narrow so the cost brief can stay auditable instead of drifting into guesswork.

Pricing

OpenAI API pricing

Source of record for the file search storage price, the Responses API tool-call price, the first free gigabyte, and the statement that built-in tool tokens are billed at model rates.

Guide

File search guide

Shows that file search is a hosted tool in the Responses API, requires vector stores, and can reduce token usage by limiting the number of results.

Guide

Retrieval guide

Documents expiration policies that stop charges and the default chunking behavior used when files are indexed into vector stores.

API reference

Vector stores API reference

Documents `usage_bytes`, `last_active_at`, and `expires_after`, which are the fields you need to inspect and control storage cost directly.


Continue the site

Keep moving through the decision from here.

Use the groups below to move laterally through the decision, not back out into another doc hunt.

Related pages

Stay in the same decision neighborhood instead of backing out to search.

Pricing / Costs

Model pricing, hosted-tool costs, and fit constraints that materially change the operating estimate.


OpenAI web search pricing

Tool-cost brief for web search pricing across standard and preview search paths.


OpenAI container pricing

Tool-cost brief for code interpreter container runtime and how it stacks with model spend.


Compare pages

Open the pages that turn this topic into a side-by-side decision.

GPT-5.4 vs GPT-5 mini

Side-by-side comparison of GPT-5.4 and GPT-5 mini across price, fit, and tool pressure.


Cheapest OpenAI model for extraction

Scenario recommendation page for choosing the cheapest workable OpenAI extraction model.


Replacement pages

Use the likely substitutes, migration targets, or fallback choices as the next click.

OpenAI API pricing calculator

Interactive calculator for model tokens, hosted tools, and runtime in one estimate.


GPT-5 mini pricing

Single-model pricing brief for GPT-5 mini across standard and batch rows.


GPT-5.4 pricing

Single-model pricing brief for GPT-5.4 across short, long, and batch rows.


Source category pages

Trace the source families behind this page instead of opening random docs in isolation.

Pricing sources

Official pricing pages used to support model, tool-cost, and calculator estimates.


Guide and API reference sources

Operational guides and API references used by tool-cost, migration, and calculator pages.


Return

Return to the OpenAI tracker
Go back to the main OpenAI decision surface to compare file-search cost pressure with current model rows, other hosted tool charges, and lifecycle risk.