OpenAI tool-cost detail

OpenAI web search pricing depends on which search path you choose.

This page answers the OpenAI web search pricing question directly. It keeps the current tool-call rates, the search-content token rules, the fixed-token exception, and practical workload patterns in one source-linked view.

Current state

This tool has more than one meter, and each one can dominate a different workload.
Live cost brief
Current OpenAI pricing separates standard web search, preview reasoning web search, and preview non-reasoning web search. Search content tokens are not billed the same way on each path.

Last checked

March 12, 2026

Standard web search

$10 per 1K calls plus search content tokens at model rates

This is the baseline pricing on the current API pricing page for standard web search.

Preview reasoning path

$10 per 1K calls plus search content tokens at model rates

Preview reasoning web search keeps the lower call price, but search content tokens still ride the chosen model row.

Preview non-reasoning path

$25 per 1K calls and free search content tokens

This path simplifies search-content billing, but the higher call meter can dominate the estimate fast.

Mini fixed block

Standard web search on gpt-4o-mini and gpt-4.1-mini bills search content as a fixed block of 8,000 input tokens per call

This is a special pricing rule on the current pricing page, not a guess from the general model row.

Cost anatomy

A serious web-search estimate needs the search path before it needs the model row.

OpenAI web search is not one uniform meter. The call rate, the search-content token rule, and the fixed-token exception all change depending on which search path you pick.

Tool-call pricing changes by search path.
The pricing page lists standard web search at $10 per 1K calls, preview reasoning web search at $10 per 1K calls, and preview non-reasoning web search at $25 per 1K calls. If a team chooses the wrong path first, the rest of the estimate starts from the wrong floor.
Search content tokens are a second meter, but not on every path.
Standard web search and preview reasoning web search bill search content tokens at the chosen model's token rates. Preview non-reasoning web search marks search content tokens as free. That means the same search-heavy workload can flip between token-sensitive and call-sensitive depending on the path.
Mini standard web search has a fixed billed-token floor.
For gpt-4o-mini and gpt-4.1-mini with standard web search, OpenAI charges search content tokens as a fixed block of 8,000 input tokens per call. That special rule matters more than the cheap headline input row when search is frequent.
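The three meters above can be folded into one small cost function. This is a sketch using the rates quoted on this page; the function name and signature are illustrative, not an OpenAI API, and the rates should be verified against the live pricing page before being relied on.

```python
def search_specific_cost(calls, path, tokens_per_call=0,
                         model_input_rate_per_1m=0.0, fixed_block=None):
    """Estimate search-specific cost in USD: tool-call meter plus
    search content tokens. Regular prompt/output tokens are out of scope."""
    per_1k_call_rate = {
        "standard": 10.0,               # $10 per 1K calls
        "preview_reasoning": 10.0,      # $10 per 1K calls
        "preview_non_reasoning": 25.0,  # $25 per 1K calls
    }[path]
    call_cost = calls / 1_000 * per_1k_call_rate
    if path == "preview_non_reasoning":
        token_cost = 0.0  # search content tokens are free on this path
    else:
        # mini standard search bills a fixed 8,000-token block per call
        billed_per_call = fixed_block if fixed_block is not None else tokens_per_call
        token_cost = calls * billed_per_call / 1_000_000 * model_input_rate_per_1m
    return call_cost + token_cost
```

With the worked-example inputs used later on this page, the gpt-5 standard path comes out to about $412.50 and the gpt-4.1-mini fixed-block path to about $396, which is the flattening effect the fixed block produces.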

Workload examples

Different web-search paths get expensive for different reasons.

These examples use OpenAI's current published search rates and simple arithmetic. They isolate the search-specific burden so the call meter and search-content rule are visible before normal prompt and output tokens are folded back in.

Worked example

This comparison isolates search-specific cost. Regular prompt and output tokens still sit outside these figures.

Monthly workload

30,000 Responses API web searches on the same product surface.

Compared options

Standard web search with gpt-5 versus standard web search with gpt-4.1-mini.

Search-content assumption

gpt-5 path sees about 3,000 search content tokens per call; gpt-4.1-mini uses the published fixed 8,000-token block instead.

Scope of estimate

This sample prices only the search-specific meters: tool calls and search content tokens.

Model option

Responses API: gpt-5 + web_search

~$412.50 in search-specific cost

Tool-call meter

30,000 calls at $10 per 1K calls: 30 x $10 = $300.

Search content tokens

30,000 x 3,000 tokens = 90M input tokens. At $1.25 per 1M input tokens, that is about $112.50.

Decision read

This path keeps the lower call price, but search content volume still pushes the bill upward as results get richer.

Recommended next check

Confirm whether the richer model is needed on searched turns, or whether the search path can move to a cheaper model without losing fit.

Model option

Responses API: gpt-4.1-mini + web_search

~$396 in search-specific cost

Tool-call meter

30,000 calls at $10 per 1K calls: 30 x $10 = $300.

Search content tokens

30,000 x 8,000 billed tokens = 240M input tokens. At $0.40 per 1M input tokens, that is about $96.

Decision read

The cheaper model row helps, but the fixed 8,000-token block means the savings are much smaller than the base model pricing alone would suggest.

Recommended next check

Use this path only after confirming the smaller model is good enough and the fixed-block search billing still beats the richer-model path for your call volume.

Estimated search-specific cost

The cheaper model only saves about $16.50 in this sample because the $300 call meter dominates both options and the mini path keeps a fixed search-token floor.

What matters first

On frequent web search, choosing the search path and understanding its token rule matter more than headline model input pricing.

Recommended next check

Before switching downmarket, compare actual call volume and the effective search-token rule, not just the base model row.

This is a sample comparison, not a live calculator. It combines the current published web-search tool rates with current model input rates to show how the search-specific burden can flatten apparent model savings.
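The arithmetic behind the two options can be reproduced in a few lines. The figures are this page's quoted rates and assumptions, not live prices:

```python
CALLS = 30_000  # monthly web searches in the worked example

# Option A: gpt-5 + standard web search, ~3,000 search content tokens per call
a_call_meter = CALLS / 1_000 * 10.00               # tool calls: 30 x $10 = $300
a_token_meter = CALLS * 3_000 / 1_000_000 * 1.25   # 90M tokens at $1.25 per 1M
option_a = a_call_meter + a_token_meter            # ~$412.50

# Option B: gpt-4.1-mini + standard web search, fixed 8,000-token block per call
b_call_meter = CALLS / 1_000 * 10.00               # tool calls: 30 x $10 = $300
b_token_meter = CALLS * 8_000 / 1_000_000 * 0.40   # 240M billed tokens at $0.40 per 1M
option_b = b_call_meter + b_token_meter            # ~$396.00

gap = option_a - option_b                          # ~$16.50: the call meter dominates both
```

Because the $300 call meter appears on both sides, it cancels out of the gap; only the two search-token rules separate the options.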

Additional examples

Use the scenarios below to see how the same tool changes shape under different workload patterns after the worked example has framed the main decision.

Scenario: Light lookup helper
Workload: 5,000 standard web searches in the Responses API on gpt-5, with about 1,000 search content tokens per call.
Tool-call meter: 5 x $10 = $50.
Search content tokens: 5M search content tokens x $1.25 per 1M = about $6.25.
Model-row pressure: Regular prompt and output tokens still sit on the normal gpt-5 row outside this search-specific estimate.
Decision read: On light lookups, standard web search is usually call-driven before search content tokens become the main concern.

Scenario: Mini fixed-block path
Workload: 20,000 standard web searches in the Responses API on gpt-4.1-mini.
Tool-call meter: 20 x $10 = $200.
Search content tokens: 20,000 x 8,000 billed tokens = 160M tokens x $0.40 per 1M = $64.
Model-row pressure: The fixed block means short searches do not collapse to near-zero search-token cost even on a mini model row.
Decision read: This path can still be economical, but the fixed block creates a cost floor that the base model row does not show by itself.

Scenario: Preview non-reasoning search model
Workload: 20,000 web lookups using gpt-4o-search-preview in Chat Completions.
Tool-call meter: 20 x $25 = $500.
Search content tokens: Free on this preview non-reasoning path.
Model-row pressure: Regular prompt and output tokens still use the selected preview model's token rates.
Decision read: This path simplifies search-token math, but the higher call price dominates quickly on frequent search.

Control levers

The best web-search estimate usually starts by controlling call volume, not by polishing the model row.

OpenAI already exposes path choices and usage patterns that materially change the search bill. These are the ones that most often move the estimate.

Choose Responses API when search should be conditional.

The web search guide shows the Responses API using a `web_search` tool, while the Chat Completions preview search models always retrieve from the web before responding. That means conditional search belongs on the Responses path when not every turn needs live web data.
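Under that conditional pattern, only the turns that attach the tool pay the search meters. A minimal sketch of building such a request payload, assuming the `web_search` tool type quoted from the guide; the `needs_live_data` flag and the helper itself are illustrative, not part of the OpenAI SDK:

```python
def build_request(model, user_input, needs_live_data):
    """Attach the web_search tool only on turns that need fresh web data,
    so only those turns pay the tool-call meter and token rule."""
    request = {"model": model, "input": user_input}
    if needs_live_data:
        request["tools"] = [{"type": "web_search"}]
    return request

searched = build_request("gpt-5", "What changed in pricing today?", True)
plain = build_request("gpt-5", "Summarize this draft.", False)
```

The payload would then be passed to the Responses API; the point of the sketch is that the tool list, and therefore the search bill, is a per-turn decision rather than a fixed property of the deployment.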

Pick the search price path before comparing model rows.

Standard web search, preview reasoning web search, and preview non-reasoning web search do not share one price shape. If the path is not fixed first, the estimate is still comparing the wrong surfaces.

Watch the fixed 8,000-token rule on mini standard web search.

For gpt-4o-mini and gpt-4.1-mini with standard web search, every search call carries the fixed billed token block. This is the main reason the mini path can save less than the base model pricing suggests.
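One way to see the weight of that rule is to compute the per-call search-token volume at which gpt-5's token meter ($1.25 per 1M input, as quoted on this page) matches the mini fixed block. A sketch under those quoted rates:

```python
mini_block_cost = 8_000 * 0.40 / 1_000_000   # fixed block: $0.0032 per call
gpt5_rate_per_token = 1.25 / 1_000_000       # $1.25 per 1M input tokens
breakeven_tokens = mini_block_cost / gpt5_rate_per_token
# about 2,560 tokens: below this per-call volume, gpt-5's search-token meter
# is actually cheaper than the mini fixed block (call meter and the normal
# model rows held aside)
```

In other words, for short searches the fixed block can invert the intuitive "mini is cheaper" ordering on the search-token line alone.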

Separate search-specific cost from normal prompt and output tokens.

Web search introduces its own tool-call and search-content logic. Keeping those lines separate from the ordinary model bill is the only reliable way to see whether search is the real cost problem.

Decision signals

What usually determines whether OpenAI web search still looks economical.

Use these signals when deciding whether to keep the current search path, switch paths, or push harder on call-volume control.

High-frequency search is usually a call-meter problem first.

At both $10 and $25 per 1K calls, frequent web search can dominate before search-content tokens do. Start by estimating search frequency, then price the token rule for the chosen path.

Mini standard web search is not a free-retrieval shortcut.

The fixed 8,000-token block on gpt-4o-mini and gpt-4.1-mini standard web search creates a billed floor per call. Cheap base input pricing does not erase that search-specific floor.

Preview non-reasoning search trades free search tokens for a higher call price.

If a team prices only the 'free search content tokens' note and ignores the $25 per 1K call line, it will probably under-estimate the preview non-reasoning path.
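That trade can be made concrete. At gpt-5's quoted $1.25-per-1M input rate, the per-call search content volume at which the $25-per-1K flat path breaks even against the $10 path plus tokens works out as follows, using only this page's quoted figures:

```python
standard_call = 10.0 / 1_000         # $0.010 per call, plus search content tokens
preview_call = 25.0 / 1_000          # $0.025 per call, search content tokens free
gpt5_token_rate = 1.25 / 1_000_000   # $ per search content token on gpt-5

breakeven_tokens = (preview_call - standard_call) / gpt5_token_rate
# about 12,000 search content tokens per call before the higher flat call
# price beats the $10 path's token meter at gpt-5 rates
```

Below that volume the 'free tokens' note does not compensate for the higher call meter, which is why frequent, short searches tend to favor the standard path.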

Path choice can matter more than the underlying model row.

Because the guide separates conditional web search in Responses API from always-searching preview models in Chat Completions, teams should choose the search path before treating this as a normal model-pricing comparison.

Official sources

Check the OpenAI pages behind these cost lines.

This page keeps the source set narrow so the cost brief can stay auditable instead of drifting into guesswork.

Pricing

OpenAI API pricing

Source of record for standard web search, preview reasoning web search, preview non-reasoning web search, free search content token treatment on preview non-reasoning, and the fixed 8,000-token block on gpt-4o-mini and gpt-4.1-mini standard web search.

Guide

Web search guide

Shows how to use web search in the Responses API, documents that Chat Completions preview search models always retrieve from the web before responding, and gives the current tool interface context.

Continue through the site

Keep moving through the decision from here.

Use the groups below to move laterally through the decision, not back out into another doc hunt.

Related pages

Stay in the same decision neighborhood instead of backing out to search.

Pricing / Costs

Model pricing, hosted-tool costs, and fit constraints that materially change the operating estimate.

OpenAI file search pricing

Tool-cost brief for file search pricing across storage, tool calls, and model-token exposure.

OpenAI container pricing

Tool-cost brief for code interpreter container runtime and how it stacks with model spend.

Compare pages

Open the pages that turn this topic into a side-by-side decision.

GPT-5.4 vs GPT-5 mini

Side-by-side comparison of GPT-5.4 and GPT-5 mini across price, fit, and tool pressure.

Cheapest OpenAI model for extraction

Scenario recommendation page for choosing the cheapest workable OpenAI extraction model.

Replacement pages

Use the likely substitutes, migration targets, or fallback choices as the next click.

OpenAI API pricing calculator

Interactive calculator for model tokens, hosted tools, and runtime in one estimate.

GPT-5 mini pricing

Single-model pricing brief for GPT-5 mini across standard and batch rows.

GPT-5.4 pricing

Single-model pricing brief for GPT-5.4 across short, long, and batch rows.

Source category pages

Trace the source families behind this page instead of opening random docs in isolation.

Pricing sources

Official pricing pages used to support model, tool-cost, and calculator estimates.

Guide and API reference sources

Operational guides and API references used by tool-cost, migration, and calculator pages.

Return

Return to the OpenAI tracker
Go back to the main OpenAI decision surface to compare web-search cost pressure with current model rows, other hosted tool charges, and lifecycle risk.