OpenAI cost and lifecycle tracker

Track the OpenAI changes that can force a budget or migration call.

This page is built for the moments when a pricing row moves, a tool charge appears, or a shutdown date lands. It keeps the official OpenAI signals in one operating view so the next decision can close faster.

Last checked

March 12, 2026

Source set

4 official pages

Deprecation cluster

4 live briefs / 0 tracked risks

Current operating readout

Price, tool, and lifecycle signals that can change a real OpenAI decision.
Live
This page now separates model pricing by family so a reader can scan text, image, audio, and embedding costs without losing the lifecycle and tool-cost context.

Representative text row

gpt-5.4 standard short context: $2.50 input / $0.25 cached / $15.00 output per 1M tokens

This is a representative flagship row, not a stand-in for the whole OpenAI matrix.

Representative image row

GPT Image 1.5 low quality: $0.009 per 1024x1024 image

Image pricing is now shown as its own family instead of being buried under text-token tables.

Nearest tracked shutdown

Selected legacy GPT snapshots: March 26, 2026

Legacy GPT snapshot pins are the earliest OpenAI lifecycle date currently tracked on this page.

Risk watch

Read the live changes in one vertical pass before opening any deeper page.

This section is the reading spine of the page. Start here when the question is whether cost, migration timing, or replacement burden has already changed enough to alter the plan.

Open the linked brief only when the issue changes implementation, migration sequence, or cost shape. Otherwise, keep scanning vertically and stay in the main read.

Live change

Legacy GPT snapshots now have a fixed March deadline.

Selected legacy GPT snapshots and older preview aliases are scheduled for shutdown on March 26, 2026.

What changes now

Teams still pinned to dated GPT-4 snapshot names need a deliberate move to the gpt-5 or gpt-4.1 family before the cutoff.

Live change

Assistants API is now on a fixed shutdown path.

The deprecations page lists August 26, 2026 as the shutdown date and points new work toward the Responses API plus Conversations API.

What changes now

Any new OpenAI build should treat Assistants as a migration topic, not as a fresh platform choice.

Live change

Realtime beta has a published end date.

The beta Realtime interface is deprecated with a shutdown date of May 7, 2026.

What changes now

Teams still budgeting or integrating against beta-era docs should migrate and price the GA Realtime path instead.

Live change

Preview audio and realtime model names also have a cutoff.

OpenAI schedules May 7, 2026 as the shutdown date for the current gpt-4o preview audio and realtime model group.

What changes now

Teams still pinned to preview model names should move to the listed current realtime or audio families, and beta-interface users may need a separate interface migration as well.

Live change

File search cost is now a separate operating question.

Current OpenAI pricing splits file search into storage, Responses API tool calls, and model-token exposure rather than a single retrieval fee.

What changes now

A file-search estimate needs storage footprint, call volume, and chosen-model token pressure in the same read or it will miss the real bill.

Live change

Web search cost now depends on the search path, not just the model row.

Current OpenAI pricing separates standard web search, preview reasoning web search, and preview non-reasoning web search, and search content tokens are not billed the same way on each path.

What changes now

A web-search estimate now needs call volume and the exact search path before the model row can be trusted.

Key cost signals

Read the rows most likely to change an estimate before opening the full matrix.

Most decisions do not need every OpenAI row at once. Start with a small benchmark set across text, image, audio, and embeddings, then expand the full pricing matrix only when family-level verification is necessary.

Flagship text anchor

gpt-5.4 short: $2.50 input / $15.00 output per 1M tokens

Long context raises the same row to $5.00 input and $22.50 output, so context length alone can materially change the estimate.

Open GPT-5.4 pricing
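The short/long gap compounds across a whole workload. A minimal sketch of that effect, using the gpt-5.4 rates quoted above and an assumed illustrative volume of 10M input and 2M output tokens:

```python
# gpt-5.4 per-1M-token rates from the anchor row above.
RATES = {
    "short": {"input": 2.50, "output": 15.00},
    "long": {"input": 5.00, "output": 22.50},
}

def text_cost(input_tokens: int, output_tokens: int, tier: str) -> float:
    """Dollar cost of one token workload on a gpt-5.4 context tier."""
    r = RATES[tier]
    return input_tokens / 1e6 * r["input"] + output_tokens / 1e6 * r["output"]

# The same 10M-input / 2M-output workload on each tier.
short_cost = text_cost(10_000_000, 2_000_000, "short")  # 25 + 30 = $55
long_cost = text_cost(10_000_000, 2_000_000, "long")    # 50 + 45 = $95
```

On these assumed volumes, the long-context tier costs roughly 1.7x the short tier for identical token counts, which is why context length belongs in the estimate before model choice.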

Practical low-cost text row

gpt-5-mini: $0.25 input / $2.00 output per 1M tokens

This is the first row many teams should benchmark before paying for higher-end reasoning or larger context windows.

Open GPT-5 mini pricing

Image generation floor

GPT Image 1.5 low: $0.009 per 1024x1024 image

Image workloads should be estimated separately from text-token workloads rather than folded into the same mental model.

Realtime audio exposure

gpt-realtime: $40.00 input / $80.00 output per 1M audio tokens

Voice and realtime workloads leave the text-price range quickly, which is why audio needs its own benchmark row up front.

Embedding ingestion baseline

text-embedding-3-large: $0.13 standard / $0.065 batch per 1M tokens

Batch pricing materially changes large ingestion jobs, so the cheaper path may come from workflow design rather than model choice alone.
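To see how much the batch path matters at ingestion scale, a sketch using the text-embedding-3-large rates above and an assumed 500M-token corpus:

```python
STANDARD_RATE = 0.13  # $ per 1M tokens, text-embedding-3-large standard
BATCH_RATE = 0.065    # $ per 1M tokens, text-embedding-3-large batch

def embedding_cost(tokens: int, rate_per_million: float) -> float:
    """Dollar cost of embedding a token corpus at one listed rate."""
    return tokens / 1e6 * rate_per_million

corpus_tokens = 500_000_000  # illustrative 500M-token corpus
standard = embedding_cost(corpus_tokens, STANDARD_RATE)  # ~$65
batch = embedding_cost(corpus_tokens, BATCH_RATE)        # ~$32.50
```

Routing the same corpus through the batch path halves the bill, so the workflow decision can outweigh the model decision here.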

Full pricing matrix

Expand the model family you need to verify. Keep the rest collapsed.

The complete matrix stays available for verification, but it no longer needs to dominate the page before a decision is framed.

Text and reasoning models

Core GPT-5, o-series, codex, search, and computer-use rows that show up in most direct OpenAI comparisons.


Standard pricing

Model | Context | Input | Cached input | Output ($ per 1M tokens)
gpt-5.4 | Short | $2.50 | $0.25 | $15.00
gpt-5.4 | Long | $5.00 | $0.50 | $22.50
gpt-5.4-pro | Short | $30.00 | N/A | $180.00
gpt-5.4-pro | Long | $60.00 | N/A | $270.00
gpt-5.2 | Standard | $1.25 | $0.13 | $10.00
gpt-5.1 | Standard | $1.25 | $0.13 | $10.00
gpt-5 | Standard | $1.25 | $0.13 | $10.00
gpt-5-mini | Standard | $0.25 | $0.03 | $2.00
gpt-5-nano | Standard | $0.05 | $0.01 | $0.40
gpt-5.3-chat-latest | Standard | $1.50 | $0.15 | $12.00
gpt-5.2-chat-latest | Standard | $1.50 | $0.15 | $12.00
gpt-5.1-chat-latest | Standard | $1.50 | $0.15 | $12.00
gpt-5-chat-latest | Standard | $1.50 | $0.15 | $12.00
gpt-5.3-codex | Standard | $1.50 | $0.15 | $6.00
gpt-5.2-codex | Standard | $1.50 | $0.15 | $6.00
gpt-5.1-codex-max | Standard | $20.00 | N/A | $100.00
gpt-5.1-codex | Standard | $1.50 | $0.15 | $6.00
gpt-5-codex | Standard | $1.50 | $0.15 | $6.00
gpt-5.2-pro | Standard | $15.00 | N/A | $120.00
gpt-5-pro | Standard | $15.00 | N/A | $120.00
gpt-4.1 | Standard | $2.00 | $0.50 | $8.00
gpt-4.1-mini | Standard | $0.40 | $0.10 | $1.60
gpt-4.1-nano | Standard | $0.10 | $0.03 | $0.40
gpt-4o | Standard | $2.50 | $1.25 | $10.00
gpt-4o-2024-05-13 | Standard | $5.00 | N/A | $15.00
gpt-4o-mini | Standard | $0.15 | $0.08 | $0.60
o1 | Standard | $15.00 | $7.50 | $60.00
o1-pro | Standard | $150.00 | N/A | $600.00
o3-pro | Standard | $20.00 | N/A | $80.00
o3 | Standard | $2.00 | $0.50 | $8.00
o3-deep-research | Standard | $10.00 | $2.50 | $40.00
o4-mini | Standard | $1.10 | $0.28 | $4.40
o4-mini-deep-research | Standard | $2.00 | $0.50 | $8.00
o3-mini | Standard | $1.10 | $0.55 | $4.40
o1-mini | Standard | $1.10 | $0.55 | $4.40
gpt-5.1-codex-mini | Standard | $0.40 | $0.04 | $1.50
codex-mini-latest | Standard | $1.50 | $0.15 | $6.00
gpt-5-search-api | Standard | $2.50 | $0.25 | $15.00
gpt-4o-mini-search-preview | Standard | $0.15 | $0.08 | $0.60
gpt-4o-search-preview | Standard | $2.50 | N/A | $10.00
computer-use-preview | Standard | $3.00 | N/A | $12.00

This table covers the text-token rows currently listed for text and reasoning models. It excludes image, audio, and embedding families on purpose.

Batch pricing

Model | Context | Input | Cached input | Output ($ per 1M tokens)
gpt-5.4 | Short only | $1.25 | $0.13 | $7.50
gpt-5.4-pro | Short only | $15.00 | N/A | $90.00
gpt-5.2 | Standard | $0.63 | $0.06 | $5.00
gpt-5.1 | Standard | $0.63 | $0.06 | $5.00
gpt-5 | Standard | $0.63 | $0.06 | $5.00
gpt-5-mini | Standard | $0.13 | $0.01 | $1.00
gpt-5-nano | Standard | $0.03 | $0.00 | $0.20
o3 | Standard | $1.00 | $0.25 | $4.00
o4-mini | Standard | $0.55 | $0.14 | $2.20

OpenAI currently publishes a narrower batch matrix than the standard one. Models without a listed batch row are omitted here rather than padded with guessed rates.

Image models

Image-family pricing is split between image-token billing and per-image generation pricing.


Image token pricing

Mode | Model | Input | Cached input | Output ($ per 1M image tokens)
Batch | gpt-image-1.5 | $4.00 | $1.00 | $16.00
Batch | chatgpt-image-latest | $4.00 | $1.00 | $16.00
Batch | gpt-image-1 | $2.50 | $0.63 | $10.00
Batch | gpt-image-1-mini | $0.25 | $0.06 | $1.00
Standard | gpt-image-1.5 | $8.00 | $2.00 | $32.00
Standard | chatgpt-image-latest | $8.00 | $2.00 | $32.00
Standard | gpt-image-1 | $5.00 | $1.25 | $20.00
Standard | gpt-image-1-mini | $0.50 | $0.13 | $2.00
Standard | gpt-realtime | $5.00 | $0.50 | N/A
Standard | gpt-realtime-1.5 | $5.00 | $0.50 | N/A
Standard | gpt-realtime-mini | $0.60 | $0.06 | N/A

Realtime rows are included here because OpenAI lists image-token pricing for them in the image-token section.

Image generation pricing

Model | Quality or mode | Base listed size | Largest listed size
GPT Image 1.5 | Low | $0.009 at 1024x1024 | $0.013 at 1024x1536 or 1536x1024
GPT Image 1.5 | Medium | $0.035 at 1024x1024 | $0.053 at 1024x1536 or 1536x1024
GPT Image 1.5 | High | $0.14 at 1024x1024 | $0.21 at 1024x1536 or 1536x1024
GPT Image latest | Low | $0.008 at 1024x1024 | $0.012 at 1024x1536 or 1536x1024
GPT Image latest | Medium | $0.032 at 1024x1024 | $0.048 at 1024x1536 or 1536x1024
GPT Image latest | High | $0.13 at 1024x1024 | $0.19 at 1024x1536 or 1536x1024
GPT Image 1 | Low | $0.011 at 1024x1024 | $0.016 at 1024x1536 or 1536x1024
GPT Image 1 | Medium | $0.042 at 1024x1024 | $0.063 at 1024x1536 or 1536x1024
GPT Image 1 | High | $0.17 at 1024x1024 | $0.25 at 1024x1536 or 1536x1024
GPT Image 1 Mini | Low | $0.009 at 1024x1024 | $0.013 at 1024x1536 or 1536x1024
GPT Image 1 Mini | Medium | $0.035 at 1024x1024 | $0.053 at 1024x1536 or 1536x1024
GPT Image 1 Mini | High | $0.14 at 1024x1024 | $0.21 at 1024x1536 or 1536x1024
DALL-E 3 | Standard | $0.04 at 1024x1024 | $0.08 at 1024x1792 or 1792x1024
DALL-E 3 | HD | $0.08 at 1024x1024 | $0.12 at 1024x1792 or 1792x1024
DALL-E 2 | Standard | $0.016 at 256x256 | $0.020 at 1024x1024

DALL-E sizes differ from the GPT Image family, so each row shows the base and largest listed sizes rather than forcing a single shared size grid.

Audio models

Voice workflows need both audio-token pricing and speech or transcription pricing in view.


Audio token pricing

Model | Input | Cached input | Output ($ per 1M audio tokens)
gpt-realtime | $40.00 | $2.50 | $80.00
gpt-realtime-1.5 | $40.00 | $2.50 | $80.00
gpt-realtime-mini | $10.00 | $0.30 | $20.00
gpt-4o-realtime-preview | $100.00 | $20.00 | $200.00
gpt-4o-mini-realtime-preview | $10.00 | $0.30 | $20.00
gpt-audio | $40.00 | $2.50 | $80.00
gpt-audio-1.5 | $40.00 | $2.50 | $80.00
gpt-audio-mini | $10.00 | $0.30 | $20.00
gpt-4o-audio-preview | $100.00 | N/A | $200.00
gpt-4o-mini-audio-preview | $10.00 | N/A | $20.00

OpenAI lists some preview rows without cached-input pricing. Those cells are left as N/A rather than inferred.

Speech and transcription pricing

Model | Text input | Text output | Audio input | Audio output | Estimate or note ($ per 1M tokens unless noted)
gpt-4o-mini-tts | $0.60 | N/A | N/A | $12.00 | $0.015 per minute
gpt-4o-transcribe | $2.50 | $10.00 | $6.00 | N/A | $0.006 per minute
gpt-4o-transcribe-diarize | $2.50 | $10.00 | $6.00 | N/A | $0.006 per minute
gpt-4o-mini-transcribe | $1.25 | $5.00 | $3.00 | N/A | $0.003 per minute
Whisper | N/A | N/A | N/A | N/A | $0.006 per minute transcription
TTS | N/A | N/A | N/A | N/A | $15.00 per 1M characters
TTS HD | N/A | N/A | N/A | N/A | $30.00 per 1M characters

This table keeps the official speech and transcription billing lines together even though some are per token, per minute, or per character.
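For the per-minute lines, a monthly transcription bill reduces to minutes times rate. A sketch using the per-minute estimates listed above and an assumed 50,000-minute monthly volume:

```python
# Per-minute transcription estimates from the table above.
PER_MINUTE = {
    "gpt-4o-transcribe": 0.006,
    "gpt-4o-mini-transcribe": 0.003,
    "whisper": 0.006,
}

def transcription_cost(minutes: float, model: str) -> float:
    """Dollar cost of a transcription volume on one per-minute line."""
    return minutes * PER_MINUTE[model]

monthly_minutes = 50_000  # illustrative volume
full = transcription_cost(monthly_minutes, "gpt-4o-transcribe")       # ~$300
mini = transcription_cost(monthly_minutes, "gpt-4o-mini-transcribe")  # ~$150
```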

Embedding models

Embedding pricing is simpler than generation pricing, but the batch discount still matters in large-volume ingestion.


Embedding pricing

Model | Standard | Batch
text-embedding-3-large | $0.13 per 1M tokens | $0.065 per 1M tokens
text-embedding-3-small | $0.02 per 1M tokens | $0.01 per 1M tokens
text-embedding-ada-002 | $0.10 per 1M tokens | N/A

OpenAI currently lists batch pricing for the text-embedding-3 family but not for text-embedding-ada-002.

Decision constraints

Read the hidden pressure before trusting the cheapest visible row.

Limits, hosted tools, and shutdown timing are what usually turn a cheap-looking option into the wrong production path. Scan the pressure points first, then expand the evidence tables only when a decision needs proof.

This section now works as a second reading spine. It is for deciding whether the estimate still holds under context limits, tool billing, and lifecycle risk.

Context fit

The cheaper row can still fail on usable context or tool coverage.

gpt-5.4 keeps a 1,048,576-token context window and the widest built-in tool set in the current summary rows, while gpt-5-mini narrows context to 400,000 tokens.

Open limits brief

Hosted tools

OpenAI-hosted tools create separate bill lines before token spend is finished.

Web search, file search storage, file search tool calls, and code interpreter runtime each add non-trivial cost pressure outside the base model row.

Open container pricing

Shelf life

A low current price is not a win if the path is already on a shutdown clock.

Assistants, Realtime beta, preview audio and realtime aliases, and legacy GPT snapshots all need lifecycle checks before they are treated as stable choices.

Constraint evidence

Expand only the evidence table that answers the current risk.

This keeps the page readable when the answer is obvious, while leaving the source-shaped tables in place for a harder review.

Limits and context coverage

Model availability and usable limits belong next to price because they determine whether the cheaper row is actually usable.

Model | Context window | Max output | Built-in tools
gpt-5.4 | 1,048,576 tokens | 128,000 tokens | Functions, web search, file search, skills, image generation, code interpreter, MCP
gpt-5-mini | 400,000 tokens | 128,000 tokens | Functions, web search, file search, MCP
gpt-4.1 | 1,047,576 tokens | 32,768 tokens | Functions, fine-tuning, web search, file search, image generation

A lower-cost row is not the better choice if it misses the context size or tool path the app actually needs.
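That fit check can be expressed directly: filter the summary rows by required context and output before comparing price. A sketch over the three rows above, with an assumed workload of 600K context tokens and 40K output tokens:

```python
# (context window tokens, max output tokens) from the limits table above.
LIMITS = {
    "gpt-5.4": (1_048_576, 128_000),
    "gpt-5-mini": (400_000, 128_000),
    "gpt-4.1": (1_047_576, 32_768),
}

def models_that_fit(needed_context: int, needed_output: int) -> list[str]:
    """Return the models whose published limits cover the workload."""
    return [
        model
        for model, (ctx, out) in LIMITS.items()
        if ctx >= needed_context and out >= needed_output
    ]

# 600K-token context rules out gpt-5-mini; a 40K output cap rules out gpt-4.1.
fits = models_that_fit(600_000, 40_000)  # ["gpt-5.4"]
```

Price comparison only starts after this filter: a row that fails it has no price worth comparing.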

Tool-cost coverage

Hosted tool charges materially change the OpenAI bill and still need to stay separate from the model-family pricing matrix.

Tool | Current cost | Notes
Web search | $10 per 1K calls | Search content tokens are billed at the model's token rates for all models.
File search storage | $0.10 per GB per day | The first 1 GB is free.
File search tool calls | $2.50 per 1K calls | Applies to the Responses API path.
Code interpreter container | $0.03 per container | Starting March 31, 2026, usage shifts to $0.03 per container per 20 minutes.

If a workflow uses OpenAI-hosted tools, the budgeting question is tokens plus model-family rows plus tool rows plus runtime behavior.
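A sketch of that hosted-tool line, built from the rates in the table above with assumed illustrative volumes (token spend and container runtime tracked separately):

```python
def hosted_tool_cost(
    web_search_calls: int,
    file_search_calls: int,
    storage_gb: float,
    days: int = 30,
) -> float:
    """Monthly hosted-tool dollars from the listed web/file-search rates."""
    web = web_search_calls / 1_000 * 10.00        # $10 per 1K calls
    calls = file_search_calls / 1_000 * 2.50      # $2.50 per 1K calls
    billable_gb = max(storage_gb - 1.0, 0.0)      # first 1 GB is free
    storage = billable_gb * 0.10 * days           # $0.10 per GB per day
    return web + calls + storage

# 5K web searches, 40K file-search calls, 30 GB stored for 30 days:
monthly = hosted_tool_cost(5_000, 40_000, 30.0)  # 50 + 100 + 87 ~= $237
```

On these assumed volumes, the tool line alone exceeds many model-token lines, which is the point of keeping it out of the model row.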

Deprecation coverage

Shutdown dates and replacement paths belong in the same view as price whenever the integration has a shelf life.

System | Shutdown date | Replacement | Notes
Selected legacy GPT snapshots | March 26, 2026 | gpt-5 or gpt-4.1 family | OpenAI places legacy GPT-4 snapshot IDs and older preview aliases on a March 26, 2026 shutdown path, with replacement guidance pointing to the gpt-5 or gpt-4.1 family.
gpt-4o preview audio and realtime models | May 7, 2026 | gpt-realtime-1.5, gpt-realtime-mini, gpt-audio-1.5, or gpt-audio-mini | OpenAI has also dated the shutdown of several gpt-4o preview realtime and audio models, with replacement paths now pointing to the current realtime and audio families.
Realtime API beta | May 7, 2026 | Realtime API | OpenAI marks the beta Realtime interface deprecated and routes teams to the GA Realtime API before the shutdown date.
Assistants API | August 26, 2026 | Responses API plus Conversations API | OpenAI marks the Assistants API deprecated and directs new work toward the Responses API plus the Conversations API.

A lower current price does not reduce risk if the chosen API path or snapshot is already on a shutdown clock.

Sources of record

Verify the exact row only after the main read has identified the question.

This list stays narrow on purpose. It sits after cost and constraint review so the docs support the decision instead of replacing the page flow.

Pricing

API pricing

Use this when a hosted tool charge, model row, or token rule needs confirmation against the source of record.

Open official page
Models

Models guide

Use this when a cheap-looking model still needs validation against support, availability, or family fit.

Open official page
Deprecations

Deprecations

Use this when the pricing question is no longer enough because the current path may already be on a shutdown clock.

Open official page
Changelog

Developer changelog

Use this when you need to confirm whether the change is newly published and still active in the current API surface.

Open official page

Workload compare preview

A worked example should tell a team what matters before a calculator exists.

This sample compare uses one realistic OpenAI workload to show where model price helps, where hosted tools dominate, and what should be checked before switching downmarket.

Worked example

Keep this section as a worked example. It should teach the cost shape of a real decision without pretending a live calculator is already shipped.

Monthly workload

20M input tokens and 4M output tokens on a current Responses API path.

Hosted retrieval

40,000 file search tool calls per month with an average 30 GB vector-store footprint.

Compared options

gpt-5.4 short context versus gpt-5-mini on the same workload.

Lifecycle posture

No dated shutdown pressure in this sample, so the call is cost and support fit rather than migration timing.

Model option

gpt-5.4

~$297 per month

Model spend

~$110 model spend plus the same file-search envelope.

Tool cost exposure

~$187 in hosted retrieval cost, driven by 40,000 tool calls and 30 GB average storage.

Constraint risk

Lower support risk in this sample because the current summary rows keep the widest context window and tool coverage.

Recommended next check

Use this path only if the larger context window or broader built-in tool set is actually required by the app.

Model option

gpt-5-mini

~$200 per month

Model spend

~$13 model spend plus the same file-search envelope.

Tool cost exposure

~$187 in hosted retrieval cost, which means the model swap saves less than the retrieval layer costs.

Constraint risk

Higher fit risk because the current summary rows narrow context to 400,000 tokens even though file search support remains in place.

Recommended next check

Choose this row only after confirming the smaller context window and lighter tool envelope still support the actual workload.

Estimated monthly cost

The swap from gpt-5.4 to gpt-5-mini saves about $97 per month in this sample, but most of the bill still comes from file search rather than model tokens.

Tool cost exposure

Hosted retrieval is the first optimization target because it stays roughly $187 per month on both options before token differences are even considered.

Recommended next check

Validate whether the workload truly needs gpt-5.4-level context and tool breadth; if not, the cheaper model can work, but the larger savings may come from reducing retrieval calls or storage footprint.

This is a sample compare, not a live calculator. It combines the current page's published model rows with the listed file-search storage and tool-call charges to show how a real estimate can tilt.
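The arithmetic behind the sample compare can be reproduced from the published rows. A sketch using the page's own figures, assuming a 30-day storage month:

```python
def model_cost(input_m: float, output_m: float, in_rate: float, out_rate: float) -> float:
    """Token spend: millions of tokens times the $/1M rates."""
    return input_m * in_rate + output_m * out_rate

def file_search_cost(tool_calls: int, storage_gb: float, days: int = 30) -> float:
    """Listed file-search charges: $2.50/1K calls, $0.10/GB/day after 1 free GB."""
    calls = tool_calls / 1_000 * 2.50
    storage = max(storage_gb - 1.0, 0.0) * 0.10 * days
    return calls + storage

retrieval = file_search_cost(40_000, 30.0)              # 100 + 87 ~= $187

gpt_5_4 = model_cost(20, 4, 2.50, 15.00) + retrieval    # 110 + 187 ~= $297
gpt_5_mini = model_cost(20, 4, 0.25, 2.00) + retrieval  # 13 + 187 ~= $200
savings = gpt_5_4 - gpt_5_mini                          # ~= $97
```

The retrieval term is identical on both rows, which is why the model swap caps out at about $97 while the $187 retrieval envelope stays the larger target.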

Next move

Use the current OpenAI readout to close the next decision, not to start another round of doc hunting.

These checks are meant to turn the current facts into a concrete budget, comparison, or migration call.

Step 1

Start with the rows you would actually pay.

If the workflow uses web search, file search, or code interpreter, add those charges before comparing OpenAI to another provider.

Step 2

Check model limits before choosing by price.

Context window, output cap, and tool support can invalidate a low-cost option before it reaches production.

Step 3

Treat shutdown dates as part of the selection process.

Before committing to an API path, confirm whether it is current, deprecated, or already on a migration timeline.

Popular decisions

These pages answer the queries most likely to become their own search session: a direct model compare, a cheapest-fit use case, and a transparent pricing calculator.

Model compare

GPT-5.4 vs GPT-5 mini

Use the side-by-side brief when the decision depends on price, context, output caps, and tool support together.

Open GPT-5.4 vs GPT-5 mini

Use case

Cheapest OpenAI model for extraction

Open the extraction brief when the real question is cheapest viable structured extraction, not cheapest generic chat row.

Open extraction recommendation

Calculator

OpenAI API pricing calculator

Run a transparent estimate with token, search, file, and container inputs when the budget now needs a line-item number.

Open pricing calculator

OpenAI work queue

Next pages to add to the OpenAI coverage.

Each one is chosen for a specific pricing, tool-cost, fit, or migration question that still needs its own page.

The OpenAI work queue is clear for now. New entries will be added when another cost, fit, or lifecycle question needs dedicated coverage.

Continue the site

Keep moving through the decision from here.

Use the groups below to move laterally through the decision, not back out into another doc hunt.

Related pages

Stay in the same decision neighborhood instead of backing out to search.

Pricing / Costs

Model pricing, hosted-tool costs, and fit constraints that materially change the operating estimate.

Open page

Deprecations / Migrations

Shutdown dates, migration paths, and replacement decisions grouped into live provider clusters.

Open page

Comparisons

Side-by-side model comparisons and scenario recommendation pages for cost-sensitive decisions.

Open page

Compare pages

Open the pages that turn this topic into a side-by-side decision.

GPT-5.4 vs GPT-5 mini

Side-by-side comparison of GPT-5.4 and GPT-5 mini across price, fit, and tool pressure.

Open page

Cheapest OpenAI model for extraction

Scenario recommendation page for choosing the cheapest workable OpenAI extraction model.

Open page

Replacement pages

Use the likely substitutes, migration targets, or fallback choices as the next click.

GPT-5.4 context and tool support

Limits brief for GPT-5.4 versus GPT-5 mini context windows, output caps, and tool support.

Open page

OpenAI API pricing calculator

Interactive calculator for model tokens, hosted tools, and runtime in one estimate.

Open page

Source category pages

Trace the source families behind this page instead of opening random docs in isolation.

Pricing sources

Official pricing pages used to support model, tool-cost, and calculator estimates.

Open page

Model sources

Official model pages used for context windows, output caps, and built-in tool coverage.

Open page

Deprecation and migration sources

Official shutdown, migration, and replacement references behind lifecycle pages.

Open page

Changelog sources

Official changelog sources used to confirm whether a pricing or lifecycle change is current.

Open page