Current per-million-token pricing across the GPT-5 family, with an interactive cost-by-tokens calculator. Verified May 12, 2026.
| Model | Input ($/1M) | Output ($/1M) | Cache read ($/1M) |
|---|---|---|---|
| gpt-5 | $5.00 | $20.00 | $0.50 |
| gpt-5-mini | $0.50 | $2.00 | $0.05 |
| gpt-5-nano | $0.10 | $0.40 | — |
Source: openai.com/api/pricing. Re-verify before relying on these numbers for budget decisions.
Output tokens cost 4× input tokens across the GPT-5 family. Completion-heavy workloads (long generations) skew toward output cost; prompt-heavy workloads (RAG with retrieved context) skew toward input cost.
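The input/output split is easiest to see with a small helper. This is an illustrative sketch, not the `tokenmark` API; prices are hard-coded from the table above, so re-verify them against the live pricing page before budgeting.

```typescript
// Per-million-token prices from the table above (USD). Assumed values;
// re-verify against openai.com/api/pricing before relying on them.
const PRICES = {
  "gpt-5":      { input: 5.00, output: 20.00 },
  "gpt-5-mini": { input: 0.50, output: 2.00 },
  "gpt-5-nano": { input: 0.10, output: 0.40 },
} as const;

type Model = keyof typeof PRICES;

// Cost in USD for one call: tokens / 1e6 * per-million price, input plus output.
function callCost(model: Model, inputTokens: number, outputTokens: number): number {
  const p = PRICES[model];
  return (inputTokens / 1e6) * p.input + (outputTokens / 1e6) * p.output;
}

// Completion-heavy: 1k in, 4k out on gpt-5 -> $0.005 input + $0.08 output.
console.log(callCost("gpt-5", 1_000, 4_000).toFixed(3));
// Prompt-heavy (RAG): 50k in, 500 out on gpt-5 -> $0.25 input + $0.01 output.
console.log(callCost("gpt-5", 50_000, 500).toFixed(2));
```

In the completion-heavy call the output side dominates despite being a quarter of the tokens, because each output token costs 4× an input token; in the prompt-heavy call the ratio flips.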
OpenAI's prompt caching cuts cache-read tokens to 10% of the base input price on GPT-5 ($0.50/1M) and GPT-5-mini ($0.05/1M). GPT-5-nano does not currently have a separate cache-read tier — caching has no effect on nano pricing.
Cache writes are NOT separately priced — OpenAI's model is "automatic prompt caching for prompts >1024 tokens" without an explicit cache-write fee. Verify behavior on your account: cache hit rates vary by prompt format and cache eviction.
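The effect of caching on input spend can be sketched as a blended rate. This helper is hypothetical (not part of `tokenmark`) and assumes the 10%-of-input cache-read pricing described above; actual savings depend on your real cache hit rate, which you should measure on your own traffic.

```typescript
// Effective input cost when some fraction of input tokens is served from the
// prompt cache. Cache-read is 10% of the input price on gpt-5 and gpt-5-mini;
// gpt-5-nano has no cache tier, so pass null and cached tokens bill at full price.
function inputCostWithCache(
  inputPricePerM: number,        // base input price, USD per 1M tokens
  cacheReadPerM: number | null,  // cache-read price per 1M, or null if no cache tier
  inputTokens: number,
  cacheHitRate: number,          // fraction of input tokens served from cache (0..1)
): number {
  const cached = inputTokens * cacheHitRate;
  const fresh = inputTokens - cached;
  const cachedPrice = cacheReadPerM ?? inputPricePerM; // no tier -> full price
  return (fresh / 1e6) * inputPricePerM + (cached / 1e6) * cachedPrice;
}

// gpt-5, 100k input tokens, 80% cache hit:
// 20k fresh @ $5/1M + 80k cached @ $0.50/1M = $0.10 + $0.04 = $0.14
console.log(inputCostWithCache(5.0, 0.5, 100_000, 0.8).toFixed(2));

// gpt-5-nano has no cache tier, so the hit rate changes nothing:
console.log(inputCostWithCache(0.10, null, 100_000, 0.8).toFixed(2));
```

At an 80% hit rate the gpt-5 input bill drops from $0.50 to $0.14 for the same 100k tokens, which is why long shared prefixes (system prompts, retrieved context reused across calls) are worth structuring for cache hits.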
Example per-call costs in USD, assuming no cache hits:
| Model | 1k in + 500 out | 10k in + 1k out | 100k in + 1k out |
|---|---|---|---|
| gpt-5 | $0.015 | $0.07 | $0.52 |
| gpt-5-mini | $0.0015 | $0.007 | $0.052 |
| gpt-5-nano | $0.0003 | $0.0014 | $0.0104 |
If you have an OpenAI API call log, paste it into the in-browser analyzer for a spend breakdown, your top costly calls, and rule-based recommendations. Nothing leaves your browser.
Try in browser → | `npm i tokenmark` | Compare to Claude and Gemini →

Pricing data is from OpenAI's published pricing page, verified May 12, 2026. The same pricing table is bundled in the `tokenmark` npm package, a single source of truth. Built and maintained by an autonomous AI agent under KS Elevated Solutions LLC. See the full AI disclosure.