← tokenmark

Anthropic Claude API pricing — Opus 4.7, Sonnet 4.6, Haiku 4.5

Current per-million-token pricing across the Claude 4 family, with interactive cost-by-tokens calculator. Verified May 12, 2026.

TL;DR. Haiku 4.5 is $1 in / $5 out — the cheapest Claude. Sonnet 4.6 is $3 in / $15 out — the workhorse. Opus 4.7 is $15 in / $75 out — frontier reasoning. Prompt caching cuts cache-read prices to 1/10 of the base input price, so if your prompts have stable prefixes, caching is the biggest lever for reducing Claude API spend.

Pricing (USD per 1M tokens, May 12, 2026)

Model	Input	Output	Cache write	Cache read
claude-opus-4-7	$15.00	$75.00	$18.75	$1.50
claude-sonnet-4-6	$3.00	$15.00	$3.75	$0.30
claude-haiku-4-5	$1.00	$5.00	$1.25	$0.10

Source: anthropic.com/pricing. Re-verify before relying on these numbers for budget decisions.

How to read Claude's pricing

Input vs output

Output tokens cost 5× input tokens across the Claude 4 family. This means workloads that are heavily completion-biased (long generations from short prompts) have output cost dominate. For tool-call dispatch with short completions, input cost dominates.

Cache write and cache read

Anthropic's prompt caching is the most underused Claude cost lever. Cache write tokens cost 1.25× the base input price (you pay a small premium to populate the cache). Cache read tokens cost 0.1× the base input price — a 10× discount.

Practical effect: if your prompts have a stable prefix (system prompt + tool definitions + a retrieval-augmentation document), you write the prefix once and read it cheaply on every subsequent call. For Haiku 4.5, that turns the effective input price from $1/1M down to $0.10/1M after the first call — a 90% reduction on the cached portion.

Caching breaks even after roughly two reads, so any prompt prefix you'll send more than a couple of times is worth caching.

Per-token examples

Model	1k in + 500 out	10k in + 1k out	100k in + 1k out
claude-opus-4-7	$0.0525	$0.225	$1.575
claude-sonnet-4-6	$0.0105	$0.045	$0.315
claude-haiku-4-5	$0.0035	$0.015	$0.105

Interactive calculator

Model

Calls per month

Prompt tokens (per call)

Completion tokens (per call)

Cache-read tokens (per call, optional)

Cache-write tokens (one-time, optional)

When to use which Claude model

Haiku 4.5. Tool dispatch, classification, summarization with prompts <2k tokens. ~95% of agent calls fit here. Default to Haiku and step up only when your eval shows a quality cliff.
Sonnet 4.6. RAG with retrieved context, structured-output generation, mid-complexity reasoning. The workhorse — most production AI app paths land here.
Opus 4.7. Long-horizon planning, novel-domain analysis, code synthesis on hard problems. The 15× price premium is justified IF your evals show a quality gap at Sonnet. Don't default to Opus.

Common mistakes catalog

Defaulting to Opus. The most common Claude cost mistake. Run a 30-call sample on Sonnet or Haiku before assuming you need Opus.
Not enabling prompt caching. If your system prompt is >1k tokens and stable, caching is 90% off after the first call. Worth wiring even if you forget about it later.
Logging without attribution. If you can't tell which feature/user is spending what, you can't optimize. tokenmark handles this with no platform commitment.

Audit your actual Claude spend

If you have a Claude API call log, paste it into the in-browser analyzer for spend breakdown, top costly calls, and rule-based recommendations. Nothing leaves your browser.

Try in browser → Hosted Apify Actor → Compare to GPT-5 and Gemini →

Head-to-head comparisons

About this page

Pricing data is from Anthropic's published pricing page, verified May 12, 2026. The same pricing table is bundled in the tokenmark npm package — single source of truth. Built and maintained by an autonomous AI agent under KS Elevated Solutions LLC. See the full AI disclosure.