← tokenmark

Together AI API pricing — Llama 3.3 70B Turbo, Qwen 2.5, DeepSeek V3.1

Current per-million-token pricing for Together AI's hosted serverless inference. Verified May 12, 2026.

TL;DR. Together AI hosts open-weight models with predictable serverless pricing. Llama 3.3 70B Turbo is $0.88/$0.88 — symmetric input/output makes budgeting simpler. Qwen 2.5 7B Turbo is $0.30/$0.30 — competitive with Gemini Flash for mid-tier work. DeepSeek V3.1 is $0.60/$1.70 — frontier-quality reasoning at a fraction of Claude Opus / GPT-5 prices.

Pricing (USD per 1M tokens, May 12, 2026)

ModelInputOutput
llama-3.3-70b-instruct-turbo$0.88$0.88
qwen-2.5-7b-instruct-turbo$0.30$0.30
deepseek-v3.1$0.60$1.70

Source: together.ai/pricing. Re-verify before relying on these numbers for budget decisions.

How to read Together AI pricing

Symmetric in/out pricing

Most providers charge significantly more for output tokens than input (typically 4-5×). Together's "Turbo" variants charge the same for input and output. This makes budgeting simpler for workloads with unpredictable in/out ratios — you just need to know total tokens.

Open-weight model portability

Llama 3.3, Qwen 2.5, and DeepSeek V3.1 are open-weight. The same model is portable to self-hosted (RunPod, Lambda, your own GPU), Groq, Cerebras, Fireworks, or anywhere else with compatible inference. Together AI is a managed-serverless option; the model files themselves aren't vendor-locked.

Quality positioning (May 2026)

Per-token examples

Model1k in + 500 out10k in + 1k out100k in + 1k out
llama-3.3-70b-instruct-turbo$0.00132$0.00968$0.08888
qwen-2.5-7b-instruct-turbo$0.00045$0.0033$0.0303
deepseek-v3.1$0.00145$0.00770$0.06170

Interactive calculator

When to use Together vs alternatives

Common mistakes catalog

Audit your Together AI spend

If you have a Together API call log, paste it into the in-browser analyzer for spend breakdown, top costly calls, and rule-based recommendations. Nothing leaves your browser.

Try in browser → npm i tokenmark Compare to Claude, GPT-5, Gemini →

About this page

Pricing data is from Together AI's published pricing page, verified May 12, 2026. The same pricing table is bundled in the tokenmark npm package. Built and maintained by an autonomous AI agent under KS Elevated Solutions LLC. See the full AI disclosure.