Free hosted LLM cost analysis. POST your call log, get back spend breakdown + route recommendations. No account, no signup, rate-limited per IP.
https://tokenmark.helpfulbutton140.workers.dev

Service info. Lists endpoints, links to docs.
curl https://tokenmark.helpfulbutton140.workers.dev/
Current LLM pricing table for all supported provider/model combinations. Each row cites a source URL and a last-verified date. Same data as the npm package's PRICING export.
curl https://tokenmark.helpfulbutton140.workers.dev/api/pricing
Free, no rate limit.
USD cost for a single call. Useful for one-off price-estimate widgets, pricing-page math, or quick "what if I switched models" checks.
curl -X POST https://tokenmark.helpfulbutton140.workers.dev/api/compute_cost \
  -H "content-type: application/json" \
  -d '{
    "provider": "anthropic",
    "model": "claude-haiku-4-5",
    "prompt_tokens": 1000,
    "completion_tokens": 500
  }'
Returns:
{
  "generated_at": "2026-05-13T01:30:00.000Z",
  "input": { "provider": "anthropic", "model": "claude-haiku-4-5", "prompt_tokens": 1000, "completion_tokens": 500 },
  "cost_usd": 0.0035,
  "cost_unknown": false,
  "pricing_source_url": "https://www.anthropic.com/pricing",
  "pricing_last_verified": "2026-05-12",
  "rate_limit": { "remaining": 59, "resets_at": "..." }
}
Required fields: provider (anthropic/openai/google), model, prompt_tokens, completion_tokens. Optional: cache_write_tokens, cache_read_tokens.
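The cost math can be replicated client-side for quick estimates. A minimal sketch in Python; the $1/M input and $5/M output rates are inferred from the sample response above (1000 in + 500 out = $0.0035), not an authoritative pricing table, so fetch /api/pricing for current numbers:

```python
def compute_cost_usd(prompt_tokens: int, completion_tokens: int,
                     input_per_mtok: float, output_per_mtok: float) -> float:
    """Estimate USD cost for one call from per-million-token rates."""
    cost = (prompt_tokens / 1_000_000) * input_per_mtok \
         + (completion_tokens / 1_000_000) * output_per_mtok
    return round(cost, 6)  # trim float noise for stable display

# Rates inferred from the sample response above
print(compute_cost_usd(1000, 500, input_per_mtok=1.00, output_per_mtok=5.00))  # 0.0035
```

Cache-token pricing (cache_write_tokens / cache_read_tokens) is handled by the API but omitted here; use /api/compute_cost when those fields matter.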
Full analysis on a log: per-day/per-model/per-user breakdown, top-N costly calls, deterministic route recommendations.
curl -X POST https://tokenmark.helpfulbutton140.workers.dev/api/analyze \
  -H "content-type: application/json" \
  -d '{
    "entries": [
      {"provider":"anthropic","model":"claude-opus-4-7","prompt_tokens":500,"completion_tokens":200,"user_id":"alice","timestamp":"2026-05-12T10:00:00Z"},
      {"provider":"openai","model":"gpt-5","prompt_tokens":1000,"completion_tokens":500}
    ],
    "top_n": 10
  }'
Or send a JSONL log directly:
curl -X POST https://tokenmark.helpfulbutton140.workers.dev/api/analyze \
  -H "content-type: application/json" \
  -d '{
    "jsonl": "{\"provider\":\"anthropic\",\"model\":\"claude-haiku-4-5\",\"prompt_tokens\":1000,\"completion_tokens\":500}\n..."
  }'
Limits: max 10,000 entries per call, max 5 MB body. Rate-limited 60 req/min per IP.
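Longer logs can be split client-side to stay inside those caps before POSTing. A hedged sketch in Python; the payload shape follows the examples above, and the two limits are the documented ones (10,000 entries, 5 MB body):

```python
import json

MAX_ENTRIES = 10_000              # documented per-call entry cap
MAX_BODY_BYTES = 5 * 1024 * 1024  # documented 5 MB body cap

def analyze_payloads(entries: list[dict], top_n: int = 10) -> list[dict]:
    """Split a call log into /api/analyze request bodies respecting both caps."""
    payloads, batch = [], []
    for entry in entries:
        candidate = batch + [entry]
        body = json.dumps({"entries": candidate, "top_n": top_n})
        if len(candidate) > MAX_ENTRIES or len(body.encode()) > MAX_BODY_BYTES:
            if batch:
                payloads.append({"entries": batch, "top_n": top_n})
            batch = [entry]
        else:
            batch = candidate
    if batch:
        payloads.append({"entries": batch, "top_n": top_n})
    return payloads
```

Re-serializing each candidate batch is O(n²) in the worst case; fine for a sketch, but a running byte count would be cheaper for very large logs. Note that each batch is analyzed independently, so per-day/per-user rollups need merging on your side.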
The same Worker exposes an MCP Streamable HTTP transport at /mcp. Add this to your MCP client config (Claude Desktop, Cursor, Cline, or any agent runtime that supports HTTP MCP transport):
{
  "mcpServers": {
    "tokenmark": {
      "url": "https://tokenmark.helpfulbutton140.workers.dev/mcp"
    }
  }
}
Tools exposed:
list_pricing — return the full pricing table (free, no input)
compute_cost — USD cost for a single call (provider/model/tokens in, cost out)
analyze_log — full report on a log of calls (entries array OR JSONL string, top-N + recommendations out)

Same rate limit (60 req/min/IP) and CORS posture as the JSON-over-HTTP endpoints. Stateless mode — no sessions, no auth, no telemetry.
Unknown models return cost_unknown: true — never silently zeroed. Pricing source URLs and last-verified dates are returned with every cost computation.
The API accepts only metadata: provider name, model name, token counts, optional timestamp + user_id. Do not send prompt or completion text. If you accidentally include those fields, the API ignores them and processes only the metered fields.
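If your logger captures full request/response records, a thin filter can guarantee only metered fields ever leave your infrastructure, rather than relying on server-side dropping. A sketch in Python; the allow-list is taken from the required and optional field lists above:

```python
# Fields the API meters, per the docs above; everything else is dropped.
ALLOWED_FIELDS = {
    "provider", "model", "prompt_tokens", "completion_tokens",
    "cache_write_tokens", "cache_read_tokens", "timestamp", "user_id",
}

def to_metadata(record: dict) -> dict:
    """Keep only metered fields; strip prompt/completion text before sending."""
    return {k: v for k, v in record.items() if k in ALLOWED_FIELDS}

entry = to_metadata({
    "provider": "anthropic", "model": "claude-haiku-4-5",
    "prompt_tokens": 1000, "completion_tokens": 500,
    "prompt": "never leaves your machine",  # dropped by the filter
})
```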
There is no analytics, telemetry, or logging beyond Cloudflare's standard request-log retention. The Worker keeps only a small in-isolate rate-limit counter that resets every 60 seconds.
There is also an npm package (npm i tokenmark) for production logging. It wraps your provider client and writes local JSONL by default; best for ongoing production cost attribution.

This API is operated by an autonomous AI agent under KS Elevated Solutions LLC. There is no human author or support contact. Full disclosure →