
AI API cost calculator — vibe coding, agents & production apps

Struggling to figure out what Cursor, Windsurf, Claude Code, OpenClaw, or Cline actually costs you per month on your own API key? This calculator turns tokens-per-prompt into real dollars — with presets calibrated from real-world data, not guesses. Pick a preset, choose a model from 16 providers' catalogs — GPT-4.1, Claude Sonnet 4, DeepSeek V3, Gemini 2.5 Pro, Grok, and more — then see your monthly estimate next to flat-rate subscription breakevens. All math runs in your browser; no signup, no data sent anywhere. New to tokens? Read the plain-English explainer ↓

Token counter · JSON formatter

Quick start: Pick the preset that matches your tool — Cursor/Windsurf, Claude Code, OpenClaw, or Cline. Each preset is calibrated from real reported data (Anthropic official docs, Cursor forum, GitHub issues). Then adjust prompts per day to your actual usage and check the subscription breakeven panel. New to tokens and caching? Read the plain-English explainer ↓ · Count your tokens →

Scenario

Presets

0% (system prompt + repeated file context)

Blend (optional)

Routing

40% economy / 60% premium

When blend is used, the blended API line is the weighted average of each model’s estimated monthly $ at your current tokens/message and volume (not a token-level price table).

Compare

Model B

Volume and token assumptions are shared with the primary column.

Add-ons

Embeddings & images

Embedding row uses OpenAI text-embedding-3-small list price; verify on OpenAI’s pricing page.

References

Sources (verify before relying)

Subscription tool pricing for comparison: Cursor · Windsurf · GitHub Copilot

Advanced: browser hours & Browserbase (third-party example)

Uses Browserbase public pricing (Developer-style: monthly platform + included hours + overage). This is not CloudyBot’s internal cost.

Browser host estimate (illustrative): $0.00/mo

LLM API mid (incl. embeddings/images if set) + browser illustrative: $0.00/mo

CloudyBot includes browser minutes per pricing — compare those caps to your DIY stack needs.

Share

Export & share

URL updates as you edit — link shares your exact scenario.
Glossary — what these terms mean for Cursor / Claude Code users
  • Input tokens — everything you send: system prompt, file contents, conversation history, tool results. In Cursor/Claude Code, this is usually 70–90% of your total tokens.
  • Output tokens — what the model writes back. Code edits, explanations, responses. Much smaller, but 3–5× more expensive per token at most providers.
  • Cached input (prompt cache) — if the same prefix (e.g. system prompt + a large file) appears in multiple turns, providers discount re-reading it. Anthropic: ~10% of base price. OpenAI: ~50% off. Must be explicitly enabled; not automatic in all tools.
  • Tool round-trips — each time a coding agent calls a tool (read file, write file, run bash) and gets a result back, that's an extra model call with fresh input tokens. 4 tool calls per user message = 5× the API calls.
  • Context window — max tokens a model can see at once. Claude Sonnet 4: 200K. GPT-4.1: 1M. Filling a context window can cost $0.60–$15 per call depending on model.
  • Per 1M tokens — standard unit for API pricing. $3/1M input = $0.000003 per input token = $0.003 per 1K tokens.
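The "per 1M tokens" conversion above can be checked in two lines (the $3/M rate is the glossary's example, not a live quote):

```python
# The "per 1M tokens" unit from the glossary, converted down.
price_per_m = 3.00                   # $3 per 1M input tokens
per_token = price_per_m / 1_000_000  # $0.000003 per token
per_1k    = price_per_m / 1_000      # $0.003 per 1K tokens
print(per_token, per_1k)  # 3e-06 0.003
```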

What we did not model (important for Claude Code users)

  • Anthropic cache write cost — writing to cache costs ~125% of base input price (a one-time write per unique prefix). This calculator only models the cheaper cache read. For new sessions that haven’t cached yet, real cost is higher.
  • Thinking / extended output tokens — o3, o4-mini, DeepSeek R1 and other reasoning models generate hidden “thinking” tokens billed as output. At $40/M output for o3, a single reasoning chain can cost $0.05–0.50.
  • Enterprise discounts, prepaid credits, free trial credits, taxes.
  • Regional pricing / FX.
  • Annual prepay: many vendors offer 10–20% off list for annual commitment.
  • Cursor / Windsurf / Copilot model routing — subscriptions internally route to different models depending on load; your actual model may differ from what you expect.

Wrong price? Open an issue · AI token counter (paste → rough tokens) · Source · GitHub Pages

How it works

Each preset is calibrated from real-world data — Anthropic official benchmarks, Cursor forum reports, Cline GitHub issues. Pick a preset → adjust prompts per day → read the subscription breakeven table. The table now includes CloudyBot flat-rate plans alongside Cursor Pro, Claude Pro, and GitHub Copilot so you can compare all your options in one place. New to tokens and caching? The plain-English explainer below covers everything from scratch. Pricing verified 2026-04-17 with links to each provider’s pricing page.

Who this calculator is for

You calculated the cost. Now cap it.

API billing scales with every prompt, every tool call, every cache miss — and there's no ceiling. CloudyBot runs AI tasks on a flat monthly plan with hard caps: chat, browse real websites, automate workflows, connect to GitHub, manage files, run on a schedule. Same kind of work. Predictable price. Service pauses at the cap — no surprise invoices, ever.

Try CloudyBot free — hard caps, no overages See what CloudyBot does →

Free plan · 30 AI Tasks/mo · No credit card · Not a replacement for Cursor or Claude Code

AI costs explained simply — no jargon

Not familiar with tokens, caching, or API pricing? Here's everything you need to know, explained with plain analogies.

🧮 What is a "token"?

A token is roughly ¾ of a word. "Hello world" = 2 tokens. "authentication" = 4 tokens. A line of code like const user = await db.getUser(id); is about 12 tokens. Every word you type and every word the AI writes back gets counted as tokens — that's what you're billed for.

Real example: You type "fix this bug" (4 tokens). But Cursor also silently attaches your open files (maybe 8,000 tokens), its own system instructions (2,000 tokens), and conversation history (5,000 tokens). So the AI actually receives ~15,000 tokens even though you typed three words. That's normal — and that's why costs add up faster than people expect.
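The tally in that example, as a sketch (the token counts are the illustrative figures from the paragraph, not measured values):

```python
# Tally the hidden context a tool like Cursor attaches to a short prompt.
typed_prompt  = 4      # "fix this bug"
open_files    = 8_000  # attached file contents
system_prompt = 2_000  # the tool's own instructions
history       = 5_000  # prior conversation turns

total_input = typed_prompt + open_files + system_prompt + history
print(total_input)  # 15004 tokens actually sent for 3 typed words
```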

💸 How does billing work?

AI providers charge separately for input tokens (everything you send) and output tokens (everything the AI writes back). Output tokens cost 3–5× more per token on most models, because generating text is harder than reading it.

Think of it like a taxi: the ride starts the moment you get in (input cost), but the meter ticks faster on the highway (output cost). A long conversation where you paste a 500-line file and the AI rewrites the whole thing? That's an expensive trip.
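The taxi example in numbers — a hypothetical trip assuming Sonnet-class list rates ($3/M in, $15/M out) and made-up token counts:

```python
# Input and output are billed separately, at different per-token rates.
in_tokens, out_tokens = 20_000, 2_500  # pasted file + full rewrite (assumed)
rate_in, rate_out = 3.00, 15.00        # $/M, example list rates

input_leg  = in_tokens * rate_in / 1_000_000    # $0.06
output_leg = out_tokens * rate_out / 1_000_000  # $0.0375
trip = input_leg + output_leg
# Fewer output tokens, but each one costs 5x more here.
print(f"${trip:.4f}")  # $0.0975
```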

🔄 What is caching — and why does it matter so much?

Imagine a photocopier that charges full price for the first copy, but 10 cents on the dollar for reprints of the same page. That's prompt caching.

The first time Cursor sends your codebase to the AI, you pay full price. On the next prompt (in the same session), those same file contents are "cached" — you pay only 10% of the original price to re-read them. This is why the Cursor preset has 80% cache: 80% of your huge token volume is cheap reprints, not full-price reads. Without caching, Cursor would cost 5–10× more.

The catch: caching only works within a session. Close Cursor and reopen it — the cache is cold again. That first prompt of the day costs full price.
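A sketch of the warm-vs-cold difference, using the Anthropic-style rates quoted elsewhere on this page ($3/M base input, $0.30/M cache read) and an assumed 40K-token context:

```python
# Cold cache vs warm cache for re-reading the same context.
context_tokens = 40_000          # assumed codebase context per prompt
base_rate  = 3.00 / 1_000_000    # full-price input, per token
cache_rate = 0.30 / 1_000_000    # cache read: 10% of base

first_prompt  = context_tokens * base_rate   # cold: $0.12
later_prompts = context_tokens * cache_rate  # warm: $0.012
print(f"cold ${first_prompt:.3f}, warm ${later_prompts:.3f}")
```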

🤖 Why do coding agents (Claude Code, Cline) cost so much more?

When you type one message to Claude Code, it doesn't just answer once. It runs a loop: read the file → think → write a change → run the tests → read the output → think again → fix the error → done. Each step in that loop is a separate API call, each with its own tokens. 4 loops per message = 5× the API calls for that one prompt you sent.

Claude Code also carries a mandatory 14,328-token overhead in every single call — that's its system instructions and tool definitions, sent every time, before your code is even mentioned. At $3/M input, that's about $0.04 just to start each turn. It adds up.
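Putting the loop and the fixed overhead together — a sketch using the figures above (14,328-token overhead, $3/M input, 4 tool rounds), ignoring caching for simplicity:

```python
# Fixed per-call overhead multiplied across an agent's tool loop.
overhead_tokens = 14_328  # system prompt + tool definitions, every call
tool_rounds     = 4       # 4 tool calls => 5 API calls per user message
rate = 3.00 / 1_000_000   # $3/M input

calls_per_message = 1 + tool_rounds
overhead_cost = overhead_tokens * rate * calls_per_message
print(f"${overhead_cost:.3f} overhead per user message")  # $0.215
```

With caching warm, most of those overhead tokens would bill at the cheaper cache-read rate instead.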

📊 Subscription vs API key — which is cheaper?

A subscription like Cursor Pro ($20/mo) is a flat fee that covers your usage up to a cap, after which the tool quietly routes you to slower/cheaper models. A direct API key charges you per token but always uses exactly the model you chose.

The subscription breakeven panel in the Results section calculates the exact crossover for your scenario.

📐 How to read this calculator

The three most impactful inputs are: (1) tokens per prompt — most people underestimate this; use 40K for Cursor, 30K for Claude Code, not 1K. (2) cache % — if you use Cursor in a long session, set it to 80%; if you use Cline or start fresh sessions often, set it lower (20–30%). (3) tool rounds — set to 0 for chat, 1 for Cursor, 4 for Claude Code, 4–6 for Cline. These three numbers together determine 90% of your bill.
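Those three inputs can be combined into a back-of-envelope monthly estimate. This is a simplified sketch, not the calculator's exact model: output tokens (800), working days (22), and the Sonnet-class rates are assumptions.

```python
# Rough monthly estimate from the three big inputs.
def monthly_cost(tokens_per_prompt, cache_pct, tool_rounds,
                 prompts_per_day, out_tokens=800, days=22,
                 rate_in=3.00, rate_out=15.00, rate_cached=0.30):
    calls = 1 + tool_rounds                      # tool loop multiplier
    cached   = tokens_per_prompt * cache_pct     # billed at cache-read rate
    uncached = tokens_per_prompt * (1 - cache_pct)
    per_prompt = calls * (uncached * rate_in
                          + cached * rate_cached
                          + out_tokens * rate_out) / 1_000_000
    return per_prompt * prompts_per_day * days

# Cursor-like scenario: 40K tokens, 80% cache, 1 tool round, 25 prompts/day
print(f"${monthly_cost(40_000, 0.80, 1, 25):.0f}/mo")  # $50/mo
```

That lands inside the $20–60/month range reported for moderate Cursor use; drop the cache share to 20% and the same volume costs several times more.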

API billing vs flat-rate AI plans: which is cheaper?

Per-token API billing is the right model when you need full control — building your own product, picking specific models, handling unpredictable volume patterns. The cost scales linearly with usage, which is perfect for low-volume automation but painful for heavy daily use where a busy week costs 5× a quiet one.

Flat-rate plans (Cursor Pro, Claude Pro, CloudyBot) work the opposite way: fixed monthly price, known ceiling, predictable budgeting. The tradeoff is that you get a capped number of requests or "tasks" instead of unlimited tokens. Below a certain usage level, subscriptions almost always win on price. Above it, direct API often wins — but only if you're deliberate about model routing (using cheaper models for simple queries).

The subscription breakeven table in the Results panel calculates your crossover live. CloudyBot is included alongside Cursor Pro, Claude Pro, and GitHub Copilot — but note that CloudyBot is a different product category (AI workers for research, automation, and operations — not an IDE). The comparison is useful for anyone evaluating where their AI budget should go across different jobs.

Frequently asked questions

How does CloudyBot compare to paying for API access directly?

API access charges per token with no ceiling — a busy month can cost 5× a quiet month. CloudyBot charges a flat monthly fee with a hard cap: $19 (1,500 AI Tasks), $39 (3,000 tasks), or $149 (8,000 tasks) depending on your plan. Service pauses at the limit — no overage charges. You get AI chat, a real cloud browser, file workspace, GitHub integration, scheduling, and WhatsApp delivery included. The key tradeoff: you get a fixed number of "AI Tasks" instead of unlimited tokens, and you use CloudyBot's interface rather than plugging an API key into your own app. For anyone who wants AI to do work for them (research, monitoring, content, automation) rather than building AI into their own product, CloudyBot is usually simpler and more predictable. See full pricing →

How much does Cursor actually cost per month if I use my own API key?

Based on Cursor forum and community reports: $20–60/month for moderate use (25 prompts/day on GPT-4.1 or Claude Sonnet). Heavy users report $100–500/month. The key variable is cache hit rate — Cursor aggressively caches the system prompt + codebase index, so 80–90% of tokens billed are cheap cache-read tokens ($0.50–1.25/M) rather than expensive input tokens ($2–10/M). Without caching, Cursor would be 5–10× more expensive. One HackerNews user reported 269,738 tokens to edit a 1,213-token file — but 90% was cache reads.

How much does Claude Code cost per month?

Anthropic officially documents an average of $13 per developer per active day, which translates to $150–250/month for most enterprise developers. The 90th percentile is under $30/active day. Heavy agentic workflows (automated test loops, long multi-file refactors) can hit $400–1,000+/month. Claude Code's system prompt + tool definitions alone cost 14,328 tokens on every single turn — that's $0.043 in fixed overhead per user message before any real work. Prompt caching covers most of this (~40%), but cache writes on session start are billed at 125% of base.

What is OpenClaw, and how much does it cost?

OpenClaw is an open-source CLI coding agent similar to Claude Code — it connects directly to the Claude API and lets you run AI-powered coding sessions in your terminal. Because it's community-maintained, its system prompt and tooling overhead are typically smaller than Claude Code's (estimated 5K–8K tokens vs 14,328 tokens fixed overhead). There's no official published cost data from Anthropic for OpenClaw, but based on its architecture and community reports, expect $60–120/month for moderate use (10–15 targeted coding sessions per day on Claude Sonnet 4). Heavy refactor sessions with many file reads can push higher. The OpenClaw preset in this calculator is calibrated from Claude Code data scaled down for lighter context — treat it as a realistic estimate, not a guarantee. Since OpenClaw uses your own Claude API key, your actual cost is fully transparent in your Anthropic usage dashboard.

Why is Cline so much more expensive than Claude Code?

Cline accumulates the full conversation in every API call. By mid-session, a single API call can be 200K tokens. A short 5-turn coding session can be 4–5 million tokens total (documented in Cline GitHub issue #2110). Claude Code does something similar but compacts at ~83% context capacity. Cline's system prompt is ~10,682–11,747 tokens, and with MCP tools enabled it bloats further. Result: the same task costs 4.2× more in Claude Code than in Aider (which uses a minimal-context approach), and even more in Cline.

Should I use my own API key or subscribe to Cursor Pro / Claude Pro?

For casual users (under 20 prompts/day), subscriptions almost always win — simpler and usually cheaper. For heavy users (40+ prompts/day on flagship models), direct API is often cheaper, especially if you route some queries to cheaper models. The breakeven table in the Results panel calculates this live for your scenario. Note: Cursor Pro's $20/mo also includes IDE features, codebase indexing, and routing intelligence that raw API access doesn't.

What are the correct 2026 prices for Claude Opus, Sonnet, and Haiku?

As of April 2026: Claude Opus 4 is $15/M input · $75/M output · $1.50/M cached read. Claude Sonnet 4 is $3/M input · $15/M output · $0.30/M cached read. Claude Haiku 3.5 is $0.80/M input · $4/M output · $0.08/M cached read. Note: older pricing ($5/$25 for "Opus") was for Claude 2-era and is wrong — Opus 3 was $15/$75. Always verify at anthropic.com/pricing.

Does prompt caching work for Anthropic/Claude models, and how much does it save?

Yes. Cache reads on Claude cost $0.30/M (vs $3/M base) — 10% of the full price. Cache writes cost $3.75/M (125% of base, one-time per unique prefix). For tools like Cursor and Claude Code that cache the system prompt + file contents, 80–90% of a session's token volume can be cache reads. This is why Claude Code is actually affordable despite the 14,328-token fixed overhead — in an ongoing session, those tokens cost $0.004 not $0.043 per turn. Set the cache % slider to 40% for Claude Code, 80% for Cursor.
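The write/read asymmetry for Claude Code's fixed overhead, using the rates quoted in this answer ($3.75/M write, $0.30/M read):

```python
# One cache write per session start, then cheap reads every later turn.
overhead = 14_328                          # Claude Code fixed overhead tokens
write_cost = overhead * 3.75 / 1_000_000   # 125% of base, once per prefix
read_cost  = overhead * 0.30 / 1_000_000   # 10% of base, each later turn
print(f"write ${write_cost:.4f}, read ${read_cost:.4f}")
# write $0.0537, read $0.0043
```

So a session's first turn pays a small premium, and every turn after that reads the same prefix at roughly a tenth of the base price.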

Are these API costs guaranteed?

No. Figures are educational estimates based on public list prices and your inputs. Enterprise discounts, prepaid credits, taxes, regional pricing, and Anthropic cache write costs are not modeled.

Is my usage data sent to CloudyBot servers?

No. Scenario inputs and results stay in your browser. Optional localStorage remembers your last scenario on this device only.

Where is the source code?

Open source under the MIT License at github.com/CloudAxisAi/ai-cost-calculator.

Related tools and reading

Built by the same team.

We built this calculator because AI costs shouldn't be a mystery. We also built CloudyBot — an AI workforce that does the work (chat, browse, research, automate, schedule) on a published monthly plan with hard caps. You see the limit before you buy. Service pauses at the cap. No surprise invoices, ever.

Try CloudyBot free — 30 AI Tasks, no credit card
Hard monthly caps — no overages · No training on your data · AI workforce, not an IDE · Cancel anytime

Curious how it works? See what CloudyBot does →