
AI API cost calculator — vibe coding, agents & production apps

Struggling to figure out what Cursor, Windsurf, Claude Code, OpenClaw, or Cline actually costs you per month on your own API key? This calculator turns tokens-per-prompt into real dollars — with presets calibrated from real-world data, not guesses. Pick a preset, choose a model from 16 providers' catalogs — GPT-4.1, Claude Sonnet 4, DeepSeek V3, Gemini 2.5 Pro, Grok, and more — then see your monthly estimate next to flat-rate subscription breakevens. All math runs in your browser; no signup, no data sent anywhere. New to tokens? Read the plain-English explainer ↓

Token counter · JSON formatter

Quick start: Pick the preset that matches your tool — Cursor/Windsurf, Claude Code, OpenClaw, or Cline. Each preset is calibrated from real reported data (Anthropic official docs, Cursor forum, GitHub issues). Then adjust prompts per day to your actual usage and check the subscription breakeven panel. New to tokens and caching? Read the plain-English explainer ↓ · Count your tokens →

Scenario

Presets

0% (system prompt + repeated file context)

Blend (optional)

Routing

40% economy / 60% premium

When blend is used, the blended API line is the weighted average of each model’s estimated monthly $ at your current tokens/message and volume (not a token-level price table).

Compare

Model B

Volume and token assumptions are shared with the primary column.

Add-ons

Embeddings & images

Embedding row uses OpenAI text-embedding-3-small list price; verify on OpenAI’s pricing page.

References

Sources (verify before relying)

Subscription tool pricing for comparison: Cursor · Windsurf · GitHub Copilot

Advanced: browser hours & Browserbase (third-party example)

Uses Browserbase public pricing (Developer-style: monthly platform + included hours + overage). This is not CloudyBot’s internal cost.

Browser host estimate (illustrative): $0.00/mo

LLM API mid (incl. embeddings/images if set) + browser illustrative: $0.00/mo

CloudyBot includes browser minutes per pricing — compare those caps to your DIY stack needs.

Share

Export & share

URL updates as you edit — link shares your exact scenario.
Glossary — what these terms mean for Cursor / Claude Code users
  • Input tokens — everything you send: system prompt, file contents, conversation history, tool results. In Cursor/Claude Code, this is usually 70–90% of your total tokens.
  • Output tokens — what the model writes back. Code edits, explanations, responses. Much smaller, but 3–5× more expensive per token at most providers.
  • Cached input (prompt cache) — if the same prefix (e.g. system prompt + a large file) appears in multiple turns, providers discount re-reading it. Anthropic: ~10% of base price. OpenAI: ~50% off. Must be explicitly enabled; not automatic in all tools.
  • Tool round-trips — each time a coding agent calls a tool (read file, write file, run bash) and gets a result back, that's an extra model call with fresh input tokens. 4 tool calls per user message = 5× the API calls.
  • Context window — max tokens a model can see at once. Claude Sonnet 4: 200K. GPT-4.1: 1M. Filling a context window can cost $0.60–$15 per call depending on model.
  • Per 1M tokens — standard unit for API pricing. $3/1M input = $0.000003 per input token = $0.003 per 1K tokens.
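The "per 1M tokens" conversion above can be checked in two lines (the $3/M rate is the glossary's example, not a live quote):

```python
# The "per 1M tokens" unit from the glossary, converted down.
price_per_m = 3.00                   # $3 per 1M input tokens
per_token = price_per_m / 1_000_000  # $0.000003 per token
per_1k    = price_per_m / 1_000      # $0.003 per 1K tokens
print(per_token, per_1k)  # 3e-06 0.003
```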

What we did not model (important for Claude Code users)

  • Anthropic cache write cost — writing to cache costs ~125% of base input price (a one-time write per unique prefix). This calculator only models the cheaper cache read. For new sessions that haven’t cached yet, real cost is higher.
  • Thinking / extended output tokens — o3, o4-mini, DeepSeek R1 and other reasoning models generate hidden “thinking” tokens billed as output. At $40/M output for o3, a single reasoning chain can cost $0.05–0.50.
  • Enterprise discounts, prepaid credits, free trial credits, taxes.
  • Regional pricing / FX.
  • Annual prepay: many vendors offer 10–20% off list for annual commitment.
  • Cursor / Windsurf / Copilot model routing — subscriptions internally route to different models depending on load; your actual model may differ from what you expect.

Wrong price? Open an issue · AI token counter (paste → rough tokens) · Source · GitHub Pages

How it works

Each preset is calibrated from real-world data — Anthropic official benchmarks, Cursor forum reports, Cline GitHub issues. Pick a preset → adjust prompts per day → read the subscription breakeven table. The table now includes CloudyBot flat-rate plans alongside Cursor Pro, Claude Pro, and GitHub Copilot so you can compare all your options in one place. New to tokens and caching? The plain-English explainer below covers everything from scratch. Pricing verified 2026-04-17 with links to each provider’s pricing page.

Who this calculator is for

You calculated the cost. Now cap it.

API billing scales with every prompt, every tool call, every cache miss — and there's no ceiling. CloudyBot runs AI tasks on a flat monthly plan with hard caps: chat, browse real websites, automate workflows, connect to GitHub, manage files, run on a schedule. Same kind of work. Predictable price. Service pauses at the cap — no surprise invoices, ever.

Try CloudyBot free — hard caps, no overages See what CloudyBot does →

Free plan · 30 AI Tasks/mo · No credit card · Not a replacement for Cursor or Claude Code

AI costs explained simply — no jargon

Not familiar with tokens, caching, or API pricing? Here's everything you need to know, explained with plain analogies.

🧮 What is a "token"?

A token is roughly ¾ of a word. "Hello world" = 2 tokens. "authentication" = 4 tokens. A line of code like const user = await db.getUser(id); is about 12 tokens. Every word you type and every word the AI writes back gets counted as tokens — that's what you're billed for.

Real example: You type "fix this bug" (4 tokens). But Cursor also silently attaches your open files (maybe 8,000 tokens), its own system instructions (2,000 tokens), and conversation history (5,000 tokens). So the AI actually receives ~15,000 tokens even though you typed three words. That's normal — and that's why costs add up faster than people expect.
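The tally in that example, as a sketch (the token counts are the illustrative figures from the paragraph, not measured values):

```python
# Tally the hidden context a tool like Cursor attaches to a short prompt.
typed_prompt  = 4      # "fix this bug"
open_files    = 8_000  # attached file contents
system_prompt = 2_000  # the tool's own instructions
history       = 5_000  # prior conversation turns

total_input = typed_prompt + open_files + system_prompt + history
print(total_input)  # 15004 tokens actually sent for 3 typed words
```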

💸 How does billing work?

AI providers charge separately for input tokens (everything you send) and output tokens (everything the AI writes back). Output tokens cost 3–5× more per token on most models, because generating text is harder than reading it.

Think of it like a taxi: the ride starts the moment you get in (input cost), but the meter ticks faster on the highway (output cost). A long conversation where you paste a 500-line file and the AI rewrites the whole thing? That's an expensive trip.
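The taxi example in numbers — a hypothetical trip assuming Sonnet-class list rates ($3/M in, $15/M out) and made-up token counts:

```python
# Input and output are billed separately, at different per-token rates.
in_tokens, out_tokens = 20_000, 2_500  # pasted file + full rewrite (assumed)
rate_in, rate_out = 3.00, 15.00        # $/M, example list rates

input_leg  = in_tokens * rate_in / 1_000_000    # $0.06
output_leg = out_tokens * rate_out / 1_000_000  # $0.0375
trip = input_leg + output_leg
# Fewer output tokens, but each one costs 5x more here.
print(f"${trip:.4f}")  # $0.0975
```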

🔄 What is caching — and why does it matter so much?

Imagine a photocopier that charges full price for the first copy, but 10 cents on the dollar for reprints of the same page. That's prompt caching.

The first time Cursor sends your codebase to the AI, you pay full price. On the next prompt (in the same session), those same file contents are "cached" — you pay only 10% of the original price to re-read them. This is why the Cursor preset has 80% cache: 80% of your huge token volume is cheap reprints, not full-price reads. Without caching, Cursor would cost 5–10× more.

The catch: caching only works within a session. Close Cursor and reopen it — the cache is cold again. That first prompt of the day costs full price.
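A sketch of the warm-vs-cold difference, using the Anthropic-style rates quoted elsewhere on this page ($3/M base input, $0.30/M cache read) and an assumed 40K-token context:

```python
# Cold cache vs warm cache for re-reading the same context.
context_tokens = 40_000          # assumed codebase context per prompt
base_rate  = 3.00 / 1_000_000    # full-price input, per token
cache_rate = 0.30 / 1_000_000    # cache read: 10% of base

first_prompt  = context_tokens * base_rate   # cold: $0.12
later_prompts = context_tokens * cache_rate  # warm: $0.012
print(f"cold ${first_prompt:.3f}, warm ${later_prompts:.3f}")
```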

🤖 Why do coding agents (Claude Code, Cline) cost so much more?

When you type one message to Claude Code, it doesn't just answer once. It runs a loop: read the file → think → write a change → run the tests → read the output → think again → fix the error → done. Each step in that loop is a separate API call, each with its own tokens. 4 loops per message = 5× the API calls for that one prompt you sent.

Claude Code also carries a mandatory 14,328-token overhead in every single call — that's its system instructions and tool definitions, sent every time, before your code is even mentioned. At $3/M input, that's about $0.04 just to start each turn. It adds up.
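Putting the loop and the fixed overhead together — a sketch using the figures above (14,328-token overhead, $3/M input, 4 tool rounds), ignoring caching for simplicity:

```python
# Fixed per-call overhead multiplied across an agent's tool loop.
overhead_tokens = 14_328  # system prompt + tool definitions, every call
tool_rounds     = 4       # 4 tool calls => 5 API calls per user message
rate = 3.00 / 1_000_000   # $3/M input

calls_per_message = 1 + tool_rounds
overhead_cost = overhead_tokens * rate * calls_per_message
print(f"${overhead_cost:.3f} overhead per user message")  # $0.215
```

With caching warm, most of those overhead tokens would bill at the cheaper cache-read rate instead.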

📊 Subscription vs API key — which is cheaper?

A subscription like Cursor Pro ($20/mo) is a flat fee that covers your usage up to a cap, after which the tool quietly routes you to slower/cheaper models. A direct API key charges you per token but always uses exactly the model you chose.

The subscription breakeven panel in the Results section calculates the exact crossover for your scenario.

📐 How to read this calculator

The three most impactful inputs are: (1) tokens per prompt — most people underestimate this; use 40K for Cursor, 30K for Claude Code, not 1K. (2) cache % — if you use Cursor in a long session, set it to 80%; if you use Cline or start fresh sessions often, set it lower (20–30%). (3) tool rounds — set to 0 for chat, 1 for Cursor, 4 for Claude Code, 4–6 for Cline. These three numbers together determine 90% of your bill.
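Those three inputs can be combined into a back-of-envelope monthly estimate. This is a simplified sketch, not the calculator's exact model: output tokens (800), working days (22), and the Sonnet-class rates are assumptions.

```python
# Rough monthly estimate from the three big inputs.
def monthly_cost(tokens_per_prompt, cache_pct, tool_rounds,
                 prompts_per_day, out_tokens=800, days=22,
                 rate_in=3.00, rate_out=15.00, rate_cached=0.30):
    calls = 1 + tool_rounds                      # tool loop multiplier
    cached   = tokens_per_prompt * cache_pct     # billed at cache-read rate
    uncached = tokens_per_prompt * (1 - cache_pct)
    per_prompt = calls * (uncached * rate_in
                          + cached * rate_cached
                          + out_tokens * rate_out) / 1_000_000
    return per_prompt * prompts_per_day * days

# Cursor-like scenario: 40K tokens, 80% cache, 1 tool round, 25 prompts/day
print(f"${monthly_cost(40_000, 0.80, 1, 25):.0f}/mo")  # $50/mo
```

That lands inside the $20–60/month range reported for moderate Cursor use; drop the cache share to 20% and the same volume costs several times more.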

API billing vs flat-rate AI plans: which is cheaper?

Per-token API billing is the right model when you need full control — building your own product, picking specific models, handling unpredictable volume patterns. The cost scales linearly with usage, which is perfect for low-volume automation but painful for heavy daily use where a busy week costs 5× a quiet one.

Flat-rate plans (Cursor Pro, Claude Pro, CloudyBot) work the opposite way: fixed monthly price, known ceiling, predictable budgeting. The tradeoff is that you get a capped number of requests or "tasks" instead of unlimited tokens. Below a certain usage level, subscriptions almost always win on price. Above it, direct API often wins — but only if you're deliberate about model routing (using cheaper models for simple queries).

The subscription breakeven table in the Results panel calculates your crossover live. CloudyBot is included alongside Cursor Pro, Claude Pro, and GitHub Copilot — but note that CloudyBot is a different product category (AI workers for research, automation, and operations — not an IDE). The comparison is useful for anyone evaluating where their AI budget should go across different jobs.

Frequently asked questions

How does CloudyBot compare to paying for API access directly?

API access charges per token with no ceiling — a busy month can cost 5× a quiet month. CloudyBot charges a flat monthly fee with a hard cap: $19 (1,500 AI Tasks), $39 (3,000 tasks), or $149 (8,000 tasks) depending on your plan. Service pauses at the limit — no overage charges. You get AI chat, a real cloud browser, file workspace, GitHub integration, scheduling, and WhatsApp delivery included. The key tradeoff: you get a fixed number of "AI Tasks" instead of unlimited tokens, and you use CloudyBot's interface rather than plugging an API key into your own app. For anyone who wants AI to do work for them (research, monitoring, content, automation) rather than building AI into their own product, CloudyBot is usually simpler and more predictable. See full pricing →

How much does Cursor actually cost per month if I use my own API key?

Based on Cursor forum and community reports: $20–60/month for moderate use (25 prompts/day on GPT-4.1 or Claude Sonnet). Heavy users report $100–500/month. The key variable is cache hit rate — Cursor aggressively caches the system prompt + codebase index, so 80–90% of tokens billed are cheap cache-read tokens ($0.50–1.25/M) rather than expensive input tokens ($2–10/M). Without caching, Cursor would be 5–10× more expensive. One HackerNews user reported 269,738 tokens to edit a 1,213-token file — but 90% was cache reads.

How much does Claude Code cost per month?

Anthropic officially documents an average of $13 per developer per active day, which translates to $150–250/month for most enterprise developers. The 90th percentile is under $30/active day. Heavy agentic workflows (automated test loops, long multi-file refactors) can hit $400–1,000+/month. Claude Code's system prompt + tool definitions alone cost 14,328 tokens on every single turn — that's $0.043 in fixed overhead per user message before any real work. Prompt caching covers most of this (~40%), but cache writes on session start are billed at 125% of base.

What is OpenClaw, and how much does it cost?

OpenClaw is an open-source CLI coding agent similar to Claude Code — it connects directly to the Claude API and lets you run AI-powered coding sessions in your terminal. Because it's community-maintained, its system prompt and tooling overhead are typically smaller than Claude Code's (estimated 5K–8K tokens vs 14,328 tokens fixed overhead). There's no official published cost data from Anthropic for OpenClaw, but based on its architecture and community reports, expect $60–120/month for moderate use (10–15 targeted coding sessions per day on Claude Sonnet 4). Heavy refactor sessions with many file reads can push higher. The OpenClaw preset in this calculator is calibrated from Claude Code data scaled down for lighter context — treat it as a realistic estimate, not a guarantee. Since OpenClaw uses your own Claude API key, your actual cost is fully transparent in your Anthropic usage dashboard.

Why is Cline so much more expensive than Claude Code?

Cline accumulates the full conversation in every API call. By mid-session, a single API call can be 200K tokens. A short 5-turn coding session can be 4–5 million tokens total (documented in Cline GitHub issue #2110). Claude Code does something similar but compacts at ~83% context capacity. Cline's system prompt is ~10,682–11,747 tokens, and with MCP tools enabled it bloats further. Result: the same task costs 4.2× more in Claude Code than in Aider (which uses a minimal-context approach), and even more in Cline.

Should I use my own API key or subscribe to Cursor Pro / Claude Pro?

For casual users (under 20 prompts/day), subscriptions almost always win — simpler and usually cheaper. For heavy users (40+ prompts/day on flagship models), direct API is often cheaper, especially if you route some queries to cheaper models. The breakeven table in the Results panel calculates this live for your scenario. Note: Cursor Pro's $20/mo also includes IDE features, codebase indexing, and routing intelligence that raw API access doesn't.

What are the correct 2026 prices for Claude Opus, Sonnet, and Haiku?

As of April 2026: Claude Opus 4 is $15/M input · $75/M output · $1.50/M cached read. Claude Sonnet 4 is $3/M input · $15/M output · $0.30/M cached read. Claude Haiku 3.5 is $0.80/M input · $4/M output · $0.08/M cached read. Note: older pricing ($5/$25 for "Opus") was for Claude 2-era and is wrong — Opus 3 was $15/$75. Always verify at anthropic.com/pricing.

Does prompt caching work for Anthropic/Claude models, and how much does it save?

Yes. Cache reads on Claude cost $0.30/M (vs $3/M base) — 10% of the full price. Cache writes cost $3.75/M (125% of base, one-time per unique prefix). For tools like Cursor and Claude Code that cache the system prompt + file contents, 80–90% of a session's token volume can be cache reads. This is why Claude Code is actually affordable despite the 14,328-token fixed overhead — in an ongoing session, those tokens cost $0.004 not $0.043 per turn. Set the cache % slider to 40% for Claude Code, 80% for Cursor.
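The write/read asymmetry for Claude Code's fixed overhead, using the rates quoted in this answer ($3.75/M write, $0.30/M read):

```python
# One cache write per session start, then cheap reads every later turn.
overhead = 14_328                          # Claude Code fixed overhead tokens
write_cost = overhead * 3.75 / 1_000_000   # 125% of base, once per prefix
read_cost  = overhead * 0.30 / 1_000_000   # 10% of base, each later turn
print(f"write ${write_cost:.4f}, read ${read_cost:.4f}")
# write $0.0537, read $0.0043
```

So a session's first turn pays a small premium, and every turn after that reads the same prefix at roughly a tenth of the base price.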

Are these API costs guaranteed?

No. Figures are educational estimates based on public list prices and your inputs. Enterprise discounts, prepaid credits, taxes, regional pricing, and Anthropic cache write costs are not modeled.

Is my usage data sent to CloudyBot servers?

No. Scenario inputs and results stay in your browser. Optional localStorage remembers your last scenario on this device only.

Where is the source code?

Open source under the MIT License at github.com/CloudAxisAi/ai-cost-calculator.

Related tools and reading

Built by the same team.

We built this calculator because AI costs shouldn't be a mystery. We also built CloudyBot — an AI workforce that does the work (chat, browse, research, automate, schedule) on a published monthly plan with hard caps. You see the limit before you buy. Service pauses at the cap. No surprise invoices, ever.

Try CloudyBot free — 30 AI Tasks, no credit card
Hard monthly caps — no overages · No training on your data · AI workforce, not an IDE · Cancel anytime

Curious how it works? See what CloudyBot does →