LLM API Cost Calculator (GPT-4o, Claude, Gemini)

Pick a model, enter the input and output tokens a typical request uses and your requests per month, and this calculator estimates your monthly API bill from published 2025 list prices. It also splits the cost into input and output so you can see which side of the request drives spend. Prices are as of 2025; model pricing changes often, so verify against each provider's current pricing page before budgeting.

Model & usage

Prices as of 2025. Verify against each provider's current pricing page: rates change often, and caching, batch, and volume discounts are not modeled.

Estimated monthly cost

$100.00

$0.01/request × 10,000 req/mo

Input cost / mo

$50.00

Output cost / mo

$50.00

Cost detail
Input rate (GPT-4o): $2.50 / 1M tokens
Output rate: $10.00 / 1M tokens
Input cost / request: $0.005
Output cost / request: $0.005
Total / request: $0.01
Input cost / month: $50.00
Output cost / month: $50.00
Total / month: $100.00

Monthly cost, same volume across models

[Chart: monthly cost for GPT-4o, GPT-4o mini, Claude Sonnet, and Gemini 1.5 Flash]

How it works

Large language model APIs bill per token, and almost every provider charges a different rate for input (the prompt you send) than for output (the completion the model returns). This tool prices a single request as (input tokens ÷ 1,000,000 × input price) + (output tokens ÷ 1,000,000 × output price), using the per-million-token rates from a versioned 2025 price table for GPT-4o, GPT-4o mini, Claude Opus, Claude Sonnet, Claude Haiku, Gemini 1.5 Pro, and Gemini 1.5 Flash.
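The per-request formula above can be sketched in Python as follows. This is a minimal illustration, not the calculator's actual code: the function name `request_cost` and the token counts are assumptions, while the GPT-4o rates are the ones quoted on this page.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Return the list-price cost in USD of a single request.

    Prices are expressed in USD per 1,000,000 tokens, as on a rate card.
    """
    input_cost = input_tokens / 1_000_000 * input_price_per_m
    output_cost = output_tokens / 1_000_000 * output_price_per_m
    return input_cost + output_cost


# GPT-4o rates quoted on this page: $2.50 input, $10.00 output per 1M tokens.
# The token counts (2,000 in, 500 out) are illustrative assumptions.
cost = request_cost(2_000, 500, 2.50, 10.00)
print(f"${cost:.4f} per request")  # prints "$0.0100 per request"
```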

Monthly cost is the per-request cost multiplied by your requests per month. The breakdown table and chart separate input cost from output cost, which matters because output tokens typically cost 3–5× as much as input tokens. A chatbot that returns long answers can therefore cost far more than its prompt size suggests, even when the prompt is large.
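One way to picture the monthly breakdown is the sketch below. The names `MonthlyBreakdown` and `monthly_breakdown` are hypothetical, and the usage figures (2,000 input tokens, 500 output tokens, 10,000 requests per month) are assumptions chosen for illustration with the GPT-4o rates quoted on this page.

```python
from dataclasses import dataclass


@dataclass
class MonthlyBreakdown:
    input_cost: float   # USD per month spent on prompt tokens
    output_cost: float  # USD per month spent on completion tokens

    @property
    def total(self) -> float:
        return self.input_cost + self.output_cost

    @property
    def output_share(self) -> float:
        # Fraction of the bill driven by output; 0.0 when nothing is spent.
        return self.output_cost / self.total if self.total else 0.0


def monthly_breakdown(input_tokens: int, output_tokens: int,
                      requests_per_month: int,
                      input_price_per_m: float,
                      output_price_per_m: float) -> MonthlyBreakdown:
    """Scale per-request input and output costs up to a monthly figure."""
    input_per_request = input_tokens / 1_000_000 * input_price_per_m
    output_per_request = output_tokens / 1_000_000 * output_price_per_m
    return MonthlyBreakdown(
        input_cost=input_per_request * requests_per_month,
        output_cost=output_per_request * requests_per_month,
    )


b = monthly_breakdown(2_000, 500, 10_000, 2.50, 10.00)
print(f"input ${b.input_cost:.2f}, output ${b.output_cost:.2f}, "
      f"total ${b.total:.2f}")
```

Keeping input and output as separate fields, rather than returning only a total, is what makes it possible to report which side of the request drives the bill.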

Switching models re-prices the exact same token volume against a different rate card, which makes the gap between tiers obvious: a lightweight model like GPT-4o mini or Gemini 1.5 Flash can be one to two orders of magnitude cheaper than a frontier model for identical usage. Use that comparison to decide where a smaller model is good enough and where you actually need the larger one.
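Re-pricing the same volume against several rate cards can be sketched like this. Only the GPT-4o rates come from this page; `example-small-model` and its rates are hypothetical placeholders, not values from the tool's price table, and the function name is an assumption.

```python
# USD per 1M tokens as (input_price, output_price).
PRICES = {
    "gpt-4o": (2.50, 10.00),              # rates quoted on this page
    "example-small-model": (0.15, 0.60),  # hypothetical placeholder rates
}


def monthly_cost_by_model(input_tokens: int, output_tokens: int,
                          requests_per_month: int,
                          prices: dict = PRICES) -> dict:
    """Price identical usage against every rate card in `prices`."""
    costs = {}
    for model, (in_price, out_price) in prices.items():
        per_request = (input_tokens / 1_000_000 * in_price
                       + output_tokens / 1_000_000 * out_price)
        costs[model] = per_request * requests_per_month
    return costs


# Illustrative usage: 2,000 input tokens, 500 output tokens, 10,000 req/mo.
costs = monthly_cost_by_model(2_000, 500, 10_000)
for model, cost in sorted(costs.items(), key=lambda kv: kv[1]):
    print(f"{model:<22} ${cost:>8.2f}")
```

Because the token volume is held fixed, any difference between rows comes from the rate card alone.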

Frequently asked questions

Why are input and output tokens priced differently?

Providers charge separately because generating output is more compute-intensive than reading input: the model processes the whole prompt in parallel but generates output one token at a time. Output (completion) tokens usually cost several times as much as input (prompt) tokens. GPT-4o, for example, is $2.50 per million input tokens but $10.00 per million output tokens. That asymmetry is why this calculator reports the two costs separately: for chat and generation workloads that return long responses, output can dominate the bill even when your prompt is long, so optimizing prompt size alone may not cut costs as much as you expect.
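To see how output can dominate even with a longer prompt, consider the sketch below. The helper names and token counts are assumptions for illustration; the rates are the GPT-4o figures quoted above.

```python
def cost_split(input_tokens: int, output_tokens: int,
               input_price_per_m: float, output_price_per_m: float):
    """Return (input_cost, output_cost) in USD for one request."""
    input_cost = input_tokens / 1_000_000 * input_price_per_m
    output_cost = output_tokens / 1_000_000 * output_price_per_m
    return input_cost, output_cost


def break_even_ratio(input_price_per_m: float,
                     output_price_per_m: float) -> float:
    """Input-to-output token ratio at which both sides cost the same.

    With fewer input tokens per output token than this, output dominates.
    """
    return output_price_per_m / input_price_per_m


# Illustrative request: the prompt is 3x longer than the response.
in_cost, out_cost = cost_split(3_000, 1_000, 2.50, 10.00)
share = out_cost / (in_cost + out_cost)
print(f"output share of cost: {share:.0%}")            # prints "output share of cost: 57%"
print(f"break-even ratio: {break_even_ratio(2.50, 10.00):.1f}")  # prints "break-even ratio: 4.0"
```

At these rates the prompt would need to be more than four times as long as the response before input became the larger share of the cost.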

Does this include caching, batch, or volume discounts?

No. This is a straight list-price estimate. Most providers offer discounts this tool does not model: prompt caching (a reduced rate for repeated prompt prefixes), batch APIs (often around 50% off for asynchronous jobs), and negotiated or committed-use volume pricing. There are also extra-cost features that fall outside a simple per-token estimate: long-context surcharges, vision or audio tokens, and fine-tuned model rates. Treat the figure here as a sticker-price baseline: your real bill can be lower once discounts apply, or higher if you use surcharged features.
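Although the calculator does not model discounts, a rough adjustment for batch traffic could look like the sketch below. The function `discounted_cost` is hypothetical, and the 50% default reflects only the "often around 50%" figure mentioned above; actual discounts vary by provider and should be checked against their pricing pages.

```python
def discounted_cost(list_cost: float, batch_fraction: float,
                    batch_discount: float = 0.5) -> float:
    """Estimate a bill when part of the traffic runs through a batch API.

    list_cost       full list-price cost in USD
    batch_fraction  share of spend eligible for batch pricing (0.0 to 1.0)
    batch_discount  assumed discount on that share (0.5 means 50% off)
    """
    if not 0.0 <= batch_fraction <= 1.0:
        raise ValueError("batch_fraction must be between 0 and 1")
    if not 0.0 <= batch_discount <= 1.0:
        raise ValueError("batch_discount must be between 0 and 1")
    full_price_part = list_cost * (1.0 - batch_fraction)
    batch_part = list_cost * batch_fraction * (1.0 - batch_discount)
    return full_price_part + batch_part


# Illustrative: a $100.00 list-price bill with 40% of spend moved to batch.
print(f"${discounted_cost(100.00, 0.4):.2f}")  # prints "$80.00"
```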

How current are these prices?

The rates come from a versioned 2025 price table and are labelled accordingly throughout the tool. LLM pricing changes often — providers cut rates, launch new model versions, and retire old ones on a regular basis — so a table that was accurate when it was written can drift out of date within months. Before you budget or compare vendors, verify each rate against the provider's official pricing page (linked in the sources below). This calculator is for quick estimation and comparison, not a billing guarantee.

Sources