
Token Counter & Context-Window Fit Calculator

Paste text to estimate how many tokens it will use, see which model context windows it fits (8k, 128k, 200k, 1M), and roughly what it costs at a per-million input price you choose. This is a fast heuristic estimate, not an exact tokenizer — real byte-pair tokenizers like OpenAI's tiktoken or Anthropic's token-counting endpoint return the true numbers, and they can differ noticeably for code, non-English text, and unusual formatting.

Your text

148 characters · 28 words

Estimate, not exact. This uses the heuristic max(chars ÷ 4, words ÷ 0.75) and is not a real BPE tokenizer. Actual counts from tiktoken (OpenAI) or Anthropic's token-counting endpoint differ, especially for code and non-English text.

Estimated tokens

37

4.00 chars/token · est. input cost $0.00

8k: fits
128k: fits
200k: fits
1M: fits

Estimate detail

Characters: 148
Words: 28
Estimated tokens: 37
Est. input cost: $0.00
Fits 8k window: yes
Fits 128k window: yes
Fits 200k window: yes
Fits 1M window: yes

[Chart: Tokens vs context windows, comparing your text against the 8k, 128k, and 200k windows]

How it works

The estimate uses a simple rule of thumb: tokens ≈ max(characters ÷ 4, words ÷ 0.75). English prose averages roughly four characters per token and about four tokens for every three words, so taking the larger of the two estimates gives a reasonable single number for typical writing. The result is rounded to a whole token because a partial token has no meaning. Character and word counts are read straight from your text: characters are the raw string length and words are whitespace-separated runs.
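As a rough sketch of the rule above, the heuristic fits in a few lines of Python. The function names are hypothetical, and the rounding mode is an assumption: the page only says the result is rounded to a whole token, so round-to-nearest is used here.

```python
def estimate_from_counts(chars: int, words: int) -> int:
    """Heuristic token estimate: max(chars / 4, words / 0.75), rounded."""
    # Assumption: round to the nearest whole token. The tool's exact
    # rounding mode is not documented.
    return round(max(chars / 4, words / 0.75))


def estimate_tokens(text: str) -> int:
    """Estimate tokens from raw length and whitespace-separated words."""
    chars = len(text)          # raw string length (Python counts code points)
    words = len(text.split())  # runs of non-whitespace characters
    return estimate_from_counts(chars, words)
```

For the sample shown on this page (148 characters, 28 words), the two candidates are 148 ÷ 4 = 37 and 28 ÷ 0.75 ≈ 37.3, so estimate_from_counts(148, 28) returns 37, matching the displayed estimate.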

Once tokens are estimated, the tool compares them against four common context windows: 8k (8,192), 128k (128,000), 200k (200,000), and 1M (1,000,000) tokens. It marks each as a fit or not. Note that the context window is shared between the prompt and the model's response, so a prompt that just barely fits the raw window may leave no room for output and fail in practice; leave headroom.
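The fit check can be sketched as a simple comparison against each window size. The names below are hypothetical, the use of "less than or equal" for a fit is an assumption, and the reserved_output parameter is an illustrative way to model headroom that the tool itself may not offer.

```python
# Window sizes as listed on this page.
CONTEXT_WINDOWS = {
    "8k": 8_192,
    "128k": 128_000,
    "200k": 200_000,
    "1M": 1_000_000,
}


def window_fit(tokens: int, reserved_output: int = 0) -> dict[str, bool]:
    """Return, per window label, whether prompt plus reserved output fits."""
    # Assumption: a prompt "fits" when its token count, plus any headroom
    # reserved for the response, does not exceed the window size.
    needed = tokens + reserved_output
    return {label: needed <= size for label, size in CONTEXT_WINDOWS.items()}
```

With no headroom, window_fit(8_000) reports a fit for the 8k window, while window_fit(8_000, reserved_output=1_000) does not, which is the practical reason to leave room for the response.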

The optional cost figure multiplies the estimated tokens by the input price per million tokens you enter (default $2.50). It reflects input tokens only — output tokens are billed separately and often at a higher rate — so treat it as a lower-bound sanity check on a single request, not a full bill. Change the price to match the specific model and tier you are using.
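The cost arithmetic is a single multiplication, sketched below with a hypothetical function name. It covers input tokens only, as described above.

```python
def input_cost_usd(tokens: int, price_per_million: float = 2.50) -> float:
    """Input-only cost in USD: tokens times the per-million input price."""
    # Output tokens are billed separately and are not included here.
    return tokens * price_per_million / 1_000_000
```

At the default price, 37 tokens cost about $0.0000925, which is why the page displays $0.00 when the figure is shown to two decimal places.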

Frequently asked questions

How accurate is this token estimate?

It is a heuristic, not a real tokenizer, so treat it as a ballpark. The chars/4 and words/0.75 factors are tuned to ordinary English prose and are usually within ten to twenty percent of the true count for that kind of text. They drift further for source code, JSON, tables, emoji, math, and non-English or non-Latin scripts, which tokenize very differently. For an exact count, run the text through the model's own tokenizer.

Why does the real token count differ from this estimate?

Modern models use byte-pair encoding (BPE), which splits text into subword units learned from data rather than counting characters or words. Common words often become a single token while rare words, long numbers, and non-English characters split into several. Whitespace, capitalization, and punctuation all affect the split too. That is why two strings of the same length can have quite different real token counts, and why a character-based rule can only approximate.

How do I get the exact token count?

Use the tokenizer that ships with your model. For OpenAI models, the open-source tiktoken library gives exact counts locally. For Anthropic's Claude, the Messages API exposes a token-counting endpoint that returns the precise input token count before you send a request. Both are the authoritative source for billing and context-limit decisions; this calculator is a quick first pass when you do not want to make an API call or install a library.
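For the OpenAI case, a minimal sketch with tiktoken might look like the following. The helper names and the Encoder protocol are hypothetical, and the encoding name "cl100k_base" is an assumption: pick the encoding that matches your model. The Anthropic endpoint is not sketched here.

```python
from typing import Protocol, Sequence


class Encoder(Protocol):
    """Anything with an encode() method that returns token ids."""

    def encode(self, text: str) -> Sequence[int]: ...


def load_openai_encoding(name: str = "cl100k_base") -> Encoder:
    """Load a tiktoken encoding by name (requires `pip install tiktoken`)."""
    # Imported lazily so the rest of the module works without the library.
    import tiktoken

    # Assumption: "cl100k_base" is only a placeholder default; use the
    # encoding that corresponds to the model you are calling.
    return tiktoken.get_encoding(name)


def count_tokens(text: str, encoder: Encoder) -> int:
    """Exact token count according to the supplied encoder."""
    return len(encoder.encode(text))
```

Typical usage would be count_tokens(my_text, load_openai_encoding()). The first call to tiktoken may download the encoding's vocabulary file, so it can need network access once. Passing the encoder in as an argument keeps the counting function easy to swap for another tokenizer.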
