AI Token Counter

Your data never leaves your browser

Count tokens for GPT-4o, Claude, and Gemini. See context window usage and estimated API cost instantly.

How to count tokens for GPT-4o, Claude, and Gemini

Every call to a language model API has a cost measured in tokens, not words. Understanding token counts is essential for three things: staying within context window limits, budgeting API spend, and optimizing prompt efficiency. A prompt that uses 90 percent of a model's context window leaves little room for the response and may cause errors or truncation. A system prompt that could be trimmed by 1,000 tokens saves real money at scale.

This tool gives you an instant count for all major models side by side. Select GPT-4o, GPT-4, GPT-3.5, Claude Sonnet, Claude Haiku, or Gemini 1.5 Pro and the count updates as you type. The context usage bar turns amber above 70 percent and red above 90 percent. Estimated input cost is shown per call, which helps when you are building batch processing pipelines or evaluating which model makes economic sense for your use case.
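The usage bar thresholds described above can be sketched in a few lines. This is an illustration only, not this tool's actual code: the function name usage_level and the "green" label for the default state are assumptions, and only the 70 and 90 percent cutoffs come from the text.

```python
def usage_level(token_count: int, context_window: int) -> str:
    """Classify context usage as 'green', 'amber', or 'red'.

    Thresholds follow the text: amber above 70 percent, red above 90 percent.
    The 'green' label for everything else is an assumed default.
    """
    if context_window <= 0:
        raise ValueError("context_window must be positive")
    # Integer arithmetic avoids floating-point surprises at the boundaries.
    if token_count * 100 > 90 * context_window:
        return "red"
    if token_count * 100 > 70 * context_window:
        return "amber"
    return "green"


# Example: a 100,000-token prompt against a 128,000-token window
# is about 78 percent used.
print(usage_level(100_000, 128_000))
```

Exactly 70 or exactly 90 percent stays in the lower band here, because the text says "above" for both thresholds.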

Frequently Asked Questions

What are tokens and why do they matter?

Language models do not read text character by character. They split text into pieces called tokens, which are roughly 3 to 4 characters of English text on average. A single word can be one or multiple tokens depending on how common it is. Tokens matter because API pricing is based on tokens, not characters or words. Context windows (how much a model can "remember") are also measured in tokens.
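When no tokenizer is at hand, the characters-per-token rule of thumb above gives a rough estimate. The sketch below assumes 4 characters per token by default; the function name estimate_tokens is hypothetical, and the result is a ballpark figure rather than a real token count.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate from character length.

    Uses the heuristic that English text averages roughly 3 to 4
    characters per token. Real tokenizers will give different numbers,
    especially for code, non-English text, or unusual words.
    """
    if chars_per_token <= 0:
        raise ValueError("chars_per_token must be positive")
    if not text:
        return 0
    return math.ceil(len(text) / chars_per_token)


sample = "Tokens matter because API pricing is based on tokens."
print(estimate_tokens(sample))       # estimate at 4 characters per token
print(estimate_tokens(sample, 3.0))  # more conservative estimate at 3
```

Using the lower ratio (3 characters per token) gives a higher, safer estimate when budgeting against a context window.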

Why do token counts differ between models?

Different models use different tokenizers. GPT-4 and GPT-3.5 use the cl100k_base tokenizer. GPT-4o uses the o200k_base tokenizer. Claude and Gemini use their own proprietary tokenizers, so this tool uses GPT-4 tokenization (cl100k_base) as an approximation for those models. Treat the Claude and Gemini counts as estimates; the counts reported by those providers may differ.
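The model-to-tokenizer mapping above could be reproduced outside the browser with OpenAI's tiktoken package for Python. This is a sketch under assumptions: it requires tiktoken to be installed (pip install tiktoken), the dictionary keys are illustrative labels rather than official API model IDs, and it is not how this tool itself is implemented.

```python
# Encoding names come from the text above. The boolean marks models for
# which the encoding is only an approximation (Claude and Gemini).
ENCODING_BY_MODEL = {
    "gpt-4o": ("o200k_base", False),
    "gpt-4": ("cl100k_base", False),
    "gpt-3.5": ("cl100k_base", False),
    "claude": ("cl100k_base", True),
    "gemini": ("cl100k_base", True),
}


def count_tokens(text: str, model: str) -> tuple[int, bool]:
    """Return (token_count, is_approximate) for one of the labels above."""
    import tiktoken  # third-party dependency, imported lazily

    encoding_name, approximate = ENCODING_BY_MODEL[model]
    encoding = tiktoken.get_encoding(encoding_name)
    # disallowed_special=() makes strings such as "<|endoftext|>" in pasted
    # text count as ordinary text instead of raising an error.
    tokens = encoding.encode(text, disallowed_special=())
    return len(tokens), approximate
```

Calling count_tokens("Hello, world!", "gpt-4o") returns the o200k_base count with the approximate flag set to False, while the "claude" and "gemini" labels return a cl100k_base count flagged as approximate.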

How can I reduce token usage in my prompts?

Remove redundant instructions and boilerplate. Use concise, direct language. For structured data, consider the TOON format (see our JSON to TOON converter), which reduces token count by around 40 percent for tabular data. Avoid repeating context that the model already has. Break long tasks into smaller calls.

What is a context window?

The context window is the maximum number of tokens a model can process in a single call, including both the input (prompt plus conversation history) and the output. GPT-4o and GPT-4 Turbo each have a 128,000-token context window. Claude models have a 200,000-token window, and Gemini 1.5 Pro has a 1,000,000-token window. Exceeding the context window either causes an error or truncates the input, depending on the API.
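Because input and output share the same window, it helps to check how much room a prompt leaves for the response. The sketch below uses the window sizes quoted above; the dictionary keys and function names are hypothetical labels chosen for illustration.

```python
# Context window sizes in tokens, as quoted in the text above.
CONTEXT_WINDOWS = {
    "gpt-4o": 128_000,
    "gpt-4-turbo": 128_000,
    "claude": 200_000,
    "gemini-1.5-pro": 1_000_000,
}


def remaining_output_budget(model: str, input_tokens: int) -> int:
    """Tokens left for the response after the input is counted."""
    return max(CONTEXT_WINDOWS[model] - input_tokens, 0)


def fits_in_context(model: str, input_tokens: int, reserved_output: int) -> bool:
    """True if the input plus a reserved output allowance fits the window."""
    return input_tokens + reserved_output <= CONTEXT_WINDOWS[model]


print(remaining_output_budget("gpt-4o", 120_000))  # 8000
print(fits_in_context("gpt-4o", 126_000, 4_000))   # False
```

Reserving an output allowance up front catches prompts that technically fit the window but leave too little space for a useful response.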

Are the cost estimates accurate?

The estimates are based on publicly listed input pricing per million tokens as of mid-2026 and are approximate. Output tokens are priced differently (usually higher) but not shown here since output length is not known in advance. Check each provider's pricing page for current rates before budgeting API usage.
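The arithmetic behind an input cost estimate is simple: tokens times the price per million tokens, divided by one million. In the sketch below the 2.50 USD price is a made-up placeholder, not a real rate for any provider, and the function name is hypothetical.

```python
def estimate_input_cost(
    input_tokens: int,
    price_per_million_usd: float,
    calls: int = 1,
) -> float:
    """Estimated input cost in USD for a number of identical calls.

    Covers input tokens only; output tokens are priced separately.
    """
    return input_tokens * calls * price_per_million_usd / 1_000_000


# Placeholder price for illustration only. Check the provider's pricing page.
HYPOTHETICAL_PRICE = 2.50  # USD per million input tokens

# Cost of a single 8,000-token prompt.
print(estimate_input_cost(8_000, HYPOTHETICAL_PRICE))

# Savings from trimming 1,000 tokens off a prompt sent 10,000 times.
print(estimate_input_cost(1_000, HYPOTHETICAL_PRICE, calls=10_000))
```

At the placeholder rate, the second call works out to 25 USD, which shows how a small per-call saving adds up across a batch pipeline.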
