What is Token Pricing? A Simple Guide for Developers

Introduction

If you've ever looked at an AI API pricing page and wondered what "per million tokens" actually means — you're not alone. Token pricing is the standard way AI companies charge for their APIs, but it can be confusing at first.

In this guide, we explain exactly what tokens are, how they're counted, and how to estimate your costs before you build.

What is a Token?

A token is a chunk of text — roughly 3-4 characters or about ¾ of a word in English.

Some examples:

"Hello" = 1 token
"Hello world" = 2 tokens
"ChatGPT is amazing" = 4 tokens
1,000 words ≈ 750 tokens
1 million tokens ≈ 750,000 words (about 10 full novels)

Tokens aren't the same as words or characters — they depend on the model's tokenizer. Common words are usually 1 token, while long or rare words may be 2-3 tokens.

Input vs Output Tokens

Every AI API charges separately for input and output tokens:

Input tokens: the text you send to the model — your prompt, system instructions, conversation history, and any context
Output tokens: the text the model generates in response

Output tokens are almost always more expensive than input tokens. For example, GPT-4o charges $2.50/1M input but $10.00/1M output — 4x more expensive for output.

Current Token Prices (June 2026)

Premium models:

GPT-4o: $2.50 input / $10.00 output per 1M tokens
Claude Sonnet 4.6: $3.00 input / $15.00 output per 1M tokens
Gemini 3.1 Pro: $2.00 input / $12.00 output per 1M tokens

Budget models:

GPT-4o Mini: $0.15 input / $0.60 output per 1M tokens
Claude Haiku 4.5: $1.00 input / $5.00 output per 1M tokens
Gemini 3.1 Flash: $0.30 input / $2.50 output per 1M tokens

How to Calculate Your Costs

The formula is simple:

Cost = (Input tokens × Input price + Output tokens × Output price) × Number of requests

Example:

You have 1,000 users
Each sends 1 message per day (30/month)
Average message: 200 input tokens + 300 output tokens
Using GPT-4o Mini ($0.15 input / $0.60 output)

Monthly cost = (200 × $0.00000015 + 300 × $0.0000006) × 30,000 requests = ($0.00003 + $0.00018) × 30,000 = $6.30/month

What Increases Your Token Usage?

Several things silently add tokens to every request:

System prompt: sent on every single request — keep it short
Conversation history: sending past messages adds tokens fast
RAG context: adding documents can add thousands of tokens per request
Long responses: complex tasks generate more output tokens

Tips to Reduce Token Costs

Use a budget model for simple tasks
Keep your system prompt under 200 tokens
Limit conversation history to last 5 messages
Set a max_tokens limit on responses
Use context caching for repeated content

Use Our Free Calculator

Not sure how much your app will cost? Try our AI API Cost Calculator — enter your users, frequency, and use case to get an instant cost breakdown across all major models.

Conclusion

Token pricing is straightforward once you understand it. The key is knowing that input and output are priced separately, output costs more, and small optimizations to your prompts and context can save significant money at scale.