How Many Tokens Per Claude Session? (2026)

Last updated: April 26, 2026

The most common question new Claude Code users ask is “how much will this cost?” The answer depends on what you are doing, how long the session lasts, and how you manage context. This guide gives you real numbers based on typical usage patterns.

For a hands-on estimate before your next session, use the Token Estimator tool.

Session Token Benchmarks

These benchmarks reflect typical Claude Code usage patterns in 2026. Your numbers will vary based on project complexity, prompt style, and context management habits.

Quick bug fix (5-15 minutes)

Metric	Typical Range
Input tokens	15,000 - 50,000
Output tokens	3,000 - 10,000
API calls	3-8
Files read	1-3
Estimated cost (Opus)	$0.30 - $1.00
Estimated cost (Sonnet)	$0.06 - $0.20

A quick bug fix typically involves reading the error, reading 1-2 source files, making a change, and verifying. Context stays small because the session is short.

Feature implementation (30-60 minutes)

Metric	Typical Range
Input tokens	80,000 - 250,000
Output tokens	20,000 - 60,000
API calls	15-40
Files read	5-15
Estimated cost (Opus)	$2.00 - $8.00
Estimated cost (Sonnet)	$0.40 - $1.60

Feature implementation involves more exploration, multiple file reads, iterative changes, and testing. Context grows significantly without compacting.

Code review (15-30 minutes)

Metric	Typical Range
Input tokens	30,000 - 100,000
Output tokens	8,000 - 25,000
API calls	5-15
Files read	3-10
Estimated cost (Opus)	$0.80 - $3.00
Estimated cost (Sonnet)	$0.16 - $0.60

Code reviews read multiple files but generate focused output. The /review command is efficient because it reads diffs rather than full files.

Architecture discussion (60-120 minutes)

Metric	Typical Range
Input tokens	150,000 - 400,000
Output tokens	40,000 - 100,000
API calls	30-80
Files read	10-30
Estimated cost (Opus)	$5.00 - $15.00
Estimated cost (Sonnet)	$1.00 - $3.00

Architecture discussions are the most token-intensive because they involve broad codebase exploration, long reasoning chains, and extensive back-and-forth dialogue.

Why Sessions Get Expensive

The primary cost driver is context re-transmission. Here is how it works:

Message 1: You send 200 tokens. Claude receives 200 + system overhead. Message 5: You send 200 tokens. Claude receives your 200 + previous 4 exchanges (~4,000 tokens) + system overhead. Message 20: You send 200 tokens. Claude receives your 200 + previous 19 exchanges (~20,000 tokens) + system overhead.

Each message sends everything that came before. By message 20, a 200-token prompt costs 20,000+ input tokens because of the accumulated history.

This is why proactive compacting is the single highest-impact cost reduction technique. Compacting at message 15 can cut the cost of messages 16-30 by 60% or more.

Estimating Before You Start

Use this formula for a rough estimate:

Estimated input tokens = (messages × avg_message_size × messages/2) + (files × avg_file_tokens) + system_overhead
Estimated output tokens = messages × avg_response_size

Or skip the math and use the Token Estimator, which does this calculation for you based on your specific inputs.

Quick rules of thumb

Budget $1 for every 30 minutes of Opus usage (with compacting)
Budget $0.20 for every 30 minutes of Sonnet usage (with compacting)
Without compacting, double or triple these estimates
Large codebase exploration (10+ files) adds $0.50-$2.00

How to Track Usage During a Session

The /cost command

Run /cost at any time:

/cost

It shows cumulative input tokens, output tokens, cache hits, and estimated dollar cost. Make this a habit: check every 10-15 messages to stay aware.

Session cost alerts

Configure cost alerts to get warnings before you exceed your budget. See the cost alerts guide for setup instructions.

Team-wide tracking

For organizations, ccusage provides aggregated dashboards across all team members and projects.

Cost Reduction in Practice

A developer working 6 hours per day with Claude Code might see these monthly costs:

Strategy	Monthly cost (Opus)	Monthly cost (Sonnet)
No optimization	$400-800	$80-160
With regular compacting	$200-400	$40-80
With compacting + model switching	$100-200	$30-60
With all optimizations	$80-150	$20-40

The “all optimizations” row includes compacting, model switching, precise prompts, lean CLAUDE.md, and MCP tool pruning. See reduce token usage for the full list.

Try It Yourself

Before your next coding session, open the Token Estimator and estimate the cost based on:

How many messages you expect to send (usually 2x what you think)
How many files Claude will need to read
Which model you plan to use

After the session, compare your estimate to the actual /cost output. This calibration exercise quickly builds intuition for token costs.

Frequently Asked Questions

What is the cheapest way to use Claude Code?

Use Sonnet for all routine tasks, compact at 40% context usage, write precise prompts with file paths, and keep CLAUDE.md under 300 words. This combination cuts costs by 70-80% compared to default usage.

Does the Max plan include unlimited Claude Code tokens?

The Anthropic Max plan includes Claude Code usage with rate limits. You do not pay per-token, but rate limits may throttle heavy usage. Check current plan details on Anthropic’s pricing page.

How do cache tokens affect my costs?

Cache read tokens cost roughly 90% less than regular input tokens. Your CLAUDE.md and system prompt are frequently cached, which reduces the overhead of these repeated elements significantly.

Can I set a hard spending limit per session?

Claude Code supports budget configuration through the cost alerts system. While it does not hard-stop at a limit, it warns you when approaching your budget so you can compact or end the session.

Know your costs → Use our Claude Code Cost Calculator to estimate your monthly spend.

Configure MCP → Build your server config with our MCP Config Generator.

Token Estimator — Estimate session costs before starting
Token Usage Explained — How tokens work under the hood
Cost Optimization: 15 Techniques — Complete cost-saving guide
Cost Alerts and Budgets — Set up spending limits
Reduce Token Usage — Cut consumption strategies