Why Claude Code Uses So Many Tokens Explained
A typical Claude Code session consumes 100,000 to 250,000 tokens. That’s not because the model is inefficient – it’s because every file read (500-5,000 tokens each), every bash command output (200-2,000 tokens), every tool result, and every conversation turn accumulates in the context window. A session that reads 20 files, runs 15 commands, and has 10 conversation exchanges can easily hit 150,000 tokens. At API-equivalent Opus rates ($5.00/MTok), that’s $0.75 in context alone per subsequent interaction.
The Setup
Claude Code operates as an agentic coding assistant with full tool access: it reads files via the text editor (700 tokens overhead per invocation), runs bash commands (245 tokens overhead per invocation), and maintains the entire conversation history in context. Unlike a chatbot where messages are short, Claude Code messages contain entire file contents, full command outputs, and detailed code blocks. The context window grows monotonically throughout a session because the model needs the history to maintain coherent understanding of your codebase and your intent.
The Math
Token breakdown for a real-world debugging session:
Task: Fix a failing integration test
| Action | Tokens Added | Running Total |
|---|---|---|
| System prompt + CLAUDE.md | 2,000 | 2,000 |
| User describes the bug | 200 | 2,200 |
| Model reads test file (450 lines) | 4,500 | 6,700 |
| Model reads source file (300 lines) | 3,000 | 9,700 |
| Model reads config file | 800 | 10,500 |
| Model runs failing test | 2,500 (output) | 13,000 |
| Model’s analysis response | 1,500 | 14,500 |
| User asks for fix | 100 | 14,600 |
| Model reads 2 more source files | 5,000 | 19,600 |
| Model edits source file | 1,200 | 20,800 |
| Model runs test again | 2,000 (output) | 22,800 |
| New error, model reads stack trace | 1,500 | 24,300 |
| Model reads another file | 2,500 | 26,800 |
| Model applies second fix | 1,000 | 27,800 |
| Model runs test suite | 3,500 (full output) | 31,300 |
| Model’s summary response | 800 | 32,100 |
Total after one bug fix: 32,100 tokens
After 4 such bug fixes in a session: ~128,000 tokens
Every subsequent interaction at 128K context:
- Input cost (API equivalent): 128K x $5.00/MTok = $0.64 per interaction
- 5 more interactions at this level: $3.20 in context re-processing
The five biggest token sinks:
- File reads (35-45% of tokens): Each file read dumps the entire content into context
- Command outputs (15-25%): Test results, build logs, git diffs
- Model responses (15-20%): Explanations, code blocks, analysis
- Tool overhead (5-10%): 700 tokens for text_editor, 245 for bash, per invocation
- Conversation history (10-15%): All previous turns stay in context
The Technique
Reduce token consumption by controlling the five token sinks.
# Sink 1: File reads - be surgical, not exhaustive
# BAD: "Read the entire auth module"
# This reads every file in the directory: 5-10 files x 3,000 tokens = 15,000-30,000 tokens
# GOOD: "Read only the authenticate function in src/auth/login.ts, around lines 45-80"
# This reads ~35 lines: ~350 tokens
# Sink 2: Command outputs - filter before displaying
# BAD: Run the full test suite (outputs 500 lines)
# GOOD: Run only the failing test with minimal output
npm test -- --testPathPattern="auth.test.ts" --verbose 2>&1 | tail -30
# BAD: Full git log
git log
# GOOD: Concise recent log
git log --oneline -10
# Sink 3: Be specific in your requests to get concise responses
# BAD: "Explain what this code does and how to fix it"
# GOOD: "The test on line 47 fails with TypeError. Fix it."
Configure your project to minimize unnecessary file reading:
// .claude/settings.json - control what Claude Code can see
{
"permissions": {
"allow": [
"Read(src/**)",
"Write(src/**)",
"Bash(npm test*)",
"Bash(git *)"
],
"deny": [
"Read(node_modules/**)",
"Read(dist/**)",
"Read(.git/**)",
"Read(coverage/**)"
]
}
}
For CLAUDE.md, include a project map so Claude doesn’t need to explore:
# Project Structure
src/
app/ - Next.js routes (App Router)
components/ - React components
lib/
auth.ts - Authentication logic
db.ts - Database client
utils.ts - Shared utilities
api/ - API handlers
tests/
unit/ - Jest unit tests
integration/ - Integration tests
This 100-token map saves thousands of tokens that would otherwise be spent on ls and find commands exploring the directory structure.
The Tradeoffs
Being overly restrictive with file reads can backfire. If Claude can’t see enough context, it makes incorrect assumptions and produces buggy code that requires more iterations to fix – consuming more tokens than a single full file read would have. The goal is informed minimalism: give Claude exactly the context it needs, no more and no less. When in doubt, let Claude read the full file rather than restricting it to a range that might miss important context (imports, type definitions, related functions).
Implementation Checklist
- Add a project structure map to your CLAUDE.md file
- Configure
.claude/settings.jsonto deny access to large generated directories - Train yourself to give specific file and line references rather than broad requests
- Pipe command outputs through
tail,head, orgrepto reduce output size - Use
/compactwhen context exceeds 100K tokens - Start new sessions for unrelated tasks rather than extending long sessions
- Keep CLAUDE.md under 200 tokens
Measuring Impact
On subscription plans, reducing token consumption means hitting rate limits less often and getting more done per session. On the Max 20x plan ($200/month), this translates to higher effective throughput. On API usage, the savings are direct. Monitor your average session length (in tokens) and average interactions per session. After applying these techniques, session peak token counts should drop 30-50%, and interactions per session should increase by a similar amount. The average Claude Code developer spends about $6/day at API rates – these techniques can cut that to $3-4/day.