Why Claude Code Uses So Many Tokens Explained

Written by Michael Lip · Solo founder of Zovo · $400K+ on Upwork · 100% JSS Join 50+ builders · More at zovo.one

A typical Claude Code session consumes 100,000 to 250,000 tokens. That’s not because the model is inefficient – it’s because every file read (500-5,000 tokens each), every bash command output (200-2,000 tokens), every tool result, and every conversation turn accumulates in the context window. A session that reads 20 files, runs 15 commands, and has 10 conversation exchanges can easily hit 150,000 tokens. At API-equivalent Opus rates ($5.00/MTok), that’s $0.75 in context alone per subsequent interaction.

The Setup

Claude Code operates as an agentic coding assistant with full tool access: it reads files via the text editor (700 tokens overhead per invocation), runs bash commands (245 tokens overhead per invocation), and maintains the entire conversation history in context. Unlike a chatbot where messages are short, Claude Code messages contain entire file contents, full command outputs, and detailed code blocks. The context window grows monotonically throughout a session because the model needs the history to maintain coherent understanding of your codebase and your intent.

The Math

Token breakdown for a real-world debugging session:

Task: Fix a failing integration test

Action Tokens Added Running Total
System prompt + CLAUDE.md 2,000 2,000
User describes the bug 200 2,200
Model reads test file (450 lines) 4,500 6,700
Model reads source file (300 lines) 3,000 9,700
Model reads config file 800 10,500
Model runs failing test 2,500 (output) 13,000
Model’s analysis response 1,500 14,500
User asks for fix 100 14,600
Model reads 2 more source files 5,000 19,600
Model edits source file 1,200 20,800
Model runs test again 2,000 (output) 22,800
New error, model reads stack trace 1,500 24,300
Model reads another file 2,500 26,800
Model applies second fix 1,000 27,800
Model runs test suite 3,500 (full output) 31,300
Model’s summary response 800 32,100

Total after one bug fix: 32,100 tokens

After 4 such bug fixes in a session: ~128,000 tokens

Every subsequent interaction at 128K context:

The five biggest token sinks:

  1. File reads (35-45% of tokens): Each file read dumps the entire content into context
  2. Command outputs (15-25%): Test results, build logs, git diffs
  3. Model responses (15-20%): Explanations, code blocks, analysis
  4. Tool overhead (5-10%): 700 tokens for text_editor, 245 for bash, per invocation
  5. Conversation history (10-15%): All previous turns stay in context

The Technique

Reduce token consumption by controlling the five token sinks.

# Sink 1: File reads - be surgical, not exhaustive

# BAD: "Read the entire auth module"
# This reads every file in the directory: 5-10 files x 3,000 tokens = 15,000-30,000 tokens

# GOOD: "Read only the authenticate function in src/auth/login.ts, around lines 45-80"
# This reads ~35 lines: ~350 tokens

# Sink 2: Command outputs - filter before displaying

# BAD: Run the full test suite (outputs 500 lines)
# GOOD: Run only the failing test with minimal output
npm test -- --testPathPattern="auth.test.ts" --verbose 2>&1 | tail -30

# BAD: Full git log
git log

# GOOD: Concise recent log
git log --oneline -10

# Sink 3: Be specific in your requests to get concise responses
# BAD: "Explain what this code does and how to fix it"
# GOOD: "The test on line 47 fails with TypeError. Fix it."

Configure your project to minimize unnecessary file reading:

// .claude/settings.json - control what Claude Code can see
{
  "permissions": {
    "allow": [
      "Read(src/**)",
      "Write(src/**)",
      "Bash(npm test*)",
      "Bash(git *)"
    ],
    "deny": [
      "Read(node_modules/**)",
      "Read(dist/**)",
      "Read(.git/**)",
      "Read(coverage/**)"
    ]
  }
}

For CLAUDE.md, include a project map so Claude doesn’t need to explore:

# Project Structure
src/
  app/          - Next.js routes (App Router)
  components/   - React components
  lib/
    auth.ts     - Authentication logic
    db.ts       - Database client
    utils.ts    - Shared utilities
  api/          - API handlers
tests/
  unit/         - Jest unit tests
  integration/  - Integration tests

This 100-token map saves thousands of tokens that would otherwise be spent on ls and find commands exploring the directory structure.

The Tradeoffs

Being overly restrictive with file reads can backfire. If Claude can’t see enough context, it makes incorrect assumptions and produces buggy code that requires more iterations to fix – consuming more tokens than a single full file read would have. The goal is informed minimalism: give Claude exactly the context it needs, no more and no less. When in doubt, let Claude read the full file rather than restricting it to a range that might miss important context (imports, type definitions, related functions).

Implementation Checklist

Measuring Impact

On subscription plans, reducing token consumption means hitting rate limits less often and getting more done per session. On the Max 20x plan ($200/month), this translates to higher effective throughput. On API usage, the savings are direct. Monitor your average session length (in tokens) and average interactions per session. After applying these techniques, session peak token counts should drop 30-50%, and interactions per session should increase by a similar amount. The average Claude Code developer spends about $6/day at API rates – these techniques can cut that to $3-4/day.