Why Large Context Makes Claude Code Expensive
A Claude Code session on Opus 4.7 starts at roughly 5K tokens and grows to 150K+ tokens within 30 interactions. Each interaction at 150K context costs $0.75 in input tokens alone. The average developer spends $6/day on Claude Code — and most of that goes to re-sending bloated context rather than generating new output.
The Setup
Claude Code maintains a running conversation history that includes every user message, assistant response, tool call, tool result, code edit, terminal output, and error trace from the entire session. Every new interaction sends this entire history back to the API as context.
The economics are straightforward: if your context is 150K tokens and input costs $5.00/MTok on Opus 4.7, each interaction costs $0.75 just to re-read the conversation history. The actual new content you are asking about might be 500 tokens — but you pay for 150K.
This guide breaks down exactly where Claude Code context tokens accumulate and how to manage them.
The Math
Context growth in a typical 30-interaction Claude Code session:
| Interaction | Est. Context | Opus Input Cost | Cumulative Cost |
|---|---|---|---|
| 1 | 5K | $0.025 | $0.025 |
| 5 | 25K | $0.125 | $0.375 |
| 10 | 50K | $0.250 | $1.375 |
| 15 | 75K | $0.375 | $3.125 |
| 20 | 100K | $0.500 | $5.625 |
| 25 | 125K | $0.625 | $8.875 |
| 30 | 150K | $0.750 | $12.875 |
Total session input cost: $12.88 on Opus 4.7
For a developer running 3 sessions/day: $38.63/day -> $1,158.90/month in input tokens alone
Add output tokens (average 2K per interaction at $25/MTok):
- Output cost: 30 * 2K * $25/MTok = $1.50/session
- Total session: $14.38
- Monthly (3 sessions/day): $1,294.20/month
With context management (compact at interaction 15):
- Interactions 1-15: $3.125 cumulative
- Compact: 75K -> 25K
- Interactions 16-30: ~$4.50 cumulative
- Total session: $7.63
- Monthly (3 sessions/day): $686.70/month
Savings: $607.50/month (47%)
The Technique
Understanding Where Tokens Accumulate
def analyze_claude_code_context_growth(session_log: list) -> dict:
"""Analyze token growth patterns in a Claude Code session."""
components = {
"tool_calls": 0,
"tool_results": 0,
"code_output": 0,
"error_traces": 0,
"assistant_text": 0,
"user_messages": 0,
}
for entry in session_log:
content = entry.get("content", "")
char_count = len(content)
token_est = char_count // 4
role = entry.get("role", "")
if entry.get("type") == "tool_use":
components["tool_calls"] += token_est
elif entry.get("type") == "tool_result":
# Tool results are the biggest token consumers
if "Error" in content or "Traceback" in content:
components["error_traces"] += token_est
elif "```" in content:
components["code_output"] += token_est
else:
components["tool_results"] += token_est
elif role == "assistant":
components["assistant_text"] += token_est
elif role == "user":
components["user_messages"] += token_est
total = sum(components.values())
percentages = {k: f"{v/max(total,1)*100:.1f}%" for k, v in components.items()}
return {
"total_tokens": total,
"breakdown": components,
"percentages": percentages,
"top_consumer": max(components, key=components.get),
}
The Biggest Token Consumers in Claude Code
Based on production usage analysis:
- Tool results (40-50%) — Terminal output, file contents read by Claude, build logs
- Code blocks (20-30%) — Full file contents in assistant responses
- Error traces (10-15%) — Stack traces, build errors, test failures
- Assistant text (10-15%) — Explanations, analysis, suggestions
- User messages (5-10%) — Your actual requests
# Strategies to reduce each component:
# 1. Tool results: Truncate long outputs
def truncate_tool_result(output: str, max_lines: int = 50) -> str:
"""Keep first and last N lines, truncate middle."""
lines = output.split("\n")
if len(lines) <= max_lines:
return output
half = max_lines // 2
return "\n".join(lines[:half] + [f"\n... ({len(lines) - max_lines} lines truncated) ...\n"] + lines[-half:])
# 2. Code blocks: Reference files instead of including full content
# In Claude Code, prefer "read file X" over pasting file content
# 3. Error traces: Use compact error messages
def compact_error(traceback: str) -> str:
"""Extract the essential error info from a full traceback."""
lines = traceback.strip().split("\n")
# Keep last line (the actual error) and the file/line references
essential = [l for l in lines if "File " in l or "Error" in l or "Exception" in l]
return "\n".join(essential[-5:]) # Last 5 relevant lines
Practical Session Management in Claude Code
# 1. Start sessions with clear, focused scope
# BAD: "Fix all the bugs in the project"
# GOOD: "Fix the authentication bug in auth.py line 42"
# 2. Use /compact at natural breakpoints
# After completing a task:
/compact
# 3. Start new sessions for unrelated tasks
# Don't reuse a session from bug fixing for a new feature
# 4. Use CLAUDE.md to persist critical context across compactions
# Claude Code reads this file automatically
Cost Monitoring Script
import json
from datetime import datetime, timedelta
def estimate_daily_claude_code_cost(
sessions_per_day: int = 3,
interactions_per_session: int = 30,
avg_growth_per_interaction: int = 5000, # tokens
model_input_rate: float = 5.0, # $/MTok
model_output_rate: float = 25.0,
avg_output_tokens: int = 2000,
compact_interval: int = 0, # 0 = no compaction
) -> dict:
"""Estimate daily Claude Code spending."""
total_input_cost = 0
total_output_cost = 0
for _ in range(sessions_per_day):
context = 0
for i in range(interactions_per_session):
context += avg_growth_per_interaction
if compact_interval > 0 and (i + 1) % compact_interval == 0:
context = int(context * 0.33)
input_cost = context * model_input_rate / 1_000_000
output_cost = avg_output_tokens * model_output_rate / 1_000_000
total_input_cost += input_cost
total_output_cost += output_cost
return {
"daily_input_cost": f"${total_input_cost:.2f}",
"daily_output_cost": f"${total_output_cost:.2f}",
"daily_total": f"${total_input_cost + total_output_cost:.2f}",
"monthly_total": f"${(total_input_cost + total_output_cost) * 30:.2f}",
}
# Without compaction
print("No compaction:", estimate_daily_claude_code_cost())
# With compaction every 15 interactions
print("Compact@15:", estimate_daily_claude_code_cost(compact_interval=15))
# With compaction every 10 interactions
print("Compact@10:", estimate_daily_claude_code_cost(compact_interval=10))
The Tradeoffs
The average developer spending $6/day on Claude Code is already on the cheaper end. Industry data shows 90% of Claude Code users spend under $12/day. For many developers, the productivity gains justify the cost without optimization.
Context management adds cognitive overhead. Remembering to compact, starting new sessions, and writing CLAUDE.md files takes time. If your time is worth more than the savings, focus only on the highest-impact optimizations.
Aggressive compaction can disrupt complex debugging workflows where Claude needs the full error history to find root causes. Keep full context during active debugging; compact after resolution.
Implementation Checklist
- Track your session lengths (interactions per session)
- Identify sessions that regularly exceed 100K context
- Add /compact to your workflow after completing each subtask
- Start new sessions for unrelated tasks
- Use CLAUDE.md for persistent context across sessions
- Monitor daily spend through your Anthropic dashboard
Measuring Impact
Track daily Claude Code spend before and after adopting context management habits. The industry benchmark is $6/day average. If you are above that, context management can bring you in line. Measure interactions per session and context size at session end — target keeping final context under 80K tokens through strategic compaction.
Related Guides
- Why Is Claude Code Expensive — the full analysis of Claude Code cost drivers
- Claude Code Context Window Management Guide — practical context management techniques
- Why Does Anthropic Limit Claude Code Context Window — design decisions behind context limits