Claude Code /compact Saves Thousands of Tokens

Written by Michael Lip · Solo founder of Zovo · $400K+ on Upwork · 100% JSS

The /compact command in Claude Code compresses your session context by an estimated 50-70%. A session sitting at 180,000 tokens drops to roughly 60,000 tokens after compaction. On API-equivalent pricing at Opus rates ($5.00/MTok input), the 120,000 tokens removed save about $0.60 in context re-processing on each subsequent interaction. Over 3 remaining interactions, that’s $1.80 per session. A team running 50 sessions per day saves about $2,700 over a 30-day month from this one command.
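The arithmetic above can be reproduced with a short sketch. It assumes the context stays flat at its post-compaction size for the remaining interactions, counts input tokens only, and uses a 30-day month (which is what the $2,700 figure implies). The function name and parameters are illustrative and not part of Claude Code.

```python
def compaction_savings(
    tokens_before: int,
    tokens_after: int,
    remaining_interactions: int,
    input_price_per_mtok: float = 5.00,  # Opus input rate quoted in the article
) -> float:
    """Input-cost savings from re-sending a smaller context.

    Simplification: the context is treated as constant after compaction.
    """
    tokens_saved = tokens_before - tokens_after
    saving_per_interaction = tokens_saved * input_price_per_mtok / 1_000_000
    return saving_per_interaction * remaining_interactions


per_session = compaction_savings(180_000, 60_000, remaining_interactions=3)
per_month = per_session * 50 * 30  # 50 sessions/day, 30-day month (assumption)

print(f"Per session: ${per_session:.2f}")
print(f"Per month:   ${per_month:,.2f}")
```

Changing `remaining_interactions` shows how quickly the benefit scales: the earlier in a long session you compact, the more interactions benefit from the smaller context.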

The Setup

Claude Code sessions grow continuously. Every file read, command output, tool result, and conversation turn adds to the context window. After an hour of intensive coding, your context can balloon to 150,000-200,000 tokens. The problem compounds: every new message sends the entire context back to the model, so the cost of each interaction grows linearly with the size of the context, and the total cost of the session grows faster than linearly. The /compact command summarizes the conversation history, distilling key decisions and code changes while discarding verbose intermediate output (command results, file contents already incorporated into edits, exploration dead-ends). The model retains the essential state in far fewer tokens.
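To see why the compounding matters, here is a minimal sketch of cumulative input cost when every turn re-sends all prior context. The fixed growth of 5,000 tokens per interaction is a made-up illustrative value, and the model ignores output tokens and any prompt caching, so treat the results as an upper-bound illustration rather than a billing estimate.

```python
def session_input_cost(
    interactions: int,
    tokens_added_per_interaction: int,
    input_price_per_mtok: float = 5.00,  # Opus input rate quoted in the article
) -> float:
    """Total input cost when each turn re-sends the whole context so far."""
    context = 0
    total_tokens_sent = 0
    for _ in range(interactions):
        context += tokens_added_per_interaction  # context grows every turn
        total_tokens_sent += context             # and is re-sent in full
    return total_tokens_sent * input_price_per_mtok / 1_000_000


# Hypothetical growth rate: 5,000 tokens per interaction
for n in (10, 20, 40):
    print(f"{n} interactions: ${session_input_cost(n, 5_000):.2f}")
```

Under these assumptions, doubling the number of interactions roughly quadruples the total input cost, which is the effect compaction is meant to interrupt.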

The Math

A typical 2-hour coding session without compaction:

Session context growth:

Cost of last 3 interactions without /compact (API equivalent at $5.00/MTok):

Cost with /compact at 90-minute mark (reduces 180K to 60K):

Savings per session: $1.85 (58%)

At 50 sessions/day across a team: $2,775/month

The Technique

Use /compact strategically at key session milestones to maximize token savings.

# Claude Code session workflow with strategic compaction

# Phase 1: Investigation (context grows fast)
# Read files, run tests, understand the problem
# Context: 0 -> 80K tokens

# Phase 2: First implementation
# Write code, run tests
# Context: 80K -> 150K tokens

# >>> USE /compact HERE <<<
# Context drops: 150K -> ~50K tokens
# Key decisions and code changes preserved
# Verbose file contents and command outputs discarded

# Phase 3: Refinement
# Fix issues, add tests, polish
# Context: 50K -> 100K tokens (instead of 150K -> 200K)

# >>> USE /compact AGAIN if needed <<<
# Context drops: 100K -> ~35K tokens

# Phase 4: Final review and commit
# Context stays manageable: 35K -> 60K tokens

Optimize your CLAUDE.md to reduce the baseline context footprint:

# CLAUDE.md - Optimized for token efficiency

## Project: MyApp
- Language: TypeScript, Node.js 20
- Framework: Next.js 14, App Router
- Test: Jest + React Testing Library
- Style: ESLint + Prettier, 2-space indent

## Key directories
- src/app/ - routes and pages
- src/components/ - React components
- src/lib/ - utilities and helpers
- src/api/ - API route handlers

## Conventions
- Functional components only
- Server Components by default, "use client" when needed
- Zod for validation
- Error boundaries in layout.tsx files

## Common commands
- `pnpm dev` - start dev server
- `pnpm test` - run tests
- `pnpm lint` - run linter
- `pnpm build` - production build

This CLAUDE.md runs about 150 tokens – far less than a 500+ token version with lengthy explanations and examples. Every token saved in CLAUDE.md saves that token on every single interaction in every session.

# Calculate CLAUDE.md token impact
def claude_md_impact(
    claude_md_tokens: int,
    interactions_per_session: int = 20,
    sessions_per_day: int = 10,
    days_per_month: int = 22,
    input_price_per_mtok: float = 5.00  # Opus rate
) -> dict:
    """Calculate monthly cost of CLAUDE.md tokens."""
    total_appearances = (
        interactions_per_session * sessions_per_day * days_per_month
    )
    total_tokens = claude_md_tokens * total_appearances
    monthly_cost = total_tokens * input_price_per_mtok / 1_000_000

    return {
        "claude_md_tokens": claude_md_tokens,
        "monthly_appearances": total_appearances,
        "monthly_tokens": total_tokens,
        "monthly_cost": f"${monthly_cost:.2f}",
    }

# Compare verbose vs optimized CLAUDE.md
verbose = claude_md_impact(500)  # 500-token CLAUDE.md
optimized = claude_md_impact(150)  # 150-token CLAUDE.md

print(f"Verbose CLAUDE.md: {verbose['monthly_cost']}/month")
print(f"Optimized CLAUDE.md: {optimized['monthly_cost']}/month")
# Output:
# Verbose CLAUDE.md: $11.00/month
# Optimized CLAUDE.md: $3.30/month
# Difference: $7.70/month per developer (at the default 10 sessions/day, 20 interactions each)

The Tradeoffs

Compaction is lossy. The model summarizes rather than preserves exact conversation history, which means specific details about why you rejected an approach, exact error messages, or nuanced discussion about edge cases may be lost. If you need to reference something from earlier in the session after compaction, the model might not remember it accurately. Best practice: compact after major milestones (completed a feature, fixed a bug) rather than mid-investigation. Also, compaction itself consumes tokens – the model reads the full context to produce the summary, so there’s a one-time cost before the ongoing savings.
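The one-time cost can be weighed against the ongoing savings with a rough break-even estimate. This sketch assumes compaction costs one full read of the current context at the input rate, plus an optional output cost for the summary that you supply yourself. The function and its parameters are hypothetical helpers, not a Claude Code feature.

```python
import math


def break_even_interactions(
    tokens_before: int,
    tokens_after: int,
    input_price_per_mtok: float = 5.00,  # Opus input rate quoted in the article
    summary_output_cost: float = 0.0,    # assumption: set to your own estimate
) -> int:
    """Interactions needed after compaction before it has paid for itself."""
    one_time_cost = (
        tokens_before * input_price_per_mtok / 1_000_000 + summary_output_cost
    )
    saving_per_interaction = (
        (tokens_before - tokens_after) * input_price_per_mtok / 1_000_000
    )
    if saving_per_interaction <= 0:
        raise ValueError("compaction must shrink the context to pay off")
    # Round before ceil so float noise does not add a spurious extra turn
    return math.ceil(round(one_time_cost / saving_per_interaction, 9))


print(break_even_interactions(180_000, 60_000))
```

With the 180K to 60K example, the read costs about $0.90 against savings of $0.60 per interaction, so compaction pays off from the second interaction onward. Compacting right before the end of a session is therefore unlikely to be worth it.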

Implementation Checklist

Measuring Impact

On subscription plans (Pro at $20/month, Max at $100-$200/month), cost savings from /compact don’t directly reduce your bill – they stretch your rate limits, letting you do more within the same subscription. On API usage, savings are direct: the token reduction translates to a proportional cost reduction. Track session length in interactions, not minutes, before and after adopting /compact. Teams typically find that sessions run for 30-50% more interactions once they compact regularly, meaning more productive coding per session without hitting context limits.
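One simple way to track this is to log the number of interactions per session and compare the averages before and after adopting /compact. The sketch below uses made-up counts purely for illustration. How you collect the counts (manual notes, your own logging) is left open.

```python
from statistics import mean


def interaction_gain(before: list[int], after: list[int]) -> float:
    """Percent change in mean interactions per session."""
    baseline = mean(before)
    return (mean(after) - baseline) / baseline * 100


# Hypothetical interaction counts per session, not measured data
before_compact = [18, 22, 20]
after_compact = [27, 30, 24]

gain = interaction_gain(before_compact, after_compact)
print(f"Change in interactions per session: {gain:+.0f}%")
```

A handful of sessions is noisy, so collect a week or two of counts on each side before drawing conclusions.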