Claude Code Router: Model Routing Guide (2026)
Model routing is the practice of selecting the optimal Claude model for each task based on complexity, cost, and quality requirements. Claude Code supports both automatic and manual routing strategies. This guide covers every approach.
What Is Model Routing?
Not every task needs the most powerful model. A simple file rename does not need Opus. A complex architecture decision should not be handled by Haiku. Model routing matches tasks to the right model, saving money without sacrificing quality.
Claude Code’s model routing works at three levels:
- Built-in automatic routing — Claude Code selects models internally for different operations
- Manual model selection — you specify which model to use via CLI flags
- CLAUDE.md routing rules — you define routing logic in your project configuration
Built-In Routing: How Claude Code Selects Models
Claude Code uses different models for different internal operations:
| Operation | Model Used | Why |
|---|---|---|
| Primary reasoning | Opus or Sonnet (user-selected) | Main task processing |
| Subagent tasks | Same as parent or configurable | Delegated work |
| Quick completions | Haiku | Tab completion, simple suggestions |
| File summarization | Haiku or Sonnet | Context compression |
The primary model — the one doing the heavy reasoning — is what you control directly. Claude Code handles the rest.
Manual Model Selection
CLI Flag
Specify the model when launching Claude Code:
# Use Opus for complex work
claude --model claude-opus-4-20250514
# Use Sonnet for general development
claude --model claude-sonnet-4-20250514
# Use Haiku for simple edits
claude --model claude-haiku-4-5-20251001
In API Mode
When using Claude Code programmatically:
# Pipe a task to a specific model
echo "Fix the typo in README.md" | claude --model claude-haiku-4-5-20251001 -p
# Complex task with Opus
echo "Refactor the auth module to use JWT" | claude --model claude-opus-4-20250514 -p
Model Aliases
Claude Code supports shorthand aliases:
claude --model opus # resolves to claude-opus-4-20250514
claude --model sonnet # resolves to claude-sonnet-4-20250514
claude --model haiku # resolves to claude-haiku-4-5-20251001
CLAUDE.md Routing Rules
You can embed routing guidance in your CLAUDE.md file so every team member uses the same strategy:
## Model Routing
- Use Opus for: architecture decisions, security reviews, complex refactors
- Use Sonnet for: feature implementation, bug fixes, code generation
- Use Haiku for: formatting fixes, typo corrections, simple renames
When the user doesn't specify a model, default to Sonnet.
These rules are advisory — Claude Code reads them as part of its context, but the actual model used depends on the --model flag or account default.
For CLAUDE.md configuration patterns in detail, see our CLAUDE.md best practices guide.
Smart Routing Strategies
Strategy 1: Cost Optimization
Start with the cheapest model. Escalate only when the task fails or produces poor results.
# First attempt with Haiku
echo "Add input validation to the signup form" | claude --model haiku -p
# If output quality is insufficient, escalate
echo "Add input validation to the signup form" | claude --model sonnet -p
This works well for tasks with clear success criteria. If the Haiku output compiles and passes tests, you saved 80% versus Sonnet.
Cost savings: 60-80% compared to always using Sonnet.
Strategy 2: Quality Optimization
Always use Opus. Ignore cost, maximize output quality.
claude --model opus
Best for: security audits, architecture reviews, production-critical code. When a mistake costs more than the API bill, use the best model.
Tradeoff: 5x the cost of Sonnet, but measurably better at complex reasoning and edge-case handling.
Strategy 3: Balanced (Recommended for Most Teams)
Use Sonnet as default. Escalate to Opus for defined categories.
## CLAUDE.md Routing (Balanced)
Default model: Sonnet
Escalate to Opus when:
- Designing new system architecture
- Reviewing security-sensitive code
- Debugging issues that have resisted multiple fix attempts
- Writing complex algorithms
Use Haiku when:
- Fixing typos or formatting
- Generating boilerplate
- Running simple grep-and-replace operations
This is the strategy most teams should start with. It provides good quality at reasonable cost.
Multi-Model Workflows
The most powerful routing pattern uses different models for different stages of the same task.
Opus Plans, Sonnet Implements, Haiku Reviews
# Step 1: Opus creates the plan
echo "Plan a refactor of the payment module to support multiple providers" \
| claude --model opus -p > plan.md
# Step 2: Sonnet implements the plan
echo "Implement the plan in plan.md" \
| claude --model sonnet -p
# Step 3: Haiku does a quick lint check
echo "Check all modified files for obvious issues" \
| claude --model haiku -p
This mirrors how human teams work: senior architects plan, mid-level engineers implement, junior engineers handle review checklists.
For more on multi-agent workflows, see our multi-agent architecture guide.
Parallel Multi-Model with Subagents
When using Claude Code’s subagent system, each subagent can use a different model:
## CLAUDE.md
When spawning subagents:
- Security scan subagent: use Opus
- Test generation subagent: use Sonnet
- Documentation update subagent: use Haiku
This lets you parallelize work across models, optimizing cost and quality simultaneously.
Cost Comparison by Model
Current per-token pricing (as of April 2026):
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Relative Cost |
|---|---|---|---|
| Claude Opus 4 | $15.00 | $75.00 | 5x |
| Claude Sonnet 4 | $3.00 | $15.00 | 1x (baseline) |
| Claude Haiku 4.5 | $0.80 | $4.00 | 0.27x |
Prompt caching reduces input costs by 90% for cached content. This applies to all models and is particularly valuable for routing scenarios where the same system prompt is sent repeatedly.
Real Cost Examples
| Task | Haiku Cost | Sonnet Cost | Opus Cost |
|---|---|---|---|
| Fix a typo | $0.005 | $0.02 | $0.10 |
| Implement a feature | $0.08 | $0.30 | $1.50 |
| Full codebase refactor | $0.50 | $2.00 | $10.00 |
| Architecture review | $0.15 | $0.60 | $3.00 |
Routing Haiku for simple tasks and Sonnet for the rest can cut your bill by 30-50% with no quality loss. See our cost optimization guide for additional strategies.
Need the complete toolkit? The Claude Code Playbook includes 200 production-ready templates, decision frameworks, and team setup guides for every Claude Code workflow.
Router Configuration in Settings
Claude Code’s settings file (.claude/settings.json) supports model preferences:
{
"model": "claude-sonnet-4-20250514",
"preferences": {
"defaultModel": "claude-sonnet-4-20250514"
}
}
Organization-level settings can enforce model policies:
{
"permissions": {
"allowedModels": [
"claude-sonnet-4-20250514",
"claude-haiku-4-5-20251001"
]
}
}
This prevents developers from accidentally using Opus for routine tasks, enforcing cost controls at the org level.
Routing Decision Flowchart
Use this quick reference to choose the right model for any task:
- Is the task trivial? (typo fix, rename, formatting) – Use Haiku
- Is the task standard development? (feature implementation, bug fix, test writing) – Use Sonnet
- Has Sonnet already failed on this task? – Escalate to Opus
- Does the task involve security, architecture, or complex multi-system reasoning? – Use Opus
- Are you unsure? – Start with Sonnet. Escalate if the output is insufficient.
This flowchart covers 95% of routing decisions. The remaining 5% are edge cases where testing both models on your specific workload is the best approach.
Common Routing Mistakes
Mistake: Using Opus for everything. Result: 5x higher costs with minimal quality improvement for routine tasks. Opus shines on hard problems — it does not meaningfully outperform Sonnet on straightforward implementations.
Mistake: Using Haiku for complex tasks to save money. Result: lower-quality output that requires rework, often costing more total tokens than using Sonnet once. Haiku is for tasks with clear, simple requirements.
Mistake: Never changing the default model. Result: paying Sonnet prices for tasks Haiku handles perfectly, or getting subpar results on tasks that warrant Opus. Spend 5 seconds choosing the model before each session.
Mistake: Switching models mid-task by starting a new session. Result: loss of conversation context, duplicate file reads, and wasted tokens rebuilding state. If you realize mid-task that you need a different model, finish the current step first, then switch for the next phase.
Frequently Asked Questions
Does Claude Code automatically route to cheaper models? Not by default. Claude Code uses the model you specify (or your account default). You must configure routing manually via CLI flags, CLAUDE.md rules, or scripts.
Can I switch models mid-session?
Not within a single session. Each Claude Code session uses one primary model. To use a different model, start a new session with the --model flag.
What happens if I use Haiku for a task that needs Opus? The task will still run, but the output may be lower quality — missed edge cases, simpler solutions, less thorough analysis. There is no automatic escalation.
Is routing worth the complexity? For individual developers, simple manual selection is fine. For teams spending more than $500/month on Claude Code, routing strategies can cut costs by 30-50%.
Can I route based on file type? Not natively. You can build this into scripts: use Opus for security-sensitive files, Sonnet for everything else.
How does prompt caching interact with routing? Prompt caching works across all models. Cached inputs cost 90% less. This means the cost gap between models narrows for repeated operations on the same files.
Related Guides
- Claude Code cost breakdown — full pricing reference
- Reduce Claude Code spend — cost optimization tactics
- Best cost-saving tools — third-party cost tools
- Multi-agent architecture — multi-model agent patterns
- API mode vs interactive — programmatic model selection
- Claude Code prompt engineering — get better results from any model
- The Claude Code Playbook — comprehensive configuration reference
-
Parallel subagents best practices — multi-model parallelism
- Claude Sonnet 4.5 model guide — Sonnet 4.5 capabilities and features
- Claude Sonnet 4 model guide — Sonnet 4 capabilities and features
Does model routing affect prompt caching?
Prompt caching works within the same model. Switching models between requests means the cache from the previous model is not reused. For maximum cache benefit, batch requests by model.
Can I set up automatic routing based on task complexity?
Not natively in Claude Code. You can build this into scripts that analyze the task description and select the model accordingly before passing it to Claude Code.
Is there a cost difference between using the model alias and the full model ID?
No. The alias (like sonnet) resolves to the same model as the full ID (claude-sonnet-4-20250514) and is billed identically.
How do I know if Sonnet is good enough or if I need Opus?
Start with Sonnet. If the output quality is insufficient — missed edge cases, incorrect logic, or shallow analysis — escalate to Opus for that specific task. Most daily development tasks do not need Opus.