Claude Orchestrator-Worker Cost Optimization

Written by Michael Lip · Solo founder of Zovo · $400K+ on Upwork · 100% JSS · Join 50+ builders · More at zovo.one

An all-Opus agent fleet costs $25.00 per sprint in API tokens. Replace 4 of 5 agents with Haiku workers and the cost drops to $9.00 – a 64% reduction. The orchestrator (Opus 4.7 at $5.00/$25.00 per MTok) handles complex reasoning and task decomposition. The workers (Haiku 4.5 at $1.00/$5.00 per MTok) execute straightforward tasks at 5x lower cost per token.

The Setup

You are building an API-driven multi-agent system for automated code review. The system needs one agent to analyze PR complexity, decompose the review into subtasks, and synthesize findings. The remaining agents execute individual review subtasks: check style compliance, verify test coverage, scan for security issues, and validate documentation.

Only the orchestration agent needs Opus-level reasoning. The worker agents follow explicit instructions and produce structured output – tasks where Haiku 4.5 performs well at one-fifth the cost.
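
That split can be expressed as a one-line routing rule. A minimal sketch, where the model IDs mirror the constants used in the pipeline below and `REASONING_PHASES` / `pick_model` are illustrative names, not part of the Anthropic SDK:

```python
# Illustrative routing: reasoning-heavy phases get Opus, execution gets Haiku.
ORCHESTRATOR_MODEL = "claude-opus-4-7-20250415"  # reasoning: decompose, synthesize
WORKER_MODEL = "claude-haiku-4-5-20251001"       # execution: the review subtasks

REASONING_PHASES = {"decompose", "synthesize"}


def pick_model(phase: str) -> str:
    """Route reasoning phases to Opus and everything else to Haiku."""
    return ORCHESTRATOR_MODEL if phase in REASONING_PHASES else WORKER_MODEL
```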

Current all-Opus fleet: $25.00 per sprint. After optimization: $9.00 per sprint. Monthly savings at 30 sprints: $480.

The Math

All-Opus fleet (5 agents), per sprint:

  5 agents × $5.00 = $25.00

Opus orchestrator + 4 Haiku workers, per sprint:

  1 × $5.00 + 4 × $1.00 = $9.00

Savings: $16.00/sprint (64%). Monthly (30 sprints): $480 saved.

With Sonnet 4.6 workers ($3.00/$15.00 per MTok) instead of Haiku:

  1 × $5.00 + 4 × $3.00 = $17.00 per sprint – $8.00/sprint saved (32%)

The Haiku worker configuration saves twice as much as the Sonnet configuration: $16.00 vs $8.00 per sprint.
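
The arithmetic above can be checked directly. A small sketch: the per-agent sprint costs are derived from the fleet totals in this section, and the $3.00/sprint Sonnet worker figure is an assumption consistent with the 2x claim:

```python
# Per-agent cost per sprint, derived from the $25.00 all-Opus fleet total.
OPUS_SPRINT = 25.00 / 5         # $5.00 per Opus agent
HAIKU_SPRINT = OPUS_SPRINT / 5  # $1.00 per Haiku agent (5x cheaper per token)
SONNET_SPRINT = 3.00            # assumed from Sonnet's $3/$15 per-MTok pricing

all_opus = 5 * OPUS_SPRINT                      # $25.00
haiku_fleet = OPUS_SPRINT + 4 * HAIKU_SPRINT    # $9.00
sonnet_fleet = OPUS_SPRINT + 4 * SONNET_SPRINT  # $17.00

haiku_savings = all_opus - haiku_fleet    # $16.00/sprint
sonnet_savings = all_opus - sonnet_fleet  # $8.00/sprint

print(f"Haiku fleet: ${haiku_fleet:.2f}/sprint, saves ${haiku_savings:.2f} "
      f"({haiku_savings / all_opus:.0%}), ${haiku_savings * 30:.0f}/month")
```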

The Technique

The orchestrator-worker pattern requires clear separation between reasoning tasks and execution tasks:

import anthropic
from dataclasses import dataclass

client = anthropic.Anthropic()

ORCHESTRATOR_MODEL = "claude-opus-4-7-20250415"    # $5/$25 per MTok
WORKER_MODEL = "claude-haiku-4-5-20251001"          # $1/$5 per MTok


@dataclass
class ReviewTask:
    task_type: str
    instructions: str
    context: str


def orchestrate_review(pr_diff: str) -> list[ReviewTask]:
    """Opus decomposes the PR into subtasks for workers."""
    response = client.messages.create(
        model=ORCHESTRATOR_MODEL,
        max_tokens=4096,
        system=(
            "You are a code review orchestrator. Analyze the PR diff and "
            "create specific review subtasks. Output valid JSON array."
        ),
        messages=[
            {"role": "user", "content": f"PR Diff:\n{pr_diff}"}
        ]
    )

    import json
    # Assumes the orchestrator returns a bare JSON array; parsing fails if
    # the model wraps its output in markdown code fences.
    tasks_raw = json.loads(response.content[0].text)

    return [
        ReviewTask(
            task_type=t["type"],
            instructions=t["instructions"],
            context=t.get("context", "")
        )
        for t in tasks_raw
    ]


def execute_task(task: ReviewTask, pr_diff: str) -> str:
    """Haiku worker executes a single review subtask."""
    response = client.messages.create(
        model=WORKER_MODEL,
        max_tokens=2048,
        system=(
            f"You are a {task.task_type} reviewer. "
            f"Follow these instructions exactly:\n{task.instructions}"
        ),
        messages=[
            {"role": "user", "content": f"Code:\n{pr_diff}\n\n{task.context}"}
        ]
    )
    return response.content[0].text


def synthesize_review(
    pr_diff: str,
    task_results: list[dict]
) -> str:
    """Opus synthesizes worker results into final review."""
    results_text = "\n\n".join(
        f"## {r['type']}\n{r['result']}" for r in task_results
    )

    response = client.messages.create(
        model=ORCHESTRATOR_MODEL,
        max_tokens=4096,
        system="Synthesize these review results into a cohesive PR review.",
        messages=[
            {"role": "user", "content": f"PR:\n{pr_diff}\n\nResults:\n{results_text}"}
        ]
    )
    return response.content[0].text


def review_pr(pr_diff: str) -> str:
    """Full orchestrator-worker review pipeline."""

    # Step 1: Opus decomposes (high-cost reasoning)
    tasks = orchestrate_review(pr_diff)
    print(f"Orchestrator created {len(tasks)} subtasks")

    # Step 2: Haiku workers execute (low-cost execution)
    results = []
    for task in tasks:
        result = execute_task(task, pr_diff)
        results.append({"type": task.task_type, "result": result})
        print(f"  Worker completed: {task.task_type}")

    # Step 3: Opus synthesizes (high-cost reasoning)
    final = synthesize_review(pr_diff, results)
    print("Orchestrator synthesized final review")

    return final


# Run
if __name__ == "__main__":
    with open("pr_diff.txt") as f:
        diff = f.read()
    print(review_pr(diff))

Cost tracking per pipeline run:

def track_costs(pipeline_name: str, calls: list[dict]) -> dict:
    """Track actual costs for an orchestrator-worker pipeline."""
    prices = {
        ORCHESTRATOR_MODEL: {"in": 5.00, "out": 25.00},
        WORKER_MODEL: {"in": 1.00, "out": 5.00}
    }

    total_cost = 0
    breakdown = {"orchestrator": 0, "workers": 0}

    for call in calls:
        model = call["model"]
        p = prices[model]
        cost = (call["input_tokens"] * p["in"] +
                call["output_tokens"] * p["out"]) / 1e6

        total_cost += cost
        if model == ORCHESTRATOR_MODEL:
            breakdown["orchestrator"] += cost
        else:
            breakdown["workers"] += cost

    return {
        "pipeline": pipeline_name,
        "total_cost": f"${total_cost:.4f}",
        "orchestrator_cost": f"${breakdown['orchestrator']:.4f}",
        "worker_cost": f"${breakdown['workers']:.4f}",
        "worker_pct": f"{breakdown['workers']/total_cost*100:.0f}%"
    }
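
For a hypothetical run, the same per-MTok pricing formula gives a feel for where the money goes. The token counts below are invented for illustration; in practice they come from each response's `usage.input_tokens` and `usage.output_tokens`:

```python
# Hypothetical per-call token counts for one pipeline run.
PRICES = {
    "claude-opus-4-7-20250415": {"in": 5.00, "out": 25.00},   # orchestrator
    "claude-haiku-4-5-20251001": {"in": 1.00, "out": 5.00},   # workers
}

calls = [
    {"model": "claude-opus-4-7-20250415", "input_tokens": 8000, "output_tokens": 1500},  # decompose
    {"model": "claude-haiku-4-5-20251001", "input_tokens": 9000, "output_tokens": 800},  # 4 workers
    {"model": "claude-haiku-4-5-20251001", "input_tokens": 9000, "output_tokens": 800},
    {"model": "claude-haiku-4-5-20251001", "input_tokens": 9000, "output_tokens": 800},
    {"model": "claude-haiku-4-5-20251001", "input_tokens": 9000, "output_tokens": 800},
    {"model": "claude-opus-4-7-20250415", "input_tokens": 12000, "output_tokens": 2000},  # synthesize
]


def call_cost(c: dict) -> float:
    """Dollar cost of one API call at per-MTok pricing."""
    p = PRICES[c["model"]]
    return (c["input_tokens"] * p["in"] + c["output_tokens"] * p["out"]) / 1e6


total = sum(call_cost(c) for c in calls)
worker_cost = sum(call_cost(c) for c in calls if "haiku" in c["model"])
print(f"total ${total:.4f}, worker share {worker_cost / total:.0%}")
```

Note the pattern this exposes: the workers do most of the calls but a small fraction of the spend, because the orchestrator's synthesis step consumes the expensive output tokens.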

The Tradeoffs

The orchestrator-worker pattern trades cost for complexity:

  - More API calls per review: one decompose call, N worker calls, and one synthesis call, versus a single call for a monolithic agent.
  - Added latency: the pipeline is sequential – workers cannot start until decomposition finishes, and synthesis waits for every worker.
  - A structured-output dependency: the orchestrator must emit valid JSON, or the pipeline fails before any worker runs.
  - Quality risk: subtasks that turn out to need deeper reasoning may exceed what Haiku workers handle well.

Implementation Checklist

  1. Audit your agent tasks: which require complex reasoning vs structured execution?
  2. Assign Opus ($5/$25) to reasoning-heavy tasks (planning, analysis, synthesis)
  3. Assign Haiku ($1/$5) to execution tasks (classification, formatting, extraction)
  4. Implement the three-phase pipeline: decompose, execute, synthesize
  5. Add per-model cost tracking to every API call
  6. Test worker quality on 100 representative tasks before production deployment
  7. Compare total pipeline cost against the single-model baseline
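
Step 6 can be as simple as grading worker outputs on representative tasks and computing a pass rate before committing to the cheaper model. How you grade is up to you (exact-match, LLM-as-judge, or human review); this hypothetical helper just does the bookkeeping:

```python
def worker_pass_rate(graded: list[bool], threshold: float = 0.95) -> tuple[float, bool]:
    """Given pass/fail grades for worker outputs on representative tasks,
    return the pass rate and whether it clears the promotion threshold."""
    if not graded:
        return 0.0, False
    rate = sum(graded) / len(graded)
    return rate, rate >= threshold


# Example: 97 of 100 representative tasks graded acceptable.
rate, ship_it = worker_pass_rate([True] * 97 + [False] * 3)
```

The 95% threshold is an assumption; set it from how much review quality you are willing to trade for the 64% cost reduction.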

Measuring Impact

Track cost distribution across the fleet:
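
One hedged sketch of that tracking, assuming you keep each run's orchestrator/worker costs as raw floats rather than the formatted strings `track_costs` returns; the field names here are illustrative:

```python
def fleet_distribution(runs: list[dict]) -> dict:
    """Aggregate per-run orchestrator/worker costs into fleet-level shares.
    Each run dict is assumed to hold raw float dollar costs, e.g.
    {"orchestrator": 0.19, "workers": 0.05}."""
    orch = sum(r["orchestrator"] for r in runs)
    work = sum(r["workers"] for r in runs)
    total = orch + work
    return {
        "runs": len(runs),
        "total": round(total, 4),
        "orchestrator_share": round(orch / total, 3) if total else 0.0,
        "worker_share": round(work / total, 3) if total else 0.0,
    }


# Example over three runs with invented costs:
stats = fleet_distribution([
    {"orchestrator": 0.19, "workers": 0.05},
    {"orchestrator": 0.21, "workers": 0.06},
    {"orchestrator": 0.18, "workers": 0.05},
])
```

If the worker share of spend creeps up, the orchestrator is probably delegating too much context per subtask; if the orchestrator share dominates, check whether synthesis prompts can be trimmed.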