Build a Claude Cost Attribution System

Written by Michael Lip · Solo founder of Zovo · $400K+ on Upwork · 100% JSS

A 5-agent fleet spending $1,000/month on the Claude API had one agent consuming $400 of that budget, roughly 2.7x the $150 average of the other four agents, because of a retry loop bug. Without per-agent cost attribution, the problem stayed invisible for months. After deploying attribution tracking, the team identified the runaway agent within 24 hours, fixed the loop, and reduced that agent’s spend to $100/month. The $300/month savings took one afternoon to implement.

The Setup

Cost attribution means tagging every API request with metadata that identifies who or what generated it: which agent, which user, which feature, which task. The Claude API has no general-purpose request tagging (the metadata parameter of the Messages API accepts only an opaque user_id), so you implement attribution at the application layer by wrapping API calls in a context object that carries the attribution fields. These fields travel with the cost data into your analytics system, enabling drill-down views from total spend to individual request costs. The granularity of your attribution determines the precision of your optimization: you can only cut costs you can see.

The Math

A 5-agent fleet with $1,000/month total spend:

Before attribution (blind allocation):

After attribution and fix:

Further optimization enabled by attribution:
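The before-and-after comparison can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the numbers quoted in the introduction ($1,000/month total, $400 for the runaway agent, $100 after the fix). The even split of the remaining $600 across the other four agents, and the reading of "blind allocation" as an even five-way split, are assumptions made for illustration.

```python
# Figures quoted in the introduction.
TOTAL_BEFORE = 1_000   # USD/month, whole fleet
RUNAWAY_BEFORE = 400   # USD/month, agent with the retry loop
RUNAWAY_AFTER = 100    # USD/month, same agent after the fix
AGENTS = 5

# Assumption: blind allocation spreads the bill evenly, hiding the outlier.
blind_share = TOTAL_BEFORE / AGENTS

# Assumption: the other four agents split the remaining spend evenly.
other_avg = (TOTAL_BEFORE - RUNAWAY_BEFORE) / (AGENTS - 1)

monthly_savings = RUNAWAY_BEFORE - RUNAWAY_AFTER
total_after = TOTAL_BEFORE - monthly_savings

print(f"Blind allocation per agent: ${blind_share:.0f}")
print(f"Runaway agent vs. average of the others: {RUNAWAY_BEFORE / other_avg:.1f}x")
print(f"Monthly savings after the fix: ${monthly_savings}")
print(f"Fleet total after the fix: ${total_after}")
```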

The Technique

Build a comprehensive attribution system with hierarchical tags.

import anthropic
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime
from typing import Optional
from contextlib import contextmanager

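# USD per million tokens. These rates are hard-coded; verify them against
# the current published pricing before relying on the computed costs.
PRICING = {
    "claude-opus-4-7": {"input": 5.00, "output": 25.00},
    "claude-sonnet-4-6": {"input": 3.00, "output": 15.00},
    "claude-haiku-4-5": {"input": 1.00, "output": 5.00},
}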

@dataclass
class Attribution:
    agent_id: str
    task_type: str
    user_id: Optional[str] = None
    project: Optional[str] = None
    feature: Optional[str] = None
    session_id: Optional[str] = None
    parent_task_id: Optional[str] = None

@dataclass
class CostRecord:
    timestamp: str
    attribution: Attribution
    model: str
    input_tokens: int
    output_tokens: int
    total_cost: float
    request_id: str = ""

class AttributedClient:
    """Claude client with built-in cost attribution."""

    def __init__(self, store_path: str = "cost_attribution.jsonl"):
        self.client = anthropic.Anthropic()
        self.store_path = store_path
        self._current_attribution: Optional[Attribution] = None

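    @contextmanager
    def attribution(self, **kwargs):
        """Context manager for setting attribution on requests."""
        # Remember the enclosing attribution so that nested contexts restore
        # it on exit instead of clearing it.
        previous = self._current_attribution
        self._current_attribution = Attribution(**kwargs)
        try:
            yield self
        finally:
            self._current_attribution = previous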

    def create(self, **kwargs) -> anthropic.types.Message:
        """Make an attributed API call."""
        if not self._current_attribution:
            raise RuntimeError("Must set attribution before making requests")

        response = self.client.messages.create(**kwargs)

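        model = kwargs.get("model", "unknown")
        # Unknown models fall back to Sonnet rates, so add new models to
        # PRICING to avoid mispriced records.
        prices = PRICING.get(model, PRICING["claude-sonnet-4-6"])
        # Only uncached input tokens and output tokens are priced here;
        # prompt-cache reads and writes (reported separately in usage) are
        # not included.
        cost = (
            response.usage.input_tokens * prices["input"] / 1_000_000
            + response.usage.output_tokens * prices["output"] / 1_000_000
        )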

        record = CostRecord(
            timestamp=datetime.utcnow().isoformat(),
            attribution=self._current_attribution,
            model=model,
            input_tokens=response.usage.input_tokens,
            output_tokens=response.usage.output_tokens,
            total_cost=round(cost, 6),
            request_id=response.id,
        )

        self._store(record)
        return response

    def _store(self, record: CostRecord) -> None:
        """Append record to JSONL file."""
        with open(self.store_path, "a") as f:
            data = asdict(record)
            f.write(json.dumps(data) + "\n")


def generate_attribution_report(store_path: str = "cost_attribution.jsonl"):
    """Generate cost report grouped by attribution dimensions."""
    from collections import defaultdict

    by_agent = defaultdict(lambda: {"requests": 0, "cost": 0.0})
    by_task = defaultdict(lambda: {"requests": 0, "cost": 0.0})
    by_model = defaultdict(lambda: {"requests": 0, "cost": 0.0})

    with open(store_path) as f:
        for line in f:
            record = json.loads(line)
            agent = record["attribution"]["agent_id"]
            task = record["attribution"]["task_type"]
            model = record["model"]
            cost = record["total_cost"]

            by_agent[agent]["requests"] += 1
            by_agent[agent]["cost"] += cost
            by_task[task]["requests"] += 1
            by_task[task]["cost"] += cost
            by_model[model]["requests"] += 1
            by_model[model]["cost"] += cost

    print("=== COST BY AGENT ===")
    for agent, data in sorted(by_agent.items(),
                               key=lambda x: x[1]["cost"], reverse=True):
        print(f"  {agent}: {data['requests']} requests, "
              f"${data['cost']:.2f}")

    print("\n=== COST BY TASK TYPE ===")
    for task, data in sorted(by_task.items(),
                              key=lambda x: x[1]["cost"], reverse=True):
        print(f"  {task}: {data['requests']} requests, "
              f"${data['cost']:.2f}")

    print("\n=== COST BY MODEL ===")
    for model, data in sorted(by_model.items(),
                                key=lambda x: x[1]["cost"], reverse=True):
        avg = data["cost"] / data["requests"] if data["requests"] else 0
        print(f"  {model}: {data['requests']} requests, "
              f"${data['cost']:.2f} (avg ${avg:.4f}/req)")


# Usage
ac = AttributedClient()

with ac.attribution(agent_id="agent-1", task_type="classification",
                     project="customer-support"):
    response = ac.create(
        model="claude-haiku-4-5",
        max_tokens=100,
        messages=[{"role": "user", "content": "Classify: billing issue"}]
    )

with ac.attribution(agent_id="agent-3", task_type="code-generation",
                     project="dev-tools"):
    response = ac.create(
        model="claude-opus-4-7",
        max_tokens=4096,
        messages=[{"role": "user", "content": "Write a sorting function"}]
    )

generate_attribution_report()
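The report above groups by one dimension at a time. For the drill-down view described in The Setup, the same JSONL records can be rolled up along a tag hierarchy. The following is a minimal sketch, assuming the record layout written by _store; the rollup function, the "unattributed" placeholder, and the sample records are hypothetical and only illustrate the idea.

```python
import json
from collections import defaultdict


def rollup(lines, dimensions=("project", "agent_id", "task_type")):
    """Sum cost for every prefix of `dimensions`: total, project, project/agent, ..."""
    totals = defaultdict(float)
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines in the log
        record = json.loads(line)
        tags = record["attribution"]
        key = ()
        totals[key] += record["total_cost"]  # grand total lives under ()
        for dim in dimensions:
            # Optional tags are stored as null; bucket them together.
            key += (tags.get(dim) or "unattributed",)
            totals[key] += record["total_cost"]
    return dict(totals)


def _rec(project, agent_id, task_type, cost):
    """Build one hypothetical JSONL line in the stored record layout."""
    return json.dumps({
        "attribution": {"project": project, "agent_id": agent_id,
                        "task_type": task_type},
        "total_cost": cost,
    })


# Hypothetical sample data, not real measurements.
sample = [
    _rec("customer-support", "agent-1", "classification", 0.25),
    _rec("customer-support", "agent-1", "classification", 0.5),
    _rec("dev-tools", "agent-3", "code-generation", 2.0),
    _rec(None, "agent-2", "summarization", 0.125),
]

totals = rollup(sample)
for key in sorted(totals):
    label = " / ".join(key) if key else "TOTAL"
    print(f"{'  ' * len(key)}{label}: ${totals[key]:.3f}")
```

To run it against real data, pass an open file handle for cost_attribution.jsonl instead of the sample list. Sorting the tuple keys prints each parent before its children, which gives an indented tree from total spend down to the finest tag level.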

The Tradeoffs

Attribution adds a layer of instrumentation that all API call sites must use. The wrapper above raises an error when a request is made outside an attribution context, but any call that bypasses it and uses the Anthropic client directly produces unattributed costs that muddy your analysis. Enforce attribution through code review and linting rules that flag bare client.messages.create() calls. The JSONL storage approach works for small-to-medium volumes but should be replaced with a proper time-series database (InfluxDB, TimescaleDB) at scale. Attribution granularity is a spectrum: too coarse (agent-level only) misses optimization opportunities, while too fine (per-user, per-feature, per-session) creates analysis paralysis.
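One way to approximate the lint rule mentioned above is a small check built on Python's standard ast module. This is a sketch, not a full linter: the function name find_bare_create_calls is hypothetical, and it flags any call of the form something.messages.create(...), so the file that defines the wrapper itself would need to be excluded.

```python
import ast


def find_bare_create_calls(source: str) -> list[int]:
    """Return the line numbers of `<anything>.messages.create(...)` calls."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "create"
            and isinstance(node.func.value, ast.Attribute)
            and node.func.value.attr == "messages"
        ):
            hits.append(node.lineno)
    return sorted(hits)


# Hypothetical source to check. It is only parsed, never executed.
sample = "\n".join([
    "import anthropic",
    "client = anthropic.Anthropic()",
    "client.messages.create(model='m', max_tokens=1, messages=[])",
    "with ac.attribution(agent_id='a', task_type='t'):",
    "    ac.create(model='m', max_tokens=1, messages=[])",
])

for lineno in find_bare_create_calls(sample):
    print(f"line {lineno}: bare messages.create() call, use the attributed client")
```

Calls routed through the wrapper (ac.create) have no .messages attribute in the call chain, so they pass. Running a check like this in CI or a pre-commit hook is one possible way to enforce the rule.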

Implementation Checklist

Measuring Impact

The attribution system pays for itself when it surfaces the first actionable insight. Track “insights surfaced” and “dollars saved per insight.” Common first findings: one agent type consuming disproportionate tokens (fix: loop detection), one task type using an over-powered model (fix: model routing), or one project generating 10x more requests than expected (fix: request deduplication). A well-maintained attribution system can surface $500-$2,000/month in savings opportunities within the first week, though the achievable figure depends on total spend.
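The first finding listed above, one agent consuming a disproportionate share, can be surfaced automatically by comparing each agent's spend with the fleet median. This is a minimal sketch: the flag_outliers helper and the 2x threshold are hypothetical choices, and the per-agent figures other than the $400 runaway agent are assumed for illustration.

```python
from statistics import median


def flag_outliers(cost_by_agent: dict[str, float], factor: float = 2.0) -> list[str]:
    """Return agents whose spend exceeds `factor` times the fleet median."""
    if not cost_by_agent:
        return []
    baseline = median(cost_by_agent.values())
    if baseline <= 0:
        return []
    return sorted(
        agent for agent, cost in cost_by_agent.items()
        if cost > factor * baseline
    )


# Monthly spend per agent in USD. Only agent-3's $400 comes from the
# introduction; the even $150 split for the others is an assumption.
fleet = {
    "agent-1": 150.0,
    "agent-2": 150.0,
    "agent-3": 400.0,
    "agent-4": 150.0,
    "agent-5": 150.0,
}

for agent in flag_outliers(fleet):
    print(f"{agent} spends more than 2x the fleet median, check for retry loops")
```

The median is used instead of the mean so that the runaway agent does not inflate its own baseline. In practice the input would be the per-agent totals from the attribution report.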