Batch API Cost Calculator for Claude Models

Written by Michael Lip · Solo founder of Zovo · $400K+ on Upwork · 100% JSS Join 50+ builders · More at zovo.one

Batch pricing on Claude is straightforward: divide the standard price by two. Opus 4.7 input goes from $5.00 to $2.50 per million tokens. Output goes from $25.00 to $12.50. But calculating the actual cost for a specific workload requires knowing your token distribution, and most teams overestimate input tokens while underestimating output tokens.

The Setup

You are planning to migrate three workloads to the Batch API and need precise cost projections before committing. Workload A is a summarization task (high input, low output). Workload B is content generation (moderate input, high output). Workload C is classification (low input, minimal output).

Each workload has different cost characteristics, and the model choice matters. Running workload B on Opus 4.7 batch costs $15.00 per 1,000 requests. Running it on Haiku 4.5 batch costs $1.50 per 1,000 requests. The calculator below handles all permutations.

The Math

Batch pricing per million tokens (all verified):

Model Batch Input Batch Output Standard Input Standard Output
Opus 4.7 $2.50 $12.50 $5.00 $25.00
Sonnet 4.6 $1.50 $7.50 $3.00 $15.00
Haiku 4.5 $0.50 $2.50 $1.00 $5.00
Opus 4.1 $7.50 $37.50 $15.00 $75.00

Three workload examples (1,000 requests each):

Workload A (summarization): 10K input + 500 output per request

Workload B (content gen): 5K input + 3K output per request

Workload C (classification): 2K input + 200 output per request

The Technique

Here is a comprehensive cost calculator that handles any model, token count, and batch size:

from dataclasses import dataclass

@dataclass
class BatchCostResult:
    model: str
    requests: int
    total_input_tokens: int
    total_output_tokens: int
    batch_input_cost: float
    batch_output_cost: float
    batch_total: float
    standard_total: float
    savings: float
    savings_pct: float
    cost_per_request: float

    def __str__(self) -> str:
        return (
            f"Model: {self.model}\n"
            f"Requests: {self.requests:,}\n"
            f"Batch cost: ${self.batch_total:.2f} "
            f"(in: ${self.batch_input_cost:.2f}, "
            f"out: ${self.batch_output_cost:.2f})\n"
            f"Standard cost: ${self.standard_total:.2f}\n"
            f"Savings: ${self.savings:.2f} ({self.savings_pct:.1f}%)\n"
            f"Per request: ${self.cost_per_request:.4f}"
        )


BATCH_PRICES = {
    "opus-4.7": {"in": 2.50, "out": 12.50, "std_in": 5.00, "std_out": 25.00},
    "sonnet-4.6": {"in": 1.50, "out": 7.50, "std_in": 3.00, "std_out": 15.00},
    "haiku-4.5": {"in": 0.50, "out": 2.50, "std_in": 1.00, "std_out": 5.00},
    "opus-4.1": {"in": 7.50, "out": 37.50, "std_in": 15.00, "std_out": 75.00},
}


def calculate_batch_cost(
    model: str,
    num_requests: int,
    avg_input_tokens: int,
    avg_output_tokens: int
) -> BatchCostResult:
    """Calculate batch vs standard API cost."""

    if model not in BATCH_PRICES:
        raise ValueError(f"Unknown model: {model}. "
                        f"Options: {list(BATCH_PRICES.keys())}")

    p = BATCH_PRICES[model]
    total_in = num_requests * avg_input_tokens
    total_out = num_requests * avg_output_tokens

    batch_in_cost = total_in * p["in"] / 1e6
    batch_out_cost = total_out * p["out"] / 1e6
    batch_total = batch_in_cost + batch_out_cost

    std_in_cost = total_in * p["std_in"] / 1e6
    std_out_cost = total_out * p["std_out"] / 1e6
    std_total = std_in_cost + std_out_cost

    savings = std_total - batch_total

    return BatchCostResult(
        model=model,
        requests=num_requests,
        total_input_tokens=total_in,
        total_output_tokens=total_out,
        batch_input_cost=batch_in_cost,
        batch_output_cost=batch_out_cost,
        batch_total=batch_total,
        standard_total=std_total,
        savings=savings,
        savings_pct=(savings / std_total * 100) if std_total > 0 else 0,
        cost_per_request=batch_total / num_requests if num_requests > 0 else 0
    )


# Compare all models for the same workload
for model in BATCH_PRICES:
    result = calculate_batch_cost(model, 10000, 5000, 3000)
    print(result)
    print()

For a quick command-line calculation:

# Quick batch cost estimator
python3 -c "
import sys

models = {
    'opus-4.7': (2.50, 12.50),
    'sonnet-4.6': (1.50, 7.50),
    'haiku-4.5': (0.50, 2.50),
}

reqs = 10000
inp = 5000
out = 3000

print(f'Batch cost for {reqs:,} requests ({inp:,} in + {out:,} out tokens each):')
print(f'{\"\":-^60}')
for name, (pin, pout) in models.items():
    cost_in = reqs * inp * pin / 1e6
    cost_out = reqs * out * pout / 1e6
    total = cost_in + cost_out
    per_req = total / reqs
    print(f'{name:>12}: \${total:>10,.2f} total  (\${per_req:.4f}/request)')
"

Output:

Batch cost for 10,000 requests (5,000 in + 3,000 out tokens each):
------------------------------------------------------------
   opus-4.7: $    500.00 total  ($0.0500/request)
 sonnet-4.6: $    300.00 total  ($0.0300/request)
  haiku-4.5: $    100.00 total  ($0.0100/request)

To estimate tokens from your actual data before batch submission:

def estimate_tokens_from_text(text: str) -> int:
    """Rough token estimate: ~4 chars per token for English."""
    return len(text) // 4

# Sample your dataset to estimate average tokens
import json
import random

data = json.load(open("dataset.json"))
sample = random.sample(data, min(100, len(data)))

input_tokens = [estimate_tokens_from_text(item["prompt"]) for item in sample]
avg_input = sum(input_tokens) // len(input_tokens)

print(f"Estimated avg input tokens: {avg_input}")
print(f"Run calculator with avg_input_tokens={avg_input}")

The Tradeoffs

Cost calculators have inherent limitations:

Implementation Checklist

  1. Sample 100 representative requests from your workload
  2. Count actual input tokens using the Anthropic tokenizer or estimate at 4 chars/token
  3. Estimate output tokens from historical response lengths or max_tokens settings
  4. Run the calculator for all candidate models
  5. Compare batch cost against your current standard API spend
  6. Select the model that meets quality requirements at the lowest batch cost
  7. Start with a small test batch (100 requests) before committing to 100K

Measuring Impact

After running your first production batch: