Claude SDK Timeout Configuration Guide

Written by Michael Lip · Solo founder of Zovo · $400K+ on Upwork · 100% JSS · More at zovo.one

The Claude SDK defaults to a 10-minute timeout and 2 retries. For production workloads, you need to tune these values based on your request patterns. This guide covers every timeout and retry option.

The Error

When a request exceeds the timeout:

anthropic.APITimeoutError: Request timed out.

When a non-streaming request is expected to exceed 10 minutes:

ValueError: Non-streaming request expected to exceed 10 minute timeout. Use streaming or increase timeout.

Quick Fix

  1. Use streaming for long-running requests (avoids the 10-minute limit).
  2. Increase the timeout for specific requests using with_options().
  3. Increase retries to handle transient connection failures.

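What Causes This

Both errors trace back to the default 10-minute timeout. APITimeoutError is raised when a request runs longer than the configured timeout. The ValueError is raised when a non-streaming request, typically one with a very large max_tokens, is expected to run past 10 minutes.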

Full Solution

Python: Client-Level Timeout

import anthropic
import httpx

# Simple timeout (all operations use the same value)
client = anthropic.Anthropic(timeout=20.0)  # 20 seconds

# Fine-grained timeout control
client = anthropic.Anthropic(
    timeout=httpx.Timeout(
        60.0,          # Default for any phase not set below (here: pool)
        read=5.0,      # Read timeout per chunk
        write=10.0,    # Write timeout
        connect=2.0    # Connection timeout
    )
)

Python: Per-Request Timeout Override

import anthropic

client = anthropic.Anthropic()

# Override timeout for a single request
message = client.with_options(timeout=120.0).messages.create(
    model="claude-opus-4-6",
    max_tokens=4096,
    messages=[{"role": "user", "content": "Complex analysis task..."}]
)

TypeScript: Client-Level Timeout

import Anthropic from "@anthropic-ai/sdk";

// 20-second timeout
const client = new Anthropic({ timeout: 20 * 1000 });

For TypeScript, when using large max_tokens without streaming, the SDK dynamically calculates timeouts up to 60 minutes.
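
To make that scaling concrete, here is a minimal Python sketch of one rule consistent with the description above. The linear shape, the function name, and the 128,000-token reference point are assumptions for illustration, not values confirmed from the SDK source; check the SDK README for the exact formula.

```python
# Hypothetical illustration of a timeout that scales with max_tokens.
MIN_TIMEOUT_S = 10 * 60          # the 10-minute default acts as a floor
MAX_TIMEOUT_S = 60 * 60          # the 60-minute ceiling mentioned above
REFERENCE_MAX_TOKENS = 128_000   # assumed token count that maps to the ceiling


def scaled_timeout_seconds(max_tokens: int) -> float:
    """Scale the timeout linearly with max_tokens, clamped to [10 min, 60 min]."""
    calculated = MAX_TIMEOUT_S * max_tokens / REFERENCE_MAX_TOKENS
    return float(min(MAX_TIMEOUT_S, max(MIN_TIMEOUT_S, calculated)))


for tokens in (1_024, 64_000, 128_000):
    print(tokens, "tokens:", scaled_timeout_seconds(tokens) / 60, "minutes")
```

Under these assumptions, small requests keep the 10-minute default and only very large max_tokens values push the timeout toward the ceiling.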

Python: Configure Retries

import anthropic

# Disable retries
client = anthropic.Anthropic(max_retries=0)

# Increase retries for resilient production workloads
client = anthropic.Anthropic(max_retries=5)

# Per-request retry override
message = client.with_options(max_retries=5).messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello"}]
)

The SDK retries connection errors and responses with status 408, 409, 429, or >=500, using exponential backoff.
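
As a rough illustration of that policy, the sketch below encodes the retry conditions from the defaults table in this guide (connection errors, 408, 409, 429, >=500) together with a capped exponential backoff. The function names, the 0.5-second base, and the 8-second cap are assumptions for illustration, not values confirmed from the SDK. Real clients typically add random jitter, which is omitted here to keep the output deterministic.

```python
from typing import Optional

# Status codes this guide lists as retryable, besides any 5xx response.
RETRYABLE_STATUS_CODES = {408, 409, 429}


def should_retry(status_code: Optional[int]) -> bool:
    """Return True for connection errors (no status) and retryable statuses."""
    if status_code is None:  # no HTTP response at all: treat as a connection error
        return True
    return status_code in RETRYABLE_STATUS_CODES or status_code >= 500


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 8.0) -> float:
    """Delay in seconds before retry number `attempt` (1-based), doubling up to `cap`."""
    return min(cap, base * 2 ** (attempt - 1))


for attempt in range(1, 6):
    print(f"retry {attempt}: wait {backoff_delay(attempt)}s")
```

A 400 or 401 response is not retried under this rule, since repeating the same invalid request would fail the same way.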

TypeScript: Configure Retries

import Anthropic from "@anthropic-ai/sdk";

// Disable retries
const client = new Anthropic({ maxRetries: 0 });

// Increase retries
const client2 = new Anthropic({ maxRetries: 5 });

Avoid the 10-Minute Timeout with Streaming

The Python SDK raises ValueError if a non-streaming request is expected to take more than 10 minutes. The solution is to use streaming with get_final_message():

import anthropic

client = anthropic.Anthropic()

# This might throw ValueError for very large outputs:
# message = client.messages.create(model="claude-opus-4-6", max_tokens=128000, ...)

# Use streaming instead -- no 10-minute limit
with client.messages.stream(
    model="claude-opus-4-6",
    max_tokens=128000,
    messages=[{"role": "user", "content": "Write a comprehensive report"}]
) as stream:
    message = stream.get_final_message()

print(message.content[0].text)

TypeScript: Streaming with finalMessage()

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const stream = client.messages.stream({
  model: "claude-opus-4-6",
  max_tokens: 128000,
  messages: [{ role: "user", content: "Write a comprehensive report" }]
});

const message = await stream.finalMessage();

Python: Combined Timeout and Retry Configuration

import anthropic
import httpx

client = anthropic.Anthropic(
    max_retries=3,
    timeout=httpx.Timeout(
        300.0,         # 5-minute default for any phase not set below (here: pool)
        read=30.0,     # 30-second read timeout
        write=30.0,    # 30-second write timeout
        connect=10.0   # 10-second connect timeout
    )
)

TypeScript: Combined Timeout and Retry Configuration

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  maxRetries: 3,
  timeout: 300 * 1000  // 5 minutes
});

Default Values Reference

Setting          | Python Default                   | TypeScript Default
Timeout          | 10 minutes                       | 10 minutes
Max retries      | 2                                | 2
Retry conditions | Connection, 408, 409, 429, >=500 | Same
Backoff          | Exponential                      | Exponential

Prevention

  1. Use streaming for long outputs: Any request with max_tokens > 4096 benefits from streaming.
  2. Set appropriate timeouts: Match your timeout to the expected response time. A 10-second question does not need a 10-minute timeout.
  3. Increase retries in production: Three to five retries catch most transient failures without excessive delay.
  4. Use TCP keep-alive: The SDKs automatically set TCP keep-alive options to detect broken connections.
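
Items 2 and 3 interact: every retry can wait out the full timeout again, so raising both multiplies the worst-case latency. The Python sketch below estimates that upper bound. The function name and the backoff parameters (0.5-second base, 8-second cap, no jitter) are assumptions for illustration, and it treats the timeout as a per-attempt ceiling, which is a simplification.

```python
def worst_case_seconds(timeout_s: float, max_retries: int,
                       base: float = 0.5, cap: float = 8.0) -> float:
    """Upper bound on wall-clock time if every attempt times out.

    Counts one initial attempt plus `max_retries` retries, each taking up to
    `timeout_s`, plus an assumed capped exponential backoff between attempts.
    """
    attempts = max_retries + 1
    backoff = sum(min(cap, base * 2 ** i) for i in range(max_retries))
    return attempts * timeout_s + backoff


# Defaults from this guide: 10-minute timeout, 2 retries.
print(worst_case_seconds(600.0, 2) / 60, "minutes")
# A 20-second timeout with 5 retries stays far below that.
print(worst_case_seconds(20.0, 5), "seconds")
```

With the defaults, a caller could wait roughly 30 minutes before seeing a failure, which is why short, request-specific timeouts pair well with a higher retry count.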