Claude SDK Timeout Configuration Guide
The Claude SDK defaults to a 10-minute timeout and 2 retries. For production workloads, you need to tune these values based on your request patterns. This guide covers every timeout and retry option.
The Error
When a request exceeds the timeout:
anthropic.APITimeoutError: Request timed out.
When a non-streaming request is expected to exceed 10 minutes:
ValueError: Non-streaming request expected to exceed 10 minute timeout. Use streaming or increase timeout.
Quick Fix
- Use streaming for long-running requests (avoids the 10-minute limit).
- Increase the timeout for specific requests using with_options().
- Increase retries to handle transient connection failures.
What Causes This
- Non-streaming requests with large max_tokens values can exceed the default 10-minute timeout.
- The SDK validates non-streaming requests and throws ValueError before sending if the expected time exceeds 10 minutes.
- Network issues or slow responses from the API can cause APITimeoutError.
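The pre-flight check described above can be pictured as an estimate compared against the 10-minute limit. The sketch below is illustrative only: the function names and the linear rate of one hour per 128,000 output tokens are assumptions made for this example, not the SDK's documented formula.

```python
# Illustrative sketch only: the SDK's real estimate may differ.
DEFAULT_TIMEOUT_SECONDS = 10 * 60  # the 10-minute default from this guide

# Assumption: generation time grows linearly, one hour per 128,000 tokens.
ASSUMED_SECONDS_PER_TOKEN = 3600 / 128_000


def expected_seconds(max_tokens: int) -> float:
    """Estimate the worst-case duration of a non-streaming request."""
    return max_tokens * ASSUMED_SECONDS_PER_TOKEN


def needs_streaming(max_tokens: int) -> bool:
    """Return True when the estimate exceeds the default timeout."""
    return expected_seconds(max_tokens) > DEFAULT_TIMEOUT_SECONDS


if __name__ == "__main__":
    for tokens in (4096, 32_000, 128_000):
        print(tokens, round(expected_seconds(tokens)), needs_streaming(tokens))
```

Under these assumed numbers, a request with max_tokens=4096 stays well inside the limit, while one with max_tokens=128000 would be flagged before it is sent.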
Full Solution
Python: Client-Level Timeout
import anthropic
import httpx
# Simple timeout (all operations use the same value)
client = anthropic.Anthropic(timeout=20.0) # 20 seconds
# Fine-grained timeout control
client = anthropic.Anthropic(
    timeout=httpx.Timeout(
        60.0,         # Default for any phase not set below (here, only pool)
        read=5.0,     # Read timeout per chunk
        write=10.0,   # Write timeout
        connect=2.0,  # Connection timeout
    )
)
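In httpx, the first positional value passed to httpx.Timeout acts as the default for every phase (connect, read, write, pool) that is not set by keyword; it is not a cap on the whole request. The stdlib-only sketch below mimics that resolution so the effective values are easy to see. PhaseTimeouts and resolve_timeouts are hypothetical names used only for illustration, and the model is simplified: httpx itself distinguishes an unset phase from an explicit None, which disables that timeout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PhaseTimeouts:
    """Effective timeout, in seconds, for each phase of a request."""
    connect: float
    read: float
    write: float
    pool: float


def resolve_timeouts(
    default: float,
    *,
    connect: Optional[float] = None,
    read: Optional[float] = None,
    write: Optional[float] = None,
    pool: Optional[float] = None,
) -> PhaseTimeouts:
    """Fill every phase that was not given with the default value.

    Simplification: None means "not given" here.
    """
    def pick(value: Optional[float]) -> float:
        return default if value is None else value

    return PhaseTimeouts(
        connect=pick(connect),
        read=pick(read),
        write=pick(write),
        pool=pick(pool),
    )


if __name__ == "__main__":
    # Same numbers as the fine-grained example: only pool falls back to 60.0.
    print(resolve_timeouts(60.0, read=5.0, write=10.0, connect=2.0))
```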
Python: Per-Request Timeout Override
import anthropic
client = anthropic.Anthropic()
# Override timeout for a single request
message = client.with_options(timeout=120.0).messages.create(
    model="claude-opus-4-6",
    max_tokens=4096,
    messages=[{"role": "user", "content": "Complex analysis task..."}]
)
TypeScript: Client-Level Timeout
import Anthropic from "@anthropic-ai/sdk";
// 20-second timeout
const client = new Anthropic({ timeout: 20 * 1000 });
In the TypeScript SDK, a non-streaming request with a large max_tokens value gets a dynamically calculated timeout of up to 60 minutes.
Python: Configure Retries
import anthropic
# Disable retries
client = anthropic.Anthropic(max_retries=0)
# Increase retries for resilient production workloads
client = anthropic.Anthropic(max_retries=5)
# Per-request retry override
message = client.with_options(max_retries=5).messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello"}]
)
The SDK retries on these conditions with exponential backoff:
- Connection errors
- HTTP 408 (Request Timeout)
- HTTP 409 (Conflict)
- HTTP 429 (Rate Limited)
- HTTP >= 500 (Server Errors, including 529 Overloaded)
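The retry rules listed above can be expressed as a small predicate plus a backoff schedule. This is a sketch, not the SDK's implementation: the function names, the 0.5-second initial delay, the 8-second cap, and the absence of jitter are all assumptions made for illustration.

```python
from typing import List, Optional

# Status codes below 500 that the list above marks as retryable.
RETRYABLE_STATUS_CODES = {408, 409, 429}


def should_retry(status_code: Optional[int]) -> bool:
    """Decide whether a failed attempt is worth retrying.

    None stands for a connection error (no HTTP response was received).
    """
    if status_code is None:
        return True
    return status_code in RETRYABLE_STATUS_CODES or status_code >= 500


def backoff_delays(
    max_retries: int, initial: float = 0.5, cap: float = 8.0
) -> List[float]:
    """Return the sleep before each retry: doubling, limited by cap.

    The initial delay and cap are assumed values, not confirmed defaults.
    """
    return [min(initial * 2 ** attempt, cap) for attempt in range(max_retries)]


if __name__ == "__main__":
    print(should_retry(529), should_retry(400))
    print(backoff_delays(5))
```

A client error such as 400 or 404 is not retried under these rules, since repeating the same request would fail the same way.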
TypeScript: Configure Retries
import Anthropic from "@anthropic-ai/sdk";
// Disable retries
const client = new Anthropic({ maxRetries: 0 });
// Increase retries
const client2 = new Anthropic({ maxRetries: 5 });
Avoid the 10-Minute Timeout with Streaming
The SDK throws ValueError if a non-streaming request is expected to take more than 10 minutes. The solution is to use streaming with get_final_message():
import anthropic
client = anthropic.Anthropic()
# This might throw ValueError for very large outputs:
# message = client.messages.create(model="claude-opus-4-6", max_tokens=128000, ...)
# Use streaming instead -- no 10-minute limit
with client.messages.stream(
    model="claude-opus-4-6",
    max_tokens=128000,
    messages=[{"role": "user", "content": "Write a comprehensive report"}]
) as stream:
    # Blocks until the stream completes, then returns the assembled message
    message = stream.get_final_message()
print(message.content[0].text)
import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic();
const stream = client.messages.stream({
  model: "claude-opus-4-6",
  max_tokens: 128000,
  messages: [{ role: "user", content: "Write a comprehensive report" }]
});
const message = await stream.finalMessage();
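One way to see why streaming helps: a read timeout limits the wait for the next chunk of data, not the length of the whole response. The sketch below simulates that with plain numbers, assuming the timeout is enforced per read as httpx read timeouts are. trips_read_timeout and the gap values are hypothetical, and no network calls are made.

```python
from typing import Iterable


def trips_read_timeout(gaps_seconds: Iterable[float], read_timeout: float) -> bool:
    """Return True if any wait between consecutive chunks exceeds read_timeout."""
    return any(gap > read_timeout for gap in gaps_seconds)


READ_TIMEOUT = 600.0  # 10 minutes, applied to each wait for data

# Non-streaming: the body arrives only after generation ends, so the
# client waits once for the whole duration (here, 15 minutes).
non_streaming_gaps = [900.0]

# Streaming: the same 15 minutes of generation delivered as 900 events,
# roughly one per second.
streaming_gaps = [1.0] * 900

print(trips_read_timeout(non_streaming_gaps, READ_TIMEOUT))  # True
print(trips_read_timeout(streaming_gaps, READ_TIMEOUT))      # False
```

Both simulated requests take 15 minutes in total; only the one that stays silent for the whole time exceeds the per-read limit.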
Recommended Production Configuration
import anthropic
import httpx
client = anthropic.Anthropic(
    max_retries=3,
    timeout=httpx.Timeout(
        300.0,         # 5-minute default for any phase not set below (pool)
        read=30.0,     # 30-second read timeout
        write=30.0,    # 30-second write timeout
        connect=10.0,  # 10-second connect timeout
    )
)
import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic({
  maxRetries: 3,
  timeout: 300 * 1000 // 5 minutes
});
Default Values Reference
| Setting | Python Default | TypeScript Default |
|---|---|---|
| Timeout | 10 minutes | 10 minutes |
| Max retries | 2 | 2 |
| Retry conditions | Connection, 408, 409, 429, >=500 | Same |
| Backoff | Exponential | Exponential |
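The timeout and retry defaults interact: if every attempt runs until it times out, the total wait is roughly the number of attempts times the timeout, plus the backoff sleeps in between. The sketch below is a rough estimate that assumes the timeout applies to each attempt separately and that timed-out attempts are retried. The function name and the backoff values (0.5 seconds doubling up to 8 seconds) are hypothetical, not confirmed SDK defaults.

```python
def worst_case_seconds(
    timeout: float,
    max_retries: int,
    initial_backoff: float = 0.5,  # assumed, not a confirmed SDK value
    max_backoff: float = 8.0,      # assumed, not a confirmed SDK value
) -> float:
    """Upper bound on wall-clock time if every attempt times out."""
    attempts = max_retries + 1  # the first try plus each retry
    sleeps = sum(
        min(initial_backoff * 2 ** i, max_backoff) for i in range(max_retries)
    )
    return attempts * timeout + sleeps


if __name__ == "__main__":
    # Defaults from the table: 10-minute timeout, 2 retries.
    print(worst_case_seconds(600, 2))  # 1801.5
    # A 5-minute timeout with 3 retries.
    print(worst_case_seconds(300, 3))  # 1203.5
```

Under these assumptions the defaults can keep a caller waiting for about 30 minutes, which is worth factoring into any upstream deadline.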
Prevention
- Use streaming for long outputs: Any request with max_tokens > 4096 benefits from streaming.
- Set appropriate timeouts: Match your timeout to the expected response time. A 10-second question does not need a 10-minute timeout.
- Increase retries in production: 3-5 retries catch most transient failures without excessive delay.
- Use TCP keep-alive: The SDKs automatically set TCP keep-alive options to detect broken connections.
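One possible way to match a timeout to the expected response time is to derive it from latencies you have already observed. The helper below is a hypothetical sketch, not part of either SDK: it takes a high percentile of past response times and adds headroom. The 99th percentile and the 1.5x factor are arbitrary starting points to tune.

```python
import math
from typing import Sequence


def suggest_timeout(
    latencies_seconds: Sequence[float],
    percentile: float = 0.99,
    headroom: float = 1.5,
) -> float:
    """Suggest a timeout from observed latencies (nearest-rank percentile)."""
    if not latencies_seconds:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_seconds)
    # Nearest rank: the smallest sample with at least `percentile`
    # of all samples at or below it.
    rank = max(1, math.ceil(percentile * len(ordered)))
    return ordered[rank - 1] * headroom


if __name__ == "__main__":
    samples = [2.0, 3.0, 4.0, 12.0]  # made-up response times in seconds
    print(suggest_timeout(samples))
```

The result can be passed as the timeout value at the client level or per request with with_options().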
Related Guides
- Claude Streaming Not Working – fix streaming issues that arise from timeout misconfiguration.
- Claude API Error 500 api_error Fix – server errors that retries can recover from.
- Claude API Error 429 rate_limit_error Fix – rate limit handling works with retry configuration.
- Claude Python SDK Installation Guide – install the SDK before configuring timeouts.
- Claude API Error 529 overloaded_error Fix – overload errors also benefit from retry configuration.