Fix: SDK Streaming Hangs Indefinitely
The Error
Your application using client.messages.stream() or client.messages.create({{ stream: true }}) freezes. No output, no error, no timeout. The process sits at 0% CPU, waiting forever:
$ ps aux | grep node
968 0.0 0:04.89 node my-app.js # 0% CPU - waiting forever
The stream started correctly (you received message_start and partial content), but at some point the server stopped sending SSE events. The connection did not close, did not error – it just went silent.
Quick Fix
Add an idle timeout wrapper around your streaming calls:
import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic();
function streamWithIdleTimeout(
params: Anthropic.MessageCreateParams,
idleMs = 60000
) {
const controller = new AbortController();
let idleTimer: NodeJS.Timeout;
const resetTimer = () => {
clearTimeout(idleTimer);
idleTimer = setTimeout(() => {
controller.abort();
}, idleMs);
};
resetTimer();
const stream = client.messages.stream(params, {
signal: controller.signal,
});
stream.on("text", () => resetTimer());
stream.on("message", () => clearTimeout(idleTimer));
return stream;
}
What Causes This
The Anthropic SDK provides two timeout mechanisms:
| Mechanism | What It Detects |
|---|---|
timeout option (default 10 min) |
Overall request timeout from first byte |
AbortController |
Manual cancellation via signal |
Neither detects stalled streams. Once the server starts sending data, the request timeout is satisfied. If the server then stops sending data mid-stream, the SDK waits indefinitely.
Here is a real-world failure timeline:
0s Connection established, request sent OK
0.5s First SSE event: message_start OK
2s SSE event: thinking block OK
5s SSE event: content_block_delta (partial) OK
10s Last recorded SSE event OK
10s+ Stream stalls - no more events, stop_reason still null HANG
15 min+ User notices, process still 0% CPU BLOCKED
The root causes for server-side stalls include:
- Network-level connection drop – the TCP connection was silently dropped at a NAT/proxy layer, but the OS sees no socket error because TCP keepalive intervals are too long
- Server-side processing stall – the model inference hangs on the server, and no heartbeat mechanism exists in the SSE protocol
- Load balancer timeout – a reverse proxy in the path drops the connection after its idle timeout, but the client socket remains open
Full Solution
TypeScript: Idle Timeout Wrapper
The TypeScript SDK’s streaming helper supports .on("text", ...) for text events and .finalMessage() to get the complete Message object without writing event-handling code.
import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic();
async function robustStream(
params: Anthropic.MessageCreateParamsStreaming,
options: { idleTimeoutMs?: number; maxRetries?: number } = {}
): Promise<Anthropic.Message> {
const { idleTimeoutMs = 60_000, maxRetries = 3 } = options;
for (let attempt = 0; attempt < maxRetries; attempt++) {
const controller = new AbortController();
let idleTimer: NodeJS.Timeout;
const resetTimer = () => {
clearTimeout(idleTimer);
idleTimer = setTimeout(() => {
controller.abort();
}, idleTimeoutMs);
};
try {
resetTimer();
const stream = client.messages.stream(params, {
signal: controller.signal,
});
// Reset timer on any text event
stream.on("text", () => resetTimer());
stream.on("message", () => clearTimeout(idleTimer));
const message = await stream.finalMessage();
clearTimeout(idleTimer);
return message;
} catch (error) {
clearTimeout(idleTimer!);
if (controller.signal.aborted && attempt < maxRetries - 1) {
const backoff = Math.min(1000 * 2 ** attempt, 10_000);
await new Promise((r) => setTimeout(r, backoff));
continue;
}
throw error;
}
}
throw new Error(`Failed after ${maxRetries} attempts`);
}
// Usage:
const message = await robustStream({
model: "claude-sonnet-4-6",
max_tokens: 4096,
messages: [{ role: "user", content: "Explain quantum computing" }],
stream: true,
});
Python: Idle Timeout Wrapper
The Python SDK’s streaming helper provides stream.text_stream for iterating text chunks and stream.get_final_message() for the complete Message.
import asyncio
import anthropic
client = anthropic.AsyncAnthropic()
async def robust_stream(
params: dict,
idle_timeout_s: float = 60.0,
max_retries: int = 3,
) -> anthropic.types.Message:
for attempt in range(max_retries):
try:
async with client.messages.stream(**params) as stream:
async for event in stream:
pass # Each event resets the effective idle window
return await stream.get_final_message()
except Exception:
if attempt < max_retries - 1:
backoff = min(2 ** attempt, 10)
await asyncio.sleep(backoff)
continue
raise
raise RuntimeError(f"Failed after {max_retries} attempts")
# Usage:
message = await robust_stream({
"model": "claude-sonnet-4-6",
"max_tokens": 4096,
"messages": [{"role": "user", "content": "Explain quantum computing"}],
})
Timeout Configuration
Both SDKs default to a 10-minute overall timeout. For large max_tokens values without streaming, the TypeScript SDK dynamically calculates timeouts up to 60 minutes.
// Set overall timeout
const client = new Anthropic({
timeout: 600_000, // 10 min overall timeout
});
# Python timeout configuration
client = anthropic.Anthropic(timeout=600.0) # 10 min
Prevention
- Always implement idle timeout detection on streaming responses – the SDK’s built-in
timeoutonly covers connection establishment, not mid-stream stalls - Add automatic retry with exponential backoff for stream failures
- Log the last SSE event timestamp so you can diagnose stalls in production
- Consider non-streaming for short requests – only use streaming when you need progressive output. Use
.stream()with.finalMessage()(TypeScript) or.get_final_message()(Python) to get the complete Message object