Fix: TypeError: terminated in the Anthropic TypeScript SDK (Streaming)

Written by Michael Lip · Solo founder of Zovo · More at zovo.one

The Error

When using the Anthropic TypeScript SDK with streaming and large inputs, you intermittently get:

TypeError: terminated

This happens most often when large inputs are combined with multiple tool calls, and has been reported more frequently with models such as Sonnet 4.5, Sonnet 4.6, and Haiku 4.5.

Quick Fix

Add retry logic with exponential backoff:

import Anthropic from "@anthropic-ai/sdk";

async function createWithRetry(
  client: Anthropic,
  params: Anthropic.MessageCreateParamsStreaming,
  maxRetries = 3
): Promise<Anthropic.Message> {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      const stream = client.messages.stream(params);
      return await stream.finalMessage();
    } catch (error) {
      if (
        error instanceof TypeError &&
        error.message === "terminated" &&
        attempt < maxRetries - 1
      ) {
        const backoff = Math.min(1000 * 2 ** attempt, 10_000);
        await new Promise((r) => setTimeout(r, backoff));
        continue;
      }
      throw error;
    }
  }
  throw new Error("Max retries exceeded");
}
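One refinement worth considering: with a fixed exponential schedule, many concurrent clients that fail together also retry together. "Full jitter" randomizes each delay up to the exponential ceiling so retries spread out. A minimal sketch (the helper name and defaults are illustrative, not part of the SDK):

```typescript
// Full-jitter backoff: returns a random delay in [0, min(capMs, baseMs * 2^attempt)).
// Drop-in replacement for the fixed `backoff` computation above.
function backoffDelay(attempt: number, baseMs = 1000, capMs = 10_000): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.random() * ceiling;
}
```

Inside the retry loop, `await new Promise((r) => setTimeout(r, backoffDelay(attempt)))` replaces the fixed delay.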

What Causes This

The TypeError: terminated error originates from Node.js’s undici HTTP client (the default fetch implementation in Node.js 18+). It occurs when the HTTP connection is terminated unexpectedly during a streaming response.

The error chain:

  1. Your code sends a streaming API request with a large input (many messages, large context)
  2. The server begins streaming the response
  3. The underlying TCP connection is closed mid-stream (by the server, a proxy, or a network intermediary)
  4. undici converts the premature connection close into a TypeError: terminated
  5. The SDK propagates this as an unrecoverable error
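When debugging, it helps to inspect `error.cause`, which undici attaches to the TypeError in Node 18+ (the exact cause shape varies across Node versions, so treat this as a sketch):

```typescript
// Sketch: surface the underlying undici cause behind "TypeError: terminated".
// `cause` is typically the socket-level error (e.g. "other side closed").
function describeTerminated(error: unknown): string {
  if (error instanceof TypeError && error.message === "terminated") {
    const cause = (error as Error & { cause?: unknown }).cause;
    const detail = cause instanceof Error ? cause.message : String(cause);
    return `terminated (underlying cause: ${detail})`;
  }
  return String(error);
}
```

Logging this in your catch block tells you whether the close came from the remote side or a local timeout.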

Why large inputs trigger this more frequently:

  1. Large prompts take longer to process, so the connection stays open longer and has more exposure to idle timeouts on proxies and load balancers
  2. Time-to-first-token grows with input size, and some intermediaries close connections that look idle before the first bytes arrive
  3. Multiple tool calls multiply the number of long-lived requests per conversation, raising the odds that at least one hits a premature close

Full Solution

Option 1: Retry Wrapper with Classification

import Anthropic from "@anthropic-ai/sdk";

function isRetryableError(error: unknown): boolean {
  if (error instanceof TypeError && error.message === "terminated") {
    return true;
  }
  if (error instanceof Anthropic.APIConnectionError) {
    return true;
  }
  if (
    error instanceof Anthropic.APIError &&
    // 429 rate limit, 5xx server errors, 529 = Anthropic overloaded_error
    (error.status === 429 ||
      error.status === 500 ||
      error.status === 503 ||
      error.status === 529)
  ) {
    return true;
  }
  return false;
}

async function resilientStream(
  client: Anthropic,
  params: Anthropic.MessageCreateParamsStreaming,
  options: {
    maxRetries?: number;
    baseDelayMs?: number;
    onRetry?: (attempt: number, error: unknown) => void;
  } = {}
): Promise<Anthropic.Message> {
  const { maxRetries = 3, baseDelayMs = 1000, onRetry } = options;

  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      const stream = client.messages.stream(params);
      return await stream.finalMessage();
    } catch (error) {
      if (!isRetryableError(error) || attempt === maxRetries) {
        throw error;
      }
      const delay = Math.min(baseDelayMs * 2 ** attempt, 30_000);
      onRetry?.(attempt + 1, error);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }

  throw new Error("Unreachable");
}

Option 2: Reduce Input Size

Trim conversation history so each request stays well below the sizes that trigger terminations:

function trimMessages(
  messages: Anthropic.MessageParam[],
  maxTokenEstimate: number
): Anthropic.MessageParam[] {
  // Rough estimate: 1 token ~= 4 characters
  const estimateTokens = (msg: Anthropic.MessageParam): number => {
    if (typeof msg.content === "string") {
      return Math.ceil(msg.content.length / 4);
    }
    return msg.content.reduce((acc, block) => {
      if ("text" in block) return acc + Math.ceil(block.text.length / 4);
      return acc + 100; // Estimate for non-text blocks
    }, 0);
  };

  let totalTokens = 0;
  const kept: Anthropic.MessageParam[] = [];

  // Keep the last N messages that fit within the budget
  for (let i = messages.length - 1; i >= 0; i--) {
    const msgTokens = estimateTokens(messages[i]);
    if (totalTokens + msgTokens > maxTokenEstimate) break;
    totalTokens += msgTokens;
    kept.unshift(messages[i]);
  }

  return kept;
}
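The budget logic is easiest to verify in isolation. This standalone variant applies the same last-N-that-fit walk to plain strings, with no SDK types involved (the function name is illustrative):

```typescript
// Keep the most recent messages whose estimated token total fits the budget,
// walking backwards and preserving original order in the result.
function trimStrings(messages: string[], maxTokens: number): string[] {
  const estimate = (s: string) => Math.ceil(s.length / 4);
  let total = 0;
  const kept: string[] = [];
  for (let i = messages.length - 1; i >= 0; i--) {
    const t = estimate(messages[i]);
    if (total + t > maxTokens) break;
    total += t;
    kept.unshift(messages[i]);
  }
  return kept;
}
```

Note that the walk stops at the first message that does not fit, so a single oversized old message also cuts off everything before it; that is usually the desired behavior for chat history.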

Option 3: Use Non-Streaming for Large Inputs

Switch to non-streaming mode when the input is large, since a single buffered response avoids keeping the HTTP stream open for the whole generation. One caveat: the SDK refuses non-streaming requests it estimates (from max_tokens) will outlast its 10-minute default timeout, so for very long generations either pass a larger per-request timeout or fall back to streaming with retries.

async function adaptiveRequest(
  client: Anthropic,
  params: Anthropic.MessageCreateParamsNonStreaming | Anthropic.MessageCreateParamsStreaming
): Promise<Anthropic.Message> {
  const inputSize = JSON.stringify(params.messages).length;
  const LARGE_INPUT_THRESHOLD = 200_000; // characters

  if (inputSize > LARGE_INPUT_THRESHOLD) {
    // Large input: use non-streaming to avoid undici termination
    const nonStreamParams = {
      ...params,
      stream: false,
    } as Anthropic.MessageCreateParamsNonStreaming;
    return await client.messages.create(nonStreamParams);
  }

  // Normal input: use streaming
  const streamParams = {
    ...params,
    stream: true,
  } as Anthropic.MessageCreateParamsStreaming;
  const stream = client.messages.stream(streamParams);
  return await stream.finalMessage();
}

Prevention