Claude XML Tags vs JSON for Token Efficiency

Written by Michael Lip · Solo founder of Zovo · $400K+ on Upwork · 100% JSS Join 50+ builders · More at zovo.one

Passing structured context to Claude in XML format uses approximately 30% fewer tokens than equivalent JSON. On a pipeline processing 1,000 daily conversations with 50K-token contexts on Opus 4.7, switching from JSON to XML saves $75/day — $2,250/month — by reducing input tokens from 50K to 35K per request.

The Setup

Claude is designed to work well with XML tags for structuring prompts. Anthropic’s own documentation uses XML for prompt structuring. JSON, while universally understood, carries syntactic overhead: quotes around every key, colons, commas, curly braces, and square brackets.

For prompt context (not output format), XML is consistently more token-efficient. This guide quantifies the difference and shows how to convert your JSON-heavy prompts to XML for immediate token savings.

The Math

Example: Customer record in both formats

JSON version (estimated ~180 tokens):

{"customer":{"name":"Jane Smith","email":"[email protected]","plan":"enterprise","created":"2025-01-15","tickets":[{"id":"T-001","type":"billing","status":"resolved","date":"2026-04-01"},{"id":"T-002","type":"technical","status":"open","date":"2026-04-15"}]}}

XML version (estimated ~125 tokens):

<customer name="Jane Smith" email="[email protected]" plan="enterprise" created="2025-01-15">
<ticket id="T-001" type="billing" status="resolved" date="2026-04-01"/>
<ticket id="T-002" type="technical" status="open" date="2026-04-15"/>
</customer>

Token savings: ~30%

At scale (1,000 requests/day, 50K context tokens, Opus 4.7):

On Sonnet 4.6: $1,350/month saved. On Haiku 4.5: $450/month saved.

The Technique

Converting JSON Context to XML

import json
import xml.etree.ElementTree as ET

def json_to_xml_context(data: dict, root_tag: str = "data") -> str:
    """Convert JSON data to XML format for token-efficient Claude prompts."""
    def dict_to_xml(d: dict, parent: ET.Element):
        for key, value in d.items():
            if isinstance(value, dict):
                child = ET.SubElement(parent, key)
                dict_to_xml(value, child)
            elif isinstance(value, list):
                for item in value:
                    child = ET.SubElement(parent, key[:-1] if key.endswith("s") else key)
                    if isinstance(item, dict):
                        for k, v in item.items():
                            child.set(k, str(v))
                    else:
                        child.text = str(item)
            else:
                parent.set(key, str(value))

    root = ET.Element(root_tag)
    dict_to_xml(data, root)
    return ET.tostring(root, encoding="unicode")

# Example conversion
customer_json = {
    "name": "Jane Smith",
    "email": "[email protected]",
    "plan": "enterprise",
    "tickets": [
        {"id": "T-001", "type": "billing", "status": "resolved"},
        {"id": "T-002", "type": "technical", "status": "open"},
    ],
}

xml_context = json_to_xml_context(customer_json, "customer")
print(xml_context)

Using XML Tags in Claude Prompts

import anthropic

client = anthropic.Anthropic()

# XML-structured prompt (token-efficient)
xml_prompt = """<context>
<customer name="Jane Smith" email="[email protected]" plan="enterprise"/>
<tickets>
<ticket id="T-001" type="billing" status="resolved" date="2026-04-01"/>
<ticket id="T-002" type="technical" status="open" date="2026-04-15"/>
</tickets>
</context>

<task>Draft a follow-up email about the open technical ticket.</task>"""

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": xml_prompt}],
)
print(response.content[0].text)

Token Count Comparison Script

def compare_format_tokens(data: dict, model: str = "claude-sonnet-4-6") -> dict:
    """Compare token counts between JSON and XML representations."""
    json_text = json.dumps(data, indent=2)
    xml_text = json_to_xml_context(data)

    json_count = client.messages.count_tokens(
        model=model,
        messages=[{"role": "user", "content": f"Analyze this data:\n{json_text}"}],
    )
    xml_count = client.messages.count_tokens(
        model=model,
        messages=[{"role": "user", "content": f"Analyze this data:\n{xml_text}"}],
    )

    savings = json_count.input_tokens - xml_count.input_tokens
    pct = (savings / json_count.input_tokens) * 100

    return {
        "json_tokens": json_count.input_tokens,
        "xml_tokens": xml_count.input_tokens,
        "tokens_saved": savings,
        "percentage_saved": f"{pct:.1f}%",
        "opus_savings_per_10k": f"${savings * 10000 * 5 / 1_000_000:.2f}",
    }

result = compare_format_tokens(customer_json)
print(f"JSON: {result['json_tokens']} tokens")
print(f"XML: {result['xml_tokens']} tokens")
print(f"Saved: {result['tokens_saved']} tokens ({result['percentage_saved']})")
print(f"Opus savings per 10K requests: {result['opus_savings_per_10k']}")

Hybrid Approach: XML Input, JSON Output

You can use XML for input context (saves tokens) while requesting JSON output (easier to parse programmatically):

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=500,
    system='Analyze the customer data and output JSON: {"risk_level": "low|medium|high", "reason": "str", "action": "str"}',
    messages=[{
        "role": "user",
        "content": """<customer plan="free" tickets_open="5" days_since_login="90">
<ticket type="billing" age_days="30"/>
<ticket type="billing" age_days="15"/>
<ticket type="technical" age_days="7"/>
<ticket type="access" age_days="3"/>
<ticket type="access" age_days="1"/>
</customer>""",
    }],
)
print(response.content[0].text)

The Tradeoffs

XML saves tokens on input context, but if your application parses Claude’s output programmatically, JSON output is easier to handle. Use the hybrid approach: XML in, JSON out.

Not all data structures convert cleanly to XML. Deeply nested objects with mixed types can produce awkward XML that is harder for Claude to interpret. Flat or shallow data structures benefit most from XML conversion.

If your entire codebase is JSON-centric, the refactoring cost may not justify the savings for low-volume applications. The break-even point is roughly 1,000 requests/day on Sonnet or 500 requests/day on Opus.

Additional Token-Saving Techniques for Structured Data

Beyond XML conversion, consider these approaches for further token reduction:

At Opus 4.7 rates ($5.00/MTok), combining XML conversion with attribute optimization and whitespace removal can reduce a 50K-token JSON context to approximately 30K tokens – saving $100/day at 1,000 requests.

Implementation Checklist

  1. Identify prompts that pass structured data as JSON context
  2. Convert the top 3 highest-volume prompts to XML format
  3. Run token count comparison to verify savings
  4. Test output quality on 50 requests with XML context
  5. Deploy XML-formatted prompts for qualifying request types
  6. Keep JSON for output format specifications

Measuring Impact

Compare token counts for identical requests in JSON vs XML format. Track the percentage reduction and multiply by your daily volume and model rate. Typical savings range from 20-35% on structured context. If your prompts are primarily natural language with minimal structured data, the savings will be smaller. Focus XML conversion on prompts where structured data makes up more than 30% of the input tokens.