Claude XML Tags vs JSON for Token Efficiency
Passing structured context to Claude in XML format uses approximately 30% fewer tokens than equivalent JSON. On a pipeline processing 1,000 daily conversations with 50K-token contexts on Opus 4.7, switching from JSON to XML saves $75/day — $2,250/month — by reducing input tokens from 50K to 35K per request.
The Setup
Claude is designed to work well with XML tags for structuring prompts. Anthropic’s own documentation uses XML for prompt structuring. JSON, while universally understood, carries syntactic overhead: quotes around every key, colons, commas, curly braces, and square brackets.
For prompt context (not output format), XML is consistently more token-efficient. This guide quantifies the difference and shows how to convert your JSON-heavy prompts to XML for immediate token savings.
The Math
Example: Customer record in both formats
JSON version (estimated ~180 tokens):
{"customer":{"name":"Jane Smith","email":"[email protected]","plan":"enterprise","created":"2025-01-15","tickets":[{"id":"T-001","type":"billing","status":"resolved","date":"2026-04-01"},{"id":"T-002","type":"technical","status":"open","date":"2026-04-15"}]}}
XML version (estimated ~125 tokens):
<customer name="Jane Smith" email="[email protected]" plan="enterprise" created="2025-01-15">
<ticket id="T-001" type="billing" status="resolved" date="2026-04-01"/>
<ticket id="T-002" type="technical" status="open" date="2026-04-15"/>
</customer>
Token savings: ~30%
At scale (1,000 requests/day, 50K context tokens, Opus 4.7):
- JSON: 50M tokens/day * $5.00/MTok = $250/day
- XML: 35M tokens/day * $5.00/MTok = $175/day
- Savings: $75/day -> $2,250/month
On Sonnet 4.6: $1,350/month saved. On Haiku 4.5: $450/month saved.
The Technique
Converting JSON Context to XML
import json
import xml.etree.ElementTree as ET
def json_to_xml_context(data: dict, root_tag: str = "data") -> str:
"""Convert JSON data to XML format for token-efficient Claude prompts."""
def dict_to_xml(d: dict, parent: ET.Element):
for key, value in d.items():
if isinstance(value, dict):
child = ET.SubElement(parent, key)
dict_to_xml(value, child)
elif isinstance(value, list):
for item in value:
child = ET.SubElement(parent, key[:-1] if key.endswith("s") else key)
if isinstance(item, dict):
for k, v in item.items():
child.set(k, str(v))
else:
child.text = str(item)
else:
parent.set(key, str(value))
root = ET.Element(root_tag)
dict_to_xml(data, root)
return ET.tostring(root, encoding="unicode")
# Example conversion
customer_json = {
"name": "Jane Smith",
"email": "[email protected]",
"plan": "enterprise",
"tickets": [
{"id": "T-001", "type": "billing", "status": "resolved"},
{"id": "T-002", "type": "technical", "status": "open"},
],
}
xml_context = json_to_xml_context(customer_json, "customer")
print(xml_context)
Using XML Tags in Claude Prompts
import anthropic
client = anthropic.Anthropic()
# XML-structured prompt (token-efficient)
xml_prompt = """<context>
<customer name="Jane Smith" email="[email protected]" plan="enterprise"/>
<tickets>
<ticket id="T-001" type="billing" status="resolved" date="2026-04-01"/>
<ticket id="T-002" type="technical" status="open" date="2026-04-15"/>
</tickets>
</context>
<task>Draft a follow-up email about the open technical ticket.</task>"""
response = client.messages.create(
model="claude-sonnet-4-6",
max_tokens=1024,
messages=[{"role": "user", "content": xml_prompt}],
)
print(response.content[0].text)
Token Count Comparison Script
def compare_format_tokens(data: dict, model: str = "claude-sonnet-4-6") -> dict:
"""Compare token counts between JSON and XML representations."""
json_text = json.dumps(data, indent=2)
xml_text = json_to_xml_context(data)
json_count = client.messages.count_tokens(
model=model,
messages=[{"role": "user", "content": f"Analyze this data:\n{json_text}"}],
)
xml_count = client.messages.count_tokens(
model=model,
messages=[{"role": "user", "content": f"Analyze this data:\n{xml_text}"}],
)
savings = json_count.input_tokens - xml_count.input_tokens
pct = (savings / json_count.input_tokens) * 100
return {
"json_tokens": json_count.input_tokens,
"xml_tokens": xml_count.input_tokens,
"tokens_saved": savings,
"percentage_saved": f"{pct:.1f}%",
"opus_savings_per_10k": f"${savings * 10000 * 5 / 1_000_000:.2f}",
}
result = compare_format_tokens(customer_json)
print(f"JSON: {result['json_tokens']} tokens")
print(f"XML: {result['xml_tokens']} tokens")
print(f"Saved: {result['tokens_saved']} tokens ({result['percentage_saved']})")
print(f"Opus savings per 10K requests: {result['opus_savings_per_10k']}")
Hybrid Approach: XML Input, JSON Output
You can use XML for input context (saves tokens) while requesting JSON output (easier to parse programmatically):
response = client.messages.create(
model="claude-sonnet-4-6",
max_tokens=500,
system='Analyze the customer data and output JSON: {"risk_level": "low|medium|high", "reason": "str", "action": "str"}',
messages=[{
"role": "user",
"content": """<customer plan="free" tickets_open="5" days_since_login="90">
<ticket type="billing" age_days="30"/>
<ticket type="billing" age_days="15"/>
<ticket type="technical" age_days="7"/>
<ticket type="access" age_days="3"/>
<ticket type="access" age_days="1"/>
</customer>""",
}],
)
print(response.content[0].text)
The Tradeoffs
XML saves tokens on input context, but if your application parses Claude’s output programmatically, JSON output is easier to handle. Use the hybrid approach: XML in, JSON out.
Not all data structures convert cleanly to XML. Deeply nested objects with mixed types can produce awkward XML that is harder for Claude to interpret. Flat or shallow data structures benefit most from XML conversion.
If your entire codebase is JSON-centric, the refactoring cost may not justify the savings for low-volume applications. The break-even point is roughly 1,000 requests/day on Sonnet or 500 requests/day on Opus.
Additional Token-Saving Techniques for Structured Data
Beyond XML conversion, consider these approaches for further token reduction:
- Attribute-heavy XML: Use XML attributes instead of child elements.
<order id="123" total="$49.99" status="shipped"/>uses fewer tokens than nesting each field as a separate element. - Abbreviate repeated labels: If you pass 100 customer records, define a legend once (
<legend>n=name e=email p=plan</legend>) and use single-letter attributes thereafter. This can save an additional 10-15% on large datasets. - Strip whitespace: Minified XML (no indentation or newlines) saves 5-8% of tokens compared to pretty-printed XML. Claude parses both equally well.
- Omit null/empty fields: JSON requires explicit
"field": nullentries (3-5 tokens each). XML simply omits the attribute, saving tokens automatically.
At Opus 4.7 rates ($5.00/MTok), combining XML conversion with attribute optimization and whitespace removal can reduce a 50K-token JSON context to approximately 30K tokens – saving $100/day at 1,000 requests.
Implementation Checklist
- Identify prompts that pass structured data as JSON context
- Convert the top 3 highest-volume prompts to XML format
- Run token count comparison to verify savings
- Test output quality on 50 requests with XML context
- Deploy XML-formatted prompts for qualifying request types
- Keep JSON for output format specifications
Measuring Impact
Compare token counts for identical requests in JSON vs XML format. Track the percentage reduction and multiply by your daily volume and model rate. Typical savings range from 20-35% on structured context. If your prompts are primarily natural language with minimal structured data, the savings will be smaller. Focus XML conversion on prompts where structured data makes up more than 30% of the input tokens.
Related Guides
- Claude Code Token Usage Optimization — broader token optimization strategies
- Reduce Claude Code Hallucinations Save Tokens — structured prompts reduce errors and retries
- Claude Skill Token Usage Profiling — measure token usage by prompt type