Building Production AI Agents (2026)
Claude Code skills are .md files that extend Claude’s behavior for specific tasks When building production AI agents that run autonomously, these skills provide specialized context. for document processing, TDD workflows, memory management, and more. This guide covers practical patterns for composing them into agents that handle real workloads.
Core Skills for Agent Development
TDD Skill for Quality Assurance
The tdd skill guides Claude through red-green-refactor cycles. In an agent pipeline, invoke it on your agent’s own test suite before deployment:
Run TDD skill analysis on agent modules
claude -p "/tdd analyze the agent modules in ./src and generate missing test cases"
This generates test cases based on your agent’s actual implementation. state management, tool invocation sequences, and response validation.
PDF Skill for Document Processing
Production agents frequently process PDF documents. Invoke the pdf skill when the agent receives a document:
claude -p "/pdf Extract all tables and section headings from /tmp/incoming-contract.pdf"
The skill handles complex layouts, form fields, and multi-column documents.
Supermemory Skill for Context Management
Long-running agents need persistent context between sessions. The supermemory skill provides retrieval of previously stored knowledge:
/supermemory store: Order #12345 processed for customer acme-corp, result: approved
/supermemory What are the previous orders for acme-corp?
This lets agents maintain decision history and reference previous outputs without re-loading full context on every run.
Building a Production Agent Workflow
A practical order-processing agent sequences skills via claude -p calls from a shell script:
#!/bin/bash
agent-pipeline.sh
PDF_FILE="$1"
CUSTOMER_ID="$2"
Step 1: Extract order details from PDF
EXTRACTED=$(claude -p "/pdf Extract order details (items, quantities, totals) from $PDF_FILE")
Step 2: Retrieve customer history
HISTORY=$(claude -p "/supermemory What are the previous orders for $CUSTOMER_ID?")
Step 3: Process with full context
RESULT=$(claude -p "
Order details: $EXTRACTED
Customer history: $HISTORY
Determine whether to approve this order based on past account standing.
Reply with APPROVED or HOLD and a one-sentence reason.
")
echo "$RESULT"
Step 4: Store the result for future reference
claude -p "/supermemory store: Order from $PDF_FILE for $CUSTOMER_ID. $RESULT"
Each step is discrete and testable. The tdd skill can generate tests for each step’s expected input/output pairs before you deploy.
Skill Composition Strategies
Sequential processing: Chain claude -p calls where each output feeds the next. Use when transformations build on each other.
Parallel execution: Invoke independent skills simultaneously using background shell processes:
Run pdf extraction and memory retrieval in parallel
claude -p "/pdf Extract order details from $PDF_FILE" > /tmp/pdf-result.txt &
claude -p "/supermemory What is the order history for $CUSTOMER_ID?" > /tmp/mem-result.txt &
wait
Then process both results together
RESULT=$(claude -p "PDF result: $(cat /tmp/pdf-result.txt)
Memory: $(cat /tmp/mem-result.txt)
Approve or hold this order?")
Conditional routing: Select skills based on input analysis using shell conditionals:
if [[ "$REQUEST_TYPE" == "document" ]]; then
claude -p "/pdf Process $FILE"
elif [[ "$REQUEST_TYPE" == "query" ]]; then
claude -p "/supermemory $QUERY"
else
claude -p "Handle this general request: $REQUEST"
fi
Error Handling and Resilience
Implement retry logic for skill invocations that may fail transiently:
run_skill_with_retry() {
local PROMPT="$1"
local OUTPUT="$2"
local MAX_RETRIES=3
local ATTEMPT=0
while [ $ATTEMPT -lt $MAX_RETRIES ]; do
RESULT=$(claude -p "$PROMPT" 2>/dev/null)
if [[ $? -eq 0 && -n "$RESULT" ]]; then
echo "$RESULT" > "$OUTPUT"
return 0
fi
ATTEMPT=$((ATTEMPT + 1))
echo "Attempt $ATTEMPT failed, retrying..." >&2
sleep $((ATTEMPT * 2))
done
echo "All retries failed" >&2
return 1
}
When a skill fails repeatedly, fail fast rather than consuming resources on retries.
Monitoring Skill Execution
Track which skills your agent invokes and their outcomes using structured logging:
invoke_skill() {
local SKILL="$1"
local PROMPT="$2"
local START=$(date +%s%N)
RESULT=$(claude -p "/$SKILL $PROMPT" 2>/dev/null)
local EXIT_CODE=$?
local END=$(date +%s%N)
local DURATION_MS=$(( (END - START) / 1000000 ))
if [[ $EXIT_CODE -eq 0 ]]; then
echo "{\"skill\": \"$SKILL\", \"status\": \"success\", \"duration_ms\": $DURATION_MS}" >> /var/log/agent-metrics.jsonl
else
echo "{\"skill\": \"$SKILL\", \"status\": \"failure\", \"duration_ms\": $DURATION_MS}" >> /var/log/agent-metrics.jsonl
fi
echo "$RESULT"
}
These logs help you identify which skills cause latency and which fail under load.
Deployment Considerations
Version management: Pin the version of Claude Code you’re using in production. Skills are read from your local ~/.claude/skills/ directory. store them in version control and deploy them alongside your agent.
npm install -g @anthropic-ai/[email protected]
Resource allocation: Profile which skills are invoked most frequently. If your agent processes thousands of PDFs daily, ensure you have sufficient API quota for those calls.
Testing before deployment: Use the tdd skill to validate that your skill composition produces expected outputs for representative inputs before shipping to production:
claude -p "/tdd Write integration tests for this agent pipeline script: $(cat agent-pipeline.sh)"
Environment Isolation for Agent Skills
When running Claude Code agents in production, skill files should never live only on a developer’s laptop. Store your ~/.claude/skills/ directory contents in a dedicated repository and deploy them as part of your CI/CD pipeline:
deploy-skills.sh
SKILLS_REPO="[email protected]:your-org/claude-skills.git"
SKILLS_DIR="$HOME/.claude/skills"
mkdir -p "$SKILLS_DIR"
git clone "$SKILLS_REPO" /tmp/claude-skills
cp /tmp/claude-skills/*.md "$SKILLS_DIR/"
rm -rf /tmp/claude-skills
echo "Skills deployed: $(ls $SKILLS_DIR | wc -l) files"
Run this script at container startup. Any change to a skill. updated instructions, new examples, corrected behavior. ships through a pull request rather than a manual file copy, giving you an audit trail of every skill modification.
For containerized agents running in Docker, add the skills copy step to your Dockerfile so the image is self-contained:
FROM node:20-slim
RUN npm install -g @anthropic-ai/[email protected]
COPY skills/ /root/.claude/skills/
COPY agent-pipeline.sh /app/agent-pipeline.sh
RUN chmod +x /app/agent-pipeline.sh
CMD ["/app/agent-pipeline.sh"]
Validating Skill Output Quality in Production
Retry logic handles transient failures, but it does not catch outputs that succeed technically yet return wrong answers. Add an output validation step between each skill call and the next stage of your pipeline:
validate_output() {
local OUTPUT="$1"
local MIN_LENGTH="${2:-20}"
if [[ -z "$OUTPUT" ]]; then
echo "INVALID: empty output" >&2
return 1
fi
if [[ ${#OUTPUT} -lt $MIN_LENGTH ]]; then
echo "INVALID: output too short (${#OUTPUT} chars)" >&2
return 1
fi
if echo "$OUTPUT" | grep -qi "i cannot\|unable to process\|error:"; then
echo "INVALID: refusal or error phrase detected" >&2
return 1
fi
return 0
}
EXTRACTED=$(claude -p "/pdf Extract order details from $PDF_FILE")
if ! validate_output "$EXTRACTED" 50; then
echo "PDF extraction failed quality check, aborting pipeline" >&2
exit 1
fi
This pattern catches the cases where the pdf skill runs without error but returns a refusal message because the document was unreadable or password-protected. Catching these early prevents garbage data from propagating to downstream steps and corrupting your supermemory store.
Scaling Agent Pipelines Horizontally
Single-threaded shell pipelines hit throughput limits when order volume grows. Use a simple queue-based architecture to run multiple agent instances in parallel without duplicating processing:
worker.sh. run N copies of this on separate machines or containers
QUEUE_DIR="/shared/queue/pending"
DONE_DIR="/shared/queue/done"
while true; do
JOB=$(ls "$QUEUE_DIR" | head -1)
if [[ -z "$JOB" ]]; then
sleep 2
continue
fi
# Atomic claim: move file before processing
mv "$QUEUE_DIR/$JOB" "/shared/queue/processing/$JOB" 2>/dev/null || continue
PDF_FILE=$(jq -r '.pdf' "/shared/queue/processing/$JOB")
CUSTOMER_ID=$(jq -r '.customer_id' "/shared/queue/processing/$JOB")
bash /app/agent-pipeline.sh "$PDF_FILE" "$CUSTOMER_ID" \
> "/shared/queue/results/$JOB.result" 2>&1
mv "/shared/queue/processing/$JOB" "$DONE_DIR/$JOB"
done
Each worker claims jobs atomically by moving the file before processing. Multiple workers can run on separate containers pointing at the same shared volume without stepping on each other. The supermemory skill remains a single source of truth for customer history because all workers invoke it through the same API endpoint rather than maintaining local state.
Find the right skill → Browse 155+ skills in our Skill Finder.
Related
Try it: Paste your error into our Error Diagnostic for an instant fix.
-
Claude Agent SDK guide — Complete guide to building agents with the Claude Agent SDK
- Claude Flow tool guide — How to use Claude Flow for multi-agent orchestration
Related Reading
- Best Claude Code Skills for Frontend Development. Top frontend skills with examples
- Best Claude Skills for Developers in 2026. Broader developer skill overview
- Claude Skills Auto Invocation: How It Works. How skills activate automatically
- Claude Skills vs Langflow for Building AI Agents — CLI Precision vs Visual Workflow Builder — 2026
- How AI Agents Use Tools and Skills Explained
- Claude Skills vs Langflow for AI Agents (2026)
*Built by theluckystrike. More at zovo.one *