Claude Agent Sandbox Skill (2026)

When you run AI agents in production workflows, security and isolation become critical concerns. The agent skill in Claude Code includes sandboxing capabilities that create boundaries between the AI’s operations and your actual filesystem, network, and credentials.

Skills are .md files in ~/.claude/skills/. The agent skill is invoked like any other:

/agent refactor the authentication module in src/auth/. do not touch anything outside that directory

The sandbox behavior defines what that agent can and cannot do when Claude executes bash commands or file operations in response to the task.

What Sandbox Isolation Does

A sandbox is a restricted execution context. When the agent skill operates in sandboxed mode, Claude’s file operations, shell commands, and network requests are constrained to explicitly permitted paths and actions.

This follows least-privilege: the agent receives only the permissions necessary for the current task. If you’re refactoring a module, there’s no reason for the agent to read your SSH keys or call external APIs.

The practical benefit: you can run agentic workflows without reviewing every individual action, because the scope of possible actions is bounded by configuration. This is qualitatively different from simply trusting that Claude will behave. the sandbox enforces limits at the tool permission level, so even unexpected behavior or a confused model state cannot breach the defined perimeter.

Think of it as the difference between asking a contractor to work in your kitchen and telling them not to enter the bedroom versus giving them a key only to the kitchen and deadbolting the bedroom. The second approach removes the need for trust.

Setting Up Filesystem Isolation

The most common sandbox configuration constrains filesystem access. Claude Code’s permission system lets you define which paths are allowed for reads and writes using ~/.claude/settings.json:

{
 "permissions": {
 "allow": [
 "Read(./src/)",
 "Read(./tests/)",
 "Write(./src/)",
 "Write(./build/)"
 ],
 "deny": [
 "Read(~/.ssh/)",
 "Write(./config/)"
 ]
 }
}

With this in place, a /agent invocation that tries to modify ./config/secrets.yml will be blocked by the deny rule. The agent sees the restriction and should report it rather than proceeding.

For testing new community skills without risking your production code, set up an isolated directory structure:

/project/
 sandbox/ ← agent can modify
 test-files/
 production/ ← agent cannot access
 real-app/
 .claude/
 skills/

Then invoke:

/agent test the new pdf skill on sandbox/test-files/. generate a sample PDF and verify extraction works

Even if the community skill behaves unexpectedly, it cannot reach production/.

Understanding Allow and Deny Precedence

When both allow and deny rules match a path, deny takes precedence. This is deliberate. it means you can create a broad allow rule and then carve out exceptions without having to enumerate every protected path individually:

{
 "permissions": {
 "allow": [
 "Read(./src/)",
 "Write(./src/)"
 ],
 "deny": [
 "Read(./src/secrets/)",
 "Write(./src/migrations/)"
 ]
 }
}

Here the agent can read and write most of src/, but migration files and secrets remain protected even though they fall under the broad src/ allow rule. This pattern is easier to maintain than an exhaustive allowlist that you update every time you add a new file.

Path Specificity Matters

Vague paths undermine sandbox effectiveness. Compare these two configurations:

Configuration What it Means
Write(~/) Agent can write anywhere in your home directory
Write(~/projects/myapp/src/) Agent can only write inside that specific source directory
Write(./) Agent can write anywhere in the current directory tree
Write(./src/components/) Agent limited to that one subdirectory

The difference between Write(./) and Write(./src/components/) is the difference between “can modify everything” and “can only modify React components.” Always specify exact directories.

Network Isolation

Network isolation controls outbound calls. For workflows that should be purely local, use a PreToolUse hook in ~/.claude/settings.json to block WebFetch and WebSearch tools:

{
 "hooks": {
 "PreToolUse": [
 {
 "matcher": "WebFetch|WebSearch",
 "hooks": [
 {
 "type": "command",
 "command": "echo 'Network access blocked for this workflow' && exit 1"
 }
 ]
 }
 ]
 }
}

This blocks all network requests, preventing accidental data exfiltration or unexpected API calls. For workflows that need selective network access, modify the hook to only block calls to unauthorized domains.

A more nuanced network policy allows internal tooling but blocks external calls. This is useful when your agent needs to call an internal API but should not be able to reach the public internet:

#!/bin/bash
~/.claude/hooks/network-policy.sh
URL=$(cat /dev/stdin | jq -r '.tool_input.url // empty')
Allow calls to internal services
if echo "$URL" | grep -qE "^https?://(localhost|10\.|192\.168\.|internal\.)"; then
 exit 0
fi
Block all external URLs
echo "Network policy: only internal URLs permitted. Blocked: $URL"
exit 1

Register this as a PreToolUse hook matching WebFetch. The agent can now call your internal API at http://localhost:3000 or http://internal.company.com but cannot reach external domains.

Why Network Isolation Matters Beyond Security

The security argument for network isolation is obvious. you don’t want an agent exfiltrating data or calling external APIs with your credentials. But there’s a subtler reason: reproducibility.

An agent that can make outbound HTTP calls during a refactor might silently pull in different behavior based on whatever remote resource it contacts. An isolated agent produces the same output regardless of network state. For CI/CD pipelines and automated workflows where you need consistent, auditable results, network isolation is as much about reliability as security.

Process Isolation

Process isolation limits which shell commands the agent can execute. Use a PreToolUse hook to intercept Bash tool calls and validate commands:

#!/bin/bash
~/.claude/hooks/validate-bash.sh
COMMAND=$(cat /dev/stdin | jq -r '.tool_input.command')
ALLOWED="npm|git|python|pytest"
if ! echo "$COMMAND" | grep -qE "^($ALLOWED)\b"; then
 echo "Blocked: only $ALLOWED commands are permitted"
 exit 1
fi

Register this hook in ~/.claude/settings.json under PreToolUse with matcher Bash. This prevents shell injection from malformed inputs and blocks the agent from running commands like curl, wget, or ssh.

A more structured version of this hook also logs every command attempt, giving you an audit trail:

#!/bin/bash
~/.claude/hooks/validate-bash-audit.sh
COMMAND=$(cat /dev/stdin | jq -r '.tool_input.command')
TIMESTAMP=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
LOGFILE="$HOME/.claude/agent-audit.log"
ALLOWED_PATTERN="^(npm|yarn|pnpm|git|python3?|pytest|node|ts-node)\b"
if echo "$COMMAND" | grep -qE "$ALLOWED_PATTERN"; then
 echo "$TIMESTAMP ALLOWED: $COMMAND" >> "$LOGFILE"
 exit 0
else
 echo "$TIMESTAMP BLOCKED: $COMMAND" >> "$LOGFILE"
 echo "Command blocked by policy: $COMMAND"
 exit 1
fi

The audit log becomes useful when debugging why a workflow failed. you can see exactly which commands the agent attempted and which were blocked, letting you adjust the allowlist precisely rather than guessing.

Comparing Isolation Levels

Not every workflow needs the same level of restriction. Here is a practical framework for matching isolation level to workflow type:

Workflow Type Filesystem Network Process
Code review (read-only analysis) Read-only allow on src/ Block all Block all
Automated testing Read src/ + tests/, Write build/ Block all npm, pytest only
Code generation (new features) Read + Write src/ Block all npm, git
Documentation generation Read src/, Write docs/ Block all npm, git
Dependency updates Read + Write package.json Allow registry only npm only
Data pipeline (internal APIs) Read data/, Write output/ Allow internal python only

Starting from this table and tightening based on your specific setup is faster than starting from scratch and easier to audit than a single catch-all configuration.

Practical Use Case: Running Tests Safely

The tdd skill generates tests that may include third-party dependencies. Running those tests in a sandboxed context prevents buggy or malicious packages from accessing your environment variables or SSH keys.

/agent run the test suite in tests/ using the tdd skill. only read from tests/ and src/, write only to build/test-results/, no network access

If you need more isolation than the settings file provides, combine the agent skill with Docker:

docker run --rm \
 --network none \
 --read-only \
 --tmpfs /tmp \
 -v $(pwd)/src:/workspace/src:ro \
 -v $(pwd)/tests:/workspace/tests:ro \
 claude-test-image \
 /agent run all tests in /workspace/tests/

This gives you OS-level isolation on top of Claude’s built-in sandbox controls.

The Docker approach is worth elaborating on. Claude Code’s permission system is a soft boundary. it constrains what Claude will attempt. Docker’s --network none and --read-only flags are hard boundaries enforced by the kernel. Combining both means: Claude won’t try to breach the perimeter (soft), and even if it did, the kernel would reject it (hard). For production automation or any workflow where the stakes of a breach are high, layered controls are the right approach.

A more complete Docker setup that includes environment variable isolation:

docker run --rm \
 --network none \
 --read-only \
 --tmpfs /tmp \
 --tmpfs /var/tmp \
 --security-opt no-new-privileges \
 --cap-drop ALL \
 -v $(pwd)/src:/workspace/src:ro \
 -v $(pwd)/tests:/workspace/tests:ro \
 -v $(pwd)/build:/workspace/build:rw \
 --env-file .env.sandbox \
 claude-agent:latest \
 /agent run test suite in /workspace/tests/

The --env-file .env.sandbox passes only the specific variables the workflow needs, rather than inheriting your full shell environment. Your AWS_ACCESS_KEY_ID, DATABASE_URL, and other sensitive variables never enter the container.

Environment-Specific Configuration

Development and production sandboxes should have different permission levels. Your local development box can be more permissive; your CI/CD environment should mirror production restrictions.

Keep environment-specific settings in separate files. Claude Code loads ~/.claude/settings.json globally and .claude/settings.json from the project root. Use the project-level file for stricter CI settings:

{
 "permissions": {
 "allow": [
 "Read(./src/)",
 "Read(./tests/)",
 "Write(./build/)"
 ],
 "deny": [
 "Write(./config/)",
 "Write(./src/)"
 ]
 }
}

The project-level .claude/settings.json committed to your repository ensures every developer on the team and every CI run uses the same restrictions. A developer’s global ~/.claude/settings.json can be more permissive for their personal workflow, but the project file is the floor that applies everywhere.

A practical team configuration separates read-only reviewers from contributors who need write access:

{
 "permissions": {
 "allow": [
 "Read()"
 ],
 "deny": [
 "Write()",
 "Read(./secrets/)",
 "Read(./.env*)"
 ]
 },
 "hooks": {
 "PreToolUse": [
 {
 "matcher": "WebFetch|WebSearch",
 "hooks": [
 {
 "type": "command",
 "command": "echo 'Network access not permitted in this project' && exit 1"
 }
 ]
 }
 ]
 }
}

This project-level config creates a read-only agent by default. suitable for code review workflows. Developers who need write access override it locally or use a separate settings profile for development tasks.

Common Pitfalls

Overly broad filesystem permissions. Granting write access to ~/ or / rather than specific paths removes the protection entirely. Specify exact directories.

Disabling sandboxing for convenience. Some developers disable sandbox controls because a workflow is failing and they want to unblock quickly. This removes a critical safety layer. Instead, identify the specific permission the workflow needs and add only that.

Stale allowlists. As workflows evolve, you accumulate permissions that are no longer needed. Audit your sandbox configuration periodically and remove unused allowances.

Testing sandbox config only in development. Your CI configuration should be tested in a staging environment that mirrors production before it’s relied on. A permission that works locally may interact differently in CI due to path differences or file ownership.

Missing deny rules for sensitive files. An allow rule for Read(./src/) does not automatically exclude ./src/.env.local if that file exists in your source tree. Be explicit about denying environment files and credentials even within otherwise-permitted directories:

{
 "permissions": {
 "allow": ["Read(./src/)"],
 "deny": [
 "Read(./src//.env*)",
 "Read(./src//secrets*)",
 "Read(./src//*credentials*)"
 ]
 }
}

Treating hooks as the only enforcement mechanism. Hooks running shell scripts can fail if jq is not installed, if the script has a syntax error, or if the shell exits with an unexpected code. Always pair hook-based controls with permission rules so there is a second layer if the hook fails.

Auditing and Monitoring Sandbox Behavior

Once sandboxing is in place, periodic review of what the agent actually does is as important as the configuration itself. Claude Code’s PostToolUse hook provides a lightweight monitoring mechanism:

#!/bin/bash
~/.claude/hooks/audit-tool-use.sh
TOOL_NAME=$(cat /dev/stdin | jq -r '.tool_name')
TOOL_INPUT=$(cat /dev/stdin | jq -c '.tool_input')
TIMESTAMP=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
echo "$TIMESTAMP $TOOL_NAME $TOOL_INPUT" >> ~/.claude/tool-use-audit.log

Register this as a PostToolUse hook with a universal matcher. The resulting log shows you every tool invocation the agent made, which files it read and wrote, which commands it ran, and which URLs it fetched. Reviewing this log after a few agent runs quickly reveals whether your permissions are appropriately scoped or whether you have unnecessary allowances.

Moving Forward

Start with the strictest isolation level that still allows your workflow to function. Gradually relax restrictions only after identifying specific needed permissions. This “deny by default” approach minimizes your exposure from the start and keeps your configuration easy to audit over time.

The goal is not to prevent Claude from being useful. it is to make Claude useful within a defined, reviewable boundary. Well-configured sandboxing lets you run more ambitious automated workflows with confidence, because the blast radius of any unexpected behavior is bounded before the workflow starts.

Set it up → Build your permission config with our Permission Configurator.

Try it: Paste your error into our Error Diagnostic for an instant fix.


This site was built by 5 autonomous agents running in tmux while I was in Bali. 2,500 articles. Zero manual work. 100% quality gate pass rate. The orchestration configs, sprint templates, and quality gates that made that possible are in the Zovo Lifetime bundle. Along with 16 CLAUDE.md templates and 80 tested prompts. **[See how the pipeline works →](https://zovo.one/lifetime?utm_source=ccg&utm_medium=cta-skills&utm_campaign=claude-agent-sandbox-skill-isolated-environments)** $99 once. I'm a solo dev in Da Nang. This is how I scale.

Related Reading


*Built by theluckystrike. More at zovo.one *