Introduction · agentjail

agentjail is a policy guardrail for coding agents. It sits at the boundary between your agent and the tools it can call (the shell, the filesystem, the network) and blocks dangerous tool calls before they execute.

Why it exists

Coding agents are useful precisely because they take actions on your behalf. That same capability is the risk: a well-meaning but vague instruction can lead an agent to delete the wrong directory, leak a secret, or push to the wrong remote.

Most agent safety today relies on either a coarse sandbox (all-or-nothing, awkward to live in) or a permission prompt on every action (fine until there are dozens per task and fatigue sets in). agentjail takes a third path: it doesn’t try to make the agent smarter, and it doesn’t ask you to approve everything. It enforces policy at the boundary, regardless of the agent’s intent.

How it works

Every time your agent is about to use a tool, agentjail intercepts the call, evaluates it against your policy, and returns a verdict. Allowed calls run as normal; denied calls never reach the shell. On a deny, the agent receives a structured block response, so it explains what was blocked instead of proceeding.

  your agent ──tool call──▶  agentjail  ──allow──▶  shell / files / network
 (Claude Code)              (policy gate,  └─deny──▶  blocked
                             offline)                (e.g. rm -rf ~/.ssh)

Three properties make this practical:

It runs at the tool boundary: on the PreToolUse hook, before the command is ever handed to the shell.
The decision loop is offline: the allow/ask/deny verdict is computed locally, with no network round-trip and no model in the loop, which makes it fast and deterministic. (Network visibility, described below, is a separate, opt-in capability; it doesn’t put a network call on the decision path.)
It’s auditable: every rule is plain text you can read, diff, and version alongside your project.

What it can guard

A policy can inspect any tool call your agent makes, so you can write rules over:

Shell commands: block destructive or sensitive operations.
Filesystem paths: keep the agent out of ~/.ssh, .env, credentials, or anything outside the working tree.
Network access: stop exfiltration to unexpected hosts.
Git actions: prevent pushes to protected remotes or force-pushes.
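
As a rough illustration of two of these categories, the sketch below adds a filesystem rule and a Git rule in the same candidate style used elsewhere on this page. Several names here are assumptions rather than confirmed parts of agentjail's input schema: the file tool names (Read, Write, Edit), the input.tool_input.file_path field, the list of sensitive fragments, the rule IDs, and the force-push pattern are all hypothetical and would need adjusting to match your agent.

```rego
package agentjail

import future.keywords.if
import future.keywords.in
import future.keywords.contains

# Assumption: file tools are named like this and expose the target
# path as input.tool_input.file_path.
file_tools := {"Read", "Write", "Edit"}

# Hypothetical list of path fragments to keep the agent away from.
sensitive_fragments := {"/.ssh/", "/.env"}

# Filesystem paths: deny file tools whose path contains a sensitive fragment.
candidate contains r if {
  input.tool_name in file_tools
  some fragment in sensitive_fragments
  contains(input.tool_input.file_path, fragment)
  r := {
    "action":  "deny",
    "rule_id": "custom/my_policy/no-sensitive-files",
    "reason":  sprintf("Blocked: file path contains %s", [fragment]),
  }
}

# Git actions: return ask (not deny) when a shell command looks like a force-push.
candidate contains r if {
  input.tool_name == "Bash"
  regex.match(`\bgit\s+push\b.*\s(--force|-f)\b`, input.tool_input.command)
  r := {
    "action":  "ask",
    "rule_id": "custom/my_policy/confirm-force-push",
    "reason":  "Force-push detected: confirm before rewriting remote history",
  }
}
```

The substring and regex checks are deliberately simple. They match on the text of the call, so a real policy would likely need stricter path handling than this sketch shows.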

Beyond blocking: sandbox and network visibility

The hook is the first layer. Two more ship in the box:

OS-native sandbox. Launch your agent with agentjail claude (or agentjail run -- codex / agentjail run -- cursor) and it runs inside the kernel sandbox: Seatbelt on macOS, Landlock on Linux. Shell tricks, eval, and subprocesses that skip the hook are stopped at the syscall boundary, not just pattern-matched. See the OS-native sandbox page.
Network visibility (new in v1.0.0). See exactly what your agent sends its model. agentjail captures the full LLM request and response (on macOS, with no system extension). For everything else, the opt-in transparent tunnel (--tunnel) intercepts traffic as a man-in-the-middle and enforces per-host network policy, on both Linux and macOS, with HTTP/2 and gRPC support.

Together these are agentjail’s isolation tiers: a light hook by default, a kernel sandbox and full network capture when you want them.

What a rule looks like

Policies are written in Rego. A rule contributes a candidate verdict: an object with an action, rule_id, and reason. The central resolver picks the most restrictive candidate across all loaded rules:

package agentjail

import future.keywords.if
import future.keywords.contains

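# Deny any Bash command that mentions a path under a .ssh directory.
# The substring "/.ssh/" matches ~/.ssh/id_rsa, but not a bare ~/.ssh.
candidate contains r if {
  input.tool_name == "Bash"                     # only shell tool calls
  contains(input.tool_input.command, "/.ssh/")  # built-in substring check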
  r := {
    "action":  "deny",
    "rule_id": "custom/my_policy/no-ssh-access",
    "reason":  "Blocked: command targets sensitive path ~/.ssh/",
  }
}

One rule. Offline. No round-trips. There are three possible verdicts: allow, ask, or deny. The fail-safe default, when no rule fires, is ask, not allow. The policy model page covers how rules are matched and evaluated.
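
To make the "most restrictive candidate wins, ask when nothing fires" behaviour concrete, here is a minimal sketch of how such a resolver could be written in Rego. It is not agentjail's actual resolver. The package name agentjail_sketch, the rank table, the fallback object, and the resolve function are all hypothetical, and the tie-break between equally restrictive candidates is an arbitrary but deterministic choice made for this example.

```rego
package agentjail_sketch

import future.keywords.if
import future.keywords.in

# Higher number = more restrictive (assumed ordering: allow < ask < deny).
rank := {"allow": 0, "ask": 1, "deny": 2}

# Hypothetical fail-safe verdict used when no rule contributes a candidate.
fallback := {
  "action":  "ask",
  "rule_id": "sketch/default",
  "reason":  "No rule fired",
}

# No candidates: fall back to ask.
resolve(candidates) := fallback if {
  count(candidates) == 0
}

# Otherwise keep the candidates with the highest rank and return one of them.
resolve(candidates) := winner if {
  count(candidates) > 0
  top := max({rank[c.action] | some c in candidates})
  winners := [c | some c in candidates; rank[c.action] == top]
  ordered := sort(winners) # deterministic tie-break
  winner := ordered[0]
}

# Resolve whatever the loaded agentjail rules produced for this input.
verdict := v if {
  v := resolve(data.agentjail.candidate)
}

test_no_candidates_falls_back_to_ask if {
  v := resolve(set())
  v.action == "ask"
}

test_deny_beats_allow if {
  v := resolve({
    {"action": "allow", "rule_id": "a", "reason": "ok"},
    {"action": "deny", "rule_id": "b", "reason": "blocked"},
  })
  v.action == "deny"
}
```

If you have the stock OPA tooling installed, running opa test against a directory containing this file should exercise the two test rules. Whether agentjail policies can be evaluated that way is an assumption, not something this page confirms.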

Where it fits

agentjail is most useful anywhere an agent can touch something you care about:

Local development: a safety net while you let an agent work in your repo.
“Skip the prompts” workflows: when you run an agent with permission prompts disabled, agentjail is the boundary that still holds.

agentjail supports Claude Code, Codex, and Cursor on macOS and Linux. The installer auto-detects which agents are present and wires the hook for each.

Next steps

Quickstart: zero to a blocked tool call in about two minutes.
Installation: get agentjail running locally.
The policy model: how rules are written and evaluated.