Agent Exec Policy

concept
ai-agents, safety, tool-calling, human-in-the-loop, exec-approval

A three-axis policy model for human-in-the-loop approval of agent tool execution. The three axes are independent and composable: what commands are structurally permitted, when to prompt the human, and what to do when the human doesn’t respond.

The three axes

ExecSecurity — structural permission

Controls what the agent is allowed to execute at all, regardless of human approval:

  • deny — no execution permitted
  • allowlist — only pre-approved command patterns
  • full — any command (subject to the other two axes)

ExecAsk — when to prompt

Controls when the system pauses to ask the human for approval:

  • off — never ask, execute immediately (if ExecSecurity permits)
  • on-miss — ask only when the command doesn’t match the allowlist
  • always — ask for every execution, even allowlisted commands

ExecAsk fallback — timeout behavior

Controls what happens when the human doesn’t respond within the approval timeout:

  • full — execute anyway (optimistic: assume approval)
  • deny — refuse execution (pessimistic: assume denial)
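
Taken together, the three axes can be sketched as a small set of types and a decision function. This is a minimal illustration, not the actual implementation: the identifiers (ExecPolicy, decide, onTimeout) are hypothetical, and the rule that a non-allowlisted command under ExecSecurity=allowlist is refused when ExecAsk=off is an assumption about how the axes interact.

```typescript
// Hypothetical names; the axis values themselves come from the text above.
type ExecSecurity = "deny" | "allowlist" | "full";
type ExecAsk = "off" | "on-miss" | "always";
type AskFallback = "deny" | "full";

interface ExecPolicy {
  security: ExecSecurity;
  ask: ExecAsk;
  askFallback: AskFallback;
}

type Action = "run" | "refuse" | "prompt";

// Decide what to do with one command, given whether it matched the allowlist.
function decide(policy: ExecPolicy, onAllowlist: boolean): Action {
  // Structural permission comes first: deny cannot be overridden by a prompt.
  if (policy.security === "deny") return "refuse";
  // "always" prompts even for allowlisted commands.
  if (policy.ask === "always") return "prompt";
  if (onAllowlist) return "run";
  // From here on the command missed the allowlist.
  if (policy.ask === "on-miss") return "prompt";
  // ask === "off": nobody to ask, so only "full" may run (assumed behavior).
  return policy.security === "full" ? "run" : "refuse";
}

// Resolve a prompt that received no answer before the approval timeout.
function onTimeout(fallback: AskFallback): "run" | "refuse" {
  return fallback === "full" ? "run" : "refuse";
}
```

For instance, decide({ security: "allowlist", ask: "on-miss", askFallback: "deny" }, false) yields "prompt", and if the human never answers, onTimeout("deny") yields "refuse".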

Fail-closed composition

The key design principle: when multiple policy sources apply (host-level config, session-level config, agent-level config), the most restrictive policy wins. This is implemented as a minSecurity / maxAsk merge:

  • Security takes the minimum: if host says allowlist and session says full, the result is allowlist
  • Ask takes the maximum: if host says on-miss and session says off, the result is on-miss

Because the merge can only tighten, no policy source can silently broaden permissions: a tighter host policy cannot be overridden by a looser session policy. This matters because agent sessions can be initiated from untrusted channels (for example, an inbound WhatsApp message that triggers an agent run), so the host must be able to set a floor.
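
The minSecurity / maxAsk merge described above can be sketched as follows. The orderings (deny < allowlist < full, off < on-miss < always) follow from the text; the names PolicyLayer and mergeLayers, and the choice to return the most restrictive policy when no layers are supplied, are assumptions made for illustration. How the fallback axis is merged is not stated in this note, so it is left out.

```typescript
type ExecSecurity = "deny" | "allowlist" | "full";
type ExecAsk = "off" | "on-miss" | "always";

// Ordered from most restrictive to most permissive.
const SECURITY_ORDER: readonly ExecSecurity[] = ["deny", "allowlist", "full"];
// Ordered from least prompting to most prompting.
const ASK_ORDER: readonly ExecAsk[] = ["off", "on-miss", "always"];

function minSecurity(a: ExecSecurity, b: ExecSecurity): ExecSecurity {
  return SECURITY_ORDER.indexOf(a) <= SECURITY_ORDER.indexOf(b) ? a : b;
}

function maxAsk(a: ExecAsk, b: ExecAsk): ExecAsk {
  return ASK_ORDER.indexOf(a) >= ASK_ORDER.indexOf(b) ? a : b;
}

// Hypothetical shape for one policy source (host, session, or agent).
interface PolicyLayer {
  security: ExecSecurity;
  ask: ExecAsk;
}

function mergeLayers(layers: PolicyLayer[]): PolicyLayer {
  // Assumption: with no policy source at all, fail closed rather than open.
  if (layers.length === 0) return { security: "deny", ask: "always" };
  return layers.reduce<PolicyLayer>(
    (acc, layer) => ({
      security: minSecurity(acc.security, layer.security),
      ask: maxAsk(acc.ask, layer.ask),
    }),
    { security: "full", ask: "off" }, // identity element of the merge
  );
}
```

Merging a host layer of { security: "allowlist", ask: "on-miss" } with a session layer of { security: "full", ask: "off" } gives { security: "allowlist", ask: "on-miss" }, regardless of the order in which the layers are listed.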

Two-phase approval registration

The approval flow uses a registration-then-wait protocol to prevent a specific race condition:

  1. Register — the approval request is registered server-side with a unique ID, returning immediately
  2. Wait — the system blocks on a decision (approve/deny/timeout) for that ID

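The two-phase split is necessary because the approval command (/approve <id>) can arrive from a different channel or client than the one running the agent. If the ID were generated client-side and first sent to the server with the wait call, a fast human could send the approval before the server knew about the request, and the approval would refer to an ID the server had never seen. Registering first guarantees the server already holds the ID by the time anyone can approve it.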
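
A minimal in-memory sketch of the register-then-wait protocol is shown here. The class ApprovalRegistry and its method names are hypothetical, and the sequential IDs are for readability only; a real system would presumably use unguessable IDs. The point of the sketch is that a decision delivered between register and wait is retained instead of lost.

```typescript
type Decision = "approve" | "deny" | "timeout";

interface Pending {
  command: string;
  promise: Promise<Decision>;
  resolve: (d: Decision) => void;
}

class ApprovalRegistry {
  private nextId = 1;
  private pending = new Map<string, Pending>();

  // Phase 1: record the request and return its ID immediately.
  register(command: string): string {
    const id = `req-${this.nextId++}`;
    let resolve: (d: Decision) => void = () => {};
    // The executor runs synchronously, so resolve is set before we store it.
    const promise = new Promise<Decision>((r) => { resolve = r; });
    this.pending.set(id, { command, promise, resolve });
    return id;
  }

  // Handler for "/approve <id>" or a denial, possibly from another channel.
  decide(id: string, decision: "approve" | "deny"): boolean {
    const entry = this.pending.get(id);
    if (!entry) return false; // unknown ID: nothing to decide
    entry.resolve(decision);
    return true;
  }

  // Phase 2: block until a decision arrives or the timeout elapses.
  async wait(id: string, timeoutMs: number): Promise<Decision> {
    const entry = this.pending.get(id);
    if (!entry) throw new Error(`unknown approval id: ${id}`);
    let timer: ReturnType<typeof setTimeout> | undefined;
    const timeout = new Promise<Decision>((r) => {
      timer = setTimeout(() => r("timeout"), timeoutMs);
    });
    try {
      return await Promise.race([entry.promise, timeout]);
    } finally {
      clearTimeout(timer);
      this.pending.delete(id);
    }
  }
}
```

If decide(id, "approve") is called after register but before wait, the stored promise is already resolved and wait returns "approve" right away. A "timeout" result would then be mapped to run or refuse by the ExecAsk fallback setting.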

Per-agent allowlists

Approved commands are persisted as allowlist entries with:

  • Glob patterns for command matching
  • Argument patterns for parameter matching
  • Last-used timestamps
  • Source tracking (which approval created this entry)
  • SHA-256 hashed command fingerprints

The system builds a merged allowlist from wildcard (*) entries plus agent-specific entries. This enables different security postures per agent: a coding agent might have broad exec permissions while a customer-service agent has none.
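
The entry fields and the wildcard-plus-agent merge might look roughly like this. It is a sketch under several assumptions: the field names, the store layout keyed by agent ID, the choice to match the glob against the executable and the argument pattern against the argument string, and the tiny glob matcher (which supports only *) are all illustrative. The fingerprint uses Node's built-in crypto module.

```typescript
import { createHash } from "node:crypto";

// Hypothetical entry shape covering the fields listed above.
interface AllowlistEntry {
  commandGlob: string;   // glob matched against the executable, e.g. "git"
  argPattern?: string;   // optional glob matched against the argument string
  lastUsedAt?: number;   // epoch milliseconds
  source: string;        // which approval created this entry
  fingerprint: string;   // SHA-256 of the originally approved command
}

// Keyed by agent ID; the "*" key holds entries that apply to every agent.
type AllowlistStore = Record<string, AllowlistEntry[]>;

function fingerprint(command: string): string {
  return createHash("sha256").update(command).digest("hex");
}

// Simplified glob: only "*" is special; everything else matches literally.
function globToRegExp(glob: string): RegExp {
  const escaped = glob
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&")
    .replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`);
}

// Wildcard entries first, then the agent's own entries.
function mergedAllowlist(store: AllowlistStore, agentId: string): AllowlistEntry[] {
  return [...(store["*"] ?? []), ...(store[agentId] ?? [])];
}

function isAllowlisted(
  store: AllowlistStore,
  agentId: string,
  executable: string,
  args: string,
): boolean {
  return mergedAllowlist(store, agentId).some(
    (e) =>
      globToRegExp(e.commandGlob).test(executable) &&
      (e.argPattern === undefined || globToRegExp(e.argPattern).test(args)),
  );
}
```

With a wildcard entry for ls and a coder-specific entry for git with argument pattern "status*", a coder agent may run git status --short, while a support agent with no entries of its own can still run ls but not git.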

Approval decisions

When the human is prompted, there are three options:

  • allow-once — permit this specific execution
  • allow-always — add the command pattern to the persistent allowlist
  • deny — refuse this execution

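When ExecAsk=always, the allow-always option is deliberately removed. Under that setting every execution prompts, even for allowlisted commands, so a persistent entry would not skip any future prompt. If you set “always ask,” the system respects that literally.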
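
The option set can be derived from the ExecAsk setting, as in this short sketch. The function names and the Outcome shape are hypothetical; treating a decision that was never offered as a refusal is an assumption, chosen to stay consistent with the fail-closed principle.

```typescript
type ExecAsk = "off" | "on-miss" | "always";
type ApprovalDecision = "allow-once" | "allow-always" | "deny";

// Options shown to the human; allow-always is withheld under ask=always.
function availableDecisions(ask: ExecAsk): ApprovalDecision[] {
  return ask === "always"
    ? ["allow-once", "deny"]
    : ["allow-once", "allow-always", "deny"];
}

interface Outcome {
  run: boolean;      // execute this command now
  persist: boolean;  // add the command pattern to the persistent allowlist
}

function applyDecision(ask: ExecAsk, decision: ApprovalDecision): Outcome {
  // Assumption: a decision that was not offered is treated as a refusal.
  if (availableDecisions(ask).indexOf(decision) === -1) {
    return { run: false, persist: false };
  }
  return { run: decision !== "deny", persist: decision === "allow-always" };
}
```

Under ask "on-miss", an allow-always decision both runs the command and persists the pattern; under ask "always", the same decision is not on offer and is refused.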

Sandbox integration

The exec policy layer composes with a separate sandbox layer (Docker, SSH). When sandboxed, tool execution runs in an isolated environment with blocked host paths (/etc, /proc, .ssh, .aws, .docker), symlink escape hardening, and network namespace restrictions. The sandbox provides defense-in-depth: even if exec policy permits a command, the sandbox constrains its blast radius.

A non-main sandbox mode sandboxes all sessions except the operator’s direct chat, acknowledging that the operator’s own session carries a different trust level from an externally triggered agent run.
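
The non-main rule reduces to a small predicate. In this sketch only the "non-main" value comes from the text; the "off" and "all" modes and the isMainSession flag are hypothetical, added so the function has something to contrast against.

```typescript
// "non-main" is described above; "off" and "all" are assumed for illustration.
type SandboxMode = "off" | "non-main" | "all";

// isMainSession: true only for the operator's direct chat (assumed flag).
function shouldSandbox(mode: SandboxMode, isMainSession: boolean): boolean {
  switch (mode) {
    case "off":
      return false;
    case "all":
      return true;
    case "non-main":
      // Everything except the operator's own session is sandboxed.
      return !isMainSession;
  }
}
```

An agent run triggered by an inbound message would call shouldSandbox("non-main", false) and be sandboxed, while the operator's direct chat, with isMainSession set to true, would not.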

  • OpenClaw implements this in src/infra/exec-approvals.ts and src/agents/sandbox/
  • Prompt Injection Robustness — exec policy is the enforcement layer that makes prompt injection resistance practical