Managed Agents Architecture

source
ai-agents, infrastructure, anthropic, decoupling, managed-agents

Source summary of “Scaling Managed Agents: Decoupling the brain from the hands” by Lance Martin, Gabe Cemaj, and Michael Cohen (Anthropic Engineering Blog).

Problem

Agent harnesses encode assumptions about model limitations, for example adding context resets to counter “context anxiety,” where the model wraps up a task early as it senses its context limit approaching. These assumptions go stale as models improve: resets added for Claude Sonnet 4.5 became dead weight on Claude Opus 4.5. The challenge is to design a system for “programs as yet unthought of,” meaning harnesses that have not been written yet.

Solution: virtualize agent components

Managed Agents applies the operating system pattern: virtualize components into abstractions general enough for implementations that don’t exist yet. Three components, each behind a stable interface:

  • Session — append-only event log. Interface: emitEvent(id, event), getEvents(). Durable storage outside both the harness and the context window.
  • Harness (brain) — the loop that calls Claude and routes tool calls. Stateless. Interface: wake(sessionId), getSession(id). Recoverable by rebooting and replaying from the event log.
  • Sandbox (hands) — execution environment. Interface: execute(name, input) → string, provision({resources}). Replaceable — the harness doesn’t know if the sandbox is a container, a phone, or a Pokémon emulator.
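
As a rough sketch of the three interfaces above, the Python below restates them as typing.Protocol classes and adds a toy in-memory session. The snake_case names mirror emitEvent, getEvents, wake, getSession, execute, and provision from the summary. The argument and return types, the session_id parameter on get_events, and the InMemorySession and EchoSandbox classes are assumptions made for illustration, not the published API.

```python
from typing import Any, Protocol

Event = dict[str, Any]


class Session(Protocol):
    """Append-only event log: emitEvent(id, event), getEvents()."""

    def emit_event(self, session_id: str, event: Event) -> None: ...
    def get_events(self, session_id: str) -> list[Event]: ...


class Harness(Protocol):
    """Stateless loop: wake(sessionId), getSession(id)."""

    def wake(self, session_id: str) -> None: ...
    def get_session(self, session_id: str) -> Any: ...  # return type is assumed


class Sandbox(Protocol):
    """Execution environment: provision({resources}), execute(name, input)."""

    def provision(self, resources: dict[str, Any]) -> None: ...
    def execute(self, name: str, input: Event) -> str: ...


class InMemorySession:
    """Hypothetical stand-in; a real session store would be durable."""

    def __init__(self) -> None:
        self._log: dict[str, list[Event]] = {}

    def emit_event(self, session_id: str, event: Event) -> None:
        # Events are only ever appended, never edited or removed.
        self._log.setdefault(session_id, []).append(event)

    def get_events(self, session_id: str) -> list[Event]:
        # Return a copy so callers cannot mutate the stored log.
        return list(self._log.get(session_id, []))


class EchoSandbox:
    """Hypothetical sandbox that only reports which tool it was asked to run."""

    def provision(self, resources: dict[str, Any]) -> None:
        self.resources = resources

    def execute(self, name: str, input: Event) -> str:
        return f"ran {name}"
```

Because the harness would depend only on the Sandbox protocol, any object with these two methods could stand in for a container, which is the sense in which the hands are replaceable.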

Evolution: pet → cattle

The initial design placed all components in a single container. This created a “pet” — if it failed, the session was lost. Debugging required shelling into containers that also held user data. Connecting to customer VPCs required network peering because the harness assumed all resources were co-located.

After decoupling, containers are cattle. If one dies, the harness catches the failure as a tool-call error. If the harness itself dies, a new one boots with wake(sessionId) and resumes from the last event. No harness state needs to survive a crash, because the session log is stored outside the harness.
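
The two failure paths can be sketched as follows. This is a minimal illustration under stated assumptions: the event shapes, the SandboxDied exception, the FlakySandbox class, and the dictionary returned by wake are all hypothetical, and a plain dict stands in for the durable session store.

```python
class SandboxDied(Exception):
    """Hypothetical error raised when a container disappears."""


class FlakySandbox:
    """Hypothetical sandbox that fails a fixed number of times, then works."""

    def __init__(self, fail_times: int) -> None:
        self.remaining_failures = fail_times

    def execute(self, name: str, input: dict) -> str:
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise SandboxDied("container exited")
        return f"ran {name}"


def run_tool(log: list, sandbox, name: str, input: dict) -> None:
    """Record the call and its outcome; a dead sandbox becomes an error event."""
    log.append({"type": "tool_call", "name": name, "input": input})
    try:
        output = sandbox.execute(name, input)
        log.append({"type": "tool_result", "name": name, "output": output})
    except SandboxDied as exc:
        log.append({"type": "tool_error", "name": name, "error": str(exc)})


def wake(store: dict, session_id: str) -> dict:
    """A fresh harness holds no state: it finds its place from the log alone."""
    events = store[session_id]
    last = events[-1] if events else None
    needs_retry = last is not None and last["type"] == "tool_error"
    return {"resume_after": len(events), "needs_retry": needs_retry}
```

In this sketch a replacement harness calls wake, sees that the last event was a tool error, and can decide to retry against a new sandbox without any memory of the previous process.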

Session as context object

The session log is not Claude’s context window — it is a separate, durable store that the harness interrogates via getEvents(). This separates two concerns: recoverable context storage (the session) and arbitrary context management (the harness). The harness can slice, rewind, or transform events before passing them into Claude’s context window. See Session as Context Object.
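
One way to picture the split between storage and context management is a harness-side function that builds a view of the log for the model while leaving the stored events untouched. The sketch below assumes events are plain dicts; build_context, truncate_output, and the keep_last and limit parameters are hypothetical names chosen for illustration.

```python
from typing import Callable

Event = dict


def identity(event: Event) -> Event:
    return event


def build_context(
    events: list,
    keep_last: int,
    transform: Callable[[Event], Event] = identity,
) -> list:
    """Slice and transform events for the model; the stored log is not modified."""
    window = events[-keep_last:] if keep_last > 0 else []
    # Copy each event so the transform cannot alter the durable record.
    return [transform(dict(event)) for event in window]


def truncate_output(event: Event, limit: int = 20) -> Event:
    """Example transform: shorten long tool outputs before they reach the model."""
    if event.get("type") == "tool_result" and len(event["output"]) > limit:
        event["output"] = event["output"][:limit] + "...[truncated]"
    return event
```

A different harness could swap in another slicing or transform policy, and since the full log is still available through getEvents(), an earlier decision to drop or shorten something can be revisited later.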

Security boundary

Credentials never enter the sandbox. Two patterns:

  1. Auth bundled with resource — Git tokens are used to clone repos during sandbox initialization, wired into the local git remote. Push/pull work without the agent handling tokens.
  2. Auth in vault — OAuth tokens live in a secure vault. Claude calls MCP tools via a proxy that fetches credentials from the vault per-session. The harness never sees credentials either.
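
The vault pattern can be illustrated with a small proxy that looks up a token per session and calls the external service itself, so the caller only ever receives the tool's result. Everything here is a hypothetical stand-in: Vault, McpProxy, fake_tracker, and the token value are invented for the sketch and do not describe the real proxy or any MCP library API.

```python
class Vault:
    """Hypothetical secret store keyed by (session_id, service)."""

    def __init__(self, secrets: dict) -> None:
        self._secrets = secrets

    def fetch(self, session_id: str, service: str) -> str:
        return self._secrets[(session_id, service)]


def fake_tracker(token: str, input: dict) -> str:
    """Hypothetical external service that checks the token it is given."""
    if token != "tok-123":
        raise PermissionError("bad token")
    return f"created issue: {input['title']}"


class McpProxy:
    """Sits between the harness and external tools; only it touches the vault."""

    def __init__(self, vault: Vault, backends: dict) -> None:
        self._vault = vault
        self._backends = backends  # service name -> callable(token, input) -> str

    def call(self, session_id: str, service: str, input: dict) -> str:
        # The token is fetched per call and never returned to the caller.
        token = self._vault.fetch(session_id, service)
        return self._backends[service](token, input)
```

The harness and the sandbox would hold only a session identifier and a tool name. The token exists inside the proxy for the duration of one call.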

Performance: many brains, many hands

Decoupling enables on-demand provisioning. Containers spin up only when the brain makes a tool call, not at session start. Result: p50 time-to-first-token (TTFT) dropped by roughly 60%, and p95 by more than 90%.
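
On-demand provisioning can be sketched as a thin wrapper that defers creating the sandbox until the first execute call. LazySandbox and CountingContainer are hypothetical names, and the counter exists only to make the deferred start observable.

```python
class CountingContainer:
    """Hypothetical container that counts how many times one was started."""

    started = 0

    def __init__(self) -> None:
        CountingContainer.started += 1

    def execute(self, name: str, input: dict) -> str:
        return f"ran {name}"


class LazySandbox:
    """Creates the real sandbox on the first tool call, not at session start."""

    def __init__(self, factory) -> None:
        self._factory = factory
        self._sandbox = None

    def execute(self, name: str, input: dict) -> str:
        if self._sandbox is None:
            # First tool call: pay the startup cost now.
            self._sandbox = self._factory()
        return self._sandbox.execute(name, input)
```

A session that never calls a tool never starts a container in this sketch, and the model can begin responding before any container exists.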

The architecture also supports multiple brains connected to multiple hands. Each hand is just execute(name, input) → string — it could be any custom tool, any MCP server, or Anthropic’s own tools. Brains can pass hands to one another.
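
Since every hand has the same shape, a brain can be modeled as holding a table of callables. The sketch below is illustrative only: Brain, attach, pass_hand, and the two example hands are hypothetical, and treating a pass as a transfer rather than a share is an assumption.

```python
from typing import Callable

# A hand is anything callable as execute(name, input) -> str.
Hand = Callable[[str, dict], str]


def container_hand(name: str, input: dict) -> str:
    return f"container ran {name}"


def mcp_hand(name: str, input: dict) -> str:
    return f"mcp server ran {name}"


class Brain:
    def __init__(self) -> None:
        self.hands: dict = {}

    def attach(self, label: str, hand: Hand) -> None:
        self.hands[label] = hand

    def execute(self, label: str, name: str, input: dict) -> str:
        # The brain does not know what kind of hand sits behind the label.
        return self.hands[label](name, input)

    def pass_hand(self, label: str, other: "Brain") -> None:
        # Assumption: passing transfers the hand rather than sharing it.
        other.attach(label, self.hands.pop(label))
```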

Connections

  • The meta-harness concept: opinionated about interfaces, not implementations
  • Harness staleness: why harnesses need to be swappable
  • Brain-hands decoupling: the core architectural pattern
  • Hermes Agent takes a different approach — bundling terminal backends, memory, and skills into a single agent process, though its 6 terminal backends (local, Docker, SSH, Daytona, Modal, Singularity) parallel the “many hands” idea
  • Context window compression: the session-as-context-object pattern is an alternative to summarize-and-discard
  • Credential pool pattern: solves a related problem (safe credential access) with a different mechanism (failover rotation vs. vault isolation)
  • The user’s observation that an AI agent is always physical — Managed Agents makes the brain/hands split explicit, treating execution environments as abstract “hands” regardless of substrate