Managed Agents Architecture

source Apr 9, 2026

ai-agents, infrastructure, anthropic, decoupling, managed-agents

Source summary of “Scaling Managed Agents: Decoupling the brain from the hands” by Lance Martin, Gabe Cemaj, and Michael Cohen (Anthropic Engineering Blog).

Problem

Agent harnesses encode assumptions about model limitations, for example adding context resets to prevent “context anxiety.” These assumptions go stale as models improve: the same resets became dead weight when moving from Claude Sonnet 4.5 to Opus 4.5. The challenge is designing a system for “programs as yet unthought of,” meaning harnesses that have not been written yet.

Solution: virtualize agent components

Managed Agents applies the operating system pattern: virtualize components into abstractions general enough for implementations that don’t exist yet. Three components, each behind a stable interface:

Session — append-only event log. Interface: emitEvent(id, event), getEvents(). Durable storage outside both the harness and the context window.
Harness (brain) — the loop that calls Claude and routes tool calls. Stateless. Interface: wake(sessionId), getSession(id). Recoverable by rebooting and replaying from the event log.
Sandbox (hands) — execution environment. Interface: execute(name, input) → string, provision({resources}). Replaceable — the harness doesn’t know if the sandbox is a container, a phone, or a Pokémon emulator.
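
The three interfaces above can be sketched roughly as follows. This is a minimal illustration, not Anthropic's implementation: the method names are snake_case renderings of the names in the summary, getEvents is assumed to take a session id, getSession(id) is omitted, and InMemorySession, EchoSandbox, and scripted_model are hypothetical stand-ins for durable storage, a real execution environment, and a call to Claude.

```python
from typing import Callable, Protocol


class Session(Protocol):
    """Append-only event log; mirrors emitEvent(id, event) and getEvents()."""

    def emit_event(self, session_id: str, event: dict) -> None: ...
    def get_events(self, session_id: str) -> list[dict]: ...


class Sandbox(Protocol):
    """Execution environment; mirrors provision({resources}) and execute(name, input)."""

    def provision(self, resources: dict) -> None: ...
    def execute(self, name: str, tool_input: dict) -> str: ...


class InMemorySession:
    """Toy stand-in for durable storage (a real log would live outside the process)."""

    def __init__(self) -> None:
        self._log: dict[str, list[dict]] = {}

    def emit_event(self, session_id: str, event: dict) -> None:
        self._log.setdefault(session_id, []).append(dict(event))

    def get_events(self, session_id: str) -> list[dict]:
        return [dict(e) for e in self._log.get(session_id, [])]


class EchoSandbox:
    """Toy sandbox whose only tool echoes its input back as a string."""

    def provision(self, resources: dict) -> None:
        self.resources = dict(resources)

    def execute(self, name: str, tool_input: dict) -> str:
        return f"{name} -> {tool_input['msg']}"


class Harness:
    """Stateless loop: each turn re-reads the session instead of keeping local state."""

    def __init__(self, session: Session, sandbox: Sandbox,
                 model: Callable[[list[dict]], dict]) -> None:
        self.session, self.sandbox, self.model = session, sandbox, model

    def wake(self, session_id: str) -> str:
        while True:
            action = self.model(self.session.get_events(session_id))
            if action["type"] == "final":
                self.session.emit_event(session_id, action)
                return action["text"]
            output = self.sandbox.execute(action["name"], action["input"])
            self.session.emit_event(
                session_id,
                {"type": "tool_result", "name": action["name"], "output": output},
            )


def scripted_model(events: list[dict]) -> dict:
    """Hypothetical stand-in for a Claude call: one tool call, then a final answer."""
    if any(e["type"] == "tool_result" for e in events):
        return {"type": "final", "text": "done"}
    return {"type": "tool_call", "name": "echo", "input": {"msg": "hi"}}
```

Because Harness only depends on the two protocols, swapping EchoSandbox for any other object with the same two methods requires no change to the loop.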

Evolution: pet → cattle

The initial design placed all components in a single container. This created a “pet” — if it failed, the session was lost. Debugging required shelling into containers that also held user data. Connecting to customer VPCs required network peering because the harness assumed all resources were co-located.

After decoupling: containers are cattle. If one dies, the harness catches it as a tool-call error. If the harness dies, a new one boots with wake(sessionId) and resumes from the last event. No in-process state needs to survive a crash, because everything required to resume is already in the session log.
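
A small sketch of both failure modes, under simplifying assumptions: the session log is a plain list, a fixed list of steps stands in for the model's decisions, and the crash_after parameter exists only to simulate a harness dying mid-run. All names here are hypothetical.

```python
class SandboxDied(Exception):
    """Raised by the toy sandbox to simulate a container failure."""


def run_tool(sandbox_execute, name, tool_input):
    """Run one tool call; a sandbox failure becomes an ordinary error result."""
    try:
        output = sandbox_execute(name, tool_input)
        return {"type": "tool_result", "name": name, "ok": True, "output": output}
    except Exception as exc:
        return {"type": "tool_result", "name": name, "ok": False,
                "output": f"tool error: {exc}"}


def wake(log, steps, sandbox_execute, crash_after=None):
    """Resume a session: skip steps already recorded in the log, run the rest."""
    done = sum(1 for event in log if event["type"] == "tool_result")
    for index, (name, tool_input) in enumerate(steps[done:], start=done):
        if crash_after is not None and index == crash_after:
            raise RuntimeError("harness crashed")  # simulated harness failure
        log.append(run_tool(sandbox_execute, name, tool_input))
    return log


def flaky_execute(name, tool_input):
    """Toy sandbox: tool 'b' always fails, every other tool upper-cases its name."""
    if name == "b":
        raise SandboxDied("container gone")
    return name.upper()
```

Calling wake a second time on the same log picks up at the first unrecorded step, so a replacement harness neither repeats nor loses work, and the failed tool call appears in the log as a result the model can react to.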

Session as context object

The session log is not Claude’s context window — it is a separate, durable store that the harness interrogates via getEvents(). This separates two concerns: recoverable context storage (the session) and arbitrary context management (the harness). The harness can slice, rewind, or transform events before passing them into Claude’s context window. See Session as Context Object.
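
As an illustration of the split between storage and context management, the sketch below shows one possible transformation a harness might apply to fetched events. The event shape, the function names, and the trimming policy (keep the first event, the most recent few, and shorten long tool outputs) are assumptions made for the example; the source does not prescribe any particular policy.

```python
def build_context(events, keep_last=3, max_output_chars=40):
    """Return a trimmed copy of the log suitable for a context window.

    Keeps the first event (the original request), the last `keep_last`
    events, and shortens long tool outputs. The input list is not modified.
    """
    if not events:
        return []
    tail = events[1:][-keep_last:] if keep_last > 0 else []
    view = []
    for event in [events[0], *tail]:
        event = dict(event)  # copy so the durable log stays untouched
        text = event.get("text", "")
        if event["type"] == "tool_result" and len(text) > max_output_chars:
            event["text"] = text[:max_output_chars] + " [truncated]"
        view.append(event)
    return view


def rewind(events, to_index):
    """Return the log as it stood before event number `to_index`."""
    return [dict(event) for event in events[:to_index]]
```

Both functions return new lists, so the harness can try a different slicing on the next turn while the full history remains recoverable from the session.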

Security boundary

Credentials never enter the sandbox. Two patterns:

Auth bundled with resource — Git tokens are used to clone repos during sandbox initialization, wired into the local git remote. Push/pull work without the agent handling tokens.
Auth in vault — OAuth tokens live in a secure vault. Claude calls MCP tools via a proxy that fetches credentials from the vault per-session. The harness never sees credentials either.
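
The vault pattern can be pictured with a toy proxy. This is a sketch of the idea only: Vault, ToolProxy, and the upstream callable are hypothetical names, and a real deployment would use an actual secret store and MCP transport rather than in-process objects.

```python
class Vault:
    """Hypothetical secret store keyed by (session_id, service)."""

    def __init__(self, secrets):
        self._secrets = dict(secrets)

    def fetch(self, session_id, service):
        return self._secrets[(session_id, service)]


class ToolProxy:
    """Sits between the caller and the tool server and attaches credentials."""

    def __init__(self, vault, upstream):
        self._vault = vault
        self._upstream = upstream  # callable(service, tool, args, token) -> str

    def call_tool(self, session_id, service, tool, args):
        # The token is looked up here, per session, and passed only upstream.
        token = self._vault.fetch(session_id, service)
        return self._upstream(service, tool, args, token)


def make_recording_upstream(seen_tokens):
    """Build a fake tool server that records the tokens it was called with."""

    def upstream(service, tool, args, token):
        seen_tokens.append(token)
        return f"{service}.{tool} ok"

    return upstream
```

The caller's signature, call_tool(session_id, service, tool, args), has no token parameter and the return value carries no token, which is the property the pattern relies on.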

Performance: many brains, many hands

Decoupling enables on-demand provisioning. Containers spin up only when the brain makes a tool call, not at session start. Result: p50 time-to-first-token (TTFT) dropped ~60%, and p95 dropped >90%.
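
On-demand provisioning amounts to deferring container creation until the first execute call. A minimal sketch, assuming a hypothetical provision callable that returns something able to run tools:

```python
class LazySandbox:
    """Defers provisioning until the first tool call actually needs a container."""

    def __init__(self, provision):
        self._provision = provision  # callable() -> callable(name, tool_input) -> str
        self._container = None

    def execute(self, name, tool_input):
        if self._container is None:
            # First tool call: pay the startup cost now, not at session start.
            self._container = self._provision()
        return self._container(name, tool_input)


def make_counting_provisioner(counter):
    """Build a fake provisioner that counts how many containers it has started."""

    def provision():
        counter["started"] += 1
        return lambda name, tool_input: f"ran {name}"

    return provision
```

A session that never calls a tool never starts a container, so the model's first tokens are not blocked on container startup.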

The architecture also supports multiple brains connected to multiple hands. Each hand is just execute(name, input) → string — it could be any custom tool, any MCP server, or Anthropic’s own tools. Brains can pass hands to one another.
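
Since every hand reduces to the same call shape, a brain can hold hands in a plain mapping. The sketch below is illustrative: Brain, attach, and pass_hand are hypothetical names, and passing is modeled as sharing a reference, since the summary does not say whether the sender keeps access.

```python
from typing import Callable

# A hand is anything with the shape execute(name, input) -> string.
Hand = Callable[[str, dict], str]


class Brain:
    """Holds named hands and routes tool calls to them."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.hands: dict[str, Hand] = {}

    def attach(self, hand_name: str, hand: Hand) -> None:
        self.hands[hand_name] = hand

    def execute(self, hand_name: str, tool: str, tool_input: dict) -> str:
        return self.hands[hand_name](tool, tool_input)

    def pass_hand(self, hand_name: str, other: "Brain") -> None:
        # Assumption: passing shares the reference; the sender keeps access.
        other.attach(hand_name, self.hands[hand_name])


def container_hand(tool: str, tool_input: dict) -> str:
    """Toy hand standing in for a container-backed sandbox."""
    return f"container ran {tool}"


def mcp_hand(tool: str, tool_input: dict) -> str:
    """Toy hand standing in for an MCP server."""
    return f"mcp ran {tool}"
```

Neither Brain.execute nor pass_hand inspects what a hand is, so a container, an MCP server, or another brain wrapped in the same call shape can be attached interchangeably.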

Connections

The meta-harness concept: opinionated about interfaces, not implementations
Harness staleness: why harnesses need to be swappable
Brain-hands decoupling: the core architectural pattern
Hermes Agent takes a different approach — bundling terminal backends, memory, and skills into a single agent process, though its 6 terminal backends (local, Docker, SSH, Daytona, Modal, Singularity) parallel the “many hands” idea
Context window compression: the session-as-context-object pattern is an alternative to summarize-and-discard
Credential pool pattern: solves a related problem (safe credential access) with a different mechanism (failover rotation vs. vault isolation)
The user’s observation that an AI agent is always physical — Managed Agents makes the brain/hands split explicit, treating execution environments as abstract “hands” regardless of substrate

Backlinks

Building Effective Agents

Directly extends the Managed Agents Architecture by the same organization — that post covers infrastructure, this one covers design patterns

The brain-hands split in Brain-Hands Decoupling is the infrastructure counterpart to the augmented LLM concept here

The orchestrator-workers pattern maps to the "many brains, many hands" idea in Managed Agents

The ACI concept connects to tool design patterns in Hermes Agent and OpenClaw

→

Brain-Hands Decoupling

Managed Agents Architecture: the system that implements this pattern

Meta-harness: the design philosophy that motivates decoupling

Hermes Agent takes the opposite approach — a single process with 6 terminal backends. This works for a developer tool but limits infrastructure flexibility

Capsules: isolated environments for agents — the "hands" in this pattern

The user notes that the boundary between software and hardware is an implementation detail — brain-hands decoupling embodies this: the brain's interface abstracts away whether the hands are software containers or physical devices

→

Building Effective Agents

The infrastructure counterpart to this design guide is Managed Agents Architecture by the same organization — that post covers the brain/hands/session split, this one covers the patterns running inside the brain

The orchestrator-workers pattern directly maps to the "many brains, many hands" architecture in Brain-Hands Decoupling

The emphasis on tool quality resonates with Hermes Agent's 40+ tools and skill system — both argue that tool design matters more than prompt design

The evaluator-optimizer pattern is structurally identical to the Agent Learning Loop: generate, evaluate, improve

The recommendation to start simple and add complexity connects to the knowledge base's theme around Autonomy With Acceptable Quality

→

Harness Staleness

Meta-harness: the architectural response to harness staleness

Managed Agents Architecture: the system built around swappable harnesses

Context window compression: one category of harness assumption that may go stale as models handle longer contexts natively

→

Meta-Harness

Managed Agents Architecture: Anthropic's implementation of the meta-harness concept

Brain-hands decoupling: the architectural pattern that makes the meta-harness possible

Harness staleness: the problem the meta-harness solves

→

AI Agents

Capsules Isolated Environments for AI Agents — isolated, reproducible environments for agents

Clawdbot Capsules and Self Evolving Agents — minimal core + self-development

My Digital Twin Starts With Claude Code — personal knowledge graph from Claude Code sessions

Hermes Agent — self-improving multi-platform agent framework with learning loop, skills, and RL training (Nous Research)

OpenClaw — personal AI assistant gateway: 24+ messaging channels, typed plugin adapters, embedded agent runtime, native companion apps

Agent Learning Loop — memory + skills + session search forming a closed self-improvement cycle

Sleep-Phase Memory Consolidation — offline three-phase (light/REM/deep) memory consolidation with evidence accumulation thresholds

Context Window Compression — auto-summarizing old conversation turns to stay within context limits

Credential Pool Pattern — multi-credential failover with selection strategies for agent infrastructure

Managed Agents Architecture — Anthropic's hosted long-horizon agent service: brain/hands/session decoupling

Brain-Hands Decoupling — separating reasoning from execution behind stable interfaces

Meta-Harness — system designed for harnesses that don't exist yet

Harness Staleness — harnesses encode assumptions that go stale as models improve

Session as Context Object — durable event log as interrogable context outside the context window

Multi-Channel AI Gateway — single-daemon architecture routing one agent across many messaging platforms

Channel Adapter Pattern — typed composition of optional interfaces for messaging channel plugins

→

Orchestrator-Workers Pattern

The Managed Agents Architecture is the infrastructure for this pattern at scale — the brain (orchestrator) calls many hands (workers) through execute(name, input) -> string

Brain-Hands Decoupling formalizes the interface between orchestrator and workers

CORAL extends this into autonomous multi-agent evolution — agents act as both orchestrators and workers, with shared persistent memory replacing explicit delegation

Cross-Agent Knowledge Transfer describes what happens when workers can learn from each other's results, which the basic pattern doesn't include

The user's blog post From Solo Sessions to Agent Orchestras describes the human experience of scaling from a single agent to an orchestrated team — the human becomes the orchestrator

→

OpenClaw

Architecturally parallel to Hermes Agent: same multi-platform gateway + skills + tools + session pattern, different language (TypeScript vs Python) and different design philosophy (composition-based adapters vs inheritance-based ABC). OpenClaw has stricter plugin boundaries and a richer native companion app story; Hermes Agent has a built-in learning loop and RL training infrastructure.

The channel adapter pattern is a typed variant of the plugin composition approach described in managed agents architecture — tools and capabilities compose without inheritance.

Skills system parallels Hermes Agent's SKILL.md format and the agent learning loop pattern, though OpenClaw's skills are not self-created by the agent (they're authored by users or shipped by plugins).

The gateway-as-control-plane design echoes the brain-hands decoupling pattern: reasoning (agent runtime) and execution (channel adapters, tool backends, nodes) are separated behind stable typed interfaces.

Context compaction mirrors context window compression — auto-summarizing when approaching limits, preserving recent context, keeping cached prefixes stable. OpenClaw adds staged summarization for large histories and an identifier preservation policy.

The dreaming system (memory-core plugin) implements sleep-phase memory consolidation — three-phase offline consolidation (light/REM/deep) with evidence accumulation thresholds before durable promotion. This is the offline complement to the agent learning loop.

Tool loop detection uses content-aware result hashing to distinguish legitimate polling from stuck loops, with four independent detectors and escalating response (warn -> critical -> circuit break).

The agent exec policy implements three-axis human-in-the-loop tool approval (security x ask x fallback) with fail-closed composition across host and session policies.

Two-tier model failover extends the credential pool pattern: auth profile rotation (inner loop) + model fallback chain (outer loop) + cooldown probing near expiry.

→

Session as Context Object

Managed Agents Architecture: the system that implements this pattern

Context window compression: one of the transformations the harness can apply to events fetched from the session

Shared persistent memory (CORAL): a related pattern for multi-agent systems — agents coordinate through a durable filesystem rather than direct messaging. The session log serves a similar role for a single agent across time.

→