Evaluator-Optimizer Pattern
An agentic workflow pattern in which one LLM generates a response and another evaluates it, iterating in a loop until quality criteria are met. It is one of the five agentic workflow patterns described in Anthropic's Building Effective Agents.
Structure
Generator LLM --> Output --> Evaluator LLM --> Feedback
      ^                                           |
      +-------------------------------------------+
                 (loop until satisfied)
The generator produces; the evaluator critiques. The generator incorporates feedback and tries again. The loop terminates when the evaluator approves or a maximum iteration count is reached.
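The loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: `Evaluation`, `evaluator_optimizer`, `toy_generate`, and `toy_evaluate` are hypothetical names, and the two toy functions are plain stand-ins for what would be LLM calls in a real system.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Evaluation:
    approved: bool
    feedback: str  # critique passed back to the generator when not approved


def evaluator_optimizer(
    task: str,
    generate: Callable[[str, Optional[str], Optional[str]], str],
    evaluate: Callable[[str, str], Evaluation],
    max_iterations: int = 3,
) -> tuple[Optional[str], int, bool]:
    """Return (final draft, iterations used, whether the evaluator approved)."""
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")
    draft: Optional[str] = None
    feedback: Optional[str] = None
    for iteration in range(1, max_iterations + 1):
        # The generator sees the task, its previous draft, and the latest critique.
        draft = generate(task, draft, feedback)
        verdict = evaluate(task, draft)
        if verdict.approved:
            return draft, iteration, True
        feedback = verdict.feedback
    # Iteration cap reached without approval: return the last draft anyway.
    return draft, max_iterations, False


# Hypothetical stand-ins for the two LLM calls.
def toy_generate(task: str, previous: Optional[str], feedback: Optional[str]) -> str:
    if previous is None:
        return "hello world"
    return previous.capitalize() + "."


def toy_evaluate(task: str, draft: str) -> Evaluation:
    if draft[0].isupper() and draft.endswith("."):
        return Evaluation(True, "")
    return Evaluation(False, "Capitalize the first letter and end with a period.")
```

With these stubs, `evaluator_optimizer("greet", toy_generate, toy_evaluate)` returns `("Hello world.", 2, True)`: the first draft is rejected, the second is approved. Returning the approval flag alongside the draft lets the caller distinguish an approved result from one that merely hit the iteration cap.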
When to use
Two diagnostic questions:
- Can a human articulate feedback that demonstrably improves the output? If so, an LLM evaluator can likely provide similar feedback.
- Are there clear evaluation criteria? The evaluator needs something concrete to assess against.
If both are true, the pattern is a good fit. If the quality difference between iterations is marginal, the added latency and cost aren’t justified.
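One way to check the "marginal difference" condition during a trial run is to log a quality score per iteration and stop once the gain drops below a threshold. The sketch below assumes the evaluator can emit a numeric score, which the pattern itself does not require; `should_continue` and `min_gain` are hypothetical names, and the default threshold is arbitrary.

```python
def should_continue(scores: list[float], min_gain: float = 0.05) -> bool:
    """Return True if the latest iteration improved the score by at least min_gain.

    `scores` holds one evaluator score per completed iteration, oldest first.
    With fewer than two scores there is nothing to compare, so keep going.
    """
    if len(scores) < 2:
        return True
    return (scores[-1] - scores[-2]) >= min_gain
```

If this check returns False after the first or second iteration on most representative tasks, that is a hint that a single generation pass may serve as well as the full loop.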
Examples
- Literary translation: the translator LLM may miss nuances on the first pass that an evaluator LLM can identify and critique
- Complex search: multiple rounds of searching and analysis, with the evaluator deciding whether further searches are warranted
- Code generation: generate code, run tests, use test results as evaluation, regenerate
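The code-generation example can be sketched with a test run acting as the evaluator. Everything here is an illustrative assumption: `run_tests`, `stub_code_generator`, and `generate_until_tests_pass` are hypothetical names, the generator is a hard-coded stub rather than an LLM call, and `exec` is used only because the candidate code is trusted toy input.

```python
from typing import Callable, Optional


def run_tests(source: str) -> Optional[str]:
    """Evaluator: execute the candidate and test it.

    Returns None when all tests pass, otherwise an error message
    that serves as feedback for the next generation attempt.
    """
    namespace: dict = {}
    try:
        exec(source, namespace)  # acceptable only for trusted toy code
        add = namespace["add"]
        assert add(2, 3) == 5, f"add(2, 3) returned {add(2, 3)}, expected 5"
        assert add(-1, 1) == 0, f"add(-1, 1) returned {add(-1, 1)}, expected 0"
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


def stub_code_generator(feedback: Optional[str]) -> str:
    """Stand-in for an LLM: buggy first attempt, fixed once feedback arrives."""
    if feedback is None:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


def generate_until_tests_pass(
    generate: Callable[[Optional[str]], str], max_iterations: int = 3
) -> tuple[str, int]:
    """Regenerate until the tests pass; return (source, attempts used)."""
    feedback: Optional[str] = None
    for attempt in range(1, max_iterations + 1):
        source = generate(feedback)
        feedback = run_tests(source)
        if feedback is None:
            return source, attempt
    raise RuntimeError(f"tests still failing after {max_iterations} attempts: {feedback}")
```

Here the evaluation criteria are fully concrete (the assertions), which is what makes code generation a natural fit for the pattern. In a real setting the candidate would run in a sandboxed subprocess rather than through `exec`.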
The feedback loop connection
This pattern is a feedback loop in the control theory sense: output is measured against a reference (the evaluation criteria), the error signal (evaluator feedback) drives correction, and the system iterates toward convergence.
This makes it structurally identical to:
- The Agent Learning Loop — generate, evaluate, improve, store as skill
- The iterative refinement in human writing — draft, review, revise
- The Heartbeat Mechanism in CORAL — periodic reflection that redirects effort when progress stalls
The key difference from the learning loop: the evaluator-optimizer is stateless across tasks. It refines a single output. The learning loop persists improvements across tasks via memory and skills.
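The stateless versus persistent distinction can be made concrete with a toy sketch. All names here (`generate`, `evaluate`, `refine`, `memory`) are hypothetical, and a set of string hints is only a stand-in for whatever memory or skill store a learning loop would actually use.

```python
from typing import Optional


def generate(task: str, hints: set[str]) -> str:
    """Toy generator: applies any hint it has been given."""
    return task.capitalize() if "capitalize" in hints else task


def evaluate(draft: str) -> Optional[str]:
    """Toy evaluator: returns None to approve, or a hint describing the fix."""
    return None if draft[:1].isupper() else "capitalize"


def refine(
    task: str, memory: Optional[set[str]] = None, max_iterations: int = 3
) -> tuple[str, int]:
    """Run the generate/evaluate loop for one task; return (draft, attempts).

    Feedback is kept in `hints`, a per-call copy. It survives the call only
    if the caller passes a `memory` set, which is updated in place.
    """
    hints = set(memory or ())
    draft = ""
    for attempt in range(1, max_iterations + 1):
        draft = generate(task, hints)
        feedback = evaluate(draft)
        if feedback is None:
            if memory is not None:
                memory.update(hints)  # persist what was learned
            return draft, attempt
        hints.add(feedback)
    return draft, max_iterations
```

Called without `memory`, `refine("alpha")` and `refine("beta")` each need two attempts, because nothing carries over between tasks. With a shared `memory = set()`, `refine("alpha", memory)` still needs two attempts, but `refine("beta", memory)` succeeds on the first, because the earlier feedback was stored.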
Connections
- Part of the Agentic Workflow Patterns taxonomy
- The Augmented LLM is the building block — both generator and evaluator are augmented LLMs
- Connects to Discrepancy Production vs Reduction — the evaluator produces discrepancy (identifying gaps), the generator reduces it (fixing them)
- Goal-Setting Theory applies: the evaluation criteria function as specific, challenging goals that drive performance