Evaluator-Optimizer Pattern

concept
ai-agents, workflows, patterns, iteration, feedback-loop

An agentic workflow pattern where one LLM generates a response and another evaluates it, iterating in a loop until quality criteria are met. One of five agentic workflow patterns described in Building Effective Agents.

Structure

Generator LLM --> Output --> Evaluator LLM --> Feedback
     ^                                           |
     +-------------------------------------------+
                  (loop until satisfied)

The generator produces; the evaluator critiques. The generator incorporates feedback and tries again. The loop terminates when the evaluator approves or a maximum iteration count is reached.
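The loop above can be sketched in a few lines of Python. The `generate` and `evaluate` functions stand in for LLM calls; they are stubbed here (hypothetical names, not from any specific library) so the control flow runs as written.

```python
# Minimal sketch of the evaluator-optimizer loop.
# generate() and evaluate() are stubs standing in for LLM invocations.

def generate(task, feedback=None):
    # Stub generator: fold prior feedback into the next draft.
    draft = f"draft for {task!r}"
    return draft + (f" (revised per: {feedback})" if feedback else "")

def evaluate(task, output):
    # Stub evaluator: approve once the draft reflects a revision.
    if "revised" in output:
        return {"approved": True, "feedback": None}
    return {"approved": False, "feedback": "add more detail"}

def evaluator_optimizer(task, max_iterations=5):
    feedback = None
    output = None
    for _ in range(max_iterations):
        output = generate(task, feedback)
        verdict = evaluate(task, output)
        if verdict["approved"]:
            return output               # evaluator satisfied: terminate
        feedback = verdict["feedback"]  # error signal drives the next draft
    return output                       # iteration budget exhausted

print(evaluator_optimizer("summarize the report"))
```

The two termination conditions from the description are both visible: the early `return` when the evaluator approves, and the fall-through after `max_iterations`.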

When to use

Two diagnostic questions:

  1. Can a human articulate feedback that demonstrably improves the output? If so, an LLM evaluator can likely provide similar feedback.
  2. Are there clear evaluation criteria? The evaluator needs something concrete to assess against.

If both are true, the pattern is a good fit. If the quality difference between iterations is marginal, the added latency and cost aren’t justified.

Examples

  • Literary translation — the translator LLM misses subtle nuances on the first pass; an evaluator can identify and critique them
  • Complex search — multiple rounds of searching and analysis, with the evaluator deciding whether further searches are warranted
  • Code generation — generate code, run tests, use test results as evaluation, regenerate
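The code-generation case is the easiest to make concrete, because the evaluator need not be an LLM at all: a test run is the evaluation, and the failure message is the feedback. A sketch under that assumption, with the generator stubbed to "fix" its bug once it sees the failure:

```python
# Sketch of the code-generation variant: tests act as the evaluator,
# and a test failure message is fed back as the critique.
# generate_code() is a stub standing in for an LLM call.

def generate_code(feedback=None):
    if feedback is None:
        return "def add(a, b):\n    return a - b"  # deliberately buggy first draft
    return "def add(a, b):\n    return a + b"      # revised draft after feedback

def run_tests(code):
    namespace = {}
    exec(code, namespace)  # load the generated function
    try:
        assert namespace["add"](2, 3) == 5
        return None                        # tests pass: no feedback needed
    except AssertionError:
        return "add(2, 3) should equal 5"  # failure becomes the feedback

def generate_with_tests(max_iterations=3):
    feedback = None
    for _ in range(max_iterations):
        code = generate_code(feedback)
        feedback = run_tests(code)
        if feedback is None:
            return code                    # evaluator (the test suite) approves
    raise RuntimeError("no passing code within iteration budget")

print(generate_with_tests())
```

Using a deterministic evaluator like this also satisfies the second diagnostic question above for free: passing tests are about as concrete as evaluation criteria get.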

The feedback loop connection

This pattern is a feedback loop in the control theory sense: output is measured against a reference (the evaluation criteria), the error signal (evaluator feedback) drives correction, and the system iterates toward convergence.

This makes it structurally identical to:

  • The Agent Learning Loop — generate, evaluate, improve, store as skill
  • The iterative refinement in human writing — draft, review, revise
  • The Heartbeat Mechanism in CORAL — periodic reflection that redirects effort when progress stalls

The key difference from the learning loop: the evaluator-optimizer is stateless across tasks. It refines a single output. The learning loop persists improvements across tasks via memory and skills.

Connections