
Single-Agent Evolution

This document explains how single-agent evolution works: the roles of the critic and reflection agents, the structure of trial records, and how the pieces fit together.

Overview

Single-agent evolution optimizes one agent's components (instruction, output_schema, generate_content_config) through iterative improvement.

from gepa_adk import evolve

result = await evolve(
    agent=my_agent,
    trainset=examples,
    critic=critic_agent,
)

Evolvable Components

Component                 Type                   What It Controls
instruction               str                    The agent's prompt/instructions
output_schema             type[BaseModel]        Pydantic model for structured output
generate_content_config   GenerateContentConfig  LLM parameters (temperature, etc.)

By default, only instruction evolves. Specify others explicitly:

result = await evolve(
    agent=my_agent,
    trainset=examples,
    components=["instruction", "output_schema"],
)
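
All three can be listed together; the valid names are exactly those in the table above:

result = await evolve(
    agent=my_agent,
    trainset=examples,
    components=["instruction", "output_schema", "generate_content_config"],
)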

The Critic Agent

The critic evaluates agent outputs and provides feedback for reflection.

Output Schemas

SimpleCriticOutput - Basic evaluation:

class SimpleCriticOutput(BaseModel):
    score: float = Field(..., ge=0.0, le=1.0)  # Required
    feedback: str = Field(...)                  # Required

CriticOutput - Advanced with dimensions:

class CriticOutput(BaseModel):
    score: float = Field(..., ge=0.0, le=1.0)  # Required
    feedback: str = Field(default="")           # Optional
    dimension_scores: dict[str, float] = Field(default_factory=dict)
    actionable_guidance: str = Field(default="")
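
For instance, a critic that populates the optional fields might be declared like this sketch (it assumes CriticOutput is exported from gepa_adk alongside SimpleCriticOutput; the model and instruction text are illustrative):

from google.adk.agents import LlmAgent
from google.adk.models.lite_llm import LiteLlm
from gepa_adk import CriticOutput  # assumed export, like SimpleCriticOutput

# Illustrative critic using the advanced schema; the dimension names
# it reports are up to the model, guided by the instruction.
dimensional_critic = LlmAgent(
    name="dimensional_critic",
    model=LiteLlm(model="ollama_chat/llama3.2:latest"),
    instruction=(
        "Evaluate the output. Provide an overall score from 0.0 to 1.0, "
        "dimension scores for clarity, accuracy, and completeness, "
        "and actionable guidance for improvement."
    ),
    output_schema=CriticOutput,
)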

Default Instructions

Simple critic:

Evaluate the quality of the output.

Provide:
- A score from 0.0 (poor) to 1.0 (excellent)
- Feedback explaining what works and what doesn't

Focus on clarity, accuracy, and completeness in your evaluation.

Advanced critic:

Evaluate the quality of the output across multiple dimensions.

Provide:
- An overall score from 0.0 (poor) to 1.0 (excellent)
- Feedback explaining what works and what doesn't
- Dimension scores for specific quality aspects you identify
- Actionable guidance for concrete improvement steps

Critic Requirements

Requirement     Details
Output Format   JSON with score field (required)
Score Range     0.0 to 1.0 (float)
Feedback        Recommended for reflection quality
Normalization   normalize_feedback() converts to trial format
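
The exact signature of normalize_feedback() isn't shown here, but conceptually it maps the critic's schema fields onto the trial feedback keys described under Trial Structure below. A hypothetical sketch of that mapping:

# Hypothetical sketch only: the real normalize_feedback() lives in
# gepa_adk and its signature may differ.
def normalize_feedback_sketch(critic_output) -> dict:
    return {
        "score": critic_output.score,
        "feedback_text": critic_output.feedback,
        # Optional fields are only present on the advanced CriticOutput.
        "dimension_scores": getattr(critic_output, "dimension_scores", {}),
        "actionable_guidance": getattr(critic_output, "actionable_guidance", ""),
    }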

The Reflection Agent

The reflection agent analyzes trial results and proposes improved component text.

Default Instruction

## Component Text to Improve
{component_text}

## Trials
{trials}

Propose an improved version of the component text based on the trials above.
Return ONLY the improved component text, nothing else.

Component-Aware Reflection

Different components get specialized reflection agents:

Component                 Factory                          Special Tools
output_schema             create_schema_reflection_agent   validate_output_schema tool
generate_content_config   create_config_reflection_agent   None (YAML validation)
default                   create_text_reflection_agent     None

Reflection Requirements

Requirement     Details
Placeholders    Must accept {component_text} and {trials}
Output Key      Must use output_key="proposed_component_text"
Return Format   Plain text (the improved component)
Session State   Receives component_text (str) and trials (JSON string)
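
Putting the contract together, a custom reflection agent might look like this sketch (the model choice is illustrative; the placeholders and output key are the required parts):

from google.adk.agents import LlmAgent
from google.adk.models.lite_llm import LiteLlm

# Sketch of a custom reflection agent satisfying the contract above.
custom_reflection = LlmAgent(
    name="instruction_reflector",
    model=LiteLlm(model="ollama_chat/llama3.2:latest"),
    instruction=(
        "## Component Text to Improve\n{component_text}\n\n"
        "## Trials\n{trials}\n\n"
        "Rewrite the component text to address the weaknesses the trials "
        "reveal. Return ONLY the improved component text, nothing else."
    ),
    output_key="proposed_component_text",  # required output key
)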

Trial Structure

Each trial record passed to reflection contains:

{
    "feedback": {
        "score": 0.85,              # From critic
        "feedback_text": "...",      # From critic
        "dimension_scores": {...},   # Optional
        "actionable_guidance": "..." # Optional
    },
    "trajectory": {
        "input": "...",             # Original task input
        "output": "...",            # Agent's generated output
        "trace": {...}              # ADK execution trace
    }
}

The trajectory captures execution context—tool calls, state changes, token usage—that helps the reflection agent understand how the agent arrived at its output.
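
Since the reflection agent receives trials as a JSON string (see Session State above), the engine presumably serializes the trial list before filling the placeholder; a minimal sketch of that step:

import json

# One trial record shaped like the structure above (values illustrative).
trial_records = [
    {
        "feedback": {"score": 0.85, "feedback_text": "Second line has six syllables."},
        "trajectory": {"input": "autumn leaves", "output": "Red leaves spiral down...", "trace": {}},
    }
]

# Sketch: serialize the list into the string that fills {trials}.
trials_json = json.dumps(trial_records, indent=2)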

How Critic + Reflection Work Together

┌─────────────┐    output    ┌─────────────┐    score,     ┌─────────────────┐
│   Agent     │─────────────▶│   Critic    │───feedback───▶│  Trial Builder  │
│  (evolving) │              │  (scoring)  │               │                 │
└─────────────┘              └─────────────┘               └────────┬────────┘
                                                                    │ trials
┌─────────────┐    proposed   ┌─────────────────┐                   │
│  Component  │◀─────text─────│   Reflection    │◀──────────────────┘
│   Handler   │               │     Agent       │
└─────────────┘               └─────────────────┘

The flow:

  1. Agent produces output from input
  2. Critic scores output → {score, feedback, dimensions, guidance}
  3. Trial Builder combines into {feedback, trajectory}
  4. Reflection Agent receives {component_text, trials} → proposes improvement
  5. Component Handler applies proposed text to candidate
  6. Engine re-evaluates → accepts if score improves
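
In pseudocode, one iteration of this loop looks roughly like the sketch below (the helper names are illustrative stand-ins, not gepa_adk's internal API):

# Illustrative pseudocode; run_agent, run_critic, run_reflection,
# apply_component, and avg_score are stand-ins, not real internals.
async def evolve_once(candidate, trainset, critic, reflection):
    trials = []
    for example in trainset:
        output, trace = await run_agent(candidate, example)       # step 1
        feedback = await run_critic(critic, example, output)      # step 2
        trials.append({                                           # step 3
            "feedback": feedback,
            "trajectory": {"input": example, "output": output, "trace": trace},
        })
    proposed = await run_reflection(                              # step 4
        reflection, candidate.instruction, trials)
    new_candidate = apply_component(candidate, "instruction", proposed)  # step 5
    old = await avg_score(candidate, trainset, critic)            # step 6
    new = await avg_score(new_candidate, trainset, critic)
    return new_candidate if new > old else candidate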

Example: Complete Single-Agent Evolution

from google.adk.agents import LlmAgent
from google.adk.models.lite_llm import LiteLlm
from gepa_adk import evolve, SimpleCriticOutput

# Agent to evolve
writer = LlmAgent(
    name="writer",
    model=LiteLlm(model="ollama_chat/llama3.2:latest"),
    instruction="Write a haiku about the given topic.",
)

# Critic to evaluate
critic = LlmAgent(
    name="critic",
    model=LiteLlm(model="ollama_chat/llama3.2:latest"),
    instruction="""
    Evaluate this haiku for:
    - Correct 5-7-5 syllable structure
    - Imagery and emotional impact
    - Connection to the topic

    Score 0.0-1.0 and explain your reasoning.
    """,
    output_schema=SimpleCriticOutput,
)

# Training examples
trainset = [
    {"topic": "autumn leaves"},
    {"topic": "morning coffee"},
    {"topic": "city rain"},
]

# Run evolution
result = await evolve(
    agent=writer,
    trainset=trainset,
    critic=critic,
)

print(f"Improved: {result.original_score:.2f}{result.final_score:.2f}")
print(f"New instruction: {result.evolved_components['instruction']}")

Next Steps