Customizing Reflection Prompts

This guide covers how to customize the reflection prompt used during evolution to improve instruction mutations.

Deprecation Notice

Direct LiteLLM reflection via reflection_model and reflection_prompt parameters is deprecated and will be removed in a future version. Use reflection_agent with an ADK LlmAgent instead for consistent execution and session management. See Issue #144.

Recommended approach:

from google.adk.agents import LlmAgent
from google.adk.models.lite_llm import LiteLlm

reflection_agent = LlmAgent(
    name="Reflector",
    model=LiteLlm(model="ollama_chat/llama3.2:latest"),
    instruction="Your custom reflection prompt with {component_text} and {trials}",
)

result = await evolve(
    agent=my_agent,
    trainset=test_cases,
    critic=my_critic,  # Optional
    reflection_agent=reflection_agent,  # Recommended
)

Overview

The reflection prompt is the template sent to the reflection model (e.g., ollama_chat/gpt-oss:20b) to generate improved agent instructions. By customizing this prompt, you can:

  • Tailor the mutation strategy to your specific use case
  • Request specific output formats (e.g., JSON)
  • Add domain-specific guidelines
  • Optimize for different model capabilities

Available Placeholders

The reflection prompt template supports two placeholders that are filled at runtime:

Placeholder      | Content                 | Description
---------------- | ----------------------- | --------------------------------------------------------------
{component_text} | Component being evolved | The text content of the component being evolved (e.g., the agent instruction)
{trials}         | Trial data              | Collection of trials with feedback and trajectory for each test case

ADK Template Syntax

gepa-adk uses ADK's native template substitution for injecting session state values into agent instructions. This provides cleaner separation between data (in session state) and instruction logic.

How It Works

ADK automatically replaces {key} placeholders in agent instructions with values from session.state[key]:

from google.adk.agents import LlmAgent
from google.adk.models.lite_llm import LiteLlm

# Define agent with template placeholders
agent = LlmAgent(
    name="Reflector",
    model=LiteLlm(model="ollama_chat/llama3.2:latest"),
    instruction="""## Component Text
{component_text}

## Trial Results
{trials}

Improve the component text based on the trials.""",
)

# Session state is set up automatically by gepa-adk
# ADK's inject_session_state() replaces {component_text} and {trials}

Template Syntax Reference

Syntax     | Description                                            | Example
---------- | ------------------------------------------------------ | -------------------
{key}      | Required placeholder - raises KeyError if missing      | {component_text}
{key?}     | Optional placeholder - returns empty string if missing | {context?}
{app:key}  | Application-scoped state (advanced)                    | {app:shared_config}
{user:key} | User-scoped state (advanced)                           | {user:preferences}
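
For example, an optional placeholder lets the same instruction work whether or not an extra value is present in session state. A minimal sketch, assuming a hypothetical "context" state key ({component_text} and {trials} are the only keys gepa-adk populates for you):

from google.adk.agents import LlmAgent
from google.adk.models.lite_llm import LiteLlm

# {context?} is optional: if "context" is missing from session state,
# ADK substitutes an empty string instead of raising KeyError.
agent = LlmAgent(
    name="Reflector",
    model=LiteLlm(model="ollama_chat/llama3.2:latest"),
    instruction="""## Component Text
{component_text}

## Trial Results
{trials}

## Extra Context (may be empty)
{context?}

Improve the component text based on the trials.""",
)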

Type Handling

ADK converts all session state values to strings using str(). For complex types, gepa-adk pre-serializes to JSON:

# gepa-adk automatically handles serialization:
session_state = {
    "component_text": "Be helpful",  # String - used as-is
    "trials": json.dumps(trials),    # List → JSON string
}

Important: If you pass a dict/list directly without JSON serialization, ADK uses Python's repr() which may not be readable by the LLM.
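
To see why the pre-serialization matters, compare Python's default string conversion of a list of dicts with its JSON form (a standalone sketch, independent of gepa-adk):

import json

trials = [{"input": "How do I read a file?", "score": 0.3}]

# str() on a list falls back to repr(): Python syntax with single quotes
print(str(trials))  # [{'input': 'How do I read a file?', 'score': 0.3}]

# json.dumps() produces valid, indented JSON that reads cleanly in a prompt
print(json.dumps(trials, indent=2))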

Example Placeholder Values

{component_text}:

You are a helpful assistant that answers questions about Python programming.
Be concise and provide code examples when relevant.

{trials}:

Example 1:
  Input: "How do I read a file?"
  Expected: "Use open() with 'r' mode"
  Actual: "Files can be read using Python"
  Score: 0.3
  Feedback: Response too vague, missing code example

Example 2:
  Input: "What is a list?"
  Expected: "A mutable sequence type"
  Actual: "A list is a mutable sequence type in Python"
  Score: 0.9
  Feedback: Good explanation

Basic Usage

Using a Custom Prompt

from gepa_adk import evolve, EvolutionConfig

config = EvolutionConfig(
    reflection_prompt="""You are improving an AI agent's instructions.

## Current Instruction
{component_text}

## Evaluation Results
{trials}

## Your Task
Based on the feedback, propose ONE specific improvement to the instruction.
Focus on the most impactful change.

Return ONLY the improved instruction text."""
)

result = await evolve(
    agent=my_agent,
    trainset=test_cases,
    critic=my_critic,
    config=config,
)

Extending the Default Prompt

You can import and extend the default prompt template:

from gepa_adk import evolve, EvolutionConfig
from gepa_adk.engine.adk_reflection import REFLECTION_INSTRUCTION

# Add domain-specific context to the default
custom_prompt = REFLECTION_INSTRUCTION + """

Additional Guidelines:
- Focus on clarity and conciseness
- Preserve any safety constraints in the original
- Consider edge cases mentioned in feedback
"""

config = EvolutionConfig(reflection_prompt=custom_prompt)

Using the Default (No Configuration)

from gepa_adk import evolve

# reflection_prompt defaults to None → uses REFLECTION_INSTRUCTION
result = await evolve(
    agent=my_agent,
    trainset=test_cases,
    critic=my_critic,
)

Prompt Design Guidelines

1. Include Both Placeholders

Always include {component_text} and {trials} in your prompt. The system will warn (but not error) if either is missing:

# This will log a warning about missing {trials}
config = EvolutionConfig(
    reflection_prompt="Improve this: {component_text}"
)

2. Request Clear Output Format

Specify exactly what format you want the response in:

# Good: Clear output expectation
prompt = """...
Return ONLY the improved instruction text, with no additional commentary.
"""

# Also good: Structured output
prompt = """...
Respond with exactly this JSON structure:
{
  "analysis": "Brief analysis",
  "improved_instruction": "The improved text"
}
"""

3. Be Specific About the Task

Tell the model exactly what kind of improvements to make:

prompt = """...
## Your Task
1. Address the issues identified in negative feedback
2. Preserve elements that worked well in positive feedback
3. Maintain clarity and specificity
4. Keep the instruction concise (under 200 words)
"""

4. Consider Model Capabilities

Adjust prompt complexity based on your reflection model:

  • Smaller models (7B-13B): Use simpler, more direct prompts
  • Larger models (70B+): Can handle chain-of-thought, structured reasoning

Example Prompts

Minimal/Fast Prompt

For quick iterations with capable models:

minimal_prompt = """Instruction:
{component_text}

Feedback:
{trials}

Improved instruction:"""

Chain-of-Thought Prompt

For more thoughtful improvements:

cot_prompt = """You are an expert at improving AI instructions.

## Current Instruction
{component_text}

## Performance Feedback
{trials}

## Analysis Process
1. What patterns appear in successful examples?
2. What patterns appear in failed examples?
3. What specific changes would address the failures while preserving successes?

Think through each step, then provide the improved instruction.

## Improved Instruction
"""

JSON Output Format

For structured responses:

json_prompt = """Analyze the agent instruction and feedback, then respond with JSON.

## Current Instruction
{component_text}

## Feedback
{trials}

Respond with exactly this JSON structure:
{
  "analysis": "Brief analysis of what's working and what isn't",
  "improved_instruction": "The complete improved instruction text"
}
"""

Domain-Specific Prompt

For specialized use cases:

code_review_prompt = """You are improving a code review agent's instructions.

## Current Instruction
{component_text}

## Evaluation Feedback
{trials}

## Domain Guidelines
- The agent should identify bugs, security issues, and style problems
- Feedback should be actionable and specific
- Code examples should be provided when suggesting fixes
- Tone should be constructive, not critical

Provide an improved instruction that addresses the feedback while
following these domain guidelines.

## Improved Instruction
"""

Model Selection Guidance

The reflection model processes your custom prompt to generate improved instructions. Choosing the right model affects quality, speed, and cost.

Token Budget Considerations

Your reflection prompt, together with the substituted placeholder values, consumes context tokens:

Component              | Typical Size
---------------------- | -----------------------------------------
Custom prompt template | 100-500 tokens
{component_text}       | 50-500 tokens
{trials}               | 200-2000 tokens (depends on trial count)
Response               | 50-500 tokens

Recommendation: Keep your prompt template under 500 tokens. For larger instruction sets, consider reducing batch size or using a model with larger context.
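
As a rough sanity check before running evolution, you can estimate your template's size with the common ~4 characters-per-token heuristic (an approximation only; the helper below is illustrative, not part of gepa-adk):

def estimate_tokens(text: str) -> int:
    """Very rough estimate: ~4 characters per token for English text."""
    return len(text) // 4

template = """You are improving an AI agent's instructions.

## Current Instruction
{component_text}

## Evaluation Results
{trials}

Return ONLY the improved instruction text."""

print(f"Template: ~{estimate_tokens(template)} tokens")  # Well under the 500-token guideline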

Model Capability vs Task Complexity

Task Complexity         | Recommended Model Tier | Examples
----------------------- | ---------------------- | -----------------------------
Simple rewording        | Local (7B-13B)         | Ollama gpt-oss:7b
Structured improvements | Local (20B+)           | Ollama gpt-oss:20b (default)
Complex reasoning       | Cloud (cheap)          | GPT-4o-mini, Claude Haiku
Domain expertise        | Cloud (premium)        | GPT-4o, Claude Sonnet

Cost vs Quality Tradeoffs

Model Tier      | Cost             | Quality | Speed  | When to Use
--------------- | ---------------- | ------- | ------ | --------------------------------------------------
Local (Ollama)  | Free             | Good    | Medium | Development, iteration, cost-sensitive production
Cloud (cheap)   | ~$0.15/1M tokens | Better  | Fast   | Production with budget constraints
Cloud (premium) | ~$5-15/1M tokens | Best    | Fast   | High-stakes applications, complex domains

Configuring the Reflection Model

from gepa_adk import evolve, EvolutionConfig

# Local model (default)
config = EvolutionConfig(
    reflection_model="ollama_chat/gpt-oss:20b",
)

# Cloud model (OpenAI)
config = EvolutionConfig(
    reflection_model="gpt-4o-mini",
)

# Cloud model (Anthropic)
config = EvolutionConfig(
    reflection_model="claude-3-haiku-20240307",
)

Validation and Debugging

Check for Missing Placeholders

The system logs warnings if placeholders are missing:

WARNING: reflection_prompt is missing {trials} placeholder
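
You can also check for both placeholders yourself before building the config (a plain string check, not a gepa-adk API):

my_prompt = "Improve this: {component_text}"

required = ("{component_text}", "{trials}")
missing = [p for p in required if p not in my_prompt]
if missing:
    print(f"Warning: prompt is missing {missing}")  # Warning: prompt is missing ['{trials}']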

Test Your Prompt

Before running evolution, test your prompt manually:

from gepa_adk.engine.adk_reflection import REFLECTION_INSTRUCTION

# See what the default looks like
print(REFLECTION_INSTRUCTION)

# Test your custom prompt with sample values
my_prompt = "..."
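# Note: str.format() treats every "{" as a placeholder, so if your prompt
# embeds literal braces (e.g., a JSON example), double them as "{{" and "}}".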
formatted = my_prompt.format(
    component_text="Be helpful",
    trials="Example 1: Score 0.5..."
)
print(formatted)

Migration from f-string Workaround

If you have custom reflection code that previously embedded data in user messages via f-strings, you can migrate to ADK's template syntax.

Before (f-string in user message)

# OLD: Data embedded directly in user message
user_message = f"""## Component Text to Improve
{component_text}

## Trials
{json.dumps(trials, indent=2)}

Propose an improved version..."""

async for event in runner.run_async(
    user_id="reflection",
    session_id=session_id,
    new_message=Content(role="user", parts=[Part(text=user_message)]),
):
    events.append(event)

After (ADK template substitution)

# NEW: Data in session state, placeholders in instruction
agent = LlmAgent(
    name="Reflector",
    model=LiteLlm(model="ollama_chat/llama3.2:latest"),
    instruction="""## Component Text to Improve
{component_text}

## Trials
{trials}

Propose an improved version...""",
)

# Session state set during session creation
session_state = {
    "component_text": component_text,
    "trials": json.dumps(trials),  # Pre-serialize complex types
}

await session_service.create_session(
    app_name="gepa_reflection",
    user_id="reflection",
    session_id=session_id,
    state=session_state,
)

# Simple trigger message - ADK handles substitution
async for event in runner.run_async(
    user_id="reflection",
    session_id=session_id,
    new_message=Content(role="user", parts=[Part(text="Improve the component.")]),
):
    events.append(event)

Benefits of Template Syntax

  1. Cleaner separation: Data lives in session state, logic in instruction
  2. ADK-native: Uses documented ADK patterns
  3. Testable: Session state can be inspected/mocked independently
  4. Consistent: Aligns with other ADK agents using template substitution