
Models

models

Domain models for the gepa-adk evolution engine.

This module contains the core domain models used throughout the evolution engine, including result types with schema versioning and serialization support. Configuration validation enforces field constraints and finite-float checks. All models are dataclasses following hexagonal architecture principles with no runtime dependencies beyond structlog and the Python standard library.

Terminology
  • component: An evolvable unit with a name and text (e.g., instruction)
  • component_text: The current text content of a component being evolved
  • trial: One performance record {feedback, trajectory}
  • feedback: Critic evaluation {score, feedback_text, feedback_*} (stochastic)
  • trajectory: Execution record {input, output, trace} (deterministic)
ATTRIBUTE DESCRIPTION
EvolutionConfig

Configuration parameters for evolution runs.

TYPE: class

IterationRecord

Immutable record of a single iteration.

TYPE: class

EvolutionResult

Immutable outcome of a completed evolution run.

TYPE: class

Candidate

Mutable candidate holding components being evolved.

TYPE: class

CURRENT_SCHEMA_VERSION

Current result schema version constant.

TYPE: int

Examples:

Creating configuration and result objects:

from gepa_adk.domain.models import EvolutionConfig, EvolutionResult

config = EvolutionConfig(max_iterations=20)
result = EvolutionResult(
    original_score=0.5,
    final_score=0.8,
    evolved_components={"instruction": "Be helpful"},
    iteration_history=[],
    total_iterations=10,
)
assert result.schema_version == 1

Serializing and deserializing results:

import json
from gepa_adk.domain.models import EvolutionResult

data = result.to_dict()
json_str = json.dumps(data)
restored = EvolutionResult.from_dict(json.loads(json_str))

Display methods for inspecting results:

from gepa_adk.domain.models import EvolutionResult

result = EvolutionResult(
    original_score=0.5,
    final_score=0.8,
    evolved_components={"instruction": "Be helpful and concise"},
    original_components={"instruction": "Be helpful"},
    iteration_history=[],
    total_iterations=10,
)
print(repr(result))  # narrative summary with improvement %
print(result.show_diff())  # unified diff of component changes
See Also
Note

These models are pure data containers with validation logic. They have no knowledge of infrastructure concerns like databases or APIs.

CURRENT_SCHEMA_VERSION module-attribute

CURRENT_SCHEMA_VERSION = 1

Schema version for evolution result serialization.

Incremented when the result schema changes in a way that requires migration logic in from_dict().
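A hypothetical migration shim, sketching how `from_dict()` could upgrade older payloads once the version is bumped. The function name `migrate_result_dict` and the upgrade step are assumptions for illustration, not the library's actual code:

```python
CURRENT_SCHEMA_VERSION = 1  # mirrors the documented constant


def migrate_result_dict(data: dict) -> dict:
    """Upgrade an older serialized result to the current schema (sketch)."""
    version = data.get("schema_version", 1)
    if version > CURRENT_SCHEMA_VERSION:
        raise ValueError(f"schema_version {version} is newer than supported")
    migrated = dict(data)
    # Future migrations would go here, e.g. renaming or defaulting fields
    # when stepping from version N to N+1.
    migrated["schema_version"] = CURRENT_SCHEMA_VERSION
    return migrated


payload = {"schema_version": 1, "final_score": 0.8}
assert migrate_result_dict(payload)["schema_version"] == 1
```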

VideoFileInfo dataclass

Metadata for a validated video file.

This is an immutable record containing validated metadata about a video file. Created by VideoBlobService.validate_video_file() after checking that the file exists, is within size limits, and has a valid MIME type.

ATTRIBUTE DESCRIPTION
path

Absolute path to the video file.

TYPE: str

size_bytes

File size in bytes.

TYPE: int

mime_type

MIME type of the video (e.g., "video/mp4").

TYPE: str

Examples:

Creating video file info:

from gepa_adk.domain.models import VideoFileInfo

info = VideoFileInfo(
    path="/data/video.mp4",
    size_bytes=1024000,
    mime_type="video/mp4",
)
print(f"File: {info.path}, Size: {info.size_bytes}, Type: {info.mime_type}")
Note

A frozen dataclass ensuring immutability after validation. Instances cannot be modified once created, guaranteeing consistency of validated file metadata.

Source code in src/gepa_adk/domain/models.py
@dataclass(slots=True, frozen=True)
class VideoFileInfo:
    """Metadata for a validated video file.

    This is an immutable record containing validated metadata about a video
    file. Created by VideoBlobService.validate_video_file() after checking
    that the file exists, is within size limits, and has a valid MIME type.

    Attributes:
        path (str): Absolute path to the video file.
        size_bytes (int): File size in bytes.
        mime_type (str): MIME type of the video (e.g., "video/mp4").

    Examples:
        Creating video file info:

        ```python
        from gepa_adk.domain.models import VideoFileInfo

        info = VideoFileInfo(
            path="/data/video.mp4",
            size_bytes=1024000,
            mime_type="video/mp4",
        )
        print(f"File: {info.path}, Size: {info.size_bytes}, Type: {info.mime_type}")
        ```

    Note:
        A frozen dataclass ensuring immutability after validation.
        Instances cannot be modified once created, guaranteeing
        consistency of validated file metadata.
    """

    path: str
    size_bytes: int
    mime_type: str

EvolutionConfig dataclass

Configuration parameters for an evolution run.

Defines the parameters that control how evolution proceeds, including iteration limits, concurrency settings, and stopping criteria.

ATTRIBUTE DESCRIPTION
max_iterations

Maximum number of evolution iterations. 0 means just evaluate baseline without evolving.

TYPE: int

max_concurrent_evals

Number of concurrent batch evaluations. Must be at least 1.

TYPE: int

min_improvement_threshold

Minimum score improvement to accept a new candidate. Set to 0.0 to accept any improvement.

TYPE: float

patience

Number of iterations without improvement before stopping early. Set to 0 to disable early stopping.

TYPE: int

reflection_model

Model identifier for reflection/mutation operations.

TYPE: str

frontier_type

Frontier tracking strategy for Pareto selection (default: INSTANCE).

TYPE: FrontierType

acceptance_metric

Aggregation method for acceptance decisions on iteration evaluation batches. "sum" uses sum of scores (default, aligns with upstream GEPA). "mean" uses mean of scores (legacy behavior).

TYPE: Literal['sum', 'mean']

use_merge

Enable merge proposals for genetic crossover. Defaults to False.

TYPE: bool

max_merge_invocations

Maximum number of merge attempts per run. Defaults to 10. Must be non-negative.

TYPE: int

reflection_prompt

Custom reflection/mutation prompt template. If provided, this template is used instead of the default when the reflection model proposes improved text. Required placeholders:
  • {component_text}: The current component text being evolved
  • {trials}: Trial data with feedback and trajectory for each test case
If None or the empty string, the default prompt template is used.

TYPE: str | None

stop_callbacks

List of stopper callbacks for custom stop conditions. Each callback receives a StopperState and returns True to signal stop. Defaults to an empty list.

TYPE: list[StopperProtocol]

seed

Random seed for deterministic engine decisions. When set, a seeded random.Random is created and shared across all stochastic components (candidate selector, merge proposer). None (default) preserves current random behavior.

TYPE: int | None

Examples:

Creating a configuration with defaults:

from gepa_adk.domain.models import EvolutionConfig

config = EvolutionConfig(max_iterations=100, patience=10)
print(config.max_iterations)  # 100
print(config.reflection_model)  # ollama_chat/gpt-oss:20b
Note

All numeric parameters are validated in __post_init__ to ensure they meet their constraints. Cross-field consistency is also checked (e.g., use_merge requires max_merge_invocations > 0, and stop_callbacks must contain only callables). Invalid values raise ConfigurationError.

Determinism applies to engine decisions only (candidate selection, component selection, merge proposals). LLM inference is inherently stochastic and not covered by the seed guarantee.
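The seed guarantee can be illustrated with plain `random.Random`: two generators built from the same seed replay identical sequences, so engine decisions repeat across runs. This sketch is independent of the library; the "candidate selector" comment is illustrative:

```python
import random

seed = 42
rng_a = random.Random(seed)  # e.g. the candidate selector's RNG on run 1
rng_b = random.Random(seed)  # a fresh run configured with the same seed

picks_a = [rng_a.randrange(10) for _ in range(5)]
picks_b = [rng_b.randrange(10) for _ in range(5)]
assert picks_a == picks_b  # engine decisions replay; LLM output may still vary
```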

Source code in src/gepa_adk/domain/models.py
@dataclass(slots=True, kw_only=True)
class EvolutionConfig:
    """Configuration parameters for an evolution run.

    Defines the parameters that control how evolution proceeds, including
    iteration limits, concurrency settings, and stopping criteria.

    Attributes:
        max_iterations (int): Maximum number of evolution iterations. 0 means
            just evaluate baseline without evolving.
        max_concurrent_evals (int): Number of concurrent batch evaluations.
            Must be at least 1.
        min_improvement_threshold (float): Minimum score improvement to accept
            a new candidate. Set to 0.0 to accept any improvement.
        patience (int): Number of iterations without improvement before stopping
            early. Set to 0 to disable early stopping.
        reflection_model (str): Model identifier for reflection/mutation
            operations.
        frontier_type (FrontierType): Frontier tracking strategy for Pareto
            selection (default: INSTANCE).
        acceptance_metric (Literal["sum", "mean"]): Aggregation method for
            acceptance decisions on iteration evaluation batches. "sum" uses
            sum of scores (default, aligns with upstream GEPA). "mean" uses
            mean of scores (legacy behavior).
        use_merge (bool): Enable merge proposals for genetic crossover.
            Defaults to False.
        max_merge_invocations (int): Maximum number of merge attempts per run.
            Defaults to 10. Must be non-negative.
        reflection_prompt (str | None): Custom reflection/mutation prompt template.
            If provided, this template is used instead of the default when the
            reflection model proposes improved text. Required placeholders:
            - {component_text}: The current component text being evolved
            - {trials}: Trial data with feedback and trajectory for each test case
            If None or empty string, the default prompt template is used.
        stop_callbacks (list[StopperProtocol]): List of stopper callbacks for
            custom stop conditions. Each callback receives a StopperState and
            returns True to signal stop. Defaults to an empty list.
        seed (int | None): Random seed for deterministic engine decisions.
            When set, a seeded ``random.Random`` is created and shared across
            all stochastic components (candidate selector, merge proposer).
            ``None`` (default) preserves current random behavior.

    Examples:
        Creating a configuration with defaults:

        ```python
        from gepa_adk.domain.models import EvolutionConfig

        config = EvolutionConfig(max_iterations=100, patience=10)
        print(config.max_iterations)  # 100
        print(config.reflection_model)  # ollama_chat/gpt-oss:20b
        ```

    Note:
        All numeric parameters are validated in __post_init__ to ensure
        they meet their constraints. Cross-field consistency is also checked
        (e.g., use_merge requires max_merge_invocations > 0, stop_callbacks
        must be callable). Invalid values raise ConfigurationError.

        Determinism applies to engine decisions only (candidate selection,
        component selection, merge proposals). LLM inference is inherently
        stochastic and not covered by the seed guarantee.
    """

    max_iterations: int = 50
    max_concurrent_evals: int = 5
    min_improvement_threshold: float = 0.01
    patience: int = 5
    reflection_model: str = "ollama_chat/gpt-oss:20b"
    frontier_type: FrontierType = FrontierType.INSTANCE
    acceptance_metric: Literal["sum", "mean"] = "sum"
    use_merge: bool = False
    max_merge_invocations: int = 10
    reflection_prompt: str | None = None
    stop_callbacks: list["StopperProtocol"] = field(default_factory=list)
    seed: int | None = None

    def __post_init__(self) -> None:
        """Validate configuration parameters after initialization.

        Raises:
            ConfigurationError: If any parameter violates its constraints,
                including non-finite floats (NaN, Inf), cross-field consistency
                rules (e.g., use_merge requires max_merge_invocations > 0,
                stop_callbacks must be callable).

        Note:
            Operates automatically after dataclass __init__ completes. Validates
            all fields including finite-float checks, cross-field consistency,
            and raises ConfigurationError with context on failure.
        """
        if self.max_iterations < 0:
            raise ConfigurationError(
                "max_iterations must be non-negative",
                field="max_iterations",
                value=self.max_iterations,
                constraint=">= 0",
            )

        if self.max_concurrent_evals < 1:
            raise ConfigurationError(
                "max_concurrent_evals must be at least 1",
                field="max_concurrent_evals",
                value=self.max_concurrent_evals,
                constraint=">= 1",
            )

        if not math.isfinite(self.min_improvement_threshold):
            raise ConfigurationError(
                "min_improvement_threshold must be a finite number",
                field="min_improvement_threshold",
                value=self.min_improvement_threshold,
                constraint="finite float",
            )

        if self.min_improvement_threshold < 0.0:
            raise ConfigurationError(
                "min_improvement_threshold must be non-negative",
                field="min_improvement_threshold",
                value=self.min_improvement_threshold,
                constraint=">= 0.0",
            )

        if self.patience < 0:
            raise ConfigurationError(
                "patience must be non-negative",
                field="patience",
                value=self.patience,
                constraint=">= 0",
            )

        if not self.reflection_model:
            raise ConfigurationError(
                "reflection_model must be a non-empty string",
                field="reflection_model",
                value=self.reflection_model,
                constraint="non-empty string",
            )

        if not isinstance(self.frontier_type, FrontierType):
            try:
                self.frontier_type = FrontierType(self.frontier_type)
            except ValueError as exc:
                raise ConfigurationError(
                    "frontier_type must be a supported FrontierType value",
                    field="frontier_type",
                    value=self.frontier_type,
                    constraint=", ".join(t.value for t in FrontierType),
                ) from exc

        if self.acceptance_metric not in ("sum", "mean"):
            raise ConfigurationError(
                "acceptance_metric must be 'sum' or 'mean'",
                field="acceptance_metric",
                value=self.acceptance_metric,
                constraint="sum|mean",
            )

        if self.max_merge_invocations < 0:
            raise ConfigurationError(
                "max_merge_invocations must be non-negative",
                field="max_merge_invocations",
                value=self.max_merge_invocations,
                constraint=">= 0",
            )

        # Cross-field consistency checks
        self._validate_consistency()

        # Validate reflection_prompt if provided
        self._validate_reflection_prompt()

    def _validate_consistency(self) -> None:
        """Validate cross-field consistency rules.

        Raises:
            ConfigurationError: If use_merge is True but max_merge_invocations
                is zero, or if stop_callbacks contains non-callable items.

        Note:
            Hard errors raise ConfigurationError; soft issues log warnings.
            Called from __post_init__ after individual field validation.
        """
        if self.use_merge and self.max_merge_invocations == 0:
            raise ConfigurationError(
                "use_merge=True requires max_merge_invocations > 0",
                field="max_merge_invocations",
                value=self.max_merge_invocations,
                constraint="> 0 when use_merge=True",
            )

        if (
            self.patience > 0
            and self.max_iterations > 0
            and self.patience > self.max_iterations
        ):
            logger.warning(
                "config.patience.exceeds_max_iterations",
                patience=self.patience,
                max_iterations=self.max_iterations,
                message="patience exceeds max_iterations; early stopping will never trigger",
            )

        for i, callback in enumerate(self.stop_callbacks):
            if not callable(callback):
                raise ConfigurationError(
                    f"stop_callbacks[{i}] is not callable",
                    field=f"stop_callbacks[{i}]",
                    value=type(callback).__name__,
                    constraint="must be callable",
                )

    def _validate_reflection_prompt(self) -> None:
        """Validate reflection_prompt and handle empty string.

        Converts empty string to None with info log. Warns if required
        placeholders are missing but allows the config to be created.

        Note:
            Soft validation approach - missing placeholders trigger warnings
            but don't prevent config creation for maximum flexibility.
        """
        # Handle empty string as "use default"
        if self.reflection_prompt == "":
            logger.info(
                "config.reflection_prompt.empty",
                message="Empty reflection_prompt provided, using default template",
            )
            # Use object.__setattr__ because slots=True prevents direct assignment
            object.__setattr__(self, "reflection_prompt", None)
            return

        # Skip validation if None
        if self.reflection_prompt is None:
            return

        # Warn about missing placeholders
        if "{component_text}" not in self.reflection_prompt:
            logger.warning(
                "config.reflection_prompt.missing_placeholder",
                placeholder="component_text",
                message="reflection_prompt is missing {component_text} placeholder",
            )

        if "{trials}" not in self.reflection_prompt:
            logger.warning(
                "config.reflection_prompt.missing_placeholder",
                placeholder="trials",
                message="reflection_prompt is missing {trials} placeholder",
            )
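A custom template must carry both placeholders, which `str.format()` fills at reflection time. This is a sketch of the documented contract; the template wording and trial string are illustrative:

```python
template = (
    "Rewrite the component below to address the feedback.\n\n"
    "Current text:\n{component_text}\n\n"
    "Trials:\n{trials}"
)
# Both required placeholders are present, so no warning would be logged.
assert "{component_text}" in template and "{trials}" in template

rendered = template.format(
    component_text="Be helpful",
    trials="score=0.5, feedback='too vague'",
)
assert "Be helpful" in rendered
```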

__post_init__

__post_init__() -> None

Validate configuration parameters after initialization.

RAISES DESCRIPTION
ConfigurationError

If any parameter violates its constraints: non-finite floats (NaN, Inf) are rejected, and cross-field consistency rules are enforced (e.g., use_merge requires max_merge_invocations > 0; stop_callbacks must contain only callables).

Note

Runs automatically after the dataclass __init__ completes. Validates all fields, including finite-float checks and cross-field consistency, and raises ConfigurationError with context on failure.

Source code in src/gepa_adk/domain/models.py
def __post_init__(self) -> None:
    """Validate configuration parameters after initialization.

    Raises:
        ConfigurationError: If any parameter violates its constraints,
            including non-finite floats (NaN, Inf), cross-field consistency
            rules (e.g., use_merge requires max_merge_invocations > 0,
            stop_callbacks must be callable).

    Note:
        Operates automatically after dataclass __init__ completes. Validates
        all fields including finite-float checks, cross-field consistency,
        and raises ConfigurationError with context on failure.
    """
    if self.max_iterations < 0:
        raise ConfigurationError(
            "max_iterations must be non-negative",
            field="max_iterations",
            value=self.max_iterations,
            constraint=">= 0",
        )

    if self.max_concurrent_evals < 1:
        raise ConfigurationError(
            "max_concurrent_evals must be at least 1",
            field="max_concurrent_evals",
            value=self.max_concurrent_evals,
            constraint=">= 1",
        )

    if not math.isfinite(self.min_improvement_threshold):
        raise ConfigurationError(
            "min_improvement_threshold must be a finite number",
            field="min_improvement_threshold",
            value=self.min_improvement_threshold,
            constraint="finite float",
        )

    if self.min_improvement_threshold < 0.0:
        raise ConfigurationError(
            "min_improvement_threshold must be non-negative",
            field="min_improvement_threshold",
            value=self.min_improvement_threshold,
            constraint=">= 0.0",
        )

    if self.patience < 0:
        raise ConfigurationError(
            "patience must be non-negative",
            field="patience",
            value=self.patience,
            constraint=">= 0",
        )

    if not self.reflection_model:
        raise ConfigurationError(
            "reflection_model must be a non-empty string",
            field="reflection_model",
            value=self.reflection_model,
            constraint="non-empty string",
        )

    if not isinstance(self.frontier_type, FrontierType):
        try:
            self.frontier_type = FrontierType(self.frontier_type)
        except ValueError as exc:
            raise ConfigurationError(
                "frontier_type must be a supported FrontierType value",
                field="frontier_type",
                value=self.frontier_type,
                constraint=", ".join(t.value for t in FrontierType),
            ) from exc

    if self.acceptance_metric not in ("sum", "mean"):
        raise ConfigurationError(
            "acceptance_metric must be 'sum' or 'mean'",
            field="acceptance_metric",
            value=self.acceptance_metric,
            constraint="sum|mean",
        )

    if self.max_merge_invocations < 0:
        raise ConfigurationError(
            "max_merge_invocations must be non-negative",
            field="max_merge_invocations",
            value=self.max_merge_invocations,
            constraint=">= 0",
        )

    # Cross-field consistency checks
    self._validate_consistency()

    # Validate reflection_prompt if provided
    self._validate_reflection_prompt()

IterationRecord dataclass

Captures metrics for a single evolution iteration.

This is an immutable record of what happened during one iteration of the evolution process. Records are created by the engine and stored in EvolutionResult.iteration_history.

ATTRIBUTE DESCRIPTION
iteration_number

1-indexed iteration number for human readability.

TYPE: int

score

Score achieved in this iteration (typically in [0.0, 1.0]).

TYPE: float

component_text

The component_text that was evaluated in this iteration (e.g., the instruction text for the "instruction" component).

TYPE: str

evolved_component

The name of the component that was evolved in this iteration (e.g., "instruction", "output_schema"). Used for tracking which component changed in round-robin evolution strategies.

TYPE: str

accepted

Whether this proposal was accepted as the new best.

TYPE: bool

objective_scores

Optional per-example multi-objective scores from the valset evaluation. None when adapter does not provide objective scores. Each dict maps objective name to score value. Index-aligned with evaluation batch examples.

TYPE: list[dict[str, float]] | None

reflection_reasoning

Optional natural language reasoning from the reflection agent explaining why the mutation was proposed. None when reasoning is not available (e.g., model without thinking support or older data without this field).

TYPE: str | None

Examples:

Creating an iteration record:

from gepa_adk.domain.models import IterationRecord

record = IterationRecord(
    iteration_number=1,
    score=0.85,
    component_text="Be helpful",
    evolved_component="instruction",
    accepted=True,
)
print(record.score)  # 0.85
print(record.evolved_component)  # "instruction"
print(record.accepted)  # True

Serialization round-trip:

d = record.to_dict()
restored = IterationRecord.from_dict(d)
assert restored.score == record.score
Note

An immutable record that captures iteration metrics. Once created, IterationRecord instances cannot be modified, ensuring historical accuracy of the evolution trace.

Source code in src/gepa_adk/domain/models.py
@dataclass(slots=True, frozen=True, kw_only=True)
class IterationRecord:
    """Captures metrics for a single evolution iteration.

    This is an immutable record of what happened during one iteration
    of the evolution process. Records are created by the engine and
    stored in EvolutionResult.iteration_history.

    Attributes:
        iteration_number (int): 1-indexed iteration number for human
            readability.
        score (float): Score achieved in this iteration (typically in
            [0.0, 1.0]).
        component_text (str): The component_text that was evaluated in this
            iteration (e.g., the instruction text for the "instruction" component).
        evolved_component (str): The name of the component that was evolved
            in this iteration (e.g., "instruction", "output_schema"). Used for
            tracking which component changed in round-robin evolution strategies.
        accepted (bool): Whether this proposal was accepted as the new best.
        objective_scores (list[dict[str, float]] | None): Optional per-example
            multi-objective scores from the valset evaluation. None when adapter
            does not provide objective scores. Each dict maps objective name to
            score value. Index-aligned with evaluation batch examples.
        reflection_reasoning (str | None): Optional natural language reasoning
            from the reflection agent explaining why the mutation was proposed.
            None when reasoning is not available (e.g., model without thinking
            support or older data without this field).

    Examples:
        Creating an iteration record:

        ```python
        from gepa_adk.domain.models import IterationRecord

        record = IterationRecord(
            iteration_number=1,
            score=0.85,
            component_text="Be helpful",
            evolved_component="instruction",
            accepted=True,
        )
        print(record.score)  # 0.85
        print(record.evolved_component)  # "instruction"
        print(record.accepted)  # True
        ```

        Serialization round-trip:

        ```python
        d = record.to_dict()
        restored = IterationRecord.from_dict(d)
        assert restored.score == record.score
        ```

    Note:
        An immutable record that captures iteration metrics. Once created,
        IterationRecord instances cannot be modified, ensuring historical
        accuracy of the evolution trace.
    """

    iteration_number: int
    score: float
    component_text: str
    evolved_component: str
    accepted: bool
    objective_scores: list[dict[str, float]] | None = None
    reflection_reasoning: str | None = None

    def to_dict(self) -> dict[str, Any]:
        """Serialize this record to a stdlib-only dict.

        Returns:
            Dict containing all 7 fields. Output is directly
            ``json.dumps()``-compatible.
        """
        return {
            "iteration_number": self.iteration_number,
            "score": self.score,
            "component_text": self.component_text,
            "evolved_component": self.evolved_component,
            "accepted": self.accepted,
            "objective_scores": self.objective_scores,
            "reflection_reasoning": self.reflection_reasoning,
        }

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "IterationRecord":
        """Reconstruct an IterationRecord from a dict.

        Unknown keys are silently ignored for forward compatibility,
        allowing older code to load records produced by newer versions.
        Optional fields (``objective_scores``, ``reflection_reasoning``)
        default to None when missing from the input dict.

        Args:
            data: Dict containing iteration record fields.

        Returns:
            Reconstructed IterationRecord instance.

        Raises:
            KeyError: If a required field is missing from the dict.
        """
        return cls(
            iteration_number=data["iteration_number"],
            score=data["score"],
            component_text=data["component_text"],
            evolved_component=data["evolved_component"],
            accepted=data["accepted"],
            objective_scores=data.get("objective_scores"),
            reflection_reasoning=data.get("reflection_reasoning"),
        )

to_dict

to_dict() -> dict[str, Any]

Serialize this record to a stdlib-only dict.

RETURNS DESCRIPTION
dict[str, Any]

Dict containing all 7 fields. Output is directly json.dumps()-compatible.

Source code in src/gepa_adk/domain/models.py
def to_dict(self) -> dict[str, Any]:
    """Serialize this record to a stdlib-only dict.

    Returns:
        Dict containing all 7 fields. Output is directly
        ``json.dumps()``-compatible.
    """
    return {
        "iteration_number": self.iteration_number,
        "score": self.score,
        "component_text": self.component_text,
        "evolved_component": self.evolved_component,
        "accepted": self.accepted,
        "objective_scores": self.objective_scores,
        "reflection_reasoning": self.reflection_reasoning,
    }

from_dict classmethod

from_dict(data: dict[str, Any]) -> IterationRecord

Reconstruct an IterationRecord from a dict.

Unknown keys are silently ignored for forward compatibility, allowing older code to load records produced by newer versions. Optional fields (objective_scores, reflection_reasoning) default to None when missing from the input dict.

PARAMETER DESCRIPTION
data

Dict containing iteration record fields.

TYPE: dict[str, Any]

RETURNS DESCRIPTION
IterationRecord

Reconstructed IterationRecord instance.

RAISES DESCRIPTION
KeyError

If a required field is missing from the dict.

Source code in src/gepa_adk/domain/models.py
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "IterationRecord":
    """Reconstruct an IterationRecord from a dict.

    Unknown keys are silently ignored for forward compatibility,
    allowing older code to load records produced by newer versions.
    Optional fields (``objective_scores``, ``reflection_reasoning``)
    default to None when missing from the input dict.

    Args:
        data: Dict containing iteration record fields.

    Returns:
        Reconstructed IterationRecord instance.

    Raises:
        KeyError: If a required field is missing from the dict.
    """
    return cls(
        iteration_number=data["iteration_number"],
        score=data["score"],
        component_text=data["component_text"],
        evolved_component=data["evolved_component"],
        accepted=data["accepted"],
        objective_scores=data.get("objective_scores"),
        reflection_reasoning=data.get("reflection_reasoning"),
    )

EvolutionResult dataclass

Outcome of a completed evolution run.

Contains the final results after evolution completes, including all evolved component values, performance metrics, and full history.

ATTRIBUTE DESCRIPTION
schema_version

Schema version for forward-compatible serialization. Always CURRENT_SCHEMA_VERSION for newly created results.

TYPE: int

stop_reason

Why the evolution run terminated. Defaults to StopReason.COMPLETED.

TYPE: StopReason

original_score

Starting performance score (baseline).

TYPE: float

final_score

Ending performance score (best achieved).

TYPE: float

evolved_components

Dictionary mapping component names to their final evolved text values. Keys include "instruction" and optionally "output_schema" or other components. Access individual components via result.evolved_components["instruction"].

TYPE: dict[str, str]

iteration_history

Chronological list of iteration records.

TYPE: list[IterationRecord]

total_iterations

Number of iterations performed.

TYPE: int

valset_score

Score on validation set used for acceptance decisions. None if no validation set was used.

TYPE: float | None

trainset_score

Score on trainset used for reflection diagnostics. None if not computed.

TYPE: float | None

objective_scores

Optional per-example multi-objective scores from the best candidate's final evaluation. None when no objective scores were tracked. Each dict maps objective name to score value. Index-aligned with evaluation batch examples.

TYPE: list[dict[str, float]] | None

original_components

Optional snapshot of pre-evolution component values. When present, enables zero-arg show_diff() calls. None for results created before this field was added or when originals were not captured.

TYPE: dict[str, str] | None

reflection_reasoning

Read-only property returning the reflection reasoning from the last iteration. Convenience accessor; None if no iterations or last iteration has no reasoning.

TYPE: str | None

Examples:

Creating and analyzing a result:

from gepa_adk.domain.models import EvolutionResult, IterationRecord

result = EvolutionResult(
    original_score=0.60,
    final_score=0.85,
    evolved_components={"instruction": "Be helpful and concise"},
    original_components={"instruction": "Be helpful"},
    iteration_history=[],
    total_iterations=10,
)
print(result.improvement)  # 0.25
print(result.show_diff())  # unified diff of instruction changes

Serialization round-trip:

import json

d = result.to_dict()
json_str = json.dumps(d)
restored = EvolutionResult.from_dict(json.loads(json_str))
Note

As a frozen dataclass, EvolutionResult instances cannot be modified.

Source code in src/gepa_adk/domain/models.py
@dataclass(slots=True, frozen=True, kw_only=True)
class EvolutionResult:
    """Outcome of a completed evolution run.

    Contains the final results after evolution completes, including
    all evolved component values, performance metrics, and full history.

    Attributes:
        schema_version (int): Schema version for forward-compatible serialization.
            Always ``CURRENT_SCHEMA_VERSION`` for newly created results.
        stop_reason (StopReason): Why the evolution run terminated. Defaults to
            ``StopReason.COMPLETED``.
        original_score (float): Starting performance score (baseline).
        final_score (float): Ending performance score (best achieved).
        evolved_components (dict[str, str]): Dictionary mapping component names
            to their final evolved text values. Keys include "instruction" and
            optionally "output_schema" or other components. Access individual
            components via ``result.evolved_components["instruction"]``.
        iteration_history (list[IterationRecord]): Chronological list of
            iteration records.
        total_iterations (int): Number of iterations performed.
        valset_score (float | None): Score on validation set used for
            acceptance decisions. None if no validation set was used.
        trainset_score (float | None): Score on trainset used for reflection
            diagnostics. None if not computed.
        objective_scores (list[dict[str, float]] | None): Optional per-example
            multi-objective scores from the best candidate's final evaluation.
            None when no objective scores were tracked. Each dict maps objective
            name to score value. Index-aligned with evaluation batch examples.
        original_components (dict[str, str] | None): Optional snapshot of
            pre-evolution component values. When present, enables zero-arg
            ``show_diff()`` calls. None for results created before this field
            was added or when originals were not captured.
        reflection_reasoning (str | None): Read-only property returning the
            reflection reasoning from the last iteration. Convenience
            accessor; None if no iterations or last iteration has no reasoning.

    Examples:
        Creating and analyzing a result:

        ```python
        from gepa_adk.domain.models import EvolutionResult, IterationRecord

        result = EvolutionResult(
            original_score=0.60,
            final_score=0.85,
            evolved_components={"instruction": "Be helpful and concise"},
            original_components={"instruction": "Be helpful"},
            iteration_history=[],
            total_iterations=10,
        )
        print(result.improvement)  # 0.25
        print(result.show_diff())  # unified diff of instruction changes
        ```

        Serialization round-trip:

        ```python
        import json

        d = result.to_dict()
        json_str = json.dumps(d)
        restored = EvolutionResult.from_dict(json.loads(json_str))
        ```

    Note:
        As a frozen dataclass, EvolutionResult instances cannot be modified.
    """

    schema_version: int = CURRENT_SCHEMA_VERSION
    stop_reason: StopReason = StopReason.COMPLETED
    original_score: float
    final_score: float
    evolved_components: dict[str, str]
    iteration_history: list[IterationRecord]
    total_iterations: int
    valset_score: float | None = None
    trainset_score: float | None = None
    objective_scores: list[dict[str, float]] | None = None
    original_components: dict[str, str] | None = None

    @property
    def reflection_reasoning(self) -> str | None:
        """Return the reflection reasoning from the last iteration.

        Convenience accessor for the most recent iteration's reasoning
        explaining why the reflection agent proposed its mutation.

        Returns:
            The reasoning string from the last iteration record, or None
            if no iterations exist or the last iteration has no reasoning.
        """
        if not self.iteration_history:
            return None
        return self.iteration_history[-1].reflection_reasoning

    def to_dict(self) -> dict[str, Any]:
        """Serialize this result to a stdlib-only dict.

        Returns:
            Dict containing all fields. ``stop_reason`` is serialized
            as its string value. ``iteration_history`` is serialized as a
            list of dicts. Output is directly ``json.dumps()``-compatible.
        """
        return {
            "schema_version": self.schema_version,
            "stop_reason": self.stop_reason.value,
            "original_score": self.original_score,
            "final_score": self.final_score,
            "evolved_components": self.evolved_components,
            "iteration_history": [r.to_dict() for r in self.iteration_history],
            "total_iterations": self.total_iterations,
            "valset_score": self.valset_score,
            "trainset_score": self.trainset_score,
            "objective_scores": self.objective_scores,
            "original_components": self.original_components,
        }

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "EvolutionResult":
        """Reconstruct an EvolutionResult from a dict.

        Validates schema version, applies migration if needed, and
        reconstructs all nested objects including optional
        original_components.

        Args:
            data: Dict containing evolution result fields.

        Returns:
            Reconstructed EvolutionResult instance.

        Raises:
            ConfigurationError: If ``schema_version`` exceeds the current
                version or ``stop_reason`` is not a valid enum value.
            KeyError: If a required field is missing from the dict.
        """
        version = data.get("schema_version", 1)
        if version > CURRENT_SCHEMA_VERSION:
            raise ConfigurationError(
                f"Cannot deserialize result with schema_version {version} "
                f"(current version is {CURRENT_SCHEMA_VERSION}). "
                f"Upgrade gepa-adk to load this result.",
                field="schema_version",
                value=version,
                constraint=f"<= {CURRENT_SCHEMA_VERSION}",
            )
        migrated = _migrate_result_dict(data, from_version=version)
        try:
            stop_reason = StopReason(migrated.get("stop_reason", "completed"))
        except ValueError:
            raw = migrated.get("stop_reason")
            raise ConfigurationError(
                f"Invalid stop_reason value: {raw!r}",
                field="stop_reason",
                value=raw,
                constraint=("one of: " + ", ".join(sr.value for sr in StopReason)),
            ) from None
        return cls(
            schema_version=migrated["schema_version"],
            stop_reason=stop_reason,
            original_score=migrated["original_score"],
            final_score=migrated["final_score"],
            evolved_components=migrated["evolved_components"],
            iteration_history=[
                IterationRecord.from_dict(r)
                for r in migrated.get("iteration_history", [])
            ],
            total_iterations=migrated["total_iterations"],
            valset_score=migrated.get("valset_score"),
            trainset_score=migrated.get("trainset_score"),
            objective_scores=migrated.get("objective_scores"),
            original_components=migrated.get("original_components"),
        )

    @property
    def improvement(self) -> float:
        """Calculate the score improvement from original to final.

        Returns:
            The difference between final_score and original_score.
            Positive values indicate improvement, negative indicates degradation.

        Note:
            Override is not needed since frozen dataclasses support properties.
        """
        return self.final_score - self.original_score

    @property
    def improved(self) -> bool:
        """Check if the final score is better than the original.

        Returns:
            True if final_score > original_score, False otherwise.

        Note:
            Only returns True for strict improvement, not equal scores.
        """
        return self.final_score > self.original_score

    def __repr__(self) -> str:
        """Narrative summary of the evolution result.

        Returns:
            Human-readable multi-line summary with improvement percentage,
            iterations, stop reason, component names, and acceptance rate.
            Uses 2-space indent, no box-drawing characters, every line
            greppable.
        """
        if abs(self.original_score) < 1e-9:
            imp_str = f"{self.improvement:+.4f} improvement"
        else:
            pct = self.improvement * 100 / abs(self.original_score)
            sign = "+" if pct > 0 else ""
            imp_str = f"{sign}{pct:.1f}% improvement"
        lines = [
            f"EvolutionResult: {imp_str} "
            f"({self.original_score:.2f} \u2192 {self.final_score:.2f})",
            f"  iterations: {self.total_iterations}, "
            f"stop_reason: {self.stop_reason.value}",
            f"  components: {', '.join(sorted(self.evolved_components))}",
        ]
        if self.total_iterations > 0:
            accepted = sum(1 for r in self.iteration_history if r.accepted)
            lines.append(f"  acceptance_rate: {accepted}/{self.total_iterations}")
        return "\n".join(lines)

    def show_diff(self, original_components: dict[str, str] | None = None) -> str:
        """Show unified diff between original and evolved components.

        Uses stored ``original_components`` if no explicit argument is
        provided. Produces git-diff-style output (``---``/``+++``/``@@``)
        for each component that changed.

        Args:
            original_components: Pre-evolution component values. If None,
                falls back to ``self.original_components``.

        Returns:
            Unified diff string, or ``"No changes detected."`` if all
            components are identical.

        Raises:
            ValueError: If both the argument and ``self.original_components``
                are None.
        """
        originals = (
            original_components
            if original_components is not None
            else self.original_components
        )
        if originals is None:
            raise ValueError(
                "No original components available. "
                "Pass original_components or use a result that stores them."
            )
        return _build_diff(self.evolved_components, originals)

    def _repr_html_(self) -> str:
        """Render an HTML summary for Jupyter notebooks.

        Returns:
            HTML string with summary and components tables. Uses semantic
            ``<th>`` header elements and inline CSS for portability across
            JupyterLab, Colab, and VS Code. Iteration history is wrapped
            in a collapsible ``<details>``/``<summary>`` block.
        """
        if abs(self.original_score) < 1e-9:
            imp_html = f"{self.improvement:+.4f}"
        else:
            pct = self.improvement * 100 / abs(self.original_score)
            sign = "+" if pct > 0 else ""
            imp_html = f"{sign}{pct:.1f}%"
        style = (
            'style="border-collapse:collapse;'
            "font-family:monospace;font-size:13px;"
            'margin:8px 0"'
        )
        td = 'style="padding:4px 12px;border:1px solid #ddd"'
        th = 'style="padding:4px 12px;border:1px solid #ddd;background:#f5f5f5"'

        rows = [
            f"<tr><th {th}>Improvement</th><td {td}>{imp_html}</td></tr>",
            f"<tr><th {th}>Original Score</th>"
            f"<td {td}>{self.original_score:.4f}</td></tr>",
            f"<tr><th {th}>Final Score</th><td {td}>{self.final_score:.4f}</td></tr>",
            f"<tr><th {th}>Iterations</th><td {td}>{self.total_iterations}</td></tr>",
            f"<tr><th {th}>Stop Reason</th>"
            f"<td {td}>{html_mod.escape(self.stop_reason.value)}</td></tr>",
        ]
        summary_table = f"<table {style}>{''.join(rows)}</table>"

        comp_rows = []
        for name in sorted(self.evolved_components):
            val = html_mod.escape(_truncate(self.evolved_components[name], 200))
            comp_rows.append(
                f"<tr><td {td}>{html_mod.escape(name)}</td><td {td}>{val}</td></tr>"
            )
        comp_header = f"<tr><th {th}>Component</th><th {th}>Evolved Value</th></tr>"
        comp_table = f"<table {style}>{comp_header}{''.join(comp_rows)}</table>"

        history_rows = []
        for rec in self.iteration_history:
            accepted_mark = "\u2713" if rec.accepted else ""
            history_rows.append(
                f"<tr><td {td}>{rec.iteration_number}</td>"
                f"<td {td}>{rec.score:.4f}</td>"
                f"<td {td}>{html_mod.escape(rec.evolved_component)}</td>"
                f"<td {td}>{accepted_mark}</td></tr>"
            )
        history_header = (
            f"<tr><th {th}>#</th><th {th}>Score</th>"
            f"<th {th}>Component</th><th {th}>Accepted</th></tr>"
        )
        history_table = (
            f"<table {style}>{history_header}{''.join(history_rows)}</table>"
        )
        history_section = (
            "<details><summary>Iteration History "
            f"({len(self.iteration_history)} records)</summary>"
            f"{history_table}</details>"
        )

        return (
            "<div>"
            "<strong>EvolutionResult</strong>"
            f"{summary_table}{comp_table}{history_section}"
            "</div>"
        )
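The percentage formatting used by `__repr__` above guards against a zero (or near-zero) baseline, falling back to an absolute delta. The same logic in isolation, as a standalone sketch rather than a library function:

```python
def format_improvement(original: float, final: float) -> str:
    """Render a score change as a percentage of the baseline, or as an
    absolute delta when the baseline is effectively zero."""
    delta = final - original
    if abs(original) < 1e-9:
        # Percentage of a zero baseline is undefined; report the raw delta.
        return f"{delta:+.4f} improvement"
    pct = delta * 100 / abs(original)
    sign = "+" if pct > 0 else ""
    return f"{sign}{pct:.1f}% improvement"


print(format_improvement(0.60, 0.85))  # +41.7% improvement
print(format_improvement(0.0, 0.85))   # +0.8500 improvement
```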

reflection_reasoning property

reflection_reasoning: str | None

Return the reflection reasoning from the last iteration.

Convenience accessor for the most recent iteration's reasoning explaining why the reflection agent proposed its mutation.

RETURNS DESCRIPTION
str | None

The reasoning string from the last iteration record, or None

str | None

if no iterations exist or the last iteration has no reasoning.

improvement property

improvement: float

Calculate the score improvement from original to final.

RETURNS DESCRIPTION
float

The difference between final_score and original_score.

float

Positive values indicate improvement, negative indicates degradation.

Note

Override is not needed since frozen dataclasses support properties.

improved property

improved: bool

Check if the final score is better than the original.

RETURNS DESCRIPTION
bool

True if final_score > original_score, False otherwise.

Note

Only returns True for strict improvement, not equal scores.

to_dict

to_dict() -> dict[str, Any]

Serialize this result to a stdlib-only dict.

RETURNS DESCRIPTION
dict[str, Any]

Dict containing all fields. stop_reason is serialized

dict[str, Any]

as its string value. iteration_history is serialized as a

dict[str, Any]

list of dicts. Output is directly json.dumps()-compatible.

Source code in src/gepa_adk/domain/models.py
def to_dict(self) -> dict[str, Any]:
    """Serialize this result to a stdlib-only dict.

    Returns:
        Dict containing all fields. ``stop_reason`` is serialized
        as its string value. ``iteration_history`` is serialized as a
        list of dicts. Output is directly ``json.dumps()``-compatible.
    """
    return {
        "schema_version": self.schema_version,
        "stop_reason": self.stop_reason.value,
        "original_score": self.original_score,
        "final_score": self.final_score,
        "evolved_components": self.evolved_components,
        "iteration_history": [r.to_dict() for r in self.iteration_history],
        "total_iterations": self.total_iterations,
        "valset_score": self.valset_score,
        "trainset_score": self.trainset_score,
        "objective_scores": self.objective_scores,
        "original_components": self.original_components,
    }

from_dict classmethod

from_dict(data: dict[str, Any]) -> EvolutionResult

Reconstruct an EvolutionResult from a dict.

Validates schema version, applies migration if needed, and reconstructs all nested objects including optional original_components.

PARAMETER DESCRIPTION
data

Dict containing evolution result fields.

TYPE: dict[str, Any]

RETURNS DESCRIPTION
EvolutionResult

Reconstructed EvolutionResult instance.

RAISES DESCRIPTION
ConfigurationError

If schema_version exceeds the current version or stop_reason is not a valid enum value.

KeyError

If a required field is missing from the dict.

Source code in src/gepa_adk/domain/models.py
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "EvolutionResult":
    """Reconstruct an EvolutionResult from a dict.

    Validates schema version, applies migration if needed, and
    reconstructs all nested objects including optional
    original_components.

    Args:
        data: Dict containing evolution result fields.

    Returns:
        Reconstructed EvolutionResult instance.

    Raises:
        ConfigurationError: If ``schema_version`` exceeds the current
            version or ``stop_reason`` is not a valid enum value.
        KeyError: If a required field is missing from the dict.
    """
    version = data.get("schema_version", 1)
    if version > CURRENT_SCHEMA_VERSION:
        raise ConfigurationError(
            f"Cannot deserialize result with schema_version {version} "
            f"(current version is {CURRENT_SCHEMA_VERSION}). "
            f"Upgrade gepa-adk to load this result.",
            field="schema_version",
            value=version,
            constraint=f"<= {CURRENT_SCHEMA_VERSION}",
        )
    migrated = _migrate_result_dict(data, from_version=version)
    try:
        stop_reason = StopReason(migrated.get("stop_reason", "completed"))
    except ValueError:
        raw = migrated.get("stop_reason")
        raise ConfigurationError(
            f"Invalid stop_reason value: {raw!r}",
            field="stop_reason",
            value=raw,
            constraint=("one of: " + ", ".join(sr.value for sr in StopReason)),
        ) from None
    return cls(
        schema_version=migrated["schema_version"],
        stop_reason=stop_reason,
        original_score=migrated["original_score"],
        final_score=migrated["final_score"],
        evolved_components=migrated["evolved_components"],
        iteration_history=[
            IterationRecord.from_dict(r)
            for r in migrated.get("iteration_history", [])
        ],
        total_iterations=migrated["total_iterations"],
        valset_score=migrated.get("valset_score"),
        trainset_score=migrated.get("trainset_score"),
        objective_scores=migrated.get("objective_scores"),
        original_components=migrated.get("original_components"),
    )
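The version gate above rejects dicts from newer schema versions and migrates older ones forward. A minimal stand-in for that shape (`load` and its in-place upgrade are illustrative, not the library's `_migrate_result_dict`):

```python
from typing import Any

CURRENT = 1  # stand-in for CURRENT_SCHEMA_VERSION


def load(data: dict[str, Any]) -> dict[str, Any]:
    version = data.get("schema_version", 1)  # missing version => oldest schema
    if version > CURRENT:
        # Fail fast: fields written by a newer producer may have changed meaning.
        raise ValueError(f"schema_version {version} > supported {CURRENT}")
    migrated = dict(data)                    # never mutate the caller's dict
    migrated["schema_version"] = CURRENT     # older dicts are upgraded
    return migrated


d = load({"final_score": 0.8})  # no schema_version: treated as version 1
```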

__repr__

__repr__() -> str

Narrative summary of the evolution result.

RETURNS DESCRIPTION
str

Human-readable multi-line summary with improvement percentage,

str

iterations, stop reason, component names, and acceptance rate.

str

Uses 2-space indent, no box-drawing characters, every line

str

greppable.

Source code in src/gepa_adk/domain/models.py
def __repr__(self) -> str:
    """Narrative summary of the evolution result.

    Returns:
        Human-readable multi-line summary with improvement percentage,
        iterations, stop reason, component names, and acceptance rate.
        Uses 2-space indent, no box-drawing characters, every line
        greppable.
    """
    if abs(self.original_score) < 1e-9:
        imp_str = f"{self.improvement:+.4f} improvement"
    else:
        pct = self.improvement * 100 / abs(self.original_score)
        sign = "+" if pct > 0 else ""
        imp_str = f"{sign}{pct:.1f}% improvement"
    lines = [
        f"EvolutionResult: {imp_str} "
        f"({self.original_score:.2f} \u2192 {self.final_score:.2f})",
        f"  iterations: {self.total_iterations}, "
        f"stop_reason: {self.stop_reason.value}",
        f"  components: {', '.join(sorted(self.evolved_components))}",
    ]
    if self.total_iterations > 0:
        accepted = sum(1 for r in self.iteration_history if r.accepted)
        lines.append(f"  acceptance_rate: {accepted}/{self.total_iterations}")
    return "\n".join(lines)

show_diff

show_diff(
    original_components: dict[str, str] | None = None,
) -> str

Show unified diff between original and evolved components.

Uses stored original_components if no explicit argument is provided. Produces git-diff-style output (---/+++/@@) for each component that changed.

PARAMETER DESCRIPTION
original_components

Pre-evolution component values. If None, falls back to self.original_components.

TYPE: dict[str, str] | None DEFAULT: None

RETURNS DESCRIPTION
str

Unified diff string, or "No changes detected." if all

str

components are identical.

RAISES DESCRIPTION
ValueError

If both the argument and self.original_components are None.

Source code in src/gepa_adk/domain/models.py
def show_diff(self, original_components: dict[str, str] | None = None) -> str:
    """Show unified diff between original and evolved components.

    Uses stored ``original_components`` if no explicit argument is
    provided. Produces git-diff-style output (``---``/``+++``/``@@``)
    for each component that changed.

    Args:
        original_components: Pre-evolution component values. If None,
            falls back to ``self.original_components``.

    Returns:
        Unified diff string, or ``"No changes detected."`` if all
        components are identical.

    Raises:
        ValueError: If both the argument and ``self.original_components``
            are None.
    """
    originals = (
        original_components
        if original_components is not None
        else self.original_components
    )
    if originals is None:
        raise ValueError(
            "No original components available. "
            "Pass original_components or use a result that stores them."
        )
    return _build_diff(self.evolved_components, originals)
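`_build_diff` is internal to gepa-adk; the git-style output described above can be approximated with the standard library's `difflib` (a sketch of the behavior, not the actual implementation):

```python
import difflib


def build_diff(evolved: dict[str, str], originals: dict[str, str]) -> str:
    """Unified diff per component with ---/+++/@@ headers, skipping
    components whose text did not change."""
    chunks = []
    for name in sorted(evolved):
        old = originals.get(name, "")
        new = evolved[name]
        if old == new:
            continue
        diff = difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile=f"{name} (original)",
            tofile=f"{name} (evolved)",
            lineterm="",  # we join lines ourselves
        )
        chunks.append("\n".join(diff))
    return "\n".join(chunks) if chunks else "No changes detected."


out = build_diff(
    {"instruction": "Be helpful and concise"},
    {"instruction": "Be helpful"},
)
print(out)
```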

Candidate dataclass

Represents an instruction candidate being evolved.

Unlike GEPA's simple dict[str, str] type alias, this class provides richer state tracking for async scenarios including lineage and metadata.

ATTRIBUTE DESCRIPTION
components

Component name to text value mapping. Common keys include 'instruction' (main agent prompt) and 'output_schema'.

TYPE: dict[str, str]

generation

Generation number in the evolution lineage (0 = initial).

TYPE: int

parent_id

ID of the parent candidate for lineage tracking (legacy field, retained for compatibility).

TYPE: str | None

parent_ids

Multi-parent indices for merge operations. None for seed candidates, [single_idx] for mutations, [idx1, idx2] for merges.

TYPE: list[int] | None

metadata

Extensible metadata dict for async tracking and debugging.

TYPE: dict[str, Any]

Examples:

Creating a candidate:

from gepa_adk.domain.models import Candidate

candidate = Candidate(
    components={"instruction": "Be helpful"},
    generation=0,
)
print(candidate.components["instruction"])  # Be helpful
print(candidate.generation)  # 0
Note

A mutable candidate representation with richer state tracking than GEPA's simple dict. Components and metadata can be modified during the evolution process. Use generation and parent_id to track lineage.

Source code in src/gepa_adk/domain/models.py
@dataclass(slots=True, kw_only=True)
class Candidate:
    """Represents an instruction candidate being evolved.

    Unlike GEPA's simple `dict[str, str]` type alias, this class provides
    richer state tracking for async scenarios including lineage and metadata.

    Attributes:
        components (dict[str, str]): Component name to text value mapping.
            Common keys include 'instruction' (main agent prompt) and
            'output_schema'.
        generation (int): Generation number in the evolution lineage
            (0 = initial).
        parent_id (str | None): ID of the parent candidate for lineage
            tracking (legacy field, retained for compatibility).
        parent_ids (list[int] | None): Multi-parent indices for merge operations.
            None for seed candidates, [single_idx] for mutations, [idx1, idx2] for merges.
        metadata (dict[str, Any]): Extensible metadata dict for async tracking
            and debugging.

    Examples:
        Creating a candidate:

        ```python
        from gepa_adk.domain.models import Candidate

        candidate = Candidate(
            components={"instruction": "Be helpful"},
            generation=0,
        )
        print(candidate.components["instruction"])  # Be helpful
        print(candidate.generation)  # 0
        ```

    Note:
        A mutable candidate representation with richer state tracking than
        GEPA's simple dict. Components and metadata can be modified during
        the evolution process. Use generation and parent_id to track lineage.
    """

    components: dict[str, str] = field(default_factory=dict)
    generation: int = 0
    parent_id: str | None = None
    parent_ids: list[int] | None = None
    metadata: dict[str, Any] = field(default_factory=dict)

MultiAgentEvolutionResult dataclass

Outcome of a completed multi-agent evolution run.

Contains evolved component_text for all agents in the group, along with performance metrics and evolution history.

ATTRIBUTE DESCRIPTION
schema_version

Schema version for forward-compatible serialization.

TYPE: int

stop_reason

Why the evolution run terminated.

TYPE: StopReason

evolved_components

Mapping of agent name to evolved component_text.

TYPE: dict[str, str]

original_score

Starting performance score (baseline).

TYPE: float

final_score

Ending performance score (best achieved).

TYPE: float

primary_agent

Name of the agent whose output was used for scoring.

TYPE: str

iteration_history

Chronological list of iteration records.

TYPE: list[IterationRecord]

total_iterations

Number of iterations performed.

TYPE: int

original_components

Optional snapshot of pre-evolution component values. When present, enables zero-arg show_diff() calls. None for results created before this field was added or when originals were not captured.

TYPE: dict[str, str] | None

Examples:

Creating and analyzing a multi-agent result:

from gepa_adk.domain.models import MultiAgentEvolutionResult, IterationRecord

result = MultiAgentEvolutionResult(
    evolved_components={
        "generator": "Generate high-quality code",
        "critic": "Review code thoroughly",
    },
    original_components={
        "generator": "Generate code",
        "critic": "Review code",
    },
    original_score=0.60,
    final_score=0.85,
    primary_agent="generator",
    iteration_history=[],
    total_iterations=10,
)
print(result.improvement)  # 0.25
print(result.show_diff())  # unified diff of component changes
print(result.agent_names)  # ["critic", "generator"]
assert result.schema_version == 1

Serialization round-trip:

import json

d = result.to_dict()
restored = MultiAgentEvolutionResult.from_dict(json.loads(json.dumps(d)))
Note

An immutable result container for multi-agent evolution. Once created, MultiAgentEvolutionResult instances cannot be modified. Use computed properties like improvement, improved, and agent_names to analyze results without modifying the underlying data.

Source code in src/gepa_adk/domain/models.py
@dataclass(slots=True, frozen=True, kw_only=True)
class MultiAgentEvolutionResult:
    """Outcome of a completed multi-agent evolution run.

    Contains evolved component_text for all agents in the group,
    along with performance metrics and evolution history.

    Attributes:
        schema_version (int): Schema version for forward-compatible serialization.
        stop_reason (StopReason): Why the evolution run terminated.
        evolved_components (dict[str, str]): Mapping of agent name to evolved
            component_text.
        original_score (float): Starting performance score (baseline).
        final_score (float): Ending performance score (best achieved).
        primary_agent (str): Name of the agent whose output was used for scoring.
        iteration_history (list[IterationRecord]): Chronological list of iteration records.
        total_iterations (int): Number of iterations performed.
        original_components (dict[str, str] | None): Optional snapshot of
            pre-evolution component values. When present, enables zero-arg
            ``show_diff()`` calls. None for results created before this field
            was added or when originals were not captured.

    Examples:
        Creating and analyzing a multi-agent result:

        ```python
        from gepa_adk.domain.models import MultiAgentEvolutionResult, IterationRecord

        result = MultiAgentEvolutionResult(
            evolved_components={
                "generator": "Generate high-quality code",
                "critic": "Review code thoroughly",
            },
            original_components={
                "generator": "Generate code",
                "critic": "Review code",
            },
            original_score=0.60,
            final_score=0.85,
            primary_agent="generator",
            iteration_history=[],
            total_iterations=10,
        )
        print(result.improvement)  # 0.25
        print(result.show_diff())  # unified diff of component changes
        print(result.agent_names)  # ["critic", "generator"]
        assert result.schema_version == 1
        ```

        Serialization round-trip:

        ```python
        import json

        d = result.to_dict()
        restored = MultiAgentEvolutionResult.from_dict(json.loads(json.dumps(d)))
        ```

    Note:
        An immutable result container for multi-agent evolution. Once created,
        MultiAgentEvolutionResult instances cannot be modified. Use computed
        properties like `improvement`, `improved`, and `agent_names` to analyze
        results without modifying the underlying data.
    """

    schema_version: int = CURRENT_SCHEMA_VERSION
    stop_reason: StopReason = StopReason.COMPLETED
    evolved_components: dict[str, str]
    original_score: float
    final_score: float
    primary_agent: str
    iteration_history: list[IterationRecord]
    total_iterations: int
    original_components: dict[str, str] | None = None

    def to_dict(self) -> dict[str, Any]:
        """Serialize this result to a stdlib-only dict.

        Returns:
            Dict containing all 9 fields. ``stop_reason`` is serialized
            as its string value. ``iteration_history`` is serialized as a
            list of dicts. Output is directly ``json.dumps()``-compatible.
        """
        return {
            "schema_version": self.schema_version,
            "stop_reason": self.stop_reason.value,
            "evolved_components": self.evolved_components,
            "original_score": self.original_score,
            "final_score": self.final_score,
            "primary_agent": self.primary_agent,
            "iteration_history": [r.to_dict() for r in self.iteration_history],
            "total_iterations": self.total_iterations,
            "original_components": self.original_components,
        }

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "MultiAgentEvolutionResult":
        """Reconstruct a MultiAgentEvolutionResult from a dict.

        Validates schema version, applies migration if needed, and
        reconstructs all nested objects including optional
        original_components.

        Args:
            data: Dict containing multi-agent evolution result fields.

        Returns:
            Reconstructed MultiAgentEvolutionResult instance.

        Raises:
            ConfigurationError: If ``schema_version`` exceeds the current
                version or ``stop_reason`` is not a valid enum value.
            KeyError: If a required field is missing from the dict.
        """
        version = data.get("schema_version", 1)
        if version > CURRENT_SCHEMA_VERSION:
            raise ConfigurationError(
                f"Cannot deserialize result with schema_version {version} "
                f"(current version is {CURRENT_SCHEMA_VERSION}). "
                f"Upgrade gepa-adk to load this result.",
                field="schema_version",
                value=version,
                constraint=f"<= {CURRENT_SCHEMA_VERSION}",
            )
        migrated = _migrate_result_dict(data, from_version=version)
        try:
            stop_reason = StopReason(migrated.get("stop_reason", "completed"))
        except ValueError:
            raw = migrated.get("stop_reason")
            raise ConfigurationError(
                f"Invalid stop_reason value: {raw!r}",
                field="stop_reason",
                value=raw,
                constraint=("one of: " + ", ".join(sr.value for sr in StopReason)),
            ) from None
        return cls(
            schema_version=migrated["schema_version"],
            stop_reason=stop_reason,
            evolved_components=migrated["evolved_components"],
            original_score=migrated["original_score"],
            final_score=migrated["final_score"],
            primary_agent=migrated["primary_agent"],
            iteration_history=[
                IterationRecord.from_dict(r)
                for r in migrated.get("iteration_history", [])
            ],
            total_iterations=migrated["total_iterations"],
            original_components=migrated.get("original_components"),
        )

    @property
    def improvement(self) -> float:
        """Calculate the score improvement from original to final.

        Returns:
            The difference between final_score and original_score.
            Positive values indicate improvement, negative indicates degradation.

        Note:
            Override is not needed since frozen dataclasses support properties.
        """
        return self.final_score - self.original_score

    @property
    def improved(self) -> bool:
        """Check if the final score is better than the original.

        Returns:
            True if final_score > original_score, False otherwise.

        Note:
            Only returns True for strict improvement, not equal scores.
        """
        return self.final_score > self.original_score

    @property
    def agent_names(self) -> list[str]:
        """Get sorted list of evolved agent names.

        Returns:
            Sorted list of agent names from evolved_components keys.

        Note:
            Outputs a new list each time, sorted alphabetically for
            consistent ordering regardless of insertion order.
        """
        return sorted(self.evolved_components.keys())

    def __repr__(self) -> str:
        """Narrative summary of the multi-agent evolution result.

        Returns:
            Human-readable multi-line summary with improvement percentage,
            iterations, stop reason, primary agent, agent names, and
            acceptance rate. Uses 2-space indent, no box-drawing characters,
            every line greppable.
        """
        if abs(self.original_score) < 1e-9:
            imp_str = f"{self.improvement:+.4f} improvement"
        else:
            pct = self.improvement * 100 / abs(self.original_score)
            sign = "+" if pct > 0 else ""
            imp_str = f"{sign}{pct:.1f}% improvement"
        lines = [
            f"MultiAgentEvolutionResult: {imp_str} "
            f"({self.original_score:.2f} \u2192 {self.final_score:.2f})",
            f"  iterations: {self.total_iterations}, "
            f"stop_reason: {self.stop_reason.value}",
            f"  primary_agent: {self.primary_agent}",
            f"  agents: {', '.join(sorted(self.evolved_components))}",
        ]
        if self.total_iterations > 0:
            accepted = sum(1 for r in self.iteration_history if r.accepted)
            lines.append(f"  acceptance_rate: {accepted}/{self.total_iterations}")
        return "\n".join(lines)

    def show_diff(self, original_components: dict[str, str] | None = None) -> str:
        """Show unified diff between original and evolved components.

        Uses stored ``original_components`` if no explicit argument is
        provided. Produces git-diff-style output (``---``/``+++``/``@@``)
        for each component that changed.

        Args:
            original_components: Pre-evolution component values. If None,
                falls back to ``self.original_components``.

        Returns:
            Unified diff string, or ``"No changes detected."`` if all
            components are identical.

        Raises:
            ValueError: If both the argument and ``self.original_components``
                are None.
        """
        originals = (
            original_components
            if original_components is not None
            else self.original_components
        )
        if originals is None:
            raise ValueError(
                "No original components available. "
                "Pass original_components or use a result that stores them."
            )
        return _build_diff(self.evolved_components, originals)

    def _repr_html_(self) -> str:
        """Render an HTML summary for Jupyter notebooks.

        Returns:
            HTML string with summary and components tables. Uses semantic
            ``<th>`` header elements and inline CSS for portability across
            JupyterLab, Colab, and VS Code. Iteration history is wrapped
            in a collapsible ``<details>``/``<summary>`` block.
        """
        if abs(self.original_score) < 1e-9:
            imp_html = f"{self.improvement:+.4f}"
        else:
            pct = self.improvement * 100 / abs(self.original_score)
            sign = "+" if pct > 0 else ""
            imp_html = f"{sign}{pct:.1f}%"
        style = (
            'style="border-collapse:collapse;'
            "font-family:monospace;font-size:13px;"
            'margin:8px 0"'
        )
        td = 'style="padding:4px 12px;border:1px solid #ddd"'
        th = 'style="padding:4px 12px;border:1px solid #ddd;background:#f5f5f5"'

        rows = [
            f"<tr><th {th}>Improvement</th><td {td}>{imp_html}</td></tr>",
            f"<tr><th {th}>Original Score</th>"
            f"<td {td}>{self.original_score:.4f}</td></tr>",
            f"<tr><th {th}>Final Score</th><td {td}>{self.final_score:.4f}</td></tr>",
            f"<tr><th {th}>Iterations</th><td {td}>{self.total_iterations}</td></tr>",
            f"<tr><th {th}>Stop Reason</th>"
            f"<td {td}>{html_mod.escape(self.stop_reason.value)}</td></tr>",
            f"<tr><th {th}>Primary Agent</th>"
            f"<td {td}>{html_mod.escape(self.primary_agent)}</td></tr>",
        ]
        summary_table = f"<table {style}>{''.join(rows)}</table>"

        comp_rows = []
        for name in sorted(self.evolved_components):
            val = html_mod.escape(_truncate(self.evolved_components[name], 200))
            comp_rows.append(
                f"<tr><td {td}>{html_mod.escape(name)}</td><td {td}>{val}</td></tr>"
            )
        comp_header = f"<tr><th {th}>Agent</th><th {th}>Evolved Value</th></tr>"
        comp_table = f"<table {style}>{comp_header}{''.join(comp_rows)}</table>"

        history_rows = []
        for rec in self.iteration_history:
            accepted_mark = "\u2713" if rec.accepted else ""
            history_rows.append(
                f"<tr><td {td}>{rec.iteration_number}</td>"
                f"<td {td}>{rec.score:.4f}</td>"
                f"<td {td}>{html_mod.escape(rec.evolved_component)}</td>"
                f"<td {td}>{accepted_mark}</td></tr>"
            )
        history_header = (
            f"<tr><th {th}>#</th><th {th}>Score</th>"
            f"<th {th}>Component</th><th {th}>Accepted</th></tr>"
        )
        history_table = (
            f"<table {style}>{history_header}{''.join(history_rows)}</table>"
        )
        history_section = (
            "<details><summary>Iteration History "
            f"({len(self.iteration_history)} records)</summary>"
            f"{history_table}</details>"
        )

        return (
            "<div>"
            "<strong>MultiAgentEvolutionResult</strong>"
            f"{summary_table}{comp_table}{history_section}"
            "</div>"
        )

improvement property

improvement: float

Calculate the score improvement from original to final.

RETURNS DESCRIPTION
float

The difference between final_score and original_score. Positive values indicate improvement, negative indicates degradation.

Note

Override is not needed since frozen dataclasses support properties.

improved property

improved: bool

Check if the final score is better than the original.

RETURNS DESCRIPTION
bool

True if final_score > original_score, False otherwise.

Note

Only returns True for strict improvement, not equal scores.

agent_names property

agent_names: list[str]

Get sorted list of evolved agent names.

RETURNS DESCRIPTION
list[str]

Sorted list of agent names from evolved_components keys.

Note

Outputs a new list each time, sorted alphabetically for consistent ordering regardless of insertion order.

to_dict

to_dict() -> dict[str, Any]

Serialize this result to a stdlib-only dict.

RETURNS DESCRIPTION
dict[str, Any]

Dict containing all 9 fields. stop_reason is serialized as its string value. iteration_history is serialized as a list of dicts. Output is directly json.dumps()-compatible.

Source code in src/gepa_adk/domain/models.py
def to_dict(self) -> dict[str, Any]:
    """Serialize this result to a stdlib-only dict.

    Returns:
        Dict containing all 9 fields. ``stop_reason`` is serialized
        as its string value. ``iteration_history`` is serialized as a
        list of dicts. Output is directly ``json.dumps()``-compatible.
    """
    return {
        "schema_version": self.schema_version,
        "stop_reason": self.stop_reason.value,
        "evolved_components": self.evolved_components,
        "original_score": self.original_score,
        "final_score": self.final_score,
        "primary_agent": self.primary_agent,
        "iteration_history": [r.to_dict() for r in self.iteration_history],
        "total_iterations": self.total_iterations,
        "original_components": self.original_components,
    }

from_dict classmethod

from_dict(
    data: dict[str, Any],
) -> MultiAgentEvolutionResult

Reconstruct a MultiAgentEvolutionResult from a dict.

Validates schema version, applies migration if needed, and reconstructs all nested objects including optional original_components.

PARAMETER DESCRIPTION
data

Dict containing multi-agent evolution result fields.

TYPE: dict[str, Any]

RETURNS DESCRIPTION
MultiAgentEvolutionResult

Reconstructed MultiAgentEvolutionResult instance.

RAISES DESCRIPTION
ConfigurationError

If schema_version exceeds the current version or stop_reason is not a valid enum value.

KeyError

If a required field is missing from the dict.

Source code in src/gepa_adk/domain/models.py
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "MultiAgentEvolutionResult":
    """Reconstruct a MultiAgentEvolutionResult from a dict.

    Validates schema version, applies migration if needed, and
    reconstructs all nested objects including optional
    original_components.

    Args:
        data: Dict containing multi-agent evolution result fields.

    Returns:
        Reconstructed MultiAgentEvolutionResult instance.

    Raises:
        ConfigurationError: If ``schema_version`` exceeds the current
            version or ``stop_reason`` is not a valid enum value.
        KeyError: If a required field is missing from the dict.
    """
    version = data.get("schema_version", 1)
    if version > CURRENT_SCHEMA_VERSION:
        raise ConfigurationError(
            f"Cannot deserialize result with schema_version {version} "
            f"(current version is {CURRENT_SCHEMA_VERSION}). "
            f"Upgrade gepa-adk to load this result.",
            field="schema_version",
            value=version,
            constraint=f"<= {CURRENT_SCHEMA_VERSION}",
        )
    migrated = _migrate_result_dict(data, from_version=version)
    try:
        stop_reason = StopReason(migrated.get("stop_reason", "completed"))
    except ValueError:
        raw = migrated.get("stop_reason")
        raise ConfigurationError(
            f"Invalid stop_reason value: {raw!r}",
            field="stop_reason",
            value=raw,
            constraint=("one of: " + ", ".join(sr.value for sr in StopReason)),
        ) from None
    return cls(
        schema_version=migrated["schema_version"],
        stop_reason=stop_reason,
        evolved_components=migrated["evolved_components"],
        original_score=migrated["original_score"],
        final_score=migrated["final_score"],
        primary_agent=migrated["primary_agent"],
        iteration_history=[
            IterationRecord.from_dict(r)
            for r in migrated.get("iteration_history", [])
        ],
        total_iterations=migrated["total_iterations"],
        original_components=migrated.get("original_components"),
    )
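The forward-compatibility guard follows a common pattern: treat a missing version as the original schema, migrate older payloads, and refuse anything newer than the current version. The guard in isolation, sketched with a plain `ValueError` in place of the library's `ConfigurationError`:

```python
CURRENT_SCHEMA_VERSION = 1


def check_schema_version(data: dict) -> int:
    # Missing version is treated as the original schema (version 1).
    version = data.get("schema_version", 1)
    if version > CURRENT_SCHEMA_VERSION:
        raise ValueError(
            f"Cannot deserialize schema_version {version}; "
            f"current version is {CURRENT_SCHEMA_VERSION}"
        )
    return version


ok = check_schema_version({"schema_version": 1})
try:
    check_schema_version({"schema_version": 2})
    rejected = False
except ValueError:
    rejected = True
```

Rejecting newer versions (rather than guessing at unknown fields) keeps a stale installation from silently misreading results produced by a newer gepa-adk.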

__repr__

__repr__() -> str

Narrative summary of the multi-agent evolution result.

RETURNS DESCRIPTION
str

Human-readable multi-line summary with improvement percentage, iterations, stop reason, primary agent, agent names, and acceptance rate. Uses 2-space indent, no box-drawing characters, every line greppable.

Source code in src/gepa_adk/domain/models.py
def __repr__(self) -> str:
    """Narrative summary of the multi-agent evolution result.

    Returns:
        Human-readable multi-line summary with improvement percentage,
        iterations, stop reason, primary agent, agent names, and
        acceptance rate. Uses 2-space indent, no box-drawing characters,
        every line greppable.
    """
    if abs(self.original_score) < 1e-9:
        imp_str = f"{self.improvement:+.4f} improvement"
    else:
        pct = self.improvement * 100 / abs(self.original_score)
        sign = "+" if pct > 0 else ""
        imp_str = f"{sign}{pct:.1f}% improvement"
    lines = [
        f"MultiAgentEvolutionResult: {imp_str} "
        f"({self.original_score:.2f} \u2192 {self.final_score:.2f})",
        f"  iterations: {self.total_iterations}, "
        f"stop_reason: {self.stop_reason.value}",
        f"  primary_agent: {self.primary_agent}",
        f"  agents: {', '.join(sorted(self.evolved_components))}",
    ]
    if self.total_iterations > 0:
        accepted = sum(1 for r in self.iteration_history if r.accepted)
        lines.append(f"  acceptance_rate: {accepted}/{self.total_iterations}")
    return "\n".join(lines)
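The zero-baseline guard in `__repr__` avoids dividing by a near-zero original score, falling back to the absolute delta in that case. The formatting logic extracted on its own:

```python
def format_improvement(original: float, final: float) -> str:
    # Near-zero baselines get an absolute delta; otherwise a signed percentage.
    delta = final - original
    if abs(original) < 1e-9:
        return f"{delta:+.4f} improvement"
    pct = delta * 100 / abs(original)
    sign = "+" if pct > 0 else ""
    return f"{sign}{pct:.1f}% improvement"


print(format_improvement(0.60, 0.85))  # +41.7% improvement
print(format_improvement(0.0, 0.30))   # +0.3000 improvement
```

Dividing by `abs(original)` rather than `original` keeps the sign of the percentage aligned with the sign of the delta even for negative baselines.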

show_diff

show_diff(
    original_components: dict[str, str] | None = None,
) -> str

Show unified diff between original and evolved components.

Uses stored original_components if no explicit argument is provided. Produces git-diff-style output (---/+++/@@) for each component that changed.

PARAMETER DESCRIPTION
original_components

Pre-evolution component values. If None, falls back to self.original_components.

TYPE: dict[str, str] | None DEFAULT: None

RETURNS DESCRIPTION
str

Unified diff string, or "No changes detected." if all components are identical.

RAISES DESCRIPTION
ValueError

If both the argument and self.original_components are None.

Source code in src/gepa_adk/domain/models.py
def show_diff(self, original_components: dict[str, str] | None = None) -> str:
    """Show unified diff between original and evolved components.

    Uses stored ``original_components`` if no explicit argument is
    provided. Produces git-diff-style output (``---``/``+++``/``@@``)
    for each component that changed.

    Args:
        original_components: Pre-evolution component values. If None,
            falls back to ``self.original_components``.

    Returns:
        Unified diff string, or ``"No changes detected."`` if all
        components are identical.

    Raises:
        ValueError: If both the argument and ``self.original_components``
            are None.
    """
    originals = (
        original_components
        if original_components is not None
        else self.original_components
    )
    if originals is None:
        raise ValueError(
            "No original components available. "
            "Pass original_components or use a result that stores them."
        )
    return _build_diff(self.evolved_components, originals)
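The internal `_build_diff` helper is not shown on this page; a plausible stdlib-only sketch of the same idea, assuming one `difflib.unified_diff` per changed component (illustrative, not the actual implementation):

```python
import difflib


def build_diff_sketch(evolved: dict[str, str], originals: dict[str, str]) -> str:
    # Emit one git-style unified diff per component whose text changed.
    chunks = []
    for name in sorted(evolved):
        before = originals.get(name, "")
        after = evolved[name]
        if before == after:
            continue
        diff = difflib.unified_diff(
            before.splitlines(keepends=True),
            after.splitlines(keepends=True),
            fromfile=f"a/{name}",
            tofile=f"b/{name}",
        )
        chunks.append("".join(diff))
    return "\n".join(chunks) if chunks else "No changes detected."


out = build_diff_sketch(
    {"generator": "Generate high-quality code\n"},
    {"generator": "Generate code\n"},
)
```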