ADR-010: Docstring Quality Standards¶

Status: Accepted Date: 2026-01-10 Deciders: gepa-adk maintainers

Context¶

Documentation serves multiple audiences: - Users: Understanding gepa-adk's public APIs through rendered documentation - Developers: Inline context while reading/modifying code - Tooling: Type checkers, linters, IDE autocomplete

We use mkdocs-material with mkdocstrings-python which parses Google-style docstrings to generate API documentation.

Problem: Docstrings can be: - Present but incorrect (Args don't match code) - Accurate but stale (code changed, docstring didn't) - Compliant but minimal (missing Examples, Attributes)

Decision¶

Convention: Google-Style Docstrings¶

Follow Google's Python Style Guide exclusively.

Supported Sections¶

Core Sections (validated by tools):

Section	Purpose
Args	Function/method parameters
Returns	Return value type and description
Yields	Values yielded by generators
Raises	Exceptions that may be raised
Attributes	Class attributes
Examples	Usage examples with fenced code blocks

Admonition Sections (rendered as callout boxes):

Section	Rendering	Use Case
Note	Blue info box	Implementation details, caveats
Warning	Orange warning box	Critical warnings about misuse
Tip	Green tip box	Best practices, suggestions
See Also	Bullet list with links	Cross-references to related modules

Quality Dimensions¶

+---------------------------------------------------------------+
|                    Docstring Quality Model                     |
+---------------------------------------------------------------+
                              |
              +---------------+---------------+
              |                               |
              v                               v
+----------------------+           +----------------------+
|   1. COMPLIANCE      |           |    2. COVERAGE       |
|   Do sections match  |           |   Does it exist?     |
|   the code?          |           |                      |
|                      |           |                      |
|   Tool: ruff D rules |           |   Tool: interrogate  |
|                      |           |   Target: 95%+       |
+----------------------+           +----------------------+

Enforcement Tools¶

1. Coverage: interrogate¶

What it checks: Docstring presence (modules, classes, functions)

# pyproject.toml (already configured)
[tool.interrogate]
verbose = 1
fail-under = 95
ignore-init-method = true
ignore-init-module = true
ignore-magic = true
ignore-private = true
exclude = ["tests", "scripts"]

2. Style: ruff pydocstyle¶

What it checks: Google-style compliance

# pyproject.toml (already configured)
[tool.ruff.lint]
extend-select = ["D"]  # pydocstyle

[tool.ruff.lint.pydocstyle]
convention = "google"

Usage Guidelines¶

Public Functions¶

Required sections: - Summary (first line) - Args (if has parameters) - Returns (if returns non-None) - Raises (if raises exceptions)

async def evolve(
    agent: LlmAgent,
    trainset: list[dict],
    *,
    critic: LlmAgent | None = None,
    config: EvolutionConfig | None = None,
) -> EvolutionResult:
    """Evolve agent instructions using GEPA algorithm.

    Args:
        agent: ADK agent to evolve.
        trainset: Training examples with input/expected pairs.
        critic: Optional critic agent for scoring.
        config: Evolution configuration.

    Returns:
        EvolutionResult with evolved instruction and metrics.

    Raises:
        EvaluationError: If agent execution fails.
        ConfigurationError: If config is invalid.

    Examples:
        Basic evolution:

        ```python
        result = await evolve(agent, trainset, critic=critic)
        print(result.evolved_instruction)
        ```
    """

Public Classes¶

Required sections: - Summary (first line) - Attributes (if has instance attributes) - Examples (for complex classes)

class AsyncGEPAEngine:
    """Async-first evolution engine implementing GEPA algorithm.

    This engine orchestrates the evolution loop: evaluate, propose,
    accept/reject, repeat until convergence or max iterations.

    Attributes:
        adapter: AsyncGEPAAdapter for agent evaluation.
        config: Evolution configuration.
        proposer: Mutation proposer for generating candidates.

    Examples:
        Using with mock adapter:

        ```python
        engine = AsyncGEPAEngine(adapter=mock_adapter, config=config)
        result = await engine.run()
        ```
    """

Protocol Definitions¶

Document the contract clearly:

class AsyncGEPAAdapter(Protocol):
    """Protocol for async GEPA adapters.

    Implementations must provide async methods for evaluation,
    reflective dataset creation, and proposal generation.

    All methods are coroutines to enable concurrent execution.
    """

    async def evaluate(
        self,
        batch: list[DataInst],
        candidate: dict[str, str],
        capture_traces: bool = False,
    ) -> EvaluationBatch:
        """Execute candidate on batch.

        Args:
            batch: Input examples to evaluate.
            candidate: Candidate instruction texts.
            capture_traces: Whether to capture execution traces.

        Returns:
            EvaluationBatch with outputs, scores, and trajectories.
        """
        ...

When to Use Admonitions¶

Note (Blue): Implementation details, performance characteristics

async def evaluate_batch(self, batch: list[DataInst]) -> EvaluationBatch:
    """Evaluate batch with controlled concurrency.

    Note:
        Concurrency is limited by `config.max_concurrent_evals`.
        Default is 5 parallel evaluations.
    """

Warning (Orange): Critical warnings, common pitfalls

def evolve_sync(agent: LlmAgent, trainset: list[dict], **kwargs) -> EvolutionResult:
    """Synchronous wrapper for evolve().

    Warning:
        Creates new event loop internally. Do NOT call from
        existing async context - use `evolve()` directly instead.
    """

See Also (Bullet list): Cross-references to related modules

"""Package docstring.

See Also:
    - [`gepa_adk.domain`][gepa_adk.domain]: Core domain models.
    - [`gepa_adk.ports`][gepa_adk.ports]: Port interfaces.
"""

Syntax: Use [`display.text`][target.identifier] for mkdocstrings cross-references. Sphinx-style :mod: syntax is not supported by mkdocstrings.

What NOT to Document¶

Don't Document	Why	Alternative
Private functions (`_helper`)	Internal implementation	Inline comments if complex
Self-evident parameters	Noise without value	Skip obvious Args
Type info in docstrings	Duplicates type hints	Use type hints in signature

Example of over-documentation:

# Don't do this
def add(a: int, b: int) -> int:
    """Add two numbers.

    Args:
        a: The first integer to add.
        b: The second integer to add.

    Returns:
        The sum of a and b as an integer.
    """
    return a + b

Better:

# Do this
def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b

Workflow¶

Development¶

Write code with docstrings following Google style
Run uv run ruff check --fix to check style
Run uv run interrogate -v src/ to check coverage
Commit

CI/CD¶

Pre-commit: ruff D rules (blocks if style violations)
Pre-commit: interrogate (blocks if coverage <95%)

Consequences¶

Positive¶

Consistent docstring style across codebase
Automated enforcement via existing tools
Beautiful rendered documentation via mkdocs-material
Clear guidelines for contributors

Negative¶

Learning curve for Google-style sections
Time investment to document existing code

Neutral¶

Documentation quality is measurable and trackable

References¶

Google Python Style Guide: Docstrings
mkdocstrings-python: Google Style
interrogate Documentation
ADR-005: Three-Layer Testing Strategy