ADR-010: Docstring Quality Standards
Status: Accepted
Date: 2026-01-10
Deciders: gepa-adk maintainers
Context
Documentation serves multiple audiences:
- Users: Understanding gepa-adk's public APIs through rendered documentation
- Developers: Inline context while reading/modifying code
- Tooling: Type checkers, linters, IDE autocomplete
We use mkdocs-material with mkdocstrings-python, which parses Google-style docstrings to generate API documentation.
Problem: Docstrings can be:
- Present but incorrect (Args don't match code)
- Accurate but stale (code changed, docstring didn't)
- Compliant but minimal (missing Examples, Attributes)
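The first failure mode (Args that do not match the code) can be detected mechanically. The following is a rough sketch, not part of gepa-adk or of any tool named in this ADR: the helper names `documented_args`, `signature_args`, and `docstring_drift` are hypothetical, and the parser assumes four-space indentation and ignores many Google-style edge cases.

```python
import inspect
import re


def documented_args(func) -> set[str]:
    """Collect parameter names listed in a Google-style Args section."""
    doc = inspect.getdoc(func) or ""  # getdoc() strips the common indentation
    names: set[str] = set()
    in_args = False
    for line in doc.splitlines():
        stripped = line.strip()
        if stripped == "Args:":
            in_args = True
            continue
        if in_args:
            # A non-empty line at column 0 starts the next section.
            if stripped and not line.startswith(" "):
                in_args = False
                continue
            # Entries sit one level in: "    name: ..." or "    name (type): ..."
            match = re.match(r"^ {4}(\*{0,2}\w+)(?: \(.*?\))?:", line)
            if match:
                names.add(match.group(1).lstrip("*"))
    return names


def signature_args(func) -> set[str]:
    """Collect parameter names from the signature, ignoring self and cls."""
    params = inspect.signature(func).parameters
    return {name for name in params if name not in ("self", "cls")}


def docstring_drift(func) -> dict[str, set[str]]:
    """Report parameters missing from the docstring and entries with no parameter."""
    documented = documented_args(func)
    actual = signature_args(func)
    return {"undocumented": actual - documented, "stale": documented - actual}


# Hypothetical function whose docstring was not updated after a rename.
def scale(value: float, factor: float) -> float:
    """Scale a value.

    Args:
        value: Number to scale.
        ratio: Multiplier applied to value.

    Returns:
        The scaled value.
    """
    return value * factor


print(docstring_drift(scale))
```

For `scale`, the report lists `factor` as undocumented and `ratio` as stale, which is the "present but incorrect" case described above.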
Decision
Convention: Google-Style Docstrings
Follow Google's Python Style Guide exclusively.
Supported Sections
Core Sections (validated by tools):
| Section | Purpose |
|---|---|
| Args | Function/method parameters |
| Returns | Return value type and description |
| Yields | Values yielded by generators |
| Raises | Exceptions that may be raised |
| Attributes | Class attributes |
| Examples | Usage examples with fenced code blocks |
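As an illustration of the Yields section, here is a minimal sketch using a hypothetical `iter_batches` generator (not part of gepa-adk):

```python
from collections.abc import Iterator


def iter_batches(items: list[str], batch_size: int) -> Iterator[list[str]]:
    """Split items into consecutive batches.

    Args:
        items: Items to split, in order.
        batch_size: Maximum number of items per batch.

    Yields:
        Lists of at most batch_size items, preserving input order.

    Raises:
        ValueError: If batch_size is less than 1.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # Step through the list one batch at a time.
    for start in range(0, len(items), batch_size):
        yield items[start : start + batch_size]
```

Because this is a generator, the `ValueError` is raised when iteration starts, not when `iter_batches` is called.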
Admonition Sections (rendered as callout boxes):
| Section | Rendering | Use Case |
|---|---|---|
| Note | Blue info box | Implementation details, caveats |
| Warning | Orange warning box | Critical warnings about misuse |
| Tip | Green tip box | Best practices, suggestions |
| See Also | Bullet list with links | Cross-references to related modules |
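As an illustration of the Tip admonition, here is a minimal sketch. The `load_trainset` helper is hypothetical and not part of gepa-adk; only the docstring layout matters.

```python
import json


def load_trainset(lines: list[str]) -> list[dict]:
    """Parse JSON Lines text into training examples.

    Args:
        lines: Raw lines, one JSON object per line. Blank lines are skipped.

    Returns:
        Parsed examples in input order.

    Tip:
        Validate each example's keys right after loading so that schema
        problems surface before any evaluation runs.
    """
    return [json.loads(line) for line in lines if line.strip()]
```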
Quality Dimensions
+---------------------------------------------------------------+
|                    Docstring Quality Model                    |
+---------------------------------------------------------------+
                                |
                +---------------+---------------+
                |                               |
                v                               v
     +----------------------+        +----------------------+
     | 1. COMPLIANCE        |        | 2. COVERAGE          |
     | Do sections match    |        | Does it exist?       |
     | the code?            |        |                      |
     |                      |        |                      |
     | Tool: ruff D rules   |        | Tool: interrogate    |
     |                      |        | Target: 95%+         |
     +----------------------+        +----------------------+
Enforcement Tools
1. Coverage: interrogate
What it checks: Docstring presence (modules, classes, functions)
# pyproject.toml (already configured)
[tool.interrogate]
verbose = 1
fail-under = 95
ignore-init-method = true
ignore-init-module = true
ignore-magic = true
ignore-private = true
exclude = ["tests", "scripts"]
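To make the coverage number concrete, here is a simplified sketch of the kind of measurement a coverage tool performs. It is not interrogate's implementation: `docstring_coverage` is a hypothetical helper, and skipping every underscore-prefixed name is only an approximation of the ignore settings above.

```python
import ast


def docstring_coverage(source: str) -> float:
    """Return the percentage of public definitions in source that have docstrings."""
    tree = ast.parse(source)
    nodes = [tree]  # the module itself counts as one documentable unit
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # Assumption: skip all underscore-prefixed names, dunders included.
            if not node.name.startswith("_"):
                nodes.append(node)
    documented = sum(1 for node in nodes if ast.get_docstring(node) is not None)
    return 100.0 * documented / len(nodes)


SAMPLE = '''"""Module docstring."""

def documented():
    """Has a docstring."""

def undocumented():
    pass

def _private():
    pass
'''

coverage = docstring_coverage(SAMPLE)
status = "pass" if coverage >= 95 else "fail"
print(f"{coverage:.1f}% (threshold 95%): {status}")
```

In the sample, two of the three counted units (the module and `documented`) have docstrings, so the run falls well short of the 95% threshold.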
2. Style: ruff pydocstyle
What it checks: Google-style compliance
# pyproject.toml (already configured)
[tool.ruff.lint]
extend-select = ["D"] # pydocstyle
[tool.ruff.lint.pydocstyle]
convention = "google"
Usage Guidelines
Public Functions
Required sections:
- Summary (first line)
- Args (if has parameters)
- Returns (if returns non-None)
- Raises (if raises exceptions)

async def evolve(
    agent: LlmAgent,
    trainset: list[dict],
    *,
    critic: LlmAgent | None = None,
    config: EvolutionConfig | None = None,
) -> EvolutionResult:
    """Evolve agent instructions using GEPA algorithm.

    Args:
        agent: ADK agent to evolve.
        trainset: Training examples with input/expected pairs.
        critic: Optional critic agent for scoring.
        config: Evolution configuration.

    Returns:
        EvolutionResult with evolved instruction and metrics.

    Raises:
        EvaluationError: If agent execution fails.
        ConfigurationError: If config is invalid.

    Examples:
        Basic evolution:

        ```python
        result = await evolve(agent, trainset, critic=critic)
        print(result.evolved_instruction)
        ```
    """
Public Classes
Required sections:
- Summary (first line)
- Attributes (if has instance attributes)
- Examples (for complex classes)

class AsyncGEPAEngine:
    """Async-first evolution engine implementing GEPA algorithm.

    This engine orchestrates the evolution loop: evaluate, propose,
    accept/reject, repeat until convergence or max iterations.

    Attributes:
        adapter: AsyncGEPAAdapter for agent evaluation.
        config: Evolution configuration.
        proposer: Mutation proposer for generating candidates.

    Examples:
        Using with mock adapter:

        ```python
        engine = AsyncGEPAEngine(adapter=mock_adapter, config=config)
        result = await engine.run()
        ```
    """
Protocol Definitions
Document the contract clearly:

class AsyncGEPAAdapter(Protocol):
    """Protocol for async GEPA adapters.

    Implementations must provide async methods for evaluation,
    reflective dataset creation, and proposal generation.
    All methods are coroutines to enable concurrent execution.
    """

    async def evaluate(
        self,
        batch: list[DataInst],
        candidate: dict[str, str],
        capture_traces: bool = False,
    ) -> EvaluationBatch:
        """Execute candidate on batch.

        Args:
            batch: Input examples to evaluate.
            candidate: Candidate instruction texts.
            capture_traces: Whether to capture execution traces.

        Returns:
            EvaluationBatch with outputs, scores, and trajectories.
        """
        ...
When to Use Admonitions
Note (Blue): Implementation details, performance characteristics

async def evaluate_batch(self, batch: list[DataInst]) -> EvaluationBatch:
    """Evaluate batch with controlled concurrency.

    Note:
        Concurrency is limited by `config.max_concurrent_evals`.
        Default is 5 parallel evaluations.
    """
Warning (Orange): Critical warnings, common pitfalls

def evolve_sync(agent: LlmAgent, trainset: list[dict], **kwargs) -> EvolutionResult:
    """Synchronous wrapper for evolve().

    Warning:
        Creates a new event loop internally. Do NOT call from an
        existing async context; use `evolve()` directly instead.
    """
See Also (Bullet list): Cross-references to related modules

"""Package docstring.

See Also:
    - [`gepa_adk.domain`][gepa_adk.domain]: Core domain models.
    - [`gepa_adk.ports`][gepa_adk.ports]: Port interfaces.
"""
Syntax: Use [`display.text`][target.identifier] for mkdocstrings cross-references. Sphinx-style `:mod:` syntax is not supported by mkdocstrings.
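When migrating docstrings written for Sphinx, simple roles can be rewritten mechanically. The following is a minimal sketch under stated assumptions: `to_mkdocstrings_refs` is a hypothetical helper, it only covers a handful of common roles such as `:mod:` and `:class:`, and it does not handle the Sphinx `title <target>` form.

```python
import re

# Matches simple Sphinx roles; a leading "~" on the target is dropped.
SPHINX_ROLE = re.compile(r":(?:py:)?(?:mod|class|func|meth|attr|exc|data):`~?([^`]+)`")


def to_mkdocstrings_refs(text: str) -> str:
    """Rewrite simple Sphinx-style roles as mkdocstrings cross-references."""
    # Use the dotted path both as display text and as the target identifier.
    return SPHINX_ROLE.sub(lambda m: f"[`{m.group(1)}`][{m.group(1)}]", text)


print(to_mkdocstrings_refs("See :mod:`gepa_adk.domain` for models."))
```

Text without a matching role passes through unchanged, so the helper can be run over whole docstrings.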
What NOT to Document
| Don't Document | Why | Alternative |
|---|---|---|
| Private functions (`_helper`) | Internal implementation | Inline comments if complex |
| Self-evident parameters | Noise without value | Skip obvious Args |
| Type info in docstrings | Duplicates type hints | Use type hints in signature |
Example of over-documentation:

# Don't do this
def add(a: int, b: int) -> int:
    """Add two numbers.

    Args:
        a: The first integer to add.
        b: The second integer to add.

    Returns:
        The sum of a and b as an integer.
    """
    return a + b
Better:
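A minimal sketch of the leaner form, assuming the summary line and the type hints already tell the reader everything the longer docstring did:

```python
def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b
```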
Workflow
Development
- Write code with docstrings following Google style
- Run `uv run ruff check --fix` to check style and apply safe fixes
- Run `uv run interrogate -v src/` to check coverage
- Commit
CI/CD
- Pre-commit: ruff D rules (blocks on style violations)
- Pre-commit: interrogate (blocks if coverage is below 95%)
Consequences
Positive
- Consistent docstring style across codebase
- Automated enforcement via existing tools
- Beautiful rendered documentation via mkdocs-material
- Clear guidelines for contributors
Negative
- Learning curve for Google-style sections
- Time investment to document existing code
Neutral
- Documentation quality is measurable and trackable
References
- Google Python Style Guide: Docstrings
- mkdocstrings-python: Google Style
- interrogate Documentation
- ADR-005: Three-Layer Testing Strategy