55 commits
6412a5d
feat(gepa): add tool description optimization for multi-agent systems
Ju-usc Oct 10, 2025
cf0be4f
style: fix ruff formatting (trailing whitespace)
Ju-usc Oct 10, 2025
aa53fe2
style: apply ruff formatting fixes
Ju-usc Oct 10, 2025
045c6cf
feat(gepa): implement tool-specific proposer for tool descriptions
Ju-usc Oct 10, 2025
c4f2041
docs(gepa): clean up multi-agent example code
Ju-usc Oct 10, 2025
260ca80
refactor(gepa): simplify tool reflective dataset with ReAct context r…
Ju-usc Oct 11, 2025
04f7e3d
fix(gepa): unify custom proposer routing for tools
Ju-usc Oct 12, 2025
f92e184
docs(gepa): clarify tool reflection prompt
Ju-usc Oct 12, 2025
7178869
test: streamline GEPA tool optimization tests
Ju-usc Oct 12, 2025
e34703b
fix(gepa): streamline tool proposer formatting
Ju-usc Oct 12, 2025
3f05311
test(gepa): drop legacy dummy tool fixture
Ju-usc Oct 12, 2025
4df9ce5
docs(gepa): add tool-specific reflection prompt and metric example
Ju-usc Oct 12, 2025
4296ccf
docs(gepa): fix implementation details with accurate code flow
Ju-usc Oct 13, 2025
ea1204a
docs(gepa): remove backward compatibility note
Ju-usc Oct 13, 2025
48d5cd6
docs(gepa): improve usage examples with optimization visualization
Ju-usc Oct 13, 2025
548d9b6
docs(gepa): add design rationale comments for tool context sharing
Ju-usc Oct 13, 2025
e61d0a1
docs(gepa): add tool optimization links to overview and parameter docs
Ju-usc Oct 13, 2025
5c95412
docs(gepa): refine tool optimization scenarios and remove implementat…
Ju-usc Oct 13, 2025
19d7717
docs(gepa): clarify future work section in code comments
Ju-usc Oct 13, 2025
9ce5fe4
refactor(gepa): unify ReAct optimization as single module
Ju-usc Oct 24, 2025
91331d0
test(gepa): add end-to-end ReAct module optimization test
Ju-usc Oct 24, 2025
3418b59
fix(gepa): enable arg description optimization for ReAct tools
Ju-usc Oct 24, 2025
b26d39a
chore: remove legacy test_gepa_tool_optimization.py
Ju-usc Oct 24, 2025
2791b5c
fix: restore accidentally removed score mismatch warning
Ju-usc Oct 24, 2025
8e63c62
test: update fixture after arg description optimization fix
Ju-usc Oct 25, 2025
7a9d2f3
fix(test): use JSON-based hashing for cross-version fixture stability
Ju-usc Oct 25, 2025
cd0de57
refactor(gepa): rename optimize_tool_descriptions to optimize_react_c…
Ju-usc Oct 26, 2025
67bb739
docs(gepa): improve 'What is optimize_react_components?' section
Ju-usc Oct 26, 2025
b3026a7
docs(gepa): replace outdated tool-specific prompt with actual ReAct o…
Ju-usc Oct 26, 2025
4e107aa
docs(gepa): simplify 'How It Works' section with accurate routing beh…
Ju-usc Oct 26, 2025
78547e7
docs(gepa): remove outdated Implementation Details section
Ju-usc Oct 26, 2025
7fa829b
docs(gepa): replace theoretical scenarios with real user pain points
Ju-usc Oct 26, 2025
da0e7bc
docs(gepa): fix usage examples reference to match updated scenarios
Ju-usc Oct 26, 2025
e51158d
docs(gepa): update inspect section to show all 4 ReAct components wit…
Ju-usc Oct 26, 2025
776ab9b
docs(gepa): rewrite Section 8 with accurate custom proposer behavior …
Ju-usc Oct 26, 2025
ec6bb7b
fix(gepa): fix top-level ReAct module lookup and remove tool name san…
Ju-usc Oct 27, 2025
b6cc67b
refactor(gepa): unify ReAct module key handling and use constant
Ju-usc Oct 28, 2025
1206f38
test(gepa): add ReAct module detection tests for nested structures
Ju-usc Oct 28, 2025
333cbbf
test(gepa): add comprehensive ReAct detection and reconstruction tests
Ju-usc Oct 28, 2025
a50552a
test(gepa): add reflective dataset tests for multi-agent trajectory v…
Ju-usc Oct 28, 2025
965b157
test(gepa): verify tool arg descriptions propagate to args schema
Ju-usc Oct 29, 2025
5ddc6d3
fix(gepa): propagate arg_desc updates to tool.args for prompt rendering
Ju-usc Oct 29, 2025
2269de5
test(gepa): remove fixture-based test and unused dependencies
Ju-usc Oct 29, 2025
17456f0
test(gepa): remove unused fixture file
Ju-usc Oct 29, 2025
c884c18
style: fix ruff linting issues (import formatting, whitespace, bare e…
Ju-usc Oct 31, 2025
82dee25
refactor(test): rename setup_spy_for_base_program to setup_capture_fo…
Ju-usc Oct 31, 2025
ca84b9d
docs(gepa): clarify why Tool.func uses placeholder lambda in proposer
Ju-usc Oct 31, 2025
2eb8986
refactor(gepa): make all ReAct components optional with None default …
Ju-usc Oct 31, 2025
9f37ac1
docs(gepa): clarify 'LM' as 'reflection LM' in comments for precision
Ju-usc Oct 31, 2025
bd4cdac
refactor(gepa): refine reflection prompt to guide concise, focused Re…
Ju-usc Oct 31, 2025
0ad4077
docs(gepa): revise ReAct metric example to be general and extensible
Ju-usc Oct 31, 2025
ef5563e
docs(gepa): replace custom proposer example with reference to ReActMo…
Ju-usc Oct 31, 2025
1b10b65
docs(gepa): make custom proposer section more approachable and clear
Ju-usc Oct 31, 2025
675a0cd
docs(gepa): update ReAct reflection prompt to match current implement…
Ju-usc Nov 1, 2025
4a4d209
feat(gepa): warn when ReAct modules detected but optimization disabled
Ju-usc Nov 3, 2025
370 changes: 370 additions & 0 deletions docs/docs/api/optimizers/GEPA/GEPA_Advanced.md
@@ -443,3 +443,373 @@ gepa = dspy.GEPA(
auto="medium"
)
```

## ReAct Component Optimization

### What is optimize_react_components?

Enable `optimize_react_components=True` to apply specialized optimization to `dspy.ReAct` modules while using default optimization for other modules.

A [`dspy.ReAct`](../../learn/programming/tools.md#approach-1-using-dspyreact-fully-managed) module has three parts: a **react predictor** (iteratively reasons and selects tools), an **extract predictor** (extracts final answers from trajectories), and **tools** with their schemas.

**What gets optimized for ReAct modules:**

GEPA can improve textual components across all parts:
- **React instruction** - Guides reasoning and tool selection (always optimized)
- **Extract instruction** - Guides answer extraction from trajectories (optional)
- **Tool descriptions** - Describes what each tool does (optional)
- **Tool argument descriptions** - Describes tool parameters (optional)

The reflection LM decides which optional components to improve based on observed failures. Non-ReAct modules in your program are optimized using GEPA's default signature optimization.

**Why this matters:**

Unlike optimizing signature instructions alone (which improves individual predictors), ReAct optimization improves the **entire agent workflow** - from initial reasoning through tool execution to final answer extraction.

ReAct agents often fail when their components contradict each other. A clear tool description doesn't help if the react instruction never considers using that tool. GEPA analyzes execution traces to learn how all components should work together.

### ReAct Optimization Prompt

GEPA uses a specialized prompt to jointly optimize all ReAct components. The prompt receives complete ReAct trajectories and current component texts:

```python
class GenerateImprovedReActDescriptionsFromFeedback(dspy.Signature):
    """Improve a ReAct agent based on execution examples and feedback.

    These components are progressively optimized - refine what needs improvement.
    Analyze the trajectories to identify successful patterns and failure causes.
    Generate improved texts to help the agent succeed on similar tasks.
    Place improved texts at their appropriate level of abstraction and/or specificity.
    """

    current_react_instruction = dspy.InputField(
        desc="Current ReAct module instruction guiding the ReAct agent's reasoning and tool selection"
    )
    current_extract_instruction = dspy.InputField(
        desc="Current Extract module instruction for extracting final answers from trajectories"
    )
    current_tools = dspy.InputField(
        annotation=list[dspy.Tool],
        desc="Available tools with their complete schemas"
    )
    examples_with_feedback = dspy.InputField(
        desc="Execution examples with feedback showing successes and failures"
    )

    improved_react_instruction: str | None = dspy.OutputField(
        desc="ReAct instruction for reasoning and tool selection",
        default=None
    )
    improved_extract_instruction: str | None = dspy.OutputField(
        desc="Extract instruction for answer extraction",
        default=None
    )
    # Note: Tool descriptions and arg descriptions are added dynamically via signature.append()
    # with field descriptions like "Purpose of tool" and "Usage of parameter"
```

The reflection LM receives all current components and execution traces, then decides which components to improve. Tool-specific fields (`improved_tool_{name}_desc`, `improved_tool_{name}_arg_{param}_desc`) are generated dynamically for each tool and parameter.
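
The naming pattern for those dynamic fields can be illustrated with a small standalone sketch. It does not use DSPy and is not the library's implementation: the helper `dynamic_field_names` and the plain-dict tool layout are hypothetical, chosen only to show which field names the pattern above yields for a given set of tools.

```python
def dynamic_field_names(tools: dict[str, dict]) -> list[str]:
    """Return the output field names implied by the naming pattern above.

    `tools` is a hypothetical plain-dict stand-in for tool schemas:
    {tool_name: {"desc": ..., "args": {param_name: schema}}}.
    """
    names = []
    for tool_name, spec in tools.items():
        # One field for the tool description itself
        names.append(f"improved_tool_{tool_name}_desc")
        # One field per tool parameter
        for param in spec.get("args", {}):
            names.append(f"improved_tool_{tool_name}_arg_{param}_desc")
    return names


tools = {
    "search": {"desc": "Finds things", "args": {"query": {"type": "string"}}},
    "calculator": {"desc": "Does calculations", "args": {"expression": {"type": "string"}}},
}
field_names = dynamic_field_names(tools)
for name in field_names:
    print(name)
```

With two single-parameter tools, the sketch yields four fields: one description field and one argument-description field per tool.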

**Writing Metrics for ReAct Optimization**

GEPA optimizes ReAct modules more effectively when metrics provide feedback about the agent's execution. Here's how to write metrics that help:

```python
def react_metric(example, pred, trace=None, pred_name=None, pred_trace=None):
    """Evaluate ReAct agent performance with trajectory feedback."""
    # Check if the answer is correct
    answer_match = pred.answer == example.answer
    score = 1.0 if answer_match else 0.0

    # Provide feedback to help GEPA understand what happened
    feedback = "Correct answer" if answer_match else "Incorrect answer"

    return dspy.Prediction(score=score, feedback=feedback)
```

You can make feedback more informative by examining the trajectory:

```python
def react_metric_with_trajectory(example, pred, trace=None, pred_name=None, pred_trace=None):
    """Evaluate with trajectory analysis."""
    # Check if the answer is correct
    answer_match = pred.answer == example.answer
    score = 1.0 if answer_match else 0.0

    # Access the ReAct trajectory to understand agent behavior
    trajectory = getattr(pred, 'trajectory', {})

    # Extract tool names from trajectory (excluding 'finish')
    tools_used = []
    for key in trajectory:
        if key.startswith('tool_name_'):
            tool_name = trajectory[key]
            if tool_name != 'finish':
                tools_used.append(tool_name)

    # Build feedback message
    if answer_match:
        feedback = "Correct answer"
    else:
        feedback = "Incorrect answer"

    if tools_used:
        feedback += f". Tools: {', '.join(tools_used)}"

    return dspy.Prediction(score=score, feedback=feedback)
```

The trajectory contains the agent's step-by-step execution. Use it to provide feedback about:

- **Tool selection**: Were appropriate tools chosen?
- **Reasoning quality**: Did the agent think through the problem?
- **Efficiency**: Were there unnecessary steps?

The reflection LM uses your feedback to jointly improve react instructions, tool descriptions, and extraction logic.
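
As one way to turn the efficiency point into concrete feedback, the sketch below flags repeated identical tool calls in a trajectory. It is a standalone illustration, not part of DSPy: the helper name `summarize_trajectory` is hypothetical, and it assumes the trajectory is a dict with `tool_name_{i}` keys (as in the metric above) and matching `tool_args_{i}` keys holding dicts of arguments.

```python
from collections import Counter


def summarize_trajectory(trajectory: dict) -> str:
    """Build a feedback fragment that lists tool calls and flags exact repeats.

    Assumes keys of the form tool_name_0, tool_args_0, tool_name_1, ...
    """
    calls = []
    step = 0
    while f"tool_name_{step}" in trajectory:
        name = trajectory[f"tool_name_{step}"]
        if name != "finish":  # skip the built-in finish step
            args = trajectory.get(f"tool_args_{step}", {})
            # repr of the sorted items gives a hashable fingerprint of the arguments
            calls.append((name, repr(sorted(args.items()))))
        step += 1

    if not calls:
        return "No tools were called."

    repeated = [name for (name, _), count in Counter(calls).items() if count > 1]
    feedback = f"Tools called in order: {', '.join(name for name, _ in calls)}."
    if repeated:
        feedback += f" Repeated identical calls to: {', '.join(repeated)}."
    return feedback


example_trajectory = {
    "tool_name_0": "search", "tool_args_0": {"query": "population of France"},
    "tool_name_1": "search", "tool_args_1": {"query": "population of France"},
    "tool_name_2": "finish", "tool_args_2": {},
}
summary = summarize_trajectory(example_trajectory)
print(summary)
```

A metric could append this string to its `feedback` so the reflection LM sees not only that an answer was wrong but also that the agent repeated a call without gaining new information.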

### How It Works

When `optimize_react_components=True`, GEPA:

1. **Discovers ReAct modules** - Finds all `dspy.ReAct` instances in your program (including nested modules)
2. **Extracts components** - Collects react instructions, extract instructions, and tool schemas from each ReAct module
3. **Routes to proposers** - Separates components by type and routes them appropriately:
    - **With custom `instruction_proposer`**: Your custom proposer receives all components (both regular instructions and ReAct components) and handles the optimization logic
    - **With default proposer**: Regular instructions use the default instruction proposer; ReAct components use the specialized `ReActModuleProposer`
4. **Optimizes jointly** - The ReAct proposer improves all four components together based on execution feedback
5. **Applies updates** - Updates your ReAct modules with improved instructions and tool descriptions

Non-ReAct modules (like `dspy.Predict` or `dspy.ChainOfThought`) continue using standard GEPA optimization.
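
The routing step can be pictured as a split on component keys. The sketch below is a simplified illustration rather than GEPA's actual code: the function `split_components` is hypothetical, and the `"react_module"` key prefix is taken from the key examples in the custom proposer section of this document.

```python
REACT_KEY_PREFIX = "react_module"  # assumed from the key examples in this document


def split_components(candidate: dict[str, str]) -> tuple[dict[str, str], dict[str, str]]:
    """Separate regular instruction components from ReAct module components."""
    regular, react = {}, {}
    for key, text in candidate.items():
        # Matches "react_module" and "react_module:<name>", but not e.g. "react_module_notes"
        is_react = key == REACT_KEY_PREFIX or key.startswith(REACT_KEY_PREFIX + ":")
        (react if is_react else regular)[key] = text
    return regular, react


candidate = {
    "predict": "Answer the question.",
    "react_module": '{"react": "...", "extract": "...", "tools": {}}',
    "react_module:researcher": '{"react": "...", "extract": "...", "tools": {}}',
}
regular, react = split_components(candidate)
print(sorted(regular))  # keys handled by the default instruction proposer
print(sorted(react))    # keys handled by the ReAct proposer
```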

### When to Use optimize_react_components

Enable `optimize_react_components=True` when you use `dspy.ReAct` in your program and need better agent performance. GEPA jointly optimizes all ReAct components (react instruction, extract instruction, tool descriptions, tool argument descriptions) based on execution feedback. Common scenarios:

1. **Agent loops with repeated tool calls** - The agent calls `web_search` repeatedly with similar queries instead of synthesizing information. GEPA improves the react instruction to encourage synthesis and the tool descriptions to clarify when searches are sufficient.

2. **Wrong tool selection** - Agent with `search` and `calculator` tools keeps searching when it should calculate, or vice versa. GEPA refines react instruction and tool descriptions to clarify "use search for factual queries, calculator for numerical analysis."

3. **Agent gives up without trying tools** - Agent responds "I don't know" without using available tools that could answer the question. GEPA improves react instruction to be more proactive about tool usage.

4. **Extraction failures** - Agent executes tools correctly but fails to extract the final answer from the trajectory. GEPA improves extract instruction to better identify and format answers from tool outputs.

5. **Multi-agent delegation issues** - Parent agent has delegation tools to specialized sub-agents but doesn't understand when to use each. GEPA optimizes all ReAct components across both parent and sub-agent modules for coherent delegation.

See the usage examples below for basic ReAct agents and multi-agent systems.

### Usage Examples

#### Basic ReAct Agent

```python
import dspy

def search_web(query: str) -> str:
    return f"Search results for: {query}"

def calculate(expression: str) -> float:
    # eval() keeps the example short; do not use it on untrusted input
    return eval(expression)

# Create ReAct agent with tools (poor initial descriptions)
search_tool = dspy.Tool(search_web, name="search", desc="Finds things")
calc_tool = dspy.Tool(calculate, name="calculator", desc="Does calculations")

agent = dspy.ReAct("question -> answer", tools=[search_tool, calc_tool])

# Enable ReAct component optimization
# (my_metric, train_examples, and val_examples are assumed to be defined elsewhere)
gepa = dspy.GEPA(
    metric=my_metric,
    reflection_lm=dspy.LM(model="gpt-5-mini"),
    optimize_react_components=True,
    component_selector="all",  # Optimize all components together
    auto="medium"
)

optimized_agent = gepa.compile(agent, trainset=train_examples, valset=val_examples)

# View optimized tool descriptions
print("Optimized search tool:", optimized_agent.tools["search"].desc)
print("Optimized calculator tool:", optimized_agent.tools["calculator"].desc)
```

**Example output after optimization:**
```
Optimized search tool: Use when you need to find current information, facts, or data
from external sources. Provide specific search queries to get relevant results.

Optimized calculator tool: Use for arithmetic operations and mathematical expressions.
Accepts Python-compatible expressions with numbers and operators (+, -, *, /, **).
Do not use for date calculations or string manipulations.
```

#### Multi-Agent System

GEPA automatically discovers and optimizes tools in nested agents:

```python
import dspy

def search_web(query: str) -> str:
    return f"Search results for: {query}"

def calculate(expression: str) -> float:
    return eval(expression)

search_tool = dspy.Tool(search_web, name="search", desc="Searches")
calc_tool = dspy.Tool(calculate, name="calculator", desc="Computes")

class ResearchAssistant(dspy.Module):
    def __init__(self):
        super().__init__()
        self.researcher = dspy.ReAct("query -> findings", tools=[search_tool])

        def delegate_research(query: str) -> str:
            return self.researcher(query=query).findings

        research_tool = dspy.Tool(delegate_research, name="research", desc="Helps with questions")
        self.assistant = dspy.ReAct("question -> answer", tools=[research_tool, calc_tool])

    def forward(self, question):
        return self.assistant(question=question)

# Optimizes ALL tools: calculator, research, search
gepa = dspy.GEPA(
    metric=my_metric,
    reflection_lm=dspy.LM(model="gpt-5-mini"),
    optimize_react_components=True,
    component_selector="all",
    auto="medium"
)

optimized_system = gepa.compile(ResearchAssistant(), trainset=train, valset=val)

# View optimized nested tool descriptions
print(optimized_system.researcher.tools["search"].desc)
print(optimized_system.assistant.tools["research"].desc)
print(optimized_system.assistant.tools["calculator"].desc)
```

### Inspecting Optimized ReAct Components

After optimization, all ReAct components are automatically updated in your program. Access them directly:

```python
optimized_agent = gepa.compile(agent, trainset=train, valset=val)

# ReAct instruction (guides reasoning and tool selection)
print("React instruction:", optimized_agent.react.signature.instructions)

# Extract instruction (guides answer extraction from trajectory)
print("Extract instruction:", optimized_agent.extract.predict.signature.instructions)

# Tool descriptions
for tool_name, tool in optimized_agent.tools.items():
    if tool_name != 'finish':  # Skip the built-in finish tool
        print(f"Tool '{tool_name}' description:", tool.desc)
        # Tool argument descriptions
        print("  Argument descriptions:", tool.arg_desc)
```

### Custom Instruction Proposers and ReAct Optimization

**Important:** When you provide a custom `instruction_proposer`, it receives ALL components (regular predictors AND ReAct modules). You must set `optimize_react_components=True` to enable ReAct module discovery and serialization, then handle the optimization logic yourself.

**How it works internally:**

1. **Component Discovery** - GEPA discovers components in your program:
    - Regular predictors → keys like `"predict"`, `"chain_of_thought"`
    - ReAct modules → keys like `"react_module"` or `"react_module:agent_name"`

2. **ReAct Serialization** - When `optimize_react_components=True`, GEPA serializes ReAct modules as JSON:
    ```json
    {
      "react": "instruction for reasoning and tool selection",
      "extract": "instruction for answer extraction",
      "tools": {
        "tool_name": {
          "desc": "what the tool does",
          "args": {"param": {"type": "string"}},
          "arg_desc": {"param": "description of param"}
        }
      }
    }
    ```

3. **Custom Proposer Receives**:
    - `candidate: dict[str, str]` - **All values are strings**
        - Regular component: `candidate["predict"]` → `"Your instruction here"`
        - ReAct component: `candidate["react_module"]` → `'{"react": "...", "extract": "...", "tools": {...}}'` (JSON as a string)
    - `reflective_dataset: dict[str, list[ReflectiveExample]]` - **GEPA provides this**
        - Contains execution traces: inputs, outputs (including full ReAct trajectory), and your metric's feedback
        - For ReAct: `Generated_Outputs` includes the entire trajectory with all tool calls and reasoning
        - Use this to understand what went wrong and guide your improvements
    - `components_to_update: list[str]` - Component keys to optimize this round

4. **Your Responsibility**:
    - For ReAct components: Use `json.loads()` to parse, improve all 4 parts, use `json.dumps()` to return
    - For regular components: Improve the instruction string directly
    - Return `dict[str, str]` with same keys

**What this means:**
- Your custom proposer receives ALL components: regular signatures AND ReAct modules
- GEPA still does discovery and JSON serialization, but YOU handle the optimization logic
- ReAct components are passed with keys like `"react_module"` or `"react_module:agent_name"`

#### Implementing a Custom Proposer for ReAct

If you need custom optimization logic beyond the default, you can build your own proposer. The best way to start is by looking at the reference implementation: [`ReActModuleProposer`](https://github.com/stanfordnlp/dspy/blob/main/dspy/teleprompt/gepa/instruction_proposal.py).

**Understanding ReAct component structure**

When GEPA optimizes ReAct modules, it serializes them as JSON strings containing all the pieces you can improve:

```json
{
  "react": "instruction for reasoning and tool selection",
  "extract": "instruction for answer extraction",
  "tools": {
    "search": {
      "desc": "Search the web for information",
      "args": {"query": {"type": "string"}},
      "arg_desc": {"query": "The search query to execute"}
    }
  }
}
```

**What you can improve:**
- **`react`** - How the agent reasons and decides which tools to use
- **`extract`** - How the agent extracts the final answer from execution results
- **`tools[*].desc`** - What each tool does and when to use it
- **`tools[*].arg_desc`** - What each parameter means and how to use it

**What to preserve:**
- **`tools[*].args`** - The tool's parameter schema (types, required fields, etc.)
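
A custom proposer may want to verify that a proposal leaves the parameter schemas alone before returning it. The check below is a minimal sketch using only the standard library; the function name `schema_preserved` is hypothetical, and it assumes the JSON layout shown above.

```python
import json


def schema_preserved(original_json: str, proposed_json: str) -> bool:
    """Return True if the proposal keeps the same tools with identical `args` schemas."""
    original = json.loads(original_json).get("tools", {})
    proposed = json.loads(proposed_json).get("tools", {})
    # Tools must not be added, removed, or renamed
    if original.keys() != proposed.keys():
        return False
    # Each tool's parameter schema must be unchanged
    return all(original[name].get("args") == proposed[name].get("args") for name in original)


original = json.dumps({
    "react": "Think step by step.",
    "extract": "Extract the answer.",
    "tools": {"search": {"desc": "Finds things",
                         "args": {"query": {"type": "string"}},
                         "arg_desc": {"query": "query"}}},
})

# A proposal that only rewrites descriptive text
good = json.loads(original)
good["tools"]["search"]["desc"] = "Search the web for factual information."
ok_result = schema_preserved(original, json.dumps(good))

# A proposal that alters the parameter schema
bad = json.loads(original)
bad["tools"]["search"]["args"]["query"]["type"] = "integer"
bad_result = schema_preserved(original, json.dumps(bad))

print(ok_result, bad_result)
```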

**Your proposer's interface**

Your custom proposer is a callable that receives component instructions and execution feedback, then returns improved versions:

```python
def your_custom_proposer(
    candidate: dict[str, str],            # Current instructions for all components
    reflective_dataset: dict[str, list],  # Execution examples with feedback
    components_to_update: list[str],      # Which components to optimize this round
) -> dict[str, str]:                      # Return improved instructions
    """
    For ReAct components:
    - Use json.loads() to parse the JSON string
    - Improve what needs fixing based on the feedback
    - Use json.dumps() to serialize back

    For regular components:
    - Just return the improved instruction string
    """
    # Your optimization logic here
    pass
```
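
To make the parse, improve, serialize cycle concrete, here is a minimal standalone sketch of a proposer with that interface. It is not the reference implementation: `improve_text` is a hypothetical placeholder where a real proposer would call a reflection LM, the `"react_module"` key check follows the key examples in this document, and returning only the keys in `components_to_update` is an assumption about what "same keys" means.

```python
import json


def improve_text(text: str, examples: list) -> str:
    """Hypothetical placeholder: a real proposer would call a reflection LM here."""
    return f"{text} (revised after reviewing {len(examples)} examples)"


def sketch_proposer(
    candidate: dict[str, str],
    reflective_dataset: dict[str, list],
    components_to_update: list[str],
) -> dict[str, str]:
    updated = {}
    for key in components_to_update:
        examples = reflective_dataset.get(key, [])
        current = candidate[key]
        if key == "react_module" or key.startswith("react_module:"):
            # ReAct component: parse the JSON string, rewrite only the textual parts
            config = json.loads(current)
            config["react"] = improve_text(config["react"], examples)
            config["extract"] = improve_text(config["extract"], examples)
            for tool in config["tools"].values():
                tool["desc"] = improve_text(tool["desc"], examples)
                tool["arg_desc"] = {
                    param: improve_text(desc, examples)
                    for param, desc in tool.get("arg_desc", {}).items()
                }
                # tool["args"] (the parameter schema) is deliberately left untouched
            updated[key] = json.dumps(config)
        else:
            # Regular component: the value is the instruction string itself
            updated[key] = improve_text(current, examples)
    return updated


candidate = {
    "predict": "Answer the question.",
    "react_module": json.dumps({
        "react": "Reason and pick tools.",
        "extract": "Extract the answer.",
        "tools": {"search": {"desc": "Finds things",
                             "args": {"query": {"type": "string"}},
                             "arg_desc": {"query": "query"}}},
    }),
}
reflective_dataset = {"predict": [{"Feedback": "Incorrect answer"}], "react_module": [{}, {}]}
result = sketch_proposer(candidate, reflective_dataset, ["predict", "react_module"])
print(result["predict"])
```

The ReAct branch never touches `args`, so the tool's parameter schema survives the round trip unchanged.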

**The reference shows how to:**
- Parse and rebuild the JSON structure
- Generate dynamic fields for tools/parameters
- Use execution feedback to guide improvements
6 changes: 6 additions & 0 deletions docs/docs/api/optimizers/GEPA/overview.md
@@ -117,6 +117,12 @@ Practical Recipe for GEPA-Friendly Feedback:
- **Multi-Objective Tasks** (e.g., PUPA): Decompose aggregate scores to reveal contributions from each objective, highlighting tradeoffs (e.g., quality vs. privacy).
- **Stacked Pipelines** (e.g., code generation: parse → compile → run → profile → evaluate): Expose stage-specific failures; natural-language traces often suffice for LLM self-correction.

## ReAct Component Optimization

GEPA can optimize ReAct modules holistically. When `optimize_react_components=True`, GEPA jointly optimizes all four components of ReAct modules: react instructions, extract instructions, tool descriptions, and tool argument descriptions. This helps agents make better decisions by learning from execution traces how all components work together.

For details on how ReAct optimization works, when to use it, and usage examples, see [ReAct Component Optimization](GEPA_Advanced.md#react-component-optimization) in the Advanced Features guide.

## Custom Instruction Proposal

For advanced customization of GEPA's instruction proposal mechanism, including custom instruction proposers and component selectors, see [Advanced Features](GEPA_Advanced.md).