1 change: 1 addition & 0 deletions examples/README.md
@@ -15,6 +15,7 @@
| Example | Description |
|---------|-------------|
| [deep_research](deep_research/) | Multi-step web research agent using Tavily for URL discovery, parallel sub-agents, and strategic reflection |
| [code-reviewer](code-reviewer/) | Code review agent that analyzes source code for correctness, style, and security using static analysis and security auditing tools |
| [content-builder-agent](content-builder-agent/) | Content writing agent that demonstrates memory (`AGENTS.md`), skills, and subagents for blog posts, LinkedIn posts, and tweets with generated images |
| [text-to-sql-agent](text-to-sql-agent/) | Natural language to SQL agent with planning, skill-based workflows, and the Chinook demo database |
| [ralph_mode](ralph_mode/) | Autonomous looping pattern that runs with fresh context each iteration, using the filesystem for persistence |
12 changes: 12 additions & 0 deletions examples/code-reviewer/.env.example
@@ -0,0 +1,12 @@
# API Keys for Code Reviewer Agent Example
# Copy this file to .env and fill in your actual API keys

# Anthropic API Key (for Claude Sonnet 4.5, the default model)
ANTHROPIC_API_KEY=your_anthropic_api_key_here

# Google API Key (for Gemini as an alternative model)
GOOGLE_API_KEY=your_google_api_key_here

# LangSmith API Key (required for LangGraph local server)
# Get your key at: https://smith.langchain.com/settings
LANGSMITH_API_KEY=lsv2_pt_your_api_key_here
152 changes: 152 additions & 0 deletions examples/code-reviewer/README.md
@@ -0,0 +1,152 @@
# Code Reviewer

## Quickstart

**Prerequisites**: Install the [uv](https://docs.astral.sh/uv/) package manager:

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```

Ensure you are in the `code-reviewer` directory:

```bash
cd examples/code-reviewer
```

Install packages:

```bash
uv sync
```

Set your API keys in your environment:

```bash
export ANTHROPIC_API_KEY=your_anthropic_api_key_here  # Required for the default Claude model
export GOOGLE_API_KEY=your_google_api_key_here        # Only needed for Gemini; get a key at https://ai.google.dev/gemini-api/docs
export LANGSMITH_API_KEY=your_langsmith_api_key_here  # Free to sign up; get a key at https://smith.langchain.com/settings
```

## Usage Options

You can run this example in two ways:

### Option 1: Direct Python Invocation

Run the code reviewer directly from Python:

```python
from deepagents import create_deep_agent
from langchain.chat_models import init_chat_model
from code_reviewer.tools import analyze_code, security_audit, think_tool
from code_reviewer.prompts import REVIEW_WORKFLOW_INSTRUCTIONS

model = init_chat_model(model="anthropic:claude-sonnet-4-5-20250929", temperature=0.0)

agent = create_deep_agent(
    model=model,
    tools=[analyze_code, security_audit, think_tool],
    system_prompt=REVIEW_WORKFLOW_INSTRUCTIONS,
)

result = agent.invoke({
    "messages": [{
        "role": "user",
        "content": """Review this Python code:

def calculate_total(items):
    total = 0
    for i in range(len(items)):
        total = total + items[i]
    return total

class order_processor:
    def process(self, data, db_conn, user_id, auth_token, config, callback):
        eval(data["query"])
        return "done"
""",
    }]
})

print(result["messages"][-1].content)
```

### Option 2: LangGraph Server

Run a local [LangGraph server](https://langchain-ai.github.io/langgraph/tutorials/langgraph-platform/local-server/) with a web interface:

```bash
langgraph dev
```

The LangGraph server opens a new browser window with the Studio interface, where you can submit code for review.

You can also connect the LangGraph server to a [UI specifically designed for deepagents](https://github.com/langchain-ai/deep-agents-ui):

```bash
git clone https://github.com/langchain-ai/deep-agents-ui.git
cd deep-agents-ui
yarn install
yarn dev
```

Then follow the instructions in the [deep-agents-ui README](https://github.com/langchain-ai/deep-agents-ui?tab=readme-ov-file#connecting-to-a-langgraph-server) to connect the UI to the running LangGraph server.

This provides a user-friendly interface for submitting code and viewing review reports.

## Custom Model

By default, `deepagents` uses `"claude-sonnet-4-5-20250929"`. You can customize this by passing any [LangChain model object](https://python.langchain.com/docs/integrations/chat/). See the Deep Agents package [README](https://github.com/langchain-ai/deepagents?tab=readme-ov-file#model) for more details.

```python
from langchain.chat_models import init_chat_model
from deepagents import create_deep_agent

# Using Claude
model = init_chat_model(model="anthropic:claude-sonnet-4-5-20250929", temperature=0.0)

# Or using Gemini instead (uncomment both lines; requires GOOGLE_API_KEY)
# from langchain_google_genai import ChatGoogleGenerativeAI
# model = ChatGoogleGenerativeAI(model="gemini-3-pro-preview")

agent = create_deep_agent(
    model=model,
)
```
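If you switch between providers often, the choice can be driven by whichever API key is present. The following is a minimal sketch, not part of this example: `pick_model_id` is a hypothetical helper, and the `google_genai:` provider prefix is an assumption about how `init_chat_model` names the Gemini integration.

```python
import os

DEFAULT_MODEL = "anthropic:claude-sonnet-4-5-20250929"
# Assumption: "google_genai" is the provider prefix for langchain-google-genai.
GEMINI_MODEL = "google_genai:gemini-3-pro-preview"


def pick_model_id(env=None):
    """Return a model identifier based on which API key is set (hypothetical helper)."""
    env = os.environ if env is None else env
    if env.get("ANTHROPIC_API_KEY"):
        return DEFAULT_MODEL
    if env.get("GOOGLE_API_KEY"):
        return GEMINI_MODEL
    raise RuntimeError("Set ANTHROPIC_API_KEY or GOOGLE_API_KEY before creating the agent")


# Demonstration with an explicit mapping instead of the real environment.
print(pick_model_id({"GOOGLE_API_KEY": "example"}))
```

The returned string could then be passed as `init_chat_model(model=pick_model_id(), temperature=0.0)`.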

## Custom Instructions

The code reviewer agent uses custom instructions defined in `code_reviewer/prompts.py` that complement (rather than duplicate) the default middleware instructions. You can modify these in any way you want.

| Instruction Set | Purpose |
|----------------|---------|
| `REVIEW_WORKFLOW_INSTRUCTIONS` | Defines the 6-step code review workflow: plan with TODOs → save request → analyze with sub-agents → synthesize → write report → verify. Includes reporting format with severity-based issue classification (Critical, Major, Minor, Suggestion). |
| `SUBAGENT_DELEGATION_INSTRUCTIONS` | Provides delegation strategies for multi-file reviews. Defaults to 1 sub-agent per review; parallelizes only for logically independent files. Sets limits on parallel execution (max 3 concurrent) and iteration rounds (max 3). |
| `REVIEWER_INSTRUCTIONS` | Guides individual review sub-agents to analyze code for correctness, style, and security. Includes hard limits on tool calls (2-3 for simple scripts, max 5 for complex code) and emphasizes using `think_tool` after each analysis. |
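
The delegation limits are injected into the prompt with `str.format`, as `agent.py` does for `SUBAGENT_DELEGATION_INSTRUCTIONS`. A minimal sketch of that pattern, using a shortened stand-in template (`DELEGATION_TEMPLATE` and `build_delegation_prompt` are illustrative names, not part of the package):

```python
# Hypothetical, shortened stand-in for SUBAGENT_DELEGATION_INSTRUCTIONS.
DELEGATION_TEMPLATE = (
    "## Parallel Execution Limits\n"
    "- Use at most {max_concurrent_review_units} parallel sub-agents per iteration\n"
    "## Review Limits\n"
    "- Stop after {max_reviewer_iterations} delegation rounds"
)


def build_delegation_prompt(max_concurrent_review_units=3, max_reviewer_iterations=3):
    """Fill the template's named placeholders with concrete limits."""
    return DELEGATION_TEMPLATE.format(
        max_concurrent_review_units=max_concurrent_review_units,
        max_reviewer_iterations=max_reviewer_iterations,
    )


prompt = build_delegation_prompt(max_concurrent_review_units=2)
print(prompt)
```

If you edit a template that is passed through `str.format`, any literal `{` or `}` in the prompt text has to be doubled (`{{`, `}}`), otherwise formatting raises an error.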

## Custom Tools

The code reviewer agent adds the following custom tools beyond the built-in deepagent tools. You can also use your own tools, including via MCP servers. See the Deep Agents package [README](https://github.com/langchain-ai/deepagents?tab=readme-ov-file#mcp) for more details.

| Tool Name | Description |
|-----------|-------------|
| `analyze_code` | Static analysis tool that uses Python's `ast` module to parse and inspect code structure. Checks for syntax errors, missing docstrings, naming convention violations (snake_case for functions, PascalCase for classes), excessive parameter counts (>5), overly long functions (>50 lines), long lines (>100 chars), and TODO/FIXME markers. Returns a categorized report with severity levels. |
| `security_audit` | Security vulnerability scanner using regex pattern matching. Detects dangerous functions (eval, exec, pickle), SQL injection via f-strings or string formatting, shell injection (os.system, subprocess with shell=True), hardcoded credentials, unsafe yaml.load, requests without timeouts, and security-relevant comments. Reports findings with severity levels (CRITICAL, HIGH, MEDIUM, LOW). |
| `think_tool` | Strategic reflection mechanism that helps the agent pause and assess progress between analysis calls, evaluate findings, identify gaps, and plan next steps. |
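
To give a sense of how an `ast`-based check like `analyze_code` can work, here is a minimal sketch covering three of the checks listed above (naming, docstrings, parameter count). It is not the tool's actual implementation: `check_structure`, the regexes, and the decision to exclude `self`/`cls` from the parameter count are all assumptions.

```python
import ast
import re

SNAKE_CASE = re.compile(r"^_*[a-z][a-z0-9_]*$")
PASCAL_CASE = re.compile(r"^_*[A-Z][A-Za-z0-9]*$")
MAX_PARAMS = 5  # threshold quoted in the tool table


def check_structure(source: str) -> list[tuple[int, str]]:
    """Return (line, message) findings for naming, docstrings, and parameter counts."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [(exc.lineno or 0, f"syntax error: {exc.msg}")]

    findings = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if not SNAKE_CASE.match(node.name):
                findings.append((node.lineno, f"function '{node.name}' is not snake_case"))
            if ast.get_docstring(node) is None:
                findings.append((node.lineno, f"function '{node.name}' is missing a docstring"))
            args = node.args.posonlyargs + node.args.args + node.args.kwonlyargs
            # Assumption: self/cls are not counted toward the limit.
            params = [a.arg for a in args if a.arg not in ("self", "cls")]
            if len(params) > MAX_PARAMS:
                findings.append(
                    (node.lineno,
                     f"function '{node.name}' has {len(params)} parameters "
                     f"(max recommended: {MAX_PARAMS})")
                )
        elif isinstance(node, ast.ClassDef) and not PASCAL_CASE.match(node.name):
            findings.append((node.lineno, f"class '{node.name}' is not PascalCase"))
    # ast.walk is breadth-first, so sort to report findings in source order.
    return sorted(findings)


SAMPLE = '''def calculate_total(items):
    total = 0
    for i in range(len(items)):
        total = total + items[i]
    return total

class order_processor:
    def process(self, data, db_conn, user_id, auth_token, config, callback):
        eval(data["query"])
        return "done"
'''

findings = check_structure(SAMPLE)
for line, message in findings:
    print(f"line {line}: {message}")
```

Working on the parsed tree rather than raw text means the checks ignore comments and string contents, which is the main reason to prefer `ast` over regexes for structural rules.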

### Example Analysis Output

When run against code with common issues, the agent produces a structured report like this:

**Correctness:**
- [critical] Line 12: `eval(data["query"])` can execute arbitrary code; use `ast.literal_eval()` or a safer alternative

**Style:**
- [major] Line 9: Function `process` has 6 parameters (max recommended: 5)
- [minor] Class `order_processor` (line 8) violates PascalCase naming convention
- [minor] Function `calculate_total` is missing a docstring
- [minor] Use `for item in items` instead of `for i in range(len(items))` for cleaner iteration

**Security:**
- [CRITICAL] Line 12: Use of `eval()`, which can execute arbitrary code
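
Security findings with severity labels can be produced by a line-by-line regex scan. The sketch below illustrates the idea only: the pattern table, messages, and `scan` function are hypothetical and far smaller than what `security_audit` is described as covering.

```python
import re

# Hypothetical pattern table: (severity, compiled regex, message).
PATTERNS = [
    ("CRITICAL", re.compile(r"\b(eval|exec)\s*\("),
     "use of eval()/exec() can execute arbitrary code"),
    ("HIGH", re.compile(r"\bos\.system\s*\(|shell\s*=\s*True"),
     "possible shell injection"),
    ("HIGH", re.compile(r"\b(password|secret|api_key|token)\s*=\s*['\"][^'\"]+['\"]", re.IGNORECASE),
     "possible hardcoded credential"),
    ("MEDIUM", re.compile(r"\byaml\.load\s*\("),
     "yaml.load call; confirm a safe loader is used"),
]
SEVERITY_ORDER = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}


def scan(source: str):
    """Return (severity, line, message) tuples, most severe first."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for severity, pattern, message in PATTERNS:
            if pattern.search(line):
                findings.append((severity, lineno, message))
    return sorted(findings, key=lambda f: (SEVERITY_ORDER[f[0]], f[1]))


SAMPLE = '''import os
password = "hunter2"
def run(data):
    os.system("ls " + data)
    return eval(data)
'''

findings = scan(SAMPLE)
for severity, lineno, message in findings:
    print(f"[{severity}] Line {lineno}: {message}")
```

Regex scanning is heuristic: it can flag safe code (for example a method that happens to be named `eval`) and miss obfuscated calls, so findings are best treated as prompts for the reviewer rather than verdicts.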
53 changes: 53 additions & 0 deletions examples/code-reviewer/agent.py
@@ -0,0 +1,53 @@
"""Code Reviewer Agent - Standalone script for LangGraph deployment.

This module creates a code review agent with custom tools and prompts
for analyzing code correctness, style, and security.
"""

from deepagents import create_deep_agent
from langchain.chat_models import init_chat_model

from code_reviewer.prompts import (
    REVIEWER_INSTRUCTIONS,
    REVIEW_WORKFLOW_INSTRUCTIONS,
    SUBAGENT_DELEGATION_INSTRUCTIONS,
)
from code_reviewer.tools import analyze_code, security_audit, think_tool

# Limits
max_concurrent_review_units = 3
max_reviewer_iterations = 3

# Combine orchestrator instructions (REVIEWER_INSTRUCTIONS only for sub-agents)
INSTRUCTIONS = (
    REVIEW_WORKFLOW_INSTRUCTIONS
    + "\n\n"
    + "=" * 80
    + "\n\n"
    + SUBAGENT_DELEGATION_INSTRUCTIONS.format(
        max_concurrent_review_units=max_concurrent_review_units,
        max_reviewer_iterations=max_reviewer_iterations,
    )
)

# Create code review sub-agent
review_sub_agent = {
    "name": "code-reviewer",
    "description": "Delegate a file or code snippet for detailed review. Give this reviewer one file or snippet at a time.",
    "system_prompt": REVIEWER_INSTRUCTIONS,
    "tools": [analyze_code, security_audit, think_tool],
}

# Alternative model: Gemini 3 (also requires `from langchain_google_genai import ChatGoogleGenerativeAI`)
# model = ChatGoogleGenerativeAI(model="gemini-3-pro-preview", temperature=0.0)

# Model Claude 4.5
model = init_chat_model(model="anthropic:claude-sonnet-4-5-20250929", temperature=0.0)

# Create the agent
agent = create_deep_agent(
    model=model,
    tools=[analyze_code, security_audit, think_tool],
    system_prompt=INSTRUCTIONS,
    subagents=[review_sub_agent],
)
21 changes: 21 additions & 0 deletions examples/code-reviewer/code_reviewer/__init__.py
@@ -0,0 +1,21 @@
"""Code Reviewer Agent Example.

This module demonstrates building a code review agent using the deepagents package
with custom tools for code analysis and security auditing.
"""

from code_reviewer.prompts import (
    REVIEWER_INSTRUCTIONS,
    REVIEW_WORKFLOW_INSTRUCTIONS,
    SUBAGENT_DELEGATION_INSTRUCTIONS,
)
from code_reviewer.tools import analyze_code, security_audit, think_tool

__all__ = [
    "analyze_code",
    "security_audit",
    "think_tool",
    "REVIEWER_INSTRUCTIONS",
    "REVIEW_WORKFLOW_INSTRUCTIONS",
    "SUBAGENT_DELEGATION_INSTRUCTIONS",
]
133 changes: 133 additions & 0 deletions examples/code-reviewer/code_reviewer/prompts.py
@@ -0,0 +1,133 @@
"""Prompt templates and tool descriptions for the code review deepagent."""

REVIEW_WORKFLOW_INSTRUCTIONS = """# Code Review Workflow

Follow this workflow for all code review requests:

1. **Plan**: Create a todo list with write_todos to break down the review into focused tasks
2. **Save the request**: Use write_file() to save the code or reference to `/review_request.md`
3. **Analyze**: Delegate code review tasks to sub-agents using the task() tool - delegate individual files or functions to sub-agents
4. **Synthesize**: Review all sub-agent findings and consolidate the review
5. **Write Report**: Write a comprehensive review report to `/review_report.md` (see Report Writing Guidelines below)
6. **Verify**: Read `/review_request.md` and confirm you've addressed all aspects

## Review Planning Guidelines
- For single-file reviews, delegate the entire file to one sub-agent
- For multi-file reviews, delegate one file per sub-agent
- Each sub-agent should analyze one file or snippet and return findings
- Focus on actionable, specific feedback rather than general observations

## Report Writing Guidelines

When writing the final report to `/review_report.md`, follow this structure:

### Structure:
1. **Overview**: Brief summary of what was reviewed
2. **Correctness**: Logic errors, edge cases, error handling issues
3. **Style**: Naming conventions, formatting, code organization
4. **Security**: Vulnerabilities, unsafe patterns, hardcoded secrets
5. **Recommendations**: Prioritized list of actionable changes
6. **Summary**: Overall assessment and verdict

### Issue severity levels:
- **Critical**: Bugs, security vulnerabilities, data loss risks
- **Major**: Significant code quality or maintainability concerns
- **Minor**: Style violations, minor inefficiencies
- **Suggestion**: Optional improvements or best practices

### General guidelines:
- Use clear section headings (## for sections, ### for subsections)
- Be specific - include line numbers and code snippets where relevant
- Prioritize issues: critical > major > minor > suggestion
- Do NOT use self-referential language ("I found...", "I analyzed...")
- Write as a professional review without meta-commentary
- Each issue should include: severity, location, description, and recommendation
"""

REVIEWER_INSTRUCTIONS = """You are a code reviewer analyzing source code for correctness, style, and security.

<Task>
Your job is to use tools to analyze the provided code and return a detailed review.
You can call these tools in series or in parallel; your analysis is conducted in a tool-calling loop.
</Task>

<Available Review Tools>
You have access to the following tools:
1. **analyze_code**: For static analysis of Python code (AST parsing, naming conventions, complexity)
2. **security_audit**: For security vulnerability scanning
3. **think_tool**: For reflection and strategic planning during review
**CRITICAL: Use think_tool after each analysis to reflect on findings and plan next steps**
</Available Review Tools>

<Instructions>
Think like an experienced code reviewer. Follow these steps:

1. **Read the code carefully** - What does it do? What could go wrong?
2. **Analyze correctness** - Check for logic errors, missing edge cases, exception handling
3. **Analyze style** - Check naming, formatting, documentation, code organization
4. **Analyze security** - Check for common vulnerabilities and unsafe patterns
5. **Stop when you can provide a comprehensive review** - Don't keep analyzing for perfection
</Instructions>

<Hard Limits>
**Analysis Budgets** (Prevent excessive tool calls):
- **Simple scripts**: Use 2-3 analysis tool calls maximum
- **Complex code**: Use up to 5 analysis tool calls
- **Always stop**: After 5 analysis tool calls if you cannot complete the review

**Stop Immediately When**:
- You can provide a comprehensive review
- You have identified all critical and major issues
- Your last 2 analysis calls returned overlapping findings
</Hard Limits>

<Show Your Thinking>
After each analysis tool call, use think_tool to reflect:
- What key issues did I find?
- What aspects still need review?
- Do I have enough to provide a comprehensive review?
- Should I do more analysis or compile my findings?
</Show Your Thinking>

<Final Response Format>
When providing your findings back to the orchestrator:

1. **Structure your response**: Organize findings with clear headings by category (Correctness, Style, Security)
2. **Be specific**: Include line numbers, code snippets, and exact recommendations
3. **Prioritize issues**: Group by severity (Critical, Major, Minor, Suggestion)

The orchestrator will consolidate findings from all reviewers into the final report.
</Final Response Format>
"""

TASK_DESCRIPTION_PREFIX = """Delegate a task to a specialized sub-agent with isolated context. Available agents for delegation are:
{other_agents}
"""

SUBAGENT_DELEGATION_INSTRUCTIONS = """# Sub-Agent Code Review Coordination

Your role is to coordinate code reviews by delegating tasks from your TODO list to specialized review sub-agents.

## Delegation Strategy

**DEFAULT: Start with 1 sub-agent** for most reviews:
- "Review this single file" → 1 sub-agent (full review)
- "Check this function for issues" → 1 sub-agent

**Parallelize for multi-file projects:**
- Multiple independent files → 1 sub-agent per file
- Only parallelize when files are logically independent

## Key Principles
- **Bias towards single sub-agent**: One comprehensive review is more token-efficient than multiple narrow ones
- **Avoid premature decomposition**: Don't break "review file X" into "check correctness", "check style", "check security" - just use 1 sub-agent for all aspects of X

## Parallel Execution Limits
- Use at most {max_concurrent_review_units} parallel sub-agents per iteration
- Make multiple task() calls in a single response to enable parallel execution
- Each sub-agent returns findings independently

## Review Limits
- Stop after {max_reviewer_iterations} delegation rounds if you haven't completed the review
- Stop when you have sufficient analysis for a comprehensive review
- Bias towards focused review over exhaustive analysis"""