Skip to content

Feat/quick review v2#16

Closed
AntoMontagneDev wants to merge 3 commits into
mainfrom
feat/quick-review-v2
Closed

Feat/quick review v2#16
AntoMontagneDev wants to merge 3 commits into
mainfrom
feat/quick-review-v2

Conversation

@AntoMontagneDev

@AntoMontagneDev AntoMontagneDev commented Mar 5, 2026

Copy link
Copy Markdown
Contributor

🔦 description

What change is being made?

Add a new list_directory_at_commit tool, integrate a hallucination-filter step into quick PR reviews, and replace brittle JSON parsing with a robust response parser across agentic and quick-review workflows, including tests and updated prompts.

Why are these changes being made?

These changes improve PR review reliability by enabling directory-orientation, reducing noise from investigation-requests, and handling LLM outputs more gracefully. They introduce new tools, prompts, and tests, which brings added maintenance and potential edge-case risks that should be monitored.

@AntoMontagneDev AntoMontagneDev marked this pull request as ready for review March 5, 2026 21:34
stripped = content.strip()

# Try markdown code block with json language tag
match = re.search(r"```(?:json)?\s*\n(.*?)\n```", stripped, re.DOTALL)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔦🐛

Regex-based block extraction is brittle; consider supporting more variants (extra spaces, language hints, nested blocks) or a more robust extraction strategy.

return match.group(1).strip()

# Try generic code block
match = re.search(r"```\s*\n(.*?)\n```", stripped, re.DOTALL)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔦🐛

Second code-block extraction also relies on a strict pattern; consider consolidating extraction logic or adding tests for edge cases (e.g., multiple fences, spacing).

return None, False

# Workaround: some models insert newlines before closing quotes
normalized = extracted.replace('\n"', '"')

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔦🐛

Normalization step (replacing \n") is a hack. Prefer robust JSON parsing with error handling and optional normalization, to avoid edge cases with escaping.

normalized = extracted.replace('\n"', '"')

try:
parser = PydanticOutputParser(output_cls=ValidationAgentResponseModel)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔦🐛

Catch specific parsing/JSON errors instead of a broad Exception to avoid masking other bugs during parsing.

parser = PydanticOutputParser(output_cls=ValidationAgentResponseModel)
parsed = parser.parse(normalized)
return parsed, True
except Exception:

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔦🐛

On failure you return (None, False) without logging; consider logging the failure and/or returning a clearer error signal.

@@ -0,0 +1,54 @@
"""Prompt for hallucination filter — mutes comments that ask the user to investigate instead of stating verified bugs."""

HALLUCINATION_FILTER_SYSTEM_PROMPT = """

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔦🐛

System prompt content could be split into smaller, testable templates or loaded from config for easier maintenance.


For each issue whose comment is an "investigation request" (see below), call mute_issue with its issue_id and reason "investigation_request". Do NOT mute comments that state a verified bug. Prefer issuing all mute_issue calls in a single response.

# Mute These (investigation requests)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔦🐛

Hard-coded lists of investigation-phrases; consider moving to a config/module constant for easier updates and localization.

**Files Changed:**
{files_changed}

**Issues to Review (with IDs for muting):**

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔦🐛

Issues to Review uses placeholders; ensure formatting is robust if data is missing (e.g., missing issues_with_ids).

self.verbose = verbose
self.logger = logging.getLogger(name=LAMPE_LOGGER_NAME)
self.llm = llm or LiteLLM(
model=MODELS.GPT_5_NANO_2025_08_07,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔦🐛

Model choice (GPT_5_NANO_2025_08_07) should be verified for availability/licensing and whether a fallback is needed for environments without that model.


# Skip if no findings to filter
issues_with_ids = _build_issues_with_ids(ev.agent_reviews)
if "_No issues to review._" in issues_with_ids:

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔦🐛

Sentinel check 'No issues to review.' relies on a specific string from a prior call; this is brittle. Consider a more explicit boolean/result from the aggregation step.

)

try:
agent_ctx = WorkflowContext(self._agent)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔦🐛

Ensure the context/agent lifecycle is correct when creating and using agent_ctx; confirm that storing/fetching 'muted_reasons' is safe across runs.

start_event=MuteIssueStart(user_prompt=user_prompt),
ctx=agent_ctx,
)
muted_reasons = await agent_ctx.store.get("muted_reasons", default={})

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔦🐛

Getting 'muted_reasons' from the store uses a default of {}; ensure the store API actually returns a dict and not None.

if self.verbose and muted_reasons:
self.logger.debug(f"Hallucination filter muted {len(muted_reasons)} issues")

except Exception as e:

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔦🐛

Broad exception handling (except Exception) can mask real issues; prefer catching specific known exceptions from the LL/LMM/agent layer.

def test_extract_json_from_plain_content():
"""When no markdown block, return stripped content."""
content = ' {"no_issue": true, "findings": []} '
assert extract_json_from_llm_content(content) == '{"no_issue": true, "findings": []}'

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔦🐛

Plain content extraction test looks fine.

@lampe-ci

lampe-ci Bot commented Mar 5, 2026

Copy link
Copy Markdown

packages/lampe-review/tests/unit/workflows/agentic_review/test_response_parse.py (Line 23): Markdown JSON block extraction is tested; ensure isolated JSON is returned (not the surrounding text).

{"no_issue": true, "findings": []}
```
"""
result = extract_json_from_llm_content(content)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔦🐛

Generic code-block extraction test; ensure behavior when code fences are present without language hints.


def test_extract_json_empty_content():
"""Empty or whitespace returns empty string."""
assert extract_json_from_llm_content("") == ""

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔦🐛

Empty content test; good to cover whitespace/empty input.

def test_parse_validation_response_valid_json():
"""Valid JSON returns parsed model and success."""
content = '{"no_issue": true, "findings": []}'
parsed, success = parse_validation_response(content)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔦🐛

Valid JSON parse test; assumes parsed model exposes attributes (no_issue, findings). Confirm the model type aligns with tests.

{"file_path": "src/a.py", "line_number": 42, "action": "fix",
"problem_summary": "Missing validation", "severity": "high", "category": "security"}
]}"""
parsed, success = parse_validation_response(content)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔦🐛

Test with findings checks nested fields; ensure tests align with actual model structure (dicts vs objects).

assert parsed.findings[0]["line_number"] == 42


def test_parse_validation_response_malformed_json_no_exception():

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔦🐛

Malformed JSON test; verify graceful failure without exceptions.


def test_parse_validation_response_truncated_json_no_exception():
"""Truncated JSON returns (None, False) without raising."""
truncated = '{"no_issue": false, "findings": [{"file_path": "x'

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔦🐛

Truncated JSON test; ensure graceful failure path.


def test_parse_validation_response_empty_no_exception():
"""Empty content returns (None, False) without raising."""
parsed, success = parse_validation_response("")

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔦🐛

Empty input test; covered.


def test_parse_validation_response_garbage_no_exception():
"""Arbitrary garbage returns (None, False) without raising."""
for garbage in ["not json at all", "null", "[]", '{"x"}', "}{"]:

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔦🐛

Garbage inputs loop; ensures non-crashing behavior across varied inputs.



def test_validation_agent_parse_response_graceful_fallback():
"""ValidationAgent._parse_response returns empty findings on malformed input (no traceback)."""

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔦🐛

Graceful fallback for ValidationAgent in a test; verify private parsing path behavior.

"""QuickReviewAgent._parse_response returns empty findings on malformed input (no traceback)."""
from lampe.review.workflows.quick_review.quick_review_agent import QuickReviewAgent

agent = QuickReviewAgent()

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔦🐛

Graceful fallback for QuickReviewAgent test; ensure compatibility with actual agent implementations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant