feat: harden review pipeline by frankekn · Pull Request #6 · frankekn/needlefish

frankekn · 2026-06-28T14:28:38Z

Summary

add a typed CLI parser and focused tests
split shared repo/process/normalization helpers out of adapters and schema
harden GitHub PR review behavior around PR base/head SHAs, token isolation, and review posting
tighten review/deep/map/critic prompt evidence contracts
add regression coverage for JSON normalization, verdicts, process spawning, and Codex retry backoff

Verification

pnpm test
pnpm check
git diff --check
no-excuse TypeScript checker on changed TS files
Oracle review rerun with ChatGPT Pro Extended verified via browser model strategy select

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8cce593c1f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-28T14:41:35Z

+  if (!patch.trim()) {
+    throw new Error(`No diff between ${mergeBase} and ${headSha}. Nothing to review.`);
+  }


Post a check result for empty diffs

When a GitHub review is run against a no-op PR or a manual dispatch after the branch already matches the base, this new throw happens before the try block that reports failures via postCheck. The workflow job exits without creating/updating the Needlefish check run, which can leave branch protection waiting on a missing check instead of getting the intended failure/neutral result; handle the empty-diff case inside the reporting path.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-28T14:41:35Z

+  if (lineEnd < lineStart) {
+    throw new Error("malformed finding: lineEnd before lineStart");
+  }
+  const confidence = Math.max(0, Math.min(1, Number(record.confidence ?? 0)));


Reject non-finite confidence values

If the model returns a non-numeric confidence such as "high" for a P0/P1/P2 finding, Number(...) becomes NaN, the clamp still produces NaN, and confidence < 0.65 is false. That lets malformed blocking findings bypass the new confidence gate and request changes with NaN confidence (later serialized as null), so this should explicitly require Number.isFinite before accepting the value.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-28T14:41:36Z

+  if (mappedHotspots.length === 0) {
    throw new Error("map pass produced no hotspots");
  }


Fall back when all mapped hotspots are filtered

When the map pass returns hotspots but changedHotspots() filters out every file (for example because the model used ./src/foo.ts paths or put only consumer files in files), this throw runs before the tail-coverage backstop below can review bundle.changedFiles. That turns a recoverable mapping-format issue in large/deep reviews into a failed review/check instead of still reviewing the changed files; let the backstop create a tail hotspot when mappedHotspots is empty.

Useful? React with 👍 / 👎.

github-actions

Needlefish PR Review

Verdict: ⛔ changes_requested
Base: a7796d3 → Head: 8cce593

PR refactors Needlefish's CLI/review pipeline into shared repo/process/normalization modules, tightens prompt evidence contracts, and adds test/package/docs coverage. — PR refactors Needlefish's CLI/review pipeline into shared repo/process/normalization modules, tightens prompt evidence contracts, and adds test/package/docs coverage.

Findings

P1: Preserve deep-pass evidence before critic pruning

prompts/critic.md:15

Why this breaks: The changed critic contract at prompts/critic.md:15 requires a checked[] entry beginning with EVIDENCE finding: for every kept P0/P1/P2. The small-review prompt now requires that prefix at prompts/review.md:37, and deep reviews now have their own checked evidence contract at prompts/deep.md:16. But reviewLarge still discards res.checked at src/core/review.ts:152 and builds candidateMerged.checked from summary-only strings at src/core/review.ts:166 before invoking the critic. A legitimate deep P2 with checked evidence is therefore presented to the critic without per-finding evidence, so a compliant critic is instructed to delete it and the large-review verdict can pass incorrectly.

Suggested fix: When merging each deep result, preserve res.checked entries in the merged checked array without breaking the EVIDENCE finding: prefix, and align the deep prompt to emit that per-finding evidence prefix for blocking findings.

Validation: Add a reviewLarge test with stubbed map/deep/critic where deep returns a P2 plus EVIDENCE finding:; assert the critic input includes that evidence and the final blocking finding survives.

P2: Reject nonnumeric blocking confidence

src/shared/normalize.ts:147-148

Why this breaks: The new confidence gate computes confidence with Number(...), but nonnumeric input becomes NaN; because NaN < 0.65 is false, a malformed P0/P1/P2 finding with confidence: "bad" is returned instead of rejected. That malformed finding then reaches the consuming verdict predicate at src/core/verdict.ts:9, where severity alone produces changes_requested, so invalid model output can block a PR.

Suggested fix: Parse confidence into a temporary number, reject non-finite values with Number.isFinite, then clamp and apply the low-confidence severity gate.

Validation: Add a schema test for a P2 finding with nonnumeric confidence and run npm test.

Checked

EVIDENCE finding:Reject nonnumeric blocking confidence changed=src/shared/normalize.ts:147 effect=Number("bad") becomes NaN, the low-confidence check at src/shared/normalize.ts:148 does not throw, and src/core/verdict.ts:9 turns P0/P1/P2 severity into changes_requested.
EVIDENCE finding:Preserve deep-pass evidence before critic pruning changed=prompts/critic.md:15 effect=large-review candidateMerged.checked at src/core/review.ts:166 contains only per-hotspot summaries from src/core/review.ts:152, so a compliant critic sees no per-finding evidence for deep P0/P1/P2 findings.
[Schema and normalization gates] Found one validation bug in the new normalization gate: malformed blocking confidence can bypass the low-confidence rejection and still block a PR.
[Review orchestration and prompt contracts] One blocking contract regression: large-review deep evidence is dropped before the critic, so legitimate P0/P1/P2 findings can be pruned.
[Execution adapters and bundle construction] No actionable defects found in this execution-adapter and bundle-construction surface.
[Codex execution and JSON extraction] No correctness, contract, runtime, security, validation, duplicate, or regression findings in this surface.
[CLI argument contract and dispatch] No correctness, contract, routing, or validation defects found in this CLI argument surface.
[Verdict, classification tests, and package test script] No correctness findings in this surface.

Residual Risk

Static review only; GitHub API endpoints were not exercised because the requested process was read-only inspection of this surface.
Did not run the full test or typecheck suite; verification was limited to diff/source inspection and targeted runtime checks.

github-actions · 2026-06-28T14:58:53Z

+  if (lineEnd < lineStart) {
+    throw new Error("malformed finding: lineEnd before lineStart");
+  }
+  const confidence = Math.max(0, Math.min(1, Number(record.confidence ?? 0)));


P2 (validation): Reject nonnumeric blocking confidence

The new confidence gate computes confidence with Number(...), but nonnumeric input becomes NaN; because NaN < 0.65 is false, a malformed P0/P1/P2 finding with confidence: "bad" is returned instead of rejected. That malformed finding then reaches the consuming verdict predicate at src/core/verdict.ts:9, where severity alone produces changes_requested, so invalid model output can block a PR.

Suggested fix: Parse confidence into a temporary number, reject non-finite values with Number.isFinite, then clamp and apply the low-confidence severity gate.

github-actions · 2026-06-28T14:58:54Z

- Re-check residual_risks: keep blocks=true only when it genuinely prevents a verdict (e.g. a deep pass that failed, leaving files unreviewed).

 # Output
+Before returning, ensure every kept P0/P1/P2 has a corresponding `checked[]` entry beginning with `EVIDENCE finding:`. If the evidence cannot be stated concretely, delete the finding.


P1 (bug): Preserve deep-pass evidence before critic pruning

The changed critic contract at prompts/critic.md:15 requires a checked[] entry beginning with EVIDENCE finding: for every kept P0/P1/P2. The small-review prompt now requires that prefix at prompts/review.md:37, and deep reviews now have their own checked evidence contract at prompts/deep.md:16. But reviewLarge still discards res.checked at src/core/review.ts:152 and builds candidateMerged.checked from summary-only strings at src/core/review.ts:166 before invoking the critic. A legitimate deep P2 with checked evidence is therefore presented to the critic without per-finding evidence, so a compliant critic is instructed to delete it and the large-review verdict can pass incorrectly.

Suggested fix: When merging each deep result, preserve res.checked entries in the merged checked array without breaking the EVIDENCE finding: prefix, and align the deep prompt to emit that per-finding evidence prefix for blocking findings.

frankekn · 2026-06-28T15:34:33Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5443ca4d1a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-28T15:45:40Z

+    throw new Error(`malformed finding: invalid confidence ${String(record.confidence)}`);
+  }
+  const confidence = Math.max(0, Math.min(1, rawConfidence));
+  if (severity !== "P3" && confidence < 0.65) {


Use the documented confidence cutoff

When a model returns a blocking P0/P1/P2 finding with confidence in the 0.65–0.69 range, this accepts it even though the review prompt explicitly says findings below 0.70 must be downgraded to P3 or residual risk. Since deriveVerdict treats any accepted P0/P1/P2 as changes_requested, borderline low-confidence output can incorrectly block a PR; raise this cutoff to 0.70 to match the policy.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-28T15:45:40Z

+  if (mappedHotspots.length === 0) {
    throw new Error("map pass produced no hotspots");


Fall back when mapped hotspots miss changed files

If the map pass returns hotspots whose files do not exactly match the changed paths (for example ./src/app.ts, or only a consumer file), changedHotspots filters them all out and this throws before the tail-coverage backstop can add bundle.changedFiles. That turns a recoverable map miss into a failed review/check for --deep or large PRs; build the tail hotspot before deciding there is nothing to review.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-28T15:45:40Z

+    findings,
+    checked: rawChecked.map(String),


Enforce evidence for blocking findings

The new prompt contract requires every P0/P1/P2 finding to have a matching checked[] entry starting with EVIDENCE finding:<title>, but the normalizer accepts blocking findings without checking that proof exists. If the critic returns a P2 finding with a generic or missing checked entry, deriveVerdict still turns it into changes_requested, so malformed low-evidence output can block a PR despite the hard rule; reject or downgrade blocking findings that lack the required evidence entry.

Useful? React with 👍 / 👎.

github-actions

Needlefish PR Review

Verdict: ⛔ changes_requested
Base: a7796d3 → Head: 5443ca4

Refactors Needlefish around parsed CLI commands, centralized repo/process helpers, async Codex review orchestration, stricter model-output normalization, and supporting tests/docs. — Refactors Needlefish around parsed CLI commands, centralized repo/process helpers, async Codex review orchestration, stricter model-output normalization, and supporting tests/docs.

Findings

P2: Keep tail coverage after filtering map hotspots

src/core/review.ts:118-119

Why this breaks: The new mappedHotspots guard rejects an otherwise recoverable map result before the existing tail-coverage path runs. If the map pass returns hotspots whose files are all non-changed consumers, changedHotspots filters them to [], this throw aborts the review, and the changed files never reach the tail hotspot at src/core/review.ts:127.

Suggested fix: Let tail coverage run when mappedHotspots is empty; only fail after adding uncovered changed files if there is still no hotspot to review.

Validation: Add a reviewLarge test where the fake map returns only files:["src/unchanged.ts"] while changedFiles contains src/app.ts; assert the deep prompt receives tail-coverage for src/app.ts and review completes.

Checked

[Review orchestration, prompts, and Codex execution] One P2 finding: large-review can fail before tail coverage handles filtered map output.
[Local/GitHub adapters and repo/process shelling] No correctness findings found in this review surface.
[Model output schema and normalization] No findings; schema/normalization changes traced through the listed consumers without a concrete regression.
[CLI command contract and option validation] No correctness findings for this CLI parser surface.
[Node test script wiring] No merge-blocking issues found in the package test-script wiring surface.

Residual Risk

Static inspection only; no build or test commands were run.

github-actions · 2026-06-28T16:00:23Z

-  if (mapResult.hotspots.length === 0) {
+  const mapResult = normalizeMap(extractJson(await runCodex(mapPrompt, { repoPath: bundle.repoPath })));
+  const mappedHotspots = changedHotspots(mapResult.hotspots, bundle);
+  if (mappedHotspots.length === 0) {


P2 (bug): Keep tail coverage after filtering map hotspots

The new mappedHotspots guard rejects an otherwise recoverable map result before the existing tail-coverage path runs. If the map pass returns hotspots whose files are all non-changed consumers, changedHotspots filters them to [], this throw aborts the review, and the changed files never reach the tail hotspot at src/core/review.ts:127.

Suggested fix: Let tail coverage run when mappedHotspots is empty; only fail after adding uncovered changed files if there is still no hotspot to review.

frankekn · 2026-06-28T16:03:09Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 632f728ba0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-28T16:14:50Z

+  if (!patch.trim()) {
+    throw new Error(`No diff between ${mergeBase} and ${headSha}. Nothing to review.`);
+  }


Post a check when GitHub diff is empty

When GitHub mode is invoked for a PR whose head has no effective diff against the merge base, this new throw runs before the try block that posts the failure check. The top-level CLI catch only writes stderr, so the documented Needlefish check run is never created or updated; if that check is required, the PR can remain pending instead of getting a clear failure/neutral result. Move this path under the checked error handling or post a check before returning.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-28T16:14:50Z


 export async function review(bundle: Bundle): Promise<ReviewResult> {
-  return isLarge(bundle) ? reviewLarge(bundle) : reviewSmall(bundle);
+  return bundle.deep || isLarge(bundle) ? reviewLarge(bundle) : reviewSmall(bundle);


Preserve PR metadata in deep reviews

When a local review is run as needlefish --deep --pr 123, this newly routes even small diffs through reviewLarge, but that path builds the map/deep prompts without bundle.prMeta. The --pr contract says the PR body/comments/checks are pulled into the review context, so combining it with --deep now silently discards that metadata; pass prMeta into the large-review prompts or keep small deep reviews on the full-bundle path.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-28T16:14:50Z

- Re-check residual_risks: keep blocks=true only when it genuinely prevents a verdict (e.g. a deep pass that failed, leaving files unreviewed).

 # Output
+Before returning, ensure every kept P0/P1/P2 has a corresponding `checked[]` entry beginning with `EVIDENCE finding:`. If the evidence cannot be stated concretely, delete the finding.


Add deep evidence before critic pruning

For large diffs or --deep, candidate findings come from deep.md, whose new checked contract asks for FILES, EDGE, and trigger entries but not EVIDENCE finding: entries. This critic requirement can therefore cause valid P0/P1/P2 findings from the deep pass to be deleted solely because the upstream prompt never requested the required marker; add the same finding-evidence contract to the deep prompt or let the critic create the evidence entry after re-verifying.

Useful? React with 👍 / 👎.

github-actions

Needlefish PR Review

Verdict: ⛔ changes_requested
Base: a7796d3 → Head: 632f728

This PR refactors Needlefish around shared CLI parsing, repo/process helpers, model-output normalization, and the large-review map/deep/critic pipeline. — Kept two concrete P2 findings: the new confidence contract is enforced too loosely, and --deep now drops --pr metadata for small local reviews.

Findings

P2: Raise the blocking-confidence cutoff to 0.70

src/shared/normalize.ts:152-153

Why this breaks: The review prompt contract says findings below 0.70 confidence must not be emitted as P0/P1/P2, but this changed predicate only rejects blocking findings below 0.65. Substitute a model finding with severity='P2' and confidence=0.66: normalizeFinding accepts it, then the consuming guard at src/core/verdict.ts:9 treats P2 as blocking and returns changes_requested, so a below-contract weak finding can fail the GitHub check.

Suggested fix: Use the same 0.70 cutoff as the prompt contract, or demote/reject P0/P1/P2 findings whose confidence is below 0.70 before verdict derivation.

Validation: Add a normalizeFinding test for P2 confidence 0.69 and run pnpm test.

P2: Preserve PR metadata in deep reviews

src/core/review.ts:178

Why this breaks: Setting bundle.deep now routes even small local reviews into reviewLarge. Local mode still populates bundle.prMeta from --pr at src/adapters/local.ts:77, but reviewLarge builds the map/deep prompts from a subset of the bundle at src/core/review.ts:106 and never includes prMeta; the deep prompt only receives AGENTS/HOTSPOT/FOCUS/BASE/HEAD at src/core/review.ts:142. The small path this bypasses serializes the full bundle at src/core/review.ts:95, so needlefish --pr 123 --deep silently drops PR body/comments/checks even though the CLI contract says --pr pulls them.

Suggested fix: Include prMeta in the map/deep prompt context, or keep --pr reviews on a path that passes the full bundle to the model.

Validation: Add a review test with deep=true and non-null prMeta whose fake Codex asserts that the map/deep prompt contains the PR body/comments.

Checked

EVIDENCE finding:Raise the blocking-confidence cutoff to 0.70 changed=src/shared/normalize.ts:152 effect=P2 confidence 0.66 is accepted and deriveVerdict at src/core/verdict.ts:9 returns changes_requested
EVIDENCE finding:Preserve PR metadata in deep reviews changed=src/core/review.ts:178 effect=--pr metadata loaded at src/adapters/local.ts:77 is routed into reviewLarge, whose map/deep prompt construction omits prMeta

Residual Risk

Tests were not run because this pruning pass used read-only diff/source inspection.

github-actions · 2026-06-28T16:31:39Z

+    throw new Error(`malformed finding: invalid confidence ${String(record.confidence)}`);
+  }
+  const confidence = Math.max(0, Math.min(1, rawConfidence));
+  if (severity !== "P3" && confidence < 0.65) {


P2 (validation): Raise the blocking-confidence cutoff to 0.70

The review prompt contract says findings below 0.70 confidence must not be emitted as P0/P1/P2, but this changed predicate only rejects blocking findings below 0.65. Substitute a model finding with severity='P2' and confidence=0.66: normalizeFinding accepts it, then the consuming guard at src/core/verdict.ts:9 treats P2 as blocking and returns changes_requested, so a below-contract weak finding can fail the GitHub check.

Suggested fix: Use the same 0.70 cutoff as the prompt contract, or demote/reject P0/P1/P2 findings whose confidence is below 0.70 before verdict derivation.

github-actions · 2026-06-28T16:31:39Z


 export async function review(bundle: Bundle): Promise<ReviewResult> {
-  return isLarge(bundle) ? reviewLarge(bundle) : reviewSmall(bundle);
+  return bundle.deep || isLarge(bundle) ? reviewLarge(bundle) : reviewSmall(bundle);


P2 (contract): Preserve PR metadata in deep reviews

Setting bundle.deep now routes even small local reviews into reviewLarge. Local mode still populates bundle.prMeta from --pr at src/adapters/local.ts:77, but reviewLarge builds the map/deep prompts from a subset of the bundle at src/core/review.ts:106 and never includes prMeta; the deep prompt only receives AGENTS/HOTSPOT/FOCUS/BASE/HEAD at src/core/review.ts:142. The small path this bypasses serializes the full bundle at src/core/review.ts:95, so needlefish --pr 123 --deep silently drops PR body/comments/checks even though the CLI contract says --pr pulls them.

Suggested fix: Include prMeta in the map/deep prompt context, or keep --pr reviews on a path that passes the full bundle to the model.

frankekn · 2026-06-28T16:38:04Z

@codex review

chatgpt-codex-connector · 2026-06-28T16:43:01Z

Codex Review: Didn't find any major issues. 🚀

Reviewed commit: 15103d26f9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

github-actions

Needlefish PR Review

Verdict: ✅ pass
Base: a7796d3 → Head: 15103d2

This PR refactors Needlefish's CLI/adapters into shared helpers, adds large-review map/deep orchestration with stricter prompt contracts, and moves model/PR JSON parsing into typed normalization. — This PR refactors Needlefish's CLI/adapters into shared helpers, adds large-review map/deep orchestration with stricter prompt contracts, and moves model/PR JSON parsing into typed normalization.

Findings

No actionable findings. Prefer this over padding weak ones.

Checked

[Review orchestration, prompt contracts, and Codex execution] No concrete correctness, security, contract, runtime, or validation defects found in this review surface.
[Adapter bundle construction and GitHub/local side effects] No reportable correctness, contract, runtime, security, validation, or duplicate-behavior defects found in this surface.
[Model and PR output normalization with verdict schema] No actionable findings for this normalization/verdict surface.
[CLI argument parser and command dispatch] No correctness findings in the CLI parser and dispatch surface.

Residual Risk

Typecheck and tests could not be used as evidence in this environment because dependencies are not installed; review conclusion is based on diff and consumer tracing.
Dynamic CLI execution/tests were not run; this review used the requested read-only diff and static consumer tracing.

feat: harden review pipeline

8cce593

chatgpt-codex-connector Bot reviewed Jun 28, 2026

View reviewed changes

github-actions Bot requested changes Jun 28, 2026

View reviewed changes

fix: preserve review evidence through critic

5443ca4

chatgpt-codex-connector Bot reviewed Jun 28, 2026

View reviewed changes

github-actions Bot requested changes Jun 28, 2026

View reviewed changes

fix: keep large review tail coverage

632f728

chatgpt-codex-connector Bot reviewed Jun 28, 2026

View reviewed changes

github-actions Bot requested changes Jun 28, 2026

View reviewed changes

fix: preserve PR context in deep reviews

15103d2

github-actions Bot reviewed Jun 28, 2026

View reviewed changes

frankekn merged commit 86ae4da into main Jun 28, 2026
2 checks passed

		if (mappedHotspots.length === 0) {
		throw new Error("map pass produced no hotspots");

Conversation

frankekn commented Jun 28, 2026

Summary

Verification

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Needlefish PR Review

Findings

P1: Preserve deep-pass evidence before critic pruning

P2: Reject nonnumeric blocking confidence

Checked

Residual Risk

Uh oh!

github-actions Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

frankekn commented Jun 28, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Needlefish PR Review

Findings

P2: Keep tail coverage after filtering map hotspots

Checked

Residual Risk

Uh oh!

github-actions Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

frankekn commented Jun 28, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Needlefish PR Review

Findings

P2: Raise the blocking-confidence cutoff to 0.70

P2: Preserve PR metadata in deep reviews