Navigator: capture rejection signal + give the AI a stop() tool#39

Open
DavertMik wants to merge 5 commits into main from feat/navigate-cli-command

DavertMik (Contributor) commented on May 24, 2026

Summary

Navigator improvements that together fix a failure mode: Navigator burns 9 retry attempts on a form whose submission the server is rejecting, even though the locator was never the problem.

What the AI sees, post-PR

When a click runs cleanly but the URL doesn't change to the expected target, the raw ARIA diff between pre-click and post-click is included in the next retry prompt. No regex extraction, no pre-digestion. The retry prompt then instructs the AI to read the diff and judge whether the application rejected the action — looking for any new role / text that signals rejection (different sites express it differently — the prompt names categories, not specific phrases).
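A minimal sketch of how that feedback might be assembled, assuming the diff is available as a plain string. `ClickOutcome` and `buildRetryFeedback` are hypothetical names for illustration, the sample diff text is made up, and the actual prompt wording in the PR differs.

```typescript
// Hypothetical shape for illustration; explorbot's ActionResult / Diff types may differ.
interface ClickOutcome {
  expectedUrl: string;
  actualUrl: string;
  ariaDiff: string | null; // raw pre-click vs post-click ARIA diff, or null if none was produced
}

// Builds the feedback block for the next retry prompt.
// The diff is passed through verbatim: no regex extraction, no summarising.
function buildRetryFeedback(outcome: ClickOutcome): string {
  const lines: string[] = [
    `The click ran without errors, but the URL is ${outcome.actualUrl} instead of ${outcome.expectedUrl}.`,
  ];
  if (outcome.ariaDiff !== null) {
    lines.push(
      'ARIA diff between the pre-click and post-click page:',
      outcome.ariaDiff,
      'Read the diff and judge whether the application rejected the action.',
    );
  }
  return lines.join('\n');
}

// Made-up sample data: a login click that stayed on /login and produced a new alert node.
const feedback = buildRetryFeedback({
  expectedUrl: '/dashboard',
  actualUrl: '/login',
  ariaDiff: '+ alert "Invalid email or password"',
});
console.log(feedback);
```

The point of the design is that the function never inspects the diff's contents; deciding what counts as a rejection is left to the model.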

What the AI can do

A stop(reason) tool is added to Navigator's tool set. The AI is told to call it when the rejection is something only the user can fix (wrong credentials, missing knowledge, captcha, blocking error). When called, the reason is logged at error level and surfaced in the existing interactive failure prompt so the user knows what to fix. Mirrors the existing Tester stop tool pattern.
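The stop path can be sketched roughly as follows. This is an illustration under assumptions, not the code in src/ai/navigator.ts: `StopState`, `StopTool`, and `createStopTool` are hypothetical names, and the real tool is registered through whatever tool-definition API explorbot's AI provider uses.

```typescript
// Hypothetical sketch; names and shapes are assumptions, not explorbot's actual types.
interface StopState {
  stopReason: string | null;
}

interface StopTool {
  name: 'stop';
  description: string;
  execute(args: { reason: string }): string;
}

function createStopTool(state: StopState, logError: (message: string) => void): StopTool {
  return {
    name: 'stop',
    description:
      'Call when the page rejected the action for a reason only the user can fix ' +
      '(wrong credentials, missing knowledge, captcha, blocking error).',
    execute({ reason }) {
      state.stopReason = reason; // checked by the retry loop after each AI call
      logError(`Navigator stopped: ${reason}`);
      return 'Navigation stopped.';
    },
  };
}

// Usage: simulate the model calling stop() during a retry.
const state: StopState = { stopReason: null };
const errors: string[] = [];
const stopTool = createStopTool(state, (message) => errors.push(message));

stopTool.execute({ reason: 'Login rejected: credentials in knowledge are invalid' });

if (state.stopReason !== null) {
  // Surfaced to the user instead of spending the remaining retries.
  console.log(`Stopped early: ${state.stopReason}`);
}
```

Keeping the reason in shared state rather than throwing lets the retry loop finish the current AI call and then decide what to show in the failure prompt.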

Prompt contradiction fixed

Previously the retry prompt told the model both "this is NOT a locator issue" AND "propose new solutions" in the same turn. Now branched: app-side rejection → call stop() or correct the submitted data (two clear choices, no locator changes); click that missed entirely → propose new locator strategies.
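As a rough illustration of the branching, assuming the gate is whether an ARIA diff exists at all (as the later commit message describes). `retryInstructions` is a hypothetical helper and the strings are paraphrases, not the prompt text shipped in the PR.

```typescript
// Hypothetical helper; the actual prompt text is longer and worded differently.
function retryInstructions(ariaDiff: string | null): string {
  // Gate on the presence of a diff, not on a regex match over its contents.
  const pageReacted = ariaDiff !== null;

  if (pageReacted) {
    return [
      'The page reacted to the action. Read the ARIA diff above.',
      'If the application rejected the action, either call stop(reason)',
      'or correct the submitted data. Do not change the locator.',
    ].join(' ');
  }

  return 'The page did not react, so the click probably missed. Propose new locator strategies.';
}

const onRejection = retryInstructions('+ alert "..."');
const onMiss = retryInstructions(null);
console.log(onRejection);
console.log(onMiss);
```

Because exactly one branch is returned per retry, the model never receives "do not change the locator" and "propose new solutions" in the same turn.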

Files

  • src/ai/navigator.ts: stop tool, stopReason check after each AI call, branched retry prompt, ARIA-diff feedback path. Reuses existing ActionResult.diff / Diff.calculate.
  • src/utils/aria.ts: unchanged (an extractAlerts regex helper was added and then removed in commit 0e58987 after maintainer feedback that regex-based detectors violate the project's "rely on the prompt" rule).

Test plan

  • bun run format / bun run lint:fix clean
  • No new type errors in edited files
  • ./bin/explorbot-cli.ts navigate / against the demo app: happy path succeeds on the first attempt in ~10s
  • Manual: with a known-bad credential in the knowledge/.env, expect Navigator to call stop() with the page's rejection text after one or two attempts. I couldn't drive this in this session because explorbot's loadEnv overwrites command-line env vars and the demo's experience cache replays the working credentials. Easy to verify locally by editing .env and clearing experience/.
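The env-precedence problem in the manual step can be illustrated with a small sketch. `mergeEnv` is a hypothetical function, not explorbot's loadEnv, and `APP_PASSWORD` is a made-up variable name; the sketch only shows why a loader that overwrites makes a command-line override ineffective.

```typescript
type Env = Record<string, string>;

// Hypothetical loader, not explorbot's loadEnv.
// overwrite = true: values from the .env file replace variables already set in the process.
// overwrite = false: variables already set (for example on the command line) win.
function mergeEnv(processEnv: Env, fileEnv: Env, overwrite: boolean): Env {
  return overwrite ? { ...processEnv, ...fileEnv } : { ...fileEnv, ...processEnv };
}

const fromCommandLine: Env = { APP_PASSWORD: 'known-bad' };
const fromDotEnv: Env = { APP_PASSWORD: 'working-secret' };

const overwritten = mergeEnv(fromCommandLine, fromDotEnv, true);
const preserved = mergeEnv(fromCommandLine, fromDotEnv, false);

console.log(overwritten.APP_PASSWORD); // prints "working-secret": the bad credential never reaches the app
console.log(preserved.APP_PASSWORD); // prints "known-bad"
```

With overwriting behaviour, the only way to inject the bad credential is to edit the .env file itself, which matches the local verification steps described above.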

🤖 Generated with Claude Code

DavertMik and others added 4 commits May 24, 2026 00:50
Exposes the AI Navigator as a one-shot CLI command. Exits 0 when the
target URL is reached and 1 otherwise, so it can be used as a
reachability probe in CI. Inherits --session and all common options,
making it the canonical way to capture an authenticated session for
downstream agents in a single command.

Also restructures docs/commands.md to treat CLI as a first-class
surface alongside TUI: a comprehensive reference table, per-command
sections showing both invocations side by side, and coverage of the
previously-undocumented CLI-only commands.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

When a click succeeds but the URL stays put, Navigator now diffs the
page and extracts any new alert/status/alertdialog messages, then
includes them (plus the ARIA changes) in the next retry prompt. This
breaks the loop where Navigator would re-fire 9 syntactic locator
variants against a form that was actually being rejected by the
server. The retry prompt now tells the model to re-examine credentials
or input data before changing locators when the page reacted.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Navigator's resolveState loop was passing tools=undefined to the AI,
so the model had no way to signal "this is hopeless, the page is
rejecting the submit." Combined with a retry prompt that, on
app-rejection, told the model both "do not change the locator" AND
"propose new solutions" in the same turn — the model spent its full
retry budget mutating locators that were already correct.

This adds a stop(reason) tool the AI can call when no locator change
will help (wrong credentials, missing knowledge, captcha, blocking
error). When called, the reason is logged at error level and surfaced
in the existing interactive failure prompt so the user knows what to
fix.

The retry prompt is restructured so the two paths are mutually
exclusive:
- if the page reacted (alerts / ARIA changes): two clear choices —
  call stop(reason), or correct the submitted data using known
  knowledge. Do not change the locator.
- if the page did not react: propose new locator strategies (the old
  behaviour).

No control-flow rewrite — the model can still emit code blocks in
text; the tool is an optional escape valve, mirroring how Tester uses
its stop() tool at tester.ts:938.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

DavertMik changed the title from "Add navigate CLI command + feed back why URL didn't change" to "Navigator: capture rejection signal + give the AI a stop() tool" on May 24, 2026

CLAUDE.md ("Prompts & Rules — General, Not Example-Driven") forbids
adding programmatic detectors that target a semantic judgement the
model should make from a prompt. The extractAlerts() helper I added
to utils/aria.ts violated that — it was a regex over the ARIA snapshot
looking for alert/alertdialog/status roles to feed to the model as a
pre-digested "Page now shows: ..." line.

Remove it. The ARIA diff produced by the existing Diff infrastructure
is fed to the model as-is, and the retry prompt now instructs the
model to read the diff and judge for itself whether the application
rejected the action — looking for any new role/text that signals
rejection, without naming specific phrases. Different sites express
rejection differently and the prompt says so.

The stop() tool path and the prompt branching are unchanged in
structure; only the language is now "read the ARIA diff above" instead
of "see alerts above", and the pageReacted gate is based on whether
the diff is non-null rather than on whether the regex matched.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>