Navigator: capture rejection signal + give the AI a stop() tool #39
Open
DavertMik wants to merge 5 commits into
Exposes the AI Navigator as a one-shot CLI command. Exits 0 when the target URL is reached and 1 otherwise, so it can be used as a reachability probe in CI. Inherits --session and all common options, making it the canonical way to capture an authenticated session for downstream agents in a single command.
Also restructures docs/commands.md to treat CLI as a first-class surface alongside TUI: a comprehensive reference table, per-command sections showing both invocations side by side, and coverage of the previously undocumented CLI-only commands.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
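The exit-code contract described above can be sketched as follows. This is a minimal illustration, not the command's actual code: the `NavigateOutcome` shape and the `exitCodeFor` helper are hypothetical names introduced here, since the PR does not show how the CLI reports the result internally.

```typescript
// Hypothetical result shape; the real Navigator return type is not shown in this PR.
interface NavigateOutcome {
  reached: boolean; // true when the target URL was reached
  finalUrl: string; // URL the browser ended on
  reason?: string; // optional explanation when the target was not reached
}

// Map the outcome to the exit code described in the commit message:
// 0 when the target URL is reached, 1 otherwise.
function exitCodeFor(outcome: NavigateOutcome): 0 | 1 {
  return outcome.reached ? 0 : 1;
}

// A CLI entry point would typically end with something like:
//   process.exit(exitCodeFor(outcome));
```

With a contract like this, a CI step only needs to check the process exit status to know whether the page was reachable.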
When a click succeeds but the URL stays put, Navigator now diffs the page and extracts any new alert/status/alertdialog messages, then includes them (plus the ARIA changes) in the next retry prompt. This breaks the loop where Navigator would re-fire 9 syntactic locator variants against a form that was actually being rejected by the server.
The retry prompt now tells the model to re-examine credentials or input data before changing locators when the page reacted.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Navigator's resolveState loop was passing tools=undefined to the AI, so the model had no way to signal "this is hopeless, the page is rejecting the submit." Combined with a retry prompt that, on app-rejection, told the model both "do not change the locator" AND "propose new solutions" in the same turn, the model spent its full retry budget mutating locators that were already correct.
This adds a stop(reason) tool the AI can call when no locator change will help (wrong credentials, missing knowledge, captcha, blocking error). When called, the reason is logged at error level and surfaced in the existing interactive failure prompt so the user knows what to fix.
The retry prompt is restructured so the two paths are mutually exclusive:
- if the page reacted (alerts / ARIA changes): two clear choices, either call stop(reason) or correct the submitted data using known knowledge. Do not change the locator.
- if the page did not react: propose new locator strategies (the old behaviour).
No control-flow rewrite: the model can still emit code blocks in text; the tool is an optional escape valve, mirroring how Tester uses its stop() tool at tester.ts:938.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
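A minimal sketch of the stop(reason) escape valve described in this commit, under stated assumptions: `ToolDef`, `StopSignal`, `createStopTool` and `shouldAbort` are hypothetical names for illustration, and the real code in src/ai/navigator.ts presumably registers the tool through whatever AI SDK the project uses.

```typescript
// Hypothetical minimal tool shape (assumption: the project uses its AI SDK's own tool helper).
interface ToolDef<Args> {
  description: string;
  execute: (args: Args) => string;
}

// Holds the reason passed by the model, or null if stop() was never called.
class StopSignal {
  reason: string | null = null;
}

// Build a stop tool that records the model's reason on the shared signal.
function createStopTool(signal: StopSignal): ToolDef<{ reason: string }> {
  return {
    description:
      "Call when no locator change will help (wrong credentials, missing knowledge, captcha, blocking error).",
    execute: ({ reason }) => {
      signal.reason = reason;
      return "Stopping navigation.";
    },
  };
}

// Checked after each AI call: logs the reason and tells the retry loop to bail out.
function shouldAbort(signal: StopSignal, logError: (msg: string) => void): boolean {
  if (signal.reason === null) return false;
  logError(`Navigator stopped: ${signal.reason}`);
  return true;
}
```

Keeping the reason on a separate signal object means the retry loop itself does not change: it only gains one check after each AI call, which matches the "no control-flow rewrite" intent.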
CLAUDE.md ("Prompts & Rules — General, Not Example-Driven") forbids
adding programmatic detectors that target a semantic judgement the
model should make from a prompt. The extractAlerts() helper I added
to utils/aria.ts violated that — it was a regex over the ARIA snapshot
looking for alert/alertdialog/status roles to feed to the model as a
pre-digested "Page now shows: ..." line.
Remove it. The ARIA diff produced by the existing Diff infrastructure
is fed to the model as-is, and the retry prompt now instructs the
model to read the diff and judge for itself whether the application
rejected the action — looking for any new role/text that signals
rejection, without naming specific phrases. Different sites express
rejection differently and the prompt says so.
The stop() tool path and the prompt branching are unchanged in
structure; only the language is now "read the ARIA diff above" instead
of "see alerts above", and the pageReacted gate is based on whether
the diff is non-null rather than on whether the regex matched.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
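The gate and the two mutually exclusive prompt branches can be sketched as below. This is an illustration only: `buildRetryPrompt` is a hypothetical helper, and the prompt wording is paraphrased from the commit messages rather than copied from the real prompt.

```typescript
// Hypothetical helper: chooses the retry instructions from the raw ARIA diff.
// A non-null diff means the page reacted to the action.
function buildRetryPrompt(ariaDiff: string | null): string {
  const pageReacted = ariaDiff !== null;

  if (pageReacted) {
    // App-side reaction: hand over the diff as-is and let the model judge it.
    return [
      "The action ran, but the URL did not change. ARIA diff:",
      ariaDiff,
      "Read the ARIA diff above and judge whether the application rejected the action.",
      "If it did, either call stop(reason) or correct the submitted data using known knowledge.",
      "Do not change the locator.",
    ].join("\n");
  }

  // No reaction at all: the click probably missed, so locator changes make sense.
  return "The page did not react to the action. Propose new locator strategies.";
}
```

Because the branch depends only on whether a diff exists, no site-specific phrases or roles are hard-coded; the judgement stays with the model.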
Summary
Navigator improvements that together fix the failure mode where Navigator burns 9 retry attempts on a form that is actually being rejected by the server, not by a bad locator.
What the AI sees, post-PR
When a click runs cleanly but the URL doesn't change to the expected target, the raw ARIA diff between pre-click and post-click is included in the next retry prompt. No regex extraction, no pre-digestion. The retry prompt then instructs the AI to read the diff and judge whether the application rejected the action — looking for any new role / text that signals rejection (different sites express it differently — the prompt names categories, not specific phrases).
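To make "raw ARIA diff" concrete, here is a rough sketch of a line-level diff between two ARIA snapshots. `ariaLineDiff` is a hypothetical stand-in written for this example; the PR reuses the project's existing Diff.calculate, whose actual output format is not shown here.

```typescript
// Hypothetical stand-in for the project's diff: compares two ARIA snapshots
// line by line and returns null when nothing changed.
function ariaLineDiff(before: string, after: string): string | null {
  const toLines = (snapshot: string): string[] =>
    snapshot
      .split("\n")
      .map((line) => line.trim())
      .filter((line) => line.length > 0);

  const beforeLines = toLines(before);
  const afterLines = toLines(after);
  const beforeSet = new Set(beforeLines);
  const afterSet = new Set(afterLines);

  // Lines that disappeared are prefixed with "- ", new lines with "+ ".
  const removed = beforeLines.filter((line) => !afterSet.has(line)).map((line) => `- ${line}`);
  const added = afterLines.filter((line) => !beforeSet.has(line)).map((line) => `+ ${line}`);

  const changes = [...removed, ...added];
  return changes.length > 0 ? changes.join("\n") : null;
}
```

A new alert node appearing after a submit would show up as an added line, and the model is left to decide whether that line signals rejection.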
What the AI can do
A stop(reason) tool is added to Navigator's tool set. The AI is told to call it when the rejection is something only the user can fix (wrong credentials, missing knowledge, captcha, blocking error). When called, the reason is logged at error level and surfaced in the existing interactive failure prompt so the user knows what to fix. Mirrors the existing Tester stop tool pattern.
Prompt contradiction fixed
Previously the retry prompt told the model both "this is NOT a locator issue" AND "propose new solutions" in the same turn. Now branched: app-side rejection → call stop() or correct the submitted data (two clear choices, no locator changes); click that missed entirely → propose new locator strategies.
Files
- src/ai/navigator.ts: stop tool, stopReason check after each AI call, branched retry prompt, ARIA-diff feedback path. Reuses existing ActionResult.diff / Diff.calculate.
- src/utils/aria.ts: unchanged (an extractAlerts regex helper was added and then removed in commit 0e58987 after maintainer feedback that regex-based detectors violate the project's "rely on the prompt" rule).
Test plan
- bun run format / bun run lint:fix clean
- ./bin/explorbot-cli.ts navigate / against the demo app: happy path succeeds first attempt in ~10s
- Wrong credentials in .env: expect Navigator to call stop() with the page's rejection text after one or two attempts. I couldn't drive this in this session because explorbot's loadEnv overwrites command-line env vars and the demo's experience cache replays the working credentials. Easy to verify locally by editing .env and clearing experience/.
🤖 Generated with Claude Code