
feat: make the inner debug agent effective for web QA + run controls #9

Merged
sebyx07 merged 2 commits into main from feat/effective-web-qa on Jun 29, 2026

sebyx07 (Contributor) commented Jun 29, 2026

What & why

Driving the debug loop end-to-end against a real app (a planted-bug React fixture) surfaced blockers that unit tests couldn't — the inner agent never converged to a report. This makes the fast-model driver (glm-5.2) actually work for web QA, and adds the run-control surface.

Validated by a tool-scoped claude -p black-box QA run: a second Claude, allowed only the mcp__ui-debugger__* tools (no source access), drove the debugger and returned a correct, prioritized fix plan — catching the dead buttons, no-op newsletter, /api/featured 404 + App.tsx:31 JSON SyntaxError, and the broken logo.png/product-3.png.

Inner-loop fixes

  • act: flat input schema instead of z.discriminatedUnion — the anyOf JSON-Schema made glm emit empty {} args, so act never executed. Per-action requirements now enforced at runtime. (the make-or-break fix)
  • normalizeQuery in the browser adapter maps LLM-natural targets (button "Add to cart", plain text) onto Playwright role=/text= engines, not just raw CSS.
  • observe(tree) returns a ready target per node (role+name or text, >> nth= for dupes) so the model copies a working selector — act success ~0% → ~100%.
  • Screenshot slug capped at 60 chars — a long look question hit ENAMETOOLONG and killed look.
  • Step-budget nudge so the driver reports before the 30-step cap; agent.log step + tool-error logging.
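The flat-schema idea in the first bullet can be sketched without any schema library. This is a minimal illustration, not the code in `src/agent/belt/act.ts`: the action names, the field set, and the `describeAct` dispatcher are assumptions made for the example. Every operand is optional in the input shape, and a small `required()` guard enforces the per-action requirements at runtime.

```typescript
// Hypothetical action names and fields, loosely modeled on the PR description.
type ActAction = "click" | "type" | "press" | "scroll";

// Flat shape: every operand is optional at the schema level, so the tool's
// JSON Schema is one plain object with no anyOf branches.
interface ActInput {
  action: ActAction;
  target?: string;
  text?: string;
  key?: string;
}

// Runtime guard: return the operand, or throw an error naming what is missing.
function required<K extends keyof ActInput>(
  input: ActInput,
  field: K,
): NonNullable<ActInput[K]> {
  const value = input[field];
  if (value === undefined || value === null || value === "") {
    throw new Error(`act(${input.action}) requires '${String(field)}'`);
  }
  return value as NonNullable<ActInput[K]>;
}

// Per-action requirements live in the dispatcher instead of the schema.
function describeAct(input: ActInput): string {
  switch (input.action) {
    case "click":
      return `click ${required(input, "target")}`;
    case "type": {
      // Check the cheap operand first so a bad selector cannot mask it.
      const text = required(input, "text");
      const target = required(input, "target");
      return `type "${text}" into ${target}`;
    }
    case "press":
      return `press ${required(input, "key")}`;
    case "scroll":
      return "scroll";
  }
}
```

The trade-off is that the schema no longer documents which operands go with which action, so the error messages from the guard have to carry that information back to the model.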

Run controls

  • start_debug timeout — always capped (default 300s, overridable via timeout); auto-ends and frees the profile.
  • CLI status / stop over a new state.json breadcrumb; SIGTERM/SIGINT end the run gracefully.
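A status command built on a PID breadcrumb needs a liveness probe. The sketch below shows the usual signal-0 technique; it is an assumption about the approach, not a copy of `src/cli/control.ts`. The `Kill` type and the injected `kill` parameter are hypothetical and exist only so the function can be exercised without signalling a real process.

```typescript
// Shape of the signal sender; Node's process.kill is compatible with it.
type Kill = (pid: number, signal: number | string) => unknown;

// Signal 0 performs error checking only: nothing is delivered, but the call
// throws if the PID cannot be signalled.
function isAlive(pid: number, kill: Kill): boolean {
  try {
    kill(pid, 0);
    return true;
  } catch (err) {
    const code = (err as { code?: string }).code;
    // EPERM: the process exists but belongs to another user.
    if (code === "EPERM") return true;
    // ESRCH: no such process, so the breadcrumb is stale.
    if (code === "ESRCH") return false;
    // Anything else is unexpected and should not be swallowed.
    throw err;
  }
}
```

In a Node process this would be called as `isAlive(pid, (p, s) => process.kill(p, s))`. A live PID only shows that some process holds that number, so it does not prove the process is the server that wrote `state.json`.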

Docs + fixtures

  • Move idea/ → docs/idea/; add docs/claude/SKILL.md (driving claude headless).
  • Rewrite README (MCP tools, CLI, timeout, skill); exclude dummy/ from biome.
  • dummy/web: deliberately-buggy React+SCSS QA fixture (answer key in BUGS.md).

Known gap

The vision pass misses subtle pure-CSS issues (white-on-white invisible text, low-contrast) — a screenshot can't reveal invisible text; would need DOM computed-style contrast analysis (follow-up).
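One possible starting point for that follow-up is the WCAG 2.x contrast-ratio formula applied to computed foreground and background colours. The sketch below is not part of this PR. It assumes the colours are already parsed into opaque 8-bit RGB triples (for example from `getComputedStyle`), and the names `contrastRatio` and `isLowContrast` are illustrative.

```typescript
type Rgb = [number, number, number];

// Linearize one 8-bit sRGB channel as defined by WCAG 2.x.
function linearize(channel: number): number {
  const c = channel / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance: weighted sum of the linearized channels.
function relativeLuminance([r, g, b]: Rgb): number {
  return 0.2126 * linearize(r) + 0.7152 * linearize(g) + 0.0722 * linearize(b);
}

// Contrast ratio ranges from 1 (identical colours) to 21 (black on white).
function contrastRatio(foreground: Rgb, background: Rgb): number {
  const a = relativeLuminance(foreground);
  const b = relativeLuminance(background);
  const lighter = Math.max(a, b);
  const darker = Math.min(a, b);
  return (lighter + 0.05) / (darker + 0.05);
}

// 4.5:1 is the WCAG AA minimum for normal-size text.
function isLowContrast(foreground: Rgb, background: Rgb, minimum = 4.5): boolean {
  return contrastRatio(foreground, background) < minimum;
}
```

White text on a white background yields a ratio of 1, which is exactly the case a screenshot cannot reveal. The sketch ignores alpha, gradients, and background images, all of which a real check would need to handle.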

291 tests pass; typecheck + lint clean.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added session controls to check run status and stop active runs.
    • Added optional timeout support for debug sessions, including automatic end behavior.
    • Improved debug runs with clearer step summaries, logging, and budget warnings.
  • Bug Fixes

    • Better selector handling makes UI interactions more reliable.
    • Fixed missing-state handling for run status/stop commands.
  • Documentation

    • Expanded setup and usage docs for debugging tools, CLI commands, and session behavior.

Driving the loop end-to-end against a real app surfaced several blockers that
unit tests couldn't (the agent never converged to a report). Fixed the loop so a
fast model (glm-5.2) can actually drive a web app and return useful findings, and
added the run-control surface.

Inner-loop fixes:
- act: flat input schema instead of z.discriminatedUnion — anyOf JSON-Schema made
  glm emit empty {} args, so act never executed. Per-action requirements now
  enforced at runtime.
- browser adapter: normalizeQuery maps LLM-natural targets (role+name, plain text)
  onto Playwright role=/text= engines instead of only raw CSS.
- observe(tree): returns a ready-to-use `target` per node (role+name or text, with
  >> nth= for duplicates) so the model copies a working selector instead of guessing.
- findings-store: cap screenshot-name slug at 60 chars (a long look question hit
  ENAMETOOLONG and killed look).
- loop: step-budget nudge so the driver reports before the 30-step cap; agent.log
  step + tool-error logging for observability.
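
The query normalization described in the list above could look roughly like the sketch below. It is a hypothetical reconstruction, not the contents of `src/adapters/browser/query.ts`: the role allow-list, the regular expressions, and the fallback order are assumptions. The output strings use Playwright's `role=` and `text=` selector engines.

```typescript
// Assumed subset of ARIA roles that a model is likely to name.
const KNOWN_ROLES = new Set(["button", "link", "textbox", "checkbox", "heading", "img"]);

// Already an explicit selector engine: leave untouched.
const ENGINE_PREFIX = /^(role|text|css|xpath)=/;
// LLM-natural shorthand such as: button "Add to cart"
const ROLE_NAME = /^([a-z]+)\s+"([^"]+)"$/;
// Deliberately narrow CSS detection: id, class, or attribute selectors only.
const CSS_LIKE = /^[#.\[]/;

function normalizeQuery(raw: string): string {
  const query = raw.trim();
  if (ENGINE_PREFIX.test(query)) return query;

  const match = ROLE_NAME.exec(query);
  const role = match?.[1] ?? "";
  const name = match?.[2] ?? "";
  if (match && KNOWN_ROLES.has(role)) {
    return `role=${role}[name="${name}"]`;
  }

  if (CSS_LIKE.test(query)) return query;

  // Everything else is treated as visible text.
  return `text=${query}`;
}
```

With these rules `button "Add to cart"` becomes `role=button[name="Add to cart"]`, while a bare label falls through to `text=`. Tag-based CSS such as `button.primary` would also fall through to text in this sketch, so a caller that needs it would pass an explicit `css=` prefix.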

Run controls:
- start_debug: always-on wall-clock timeout (default 300s, overridable via `timeout`).
- CLI `status` / `stop` over a new state.json breadcrumb; SIGTERM/SIGINT end the run
  gracefully (abort loop, close browser, free profile).

Docs + fixtures:
- move idea/ -> docs/idea/; add docs/claude/SKILL.md (driving claude headless).
- rewrite README (MCP tools, CLI, timeout, skill); exclude dummy/ from biome.
- dummy/web: deliberately-buggy React+SCSS QA fixture (answer key in BUGS.md),
  used to validate the loop with a tool-scoped `claude -p` black-box QA run.

291 tests pass; typecheck + lint clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
coderabbitai Bot commented Jun 29, 2026


Review details
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b314bce0-50a6-4835-8c6a-850e9d1a3dc9

📥 Commits

Reviewing files that changed from the base of the PR and between 54cf2e6 and dad65d9.

📒 Files selected for processing (17)
  • README.md
  • dummy/web/BUGS.md
  • dummy/web/src/components/ProductCard.module.scss
  • dummy/web/src/components/ProductGrid.module.scss
  • src/adapters/browser/query.test.ts
  • src/adapters/browser/query.ts
  • src/agent/belt/act.ts
  • src/agent/belt/observe.test.ts
  • src/agent/belt/observe.ts
  • src/agent/prompts/web-addendum.ts
  • src/cli/control.ts
  • src/config/load.test.ts
  • src/config/load.ts
  • src/mcp/tools/start-debug.ts
  • src/services/session-builder.ts
  • src/session/state-file.test.ts
  • src/session/state-file.ts
📝 Walkthrough

Walkthrough

Adds cross-process run state persistence (state.json), wall-clock session timeouts, and CLI status/stop commands. Introduces Playwright query normalization, flattens the act schema to a single object with runtime guards, and enriches observe tree nodes with precomputed target selectors. Enhances the agent loop with step-budget nudging and logging. Adds a deliberately-buggy React/Vite QA fixture (dummy/web).
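
To illustrate the precomputed target selectors mentioned in the walkthrough, here is a minimal sketch of attaching a ready-to-copy selector to each tree node and disambiguating duplicates with Playwright's `>> nth=` chaining. The `TreeNode` fields and the `attachTargets` name are assumptions for the example, not the shapes used in `src/agent/belt/observe.ts`.

```typescript
interface TreeNode {
  role: string;
  name?: string;
  text?: string;
  target?: string;
}

// Base selector for a node: role+name when it has an accessible name,
// visible text otherwise, and nothing for unnamed non-semantic nodes.
function baseSelector(node: TreeNode): string | undefined {
  if (node.name) return `role=${node.role}[name="${node.name}"]`;
  if (node.text) return `text=${node.text}`;
  return undefined;
}

function attachTargets(nodes: TreeNode[]): TreeNode[] {
  const bases = nodes.map(baseSelector);

  // First pass: count how often each base selector occurs.
  const totals = new Map<string, number>();
  for (const base of bases) {
    if (base) totals.set(base, (totals.get(base) ?? 0) + 1);
  }

  // Second pass: append a zero-based nth index only where needed.
  const seen = new Map<string, number>();
  return nodes.map((node, i) => {
    const base = bases[i];
    if (!base) return node;
    const index = seen.get(base) ?? 0;
    seen.set(base, index + 1);
    const duplicated = (totals.get(base) ?? 0) > 1;
    return { ...node, target: duplicated ? `${base} >> nth=${index}` : base };
  });
}
```

The `nth` index is relative to the whole document, so a target computed from a scoped read can point at a different element when it is replayed unscoped.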

Changes

Core source changes

| Layer | File(s) | Summary |
| --- | --- | --- |
| StateFile schema and FileStatePort | src/session/state-file.ts, src/session/state-file.test.ts | Introduces StateFileSchema/StateFile, writeState/readState helpers, StatePort interface, noopStatePort, FileStatePort, and markStatus with full test coverage. |
| DebugService timeouts and state breadcrumb | src/services/debug-service.ts, src/services/debug-service.test.ts, src/mcp/tools/start-debug.ts | Adds DEFAULT_SESSION_TIMEOUT_MS, per-run timeoutMs, #armTimeout/#clearTimeout helpers, state.record/clear calls on start/end, and wires timeout seconds through the start_debug MCP tool. |
| CLI status/stop commands and main wiring | src/cli/control.ts, src/cli/control.test.ts, src/config/load.ts, src/main.ts | Adds runStatus/runStop reading state.json, loadWorkspaceDir helper, main.ts subcommand dispatch for status/stop, and SIGTERM/SIGINT graceful shutdown. |
| Playwright query normalization | src/adapters/browser/query.ts, src/adapters/browser/query.test.ts, src/adapters/browser/browser-adapter.ts | Adds normalizeQuery mapping LLM-natural targets to Playwright selector strings; integrates into click/type/waitFor/#collect. |
| Flat act schema with runtime guards | src/agent/belt/act.ts, src/agent/belt/act.test.ts | Replaces discriminated-union ActInputSchema with a flat zod object keyed by ACT_ACTIONS enum, adds required() helper for per-action runtime validation. |
| TreeNode targets in observe tree channel | src/agent/belt/observe.ts, src/agent/belt/observe.test.ts, src/agent/prompts/web-addendum.ts | Adds TreeNode type with optional target, ARIA-role selector builders, >> nth= disambiguation, and updates the web-addendum prompt to reference precomputed targets. |
| Agent loop: budget nudge, logging, session-builder instrumentation | src/agent/loop.ts, src/agent/loop.test.ts, src/services/session-builder.ts, src/session/findings-store.ts, src/session/findings-store.test.ts | Adds BUDGET_WARN, budgetNudge, AgentLog, describeStep, withToolLog, logAgent, run lifecycle logging, and SLUG_MAX slug capping. |
| Docs and config updates | README.md, CLAUDE.md, docs/claude/SKILL.md, biome.json | Updates README with a "Using it" section and docs/idea/ links, updates CLAUDE.md with timeout/state.json/SIGTERM details, adds headless claude CLI guide, and excludes dummy from biome. |

Dummy web QA fixture

| Layer | File(s) | Summary |
| --- | --- | --- |
| Project scaffold and MCP config | dummy/web/package.json, dummy/web/tsconfig.json, dummy/web/vite.config.ts, dummy/web/index.html, dummy/web/src/vite-env.d.ts, dummy/web/.ui-debugger-mcp.json | Configures a Vite/React/TypeScript/Sass project bound to 127.0.0.1:5179 with MPA mode and a ui-debugger-mcp config pointing at that target. |
| Design tokens and global styles | dummy/web/src/styles/vars.scss, dummy/web/src/styles/global.scss | Defines eight SCSS color tokens and global box-sizing/body/link/root resets. |
| React components and SCSS modules | dummy/web/src/components/* | Adds Header, Hero, Footer, Newsletter, ProductCard, and ProductGrid with SCSS modules, embedding intentional bugs: white-on-near-white subtitle, broken image references, missing onAdd handlers, and no-op newsletter form. |
| App root, product data, and BUGS answer key | dummy/web/src/products.ts, dummy/web/src/App.tsx, dummy/web/src/main.tsx, dummy/web/BUGS.md | Adds four product fixtures, App with broken fetch('/api/featured') and addToCartBroken, React entry point, and BUGS.md enumerating all planted defects with a grader signal table. |

Sequence Diagram(s)

sequenceDiagram
  participant Client as MCP Client
  participant Tool as start_debug tool
  participant DS as DebugService
  participant SP as FileStatePort
  participant Timer as Wall-clock timer

  Client->>Tool: start_debug({ target, goal, timeout })
  Tool->>DS: start({ target, goal, timeoutMs })
  DS->>SP: record({ sessionId, pid, target, goal })
  DS->>Timer: armTimeout(timeoutMs)
  Note over DS,Timer: timer fires → endActive()
  Timer-->>DS: endActive()
  DS->>SP: clear() → markStatus('ended')
  DS->>DS: clearTimeout()
sequenceDiagram
  participant CLI as cli status/stop
  participant Control as runStatus / runStop
  participant FS as state.json / findings.json
  participant Proc as Server PID

  CLI->>Control: runStatus(cwd)
  Control->>FS: readState(statePath)
  Control->>Proc: kill(pid, 0) — liveness check
  Control->>FS: readFindings (live count)
  Control-->>CLI: print summary

  CLI->>Control: runStop(cwd)
  Control->>FS: readState(statePath)
  Control->>Proc: SIGTERM (if alive)
  Control->>FS: markStatus('stopped')
  Control-->>CLI: log result

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • developerz-ai/ui-debugger-mcp#1: This PR adds loadWorkspaceDir to src/config/load.ts, directly extending the loadConfig infrastructure introduced there.
  • developerz-ai/ui-debugger-mcp#3: This PR modifies BrowserAdapter to use normalizeQuery in click/type/waitFor/collect — the same browser adapter area established in that PR.
  • developerz-ai/ui-debugger-mcp#6: Both PRs modify src/agent/loop.ts step handling and createDebugAgent wiring, making them closely coupled in the agent loop lineage.

Suggested labels

claudetm

🐇 A rabbit hops through the code so bright,
State files and timeouts set things right,
The dummy store has bugs galore—
White text on white, cart handlers sore,
But normalizeQuery finds the way,
And budget nudges save the day! 🥕

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title matches the PR’s main themes: improving the inner debug agent for web QA and adding run controls. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 86.36% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 13

🧹 Nitpick comments (3)
src/agent/belt/act.ts (1)

48-76: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Make the act input schema strict at the MCP boundary.

z.object(...) will silently drop unknown operands here, so payloads like { action: 'navigate', url: '...' } degrade into a later requires 'target' runtime error instead of failing at parse time. z.strictObject(...) preserves the flat shape while keeping the boundary fail-fast.

As per coding guidelines, "Use Zod at every boundary, including config, MCP inputs, and findings."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/agent/belt/act.ts` around lines 48 - 76, The ActInputSchema in act.ts is
too permissive because z.object(...) drops unknown fields, so invalid MCP
payloads can pass parsing and fail later with misleading runtime errors. Make
the boundary fail-fast by switching ActInputSchema to a strict Zod object shape,
keeping the same action/target/text/key/direction/amount/networkIdle/timeout
fields while rejecting unknown operands like url at parse time.

Source: Coding guidelines

src/cli/control.test.ts (1)

59-69: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Cover the live-PID stop path too.

These tests only exercise the missing-state and dead-PID branches. Please add one test that stubs process.kill as alive and asserts both the SIGTERM attempt and the stopped state update.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cli/control.test.ts` around lines 59 - 69, Add a new stop-path test in
control.test.ts that covers the live PID branch in runStop: stub process.kill so
the target PID appears alive, then assert that the stop flow attempts SIGTERM
and updates the persisted state to stopped. Reuse the existing runStop,
seedState, and readState helpers so the new test sits alongside the current
no-state and dead-server cases and validates the alive-process behavior end to
end.
src/main.ts (1)

17-29: 📐 Maintainability & Code Quality | 🔵 Trivial | 🏗️ Heavy lift

Move CLI subcommand dispatch out of src/main.ts.

This file now owns both stdio-server bootstrap and control-CLI routing. Keeping init/status/stop in a dedicated entrypoint preserves the repo’s intended seam and keeps src/main.ts single-purpose.

As per coding guidelines, "src/main.ts should only boot the stdio MCP server."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/main.ts` around lines 17 - 29, Move the subcommand routing logic out of
main and keep src/main.ts focused only on stdio MCP server startup. Extract the
init/status/stop dispatch currently in main() into a dedicated CLI entrypoint
and leave src/main.ts with just the server bootstrap path. Use the existing
main() function and runInit/runStatus/runStop symbols to relocate the
control-CLI handling without changing behavior.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@dummy/web/BUGS.md`:
- Around line 11-13: Clarify the React 18 StrictMode note in BUGS.md so it only
refers to duplicated mount effects/remounts in dev, not click handlers. Update
the wording near the StrictMode note to avoid implying the product-4 click error
is a StrictMode duplicate, and keep the guidance scoped to the behavior
triggered by mount effects/remounts.

In `@dummy/web/src/components/ProductCard.module.scss`:
- Line 62: The inline comment in ProductCard.module.scss violates
scss/double-slash-comment-empty-line-before because it is not preceded by a
blank line. Update the SCSS around the affected comment so there is an empty
line before the double-slash comment, keeping the existing styling intent
intact.

In `@dummy/web/src/components/ProductGrid.module.scss`:
- Line 23: The SCSS comment in ProductGrid.module.scss is violating stylelint
due to inconsistent spacing and missing the required empty line before a
double-slash comment. Update the comment formatting near the affected rule so it
has a blank line before the comment and uses consistent spacing, keeping the fix
localized to the stylesheet comment block.

In `@README.md`:
- Around line 116-117: The README still references the old command name, so
update the CLI usage text to use the actual binary name consistently. Search the
CLI intro and related sections around the ui-debugger init wording, and replace
those references with ui-debugger-mcp so the documented command matches the new
binary name everywhere it appears.

In `@src/adapters/browser/query.ts`:
- Around line 20-27: The pre-`text=` detection in `ROLE_NAME` and `CSS_LIKE` is
too permissive and is misclassifying normal labels before the plain-text
fallback in `query.ts`. Narrow the heuristics in the query parsing path so only
clearly intended role-name or CSS selectors are matched, and ensure ambiguous
inputs like quoted phrases or labels containing `>`/`~` still fall through to
text handling. Update the relevant selector classification logic around
`ROLE_NAME` and `CSS_LIKE` so `Sale "50% off"` and similar cases are not treated
as `role=` or CSS.

In `@src/agent/belt/act.ts`:
- Around line 140-143: The `type` branch in `act.ts` should fail fast on missing
input by validating `input.text` with `required()` before calling `resolve()`,
so a bad selector does not mask the real `requires 'text'` error and an
unnecessary adapter lookup is avoided. Update the `case 'type'` flow to check
`text` first, then resolve `input.target` and call `adapter.type(node, text)`
only after both required values are present.

In `@src/agent/belt/observe.ts`:
- Around line 144-152: withTargets() is generating target selectors from scoped
read results, but observe/act are not preserving that scope when replaying the
target. Update the observe.ts flow around withTargets(), readState(), and act so
that scoped reads (within/filters) either do not emit target or include enough
scope information for act to resolve the same element instead of using an
unscoped find({ query: target }).

In `@src/agent/prompts/web-addendum.ts`:
- Around line 53-59: The guidance in web-addendum.ts incorrectly says every node
from observe({kind:"tree"}) has a target, but withTargets() can leave target
undefined for unnamed, non-semantic nodes. Update the wording around the
observe({kind:"tree"})/act({action, target}) instructions to refer only to
actionable nodes, and keep the fallback sentence for nodes without target
consistent with that behavior.

In `@src/cli/control.ts`:
- Around line 25-32: The stop flow in runStop() should not trust state.pid
alone, because it may signal a reused PID and then incorrectly mark the run
stopped even if teardown failed. Persist an additional process identity in
state.json from the server startup path and verify it before calling
process.kill in isAlive() and runStop(); if the identity does not match, skip
signaling and treat it as a stale record. Update the error handling so only the
ESRCH race is considered benign, and make sure the stop status is only written
after a verified successful signal or a confirmed already-dead process.

In `@src/config/load.ts`:
- Around line 111-117: The loadWorkspaceDir helper is swallowing all config
parse/validation errors and falling back to DEFAULT_WORKSPACE, which hides
invalid .ui-debugger-mcp.json data. Keep the existing existsSync/missing-file
fallback, but change the try/catch around parseProject(readFileSync(...)) so
only a missing workspace field defaults while other config errors are surfaced
(either by validating just workspace here or by rethrowing the parseProject
error). Use loadWorkspaceDir and parseProject as the key symbols to update.

In `@src/mcp/tools/start-debug.ts`:
- Around line 48-63: The start-debug tool currently accepts any positive timeout
and converts it to milliseconds in the async handler, which can overflow Node’s
maximum timer delay. Add an upper bound to the Zod schema for timeout in
start-debug.ts so the value stays within safe seconds before it reaches
service.start and gets multiplied into timeoutMs. Use the existing timeout field
definition in the tool schema to enforce the cap at validation time.

In `@src/services/session-builder.ts`:
- Around line 167-169: The crash-path logging is still fire-and-forget, so the
failure details can be lost before agent.log is flushed. Update run() to await
the start/end/error appendLog writes instead of dropping them via logAgent(),
and make sure the error path waits for those writes before rethrowing or
returning. Use the existing logAgent helper and the run() try/catch/finally flow
to locate and replace the best-effort logging.

In `@src/session/state-file.ts`:
- Around line 113-120: The markStatus function is overwriting an existing
terminal state, causing a prior stopped breadcrumb to be replaced by ended
during later shutdown. Update markStatus so that when it is called with status
"ended" and readState(path) already returns a prior record with status
"stopped", it preserves the existing stopped value instead of writing ended.
Keep the current writeState flow for other cases, and use the existing
readState/markStatus/FileStatePort.clear path to locate the fix.
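
The terminal-state rule in the last item can be expressed as a small pure function. This is a sketch under assumptions: the `RunState` shape and the `running` status are hypothetical, and the real `markStatus` reads and writes `state.json` rather than operating on an in-memory object.

```typescript
type RunStatus = "running" | "stopped" | "ended";

interface RunState {
  sessionId: string;
  pid: number;
  status: RunStatus;
}

// A user-requested stop is terminal: a later shutdown must not relabel it.
function nextStatus(current: RunStatus, requested: RunStatus): RunStatus {
  if (current === "stopped" && requested === "ended") return "stopped";
  return requested;
}

// Returns a new record; the caller decides whether to persist it.
function markStatus(state: RunState, requested: RunStatus): RunState {
  return { ...state, status: nextStatus(state.status, requested) };
}
```

Keeping the rule separate from the file I/O makes the ordering case (stop first, shutdown second) easy to test without touching the disk.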


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b8d94e92-3117-40eb-a8f8-98236c253e5e

📥 Commits

Reviewing files that changed from the base of the PR and between fbb3c21 and 54cf2e6.

⛔ Files ignored due to path filters (4)
  • dummy/web/bun.lock is excluded by !**/*.lock
  • dummy/web/public/images/product-1.png is excluded by !**/*.png
  • dummy/web/public/images/product-2.png is excluded by !**/*.png
  • dummy/web/public/images/product-4.png is excluded by !**/*.png
📒 Files selected for processing (59)
  • CLAUDE.md
  • README.md
  • biome.json
  • docs/claude/SKILL.md
  • docs/idea/adapters.md
  • docs/idea/agent-loop.md
  • docs/idea/architecture.md
  • docs/idea/config.md
  • docs/idea/desktop-control.md
  • docs/idea/mcp-tools.md
  • docs/idea/models.md
  • docs/idea/overview.md
  • docs/idea/workspace.md
  • dummy/web/.ui-debugger-mcp.json
  • dummy/web/BUGS.md
  • dummy/web/index.html
  • dummy/web/package.json
  • dummy/web/src/App.tsx
  • dummy/web/src/components/Footer.module.scss
  • dummy/web/src/components/Footer.tsx
  • dummy/web/src/components/Header.module.scss
  • dummy/web/src/components/Header.tsx
  • dummy/web/src/components/Hero.module.scss
  • dummy/web/src/components/Hero.tsx
  • dummy/web/src/components/Newsletter.module.scss
  • dummy/web/src/components/Newsletter.tsx
  • dummy/web/src/components/ProductCard.module.scss
  • dummy/web/src/components/ProductCard.tsx
  • dummy/web/src/components/ProductGrid.module.scss
  • dummy/web/src/components/ProductGrid.tsx
  • dummy/web/src/main.tsx
  • dummy/web/src/products.ts
  • dummy/web/src/styles/global.scss
  • dummy/web/src/styles/vars.scss
  • dummy/web/src/vite-env.d.ts
  • dummy/web/tsconfig.json
  • dummy/web/vite.config.ts
  • src/adapters/browser/browser-adapter.ts
  • src/adapters/browser/query.test.ts
  • src/adapters/browser/query.ts
  • src/agent/belt/act.test.ts
  • src/agent/belt/act.ts
  • src/agent/belt/observe.test.ts
  • src/agent/belt/observe.ts
  • src/agent/loop.test.ts
  • src/agent/loop.ts
  • src/agent/prompts/web-addendum.ts
  • src/cli/control.test.ts
  • src/cli/control.ts
  • src/config/load.ts
  • src/main.ts
  • src/mcp/tools/start-debug.ts
  • src/services/debug-service.test.ts
  • src/services/debug-service.ts
  • src/services/session-builder.ts
  • src/session/findings-store.test.ts
  • src/session/findings-store.ts
  • src/session/state-file.test.ts
  • src/session/state-file.ts

Comment thread dummy/web/BUGS.md Outdated
Comment thread dummy/web/src/components/ProductCard.module.scss
Comment thread dummy/web/src/components/ProductGrid.module.scss
Comment thread README.md Outdated
Comment thread src/adapters/browser/query.ts Outdated
Comment thread src/cli/control.ts
Comment thread src/config/load.ts Outdated
Comment thread src/mcp/tools/start-debug.ts
Comment thread src/services/session-builder.ts
Comment thread src/session/state-file.ts Outdated
- query.ts: gate role= shorthand behind ARIA-role allow-list; narrow
  CSS combinator detection so labels like `Next >` / `A ~ B` stay text
- observe.ts: omit `target` for scoped (within/filters) reads — act's
  unscoped replay could resolve a different element
- act.ts: validate `text` before resolving the type target
- control.ts: mark run stopped only after a verified SIGTERM; only the
  ESRCH race is benign
- config/load.ts: surface invalid-config errors in loadWorkspaceDir,
  keep the missing-file fallback
- start-debug.ts: cap timeout at 2_147_483s (Node timer ceiling)
- session-builder.ts: await crash-path log writes so failures flush
- state-file.ts: preserve terminal `stopped` over a later `ended`
- web-addendum.ts + BUGS.md + README + SCSS comment/lint + docs

Co-Authored-By: Claude <noreply@anthropic.com>
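
The timeout cap in the list above follows from Node's timer limit: `setTimeout` delays are stored as a signed 32-bit integer, so anything above 2147483647 ms is replaced by 1 ms. A minimal sketch of the seconds-to-milliseconds conversion with that ceiling is shown below. The PR enforces the cap in the tool's Zod schema; the plain range check and the `resolveTimeoutMs` name here are illustrative assumptions.

```typescript
const DEFAULT_TIMEOUT_S = 300;
// Largest delay Node's timers accept before falling back to 1 ms.
const MAX_TIMER_MS = 2147483647;
// Largest whole number of seconds that still fits: 2147483.
const MAX_TIMEOUT_S = Math.floor(MAX_TIMER_MS / 1000);

function resolveTimeoutMs(timeoutSeconds?: number): number {
  const seconds = timeoutSeconds ?? DEFAULT_TIMEOUT_S;
  if (!Number.isFinite(seconds) || seconds <= 0) {
    throw new RangeError("timeout must be a positive number of seconds");
  }
  if (seconds > MAX_TIMEOUT_S) {
    throw new RangeError(`timeout must be at most ${MAX_TIMEOUT_S} seconds`);
  }
  return Math.round(seconds * 1000);
}
```

Rejecting an oversized value is safer than clamping it silently, because a caller who asked for a very long run would otherwise get a session that ends almost immediately.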
@sebyx07 sebyx07 merged commit 09ddbc5 into main Jun 29, 2026
2 checks passed
@sebyx07 sebyx07 deleted the feat/effective-web-qa branch June 29, 2026 21:14