Add wukong test command for iOS simulator driving by mfauzaan · Pull Request #240 · mindvalley/wukong-cli

mfauzaan · 2026-04-28T06:45:07Z

Summary

- New wukong test command group with iOS subcommands for driving the iOS Simulator (layout-map, tap, type, scroll, find-element, wait, screenshot, etc.), intended for agent-friendly UI automation.
- Vendors the simClaw bash backend (bin/sim + lib/simclaw/*.sh + swift helper) into cli/src/commands/test/ios/simclaw/. Files are embedded via include_str!, extracted to ~/.config/wukong/scripts/simclaw/ on first use, and version-gated on CARGO_PKG_VERSION so subsequent invocations skip the rewrite.
- Safeguards against silent vendor drift: VENDORED.md records the upstream commit, revendor.sh automates re-sync + diff, and a manifest_matches_tree test fails the build if SIMCLAW_FILES ever disagrees with the on-disk tree.
- Two real-world fixes from end-to-end smoke testing against Settings.app:
  - Raise WDA snapshot caps (maxDepth=60, maxChildren=200) after bootstrap so label-based primitives don't return empty on screens with >25 elements.
  - Harden run_json to slice from the first { or [, since some compound commands print progress lines (e.g. READY: <label>) on stdout before the JSON payload.
- Test JSON schemas are passed through as serde_json::Value until the upstream simClaw schema is contracted.

Test plan

- cargo build and cargo test pass locally
- manifest_matches_tree test passes
- wukong test ios layout-map returns a populated tree against a booted Settings.app
- wukong test ios tap-on <label> works end-to-end on a real simulator
- Re-running test subcommands in quick succession does not re-extract the simclaw tree (version marker hit)
- revendor.sh against a local homebrew-simClaw checkout reports a clean diff

🤖 Generated with Claude Code

Implements `wukong test --platform ios <cmd>` as a Rust CLI layer that delegates to the simClaw bash backend. The backend script is bundled via include_str! and extracted to ~/.config/wukong/scripts/sim.sh on each invocation so a Wukong upgrade always ships the matching script.

Command surface (18 commands): setup, start, status, doctor, teardown, activate, layout-map, title, tap, tap-on, swipe, scroll {up|down|to}, type, wait [--stable], find-element, hit-test, describe, screenshot.

Architecture: PlatformBackend async trait with an IosBackend impl. Android slots in via the same trait and --platform flag without touching the iOS path. Three run helpers (run_streaming, run_capture, run_json) cover the output-shape variants; stderr is inherited so the user sees script output in real time. Script exit codes are surfaced as TestError::ScriptFailed. The bundled sim.sh is a placeholder today; the real simClaw script replaces it before release.

Out of scope: Android backend, skills installer (separate RFC), JSON output for `status` (pending a --json flag from the script team).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
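The capture-and-map behaviour described in this commit can be sketched roughly as follows. This is an illustration only, not the PR's code: the real helpers are async, and the `Spawn` variant, the `code` field, and the exact `run_capture` signature are assumptions. Only the names `run_capture` and `TestError::ScriptFailed` come from the commit message.

```rust
use std::process::{Command, Stdio};

/// Hypothetical error type; only the `ScriptFailed` name appears in the PR.
#[derive(Debug, PartialEq)]
pub enum TestError {
    /// The script could not be started at all (assumed variant).
    Spawn(String),
    /// The script ran but exited non-zero (`None` if killed by a signal).
    ScriptFailed { code: Option<i32> },
}

/// Runs `program` with `args`, capturing stdout while stderr is inherited,
/// so the user still sees the script's own messages as they are printed.
pub fn run_capture(program: &str, args: &[&str]) -> Result<String, TestError> {
    let output = Command::new(program)
        .args(args)
        .stdin(Stdio::null())
        .stdout(Stdio::piped())
        .stderr(Stdio::inherit())
        .output()
        .map_err(|e| TestError::Spawn(e.to_string()))?;

    if !output.status.success() {
        // Surface the script's exit code instead of swallowing it.
        return Err(TestError::ScriptFailed {
            code: output.status.code(),
        });
    }
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}
```

Under these assumptions, a caller would invoke something like `run_capture(entry_script, &["title"])` and match on `ScriptFailed` to report the backend's exit code.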

Replaces the placeholder sim.sh with the full simClaw tree copied from the homebrew-simClaw repo: bin/sim + lib/simclaw/*.sh + the swift helper. The sim script's own SIM_LIB resolution works unchanged because the extracted layout matches its repo layout exactly.

The iOS backend now embeds every file via a static SIMCLAW_FILES manifest, extracts the whole tree to ~/.config/wukong/scripts/simclaw/ on each invocation, and returns bin/sim as the entry point. Only bin/sim gets the executable bit; the helpers are sourced, not invoked.

Smoke-tested against a machine with no booted simulator: the real script boots, fails cleanly with the expected message, and Wukong maps the non-zero exit to TestError::ScriptFailed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
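A minimal sketch of the extraction step, assuming the manifest is a slice of `(relative_path, contents)` pairs. The function name `extract_tree` and the 0o644 mode for helpers are assumptions; in the PR the contents come from include_str!, which is replaced here by plain string arguments so the sketch stands alone.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::path::{Path, PathBuf};

/// Entry point inside the extracted tree, as named in the commit message.
const ENTRY_SCRIPT: &str = "bin/sim";

/// Writes every manifest entry under `root` and returns the entry script path.
/// Only the entry script is made executable; helpers are sourced by it.
pub fn extract_tree(root: &Path, files: &[(&str, &str)]) -> io::Result<PathBuf> {
    for (rel, contents) in files {
        let dest = root.join(rel);
        if let Some(parent) = dest.parent() {
            fs::create_dir_all(parent)?;
        }
        fs::write(&dest, contents)?;
        // 0o644 for helpers is an assumption; the PR only states that
        // bin/sim is the single file given the executable bit.
        let mode = if *rel == ENTRY_SCRIPT { 0o755 } else { 0o644 };
        fs::set_permissions(&dest, fs::Permissions::from_mode(mode))?;
    }
    Ok(root.join(ENTRY_SCRIPT))
}
```

Because the relative paths are preserved, a script that locates its library directory relative to its own location keeps working after extraction.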

Three safeguards so updates from the simClaw team can't silently break the iOS integration:

1. VENDORED.md records the upstream repo, the vendored commit SHA, and the update procedure. Locks down the version for audit and makes provenance obvious to anyone browsing the tree.
2. revendor.sh automates the copy + diff + SHA report. Run it with a path to a local homebrew-simClaw checkout; it prints any added or removed files so the caller knows exactly what SIMCLAW_FILES in ios/mod.rs must change to stay in sync.
3. manifest_matches_tree test in ios::tests fails if the on-disk simclaw/ tree contains any bin/ or lib/ file not listed in SIMCLAW_FILES (and vice versa). This catches the drift case where upstream adds a new lib file and a re-vendor forgets to wire it through: compile succeeds but sourcing fails at runtime.

The test explicitly ignores VENDORED.md, revendor.sh, and anything outside bin/ + lib/ so documentation/tooling doesn't confuse the guard.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
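The drift guard in point 3 amounts to a two-way set difference restricted to bin/ and lib/. A rough sketch, assuming both sides are already available as lists of relative paths; the `Drift` struct and the `manifest_drift` name are hypothetical and not taken from the PR:

```rust
use std::collections::BTreeSet;

/// Hypothetical report of what differs between manifest and disk.
#[derive(Debug, PartialEq, Eq)]
pub struct Drift {
    /// On disk under bin/ or lib/ but not listed in the manifest.
    pub missing_from_manifest: Vec<String>,
    /// Listed in the manifest but absent from the on-disk tree.
    pub missing_from_disk: Vec<String>,
}

/// Only bin/ and lib/ are guarded; docs and tooling are ignored.
fn is_guarded(path: &str) -> bool {
    path.starts_with("bin/") || path.starts_with("lib/")
}

/// Compares manifest keys with on-disk relative paths in both directions.
pub fn manifest_drift(manifest: &[&str], on_disk: &[&str]) -> Drift {
    let listed: BTreeSet<&str> = manifest.iter().copied().filter(|p| is_guarded(p)).collect();
    let found: BTreeSet<&str> = on_disk.iter().copied().filter(|p| is_guarded(p)).collect();
    Drift {
        missing_from_manifest: found.difference(&listed).map(|p| p.to_string()).collect(),
        missing_from_disk: listed.difference(&found).map(|p| p.to_string()).collect(),
    }
}
```

A test built on this would assert that both vectors are empty, so a forgotten manifest entry fails at test time rather than when the script tries to source the missing file.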

Five tweaks from a /simplify pass:

- script_path() now skips the 15-file rewrite when a .wukong-version marker inside the extracted dir matches CARGO_PKG_VERSION. Agent workflows that fire many test subcommands no longer pay extraction cost per call; upgrades still force a fresh extract because the marker disagrees.
- Collapse the SIMCLAW_FILES manifest with a simclaw_files! macro so each entry is a single literal. Removes the hand-typed simclaw/ prefix duplicated in every tuple and the drift risk where the key string and include_str! path could disagree.
- Replace to_string_lossy() in collect_tree with to_str().expect() so a non-UTF-8 path fails loudly instead of silently producing a key that can't match the manifest.
- revendor.sh uses `cp -R` for whole bin/ and lib/ subtrees instead of unquoted globs. Future-proof against upstream adding files whose names contain spaces.
- Drop redundant doc comments on EXTRACT_SUBDIR / ENTRY_SCRIPT.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
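The version gate from the first tweak can be sketched as below. The marker file name comes from the commit message; the function names, the closure-based `extract` parameter, and the choice to write the marker last are assumptions made for illustration.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Marker file name, as given in the commit message.
const VERSION_MARKER: &str = ".wukong-version";

/// True when the extracted tree was written by the given CLI version.
pub fn is_current(dir: &Path, version: &str) -> bool {
    fs::read_to_string(dir.join(VERSION_MARKER))
        .map(|found| found.trim() == version)
        .unwrap_or(false)
}

/// Runs `extract` only when the marker is missing or stale.
/// Returns `Ok(true)` if an extraction happened, `Ok(false)` if it was skipped.
pub fn ensure_extracted(
    dir: &Path,
    version: &str,
    extract: impl FnOnce(&Path) -> io::Result<()>,
) -> io::Result<bool> {
    if is_current(dir, version) {
        return Ok(false);
    }
    fs::create_dir_all(dir)?;
    extract(dir)?;
    // The marker is written last so an interrupted extraction is retried.
    fs::write(dir.join(VERSION_MARKER), version)?;
    Ok(true)
}
```

In the CLI itself the version argument would presumably be `env!("CARGO_PKG_VERSION")`, so any upgrade changes the expected marker and forces a fresh extract.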

End-to-end smoke testing against the real Settings.app revealed that the LayoutMap and Element structs I initially wrote based on the RFC docs don't match what the simClaw script actually emits. For example, the script uses navigation.title / interactive / scroll.above_fold, not the guessed screen_title / elements / scroll_hints.

Since the schema is owned by the simClaw team and hasn't been frozen into a stable contract, pass the JSON through as serde_json::Value so consumers can work with whatever shape the script emits today. Retighten to typed structs once the upstream schema is contracted.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Two real-world bugs found while driving Settings.app end-to-end:

1. Upstream simClaw creates the WDA session with snapshotMaxDepth=15 and snapshotMaxChildren=25, which truncates Settings.app's tree at exactly 25 elements (root + nav + title). Every label-based primitive (layout-map, tap-on, find-element, scroll-to-visible, wait) came back empty. After setup()/wda_start() succeeds, we now POST permissive settings (60/200) to /appium/settings via reqwest. This is transparent to the user and a best-effort silent no-op if WDA isn't reachable. Read the session ID from whichever bootstrap cache the script wrote (UDID-keyed or the _default.json fallback).
2. cmd_tap_and_wait prints a "READY: <label>" progress line on stdout before cmd_layout_map emits the JSON payload, so serde_json chokes on the prefix. Harden run_json to slice from the first `{` or `[`, ignoring any leading progress lines. Generally useful, since bash scripts commonly mix progress and structured output on stdout.

Both workarounds can be removed once the simClaw team raises (or exposes) the snapshot caps and moves READY/progress lines to stderr.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
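The slicing fix in point 2 is small enough to show in isolation. A sketch, assuming the helper receives the script's stdout as a string; the name `json_payload` is hypothetical, and the actual run_json presumably hands the resulting slice to serde_json afterwards.

```rust
/// Returns the part of `stdout` starting at the first `{` or `[`,
/// or `None` when no JSON-looking payload is present.
pub fn json_payload(stdout: &str) -> Option<&str> {
    // An array of chars is a valid pattern: it matches any one of them.
    stdout.find(['{', '[']).map(|start| &stdout[start..])
}
```

One limitation of this approach: a progress line that itself contains `{` or `[` would start the slice too early, which is one more reason the commit treats it as a workaround until progress output moves to stderr.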

- Drop unused Deserialize/Serialize imports in test/platform.rs (LayoutMap and Element are now serde_json::Value aliases, so neither derive is needed).
- Replace closure char comparison with array pattern in run_json (manual_pattern_char_comparison clippy lint).

Also re-run cargo fmt under the 1.81 toolchain to match CI exactly; 1.90's rustfmt produced a slightly different layout that CI's 1.81 rejected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Switch upstream from homebrew-simClaw to mindvalley/mv-simclaw-ios, which is now the source of truth; after the migration the homebrew-simClaw repo carries only the formula.

Re-vendors bin/sim + lib/simclaw/ and adds lib/simclaw/login.sh to the SIMCLAW_FILES manifest so the drift-guard test stays green. login.sh is vendored dormant: IosBackend does not dispatch to it. Login flows belong at the user/skill layer, composed from tap-on/type/wait primitives, not as a wukong CLI verb.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

# Conflicts:
#   cli/src/commands/mod.rs
#   cli/tests/snapshots/completion__wukong_completion_bash.snap
#   cli/tests/snapshots/completion__wukong_completion_fish.snap
#   cli/tests/snapshots/completion__wukong_completion_zsh.snap

mfauzaan and others added 12 commits April 21, 2026 07:36

style: cargo fmt

278f871

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

test(completion): update snapshots for test subcommand

e3215a3

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

test: update wukong help snapshot for test subcommand

7dbc36a

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

maail self-requested a review May 18, 2026 03:53

maail approved these changes May 18, 2026

mfauzaan merged commit 8e2089d into main May 18, 2026
6 checks passed

mfauzaan merged 12 commits into main from fauzaan/add-test-command
