
Add wukong test command for iOS simulator driving#240

Merged
mfauzaan merged 12 commits into main from fauzaan/add-test-command on May 18, 2026

@mfauzaan


Summary

  • New wukong test command group with iOS subcommands for driving the iOS Simulator (layout-map, tap, type, scroll, find-element, wait, screenshot, etc.) intended for agent-friendly UI automation.
  • Vendors the simClaw bash backend (bin/sim + lib/simclaw/*.sh + swift helper) into cli/src/commands/test/ios/simclaw/. Files are embedded via include_str!, extracted to ~/.config/wukong/scripts/simclaw/ on first use, and version-gated on CARGO_PKG_VERSION so subsequent invocations skip the rewrite.
  • Safeguards against silent vendor drift: VENDORED.md records the upstream commit, revendor.sh automates re-sync + diff, and a manifest_matches_tree test fails the build if SIMCLAW_FILES ever disagrees with the on-disk tree.
  • Two real-world fixes from an end-to-end smoke test against Settings.app:
    • Raise WDA snapshot caps (snapshotMaxDepth=60, snapshotMaxChildren=200) after bootstrap so label-based primitives don't return empty on screens with more than 25 elements.
    • Harden run_json to slice from the first { or [, since some compound commands print progress lines (e.g. READY: <label>) on stdout before the JSON payload.
  • Test JSON schemas are passed through as serde_json::Value until the upstream simClaw schema is contracted.

Test plan

  • cargo build and cargo test pass locally
  • manifest_matches_tree test passes
  • wukong test ios layout-map returns a populated tree against a booted Settings.app
  • wukong test ios tap-on <label> works end-to-end on a real simulator
  • Re-running test subcommands in quick succession does not re-extract the simclaw tree (version marker hit)
  • revendor.sh against a local homebrew-simClaw checkout reports a clean diff

🤖 Generated with Claude Code

mfauzaan and others added 12 commits April 21, 2026 07:36
Implements `wukong test --platform ios <cmd>` as a Rust CLI layer that
delegates to the simClaw bash backend. The backend script is bundled via
include_str! and extracted to ~/.config/wukong/scripts/sim.sh on each
invocation so a Wukong upgrade always ships the matching script.

Command surface (18 commands): setup, start, status, doctor, teardown,
activate, layout-map, title, tap, tap-on, swipe, scroll {up|down|to}, type,
wait [--stable], find-element, hit-test, describe, screenshot.

Architecture: PlatformBackend async trait with an IosBackend impl. Android
slots in via the same trait and --platform flag without touching the iOS
path. Three run helpers (run_streaming, run_capture, run_json) cover the
output-shape variants; stderr is inherited so the user sees script output
in real time. Script exit codes are surfaced as TestError::ScriptFailed.

The bundled sim.sh is a placeholder today — the real simClaw script
replaces it before release.

Out of scope: Android backend, skills installer (separate RFC), JSON output
for `status` (pending a --json flag from the script team).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
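
A minimal sketch of how a capture-style run helper could map a non-zero script exit to `TestError::ScriptFailed`, as the commit message above describes. Only the names `run_capture` and `TestError::ScriptFailed` come from the PR; the `Spawn` variant, the `code` field, and the function signature are assumptions for illustration. The sketch is synchronous and uses only `std::process`, whereas the PR describes an async trait.

```rust
use std::io;
use std::process::{Command, Stdio};

/// Hypothetical error type. Only the `ScriptFailed` variant name is taken
/// from the PR; the fields and the `Spawn` variant are assumptions.
#[derive(Debug)]
pub enum TestError {
    /// The script could not be started at all.
    Spawn(io::Error),
    /// The script ran but exited non-zero (`None` means killed by a signal).
    ScriptFailed { code: Option<i32> },
}

/// Runs `program` with `args`, capturing stdout while stderr is inherited
/// so the user sees script diagnostics in real time.
pub fn run_capture(program: &str, args: &[&str]) -> Result<String, TestError> {
    let output = Command::new(program)
        .args(args)
        .stdin(Stdio::null())
        .stderr(Stdio::inherit())
        .output()
        .map_err(TestError::Spawn)?;

    if !output.status.success() {
        return Err(TestError::ScriptFailed {
            code: output.status.code(),
        });
    }
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}
```

Calling `run_capture("sh", &["-c", "exit 3"])` would yield `Err(TestError::ScriptFailed { code: Some(3) })` under these assumptions.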
Replaces the placeholder sim.sh with the full simClaw tree copied from
the homebrew-simClaw repo: bin/sim + lib/simclaw/*.sh + the swift helper.
The sim script's own SIM_LIB resolution works unchanged because the
extracted layout matches its repo layout exactly.

The iOS backend now embeds every file via a static SIMCLAW_FILES
manifest, extracts the whole tree to ~/.config/wukong/scripts/simclaw/
on each invocation, and returns bin/sim as the entry point. Only bin/sim
gets the executable bit; the helpers are sourced, not invoked.

Smoke-tested against a machine with no booted simulator: the real script
boots, fails cleanly with the expected message, and Wukong maps the
non-zero exit to TestError::ScriptFailed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
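
The extraction step described above can be sketched as follows, assuming a Unix host. `ENTRY_SCRIPT` is a name used in the PR, but its value here, the `FILES` table, the `lib/simclaw/core.sh` path, and both function names are hypothetical stand-ins; in the PR the file contents come from `include_str!` rather than inline literals.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::path::{Path, PathBuf};

/// Stand-in for the embedded manifest: (relative path, contents).
/// The entries are placeholders, not the real simClaw files.
const FILES: &[(&str, &str)] = &[
    ("bin/sim", "#!/usr/bin/env bash\necho placeholder\n"),
    ("lib/simclaw/core.sh", "# placeholder helper, sourced by bin/sim\n"),
];

/// Assumed value: the entry point relative to the extraction root.
const ENTRY_SCRIPT: &str = "bin/sim";

/// Writes every manifest entry under `root` and returns the entry script path.
/// Only the entry script gets the executable bit; helpers are sourced.
pub fn extract_tree(root: &Path) -> io::Result<PathBuf> {
    for (rel, contents) in FILES {
        let dest = root.join(rel);
        if let Some(parent) = dest.parent() {
            fs::create_dir_all(parent)?;
        }
        fs::write(&dest, contents)?;
        let mode = if *rel == ENTRY_SCRIPT { 0o755 } else { 0o644 };
        fs::set_permissions(&dest, fs::Permissions::from_mode(mode))?;
    }
    Ok(root.join(ENTRY_SCRIPT))
}

/// Reports whether any executable bit is set on `path`.
pub fn is_executable(path: &Path) -> io::Result<bool> {
    Ok(fs::metadata(path)?.permissions().mode() & 0o111 != 0)
}
```

Because the relative paths are preserved, a script that resolves its library directory relative to its own location keeps working after extraction.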
Three safeguards so updates from the simClaw team can't silently break
the iOS integration:

1. VENDORED.md records the upstream repo, the vendored commit SHA, and
   the update procedure. Locks down the version for audit and makes
   provenance obvious to anyone browsing the tree.

2. revendor.sh automates the copy + diff + SHA report. Run it with a
   path to a local homebrew-simClaw checkout; it prints any added or
   removed files so the caller knows exactly what SIMCLAW_FILES in
   ios/mod.rs must change to stay in sync.

3. manifest_matches_tree test in ios::tests fails if the on-disk
   simclaw/ tree contains any bin/ or lib/ file not listed in
   SIMCLAW_FILES (and vice versa). This catches the drift case where
   upstream adds a new lib file and a re-vendor forgets to wire it
   through — compile succeeds but sourcing fails at runtime.

The test explicitly ignores VENDORED.md, revendor.sh, and anything
outside bin/ + lib/ so documentation/tooling doesn't confuse the guard.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
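
The drift guard in point 3 amounts to a two-way set comparison. A rough sketch, assuming the manifest keys and the on-disk paths are both available as relative path strings; `Drift`, `manifest_drift`, and `is_guarded` are hypothetical names, and the PR's actual test walks the filesystem rather than taking slices.

```rust
use std::collections::BTreeSet;

/// Result of comparing the manifest with the on-disk tree.
#[derive(Debug, PartialEq, Eq)]
pub struct Drift {
    /// Present on disk under bin/ or lib/ but not listed in the manifest.
    pub missing_from_manifest: Vec<String>,
    /// Listed in the manifest but absent from disk.
    pub missing_from_disk: Vec<String>,
}

/// Only bin/ and lib/ are guarded; docs and tooling are ignored.
fn is_guarded(path: &str) -> bool {
    path.starts_with("bin/") || path.starts_with("lib/")
}

/// Compares manifest keys with on-disk relative paths in both directions.
pub fn manifest_drift(manifest: &[&str], on_disk: &[&str]) -> Drift {
    let listed: BTreeSet<&str> = manifest.iter().copied().collect();
    let present: BTreeSet<&str> = on_disk
        .iter()
        .copied()
        .filter(|p| is_guarded(p))
        .collect();

    Drift {
        missing_from_manifest: present.difference(&listed).map(|p| p.to_string()).collect(),
        missing_from_disk: listed.difference(&present).map(|p| p.to_string()).collect(),
    }
}
```

A test in the style of manifest_matches_tree would then assert that both vectors are empty, so a forgotten manifest entry fails the build instead of failing when the script sources the file at runtime.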
Five tweaks from a /simplify pass:

- script_path() now skips the 15-file rewrite when a .wukong-version
  marker inside the extracted dir matches CARGO_PKG_VERSION. Agent
  workflows that fire many test subcommands no longer pay extraction
  cost per call; upgrades still force a fresh extract because the
  marker disagrees.

- Collapse the SIMCLAW_FILES manifest with a simclaw_files! macro so
  each entry is a single literal. Removes the hand-typed simclaw/
  prefix duplicated in every tuple and the drift risk where the key
  string and include_str! path could disagree.

- Replace to_string_lossy() in collect_tree with to_str().expect() so
  a non-UTF-8 path fails loudly instead of silently producing a key
  that can't match the manifest.

- revendor.sh uses `cp -R` for whole bin/ and lib/ subtrees instead of
  unquoted globs. Future-proof against upstream adding files whose
  names contain spaces.

- Drop redundant doc comments on EXTRACT_SUBDIR / ENTRY_SCRIPT.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
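
The version-marker gate from the first bullet can be sketched like this. The `.wukong-version` file name comes from the commit message; the function names are hypothetical, and the version is passed as a parameter here so the snippet compiles outside Cargo (the PR compares against CARGO_PKG_VERSION).

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Marker file name as described in the commit message.
const MARKER_FILE: &str = ".wukong-version";

/// Returns true when the extracted tree is missing or was written by a
/// different version, meaning the files should be rewritten.
pub fn needs_extract(dir: &Path, current_version: &str) -> bool {
    match fs::read_to_string(dir.join(MARKER_FILE)) {
        Ok(found) => found.trim() != current_version,
        // A missing or unreadable marker is treated as "extract again".
        Err(_) => true,
    }
}

/// Records the version that produced the extracted tree. Intended to be
/// called last, after every file has been written successfully.
pub fn write_marker(dir: &Path, current_version: &str) -> io::Result<()> {
    fs::create_dir_all(dir)?;
    fs::write(dir.join(MARKER_FILE), current_version)
}
```

Writing the marker only after a complete extraction means an interrupted extract leaves no marker behind, so the next invocation retries instead of trusting a partial tree.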
End-to-end smoke against real Settings.app revealed that the LayoutMap
and Element structs I initially wrote based on the RFC docs don't match
what the simClaw script actually emits — e.g. the script uses
navigation.title / interactive / scroll.above_fold, not the guessed
screen_title / elements / scroll_hints.

Since the schema is owned by the simClaw team and hasn't been frozen
into a stable contract, pass the JSON through as serde_json::Value so
consumers can work with whatever shape the script emits today. Retighten
to typed structs once the upstream schema is contracted.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two real-world bugs found while driving Settings.app end-to-end:

1. Upstream simClaw creates the WDA session with snapshotMaxDepth=15 and
   snapshotMaxChildren=25, which truncates Settings.app's tree at exactly
   25 elements (root + nav + title). Every label-based primitive
   (layout-map, tap-on, find-element, scroll-to-visible, wait) came back
   empty. After setup()/wda_start() succeeds, we now POST permissive
   settings (snapshotMaxDepth=60, snapshotMaxChildren=200) to
   /appium/settings via reqwest. The call is transparent to the user and
   is a best-effort, silent no-op if WDA isn't reachable. The session ID
   is read from whichever bootstrap cache the script wrote (UDID-keyed
   or the _default.json fallback).

2. cmd_tap_and_wait prints a "READY: <label>" progress line on stdout
   before cmd_layout_map emits the JSON payload, so serde_json chokes on
   the prefix. Harden run_json to slice from the first `{` or `[`,
   ignoring any leading progress lines. Generally useful — bash scripts
   commonly mix progress and structured output on stdout.

Both workarounds can be removed once the simClaw team raises (or exposes)
the snapshot caps and moves READY/progress lines to stderr.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
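
The run_json hardening in point 2 can be approximated with a small slicing helper. `json_payload` is a hypothetical name, and the sketch stops before the serde_json parsing step that the PR performs on the result.

```rust
/// Returns the part of `stdout` starting at the first `{` or `[`,
/// skipping any progress lines printed before the JSON payload.
/// Returns `None` when no JSON-looking start character is present.
pub fn json_payload(stdout: &str) -> Option<&str> {
    stdout
        .find(['{', '['])
        .map(|start| stdout[start..].trim_end())
}
```

One limitation of this approach: a progress line that itself contains `{` or `[` would be mistaken for the start of the payload, which is one more reason to move progress output to stderr upstream.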
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Drop unused Deserialize/Serialize imports in test/platform.rs (LayoutMap
  and Element are now serde_json::Value aliases — neither derive is needed).
- Replace closure char comparison with array pattern in run_json
  (manual_pattern_char_comparison clippy lint).

Also re-run cargo fmt under the 1.81 toolchain to match CI exactly; 1.90's
rustfmt produced a slightly different layout that CI's 1.81 rejected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Switch upstream from homebrew-simClaw to mindvalley/mv-simclaw-ios, which
is now the source of truth (the homebrew-simClaw repo has carried only the
formula since the migration). Re-vendors bin/sim + lib/simclaw/ and adds
lib/simclaw/login.sh to the SIMCLAW_FILES manifest so the drift-guard
test stays green. login.sh is vendored but dormant: IosBackend does not
dispatch to it. Login flows belong at the user/skill layer, composed from
tap-on/type/wait primitives, not as a wukong CLI verb.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Conflicts:
#	cli/src/commands/mod.rs
#	cli/tests/snapshots/completion__wukong_completion_bash.snap
#	cli/tests/snapshots/completion__wukong_completion_fish.snap
#	cli/tests/snapshots/completion__wukong_completion_zsh.snap
@maail maail self-requested a review May 18, 2026 03:53
@mfauzaan mfauzaan merged commit 8e2089d into main May 18, 2026
6 checks passed