Add wukong test command for iOS simulator driving #240
Merged
Implements `wukong test --platform ios <cmd>` as a Rust CLI layer that
delegates to the simClaw bash backend. The backend script is bundled via
include_str! and extracted to ~/.config/wukong/scripts/sim.sh on each
invocation so a Wukong upgrade always ships the matching script.
Command surface (18 commands): setup, start, status, doctor, teardown,
activate, layout-map, title, tap, tap-on, swipe, scroll {up|down|to}, type,
wait [--stable], find-element, hit-test, describe, screenshot.
Architecture: PlatformBackend async trait with an IosBackend impl. Android
slots in via the same trait and --platform flag without touching the iOS
path. Three run helpers (run_streaming, run_capture, run_json) cover the
output-shape variants; stderr is inherited so the user sees script output
in real time. Script exit codes are surfaced as TestError::ScriptFailed.
The bundled sim.sh is a placeholder today — the real simClaw script
replaces it before release.
Out of scope: Android backend, skills installer (separate RFC), JSON output
for `status` (pending a --json flag from the script team).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
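For illustration, the exit-code mapping described above could look roughly like the following. This is a minimal synchronous sketch using only the standard library, not the PR's code: the `run_capture` and `TestError::ScriptFailed` names come from the commit message, while the variant's fields, the `Spawn` variant, and the function signature are assumptions.

```rust
use std::process::{Command, Stdio};

/// Hypothetical error type; only the `ScriptFailed` name comes from the PR.
#[derive(Debug, PartialEq)]
pub enum TestError {
    /// The backend script could not be spawned at all (assumed variant).
    Spawn(String),
    /// The script ran but exited non-zero; `code` is `None` if a signal killed it.
    ScriptFailed { code: Option<i32> },
}

/// Run `program` with `args`, capturing stdout as a `String`.
/// stderr is inherited, so the script's diagnostics reach the terminal live.
pub fn run_capture(program: &str, args: &[&str]) -> Result<String, TestError> {
    let output = Command::new(program)
        .args(args)
        .stdin(Stdio::null())
        .stderr(Stdio::inherit())
        .output()
        .map_err(|e| TestError::Spawn(e.to_string()))?;

    if !output.status.success() {
        return Err(TestError::ScriptFailed {
            code: output.status.code(),
        });
    }
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}
```

A streaming variant would presumably inherit stdout as well, and a JSON variant would parse the captured string; both are left out here.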
Replaces the placeholder sim.sh with the full simClaw tree copied from the homebrew-simClaw repo: bin/sim + lib/simclaw/*.sh + the swift helper. The sim script's own SIM_LIB resolution works unchanged because the extracted layout matches its repo layout exactly.
The iOS backend now embeds every file via a static SIMCLAW_FILES manifest, extracts the whole tree to ~/.config/wukong/scripts/simclaw/ on each invocation, and returns bin/sim as the entry point. Only bin/sim gets the executable bit; the helpers are sourced, not invoked.
Smoke-tested against a machine with no booted simulator: the real script boots, fails cleanly with the expected message, and Wukong maps the non-zero exit to TestError::ScriptFailed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
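The extraction step can be sketched as below. This is a Unix-only illustration under stated assumptions: the `FILES` table stands in for the real SIMCLAW_FILES manifest (which embeds contents via include_str!), `lib/simclaw/core.sh` and the `extract_tree` function are hypothetical, and the 0o755/0o644 modes are a guess at what "only bin/sim gets the executable bit" means in practice.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::path::{Path, PathBuf};

/// Illustrative stand-in for the embedded manifest: (relative path, contents).
/// The real manifest would use `include_str!` for the contents.
const FILES: &[(&str, &str)] = &[
    ("bin/sim", "#!/usr/bin/env bash\necho placeholder\n"),
    ("lib/simclaw/core.sh", "# hypothetical sourced helper\n"),
];

/// Relative path of the script that gets invoked.
const ENTRY_SCRIPT: &str = "bin/sim";

/// Write every manifest entry under `root` and return the entry script's path.
pub fn extract_tree(root: &Path) -> io::Result<PathBuf> {
    for (rel, contents) in FILES {
        let dest = root.join(rel);
        if let Some(parent) = dest.parent() {
            fs::create_dir_all(parent)?;
        }
        fs::write(&dest, contents)?;
        // Helpers are sourced, not executed, so only the entry point is +x.
        let mode = if *rel == ENTRY_SCRIPT { 0o755 } else { 0o644 };
        fs::set_permissions(&dest, fs::Permissions::from_mode(mode))?;
    }
    Ok(root.join(ENTRY_SCRIPT))
}
```

Because the relative paths are preserved, a script that locates its lib directory relative to its own location keeps working after extraction.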
Three safeguards so updates from the simClaw team can't silently break the iOS integration:
1. VENDORED.md records the upstream repo, the vendored commit SHA, and the update procedure. Locks down the version for audit and makes provenance obvious to anyone browsing the tree.
2. revendor.sh automates the copy + diff + SHA report. Run it with a path to a local homebrew-simClaw checkout; it prints any added or removed files so the caller knows exactly what SIMCLAW_FILES in ios/mod.rs must change to stay in sync.
3. manifest_matches_tree test in ios::tests fails if the on-disk simclaw/ tree contains any bin/ or lib/ file not listed in SIMCLAW_FILES (and vice versa). This catches the drift case where upstream adds a new lib file and a re-vendor forgets to wire it through: compile succeeds but sourcing fails at runtime.
The test explicitly ignores VENDORED.md, revendor.sh, and anything outside bin/ + lib/ so documentation/tooling doesn't confuse the guard.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
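The drift guard in item 3 amounts to a two-way set comparison. The sketch below shows one way to write it; `collect_tree` is named in a later commit, but its signature here, the `manifest_drift` helper, and the return shape are assumptions, not the PR's test.

```rust
use std::collections::BTreeSet;
use std::fs;
use std::io;
use std::path::Path;

/// Recursively collect files under `dir`, keyed by their path relative to `root`.
fn collect_tree(root: &Path, dir: &Path, out: &mut BTreeSet<String>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            collect_tree(root, &path, out)?;
        } else {
            let rel = path.strip_prefix(root).expect("path is under root");
            // Fail loudly on non-UTF-8 names instead of producing a lossy key.
            let key = rel.to_str().expect("non-UTF-8 path in vendored tree");
            out.insert(key.replace('\\', "/"));
        }
    }
    Ok(())
}

/// Returns (files on disk but not in the manifest, manifest entries missing on disk).
/// Only bin/ and lib/ are considered, so docs and tooling are ignored.
pub fn manifest_drift(root: &Path, manifest: &[&str]) -> io::Result<(Vec<String>, Vec<String>)> {
    let mut on_disk = BTreeSet::new();
    collect_tree(root, root, &mut on_disk)?;
    on_disk.retain(|k| k.starts_with("bin/") || k.starts_with("lib/"));

    let listed: BTreeSet<String> = manifest.iter().map(|s| s.to_string()).collect();
    let unlisted: Vec<String> = on_disk.difference(&listed).cloned().collect();
    let missing: Vec<String> = listed.difference(&on_disk).cloned().collect();
    Ok((unlisted, missing))
}
```

A test built on this would assert that both returned lists are empty and print them on failure, which tells the person re-vendoring exactly which manifest entries to add or drop.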
Five tweaks from a /simplify pass:
- script_path() now skips the 15-file rewrite when a .wukong-version marker inside the extracted dir matches CARGO_PKG_VERSION. Agent workflows that fire many test subcommands no longer pay extraction cost per call; upgrades still force a fresh extract because the marker disagrees.
- Collapse the SIMCLAW_FILES manifest with a simclaw_files! macro so each entry is a single literal. Removes the hand-typed simclaw/ prefix duplicated in every tuple and the drift risk where the key string and include_str! path could disagree.
- Replace to_string_lossy() in collect_tree with to_str().expect() so a non-UTF-8 path fails loudly instead of silently producing a key that can't match the manifest.
- revendor.sh uses `cp -R` for whole bin/ and lib/ subtrees instead of unquoted globs. Future-proof against upstream adding files whose names contain spaces.
- Drop redundant doc comments on EXTRACT_SUBDIR / ENTRY_SCRIPT.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
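The version-marker gate from the first tweak can be sketched as follows. The `.wukong-version` file name comes from the commit message; the `is_current` and `ensure_extracted` helpers, and passing the extraction step in as a closure, are assumptions made to keep the example self-contained. In the real code the version would presumably be `env!("CARGO_PKG_VERSION")`.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Marker file name taken from the commit message.
const VERSION_MARKER: &str = ".wukong-version";

/// True when `dir` holds a marker whose contents equal `version`.
pub fn is_current(dir: &Path, version: &str) -> bool {
    fs::read_to_string(dir.join(VERSION_MARKER))
        .map(|found| found.trim() == version)
        .unwrap_or(false) // missing or unreadable marker means "extract again"
}

/// Run `extract` only when the marker is absent or stale.
/// Returns `Ok(true)` if extraction happened, `Ok(false)` if it was skipped.
pub fn ensure_extracted(
    dir: &Path,
    version: &str,
    extract: impl FnOnce(&Path) -> io::Result<()>,
) -> io::Result<bool> {
    if is_current(dir, version) {
        return Ok(false);
    }
    fs::create_dir_all(dir)?;
    extract(dir)?;
    // Write the marker last so an interrupted extract is retried next time.
    fs::write(dir.join(VERSION_MARKER), version)?;
    Ok(true)
}
```

Writing the marker after the files is a design choice in this sketch: a crash mid-extract leaves no marker, so the next invocation rewrites the tree rather than trusting a partial one.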
End-to-end smoke against real Settings.app revealed that the LayoutMap and Element structs I initially wrote based on the RFC docs don't match what the simClaw script actually emits: for example, the script uses navigation.title / interactive / scroll.above_fold, not the guessed screen_title / elements / scroll_hints.
Since the schema is owned by the simClaw team and hasn't been frozen into a stable contract, pass the JSON through as serde_json::Value so consumers can work with whatever shape the script emits today. Retighten to typed structs once the upstream schema is contracted.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two real-world bugs found while driving Settings.app end-to-end:
1. Upstream simClaw creates the WDA session with snapshotMaxDepth=15 and
snapshotMaxChildren=25, which truncates Settings.app's tree at exactly
25 elements (root + nav + title). Every label-based primitive
(layout-map, tap-on, find-element, scroll-to-visible, wait) came back
empty. After setup()/wda_start() succeeds, we now POST permissive
settings (60/200) to /appium/settings via reqwest — transparent to
the user, best-effort silent no-op if WDA isn't reachable. Read the
session ID from whichever bootstrap cache the script wrote (UDID-keyed
or the _default.json fallback).
2. cmd_tap_and_wait prints a "READY: <label>" progress line on stdout
before cmd_layout_map emits the JSON payload, so serde_json chokes on
the prefix. Harden run_json to slice from the first `{` or `[`,
ignoring any leading progress lines. Generally useful — bash scripts
commonly mix progress and structured output on stdout.
Both workarounds can be removed once the simClaw team raises (or exposes)
the snapshot caps and moves READY/progress lines to stderr.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
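The stdout slicing from item 2 fits in a few lines. This is a sketch of the idea, not the PR's `run_json`: the `json_payload` name is hypothetical, and parsing the returned slice (with serde_json in the real code) is left to the caller.

```rust
/// Return the part of `stdout` starting at the first `{` or `[`,
/// skipping any leading progress lines such as "READY: <label>".
/// Returns `None` when no JSON-looking start character is present.
pub fn json_payload(stdout: &str) -> Option<&str> {
    // An array of chars is a valid `str::find` pattern; it matches any of them.
    stdout.find(['{', '[']).map(|start| &stdout[start..])
}
```

One limitation of this approach: a progress line that itself contains `{` or `[` would make the slice start too early, so it relies on progress output being plain text.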
- Drop unused Deserialize/Serialize imports in test/platform.rs (LayoutMap and Element are now serde_json::Value aliases, so neither derive is needed).
- Replace closure char comparison with array pattern in run_json (manual_pattern_char_comparison clippy lint).
Also re-run cargo fmt under the 1.81 toolchain to match CI exactly; 1.90's rustfmt produced a slightly different layout that CI's 1.81 rejected.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Switch upstream from homebrew-simClaw to mindvalley/mv-simclaw-ios, which is now the source of truth (the homebrew-simClaw repo carried only the formula after the migration). Re-vendors bin/sim + lib/simclaw/ and adds lib/simclaw/login.sh to the SIMCLAW_FILES manifest so the drift-guard test stays green.
login.sh is vendored dormant: IosBackend does not dispatch to it; login flows belong at the user/skill layer composed from tap-on/type/wait primitives, not as a wukong CLI verb.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Conflicts:
#   cli/src/commands/mod.rs
#   cli/tests/snapshots/completion__wukong_completion_bash.snap
#   cli/tests/snapshots/completion__wukong_completion_fish.snap
#   cli/tests/snapshots/completion__wukong_completion_zsh.snap
maail
approved these changes
May 18, 2026
Summary
- Adds a `wukong test` command group with iOS subcommands for driving the iOS Simulator (layout-map, tap, type, scroll, find-element, wait, screenshot, etc.) intended for agent-friendly UI automation.
- Vendors the `simClaw` bash backend (bin/sim + lib/simclaw/*.sh + swift helper) into `cli/src/commands/test/ios/simclaw/`. Files are embedded via `include_str!`, extracted to `~/.config/wukong/scripts/simclaw/` on first use, and version-gated on `CARGO_PKG_VERSION` so subsequent invocations skip the rewrite.
- `VENDORED.md` records the upstream commit, `revendor.sh` automates re-sync + diff, and a `manifest_matches_tree` test fails the build if `SIMCLAW_FILES` ever disagrees with the on-disk tree.
- Raises the WDA snapshot caps (`maxDepth=60`, `maxChildren=200`) after bootstrap so label-based primitives don't return empty on screens with >25 elements.
- Hardens `run_json` to slice from the first `{`/`[`, since some compound commands print progress lines (e.g. `READY: <label>`) on stdout before the JSON payload.
- Passes the script's JSON through as `serde_json::Value` until the upstream simClaw schema is contracted.
Test plan
- `cargo build` and `cargo test` pass locally
- `manifest_matches_tree` test passes
- `wukong test ios layout-map` returns a populated tree against a booted Settings.app
- `wukong test ios tap-on <label>` works end-to-end on a real simulator
- `revendor.sh` against a local `homebrew-simClaw` checkout reports a clean diff
🤖 Generated with Claude Code