fix: CLI recovery stays reliable during Unity readiness updates#1139
Conversation
Add a terminal-driven smoke harness that exercises launch/reuse readiness, compile domain-reload waiting, immediate log retrieval, dynamic-code recovery, and stale recovery-state guidance through the checked-in native CLI binary. Document the new helper so the E2E path stays explicit about project selection instead of depending on ambient working directories.
Add a single POSIX E2E runner so recovery readiness, input record/replay, and simulate-mouse UI coverage run through the same explicit project path flow. Update the existing shell helpers to accept the checked-in CLI binary and automated replay input so the documented E2E entrypoint is repeatable without relying on ambient shell state.
📝 WalkthroughWalkthroughThis PR introduces a complete terminal-driven end-to-end testing infrastructure for the Unity CLI Loop. It adds a Go-based smoke-test framework with recovery/readiness sequences, a POSIX orchestrator that coordinates multiple E2E phases, enhanced replay verification with JSON parsing and automation support, and transient-read retry logic to handle Windows file access transience. Documentation is updated to reflect the unified single-runner-per-shell approach. ChangesTerminal-Driven E2E Verification and Recovery System
Sequence DiagramsequenceDiagram
participant POSIX as run-posix-e2e.sh
participant Smoke as smoke-cli-recovery-readiness.go
participant ULoop as uloop CLI
participant Replay as verify-replay-via-cli.sh
participant Mouse as test-simulate-mouse-demo.sh
POSIX->>POSIX: Parse options & auto-select uloop binary
POSIX->>POSIX: Validate project structure
POSIX->>Smoke: Run recovery/readiness (conditional)
Smoke->>ULoop: launch / get-logs / compile / execute-dynamic-code
Smoke->>ULoop: get-logs (stale state detection)
Smoke->>ULoop: fix (cleanup)
Smoke-->>POSIX: Recovery checks passed
POSIX->>Replay: Run record/replay (conditional)
Replay->>ULoop: activate_for_record / record-input
Replay->>ULoop: activate_for_replay / replay-input
Replay-->>POSIX: Replay checks passed
POSIX->>Mouse: Run simulate-mouse (conditional)
Mouse->>ULoop: configure & run mouse simulation
Mouse-->>POSIX: Mouse checks passed
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related issues
Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
Assets/Tests/Demo/scripts/verify-replay-via-cli.sh (1)
266-273:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winEmpty log file would cause arithmetic error.
If
$1is empty,basebecomes empty and$((frame - base))fails with a syntax error.🛡️ Proposed defensive fix
normalize_frames() { base=$(head -1 "$1" | sed 's/Frame \([0-9]*\):.*/\1/') + [ -z "$base" ] && { cat "$1"; return; } sed "s/Frame \([0-9]*\)/Frame \1/" "$1" | while IFS= read -r line; do🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Assets/Tests/Demo/scripts/verify-replay-via-cli.sh` around lines 266 - 273, normalize_frames currently computes base from head -1 and will fail arithmetic if the file is empty or header missing; update normalize_frames to defensively handle empty/malformed input by checking that "$1" exists and has at least one line (e.g. test -s or read first line) and that the extracted base is a valid number, falling back to 0 if not; then proceed with the sed/while loop and use the sanitized numeric base when computing $((frame - base)) to avoid syntax errors in normalize_frames.
🧹 Nitpick comments (1)
Assets/Tests/Demo/scripts/verify-replay-via-cli.sh (1)
73-80: 💤 Low valueConsider using
jqfor robust JSON parsing if available.The sed-based JSON parsing is fragile for values containing escaped quotes or special characters. However, for this controlled E2E context where results are simple strings like
"OK: activated for recording", this is acceptable.Note: The sibling script
test-simulate-mouse-demo.shusesjqwith arequire_jqcheck. Ifjqavailability is expected, consider aligning parsing approaches.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Assets/Tests/Demo/scripts/verify-replay-via-cli.sh` around lines 73 - 80, The assert_json_result function uses a fragile sed extraction; update it to prefer jq when available: check for jq with "command -v jq >/dev/null" (or call the existing require_jq helper if present) and extract the value with jq -r '.Result' into actual, falling back to the existing sed expression only if jq is not found; keep the same failure behavior (call fail "$context: expected '$expected', got '$actual'") so behavior is unchanged but parsing is robust for escaped characters.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@Assets/Tests/Demo/scripts/verify-replay-via-cli.sh`:
- Line 197: Replace the plain read invocation "read dummy" with a safe discard
pattern: use read -r with an underscore variable to avoid backslash
interpretation and signal the input is unused (i.e., change the read call in the
script from the current read dummy to use -r and _ as the variable).
In `@scripts/smoke-cli-recovery-readiness.py`:
- Around line 127-130: The assertion currently only fails when
payload.get("Success") is False, allowing missing Success keys to pass; update
the check in the block that calls print_command_context(label, result) and
raises the AssertionError so it explicitly requires payload.get("Success") is
True (e.g., use "is not True" or "!= True") and raise an AssertionError when
Success is missing or not True, including label/result context via
print_command_context(label, result) and a clear error message referencing the
label and payload state.
- Around line 249-251: The code currently calls os.path.abspath on
args.project_path and args.uloop_path before validate_paths, which converts an
empty default_uloop_path() into the CWD and masks the intended "missing
uloop_path" validation; change the flow in main so you keep the raw
args.uloop_path value (do not call os.path.abspath on it if it's empty) and call
validate_paths(project_path, uloop_path) using the raw uloop_path string; only
normalize (os.path.abspath) project_path and uloop_path after validate_paths
returns success (or alternatively, apply os.path.abspath to uloop_path
conditionally when uloop_path is truthy). Ensure references to project_path,
uloop_path, validate_paths(), and default_uloop_path() are used as described.
---
Outside diff comments:
In `@Assets/Tests/Demo/scripts/verify-replay-via-cli.sh`:
- Around line 266-273: normalize_frames currently computes base from head -1 and
will fail arithmetic if the file is empty or header missing; update
normalize_frames to defensively handle empty/malformed input by checking that
"$1" exists and has at least one line (e.g. test -s or read first line) and that
the extracted base is a valid number, falling back to 0 if not; then proceed
with the sed/while loop and use the sanitized numeric base when computing
$((frame - base)) to avoid syntax errors in normalize_frames.
---
Nitpick comments:
In `@Assets/Tests/Demo/scripts/verify-replay-via-cli.sh`:
- Around line 73-80: The assert_json_result function uses a fragile sed
extraction; update it to prefer jq when available: check for jq with "command -v
jq >/dev/null" (or call the existing require_jq helper if present) and extract
the value with jq -r '.Result' into actual, falling back to the existing sed
expression only if jq is not found; keep the same failure behavior (call fail
"$context: expected '$expected', got '$actual'") so behavior is unchanged but
parsing is robust for escaped characters.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: db2f9357-2cbc-48a9-99d5-e18afe69ac84
📒 Files selected for processing (7)
Assets/Tests/Demo/scripts/verify-replay-via-cli.shREADME.mdREADME_ja.mdscripts/run-posix-e2e.shscripts/smoke-cli-recovery-readiness.pyscripts/test-simulate-mouse-demo.shscripts/test-smoke-cli-recovery-readiness.sh
There was a problem hiding this comment.
1 issue found across 7 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="scripts/smoke-cli-recovery-readiness.py">
<violation number="1" location="scripts/smoke-cli-recovery-readiness.py:250">
P2: When `default_uloop_path()` returns an empty string (on non-Darwin, non-Windows platforms), `os.path.abspath("")` normalizes it to the current working directory. This prevents the `if not uloop_path` guard in `validate_paths()` from ever triggering, producing a confusing "binary not found" error instead of the intended "No checked-in uloop binary for this platform" message.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
Re-trigger cubic
Require success-style JSON only for commands that expose a Success field, keep get-logs checks aligned with the real CLI payload, preserve empty default uloop paths for platform validation, and make the replay prompt read input without backslash mangling.
Replace the Python recovery/readiness smoke with a Go harness so POSIX and Windows terminal E2E coverage share the same implementation language. Add the new smoke to the Windows runner, align compile calls with the current default domain-reload wait contract, and harden native CLI readiness-state reads against transient Windows file locks and partially published JSON observed during the Windows run.
Use the checked-in Windows native CLI binary when the POSIX E2E runner is launched from Git Bash, MSYS, or Cygwin. This keeps the documented default runner behavior working on Windows without requiring an explicit --uloop-path.
Fail the terminal replay helper when a uloop JSON response reports Success=false and reject empty recording or replay logs. Convert Git Bash paths before passing log destinations into Unity so the POSIX runner can save and verify logs on Windows.
There was a problem hiding this comment.
1 issue found across 14 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="scripts/run-windows-e2e.ps1">
<violation number="1" location="scripts/run-windows-e2e.ps1:330">
P2: Resolve `uloop` as an executable command (`Application`) before passing `--uloop-path`; otherwise function/alias resolution can produce a non-binary `.Source` and break the new recovery smoke step.</violation>
</file>
Tip: Review your code locally with the cubic CLI to iterate faster.
Re-trigger cubic
Keep the Go recovery/readiness smoke aligned with the live CLI contract by accepting JSON-only get-logs payloads while requiring Success=true for success-style tool responses. Preserve the domain-reload compile path and cover missing Success payloads in the shell harness so the Go E2E replacement keeps the review fix intact.
Let the native CLI apply its checked-in compile wait default instead of passing the positive wait flag, which may be unavailable when the default tool catalog exposes only the negated no-wait option.
Ensure the Windows E2E runner passes an application path to the Go recovery smoke instead of a PowerShell alias or function source.
Summary
User Impact
--project-pathand the checked-in native CLI binary, reducing accidental project or CLI mismatch during validation.Changes
uloopas an executable application.Verification
scripts/check-go-cli.shsh scripts/test-smoke-cli-recovery-readiness.shsh scripts/run-posix-e2e.sh --project-path /Users/a12115/ghq/hatayama/unity-cli-loop --timeout 180 --launch-timeout 300sh -n scripts/run-posix-e2e.sh && sh -n scripts/test-simulate-mouse-demo.sh && sh -n Assets/Tests/Demo/scripts/verify-replay-via-cli.shgit diff --checkgit diff --check -- scripts/run-windows-e2e.ps1