fix: CLI recovery stays reliable during Unity readiness updates #1139

Merged: hatayama merged 9 commits into v3-beta from feature/hatayama/add-cli-recovery-e2e on May 16, 2026
@hatayama (Owner) commented May 16, 2026

Summary

  • Fix CLI recovery/readiness checks so transient readiness-file locks or partial writes are retried instead of failing immediately.
  • Unify terminal-driven E2E coverage so recovery/readiness, input record/replay, and simulate-mouse UI workflows can be verified from one runner per shell family.

User Impact

  • CLI commands are less likely to fail while Unity is replacing its readiness state, especially on Windows-style file handling where the readiness file can be temporarily locked or partially visible.
  • Developers can run a single POSIX E2E entrypoint with explicit --project-path and the checked-in native CLI binary, reducing accidental project or CLI mismatch during validation.

Changes

  • Add retry handling for Go CLI readiness-state file reads and partially published JSON, with focused unit coverage.
  • Add a Go recovery/readiness smoke test plus a fake-CLI harness test.
  • Add the POSIX terminal E2E runner and wire the existing record/replay and simulate-mouse helpers through explicit project paths.
  • Extend the Windows E2E runner to include the recovery/readiness smoke and resolve uloop as an executable application.
  • Use POSIX path semantics for the macOS uninstall target path and regenerate checked-in native CLI binaries.
  • Update README and README_ja to point users at the unified terminal E2E runners.

Verification

  • scripts/check-go-cli.sh
  • sh scripts/test-smoke-cli-recovery-readiness.sh
  • sh scripts/run-posix-e2e.sh --project-path /Users/a12115/ghq/hatayama/unity-cli-loop --timeout 180 --launch-timeout 300
  • sh -n scripts/run-posix-e2e.sh && sh -n scripts/test-simulate-mouse-demo.sh && sh -n Assets/Tests/Demo/scripts/verify-replay-via-cli.sh
  • git diff --check
  • git diff --check -- scripts/run-windows-e2e.ps1

hatayama added 2 commits May 16, 2026 15:33
Add a terminal-driven smoke harness that exercises launch/reuse readiness, compile domain-reload waiting, immediate log retrieval, dynamic-code recovery, and stale recovery-state guidance through the checked-in native CLI binary. Document the new helper so the E2E path stays explicit about project selection instead of depending on ambient working directories.
Add a single POSIX E2E runner so recovery readiness, input record/replay, and simulate-mouse UI coverage run through the same explicit project path flow. Update the existing shell helpers to accept the checked-in CLI binary and automated replay input so the documented E2E entrypoint is repeatable without relying on ambient shell state.
@coderabbitai (Bot, Contributor) commented May 16, 2026

Walkthrough

This PR introduces a complete terminal-driven end-to-end testing infrastructure for the Unity CLI Loop. It adds a Go-based smoke-test framework with recovery/readiness sequences, a POSIX orchestrator that coordinates multiple E2E phases, enhanced replay verification with JSON parsing and automation support, and transient-read retry logic to handle temporary file-access failures on Windows. Documentation is updated to reflect the unified single-runner-per-shell approach.

Changes

Terminal-Driven E2E Verification and Recovery System

| Layer | File(s) | Summary |
|---|---|---|
| Server State Retry Infrastructure | Packages/src/Cli~/internal/cli/server_state.go, server_state_test.go | Adds configurable transient-read retry logic for server-state.json and sidecar files with tests validating recovery from read failures and partial JSON unmarshalling. |
| Go Smoke-Test Framework for CLI Recovery/Readiness | scripts/smoke-cli-recovery-readiness.go | Comprehensive CLI smoke test that orchestrates live recovery sequences (launch → get-logs → compile → execute-dynamic-code) and stale-state detection/cleanup, executes uloop with context timeouts, validates JSON responses and result fields, and provides structured error reporting. |
| POSIX E2E Orchestrator and Test Harness | scripts/run-posix-e2e.sh, scripts/test-smoke-cli-recovery-readiness.sh | Main POSIX entry point that parses options, auto-selects checked-in uloop binaries by OS/architecture, validates project structure, and conditionally executes recovery/readiness, replay, and simulate-mouse phases; test harness creates temp projects and fake uloop binaries for validation. |
| Enhanced Replay Verification with JSON and Automation | Assets/Tests/Demo/scripts/verify-replay-via-cli.sh | Refactored script with CLI option parsing, JSON-aware wrappers, helper functions for controller activation and log saving, replay-scene initialization, automated input dispatch, and 8-step record/replay orchestration that extracts output paths from JSON responses. |
| Simulate-Mouse Script CLI Configuration | scripts/test-simulate-mouse-demo.sh | Adds --project-path and --uloop-path CLI options and run_uloop wrapper to consistently apply configured paths across cleanup, readiness checks, and scene initialization. |
| Compile/Get-Logs Flow and Documentation Updates | scripts/smoke-compile-get-logs.py, scripts/uloop-compile-get-logs-stress.sh, README.md, README_ja.md | Removes --wait-for-domain-reload from compile invocations for immediate get-logs checks after domain reload; updates documentation to reflect unified single-runner-per-shell E2E approach. |
| POSIX Path Handling for Uninstall | Packages/src/Cli~/internal/uninstall/command.go | Changes POSIX uninstall command construction from filepath.Join to path.Join for forward-slash semantics. |
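
The difference behind the filepath.Join to path.Join change can be shown with a short Go sketch. The `posixTargetPath` helper and the directory names are hypothetical and are not taken from command.go; only the two standard-library functions are real.

```go
package main

import (
	"fmt"
	"path"
)

// posixTargetPath builds a path intended for a POSIX host. path.Join always
// joins with "/", whereas filepath.Join uses the separator of the operating
// system the program is compiled for (a backslash on Windows).
func posixTargetPath(elem ...string) string {
	return path.Join(elem...)
}

func main() {
	// The directory and file names here are hypothetical.
	fmt.Println(posixTargetPath("/usr/local/bin", "uloop")) // prints "/usr/local/bin/uloop"
}
```

Using path.Join keeps the generated command stable regardless of which platform the CLI binary was built for, which matters when a path string is meant to be consumed on macOS or Linux.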

Sequence Diagram

sequenceDiagram
  participant POSIX as run-posix-e2e.sh
  participant Smoke as smoke-cli-recovery-readiness.go
  participant ULoop as uloop CLI
  participant Replay as verify-replay-via-cli.sh
  participant Mouse as test-simulate-mouse-demo.sh
  
  POSIX->>POSIX: Parse options & auto-select uloop binary
  POSIX->>POSIX: Validate project structure
  POSIX->>Smoke: Run recovery/readiness (conditional)
  Smoke->>ULoop: launch / get-logs / compile / execute-dynamic-code
  Smoke->>ULoop: get-logs (stale state detection)
  Smoke->>ULoop: fix (cleanup)
  Smoke-->>POSIX: Recovery checks passed
  POSIX->>Replay: Run record/replay (conditional)
  Replay->>ULoop: activate_for_record / record-input
  Replay->>ULoop: activate_for_replay / replay-input
  Replay-->>POSIX: Replay checks passed
  POSIX->>Mouse: Run simulate-mouse (conditional)
  Mouse->>ULoop: configure & run mouse simulation
  Mouse-->>POSIX: Mouse checks passed

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

  • hatayama/unity-cli-loop#1137: Directly implements terminal-driven E2E CLI coverage requested in the issue by introducing smoke-cli-recovery-readiness.go and POSIX orchestrator with explicit --project-path wiring across all uloop invocations.

Possibly related PRs

  • hatayama/unity-cli-loop#1027: Aligns with this PR's changes to compile/get-logs workflows by removing --wait-for-domain-reload to allow immediate get-logs checks after domain reload completion.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 5.56% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title describes CLI recovery reliability improvements, which aligns with the stale recovery-state handling and retry logic in server_state.go, but the PR's primary changes center on adding comprehensive E2E test coverage (recovery/readiness, input record/replay, simulate-mouse) with explicit project path and CLI binary handling, a broader scope than the title conveys. |


@coderabbitai (Bot, Contributor) left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
Assets/Tests/Demo/scripts/verify-replay-via-cli.sh (1)

266-273: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Empty log file would cause arithmetic error.

If the file passed as $1 is empty, base becomes empty and $((frame - base)) fails with a syntax error.

🛡️ Proposed defensive fix
 normalize_frames() {
     base=$(head -1 "$1" | sed 's/Frame \([0-9]*\):.*/\1/')
+    [ -z "$base" ] && { cat "$1"; return; }
     sed "s/Frame \([0-9]*\)/Frame \1/" "$1" | while IFS= read -r line; do
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Assets/Tests/Demo/scripts/verify-replay-via-cli.sh` around lines 266 - 273,
normalize_frames currently computes base from head -1 and will fail arithmetic
if the file is empty or header missing; update normalize_frames to defensively
handle empty/malformed input by checking that "$1" exists and has at least one
line (e.g. test -s or read first line) and that the extracted base is a valid
number, falling back to 0 if not; then proceed with the sed/while loop and use
the sanitized numeric base when computing $((frame - base)) to avoid syntax
errors in normalize_frames.
🧹 Nitpick comments (1)
Assets/Tests/Demo/scripts/verify-replay-via-cli.sh (1)

73-80: 💤 Low value

Consider using jq for robust JSON parsing if available.

The sed-based JSON parsing is fragile for values containing escaped quotes or special characters. However, for this controlled E2E context where results are simple strings like "OK: activated for recording", this is acceptable.

Note: The sibling script test-simulate-mouse-demo.sh uses jq with a require_jq check. If jq availability is expected, consider aligning parsing approaches.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Assets/Tests/Demo/scripts/verify-replay-via-cli.sh` around lines 73 - 80, The
assert_json_result function uses a fragile sed extraction; update it to prefer
jq when available: check for jq with "command -v jq >/dev/null" (or call the
existing require_jq helper if present) and extract the value with jq -r
'.Result' into actual, falling back to the existing sed expression only if jq is
not found; keep the same failure behavior (call fail "$context: expected
'$expected', got '$actual'") so behavior is unchanged but parsing is robust for
escaped characters.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@Assets/Tests/Demo/scripts/verify-replay-via-cli.sh`:
- Line 197: Replace the plain read invocation "read dummy" with a safe discard
pattern: use read -r with an underscore variable to avoid backslash
interpretation and signal the input is unused (i.e., change the read call in the
script from the current read dummy to use -r and _ as the variable).

In `@scripts/smoke-cli-recovery-readiness.py`:
- Around line 127-130: The assertion currently only fails when
payload.get("Success") is False, allowing missing Success keys to pass; update
the check in the block that calls print_command_context(label, result) and
raises the AssertionError so it explicitly requires payload.get("Success") is
True (e.g., use "is not True" or "!= True") and raise an AssertionError when
Success is missing or not True, including label/result context via
print_command_context(label, result) and a clear error message referencing the
label and payload state.
- Around line 249-251: The code currently calls os.path.abspath on
args.project_path and args.uloop_path before validate_paths, which converts an
empty default_uloop_path() into the CWD and masks the intended "missing
uloop_path" validation; change the flow in main so you keep the raw
args.uloop_path value (do not call os.path.abspath on it if it's empty) and call
validate_paths(project_path, uloop_path) using the raw uloop_path string; only
normalize (os.path.abspath) project_path and uloop_path after validate_paths
returns success (or alternatively, apply os.path.abspath to uloop_path
conditionally when uloop_path is truthy). Ensure references to project_path,
uloop_path, validate_paths(), and default_uloop_path() are used as described.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: db2f9357-2cbc-48a9-99d5-e18afe69ac84

📥 Commits

Reviewing files that changed from the base of the PR and between 7e45f1e and cdcb70d.

📒 Files selected for processing (7)
  • Assets/Tests/Demo/scripts/verify-replay-via-cli.sh
  • README.md
  • README_ja.md
  • scripts/run-posix-e2e.sh
  • scripts/smoke-cli-recovery-readiness.py
  • scripts/test-simulate-mouse-demo.sh
  • scripts/test-smoke-cli-recovery-readiness.sh

Comment thread Assets/Tests/Demo/scripts/verify-replay-via-cli.sh Outdated
Comment thread scripts/smoke-cli-recovery-readiness.py Outdated
Comment thread scripts/smoke-cli-recovery-readiness.py Outdated
@cubic-dev-ai (Bot) left a comment

1 issue found across 7 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="scripts/smoke-cli-recovery-readiness.py">

<violation number="1" location="scripts/smoke-cli-recovery-readiness.py:250">
P2: When `default_uloop_path()` returns an empty string (on non-Darwin, non-Windows platforms), `os.path.abspath("")` normalizes it to the current working directory. This prevents the `if not uloop_path` guard in `validate_paths()` from ever triggering, producing a confusing "binary not found" error instead of the intended "No checked-in uloop binary for this platform" message.</violation>
</file>


Comment thread scripts/smoke-cli-recovery-readiness.py Outdated
Require success-style JSON only for commands that expose a Success field, keep get-logs checks aligned with the real CLI payload, preserve empty default uloop paths for platform validation, and make the replay prompt read input without backslash mangling.
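
The stricter Success check described in this commit can be sketched in Go as follows. This is an assumption-laden illustration, not the harness code: the `requireSuccess` name and the sample payloads are hypothetical, and per the commit message the check would apply only to commands whose responses expose a Success field (get-logs payloads would skip it).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// requireSuccess fails unless the payload has a "Success" key whose value is
// the JSON boolean true. A missing key, null, false, or a non-boolean all fail.
func requireSuccess(label string, raw []byte) error {
	var payload map[string]json.RawMessage
	if err := json.Unmarshal(raw, &payload); err != nil {
		return fmt.Errorf("%s: invalid JSON: %w", label, err)
	}
	v, ok := payload["Success"]
	if !ok {
		return fmt.Errorf("%s: Success field is missing", label)
	}
	var success bool
	if err := json.Unmarshal(v, &success); err != nil || !success {
		return fmt.Errorf("%s: Success is %s, want true", label, v)
	}
	return nil
}

func main() {
	// The payloads below are hypothetical examples, not captured CLI output.
	fmt.Println(requireSuccess("compile", []byte(`{"Success": true}`)))  // prints "<nil>"
	fmt.Println(requireSuccess("compile", []byte(`{"Success": false}`))) // prints "compile: Success is false, want true"
	fmt.Println(requireSuccess("compile", []byte(`{"Result": "OK"}`)))   // prints "compile: Success field is missing"
}
```

Decoding into `json.RawMessage` first is what distinguishes a missing key from an explicit false, which a plain `bool` struct field would conflate.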
hatayama added 3 commits May 16, 2026 17:16
Replace the Python recovery/readiness smoke with a Go harness so POSIX and Windows terminal E2E coverage share the same implementation language. Add the new smoke to the Windows runner, align compile calls with the current default domain-reload wait contract, and harden native CLI readiness-state reads against transient Windows file locks and partially published JSON observed during the Windows run.
Use the checked-in Windows native CLI binary when the POSIX E2E runner is launched from Git Bash, MSYS, or Cygwin. This keeps the documented default runner behavior working on Windows without requiring an explicit --uloop-path.
Fail the terminal replay helper when a uloop JSON response reports Success=false and reject empty recording or replay logs. Convert Git Bash paths before passing log destinations into Unity so the POSIX runner can save and verify logs on Windows.
@cubic-dev-ai (Bot) left a comment

1 issue found across 14 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="scripts/run-windows-e2e.ps1">

<violation number="1" location="scripts/run-windows-e2e.ps1:330">
P2: Resolve `uloop` as an executable command (`Application`) before passing `--uloop-path`; otherwise function/alias resolution can produce a non-binary `.Source` and break the new recovery smoke step.</violation>
</file>


Comment thread scripts/run-windows-e2e.ps1 Outdated
hatayama added 3 commits May 16, 2026 21:59
Keep the Go recovery/readiness smoke aligned with the live CLI contract by accepting JSON-only get-logs payloads while requiring Success=true for success-style tool responses. Preserve the domain-reload compile path and cover missing Success payloads in the shell harness so the Go E2E replacement keeps the review fix intact.
Let the native CLI apply its checked-in compile wait default instead of passing the positive wait flag, which may be unavailable when the default tool catalog exposes only the negated no-wait option.
Ensure the Windows E2E runner passes an application path to the Go recovery smoke instead of a PowerShell alias or function source.
@hatayama changed the title from "chore: Terminal E2E checks now cover recovery and input workflows" to "fix: CLI recovery stays reliable during Unity readiness updates" on May 16, 2026
@hatayama hatayama merged commit 6dbe57b into v3-beta May 16, 2026
10 checks passed
@hatayama hatayama deleted the feature/hatayama/add-cli-recovery-e2e branch May 16, 2026 14:04
@github-actions github-actions Bot mentioned this pull request May 16, 2026