Track and report slow snapshot diagnostics #596

@thymikee


This was generated by AI during triage.

Problem

During flaky Android replay/test runs, snapshot capture latency can jump from the expected ~400-600ms range to ~1.5-2.3s. That is a strong signal that the run environment is unhealthy or that the snapshot path has degraded, but Agent Device currently does not summarize that signal in a way humans or agents can act on.

This came up while investigating the React Navigation Android suite: slow snapshots correlated with noisy or flaky runs, and the suspected causes were a stale daemon, emulator load, Metro or app stuck states, and Android snapshot helper slowdown or fallback.

Proposal

Collect snapshot timing stats for each open session and include them in relevant command/run output.

Suggested stats:

  • count
  • p50
  • p95
  • max
  • platform/backend detail when available, for example Android helper vs stock fallback
  • helper fallback/error count when available
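A minimal sketch of how these stats could be aggregated from recorded durations. The type names, field names, and backend labels here are hypothetical and not taken from the Agent Device codebase. It uses the nearest-rank percentile method, which is one reasonable choice for small sample counts.

```typescript
// Hypothetical shapes for illustration only; not Agent Device's actual schema.
interface SnapshotSample {
  durationMs: number;
  backend?: string; // assumed labels, e.g. "android-helper" or "stock-fallback"
  fellBack?: boolean; // true when the helper failed and a fallback path was used
}

interface SnapshotStats {
  count: number;
  p50: number;
  p95: number;
  max: number;
  backends: Record<string, number>; // capture count per backend label
  fallbackCount: number;
}

// Nearest-rank percentile over an ascending-sorted array.
function percentile(sorted: number[], p: number): number {
  if (sorted.length === 0) return 0;
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.min(sorted.length, Math.max(1, rank)) - 1];
}

function summarizeSnapshots(samples: SnapshotSample[]): SnapshotStats {
  const durations = samples.map((s) => s.durationMs).sort((a, b) => a - b);
  const backends: Record<string, number> = {};
  let fallbackCount = 0;
  for (const s of samples) {
    const key = s.backend ?? "unknown";
    backends[key] = (backends[key] ?? 0) + 1;
    if (s.fellBack) fallbackCount += 1;
  }
  return {
    count: durations.length,
    p50: percentile(durations, 50),
    p95: percentile(durations, 95),
    max: durations.length > 0 ? durations[durations.length - 1] : 0,
    backends,
    fallbackCount,
  };
}
```

With only a handful of captures, p95 collapses to the maximum, so a consumer may want to check count before trusting the percentile.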

For agent-device test and agent-device replay, include aggregate snapshot stats in the final structured result.

For individual commands that capture snapshots, expose a small diagnostic payload in JSON and optionally print a warning to stderr in non-JSON mode when the current command or session crosses a threshold.
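One possible shape for that diagnostic payload, as a sketch. The `diagnostics.snapshot` key, the field names, and the `attachSnapshotDiagnostics` helper are all assumptions made for illustration; the real JSON schema is still to be decided.

```typescript
// Hypothetical payload shape; the key and field names are illustrative assumptions.
interface SnapshotDiagnostics {
  count: number;
  p50Ms: number;
  p95Ms: number;
  maxMs: number;
  thresholdMs: number; // the threshold the p95 was compared against
  slow: boolean; // true when p95 meets or exceeds the threshold
}

// Returns a copy of the command result with a diagnostics field added,
// leaving the existing result fields untouched.
function attachSnapshotDiagnostics<T extends object>(
  result: T,
  timings: { count: number; p50Ms: number; p95Ms: number; maxMs: number },
  thresholdMs: number,
): T & { diagnostics: { snapshot: SnapshotDiagnostics } } {
  const snapshot: SnapshotDiagnostics = {
    ...timings,
    thresholdMs,
    slow: timings.count > 0 && timings.p95Ms >= thresholdMs,
  };
  return { ...result, diagnostics: { snapshot } };
}
```

Adding a new top-level key rather than changing existing fields is one way to keep the current JSON contract stable for consumers that ignore unknown keys.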

Warning Behavior

If snapshot p95 is high for the current run/session, print a scoped warning such as:

Warning: Android snapshots are slow in this run: p95 2180ms over 34 captures. Possible causes: emulator load, app/Metro stuck, helper fallback, stale daemon.

Keep stdout stable for normal command results. Prefer stderr for non-JSON warnings.
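A sketch of the warning rendering under stated assumptions. The 1500ms threshold and the minimum capture count are placeholder values chosen for illustration (1500ms is the low end of the degraded range reported above); this issue does not fix either number. Function names are hypothetical.

```typescript
interface SlowSnapshotInput {
  platform: string; // e.g. "Android"
  count: number; // number of snapshot captures in the run/session
  p95: number; // p95 capture latency in milliseconds
}

// Assumed defaults, not values specified by this issue.
const DEFAULT_P95_THRESHOLD_MS = 1500;
const DEFAULT_MIN_CAPTURES = 5; // avoids warning on a single slow capture

// Returns the warning text, or null when no warning should be shown.
function formatSlowSnapshotWarning(
  input: SlowSnapshotInput,
  thresholdMs: number = DEFAULT_P95_THRESHOLD_MS,
  minCaptures: number = DEFAULT_MIN_CAPTURES,
): string | null {
  if (input.count < minCaptures || input.p95 < thresholdMs) return null;
  return (
    `Warning: ${input.platform} snapshots are slow in this run: ` +
    `p95 ${Math.round(input.p95)}ms over ${input.count} captures. ` +
    `Possible causes: emulator load, app/Metro stuck, helper fallback, stale daemon.`
  );
}

// In JSON mode nothing is printed; otherwise the warning goes to stderr
// (console.error writes to stderr in Node.js), so stdout stays unchanged.
function reportSlowSnapshots(input: SlowSnapshotInput, jsonMode: boolean): void {
  if (jsonMode) return;
  const warning = formatSlowSnapshotWarning(input);
  if (warning !== null) console.error(warning);
}
```

Keeping formatting separate from printing makes the warning text easy to cover in unit tests without capturing stderr.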

Agent-Actionable Guidance

The diagnostic should distinguish reliable actions from guesses.

Reliable checks/actions:

  • report the slowdown and avoid trusting perf comparisons from that run
  • retry a flaky test once when retries are enabled
  • clean up daemons owned by the current test/replay run
  • report Android helper fallback counts/errors
  • check Metro reachability when the run is clearly RN/Expo/dev-client based

Potentially useful, but not always safe:

  • refresh adb reverse for known Metro ports
  • collect screenshot/log artifacts
  • wait for app/device idle

Avoid automatic broad recovery:

  • killing arbitrary stale daemons
  • rebooting emulators
  • restarting Metro
  • assuming app stuck vs host/device load without supporting evidence
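The three tiers above could be encoded as data so that an agent can tell which actions are safe to take without confirmation. This is only a sketch: the tier names, action ids, and helper functions are hypothetical, and the entries simply mirror the lists above.

```typescript
type RecoveryTier = "reliable" | "conditional" | "avoid";

interface RecoveryAction {
  id: string; // hypothetical identifier, not an existing Agent Device command
  tier: RecoveryTier;
  description: string;
}

const RECOVERY_ACTIONS: RecoveryAction[] = [
  { id: "report-slowdown", tier: "reliable", description: "Report the slowdown; do not trust perf comparisons from this run" },
  { id: "retry-once", tier: "reliable", description: "Retry a flaky test once when retries are enabled" },
  { id: "cleanup-owned-daemons", tier: "reliable", description: "Clean up daemons owned by the current test/replay run" },
  { id: "report-helper-fallback", tier: "reliable", description: "Report Android helper fallback counts/errors" },
  { id: "check-metro", tier: "reliable", description: "Check Metro reachability for RN/Expo/dev-client runs" },
  { id: "refresh-adb-reverse", tier: "conditional", description: "Refresh adb reverse for known Metro ports" },
  { id: "collect-artifacts", tier: "conditional", description: "Collect screenshot/log artifacts" },
  { id: "wait-for-idle", tier: "conditional", description: "Wait for app/device idle" },
  { id: "kill-stale-daemons", tier: "avoid", description: "Kill arbitrary stale daemons" },
  { id: "reboot-emulator", tier: "avoid", description: "Reboot emulators" },
  { id: "restart-metro", tier: "avoid", description: "Restart Metro" },
  { id: "assume-app-stuck", tier: "avoid", description: "Assume app stuck vs host/device load without evidence" },
];

function actionsForTier(tier: RecoveryTier): RecoveryAction[] {
  return RECOVERY_ACTIONS.filter((a) => a.tier === tier);
}

// Only "reliable" actions are treated as safe to run automatically;
// unknown ids are rejected.
function mayRunAutomatically(id: string): boolean {
  const action = RECOVERY_ACTIONS.find((a) => a.id === id);
  return action !== undefined && action.tier === "reliable";
}
```

Each reliable action still carries its own precondition from the list above (for example, retries must be enabled), so a real implementation would need to check those separately.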

Acceptance Criteria

  • Snapshot capture durations are recorded per session/run.
  • test/replay final results include aggregate snapshot stats.
  • JSON output includes machine-readable fields that an agent can inspect.
  • Non-JSON output prints a warning when snapshot p95 crosses a conservative threshold.
  • Warnings are scoped and actionable, not noisy for normal runs.
  • Existing command stdout contracts remain stable.
  • Unit/integration coverage exercises aggregation and warning rendering.
