Release 0.17.6 — workflow-capture reliability hardening#112
Conversation
Make macOS + browser workflow capture (monitor) production-grade across speech recognition, screen recording, AX/scroll event volume, and the task envelope. Speech - Ship the Speech Recognition usage string so macOS can present the TCC prompt; guard speech calls so a missing string can't crash on requestAuthorization. - New `interceptor macos trust speech [--prompt]` surfaces the Speech Recognition dialog (momentary activation-policy upgrade) and deep-links Settings; `trust check` now reports speech state. - Rewrite live recognition: actor-neutral authorization, on-device gated on support, ~55s task restart with a short audio ring buffer, and an always-on offline .caf fallback so a transcript exists even when live ASR is denied. `--speech-mode auto|live|offline|off`. Spin-up is async (never blocks start). Screen recording - Fix SCStream frame geometry (pixel scale + scalesToFit + capture resolution). - Continuous video via SCRecordingOutput -> screen.mp4 (`--video [fps]`, `--video-max-mb`). Frame budget / disk cap / dedup (`--frame-budget`, `--frame-disk-cap`, `--frame-pixel-scale`). Event volume - Coalesce scroll + AX value/layout (`--scroll-coalesce-ms`, `--ax-coalesce-ms`). - Async batched event writer off the main run loop + bounded queue with backpressure shedding. - `--all-apps` scope hygiene (skip system chrome by default; `--include-system-apps`, `--exclude-app`). Built-in pause-before-stop. Task envelope - macOS `monitor stop --task` now snapshots + regenerates + grades locally. - sid recovery + post-attach verification (no more empty split-brain envelopes). - Implement `monitor repair --task`; correct the recommended-fix strings that named a non-existent command. Auto-finalize on stop. Lifecycle - Decouple start/stop acks from heavy setup; raise the monitor RPC deadline; flush the event writer on SIGINT/SIGTERM. Bump all surfaces to 0.17.6. Tests +2 (dual-source single-envelope, repair recovery); suite 484/0. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
Caution Review failedPull request was closed or merged during review 📝 WalkthroughWalkthroughAdds a non-blocking ChangesMonitor Capture Infrastructure and Repair Flow
Sequence Diagram(s)sequenceDiagram
participant CLI as CLI (macos.ts / monitor.ts)
participant MonitorDomain as MonitorDomain (Swift)
participant MonitorEventWriter as MonitorEventWriter (Swift)
participant Bridges as Input/AX Bridges (Swift)
participant SpeechCapture as Speech/Frame Capture (Swift)
rect rgba(30, 100, 200, 0.5)
note over CLI,MonitorDomain: Session Start
CLI->>MonitorDomain: macos_monitor start (speechMode, video fps, coalescing, excludeApps...)
MonitorDomain->>MonitorDomain: parseKnobs → persist onto MonitorRuntime
MonitorDomain->>Bridges: start(scrollCoalesceMs, axCoalesceMs)
MonitorDomain->>SpeechCapture: startSpeechRecognition(speechMode: offline/live/auto)
MonitorDomain->>SpeechCapture: startFrameCapture(videoFps, frameBudget, diskCap)
MonitorDomain->>MonitorEventWriter: enqueue mon_start + video_start
end
rect rgba(200, 100, 30, 0.5)
note over Bridges,MonitorEventWriter: High-rate event coalescing
Bridges->>Bridges: emitMaybeCoalesced / accumulateScroll (timer window)
Bridges->>MonitorDomain: recordEvent(coalesced payload)
MonitorDomain->>MonitorEventWriter: enqueue → Bool (backpressure)
MonitorEventWriter-->>MonitorDomain: highWater → shed noisyEvents, emit capture_backpressure
end
rect rgba(30, 160, 80, 0.5)
note over CLI,MonitorEventWriter: Session Stop
CLI->>MonitorDomain: macos_monitor stop --task <id>
MonitorDomain->>MonitorDomain: s.paused = true
MonitorDomain->>MonitorEventWriter: flush()
MonitorDomain->>SpeechCapture: stopSpeechRecognition → finalize CAF, emit speech_audio_done
MonitorDomain->>SpeechCapture: stopFrameCapture → emit video_stop
MonitorDomain->>MonitorEventWriter: close(sessionEventsPath) → rotate file
CLI->>CLI: synthesizeTranscript + generateCaptureQuality → print
end
sequenceDiagram
participant CLI as CLI (monitor.ts)
participant repairFn as repairMonitorTask (shared/monitor-tasks.ts)
participant SessionDir as MONITOR_SESSIONS_DIR
participant TaskMeta as Task Metadata
CLI->>repairFn: repairMonitorTask(taskId, {snapshotSources, regenerate})
repairFn->>SessionDir: findLatestSessionSidForTask(taskId)
SessionDir-->>repairFn: unregistered SID
repairFn->>TaskMeta: attachMonitorTaskSource(sid)
repairFn->>repairFn: snapshotMonitorTaskSources()
repairFn->>repairFn: transcribeTaskSpeechAudio(taskId)
repairFn->>repairFn: synthesizeMonitorTaskTranscript(taskId)
repairFn->>repairFn: generateMonitorTaskCaptureQuality(taskId)
repairFn-->>CLI: { manifest, report, attached: [sid] }
CLI-->>CLI: print JSON or status+quality
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Makes macOS + browser workflow capture (the
monitorsurface) production-grade across speech recognition, screen recording, AX/scroll event volume, and the task envelope. Bumps all surfaces to 0.17.6.Speech
requestAuthorization.interceptor macos trust speech [--prompt]surfaces the Speech Recognition dialog (momentary activation-policy upgrade) and deep-links Settings;trust checknow reports speech state..caffallback so a transcript exists even when live ASR is denied. New--speech-mode auto|live|offline|off. Spin-up is async — never blocksmonitor start.Screen recording
scalesToFit+ capture resolution).SCRecordingOutput→screen.mp4(--video [fps],--video-max-mb).--frame-budget,--frame-disk-cap,--frame-pixel-scale).Event volume
--scroll-coalesce-ms,--ax-coalesce-ms).--all-appsscope hygiene (skip system chrome by default;--include-system-apps,--exclude-app). Built-in pause-before-stop.Task envelope
monitor stop --tasknow snapshots + regenerates + grades locally.sidrecovery + post-attach verification (no more empty split-brain envelopes).monitor repair --task; correct the recommended-fix strings that named a non-existent command. Auto-finalize on stop.Lifecycle
Verification
swift build, and the full test suite green (484 pass / 0 fail; +2 new tests: dual-source single-envelope, repair recovery).🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
monitor repaircommand to recover incomplete or disconnected monitor tasksBug Fixes
Chores