
feat(update): add --watch mode for periodic re-indexing #646

Open
mavaali wants to merge 1 commit into tobi:main from mavaali:feat/watch-mode


mavaali commented May 15, 2026

Summary

Adds qmd update --watch [--interval <dur>] [--embed] so users can keep the QMD index fresh without an external scheduler. This is the piece most often paired with qmd mcp in mcp.json-style setups where the agent expects the index to stay current as notes change.

Why

A friend of mine asked whether he could drop QMD into .vscode/mcp.json the same way you'd drop in @playwright/mcp and forget about it. Today the missing piece is re-indexing: qmd update works great manually, but there's no built-in way to keep it running. People end up wiring cron jobs, launchd agents, or just forgetting and getting stale results.

This PR adds watch mode so the common case ("re-index every few minutes, embed if anything new shows up") is one flag.

Design choices

  • Polling, not fs.watch. reindexCollection() already hashes files and skips unchanged content, so idle ticks are near-zero cost. Polling avoids a new dependency (chokidar) and keeps cross-platform behavior identical on Linux, macOS, and Windows where fs.watch semantics differ.
  • Default interval: 5 minutes. Configurable via --interval with units ms/s/m/h (e.g. 30s, 5m, 1.5h). Bare numbers are treated as seconds.
  • Quiet by default. Ticks only emit a line when documents actually change. Routine no-op ticks are silent so the log stays useful. Format: [2026-05-15T20:27:31.844Z] smoke: 1 new, 0 updated, 0 removed.
  • Optional --embed. Runs vectorIndex() after any tick that left pending hashes, so semantic search stays current without a second cron job.
  • Circuit breaker. After 3 consecutive tick failures, the loop exits non-zero so a supervisor (launchd/systemd/Task Scheduler) can restart it instead of looping forever.
  • Clean shutdown. SIGINT/SIGTERM stop the loop between ticks. Follows the same process.removeAllListeners pattern used by qmd mcp --http to bypass the top-level cursor-restoring handlers.
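
For reviewers who want to see the --interval rules concretely, here is a minimal sketch of a duration parser that follows the description above: units ms/s/m/h, fractional values, and bare numbers treated as seconds. It is an illustration only. The real parseDurationMs in src/watch.ts may be structured differently, and the error messages, the rounding, and the rejection of zero are assumptions.

```typescript
// Hypothetical sketch of the --interval parsing rules described above.
// Not the PR's actual implementation.
const UNIT_MS = { ms: 1, s: 1_000, m: 60_000, h: 3_600_000 } as const;
type Unit = keyof typeof UNIT_MS;

function parseDurationMs(input: string): number {
  // One non-negative decimal number, optionally followed by a unit.
  const match = /^(\d+(?:\.\d+)?)(ms|s|m|h)?$/.exec(input.trim());
  if (!match) {
    throw new Error(`Invalid duration: "${input}"`);
  }
  const value = Number(match[1]);
  const unit = (match[2] ?? "s") as Unit; // bare numbers are treated as seconds
  const ms = Math.round(value * UNIT_MS[unit]);
  if (!Number.isFinite(ms) || ms <= 0) {
    // Assumption: a zero or non-finite interval is rejected.
    throw new Error(`Duration must be positive: "${input}"`);
  }
  return ms;
}

console.log(parseDurationMs("30s"));  // 30000
console.log(parseDurationMs("5m"));   // 300000
console.log(parseDurationMs("1.5h")); // 5400000
console.log(parseDurationMs("90"));   // 90000
```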

Usage

# Re-index every 5 minutes (default)
qmd update --watch

# Custom interval
qmd update --watch --interval 30s

# Auto-embed new hashes after each tick
qmd update --watch --embed

Paired with the MCP server:

# Terminal 1
qmd mcp --http --daemon
# Terminal 2
qmd update --watch --embed

What I didn't do

  • Did not integrate into qmd mcp. Keeping watch as a sibling process avoids contention concerns and keeps the PR small. Happy to land that as a follow-up if you'd like.
  • Did not add a launchd/systemd template. Documented the pattern in README; users can wire their own supervisor.
  • Did not use fs.watch. Open to revisiting if anyone wants true real-time, but polling covers the stated use case.

Tests

  • 17 new unit tests in test/watch.test.ts covering parseDurationMs (units, fractions, edge cases, error paths) and runWatchLoop (tick counting, failure tolerance, circuit breaker, consecutive-failure reset, invalid interval).
  • All 843 existing tests still pass.
  • E2E manual smoke test: created a fresh collection, started qmd update --watch --interval 1s, added a new .md file mid-run, observed exactly one timestamped log line, sent SIGINT, observed clean exit.
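
The loop behaviors these tests cover (tick counting, failure tolerance, the circuit breaker, and the consecutive-failure reset) can be sketched roughly as follows. This is not the PR's runWatchLoop: the option names, the result shape, and the use of an AbortSignal for shutdown are assumptions made for illustration, and it uses an awaited sleep between ticks instead of the setInterval mentioned in the commit message so that ticks cannot overlap.

```typescript
// Hypothetical sketch of a polling loop with a consecutive-failure circuit breaker.
interface WatchOptions {
  intervalMs: number;
  tick: () => Promise<void>;        // one re-index pass (assumed shape)
  maxConsecutiveFailures?: number;  // assumed default: 3
  signal?: AbortSignal;             // stands in for SIGINT/SIGTERM handling
}

interface WatchResult {
  ticks: number;
  failures: number;
  tripped: boolean; // true when the circuit breaker ended the loop
}

// Resolves after `ms`, or immediately once the signal is aborted.
function sleep(ms: number, signal?: AbortSignal): Promise<void> {
  return new Promise((resolve) => {
    if (signal?.aborted) return resolve();
    const timer = setTimeout(done, ms);
    function done(): void {
      clearTimeout(timer);
      signal?.removeEventListener("abort", done);
      resolve();
    }
    signal?.addEventListener("abort", done, { once: true });
  });
}

async function runWatchLoop(opts: WatchOptions): Promise<WatchResult> {
  const { intervalMs, tick, maxConsecutiveFailures = 3, signal } = opts;
  if (!Number.isFinite(intervalMs) || intervalMs <= 0) {
    throw new Error("intervalMs must be a positive number");
  }
  let ticks = 0;
  let failures = 0;
  let consecutive = 0;
  while (!signal?.aborted) {
    ticks++;
    try {
      await tick();
      consecutive = 0; // any success resets the breaker
    } catch (err) {
      failures++;
      consecutive++;
      console.error(`tick ${ticks} failed:`, err);
      if (consecutive >= maxConsecutiveFailures) {
        return { ticks, failures, tripped: true };
      }
    }
    await sleep(intervalMs, signal); // shutdown takes effect between ticks
  }
  return { ticks, failures, tripped: false };
}
```

A caller would map tripped: true to a non-zero exit code so that a supervisor can restart the process.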

Refactor notes

updateCollections() now takes { quiet?, keepDbOpen? } and returns an UpdateSummary. Non-watch behavior is otherwise unchanged, with one externally visible exception in the legacy path: stderr output from a custom update command is now written with console.error instead of console.log, so it goes to stderr rather than stdout.
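
The description does not show the fields of UpdateSummary, so the following is only a guess at how a per-collection change record could drive the quiet-by-default log line quoted under Design choices. The CollectionChange interface, its field names, and formatTickLine are hypothetical. Only the output format is taken from the example line above.

```typescript
// Hypothetical shape; the PR's UpdateSummary may look quite different.
interface CollectionChange {
  collection: string;
  added: number;
  updated: number;
  removed: number;
}

// Returns the log line for a tick, or null when nothing changed
// (no-op ticks stay silent).
function formatTickLine(change: CollectionChange, now: Date = new Date()): string | null {
  const { collection, added, updated, removed } = change;
  if (added + updated + removed === 0) return null;
  return `[${now.toISOString()}] ${collection}: ${added} new, ${updated} updated, ${removed} removed`;
}

const line = formatTickLine(
  { collection: "smoke", added: 1, updated: 0, removed: 0 },
  new Date("2026-05-15T20:27:31.844Z"),
);
console.log(line); // [2026-05-15T20:27:31.844Z] smoke: 1 new, 0 updated, 0 removed
```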

Checklist

  • Tests added
  • npm run build passes
  • Full test suite passes (npx vitest run)
  • Changelog entry under ## [Unreleased]
  • README updated

Adds 'qmd update --watch [--interval <dur>] [--embed]' so users can keep
the QMD index fresh without running an external scheduler. This is the
piece most often paired with 'qmd mcp' in mcp.json-style setups where
the agent expects the index to stay current as notes change.

Design:
- Polling-based (setInterval), not fs.watch. Cheap because the existing
  reindexCollection() hashes files and skips unchanged content; idle
  ticks are near-zero cost. Avoids a new dependency (chokidar) and
  keeps cross-platform behavior identical on Linux, macOS, and Windows
  where fs.watch semantics differ.
- Default interval: 5 minutes. Configurable via --interval with units
  ms/s/m/h (e.g. '30s', '5m', '1.5h').
- Quiet by default: ticks only emit a line when documents actually
  change. Routine no-op ticks are silent so logs stay useful.
- Optional --embed flag runs vectorIndex() after any tick that left
  pending hashes, so semantic search stays current without a second
  cron job.
- 3-strike circuit breaker: after N consecutive tick failures (default
  3), the loop exits non-zero so a supervisor (launchd/systemd) can
  restart it instead of failing silently.
- Clean shutdown on SIGINT/SIGTERM between ticks. Removes the top-level
  cursor-restoring signal handlers (same pattern as 'qmd mcp --http')
  so the watch loop can install its own.

Refactor:
- updateCollections() now accepts { quiet, keepDbOpen } and returns
  an UpdateSummary describing what changed. The non-watch behavior is
  identical to before.
- New module src/watch.ts exposes parseDurationMs() and runWatchLoop()
  with full test coverage (17 unit tests).

Verified:
- 843/843 tests pass.
- E2E smoke test: 'qmd update --watch --interval 1s' detects a
  newly-created markdown file on the next tick, logs a single line,
  and exits cleanly on SIGINT.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>