Upstream a BMQ3-style read cache: MCP tool latency is product-defining for multi-agent use #980

## Problem

At power-user scale (~4,000+ notes), MCP tool latency is severe enough that one user built a parallel tooling layer rather than use BM's tools directly: reads take 3–7s and search takes ~12s through the standard path. For multi-agent setups (several assistants hitting the same knowledge base concurrently), that latency makes BM effectively unusable. Performance was the most repeated complaint in the user interview that surfaced this.

## Prior art: BMQ3

The user runs **BMQ3**, a caching proxy in front of BM's SQLite, and has offered it as input for upstreaming. Architecture worth studying:

- HTTP engine with a thin stdio wrapper forwarding JSON-RPC; 9 tools (`bm_read`, `bm_search`, `bm_grep`, `bm_query`, `bm_ls`, `bm_write`, `bm_edit`, `bm_delete`, `bm_health`)
- **RAM caches** for entities / observations / relations / frontmatter / permalinks, **invalidated on DB+WAL mtime** — cheap and correct enough in practice
- Imports BM internals directly (`FastEmbedEmbeddingProvider`, `apply_edit_operation`, `generate_permalink`) so read/write semantics match BM's
- Writes return in ~25ms with the DB upsert backgrounded
- Explicitly does **not** replace graph traversal (`build_context`) or schema operations
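
To make the DB+WAL mtime invalidation concrete, here is a minimal sketch of the idea, not BMQ3's actual code. `MtimeCache`, `db_signature`, and the `loader` callable are hypothetical names; the `-wal` suffix is SQLite's standard WAL file naming. Comparing file size alongside mtime is an extra guard added here, beyond what the issue describes.

```python
import os
from typing import Any, Callable, Dict, Optional, Tuple

Signature = Tuple[Tuple[int, int], ...]


def db_signature(db_path: str) -> Signature:
    """Return (mtime_ns, size) for the SQLite file and its WAL file.

    A missing file (for example, no WAL yet) contributes (0, 0).
    """
    parts = []
    for path in (db_path, db_path + "-wal"):
        try:
            st = os.stat(path)
            parts.append((st.st_mtime_ns, st.st_size))
        except FileNotFoundError:
            parts.append((0, 0))
    return tuple(parts)


class MtimeCache:
    """Hypothetical read cache that drops everything when the DB changes."""

    def __init__(self, db_path: str, loader: Callable[[str], Any]) -> None:
        self.db_path = db_path
        self.loader = loader  # assumed: fetches one value from the DB
        self._sig: Optional[Signature] = None
        self._data: Dict[str, Any] = {}

    def get(self, key: str) -> Any:
        # Stat before loading: if the DB changes mid-load, the next call
        # sees a newer signature and reloads, so stale data is not kept.
        sig = db_signature(self.db_path)
        if sig != self._sig:
            self._data.clear()
            self._sig = sig
        if key not in self._data:
            self._data[key] = self.loader(key)
        return self._data[key]
```

Each `get` costs two `stat` calls, which is what makes this style of check cheap. Whole-cache invalidation is the simplest policy; per-entity invalidation would need change tracking that this sketch does not attempt.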

**Measured results (BMQ3 vs. the standard path):** read 5ms cold / ~0.001ms cached (vs. 3–7s); search 95ms (vs. ~12s); grep 0.05ms (no standard-path baseline given).

## Proposal (v1.0)

Evaluate upstreaming the caching approach into BM itself: an in-process read cache over the SQLite index with mtime-based invalidation, so every MCP tool benefits without a sidecar proxy. Target: sub-100ms reads and search at the few-thousand-note scale.
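
One way to make the sub-100ms target checkable is a small timing harness. The sketch below is an assumption about how that might look: the issue does not say which statistic the target applies to, so p95 (nearest-rank) is chosen here for illustration, and `measure_ms`, `p95`, and `meets_target` are hypothetical helpers, not BM APIs.

```python
import time
from typing import Callable, List


def measure_ms(fn: Callable[[], object], runs: int = 50) -> List[float]:
    """Call fn repeatedly and return per-call wall time in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def p95(samples: List[float]) -> float:
    """Nearest-rank 95th percentile, using integer math for the rank."""
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n)
    return ordered[rank - 1]


def meets_target(samples: List[float], target_ms: float = 100.0) -> bool:
    """True if p95 latency is under the target (assumed: 100 ms)."""
    return p95(samples) < target_ms
```

In use, `fn` would wrap a single tool call such as a `read_note` or `search_notes` invocation against a fixture knowledge base of a few thousand notes, measured once cold and once warm.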

Open questions:

- In-process cache in the API layer vs. a long-lived daemon (cloud mode already has a server; local stdio MCP is where cold-start + query cost bites)
- Invalidation: DB+WAL mtime is proven by BMQ3; interaction with sync writes needs care
- Which surfaces gain most: `read_note`, `search_notes`, `list_directory` first; `build_context` stays on the graph path
- Memory budget at 10k+ notes
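
For the memory-budget question, one option is to bound the cache by bytes rather than entry count. The following is a minimal sketch of a byte-budgeted LRU under stated assumptions: `ByteBudgetLRU` is a hypothetical class, values are assumed to be serialized `bytes`, and only payload length is counted, so real Python memory use would be somewhat higher.

```python
from collections import OrderedDict
from typing import Optional


class ByteBudgetLRU:
    """Hypothetical LRU cache that evicts by total payload bytes."""

    def __init__(self, max_bytes: int) -> None:
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self._items: "OrderedDict[str, bytes]" = OrderedDict()

    def get(self, key: str) -> Optional[bytes]:
        value = self._items.get(key)
        if value is not None:
            self._items.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key: str, value: bytes) -> None:
        # Replace any existing entry so its size is not counted twice.
        if key in self._items:
            self.used_bytes -= len(self._items.pop(key))
        if len(value) > self.max_bytes:
            return  # larger than the whole budget: do not cache
        self._items[key] = value
        self.used_bytes += len(value)
        # Evict least recently used entries until back under budget.
        while self.used_bytes > self.max_bytes:
            _, evicted = self._items.popitem(last=False)
            self.used_bytes -= len(evicted)
```

A budget like this keeps memory flat as the note count grows past 10k, at the cost of cold reads for rarely touched notes. Whether that trade is acceptable depends on access patterns the issue does not yet characterize.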

## Related

- #977 — the interface-design half of the same interview feedback (file-path-style addressing and grep-style search align with model training; BMQ3 implements both). This issue is the performance half.
- If we don't address this, the failure mode is more power users forking or building parallel layers.
