Problem
At power-user scale (~4,000+ notes), our MCP tool latency is severe enough that it forced a user to build a parallel tooling layer rather than use BM's tools directly: reads take 3–7s and search ~12s through the standard path. For multi-agent setups (several assistants hitting the same knowledge base concurrently), that latency makes BM effectively unusable. Performance was the most repeated complaint in the user interview that surfaced this.
Prior art: BMQ3
The user runs BMQ3, a caching proxy in front of BM's SQLite, and has offered it as input for upstreaming. Architecture worth studying:
HTTP engine with a thin stdio wrapper forwarding JSON-RPC; 9 tools (bm_read, bm_search, bm_grep, bm_query, bm_ls, bm_write, bm_edit, bm_delete, bm_health)
RAM caches for entities / observations / relations / frontmatter / permalinks, invalidated on DB+WAL mtime — cheap and correct enough in practice
Imports BM internals directly (FastEmbedEmbeddingProvider, apply_edit_operation, generate_permalink) so read/write semantics match BM's
Writes return in ~25ms with the DB upsert backgrounded
Explicitly does not replace graph traversal (build_context) or schema operations
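The fast-write behaviour described above (acknowledge the write, background the DB upsert) can be illustrated with a small write-behind sketch. This is not BMQ3's code: the `WriteBehindStore` class, the `note` table, and its columns are hypothetical names chosen for the example, and a real implementation would also need error handling and crash recovery for writes that are acknowledged but not yet persisted.

```python
import queue
import sqlite3
import threading
from typing import Optional


class WriteBehindStore:
    """Acknowledge writes from RAM and persist them on a background thread.

    Hypothetical sketch: the class name, table, and columns are assumptions.
    """

    def __init__(self, db_path: str) -> None:
        self._db_path = db_path
        self._cache: dict = {}
        self._lock = threading.Lock()
        self._pending: queue.Queue = queue.Queue()
        threading.Thread(target=self._drain, daemon=True).start()

    def write(self, permalink: str, content: str) -> None:
        # Update the in-memory copy first so readers see the write immediately,
        # then hand the upsert to the worker and return without waiting on SQLite.
        with self._lock:
            self._cache[permalink] = content
        self._pending.put((permalink, content))

    def read(self, permalink: str) -> Optional[str]:
        with self._lock:
            return self._cache.get(permalink)

    def flush(self) -> None:
        # Block until every queued upsert has been committed (or has failed).
        self._pending.join()

    def _drain(self) -> None:
        # The connection is created and used only on the worker thread.
        conn = sqlite3.connect(self._db_path)
        conn.execute(
            "CREATE TABLE IF NOT EXISTS note "
            "(permalink TEXT PRIMARY KEY, content TEXT)"
        )
        conn.commit()
        while True:
            permalink, content = self._pending.get()
            try:
                conn.execute(
                    "INSERT OR REPLACE INTO note (permalink, content) VALUES (?, ?)",
                    (permalink, content),
                )
                conn.commit()
            finally:
                self._pending.task_done()
```

The trade-off is durability: a write acknowledged from RAM is lost if the process dies before the worker commits it, which is one reason the interaction with sync writes deserves care.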
Proposal (v1.0)
Evaluate upstreaming the caching approach into BM itself: an in-process read cache over the SQLite index with mtime-based invalidation, so every MCP tool benefits without a sidecar proxy. Target: sub-100ms reads/search at the few-thousand-note scale.
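To make the proposal concrete, here is a minimal sketch of an in-process read cache that is invalidated when the SQLite file or its WAL sidecar changes. It is an illustration under assumptions, not BM's or BMQ3's implementation: `db_signature`, `MtimeCache`, and the `loader` callback are hypothetical names, and comparing file size alongside mtime is an extra guard added here, not something the source specifies.

```python
import os
from typing import Any, Callable, Dict, Optional, Tuple


def db_signature(db_path: str) -> Tuple[Optional[Tuple[int, int]], ...]:
    """Return (mtime_ns, size) for the DB file and its -wal sidecar.

    A missing file contributes None, so creating or removing the WAL
    also changes the signature.
    """
    parts = []
    for path in (db_path, db_path + "-wal"):
        try:
            st = os.stat(path)
            parts.append((st.st_mtime_ns, st.st_size))
        except FileNotFoundError:
            parts.append(None)
    return tuple(parts)


class MtimeCache:
    """Read-through cache that drops all entries when the DB signature changes."""

    def __init__(self, db_path: str, loader: Callable[[str], Any]) -> None:
        self._db_path = db_path
        self._loader = loader  # hypothetical: fetches one value from the index
        self._sig: Optional[tuple] = None
        self._entries: Dict[str, Any] = {}

    def get(self, key: str) -> Any:
        # Stat before loading: if the DB changes mid-load, the next call sees
        # a new signature and reloads, so stale data is not kept indefinitely.
        sig = db_signature(self._db_path)
        if sig != self._sig:
            self._entries.clear()
            self._sig = sig
        if key not in self._entries:
            self._entries[key] = self._loader(key)
        return self._entries[key]
```

Each `get` costs two `stat` calls, which is cheap next to a query. The sketch clears the whole cache on any change; finer-grained invalidation would need more information than mtime provides.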
Open questions:
In-process cache in the API layer vs. a long-lived daemon (cloud mode already has a server; local stdio MCP is where cold-start + query cost bites)
Invalidation: DB+WAL mtime is proven by BMQ3; interaction with sync writes needs care
Which surfaces gain most: read_note, search_notes, list_directory first; build_context stays on the graph path
Memory budget at 10k+ notes
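On the memory-budget question, one option worth evaluating is bounding the cache by bytes rather than by entry count, evicting least-recently-used entries once the budget is exceeded. The sketch below is only an assumption about how that could look; `ByteBudgetLRU` is a hypothetical name, and it counts payload bytes only, ignoring Python object overhead.

```python
from collections import OrderedDict
from typing import Optional


class ByteBudgetLRU:
    """LRU cache bounded by total payload bytes (hypothetical sketch)."""

    def __init__(self, max_bytes: int) -> None:
        self._max_bytes = max_bytes
        self._used = 0
        self._items: "OrderedDict[str, bytes]" = OrderedDict()

    @property
    def used_bytes(self) -> int:
        return self._used

    def get(self, key: str) -> Optional[bytes]:
        value = self._items.get(key)
        if value is not None:
            # Mark as most recently used.
            self._items.move_to_end(key)
        return value

    def put(self, key: str, value: bytes) -> None:
        # Remove any previous value for this key from the accounting.
        old = self._items.pop(key, None)
        if old is not None:
            self._used -= len(old)
        if len(value) > self._max_bytes:
            return  # larger than the whole budget: do not cache
        self._items[key] = value
        self._used += len(value)
        # Evict from the least-recently-used end until back under budget.
        while self._used > self._max_bytes:
            _, evicted = self._items.popitem(last=False)
            self._used -= len(evicted)
```

A byte budget keeps memory use predictable as note counts grow, at the cost of cache misses on rarely read notes.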
Related
Normalize the CLI / MCP tool argument surface for v1.0 #977 — the interface-design half of the same interview feedback (file-path-style addressing and grep-style search align with model training; BMQ3 implements both). This issue is the performance half.
If we don't address this, the failure mode is more power users forking or building parallel layers.
Measured results (BMQ3): read 5ms cold / ~0.001ms cached (vs. 3–7s); search 95ms (vs. 12s); grep 0.05ms.