Upstream a BMQ3-style read cache: MCP tool latency is product-defining for multi-agent use #980

@groksrc

Problem

At power-user scale (~4,000+ notes), our MCP tool latency is severe enough that it forced a user to build a parallel tooling layer rather than use BM's tools directly: reads take 3–7s and search ~12s through the standard path. For multi-agent setups (several assistants hitting the same knowledge base concurrently), that latency makes BM effectively unusable — performance was the most repeated complaint in the user interview that surfaced this.

Prior art: BMQ3

The user runs BMQ3, a caching proxy in front of BM's SQLite, and has offered it as input for upstreaming. Architecture worth studying:

  • HTTP engine with a thin stdio wrapper forwarding JSON-RPC; 9 tools (bm_read, bm_search, bm_grep, bm_query, bm_ls, bm_write, bm_edit, bm_delete, bm_health)
  • RAM caches for entities / observations / relations / frontmatter / permalinks, invalidated on DB+WAL mtime — cheap and correct enough in practice
  • Imports BM internals directly (FastEmbedEmbeddingProvider, apply_edit_operation, generate_permalink) so read/write semantics match BM's
  • Writes return in ~25ms with the DB upsert backgrounded
  • Explicitly does not replace graph traversal (build_context) or schema operations
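
The mtime-based invalidation described above can be sketched roughly as follows. This is a minimal illustration, not BMQ3's actual code: the `MtimeCache` class, the `load` callback, and the use of file size alongside mtime are assumptions made for the example. The `-wal` suffix is SQLite's standard naming for the write-ahead log file.

```python
import sqlite3
from contextlib import closing
from pathlib import Path
from typing import Callable, Dict, Optional, Tuple

FileSig = Tuple[int, int]
Signature = Tuple[FileSig, FileSig]


def _stat_sig(path: Path) -> FileSig:
    """Return (mtime_ns, size) for a file, or (0, 0) if it does not exist."""
    try:
        st = path.stat()
    except FileNotFoundError:
        return (0, 0)
    return (st.st_mtime_ns, st.st_size)


class MtimeCache:
    """Hypothetical read cache that is dropped whenever the DB or WAL file changes."""

    def __init__(self, db_path: str) -> None:
        self.db_path = Path(db_path)
        self.wal_path = Path(db_path + "-wal")  # SQLite's WAL file naming
        self._sig: Optional[Signature] = None
        self._entries: Dict[str, object] = {}

    def _signature(self) -> Signature:
        return (_stat_sig(self.db_path), _stat_sig(self.wal_path))

    def get(self, key: str, load: Callable[[sqlite3.Connection, str], object]) -> object:
        # Two stat() calls per lookup: cheap compared with a query round trip.
        sig = self._signature()
        if sig != self._sig:
            self._entries.clear()  # any change on disk invalidates everything
            self._sig = sig
        if key not in self._entries:
            with closing(sqlite3.connect(self.db_path)) as conn:
                self._entries[key] = load(conn, key)
        return self._entries[key]
```

Invalidating the whole cache on any change is coarse, but it keeps the correctness argument simple: a cached value is only served while both files look untouched. Filesystems with coarse timestamp granularity could miss a write that lands in the same tick, which is one reason the sketch also compares file size.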

Measured results: read 5ms cold / ~0.001ms cached (vs. 3–7s); search 95ms (vs. 12s); grep 0.05ms.
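
For the fast write path listed above (return quickly, upsert in the background), one possible shape is a write-behind queue. This is a sketch under assumptions: the `WriteBehindStore` class, the `note` table, and its columns are hypothetical, and `INSERT OR REPLACE` stands in for whatever upsert BM actually performs. The issue does not say how BMQ3 handles a crash before the queue drains, so that case is not covered here.

```python
import queue
import sqlite3
import threading
from typing import Dict, Optional, Tuple


class WriteBehindStore:
    """Hypothetical store: writes hit RAM first, the SQLite upsert runs on a worker."""

    def __init__(self, db_path: str) -> None:
        self._db_path = db_path
        self._cache: Dict[str, str] = {}
        self._pending: "queue.Queue[Optional[Tuple[str, str]]]" = queue.Queue()
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def _drain(self) -> None:
        # The connection is created and used on the worker thread only.
        conn = sqlite3.connect(self._db_path)
        conn.execute(
            "CREATE TABLE IF NOT EXISTS note (permalink TEXT PRIMARY KEY, body TEXT)"
        )
        while True:
            item = self._pending.get()
            if item is None:  # sentinel pushed by close()
                break
            conn.execute(
                "INSERT OR REPLACE INTO note (permalink, body) VALUES (?, ?)", item
            )
            conn.commit()
        conn.close()

    def write(self, permalink: str, body: str) -> None:
        self._cache[permalink] = body         # visible to readers immediately
        self._pending.put((permalink, body))  # durable upsert happens later

    def read(self, permalink: str) -> Optional[str]:
        return self._cache.get(permalink)

    def close(self) -> None:
        """Flush all queued writes, then stop the worker."""
        self._pending.put(None)
        self._worker.join()
```

The caller only pays for a dict assignment and a queue push. The trade-off is durability: anything still queued when the process dies is lost, and each background commit also bumps the DB mtime, so it interacts with mtime-based invalidation.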

Proposal (v1.0)

Evaluate upstreaming the caching approach into BM itself — an in-process read cache over the SQLite index with mtime-based invalidation, so every MCP tool benefits without a sidecar proxy. Target: sub-100ms reads/search at the few-thousand-note scale.

Open questions:

  • In-process cache in the API layer vs. a long-lived daemon (cloud mode already has a server; local stdio MCP is where cold-start + query cost bites)
  • Invalidation: DB+WAL mtime has worked in practice for BMQ3; its interaction with sync writes (which also touch the DB) needs care
  • Which surfaces gain most: read_note, search_notes, list_directory first; build_context stays on the graph path
  • Memory budget at 10k+ notes
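
On the memory-budget question, one option is to bound the cache by total bytes rather than entry count and evict least-recently-used notes. The sketch below is illustrative only: `ByteBudgetLRU` is a hypothetical name, and it counts payload bytes, not Python object overhead, so real memory use would be somewhat higher.

```python
from collections import OrderedDict
from typing import Optional


class ByteBudgetLRU:
    """Hypothetical LRU cache bounded by total payload bytes, not entry count."""

    def __init__(self, max_bytes: int) -> None:
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self._items: "OrderedDict[str, bytes]" = OrderedDict()

    def get(self, key: str) -> Optional[bytes]:
        value = self._items.get(key)
        if value is not None:
            self._items.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key: str, value: bytes) -> None:
        if key in self._items:
            self.used_bytes -= len(self._items.pop(key))
        if len(value) > self.max_bytes:
            return  # larger than the whole budget: do not cache
        self._items[key] = value
        self.used_bytes += len(value)
        while self.used_bytes > self.max_bytes:
            _, evicted = self._items.popitem(last=False)  # drop least recently used
            self.used_bytes -= len(evicted)
```

With a fixed budget, growth past 10k notes degrades hit rate instead of growing memory without limit, which makes the trade-off something that can be measured and tuned.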

Related

  • Normalize the CLI / MCP tool argument surface for v1.0 #977 — the interface-design half of the same interview feedback (file-path-style addressing and grep-style search align with model training; BMQ3 implements both). This issue is the performance half.
  • If we don't address this, the failure mode is more power users forking or building parallel layers.

Labels: enhancement, v1.0