Please read this Version file
███╗ ███╗███████╗███╗ ███╗███╗ ███╗ ██████╗ ██╗ ████████╗
████╗ ████║██╔════╝████╗ ████║████╗ ████║██╔═══██╗██║ ╚══██╔══╝
██╔████╔██║█████╗ ██╔████╔██║██╔████╔██║██║ ██║██║ ██║
██║╚██╔╝██║██╔══╝ ██║╚██╔╝██║██║╚██╔╝██║██║ ██║██║ ██║
██║ ╚═╝ ██║███████╗██║ ╚═╝ ██║██║ ╚═╝ ██║╚██████╔╝███████╗██║
╚═╝ ╚═╝╚══════╝╚═╝ ╚═╝╚═╝ ╚═╝ ╚═════╝ ╚══════╝╚═╝
Structured, searchable, long-term memory for your AI agent. Plugs into Claude Code (and any MCP-compatible client) over the Model Context Protocol.
MemMolt is a memory system for AI agents. Think of it as a brain extension for Claude Code that remembers things across conversations, keeps them tidy, and can recall them by meaning — not just keywords.
It lives on your machine as a tiny local server. Your agent connects to it, reads from it, writes to it, and picks up where it left off the next time you open a conversation.
Memory is organized into three simple layers:
BUCKET → top-level category ("Personal Finances")
│
└── THREAD → sub-topic ("Quarterly Taxes")
│
└── MEMO → the actual note ("Q3 estimated payment notes")
That's it. Buckets hold threads. Threads hold memos. Memos are markdown documents. No nested folders, no wiki rabbit holes, no broken links.
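As a rough illustration of the three layers, the example above could be modeled like this. The object shape, the summary strings, and the `memoPaths` helper are hypothetical and exist only for this sketch; MemMolt itself keeps these records in SQLite.

```javascript
// Hypothetical in-memory shape for illustration only.
// Field names (title, summary, content) follow the README's description;
// the real storage layout lives in SQLite and may differ.
const bucket = {
  title: "Personal Finances",
  summary: "Money matters: taxes, budgets, accounts.",
  threads: [
    {
      title: "Quarterly Taxes",
      summary: "Estimated tax payments and deadlines.",
      memos: [
        {
          title: "Q3 estimated payment notes",
          summary: "Notes about the Q3 estimated payment.",
          content: "# Q3 payment\n\nNotes go here as Markdown.",
        },
      ],
    },
  ],
};

// Every memo is reachable by exactly one bucket > thread > memo path.
function memoPaths(b) {
  return b.threads.flatMap((t) =>
    t.memos.map((m) => `${b.title} > ${t.title} > ${m.title}`)
  );
}

console.log(memoPaths(bucket));
```

Because the depth is fixed at three, a path always has exactly three segments, which is what keeps the structure from sprawling.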
If you've used an AI agent for a while, you've probably hit one of these:
- "It forgot what we talked about yesterday."
- "I keep re-explaining the same context every session."
- "My notes folder has grown into a jungle I can't navigate."
- "The agent wastes half its context window reading old notes."
MemMolt fixes all four. The agent only pulls in the memos it actually needs, organized so it knows where to look, indexed so it can find them by meaning and not just word match.
A plain notes folder works — until it doesn't. Here's what you trade away:
| Problem | Plain notes folder | MemMolt |
|---|---|---|
| Finding things | Text search only. Miss the exact word, miss the note. | Hybrid search: keyword (FTS5) + meaning (vector embeddings) combined via RRF so you get both. |
| Organization | Folders can be nested arbitrarily, links break, stuff drifts. | Enforced 3-level hierarchy. The agent can't make a mess because the structure doesn't allow one. |
| Context window | Agent reads multiple full files just to check if they're relevant. | Agent searches summaries first, fetches only the memos it actually needs. Massively fewer tokens. |
| Spiraling out | Long notes get longer, topics sprawl across files. | Summaries force compression. Each memo has a forced title + summary that must describe what it contains. |
| Consistency | The agent might remember to update notes, or it might not. | The tool prompts the agent to update summaries after changes. Built-in nudges keep memory fresh. |
| Search across topics | Manual. You grep, you browse, you hope. | One call returns results from the whole system, ranked by relevance. |
Short version: A notes folder is a filesystem. MemMolt is a memory system.
- Everything lives in one SQLite file on your disk (default: ~/.memmolt/memmolt.sqlite; see Configuration for the full resolution order).
- Summaries are turned into vectors by a local embedding model (all-MiniLM-L6-v2, runs in-process, no cloud calls).
- Search combines keyword matching (FTS5) and semantic matching (sqlite-vec), then merges them with Reciprocal Rank Fusion (RRF).
- The agent talks to MemMolt over MCP (Model Context Protocol) — the same way Claude Code talks to any other tool.
No separate database server. No subprocess management. No cloud dependencies. One file, one process.
When the agent is asked to remember something, it routes the information through the enforced Bucket → Thread → Memo hierarchy:
The AI agent picks the right bucket (or creates one), picks the right thread under it, then writes a memo. Every level carries its own summary, indexed both as a vector embedding and for BM25 keyword search, so search can start from any level.
Used for memo, thread, and bucket searches — same pipeline at every level:
The query runs through both a vector search (semantic similarity on the summary embedding) and an FTS5 / BM25 search (keyword matching). The two ranked lists are merged via Reciprocal Rank Fusion (RRF) to produce the final result set. Anything that ranks well in either approach — or both — bubbles to the top.
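The fusion step can be sketched as follows. This is a minimal illustration of Reciprocal Rank Fusion in general, not MemMolt's actual code: the function name, the memo IDs, and the constant `k = 60` (a commonly used default) are assumptions.

```javascript
// Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank of d).
// k = 60 is a widely used default, assumed here; MemMolt's value may differ.
function reciprocalRankFusion(rankedLists, k = 60) {
  const scores = new Map();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id, score]) => ({ id, score }));
}

// Hypothetical memo IDs returned by the two searches, best match first.
const vectorHits = [7, 3, 9];
const keywordHits = [3, 12, 7];
const fused = reciprocalRankFusion([vectorHits, keywordHits]);
console.log(fused.map((r) => r.id));
```

Memos 3 and 7 appear in both lists, so they outrank memos 12 and 9, which appear in only one. RRF uses only rank positions, which is why it can combine BM25 scores and vector distances without normalizing either.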
Individual memos don't live in isolation. Once a memo exists, MemMolt gives the agent two ways to traverse the memo graph from it:
Direct linking — the agent writes cross-references directly in a memo's Markdown, using standard link syntax:
The foundations are in [Color theory basics](M:1#heading-2).
See also the [full intro](M:1).

You can point at a whole memo (M:1) or at a specific heading inside it (M:1#heading-2). Headings can be written in their natural form, such as [x](M:1#My Section), and the server normalizes them to GitHub-style slugs (#my-section) at save time, so the agent never has to slugify anything. Links inside fenced code blocks or inline code are left alone, and external links like [doc](./file.md) are ignored. On every fetch_memos call, these refs are resolved to { memo_id, heading, memo_title, memo_summary } so the agent sees exactly where each link points.
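The normalization described above might look roughly like this. It is a simplified sketch under stated assumptions: `slugifyHeading` and `normalizeMemoLinks` are hypothetical names, the slug rule handles ASCII headings only, and unlike the real server this version does not skip links inside code blocks.

```javascript
// Simplified GitHub-style slug: lowercase, drop punctuation, spaces become hyphens.
// Real GitHub slugging also handles Unicode and duplicate headings; omitted here.
function slugifyHeading(heading) {
  return heading
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9 _-]/g, "")
    .replace(/ /g, "-");
}

// Matches [text](M:<id>) and [text](M:<id>#heading). Other links do not match.
const MEMO_LINK = /\[([^\]]*)\]\(M:(\d+)(?:#([^)]*))?\)/g;

function normalizeMemoLinks(markdown) {
  return markdown.replace(MEMO_LINK, (_match, text, id, heading) =>
    heading === undefined
      ? `[${text}](M:${id})`
      : `[${text}](M:${id}#${slugifyHeading(heading)})`
  );
}

console.log(normalizeMemoLinks("See [x](M:1#My Section) and the [full intro](M:1)."));
```

Slugifying an already-normalized heading leaves it unchanged, so running the pass on every save is harmless.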
Semantic linking — on every fetch, MemMolt also runs a vector KNN pass over the fetched memo's own embedding and returns up to 5 semantically similar memos (cosine similarity ≥ 0.5) in the response. No query, no keyword match — just "what else in your memory is about this thing." It's how the agent discovers context that should have been linked but wasn't, or material that was captured before the linking concept existed.
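The selection rule (at most 5 neighbors, cosine similarity of at least 0.5) can be illustrated with a brute-force sketch. MemMolt delegates the KNN pass to sqlite-vec; the function names and the toy 3-dimensional vectors below are assumptions made for readability (the real embeddings have 384 dimensions).

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Brute-force stand-in for the KNN pass: score every other memo,
// keep those at or above the threshold, return the closest `limit`.
function semanticNeighbors(source, candidates, { limit = 5, minSimilarity = 0.5 } = {}) {
  return candidates
    .filter((c) => c.id !== source.id)
    .map((c) => ({ id: c.id, similarity: cosineSimilarity(source.embedding, c.embedding) }))
    .filter((c) => c.similarity >= minSimilarity)
    .sort((a, b) => b.similarity - a.similarity)
    .slice(0, limit);
}

// Toy vectors, purely illustrative.
const memos = [
  { id: 1, embedding: [1, 0, 0] },
  { id: 2, embedding: [0.9, 0.1, 0] },
  { id: 3, embedding: [0, 1, 0] },
  { id: 4, embedding: [0.6, 0.8, 0] },
];
const neighbors = semanticNeighbors(memos[0], memos);
console.log(neighbors.map((n) => n.id));
```

Memo 3 is orthogonal to memo 1 (similarity 0), so the threshold drops it even though fewer than 5 neighbors were found.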
Together these two feed the same field on fetch_memos: the agent reads a memo and immediately sees (a) the memos this one explicitly points at and (b) the memos that are implicitly related. From there it can iterate — follow a link, fetch that memo, see its links and neighbors, keep going. The memo graph becomes navigable in both the human-curated direction and the automatic one.
Benchmarked on a realistic dataset of 1,000 memos across 10 buckets and 50 threads, measured over 10,000 hybrid search queries:
| Operation | Avg latency | p95 | p99 | Throughput |
|---|---|---|---|---|
| search_bucket (hybrid FTS5 + vec + RRF) | 5.35 ms | 7.03 ms | 8.57 ms | 187 ops/sec |
| search_memos (hybrid FTS5 + vec + RRF) | 11.14 ms | 15.06 ms | 19.63 ms | 90 ops/sec |
| search_thread (hybrid FTS5 + vec + RRF) | 14.08 ms | 17.96 ms | 24.12 ms | 71 ops/sec |
That's single-digit-to-low-teens millisecond hybrid search at every level of the hierarchy: keyword matching, semantic matching via a 384-dim vector model, and RRF fusion, all in a small fraction of the time it takes a human eye to blink.
Re-run the numbers on your own machine:
npm run benchmark

See benchmark/ for full methodology and the latest committed run.
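For readers interpreting the table, here is one way such summary statistics can be derived from raw latencies. This is not the benchmark harness itself: the function names, the nearest-rank percentile method, and the synthetic sample are assumptions. The throughput column is consistent with sequential queries, where ops/sec is roughly 1000 divided by the average latency in milliseconds.

```javascript
// Nearest-rank percentile over an ascending-sorted array of latencies (ms).
function percentile(sorted, p) {
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

function summarize(latenciesMs) {
  const sorted = [...latenciesMs].sort((a, b) => a - b);
  const avg = sorted.reduce((sum, x) => sum + x, 0) / sorted.length;
  return {
    avg,
    p95: percentile(sorted, 95),
    p99: percentile(sorted, 99),
    // Assumes queries run one after another, never concurrently.
    opsPerSec: 1000 / avg,
  };
}

// Synthetic latencies of 1..100 ms, purely for illustration.
const sample = Array.from({ length: 100 }, (_, i) => i + 1);
console.log(summarize(sample));

// Cross-check against the published search_bucket row (5.35 ms average):
console.log(Math.round(1000 / 5.35)); // 187
```

The same relation reproduces the other two rows: 1000 / 11.14 rounds to 90 and 1000 / 14.08 rounds to 71.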
MemMolt is designed to be fast, small, and forgettable. You shouldn't notice it running.
- No dedicated dev server. It's a single Node process you start with npm start.
- No background daemons. No Redis, no Postgres, no vector DB subprocess.
- No network chatter. All search and embedding happens in-process, on your machine.
- No memory bloat. One SQLite file on disk + one small transformer model in RAM (loaded lazily, only when first used).
- Fast startup. The full test suite (176 tests with an in-memory DB + mocked embedder) finishes in ~7 seconds. The server itself boots in well under a second.
If a feature would require a separate service, a heavy dependency, or would noticeably slow things down, we'd rather not add it. The whole point is that MemMolt should just work in the background without ever being the bottleneck — of your machine, your workflow, or your agent's context window.
Three commands. Claude Code handles the rest.
/plugin marketplace add rituraj-io/MemMolt
/plugin install memmolt@memmolt
/reload-plugins
Run /mcp to confirm plugin:memmolt:memmolt is connected. Done.
What happens under the hood:
- Claude Code fetches memmolt@^1.0.1 from npm into its plugin cache and builds the native dependencies (better-sqlite3, sqlite-vec, @xenova/transformers) automatically.
- The MCP server is registered and started.
- Your memory is stored at ~/.claude/plugins/data/memmolt/memmolt.sqlite, outside the plugin cache, so it survives plugin updates and reinstalls. You will not lose your memos when MemMolt is updated.
First-call note: the first tool call triggers a one-time ~90 MB embedding model download (all-MiniLM-L6-v2, cached to disk). Subsequent calls are instant.
- /mcp: shows plugin:memmolt:memmolt connected.
- Ask: "What memory tools do you have?" The agent lists the MemMolt catalog (status, search_memos, create_bucket, etc.).
- Try: "Remember that my favorite color is blue." The agent creates a bucket, thread, and memo. In a new conversation, ask "What's my favorite color?" and it searches MemMolt and recalls the answer.
If you want a standalone CLI (e.g., to use MemMolt with Claude Desktop, another MCP client, or via HTTP/SSE):
npm install -g memmolt

That gives you a memmolt command. Default is stdio; add --http for the HTTP/SSE transport.
- stdio: wire it into any MCP client's config:
  { "mcpServers": { "memmolt": { "command": "memmolt", "args": ["--stdio"] } } }
- HTTP/SSE: run memmolt --http in a terminal and point your client at http://localhost:3100/sse.
Memory is stored at ~/.memmolt/memmolt.sqlite in this mode.
git clone https://github.com/rituraj-io/MemMolt.git
cd MemMolt
npm install
npm start # HTTP/SSE on port 3100
# or:
node index.js --stdio

When running from a cloned checkout, memory lives at <repo>/.db/memmolt.sqlite (the .git marker tells MemMolt it's a dev environment).
| Symptom | Fix |
|---|---|
| /mcp shows "Failed to reconnect to memmolt" right after install | Claude Code is still building native deps. Wait ~30 seconds, then /reload-plugins and /mcp again. |
| /mcp shows two memmolt entries, one broken | You have a stale project-scoped .mcp.json in your current folder. Delete it; plugin MCP (plugin:memmolt:memmolt) is self-sufficient. |
| Tool calls hang on first use | First embedding model download is in progress (~90 MB). Give it a minute; subsequent calls are instant. |
| Port 3100 already in use (HTTP/SSE only) | Another MemMolt or unrelated process is using it. Set MEMMOLT_PORT=3200 before starting. |
| Install fails with EBUSY on Windows during plugin update | A previous MemMolt process is still running. Kill any node.exe whose command line references memmolt, delete ~/.claude/plugins/cache/memmolt/, and retry. Fixed from 1.0.1 onwards via graceful shutdown. |
MemMolt exposes 16 MCP tools. The agent doesn't need you to understand these — it figures out when to use which one — but here's the full catalog:
- status: health check, counts per entity
- search_memos: find memos by query (optionally scoped to a bucket or thread)
- search_bucket: find buckets
- search_thread: find threads
- fetch_memos: pull full content for a list of memo IDs, plus the memos they link to (direct) and the memos semantically nearest to them (semantic)
- create_bucket: new top-level category
- create_thread: new sub-topic under a bucket
- create_memo: new document under a thread; content may cross-link other memos with [text](M:<id>) or [text](M:<id>#heading)
- update_bucket: rename or re-describe a bucket
- update_thread: rename or re-describe a thread
- update_memo: update title, summary, or content of a memo
  - Supports line-level edits so the agent doesn't have to resend huge content blobs
- delete_bucket: deletes the bucket and everything inside it
- delete_thread: deletes the thread and its memos
- delete_memo: deletes a single memo
- move_thread: move a thread to a different bucket
- move_memo: move a memo to a different thread
Every tool response includes an agent_guidance field where relevant — small nudges like "consider updating the parent bucket summary" that keep the memory graph coherent over time.
| Layer | Technology |
|---|---|
| Runtime | Node.js |
| Storage | SQLite (via better-sqlite3) |
| Keyword search | SQLite FTS5 (BM25 ranking) |
| Vector search | sqlite-vec extension (384-dim embeddings) |
| Embedding model | all-MiniLM-L6-v2 via @xenova/transformers (local, in-process) |
| Search fusion | Reciprocal Rank Fusion (RRF) |
| Protocol | Model Context Protocol (MCP) |
| Transports | HTTP/SSE (default) + stdio |
Everything runs locally. No API keys required. No data leaves your machine.
memmolt/
├── .claude-plugin/ # Claude Code plugin manifest + marketplace entry
│ ├── plugin.json
│ └── marketplace.json
├── bin/
│ ├── memmolt.js # Global CLI entry (exposed via `bin` in package.json)
│ └── start.js # Legacy bootstrap (kept as a harmless fallback)
├── database/
│ ├── sqlite.js # SQLite connector, DB path resolver, sqlite-vec loader
│ └── tables/init.sql # Schema (tables, FTS5, vec0, triggers)
├── functions/
│ ├── memory/ # Domain logic (buckets, threads, memos)
│ ├── mcp/ # MCP tool handlers (thin wrappers)
│ └── utils/ # Embedder, RRF, vector sync, orphan cleanup, FTS sanitizer
├── tests/ # Jest unit tests
├── benchmark/ # Search-latency benchmark harness
├── documentations/ # Version specs (VERSION1.0.0.md, etc.)
├── assets/ # README architecture diagrams
├── index.js # MCP server entry point (stdio + HTTP/SSE)
├── package.json
├── tsconfig.json # JSDoc + TS strict-mode type checking
├── CLAUDE.md # Project conventions for Claude Code
├── CONTRIBUTING.md
├── LICENSE # MIT
└── README.md
npm start # Run the server (HTTP/SSE, port 3100)
node index.js --stdio # Run with stdio transport
npm test # Jest test suite (176 tests)
npm run test:watch # Jest in watch mode
npm run test:coverage # Tests with coverage report
npx tsc --noEmit # Type check (JSDoc + TS compiler)

| Env var | Default | Description |
|---|---|---|
| MEMMOLT_PORT | 3100 | HTTP/SSE port. |
| MEMMOLT_DB_PATH | (resolved, see below) | Path to the SQLite file. Use :memory: for tests. |
Default DB path resolution (first match wins):
1. MEMMOLT_DB_PATH env var, if set.
2. ${CLAUDE_PLUGIN_DATA}/memmolt.sqlite when running as a Claude Code plugin. This is a persistent per-plugin data directory set by Claude Code; user memory survives plugin updates and reinstalls.
3. <repo>/.db/memmolt.sqlite when running from a cloned git checkout (.git is present), so contributors running npm start locally still get the in-repo .db/ workflow.
4. ~/.memmolt/memmolt.sqlite as the safe default for npm install -g memmolt and everything else. Never inside any plugin cache.
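The "first match wins" order above can be sketched as a small pure function. This is an illustration rather than the code in database/sqlite.js: `resolveDbPath` and its parameter names are hypothetical, and the inputs are passed in explicitly so the logic can be exercised without touching the real environment.

```javascript
const path = require("node:path");

// Hypothetical helper mirroring the documented resolution order.
// The first rule that applies decides the path.
function resolveDbPath({ env, repoRoot, hasGitDir, homeDir }) {
  // 1. Explicit override (":memory:" is passed through untouched).
  if (env.MEMMOLT_DB_PATH) return env.MEMMOLT_DB_PATH;
  // 2. Persistent per-plugin data directory set by Claude Code.
  if (env.CLAUDE_PLUGIN_DATA) return path.join(env.CLAUDE_PLUGIN_DATA, "memmolt.sqlite");
  // 3. Cloned checkout: keep the database inside the repo.
  if (hasGitDir) return path.join(repoRoot, ".db", "memmolt.sqlite");
  // 4. Everything else, including global npm installs.
  return path.join(homeDir, ".memmolt", "memmolt.sqlite");
}

console.log(
  resolveDbPath({ env: {}, repoRoot: "/work/MemMolt", hasGitDir: false, homeDir: "/home/me" })
);
```

In a real process the arguments would likely come from `process.env`, `os.homedir()`, and a check for a `.git` entry next to the entry point; those wiring details are assumptions.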
- SQLite opens, sqlite-vec extension loads, schema creates (idempotent).
- Orphan cleanup sweep: removes any dangling vectors or unreferenced rows left behind by crashes or manual edits.
- MCP server starts on the chosen transport.
Unit tests cover all pure functions in functions/memory/ and functions/utils/.
The MCP wrappers are thin routing layers and aren't covered by unit tests directly — if the domain functions work, the wrappers do too.
Tests use an in-memory SQLite database and a deterministic mocked embedder, so the full suite of 176 tests runs in ~7 seconds.
MIT — do whatever you want with it, just keep the copyright notice.
Memory is only useful if it's current. Check it before you answer. Update it when you learn. Don't let it drift out of date.


