One memory vault. Every MCP client. Self-hosted.
Status: alpha — interfaces may change. Issues and PRs welcome.
Claude Code, Cursor, claude.ai web/mobile/desktop, Claude Desktop, Gemini CLI — one vault shared across all of them.
mnemon is a Model Context Protocol server with hybrid BM25 + vector search. Your data stays on your machine or your own Fly app.
Platforms: Tested on macOS 14+. Linux should work. Windows untested.
| Capability | mnemon local | mnemon web |
|---|---|---|
| Claude Code (CLI) | ✅ | ✅ |
| Claude Desktop (Mac/Win app) | ✅ | ✅ |
| Cursor | ✅ | ✅ |
| Gemini CLI | ✅ | ✅ |
| Any local MCP client | ✅ | ✅ |
| claude.ai (web) | ❌ | ✅ |
| Claude mobile app | ❌ | ✅ |
| Memory shared across laptop + desktop | ❌ | ✅ |
| Durable off-machine backup | ❌ (manual file copy) | ✅ (S3) |
| External accounts required | none | Fly.io + AWS |
| Credit card required | no | yes (both free tiers) |
| First-install setup time | ~2 min | ~15 min |
| Ongoing cost | $0 | ~$0 on free tiers |
| Cold-start latency after idle | none | 2–5s |
- mnemon local: one machine, one or more MCP clients. Zero external accounts. `pip install mnemon-memory && mnemon setup`.
- mnemon web: memory across devices + claude.ai + mobile. Requires Fly.io + AWS. `mnemon upgrade web --app-name my-mnemon`.
Start local, upgrade later — your vault rides along. mnemon downgrade local reverts. Comfortable up to ~50k memories.
pip install mnemon-memory

Optional: pip install "mnemon-memory[ui]" for the Streamlit dashboard, or [llm] for the on-device 1.7B model.
From source (e.g. to try unreleased fixes on main):
git clone https://github.com/cipher813/mnemon.git
cd mnemon
pip install -e .

For contributors (adds pytest, ruff, and other test/lint tooling):

pip install -e ".[dev]"

pip install mnemon-memory
mnemon setup

Auto-detects Claude Code, Claude Desktop, Cursor, and Gemini CLI, configures each, then runs mnemon doctor.
First memory_search takes ~10–20s (one-time FastEmbed model download). Subsequent calls are fast.
Prereqs: flyctl authenticated, aws CLI configured, an S3 bucket.
export MNEMON_S3_BUCKET=my-mnemon-vault
mnemon upgrade web --app-name my-mnemon

After it finishes, add https://my-mnemon.fly.dev/mcp to claude.ai and the Claude mobile app manually (Settings → Connectors / Connected Apps).
Rerun the same command after pip install -U mnemon-memory — upgrade web is idempotent. If the Fly app already exists, it skips the first-time steps (S3 push, volume create, client reconfigure) and just redeploys with the new version pinned. Clients keep their URL and token; the new image is picked up on the next request.
pip install -U 'mnemon-memory[server]'
mnemon upgrade web --app-name my-mnemon

mnemon downgrade local --destroy-fly-app

Pulls the Fly vault back via S3, reconfigures clients to stdio, optionally destroys the Fly app. No memories lost.
pip install "mnemon-memory[ui]"
mnemon dashboard

Streamlit UI at http://localhost:8503 with stats, search, timeline, UMAP graph view, and profile. Works against local and remote vaults.
Once configured, mnemon works automatically — memories save and surface during your sessions. You can also interact directly:
mnemon search "deployment architecture"
mnemon save "DB migration plan" "Migrate from PostgreSQL to DynamoDB in Q3"
mnemon forget 42
mnemon status

| Tool | Description |
|---|---|
| `memory_search` | Hybrid BM25 + vector search with composite scoring (relevance + recency + confidence) |
| `memory_get` | Fetch a specific memory by ID with full content |
| `memory_timeline` | Recent memories in reverse chronological order |
| `memory_related` | Find memories related to a given memory via the relationship graph |
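
The composite scoring mentioned for `memory_search` can be pictured as a weighted blend of the three signals. The sketch below is illustrative only: the `Candidate` fields, the weights, and the exponential recency term are assumptions made for this example, not mnemon's actual formula.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical search hit; field names are illustrative, not mnemon's schema."""
    memory_id: int
    relevance: float   # fused BM25 + vector relevance, assumed normalized to [0, 1]
    age_days: float    # days since the memory was last touched
    confidence: float  # stored confidence in [0, 1]


def composite_score(c: Candidate,
                    w_relevance: float = 0.6,
                    w_recency: float = 0.2,
                    w_confidence: float = 0.2,
                    recency_half_life_days: float = 30.0) -> float:
    """Blend relevance, recency, and confidence into one ranking score.

    The weights and the 30-day recency half-life are assumed values.
    """
    recency = 0.5 ** (c.age_days / recency_half_life_days)
    return (w_relevance * c.relevance
            + w_recency * recency
            + w_confidence * c.confidence)


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Return candidates ordered from highest to lowest composite score."""
    return sorted(candidates, key=composite_score, reverse=True)
```

With equal relevance and confidence, the fresher memory ranks first; changing the assumed weights shifts how much recency matters relative to the text match.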
| Tool | Description |
|---|---|
| `memory_save` | Store a new memory with content type classification and auto-embedding |
| `memory_pin` | Pin a memory to boost confidence and prevent archival |
| `memory_forget` | Soft-delete a memory (marked as invalidated, not physically removed) |
| Tool | Description |
|---|---|
| `memory_status` | Vault health stats: counts by type, vectors, pinned/invalidated |
| `memory_sweep` | Archive stale memories past their half-life (dry-run by default) |
| `memory_rebuild` | Re-embed all documents (use after upgrading embedding model) |
| Tool | Description |
|---|---|
| `memory_check_contradictions` | Check a memory for conflicts using vector similarity + LLM classification |
| `profile_get` | Synthesized user profile from stored preferences and decisions |
| `profile_update` | Manually add a fact to the user profile |
Each memory has a content type that determines its default confidence and decay half-life:
| Type | Default Confidence | Half-Life | Use for |
|---|---|---|---|
| `decision` | 0.85 | Never | Architectural choices, design decisions |
| `preference` | 0.80 | Never | User workflow habits, style preferences |
| `antipattern` | 0.80 | Never | Things that failed, approaches to avoid |
| `observation` | 0.70 | 90 days | Learned facts, discovered behaviors |
| `research` | 0.70 | 90 days | Investigation results, findings |
| `project` | 0.65 | 120 days | Project status, goals, context |
| `handoff` | 0.60 | 30 days | Session summaries for continuity |
| `note` | 0.50 | 60 days | General notes, default type |
Memories with access activity decay slower — each access extends the effective half-life by 10%, up to 3x the base value.
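
As a rough illustration of the access boost, the sketch below computes an effective half-life from the table above. Two things are assumptions here rather than documented behavior: the 10% boost is treated as additive per access (a compounding reading is also plausible), and decay is modeled as a plain exponential applied to confidence. The function names are hypothetical.

```python
# Base half-lives in days, copied from the content-type table; None means "never decays".
BASE_HALF_LIFE_DAYS = {
    "decision": None, "preference": None, "antipattern": None,
    "observation": 90, "research": 90, "project": 120,
    "handoff": 30, "note": 60,
}


def effective_half_life(content_type: str, access_count: int):
    """Half-life after access boosts: +10% per access, capped at 3x the base.

    Assumption: the boost is additive (1 + 0.1 * n), not compounding.
    """
    base = BASE_HALF_LIFE_DAYS[content_type]
    if base is None:
        return None
    multiplier = min(1.0 + 0.10 * access_count, 3.0)
    return base * multiplier


def decayed_confidence(confidence: float, content_type: str,
                       age_days: float, access_count: int = 0) -> float:
    """Assumed exponential decay: the value halves once per effective half-life."""
    half_life = effective_half_life(content_type, access_count)
    if half_life is None:
        return confidence  # types marked "Never" keep their confidence
    return confidence * 0.5 ** (age_days / half_life)
```

Under these assumptions a `note` accessed five times has a 90-day effective half-life instead of 60, and the cap is reached after 20 accesses.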
When configured via mnemon setup, the following hooks are installed on Claude Code:
| Hook | Event | Timeout | Mode | Description |
|---|---|---|---|---|
| Health warm-keeper | UserPromptSubmit | 40s | remote only | curl /health to wake/keep the Fly machine warm. Runs first so the MCP call below has a warm machine. |
| Context surfacing | UserPromptSubmit | 8s | both | Searches vault and injects relevant memories as context |
| Session pre-warm | SessionStart | 90s | remote only | Polls /health for up to 60s in the background so the first prompt of the session lands on a warm machine |
| Session extractor | Stop | 30s | both | Extracts decisions, preferences, and observations from the transcript |
| Handoff generator | Stop | 30s | both | Creates a session summary for the next session |
The extractor and handoff generator use LLM-based extraction when the mnemon-memory[llm] extra is installed, and fall back to regex/heuristic extraction otherwise.
A self-hosted mnemon Fly app with auto_stop_machines = "stop" (the default in fly.toml.example) will autostop after a few minutes of idle. The warm-keeper resets Fly's idle timer on every prompt, so the machine stays warm during an active Claude Code session and only autostops once you've been idle for a while. Cost stays the same — Fly bills only running time — but you get reliable mid-session access without paying for an always-on machine. The || true ensures a slow Fly cold-start never blocks your prompt.
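
The warm-keeper idea (ping `/health`, never fail the prompt) can be sketched in Python as follows. This is not the hook mnemon installs, which is a shell `curl` command; `keep_warm` is a hypothetical helper, and serving `/health` from the app's root URL is an assumption based on the hook description.

```python
import http.client
import urllib.error
import urllib.request


def keep_warm(base_url: str, timeout_s: float = 40.0) -> bool:
    """Ping <base_url>/health and report whether it answered with HTTP 200.

    Never raises: network errors and timeouts are swallowed, mirroring the
    `|| true` in the shell hook so a slow cold start cannot block the prompt.
    """
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Unreachable host, timeout, malformed response, or bad URL: treat as "not warm".
        return False
```

A call such as `keep_warm("https://my-mnemon.fly.dev")` would return `True` once the machine is up and `False` while it is still starting or unreachable. The 40s default matches the timeout listed for the warm-keeper hook.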
The server persists every issued MCP session ID to <vault_dir>/mcp_sessions.sqlite (7-day TTL). When a request bearing a known-but-not-in-memory session ID arrives at a fresh process — typical after a cold-stop or redeploy — the session is transparently resumed: a new transport is spawned with the same ID, and the underlying ServerSession is born already-initialized so tool calls succeed without a re-handshake. The MCP client sees no break in continuity. This is the safety net under the warm-keeper, not a replacement for it.
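
A minimal sketch of the persistence side of this, assuming a single-table SQLite schema. The `SessionStore` class, the table layout, and the method names are hypothetical; only the 7-day TTL and the use of SQLite come from the description above. Respawning the transport is not shown.

```python
import sqlite3
import time
from typing import Optional

SESSION_TTL_SECONDS = 7 * 24 * 60 * 60  # 7-day TTL, as described above


class SessionStore:
    """Hypothetical persistence layer for issued MCP session IDs."""

    def __init__(self, path: str = ":memory:") -> None:
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS sessions ("
            "session_id TEXT PRIMARY KEY, issued_at REAL NOT NULL)"
        )

    def remember(self, session_id: str, now: Optional[float] = None) -> None:
        """Record a newly issued session ID (or refresh an existing one)."""
        issued_at = time.time() if now is None else now
        with self._db:  # commits on success, rolls back on error
            self._db.execute(
                "INSERT OR REPLACE INTO sessions VALUES (?, ?)",
                (session_id, issued_at),
            )

    def is_resumable(self, session_id: str, now: Optional[float] = None) -> bool:
        """True if the ID was issued earlier and is still inside the TTL."""
        current = time.time() if now is None else now
        row = self._db.execute(
            "SELECT issued_at FROM sessions WHERE session_id = ?", (session_id,)
        ).fetchone()
        return row is not None and current - row[0] <= SESSION_TTL_SECONDS
```

A fresh process would call something like `is_resumable` when it sees a session ID it does not hold in memory, and only then spawn a transport under the same ID. Unknown or expired IDs fall through to a normal handshake.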
The Claude Code hook above gives unconditional pre-LLM retrieval — every prompt triggers a memory_search before the model sees the question, and the results are injected into context. No other MCP client supports this flow today, including Claude Desktop, claude.ai web, the Claude mobile app, Cursor, and Gemini CLI.
Why: the hook works because Claude Code exposes a UserPromptSubmit lifecycle event that runs a subprocess between the user pressing Enter and the model being invoked. Desktop and the other clients only expose the standard MCP surfaces — tools (model-decided), prompts (user-invoked via slash menu), and resources (client-pulled). None of these fire automatically on every prompt, so there is no architectural place for an MCP server to insert "always-on" recall. Aggressively rewriting the memory_search tool description to coerce the model into calling it first is rejected by design — it pollutes the tool surface for every other consumer and is still model-decided.
Same snippet for every non-Code client — paste it into whichever custom-instructions / rules / memory surface that client exposes:
> Before responding to any of my prompts, call the `memory_search` tool from the mnemon MCP server using relevant terms from my question. Use the returned memories to inform your response. If `memory_search` returns nothing useful, proceed without it.
| Client | Need custom instructions? | Where to paste |
|---|---|---|
| Claude Code | No: the UserPromptSubmit hook handles it unconditionally. | n/a |
| Claude Desktop | Yes | Settings → Profile → "What personal preferences should Claude consider in responses?" |
| claude.ai (web) | Yes | Same Profile field as Desktop — it's shared across your Anthropic account. Per-Project instructions also work and override the Profile within that Project. |
| Claude mobile app | Yes | Inherits from the same Profile field — set it once on Desktop or claude.ai and mobile picks it up. |
| Cursor | Yes | Settings → Rules → User Rules (global) or a .cursor/rules/*.mdc file in the project (workspace-scoped). |
| Gemini CLI | Yes | ~/.gemini/GEMINI.md (global) or a project-rooted GEMINI.md. |
In every non-Code client the call is still model-decided — it will skip on short prompts, follow-ups, or when distracted. Not a substitute for a real lifecycle hook.
Anthropic would need to add a pre-prompt lifecycle hook to Claude Desktop — the Desktop-side equivalent of Claude Code's UserPromptSubmit. Once that surface exists, mnemon's existing context_surfacing hook can be wired to it directly. Until then, Claude Code is the only client with guaranteed pre-LLM injection; everywhere else, retrieval is model-decided.
Remove mnemon state from this machine. Nothing user-owned in the cloud is touched.
mnemon uninstall [--yes] [--keep-vault]

Removes:

- `~/.mnemon/`: vault (SQLite + vectors), archive/, remote_url, local_token, models cache. With `--keep-vault`, this directory is preserved.
- Claude Code MCP registration (`claude mcp remove --scope user mnemon`).
- mnemon hook + mcpServers entries in `~/.claude/settings.json`.
- mnemon entry in `~/.cursor/mcp.json`.
- mnemon entry in Claude Desktop's config.

Does not remove:

- The `mnemon-memory` Python package. Use `pip uninstall mnemon-memory` separately.
- Your Fly.io app. Destroy it first with `mnemon downgrade local --destroy-fly-app` if you want the app gone; that pulls the remote vault back to local so no memories are lost.
- Your S3 bucket contents. mnemon has no `sync delete`.
- claude.ai + Claude mobile MCP entries. These live in your Anthropic account. `claude mcp list` shows them with a `claude.ai` prefix. Remove via Settings → Connectors in the claude.ai web UI. If `mnemon uninstall` detects one, it surfaces a `⚠ REQUIRED` bullet pointing you there.

| Command | Local `~/.mnemon/default.sqlite` | Fly volume | S3 bucket contents |
|---|---|---|---|
| `mnemon uninstall` | deleted (unless `--keep-vault`) | untouched | untouched |
| `mnemon uninstall --keep-vault` | untouched | untouched | untouched |
| `mnemon downgrade local` | replaced with Fly state (via S3 pull) | untouched (keeps running) | untouched |
| `mnemon downgrade local --destroy-fly-app` | replaced with Fly state | destroyed (after data was pulled to local) | untouched |
| `mnemon upgrade web` | archived to `archive/pre-web-<date>.sqlite` | newly created, seeded from S3 | written to (push) |
| `mnemon sync push` / `mnemon sync pull` | read/write local | n/a | read/write |

Memories are always recoverable as long as at least one of {S3 backup, Fly volume, local vault, local archive} exists.
Test from scratch on one machine:
mnemon uninstall --yes
pip install -e . # or: pip install mnemon-memory
mnemon setup

Stop using mnemon entirely:
mnemon downgrade local --destroy-fly-app # tears down Fly, preserves vault via S3 pull
mnemon uninstall --yes # removes local state
pip uninstall mnemon-memory # removes the package
# Then remove mnemon entries in claude.ai and the Claude mobile app manually.
# Delete your S3 bucket contents if you want no residual memory data.

Move to a new machine (preserve all memories):
# Old machine:
mnemon sync push
mnemon uninstall --yes
# New machine:
pip install mnemon-memory
mnemon setup claude-code --remote-url https://<your-app>.fly.dev/mcp

The [ui] extra pulls in umap-learn, which requires numba + llvmlite. Starting with numba 0.63, those packages only ship macOS wheels for Apple Silicon (arm64). On an Intel Mac (x86_64), pip falls back to a source build and fails with "llvmlite needs CMake tools to build".
Pin to the last versions that ship x86_64 macOS wheels:
pip install 'numba==0.62.1' 'llvmlite==0.45.1' 'mnemon-memory[ui]'

If pip then complains about NumPy, add 'numpy<2.3' to the same command.
Apple Silicon, Linux, and Windows are unaffected — prebuilt wheels exist on those platforms.
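
For scripted installs, the platform check behind this workaround could look like the sketch below. The helper names are hypothetical and mnemon does not ship this; the pinned versions are the ones listed above.

```python
import platform


def needs_numba_pin(system: str, machine: str) -> bool:
    """True on Intel macOS, where numba 0.63+ ships no prebuilt wheels."""
    return system == "Darwin" and machine == "x86_64"


def ui_install_command(system: str, machine: str) -> str:
    """Pick the pip command for the [ui] extra on the given platform."""
    if needs_numba_pin(system, machine):
        return "pip install 'numba==0.62.1' 'llvmlite==0.45.1' 'mnemon-memory[ui]'"
    return "pip install 'mnemon-memory[ui]'"


if __name__ == "__main__":
    # platform.system() is "Darwin" on macOS; platform.machine() is
    # "x86_64" on Intel Macs and "arm64" on Apple Silicon.
    print(ui_install_command(platform.system(), platform.machine()))
```

Passing the platform values in as arguments keeps the decision testable on any machine.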
pip install -e ".[dev]"
pytest

License: MIT