claude-memory-stack

Stop Claude from forgetting. Ship faster with fewer tokens.

A portable, deployable 3-layer memory architecture for Claude Code that reduces token consumption by ~81% (in the 25-turn sessions reported under Impact) and lets your AI assistant learn from every session.

The Problem

Claude Code forgets everything between sessions. Every new conversation starts from zero — re-reading files, re-discovering architecture, re-learning your preferences. You burn tokens repeating context that should already be known.

Vector databases alone don't fix this. The problem isn't storage — it's retrieval architecture. Dumping embeddings into a single store creates noise, retrieval latency, and context window bloat.

What you need is a tiered system that loads the right memory at the right time, at the right cost.

The Solution

3 layers. Each serves a different purpose. Each has a different cost.

ASCII version (for terminals)

┌──────────────────────────────────────────────────────────────────┐
│                       Claude Code Session                        │
│                                                                  │
│  ┌────────────────────────────────────────────────────────────┐  │
│  │  L1: Always-Loaded Context              COST: FREE        │  │
│  │  CLAUDE.md + memory-bank/*.md + tech-tips/*.md             │  │
│  │  Preferences, decisions, patterns, strategies              │  │
│  └────────────────────────────────────────────────────────────┘  │
│                              │                                    │
│                   falls through if not found                      │
│                              ▼                                    │
│  ┌────────────────────────────────────────────────────────────┐  │
│  │  L2: Code Intelligence Pipeline         COST: AUTO        │  │
│  │  code-review-graph ──▶ codebase-memory-mcp                 │  │
│  │  Blast-radius scoping → targeted code retrieval            │  │
│  └────────────────────────────────────────────────────────────┘  │
│                              │                                    │
│                   falls through if not found                      │
│                              ▼                                    │
│  ┌────────────────────────────────────────────────────────────┐  │
│  │  L3: Knowledge + Learning               COST: ON-DEMAND   │  │
│  │  RuVector PostgreSQL with ReasoningBank                    │  │
│  │  Vectors + patterns + learning + graph ops                 │  │
│  └────────────────────────────────────────────────────────────┘  │
│                                                                  │
│  ┌───────────────────────┐  ┌─────────────────────────────────┐ │
│  │  Ruflo Orchestration  │  │  Superpowers (8 skills)          │ │
│  │  Swarms + Q-Learning  │  │  TDD, Debug, Review, Verify     │ │
│  └───────────────────────┘  └─────────────────────────────────┘ │
│                                                                  │
│  ┌────────────────────────────────────────────────────────────┐  │
│  │  Security: no-leak hook blocks .env, .key, .pem, secrets  │  │
│  └────────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────┘

Impact

Measured on real-world coding sessions with 25 turns.

Metric	Without Stack	With Stack	Improvement
Tokens per session	~380,000	~72,000	-81%
System prompt per turn (tokens)	22,000	16,500	-25%
Code exploration task (tokens)	65,000	2,300	-96%
Memory queries per session	30	0–8	-73% to -100%
Effective sessions per $ budget	1x	~5x	+400%
Bug-related rework	~40%	~10%	-75%

L1 eliminates repeated context. L2 replaces multi-round grep with single graph queries. L3 surfaces historical knowledge only when needed, instead of stuffing the prompt with everything.
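The Improvement column can be re-derived from the two measurement columns. The snippet below is only an arithmetic check on the figures in the table; the helper name `pct_change` is ours, and the memory-query row uses the upper end of the 0–8 range.

```python
def pct_change(before: float, after: float) -> int:
    """Percentage change from `before` to `after`, rounded to a whole percent."""
    return round((after - before) / before * 100)

# (without stack, with stack), copied from the Impact table.
rows = {
    "Tokens per session": (380_000, 72_000),
    "System prompt per turn": (22_000, 16_500),
    "Code exploration task": (65_000, 2_300),
    "Memory queries per session (upper bound)": (30, 8),
}

changes = {name: pct_change(before, after) for name, (before, after) in rows.items()}
for name, change in changes.items():
    print(f"{name}: {change}%")

# Sessions per budget is the inverse ratio of tokens per session.
sessions_per_budget = 380_000 / 72_000
print(f"Sessions per budget: ~{sessions_per_budget:.1f}x")
```

The last line gives roughly 5.3x, which the table rounds down to ~5x.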

Quick Start

Prerequisites

macOS or Linux
Node.js 20+
Docker Desktop (running)
Claude Code CLI (npm install -g @anthropic-ai/claude-code)
jq (macOS: brew install jq; Linux: install jq with your distribution's package manager)

Install

git clone https://github.com/rajat1021/claude-memory-stack.git
cd claude-memory-stack
./install.sh

Verify

./verify.sh

Set GitHub Token (optional)

# zsh shown here; bash users can append the export line to ~/.bashrc instead
echo 'export GITHUB_TOKEN="ghp_your_token"' >> ~/.zshenv
source ~/.zshenv
./install.sh  # re-run to inject token

What the installer does

Checks prerequisites (Node.js 20+, Docker, Claude CLI, jq)
Installs MCP servers (codebase-memory-mcp, code-review-graph, ruvector)
Starts RuVector PostgreSQL container (port 5433, auto-restart on boot)
Configures Docker Desktop to start on login
Installs Ruflo orchestration
Backs up existing config (~/.claude/CLAUDE.md, settings.json, .mcp.json)
Deploys new config with trimmed system prompt
Deploys hooks (2 instead of 7), skills, commands, tech-tips
Installs Superpowers plugin
Writes install marker and runs verification

Idempotent — safe to run multiple times.

Architecture Deep Dive

Layer 1: Always-Loaded Context

~/.claude/CLAUDE.md and the files matching ~/.claude/memory-bank/*.md are loaded into every session's system prompt automatically. This is where you store:

Coding style preferences and conventions
Project-specific protocols and state
Architecture Decision Records (ADRs)
Tech-tips (cross-project gotchas)

Cost: Free at query time. The files are part of the system prompt, so there are no queries and no retrieval latency; they only consume system-prompt tokens.
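To get a feel for how much of the system prompt Layer 1 occupies, the always-loaded files can be concatenated and sized. This is a minimal sketch, assuming the file layout described above. The 4-characters-per-token ratio is a rough rule of thumb rather than Claude's tokenizer, and `load_l1_context` and `estimate_tokens` are hypothetical helper names, not part of the installer.

```python
from pathlib import Path

def load_l1_context(claude_dir: Path) -> str:
    """Concatenate CLAUDE.md and memory-bank/*.md in a stable order."""
    files = [claude_dir / "CLAUDE.md"]
    bank = claude_dir / "memory-bank"
    if bank.is_dir():
        files += sorted(bank.glob("*.md"))
    parts = []
    for path in files:
        if path.is_file():
            parts.append(f"# {path.name}\n{path.read_text(encoding='utf-8')}")
    return "\n\n".join(parts)

def estimate_tokens(text: str, chars_per_token: int = 4) -> int:
    """Very rough estimate (assumption: about 4 characters per token)."""
    return len(text) // chars_per_token
```

Calling `estimate_tokens(load_l1_context(Path.home() / ".claude"))` gives a ballpark figure to compare against the per-turn system prompt size in the Impact table.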

Layer 2: Code Intelligence Pipeline

Two MCP servers chained for maximum token efficiency:

Stage 1 — code-review-graph (scoping)

Blast-radius analysis: "this change affects 4 functions in 3 files"
Risk scoring: "high: payment_handler, low: logger"
22 MCP tools, 19 languages

Stage 2 — codebase-memory-mcp (precision retrieval)

get_code_snippet(): exact function code (~200 tokens vs ~5,000 for full file)
trace_call_path(): who calls what, depth-controlled
14 MCP tools, 66 languages

Together: Know WHAT to look at, then get ONLY that code.

Cost: 1 MCP call (~200–500 tokens) instead of 10+ grep/read cycles (~20,000–65,000 tokens).
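The scope-then-retrieve idea can be illustrated on a toy call graph. The sketch below does not call either MCP server: the graph, the source index, and the functions `blast_radius` and `fetch_snippets` are hypothetical stand-ins that only show the shape of the two stages.

```python
from collections import deque

# Toy call graph: caller -> callees (hypothetical example data).
CALLS = {
    "checkout": ["payment_handler", "logger"],
    "payment_handler": ["charge_card", "logger"],
    "refund": ["charge_card"],
    "charge_card": [],
    "logger": [],
}

# Toy source index: function name -> source text (hypothetical).
SOURCES = {name: f"def {name}(): ..." for name in CALLS}

def blast_radius(changed: str, max_depth: int = 2) -> list:
    """Stage 1: the changed function plus its callers, up to max_depth hops."""
    callers = {}
    for caller, callees in CALLS.items():
        for callee in callees:
            callers.setdefault(callee, []).append(caller)
    seen, queue = {changed}, deque([(changed, 0)])
    while queue:
        name, depth = queue.popleft()
        if depth == max_depth:
            continue  # depth limit reached, do not expand further
        for caller in callers.get(name, []):
            if caller not in seen:
                seen.add(caller)
                queue.append((caller, depth + 1))
    return sorted(seen)

def fetch_snippets(names: list) -> dict:
    """Stage 2: return only the source of the scoped functions."""
    return {name: SOURCES[name] for name in names}

affected = blast_radius("charge_card")
snippets = fetch_snippets(affected)
print(affected)
```

Here a change to `charge_card` pulls in its direct and indirect callers, while `logger` is never retrieved because nothing in the change reaches it.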

Layer 3: Knowledge + Learning

RuVector PostgreSQL — replaces both pgvector (dumb storage) and memorygraph (separate knowledge graph) with one intelligent system:

Feature	What It Does
Vector search	HNSW-indexed similarity on insights, note_chunks, observations
ReasoningBank	RETRIEVE → JUDGE → DISTILL → ROUTE learning loop
Pattern tracking	Confidence scores, success/failure counts per insight
Graph operations	Relationships between insights — replaces memorygraph
Agent routing	Auto-route tasks to optimal model via Q-Learning

Cost: Queried only after the user approves, except when the request contains a trigger phrase ("history", "what happened last time", "insights", or "patterns").
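The pattern-tracking row in the feature table mentions confidence scores derived from success and failure counts. One way such a score could be computed is sketched below. The Laplace-smoothed ratio is our assumption, not necessarily the formula RuVector uses, and the `Insight` class is hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Insight:
    text: str
    successes: int = 0
    failures: int = 0

    def record(self, worked: bool) -> None:
        """Count one more outcome for this insight."""
        if worked:
            self.successes += 1
        else:
            self.failures += 1

    @property
    def confidence(self) -> float:
        """Laplace-smoothed success rate; 0.5 when there is no evidence yet."""
        return (self.successes + 1) / (self.successes + self.failures + 2)

insight = Insight("Run migrations before seeding the test database")
for outcome in (True, True, False, True):
    insight.record(outcome)
print(round(insight.confidence, 2))
```

Smoothing keeps a brand-new insight at a neutral 0.5 instead of jumping to 0 or 1 after a single outcome.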

Memory Retrieval Protocol

L1 (check first, free) → L2 (auto for code) → L3 (ask user first)

Layer	Trigger	Cost
L1	Always loaded — check first	Free
L2	Auto-fires on code questions	Only when relevant
L3	Asks user before querying	Only when approved

L3 Exception: If the user says "history", "what happened last time", "insights", or "patterns" — query L3 directly without asking.
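The protocol above amounts to a small decision function. The sketch below is one reading of it: trigger phrases are checked first so that they reach L3 directly, and the two boolean inputs stand in for checks the text does not specify (whether L1 already holds the answer, and whether the query is a code question). The function name and return labels are hypothetical.

```python
from typing import Optional

L3_TRIGGERS = ("history", "what happened last time", "insights", "patterns")

def choose_layer(
    query: str,
    l1_has_answer: bool,
    is_code_question: bool,
    user_approved_l3: Optional[bool] = None,
) -> str:
    """Return 'L1', 'L2', 'L3', 'ask-user', or 'none' for a single query."""
    text = query.lower()
    # L3 exception: trigger phrases skip the approval step entirely.
    if any(trigger in text for trigger in L3_TRIGGERS):
        return "L3"
    if l1_has_answer:
        return "L1"
    if is_code_question:
        return "L2"
    # No approval decision yet: the assistant has to ask before querying L3.
    if user_approved_l3 is None:
        return "ask-user"
    return "L3" if user_approved_l3 else "none"

print(choose_layer("What happened last time we migrated?", False, False))
```

Plain substring matching is deliberately crude here; a query such as "design patterns in this module" would also trigger L3, so a real implementation would need a stricter test.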

Orchestration: Ruflo

Swarm coordination — Queen/Worker hierarchy for parallel work
Q-Learning task routing — right model for right task (Opus for hard, Haiku for simple)
ReasoningBank — self-learning from every session
Multi-model selection — 30–50% token savings via intelligent routing
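Q-Learning task routing can be pictured as a table of estimated values per (task type, model) pair that is nudged toward observed rewards. The sketch below is a generic one-step tabular update, not Ruflo's implementation; the `Router` class, the reward values, and the task labels are all hypothetical.

```python
import random
from collections import defaultdict

MODELS = ("haiku", "opus")

class Router:
    def __init__(self, alpha: float = 0.5, epsilon: float = 0.1, seed: int = 0):
        self.q = defaultdict(float)  # (task_type, model) -> estimated value
        self.alpha = alpha           # learning rate
        self.epsilon = epsilon       # exploration probability
        self.rng = random.Random(seed)

    def choose(self, task_type: str) -> str:
        """Epsilon-greedy choice between the available models."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(MODELS)
        return max(MODELS, key=lambda m: self.q[(task_type, m)])

    def update(self, task_type: str, model: str, reward: float) -> None:
        """One-step update: each routing decision ends the episode,
        so the learning target is just the observed reward."""
        key = (task_type, model)
        self.q[key] += self.alpha * (reward - self.q[key])

router = Router(epsilon=0.0)           # no exploration, for a repeatable demo
router.update("hard", "opus", 1.0)     # solved the hard task
router.update("hard", "haiku", -1.0)   # failed the hard task
router.update("simple", "haiku", 1.0)  # solved cheaply
router.update("simple", "opus", 0.2)   # solved, but at higher cost
print(router.choose("hard"), router.choose("simple"))
```

After these four updates the router prefers `opus` for hard tasks and `haiku` for simple ones, matching the "Opus for hard, Haiku for simple" behaviour described above.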

Discipline: Superpowers (8 skills)

Skill	What It Enforces
Brainstorming	Explore intent before implementation
Writing Plans	Structured specs before code
TDD	No production code without failing test first
Systematic Debugging	Root cause before fix attempts
Verification	Run commands + confirm output before claiming done
Code Review (give)	Verify work meets requirements
Code Review (receive)	Verify feedback technically before implementing
Git Worktrees	Isolated feature work with safety checks

Components

Component	Type	Source	Role
codebase-memory-mcp	MCP server	DeusData/codebase-memory-mcp	L2 — code graph, 66 languages, sub-ms queries
code-review-graph	MCP server	tirth8205/code-review-graph	L2 — blast-radius scoping, risk scoring
ruvector	MCP + PostgreSQL	ruvnet/ruflo	L3 — vectors, learning, graph, routing
github MCP	MCP server	@anthropic-ai/github-mcp-server	GitHub integration — PRs, issues, CI
ruflo	Orchestration	ruvnet/ruflo	Swarms, Q-Learning routing, ReasoningBank
superpowers	Plugin (8 skills)	obra/superpowers	TDD, debugging, verification, code review
no-leak	Hook	Custom	Blocks .env, .pem, .key, credentials
auto-index	Hook	Custom	Re-indexes the codebase on session start if the index is stale (>24h)

Per-Project Isolated Memory

Every project gets its own memory-bank. Switch projects, switch context — instantly. No cross-contamination.

Bootstrap a New Project

cd ~/my-new-project
claude /init-project my-project

Creates (copied from ~/.claude/templates/project/ with placeholders substituted):

my-project/
├── .mcp.json                              # code-review-graph + codebase-memory + ruvector
├── CLAUDE.md                              # Project prefs + @imports
└── .claude/
    └── memory-bank/
        ├── architecture/system-overview.md
        ├── decisions/README.md            # ADR log
        ├── patterns/coding-standards.md
        └── troubleshooting/known-issues.md

Placeholders substituted on init: __PROJECT_NAME__, __CMS_CODEBASE_MEMORY_BIN__ (from `which codebase-memory-mcp`), and __CMS_DB_PASSWORD__ (from $CMS_DB_PASSWORD, or the default if it is unset).

Then auto-runs code-review-graph build and codebase-memory-mcp index.
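The placeholder substitution is plain string replacement over each template file. The sketch below shows the idea for a single file's text. `render_template` is a hypothetical helper, and the fallback password `"changeme"` is a made-up stand-in, not the project's real default.

```python
import os
import shutil

def render_template(text: str, project_name: str) -> str:
    """Replace the three init placeholders in one template file's text."""
    values = {
        "__PROJECT_NAME__": project_name,
        # Same lookup as `which codebase-memory-mcp`; falls back to the bare name.
        "__CMS_CODEBASE_MEMORY_BIN__": shutil.which("codebase-memory-mcp")
        or "codebase-memory-mcp",
        # Assumption: "changeme" stands in for whatever default the installer uses.
        "__CMS_DB_PASSWORD__": os.environ.get("CMS_DB_PASSWORD", "changeme"),
    }
    for placeholder, value in values.items():
        text = text.replace(placeholder, value)
    return text

print(render_template("# __PROJECT_NAME__ memory bank", "my-project"))
```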

Built-in Commands

Command	Usage	What It Does
`/init-project`	`claude /init-project my-app`	Bootstrap 3-layer architecture for any project — creates CLAUDE.md, .mcp.json, memory-bank/, and indexes the codebase
`/tech-tip`	`claude /tech-tip python`	Capture a technology-specific gotcha to the shared tips library at `~/.claude/memory-bank/tech-tips/`

Hooks (auto-fire, no manual action)

Hook	Fires On	What It Does
no-leak	Every file read/write/edit/bash	Blocks access to `.env`, `.pem`, `.key`, `credentials.json` — prevents accidental secret exposure
auto-index	Session start	Re-indexes codebase graph if stale (>24h) — keeps L2 code intelligence fresh
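Both hooks reduce to a simple predicate. The sketch below shows the two checks in isolation, leaving out how they are wired into Claude Code. Matching on the file name only is our assumption, and `is_blocked` and `is_stale` are hypothetical names.

```python
import time
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import Optional

# Patterns taken from the hook table above.
BLOCKED_PATTERNS = (".env", "*.pem", "*.key", "credentials.json")

def is_blocked(path: str) -> bool:
    """no-leak: True if the file name matches a secret-file pattern."""
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in BLOCKED_PATTERNS)

STALE_AFTER_SECONDS = 24 * 60 * 60  # the ">24h" threshold from the table

def is_stale(last_indexed_at: float, now: Optional[float] = None) -> bool:
    """auto-index: True if the last index run is more than 24 hours old."""
    current = time.time() if now is None else now
    return current - last_indexed_at > STALE_AFTER_SECONDS

print(is_blocked("config/.env"), is_blocked("src/app.py"))
```

Because only the file name is matched, `keyboard.py` is allowed while `server.key` is blocked.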

Skills (invoked automatically when relevant)

Skill	Trigger	Purpose
codebase-memory-exploring	Code exploration questions	Guides Claude to use graph search instead of grep
codebase-memory-tracing	"Who calls this function?"	Traces call paths via graph instead of multi-round grep
codebase-memory-quality	Code quality questions	Dead code detection, complexity analysis via graph
codebase-memory-reference	MCP tool usage questions	Reference guide for graph query syntax
defuddle	URL content extraction	Clean markdown from web pages, saves tokens vs raw HTML

Migration

From existing pgvector

./migration/migrate-pgvector.sh

Migrates source tables into insights, note_chunks, and observations on RuVector PostgreSQL (port 5433). Configure source creds and table names via env vars — see migration/migrate-pgvector.sh.

From memorygraph

./migration/migrate-memorygraph.sh

Exports entities and relations from memorygraph SQLite databases into RuVector's patterns table.

From vanilla Claude Code

Just run ./install.sh. It backs up your existing config before deploying. Originals saved as *.bak.<timestamp> in ~/.claude/.
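The backup step can be pictured as a copy to a timestamped sibling file. In this sketch only the `*.bak.<timestamp>` naming comes from the text above; the timestamp format and the `backup_file` helper are our assumptions about how an installer might do it.

```python
import shutil
import time
from pathlib import Path
from typing import Optional

def backup_file(path: Path, timestamp: Optional[str] = None) -> Optional[Path]:
    """Copy `path` to `<name>.bak.<timestamp>` next to it; skip missing files."""
    if not path.is_file():
        return None
    # Assumption: a sortable local-time stamp such as 20250101-120000.
    stamp = timestamp or time.strftime("%Y%m%d-%H%M%S")
    target = path.with_name(f"{path.name}.bak.{stamp}")
    shutil.copy2(path, target)  # copy2 also preserves modification times
    return target
```

For example, `backup_file(Path.home() / ".claude" / "settings.json")` would leave the original in place and return the path of the new backup, or `None` if there was nothing to back up.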

Rollback

./migration/rollback.sh   # Restore config backups
./uninstall.sh             # Full removal + Docker cleanup

Configuration

Global (all projects)

File	Location	Purpose
`CLAUDE.md`	`~/.claude/CLAUDE.md`	Preferences, protocols, memory retrieval rules
`settings.json`	`~/.claude/settings.json`	Hooks, permissions, plugins
`.mcp.json`	`~/.claude/.mcp.json`	Global MCP servers (GitHub)
`memory-bank/`	`~/.claude/memory-bank/`	Tech-tips, shared knowledge

Per-project (overrides global)

File	Location	Purpose
`CLAUDE.md`	`<project>/CLAUDE.md`	Project-specific instructions
`.mcp.json`	`<project>/.mcp.json`	Project MCP servers (code-review-graph, codebase-memory, ruvector)
`memory-bank/`	`<project>/.claude/memory-bank/`	Decisions, patterns, troubleshooting
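"Overrides global" can be read as a keyed merge in which project entries win on name clashes. The sketch below illustrates that reading for `.mcp.json`. The top-level `mcpServers` key is the layout we assume for the file, the server commands are invented, and how Claude Code itself resolves precedence is not specified in this document.

```python
def merge_mcp_config(global_cfg: dict, project_cfg: dict) -> dict:
    """Combine server maps; a project entry replaces a global one of the same name."""
    servers = dict(global_cfg.get("mcpServers", {}))
    servers.update(project_cfg.get("mcpServers", {}))
    return {"mcpServers": servers}

# Hypothetical example data: commands are placeholders, not real binaries.
global_cfg = {"mcpServers": {"github": {"command": "global-github-mcp"}}}
project_cfg = {
    "mcpServers": {
        "ruvector": {"command": "ruvector-mcp"},
        "github": {"command": "project-github-mcp"},
    }
}

merged = merge_mcp_config(global_cfg, project_cfg)
print(sorted(merged["mcpServers"]))
```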

Uninstall

./uninstall.sh

Stops the Docker container, restores backed-up configs, and removes hooks, skills, and commands. Restart Claude Code afterwards.

Credits

Built by Rajat Tanwar

Powered by:
codebase-memory-mcp by DeusData • code-review-graph by Tirth Patel
ruflo by Ruv • superpowers by Jesse Vincent
Claude Code by Anthropic
