RLM: Recursive Language Model Analysis + Episodic Memory

Mission Statement

RLM enables AI agents to work beyond their context windows while keeping full-fidelity memory across sessions.

Core Innovation: Two-Part System

1. Best-in-Class RLM Analysis

Implement the recursive analysis technique from the research paper:

Scan → Chunk → Extract → Analyze large codebases and documents without context bloat
Parallel subagent dispatch for efficient multi-scale analysis
Never load full content into main context - orchestrator sees metadata, subagents see content
Multiple chunking strategies: files, functions, headings, semantic, lines
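The split between what the orchestrator sees and what subagents see can be sketched as below. This is a minimal illustration, not the project's implementation: it covers only the simplest strategy (fixed line ranges), and `ChunkMeta`, `chunk_by_lines`, `extract`, and `dispatch` are hypothetical names. The `worker` callable stands in for a subagent call.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable


@dataclass
class ChunkMeta:
    """What the orchestrator sees: position and size, never the text itself."""
    index: int
    start_line: int  # 1-based, inclusive
    end_line: int    # 1-based, inclusive
    n_chars: int     # characters in the chunk, excluding newlines


def chunk_by_lines(text: str, max_lines: int = 200) -> list[ChunkMeta]:
    """Split text into fixed-size line ranges and return metadata only."""
    lines = text.splitlines()
    metas = []
    for i, start in enumerate(range(0, len(lines), max_lines)):
        block = lines[start:start + max_lines]
        n_chars = sum(len(line) for line in block)
        metas.append(ChunkMeta(i, start + 1, start + len(block), n_chars))
    return metas


def extract(text: str, meta: ChunkMeta) -> str:
    """What a subagent sees: the content of exactly one chunk."""
    lines = text.splitlines()
    return "\n".join(lines[meta.start_line - 1:meta.end_line])


def dispatch(text: str, metas: list[ChunkMeta],
             worker: Callable[[str], object], max_workers: int = 4) -> list:
    """Run the (hypothetical) subagent worker on every chunk in parallel.

    Results come back in chunk order because Executor.map preserves order.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda m: worker(extract(text, m)), metas))
```

In this sketch the orchestrator would plan from the `ChunkMeta` list alone and pass only the small per-chunk results back into its own context.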

2. Episodic Memory (The Innovation)

Automatic persistent memory that survives context limits and session boundaries:

Auto-archive context windows before compaction → stored in ~/.rlm/memory/
Recall using RLM - retrieve past conversations efficiently without context bloat
Full-fidelity storage via SQLite + FTS5 - no data loss, fast search
Smart retrieval - use RLM chunking on large memory matches

Why This Matters

Most AI agents have amnesia. When context fills up or sessions end, they forget everything.

MCP and RAG give agents access to external knowledge (databases, APIs, documents) but don't preserve the agent's own conversation history across sessions.

RLM + Episodic Memory solves this by:

Storing the agent's full conversation history
Using RLM to efficiently recall relevant past context
Maintaining continuity across arbitrarily long timespans
Enabling agents to learn from their own past interactions

The Technique

Store in full - SQLite backend preserves complete conversations and documents
Index with FTS5 - Fast keyword search with BM25 ranking and Porter stemming
Retrieve with RLM - Use recursive chunking to analyze large matches efficiently
No context bloat - Summaries and targeted extracts only, never full dumps

Result: An AI agent that can work with material far larger than its context window and recall past sessions in full.
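The store-and-index steps can be illustrated with Python's built-in `sqlite3` module. This is a sketch under assumptions: the `memory` table, its single `content` column, and the `recall` helper are hypothetical and not the project's schema. It also assumes the local SQLite build includes FTS5.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Full-text index with Porter stemming layered over the unicode61 tokenizer.
conn.execute(
    "CREATE VIRTUAL TABLE memory USING fts5(content, tokenize='porter unicode61')"
)
conn.executemany(
    "INSERT INTO memory(content) VALUES (?)",
    [
        ("PreCompact hook is archiving the context window before compaction",),
        ("Tag filtering now uses exact matching",),
        ("Archived sessions are linked by a shared session id",),
    ],
)


def recall(conn: sqlite3.Connection, query: str, limit: int = 5):
    """Return (rowid, short extract, score) tuples, best match first.

    bm25() returns lower values for better matches, so ascending order
    puts the most relevant rows first. snippet() returns a targeted
    extract of at most 8 tokens instead of the full stored text.
    """
    return conn.execute(
        "SELECT rowid, snippet(memory, 0, '[', ']', '...', 8), bm25(memory) "
        "FROM memory WHERE memory MATCH ? ORDER BY bm25(memory) LIMIT ?",
        (query, limit),
    ).fetchall()


for rowid, extract_text, score in recall(conn, "archive"):
    print(rowid, extract_text, round(score, 3))
```

Because of stemming, the query `archive` matches both "archiving" and "Archived". The full text stays in the table, and only the short extract reaches the caller.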

Implementation Roadmap

Version History

0.1 — Core RLM analysis (scan/chunk/extract/dispatch loop, skill prompt)
0.2 — Persistent memory (SQLite FTS5, recall pipeline, grep pre-filtering, graduated dispatch, self-improving strategies, chat data ingestion)
0.3 — Auto-recall infrastructure (MCP server, conversation archiving hooks, zero-footprint install, export moved into package, global MCP config)
0.4 — Compression + quality (semantic tagging, two-tier storage, structural compression, session summaries, json_each() tag filtering, rlm stats) ← current
1.0 — Public release (PyPI packaging, relocatable install, stable API, external testing)

Phase 1: Core RLM ✅ COMPLETE

Scan, chunk, extract, session management
Multiple chunking strategies (functions, files, headings, semantic, lines)
Subagent dispatch pattern
Skill prompt with clear modes
Tag filtering in SQL (json_each() for exact matching)

Phase 2: Episodic Memory ✅ COMPLETE

SQLite + FTS5 backend with batch transaction optimization (21x speedup)
Memory storage API (/rlm remember) and recall API (/rlm recall)
Hook-based auto-archiving (PreCompact + SessionEnd hooks)
RLM-powered recall with size detection + chunking guidance
Self-improving retrieval via learned patterns
WAL mode enabled for concurrent read/write performance
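The two storage optimizations above (WAL mode and batched transactions) look roughly like this with Python's `sqlite3` module. The `entries` table and the `open_store` and `store_batch` helpers are hypothetical names for illustration, and the sketch makes no attempt to reproduce the reported 21x figure.

```python
import os
import sqlite3
import tempfile


def open_store(path: str):
    """Open a file-backed database and switch it to write-ahead logging.

    WAL lets readers proceed while a writer is active. It needs a real
    file: an in-memory database stays in 'memory' journal mode.
    """
    conn = sqlite3.connect(path)
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entries ("
        "id INTEGER PRIMARY KEY, content TEXT NOT NULL)"
    )
    return conn, mode


def store_batch(conn: sqlite3.Connection, contents) -> None:
    """Insert many rows inside a single transaction.

    Committing once per batch instead of once per row avoids a disk sync
    for every insert, which is where batch speedups come from.
    """
    with conn:  # commits on success, rolls back on exception
        conn.executemany(
            "INSERT INTO entries(content) VALUES (?)",
            ((c,) for c in contents),
        )


db_path = os.path.join(tempfile.mkdtemp(), "memory.db")
conn, mode = open_store(db_path)
store_batch(conn, [f"chunk {i}" for i in range(1000)])
```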

Phase 3: Auto-Recall + Quality ✅ COMPLETE

MCP server (mcp/server.py) exposing 5 tools as native operations
SessionStart hook injecting recent project context
Semantic tagging (LLM + keyword extraction fallback)
Two-tier storage (summary + compressed transcript, linked by session_id)
Structural compression (strips skill prompts — 63% reduction)
Session summary generator (rlm/summarize.py)
rlm stats command

Phase 4: Remaining Improvements 📋 (Planned)

Analysis Engine

Adaptive chunk sizing based on content density
Cross-file dependency awareness (include imports/callers in same chunk)
AST-aware chunking for more languages (JS/TS, Go, Rust)
Confidence-based early termination for subagent dispatch
Result deduplication across subagents

Memory Intelligence

Memory importance scoring (information density, decision count, topic novelty)
Memory consolidation (synthesize patterns across related conversations)
Forgetting curve (boost frequently-recalled memories, decay stale ones)
Content deduplication (detect and skip re-archived sessions)
Retention policies (auto-compress old memories >30 days)
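Since the forgetting curve is still planned, the following is only one possible shape for it, not a committed design. It assumes exponential decay by age with a logarithmic boost per recall. The function name, the 30-day half-life, and the formula itself are all assumptions made for illustration.

```python
import math


def retention_score(base: float, age_days: float, recall_count: int,
                    half_life_days: float = 30.0) -> float:
    """Hypothetical ranking weight for a stored memory.

    - Decay: the weight halves every `half_life_days` of age.
    - Boost: each recall raises the weight, with diminishing returns
      (log1p grows slowly, so very frequent recalls do not dominate).
    """
    decay = 0.5 ** (age_days / half_life_days)
    boost = 1.0 + math.log1p(recall_count)
    return base * decay * boost


# A fresh, never-recalled memory keeps its base weight.
fresh = retention_score(1.0, age_days=0, recall_count=0)
# A 30-day-old memory that was never recalled has decayed to half.
stale = retention_score(1.0, age_days=30, recall_count=0)
# The same old memory, recalled five times, ranks above the stale one.
revisited = retention_score(1.0, age_days=30, recall_count=5)
```

A score like this could feed both ranking at recall time and the retention policy, for example by compressing entries whose score falls below a threshold.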

Schema & Search

Temporal indexing (fast queries like "what was I doing last Tuesday")
Relationship tracking (link continuation sessions, superseding decisions)
Configurable BM25 weights for different content types
Ranked tag search (weight exact matches higher than partial)

Developer Experience

Memory dashboard (TUI to browse/search/manage)
Export/import (backup and migration between machines)
Privacy controls (mark memories as "do not recall", auto-redact secrets)
Documentation and examples

Phase 5: Cross-Platform & Distribution 📋 (Planned)

Library-ification

Provider-agnostic hook interface (abstract PreCompact, SessionEnd, etc.)
Generic transcript format with platform adapters
Abstract skill/prompt interface

Platform Adapters

Cursor (MCP support exists — test immediately)
OpenAI Codex (research extension points)
Gemini Code Assist, Windsurf, Aider

Packaging

PyPI package (pip install rlm / uv pip install rlm)
Comprehensive documentation and getting-started guide
Test with at least one external user

1.0 Release Criteria

Installable via pip install rlm without cloning the repo
No hardcoded paths — install works on any machine
Documented, stable CLI and MCP API
Tested by at least one person who isn't the author

Completed Feature Details

Auto-Recall (MCP Server) ✅

All 5 MCP tools available as native operations in terminal Claude Code sessions:

rlm_recall — search memory by query + optional tag filter
rlm_remember — store new memories
rlm_memory_list — browse entries by tag
rlm_memory_extract — extract full content with optional grep
rlm_forget — delete entries

SessionStart hook injects recent project context automatically.

Known limitation: Omnara ignores additionalContext from hooks. Terminal sessions provide the full experience.

Semantic Tagging ✅

LLM-based tag extraction with keyword fallback. Generates 5-10 semantic tags per archived conversation. Tags like mcp, hooks, architecture-decision dramatically improve recall quality.

Transcript Compression ✅

Two-tier storage with structural compression:

Summary entry (~1KB): decisions, questions, commits, files modified
Compressed transcript (~60KB): skill prompts stripped, boilerplate removed, formatting compacted
Both linked by shared session_id tag
Key finding: 84% of user content was injected skill prompts

Tag Filtering ✅

Replaced LIKE-based tag filtering with json_each() for exact matching. This removes substring false positives: a filter for "mcp" no longer matches entries tagged only "mcp-server".
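The difference between the two filters can be reproduced in a few lines. The sketch assumes tags are stored as a JSON array in a TEXT column. The `entries` table and the `ids_like` and `ids_exact` helpers are hypothetical names, not the project's schema. It also requires a SQLite build with the JSON functions available.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (id INTEGER PRIMARY KEY, tags TEXT)")
conn.executemany(
    "INSERT INTO entries VALUES (?, ?)",
    [
        (1, json.dumps(["mcp", "hooks"])),
        (2, json.dumps(["mcp-server"])),
        (3, json.dumps(["architecture-decision"])),
    ],
)


def ids_like(tag: str) -> list[int]:
    """Old approach: substring match against the raw JSON text."""
    rows = conn.execute(
        "SELECT id FROM entries WHERE tags LIKE ? ORDER BY id",
        (f"%{tag}%",),
    )
    return [r[0] for r in rows]


def ids_exact(tag: str) -> list[int]:
    """New approach: expand the JSON array and compare whole elements."""
    rows = conn.execute(
        "SELECT e.id FROM entries AS e "
        "WHERE EXISTS (SELECT 1 FROM json_each(e.tags) "
        "              WHERE json_each.value = ?) "
        "ORDER BY e.id",
        (tag,),
    )
    return [r[0] for r in rows]


print(ids_like("mcp"))   # [1, 2]: entry 2 is a false positive
print(ids_exact("mcp"))  # [1]
```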
