Graph revision by wangyu-ustc · Pull Request #130 · Mirix-AI/MIRIX

wangyu-ustc · 2026-05-17T07:59:14Z

No description provided.

- New: temporal knowledge graph (entity_nodes, entity_edges, episode_nodes, involves_edges)
- New: GraphMemoryManager with write path (W1-W5) and read path (R1-R4)
- Toggle: MIRIX_ENABLE_GRAPH_MEMORY=true/false (default off)
- LoCoMo benchmark: +3.05% LLM Judge (0.5429 → 0.5734) on 1540 questions
- Zero changes to original logic when disabled

Graph memory returns {"context": "<pre-formatted str>"} instead of the {"total_count": N, "items": [...]} shape used by other memory types, so the existing total_count==0 short-circuit dropped graph context entirely. Split the empty-data check from the count check and add a graph-specific branch that reads the context string directly.
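A minimal sketch of the split check described above, not the code from this PR. The function name `extract_memory_text` and the `"graph"` memory-type key are hypothetical; only the two payload shapes come from the commit message.

```python
from typing import Any, Optional


def extract_memory_text(memory_type: str, data: Optional[dict[str, Any]]) -> Optional[str]:
    """Return the text for one memory type, or None when there is nothing to show."""
    # Empty-data check, kept separate from the count check.
    if not data:
        return None

    # Graph branch (the "graph" key is an assumption): the payload is a
    # pre-formatted string under "context" and has no total_count field.
    if memory_type == "graph":
        context = data.get("context", "")
        return context if context.strip() else None

    # Count check for the list-shaped memory types.
    if data.get("total_count", 0) == 0:
        return None
    return "\n".join(str(item) for item in data.get("items", []))
```

With a single `total_count == 0` short-circuit, the graph payload would be treated as empty because it carries no `total_count` key; the dedicated branch avoids that.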

Introduces a deterministic conflict-resolution path for semantic memory inserts, with source provenance (turn_id / chunk_id / serial / occurred_at) flowing from /memory/add through to stored records. Enabled per meta-agent via the new `enable_conflict_resolution` flag; legacy free-form inserts remain the default.

Schema:
- `users.turn_counter`, `users.chunk_counter`: per-user monotonic counters used by `/memory/add` to fill in fallback provenance when the client does not provide source_meta.
- `episodic_memory.source_refs`, `semantic_memory.source_refs`: provenance pointers from stored memories back to their source units.
- `semantic_memory.prior_values`: history of values that have been superseded under the conflict-resolution path.

Services:
- `UserManager.reserve_source_ids`: atomic counter bump used by the /memory/add fallback.
- New `semantic_memory_upsert_fact` tool gated by the agent flag.
- `MetaAgent` system prompt augmentation when the flag is on.

Docs: `docs/mab_conflict_resolution_and_provenance.md`, `docs/mab_raw_chunk_side_channel.md`, `docs/mab_user_id_isolation_fix.md`.
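To illustrate how `source_refs` and `prior_values` could interact, here is a small in-memory sketch. It is not the `semantic_memory_upsert_fact` implementation: the `Fact` class, the `upsert_fact` helper, and above all the "higher serial wins" rule are assumptions made for the example; the commit message does not state the actual resolution rule.

```python
from dataclasses import dataclass, field


@dataclass
class Fact:
    name: str
    value: str
    serial: int  # serial of the source unit that set the current value
    source_refs: list = field(default_factory=list)
    prior_values: list = field(default_factory=list)


def upsert_fact(store: dict, name: str, value: str, source_ref: dict) -> Fact:
    """Insert or update a fact deterministically (assumed rule: higher serial wins)."""
    serial = source_ref["serial"]
    existing = store.get(name)
    if existing is None:
        store[name] = Fact(name, value, serial, [source_ref])
        return store[name]

    if existing.value != value:
        if serial > existing.serial:
            # The newer source supersedes the stored value; keep the old one as history.
            existing.prior_values.append({"value": existing.value, "serial": existing.serial})
            existing.value, existing.serial = value, serial
        else:
            # An older source arrived late: record it without changing the current value.
            existing.prior_values.append({"value": value, "serial": serial})
    existing.source_refs.append(source_ref)
    return existing
```

Because the outcome depends only on the stored serials and not on arrival order, replaying the same inputs yields the same current value.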

Replaces v2 single-graph memory with two independent Neo4j graphs, one per existing MIRIX memory layer:
- G_episodic: (:Episode) + (:EpisodicEntity), with [:NEXT] temporal edges, [:EP_RELATES] entity edges (with keywords + embedding), and [:MENTIONS] episode→entity links. Driven by EpisodicMemoryManager.insert_event.
- G_semantic: (:Concept) + (:SemanticEntity), with [:CONCEPT_RELATES] concept-concept edges (LLM-judged at insert time), [:SEM_RELATES] entity edges, and [:MENTIONS]. Driven by SemanticMemoryManager.insert_semantic_item.

Retrieval (GraphRetrieverDispatcher):
- 1 LLM call to split the query into ll/hl keywords (cached in Redis)
- 1 batch embed call for both keyword sets
- Parallel asyncio.gather over EpisodicRetriever + SemanticRetriever
- Each retriever runs LightRAG dual-level vector search (ll → entity name vector index, hl → relation keyword vector index), round-robin merges, reverses MENTIONS to fetch items, then one-hop expands (NEXT for episodes, CONCEPT_RELATES for concepts).
- 50/50 token budget split across the two graphs, formatted as a combined "## Episodic KG / ## Semantic KG" markdown payload.

Zero-overhead default:
- All hooks gated on settings.enable_graph_memory (default False).
- Neo4j compose service is profile-gated ("graph"); mirix_api's depends_on is required: false, so plain `docker compose up` skips Neo4j.
- Token tracker (mirix/database/token_tracker.py) is disabled by default; record() is a no-op until enable() is called by the eval harness via POST /debug/token_stats/reset.

Schema bootstrap (mirix/database/neo4j_client.py):
- 6 unique constraints, 2 btree indexes, 5 vector indexes (Neo4j 5.13+)
- v3 (:Entity / :Event) cleanup runs first; safe on fresh DBs
- Idempotent: re-running on existing DBs is a no-op

Removed:
- mirix/orm/graph_memory.py (v2 single-graph ORM)
- mirix/services/graph_memory_manager.py (v2 manager)

Docs:
- docs/graph_memory_v4/README.md: design overview + zero-overhead notes
- docs/graph_memory_v4/v4_graph_memory.md: per-file source + diffs
- docs/graph_memory_v4/kg_overview_{episodic,semantic}.png: top-N visualizations
- docs/graph_memory_v4/kg_subgraph_{identity,family_camping,art_creativity}.png: paired episodic-vs-semantic zoom-ins on shared themes (conv-26)

Configuration:
- MIRIX_ENABLE_GRAPH_MEMORY=true
- MIRIX_NEO4J_URI=bolt://neo4j:7687
- MIRIX_NEO4J_USER, MIRIX_NEO4J_PASSWORD, MIRIX_NEO4J_DATABASE
- MIRIX_NEO4J_VECTOR_DIM (default 1536, match the embedding model in use)

Tested with gpt-4.1-mini + text-embedding-3-small + Neo4j 5.20-community on LoCoMo conv-26 (154 QA non-adversarial). See docs/graph_memory_v4/.
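The round-robin merge and the 50/50 budget split from the retrieval steps above can be sketched as follows. This is an illustration under stated assumptions, not the dispatcher's code: the helper names are hypothetical, hits are plain strings, and tokens are approximated by whitespace-separated words rather than a real tokenizer.

```python
from itertools import zip_longest


def round_robin_merge(ll_hits: list, hl_hits: list) -> list:
    """Interleave low-level and high-level hits, dropping duplicates."""
    seen, merged = set(), []
    for pair in zip_longest(ll_hits, hl_hits):
        for hit in pair:
            if hit is not None and hit not in seen:
                seen.add(hit)
                merged.append(hit)
    return merged


def take_within_budget(snippets: list, budget_tokens: int) -> list:
    """Keep snippets in order until the (word-count) budget would be exceeded."""
    kept, used = [], 0
    for snippet in snippets:
        cost = len(snippet.split())  # assumption: one word is roughly one token
        if used + cost > budget_tokens:
            break
        kept.append(snippet)
        used += cost
    return kept


def format_payload(episodic: list, semantic: list, max_total_tokens: int) -> str:
    """Give each graph half of the budget and emit one combined markdown payload."""
    half = max_total_tokens // 2
    ep = take_within_budget(episodic, half)
    sem = take_within_budget(semantic, half)
    return "## Episodic KG\n" + "\n".join(ep) + "\n\n## Semantic KG\n" + "\n".join(sem)
```

Interleaving keeps the top-ranked entity-level and relation-level hits near the front, so a later truncation step drops low-ranked results from both lists rather than all of one list.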

One-shot script: runs main_eval.py (LoCoMo sample 0) followed by organize_results.py, then prints overall accuracy + per-category breakdown. Pre-flight checks that server is up on :8531 and that locomo10.json exists. Output goes to evals/results/locomo/v4_<timestamp>/.

Combines main and graph_revision: v2/v4 graph memory, dual-graph LightRAG retrieval, MAB conflict resolution, graph retrievers.

1. episodic_memory_manager: pgvector embedding-search SELECT was missing source_refs, causing to_pydantic() to receive None for a non-nullable List field. Every episodic search threw a Pydantic ValidationError and silently returned no memories. Add source_refs to the explicit select() column list.
2. semantic_memory_manager: same bug on the semantic side, plus prior_values. Add both to the embedding-search SELECT.
3. memory_tools.semantic_memory_insert: indexed item['source'] directly, so any LLM call that omitted the source field (which it commonly does, since source is the least essential field) crashed with KeyError and lost the whole item. Switch to item.get('source', '').

Net effect on LoCoMo conv-26 with 0201c config: 20.4% -> 80.3% (the SELECT fix alone). The source KeyError was masking real semantic memory writes in graph-mode LongMemEval ingest.

Also ignores evals/snapshots/ (local-only memory dumps, large and regenerable).
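The third fix can be illustrated with a small stand-alone sketch. The helper `normalize_semantic_items` and the `name` field are hypothetical; only the switch from `item['source']` to `item.get('source', '')` comes from the commit message.

```python
def normalize_semantic_items(items: list) -> list:
    """Fill in a missing 'source' field instead of failing on it."""
    normalized = []
    for item in items:
        # item["source"] would raise KeyError when the LLM omits the field,
        # discarding the whole item; .get() falls back to an empty string.
        normalized.append({**item, "source": item.get("source", "")})
    return normalized
```

Every input item survives, and items that already carry a source keep it unchanged.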

…h is on retrieve_memories_by_keywords: when MIRIX_ENABLE_GRAPH_MEMORY=true, episodic and semantic retrieval is served entirely by the v4 dual-graph dispatcher. The flat PG episodic/semantic search is gated behind 'not settings.enable_graph_memory' (kept as a fallback for graph-off mode). The other four memory types (resource / procedural / knowledge_vault / core) have no graph counterpart and are always retrieved flat, unchanged.

On LoCoMo conv-26, v5 (pure graph) scores 84.2% vs v4 (graph+flat side-by-side) 84.9%, effectively a wash (1 question), confirming the flat layer is redundant once the graph layer covers the same recall.

graph_retriever_dispatcher.DEFAULT_MAX_TOTAL_TOKENS: 12000 -> 24000. At 12k the formatted graph context measured ~13k tokens on LongMemEval-S, i.e. already over budget, so apply_budget_to_search was truncating the tail. Doubled to give counting / multi-session questions recall headroom. Still well under the 'graph should be ≤1/3 of the 128k window' discipline.
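The gating described above amounts to a small routing decision, sketched here with a hypothetical `plan_retrieval` helper (the real logic lives in retrieve_memories_by_keywords and is not reproduced). The constants restate the numbers from the commit message, taking "128k" as 128,000 tokens.

```python
ALWAYS_FLAT = ("resource", "procedural", "knowledge_vault", "core")

DEFAULT_MAX_TOTAL_TOKENS = 24_000   # raised from 12000 in this commit
CONTEXT_WINDOW_TOKENS = 128_000     # assumption: "128k window" read as 128,000


def plan_retrieval(enable_graph_memory: bool) -> dict:
    """Map each memory type to the backend that serves it."""
    plan = {name: "flat" for name in ALWAYS_FLAT}
    # Episodic and semantic go to the graph dispatcher only when the flag is on;
    # the flat PG search remains the fallback for graph-off mode.
    backend = "graph" if enable_graph_memory else "flat"
    plan["episodic"] = backend
    plan["semantic"] = backend
    return plan


def within_graph_budget(max_total_tokens: int = DEFAULT_MAX_TOTAL_TOKENS) -> bool:
    """Check the 'graph should be at most 1/3 of the window' rule of thumb."""
    return max_total_tokens * 3 <= CONTEXT_WINDOW_TOKENS
```

At 24000 tokens the graph budget uses 72,000 of the 128,000 allowed by the one-third rule when multiplied out, so the doubled default still passes the check.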

evals/longmem_eval.py (new): runner for MemoryAgentBench LongMemEval-S (longmemeval_s* split). Reuses MirixMemorySystem + TaskAgent + organize_results unchanged; only the data layer differs. Parses each context's per-session [Chat Time, messages] structure, then further splits each session into <=4096-char chunks on message boundaries so the extractor sees small blocks (a whole session is ~14k chars and measurably dilutes LightRAG recall). Every sub-chunk inherits its session's chat time as occurred_at. Also records memory_stats (stored chars across PG flat + Neo4j graph) so no-graph and graph runs can be compared on the same yardstick.

evals/memory_snapshot.py (new): save / load / list / delete memory snapshots so an expensive ingest (hours, real OpenAI cost) can be reused. pg_dump for the seven memory tables + agents/messages, plus a full Cypher-based Neo4j node/relationship export to JSON. load truncates first then restores, so the snapshot is the exact state afterwards.

evals/mirix_memory_system.py: add_chunk now accepts an optional occurred_at parameter and forwards it to client.add. Without it the episodic agent guesses a year (and on LongMemEval guesses the ingest year, 2026, collapsing temporal questions). LoCoMo's runner already embeds dates in the chunk text so it didn't need this; LongMemEval's dates live in the per-session Chat Time, which the new runner now passes through.
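A minimal sketch of the session chunking described for evals/longmem_eval.py, assuming messages are plain strings joined with newlines. The function names are hypothetical, and the handling of a single message longer than the limit (kept whole as its own chunk) is an assumption, since the commit message does not cover that case.

```python
def split_session(messages: list, max_chars: int = 4096) -> list:
    """Group messages into chunks of at most max_chars, splitting only between messages."""
    chunks, current, size = [], [], 0
    for msg in messages:
        # Count the newline separator when the chunk already has content.
        extra = len(msg) + (1 if current else 0)
        if current and size + extra > max_chars:
            chunks.append(current)
            current, size = [], 0
            extra = len(msg)
        current.append(msg)
        size += extra
    if current:
        chunks.append(current)
    return chunks


def chunks_for_session(chat_time: str, messages: list, max_chars: int = 4096) -> list:
    """Attach the session's chat time to every sub-chunk as occurred_at."""
    return [
        {"occurred_at": chat_time, "text": "\n".join(chunk)}
        for chunk in split_session(messages, max_chars)
    ]
```

Each resulting dictionary could then be passed to something like add_chunk with its occurred_at value, so the episodic agent no longer has to guess the date.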

Jasonya added 12 commits April 3, 2026 21:33

Merge remote-tracking branch 'origin/main' into graph_revision

2f3fb88

Merge origin/main into graph_revision (openrouter configs)

0c77b8f

Merge remote-tracking branch 'origin/main' into graph_revision

8fae46e

Merge origin/graph_revision into main (local)

65a6993



Graph revision #130

wangyu-ustc wants to merge 12 commits into main from graph_revision

2 participants
