mem

A fast, local semantic search engine and knowledge graph for your personal knowledge base. Works with plain markdown files, exposes an MCP server for AI assistants, and includes a CLI for direct access.

What it does: Point it at a directory of markdown files and it builds a searchable vector index + knowledge graph from YAML frontmatter links. AI assistants (Claude, Gemini, etc.) can then search, create, and manage documents through MCP tools. You can also use the CLI directly.

Features

Semantic search — BGE-M3 embeddings (1024-dim) via ONNX Runtime, with hybrid graph-proximity boosting
Knowledge graph — Seven edge types extracted from frontmatter (parent, depends_on, soft_depends_on, supersedes, contributes_to) and body (link from wikilinks), plus auto-discovered similar_to edges from semantic similarity. PageRank, betweenness centrality, and path tracing
Task management — Create, prioritize, and track tasks with dependency graphs; ready and blocked filters use graph analysis
Memory system — Store and retrieve observations, notes, and insights with semantic search
MCP server — 40 tools for AI assistants over stdio transport
CLI — Full-featured command-line interface for search, tasks, memory, and graph operations
Telemetry — Built-in usage tracking for MCP tools (call counts, response sizes, latency)
Fast — Lazy ONNX session pooling, SIMD-accelerated vector ops, parallel batch embedding
Local — Everything runs on your machine. No cloud services, no API keys, no data leaves your disk
Auto-setup — Model files and ONNX Runtime are downloaded automatically on first run

Install

Pre-built binaries (recommended)

curl -fsSL https://raw.githubusercontent.com/nicsuzor/mem/main/install.sh | sh

Supports Linux x86_64 and macOS Apple Silicon. Installs the pkb binary (MCP server + CLI in one) to /usr/local/bin.

From source

cargo install --git https://github.com/nicsuzor/mem.git

Requires Rust >= 1.88.

cargo-binstall

cargo binstall mem

Quick Start

1. Set your PKB directory

export ACA_DATA=~/brain  # or wherever your markdown files live

2. Index your files

pkb reindex

3. Search

pkb search "how does authentication work"

4. Connect to an AI assistant

Add to your MCP client config (e.g. Claude Code .mcp.json):

{
  "mcpServers": {
    "pkb": {
      "command": "pkb",
      "args": []
    }
  }
}

Document Format

mem works with plain markdown files that have YAML frontmatter:

---
id: my-task-123
title: Implement user auth
type: task
status: active
priority: 2
tags: [backend, security]
depends_on: [design-doc-456]
parent: project-789
---

The actual content of the document goes here.
Any markdown is fine.

All frontmatter fields are optional. Files without frontmatter are indexed by filename and content.

Status values

Status	Meaning
`active`	Open, ready to work on (default)
`in_progress`	Currently being worked on
`blocked`	Waiting on dependencies
`review`	In review / awaiting feedback
`paused`	Intentionally deferred
`someday`	Low priority / maybe later
`done`	Completed successfully
`cancelled`	Abandoned / no longer relevant

Aliases (automatically normalized): inbox, todo, open → active; in-progress → in_progress; in_review, in-review → review; complete, completed, closed, archived → done; dead → cancelled.

Node types

Category	Types	Role
Actionable	`goal`, `project`, `subproject`, `epic`, `task`, `action`, `bug`, `feature`, `milestone`, `learn`	Executed; appear in ready/blocked queues
Obligation	`target`, `prototype`	Declared deadline-bound obligations or class templates; not executed but propagate urgency to contributing tasks (see Focus Scoring)
Reference	`note`, `knowledge`, `memory`, `contact`	Knowledge content; searchable but excluded from task workflows

target represents a one-shot terminal obligation (a deadline you must not miss). prototype is a class template for recurring obligations (e.g. peer review load) whose instances inherit severity, goal_type, and edge defaults at creation.

Priority levels

Level	Label	Use
`0`	P0 — Critical	Drop everything; this is what you're doing now
`1`	P1 — High	Active commitment; this week
`2`	P2 — Standard	Default; ordinary work
`3`	P3 — Low	Background; pick up when capacity exists
`4`	P4 — Backlog	May never happen; keep visible

Priority propagates upward via effective_priority: a P3 task blocking a P0 inherits P0 weighting in scoring even though its own field stays P3. See Focus Scoring for how priority composes with severity and urgency.

Edge types

The knowledge graph has seven edge types. Some are derived from frontmatter, others are computed automatically.

Edge type	Source	Affects ready/blocked?	Affects importance propagation?	Notes
`parent`	`parent:` frontmatter or `children:` list	✅ (via unfinished children)	✅	Hierarchy
`depends_on`	`depends_on:` list	✅ blocks task	✅	Hard dependency
`soft_depends_on`	`soft_depends_on:` list	❌	✅	Informational ordering
`link`	`[[wikilinks]]` and markdown links in body	❌	❌	Cross-references; counted as backlinks
`supersedes`	`supersedes:` frontmatter	❌	❌	This node replaces the target
`contributes_to`	`contributes_to:` list with verbal weights	❌	✅	Strategic priority (Birnbaum importance with Renooij-Witteman terms)
`similar_to`	Computed from BGE-M3 embeddings (cosine ≥ 0.85)	❌	❌	Auto-discovered semantic similarity; appears in `pkb_context` and `pkb_trace`

similar_to edges are materialised when the graph is built with the vector store available (e.g. via the MCP server). They participate in pathfinding (pkb_trace) and context display (pkb_context) but are deliberately excluded from blocking analysis and ready/blocked classification — semantic similarity is informational, not causal.

Focus Scoring

Tasks are ranked by one composite integer, focus_score — the sum of priority, severity, deadline pressure, age, structural blast radius, stakeholder waiting time, and urgency (target propagation). Sort by it; ignore the components unless you're debugging a ranking.

For deadline-bound obligations that aren't tasks themselves (ARC submissions, contract signings, anything you must not fail), declare a target node and link contributing tasks to it:

# The obligation
type: target
severity: 3                      # see severity ladder below
goal_type: committed             # committed | aspirational | learning
due: 2026-05-07
consequence: "Late review damages standing with the panel."

# A task contributing to it
contributes_to:
  - to: <target-id>
    weight: Certain              # see weight scale below
    why: "contractual obligation as assigned assessor"

mem propagates severity × edge_weight × deadline-slack back from each target to its contributors, writing node.urgency and folding it into focus_score. A P2 task blocking a SEV3-committed deadline rises automatically as the deadline approaches — no priority bumping.

Severity ladder

Level	Label	Example
0	Negligible	Minor annoyance; no consequence beyond self
1	Low	Small reputational or time cost
2	Moderate	Meaningful commitment; recoverable if missed
3	High	Serious consequence; hard to recover
4	Terminal	Job loss, bankruptcy, severe health, legal

SEV0–3 are compensatory (standard scalar math). SEV4 + goal_type: committed is lexicographic — it gets a 10 000× multiplier so any SEV4-adjacent task outranks any non-SEV4 task regardless of priority, deadline, or anything else. Use sparingly; the cognitive speedbump of writing consequence: prose is part of the design.

`goal_type`

Value	Effect
`committed`	Receives the lexicographic override at SEV4. Standard contractual / non-negotiable obligations.
`aspirational`	Linear propagation only. `consequence:` is reused as opportunity-cost prose. Prevents moonshots from hijacking the queue.
`learning`	Linear propagation only. Marks targets where the value is the attempt, not the outcome.

Weight scale (Renooij-Witteman)

contributes_to.weight accepts only verbal terms — raw decimals are rejected at parse time. Weights mean Birnbaum importance (probability that missing this task guarantees target failure), not "percent contribution":

Term	Anchor	Reading
Certain	1.00	Single point of failure — miss this and the target fails
Probable	0.85	Strong contributor
Expected	0.75	Likely needed
Fifty-Fifty	0.50	Redundancy exists
Uncertain	0.25	Possibly needed
Improbable	0.15	Marginal
Impossible	0.00	No contribution

Non-linearity defeats the spacing and centring biases that corrupt linear scales.

`focus_score` components

Term	Range	Trigger
`priority_base`	0 / 5 000 / 10 000	P0 = 10 000, P1 = 5 000, P2+ = 0
`severity_bonus`	0 – 100 000	SEV0–4 on the task itself; SEV4 lexicographic
`deadline_score`	0 – 12 000	Overdue / tight / near-tight; ×1.5 if `consequence` set
`age_staleness_bonus`	0 – 200	P2+ only; min(days_since_created, 200)
`downstream_weight × 10`	0 – ∞	Structural blast radius from `parent` / `depends_on` BFS
`stakeholder_waiting_bonus`	0 / 2 000 – 8 000	When `stakeholder` set; +200/day
`urgency_term`	0 – 10 000+	`round(node.urgency)` — target propagation

The formula lives in compute_urgency and compute_focus_scores in src/graph_store.rs. Prototype nodes (for recurring obligations like peer review) and the deferred calibration ritual extend the model — see the source for current behaviour.

CLI Commands

Search & Index

Command	Description
`pkb search <query> [-n limit] [--full]`	Semantic search across the knowledge base
`pkb add <files...>`	Add markdown files to the index
`pkb reindex [--force]`	Re-scan and re-index all PKB files
`pkb status`	Show index statistics (document count, DB size)

Task Management

Command	Description
`pkb tasks [ready\|blocked\|all] [--project P] [--sort S]`	List tasks sorted by priority + downstream weight
`pkb task <id>`	Show task details and relationships
`pkb new <title> [--parent ID] [--priority N] [--project P] [--tags T] [--depends-on ID]`	Create a new task
`pkb done <id>`	Mark a task as done
`pkb update <id> [--status S] [--priority N] [--project P] [--tags T]`	Update task fields
`pkb deps <id> [--tree]`	Show dependency tree
`pkb blocks <id> [--tree]`	Show what completing a task would unblock

Memory

Command	Description
`pkb recall <query> [-n limit]`	Semantic search over memories and notes
`pkb memories [--tag T]`	List memory-type documents
`pkb tags [tag...] [--count] [--type T]`	Tag frequency summary or search by tags
`pkb forget <id>`	Delete a memory document

Knowledge Graph

Command	Description
`pkb context <id> [--hops N]`	Neighbourhood: metadata, backlinks, nearby nodes
`pkb trace <from> <to> [-n max_paths]`	Shortest paths between two nodes
`pkb orphans`	Disconnected nodes with no edges
`pkb metrics [id]`	PageRank, betweenness, degree centrality
`pkb graph [--format json\|graphml\|mcp-index\|all] [--output path]`	Export the knowledge graph
`pkb stats [--sort count\|bytes\|latency\|errors]`	Show MCP tool usage telemetry

MCP Tools

The pkb binary exposes 39 tools over MCP stdio transport. Any MCP-compatible client can use them. It also provides MCP prompts to guide AI assistants through common search and navigation patterns.

Prompts

Prompt	Description	Guidance
`find-task`	"How do I find a task about X?"	Demonstrates `task_search` then `get_task`
`explore-topic`	"What do we know about X?"	Demonstrates `search` then `get_document`
`navigate-graph`	"What's connected to X?"	Demonstrates `pkb_context` for relationships
`find-by-tag`	"Show me everything tagged X"	Demonstrates `search_by_tag` usage

Tools

Category	Tools
Search	`search`, `get_document`, `list_documents`, `find_duplicates`
Tasks	`task_search`, `list_tasks`, `get_task`, `create_task`, `create_subtask`, `update_task`, `complete_task`, `release_task`, `decompose_task`, `get_dependency_tree`, `get_task_children`, `task_summary`, `get_network_metrics`
Memory	`retrieve_memory`, `search_by_tag`, `list_memories`, `delete_memory`
CRUD	`create`, `create_memory`, `append`, `delete`
Graph	`pkb_context`, `pkb_trace`, `pkb_orphans`, `graph_stats`, `graph_json`
Batch	`batch_update`, `batch_reparent`, `batch_archive`, `batch_merge`, `batch_create_epics`, `batch_reclassify`, `bulk_reparent`, `merge_node`
System	`get_stats`

Architecture

MCP Client <--stdio--> pkb (MCP server)
                         |
                   +-----+------+
                   |  Dispatch  |  (18 tools, ServerHandler trait)
                   +-----+------+
                    +----|----+
             +------+ +--+-+ +----------+
             |Vector| |Graph| | Document |
             |Store | |Store| | CRUD     |
             +--+---+ +--+--+ +----------+
                |         |
          +-----+------+  |
          |  Embedder   |  |  BGE-M3 via ONNX Runtime (1024-dim)
          +-------------+  |
                           |
                     +-----+------+
                     | PKB Files  |  markdown + YAML frontmatter
                     +------------+

Environment Variables

Variable	Default	Description
`ACA_DATA`	`~/brain`	PKB root directory
`RUST_LOG`	`info`	Log level filter
`AOPS_OFFLINE`	`false`	Disable model/runtime auto-download
`AOPS_MODEL_PATH`	(auto)	Override model directory path
`ORT_DYLIB_PATH`	(auto)	Override ONNX Runtime library path

Acknowledgments

The SIMD-optimized vector distance functions in src/distance.rs are adapted from shodh-memory by Varun Ankus, originally licensed under Apache-2.0. The embedding and vector search architecture also drew inspiration from shodh-memory's design.

License

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

See LICENSE for the full text.

Name		Name	Last commit message	Last commit date
Latest commit History 783 Commits
.agent		.agent
.github		.github
.vscode		.vscode
containers/pkb		containers/pkb
examples		examples
references		references
scripts		scripts
specs		specs
src		src
tests		tests
tools		tools
.dockerignore		.dockerignore
.gitignore		.gitignore
.release-please-manifest.json		.release-please-manifest.json
CHANGELOG.md		CHANGELOG.md
CLAUDE.md		CLAUDE.md
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
Dockerfile		Dockerfile
GEMINI.md		GEMINI.md
INVENTORY.md		INVENTORY.md
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
benchmark.py		benchmark.py
build.rs		build.rs
install.sh		install.sh
release-please-config.json		release-please-config.json

Folders and files

Latest commit

History

Repository files navigation

mem

Features

Install

Pre-built binaries (recommended)

From source

cargo-binstall

Quick Start

1. Set your PKB directory

2. Index your files

3. Search

4. Connect to an AI assistant

Document Format

Status values

Node types

Priority levels

Edge types

Focus Scoring

Severity ladder

goal_type

Weight scale (Renooij-Witteman)

focus_score components

CLI Commands

Search & Index

Task Management

Memory

Knowledge Graph

MCP Tools

Prompts

Tools

Architecture

Environment Variables

Acknowledgments

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 114

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

`goal_type`

`focus_score` components

Packages