This repository contains a local-first foundation for a cross-modal memory system.
Current scope:
- Ingest markdown notes into SQLite.
- Ingest git commits and diffs into SQLite.
- Create searchable chunks from evidence.
- Query with a simple lexical retriever and get cited evidence.
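As a rough illustration of the last two items, the sketch below scores stored chunks by query-term overlap and returns the source URI of each hit as its citation. The `chunks` table, its columns, and the scoring rule are assumptions made for this example; they are not the repository's actual schema or retriever.

```python
import re
import sqlite3
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def lexical_search(conn: sqlite3.Connection, query: str, top_k: int = 5) -> list[tuple[str, float]]:
    """Score every chunk by query-term overlap; return (source_uri, score) pairs."""
    query_terms = set(tokenize(query))
    scored = []
    for source_uri, text in conn.execute("SELECT source_uri, text FROM chunks"):
        counts = Counter(tokenize(text))
        score = float(sum(counts[term] for term in query_terms))
        if score > 0:
            scored.append((source_uri, score))
    # Highest score first; ties are broken by URI so results are deterministic.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_k]


# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE chunks (source_uri TEXT, text TEXT)")
conn.executemany(
    "INSERT INTO chunks VALUES (?, ?)",
    [
        ("notes/http-parser.md", "Fixed the parser bounds check after a crash."),
        ("repo@abc123", "Refactor parser: tighten bounds check in header loop."),
        ("notes/groceries.md", "Milk, eggs, coffee."),
    ],
)
results = lexical_search(conn, "parser bounds check", top_k=2)
```

Each result carries the URI of the evidence it came from, so an answer can cite its sources directly.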

Getting started:
- Create a virtual environment and install:

      python -m venv .venv
      source .venv/bin/activate
      pip install -e .

- Initialize the local database:

      mem init-db

  Optional: seed deterministic synthetic notes + git history for smoke testing:

      mem seed-sample

- Ingest notes:

      mem ingest-notes /path/to/obsidian/vault

  Or ingest multiple vaults in one command:

      mem ingest-notes /path/to/vault-a /path/to/vault-b

  If you omit paths, mem ingest-notes will use all OBSIDIAN_VAULT_PATH_<n> values from your local .env.
- Ingest git history:

      mem ingest-git /path/to/repo --max-commits 300

  Or ingest multiple repos in one command:

      mem ingest-git /path/to/repo-a /path/to/repo-b --max-commits 300

  If you omit paths, mem ingest-git will use all REPO_PATH_<n> values from your local .env.
- Ask a question:

      mem ask "Why did I change the parser?" --top-k 5

- Run retrieval evaluation (using seeded sample queries or your own queries_eval rows):

      mem eval --top-k 5

Commands:
- mem init-db
- mem seed-sample [--workspace-dir PATH] [--force]
- mem ingest-notes [<vault_path> ...] (falls back to .env OBSIDIAN_VAULT_PATH_*)
- mem ingest-git [<repo_path> ...] [--max-commits N] (falls back to .env REPO_PATH_*)
- mem ask "<query>" [--top-k N]
- mem eval [--top-k N] [--query-prefix PREFIX] [--load-queries PATH.json]
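
The .env fallback used by mem ingest-notes and mem ingest-git could look roughly like the sketch below, which collects every variable named with a given prefix and a numeric suffix. The function name `paths_from_env` and the ordering by suffix are assumptions for illustration; the real CLI may resolve these values differently.

```python
import re


def paths_from_env(env: dict[str, str], prefix: str) -> list[str]:
    """Collect values of <prefix><n> keys, ordered by the numeric suffix n."""
    pattern = re.compile(rf"^{re.escape(prefix)}(\d+)$")
    numbered = []
    for key, value in env.items():
        match = pattern.match(key)
        if match and value.strip():
            numbered.append((int(match.group(1)), value.strip()))
    return [path for _, path in sorted(numbered)]


# Hypothetical environment, standing in for values loaded from a local .env.
env = {
    "OBSIDIAN_VAULT_PATH_2": "/path/to/vault-b",
    "OBSIDIAN_VAULT_PATH_1": "/path/to/vault-a",
    "REPO_PATH_1": "/path/to/repo-a",
    "UNRELATED": "ignored",
}
vaults = paths_from_env(env, "OBSIDIAN_VAULT_PATH_")
repos = paths_from_env(env, "REPO_PATH_")
```

In practice the same function could be given `dict(os.environ)` once the .env file has been loaded into the process environment.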
Use mem seed-sample to create a tiny deterministic sample vault + sample git repo and ingest them into an isolated sample DB.
- Creates a local workspace at ./data/sample-seed-workspace by default
- Writes to a separate temp sample DB by default (does not modify your main ./data/memory.db)
- Seeds synthetic markdown notes and git commits (no personal data)
- Populates namespaced sample rows in queries_eval for future eval/smoke tests
- Safe to re-run; unchanged content is reused and ingestion remains idempotent
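
One common way to get the "safe to re-run" behavior described above is to store a content hash per source and skip writes when the hash has not changed. The sketch below shows that idea; the `evidence` table, its columns, and `ingest_note` are hypothetical names, not the project's actual schema or code.

```python
import hashlib
import sqlite3


def ingest_note(conn: sqlite3.Connection, source_uri: str, content: str) -> bool:
    """Insert or update a note; return False when the stored content is unchanged."""
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    row = conn.execute(
        "SELECT content_hash FROM evidence WHERE source_uri = ?", (source_uri,)
    ).fetchone()
    if row is not None and row[0] == digest:
        return False  # Unchanged: keep the existing row as it is.
    # The primary key on source_uri makes this replace any older version.
    conn.execute(
        "INSERT OR REPLACE INTO evidence (source_uri, content_hash, content) "
        "VALUES (?, ?, ?)",
        (source_uri, digest, content),
    )
    return True


# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE evidence ("
    "source_uri TEXT PRIMARY KEY, content_hash TEXT NOT NULL, content TEXT NOT NULL)"
)
first = ingest_note(conn, "notes/parser.md", "Bounds check added.")
second = ingest_note(conn, "notes/parser.md", "Bounds check added.")
third = ingest_note(conn, "notes/parser.md", "Bounds check added and tested.")
row_count = conn.execute("SELECT COUNT(*) FROM evidence").fetchone()[0]
```

Re-running the same ingest leaves one row per source, and only changed content triggers a write.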
Run the sample retrieval benchmark:

    mem eval --query-prefix "[sample]" --top-k 5

Use --force to rebuild the sample workspace directory from scratch.
Use --db-path if you want the sample dataset in a specific non-main database path.
By default, data is stored at:

    ./data/memory.db

Set CMRAG_DB_PATH to override:

    export CMRAG_DB_PATH=/absolute/path/to/memory.db

To load your own evaluation queries, use a JSON array of rows:
    [
      {
        "query_text": "What fixed the parser bounds check bug?",
        "expected_source_uris": [
          "/abs/path/to/repo@abc123",
          "/abs/path/to/note/http-parser.md"
        ]
      }
    ]

mem eval --load-queries file.json upserts rows into queries_eval and then runs metrics (Recall@K, MRR@K, and an approximate citation hit-rate based on the top retrieved source).
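
For readers unfamiliar with these metrics, the sketch below computes them for a single query using their standard definitions; the reported Recall@K and MRR@K would then be averages over all queries. The function names and the exact hit-rate rule are assumptions for illustration and may differ from what mem eval implements.

```python
def recall_at_k(retrieved: list[str], expected: list[str], k: int) -> float:
    """Fraction of expected source URIs that appear in the top-k results."""
    if not expected:
        return 0.0
    top = set(retrieved[:k])
    return sum(1 for uri in expected if uri in top) / len(expected)


def reciprocal_rank_at_k(retrieved: list[str], expected: list[str], k: int) -> float:
    """1 / rank of the first expected URI within the top-k, or 0.0 if none appears."""
    wanted = set(expected)
    for rank, uri in enumerate(retrieved[:k], start=1):
        if uri in wanted:
            return 1.0 / rank
    return 0.0


def top_source_hit(retrieved: list[str], expected: list[str]) -> bool:
    """True when the single top-ranked source is one of the expected URIs."""
    return bool(retrieved) and retrieved[0] in set(expected)


# Expected URIs taken from the JSON row above; the retrieved list is made up.
expected = ["/abs/path/to/repo@abc123", "/abs/path/to/note/http-parser.md"]
retrieved = [
    "/abs/path/to/note/other.md",
    "/abs/path/to/repo@abc123",
    "/abs/path/to/note/unrelated.md",
]
recall = recall_at_k(retrieved, expected, k=3)
rr = reciprocal_rank_at_k(retrieved, expected, k=3)
hit = top_source_hit(retrieved, expected)
```

Here one of the two expected sources is retrieved, at rank 2, so both recall and reciprocal rank come out to 0.5 and the top-source check fails.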