The project consists of three core components:

```mermaid
graph TD
    A[Shell Hooks / User Term] -->|Manual Command / Future Hook| B[Rust Daemon /daemon]
    B -->|Ingest payload| C[FastAPI Backend /backend]
    C -->|Store raw metadata| D[(SQLite DB)]
    C -->|Embed & index| E[(Qdrant Vector Store)]
    C -->|Synthesis API| F[Gemini Generative API]
    G[CLI Tool: tm] -->|Query endpoints| C
```

| Component | Status | Key Features | Tech Stack |
|---|---|---|---|
| FastAPI Backend | Solid V1 | Ingestion, session gap detection, semantic recall, SQLite + Qdrant sync. | Python, FastAPI, SQLite |
| CLI Tool (`tm`) | Solid V1 | Rich panels, formatted tables, subcommands for recall, reports, replay. | Python, rich, httpx |
| Capture Daemon | Partial V1 | CLI wrapper for manual/file ingestions. No active passive watcher. | Rust, clap, reqwest |
| Embedding Engine | Excellent | Multi-backend fallback (Gemini → Ollama → Local SHA1 token hashing). | Python, httpx, hashlib |
| Redaction Layer | Good | Regex-based scrubbing of AWS keys, database connection strings, JWTs, IPs. | Python, re |
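
The last-resort "Local SHA1 token hashing" backend listed for the Embedding Engine can be pictured as a feature-hashing embedder. The following is a minimal sketch under stated assumptions, not the project's actual code: the function names `hash_embed` and `cosine`, the 256-dimension default, the tokenizer regex, and the sign trick are all hypothetical choices made for illustration.

```python
import hashlib
import math
import re


def hash_embed(text: str, dim: int = 256) -> list[float]:
    """Map text to a fixed-size unit vector by hashing each token into a bucket.

    Hypothetical helper: dimension and tokenization are assumptions.
    """
    vec = [0.0] * dim
    for token in re.findall(r"[a-z0-9_./-]+", text.lower()):
        digest = hashlib.sha1(token.encode("utf-8")).digest()
        # First four bytes of the digest choose the bucket.
        bucket = int.from_bytes(digest[:4], "big") % dim
        # One further bit chooses the sign, which reduces collision bias.
        sign = 1.0 if digest[4] & 1 else -1.0
        vec[bucket] += sign
    norm = math.sqrt(sum(v * v for v in vec))
    # L2-normalize so a dot product equals cosine similarity.
    return [v / norm for v in vec] if norm else vec


def cosine(a: list[float], b: list[float]) -> float:
    """Dot product of two already-normalized vectors."""
    return sum(x * y for x, y in zip(a, b))
```

Such a vector carries no semantic meaning beyond shared tokens, but it is deterministic, needs no network access, and keeps recall working when both remote backends are unavailable.
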
Reviewing PLAN.md shows that the core of Phase 1 (V1) is highly functional:
- Intelligent Sessionizer: `db.py` groups consecutive terminal commands into distinct logical sessions anchored around the inferred `project_root` (resolving `.git`, `Cargo.toml`, etc.) and a configurable time gap (e.g., 20 minutes).
- Failure-Recovery Tracking: The system detects a failing command (exit code $\neq 0$), captures its `likely_root_cause` via stdout heuristics, and automatically registers a recovery link (`failure_fixes` table) when the same command subsequently succeeds.
- Hybrid Recall & Synthesis: `/v1/recall` combines semantic vectors from Qdrant with classic SQLite `LIKE` fallbacks, feeding relevant context to the Gemini generative model to write a clean natural-language answer to questions like `tm recall "why did docker fail last time?"`.
- Weekly Reporting: Automatic aggregation of category stats, failure rates, and recurring failure commands.
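
The sessionizer rule described in the first item (same inferred `project_root`, idle gap below a configurable threshold) can be sketched as below. This is an illustration under assumptions, not the actual `db.py` logic: the `Event` dataclass and the `sessionize` function are hypothetical, events are assumed to arrive sorted by timestamp, and the project root is assumed to be resolved already.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Event:
    """Hypothetical event record; field names are assumptions."""
    command: str
    project_root: str
    timestamp: datetime


def sessionize(events, gap=timedelta(minutes=20)):
    """Group time-ordered events into sessions.

    A new session starts when the project root changes or when the
    idle time since the previous event exceeds `gap`.
    """
    sessions = []
    for event in events:
        current = sessions[-1] if sessions else None
        if (
            current is None
            or current[-1].project_root != event.project_root
            or event.timestamp - current[-1].timestamp > gap
        ):
            sessions.append([event])
        else:
            current.append(event)
    return sessions
```

For example, two commands five minutes apart in the same repository land in one session, while a command 35 minutes later, or one issued from a different repository, opens a new session.
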
> [!IMPORTANT]
> The largest architectural gap is the lack of automated shell interception. Right now, commands must be manually ingested via `tm ingest` or the daemon. Passive observation (the primary value proposition of Terminux) is not yet active.
`tests/test_classifier.py` contains documented edge cases of the regex classifier:
- `compose` alone matches `container` (e.g. `compose an email` is classified as `container`).
- `uv` alone matches `python-dev` (e.g. `check uv levels` matches `python-dev`).
- `token` matches `auth` anywhere in a command or output (e.g., `echo token_name` becomes `auth`).
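
These false positives are what a bare keyword search produces. The sketch below reproduces them with naive patterns and contrasts them with a stricter variant that requires the keyword to sit in command position. Both rule sets, the `classify` helper, and the `general` fallback are hypothetical illustrations rather than the contents of `classifier.py`.

```python
import re

# Naive rules: a keyword anywhere in the text decides the category.
NAIVE_RULES = {
    "container": re.compile(r"compose"),
    "python-dev": re.compile(r"\buv\b"),
    "auth": re.compile(r"token"),
}

# Stricter rules (assumed): the tool must start the command, optionally
# after sudo, and `token` must stand alone rather than inside an identifier.
STRICT_RULES = {
    "container": re.compile(r"^(?:sudo\s+)?(?:docker\s+compose|docker-compose)\b"),
    "python-dev": re.compile(r"^(?:sudo\s+)?uv\s+(?:pip|run|sync|add|venv)\b"),
    "auth": re.compile(r"(?<![\w-])token(?![\w-])"),
}


def classify(command: str, rules: dict) -> str:
    """Return the first matching category, or 'general' if none match."""
    for category, pattern in rules.items():
        if pattern.search(command):
            return category
    return "general"
```

The stricter rules trade recall for precision: they no longer fire on `compose an email`, but they also miss container commands that do not begin with `docker compose` or `docker-compose`, so the pattern list would need to grow with real usage.
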
Though we have 8 strong regex patterns (IPs, AWS keys, JWTs, database strings), we don't yet capture:
- Private SSH keys (`-----BEGIN OPENSSH PRIVATE KEY-----`).
- Slack/Discord webhook URLs.
- Generic config values matching `password = ...` with multiline blocks.
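
A sketch of patterns that would cover the first two gaps is shown below. It rests on assumptions: the webhook URL shapes follow the commonly seen forms (`https://hooks.slack.com/services/...` and `https://discord.com/api/webhooks/<id>/<token>`) and should be checked against the vendors' documentation, and the names `EXTRA_PATTERNS` and `redact` as well as the placeholder strings are hypothetical. The multiline `password = ...` case is not handled here.

```python
import re

# (pattern, replacement) pairs; all entries are illustrative assumptions.
EXTRA_PATTERNS = [
    # Entire PEM-style private key block, header through footer.
    (
        re.compile(
            r"-----BEGIN (?:OPENSSH|RSA|EC|DSA) PRIVATE KEY-----"
            r".*?"
            r"-----END (?:OPENSSH|RSA|EC|DSA) PRIVATE KEY-----",
            re.DOTALL,
        ),
        "[REDACTED_PRIVATE_KEY]",
    ),
    # Slack incoming-webhook URLs.
    (
        re.compile(r"https://hooks\.slack\.com/services/[A-Za-z0-9/_-]+"),
        "[REDACTED_SLACK_WEBHOOK]",
    ),
    # Discord webhook URLs on either domain.
    (
        re.compile(
            r"https://(?:discord|discordapp)\.com/api/webhooks/\d+/[A-Za-z0-9._-]+"
        ),
        "[REDACTED_DISCORD_WEBHOOK]",
    ),
]


def redact(text: str) -> str:
    """Apply every extra pattern in order and return the scrubbed text."""
    for pattern, replacement in EXTRA_PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

Matching the whole key block rather than only the header matters: redacting just the `-----BEGIN ...-----` line would still leave the key material in the stored output.
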
The system classifies events automatically but gives the user no command to override or correct a misclassification (e.g. reclassifying a `compose an email` event as `general` instead of `container`).
To systematically advance Terminux, we suggest prioritizing the roadmap into three distinct phases:
- Develop Shell Hooks: Write shell scripts (for Bash/Zsh) that leverage shell hooks (`PROMPT_COMMAND` in Bash, `preexec`/`precmd` in Zsh) to capture commands, execution duration, and exit codes.
- Daemon Enhancement: Update the Rust daemon to passively receive background signals from these shell hooks, tail the terminal output buffers, and automatically post the details asynchronously to the FastAPI backend.
- Refine Classifier Regular Expressions:
  - Upgrade `classifier.py` to use stricter word boundaries (e.g., `\buv\b` is good, but check surrounding tokens to ensure it isn't `uv index` or `uv rays`).
  - Add negative lookaheads/lookbehinds to category matching.
- Expand Redaction Patterns: Add SSH key headers and webhooks, and cover common config patterns to guarantee local-first safety.
- API Support for Memory Correction: Add a `/v1/events/{id}/correction` endpoint to update database records and Qdrant payloads with corrected categories or root causes.
- CLI Integration: Implement `tm correct <event_id> --category <new_category>` to easily fix any automated misclassifications.
- Confidence Scoring: Add a lightweight confidence model/heuristics rating to inferred root causes (e.g., `confidence: High` for exact matching errors like `ModuleNotFoundError`).
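
The confidence-scoring idea could start as an ordered table of exact error signatures with a low-confidence fallback. The sketch below is one possible shape under assumptions: the `RULES` table, its labels and confidence levels, and the `infer_root_cause` function are hypothetical, and only the `likely_root_cause` field name comes from the existing design.

```python
import re

# (pattern, root-cause template, confidence), checked in order.
# All entries are illustrative assumptions.
RULES = [
    (
        re.compile(r"ModuleNotFoundError: No module named '([^']+)'"),
        "missing Python module: {0}",
        "High",
    ),
    (
        re.compile(r"command not found"),
        "command not installed or not on PATH",
        "High",
    ),
    (
        re.compile(r"(?i)permission denied"),
        "insufficient permissions",
        "Medium",
    ),
]


def infer_root_cause(output: str) -> dict:
    """Return an inferred root cause with a coarse confidence label."""
    for pattern, template, confidence in RULES:
        match = pattern.search(output)
        if match:
            return {
                "likely_root_cause": template.format(*match.groups()),
                "confidence": confidence,
            }
    # Fallback: use the last non-empty output line and flag it as a guess.
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    return {
        "likely_root_cause": lines[-1] if lines else "unknown",
        "confidence": "Low",
    }
```

Keeping the confidence as a coarse label rather than a number makes it easy to display in the CLI and to filter on, for instance by surfacing only `Low` results as candidates for manual correction.
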