jessekemp1 · May 14, 2026
diff --git a/.gitignore b/.gitignore
@@ -122,3 +122,9 @@ docs/RECURRING_FAILURE_PATTERN.md
 docs/REFLECT_*.md
 docs/REORGANIZATION_COMPLETE_*.md
 docs/TESTING_FAILURE_ANALYSIS.md
+
+# Phase 0 — captured goldens are environment-specific
+goldens/
+
+# Live state written by work_absorber daemon — not source
+WORK_PROGRESS_REPORT.md
diff --git a/BETA_ONBOARDING.md b/BETA_ONBOARDING.md
@@ -55,32 +55,26 @@ Cortex talks to Claude Code through MCP. Update `.mcp.json` at your Dev root:
 
 Replace `YOUR_USER` with your actual username. Claude Code will pick this up on next restart.
 
-## Start the Bridge (For Intelligence Queries)
+## No Bridge Needed
 
-Basic commands (`status`, `onboard`, `doctor`) work immediately. For intelligence queries and MCP tools, start the bridge server:
-
-```bash
-# Start bridge in background (runs on :8765)
-python api/bridge_endpoint.py &
-
-# Verify
-curl -s http://127.0.0.1:8765/health | python3 -m json.tool
-```
+The MCP server runs fully in-process — all 18 tools work without starting any
+background server. (The optional HTTP bridge at `:8765` exists only for local
+agents like Hermes; MCP users can ignore it.)
 
 ## First Session (Verify It Works)
 
 ```bash
 # 1. Status — shows your context
 cortex status
 
-# 2. Query intelligence (requires bridge running)
+# 2. Query intelligence
 cortex intelligence "What are the key gotchas in this codebase?"
 
 # 3. Daily briefing
 cortex briefing
 ```
 
-In Claude Code, you'll now have 18 MCP tools: `cortex_intelligence`, `cortex_recommendations`, `cortex_anomalies`, `cortex_doctor`, and more.
+In Claude Code, you'll now have 18 MCP tools: `cortex_intelligence`, `cortex_recommendations`, `cortex_anomalies`, `cortex_doctor`, and more — all served in-process.
 
 ## What to Expect
 

diff --git a/INSTALL.md b/INSTALL.md
@@ -50,22 +50,10 @@ cortex status
 cortex doctor
 ```
 
-## Starting the Bridge Server
+## Intelligence Queries — No Server Needed
 
-The bridge server (`api/bridge_endpoint.py`) powers intelligence queries and MCP integration. Basic commands (`status`, `onboard`, `doctor`) work without it.
-
-```bash
-# Start the bridge (runs on :8765)
-python api/bridge_endpoint.py
-
-# Or with uvicorn directly
-uvicorn api.bridge_endpoint:app --host 127.0.0.1 --port 8765
-
-# Verify it's running
-curl http://127.0.0.1:8765/health
-```
-
-Once the bridge is running, you can use:
+Intelligence runs **in-process**. There is no bridge daemon to start for normal
+use:
 
 ```bash
 # Intelligence query — ask Cortex anything about your project
@@ -75,6 +63,10 @@ cortex intelligence "What patterns should I watch out for?"
 cortex briefing
 ```
 
+The optional HTTP bridge (`api/bridge_endpoint.py`, installed via
+`pip install -e ".[server]"`) exists only for local agents that consume Cortex
+over HTTP. MCP clients and the CLI never require it.
+
 ## Claude Code / MCP Integration
 
 If you use Claude Code (or any MCP-compatible client), add Cortex as a tool server:
@@ -110,15 +102,6 @@ result = bridge.query_intelligence("implement caching", project="my-api")
 session = bridge.get_session_context()
 ```
 
-## Gateway — Telegram Bot + Web Chat (Coming Soon)
-
-Cortex includes a Gateway module for Telegram (`@KempionBot`) and web chat (`:8765/chat`). **This feature is not yet active by default.** To enable it, you'll need:
-
-- A Telegram bot token (`CORTEX_TELEGRAM_TOKEN`) from [@BotFather](https://t.me/BotFather)
-- The bridge server running (`cortex serve`)
-
-See `cortex/gateway/` for configuration details.
-
 ## What to Expect
 
 - **First session**: Cortex starts with an empty memory. It learns from your git history, commits, and interaction patterns.

diff --git a/README.md b/README.md
@@ -219,7 +219,10 @@ print(f"Branch: {session['git']['branch']}")
 print(f"Active goals: {session['goals']}")
 ```
 
-Performance: bridge initialization under 10ms, context retrieval under 100ms, intelligence queries under 1s.
+Performance: context retrieval under 100ms, intelligence queries under 1s. First
+`CortexBridge()` construction loads ML/embedding modules and can take a few
+seconds — the MCP server defers this with a lazy singleton so tool calls stay
+responsive.
 
 ---
 
@@ -266,6 +269,10 @@ Cortex exposes a Model Context Protocol server so Claude Desktop and compatible
 
 Once registered, Claude can call `cortex_intelligence`, `cortex_recommendations`, and `cortex_anomalies` without prompt engineering on your end.
 
+All 18 MCP tools run **in-process** — no background server, no HTTP daemon. The
+optional FastAPI bridge (`pip install -e ".[server]"`) is only needed by local
+agents that consume Cortex over HTTP; MCP clients never require it.
+
 ---
 
 ## Comparison with Alternatives
@@ -307,8 +314,8 @@ All data is local by default. Nothing leaves your machine unless you configure a
 ```bash
 git clone https://github.com/jessekemp1/cortex
 cd cortex
-pip install -e .            # core only
-pip install -e ".[server]"  # + FastAPI server (uvicorn, apscheduler)
+pip install -e .            # core only — MCP server + CLI, all in-process
+pip install -e ".[server]"  # + optional HTTP bridge (uvicorn) for local agents
 pip install -e ".[all]"     # + analytics (xgboost, shap, openai)
 ```
 

diff --git a/ROADMAP.md b/ROADMAP.md
@@ -298,7 +298,14 @@ Beyond just discovering papers, Cortex needs to evolve its own capabilities:
 
 ### 6-Month Roadmap (Phased)
 
-#### Phase 1: Ship + Validate (Mar 12 — Mar 28) — CURRENT
+> **Status (2026-05-21):** Phase dates below are the original plan. Phases 1-3
+> ran roughly to schedule (see per-item status columns). A parallel
+> engineering-stabilization track also landed in May: gap analysis, a
+> contract-test safety net, repair of the broken MCP tools, the collapse of
+> the HTTP bridge (MCP now runs fully in-process), and the install-process
+> fixes for Linux. Phase 6 (batch redesign) was added from that track.
+
+#### Phase 1: Ship + Validate (Mar 12 — Mar 28)
 
 | Item | Priority | Status | Dependency |
 |------|----------|--------|------------|
@@ -522,6 +529,74 @@ class CortexMemoryBackend:
 | **Cross-repo transfer** (memory sharing across repos) | P2 | 2 weeks | Portfolio value |
 | **CRA self-improvement** (research agent learns what to scan) | P3 | 1 week | Meta-learning |
 
+#### Phase 6: Batch Subsystem Redesign — Local-First Tiered Routing
+
+**Context.** The `batch/` subsystem was 17,994 LOC built around one strategy:
+keep the Anthropic Batch API queue permanently full ("the flywheel"). For a
+solo dev / small-portfolio user this is over-built — 5 competing orchestrators,
+a daemon that *invents* work to fill the queue, and a large fraction of "batch
+intelligence" that is actually mechanical data-processing wrongly routed
+through a paid async API. A first gut removed 3,227 LOC of verified-dead code
+(deprecated/, 2 orphan orchestrators, weather cruft) — batch/ is now 14,764
+LOC. The remaining redesign replaces "always-full cloud queue" with a
+**local-first, 4-tier router** that escalates to Claude only when the task
+genuinely needs it.
+
+| Item | Priority | Effort | Impact |
+|------|----------|--------|--------|
+| **Step 1 — Tier 0 reclassification** (pull mechanical work off the API) | P1 | 3-4 days | High — removes 40-60% of batch volume; free, instant, robust |
+| **Step 2 — Unify 3 orchestrators → one `BatchOrchestrator`** | P1 | 1 week | High — migrate ~8 callers; deletes ~5-6K LOC + optimizer trio |
+| **Step 3 — Delete the flywheel daemon** (after `frontier_scout` migrated) | P2 | 1 day | Medium — kills the make-work loop + a macOS-only daemon |
+| **Step 4 — Build the 4-tier router** (extend `routing_framework.py`) | P1 | 3-4 days | High — the new decision brain; ~300-500 LOC |
+| **Step 5 — Add Ollama Tier 1** (optional, degrades to Tier 2) | P2 | 3-4 days | Medium — free local LLM for bulk summarization |
+
+**The 4-tier router (routing by task stakes, local-first default):**
+
+| Tier | Engine | Work | Cost |
+|------|--------|------|------|
+| 0 | Local, no LLM | Pattern extraction, dedup, scoring, scans | Free, instant |
+| 1 | Local LLM (Ollama) | Bulk summarization, triage, draft briefings | Free, minutes |
+| 2 | Claude Batch API | Real reasoning, non-urgent | 50% off, async |
+| 3 | Claude real-time | Interactive / in-session (MCP tools) | Full price |
+
+```python
+# Extend batch/routing_framework.py from 2-way (interactive|batch) to 4-tier.
+# Default to the lowest tier that can do the job; escalate, never default high.
+class TieredRouter:
+    def route(self, task: BatchTask) -> Tier:
+        if task.is_mechanical:          # no LLM needed at all
+            return Tier.LOCAL_COMPUTE  # Tier 0
+        if task.quality_tolerant and self.ollama_available:
+            return Tier.LOCAL_LLM      # Tier 1
+        if not task.time_sensitive:
+            return Tier.CLAUDE_BATCH   # Tier 2 — 50% off
+        return Tier.CLAUDE_REALTIME    # Tier 3
+```
+
+**Sequencing rules:**
+- Step 1 first — highest ROI, lowest risk, no caller migration (just stop
+  routing data-processing through Claude).
+- Step 2 is gated by the contract suite — migrate one caller at a time
+  (`briefing.py`, `cli/v2_ops.py`, `cli/system.py`, `cli/batch.py`,
+  `engines/frontier_scout.py`, `orchestration/models.py`, `health/monitor.py`,
+  `batch/overnight_queue.py`), run `pytest tests/contract/` + `smoke_mcp.py`
+  between each.
+- Step 5 (Ollama) must be **optional** — if no local model is reachable, the
+  router degrades Tier 1 → Tier 2. Never make local a hard dependency
+  (a CPU-only Hetzner VM runs 8B models slowly).
+
+**Why this is more robust:** removes the hard dependency on Batch-API uptime /
+network / API key for the bulk path; overnight work runs offline on the box;
+no "queue stuck for 24h" failure mode; Claude becomes an escalation, not a
+single point of failure. Cost becomes bounded — local is free, you pay Claude
+only for the escalated high-stakes slice.
+
+**Success criteria:** `batch/` ≤ 4,000 LOC (from 17,994). One orchestrator,
+not five. Zero "invent work to fill the queue" code. Tier 0 work makes zero
+API calls. The router degrades cleanly with no Ollama and no Batch API
+(everything still completes, just at a higher tier or on-demand).
+
+
 ---
 
 ### Research Papers to Track (Priority Queue)
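
Review note on the Phase 6 router: the `TieredRouter` snippet added to ROADMAP.md references `Tier`, `BatchTask`, and `self.ollama_available` without defining them. A self-contained version might look like the sketch below. The enum members and task attributes follow the names in the diff; the dataclass shape, the constructor, and the `batch_api_available` flag are assumptions added so the "degrades cleanly with no Ollama and no Batch API" success criterion can be exercised.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    LOCAL_COMPUTE = 0    # local, no LLM
    LOCAL_LLM = 1        # local LLM (Ollama)
    CLAUDE_BATCH = 2     # Claude Batch API, async
    CLAUDE_REALTIME = 3  # Claude real-time


@dataclass
class BatchTask:
    # Hypothetical task shape; attribute names follow the ROADMAP snippet.
    name: str
    is_mechanical: bool = False
    quality_tolerant: bool = False
    time_sensitive: bool = False


class TieredRouter:
    def __init__(self, ollama_available: bool = False, batch_api_available: bool = True):
        self.ollama_available = ollama_available
        self.batch_api_available = batch_api_available  # assumption, not in the diff

    def route(self, task: BatchTask) -> Tier:
        """Return the lowest tier that can handle the task."""
        if task.is_mechanical:
            return Tier.LOCAL_COMPUTE
        if task.quality_tolerant and self.ollama_available:
            return Tier.LOCAL_LLM
        if not task.time_sensitive and self.batch_api_available:
            return Tier.CLAUDE_BATCH
        return Tier.CLAUDE_REALTIME


router = TieredRouter(ollama_available=False)
summary = BatchTask("summarize commits", quality_tolerant=True)
print(router.route(summary).name)  # CLAUDE_BATCH
```

With no local model reachable, the quality-tolerant task falls through from Tier 1 to Tier 2, which matches the sequencing rule that Ollama must stay optional.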

diff --git a/STRUCTURE.md b/STRUCTURE.md
@@ -2,90 +2,106 @@
 
 ## Entry Points
 
-| File | Purpose |
+| Path | Purpose |
 |------|---------|
-| `cli.py` | CLI entry point (`cortex` command). 4,300 lines — use `def cmd_` to navigate command handlers. Decomposition planned for v1.1. |
-| `mcp_server.py` | MCP server for Claude Desktop and compatible clients. |
-| `bridge.py` | Python SDK. `CortexBridge` is the public class. Start here for programmatic use. |
+| `mcp_server.py` | **Primary interface.** MCP server for Claude Code / Claude Desktop. Since the Phase 5 bridge collapse, all 18 tools run in-process — no HTTP, no daemon. |
+| `cli/` | CLI package (`cortex` command). `cli/__init__.py:main()` dispatches to handlers in `cli/commands/`. |
+| `bridge.py` | `CortexBridge` — the backing intelligence class. Used directly (in-process) by the MCP server and as a Python SDK. |
+| `api/bridge_endpoint.py` | Optional FastAPI HTTP shim. Only needed by local agents (e.g. Hermes) — not by MCP clients. |
 
 ## Package Layout
 
 ```
 cortex/
-├── bridge.py               # SDK entry point — CortexBridge class
-├── bridge_intelligence.py  # Intelligence mixin for CortexBridge (query, retrieval)
-├── bridge_system.py        # System/ops mixin for CortexBridge (git, deps, portfolio)
-├── cli.py                  # CLI (monolith, refactor in progress)
-├── mcp_server.py           # MCP server
-├── config.py               # CortexConfig dataclass + load_config()
-├── briefing.py             # Daily briefing generation
+├── mcp_server.py           # MCP server — 18 tools, in-process
+├── mcp_handlers.py         # Stdlib-only handlers backing MCP tools (no HTTP)
+├── health_probe.py         # Stdlib-only service-health probes
+├── bridge.py               # CortexBridge class — core init/storage + composition
+├── bridge_intelligence.py  # IntelligenceMixin (query, retrieval, recommendations)
+├── bridge_system.py        # SystemMixin (git, deps, portfolio, graph, batch ops)
+├── briefing.py             # Daily briefing generation (incl. resilient tiered path)
 ├── orchestrator.py         # Task orchestration
+├── recommendation_engine.py # Task-level recommendations
+├── recommendations.py      # PortfolioRecommender — portfolio-level reports
 ├── scheduler.py            # Background job scheduler
 ├── learning.py             # Feedback and learning loop
+├── config.py               # CortexConfig dataclass + load_config()
+│
+├── cli/                    # CLI package
+│   ├── __init__.py         #   main() entry point + dispatcher
+│   └── commands/           #   command handlers (one module per area)
+│
+├── api/
+│   └── bridge_endpoint.py  # Optional HTTP shim (FastAPI) for non-MCP consumers
 │
 ├── intelligence/           # Core algorithms (importable subpackage)
-│   ├── memory/             # Three-tier memory (tiered_memory.py, hybrid_retriever.py)
-│   ├── monitoring/         # Trend analysis, anomaly detection, alert generation
-│   ├── signals.py          # Signal detection
-│   ├── unified_intelligence.py  # Aggregates all intelligence sources
+│   ├── memory/             #   tiered_memory.py, hybrid_retriever.py
+│   ├── monitoring/         #   trend analysis, anomaly detection, alerts
+│   ├── unified_intelligence.py
 │   └── ...
 │
-├── tests/                  # 600+ tests
-├── examples/               # Runnable demos (demo_tiered_memory.py, etc.)
-├── batch/                  # Async batch job infrastructure
-├── scripts/                # Utilities
-│   └── internal/           # Internal tooling (not for external use)
-├── docs/
-│   ├── API.md              # Full API reference
-│   ├── user_guide/         # Getting started guides
-│   └── archive/internal/   # Session logs and implementation notes (historical)
-└── agents/                 # Data agents for portfolio analysis
+├── batch/                  # Async batch infrastructure (Phase 6 redesign planned)
+├── engines/, supervisor/   # Orchestration + research agent
+│
+├── tests/
+│   ├── contract/           # MCP-tool + bridge-endpoint contract tests
+│   └── ...                 # ~93 test files
+└── docs/
 ```
 
 ## Bridge Architecture (Mixin Pattern)
 
-`CortexBridge` uses Python mixins to keep the class navigable:
+`CortexBridge` composes two mixins to keep the class navigable:
 
 ```python
 # bridge.py — defines CortexBridge + core init/storage
-class CortexBridge(BridgeIntelligenceMixin, BridgeSystemMixin):
+class CortexBridge(IntelligenceMixin, SystemMixin):
     def __init__(self, root_dir=None): ...
-    def get_context(self, task, project): ...          # Core retrieval
-    def inject_recommendation(self, title, ...): ...   # Store to memory
+    def get_context(self, task, project): ...          # core retrieval
+    def query_graph(self, node_type, filters): ...     # context graph
 
-# bridge_intelligence.py — intelligence queries
-class BridgeIntelligenceMixin:
+# bridge_intelligence.py
+class IntelligenceMixin:
     def query_intelligence(self, request, project, ...): ...
     def get_recommendations(self): ...
-    def get_anomalies(self, project): ...
 
-# bridge_system.py — system/portfolio operations
-class BridgeSystemMixin:
-    def get_session_context(self): ...
-    def get_portfolio_stats(self): ...
-    def get_dependency_graph(self, project): ...
+# bridge_system.py
+class SystemMixin:
+    def get_portfolio_health_summary(self): ...
+    def get_batch_status(self, batch_id): ...
 ```
 
+The MCP server reaches `CortexBridge` through a lazy singleton (`mcp_server._get_bridge`)
+— construction loads ML/embedding modules and is deferred until the first
+tool call that needs it.
+
 ## Configuration
 
 Cortex stores user data in `~/.cortex/` (never in the repo).
 
 Key env vars:
 - `CORTEX_ROOT_DIR` — path to your projects root (default: cwd)
 - `ANTHROPIC_API_KEY` — required for embedding and intelligence features
-- `CORTEX_ANTI_PATTERNS_SCRIPT` — path to custom anti-pattern mining script (optional)
+- `CORTEX_STATE_DIR` / `CORTEX_HOME` — override the `~/.cortex/` state location
 
 Config file: `~/.cortex/config.yaml` (created by `cortex init`)
 
 ## Known Technical Debt
 
-- `cli.py` is a monolith (4,311 lines). Decomposition into `commands/` is planned for v1.1.
-- Internal imports use bare module names (`from formatter import ...`) rather than `from cortex.formatter import ...`. This works with the current `package_dir` setup but will be migrated to proper package imports in v1.1.
+- Internal imports use bare module names (`from formatter import ...`) rather
+  than `from cortex.formatter import ...`. This works with the current
+  `package_dir` setup but is a real hazard: importing the same file via both
+  `bridge` and `cortex.bridge` produces two distinct module objects. New code
+  should pick one canonical path.
 - `sys.path.insert` calls in several files are a legacy workaround for the above.
+- `batch/` (~14.7K LOC) is over-built around an always-full cloud queue —
+  ROADMAP Phase 6 tracks the local-first redesign.
 
 ## Contributing
 
 1. Run `pytest tests/ -v` — all tests must pass before submitting a PR.
-2. Run `ruff check .` — no lint errors.
-3. New memory or retrieval logic requires tests with **specific value assertions** (not `assert result is not None`).
-4. See `tests/KNOWN_ISSUES.md` for the current state of test quality.
+2. Run `pytest tests/contract/` — the MCP/bridge contract suite must stay green.
+3. Run `ruff check .` — no lint errors.
+4. New memory or retrieval logic requires tests with **specific value
+   assertions** (not `assert result is not None`).
+5. See `tests/KNOWN_ISSUES.md` for the test-quality policy.
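
Review note on the import hazard: the technical-debt entry says importing the same file as both `bridge` and `cortex.bridge` yields two distinct module objects. The sketch below reproduces that effect in isolation. The package name `demo_pkg` and module `demo_bridge` are hypothetical and created in a temporary directory; nothing here touches the Cortex codebase.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    pkg = Path(root) / "demo_pkg"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "demo_bridge.py").write_text("class Bridge:\n    pass\n")

    # Put both the package's parent and the package directory itself on
    # sys.path, mimicking a sys.path.insert workaround.
    sys.path[:0] = [root, str(pkg)]
    importlib.invalidate_caches()
    try:
        bare = importlib.import_module("demo_bridge")
        qualified = importlib.import_module("demo_pkg.demo_bridge")
    finally:
        sys.path.remove(root)
        sys.path.remove(str(pkg))

    same_file = Path(bare.__file__).samefile(qualified.__file__)

print(same_file)                                    # True
print(bare is qualified)                            # False
print(bare.Bridge is qualified.Bridge)              # False
print(isinstance(bare.Bridge(), qualified.Bridge))  # False
```

The last line is where this usually bites: an object built through one import path fails `isinstance` checks against the class from the other path, and any module-level state (caches, singletons) exists twice.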
diff --git a/WORK_PROGRESS_REPORT.md b/WORK_PROGRESS_REPORT.md