MCP slim-down: in-process server, bridge optional, install fixed #1
Open
jessekemp1 wants to merge 23 commits into
…invariant
Adds the contract testing layer that should have caught the 4 broken MCP tools
shipping today (record_decision schema mismatch, conductor_compose field bug,
plan_create/plan_progress 404, graph_query silently drops `q` param).
What's added:
tests/contract/test_mcp_contract.py
One test per MCP tool. Documents the exact payload mcp_server.py sends and
asserts the bridge accepts it. Four broken tools are
@pytest.mark.xfail(strict=True) so the moment Phase 1 fixes them, the
strict flag forces marker removal — no silent regression.
tests/contract/test_bridge_endpoints.py
Covers /health, /docs, /intelligence/reason, and /meta/compounding/*
endpoints hit by gateway/, supervisor/, heartbeat, alert_monitor, and
compounding_risk. These survive the bridge slim-down in Phase 5.
tests/contract/test_schema_invariant.py
Asserts signal_bus.db DDL matches tests/fixtures/cortex_state.sql.
signal_bus.db is the only persistent state; the slim-down plan forbids
drift.
scripts/smoke_mcp.py
Spawns the bridge + the stdio MCP server and exercises every registered
tool. Differentiates KNOWN_BROKEN (Phase 1 fixes) from KNOWN_ENV_DEPENDENT
(psutil, v2 module, patterns.json). Unexpected greens in KNOWN_BROKEN
fail the run — the ratchet.
scripts/capture_goldens.sh + scripts/verify_goldens.sh
Pair for capturing/diffing `cortex briefing/status/doctor` output before
Phase 4 refactor. Goldens are env-specific and gitignored.
Baseline (against current main):
pytest tests/contract/ → 19 passed, 6 xfailed
smoke_mcp.py → 10 OK, 4 known-broken, 4 env-dependent, exit 0
Why the previous tests missed this: test_mcp_research_tools.py only verified
tools were *registered* (test_all_tools_always_loaded, test_exact_tool_count).
It never sent a payload through. KNOWN_ISSUES.md openly admits 87 weak
assertions remain. This file fixes that for the MCP/bridge surface.
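The ratchet described for scripts/smoke_mcp.py can be sketched roughly as below. This is a minimal illustration, not the script's actual code: `evaluate_smoke_run` and the `{tool_name: passed}` result shape are hypothetical; only the KNOWN_BROKEN / KNOWN_ENV_DEPENDENT idea comes from the description above.

```python
from typing import Dict, List, Set, Tuple


def evaluate_smoke_run(
    results: Dict[str, bool],
    known_broken: Set[str],
    known_env_dependent: Set[str],
) -> Tuple[int, List[str]]:
    """Return (exit_code, problems) for one smoke run.

    results maps tool name -> True if the tool call succeeded.
    """
    problems: List[str] = []
    for tool, ok in sorted(results.items()):
        if tool in known_broken:
            # The ratchet: a known-broken tool that starts passing must be
            # removed from the list, otherwise the run fails.
            if ok:
                problems.append(f"{tool}: unexpectedly green, remove from KNOWN_BROKEN")
        elif tool in known_env_dependent:
            # Either outcome is tolerated; it depends on the environment.
            continue
        elif not ok:
            problems.append(f"{tool}: failed")
    return (1 if problems else 0), problems
```

With this shape, a fixed tool cannot stay listed as broken: the first green run forces the list to be updated.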
Closes the gap surfaced by the Phase 0 contract tests. All 18 MCP tools
now return successful responses end-to-end (env-permitting); no more silent
"{"error": ...}" payloads from schema mismatches or 404s.
Per-tool fixes:
cortex_conductor_compose
mcp_server.py:241 sent {"project": ...} but the bridge Pydantic model
(PromptComposeRequest) requires "project_id". Renamed the payload key.
Caused 422 on every call.
cortex_graph_query
The bridge endpoint required node_type and accepted only filters; the
MCP signature advertised (node_type, query, limit) which FastAPI silently
dropped. Extended bridge query_graph to:
- make node_type optional (defaults to iterating all NodeType values)
- accept `q` for case-insensitive substring search
- accept `limit` (default 10, capped at 500)
Falls back gracefully when cortex.engines.synthesis is unavailable.
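A minimal sketch of the query semantics listed above (optional node_type, case-insensitive `q`, capped `limit`). The `NodeType` members, the node dict shape, and the function name `query_nodes` are assumptions for illustration; the real bridge method works against its own graph store.

```python
from enum import Enum
from typing import Dict, List, Optional


class NodeType(Enum):
    # Hypothetical members; the real NodeType enum is not shown in this PR.
    PROJECT = "project"
    DECISION = "decision"


MAX_LIMIT = 500  # cap taken from the description above


def query_nodes(
    nodes: List[Dict[str, str]],
    node_type: Optional[NodeType] = None,
    q: Optional[str] = None,
    limit: int = 10,
) -> List[Dict[str, str]]:
    """Filter nodes by type and substring, then apply the capped limit."""
    # No node_type given: iterate every NodeType value.
    types = [node_type] if node_type is not None else list(NodeType)
    needle = q.lower() if q else None
    capped = max(0, min(limit, MAX_LIMIT))
    matches: List[Dict[str, str]] = []
    for wanted in types:
        for node in nodes:
            if node["type"] != wanted.value:
                continue
            if needle is not None and needle not in node["name"].lower():
                continue
            matches.append(node)
    return matches[:capped]
```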
cortex_record_decision
Bridge /decisions/record was scenario-picker schema (prediction_id,
scenario_chosen, scenario_name, domain — all required) while MCP sent
free-form (decision, context, alternatives, rationale). Every call 422'd.
Added POST /decisions/record-freeform with a matching Pydantic model
(DecisionFreeformRequest). Persists alongside scenario-picker decisions
in ~/.cortex/decisions.jsonl, discriminated by a `kind` field — no new
file. MCP tool now points at the new endpoint and accepts additional
optional fields (project, confidence, tags).
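One way to picture the shared decisions.jsonl with a `kind` discriminator is the sketch below. The helper names and the "scenario" kind value are hypothetical; only the file-sharing idea and the `kind` field come from the description above.

```python
import json
from pathlib import Path
from typing import List, Optional


def append_decision(path: Path, kind: str, payload: dict) -> None:
    """Append one decision record to a JSONL file, tagged with its kind."""
    record = {**payload, "kind": kind}  # kind always wins over payload keys
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def read_decisions(path: Path, kind: Optional[str] = None) -> List[dict]:
    """Read all records, optionally keeping only one kind."""
    if not path.exists():
        return []
    records: List[dict] = []
    with path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)
            if kind is None or record.get("kind") == kind:
                records.append(record)
    return records
```

Because both record shapes live in one append-only file, readers that only care about one shape filter on `kind` instead of opening a second file.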
cortex_plan_create + cortex_plan_progress
Both endpoints were 404 — the routes did not exist. Added:
POST /plans/create using cortex.goal_parser.GoalParser (reused —
NOT reimplemented). Writes plan JSON to
~/.cortex/plans/{project}_{ts}.json.
GET /plans/progress reads ~/.cortex/plans/*.json, groups items by
status. Git-log cross-reference is deferred to
Phase 4.
Test/smoke updates:
- tests/contract/test_mcp_contract.py: all 5 xfail markers replaced with
real assertions. The AST drift check now scans Assign + AnnAssign +
subscript-assign forms and catches both old bugs and any future drift.
- scripts/smoke_mcp.py: KNOWN_BROKEN cleared (was 4 entries).
Baselines after Phase 1:
pytest tests/contract/ → 25 passed, 0 xfailed (was 19 passed, 6 xfailed)
smoke_mcp.py → 14 OK, 0 broken, 4 env-dependent (was 10 OK, 4 broken, 4 env)
tests/test_mcp_research_tools.py → 12 passed (no regression)
The 4 env-dependent failures (cortex_orchestrate needs psutil, cortex_outcomes
needs cortex.v2.learning.outcomes, cortex_prompt_refine needs patterns.json
seeded, cortex_batch_status returns 404 for a fake batch id) are not Phase 1
scope — they are real environmental gaps, not code bugs, and will be addressed
in later phases.
Removes verifiably-orphan code paths. Each entry confirmed via grep for
literal import + dotted-attribute references across .py, .sh, and .plist
files.
_contrib/                18K LOC, 4.4MB on disk. Historical artifacts
                         (old cortexdbx, old synthetic, old site
                         frontend). Not consumed by anything.
semantic_recommender.py  129 LOC, zero importers. Was an aspirational
                         second recommendation path; the canonical one
                         is recommendation_engine.py.
portfolio_analyzer.py    678 LOC, zero importers. Standalone module
                         that never wired into the bridge or CLI.
Earlier audit claims were over-aggressive. These candidates from the
plan turned out to have real importers and stayed:
- goal_parser.py: used by cli/__init__.py, briefing.py, orchestrator.py
- heartbeat.py: invoked by com.cortex.heartbeat.plist via launchd
- session_delta.py: used by api/bridge_endpoint.py via /session/delta
- validation.py: used by orchestration/anti_pattern_detector.py
- reflection.py, task_discovery.py, session_cache.py, self_audit.py,
deep_assessment.py, agent_factory.py, learning_config.py: all have
1-5 real importers
- briefing_resilient.py: 2 importers (cli wrapper + test) that need
proper migration to briefing.py — deferred to
Phase 4 (briefing cluster consolidation)
- cortex_ai_research_brief_*.md, WORK_PROGRESS_REPORT.md: WPR is
written live by scripts/work_absorber_daemon.sh
Verification post-deletion:
pytest tests/contract/ → 25 passed (no regression)
scripts/smoke_mcp.py → 14 OK (no regression)
…tras/
Stages 7 subsystems for eventual extraction to sibling repos by moving them
into a single cortex_extras/ directory. None of these are on the cortex MCP
critical path; collecting them in one place makes the eventual git filter-repo
split trivial and clarifies what's core vs auxiliary.
cortex_extras/synthetic/ ~16K LOC → cortex-synthetic
cortex_extras/cortexdbx/ ~1.9K → cortex-databricks
cortex_extras/gateway/ ~1.5K → cortex-gateway
cortex_extras/mvp/ ~1.1K → cortex-dashboard
cortex_extras/plugins/ ~2.9K → cortex-plugins
cortex_extras/tui/ ~0.7K → cortex-tui
cortex_extras/lean/ ~0.8K → delete (research artifact)
Four core import sites updated to use cortex_extras paths:
bridge.py:146-147 from cortex_extras.synthetic.{generator,schemas}
bridge_intelligence.py:45-46 from cortex_extras.synthetic.{generator,schemas}
api/bridge_endpoint.py:172 from cortex_extras.gateway.web_chat
cli/commands/compact.py:21 from cortex_extras.tui.data
All four imports are inside try/except ImportError blocks (the subsystems
were already optional). Behavior on missing extras: unchanged.
Intra-subsystem imports (e.g. `from synthetic.X import Y` inside
synthetic/ itself) keep working via path injection in
cortex_extras/__init__.py. This avoids rewriting hundreds of internal
imports. When a subsystem moves to its sibling repo, the path injection
becomes irrelevant.
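The path injection mentioned above might look roughly like this. It is a sketch under assumptions: the helper name `inject_extras_path` is hypothetical, and the actual contents of cortex_extras/__init__.py are not shown in this PR.

```python
import sys
from pathlib import Path


def inject_extras_path(extras_dir: Path) -> bool:
    """Prepend extras_dir to sys.path once; return True if it was added.

    After this, `from synthetic.X import Y` resolves against
    extras_dir/synthetic/ without rewriting the subsystem's imports.
    """
    entry = str(extras_dir.resolve())
    if entry in sys.path:
        return False
    sys.path.insert(0, entry)
    return True


# Inside a package __init__.py the call would plausibly be:
#     inject_extras_path(Path(__file__).parent)
```

The trade-off is that subsystem package names become importable top-level names for the whole process, which is acceptable only as a staging step before extraction.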
See cortex_extras/README.md for the migration roadmap per subsystem.
Verification:
pytest tests/contract/ → 25 passed (no regression)
scripts/smoke_mcp.py → 15 OK (no regression)
pytest tests/ (broader, no env) → 1,515 passed, 4 pre-existing failures
(research_directives.md path on test machine)
…g tiers
Two consolidations addressing duplicates flagged by the slim-down audit:
4a) recommendations.py: rename `RecommendationEngine` → `PortfolioRecommender`
The audit claimed "4 recommendation modules to consolidate." Investigation
shows the modules serve different concerns and only the class name
collided. The two classes have entirely different APIs:
recommendations.py:RecommendationEngine
Portfolio-level reports: priority projects, risk alerts, next action
across the full codebase portfolio.
recommendation_engine.py:RecommendationEngine
Task-level smart recommendation generation for specific work items.
intelligence/recommendations/
Library used BY recommendation_engine.py — not a duplicate.
Real fix is a rename, not a merge. Renamed the class to
`PortfolioRecommender` and kept `RecommendationEngine = PortfolioRecommender`
as a backward-compat alias. Updated 5 import sites
(bridge_intelligence.py x4, intelligence/unified_intelligence.py x1) +
their 5 instantiations.
LOC impact: ~10 lines net change. Eliminates the silent name collision
that lured the audit into the wrong consolidation plan.
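The rename-plus-alias approach, in miniature. The class body here is a hypothetical placeholder; only the two class names and the alias assignment come from the description above.

```python
from typing import Dict, List


class PortfolioRecommender:
    """Portfolio-level reports (placeholder body for illustration)."""

    def __init__(self, projects: List[Dict]) -> None:
        self.projects = list(projects)

    def priority_projects(self, top_n: int = 3) -> List[Dict]:
        # Highest priority first.
        ranked = sorted(self.projects, key=lambda p: p["priority"], reverse=True)
        return ranked[:top_n]


# Backward-compat alias: old imports of RecommendationEngine from this
# module keep working and resolve to the very same class object.
RecommendationEngine = PortfolioRecommender
```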
4b) briefing_resilient.py: fold into briefing.py
briefing_resilient.py (568 LOC) was a defensive tiered-fallback wrapper
around briefing.py — Tier 1 calls into briefing.generate_daily_briefing,
Tier 2 reads GOALS.md+git directly, Tier 3 falls back to bare git status.
It was a single outer indirection layer over briefing.py, not a separate
subsystem.
Folded its functions into briefing.py:
generate_resilient_briefing()
format_resilient_briefing()
_tier1_full_briefing, _tier2_file_based, _tier3_minimal_git
_parse_goals_md, _extract_immediate_actions, _extract_active_goals,
_extract_high_priority, _get_git_info, _get_cortex_state
_format_tier1, _format_tier2, _format_tier3
_R_RESET/_R_BOLD/etc. color constants
Tier 1 implementation now calls generate_daily_briefing() directly
instead of doing a late-binding `from briefing import ...` to escape
the same-module problem. Format helpers similarly call format_briefing()
directly. Eliminates the cross-module dance.
Updated callers:
cli/commands/briefing.py:111 from briefing_resilient → from briefing
tests/test_briefing_resilient.py from briefing_resilient → from briefing
patch("briefing_resilient.X") → patch("briefing.X")
briefing_resilient.py deleted.
briefing.py grew from 3,035 to 3,419 LOC (+384: 568 LOC folded in, ~180 LOC
deduplicated in helpers). Net: the 568-LOC briefing_resilient.py is gone and
the repo shrinks by ~184 LOC; briefing.py remains the single canonical
generator, with tiered fallback as a feature.
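The tiered-fallback control flow can be sketched as below. `run_tiers` and the tier callables are hypothetical stand-ins; the real generate_resilient_briefing() wires in _tier1_full_briefing, _tier2_file_based and _tier3_minimal_git, whose bodies are not shown here.

```python
from typing import Callable, List, Sequence, Tuple


def run_tiers(tiers: Sequence[Tuple[str, Callable[[], str]]]) -> Tuple[str, str]:
    """Try each tier in order; return (tier_name, output) of the first success."""
    errors: List[str] = []
    for name, tier in tiers:
        try:
            return name, tier()
        except Exception as exc:  # any tier failure falls through to the next
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all briefing tiers failed: " + "; ".join(errors))
```

Recording each tier's error before falling through keeps the failure diagnosable even when a lower tier eventually succeeds or everything fails.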
Verification:
pytest tests/contract/ → 25 passed (no regression)
pytest tests/test_briefing_resilient.py → 17 passed (folded tests still pass)
pytest tests/test_briefing.py → 19 passed (no regression)
scripts/smoke_mcp.py → 15 OK (no regression)
pytest tests/ (broader, no env) → 1,515 passed (same as Phase 3 baseline)
…_health direct call
Establishes the foundation for the bridge HTTP collapse. Two pieces, plus tests:
1. `_get_bridge()` lazy singleton in mcp_server.py
- Instantiates CortexBridge on first call, caches for the session.
- Uses single canonical import path (`from bridge import CortexBridge`)
to avoid the double-import hazard observed during development:
`bridge` and `cortex.bridge` resolve to distinct module objects with
distinct CortexBridge classes (Python double-import via two sys.path
entries). Forcing one path guarantees one shared instance.
- Importing CortexBridge eagerly costs ~16s (transformer/ML imports
in bridge_intelligence). Lazy pattern keeps MCP module-load fast;
first tool call inside a session pays the cost once.
2. Extracted `compute_service_health()` to health_probe.py (stdlib-only)
- 175 LOC of HTTP probes pulled out of api/bridge_endpoint.py:246-410
into a standalone module with zero heavy imports.
- `/service-health` endpoint now delegates to it (single source of truth).
- `cortex_service_health` MCP tool now calls `health_probe.compute_service_health`
directly — no HTTP roundtrip to localhost:8765, no CortexBridge required.
3. New `tests/contract/test_mcp_direct.py` with Step 1 assertions:
- test_service_health_no_http: asserts cortex_service_health does NOT
invoke _bridge_get.
- test_service_health_uses_health_probe_helper: asserts the tool delegates
to compute_service_health.
- test_get_bridge_singleton_caches: three calls return identical objects.
- test_get_bridge_not_called_at_import: importing mcp_server does not
eagerly instantiate CortexBridge (verifies lazy startup).
Verification:
pytest tests/contract/ → 29 passed (was 25)
scripts/smoke_mcp.py --tool cortex_service_health → ok
Pattern established for Steps 2-3: each remaining HTTP-backed MCP tool follows
this template — direct call (singleton method or extracted helper), no HTTP.
…alls
Each tool now calls into Python code in-process; none hit the HTTP bridge.
Three migration patterns used:
1. Filesystem-only endpoints → new `mcp_handlers.py` (stdlib-only):
cortex_projects → mcp_handlers.compute_projects
cortex_sessions → mcp_handlers.scan_sessions
cortex_taskboard → mcp_handlers.query_taskboard
cortex_plan_progress → mcp_handlers.plans_progress
2. CortexBridge methods → lazy `_get_bridge()`:
cortex_recommendations → bridge.get_recommendations
cortex_graph_query → bridge.query_graph (with type iteration + q filter)
3. Domain manager classes → lazy in-function imports:
cortex_anomalies → OrchestrationAnomalyManager(...).detect_all
cortex_batch_status → BatchAPIClient.get_batch_status
cortex_outcomes → v2.learning.outcomes.OutcomeDetector.get_recent_outcomes
New `mcp_handlers.py` (235 LOC) extracts endpoint internals:
compute_projects, scan_sessions, load_taskboard, save_taskboard,
query_taskboard, plans_progress.
All stdlib imports only — safe to load at MCP module init.
New direct-call tests in tests/contract/test_mcp_direct.py:
test_projects_no_http, test_sessions_no_http, test_taskboard_no_http,
test_plan_progress_no_http, test_recommendations_no_http, test_graph_query_no_http.
Each asserts the underlying helper/bridge method is invoked AND _bridge_get
is NOT called — the no-HTTP invariant.
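The no-HTTP invariant can be expressed with unittest.mock roughly as follows. `ToolModule` is a hypothetical stand-in for the mcp_server namespace; the real tests patch the actual module attributes.

```python
from unittest import mock


class ToolModule:
    """Stand-in holding one tool, its helper, and the old HTTP helper."""

    def _bridge_get(self, path: str) -> dict:
        raise RuntimeError("would perform an HTTP request")

    def compute_projects(self) -> list:
        return [{"name": "demo"}]

    def cortex_projects(self) -> list:
        # Direct in-process call; no HTTP roundtrip.
        return self.compute_projects()


def check_projects_no_http(module: ToolModule) -> list:
    """Assert the tool uses the helper and never touches _bridge_get."""
    with mock.patch.object(module, "_bridge_get") as bridge_get, \
         mock.patch.object(module, "compute_projects",
                           return_value=[{"name": "stub"}]) as helper:
        result = module.cortex_projects()
        helper.assert_called_once()
        bridge_get.assert_not_called()
    return result
```

Asserting both halves (helper called, HTTP helper untouched) catches a tool that silently falls back to HTTP as well as one that stops calling the helper.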
Verification:
pytest tests/contract/ → 35 passed (was 29, added 6 no-http tests)
scripts/smoke_mcp.py → 15 OK (no regression)
(3 env-dependent unchanged: batch needs API key,
v2 module not installed, prompt patterns.json not seeded)
After this step, 10 of 14 bridge-backed MCP tools no longer touch HTTP.
Remaining for Step 3: cortex_intelligence, cortex_conductor_compose,
cortex_plan_create, cortex_record_decision (the 4 POST tools).
After this step, ALL 14 bridge-backed MCP tools are HTTP-free. The
_bridge_get/_bridge_post helpers are now unused and will be removed in
Step 4.
Migrations:
cortex_intelligence → _get_bridge().query_intelligence(...)
cortex_conductor_compose → mcp_handlers.compose_conductor_prompt
cortex_plan_create → mcp_handlers.create_plan
cortex_record_decision → mcp_handlers.record_freeform_decision
New handlers in mcp_handlers.py:
compose_conductor_prompt
Mirrors the /conductor/compose endpoint body (intent framing,
git log, .next_session.md, persist to prompt_history.jsonl).
Pulls in CONDUCTOR_PROJECTS and NEXT_SESSION_FILES constants.
create_plan
Reuses goal_parser.GoalParser (single canonical import). Writes
plan JSON to ~/.cortex/plans/{project}_{ts}.json.
record_freeform_decision
Appends to ~/.cortex/decisions.jsonl with kind="freeform".
Direct-call tests in test_mcp_direct.py:
test_intelligence_no_http — _bridge_post NOT called
test_conductor_compose_no_http — _bridge_post NOT called
test_plan_create_no_http — _bridge_post NOT called
test_record_decision_no_http — _bridge_post NOT called
Verification:
pytest tests/contract/ → 39 passed (was 35, added 4 no-http tests)
scripts/smoke_mcp.py → 15 OK (no regression)
Next: Step 4 — delete _bridge_get/_bridge_post helpers and BRIDGE_URL
constant from mcp_server.py. The bridge process is no longer required
for any MCP tool.
After Phase 5 Steps 1-3 migrated all 14 bridge-backed tools to direct calls,
the HTTP helpers are dead code. Removed:
BRIDGE_URL — no longer referenced
_bridge_get — no callers
_bridge_post — no callers
import urllib.request, urllib.error
Net change: -60 LOC from mcp_server.py.
The MCP server now runs as a pure in-process Python application. The HTTP
bridge (uvicorn cortex.api.bridge_endpoint:app) is required ONLY for
non-Python consumers (telegram bot, vite UI). Step 6 will shrink the bridge
shim to match.
Replaced the per-test `bridge_get.assert_not_called()` pattern (which would
now raise AttributeError) with a single AST-based invariant:
test_mcp_server_has_no_http_plumbing
Scans mcp_server.py AST. Fails if it sees `import urllib`,
`BRIDGE_URL`, `_bridge_get`, or `_bridge_post` referenced anywhere.
Any future re-introduction of HTTP bridge calls fails this test.
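An AST scan of this kind might be written as below. It is a sketch, not the test's actual body: `find_http_plumbing` is a hypothetical name, and the real test reads mcp_server.py from disk rather than taking a source string.

```python
import ast
from typing import List

FORBIDDEN_NAMES = {"BRIDGE_URL", "_bridge_get", "_bridge_post"}


def find_http_plumbing(source: str) -> List[str]:
    """Return sorted, de-duplicated forbidden references found in source."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name == "urllib" or alias.name.startswith("urllib."):
                    hits.append(f"import {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            module = node.module or ""
            if module == "urllib" or module.startswith("urllib."):
                hits.append(f"from {module} import ...")
        elif isinstance(node, ast.Name) and node.id in FORBIDDEN_NAMES:
            hits.append(node.id)  # reads and assignments alike
        elif isinstance(node, ast.Attribute) and node.attr in FORBIDDEN_NAMES:
            hits.append(node.attr)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if node.name in FORBIDDEN_NAMES:
                hits.append(node.name)  # re-defined helper
    return sorted(set(hits))
```

Scanning the syntax tree rather than grepping the text means a mention in a comment or docstring does not trip the check.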
Verification:
pytest tests/contract/ → 40 passed (was 39)
scripts/smoke_mcp.py → 15 OK
scripts/smoke_mcp.py --no-bridge → 15 OK ← bridge not required
The --no-bridge result is the key Phase 5 milestone: MCP serves all
14 previously-HTTP tools without uvicorn running.
Two non-MCP Python consumers had real RPC needs that should bypass HTTP.
Migrated those to direct calls. The remaining four consumers genuinely
probe the bridge AS A SERVICE for non-Python clients (gateway, web UI)
and stay on HTTP.
Migrated (had real RPC need):
session_delta.py:_collect_emos
Was: curl /service-health to pull emos.pairs
Now: import health_probe; compute_service_health() locally
session_delta.py:_collect_bridge_status
Was: curl /health
Now: importlib.util.find_spec("bridge") — structural availability.
Note: bridge being importable != bridge process running. The function's
callers want to know if cortex's bridge layer is available in this
Python env, which is what find_spec measures. The HTTP probe was a
proxy for that.
supervisor/intake.py:_load_from_recommendations
Was: httpx.get http://localhost:8765/intelligence/recommendations
Now: from bridge import CortexBridge; CortexBridge().get_recommendations()
Falls back to HTTP path if bridge module can't be imported (covers
cross-process deployments where supervisor runs detached from bridge).
NOT migrated (legitimate external HTTP probes — keep as-is):
heartbeat.py → :8765/health
alert_monitor.py → :8765/docs
notifications/threshold_detector.py → :8765/health
intelligence/analysis/compounding_risk.py → :8765/meta/compounding*
These tools probe the bridge AS A SERVING PROCESS, which is the right
question for monitoring whether gateway and web-UI consumers are
unblocked. After Phase 5 the bridge is optional infrastructure; these
probes correctly report when that infrastructure is down.
Plan agent's Step 5 recommended these become find_spec checks. That
would lose the signal — find_spec only verifies the file is importable,
not that the HTTP process is responding. The HTTP probe is more
informative for monitoring use.
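For reference, the structural check discussed above reduces to something like this sketch. `module_available` is a hypothetical wrapper; `importlib.util.find_spec` is the standard-library call.

```python
import importlib.util


def module_available(name: str) -> bool:
    """True if `name` can be located for import in this Python environment.

    This says nothing about whether a server process is running.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised for a dotted name with a missing parent package, or for a
        # module whose __spec__ is unset.
        return False
```

That distinction is why the monitoring probes stay on HTTP: importability and liveness are different questions.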
Verification:
pytest tests/contract/ → 40 passed (no regression)
After Steps 1-5 migrated all 14 bridge-backed MCP tools to direct calls, the
endpoints they used had zero remaining callers. Removed:
POST /intelligence/query
GET /graph/query
GET /batches/{batch_id}
POST /batches/{batch_id}/cancel
GET /v2/outcomes
GET /v2/outcomes/stats
GET /projects
GET /sessions
POST /conductor/compose
POST /decisions/record-freeform
POST /plans/create
GET /plans/progress
Also dropped 4 now-orphaned Pydantic request models:
IntelligenceQuery, PromptComposeRequest, PlanCreateRequest, DecisionFreeformRequest
Kept endpoints (still used by non-MCP consumers):
/health, /docs, /, /service-health (monitoring + gateway)
/intelligence/recommendations (telegram, web chat)
/intelligence/reason (telegram, web chat)
/anomalies (telegram, web chat)
/taskboard (GET/POST/PATCH/DELETE) (vite UI)
/decisions/record (Co-Navigator scenario picker)
/meta/compounding* (compounding_risk.py)
/briefing/*, /session/*, /goals/* (existing briefing tests + gateway)
/predictions/*, /guardian/*, ... (vite UI surface — defer to next pass)
api/bridge_endpoint.py: 3158 → 2576 LOC (-582, ~18%).
Test surface updated:
test_mcp_contract.py
Slimmed from 13 endpoint-exercising tests to 4 (the endpoints that
survived) plus 3 invariants (doctor-is-local, prompt-refine-is-local,
all_18_tools_documented). The stale tests for deleted endpoints were
pure dead-weight — MCP no longer routes through HTTP for any of them.
test_mcp_payloads_match_bridge_pydantic_models dropped — mcp_server.py
has zero _bridge_post calls, so the AST check finds nothing to drift
against.
Two reactive fixes from the trim script's overly-greedy span detection:
- Restored ReasonQuery class (was sandwiched between /intelligence/query
and /intelligence/reason, deleted with the former)
- Restored TaskBoard helpers + 3 Pydantic models (sandwiched between
/sessions and /taskboard)
Verification:
pytest tests/contract/ → 29 passed (was 40; -11 stale tests)
scripts/smoke_mcp.py --no-bridge → 15 OK (same as Step 4 baseline)
The bridge process now serves a narrow set of legitimately HTTP-consumed
endpoints. Future passes can audit guardian/, signal/, briefing/, etc.
once their consumer set is verified.
…itecture
Three invariants enforce the Phase 5 deliverables and prevent regression:
1. test_mcp_module_import_under_2s +
test_bridge_singleton_remains_uninitialized_after_import
The whole point of Step 1's lazy singleton is fast MCP startup (~16s
saved over eager bridge import). Both tests catch any future move of
`from bridge import CortexBridge` to module scope.
2. test_bridge_endpoint_inventory_unchanged
Snapshots the exact endpoint set after Step 6's trim. Any add/remove
must update EXPECTED_ENDPOINTS with a code comment naming the consumer
that needs the change. Prevents accidental re-introduction of MCP-only
endpoints.
3. test_no_phase5_deleted_endpoints_resurrect
Lists the 12 paths Step 6 removed by name. If any reappears, the test
fails with a clear message pointing to mcp_handlers / _get_bridge as
the correct replacement pattern.
Verification:
pytest tests/contract/ → 33 passed
scripts/smoke_mcp.py --no-bridge → 15 OK
──────────────────────────────────────────────────────────────────────
Phase 5 (bridge collapse) complete
Started: MCP server required uvicorn cortex.api.bridge_endpoint:app on
port 8765. 14 of 18 tools made HTTP roundtrips back to localhost.
Ended: MCP server is fully in-process. The HTTP bridge is optional,
needed only for telegram bot, vite UI, and external monitoring probes.
Commits:
37b0b17 Step 1: lazy CortexBridge singleton + service_health
ee4e659 Step 2: 9 read-only tools direct-call
872cd1d Step 3: 4 POST tools direct-call
235b3e3 Step 4: delete HTTP plumbing from mcp_server.py
29e50c6 Step 5: migrate in-process bridge consumers
a9b5a5c Step 6: trim shim (-582 LOC, 12 endpoints removed)
(this)  Step 7: invariant tests
Bridge merge into core.py (per original plan): DEFERRED per Plan agent's
recommendation. The 4,328-LOC mixin merge is risk-uncorrelated with the
HTTP collapse; doing both at once would double blast radius. Separate
follow-up.
Product decision: the cortex bridge is needed only by local agents (Hermes).
The web-facing UI surfaces are archived — present for recovery, out of the
build/test/install path.
Moved to archive/:
site/ → archive/site/ (vite/React dashboard)
cortex_extras/gateway/ → archive/gateway/ (telegram bot + web chat)
archive/README.md documents what's frozen and how to revive it. Nothing in
archive/ is imported by the active codebase; the directory can be deleted
outright (git history preserves it).
Wiring removed:
api/bridge_endpoint.py
Dropped the web-chat router mount (was: try/except import of
cortex_extras.gateway.web_chat). Bridge routes: 56 → 53.
install.sh
Removed the "Site dashboard" npm-install step and its verify line.
Removed the node-not-found warning (no longer relevant). Renumbered
the remaining install steps 5-8.
tests/contract/test_phase5_invariants.py
Removed /chat, /chat/manifest.json, /ws/chat from EXPECTED_ENDPOINTS
(those came from the now-archived web-chat router).
cortex_extras/README.md
Dropped gateway from the extraction roadmap; noted the archive move.
Note on the vite dashboard: Phase 5 Step 6 had already removed bridge
endpoints the dashboard depended on (/projects, /sessions,
/intelligence/query, /conductor/compose). Archiving makes that breakage
moot rather than requiring endpoint restoration.
Verification:
pytest tests/contract/ → 33 passed
scripts/smoke_mcp.py --no-bridge → 15 OK
pytest tests/ (broader) → 1,523 passed, 4 pre-existing failures
…docs
Three real bugs surfaced in the post-Phase-5 reassessment.
1. _get_bridge() was not thread-safe
FastMCP can dispatch tool calls concurrently. On a cold process, two
threads racing into _get_bridge could both see _bridge_singleton is
None and each construct a CortexBridge — a 16s double-init producing
two divergent instances. Added a threading.Lock with double-checked
locking: the hot path stays lock-free after warm-up, the cold path is
serialized. New test_get_bridge_is_thread_safe hammers it from 12
threads gated on a barrier.
2. cortex_doctor reported the bridge as a hard failure
Since Phase 5 the MCP server runs fully in-process; the HTTP bridge is
optional (Hermes-only). cortex_doctor still had the bridge check
feeding `all_pass`, so every MCP user without a bridge running saw a
failed doctor report. Made the check informational (pass: True always,
detail reports up/down). all_pass now reflects only real problems.
3. Docs described the dead architecture
- mcp_server.py module docstring said "Connects to bridge at :8765 via
HTTP" — false since Phase 5.
- BETA_ONBOARDING.md had a whole "Start the Bridge" section telling
users to run `python api/bridge_endpoint.py &` and curl :8765 before
MCP tools work. They work in-process now; section replaced with
"No Bridge Needed".
- README.md MCP section + install extras now state tools run
in-process and the [server] extra is optional (Hermes-only).
Verification:
pytest tests/contract/ → 34 passed (was 33; +1 thread-safety test, which
replaces nothing)
cortex_doctor → bridge check pass:True, all_pass driven by real checks
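The double-checked locking from item 1 above, together with a barrier-gated hammer like the one the new test describes, can be sketched as follows. `get_singleton`, `hammer` and the injected `factory` are hypothetical; the real code constructs CortexBridge inside _get_bridge().

```python
import threading
from typing import Callable, List

_bridge_singleton = None
_bridge_lock = threading.Lock()


def get_singleton(factory: Callable[[], object]) -> object:
    """Build the instance at most once, even under concurrent first calls."""
    global _bridge_singleton
    instance = _bridge_singleton
    if instance is None:                 # fast path: no lock once warm
        with _bridge_lock:               # cold path: serialized
            if _bridge_singleton is None:  # re-check under the lock
                _bridge_singleton = factory()
            instance = _bridge_singleton
    return instance


def hammer(factory: Callable[[], object], n_threads: int = 12) -> List[object]:
    """Release n_threads at once via a barrier and collect what each got."""
    barrier = threading.Barrier(n_threads)
    results: List[object] = []

    def worker() -> None:
        barrier.wait()  # all threads race into get_singleton together
        results.append(get_singleton(factory))

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

The second check inside the lock is what prevents the double construction: a thread that lost the race sees the instance already set and skips the factory.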
…S claims
The "87 weak assertions across 24 files" figure in KNOWN_ISSUES.md was
stale. The actual remaining weak-assertion debt:
assert X in (True, False)
4 occurrences, all in test_bridge_integration.py (availability-flag
import tests). The pattern passes for any bool and also wrongly accepts
0/1. Replaced with `assert isinstance(X, bool)`, which rejects every
non-bool value. That is the real invariant: the flags are set by
try/except guards and must always be a bool, never None.
assert X is not None (sole assertion)
test_known_issues_accuracy's AST scan of test_integration_*.py reports
zero functions whose only assertion is `is not None`. Already clean.
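The difference between the two assertion forms comes down to Python treating 0 and 1 as equal to False and True. A small sketch (the function names are hypothetical):

```python
def weak_check(value: object) -> bool:
    """The old pattern: membership uses ==, so 0 and 1 slip through."""
    return value in (True, False)


def strict_check(value: object) -> bool:
    """The replacement: only genuine bool instances pass."""
    return isinstance(value, bool)
```

`weak_check(0)` and `weak_check(1)` both pass because `0 == False` and `1 == True`, while `strict_check` rejects them along with None.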
Changes:
tests/test_bridge_integration.py
4 weak assertions → isinstance(X, bool).
tests/test_memory_roundtrip.py
Removed @pytest.mark.xfail(strict=True) from test_known_issues_accuracy.
It was xfailed because KNOWN_ISSUES.md's "Fixed" claims contradicted
code reality. They no longer do — the test is now a live guard that
fails if anyone marks an issue resolved without fixing it.
tests/KNOWN_ISSUES.md
Rewritten: weak-assertion items moved to a "Resolved" section with
dates; kept the standing test-quality policy. The file is now accurate
(the meta-test enforces this).
────────────────────────────────────────────────────────────────────────
Bridge-mixin merge (bridge.py + bridge_intelligence.py + bridge_system.py
→ core.py): ASSESSED AND DECLINED.
The slim-down plan estimated 4,328 LOC → ~2,500 "after dedup". Measured
the actual dedup opportunity:
- 39 except-ImportError blocks across 3 files
- ~16 duplicate import lines (the only real dedup)
- 86 methods total (6 + 33 + 47) — all cohesively grouped
Realistic saving from a merge: ~50-100 LOC (1-2%), in exchange for one
4,300-line file replacing three navigable ~1,500-line ones, touching the
single most central class in the system. Net-negative. Not done.
The real bloat in the bridge files is the 39 defensive imports and
possible dead methods — that needs surgical per-method analysis, not a
file merge. Tracked as separate future work.
Verification:
pytest tests/contract/ → 34 passed
pytest tests/test_bridge_integration.py tests/test_memory_roundtrip.py
tests/test_assertion_quality.py → 33 passed, 9 skipped, 2 xfailed
scripts/smoke_mcp.py --no-bridge → 15 OK
The smoke suite's 3 long-standing env-dependent failures were real product
gaps, not environment quirks. All three fixed.
3a — cortex_outcomes pointed at a module that was never built
mcp_server imported `v2.learning.outcomes.OutcomeDetector`. There is no
`v2/` package anywhere in the repo — the tool had never returned data.
Repointed at the real outcome store: ~/.cortex/outcomes.jsonl, written
by feedback.OutcomeLogger (OutcomeEntry schema). New handler
mcp_handlers.read_outcomes filters by context.project, returns newest
`limit` entries. cortex_outcomes now returns real outcomes.
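A handler of this shape could look like the sketch below. It assumes the file is append-only (newest entry last) and that each line is a JSON object with an optional `context.project`; the real OutcomeEntry schema and mcp_handlers.read_outcomes signature may differ.

```python
import json
from pathlib import Path
from typing import List, Optional


def read_outcomes(path: Path, project: Optional[str] = None, limit: int = 10) -> List[dict]:
    """Return up to `limit` entries, newest first, optionally for one project."""
    if limit <= 0 or not path.exists():
        return []
    entries: List[dict] = []
    with path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                entry = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip a corrupt line rather than fail the tool
            if not isinstance(entry, dict):
                continue
            context = entry.get("context") or {}
            if project is not None and context.get("project") != project:
                continue
            entries.append(entry)
    # Assumption: appended in time order, so the tail is the newest.
    return list(reversed(entries[-limit:]))
```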
3b — cortex_prompt_refine failed hard on a missing cache
The tool read patterns.json and, if absent, told the user to run a
script. Now it auto-seeds: on first call it invokes
prompt_db.extract_patterns(read_log()), which writes the cache and works
even with zero prompt history (category-hint defaults still serve).
Root-cause fix: the tool now uses `prompt_db.PATTERNS_FILE` as the single
source of truth for the cache path. Previously mcp_server kept its own
PROMPTS_DIR constant that could drift from prompt_db's — they happened to
agree in production but diverged under test. One canonical path now.
3c — cortex_batch_status surfaced a cryptic SDK error
Without ANTHROPIC_API_KEY, BatchAPIClient() raised "Could not resolve
authentication method...". Now caught and translated to:
"cortex_batch_status requires an Anthropic API key. Set ANTHROPIC_API_KEY
in cortex/.env or your environment."
Tests (tests/contract/test_mcp_direct.py):
test_outcomes_reads_real_jsonl_not_phantom_v2
test_batch_status_gives_clear_auth_error
test_prompt_refine_autoseeds_patterns
scripts/smoke_mcp.py: KNOWN_ENV_DEPENDENT trimmed from 4 entries to 1
(only cortex_batch_status, which legitimately needs an API key — and now
returns a clear message without one).
Verification:
pytest tests/contract/ → 37 passed (was 34)
scripts/smoke_mcp.py --no-bridge → 17 OK (was 15)
1 env-dependent (batch_status, needs key)
Every MCP tool now works end-to-end in a clean environment, or returns a
clear actionable message. This is the beta-ready milestone.
A clean-room fresh install (fresh venv, pip install -e ., run the binaries)
surfaced five real blockers (three critical, two major) a beta user on Linux would hit.
CRITICAL — mcp SDK was not installed by the core install
`mcp` lived in the [server] optional-dependencies extra. But install.sh
and the README tell users to run `pip install -e .` (core only). The MCP
server — the primary interface since Phase 5 — then crashes immediately:
`ModuleNotFoundError: No module named 'mcp'`. Moved `mcp>=1.0.0` to core
dependencies. Verified: fresh core install now registers all 18 tools.
CRITICAL — scikit-learn was imported but never declared
intelligence/embeddings_client.py has a hard top-level
`from sklearn.feature_extraction.text import HashingVectorizer`. It is
the local vector-retrieval backend, used transitively by
spec_knowledge_base, pattern_memory, hybrid_retriever. It was in NO
dependency list. Added `scikit-learn>=1.3.0` to core dependencies.
CRITICAL — install.sh used macOS-only `sed -i ''`
Four call sites (.env API-key write, CORTEX_ROOT_DIR write, hooks
REPO_ROOT fix) used BSD `sed -i ''`. On GNU sed (Linux) `''` is parsed as
the script, so the substitutions silently no-op — a Linux beta user's API
key and project root never get written to .env. Added a portable
`sed_inplace()` helper (temp-file form, works on GNU/BSD/busybox) and
converted the Linux-reachable call sites. The plist-install `sed -i ''`
at the macOS-only Darwin guard is left as-is (correct there).
MAJOR — cortex doctor (CLI) failed on a healthy fresh install
cli/commands/v2_ops.py:cmd_doctor still treated `bridge :8765` as a hard
check — but since Phase 5 the bridge is optional (the MCP/CLI run
in-process). Made it informational, matching the earlier fix to the MCP
cortex_doctor tool. (The sklearn check now legitimately passes since
sklearn is a declared dependency.)
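The hard-versus-informational distinction can be sketched as below. This is not the code in cli/commands/v2_ops.py; `Check`, `run_doctor`, and the status labels are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Check:
    name: str
    probe: Callable[[], bool]
    required: bool = True  # False means the result is informational only


def run_doctor(checks: List[Check]) -> Tuple[bool, List[str]]:
    """Run every check; only failed *required* checks make the result unhealthy."""
    healthy = True
    lines = []
    for check in checks:
        if check.probe():
            status = "ok"
        elif check.required:
            status = "FAIL"
            healthy = False
        else:
            status = "info"
        lines.append(f"[{status}] {check.name}")
    return healthy, lines
```

With the bridge check marked `required=False`, a fresh install with no bridge running still reports healthy while the bridge state stays visible in the output.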
MAJOR — pyproject py-modules missing Phase 5 modules
Added bridge_intelligence, bridge_system, mcp_handlers, health_probe to
[tool.setuptools] py-modules — mcp_server imports the latter two, so a
non-editable wheel install was broken. Added `archive*` to the package
find-exclude so the archived UI code isn't packaged.
E2E verification (fresh venv, pip install -e . from clean):
- 18 MCP tools register
- cortex doctor: all checks pass except ANTHROPIC_API_KEY (correct —
none set in test env)
- scripts/smoke_mcp.py --no-bridge → 17/18 tools green, 1 needs API key
- sed_inplace verified writing .env correctly on Linux
- install.sh: bash -n syntax OK
- pytest tests/contract/ → 37 passed
First step of the batch-subsystem redesign. Deletes only what a hard import
audit proved has zero external callers — the entangled consolidation work
(see below) is deliberately NOT done here.
Deleted (zero importers outside batch/, verified by grep across .py/.sh/.plist):
batch/deprecated/                             1,476 LOC   5 files, deprecated by name
batch/strategic_orchestrator.py                 694 LOC   zero importers
batch/intelligent_orchestrator_anthropic.py     623 LOC   only docstring self-refs
batch/weather_batcher.py                        434 LOC   zero importers; domain cruft
                                                          (Vortex weather backfill,
                                                          not portfolio intelligence)
batch/__init__.py: dropped the weather_batcher export (WeatherBackfillBatcher,
WeatherBackfillContext) — it was the only thing pulling that module into the
package import graph.
batch/ subsystem: 17,994 → 14,767 LOC.
Verification:
import batch + batch.batch_api_client → OK
pytest tests/contract/ → 37 passed
scripts/smoke_mcp.py → 17/18 green
pytest batch/tests/ → 48 passed, 2 failed (both PRE-EXISTING —
confirmed by re-running against stashed
pre-change tree)
──────────────────────────────────────────────────────────────────────────
NOT done here — and why. The redesign plan called for cutting 18K → 3-4K by
also removing 3 more orchestrators, the flywheel, and the optimizer trio.
The import audit shows those are NOT cleanly separable:
orchestrator.py             → imported by cli/commands/v2_ops.py,
                              health/monitor.py, orchestration/models.py,
                              batch/overnight_queue.py
intelligent_orchestrator    → orchestration/models.py, engines/frontier_scout.py,
                              cli/commands/system.py, nightly plist
v2a_sprint_orchestrator     → briefing.py, cli/commands/batch.py
flywheel_daemon             → engines/frontier_scout.py, flywheel plist
optimizer / usage_optimizer → intelligent_orchestrator, flywheel, briefing.py
Deleting those requires migrating ~8 caller files across briefing.py, the
CLI, engines/, orchestration/, health/ — and the 3 live orchestrators have
DIFFERENT APIs, so "keep one" means designing a unified one, not picking a
survivor. That is a sequenced refactor + the 4-tier local-first router
build, not a deletion pass. Tracked as the next batch-redesign phase.
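The import audit above was done with grep across .py/.sh/.plist files. For the Python side only, a similar check can be sketched with the standard `ast` module. `find_importers` is a hypothetical helper; it ignores relative imports, dynamic imports, and the shell and plist references the grep pass covered.

```python
import ast
from pathlib import Path


def find_importers(root, module, exclude=()):
    """Return sorted relative paths of .py files under `root` that import `module`.

    `exclude` holds top-level directory names to skip (for example "batch").
    """
    root = Path(root)
    hits = []
    for path in root.rglob("*.py"):
        rel = path.relative_to(root)
        if rel.parts[0] in exclude:
            continue
        try:
            tree = ast.parse(path.read_text(encoding="utf-8"))
        except (SyntaxError, UnicodeDecodeError):
            continue  # unparsable files are skipped, not reported
        for node in ast.walk(tree):
            names = []
            if isinstance(node, ast.Import):
                names = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
                # Covers both "from a.b import c" and "from a import b".
                names = [node.module] + [f"{node.module}.{a.name}" for a in node.names]
            if any(n == module or n.startswith(module + ".") for n in names):
                hits.append(rel.as_posix())
                break
    return sorted(hits)
```

An empty result for a module, with its own package excluded, corresponds to the "zero importers" condition used to justify the deletions; a non-empty result lists the caller files that would need migrating.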
…ered routing)
Adds the batch-redesign plan to ROADMAP.md as a tracked phase, following the
assessment that the batch/ subsystem (was 17,994 LOC) is over-built around an
"always-full cloud queue" strategy that doesn't match a solo/small-portfolio
workload.
Phase 6 captures the 5-step sequence:
1. Tier 0 reclassification: pull mechanical work off the paid API path
2. Unify 3 orchestrators → one BatchOrchestrator (migrate ~8 callers)
3. Delete the flywheel daemon (make-work loop)
4. Build the 4-tier router (Tier 0 local-compute → Tier 3 Claude realtime)
5. Add optional Ollama Tier 1, degrading to Claude Batch when absent
Includes the tier table, a TieredRouter design sketch, per-step
caller-migration sequencing gated by the contract suite, and success criteria
(batch/ ≤ 4,000 LOC, one orchestrator, zero queue-filling make-work).
Records the work already done: the first gut removed 3,227 LOC of
verified-dead code (commit 5af2e16); batch/ is currently 14,764 LOC.
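The TieredRouter design sketch itself lives in ROADMAP.md and is not shown in this PR text. A minimal illustration of the idea, using the four tiers named in this PR, might look like the following. The `Task` fields (`mechanical`, `urgent`) and the routing rules are assumptions made for the example, not the planned design.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    LOCAL_COMPUTE = 0    # mechanical work, no model call
    LOCAL_LLM = 1        # optional local model (Ollama)
    CLAUDE_BATCH = 2     # paid, asynchronous
    CLAUDE_REALTIME = 3  # paid, synchronous


@dataclass
class Task:
    name: str
    mechanical: bool = False  # hypothetical flag: doable without any model
    urgent: bool = False      # hypothetical flag: needs an answer now


class TieredRouter:
    def __init__(self, ollama_available: bool):
        self.ollama_available = ollama_available

    def route(self, task: Task) -> Tier:
        if task.mechanical:
            return Tier.LOCAL_COMPUTE
        if task.urgent:
            return Tier.CLAUDE_REALTIME
        # Tier 1 is optional: degrade to Claude Batch when no local model exists.
        if self.ollama_available:
            return Tier.LOCAL_LLM
        return Tier.CLAUDE_BATCH
```

For example, `TieredRouter(ollama_available=False).route(Task("summarize"))` falls through to `Tier.CLAUDE_BATCH`, which is the degradation described in step 5.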
Post-Phase-5 (MCP runs in-process) and post-Phase-2/3 (cortex_extras/,
archive/, deletions), the docs described an architecture that no longer
exists. Corrected:
ROADMAP.md
- Removed the stale "— CURRENT" marker on Phase 1 (Mar 12-28).
- Added a dated status note (2026-05-21) explaining the parallel
engineering-stabilization track that produced Phases 0-6 of slim-down
work.
README.md
- "bridge initialization under 10ms" was flatly false — CortexBridge
construction loads ML/embedding modules and takes seconds (the reason
the MCP server uses a lazy singleton). Corrected the perf line.
STRUCTURE.md — substantial rewrite:
- Entry points: cli.py (claimed 4,300-line monolith) → cli/ package;
the "decomposition planned for v1.1" already happened.
- Mixin class names corrected: BridgeIntelligenceMixin/BridgeSystemMixin
→ IntelligenceMixin/SystemMixin (the actual class names).
- Package layout now includes mcp_handlers.py, health_probe.py,
cortex_extras/, archive/, api/, tests/contract/.
- mcp_server.py described as the primary interface (in-process);
bridge as the backing class; api/bridge_endpoint.py as optional shim.
- Known Technical Debt: dropped the stale cli.py-monolith item; kept
the bare-import hazard and documented the double-import pitfall.
INSTALL.md
- "Starting the Bridge Server" section (told users to run
api/bridge_endpoint.py + curl :8765 before intelligence works) →
replaced with "Intelligence Queries — No Server Needed". Intelligence
runs in-process.
- Removed the "Gateway — Telegram Bot + Web Chat" section — gateway is
archived (archive/gateway/); the cortex/gateway/ path it referenced
no longer exists.
docs/INSTALLATION.md
- Was a 535-line second install guide dated 2025-12-24, contradicting
the current root INSTALL.md. Replaced with a short pointer to the
canonical INSTALL.md. Prior content preserved in git history.
(TROUBLESHOOTING.md, DEPLOYMENT.md, user_guide/getting_started.md
link here — the pointer keeps those links valid.)
Verification: pytest tests/contract/ → 37 passed.
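The lazy singleton mentioned in the README.md correction is described elsewhere in this PR as using a thread lock with double-checked locking. A generic sketch of that pattern follows; `ExpensiveBridge` and `get_bridge` are stand-ins, not the real `CortexBridge` or `_get_bridge()`.

```python
import threading


class ExpensiveBridge:
    """Stand-in for an object that is slow to construct."""
    instances = 0

    def __init__(self):
        # Only ever runs while the lock below is held.
        ExpensiveBridge.instances += 1


_bridge = None
_bridge_lock = threading.Lock()


def get_bridge():
    """Build the bridge on first use; later calls return the same object."""
    global _bridge
    if _bridge is None:              # fast path: no lock once initialized
        with _bridge_lock:
            if _bridge is None:      # re-check: another thread may have won
                _bridge = ExpensiveBridge()
    return _bridge
```

Deferring construction to the first call keeps module import fast, and the second check inside the lock prevents two threads from each building their own instance.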
…rtifacts
Follow-up to the primary-docs fix. Audited all 27 docs/ markdown files;
corrected the living reference docs and tidied repo-root artifacts.
Living reference docs — stale `cli.py` references (the CLI is now the
`cli/` package; `python .../cli.py` commands were broken):
docs/TECHNICAL_REFERENCE.md
- Dropped the `alias cortex='python3 .../cli.py'` / symlink block —
`pip install -e .` provides the `cortex` console script.
- Hook example: `python3 .../cli.py briefing` → `cortex briefing`.
docs/developer/setup.md
- Repo tree: `cli.py # Main CLI` → `cli/ # CLI package`; added
mcp_server.py as the primary interface.
docs/developer/extension_points.md
- "Add command" guide: `# In cli.py or bridge.py` → `# In a module
under cli/commands/`.
docs/ARCHITECTURE.md
- Entry-points diagram: replaced `cli.py` + the archived `plugins`
box with the real surface — mcp_server (primary), cli/, bridge.py
(intelligence backing), briefing.py, api/ optional HTTP shim.
Left as-is (point-in-time records, not living docs):
docs/PRD-mcp-v2.md (dated PRD), docs/demo_plan.md (demo script),
docs/DEEP_ASSESSMENT_2026-04-08.md, docs/cortex_paper.md.
Repo-root artifact cleanup:
- cortex_ai_research_brief_2026-03-26.md, ..._2026-04-08.md → moved to
research_briefs/ (the existing home for CRA-generated briefs).
- WORK_PROGRESS_REPORT.md → untracked + gitignored. It is live state
written by scripts/work_absorber_daemon.sh, not source — it should
never have been committed.
Verification: pytest tests/contract/ → 37 passed.
The beta ships the MCP server + CLI + bridge backing — nothing else.
cortex_extras/ and archive/ were staging/freezer dirs from earlier sessions
("stage for sibling-repo extraction" and "freeze the vite UI"). Neither
belongs in a polished beta. Git history preserves everything; the
sibling-repo extraction (when it happens) works just as well from history
via `git filter-repo --path cortex_extras/<sub>`.
Removed:
archive/ the frozen vite dashboard + telegram/web_chat gateway
cortex_extras/ synthetic, cortexdbx, mvp, plugins, tui, lean
(everything previously staged for extraction)
Relocated, not deleted:
cortex_extras/tui/data.py → compact_data.py
Self-contained stdlib-only CortexSnapshot/collect_snapshot — the `cortex
compact` CLI command's data layer. Moved to repo root alongside the
other small support modules (health_probe.py, mcp_handlers.py).
cli/commands/compact.py:21 updated.
Import sites cleaned:
bridge.py:146-147
bridge_intelligence.py:45-46
`from cortex_extras.synthetic.*` → `from cortex_synthetic.*` (try/except
still degrades; the target is now the future sibling pip package name,
so the import path stays correct when synthetic is re-introduced).
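The degrading import can be illustrated with a small helper. This is a sketch of the general try/except pattern, not the code at bridge.py:146-147; `optional_import` and `SYNTHETIC_AVAILABLE` are hypothetical names.

```python
import importlib


def optional_import(module_name):
    """Return the imported module, or None when it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


# The package is expected to be absent today; callers branch on the flag.
cortex_synthetic = optional_import("cortex_synthetic")
SYNTHETIC_AVAILABLE = cortex_synthetic is not None
```

Because the target name is already the future package name, installing that package later would flip the flag without touching the import site.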
pyproject.toml:
- compact_data added to py-modules (it's a new top-level module).
- Dropped the now-moot `archive*` from packages.find exclude.
STRUCTURE.md:
- Layout no longer lists cortex_extras/ or archive/.
tests/contract/test_phase5_invariants.py
- Updated the explanatory comment about the removed web-chat router
(was: "archived"; now: "removed from beta tree").
Verification:
- All core modules import (mcp_server, mcp_handlers, health_probe,
compact_data, bridge).
- pytest tests/contract/ → 37 passed.
- scripts/smoke_mcp.py --no-bridge → 17/18 tools green.
- Repo: 21M total; tree is now genuinely a beta-shippable surface.
The v2 PRD (Mar 27) planned 30 more tools wrapped around an always-on HTTP
bridge with deferred tool-search loading. What actually delivered better
value was the opposite: fewer, honest tools + the HTTP indirection deleted
+ a test ratchet + correct install on Linux.
Renamed docs/PRD-mcp-v2.md → docs/PRD-mcp.md and rewrote as a SHIPPED-status
PRD describing the post-Phase-5 reality:
- 18 in-process MCP tools (17/18 green end-to-end; 1 needs API key with
clear message)
- Architecture diagram showing MCP client → mcp_server (no urllib) →
direct calls into mcp_handlers / health_probe / CortexBridge (lazy
singleton) / intelligence/ / engines/ / supervisor/
- The HTTP bridge as optional shim for local agents (Hermes), not the
foundation
- The three invariants enforced by tests (no-HTTP, fast-import,
endpoint-inventory-bounded)
- What's NOT in the beta (cortex_extras/, archive/, 12 deleted endpoints)
and why
- Success criteria with shipped-status checkmarks
- Non-goals (no HTTP-only MCP, no deferred-loading, no web UI, no auth)
- Planned-but-not-shipped: ROADMAP Phase 6 batch redesign, Hermes
verification, surgical bridge-mixin cleanup
- Honest beta-readiness verdict with the macOS-only background-automation
caveat
The v2 PRD remains in git history as a planning record.
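The no-HTTP invariant listed above is described in this PR as an AST scan of mcp_server.py. A minimal sketch of such a scan follows. It is not the test in tests/contract/; `http_violations` is a hypothetical helper, and reading "_bridge_get/post" as the two names `_bridge_get` and `_bridge_post` is an assumption.

```python
import ast

FORBIDDEN_MODULES = {"urllib"}
# Assumption: "_bridge_get/post" means these two helper names.
FORBIDDEN_NAMES = {"BRIDGE_URL", "_bridge_get", "_bridge_post"}


def http_violations(source):
    """Return the sorted forbidden imports and identifiers found in `source`."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in FORBIDDEN_MODULES:
                    found.add(alias.name)
        elif isinstance(node, ast.ImportFrom):
            if node.module and node.module.split(".")[0] in FORBIDDEN_MODULES:
                found.add(node.module)
        elif isinstance(node, ast.Name) and node.id in FORBIDDEN_NAMES:
            found.add(node.id)
        elif isinstance(node, ast.Attribute) and node.attr in FORBIDDEN_NAMES:
            found.add(node.attr)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if node.name in FORBIDDEN_NAMES:
                found.add(node.name)
    return found and sorted(found) or []
```

A contract test could read the server module's source and assert that the returned list is empty. Scanning the syntax tree rather than the raw text avoids false positives from comments and docstrings that merely mention the old names.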
Summary
A 23-commit, multi-phase slim-down + correctness pass that takes Cortex from "MCP-server-with-broken-tools-behind-an-HTTP-bridge" to a beta-shippable in-process MCP server with a real test ratchet. The full narrative is in `docs/PRD-mcp.md` (rebuilt from the v2 PRD).
Headline numbers:
- … `ANTHROPIC_API_KEY` and now says so clearly). Was 14/18 with 4 silently broken.
- … `uvicorn`, no HTTP, no "is the bridge up?" failure mode.
- … `_contrib/` historical artifacts + `cortex_extras/` staging area + `archive/` UI surfaces + dead batch code).
- `pip install -e .` works on Linux out of the box. It didn't before: `mcp` lived only in the optional [server] extra, `scikit-learn` was imported but never declared, and `install.sh` used macOS-only `sed -i ''`.
What's in the branch (23 commits)
- `tests/contract/` (MCP + bridge + schema invariant), `scripts/smoke_mcp.py`, golden capture/verify scripts
- `conductor_compose` (field rename), `graph_query` (extended bridge to accept `q`/`limit`), `record_decision` (new `/decisions/record-freeform` endpoint matching the MCP schema), `plan_create` + `plan_progress` (added the routes, reused `goal_parser.GoalParser`)
- `_contrib/` (4.4 MB), `semantic_recommender.py`, `portfolio_analyzer.py`: verified zero importers
- `synthetic/`, `cortexdbx/`, `gateway/`, `mvp/`, `plugins/`, `tui/`, `lean/` consolidated under `cortex_extras/` (later removed entirely; git history preserves them)
- `RecommendationEngine` → `PortfolioRecommender` (eliminated a real name collision); `briefing_resilient.py` folded into `briefing.py`
- `_get_bridge()` singleton; migrated all 14 bridge-backed MCP tools to direct in-process calls (`mcp_handlers.py` + `health_probe.py`); deleted HTTP plumbing from `mcp_server.py`; migrated in-process consumers (`session_delta`, `supervisor/intake`); trimmed `api/bridge_endpoint.py` from 56 routes to 53; added 4 invariant tests (no-HTTP AST scan, <2 s import, endpoint inventory bounded, no-resurrect of deleted routes)
- `cortex_outcomes` repointed at real `~/.cortex/outcomes.jsonl` (the `v2.learning.outcomes` module it imported never existed); `cortex_prompt_refine` auto-seeds the patterns cache; `cortex_batch_status` returns a clear "needs API key" message instead of an opaque SDK error
- `mcp` and `scikit-learn` moved into core deps; `install.sh` `sed -i ''` → portable `sed_inplace` helper; CLI `cortex doctor` made bridge check informational; `pyproject` `py-modules` updated; docs/INSTALLATION duplicate replaced with pointer
- `_get_bridge()` thread lock with double-checked locking; `mcp_server` docstring + `BETA_ONBOARDING.md` + `README.md` corrected (bridge is optional, not required)
- `batch/deprecated/`, `strategic_orchestrator.py`, `intelligent_orchestrator_anthropic.py`, `weather_batcher.py` (verified zero importers), ~3.2K LOC; full redesign tracked as ROADMAP Phase 6
- … `cli/` not `cli.py`, real mixin class names, correct package layout); INSTALL.md de-stale'd; docs/ subtree audit (TECHNICAL_REFERENCE, ARCHITECTURE diagram, developer setup, extension_points); `docs/INSTALLATION.md` → pointer to canonical; CRA briefs moved to `research_briefs/`; `WORK_PROGRESS_REPORT.md` gitignored (it's daemon-written state)
- `cortex_extras/` + `archive/` removed entirely; `cortex_extras/tui/data.py` relocated to root as `compact_data.py` (needed by `cortex compact` CLI); bridge `synthetic` imports retargeted at the future `cortex_synthetic` sibling-package name; PRD rebuilt as canonical `docs/PRD-mcp.md` describing the shipped system
Verification (run on the head commit)
Three test-enforced invariants are now load-bearing:
- `mcp_server.py` contains no `urllib` / `BRIDGE_URL` / `_bridge_get/post` (AST scan).
What's NOT in this PR (deliberately deferred)
- `batch/` is ~14.7K LOC; target ~3-4K via a 4-tier local-first router (local-compute / local-LLM / Claude Batch / Claude real-time). Full plan and sequencing in `ROADMAP.md`. Needs a focused session.
- … `git revert` per block; `test_no_phase5_deleted_endpoints_resurrect` lists them by name.
- `CortexBridge` has 86 methods; only ~5 are called in-repo. Cutting the rest needs the Hermes call graph; potentially another 500-1.5K LOC.
Test plan
- `pytest tests/contract/` green
- `pytest tests/` green (modulo the 4 pre-existing `research_directives.md` path failures, unrelated)
- `python -m venv && pip install -e .` works on Linux
- `scripts/smoke_mcp.py --no-bridge` reports 17 OK + 1 env-dependent
- … `cortex` as an MCP server in Claude Code, confirm at least 2-3 tools respond end-to-end
https://claude.ai/code/session_01PD48YfwHA1KQNJ1vVmMwuT
Generated by Claude Code