feat(middleware): EmbeddingProvider injection — #178 native-app readiness #190
Merged
… readiness

The Auto-Absorb / Auto-Digest / PrimeFetcher classes each had their own private 11-line `embed()` method that fetched Ollama's `/api/embed` directly, bypassing the `EmbeddingProvider` factory in `services/embeddings.ts`. Result: even with `MYCELIUM_LLM_PROVIDER=llama-cpp` (PR #187, in flight), the middleware would still hit Ollama, blocking the native-app track (#176) at three call sites.

This wires an optional `embed: (text) => Promise<number[]>` callback through each constructor and uses it when set, falling back to the existing direct fetch when absent. `proxy.ts` builds the callback once via `createEmbeddingProvider()` and passes it to all three middleware classes, so once #187 lands, switching the env var routes embedding, absorbing, digesting, and prime-fetching uniformly through the same provider, with no further wiring needed.

Side benefit: the provider's `sampleForEmbedding` (head+middle+tail truncation) now applies to long auto-absorbed/digested texts, which the previous direct-fetch path silently truncated from the end once they ran past nomic-embed's 2048-token window.

Tests: 4 new in `middleware-embed-override.test.ts`. They verify that the absorber and digester route through the override, that the prime-fetcher accepts the option, and that the absorber falls back to direct fetch when no override is provided (back-compat for the existing test suite + any operator scripts).

Suite: 943 pass, 0 fail (was 942 + 1 skipped before).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
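The optional-callback wiring described in this commit message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this PR: `SketchAbsorber`, `AbsorberOptions`, `EmbedFn`, `FetchLike`, and `fetchImpl` are hypothetical names, and the request and response shape (`{ model, input }` in, `{ embeddings: number[][] }` out) follows Ollama's documented `/api/embed` endpoint rather than anything visible in this diff.

```typescript
// Hypothetical sketch. Only `embed`, `ollamaUrl`, `embeddingModel`, and
// `/api/embed` come from the PR text; every other name is illustrative.

type EmbedFn = (text: string) => Promise<number[]>;

// Minimal structural type for the part of fetch used here, so the sketch
// does not depend on DOM or Node typings.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

interface AbsorberOptions {
  ollamaUrl: string;
  embeddingModel: string;
  embed?: EmbedFn; // optional override; when set, Ollama is never contacted
  fetchImpl?: FetchLike; // assumption: injectable so the fallback is testable
}

class SketchAbsorber {
  private readonly opts: AbsorberOptions;

  constructor(opts: AbsorberOptions) {
    this.opts = opts;
  }

  // Prefer the injected callback; otherwise use the legacy direct fetch.
  async embed(text: string): Promise<number[]> {
    if (this.opts.embed) return this.opts.embed(text);
    return this.directFetchEmbed(text);
  }

  // Back-compat path: POST to Ollama's /api/embed and take the first vector.
  private async directFetchEmbed(text: string): Promise<number[]> {
    const doFetch =
      this.opts.fetchImpl ??
      (globalThis as unknown as { fetch: FetchLike }).fetch;
    const res = await doFetch(`${this.opts.ollamaUrl}/api/embed`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model: this.opts.embeddingModel, input: text }),
    });
    if (!res.ok) throw new Error(`embed request failed: HTTP ${res.status}`);
    const data = (await res.json()) as { embeddings?: number[][] };
    const vector = data.embeddings?.[0];
    if (!vector) throw new Error("embed response contained no vector");
    return vector;
  }
}

// Usage: one shared callback handed to the middleware, as proxy.ts is
// described as doing. The stub stands in for whatever
// createEmbeddingProvider() really returns; URL and model are assumed values.
const sharedEmbed: EmbedFn = async (text) => [text.length];
const sketchAbsorber = new SketchAbsorber({
  ollamaUrl: "http://localhost:11434",
  embeddingModel: "nomic-embed-text",
  embed: sharedEmbed,
});
```

Keeping the fallback inside the class means existing callers that pass only `ollamaUrl` and `embeddingModel` keep working unchanged, which matches the back-compat goal stated above.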
Peer audit (autonomous tick, 2026-05-02)
Read PR top-to-bottom against
Claims that hold up
Non-blocking follow-up notes
Verdict
MERGEABLE as-is. No file conflicts with #185/#187/#188/#189/#192 (verified by the empirical merge train in #176 comment 4364403408: all 7 PRs merged in sequence with zero conflicts and 998/999 tests green). Per the suggested merge order, #190 lands cleanly any time after #187 (the load-bearing dispatch).
🤖 Posted via the audit-loop-break heuristic (memory |
Dewinator added a commit that referenced this pull request May 3, 2026
…n W4.1
W4.1 of docs/wave-4-anti-echo.md says "the first PR of this wave creates
the directory" — this is that scaffold. Lands two files only:
- mcp-server/src/__tests__/fixtures/anti-echo/README.md
Developer-facing spec for the corpus shape, mirrors the governance
rules from the anchor doc but written for the file-format reader.
- mcp-server/src/__tests__/fixtures/anti-echo/corpus-types.ts
`AntiEchoCorpusFixture` + `AntiEchoCohortFixture` discriminated union
over the v1.1 Lesson envelope (services/wire-types.ts). Types only,
no loader, no harness — those land alongside the first concrete
fixture per category in subsequent PRs.
Why scaffold-first instead of one big "land all 8 fixtures" PR: the eight
attack categories from wave-4-anti-echo.md §"Corpus categories" each have
their own subtleties (cohort vs single-envelope, signing-key handling,
which §10 mechanism asserts). Decomposing into one fixture per follow-up
PR keeps each diff reviewable and lets the harness shape evolve from the
first concrete fixture rather than from speculation.
Why this can land while the 9-PR native-app queue is open: the new
directory lives entirely under `__tests__/fixtures/`, so it has zero file
overlap with the native-app stack (#185 / #187 / #188 / #189 / #190 /
#191 / #192 / #193 / #194). 939/939 node --test tests still green;
`tsc --noEmit` clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Dewinator added a commit that referenced this pull request May 3, 2026
…rain)
Reed merged 10 PRs today: all 3 W4.1 anti-echo (#197/#198/#201), both W2 federation (#199/#200), 5 native-app (#190/#191/#192/#193/#194). Only the linear 4-PR #178-stack remains open (#185 independent + #187 → #188 → #189 strictly stacked). Three-cohort split collapsed to one cohort; old order-independence proofs (143rd/148th tick) now obsolete.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
- Wires an optional `embed: (text) => Promise<number[]>` callback through the `Absorber`, `Digester`, and `PrimeFetcher` constructors.
- `proxy.ts` builds the callback once via `createEmbeddingProvider()` and passes it to all three.
- When PR #187 ("feat(embeddings): LlamaCppEmbeddingProvider behind MYCELIUM_LLM_PROVIDER — #178 part 1") lands, flipping `MYCELIUM_LLM_PROVIDER=llama-cpp` will route embedding, absorbing, digesting, and prime-fetching through llama-cpp uniformly, with no further wiring needed.
Why
The three middleware classes each duplicated a private 11-line `embed()` that hit Ollama's `/api/embed` directly, bypassing the `EmbeddingProvider` factory abstraction. Without this PR, even after #187 lands, the middleware path still depends on Ollama. That blocks the Welle-1 native-app track (#176).
Side benefit
The shared `OllamaEmbeddingProvider` already does `sampleForEmbedding` (head + middle + tail truncation) for inputs over 6000 chars. The previous direct-fetch path silently truncated from the END, so the embedding only represented the first ~2048 tokens of long auto-absorbed/digested texts. Routing through the provider fixes this for free.
Out of scope
- `ollamaUrl` / `embeddingModel` constructor options (they remain as fallback for tests + back-compat). A follow-up can require the callback once #187 is merged and the provider abstraction is the sole path.
- `/api/chat` proxy to llama-cpp: that's part 2 (chat bridge) of #178, "feat(app): Spike 2 — node-llama-cpp embedding + chat bridge (drop Ollama dependency)".
Pillar check
Test plan
- `npm run build`: clean
- `npm test`: 943 pass, 0 fail (was 942 + 1 skipped)
- `middleware-embed-override.test.ts`:
- `MYCELIUM_LLM_PROVIDER=llama-cpp` → middleware proxy auto-absorb confirms llama-cpp is hit (no Ollama traffic)
🤖 Generated with Claude Code
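To make the "Side benefit" point concrete, here is a minimal sketch of head + middle + tail sampling. It is not the project's `sampleForEmbedding`: the function name `sampleHeadMiddleTail`, the equal three-way split, and the newline separator are assumptions made for illustration, and only the 6000-character threshold comes from the PR text.

```typescript
// Hypothetical illustration of head + middle + tail sampling.
// The real sampleForEmbedding may split or join the slices differently.
function sampleHeadMiddleTail(text: string, maxChars = 6000): string {
  // Short inputs pass through untouched.
  if (text.length <= maxChars) return text;

  const separator = "\n";
  // Reserve room for the two separators, then split the rest three ways.
  const part = Math.floor((maxChars - 2 * separator.length) / 3);

  const head = text.slice(0, part);
  // Center the middle slice on the midpoint of the input.
  const midStart = Math.floor((text.length - part) / 2);
  const middle = text.slice(midStart, midStart + part);
  const tail = text.slice(text.length - part);

  return [head, middle, tail].join(separator);
}
```

Compared with cutting at the end, this keeps the conclusion of a long text in the embedded sample, at the cost of dropping material between the three slices. Slicing by UTF-16 code units can split a surrogate pair at a boundary, which a production version may want to guard against.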