DEV-1514: extract Retriever ABC + SearchService orchestration #160

Merged
ZmeiGorynych merged 4 commits into main from egor/dev-1514-a-clean-search-facade on Jun 1, 2026
@ZmeiGorynych ZmeiGorynych commented May 31, 2026

Summary

  • Replaces the hardcoded 3-channel BM25 + Tantivy + embeddings pipeline in slayer/search/ with a clean facade: every channel is now a Retriever ABC subclass, and SearchService orchestrates a configurable retriever list via asyncio.gather.
  • Each retriever returns a combined RetrievalResult (memory + entity rankings) from one retrieve() call so retrievers sharing expensive setup (litellm embed, dim-check) do it once per search; orchestration warning order is deterministic in retriever-declaration order, not gather-completion order.
  • EmbeddingService is absorbed into EmbeddingRetriever; SearchService gains public upsert_memory / refresh_model_subtree / refresh_datasource fan-out methods that isolate per-retriever exceptions. Persistent tantivy is explicitly out of scope (the in-memory rebuild stays); a future PR can override TantivyRetriever's no-op write hooks without an ABC change.
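
The deterministic warning order described above follows from how `asyncio.gather` returns results: in argument order, regardless of which coroutine finishes first. A minimal sketch, assuming a simplified `RetrievalResult` dataclass and a hypothetical `StubRetriever` (neither is the PR's actual class):

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class RetrievalResult:
    # Hypothetical stand-in for the PR's RetrievalResult model.
    memory_ranking: list[str] = field(default_factory=list)
    entity_ranking: list[str] = field(default_factory=list)
    warnings: list[str] = field(default_factory=list)


class StubRetriever:
    """Toy retriever: sleeps for `delay` seconds, then returns one warning."""

    def __init__(self, name: str, delay: float) -> None:
        self.name = name
        self.delay = delay

    async def retrieve(self, question: str) -> RetrievalResult:
        await asyncio.sleep(self.delay)
        return RetrievalResult(warnings=[f"{self.name}: slow path for {question!r}"])


async def search(retrievers: list[StubRetriever], question: str) -> list[str]:
    # gather returns results in argument order, not completion order,
    # so iterating them preserves retriever-declaration order.
    results = await asyncio.gather(*(r.retrieve(question) for r in retrievers))
    warnings: list[str] = []
    for result in results:
        warnings.extend(result.warnings)
    return warnings


# "bm25" is declared first but finishes last; its warning still comes first.
retrievers = [StubRetriever("bm25", 0.03), StubRetriever("tantivy", 0.01)]
warnings = asyncio.run(search(retrievers, "revenue"))
```
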

Test plan

  • poetry run pytest -m "not integration" — 3489 unit tests pass
  • poetry run ruff check slayer/ tests/ — clean
  • New tests pin: Retriever ABC compliance, BM25 / Tantivy / Embedding read-side, single fetch_corpus + single embed_question + single dim-check per EmbeddingRetriever.retrieve, two-kind-filtered Tantivy queries running sequentially against the same index, write-side fan-out + exception isolation for all three create/refresh hooks, warning order in retriever declaration order even under artificial retrieve delays, text_by_id first-non-empty-wins precedence, per-bucket invariance (DEV-1414) preserved
  • Migrated tests (test_search_invariance, test_search_three_channel, test_cascade_strip_on_delete, test_idempotent_ingestion, test_startup_ingest, conftest) flipped to new symbols; test_embeddings_service.py deleted (replaced by test_embedding_retriever.py)
  • Manual: ensure slayer search CLI + REST POST /search + MCP search continue to return identical SearchResponse shape

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Pluggable retriever architecture that runs multiple search channels in parallel and fuses results.
    • New BM25, embedding-based, and full-text retrievers for entity and memory ranking.
    • Search now yields unified memory/entity rankings with consistent warning reporting.
  • Refactor

    • Embedding refresh and upsert flows moved under the search/retriever layer and unified via the new search facade.

Refactors slayer/search/ so the three-channel BM25 + Tantivy +
embeddings pipeline becomes a default arrangement of one
Retriever ABC. Read-side and write-side public signatures
preserved; internal channel methods replaced by a parallel
gather across a configurable retriever list.

Retriever ABC (slayer/search/retriever.py):
- One async retrieve() returning a combined RetrievalResult
  (memory + entity rankings + text_by_id + warnings) so
  retrievers that share expensive setup (litellm embed,
  tantivy index handle) do it once per search call.
- Symmetric write hooks: upsert_memory, refresh_model_subtree,
  refresh_datasource (defaults no-op) plus delete_* hooks
  reserved for future persistent retrievers.
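
A rough sketch of the shape described above, under stated assumptions: the commit says RetrievalResult is a combined result and the walkthrough calls it a Pydantic model, but a stdlib dataclass is used here to stay self-contained. The `retrieve` parameter list is shortened, and the hooks returning a list of warnings is a guess, not the PR's confirmed signature.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class RetrievalResult:
    # Simplified stand-in; field names follow the commit message.
    memory_ranking: list[str] = field(default_factory=list)
    entity_ranking: list[str] = field(default_factory=list)
    text_by_id: dict[str, str] = field(default_factory=dict)
    warnings: list[str] = field(default_factory=list)


class Retriever(ABC):
    name: str = "retriever"

    @abstractmethod
    async def retrieve(self, *, question: str) -> RetrievalResult:
        """One call returns both memory and entity rankings."""

    # Write hooks default to no-ops so read-only retrievers need not override.
    async def upsert_memory(self, *, memory_id: str) -> list[str]:
        return []

    async def refresh_model_subtree(self, *, model_name: str) -> list[str]:
        return []

    async def refresh_datasource(self, *, datasource: str) -> list[str]:
        return []


class EchoRetriever(Retriever):
    """Hypothetical retriever that ranks the question itself as a memory id."""

    name = "echo"

    async def retrieve(self, *, question: str) -> RetrievalResult:
        return RetrievalResult(memory_ranking=[question] if question else [])


try:
    Retriever()  # abstract: retrieve() is not implemented
    abstract_instantiable = True
except TypeError:
    abstract_instantiable = False

echo = EchoRetriever()
result = asyncio.run(echo.retrieve(question="m1"))
hook_warnings = asyncio.run(echo.upsert_memory(memory_id="m1"))
```
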

Three shipping retrievers in slayer/search/retrievers/:
- BM25Retriever: ports _run_channel_1 (entity-overlap BM25,
  valid_canonicals filter).
- TantivyRetriever: ports _run_channel_2 (two kind-filtered
  queries run sequentially in one call against the same
  in-memory tantivy.Index).
- EmbeddingRetriever: absorbs the former EmbeddingService;
  one fetch_corpus + one embed_question + one dim-check per
  retrieve call; owns the litellm refresh pipeline on the
  write side. delete_* hooks no-op (StorageBackend owns the
  embedding cascade transactionally with the row delete).

SearchService refactor:
- Default retriever factory: [BM25, Tantivy, Embedding].
- Custom retrievers injectable via retrievers= kwarg.
- valid_canonicals + corpus built once per search; gather
  across retrievers in parallel; warnings aggregated in
  retriever declaration order (not gather completion order);
  text_by_id precedence first-non-empty wins.
- New public upsert_memory / refresh_model_subtree /
  refresh_datasource methods fan out to every retriever and
  isolate per-retriever exceptions as prefixed warnings.
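
The two orchestration rules above (exception isolation on the write side, first-non-empty-wins for text_by_id) might look roughly like this. The stub retrievers, the warning prefix format, and the helper names `fan_out_upsert` and `merge_text_by_id` are illustrative assumptions, not the PR's code.

```python
import asyncio


class FailingRetriever:
    name = "embedding"

    async def upsert_memory(self, *, memory_id: str) -> list[str]:
        raise RuntimeError("embedding backend unavailable")


class RecordingRetriever:
    name = "tantivy"

    def __init__(self) -> None:
        self.seen: list[str] = []

    async def upsert_memory(self, *, memory_id: str) -> list[str]:
        self.seen.append(memory_id)
        return ["index rebuilt in memory"]


async def fan_out_upsert(retrievers, *, memory_id: str) -> list[str]:
    """Call every retriever in declaration order; one failure never stops the rest."""
    warnings: list[str] = []
    for retriever in retrievers:
        try:
            produced = await retriever.upsert_memory(memory_id=memory_id)
        except Exception as exc:  # isolate: convert to a prefixed warning
            warnings.append(f"{retriever.name}: upsert_memory failed: {exc}")
            continue
        warnings.extend(f"{retriever.name}: {w}" for w in produced)
    return warnings


def merge_text_by_id(per_retriever: list[dict[str, str]]) -> dict[str, str]:
    """First non-empty text per id wins, scanning in declaration order."""
    merged: dict[str, str] = {}
    for texts in per_retriever:
        for item_id, text in texts.items():
            if text and not merged.get(item_id):
                merged[item_id] = text
    return merged


recorder = RecordingRetriever()
write_warnings = asyncio.run(
    fan_out_upsert([FailingRetriever(), recorder], memory_id="m1")
)
merged = merge_text_by_id([{"m1": "", "m2": "a"}, {"m1": "b", "m2": "c"}])
```
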

Call-site migrations: slayer/memories/service.py,
slayer/engine/profiling.py, slayer/engine/ingestion.py now go
through SearchService.

Out of scope this PR: persistent tantivy (in-memory rebuild
stays); pluggable RRF; embedding cascade ownership change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
linear Bot commented May 31, 2026

DEV-1514

coderabbitai Bot commented May 31, 2026

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: cfa4a271-2980-4002-babd-1e22eb4e8e9c

📥 Commits

Reviewing files that changed from the base of the PR and between 6e114e9 and fe7ce27.

📒 Files selected for processing (5)
  • slayer/search/retrievers/bm25.py
  • slayer/search/retrievers/embeddings.py
  • slayer/search/service.py
  • tests/test_search_invariance.py
  • tests/test_search_render_unified.py
📝 Walkthrough

This PR refactors SLayer's search stack from a fixed "three-channel" architecture into a pluggable retriever-based design. EmbeddingService is removed and its logic moves into EmbeddingRetriever; SearchService becomes an orchestrator that fans out to multiple retrievers in parallel, fuses results via RRF, and exposes write-side hooks that dispatch to every retriever. All call sites are updated to use the new API.
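
For readers unfamiliar with RRF (reciprocal rank fusion), here is a minimal sketch of the general technique. The constant `k = 60` is the conventional default from the literature and the tie-break by id is an arbitrary choice; the project's actual constant, tie-breaking, and caps are not shown in this PR.

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked id lists: each list contributes 1 / (k + rank) per item."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] = scores.get(item_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


# One ranking per retriever, e.g. BM25, Tantivy, embeddings.
fused = rrf_fuse([["m1", "m2", "m3"], ["m2", "m1"], ["m2", "m4"]])
```

An item ranked moderately well by several retrievers ("m2") can outrank one that a single retriever placed first ("m1").
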

Changes

Retriever Plugin Architecture and SearchService Orchestration Refactor

| Layer | File(s) | Summary |
|---|---|---|
| Retriever abstraction and RetrievalResult contract | `slayer/search/retriever.py` | Introduces Retriever ABC with async retrieve contract (inputs: query entities, question, all memories, valid canonicals, corpus, datasource; output: RetrievalResult), default no-op write/delete hooks, and shared RetrievalResult Pydantic model combining memory/entity rankings and warnings. |
| BM25Retriever entity-overlap ranking | `slayer/search/retrievers/bm25.py` | Filters memories by valid_canonicals and ranks by entity overlap via bm25_rank, returning memory_ranking only; empty entity results and no-op write hooks. |
| EmbeddingRetriever cosine ranking and refresh | `slayer/search/retrievers/embeddings.py` | Loads embeddings from storage, ranks both memory and non-memory entities by cosine similarity to embedded query; implements write-side upsert_memory/refresh_model_subtree/refresh_datasource with batched embedding, hash-based skip, and partial-failure tolerance; emits warnings for unavailable channel, empty corpus, query embedding failure, and dimension mismatch. |
| TantivyRetriever full-text search | `slayer/search/retrievers/tantivy.py` | Performs two sequential kind-filtered Tantivy queries (memory-filtered, entity-filtered) within a single retrieve call; populates text_by_id for memory results. |
| Retriever implementations package | `slayer/search/retrievers/__init__.py` | Exports BM25Retriever, EmbeddingRetriever, TantivyRetriever. |
| SearchService orchestrator refactor | `slayer/search/service.py` | Refactored from fixed "three-channel" to retriever-based facade: accepts injected or default retrievers, runs asyncio.gather for parallel fan-out, fuses text_by_id with first-non-empty-wins precedence, merges RRF memory/entity rankings; write-side upsert_memory/refresh_model_subtree/refresh_datasource dispatch to all retrievers in declaration order with per-retriever exception isolation and deterministic warning aggregation. |
| Migrate call sites from EmbeddingService to SearchService | `slayer/embeddings/__init__.py`, `slayer/engine/ingestion.py`, `slayer/engine/profiling.py`, `slayer/memories/service.py`, `slayer/storage/sqlite_storage.py` | Update imports and routing to use SearchService constructor and write-side API; memory refresh now calls SearchService(...).upsert_memory; model/datasource refresh calls SearchService(...).refresh_model_subtree/refresh_datasource. |
| Retriever ABC compliance and contract tests | `tests/test_retriever_abc.py` | Validates RetrievalResult defaults, confirms all three concrete retrievers subclass Retriever with distinct names, verifies no-op hook signatures and return values, ensures abstract Retriever cannot be instantiated. |
| BM25Retriever behavior tests | `tests/test_bm25_retriever.py` | Tests empty query results, entity-overlap ranking, valid_canonicals filtering, always-empty entity results, no-op write hooks. |
| TantivyRetriever behavior tests | `tests/test_tantivy_retriever.py` | Tests empty results on missing corpus/blank question, memory and entity ranking, text_by_id population, sequential two-query execution without overlap, memory id filtering, no-op write hooks. |
| EmbeddingRetriever persistence and ranking tests | `tests/test_embedding_retriever.py` | Tests upsert idempotency, refresh batching and hash-skip, per-entry failure tolerance with partial persistence, hidden model short-circuiting, model-name filtering, batched storage usage, single-call fetch_corpus+embed_question contract, blank question and dimension mismatch handling, delete hook no-ops. |
| SearchService write-side dispatch tests | `tests/test_search_service_dispatch.py` | Tests default retriever wiring, custom retriever injection, fan-out in declaration order, deterministic warning aggregation with deduplication, per-retriever exception isolation, absence of delete methods. |
| SearchService read-side orchestration tests | `tests/test_search_service_orchestration.py` | Tests valid_canonicals identity sharing, corpus single-build contract, parallel execution via asyncio.gather, warning order by declaration, first-declared-wins text precedence, per-bucket invariance, recency fallback bypass, all_memories and datasource forwarding with identity preservation. |
| Update existing tests for retriever paths | `tests/test_cascade_strip_on_delete.py`, `tests/test_search_invariance.py`, `tests/test_search_three_channel.py`, `tests/test_startup_ingest.py`, `tests/test_idempotent_ingestion.py` | Update monkeypatch targets and refresh calls to reflect new EmbeddingRetriever module location and SearchService.refresh_* API. |
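
The "hash-based skip" mentioned for the EmbeddingRetriever write side can be illustrated with a small sketch. Everything here is an assumption made for illustration: the SHA-256 choice, the in-memory dicts standing in for storage, and `fake_embed_batch`, which replaces the real litellm call.

```python
import hashlib


def text_hash(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def fake_embed_batch(texts: list[str]) -> list[list[float]]:
    # Placeholder for a real embedding call; returns one toy vector per text.
    return [[float(len(text))] for text in texts]


def refresh(
    pairs: dict[str, str],
    stored_hashes: dict[str, str],
    stored_vectors: dict[str, list[float]],
) -> int:
    """Re-embed only ids whose rendered text changed; return how many were embedded."""
    stale = [cid for cid, text in pairs.items() if stored_hashes.get(cid) != text_hash(text)]
    if not stale:
        return 0  # nothing changed, so no embedding call at all
    vectors = fake_embed_batch([pairs[cid] for cid in stale])  # one batched call
    for cid, vector in zip(stale, vectors):
        stored_vectors[cid] = vector
        stored_hashes[cid] = text_hash(pairs[cid])
    return len(stale)


pairs = {"orders": "Orders model", "orders.total": "Order total in USD"}
hashes: dict[str, str] = {}
vectors: dict[str, list[float]] = {}
first = refresh(pairs, hashes, vectors)   # both ids are new
second = refresh(pairs, hashes, vectors)  # idempotent: nothing to do
pairs["orders.total"] = "Order total in EUR"
third = refresh(pairs, hashes, vectors)   # only the edited id is re-embedded
```
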

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly Related PRs

  • MotleyAI/slayer#114: This PR is a direct successor; it refactors EmbeddingService into EmbeddingRetriever + SearchService orchestration and rewires the ingest/edit/save paths to use the new hooks.
  • MotleyAI/slayer#127: Both PRs modify SearchService orchestration and ranking invariants; they touch similar fusion and cap behaviors.

Poem

🐰 I hopped through code and stitched the tracks,

Retrievers now run in parallel, not stacks,
BM25, Embeddings, Tantivy stride,
SearchService fuses all with pride,
A rabbit's cheer for searches unified!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 36.41% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title directly reflects the main architectural change: extracting a Retriever ABC and refactoring SearchService to orchestrate retrievers, which aligns with the extensive refactoring across search infrastructure files. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
slayer/search/service.py (1)

594-597: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Use keyword arguments per coding guidelines.

Same as the earlier occurrence: _filter_memories_by_datasource has two parameters, so keyword arguments should be used.

As per coding guidelines: "Use keyword arguments for functions with more than 1 parameter"

♻️ Proposed fix
-        recency_memories = _filter_memories_by_datasource(
-            await self._storage.list_memories(entities=None),
-            datasource,
-        )
+        recency_memories = _filter_memories_by_datasource(
+            memories=await self._storage.list_memories(entities=None),
+            datasource=datasource,
+        )
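
A stricter variant of the same guideline is to make the parameters keyword-only, so a positional call fails immediately instead of relying on review. A minimal sketch; the function body and the dict-based memory shape are hypothetical stand-ins, since the real `_filter_memories_by_datasource` implementation is not shown in this thread:

```python
from typing import Optional


def filter_memories_by_datasource(
    *, memories: list[dict], datasource: Optional[str]
) -> list[dict]:
    """Hypothetical filter: keep memories tagged with `datasource` (None keeps all)."""
    if datasource is None:
        return list(memories)
    return [m for m in memories if m.get("datasource") == datasource]


memories = [{"id": "m1", "datasource": "sales"}, {"id": "m2", "datasource": "hr"}]
kept = filter_memories_by_datasource(memories=memories, datasource="sales")

# The bare `*` in the signature rejects positional arguments at call time.
try:
    filter_memories_by_datasource(memories, "sales")
    positional_allowed = True
except TypeError:
    positional_allowed = False
```
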
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@slayer/search/service.py` around lines 594 - 597, The call to
_filter_memories_by_datasource should use keyword arguments instead of
positional ones: replace the positional invocation that passes await
self._storage.list_memories(entities=None) and datasource with named parameters
(e.g., memories=... and datasource=...) so the call becomes explicit; update the
invocation where recency_memories is assigned to call
_filter_memories_by_datasource with memories=<result of
self._storage.list_memories(entities=None)> and datasource=datasource.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@slayer/search/retrievers/bm25.py`:
- Around line 18-20: The function _filter_memories_entities currently accepts
multiple positional parameters (memories, valid_canonicals) and should enforce
keyword-only arguments per guidelines; update the signature of
_filter_memories_entities to make valid_canonicals a keyword-only parameter
(e.g., insert a '*' after the first parameter) so callers must pass
valid_canonicals by name, and adjust any call sites to use the keyword if
necessary.

In `@slayer/search/retrievers/embeddings.py`:
- Around line 56-67: The three helper functions _column_canonical_id,
_measure_canonical_id, and _aggregation_canonical_id accept multiple positional
parameters; update their signatures to require keyword-only arguments by adding
a lone * before the second parameter (e.g., def _column_canonical_id(model:
SlayerModel, *, column_name: str) -> str) so callers must pass
column_name/measure_name/aggregation_name as keywords while keeping return
behavior unchanged.

In `@slayer/search/service.py`:
- Around line 389-392: Call _filter_memories_by_datasource using keyword
arguments to follow the guideline: replace the positional call with one that
passes memories=await self._storage.list_memories(entities=None) and
datasource=datasource (i.e., _filter_memories_by_datasource(memories=...,
datasource=...)) so the function name _filter_memories_by_datasource and the
await self._storage.list_memories(entities=None) expression are left intact but
passed as named parameters.

---

Outside diff comments:
In `@slayer/search/service.py`:
- Around line 594-597: The call to _filter_memories_by_datasource should use
keyword arguments instead of positional ones: replace the positional invocation
that passes await self._storage.list_memories(entities=None) and datasource with
named parameters (e.g., memories=... and datasource=...) so the call becomes
explicit; update the invocation where recency_memories is assigned to call
_filter_memories_by_datasource with memories=<result of
self._storage.list_memories(entities=None)> and datasource=datasource.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: e530faba-05ff-453a-a7ee-f439711ca1a7

📥 Commits

Reviewing files that changed from the base of the PR and between 4e295c0 and 0a2b858.

📒 Files selected for processing (25)
  • slayer/embeddings/__init__.py
  • slayer/embeddings/service.py
  • slayer/engine/ingestion.py
  • slayer/engine/profiling.py
  • slayer/memories/service.py
  • slayer/search/retriever.py
  • slayer/search/retrievers/__init__.py
  • slayer/search/retrievers/bm25.py
  • slayer/search/retrievers/embeddings.py
  • slayer/search/retrievers/tantivy.py
  • slayer/search/service.py
  • slayer/storage/sqlite_storage.py
  • tests/conftest.py
  • tests/test_bm25_retriever.py
  • tests/test_cascade_strip_on_delete.py
  • tests/test_embedding_retriever.py
  • tests/test_embeddings_service.py
  • tests/test_idempotent_ingestion.py
  • tests/test_retriever_abc.py
  • tests/test_search_invariance.py
  • tests/test_search_service_dispatch.py
  • tests/test_search_service_orchestration.py
  • tests/test_search_three_channel.py
  • tests/test_startup_ingest.py
  • tests/test_tantivy_retriever.py
💤 Files with no reviewable changes (2)
  • tests/test_embeddings_service.py
  • slayer/embeddings/service.py

ZmeiGorynych and others added 3 commits June 1, 2026 10:36
- CodeRabbit threads + outside-diff: switch the 5 helper functions
  with >1 parameter (`_filter_memories_entities`,
  `_column_canonical_id`, `_measure_canonical_id`,
  `_aggregation_canonical_id`, `_filter_memories_by_datasource`) to
  keyword-only signatures, and flip the call sites in
  `bm25._filter`, `embeddings.refresh_model_subtree`, and the two
  `service._filter_memories_by_datasource` invocations to
  `kwarg=value` form. Matches the project's "kwargs for >1 param"
  rule.

- Sonar S6466 (CRITICAL, 2x): add explicit non-empty guards in
  `test_search_service_orchestration.py:138` and `:328` before
  indexing `captured_*[0]`. Turns a possible IndexError into a
  clear "retriever was not invoked" failure and clears the
  `new_reliability_rating` gate condition.

- Sonar S7503 (MINOR, 13x — all invalid): the async-no-await
  signatures on the `Retriever` ABC default no-op hooks and on the
  test stubs are required by design (subclasses override with
  truly-async hooks; monkeypatched stubs must match the real async
  signature). Suppressed with `# NOSONAR(S7503) — ...` on each
  flagged line.

Local: 3489 unit tests pass, ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Integrates DEV-1513 (named-entity surfacing) into the retriever-based
architecture:

* BM25Retriever now augments memory entity lists with implicit
  ``memory:<self_id>`` before BM25 ranking — a user-supplied
  ``memory:<id>`` ref surfaces the named memory at the top of the
  memory ranking.
* SearchService.search() runs a new ``_build_channel_1_entity_ranking``
  pre-pass that contributes user-named canonical refs to the entity
  ranking (subject to datasource / hidden / missing filters). New
  ``LookupFound`` / ``LookupHidden`` / ``LookupMissing`` Pydantic types
  carry the lookup outcome. The entity-side fuse_entity_hits now takes
  a ``named_kind_text`` fallback for pure-named calls (no question).
* New ``_memory_id_off_datasource_warnings`` +
  ``_stale_query_warnings_for_named_memory_refs`` emit warnings for
  explicitly-named memories filtered by datasource or carrying stale
  attached queries.
* EmbeddingRetriever.refresh_model_subtree and refresh_datasource now
  route through the unified ``collect_model_entity_pairs`` /
  ``render_datasource_pair`` helpers in slayer.search.render (the
  single-source-of-truth dispatch DEV-1513 added). Drops the per-kind
  canonical-id helpers in favor of those shared with the corpus
  builder and the named-entity surfacing path.
* DEV-1513 test files (test_search_named_entity_surfacing.py,
  test_search_render_unified.py) flipped to import EmbeddingRetriever
  from slayer.search.retrievers.embeddings and monkeypatch the new
  module path for ``collect_model_entity_pairs`` / ``embed_batch``.
* slayer/embeddings/service.py stays deleted; DEV-1513 work is
  absorbed into EmbeddingRetriever instead.
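
To illustrate the first bullet (implicit ``memory:<self_id>`` augmentation), here is a toy sketch. It swaps real BM25 for a plain overlap count to stay short, and the dict-of-tags memory shape plus the helper names are assumptions for illustration only.

```python
def augment_with_self_ref(memories: dict[str, list[str]]) -> dict[str, list[str]]:
    """Append an implicit memory:<id> tag to each memory's entity list."""
    return {mid: [*entities, f"memory:{mid}"] for mid, entities in memories.items()}


def overlap_rank(query_entities: list[str], tagged: dict[str, list[str]]) -> list[str]:
    """Stand-in for BM25: rank by count of shared entities, dropping zero overlap."""
    wanted = set(query_entities)
    scored = [(len(wanted & set(tags)), mid) for mid, tags in tagged.items()]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [mid for score, mid in ranked if score > 0]


memories = {"m1": ["orders"], "m2": ["customers"]}

# Without augmentation a memory:<id> ref matches nothing.
plain = overlap_rank(["memory:m2"], memories)
# With augmentation the named memory surfaces.
augmented = overlap_rank(["memory:m2"], augment_with_self_ref(memories))
```
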

3530 unit tests pass; ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Per Codex review on the merge commit: the truthy `if valid_canonicals`
check bypasses filtering on an empty set. The ABC declares
``valid_canonicals: set`` (non-Optional), so an empty set is a
legitimate value meaning "no entities are live — drop every stale
tag". Direct BM25Retriever callers (third-party orchestrators) would
otherwise rank against stale tags whenever they passed `set()`.

The fix: always call ``_filter_memories_entities``; an empty set
strips all tags, BM25 finds zero overlap, the ranking comes back
empty — correct behavior for "everything is stale".
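
The pitfall is that an empty set is falsy in Python, so a truthiness guard cannot tell "no filter supplied" apart from "nothing is valid". A small sketch of the before and after; `filter_tags` and the two wrappers are hypothetical names, not the PR's functions:

```python
def filter_tags(*, tags: list[str], valid_canonicals: set) -> list[str]:
    return [tag for tag in tags if tag in valid_canonicals]


def tags_before_fix(tags: list[str], valid_canonicals: set) -> list[str]:
    # Buggy guard: set() is falsy, so filtering is skipped and stale tags survive.
    if valid_canonicals:
        return filter_tags(tags=tags, valid_canonicals=valid_canonicals)
    return list(tags)


def tags_after_fix(tags: list[str], valid_canonicals: set) -> list[str]:
    # Always filter: an empty set legitimately strips every tag.
    return filter_tags(tags=tags, valid_canonicals=valid_canonicals)


stale_tags = ["orders.total", "orders.region"]
before = tags_before_fix(stale_tags, set())
after = tags_after_fix(stale_tags, set())
```
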

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
sonarqubecloud Bot commented Jun 1, 2026
