fix(qdrant): route service-level vector repo lookups through env-aware factory #6

Merged
Ptah-CT merged 2 commits into main from qdrant/service-level-routing on May 14, 2026

Conversation

@Ptah-CT Ptah-CT commented May 14, 2026

Problem

PR #5 enabled the service to start with VECTOR_STORE_BACKEND=qdrant
by gating the Milvus lifespan and lazifying BaseMilvusRepository's
collection property. But the service-layer write/read path was still
hard-coded to the Milvus repositories: 5 service classes either called
EpisodicMemoryMilvusRepository() directly or asked the DI container
for get_bean_by_type(EpisodicMemoryMilvusRepository).

With the Milvus collections never loaded, every insert / search through
that path raised
ValueError("Collection instance not created, please call ensure_loaded() first")
— which the calling services swallowed as ERROR logs while returning
indexed=0 and treating the operation as successful.
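
The failure mode above can be reproduced in miniature. This is an illustrative sketch rather than the service's actual code: `UnloadedMilvusRepo` and `index_profiles` are hypothetical names, and only the error message and the `indexed=0` outcome are taken from the report.

```python
import logging

logger = logging.getLogger("profile_indexer_sketch")


class UnloadedMilvusRepo:
    """Hypothetical stand-in for a Milvus repo whose collection was never loaded."""

    def insert_batch(self, entities):
        raise ValueError(
            "Collection instance not created, please call ensure_loaded() first"
        )


def index_profiles(repo, entities) -> dict:
    """Mimics a caller that logs the error but still reports success."""
    indexed = 0
    try:
        repo.insert_batch(entities)
        indexed = len(entities)
    except Exception as exc:
        # The error only reaches the log; the caller never sees it.
        logger.error("Profile Milvus indexing failed: %s", exc)
    return {"success": True, "indexed": indexed}


result = index_profiles(UnloadedMilvusRepo(), [{"id": "a"}])
```

Because the exception is swallowed, `result` reports success with zero indexed documents, which is why the gap was only visible by comparing Mongo and Qdrant counts.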

Net effect on a live cutover deployment: data continued to land in
MongoDB (the source of truth), but never reached Qdrant. Search through
the service layer returned empty results because the same broken path
was used for reads.

Concretely measured on the production cutover host:

  • New documents in v1_episodic_memories (Mongo) were not present
    in the corresponding Qdrant collection (108 docs missing after ~12 h).
  • profile_indexer logged "Profile Milvus indexing completed: indexed=0"
    on every save, with a UserProfileCollection "Collection instance not created"
    traceback right next to it.

Fix

Introduce core/oxm/vector_backend_router.py — a thin factory with one
get_<memory>_repo() per memory type. Each factory reads
VECTOR_STORE_BACKEND (case-insensitive, normalised) and resolves the
appropriate Qdrant or Milvus repo via the DI container, preserving
singleton bean scope. Direct instantiation is the fallback for unit-test
contexts without a DI scan.
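
A minimal sketch of the routing idea, under stated assumptions: the two repository classes are empty stand-ins, the Qdrant class name is a guess, and `lookup` is a hypothetical parameter standing in for the DI container's `get_bean_by_type`. The real module may be structured differently.

```python
import os
from typing import Any, Callable, Optional


class EpisodicMemoryMilvusRepository:
    """Stand-in for the real Milvus-backed repository."""


class EpisodicMemoryQdrantRepository:
    """Stand-in; the Qdrant class name is an assumption."""


def _backend() -> str:
    # Case-insensitive, whitespace-trimmed read of the env variable.
    # Defaulting to "milvus" mirrors the fallback mentioned in the review.
    return os.environ.get("VECTOR_STORE_BACKEND", "milvus").strip().lower()


def _resolve(cls: type, lookup: Optional[Callable[[type], Any]] = None) -> Any:
    # Prefer the DI container so the singleton bean is reused.
    if lookup is not None:
        try:
            return lookup(cls)
        except Exception:
            pass  # no DI scan (e.g. unit tests): fall through
    return cls()  # direct instantiation as the fallback


def get_episodic_repo(lookup: Optional[Callable[[type], Any]] = None) -> Any:
    cls = (
        EpisodicMemoryQdrantRepository
        if _backend() == "qdrant"
        else EpisodicMemoryMilvusRepository
    )
    return _resolve(cls, lookup)
```

Callers keep a single call site (`get_episodic_repo()`) and never branch on the backend themselves, which is what lets the service-layer changes below stay small.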

Service-layer sites updated:

  • agentic_layer/search_mem_service.py — 4 hard-coded
    *MilvusRepository() direct instantiations call the factory now;
    field names keep their historical _milvus_ token for callee
    compatibility with the rest of the service.
  • agentic_layer/memory_manager.py — the match mem_type switch
    that picked one of 5 MilvusRepository beans now routes to the
    factory.
  • agentic_layer/profile_search_service.py — the lazy milvus_repo
    property resolves through get_user_profile_repo().
  • memory_layer/profile_indexer/profile_indexer.py — same lazy
    property, same factory call.
  • biz_layer/mem_sync.py — constructor defaults for
    foresight_milvus_repo and atomic_fact_milvus_repo come from
    the factory.
  • biz_layer/mem_memorize.py — the inline agent_skill_milvus_repo
    lookup inside the upsert_records sync uses get_agent_skill_repo().

Both backends expose the same public surface (vector_search,
create_and_save_*, delete_by_*), so callers need no further
changes. Field / variable names keep their _milvus_ token — that
keeps the diff minimal and avoids a rename cascade through the
existing methods; renaming is a follow-up.

Test plan

  • Service boots with VECTOR_STORE_BACKEND=qdrant (already
    validated by PR #5, "fix(qdrant): allow service to boot with
    VECTOR_STORE_BACKEND=qdrant") and now also writes to Qdrant on memory ingest.
  • profile_indexer indexed=N>0 for a new profile.
  • Search results contain documents written after the cutover.
  • A separate one-shot catchup script syncs the ~108 documents that
    were written to Mongo between cutover and this fix (idempotent via
    uuid5 mapping, no double-embedding cost).
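
Why a uuid5 mapping makes the catch-up idempotent can be sketched as follows. The namespace, the `catch_up` helper, the document shape and the dict standing in for a Qdrant collection are all assumptions for illustration; the actual catch-up script is not part of this PR.

```python
import uuid

# Assumption: the namespace used by the real mongo_id_to_qdrant_id is not shown.
_NAMESPACE = uuid.NAMESPACE_OID


def mongo_id_to_qdrant_id(mongo_id: object) -> str:
    """Deterministic mapping: the same Mongo id always yields the same point id."""
    return str(uuid.uuid5(_NAMESPACE, str(mongo_id)))


def catch_up(mongo_docs, qdrant_points, embed) -> int:
    """Sync documents missing from `qdrant_points`; returns how many were added."""
    synced = 0
    for doc in mongo_docs:
        point_id = mongo_id_to_qdrant_id(doc["_id"])
        if point_id in qdrant_points:
            continue  # already present, so no second embedding call
        qdrant_points[point_id] = {
            "vector": embed(doc["text"]),
            "mongo_id": str(doc["_id"]),
        }
        synced += 1
    return synced
```

Running `catch_up` twice over the same documents adds nothing on the second pass, because each Mongo id maps to the same point id both times.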

…e factory

PR #5 made the service start with VECTOR_STORE_BACKEND=qdrant by gating
the Milvus lifespan and lazifying BaseMilvusRepository's collection
property. But the service-layer write/read path was still hard-coded to
the Milvus repositories: 5 service classes either called
``EpisodicMemoryMilvusRepository()`` directly or asked the DI container
for ``get_bean_by_type(EpisodicMemoryMilvusRepository)``. With the
Milvus collections never loaded, every insert / search through that
path raised ``ValueError("Collection instance not created, please call
ensure_loaded() first")`` — which the calling services swallowed as
``ERROR`` logs while returning success with ``indexed=0``. Net effect:
data continued to land in MongoDB, but never reached Qdrant, and search
returned empty results.

This commit introduces ``core/oxm/vector_backend_router.py`` — a thin
factory with one ``get_<memory>_repo()`` per memory type. Each factory
reads ``VECTOR_STORE_BACKEND`` (case-insensitive, normalised) and
resolves the appropriate Qdrant or Milvus repo via the DI container,
preserving singleton bean scope. Direct instantiation is the fallback
for unit-test contexts without a DI scan.

Service-layer sites updated:

- ``agentic_layer/search_mem_service.py`` — 4 hard-coded
  ``*MilvusRepository()`` direct instantiations now call the factory;
  field names keep their historical ``_milvus_`` token for backward
  compatibility with the rest of the service.
- ``agentic_layer/memory_manager.py`` — the ``match mem_type`` switch
  that picked one of 5 ``MilvusRepository`` beans now routes to the
  factory.
- ``agentic_layer/profile_search_service.py`` — the lazy ``milvus_repo``
  property now resolves through ``get_user_profile_repo()``.
- ``memory_layer/profile_indexer/profile_indexer.py`` — same lazy
  property, same factory call.
- ``biz_layer/mem_sync.py`` — constructor defaults for
  ``foresight_milvus_repo`` and ``atomic_fact_milvus_repo`` now come
  from the factory.
- ``biz_layer/mem_memorize.py`` — the inline ``agent_skill_milvus_repo``
  lookup inside ``upsert_records`` sync now uses
  ``get_agent_skill_repo()``.

Both backends already expose the same public surface
(``vector_search``, ``create_and_save_*``, ``delete_by_*``), so callers
need no further changes.
@Ptah-CT Ptah-CT requested a review from DerAuctor as a code owner May 14, 2026 22:11
coderabbitai Bot commented May 14, 2026

Warning

Rate limit exceeded

@Ptah-CT has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 47 minutes and 31 seconds before requesting another review.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 8f2abd0c-8ebc-49d9-a31a-047618fe25d5

📥 Commits

Reviewing files that changed from the base of the PR and between f57f43c and 035444c.

📒 Files selected for processing (2)
  • methods/EverCore/src/core/oxm/qdrant/base_repository.py
  • methods/EverCore/src/core/oxm/vector_backend_router.py
📝 Walkthrough

Walkthrough

The changes introduce a centralized backend router infrastructure that selects between the Qdrant and Milvus vector store implementations based on an environment variable. Six service layers are updated to use this factory-based resolution instead of direct dependency injection, while field names are retained for backward compatibility.

Changes

Vector Backend Router Integration

| Layer / File(s) | Summary |
| --- | --- |
| Vector Backend Router Infrastructure: `methods/EverCore/src/core/oxm/vector_backend_router.py` | New module with helper functions for selecting the backend from the `VECTOR_STORE_BACKEND` environment variable, a central `_resolve()` function with DI-container fallback, and six factory functions for repository instantiation (Episodic, Atomic Fact, Foresight, Agent Case, Agent Skill, User Profile). |
| Memory Sync Service Router Integration: `methods/EverCore/src/biz_layer/mem_sync.py` | `MemorySyncService.__init__` is updated to load the Foresight and Atomic Fact repositories via router factories instead of directly via `get_bean_by_type()`. |
| Search Memory Service Router Integration: `methods/EverCore/src/agentic_layer/search_mem_service.py` | `SearchMemoryService.__init__` is switched to the backend router for all vector repositories (Episodic, Profile, Agent Case, Agent Skill). |
| Memory Manager Vector Search Router Integration: `methods/EverCore/src/agentic_layer/memory_manager.py` | `get_vector_search_results()` routes the repository selection per `MemoryType` through the corresponding router factory functions. |
| Profile Search Service Lazy Load Router Integration: `methods/EverCore/src/agentic_layer/profile_search_service.py` | The `milvus_repo` property is switched to router-based lazy loading for User Profile repositories. |
| Profile Indexer Router Integration: `methods/EverCore/src/memory_layer/profile_indexer/profile_indexer.py` | The `milvus_repo` property is updated to load the User Profile repository via `get_user_profile_repo()`. |
| Agent Skill Extraction Router Integration: `methods/EverCore/src/biz_layer/mem_memorize.py` | `_trigger_agent_skill_extraction()` is updated to load the Agent Skill repository via the backend router. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • DerAuctor

Poem

A router hops through quanta and vectors,
Qdrant and Milvus, heard by rabbit ears,
An environment whispers which net to choose,
Ten sites land softly in the new beam,
The old naming stays, backward-reliable. 🐰✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 58.82% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title precisely describes the main change: routing service-level vector repository lookups through an environment-variable-driven factory. |
| Description check | ✅ Passed | The description is comprehensive, covering problem, fix, affected service classes, compatibility and test plan, and is related to the scope of the change. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@methods/EverCore/src/core/oxm/vector_backend_router.py`:
- Around line 34-38: The function _backend() normalises the ENV value but does
not validate that it is "qdrant" or "milvus"; update _backend() so that, after
reading and normalising the ENV value, it is checked against an allowed
set = {"qdrant","milvus"} and, for an invalid value, either a clear exception
(e.g. ValueError) is raised or a meaningful error log containing the offending
value is emitted, instead of silently falling back to "milvus"; reference the
function _backend() and the environment variable VECTOR_STORE_BACKEND, and make
sure the error message contains the invalid value.
- Line 31: Replace the deprecated typing.Type import and uses with the built-in
type: change the import line "from typing import Any, Type" to "from typing
import Any" (or drop typing import entirely if unused) and update any
annotations that use Type[...] to use the built-in "type[...]"/"type" (e.g.,
change "Type[Foo]" to "type[Foo]" or just "type" where appropriate) — look for
usages in this module such as Type in function signatures or variable
annotations and update them accordingly.
- Around line 45-57: In _resolve, narrow and log DI failures instead of silently
swallowing them: import BeanNotFoundError from core.di.exceptions, call
get_bean_by_type(cls) inside a try that catches ImportError and
BeanNotFoundError (not a bare Exception), log the exception details (include the
cls and the exception) before falling back to returning cls(), so failures in
get_bean_by_type are visible in logs.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 431cb57b-9ea9-4999-adc2-2161697ccfdb

📥 Commits

Reviewing files that changed from the base of the PR and between e3cf5bb and f57f43c.

📒 Files selected for processing (7)
  • methods/EverCore/src/agentic_layer/memory_manager.py
  • methods/EverCore/src/agentic_layer/profile_search_service.py
  • methods/EverCore/src/agentic_layer/search_mem_service.py
  • methods/EverCore/src/biz_layer/mem_memorize.py
  • methods/EverCore/src/biz_layer/mem_sync.py
  • methods/EverCore/src/core/oxm/vector_backend_router.py
  • methods/EverCore/src/memory_layer/profile_indexer/profile_indexer.py

Comment thread methods/EverCore/src/core/oxm/vector_backend_router.py Outdated
Comment thread methods/EverCore/src/core/oxm/vector_backend_router.py Outdated
Comment thread methods/EverCore/src/core/oxm/vector_backend_router.py Outdated
@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f57f43ca3e

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread methods/EverCore/src/biz_layer/mem_sync.py
Comment thread methods/EverCore/src/biz_layer/mem_memorize.py
Addresses 5 of the 6 PR-6 review findings (Codex P1×3 + CodeRabbit
minor×2). The remaining one (CodeRabbit analysis-only on get_bean_by_type
exception scope) requires no code change — the fallback in ``_resolve``
already converts the missing-bean case into direct instantiation, so the
broad ``except Exception`` is intentional.

P1 fixes — Milvus-compat write API on Qdrant repositories

Codex correctly identified that the previous factory swap could route
``ProfileIndexer.insert_batch(...)``, ``sync_foresight/atomic_fact.insert(...)``
and ``mem_memorize.agent_skill insert/delete`` calls to Qdrant repos that
never had those methods. The Qdrant repos exposed only
``create_and_save_*`` + native ``upsert(PointStruct)`` — different
signatures, different entity shape.

Instead of rewriting every service-layer call (large diff, easy to miss
sites), this commit adds a thin compatibility layer to
``BaseQdrantRepository`` so all six concrete Qdrant repos inherit it:

- ``insert(entity_dict, flush=False) -> str`` — translates a
  Milvus-converter output (``{"id", "vector", payload-fields…}``) into
  a ``PointStruct`` via the same ``mongo_id_to_qdrant_id`` mapping the
  sweep uses, upserts, and returns the resulting Qdrant point id. The
  raw Mongo id is preserved as ``payload.mongo_id`` for round-trip.
- ``insert_batch(entity_dicts, flush=False) -> UpdateResult`` — batch
  version. Entities without a vector or id are silently skipped (the
  source converter is supposed to guard, but the pipeline sometimes
  hands partial docs in early-lifecycle states).
- The ``flush`` parameter is accepted and ignored — Qdrant has no
  separate flush; ``upsert(wait=True)`` already covers durability.

This keeps the existing service-layer ``MilvusConverter.from_mongo()``
→ ``milvus_repo.insert()`` pipeline intact while transparently routing
to Qdrant when the env flag is set.
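
A rough sketch of the compatibility layer described above. To stay runnable without `qdrant_client`, `Point` is a small stand-in for `PointStruct` and a dict stands in for the collection; the uuid5 namespace, raising on a missing id or vector in `insert`, and returning a count from `insert_batch` (the commit returns an `UpdateResult`) are assumptions.

```python
import uuid
from dataclasses import dataclass
from typing import Any, Optional

_NAMESPACE = uuid.NAMESPACE_OID  # assumption: the real namespace is not shown


def mongo_id_to_qdrant_id(mongo_id: Any) -> str:
    return str(uuid.uuid5(_NAMESPACE, str(mongo_id)))


@dataclass
class Point:
    """Stand-in for qdrant_client's PointStruct."""
    id: str
    vector: list
    payload: dict


class QdrantCompatSketch:
    def __init__(self) -> None:
        self.points: dict = {}  # stands in for the Qdrant collection

    def _to_point(self, entity: dict) -> Optional[Point]:
        mongo_id, vector = entity.get("id"), entity.get("vector")
        if mongo_id is None or vector is None:
            return None
        # Everything except id/vector becomes payload; keep the raw Mongo id.
        payload = {k: v for k, v in entity.items() if k not in ("id", "vector")}
        payload["mongo_id"] = str(mongo_id)
        return Point(mongo_id_to_qdrant_id(mongo_id), list(vector), payload)

    def insert(self, entity_dict: dict, flush: bool = False) -> str:
        # `flush` is accepted for Milvus call compatibility and ignored.
        point = self._to_point(entity_dict)
        if point is None:
            raise ValueError("entity needs both 'id' and 'vector'")
        self.points[point.id] = point
        return point.id

    def insert_batch(self, entity_dicts: list, flush: bool = False) -> int:
        # Entities without id or vector are skipped, as in the commit message.
        points = [p for p in map(self._to_point, entity_dicts) if p is not None]
        for point in points:
            self.points[point.id] = point
        return len(points)
```

Existing callers can keep passing the converter's dict and the `flush` argument unchanged; only the storage side differs.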

CodeRabbit minors

- ``typing.Type`` → builtin ``type`` (Python 3.9+ idiom, ruff UP035).
- ``VECTOR_STORE_BACKEND`` validation: an unrecognised value now logs a
  warning and falls back to ``milvus`` instead of silently routing
  there. Catches typos like ``VECTOR_STORE_BACKEND=qdarnt``.
Ptah-CT commented May 14, 2026

@coderabbitai review

All 5 findings addressed in 035444c:

  • Codex P1×3 (mem_sync / mem_memorize / profile_indexer write-API mismatch): added Milvus-compat insert(entity_dict, flush=False) and insert_batch(entity_dicts, flush=False) to BaseQdrantRepository so all 6 Qdrant repos inherit them. The existing *MilvusConverter.from_mongo() -> insert() service pipeline now transparently routes to Qdrant.
  • CR minor: typing.Type -> builtin type.
  • CR minor: VECTOR_STORE_BACKEND validation — unknown values now log a warning + fall back to milvus instead of silent fallback.

The remaining CR analysis (get_bean_by_type exception scope) needs no change: the broad except in _resolve is intentional because we want to fall through to direct instantiation in unit-test contexts that have no DI scan.

coderabbitai Bot commented May 14, 2026

@Ptah-CT Thanks for the detailed summary of the fixes in 035444c! The approach of adding insert(entity_dict, flush=False) and insert_batch(entity_dicts, flush=False) to BaseQdrantRepository is clean — it keeps the *MilvusConverter → insert() pipeline intact without touching callers. Noted as well on the intentional broad except in _resolve for DI-less unit-test contexts.

Triggering a fresh review now.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@Ptah-CT Ptah-CT merged commit f25eba3 into main May 14, 2026
3 checks passed
@Ptah-CT Ptah-CT deleted the qdrant/service-level-routing branch May 14, 2026 22:29
@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 035444c355

Comment on lines +160 to +161
self.agent_case_milvus_repo = get_agent_case_repo()
self.agent_skill_milvus_repo = get_agent_skill_repo()

P1: Preserve Mongo IDs when routing search repos through backend router

After SearchMemoryService starts resolving agent_case_milvus_repo/agent_skill_milvus_repo via the backend router, Qdrant mode returns point UUIDs in id, but this service still treats id as a Mongo _id for backfill (_fetch_agent_cases_by_ids / _fetch_agent_skills_by_ids) and for hybrid fusion dedupe. In Qdrant deployments this causes vector/hybrid agent-case and agent-skill results to be dropped because Mongo lookups miss, even though vector hits were found.
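
One possible way to address this, sketched under assumptions: hits are plain dicts with an `id` and an optional `payload`, and Qdrant hits carry the original id as `payload["mongo_id"]` (as the compat insert writes it). `to_mongo_keyed` is a hypothetical helper, not something in the PR.

```python
from typing import Any


def to_mongo_keyed(hit: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the hit whose `id` is the Mongo _id when one is known."""
    payload = hit.get("payload") or {}
    mongo_id = payload.get("mongo_id")
    if mongo_id is None:
        return hit  # Milvus-style hit: `id` is already the Mongo _id
    # Keep the point UUID around in case a caller needs it later.
    return {**hit, "id": str(mongo_id), "point_id": hit["id"]}


hits = [
    {"id": "0b0e6a4e-point-uuid", "score": 0.9, "payload": {"mongo_id": "665f1c"}},
    {"id": "665f1d", "score": 0.8},
]
backfill_ids = [to_mongo_keyed(h)["id"] for h in hits]
```

Applying such a normalisation right after the vector search would let the existing `_fetch_*_by_ids` backfill keep treating `id` as a Mongo `_id`.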

}
payload["mongo_id"] = str(mongo_id)
return qmodels.PointStruct(
id=mongo_id_to_qdrant_id(mongo_id),

P1: Add delete-ID compatibility alongside Mongo->Qdrant ID mapping

The new Milvus-compat insert adapter rewrites Mongo IDs to UUID5 point IDs, but delete call sites in routed services still pass raw Mongo IDs into delete_by_id. That mismatch means update/delete sync flows can fail or silently miss deletions in Qdrant mode; in mem_memorize this is especially harmful because delete and insert are inside one try, so a delete failure skips the subsequent upserts and leaves vectors stale.
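
A sketch of what delete-side compatibility could look like, with the delete isolated from the inserts. Everything here is illustrative: the uuid5 namespace, the dict-backed `QdrantRepoSketch`, and the `replace_records` helper are assumptions rather than code from the repository.

```python
import logging
import uuid

logger = logging.getLogger("mem_memorize_sketch")

_NAMESPACE = uuid.NAMESPACE_OID  # assumption: the real namespace is not shown


def mongo_id_to_qdrant_id(mongo_id: object) -> str:
    return str(uuid.uuid5(_NAMESPACE, str(mongo_id)))


class QdrantRepoSketch:
    def __init__(self) -> None:
        self.points: dict = {}

    def insert(self, entity: dict) -> str:
        point_id = mongo_id_to_qdrant_id(entity["id"])
        self.points[point_id] = entity
        return point_id

    def delete_by_id(self, mongo_id: object) -> bool:
        # Apply the same mapping as insert, so callers can pass raw Mongo ids.
        return self.points.pop(mongo_id_to_qdrant_id(mongo_id), None) is not None


def replace_records(repo, stale_ids, fresh_entities) -> list:
    for stale_id in stale_ids:
        try:
            repo.delete_by_id(stale_id)
        except Exception as exc:
            # A failed delete is logged but must not skip the inserts below.
            logger.error("delete of %s failed: %s", stale_id, exc)
    return [repo.insert(entity) for entity in fresh_entities]
```

Mapping the id inside `delete_by_id` keeps call sites unchanged, and separating the two steps avoids stale vectors when a delete fails.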

Comment on lines 689 to 693
     case MemoryType.EPISODIC_MEMORY:
-        milvus_repo = get_bean_by_type(EpisodicMemoryMilvusRepository)
+        milvus_repo = get_episodic_repo()
     case MemoryType.AGENT_CASE:
-        milvus_repo = get_bean_by_type(AgentCaseMilvusRepository)
+        milvus_repo = get_agent_case_repo()
     case MemoryType.AGENT_SKILL:

P1: Keep hybrid dedupe keys consistent across vector backends

This routing change also affects MemoryManager hybrid retrieval: _search_hybrid deduplicates keyword/vector hits by id, but keyword hits use ES document IDs while Qdrant vector hits now carry point UUIDs, so the same memory is no longer recognized as duplicate. In Qdrant mode this inflates duplicate candidates before rerank and can crowd out distinct results in the final top-k.
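
The dedupe concern can be illustrated with a small fusion sketch. It assumes keyword hits carry the Mongo id as `id`, Qdrant vector hits carry it as `payload["mongo_id"]`, and hits are plain dicts; `dedupe_key` and `fuse` are hypothetical helpers, not the `_search_hybrid` implementation.

```python
def dedupe_key(hit: dict) -> str:
    """Backend-independent key: prefer the Mongo id stored in the payload."""
    payload = hit.get("payload") or {}
    return str(payload.get("mongo_id") or hit["id"])


def fuse(keyword_hits: list, vector_hits: list) -> list:
    """Merge both result lists, keeping the first hit seen for each memory."""
    merged: dict = {}
    for hit in [*keyword_hits, *vector_hits]:
        merged.setdefault(dedupe_key(hit), hit)
    return list(merged.values())


keyword_hits = [{"id": "665f1c", "source": "keyword"}]
vector_hits = [
    {"id": "point-uuid-1", "source": "vector", "payload": {"mongo_id": "665f1c"}},
    {"id": "point-uuid-2", "source": "vector", "payload": {"mongo_id": "665f1d"}},
]
candidates = fuse(keyword_hits, vector_hits)
```

Keyed this way, the same memory found by both retrievers collapses to one candidate before rerank, regardless of which vector backend produced the hit.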

Comment on lines +154 to +155
self.episodic_milvus_repo = get_episodic_repo()
self.profile_milvus_repo = get_user_profile_repo()

P2: Normalize episodic timestamp type after backend-routed repo swap

By routing episodic_milvus_repo through get_episodic_repo(), Qdrant mode now feeds search_mem_service results where timestamp is already a datetime, but _search_episodic_memories only handles numeric timestamps and otherwise sets timestamp=None. This makes vector episodic results lose their timestamp field in Qdrant deployments, which breaks time display/sorting behavior even when the underlying hit is valid.
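
A small normaliser that accepts both shapes might look like this. It is a sketch: `normalize_timestamp` is a hypothetical helper, and treating numeric values as epoch seconds in UTC is an assumption about the Milvus-side representation.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def normalize_timestamp(value: Any) -> Optional[datetime]:
    """Accept datetime (Qdrant path) or a numeric epoch (Milvus path)."""
    if isinstance(value, datetime):
        return value  # already in the target type, pass through
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        # Assumption: numeric timestamps are epoch seconds.
        return datetime.fromtimestamp(value, tz=timezone.utc)
    return None  # unknown or missing shape
```

Calling this in place of the numeric-only branch would keep the timestamp on vector results from either backend.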
