fix(qdrant): allow service to boot with VECTOR_STORE_BACKEND=qdrant#5
Conversation
Three fixes that together unblock the Qdrant cutover. Without them the service crashes at FastAPI startup in a loop even though the Qdrant adapter itself is healthy. 1. Symmetric env-gate in MilvusLifespanProvider (core/lifespan/milvus_lifespan.py) QdrantLifespanProvider is already gated by VECTOR_STORE_BACKEND — when the value is anything other than 'qdrant' it is a no-op. The Milvus lifespan had no such gate: it always tried to connect to Milvus and raised on the first call when Milvus was offline. Because the Milvus lifespan runs at order=18 (before order=19 Qdrant lifespan), the Qdrant lifespan never got a chance to start. Added an early no-op return when VECTOR_STORE_BACKEND=qdrant for both startup and shutdown. 2. Lazy collection resolution in BaseMilvusRepository (core/oxm/milvus/base_repository.py) BaseMilvusRepository.__init__ called model.async_collection() eagerly, which raises 'Collection instance not created' when the Milvus lifespan has not initialised the collection — exactly the case when the service runs in Qdrant mode. Because controllers like SearchMemoryService construct Milvus repositories unconditionally, this crashed the whole FastAPI startup. Replaced with a lazy @Property: __init__ now only stashes the model class; the collection is resolved on first access. Result: Qdrant-mode boots cleanly, and Milvus-mode behaviour is unchanged (first repo call still resolves the collection the same way). 3. Corrected lazy import in QdrantConnectionCache key helper (core/tenants/tenantize/oxm/qdrant/config_utils.py) The lazy import inside get_tenant_qdrant_config was still pointing to the pre-rename module path core.tenants.tenantize.tenant_context; the actual module is core.tenants.tenant_contextvar. The error only fires when the Qdrant adapter tries to resolve a tenant — i.e. exactly at QdrantLifespanProvider.startup. Fixed the import string. Verified by booting the service with VECTOR_STORE_BACKEND=qdrant: lifespan order metrics -> mongodb -> milvus(no-op) -> elasticsearch -> qdrant (initialised) -> business -> longjob completes; /health returns 200; the Qdrant collections (101 today: 72 phase-3 sweep + 29 user_profile) are discovered and ready.
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughÜberblickDas System wird aktualisiert, um Qdrant als alternatives Vector-Store-Backend zu unterstützen: Milvus-Lifespan-Startup wird übersprungen, wenn Qdrant aktiv ist, die Collection-Initialisierung wird verzögert, und die Tenant-Kontextauflösung wird konsolidiert. ÄnderungenQdrant-Backend-Migration und Lazy-Initialisierung
Aufwandsschätzung für Code-Review🎯 2 (Einfach) | ⏱️ ~12 Minuten Gedicht
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 290600f7d9
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
QdrantLifespanProvider already treats the env flag case-insensitively via .strip().lower(); MilvusLifespanProvider must do the same. Without the normalization a deployment that sets VECTOR_STORE_BACKEND=QDRANT (or the value with surrounding whitespace) would still start Qdrant — and also try to start Milvus, regressing the very crash this PR fixes.
|
@coderabbitai please re-review and approve when satisfied — the Codex P2 normalization finding has been addressed in 774be68. |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chainLet me fetch the latest PR state and the specific commit to verify the fix before proceeding. 🏁 Script executed: Length of output: 23592
I've already approved the PR at commit |
…e factory (#6) ## Problem PR #5 enabled the service to **start** with ``VECTOR_STORE_BACKEND=qdrant`` by gating the Milvus lifespan and lazifying ``BaseMilvusRepository``'s collection property. But the service-layer write/read path was still hard-coded to the Milvus repositories: 5 service classes either called ``EpisodicMemoryMilvusRepository()`` directly or asked the DI container for ``get_bean_by_type(EpisodicMemoryMilvusRepository)``. With the Milvus collections never loaded, every insert / search through that path raised ``ValueError("Collection instance not created, please call ensure_loaded() first")`` — which the calling services swallowed as ``ERROR`` logs while returning ``indexed=0`` and treating the operation as successful. **Net effect on a live cutover deployment**: data continued to land in MongoDB (the source of truth), but never reached Qdrant. Search through the service layer returned empty results because the same broken path was used for reads. Concretely measured on the production cutover host: - New documents in ``v1_episodic_memories`` (Mongo) were **not** present in the corresponding Qdrant collection (108 docs missing after ~12 h). - ``profile_indexer`` logged ``Profile Milvus indexing completed: indexed=0`` on every save with a ``UserProfileCollection Collection instance not created`` traceback right next to it. ## Fix Introduce ``core/oxm/vector_backend_router.py`` — a thin factory with one ``get_<memory>_repo()`` per memory type. Each factory reads ``VECTOR_STORE_BACKEND`` (case-insensitive, normalised) and resolves the appropriate Qdrant or Milvus repo **via the DI container**, preserving singleton bean scope. Direct instantiation is the fallback for unit-test contexts without a DI scan. Service-layer sites updated: - ``agentic_layer/search_mem_service.py`` — 4 hard-coded ``*MilvusRepository()`` direct instantiations call the factory now; field names keep their historical ``_milvus_`` token for callee compatibility with the rest of the service. - ``agentic_layer/memory_manager.py`` — the ``match mem_type`` switch that picked one of 5 ``MilvusRepository`` beans now routes to the factory. - ``agentic_layer/profile_search_service.py`` — the lazy ``milvus_repo`` property resolves through ``get_user_profile_repo()``. - ``memory_layer/profile_indexer/profile_indexer.py`` — same lazy property, same factory call. - ``biz_layer/mem_sync.py`` — constructor defaults for ``foresight_milvus_repo`` and ``atomic_fact_milvus_repo`` come from the factory. - ``biz_layer/mem_memorize.py`` — the inline ``agent_skill_milvus_repo`` lookup inside the ``upsert_records`` sync uses ``get_agent_skill_repo()``. Both backends expose the same public surface (``vector_search``, ``create_and_save_*``, ``delete_by_*``), so callers need no further changes. Field / variable names keep their ``_milvus_`` token — that keeps the diff minimal and avoids a rename cascade through the existing methods; renaming is a follow-up. ## Test plan - [ ] Service boots with ``VECTOR_STORE_BACKEND=qdrant`` (already validated by PR #5) and now also writes to Qdrant on memory ingest. - [ ] ``profile_indexer`` ``indexed=N>0`` for a new profile. - [ ] Search results contain documents written **after** the cutover. - [ ] A separate one-shot catchup script syncs the ~108 documents that were written to Mongo between cutover and this fix (idempotent via uuid5 mapping, no double-embedding cost). --------- Co-authored-by: Ptah-CT <221234802+Ptah-CT@users.noreply.github.com>
Problem
After PR #2 (Qdrant adapter merged) and the data-side migration, the
service still refused to boot with
VECTOR_STORE_BACKEND=qdrant:MilvusLifespanProviderhad no env-gate, always tried to connect toMilvus, and crashed when Milvus was offline (which it is in the Qdrant
cutover scenario). It runs at order=18, before the Qdrant lifespan at
order=19, so the Qdrant lifespan never got a chance to start.
BaseMilvusRepository.__init__crashed at every controller construction because it eagerly called
model.async_collection()— which raises when the Milvus lifespanhas not initialised the collection (i.e. exactly the Qdrant-mode case).
QdrantConnectionCache's lazy import insideget_tenant_qdrant_configstill referencedcore.tenants.tenantize.tenant_context— that module was renamed tocore.tenants.tenant_contextvarupstream. The failure only surfacedat the very first
ensure_collectioncall insideQdrantLifespanProvider.startup, so it looked indistinguishablefrom a wider Qdrant initialisation issue.
Fix
Three small, surgical patches:
core/lifespan/milvus_lifespan.py— symmetric env-gate.startupandshutdownreturnNoneimmediately whenVECTOR_STORE_BACKEND=qdrant, mirroring whatQdrantLifespanProvideralready does for the inverse case.
core/oxm/milvus/base_repository.py— lazy collection resolution.__init__no longer eagerly resolvesmodel.async_collection();instead it stashes the model class, and
self.collectionis now a@propertythat resolves on first access. Milvus-mode behaviour isunchanged (first repo call resolves the collection identically to
before). Qdrant-mode boots cleanly because the Milvus collections are
never touched.
core/tenants/tenantize/oxm/qdrant/config_utils.py— fixedthe lazy import:
core.tenants.tenantize.tenant_context→core.tenants.tenant_contextvar.Verification
Boot with
VECTOR_STORE_BACKEND=qdrant:qdrant (initialised) → business → longjob — all green.
/healthreturns{"status": "healthy"}HTTP 200./docsHTTP 200.startup.
ensure_allresolves the tenant-aware names and finds thepre-seeded collections green.
Risk
Low. None of the changes affect Milvus-mode (default). All three patches
are scoped to either the Qdrant code path or to a lazy-resolution change
that defers behaviour without changing it.