fix(qdrant): allow service to boot with VECTOR_STORE_BACKEND=qdrant #5

Merged: Ptah-CT merged 2 commits into main from qdrant/service-cutover-fixes on May 14, 2026
@Ptah-CT Ptah-CT commented May 14, 2026

Problem

After PR #2 (Qdrant adapter merged) and the data-side migration, the
service still refused to boot with VECTOR_STORE_BACKEND=qdrant:

  1. MilvusLifespanProvider had no env-gate, always tried to connect to
    Milvus, and crashed when Milvus was offline (which it is in the Qdrant
    cutover scenario). It runs at order=18, before the Qdrant lifespan at
    order=19, so the Qdrant lifespan never got a chance to start.
  2. Once the Milvus lifespan was gated, BaseMilvusRepository.__init__
    crashed at every controller construction because it eagerly called
    model.async_collection() — which raises when the Milvus lifespan
    has not initialised the collection (i.e. exactly the Qdrant-mode case).
  3. QdrantConnectionCache's lazy import inside
    get_tenant_qdrant_config still referenced
    core.tenants.tenantize.tenant_context — that module was renamed to
    core.tenants.tenant_contextvar upstream. The failure only surfaced
    at the very first ensure_collection call inside
    QdrantLifespanProvider.startup, so it looked indistinguishable
    from a wider Qdrant initialisation issue.
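
The ordering effect in item 1 can be illustrated with a minimal sketch. It assumes the lifespan runner starts providers strictly in ascending `order` and stops at the first exception; `Provider` and `start_all` are hypothetical names for illustration, not the project's real classes.

```python
import asyncio


class Provider:
    """Hypothetical lifespan provider; only name, order and failure mode matter here."""

    def __init__(self, name, order, fails=False):
        self.name, self.order, self.fails = name, order, fails

    async def startup(self):
        if self.fails:
            raise ConnectionError(f"{self.name} is offline")


async def start_all(providers, started):
    # Providers start in ascending order; the first exception aborts the loop.
    for provider in sorted(providers, key=lambda p: p.order):
        await provider.startup()
        started.append(provider.name)


started = []
providers = [Provider("qdrant", 19), Provider("milvus", 18, fails=True)]
try:
    asyncio.run(start_all(providers, started))
    error = None
except ConnectionError as exc:
    error = str(exc)
```

Under these assumptions `started` stays empty: the provider at order 18 raises before the one at order 19 is ever reached.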

Fix

Three small, surgical patches:

core/lifespan/milvus_lifespan.py — symmetric env-gate.
startup and shutdown return None immediately when
VECTOR_STORE_BACKEND=qdrant, mirroring what QdrantLifespanProvider
already does for the inverse case.
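
A minimal sketch of such a gate, assuming the backend is read from the environment on every call. `MilvusLifespanSketch` and `qdrant_selected` are hypothetical stand-ins, and the `connected` flag only marks where the real Milvus connection logic would run.

```python
import asyncio
import os


def qdrant_selected() -> bool:
    # Normalised comparison, so "QDRANT" or " qdrant " also select Qdrant.
    return os.getenv("VECTOR_STORE_BACKEND", "").strip().lower() == "qdrant"


class MilvusLifespanSketch:
    """Hypothetical stand-in; the real provider's connect logic is not shown here."""

    def __init__(self):
        self.connected = False

    async def startup(self):
        if qdrant_selected():
            return None  # no-op: never touch Milvus in Qdrant mode
        self.connected = True  # placeholder for the real Milvus connection
        return self

    async def shutdown(self):
        if qdrant_selected():
            return None  # nothing was opened, so nothing to close
        self.connected = False
        return None
```

Gating `shutdown` as well keeps the provider from trying to close a connection that was never opened.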

core/oxm/milvus/base_repository.py — lazy collection resolution.
__init__ no longer eagerly resolves model.async_collection();
instead it stashes the model class, and self.collection is now a
@property that resolves on first access. Milvus-mode behaviour is
unchanged (first repo call resolves the collection identically to
before). Qdrant-mode boots cleanly because the Milvus collections are
never touched.
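
The lazy-resolution pattern can be sketched as follows. `FakeModel` and `LazyRepository` are hypothetical names; only the `async_collection()` call and the error text come from the description above, and the caching of the resolved collection is an assumption of this sketch.

```python
class FakeModel:
    """Hypothetical model whose collection is unavailable until initialised."""

    ready = False
    calls = 0

    @classmethod
    def async_collection(cls):
        cls.calls += 1
        if not cls.ready:
            raise ValueError("Collection instance not created")
        return object()


class LazyRepository:
    def __init__(self, model):
        # Only stash the model class; do not resolve the collection yet.
        self.model = model
        self._collection = None

    @property
    def collection(self):
        # Resolve on first access, then reuse the cached instance.
        if self._collection is None:
            self._collection = self.model.async_collection()
        return self._collection


# Construction succeeds even though the collection is not ready.
repo = LazyRepository(FakeModel)
```

The error still surfaces if Qdrant-mode code actually reads `repo.collection`, so the change moves the failure from construction time to first use rather than hiding it.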

core/tenants/tenantize/oxm/qdrant/config_utils.py — fixed
the lazy import: core.tenants.tenantize.tenant_context →
core.tenants.tenant_contextvar.

Verification

Boot with VECTOR_STORE_BACKEND=qdrant:

  • lifespan order: metrics → mongodb → milvus(no-op) → elasticsearch →
    qdrant (initialised) → business → longjob — all green.
  • /health returns {"status": "healthy"} HTTP 200.
  • /docs HTTP 200.
  • The 6 Qdrant collection classes are discovered + initialised at
    startup. ensure_all resolves the tenant-aware names and finds the
    pre-seeded collections green.

Risk

Low. None of the changes affect Milvus-mode (default). All three patches
are scoped to either the Qdrant code path or to a lazy-resolution change
that defers behaviour without changing it.

Three fixes that together unblock the Qdrant cutover. Without them the
service crashes at FastAPI startup in a loop even though the Qdrant adapter
itself is healthy.

1. Symmetric env-gate in MilvusLifespanProvider (core/lifespan/milvus_lifespan.py)

   QdrantLifespanProvider is already gated by VECTOR_STORE_BACKEND — when
   the value is anything other than 'qdrant' it is a no-op. The Milvus
   lifespan had no such gate: it always tried to connect to Milvus and
   raised on the first call when Milvus was offline. Because the Milvus
   lifespan runs at order=18 (before order=19 Qdrant lifespan), the
   Qdrant lifespan never got a chance to start. Added an early no-op
   return when VECTOR_STORE_BACKEND=qdrant for both startup and shutdown.

2. Lazy collection resolution in BaseMilvusRepository (core/oxm/milvus/base_repository.py)

   BaseMilvusRepository.__init__ called model.async_collection() eagerly,
   which raises 'Collection instance not created' when the Milvus lifespan
   has not initialised the collection — exactly the case when the service
   runs in Qdrant mode. Because controllers like SearchMemoryService
   construct Milvus repositories unconditionally, this crashed the whole
   FastAPI startup. Replaced with a lazy @property: __init__ now only
   stashes the model class; the collection is resolved on first access.
   Result: Qdrant-mode boots cleanly, and Milvus-mode behaviour is
   unchanged (first repo call still resolves the collection the same way).

3. Corrected lazy import in QdrantConnectionCache key helper
   (core/tenants/tenantize/oxm/qdrant/config_utils.py)

   The lazy import inside get_tenant_qdrant_config was still pointing to
   the pre-rename module path core.tenants.tenantize.tenant_context; the
   actual module is core.tenants.tenant_contextvar. The error only fires
   when the Qdrant adapter tries to resolve a tenant — i.e. exactly at
   QdrantLifespanProvider.startup. Fixed the import string.

Verified by booting the service with VECTOR_STORE_BACKEND=qdrant: lifespan
order metrics -> mongodb -> milvus(no-op) -> elasticsearch -> qdrant
(initialised) -> business -> longjob completes; /health returns 200; the
Qdrant collections (101 today: 72 phase-3 sweep + 29 user_profile) are
discovered and ready.
@Ptah-CT Ptah-CT requested a review from DerAuctor as a code owner May 14, 2026 07:59
coderabbitai Bot commented May 14, 2026

📥 Commits

Reviewing files that changed from the base of the PR and between 290600f and 774be68.

📒 Files selected for processing (1)
  • methods/EverCore/src/core/lifespan/milvus_lifespan.py
📝 Walkthrough

Overview

The system is updated to support Qdrant as an alternative vector-store backend: Milvus lifespan startup is skipped when Qdrant is active, collection initialisation is deferred, and tenant-context resolution is consolidated.

Changes

Qdrant backend migration and lazy initialisation

| Layer / File(s) | Summary |
| --- | --- |
| Milvus lifespan environment check (methods/EverCore/src/core/lifespan/milvus_lifespan.py) | import os is added and MilvusLifespanProvider.startup gains an environment check: if VECTOR_STORE_BACKEND equals "qdrant", the startup method logs the skip and returns None before Milvus is initialised. |
| Deferred collection initialisation in BaseMilvusRepository (methods/EverCore/src/core/oxm/milvus/base_repository.py) | The AsyncCollection is converted from an eagerly initialised attribute to a @property with lazy initialisation and caching; the collection is created only on first access via self.model.async_collection() and then cached. |
| Tenant-context resolution import consolidation (methods/EverCore/src/core/tenants/tenantize/oxm/qdrant/config_utils.py) | The get_current_tenant import is changed from core.tenants.tenantize.tenant_context to core.tenants.tenant_contextvar, while the subsequent resolution logic stays unchanged. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 A rabbit's verse on the Qdrant adventure:

Milvus sleeps when Qdrant calls,
The collections wait until someone knocks,
Lazy until needed, and that is right,
The tenant import lands just right! 🌟

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title precisely describes the main change: it allows the service to start with VECTOR_STORE_BACKEND=qdrant, which is the central goal of the pull request. |
| Description check | ✅ Passed | The description explains the three fixed problems in detail and documents the fixes and the verification of the changes. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai Bot previously approved these changes May 14, 2026
@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 290600f7d9

Comment thread methods/EverCore/src/core/lifespan/milvus_lifespan.py Outdated
QdrantLifespanProvider already treats the env flag case-insensitively
via .strip().lower(); MilvusLifespanProvider must do the same. Without
the normalization a deployment that sets VECTOR_STORE_BACKEND=QDRANT (or
the value with surrounding whitespace) would still start Qdrant — and
also try to start Milvus, regressing the very crash this PR fixes.
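
The difference between the bare and the normalised comparison can be checked with a small sketch. `is_qdrant_backend` is a hypothetical helper name; only the `.strip().lower()` normalisation is taken from the text above.

```python
def is_qdrant_backend(raw):
    # Treat a missing value as "not Qdrant"; ignore case and surrounding whitespace.
    return (raw or "").strip().lower() == "qdrant"


values = ["qdrant", "QDRANT", " Qdrant \n", "milvus", "", None]

# Bare equality only matches the exact lowercase string.
bare = [value == "qdrant" for value in values]

# The normalised check also accepts upper case and padded values.
normalised = [is_qdrant_backend(value) for value in values]
```

With the bare check only the first value matches, so a deployment using `QDRANT` would not be gated on the Milvus side.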
Ptah-CT commented May 14, 2026

@coderabbitai please re-review and approve when satisfied — the Codex P2 normalization finding has been addressed in 774be68.

coderabbitai Bot commented May 14, 2026

🧩 Analysis chain

Let me fetch the latest PR state and the specific commit to verify the fix before proceeding.


🏁 Script executed:

#!/bin/bash
# Fetch PR details and files changed
gh pr view 5 --repo XInfty/EverOS_Qdrant --json title,state,commits,files

echo "---COMMIT---"
# Fetch the specific commit 774be68
gh api repos/XInfty/EverOS_Qdrant/commits/774be68 --jq '{sha: .sha, message: .commit.message, files: [.files[] | {filename, patch}]}'

echo "---PR COMMENTS---"
gh pr view 5 --repo XInfty/EverOS_Qdrant --json reviews,comments

Length of output: 23592


@Ptah-CT: The P2 normalization fix in 774be68 is correct and complete. The change from a bare == "qdrant" to .strip().lower() == "qdrant" now makes MilvusLifespanProvider's env-gate fully symmetric with QdrantLifespanProvider, handling QDRANT, Qdrant, and values with surrounding whitespace identically.

I've already approved the PR at commit 774be68 — all three original issues (Milvus lifespan env-gate, lazy BaseMilvusRepository collection, and the renamed module import) plus the follow-up normalization are addressed. The PR is good to merge. ✅

@Ptah-CT Ptah-CT merged commit e3cf5bb into main May 14, 2026
3 checks passed
@Ptah-CT Ptah-CT deleted the qdrant/service-cutover-fixes branch May 14, 2026 08:20
Ptah-CT added a commit that referenced this pull request May 14, 2026
…e factory (#6)

## Problem

PR #5 enabled the service to **start** with ``VECTOR_STORE_BACKEND=qdrant``
by gating the Milvus lifespan and lazifying ``BaseMilvusRepository``'s
collection property. But the service-layer write/read path was still
hard-coded to the Milvus repositories: 5 service classes either called
``EpisodicMemoryMilvusRepository()`` directly or asked the DI container
for ``get_bean_by_type(EpisodicMemoryMilvusRepository)``.

With the Milvus collections never loaded, every insert / search through
that path raised
``ValueError("Collection instance not created, please call ensure_loaded() first")``,
which the calling services swallowed as ``ERROR`` logs while returning
``indexed=0`` and treating the operation as successful.

**Net effect on a live cutover deployment**: data continued to land in
MongoDB (the source of truth), but never reached Qdrant. Search through
the service layer returned empty results because the same broken path
was used for reads.

Concretely measured on the production cutover host:

- New documents in ``v1_episodic_memories`` (Mongo) were **not** present
  in the corresponding Qdrant collection (108 docs missing after ~12 h).
- ``profile_indexer`` logged
  ``Profile Milvus indexing completed: indexed=0`` on every save, with a
  ``UserProfileCollection Collection instance not created`` traceback
  right next to it.

## Fix

Introduce ``core/oxm/vector_backend_router.py`` — a thin factory with
one ``get_<memory>_repo()`` per memory type. Each factory reads
``VECTOR_STORE_BACKEND`` (case-insensitive, normalised) and resolves the
appropriate Qdrant or Milvus repo **via the DI container**, preserving
singleton bean scope. Direct instantiation is the fallback for unit-test
contexts without a DI scan.

Service-layer sites updated:

- ``agentic_layer/search_mem_service.py`` — 4 hard-coded
  ``*MilvusRepository()`` direct instantiations call the factory now;
  field names keep their historical ``_milvus_`` token for callee
  compatibility with the rest of the service.
- ``agentic_layer/memory_manager.py`` — the ``match mem_type`` switch
  that picked one of 5 ``MilvusRepository`` beans now routes to the
  factory.
- ``agentic_layer/profile_search_service.py`` — the lazy ``milvus_repo``
  property resolves through ``get_user_profile_repo()``.
- ``memory_layer/profile_indexer/profile_indexer.py`` — same lazy
  property, same factory call.
- ``biz_layer/mem_sync.py`` — constructor defaults for
  ``foresight_milvus_repo`` and ``atomic_fact_milvus_repo`` come from
  the factory.
- ``biz_layer/mem_memorize.py`` — the inline ``agent_skill_milvus_repo``
  lookup inside the ``upsert_records`` sync uses
  ``get_agent_skill_repo()``.

Both backends expose the same public surface (``vector_search``,
``create_and_save_*``, ``delete_by_*``), so callers need no further
changes. Field / variable names keep their ``_milvus_`` token — that
keeps the diff minimal and avoids a rename cascade through the
existing methods; renaming is a follow-up.

## Test plan

- [ ] Service boots with ``VECTOR_STORE_BACKEND=qdrant`` (already
  validated by PR #5) and now also writes to Qdrant on memory ingest.
- [ ] ``profile_indexer`` ``indexed=N>0`` for a new profile.
- [ ] Search results contain documents written **after** the cutover.
- [ ] A separate one-shot catchup script syncs the ~108 documents that
  were written to Mongo between cutover and this fix (idempotent via
  uuid5 mapping, no double-embedding cost).
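
The idempotency argument in the last item can be sketched as follows, assuming the Qdrant point id is derived deterministically from the Mongo document id. The namespace value and the helper names ``point_id_for`` and ``missing_ids`` are hypothetical; the actual catchup script is not shown in this PR.

```python
import uuid

# Hypothetical namespace; any fixed UUID works as long as it never changes.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "example.invalid/mongo-to-qdrant")


def point_id_for(mongo_id: str) -> str:
    # uuid5 is deterministic: the same input always yields the same id.
    return str(uuid.uuid5(NAMESPACE, mongo_id))


def missing_ids(mongo_ids, existing_point_ids):
    """Return the Mongo ids whose derived point id is not yet stored."""
    return [m for m in mongo_ids if point_id_for(m) not in existing_point_ids]
```

Because the mapping is stable, a rerun only selects documents whose derived id is still absent, so already-synced documents are not embedded a second time.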

---------

Co-authored-by: Ptah-CT <221234802+Ptah-CT@users.noreply.github.com>