Qdrant adapter: full Milvus parity + standalone re-embed sweep (rebased on EverCore path) by Ptah-CT · Pull Request #2 · XInfty/EverOS_Qdrant

Ptah-CT · 2026-05-13T21:14:22Z

Summary

Rebase of the original Qdrant-adapter branch on top of current main, after upstream renamed methods/evermemos/ → methods/EverCore/ and added OpenHer use-cases. All 20 commits preserved, path rewritten via git format-patch + sed + git am --3way. README.md auto-merged.

Supersedes #1, which carried the same code on the pre-rename path.

The branch carries:

Phase 1: QdrantCollectionBase (tenant-aware naming, payload indexes, query_points instead of removed search).
Phase 2: 6 collections + 6 converters covering EpisodicMemory, AtomicFact, Foresight, AgentCase, AgentSkill, UserProfile.
Phase 2.5: 6 @repository adapters with two-stage score gating, client.count(exact=True) for delete-by-filter, and tz-aware epoch helpers.
Phase 3: standalone migrate_milvus_to_qdrant CLI (Mongo → OpenRouter → Qdrant) plus the re_embed_sweep wrapper that fans the workhorse over every active tenant database.
3 rounds of bot feedback addressed inline: 30 (CodeRabbit pass-1) + 7 (CodeRabbit pass-2) + 17 (CodeRabbit on Qdrant adapter: full Milvus parity + standalone re-embed sweep #1, plus a 👍-equivalent from Codex).

Test plan

ruff check clean on changed files.
pytest green for methods/EverCore/tests/ (collection + converter + repository suites).
Smoke: standalone migrate writes a single document into a smoketest Qdrant collection (verified against a real Qdrant pre-rebase).
Sweep dry-run prints the expected (db, coll) pair plan and does not call OpenRouter.
Full sweep (after merge) writes the re-embedded points into Qdrant collections per tenant.

Out of scope / follow-ups

UserProfile reindex is intentionally not in the sweep wrapper — it splits one source document into many points and needs its own Phase 3.1 path.

Phase 1 (skeleton) of the milvus -> qdrant migration. No runtime behavior change yet — Milvus stays the default backend until cutover. - README: prepend fork header documenting motivation, status, approach, and concept mapping. Links to the Qdrant migration guide. - pyproject: add qdrant-client>=1.12,<2. - src/core/oxm/qdrant/qdrant_collection_base.py: stub IndexConfig and QdrantCollectionBase. ensure_all() is a no-op so the lifespan provider can iterate registered subclasses without crashing during the skeleton phase. upsert/search/delete raise NotImplementedError (Phase 2). - src/core/component/qdrant_client_factory.py: full QdrantClientFactory with env-driven get_qdrant_config(prefix=...), per-alias client caching, named clients, and graceful shutdown. https=Optional[bool] preserves qdrant-client's URL-scheme TLS detection; api_key=Optional[str] passes through cleanly without empty-string coercion. Registered as @component(primary=False) so the milvus factory remains the default until cutover. - Empty __init__.py for new oxm/qdrant and tenants/.../oxm/qdrant packages. Next: qdrant_lifespan.py (gated by VECTOR_STORE_BACKEND env flag) and full collection-base impl.

…D env Provides FastAPI lifespan startup/shutdown for the Qdrant adapter, analogous to MilvusLifespanProvider but no-op unless VECTOR_STORE_BACKEND=qdrant. So the Milvus backend stays the default at runtime until cutover. On startup (when active): - Resolves the 'qdrant_client_factory' DI bean. - Collects all concrete QdrantCollectionBase subclasses via get_all_subclasses. - Groups them by _DB_USING and ensures the client per group, then runs ensure_all() on each collection (currently a stub no-op; will create collections + payload indexes in Phase 1.2). On shutdown: closes all cached Qdrant clients. Cleans the same app.state attributes pattern as the milvus provider. Order=19 sits between milvus_lifespan (18) and business_lifespan (20), so during cutover both backends can briefly coexist.

Generic[QdrantCollectionType] ABC with an abstract @classmethod from_mongo() that subclasses implement to convert Mongo source docs into Qdrant point payloads (PointStruct or compatible dicts). get_qdrant_model() introspects the Generic argument from __orig_bases__ so the bound collection class can be retrieved at runtime — same pattern as the Milvus base converter, which the search-repository layer relies on.

…query_points() Replaces the Phase 1.1 stub with a full collection-management base class: - IndexConfig as @DataClass: size, distance, on_disk, hnsw_m, hnsw_ef_construct, payload_indexes (dict field_name -> schema_type). to_vectors_config() builds the qdrant_client VectorParams + HnswConfigDiff. - Module-level _DISTANCE_MAP and _PAYLOAD_SCHEMA_TYPE_MAP translate string configs to SDK enums — subclasses stay decoupled from the SDK. - QdrantCollectionBase methods: client() (lazy DI lookup), exists(), count(), ensure_collection() (idempotent), ensure_payload_indexes() (idempotent), ensure_all(), upsert(), search() (qdrant-client query_points wrapper), delete(), drop(). - Compared to MilvusCollectionBase this is ~half the LOC because Qdrant has no alias mechanism — collection names are direct. CRITICAL fix in the same commit: qdrant-client 1.16.1 (the version resolved against our >=1.12,<2 pin) removed the legacy QdrantClient.search method; only query_points is available now. The search() wrapper here calls query_points(query=..., ...) and unwraps QueryResponse.points so call sites still get List[ScoredPoint]. uv.lock is regenerated to include qdrant-client (1.16.1) and its transitive deps (h2, hpack, hyperframe, portalocker).

…t naming Adds the multi-tenancy adapter for Qdrant. Deliberately schlanker than the Milvus counterpart (~270 LOC total vs ~750 LOC for milvus' tenant layer) because Qdrant has no alias mechanism and no partition_key feature — multi- tenancy is realized via collection-per-tenant naming alone. src/core/tenants/tenantize/oxm/qdrant/config_utils.py: - get_tenant_aware_collection_name(original_name): resolves the final Qdrant collection name from the active tenant context. Lookup order is storage_info['qdrant'] -> storage_info['milvus'] (migration bridge, reuses the same collection_prefix for both backends until per-tenant qdrant config is wired) -> base resource prefix fallback. - get_qdrant_connection_cache_key(config): builds a stable factory cache key, hashing api_key fingerprints (8 hex chars) so the raw key never appears in the cache identifier. - _load_qdrant_env(prefix): env-fallback loader for tenant-aware connection routing. Currently exported as private until the routing layer consumes it; documented to avoid dead-code flags. src/core/tenants/tenantize/oxm/qdrant/tenant_aware_qdrant_collection_with_suffix.py: - TenantAwareQdrantCollectionWithSuffix(QdrantCollectionBase): overrides the name property to return tenant-prefixed + optional explicit suffix. The collection base remains unchanged; concrete Phase-2 collections inherit from this class instead of QdrantCollectionBase directly. - __init__(suffix) accepts explicit override or falls back to the SELF_QDRANT_COLLECTION_NS env-var. - _MULTI_TENANT_STRATEGY ClassVar is informational; a future version may opt into Qdrant's native payload-partitioning instead of separate collections per tenant.

…ollectionBase Wraps the sync qdrant-client API via asyncio.to_thread so the repository surface stays async (parity with the milvus repository layer). Methods: - upsert(point) -> str (point id, parity with milvus 'insert' returning entity_id) - upsert_batch(points) -> UpdateResult (full result, exposes wait-status) - find_by_id(id) / find_by_ids(ids) -> Optional[Record] / List[Record] - delete_by_id(id) -> bool; delete_batch(ids) -> UpdateResult - search(query_vector, limit, query_filter, ...) -> List[ScoredPoint] - count(exact=True) -> int - collection: lazy-instantiated QdrantCollectionBase subclass - get_model_name() -> str Error-handling semantics mirror the milvus counterpart: - upsert / upsert_batch / delete_batch / search → log + raise on failure - find_by_id / find_by_ids / delete_by_id → log + return None / False (resilient read path) Subclasses bind the generic parameter to the concrete collection model class: class EpisodicMemoryRepository( BaseQdrantRepository[EpisodicMemoryCollection] ): def __init__(self): super().__init__(EpisodicMemoryCollection)

…+converter First two of six Qdrant collections matching the Milvus search adapter layout under src/infra_layer/adapters/out/search/. Subsequent commits add the remaining four (agent_skill, foresight, atomic_fact, user_profile). Collections (src/infra_layer/adapters/out/search/qdrant/memory/): - episodic_memory_collection.py: TenantAwareQdrantCollectionWithSuffix subclass, base name v1_episodic_memory. Vector 1024-dim Cosine, HNSW m=16 ef_construct=200. Payload indexes: user_id/group_id/session_id/ parent_id/parent_type/type (keyword) + timestamp (integer, epoch ms). - agent_case_collection.py: base name v1_agent_case. Same vector params; payload indexes minus participants/type, plus timestamp as epoch seconds (parity with milvus converter). Converters (src/infra_layer/adapters/out/search/qdrant/converter/): - episodic_memory_qdrant_converter.py: BaseQdrantConverter[ EpisodicMemoryCollection]. from_mongo() builds a PointStruct from a MongoDB v1_episodic_memories document. Handles missing optional fields (sender_ids, type, subject, summary) and serializes the search_content list to JSON for downstream search service consumption. - agent_case_qdrant_converter.py: BaseQdrantConverter[AgentCaseCollection]. Maps AgentCaseRecord -> PointStruct, truncates task_intent to 5000 chars (Milvus parity), uses epoch-seconds timestamp. Qdrant is schema-flexible, so unlike the Milvus side there is no field schema declaration — only the vector params and the explicit payload indexes (the rest of the payload is whatever the converter writes).

…verter Adds collections 3 and 4 of six. Same TenantAwareQdrantCollectionWithSuffix + BaseQdrantConverter pattern as batch 1. Collections: - agent_skill_collection.py: base name v1_agent_skill. Payload indexes for user_id/group_id/cluster_id (keyword) + maturity_score/confidence (float) for threshold range queries. - foresight_collection.py: base name v1_foresight_record. Payload indexes for user_id/group_id/session_id/parent_id/parent_type/type (keyword) + start_time/end_time (integer, epoch ms). Converters: - agent_skill_qdrant_converter.py: AgentSkillRecord -> PointStruct. Builds the content payload from name + description, truncates to 5000 chars. Coerces optional maturity_score / confidence to 0.0 when absent (Qdrant silently excludes null-valued payloads from range filters, so treating 'unscored' as 'lowest score' keeps them visible to threshold queries). - foresight_qdrant_converter.py: ForesightRecord -> PointStruct. Time-field parser accepts datetime / ISO-8601 / numeric. **Diverges from Milvus template**: numeric inputs above 1e10 are treated as already-milliseconds rather than blindly multiplied by 1000 — the Milvus version corrupts already-ms inputs. content payload is intentionally passed verbatim (incl. None) for downstream sentinel semantics; documented inline.

…onverter (Phase 2 complete) Adds the final two of six Qdrant collections — Phase 2 of the Milvus->Qdrant migration is now structurally complete (6 collections + 6 converters under src/infra_layer/adapters/out/search/qdrant/). Collections: - atomic_fact_collection.py: base name v1_atomic_fact_record. Payload indexes for user_id/group_id/session_id/parent_id/parent_type/type (keyword) + timestamp (integer, epoch ms). - user_profile_collection.py: base name v1_user_profile. No session_id (user-level aggregation). Payload indexes for user_id/group_id/scenario/ item_type (all keyword). Converters: - atomic_fact_qdrant_converter.py: AtomicFactRecord -> PointStruct. Falls back to RawDataType.CONVERSATION when source.type is absent. exc_info on error log. - user_profile_qdrant_converter.py: **diverges from the other converters' return type** — returns List[Dict[str, Any]] (one item per explicit_info / implicit_trait / user_goal entry) for parity with the Milvus counterpart. ProfileIndexer downstream wraps each item into a PointStruct after embedding. Module-level _EXPLICIT_FIELDS and _IMPLICIT_FIELDS constants (Milvus version had them inline). Carries '# type: ignore[override]' on from_mongo with docstring justification. Phase 2 file inventory: src/infra_layer/adapters/out/search/qdrant/ ├── __init__.py ├── memory/ │ ├── __init__.py │ ├── episodic_memory_collection.py │ ├── agent_case_collection.py │ ├── agent_skill_collection.py │ ├── foresight_collection.py │ ├── atomic_fact_collection.py │ └── user_profile_collection.py └── converter/ ├── __init__.py ├── episodic_memory_qdrant_converter.py ├── agent_case_qdrant_converter.py ├── agent_skill_qdrant_converter.py ├── foresight_qdrant_converter.py ├── atomic_fact_qdrant_converter.py └── user_profile_qdrant_converter.py

@repository

…ories First 2 of 6 Qdrant repositories, mirroring the surface of the corresponding Milvus repositories so the search-service layer can swap backends via the VECTOR_STORE_BACKEND env flag. Both repositories: - Inherit from BaseQdrantRepository[<Collection>], registered as @repository(name='..._qdrant_repository', primary=False). - Build filters as qmodels.Filter(must=[FieldCondition(...)]) using MatchValue / MatchAny / Range instead of Milvus' string expression syntax. Filter-construction is fully typed — no injection vector remains in the search path. - Honour the MAGIC_ALL sentinel for user_id / group_id with the same semantics as the Milvus repositories. - Use a two-stage score gating pattern (server-side score_threshold via Qdrant + client-side post-filter at the caller's hard threshold); the rationale is documented inline so future readers don't read it as a duplicated check. AgentSkillQdrantRepository: - vector_search() with maturity_threshold / confidence_threshold range filters plus optional cluster_id / group_ids. - delete_by_cluster_id() — uses scroll() for a best-effort delete count (Qdrant's filter-based delete doesn't return one) then deletes via FilterSelector(filter=...). EpisodicMemoryQdrantRepository: - create_and_save_episodic_memory() — convenience constructor that builds a PointStruct and upserts, returns the same lightweight summary dict as the Milvus repository for caller parity. - vector_search() with full scope + time-range filters. - delete_by_filters() — batch delete by user_id/group_id/time-range; same MAGIC_ALL guard and 'at least one filter required' contract as the Milvus repository.

Resolves the full set of findings from a CodeRabbit code review on feature/qdrant-adapter (1 critical, 10 major, 19 minor). **Critical** - episodic_memory_qdrant_repository.delete_by_filters: scroll with hard- coded limit=10_000 could undercount large tenants. Replaced with an exact client.count() call so the returned delete-count reflects the full set. **Major** - agent_skill_qdrant_repository.delete_by_cluster_id: same scroll-limit bug. Same fix (exact count + filter delete) and re-raise on error. - episodic_memory_qdrant_repository.vector_search: two-stage score gating now uses min(radius, score_threshold) for server-side filtering so a wider radius doesn't accidentally make the server cut stricter than the client-side post-filter. - AgentCase / AgentSkill / EpisodicMemory / Foresight / AtomicFact converters: explicit non-empty vector validation (raise instead of silently writing an empty list) and explicit id None-guard (no more str(None) -> 'None' point ids). - foresight_qdrant_converter._parse_time_field: 'if not time_value' treated epoch 0 as missing. Now 'if time_value is None'. - atomic_fact_qdrant_converter: _build_search_content is now actually written into the point payload (was dead code); vector access uses getattr defensive; type fallback only when source.type is None rather than any falsy value. - base_repository.find_by_id / find_by_ids / delete_by_id: stop swallowing all exceptions. Errors are logged and re-raised so callers can distinguish 'not found' from operational failures. Behaviourally consistent with the rest of the base methods (upsert/delete_batch/ search) that already raised on failure. - qdrant_collection_base.drop: log + re-raise instead of swallow. - config_utils._load_qdrant_env and qdrant_client_factory.get_qdrant_config: safe int(QDRANT_PORT) with try/except + TCP range guard (1-65535). - qdrant_client_factory: URL assembly preserves an already-schemed host verbatim (e.g. 'https://my-qdrant.cloud') instead of force-prefixing http:// and double-appending the port. - qdrant_client_factory.get_client: threading.Lock with double-checked locking eliminates the cache-miss race that could create duplicate QdrantClient instances under concurrent FastAPI requests. **Minor** - 6x collection docstrings: 'dim=1024' -> 'dim=VECTORIZE_DIMENSIONS' (no more drift if the constant changes). - 2x ValueError messages: 'cannot be empty' -> 'cannot be None' to match the actual 'is None' guard. - config_utils.get_qdrant_connection_cache_key: api_key.encode now tolerates bytes/str/other. - agent_skill_qdrant_repository.vector_search: 'user_id is None' now skips the filter entirely instead of matching the empty string. - qdrant_client_factory.get_named_client: cache key normalized via .lower() so 'Default'/'DEFAULT'/'default' share one client. - qdrant_client_factory.get_qdrant_config: 'https_raw' now uses the _env helper consistently with the other env vars. - 2x qdrant_collection_base: 'assert cfg is not None' replaced with explicit RuntimeError so the guard survives python -O. Total: 17 files changed, ~30 distinct fixes.

Adds repositories 3 and 4 of 6. Both apply all CodeRabbit-derived patterns established in the previous Phase 2.5 fix-pass: - user_id/MAGIC_ALL guard skips the filter when None/empty (no spurious user_id == '' match). - Two-stage score gating: server-side passes min(radius, score_threshold) so a wider radius cannot accidentally tighten the cut; client-side post- filter enforces the hard caller minimum. - client.count(exact=True) for delete-by-filter return values (not a bounded scroll page). - All error paths re-raise after a structured log. AgentCaseQdrantRepository: - vector_search() with scope (user_id / session_id / group_ids / parent_id) and time-range filters in epoch seconds (parity with the AgentCase converter and the Milvus repository — agent_case is the one collection storing seconds, not milliseconds). - Returns a datetime for timestamp (round-tripped from epoch seconds with tz=UTC). AtomicFactQdrantRepository: - create_and_save_atomic_fact() convenience constructor: builds the PointStruct (with empty vector validation), upserts, returns the Milvus-shaped summary dict for caller parity. - vector_search() with full scope + time-range filters in epoch ms. - batch_vector_search_by_parent_ids(): MRAG-Phase-3 expansion path — MatchAny over parent_ids with total_limit = limit * len(parent_ids). Returns early with an empty list when no parent_ids are passed. - delete_by_filters(): uses exact count + filter-based delete; raises on any operational error so callers can distinguish 'no points' from failure. - All search paths return datetime for timestamp (consistent with create_and_save_atomic_fact, parity with agent_case).

…s (Phase 2.5 complete) Final 2 of 6 Qdrant repositories. Phase 2.5 is now structurally complete: all six adapters (agent_skill, episodic_memory, agent_case, atomic_fact, foresight, user_profile) exist as Qdrant repositories matching their Milvus counterparts. All four established Phase-2.5 patterns applied: - user_id/MAGIC_ALL guard skips the filter on None/empty - Two-stage score gating: min(radius, score_threshold) server-side - client.count(exact=True) for delete-by-filter return values - Re-raise after structured log (no swallowed errors) ForesightQdrantRepository: - create_and_save_foresight_mem(): convenience constructor + upsert. session_id is now an explicit parameter (writes into payload) so the matching vector_search(session_id=...) filter actually hits — the Milvus repository signature lacked this parameter and produced a silent zero-hit filter. - vector_search() with scope (user_id/group_ids/session_id/sender_id/ parent_type/parent_id) + time-range filters. **Diverges from Milvus**: filters on start_time/end_time payload fields (semantically correct range overlap) instead of the Milvus repository's non-existent 'timestamp' field. Documented inline. sender_id filter uses Qdrant's element-wise MatchValue on the sender_ids array — equivalent to Milvus' array_contains. - delete_by_filters() with the same start_time/end_time semantics. UserProfileQdrantRepository: - vector_search() with user_id/group_id/scenario scoping (no session_id — user_profile is user-level aggregation). - delete_by_user_group(): count + filter-based delete, raises on operational error (consistent with the Phase 2.5 fix-pass on base_repository).

CodeRabbit pass 2 found 7 follow-up findings after pass 1. All addressed: **Major** - Timezone-naive datetimes in time-range filters silently used the local timezone for .timestamp() conversion, producing wrong epoch values. Added module-level helpers in base_repository: to_epoch_ms(dt) and to_epoch_s(dt) which coerce tz-naive datetimes to UTC. All five repositories now use these helpers consistently: - agent_case (epoch seconds via to_epoch_s) - atomic_fact / episodic_memory / foresight (epoch ms via to_epoch_ms) - user_profile (no time fields, untouched) **Minor** - foresight create_and_save_foresight_mem: start_time/end_time fall back to None instead of 0 when missing — 0 would silently match epoch-1970 records. Documented inline. - base_repository.count(): wrapped in try/except + structured log to match the rest of the async methods. - atomic_fact create_and_save: vector validation now explicit None/empty check (`if vector is None or len(vector) == 0`) instead of `if not vector`, so a legitimate all-zero embedding is no longer falsy-rejected. - user_profile_qdrant_repository.vector_search: group_id and scenario filters now mirror user_id's MAGIC_ALL guard (skip filter on MAGIC_ALL sentinel) instead of treating MAGIC_ALL as a literal value to match. Note: the foresight repository's two-stage score-gating pattern was flagged as 'redundant filtering'. It is intentional — server-side uses the more permissive bound (radius widening) and the client-side post- filter enforces the caller's hard cut. The behaviour is documented in the inline comment block; the CodeRabbit finding is a false positive.

… Qdrant) Adds devops_scripts/migrate_milvus_to_qdrant.py, the workhorse for the Phase 3 cutover. Standalone: no EverOS DI container required. Reads OpenRouter, Mongo, Qdrant config from env (auto-loads .env via python-dotenv when present). Migrates one (mongo-db, mongo-collection) pair to one Qdrant collection per invocation; shell-loop over the six EverOS collection types × N tenants for the full sweep. Defaults match the documented xinfty stack: - VECTORIZE_MODEL=qwen/qwen3-embedding-8b - VECTORIZE_DIMENSIONS=4096 - OPENROUTER_BASE_URL=https://openrouter.ai/api/v1 - MONGO_URI=mongodb://localhost:27017 - QDRANT_HOST=localhost, QDRANT_PORT=6333 CLI args expose the per-collection variation: --text-field primary text used for embedding --extra-text-fields comma-separated secondary text fields --timestamp-field + --timestamp-unit ms|s --payload-fields comma-separated mongo fields projected to qdrant payload --batch-size embedding batch size (default 32) --limit cap for smoke tests --force re-embed and overwrite existing points --dry-run count without calling OpenRouter or Qdrant.upsert --log-level DEBUG/INFO/WARNING/ERROR Idempotent by default: client.retrieve filters out point ids that already exist in the target Qdrant collection (skip path); --force overwrites them. Embedded behaviour: - extract_text concatenates text_field + extra_text_fields with newlines - build_payload projects payload_fields + normalizes timestamp via datetime.timestamp() (epoch ms or s depending on --timestamp-unit) - search_content is JSON-serialized from the text pieces, mirroring the converter's payload shape used by EverOS' search service. ensure_qdrant_collection creates the target collection with the same HNSW/Cosine config the EverOS adapter writes (m=16, ef_construct=200, distance=Cosine) so the schema matches what the live service expects.

Qdrant only accepts unsigned-int or RFC-4122-UUID point ids; the Mongo ObjectId hex (e.g. 69ed6acfaf31e5cd7977bc56) is neither and the live pilot hit a 400 Bad Request from the Qdrant retrieve endpoint. Fix: - base_repository: add mongo_id_to_qdrant_id() helper that does str(uuid.uuid5(NAMESPACE, str(mongo_id))). Namespace is a fixed UUID embedded in code (must never change without a full re-migration). - migrate_milvus_to_qdrant.py: use the helper for the Qdrant point id; keep the original Mongo id in the payload as 'mongo_id' for reverse lookup. The Phase-2 converters (search/qdrant/converter/*) still use str(source_doc.id) and would fail the same way the moment they go live. That fix is the next commit.

ModuleNotFoundError: 'core' when invoked as python src/devops_scripts/migrate_milvus_to_qdrant.py ... Fix: prepend the script's parent-of-parent (the EverOS src/ tree) to sys.path before the core.oxm.qdrant.base_repository import. Lets the script run without PYTHONPATH or pip install.

The Phase 2 converters used ``id=str(source_doc.id)`` directly when building ``PointStruct`` payloads. Qdrant only accepts unsigned integers or RFC-4122 UUIDs as point ids — a 24-hex-char Mongo ``ObjectId`` is neither, so the Live-indexing path produced a 400 Bad Request on upsert. The standalone re-embed CLI (commit c17ba60) already routed Mongo ids through ``mongo_id_to_qdrant_id`` (uuid5 over a stable namespace), so the two paths now agree: - standalone migrate: Mongo doc -> uuid5 point id - live converter: Mongo doc -> uuid5 point id (this commit) Both also persist the raw Mongo id in the payload as ``mongo_id`` for round-trip lookup, idempotent re-embed, and debugging. ``user_profile_qdrant_converter`` is intentionally left untouched: it emits multiple points per source doc (one per explicit_info/implicit_trait entry), assigns fresh ObjectIds, and has its own Phase 3.1 path that needs a separate point-id scheme.

Adds ``re_embed_sweep.py`` next to the standalone ``migrate_milvus_to_qdrant`` workhorse. The wrapper iterates every active (non-hyphen) Mongo database times every supported collection type and invokes ``migrate(...)`` for each non-empty pair. Five collection types are covered: - episodic_memory (v1_episodic_memories -> <prefix>_v1_episodic_memory) - atomic_fact (v1_atomic_fact_records -> <prefix>_v1_atomic_fact_record) - foresight (v1_foresight_records -> <prefix>_v1_foresight_record) - agent_case (v1_agent_cases -> <prefix>_v1_agent_case) - agent_skill (v1_agent_skills -> <prefix>_v1_agent_skill) ``v1_user_profiles`` is deliberately excluded — it needs per-doc splitting (one source doc -> many Qdrant points), handled by a separate Phase 3.1 script. CLI shape mirrors the workhorse: ``--tenant``, ``--collection``, ``--batch-size``, ``--limit-per-pair``, ``--force``, ``--dry-run``, ``--log-level``. The wrapper imports ``migrate`` directly (no subprocess fan-out) so config is read once and progress logs interleave naturally.

Critical - ``migrate_milvus_to_qdrant`` now wraps the migration body in try/finally and closes ``mongo``, ``qdrant``, and ``openai`` clients explicitly. Long sweeps previously leaked connections on every per-pair invocation. Major - ``QdrantCollectionBase.exists`` only catches the qdrant-client transport exceptions (``ResponseHandlingException``, ``UnexpectedResponse``); other failures propagate so infrastructure issues stay visible instead of being silently treated as "collection missing". - ``EpisodicMemoryQdrantRepository.create_and_save_episodic_memory`` rejects missing/empty vectors up front with a ``ValueError`` (mirrors the converter contract) instead of bubbling up a confusing 400 from Qdrant. - ``EpisodicMemoryQdrantRepository.vector_search`` no longer treats the default ``user_id=None`` as a filter on the empty string. The condition now requires the caller to have provided an explicit (non-None, non-MAGIC_ALL) value, restoring full-recall behaviour for the unscoped search path. - ``AtomicFactQdrantRepository.vector_search`` now returns ``atomic_fact`` in the result dict (parity with the batch path and with the persisted payload from ``create_and_save_atomic_fact``). Minor - ``AgentSkillQdrantRepository.vector_search`` two-stage threshold now uses ``min(radius, score_threshold)`` server-side, matching the agent_case repository's "more permissive lower bound" semantics. Nitpick - ``logger.error`` in every except block touched by Phase 2/3 swapped for ``logger.exception`` (10 sites across converters, repositories, and the sweep wrapper) so the stack trace is always logged. - ``build_payload`` now logs a warning when a timestamp field has an unexpected type instead of silently dropping the value.

coderabbitai · 2026-05-13T21:18:34Z

Warning

Rate limit exceeded

@Ptah-CT has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 34 minutes and 19 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 653ab36c-576b-461c-92a2-bd823051d450

📥 Commits

Reviewing files that changed from the base of the PR and between 51025b2 and 5d1be9c.

📒 Files selected for processing (9)

README.md
methods/EverCore/src/core/lifespan/qdrant_lifespan.py
methods/EverCore/src/core/oxm/qdrant/base_repository.py
methods/EverCore/src/core/oxm/qdrant/qdrant_collection_base.py
methods/EverCore/src/core/tenants/tenantize/oxm/qdrant/config_utils.py
methods/EverCore/src/devops_scripts/migrate_milvus_to_qdrant.py
methods/EverCore/src/devops_scripts/re_embed_sweep.py
methods/EverCore/src/infra_layer/adapters/out/search/repository/atomic_fact_qdrant_repository.py
methods/EverCore/src/infra_layer/adapters/out/search/repository/foresight_qdrant_repository.py

📝 Walkthrough

Walkthrough

Dieses PR implementiert das Qdrant-Vector-Backend: client-factory, Lifespan-Provider, tenant-aware Konfiguration, Collection-Basics, Domänen‑Konverter, async Repositories und Migrations-CLI-Tools.

Changes

Qdrant Vector Backend Migration

Layer / File(s)	Summary
Docs and Dependency `README.md`, `methods/EverCore/pyproject.toml`	Top-of-file README-Note zur Milvus→Qdrant-Fork; `qdrant-client>=1.12,<2` als neue Abhängigkeit.
Tenant-aware config utilities `methods/EverCore/src/core/tenants/tenantize/oxm/qdrant/config_utils.py`	Tenant-Auflösung für Qdrant storage, fallback auf Milvus-Entry, tenant-prefixed collection-name resolver und deterministic connection cache key; sichere env-parsing.
Qdrant client factory and config `methods/EverCore/src/core/component/qdrant_client_factory.py`	Env-get_qdrant_config, boolean/port parsing, timeout clamping; QdrantClientFactory with alias-normalized cache, double-checked locking, get_default/get_named and close_all_clients.
FastAPI Lifespan provider `methods/EverCore/src/core/lifespan/qdrant_lifespan.py`	LifespanProvider gated by VECTOR_STORE_BACKEND; discovers QdrantCollectionBase subclasses, groups by `_DB_USING`, creates named clients and calls `ensure_all()`; shutdown closes clients and cleans state.
Collection base & IndexConfig `methods/EverCore/src/core/oxm/qdrant/qdrant_collection_base.py`	`IndexConfig` dataclass, SDK-mappings, `QdrantCollectionBase` with `exists`, `ensure_collection`, `ensure_payload_indexes`, `ensure_all`, plus upsert/search/delete/drop wrappers.
Base repository & helpers `methods/EverCore/src/core/oxm/qdrant/base_repository.py`	Deterministic `mongo_id_to_qdrant_id` (UUID5), `to_epoch_ms`/`to_epoch_s`, `compute_effective_threshold`, and async `BaseQdrantRepository` with thread-wrapped CRUD/search/count.
BaseQdrantConverter `methods/EverCore/src/core/oxm/qdrant/base_converter.py`	Abstract generic converter base with runtime generic-type extraction and `from_mongo` contract.
Tenant-aware collection with suffix `methods/EverCore/src/core/tenants/tenantize/oxm/qdrant/tenant_aware_qdrant_collection_with_suffix.py`	Collection-per-tenant naming with optional suffix resolved from constructor or `SELF_QDRANT_COLLECTION_NS`; exposes `name`, `base_name`, `suffix`.
Domain converters `methods/EverCore/src/infra_layer/adapters/out/search/qdrant/converter/*`	Converters for AgentCase, AgentSkill, AtomicFact, EpisodicMemory, Foresight, UserProfile to produce Qdrant PointStructs or payload items with validation, timestamp normalization and `search_content`.
Collection definitions (memory) `methods/EverCore/src/infra_layer/adapters/out/search/qdrant/memory/*`	Tenant-aware collection subclasses with `_COLLECTION_NAME` and `_VECTOR_PARAMS` (IndexConfig), payload index declarations and docstrings.
Async Qdrant repositories `methods/EverCore/src/infra_layer/adapters/out/search/repository/*_qdrant_repository.py`	Repositories implementing `vector_search` (Qdrant Filter construction, server-side effective threshold, client-side cutoff), create/save methods, batch helpers and delete-by-filters using exact count+delete.
Migration CLI tools `methods/EverCore/src/devops_scripts/migrate_milvus_to_qdrant.py`, `methods/EverCore/src/devops_scripts/re_embed_sweep.py`	Single-collection migration CLI with OpenRouter-compatible embeddings, idempotent ID handling, batching and dry-run; sweep script enumerates tenants and runs migrations per spec.

Sequence Diagram(s)

sequenceDiagram
  participant App as FastAPI App
  participant Lifespan as QdrantLifespanProvider
  participant Factory as QdrantClientFactory
  participant Discovery as CollectionDiscovery
  participant Collection as QdrantCollectionBase
  participant Qdrant as QdrantClient

  App->>Lifespan: startup(VECTOR_STORE_BACKEND=qdrant)
  Lifespan->>Discovery: find QdrantCollectionBase subclasses
  Discovery-->>Lifespan: grouped by _DB_USING
  Lifespan->>Factory: get_named_client(using)
  Factory->>Qdrant: create/cached QdrantClient
  Lifespan->>Collection: ensure_all()
  Collection->>Qdrant: collection_exists/create_collection/create_payload_index
  App->>Lifespan: shutdown
  Lifespan->>Factory: close_all_clients()
  Factory->>Qdrant: close() per client

sequenceDiagram
  participant Repo as Repository
  participant Thread as asyncio.to_thread
  participant Collection as QdrantCollectionBase
  participant Qdrant as QdrantClient

  Repo->>Repo: build Qdrant Filter & effective_threshold
  Repo->>Thread: call collection.search(query, filter, score_threshold)
  Thread->>Collection: search()
  Collection->>Qdrant: query_points(...)
  Qdrant-->>Collection: response.points
  Thread-->>Repo: List[ScoredPoint]
  Repo->>Repo: apply client-side cutoff, map payload, convert timestamps
  Repo-->>Client: results

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 Ich hüpf' durch Code und Wald so sacht,
Qdrant bringt Ordnung über Nacht,
Tenant-Names wie Möhren fein,
Converter, Repos, alles rein,
Migration läuft — hurra, geschafft!

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch qdrant/rebase-evercore

coderabbitai

Actionable comments posted: 19

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@methods/EverCore/src/core/component/qdrant_client_factory.py`:
- Around line 88-92: Die Timeout-Parsing-Logik in QdrantClientFactory (z.B. in
der Methode create_qdrant_client / der Stelle, wo die Variable timeout geparst
wird) muss neben TypeError/ValueError auch auf gültige Werte prüfen: erzwinge
dass der geparste timeout ein positiver, sinnvoller Wert ist (z.B. timeout >= 1
und <= MAX_TIMEOUT_SECONDS wie 300) und bei Verletzung dieser Range entweder
eine ValueError werfen oder auf einen Default-Wert zurückfallen; passe die
bestehende try/except-Logik an, sodass negative, 0 oder extrem große Werte nicht
stillschweigend akzeptiert werden (verwende eindeutige Namen wie timeout,
MAX_TIMEOUT_SECONDS und die Factory-Methode create_qdrant_client zur
Lokalisierung).
- Around line 94-103: The comment claiming the qdrant-client SDK will infer
scheme when https=None is wrong because the code constructs url =
f"{scheme}://{host}:{port}" which forces the scheme; update the factory logic in
qdrant_client_factory (e.g., the function creating the client such as
create_qdrant_client / QdrantClientFactory) so that when QDRANT_HTTPS is unset
(https is None) you do NOT build a full url string but instead pass host and
port (and other kwargs like prefer_grpc/token) directly to the Qdrant client
constructor so the SDK can infer scheme, or alternately correct the inline
comment to accurately state current behavior; ensure you reference the
https/QDRANT_HTTPS flag, the variables scheme, url, host and port when making
the change.

In `@methods/EverCore/src/core/lifespan/qdrant_lifespan.py`:
- Around line 123-124: Der Shutdown/Cleanup darf nicht von einem Env-Flag
abhängen; ändere die Logik in der qdrant_lifespan.py so dass die
Shutdown-/cleanup-Methode (z.B. shutdown, cleanup oder __aexit__/__exit__) nicht
`os.environ` oder ein Environment-Flag abfragt, sondern den tatsächlichen
Initialisierungszustand prüft: setze beim erfolgreichen Initialisieren (z.B. in
initialize, start oder __aenter__) ein eindeutiges Attribut wie
self._initialized = True und/oder self._client (z.B. self._qdrant_client) und
prüfe beim Beenden ausschließlich dieses Attribut (und setzte es nach dem
Schließen auf False), sodass Schließen idempotent und nicht von zwischenzeitlich
geänderten Env-Variablen abhängig ist; stelle sicher, dass alle Referenzen im
Shutdown/Cleanup (z.B. die Existenz von self._qdrant_client oder
self._initialized) verwendet werden, statt das Env-Flag zu lesen.
- Around line 117-120: Beim Startup in qdrant_lifespan.py fehlt ein Rollback:
wenn die Collection-Initialisierung (z.B. in der Methode
startup/initialize_collections oder bei Aufrufen wie
initialize_collection/init_collection) nach bereits erstellten Qdrant-Clients
fehlschlägt, werden die offenen Clients nicht geschlossen. Ändere die
Startup-Logik so, dass bei einem Fehler während der Collection-Initialisierung
alle zuvor erstellten Clients (z.B. in der Liste/Variable clients oder
self._clients, und einzelne Client-Instanzen erstellt via create_client /
_create_client) sauber geschlossen/geschlossen werden (z.B. client.close() oder
client.shutdown()) bevor der Fehler erneut geworfen wird, und stelle sicher,
dass sowohl synchroner als auch asynchroner Cleanup (await client.close() falls
async) abgedeckt ist und die Sammlung der Clients danach geleert wird.

In `@methods/EverCore/src/core/oxm/qdrant/base_converter.py`:
- Around line 66-82: The from_mongo return annotation is wrong: it declares
QdrantCollectionType but the docstring and concrete converters (e.g.,
AgentCaseQdrantConverter, AtomicFactQdrantConverter) return a
qmodels.PointStruct/dict; fix by adding a type alias (e.g., QdrantPointType =
qmodels.PointStruct | dict) in base_converter.py and change the from_mongo
signature to return that alias (and update its docstring to reference
QdrantPointType/PointStruct), ensuring all implementations' return types now
match the declared public API.

In `@methods/EverCore/src/core/oxm/qdrant/base_repository.py`:
- Around line 42-50: The code currently accepts None/empty Mongo IDs and
deterministically maps them to the same Qdrant point, causing silent collisions;
in the BaseQdrantRepository method that converts Mongo IDs to Qdrant point IDs
(e.g., to_qdrant_point_id / get_point_id_from_mongo_id), add explicit validation
that rejects None/empty strings and invalid ObjectId values (use
bson.ObjectId.is_valid or equivalent) and raise a clear exception (ValueError or
custom) instead of returning a mapped id; update callers to handle/report the
error so bad inputs are rejected early.

In `@methods/EverCore/src/core/oxm/qdrant/qdrant_collection_base.py`:
- Around line 219-244: Die Existenzprüfung mittels exists() gefolgt von
create_collection() ist nicht threadsicher — zwei parallele Prozesse können
beide False sehen und versuchen zu erstellen; ändere den Aufruf von
self.client().create_collection(...) (in der Methode, die self.name und
cfg.to_vectors_config() verwendet) so, dass der Parameter if_not_exists=True
gesetzt wird, damit create_collection idempotent wird und bei bereits
existierender Collection True zurückgibt; belasse sonstige Argumente
(collection_name=self.name, vectors_config=cfg.to_vectors_config()) unverändert.
- Around line 196-204: Die Methode collection_exists() fängt aktuell
UnexpectedResponse und maskiert damit Auth/Server-Fehler; ändere die
Fehlerbehandlung so, dass nur Transport-/Verbindungsfehler
(ResponseHandlingException) abgefangen und zu False führen, während
UnexpectedResponse (z.B. 401/403/5xx) nicht abgefangen, sondern propagiert wird,
damit ensure_collection() die tatsächlichen HTTP-Statusfehler sehen kann; passe
die except-Klausel in collection_exists() an (Referenz: collection_exists,
ensure_collection, ResponseHandlingException, UnexpectedResponse) und
entferne/ändere das aktuelle UnexpectedResponse-Handling entsprechend.
- Around line 254-256: Ersetze die Verwendung von "assert cfg is not None" durch
eine explizite Laufzeitprüfung: prüfe direkt auf "if cfg is None" und werfe eine
aussagekräftige RuntimeError-Exception (z.B. RuntimeError("collection config
missing, cannot access payload_indexes")) bevor auf cfg.payload_indexes
zugegriffen wird; siehe das Muster in ensure_collection() für Formulierung und
Platzierung der Prüfung und passe die Nachricht so an, dass sie den Kontext (z.
B. Funktionsname oder Collection-ID) enthält.

In `@methods/EverCore/src/core/tenants/tenantize/oxm/qdrant/config_utils.py`:
- Around line 111-126: The cache key generation currently only uses
host/port/url and api_key (see endpoint, api_key and sha256 usage) and thus
ignores transport flags; update the key construction to incorporate
tenant-specific transport flags (e.g., config.get("https"),
config.get("prefer_grpc") or similar keys) so the endpoint string includes
normalized values for those flags (true/false) before appending the api_key
hash, ensuring cached QdrantClient instances respect https and prefer_grpc
differences.

In `@methods/EverCore/src/devops_scripts/migrate_milvus_to_qdrant.py`:
- Around line 273-282: Initialize the openai variable to None before the try
block (just like qdrant) so it cannot be unbound in the finally; specifically
add openai: Optional[OpenAI] = None above the try, then keep the OpenAI(...)
assignment inside the try and in the finally use getattr(openai, "close", None)
safely (or check if openai is not None) to close it. Update any other similar
OpenAI usages in the file (e.g., the block referenced at lines ~400-417) to
follow the same pattern.
- Around line 210-229: Die Funktion ensure_qdrant_collection überspringt aktuell
die Schemaprüfung, wenn die Collection bereits existiert; ändere sie so dass
nach client.collection_exists(name) die bestehende Collection-Metadaten via
client.get_collection(name) gelesen werden und deren Vektor-Parameter (z.B.
response.vectors?.config.size oder entsprechendes Feld) mit dem übergebenen
vector_size verglichen werden; bei Abweichung soll die Funktion sofort mit einem
klaren Fehler-/Exception-Log abbrechen (Fail-Fast) statt die Nutzung zu
erlauben, ansonsten unverändert weiterlaufen und nur wenn die Collection gar
nicht existiert client.create_collection mit
qmodels.VectorParams(size=vector_size, distance=..., hnsw_config=...) aufrufen.
- Around line 291-294: The MongoDB cursor opened by coll.find() can time out
during long migrations; change the call to coll.find(no_cursor_timeout=True)
(preserving the existing limit logic) and ensure the cursor is always closed by
wrapping its usage in a context/cleanup block — e.g., use "with
coll.find(no_cursor_timeout=True) as cursor:" if supported or surround cursor
iteration with try…finally and call cursor.close() in finally; update any
function that iterates the cursor (the variable cursor and the coll.find call)
so long-running tenant migrations won't raise CursorNotFound.

In `@methods/EverCore/src/devops_scripts/re_embed_sweep.py`:
- Around line 256-269: The sweep currently logs failures and increments
pairs_failed but still returns exit code 0; modify main() so that after the
final logger.info (the "SWEEP DONE" summary where pairs_run,
pairs_skipped_empty, pairs_failed and elapsed are logged) you check if
pairs_failed > 0 and propagate a non‑zero exit (e.g., call sys.exit(1) or raise
SystemExit(1)); ensure sys is imported if missing. Apply the same change to the
second summary block referenced around lines 305–322 so any partial failures
cause a non‑zero exit code.

In
`@methods/EverCore/src/infra_layer/adapters/out/search/qdrant/converter/atomic_fact_qdrant_converter.py`:
- Line 78: The assignment to variable "vector" in
atomic_fact_qdrant_converter.py uses a redundant fallback; replace the
expression 'getattr(source_doc, "vector", None) or None' with just
'getattr(source_doc, "vector", None)' (locate the assignment to vector in the
AtomicFactQdrantConverter or surrounding function/method) to remove the
unnecessary "or None" and keep behavior identical.

In
`@methods/EverCore/src/infra_layer/adapters/out/search/repository/episodic_memory_qdrant_repository.py`:
- Around line 225-239: The timestamp in the search_results dict is currently
returned as a raw epoch value; update the code that builds search_results
(inside the EpisodicMemoryQdrantRepository search method that appends to
search_results) to normalize payload.get("timestamp") into a Python datetime: if
it's an int/float treat it as epoch and convert with datetime.fromtimestamp(...,
tz=timezone.utc), if it's already a datetime leave it as-is; add the necessary
import for datetime/timezone and replace the raw payload.get("timestamp")
assignment in the "timestamp" key with the normalized datetime value so this
repository matches the other Qdrant repositories' behavior.

In
`@methods/EverCore/src/infra_layer/adapters/out/search/repository/foresight_qdrant_repository.py`:
- Around line 147-152: The current time-filter logic uses containment
(payload.start_time >= start_ms && payload.end_time <= end_ms) which misses
overlapping records; update the filters in the ForesightQdrantRepository methods
that build time predicates (e.g., the search/lookup method and the
delete-by-time-range method) to use overlap semantics: use payload.end_time >=
start_time_ms AND payload.start_time <= end_time_ms (replace the existing
start_time>= / end_time<= checks), and apply the same change to any other
occurrences (the other filter-building block around the indicated ranges).
Ensure the predicates use the record field names payload.start_time and
payload.end_time and the converted query start_time_ms/end_time_ms variables.

In
`@methods/EverCore/src/infra_layer/adapters/out/search/repository/user_profile_qdrant_repository.py`:
- Around line 167-181: The logger calls in user_profile_qdrant_repository.py are
emitting raw user_id and group_id (in the block around the delete logic and the
exception handler), leaking PII; change those logger calls to avoid plaintext
IDs by replacing user_id and group_id with deterministic redacted identifiers
(e.g., a short hex digest or masked form derived from hashing the ID values) and
use the redacted values in both the success info log and the error log/exception
logging (call sites: the logger.info and logger.error inside the delete profile
items flow); ensure you do not print the original IDs anywhere in the message or
exception metadata and keep the exception logging using logger.exception or
logger.error with the redacted ids plus the exception object for debugging.

In `@README.md`:
- Around line 21-24: Unvollständige Pfadangabe für den Qdrant-Adapter in
README.md kann Entwickler irreführen; erweitere die Pfaddokumentation, indem du
explizit beide relevanten Verzeichnisse nennst: `src/core/oxm/qdrant/`
(Adapter-Schnittstellen/Domain-Mapping) und
`src/infra_layer/adapters/out/search/qdrant/` (konkrete Implementierungen wie
Converter, Collections, Repositories). Füge in README.md eine kurze Liste hinzu,
die die genannten Komponenten (Converter, Collections, Repository-Klassen) und
ihren jeweiligen Pfad erwähnt, damit die Codebase-Navigation klar und
vollständig ist.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 415cf661-a731-4134-bdc6-365c171cef35

📥 Commits

Reviewing files that changed from the base of the PR and between 29d555c and bd17eb5.

⛔ Files ignored due to path filters (1)

methods/EverCore/uv.lock is excluded by !**/*.lock

📒 Files selected for processing (34)

README.md
methods/EverCore/pyproject.toml
methods/EverCore/src/core/component/qdrant_client_factory.py
methods/EverCore/src/core/lifespan/qdrant_lifespan.py
methods/EverCore/src/core/oxm/qdrant/__init__.py
methods/EverCore/src/core/oxm/qdrant/base_converter.py
methods/EverCore/src/core/oxm/qdrant/base_repository.py
methods/EverCore/src/core/oxm/qdrant/qdrant_collection_base.py
methods/EverCore/src/core/tenants/tenantize/oxm/qdrant/__init__.py
methods/EverCore/src/core/tenants/tenantize/oxm/qdrant/config_utils.py
methods/EverCore/src/core/tenants/tenantize/oxm/qdrant/tenant_aware_qdrant_collection_with_suffix.py
methods/EverCore/src/devops_scripts/migrate_milvus_to_qdrant.py
methods/EverCore/src/devops_scripts/re_embed_sweep.py
methods/EverCore/src/infra_layer/adapters/out/search/qdrant/__init__.py
methods/EverCore/src/infra_layer/adapters/out/search/qdrant/converter/__init__.py
methods/EverCore/src/infra_layer/adapters/out/search/qdrant/converter/agent_case_qdrant_converter.py
methods/EverCore/src/infra_layer/adapters/out/search/qdrant/converter/agent_skill_qdrant_converter.py
methods/EverCore/src/infra_layer/adapters/out/search/qdrant/converter/atomic_fact_qdrant_converter.py
methods/EverCore/src/infra_layer/adapters/out/search/qdrant/converter/episodic_memory_qdrant_converter.py
methods/EverCore/src/infra_layer/adapters/out/search/qdrant/converter/foresight_qdrant_converter.py
methods/EverCore/src/infra_layer/adapters/out/search/qdrant/converter/user_profile_qdrant_converter.py
methods/EverCore/src/infra_layer/adapters/out/search/qdrant/memory/__init__.py
methods/EverCore/src/infra_layer/adapters/out/search/qdrant/memory/agent_case_collection.py
methods/EverCore/src/infra_layer/adapters/out/search/qdrant/memory/agent_skill_collection.py
methods/EverCore/src/infra_layer/adapters/out/search/qdrant/memory/atomic_fact_collection.py
methods/EverCore/src/infra_layer/adapters/out/search/qdrant/memory/episodic_memory_collection.py
methods/EverCore/src/infra_layer/adapters/out/search/qdrant/memory/foresight_collection.py
methods/EverCore/src/infra_layer/adapters/out/search/qdrant/memory/user_profile_collection.py
methods/EverCore/src/infra_layer/adapters/out/search/repository/agent_case_qdrant_repository.py
methods/EverCore/src/infra_layer/adapters/out/search/repository/agent_skill_qdrant_repository.py
methods/EverCore/src/infra_layer/adapters/out/search/repository/atomic_fact_qdrant_repository.py
methods/EverCore/src/infra_layer/adapters/out/search/repository/episodic_memory_qdrant_repository.py
methods/EverCore/src/infra_layer/adapters/out/search/repository/foresight_qdrant_repository.py
methods/EverCore/src/infra_layer/adapters/out/search/repository/user_profile_qdrant_repository.py

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bd17eb5303

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

DerAuctor · 2026-05-13T21:45:54Z

@claude

claude · 2026-05-13T21:46:10Z

Claude finished @DerAuctor's task in 4m 7s —— View job

PR Review

Gather context from PR description and existing reviews
Read key source files
Validate CodeRabbit/Codex findings against actual code
Write comprehensive review

This is a substantial PR (5107 lines, 35 files) adding a full Qdrant vector backend as an alternative to Milvus. The architecture is sound and follows existing patterns well. Below are issues grouped by severity, with validation notes on the existing bot findings.

🔴 Blockers

1. VECTORIZE_DIMENSIONS default mismatch will corrupt migrations

migrate_milvus_to_qdrant.py:110 defaults to 4096:

vectorize_dimensions=int(os.environ.get("VECTORIZE_DIMENSIONS", "4096")),

But memory_layer/constants.py:4 defaults to 1024:

VECTORIZE_DIMENSIONS = int(os.getenv("VECTORIZE_DIMENSIONS", "1024"))

And all six Qdrant collection classes use size=VECTORIZE_DIMENSIONS (runtime default: 1024). A migration run without setting VECTORIZE_DIMENSIONS explicitly will create 4096-dim Qdrant collections and points that are immediately incompatible with the 1024-dim runtime service. Every subsequent upsert or search will fail with a dimension-mismatch error. The default should be 1024. Fix this →

2. openai can be unbound in finally, masking real exceptions

migrate_milvus_to_qdrant.py:279 assigns openai inside the try block — after qdrant is constructed. The finally at line 404 references openai unconditionally:

close_fn = getattr(openai, "close", None)   # NameError if QdrantClient(...) threw

If the QdrantClient(...) constructor raises (e.g., unreachable host), openai is never bound. The resulting NameError in finally propagates and completely shadows the real connection error. Initialize it to None before the try like qdrant is. Fix this →

3. radius is silently discarded when score_threshold is at its default 0.0

All six vector_search methods share this logic (episodic_memory_qdrant_repository.py:203):

if radius is not None and radius > -1.0:
    effective_threshold = min(radius, score_threshold)   # = min(0.7, 0.0) = 0.0

When score_threshold=0.0 (the parameter default), min(radius, 0.0) = 0.0 → passed to Qdrant as None (no server filter), and the client-side check point.score < 0.0 never triggers either. A caller that passes radius=0.7 expecting only high-quality results gets everything back. The comment describes the intent correctly but the code breaks it for the most common call pattern. Fix: when score_threshold == 0.0, use radius as both the server-side threshold and the client-side hard cut-off.

🟠 High severity

4. Singleton-scoped repository caches the first tenant's collection name

base_repository.py:98-103 — the collection property lazily instantiates TenantAwareQdrantCollectionWithSuffix, which resolves the tenant-prefix at construction time and stores it as self._resolved_name. If the DI container caches @repository beans (singleton scope, which is the default in most Spring-style containers), the first request's tenant will "own" the cached collection for all subsequent tenants. Requests from a different tenant would silently read/write the wrong collection, leaking data across tenant boundaries. Verify the @repository bean scope, and if singleton, change the collection property to re-instantiate per-call or accept the tenant context at call time.

5. exists() swallows auth and 5xx errors

qdrant_collection_base.py:198 catches UnexpectedResponse:

except (ResponseHandlingException, UnexpectedResponse) as e:
    return False

The docstring on that same method (lines 189-194) explicitly states that auth failures should propagate. Per the Qdrant client source, collection_exists() raises UnexpectedResponse for 401/403/5xx responses — these get silently converted to return False, causing ensure_collection() to then try to create a collection, which produces an even more confusing follow-up error at startup. Change the except clause to only catch ResponseHandlingException (transport/timeout), and let UnexpectedResponse propagate. Fix this →

6. Migration sweep always exits 0, even on partial failures

re_embed_sweep.py:258-269: failed pairs increment pairs_failed but main() returns 0 unconditionally. A nightly sweep that migrated 80% of tenants before hitting a rate limit would report success to CI. Change sweep() to return pairs_failed, and have main() return 1 when pairs_failed > 0. Fix this →

7. Numeric timestamps not unit-converted during migration

migrate_milvus_to_qdrant.py:184-185:

elif isinstance(ts_value, (int, float)):
    payload[timestamp_field] = int(ts_value)  # verbatim — no unit conversion

For agent_case (timestamp_unit="s"), a source document storing epoch milliseconds (e.g., from a pre-existing Milvus collection) would be stored verbatim as ~1.7e12, not ~1.7e9. The runtime collection's range filters work in seconds. The conversion block above it correctly handles datetime objects; the numeric branch should apply the same / 1000 for ms targets or * 1000 for s targets where the source is in the other unit. At a minimum, document the assumption ("source is always in target unit") clearly.

🟡 Medium severity

8. assert cfg is not None stripped under python -O

qdrant_collection_base.py:255: the existing assert is removed by the optimizer, leaving the cfg.payload_indexes access one line below to raise an opaque AttributeError: 'NoneType' object has no attribute 'payload_indexes' in production builds with optimizations. The same class already uses if cfg is None: raise RuntimeError(...) in ensure_collection() — apply the same pattern here. Fix this →

9. MongoDB cursor can time out mid-migration

migrate_milvus_to_qdrant.py:291: coll.find() uses MongoDB's default cursor idle timeout (10 min server-side). With a 32-doc batch and slow embedding calls (OpenRouter can add seconds per batch), large collections will hit CursorNotFound mid-run. Use coll.find(no_cursor_timeout=True) and wrap cursor iteration in try/finally: cursor.close().

10. atomic_fact field missing from sweep spec's payload_fields

re_embed_sweep.py:89-100: the atomic_fact spec only copies user_id, group_id, session_id, etc. into the payload — but not atomic_fact itself. The AtomicFactQdrantConverter stores the text in search_content (JSON-encoded), so search results won't carry the raw fact text. Verify whether downstream callers read payload.get("atomic_fact") directly from Qdrant search results, and if so, add "atomic_fact" to payload_fields.

🟢 Minor / Polish

11. Shutdown guard depends on env-flag instead of initialization state — qdrant_lifespan.py:123: if not _backend_is_qdrant() or self._qdrant_factory is None. If the env var changes between startup and shutdown (e.g., in a test), the factory won't be closed. Remove the _backend_is_qdrant() check; self._qdrant_factory is None is sufficient.

12. from_mongo return type annotation is wrong — base_converter.py:66 declares return type QdrantCollectionType (bound to QdrantCollectionBase) but every concrete converter returns qmodels.PointStruct. The Generic bound is correct for documenting the target collection, but the return type annotation is misleading for IDE users.

13. Cache key ignores transport flags — config_utils.py:111-126: get_qdrant_connection_cache_key hashes only host/port/url + api_key, ignoring https and prefer_grpc. If two tenants share a host but need different transport settings, they'd get the same cached (potentially wrong) client.

14. PII in delete_by_user_group log lines — user_profile_qdrant_repository.py:167-181: raw user_id and group_id are emitted at INFO and ERROR level. If these are opaque UUIDs this is fine, but if they can be real user identifiers they'll land in your log aggregator unredacted.

15. Redundant or None — atomic_fact_qdrant_converter.py:78: getattr(source_doc, "vector", None) or None — the or None is a no-op; remove it.

✅ Confirmed valid CodeRabbit findings I did NOT see issues with

The get_client() URL-vs-https parameter comment (lines 94-103 of the factory) is slightly misleading, but the actual client construction at lines 199-213 is correct: https=None is not passed to client_kwargs when unset, so the scheme in the URL controls TLS. The comment just needs clarifying, not a code change.
create_collection TOCTOU: if_not_exists=True is only available in newer qdrant-client versions. With >=1.12,<2 as the constraint this is worth adding but not a blocker.

Summary

The codebase quality is solid for its scope — good docstrings, consistent patterns with the Milvus counterpart, proper asyncio.to_thread wrapping. The blocking issues are all in the migration tooling (items 1-3) and are fixable quickly. The cross-tenant caching risk (item 4) depends on the DI scoping model and needs verification before cutover.

B1 — migration default dimension mismatch Default for ``VECTORIZE_DIMENSIONS`` in the migrate workhorse was ``4096`` while ``memory_layer/constants.py`` defaults to ``1024``. A migration run with the env unset produced 4096-dim collections that the runtime service (1024-dim) could not use, causing dimension-mismatch errors on every subsequent operation. Both ends now default to ``1024``; sites running a non-default dimension must set the env on both the migration and the runtime — a mismatch is loud (collection-create fails) instead of silent. B2 — ``openai`` and cursor could be unbound during finally cleanup The cleanup block in ``migrate()`` referenced ``openai`` unconditionally, but ``openai`` was only assigned inside the ``try`` after ``QdrantClient`` construction. If the Qdrant client failed to construct (unreachable host, bad credentials), the ``finally`` raised ``NameError`` and masked the real connection error. Pre-initialise ``openai`` and ``cursor`` to ``None`` before the ``try`` and guard each ``.close()`` call. ``coll.find()`` now opens with ``no_cursor_timeout=True`` (CodeRabbit M2 / Claude M2) — slow embedding batches (OpenRouter rate-limit, retry) easily exceed the 10-min server-side cursor idle default and crashed long sweeps with ``CursorNotFound``. B3 — ``min(radius, score_threshold)`` silently disabled filtering All six ``vector_search`` methods used ``min(radius, score_threshold)`` server-side. With the parameter default ``score_threshold=0.0`` and any explicit ``radius`` (e.g. ``radius=0.7``), the effective threshold collapsed to ``0.0``, was then converted to ``None`` for Qdrant, and the client-side ``point.score < score_threshold`` check (``< 0.0``) never fired either — caller got everything back. Extracted a ``compute_effective_threshold(radius, score_threshold)`` helper in ``base_repository`` that uses the smaller *positive* bound (or ``None`` when neither is positive) and switched all six repositories to it. The client-side hard cut-off is unchanged. Also: ``mongo_id_to_qdrant_id`` now rejects ``None``/empty inputs so upstream bugs do not silently collide on a single fixed Qdrant id (CodeRabbit H10).

High (H1-H14, except H10 already in 9faac50) H1 — Singleton repository cached tenant-bound collection ``BaseQdrantRepository.collection`` lazily cached the resolved tenant- prefixed collection instance, so a DI-singleton repository would lock to the first tenant that ever called it and silently route subsequent tenants' reads/writes to the wrong Qdrant collection. The cache is gone; ``self.model()`` is invoked per-call and re-resolves the tenant suffix each time. H2 — Qdrant client cache key ignored transport flags ``get_qdrant_connection_cache_key`` now folds ``https`` and ``prefer_grpc`` into the key. Two tenants on the same host but with different transport flags now get distinct cached clients. H3 — ``exists()`` swallowed auth and 5xx errors The previous catch was ``(ResponseHandlingException, UnexpectedResponse)``, which converted 401/403/5xx HTTP responses into "collection missing". ``ensure_collection()`` then routed into a confusing create attempt. Only ``ResponseHandlingException`` (transport-level) is caught now; HTTP errors propagate. H4 — Sweep exited 0 on partial failure ``sweep()`` returns the failed-pair count; ``main()`` returns ``1`` when any pair failed so cron / CI sees the run as red. H5 / H13 — Numeric timestamps not unit-normalized Extracted ``_normalize_timestamp_to_epoch`` (heuristic ms-vs-s detection on numeric values). Applied to the primary ``timestamp_field`` AND to any time-shaped values in ``payload_fields`` (e.g. foresight ``end_time``) so callers don't see a mixed ``datetime`` / numeric population. H6 — Dimension drift on pre-existing collections Both the migration helper and the runtime ``ensure_collection`` now read ``get_collection`` and raise ``RuntimeError`` when the existing vector size disagrees with the configured one. Loud failure replaces per-batch "vector size mismatch" once data starts flowing. H7 — TOCTOU between ``exists()`` and ``create_collection`` Idempotency for races: a 409 ``UnexpectedResponse`` from ``create_collection`` is swallowed (parallel process created it first); other errors propagate. H8 — Foresight time-range was containment, should have been overlap ``vector_search`` now filters ``payload.end_time >= q.start_time`` and ``payload.start_time <= q.end_time``. The previous filter dropped foresights whose window only partially overlapped the query window. H9 — Lifespan failure left clients open On any error during ``startup``, the lifespan provider now calls ``close_all_clients()`` before re-raising. ``shutdown()`` gates on the factory's initialization state, not the env flag (env flips between startup and shutdown in tests would have leaked clients). H11 — EpisodicMemory.vector_search returned raw epoch Result dict now normalizes ``timestamp`` to ``datetime`` for parity with the other Qdrant repositories. H12 — PII in user_profile delete-path logs ``user_id`` and ``group_id`` are now SHA-256-fingerprinted (12-char prefix) before logging instead of emitted in cleartext. H14 — ``atomic_fact`` missing from sweep payload spec Added to the atomic_fact sweep spec so migrated records carry the raw text — ``AtomicFactQdrantRepository.vector_search`` reads it directly out of the payload. Medium / Trivial M1 — ``assert cfg is not None`` stripped under ``python -O``: replaced with explicit RuntimeError in ``ensure_payload_indexes``. M3 — QDRANT_TIMEOUT now clamped to [1, 300] s. M4 — Shutdown no longer gates on env flag (see H9). M5 — README path-list expanded to include the infra_layer Qdrant tree. T1 — Removed redundant ``or None`` after ``getattr(..., None)`` in atomic_fact converter. T2 — ``BaseQdrantConverter.from_mongo`` return annotation widened from the generic bound (which the concrete converters do not satisfy) to ``Any``; the generic still documents the target *collection*.

Ptah-CT · 2026-05-13T22:20:43Z

@coderabbitai review

All findings from the first review on this branch addressed across two commits:

9faac50 — Blockers (3)

B1: VECTORIZE_DIMENSIONS default aligned to runtime (1024)
B2: openai + cursor pre-init in migrate() so finally cannot raise NameError; no_cursor_timeout=True on the Mongo cursor
B3: compute_effective_threshold helper replaces the broken min(radius, score_threshold) in all 6 vector_search paths; mongo_id_to_qdrant_id rejects empty inputs

51025b2 — High/Medium/Trivial (14)

H1+H2: tenant-cache leak fixed; https and prefer_grpc added to the client cache key
H3: exists() only catches transport errors now (auth/5xx propagate)
H4: sweep returns pairs_failed and main() propagates non-zero exit
H5+H13: _normalize_timestamp_to_epoch applied to primary timestamp AND time-shaped payload fields
H6: dimension drift validated against existing Qdrant collections in both ensure_qdrant_collection and ensure_collection
H7: 409 UnexpectedResponse from concurrent create_collection is now idempotent
H8: foresight time-range switched from containment to overlap
H9: lifespan rolls back open clients on startup error; shutdown gates on factory state, not env
H11: EpisodicMemory.vector_search normalizes timestamp back to datetime
H12: user_id / group_id SHA-256-fingerprinted in user_profile logs
H14: atomic_fact added to sweep payload_fields
M1, M3, M4, M5, T1, T2 — addressed
Codex's 7 inline findings were subsumed under the Blockers/High above.

Please re-run.

coderabbitai · 2026-05-13T22:20:58Z

@Ptah-CT Thank you for the detailed summary of all fixes! The thoroughness of the patch notes — covering blockers, high/medium/trivials, and the absorbed Codex inline findings — is much appreciated. Kicking off a fresh review now.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 8

♻️ Duplicate comments (1)

methods/EverCore/src/infra_layer/adapters/out/search/repository/foresight_qdrant_repository.py (1)

8-13: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Zeitintervall-Semantik in delete_by_filters ist immer noch Containment — divergiert von vector_search (Overlap).

Der Overlap-Fix wurde nur in vector_search (Z. 220–236) eingebaut. delete_by_filters (Z. 319–332) filtert weiterhin auf key="start_time" gte und key="end_time" lte, also Containment. Der Docstring (Z. 297–299) behauptet jedoch, die Semantik entspreche vector_search. Effekt: ein Aufrufer, der vector_search(start, end) ausführt und anschließend mit denselben Argumenten delete_by_filters(start, end) aufruft, löscht eine andere Menge — nämlich nur die vollständig enthaltenen Foresights, während überlappende (die in der Suche auftauchen) bestehen bleiben. Das ist ein stiller Datenkonsistenz-Bug bei Migrationen/Cleanups.

Zusätzlich ist der Modul-Docstring (Z. 8–13) nach dem Overlap-Fix in vector_search veraltet — er beschreibt nach wie vor die alte Containment-Semantik (start_time arg maps to start_time >= ...).

Vorgeschlagener Fix für `delete_by_filters` + Modul-Docstring

-**Note on time filters:** the Foresight schema stores ``start_time`` and
-``end_time`` (both epoch milliseconds). The Milvus repository erroneously
-filters on a non-existent ``timestamp`` field; the Qdrant repository
-filters on ``start_time``/``end_time`` semantically — ``start_time`` arg
-maps to ``start_time >= ...`` and ``end_time`` arg to ``end_time <= ...``,
-which is the natural range-overlap semantic for a time-spanning record.
+**Note on time filters:** the Foresight schema stores ``start_time`` and
+``end_time`` (both epoch milliseconds). Filtering uses window-overlap
+semantics — the query's ``start_time`` arg maps to
+``payload.end_time >= ...`` and the query's ``end_time`` arg maps to
+``payload.start_time <= ...``. This matches the natural intent of "find
+records whose window intersects the query window" and is applied
+consistently across ``vector_search`` and ``delete_by_filters``.

         Batch delete by filter combination; at least one filter required.

-        Time-range semantic matches ``vector_search``: ``start_time`` arg
-        maps to ``payload.start_time >= ...``, ``end_time`` arg maps to
-        ``payload.end_time <= ...``.
+        Time-range semantic matches ``vector_search`` (window-overlap):
+        ``start_time`` arg -> ``payload.end_time >= start_time_ms``,
+        ``end_time`` arg  -> ``payload.start_time <= end_time_ms``.
         """
@@
-            if start_time:
-                conditions.append(
-                    qmodels.FieldCondition(
-                        key="start_time",
-                        range=qmodels.Range(gte=to_epoch_ms(start_time)),
-                    )
-                )
-            if end_time:
-                conditions.append(
-                    qmodels.FieldCondition(
-                        key="end_time",
-                        range=qmodels.Range(lte=to_epoch_ms(end_time)),
-                    )
-                )
+            if start_time:
+                conditions.append(
+                    qmodels.FieldCondition(
+                        key="end_time",
+                        range=qmodels.Range(gte=to_epoch_ms(start_time)),
+                    )
+                )
+            if end_time:
+                conditions.append(
+                    qmodels.FieldCondition(
+                        key="start_time",
+                        range=qmodels.Range(lte=to_epoch_ms(end_time)),
+                    )
+                )

Falls die Containment-Semantik in delete_by_filters bewusst beibehalten werden soll (z. B. konservativeres Löschverhalten), muss der Docstring entsprechend explizit machen, dass sich Such- und Lösch-Semantik unterscheiden, damit Aufrufer das nicht miteinander verwechseln.

Also applies to: 297-332

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@methods/EverCore/src/infra_layer/adapters/out/search/repository/foresight_qdrant_repository.py`
around lines 8 - 13, Update the module docstring to reflect the overlap
(range-overlap) semantics used by vector_search instead of the old containment
wording, and change delete_by_filters to use the same overlap logic as
vector_search: replace the current containment predicates (start_time >=
start_arg AND end_time <= end_arg) with overlap predicates (start_time <=
end_arg AND end_time >= start_arg) so that a record is deleted when its time
span overlaps the query window; ensure you adjust the filter construction in
delete_by_filters and keep vector_search and delete_by_filters semantics
consistent (or explicitly document if you intentionally want them different).

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@methods/EverCore/src/core/lifespan/qdrant_lifespan.py`:
- Around line 93-101: The startup async method performs blocking Qdrant I/O
synchronously; wrap the blocking calls in threads using asyncio.to_thread: call
self._qdrant_factory.get_named_client via await asyncio.to_thread(...) when
assigning self._qdrant_clients[using], instantiate collection_class() via await
asyncio.to_thread(...) if construction is blocking, and call
collection.ensure_all via await asyncio.to_thread(...) (or convert ensure_all to
async if possible) so the event loop is not blocked.

In `@methods/EverCore/src/core/oxm/qdrant/qdrant_collection_base.py`:
- Around line 185-419: The current methods (exists, count, ensure_collection,
ensure_payload_indexes, ensure_all, upsert, search, delete, drop) perform
blocking Qdrant client calls (e.g. self.client().collection_exists, .count,
.create_collection, .query_points, .delete, .delete_collection) and must be made
non-blocking: change these routines to await the I/O either by using async
client APIs if available or by wrapping the blocking calls with await
asyncio.to_thread(...) (e.g. await
asyncio.to_thread(self.client().collection_exists, self.name)) and propagate
async/await through callers; ensure exceptions/logging semantics are preserved
and keep the same return values and names (exists, count, ensure_collection,
ensure_payload_indexes, ensure_all, upsert, search, delete, drop) so callers
continue to work.

In `@methods/EverCore/src/core/tenants/tenantize/oxm/qdrant/config_utils.py`:
- Around line 55-57: The broad except block that logs and returns None must be
changed to fail-closed: stop swallowing unexpected errors in the except handling
around tenant qdrant config resolution (the except Exception as e:
logger.warning(...) return None block that leads callers to call
_base_prefixed_collection_name(...)). Instead, catch only the specific, expected
exceptions (e.g., config-missing or NotFound) and return a clear sentinel for
“no tenant config”; for all other exceptions re-raise or raise a
TenantResolutionError so the caller at the sites that call
_base_prefixed_collection_name(...) cannot silently fall back to shared
prefixes. Update the logging to include the error details (logger.exception or
include e) and ensure callers handle the explicit error/sentinel instead of
treating None as success.
- Around line 130-135: The cache-key boolean flags are being incorrectly
normalized because using bool(https) treats non-empty strings like "false" as
True; update the normalization in the endpoint-building logic (where https =
config.get("https") and prefer_grpc = config.get("prefer_grpc") are used to
append "#https=..." and "#grpc=...") to robustly parse string and non-string
values into real booleans (e.g., convert strings via lower() and compare to
"true"/"1"/"yes" or use a small helper parse_bool(value) that handles None,
bools, ints, and common truthy/falsey strings) and then append the normalized
boolean value to endpoint so cache keys are correct.

In `@methods/EverCore/src/devops_scripts/migrate_milvus_to_qdrant.py`:
- Around line 233-248: The numeric heuristic is overbroad: the second loop
applies _normalize_timestamp_to_epoch to any int/float payload field (iterating
payload_fields), corrupting non-timestamp numeric fields; change this by
removing the magnitude-based pass and instead restrict normalization to an
explicit whitelist of timestamp field names (e.g. "timestamp", "start_time",
"end_time") or to fields supplied via a new CLI option. Concretely, stop calling
_normalize_timestamp_to_epoch for every numeric payload[field] in the loop that
references payload_fields/timestamp_field, add support for an
--extra-timestamp-fields CLI flag and wire that into CollectionSpec
(re_embed_sweep.py) as extra_timestamp_fields, and only normalize fields present
in that whitelist (plus the existing timestamp_field). Ensure no other numeric
fields are transformed.

In `@methods/EverCore/src/devops_scripts/re_embed_sweep.py`:
- Around line 191-197: estimated_count currently creates and closes a
MongoClient per call causing N×M connection overhead; change sweep to create one
shared MongoClient (separate from migrate()'s internal client), pass that client
into list_active_dbs and estimated_count (modify their signatures to accept a
MongoClient parameter), use the shared client for all count/list operations, and
close the shared client in sweep's finally block to ensure a single connection
pool for the whole sweep.

In
`@methods/EverCore/src/infra_layer/adapters/out/search/repository/atomic_fact_qdrant_repository.py`:
- Around line 49-63: The parameter name `id` in create_and_save_atomic_fact
should be renamed to a non-builtin name (e.g., point_id or mongo_id) to avoid
shadowing Python's builtin; update the function signature of
create_and_save_atomic_fact (and any other functions in this file that use a
parameter named `id` in the noted ranges) to use point_id, then update all
references inside the method body, imports/usages, log messages, and any dict
keys or returned payload fields that currently use `id` so they consistently use
point_id; ensure tests/type hints still match and no leftover `id` references
remain.

In `@README.md`:
- Line 17: Aktualisiere die Status-Zeile für `feature/qdrant-adapter` im README
von "work in progress. Phase 1: adapter skeleton" auf eine präzise Beschreibung
der gelieferten Funktionalität (z.B. "fully implemented — Collection base, 6
converters, 6 repositories, Migration CLI, Re-Embed Sweep; achieves full Milvus
parity" oder ähnliche Formulierung), sodass die Branch-Bezeichnung
`feature/qdrant-adapter` und der PR-Titel "full Milvus parity" widerspruchsfrei
sind und die genannten Komponenten deutlich aufgeführt werden.

---

Duplicate comments:
In
`@methods/EverCore/src/infra_layer/adapters/out/search/repository/foresight_qdrant_repository.py`:
- Around line 8-13: Update the module docstring to reflect the overlap
(range-overlap) semantics used by vector_search instead of the old containment
wording, and change delete_by_filters to use the same overlap logic as
vector_search: replace the current containment predicates (start_time >=
start_arg AND end_time <= end_arg) with overlap predicates (start_time <=
end_arg AND end_time >= start_arg) so that a record is deleted when its time
span overlaps the query window; ensure you adjust the filter construction in
delete_by_filters and keep vector_search and delete_by_filters semantics
consistent (or explicitly document if you intentionally want them different).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 852a4f39-c113-4f84-b4d0-11339dac2d64

📥 Commits

Reviewing files that changed from the base of the PR and between bd17eb5 and 51025b2.

📒 Files selected for processing (16)

README.md
methods/EverCore/src/core/component/qdrant_client_factory.py
methods/EverCore/src/core/lifespan/qdrant_lifespan.py
methods/EverCore/src/core/oxm/qdrant/base_converter.py
methods/EverCore/src/core/oxm/qdrant/base_repository.py
methods/EverCore/src/core/oxm/qdrant/qdrant_collection_base.py
methods/EverCore/src/core/tenants/tenantize/oxm/qdrant/config_utils.py
methods/EverCore/src/devops_scripts/migrate_milvus_to_qdrant.py
methods/EverCore/src/devops_scripts/re_embed_sweep.py
methods/EverCore/src/infra_layer/adapters/out/search/qdrant/converter/atomic_fact_qdrant_converter.py
methods/EverCore/src/infra_layer/adapters/out/search/repository/agent_case_qdrant_repository.py
methods/EverCore/src/infra_layer/adapters/out/search/repository/agent_skill_qdrant_repository.py
methods/EverCore/src/infra_layer/adapters/out/search/repository/atomic_fact_qdrant_repository.py
methods/EverCore/src/infra_layer/adapters/out/search/repository/episodic_memory_qdrant_repository.py
methods/EverCore/src/infra_layer/adapters/out/search/repository/foresight_qdrant_repository.py
methods/EverCore/src/infra_layer/adapters/out/search/repository/user_profile_qdrant_repository.py

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 51025b2b62

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

CR re-review on 51025b2 surfaced 8 actionable findings (1 a duplicate of H8 in the delete path), plus the long-standing N2 ask to make ``QdrantCollectionBase`` async-first instead of relying on per-caller ``asyncio.to_thread`` wrappers. All addressed: Async refactor (N1 + N2 — the heavy lift) - ``QdrantCollectionBase`` methods are now ``async``: ``exists``, ``count``, ``ensure_collection``, ``ensure_payload_indexes``, ``ensure_all``, ``upsert``, ``search``, ``delete``, ``drop``. Each delegates to the blocking ``qdrant-client`` call via ``asyncio.to_thread`` once, instead of forcing every caller to wrap the call themselves. - ``BaseQdrantRepository`` no longer wraps ``self.collection.upsert/delete/ search/count`` in ``asyncio.to_thread`` — those are now ``await``-able directly. The to_thread wrappers around ``client.retrieve`` stay (the client method is still sync; the collection class has no native ``retrieve``). - ``QdrantLifespanProvider.startup`` now awaits ``collection.ensure_all`` and offloads ``get_named_client`` (still sync) to a worker thread, so the event loop is no longer blocked during FastAPI startup. Regressions from 51025b2 (N4, N5) - ``get_qdrant_connection_cache_key``: replaced the naive ``bool(value)`` with a ``_as_bool`` helper that handles strings (``bool("false")`` is ``True`` in Python and would have produced wrong cache keys for tenants whose storage entry returns ``https``/``prefer_grpc`` as strings). - ``build_payload``: dropped the magnitude-based "looks like a timestamp" heuristic. It would have rewritten legitimate non-time numerics like ``maturity_score`` or ``duration_days`` whenever they crossed the threshold. Normalization now requires an explicit whitelist via a new ``--extra-timestamp-fields`` CLI flag / ``extra_timestamp_fields`` parameter on the workhorse; the sweep spec for ``foresight`` declares ``end_time`` there. Other CR findings N3 — Tenant-config resolution is now fail-closed ``get_tenant_qdrant_config`` and ``get_tenant_aware_collection_name`` no longer swallow arbitrary exceptions and silently fall back to the shared base prefix (which would route a tenant's data into another tenant's collection). Only the legitimate "no tenant context" / ``LookupError`` paths return ``None``; everything else propagates. N6 — Shared MongoClient in the sweep ``sweep()`` now opens a single ``MongoClient`` for the discovery / count phase and passes it to ``list_active_dbs`` and ``estimated_count`` (which used to open and close their own client per call, producing N×M connection churn). N7 — Renamed ``id`` parameter in ``create_and_save_atomic_fact`` to ``point_id`` to avoid shadowing Python's builtin. N8 — README ``Status`` line replaced the stale "Phase 1: adapter skeleton" with a concrete list of what the branch actually delivers. D1 — ``ForesightQdrantRepository.delete_by_filters`` now uses the same window-overlap filter as ``vector_search`` so a "delete this window" call doesn't leave behind records that ``vector_search`` would still return. Module docstring updated. Links CI fix The Rokid / Creative Assistant use-case blocks in README.md were missing the ``[![banner](img)](link)`` wrap and the ``[Code|Plugin|Live Demo](link)`` primary marker that the ``Docs`` workflow's link check enforces. Wrapped both banners with self-anchor links and gave each a matching primary marker. (Pre-existing drift from upstream; the workflow only started running on this branch.)

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5d1be9c69e

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-05-13T22:55:25Z

+                ts_ms = payload.get("timestamp", 0) or 0
+                search_results.append(
+                    {
+                        "id": str(point.id),


Return Mongo IDs instead of internal UUID point IDs

vector_search currently returns id from point.id, but this adapter writes Qdrant IDs as deterministic UUID5 values (see converter/migration paths that call mongo_id_to_qdrant_id and store the original Mongo key in payload as mongo_id). Under migrated data, callers receive UUIDs instead of source document IDs, so downstream backfill paths that look up Mongo records by returned id will miss and degrade results after Qdrant cutover. Use payload mongo_id as the outward-facing id (with a fallback to point.id for legacy points) to preserve Milvus-parity semantics.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-13T22:55:25Z

+            }
+
+            await self.upsert(
+                qmodels.PointStruct(id=id, vector=vector, payload=payload)


Convert runtime point IDs to Qdrant-compatible format

create_and_save_episodic_memory upserts with the raw caller id, but this same PR introduces mongo_id_to_qdrant_id because Mongo-style IDs are not valid Qdrant point IDs (Qdrant accepts uint64/UUID IDs). In cutover flows where callers pass Mongo/ObjectId-like IDs, this path will fail at upsert time, while migrated records use UUID5 IDs, creating inconsistent write behavior between live ingestion and migration. Convert id through mongo_id_to_qdrant_id (or enforce UUID input) before building PointStruct.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-13T22:55:25Z

+            }
+
+            await self.upsert(
+                qmodels.PointStruct(id=point_id, vector=vector, payload=payload)


Normalize atomic-fact point IDs before Qdrant upsert

create_and_save_atomic_fact writes point_id directly as the Qdrant point ID, but this PR’s own migration/converter paths translate Mongo IDs via mongo_id_to_qdrant_id to satisfy Qdrant’s ID constraints. If runtime callers pass Mongo/ObjectId-like IDs, this upsert path can fail (or diverge from migrated ID space), so search/delete behavior becomes inconsistent between newly written and migrated records. Apply the same deterministic ID normalization (or strict UUID validation) before constructing the point.

Useful? React with 👍 / 👎.

## Problem After PR #2 (Qdrant adapter merged) and the data-side migration, the service still refused to boot with ``VECTOR_STORE_BACKEND=qdrant``: 1. ``MilvusLifespanProvider`` had no env-gate, always tried to connect to Milvus, and crashed when Milvus was offline (which it is in the Qdrant cutover scenario). It runs at order=18, before the Qdrant lifespan at order=19, so the Qdrant lifespan never got a chance to start. 2. Once the Milvus lifespan was gated, ``BaseMilvusRepository.__init__`` crashed at every controller construction because it eagerly called ``model.async_collection()`` — which raises when the Milvus lifespan has not initialised the collection (i.e. exactly the Qdrant-mode case). 3. ``QdrantConnectionCache``'s lazy import inside ``get_tenant_qdrant_config`` still referenced ``core.tenants.tenantize.tenant_context`` — that module was renamed to ``core.tenants.tenant_contextvar`` upstream. The failure only surfaced at the very first ``ensure_collection`` call inside ``QdrantLifespanProvider.startup``, so it looked indistinguishable from a wider Qdrant initialisation issue. ## Fix Three small, surgical patches: **``core/lifespan/milvus_lifespan.py``** — symmetric env-gate. ``startup`` and ``shutdown`` return ``None`` immediately when ``VECTOR_STORE_BACKEND=qdrant``, mirroring what ``QdrantLifespanProvider`` already does for the inverse case. **``core/oxm/milvus/base_repository.py``** — lazy collection resolution. ``__init__`` no longer eagerly resolves ``model.async_collection()``; instead it stashes the model class, and ``self.collection`` is now a ``@property`` that resolves on first access. Milvus-mode behaviour is unchanged (first repo call resolves the collection identically to before). Qdrant-mode boots cleanly because the Milvus collections are never touched. **``core/tenants/tenantize/oxm/qdrant/config_utils.py``** — fixed the lazy import: ``core.tenants.tenantize.tenant_context`` → ``core.tenants.tenant_contextvar``. ## Verification Boot with ``VECTOR_STORE_BACKEND=qdrant``: - lifespan order: metrics → mongodb → milvus(no-op) → elasticsearch → qdrant (initialised) → business → longjob — all green. - ``/health`` returns ``{"status": "healthy"}`` HTTP 200. - ``/docs`` HTTP 200. - The 6 Qdrant collection classes are discovered + initialised at startup. ``ensure_all`` resolves the tenant-aware names and finds the pre-seeded collections green. ## Risk Low. None of the changes affect Milvus-mode (default). All three patches are scoped to either the Qdrant code path or to a lazy-resolution change that defers behaviour without changing it. --------- Co-authored-by: Ptah-CT <221234802+Ptah-CT@users.noreply.github.com>

…Lock CodeRabbit finding #3: _ensure_session() could be entered concurrently by multiple coroutines (e.g. parallel rerank batches via gather), each seeing self.session is None and creating its own aiohttp.ClientSession. The losers of the race were never tracked and leaked. Add self._session_lock = asyncio.Lock() in __init__ and use a double-checked pattern inside _ensure_session: a fast path when the session is already alive, a lock-protected slow path with re-check before instantiation. Findings #1 (Voyage model default) and #2 (raw query in debug log) are intentionally deferred — production .env always sets RERANK_MODEL and DEBUG-level logs are not enabled by default in this environment.

Ptah-CT added 20 commits May 13, 2026 21:13

Ptah-CT mentioned this pull request May 13, 2026

Qdrant adapter: full Milvus parity + standalone re-embed sweep #1

Closed

5 tasks

coderabbitai Bot requested changes May 13, 2026

View reviewed changes

chatgpt-codex-connector Bot reviewed May 13, 2026

View reviewed changes

Ptah-CT added 2 commits May 13, 2026 21:59

coderabbitai Bot requested changes May 13, 2026

View reviewed changes

chatgpt-codex-connector Bot reviewed May 13, 2026

View reviewed changes

XInfty deleted a comment from qodo-code-review Bot May 13, 2026

coderabbitai Bot approved these changes May 13, 2026

View reviewed changes

DerAuctor merged commit 66e7aff into main May 13, 2026
4 checks passed

DerAuctor deleted the qdrant/rebase-evercore branch May 13, 2026 22:54

chatgpt-codex-connector Bot reviewed May 13, 2026

View reviewed changes

Ptah-CT mentioned this pull request May 14, 2026

fix(qdrant): allow service to boot with VECTOR_STORE_BACKEND=qdrant #5

Merged

Conversation

Ptah-CT commented May 13, 2026

Summary

Test plan

Out of scope / follow-ups

Uh oh!

coderabbitai Bot commented May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

DerAuctor commented May 13, 2026

Uh oh!

claude Bot commented May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review

🔴 Blockers

🟠 High severity

🟡 Medium severity

🟢 Minor / Polish

✅ Confirmed valid CodeRabbit findings I did NOT see issues with

Summary

Uh oh!

Ptah-CT commented May 13, 2026

Uh oh!

coderabbitai Bot commented May 13, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot commented May 13, 2026 •

edited

Loading

claude Bot commented May 13, 2026 •

edited

Loading