Skip to content

Qdrant adapter: full Milvus parity + standalone re-embed sweep (rebased on EverCore path)#2

Merged
DerAuctor merged 23 commits into
mainfrom
qdrant/rebase-evercore
May 13, 2026
Merged

Qdrant adapter: full Milvus parity + standalone re-embed sweep (rebased on EverCore path)#2
DerAuctor merged 23 commits into
mainfrom
qdrant/rebase-evercore

Conversation

@Ptah-CT
Copy link
Copy Markdown

@Ptah-CT Ptah-CT commented May 13, 2026

Summary

Rebase of the original Qdrant-adapter branch on top of current main, after upstream renamed methods/evermemos/methods/EverCore/ and added OpenHer use-cases. All 20 commits preserved, path rewritten via git format-patch + sed + git am --3way. README.md auto-merged.

Supersedes #1, which carried the same code on the pre-rename path.

The branch carries:

  • Phase 1: QdrantCollectionBase (tenant-aware naming, payload indexes, query_points instead of removed search).
  • Phase 2: 6 collections + 6 converters covering EpisodicMemory, AtomicFact, Foresight, AgentCase, AgentSkill, UserProfile.
  • Phase 2.5: 6 @repository adapters with two-stage score gating, client.count(exact=True) for delete-by-filter, and tz-aware epoch helpers.
  • Phase 3: standalone migrate_milvus_to_qdrant CLI (Mongo → OpenRouter → Qdrant) plus the re_embed_sweep wrapper that fans the workhorse over every active tenant database.
  • 3 rounds of bot feedback addressed inline: 30 (CodeRabbit pass-1) + 7 (CodeRabbit pass-2) + 17 (CodeRabbit on Qdrant adapter: full Milvus parity + standalone re-embed sweep #1, plus a 👍-equivalent from Codex).

Test plan

  • ruff check clean on changed files.
  • pytest green for methods/EverCore/tests/ (collection + converter + repository suites).
  • Smoke: standalone migrate writes a single document into a smoketest Qdrant collection (verified against a real Qdrant pre-rebase).
  • Sweep dry-run prints the expected (db, coll) pair plan and does not call OpenRouter.
  • Full sweep (after merge) writes the re-embedded points into Qdrant collections per tenant.

Out of scope / follow-ups

  • UserProfile reindex is intentionally not in the sweep wrapper — it splits one source document into many points and needs its own Phase 3.1 path.

Ptah-CT added 20 commits May 13, 2026 21:13
Phase 1 (skeleton) of the milvus -> qdrant migration. No runtime behavior
change yet — Milvus stays the default backend until cutover.

- README: prepend fork header documenting motivation, status, approach,
  and concept mapping. Links to the Qdrant migration guide.
- pyproject: add qdrant-client>=1.12,<2.
- src/core/oxm/qdrant/qdrant_collection_base.py: stub IndexConfig and
  QdrantCollectionBase. ensure_all() is a no-op so the lifespan provider
  can iterate registered subclasses without crashing during the skeleton
  phase. upsert/search/delete raise NotImplementedError (Phase 2).
- src/core/component/qdrant_client_factory.py: full QdrantClientFactory
  with env-driven get_qdrant_config(prefix=...), per-alias client caching,
  named clients, and graceful shutdown. https=Optional[bool] preserves
  qdrant-client's URL-scheme TLS detection; api_key=Optional[str] passes
  through cleanly without empty-string coercion. Registered as
  @component(primary=False) so the milvus factory remains the default
  until cutover.
- Empty __init__.py for new oxm/qdrant and tenants/.../oxm/qdrant
  packages.

Next: qdrant_lifespan.py (gated by VECTOR_STORE_BACKEND env flag) and
full collection-base impl.
…D env

Provides FastAPI lifespan startup/shutdown for the Qdrant adapter, analogous
to MilvusLifespanProvider but no-op unless VECTOR_STORE_BACKEND=qdrant. So
the Milvus backend stays the default at runtime until cutover.

On startup (when active):
- Resolves the 'qdrant_client_factory' DI bean.
- Collects all concrete QdrantCollectionBase subclasses via get_all_subclasses.
- Groups them by _DB_USING and ensures the client per group, then runs
  ensure_all() on each collection (currently a stub no-op; will create
  collections + payload indexes in Phase 1.2).

On shutdown: closes all cached Qdrant clients. Cleans the same app.state
attributes pattern as the milvus provider.

Order=19 sits between milvus_lifespan (18) and business_lifespan (20),
so during cutover both backends can briefly coexist.
Generic[QdrantCollectionType] ABC with an abstract @classmethod from_mongo()
that subclasses implement to convert Mongo source docs into Qdrant point
payloads (PointStruct or compatible dicts).

get_qdrant_model() introspects the Generic argument from __orig_bases__ so
the bound collection class can be retrieved at runtime — same pattern as
the Milvus base converter, which the search-repository layer relies on.
…query_points()

Replaces the Phase 1.1 stub with a full collection-management base class:

- IndexConfig as @DataClass: size, distance, on_disk, hnsw_m,
  hnsw_ef_construct, payload_indexes (dict field_name -> schema_type).
  to_vectors_config() builds the qdrant_client VectorParams + HnswConfigDiff.
- Module-level _DISTANCE_MAP and _PAYLOAD_SCHEMA_TYPE_MAP translate string
  configs to SDK enums — subclasses stay decoupled from the SDK.
- QdrantCollectionBase methods: client() (lazy DI lookup), exists(),
  count(), ensure_collection() (idempotent), ensure_payload_indexes()
  (idempotent), ensure_all(), upsert(), search() (qdrant-client
  query_points wrapper), delete(), drop().
- Compared to MilvusCollectionBase this is ~half the LOC because Qdrant
  has no alias mechanism — collection names are direct.

CRITICAL fix in the same commit: qdrant-client 1.16.1 (the version
resolved against our >=1.12,<2 pin) removed the legacy
QdrantClient.search method; only query_points is available now. The
search() wrapper here calls query_points(query=..., ...) and unwraps
QueryResponse.points so call sites still get List[ScoredPoint].

uv.lock is regenerated to include qdrant-client (1.16.1) and its
transitive deps (h2, hpack, hyperframe, portalocker).
…t naming

Adds the multi-tenancy adapter for Qdrant. Deliberately schlanker than the
Milvus counterpart (~270 LOC total vs ~750 LOC for milvus' tenant layer)
because Qdrant has no alias mechanism and no partition_key feature — multi-
tenancy is realized via collection-per-tenant naming alone.

src/core/tenants/tenantize/oxm/qdrant/config_utils.py:
- get_tenant_aware_collection_name(original_name): resolves the final
  Qdrant collection name from the active tenant context. Lookup order is
  storage_info['qdrant'] -> storage_info['milvus'] (migration bridge,
  reuses the same collection_prefix for both backends until per-tenant
  qdrant config is wired) -> base resource prefix fallback.
- get_qdrant_connection_cache_key(config): builds a stable factory cache
  key, hashing api_key fingerprints (8 hex chars) so the raw key never
  appears in the cache identifier.
- _load_qdrant_env(prefix): env-fallback loader for tenant-aware connection
  routing. Currently exported as private until the routing layer consumes
  it; documented to avoid dead-code flags.

src/core/tenants/tenantize/oxm/qdrant/tenant_aware_qdrant_collection_with_suffix.py:
- TenantAwareQdrantCollectionWithSuffix(QdrantCollectionBase): overrides
  the name property to return tenant-prefixed + optional explicit suffix.
  The collection base remains unchanged; concrete Phase-2 collections
  inherit from this class instead of QdrantCollectionBase directly.
- __init__(suffix) accepts explicit override or falls back to the
  SELF_QDRANT_COLLECTION_NS env-var.
- _MULTI_TENANT_STRATEGY ClassVar is informational; a future version may
  opt into Qdrant's native payload-partitioning instead of separate
  collections per tenant.
…ollectionBase

Wraps the sync qdrant-client API via asyncio.to_thread so the repository
surface stays async (parity with the milvus repository layer).

Methods:
- upsert(point) -> str (point id, parity with milvus 'insert' returning entity_id)
- upsert_batch(points) -> UpdateResult (full result, exposes wait-status)
- find_by_id(id) / find_by_ids(ids) -> Optional[Record] / List[Record]
- delete_by_id(id) -> bool; delete_batch(ids) -> UpdateResult
- search(query_vector, limit, query_filter, ...) -> List[ScoredPoint]
- count(exact=True) -> int
- collection: lazy-instantiated QdrantCollectionBase subclass
- get_model_name() -> str

Error-handling semantics mirror the milvus counterpart:
- upsert / upsert_batch / delete_batch / search → log + raise on failure
- find_by_id / find_by_ids / delete_by_id → log + return None / False
  (resilient read path)

Subclasses bind the generic parameter to the concrete collection model
class:

    class EpisodicMemoryRepository(
        BaseQdrantRepository[EpisodicMemoryCollection]
    ):
        def __init__(self):
            super().__init__(EpisodicMemoryCollection)
…+converter

First two of six Qdrant collections matching the Milvus search adapter layout
under src/infra_layer/adapters/out/search/. Subsequent commits add the
remaining four (agent_skill, foresight, atomic_fact, user_profile).

Collections (src/infra_layer/adapters/out/search/qdrant/memory/):
- episodic_memory_collection.py: TenantAwareQdrantCollectionWithSuffix
  subclass, base name v1_episodic_memory. Vector 1024-dim Cosine, HNSW
  m=16 ef_construct=200. Payload indexes: user_id/group_id/session_id/
  parent_id/parent_type/type (keyword) + timestamp (integer, epoch ms).
- agent_case_collection.py: base name v1_agent_case. Same vector params;
  payload indexes minus participants/type, plus timestamp as epoch
  seconds (parity with milvus converter).

Converters (src/infra_layer/adapters/out/search/qdrant/converter/):
- episodic_memory_qdrant_converter.py: BaseQdrantConverter[
  EpisodicMemoryCollection]. from_mongo() builds a PointStruct from a
  MongoDB v1_episodic_memories document. Handles missing optional fields
  (sender_ids, type, subject, summary) and serializes the search_content
  list to JSON for downstream search service consumption.
- agent_case_qdrant_converter.py: BaseQdrantConverter[AgentCaseCollection].
  Maps AgentCaseRecord -> PointStruct, truncates task_intent to 5000 chars
  (Milvus parity), uses epoch-seconds timestamp.

Qdrant is schema-flexible, so unlike the Milvus side there is no field
schema declaration — only the vector params and the explicit payload
indexes (the rest of the payload is whatever the converter writes).
…verter

Adds collections 3 and 4 of six. Same TenantAwareQdrantCollectionWithSuffix
+ BaseQdrantConverter pattern as batch 1.

Collections:
- agent_skill_collection.py: base name v1_agent_skill. Payload indexes for
  user_id/group_id/cluster_id (keyword) + maturity_score/confidence (float)
  for threshold range queries.
- foresight_collection.py: base name v1_foresight_record. Payload indexes
  for user_id/group_id/session_id/parent_id/parent_type/type (keyword) +
  start_time/end_time (integer, epoch ms).

Converters:
- agent_skill_qdrant_converter.py: AgentSkillRecord -> PointStruct. Builds
  the content payload from name + description, truncates to 5000 chars.
  Coerces optional maturity_score / confidence to 0.0 when absent (Qdrant
  silently excludes null-valued payloads from range filters, so treating
  'unscored' as 'lowest score' keeps them visible to threshold queries).
- foresight_qdrant_converter.py: ForesightRecord -> PointStruct. Time-field
  parser accepts datetime / ISO-8601 / numeric. **Diverges from Milvus
  template**: numeric inputs above 1e10 are treated as already-milliseconds
  rather than blindly multiplied by 1000 — the Milvus version corrupts
  already-ms inputs. content payload is intentionally passed verbatim
  (incl. None) for downstream sentinel semantics; documented inline.
…onverter (Phase 2 complete)

Adds the final two of six Qdrant collections — Phase 2 of the
Milvus->Qdrant migration is now structurally complete (6 collections + 6
converters under src/infra_layer/adapters/out/search/qdrant/).

Collections:
- atomic_fact_collection.py: base name v1_atomic_fact_record. Payload
  indexes for user_id/group_id/session_id/parent_id/parent_type/type
  (keyword) + timestamp (integer, epoch ms).
- user_profile_collection.py: base name v1_user_profile. No session_id
  (user-level aggregation). Payload indexes for user_id/group_id/scenario/
  item_type (all keyword).

Converters:
- atomic_fact_qdrant_converter.py: AtomicFactRecord -> PointStruct. Falls
  back to RawDataType.CONVERSATION when source.type is absent. exc_info
  on error log.
- user_profile_qdrant_converter.py: **diverges from the other converters'
  return type** — returns List[Dict[str, Any]] (one item per
  explicit_info / implicit_trait / user_goal entry) for parity with the
  Milvus counterpart. ProfileIndexer downstream wraps each item into a
  PointStruct after embedding. Module-level _EXPLICIT_FIELDS and
  _IMPLICIT_FIELDS constants (Milvus version had them inline). Carries
  '# type: ignore[override]' on from_mongo with docstring justification.

Phase 2 file inventory:
  src/infra_layer/adapters/out/search/qdrant/
  ├── __init__.py
  ├── memory/
  │   ├── __init__.py
  │   ├── episodic_memory_collection.py
  │   ├── agent_case_collection.py
  │   ├── agent_skill_collection.py
  │   ├── foresight_collection.py
  │   ├── atomic_fact_collection.py
  │   └── user_profile_collection.py
  └── converter/
      ├── __init__.py
      ├── episodic_memory_qdrant_converter.py
      ├── agent_case_qdrant_converter.py
      ├── agent_skill_qdrant_converter.py
      ├── foresight_qdrant_converter.py
      ├── atomic_fact_qdrant_converter.py
      └── user_profile_qdrant_converter.py
…ories

First 2 of 6 Qdrant repositories, mirroring the surface of the corresponding
Milvus repositories so the search-service layer can swap backends via the
VECTOR_STORE_BACKEND env flag.

Both repositories:
- Inherit from BaseQdrantRepository[<Collection>], registered as
  @repository(name='..._qdrant_repository', primary=False).
- Build filters as qmodels.Filter(must=[FieldCondition(...)]) using
  MatchValue / MatchAny / Range instead of Milvus' string expression
  syntax. Filter-construction is fully typed — no injection vector
  remains in the search path.
- Honour the MAGIC_ALL sentinel for user_id / group_id with the same
  semantics as the Milvus repositories.
- Use a two-stage score gating pattern (server-side score_threshold via
  Qdrant + client-side post-filter at the caller's hard threshold);
  the rationale is documented inline so future readers don't read it
  as a duplicated check.

AgentSkillQdrantRepository:
- vector_search() with maturity_threshold / confidence_threshold range
  filters plus optional cluster_id / group_ids.
- delete_by_cluster_id() — uses scroll() for a best-effort delete count
  (Qdrant's filter-based delete doesn't return one) then deletes via
  FilterSelector(filter=...).

EpisodicMemoryQdrantRepository:
- create_and_save_episodic_memory() — convenience constructor that builds
  a PointStruct and upserts, returns the same lightweight summary dict
  as the Milvus repository for caller parity.
- vector_search() with full scope + time-range filters.
- delete_by_filters() — batch delete by user_id/group_id/time-range;
  same MAGIC_ALL guard and 'at least one filter required' contract as
  the Milvus repository.
Resolves the full set of findings from a CodeRabbit code review on
feature/qdrant-adapter (1 critical, 10 major, 19 minor).

**Critical**
- episodic_memory_qdrant_repository.delete_by_filters: scroll with hard-
  coded limit=10_000 could undercount large tenants. Replaced with an
  exact client.count() call so the returned delete-count reflects the
  full set.

**Major**
- agent_skill_qdrant_repository.delete_by_cluster_id: same scroll-limit
  bug. Same fix (exact count + filter delete) and re-raise on error.
- episodic_memory_qdrant_repository.vector_search: two-stage score
  gating now uses min(radius, score_threshold) for server-side
  filtering so a wider radius doesn't accidentally make the server cut
  stricter than the client-side post-filter.
- AgentCase / AgentSkill / EpisodicMemory / Foresight / AtomicFact
  converters: explicit non-empty vector validation (raise instead of
  silently writing an empty list) and explicit id None-guard
  (no more str(None) -> 'None' point ids).
- foresight_qdrant_converter._parse_time_field: 'if not time_value'
  treated epoch 0 as missing. Now 'if time_value is None'.
- atomic_fact_qdrant_converter: _build_search_content is now actually
  written into the point payload (was dead code); vector access uses
  getattr defensive; type fallback only when source.type is None
  rather than any falsy value.
- base_repository.find_by_id / find_by_ids / delete_by_id: stop
  swallowing all exceptions. Errors are logged and re-raised so callers
  can distinguish 'not found' from operational failures. Behaviourally
  consistent with the rest of the base methods (upsert/delete_batch/
  search) that already raised on failure.
- qdrant_collection_base.drop: log + re-raise instead of swallow.
- config_utils._load_qdrant_env and qdrant_client_factory.get_qdrant_config:
  safe int(QDRANT_PORT) with try/except + TCP range guard (1-65535).
- qdrant_client_factory: URL assembly preserves an already-schemed host
  verbatim (e.g. 'https://my-qdrant.cloud') instead of force-prefixing
  http:// and double-appending the port.
- qdrant_client_factory.get_client: threading.Lock with double-checked
  locking eliminates the cache-miss race that could create duplicate
  QdrantClient instances under concurrent FastAPI requests.

**Minor**
- 6x collection docstrings: 'dim=1024' -> 'dim=VECTORIZE_DIMENSIONS'
  (no more drift if the constant changes).
- 2x ValueError messages: 'cannot be empty' -> 'cannot be None' to
  match the actual 'is None' guard.
- config_utils.get_qdrant_connection_cache_key: api_key.encode now
  tolerates bytes/str/other.
- agent_skill_qdrant_repository.vector_search: 'user_id is None' now
  skips the filter entirely instead of matching the empty string.
- qdrant_client_factory.get_named_client: cache key normalized via
  .lower() so 'Default'/'DEFAULT'/'default' share one client.
- qdrant_client_factory.get_qdrant_config: 'https_raw' now uses the
  _env helper consistently with the other env vars.
- 2x qdrant_collection_base: 'assert cfg is not None' replaced with
  explicit RuntimeError so the guard survives python -O.

Total: 17 files changed, ~30 distinct fixes.
Adds repositories 3 and 4 of 6. Both apply all CodeRabbit-derived patterns
established in the previous Phase 2.5 fix-pass:

- user_id/MAGIC_ALL guard skips the filter when None/empty (no spurious
  user_id == '' match).
- Two-stage score gating: server-side passes min(radius, score_threshold)
  so a wider radius cannot accidentally tighten the cut; client-side post-
  filter enforces the hard caller minimum.
- client.count(exact=True) for delete-by-filter return values (not a
  bounded scroll page).
- All error paths re-raise after a structured log.

AgentCaseQdrantRepository:
- vector_search() with scope (user_id / session_id / group_ids / parent_id)
  and time-range filters in epoch seconds (parity with the AgentCase
  converter and the Milvus repository — agent_case is the one collection
  storing seconds, not milliseconds).
- Returns a datetime for timestamp (round-tripped from epoch
  seconds with tz=UTC).

AtomicFactQdrantRepository:
- create_and_save_atomic_fact() convenience constructor: builds the
  PointStruct (with empty vector validation), upserts, returns the
  Milvus-shaped summary dict for caller parity.
- vector_search() with full scope + time-range filters in epoch ms.
- batch_vector_search_by_parent_ids(): MRAG-Phase-3 expansion path —
  MatchAny over parent_ids with total_limit = limit * len(parent_ids).
  Returns early with an empty list when no parent_ids are passed.
- delete_by_filters(): uses exact count + filter-based delete; raises on
  any operational error so callers can distinguish 'no points' from
  failure.
- All search paths return datetime for timestamp (consistent with
  create_and_save_atomic_fact, parity with agent_case).
…s (Phase 2.5 complete)

Final 2 of 6 Qdrant repositories. Phase 2.5 is now structurally complete:
all six adapters (agent_skill, episodic_memory, agent_case, atomic_fact,
foresight, user_profile) exist as Qdrant repositories matching their
Milvus counterparts.

All four established Phase-2.5 patterns applied:
- user_id/MAGIC_ALL guard skips the filter on None/empty
- Two-stage score gating: min(radius, score_threshold) server-side
- client.count(exact=True) for delete-by-filter return values
- Re-raise after structured log (no swallowed errors)

ForesightQdrantRepository:
- create_and_save_foresight_mem(): convenience constructor + upsert.
  session_id is now an explicit parameter (writes into payload) so the
  matching vector_search(session_id=...) filter actually hits — the
  Milvus repository signature lacked this parameter and produced a
  silent zero-hit filter.
- vector_search() with scope (user_id/group_ids/session_id/sender_id/
  parent_type/parent_id) + time-range filters.
  **Diverges from Milvus**: filters on start_time/end_time payload
  fields (semantically correct range overlap) instead of the Milvus
  repository's non-existent 'timestamp' field. Documented inline.
  sender_id filter uses Qdrant's element-wise MatchValue on the
  sender_ids array — equivalent to Milvus' array_contains.
- delete_by_filters() with the same start_time/end_time semantics.

UserProfileQdrantRepository:
- vector_search() with user_id/group_id/scenario scoping (no
  session_id — user_profile is user-level aggregation).
- delete_by_user_group(): count + filter-based delete, raises on
  operational error (consistent with the Phase 2.5 fix-pass on
  base_repository).
CodeRabbit pass 2 found 7 follow-up findings after pass 1. All addressed:

**Major**
- Timezone-naive datetimes in time-range filters silently used the local
  timezone for .timestamp() conversion, producing wrong epoch values.
  Added module-level helpers in base_repository: to_epoch_ms(dt) and
  to_epoch_s(dt) which coerce tz-naive datetimes to UTC. All five
  repositories now use these helpers consistently:
    - agent_case (epoch seconds via to_epoch_s)
    - atomic_fact / episodic_memory / foresight (epoch ms via to_epoch_ms)
    - user_profile (no time fields, untouched)

**Minor**
- foresight create_and_save_foresight_mem: start_time/end_time fall back
  to None instead of 0 when missing — 0 would silently match epoch-1970
  records. Documented inline.
- base_repository.count(): wrapped in try/except + structured log to
  match the rest of the async methods.
- atomic_fact create_and_save: vector validation now explicit None/empty
  check (`if vector is None or len(vector) == 0`) instead of `if not
  vector`, so a legitimate all-zero embedding is no longer falsy-rejected.
- user_profile_qdrant_repository.vector_search: group_id and scenario
  filters now mirror user_id's MAGIC_ALL guard (skip filter on MAGIC_ALL
  sentinel) instead of treating MAGIC_ALL as a literal value to match.

Note: the foresight repository's two-stage score-gating pattern was
flagged as 'redundant filtering'. It is intentional — server-side uses
the more permissive bound (radius widening) and the client-side post-
filter enforces the caller's hard cut. The behaviour is documented in
the inline comment block; the CodeRabbit finding is a false positive.
… Qdrant)

Adds devops_scripts/migrate_milvus_to_qdrant.py, the workhorse for the
Phase 3 cutover. Standalone: no EverOS DI container required.

Reads OpenRouter, Mongo, Qdrant config from env (auto-loads .env via
python-dotenv when present). Migrates one (mongo-db, mongo-collection)
pair to one Qdrant collection per invocation; shell-loop over the six
EverOS collection types × N tenants for the full sweep.

Defaults match the documented xinfty stack:
  - VECTORIZE_MODEL=qwen/qwen3-embedding-8b
  - VECTORIZE_DIMENSIONS=4096
  - OPENROUTER_BASE_URL=https://openrouter.ai/api/v1
  - MONGO_URI=mongodb://localhost:27017
  - QDRANT_HOST=localhost, QDRANT_PORT=6333

CLI args expose the per-collection variation:
  --text-field         primary text used for embedding
  --extra-text-fields  comma-separated secondary text fields
  --timestamp-field    + --timestamp-unit ms|s
  --payload-fields     comma-separated mongo fields projected to qdrant payload
  --batch-size         embedding batch size (default 32)
  --limit              cap for smoke tests
  --force              re-embed and overwrite existing points
  --dry-run            count without calling OpenRouter or Qdrant.upsert
  --log-level          DEBUG/INFO/WARNING/ERROR

Idempotent by default: client.retrieve filters out point ids that
already exist in the target Qdrant collection (skip path); --force
overwrites them.

Embedded behaviour:
  - extract_text concatenates text_field + extra_text_fields with newlines
  - build_payload projects payload_fields + normalizes timestamp via
    datetime.timestamp() (epoch ms or s depending on --timestamp-unit)
  - search_content is JSON-serialized from the text pieces, mirroring the
    converter's payload shape used by EverOS' search service.

ensure_qdrant_collection creates the target collection with the same
HNSW/Cosine config the EverOS adapter writes (m=16, ef_construct=200,
distance=Cosine) so the schema matches what the live service expects.
Qdrant only accepts unsigned-int or RFC-4122-UUID point ids; the Mongo
ObjectId hex (e.g. 69ed6acfaf31e5cd7977bc56) is neither and the live
pilot hit a 400 Bad Request from the Qdrant retrieve endpoint.

Fix:
- base_repository: add mongo_id_to_qdrant_id() helper that does
  str(uuid.uuid5(NAMESPACE, str(mongo_id))). Namespace is a fixed UUID
  embedded in code (must never change without a full re-migration).
- migrate_milvus_to_qdrant.py: use the helper for the Qdrant point id;
  keep the original Mongo id in the payload as 'mongo_id' for reverse
  lookup.

The Phase-2 converters (search/qdrant/converter/*) still use
str(source_doc.id) and would fail the same way the moment they go live.
That fix is the next commit.
ModuleNotFoundError: 'core' when invoked as
    python src/devops_scripts/migrate_milvus_to_qdrant.py ...

Fix: prepend the script's parent-of-parent (the EverOS src/ tree) to
sys.path before the core.oxm.qdrant.base_repository import. Lets the
script run without PYTHONPATH or pip install.
The Phase 2 converters used ``id=str(source_doc.id)`` directly when building
``PointStruct`` payloads. Qdrant only accepts unsigned integers or RFC-4122
UUIDs as point ids — a 24-hex-char Mongo ``ObjectId`` is neither, so the
Live-indexing path produced a 400 Bad Request on upsert.

The standalone re-embed CLI (commit c17ba60) already routed Mongo ids
through ``mongo_id_to_qdrant_id`` (uuid5 over a stable namespace), so the
two paths now agree:

- standalone migrate: Mongo doc -> uuid5 point id
- live converter:     Mongo doc -> uuid5 point id (this commit)

Both also persist the raw Mongo id in the payload as ``mongo_id`` for
round-trip lookup, idempotent re-embed, and debugging.

``user_profile_qdrant_converter`` is intentionally left untouched: it
emits multiple points per source doc (one per explicit_info/implicit_trait
entry), assigns fresh ObjectIds, and has its own Phase 3.1 path that needs
a separate point-id scheme.
Adds ``re_embed_sweep.py`` next to the standalone ``migrate_milvus_to_qdrant``
workhorse. The wrapper iterates every active (non-hyphen) Mongo database
times every supported collection type and invokes ``migrate(...)`` for each
non-empty pair.

Five collection types are covered:

- episodic_memory  (v1_episodic_memories  -> <prefix>_v1_episodic_memory)
- atomic_fact      (v1_atomic_fact_records -> <prefix>_v1_atomic_fact_record)
- foresight        (v1_foresight_records  -> <prefix>_v1_foresight_record)
- agent_case       (v1_agent_cases        -> <prefix>_v1_agent_case)
- agent_skill      (v1_agent_skills       -> <prefix>_v1_agent_skill)

``v1_user_profiles`` is deliberately excluded — it needs per-doc splitting
(one source doc -> many Qdrant points), handled by a separate Phase 3.1
script.

CLI shape mirrors the workhorse: ``--tenant``, ``--collection``,
``--batch-size``, ``--limit-per-pair``, ``--force``, ``--dry-run``,
``--log-level``. The wrapper imports ``migrate`` directly (no subprocess
fan-out) so config is read once and progress logs interleave naturally.
Critical
- ``migrate_milvus_to_qdrant`` now wraps the migration body in try/finally
  and closes ``mongo``, ``qdrant``, and ``openai`` clients explicitly. Long
  sweeps previously leaked connections on every per-pair invocation.

Major
- ``QdrantCollectionBase.exists`` only catches the qdrant-client transport
  exceptions (``ResponseHandlingException``, ``UnexpectedResponse``); other
  failures propagate so infrastructure issues stay visible instead of being
  silently treated as "collection missing".
- ``EpisodicMemoryQdrantRepository.create_and_save_episodic_memory`` rejects
  missing/empty vectors up front with a ``ValueError`` (mirrors the converter
  contract) instead of bubbling up a confusing 400 from Qdrant.
- ``EpisodicMemoryQdrantRepository.vector_search`` no longer treats the
  default ``user_id=None`` as a filter on the empty string. The condition
  now requires the caller to have provided an explicit (non-None,
  non-MAGIC_ALL) value, restoring full-recall behaviour for the unscoped
  search path.
- ``AtomicFactQdrantRepository.vector_search`` now returns ``atomic_fact``
  in the result dict (parity with the batch path and with the persisted
  payload from ``create_and_save_atomic_fact``).

Minor
- ``AgentSkillQdrantRepository.vector_search`` two-stage threshold now uses
  ``min(radius, score_threshold)`` server-side, matching the agent_case
  repository's "more permissive lower bound" semantics.

Nitpick
- ``logger.error`` in every except block touched by Phase 2/3 swapped for
  ``logger.exception`` (10 sites across converters, repositories, and the
  sweep wrapper) so the stack trace is always logged.
- ``build_payload`` now logs a warning when a timestamp field has an
  unexpected type instead of silently dropping the value.
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 13, 2026

Review Change Stack

Warning

Rate limit exceeded

@Ptah-CT has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 34 minutes and 19 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 653ab36c-576b-461c-92a2-bd823051d450

📥 Commits

Reviewing files that changed from the base of the PR and between 51025b2 and 5d1be9c.

📒 Files selected for processing (9)
  • README.md
  • methods/EverCore/src/core/lifespan/qdrant_lifespan.py
  • methods/EverCore/src/core/oxm/qdrant/base_repository.py
  • methods/EverCore/src/core/oxm/qdrant/qdrant_collection_base.py
  • methods/EverCore/src/core/tenants/tenantize/oxm/qdrant/config_utils.py
  • methods/EverCore/src/devops_scripts/migrate_milvus_to_qdrant.py
  • methods/EverCore/src/devops_scripts/re_embed_sweep.py
  • methods/EverCore/src/infra_layer/adapters/out/search/repository/atomic_fact_qdrant_repository.py
  • methods/EverCore/src/infra_layer/adapters/out/search/repository/foresight_qdrant_repository.py
📝 Walkthrough

Walkthrough

Dieses PR implementiert das Qdrant-Vector-Backend: client-factory, Lifespan-Provider, tenant-aware Konfiguration, Collection-Basics, Domänen‑Konverter, async Repositories und Migrations-CLI-Tools.

Changes

Qdrant Vector Backend Migration

Layer / File(s) Summary
Docs and Dependency
README.md, methods/EverCore/pyproject.toml
Top-of-file README-Note zur Milvus→Qdrant-Fork; qdrant-client>=1.12,<2 als neue Abhängigkeit.
Tenant-aware config utilities
methods/EverCore/src/core/tenants/tenantize/oxm/qdrant/config_utils.py
Tenant-Auflösung für Qdrant storage, fallback auf Milvus-Entry, tenant-prefixed collection-name resolver und deterministic connection cache key; sichere env-parsing.
Qdrant client factory and config
methods/EverCore/src/core/component/qdrant_client_factory.py
Env-get_qdrant_config, boolean/port parsing, timeout clamping; QdrantClientFactory with alias-normalized cache, double-checked locking, get_default/get_named and close_all_clients.
FastAPI Lifespan provider
methods/EverCore/src/core/lifespan/qdrant_lifespan.py
LifespanProvider gated by VECTOR_STORE_BACKEND; discovers QdrantCollectionBase subclasses, groups by _DB_USING, creates named clients and calls ensure_all(); shutdown closes clients and cleans state.
Collection base & IndexConfig
methods/EverCore/src/core/oxm/qdrant/qdrant_collection_base.py
IndexConfig dataclass, SDK-mappings, QdrantCollectionBase with exists, ensure_collection, ensure_payload_indexes, ensure_all, plus upsert/search/delete/drop wrappers.
Base repository & helpers
methods/EverCore/src/core/oxm/qdrant/base_repository.py
Deterministic mongo_id_to_qdrant_id (UUID5), to_epoch_ms/to_epoch_s, compute_effective_threshold, and async BaseQdrantRepository with thread-wrapped CRUD/search/count.
BaseQdrantConverter
methods/EverCore/src/core/oxm/qdrant/base_converter.py
Abstract generic converter base with runtime generic-type extraction and from_mongo contract.
Tenant-aware collection with suffix
methods/EverCore/src/core/tenants/tenantize/oxm/qdrant/tenant_aware_qdrant_collection_with_suffix.py
Collection-per-tenant naming with optional suffix resolved from constructor or SELF_QDRANT_COLLECTION_NS; exposes name, base_name, suffix.
Domain converters
methods/EverCore/src/infra_layer/adapters/out/search/qdrant/converter/*
Converters for AgentCase, AgentSkill, AtomicFact, EpisodicMemory, Foresight, UserProfile to produce Qdrant PointStructs or payload items with validation, timestamp normalization and search_content.
Collection definitions (memory)
methods/EverCore/src/infra_layer/adapters/out/search/qdrant/memory/*
Tenant-aware collection subclasses with _COLLECTION_NAME and _VECTOR_PARAMS (IndexConfig), payload index declarations and docstrings.
Async Qdrant repositories
methods/EverCore/src/infra_layer/adapters/out/search/repository/*_qdrant_repository.py
Repositories implementing vector_search (Qdrant Filter construction, server-side effective threshold, client-side cutoff), create/save methods, batch helpers and delete-by-filters using exact count+delete.
Migration CLI tools
methods/EverCore/src/devops_scripts/migrate_milvus_to_qdrant.py, methods/EverCore/src/devops_scripts/re_embed_sweep.py
Single-collection migration CLI with OpenRouter-compatible embeddings, idempotent ID handling, batching and dry-run; sweep script enumerates tenants and runs migrations per spec.

Sequence Diagram(s)

sequenceDiagram
  participant App as FastAPI App
  participant Lifespan as QdrantLifespanProvider
  participant Factory as QdrantClientFactory
  participant Discovery as CollectionDiscovery
  participant Collection as QdrantCollectionBase
  participant Qdrant as QdrantClient

  App->>Lifespan: startup(VECTOR_STORE_BACKEND=qdrant)
  Lifespan->>Discovery: find QdrantCollectionBase subclasses
  Discovery-->>Lifespan: grouped by _DB_USING
  Lifespan->>Factory: get_named_client(using)
  Factory->>Qdrant: create/cached QdrantClient
  Lifespan->>Collection: ensure_all()
  Collection->>Qdrant: collection_exists/create_collection/create_payload_index
  App->>Lifespan: shutdown
  Lifespan->>Factory: close_all_clients()
  Factory->>Qdrant: close() per client
Loading
sequenceDiagram
  participant Repo as Repository
  participant Thread as asyncio.to_thread
  participant Collection as QdrantCollectionBase
  participant Qdrant as QdrantClient

  Repo->>Repo: build Qdrant Filter & effective_threshold
  Repo->>Thread: call collection.search(query, filter, score_threshold)
  Thread->>Collection: search()
  Collection->>Qdrant: query_points(...)
  Qdrant-->>Collection: response.points
  Thread-->>Repo: List[ScoredPoint]
  Repo->>Repo: apply client-side cutoff, map payload, convert timestamps
  Repo-->>Client: results
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 Ich hüpf' durch Code und Wald so sacht,
Qdrant bringt Ordnung über Nacht,
Tenant-Names wie Möhren fein,
Converter, Repos, alles rein,
Migration läuft — hurra, geschafft!

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch qdrant/rebase-evercore

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 19

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@methods/EverCore/src/core/component/qdrant_client_factory.py`:
- Around line 88-92: Die Timeout-Parsing-Logik in QdrantClientFactory (z.B. in
der Methode create_qdrant_client / der Stelle, wo die Variable timeout geparst
wird) muss neben TypeError/ValueError auch auf gültige Werte prüfen: erzwinge
dass der geparste timeout ein positiver, sinnvoller Wert ist (z.B. timeout >= 1
und <= MAX_TIMEOUT_SECONDS wie 300) und bei Verletzung dieser Range entweder
eine ValueError werfen oder auf einen Default-Wert zurückfallen; passe die
bestehende try/except-Logik an, sodass negative, 0 oder extrem große Werte nicht
stillschweigend akzeptiert werden (verwende eindeutige Namen wie timeout,
MAX_TIMEOUT_SECONDS und die Factory-Methode create_qdrant_client zur
Lokalisierung).
- Around line 94-103: The comment claiming the qdrant-client SDK will infer
scheme when https=None is wrong because the code constructs url =
f"{scheme}://{host}:{port}" which forces the scheme; update the factory logic in
qdrant_client_factory (e.g., the function creating the client such as
create_qdrant_client / QdrantClientFactory) so that when QDRANT_HTTPS is unset
(https is None) you do NOT build a full url string but instead pass host and
port (and other kwargs like prefer_grpc/token) directly to the Qdrant client
constructor so the SDK can infer scheme, or alternately correct the inline
comment to accurately state current behavior; ensure you reference the
https/QDRANT_HTTPS flag, the variables scheme, url, host and port when making
the change.

In `@methods/EverCore/src/core/lifespan/qdrant_lifespan.py`:
- Around line 123-124: Der Shutdown/Cleanup darf nicht von einem Env-Flag
abhängen; ändere die Logik in der qdrant_lifespan.py so dass die
Shutdown-/cleanup-Methode (z.B. shutdown, cleanup oder __aexit__/__exit__) nicht
`os.environ` oder ein Environment-Flag abfragt, sondern den tatsächlichen
Initialisierungszustand prüft: setze beim erfolgreichen Initialisieren (z.B. in
initialize, start oder __aenter__) ein eindeutiges Attribut wie
self._initialized = True und/oder self._client (z.B. self._qdrant_client) und
prüfe beim Beenden ausschließlich dieses Attribut (und setzte es nach dem
Schließen auf False), sodass Schließen idempotent und nicht von zwischenzeitlich
geänderten Env-Variablen abhängig ist; stelle sicher, dass alle Referenzen im
Shutdown/Cleanup (z.B. die Existenz von self._qdrant_client oder
self._initialized) verwendet werden, statt das Env-Flag zu lesen.
- Around line 117-120: Beim Startup in qdrant_lifespan.py fehlt ein Rollback:
wenn die Collection-Initialisierung (z.B. in der Methode
startup/initialize_collections oder bei Aufrufen wie
initialize_collection/init_collection) nach bereits erstellten Qdrant-Clients
fehlschlägt, werden die offenen Clients nicht geschlossen. Ändere die
Startup-Logik so, dass bei einem Fehler während der Collection-Initialisierung
alle zuvor erstellten Clients (z.B. in der Liste/Variable clients oder
self._clients, und einzelne Client-Instanzen erstellt via create_client /
_create_client) sauber geschlossen/geschlossen werden (z.B. client.close() oder
client.shutdown()) bevor der Fehler erneut geworfen wird, und stelle sicher,
dass sowohl synchroner als auch asynchroner Cleanup (await client.close() falls
async) abgedeckt ist und die Sammlung der Clients danach geleert wird.

In `@methods/EverCore/src/core/oxm/qdrant/base_converter.py`:
- Around line 66-82: The from_mongo return annotation is wrong: it declares
QdrantCollectionType but the docstring and concrete converters (e.g.,
AgentCaseQdrantConverter, AtomicFactQdrantConverter) return a
qmodels.PointStruct/dict; fix by adding a type alias (e.g., QdrantPointType =
qmodels.PointStruct | dict) in base_converter.py and change the from_mongo
signature to return that alias (and update its docstring to reference
QdrantPointType/PointStruct), ensuring all implementations' return types now
match the declared public API.

In `@methods/EverCore/src/core/oxm/qdrant/base_repository.py`:
- Around line 42-50: The code currently accepts None/empty Mongo IDs and
deterministically maps them to the same Qdrant point, causing silent collisions;
in the BaseQdrantRepository method that converts Mongo IDs to Qdrant point IDs
(e.g., to_qdrant_point_id / get_point_id_from_mongo_id), add explicit validation
that rejects None/empty strings and invalid ObjectId values (use
bson.ObjectId.is_valid or equivalent) and raise a clear exception (ValueError or
custom) instead of returning a mapped id; update callers to handle/report the
error so bad inputs are rejected early.

In `@methods/EverCore/src/core/oxm/qdrant/qdrant_collection_base.py`:
- Around line 219-244: Die Existenzprüfung mittels exists() gefolgt von
create_collection() ist nicht threadsicher — zwei parallele Prozesse können
beide False sehen und versuchen zu erstellen; ändere den Aufruf von
self.client().create_collection(...) (in der Methode, die self.name und
cfg.to_vectors_config() verwendet) so, dass der Parameter if_not_exists=True
gesetzt wird, damit create_collection idempotent wird und bei bereits
existierender Collection True zurückgibt; belasse sonstige Argumente
(collection_name=self.name, vectors_config=cfg.to_vectors_config()) unverändert.
- Around line 196-204: Die Methode collection_exists() fängt aktuell
UnexpectedResponse und maskiert damit Auth/Server-Fehler; ändere die
Fehlerbehandlung so, dass nur Transport-/Verbindungsfehler
(ResponseHandlingException) abgefangen und zu False führen, während
UnexpectedResponse (z.B. 401/403/5xx) nicht abgefangen, sondern propagiert wird,
damit ensure_collection() die tatsächlichen HTTP-Statusfehler sehen kann; passe
die except-Klausel in collection_exists() an (Referenz: collection_exists,
ensure_collection, ResponseHandlingException, UnexpectedResponse) und
entferne/ändere das aktuelle UnexpectedResponse-Handling entsprechend.
- Around line 254-256: Ersetze die Verwendung von "assert cfg is not None" durch
eine explizite Laufzeitprüfung: prüfe direkt auf "if cfg is None" und werfe eine
aussagekräftige RuntimeError-Exception (z.B. RuntimeError("collection config
missing, cannot access payload_indexes")) bevor auf cfg.payload_indexes
zugegriffen wird; siehe das Muster in ensure_collection() für Formulierung und
Platzierung der Prüfung und passe die Nachricht so an, dass sie den Kontext (z.
B. Funktionsname oder Collection-ID) enthält.

In `@methods/EverCore/src/core/tenants/tenantize/oxm/qdrant/config_utils.py`:
- Around line 111-126: The cache key generation currently only uses
host/port/url and api_key (see endpoint, api_key and sha256 usage) and thus
ignores transport flags; update the key construction to incorporate
tenant-specific transport flags (e.g., config.get("https"),
config.get("prefer_grpc") or similar keys) so the endpoint string includes
normalized values for those flags (true/false) before appending the api_key
hash, ensuring cached QdrantClient instances respect https and prefer_grpc
differences.

In `@methods/EverCore/src/devops_scripts/migrate_milvus_to_qdrant.py`:
- Around line 273-282: Initialize the openai variable to None before the try
block (just like qdrant) so it cannot be unbound in the finally; specifically
add openai: Optional[OpenAI] = None above the try, then keep the OpenAI(...)
assignment inside the try and in the finally use getattr(openai, "close", None)
safely (or check if openai is not None) to close it. Update any other similar
OpenAI usages in the file (e.g., the block referenced at lines ~400-417) to
follow the same pattern.
- Around line 210-229: Die Funktion ensure_qdrant_collection überspringt aktuell
die Schemaprüfung, wenn die Collection bereits existiert; ändere sie so dass
nach client.collection_exists(name) die bestehende Collection-Metadaten via
client.get_collection(name) gelesen werden und deren Vektor-Parameter (z.B.
response.vectors?.config.size oder entsprechendes Feld) mit dem übergebenen
vector_size verglichen werden; bei Abweichung soll die Funktion sofort mit einem
klaren Fehler-/Exception-Log abbrechen (Fail-Fast) statt die Nutzung zu
erlauben, ansonsten unverändert weiterlaufen und nur wenn die Collection gar
nicht existiert client.create_collection mit
qmodels.VectorParams(size=vector_size, distance=..., hnsw_config=...) aufrufen.
- Around line 291-294: The MongoDB cursor opened by coll.find() can time out
during long migrations; change the call to coll.find(no_cursor_timeout=True)
(preserving the existing limit logic) and ensure the cursor is always closed by
wrapping its usage in a context/cleanup block — e.g., use "with
coll.find(no_cursor_timeout=True) as cursor:" if supported or surround cursor
iteration with try…finally and call cursor.close() in finally; update any
function that iterates the cursor (the variable cursor and the coll.find call)
so long-running tenant migrations won't raise CursorNotFound.

In `@methods/EverCore/src/devops_scripts/re_embed_sweep.py`:
- Around line 256-269: The sweep currently logs failures and increments
pairs_failed but still returns exit code 0; modify main() so that after the
final logger.info (the "SWEEP DONE" summary where pairs_run,
pairs_skipped_empty, pairs_failed and elapsed are logged) you check if
pairs_failed > 0 and propagate a non‑zero exit (e.g., call sys.exit(1) or raise
SystemExit(1)); ensure sys is imported if missing. Apply the same change to the
second summary block referenced around lines 305–322 so any partial failures
cause a non‑zero exit code.

In
`@methods/EverCore/src/infra_layer/adapters/out/search/qdrant/converter/atomic_fact_qdrant_converter.py`:
- Line 78: The assignment to variable "vector" in
atomic_fact_qdrant_converter.py uses a redundant fallback; replace the
expression 'getattr(source_doc, "vector", None) or None' with just
'getattr(source_doc, "vector", None)' (locate the assignment to vector in the
AtomicFactQdrantConverter or surrounding function/method) to remove the
unnecessary "or None" and keep behavior identical.

In
`@methods/EverCore/src/infra_layer/adapters/out/search/repository/episodic_memory_qdrant_repository.py`:
- Around line 225-239: The timestamp in the search_results dict is currently
returned as a raw epoch value; update the code that builds search_results
(inside the EpisodicMemoryQdrantRepository search method that appends to
search_results) to normalize payload.get("timestamp") into a Python datetime: if
it's an int/float treat it as epoch and convert with datetime.fromtimestamp(...,
tz=timezone.utc), if it's already a datetime leave it as-is; add the necessary
import for datetime/timezone and replace the raw payload.get("timestamp")
assignment in the "timestamp" key with the normalized datetime value so this
repository matches the other Qdrant repositories' behavior.

In
`@methods/EverCore/src/infra_layer/adapters/out/search/repository/foresight_qdrant_repository.py`:
- Around line 147-152: The current time-filter logic uses containment
(payload.start_time >= start_ms && payload.end_time <= end_ms) which misses
overlapping records; update the filters in the ForesightQdrantRepository methods
that build time predicates (e.g., the search/lookup method and the
delete-by-time-range method) to use overlap semantics: use payload.end_time >=
start_time_ms AND payload.start_time <= end_time_ms (replace the existing
start_time>= / end_time<= checks), and apply the same change to any other
occurrences (the other filter-building block around the indicated ranges).
Ensure the predicates use the record field names payload.start_time and
payload.end_time and the converted query start_time_ms/end_time_ms variables.

In
`@methods/EverCore/src/infra_layer/adapters/out/search/repository/user_profile_qdrant_repository.py`:
- Around line 167-181: The logger calls in user_profile_qdrant_repository.py are
emitting raw user_id and group_id (in the block around the delete logic and the
exception handler), leaking PII; change those logger calls to avoid plaintext
IDs by replacing user_id and group_id with deterministic redacted identifiers
(e.g., a short hex digest or masked form derived from hashing the ID values) and
use the redacted values in both the success info log and the error log/exception
logging (call sites: the logger.info and logger.error inside the delete profile
items flow); ensure you do not print the original IDs anywhere in the message or
exception metadata and keep the exception logging using logger.exception or
logger.error with the redacted ids plus the exception object for debugging.

In `@README.md`:
- Around line 21-24: Unvollständige Pfadangabe für den Qdrant-Adapter in
README.md kann Entwickler irreführen; erweitere die Pfaddokumentation, indem du
explizit beide relevanten Verzeichnisse nennst: `src/core/oxm/qdrant/`
(Adapter-Schnittstellen/Domain-Mapping) und
`src/infra_layer/adapters/out/search/qdrant/` (konkrete Implementierungen wie
Converter, Collections, Repositories). Füge in README.md eine kurze Liste hinzu,
die die genannten Komponenten (Converter, Collections, Repository-Klassen) und
ihren jeweiligen Pfad erwähnt, damit die Codebase-Navigation klar und
vollständig ist.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 415cf661-a731-4134-bdc6-365c171cef35

📥 Commits

Reviewing files that changed from the base of the PR and between 29d555c and bd17eb5.

⛔ Files ignored due to path filters (1)
  • methods/EverCore/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (34)
  • README.md
  • methods/EverCore/pyproject.toml
  • methods/EverCore/src/core/component/qdrant_client_factory.py
  • methods/EverCore/src/core/lifespan/qdrant_lifespan.py
  • methods/EverCore/src/core/oxm/qdrant/__init__.py
  • methods/EverCore/src/core/oxm/qdrant/base_converter.py
  • methods/EverCore/src/core/oxm/qdrant/base_repository.py
  • methods/EverCore/src/core/oxm/qdrant/qdrant_collection_base.py
  • methods/EverCore/src/core/tenants/tenantize/oxm/qdrant/__init__.py
  • methods/EverCore/src/core/tenants/tenantize/oxm/qdrant/config_utils.py
  • methods/EverCore/src/core/tenants/tenantize/oxm/qdrant/tenant_aware_qdrant_collection_with_suffix.py
  • methods/EverCore/src/devops_scripts/migrate_milvus_to_qdrant.py
  • methods/EverCore/src/devops_scripts/re_embed_sweep.py
  • methods/EverCore/src/infra_layer/adapters/out/search/qdrant/__init__.py
  • methods/EverCore/src/infra_layer/adapters/out/search/qdrant/converter/__init__.py
  • methods/EverCore/src/infra_layer/adapters/out/search/qdrant/converter/agent_case_qdrant_converter.py
  • methods/EverCore/src/infra_layer/adapters/out/search/qdrant/converter/agent_skill_qdrant_converter.py
  • methods/EverCore/src/infra_layer/adapters/out/search/qdrant/converter/atomic_fact_qdrant_converter.py
  • methods/EverCore/src/infra_layer/adapters/out/search/qdrant/converter/episodic_memory_qdrant_converter.py
  • methods/EverCore/src/infra_layer/adapters/out/search/qdrant/converter/foresight_qdrant_converter.py
  • methods/EverCore/src/infra_layer/adapters/out/search/qdrant/converter/user_profile_qdrant_converter.py
  • methods/EverCore/src/infra_layer/adapters/out/search/qdrant/memory/__init__.py
  • methods/EverCore/src/infra_layer/adapters/out/search/qdrant/memory/agent_case_collection.py
  • methods/EverCore/src/infra_layer/adapters/out/search/qdrant/memory/agent_skill_collection.py
  • methods/EverCore/src/infra_layer/adapters/out/search/qdrant/memory/atomic_fact_collection.py
  • methods/EverCore/src/infra_layer/adapters/out/search/qdrant/memory/episodic_memory_collection.py
  • methods/EverCore/src/infra_layer/adapters/out/search/qdrant/memory/foresight_collection.py
  • methods/EverCore/src/infra_layer/adapters/out/search/qdrant/memory/user_profile_collection.py
  • methods/EverCore/src/infra_layer/adapters/out/search/repository/agent_case_qdrant_repository.py
  • methods/EverCore/src/infra_layer/adapters/out/search/repository/agent_skill_qdrant_repository.py
  • methods/EverCore/src/infra_layer/adapters/out/search/repository/atomic_fact_qdrant_repository.py
  • methods/EverCore/src/infra_layer/adapters/out/search/repository/episodic_memory_qdrant_repository.py
  • methods/EverCore/src/infra_layer/adapters/out/search/repository/foresight_qdrant_repository.py
  • methods/EverCore/src/infra_layer/adapters/out/search/repository/user_profile_qdrant_repository.py

Comment thread methods/EverCore/src/core/component/qdrant_client_factory.py
Comment thread methods/EverCore/src/core/component/qdrant_client_factory.py
Comment thread methods/EverCore/src/core/lifespan/qdrant_lifespan.py
Comment thread methods/EverCore/src/core/lifespan/qdrant_lifespan.py Outdated
Comment thread methods/EverCore/src/core/oxm/qdrant/base_converter.py Outdated
Comment thread README.md Outdated
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bd17eb5303

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread methods/EverCore/src/core/oxm/qdrant/base_repository.py Outdated
Comment thread methods/EverCore/src/devops_scripts/migrate_milvus_to_qdrant.py Outdated
Comment thread methods/EverCore/src/devops_scripts/migrate_milvus_to_qdrant.py Outdated
Comment thread methods/EverCore/src/devops_scripts/migrate_milvus_to_qdrant.py
Comment thread methods/EverCore/src/devops_scripts/re_embed_sweep.py
Comment thread methods/EverCore/src/devops_scripts/migrate_milvus_to_qdrant.py Outdated
@DerAuctor
Copy link
Copy Markdown

@claude

@claude
Copy link
Copy Markdown

claude Bot commented May 13, 2026

Claude finished @DerAuctor's task in 4m 7s —— View job


PR Review

  • Gather context from PR description and existing reviews
  • Read key source files
  • Validate CodeRabbit/Codex findings against actual code
  • Write comprehensive review

This is a substantial PR (5107 lines, 35 files) adding a full Qdrant vector backend as an alternative to Milvus. The architecture is sound and follows existing patterns well. Below are issues grouped by severity, with validation notes on the existing bot findings.


🔴 Blockers

1. VECTORIZE_DIMENSIONS default mismatch will corrupt migrations

migrate_milvus_to_qdrant.py:110 defaults to 4096:

vectorize_dimensions=int(os.environ.get("VECTORIZE_DIMENSIONS", "4096")),

But memory_layer/constants.py:4 defaults to 1024:

VECTORIZE_DIMENSIONS = int(os.getenv("VECTORIZE_DIMENSIONS", "1024"))

And all six Qdrant collection classes use size=VECTORIZE_DIMENSIONS (runtime default: 1024). A migration run without setting VECTORIZE_DIMENSIONS explicitly will create 4096-dim Qdrant collections and points that are immediately incompatible with the 1024-dim runtime service. Every subsequent upsert or search will fail with a dimension-mismatch error. The default should be 1024. Fix this →

2. openai can be unbound in finally, masking real exceptions

migrate_milvus_to_qdrant.py:279 assigns openai inside the try block — after qdrant is constructed. The finally at line 404 references openai unconditionally:

close_fn = getattr(openai, "close", None)   # NameError if QdrantClient(...) threw

If the QdrantClient(...) constructor raises (e.g., unreachable host), openai is never bound. The resulting NameError in finally propagates and completely shadows the real connection error. Initialize it to None before the try like qdrant is. Fix this →

3. radius is silently discarded when score_threshold is at its default 0.0

All six vector_search methods share this logic (episodic_memory_qdrant_repository.py:203):

if radius is not None and radius > -1.0:
    effective_threshold = min(radius, score_threshold)   # = min(0.7, 0.0) = 0.0

When score_threshold=0.0 (the parameter default), min(radius, 0.0) = 0.0 → passed to Qdrant as None (no server filter), and the client-side check point.score < 0.0 never triggers either. A caller that passes radius=0.7 expecting only high-quality results gets everything back. The comment describes the intent correctly but the code breaks it for the most common call pattern. Fix: when score_threshold == 0.0, use radius as both the server-side threshold and the client-side hard cut-off.


🟠 High severity

4. Singleton-scoped repository caches the first tenant's collection name

base_repository.py:98-103 — the collection property lazily instantiates TenantAwareQdrantCollectionWithSuffix, which resolves the tenant-prefix at construction time and stores it as self._resolved_name. If the DI container caches @repository beans (singleton scope, which is the default in most Spring-style containers), the first request's tenant will "own" the cached collection for all subsequent tenants. Requests from a different tenant would silently read/write the wrong collection, leaking data across tenant boundaries. Verify the @repository bean scope, and if singleton, change the collection property to re-instantiate per-call or accept the tenant context at call time.

5. exists() swallows auth and 5xx errors

qdrant_collection_base.py:198 catches UnexpectedResponse:

except (ResponseHandlingException, UnexpectedResponse) as e:
    return False

The docstring on that same method (lines 189-194) explicitly states that auth failures should propagate. Per the Qdrant client source, collection_exists() raises UnexpectedResponse for 401/403/5xx responses — these get silently converted to return False, causing ensure_collection() to then try to create a collection, which produces an even more confusing follow-up error at startup. Change the except clause to only catch ResponseHandlingException (transport/timeout), and let UnexpectedResponse propagate. Fix this →

6. Migration sweep always exits 0, even on partial failures

re_embed_sweep.py:258-269: failed pairs increment pairs_failed but main() returns 0 unconditionally. A nightly sweep that migrated 80% of tenants before hitting a rate limit would report success to CI. Change sweep() to return pairs_failed, and have main() return 1 when pairs_failed > 0. Fix this →

7. Numeric timestamps not unit-converted during migration

migrate_milvus_to_qdrant.py:184-185:

elif isinstance(ts_value, (int, float)):
    payload[timestamp_field] = int(ts_value)  # verbatim — no unit conversion

For agent_case (timestamp_unit="s"), a source document storing epoch milliseconds (e.g., from a pre-existing Milvus collection) would be stored verbatim as ~1.7e12, not ~1.7e9. The runtime collection's range filters work in seconds. The conversion block above it correctly handles datetime objects; the numeric branch should apply the same / 1000 for ms targets or * 1000 for s targets where the source is in the other unit. At a minimum, document the assumption ("source is always in target unit") clearly.


🟡 Medium severity

8. assert cfg is not None stripped under python -O

qdrant_collection_base.py:255: the existing assert is removed by the optimizer, leaving the cfg.payload_indexes access one line below to raise an opaque AttributeError: 'NoneType' object has no attribute 'payload_indexes' in production builds with optimizations. The same class already uses if cfg is None: raise RuntimeError(...) in ensure_collection() — apply the same pattern here. Fix this →

9. MongoDB cursor can time out mid-migration

migrate_milvus_to_qdrant.py:291: coll.find() uses MongoDB's default cursor idle timeout (10 min server-side). With a 32-doc batch and slow embedding calls (OpenRouter can add seconds per batch), large collections will hit CursorNotFound mid-run. Use coll.find(no_cursor_timeout=True) and wrap cursor iteration in try/finally: cursor.close().

10. atomic_fact field missing from sweep spec's payload_fields

re_embed_sweep.py:89-100: the atomic_fact spec only copies user_id, group_id, session_id, etc. into the payload — but not atomic_fact itself. The AtomicFactQdrantConverter stores the text in search_content (JSON-encoded), so search results won't carry the raw fact text. Verify whether downstream callers read payload.get("atomic_fact") directly from Qdrant search results, and if so, add "atomic_fact" to payload_fields.


🟢 Minor / Polish

11. Shutdown guard depends on env-flag instead of initialization stateqdrant_lifespan.py:123: if not _backend_is_qdrant() or self._qdrant_factory is None. If the env var changes between startup and shutdown (e.g., in a test), the factory won't be closed. Remove the _backend_is_qdrant() check; self._qdrant_factory is None is sufficient.

12. from_mongo return type annotation is wrongbase_converter.py:66 declares return type QdrantCollectionType (bound to QdrantCollectionBase) but every concrete converter returns qmodels.PointStruct. The Generic bound is correct for documenting the target collection, but the return type annotation is misleading for IDE users.

13. Cache key ignores transport flagsconfig_utils.py:111-126: get_qdrant_connection_cache_key hashes only host/port/url + api_key, ignoring https and prefer_grpc. If two tenants share a host but need different transport settings, they'd get the same cached (potentially wrong) client.

14. PII in delete_by_user_group log linesuser_profile_qdrant_repository.py:167-181: raw user_id and group_id are emitted at INFO and ERROR level. If these are opaque UUIDs this is fine, but if they can be real user identifiers they'll land in your log aggregator unredacted.

15. Redundant or Noneatomic_fact_qdrant_converter.py:78: getattr(source_doc, "vector", None) or None — the or None is a no-op; remove it.


✅ Confirmed valid CodeRabbit findings I did NOT see issues with

  • The get_client() URL-vs-https parameter comment (lines 94-103 of the factory) is slightly misleading, but the actual client construction at lines 199-213 is correct: https=None is not passed to client_kwargs when unset, so the scheme in the URL controls TLS. The comment just needs clarifying, not a code change.
  • create_collection TOCTOU: if_not_exists=True is only available in newer qdrant-client versions. With >=1.12,<2 as the constraint this is worth adding but not a blocker.

Summary

The codebase quality is solid for its scope — good docstrings, consistent patterns with the Milvus counterpart, proper asyncio.to_thread wrapping. The blocking issues are all in the migration tooling (items 1-3) and are fixable quickly. The cross-tenant caching risk (item 4) depends on the DI scoping model and needs verification before cutover.

Ptah-CT added 2 commits May 13, 2026 21:59
B1 — migration default dimension mismatch
Default for ``VECTORIZE_DIMENSIONS`` in the migrate workhorse was ``4096``
while ``memory_layer/constants.py`` defaults to ``1024``. A migration run
with the env unset produced 4096-dim collections that the runtime service
(1024-dim) could not use, causing dimension-mismatch errors on every
subsequent operation. Both ends now default to ``1024``; sites running a
non-default dimension must set the env on both the migration and the
runtime — a mismatch is loud (collection-create fails) instead of silent.

B2 — ``openai`` and cursor could be unbound during finally cleanup
The cleanup block in ``migrate()`` referenced ``openai`` unconditionally,
but ``openai`` was only assigned inside the ``try`` after ``QdrantClient``
construction. If the Qdrant client failed to construct (unreachable host,
bad credentials), the ``finally`` raised ``NameError`` and masked the real
connection error. Pre-initialise ``openai`` and ``cursor`` to ``None``
before the ``try`` and guard each ``.close()`` call. ``coll.find()`` now
opens with ``no_cursor_timeout=True`` (CodeRabbit M2 / Claude M2) — slow
embedding batches (OpenRouter rate-limit, retry) easily exceed the 10-min
server-side cursor idle default and crashed long sweeps with
``CursorNotFound``.

B3 — ``min(radius, score_threshold)`` silently disabled filtering
All six ``vector_search`` methods used ``min(radius, score_threshold)``
server-side. With the parameter default ``score_threshold=0.0`` and any
explicit ``radius`` (e.g. ``radius=0.7``), the effective threshold
collapsed to ``0.0``, was then converted to ``None`` for Qdrant, and the
client-side ``point.score < score_threshold`` check (``< 0.0``) never
fired either — caller got everything back. Extracted a
``compute_effective_threshold(radius, score_threshold)`` helper in
``base_repository`` that uses the smaller *positive* bound (or ``None``
when neither is positive) and switched all six repositories to it. The
client-side hard cut-off is unchanged.

Also: ``mongo_id_to_qdrant_id`` now rejects ``None``/empty inputs so
upstream bugs do not silently collide on a single fixed Qdrant id
(CodeRabbit H10).
High (H1-H14, except H10 already in 9faac50)

H1 — Singleton repository cached tenant-bound collection
``BaseQdrantRepository.collection`` lazily cached the resolved tenant-
prefixed collection instance, so a DI-singleton repository would lock to
the first tenant that ever called it and silently route subsequent
tenants' reads/writes to the wrong Qdrant collection. The cache is gone;
``self.model()`` is invoked per-call and re-resolves the tenant suffix
each time.

H2 — Qdrant client cache key ignored transport flags
``get_qdrant_connection_cache_key`` now folds ``https`` and ``prefer_grpc``
into the key. Two tenants on the same host but with different transport
flags now get distinct cached clients.

H3 — ``exists()`` swallowed auth and 5xx errors
The previous catch was ``(ResponseHandlingException, UnexpectedResponse)``,
which converted 401/403/5xx HTTP responses into "collection missing".
``ensure_collection()`` then routed into a confusing create attempt. Only
``ResponseHandlingException`` (transport-level) is caught now; HTTP errors
propagate.

H4 — Sweep exited 0 on partial failure
``sweep()`` returns the failed-pair count; ``main()`` returns ``1`` when
any pair failed so cron / CI sees the run as red.

H5 / H13 — Numeric timestamps not unit-normalized
Extracted ``_normalize_timestamp_to_epoch`` (heuristic ms-vs-s detection
on numeric values). Applied to the primary ``timestamp_field`` AND to any
time-shaped values in ``payload_fields`` (e.g. foresight ``end_time``)
so callers don't see a mixed ``datetime`` / numeric population.

H6 — Dimension drift on pre-existing collections
Both the migration helper and the runtime ``ensure_collection`` now read
``get_collection`` and raise ``RuntimeError`` when the existing vector
size disagrees with the configured one. Loud failure replaces per-batch
"vector size mismatch" once data starts flowing.

H7 — TOCTOU between ``exists()`` and ``create_collection``
Idempotency for races: a 409 ``UnexpectedResponse`` from
``create_collection`` is swallowed (parallel process created it first);
other errors propagate.

H8 — Foresight time-range was containment, should have been overlap
``vector_search`` now filters ``payload.end_time >= q.start_time`` and
``payload.start_time <= q.end_time``. The previous filter dropped
foresights whose window only partially overlapped the query window.

H9 — Lifespan failure left clients open
On any error during ``startup``, the lifespan provider now calls
``close_all_clients()`` before re-raising. ``shutdown()`` gates on the
factory's initialization state, not the env flag (env flips between
startup and shutdown in tests would have leaked clients).

H11 — EpisodicMemory.vector_search returned raw epoch
Result dict now normalizes ``timestamp`` to ``datetime`` for parity with
the other Qdrant repositories.

H12 — PII in user_profile delete-path logs
``user_id`` and ``group_id`` are now SHA-256-fingerprinted (12-char
prefix) before logging instead of emitted in cleartext.

H14 — ``atomic_fact`` missing from sweep payload spec
Added to the atomic_fact sweep spec so migrated records carry the raw
text — ``AtomicFactQdrantRepository.vector_search`` reads it directly
out of the payload.

Medium / Trivial

M1 — ``assert cfg is not None`` stripped under ``python -O``: replaced
     with explicit RuntimeError in ``ensure_payload_indexes``.
M3 — QDRANT_TIMEOUT now clamped to [1, 300] s.
M4 — Shutdown no longer gates on env flag (see H9).
M5 — README path-list expanded to include the infra_layer Qdrant tree.
T1 — Removed redundant ``or None`` after ``getattr(..., None)`` in
     atomic_fact converter.
T2 — ``BaseQdrantConverter.from_mongo`` return annotation widened from
     the generic bound (which the concrete converters do not satisfy) to
     ``Any``; the generic still documents the target *collection*.
@Ptah-CT
Copy link
Copy Markdown
Author

Ptah-CT commented May 13, 2026

@coderabbitai review

All findings from the first review on this branch addressed across two commits:

9faac50 — Blockers (3)

  • B1: VECTORIZE_DIMENSIONS default aligned to runtime (1024)
  • B2: openai + cursor pre-init in migrate() so finally cannot raise NameError; no_cursor_timeout=True on the Mongo cursor
  • B3: compute_effective_threshold helper replaces the broken min(radius, score_threshold) in all 6 vector_search paths; mongo_id_to_qdrant_id rejects empty inputs

51025b2 — High/Medium/Trivial (14)

  • H1+H2: tenant-cache leak fixed; https and prefer_grpc added to the client cache key
  • H3: exists() only catches transport errors now (auth/5xx propagate)
  • H4: sweep returns pairs_failed and main() propagates non-zero exit
  • H5+H13: _normalize_timestamp_to_epoch applied to primary timestamp AND time-shaped payload fields
  • H6: dimension drift validated against existing Qdrant collections in both ensure_qdrant_collection and ensure_collection
  • H7: 409 UnexpectedResponse from concurrent create_collection is now idempotent
  • H8: foresight time-range switched from containment to overlap
  • H9: lifespan rolls back open clients on startup error; shutdown gates on factory state, not env
  • H11: EpisodicMemory.vector_search normalizes timestamp back to datetime
  • H12: user_id / group_id SHA-256-fingerprinted in user_profile logs
  • H14: atomic_fact added to sweep payload_fields
  • M1, M3, M4, M5, T1, T2 — addressed
  • Codex's 7 inline findings were subsumed under the Blockers/High above.

Please re-run.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 13, 2026

@Ptah-CT Thank you for the detailed summary of all fixes! The thoroughness of the patch notes — covering blockers, high/medium/trivials, and the absorbed Codex inline findings — is much appreciated. Kicking off a fresh review now.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

♻️ Duplicate comments (1)
methods/EverCore/src/infra_layer/adapters/out/search/repository/foresight_qdrant_repository.py (1)

8-13: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Zeitintervall-Semantik in delete_by_filters ist immer noch Containment — divergiert von vector_search (Overlap).

Der Overlap-Fix wurde nur in vector_search (Z. 220–236) eingebaut. delete_by_filters (Z. 319–332) filtert weiterhin auf key="start_time" gte und key="end_time" lte, also Containment. Der Docstring (Z. 297–299) behauptet jedoch, die Semantik entspreche vector_search. Effekt: ein Aufrufer, der vector_search(start, end) ausführt und anschließend mit denselben Argumenten delete_by_filters(start, end) aufruft, löscht eine andere Menge — nämlich nur die vollständig enthaltenen Foresights, während überlappende (die in der Suche auftauchen) bestehen bleiben. Das ist ein stiller Datenkonsistenz-Bug bei Migrationen/Cleanups.

Zusätzlich ist der Modul-Docstring (Z. 8–13) nach dem Overlap-Fix in vector_search veraltet — er beschreibt nach wie vor die alte Containment-Semantik (start_time arg maps to start_time >= ...).

Vorgeschlagener Fix für `delete_by_filters` + Modul-Docstring
-**Note on time filters:** the Foresight schema stores ``start_time`` and
-``end_time`` (both epoch milliseconds). The Milvus repository erroneously
-filters on a non-existent ``timestamp`` field; the Qdrant repository
-filters on ``start_time``/``end_time`` semantically — ``start_time`` arg
-maps to ``start_time >= ...`` and ``end_time`` arg to ``end_time <= ...``,
-which is the natural range-overlap semantic for a time-spanning record.
+**Note on time filters:** the Foresight schema stores ``start_time`` and
+``end_time`` (both epoch milliseconds). Filtering uses window-overlap
+semantics — the query's ``start_time`` arg maps to
+``payload.end_time >= ...`` and the query's ``end_time`` arg maps to
+``payload.start_time <= ...``. This matches the natural intent of "find
+records whose window intersects the query window" and is applied
+consistently across ``vector_search`` and ``delete_by_filters``.
         Batch delete by filter combination; at least one filter required.

-        Time-range semantic matches ``vector_search``: ``start_time`` arg
-        maps to ``payload.start_time >= ...``, ``end_time`` arg maps to
-        ``payload.end_time <= ...``.
+        Time-range semantic matches ``vector_search`` (window-overlap):
+        ``start_time`` arg -> ``payload.end_time >= start_time_ms``,
+        ``end_time`` arg  -> ``payload.start_time <= end_time_ms``.
         """
@@
-            if start_time:
-                conditions.append(
-                    qmodels.FieldCondition(
-                        key="start_time",
-                        range=qmodels.Range(gte=to_epoch_ms(start_time)),
-                    )
-                )
-            if end_time:
-                conditions.append(
-                    qmodels.FieldCondition(
-                        key="end_time",
-                        range=qmodels.Range(lte=to_epoch_ms(end_time)),
-                    )
-                )
+            if start_time:
+                conditions.append(
+                    qmodels.FieldCondition(
+                        key="end_time",
+                        range=qmodels.Range(gte=to_epoch_ms(start_time)),
+                    )
+                )
+            if end_time:
+                conditions.append(
+                    qmodels.FieldCondition(
+                        key="start_time",
+                        range=qmodels.Range(lte=to_epoch_ms(end_time)),
+                    )
+                )

Falls die Containment-Semantik in delete_by_filters bewusst beibehalten werden soll (z. B. konservativeres Löschverhalten), muss der Docstring entsprechend explizit machen, dass sich Such- und Lösch-Semantik unterscheiden, damit Aufrufer das nicht miteinander verwechseln.

Also applies to: 297-332

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@methods/EverCore/src/infra_layer/adapters/out/search/repository/foresight_qdrant_repository.py`
around lines 8 - 13, Update the module docstring to reflect the overlap
(range-overlap) semantics used by vector_search instead of the old containment
wording, and change delete_by_filters to use the same overlap logic as
vector_search: replace the current containment predicates (start_time >=
start_arg AND end_time <= end_arg) with overlap predicates (start_time <=
end_arg AND end_time >= start_arg) so that a record is deleted when its time
span overlaps the query window; ensure you adjust the filter construction in
delete_by_filters and keep vector_search and delete_by_filters semantics
consistent (or explicitly document if you intentionally want them different).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@methods/EverCore/src/core/lifespan/qdrant_lifespan.py`:
- Around line 93-101: The startup async method performs blocking Qdrant I/O
synchronously; wrap the blocking calls in threads using asyncio.to_thread: call
self._qdrant_factory.get_named_client via await asyncio.to_thread(...) when
assigning self._qdrant_clients[using], instantiate collection_class() via await
asyncio.to_thread(...) if construction is blocking, and call
collection.ensure_all via await asyncio.to_thread(...) (or convert ensure_all to
async if possible) so the event loop is not blocked.

In `@methods/EverCore/src/core/oxm/qdrant/qdrant_collection_base.py`:
- Around line 185-419: The current methods (exists, count, ensure_collection,
ensure_payload_indexes, ensure_all, upsert, search, delete, drop) perform
blocking Qdrant client calls (e.g. self.client().collection_exists, .count,
.create_collection, .query_points, .delete, .delete_collection) and must be made
non-blocking: change these routines to await the I/O either by using async
client APIs if available or by wrapping the blocking calls with await
asyncio.to_thread(...) (e.g. await
asyncio.to_thread(self.client().collection_exists, self.name)) and propagate
async/await through callers; ensure exceptions/logging semantics are preserved
and keep the same return values and names (exists, count, ensure_collection,
ensure_payload_indexes, ensure_all, upsert, search, delete, drop) so callers
continue to work.

In `@methods/EverCore/src/core/tenants/tenantize/oxm/qdrant/config_utils.py`:
- Around line 55-57: The broad except block that logs and returns None must be
changed to fail-closed: stop swallowing unexpected errors in the except handling
around tenant qdrant config resolution (the except Exception as e:
logger.warning(...) return None block that leads callers to call
_base_prefixed_collection_name(...)). Instead, catch only the specific, expected
exceptions (e.g., config-missing or NotFound) and return a clear sentinel for
“no tenant config”; for all other exceptions re-raise or raise a
TenantResolutionError so the caller at the sites that call
_base_prefixed_collection_name(...) cannot silently fall back to shared
prefixes. Update the logging to include the error details (logger.exception or
include e) and ensure callers handle the explicit error/sentinel instead of
treating None as success.
- Around line 130-135: The cache-key boolean flags are being incorrectly
normalized because using bool(https) treats non-empty strings like "false" as
True; update the normalization in the endpoint-building logic (where https =
config.get("https") and prefer_grpc = config.get("prefer_grpc") are used to
append "#https=..." and "#grpc=...") to robustly parse string and non-string
values into real booleans (e.g., convert strings via lower() and compare to
"true"/"1"/"yes" or use a small helper parse_bool(value) that handles None,
bools, ints, and common truthy/falsey strings) and then append the normalized
boolean value to endpoint so cache keys are correct.

In `@methods/EverCore/src/devops_scripts/migrate_milvus_to_qdrant.py`:
- Around line 233-248: The numeric heuristic is overbroad: the second loop
applies _normalize_timestamp_to_epoch to any int/float payload field (iterating
payload_fields), corrupting non-timestamp numeric fields; change this by
removing the magnitude-based pass and instead restrict normalization to an
explicit whitelist of timestamp field names (e.g. "timestamp", "start_time",
"end_time") or to fields supplied via a new CLI option. Concretely, stop calling
_normalize_timestamp_to_epoch for every numeric payload[field] in the loop that
references payload_fields/timestamp_field, add support for an
--extra-timestamp-fields CLI flag and wire that into CollectionSpec
(re_embed_sweep.py) as extra_timestamp_fields, and only normalize fields present
in that whitelist (plus the existing timestamp_field). Ensure no other numeric
fields are transformed.

In `@methods/EverCore/src/devops_scripts/re_embed_sweep.py`:
- Around line 191-197: estimated_count currently creates and closes a
MongoClient per call causing N×M connection overhead; change sweep to create one
shared MongoClient (separate from migrate()'s internal client), pass that client
into list_active_dbs and estimated_count (modify their signatures to accept a
MongoClient parameter), use the shared client for all count/list operations, and
close the shared client in sweep's finally block to ensure a single connection
pool for the whole sweep.

In
`@methods/EverCore/src/infra_layer/adapters/out/search/repository/atomic_fact_qdrant_repository.py`:
- Around line 49-63: The parameter name `id` in create_and_save_atomic_fact
should be renamed to a non-builtin name (e.g., point_id or mongo_id) to avoid
shadowing Python's builtin; update the function signature of
create_and_save_atomic_fact (and any other functions in this file that use a
parameter named `id` in the noted ranges) to use point_id, then update all
references inside the method body, imports/usages, log messages, and any dict
keys or returned payload fields that currently use `id` so they consistently use
point_id; ensure tests/type hints still match and no leftover `id` references
remain.

In `@README.md`:
- Line 17: Aktualisiere die Status-Zeile für `feature/qdrant-adapter` im README
von "work in progress. Phase 1: adapter skeleton" auf eine präzise Beschreibung
der gelieferten Funktionalität (z.B. "fully implemented — Collection base, 6
converters, 6 repositories, Migration CLI, Re-Embed Sweep; achieves full Milvus
parity" oder ähnliche Formulierung), sodass die Branch-Bezeichnung
`feature/qdrant-adapter` und der PR-Titel "full Milvus parity" widerspruchsfrei
sind und die genannten Komponenten deutlich aufgeführt werden.

---

Duplicate comments:
In
`@methods/EverCore/src/infra_layer/adapters/out/search/repository/foresight_qdrant_repository.py`:
- Around line 8-13: Update the module docstring to reflect the overlap
(range-overlap) semantics used by vector_search instead of the old containment
wording, and change delete_by_filters to use the same overlap logic as
vector_search: replace the current containment predicates (start_time >=
start_arg AND end_time <= end_arg) with overlap predicates (start_time <=
end_arg AND end_time >= start_arg) so that a record is deleted when its time
span overlaps the query window; ensure you adjust the filter construction in
delete_by_filters and keep vector_search and delete_by_filters semantics
consistent (or explicitly document if you intentionally want them different).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 852a4f39-c113-4f84-b4d0-11339dac2d64

📥 Commits

Reviewing files that changed from the base of the PR and between bd17eb5 and 51025b2.

📒 Files selected for processing (16)
  • README.md
  • methods/EverCore/src/core/component/qdrant_client_factory.py
  • methods/EverCore/src/core/lifespan/qdrant_lifespan.py
  • methods/EverCore/src/core/oxm/qdrant/base_converter.py
  • methods/EverCore/src/core/oxm/qdrant/base_repository.py
  • methods/EverCore/src/core/oxm/qdrant/qdrant_collection_base.py
  • methods/EverCore/src/core/tenants/tenantize/oxm/qdrant/config_utils.py
  • methods/EverCore/src/devops_scripts/migrate_milvus_to_qdrant.py
  • methods/EverCore/src/devops_scripts/re_embed_sweep.py
  • methods/EverCore/src/infra_layer/adapters/out/search/qdrant/converter/atomic_fact_qdrant_converter.py
  • methods/EverCore/src/infra_layer/adapters/out/search/repository/agent_case_qdrant_repository.py
  • methods/EverCore/src/infra_layer/adapters/out/search/repository/agent_skill_qdrant_repository.py
  • methods/EverCore/src/infra_layer/adapters/out/search/repository/atomic_fact_qdrant_repository.py
  • methods/EverCore/src/infra_layer/adapters/out/search/repository/episodic_memory_qdrant_repository.py
  • methods/EverCore/src/infra_layer/adapters/out/search/repository/foresight_qdrant_repository.py
  • methods/EverCore/src/infra_layer/adapters/out/search/repository/user_profile_qdrant_repository.py

Comment thread methods/EverCore/src/core/lifespan/qdrant_lifespan.py
Comment thread methods/EverCore/src/core/oxm/qdrant/qdrant_collection_base.py Outdated
Comment thread methods/EverCore/src/core/tenants/tenantize/oxm/qdrant/config_utils.py Outdated
Comment thread methods/EverCore/src/core/tenants/tenantize/oxm/qdrant/config_utils.py Outdated
Comment thread methods/EverCore/src/devops_scripts/migrate_milvus_to_qdrant.py Outdated
Comment thread methods/EverCore/src/devops_scripts/re_embed_sweep.py Outdated
Comment thread README.md Outdated
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 51025b2b62

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread methods/EverCore/src/core/lifespan/qdrant_lifespan.py Outdated
Comment thread methods/EverCore/src/devops_scripts/migrate_milvus_to_qdrant.py Outdated
Comment thread methods/EverCore/src/devops_scripts/re_embed_sweep.py
Comment thread methods/EverCore/src/devops_scripts/re_embed_sweep.py
Comment thread methods/EverCore/src/devops_scripts/re_embed_sweep.py
Comment thread methods/EverCore/src/core/oxm/qdrant/qdrant_collection_base.py
CR re-review on 51025b2 surfaced 8 actionable findings (1 a duplicate of
H8 in the delete path), plus the long-standing N2 ask to make
``QdrantCollectionBase`` async-first instead of relying on per-caller
``asyncio.to_thread`` wrappers. All addressed:

Async refactor (N1 + N2 — the heavy lift)
- ``QdrantCollectionBase`` methods are now ``async``: ``exists``, ``count``,
  ``ensure_collection``, ``ensure_payload_indexes``, ``ensure_all``,
  ``upsert``, ``search``, ``delete``, ``drop``. Each delegates to the
  blocking ``qdrant-client`` call via ``asyncio.to_thread`` once, instead
  of forcing every caller to wrap the call themselves.
- ``BaseQdrantRepository`` no longer wraps ``self.collection.upsert/delete/
  search/count`` in ``asyncio.to_thread`` — those are now ``await``-able
  directly. The to_thread wrappers around ``client.retrieve`` stay (the
  client method is still sync; the collection class has no native
  ``retrieve``).
- ``QdrantLifespanProvider.startup`` now awaits ``collection.ensure_all``
  and offloads ``get_named_client`` (still sync) to a worker thread, so
  the event loop is no longer blocked during FastAPI startup.

Regressions from 51025b2 (N4, N5)
- ``get_qdrant_connection_cache_key``: replaced the naive ``bool(value)``
  with a ``_as_bool`` helper that handles strings (``bool("false")`` is
  ``True`` in Python and would have produced wrong cache keys for
  tenants whose storage entry returns ``https``/``prefer_grpc`` as
  strings).
- ``build_payload``: dropped the magnitude-based "looks like a timestamp"
  heuristic. It would have rewritten legitimate non-time numerics like
  ``maturity_score`` or ``duration_days`` whenever they crossed the
  threshold. Normalization now requires an explicit whitelist via a new
  ``--extra-timestamp-fields`` CLI flag / ``extra_timestamp_fields``
  parameter on the workhorse; the sweep spec for ``foresight`` declares
  ``end_time`` there.

Other CR findings

N3 — Tenant-config resolution is now fail-closed
  ``get_tenant_qdrant_config`` and ``get_tenant_aware_collection_name`` no
  longer swallow arbitrary exceptions and silently fall back to the
  shared base prefix (which would route a tenant's data into another
  tenant's collection). Only the legitimate "no tenant context" /
  ``LookupError`` paths return ``None``; everything else propagates.

N6 — Shared MongoClient in the sweep
  ``sweep()`` now opens a single ``MongoClient`` for the discovery /
  count phase and passes it to ``list_active_dbs`` and ``estimated_count``
  (which used to open and close their own client per call, producing
  N×M connection churn).

N7 — Renamed ``id`` parameter in ``create_and_save_atomic_fact`` to
``point_id`` to avoid shadowing Python's builtin.

N8 — README ``Status`` line replaced the stale "Phase 1: adapter skeleton"
with a concrete list of what the branch actually delivers.

D1 — ``ForesightQdrantRepository.delete_by_filters`` now uses the same
window-overlap filter as ``vector_search`` so a "delete this window" call
doesn't leave behind records that ``vector_search`` would still return.
Module docstring updated.

Links CI fix
  The Rokid / Creative Assistant use-case blocks in README.md were
  missing the ``[![banner](img)](link)`` wrap and the ``[Code|Plugin|Live
  Demo](link)`` primary marker that the ``Docs`` workflow's link check
  enforces. Wrapped both banners with self-anchor links and gave each a
  matching primary marker. (Pre-existing drift from upstream; the
  workflow only started running on this branch.)
@XInfty XInfty deleted a comment from qodo-code-review Bot May 13, 2026
@DerAuctor DerAuctor merged commit 66e7aff into main May 13, 2026
4 checks passed
@DerAuctor DerAuctor deleted the qdrant/rebase-evercore branch May 13, 2026 22:54
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5d1be9c69e

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

ts_ms = payload.get("timestamp", 0) or 0
search_results.append(
{
"id": str(point.id),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Return Mongo IDs instead of internal UUID point IDs

vector_search currently returns id from point.id, but this adapter writes Qdrant IDs as deterministic UUID5 values (see converter/migration paths that call mongo_id_to_qdrant_id and store the original Mongo key in payload as mongo_id). Under migrated data, callers receive UUIDs instead of source document IDs, so downstream backfill paths that look up Mongo records by returned id will miss and degrade results after Qdrant cutover. Use payload mongo_id as the outward-facing id (with a fallback to point.id for legacy points) to preserve Milvus-parity semantics.

Useful? React with 👍 / 👎.

}

await self.upsert(
qmodels.PointStruct(id=id, vector=vector, payload=payload)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Convert runtime point IDs to Qdrant-compatible format

create_and_save_episodic_memory upserts with the raw caller id, but this same PR introduces mongo_id_to_qdrant_id because Mongo-style IDs are not valid Qdrant point IDs (Qdrant accepts uint64/UUID IDs). In cutover flows where callers pass Mongo/ObjectId-like IDs, this path will fail at upsert time, while migrated records use UUID5 IDs, creating inconsistent write behavior between live ingestion and migration. Convert id through mongo_id_to_qdrant_id (or enforce UUID input) before building PointStruct.

Useful? React with 👍 / 👎.

}

await self.upsert(
qmodels.PointStruct(id=point_id, vector=vector, payload=payload)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Normalize atomic-fact point IDs before Qdrant upsert

create_and_save_atomic_fact writes point_id directly as the Qdrant point ID, but this PR’s own migration/converter paths translate Mongo IDs via mongo_id_to_qdrant_id to satisfy Qdrant’s ID constraints. If runtime callers pass Mongo/ObjectId-like IDs, this upsert path can fail (or diverge from migrated ID space), so search/delete behavior becomes inconsistent between newly written and migrated records. Apply the same deterministic ID normalization (or strict UUID validation) before constructing the point.

Useful? React with 👍 / 👎.

Ptah-CT added a commit that referenced this pull request May 14, 2026
## Problem

After PR #2 (Qdrant adapter merged) and the data-side migration, the
service still refused to boot with ``VECTOR_STORE_BACKEND=qdrant``:

1. ``MilvusLifespanProvider`` had no env-gate, always tried to connect
to
Milvus, and crashed when Milvus was offline (which it is in the Qdrant
   cutover scenario). It runs at order=18, before the Qdrant lifespan at
   order=19, so the Qdrant lifespan never got a chance to start.
2. Once the Milvus lifespan was gated, ``BaseMilvusRepository.__init__``
   crashed at every controller construction because it eagerly called
   ``model.async_collection()`` — which raises when the Milvus lifespan
has not initialised the collection (i.e. exactly the Qdrant-mode case).
3. ``QdrantConnectionCache``'s lazy import inside
   ``get_tenant_qdrant_config`` still referenced
``core.tenants.tenantize.tenant_context`` — that module was renamed to
``core.tenants.tenant_contextvar`` upstream. The failure only surfaced
   at the very first ``ensure_collection`` call inside
   ``QdrantLifespanProvider.startup``, so it looked indistinguishable
   from a wider Qdrant initialisation issue.

## Fix

Three small, surgical patches:

**``core/lifespan/milvus_lifespan.py``** — symmetric env-gate.
``startup`` and ``shutdown`` return ``None`` immediately when
``VECTOR_STORE_BACKEND=qdrant``, mirroring what
``QdrantLifespanProvider``
already does for the inverse case.

**``core/oxm/milvus/base_repository.py``** — lazy collection resolution.
``__init__`` no longer eagerly resolves ``model.async_collection()``;
instead it stashes the model class, and ``self.collection`` is now a
``@property`` that resolves on first access. Milvus-mode behaviour is
unchanged (first repo call resolves the collection identically to
before). Qdrant-mode boots cleanly because the Milvus collections are
never touched.

**``core/tenants/tenantize/oxm/qdrant/config_utils.py``** — fixed
the lazy import: ``core.tenants.tenantize.tenant_context`` →
``core.tenants.tenant_contextvar``.

## Verification

Boot with ``VECTOR_STORE_BACKEND=qdrant``:

- lifespan order: metrics → mongodb → milvus(no-op) → elasticsearch →
  qdrant (initialised) → business → longjob — all green.
- ``/health`` returns ``{"status": "healthy"}`` HTTP 200.
- ``/docs`` HTTP 200.
- The 6 Qdrant collection classes are discovered + initialised at
  startup. ``ensure_all`` resolves the tenant-aware names and finds the
  pre-seeded collections green.

## Risk

Low. None of the changes affect Milvus-mode (default). All three patches
are scoped to either the Qdrant code path or to a lazy-resolution change
that defers behaviour without changing it.

---------

Co-authored-by: Ptah-CT <221234802+Ptah-CT@users.noreply.github.com>
Ptah-CT added a commit that referenced this pull request May 15, 2026
…Lock

CodeRabbit finding #3: _ensure_session() could be entered concurrently
by multiple coroutines (e.g. parallel rerank batches via gather), each
seeing self.session is None and creating its own aiohttp.ClientSession.
The losers of the race were never tracked and leaked.

Add self._session_lock = asyncio.Lock() in __init__ and use a
double-checked pattern inside _ensure_session: a fast path when the
session is already alive, a lock-protected slow path with re-check
before instantiation.

Findings #1 (Voyage model default) and #2 (raw query in debug log) are
intentionally deferred — production .env always sets RERANK_MODEL and
DEBUG-level logs are not enabled by default in this environment.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants