Skip to content

Flatcollection search queryconverter bugfix#308

Open
suddendust wants to merge 7 commits into
mainfrom
flatcollection_search_queryconverter_bugfix
Open

Flatcollection search queryconverter bugfix#308
suddendust wants to merge 7 commits into
mainfrom
flatcollection_search_queryconverter_bugfix

Conversation

@suddendust
Copy link
Copy Markdown
Contributor

Fix legacy search() transformer for array and JSONB columns

Problem

The legacy Collection.search() path goes through LegacyFilterToQueryFilterTransformer (for filters) and LegacyQueryToV2QueryTransformer (for selections / orderBy) to convert the v1 Filter/Query API into the v2 Query API before handing off to the Postgres parsers. Both transformers were dropping the column's type information when they constructed the v2 IdentifierExpression, with two visible consequences:

  1. Array columns produced invalid SQL. A filter like Filter(EQ, "tags", "hygiene") against a TEXT[] column was rendered as "tags" = ? with a text RHS — Postgres rejects this at execution time because text = text[] has no operator. The same problem occurred for INTEGER[], DOUBLE PRECISION[], BOOLEAN[], and any IN/NOT_IN variants on array columns.

  2. JSONB column resolution was ad-hoc. Both transformers separately walked the schema looking for a JSONB prefix and built JsonIdentifierExpression directly, with no shared codepath for the "field resolves to the JSONB column itself" case (which is forbidden — JsonIdentifierExpression requires a non-empty jsonPath).

Net effect: filters on array columns failed at runtime, and the same logical query that worked through the v2 find() API blew up through the legacy search() API.

Solution

Introduced IdentifierExpressionFactory as a single resolution point that maps a (fieldName, ColumnMetadata) pair to the correct IdentifierExpression subtype using the column's canonicalType and isArray() signals from the schema registry.

ColumnMetadata.canonicalType + isArray
        ↓
IdentifierExpressionFactory.createIdentifierFromColumn
        ↓
- ArrayIdentifierExpression.of{Strings,Ints,Longs,Floats,Doubles,Booleans,TimestampsTz,Dates}  (array columns)
- IdentifierExpression.of{String,Int,Long,Float,Double,Boolean,TimestampTz,Date}              (scalar columns)
- IdentifierExpression.of(name)                                                               (JSONB column itself, or no type info)

Three cases the factory has to handle correctly:

Case Returned type Why
Typed scalar column (STRING, INTEGER, …) Typed IdentifierExpression.ofX(name) Parser needs the element type to emit a correctly-cast comparison.
Typed array column (any of the above, isArray() == true) Typed ArrayIdentifierExpression.ofXs(name) Parser routes to array-aware SQL (= ANY(...), &&, @>).
Whole JSONB column referenced by name (no nested path) Untyped IdentifierExpression.of(name) JsonIdentifierExpression requires a non-empty jsonPath — it represents a path inside a JSONB column, not the column itself.
Unknown / no schema info Untyped IdentifierExpression.of(name) Fall through to runtime resolution, behavior matches pre-fix.

Both LegacyFilterToQueryFilterTransformer and LegacyQueryToV2QueryTransformer now route their direct-column lookups through this factory. The JSONB-prefix walk (for nested paths like props.brand) remains in each transformer because the value-type inference (which becomes JsonFieldType) is op-specific:

  • LegacyFilterToQueryFilterTransformer.inferJsonFieldType(value) looks at the filter value's runtime type — String → STRING, Number → NUMBER, Boolean → BOOLEAN. Note: there is no STRING_ARRAY inference path today; array-membership semantics on nested JSONB arrays are not reachable from the legacy filter API. Documented in tests.
  • LegacyQueryToV2QueryTransformer (selections / orderBy) defaults to JsonFieldType.STRING since there's no value to infer from.

@suddendust suddendust changed the title Flatcollection search queryconverter bugfix [Draft] Flatcollection search queryconverter bugfix May 20, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented May 20, 2026

Codecov Report

❌ Patch coverage is 55.88235% with 15 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.63%. Comparing base (059875b) to head (38ed2e8).

Files with missing lines Patch % Lines
...ry/v1/transformer/IdentifierExpressionFactory.java 42.30% 11 Missing and 4 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main     #308      +/-   ##
============================================
- Coverage     81.68%   81.63%   -0.05%     
- Complexity     1532     1549      +17     
============================================
  Files           241      242       +1     
  Lines          7420     7450      +30     
  Branches        718      720       +2     
============================================
+ Hits           6061     6082      +21     
- Misses          912      916       +4     
- Partials        447      452       +5     
Flag Coverage Δ
integration 81.63% <55.88%> (-0.05%) ⬇️
unit 55.98% <47.05%> (+0.25%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 20, 2026

Test Results

  124 files  +2    124 suites  +2   40s ⏱️ -13s
  838 tests +9    837 ✅ +9  1 💤 ±0  0 ❌ ±0 
1 174 runs  +8  1 173 ✅ +8  1 💤 ±0  0 ❌ ±0 

Results for commit 38ed2e8. ± Comparison against base commit 059875b.

This pull request removes 16 and adds 25 tests. Note that renamed tests count towards both.
org.hypertrace.core.documentstore.postgres.query.v1.transformer.LegacyQueryToV2QueryTransformerTest$TransformCompleteQuery ‑ transformCompleteQuery_allComponentsTransformed()
org.hypertrace.core.documentstore.postgres.query.v1.transformer.LegacyQueryToV2QueryTransformerTest$TransformFilter ‑ transformCompositeAndFilter_createsLogicalExpression()
org.hypertrace.core.documentstore.postgres.query.v1.transformer.LegacyQueryToV2QueryTransformerTest$TransformFilter ‑ transformSimpleEqFilter_createsRelationalExpression()
org.hypertrace.core.documentstore.postgres.query.v1.transformer.LegacyQueryToV2QueryTransformerTest$TransformNullOrEmptyQuery ‑ transformEmptyQuery_returnsEmptyV2Query()
org.hypertrace.core.documentstore.postgres.query.v1.transformer.LegacyQueryToV2QueryTransformerTest$TransformNullOrEmptyQuery ‑ transformNullQuery_returnsEmptyV2Query()
org.hypertrace.core.documentstore.postgres.query.v1.transformer.LegacyQueryToV2QueryTransformerTest$TransformOrderBy ‑ transformAscendingOrderBy_createsSortWithAscOrder()
org.hypertrace.core.documentstore.postgres.query.v1.transformer.LegacyQueryToV2QueryTransformerTest$TransformOrderBy ‑ transformDescendingOrderBy_createsSortWithDescOrder()
org.hypertrace.core.documentstore.postgres.query.v1.transformer.LegacyQueryToV2QueryTransformerTest$TransformOrderBy ‑ transformJsonbPathOrderBy_createsJsonIdentifierExpression()
org.hypertrace.core.documentstore.postgres.query.v1.transformer.LegacyQueryToV2QueryTransformerTest$TransformPagination ‑ transformLimitAndOffset_createsPaginationWithBoth()
org.hypertrace.core.documentstore.postgres.query.v1.transformer.LegacyQueryToV2QueryTransformerTest$TransformPagination ‑ transformLimitOnly_createsPaginationWithZeroOffset()
…
org.hypertrace.core.documentstore.postgres.query.v1.transformer.LegacyQueryToV2QueryTransformerTest ‑ inferJsonFieldTypeListOfStringsYieldsStringNotStringArray()
org.hypertrace.core.documentstore.postgres.query.v1.transformer.LegacyQueryToV2QueryTransformerTest$TransformCompleteQuery ‑ transformCompleteQueryAllComponentsTransformed()
org.hypertrace.core.documentstore.postgres.query.v1.transformer.LegacyQueryToV2QueryTransformerTest$TransformFilter ‑ transformCompositeAndFilterCreatesLogicalExpression()
org.hypertrace.core.documentstore.postgres.query.v1.transformer.LegacyQueryToV2QueryTransformerTest$TransformFilter ‑ transformSimpleEqFilterCreatesRelationalExpression()
org.hypertrace.core.documentstore.postgres.query.v1.transformer.LegacyQueryToV2QueryTransformerTest$TransformNullOrEmptyQuery ‑ transformEmptyQueryReturnsEmptyV2Query()
org.hypertrace.core.documentstore.postgres.query.v1.transformer.LegacyQueryToV2QueryTransformerTest$TransformNullOrEmptyQuery ‑ transformNullQueryReturnsEmptyV2Query()
org.hypertrace.core.documentstore.postgres.query.v1.transformer.LegacyQueryToV2QueryTransformerTest$TransformOrderBy ‑ transformAscendingOrderByCreatesSortWithAscOrder()
org.hypertrace.core.documentstore.postgres.query.v1.transformer.LegacyQueryToV2QueryTransformerTest$TransformOrderBy ‑ transformDescendingOrderByCreatesSortWithDescOrder()
org.hypertrace.core.documentstore.postgres.query.v1.transformer.LegacyQueryToV2QueryTransformerTest$TransformOrderBy ‑ transformJsonbPathOrderByCreatesJsonIdentifierExpression()
org.hypertrace.core.documentstore.postgres.query.v1.transformer.LegacyQueryToV2QueryTransformerTest$TransformPagination ‑ transformLimitAndOffsetCreatesPaginationWithBoth()
…

♻️ This comment has been updated with latest results.

@suddendust suddendust changed the title [Draft] Flatcollection search queryconverter bugfix Flatcollection search queryconverter bugfix May 20, 2026
@@ -5975,326 +5975,409 @@ private static Collection getFlatCollection(String dataStoreName) {
@Nested
class FlatCollectionLegacySearchMethod {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The strategy here is to run the same v1 and v2 query again the collection and both should return the same results.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants