fix feature property order to follow the provider schema by cportele · Pull Request #538 · ldproxy/xtraplatform-spatial

cportele · 2026-06-18T18:09:29Z

NOTE: This is a change in the core of the feature pipeline and needs broad testing before the release of ldproxy 4.8.

Properties backed by a joined table (objects, object arrays, value arrays, feature references) were emitted after the main-table columns even when declared before them, because the per-feature event buffer placed each property where its tokens first arrive - the provider's per-table order, not the schema order. For XML/GML this produced output that did not match the application schema's element order.

FeatureEventBuffer now re-sorts each buffered feature at flush into the declared schema order: object (and feature-root) children are ordered by their schema position; array elements keep their data order, and the children inside each element are ordered like any other object. The pass runs after the slice transformers, so transform behaviour is unchanged.

Object and array element children are consequently emitted in schema order too (previously source/arrival order); the affected token fixtures are updated. Adds FeatureEventBufferOrderSpec and re-enables the previously disabled "joined value array between main columns" mapping case.

Properties backed by a joined table (objects, object arrays, value arrays, feature references) were emitted after the main-table columns even when declared before them, because the per-feature event buffer placed each property where its tokens first arrive - the provider's per-table order, not the schema order. For XML/GML this produced output that did not match the application schema's element order. FeatureEventBuffer now re-sorts each buffered feature at flush into the declared schema order: object (and feature-root) children are ordered by their schema position; array elements keep their data order, and the children inside each element are ordered like any other object. The pass runs after the slice transformers, so transform behaviour is unchanged. Object and array element children are consequently emitted in schema order too (previously source/arrival order); the affected token fixtures are updated. Adds FeatureEventBufferOrderSpec and re-enables the previously disabled "joined value array between main columns" mapping case.

azahnen · 2026-06-23T11:07:12Z

@cportele
I am hesitating to approve this since it seems to me we would be adding a second sorting logic on top of the (dysfunctional) existing sorting logic. I think we should either fix the existing logic or remove/tighten it and then introduce the new one.

I tried to remember and understand how the existing logic is supposed to work. The basic idea is that sorting happens as part of the buffer inserts using cursors:

FeatureEventBuffer has cursors current and currentEnclosing
SchemaMapping contains desired property positions (computed in SchemaMappingBase)
onObjectStart, onValue etc. in FeatureTokenTransformerMappings call FeatureEventBuffer.next with property positions
this moves the cursors to the desired position
the following call to FeatureEventBuffer.append inserts the tokens at cursor position
FeatureEventBuffer holds tokens in desired order at any time

I did not dig any deeper yet to find the underlying issue. Questions:

Since the new logic uses mapping.getPositionsForTargetPath(path) as well, I assume that part is working as expected? In my memory that was the most tricky and unstable part.
Do you see any inherent flaw in the design of the existing logic, or are we just looking for an implementation error?
I do not see any dependency on insert order regarding joined tables etc in the design idea. If there is such an unwanted dependency, it would have to come from getPositionsForTargetPath, right?
I am not opposed to using the new logic if it works better and is easier to understand. Do you see any performance hits?
That being said, we cannot remove the cursor logic since it is used by slice transformers as well, right? And most likely it is also still needed on inserts for nesting?

cportele · 2026-06-23T11:14:21Z

@azahnen - I agree that the right approach is not to put new logic on top of the current, incomplete logic. I just had a hard time to understand the current logic (and its limitations). I will take a deeper look at your questions and come back with answers.

The per-feature event buffer ordered properties in two places: an incremental, cursor-driven placement in append, and a flush-time pass that re-sorted the feature into schema order. The cursor placement only got top-level properties right - a property nested in another object and backed by a joined table was left in production order - so the second pass was layered on top, leaving two overlapping ordering mechanisms. Keep a single mechanism. append now stores tokens in the order the provider produces them, and orderedBySchema is the only pass that applies schema order. It runs in place, lazily and once per feature: before the in-buffer slice transformers read the buffer, and at the latest at flush when no transformer ran. Ordering before the slice transformers is required, not incidental: a property produced from several tables arrives as several fragments, and a transformer reads a property as a contiguous buffer range. Only once the feature is in schema order are a property's fragments contiguous; without it, an unrelated property emitted between two fragments is swallowed into the slice (e.g. a separate array nested inside a concatenated object array). The transformers rewrite within a property and preserve schema order, so the pass is not repeated after they run. The position-addressable slice index that getSlice/replaceSlice need is built once per feature by a single pass over the buffered tokens, only when a slice is first accessed, and then updated in place as slices are rewritten - because the buffer is in schema-position order, a slice that changes size simply shifts every later position by the same amount, so the index is not rebuilt per rewrite. The schema-order pass also coalesces the per-table fragments of a single-valued object into one object: a provider produces an object backed by more than one table as several OBJECT[path]..OBJECT_END[path] blocks at the same position, which the previous per-position accrual merged implicitly. Object-array elements are wrapped in an array and are never affected. Remove the now-unused cursor plumbing (next, the current/currentEnclosing state, the increase/propagate accounting) and the dead source-path reorder transformer FeatureTokenTransformerSorting, which was only referenced from a commented-out, empty getDecoderTransformers override. Regression specs cover a nested joined object declared before its scalar siblings, an object produced as two per-table fragments coalescing into one, and a property whose fragments are split around an unrelated property keeping that property out of its slice.

cportele · 2026-06-24T12:02:54Z

@azahnen

Short version: append no longer sorts — it stores tokens in the provider's order. A single pass (orderedBySchema) applies schema order and is the only thing that orders.

Please have a look, whether the latest commit addresses your questions and concerns.

I tested it also with configurations that use embed, concat and various nested objects/arrays.

Your questions:

Yes — getPositionsForTargetPath works as expected. Positions come from a declaration-order walk of the target schema, no insert-order input. It's correct and stable; the old cursor placement and the new pass read the same map.
The position math was fine; the old append tried to maintain order incrementally and only got top-level properties right — a property nested in another object and backed by a joined table was left in production order, which is why initially a second pass got layered on top.
There is no insert-order dependency. It was entirely in append's incremental slot accounting, not in the position lookup.
Performance: The schema-order pass is O(n) per feature (build tree, sort + coalesce each object's children, flatten), ~30M tokens/s once warm — about 6 µs for a typical feature, ~0.3 ms for a large one (a 500-element object array), low-ms only for pathological features. It runs once per feature. The slice index that getSlice/replaceSlice need is built once per feature (only if a transformer reads a slice) and then updated in place as slices are rewritten — so a feature with many transformed properties (e.g. several concatenated arrays) pays the index cost once, not once per property. append is now cheaper than before (plain append vs. position-indexed insert + offset propagation). All off the fetch/encode/transfer path.
The cursor logic can be and was removed. getSlice/replaceSlice stay (the slice transformers depend on them), but the cursor placement, the next calls, and the current/currentEnclosing state are gone. The index is derived from the buffered tokens instead of maintained during the (now order-agnostic) append.

Why one pass rather than fixing the incremental sort in place: the incremental accounting conflated two jobs — slice bookkeeping for the transformers and emission order — and the propagation that made the first work is exactly what mis-ordered nested joined-table properties in the second. Attempts to fix it in append scrambled object-array grouping. A single pass over the completed feature is simpler and safer: build the token tree, order each object's children by schema position.

When the pass runs: it can't only run at flush. A property produced from several tables arrives as several fragments, and a slice transformer (concat, etc.) reads a property as a contiguous buffer range. The fragments are only contiguous once the feature is in schema order — so the pass has to run before the slice transformers read the buffer, otherwise a transformer's slice can swallow an unrelated property the provider emitted between two fragments. So ensureOrdered() applies the pass in place, lazily, exactly once per feature: before the transformers run, and at the latest at flush when none did. The old append-sort happened to provide this contiguity implicitly; the single pass now provides it explicitly.

Net: one ordering mechanism. The slice machinery is tightened to just slice-indexing, a dead source-path reorder transformer is removed, and regression specs cover the cases that surfaced — a nested joined object declared before its scalar siblings; a single-valued object split into per-table fragments coalescing into one; and a property whose fragments are split around an unrelated property, keeping that property out of its slice.

…lice getSlice threw IndexOutOfBoundsException (fromIndex < 0) on features where an in-buffer slice transformer shrinks a property's slice and a later property is absent from the feature. computeIndex fills the slice index from scratch and recorded a span only for positions that have tokens, leaving every absent property at start 0. replaceSlice shrinking a present slice shifts the start of every later position by the negative size delta, driving an absent position after it below zero; the next getSlice then called buffer.subList with a negative fromIndex. The previous incremental index maintenance kept a valid offset for every position, including empty ones; the single-pass rebuild dropped that. After computing spans, a forward scan now stamps every empty position with a valid buffer offset (the boundary between its occupied neighbours). The buffer is in schema-position order at this point, so the offsets are monotonic and non-negative, and only top-level enclosing positions carry a non-zero length.

…xy/xtraplatform-spatial into fix-feature-property-order

azahnen

Looks good besides the one little nitpick.

After the schema-order rework, the per-token position in FeatureTokenTransformerMappings was used only by the `if (pos > -1)` guard, which filtered exactly one thing: the path-less feature-root object. A path-less marker resolves to the root schema (an object) but has no schema position, so pos() returns -1 while schema() is present; for every other token pos and schema agree. Drop pos from all handlers. The object handlers now exclude the root explicitly with `!schema.isFeature()` (isObject and empty parent path) - self-documenting and provably the same set as pos > -1, since the only token with pos == -1 and a present schema is that root. The array, value and geometry handlers need no such check: the object root fails their type checks already.

cportele requested a review from azahnen as a code owner June 18, 2026 18:09

Merge branch 'master' into fix-feature-property-order

b379d02

cportele added 2 commits June 23, 2026 14:16

Merge branch 'master' into fix-feature-property-order

e4ccd7a

azahnen and others added 3 commits June 25, 2026 08:22

Merge branch 'master' into fix-feature-property-order

5e2a211

Merge branch 'fix-feature-property-order' of https://github.com/ldpro…

8649e92

…xy/xtraplatform-spatial into fix-feature-property-order

azahnen previously approved these changes Jun 26, 2026

View reviewed changes

Comment thread ...atures/src/main/java/de/ii/xtraplatform/features/domain/FeatureTokenTransformerMappings.java Outdated

cportele added 2 commits June 26, 2026 11:30

Merge branch 'master' into fix-feature-property-order

1bfc61b

cportele dismissed azahnen’s stale review via 077d45d June 26, 2026 09:39

cportele merged commit 0b5b52a into master Jun 26, 2026
3 checks passed

cportele deleted the fix-feature-property-order branch June 26, 2026 09:44

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix feature property order to follow the provider schema#538

fix feature property order to follow the provider schema#538
cportele merged 9 commits into
masterfrom
fix-feature-property-order

cportele commented Jun 18, 2026

Uh oh!

azahnen commented Jun 23, 2026

Uh oh!

cportele commented Jun 23, 2026

Uh oh!

cportele commented Jun 24, 2026

Uh oh!

azahnen left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

cportele commented Jun 18, 2026

Uh oh!

azahnen commented Jun 23, 2026

Uh oh!

cportele commented Jun 23, 2026

Uh oh!

cportele commented Jun 24, 2026

Uh oh!

azahnen left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants