fix feature property order to follow the provider schema #538
Properties backed by a joined table (objects, object arrays, value arrays, feature references) were emitted after the main-table columns even when declared before them, because the per-feature event buffer placed each property where its tokens first arrive - the provider's per-table order, not the schema order. For XML/GML this produced output that did not match the application schema's element order.
FeatureEventBuffer now re-sorts each buffered feature at flush into the declared schema order: object (and feature-root) children are ordered by their schema position; array elements keep their data order, and the children inside each element are ordered like any other object. The pass runs after the slice transformers, so transform behaviour is unchanged.
Object and array element children are consequently emitted in schema order too (previously source/arrival order); the affected token fixtures are updated. Adds FeatureEventBufferOrderSpec and re-enables the previously disabled "joined value array between main columns" mapping case.
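
The ordering rule described above can be illustrated with a minimal sketch. It assumes a simplified tree model of one buffered feature; `SchemaOrderSketch`, `Node`, `schemaPos` and the sample property names are hypothetical and are not the token model that FeatureEventBuffer actually uses. Object children are sorted by declared position, array elements are left in data order, and the rule is applied recursively.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch, not the FeatureEventBuffer implementation.
final class SchemaOrderSketch {

  // Simplified stand-in for one buffered property.
  static final class Node {
    final String name;
    final int schemaPos; // declared position among siblings; unused for array elements
    final boolean array; // true: children are array elements in data order
    final List<Node> children = new ArrayList<>();

    Node(String name, int schemaPos, boolean array) {
      this.name = name;
      this.schemaPos = schemaPos;
      this.array = array;
    }

    Node add(Node child) {
      children.add(child);
      return this;
    }
  }

  // Reorders in place: object children by schema position, array elements untouched.
  static void order(Node node) {
    if (!node.array) {
      // List.sort is stable, so children with equal positions keep arrival order.
      node.children.sort(Comparator.comparingInt((Node n) -> n.schemaPos));
    }
    for (Node child : node.children) {
      order(child);
    }
  }

  // Renders the tree as name(child,child,...) for easy comparison.
  static String render(Node node) {
    if (node.children.isEmpty()) {
      return node.name;
    }
    List<String> parts = new ArrayList<>();
    for (Node child : node.children) {
      parts.add(render(child));
    }
    return node.name + "(" + String.join(",", parts) + ")";
  }

  // Arrival order: main-table columns first (id, name), then joined tables (tags, address).
  static Node arrivalOrderExample() {
    Node tags = new Node("tags", 1, true)
        .add(new Node("b", 0, false))
        .add(new Node("a", 0, false));
    Node address = new Node("address", 3, false)
        .add(new Node("city", 1, false))
        .add(new Node("street", 0, false));
    return new Node("feature", 0, false)
        .add(new Node("id", 0, false))
        .add(new Node("name", 2, false))
        .add(tags)
        .add(address);
  }

  public static void main(String[] args) {
    Node feature = arrivalOrderExample();
    order(feature);
    System.out.println(render(feature));
  }
}
```

In this sketch the joined value array `tags` moves between `id` and `name` because it is declared there, its elements stay in data order, and the children of `address` follow their declared positions.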
@cportele I tried to remember and understand how the existing logic is supposed to work. The basic idea is that sorting happens as part of the buffer inserts using cursors:
I did not dig any deeper yet to find the underlying issue. Questions:
@azahnen - I agree that the right approach is not to put new logic on top of the current, incomplete logic. I just had a hard time understanding the current logic (and its limitations). I will take a closer look at your questions and come back with answers.
The per-feature event buffer ordered properties in two places: an incremental, cursor-driven placement in append, and a flush-time pass that re-sorted the feature into schema order. The cursor placement only got top-level properties right - a property nested in another object and backed by a joined table was left in production order - so the second pass was layered on top, leaving two overlapping ordering mechanisms.
Keep a single mechanism. append now stores tokens in the order the provider produces them, and orderedBySchema is the only pass that applies schema order. It runs in place, lazily and once per feature: before the in-buffer slice transformers read the buffer, and at the latest at flush when no transformer ran.
Ordering before the slice transformers is required, not incidental: a property produced from several tables arrives as several fragments, and a transformer reads a property as a contiguous buffer range. Only once the feature is in schema order are a property's fragments contiguous; without it, an unrelated property emitted between two fragments is swallowed into the slice (e.g. a separate array nested inside a concatenated object array). The transformers rewrite within a property and preserve schema order, so the pass is not repeated after they run.
The position-addressable slice index that getSlice/replaceSlice need is built once per feature by a single pass over the buffered tokens, only when a slice is first accessed, and then updated in place as slices are rewritten - because the buffer is in schema-position order, a slice that changes size simply shifts every later position by the same amount, so the index is not rebuilt per rewrite.
The schema-order pass also coalesces the per-table fragments of a single-valued object into one object: a provider produces an object backed by more than one table as several OBJECT[path]..OBJECT_END[path] blocks at the same position, which the previous per-position accrual merged implicitly. Object-array elements are wrapped in an array and are never affected.
Remove the now-unused cursor plumbing (next, the current/currentEnclosing state, the increase/propagate accounting) and the dead source-path reorder transformer FeatureTokenTransformerSorting, which was only referenced from a commented-out, empty getDecoderTransformers override.
Regression specs cover a nested joined object declared before its scalar siblings, an object produced as two per-table fragments coalescing into one, and a property whose fragments are split around an unrelated property keeping that property out of its slice.
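
The coalescing of per-table fragments can be sketched as follows. This is a minimal illustration under stated assumptions: tokens are modelled as plain strings, the OBJECT[path]/OBJECT_END[path] notation is taken from the description above, while the ARRAY[...], ARRAY_END[...] and VALUE[...] spellings, the class name `FragmentCoalesceSketch` and the sample paths are hypothetical. It assumes the buffer is already in schema order, so fragments of the same object are adjacent.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch, not the orderedBySchema implementation.
final class FragmentCoalesceSketch {

  // Extracts "address" from "OBJECT_END[address]".
  static String path(String token) {
    return token.substring(token.indexOf('[') + 1, token.length() - 1);
  }

  // Merges adjacent OBJECT[p]..OBJECT_END[p] blocks with the same path into one object,
  // unless the blocks are elements of an enclosing array.
  static List<String> coalesce(List<String> tokens) {
    List<String> out = new ArrayList<>();
    Deque<Boolean> open = new ArrayDeque<>(); // one entry per open container, true for arrays
    for (int i = 0; i < tokens.size(); i++) {
      String token = tokens.get(i);
      if (token.startsWith("ARRAY[")) {
        open.push(true);
      } else if (token.startsWith("OBJECT[")) {
        open.push(false);
      } else if (token.startsWith("ARRAY_END[")) {
        open.pop();
      } else if (token.startsWith("OBJECT_END[")) {
        open.pop();
        boolean parentIsArray = !open.isEmpty() && open.peek();
        boolean reopened = i + 1 < tokens.size()
            && tokens.get(i + 1).equals("OBJECT[" + path(token) + "]");
        if (!parentIsArray && reopened) {
          open.push(false); // the merged object stays open
          i++; // drop both the close and the re-open token
          continue;
        }
      }
      out.add(token);
    }
    return out;
  }

  public static void main(String[] args) {
    List<String> fragments = List.of(
        "OBJECT[address]", "VALUE[address.street]", "OBJECT_END[address]",
        "OBJECT[address]", "VALUE[address.city]", "OBJECT_END[address]");
    System.out.println(coalesce(fragments));
  }
}
```

The stack of open containers is what keeps object-array elements apart: two consecutive elements look like two fragments of the same path, but their direct parent is an array.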
Short version: Please check whether the latest commit addresses your questions and concerns. I also tested it with configurations that use
Your questions:
Why one pass rather than fixing the incremental sort in place: the incremental accounting conflated two jobs, slice bookkeeping for the transformers and emission order, and the propagation that made the first work is exactly what mis-ordered nested joined-table properties in the second. Attempts to fix it in
When the pass runs: it can't only run at flush. A property produced from several tables arrives as several fragments, and a slice transformer (concat, etc.) reads a property as a contiguous buffer range. The fragments are only contiguous once the feature is in schema order, so the pass has to run before the slice transformers read the buffer, otherwise a transformer's slice can swallow an unrelated property the provider emitted between two fragments. So
Net: one ordering mechanism. The slice machinery is tightened to just slice-indexing, a dead source-path reorder transformer is removed, and regression specs cover the cases that surfaced: a nested joined object declared before its scalar siblings; a single-valued object split into per-table fragments coalescing into one; and a property whose fragments are split around an unrelated property, keeping that property out of its slice.
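
The reason the pass has to run before the slice transformers can be shown with a small sketch. It assumes tokens are strings of the form `property:value` and that a slice is the contiguous range from the first to the last token of a property; `SliceContiguitySketch`, `owner`, `slice`, `inSchemaOrder` and the sample data are hypothetical and only illustrate the effect, not the getSlice implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of why fragments must be contiguous before slicing.
final class SliceContiguitySketch {

  // Property a token belongs to, e.g. "names" for "names:Alice".
  static String owner(String token) {
    return token.substring(0, token.indexOf(':'));
  }

  // Contiguous range from the first to the last token of the property.
  static List<String> slice(List<String> tokens, String property) {
    int first = -1;
    int last = -1;
    for (int i = 0; i < tokens.size(); i++) {
      if (owner(tokens.get(i)).equals(property)) {
        if (first < 0) {
          first = i;
        }
        last = i;
      }
    }
    return first < 0 ? List.of() : new ArrayList<>(tokens.subList(first, last + 1));
  }

  // Stable sort by declared position, so fragments of one property become adjacent.
  static List<String> inSchemaOrder(List<String> tokens, List<String> schema) {
    List<String> sorted = new ArrayList<>(tokens);
    sorted.sort(Comparator.comparingInt((String t) -> schema.indexOf(owner(t))));
    return sorted;
  }

  public static void main(String[] args) {
    List<String> schema = List.of("names", "codes");
    // "names" is produced from two tables, with "codes" emitted in between.
    List<String> arrival = List.of("names:Alice", "codes:X", "names:Bob");
    System.out.println(slice(arrival, "names"));
    System.out.println(slice(inSchemaOrder(arrival, schema), "names"));
  }
}
```

On the arrival-order buffer the slice for `names` includes the unrelated `codes:X` token; after the stable sort it contains only the two `names` fragments.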
…lice getSlice threw IndexOutOfBoundsException (fromIndex < 0) on features where an in-buffer slice transformer shrinks a property's slice and a later property is absent from the feature. computeIndex fills the slice index from scratch and recorded a span only for positions that have tokens, leaving every absent property at start 0. replaceSlice shrinking a present slice shifts the start of every later position by the negative size delta, driving an absent position after it below zero; the next getSlice then called buffer.subList with a negative fromIndex. The previous incremental index maintenance kept a valid offset for every position, including empty ones; the single-pass rebuild dropped that. After computing spans, a forward scan now stamps every empty position with a valid buffer offset (the boundary between its occupied neighbours). The buffer is in schema-position order at this point, so the offsets are monotonic and non-negative, and only top-level enclosing positions carry a non-zero length.
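
The failure and the fix described above can be reproduced in a reduced form. The sketch below assumes a flat index with one start and one length per schema position; `SliceIndexSketch`, `resize` and the `stampEmpty` switch are hypothetical stand-ins for computeIndex and replaceSlice, and nested positions are ignored.

```java
// Hypothetical sketch of the slice index, not the FeatureEventBuffer code.
final class SliceIndexSketch {
  final int[] start;  // buffer offset of each schema position
  final int[] length; // number of tokens at each schema position

  // tokenPositions holds the schema position of each buffered token, in schema-position order.
  SliceIndexSketch(int positions, int[] tokenPositions, boolean stampEmpty) {
    start = new int[positions];
    length = new int[positions];
    for (int i = 0; i < tokenPositions.length; i++) {
      int p = tokenPositions[i];
      if (length[p] == 0) {
        start[p] = i; // first token of this position
      }
      length[p]++;
    }
    if (stampEmpty) {
      // Forward scan: an empty position gets the offset where its tokens would be inserted,
      // i.e. the end of the closest occupied position before it.
      int offset = 0;
      for (int p = 0; p < positions; p++) {
        if (length[p] == 0) {
          start[p] = offset;
        } else {
          offset = start[p] + length[p];
        }
      }
    }
  }

  // Models a slice rewrite that changes the size of position p:
  // every later position shifts by the same delta.
  void resize(int p, int newLength) {
    int delta = newLength - length[p];
    length[p] = newLength;
    for (int q = p + 1; q < start.length; q++) {
      start[q] += delta;
    }
  }

  public static void main(String[] args) {
    int[] tokens = {0, 0, 0, 2}; // position 1 is absent from the feature

    SliceIndexSketch naive = new SliceIndexSketch(3, tokens, false);
    naive.resize(0, 1);
    System.out.println("without stamping: start[1] = " + naive.start[1]);

    SliceIndexSketch stamped = new SliceIndexSketch(3, tokens, true);
    stamped.resize(0, 1);
    System.out.println("with stamping: start[1] = " + stamped.start[1]);
  }
}
```

Without the forward scan the absent position starts at 0 and is pushed to -2 by the shrink, which is the negative fromIndex from the exception; with the scan it starts at 3 and ends at 1, the same offset as the next occupied position.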
…xy/xtraplatform-spatial into fix-feature-property-order
azahnen
left a comment
Looks good besides the one little nitpick.
After the schema-order rework, the per-token position in FeatureTokenTransformerMappings was used only by the `if (pos > -1)` guard, which filtered exactly one thing: the path-less feature-root object. A path-less marker resolves to the root schema (an object) but has no schema position, so pos() returns -1 while schema() is present; for every other token pos and schema agree. Drop pos from all handlers. The object handlers now exclude the root explicitly with `!schema.isFeature()` (isObject and empty parent path) - self-documenting and provably the same set as pos > -1, since the only token with pos == -1 and a present schema is that root. The array, value and geometry handlers need no such check: the object root fails their type checks already.
NOTE: This is a change in the core of the feature pipeline and needs broad testing before the release of ldproxy 4.8.