Skip to content

Add findFirst and findFirstAny: return the first ordered match, pushing Order + limit-1 to Concourse #139

Description

@jtnelson

Summary

I propose adding findFirst and its hierarchy-inclusive sibling findFirstAny to the Runway read API. Each method returns the first Record that matches a Criteria under a caller-supplied Order, or null when nothing matches. "First" is defined entirely by the Order: the method pushes the Order together with a one-row page down to Concourse so the server returns a single sorted row, rather than loading every match and discarding all but one.

This is a primitive for a new data-sync capability in cinchapi-server, a load-balanced service that runs across Kubernetes pods that scale up and down. Connector configuration (sources, credentials, sync schedule, and a lock field) lives in a shared Concourse instance, and there is no external coordinator. Each instance runs a loop that claims the first available connection (one whose lock is unset, or whose lock is stale because the holder died), records the lock, runs the sync, then releases it. "Pick the first eligible record under a deterministic ordering" is exactly that claim step, and pushing the ordering and the limit to the server keeps the hot loop cheap as the connector set grows. Stale-lock detection is expressed in the query itself (a lock is treated as expired when its stored timestamp value precedes a timeout cutoff) rather than by mutating stored state, so the claim is a pure read followed by a conditional write.

findFirst is the foundation for the findFirstAndEdit family (a separate ticket), which will wrap this selection in an atomic claim-and-update. This ticket delivers only the read primitive.

Proposed API / Syntax

I propose new default methods on DatabaseInterface (src/main/java/com/cinchapi/runway/DatabaseInterface.java). Because every concrete implementer already inherits the read API as default methods over the single abstract select(Selection<?>...), adding these as defaults gives every implementer the behavior for free.

Minimum surface:

<T extends Record> T findFirst(Class<T> clazz, Criteria criteria, Order order);

<T extends Record> T findFirstAny(Class<T> clazz, Criteria criteria,
        Order order);

Mirroring the find overload shape (the realms and filter axes already present on find), I propose these additional overloads for both findFirst and findFirstAny:

// Realms-scoped
<T extends Record> T findFirst(Class<T> clazz, Criteria criteria, Order order,
        Realms realms);

// Client-side predicate
<T extends Record> T findFirst(Class<T> clazz, Criteria criteria, Order order,
        Predicate<T> filter);

// Predicate + realms
<T extends Record> T findFirst(Class<T> clazz, Criteria criteria, Order order,
        Predicate<T> filter, Realms realms);

Per the overload-delegation convention, the simpler overloads delegate to the most-parameterized one (which supplies Realms.any() and a no-op filter as defaults), and that overload contains the single real implementation.

Semantics (the contract callers can rely on):

  • Return value. It returns the single first matching Record, or null when no record matches. null is the zero-match signal, consistent with findUnique(Class, Criteria) and load(Class, long) already returning null on no match. The method never returns an empty collection because it does not return a collection.
  • Ordering. "First" is the first element under order. Ties between records that are equal under order are broken by the underlying engine and are not otherwise specified, and the native and legacy execution paths (see Performance) sort in different places (server-side versus client-side), so their tie-breaking is not guaranteed to agree. The result is therefore deterministic, and stable across both paths, when order is total: callers who need a fully deterministic pick should include a unique tiebreaker key (for example the record identifier) in the Order.
  • No duplicate detection. Unlike findUnique, findFirst never inspects whether more than one record matches and never throws DuplicateEntryException. It is therefore strictly cheaper than findUnique: it requests one row instead of two and performs no second-row check.
  • Order is required. I propose that order be a required parameter on every overload and that callers pass a non-null Order. Without an order, "first" is meaningless. I am not proposing an order-less overload; callers who genuinely do not care about ordering should use the existing find(Class, Criteria, Page) family with a one-row page. (If we later decide to offer an order-less convenience, its Javadoc must state that the resulting order is undefined/natural and may differ between server versions.)
  • findFirstAny applies the same contract across clazz and all of its descendants, matching how findAny relates to find.
  • Predicate interaction (important caveat). The optional Predicate<T> filter is evaluated client-side, after rows come back from the server. A client-side predicate combined with a one-row request could reject the single returned row and yield null even though a matching record exists further down the order. This is the same filter-plus-page hazard that find(..., Page, Predicate) already has, and Runway's existing machinery handles it: when both a filter and a page are present, $selectCriteria (Runway.java:1532) drives an incremental page-advancing retriever (Pagination.applyFilterAndPage) so the page limit is applied to the post-filter stream, not the pre-filter rows. findFirst must route through that same path so that "first record that matches the criteria and passes the filter" is honored rather than "first criteria match, then drop it if the filter fails." The overloads without a filter have no such caveat. Callers should prefer expressing the condition in the Criteria so it pushes to the server; the predicate is a convenience for conditions that cannot be expressed in CCL.

A CCL String-condition variant is intentionally not part of this first cut: the find/findUnique families today take a Criteria, not a CCL String (the only String-typed sort/condition overloads in the family are the deprecated sort-spec ones). Callers who hold CCL text can convert with Criteria.parse(String) before calling findFirst. A future String-condition overload could be added uniformly across the find family, but it is out of scope here to avoid introducing an asymmetry only findFirst has.

Examples

Claim the first eligible connector (lock unset, or its stored lock timestamp older than the staleness cutoff), preferring the connection that has been waiting longest:

long cutoff = Time.now() - LOCK_TIMEOUT_MICROS;
Criteria claimable = Criteria.where()
        .key("locked").operator(Operator.EQUALS).value(false)
        .or()
        .key("lockedAt").operator(Operator.LESS_THAN).value(cutoff)
        .build();

Connector next = runway.findFirst(Connector.class, claimable,
        Order.by("lastSyncedAt").ascending());
if(next != null) {
    // record the lock, run the sync, then release (see findFirstAndEdit
    // ticket for the atomic claim-and-update wrapper)
}
else {
    // nothing to claim right now
}

Hierarchy-inclusive variant, when connector subtypes are modeled as descendant classes:

Connector next = runway.findFirstAny(Connector.class, claimable,
        Order.by("lastSyncedAt").ascending());

Scoping the claim to a tenant's realm:

Connector next = runway.findFirst(Connector.class, claimable,
        Order.by("lastSyncedAt").ascending(), Realms.only("acme"));

null simply means "no first match," so the loop can treat it as "nothing to do this tick."

Implementation pointers

Implement entirely as default methods on DatabaseInterface, built from the Selection fluent API and routed through the existing fetch(Selection) dispatch. There is no need to add a new Selection subtype or touch Runway.select.

  • Build the selection. Use the fluent builders in Selection (src/main/java/com/cinchapi/runway/Selection.java): Selection.of(clazz) / Selection.ofAny(clazz), then .where(criteria), .order(order), .page(Page.limit(1)), and optionally .filter(filter) and .realms(realms). Page.limit(int) is the same factory UNIQUE_PAGINATION = Page.limit(2) uses at DatabaseInterface.java:62; here the limit is 1.
  • Execute and collapse. fetch(Selection) (DatabaseInterface.java:1919) returns select(selection).next(). For a criteria-bearing, non-unique selection that result is an ordered Set<T> whose iteration order reflects the requested Order. Take the first element with Iterables.getFirst(resultSet, null), an idiom already used in this codebase (Record.java). That yields the record or null when empty.
  • Mirror, then diverge from, findUnique. findUnique (DatabaseInterface.java:1106) builds an ofUnique selection and lets $selectUnique (Runway.java:1800) set page = UNIQUE_PAGINATION (Page.limit(2)), drop the Order, and throw on a second row. findFirst is the cheaper cousin: it keeps the Order, uses Page.limit(1), and does no duplicate check, so it builds an ordinary (non-unique) FindSelection rather than a UniqueSelection. Concretely, findFirst does not call into $selectUnique at all; it produces a Set-valued selection and reads the head.
  • Order + limit-1 push-down is already wired. Both $selectCriteria (Runway.java:1532) and $selectClass (Runway.java:1395) branch on hasNativeSortingAndPagination (field at Runway.java:420, set to serverVersion >= 0.10 at Runway.java:551). On native servers the Order and one-row Page are sent to Concourse and a single sorted row comes back. On legacy servers the same code path falls back to fetch-all, then client-side Record.compareTo sort, then stream.skip(...).limit(...) (Runway.java:1606-1623). findFirst is correct on both paths and cheap on native ones; no new branching is required.
  • Filter + one-row page. Route the filtered overloads through the same selection so they hit the filter-plus-page handling in $selectCriteria (the hasFilter && page != null branch using Pagination.applyFilterAndPage, Runway.java:1552). That guarantees the one-row limit is applied after the predicate, avoiding the "single row rejected by filter" pitfall described in the contract.
  • Javadoc and ordering placement. Follow the file's conventions: full block Javadoc with @param/@return, {@link} references, imperative voice ("Find and return the first ..."), and @author Jeff Nelson. Place the new methods alphabetically among the find* defaults (after the findAny* group, near findUnique).

Performance

  • Cost. On a native server (>= 0.10), each call is one server round trip that returns at most one row: the engine applies the index-backed sort and the one-row page server-side. This avoids materializing the full match set and avoids any client-side sort, so cost is bounded by the engine's ordered lookup rather than by the number of matches.
  • Cheaper than findUnique. findUnique requests two rows and performs a second-row check to detect duplicates; findFirst requests one row and does neither, so it does strictly less work for the claim use case.
  • No O(n) head-of-collection scan. The returned ordered set materializes records lazily, so reading the first element via Iterables.getFirst(set, null) only transforms the head element; the method never iterates or instantiates the whole result.
  • Legacy servers. Where native sorting/pagination is unavailable, the existing fallback (Runway.java:1606-1623) fetches the candidate set, sorts client-side via Record.compareTo, then skips/limits. This is the same cost profile the rest of the ordered read API already pays on legacy servers; findFirst introduces no new full scan beyond what find(..., Order, Page) already incurs there. The connector use case targets a modern Concourse, so the native path is the expected one in production.
  • Avoid the filter trap on large sets. When a client-side Predicate is supplied, rely on the page-advancing retriever rather than fetching everything and filtering in memory, so a one-row result over a large match set does not degrade into loading the whole set. Prefer expressing conditions in the Criteria so they push to the server and the predicate stays empty.

Testing

Add a test class (for example FindFirstTest) under src/test/java/com/cinchapi/runway/, extending RunwayBaseClientServerTest, which requires a live Concourse server. Per repo policy, write these tests but do not run them. Follow the established conventions:

  • Parameterize over the save/read paths. Use @RunWith(Parameterized.class) with a bulkCommands={true,false} matrix and flip supportsBulkCommands on the runway instance in beforeTestRun, exactly as PreventStaleWriteTest does, so the behavior is verified on both command-API capabilities.
  • Mandatory test Javadoc. Every @Test method must carry the four-section block Javadoc (Goal, Start state, Workflow as a <ul>, Expected) used throughout PreventStaleWriteTest.
  • Descriptive names. Name methods for the behavior, e.g. testFindFirstReturnsLowestUnderAscendingOrder, testFindFirstReturnsNullWhenNoMatch, testFindFirstHonorsDescendingOrder, testFindFirstAnyIncludesDescendants, testFindFirstWithFilterSkipsRejectedHeadRow, testFindFirstRespectsRealms.

Behaviors to cover:

  • Returns the first record under an ascending Order, and the correct (different) record under a descending Order, over the same match set.
  • Returns null when the criteria matches nothing.
  • Returns the sole match when exactly one record matches.
  • Does not throw when more than one record matches (the key contrast with findUnique): it returns the first under the order.
  • findFirstAny returns a descendant-typed record when the first match is a subclass instance, and excludes descendants when findFirst is used.
  • Realms scoping: a record outside the requested realms is not returned even if it would be first overall.
  • Filter caveat: with a predicate that rejects the order-first row, the method returns the next row that both matches the criteria and passes the filter (not null, and not the rejected row), confirming the limit is applied post-filter.
  • A connector-flavored claim test: seed several records with locked/lockedAt, assert that findFirst under Order.by("lastSyncedAt").ascending() returns the expected eligible record, including the stale-lock case where a record whose stored lockedAt value precedes the cutoff is considered claimable.

Acceptance criteria

  • findFirst(Class, Criteria, Order) and findFirstAny(Class, Criteria, Order) exist as default methods on DatabaseInterface and are inherited by all implementers.
  • Realms, predicate-filter, and predicate-plus-realms overloads exist for both methods, with simpler overloads delegating to a single most-parameterized implementation.
  • Each method returns the first matching record under the supplied Order, or null when there is no match.
  • Neither method performs duplicate detection or throws DuplicateEntryException; both are strictly cheaper than the corresponding findUnique.
  • The selection pushes Order plus a one-row page to the server on native-capable servers, and remains correct on legacy servers via the existing client-side fallback.
  • The filtered overloads apply the one-row limit after the client-side predicate (no "head row rejected by filter" false negative).
  • Full block Javadoc on every new method (imperative voice, {@link} references, @param/@return, @author Jeff Nelson); the predicate caveat and the null-on-no-match contract are documented.
  • A parameterized RunwayBaseClientServerTest subclass covers the behaviors above, with the mandatory four-section test Javadoc. Tests are written but not run.
  • ./gradlew spotlessApply has been run and the diff is purely cosmetic.

Scope and dependencies

In scope for the first cut:

  • findFirst and findFirstAny with the Criteria + Order core and the realms / predicate-filter overloads, as default methods on DatabaseInterface.
  • null-on-no-match semantics and the documented filter interaction.

Out of scope for the first cut:

  • A CCL String-condition overload (callers convert with Criteria.parse(String)); if added later it should be added uniformly across the find family, not only on findFirst.
  • Any order-less convenience overload.
  • The atomic claim-and-update behavior.

Cross-repo dependency: this primitive is consumed by the data-sync connector-claim loop in cinchapi-server, and it is the foundation that the findFirstAndEdit family (a separate Runway ticket) builds on to provide the atomic claim-and-update used by that loop. This ticket delivers only the read primitive; the atomic editing variant is tracked separately.

Related

Part of the connector data-sync locking initiative (cinchapi-server).

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Fields

    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions