Unify pagination tokenizers on a shared helper (+ fix Namespace+ID cursor ordering)#4
Open
afreidah wants to merge 1 commit into
Open
Unify pagination tokenizers on a shared helper (+ fix Namespace+ID cursor ordering)#4afreidah wants to merge 1 commit into
afreidah wants to merge 1 commit into
Conversation
…e+ID cursor ordering Consolidate the four pagination tokenizers (ID, Namespace+ID, CreateIndex+ID, ModifyIndex+Namespace+ID) onto a single tokenAndCompare helper that builds a '.'-joined token from an ordered list of numeric/string fields and compares it field-by-field, matching the order memdb iterates the underlying index. This fixes a latent bug in NamespaceIDTokenizer: it compared the joined "<namespace>.<id>" token as a whole string, so a namespace whose name was a prefix of another differing by a dash (e.g. "team" vs "team-a") ordered "team-a.*" before "team.*" (because '.' > '-'), disagreeing with the state store's (Namespace, ID) order. That made pagination duplicate jobs in one namespace and skip the other, repeating a page forever. Add unit tests for the Namespace+ID tiebreaker and a Job.List integration test (TestJobEndpoint_ListJobs_NamespacePagination) that walks pages across the "team"/"team-a" namespaces; it fails before this change and passes after.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
This is the larger-scope alternative to hashicorp#28178 (the narrow
/v1/jobs/statusesModifyIndex cursor fix), opened on my fork so it can be linked for a side-by-side comparison and a scope decision.It consolidates the four pagination tokenizers in
nomad/state/paginator/tokenizer.go-IDTokenizer,NamespaceIDTokenizer,CreateIndexAndIDTokenizer, andModifyIndexAndNamespaceIDTokenizer- onto a singletokenAndComparehelper. Each tokenizer now just declares its ordered fields (numeric or string); the helper serializes the.-joined token and compares it field-by-field, in order, matching how memdb iterates the underlying index.Two payoffs:
/v1/jobs/statusesbug) becomes structurally impossible - every tokenizer carries its full key.NamespaceIDTokenizerfor free (below).The Namespace+ID bug
NamespaceIDTokenizerbuilt"<namespace>.<id>"and compared it as a whole string. memdb orders namespaced objects by the(Namespace, ID)compound index, whose fields are NUL-separated - and NUL sorts before every printable byte, so the index effectively compares namespace, then id. A whole-string compare with a.separator does not: when one namespace name is a prefix of another that differs by a dash (e.g.teamvsteam-a),"team.j1"vs"team-a.j1"compares.(0x2E) against-(0x2D), so the token ordersteam-abeforeteam- the opposite of the state store.Result, when paginating across those namespaces: jobs in
teamare duplicated across pages, jobs inteam-abecome unreachable, and the cursor can return the samenext_tokenforever. This affects every endpoint that uses the Namespace+ID cursor (jobs, variables, CSI/host volumes, service registrations, and others).Comparing field-by-field (namespace, then id) - which the shared helper does for all tokenizers - matches the memdb order and fixes it. The token format is unchanged (
namespace.id), so existing and in-flight tokens stay valid across an upgrade.Reproduction
A/B harness, namespaces
teamandteam-a, 4 jobs each,per_page=4:https://github.com/afreidah/nomad/blob/repro-jobs-statuses-28132/repro-28132/ns-order-demo.sh
Before:
After: every job is returned exactly once and the walk terminates.
Changes
nomad/state/paginator/tokenizer.go- addtokenAndCompare+tokenField; reimplement all four tokenizers on top of it. Token formats and the legacy bare-integer fallback are preserved.nomad/state/paginator/tokenizer_test.go- unit tests for the Namespace+ID tiebreaker (including theteam/team-acase) and edges.nomad/job_endpoint_test.go-TestJobEndpoint_ListJobs_NamespacePagination, aJob.Listwalk across theteam/team-anamespaces.Testing
The new integration test fails on
mainand passes with this change:Scope
This supersets hashicorp#28178. Open question for the maintainers: keep the narrow hashicorp#28178 and track the Namespace+ID bug and the refactor separately, or take this unified version instead. Equally happy either way - that's the call I'd like your input on.
AI usage
Flagging for transparency, consistent with hashicorp#28178. The root-cause analysis, the reproduction, and the verification are mine: I confirmed the
team/team-afailure and the before/after using the harness and the integration test. This refactor's Go was written with AI assistance (the sharedtokenAndComparehelper and the new tests), working from that analysis and the agreed design - unlike hashicorp#28178, where the change was a one-line bool flip and no AI touched the source. I have reviewed and understand every line and I stand behind it.