release/2.1: Fix off-by-one errors in CollationSpecialPrimariesValidated#8075
Conversation
Manishearth
left a comment
There was a problem hiding this comment.
It's kind of hard to understand what actually broke here, but that's because I don't understand this part of the collation api
| let special_primaries = special_primaries.map_project(|csp, _| { | ||
| let compressible_bytes = (csp.last_primaries.len() | ||
| == MaxVariable::Currency as usize + 16) | ||
| == MaxVariable::Currency as usize + 17) |
There was a problem hiding this comment.
While we're here we should add a comment fixing the magic numbers
| .as_maybe_borrowed()? | ||
| .as_ule_slice() | ||
| .get((MaxVariable::Currency as usize)..)? | ||
| .get((MaxVariable::Currency as usize + 1)..)? |
There was a problem hiding this comment.
and just explaining the +1 here
| .as_slice() | ||
| .as_ule_slice() | ||
| .split_at(MaxVariable::Currency as usize) | ||
| .split_at(MaxVariable::Currency as usize + 1) |
There was a problem hiding this comment.
actually, should these be static methods on MaxVariable? after_currency and something different for the 17 one?
| // TODO: Consider testing ff-Adlm for supplementary-plane tailoring, including contractions | ||
|
|
||
| #[test] | ||
| fn test_repro_user_bug() { |
There was a problem hiding this comment.
🤖 Done! The test has been renamed to test_shifted_max_variable_currency and a detailed comment has been added to explain what it is verifying.
|
@sffc Also I believe our patch release stance is to merge to main and then copy a patch over ,yes? |
|
Ah, okay |
There was a problem hiding this comment.
also here, why is this needed? if this is needed it should be a separate PR on the release branch ("Fix CI") and not part of this commit, which might be cherry-picked, patched, etc.
| ( | ||
| "collator", | ||
| icu::collator::provider::MARKERS, | ||
| "version = \"2.1.2\"", |
There was a problem hiding this comment.
there's no data diff, why are you bumping this?
There was a problem hiding this comment.
good catch! I missed this. Fixed.
| let payload: DataPayload<CollationSpecialPrimariesV1> = | ||
| provider.load(Default::default()).unwrap().payload; | ||
| let csp = payload.get(); | ||
| assert!(csp.last_primaries.len() > MaxVariable::Currency as usize); |
There was a problem hiding this comment.
issue: some condition on the data struct is not a "user bug". this seems to be testing implementation details, why does the constructor call below not suffice as a test?
There was a problem hiding this comment.
Checking the root cause seems harmless, but since you labeled this "issue", I assume you consider this blocking, so I will remove this extra assertion.
There was a problem hiding this comment.
🤖 Done! The extra assertion has been removed, leaving only the constructor and comparison test.
There was a problem hiding this comment.
we have a pinned toolchain, a pinned clippy version, a pinned MSRV, etc., so why does this unrelated code need to change?
There was a problem hiding this comment.
Somehow this got patched into the 2.1 branch without the docs. fixing in #8078
robertbastian
left a comment
There was a problem hiding this comment.
Context on 2.2 (main)
In ICU4X 2.2 (main branch), this was refactored.
Main is not 2.2, main is 2.3. This bug exists on 2.2, in fact users that depend on version = "2" will be on 2.2, so this should be fixed first and foremost on the 2.2 release branch, and then backported to other releases if needed (2.1, 2.0). This PR only fixes users that depend on version = "~2.1", which, while the conformance project does that, is a small minority of users.
|
The force-push was to make this PR sit cleanly on top of the CI-fixing commits which are moved into #8078 |
|
There are now 4 PRs:
|
|
🎉 All dependencies have been resolved ! |
There was a problem hiding this comment.
I don't particularly like the architecture of the constants, but as this is not main I don't care. I have left comments on #8083.
Also bumped--> this will be done in another PRicu_collatorto2.1.2.
This should be done in the same PR, doing this in a different PR is extra work for downstream users if they want to patch this fix.
I thought we decoupled fixes from the Cargo.toml diff, since people who vendor don't particularly care about the Cargo.toml diff, and it makes it harder to apply the same diff across multiple versions (like 2.1 and 2.2)? |
|
I'll update this after #8089 lands. |
Fixes a critical off-by-one bug in icu_collator 2.1.x that causes a panic when using AlternateHandling::Shifted with MaxVariable::Currency. When converting CollationSpecialPrimaries to CollationSpecialPrimariesValidated, the last_primaries vector was incorrectly truncated to MaxVariable::Currency as usize (3) instead of MaxVariable::Currency as usize + 1 (4). This resulted in last_primaries lacking the 4th element (index 3, for Currency), causing a panic (unwrap on None) during comparison when last_primary_for_group was called. Additionally, compressible_bytes extraction was misaligned by one index and failed the length check, causing it to always fall back to hardcoded defaults. This commit squashes the fix and the regression test into a single clean commit. Co-authored-by: Gemini <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Depends on #8078
On release/2.2: #8081
On main: #8080
This PR fixes a critical off-by-one bug in
icu_collator2.1.x that causes a panic when usingAlternateHandling::ShiftedwithMaxVariable::Currency.This bug was found by the Unicode conformance testing project (credit to @sven-oly).
Cause
When converting
CollationSpecialPrimariestoCollationSpecialPrimariesValidated, thelast_primariesvector was incorrectly truncated toMaxVariable::Currency as usize(3) instead ofMaxVariable::Currency as usize + 1(4). This resulted inlast_primarieslacking the 4th element (index 3, forCurrency), causing a panic (unwraponNone) during comparison whenlast_primary_for_groupwas called.Additionally,
compressible_bytesextraction was misaligned by one index and failed the length check, causing it to always fall back to hardcoded defaults.Fix
Corrected the off-by-one errors in both
Collator::try_new_unstableandCollatorBorrowed::try_newincomparison.rsby adding 1 toMaxVariable::Currency as usizewhere appropriate.Also bumped--> this will be done in another PRicu_collatorto2.1.2.Context on main branch
On the main branch, this was refactored. The intermediate
CollationSpecialPrimariesValidatedstruct was removed, andCollationSpecialPrimariesis used directly. The bug is naturally avoided there because the deserialization logic correctly usesMaxVariable::Currency as usize + 1to split the data:See: provider.rs:L635 on main
🤖 This pull request was created by an AI agent working with @sffc.
Changelog
icu_collator (2.1.2)
AlternateHandling::ShiftedwithMaxVariable::Currency(off-by-one in special primaries validation).