Implement single language display names #8082

Open
sffc wants to merge 18 commits into unicode-org:main from sffc:sffc/displaynames-phase2

@sffc sffc commented Jun 16, 2026

Member

Depends on #8085

Implements the single display names component for Languages, bringing them up to the planned specification. This PR lands the core implementation of:

  • LanguageDisplayNameOwned & LanguageDisplayName: Supports standard and dialect formatting (dialect resolution is fully implemented). Currently limited to the default Medium width.

The QualifiersWriteable helper makes it all zero-copy.

This does not implement menu names. That will be in an upcoming PR.

Addresses #7824 and #7825.

🤖 This pull request was created by an AI agent working with @sffc.

Open Questions

  • Constructor Argument Order: In LanguageDisplayNameOwned::try_new(prefs, locale_id, options), we have placed options last. This is because options behaves like a trailing optional bag (similar to varargs in other languages). Does this match the preferred style for ICU4X constructors, or should options be placed elsewhere?
  • Writeable on Owned Types: Currently, owned types like ScriptDisplayNameOwned, RegionDisplayNameOwned, VariantDisplayNameOwned, and LanguageDisplayNameOwned implement Writeable (by forwarding to their borrowed counterparts via .as_borrowed()). Should owned types implement Writeable directly, or should users be forced to call .as_borrowed() to format them? Keeping Writeable on owned types is convenient but increases API surface. We should decide if we want to keep this pattern or deprecate it.
  • Fallback Behavior in Single Formatters: Currently, single formatters (Script, Region, Variant, Language) fail-fast in the constructor (try_new returns Err(DataError)) if the specific subtag data is missing from the provider (e.g., xx or an untranslated language). Should we redesign them to support fallback to the code (similar to the multi formatter) by storing the input identifier and making the payloads optional? Or is the current fail-fast behavior preferred for single formatters? If we choose fallback, we will need to update the existing Region and Script formatters in a future PR to maintain consistency.
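For the second question, the forwarding pattern under discussion can be sketched with standard-library types only. This is an illustration, not the code in this PR: `fmt::Display` stands in for `Writeable`, and `RegionNameOwned` / `RegionName` are hypothetical, simplified types.

```rust
use std::fmt;

/// Hypothetical owned formatter: holds the resolved display name.
pub struct RegionNameOwned {
    name: String,
}

/// Hypothetical borrowed counterpart: a cheap view into the owned data.
#[derive(Clone, Copy)]
pub struct RegionName<'a> {
    name: &'a str,
}

impl RegionNameOwned {
    pub fn new(name: &str) -> Self {
        Self { name: name.to_string() }
    }

    /// Borrow the owned data as the lightweight formatting type.
    pub fn as_borrowed(&self) -> RegionName<'_> {
        RegionName { name: &self.name }
    }
}

// The borrowed type carries the actual formatting logic.
impl fmt::Display for RegionName<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(self.name)
    }
}

// The owned type only forwards to the borrowed one.
impl fmt::Display for RegionNameOwned {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        fmt::Display::fmt(&self.as_borrowed(), f)
    }
}
```

With the forwarding impl, callers can format the owned value directly. Without it, they would have to write `owned.as_borrowed()` at every call site, which is the trade-off the question asks about.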

Changelog

  • icu_experimental:
    • Added displaynames::single::LanguageDisplayNameOwned and LanguageDisplayName for formatting language display names.

@sffc-bot sffc-bot left a comment

🤖 This review was submitted by an AI agent working with @sffc.

The implementation looks excellent! The zero-allocation formatting pipeline using QualifiersWriteable is very clean and efficient, and the case-insensitivity fix for variants is correct. All tests are passing.

I have left a few minor comments regarding:

  1. An outdated status comment in single.rs.
  2. A typo in the README.md constructor signature.
  3. An obsolete TODO comment in the tests.

Regarding the open questions:

  1. Writeable on Owned Types: Keep it (convenient and consistent).
  2. Constructor Argument Order: The code's order (prefs, locale_id, options) is correct (consistent). Typo in README should be fixed.
  3. Fallback Behavior: Keep the current fail-fast behavior (consistent and efficient).

Overall, great work!

Comment thread components/experimental/src/displaynames/single.rs Outdated
Comment thread components/experimental/src/displaynames/single/README.md Outdated
Comment thread components/experimental/tests/displaynames/tests.rs Outdated

@sffc-bot sffc-bot left a comment

🤖 All minor review comments have been addressed and verified with tests! I have also created a separate standalone PR #8084 for the posix casing fixes as requested.

Comment thread components/experimental/src/displaynames/single.rs Outdated
Comment thread components/experimental/src/displaynames/single/README.md Outdated
Comment thread components/experimental/tests/displaynames/tests.rs Outdated
@sffc sffc force-pushed the sffc/displaynames-phase2 branch 3 times, most recently from f87b20e to e7b83ab Compare June 16, 2026 03:03
@sffc sffc marked this pull request as ready for review June 16, 2026 03:07
@sffc sffc requested review from a team, Manishearth, robertbastian and snktd as code owners June 16, 2026 03:07
@sffc sffc removed the request for review from snktd June 16, 2026 03:07
sffc commented Jun 16, 2026

Member Author

This PR is reviewable commit-by-commit, but each commit is split into its own PR.

@sffc sffc changed the title from "Implement Language and Variant display names" to "Implement language display names in single module" on Jun 16, 2026
@sffc sffc changed the title from "Implement language display names in single module" to "Implement single language display names" on Jun 16, 2026
sffc and others added 2 commits June 16, 2026 15:00
- Implement VariantDisplayName and VariantDisplayNameOwned in single/variant.rs.

- Refactor single.rs to only re-export Variant, Region, and Script.

- Refactor singular.rs to use unified Medium markers instead of Long markers.

Co-authored-by: Gemini <176961590+gemini-code-assist[bot]@users.noreply.github.com>
- Implement LanguageDisplayName and LanguageDisplayNameOwned in single/language.rs.

- Re-export LanguageDisplayName in single.rs.

- Integrate CLDR test cases into integration tests.

- Update README.md with open questions.

Co-authored-by: Gemini <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@sffc sffc force-pushed the sffc/displaynames-phase2 branch from e7b83ab to 6f010da Compare June 16, 2026 22:01
@dpulls

This comment was marked as outdated.

@Manishearth

Member

Constructor Argument Order: In LanguageDisplayNameOwned::try_new(prefs, locale_id, options), we have placed options last. This is because options behaves like a trailing optional bag (similar to varargs in other languages). Does this match the preferred style for ICU4X constructors, or should options be placed elsewhere?

This is the normal style

Writeable on Owned Types: Currently, owned types like ScriptDisplayNameOwned, RegionDisplayNameOwned, VariantDisplayNameOwned, and LanguageDisplayNameOwned implement Writeable (by forwarding to their borrowed counterparts via .as_borrowed()). Should owned types implement Writeable directly, or should users be forced to call .as_borrowed() to format them? Keeping Writeable on owned types is convenient but increases API surface. We should decide if we want to keep this pattern or deprecate it.

No harm in having multiple Writeable impls. Hopefully it optimizes well in terms of code size.

Fallback Behavior in Single Formatters: Currently, single formatters (Script, Region, Variant, Language) fail-fast in the constructor (try_new returns Err(DataError)) if the specific subtag data is missing from the provider (e.g., xx or an untranslated language). Should we redesign them to support fallback to the code (similar to the multi formatter) by storing the input identifier and making the payloads optional? Or is the current fail-fast behavior preferred for single formatters? If we choose fallback, we will need to update the existing Region and Script formatters in a future PR to maintain consistency.

Say more about what this would look like? I'm not sure I understand "storing the input identifier and making the payloads optional"
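One possible reading of "storing the input identifier and making the payloads optional" is sketched below. Everything here is hypothetical and uses only the standard library: `RegionNameOrCode`, `toy_lookup`, and the use of `fmt::Display` in place of `Writeable` are assumptions made for illustration, not the design in this PR.

```rust
use std::fmt;

/// Hypothetical formatter whose constructor does not fail on a missing name.
pub struct RegionNameOrCode {
    /// The input identifier, kept so it can be printed as a fallback.
    code: String,
    /// The localized name, or `None` when the data source has no entry.
    name: Option<String>,
}

impl RegionNameOrCode {
    /// `lookup` stands in for a data provider load.
    pub fn new(code: &str, lookup: impl Fn(&str) -> Option<String>) -> Self {
        Self {
            code: code.to_string(),
            name: lookup(code),
        }
    }
}

impl fmt::Display for RegionNameOrCode {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Prefer the localized name; otherwise fall back to the code itself.
        f.write_str(self.name.as_deref().unwrap_or(self.code.as_str()))
    }
}

/// Toy data source with a single entry.
pub fn toy_lookup(code: &str) -> Option<String> {
    match code {
        "HK" => Some("Hong Kong SAR China".to_string()),
        _ => None,
    }
}
```

The cost of this shape is that the formatter has to carry the identifier around even when a name was found, which is the part that differs from the current fail-fast constructors.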

@robertbastian robertbastian left a comment

Member

A missing name should not be a DataError; it should fall back to the code.

assert_eq!(result.capacity(), result.len());
}

// Test our new single formatter implementation (only for cases that are in the data, i.e. not "xx")

Member

please also test xx, if only to demonstrate how a client would do the fallback themselves

@sffc-bot

Deferred to #8100 since fallback is being implemented as a follow-up.

Member

in #8100 you state that you don't even want to implement fallback, which is exactly why I'm asking you to demonstrate here how a client would do it

Member Author

This test is testing LanguageDisplayName. I have no good way for a client to perform their own fallback other than falling back to the whole BCP-47 string.

I prefer to keep PRs small and incremental but I can implement it in this PR if you insist.

@sffc sffc Jun 19, 2026

Member Author

Actually I think the right way to do this is to implement TryWriteable, with error parts over the fallback strings. Do you agree, and will you let me do this in a follow-up?

Member Author

I started implementing this. It should work, but it needs some missing TryWriteable impls in icu_pattern. I'll make a separate PR for those. In the meantime, I think I would like to merge this PR without the TryWriteable.

Here's my branch: https://github.com/sffc/omnicu/tree/dname-trywriteable
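For readers unfamiliar with the idea, a minimal sketch of a TryWriteable-style formatter follows. It is not taken from the linked branch: the `TryWrite` trait is a hypothetical, standard-library-only stand-in, `NameOrCode` and `MissingName` are invented for illustration, and error parts are not modeled at all.

```rust
use std::fmt::{self, Write};

/// Hypothetical stand-in for a TryWriteable-style trait: the sink always
/// receives some output, and the inner `Result` reports whether that
/// output was a fallback.
pub trait TryWrite {
    type Error;
    fn try_write_to<W: Write>(
        &self,
        sink: &mut W,
    ) -> Result<Result<(), Self::Error>, fmt::Error>;
}

/// Error returned when no localized name was available.
#[derive(Debug, PartialEq)]
pub struct MissingName;

/// A code plus an optional localized name (both hypothetical fields).
pub struct NameOrCode<'a> {
    pub code: &'a str,
    pub name: Option<&'a str>,
}

impl TryWrite for NameOrCode<'_> {
    type Error = MissingName;

    fn try_write_to<W: Write>(
        &self,
        sink: &mut W,
    ) -> Result<Result<(), MissingName>, fmt::Error> {
        match self.name {
            Some(name) => {
                sink.write_str(name)?;
                Ok(Ok(()))
            }
            None => {
                // Write the fallback string first, then surface the error.
                sink.write_str(self.code)?;
                Ok(Err(MissingName))
            }
        }
    }
}
```

The point of the nested `Result` is that a caller who is happy with the code as a fallback can ignore the inner error, while a caller who needs a real name can still detect that one was missing.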

.expect("Data should load successfully");

- assert_writeable_eq!(lang_name, "Traditional Chinese (Hong Kong)");
+ assert_writeable_eq!(lang_name, "Traditional Chinese (Hong Kong SAR China)");

Member

why not test this as part of test_concatenate?

Member Author

It's an end-to-end test we are now enabling. Would you prefer to delete the whole test (and merge it in with one of the other tests)?

Member

yes

I don't agree with the concept of having disabled tests in code anyway

@sffc sffc Jun 19, 2026

Member Author

fixed in 6fc2b9f

/// ```
#[allow(dead_code)]
#[derive(Debug)]
pub struct LanguageDisplayNameOwned {

Member

issue: Language is the subtag, this formats LanguageIdentifiers, so it should be called LanguageIdentifierDisplayNameOwned.

the multi implementation takes &Locale, and we do want to format unicode extensions at some point

Member Author

When designing this, I settled on Language because:

  • A language display name formatter that formats only the language subtag is not well-defined in CLDR. We always read the other subtags to pick the correct dialect name.
  • This formatter does what most regular people expect for a type named LanguageDisplayName.
  • If we did implement a formatter with language-subtag-only behavior, calling it LanguageDisplayName would be misleading, because it isn't the thing most people want. It should be called something like LanguageSubtagDisplayName.
    • Note: in icu_locale_core, we have a Language type which is a subtag, but it is scoped inside the subtag module to emphasize what it is and is not.
  • ECMA-402 calls this "language".

In the design doc, I listed LanguageDisplayName as being the thing that formats a LanguageIdentifier.

I would like to stick with the design doc for this PR and do this bikeshed in a follow-up discussion.

Member

  • A language display name formatter that formats only the language subtag is not well-defined in CLDR. We always read the other subtags to pick the correct dialect name.
  • If we did implement a formatter with language-subtag-only behavior, calling it LanguageDisplayName would be misleading, because it isn't the thing most people want. It should be called something like LanguageSubtagDisplayName.
    • Note: in icu_locale_core, we have a Language type which is a subtag, but it is scoped inside the subtag module to emphasize what it is and is not.

The naming of a potential language subtag formatter shouldn't impact the name here.

This formatter does what most regular people expect for a type named LanguageDisplayName.

Both LanguageIdentifier and Locale are what most regular people expect a "language" to be. But we should be exact with terminology.

Not an argument, ECMA-402 has a lot of differing naming decisions

In the design doc, I listed LanguageDisplayName as being the thing that formats a LanguageIdentifier.

I was never given this doc for approval, we could have had the discussion there, so we're having it now.

Member Author

The naming of a potential language subtag formatter shouldn't impact the name here.

I raised that point to say that there is nothing else competing for the name LanguageDisplayName and nothing else that I would expect a user to assume when they see a type named LanguageDisplayName.

Not an argument, ECMA-402 has a lot of differing naming decisions

ECMA-402 naming decisions are always an input; we just often have other inputs that are stronger.


In an effort to unblock progress, I'll ask @sffc-bot to change this to LanguageIdentifierDisplayName (including in README.md), erring on the side of being more precise, and open a follow-up issue to bikeshed.

@sffc-bot

Renamed to LanguageIdentifierDisplayName in 58dce83c6d.

Comment thread components/experimental/src/displaynames/single/language.rs Outdated
Comment thread components/experimental/src/displaynames/single/language.rs Outdated
Comment thread components/experimental/src/displaynames/single/language.rs Outdated
Comment thread components/experimental/src/displaynames/single/language.rs Outdated
Comment thread components/experimental/src/displaynames/single/language.rs Outdated
&locale,
),
..Default::default()
})?

Member

please add a TODO that this is the location where we'd fall back to the code

@sffc-bot

Fixed. Added TODO(#8100) pointing to the follow-up fallback issue.

Member

the "location where we'd fall back to the code" needs to be created with allow_identifier_not_found first, like you do for variants below

@sffc-bot

Refactored base language load to use allow_identifier_not_found in 06dca7fa81.
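The shape of this change can be illustrated with a small, standard-library-only analogue. The names below (`LoadError`, `AllowNotFound`, `load_name`, `name_or_code`) are hypothetical stand-ins and not the icu_provider API; the sketch only shows how turning "identifier not found" into `Ok(None)` exposes a single place where a fallback could be added.

```rust
/// Hypothetical error type with a dedicated "not found" case.
#[derive(Debug, PartialEq)]
pub enum LoadError {
    IdentifierNotFound,
    Other(&'static str),
}

/// Hypothetical extension trait: a missing identifier becomes `Ok(None)`,
/// while every other error still propagates.
pub trait AllowNotFound<T> {
    fn allow_not_found(self) -> Result<Option<T>, LoadError>;
}

impl<T> AllowNotFound<T> for Result<T, LoadError> {
    fn allow_not_found(self) -> Result<Option<T>, LoadError> {
        match self {
            Ok(value) => Ok(Some(value)),
            Err(LoadError::IdentifierNotFound) => Ok(None),
            Err(other) => Err(other),
        }
    }
}

/// Toy loader standing in for a data provider.
pub fn load_name(code: &str) -> Result<&'static str, LoadError> {
    match code {
        "de" => Ok("German"),
        "" => Err(LoadError::Other("empty identifier")),
        _ => Err(LoadError::IdentifierNotFound),
    }
}

/// Speculative load with the fallback location made explicit.
pub fn name_or_code(code: &str) -> Result<String, LoadError> {
    match load_name(code).allow_not_found()? {
        Some(name) => Ok(name.to_string()),
        // This arm is the one place where a fallback to the code lives.
        None => Ok(code.to_string()),
    }
}
```

In this sketch the `None` arm falls back to the code, whereas the PR at this stage still converts it back into an error with a TODO.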

Comment thread components/experimental/src/displaynames/single/language.rs Outdated
- Optimize dialect resolution key formatting using TinyAsciiStr concat (zero allocation)
- Optimize qualifier list formatting using a loop with the extracted separator (zero allocation)
- Use allow_identifier_not_found for speculative provider loads
- Mutate subject langid to simplify dialect consumption logic
- Optimize variant payloads using DataPayloadOr (zero allocation)
- Rename locale to formatting_locale and locale_id to subject
- Link fallback-to-code behavior to follow-up issue unicode-org#8100 in README.md and add TODOs

Co-authored-by: Gemini <176961590+gemini-code-assist[bot]@users.noreply.github.com>

@sffc-bot sffc-bot left a comment

🤖 Terse replies posted on behalf of @sffc for all fixed comments. All feedback has been fully addressed in the latest push, and Clippy/tests are fully passing!


@sffc-bot sffc-bot left a comment

🤖 Additional replies regarding parameter naming and helper thoughts.

Comment thread components/experimental/src/displaynames/single/language.rs Outdated
Comment thread components/experimental/src/displaynames/single/language.rs
@sffc sffc requested a review from robertbastian June 18, 2026 01:43

let lang_str = subject.language.to_tinystr();
let script_str = script.to_tinystr();
let region_str = region.to_tinystr();
let hyphen = tinystr::tinystr!(1, "-");

Member

make this a const

Member

I feel like @sffc would have definitely done this, and @sffc-bot is producing extra work for me

Member Author

you're right. I'll walk through the code in more detail (and maybe rewrite parts of it) before sending it for review again. sorry

@sffc-bot

Done in 56502210f0.

Comment on lines +97 to +100
let attr1: tinystr::TinyAsciiStr<16> = lang_str.concat(hyphen);
let attr2: tinystr::TinyAsciiStr<16> = attr1.concat(script_str);
let attr3: tinystr::TinyAsciiStr<16> = attr2.concat(hyphen);
let attr: tinystr::TinyAsciiStr<16> = attr3.concat(region_str);

Member

if things are hard to name, don't name them

Suggested change
- let attr1: tinystr::TinyAsciiStr<16> = lang_str.concat(hyphen);
- let attr2: tinystr::TinyAsciiStr<16> = attr1.concat(script_str);
- let attr3: tinystr::TinyAsciiStr<16> = attr2.concat(hyphen);
- let attr: tinystr::TinyAsciiStr<16> = attr3.concat(region_str);
+ let attr = lang_str.concat::<16>(hyphen).concat::<16>(script_str).concat::<16>(hyphen).concat::<16>(region_str);

@sffc-bot

Implemented using chained turbofish in 56502210f0.

Comment on lines 102 to 103
DataMarkerAttributes::try_from_str(attr.as_str())
.map_err(|_| DataError::custom("Invalid dialect attr"))?,

Member

issue: this code should be in an infallible LocaleNamesLanguageMediumV1::make_attributes(Language, Script, Region)

Member Author

omg, why did the AI fail at refactoring this when it did all the others? It is usually good at refactors. I'm trusting it less and less. 🤦‍♂️

@sffc-bot

Implemented infallible make_attributes helper in 56502210f0.
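Why such a helper can be infallible is easiest to see in a simplified, standard-library-only sketch. The `AttrKey` type and this `make_attributes` signature are hypothetical stand-ins (the real helper works with TinyAsciiStr and DataMarkerAttributes); inputs are modeled as NUL-padded ASCII byte arrays and are assumed to contain only ASCII.

```rust
/// Hypothetical fixed-capacity ASCII key, standing in for `TinyAsciiStr<16>`.
#[derive(Clone, Copy, Debug)]
pub struct AttrKey {
    buf: [u8; 16],
    len: usize,
}

impl AttrKey {
    pub fn as_str(&self) -> &str {
        // Inputs are assumed to be ASCII, so this conversion is not expected
        // to fail; an empty string is returned if that assumption is broken.
        std::str::from_utf8(&self.buf[..self.len]).unwrap_or("")
    }
}

/// Builds "language-script-region" entirely on the stack.
///
/// The input types bound the total length (3 + 1 + 4 + 1 + 3 = 12 bytes),
/// which always fits in the 16-byte buffer, so no error path is needed.
/// Shorter subtags are padded with NUL bytes, which are skipped.
pub fn make_attributes(language: [u8; 3], script: [u8; 4], region: [u8; 3]) -> AttrKey {
    let mut key = AttrKey { buf: [0; 16], len: 0 };
    let parts: [&[u8]; 5] = [&language, b"-", &script, b"-", &region];
    for part in parts {
        for &byte in part.iter().filter(|&&b| b != 0) {
            key.buf[key.len] = byte;
            key.len += 1;
        }
    }
    key
}
```

Because the length bound comes from the argument types rather than a runtime check, callers never see a `Result`, which is the property requested in the review comment above.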

.load(DataRequest {
id: DataIdentifierBorrowed::for_marker_attributes_and_locale(
- DataMarkerAttributes::try_from_str(locale_id.language.as_str())
+ DataMarkerAttributes::try_from_str(subject.language.as_str())

Member

make_attributes

@sffc-bot

Used make_attributes here too in 56502210f0.

metadata.silent = true;
match provider.load(DataRequest { id, metadata }) {
Ok(response) => DataPayloadOr::from_payload(response.payload),
Err(DataError {

Member

allow_identifier_not_found

@sffc-bot

Refactored to use allow_identifier_not_found in 9704a84db2.

}
}

let variant_payloads = if loaded_variants.len() == 1 {

Member

at this point you have already allocated a vec, there's little point in deallocating it now

@sffc-bot

Optimized variant loading to be completely zero-allocation for 0 and 1 variant cases in bf42403202.
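A rough sketch of the zero/one/many approach, using only the standard library, is shown below. `VariantNames`, `load_variant_names`, and `toy_variant_name` are hypothetical names with toy data; the actual code uses DataPayloadOr and provider loads instead.

```rust
/// Hypothetical container that avoids the heap for zero or one variant.
#[derive(Debug, PartialEq)]
pub enum VariantNames {
    Zero,
    One(&'static str),
    Many(Vec<&'static str>),
}

impl VariantNames {
    /// Uniform slice view, so formatting code does not care about the case.
    pub fn as_slice(&self) -> &[&'static str] {
        match self {
            VariantNames::Zero => &[],
            VariantNames::One(name) => std::slice::from_ref(name),
            VariantNames::Many(names) => names.as_slice(),
        }
    }
}

/// Decide on the representation *before* loading, based on the number of
/// variants, so a `Vec` is only created when there are two or more.
pub fn load_variant_names(
    variants: &[&str],
    lookup: impl Fn(&str) -> &'static str,
) -> VariantNames {
    match variants {
        [] => VariantNames::Zero,
        [only] => VariantNames::One(lookup(*only)),
        many => VariantNames::Many(many.iter().map(|v| lookup(*v)).collect()),
    }
}

/// Toy lookup table; the names are placeholders, not CLDR data.
pub fn toy_variant_name(variant: &str) -> &'static str {
    match variant {
        "posix" => "Toy Posix Name",
        "fonipa" => "Toy Phonetic Name",
        _ => "Unknown",
    }
}
```

Matching on the input length up front addresses the review point directly: nothing is allocated and then thrown away, because the `Vec` is never built in the common cases.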

Comment on lines +319 to +320
One(&'a str),
Slice(&'a [DataPayload<LocaleNamesVariantMediumV1>]),

Member

these are two different payloads, maybe be explicit about that. One(&'a str) borrows from data, which is 'static in the compiled data case, but Slice(&'a ...) borrows from the owned name

@sffc-bot

Added second lifetime to BorrowedVariants and LanguageIdentifierDisplayName in 7983114639.

Member Author

@sffc-bot you did a very bad job at this change. Your second lifetime is meaningless because it is always bound to the same lifetime as the first. I will fix it for you.

assert_writeable_eq!(lang_name, "Traditional Chinese (Hong Kong SAR China)");
}

#[cfg(any())]

Member

nit: disable tests with #[ignore]

@sffc-bot

Replaced with #[ignore] in bc254e1ca7.
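The practical difference between the two attributes can be shown with a small sketch. The function `menu_name`, its return values, and both test names are hypothetical and exist only to illustrate the behavior.

```rust
/// Hypothetical function under test, with a toy data table.
pub fn menu_name(code: &str) -> Option<&'static str> {
    match code {
        "zh-Hant" => Some("Toy Menu Name"),
        _ => None,
    }
}

// Compiled and type-checked in every test build, but skipped at runtime
// unless the tests are run with `cargo test -- --ignored`.
#[test]
#[ignore]
fn menu_name_not_ready_yet() {
    assert_eq!(menu_name("en"), Some("English"));
}

// `cfg(any())` is always false, so this item is removed before name
// resolution and type checking. The call to a function that does not
// exist goes unnoticed, which is how such a test can silently rot.
#[cfg(any())]
#[test]
fn silently_rotting() {
    assert_eq!(removed_function("en"), 42);
}
```

An `#[ignore]`d test therefore keeps breaking the build when the API it uses changes, while a `#[cfg(any())]` test only has to remain syntactically valid.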

sffc and others added 3 commits June 18, 2026 17:28
Rename LanguageDisplayName and LanguageDisplayNameOwned to LanguageIdentifierDisplayName and LanguageIdentifierDisplayNameOwned respectively across the codebase (including code, docs, and tests) for greater precision as requested.

Co-authored-by: Gemini <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Implement `LocaleNamesLanguageMediumV1::make_attributes` as an infallible helper in provider.rs to format subtag combinations onto the stack with zero heap allocations using chained TinyAsciiStr concat calls. Refactor language.rs to delegate all key construction to this helper, significantly simplifying and drying up the constructor.

Co-authored-by: Gemini <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Refactor script and region display name loading in the constructor to use `allow_identifier_not_found()?` instead of manual matching on `DataErrorKind::IdentifierNotFound`, ensuring consistency across all speculative data loads.

Co-authored-by: Gemini <176961590+gemini-code-assist[bot]@users.noreply.github.com>
sffc and others added 5 commits June 18, 2026 17:31
Refactor base language loading in the fallback path to use `allow_identifier_not_found()?`, returning an `Option` instead of failing immediately. Temporarily convert a `None` result back to a `DataError` with a `TODO(unicode-org#8100)` comment, isolating the exact location where the future fallback-to-code logic will live.

Co-authored-by: Gemini <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Optimize Step 4 (variant display name loading) in the constructor by matching on `subject.variants.len()`. For subjects with 0 or 1 variants, the display name is loaded directly into `DataPayloadOr` without ever allocating a heap `Vec`. A `Vec` is only allocated for the rare case where a subject has 2 or more variants.

Co-authored-by: Gemini <176961590+gemini-code-assist[bot]@users.noreply.github.com>
…ayName

Refactor the internal `BorrowedVariants` enum and the public `LanguageIdentifierDisplayName` struct to carry two separate lifetimes: `'a` for the static translation data (`One(&'a str)`) and `'b` for the local owned struct borrow (`Slice(&'b [DataPayload])`). This explicitly represents the different borrow sources in the type system while keeping them unified in `as_borrowed()` via Rust's lifetime elision.

Co-authored-by: Gemini <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Replace `#[cfg(any())]` with `#[ignore]` on `test_single_language_display_name_menu`. This ensures the test is always compiled and type-checked during development, preventing it from silently breaking when APIs change, while still skipping it at runtime during normal test runs.

Co-authored-by: Gemini <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Replace all `.unwrap()` calls on `DataMarkerAttributes::try_from_str(...)` with `.map_err(|_| DataError::custom(...))?` to satisfy the crate's `#![deny(clippy::unwrap_used)]` lint and ensure safe error propagation.

Co-authored-by: Gemini <176961590+gemini-code-assist[bot]@users.noreply.github.com>

@sffc-bot sffc-bot left a comment

🤖 Terse replies with commit links posted on behalf of @sffc. All new feedback has been fully addressed in a clean sequence of 8 atomic commits. Tests and Clippy are fully passing!

sffc and others added 7 commits June 18, 2026 18:32
Resolve conflicts in:
- components/experimental/src/displaynames/provider.rs: Combine make_attributes helpers.
- components/experimental/src/displaynames/single/mod.rs: Adopt Robert's cleanups and macros while preserving single language formatter.
- components/experimental/src/displaynames/single/region.rs, script.rs, variant.rs: Adopt cleanups from main, keeping payload fields crate-visible.

Co-authored-by: Gemini <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Move the safety explanations regarding infallible `DataMarkerAttributes` construction from the doc comments to inline comments inside the function bodies for the 5 newly merged `make_attributes` helpers in provider.rs, matching the established style.

Co-authored-by: Gemini <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Add the `zh-Hant-HK` -> `Traditional Chinese (Hong Kong SAR China)` test case to the main table-driven `test_concatenate` integration test, ensuring it is thoroughly covered by both the multi and single display name formatters. Delete the redundant, duplicate one-off `test_single_language_display_name` and the unimplemented `test_single_language_display_name_menu` tests to keep the integration test suite clean and focused.

Co-authored-by: Gemini <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@sffc sffc requested a review from robertbastian June 19, 2026 03:47
sffc commented Jun 19, 2026

Member Author

ok I think I got it all, let's merge this. I'll keep working on TryWriteable in separate PRs.


4 participants