Fix implementation of locale region override#7335
Fix implementation of locale region override#7335sffc wants to merge 9 commits intounicode-org:mainfrom
Conversation
f6f621a to
243a4ba
Compare
| script: self.script, | ||
| region: match (self.region, self.ue_region) { | ||
| (Some(_), Some(r)) if region_priority => Some(r), | ||
| (_, Some(r)) if region_priority => Some(r), |
There was a problem hiding this comment.
@zbraniecki: was there a reason you were matching Some(_) on the regular region subtag here?
There was a problem hiding this comment.
No, I think it was a bug! Thanks for catching it!
| /// ``` | ||
| /// | ||
| /// [UTS #35]: https://www.unicode.org/reports/tr35/#RegionOverride | ||
| pub fn try_from_extension_value(value: &Value) -> Result<Self, RegionExtensionError> { |
There was a problem hiding this comment.
@zbraniecki wdyt of this new API. the signature and where it lives?
There was a problem hiding this comment.
I'd prefer if this lived on Value, or Subtag, or was internal to the preference parser. Something like parse_unicode_subdivision_id(self) -> Option<(Region, TinyAsciiStr<4>)>.
I also think it should be private for now.
| match Region::try_from_extension_value(v) { | ||
| Ok(r) => Some(r), | ||
| Err(_e) => { | ||
| // How should we handle the error? Ignore? |
| /// ``` | ||
| /// | ||
| /// [UTS #35]: https://www.unicode.org/reports/tr35/#RegionOverride | ||
| pub fn try_from_extension_value(value: &Value) -> Result<Self, RegionExtensionError> { |
| TestCase { | ||
| input: "en-u-rg-gbzzzz-sd-ustx", | ||
| language_priority: "en-u-sd-ustx", | ||
| region_priority: "en-GB-u-sd-ustx", // does this make sense? |
There was a problem hiding this comment.
about as much sense as the input
please don't submit these comments
There was a problem hiding this comment.
actually this doesn't make sense, as the rg override overrides the subdivision as well
I think we might want to remove the subdivision entirely for language priority, as that should only be used for region-specific data?
There was a problem hiding this comment.
Handled in the context of #7337 and its follow-ups
| use size_test_macro::size_test; | ||
|
|
||
| #[cfg(test)] | ||
| use icu_locale_core::locale; |
There was a problem hiding this comment.
nit: move this into test_numbering_resolution_fallback
| /// | ||
| /// ⚠️ This constructor ignores fallback priority and Unicode extension keywords, | ||
| /// and it reads only the first variant. | ||
| pub fn from_content_language_identifier(langid: &LanguageIdentifier) -> Self { |
There was a problem hiding this comment.
I would prefer not to make the content/user locale disctinction on the DataLocale level (which I still consider to be an icu_provider type). Each component decides for itself how to map locales into DataLocales, and what the semantics of its input locales are.
If you really want to name these methods (maybe something else), then we should deprecate the From implementations.
There was a problem hiding this comment.
I wasn't able to figure out how to deprecate the From impls.
| /// ⚠️ This constructor ignores fallback priority and Unicode extension keywords, | ||
| /// and it reads only the first variant. | ||
| pub fn from_content_language_identifier(langid: &LanguageIdentifier) -> Self { | ||
| Self { |
There was a problem hiding this comment.
this duplicates code in From<&LanguageIdentifier>
There was a problem hiding this comment.
observation: unrelated cleanup
| /// ``` | ||
| /// | ||
| /// [UTS #35]: https://www.unicode.org/reports/tr35/#RegionOverride | ||
| pub fn try_from_extension_value(value: &Value) -> Result<Self, RegionExtensionError> { |
There was a problem hiding this comment.
I'd prefer if this lived on Value, or Subtag, or was internal to the preference parser. Something like parse_unicode_subdivision_id(self) -> Option<(Region, TinyAsciiStr<4>)>.
I also think it should be private for now.
| /// | ||
| /// [UTS #35]: https://www.unicode.org/reports/tr35/#RegionOverride | ||
| #[displaydoc("Invalid region or subdivision extension value")] | ||
| Invalid, |
There was a problem hiding this comment.
Can we just use ParseError::InvalidExtension here? We don't have per-type or per method parsing errors
There was a problem hiding this comment.
It's a different level than a syntax error. I was going to use ParseError but thought you would give the opposite feedback that we should instead use a narrow error type. If you confirm that you prefer ParseError then I'll switch over.
There was a problem hiding this comment.
This is obsolete since we found the existing function that returns ParseError
| let payload_locale_override = if let Some(locale) = options.content_locale { | ||
| let locale = DataLocale::from(locale); | ||
| let locale = DataLocale::from_content_language_identifier(locale); |
There was a problem hiding this comment.
I think renaming the identifier to content_language has a similar readability effect as renaming from
| let payload_locale_override = if let Some(locale) = options.content_locale { | |
| let locale = DataLocale::from(locale); | |
| let locale = DataLocale::from_content_language_identifier(locale); | |
| let payload_locale_override = if let Some(content_language) = options.content_locale { | |
| let locale = DataLocale::from(content_language); |
| #[diplomat::opaque] | ||
| /// An ICU4X Locale, capable of representing strings like `"en-US"`. | ||
| /// | ||
| /// In `icu_capi`, this type also covers the uses of `DataLocale`. |
There was a problem hiding this comment.
NPM/Dart/MVN users won't know what icu_capi is
| /// In `icu_capi`, this type also covers the uses of `DataLocale`. | |
| /// This type also covers the Rust type `DataLocale`. |
| /// ``` | ||
| /// use icu_locale_core::{locale, Locale}; | ||
| /// use icu_provider::DataLocale; | ||
| /// use icu_locale_core::locale; |
There was a problem hiding this comment.
why split these?
why change the DataLocale import? I like importing it from icu_provider
| /// // For en-US with a Germany preference override | ||
| /// let info = WeekInformation::try_new(locale!("en-US-u-rg-dezzzz").into()).unwrap(); | ||
| /// assert_eq!(info.first_weekday, Weekday::Monday); | ||
| /// ``` |
There was a problem hiding this comment.
please add an example of the fw override as well, and how it interacts with rg.
robertbastian
left a comment
There was a problem hiding this comment.
We already have code to parse unicode_subdivision_id.
https://docs.rs/icu/latest/icu/locale/extensions/unicode/struct.SubdivisionId.html
Simpler version of #7335
We had partial claims for supporting this, but there were zero tests, and it turns out our region override code was not in working condition. I fixed the bugs and added test coverage.
Fixes #4413