Implement Extended Currency DataModel #4706
Merged
57 commits
7202666
Merge branch 'main' of github.com:unicode-org/icu4x
younies 1bfbcdd
Merge branch 'unicode-org:main' into main
younies f03df49
first step
younies f48160a
Change the file name
younies ab6ba49
Extracted the needed data
younies 7205403
Design the data skeleton
younies b231bf4
Fix datagen model
younies 6cd6f90
Merge branch 'main' of github.com:unicode-org/icu4x into add-long-cur…
younies 16c04f8
move `currencies.rs`
younies f29ff4f
Fix the code and organize
younies 57fdaf0
Merge branch 'main' of github.com:unicode-org/icu4x into add-long-cur…
younies c90f58b
fix build
younies df0a3b3
fix skeleton
younies 4270963
Merge branch 'main' of github.com:unicode-org/icu4x into add-long-cur…
younies e83f31d
fix merge
younies fb09cf3
skeleton end to end
younies 5218c53
Fix skeleton
younies e3b640b
prepare for the aux key
younies 9bca26e
Merge branch 'main' into add-long-currency-data
younies 8af221c
Merge branch 'main' of github.com:unicode-org/icu4x into add-long-cur…
younies 11df80c
add ULE file
younies 6ff016f
add ule
younies e5dc375
fixo
younies 6ca63a3
save
younies 732fa5e
Fix library
younies 5ad9e83
Merge branch 'main' into add-long-currency-data
younies 97466a1
Merge branch 'add-long-currency-data' of github.com:younies/icu4x int…
younies 05e26bf
Merge branch 'main' of github.com:unicode-org/icu4x into add-long-cur…
younies f30ef6f
Fix merge
younies f26cff9
Merge branch 'main' of github.com:unicode-org/icu4x into add-long-cur…
younies f7b173e
Implement `IterableDataProvider`
younies eca009d
Add currency code as auxiliary code
younies c5cf174
Merge branch 'main' of github.com:unicode-org/icu4x into add-long-cur…
younies 4909725
Fix merge
younies affa7e1
remove unneeded file
younies 447e2a7
Add the data
younies 882b704
update the data
younies 8d4c34f
remove unneeded code
younies f13455f
add config serde borrow
younies 01625e3
fix clippy
younies 3aa476e
fix fmt
younies 761686c
fix comment
younies d180000
Fix ZeroCapy
younies 12145f9
Fix names and comments
younies 41facd5
Fix clippy
younies d733efb
Merge branch 'main' of github.com:unicode-org/icu4x into add-long-cur…
younies bf84dc9
remove unused code
younies a579caf
Fix resource
younies f0a7269
fix
younies da175fc
Merge branch 'main' of github.com:unicode-org/icu4x into add-long-cur…
younies 6d1372e
fix merge
younies c58248e
Merge branch 'main' of github.com:unicode-org/icu4x into add-long-cur…
younies d6cad90
Add test cases
younies e5f787d
Merge branch 'add-long-currency-data' of github.com:younies/icu4x int…
younies 4a417c0
Remove temporarily some currencies
younies 8611228
Add a TODO
younies aeb1d11
Merge branch 'main' into add-long-currency-data
younies
85 changes: 85 additions & 0 deletions
components/experimental/src/dimension/provider/extended_currency.rs
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,85 @@ | ||
// This file is part of ICU4X. For terms of use, please see the file | ||
// called LICENSE at the top level of the ICU4X source tree | ||
// (online at: https://github.com/unicode-org/icu4x/blob/main/LICENSE ). | ||
|
||
// Provider structs must be stable | ||
#![allow(clippy::exhaustive_structs, clippy::exhaustive_enums)] | ||
|
||
//! Data provider struct definitions for this ICU4X component. | ||
//! | ||
//! Read more about data providers: [`icu_provider`] | ||
|
||
use icu_provider::prelude::*; | ||
use zerovec::ZeroMap; | ||
|
||
#[cfg(feature = "compiled_data")] | ||
/// Baked data | ||
/// | ||
/// <div class="stab unstable"> | ||
/// 🚧 This code is considered unstable; it may change at any time, in breaking or non-breaking ways, | ||
/// including in SemVer minor releases. In particular, the `DataProvider` implementations are only | ||
/// guaranteed to match with this version's `*_unstable` providers. Use with caution. | ||
/// </div> | ||
pub use crate::provider::Baked; | ||
|
||
/// Currency Extended V1 data struct. | ||
#[icu_provider::data_struct(marker(CurrencyExtendedDataV1Marker, "currency/extended@1"))] | ||
#[derive(Debug, Clone, Default, PartialEq)] | ||
#[cfg_attr(feature = "serde", derive(serde::Deserialize))] | ||
#[cfg_attr( | ||
feature = "datagen", | ||
derive(serde::Serialize, databake::Bake), | ||
databake(path = icu_experimental::dimension::provider::extended_currency) | ||
)] | ||
#[yoke(prove_covariance_manually)] | ||
pub struct CurrencyExtendedDataV1<'data> { | ||
// TODO: Implement currency pattern selection logic to choose between standard or standard next to text pattern. | ||
/// Contains the localized display names for a currency based on plural rules. | ||
/// For instance, in the "en" locale for the "USD" currency: | ||
/// - "US dollar" when count is `one`, | ||
/// - "US dollars" when count is `other`, | ||
/// ... etc. | ||
#[cfg_attr(feature = "serde", serde(borrow))] | ||
pub display_names: ZeroMap<'data, Count, str>, | ||
} | ||
|
||
/// A CLDR plural keyword, or the explicit value 1. | ||
/// See <https://www.unicode.org/reports/tr35/tr35-numbers.html#Language_Plural_Rules>. | ||
#[zerovec::make_ule(CountULE)] | ||
#[zerovec::derive(Debug)] | ||
#[derive(Copy, Clone, PartialOrd, Ord, PartialEq, Eq, Debug)] | ||
#[cfg_attr(feature = "serde", derive(serde::Deserialize))] | ||
#[cfg_attr( | ||
feature = "datagen", | ||
derive(serde::Serialize, databake::Bake), | ||
databake(path = icu_experimental::dimension::provider::extended_currency) | ||
)] | ||
#[repr(u8)] | ||
pub enum Count { | ||
/// The CLDR keyword `zero`. | ||
Zero = 0, | ||
/// The CLDR keyword `one`. | ||
One = 1, | ||
/// The CLDR keyword `two`. | ||
Two = 2, | ||
/// The CLDR keyword `few`. | ||
Few = 3, | ||
/// The CLDR keyword `many`. | ||
Many = 4, | ||
/// The CLDR keyword `other`. | ||
Other = 5, | ||
// TODO(younies): revise this for currency | ||
/// The explicit 1 case, see <https://www.unicode.org/reports/tr35/tr35-numbers.html#Explicit_0_1_rules>. | ||
Explicit1 = 6, | ||
// NOTE(egg): No explicit 0, because the compact decimal pattern selection | ||
// algorithm does not allow such a thing to arise. | ||
// TODO(younies): implement this case. | ||
/// The default case. | ||
/// NOTE: | ||
/// Used as the default when there is no match. | ||
/// This is also used to replace the most frequently occurring case in all plural rules. | ||
Default = 7, | ||
|
||
/// The display name for the currency. | ||
DisplayName = 8, | ||
} |
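The `display_names` map above is keyed by plural category, so a consumer has to decide what to do when a category is absent for a locale. The following is a minimal sketch of one possible lookup, using only the standard library. The `select_display_name` and `en_usd` helpers and the fallback order (requested category, then `Other`, then `DisplayName`) are assumptions for illustration, not part of ICU4X or of this PR; the enum is a trimmed copy of `Count` with a plain `BTreeMap` standing in for `ZeroMap`.

```rust
use std::collections::BTreeMap;

/// Trimmed, std-only copy of the `Count` enum from the diff
/// (explicit discriminants and the `Explicit1`/`Default` variants omitted).
#[derive(Copy, Clone, PartialOrd, Ord, PartialEq, Eq, Debug)]
pub enum Count {
    Zero,
    One,
    Two,
    Few,
    Many,
    Other,
    DisplayName,
}

/// Hypothetical helper, not an ICU4X API: returns the name stored for
/// `count`, falling back to `Other` and then to `DisplayName`.
/// The fallback order is an assumption made for this sketch.
pub fn select_display_name<'a>(
    names: &'a BTreeMap<Count, String>,
    count: Count,
) -> Option<&'a str> {
    names
        .get(&count)
        .or_else(|| names.get(&Count::Other))
        .or_else(|| names.get(&Count::DisplayName))
        .map(String::as_str)
}

/// Hypothetical fixture holding the "en"/"USD" values that this PR's
/// datagen test asserts.
pub fn en_usd() -> BTreeMap<Count, String> {
    BTreeMap::from([
        (Count::One, "US dollar".to_string()),
        (Count::Other, "US dollars".to_string()),
        (Count::DisplayName, "US Dollar".to_string()),
    ])
}
```

With this sketch, a request for `Count::Few` against the English data resolves to the `Other` entry, because English stores names only for `one` and `other`.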
6 changes: 6 additions & 0 deletions
provider/datagen/src/transform/cldr/cldr_serde/currencies/mod.rs
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
// This file is part of ICU4X. For terms of use, please see the file | ||
// called LICENSE at the top level of the ICU4X source tree | ||
// (online at: https://github.com/unicode-org/icu4x/blob/main/LICENSE ). | ||
|
||
pub mod data; | ||
pub mod supplemental; |
File renamed without changes.
172 changes: 172 additions & 0 deletions
provider/datagen/src/transform/cldr/currency/extended.rs
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,172 @@ | ||
// This file is part of ICU4X. For terms of use, please see the file | ||
// called LICENSE at the top level of the ICU4X source tree | ||
// (online at: https://github.com/unicode-org/icu4x/blob/main/LICENSE ). | ||
|
||
use crate::provider::transform::cldr::cldr_serde; | ||
use crate::DatagenProvider; | ||
|
||
use std::borrow::Cow; | ||
use std::collections::BTreeMap; | ||
use std::collections::HashSet; | ||
|
||
use icu_experimental::dimension::provider::extended_currency::Count; | ||
use icu_provider::datagen::IterableDataProvider; | ||
use tinystr::TinyAsciiStr; | ||
|
||
use icu_experimental::dimension::provider::extended_currency::*; | ||
use icu_provider::prelude::*; | ||
use icu_provider::DataProvider; | ||
|
||
impl DataProvider<CurrencyExtendedDataV1Marker> for crate::DatagenProvider { | ||
fn load( | ||
&self, | ||
req: DataRequest, | ||
) -> Result<DataResponse<CurrencyExtendedDataV1Marker>, DataError> { | ||
self.check_req::<CurrencyExtendedDataV1Marker>(req)?; | ||
|
||
let langid = req.locale.get_langid(); | ||
let currencies_resource: &cldr_serde::currencies::data::Resource = | ||
self.cldr()? | ||
.numbers() | ||
.read_and_parse(&langid, "currencies.json")?; | ||
|
||
let aux = req | ||
.marker_attributes | ||
.parse::<TinyAsciiStr<3>>() | ||
.map_err(|_| DataError::custom("failed to parse aux key into tinystr"))?; | ||
let currency = currencies_resource | ||
.main | ||
.value | ||
.numbers | ||
.currencies | ||
.get(&aux.to_unvalidated()) | ||
.ok_or(DataError::custom("No currency associated with the aux key"))?; | ||
|
||
let mut placeholders: BTreeMap<Count, String> = BTreeMap::new(); | ||
|
||
fn add_placeholder( | ||
placeholders: &mut BTreeMap<Count, String>, | ||
key: Count, | ||
value: Option<String>, | ||
) { | ||
if let Some(val) = value { | ||
placeholders.insert(key, val); | ||
} | ||
} | ||
|
||
add_placeholder(&mut placeholders, Count::Zero, currency.zero.clone()); | ||
add_placeholder(&mut placeholders, Count::One, currency.one.clone()); | ||
add_placeholder(&mut placeholders, Count::Two, currency.two.clone()); | ||
add_placeholder(&mut placeholders, Count::Few, currency.few.clone()); | ||
add_placeholder(&mut placeholders, Count::Many, currency.many.clone()); | ||
add_placeholder(&mut placeholders, Count::Other, currency.other.clone()); | ||
add_placeholder( | ||
&mut placeholders, | ||
Count::DisplayName, | ||
currency.display_name.clone(), | ||
); | ||
|
||
let data = CurrencyExtendedDataV1 { | ||
display_names: placeholders | ||
.into_iter() | ||
.map(|(k, v)| (k, Cow::Owned(v))) | ||
.collect(), | ||
}; | ||
|
||
Ok(DataResponse { | ||
metadata: Default::default(), | ||
payload: Some(DataPayload::from_owned(data)), | ||
}) | ||
} | ||
} | ||
|
||
impl IterableDataProvider<CurrencyExtendedDataV1Marker> for DatagenProvider { | ||
fn supported_requests(&self) -> Result<HashSet<(DataLocale, DataMarkerAttributes)>, DataError> { | ||
// TODO: This is a temporary implementation until we have a better way to handle large number of json files. | ||
let currencies_to_support: HashSet<_> = | ||
["USD", "CAD", "EUR", "GBP", "EGP"].into_iter().collect(); | ||
|
||
let mut result = HashSet::new(); | ||
let numbers = self.cldr()?.numbers(); | ||
let langids = numbers.list_langs()?; | ||
for langid in langids { | ||
let currencies_resource: &cldr_serde::currencies::data::Resource = self | ||
.cldr()? | ||
.numbers() | ||
.read_and_parse(&langid, "currencies.json")?; | ||
|
||
let currencies = &currencies_resource.main.value.numbers.currencies; | ||
for key in currencies.keys() { | ||
let key_string = key | ||
.try_into_tinystr() | ||
.map_err(|_| DataError::custom("failed to parse currency code into tinystr"))? | ||
.parse::<String>() | ||
.map_err(|_| DataError::custom("failed to parse currency code into string"))?; | ||
if !currencies_to_support.contains(key_string.as_str()) { | ||
continue; | ||
} | ||
|
||
let key = key | ||
.try_into_tinystr() | ||
.map_err(|_| DataError::custom("failed to parse currency code into tinystr"))?; | ||
|
||
let attributes = DataMarkerAttributes::from_tinystr(key.resize()); | ||
result.insert((DataLocale::from(&langid), attributes)); | ||
} | ||
} | ||
|
||
Ok(result) | ||
} | ||
} | ||
|
||
#[test] | ||
fn test_basic() { | ||
use icu_locale_core::langid; | ||
|
||
let provider = DatagenProvider::new_testing(); | ||
let en: DataPayload<CurrencyExtendedDataV1Marker> = provider | ||
.load(DataRequest { | ||
locale: &langid!("en").into(), | ||
marker_attributes: &"USD".parse().unwrap(), | ||
..Default::default() | ||
}) | ||
.unwrap() | ||
.take_payload() | ||
.unwrap(); | ||
let display_names = en.get().to_owned().display_names; | ||
assert_eq!(display_names.get(&Count::Zero), None); | ||
assert_eq!(display_names.get(&Count::One).unwrap(), "US dollar"); | ||
assert_eq!(display_names.get(&Count::Two), None); | ||
assert_eq!(display_names.get(&Count::Few), None); | ||
assert_eq!(display_names.get(&Count::Many), None); | ||
assert_eq!(display_names.get(&Count::Other).unwrap(), "US dollars"); | ||
assert_eq!(display_names.get(&Count::DisplayName).unwrap(), "US Dollar"); | ||
|
||
let fr: DataPayload<CurrencyExtendedDataV1Marker> = provider | ||
.load(DataRequest { | ||
locale: &langid!("fr").into(), | ||
marker_attributes: &"USD".parse().unwrap(), | ||
..Default::default() | ||
}) | ||
.unwrap() | ||
.take_payload() | ||
.unwrap(); | ||
|
||
let display_names = fr.get().to_owned().display_names; | ||
assert_eq!(display_names.get(&Count::Zero), None); | ||
assert_eq!( | ||
display_names.get(&Count::One).unwrap(), | ||
"dollar des États-Unis" | ||
); | ||
assert_eq!(display_names.get(&Count::Two), None); | ||
assert_eq!(display_names.get(&Count::Few), None); | ||
assert_eq!(display_names.get(&Count::Many), None); | ||
assert_eq!( | ||
display_names.get(&Count::Other).unwrap(), | ||
"dollars des États-Unis" | ||
); | ||
assert_eq!( | ||
display_names.get(&Count::DisplayName).unwrap(), | ||
"dollar des États-Unis" | ||
); | ||
} |
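The `supported_requests` implementation above walks every locale, reads its currency codes, and keeps only the codes on a temporary allowlist, yielding one (locale, currency) pair per match. A minimal std-only sketch of that filtering step follows. The `supported_pairs` and `is_wellformed_code` names, the plain `String` pairs standing in for `(DataLocale, DataMarkerAttributes)`, and the uppercase-only validation are assumptions for illustration; the PR itself parses codes into a `TinyAsciiStr<3>`.

```rust
use std::collections::BTreeSet;

/// The temporary allowlist used in the PR's `supported_requests`.
const CURRENCIES_TO_SUPPORT: [&str; 5] = ["USD", "CAD", "EUR", "GBP", "EGP"];

/// Hypothetical validation: exactly three ASCII uppercase letters.
/// This is stricter than the `TinyAsciiStr<3>` parse in the PR and is an
/// assumption of this sketch.
pub fn is_wellformed_code(code: &str) -> bool {
    code.len() == 3 && code.bytes().all(|b| b.is_ascii_uppercase())
}

/// Hypothetical helper: for each (locale, codes) input, keeps the codes
/// that are on the allowlist and returns the resulting (locale, code) pairs.
/// A malformed code aborts the whole listing, mirroring the `?` in the PR.
pub fn supported_pairs(
    locales: &[(&str, &[&str])],
) -> Result<BTreeSet<(String, String)>, String> {
    let mut result = BTreeSet::new();
    for (locale, codes) in locales {
        for code in codes.iter() {
            if !is_wellformed_code(code) {
                return Err(format!("malformed currency code: {code}"));
            }
            // Skip currencies that are not on the allowlist.
            if CURRENCIES_TO_SUPPORT.contains(code) {
                result.insert((locale.to_string(), code.to_string()));
            }
        }
    }
    Ok(result)
}
```

Checking membership directly on the code avoids converting each key twice, which the loop in the diff does when it builds `key_string` and then re-parses `key`.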
33 changes: 33 additions & 0 deletions
provider/datagen/tests/data/baked/macros/currency_extended_v1.rs.data
Issue: As described in the spec, you also need to include `unitPattern` along with the display names. Consider adding it to currency essentials.
The patterns are already in currency essentials. I just need to add the selector in the next PR to choose between `standard` and `standard-alphaNextToNumber`.
I think `unitPattern` is different from `standard` and `standard-alphaNextToNumber`. It is a third pattern.
But you can add it in the next PR. This one is fine.
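To illustrate why the reviewer treats `unitPattern` as a separate pattern: in CLDR it combines an already formatted number with the plural-selected long currency name, for example `{0} {1}`, rather than placing a currency symbol as the `standard` patterns do. The sketch below assumes that `{0}` is the number and `{1}` is the display name; the `apply_unit_pattern` function is hypothetical and is not part of ICU4X or of this PR.

```rust
/// Hypothetical helper: substitutes `{0}` (the formatted number) and `{1}`
/// (the plural-selected currency display name) into a CLDR-style unit
/// pattern in a single pass. Any other text, including a stray `{`, is
/// copied through unchanged.
pub fn apply_unit_pattern(pattern: &str, number: &str, name: &str) -> String {
    let mut out = String::with_capacity(pattern.len() + number.len() + name.len());
    let mut rest = pattern;
    while let Some(pos) = rest.find('{') {
        // Copy the literal text before the brace.
        out.push_str(&rest[..pos]);
        let tail = &rest[pos..];
        if tail.starts_with("{0}") {
            out.push_str(number);
            rest = &tail[3..];
        } else if tail.starts_with("{1}") {
            out.push_str(name);
            rest = &tail[3..];
        } else {
            // Not a recognized placeholder: keep the brace literally.
            out.push('{');
            rest = &tail[1..];
        }
    }
    out.push_str(rest);
    out
}
```

Under these assumptions, applying `{0} {1}` to `2.00` and the `other` name from this PR's English data gives `2.00 US dollars`, which is where the display names added here would be consumed.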