Implement Extended Currency DataModel #4706

Merged
merged 57 commits into from
Jun 11, 2024
Commits
57 commits
7202666
Merge branch 'main' of github.com:unicode-org/icu4x
younies Mar 17, 2024
1bfbcdd
Merge branch 'unicode-org:main' into main
younies Mar 19, 2024
f03df49
first step
younies Mar 19, 2024
f48160a
Change the file name
younies Mar 19, 2024
ab6ba49
Extracted the needed data
younies Mar 19, 2024
7205403
Design the data skeleton
younies Mar 19, 2024
b231bf4
Fix datagen model
younies Mar 20, 2024
6cd6f90
Merge branch 'main' of github.com:unicode-org/icu4x into add-long-cur…
younies Mar 20, 2024
16c04f8
move `currencies.rs`
younies Mar 20, 2024
f29ff4f
Fix the code and organize
younies Mar 20, 2024
57fdaf0
Merge branch 'main' of github.com:unicode-org/icu4x into add-long-cur…
younies Mar 20, 2024
c90f58b
fix build
younies Mar 20, 2024
df0a3b3
fix skeleton
younies Mar 20, 2024
4270963
Merge branch 'main' of github.com:unicode-org/icu4x into add-long-cur…
younies Mar 20, 2024
e83f31d
fix merge
younies Mar 20, 2024
fb09cf3
skeleton end to end
younies Mar 20, 2024
5218c53
Fix skeleton
younies Mar 20, 2024
e3b640b
prepare for the aux key
younies Mar 20, 2024
9bca26e
Merge branch 'main' into add-long-currency-data
younies Mar 22, 2024
8af221c
Merge branch 'main' of github.com:unicode-org/icu4x into add-long-cur…
younies Mar 28, 2024
11df80c
add ULE file
younies Mar 28, 2024
6ff016f
add ule
younies Mar 28, 2024
e5dc375
fixo
younies Mar 28, 2024
6ca63a3
save
younies Mar 28, 2024
732fa5e
Fix library
younies Mar 28, 2024
5ad9e83
Merge branch 'main' into add-long-currency-data
younies Mar 31, 2024
97466a1
Merge branch 'add-long-currency-data' of github.com:younies/icu4x int…
younies Mar 31, 2024
05e26bf
Merge branch 'main' of github.com:unicode-org/icu4x into add-long-cur…
younies Apr 30, 2024
f30ef6f
Fix merge
younies Apr 30, 2024
f26cff9
Merge branch 'main' of github.com:unicode-org/icu4x into add-long-cur…
younies Jun 4, 2024
f7b173e
Implement `IterableDataProvider`
younies Jun 4, 2024
eca009d
Add currency code as auxiliary code
younies Jun 5, 2024
c5cf174
Merge branch 'main' of github.com:unicode-org/icu4x into add-long-cur…
younies Jun 5, 2024
4909725
Fix merge
younies Jun 5, 2024
affa7e1
remove unneeded file
younies Jun 5, 2024
447e2a7
Add the data
younies Jun 5, 2024
882b704
update the data
younies Jun 5, 2024
8d4c34f
remove unneeded code
younies Jun 5, 2024
f13455f
add config serde borrow
younies Jun 5, 2024
01625e3
fix clippy
younies Jun 5, 2024
3aa476e
fix fmt
younies Jun 5, 2024
761686c
fix comment
younies Jun 5, 2024
d180000
Fix ZeroCapy
younies Jun 6, 2024
12145f9
Fix names and comments
younies Jun 6, 2024
41facd5
Fix clippy
younies Jun 6, 2024
d733efb
Merge branch 'main' of github.com:unicode-org/icu4x into add-long-cur…
younies Jun 6, 2024
bf84dc9
remove unused code
younies Jun 6, 2024
a579caf
Fix resource
younies Jun 6, 2024
f0a7269
fix
younies Jun 6, 2024
da175fc
Merge branch 'main' of github.com:unicode-org/icu4x into add-long-cur…
younies Jun 6, 2024
6d1372e
fix merge
younies Jun 6, 2024
c58248e
Merge branch 'main' of github.com:unicode-org/icu4x into add-long-cur…
younies Jun 9, 2024
d6cad90
Add test cases
younies Jun 10, 2024
e5f787d
Merge branch 'add-long-currency-data' of github.com:younies/icu4x int…
younies Jun 10, 2024
4a417c0
Remove temporarily some currencies
younies Jun 10, 2024
8611228
Add a TODO
younies Jun 10, 2024
aeb1d11
Merge branch 'main' into add-long-currency-data
younies Jun 10, 2024
@@ -106,6 +106,7 @@ pub enum PlaceholderValue {
/// The placeholder is the ISO code.
ISO,
}

#[cfg_attr(
feature = "datagen",
derive(serde::Serialize, databake::Bake),
@@ -0,0 +1,85 @@
// This file is part of ICU4X. For terms of use, please see the file
// called LICENSE at the top level of the ICU4X source tree
// (online at: https://github.com/unicode-org/icu4x/blob/main/LICENSE ).

// Provider structs must be stable
#![allow(clippy::exhaustive_structs, clippy::exhaustive_enums)]

//! Data provider struct definitions for this ICU4X component.
//!
//! Read more about data providers: [`icu_provider`]

use icu_provider::prelude::*;
use zerovec::ZeroMap;

#[cfg(feature = "compiled_data")]
/// Baked data
///
/// <div class="stab unstable">
/// 🚧 This code is considered unstable; it may change at any time, in breaking or non-breaking ways,
/// including in SemVer minor releases. In particular, the `DataProvider` implementations are only
/// guaranteed to match with this version's `*_unstable` providers. Use with caution.
/// </div>
pub use crate::provider::Baked;

/// Currency Extended V1 data struct.
#[icu_provider::data_struct(marker(CurrencyExtendedDataV1Marker, "currency/extended@1"))]
#[derive(Debug, Clone, Default, PartialEq)]
#[cfg_attr(feature = "serde", derive(serde::Deserialize))]
#[cfg_attr(
feature = "datagen",
derive(serde::Serialize, databake::Bake),
databake(path = icu_experimental::dimension::provider::extended_currency)
)]
#[yoke(prove_covariance_manually)]
pub struct CurrencyExtendedDataV1<'data> {
// TODO: Implement currency pattern selection logic to choose between standard or standard next to text pattern.
/// Contains the localized display names for a currency based on plural rules.
/// For instance, in the "en" locale for the "USD" currency:
/// - "US dollar" when count is `one`,
/// - "US dollars" when count is `other`,
/// ... etc.
#[cfg_attr(feature = "serde", serde(borrow))]
pub display_names: ZeroMap<'data, Count, str>,

Review comment (Member):
Issue: As described in the spec, you also need to include unitPattern along with the display names. Consider adding it to currency essentials.

Reply (Member Author):
The patterns are already in currency essentials. I just need to add the selector in the next PR to choose the standard or standard-alphaNextToNumber

Reply (Member):
I think unitPattern is different than standard and standard-alphaNextToNumber. It is a third pattern.

But you can add it in the next PR. This one is fine.

}
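
The review thread above separates two things: the plural display names stored in this struct, and CLDR's `unitPattern`, which joins a formatted number with one of those names. A minimal sketch of how the two could fit together is given here, using plain `std` types instead of `ZeroMap`. The names `PluralCount`, `select_display_name`, `apply_unit_pattern` and `sample_en_usd`, the fallback order, and the `{0} {1}` pattern are all assumptions made for illustration; none of them is ICU4X API or part of this PR.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the `Count` keys used in `display_names`.
#[derive(Copy, Clone, PartialEq, Eq, PartialOrd, Ord, Debug)]
pub enum PluralCount {
    One,
    Other,
    DisplayName,
}

/// Picks the name stored for `count`, falling back to `Other` and then to
/// the plain display name. The fallback order is an assumption.
pub fn select_display_name<'a>(
    names: &'a BTreeMap<PluralCount, String>,
    count: PluralCount,
) -> Option<&'a str> {
    names
        .get(&count)
        .or_else(|| names.get(&PluralCount::Other))
        .or_else(|| names.get(&PluralCount::DisplayName))
        .map(String::as_str)
}

/// Substitutes the formatted number for `{0}` and the currency name for `{1}`.
pub fn apply_unit_pattern(pattern: &str, number: &str, name: &str) -> String {
    pattern.replace("{0}", number).replace("{1}", name)
}

/// Sample data mirroring the English USD values asserted in this PR's test.
pub fn sample_en_usd() -> BTreeMap<PluralCount, String> {
    BTreeMap::from([
        (PluralCount::One, "US dollar".to_string()),
        (PluralCount::Other, "US dollars".to_string()),
        (PluralCount::DisplayName, "US Dollar".to_string()),
    ])
}
```

With the sample data, selecting `PluralCount::Other` and applying an assumed `{0} {1}` pattern to the number `5` gives `5 US dollars`. The selector that chooses between the standard patterns is a separate concern and is not sketched here.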

/// A CLDR plural keyword, or the explicit value 1.
/// See <https://www.unicode.org/reports/tr35/tr35-numbers.html#Language_Plural_Rules>.
#[zerovec::make_ule(CountULE)]
#[zerovec::derive(Debug)]
#[derive(Copy, Clone, PartialOrd, Ord, PartialEq, Eq, Debug)]
#[cfg_attr(feature = "serde", derive(serde::Deserialize))]
#[cfg_attr(
feature = "datagen",
derive(serde::Serialize, databake::Bake),
databake(path = icu_experimental::dimension::provider::extended_currency)
)]
#[repr(u8)]
pub enum Count {
/// The CLDR keyword `zero`.
Zero = 0,
/// The CLDR keyword `one`.
One = 1,
/// The CLDR keyword `two`.
Two = 2,
/// The CLDR keyword `few`.
Few = 3,
/// The CLDR keyword `many`.
Many = 4,
/// The CLDR keyword `other`.
Other = 5,
// TODO(younies): revise this for currency
/// The explicit 1 case, see <https://www.unicode.org/reports/tr35/tr35-numbers.html#Explicit_0_1_rules>.
Explicit1 = 6,
// NOTE(egg): No explicit 0, because the compact decimal pattern selection
// algorithm does not allow such a thing to arise.
// TODO(younies): implement this case.
/// The default case.
/// NOTE:
/// Used as the default when there is no match.
/// This is also used to replace the most frequently occurring case in all plural rules.
Default = 7,

/// The display name for the currency.
DisplayName = 8,
}
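
Because `Count` is `#[repr(u8)]` with explicit discriminants, each value fits in a single byte, and a byte read back from serialized data has to be checked against the known discriminants before it can be treated as a variant. The sketch below shows roughly that round trip written by hand. `SketchCount` and `to_byte` are hypothetical names, and this is not the code that `zerovec::make_ule` generates, only an illustration of the idea.

```rust
use std::convert::TryFrom;

/// Hypothetical mirror of `Count`; the discriminants match the diff above.
#[derive(Copy, Clone, PartialEq, Eq, Debug)]
#[repr(u8)]
pub enum SketchCount {
    Zero = 0,
    One = 1,
    Two = 2,
    Few = 3,
    Many = 4,
    Other = 5,
    Explicit1 = 6,
    Default = 7,
    DisplayName = 8,
}

impl TryFrom<u8> for SketchCount {
    /// The rejected byte is handed back as the error value.
    type Error = u8;

    fn try_from(byte: u8) -> Result<Self, Self::Error> {
        Ok(match byte {
            0 => Self::Zero,
            1 => Self::One,
            2 => Self::Two,
            3 => Self::Few,
            4 => Self::Many,
            5 => Self::Other,
            6 => Self::Explicit1,
            7 => Self::Default,
            8 => Self::DisplayName,
            invalid => return Err(invalid),
        })
    }
}

/// Converts a variant to its one-byte representation.
pub fn to_byte(count: SketchCount) -> u8 {
    count as u8
}
```

Adding a variant in the middle of the enum would shift the stored bytes for every later variant, which is one reason to append new cases such as `DisplayName` at the end.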
1 change: 1 addition & 0 deletions components/experimental/src/dimension/provider/mod.rs
@@ -3,5 +3,6 @@
// (online at: https://github.com/unicode-org/icu4x/blob/main/LICENSE ).

pub mod currency;
pub mod extended_currency;
pub mod percent;
pub mod ule;
@@ -18,6 +18,27 @@ pub(in crate::provider) struct CurrencyPatterns {

#[serde(rename = "symbol-alt-narrow")]
pub(in crate::provider) narrow: Option<String>,

#[serde(rename = "displayName")]
pub(in crate::provider) display_name: Option<String>,

#[serde(rename = "displayName-count-zero")]
pub(in crate::provider) zero: Option<String>,

#[serde(rename = "displayName-count-one")]
pub(in crate::provider) one: Option<String>,

#[serde(rename = "displayName-count-two")]
pub(in crate::provider) two: Option<String>,

#[serde(rename = "displayName-count-few")]
pub(in crate::provider) few: Option<String>,

#[serde(rename = "displayName-count-many")]
pub(in crate::provider) many: Option<String>,

#[serde(rename = "displayName-count-other")]
pub(in crate::provider) other: Option<String>,
}
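
The `serde(rename = ...)` attributes above tie each field to a CLDR JSON key: `displayName` for the plain name and `displayName-count-<keyword>` for the plural forms. The same mapping can be sketched as a small `std`-only function. `NameKey` and `classify_key` are hypothetical names used for illustration and are not part of the datagen code.

```rust
/// Hypothetical classification of the display-name keys in `currencies.json`.
#[derive(Copy, Clone, PartialEq, Eq, Debug)]
pub enum NameKey {
    DisplayName,
    Zero,
    One,
    Two,
    Few,
    Many,
    Other,
}

/// Maps a CLDR key such as `displayName-count-one` to a `NameKey`.
/// Returns `None` for keys that are not display names (for example symbols).
pub fn classify_key(key: &str) -> Option<NameKey> {
    if key == "displayName" {
        return Some(NameKey::DisplayName);
    }
    // Everything else must carry the plural-count prefix.
    match key.strip_prefix("displayName-count-")? {
        "zero" => Some(NameKey::Zero),
        "one" => Some(NameKey::One),
        "two" => Some(NameKey::Two),
        "few" => Some(NameKey::Few),
        "many" => Some(NameKey::Many),
        "other" => Some(NameKey::Other),
        _ => None,
    }
}
```

Declaring every field as `Option<String>` in the serde struct plays the role of the `None` branch here: a locale that lacks a given plural form simply leaves that field empty.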

#[derive(PartialEq, Debug, Deserialize)]
@@ -30,4 +51,4 @@ pub(in crate::provider) struct LangNumbers {
pub(in crate::provider) numbers: Numbers,
}

- pub(in crate::provider) type Resource = super::LocaleResource<LangNumbers>;
+ pub(in crate::provider) type Resource = super::super::LocaleResource<LangNumbers>;
@@ -0,0 +1,6 @@
// This file is part of ICU4X. For terms of use, please see the file
// called LICENSE at the top level of the ICU4X source tree
// (online at: https://github.com/unicode-org/icu4x/blob/main/LICENSE ).

pub mod data;
pub mod supplemental;
1 change: 0 additions & 1 deletion provider/datagen/src/transform/cldr/cldr_serde/mod.rs
@@ -12,7 +12,6 @@ pub(in crate::provider) mod ca;
pub(in crate::provider) mod coverage_levels;
#[cfg(feature = "experimental_components")]
pub(in crate::provider) mod currencies;
pub(in crate::provider) mod currency_data;
#[cfg(feature = "experimental_components")]
pub(in crate::provider) mod date_fields;
pub(in crate::provider) mod directionality;
10 changes: 5 additions & 5 deletions provider/datagen/src/transform/cldr/currency/essentials.rs
@@ -83,10 +83,10 @@ impl DataProvider<CurrencyEssentialsV1Marker> for DatagenProvider {
self.check_req::<CurrencyEssentialsV1Marker>(req)?;
let langid = req.locale.get_langid();

- let currencies_resource: &cldr_serde::currencies::Resource = self
- .cldr()?
- .numbers()
- .read_and_parse(&langid, "currencies.json")?;
+ let currencies_resource: &cldr_serde::currencies::data::Resource =
+ self.cldr()?
+ .numbers()
+ .read_and_parse(&langid, "currencies.json")?;

let numbers_resource: &cldr_serde::numbers::Resource = self
.cldr()?
@@ -117,7 +117,7 @@ impl IterableDataProviderCached<CurrencyEssentialsV1Marker> for DatagenProvider

fn extract_currency_essentials<'data>(
provider: &DatagenProvider,
- currencies_resource: &cldr_serde::currencies::Resource,
+ currencies_resource: &cldr_serde::currencies::data::Resource,
numbers_resource: &cldr_serde::numbers::Resource,
) -> Result<CurrencyEssentialsV1<'data>, DataError> {
let currencies = &currencies_resource.main.value.numbers.currencies;
172 changes: 172 additions & 0 deletions provider/datagen/src/transform/cldr/currency/extended.rs
@@ -0,0 +1,172 @@
// This file is part of ICU4X. For terms of use, please see the file
// called LICENSE at the top level of the ICU4X source tree
// (online at: https://github.com/unicode-org/icu4x/blob/main/LICENSE ).

use crate::provider::transform::cldr::cldr_serde;
use crate::DatagenProvider;

use std::borrow::Cow;
use std::collections::BTreeMap;
use std::collections::HashSet;

use icu_experimental::dimension::provider::extended_currency::Count;
use icu_provider::datagen::IterableDataProvider;
use tinystr::TinyAsciiStr;

use icu_experimental::dimension::provider::extended_currency::*;
use icu_provider::prelude::*;
use icu_provider::DataProvider;

impl DataProvider<CurrencyExtendedDataV1Marker> for crate::DatagenProvider {
fn load(
&self,
req: DataRequest,
) -> Result<DataResponse<CurrencyExtendedDataV1Marker>, DataError> {
self.check_req::<CurrencyExtendedDataV1Marker>(req)?;

let langid = req.locale.get_langid();
let currencies_resource: &cldr_serde::currencies::data::Resource =
self.cldr()?
.numbers()
.read_and_parse(&langid, "currencies.json")?;

let aux = req
.marker_attributes
.parse::<TinyAsciiStr<3>>()
.map_err(|_| DataError::custom("failed to parse aux key into tinystr"))?;
let currency = currencies_resource
.main
.value
.numbers
.currencies
.get(&aux.to_unvalidated())
.ok_or(DataError::custom("No currency associated with the aux key"))?;

let mut placeholders: BTreeMap<Count, String> = BTreeMap::new();

fn add_placeholder(
placeholders: &mut BTreeMap<Count, String>,
key: Count,
value: Option<String>,
) {
if let Some(val) = value {
placeholders.insert(key, val);
}
}

add_placeholder(&mut placeholders, Count::Zero, currency.zero.clone());
add_placeholder(&mut placeholders, Count::One, currency.one.clone());
add_placeholder(&mut placeholders, Count::Two, currency.two.clone());
add_placeholder(&mut placeholders, Count::Few, currency.few.clone());
add_placeholder(&mut placeholders, Count::Many, currency.many.clone());
add_placeholder(&mut placeholders, Count::Other, currency.other.clone());
add_placeholder(
&mut placeholders,
Count::DisplayName,
currency.display_name.clone(),
);

let data = CurrencyExtendedDataV1 {
display_names: placeholders
.into_iter()
.map(|(k, v)| (k, Cow::Owned(v)))
.collect(),
};

Ok(DataResponse {
metadata: Default::default(),
payload: Some(DataPayload::from_owned(data)),
})
}
}
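
The `load` implementation above copies each `Option<String>` that is present into a map keyed by `Count`, and skips the ones that are `None`. The same step can be written as a single iterator chain. This is a sketch with plain `std` types; `Key` and `collect_present` are hypothetical names, and the PR itself uses the `add_placeholder` helper shown in the diff.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the `Count` keys filled in by `load`.
#[derive(Copy, Clone, PartialEq, Eq, PartialOrd, Ord, Debug)]
pub enum Key {
    Zero,
    One,
    Other,
    DisplayName,
}

/// Keeps only the entries whose value is present, so a missing plural form
/// never produces an empty string in the resulting map.
pub fn collect_present(
    entries: impl IntoIterator<Item = (Key, Option<String>)>,
) -> BTreeMap<Key, String> {
    entries
        .into_iter()
        .filter_map(|(key, value)| value.map(|name| (key, name)))
        .collect()
}
```

Collecting into a `BTreeMap` keeps the keys sorted, which matches the ordering a sorted map type such as `ZeroMap` expects when it is built from the collected pairs.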

impl IterableDataProvider<CurrencyExtendedDataV1Marker> for DatagenProvider {
fn supported_requests(&self) -> Result<HashSet<(DataLocale, DataMarkerAttributes)>, DataError> {
// TODO: This is a temporary implementation until we have a better way to handle a large number of JSON files.
let currencies_to_support: HashSet<_> =
["USD", "CAD", "EUR", "GBP", "EGP"].into_iter().collect();

let mut result = HashSet::new();
let numbers = self.cldr()?.numbers();
let langids = numbers.list_langs()?;
for langid in langids {
let currencies_resource: &cldr_serde::currencies::data::Resource = self
.cldr()?
.numbers()
.read_and_parse(&langid, "currencies.json")?;

let currencies = &currencies_resource.main.value.numbers.currencies;
for key in currencies.keys() {
// Validate the currency code once and reuse it for both the filter and the attributes.
let key = key
.try_into_tinystr()
.map_err(|_| DataError::custom("failed to parse currency code into tinystr"))?;
if !currencies_to_support.contains(key.as_str()) {
continue;
}

let attributes = DataMarkerAttributes::from_tinystr(key.resize());
result.insert((DataLocale::from(&langid), attributes));
}
}

Ok(result)
}
}
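
`supported_requests` above emits one (locale, currency) pair for every currency found in a locale's `currencies.json`, but only if the code is in a temporary allow-list of five currencies. A `std`-only sketch of that filtering step follows. `SUPPORTED` and `supported_pairs` are hypothetical names, and plain strings stand in for `DataLocale` and `DataMarkerAttributes`.

```rust
use std::collections::BTreeSet;

/// The temporary allow-list from the diff above.
pub const SUPPORTED: [&str; 5] = ["USD", "CAD", "EUR", "GBP", "EGP"];

/// Given every (locale, currency code) pair found in the source data,
/// returns only the pairs whose currency is on the allow-list.
pub fn supported_pairs(available: &[(&str, &str)]) -> BTreeSet<(String, String)> {
    let mut result = BTreeSet::new();
    for &(locale, code) in available {
        if SUPPORTED.contains(&code) {
            result.insert((locale.to_string(), code.to_string()));
        }
    }
    result
}
```

Using a set means a pair that shows up more than once in the input is reported only once, which is the same guarantee the `HashSet` return type gives in the PR.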

#[test]
fn test_basic() {
use icu_locale_core::langid;

let provider = DatagenProvider::new_testing();
let en: DataPayload<CurrencyExtendedDataV1Marker> = provider
.load(DataRequest {
locale: &langid!("en").into(),
marker_attributes: &"USD".parse().unwrap(),
..Default::default()
})
.unwrap()
.take_payload()
.unwrap();
let display_names = en.get().to_owned().display_names;
assert_eq!(display_names.get(&Count::Zero), None);
assert_eq!(display_names.get(&Count::One).unwrap(), "US dollar");
assert_eq!(display_names.get(&Count::Two), None);
assert_eq!(display_names.get(&Count::Few), None);
assert_eq!(display_names.get(&Count::Many), None);
assert_eq!(display_names.get(&Count::Other).unwrap(), "US dollars");
assert_eq!(display_names.get(&Count::DisplayName).unwrap(), "US Dollar");

let fr: DataPayload<CurrencyExtendedDataV1Marker> = provider
.load(DataRequest {
locale: &langid!("fr").into(),
marker_attributes: &"USD".parse().unwrap(),
..Default::default()
})
.unwrap()
.take_payload()
.unwrap();

let display_names = fr.get().to_owned().display_names;
assert_eq!(display_names.get(&Count::Zero), None);
assert_eq!(
display_names.get(&Count::One).unwrap(),
"dollar des États-Unis"
);
assert_eq!(display_names.get(&Count::Two), None);
assert_eq!(display_names.get(&Count::Few), None);
assert_eq!(display_names.get(&Count::Many), None);
assert_eq!(
display_names.get(&Count::Other).unwrap(),
"dollars des États-Unis"
);
assert_eq!(
display_names.get(&Count::DisplayName).unwrap(),
"dollar des États-Unis"
);
}
1 change: 1 addition & 0 deletions provider/datagen/src/transform/cldr/currency/mod.rs
@@ -3,3 +3,4 @@
// (online at: https://github.com/unicode-org/icu4x/blob/main/LICENSE ).

pub(in crate::provider) mod essentials;
pub(in crate::provider) mod extended;
7 changes: 7 additions & 0 deletions provider/datagen/tests/data/baked/macros.rs

Some generated files are not rendered by default.