This is a low-priority issue from https://bugzilla.mozilla.org/show_bug.cgi?id=1871754. Before switching to ICU4X, Gecko's word segmenter for Chinese and Japanese placed boundaries based on whether adjacent characters belong to the same character class.
Currently, the word segmenter for Chinese and Japanese is dictionary-based. Since new words are constantly being coined, a dictionary-based implementation may not give good enough quality unless the dictionary is kept up to date.
Although we are considering other approaches such as machine learning in the future, it may be better to have a segmenter option that disables the dictionary for specific languages only (for example, no dictionary for Japanese, while other languages can still use it).
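For illustration, the character-class approach described above could look roughly like the sketch below. This is not Gecko's actual code: the `CharClass` enum, `classify`, and `segment_by_class` are hypothetical names, and the classification covers only a few Unicode blocks as an assumption for the example.

```rust
/// Hypothetical coarse character classes; not Gecko's real classification.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CharClass {
    Hiragana,
    Katakana,
    Han,
    Other,
}

/// Map a character to a class using a few Unicode block ranges (assumption).
fn classify(c: char) -> CharClass {
    match c {
        '\u{3040}'..='\u{309F}' => CharClass::Hiragana,
        '\u{30A0}'..='\u{30FF}' => CharClass::Katakana,
        '\u{4E00}'..='\u{9FFF}' => CharClass::Han,
        _ => CharClass::Other,
    }
}

/// Split `text` at every position where the character class changes.
fn segment_by_class(text: &str) -> Vec<&str> {
    let mut segments = Vec::new();
    let mut start = 0;
    let mut prev: Option<CharClass> = None;
    for (idx, ch) in text.char_indices() {
        let class = classify(ch);
        if let Some(p) = prev {
            if p != class {
                // Class changed: close the current segment at this byte offset.
                segments.push(&text[start..idx]);
                start = idx;
            }
        }
        prev = Some(class);
    }
    // Push the trailing segment, if any.
    if start < text.len() {
        segments.push(&text[start..]);
    }
    segments
}
```

With this sketch, `segment_by_class("東京タワーに行く")` yields `東京`, `タワー`, `に`, `行`, `く`. No dictionary is involved, so new words never require a data update, but a word that mixes classes (such as `行く`) is split.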
CC: @aethanyc