pub trait UnicodeNormalization<I: Iterator<Item = char>> {
    // Required methods
    fn nfd(self) -> Decompositions<I>;
    fn nfkd(self) -> Decompositions<I>;
    fn nfc(self) -> Recompositions<I>;
    fn nfkc(self) -> Recompositions<I>;
    fn cjk_compat_variants(self) -> Replacements<I>;
    fn stream_safe(self) -> StreamSafe<I>;
}
Methods for iterating over strings while applying Unicode normalizations as described in Unicode Standard Annex #15.
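The trait is generic over `I` so that the same methods can be called on a string slice and on any iterator of `char`. The following is a minimal sketch of that extension-trait pattern using only the standard library. The names `CharAdapters`, `NoSpaces`, and `no_spaces` are hypothetical stand-ins invented for illustration; the adapter merely drops ASCII whitespace and performs no normalization.

```rust
use std::str::Chars;

/// Hypothetical adapter standing in for types such as `Decompositions<I>`.
/// It only filters out ASCII whitespace; it does not normalize anything.
pub struct NoSpaces<I> {
    inner: I,
}

impl<I: Iterator<Item = char>> Iterator for NoSpaces<I> {
    type Item = char;

    fn next(&mut self) -> Option<char> {
        // Advance the wrapped iterator until a non-whitespace char appears.
        self.inner.find(|c| !c.is_ascii_whitespace())
    }
}

/// Hypothetical extension trait with the same shape as `UnicodeNormalization<I>`.
pub trait CharAdapters<I: Iterator<Item = char>> {
    fn no_spaces(self) -> NoSpaces<I>;
}

/// For a string slice, `I` is the slice's `Chars` iterator.
impl<'a> CharAdapters<Chars<'a>> for &'a str {
    fn no_spaces(self) -> NoSpaces<Chars<'a>> {
        NoSpaces { inner: self.chars() }
    }
}

/// For any iterator of `char`, `I` is the iterator itself.
impl<I: Iterator<Item = char>> CharAdapters<I> for I {
    fn no_spaces(self) -> NoSpaces<I> {
        NoSpaces { inner: self }
    }
}
```

With this shape, `"a b c".no_spaces()` and `"a b c".chars().no_spaces()` both compile, and each yields a lazy iterator that can be collected into a `String`.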
Required Methods
fn nfd(self) -> Decompositions<I>
Returns an iterator over the string in Unicode Normalization Form D (canonical decomposition).
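Most canonical decompositions come from Unicode data tables, but precomposed Hangul syllables are decomposed arithmetically, which makes them a convenient self-contained illustration. The sketch below follows the Hangul syllable arithmetic given in the Unicode Standard; the function name `decompose_hangul` is hypothetical, and this is not the crate's own code.

```rust
// Constants from the Unicode Standard's Hangul syllable decomposition.
const S_BASE: u32 = 0xAC00; // first precomposed syllable
const L_BASE: u32 = 0x1100; // first leading consonant (choseong)
const V_BASE: u32 = 0x1161; // first vowel (jungseong)
const T_BASE: u32 = 0x11A7; // one before the first trailing consonant
const T_COUNT: u32 = 28;
const N_COUNT: u32 = 588; // V_COUNT (21) * T_COUNT (28)
const S_COUNT: u32 = 11172; // L_COUNT (19) * N_COUNT

/// Canonically decomposes one precomposed Hangul syllable into its jamo.
/// Returns `None` for any character outside the syllable block.
fn decompose_hangul(c: char) -> Option<Vec<char>> {
    let s_index = (c as u32).checked_sub(S_BASE)?;
    if s_index >= S_COUNT {
        return None;
    }
    let l = L_BASE + s_index / N_COUNT;
    let v = V_BASE + (s_index % N_COUNT) / T_COUNT;
    let t = T_BASE + s_index % T_COUNT;

    let mut jamo = vec![char::from_u32(l)?, char::from_u32(v)?];
    // A trailing index of zero means the syllable has no final consonant.
    if t != T_BASE {
        jamo.push(char::from_u32(t)?);
    }
    Some(jamo)
}
```

For example, U+D55C (한) decomposes into U+1112, U+1161, U+11AB, while U+AC00 (가) yields only the two jamo U+1100 and U+1161.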
fn nfkd(self) -> Decompositions<I>
Returns an iterator over the string in Unicode Normalization Form KD (compatibility decomposition).
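To make the difference between canonical and compatibility decomposition concrete, here is a toy lookup holding four mappings taken from the Unicode Character Database. The functions `toy_decompose` and `toy_normalize` are hypothetical; a real implementation uses the full data tables, applies mappings recursively, and reorders combining marks, none of which this sketch attempts.

```rust
/// Toy decomposition table. With `compat = false` only canonical mappings
/// apply (as in Form D); with `compat = true` compatibility mappings apply
/// as well (as in Form KD).
fn toy_decompose(c: char, compat: bool) -> Option<&'static str> {
    match c {
        // Canonical mappings: applied in both modes.
        '\u{00E9}' => Some("e\u{0301}"), // e with acute -> e + combining acute
        '\u{212B}' => Some("A\u{030A}"), // ANGSTROM SIGN -> A + combining ring above
        // Compatibility mappings: applied only in compatibility mode.
        '\u{FB01}' if compat => Some("fi"), // LATIN SMALL LIGATURE FI
        '\u{00B2}' if compat => Some("2"),  // SUPERSCRIPT TWO
        _ => None,
    }
}

/// Applies the toy table to every character of `input`.
fn toy_normalize(input: &str, compat: bool) -> String {
    let mut out = String::new();
    for c in input.chars() {
        match toy_decompose(c, compat) {
            Some(mapped) => out.push_str(mapped),
            None => out.push(c),
        }
    }
    out
}
```

In canonical mode the ligature U+FB01 passes through unchanged, whereas in compatibility mode it becomes the two letters "fi". This loss of formatting distinctions is why compatibility forms are usually reserved for tasks like search and identifier matching.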
fn nfc(self) -> Recompositions<I>
Returns an iterator over the string in Unicode Normalization Form C (canonical decomposition followed by canonical composition).
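The composition step can be illustrated with Hangul, where canonical composition is defined arithmetically rather than by table lookup. The sketch below assumes its input is already decomposed and contains no combining marks that would need reordering. The names `compose_hangul_pair` and `compose_jamo` are hypothetical, and this is not the crate's own code.

```rust
// Constants from the Unicode Standard's Hangul syllable composition.
const S_BASE: u32 = 0xAC00;
const L_BASE: u32 = 0x1100;
const V_BASE: u32 = 0x1161;
const T_BASE: u32 = 0x11A7;
const L_COUNT: u32 = 19;
const V_COUNT: u32 = 21;
const T_COUNT: u32 = 28;
const S_COUNT: u32 = L_COUNT * V_COUNT * T_COUNT; // 11172

/// Composes a pair of characters if they form L + V or LV + T.
fn compose_hangul_pair(a: char, b: char) -> Option<char> {
    let (a, b) = (a as u32, b as u32);
    // Leading consonant + vowel -> LV syllable.
    if (L_BASE..L_BASE + L_COUNT).contains(&a) && (V_BASE..V_BASE + V_COUNT).contains(&b) {
        let lv = S_BASE + ((a - L_BASE) * V_COUNT + (b - V_BASE)) * T_COUNT;
        return char::from_u32(lv);
    }
    // LV syllable (no trailing consonant yet) + trailing consonant -> LVT syllable.
    if (S_BASE..S_BASE + S_COUNT).contains(&a)
        && (a - S_BASE) % T_COUNT == 0
        && (T_BASE + 1..T_BASE + T_COUNT).contains(&b)
    {
        return char::from_u32(a + (b - T_BASE));
    }
    None
}

/// Greedily composes adjacent jamo in `input`, leaving other characters as-is.
fn compose_jamo(input: &str) -> String {
    let mut out = String::new();
    let mut last: Option<char> = None;
    for c in input.chars() {
        match last.and_then(|prev| compose_hangul_pair(prev, c)) {
            Some(composed) => last = Some(composed),
            None => {
                // No composition possible: flush the pending char and hold `c`.
                if let Some(prev) = last {
                    out.push(prev);
                }
                last = Some(c);
            }
        }
    }
    if let Some(prev) = last {
        out.push(prev);
    }
    out
}
```

Given the three jamo U+1112, U+1161, U+11AB, `compose_jamo` produces the single syllable U+D55C (한).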
fn nfkc(self) -> Recompositions<I>
Returns an iterator over the string in Unicode Normalization Form KC (compatibility decomposition followed by canonical composition).
fn cjk_compat_variants(self) -> Replacements<I>
A transformation that replaces CJK Compatibility Ideograph code points with their Standardized Variation Sequences (the corresponding unified ideograph followed by a variation selector). This is not part of the canonical or compatibility decomposition algorithms, but applying it before those algorithms produces normalized output that better preserves the intent of the original text.
Note that many systems today ignore variation selectors, so these may not immediately help text display as intended, but they at least preserve the information in a standardized form, giving implementations the option to recognize them.
fn stream_safe(self) -> StreamSafe<I>
Returns an iterator over the string with Combining Grapheme Joiner (CGJ, U+034F) characters inserted according to the Stream-Safe Text Process (UAX15-D4).
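The Stream-Safe Text Process limits every run of non-starters (characters with a non-zero canonical combining class) to at most 30 by inserting a CGJ where a run would exceed that. The following is a simplified sketch of the idea, not the crate's implementation. It assumes, purely for illustration, that the only non-starters are the combining diacritical marks U+0300..=U+036F other than CGJ itself, and it ignores the leading and trailing non-starters that a character's compatibility decomposition can contribute. The names `is_non_starter` and `toy_stream_safe` are hypothetical.

```rust
const CGJ: char = '\u{034F}'; // COMBINING GRAPHEME JOINER, a starter
const MAX_NON_STARTERS: usize = 30; // limit set by UAX #15

/// Illustrative stand-in for a Canonical_Combining_Class lookup.
/// Assumption: only U+0300..=U+036F (minus CGJ) count as non-starters.
fn is_non_starter(c: char) -> bool {
    ('\u{0300}'..='\u{036F}').contains(&c) && c != CGJ
}

/// Inserts a CGJ before any non-starter that would be the 31st in its run.
fn toy_stream_safe(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut run = 0usize; // non-starters seen since the last starter
    for c in input.chars() {
        if is_non_starter(c) {
            if run == MAX_NON_STARTERS {
                // The CGJ is a starter, so it ends the current run.
                out.push(CGJ);
                run = 0;
            }
            run += 1;
        } else {
            run = 0;
        }
        out.push(c);
    }
    out
}
```

A base letter followed by 31 combining acute accents comes out as the letter, 30 accents, one CGJ, and the final accent. Input with 30 or fewer consecutive marks is returned unchanged.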