1,737 changes: 706 additions & 1,031 deletions lib/stdlib/test/unicode_util_SUITE_data/GraphemeBreakTest.txt

Large diffs are not rendered by default.

35,990 changes: 19,329 additions & 16,661 deletions lib/stdlib/test/unicode_util_SUITE_data/LineBreakTest.txt

Large diffs are not rendered by default.

75 changes: 72 additions & 3 deletions lib/stdlib/test/unicode_util_SUITE_data/NormalizationTest.txt

Large diffs are not rendered by default.

Binary file modified lib/stdlib/test/unicode_util_SUITE_data/unicode_table.bin
Binary file not shown.
40 changes: 34 additions & 6 deletions lib/stdlib/uc_spec/CaseFolding.txt
@@ -1,6 +1,6 @@
# CaseFolding-16.0.0.txt
# Date: 2024-04-30, 21:48:11 GMT
# © 2024 Unicode®, Inc.
# CaseFolding-17.0.0.txt
# Date: 2025-07-30, 23:54:36 GMT
# © 2025 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
#
@@ -18,15 +18,15 @@
# The data supports both implementations that require simple case foldings
# (where string lengths don't change), and implementations that allow full case folding
# (where string lengths may grow). Note that where they can be supported, the
# full case foldings are superior: for example, they allow "MASSE" and "Maße" to match.
# full case foldings are superior: for example, they allow "FUSS" and "Fuß" to match.
#
# All code points not listed in this file map to themselves.
#
# NOTE: case folding does not preserve normalization formats!
#
# For information on case folding, including how to have case folding
# preserve normalization formats, see Section 3.13 Default Case Algorithms in
# The Unicode Standard.
# preserve normalization formats, see the
# "Conformance" / "Default Case Algorithms" section of the core specification.
#
# ================================================================================
# Format
@@ -1243,7 +1243,10 @@ A7C7; C; A7C8; # LATIN CAPITAL LETTER D WITH SHORT STROKE OVERLAY
A7C9; C; A7CA; # LATIN CAPITAL LETTER S WITH SHORT STROKE OVERLAY
A7CB; C; 0264; # LATIN CAPITAL LETTER RAMS HORN
A7CC; C; A7CD; # LATIN CAPITAL LETTER S WITH DIAGONAL STROKE
A7CE; C; A7CF; # LATIN CAPITAL LETTER PHARYNGEAL VOICED FRICATIVE
A7D0; C; A7D1; # LATIN CAPITAL LETTER CLOSED INSULAR G
A7D2; C; A7D3; # LATIN CAPITAL LETTER DOUBLE THORN
A7D4; C; A7D5; # LATIN CAPITAL LETTER DOUBLE WYNN
A7D6; C; A7D7; # LATIN CAPITAL LETTER MIDDLE SCOTS S
A7D8; C; A7D9; # LATIN CAPITAL LETTER SIGMOID S
A7DA; C; A7DB; # LATIN CAPITAL LETTER LAMBDA
@@ -1616,6 +1619,31 @@ FF3A; C; FF5A; # FULLWIDTH LATIN CAPITAL LETTER Z
16E5D; C; 16E7D; # MEDEFAIDRIN CAPITAL LETTER O
16E5E; C; 16E7E; # MEDEFAIDRIN CAPITAL LETTER AI
16E5F; C; 16E7F; # MEDEFAIDRIN CAPITAL LETTER Y
16EA0; C; 16EBB; # BERIA ERFE CAPITAL LETTER ARKAB
16EA1; C; 16EBC; # BERIA ERFE CAPITAL LETTER BASIGNA
16EA2; C; 16EBD; # BERIA ERFE CAPITAL LETTER DARBAI
16EA3; C; 16EBE; # BERIA ERFE CAPITAL LETTER EH
16EA4; C; 16EBF; # BERIA ERFE CAPITAL LETTER FITKO
16EA5; C; 16EC0; # BERIA ERFE CAPITAL LETTER GOWAY
16EA6; C; 16EC1; # BERIA ERFE CAPITAL LETTER HIRDEABO
16EA7; C; 16EC2; # BERIA ERFE CAPITAL LETTER I
16EA8; C; 16EC3; # BERIA ERFE CAPITAL LETTER DJAI
16EA9; C; 16EC4; # BERIA ERFE CAPITAL LETTER KOBO
16EAA; C; 16EC5; # BERIA ERFE CAPITAL LETTER LAKKO
16EAB; C; 16EC6; # BERIA ERFE CAPITAL LETTER MERI
16EAC; C; 16EC7; # BERIA ERFE CAPITAL LETTER NINI
16EAD; C; 16EC8; # BERIA ERFE CAPITAL LETTER GNA
16EAE; C; 16EC9; # BERIA ERFE CAPITAL LETTER NGAY
16EAF; C; 16ECA; # BERIA ERFE CAPITAL LETTER OI
16EB0; C; 16ECB; # BERIA ERFE CAPITAL LETTER PI
16EB1; C; 16ECC; # BERIA ERFE CAPITAL LETTER ERIGO
16EB2; C; 16ECD; # BERIA ERFE CAPITAL LETTER ERIGO TAMURA
16EB3; C; 16ECE; # BERIA ERFE CAPITAL LETTER SERI
16EB4; C; 16ECF; # BERIA ERFE CAPITAL LETTER SHEP
16EB5; C; 16ED0; # BERIA ERFE CAPITAL LETTER TATASOUE
16EB6; C; 16ED1; # BERIA ERFE CAPITAL LETTER UI
16EB7; C; 16ED2; # BERIA ERFE CAPITAL LETTER WASSE
16EB8; C; 16ED3; # BERIA ERFE CAPITAL LETTER AY
1E900; C; 1E922; # ADLAM CAPITAL LETTER ALIF
1E901; C; 1E923; # ADLAM CAPITAL LETTER DAALI
1E902; C; 1E924; # ADLAM CAPITAL LETTER LAAM
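
Note on reading these entries: each data line in CaseFolding.txt has the form `<code>; <status>; <mapping>; # <name>`, and code points that are not listed map to themselves. The following is a minimal Python sketch of that format, not the code OTP uses to generate its tables. `parse_case_folding` and `fold` are hypothetical helper names, and the sample holds only three of the lines added in this change.

```python
# Sample lines copied from the hunks above; a real run would read the whole file.
SAMPLE = """\
A7CE; C; A7CF; # LATIN CAPITAL LETTER PHARYNGEAL VOICED FRICATIVE
16EA0; C; 16EBB; # BERIA ERFE CAPITAL LETTER ARKAB
16EB8; C; 16ED3; # BERIA ERFE CAPITAL LETTER AY
"""


def parse_case_folding(text, statuses=("C", "F")):
    """Map each source code point to its folded code point sequence.

    Statuses C (common) and F (full) together give full case folding.
    """
    table = {}
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()
        if not data:
            continue  # comment-only or blank line
        code, status, mapping = [field.strip() for field in data.split(";")[:3]]
        if status in statuses:
            table[int(code, 16)] = [int(cp, 16) for cp in mapping.split()]
    return table


def fold(s, table):
    """Fold a string; code points missing from the table map to themselves."""
    return "".join(
        "".join(map(chr, table.get(ord(ch), [ord(ch)]))) for ch in s
    )


table = parse_case_folding(SAMPLE)
```

For comparison, Python's built-in `str.casefold()` applies full case folding, so `"FUSS".casefold() == "Fuß".casefold()` holds, which is the match the updated header comment describes.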
6 changes: 3 additions & 3 deletions lib/stdlib/uc_spec/CompositionExclusions.txt
@@ -1,6 +1,6 @@
# CompositionExclusions-16.0.0.txt
# Date: 2024-02-02
# © 2024 Unicode®, Inc.
# CompositionExclusions-17.0.0.txt
# Date: 2025-08-01
# © 2025 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
#
109 changes: 72 additions & 37 deletions lib/stdlib/uc_spec/EastAsianWidth.txt

Large diffs are not rendered by default.

28 changes: 19 additions & 9 deletions lib/stdlib/uc_spec/GraphemeBreakProperty.txt
@@ -1,6 +1,6 @@
# GraphemeBreakProperty-16.0.0.txt
# Date: 2024-05-31, 18:09:38 GMT
# © 2024 Unicode®, Inc.
# GraphemeBreakProperty-17.0.0.txt
# Date: 2025-06-30, 06:20:23 GMT
# © 2025 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
#
@@ -30,12 +30,11 @@
113D1 ; Prepend # Lo TULU-TIGALARI REPHA
1193F ; Prepend # Lo DIVES AKURU PREFIXED NASAL SIGN
11941 ; Prepend # Lo DIVES AKURU INITIAL RA
11A3A ; Prepend # Lo ZANABAZAR SQUARE CLUSTER-INITIAL LETTER RA
11A84..11A89 ; Prepend # Lo [6] SOYOMBO SIGN JIHVAMULIYA..SOYOMBO CLUSTER-INITIAL LETTER SA
11D46 ; Prepend # Lo MASARAM GONDI REPHA
11F02 ; Prepend # Lo KAWI SIGN REPHA

# Total code points: 28
# Total code points: 27

# ================================================

@@ -243,7 +242,8 @@ E01F0..E0FFF ; Control # Cn [3600] <reserved-E01F0>..<reserved-E0FFF>
1A7F ; Extend # Mn TAI THAM COMBINING CRYPTOGRAMMIC DOT
1AB0..1ABD ; Extend # Mn [14] COMBINING DOUBLED CIRCUMFLEX ACCENT..COMBINING PARENTHESES BELOW
1ABE ; Extend # Me COMBINING PARENTHESES OVERLAY
1ABF..1ACE ; Extend # Mn [16] COMBINING LATIN SMALL LETTER W BELOW..COMBINING LATIN SMALL LETTER INSULAR T
1ABF..1ADD ; Extend # Mn [31] COMBINING LATIN SMALL LETTER W BELOW..COMBINING DOT-AND-RING BELOW
1AE0..1AEB ; Extend # Mn [12] COMBINING LEFT TACK ABOVE..COMBINING DOUBLE RIGHTWARDS ARROW ABOVE
1B00..1B03 ; Extend # Mn [4] BALINESE SIGN ULU RICEM..BALINESE SIGN SURANG
1B34 ; Extend # Mn BALINESE SIGN REREKAN
1B35 ; Extend # Mc BALINESE VOWEL SIGN TEDUNG
@@ -339,7 +339,7 @@ FF9E..FF9F ; Extend # Lm [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDT
10D24..10D27 ; Extend # Mn [4] HANIFI ROHINGYA SIGN HARBAHAY..HANIFI ROHINGYA SIGN TASSI
10D69..10D6D ; Extend # Mn [5] GARAY VOWEL SIGN E..GARAY CONSONANT NASALIZATION MARK
10EAB..10EAC ; Extend # Mn [2] YEZIDI COMBINING HAMZA MARK..YEZIDI COMBINING MADDA MARK
10EFC..10EFF ; Extend # Mn [4] ARABIC COMBINING ALEF OVERLAY..ARABIC SMALL LOW WORD MADDA
10EFA..10EFF ; Extend # Mn [6] ARABIC DOUBLE VERTICAL BAR BELOW..ARABIC SMALL LOW WORD MADDA
10F46..10F50 ; Extend # Mn [11] SOGDIAN COMBINING DOT BELOW..SOGDIAN COMBINING STROKE BELOW
10F82..10F85 ; Extend # Mn [4] OLD UYGHUR COMBINING DOT ABOVE..OLD UYGHUR COMBINING TWO DOTS BELOW
11001 ; Extend # Mn BRAHMI SIGN ANUSVARA
@@ -430,6 +430,9 @@ FF9E..FF9F ; Extend # Lm [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDT
11A59..11A5B ; Extend # Mn [3] SOYOMBO VOWEL SIGN VOCALIC R..SOYOMBO VOWEL LENGTH MARK
11A8A..11A96 ; Extend # Mn [13] SOYOMBO FINAL CONSONANT SIGN G..SOYOMBO SIGN ANUSVARA
11A98..11A99 ; Extend # Mn [2] SOYOMBO GEMINATION MARK..SOYOMBO SUBJOINER
11B60 ; Extend # Mn SHARADA VOWEL SIGN OE
11B62..11B64 ; Extend # Mn [3] SHARADA VOWEL SIGN UE..SHARADA VOWEL SIGN SHORT E
11B66 ; Extend # Mn SHARADA VOWEL SIGN CANDRA E
11C30..11C36 ; Extend # Mn [7] BHAIKSUKI VOWEL SIGN I..BHAIKSUKI VOWEL SIGN VOCALIC L
11C38..11C3D ; Extend # Mn [6] BHAIKSUKI VOWEL SIGN E..BHAIKSUKI SIGN ANUSVARA
11C3F ; Extend # Mn BHAIKSUKI SIGN VIRAMA
@@ -489,13 +492,17 @@ FF9E..FF9F ; Extend # Lm [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDT
1E2EC..1E2EF ; Extend # Mn [4] WANCHO TONE TUP..WANCHO TONE KOINI
1E4EC..1E4EF ; Extend # Mn [4] NAG MUNDARI SIGN MUHOR..NAG MUNDARI SIGN SUTUH
1E5EE..1E5EF ; Extend # Mn [2] OL ONAL SIGN MU..OL ONAL SIGN IKIR
1E6E3 ; Extend # Mn TAI YO SIGN UE
1E6E6 ; Extend # Mn TAI YO SIGN AU
1E6EE..1E6EF ; Extend # Mn [2] TAI YO SIGN AY..TAI YO SIGN ANG
1E6F5 ; Extend # Mn TAI YO SIGN OM
1E8D0..1E8D6 ; Extend # Mn [7] MENDE KIKAKUI COMBINING NUMBER TEENS..MENDE KIKAKUI COMBINING NUMBER MILLIONS
1E944..1E94A ; Extend # Mn [7] ADLAM ALIF LENGTHENER..ADLAM NUKTA
1F3FB..1F3FF ; Extend # Sk [5] EMOJI MODIFIER FITZPATRICK TYPE-1-2..EMOJI MODIFIER FITZPATRICK TYPE-6
E0020..E007F ; Extend # Cf [96] TAG SPACE..CANCEL TAG
E0100..E01EF ; Extend # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256

# Total code points: 2198
# Total code points: 2237

# ================================================

@@ -646,6 +653,9 @@ ABEC ; SpacingMark # Mc MEETEI MAYEK LUM IYEK
11A39 ; SpacingMark # Mc ZANABAZAR SQUARE SIGN VISARGA
11A57..11A58 ; SpacingMark # Mc [2] SOYOMBO VOWEL SIGN AI..SOYOMBO VOWEL SIGN AU
11A97 ; SpacingMark # Mc SOYOMBO SIGN VISARGA
11B61 ; SpacingMark # Mc SHARADA VOWEL SIGN OOE
11B65 ; SpacingMark # Mc SHARADA VOWEL SIGN SHORT O
11B67 ; SpacingMark # Mc SHARADA VOWEL SIGN CANDRA O
11C2F ; SpacingMark # Mc BHAIKSUKI VOWEL SIGN AA
11C3E ; SpacingMark # Mc BHAIKSUKI SIGN VISARGA
11CA9 ; SpacingMark # Mc MARCHEN SUBJOINED LETTER YA
@@ -661,7 +671,7 @@ ABEC ; SpacingMark # Mc MEETEI MAYEK LUM IYEK
1612A..1612C ; SpacingMark # Mc [3] GURUNG KHEMA CONSONANT SIGN MEDIAL YA..GURUNG KHEMA CONSONANT SIGN MEDIAL HA
16F51..16F87 ; SpacingMark # Mc [55] MIAO SIGN ASPIRATION..MIAO VOWEL SIGN UI

# Total code points: 378
# Total code points: 381

# ================================================
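
Note on the totals: the bracketed counts and the "Total code points" lines can be cross-checked from the ranges themselves. The sketch below is a hypothetical helper, not part of OTP. It assumes that the Extend lines carrying the smaller counts are the removed ones (the +/- markers are not visible here) and that no collapsed hunk also touches Extend. Under those assumptions the visible changes should account for the move from 2198 to 2237.

```python
import re

# Data line shape: <first>[..<last>] ; <property> # comment
LINE_RE = re.compile(r"^([0-9A-F]+)(?:\.\.([0-9A-F]+))?\s*;\s*(\w+)")


def range_size(line):
    """Return (property, number of code points) for a data line, else None."""
    m = LINE_RE.match(line)
    if m is None:
        return None  # comment, separator, or blank line
    first = int(m.group(1), 16)
    last = int(m.group(2), 16) if m.group(2) else first
    return m.group(3), last - first + 1


def total(lines, prop):
    """Sum the code points assigned to one property value."""
    sizes = (range_size(line) for line in lines)
    return sum(n for p, n in filter(None, sizes) if p == prop)


# Extend lines as read from the hunks above (comments shortened).
OLD_EXTEND = [
    "1ABF..1ACE ; Extend # Mn [16]",
    "10EFC..10EFF ; Extend # Mn [4]",
]
NEW_EXTEND = [
    "1ABF..1ADD ; Extend # Mn [31]",
    "1AE0..1AEB ; Extend # Mn [12]",
    "10EFA..10EFF ; Extend # Mn [6]",
    "11B60 ; Extend # Mn",
    "11B62..11B64 ; Extend # Mn [3]",
    "11B66 ; Extend # Mn",
    "1E6E3 ; Extend # Mn",
    "1E6E6 ; Extend # Mn",
    "1E6EE..1E6EF ; Extend # Mn [2]",
    "1E6F5 ; Extend # Mn",
]

delta = total(NEW_EXTEND, "Extend") - total(OLD_EXTEND, "Extend")
```

The same check applies to SpacingMark (three added Sharada vowel signs, 378 to 381) and Prepend (one line dropped, 28 to 27).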

55 changes: 37 additions & 18 deletions lib/stdlib/uc_spec/IndicSyllabicCategory.txt
@@ -1,6 +1,6 @@
# IndicSyllabicCategory-16.0.0.txt
# Date: 2024-04-30, 21:48:21 GMT
# © 2024 Unicode®, Inc.
# IndicSyllabicCategory-17.0.0.txt
# Date: 2025-08-01, 04:02:23 GMT
# © 2025 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
#
@@ -471,8 +471,6 @@ ABD1 ; Vowel_Independent # Lo MEETEI MAYEK LETTER ATIYA
11909 ; Vowel_Independent # Lo DIVES AKURU LETTER O
119A0..119A7 ; Vowel_Independent # Lo [8] NANDINAGARI LETTER A..NANDINAGARI LETTER VOCALIC RR
119AA..119AD ; Vowel_Independent # Lo [4] NANDINAGARI LETTER E..NANDINAGARI LETTER AU
11A00 ; Vowel_Independent # Lo ZANABAZAR SQUARE LETTER A
11A50 ; Vowel_Independent # Lo SOYOMBO LETTER A
11C00..11C08 ; Vowel_Independent # Lo [9] BHAIKSUKI LETTER A..BHAIKSUKI LETTER VOCALIC L
11C0A..11C0D ; Vowel_Independent # Lo [4] BHAIKSUKI LETTER E..BHAIKSUKI LETTER AU
11D00..11D06 ; Vowel_Independent # Lo [7] MASARAM GONDI LETTER A..MASARAM GONDI LETTER E
@@ -729,6 +727,12 @@ ABE9..ABEA ; Vowel_Dependent # Mc [2] MEETEI MAYEK VOWEL SIGN CHEINAP..MEET
11A51..11A56 ; Vowel_Dependent # Mn [6] SOYOMBO VOWEL SIGN I..SOYOMBO VOWEL SIGN OE
11A57..11A58 ; Vowel_Dependent # Mc [2] SOYOMBO VOWEL SIGN AI..SOYOMBO VOWEL SIGN AU
11A59..11A5B ; Vowel_Dependent # Mn [3] SOYOMBO VOWEL SIGN VOCALIC R..SOYOMBO VOWEL LENGTH MARK
11B60 ; Vowel_Dependent # Mn SHARADA VOWEL SIGN OE
11B61 ; Vowel_Dependent # Mc SHARADA VOWEL SIGN OOE
11B62..11B64 ; Vowel_Dependent # Mn [3] SHARADA VOWEL SIGN UE..SHARADA VOWEL SIGN SHORT E
11B65 ; Vowel_Dependent # Mc SHARADA VOWEL SIGN SHORT O
11B66 ; Vowel_Dependent # Mn SHARADA VOWEL SIGN CANDRA E
11B67 ; Vowel_Dependent # Mc SHARADA VOWEL SIGN CANDRA O
11C2F ; Vowel_Dependent # Mc BHAIKSUKI VOWEL SIGN AA
11C30..11C36 ; Vowel_Dependent # Mn [7] BHAIKSUKI VOWEL SIGN I..BHAIKSUKI VOWEL SIGN VOCALIC L
11C38..11C3B ; Vowel_Dependent # Mn [4] BHAIKSUKI VOWEL SIGN E..BHAIKSUKI VOWEL SIGN AU
@@ -777,6 +781,8 @@ A926..A92A ; Vowel # Mn [5] KAYAH LI VOWEL UE..KAYAH LI VOWEL O
# Indic script layout (NBSP and dotted circle), as well as a few script-
# specific vowel-holder characters which are not technically
# consonants, but serve instead as bases for placement of vowel marks.
# Vowel carriers that are null consonants instead have the
# Indic_Syllabic_Category Consonant.

# [Not derivable]

@@ -787,7 +793,6 @@ A926..A92A ; Vowel # Mn [5] KAYAH LI VOWEL UE..KAYAH LI VOWEL O
0A72..0A73 ; Consonant_Placeholder # Lo [2] GURMUKHI IRI..GURMUKHI URA
104B ; Consonant_Placeholder # Po MYANMAR SIGN SECTION
104E ; Consonant_Placeholder # Po MYANMAR SYMBOL AFOREMENTIONED
1900 ; Consonant_Placeholder # Lo LIMBU VOWEL-CARRIER LETTER
1CFA ; Consonant_Placeholder # Lo VEDIC SIGN DOUBLE ANUSVARA ANTARGOMUKHA
2010..2014 ; Consonant_Placeholder # Pd [5] HYPHEN..EM DASH
25CC ; Consonant_Placeholder # So DOTTED CIRCLE
Expand All @@ -800,7 +805,14 @@ AA74..AA76 ; Consonant_Placeholder # Lo [3] MYANMAR LOGOGRAM KHAMTI OAY..MY

# Indic_Syllabic_Category=Consonant

# Consonant (ordinary abugida consonants, with inherent vowels)
# Consonant
# This includes ordinary abugida consonants with inherent vowels.
# In scripts that do not have distinct independent vowel characters, but instead
# form independent vowels by adding dependent vowels to a vowel carrier which
# otherwise represents the inherent vowel, that vowel carrier has the
# Indic_Syllabic_Category Consonant, as a null consonant. Such vowel carriers
# can often also be analyzed as glottal stops with inherent vowels.
# An example is U+0F68 TIBETAN LETTER A.

# [Not derivable]

@@ -878,7 +890,7 @@ AA74..AA76 ; Consonant_Placeholder # Lo [3] MYANMAR LOGOGRAM KHAMTI OAY..MY
1763..176C ; Consonant # Lo [10] TAGBANWA LETTER KA..TAGBANWA LETTER YA
176E..1770 ; Consonant # Lo [3] TAGBANWA LETTER LA..TAGBANWA LETTER SA
1780..17A2 ; Consonant # Lo [35] KHMER LETTER KA..KHMER LETTER QA
1901..191E ; Consonant # Lo [30] LIMBU LETTER KA..LIMBU LETTER TRA
1900..191E ; Consonant # Lo [31] LIMBU VOWEL-CARRIER LETTER..LIMBU LETTER TRA
1950..1962 ; Consonant # Lo [19] TAI LE LETTER KA..TAI LE LETTER NA
1980..19AB ; Consonant # Lo [44] NEW TAI LUE LETTER HIGH QA..NEW TAI LUE LETTER LOW SUA
1A00..1A16 ; Consonant # Lo [23] BUGINESE LETTER KA..BUGINESE LETTER HA
@@ -955,7 +967,9 @@ ABD2..ABDA ; Consonant # Lo [9] MEETEI MAYEK LETTER GOK..MEETEI MAYEK LETTE
11915..11916 ; Consonant # Lo [2] DIVES AKURU LETTER NYA..DIVES AKURU LETTER TTA
11918..1192F ; Consonant # Lo [24] DIVES AKURU LETTER DDA..DIVES AKURU LETTER ZA
119AE..119D0 ; Consonant # Lo [35] NANDINAGARI LETTER KA..NANDINAGARI LETTER RRA
11A00 ; Consonant # Lo ZANABAZAR SQUARE LETTER A
11A0B..11A32 ; Consonant # Lo [40] ZANABAZAR SQUARE LETTER KA..ZANABAZAR SQUARE LETTER KSSA
11A50 ; Consonant # Lo SOYOMBO LETTER A
11A5C..11A83 ; Consonant # Lo [40] SOYOMBO LETTER KA..SOYOMBO LETTER KSSA
11C0E..11C2E ; Consonant # Lo [33] BHAIKSUKI LETTER KA..BHAIKSUKI LETTER HA
11C72..11C8F ; Consonant # Lo [30] MARCHEN LETTER KA..MARCHEN LETTER A
@@ -985,50 +999,53 @@ ABD2..ABDA ; Consonant # Lo [9] MEETEI MAYEK LETTER GOK..MEETEI MAYEK LETTE

# Indic_Syllabic_Category=Consonant_With_Stacker

# Consonants that may make stacked ligatures with the next consonant
# without the use of a virama
# Consonants that may cause conjunct formation or consonant stacking with the
# next consonant, without the use of a stacker

# [Not derivable]

0CF1..0CF2 ; Consonant_With_Stacker # Lo [2] KANNADA SIGN JIHVAMULIYA..KANNADA SIGN UPADHMANIYA
1CF5..1CF6 ; Consonant_With_Stacker # Lo [2] VEDIC SIGN JIHVAMULIYA..VEDIC SIGN UPADHMANIYA
11003..11004 ; Consonant_With_Stacker # Lo [2] BRAHMI SIGN JIHVAMULIYA..BRAHMI SIGN UPADHMANIYA
11460..11461 ; Consonant_With_Stacker # Lo [2] NEWA SIGN JIHVAMULIYA..NEWA SIGN UPADHMANIYA
11A3A ; Consonant_With_Stacker # Lo ZANABAZAR SQUARE CLUSTER-INITIAL LETTER RA

# ================================================

# Indic_Syllabic_Category=Consonant_Prefixed

# Cluster-initial consonants
# Other consonants that behave like a Consonant_Preceding_Repha

# [Not derivable]

111C2..111C3 ; Consonant_Prefixed # Lo [2] SHARADA SIGN JIHVAMULIYA..SHARADA SIGN UPADHMANIYA
1193F ; Consonant_Prefixed # Lo DIVES AKURU PREFIXED NASAL SIGN
11A3A ; Consonant_Prefixed # Lo ZANABAZAR SQUARE CLUSTER-INITIAL LETTER RA
11A84..11A89 ; Consonant_Prefixed # Lo [6] SOYOMBO SIGN JIHVAMULIYA..SOYOMBO CLUSTER-INITIAL LETTER SA
11A84..11A85 ; Consonant_Prefixed # Lo [2] SOYOMBO SIGN JIHVAMULIYA..SOYOMBO SIGN UPADHMANIYA
11A87..11A89 ; Consonant_Prefixed # Lo [3] SOYOMBO CLUSTER-INITIAL LETTER LA..SOYOMBO CLUSTER-INITIAL LETTER SA

# ================================================

# Indic_Syllabic_Category=Consonant_Preceding_Repha

# Repha Form of RA (reanalyzed in some scripts), when preceding the main
# consonant.
# Cluster-initial "r" consonants in the form of a dependent sign (also known as
# "repha") that precede the base character in the encoding order, but are
# reordered in text rendering to be somewhere after the base. Reanalyzed in
# some orthographies to be a final consonant.

# [Not derivable]

0D4E ; Consonant_Preceding_Repha # Lo MALAYALAM LETTER DOT REPH
113D1 ; Consonant_Preceding_Repha # Lo TULU-TIGALARI REPHA
11941 ; Consonant_Preceding_Repha # Lo DIVES AKURU INITIAL RA
11A86 ; Consonant_Preceding_Repha # Lo SOYOMBO CLUSTER-INITIAL LETTER RA
11D46 ; Consonant_Preceding_Repha # Lo MASARAM GONDI REPHA
11F02 ; Consonant_Preceding_Repha # Lo KAWI SIGN REPHA

# ================================================

# Indic_Syllabic_Category=Consonant_Initial_Postfixed

# Consonants that succeed the main consonant in character sequences, but are
# pronounced before it.
# Other consonants that behave like a Consonant_Succeeding_Repha

# [Not derivable]

@@ -1038,7 +1055,9 @@ ABD2..ABDA ; Consonant # Lo [9] MEETEI MAYEK LETTER GOK..MEETEI MAYEK LETTE

# Indic_Syllabic_Category=Consonant_Succeeding_Repha

# Repha Form of RA (reanalyzed in some scripts), when succeeding the main
# Cluster-initial "r" consonants that behave like a Consonant_Preceding_Repha
# but succeed the base character in the encoding order, and are thus not
# reordered in text rendering. Reanalyzed in some orthographies to be a final
# consonant.

# [Not derivable]