Commit 2307d48

Hiragana and katakana digraphs (#885)
* UnicodeData.txt lines from L2/24-150 with code points from L2/24-165 §15.
* lb=ID like ゟ and ヿ
* Scripts according to the names
* Regenerate UCD
* Failing test, this will get tricky
* time to mess with the invariants language
* Allow strings in Propertywise tests
* spots
* stringAt
* A test that fails reasonably
* ea=W
* Regenerate UCD
* Regenerate UCD
* Ignore IDNA2008_Category
1 parent 9034a4e commit 2307d48

19 files changed, +115 -64 lines changed

unicodetools/data/ucd/dev/DerivedAge.txt

Lines changed: 3 additions & 2 deletions
@@ -1,5 +1,5 @@
 # DerivedAge-18.0.0.txt
-# Date: 2025-11-28, 13:23:52 GMT
+# Date: 2025-11-28, 15:46:32 GMT
 # © 2025 Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -2144,6 +2144,7 @@ FDC8..FDCE ; 17.0 # [7] ARABIC LIGATURE RAHIMAHU ALLAAH TAAALAA..ARABIC LIG
 18D1F..18D20 ; 18.0 # [2] TANGUT IDEOGRAPH-18D1F..TANGUT IDEOGRAPH-18D20
 18E00..19191 ; 18.0 # [914] JURCHEN CHARACTER-18E00..JURCHEN CHARACTER-19191
 191A0..191D2 ; 18.0 # [51] JURCHEN RADICAL-01..JURCHEN RADICAL-51
+1B123..1B125 ; 18.0 # [3] HIRAGANA DIGRAPH KOTO..KATAKANA DIGRAPH TOTE
 1B127..1B128 ; 18.0 # [2] KATAKANA LETTER ALTERNATE NE..KATAKANA LETTER ALTERNATE WI
 1B168 ; 18.0 # KATAKANA LETTER SMALL ARCHAIC YE
 1DF1F..1DF24 ; 18.0 # [6] LATIN SMALL LETTER D-ETH DIGRAPH..LATIN SMALL LETTER T-THETA DIGRAPH
@@ -2154,6 +2155,6 @@ FDC8..FDCE ; 17.0 # [7] ARABIC LIGATURE RAHIMAHU ALLAAH TAAALAA..ARABIC LIG
 2B81E ; 18.0 # CJK UNIFIED IDEOGRAPH-2B81E
 3D000..3FC3F ; 18.0 # [11328] SEAL CHARACTER-3D000..SEAL CHARACTER-3FC3F

-# Total code points: 12828
+# Total code points: 12831

 # EOF
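
The totals in the derived files move by exactly the size of the added range. As a minimal sketch (the helper name `count_code_points` and the regular expression are illustrative assumptions, not part of unicodetools), the count covered by a UCD data line can be recomputed from its range field and checked against the change in the `# Total code points` line:

```python
import re

# Matches "XXXX" or "XXXX..YYYY" followed by ";" at the start of a UCD data line.
RANGE_RE = re.compile(r"^([0-9A-F]{4,6})(?:\.\.([0-9A-F]{4,6}))?\s*;")


def count_code_points(line: str) -> int:
    """Return how many code points a UCD data line covers (0 for comments or blanks)."""
    match = RANGE_RE.match(line)
    if match is None:
        return 0
    first = int(match.group(1), 16)
    last = int(match.group(2), 16) if match.group(2) else first
    return last - first + 1


# Line added to DerivedAge.txt by this commit, and the totals before and after.
added = "1B123..1B125 ; 18.0 # [3] HIRAGANA DIGRAPH KOTO..KATAKANA DIGRAPH TOTE"
old_total, new_total = 12828, 12831

assert count_code_points(added) == new_total - old_total
```

The same check applies to the other derived files in this commit, where `1B000..1B122` grows to `1B000..1B125` and each total rises by 3.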

unicodetools/data/ucd/dev/DerivedCoreProperties.txt

Lines changed: 13 additions & 13 deletions
@@ -1,5 +1,5 @@
 # DerivedCoreProperties-18.0.0.txt
-# Date: 2025-11-28, 13:24:15 GMT
+# Date: 2025-11-28, 15:46:55 GMT
 # © 2025 Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -1352,7 +1352,7 @@ FFDA..FFDC ; Alphabetic # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANG
 1AFF0..1AFF3 ; Alphabetic # Lm [4] KATAKANA LETTER MINNAN TONE-2..KATAKANA LETTER MINNAN TONE-5
 1AFF5..1AFFB ; Alphabetic # Lm [7] KATAKANA LETTER MINNAN TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-5
 1AFFD..1AFFE ; Alphabetic # Lm [2] KATAKANA LETTER MINNAN NASALIZED TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-8
-1B000..1B122 ; Alphabetic # Lo [291] KATAKANA LETTER ARCHAIC E..KATAKANA LETTER ARCHAIC WU
+1B000..1B125 ; Alphabetic # Lo [294] KATAKANA LETTER ARCHAIC E..KATAKANA DIGRAPH TOTE
 1B127..1B128 ; Alphabetic # Lo [2] KATAKANA LETTER ALTERNATE NE..KATAKANA LETTER ALTERNATE WI
 1B132 ; Alphabetic # Lo HIRAGANA LETTER SMALL KO
 1B150..1B152 ; Alphabetic # Lo [3] HIRAGANA LETTER SMALL WI..HIRAGANA LETTER SMALL WO
@@ -1479,7 +1479,7 @@ FFDA..FFDC ; Alphabetic # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANG
 31350..33479 ; Alphabetic # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479
 3D000..3FC3F ; Alphabetic # Lo [11328] SEAL CHARACTER-3D000..SEAL CHARACTER-3FC3F

-# Total code points: 160212
+# Total code points: 160215

 # ================================================

@@ -6992,7 +6992,7 @@ FFDA..FFDC ; ID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
 1AFF0..1AFF3 ; ID_Start # Lm [4] KATAKANA LETTER MINNAN TONE-2..KATAKANA LETTER MINNAN TONE-5
 1AFF5..1AFFB ; ID_Start # Lm [7] KATAKANA LETTER MINNAN TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-5
 1AFFD..1AFFE ; ID_Start # Lm [2] KATAKANA LETTER MINNAN NASALIZED TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-8
-1B000..1B122 ; ID_Start # Lo [291] KATAKANA LETTER ARCHAIC E..KATAKANA LETTER ARCHAIC WU
+1B000..1B125 ; ID_Start # Lo [294] KATAKANA LETTER ARCHAIC E..KATAKANA DIGRAPH TOTE
 1B127..1B128 ; ID_Start # Lo [2] KATAKANA LETTER ALTERNATE NE..KATAKANA LETTER ALTERNATE WI
 1B132 ; ID_Start # Lo HIRAGANA LETTER SMALL KO
 1B150..1B152 ; ID_Start # Lo [3] HIRAGANA LETTER SMALL WI..HIRAGANA LETTER SMALL WO
@@ -7104,7 +7104,7 @@ FFDA..FFDC ; ID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
 31350..33479 ; ID_Start # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479
 3D000..3FC3F ; ID_Start # Lo [11328] SEAL CHARACTER-3D000..SEAL CHARACTER-3FC3F

-# Total code points: 158704
+# Total code points: 158707

 # ================================================

@@ -8397,7 +8397,7 @@ FFDA..FFDC ; ID_Continue # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HAN
 1AFF0..1AFF3 ; ID_Continue # Lm [4] KATAKANA LETTER MINNAN TONE-2..KATAKANA LETTER MINNAN TONE-5
 1AFF5..1AFFB ; ID_Continue # Lm [7] KATAKANA LETTER MINNAN TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-5
 1AFFD..1AFFE ; ID_Continue # Lm [2] KATAKANA LETTER MINNAN NASALIZED TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-8
-1B000..1B122 ; ID_Continue # Lo [291] KATAKANA LETTER ARCHAIC E..KATAKANA LETTER ARCHAIC WU
+1B000..1B125 ; ID_Continue # Lo [294] KATAKANA LETTER ARCHAIC E..KATAKANA DIGRAPH TOTE
 1B127..1B128 ; ID_Continue # Lo [2] KATAKANA LETTER ALTERNATE NE..KATAKANA LETTER ALTERNATE WI
 1B132 ; ID_Continue # Lo HIRAGANA LETTER SMALL KO
 1B150..1B152 ; ID_Continue # Lo [3] HIRAGANA LETTER SMALL WI..HIRAGANA LETTER SMALL WO
@@ -8551,7 +8551,7 @@ FFDA..FFDC ; ID_Continue # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HAN
 3D000..3FC3F ; ID_Continue # Lo [11328] SEAL CHARACTER-3D000..SEAL CHARACTER-3FC3F
 E0100..E01EF ; ID_Continue # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256

-# Total code points: 162050
+# Total code points: 162053

 # ================================================

@@ -9241,7 +9241,7 @@ FFDA..FFDC ; XID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGU
 1AFF0..1AFF3 ; XID_Start # Lm [4] KATAKANA LETTER MINNAN TONE-2..KATAKANA LETTER MINNAN TONE-5
 1AFF5..1AFFB ; XID_Start # Lm [7] KATAKANA LETTER MINNAN TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-5
 1AFFD..1AFFE ; XID_Start # Lm [2] KATAKANA LETTER MINNAN NASALIZED TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-8
-1B000..1B122 ; XID_Start # Lo [291] KATAKANA LETTER ARCHAIC E..KATAKANA LETTER ARCHAIC WU
+1B000..1B125 ; XID_Start # Lo [294] KATAKANA LETTER ARCHAIC E..KATAKANA DIGRAPH TOTE
 1B127..1B128 ; XID_Start # Lo [2] KATAKANA LETTER ALTERNATE NE..KATAKANA LETTER ALTERNATE WI
 1B132 ; XID_Start # Lo HIRAGANA LETTER SMALL KO
 1B150..1B152 ; XID_Start # Lo [3] HIRAGANA LETTER SMALL WI..HIRAGANA LETTER SMALL WO
@@ -9353,7 +9353,7 @@ FFDA..FFDC ; XID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGU
 31350..33479 ; XID_Start # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479
 3D000..3FC3F ; XID_Start # Lo [11328] SEAL CHARACTER-3D000..SEAL CHARACTER-3FC3F

-# Total code points: 158681
+# Total code points: 158684

 # ================================================

@@ -10647,7 +10647,7 @@ FFDA..FFDC ; XID_Continue # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HA
 1AFF0..1AFF3 ; XID_Continue # Lm [4] KATAKANA LETTER MINNAN TONE-2..KATAKANA LETTER MINNAN TONE-5
 1AFF5..1AFFB ; XID_Continue # Lm [7] KATAKANA LETTER MINNAN TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-5
 1AFFD..1AFFE ; XID_Continue # Lm [2] KATAKANA LETTER MINNAN NASALIZED TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-8
-1B000..1B122 ; XID_Continue # Lo [291] KATAKANA LETTER ARCHAIC E..KATAKANA LETTER ARCHAIC WU
+1B000..1B125 ; XID_Continue # Lo [294] KATAKANA LETTER ARCHAIC E..KATAKANA DIGRAPH TOTE
 1B127..1B128 ; XID_Continue # Lo [2] KATAKANA LETTER ALTERNATE NE..KATAKANA LETTER ALTERNATE WI
 1B132 ; XID_Continue # Lo HIRAGANA LETTER SMALL KO
 1B150..1B152 ; XID_Continue # Lo [3] HIRAGANA LETTER SMALL WI..HIRAGANA LETTER SMALL WO
@@ -10801,7 +10801,7 @@ FFDA..FFDC ; XID_Continue # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HA
 3D000..3FC3F ; XID_Continue # Lo [11328] SEAL CHARACTER-3D000..SEAL CHARACTER-3FC3F
 E0100..E01EF ; XID_Continue # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256

-# Total code points: 162031
+# Total code points: 162034

 # ================================================

@@ -12891,7 +12891,7 @@ FFFC..FFFD ; Grapheme_Base # So [2] OBJECT REPLACEMENT CHARACTER..REPLACEME
 1AFF0..1AFF3 ; Grapheme_Base # Lm [4] KATAKANA LETTER MINNAN TONE-2..KATAKANA LETTER MINNAN TONE-5
 1AFF5..1AFFB ; Grapheme_Base # Lm [7] KATAKANA LETTER MINNAN TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-5
 1AFFD..1AFFE ; Grapheme_Base # Lm [2] KATAKANA LETTER MINNAN NASALIZED TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-8
-1B000..1B122 ; Grapheme_Base # Lo [291] KATAKANA LETTER ARCHAIC E..KATAKANA LETTER ARCHAIC WU
+1B000..1B125 ; Grapheme_Base # Lo [294] KATAKANA LETTER ARCHAIC E..KATAKANA DIGRAPH TOTE
 1B127..1B128 ; Grapheme_Base # Lo [2] KATAKANA LETTER ALTERNATE NE..KATAKANA LETTER ALTERNATE WI
 1B132 ; Grapheme_Base # Lo HIRAGANA LETTER SMALL KO
 1B150..1B152 ; Grapheme_Base # Lo [3] HIRAGANA LETTER SMALL WI..HIRAGANA LETTER SMALL WO
@@ -13104,7 +13104,7 @@ FFFC..FFFD ; Grapheme_Base # So [2] OBJECT REPLACEMENT CHARACTER..REPLACEME
 31350..33479 ; Grapheme_Base # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479
 3D000..3FC3F ; Grapheme_Base # Lo [11328] SEAL CHARACTER-3D000..SEAL CHARACTER-3FC3F

-# Total code points: 170310
+# Total code points: 170313

 # ================================================

unicodetools/data/ucd/dev/DerivedNormalizationProps.txt

Lines changed: 19 additions & 8 deletions
@@ -1,5 +1,5 @@
 # DerivedNormalizationProps-18.0.0.txt
-# Date: 2025-11-27, 17:33:31 GMT
+# Date: 2025-11-28, 15:46:59 GMT
 # © 2025 Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -1665,6 +1665,7 @@ FFED..FFEE ; NFKD_QC; N # So [2] HALFWIDTH BLACK SQUARE..HALFWIDTH WHITE CI
 11938 ; NFKD_QC; N # Mc DIVES AKURU VOWEL SIGN O
 16121..16128 ; NFKD_QC; N # Mn [8] GURUNG KHEMA VOWEL SIGN U..GURUNG KHEMA VOWEL SIGN AU
 16D68..16D6A ; NFKD_QC; N # Lo [3] KIRAT RAI VOWEL SIGN AI..KIRAT RAI VOWEL SIGN AU
+1B123..1B125 ; NFKD_QC; N # Lo [3] HIRAGANA DIGRAPH KOTO..KATAKANA DIGRAPH TOTE
 1CCD6..1CCEF ; NFKD_QC; N # So [26] OUTLINED LATIN CAPITAL LETTER A..OUTLINED LATIN CAPITAL LETTER Z
 1CCF0..1CCF9 ; NFKD_QC; N # Nd [10] OUTLINED DIGIT ZERO..OUTLINED DIGIT NINE
 1D15E..1D164 ; NFKD_QC; N # So [7] MUSICAL SYMBOL HALF NOTE..MUSICAL SYMBOL ONE HUNDRED TWENTY-EIGHTH NOTE
@@ -1757,7 +1758,7 @@ FFED..FFEE ; NFKD_QC; N # So [2] HALFWIDTH BLACK SQUARE..HALFWIDTH WHITE CI
 1FBF0..1FBF9 ; NFKD_QC; N # Nd [10] SEGMENTED DIGIT ZERO..SEGMENTED DIGIT NINE
 2F800..2FA1D ; NFKD_QC; N # Lo [542] CJK COMPATIBILITY IDEOGRAPH-2F800..CJK COMPATIBILITY IDEOGRAPH-2FA1D

-# Total code points: 17145
+# Total code points: 17148

 # ================================================

@@ -2079,6 +2080,7 @@ FFED..FFEE ; NFKC_QC; N # So [2] HALFWIDTH BLACK SQUARE..HALFWIDTH WHITE CI
 10781..10785 ; NFKC_QC; N # Lm [5] MODIFIER LETTER SUPERSCRIPT TRIANGULAR COLON..MODIFIER LETTER SMALL B WITH HOOK
 10787..107B0 ; NFKC_QC; N # Lm [42] MODIFIER LETTER SMALL DZ DIGRAPH..MODIFIER LETTER SMALL V WITH RIGHT HOOK
 107B2..107BF ; NFKC_QC; N # Lm [14] MODIFIER LETTER SMALL CAPITAL Y..MODIFIER LETTER SMALL ESH WITH DOUBLE BAR
+1B123..1B125 ; NFKC_QC; N # Lo [3] HIRAGANA DIGRAPH KOTO..KATAKANA DIGRAPH TOTE
 1CCD6..1CCEF ; NFKC_QC; N # So [26] OUTLINED LATIN CAPITAL LETTER A..OUTLINED LATIN CAPITAL LETTER Z
 1CCF0..1CCF9 ; NFKC_QC; N # Nd [10] OUTLINED DIGIT ZERO..OUTLINED DIGIT NINE
 1D15E..1D164 ; NFKC_QC; N # So [7] MUSICAL SYMBOL HALF NOTE..MUSICAL SYMBOL ONE HUNDRED TWENTY-EIGHTH NOTE
@@ -2171,7 +2173,7 @@ FFED..FFEE ; NFKC_QC; N # So [2] HALFWIDTH BLACK SQUARE..HALFWIDTH WHITE CI
 1FBF0..1FBF9 ; NFKC_QC; N # Nd [10] SEGMENTED DIGIT ZERO..SEGMENTED DIGIT NINE
 2F800..2FA1D ; NFKC_QC; N # Lo [542] CJK COMPATIBILITY IDEOGRAPH-2F800..CJK COMPATIBILITY IDEOGRAPH-2FA1D

-# Total code points: 5024
+# Total code points: 5027

 # ================================================

@@ -2833,6 +2835,7 @@ FFE3 ; Expands_On_NFKD # Sk FULLWIDTH MACRON
 11938 ; Expands_On_NFKD # Mc DIVES AKURU VOWEL SIGN O
 16121..16128 ; Expands_On_NFKD # Mn [8] GURUNG KHEMA VOWEL SIGN U..GURUNG KHEMA VOWEL SIGN AU
 16D68..16D6A ; Expands_On_NFKD # Lo [3] KIRAT RAI VOWEL SIGN AI..KIRAT RAI VOWEL SIGN AU
+1B123..1B125 ; Expands_On_NFKD # Lo [3] HIRAGANA DIGRAPH KOTO..KATAKANA DIGRAPH TOTE
 1D15E..1D164 ; Expands_On_NFKD # So [7] MUSICAL SYMBOL HALF NOTE..MUSICAL SYMBOL ONE HUNDRED TWENTY-EIGHTH NOTE
 1D1BB..1D1C0 ; Expands_On_NFKD # So [6] MUSICAL SYMBOL MINIMA..MUSICAL SYMBOL FUSA BLACK
 1F100..1F10A ; Expands_On_NFKD # No [11] DIGIT ZERO FULL STOP..DIGIT NINE COMMA
@@ -2845,7 +2848,7 @@ FFE3 ; Expands_On_NFKD # Sk FULLWIDTH MACRON
 1F213 ; Expands_On_NFKD # So SQUARED KATAKANA DE
 1F240..1F248 ; Expands_On_NFKD # So [9] TORTOISE SHELL BRACKETED CJK UNIFIED IDEOGRAPH-672C..TORTOISE SHELL BRACKETED CJK UNIFIED IDEOGRAPH-6557

-# Total code points: 13410
+# Total code points: 13413

 # ================================================

@@ -2972,6 +2975,7 @@ FE74 ; Expands_On_NFKC # Lo ARABIC KASRATAN ISOLATED FORM
 FE76..FE7F ; Expands_On_NFKC # Lo [10] ARABIC FATHA ISOLATED FORM..ARABIC SUKUN MEDIAL FORM
 FEF5..FEFC ; Expands_On_NFKC # Lo [8] ARABIC LIGATURE LAM WITH ALEF WITH MADDA ABOVE ISOLATED FORM..ARABIC LIGATURE LAM WITH ALEF FINAL FORM
 FFE3 ; Expands_On_NFKC # Sk FULLWIDTH MACRON
+1B123..1B125 ; Expands_On_NFKC # Lo [3] HIRAGANA DIGRAPH KOTO..KATAKANA DIGRAPH TOTE
 1D15E..1D164 ; Expands_On_NFKC # So [7] MUSICAL SYMBOL HALF NOTE..MUSICAL SYMBOL ONE HUNDRED TWENTY-EIGHTH NOTE
 1D1BB..1D1C0 ; Expands_On_NFKC # So [6] MUSICAL SYMBOL MINIMA..MUSICAL SYMBOL FUSA BLACK
 1F100..1F10A ; Expands_On_NFKC # No [11] DIGIT ZERO FULL STOP..DIGIT NINE COMMA
@@ -2983,7 +2987,7 @@ FFE3 ; Expands_On_NFKC # Sk FULLWIDTH MACRON
 1F200..1F201 ; Expands_On_NFKC # So [2] SQUARE HIRAGANA HOKA..SQUARED KATAKANA KOKO
 1F240..1F248 ; Expands_On_NFKC # So [9] TORTOISE SHELL BRACKETED CJK UNIFIED IDEOGRAPH-672C..TORTOISE SHELL BRACKETED CJK UNIFIED IDEOGRAPH-6557

-# Total code points: 1237
+# Total code points: 1240

 # ================================================

@@ -7231,6 +7235,9 @@ FFF0..FFF8 ; NFKC_CF; # Cn [9] <reserved-FFF0>..<reserved-FF
 16EB6 ; NFKC_CF; 16ED1 # L& BERIA ERFE CAPITAL LETTER UI
 16EB7 ; NFKC_CF; 16ED2 # L& BERIA ERFE CAPITAL LETTER WASSE
 16EB8 ; NFKC_CF; 16ED3 # L& BERIA ERFE CAPITAL LETTER AY
+1B123 ; NFKC_CF; 3053 3068 # Lo HIRAGANA DIGRAPH KOTO
+1B124 ; NFKC_CF; 30C8 30AD # Lo KATAKANA DIGRAPH TOKI
+1B125 ; NFKC_CF; 30C8 30C6 # Lo KATAKANA DIGRAPH TOTE
 1BCA0..1BCA3 ; NFKC_CF; # Cf [4] SHORTHAND FORMAT LETTER OVERLAP..SHORTHAND FORMAT UP STEP
 1CCD6 ; NFKC_CF; 0061 # So OUTLINED LATIN CAPITAL LETTER A
 1CCD7 ; NFKC_CF; 0062 # So OUTLINED LATIN CAPITAL LETTER B
@@ -9248,7 +9255,7 @@ E0080..E00FF ; NFKC_CF; # Cn [128] <reserved-E0080>..<reserved-E
 E0100..E01EF ; NFKC_CF; # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256
 E01F0..E0FFF ; NFKC_CF; # Cn [3600] <reserved-E01F0>..<reserved-E0FFF>

-# Total code points: 10647
+# Total code points: 10650

 # ================================================

@@ -13458,6 +13465,9 @@ FFF0..FFF8 ; NFKC_SCF; # Cn [9] <reserved-FFF0>..<reserved-F
 16EB6 ; NFKC_SCF; 16ED1 # L& BERIA ERFE CAPITAL LETTER UI
 16EB7 ; NFKC_SCF; 16ED2 # L& BERIA ERFE CAPITAL LETTER WASSE
 16EB8 ; NFKC_SCF; 16ED3 # L& BERIA ERFE CAPITAL LETTER AY
+1B123 ; NFKC_SCF; 3053 3068 # Lo HIRAGANA DIGRAPH KOTO
+1B124 ; NFKC_SCF; 30C8 30AD # Lo KATAKANA DIGRAPH TOKI
+1B125 ; NFKC_SCF; 30C8 30C6 # Lo KATAKANA DIGRAPH TOTE
 1BCA0..1BCA3 ; NFKC_SCF; # Cf [4] SHORTHAND FORMAT LETTER OVERLAP..SHORTHAND FORMAT UP STEP
 1CCD6 ; NFKC_SCF; 0061 # So OUTLINED LATIN CAPITAL LETTER A
 1CCD7 ; NFKC_SCF; 0062 # So OUTLINED LATIN CAPITAL LETTER B
@@ -15475,7 +15485,7 @@ E0080..E00FF ; NFKC_SCF; # Cn [128] <reserved-E0080>..<reserved-
 E0100..E01EF ; NFKC_SCF; # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256
 E01F0..E0FFF ; NFKC_SCF; # Cn [3600] <reserved-E01F0>..<reserved-E0FFF>

-# Total code points: 10609
+# Total code points: 10612

 # ================================================

@@ -16398,6 +16408,7 @@ FFF0..FFF8 ; Changes_When_NFKC_Casefolded # Cn [9] <reserved-FFF0>..<reserv
 118A0..118BF ; Changes_When_NFKC_Casefolded # L& [32] WARANG CITI CAPITAL LETTER NGAA..WARANG CITI CAPITAL LETTER VIYO
 16E40..16E5F ; Changes_When_NFKC_Casefolded # L& [32] MEDEFAIDRIN CAPITAL LETTER M..MEDEFAIDRIN CAPITAL LETTER Y
 16EA0..16EB8 ; Changes_When_NFKC_Casefolded # L& [25] BERIA ERFE CAPITAL LETTER ARKAB..BERIA ERFE CAPITAL LETTER AY
+1B123..1B125 ; Changes_When_NFKC_Casefolded # Lo [3] HIRAGANA DIGRAPH KOTO..KATAKANA DIGRAPH TOTE
 1BCA0..1BCA3 ; Changes_When_NFKC_Casefolded # Cf [4] SHORTHAND FORMAT LETTER OVERLAP..SHORTHAND FORMAT UP STEP
 1CCD6..1CCEF ; Changes_When_NFKC_Casefolded # So [26] OUTLINED LATIN CAPITAL LETTER A..OUTLINED LATIN CAPITAL LETTER Z
 1CCF0..1CCF9 ; Changes_When_NFKC_Casefolded # Nd [10] OUTLINED DIGIT ZERO..OUTLINED DIGIT NINE
@@ -16505,6 +16516,6 @@ E0080..E00FF ; Changes_When_NFKC_Casefolded # Cn [128] <reserved-E0080>..<reser
 E0100..E01EF ; Changes_When_NFKC_Casefolded # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256
 E01F0..E0FFF ; Changes_When_NFKC_Casefolded # Cn [3600] <reserved-E01F0>..<reserved-E0FFF>

-# Total code points: 10647
+# Total code points: 10650

 # EOF
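
To make the new `NFKC_CF` entries easier to read, here is a small sketch that decodes the mapping field of the three added lines into the kana sequences they fold to. The function name `decode_mapping` is a hypothetical helper for illustration. It parses the data lines directly rather than calling `unicodedata.normalize`, since a Python build whose Unicode tables predate these characters would not know them.

```python
def decode_mapping(line: str) -> tuple[str, str]:
    """Split a 'code ; property; mapping # comment' line into (source char, mapped string)."""
    data = line.split("#", 1)[0]  # drop the trailing comment
    code, _prop, mapping = (field.strip() for field in data.split(";"))
    source = chr(int(code, 16))
    target = "".join(chr(int(cp, 16)) for cp in mapping.split())
    return source, target


# The three NFKC_CF lines added by this commit.
lines = [
    "1B123 ; NFKC_CF; 3053 3068 # Lo HIRAGANA DIGRAPH KOTO",
    "1B124 ; NFKC_CF; 30C8 30AD # Lo KATAKANA DIGRAPH TOKI",
    "1B125 ; NFKC_CF; 30C8 30C6 # Lo KATAKANA DIGRAPH TOTE",
]

for line in lines:
    source, target = decode_mapping(line)
    print(f"U+{ord(source):04X} -> {target}")
# U+1B123 -> こと
# U+1B124 -> トキ
# U+1B125 -> トテ
```

Each digraph maps to two code points, which is consistent with the same range being added to `Expands_On_NFKD` and `Expands_On_NFKC` and marked `N` for `NFKD_QC` and `NFKC_QC`.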

unicodetools/data/ucd/dev/EastAsianWidth.txt

Lines changed: 2 additions & 2 deletions
@@ -1,5 +1,5 @@
 # EastAsianWidth-18.0.0.txt
-# Date: 2025-11-28, 13:24:22 GMT
+# Date: 2025-11-28, 15:47:02 GMT
 # © 2025 Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -2400,7 +2400,7 @@ FFFD ; A # So REPLACEMENT CHARACTER
 1AFF5..1AFFB ; W # Lm [7] KATAKANA LETTER MINNAN TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-5
 1AFFD..1AFFE ; W # Lm [2] KATAKANA LETTER MINNAN NASALIZED TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-8
 1B000..1B0FF ; W # Lo [256] KATAKANA LETTER ARCHAIC E..HENTAIGANA LETTER RE-2
-1B100..1B122 ; W # Lo [35] HENTAIGANA LETTER RE-3..KATAKANA LETTER ARCHAIC WU
+1B100..1B125 ; W # Lo [38] HENTAIGANA LETTER RE-3..KATAKANA DIGRAPH TOTE
 1B127..1B128 ; W # Lo [2] KATAKANA LETTER ALTERNATE NE..KATAKANA LETTER ALTERNATE WI
 1B132 ; W # Lo HIRAGANA LETTER SMALL KO
 1B150..1B152 ; W # Lo [3] HIRAGANA LETTER SMALL WI..HIRAGANA LETTER SMALL WO

unicodetools/data/ucd/dev/LineBreak.txt

Lines changed: 2 additions & 2 deletions
@@ -1,5 +1,5 @@
 # LineBreak-18.0.0.txt
-# Date: 2025-11-28, 13:24:23 GMT
+# Date: 2025-11-28, 15:47:03 GMT
 # © 2025 Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -3313,7 +3313,7 @@ FFFD ; AI # So REPLACEMENT CHARACTER
 1AFF5..1AFFB ; AL # Lm [7] KATAKANA LETTER MINNAN TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-5
 1AFFD..1AFFE ; AL # Lm [2] KATAKANA LETTER MINNAN NASALIZED TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-8
 1B000..1B0FF ; ID # Lo [256] KATAKANA LETTER ARCHAIC E..HENTAIGANA LETTER RE-2
-1B100..1B122 ; ID # Lo [35] HENTAIGANA LETTER RE-3..KATAKANA LETTER ARCHAIC WU
+1B100..1B125 ; ID # Lo [38] HENTAIGANA LETTER RE-3..KATAKANA DIGRAPH TOTE
 1B127..1B128 ; ID # Lo [2] KATAKANA LETTER ALTERNATE NE..KATAKANA LETTER ALTERNATE WI
 1B132 ; CJ # Lo HIRAGANA LETTER SMALL KO
 1B150..1B152 ; CJ # Lo [3] HIRAGANA LETTER SMALL WI..HIRAGANA LETTER SMALL WO
