InSC and InPC changes for 17.0β #1130

Merged
merged 5 commits into from
May 8, 2025
1 change: 0 additions & 1 deletion unicodetools/data/ucd/dev/IndicPositionalCategory.txt
@@ -591,7 +591,6 @@ ABE5 ; Top # Mn MEETEI MAYEK VOWEL SIGN ANAP
11A01 ; Top # Mn ZANABAZAR SQUARE VOWEL SIGN I
11A04..11A09 ; Top # Mn [6] ZANABAZAR SQUARE VOWEL SIGN E..ZANABAZAR SQUARE VOWEL SIGN REVERSED I
11A35..11A38 ; Top # Mn [4] ZANABAZAR SQUARE SIGN CANDRABINDU..ZANABAZAR SQUARE SIGN ANUSVARA
11A3A ; Top # Lo ZANABAZAR SQUARE CLUSTER-INITIAL LETTER RA
11A51 ; Top # Mn SOYOMBO VOWEL SIGN I
11A54..11A56 ; Top # Mn [3] SOYOMBO VOWEL SIGN E..SOYOMBO VOWEL SIGN OE
11A84..11A89 ; Top # Lo [6] SOYOMBO SIGN JIHVAMULIYA..SOYOMBO CLUSTER-INITIAL LETTER SA
26 changes: 18 additions & 8 deletions unicodetools/data/ucd/dev/IndicSyllabicCategory.txt
@@ -1,5 +1,5 @@
# IndicSyllabicCategory-17.0.0.txt
# Date: 2025-01-27, 18:09:16 GMT
# Date: 2025-05-08, 22:20:16 GMT
# © 2025 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -473,8 +473,6 @@ ABD1 ; Vowel_Independent # Lo MEETEI MAYEK LETTER ATIYA
11909 ; Vowel_Independent # Lo DIVES AKURU LETTER O
119A0..119A7 ; Vowel_Independent # Lo [8] NANDINAGARI LETTER A..NANDINAGARI LETTER VOCALIC RR
119AA..119AD ; Vowel_Independent # Lo [4] NANDINAGARI LETTER E..NANDINAGARI LETTER AU
11A00 ; Vowel_Independent # Lo ZANABAZAR SQUARE LETTER A
11A50 ; Vowel_Independent # Lo SOYOMBO LETTER A
11C00..11C08 ; Vowel_Independent # Lo [9] BHAIKSUKI LETTER A..BHAIKSUKI LETTER VOCALIC L
11C0A..11C0D ; Vowel_Independent # Lo [4] BHAIKSUKI LETTER E..BHAIKSUKI LETTER AU
11D00..11D06 ; Vowel_Independent # Lo [7] MASARAM GONDI LETTER A..MASARAM GONDI LETTER E
@@ -791,6 +789,8 @@ A926..A92A ; Vowel # Mn [5] KAYAH LI VOWEL UE..KAYAH LI VOWEL O
# Indic script layout (NBSP and dotted circle), as well as a few script-
# specific vowel-holder characters which are not technically
# consonants, but serve instead as bases for placement of vowel marks.
# Vowel carriers that are null consonants instead have the
# Indic_Syllabic_Category Consonant.

# [Not derivable]

@@ -801,7 +801,6 @@ A926..A92A ; Vowel # Mn [5] KAYAH LI VOWEL UE..KAYAH LI VOWEL O
0A72..0A73 ; Consonant_Placeholder # Lo [2] GURMUKHI IRI..GURMUKHI URA
104B ; Consonant_Placeholder # Po MYANMAR SIGN SECTION
104E ; Consonant_Placeholder # Po MYANMAR SYMBOL AFOREMENTIONED
1900 ; Consonant_Placeholder # Lo LIMBU VOWEL-CARRIER LETTER
1CFA ; Consonant_Placeholder # Lo VEDIC SIGN DOUBLE ANUSVARA ANTARGOMUKHA
2010..2014 ; Consonant_Placeholder # Pd [5] HYPHEN..EM DASH
25CC ; Consonant_Placeholder # So DOTTED CIRCLE
@@ -814,7 +813,14 @@ AA74..AA76 ; Consonant_Placeholder # Lo [3] MYANMAR LOGOGRAM KHAMTI OAY..MY

# Indic_Syllabic_Category=Consonant

# Consonant (ordinary abugida consonants, with inherent vowels)
# Consonant
# This includes ordinary abugida consonants with inherent vowels.
# In scripts that do not have distinct independent vowel characters, but instead
# form independent vowels by adding dependent vowels to a vowel carrier which
# otherwise represents the inherent vowel, that vowel carrier has the
# Indic_Syllabic_Category Consonant, as a null consonant. Such vowel carriers
# can often also be analyzed as glottal stops with inherent vowels.
# An example is U+0F68 ཨ TIBETAN LETTER A.

# [Not derivable]

@@ -893,7 +899,7 @@ AA74..AA76 ; Consonant_Placeholder # Lo [3] MYANMAR LOGOGRAM KHAMTI OAY..MY
1763..176C ; Consonant # Lo [10] TAGBANWA LETTER KA..TAGBANWA LETTER YA
176E..1770 ; Consonant # Lo [3] TAGBANWA LETTER LA..TAGBANWA LETTER SA
1780..17A2 ; Consonant # Lo [35] KHMER LETTER KA..KHMER LETTER QA
1901..191E ; Consonant # Lo [30] LIMBU LETTER KA..LIMBU LETTER TRA
1900..191E ; Consonant # Lo [31] LIMBU VOWEL-CARRIER LETTER..LIMBU LETTER TRA
1950..1962 ; Consonant # Lo [19] TAI LE LETTER KA..TAI LE LETTER NA
1980..19AB ; Consonant # Lo [44] NEW TAI LUE LETTER HIGH QA..NEW TAI LUE LETTER LOW SUA
1A00..1A16 ; Consonant # Lo [23] BUGINESE LETTER KA..BUGINESE LETTER HA
@@ -970,7 +976,9 @@ ABD2..ABDA ; Consonant # Lo [9] MEETEI MAYEK LETTER GOK..MEETEI MAYEK LETTE
11915..11916 ; Consonant # Lo [2] DIVES AKURU LETTER NYA..DIVES AKURU LETTER TTA
11918..1192F ; Consonant # Lo [24] DIVES AKURU LETTER DDA..DIVES AKURU LETTER ZA
119AE..119D0 ; Consonant # Lo [35] NANDINAGARI LETTER KA..NANDINAGARI LETTER RRA
11A00 ; Consonant # Lo ZANABAZAR SQUARE LETTER A
11A0B..11A32 ; Consonant # Lo [40] ZANABAZAR SQUARE LETTER KA..ZANABAZAR SQUARE LETTER KSSA
11A50 ; Consonant # Lo SOYOMBO LETTER A
11A5C..11A83 ; Consonant # Lo [40] SOYOMBO LETTER KA..SOYOMBO LETTER KSSA
11C0E..11C2E ; Consonant # Lo [33] BHAIKSUKI LETTER KA..BHAIKSUKI LETTER HA
11C72..11C8F ; Consonant # Lo [30] MARCHEN LETTER KA..MARCHEN LETTER A
@@ -1016,6 +1024,7 @@ ABD2..ABDA ; Consonant # Lo [9] MEETEI MAYEK LETTER GOK..MEETEI MAYEK LETTE
1CF5..1CF6 ; Consonant_With_Stacker # Lo [2] VEDIC SIGN JIHVAMULIYA..VEDIC SIGN UPADHMANIYA
11003..11004 ; Consonant_With_Stacker # Lo [2] BRAHMI SIGN JIHVAMULIYA..BRAHMI SIGN UPADHMANIYA
11460..11461 ; Consonant_With_Stacker # Lo [2] NEWA SIGN JIHVAMULIYA..NEWA SIGN UPADHMANIYA
11A3A ; Consonant_With_Stacker # Lo ZANABAZAR SQUARE CLUSTER-INITIAL LETTER RA

# ================================================

@@ -1027,8 +1036,8 @@ ABD2..ABDA ; Consonant # Lo [9] MEETEI MAYEK LETTER GOK..MEETEI MAYEK LETTE

111C2..111C3 ; Consonant_Prefixed # Lo [2] SHARADA SIGN JIHVAMULIYA..SHARADA SIGN UPADHMANIYA
1193F ; Consonant_Prefixed # Lo DIVES AKURU PREFIXED NASAL SIGN
11A3A ; Consonant_Prefixed # Lo ZANABAZAR SQUARE CLUSTER-INITIAL LETTER RA
11A84..11A89 ; Consonant_Prefixed # Lo [6] SOYOMBO SIGN JIHVAMULIYA..SOYOMBO CLUSTER-INITIAL LETTER SA
11A84..11A85 ; Consonant_Prefixed # Lo [2] SOYOMBO SIGN JIHVAMULIYA..SOYOMBO SIGN UPADHMANIYA
11A87..11A89 ; Consonant_Prefixed # Lo [3] SOYOMBO CLUSTER-INITIAL LETTER LA..SOYOMBO CLUSTER-INITIAL LETTER SA

# ================================================

@@ -1042,6 +1051,7 @@ ABD2..ABDA ; Consonant # Lo [9] MEETEI MAYEK LETTER GOK..MEETEI MAYEK LETTE
0D4E ; Consonant_Preceding_Repha # Lo MALAYALAM LETTER DOT REPH
113D1 ; Consonant_Preceding_Repha # Lo TULU-TIGALARI REPHA
11941 ; Consonant_Preceding_Repha # Lo DIVES AKURU INITIAL RA
11A86 ; Consonant_Preceding_Repha # Lo SOYOMBO CLUSTER-INITIAL LETTER RA
11D46 ; Consonant_Preceding_Repha # Lo MASARAM GONDI REPHA
11F02 ; Consonant_Preceding_Repha # Lo KAWI SIGN REPHA
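
Reviewer's aid, not part of the PR: the data lines above follow the pattern `code point or range ; value # comment`. The Python sketch below assumes that pattern and expands the post-change lines quoted in this diff, to confirm where the reclassified code points end up. The helper name `parse_property_lines` is hypothetical, and this is not the tooling unicodetools uses to generate or validate the file.

```python
import re

# Matches the leading "XXXX ; Value" or "XXXX..YYYY ; Value" part of a data line.
# Comment lines (starting with '#') and blank lines do not match and are skipped.
LINE_RE = re.compile(r"^([0-9A-F]{4,6})(?:\.\.([0-9A-F]{4,6}))?\s*;\s*(\w+)")


def parse_property_lines(lines):
    """Return a dict mapping each code point (int) to its property value (str)."""
    mapping = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        first = int(match.group(1), 16)
        last = int(match.group(2), 16) if match.group(2) else first
        for code_point in range(first, last + 1):
            mapping[code_point] = match.group(3)
    return mapping


# Post-change lines copied from the diff above.
NEW_LINES = [
    "# Indic_Syllabic_Category=Consonant",
    "1900..191E ; Consonant # Lo [31] LIMBU VOWEL-CARRIER LETTER..LIMBU LETTER TRA",
    "11A00 ; Consonant # Lo ZANABAZAR SQUARE LETTER A",
    "11A50 ; Consonant # Lo SOYOMBO LETTER A",
    "11A3A ; Consonant_With_Stacker # Lo ZANABAZAR SQUARE CLUSTER-INITIAL LETTER RA",
    "11A84..11A85 ; Consonant_Prefixed # Lo [2] SOYOMBO SIGN JIHVAMULIYA..SOYOMBO SIGN UPADHMANIYA",
    "11A87..11A89 ; Consonant_Prefixed # Lo [3] SOYOMBO CLUSTER-INITIAL LETTER LA..SOYOMBO CLUSTER-INITIAL LETTER SA",
    "11A86 ; Consonant_Preceding_Repha # Lo SOYOMBO CLUSTER-INITIAL LETTER RA",
]

mapping = parse_property_lines(NEW_LINES)
print(mapping[0x11A3A])  # prints "Consonant_With_Stacker"
print(mapping[0x1900])   # prints "Consonant"
```

Expanding ranges to individual code points makes it easy to see that the old range 11A84..11A89 has been split around 11A86, which now carries Consonant_Preceding_Repha.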

5 changes: 2 additions & 3 deletions unicodetools/data/ucd/dev/auxiliary/GraphemeBreakProperty.txt
@@ -1,5 +1,5 @@
# GraphemeBreakProperty-17.0.0.txt
# Date: 2025-01-27, 18:09:16 GMT
# Date: 2025-05-08, 22:20:13 GMT
# © 2025 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -30,12 +30,11 @@
113D1 ; Prepend # Lo TULU-TIGALARI REPHA
1193F ; Prepend # Lo DIVES AKURU PREFIXED NASAL SIGN
11941 ; Prepend # Lo DIVES AKURU INITIAL RA
11A3A ; Prepend # Lo ZANABAZAR SQUARE CLUSTER-INITIAL LETTER RA
11A84..11A89 ; Prepend # Lo [6] SOYOMBO SIGN JIHVAMULIYA..SOYOMBO CLUSTER-INITIAL LETTER SA
11D46 ; Prepend # Lo MASARAM GONDI REPHA
11F02 ; Prepend # Lo KAWI SIGN REPHA

# Total code points: 28
# Total code points: 27
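
Reviewer's aid, not part of the PR: as far as I understand UAX #29, Grapheme_Cluster_Break=Prepend is derived in part from the Indic_Syllabic_Category values Consonant_Preceding_Repha and Consonant_Prefixed (the Prepended_Concatenation_Mark contribution is left out here). The sketch below hard-codes the post-change Indic_Syllabic_Category values for only the code points visible in this hunk and checks that the derived set matches the Prepend lines that remain. The names `PREPEND_INSC`, `insc_after`, and `derive_prepend` are hypothetical.

```python
# Indic_Syllabic_Category values assumed to feed into GCB=Prepend.
PREPEND_INSC = {"Consonant_Preceding_Repha", "Consonant_Prefixed"}

# Post-change Indic_Syllabic_Category values, copied from the diff,
# limited to the code points that appear in this hunk.
insc_after = {
    0x113D1: "Consonant_Preceding_Repha",
    0x1193F: "Consonant_Prefixed",
    0x11941: "Consonant_Preceding_Repha",
    0x11A3A: "Consonant_With_Stacker",   # was Consonant_Prefixed before this PR
    0x11A84: "Consonant_Prefixed",
    0x11A85: "Consonant_Prefixed",
    0x11A86: "Consonant_Preceding_Repha",  # was Consonant_Prefixed before this PR
    0x11A87: "Consonant_Prefixed",
    0x11A88: "Consonant_Prefixed",
    0x11A89: "Consonant_Prefixed",
    0x11D46: "Consonant_Preceding_Repha",
    0x11F02: "Consonant_Preceding_Repha",
}


def derive_prepend(insc):
    """Return the set of code points whose InSC value implies GCB=Prepend."""
    return {cp for cp, value in insc.items() if value in PREPEND_INSC}


# Prepend entries that remain listed in the hunk after the change.
listed_prepend = {
    0x113D1, 0x1193F, 0x11941,
    *range(0x11A84, 0x11A89 + 1),
    0x11D46, 0x11F02,
}

derived = derive_prepend(insc_after)
print(derived == listed_prepend)  # prints "True"
print(0x11A3A in derived)         # prints "False"
```

Under this reading, U+11A3A leaves Prepend because it is now Consonant_With_Stacker, while U+11A86 stays in Prepend because its new value also maps to Prepend. That single removal accounts for the total dropping from 28 to 27.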

# ================================================

Original file line number Diff line number Diff line change
@@ -1168,10 +1168,19 @@ Value: Consonant_Placeholder
# Indic script layout (NBSP and dotted circle), as well as a few script-
# specific vowel-holder characters which are not technically
# consonants, but serve instead as bases for placement of vowel marks.
# Vowel carriers that are null consonants instead have the
# Indic_Syllabic_Category Consonant.

# [Not derivable]
Value: Consonant
# Consonant (ordinary abugida consonants, with inherent vowels)
# Consonant
# This includes ordinary abugida consonants with inherent vowels.
# In scripts that do not have distinct independent vowel characters, but instead
# form independent vowels by adding dependent vowels to a vowel carrier which
# otherwise represents the inherent vowel, that vowel carrier has the
# Indic_Syllabic_Category Consonant, as a null consonant. Such vowel carriers
# can often also be analyzed as glottal stops with inherent vowels.
# An example is U+0F68 ཨ TIBETAN LETTER A.

# [Not derivable]
Value: Consonant_Dead