@chr33s/pdf-codepoints 5.0.0

@@ -0,0 +1,281 @@
1
+ # SpecialCasing-12.0.0.txt
2
+ # Date: 2019-01-22, 08:18:50 GMT
3
+ # © 2019 Unicode®, Inc.
4
+ # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
5
+ # For terms of use, see http://www.unicode.org/terms_of_use.html
6
+ #
7
+ # Unicode Character Database
8
+ # For documentation, see http://www.unicode.org/reports/tr44/
9
+ #
10
+ # Special Casing
11
+ #
12
+ # This file is a supplement to the UnicodeData.txt file. It does not define any
13
+ # properties, but rather provides additional information about the casing of
14
+ # Unicode characters, for situations when casing incurs a change in string length
15
+ # or is dependent on context or locale. For compatibility, the UnicodeData.txt
16
+ # file only contains simple case mappings for characters where they are one-to-one
17
+ # and independent of context and language. The data in this file, combined with
18
+ # the simple case mappings in UnicodeData.txt, defines the full case mappings
19
+ # Lowercase_Mapping (lc), Titlecase_Mapping (tc), and Uppercase_Mapping (uc).
20
+ #
21
+ # Note that the preferred mechanism for defining tailored casing operations is
22
+ # the Unicode Common Locale Data Repository (CLDR). For more information, see the
23
+ # discussion of case mappings and case algorithms in the Unicode Standard.
24
+ #
25
+ # All code points not listed in this file that do not have simple case mappings
26
+ # in UnicodeData.txt map to themselves.
27
+ # ================================================================================
28
+ # Format
29
+ # ================================================================================
30
+ # The entries in this file are in the following machine-readable format:
31
+ #
32
+ # <code>; <lower>; <title>; <upper>; (<condition_list>;)? # <comment>
33
+ #
34
+ # <code>, <lower>, <title>, and <upper> provide the respective full case mappings
35
+ # of <code>, expressed as character values in hex. If there is more than one character,
36
+ # they are separated by spaces. Other than as used to separate elements, spaces are
37
+ # to be ignored.
38
+ #
39
+ # The <condition_list> is optional. Where present, it consists of one or more language IDs
40
+ # or casing contexts, separated by spaces. In these conditions:
41
+ # - A condition list overrides the normal behavior if all of the listed conditions are true.
42
+ # - The casing context is always the context of the characters in the original string,
43
+ # NOT in the resulting string.
44
+ # - Case distinctions in the condition list are not significant.
45
+ # - Conditions preceded by "Not_" represent the negation of the condition.
46
+ # The condition list is not represented in the UCD as a formal property.
47
+ #
48
+ # A language ID is defined by BCP 47, with '-' and '_' treated equivalently.
49
+ #
50
+ # A casing context for a character is defined by Section 3.13 Default Case Algorithms
51
+ # of The Unicode Standard.
52
+ #
53
+ # Parsers of this file must be prepared to deal with future additions to this format:
54
+ # * Additional contexts
55
+ # * Additional fields
56
+ # ================================================================================
57
+
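The record format described in the header above can be read with a few lines of code. The following is a minimal sketch, not the package's own loader: the names `CasingEntry` and `parse_line` are hypothetical, and it assumes the leading `+ ` diff marker has already been stripped from each line. Fields beyond the fifth are ignored, which is one simple way to tolerate the "additional fields" the header warns about.

```python
from typing import NamedTuple, Optional, Tuple


class CasingEntry(NamedTuple):
    """One record of the file (hypothetical container, not part of the UCD)."""
    code: int
    lower: Tuple[int, ...]
    title: Tuple[int, ...]
    upper: Tuple[int, ...]
    conditions: Tuple[str, ...]


def _hex_seq(field: str) -> Tuple[int, ...]:
    # A mapping field holds zero or more hex code points separated by spaces.
    return tuple(int(tok, 16) for tok in field.split())


def parse_line(line: str) -> Optional[CasingEntry]:
    """Parse one line; return None for blank lines and pure comments."""
    data = line.split("#", 1)[0].strip()  # drop the trailing comment
    if not data:
        return None
    fields = [f.strip() for f in data.split(";")]
    if len(fields) < 4:
        raise ValueError(f"expected at least 4 fields: {line!r}")
    code, lower, title, upper = fields[:4]
    # The condition list is optional; case is not significant, so fold it.
    conds = tuple(c.lower() for c in fields[4].split()) if len(fields) > 4 else ()
    return CasingEntry(int(code, 16), _hex_seq(lower), _hex_seq(title),
                       _hex_seq(upper), conds)
```

An empty mapping field (as in the Lithuanian `0307` record) comes back as an empty tuple, meaning the character is removed in that case form.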
58
+ # ================================================================================
59
+ # Unconditional mappings
60
+ # ================================================================================
61
+
62
+ # The German es-zed is special--the normal mapping is to SS.
63
+ # Note: the titlecase should never occur in practice. It is equal to titlecase(uppercase(<es-zed>))
64
+
65
+ 00DF; 00DF; 0053 0073; 0053 0053; # LATIN SMALL LETTER SHARP S
66
+
67
+ # Preserve canonical equivalence for I with dot. Turkic is handled below.
68
+
69
+ 0130; 0069 0307; 0130; 0130; # LATIN CAPITAL LETTER I WITH DOT ABOVE
70
+
71
+ # Ligatures
72
+
73
+ FB00; FB00; 0046 0066; 0046 0046; # LATIN SMALL LIGATURE FF
74
+ FB01; FB01; 0046 0069; 0046 0049; # LATIN SMALL LIGATURE FI
75
+ FB02; FB02; 0046 006C; 0046 004C; # LATIN SMALL LIGATURE FL
76
+ FB03; FB03; 0046 0066 0069; 0046 0046 0049; # LATIN SMALL LIGATURE FFI
77
+ FB04; FB04; 0046 0066 006C; 0046 0046 004C; # LATIN SMALL LIGATURE FFL
78
+ FB05; FB05; 0053 0074; 0053 0054; # LATIN SMALL LIGATURE LONG S T
79
+ FB06; FB06; 0053 0074; 0053 0054; # LATIN SMALL LIGATURE ST
80
+
81
+ 0587; 0587; 0535 0582; 0535 0552; # ARMENIAN SMALL LIGATURE ECH YIWN
82
+ FB13; FB13; 0544 0576; 0544 0546; # ARMENIAN SMALL LIGATURE MEN NOW
83
+ FB14; FB14; 0544 0565; 0544 0535; # ARMENIAN SMALL LIGATURE MEN ECH
84
+ FB15; FB15; 0544 056B; 0544 053B; # ARMENIAN SMALL LIGATURE MEN INI
85
+ FB16; FB16; 054E 0576; 054E 0546; # ARMENIAN SMALL LIGATURE VEW NOW
86
+ FB17; FB17; 0544 056D; 0544 053D; # ARMENIAN SMALL LIGATURE MEN XEH
87
+
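To see how the unconditional entries above change string length, it can help to run them through a runtime that applies full case mappings. The sketch below uses Python's built-in `str.upper()` and `str.lower()`, which apply full rather than simple mappings; the exact results depend on the Unicode data bundled with the interpreter, although these particular mappings have been stable for many versions. The helper name `length_change` is hypothetical.

```python
def length_change(text: str) -> tuple:
    """Return (original length, length after full uppercasing)."""
    return len(text), len(text.upper())


# LATIN SMALL LETTER SHARP S (00DF) uppercases to two characters.
sharp_s_upper = "stra\u00dfe".upper()

# LATIN SMALL LIGATURE FFI (FB03) uppercases to three characters.
ffi_upper = "\ufb03".upper()

# LATIN CAPITAL LETTER I WITH DOT ABOVE (0130) lowercases to i + U+0307,
# which preserves canonical equivalence as noted above.
dotted_i_lower = "\u0130".lower()
```

Code that assumes case conversion preserves length (for example, preallocated buffers or index arithmetic) needs to account for these expansions.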
88
+ # No corresponding uppercase precomposed character
89
+
90
+ 0149; 0149; 02BC 004E; 02BC 004E; # LATIN SMALL LETTER N PRECEDED BY APOSTROPHE
91
+ 0390; 0390; 0399 0308 0301; 0399 0308 0301; # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS
92
+ 03B0; 03B0; 03A5 0308 0301; 03A5 0308 0301; # GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND TONOS
93
+ 01F0; 01F0; 004A 030C; 004A 030C; # LATIN SMALL LETTER J WITH CARON
94
+ 1E96; 1E96; 0048 0331; 0048 0331; # LATIN SMALL LETTER H WITH LINE BELOW
95
+ 1E97; 1E97; 0054 0308; 0054 0308; # LATIN SMALL LETTER T WITH DIAERESIS
96
+ 1E98; 1E98; 0057 030A; 0057 030A; # LATIN SMALL LETTER W WITH RING ABOVE
97
+ 1E99; 1E99; 0059 030A; 0059 030A; # LATIN SMALL LETTER Y WITH RING ABOVE
98
+ 1E9A; 1E9A; 0041 02BE; 0041 02BE; # LATIN SMALL LETTER A WITH RIGHT HALF RING
99
+ 1F50; 1F50; 03A5 0313; 03A5 0313; # GREEK SMALL LETTER UPSILON WITH PSILI
100
+ 1F52; 1F52; 03A5 0313 0300; 03A5 0313 0300; # GREEK SMALL LETTER UPSILON WITH PSILI AND VARIA
101
+ 1F54; 1F54; 03A5 0313 0301; 03A5 0313 0301; # GREEK SMALL LETTER UPSILON WITH PSILI AND OXIA
102
+ 1F56; 1F56; 03A5 0313 0342; 03A5 0313 0342; # GREEK SMALL LETTER UPSILON WITH PSILI AND PERISPOMENI
103
+ 1FB6; 1FB6; 0391 0342; 0391 0342; # GREEK SMALL LETTER ALPHA WITH PERISPOMENI
104
+ 1FC6; 1FC6; 0397 0342; 0397 0342; # GREEK SMALL LETTER ETA WITH PERISPOMENI
105
+ 1FD2; 1FD2; 0399 0308 0300; 0399 0308 0300; # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND VARIA
106
+ 1FD3; 1FD3; 0399 0308 0301; 0399 0308 0301; # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND OXIA
107
+ 1FD6; 1FD6; 0399 0342; 0399 0342; # GREEK SMALL LETTER IOTA WITH PERISPOMENI
108
+ 1FD7; 1FD7; 0399 0308 0342; 0399 0308 0342; # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND PERISPOMENI
109
+ 1FE2; 1FE2; 03A5 0308 0300; 03A5 0308 0300; # GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND VARIA
110
+ 1FE3; 1FE3; 03A5 0308 0301; 03A5 0308 0301; # GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND OXIA
111
+ 1FE4; 1FE4; 03A1 0313; 03A1 0313; # GREEK SMALL LETTER RHO WITH PSILI
112
+ 1FE6; 1FE6; 03A5 0342; 03A5 0342; # GREEK SMALL LETTER UPSILON WITH PERISPOMENI
113
+ 1FE7; 1FE7; 03A5 0308 0342; 03A5 0308 0342; # GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND PERISPOMENI
114
+ 1FF6; 1FF6; 03A9 0342; 03A9 0342; # GREEK SMALL LETTER OMEGA WITH PERISPOMENI
115
+
116
+ # IMPORTANT-when iota-subscript (0345) is uppercased or titlecased,
117
+ # the result will be incorrect unless the iota-subscript is moved to the end
118
+ # of any sequence of combining marks. Otherwise, the accents will go on the capital iota.
119
+ # This process can be achieved by first transforming the text to NFC before casing.
120
+ # E.g. <alpha><iota_subscript><acute> is uppercased to <ALPHA><acute><IOTA>
121
+
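The iota-subscript note above can be checked with a short experiment. This is a sketch using Python's standard `unicodedata.normalize` and `str.upper()`, and it assumes an interpreter whose bundled Unicode data includes these long-standing Greek mappings. The input is the sequence from the example: alpha, iota subscript (0345), acute (0301).

```python
import unicodedata

# <alpha><iota_subscript><acute>, in the "wrong" order for casing.
raw = "\u03b1\u0345\u0301"

# Uppercasing directly turns 0345 into a capital iota in place,
# so the acute ends up attached to the iota.
naive_upper = raw.upper()

# NFC first reorders the marks and composes the sequence into a single
# precomposed letter (alpha with oxia and ypogegrammeni, U+1FB4).
composed = unicodedata.normalize("NFC", raw)

# Uppercasing the composed form keeps the accent on the alpha and
# places the capital iota last.
safe_upper = composed.upper()
```

Here `naive_upper` is ALPHA, IOTA, acute, while `safe_upper` is ALPHA WITH TONOS followed by IOTA, which matches the intent of the note.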
122
+ # The following cases are already in the UnicodeData.txt file, so are only commented here.
123
+
124
+ # 0345; 0345; 0399; 0399; # COMBINING GREEK YPOGEGRAMMENI
125
+
126
+ # All letters with YPOGEGRAMMENI (iota-subscript) or PROSGEGRAMMENI (iota adscript)
127
+ # have special uppercases.
128
+ # Note: characters with PROSGEGRAMMENI are actually titlecase, not uppercase!
129
+
130
+ 1F80; 1F80; 1F88; 1F08 0399; # GREEK SMALL LETTER ALPHA WITH PSILI AND YPOGEGRAMMENI
131
+ 1F81; 1F81; 1F89; 1F09 0399; # GREEK SMALL LETTER ALPHA WITH DASIA AND YPOGEGRAMMENI
132
+ 1F82; 1F82; 1F8A; 1F0A 0399; # GREEK SMALL LETTER ALPHA WITH PSILI AND VARIA AND YPOGEGRAMMENI
133
+ 1F83; 1F83; 1F8B; 1F0B 0399; # GREEK SMALL LETTER ALPHA WITH DASIA AND VARIA AND YPOGEGRAMMENI
134
+ 1F84; 1F84; 1F8C; 1F0C 0399; # GREEK SMALL LETTER ALPHA WITH PSILI AND OXIA AND YPOGEGRAMMENI
135
+ 1F85; 1F85; 1F8D; 1F0D 0399; # GREEK SMALL LETTER ALPHA WITH DASIA AND OXIA AND YPOGEGRAMMENI
136
+ 1F86; 1F86; 1F8E; 1F0E 0399; # GREEK SMALL LETTER ALPHA WITH PSILI AND PERISPOMENI AND YPOGEGRAMMENI
137
+ 1F87; 1F87; 1F8F; 1F0F 0399; # GREEK SMALL LETTER ALPHA WITH DASIA AND PERISPOMENI AND YPOGEGRAMMENI
138
+ 1F88; 1F80; 1F88; 1F08 0399; # GREEK CAPITAL LETTER ALPHA WITH PSILI AND PROSGEGRAMMENI
139
+ 1F89; 1F81; 1F89; 1F09 0399; # GREEK CAPITAL LETTER ALPHA WITH DASIA AND PROSGEGRAMMENI
140
+ 1F8A; 1F82; 1F8A; 1F0A 0399; # GREEK CAPITAL LETTER ALPHA WITH PSILI AND VARIA AND PROSGEGRAMMENI
141
+ 1F8B; 1F83; 1F8B; 1F0B 0399; # GREEK CAPITAL LETTER ALPHA WITH DASIA AND VARIA AND PROSGEGRAMMENI
142
+ 1F8C; 1F84; 1F8C; 1F0C 0399; # GREEK CAPITAL LETTER ALPHA WITH PSILI AND OXIA AND PROSGEGRAMMENI
143
+ 1F8D; 1F85; 1F8D; 1F0D 0399; # GREEK CAPITAL LETTER ALPHA WITH DASIA AND OXIA AND PROSGEGRAMMENI
144
+ 1F8E; 1F86; 1F8E; 1F0E 0399; # GREEK CAPITAL LETTER ALPHA WITH PSILI AND PERISPOMENI AND PROSGEGRAMMENI
145
+ 1F8F; 1F87; 1F8F; 1F0F 0399; # GREEK CAPITAL LETTER ALPHA WITH DASIA AND PERISPOMENI AND PROSGEGRAMMENI
146
+ 1F90; 1F90; 1F98; 1F28 0399; # GREEK SMALL LETTER ETA WITH PSILI AND YPOGEGRAMMENI
147
+ 1F91; 1F91; 1F99; 1F29 0399; # GREEK SMALL LETTER ETA WITH DASIA AND YPOGEGRAMMENI
148
+ 1F92; 1F92; 1F9A; 1F2A 0399; # GREEK SMALL LETTER ETA WITH PSILI AND VARIA AND YPOGEGRAMMENI
149
+ 1F93; 1F93; 1F9B; 1F2B 0399; # GREEK SMALL LETTER ETA WITH DASIA AND VARIA AND YPOGEGRAMMENI
150
+ 1F94; 1F94; 1F9C; 1F2C 0399; # GREEK SMALL LETTER ETA WITH PSILI AND OXIA AND YPOGEGRAMMENI
151
+ 1F95; 1F95; 1F9D; 1F2D 0399; # GREEK SMALL LETTER ETA WITH DASIA AND OXIA AND YPOGEGRAMMENI
152
+ 1F96; 1F96; 1F9E; 1F2E 0399; # GREEK SMALL LETTER ETA WITH PSILI AND PERISPOMENI AND YPOGEGRAMMENI
153
+ 1F97; 1F97; 1F9F; 1F2F 0399; # GREEK SMALL LETTER ETA WITH DASIA AND PERISPOMENI AND YPOGEGRAMMENI
154
+ 1F98; 1F90; 1F98; 1F28 0399; # GREEK CAPITAL LETTER ETA WITH PSILI AND PROSGEGRAMMENI
155
+ 1F99; 1F91; 1F99; 1F29 0399; # GREEK CAPITAL LETTER ETA WITH DASIA AND PROSGEGRAMMENI
156
+ 1F9A; 1F92; 1F9A; 1F2A 0399; # GREEK CAPITAL LETTER ETA WITH PSILI AND VARIA AND PROSGEGRAMMENI
157
+ 1F9B; 1F93; 1F9B; 1F2B 0399; # GREEK CAPITAL LETTER ETA WITH DASIA AND VARIA AND PROSGEGRAMMENI
158
+ 1F9C; 1F94; 1F9C; 1F2C 0399; # GREEK CAPITAL LETTER ETA WITH PSILI AND OXIA AND PROSGEGRAMMENI
159
+ 1F9D; 1F95; 1F9D; 1F2D 0399; # GREEK CAPITAL LETTER ETA WITH DASIA AND OXIA AND PROSGEGRAMMENI
160
+ 1F9E; 1F96; 1F9E; 1F2E 0399; # GREEK CAPITAL LETTER ETA WITH PSILI AND PERISPOMENI AND PROSGEGRAMMENI
161
+ 1F9F; 1F97; 1F9F; 1F2F 0399; # GREEK CAPITAL LETTER ETA WITH DASIA AND PERISPOMENI AND PROSGEGRAMMENI
162
+ 1FA0; 1FA0; 1FA8; 1F68 0399; # GREEK SMALL LETTER OMEGA WITH PSILI AND YPOGEGRAMMENI
163
+ 1FA1; 1FA1; 1FA9; 1F69 0399; # GREEK SMALL LETTER OMEGA WITH DASIA AND YPOGEGRAMMENI
164
+ 1FA2; 1FA2; 1FAA; 1F6A 0399; # GREEK SMALL LETTER OMEGA WITH PSILI AND VARIA AND YPOGEGRAMMENI
165
+ 1FA3; 1FA3; 1FAB; 1F6B 0399; # GREEK SMALL LETTER OMEGA WITH DASIA AND VARIA AND YPOGEGRAMMENI
166
+ 1FA4; 1FA4; 1FAC; 1F6C 0399; # GREEK SMALL LETTER OMEGA WITH PSILI AND OXIA AND YPOGEGRAMMENI
167
+ 1FA5; 1FA5; 1FAD; 1F6D 0399; # GREEK SMALL LETTER OMEGA WITH DASIA AND OXIA AND YPOGEGRAMMENI
168
+ 1FA6; 1FA6; 1FAE; 1F6E 0399; # GREEK SMALL LETTER OMEGA WITH PSILI AND PERISPOMENI AND YPOGEGRAMMENI
169
+ 1FA7; 1FA7; 1FAF; 1F6F 0399; # GREEK SMALL LETTER OMEGA WITH DASIA AND PERISPOMENI AND YPOGEGRAMMENI
170
+ 1FA8; 1FA0; 1FA8; 1F68 0399; # GREEK CAPITAL LETTER OMEGA WITH PSILI AND PROSGEGRAMMENI
171
+ 1FA9; 1FA1; 1FA9; 1F69 0399; # GREEK CAPITAL LETTER OMEGA WITH DASIA AND PROSGEGRAMMENI
172
+ 1FAA; 1FA2; 1FAA; 1F6A 0399; # GREEK CAPITAL LETTER OMEGA WITH PSILI AND VARIA AND PROSGEGRAMMENI
173
+ 1FAB; 1FA3; 1FAB; 1F6B 0399; # GREEK CAPITAL LETTER OMEGA WITH DASIA AND VARIA AND PROSGEGRAMMENI
174
+ 1FAC; 1FA4; 1FAC; 1F6C 0399; # GREEK CAPITAL LETTER OMEGA WITH PSILI AND OXIA AND PROSGEGRAMMENI
175
+ 1FAD; 1FA5; 1FAD; 1F6D 0399; # GREEK CAPITAL LETTER OMEGA WITH DASIA AND OXIA AND PROSGEGRAMMENI
176
+ 1FAE; 1FA6; 1FAE; 1F6E 0399; # GREEK CAPITAL LETTER OMEGA WITH PSILI AND PERISPOMENI AND PROSGEGRAMMENI
177
+ 1FAF; 1FA7; 1FAF; 1F6F 0399; # GREEK CAPITAL LETTER OMEGA WITH DASIA AND PERISPOMENI AND PROSGEGRAMMENI
178
+ 1FB3; 1FB3; 1FBC; 0391 0399; # GREEK SMALL LETTER ALPHA WITH YPOGEGRAMMENI
179
+ 1FBC; 1FB3; 1FBC; 0391 0399; # GREEK CAPITAL LETTER ALPHA WITH PROSGEGRAMMENI
180
+ 1FC3; 1FC3; 1FCC; 0397 0399; # GREEK SMALL LETTER ETA WITH YPOGEGRAMMENI
181
+ 1FCC; 1FC3; 1FCC; 0397 0399; # GREEK CAPITAL LETTER ETA WITH PROSGEGRAMMENI
182
+ 1FF3; 1FF3; 1FFC; 03A9 0399; # GREEK SMALL LETTER OMEGA WITH YPOGEGRAMMENI
183
+ 1FFC; 1FF3; 1FFC; 03A9 0399; # GREEK CAPITAL LETTER OMEGA WITH PROSGEGRAMMENI
184
+
185
+ # Some characters with YPOGEGRAMMENI also have no corresponding titlecases
186
+
187
+ 1FB2; 1FB2; 1FBA 0345; 1FBA 0399; # GREEK SMALL LETTER ALPHA WITH VARIA AND YPOGEGRAMMENI
188
+ 1FB4; 1FB4; 0386 0345; 0386 0399; # GREEK SMALL LETTER ALPHA WITH OXIA AND YPOGEGRAMMENI
189
+ 1FC2; 1FC2; 1FCA 0345; 1FCA 0399; # GREEK SMALL LETTER ETA WITH VARIA AND YPOGEGRAMMENI
190
+ 1FC4; 1FC4; 0389 0345; 0389 0399; # GREEK SMALL LETTER ETA WITH OXIA AND YPOGEGRAMMENI
191
+ 1FF2; 1FF2; 1FFA 0345; 1FFA 0399; # GREEK SMALL LETTER OMEGA WITH VARIA AND YPOGEGRAMMENI
192
+ 1FF4; 1FF4; 038F 0345; 038F 0399; # GREEK SMALL LETTER OMEGA WITH OXIA AND YPOGEGRAMMENI
193
+
194
+ 1FB7; 1FB7; 0391 0342 0345; 0391 0342 0399; # GREEK SMALL LETTER ALPHA WITH PERISPOMENI AND YPOGEGRAMMENI
195
+ 1FC7; 1FC7; 0397 0342 0345; 0397 0342 0399; # GREEK SMALL LETTER ETA WITH PERISPOMENI AND YPOGEGRAMMENI
196
+ 1FF7; 1FF7; 03A9 0342 0345; 03A9 0342 0399; # GREEK SMALL LETTER OMEGA WITH PERISPOMENI AND YPOGEGRAMMENI
197
+
198
+ # ================================================================================
199
+ # Conditional Mappings
200
+ # The remainder of this file provides conditional casing data used to produce
201
+ # full case mappings.
202
+ # ================================================================================
203
+ # Language-Insensitive Mappings
204
+ # These are characters whose full case mappings do not depend on language, but do
205
+ # depend on context (which characters come before or after). For more information
206
+ # see the header of this file and the Unicode Standard.
207
+ # ================================================================================
208
+
209
+ # Special case for final form of sigma
210
+
211
+ 03A3; 03C2; 03A3; 03A3; Final_Sigma; # GREEK CAPITAL LETTER SIGMA
212
+
213
+ # Note: the following cases for non-final are already in the UnicodeData.txt file.
214
+
215
+ # 03A3; 03C3; 03A3; 03A3; # GREEK CAPITAL LETTER SIGMA
216
+ # 03C3; 03C3; 03A3; 03A3; # GREEK SMALL LETTER SIGMA
217
+ # 03C2; 03C2; 03A3; 03A3; # GREEK SMALL LETTER FINAL SIGMA
218
+
219
+ # Note: the following cases are not included, since they would case-fold in lowercasing
220
+
221
+ # 03C3; 03C2; 03A3; 03A3; Final_Sigma; # GREEK SMALL LETTER SIGMA
222
+ # 03C2; 03C3; 03A3; 03A3; Not_Final_Sigma; # GREEK SMALL LETTER FINAL SIGMA
223
+
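The Final_Sigma condition above is one of the few context rules that common runtimes apply without any locale. As a small illustration, and assuming CPython 3.3 or later (where `str.lower()` handles this context), a capital sigma lowercases to the final form only at the end of a word:

```python
# Capital sigma at the end of a word becomes final sigma (03C2).
word_lower = "\u039f\u0394\u039f\u03a3".lower()   # "ΟΔΟΣ"

# Capital sigma in the middle of a word becomes the regular sigma (03C3).
mid_lower = "\u03a3\u0391\u03a3".lower()          # "ΣΑΣ"

# An isolated sigma has no preceding cased letter, so the
# Final_Sigma condition does not hold and it maps to 03C3.
lone_lower = "\u03a3".lower()
```

Note that the context is evaluated on the original string, as the header of the file requires, not on the partially lowercased result.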
224
+ # ================================================================================
225
+ # Language-Sensitive Mappings
226
+ # These are characters whose full case mappings depend on language and perhaps also
227
+ # context (which characters come before or after). For more information
228
+ # see the header of this file and the Unicode Standard.
229
+ # ================================================================================
230
+
231
+ # Lithuanian
232
+
233
+ # Lithuanian retains the dot in a lowercase i when followed by accents.
234
+
235
+ # Remove DOT ABOVE after "i" with upper or titlecase
236
+
237
+ 0307; 0307; ; ; lt After_Soft_Dotted; # COMBINING DOT ABOVE
238
+
239
+ # Introduce an explicit dot above when lowercasing capital I's and J's
240
+ # whenever there are more accents above.
241
+ # (of the accents used in Lithuanian: grave, acute, tilde above, and ogonek)
242
+
243
+ 0049; 0069 0307; 0049; 0049; lt More_Above; # LATIN CAPITAL LETTER I
244
+ 004A; 006A 0307; 004A; 004A; lt More_Above; # LATIN CAPITAL LETTER J
245
+ 012E; 012F 0307; 012E; 012E; lt More_Above; # LATIN CAPITAL LETTER I WITH OGONEK
246
+ 00CC; 0069 0307 0300; 00CC; 00CC; lt; # LATIN CAPITAL LETTER I WITH GRAVE
247
+ 00CD; 0069 0307 0301; 00CD; 00CD; lt; # LATIN CAPITAL LETTER I WITH ACUTE
248
+ 0128; 0069 0307 0303; 0128; 0128; lt; # LATIN CAPITAL LETTER I WITH TILDE
249
+
250
+ # ================================================================================
251
+
252
+ # Turkish and Azeri
253
+
254
+ # I and i-dotless; I-dot and i are case pairs in Turkish and Azeri
255
+ # The following rules handle those cases.
256
+
257
+ 0130; 0069; 0130; 0130; tr; # LATIN CAPITAL LETTER I WITH DOT ABOVE
258
+ 0130; 0069; 0130; 0130; az; # LATIN CAPITAL LETTER I WITH DOT ABOVE
259
+
260
+ # When lowercasing, remove dot_above in the sequence I + dot_above, which will turn into i.
261
+ # This matches the behavior of the canonically equivalent I-dot_above
262
+
263
+ 0307; ; 0307; 0307; tr After_I; # COMBINING DOT ABOVE
264
+ 0307; ; 0307; 0307; az After_I; # COMBINING DOT ABOVE
265
+
266
+ # When lowercasing, unless an I is before a dot_above, it turns into a dotless i.
267
+
268
+ 0049; 0131; 0049; 0049; tr Not_Before_Dot; # LATIN CAPITAL LETTER I
269
+ 0049; 0131; 0049; 0049; az Not_Before_Dot; # LATIN CAPITAL LETTER I
270
+
271
+ # When uppercasing, i turns into a dotted capital I
272
+
273
+ 0069; 0069; 0130; 0130; tr; # LATIN SMALL LETTER I
274
+ 0069; 0069; 0130; 0130; az; # LATIN SMALL LETTER I
275
+
276
+ # Note: the following case is already in the UnicodeData.txt file.
277
+
278
+ # 0131; 0131; 0049; 0049; tr; # LATIN SMALL LETTER DOTLESS I
279
+
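Python's built-in casing is not locale-aware, so the Turkish and Azeri rules above have to be applied by hand or through a library such as ICU. The following is a deliberately simplified sketch with hypothetical function names `lower_tr` and `upper_tr`. It assumes the dot above directly follows the `I`, whereas the real After_I and Not_Before_Dot contexts in the Unicode Standard also allow certain intervening combining marks. It also cases one character at a time, so other context rules such as Final_Sigma are not applied.

```python
def lower_tr(text: str) -> str:
    """Lowercase with the tr/az rules (simplified: adjacent dot only)."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\u0130":                      # I WITH DOT ABOVE -> i
            out.append("i")
        elif ch == "I":
            if i + 1 < len(text) and text[i + 1] == "\u0307":
                out.append("i")                 # I + dot_above -> i
                i += 1                          # consume the combining dot
            else:
                out.append("\u0131")            # I -> dotless i
        else:
            out.append(ch.lower())
        i += 1
    return "".join(out)


def upper_tr(text: str) -> str:
    """Uppercase with the tr/az rule: i -> I WITH DOT ABOVE."""
    return "".join("\u0130" if ch == "i" else ch.upper() for ch in text)
```

For example, `lower_tr("ISPARTA")` yields a word starting with dotless i, and `upper_tr("istanbul")` yields a word starting with the dotted capital. Dotless i needs no special handling when uppercasing because its mapping to `I` is already a simple mapping in UnicodeData.txt.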
280
+ # EOF
281
+