Unicode Properties (from Unicode Version: 12.0.0)

 15: ASCII_Hex_Digit
 16: Adlam
 17: Ahom
 18: Alphabetic
 19: Anatolian_Hieroglyphs
 20: Any
 21: Arabic
 22: Armenian
 23: Assigned
 24: Avestan
 25: Balinese
 26: Bamum
 27: Bassa_Vah
 28: Batak
 29: Bengali
 30: Bhaiksuki
 31: Bidi_Control
 32: Bopomofo
 33: Brahmi
 34: Braille
 35: Buginese
 36: Buhid
 37: C
 38: Canadian_Aboriginal
 39: Carian
 40: Case_Ignorable
 41: Cased
 42: Caucasian_Albanian
 43: Cc
 44: Cf
 45: Chakma
 46: Cham
 47: Changes_When_Casefolded
 48: Changes_When_Casemapped
 49: Changes_When_Lowercased
 50: Changes_When_Titlecased
 51: Changes_When_Uppercased
 52: Cherokee
 53: Cn
 54: Co
 55: Common
 56: Coptic
 57: Cs
 58: Cuneiform
 59: Cypriot
 60: Cyrillic
 61: Dash
 62: Default_Ignorable_Code_Point
 63: Deprecated
 64: Deseret
 65: Devanagari
 66: Diacritic
 67: Dogra
 68: Duployan
 69: Egyptian_Hieroglyphs
 70: Elbasan
 71: Elymaic
 72: Emoji
 73: Emoji_Component
 74: Emoji_Modifier
 75: Emoji_Modifier_Base
 76: Emoji_Presentation
 77: Ethiopic
 78: Extended_Pictographic
 79: Extender
 80: Georgian
 81: Glagolitic
 82: Gothic
 83: Grantha
 84: Grapheme_Base
 85: Grapheme_Extend
 86: Grapheme_Link
 87: Greek
 88: Gujarati
 89: Gunjala_Gondi
 90: Gurmukhi
 91: Han
 92: Hangul
 93: Hanifi_Rohingya
 94: Hanunoo
 95: Hatran
 96: Hebrew
 97: Hex_Digit
 98: Hiragana
 99: Hyphen
100: IDS_Binary_Operator
101: IDS_Trinary_Operator
102: ID_Continue
103: ID_Start
104: Ideographic
105: Imperial_Aramaic
106: Inherited
107: Inscriptional_Pahlavi
108: Inscriptional_Parthian
109: Javanese
110: Join_Control
111: Kaithi
112: Kannada
113: Katakana
114: Kayah_Li
115: Kharoshthi
116: Khmer
117: Khojki
118: Khudawadi
119: L
120: LC
121: Lao
122: Latin
123: Lepcha
124: Limbu
125: Linear_A
126: Linear_B
127: Lisu
128: Ll
129: Lm
130: Lo
131: Logical_Order_Exception
132: Lowercase
133: Lt
134: Lu
135: Lycian
136: Lydian
137: M
138: Mahajani
139: Makasar
140: Malayalam
141: Mandaic
142: Manichaean
143: Marchen
144: Masaram_Gondi
145: Math
146: Mc
147: Me
148: Medefaidrin
149: Meetei_Mayek
150: Mende_Kikakui
151: Meroitic_Cursive
152: Meroitic_Hieroglyphs
153: Miao
154: Mn
155: Modi
156: Mongolian
157: Mro
158: Multani
159: Myanmar
160: N
161: Nabataean
162: Nandinagari
163: Nd
164: New_Tai_Lue
165: Newa
166: Nko
167: Nl
168: No
169: Noncharacter_Code_Point
170: Nushu
171: Nyiakeng_Puachue_Hmong
172: Ogham
173: Ol_Chiki
174: Old_Hungarian
175: Old_Italic
176: Old_North_Arabian
177: Old_Permic
178: Old_Persian
179: Old_Sogdian
180: Old_South_Arabian
181: Old_Turkic
182: Oriya
183: Osage
184: Osmanya
185: Other_Alphabetic
186: Other_Default_Ignorable_Code_Point
187: Other_Grapheme_Extend
188: Other_ID_Continue
189: Other_ID_Start
190: Other_Lowercase
191: Other_Math
192: Other_Uppercase
193: P
194: Pahawh_Hmong
195: Palmyrene
196: Pattern_Syntax
197: Pattern_White_Space
198: Pau_Cin_Hau
199: Pc
200: Pd
201: Pe
202: Pf
203: Phags_Pa
204: Phoenician
205: Pi
206: Po
207: Prepended_Concatenation_Mark
208: Ps
209: Psalter_Pahlavi
210: Quotation_Mark
211: Radical
212: Regional_Indicator
213: Rejang
214: Runic
215: S
216: Samaritan
217: Saurashtra
218: Sc
219: Sentence_Terminal
220: Sharada
221: Shavian
222: Siddham
223: SignWriting
224: Sinhala
225: Sk
226: Sm
227: So
228: Soft_Dotted
229: Sogdian
230: Sora_Sompeng
231: Soyombo
232: Sundanese
233: Syloti_Nagri
234: Syriac
235: Tagalog
236: Tagbanwa
237: Tai_Le
238: Tai_Tham
239: Tai_Viet
240: Takri
241: Tamil
242: Tangut
243: Telugu
244: Terminal_Punctuation
245: Thaana
246: Thai
247: Tibetan
248: Tifinagh
249: Tirhuta
250: Ugaritic
251: Unified_Ideograph
252: Unknown
253: Uppercase
254: Vai
255: Variation_Selector
256: Wancho
257: Warang_Citi
258: White_Space
259: XID_Continue
260: XID_Start
261: Yi
262: Z
263: Zanabazar_Square
264: Zl
265: Zp
266: Zs
 16: Adlm
 42: Aghb
 15: AHex
 21: Arab
105: Armi
 22: Armn
 24: Avst
 25: Bali
 26: Bamu
 27: Bass
 28: Batk
 29: Beng
 30: Bhks
 31: Bidi_C
 32: Bopo
 33: Brah
 34: Brai
 35: Bugi
 36: Buhd
 45: Cakm
 38: Cans
 39: Cari
120: Cased_Letter
 52: Cher
 40: CI
201: Close_Punctuation
137: Combining_Mark
199: Connector_Punctuation
 43: Control
 56: Copt
 59: Cprt
218: Currency_Symbol
 47: CWCF
 48: CWCM
 49: CWL
 50: CWT
 51: CWU
 60: Cyrl
200: Dash_Punctuation
163: Decimal_Number
 63: Dep
 65: Deva
 62: DI
 66: Dia
 67: Dogr
 64: Dsrt
 68: Dupl
 69: Egyp
 70: Elba
 71: Elym
147: Enclosing_Mark
 77: Ethi
 79: Ext
202: Final_Punctuation
 44: Format
 80: Geor
 81: Glag
 89: Gong
144: Gonm
 82: Goth
 83: Gran
 84: Gr_Base
 87: Grek
 85: Gr_Ext
 86: Gr_Link
 88: Gujr
 90: Guru
 92: Hang
 91: Hani
 94: Hano
 95: Hatr
 96: Hebr
 97: Hex
 98: Hira
 19: Hluw
194: Hmng
171: Hmnp
174: Hung
102: IDC
104: Ideo
103: IDS
100: IDSB
101: IDST
205: Initial_Punctuation
175: Ital
109: Java
110: Join_C
114: Kali
113: Kana
115: Khar
116: Khmr
117: Khoj
112: Knda
111: Kthi
238: Lana
121: Laoo
122: Latn
123: Lepc
119: Letter
167: Letter_Number
124: Limb
125: Lina
126: Linb
264: Line_Separator
131: LOE
128: Lowercase_Letter
135: Lyci
136: Lydi
138: Mahj
139: Maka
141: Mand
142: Mani
143: Marc
137: Mark
226: Math_Symbol
148: Medf
150: Mend
151: Merc
152: Mero
140: Mlym
129: Modifier_Letter
225: Modifier_Symbol
156: Mong
157: Mroo
149: Mtei
158: Mult
159: Mymr
162: Nand
176: Narb
161: Nbat
169: NChar
166: Nkoo
154: Nonspacing_Mark
170: Nshu
160: Number
185: OAlpha
186: ODI
172: Ogam
187: OGr_Ext
188: OIDC
189: OIDS
173: Olck
190: OLower
191: OMath
208: Open_Punctuation
181: Orkh
182: Orya
183: Osge
184: Osma
 37: Other
130: Other_Letter
168: Other_Number
206: Other_Punctuation
227: Other_Symbol
192: OUpper
195: Palm
265: Paragraph_Separator
196: Pat_Syn
197: Pat_WS
198: Pauc
207: PCM
177: Perm
203: Phag
107: Phli
209: Phlp
204: Phnx
153: Plrd
 54: Private_Use
108: Prti
193: Punctuation
 56: Qaac
106: Qaai
210: QMark
212: RI
213: Rjng
 93: Rohg
214: Runr
216: Samr
180: Sarb
217: Saur
228: SD
262: Separator
223: Sgnw
221: Shaw
220: Shrd
222: Sidd
118: Sind
224: Sinh
229: Sogd
179: Sogo
230: Sora
231: Soyo
266: Space_Separator
146: Spacing_Mark
219: STerm
232: Sund
 57: Surrogate
233: Sylo
215: Symbol
234: Syrc
236: Tagb
240: Takr
237: Tale
164: Talu
241: Taml
242: Tang
239: Tavt
243: Telu
244: Term
248: Tfng
235: Tglg
245: Thaa
247: Tibt
249: Tirh
133: Titlecase_Letter
250: Ugar
251: UIdeo
 53: Unassigned
134: Uppercase_Letter
254: Vaii
255: VS
257: Wara
256: Wcho
258: WSpace
259: XIDC
260: XIDS
178: Xpeo
 58: Xsux
261: Yiii
263: Zanb
106: Zinh
 55: Zyyy
252: Zzzz
267: In_Basic_Latin
268: In_Latin_1_Supplement
269: In_Latin_Extended_A
270: In_Latin_Extended_B
271: In_IPA_Extensions
272: In_Spacing_Modifier_Letters
273: In_Combining_Diacritical_Marks
274: In_Greek_and_Coptic
275: In_Cyrillic
276: In_Cyrillic_Supplement
277: In_Armenian
278: In_Hebrew
279: In_Arabic
280: In_Syriac
281: In_Arabic_Supplement
282: In_Thaana
283: In_NKo
284: In_Samaritan
285: In_Mandaic
286: In_Syriac_Supplement
287: In_Arabic_Extended_A
288: In_Devanagari
289: In_Bengali
290: In_Gurmukhi
291: In_Gujarati
292: In_Oriya
293: In_Tamil
294: In_Telugu
295: In_Kannada
296: In_Malayalam
297: In_Sinhala
298: In_Thai
299: In_Lao
300: In_Tibetan
301: In_Myanmar
302: In_Georgian
303: In_Hangul_Jamo
304: In_Ethiopic
305: In_Ethiopic_Supplement
306: In_Cherokee
307: In_Unified_Canadian_Aboriginal_Syllabics
308: In_Ogham
309: In_Runic
310: In_Tagalog
311: In_Hanunoo
312: In_Buhid
313: In_Tagbanwa
314: In_Khmer
315: In_Mongolian
316: In_Unified_Canadian_Aboriginal_Syllabics_Extended
317: In_Limbu
318: In_Tai_Le
319: In_New_Tai_Lue
320: In_Khmer_Symbols
321: In_Buginese
322: In_Tai_Tham
323: In_Combining_Diacritical_Marks_Extended
324: In_Balinese
325: In_Sundanese
326: In_Batak
327: In_Lepcha
328: In_Ol_Chiki
329: In_Cyrillic_Extended_C
330: In_Georgian_Extended
331: In_Sundanese_Supplement
332: In_Vedic_Extensions
333: In_Phonetic_Extensions
334: In_Phonetic_Extensions_Supplement
335: In_Combining_Diacritical_Marks_Supplement
336: In_Latin_Extended_Additional
337: In_Greek_Extended
338: In_General_Punctuation
339: In_Superscripts_and_Subscripts
340: In_Currency_Symbols
341: In_Combining_Diacritical_Marks_for_Symbols
342: In_Letterlike_Symbols
343: In_Number_Forms
344: In_Arrows
345: In_Mathematical_Operators
346: In_Miscellaneous_Technical
347: In_Control_Pictures
348: In_Optical_Character_Recognition
349: In_Enclosed_Alphanumerics
350: In_Box_Drawing
351: In_Block_Elements
352: In_Geometric_Shapes
353: In_Miscellaneous_Symbols
354: In_Dingbats
355: In_Miscellaneous_Mathematical_Symbols_A
356: In_Supplemental_Arrows_A
357: In_Braille_Patterns
358: In_Supplemental_Arrows_B
359: In_Miscellaneous_Mathematical_Symbols_B
360: In_Supplemental_Mathematical_Operators
361: In_Miscellaneous_Symbols_and_Arrows
362: In_Glagolitic
363: In_Latin_Extended_C
364: In_Coptic
365: In_Georgian_Supplement
366: In_Tifinagh
367: In_Ethiopic_Extended
368: In_Cyrillic_Extended_A
369: In_Supplemental_Punctuation
370: In_CJK_Radicals_Supplement
371: In_Kangxi_Radicals
372: In_Ideographic_Description_Characters
373: In_CJK_Symbols_and_Punctuation
374: In_Hiragana
375: In_Katakana
376: In_Bopomofo
377: In_Hangul_Compatibility_Jamo
378: In_Kanbun
379: In_Bopomofo_Extended
380: In_CJK_Strokes
381: In_Katakana_Phonetic_Extensions
382: In_Enclosed_CJK_Letters_and_Months
383: In_CJK_Compatibility
384: In_CJK_Unified_Ideographs_Extension_A
385: In_Yijing_Hexagram_Symbols
386: In_CJK_Unified_Ideographs
387: In_Yi_Syllables
388: In_Yi_Radicals
389: In_Lisu
390: In_Vai
391: In_Cyrillic_Extended_B
392: In_Bamum
393: In_Modifier_Tone_Letters
394: In_Latin_Extended_D
395: In_Syloti_Nagri
396: In_Common_Indic_Number_Forms
397: In_Phags_pa
398: In_Saurashtra
399: In_Devanagari_Extended
400: In_Kayah_Li
401: In_Rejang
402: In_Hangul_Jamo_Extended_A
403: In_Javanese
404: In_Myanmar_Extended_B
405: In_Cham
406: In_Myanmar_Extended_A
407: In_Tai_Viet
408: In_Meetei_Mayek_Extensions
409: In_Ethiopic_Extended_A
410: In_Latin_Extended_E
411: In_Cherokee_Supplement
412: In_Meetei_Mayek
413: In_Hangul_Syllables
414: In_Hangul_Jamo_Extended_B
415: In_High_Surrogates
416: In_High_Private_Use_Surrogates
417: In_Low_Surrogates
418: In_Private_Use_Area
419: In_CJK_Compatibility_Ideographs
420: In_Alphabetic_Presentation_Forms
421: In_Arabic_Presentation_Forms_A
422: In_Variation_Selectors
423: In_Vertical_Forms
424: In_Combining_Half_Marks
425: In_CJK_Compatibility_Forms
426: In_Small_Form_Variants
427: In_Arabic_Presentation_Forms_B
428: In_Halfwidth_and_Fullwidth_Forms
429: In_Specials
430: In_Linear_B_Syllabary
431: In_Linear_B_Ideograms
432: In_Aegean_Numbers
433: In_Ancient_Greek_Numbers
434: In_Ancient_Symbols
435: In_Phaistos_Disc
436: In_Lycian
437: In_Carian
438: In_Coptic_Epact_Numbers
439: In_Old_Italic
440: In_Gothic
441: In_Old_Permic
442: In_Ugaritic
443: In_Old_Persian
444: In_Deseret
445: In_Shavian
446: In_Osmanya
447: In_Osage
448: In_Elbasan
449: In_Caucasian_Albanian
450: In_Linear_A
451: In_Cypriot_Syllabary
452: In_Imperial_Aramaic
453: In_Palmyrene
454: In_Nabataean
455: In_Hatran
456: In_Phoenician
457: In_Lydian
458: In_Meroitic_Hieroglyphs
459: In_Meroitic_Cursive
460: In_Kharoshthi
461: In_Old_South_Arabian
462: In_Old_North_Arabian
463: In_Manichaean
464: In_Avestan
465: In_Inscriptional_Parthian
466: In_Inscriptional_Pahlavi
467: In_Psalter_Pahlavi
468: In_Old_Turkic
469: In_Old_Hungarian
470: In_Hanifi_Rohingya
471: In_Rumi_Numeral_Symbols
472: In_Old_Sogdian
473: In_Sogdian
474: In_Elymaic
475: In_Brahmi
476: In_Kaithi
477: In_Sora_Sompeng
478: In_Chakma
479: In_Mahajani
480: In_Sharada
481: In_Sinhala_Archaic_Numbers
482: In_Khojki
483: In_Multani
484: In_Khudawadi
485: In_Grantha
486: In_Newa
487: In_Tirhuta
488: In_Siddham
489: In_Modi
490: In_Mongolian_Supplement
491: In_Takri
492: In_Ahom
493: In_Dogra
494: In_Warang_Citi
495: In_Nandinagari
496: In_Zanabazar_Square
497: In_Soyombo
498: In_Pau_Cin_Hau
499: In_Bhaiksuki
500: In_Marchen
501: In_Masaram_Gondi
502: In_Gunjala_Gondi
503: In_Makasar
504: In_Tamil_Supplement
505: In_Cuneiform
506: In_Cuneiform_Numbers_and_Punctuation
507: In_Early_Dynastic_Cuneiform
508: In_Egyptian_Hieroglyphs
509: In_Egyptian_Hieroglyph_Format_Controls
510: In_Anatolian_Hieroglyphs
511: In_Bamum_Supplement
512: In_Mro
513: In_Bassa_Vah
514: In_Pahawh_Hmong
515: In_Medefaidrin
516: In_Miao
517: In_Ideographic_Symbols_and_Punctuation
518: In_Tangut
519: In_Tangut_Components
520: In_Kana_Supplement
521: In_Kana_Extended_A
522: In_Small_Kana_Extension
523: In_Nushu
524: In_Duployan
525: In_Shorthand_Format_Controls
526: In_Byzantine_Musical_Symbols
527: In_Musical_Symbols
528: In_Ancient_Greek_Musical_Notation
529: In_Mayan_Numerals
530: In_Tai_Xuan_Jing_Symbols
531: In_Counting_Rod_Numerals
532: In_Mathematical_Alphanumeric_Symbols
533: In_Sutton_SignWriting
534: In_Glagolitic_Supplement
535: In_Nyiakeng_Puachue_Hmong
536: In_Wancho
537: In_Mende_Kikakui
538: In_Adlam
539: In_Indic_Siyaq_Numbers
540: In_Ottoman_Siyaq_Numbers
541: In_Arabic_Mathematical_Alphabetic_Symbols
542: In_Mahjong_Tiles
543: In_Domino_Tiles
544: In_Playing_Cards
545: In_Enclosed_Alphanumeric_Supplement
546: In_Enclosed_Ideographic_Supplement
547: In_Miscellaneous_Symbols_and_Pictographs
548: In_Emoticons
549: In_Ornamental_Dingbats
550: In_Transport_and_Map_Symbols
551: In_Alchemical_Symbols
552: In_Geometric_Shapes_Extended
553: In_Supplemental_Arrows_C
554: In_Supplemental_Symbols_and_Pictographs
555: In_Chess_Symbols
556: In_Symbols_and_Pictographs_Extended_A
557: In_CJK_Unified_Ideographs_Extension_B
558: In_CJK_Unified_Ideographs_Extension_C
559: In_CJK_Unified_Ideographs_Extension_D
560: In_CJK_Unified_Ideographs_Extension_E
561: In_CJK_Unified_Ideographs_Extension_F
562: In_CJK_Compatibility_Ideographs_Supplement
563: In_Tags
564: In_Variation_Selectors_Supplement
565: In_Supplementary_Private_Use_Area_A
566: In_Supplementary_Private_Use_Area_B
567: In_No_Block
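
Each entry above pairs a number with a property name, and a number that appears more than once (for example 56 for Coptic, Copt, and Qaac) appears to mark alternative names for the same property. The following is a minimal Python sketch of how such a list could be grouped by number. It is not part of this file or of any library: `ENTRY` and `group_by_index` are hypothetical names, and the sample text is a small excerpt copied from the list above.

```python
import re
from collections import defaultdict

# Hypothetical pattern for lines shaped like " 16: Adlam" (index, colon, name).
ENTRY = re.compile(r"^\s*(\d+):\s*(\S+)\s*$")


def group_by_index(text):
    """Map each numeric index to every name listed with it, in file order."""
    groups = defaultdict(list)
    for line in text.splitlines():
        match = ENTRY.match(line)
        if match:  # the title line and blank lines do not match and are skipped
            groups[int(match.group(1))].append(match.group(2))
    return dict(groups)


# A short excerpt of the list above, used only for illustration.
SAMPLE = """\
Unicode Properties (from Unicode Version: 12.0.0)

 16: Adlam
 56: Coptic
106: Inherited
 16: Adlm
 56: Copt
 56: Qaac
106: Qaai
106: Zinh
"""

groups = group_by_index(SAMPLE)
print(groups[56])  # ['Coptic', 'Copt', 'Qaac']
```

Grouping by number rather than by name keeps every spelling of a property together, which makes it easy to check whether two names in the list refer to the same entry.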