Unicode Properties (from Unicode Version: 11.0.0)

 15: ASCII_Hex_Digit
 16: Adlam
 17: Ahom
 18: Alphabetic
 19: Anatolian_Hieroglyphs
 20: Any
 21: Arabic
 22: Armenian
 23: Assigned
 24: Avestan
 25: Balinese
 26: Bamum
 27: Bassa_Vah
 28: Batak
 29: Bengali
 30: Bhaiksuki
 31: Bidi_Control
 32: Bopomofo
 33: Brahmi
 34: Braille
 35: Buginese
 36: Buhid
 37: C
 38: Canadian_Aboriginal
 39: Carian
 40: Case_Ignorable
 41: Cased
 42: Caucasian_Albanian
 43: Cc
 44: Cf
 45: Chakma
 46: Cham
 47: Changes_When_Casefolded
 48: Changes_When_Casemapped
 49: Changes_When_Lowercased
 50: Changes_When_Titlecased
 51: Changes_When_Uppercased
 52: Cherokee
 53: Cn
 54: Co
 55: Common
 56: Coptic
 57: Cs
 58: Cuneiform
 59: Cypriot
 60: Cyrillic
 61: Dash
 62: Default_Ignorable_Code_Point
 63: Deprecated
 64: Deseret
 65: Devanagari
 66: Diacritic
 67: Dogra
 68: Duployan
 69: Egyptian_Hieroglyphs
 70: Elbasan
 71: Emoji
 72: Emoji_Component
 73: Emoji_Modifier
 74: Emoji_Modifier_Base
 75: Emoji_Presentation
 76: Ethiopic
 77: Extended_Pictographic
 78: Extender
 79: Georgian
 80: Glagolitic
 81: Gothic
 82: Grantha
 83: Grapheme_Base
 84: Grapheme_Extend
 85: Grapheme_Link
 86: Greek
 87: Gujarati
 88: Gunjala_Gondi
 89: Gurmukhi
 90: Han
 91: Hangul
 92: Hanifi_Rohingya
 93: Hanunoo
 94: Hatran
 95: Hebrew
 96: Hex_Digit
 97: Hiragana
 98: Hyphen
 99: IDS_Binary_Operator
100: IDS_Trinary_Operator
101: ID_Continue
102: ID_Start
103: Ideographic
104: Imperial_Aramaic
105: Inherited
106: Inscriptional_Pahlavi
107: Inscriptional_Parthian
108: Javanese
109: Join_Control
110: Kaithi
111: Kannada
112: Katakana
113: Kayah_Li
114: Kharoshthi
115: Khmer
116: Khojki
117: Khudawadi
118: L
119: LC
120: Lao
121: Latin
122: Lepcha
123: Limbu
124: Linear_A
125: Linear_B
126: Lisu
127: Ll
128: Lm
129: Lo
130: Logical_Order_Exception
131: Lowercase
132: Lt
133: Lu
134: Lycian
135: Lydian
136: M
137: Mahajani
138: Makasar
139: Malayalam
140: Mandaic
141: Manichaean
142: Marchen
143: Masaram_Gondi
144: Math
145: Mc
146: Me
147: Medefaidrin
148: Meetei_Mayek
149: Mende_Kikakui
150: Meroitic_Cursive
151: Meroitic_Hieroglyphs
152: Miao
153: Mn
154: Modi
155: Mongolian
156: Mro
157: Multani
158: Myanmar
159: N
160: Nabataean
161: Nd
162: New_Tai_Lue
163: Newa
164: Nko
165: Nl
166: No
167: Noncharacter_Code_Point
168: Nushu
169: Ogham
170: Ol_Chiki
171: Old_Hungarian
172: Old_Italic
173: Old_North_Arabian
174: Old_Permic
175: Old_Persian
176: Old_Sogdian
177: Old_South_Arabian
178: Old_Turkic
179: Oriya
180: Osage
181: Osmanya
182: Other_Alphabetic
183: Other_Default_Ignorable_Code_Point
184: Other_Grapheme_Extend
185: Other_ID_Continue
186: Other_ID_Start
187: Other_Lowercase
188: Other_Math
189: Other_Uppercase
190: P
191: Pahawh_Hmong
192: Palmyrene
193: Pattern_Syntax
194: Pattern_White_Space
195: Pau_Cin_Hau
196: Pc
197: Pd
198: Pe
199: Pf
200: Phags_Pa
201: Phoenician
202: Pi
203: Po
204: Prepended_Concatenation_Mark
205: Ps
206: Psalter_Pahlavi
207: Quotation_Mark
208: Radical
209: Regional_Indicator
210: Rejang
211: Runic
212: S
213: Samaritan
214: Saurashtra
215: Sc
216: Sentence_Terminal
217: Sharada
218: Shavian
219: Siddham
220: SignWriting
221: Sinhala
222: Sk
223: Sm
224: So
225: Soft_Dotted
226: Sogdian
227: Sora_Sompeng
228: Soyombo
229: Sundanese
230: Syloti_Nagri
231: Syriac
232: Tagalog
233: Tagbanwa
234: Tai_Le
235: Tai_Tham
236: Tai_Viet
237: Takri
238: Tamil
239: Tangut
240: Telugu
241: Terminal_Punctuation
242: Thaana
243: Thai
244: Tibetan
245: Tifinagh
246: Tirhuta
247: Ugaritic
248: Unified_Ideograph
249: Unknown
250: Uppercase
251: Vai
252: Variation_Selector
253: Warang_Citi
254: White_Space
255: XID_Continue
256: XID_Start
257: Yi
258: Z
259: Zanabazar_Square
260: Zl
261: Zp
262: Zs
 16: Adlm
 42: Aghb
 15: AHex
 21: Arab
104: Armi
 22: Armn
 24: Avst
 25: Bali
 26: Bamu
 27: Bass
 28: Batk
 29: Beng
 30: Bhks
 31: Bidi_C
 32: Bopo
 33: Brah
 34: Brai
 35: Bugi
 36: Buhd
 45: Cakm
 38: Cans
 39: Cari
119: Cased_Letter
 52: Cher
 40: CI
198: Close_Punctuation
136: Combining_Mark
196: Connector_Punctuation
 43: Control
 56: Copt
 59: Cprt
215: Currency_Symbol
 47: CWCF
 48: CWCM
 49: CWL
 50: CWT
 51: CWU
 60: Cyrl
197: Dash_Punctuation
161: Decimal_Number
 63: Dep
 65: Deva
 62: DI
 66: Dia
 67: Dogr
 64: Dsrt
 68: Dupl
 69: Egyp
 70: Elba
146: Enclosing_Mark
 76: Ethi
 78: Ext
199: Final_Punctuation
 44: Format
 79: Geor
 80: Glag
 88: Gong
143: Gonm
 81: Goth
 82: Gran
 83: Gr_Base
 86: Grek
 84: Gr_Ext
 85: Gr_Link
 87: Gujr
 89: Guru
 91: Hang
 90: Hani
 93: Hano
 94: Hatr
 95: Hebr
 96: Hex
 97: Hira
 19: Hluw
191: Hmng
171: Hung
101: IDC
103: Ideo
102: IDS
 99: IDSB
100: IDST
202: Initial_Punctuation
172: Ital
108: Java
109: Join_C
113: Kali
112: Kana
114: Khar
115: Khmr
116: Khoj
111: Knda
110: Kthi
235: Lana
120: Laoo
121: Latn
122: Lepc
118: Letter
165: Letter_Number
123: Limb
124: Lina
125: Linb
260: Line_Separator
130: LOE
127: Lowercase_Letter
134: Lyci
135: Lydi
137: Mahj
138: Maka
140: Mand
141: Mani
142: Marc
136: Mark
223: Math_Symbol
147: Medf
149: Mend
150: Merc
151: Mero
139: Mlym
128: Modifier_Letter
222: Modifier_Symbol
155: Mong
156: Mroo
148: Mtei
157: Mult
158: Mymr
173: Narb
160: Nbat
167: NChar
164: Nkoo
153: Nonspacing_Mark
168: Nshu
159: Number
182: OAlpha
183: ODI
169: Ogam
184: OGr_Ext
185: OIDC
186: OIDS
170: Olck
187: OLower
188: OMath
205: Open_Punctuation
178: Orkh
179: Orya
180: Osge
181: Osma
 37: Other
129: Other_Letter
166: Other_Number
203: Other_Punctuation
224: Other_Symbol
189: OUpper
192: Palm
261: Paragraph_Separator
193: Pat_Syn
194: Pat_WS
195: Pauc
204: PCM
174: Perm
200: Phag
106: Phli
206: Phlp
201: Phnx
152: Plrd
 54: Private_Use
107: Prti
190: Punctuation
 56: Qaac
105: Qaai
207: QMark
209: RI
210: Rjng
 92: Rohg
211: Runr
213: Samr
177: Sarb
214: Saur
225: SD
258: Separator
220: Sgnw
218: Shaw
217: Shrd
219: Sidd
117: Sind
221: Sinh
226: Sogd
176: Sogo
227: Sora
228: Soyo
262: Space_Separator
145: Spacing_Mark
216: STerm
229: Sund
 57: Surrogate
230: Sylo
212: Symbol
231: Syrc
233: Tagb
237: Takr
234: Tale
162: Talu
238: Taml
239: Tang
236: Tavt
240: Telu
241: Term
245: Tfng
232: Tglg
242: Thaa
244: Tibt
246: Tirh
132: Titlecase_Letter
247: Ugar
248: UIdeo
 53: Unassigned
133: Uppercase_Letter
251: Vaii
252: VS
253: Wara
254: WSpace
255: XIDC
256: XIDS
175: Xpeo
 58: Xsux
257: Yiii
259: Zanb
105: Zinh
 55: Zyyy
249: Zzzz
263: In_Basic_Latin
264: In_Latin_1_Supplement
265: In_Latin_Extended_A
266: In_Latin_Extended_B
267: In_IPA_Extensions
268: In_Spacing_Modifier_Letters
269: In_Combining_Diacritical_Marks
270: In_Greek_and_Coptic
271: In_Cyrillic
272: In_Cyrillic_Supplement
273: In_Armenian
274: In_Hebrew
275: In_Arabic
276: In_Syriac
277: In_Arabic_Supplement
278: In_Thaana
279: In_NKo
280: In_Samaritan
281: In_Mandaic
282: In_Syriac_Supplement
283: In_Arabic_Extended_A
284: In_Devanagari
285: In_Bengali
286: In_Gurmukhi
287: In_Gujarati
288: In_Oriya
289: In_Tamil
290: In_Telugu
291: In_Kannada
292: In_Malayalam
293: In_Sinhala
294: In_Thai
295: In_Lao
296: In_Tibetan
297: In_Myanmar
298: In_Georgian
299: In_Hangul_Jamo
300: In_Ethiopic
301: In_Ethiopic_Supplement
302: In_Cherokee
303: In_Unified_Canadian_Aboriginal_Syllabics
304: In_Ogham
305: In_Runic
306: In_Tagalog
307: In_Hanunoo
308: In_Buhid
309: In_Tagbanwa
310: In_Khmer
311: In_Mongolian
312: In_Unified_Canadian_Aboriginal_Syllabics_Extended
313: In_Limbu
314: In_Tai_Le
315: In_New_Tai_Lue
316: In_Khmer_Symbols
317: In_Buginese
318: In_Tai_Tham
319: In_Combining_Diacritical_Marks_Extended
320: In_Balinese
321: In_Sundanese
322: In_Batak
323: In_Lepcha
324: In_Ol_Chiki
325: In_Cyrillic_Extended_C
326: In_Georgian_Extended
327: In_Sundanese_Supplement
328: In_Vedic_Extensions
329: In_Phonetic_Extensions
330: In_Phonetic_Extensions_Supplement
331: In_Combining_Diacritical_Marks_Supplement
332: In_Latin_Extended_Additional
333: In_Greek_Extended
334: In_General_Punctuation
335: In_Superscripts_and_Subscripts
336: In_Currency_Symbols
337: In_Combining_Diacritical_Marks_for_Symbols
338: In_Letterlike_Symbols
339: In_Number_Forms
340: In_Arrows
341: In_Mathematical_Operators
342: In_Miscellaneous_Technical
343: In_Control_Pictures
344: In_Optical_Character_Recognition
345: In_Enclosed_Alphanumerics
346: In_Box_Drawing
347: In_Block_Elements
348: In_Geometric_Shapes
349: In_Miscellaneous_Symbols
350: In_Dingbats
351: In_Miscellaneous_Mathematical_Symbols_A
352: In_Supplemental_Arrows_A
353: In_Braille_Patterns
354: In_Supplemental_Arrows_B
355: In_Miscellaneous_Mathematical_Symbols_B
356: In_Supplemental_Mathematical_Operators
357: In_Miscellaneous_Symbols_and_Arrows
358: In_Glagolitic
359: In_Latin_Extended_C
360: In_Coptic
361: In_Georgian_Supplement
362: In_Tifinagh
363: In_Ethiopic_Extended
364: In_Cyrillic_Extended_A
365: In_Supplemental_Punctuation
366: In_CJK_Radicals_Supplement
367: In_Kangxi_Radicals
368: In_Ideographic_Description_Characters
369: In_CJK_Symbols_and_Punctuation
370: In_Hiragana
371: In_Katakana
372: In_Bopomofo
373: In_Hangul_Compatibility_Jamo
374: In_Kanbun
375: In_Bopomofo_Extended
376: In_CJK_Strokes
377: In_Katakana_Phonetic_Extensions
378: In_Enclosed_CJK_Letters_and_Months
379: In_CJK_Compatibility
380: In_CJK_Unified_Ideographs_Extension_A
381: In_Yijing_Hexagram_Symbols
382: In_CJK_Unified_Ideographs
383: In_Yi_Syllables
384: In_Yi_Radicals
385: In_Lisu
386: In_Vai
387: In_Cyrillic_Extended_B
388: In_Bamum
389: In_Modifier_Tone_Letters
390: In_Latin_Extended_D
391: In_Syloti_Nagri
392: In_Common_Indic_Number_Forms
393: In_Phags_pa
394: In_Saurashtra
395: In_Devanagari_Extended
396: In_Kayah_Li
397: In_Rejang
398: In_Hangul_Jamo_Extended_A
399: In_Javanese
400: In_Myanmar_Extended_B
401: In_Cham
402: In_Myanmar_Extended_A
403: In_Tai_Viet
404: In_Meetei_Mayek_Extensions
405: In_Ethiopic_Extended_A
406: In_Latin_Extended_E
407: In_Cherokee_Supplement
408: In_Meetei_Mayek
409: In_Hangul_Syllables
410: In_Hangul_Jamo_Extended_B
411: In_High_Surrogates
412: In_High_Private_Use_Surrogates
413: In_Low_Surrogates
414: In_Private_Use_Area
415: In_CJK_Compatibility_Ideographs
416: In_Alphabetic_Presentation_Forms
417: In_Arabic_Presentation_Forms_A
418: In_Variation_Selectors
419: In_Vertical_Forms
420: In_Combining_Half_Marks
421: In_CJK_Compatibility_Forms
422: In_Small_Form_Variants
423: In_Arabic_Presentation_Forms_B
424: In_Halfwidth_and_Fullwidth_Forms
425: In_Specials
426: In_Linear_B_Syllabary
427: In_Linear_B_Ideograms
428: In_Aegean_Numbers
429: In_Ancient_Greek_Numbers
430: In_Ancient_Symbols
431: In_Phaistos_Disc
432: In_Lycian
433: In_Carian
434: In_Coptic_Epact_Numbers
435: In_Old_Italic
436: In_Gothic
437: In_Old_Permic
438: In_Ugaritic
439: In_Old_Persian
440: In_Deseret
441: In_Shavian
442: In_Osmanya
443: In_Osage
444: In_Elbasan
445: In_Caucasian_Albanian
446: In_Linear_A
447: In_Cypriot_Syllabary
448: In_Imperial_Aramaic
449: In_Palmyrene
450: In_Nabataean
451: In_Hatran
452: In_Phoenician
453: In_Lydian
454: In_Meroitic_Hieroglyphs
455: In_Meroitic_Cursive
456: In_Kharoshthi
457: In_Old_South_Arabian
458: In_Old_North_Arabian
459: In_Manichaean
460: In_Avestan
461: In_Inscriptional_Parthian
462: In_Inscriptional_Pahlavi
463: In_Psalter_Pahlavi
464: In_Old_Turkic
465: In_Old_Hungarian
466: In_Hanifi_Rohingya
467: In_Rumi_Numeral_Symbols
468: In_Old_Sogdian
469: In_Sogdian
470: In_Brahmi
471: In_Kaithi
472: In_Sora_Sompeng
473: In_Chakma
474: In_Mahajani
475: In_Sharada
476: In_Sinhala_Archaic_Numbers
477: In_Khojki
478: In_Multani
479: In_Khudawadi
480: In_Grantha
481: In_Newa
482: In_Tirhuta
483: In_Siddham
484: In_Modi
485: In_Mongolian_Supplement
486: In_Takri
487: In_Ahom
488: In_Dogra
489: In_Warang_Citi
490: In_Zanabazar_Square
491: In_Soyombo
492: In_Pau_Cin_Hau
493: In_Bhaiksuki
494: In_Marchen
495: In_Masaram_Gondi
496: In_Gunjala_Gondi
497: In_Makasar
498: In_Cuneiform
499: In_Cuneiform_Numbers_and_Punctuation
500: In_Early_Dynastic_Cuneiform
501: In_Egyptian_Hieroglyphs
502: In_Anatolian_Hieroglyphs
503: In_Bamum_Supplement
504: In_Mro
505: In_Bassa_Vah
506: In_Pahawh_Hmong
507: In_Medefaidrin
508: In_Miao
509: In_Ideographic_Symbols_and_Punctuation
510: In_Tangut
511: In_Tangut_Components
512: In_Kana_Supplement
513: In_Kana_Extended_A
514: In_Nushu
515: In_Duployan
516: In_Shorthand_Format_Controls
517: In_Byzantine_Musical_Symbols
518: In_Musical_Symbols
519: In_Ancient_Greek_Musical_Notation
520: In_Mayan_Numerals
521: In_Tai_Xuan_Jing_Symbols
522: In_Counting_Rod_Numerals
523: In_Mathematical_Alphanumeric_Symbols
524: In_Sutton_SignWriting
525: In_Glagolitic_Supplement
526: In_Mende_Kikakui
527: In_Adlam
528: In_Indic_Siyaq_Numbers
529: In_Arabic_Mathematical_Alphabetic_Symbols
530: In_Mahjong_Tiles
531: In_Domino_Tiles
532: In_Playing_Cards
533: In_Enclosed_Alphanumeric_Supplement
534: In_Enclosed_Ideographic_Supplement
535: In_Miscellaneous_Symbols_and_Pictographs
536: In_Emoticons
537: In_Ornamental_Dingbats
538: In_Transport_and_Map_Symbols
539: In_Alchemical_Symbols
540: In_Geometric_Shapes_Extended
541: In_Supplemental_Arrows_C
542: In_Supplemental_Symbols_and_Pictographs
543: In_Chess_Symbols
544: In_CJK_Unified_Ideographs_Extension_B
545: In_CJK_Unified_Ideographs_Extension_C
546: In_CJK_Unified_Ideographs_Extension_D
547: In_CJK_Unified_Ideographs_Extension_E
548: In_CJK_Unified_Ideographs_Extension_F
549: In_CJK_Compatibility_Ideographs_Supplement
550: In_Tags
551: In_Variation_Selectors_Supplement
552: In_Supplementary_Private_Use_Area_A
553: In_Supplementary_Private_Use_Area_B
554: In_No_Block
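
The listing above repeats some numbers: for example, 16 appears next to both Adlam and Adlm, and 105 next to Inherited, Qaai and Zinh. Reading entries that share a number as alternative names for the same property is an assumption made here, not something this file states. Under that assumption, a minimal Python sketch can group the names by number. The SAMPLE text, LINE_RE and group_by_index are hypothetical names introduced for illustration, and only a handful of lines are copied from the listing.

```python
import re
from collections import defaultdict

# A few lines copied from the listing above; the full file has many more.
SAMPLE = """\
Unicode Properties (from Unicode Version: 11.0.0)

 15: ASCII_Hex_Digit
 16: Adlam
 56: Coptic
105: Inherited
 16: Adlm
 15: AHex
 56: Copt
 56: Qaac
105: Qaai
105: Zinh
263: In_Basic_Latin
"""

# One entry per line: optional padding, a number, a colon, then a name.
LINE_RE = re.compile(r"^\s*(\d+):\s*(\S+)\s*$")


def group_by_index(text):
    """Map each number to the names listed for it, in file order."""
    groups = defaultdict(list)
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:  # the title line and blank lines do not match
            groups[int(match.group(1))].append(match.group(2))
    return dict(groups)


groups = group_by_index(SAMPLE)
for index, names in sorted(groups.items()):
    # Assumption: the first name listed is the long form, the rest are aliases.
    print(index, names[0], "aliases:", names[1:])
```

Run against the whole file instead of SAMPLE, the same function would list every name recorded for each number.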