author | Jörg Frings-Fürst <debian@jff.email> | 2018-03-07 05:54:53 +0100 |
---|---|---|
committer | Jörg Frings-Fürst <debian@jff.email> | 2018-03-07 05:54:53 +0100 |
commit | 76ef1d8e3249e82a6965fd17157bee00a7857ff3 (patch) | |
tree | 7d3d34b059039faf525d1e95bbdc1945a9fa103c /doc/libunistring.info | |
parent | 0cb66c451a1a4e717878b8296b79c8d7cfd38b30 (diff) | |
parent | 93e8e16be294d19261c7378dd2e46d3f35f06926 (diff) |
Merge branch 'feature/upstream' into develop
Diffstat (limited to 'doc/libunistring.info')
-rw-r--r-- | doc/libunistring.info | 1195 |
1 files changed, 696 insertions, 499 deletions
diff --git a/doc/libunistring.info b/doc/libunistring.info index d1fdfa2..c4be8a4 100644 --- a/doc/libunistring.info +++ b/doc/libunistring.info @@ -33,6 +33,7 @@ GNU libunistring * uniregex.h:: Regular expressions * Using the library:: How to link with the library and use it? * More functionality:: More advanced functionality +* The wchar_t mess:: Why ‘wchar_t *’ strings are useless * Licenses:: Licenses * Index:: General Index @@ -46,7 +47,6 @@ Introduction * Locale encodings:: What is a locale encoding? * In-memory representation:: How to represent strings in memory? * char * strings:: What to keep in mind with ‘char *’ strings -* The wchar_t mess:: Why ‘wchar_t *’ strings are useless * Unicode strings:: How are Unicode strings represented? unistr.h @@ -57,6 +57,26 @@ unistr.h * Elementary string functions with memory allocation:: * Elementary string functions on NUL terminated strings:: +Elementary string functions + +* Iterating:: +* Creating Unicode strings:: +* Copying Unicode strings:: +* Comparing Unicode strings:: +* Searching for a character:: +* Counting characters:: + +Elementary string functions on NUL terminated strings + +* Iterating over a NUL terminated Unicode string:: +* Length:: +* Copying a NUL terminated Unicode string:: +* Comparing NUL terminated Unicode strings:: +* Duplicating a NUL terminated Unicode string:: +* Searching for a character in a NUL terminated Unicode string:: +* Searching for a substring:: +* Tokenizing:: + unictype.h * General category:: @@ -248,7 +268,7 @@ having text in multiple languages present in the same document or even in the same line of text. But use of Unicode is not everything. Internationalization usually -consists of three features: +consists of four features: • Use of Unicode where needed for text processing. This is what this library is for. • Use of message catalogs for messages shown to the user, This is @@ -257,6 +277,9 @@ consists of three features: numeric formatting, or for sorting of text. 
This can be done adequately with the POSIX APIs and the implementation of locales in the GNU C library. + • In graphical user interfaces, adapting the GUI to the default text + direction of the current locale (see right-to-left languages + (https://en.wikipedia.org/wiki/Right-to-left)). File: libunistring.info, Node: Locale encodings, Next: In-memory representation, Prev: Unicode and i18n, Up: Introduction @@ -299,7 +322,7 @@ encoding that was used in this country earlier. The legacy locale encodings, ISO-8859-15 (which supplanted ISO-8859-1 in most of Europe), ISO-8859-2, KOI8-R, EUC-JP, etc., are still in use -in many places, though. +in some places, though. UTF-16 and UTF-32 are not used as locale encodings, because they are not ASCII compatible. @@ -326,8 +349,23 @@ program. • As ‘wchar_t *’, a.k.a. “wide strings”. This approach is misguided, see *note The wchar_t mess::. + Of course, a ‘char *’ string can, in some cases, be encoded in UTF-8. +You will use the data type depending on what you can guarantee about how +it’s encoded: If a string is encoded in the locale encoding, or if you +don’t know how it’s encoded, use ‘char *’. If, on the other hand, you +can _guarantee_ that it is UTF-8 encoded, then you can use the UTF-8 +string type, ‘uint8_t *’, for it. + + The five types ‘char *’, ‘uint8_t *’, ‘uint16_t *’, ‘uint32_t *’, and +‘wchar_t *’ are incompatible types at the C level. Therefore, ‘gcc +-Wall’ will produce a warning if, by mistake, your code contains a +mismatch between these types. In the context of using GNU libunistring, +even a warning about a mismatch between ‘char *’ and ‘uint8_t *’ is a +sign of a bug in your code that you should not try to silence through a +cast. 
+ -File: libunistring.info, Node: char * strings, Next: The wchar_t mess, Prev: In-memory representation, Up: Introduction +File: libunistring.info, Node: char * strings, Next: Unicode strings, Prev: In-memory representation, Up: Introduction 1.5 ‘char *’ strings ==================== @@ -426,53 +464,9 @@ assumptions built-in that are not valid in some languages: in ‘<unicase.h>’, see *note unicase.h::. -File: libunistring.info, Node: The wchar_t mess, Next: Unicode strings, Prev: char * strings, Up: Introduction - -1.6 The ‘wchar_t’ mess -====================== - - The ISO C and POSIX standard creators made an attempt to fix the -first problem mentioned in the previous section. They introduced - • a type ‘wchar_t’, designed to encapsulate an entire character, - • a “wide string” type ‘wchar_t *’, and - • functions declared in ‘<wctype.h>’ that were meant to supplant the - ones in ‘<ctype.h>’. - - Unfortunately, this API and its implementation has numerous problems: - - • On AIX and Windows platforms, ‘wchar_t’ is a 16-bit type. This - means that it can never accommodate an entire Unicode character. - Either the ‘wchar_t *’ strings are limited to characters in UCS-2 - (the “Basic Multilingual Plane” of Unicode), or — if ‘wchar_t *’ - strings are encoded in UTF-16 — a ‘wchar_t’ represents only half of - a character in the worst case, making the ‘<wctype.h>’ functions - pointless. - - • On Solaris and FreeBSD, the ‘wchar_t’ encoding is locale dependent - and undocumented. This means, if you want to know any property of - a ‘wchar_t’ character, other than the properties defined by - ‘<wctype.h>’ — such as whether it’s a dash, currency symbol, - paragraph separator, or similar —, you have to convert it to ‘char - *’ encoding first, by use of the function ‘wctomb’. 
- - • When you read a stream of wide characters, through the functions - ‘fgetwc’ and ‘fgetws’, and when the input stream/file is not in the - expected encoding, you have no way to determine the invalid byte - sequence and do some corrective action. If you use these - functions, your program becomes “garbage in - more garbage out” or - “garbage in - abort”. - - As a consequence, it is better to use multibyte strings, as explained -in the previous section. Such multibyte strings can bypass limitations -of the ‘wchar_t’ type, if you use functions defined in gnulib and -libunistring for text processing. They can also faithfully transport -malformed characters that were present in the input, without requiring -the program to produce garbage or abort. - - -File: libunistring.info, Node: Unicode strings, Prev: The wchar_t mess, Up: Introduction +File: libunistring.info, Node: Unicode strings, Prev: char * strings, Up: Introduction -1.7 Unicode strings +1.6 Unicode strings =================== libunistring supports Unicode strings in three representations: @@ -572,6 +566,15 @@ File: libunistring.info, Node: unitypes.h, Next: unistr.h, Prev: Conventions, This type represents a single Unicode character, outside of an UTF-32 string. + The types ‘ucs4_t’ and ‘uint32_t’ happen to be identical. They +differ in use and intent, however: + • Use ‘uint32_t *’ to designate an UTF-32 string. Use ‘ucs4_t’ to + designate a single Unicode character, outside of an UTF-32 string. + • Conversions functions that take an UTF-32 string as input will + usually perform a range-check on the ‘uint32_t’ values. Whereas + functions that are declared to take ‘ucs4_t’ arguments will not + perform such a range-check. + File: libunistring.info, Node: unistr.h, Next: uniconv.h, Prev: unitypes.h, Up: Top @@ -618,32 +621,65 @@ forms of Unicode strings. *RESULTBUF, size_t *LENGTHP) Converts an UTF-8 string to an UTF-16 string. 
+ The RESULTBUF and LENGTHP arguments are as described in chapter + *note Conventions::. + -- Function: uint32_t * u8_to_u32 (const uint8_t *S, size_t N, uint32_t *RESULTBUF, size_t *LENGTHP) Converts an UTF-8 string to an UTF-32 string. + The RESULTBUF and LENGTHP arguments are as described in chapter + *note Conventions::. + -- Function: uint8_t * u16_to_u8 (const uint16_t *S, size_t N, uint8_t *RESULTBUF, size_t *LENGTHP) Converts an UTF-16 string to an UTF-8 string. + The RESULTBUF and LENGTHP arguments are as described in chapter + *note Conventions::. + -- Function: uint32_t * u16_to_u32 (const uint16_t *S, size_t N, uint32_t *RESULTBUF, size_t *LENGTHP) Converts an UTF-16 string to an UTF-32 string. + The RESULTBUF and LENGTHP arguments are as described in chapter + *note Conventions::. + -- Function: uint8_t * u32_to_u8 (const uint32_t *S, size_t N, uint8_t *RESULTBUF, size_t *LENGTHP) Converts an UTF-32 string to an UTF-8 string. + The RESULTBUF and LENGTHP arguments are as described in chapter + *note Conventions::. + -- Function: uint16_t * u32_to_u16 (const uint32_t *S, size_t N, uint16_t *RESULTBUF, size_t *LENGTHP) Converts an UTF-32 string to an UTF-16 string. + The RESULTBUF and LENGTHP arguments are as described in chapter + *note Conventions::. + File: libunistring.info, Node: Elementary string functions, Next: Elementary string functions with memory allocation, Prev: Elementary string conversions, Up: unistr.h 4.3 Elementary string functions =============================== +* Menu: + +* Iterating:: +* Creating Unicode strings:: +* Copying Unicode strings:: +* Comparing Unicode strings:: +* Searching for a character:: +* Counting characters:: + + +File: libunistring.info, Node: Iterating, Next: Creating Unicode strings, Up: Elementary string functions + +4.3.1 Iterating over a Unicode string +------------------------------------- + The following functions inspect and return details about the first character in a Unicode string. 
@@ -657,12 +693,9 @@ character in a Unicode string. This function is similar to ‘mblen’, except that it operates on a Unicode string and that S must not be NULL. - -- Function: int u8_mbtouc_unsafe (ucs4_t *PUC, const uint8_t *S, - size_t N) - -- Function: int u16_mbtouc_unsafe (ucs4_t *PUC, const uint16_t *S, - size_t N) - -- Function: int u32_mbtouc_unsafe (ucs4_t *PUC, const uint32_t *S, - size_t N) + -- Function: int u8_mbtouc (ucs4_t *PUC, const uint8_t *S, size_t N) + -- Function: int u16_mbtouc (ucs4_t *PUC, const uint16_t *S, size_t N) + -- Function: int u32_mbtouc (ucs4_t *PUC, const uint32_t *S, size_t N) Returns the length (number of units) of the first character in S, putting its ‘ucs4_t’ representation in ‘*PUC’. Upon failure, ‘*PUC’ is set to ‘0xfffd’, and an appropriate number of units is @@ -670,16 +703,23 @@ character in a Unicode string. The number of available units, N, must be > 0. + This function fails if an invalid sequence of units is encountered + at the beginning of S, or if additional units (after the N provided + units) would be needed to form a character. + This function is similar to ‘mbtowc’, except that it operates on a Unicode string, PUC and S must not be NULL, N must be > 0, and the NUL character is not treated specially. - -- Function: int u8_mbtouc (ucs4_t *PUC, const uint8_t *S, size_t N) - -- Function: int u16_mbtouc (ucs4_t *PUC, const uint16_t *S, size_t N) - -- Function: int u32_mbtouc (ucs4_t *PUC, const uint32_t *S, size_t N) - This function is like ‘u8_mbtouc_unsafe’, except that it will - detect an invalid UTF-8 character, even if the library is compiled - without ‘--enable-safety’. + -- Function: int u8_mbtouc_unsafe (ucs4_t *PUC, const uint8_t *S, + size_t N) + -- Function: int u16_mbtouc_unsafe (ucs4_t *PUC, const uint16_t *S, + size_t N) + -- Function: int u32_mbtouc_unsafe (ucs4_t *PUC, const uint32_t *S, + size_t N) + This function is identical to + ‘u8_mbtouc’/‘u16_mbtouc’/‘u32_mbtouc’. 
Earlier versions of this + function performed fewer range-checks on the sequence of units. -- Function: int u8_mbtoucr (ucs4_t *PUC, const uint8_t *S, size_t N) -- Function: int u16_mbtoucr (ucs4_t *PUC, const uint16_t *S, size_t N) @@ -695,6 +735,12 @@ character in a Unicode string. This function is similar to ‘u8_mbtouc’, except that the return value gives more details about the failure, similar to ‘mbrtowc’. + +File: libunistring.info, Node: Creating Unicode strings, Next: Copying Unicode strings, Prev: Iterating, Up: Elementary string functions + +4.3.2 Creating Unicode strings one character at a time +------------------------------------------------------ + The following function stores a Unicode character as a Unicode string in memory. @@ -710,6 +756,12 @@ in memory. Unicode strings, S must not be NULL, and the argument N must be specified. + +File: libunistring.info, Node: Copying Unicode strings, Next: Comparing Unicode strings, Prev: Creating Unicode strings, Up: Elementary string functions + +4.3.3 Copying Unicode strings +----------------------------- + The following functions copy Unicode strings in memory. -- Function: uint8_t * u8_cpy (uint8_t *DEST, const uint8_t *SRC, @@ -746,6 +798,12 @@ in memory. This function is similar to ‘memset’, except that it operates on Unicode strings. + +File: libunistring.info, Node: Comparing Unicode strings, Next: Searching for a character, Prev: Copying Unicode strings, Up: Elementary string functions + +4.3.4 Comparing Unicode strings +------------------------------- + The following function compares two Unicode strings of the same length. @@ -778,6 +836,12 @@ different lengths. This function is similar to the gnulib function ‘memcmp2’, except that it operates on Unicode strings. 
+ +File: libunistring.info, Node: Searching for a character, Next: Counting characters, Prev: Comparing Unicode strings, Up: Elementary string functions + +4.3.5 Searching for a character in a Unicode string +--------------------------------------------------- + The following function searches for a given Unicode character. -- Function: uint8_t * u8_chr (const uint8_t *S, size_t N, ucs4_t UC) @@ -791,6 +855,12 @@ different lengths. This function is similar to ‘memchr’, except that it operates on Unicode strings. + +File: libunistring.info, Node: Counting characters, Prev: Searching for a character, Up: Elementary string functions + +4.3.6 Counting the characters in a Unicode string +------------------------------------------------- + The following function counts the number of Unicode characters. -- Function: size_t u8_mbsnlen (const uint8_t *S, size_t N) @@ -821,6 +891,23 @@ File: libunistring.info, Node: Elementary string functions on NUL terminated st 4.5 Elementary string functions on NUL terminated strings ========================================================= +* Menu: + +* Iterating over a NUL terminated Unicode string:: +* Length:: +* Copying a NUL terminated Unicode string:: +* Comparing NUL terminated Unicode strings:: +* Duplicating a NUL terminated Unicode string:: +* Searching for a character in a NUL terminated Unicode string:: +* Searching for a substring:: +* Tokenizing:: + + +File: libunistring.info, Node: Iterating over a NUL terminated Unicode string, Next: Length, Up: Elementary string functions on NUL terminated strings + +4.5.1 Iterating over a NUL terminated Unicode string +---------------------------------------------------- + The following functions inspect and return details about the first character in a Unicode string. @@ -859,6 +946,12 @@ previous character in a Unicode string. reached. Puts the character’s ‘ucs4_t’ representation in ‘*PUC’. Note that this function works only on well-formed Unicode strings. 
+ +File: libunistring.info, Node: Length, Next: Copying a NUL terminated Unicode string, Prev: Iterating over a NUL terminated Unicode string, Up: Elementary string functions on NUL terminated strings + +4.5.2 Length of a NUL terminated Unicode string +----------------------------------------------- + The following functions determine the length of a Unicode string. -- Function: size_t u8_strlen (const uint8_t *S) @@ -877,6 +970,12 @@ previous character in a Unicode string. This function is similar to ‘strnlen’ and ‘wcsnlen’, except that it operates on Unicode strings. + +File: libunistring.info, Node: Copying a NUL terminated Unicode string, Next: Comparing NUL terminated Unicode strings, Prev: Length, Up: Elementary string functions on NUL terminated strings + +4.5.3 Copying a NUL terminated Unicode string +--------------------------------------------- + The following functions copy portions of Unicode strings in memory. -- Function: uint8_t * u8_strcpy (uint8_t *DEST, const uint8_t *SRC) @@ -946,6 +1045,12 @@ previous character in a Unicode string. This function is similar to ‘strncat’ and ‘wcsncat’, except that it operates on Unicode strings. + +File: libunistring.info, Node: Comparing NUL terminated Unicode strings, Next: Duplicating a NUL terminated Unicode string, Prev: Copying a NUL terminated Unicode string, Up: Elementary string functions on NUL terminated strings + +4.5.4 Comparing NUL terminated Unicode strings +---------------------------------------------- + The following functions compare two Unicode strings. -- Function: int u8_strcmp (const uint8_t *S1, const uint8_t *S2) @@ -984,6 +1089,12 @@ previous character in a Unicode string. This function is similar to ‘strncmp’ and ‘wcsncmp’, except that it operates on Unicode strings. 
+ +File: libunistring.info, Node: Duplicating a NUL terminated Unicode string, Next: Searching for a character in a NUL terminated Unicode string, Prev: Comparing NUL terminated Unicode strings, Up: Elementary string functions on NUL terminated strings + +4.5.5 Duplicating a NUL terminated Unicode string +------------------------------------------------- + The following function allocates a duplicate of a Unicode string. -- Function: uint8_t * u8_strdup (const uint8_t *S) @@ -994,6 +1105,12 @@ previous character in a Unicode string. This function is similar to ‘strdup’ and ‘wcsdup’, except that it operates on Unicode strings. + +File: libunistring.info, Node: Searching for a character in a NUL terminated Unicode string, Next: Searching for a substring, Prev: Duplicating a NUL terminated Unicode string, Up: Elementary string functions on NUL terminated strings + +4.5.6 Searching for a character in a NUL terminated Unicode string +------------------------------------------------------------------ + The following functions search for a given Unicode character. -- Function: uint8_t * u8_strchr (const uint8_t *STR, ucs4_t UC) @@ -1050,6 +1167,12 @@ Unicode character in or outside a given set of Unicode characters. This function is similar to ‘strpbrk’ and ‘wcspbrk’, except that it operates on Unicode strings. + +File: libunistring.info, Node: Searching for a substring, Next: Tokenizing, Prev: Searching for a character in a NUL terminated Unicode string, Up: Elementary string functions on NUL terminated strings + +4.5.7 Searching for a substring in a NUL terminated Unicode string +------------------------------------------------------------------ + The following functions search whether a given Unicode string is a substring of another Unicode string. @@ -1080,6 +1203,12 @@ substring of another Unicode string. *SUFFIX) Tests whether STR ends with SUFFIX. 
+ +File: libunistring.info, Node: Tokenizing, Prev: Searching for a substring, Up: Elementary string functions on NUL terminated strings + +4.5.8 Tokenizing a NUL terminated Unicode string +------------------------------------------------ + The following function does one step in tokenizing a Unicode string. -- Function: uint8_t * u8_strtok (uint8_t *STR, const uint8_t *DELIM, @@ -1562,162 +1691,164 @@ File: libunistring.info, Node: Object oriented API, Next: Bit mask API, Up: G The following are the predefined general category value. Additional general categories may be added in the future. - -- Constant: uc_general_category_t UC_CATEGORY_L - -- Constant: uc_general_category_t UC_CATEGORY_LC - -- Constant: uc_general_category_t UC_CATEGORY_Lu - -- Constant: uc_general_category_t UC_CATEGORY_Ll - -- Constant: uc_general_category_t UC_CATEGORY_Lt - -- Constant: uc_general_category_t UC_CATEGORY_Lm - -- Constant: uc_general_category_t UC_CATEGORY_Lo - -- Constant: uc_general_category_t UC_CATEGORY_M - -- Constant: uc_general_category_t UC_CATEGORY_Mn - -- Constant: uc_general_category_t UC_CATEGORY_Mc - -- Constant: uc_general_category_t UC_CATEGORY_Me - -- Constant: uc_general_category_t UC_CATEGORY_N - -- Constant: uc_general_category_t UC_CATEGORY_Nd - -- Constant: uc_general_category_t UC_CATEGORY_Nl - -- Constant: uc_general_category_t UC_CATEGORY_No - -- Constant: uc_general_category_t UC_CATEGORY_P - -- Constant: uc_general_category_t UC_CATEGORY_Pc - -- Constant: uc_general_category_t UC_CATEGORY_Pd - -- Constant: uc_general_category_t UC_CATEGORY_Ps - -- Constant: uc_general_category_t UC_CATEGORY_Pe - -- Constant: uc_general_category_t UC_CATEGORY_Pi - -- Constant: uc_general_category_t UC_CATEGORY_Pf - -- Constant: uc_general_category_t UC_CATEGORY_Po - -- Constant: uc_general_category_t UC_CATEGORY_S - -- Constant: uc_general_category_t UC_CATEGORY_Sm - -- Constant: uc_general_category_t UC_CATEGORY_Sc - -- Constant: uc_general_category_t UC_CATEGORY_Sk - 
-- Constant: uc_general_category_t UC_CATEGORY_So - -- Constant: uc_general_category_t UC_CATEGORY_Z - -- Constant: uc_general_category_t UC_CATEGORY_Zs - -- Constant: uc_general_category_t UC_CATEGORY_Zl - -- Constant: uc_general_category_t UC_CATEGORY_Zp - -- Constant: uc_general_category_t UC_CATEGORY_C - -- Constant: uc_general_category_t UC_CATEGORY_Cc - -- Constant: uc_general_category_t UC_CATEGORY_Cf - -- Constant: uc_general_category_t UC_CATEGORY_Cs - -- Constant: uc_general_category_t UC_CATEGORY_Co - -- Constant: uc_general_category_t UC_CATEGORY_Cn - - The following are alias names for predefined General category values. + The ‘UC_CATEGORY_*’ constants reflect the systematic general category +values assigned by the Unicode Consortium. Whereas the other ‘UC_*’ +macros are aliases, for use when readable code is preferred. + -- Constant: uc_general_category_t UC_CATEGORY_L -- Macro: uc_general_category_t UC_LETTER - This is another name for ‘UC_CATEGORY_L’. + This represents the general category “Letter”. + -- Constant: uc_general_category_t UC_CATEGORY_LC -- Macro: uc_general_category_t UC_CASED_LETTER - This is another name for ‘UC_CATEGORY_LC’. + -- Constant: uc_general_category_t UC_CATEGORY_Lu -- Macro: uc_general_category_t UC_UPPERCASE_LETTER - This is another name for ‘UC_CATEGORY_Lu’. + This represents the general category “Letter, uppercase”. + -- Constant: uc_general_category_t UC_CATEGORY_Ll -- Macro: uc_general_category_t UC_LOWERCASE_LETTER - This is another name for ‘UC_CATEGORY_Ll’. + This represents the general category “Letter, lowercase”. + -- Constant: uc_general_category_t UC_CATEGORY_Lt -- Macro: uc_general_category_t UC_TITLECASE_LETTER - This is another name for ‘UC_CATEGORY_Lt’. + This represents the general category “Letter, titlecase”. + -- Constant: uc_general_category_t UC_CATEGORY_Lm -- Macro: uc_general_category_t UC_MODIFIER_LETTER - This is another name for ‘UC_CATEGORY_Lm’. 
+ This represents the general category “Letter, modifier”. + -- Constant: uc_general_category_t UC_CATEGORY_Lo -- Macro: uc_general_category_t UC_OTHER_LETTER - This is another name for ‘UC_CATEGORY_Lo’. + This represents the general category “Letter, other”. + -- Constant: uc_general_category_t UC_CATEGORY_M -- Macro: uc_general_category_t UC_MARK - This is another name for ‘UC_CATEGORY_M’. + This represents the general category “Marker”. + -- Constant: uc_general_category_t UC_CATEGORY_Mn -- Macro: uc_general_category_t UC_NON_SPACING_MARK - This is another name for ‘UC_CATEGORY_Mn’. + This represents the general category “Marker, nonspacing”. + -- Constant: uc_general_category_t UC_CATEGORY_Mc -- Macro: uc_general_category_t UC_COMBINING_SPACING_MARK - This is another name for ‘UC_CATEGORY_Mc’. + This represents the general category “Marker, spacing combining”. + -- Constant: uc_general_category_t UC_CATEGORY_Me -- Macro: uc_general_category_t UC_ENCLOSING_MARK - This is another name for ‘UC_CATEGORY_Me’. + This represents the general category “Marker, enclosing”. + -- Constant: uc_general_category_t UC_CATEGORY_N -- Macro: uc_general_category_t UC_NUMBER - This is another name for ‘UC_CATEGORY_N’. + This represents the general category “Number”. + -- Constant: uc_general_category_t UC_CATEGORY_Nd -- Macro: uc_general_category_t UC_DECIMAL_DIGIT_NUMBER - This is another name for ‘UC_CATEGORY_Nd’. + This represents the general category “Number, decimal digit”. + -- Constant: uc_general_category_t UC_CATEGORY_Nl -- Macro: uc_general_category_t UC_LETTER_NUMBER - This is another name for ‘UC_CATEGORY_Nl’. + This represents the general category “Number, letter”. + -- Constant: uc_general_category_t UC_CATEGORY_No -- Macro: uc_general_category_t UC_OTHER_NUMBER - This is another name for ‘UC_CATEGORY_No’. + This represents the general category “Number, other”. 
+ -- Constant: uc_general_category_t UC_CATEGORY_P -- Macro: uc_general_category_t UC_PUNCTUATION - This is another name for ‘UC_CATEGORY_P’. + This represents the general category “Punctuation”. + -- Constant: uc_general_category_t UC_CATEGORY_Pc -- Macro: uc_general_category_t UC_CONNECTOR_PUNCTUATION - This is another name for ‘UC_CATEGORY_Pc’. + This represents the general category “Punctuation, connector”. + -- Constant: uc_general_category_t UC_CATEGORY_Pd -- Macro: uc_general_category_t UC_DASH_PUNCTUATION - This is another name for ‘UC_CATEGORY_Pd’. + This represents the general category “Punctuation, dash”. + -- Constant: uc_general_category_t UC_CATEGORY_Ps -- Macro: uc_general_category_t UC_OPEN_PUNCTUATION - This is another name for ‘UC_CATEGORY_Ps’ (“start punctuation”). + This represents the general category “Punctuation, open”, a.k.a. + “start punctuation”. + -- Constant: uc_general_category_t UC_CATEGORY_Pe -- Macro: uc_general_category_t UC_CLOSE_PUNCTUATION - This is another name for ‘UC_CATEGORY_Pe’ (“end punctuation”). + This represents the general category “Punctuation, close”, a.k.a. + “end punctuation”. + -- Constant: uc_general_category_t UC_CATEGORY_Pi -- Macro: uc_general_category_t UC_INITIAL_QUOTE_PUNCTUATION - This is another name for ‘UC_CATEGORY_Pi’. + This represents the general category “Punctuation, initial quote”. + -- Constant: uc_general_category_t UC_CATEGORY_Pf -- Macro: uc_general_category_t UC_FINAL_QUOTE_PUNCTUATION - This is another name for ‘UC_CATEGORY_Pf’. + This represents the general category “Punctuation, final quote”. + -- Constant: uc_general_category_t UC_CATEGORY_Po -- Macro: uc_general_category_t UC_OTHER_PUNCTUATION - This is another name for ‘UC_CATEGORY_Po’. + This represents the general category “Punctuation, other”. + -- Constant: uc_general_category_t UC_CATEGORY_S -- Macro: uc_general_category_t UC_SYMBOL - This is another name for ‘UC_CATEGORY_S’. + This represents the general category “Symbol”. 
+ -- Constant: uc_general_category_t UC_CATEGORY_Sm -- Macro: uc_general_category_t UC_MATH_SYMBOL - This is another name for ‘UC_CATEGORY_Sm’. + This represents the general category “Symbol, math”. + -- Constant: uc_general_category_t UC_CATEGORY_Sc -- Macro: uc_general_category_t UC_CURRENCY_SYMBOL - This is another name for ‘UC_CATEGORY_Sc’. + This represents the general category “Symbol, currency”. + -- Constant: uc_general_category_t UC_CATEGORY_Sk -- Macro: uc_general_category_t UC_MODIFIER_SYMBOL - This is another name for ‘UC_CATEGORY_Sk’. + This represents the general category “Symbol, modifier”. + -- Constant: uc_general_category_t UC_CATEGORY_So -- Macro: uc_general_category_t UC_OTHER_SYMBOL - This is another name for ‘UC_CATEGORY_So’. + This represents the general category “Symbol, other”. + -- Constant: uc_general_category_t UC_CATEGORY_Z -- Macro: uc_general_category_t UC_SEPARATOR - This is another name for ‘UC_CATEGORY_Z’. + This represents the general category “Separator”. + -- Constant: uc_general_category_t UC_CATEGORY_Zs -- Macro: uc_general_category_t UC_SPACE_SEPARATOR - This is another name for ‘UC_CATEGORY_Zs’. + This represents the general category “Separator, space”. + -- Constant: uc_general_category_t UC_CATEGORY_Zl -- Macro: uc_general_category_t UC_LINE_SEPARATOR - This is another name for ‘UC_CATEGORY_Zl’. + This represents the general category “Separator, line”. + -- Constant: uc_general_category_t UC_CATEGORY_Zp -- Macro: uc_general_category_t UC_PARAGRAPH_SEPARATOR - This is another name for ‘UC_CATEGORY_Zp’. + This represents the general category “Separator, paragraph”. + -- Constant: uc_general_category_t UC_CATEGORY_C -- Macro: uc_general_category_t UC_OTHER - This is another name for ‘UC_CATEGORY_C’. + This represents the general category “Other”. + -- Constant: uc_general_category_t UC_CATEGORY_Cc -- Macro: uc_general_category_t UC_CONTROL - This is another name for ‘UC_CATEGORY_Cc’. 
+ This represents the general category “Other, control”. + -- Constant: uc_general_category_t UC_CATEGORY_Cf -- Macro: uc_general_category_t UC_FORMAT - This is another name for ‘UC_CATEGORY_Cf’. + This represents the general category “Other, format”. + -- Constant: uc_general_category_t UC_CATEGORY_Cs -- Macro: uc_general_category_t UC_SURROGATE - This is another name for ‘UC_CATEGORY_Cs’. All code points in this - category are invalid characters. + This represents the general category “Other, surrogate”. All code + points in this category are invalid characters. + -- Constant: uc_general_category_t UC_CATEGORY_Co -- Macro: uc_general_category_t UC_PRIVATE_USE - This is another name for ‘UC_CATEGORY_Co’. + This represents the general category “Other, private use”. + -- Constant: uc_general_category_t UC_CATEGORY_Cn -- Macro: uc_general_category_t UC_UNASSIGNED - This is another name for ‘UC_CATEGORY_Cn’. Some code points in - this category are invalid characters. + This represents the general category “Other, not assigned”. Some + code points in this category are invalid characters. The following functions combine general categories, like in a boolean algebra, except that there is no ‘not’ operation. @@ -2972,7 +3103,7 @@ the higher-level functions in the previous section are directly based. described in the Unicode standard, because the standard says that they are preferred. - Note that this function do not handle the case when three ore more + Note that this function does not handle the case when three or more consecutive characters are needed to determine the boundary. Use ‘uc_grapheme_breaks’ for such cases. @@ -3350,6 +3481,9 @@ Unicode string. size_t N, uint32_t *RESULTBUF, size_t *LENGTHP) Returns the specified normalization form of a string. + The RESULTBUF and LENGTHP arguments are as described in chapter + *note Conventions::. 
+ File: libunistring.info, Node: Normalizing comparisons, Next: Normalization of streams, Prev: Normalization of strings, Up: uninorm.h @@ -3385,6 +3519,9 @@ in normalization. NF must be either ‘UNINORM_NFC’ or ‘UNINORM_NFKC’. + The RESULTBUF and LENGTHP arguments are as described in chapter + *note Conventions::. + -- Function: int u8_normcoll (const uint8_t *S1, size_t N1, const uint8_t *S2, size_t N2, uninorm_t NF, int *RESULTP) -- Function: int u16_normcoll (const uint16_t *S1, size_t N1, const @@ -3557,6 +3694,9 @@ locale independent case mappings. The NF argument identifies the normalization form to apply after the case-mapping. It can also be NULL, for no normalization. + The RESULTBUF and LENGTHP arguments are as described in chapter + *note Conventions::. + -- Function: uint8_t * u8_tolower (const uint8_t *S, size_t N, const char *ISO639_LANGUAGE, uninorm_t NF, uint8_t *RESULTBUF, size_t *LENGTHP) @@ -3571,6 +3711,9 @@ locale independent case mappings. The NF argument identifies the normalization form to apply after the case-mapping. It can also be NULL, for no normalization. + The RESULTBUF and LENGTHP arguments are as described in chapter + *note Conventions::. + -- Function: uint8_t * u8_totitle (const uint8_t *S, size_t N, const char *ISO639_LANGUAGE, uninorm_t NF, uint8_t *RESULTBUF, size_t *LENGTHP) @@ -3589,6 +3732,9 @@ locale independent case mappings. The NF argument identifies the normalization form to apply after the case-mapping. It can also be NULL, for no normalization. + The RESULTBUF and LENGTHP arguments are as described in chapter + *note Conventions::. + File: libunistring.info, Node: Case mappings of substrings, Next: Case insensitive comparison, Prev: Case mappings of strings, Up: unicase.h @@ -3682,6 +3828,9 @@ prefix context and the suffix context. Returns the uppercase mapping of a string that is surrounded by a prefix and a suffix. + The RESULTBUF and LENGTHP arguments are as described in chapter + *note Conventions::. 
+ -- Function: uint8_t * u8_ct_tolower (const uint8_t *S, size_t N, casing_prefix_context_t PREFIX_CONTEXT, casing_suffix_context_t SUFFIX_CONTEXT, const char @@ -3700,6 +3849,9 @@ prefix context and the suffix context. Returns the lowercase mapping of a string that is surrounded by a prefix and a suffix. + The RESULTBUF and LENGTHP arguments are as described in chapter + *note Conventions::. + -- Function: uint8_t * u8_ct_totitle (const uint8_t *S, size_t N, casing_prefix_context_t PREFIX_CONTEXT, casing_suffix_context_t SUFFIX_CONTEXT, const char @@ -3718,6 +3870,9 @@ prefix context and the suffix context. Returns the titlecase mapping of a string that is surrounded by a prefix and a suffix. + The RESULTBUF and LENGTHP arguments are as described in chapter + *note Conventions::. + For example, to uppercase the UTF-8 substring between ‘s + start_index’ and ‘s + end_index’ of a string that extends from ‘s’ to ‘s + u8_strlen (s)’, you can use the statements @@ -3757,6 +3912,9 @@ in case and normalization. The NF argument identifies the normalization form to apply after the case-mapping. It can also be NULL, for no normalization. + The RESULTBUF and LENGTHP arguments are as described in chapter + *note Conventions::. + -- Function: uint8_t * u8_ct_casefold (const uint8_t *S, size_t N, casing_prefix_context_t PREFIX_CONTEXT, casing_suffix_context_t SUFFIX_CONTEXT, const char @@ -3775,6 +3933,9 @@ in case and normalization. Returns the case folded string. The case folding takes into account the case mapping contexts of the prefix and suffix strings. + The RESULTBUF and LENGTHP arguments are as described in chapter + *note Conventions::. + -- Function: int u8_casecmp (const uint8_t *S1, size_t N1, const uint8_t *S2, size_t N2, const char *ISO639_LANGUAGE, uninorm_t NF, int *RESULTP) @@ -3818,6 +3979,9 @@ rules of the current locale. NF must be either ‘UNINORM_NFC’, ‘UNINORM_NFKC’, or NULL for no normalization. 
+ The RESULTBUF and LENGTHP arguments are as described in chapter + *note Conventions::. + -- Function: int u8_casecoll (const uint8_t *S1, size_t N1, const uint8_t *S2, size_t N2, const char *ISO639_LANGUAGE, uninorm_t NF, int *RESULTP) @@ -3944,7 +4108,7 @@ the file ‘DEPENDENCIES’. Then you can proceed to build and install the library, as described in the file ‘INSTALL’. For installation on Windows systems, please -refer to the file ‘README.windows’. +refer to the file ‘INSTALL.windows’. File: libunistring.info, Node: Compiler options, Next: Include files, Prev: Installation, Up: Using the library @@ -4065,7 +4229,7 @@ file, please include a description of the options that you passed to the ‘configure’ script. -File: libunistring.info, Node: More functionality, Next: Licenses, Prev: Using the library, Up: Top +File: libunistring.info, Node: More functionality, Next: The wchar_t mess, Prev: Using the library, Up: Top 17 More advanced functionality ****************************** @@ -4078,9 +4242,54 @@ given toolkit (KDE/Qt or GNOME/Gtk), we recommend the Pango library: <http://www.pango.org/>. -File: libunistring.info, Node: Licenses, Next: Index, Prev: More functionality, Up: Top +File: libunistring.info, Node: The wchar_t mess, Next: Licenses, Prev: More functionality, Up: Top + +Appendix A The ‘wchar_t’ mess +***************************** + + The ISO C and POSIX standard creators made an attempt to fix the +first problem mentioned in the section *note char * strings::. They +introduced + • a type ‘wchar_t’, designed to encapsulate an entire character, + • a “wide string” type ‘wchar_t *’, and + • functions declared in ‘<wctype.h>’ that were meant to supplant the + ones in ‘<ctype.h>’. + + Unfortunately, this API and its implementation has numerous problems: + + • On AIX and Windows platforms, ‘wchar_t’ is a 16-bit type. This + means that it can never accommodate an entire Unicode character. 
+ Either the ‘wchar_t *’ strings are limited to characters in UCS-2 + (the “Basic Multilingual Plane” of Unicode), or — if ‘wchar_t *’ + strings are encoded in UTF-16 — a ‘wchar_t’ represents only half of + a character in the worst case, making the ‘<wctype.h>’ functions + pointless. + + • On Solaris and FreeBSD, the ‘wchar_t’ encoding is locale dependent + and undocumented. This means, if you want to know any property of + a ‘wchar_t’ character, other than the properties defined by + ‘<wctype.h>’ — such as whether it’s a dash, currency symbol, + paragraph separator, or similar —, you have to convert it to ‘char + *’ encoding first, by use of the function ‘wctomb’. + + • When you read a stream of wide characters, through the functions + ‘fgetwc’ and ‘fgetws’, and when the input stream/file is not in the + expected encoding, you have no way to determine the invalid byte + sequence and do some corrective action. If you use these + functions, your program becomes “garbage in - more garbage out” or + “garbage in - abort”. + + As a consequence, it is better to use multibyte strings, as explained +in the section *note char * strings::. Such multibyte strings can +bypass limitations of the ‘wchar_t’ type, if you use functions defined +in gnulib and libunistring for text processing. They can also +faithfully transport malformed characters that were present in the +input, without requiring the program to produce garbage or abort. + + +File: libunistring.info, Node: Licenses, Next: Index, Prev: The wchar_t mess, Up: Top -Appendix A Licenses +Appendix B Licenses ******************* The files of this package are covered by the licenses indicated in @@ -4128,7 +4337,7 @@ each particular file or directory. 
Here is a summary: File: libunistring.info, Node: GNU GPL, Next: GNU LGPL, Up: Licenses -A.1 GNU GENERAL PUBLIC LICENSE +B.1 GNU GENERAL PUBLIC LICENSE ============================== Version 3, 29 June 2007 @@ -4844,7 +5053,7 @@ please read <http://www.gnu.org/philosophy/why-not-lgpl.html>. File: libunistring.info, Node: GNU LGPL, Next: GNU FDL, Prev: GNU GPL, Up: Licenses -A.2 GNU LESSER GENERAL PUBLIC LICENSE +B.2 GNU LESSER GENERAL PUBLIC LICENSE ===================================== Version 3, 29 June 2007 @@ -5016,7 +5225,7 @@ supplemented by the additional permissions listed below. File: libunistring.info, Node: GNU FDL, Prev: GNU LGPL, Up: Licenses -A.3 GNU Free Documentation License +B.3 GNU Free Documentation License ================================== Version 1.3, 3 November 2008 @@ -5536,50 +5745,49 @@ Index * char, type: char * strings. (line 22) * combining, Unicode characters: Composition of characters. (line 6) -* comparing: Elementary string functions. - (line 108) -* comparing <1>: Elementary string functions on NUL terminated strings. - (line 131) +* comparing: Comparing Unicode strings. + (line 6) +* comparing <1>: Comparing NUL terminated Unicode strings. + (line 6) * comparing, ignoring case: Case insensitive comparison. (line 6) * comparing, ignoring case, with collation rules: Case insensitive comparison. - (line 65) + (line 71) * comparing, ignoring normalization: Normalizing comparisons. (line 6) * comparing, ignoring normalization and case: Case insensitive comparison. (line 6) * comparing, ignoring normalization and case, with collation rules: Case insensitive comparison. - (line 65) + (line 71) * comparing, ignoring normalization, with collation rules: Normalizing comparisons. (line 22) -* comparing, with collation rules: Elementary string functions on NUL terminated strings. - (line 143) +* comparing, with collation rules: Comparing NUL terminated Unicode strings. 
+ (line 18) * comparing, with collation rules, ignoring case: Case insensitive comparison. - (line 65) + (line 71) * comparing, with collation rules, ignoring normalization: Normalizing comparisons. (line 22) * comparing, with collation rules, ignoring normalization and case: Case insensitive comparison. - (line 65) + (line 71) * compiler options: Compiler options. (line 24) * composing, Unicode characters: Composition of characters. (line 6) * converting: Elementary string conversions. (line 6) * converting <1>: uniconv.h. (line 45) -* copying: Elementary string functions. - (line 72) -* copying <1>: Elementary string functions on NUL terminated strings. - (line 62) -* counting: Elementary string functions. - (line 153) +* copying: Copying Unicode strings. + (line 6) +* copying <1>: Copying a NUL terminated Unicode string. + (line 6) +* counting: Counting characters. (line 6) * decomposing: Decomposition of characters. (line 6) * dependencies: Installation. (line 6) * detecting case: Case detection. (line 6) * duplicating: Elementary string functions with memory allocation. (line 6) -* duplicating <1>: Elementary string functions on NUL terminated strings. - (line 169) +* duplicating <1>: Duplicating a NUL terminated Unicode string. + (line 6) * enum iconv_ilseq_handler: uniconv.h. (line 29) * FDL, GNU Free Documentation License: GNU FDL. (line 6) * formatted output: unistdio.h. (line 6) @@ -5594,9 +5802,8 @@ Index (line 6) * installation: Installation. (line 10) * internationalization: Unicode and i18n. (line 6) -* iterating: Elementary string functions. - (line 6) -* iterating <1>: Elementary string functions on NUL terminated strings. +* iterating: Iterating. (line 6) +* iterating <1>: Iterating over a NUL terminated Unicode string. (line 15) * Java, programming language: ISO C and Java syntax. (line 6) @@ -5629,12 +5836,12 @@ Index * rendering: More functionality. (line 9) * return value conventions: Conventions. (line 47) * scripts: Scripts. 
(line 6) -* searching, for a character: Elementary string functions. - (line 140) -* searching, for a character <1>: Elementary string functions on NUL terminated strings. - (line 179) -* searching, for a substring: Elementary string functions on NUL terminated strings. - (line 235) +* searching, for a character: Searching for a character. + (line 6) +* searching, for a character <1>: Searching for a character in a NUL terminated Unicode string. + (line 6) +* searching, for a substring: Searching for a substring. + (line 6) * stream, normalizing a: Normalization of streams. (line 6) * struct uninorm_filter: Normalization of streams. @@ -5644,13 +5851,13 @@ Index * u16_asnprintf: unistdio.h. (line 111) * u16_asprintf: unistdio.h. (line 109) * u16_casecmp: Case insensitive comparison. - (line 48) + (line 54) * u16_casecoll: Case insensitive comparison. - (line 91) + (line 100) * u16_casefold: Case insensitive comparison. (line 12) * u16_casexfrm: Case insensitive comparison. - (line 71) + (line 77) * u16_casing_prefixes_context: Case mappings of substrings. (line 36) * u16_casing_prefix_context: Case mappings of substrings. @@ -5661,28 +5868,28 @@ Index (line 57) * u16_check: Elementary string checks. (line 10) -* u16_chr: Elementary string functions. - (line 143) -* u16_cmp: Elementary string functions. - (line 113) -* u16_cmp2: Elementary string functions. - (line 129) +* u16_chr: Searching for a character. + (line 9) +* u16_cmp: Comparing Unicode strings. + (line 11) +* u16_cmp2: Comparing Unicode strings. + (line 27) * u16_conv_from_encoding: uniconv.h. (line 51) * u16_conv_to_encoding: uniconv.h. (line 88) -* u16_cpy: Elementary string functions. - (line 76) +* u16_cpy: Copying Unicode strings. + (line 10) * u16_cpy_alloc: Elementary string functions with memory allocation. (line 9) * u16_ct_casefold: Case insensitive comparison. - (line 32) + (line 35) * u16_ct_tolower: Case mappings of substrings. 
- (line 98) + (line 101) * u16_ct_totitle: Case mappings of substrings. - (line 116) + (line 122) * u16_ct_toupper: Case mappings of substrings. (line 80) -* u16_endswith: Elementary string functions on NUL terminated strings. - (line 259) +* u16_endswith: Searching for a substring. + (line 30) * u16_grapheme_breaks: Grapheme cluster breaks in a string. (line 42) * u16_grapheme_next: Grapheme cluster breaks in a string. @@ -5694,94 +5901,86 @@ Index * u16_is_lowercase: Case detection. (line 22) * u16_is_titlecase: Case detection. (line 32) * u16_is_uppercase: Case detection. (line 12) -* u16_mblen: Elementary string functions. - (line 10) -* u16_mbsnlen: Elementary string functions. - (line 156) -* u16_mbtouc: Elementary string functions. - (line 37) -* u16_mbtoucr: Elementary string functions. - (line 44) -* u16_mbtouc_unsafe: Elementary string functions. +* u16_mblen: Iterating. (line 10) +* u16_mbsnlen: Counting characters. (line 9) +* u16_mbtouc: Iterating. (line 20) +* u16_mbtoucr: Iterating. (line 48) +* u16_mbtouc_unsafe: Iterating. (line 39) +* u16_move: Copying Unicode strings. (line 21) -* u16_move: Elementary string functions. - (line 87) -* u16_next: Elementary string functions on NUL terminated strings. +* u16_next: Iterating over a NUL terminated Unicode string. (line 23) * u16_normalize: Normalization of strings. (line 48) * u16_normcmp: Normalizing comparisons. (line 11) * u16_normcoll: Normalizing comparisons. - (line 37) + (line 40) * u16_normxfrm: Normalizing comparisons. (line 24) * u16_possible_linebreaks: unilbrk.h. (line 44) -* u16_prev: Elementary string functions on NUL terminated strings. +* u16_prev: Iterating over a NUL terminated Unicode string. + (line 34) +* u16_set: Copying Unicode strings. (line 34) -* u16_set: Elementary string functions. - (line 100) * u16_snprintf: unistdio.h. (line 107) * u16_sprintf: unistdio.h. (line 106) -* u16_startswith: Elementary string functions on NUL terminated strings. 
- (line 251) -* u16_stpcpy: Elementary string functions on NUL terminated strings. - (line 75) -* u16_stpncpy: Elementary string functions on NUL terminated strings. - (line 98) -* u16_strcat: Elementary string functions on NUL terminated strings. - (line 111) -* u16_strchr: Elementary string functions on NUL terminated strings. - (line 182) -* u16_strcmp: Elementary string functions on NUL terminated strings. - (line 134) -* u16_strcoll: Elementary string functions on NUL terminated strings. - (line 144) +* u16_startswith: Searching for a substring. + (line 22) +* u16_stpcpy: Copying a NUL terminated Unicode string. + (line 19) +* u16_stpncpy: Copying a NUL terminated Unicode string. + (line 42) +* u16_strcat: Copying a NUL terminated Unicode string. + (line 55) +* u16_strchr: Searching for a character in a NUL terminated Unicode string. + (line 9) +* u16_strcmp: Comparing NUL terminated Unicode strings. + (line 9) +* u16_strcoll: Comparing NUL terminated Unicode strings. + (line 19) * u16_strconv_from_encoding: uniconv.h. (line 127) * u16_strconv_from_locale: uniconv.h. (line 156) * u16_strconv_to_encoding: uniconv.h. (line 140) * u16_strconv_to_locale: uniconv.h. (line 166) -* u16_strcpy: Elementary string functions on NUL terminated strings. - (line 65) -* u16_strcspn: Elementary string functions on NUL terminated strings. - (line 202) -* u16_strdup: Elementary string functions on NUL terminated strings. - (line 172) -* u16_strlen: Elementary string functions on NUL terminated strings. - (line 47) -* u16_strmblen: Elementary string functions on NUL terminated strings. +* u16_strcpy: Copying a NUL terminated Unicode string. + (line 9) +* u16_strcspn: Searching for a character in a NUL terminated Unicode string. + (line 29) +* u16_strdup: Duplicating a NUL terminated Unicode string. + (line 9) +* u16_strlen: Length. (line 9) +* u16_strmblen: Iterating over a NUL terminated Unicode string. 
(line 10) -* u16_strmbtouc: Elementary string functions on NUL terminated strings. +* u16_strmbtouc: Iterating over a NUL terminated Unicode string. (line 16) -* u16_strncat: Elementary string functions on NUL terminated strings. - (line 122) -* u16_strncmp: Elementary string functions on NUL terminated strings. - (line 160) -* u16_strncpy: Elementary string functions on NUL terminated strings. - (line 87) -* u16_strnlen: Elementary string functions on NUL terminated strings. - (line 55) -* u16_strpbrk: Elementary string functions on NUL terminated strings. - (line 226) -* u16_strrchr: Elementary string functions on NUL terminated strings. - (line 190) -* u16_strspn: Elementary string functions on NUL terminated strings. - (line 214) -* u16_strstr: Elementary string functions on NUL terminated strings. - (line 240) -* u16_strtok: Elementary string functions on NUL terminated strings. - (line 269) +* u16_strncat: Copying a NUL terminated Unicode string. + (line 66) +* u16_strncmp: Comparing NUL terminated Unicode strings. + (line 35) +* u16_strncpy: Copying a NUL terminated Unicode string. + (line 31) +* u16_strnlen: Length. (line 17) +* u16_strpbrk: Searching for a character in a NUL terminated Unicode string. + (line 53) +* u16_strrchr: Searching for a character in a NUL terminated Unicode string. + (line 17) +* u16_strspn: Searching for a character in a NUL terminated Unicode string. + (line 41) +* u16_strstr: Searching for a substring. + (line 11) +* u16_strtok: Tokenizing. (line 10) * u16_strwidth: uniwidth.h. (line 38) * u16_tolower: Case mappings of strings. - (line 41) + (line 44) * u16_totitle: Case mappings of strings. - (line 55) + (line 61) * u16_toupper: Case mappings of strings. (line 27) * u16_to_u32: Elementary string conversions. - (line 21) + (line 30) * u16_to_u8: Elementary string conversions. - (line 17) + (line 23) * u16_u16_asnprintf: unistdio.h. (line 131) * u16_u16_asprintf: unistdio.h. (line 129) * u16_u16_snprintf: unistdio.h. 
(line 127) @@ -5790,8 +5989,8 @@ Index * u16_u16_vasprintf: unistdio.h. (line 137) * u16_u16_vsnprintf: unistdio.h. (line 135) * u16_u16_vsprintf: unistdio.h. (line 133) -* u16_uctomb: Elementary string functions. - (line 61) +* u16_uctomb: Creating Unicode strings. + (line 10) * u16_vasnprintf: unistdio.h. (line 119) * u16_vasprintf: unistdio.h. (line 117) * u16_vsnprintf: unistdio.h. (line 115) @@ -5803,13 +6002,13 @@ Index * u32_asnprintf: unistdio.h. (line 150) * u32_asprintf: unistdio.h. (line 148) * u32_casecmp: Case insensitive comparison. - (line 51) + (line 57) * u32_casecoll: Case insensitive comparison. - (line 94) + (line 103) * u32_casefold: Case insensitive comparison. (line 15) * u32_casexfrm: Case insensitive comparison. - (line 74) + (line 80) * u32_casing_prefixes_context: Case mappings of substrings. (line 38) * u32_casing_prefix_context: Case mappings of substrings. @@ -5820,28 +6019,28 @@ Index (line 59) * u32_check: Elementary string checks. (line 11) -* u32_chr: Elementary string functions. - (line 145) -* u32_cmp: Elementary string functions. - (line 115) -* u32_cmp2: Elementary string functions. - (line 131) +* u32_chr: Searching for a character. + (line 11) +* u32_cmp: Comparing Unicode strings. + (line 13) +* u32_cmp2: Comparing Unicode strings. + (line 29) * u32_conv_from_encoding: uniconv.h. (line 54) * u32_conv_to_encoding: uniconv.h. (line 91) -* u32_cpy: Elementary string functions. - (line 78) +* u32_cpy: Copying Unicode strings. + (line 12) * u32_cpy_alloc: Elementary string functions with memory allocation. (line 10) * u32_ct_casefold: Case insensitive comparison. - (line 37) + (line 40) * u32_ct_tolower: Case mappings of substrings. - (line 103) + (line 106) * u32_ct_totitle: Case mappings of substrings. - (line 121) + (line 127) * u32_ct_toupper: Case mappings of substrings. (line 85) -* u32_endswith: Elementary string functions on NUL terminated strings. - (line 261) +* u32_endswith: Searching for a substring. 
+ (line 32) * u32_grapheme_breaks: Grapheme cluster breaks in a string. (line 44) * u32_grapheme_next: Grapheme cluster breaks in a string. @@ -5853,94 +6052,86 @@ Index * u32_is_lowercase: Case detection. (line 24) * u32_is_titlecase: Case detection. (line 34) * u32_is_uppercase: Case detection. (line 14) -* u32_mblen: Elementary string functions. - (line 11) -* u32_mbsnlen: Elementary string functions. - (line 157) -* u32_mbtouc: Elementary string functions. - (line 38) -* u32_mbtoucr: Elementary string functions. - (line 45) -* u32_mbtouc_unsafe: Elementary string functions. +* u32_mblen: Iterating. (line 11) +* u32_mbsnlen: Counting characters. (line 10) +* u32_mbtouc: Iterating. (line 21) +* u32_mbtoucr: Iterating. (line 49) +* u32_mbtouc_unsafe: Iterating. (line 41) +* u32_move: Copying Unicode strings. (line 23) -* u32_move: Elementary string functions. - (line 89) -* u32_next: Elementary string functions on NUL terminated strings. +* u32_next: Iterating over a NUL terminated Unicode string. (line 24) * u32_normalize: Normalization of strings. (line 50) * u32_normcmp: Normalizing comparisons. (line 13) * u32_normcoll: Normalizing comparisons. - (line 39) + (line 42) * u32_normxfrm: Normalizing comparisons. (line 26) * u32_possible_linebreaks: unilbrk.h. (line 46) -* u32_prev: Elementary string functions on NUL terminated strings. +* u32_prev: Iterating over a NUL terminated Unicode string. (line 36) -* u32_set: Elementary string functions. - (line 101) +* u32_set: Copying Unicode strings. + (line 35) * u32_snprintf: unistdio.h. (line 146) * u32_sprintf: unistdio.h. (line 145) -* u32_startswith: Elementary string functions on NUL terminated strings. - (line 253) -* u32_stpcpy: Elementary string functions on NUL terminated strings. - (line 77) -* u32_stpncpy: Elementary string functions on NUL terminated strings. - (line 100) -* u32_strcat: Elementary string functions on NUL terminated strings. 
- (line 113) -* u32_strchr: Elementary string functions on NUL terminated strings. - (line 183) -* u32_strcmp: Elementary string functions on NUL terminated strings. - (line 135) -* u32_strcoll: Elementary string functions on NUL terminated strings. - (line 145) +* u32_startswith: Searching for a substring. + (line 24) +* u32_stpcpy: Copying a NUL terminated Unicode string. + (line 21) +* u32_stpncpy: Copying a NUL terminated Unicode string. + (line 44) +* u32_strcat: Copying a NUL terminated Unicode string. + (line 57) +* u32_strchr: Searching for a character in a NUL terminated Unicode string. + (line 10) +* u32_strcmp: Comparing NUL terminated Unicode strings. + (line 10) +* u32_strcoll: Comparing NUL terminated Unicode strings. + (line 20) * u32_strconv_from_encoding: uniconv.h. (line 129) * u32_strconv_from_locale: uniconv.h. (line 157) * u32_strconv_to_encoding: uniconv.h. (line 142) * u32_strconv_to_locale: uniconv.h. (line 167) -* u32_strcpy: Elementary string functions on NUL terminated strings. - (line 67) -* u32_strcspn: Elementary string functions on NUL terminated strings. - (line 204) -* u32_strdup: Elementary string functions on NUL terminated strings. - (line 173) -* u32_strlen: Elementary string functions on NUL terminated strings. - (line 48) -* u32_strmblen: Elementary string functions on NUL terminated strings. +* u32_strcpy: Copying a NUL terminated Unicode string. + (line 11) +* u32_strcspn: Searching for a character in a NUL terminated Unicode string. + (line 31) +* u32_strdup: Duplicating a NUL terminated Unicode string. + (line 10) +* u32_strlen: Length. (line 10) +* u32_strmblen: Iterating over a NUL terminated Unicode string. (line 11) -* u32_strmbtouc: Elementary string functions on NUL terminated strings. +* u32_strmbtouc: Iterating over a NUL terminated Unicode string. (line 17) -* u32_strncat: Elementary string functions on NUL terminated strings. - (line 124) -* u32_strncmp: Elementary string functions on NUL terminated strings. 
- (line 162) -* u32_strncpy: Elementary string functions on NUL terminated strings. - (line 89) -* u32_strnlen: Elementary string functions on NUL terminated strings. - (line 56) -* u32_strpbrk: Elementary string functions on NUL terminated strings. - (line 228) -* u32_strrchr: Elementary string functions on NUL terminated strings. - (line 191) -* u32_strspn: Elementary string functions on NUL terminated strings. - (line 216) -* u32_strstr: Elementary string functions on NUL terminated strings. - (line 242) -* u32_strtok: Elementary string functions on NUL terminated strings. - (line 271) +* u32_strncat: Copying a NUL terminated Unicode string. + (line 68) +* u32_strncmp: Comparing NUL terminated Unicode strings. + (line 37) +* u32_strncpy: Copying a NUL terminated Unicode string. + (line 33) +* u32_strnlen: Length. (line 18) +* u32_strpbrk: Searching for a character in a NUL terminated Unicode string. + (line 55) +* u32_strrchr: Searching for a character in a NUL terminated Unicode string. + (line 18) +* u32_strspn: Searching for a character in a NUL terminated Unicode string. + (line 43) +* u32_strstr: Searching for a substring. + (line 13) +* u32_strtok: Tokenizing. (line 12) * u32_strwidth: uniwidth.h. (line 39) * u32_tolower: Case mappings of strings. - (line 44) + (line 47) * u32_totitle: Case mappings of strings. - (line 58) + (line 64) * u32_toupper: Case mappings of strings. (line 30) * u32_to_u16: Elementary string conversions. - (line 29) + (line 44) * u32_to_u8: Elementary string conversions. - (line 25) + (line 37) * u32_u32_asnprintf: unistdio.h. (line 170) * u32_u32_asprintf: unistdio.h. (line 168) * u32_u32_snprintf: unistdio.h. (line 166) @@ -5949,8 +6140,8 @@ Index * u32_u32_vasprintf: unistdio.h. (line 176) * u32_u32_vsnprintf: unistdio.h. (line 174) * u32_u32_vsprintf: unistdio.h. (line 172) -* u32_uctomb: Elementary string functions. - (line 62) +* u32_uctomb: Creating Unicode strings. + (line 11) * u32_vasnprintf: unistdio.h. 
(line 158) * u32_vasprintf: unistdio.h. (line 156) * u32_vsnprintf: unistdio.h. (line 154) @@ -5962,13 +6153,13 @@ Index * u8_asnprintf: unistdio.h. (line 72) * u8_asprintf: unistdio.h. (line 70) * u8_casecmp: Case insensitive comparison. - (line 45) + (line 51) * u8_casecoll: Case insensitive comparison. - (line 88) + (line 97) * u8_casefold: Case insensitive comparison. (line 9) * u8_casexfrm: Case insensitive comparison. - (line 68) + (line 74) * u8_casing_prefixes_context: Case mappings of substrings. (line 34) * u8_casing_prefix_context: Case mappings of substrings. @@ -5979,28 +6170,28 @@ Index (line 55) * u8_check: Elementary string checks. (line 9) -* u8_chr: Elementary string functions. - (line 142) -* u8_cmp: Elementary string functions. - (line 111) -* u8_cmp2: Elementary string functions. - (line 127) +* u8_chr: Searching for a character. + (line 8) +* u8_cmp: Comparing Unicode strings. + (line 9) +* u8_cmp2: Comparing Unicode strings. + (line 25) * u8_conv_from_encoding: uniconv.h. (line 48) * u8_conv_to_encoding: uniconv.h. (line 85) -* u8_cpy: Elementary string functions. - (line 74) +* u8_cpy: Copying Unicode strings. + (line 8) * u8_cpy_alloc: Elementary string functions with memory allocation. (line 8) * u8_ct_casefold: Case insensitive comparison. - (line 27) + (line 30) * u8_ct_tolower: Case mappings of substrings. - (line 93) + (line 96) * u8_ct_totitle: Case mappings of substrings. - (line 111) + (line 117) * u8_ct_toupper: Case mappings of substrings. (line 75) -* u8_endswith: Elementary string functions on NUL terminated strings. - (line 257) +* u8_endswith: Searching for a substring. + (line 28) * u8_grapheme_breaks: Grapheme cluster breaks in a string. (line 40) * u8_grapheme_next: Grapheme cluster breaks in a string. @@ -6012,94 +6203,86 @@ Index * u8_is_lowercase: Case detection. (line 20) * u8_is_titlecase: Case detection. (line 30) * u8_is_uppercase: Case detection. (line 10) -* u8_mblen: Elementary string functions. 
- (line 9) -* u8_mbsnlen: Elementary string functions. - (line 155) -* u8_mbtouc: Elementary string functions. - (line 36) -* u8_mbtoucr: Elementary string functions. - (line 43) -* u8_mbtouc_unsafe: Elementary string functions. +* u8_mblen: Iterating. (line 9) +* u8_mbsnlen: Counting characters. (line 8) +* u8_mbtouc: Iterating. (line 19) +* u8_mbtoucr: Iterating. (line 47) +* u8_mbtouc_unsafe: Iterating. (line 37) +* u8_move: Copying Unicode strings. (line 19) -* u8_move: Elementary string functions. - (line 85) -* u8_next: Elementary string functions on NUL terminated strings. +* u8_next: Iterating over a NUL terminated Unicode string. (line 22) * u8_normalize: Normalization of strings. (line 46) * u8_normcmp: Normalizing comparisons. (line 9) * u8_normcoll: Normalizing comparisons. - (line 35) + (line 38) * u8_normxfrm: Normalizing comparisons. (line 22) * u8_possible_linebreaks: unilbrk.h. (line 42) -* u8_prev: Elementary string functions on NUL terminated strings. +* u8_prev: Iterating over a NUL terminated Unicode string. (line 32) -* u8_set: Elementary string functions. - (line 99) +* u8_set: Copying Unicode strings. + (line 33) * u8_snprintf: unistdio.h. (line 68) * u8_sprintf: unistdio.h. (line 67) -* u8_startswith: Elementary string functions on NUL terminated strings. - (line 249) -* u8_stpcpy: Elementary string functions on NUL terminated strings. - (line 74) -* u8_stpncpy: Elementary string functions on NUL terminated strings. - (line 96) -* u8_strcat: Elementary string functions on NUL terminated strings. - (line 110) -* u8_strchr: Elementary string functions on NUL terminated strings. - (line 181) -* u8_strcmp: Elementary string functions on NUL terminated strings. - (line 133) -* u8_strcoll: Elementary string functions on NUL terminated strings. - (line 143) +* u8_startswith: Searching for a substring. + (line 20) +* u8_stpcpy: Copying a NUL terminated Unicode string. + (line 18) +* u8_stpncpy: Copying a NUL terminated Unicode string. 
+ (line 40) +* u8_strcat: Copying a NUL terminated Unicode string. + (line 54) +* u8_strchr: Searching for a character in a NUL terminated Unicode string. + (line 8) +* u8_strcmp: Comparing NUL terminated Unicode strings. + (line 8) +* u8_strcoll: Comparing NUL terminated Unicode strings. + (line 18) * u8_strconv_from_encoding: uniconv.h. (line 125) * u8_strconv_from_locale: uniconv.h. (line 155) * u8_strconv_to_encoding: uniconv.h. (line 138) * u8_strconv_to_locale: uniconv.h. (line 165) -* u8_strcpy: Elementary string functions on NUL terminated strings. - (line 64) -* u8_strcspn: Elementary string functions on NUL terminated strings. - (line 200) -* u8_strdup: Elementary string functions on NUL terminated strings. - (line 171) -* u8_strlen: Elementary string functions on NUL terminated strings. - (line 46) -* u8_strmblen: Elementary string functions on NUL terminated strings. +* u8_strcpy: Copying a NUL terminated Unicode string. + (line 8) +* u8_strcspn: Searching for a character in a NUL terminated Unicode string. + (line 27) +* u8_strdup: Duplicating a NUL terminated Unicode string. + (line 8) +* u8_strlen: Length. (line 8) +* u8_strmblen: Iterating over a NUL terminated Unicode string. (line 9) -* u8_strmbtouc: Elementary string functions on NUL terminated strings. +* u8_strmbtouc: Iterating over a NUL terminated Unicode string. (line 15) -* u8_strncat: Elementary string functions on NUL terminated strings. - (line 120) -* u8_strncmp: Elementary string functions on NUL terminated strings. - (line 158) -* u8_strncpy: Elementary string functions on NUL terminated strings. - (line 85) -* u8_strnlen: Elementary string functions on NUL terminated strings. - (line 54) -* u8_strpbrk: Elementary string functions on NUL terminated strings. - (line 224) -* u8_strrchr: Elementary string functions on NUL terminated strings. - (line 189) -* u8_strspn: Elementary string functions on NUL terminated strings. 
- (line 212)
-* u8_strstr: Elementary string functions on NUL terminated strings.
- (line 238)
-* u8_strtok: Elementary string functions on NUL terminated strings.
- (line 267)
+* u8_strncat: Copying a NUL terminated Unicode string.
+ (line 64)
+* u8_strncmp: Comparing NUL terminated Unicode strings.
+ (line 33)
+* u8_strncpy: Copying a NUL terminated Unicode string.
+ (line 29)
+* u8_strnlen: Length. (line 16)
+* u8_strpbrk: Searching for a character in a NUL terminated Unicode string.
+ (line 51)
+* u8_strrchr: Searching for a character in a NUL terminated Unicode string.
+ (line 16)
+* u8_strspn: Searching for a character in a NUL terminated Unicode string.
+ (line 39)
+* u8_strstr: Searching for a substring.
+ (line 9)
+* u8_strtok: Tokenizing. (line 8)
 * u8_strwidth: uniwidth.h. (line 37)
 * u8_tolower: Case mappings of strings.
- (line 38)
+ (line 41)
 * u8_totitle: Case mappings of strings.
- (line 52)
+ (line 58)
 * u8_toupper: Case mappings of strings.
 (line 24)
 * u8_to_u16: Elementary string conversions.
 (line 9)
 * u8_to_u32: Elementary string conversions.
- (line 13)
+ (line 16)
 * u8_u8_asnprintf: unistdio.h. (line 92)
 * u8_u8_asprintf: unistdio.h. (line 90)
 * u8_u8_snprintf: unistdio.h. (line 88)
@@ -6108,8 +6291,8 @@ Index
 * u8_u8_vasprintf: unistdio.h. (line 98)
 * u8_u8_vsnprintf: unistdio.h. (line 96)
 * u8_u8_vsprintf: unistdio.h. (line 94)
-* u8_uctomb: Elementary string functions.
- (line 60)
+* u8_uctomb: Creating Unicode strings.
+ (line 9)
 * u8_vasnprintf: unistdio.h. (line 80)
 * u8_vasprintf: unistdio.h. (line 78)
 * u8_vsnprintf: unistdio.h. (line 76)
@@ -6150,13 +6333,13 @@ Index
 (line 80)
 * uc_digit_value: Digit value. (line 10)
 * uc_fraction_t: Numeric value. (line 12)
-* uc_general_category: Object oriented API. (line 219)
-* uc_general_category_and: Object oriented API. (line 180)
-* uc_general_category_and_not: Object oriented API. (line 187)
-* uc_general_category_byname: Object oriented API. (line 209)
-* uc_general_category_long_name: Object oriented API. (line 203)
-* uc_general_category_name: Object oriented API. (line 197)
-* uc_general_category_or: Object oriented API. (line 174)
+* uc_general_category: Object oriented API. (line 221)
+* uc_general_category_and: Object oriented API. (line 182)
+* uc_general_category_and_not: Object oriented API. (line 189)
+* uc_general_category_byname: Object oriented API. (line 211)
+* uc_general_category_long_name: Object oriented API. (line 205)
+* uc_general_category_name: Object oriented API. (line 199)
+* uc_general_category_or: Object oriented API. (line 176)
 * uc_general_category_t: Object oriented API. (line 6)
 * uc_graphemeclusterbreak_property: Grapheme cluster break property.
 (line 37)
@@ -6177,7 +6360,7 @@ Index
 (line 9)
 * uc_is_digit: Classifications like in ISO C.
 (line 26)
-* uc_is_general_category: Object oriented API. (line 224)
+* uc_is_general_category: Object oriented API. (line 226)
 * uc_is_general_category_withtable: Bit mask API. (line 51)
 * uc_is_graph: Classifications like in ISO C.
 (line 30)
@@ -6408,11 +6591,11 @@ Index
 * ulc_asnprintf: unistdio.h. (line 49)
 * ulc_asprintf: unistdio.h. (line 47)
 * ulc_casecmp: Case insensitive comparison.
- (line 54)
+ (line 60)
 * ulc_casecoll: Case insensitive comparison.
- (line 97)
+ (line 106)
 * ulc_casexfrm: Case insensitive comparison.
- (line 77)
+ (line 83)
 * ulc_fprintf: unistdio.h. (line 184)
 * ulc_grapheme_breaks: Grapheme cluster breaks in a string.
 (line 46)
@@ -6498,78 +6681,92 @@ Index
 Tag Table:
 Node: Top269
-Node: Introduction3400
-Node: Unicode5493
-Node: Unicode and i18n7378
-Node: Locale encodings8848
-Node: In-memory representation11113
-Node: char * strings12239
-Node: The wchar_t mess17727
-Node: Unicode strings20035
-Node: Conventions21220
-Node: unitypes.h23512
-Node: unistr.h24096
-Node: Elementary string checks24661
-Node: Elementary string conversions25283
-Node: Elementary string functions26585
-Node: Elementary string functions with memory allocation33644
-Node: Elementary string functions on NUL terminated strings34266
-Node: uniconv.h46494
-Node: unistdio.h54447
-Node: uniname.h62700
-Node: unictype.h64106
-Node: General category65034
-Node: Object oriented API66089
-Node: Bit mask API75323
-Node: Canonical combining class77618
-Node: Bidi class81852
-Node: Decimal digit value85265
-Node: Digit value85822
-Node: Numeric value86383
-Node: Mirrored character87285
-Node: Arabic shaping87978
-Node: Joining type88451
-Node: Joining group90601
-Node: Properties94039
-Node: Properties as objects94730
-Node: Properties as functions101752
-Node: Scripts107768
-Node: Blocks109173
-Node: ISO C and Java syntax110516
-Node: Classifications like in ISO C112234
-Node: uniwidth.h115046
-Node: unigbrk.h117092
-Node: Grapheme cluster breaks in a string118586
-Node: Grapheme cluster break property121521
-Node: uniwbrk.h123765
-Node: Word breaks in a string124303
-Node: Word break property125395
-Node: unilbrk.h126722
-Node: uninorm.h131018
-Node: Decomposition of characters131655
-Node: Composition of characters135436
-Node: Normalization of strings136149
-Node: Normalizing comparisons138226
-Node: Normalization of streams140628
-Node: unicase.h142753
-Node: Case mappings of characters143442
-Node: Case mappings of strings145591
-Node: Case mappings of substrings148942
-Node: Case insensitive comparison155864
-Node: Case detection161269
-Node: uniregex.h164583
-Node: Using the library164810
-Node: Installation165221
-Node: Compiler options165708
-Node: Include files167348
-Node: Autoconf macro168601
-Node: Reporting problems170241
-Node: More functionality171059
-Node: Licenses171502
-Node: GNU GPL173933
-Node: GNU LGPL211678
-Node: GNU FDL220161
-Node: Index245470
+Node: Introduction3950
+Node: Unicode6043
+Node: Unicode and i18n7928
+Node: Locale encodings9590
+Node: In-memory representation11855
+Node: char * strings13853
+Node: Unicode strings19340
+Node: Conventions20523
+Node: unitypes.h22815
+Node: unistr.h23912
+Node: Elementary string checks24477
+Node: Elementary string conversions25099
+Node: Elementary string functions26977
+Node: Iterating27382
+Node: Creating Unicode strings30212
+Node: Copying Unicode strings31130
+Node: Comparing Unicode strings32743
+Node: Searching for a character34298
+Node: Counting characters35097
+Node: Elementary string functions with memory allocation35780
+Node: Elementary string functions on NUL terminated strings36402
+Node: Iterating over a NUL terminated Unicode string37001
+Node: Length39269
+Node: Copying a NUL terminated Unicode string40327
+Node: Comparing NUL terminated Unicode strings43431
+Node: Duplicating a NUL terminated Unicode string45527
+Node: Searching for a character in a NUL terminated Unicode string46296
+Node: Searching for a substring49060
+Node: Tokenizing50583
+Node: uniconv.h51456
+Node: unistdio.h59409
+Node: uniname.h67662
+Node: unictype.h69068
+Node: General category69996
+Node: Object oriented API71051
+Node: Bit mask API80892
+Node: Canonical combining class83187
+Node: Bidi class87421
+Node: Decimal digit value90834
+Node: Digit value91391
+Node: Numeric value91952
+Node: Mirrored character92854
+Node: Arabic shaping93547
+Node: Joining type94020
+Node: Joining group96170
+Node: Properties99608
+Node: Properties as objects100299
+Node: Properties as functions107321
+Node: Scripts113337
+Node: Blocks114742
+Node: ISO C and Java syntax116085
+Node: Classifications like in ISO C117803
+Node: uniwidth.h120615
+Node: unigbrk.h122661
+Node: Grapheme cluster breaks in a string124155
+Node: Grapheme cluster break property127090
+Node: uniwbrk.h129335
+Node: Word breaks in a string129873
+Node: Word break property130965
+Node: unilbrk.h132292
+Node: uninorm.h136588
+Node: Decomposition of characters137225
+Node: Composition of characters141006
+Node: Normalization of strings141719
+Node: Normalizing comparisons143892
+Node: Normalization of streams146390
+Node: unicase.h148515
+Node: Case mappings of characters149204
+Node: Case mappings of strings151353
+Node: Case mappings of substrings154992
+Node: Case insensitive comparison162202
+Node: Case detection167895
+Node: uniregex.h171209
+Node: Using the library171436
+Node: Installation171847
+Node: Compiler options172335
+Node: Include files173975
+Node: Autoconf macro175228
+Node: Reporting problems176868
+Node: More functionality177686
+Node: The wchar_t mess178137
+Node: Licenses180475
+Node: GNU GPL182904
+Node: GNU LGPL220649
+Node: GNU FDL229132
+Node: Index254441
 End Tag Table