cgul 32-bit wide-character support


Macros | |
| #define | CGUL_WCHAR__NUL ((cgul_wchar_t)0) |
| #define | CGUL_WCHAR__BOM ((cgul_wchar_t)0xfeff) |
| #define | CGUL_WCHAR__MB_LEN_MAX (4) |
Functions | |
| CGUL_EXPORT size_t | cgul_wchar__wcslen (cgul_exception_t *cex, const cgul_wchar_t *ws) |
| CGUL_EXPORT void | cgul_wchar__wcscpy (cgul_exception_t *cex, cgul_wchar_t *dst, const cgul_wchar_t *src) |
| CGUL_EXPORT int | cgul_wchar__wcscmp (const cgul_wchar_t *ws1, const cgul_wchar_t *ws2) |
| CGUL_EXPORT int | cgul_wchar__isspace (cgul_exception_t *cex, cgul_wchar_t wc) |
Variables | |
| CGUL_BEGIN_C typedef cgul_uint32_t | cgul_wchar_t |
| CGUL_EXPORT const cgul_wchar_t | CGUL_WCHAR__EMPTY_STRING [1] |
This file defines cgul_wchar_t which is a 32-bit, wide-character type. It also defines the basic set of functions that manipulate raw, 32-bit, wide-character strings.
| #define CGUL_WCHAR__NUL ((cgul_wchar_t)0) |
The NUL character in a cgul_wchar_t string.
Referenced by cgul_unicode_cxx::hctowc(), and cgul_unicode_cxx::mbtowc().
| #define CGUL_WCHAR__BOM ((cgul_wchar_t)0xfeff) |
The Unicode Byte-Order Mark (BOM) character.
| #define CGUL_WCHAR__MB_LEN_MAX (4) |
The maximum number of bytes needed to hold any UTF-8 multi-byte sequence that represents one Unicode character.
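To illustrate why four bytes is the upper bound, here is a minimal sketch of a UTF-8 encoder for a single code point. The names `sketch_wchar_t`, `SKETCH_MB_LEN_MAX`, and `sketch_utf8_encode` are hypothetical stand-ins introduced for this example; they are not part of cgul, and cgul's own conversion routines may behave differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for cgul_wchar_t and CGUL_WCHAR__MB_LEN_MAX. */
typedef uint32_t sketch_wchar_t;
#define SKETCH_MB_LEN_MAX (4)

/* Encode one code point as UTF-8 into out, which must hold at least
 * SKETCH_MB_LEN_MAX bytes. Returns the number of bytes written, or 0
 * if wc is a surrogate or lies above U+10FFFF. */
size_t sketch_utf8_encode(sketch_wchar_t wc, unsigned char *out)
{
    assert(out != NULL);
    if (wc < 0x80) {
        out[0] = (unsigned char)wc;
        return 1;
    }
    if (wc < 0x800) {
        out[0] = (unsigned char)(0xC0 | (wc >> 6));
        out[1] = (unsigned char)(0x80 | (wc & 0x3F));
        return 2;
    }
    if (wc < 0x10000) {
        if (wc >= 0xD800 && wc <= 0xDFFF) {
            return 0; /* surrogates are not characters */
        }
        out[0] = (unsigned char)(0xE0 | (wc >> 12));
        out[1] = (unsigned char)(0x80 | ((wc >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (wc & 0x3F));
        return 3;
    }
    if (wc <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (wc >> 18));
        out[1] = (unsigned char)(0x80 | ((wc >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((wc >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (wc & 0x3F));
        return 4;
    }
    return 0;
}
```

A caller can declare `unsigned char buf[SKETCH_MB_LEN_MAX];` and be sure that any single character fits.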
| CGUL_EXPORT size_t cgul_wchar__wcslen | ( | cgul_exception_t * | cex, |
| const cgul_wchar_t * | ws | ||
| ) |
This function returns the number of 32-bit wide characters in ws. The return value is determined by counting all the wide characters that come before the trailing NUL wide character. The caller is responsible for ensuring that ws is NUL-terminated.
| [in] | cex | c-style exception |
| [in] | ws | 32-bit wide-character string |
Returns the number of 32-bit wide characters in ws.
Referenced by cgul_wchar_cxx::wcslen().
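The counting rule described above can be sketched as follows. This is not the library's implementation: `sketch_wchar_t` and `sketch_wcslen` are hypothetical names, and the sketch omits the `cex` exception parameter that the real function takes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for cgul_wchar_t. */
typedef uint32_t sketch_wchar_t;

/* Count the wide characters that precede the trailing NUL.
 * ws must be NUL-terminated; nothing here can detect a missing NUL. */
size_t sketch_wcslen(const sketch_wchar_t *ws)
{
    size_t n = 0;

    assert(ws != NULL);
    while (ws[n] != 0) {
        n++;
    }
    return n;
}
```

Each array element counts as one character regardless of its value, so a code point outside the Basic Multilingual Plane still contributes exactly one to the length.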
| CGUL_EXPORT void cgul_wchar__wcscpy | ( | cgul_exception_t * | cex, |
| cgul_wchar_t * | dst, | ||
| const cgul_wchar_t * | src | ||
| ) |
This function copies the 32-bit wide characters in src to dst. The caller is responsible for making sure that src is NUL-terminated and that dst is large enough to hold all the characters in src including the trailing NUL character.
| [in] | cex | c-style exception |
| [out] | dst | destination 32-bit wide-character string |
| [in] | src | source 32-bit wide-character string |
Referenced by cgul_wchar_cxx::wcscpy().
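The sizing requirement on dst is easy to get wrong, so here is a minimal sketch of a copy that allocates room for the characters plus the trailing NUL. `sketch_wchar_t`, `sketch_wcscpy`, and `sketch_wcsdup` are hypothetical names for illustration only; the sketch omits the `cex` parameter and does not reflect how cgul reports errors.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for cgul_wchar_t. */
typedef uint32_t sketch_wchar_t;

/* Copy src, including its trailing NUL, into dst.
 * dst must have room for every character in src plus the NUL. */
void sketch_wcscpy(sketch_wchar_t *dst, const sketch_wchar_t *src)
{
    assert(dst != NULL && src != NULL);
    while ((*dst++ = *src++) != 0) {
        /* the assignment in the loop condition does the copying */
    }
}

/* Allocate a correctly sized buffer and copy src into it.
 * Returns NULL if allocation fails; the caller frees the result. */
sketch_wchar_t *sketch_wcsdup(const sketch_wchar_t *src)
{
    size_t len = 0;
    sketch_wchar_t *dst;

    assert(src != NULL);
    while (src[len] != 0) {
        len++;
    }
    dst = malloc((len + 1) * sizeof *dst); /* +1 for the trailing NUL */
    if (dst != NULL) {
        sketch_wcscpy(dst, src);
    }
    return dst;
}
```

Note that the buffer size is measured in characters times `sizeof *dst`, not in bytes of the source text.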
| CGUL_EXPORT int cgul_wchar__wcscmp | ( | const cgul_wchar_t * | ws1, |
| const cgul_wchar_t * | ws2 | ||
| ) |
Perform a case-significant, 32-bit, wide-character string comparison of ws1 and ws2. Return -1, 0, or 1 if ws1 is less than, equal to, or greater than ws2 respectively.
For simplicity, this function compares strings strictly by the ordinal value of each character. That should be sufficient, for example, when the strings are inserted as keys into a cgul_rbtree. However, the results are not likely to match what strcoll() would return for the UTF-8 versions of the strings.
Note that this function does not take a cgul_exception object as its first parameter, which makes it easier to use as the comparison function of cgul_rbtree objects.
| [in] | ws1 | left-hand side |
| [in] | ws2 | right-hand side |
Returns -1, 0, or 1 if ws1 is less than, equal to, or greater than ws2, respectively.
Referenced by cgul_wchar_cxx::wcscmp().
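A minimal sketch of an ordinal comparison with the -1, 0, 1 contract described above follows. `sketch_wchar_t` and `sketch_wcscmp` are hypothetical names; the library's actual implementation is not shown in this document and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for cgul_wchar_t. */
typedef uint32_t sketch_wchar_t;

/* Compare two NUL-terminated strings by the numeric value of each
 * character. Returns -1, 0, or 1. */
int sketch_wcscmp(const sketch_wchar_t *ws1, const sketch_wchar_t *ws2)
{
    assert(ws1 != NULL && ws2 != NULL);
    /* Skip the common prefix, stopping at the NUL of ws1. */
    while (*ws1 != 0 && *ws1 == *ws2) {
        ws1++;
        ws2++;
    }
    if (*ws1 < *ws2) {
        return -1;
    }
    if (*ws1 > *ws2) {
        return 1;
    }
    return 0;
}
```

Under this ordering "Z" (0x5A) sorts before "a" (0x61), which is one way the result can differ from a locale-aware collation.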
| CGUL_EXPORT int cgul_wchar__isspace | ( | cgul_exception_t * | cex, |
| cgul_wchar_t | wc | ||
| ) |
Return whether the 32-bit, wide character wc is considered to be white-space. To avoid locale dependencies, white-space is defined as any character from the following list: ' ', '\t', '\n', '\r', '\f', '\v'
| [in] | cex | c-style exception |
| [in] | wc | 32-bit, wide character |
Referenced by cgul_wchar_cxx::isspace().
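The fixed white-space list can be sketched as a simple switch. `sketch_wchar_t`, `sketch_isspace`, and `sketch_skip_space` are hypothetical names; the sketch omits the `cex` parameter, and returning exactly 1 or 0 is an assumption, since the document only says the function returns whether wc is white-space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for cgul_wchar_t. */
typedef uint32_t sketch_wchar_t;

/* Return 1 if wc is one of the six listed characters, else 0.
 * Other Unicode spaces such as U+00A0 are deliberately not included. */
int sketch_isspace(sketch_wchar_t wc)
{
    switch (wc) {
    case ' ':
    case '\t':
    case '\n':
    case '\r':
    case '\f':
    case '\v':
        return 1;
    default:
        return 0;
    }
}

/* Return a pointer to the first non-white-space character in ws,
 * which may be its trailing NUL. */
const sketch_wchar_t *sketch_skip_space(const sketch_wchar_t *ws)
{
    assert(ws != NULL);
    while (sketch_isspace(*ws)) {
        ws++;
    }
    return ws;
}
```

Because the list is hard-coded, the result does not change with the current locale.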
| CGUL_BEGIN_C typedef cgul_uint32_t cgul_wchar_t |
The cgul_wchar_t typedef always defines a 32-bit wide character, which means it is always large enough to hold any Unicode value, even on machines that have a native wchar_t that is only 16 bits wide.
If you need to convert between cgul_wchar_t and wchar_t, there are routines in cgul_unicode that convert between UTF-32 and UTF-8. You can then use an operating-system-dependent mechanism for converting from UTF-8 to wchar_t. Alternatively, it may just be easier to write a simple loop that copies each character from a cgul_wchar_t string to a wchar_t string or vice versa.
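The simple-loop approach might look like the sketch below. `sketch_wchar_t` and `sketch_to_native` are hypothetical names, and the range check is an addition of this example: where wchar_t is 16 bits wide, a code point above U+FFFF cannot be copied one-for-one, so the sketch reports failure rather than truncating.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* Hypothetical stand-in for cgul_wchar_t. */
typedef uint32_t sketch_wchar_t;

/* Copy src into dst one character at a time. dst must have room for
 * every character in src plus the trailing NUL. Returns 0 on success,
 * or -1 if a character does not fit in this platform's wchar_t; in
 * that case dst holds the characters copied so far, NUL-terminated. */
int sketch_to_native(wchar_t *dst, const sketch_wchar_t *src)
{
    assert(dst != NULL && src != NULL);
    for (; *src != 0; src++, dst++) {
        if (*src > (sketch_wchar_t)WCHAR_MAX) {
            *dst = L'\0';
            return -1;
        }
        *dst = (wchar_t)*src;
    }
    *dst = L'\0';
    return 0;
}
```

The reverse direction is the same loop with the types swapped, although on a 16-bit wchar_t platform surrogate pairs would need to be combined, which this sketch does not attempt.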
Referenced by cgul_bdf_glyph_cxx::get_descriptive_name(), cgul_bdf_glyph_cxx::get_encoding(), cgul_wstring_cxx::get_value(), cgul_wstring_cxx::get_value_at(), cgul_unicode_cxx::hctomb(), cgul_unicode_cxx::hctowc(), cgul_unicode_cxx::mbtowc(), cgul_wstring_cxx::operator cgul_wchar_t *(), cgul_wstring_cxx::operator+=(), cgul_wstring_cxx::operator[](), cgul_wstring_cxx::take_value(), and cgul_unicode_cxx::wcstohcs().
| CGUL_EXPORT const cgul_wchar_t CGUL_WCHAR__EMPTY_STRING[1] |
The most annoying thing about using cgul_wchar_t instead of wchar_t is that we lose support for the compiler automatically generating wide-character strings by simply prepending 'L' to the string literal. Because we often need the empty string, this file provides one as a convenience.
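As an illustration of what replaces the missing literal syntax, the sketch below spells out a constant string as an array initializer and defines a shared empty string. All names (`sketch_wchar_t`, `SKETCH_EMPTY_STRING`, `SKETCH_HELLO`, `sketch_or_empty`, `sketch_is_empty`) are hypothetical and are not part of cgul.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for cgul_wchar_t. */
typedef uint32_t sketch_wchar_t;

/* Hypothetical analogue of CGUL_WCHAR__EMPTY_STRING: only the NUL. */
const sketch_wchar_t SKETCH_EMPTY_STRING[1] = { 0 };

/* Without an L"..." style literal, each character is written out. */
const sketch_wchar_t SKETCH_HELLO[] = { 'h', 'e', 'l', 'l', 'o', 0 };

/* Return ws, or the shared empty string when ws is NULL. */
const sketch_wchar_t *sketch_or_empty(const sketch_wchar_t *ws)
{
    return ws != NULL ? ws : SKETCH_EMPTY_STRING;
}

/* Return 1 if ws contains no characters before its NUL, else 0. */
int sketch_is_empty(const sketch_wchar_t *ws)
{
    assert(ws != NULL);
    return ws[0] == 0;
}
```

A shared constant like this avoids declaring a one-element array at every call site that needs an empty string.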