charmap - character symbols to define character encodings
A character set description (charmap) defines a character set of available characters and their encodings. All supported character sets should have the portable character set as a proper subset. The portable character set is defined in the file /usr/lib/nls/charmap/POSIX for reference purposes.
The charmap file starts with a header that may consist of the following keywords:
<codeset> |
is followed by the name of the codeset. |
<mb_cur_max> |
is followed by the maximum number of bytes in a multibyte character. Multibyte characters are currently not supported. The default value is 1.
<mb_cur_min> |
is followed by the minimum number of bytes in a character. This value must be less than or equal to <mb_cur_max>. If not specified, it defaults to <mb_cur_max>.
<escape_char> |
is followed by a character that should be used as the escape-character for the rest of the file to mark characters that should be interpreted in a special way. It defaults to the backslash ( \ ). |
<comment_char> |
is followed by a character that will be used as the comment-character for the rest of the file. It defaults to the number sign ( # ). |
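The header keywords and their defaults can be illustrated with a minimal Python sketch. It assumes each header line holds a keyword and its value separated by whitespace, and that the header ends at the CHARMAP line. The function name parse_header and the sample codeset name are hypothetical and not part of any library.

```python
def parse_header(lines):
    """Collect charmap header keywords into a dict (illustrative sketch only)."""
    # Defaults as stated in the keyword descriptions above.
    header = {"mb_cur_max": "1", "escape_char": "\\", "comment_char": "#"}
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("CHARMAP"):      # the header ends where the body begins
            break
        if not line.strip() or line.startswith(header["comment_char"]):
            continue                        # skip blank lines and comment lines
        parts = line.split(None, 1)
        keyword = parts[0]
        if len(parts) != 2 or not (keyword.startswith("<") and keyword.endswith(">")):
            raise ValueError(f"malformed header line: {line!r}")
        header[keyword[1:-1]] = parts[1].strip()
    # <mb_cur_min> falls back to <mb_cur_max> when it is not given.
    header.setdefault("mb_cur_min", header["mb_cur_max"])
    return header


# Hypothetical sample input; the codeset name is made up for illustration.
sample = ["<codeset> DEMO-8", "<comment_char> %", "% a comment", "CHARMAP"]
print(parse_header(sample))
```

A real implementation may accept a different header layout; the sketch only mirrors the defaults described here.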
The charmap-definition itself starts with the keyword CHARMAP in column 1. The following lines may have one of the two following forms to define the character-encodings: |
<symbolic-name> <encoding> <comments> |
This form defines exactly one character and its encoding.
<symbolic-name>...<symbolic-name> <encoding> <comments> |
This form defines a range of characters. It is only useful for multibyte characters, which are currently not implemented.
The last line in a charmap-definition file must contain END CHARMAP. |
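As a rough illustration of the layout described above, the following Python sketch reads the lines between CHARMAP and END CHARMAP and maps each symbolic name to its encoding token, still in textual form. It assumes the fields are separated by whitespace and that symbolic names contain no escaped whitespace. The name parse_charmap_body is hypothetical.

```python
def parse_charmap_body(lines, comment_char="#"):
    """Map symbolic names to raw encoding tokens (illustrative sketch only)."""
    entries = {}
    in_body = False
    for raw in lines:
        line = raw.rstrip("\n")
        if not in_body:
            # The body starts at the keyword CHARMAP in column 1.
            in_body = line.startswith("CHARMAP")
            continue
        if line.strip() == "END CHARMAP":
            return entries
        if not line.strip() or line.startswith(comment_char):
            continue                        # skip blank lines and comment lines
        fields = line.split()
        if len(fields) < 2:
            raise ValueError(f"expected a name and an encoding: {line!r}")
        # Anything after the second field is treated as a comment.
        entries[fields[0]] = fields[1]
    raise ValueError("missing END CHARMAP")


# Hypothetical sample file contents.
sample = [
    "CHARMAP",
    "<A> \\x41 LATIN CAPITAL LETTER A",
    "<B> \\d66",
    "END CHARMAP",
]
print(parse_charmap_body(sample))
```

A line in the range form is stored under its whole first field here, since the sketch makes no attempt to expand ranges.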
A symbolic name for a character contains only characters of the portable character set. The name itself is enclosed between angle brackets. A character following the <escape_char> is interpreted as itself; for example, the sequence ’<\\\>>’ represents the symbolic name ’\>’ enclosed in angle brackets.
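The escaping rule can be made concrete with a small Python sketch that scans one symbolic name and returns it without the enclosing angle brackets. The function name parse_symbolic_name is hypothetical, and the sketch assumes the escape character is a single character.

```python
def parse_symbolic_name(text, escape_char="\\"):
    """Return (name, rest) for text starting with '<' (illustrative sketch only)."""
    if not text.startswith("<"):
        raise ValueError("a symbolic name must start with '<'")
    chars = []
    i = 1
    while i < len(text):
        ch = text[i]
        if ch == escape_char:
            # The character after the escape character stands for itself.
            i += 1
            if i >= len(text):
                raise ValueError("escape character at end of input")
            chars.append(text[i])
        elif ch == ">":
            # An unescaped '>' closes the name.
            return "".join(chars), text[i + 1:]
        else:
            chars.append(ch)
        i += 1
    raise ValueError("missing closing '>'")


# The example from the text: three backslashes followed by two '>' characters.
name, rest = parse_symbolic_name(r"<\\\>>")
print(name)
```

Here the first backslash escapes the second and the third escapes the first '>', so only the final '>' ends the name.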
The encoding may take any one of the following three forms:
<escape_char>d<number> |
with a decimal number |
<escape_char>x<number> |
with a hexadecimal number |
<escape_char><number> |
with an octal number. |
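The three encoding forms can be decoded with a few lines of Python. This is a sketch under the assumption that an encoding token describes a single byte value; the function name decode_encoding is hypothetical.

```python
def decode_encoding(token, escape_char="\\"):
    """Return the integer value of one encoding token (illustrative sketch only)."""
    if not token.startswith(escape_char):
        raise ValueError(f"encoding must start with {escape_char!r}: {token!r}")
    body = token[len(escape_char):]
    if body[:1] == "d":
        return int(body[1:], 10)    # <escape_char>d<number>: decimal
    if body[:1] == "x":
        return int(body[1:], 16)    # <escape_char>x<number>: hexadecimal
    return int(body, 8)             # <escape_char><number>: octal


# All three forms below describe the same value, 65.
for token in ("\\d65", "\\x41", "\\101"):
    print(token, decode_encoding(token))
```

Passing the escape character as a parameter lets the same function serve files that redefine it with <escape_char>.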
/usr/lib/nls/charmap/* |
Jochen Hein (jochen.hein@delphi.central.de) |
POSIX.2 |
setlocale(3), localeconv(3), locale(1), locale(5), localedef(1)