6.4.3 Universal character names



universal-character-name:
                \u hex-quad
                \U hex-quad hex-quad

hex-quad:
                hexadecimal-digit hexadecimal-digit hexadecimal-digit hexadecimal-digit
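
The grammar above can be read as "a backslash, then u and four hexadecimal digits, or U and eight hexadecimal digits". A minimal sketch of a recognizer for that syntax follows. The names hex_digit_value and parse_ucn are hypothetical, chosen for illustration, and the sketch checks only the syntax, not the constraint on which values may be specified.

```c
#include <stddef.h>

/* Hypothetical helper: value of one hexadecimal digit, or -1 if c is not one.
   Assumes the letters a-f and A-F are contiguous, as in ASCII and EBCDIC. */
static int hex_digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Hypothetical recognizer for the syntax of a universal character name.
 * On a match, stores the short identifier in *value and returns the number
 * of source characters consumed (6 for \u, 10 for \U). Returns 0 otherwise.
 */
size_t parse_ucn(const char *s, unsigned long *value)
{
    size_t ndigits, i;
    unsigned long v = 0;

    if (s[0] != '\\')
        return 0;
    if (s[1] == 'u')
        ndigits = 4;            /* \u hex-quad */
    else if (s[1] == 'U')
        ndigits = 8;            /* \U hex-quad hex-quad */
    else
        return 0;

    for (i = 0; i < ndigits; i++) {
        int d = hex_digit_value(s[2 + i]);
        if (d < 0)
            return 0;           /* not a hex digit (this also catches '\0') */
        v = (v << 4) | (unsigned long)d;
    }
    *value = v;
    return 2 + ndigits;
}
```

Given the six source characters \u00E9, this sketch reports 6 characters consumed and the value 0xE9. A three-digit sequence such as \u0E9 is rejected because a hex-quad needs exactly four digits.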

816 A universal character name shall not specify a character whose short identifier is less than 00A0 other than 0024 ($), 0040 (@), or 0060 (`), nor one in the range D800 through DFFF inclusive.62)
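
The constraint can be expressed as a small predicate. The following is a sketch only; the function name ucn_is_permitted is hypothetical, and it tests nothing beyond the two conditions stated in the sentence above (for instance, it does not check whether ISO/IEC 10646 actually assigns a character to the value).

```c
#include <stdbool.h>

/* Hypothetical predicate: may a universal character name specify the
   character whose short identifier is id? */
bool ucn_is_permitted(unsigned long id)
{
    /* Below 00A0, only 0024 ($), 0040 (@) and 0060 (`) are allowed. */
    if (id < 0x00A0UL)
        return id == 0x0024UL || id == 0x0040UL || id == 0x0060UL;

    /* The range D800 through DFFF inclusive is excluded. */
    if (id >= 0xD800UL && id <= 0xDFFFUL)
        return false;

    return true;
}
```

Under this predicate 0041 (the letter A, a member of the basic character set) is rejected, while 00E9 and E000 are accepted.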

817 Universal character names may be used in identifiers, character constants, and string literals to designate characters that are not in the basic character set.
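
A short illustration of the three contexts follows. It is a sketch that assumes a translator accepting universal character names in identifiers and an execution character set able to represent U+00E9; all object and function names are made up for the example.

```c
#include <wchar.h>

/* Identifier: the object name contains U+00E9 (LATIN SMALL LETTER E WITH ACUTE). */
int caf\u00E9_count = 3;

/* Character constant: a wide character constant designating U+00E9. */
wchar_t ucn_wide_char = L'\u00E9';

/* String literal: the universal character name is converted to the execution
   character set, so the number of bytes it occupies is implementation-dependent. */
const char *ucn_word = "caf\u00E9";

/* The identifier is spelled the same way wherever it is referenced. */
int get_cafe_count(void)
{
    return caf\u00E9_count;
}
```

A wide character constant is used here because, in a multibyte execution character set such as UTF-8, U+00E9 does not fit in a single char.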

818 The universal character name \Unnnnnnnn designates the character whose eight-digit short identifier (as specified by ISO/IEC 10646) is nnnnnnnn.63)

819 Similarly, the universal character name \unnnn designates the character whose four-digit short identifier is nnnn (and whose eight-digit short identifier is 0000nnnn).
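
The relationship between the two forms can be checked directly: \unnnn and \U0000nnnn designate the same character, so literals written either way have the same contents. The sketch below uses hypothetical function names and assumes the execution character set can represent U+00E9.

```c
#include <stdio.h>
#include <string.h>
#include <wchar.h>

/* Both spellings designate U+00E9, so the string literals compare equal. */
int narrow_forms_match(void)
{
    return strcmp("\u00E9", "\U000000E9") == 0;
}

int wide_forms_match(void)
{
    return wcscmp(L"\u00E9", L"\U000000E9") == 0;
}

/* Hypothetical helper: write the eight-digit short identifier (0000nnnn)
   that corresponds to the four-digit short identifier nnnn.
   out must have room for eight digits and the terminating null. */
void eight_digit_form(unsigned int nnnn, char out[9])
{
    snprintf(out, 9, "%08X", nnnn & 0xFFFFu);
}
```

For example, passing 0x00E9 to eight_digit_form yields the string 000000E9, the identifier that would follow \U.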

820 62) The disallowed characters are the characters in the basic character set and the code positions reserved by ISO/IEC 10646 for control characters, the character DELETE, and the S-zone (reserved for use by UTF-16).

821 63) Short identifiers for characters were first specified in ISO/IEC 10646–1/AMD9:1997.


Created at: 2008-01-30 02:39:41 The text from WG14/N1256 is copyright © ISO