6.4.4.4 Character constants


866

character-constant:
                ' c-char-sequence '
                L' c-char-sequence '

c-char-sequence:
                c-char
                c-char-sequence c-char

c-char:
                any member of the source character set except
                        the single-quote ', backslash \, or new-line character
                escape-sequence

escape-sequence:
                simple-escape-sequence
                octal-escape-sequence
                hexadecimal-escape-sequence
                universal-character-name

simple-escape-sequence: one of
                \' \" \? \\
                \a \b \f \n \r \t \v

octal-escape-sequence:
                \ octal-digit
                \ octal-digit octal-digit
                \ octal-digit octal-digit octal-digit

hexadecimal-escape-sequence:
                \x hexadecimal-digit
                hexadecimal-escape-sequence hexadecimal-digit

867 An integer character constant is a sequence of one or more multibyte characters enclosed in single-quotes, as in 'x'.

868 A wide character constant is the same, except prefixed by the letter L.

869 With a few exceptions detailed later, the elements of the sequence are any members of the source character set;

870 they are mapped in an implementation-defined manner to members of the execution character set.

871 The single-quote ', the double-quote ", the question-mark ?, the backslash \, and arbitrary integer values are representable according to the following table of escape sequences:

single quote   '       \'
double quote   "       \"
question mark  ?       \?
backslash      \       \\
octal character        \octal digits
hexadecimal character  \xhexadecimal digits

872 The double-quote " and question-mark ? are representable either by themselves or by the escape sequences \" and \?, respectively, but the single-quote ' and the backslash \ shall be represented, respectively, by the escape sequences \' and \\.

873 The octal digits that follow the backslash in an octal escape sequence are taken to be part of the construction of a single character for an integer character constant or of a single wide character for a wide character constant.

874 The numerical value of the octal integer so formed specifies the value of the desired character or wide character.

875 The hexadecimal digits that follow the backslash and the letter x in a hexadecimal escape sequence are taken to be part of the construction of a single character for an integer character constant or of a single wide character for a wide character constant.

876 The numerical value of the hexadecimal integer so formed specifies the value of the desired character or wide character.

877 Each octal or hexadecimal escape sequence is the longest sequence of characters that can constitute the escape sequence.

878 In addition, characters not in the basic character set are representable by universal character names and certain nongraphic characters are representable by escape sequences consisting of the backslash \ followed by a lowercase letter: \a, \b, \f, \n, \r, \t, and \v.65)

879 65) The semantics of these characters were discussed in 5.2.2.

880 If any other character follows a backslash, the result is not a token and a diagnostic is required.

881 See “future language directions” (6.11.4).

882 The value of an octal or hexadecimal escape sequence shall be in the range of representable values for the type unsigned char for an integer character constant, or the unsigned type corresponding to wchar_t for a wide character constant.

883 An integer character constant has type int.

884 The value of an integer character constant containing a single character that maps to a single-byte execution character is the numerical value of the representation of the mapped character interpreted as an integer.

885 The value of an integer character constant containing more than one character (e.g., 'ab'), or containing a character or escape sequence that does not map to a single-byte execution character, is implementation-defined.

886 If an integer character constant contains a single character or escape sequence, its value is the one that results when an object with type char whose value is that of the single character or escape sequence is converted to type int.

887 A wide character constant has type wchar_t, an integer type defined in the <stddef.h> header.

888 The value of a wide character constant containing a single multibyte character that maps to a member of the extended execution character set is the wide character corresponding to that multibyte character, as defined by the mbtowc function, with an implementation-defined current locale.

889 The value of a wide character constant containing more than one multibyte character, or containing a multibyte character or escape sequence not represented in the extended execution character set, is implementation-defined.

890 EXAMPLE 1 The construction '\0' is commonly used to represent the null character.

891 EXAMPLE 2 Consider implementations that use two's-complement representation for integers and eight bits for objects that have type char. In an implementation in which type char has the same range of values as signed char, the integer character constant '\xFF' has the value -1; if type char has the same range of values as unsigned char, the character constant '\xFF' has the value +255.

892 EXAMPLE 3 Even if eight bits are used for objects that have type char, the construction '\x123' specifies an integer character constant containing only one character, since a hexadecimal escape sequence is terminated only by a non-hexadecimal character. To specify an integer character constant containing the two characters whose values are '\x12' and '3', the construction '\0223' may be used, since an octal escape sequence is terminated after three octal digits. (The value of this two-character integer character constant is implementation-defined.)

893 EXAMPLE 4 Even if 12 or more bits are used for objects that have type wchar_t, the construction L'\1234' specifies the implementation-defined value that results from the combination of the values 0123 and '4'.

894 Forward references: common definitions <stddef.h> (7.17), the mbtowc function (7.20.7.2).


Created at: 2008-01-30 02:39:42 The text from WG14/N1256 is copyright © ISO