6.4 Lexical elements




preprocessing-token:
    header-name
    identifier
    pp-number
    character-constant
    string-literal
    punctuator
    each non-white-space character that cannot be one of the above

771 Each preprocessing token that is converted to a token shall have the lexical form of a keyword, an identifier, a constant, a string literal, or a punctuator.

772 A token is the minimal lexical element of the language in translation phases 7 and 8.

773 The categories of tokens are: keywords, identifiers, constants, string literals, and punctuators.

774 A preprocessing token is the minimal lexical element of the language in translation phases 3 through 6.

775 The categories of preprocessing tokens are: header names, identifiers, preprocessing numbers, character constants, string literals, punctuators, and single non-white-space characters that do not lexically match the other preprocessing token categories.58)

776 If a ' or a " character matches the last category, the behavior is undefined.

777 Preprocessing tokens can be separated by white space;

778 this consists of comments (described later), or white-space characters (space, horizontal tab, new-line, vertical tab, and form-feed), or both.
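As a small illustration of comments acting as white space between preprocessing tokens, the sketch below uses comments as the only separators. The function name forty_two is hypothetical and chosen only for this example.

```c
/* The comment is the only separator between the keyword and the
   function name; it acts as white space between the two tokens. */
int/* separates 'int' from the identifier */forty_two(void)
{
    /* An empty comment separates 'return' from the constant. */
    return/**/42;
}
```

Without the comments, the characters would run together into the single identifiers intforty_two and return42, and the code would not compile.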

779 As described in 6.10, in certain circumstances during translation phase 4, white space (or the absence thereof) serves as more than preprocessing token separation.
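One such circumstance is a macro definition, where the presence or absence of white space between the macro name and a following ( decides whether the macro is function-like or object-like. The sketch below illustrates this. The names TWICE, ANSWER, twice_of and answer are hypothetical.

```c
/* Function-like macro: '(' follows the name with no white space,
   so x is a parameter. */
#define TWICE(x) ((x) * 2)

/* Object-like macro: white space follows the name, so everything
   after it, including the parentheses, is the replacement list. */
#define ANSWER (40 + 2)

int twice_of(int n) { return TWICE(n); }
int answer(void)    { return ANSWER; }
```

Writing `#define TWICE (x) ((x) * 2)` instead would define an object-like macro whose replacement list begins with (x), which is rarely what is intended.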

780 White space may appear within a preprocessing token only as part of a header name or between the quotation characters in a character constant or string literal.

781 58) An additional category, placemarkers, is used internally in translation phase 4 (see; it cannot occur in source files.

782 If the input stream has been parsed into preprocessing tokens up to a given character, the next preprocessing token is the longest sequence of characters that could constitute a preprocessing token.
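The longest-match rule can be sketched for punctuators as follows. This is a minimal illustration, not how any particular translator is implemented: the table puncts holds only a hypothetical subset of the punctuators, and the function name longest_punctuator is an assumption made for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical subset of the punctuators. Order does not matter
   because the longest matching entry is selected explicitly. */
static const char *const puncts[] = {
    "+", "++", "+=", "-", "--", "->", "<", "<<", "<<="
};

/* Return the length of the longest table entry that is a prefix
   of s, or 0 if no entry matches. */
size_t longest_punctuator(const char *s)
{
    size_t best = 0;
    for (size_t i = 0; i < sizeof puncts / sizeof puncts[0]; i++) {
        size_t n = strlen(puncts[i]);
        if (n > best && strncmp(s, puncts[i], n) == 0)
            best = n;
    }
    return best;
}
```

Applied to the text "+++++y", the function returns 2, so the first preprocessing token taken is ++ rather than +. The choice depends only on the characters, not on whether the resulting token sequence is syntactically valid.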

783 There is one exception to this rule: header name preprocessing tokens are recognized only within #include preprocessing directives or in implementation-defined locations within #pragma directives.

784 In such contexts, a sequence of characters that could be either a header name or a string literal is recognized as the former.

785 EXAMPLE 1 The program fragment 1Ex is parsed as a preprocessing number token (one that is not a valid floating or integer constant token), even though a parse as the pair of preprocessing tokens 1 and Ex might produce a valid expression (for example, if Ex were a macro defined as +1). Similarly, the program fragment 1E1 is parsed as a preprocessing number (one that is a valid floating constant token), whether or not E is a macro name.
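The second half of this example can be checked with a short sketch. The macro E and the function names ten and one_plus_e are hypothetical. The macro is defined only to show that it has no effect inside 1E1.

```c
#define E 1000  /* hypothetical macro; not expanded inside 1E1 */

/* 1E1 is a single preprocessing number, converted to the floating
   constant 10.0. The E inside it is never seen as an identifier. */
double ten(void) { return 1E1; }

/* Here 1, + and E are separate preprocessing tokens, so E is
   macro-replaced by 1000. */
int one_plus_e(void) { return 1 + E; }

/* By contrast, "1Ex" would be one preprocessing number that is not a
   valid constant, so a line such as "return 1Ex;" is rejected. */
```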


786 EXAMPLE 2 The program fragment x+++++y is parsed as x ++ ++ + y, which violates a constraint on increment operators, even though the parse x ++ + ++ y might yield a correct expression.
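The alternative parse can be obtained by separating the preprocessing tokens with white space, as in this minimal sketch. The function name post_plus_pre and the initial values are assumptions made for illustration.

```c
/* With explicit white space the tokens are x ++ + ++ y:
   a postfix increment of x added to a prefix increment of y. */
int post_plus_pre(void)
{
    int x = 1, y = 2;
    int r = x++ + ++y;  /* value of x before increment (1) plus
                           value of y after increment (3) */
    return r;
}
```

Removing the white space to give x+++++y yields the tokens x ++ ++ + y instead, which a conforming translator diagnoses.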

787 Forward references: character constants (, comments (6.4.9), expressions (6.5), floating constants (, header names (6.4.7), macro replacement (6.10.3), postfix increment and decrement operators (, prefix increment and decrement operators (, preprocessing directives (6.10), preprocessing numbers (6.4.8), string literals (6.4.5).


Created at: 2008-01-30 02:39:41 The text from WG14/N1256 is copyright © ISO