Public Comment Number PC-UK0080

ISO/IEC CD 9899 (SC22N2620) Public Comment
===========================================

Date: 1998-02-25
Author: N.M Maclaren
Author Affiliation: Self
Postal Address: University of Cambridge, Computer Laboratory, New Museums Site, Pembroke Street, Cambridge CB3 3QG, United Kingdom
E-mail Address:
Telephone Number: +44 1223 334761
Fax Number: +44 1223 334679
Number of individual comments: 1

Comment 1.
Category: Normative change to existing feature retaining the original intent
Committee Draft subsection: 5.1.1.2
Title: Universal character names and #include
Detailed description:

I am afraid that universal character names have introduced some incompatibilities with C89, as in directives like '#include "a$b.h"'. This was defined in C89, but becomes undefined in C9X (i.e. it maps to '#include "a\u0024b.h"'). This is a deceptive trap, and one that will cause serious problems on some systems, as "$" is a fairly common character in header names.

An evil variant of this is '#include "\udefault\a$b.h"' on MS-DOS and derivative systems. This maps to "\udefault\a\u0024b.h", which is either handled as such or mapped to "?ult\a$b.h", where '?' is whatever '\udefa' is. The current wording implies that one or the other approach should be used, though I don't think that it actually forbids mapping back to "\udefault\a$b.h".

Note that these are quiet changes. In C89, "$" can be used in a program as a normal character, subject ONLY to it being a member of the basic source (and, if necessary, the basic execution) character sets. In C9X, it has an implementation-defined value that need not be that of any character.

I suggest adding the following to help with the header and pragma problems:

    Whether a universal character name or character not in the basic source character set is interpreted during phase 4 in its universal character name form or as a single character is implementation-defined. Under this circumstance alone, an actual extended character encountered in the input, and the same extended character expressed in the input as a universal-character-name, need not be handled equivalently.

    Recommended practice

    An implementation should, when appropriate, interpret characters in their input form during phase 4. A good implementation will diagnose uses when this is not the case, or where there is potential ambiguity.
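
Illustration (not part of the proposed wording): the mapping described in the detailed description can be sketched in C as follows. This is only a rough model of what the draft implies, not how any particular implementation works. The function names map_to_ucn and is_basic_source_char are hypothetical, and the sketch assumes an ASCII or Latin-1 source encoding, so that a character's value is also its code point.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Basic source character set: letters, digits, the 29 graphic
   characters, and space, horizontal tab, vertical tab, form feed.
   Note that '$', '@' and '`' are not members. */
static int is_basic_source_char(char c)
{
    static const char basic[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "abcdefghijklmnopqrstuvwxyz"
        "0123456789"
        "!\"#%&'()*+,-./:;<=>?[\\]^_{|}~ \t\v\f";
    return c != '\0' && strchr(basic, c) != NULL;
}

/* Hypothetical helper: copy `in` to `out`, rewriting every character
   outside the basic source character set as \uXXXX.  Assumes an
   ASCII/Latin-1 encoding (character value == code point).
   Returns 0 on success, -1 if `out` (of size `outsize`) is too small. */
int map_to_ucn(const char *in, char *out, size_t outsize)
{
    size_t used = 0;

    assert(in != NULL && out != NULL);
    for (; *in != '\0'; ++in) {
        if (is_basic_source_char(*in)) {
            /* Need room for this character and the terminator. */
            if (used + 2 > outsize)
                return -1;
            out[used++] = *in;
        } else {
            /* Need room for six characters and the terminator. */
            if (used + 7 > outsize)
                return -1;
            snprintf(out + used, outsize - used, "\\u%04X",
                     (unsigned)(unsigned char)*in);
            used += 6;
        }
    }
    if (used >= outsize)
        return -1;
    out[used] = '\0';
    return 0;
}
```

Under these assumptions, the header name a$b.h comes out as a\u0024b.h, and \udefault\a$b.h comes out as \udefault\a\u0024b.h. The backslashes already present in the second name are copied unchanged, because backslash is a basic character; that is why a later phase cannot tell the path component \udefault from the start of a universal character name.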