Last changed 2001-09-10. There are 3 items.

====
Potential Defect Report 1 (last modified 2001-09-10)

Notes from IST/5/-/14: this needs significant revision. Discuss sizeof
versus not-evaluated.

Subject: constant expressions and unevaluated subexpressions

-------
Problem
-------

Consider the code:

    #define ALPHA 0
    #if ALPHA ? (1,1) : 1

6.6 says:

    [#3] Constant expressions shall not contain assignment, increment,
    decrement, function-call, or comma operators, except when they are
    contained within a subexpression that is not evaluated.95

Naively one would therefore expect the #if line to be forbidden.
However, if ALPHA is defined as 0, the comma operator sits in the operand
of ?: that is not evaluated, so this wording does not forbid it. 6.10.1
does not contain any wording that would affect this.

This appears undesirable to me. It appears to have been an inadvertent
change made when introducing VLAs. The distinction is between expressions
that are not evaluated for syntax reasons (e.g. "&*x" or "sizeof (x = 1)")
and those that are not evaluated for evaluation reasons (e.g. "x && y" or
"x ? y : z"). The former are fine, but the latter should be forbidden.

Suggested Technical Corrigendum
-------------------------------

Change 6.6#3 to:

    [#3] Constant expressions shall not contain assignment, increment,
    decrement, function-call, or comma operators, except when they are
    contained within a subexpression that is not evaluated; for this
    purpose all the operands of the && || and ?: operators are deemed
    to be evaluated.95

and change footnote 95 to:

    95)The operand of a sizeof operator is usually not evaluated
    (6.5.3.4). Thus "sizeof(x=1)" is a constant expression if x has
    been declared appropriately, but "0 && (x=1)" is not.

====
Potential Defect Report 2 (last modified 2001-09-10)

Notes from IST/5/-/14: what will be the effect of making the change?
Will it actually help?
Subject: character encodings

-------
Problem
-------

Defect Report 091 discussed a multibyte character encoding where some
single-byte characters are proper prefixes of two-byte characters. For
example, single-byte characters have codes 1 to 127 while two-byte
characters consist of such a code followed by a code from 128 to 255. At
the time WG14 stated that such an encoding was legitimate.

Consider the following code:

    static mbstate_t ps_zero;
    mbstate_t ps = ps_zero;
    bool a = true;

    while (a)
    {
        char c = get_a_byte ();
        wchar_t w;

        printf ("Code 0x%2.2X -> ", (unsigned int)(unsigned char) c);
        switch (mbrtowc (&w, &c, 1, &ps))
        {
        case 0:
            printf ("null wide character\n");
            a = false;
            break;
        case 1:
            printf ("wide character 0x%X\n", (unsigned int) w);
            break;
        case -2:   /* converted to (size_t)-2: incomplete character */
            printf ("incomplete\n");
            break;
        case -1:   /* converted to (size_t)-1: encoding error */
            printf ("encoding error\n");
            ps = ps_zero;
            break;
        default:
            printf ("can't happen!\n");
            a = false;
            break;
        }
    }

which fetches bytes one at a time from some other source and converts
them to a sequence of wide characters.

Now imagine that the above encoding is used, that the wide character
mappings of 0x12 and 0x12 0x9A are 0x0012 and 0x9A12 respectively, and
that the actual input sequence is:

    0x51 0xCC 0x52 0x53 0x54 0x55 0x00

What will the output be? The byte 0x51 can't produce a wide character
yet, because the correct one cannot be determined until the next byte is
examined, so the call must return "incomplete". Therefore the first three
lines of output will be:

    Code 0x51 -> incomplete
    Code 0xCC -> wide character 0xCC51
    Code 0x52 -> incomplete

Now consider the next line. The byte 0x53 cannot be part of a multibyte
sequence beginning 0x52, so it must be viewed as completing a single-byte
sequence. However, what should be returned? If the result is 1, this
implies that the character 0x53 has been used. On the other hand, if the
result is 0, this specifically indicates a null character. So we must
take the lesser of two evils and return 1, storing the value 0x53 in the
state object.
This would make the next few lines of output:

    Code 0x53 -> wide character 0x52
    Code 0x54 -> wide character 0x53
    Code 0x55 -> wide character 0x54

which might be counterintuitive but is at least correct. By similar
logic, the 0x00 input must disgorge the final stored byte:

    Code 0x00 -> wide character 0x55

However, this then leads to two problems. Firstly, a naive implementer or
programmer might reasonably expect the null character to generate the
null wide character. Secondly, the source of bytes might well dry up
after a zero byte; for example, suppose that under certain circumstances
get_a_byte() examined a fixed string:

    static const char s [] = "\x51\xCC\x52\x53\x54\x55";
    static const char *p = s;

    return *p++;

Calling this again after the zero byte would amount to undefined
behaviour, because it reads past the end of the string. Even if this is
strictly correct, it is not what would be expected by the average
programmer.

Suggested Technical Corrigendum
-------------------------------

Add a bullet to 5.2.1.2#1:

    -- No multibyte character shall be a proper prefix of another
       multibyte character in the same shift state; that is, it shall
       be possible to determine the end of a multibyte character by
       examining only the bytes that make up the character. (The
       sequence of bytes that causes a change of shift state is a
       separate multibyte character for this purpose.)

====
Potential Defect Report 3 (last modified 2001-09-10)

Notes from IST/5/-/14: all uses of the term "object" would need to be
examined before making this change; many others might need alteration.

Subject: objects and sub-objects

-------
Problem
-------

I believe that each element of an array, each member of a structure or
union, and each byte of an object, is itself an object. Without this,
it's hard to see how many of the semantics of the language can work.
However, nowhere is it so stated.
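As an illustration of the kind of code that relies on this reading, here is a minimal sketch. The names struct pair, copy_second and nonzero_bytes are hypothetical and do not come from the standard or from this report. The first function passes the address of a structure member to memcpy, which is described in terms of objects; the second walks an int one byte at a time through an unsigned char pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical structure used only for this illustration. */
struct pair { int first; int second; };

/* Copies one member out of the structure. memcpy is specified as
   copying from one object to another, so this call treats the member
   p->second as an object in its own right. */
int copy_second(const struct pair *p)
{
    int out;

    assert(p != NULL);
    memcpy(&out, &p->second, sizeof out);
    return out;
}

/* Counts the nonzero bytes in the representation of an int, reading
   each byte of the int through an unsigned char lvalue. */
size_t nonzero_bytes(int value)
{
    const unsigned char *b = (const unsigned char *) &value;
    size_t n = 0;

    for (size_t i = 0; i < sizeof value; i++)
        if (b[i] != 0)
            n++;
    return n;
}
```

Both functions are ordinary, widely written C; the point of the sketch is only that their meaning is easiest to explain if members and bytes are themselves objects.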
Suggested Technical Corrigendum
-------------------------------

Add a new paragraph to 3.14 after #1:

    [#1a] Each element of an array object, and each member of a
    structure or union object, is itself an object. If an object is
    composed of more than one byte, each byte is itself an object.

====
Potential Defect Report 4 (last modified 2001-09-10)

Subject: parmN restrictions in

-------
Problem
-------

7.15.1.4#4 (va_start) reads in part:

    If the parameter parmN is declared with the register storage
    class, with a function or array type, or with a type that is not
    compatible with the type that results after application of the
    default argument promotions, the behavior is undefined.

The interesting restriction is that on function or array types. Is this
intended to apply before the adjustment to pointer types? Or is it an
oversight, because a parameter cannot have such a type? In other words,
is the following code legal?

    void fred (int jim [5], ...)
    {
        va_list va;
        va_start (va, jim);
        // ...
    }

Suggested Technical Corrigendum
-------------------------------

Change the cited words to either:

    If the parameter parmN is declared with the register storage class
    or with a type that is not compatible with the type that results
    after application of the default argument promotions, the behavior
    is undefined.

or:

    If the parameter parmN is declared with the register storage
    class, with a type that - before the adjustment to pointer types
    in 6.9.1 - would be a function or array type, or with a type that
    is not compatible with the type that results after application of
    the default argument promotions, the behavior is undefined.

====
Potential Defect Report 5 (last modified 2001-09-10)

Subject: Editorial

-------
Problem
-------

In 7.15.1.4#7, the last named parameter of f3 is "f4_after", but the
call to va_start uses the parameter "n_ptrs".

Suggested Technical Corrigendum
-------------------------------

Correct the example.
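For illustration only, the rule at stake in this editorial point can be sketched with a small variadic function. The function sum_after and its parameters are hypothetical, not the f3 example from the standard; the sketch simply shows va_start being given the last named parameter when a function has more than one.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical helper: adds up the `count` int arguments that follow
   the two named parameters, ignoring the first `skip` of them. */
int sum_after(int count, int skip, ...)
{
    va_list ap;
    int total = 0;

    assert(count >= 0 && skip >= 0);

    /* `skip` is the last named parameter, so it is the one handed to
       va_start. Passing `count` here would be the same kind of slip
       as the one reported in the f3 example. */
    va_start(ap, skip);
    for (int i = 0; i < count; i++) {
        int value = va_arg(ap, int);

        if (i >= skip)
            total += value;
    }
    va_end(ap);
    return total;
}
```

A call such as sum_after(4, 1, 10, 20, 30, 40) reads four variable arguments and skips the first, so it adds 20, 30 and 40.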
====