Public Comment Number PC-UK0019

ISO/IEC CD 9899 (SC22N2620) Public Comment

=========================================== 

Date: 1998-01-06
Author: Clive D.W. Feather
Author Affiliation: Self
Postal Address:
    Demon Internet Limited
    322 Regents Park Road
    London
    N3  2QQ
    United Kingdom
E-mail Address: <clive@demon.net>
Telephone Number: +44 181 371 1138
Fax Number:       +44 181 371 1037
Number of individual comments: 1


Comment 1. 
Category: Request for information/clarification
Committee Draft subsection: 5.1.1.2, 6.3.1.1

Title: handling of characters not in the execution character set

Detailed description:

Consider the code extract:

    char *s = "\u30CE";

During translation phase 5 the universal character name is converted to a
multibyte character. However, the draft does not state what happens if the
implementation has no representation for the Katakana character (30CE is
within the Katakana range of annex I). The behavior is therefore implicitly
undefined.

Now consider the following translation unit:

    #include <stdio.h>

    void fff (void);
    void \u30CE (void);

    int main (void)
    {
        fff ();
        \u30CE ();
        return 0;
    }

    void fff (void)
    {
        printf ("This is %s\n", __func__);
    }

    void \u30CE (void)
    {
        printf ("Hello world!\n");
    }

This is clearly strictly conforming (unless I've made an error :-).

Now consider the trivial change:

    #include <stdio.h>

    void fff (void);
    void \u30CE (void);

    int main (void)
    {
        fff ();
        \u30CE ();
        return 0;
    }

    void fff (void)
    {
        printf ("Hello world!\n");
    }

    void \u30CE (void)
    {
        printf ("This is %s\n", __func__);
    }

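This is now undefined on any implementation that cannot represent the
Katakana character set! __func__ is specified in terms of a string literal
holding the function name, so the problem with the first code extract
applies here as well. I have trouble believing that this was intended, and
I certainly feel that, if it is retained, it should be flagged in the text
of the Standard.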