Public Comment Number PC-UK0118

ISO/IEC CD 9899 (SC22N2620) Public Comment

=========================================== 

Date: 1998-02-25
Author: N.M Maclaren
Author Affiliation: Self
Postal Address:
    University of Cambridge,
    Computer Laboratory,
    New Museums Site,
    Pembroke Street,
    Cambridge CB3 3QG,
    United Kingdom
E-mail Address: <nmm1@cam.ac.uk>
Telephone Number: +44 1223 334761
Fax Number:       +44 1223 334679
Number of individual comments: 1


Comment 1. 

Category: Normative change to intent of existing feature

Committee Draft subsection: 6.5.3, 6.5.3.1, 6.7.1

Title: Enhancements to restrict for optimisation


Detailed description:

I think this can be made to work, and will be of immense advantage to many
people, but there are still some very serious problems to resolve.  As the
current proposal stands, I am afraid that the facility will be of limited
use for optimisation on a good many modern machines.  The four most
serious problems that I have seen are the following, and I have included
suggestions to improve all of them.


    1) Arrays have always been slightly second-class objects in C, though
C9X makes them almost into first-class ones, but this is one area that they
remain firmly second-class.  For example, the term in 6.7.1 Function
definitions paragraph 10 "On entry to the function" has caused some
confusion in C89, but becomes extremely serious in the context of VLAs and 
restrict.  Inter alia, it means that there is no way to declare an array
parameter with bounds that can be relied on or even to guarantee that an
apparently array argument is not a null pointer.

The main reason this problem is so serious is that modern RISC CPUs are a
hundred times as fast as their memory, and an increasing number of such
architectures are introducing various forms of operand preloading to cope
with this problem.  To do this efficiently, a compiler needs to know both
the declared size of an array and that it is valid on function entry.  This
can produce much better code than if it has to delay preloading until an
actual reference through the argument, and this sort of optimisation is
fairly standard in Fortran.

I believe that it is fairly easy to make restrict-qualified array arguments
safe for optimisation (though probably not debugging).  It could be done
for all arguments, but there are almost certainly many programs that rely
on the fact that array bounds are not checked in C.  Yes, I know about the
wording in sections 6.3.6 and elsewhere, but they are not watertight and
hence cannot be used for optimisation.


    2) There is a horrible problem introduced by 6.5.3 Type qualifiers
paragraph 7, and (I am afraid) described incorrectly in 6.5.3.1 Formal
definition of restrict example 3.  Consider the following variation of that
example:

    extern int * s = NULL;

    void h (int n, int * const restrict p, const int * const q,
             const int * const r) {
        int i;
        *s = 0;
        for (i = 0; i < n; ++i) *s += (p[i] = q[i]+r[i]);
    }

    double a[5], b[5], c;
    s = &c; fred(5,a,b,b);
    s = &b[3]; fred(5,a,b,b);

Unless I am much mistaken, this is conforming.  It is certainly completely
unoptimisable, unless the compiler can work out what the contents of s
are.  The following example is equally bad on many systems.

    extern int func(int,int);

    void h (int n, int * const restrict p, const int * const q,
             const int * const r) {
        int i;
        for (i = 0; i < n; ++i) p[i] = func(q[i],r[i]);
    }

Firstly, these examples show that plastering arguments with const does not
really help the compiler very much when it comes to optimisation, because
it doesn't prevent update through any other pointer.  Secondly, it shows
that the restrict qualifier does very little to help with indicating that
immutable arrays are actually immutable - all it does say is that a
particular restrict-qualified parameter will not update another argument.

There are two conditions actually needed for optimisation.  One is that a
modified object is not accessed through any other pointer, and restrict
does this.  The other is to know that an unmodified object is immutable
during the life of the pointer, and the current specifications of restrict
and const don't help much.  The latter can be solved by noting that there
is no need to forbid multiple accesses to restrict-qualified objects where
no accesses modify the object.


    3) 6.5.3.1 Formal definition of restrict paragraph 4 refers to "the
array object that is determined dynamically by all references".   Precisely
what does this mean?  Consider the following code fragment - for
simplicity, assume that 'sizeof(double) > 1':

    void fred (restrict double a[3], restrict double b[3]) {
        ((char *)b)[sizeof(double)+2] = ((char *)a)[sizeof(double)+1];
        a[0] = b[2];
    }

    double array[3];
    fred(array,array);

Obviously, I can produce far more evil examples, including ones where the
definition of the array objects depends on implementation-defined or
unspecified (but not undefined) behaviour.  I am afraid that a precise
definition is needed.  My proposal STILL does not handle cases like the
following very well:

    void fred (restrict char *a, char *c) {
        double *b = a;
        ((char *)b)[5] = 0;
        *c = 1;
    }

    double z[10];
    fred(z,&a[6]);

But that is thoroughly perverse.  Yes, I know that it is both legal and
common, but my view is that such code is beyond hope and I don't care if it
runs like a drain.  The best that we can hope for is to prevent it from
causing syntactic or semantic problems, which the current wording does not.


    4) Consider the following:

    extern volatile int fred;

    void joe (const int * const volatile restrict a) {...}

    joe(&fred);

Unless I have lost my marbles, that destroys the whole point of restrict.
The question is whether it is worth bothering about.  My view is probably
not, on the grounds that C does not protect the fool from his folly, but
it is debatable.  It could be locked out very easily.


The following are my suggestions to improve these problems.  I don't
guarantee that they are watertight, but they are as good as I can make
them.


6.5.3 Type qualifiers.

The constraint is a change from the approach taken in C89, and one that
prevents several very important optimisations.  So paragraph 2 should be
changed to:

2   Types other than array types in function declarations or pointer types
    derived from object or incomplete types shall not be restrict
    qualified.

If the combination of restrict and volatile were to be locked out, the
following could be added:

    No type shall be both restrict and volatile qualified.

The following should be added to the end of the second sentence of
paragraph 7:

    ..., except in the case where no reference modifies any part of the
    object.


6.5.3.1 Formal definition of restrict

We need to produce a precise definition of the object A.  Also, in order to
allow preloading, we need to make array bounds binding.  There is some
wording in 6.3.6 Additive operators paragraph 8, but I don't think that it
is sufficient for this purpose.  So, add after paragraph 3:

    The array object A determined dynamically by references though a
    pointer object P of type T * is defined in the following way.  Firstly
    ignore all qualifications in T * and, if T * then becomes void *, treat
    P as if T * were unsigned char *.  Then, consider all pointer
    expressions E based on P evaluated during the execution of B, convert
    them to unsigned char *, and denote the smallest such value by Q and
    the largest R.  If an object is accessed, then Q and R are calculated
    as if each byte of the object and the byte immediately following the
    end of the object were addressed by separate pointer expressions of
    type unsigned char *.  Lastly, let M and N be the largest and smallest
    integers, such that (unsigned char *)&P[M] <= (unsigned char *)Q and
    (unsigned char *)&P[N] >= (unsigned char *)R.  If either M or N is not
    defined because one of (unsigned char *)&P[M] and
    (unsigned char *)&P[N]-1 would be outside the limits of the originally
    allocated object into which P points, the behaviour is undefined.

    If P is declared as a pointer parameter, A is the array with element
    type T and size N-M whose first element has address &P[M].  If P is
    declared as an incomplete array parameter and M is less than 0, the
    behaviour is undefined; otherwise A is the array with element type T
    and size N whose first element has address &P[0].  If P is declared
    as a a complete array parameter (whether variably modified or not) of
    size S and M is less than 0 or N is greater than S, the behaviour is
    undefined; otherwise A is the array with element type T and size S
    whose first element has address &P[0].

The second sentence of paragraph 4 should be changed to:

    Then either all references to values of A shall be through pointer
    expressions based on P, or no reference to A (through expressions based
    on P or otherwise) shall modify any part of its contents during the
    execution of B.

Example 3 should then become:

        void h (int n, int * p, const int * restrict q,
                 const int * restrict r) {
            int i;
            for (i = 0; i < n; ++i) p[i] = q[i]+r[i];
        }

    shows how const can be used in conjunction with restrict.  The const
    qualifiers imply, without the need to examine the body of h, that the
    values of any object accessed through q and r cannot change during the
    execution of h.  This is the precise assertion needed to optimise the
    loop.  Note that p does not need to be restrict qualified in this 
    example, but may allow better optimisation in a more complicated
    function.


6.7.1 Function definitions

The other requirement for preloading is to know that an argument is not a
null pointer.  The following should be added after paragraph 9:

9   Upon entry to a function, if the actual argument corresponding to a
    restrict qualified array parameter is a null pointer, the behaviour is
    undefined.