C9X - changes to integer types

This page describes the planned changes to integer types in C9X. It is based on the papers N691, N698, N724, N725, and N736.

Introduction

N691 is Clive Feather's work on the representation of types, particularly integer types, based loosely on the response to DR 069. N724 is a minor edit, forcing integers to have one of the three "expected" representations.

N698 is the result of long discussions between Randy Myers and Doug Gwyn, plus some input from Clive Feather, Frank Farance, and Douglas Walls.

Implementation defined integer types are incorporated into the Standard by allowing implementations to add additional types to the set of "signed integer types." By existing wording in the Standard, the implementation must supply corresponding unsigned integer types. By definition, the implementation defined signed and unsigned integer types are integer types, basic types, scalar types, and arithmetic types. All of the statements made in the Standard about those type classes automatically apply to the implementation defined integer types. The same wording in the Standard that defines the properties of the standard integer types defines the properties of the implementation defined integer types as well.

N725 clarifies conversions between integer types now that extended types exist, and N736 shows how they apply to the preprocessor.

Normative changes

This section gives the normative changes made by these five papers. They are shown relative to ISO 9899:1990, irrespective of how they were originally worded. New footnotes are shown as [N1], [N2], etc.

WG14 has also decided that the term "integral" should be replaced by "integer" throughout the Standard (with the exception of a couple of places that are not relevant to this). These changes have been made to this page.

Subclause 5.2.4.2.1 (<limits.h>)

Add to the end of the subclause:

The value UCHAR_MAX+1 shall equal 2 raised to the power CHAR_BIT.
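The new requirement can be checked directly. The following is a minimal sketch, not part of any of the papers; the function name uchar_max_matches_char_bit is hypothetical, and the code assumes CHAR_BIT is smaller than the width of unsigned long long so that the shift is defined.

```c
#include <limits.h>

/* Returns 1 if UCHAR_MAX + 1 equals 2 raised to the power CHAR_BIT.
   Hypothetical helper; assumes CHAR_BIT is less than the width of
   unsigned long long, so the shift below is well defined. */
int uchar_max_matches_char_bit(void)
{
    unsigned long long two_pow_char_bit = 1ULL << CHAR_BIT;
    return (unsigned long long)UCHAR_MAX + 1ULL == two_pow_char_bit;
}
```

On an implementation that conforms to the added sentence, the function returns 1.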

Subclause 6.1.2.5 (types)

Replace paragraph 3, beginning "There are four signed integer types", with:

There are five standard signed integer types, designated as signed char, short int, int, long int, and long long int. (These and other types may be designated in several additional ways, as described in 6.5.2.) There may also be implementation-defined extended signed integer types [N1]. The standard and extended signed integer types are collectively called just signed integer types [N2].

[N1] Implementation defined keywords must have the form of an identifier reserved for any use as described in 7.1.3.

[N2] Therefore, any statement in this Standard about the signed integer types also applies to the extended signed integer types.

Replace paragraph 5, beginning "For each of the signed integer types", with:

For each of the signed integer types, there is a corresponding (but different) unsigned integer type (designated with the keyword unsigned) that uses the same amount of storage (including sign information) and has the same alignment requirements. The unsigned integer types that correspond to the standard signed integer types are the standard unsigned integer types. The unsigned integer types that correspond to the extended signed integer types are the extended unsigned integer types.

The extended signed integer types and extended unsigned integer types are collectively called the extended integer types.

Add a new footnote to the end of paragraph 7, which defines basic type:

[N3] An implementation may define new keywords that provide alternative ways to designate a basic (or any other) type. An alternate way to designate a basic type does not violate the requirement that all basic types be different. Implementation defined keywords must have the form of an identifier reserved for any use as described in 7.1.3.

Replace paragraph 13, which defines integral types, with:

The type char, the signed and unsigned integer types, the integer bitfield types, and the enumerated types are collectively called integer types.

Delete footnote 18.

Subclause 6.1.2.7

Add a new subclause:

6.1.2.7 Representations of types.

The representations of all types are unspecified except as stated in this subclause.

6.1.2.7.1 General.

Values of type unsigned char shall be represented using a pure binary notation [N4].

[N4] A positional representation for integers ... (insert existing footnote 18). A byte contains CHAR_BIT bits, and the values of type unsigned char range from 0 to 2**CHAR_BIT-1.

When stored in objects of any other object type, values of that type consist of N*CHAR_BIT bits, where N is the size of objects of that type, in bytes. The value may be copied into an object of type unsigned char [N] (e.g. by memcpy); the resulting set of bytes is called the object representation of the value. Two values with the same object representation shall compare equal, but values that compare equal might have different object representations.
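As an illustration of the object representation, the sketch below copies an int into an array of unsigned char with memcpy and compares two such arrays. The function names are hypothetical and chosen only for this example.

```c
#include <string.h>

/* Copies the object representation of an int into buf.
   buf must have room for sizeof(int) bytes. Hypothetical helper. */
void int_object_representation(int value, unsigned char *buf)
{
    memcpy(buf, &value, sizeof value);
}

/* Returns 1 if the two ints have identical object representations. */
int same_int_representation(int a, int b)
{
    unsigned char rep_a[sizeof(int)];
    unsigned char rep_b[sizeof(int)];

    int_object_representation(a, rep_a);
    int_object_representation(b, rep_b);
    return memcmp(rep_a, rep_b, sizeof rep_a) == 0;
}
```

Because two values with the same object representation must compare equal, same_int_representation(1, 2) has to return 0. The converse is not guaranteed: equal values might differ in their padding bits, although on implementations without padding bits in int, same_int_representation(42, 42) returns 1.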

Certain object representations might not represent a value of that type. If such a representation is accessed due to evaluation of an object, or if such a representation is produced by a side effect that stores into all or any part of the object using an lvalue of that type, then the behaviour is undefined [N5]. Such representations are called trap representations.

[N5] Thus an automatic variable can be initialized to a trap representation without causing undefined behaviour, but the value of the variable cannot be used until a proper value is stored in it.

When a value is stored in an object of structure or union type, including in a member object, the bytes of the object representation that correspond to any padding bytes take unspecified values [N6]. The values of padding bytes shall not affect whether the value of such an object is a trap representation. Those bits of a structure or union object that are in the same byte as a bitfield member, but are not part of that member, shall similarly not affect whether the value of such an object is a trap representation.

[N6] Thus structure assignment may be implemented element-at-a-time or via memcpy.

When a value is stored in a member of an object of union type, the bytes of the object representation that do not correspond to that member but do correspond to other members take unspecified values, but the value of the union object shall not thereby become a trap representation.

Where an operator is applied to a value whose object representation includes padding bits but which is not a trap representation, the operator shall ignore those bits for the purpose of determining the value of the result. If the result is stored in an object that has padding bits, it is unspecified how those padding bits are generated - they might not be related to the padding bits of the operands - but a trap representation shall not be generated.

6.1.2.7.2 Integer types.

For unsigned integer types other than unsigned char, the bits of the object representation shall be divided into two groups: value bits and padding bits (there need not be any of the latter). If there are N value bits, each bit shall represent a different power of 2 between 1 and 2**(N-1), so that objects of that type shall be capable of representing values from 0 to 2**N-1 using a pure binary representation; this shall be known as the value representation. The values of any padding bits are unspecified [N7]. The precision of an integer type is the number of bits it uses to represent values excluding any sign and padding bits.
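The split between value bits and padding bits can be observed from within a program. The sketch below (hypothetical function names) relies on the fact that UINT_MAX is 2**N-1, so counting its bits gives the number N of value bits of unsigned int; whatever remains of the object is padding.

```c
#include <limits.h>

/* Counts the value bits of unsigned int. UINT_MAX is 2**N-1, so it has
   exactly N bits set, and shifting it right until it reaches zero
   counts them. */
int unsigned_int_value_bits(void)
{
    unsigned int max = UINT_MAX;
    int bits = 0;

    while (max != 0) {
        bits++;
        max >>= 1;
    }
    return bits;
}

/* Padding bits: total bits in the object minus the value bits. */
int unsigned_int_padding_bits(void)
{
    return (int)(sizeof(unsigned int) * CHAR_BIT) - unsigned_int_value_bits();
}
```

For an unsigned type the first function returns the precision as defined above. Many implementations have no padding bits, in which case the second function returns 0.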

[N7] Some combinations of padding bits might generate trap representations, for example, if one padding bit is a parity bit. Nonetheless, no arithmetic operation on valid values can generate a trap representation other than as part of an exception such as an overflow, and this cannot occur with unsigned types.

For signed integer types, the bits of the object representation shall be divided into three groups: value bits, padding bits, and the sign bit. There need not be any padding bits; there shall be exactly one sign bit. Each bit that is a value bit shall have the same value as the same bit in the object representation of the corresponding unsigned type (if there are M value bits in the signed type and N in the unsigned type, then M <= N). If the sign bit is zero, it shall not affect the resulting value. If the sign bit is one, then the value shall be modified in one of the following ways:

- the corresponding value with sign bit 0 is negated;
- the sign bit has the value -2**N;
- the sign bit has the value 1-2**N.

The values of any padding bits are unspecified [N7]. A valid (non-trap) object representation of a signed integer type where the sign bit is zero is a valid object representation of the corresponding unsigned type, and shall represent the same value.
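The three permitted treatments of the sign bit are commonly known as sign and magnitude, two's complement, and ones' complement, in the order listed above. A well-known way to tell them apart is to look at the two lowest value bits of -1. The sketch below uses hypothetical names and is only an illustration of that trick.

```c
/* The two low-order bits of -1 differ between the three representations:
     sign and magnitude: ...01 -> 1
     ones' complement:   ...10 -> 2
     two's complement:   ...11 -> 3
   The enumerator values are chosen to match. Names are hypothetical. */
enum sign_rep {
    SIGN_MAGNITUDE  = 1,
    ONES_COMPLEMENT = 2,
    TWOS_COMPLEMENT = 3
};

enum sign_rep int_sign_representation(void)
{
    return (enum sign_rep)(-1 & 3);
}
```

On most current hardware the function returns TWOS_COMPLEMENT, but the wording above permits any of the three results.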

Bit field types shall have no padding bits; an N-bit bitfield shall have N value bits if treated as unsigned, and N-1 value bits plus a sign bit if treated as signed.
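Since an N-bit unsigned bitfield has exactly N value bits and no padding, its largest value is 2**N-1. The sketch below, with a hypothetical structure and function name, finds the largest value of a 3-bit unsigned bitfield by counting upward until the member wraps to zero.

```c
/* Hypothetical structure with a 3-bit unsigned bitfield. */
struct three_bits {
    unsigned int u3 : 3;
};

/* Counts upward until the bitfield wraps back to zero and returns the
   last value seen before the wrap, which is 2**3 - 1. */
unsigned int max_u3_value(void)
{
    struct three_bits f;
    unsigned int max = 0;

    f.u3 = 0;
    do {
        max = f.u3;
        f.u3++;          /* stored back modulo 2**3 */
    } while (f.u3 != 0);
    return max;
}
```

The function returns 7, the value with all three value bits set.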

Subclause 6.1.3.2 (integer constants)

Append to the subclause:

If an integer constant cannot be represented by any type in its list, it may have an extended integer type, if the extended integer type can represent its value. If all of the types in the list for the constant are signed, the extended integer type shall be signed. If all of the types in the list for the constant are unsigned, the extended integer type shall be unsigned. If the list contains both signed and unsigned types, the extended integer type may be signed or unsigned.

Subclause 6.2.1.1 (characters and integers)

Replace the first paragraph with:

Every integer type has an integer conversion rank defined as follows:

The following may be used in an expression wherever an int or unsigned int may be used:

If an int can represent all values of the original type, the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions [27]. All other types are unchanged by the integer promotions.

Subclause 6.2.1.2 (signed and unsigned integers)

Replace the entire subclause with:

6.2.1.2 Signed and unsigned integers

When a value with integer type is converted to another integer type, if the value can be represented by the new type, it is unchanged.

Otherwise, if the destination type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the destination type until the value is in the range of the destination type.

Otherwise the destination type is signed and the value cannot be represented in it; the result is implementation-defined.
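The rule for unsigned destination types can be spelled out literally and compared with what a cast does. This is a sketch with hypothetical function names; it assumes long can represent UCHAR_MAX + 1, which holds on common implementations.

```c
#include <limits.h>

/* Applies the rule from the text literally: add or subtract
   UCHAR_MAX + 1 until the value lies in the range of unsigned char.
   Assumes long can represent UCHAR_MAX + 1. */
unsigned char convert_by_rule(long value)
{
    const long modulus = (long)UCHAR_MAX + 1;

    while (value < 0)
        value += modulus;
    while (value > UCHAR_MAX)
        value -= modulus;
    return (unsigned char)value;   /* now in range, so unchanged */
}

/* The same conversion as the language itself performs it. */
unsigned char convert_by_cast(long value)
{
    return (unsigned char)value;
}
```

Both functions give the same result for every input; for example, -1 converts to UCHAR_MAX. No such guarantee exists when the destination type is signed and the value is out of range, because that result is implementation-defined.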

Subclause 6.2.1.7 (usual arithmetic conversions)

Replace the text from:

Otherwise, the integer promotions are performed on both operands. Then the following rules are applied:

to the end of the paragraph with:

Otherwise, the integer promotions are performed on both operands. Then the following rules are applied to the promoted operands:

If both operands have the same type, then no further conversion is needed.

Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank is converted to the type of the operand with greater rank.

Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.

Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type.

Otherwise, both operands are converted to the unsigned integer type corresponding to the type of the operand with signed integer type.
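Two of these rules can be seen at work in the sketch below. The function names are hypothetical, and the second function assumes that long long can represent every value of unsigned int, which is true on common implementations but is not guaranteed.

```c
#include <limits.h>

/* int and unsigned int have the same rank, so the signed operand is
   converted to unsigned int: the product is UINT_MAX, not -1. */
int int_operand_becomes_unsigned(void)
{
    int s = -1;
    unsigned int u = 1u;

    return (s * u) == UINT_MAX;
}

/* Assumes long long can represent all values of unsigned int. In that
   case the unsigned operand is converted to long long and the product
   stays negative. */
int unsigned_operand_becomes_long_long(void)
{
    long long s = -1;
    unsigned int u = 1u;

    return (s * u) < 0;
}
```

Under the stated assumption both functions return 1. If long long could not represent all values of unsigned int, the last rule would apply instead, both operands would become unsigned long long, and the second function would return 0.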

Subclause 6.8.1 (conditional inclusion)

In the second paragraph of the semantics, replace:

The resulting tokens comprise the controlling constant expression which is evaluated according to the rules of 6.4 using arithmetic that has at least the ranges specified in 5.2.4.2, except that int and long, and unsigned int and unsigned long, act as if they have the same representation as, respectively, long long and unsigned long long.

with:

The resulting tokens compose the controlling constant expression, which is evaluated according to the rules of 6.4, except that all signed integer types and all unsigned integer types act as if they have the same representation as, respectively, a single signed integer type and its corresponding unsigned integer type. If the <inttypes.h> header defines intmax_t and uintmax_t, the single signed integer type and corresponding unsigned integer type shall be able to represent the values of, respectively, intmax_t and uintmax_t. Otherwise, the single signed integer type and corresponding unsigned integer type shall be able to represent the values of, respectively, long long int and unsigned long long int.
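The effect on #if arithmetic can be sketched as follows. The macro names are hypothetical. The example relies only on the widest signed and unsigned types being at least as wide as long long int and unsigned long long int, that is, at least 64 bits.

```c
/* In a #if expression, 2147483647 + 1 is computed in the single widest
   signed type rather than in int, so the sum is 2147483648 and is
   positive, even where int is 32 bits wide. */
#if (2147483647 + 1) > 0
#define PP_SUM_IS_POSITIVE 1
#else
#define PP_SUM_IS_POSITIVE 0
#endif

/* 0u - 1 wraps to the maximum of the single widest unsigned type,
   which is at least 2**64 - 1. */
#if (0u - 1) >= 0xFFFFFFFFFFFFFFFFu
#define PP_UNSIGNED_IS_AT_LEAST_64_BITS 1
#else
#define PP_UNSIGNED_IS_AT_LEAST_64_BITS 0
#endif
```

Both macros expand to 1 on an implementation that follows the replacement wording. The same expression 2147483647 + 1 written outside a #if directive would overflow where int has 32 bits, which is what makes the preprocessor rule visible.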

