This is a review of The Annotated ANSI C Standard, annotated by Herbert Schildt.
This review is made possible by the generosity of Raymond Chen <raymondc@microsoft.com>, who provided the review copy of the book, and is dedicated to the Dream Inn, Santa Cruz, CA, whose staff supplied uncounted cups of coffee while I wrote this review.
This review was first published in comp.std.c on 1994-03-22. After feedback on the group I published various errata. Jutta Degener then collected all these together, converted the plain text postings to HTML, and published the result on the WWW; that version was labelled as "last modified on 1995-03-03". It has now been moved to my own web site, and edits since the 1995 version are shown in a separate style like this.
Thanks to the following for pointing out errors:
Stan Brown <brown@ncoast.org>
Jutta Degener <jutta@pobox.com>
Mark-Jason Dominus <mjd@saul.cis.upenn.edu>
Sue Meloy <suem@hprpcd.rose.hp.com>
Christopher R Volpe <volpe@ausable.crd.ge.com>
Alan Watson <alan@bernie.sal.wisc.edu>
Since The Annotated ANSI C Standard first appeared, many people have commented on errors in the book. After reading several of these, I obtained a copy of the book and have read it in its entirety.
Many of these comments might appear to be relatively trivial. In response to this, I can only point out that the book is commenting on a very carefully designed document, and one that has to be read precisely. If the annotator cannot get things right, then the book is not just useless, but is a positive danger to those who do not have the time to read and analyse every word of the standard. In other contexts, such as a tutorial on C, some of the errors in this book could be allowed to pass, but not in this.
When I state that no mention is made of a topic, this indicates that I feel that the topic is at least as important as ones that were commented on; quite often this refers to the features of the standard which are less easy to understand.
Text quoted directly from the book is indicated by ## in the left margin.
Quite often, the book gives the impression that annotations were omitted because they couldn't be fitted into the format of "standard on the left, comments on the right". Whilst many pages of the standard have no annotations at all, there are no pages with annotation but no standard. I note at least one case below where I believe that a function was not annotated because the comments on the previous section took up too much space.
The front cover of the book shows, amongst much clutter and someone's half-eaten muffin, page 147 of the standard. It is intriguing to note that, not only is this the obsolete ANSI standard rather than the ISO standard, but that it corresponds to half of page 146 in the book.
The major divisions of the standard are referred to as "Part 1", "Part 2", etc. In actual fact, they are "clause 1", "clause 2", and so on. One has to wonder about an author who can't even get that right.
For a year after first writing this review, I believed that at least the left hand pages (the extracts of the Standard) were correct. It turns out that even this isn't the case! [See 6.1.3.1.]
Numbers at the start of each comment are the ISO subclause numbers of Schildt's annotations, which are not always the same as the subclause actually being annotated.
## However, this limits the total character set to 255 characters.
Actually, it limits it to UCHAR_MAX characters, which is at least 255, but can be more. There was an opportunity here to explain what multibyte characters actually are, but it seems to have been missed, possibly because of the lack of space.
## An object is either a variable or a constant that resides at a
## physical memory address.
In C, a constant does not reside in memory (except for some string literals), and so is not an object. To be more precise, in the C Standard's model of how things work, constants are not something that live in memory, don't have addresses, and aren't objects. This doesn't stop an implementation from handling constants as variables that never change value.
## The standard requires that a compiler issue error messages when
## an error in the source code is encountered.
without discussing the different kinds of errors.
## You are therefore free to declare main() as required by your
## program.
This statement is immediately followed by the example:
void main (void)
even though the text of the standard directly opposite states that this is undefined. Indeed, the text I quote makes me wonder whether Schildt believes that:
struct foo { int i; double d; } main (double argc, struct foo argv)
is permitted !
Most of the examples in the book declare main() as void. I won't bother to point them out individually.
This is something that has since changed. The 1999 Standard allows an implementation to permit any form it likes for main(). However, since only the two standard forms (returning int, with or without parameters) are required to be supported, programmers should normally stick with them.
[Added 2008-08-13]
## Though most compilers will automatically return 0 when no other
## return value is specified (even when main() is declared as
## void), you should not rely on this fact because it is not
## guaranteed by the standard.
Indeed it is not. If main() is declared as void, I don't know of any compiler that will return 0. Indeed, the standard forbids it to !
In the 1999 Standard, when a call to main() reaches the terminating close brace, the behaviour depends on the declared return type:
if main() is declared to return int, the value 0 is returned to the implementation;
## Therefore, a multibyte character is a character that requires
## more than one byte.
Ignoring the fact that "character" and "byte" are synonymous in the standard (something that is not mentioned in the annotations), the definition of multibyte character is clear that it *does* include single byte characters.
## First, the null character may not be used except in the first
## byte of a multibyte sequence.
I read this as meaning that the multibyte character <00><94> is legal while the multibyte character <94><00> is not. In actual fact, the standard states that a zero byte must not appear in any multibyte character other than the null character (i.e. the end of string indicator). This means that string operations such as strcpy will work as expected with multibyte character sequences.
There was an opportunity here to explain multibyte characters and how to use them, something that most books omit. Unfortunately, this one omits it as well.
## In other words, one copy of a library function in memory may
## not be used by two or more currently executing programs.
This is blatant nonsense - on most Unix systems, if the same program is executing several times, all the code is shared by all of those processes. Indeed, many go further and share one copy of the standard C library among every process on the system.
What this section of the standard is talking about is re-entrancy. The functions in the library are not re-entrant, and so may not be called from within themselves. For example:
qsort() cannot be called from within the compare function passed to qsort();
malloc must not be called from within signal handlers, since the handler may have interrupted a call to malloc.
## A compound statement is a block of code.
A nice sounding statement, but totally meaningless. A compound statement is a block of code beginning with { and ending with the matching }. For example, the body of a function is a compound statement.
In the 1999 Standard a block is a concept more relevant to the scope of declarations.
A compound statement is one kind of block, but it is not the only one.
[Added 2008-08-13]
## First, notice that a character is defined as 8 bits (1 byte).
## All other types may vary in size, but in C a character is always
## 1 byte long.
Certainly a character is always 1 byte long, since that is what a byte is defined as. However, nowhere does the standard require a byte to be 8 bits; an implementation with 47-bit bytes can conform to the standard.
The POSIX standard does, however, now require that a byte is 8 bits
(this change was made around 2001).
This means that most common implementations will use 8-bit bytes.
Other sizes are, however, found in specialised implementations such as those
for digital signal processors.
[Added 2008-08-13]
The assumption that 1 byte = 8 bits occurs at several other points in the book. I won't always bother to point it out.
## No other keywords are allowed in a conforming program.
False. Other keywords are allowed, providing that they either occupy the implementation namespace (such as "__far"), or that they are only used after inclusion of a non-standard system header. For example, a compiler could state that, following "#include <8086.h>", "far" is a keyword. Since no strictly conforming program can include that header, and providing that "far" is not treated specially without it, such a compiler would conform to the standard.
Of course, no other keywords are allowed in a strictly conforming program.
## * File scope begins with the beginning of the file and ends with
## the end of the file
## * Block scope begins with the opening { of a block and ends with
## its associated closing }.
This is not true: while the scopes end as described, they begin, for each identifier, at the end of its "declarator" (that is, at the comma, equals sign, or semicolon after it is declared). This is particularly important for identifiers with block scope. Consider this code:
/* Line 1 */ {
/* Line 2 */     int i = 10;
/* Line 3 */     {
/* Line 4 */         int j = i;
/* Line 5 */         int i = 5;
/* Line 6 */         printf ("i = %d, j = %d\n", i, j);
/* Line 7 */     }
/* Line 8 */ }
All three variables have block scope, but they are different:
Note, incidentally, that although all three scopes are different, j and the second
i have "the same scope" in the Standard's terminology.
[Added 2008-08-13]
In particular, the "i" on line 4 refers to the one in the outer block, and so j has the value 10, not 5.
This has changed with the 1999 Standard. Block scope ends at the end of a block, but this isn't necessarily indicated by a } character. For example:
for (int k = 0; k < 10; k++) printf ("k = %d\n", k);
is a block, and the scope of k ends at the semicolon which marks the end of that block.
[Added 2008-08-13]
## Identifiers with external linkage are accessible by your entire
## program
Once again this is in error - for example, an identifier with external linkage is not accessible in a translation unit that uses the same name with internal linkage. The point of linkage is to indicate when the same identifier refers to the same object, yet the annotations omit this entirely.
## An unsigned integer expression cannot overflow. This is because
## there is no way to represent such an overflow as an unsigned
## quantity.
More nonsense. An implementation either does or doesn't have a way to represent overflow - usually integers don't, while floating point may or may not (some systems have INFINITY values that effectively indicate overflow). However, an unsigned integer expression cannot overflow because the standard says so - the choice was made that unsigned integer arithmetic is done modulo some base (UINT_MAX+1 for unsigned int, ULONG_MAX+1 for unsigned long). There is no magic about this; it was an arbitrary decision by the authors of the standard.
Another way to see the problem with this text is that signed integer
expressions can overflow, even though there is no way to represent
overflow as a signed integer quantity.
[Added 2008-08-13]
## fractional-constant:
## digit-sequence[opt] . digit-sequence
## digit-sequence
There should be a dot after the second alternative as well. Otherwise this syntax generates "123" (not actually a floating-constant) but not "123." (which is).
Apparently this error occurs in some official publications of the Standard,
though not in the original ANSI document and not in the UK publication of
the ISO Standard.
[Added 2008-01-24]
## x = 'A'; /* give x the value 65 */
This comment, and the following text, leave the reader believing that 'A' must have the value 65, and by extension that C requires the use of ASCII codes. This is of course false, but it would be hard to tell from the book.
This, plus the comments assuming 8-bit bytes, and use of the terms "high byte" and "low byte" of integers later on, makes me wonder whether a better title for the book is: The ANSI C Standard annotated for some MSDOS compilers :-).
## In other words, the executable version of a C program contains
## a table that contains the string literals used by the program.
While this is one way to implement strings, it is not the only one. Such a comment does not belong in a book like this.
## Further, the effect of changing the string literal table is
## implementation dependent. The best practice is to avoid
## altering the string table.
It's more than just implementation dependent (a term which, by the way, is not used by the standard), it's completely undefined. You must not modify a string literal.
## In the most general terms, when you convert from a larger
## integer type to a smaller type, high-order bytes are lost.
When an integer value is converted to a signed type which can't hold that value, the result need not be that given by removing some bits. For example, a rule that converted all such values to the minimum value of the destination type (SCHAR_MIN, SHRT_MIN, INT_MIN) would be conforming.
A simpler way to state what this section means is:
## When converting a larger [floating] type into a smaller one, if
## the value cannot be represented, information content may be lost.
Actually, unlike integers, such conversions are undefined, and the program may crash as a result.
## these automatic conversions are also intuitive.
These conversions have been the subject of much debate. This section would benefit from a proper explanation of the "value preserving" rules, and why they were chosen.
There were two candidates for the conversion rules: "value preserving" and "unsigned preserving".
The debate over the two was, apparently, long and involved, but the eventual driver for the decision was that the "unsigned preserving" rules made it more likely that a comparison like (T1) 1 > (T2) -1 would cause the -1 to be converted to an unsigned type, making it a very large number and so making the result false.
[Added 2008-08-13]
## First, an array name without an index is a pointer to the first
## element of the array and is not an lvalue.
This has to be one of the worst expressions of the Rule I have ever seen ! First, there are a number of contexts (such as sizeof) where an array name does not get changed to a pointer. Second, if the decay to a pointer takes place at all, it takes place whether or not there is an index; for example, decay takes place when the array name is used as a function argument. Last, an array name is an lvalue; it is the resulting pointer that is not.
"The Rule" was a common term used to refer to the rule that, in C, the name
of an array is often converted to ("decays to") a pointer to the
first element of the array.
The presence of this conversion often confuses novices to C.
Once they've got to grips with it, the exceptions then proceed to confuse
them all over again.
[Added 2008-08-13]
## The standard states that when an expression is evaluated, each
## object's value is modified only once. In theory, this
## means the compiler will not physically change the value of a
## variable in memory until the entire expression has been
## evaluated. In practice, however, you may not want to rely
## on this.
The book then in effect goes on to say that "i = ++i + 1" is usually compiled as if it were "i += 2".
As anyone who has survived the "i = i++" thread on comp.lang.c knows, this is not only nonsense, but dangerous nonsense. The correct way to discuss this part of the standard is to point out what can and can't be done in a strictly conforming program, and leave it at that. Suggesting that such code can ever have a defined answer is asking for trouble.
Fourteen years on, readers probably aren't familiar with "the i = i++ thread
on comp.lang.c".
C is slightly unusual in that it has mechanisms to allow a sub-expression to change the value of a variable in an expression.
Different implementations of C, particularly on different machine architectures, handle code like x++ in different ways.
For example, some may use an "INC" instruction to increment the memory location holding x immediately, while others may leave it to the end of the expression, implicitly adding an x = x + 1; statement afterwards.
The generated code may be different depending on whether x occurs elsewhere in the expression (thus making it more likely it has been copied into a register) or even on the type of x (incrementing a pointer or a floating-point number may require a different type of instruction to incrementing an integer).
Exotic architectures with multi-threading or scoreboarding may be even worse.
Therefore the C Standard sets fairly restrictive conditions on such expressions. The key points can be summarised as two rules:
The first rule forbids expressions like x++ - x++ or x = x++.
The second allows expressions like x = 255 / x or x = a [x] but forbids both y = x++ * x and a [x++] = x (in each case because the second x isn't used in determining the incremented value).
There are exceptions; for example, the && operator is a "sequence point" and the two operands count as separate expressions for this.
Thus the code x++ && x++ is legitimate.
[Added 2008-08-13]
## The rest of this section formally defined what type of lvalue
## can refer to an object.
Well, in one sense this is true. However, what is important is why only some lvalues can refer to a given object, and the annotations completely skip this. The reason is, of course, to indicate when a compiler can assume that two identifiers refer to the same object. For example, in:
char *cp;
int *ip;
void f (double *d)
{
    *d = 3.14159;
    *cp = 1;
    *ip = 2;
}
The rules of this section say that the assignment to *cp could potentially alter *d, and the compiler must generate code that takes that into account, but the assignment to *ip cannot, and the compiler may assume that *d and *ip do not overlap. This is called aliasing, and knowing when aliasing takes effect is an important factor in correctly optimising code.
One reason that aliasing is important is that optimising compilers will put the values
of variables into registers.
If the same variable can be referred to in two different ways and the compiler isn't
prepared for that, it could result in the two different ways getting different values
because one is being fetched from the register - which has been updated following an
assignment - and one from memory - which hasn't.
For example, in the above code, a naive compiler might move *d into a register and then move it back at the end of the function, overwriting the assignment made via *cp.
The rules on aliasing control what cases the compiler does and does not need to be
able to cope with.
[Added 2008-08-15]
## When no prototype for a function exists, it is not an error if
## the types and/or number of parameters and arguments differ.
## The reason for this seemingly strange rule is to provide
## compatibility with older C programs in which prototypes do not
## exist.
On the contrary, when no prototype exists, the number of arguments to a call must be the same as the number of parameters in the function (which cannot be a varargs function), and the types must be compatible after promotion. What should have been written is that no error message is required if these rules are broken.
The "common initial subsequence" rule changed in the 1999 Standard to be more
restrictive:
not only must the two members of the union have a common initial subsequence,
but that must be visible to the compiler at the time of use.
[Added 2008-08-15]
## When right-shifting a negative value, generally, ones are
## shifted in (thus preserving the sign bit), but this is
## implementation dependent.
The result of signed right shift of a negative number is implementation defined; there is no suggestion in the standard that shifting in ones is the "best" thing to do.
+= etc.), the annotations mention that "a += b" means the same as "a = a + b", but do not point out that the two are not equivalent; for example, "*a++ *= 2" is strictly conforming code which increments a once, while "*a++ = *a++ * 2" is not.
This is a major reason these operators exist.
[Added 2008-08-15]
## In simple language, a declarator is the name of the object being
## declared.
In real C, a declarator is everything about the type and name of the object except the basic type and storage class. For example, in "static int *p[5];", the declarator is "*p[5]", and includes the concepts of pointer, array, and size of array as well as the name.
## A variable declared using extern is not a definition.
Not only is this wrong, but the annotations to 6.7.2 directly contradict it, with the correct example of "extern int count = 10;".
## In essence, a static local variable is a global variable with
## its scope restricted to a single function.
Actually, a static local variable is a global variable with its scope restricted to some block scope; that is, from the end of its declarator to the closing } of the block it is declared in.
That is, this could be the body of a function, but it could be some smaller block
enclosed within it.
[Added 2008-08-15]
## When static is applied to a global variable or function, it
## causes that variable or function to have file scope
The global variable or function has file scope whether or not static is applied to it. The static keyword causes it to have internal linkage, which is a different matter.
And, contrary to what the quoted text implies, a variable declared static within a function body has block scope, not file scope.
[Added 2008-08-15]
## The register specifier is only a request to the compiler, which
## may be completely ignored.
It can't be completely ignored, because whether or not it affects the way in which the variable is implemented, it is still illegal to take the address of an object declared register.
## This padding must occur at the end, not at the beginning, of the
## object.
Padding can occur anywhere except at the beginning of a structure. In particular, it can occur between two fields. Of course a union can only be padded at the end.
## (Many compilers display a warning about this fragment, but still
## accept it.)
##const int i = 10;
##int *p;
##p = &i;
##*p = 0; /* modify a const object through p */
Actually, the standard requires a diagnostic for the third line of the fragment (p = &i;), because it violates the third dashed item of the constraints of 6.3.16.1:
both operands are pointers to qualified or unqualified versions of compatible types, and the type pointed to by the left has all the qualifiers of the type pointed to by the right;
This is violated because the type pointed to by p is not const while i is.
If an explicit cast had been used in that line, I believe that the assignment would be strictly conforming. If so, then it is true that the standard does not require a diagnostic for the last line, but nevertheless it is undefined, not just something to warn about.
## The information and constraints in this section are mostly
## applicable to compiler implementors.
Since this section defines how to declare arrays, pointers, and procedure prototypes, one has to wonder what the author actually considers interesting !
In a parameter declaration, a single typedef name in parentheses is taken to be an abstract declarator that specifies a function with a single parameter, not as redundant parentheses around the identifier for a declarator.
In other words, in a declaration like
int f (const (size_t), int);
the first parameter of f is a function taking one parameter of type size_t (and returning a const int), not a const parameter called size_t.
[Added 2008-08-15]
unsigned char *v[5];
then the type of v is:
unsigned char *[5];
## The general form of an initialization is
## type var = initializer;
Once again, the whole concept of declarators is omitted. While it is true that that is one form of an initialization, it excludes lines like:
int a [5] = { 1, 2, 3, 4, 5 };
default case label must be the last one in the switch, and that it can't have an associated break.
The problem with these "general" forms is that, while they are fine in a
teaching context, they omit all the grubby details that a user of the standard
needs to know, such as fall-through cases, or
Duff's Device.
I would also have appreciated a warning that ordinary labels are still allowed within the body of a switch statement, so:
switch (i)
{
    /* ... */
    defualt: j = 0; break;
}
is legal code, but is not the default case of the switch.
Ugly and misleading, yes, but legal.
[Added 2008-08-15]
## To understand the difference between the modern and old forms,
## here is the same function defined using both forms:
##
##/* Modern function definition. */
##float f (int a, char c)
##{
##/* ... */
##}
##/* Old-form function definition. */
##float f (a, c)
##int a;
##char c;
##{
##/* ... */
##}
Unfortunately, these two aren't exactly the same. With the modern function definition, the argument corresponding to c is converted to type char and passed to the function. With the old-form definition, it is converted to int, passed to the function as an int, and then converted to char.
Why does this matter, you may ask ? Well, it matters when we're trying to write a prototype for the function. The prototype for the new form definition is:
float f (int a, char c);
as you might expect. However, the prototype for the old form is:
float f (int a, int c);
And this, in turn, matters because the parameter passing conventions might be
different for the two cases.
For example, an int might be passed in one sort of register while a char is pushed on to the stack, or passed in a different sort of register.
[Added 2008-08-15]
## The #include statement has these two forms:
Actually, it has three forms. While the third is fairly uncommon, it ought at least to be acknowledged.
printf ("%d ", ABS (((-20) < 0 ? -(-20) : (-20)));
and should be:
printf ("%d ", ((-20) < 0 ? -(-20) : (-20)));
#pragma in a translation unit (this means after #ifdef'd-out code has been removed) prevents it from being strictly conforming.
## All conforming C compilers will supply all of the functions
## described here.
This only applies to "hosted" implementations, and is not true for "freestanding" implementations.
## Frankly, many C programmers are not aware of the rules described
## in this section.
Quite right ! Unfortunately, the chance to explain the rules was missed.
## If errno is zero, then no error has been detected.
This isn't true at all. No library function will ever set errno to zero, but if it is zero before one is called, it can remain zero even if an error does occur.
The standard library functions divide into those which are specified to set errno on error and those which are not.
The former must not set it to zero if an error occurs.
The latter may set it to any value except zero even if no error occurs,
or may leave it unchanged even if an error does occur.
[Added 2008-08-15]
offsetof().
Unfortunately, the explanation of this example assumes that there is no padding in the structure. If structures had no padding, offsetof() wouldn't be needed because the offset of a field could be computed from the sizes of the preceding fields.
It has since been pointed out to me that good coding practice would involve
not putting the details of a structure into other code, since changes to
the structure would cause that other code to break.
Instead, using offsetof() will make the code robust against such changes.
This is a good point, and my comment is in error to this extent, though
it is still the case that the original annotation is wrong in assuming that
no padding will be present.
The person who pointed this out does not believe that names should be used when discussing someone else's work.
So I will not give him credit.
[Added 2008-03-15]
##x is alphanumeric
## The setlocale() function sets all or a specified portion of
## those items described in the lconv structure
This is true in one sense, but oh so misleading. The setlocale() function alters the meaning of many of the functions in the standard. For example, it can change which characters are letters, or it can alter the decimal point character. It can also affect the order in which strcoll() sorts strings. In all, there are five "categories" that it can affect. The lconv structure is affected by two of these, but not the other three, and it is not the only thing that these two affect.
setjmp() using the statement:
## result = setjmp (jumpbuf);
Unfortunately, the standard puts strict limits on the places in which setjmp can be called; essentially it must be one of the four forms:
while (setjmp (jumpbuf))
while (setjmp (jumpbuf) < 42)
while (!setjmp (jumpbuf))
setjmp (jumpbuf);
[The "while" may be replaced by "if" or "switch", or may be the implicit while of a "for" statement.]
[The last form can also have (void) in front of it. [Added 2008-08-15]]
The example in the annotations, however, doesn't use any of these forms, and
so the compiler must produce a diagnostic for this code.
That was overstating it; a diagnostic is not required by the Standard.
It's still undefined code, though.
[Added 2008-08-15]
The standard also puts limitations on what can be done with local variables in functions that call setjmp(). I am surprised to find no mention of these limitations at all.
sig_atomic_t, and when it should be used.
## The type fpos_t is some type of an unsigned integer.
Actually, not only is there no such requirement in the standard, but fpos_t was designed for the circumstances when a file position can't be fitted into an unsigned long. The forthcoming Normative Addendum 1 also puts further requirements on fpos_t which, while compatible with the current standard, cannot be implemented if it is an unsigned integer.
## Thus, it is permissible for a text stream to treat all
## characters as part of one long, uninterrupted line, if it
## so chooses.
Fine sounding words. I wish I knew what they mean !
The standard states that an implementation may treat spaces at the end of lines in text files specially, and may add and remove zero bytes at the end of binary files. Neither of these rules is mentioned.
However, it can't ignore the newline characters in text files and fail to
report them.
In particular, it can't join together two lines when handling a call to fgets().
[Added 2008-08-15]
fflush(NULL), nor that fflush cannot be applied to an input stream.
## Note that if stream is a pointer to stdout,
Just a nit, but stdout is a pointer, and stream cannot point to it.
Here, and in many other places, printf() is called with a format of "%lf" and a corresponding argument which is a double. Unfortunately, the standard states that "%f" is the correct format for a double, and "%lf" is undefined. This is a particularly bad sin because the description of the "l" flag is missing (left page 132 of the book is a repeat of page 131).
While I cannot of course just copy the missing text, I have summarised what has been lost.
In 1997 someone reported that his copy included an insert with the missing
page.
[Added 2008-01-24]
fflush(stdin)", which is undefined.
Finally, the comment after that use is:
## /* clear crlf from input buffer */
The Standard doesn't even talk about "crlf" pairs, and except in discussing
the meaning of text streams (7.9.2), use of the term is inappropriate.
unsigned char.
The first example calls fgetc() and assigns the result to a char variable.
This means that an error or end-of-file will cause the program to loop forever.
## The following fragment illustrates how files are commonly read:
## do {
##   ch = fgetc (fp);
##   /* ... */
## } while (!feof (fp));
This example suffers from the "Pascal disease". The function feof() does not
mean "end of file has been reached", but means "a previous read hit end of
file and returned EOF". Thus, when the last character of the file is read and
processed, "feof (fp)" will still be false, and the loop will be repeated one
more time. This time, ch will be set to EOF, but there is no indication in
the annotations that this must be treated specially. Only after this EOF has
been processed, probably wrongly - for example, if the file is being copied
to somewhere else, a spurious character will be output - will the call to
feof() return true.
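A loop that avoids the problem tests the value returned by fgetc() itself, before the character is processed. This is a minimal sketch, with the hypothetical name copy_stream, and not the only possible form:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copies everything from in to out.
   Returns the number of characters copied, or -1 on a read or write error. */
long copy_stream(FILE *in, FILE *out)
{
    long n = 0;
    int ch;   /* must be int so that EOF differs from every character */

    assert(in != NULL && out != NULL);
    while ((ch = fgetc(in)) != EOF) {
        if (fputc(ch, out) == EOF)
            return -1;
        n++;
    }
    /* Only now do feof() and ferror() say why the loop stopped */
    return ferror(in) ? -1 : n;
}
```

Because the test comes before the processing, no spurious character is written when the end of the file is reached.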
## Also, for files opened for binary operations, EOF is a valid
## binary value and does not necessarily indicate an error or
## end-of-file condition.
This is dangerous nonsense, caused because the annotations use char variables
instead of ints to hold the results of fgetc(). What the standard says is, in
effect, that fgetc() returns a positive or zero value if it read a character,
and a negative value (EOF) if it reached end-of-file or an error occurred.
It is true that EOF, cast to the type unsigned char, is identical to a value
that can be read from a binary file (or even a text file). However, this is
just the effect of bad programming; anyone with experience in C file handling
should be aware of this.
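The point can be checked with a short sketch (read_byte is a hypothetical name). A byte with all bits set comes back from fgetc() as a non-negative int and so cannot be mistaken for EOF, provided the result is kept in an int until after the test.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads one byte from fp.
   Returns 1 and stores the byte in *byte, or returns 0 at end-of-file or
   on error. */
int read_byte(FILE *fp, unsigned char *byte)
{
    int c;

    assert(fp != NULL && byte != NULL);
    c = fgetc(fp);                /* keep the full int result */
    if (c == EOF)
        return 0;                 /* genuine end-of-file or error */
    *byte = (unsigned char) c;    /* narrowing is safe after the test */
    return 1;
}
```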
## Also, remember that if the string does not contain a valid
## numeric value as defined by the function, then 0 is returned.
## Although strtod(), strtol(), and strtoul() set errno when an
## out-of-range condition exists, there is no requirement that
## errno be set when the string does not contain a number. Thus,
## if this is important to your program, you must manually
## check for the presence of a number before calling one of the
## conversion functions.
Actually, it is quite hard to make such a check, but luckily it is also unnecessary. If there is no number in the string, all three functions set *endptr to the original value of nptr, while if there is (even if it is zero) they set it to point after the last character of the number.
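The endptr test can be sketched as follows for strtol(). The wrapper name parse_long and its return convention are assumptions made for the illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical wrapper: returns 1 and stores the value if s begins with an
   in-range decimal number, otherwise returns 0 and leaves *value alone. */
int parse_long(const char *s, long *value)
{
    char *end;
    long v;

    assert(s != NULL && value != NULL);
    errno = 0;
    v = strtol(s, &end, 10);
    if (end == s)
        return 0;                 /* no number was present at all */
    if (errno == ERANGE)
        return 0;                 /* a number was present but out of range */
    *value = v;
    return 1;
}
```

With this convention "0" is accepted with the value zero and "abc" is rejected, although strtol() itself returns 0 in both cases.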
time_t is an integral type, and so assigns the result of time() to a long.
In fact, it could be double, and it might be that the cast always yields zero.
To extract a random number from the value returned by time(), it is necessary
to do something like the following, which constructs an unsigned int from all
the bits of a time_t value.
    #include <string.h>
    #include <time.h>

    unsigned int random_from_time (time_t t)
    {
        unsigned int i, j, k;
        char *p;

        i = 0;
        p = (char *) &t;
        /* Divide t up into pieces each the size of an unsigned int */
        for (k = 0; k + sizeof j <= sizeof t; k += sizeof j)
        {
            /* Copy the bits of the piece into j and add the value to i */
            memcpy ((char *) &j, p + k, sizeof j);
            i += j;
        }
        /* Do the same with any remnant (e.g. if j is 4 bytes and t is 11) */
        if (k < sizeof t)
        {
            j = 0;
            memcpy ((char *) &j, p + k, sizeof t - k);
            i += j;
        }
        return i;
    }
## Since multibyte characters are implementation-specific, you
## should refer to your compiler's user manual for details.
There is a lot that can be said about multibyte and wide characters without having to know individual encodings, and there is a sore lack of such tutorial material. It is a great pity to be faced with two almost blank pages instead.
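As one small illustration of what can be done without knowing the encoding, the following sketch counts the characters in a multibyte string using only mblen(). The function name count_mb_chars is hypothetical; the code relies on nothing but the current locale.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: counts the multibyte characters in s under the
   current locale. Returns -1 if s contains an invalid sequence. */
long count_mb_chars(const char *s)
{
    long count = 0;
    int len;

    assert(s != NULL);
    (void) mblen(NULL, 0);        /* reset to the initial shift state */
    while ((len = mblen(s, MB_CUR_MAX)) > 0) {
        s += len;                 /* step over one whole character */
        count++;
    }
    return (len < 0) ? -1 : count;   /* len is 0 at the terminating null */
}
```

In the "C" locale every character is a single byte, so the count is the same as strlen(); in other locales it may be smaller.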
strxfrm(). It should be noted that the result of strxfrm may be longer than
the original string.
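Since the length of the result cannot be guessed from the length of the source, one way to cope is to ask strxfrm() for it first. The standard permits a null destination when the size is zero. The helper name xfrm_dup below is hypothetical, and this is only a sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated transformed copy of s,
   or NULL if memory cannot be obtained. The caller must free the result. */
char *xfrm_dup(const char *s)
{
    size_t need;
    char *out;

    assert(s != NULL);
    need = strxfrm(NULL, s, 0);   /* length required, excluding the null */
    out = malloc(need + 1);
    if (out == NULL)
        return NULL;
    strxfrm(out, s, need + 1);    /* now guaranteed to fit */
    return out;
}
```

Two strings transformed this way can be compared with strcmp(), giving the same ordering as strcoll() on the originals.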
The example compares two arrays of floats using memcmp. While such a comparison
is strictly conforming, it is not useful - the result of the comparison depends
on the details of the encoding of floats, and is in no way related to which
number is greater or smaller. (For example, it is possible to have an encoding
in which 0 < 2, but 2 > 3, as far as this comparison works. In the same way,
comparing integers with memcmp is equally useless on a little-endian system.)
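A concrete case, assuming floats use the IEEE 754 single-precision encoding (an assumption; the standard does not require it): the representations of -1.0 and 1.0 differ only in the byte holding the sign bit, so memcmp ranks -1.0 above 1.0. The function names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: orders two floats by their object representation */
int bytes_order(float a, float b)
{
    return memcmp(&a, &b, sizeof a);
}

/* Hypothetical helper: orders two floats by value; returns -1, 0 or 1 */
int value_order(float a, float b)
{
    return (a > b) - (a < b);
}
```

Under that assumption bytes_order(-1.0f, 1.0f) is positive while value_order(-1.0f, 1.0f) is negative, whatever the byte order of the machine.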
mktime and how it can be used to solve problems like "what day is 100 days
after December 25th 1993?" This appears to be solely because there was no room
on the page opposite the definition of mktime.
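The technique relies on mktime() accepting out-of-range members and normalising them. A minimal sketch follows, with the hypothetical name days_after; noon is used only to keep well away from any daylight-saving change.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical helper: works out the date ndays after year/month/day.
   Returns 1 and fills in *result, or 0 if the time cannot be represented. */
int days_after(int year, int month, int day, int ndays, struct tm *result)
{
    struct tm t = {0};

    assert(result != NULL);
    t.tm_year = year - 1900;
    t.tm_mon = month - 1;
    t.tm_mday = day + ndays;      /* deliberately out of range */
    t.tm_hour = 12;
    t.tm_isdst = -1;              /* let the library decide */
    if (mktime(&t) == (time_t) -1)
        return 0;
    *result = t;                  /* all members are now normalised */
    return 1;
}
```

For 100 days after December 25th 1993 this yields April 4th 1994, with tm_wday set to 1 (a Monday).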
After the % appear:
The flags are:
The field width is an asterisk or a decimal integer. If the converted value has fewer characters than the field width, it is padded to the width (unless altered by the flags, the padding is with spaces on the left). The field width cannot reduce the width of the converted value. Note that a zero at the start of the width is the "0" flag; it does not mean that the width is in octal.
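These three properties of the field width can be seen in a short sketch. The function name width_examples is hypothetical, and each buffer is assumed to hold at least six characters.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical demonstration of the field width rules */
void width_examples(char *padded, char *zeroed, char *wide)
{
    assert(padded != NULL && zeroed != NULL && wide != NULL);
    sprintf(padded, "%5d", 42);     /* padded with spaces on the left */
    sprintf(zeroed, "%05d", 42);    /* "0" flag, then a width of 5 */
    sprintf(wide, "%2d", 12345);    /* the width never truncates */
}
```

The results are "   42", "00042" and "12345" respectively.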
The precision is a dot followed by an asterisk, a decimal integer, or nothing (equivalent to zero). It can only appear with certain conversions, and its meaning varies:
If the width, precision, or both, is an asterisk, the actual value is taken from an int argument to the fprintf() function.
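As a sketch of the asterisk form (format_star is a hypothetical name, and the buffer is assumed large enough), both the width and the precision are supplied as int arguments ahead of the value being converted:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: formats value with a run-time width and precision.
   Returns the number of characters written, as sprintf() does. */
int format_star(char *out, int width, int precision, double value)
{
    assert(out != NULL);
    return sprintf(out, "%*.*f", width, precision, value);
}
```

With a width of 8 and a precision of 2, the value 3.14159 is written as "    3.14".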
The arguments are always in the order:
The optional letters may appear as follows: