## A bit of explanation regarding the quiz in the last post

By pascal on Friday, January 20 2012, 12:58

There are only non-negative constants in C, as per section 6.4.4 in the C99 standard:

> integer-constant:
> - decimal-constant integer-suffix(opt)
> - octal-constant integer-suffix(opt)
> - hexadecimal-constant integer-suffix(opt)
>
> decimal-constant:
> - nonzero-digit
> - decimal-constant digit
>
> octal-constant:
> - 0
> - octal-constant octal-digit
>
> hexadecimal-constant:
> - hexadecimal-prefix hexadecimal-digit
> - hexadecimal-constant hexadecimal-digit
>
> ...

The minus sign is not part of the constant according to the grammar.

The expression `-0x80000000` is parsed and typed as the application of the unary negation operator `-` to the constant `0x80000000`. The table in section 6.4.4.1 of the standard shows that, when typing hexadecimal constants, unsigned types must be tried. The list of types to try to fit the hexadecimal constant in is, in order, `int`, `unsigned int`, `long`, `unsigned long`, `long long`, `unsigned long long`.

On many architectures (those where `int` is 32 bits wide), the first type in the list that fits `0x80000000` is `unsigned int`. Unary negation, when applied to an `unsigned int`, returns an `unsigned int`, and unsigned arithmetic wraps around modulo 2^32, so that `-0x80000000` has type `unsigned int` and value `0x80000000`.

Following the same reasoning as above, reading from the "Decimal Constant" column of the table in the C99 standard, the types to try are `int`, `long`, and `long long`. This might lead you to expect `-2147483648` for the value of the expression `-2147483648` compiled with GCC. However, when compiling this expression on a 32-bit architecture, GCC emits a warning, and the expression has the value `2147483648`. The warning is:

t.c:6: warning: this decimal constant is unsigned only in ISO C90

Indeed, there is a subtlety here for 32-bit architectures. GCC by default follows the C90 standard. The spirit of the table in section 6.4.4.1 did not change between C90 and C99: in both, unsigned types are tried for octal and hexadecimal constants, and mostly only signed types are tried for decimal constants. Here is the relevant snippet from the C90 standard:

> The type of an integer constant is the first of the corresponding list in which its value can be represented. Unsuffixed decimal: int, long int, unsigned long int;

The difference really stems from the fact that C90 did not have a `long long` type, and the list of types to try for a decimal constant ended in `unsigned long`, since that type contained values that did not fit in any other type. On a 32-bit architecture, where `long` and `int` are both 32-bit, `2147483648` fits neither `int` nor `long`, and so ends up being typed as an `unsigned long`. Note that on an architecture where `long` is 64-bit, `2147483648` and `-2147483648` are typed as `long`.

Finally, when GCC is told, with option `-std=c99`, to apply C99 rules on an architecture where `long` is 32-bit, then `2147483648` is typed as `long long`, so that the expression `-2147483648` has type `long long` and value `-2147483648`.

This should explain the results obtained when compiling the three programs from the last post with GCC on 32-bit and on 64-bit architectures.

## Comments

Nice quiz. I've been writing C for close to 20 years and just the other day got bit by a constant problem, I forgot an L or a U or both. Go's arbitrary precision in constant calculations seems like an improvement...

Hello, John. Glad you liked it! The post has also generated offline discussions that will translate to two additional posts on the same general subject. Who would think that C programs that manipulate integers can cause so many aha moments?

It is worth noting that MISRA-C:2004 has two rules related to these cases:

- 10.6 A “U” suffix shall be applied to all constants of unsigned type.

- 12.9 The unary minus operator shall not be applied to an expression whose underlying type is unsigned.

Your post gives a good example of the wisdom of these rules, which forbid the expression `-0x80000000` altogether.