Frama-C news and ideas


Sign extension

There is no “sign extension” in C. Please stop referring to “sign extension” in C programs.

In assembly, there is such a thing as “sign extension”

Sign extension is what an assembly instruction such as movsx %al, %ebx does: it transfers the contents of a narrow register, say %al, to a wide register, say %ebx, copying the most significant bit of the narrow register into all the otherwise unoccupied bits of the wide register.

In C, there is no “sign extension”. Really.

Suppose that, in C, a short variable s contains -3 and that the contents of that variable are transferred to another variable of type int, like so:

int i = s;

Variable i receives -3. That's it. That's “sign extension”.

It's an assignment and, in this particular case, a promotion. With other types it could have been a conversion. The target wide variable receives the value that was contained in the narrow variable. Why would this operation need a name, and such an ugly name as “sign extension” at that?

The name for “sign extension” in C is “getting the value”. Variable i receives the value of s. That's it. That is all there is.

Comments

1. On Thursday, April 25 2013, 22:53 by John Regehr

Pascal, what about right shift of signed integers?

2. On Thursday, April 25 2013, 23:11 by pascal

@John,

I'm not sure what you mean. Is it E1>>E2, the “integral part of the quotient of E1 / 2^E2” operation? Yes, this one is quite a mouthful. Someone ought to propose a better name for it.

3. On Friday, April 26 2013, 00:16 by John Regehr

Pascal, now I suspect you are joking with me a little.

C exposes bit-level operations on integers and in the case where we are thinking from a bitwise perspective, sign extension is the most natural way to explain the behavior of signed right shift.

4. On Friday, April 26 2013, 09:36 by pascal

@John

All right, I admit it: bitwise operations do make sense in reference to a binary representation (usually 2's complement nowadays).

This improvised blog post referred to the habit some C programmers have of thinking at a lower level of abstraction than necessary. It was a case of “rant by a thousand paper cuts”. One example of the kind of C programmer thinking that caused it was your illustrious homonym treating the behavior of a program that must have been something like signed char c = -3; unsigned u = c; as if it were worth tweeting.

Yes, you can think of it as “C sign-extending the char to an int before making it unsigned int”, but that's just making life difficult for yourself. You can simply think of it as converting -3 to unsigned int.

5. On Friday, April 26 2013, 10:24 by pascal

A related question on StackOverflow with, at the time of this writing, two good answers by Michael Burr and Alexey Frunze: Why not use complement of VAL instead of (-VAL -1)

6. On Saturday, April 27 2013, 04:23 by John Regehr

I agree, many C programmers like to think about C at too low a level. This is closely related to the big lie people tell about C being a portable assembly language!