Yesterday's quiz was about the expression `*(char*)(float[]){x*x} - 63`

(for big-endian architectures) or `*(3+(char*)(float[]){x*x}) - 63`

(for little-endian ones). This post provides an explanation.

First, let us try the function on a few values:

int main(){ for (unsigned int i=0; i<=20; i++) printf("l(%2u)=%d\n", i, l(i)); }

This may provide the beginning of a hint:

l( 0)=-63 l( 1)=0 l( 2)=1 l( 3)=2 l( 4)=2 l( 5)=2 l( 6)=3 l( 7)=3 l( 8)=3 l( 9)=3 l(10)=3 l(11)=3 l(12)=4 l(13)=4 l(14)=4 l(15)=4 l(16)=4 l(17)=4 l(18)=4 l(19)=4 l(20)=4

The construct `(float[]){…}`

is C99's syntax for anonymous arrays, a kickass programming technique. This is an unabated quote.

In the case at hand, the construct converts to float the contents of the braces and puts the result in memory. The function puts the float in memory in order to read its most significant byte. That's `*(char*)…`

on a big-endian architecture, and `*(3+(char*)…)`

on a little-endian one.

One reason to read a single char is to circumvent strict aliasing rules—which do not apply to type `char`

. A simpler version of the same function would have been `(*(int*)(float[]){x} >> 23) - 127`

, but that version would break strict aliasing rules. Also, it would be too obvious.

The most significant bits of a single-precision IEEE 754 floating-point representation are, in order, one sign bit and eight exponent bits. By reading the most significant byte, we get most of the exponent, but one bit is lost. To compensate for this, the operation is applied to `x*x`

, whose exponent is double the exponent of `x`

.

In conclusion, yesterday's one-liner returns an integer approximation of the base-2 logarithm of a reasonably small `unsigned int x`

. On a typical 32-bit architecture, it is exact for powers of two up to `2¹⁵`

. If `x`

is zero, the function returns its best approximation of `-∞`

, that is, `-63`

.