Frama-C news and ideas


Sunday, February 23 2014

An interesting SSL implementation bug: CVE-2013-5914

SSL in the news

SSL is a protocol for point-to-point confidential and authenticated communication over an insecure medium. It is the protocol behind HTTPS, among many other uses. In an Internet-connected system, the SSL implementation stands at the frontier between the system and the hostile outside world. For this reason, SSL implementation flaws are a prime target for attacks.

An ordinary bug

A rather banal SSL implementation bug was revealed over the weekend. A duplicated line in a crucial, if ordinary-looking, sequence of validation operations means that some of the validation steps are not taken:

    ...
    if ((err = ReadyHash(&SSLHashSHA1, &hashCtx)) != 0)
        goto fail;
    if ((err = SSLHashSHA1.update(&hashCtx, &clientRandom)) != 0)
        goto fail;
    if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) != 0)
        goto fail;
    if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
        goto fail;
        goto fail; // <-------------------------------------------------
    if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0)
        goto fail;
    ...
fail:
    SSLFreeBuffer(&signedHashes);
    SSLFreeBuffer(&hashCtx);
    return err;
}

Note that the second consecutive goto fail is executed with variable err containing the result of the last validation operation, 0, indicating success. This value is returned as-is by the function; the caller is therefore misled into believing that the validation succeeded, despite some of the validation steps not having been executed.

Because C lacks an exception mechanism, the above is an often-seen programming pattern. The programming style here can hardly be blamed: this is how the best C programs are written. Except of course for the extraneous goto.

The SSL implementation, and thus the bug in question, are found in tens of millions of iOS devices, with a few additional Mac OS X computers on top of that. The scope of the security problem caused by this bug, and the obviousness of the issue when pointed out, have together led to much commentary on Twitter and other popular discussion forums.

So reaction. Very noise

“I would never have had this problem because I know that goto is bad”, some commenters claim. I wish I was caricaturing, but I am unfortunately only paraphrasing and combining several public reactions into one.


“I would never have had this problem because I use braces”, essentially state others. Certainly, the root cause of the problem must have been that the developer who introduced the bug was confused about the meaning of if (c) stmt1; stmt2;. Just look how ey indented it!
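For the record, here is a quick sketch (mine, not from the original report) of what the braced version of the offending lines would look like. With braces, the duplicated goto would merely be unreachable dead code inside the block, and the final hash check would still run:

    if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0) {
        goto fail;
        goto fail; /* now dead code, not an unconditional jump */
    }
    if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0) {
        goto fail;
    }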

These two often-heard remarks strongly suggest using brace-less control flow or the presence of goto as predictors of defects in C code. I am sure this will be a fruitful research direction. Someone from the software engineering academic community should look into it.


“Just adding a single line of code can bring a system to its knees”, Arie van Deursen reminds us. True, true, an important lesson there. We should all insert the following line somewhere in our respective codebases from time to time and take a good look at the effect it has, in remembrance.

(


It is Ariane 5 all over again! Worse, instead of just the makers of formal verification systems, everyone seems to have a scheme to prevent the problem they already know is there.

An interesting bug in a different SSL implementation

The problem with most of the armchair analysis that has been going on on the Internet lies in the following two questions: how many more bugs in security-critical C code does the proposed remedy (be it more braces, fewer gotos, …) find? How many time-costly, tedious, soul-crushingly boring false positives does it uncover?

Since commenters have been generalizing on the basis of a sample of one, I feel no shame in presenting my own single example, raising the total number of scrutinized bugs to two. Note that, for us to make statistically sound arguments, we will eventually need to extend the discussion to examples of correct code that we do not wish to change.

Until then, here is a funny SSL implementation bug. It is characterized as follows, in a version of PolarSSL 1.1.8 that colleagues and I have been modifying.

Screenshot 1: an alarm in our tis_recv() function?

CVE-2013-5914-1.png

PolarSSL expects the program in which it is incorporated to provide it with a function that receives data from the outside world and writes it to a memory buffer. We have made such a function, baptized it tis_recv, and we have set up PolarSSL to use it.

The function tis_recv takes three arguments. The first is a context argument in case someone needs one (our function ignores it). The second is a pointer to the buffer where the data is to be written, and the third is the maximum size the function is allowed to write there.

We have specified our function tis_recv thus:

/*@ 
  requires recv_r1: \valid(output+(0..output_len-1)) ;
  requires recv_r2: output_len > 0 ; 
*/
int tis_recv(void* p, unsigned char * output, size_t output_len);

The screenshot indicates on the bottom right that the pre-condition recv_r1, which states that the argument output points to a buffer large enough for output_len characters, is not verified. How could this be? Surely a false positive… Or someone is calling the function with the wrong arguments. It does not look like the problem is in our function.

The GUI informs us that the function tis_recv is called in one place only, and that place is inside ssl_fetch_input(). It is called through a function pointer stored inside a member of a struct accessed through a pointer. The GUI tells us that we can mentally substitute ssl->f_recv(..) with tis_recv(...).

Screenshot 2: a wrong third argument

The GUI tells us that the buffer that PolarSSL passes to tis_recv() has size 16KiB-ish (not pictured), and that the variable len passed as third argument appears to take any of the values of its size_t type (pictured in the bottom panel below).

CVE-2013-5914-2.png

Screenshot 3: jumping back to the origin of the value

We inquire where the value of variable len was last set, and the GUI tells us it is at the yellow line just above the function call (pictured, in the middle panel). Well, hum, yes, we could have noticed that, but it was faster to click.

CVE-2013-5914-3.png

Screenshot 4: argument nb_want is too large

The value of len is computed from ssl_fetch_input()'s argument nb_want, and the value of nb_want appears to be too large, 65540, for the size of the buffer that the GUI tells us we have (in the screenshot below, the values computed as possible for nb_want are displayed in the bottom panel).

CVE-2013-5914-4.png

Screenshot 5: dangerous values come from caller ssl_read_record()

A new possibility offered by the Frama-C version I am using that may not even(*) be available in the latest release Fluorine is to observe, in the bottom panel, which call-sites and originating call-stacks cause which values to occur in a variable. Here, the GUI shows that nb_want appears to be 65540 when called from function ssl_read_record() at line 1178 in file ssl_tls.c of PolarSSL. This means we can continue the investigation there. In contrast, the value of nb_want can only be 5 when ssl_fetch_input() is called from ssl_parse_client_key_exchange(), so there is no need to look at that function: it is definitely not part of this problem.

(*) I don't remember. It has been a long time, has it not?

CVE-2013-5914-5.png

Screenshot 6: the seemingly too large argument is ssl->in_msglen

CVE-2013-5914-6.png

The problem is that ssl->in_msglen is too large here. But where does it come from?

Screenshot 7:

ssl->in_msglen has been computed from two bytes sent by the network (bad, bad network). But the value of ssl->in_msglen is supposed to have been validated before being used. This is what the lines between the point where the value is obtained and the point where it is used are supposed to do.

CVE-2013-5914-7.png

Screenshot 8:

CVE-2013-5914-8.png

The value of ssl->in_msglen is validated when ssl->minor_ver is 0, and it is validated when ssl->minor_ver is 1. But according to the GUI, ssl->minor_ver can be any of 0, 1 or 2.

Explanation

At this point it is only a matter of confirming that the call to ssl_read_record() can indeed be reached with ssl->minor_ver set to 2. This is where one switches to existential mode, possibly crafting a malicious message, or simply building a convincing argument that values can converge here to cause bad things and sending it to the official maintainer.

When I said that this was a modified PolarSSL 1.1.8 we were analyzing, I cheated a little. This is a 1.1.8 version in which I have re-introduced a bug that was there from 1.0.0 to 1.1.7. The principal maintainer of PolarSSL suggests fixing the bug by replacing == SSL_MINOR_VERSION_1 by >= SSL_MINOR_VERSION_1.
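To make the shape of the problem concrete, here is a hedged sketch of the validation pattern described above. It is not the verbatim PolarSSL source; the field and constant names (SSL_MAX_CONTENT_LEN, POLARSSL_ERR_SSL_INVALID_RECORD) are assumptions used only for illustration, and the point is solely the == versus >= comparison:

/* Hypothetical sketch, not the actual PolarSSL code. */
if( ssl->minor_ver == SSL_MINOR_VERSION_0 )
{
    if( ssl->in_msglen < 1 || ssl->in_msglen > SSL_MAX_CONTENT_LEN )
        return( POLARSSL_ERR_SSL_INVALID_RECORD );
}

if( ssl->minor_ver == SSL_MINOR_VERSION_1 )   /* suggested fix: >= instead of == */
{
    if( ssl->in_msglen < 1 || ssl->in_msglen > SSL_MAX_CONTENT_LEN )
        return( POLARSSL_ERR_SSL_INVALID_RECORD );
}

/* When ssl->minor_ver is 2, neither branch validates ssl->in_msglen, and the
   unchecked length later reaches ssl_fetch_input() as argument nb_want. */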

Conclusion

We have seen a second example of a real bug that can occur in an SSL implementation. Understandably in the current context, there has been much speculation over the last 48 hours on the innocence of the bug in Apple's implementation. Might it be a voluntarily inserted backdoor? Is the NSA behind this bug? (I link here to John Gruber's post because he writes clearly, but the same question is raised all over the Internet).

Allow me to put it this way: if the Apple SSL bug is a trick from the NSA, then you US citizens are lucky. Our spy agency in Europe is so much better that it does not even have a name you have heard before, and it is able to plant bugs where the buffer overflow leading to arbitrary code execution is three function calls away from the actual bug. The bug from our spy agency is so deniable that the code actually used to be fine when there were only two minor revisions of the SSL protocol. The backdoors from your spy agency are so lame that the Internet has opinions about them.


Real bugs come in all sizes and shapes. That one mistake with security implications also causes easily detectable unreachable code or wrongly-indented lines does not mean that all security flaws will be detected so easily, or that plenty of quirky but functionally correct code will not be wrongly flagged.

Speaking of varied bugs, “what about PolarSSL version 1.1.8 without the manually re-inserted bug from CVE-2013-5914?”, I hear you ask. This will be the subject of another blog post.

Acknowledgments

Philippe Herrmann was first to use Frama-C to find a security bug in PolarSSL (Philippe's bug was a denial of service in versions up to 1.1.1, fixed in 1.1.2). Anne Pacalet and Paul Bakker have helped me with various aspects of PolarSSL's verification, including but not limited to the bug described in this post. Twitter users aloria, thegrugq, and johnregehr provided comments on drafts of this post. And finally, the Internet made this post possible by being itself.

Thursday, May 9 2013

Definition of FLT_EPSILON

Correct and wrong definitions for the constant FLT_EPSILON

If I google “FLT_EPSILON”, the topmost result is this page with this definition:

FLT_EPSILON the minimum positive number such that 1.0 + FLT_EPSILON != 1.0.

No, no, no, no, no.

I don't know where this definition originates from, but it is obviously from some sort of standard C library, and it is wrong, wrong, wrong, wrong, wrong. The definition of the C99 standard is:

the difference between 1 and the least value greater than 1 that is representable in the given floating point type, b^(1−p)

The GNU C library gets it right:

FLT_EPSILON: This is the difference between 1 and the smallest floating point number of type float that is greater than 1.

The difference

On any usual architecture, with the correct definition, FLT_EPSILON is 0x0.000002p0, the difference between 0x1.000000p0 and the smallest float above it, 0x1.000002p0.


The notation 0x1.000002p0 is a convenient hexadecimal input format, introduced in C99, for floating-point numbers. The last digit is a 2 where one might have expected a 1 because single-precision floats have 23 explicit bits of mantissa, and 23 is not a multiple of 4. So the 2 in 0x1.000002p0 represents the last bit that can be set in a single-precision floating-point number in the interval [1…2).


If one adds FLT_EPSILON to 1.0f, one does obtain 0x1.000002p0. But is it the smallest float with this property?

#include <stdio.h>

void pr_candidate(float f)
{
  printf("candidate: %.6a\tcandidate+1.0f: %.6a\n", f, 1.0f + f); 
}

int main(){
  pr_candidate(0x0.000002p0);
  pr_candidate(0x0.000001fffffep0);
  pr_candidate(0x0.0000018p0);
  pr_candidate(0x0.000001000002p0);
  pr_candidate(0x0.000001p0);
}

This program, compiled and executed, produces:

candidate: 0x1.000000p-23	candidate+1.0f: 0x1.000002p+0
candidate: 0x1.fffffep-24	candidate+1.0f: 0x1.000002p+0
candidate: 0x1.800000p-24	candidate+1.0f: 0x1.000002p+0
candidate: 0x1.000002p-24	candidate+1.0f: 0x1.000002p+0
candidate: 0x1.000000p-24	candidate+1.0f: 0x1.000000p+0

No, 0x0.000002p0 is not the smallest number that, added to 1.0f, causes the result to be above 1.0f. This honor goes to 0x0.000001000002p0, the smallest float above half FLT_EPSILON.

Exactly half FLT_EPSILON, the number 0x0.000001p0 or 0x1.0p-24 as you might prefer to call it, causes the result of the addition to be exactly midway between 1.0f and its successor. The rule says that the “even” one has to be picked in this case. The “even” one is 1.0f.

Conclusion

Fortunately, in the file that initiated this rant, the value for FLT_EPSILON is correct:

#define FLT_EPSILON 1.19209290E-07F // decimal constant

This is the decimal representation of 0x0.000002p0. Code compiled against this header will work. It is only the comment that's wrong.
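As a quick sanity check, on a platform where float is the IEEE 754 binary32 format, one can print both the header's decimal constant and FLT_EPSILON in hexadecimal and observe that they denote the same number. This little program is mine, not part of the header in question:

#include <stdio.h>
#include <float.h>

int main(){
  printf("FLT_EPSILON:      %a\n", (double)FLT_EPSILON);
  printf("decimal constant: %a\n", (double)1.19209290E-07F);
  /* both lines are expected to print 0x1p-23, that is, 0x0.000002p0 */
}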

Wednesday, May 1 2013

A conversionless conversion function

A rant about programming interview questions

Software development is a peculiar field. An applicant for a more traditionally artistic position would bring a portfolio to eir job interview: a selection of creations ey deems representative of eir work and wants to be judged by. But in the field of software development, candidates are often asked to solve a programming problem live, the equivalent of telling a candidate for a photography job “these are nice photographs you have brought, but could you take a picture for me, right now, with this unfamiliar equipment?”

A lot has already been written on programming job interview questions. Googling “fizzbuzz” alone reveals plenty of positions taken, reacted to, and counter-argued. I do not intend to add to the edifice. Taking a step back, however, I notice that many of these posts do not tackle the question of why what works for the traditional arts should not be appropriate for the art of programming.


What I intend to discuss is the poor quality of “do this simple task, but avoid this simple construct that makes it trivial” interview questions. I hate those. Every time I hear a new one, it seems to reach a new high in sheer stupidity. The questions are usually very poorly specified, too. One such question might be to convert a floating-point value to an integer without using a cast. Is floor() or ceil() allowed? Are other library functions that, like these, solve the problem too directly allowed? May I use a union to access the bits of the floating-point representation? Or memcpy()?

Well, I have solved this particular question once and for all. The conversion function makes up the second part of this post. It uses only floating-point computations, no tricks. Now, one just needs to learn it and regurgitate it as appropriate at interview time (there is no way one could write a working version of this program on a blackboard anyway). Who is hiring?

A program

The function below requires a strict IEEE 754 implementation. If your GCC is generating x87 instructions, options -msse2 -mfpmath=sse should prevent it from doing so and allow you to run the program:

#include <math.h>
#include <float.h>
#include <stdio.h>
#include <limits.h>

/*@ requires 0 <= f < ULLONG_MAX + 1 ; */
unsigned long long dbl2ulonglong(double f)
{
  if (f < 1) return 0;
  unsigned long long l = 0;

  for (double coef = 3; coef != 0x1.0p53; coef = 2 * coef - 1, l >>= 1)
    {
      double t = coef * f;
      double o = f - t + t - f;
      if (o != 0)
	{
	  l |= 0x8000000000000000ULL;
	  f -= fabs(o);
	}
    }
  l |= 0x8000000000000000ULL;
  for ( ; f != 0x1.0p63; f *= 2) l>>=1;
  return l;
}

int main()
{
  double f = 123456.;
  unsigned long long l = dbl2ulonglong(f);
  printf("result:%.18g %llu\n", f, l);

  f = ULLONG_MAX * 0.99;
  l = dbl2ulonglong(f);
  printf("result:%.18g %llu\n", f, l);

  printf("rounding:%llu %llu %llu %llu\n", 
	 dbl2ulonglong(DBL_MIN),
	 dbl2ulonglong(1.4),
	 dbl2ulonglong(1.5),
	 dbl2ulonglong(1.6));

  return 0;
}

The expected result is:

result:123456 123456
result:1.82622766329724559e+19 18262276632972455936
rounding:0 1 1 1

Incidentally, does anyone know how to write a correctness proof for this function? A formal proof would be nice, but just an informal explanation of how one convinces oneself that o = f - t + t - f is the right formula would already be something.

Friday, April 26 2013

Of compiler warnings discussions

A discussion I often have addresses the question of whether a compiler¹ can warn for all possible illegal actions a program could take at run-time within a specific, well-understood family².

(1) or some other piece of software that receives a program as input and, in finite time, does something with it, for example a static analyzer

(2) for example the family of out-of-bounds memory accesses


For some reason my opinion differs significantly from the consensus opinion, but since this is my^H^H the Frama-C blog, I thought I would nevertheless put it here for exposure.

The Halting problem is only relevant as a metaphor

The Halting problem is defined for Turing machines. A computer is not a Turing machine. A Turing machine has an infinite ribbon, whereas a real computer has finite memory.


Imagine a real, physical computer running a program. Imagine that you sit at the keyboard, typing new inputs indefinitely, swapping the keyboard for a new one when wear and tear make it unusable, putting the computer on a battery and moving it to another planet when the Sun explodes, rigging your will to ensure your descendants will carry on the typing, this sort of thing.

Seems like a glimpse of infinity, right? But the mathematical abstraction for this computer is not a Turing machine, it is a boring Mealy machine (an automaton with inputs and outputs). A Turing machine has an infinite ribbon it can write to and read from again. The above set-up does not. It is a Mealy machine with 256^(10^12) states (assuming a 1TB hard-drive).

A sound analyzer is not impossible

A sound analyzer for out-of-bounds accesses is an analyzer that warns for every out-of-bounds access. Making one is easier than it seems. There is always the solution of warning for every dereferencing construct in the program (in C, these would be *, [] and ->).

A useful sound analyzer is not impossible

That last analyzer, although sound, was pretty useless, was it not? General theoretical results on the undecidability of the run-time behavior of Turing machines would predict —if allowed to be extrapolated to computers and programs— that it is impossible to build a sound and complete analyzer.

A complete analyzer for out-of-bounds accesses warns only for accesses that can really be out-of-bounds for some program inputs.

Indeed, a sound and complete analyzer is difficult. It is impossible for Turing machines, and it is hard even for mere computers and programs.

Luckily, the theoretical difficulty is only in reference to arbitrary programs. It is theoretically difficult to make the promise to warn for all out-of-bounds accesses, and to warn only for real out-of-bound accesses, for arbitrary programs, but an analyzer does not have to do that to be useful. It only needs to deal with programs people are actually interested in (an excessively small subset of all programs).

And it does not even need to be sound and complete to be useful. An analyzer that warns for all out-of-bounds accesses with only a few false positives is still useful (and still sound. “Complete” is what it is not).

Conversely, an analyzer that warns only for actual out-of-bounds accesses and only omits a few is still useful and complete, although it is not sound. But I have not spent the last 8 years working on such an analyzer, so it is not my job to advertise one. Whoever wrote one can tout it in its own blog.


This blog contains examples of analyses made using a sound analyzer. The analyzer does not emit so many false positives that it is unusable. The analyses involve a variety of real programs: QuickLZ (there were a few false positives, but a bug was found and was fixed), the reference implementation for Skein (no bug was found but there were no false positives), Zlib (in this ongoing analysis there are plenty of false positives but one extremely minor issue has already been found), …

But this cannot work in presence of dynamically loaded libraries!

Well, yes, in order to do “program analysis”, the program must be available. It is in the name.

Speaking of binary-only libraries or programs, the counter-arguments I have written above apply the same to the analysis of a program in binary form. Analyzing a binary program is significantly harder than analyzing source code, but a binary analyzer does not have to be unsound, it does not have to be incomplete, and it does not have to decide the Halting problem. The examples I have provided are for analyses of source code because that's what I work on, but look for the blog of a binary static analyzer and there is no theoretical reason you won't be pleasantly surprised.

But this cannot work if there are inputs from the user / from a file!

Of course it can. The more you know and don't tell the analyzer, the more its warnings may subjectively feel like false positives to you, but the analyzer can always assume the worst about any inputs:

#define EOF (-1)

/*@ ensures \result == EOF || 0 <= \result <= 255 ; */
int getchar(void);

int main(int c, char **v)
{
  int a[5], i;
  i = getchar();
  a[i] = 6;
}
$ frama-c -pp-annot t.c -val
...
t.c:10:[kernel] warning: accessing out of bounds index [-1..255]. assert 0 ≤ i < 5;

The first four lines would not have to be written every time. They model a function from the standard library and would only need to be updated when the specification of the standard C library changes.

Speaking of which, these four lines are not a very good model of the function getchar(). It is possible to do better, and the above is only an example, simplified for clarity but best not re-used.

Conclusion

In conclusion, it is possible to make sound static analyzers, because theoretical undecidability results are about Turing machines, because people are not interested in arbitrary programs, and because a couple of false positives once in a while do not necessarily make a sound analyzer unusable. The theoretical counter-arguments apply equally to binary code, to concurrent programs, and when the uses of the adjectives “sound” and “complete” are swapped, but I only have examples for the static analysis of sequential C source code.

Thursday, April 25 2013

Sign extension

There is no “sign extension” in C. Please stop referring to “sign extension” in C programs.

In assembly, there is such a thing as “sign extension”

Sign-extend is an assembly instruction, say movsx %al, %ebx, to transfer the contents of a narrow register, say %al, to a wide register, say %ebx, copying the most significant bit of the narrow register all over the unoccupied bits of the wide register.

In C, there is no “sign extension”. Really.

In C, say a short variable s contains -3, and the contents of that variable are transferred to another variable of type int, like so:

int i = s;

Variable i receives -3. That's it. That's “sign extension”.

It's an assignment and, in this particular case, a promotion. With other types it could have been a conversion. The target wide variable receives the value that was contained in the narrow variable. Why would this operation need a name, and such an ugly name as “sign extension” at that?

The name for “sign extension” in C is “getting the value”. Variable i receives the value of s. That's it. That is all there is.
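If an illustration helps, here is a minimal complete program (mine, not from any particular codebase) showing that “getting the value” is all that happens:

#include <stdio.h>

int main(){
  short s = -3;
  int i = s;            /* i receives the value of s */
  printf("%d\n", i);    /* prints -3 */
}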

Thursday, April 4 2013

Google forking WebKit

Blink as seen from the inside

As you have undoubtedly heard if you follow at all this sort of thing, as of April 3, Google is forking WebKit. Its Chrome browser will henceforth rely on its own variation of the popular rendering engine, Blink. This is big news. If I were Google, I would have pulled a Gmail and made this serious announcement two days earlier. Google security engineer Justin Schuh gives the inside scoop.


Something particularly strikes me in Justin's post. This post reacts to that aspect of Justin's, and to that aspect only.

As of 2013, Cybersecurity is an unsolved problem

Let us be honest, cybersecurity, in general, in 2013, is a bit of an unsolved problem. Despite this state of affairs, or because of it, it can be difficult to get some of the concerned companies to recognize there exists a problem at all.

Fortunately, some are more open than others regarding security. Google is more open than most, and when an exceptional event, such as the forking of WebKit, justifies it, someone there may remind us of all the Chrome team does for security in a convenient list:

[…] the Chrome security team has taken a very active role in WebKit security over the last several years, and really led the pack in making Webkit more robust against exploits. We’ve fuzzed at previously unheard of scale <goo.gl/R718K>, paid out hundreds of thousands of dollars in bug bounties <goo.gl/PGr89>, performed extensive code auditing, fixed many hundreds of security bugs, and introduced a slew of hardening measures. […]


My point is only tangentially relevant to web rendering engines, so I am taking the above list out of context, and adding my own punchline: despite all these efforts, measured in millions of dollars, it is likely that there is still one exploitable security flaw in Chrome, and it is possible that, as you read this, someone somewhere is exploiting this flaw.


To be very clear, I am not blaming Google here. Google allocates the millions of dollars that cybersecurity warrants. It is open about how it uses this money; Google uses all the techniques that have proved their worth: random testing, bug bounties, code reviews, defensive programming. I am only saying that cybersecurity is difficult, since even doing all this does not guarantee absolute confidence. If you were to embed Chromium on some kind of connected device, even after doing all the above, you had better leave room in the design for regular software updates.

Sunday, March 3 2013

Correct rounding or mathematically-correct rounding?

CRlibm is a high-quality library of floating-point elementary functions. We used it as reference a long time ago in this blog while looking at lesser elementary function implementations and the even lesser properties we could verify about them.

A bold choice

The CRlibm documentation contains this snippet:

[…] it may happen that the requirement of correct rounding conflicts with a basic mathematical property of the function, such as its domain and range. A typical example is the arctangent of a very large number which, rounded up, will be a number larger than π/2 (fortunately, ◦(π/2) < π/2). The policy that will be implemented in crlibm will be

• to give priority to the mathematical property in round to nearest mode (so as not to hurt the innocent user who may expect such a property to be respected), and

• to give priority to correct rounding in the directed rounding modes, in order to provide trustful bounds to interval arithmetic.

The choice for directed rounding modes is obviously right. I am concerned about the choice made for round-to-nearest. The documentation states the dilemma very well. One can imagine slightly out of range values causing out-of-bound indexes during table look-ups and worse things.


I seldom reason about floating-point programs. I work on static analysis and am only concerned about floating-point inasmuch as it is a requirement for writing a static analyzer correct for programs that include floating-point computations.

However, when I do reason about floating-point programs, I am more often compounding approximations, starting from the base assumption that a correctly rounded function returns a result within 1/2 ulp of the mathematical result, than I am assuming that atan(x) ≤ π/2. The choice the CRlibm implementors made means that suddenly, the reasoning I often make is wrong. The value of atan(x) in the program may not be within 1/2 ulp of the real arctangent of the same x. It can be farther when x is very large and mathematical correctness overrode correct rounding.

Truck drivers fall asleep at the wheel when they face long, dull stretches of straight, empty roads. Similarly, it is good to have another special case to consider when reasoning about floating-point computations. With only infinities and denormals to worry about, it can get, you know, a bit too easy.

Oh well, it's only π/2

In this section I rhetorically assume that it is only π/2 for which there is a problem. The CRlibm documentation reminds us that in the case of double precision, we were lucky. Or perhaps it isn't luck, and the IEEE 754 committee took the desirableness of the property (double)π/2 < π/2 into account when it chose the number of bits in the significand of the double-precision format.


How lucky (or careful) have we been, exactly? Let us test it with the program below — assuming my compilation platform works as intended.

#include <stdio.h>

#define PI(S) 3.1415926535897932384626433832795028841971693993751##S

float f = PI(f);
double d = PI();
long double ld = PI(L);

int main(){
  printf("   3.14159265358979323846264338327950288419716939937510\n");
  printf("f  %.50f\n", f);
  printf("d  %.50f\n", d);
  printf("ld %.50Lf\n",ld);
}

The result of compiling and executing the program is, for me:

   3.14159265358979323846264338327950288419716939937510
f  3.14159274101257324218750000000000000000000000000000
d  3.14159265358979311599796346854418516159057617187500
ld 3.14159265358979323851280895940618620443274267017841

As you can see, the nearest single-precision float to π is above π, as is the nearest 80-bit long double. The same goes for π/2 because the floating-point representations for π and π/2 only differ in the exponent. Consequently, the issue raised by the CRlibm implementors will come up for both functions atanf() and atanl(), when it is time to get them done. We were not very lucky after all (or careful when defining the IEEE 754 standard).

A subjective notion

But what exactly is the informal “mathematical correctness” notion that this post is predicated upon? Yes, the “innocent user” may expect mathematical properties to be respected as much as possible, but there are plenty of mathematical properties! Let us enumerate some more:


If x ≤ 1 in a program, then exp(x) should always be lower than the mathematical constant e.

So far so good. The above is a good rule for an exponential implementation to respect. We are making progress.


Here is another property:

If x ≥ 1 in a program, then exp(x) should always be greater than the mathematical constant e.


We are decidedly unlucky today, because at most one of these is going to be true of any floating-point function exp(). The programmatic value exp(1) must be either above or below the mathematical constant e (it is never equal to it because the mathematical constant e does not have a finite representation in binary).
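For the curious, a two-line experiment shows which side of e a given libm falls on; compare the printed digits to e = 2.71828182845904523536… This snippet is mine and assumes nothing beyond the standard exp():

#include <stdio.h>
#include <math.h>

int main(){
  /* e = 2.71828182845904523536... */
  printf("exp(1) = %.20f\n", exp(1.0));
}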

Why does it matter anyway?

Let us revisit the argument:

to give priority to the mathematical property in round to nearest mode (so as not to hurt the innocent user who may expect such a property to be respected)

I alluded to a possible problem with a programmer computing an array index from atanf(x) under the assumption that it is always lower than π/2. But how exactly would an innocent user even notice that atanf(1e30) is not lower than π/2? The value π/2 cannot exist in eir program any more than e. The user might innocently write an assertion like:

assert(atanf(x)<=(3.1415926535897932f/2.0f));

This assertion will never trigger! The function atanf() will indeed return at most the single-precision float 3.1415926535897932f/2.0f. It does not matter that this number is actually slightly larger than π/2. For all intents and purposes, in the twisted world of single-precision floating-point, this number is π/2.

Conclusion

There are other scenarios in which the innocent user might genuinely have an unpleasant surprise. The result of a computation may be converted to decimal for humans to read and the user may be surprised to see a value outside the range ey expected. But this user would have the wrong expectations, just as if ey expected 10.0 * atan(x) to always be less than 5π. Plenty of these users and developers can be found. But my opinion, for what it is worth, is that by making special cases you are not helping these users, only feeding their delusions.

The correct way to set expectations regarding the results of a floating-point program is numerical analysis. Numerical analysis is hard. Special cases such as the ones the authors of CRlibm threaten to implement only seem to make it harder.

Friday, February 1 2013

ENSL seminar

Seminar

As anticipated, I was at my alma mater's student seminar last Tuesday. School and seminar were very much like I remembered them. The latter was improved by the orange juice and biscuits to chat around after the talk, which I do not think were part of the protocol when I was a student.

My years at the ENS of Lyon were a good time, but I had forgotten the reasons that made them so. If any of the students from the seminar read this, I hope they will pass on my thanks to everyone for reminding me why. In one word, it is the company.


The sort of discussion that emerges naturally in an open, low-stakes chat after a general presentation of Frama-C but is too open-ended to be part of the talk's questions is that of general usefulness.

Not every programmer is writing critical embedded code—in fact, most of us aren't. But in this day and age, many of us who program at all write code with mild to severe security implications, and I believe that, with a reasonable amount of work, Frama-C can help here next.

Video games, and software security in the news

The seminar also gave me an excuse to visit my family that lives in the general area of Lyon. As a result, today, I was helping my nephew play Professor Fizzwizzle (now an online game, apparently) after lunch, when the 13:00 TV news announced that some webcams, sold for the purpose of allowing homeowners to watch their living room remotely and reassure themselves that they are not being burgled, instead allowed anyone with internet access to watch, at any time.

This was not by design, of course. This was the result of a security flaw. I am not saying that the security flaw was one of the kind Frama-C identifies, say, a buffer overflow in a C program: the France 2 news report (in French, available for a week) did not go into such detail. But I am willing to bet that software was involved.

The future

So what world is my three-year-old nephew going to grow up in? Either we collectively solve the software security problem, and hopefully, when he is of an age to lend an ear to TV news while playing Professor Fizzwizzle, there won't be any more of these spectacular failures to announce. Or we do not, and then, at the current rate, TV news won't announce security failures because they will not be news any more.

A reflexive illustration

One last thing: this blog post contains two links, one that I expect to be to a Java web applet (my nephew was playing a game that I bought a license for and downloaded the old-fashioned way in the early 2000s, but it always felt like Java, especially at launch-time). The other is to a Flash video. If either of them worked for you, you should seriously think about disabling plug-ins in your browser. Yes, you will be missing out, but frankly, this is the only solution at this time.


Students from ENSL or elsewhere, if you would like to help solve this problem (at least the part where embedded-like C code is concerned) and at the same time solve your problem of picking a first- or second-year internship, e-mail addresses of Frama-C developers are either firstname dot lastname at inria dot fr for contributors from Inria, or firstname dot lastname at cea dot fr for contributors from CEA (including me).

Wednesday, January 23 2013

2000s

Blogger of multiple qualities Harry McCracken was recently still looking for “an iPad PDF reader which can handle giant files which are 100s of pages long without choking”. Sorry, I meant “STILL looking”.

PDF is a nineteen-nineties technology to display text and pictures. Key quote: “Each PDF file encapsulates a complete description of a fixed-layout flat document, including the text, fonts, graphics, and other information needed to display it”.

Now, you might think that any proposal made in 1991 to master the intricacies of displaying text and pictures would have succeeded by 2013, but apparently not, according to Harry's call for help.


Fortunately, another nineteen-nineties technology came to the rescue: the Djvu format was created in 1996 for the purpose, I wish I were kidding, of storing text and images. Key quote: “a computer file format designed primarily to store scanned documents, especially those containing a combination of text, line drawings, indexed color images, and photographs”.


It is good to see progress at work. Even if we acknowledge that it takes years to get these things right and that a project started in 1996 is bound to be more technologically advanced than one started in 1991, there remains the question: what the file have we been collectively doing during the 2000s?


This leads me into a personal anecdote. In 1995, I was admitted to the ENS of Lyon (and I may be thinking about it because I am visiting back there on January 29. Drop by if you are around…). These were interesting times: 1995 is more or less the year the internet slash the world wide web started to become widespread in the US and then in other parts of the world. We were privileged to have access to it already at the ENSL.

In 1996, I left France for a two-and-a-half-month internship at Indiana University. Before I left, I taught my father to use the 28800 bps external modem I was leaving him to dial long-distance to Lyon, impersonate me on the ENSL's server, upload a text file containing news of the family, download a text file that might contain news from me, and disconnect as soon as possible because heck, that was expensive (although cheaper than any available alternative). This was done with software that was called either “Kermit” or “ZMODEM”. Kermit, I am pretty sure it was. You could not trust these new-fangled protocols then any more than you can now.

Nowadays, my father has his own e-mail address and he may write to anyone he likes, as opposed to leaving messages in a dropbox for me. And he does not pay long-distance rates for it.

But I can't help thinking that all progress that has occurred since then was incremental. Slightly better for much cheaper. There has been the iPhone. But then again, the first PalmPilot was launched in 1997.


So, what have we been doing during the 2000s? What technologies were so life-changing that we are going to take them for granted in the 2010s?

Tuesday, January 1 2013

Software obsolescence and security

A couple of months ago, I packed a computer into a brown plastic bag of the kind usually used for trash. I then carried the carefully packed computer down to the basement. Physically, the computer still works. It is still beautiful (it is an iMac G4). It has been with me for the last ten years: I bought it second-hand in December 2002 from my friend Damien Doligez. I have used it for both work and entertainment for several years, and then gave it to my computerwise-unsophisticated sister who has been using it since. The design, much friendlier than later iMacs, and even than earlier iMacs, evokes the Luxo Pixar character. This computer was built to last: after ten years, the white plastics are still white. I pick this insignificant detail not because I particularly care about the tint of my computing devices, but because I don't. This is not an accident. People took care of the details when designing this computer, all the way down to how it would look as it aged.

The iMac G4 came with the cutest matched loudspeakers. The integrated modem would not find much use nowadays, but the computer also has USB, firewire and ethernet ports, and Wi-Fi. The thinking behind the integrated modem remains avant-gardist even if the technology is now outdated: this computer was the hub of its buyer's future digital lifestyle. It was intended to be connected, and it was intended not to look like an untidy mess in your living room.


300px-IMac.jpg


I opened it a couple of times to upgrade the hard disk and optical drive. It reminded me of the glimpse I once took as a student of the insides of a Sun workstation. I don't think I could explain this particular feeling to someone who did not see the inside of a Sun workstation—or of an iMac. As a concrete example that does not do justice to the experience, the unused screw holes of the factory-installed optical drive I retrieved from the computer had been taped over, so that air would not be sucked through the drive, lest dust accumulate on the drive's door as years passed.


Unfortunately, among the functions most everyone expects from a computer nowadays are displaying e-mail and web pages, and this computer is no longer fit for that purpose. This isn't a case of running new software on old hardware. That has never worked, but one used to be able to use loved hi-tech gadgets way past obsolescence as long as one avoided the mistake of installing new software on them. For lustra, you could keep using the spreadsheet you were used to for accounting or your simple-but-sufficient word processor. Even games kept working!

The problem nowadays is that the option not to upgrade software is no longer available: software more than a few years old ceases to receive security patches. The iMac G4 in question cannot be upgraded past Mac OS X 10.4. On Mac OS X 10.4, Firefox cannot be upgraded past version 3.6. Firefox 3.6 received its last security update in March 2012 and has now joined the Great Repository in the Sky.


Here is an idea for 2013: how about making software that works? Of course, it would not automatically adapt to new protocols, but as long as the user was satisfied with the old ones, this software would simply keep working. It would not have to be constantly updated with security patches because it would have been made right the first time, just like hardware sometimes is.

I would not want to give the impression that I am unfairly picking on the fine developers of Firefox. Firefox 3.6 continued to be maintained long after Safari for OS X 10.4 had ceased to receive security updates. In fact, the TenFourFox project, a maintained variant of Firefox for precisely the kind of computer I took to the basement, is a testament to the power of Open Source Software. Unfortunately, there is no Adobe Flash support in TenFourFox, and I fear that this might complicate the life of a computerwise-unsophisticated user.


Acknowledgements: besides selling me his computer, Damien Doligez officially owned the first OS X-able mac I used, as a PhD student. Harry McCracken suggested to liven up this post with a picture. Gerhard Piezinger photographed an iMac very much like mine, and I used his picture.
