Frama-C news and ideas

To content | To menu | To search | Frama-C Home

Friday, September 23 2016

A mini-tutorial of ACSL specifications for Value

(with the collaboration of F. Kirchner, V. Prevosto and B. Yakobowski)

Users of the Value plugin often need to use functions for which there is no available code, or whose code could be abstracted away. In such cases, ACSL specifications often come in handy. Our colleagues at Fraunhofer prepared the excellent ACSL by example report, but it is mostly directed at WP-style proofs.

In this post we explain how to specify in ACSL a simple function, in a way that is optimal for Value.

Prerequisites:

  • Basic knowledge of Value;
  • Basic knowledge of ACSL.

The messages and behaviors presented here are those of Frama-C Aluminium. Using another version of Frama-C might lead to different results.

A simple function

In this tutorial, we proceed in a gradual manner to isolate problems and make sure that each step works as intended. Let us consider the following informal specification:

// returns a random character between buf[0] and buf[n-1]
char get_random_char(char const *buf, unsigned n);

A minimal ACSL specification for this function must check two things:

  1. that n is strictly positive: otherwise, we'd have to select a character among an empty set, thereby killing any logician that would wander around at that time (buf is a byte array that may contain \0, so that there is no obvious way to return a default value in case n == 0);
  2. that we are allowed (legally, as per the C standard) to read characters between buf[0] and buf[n-1];

The following specification ensures both:

/*@
  requires n > 0;
  requires \valid_read(buf+(0 .. n-1));
*/
char get_random_char(char const *buf, unsigned n);

Note the spaces between the bounds in 0 .. n-1. A common issue, when beginning to write ACSL specifications, is to write ranges such as 0..n-1 without spaces around the dots. It works in most cases, but as 0..n is a floating-point preprocessing token (yes, this looks strange, but you can see it yourself in section 6.4.8 of C11 standard if you're not convinced) funny things might happen when pre-processing the annotation. For instance, if we had a macro MAX instead of n, (e.g. #define MAX 255), then writing [0..MAX] would result in the following error message: [kernel] user error: unbound logic variable MAX, since the pre-processor would dutifully consider 0..MAX as a single token, thus would not perform the expansion of MAX. For that reason, we recommend always writing ranges with spaces around the dots.

To check our specification, we devise a main function that simply calls get_random_char with the appropriate arguments:

void main() {
  char *buf = "abc";
  char c = get_random_char(buf, 3);
}

Then, if we run this with Value (frama-c -val), it will produce the expected result, but with a warning:

[kernel] warning: No code nor implicit assigns clause for function get_random_char, generating default assigns from the prototype

By printing the output of Frama-C's normalisation (frama-c -print), we can see what its generated assigns clause looks like:

assigns \result;
assigns \result \from *(buf+(0 ..)), n;

Despite the warning, Frama-C kernel generated a clause that is correct and not too imprecise in this case. But we should not rely on that and avoid this warning whenever possible, by writing our own assigns clauses:

/*@
  requires n > 0;
  requires \valid_read(buf+(0 .. n-1));
  assigns \result \from buf[0 .. n-1], n;
*/
char get_random_char(char const *buf, unsigned n);

Running Value again will not emit any warnings, plus it will indicate that both preconditions were validated:

[kernel] Parsing FRAMAC_SHARE/libc/__fc_builtin_for_normalization.i (no preprocessing)
[kernel] Parsing file.c (with preprocessing)
[value] Analyzing a complete application starting at main
[value] Computing initial state
[value] Initial state computed
[value] Values of globals at initialization

[value] computing for function get_random_char <- main.
        Called from file.c:10.
[value] using specification for function get_random_char
file.c:2:[value] function get_random_char: precondition got status valid.
file.c:3:[value] function get_random_char: precondition got status valid.
[value] Done for function get_random_char
[value] Recording results for main
[value] done for function main
[value] ====== VALUES COMPUTED ======
[value] Values at end of function main:
  buf ∈ {{ "abc" }}
  c ∈ [--..--]

However, the result is still imprecise: c ∈ [--..--], that is, it may contain any character, not only 'a', 'b' or 'c'. This is expected: assigns \from ... clauses are used by Value to compute dependencies information, but they are not sufficient to constrain the exact contents of the assigned variables. We need to use postconditions for that, that is, ensures clauses.

The following ensures clause states that the result value must be one of the characters in buf[0 .. n-1]:

ensures \subset(\result, buf[0 .. n-1]);

The ACSL \subset predicate is interpreted by Value, which isolates the characters in buf, and thus returns a precise result: c ∈ {97; 98; 99}. Note that Value prints the result as integers, not as ASCII characters.

This concludes the specification of a very simple partial function. But what if we want to take into account more possibilities, e.g. a NULL pointer in buf, or n == 0?

First try: a robust and concise specification

Our get_random_char function is simple, but not robust. It is vulnerable to accidents and attacks. Let us apply some Postel-style recommendations and be liberal in what we accept from others: instead of expecting that the user will not pass a NULL pointer or n == 0, we want to accept these cases, but return an error code if they happen.

We have two choices: add an extra argument to get_random_char that represents the result status (OK/error), or use the return value as the status itself, and return the character via a pointer. Here, we will adopt the latter approach:

typedef enum { OK, NULL_PTR, INVALID_LEN } status;

// if buf != NULL and n > 0, copies a random character from buf[0 .. n-1]
// into *out and returns OK.
// Otherwise, returns NULL_PTR if buf == NULL, or INVALID_LEN if n == 0.
status safe_get_random_char(char *out, char const *buf, unsigned n);

We could also have used errno here, but global variables are inelegant and, most importantly, it would defeat the purpose of this tutorial, wouldn't it?

This new function is more robust to user input, but it has a price: its specification will require more sophistication.

Before revealing the full specification, let us try a first, somewhat naive approach:

typedef enum { OK, NULL_PTR, INVALID_LEN } status;

/*@
  assigns \result, *out \from buf[0 .. n-1], n;
  ensures \subset(*out, buf[0 .. n-1]);
*/
status safe_get_random_char(char *out, char const *buf, unsigned n);

void main() {
  char *buf = "abc";
  char c;
  status res = safe_get_random_char(&c, buf, 3);
  //@ assert res == OK;
}

This specification has several errors, but we will try it anyway. We will reveal the errors as we progress.

Once again, we use a small C main function to check our specification, starting with the non-erroneous case. This reveals that Value does not return the expected result:

...
[value] Values at end of function main:
  buf ∈ {{ "abc" }}
  c ∈ [--..--] or UNINITIALIZED
  res ∈ {0}

Variable c is imprecise, despite our \ensures clause. This is due to the fact that Value will not reduce the contents of a memory location that may be uninitialized. Thus, when specifying the ensures clause of a pointed variable, it is first necessary to state that the value of that variable is properly initialized.

This behavior may evolve in future Frama-C releases. In particular, our specification could have resulted in a more precise result, such as c ∈ {97; 98; 99} or UNINITIALIZED.

Adding ensures \initialized(out); before the \ensures \subset(...) clause will allow c to be precisely reduced to {97; 98; 99}. This solves our immediate problem, but creates another: is the specification too strong? Could we have an implementation of get_random_char in which \initialized(out) does not hold?

The answer here is definitely yes, especially in the case where buf is NULL. It is arguably also the case when n == 0, although it depends on the implementation.

The most precise and correct result that we expect here is c ∈ {97; 98; 99} or UNINITIALIZED. To obtain it, we will use behaviors.

Another reason to consider using behaviors is the fact that our \assigns clause is too generic, leading to avoidable warnings by Value: because we have written that *out may be assigned, Value will try to evaluate if out is a valid pointer. If our main function includes a test such as res = safe_get_random_char(NULL, buf, 3), Value will output the following:

[value] warning: Completely invalid destination for assigns clause *out. Ignoring.

This warning is not needed here, but its presence is justified by the fact that, in many cases, it helps detect incorrect specifications, such as an extra *. Value will still accept our specification, but if we want a precise analysis, the best solution is to use behaviors to specify each case.

Second try: using behaviors

The behaviors of our function correspond to each of the three cases of enum status:

  1. NULL_PTR when either out or buf are NULL;
  2. INVALID_LEN when out and buf are not NULL but n == 0;
  3. OK otherwise.

ACSL behaviors can be named, improving readability and generating more precise error messages. We will define a set of complete and disjoint behaviors; this is not required in ACSL, but for simple functions it often matches more closely what the code would actually do, and is simpler to reason about.

One important remark is that disjoint and complete behaviors are not checked by Value. The value analysis does take them into account to improve its precision when possible, but if they are incorrectly specified, Value may not be able to warn the user about it.

Here is one way to specify three disjoint behaviors for this function, using mutually exclusive assumes clauses.

/*@
  behavior ok:
    assumes out != null && buf != \null;
    assumes n > 0;
    requires \valid_read(buf+(0 .. n-1));
    requires \valid(out);
    // TODO: assigns and ensures clauses
  behavior null_ptr:
    assumes out == \null || buf == \null;
    // TODO: assigns and ensures clauses
  behavior invalid_len:
    assumes out != \null && buf != \null;
    assumes n == 0;
    // TODO: assigns and ensures clauses
*/
status safe_get_random_char(char *out, char const *buf, unsigned n);

Our choice of behaviors is not unique: we could have defined, for instance, two behaviors for null pointers, e.g. null_out and null_buf; but the return value is the same for both, so this would not improve precision.

Note that assumes and requires play different roles in ACSL: assumes are used to determine which behavior(s) are active, while requires impose constraints on the pre-state of the function or current behavior. For instance, one should not specify assumes \valid(out) in one behavior and assumes !\valid(out) in another: what this specification would actually mean is that the corresponding C function should somehow be able to distinguish between locations where it can write and locations where it cannot, as if one could write if (valid_pointer(buf)) in the C code. In practical terms, the main consequence is that such a specification would prevent Value from detecting errors related to memory validity.

For a more concrete example of the difference between them, you may consult the small appendix at the end of this post.

Assigns clauses in behaviors

Our specification for safe_get_random_char is incomplete: it has no assigns or ensures clauses.

Writing assigns clauses in behaviors is not always trivial, so here is a small, simplified summary of the main rules concerning the specification of such clauses for an analysis with Value:

  1. Global (default) assigns must always be present (even for complete behaviors), and must be at least as general as the assigns clauses in each behavior;
  2. Behaviors only need assigns clauses when they are more specific than the global one.
  3. Having a complete behaviors clause allows the global behavior to be ignored during the evaluation of the post-state, which may lead to a more precise result; but the global assigns must still be present.

Note: behavior inclusion is not currently checked by Value. Therefore, if a behavior's assigns clause is not included in the default one, the result is undefined.

Also, the reason why assigns specifications are verbose and partially redundant is in part because not every Frama-C plugin is able to precisely handle behaviors. They use the global assigns in this case.

For ensures clauses, the situation is simpler: global and local ensures clauses are simply merged with an implicit logical AND between them.

The following specification is a complete example of the usage of behaviors, with precise ensures clauses for both outputs (\result and out). The main function below tests each use case, and running Value results in valid statuses for all preconditions and assertions. The \from terms in the assigns clauses are detailed further below.

#include <stdlib.h>
typedef enum {OK, NULL_PTR, INVALID_LEN} status;

/*@
  assigns \result \from out, buf, n;
  assigns *out \from out, buf, buf[0 .. n-1], n;
  behavior null_ptr:
    assumes out == \null || buf == \null;
    assigns \result \from out, buf, n;
    ensures \result == NULL_PTR;
  behavior invalid_len:
    assumes out != \null && buf != \null;
    assumes n == 0;
    assigns \result \from out, buf, n;
    ensures \result == INVALID_LEN;
  behavior ok:
    assumes out != \null && buf != \null;
    assumes n > 0;
    requires \valid(out);
    requires \valid_read(&buf[0 .. n-1]);
    ensures \result == OK;
    ensures \initialized(out);
    ensures \subset(*out, buf[0 .. n-1]);
  complete behaviors;
  disjoint behaviors;
 */
status safe_get_random_char(char *out, char const *buf, unsigned n);

void main() {
  char *msg = "abc";
  int len_arr = 4;
  status res;
  char c;
  res = safe_get_random_char(&c, msg, len_arr);
  //@ assert res == OK;
  res = safe_get_random_char(&c, NULL, len_arr);
  //@ assert res == NULL_PTR;
  res = safe_get_random_char(NULL, msg, len_arr);
  //@ assert res == NULL_PTR;
  res = safe_get_random_char(&c, msg, 0);
  //@ assert res == INVALID_LEN;
}

Our specification includes several functional dependencies (\from), but there is still one missing. Can you guess which one? The answer, as well as more details about why and how to write functional dependencies, will appear in the next post. Stay tuned!

Appendix: example of the difference between requires and assumes clauses

This appendix presents a small concrete example of what happens when the user mistakingly uses requires clauses instead of assumes. It is directed towards beginners in ACSL.

Consider the (incorrect) example below, where bzero_char simply writes 0 to the byte pointed by the argument c:

/*@
  assigns *c \from c;
  behavior ok:
    assumes \valid(c);
    ensures *c == 0;
  behavior invalid:
    assumes !\valid(c);
    assigns \nothing;
  complete behaviors;
  disjoint behaviors;
*/
void bzero_char(char *c);

void main() {
  char *c = "abc";
  bzero_char(c);
}

In this example, Value will evaluate the validity of the pointer c, and conclude that, because it comes from a string literal, it may not be written to (therefore \valid(c) is false). Value then does nothing and returns, without any warnings or error messages.

However, had we written it the recommended way, the result would be more useful:

/*@
  assigns *c \from c;
  behavior ok:
    assumes c != \null;
    requires \valid(c);
    ensures *c == 0;
  behavior invalid:
    assumes c == \null;
    assigns \nothing;
  complete behaviors;
  disjoint behaviors;
*/
void bzero_char(char *c);

void main() {
  char *c = "abc";
  bzero_char(c);
}

In this case, Value will evaluate c as non-null, and will therefore activate behavior ok. This behavior has a requires clause, therefore Value will check that memory location c can be written to. Because this is not the case, Value will emit an alarm:

[value] warning: function bzero_char, behavior ok: precondition got status invalid.

Note that checking whether c is null or non-null is something that can be done in the C code, while checking whether a given pointer p is a valid memory location is not. As a rule of thumb, conditions in the code correspond to assumes clauses in behaviors, while requires clauses correspond to semantic properties, function prerequisites that cannot necessarily be tested by the implementation.

Friday, April 8 2016

Small improvements to the Frama-C GUI

The Frama-C GUI received a few quality-of-life improvements that have not been announced during the Magnesium release. This post presents some of them, along with general GUI tips.

We focus on general-purpose UI improvements, while another post will discuss new Value-specific features, related to the new Values tab.

These UI features should be discoverable via menus or other signifiers in the UI, but we decided to keep them hidden until stabilization, to avoid breaking existing workflows in case we decided to remove them. They are still not 100% mature, but since we have been pleasantly using them for several months, we feel confident enough to disclose their existence.

The features mentioned here are:

Textual search

It is now possible to perform a textual search in most Frama-C panels using Ctrl+F. Here is the list of panels where this works:

  • List of globals (upper-left panel, which displays filenames and global definitions);
  • CIL code (central panel);
  • Original source code (upper-right panel);
  • Information tab (lower panel);
  • Console tab (lower panel).

When you press Ctrl+F, the currently focused panel will display a small search window (if it accepts textual search) like this one:

Text search panel

Notice that the window mentions in which panel the search will be performed. Also, the search is case-sensitive.

Click on Find to search for the string. You can type F3 to perform a "Repeat search", that is, advance to the next occurrence of the text without opening a new panel. If you reach the end of the text, this dialog will notify you:

Reaching the end of the text

Also, if the search text is not found, you'll be presented a different dialog:

Text not found dialog

Message/Property counters

The Messages and Properties tabs now display the total amount of items they contain. This summary is useful for a quick comparison between short analyses. The Messages tab always display the amount of items, while the Properties tab, because of its "lazy" nature (items are only updated after clicking the Refresh button), initially does not display anything, but after selecting the set of desired filters, the Refresh button will update the total count, as illustrated in the figures below (click on the orange circles to switch images).

  • Message/Property counters (1/6)
  • Message/Property counters (2/6)
  • Message/Property counters (3/6)
  • Message/Property counters (4/6)
  • Message/Property counters (5/6)
  • Message/Property counters (6/6)

Note that the font in the Refresh button becomes bold to indicate that an update is needed, and returns to normal afterwards.

Types in the Information tab

The Information tab has been modified to include some typing information about variables, as shown in the example below.

Type information

In the example, the user clicked on the res expression, which displays some information (rectangle 1) such as the current function, statement id (used internally), source code line (with a clickable link to focus it in the original source panel), the type of the variable (with a clickable link), and some extra details.

By clicking on the variable type, we obtain more information (rectangle 2):

  • sizeof;
  • Source code line where the type is defined (if it is not a predefined type), with a clickable link;
  • Expanded type information (in the case of composite types). If a field is defined via yet another composite type, a link to the type information of that type will also be displayed.

This is very useful when exploring new code bases.

Note that some information that was previously displayed in this tab (e.g. the results computed by the Value plug-in) has been moved to the Values tab, which will be described in a later post.

Property filters and filtersets

One common issue for new users is to properly set the filters in the Properties tab, to avoid displaying too much information. Advanced users also had some difficulties defining a precise set of filters that matched their needs.

To help both kinds of users, the Properties tab had two minor improvements: first, some filter categories were defined (Kind, Status, RTE-related), to allow folding/unfolding of sets of filters.

Second, a few recommended filter sets were defined, corresponding to the most often used filter combinations. A small "Filter" button (a pointing hand clicking on a square) has been added next to the Refresh button, as indicated in the screenshot below:

Filter properties button

The "Reset all filters to default" menu consists in:

  • Hiding properties with status Considered valid, Untried or Dead;

  • Hiding statuses that are useful only in some specific situations: this includes Behaviors, Axiomatic and Reachable.

These statuses are not included by default for two reasons: Axiomatic and Behaviors are redundant w.r.t. the contents of other filters, while Reachable properties such as the ones produced by Value may generate noise, e.g. dead code detected by Value may show up as "Reachable Invalid" status (i.e. "unreachable")¸ when its actual consolidated status is Valid but dead.

In most circumstances, these properties are not relevant for an analysis of potential alarms. Filtering them by default reduces noise, but it is still possible to select them when necessary.

A more restrictive view can be obtained by selecting the menu "Reset 'Status' filters to show only unproven/invalid". This further eliminates Valid and Valid under hypotheses, leaving only orange/red bullets, which are often the only ones we are interested in.

Note that the Current function filter is independent of these buttons.

(Approximate) mapping from original to CIL source

A much-requested functionality is the mapping from the original source (right panel) to the CIL code, to synchronize the views of both panels.

Before Magnesium, clicking on the original source code did not move the cursor on the CIL code. Now, an approximate mapping allows the user to click on the original source and have the CIL source scrolled to the approximate corresponding location, as in the example below, in which the user clicked on the macro IEEE_1180_1990_ABS.

Reverse mapping

Note that this feature is not 100% reliable because the mapping between both sources is not always invertible. For instance, consider syntactic unrolling of loops, or multiple expansion of macros. In some cases the mapping fails (no location is found) or moves the CIL code to a distant part; e.g. some parts of expressions, when clicked, scroll to the variable declaration instead. Also, static variables and macro definitions are particularly problematic. But overall, this feature is a net benefit, especially when trying to find a specific location in a large function. If you find unintuitive behaviors, do not hesitate to tell us; there may be some patterns which have not yet been considered.

Tip: In some cases, different parts of an expression give different results, so it may be worth trying a few nearby clicks.

Conclusion

Each modification to the Frama-C GUI in itself is a minor, almost imperceptible improvement, but together these features greatly contribute to our comfort and productivity when using the GUI. We hope you'll enjoy using them as well!

Friday, April 1 2016

Frama-C blog loses self-awareness, new authors step in

The Frama-C blog is back, with an extended set of writers and a different focus: small pieces of (informal) documentation and useful tips for Frama-C users.

During its self-awareness period, the Frama-C blog realized that silence is a valid option, sometimes better than the alternatives.

Still, we thought better to remove its self-awareness and regain control of the blog, to post useful information for Frama-C users, ranging from beginners to advanced plug-in developers.

The name of the blog is still relevant: we will keep talking about Frama-C news and ideas, but with a slightly different focus, dedicating some posts to usage tips, new features, and general information that we judge useful for the Frama-C community.

We hope to keep the blog useful and informative. Feel free to post your comments and questions, either here or on the frama-c-discuss list.

Tuesday, May 20 2014

Frama-C blog becomes self-aware, author unnecessary

A reader's challenge

A couple of days ago, faithful reader David Gil sent in a challenge:

The reference code for Keccak/SHA-3 has a correctness bug in the Optimized64 implementation. Can the value analysis plugin find it?

My patch fixing that bug was accepted; I believe that the trunk is correct as well. (The correctness problem was in the Optimized64 implementation.)

Of course Value Analysis produces (many) warnings on the Keccak code, but I was unable to find settings that provided sufficient precision to recognize this bug as especially serious. (If you would like a hint, correctness depends on the color of your bird.)

The many warnings Value Analysis produced spurred me to substantially rewrite the Keccak Team's implementation. My version, with some ASCL annotations is in a very preliminary state. I am sure there are many bugs... There are some attempts at verification in the directory verification/frama-c

A farewell

It is a great pleasure for me to discover that this blog has reached its independence. It has had more comments than posts for a little while, and now, with readers sending in their own posts, I might as well move to a start-up that, in collaboration with my previous employer CEA LIST, provides products and services based on Frama-C. I may even publish a blurb there from time to time when something newsworthy comes up.

My facetious colleague Virgile Prevosto will hopefully continue to provide insights on the more subtle aspects of ACSL's semantics here. I know I will read them.

Tuesday, March 18 2014

Nginx buffer overflow

A buffer overflow has been discovered in recent versions of the HTTP server Nginx.

Hacker News user jmnicolas pondered out loud: “I wonder if this discovery is a result of OpenBSD switching its focus from Apache to Nginx?”

It took me one minute to understand what ey meant. I was actually wondering why OpenBSD would be putting buffer overflows in Nginx, now that they are done putting buffer overflows in Apache. I mean, surely there is room for one more buffer overflow in Apache?


The analysis of what this means about this industry and/or me is left as an exercise to the reader.

Sunday, February 23 2014

An interesting SSL implementation bug: CVE-2013-5914

SSL in the news

SSL is a protocol for point-to-point confidential and authenticated communication over an insecure medium. It is the protocol behind HTTPS, among many other uses. In an Internet-connected system, the SSL implementation stands at the frontier between the system and the hostile outside world. For this reason, SSL implementation flaws are a prime target for attacks.

An ordinary bug

A rather banal SSL implementation bug was revealed over the weekend. A duplicated line in a crucial, if ordinary-looking, sequence of validation operations means that some of the validation steps are not taken:

    ...
    if ((err = ReadyHash(&SSLHashSHA1, &hashCtx)) != 0)
        goto fail;
    if ((err = SSLHashSHA1.update(&hashCtx, &clientRandom)) != 0)
        goto fail;
    if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) != 0)
        goto fail;
    if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
        goto fail;
        goto fail; // <-------------------------------------------------
    if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0)
        goto fail;
    ...
fail:
    SSLFreeBuffer(&signedHashes);
    SSLFreeBuffer(&hashCtx);
    return err;
}

Note that the second consecutive goto fail is executed with variable err containing the result of the last validation operation, 0, indicating success. This value is returned as-is by the function; the caller is therefore mislead into believing that the validation succeeded, despite some of the validation steps not having been executed.

Because C lacks an exception mechanism, the above is an often-seen programming pattern. The programming style here can hardly be blamed: this is how the best C programs are written. Except of course for the extraneous goto.

The SSL implementation, and thus the bug in question, are found in tens of millions of iOS devices, with a few additional Mac OS X computers on top of that. The scope of the security problem caused by this bug, and the obviousness of the issue when pointed out, have together lead to much commentary on Twitter and other popular discussion forums.

So reaction. Very noise

“I would never have had this problem because I know that goto is bad”, some commenters claim. I wish I was caricaturing, but I am unfortunately only paraphrasing and combining several public reactions into one.


“I would never have had this problem because I use braces”, essentially state others. Certainly, the root cause of the problem must have been that the developer who introduced the bug was confused about the meaning of if (c) stmt1; stmt2;. Just look how ey indented it!

These two often-heard remarks strongly suggest to use brace-less control flow or the presence of goto as predictors of defects in C code. I am sure this will be a fruitful research direction. Someone from the software engineering academic community should look into it.


“Just adding a single line of code can bring a system to its knees”, reminds Arie van Deursen. True, true, an important lesson there. We should all insert the following line somewhere in our respective codebases from time to time and take a good look at the effect it has, in remembrance.

(


It is Ariane 5 all over again! Worse, instead of just the makers of formal verification systems, everyone seems to have a scheme to prevent the problem they already know is there.

An interesting bug in a different SSL implementation

The problem in most of the armchair analysis that has been going on on the Internet lies in the two following questions: how many more bugs in security-critical C code does the proposed remedy (be it more braces, fewer gotos, …) find? How many time-costly, tedious, soul-crushingly boring false positives does it uncover?

Since commenters have been generalizing on the basis of a population of one sample, I feel no shame in presenting my own single example, raising the total number of scrutinized bugs to two. Note that, for us to make statistically sound arguments, we will eventually need to extend the discussion to examples of correct code that we do not wish to change.

Until then, here is a funny SSL implementation bug. It is characterized as follows, in a version of PolarSSL 1.1.8 that colleagues and I have been modifying.

Screenshot 1: an alarm in our tis_recv() function?

CVE-2013-5914-1.png

PolarSSL expects the program in which it is incorporated to provide it with a function that receives data from the outside world and writes it to a memory buffer. We have made such a function, baptized it tis_recv, and we have set up PolarSSL to use it.

The function tis_recv takes three arguments. The first one is a context argument in case someone needs one (our function ignores this argument). Second is a pointer to the buffer where the data is to be written, then third is the maximum size the function is allowed to write there.

We have specified our function tis_recv thus:

/*@ 
  requires recv_r1: \valid(output+(0..output_len-1)) ;
  requires recv_r2: output_len > 0 ; 
*/
int tis_recv(void* p, unsigned char * output, size_t output_len);

The screenshot indicates on the bottom right that the pre-condition recv_r1, which states that the argument output points to a buffer large enough for output_len characters, is not verified. How could this be? Surely a false positive… Or someone is calling the function with the wrong arguments. It does not look like the problem is in our function.

The GUI informs us that the function tis_recv is called in one place only, and that place is inside ssl_fetch_input(). It is called through a function pointer stored inside a member of a struct accessed through a pointer. The GUI tells us that we can mentally substitute ssl->f_recv(..) with tis_recv(...).

Screenshot 2: a wrong third argument

The GUI tells us that the buffer that PolarSSL passes to tis_recv() has size 16KiB-ish (not pictured), and that the variable len passed as third argument appears to take any of the values of its size_t type (pictured in the bottom panel below).

CVE-2013-5914-2.png

Screenshot 3: jumping back to the origin of the value

We inquire where the value of variable len was last set, and the GUI tells us it is at the yellow line just above the function call (pictured, in the middle panel). Well, hum, yes, we could have noticed that, but it was faster to click.

CVE-2013-5914-3.png

Screenshot 4: argument nb_want is too large

The value of len is computed from ssl_fetch_input()'s argument nb_want, and the value of nb_want appears to be too large, 65540, for the size of the buffer that the GUI tells us we have (in the screenshot below, the values computed as possible for nb_want are displayed in the bottom panel).

CVE-2013-5914-4.png

Screenshot 5: dangerous values come from caller ssl_read_record()

A new possibility offered by the Frama-C version I am using that may not even(*) be available in the latest release Fluorine is to observe, in the bottom panel, which call-sites and originating call-stacks cause which values to occur in a variable. Here, the GUI shows that nb_want appears to be 65540 when called from function ssl_read_record() at line 1178 in file ssl_tls.c of PolarSSL. This means we can continue the investigation there. In contrast, the value of nb_want can only be 5 when ssl_fetch_input() is called from ssl_parse_client_key_exchange(), so there is no need to look at that function: it is definitely not part of this problem.

(*) I don't remember. It has been a long time, has it not?

CVE-2013-5914-5.png

Screenshot 6: the seemingly too large argument is ssl->in_msglen

CVE-2013-5914-6.png

The problem is that ssl->in_msglen is too large here. But where does it come from?

Screenshot 7:

ssl->in_msglen has been computed from two bytes sent by the network (bad, bad network). But the value of ssl->in_msglen is supposed to have been validated before being used. This is what the lines between the obtention of the value and its use are supposed to do.

CVE-2013-5914-7.png

Screenshot 8:

CVE-2013-5914-8.png

The value of ssl->in_msglen is validated when ssl->minor_ver is 0, and it is validated when ssl->minor_ver is 1. But according to the GUI, ssl->minor_ver can be any of 0, 1 or 2.

Explanation

At this point it is only a matter of confirming that the call to ssl_read_record() can indeed be reached with ssl->minor_ver set to 2. This is where one switches to existential mode, possibly crafting a malicious message, or simply building a convincing argument that values can converge here to cause bad things and send it to the official maintainer .

When I said that this was a modified PolarSSL 1.1.8 we were analyzing, I cheated a little. This is a 1.1.8 version in which I have re-introduced a bug that was there from 1.0.0 to 1.1.7. The principal maintainer of PolarSSL suggests to fix the bug by replacing == SSL_MINOR_VERSION_1 by >= SSL_MINOR_VERSION_1.

Conclusion

We have seen a second example of a real bug that can occur in an SSL implementation. Understandably in the current context, there has been much speculation over the last 48 hours on the innocence of the bug in Apple's implementation. Might it be a voluntarily inserted backdoor? Is the NSA behind this bug? (I link here to John Gruber's post because he writes clearly, but the same question is raised all over the Internet).

Allow me to put it this way: if the Apple SSL bug is a trick from the NSA, then you US citizens are lucky. Our spy agency in Europe is so much better that it does not even have a name you have heard before, and it is able to plant bugs where the buffer overflow leading to arbitrary code execution is three function calls away from the actual bug. The bug from our spy agency is so deniable that the code actually used to be fine when there were only two minor revisions of the SSL protocol. The backdoors from your spy agency are so lame that the Internet has opinions about them.


Real bugs come in all sizes and shapes. That one mistake with security implication also causes easily detectable unreachable code or wrongly-indented lines does not mean that all security flaws will be detected so easily, and that plenty of quirky but functionally correct code will not be wrongly flagged.

Speaking of varied bugs, “what about PolarSSL version 1.1.8 without the manually re-inserted bug from CVE-2013-5914?”, I hear you ask. This will be the subject of another blog post.

Acknowledgments

Philippe Herrmann was first to use Frama-C to find a security bug in PolarSSL (Philippe's bug was a denial of service in versions up to 1.1.1, fixed in 1.1.2). Anne Pacalet and Paul Bakker have helped me with various aspects of PolarSSL's verification, including but not limited to the bug described in this post. Twitter users aloria, thegrugq, and johnregehr provided comments on drafts of this post. And finally, the Internet made this post possible by being itself.

Tuesday, February 4 2014

Assertions

Jesse Ruderman on assertions and fuzzing

Jesse Ruderman has published a blog post on assertions and how they complement fuzzing. Key quote: “Fuzzers make things go wrong. Assertions make sure we find out.”

Readers of this blog are accustomed to me talking about differential testing, where a reference result (say, obtained by compiling a random C program with a quality compiler) is used to detect bugs in the target program (say, a static analysis framework for C programs). Differential testing imposes constraints: the random input must have a definite meaning, and a reference implementation needs to be available to compute this meaning. Often one finds bugs in the reference implementation together with bugs in the target program.

Besides, there are deeply burrowed bugs that are difficult to reveal with such a black-box approach. Assertions simplify the problem: if the code being tested contains enough well-chosen assertions, bugs can be caught without a reference implementation.

Sometimes, an assertion is a reference implementation: some of the assertions in Frama-C are alternative computations of the same intermediate result, followed by a comparison of the normally computed result with the alternately computed result. These assertions initially caught bugs in either computation. Since then, they have caught many more bugs in hash-consing, where the two results are structurally identical but are not shared.

Even when an assertion happens to contain a reference implementation for an intermediate result, it saves a lot of time to write the assertion as opposed to producing a complete reference implementation for the whole problem (say, interpreting C programs). The alternative implementation in the assertion does not have to be efficient: in Frama-C's value analysis, it is considered acceptable to spend 20% of the analysis time executing inefficient reference implementations in assertions.

John Regehr on writing and using assertions

John Regehr just published a blog post too on the subject of assertions. What a coincidence! Key quote: “we have tools that are specifically designed to help us reason about assertions; my favorite one for C code is Frama-C”.

In most of his post, John describes executable assertions written in the same language as the program being debugged. In the section quoted above, he moves to specific annotation languages to write assertions in. The advantages of using the same language as the programming language for assertions require no elaboration: it's the same language! The programmer already knows it, and the reviewer of the code already knows it.

But there is a spectrum of annotation languages for assertions. Eiffel stands its ground somewhere on this spectrum, very close to the underlying programming language but with enough good intentions to be noted. I think that the JML annotation language for Java was initially mostly intended for run-time checking, but ended up being very popular too as the annotation language used by static analyzers (I would be happy to be corrected if I am getting this wrong). Nearby JML lies E-ACSL, an executable subset of ACSL and also a Frama-C plug-in to convert a C program annotated with /*@ ... */ assertions into an instrumented C program that detects at run-time violated assertions. SPARK 2014 aims at making both camps happy.


I should point out for the sake of scientific integrity that I am being slightly cheeky in the choice of the key quote to represent John's post. I recommend you read the whole thing, of which the Frama-C endorsement is only a tiny fraction.

Taking advantage of C assertions with Frama-C

One can also use Frama-C in conjunction with existing executable assertions, typically implemented with the standard function assert() in header assert.h. The function assert() takes a C expression representing a property that should hold.

One way to take advantage of assertions is to fail to establish that they always hold, and warn the user that perhaps they don't. It is easy for anyone who wants this behavior to obtain it. One simply needs to specify assert() thus:

/*@ requires x != 0 ; */
void assert(int x);

With this specification, any existing call to the C function assert(), intended for the programmer to be executed at run-time, is required to have an argument that demonstrably corresponds to a non-null expression. This specification creates a link of sorts between static verification and run-time checking (or expressions intended to be checked at run-time, anyway).

Here is an example:

...
/*@ requires x >= 0 ;*/
double sqrt(double x);

int get_int(void);

double dist(double x, double y)
{
  double s = x * x + y * y;
  assert(s >= 0.0);
  return sqrt(s);
}

int main(int c, char **v){
  dist(get_int(), get_int());
}

When this program is analyzed, the value analysis detects that the assertion s >= 0.0 helpfully inserted by the programmer as an argument to assert() may not hold. In this particular example, the warning is a false positive, but never mind that for the moment.

$ frama-c -val t.c
...
t.c:4:[value] Function assert: precondition got status unknown.
...
t.c:1:[value] Function sqrt: precondition got status unknown.

Even more irritating in the example above is that after producing a false positive for assert(), the analyzer produces a false positive for the pre-condition of sqrt(). This brings us to another way a static checker could use C assertions to its advantage: it could take them as hints, properties that can be assumed to hold in order to avoid warning for problems that would seem to arise later if they did not.

With the user-specified function assert() above, the analyzer computes the truth value of the argument s >= 0.0. Because the set of values computed for s is approximated, the set of values for the expression s >= 0.0 is {0; 1}. The analyzer can tell that the pre-condition for assert() may not hold, but the relationship with the possible values of s is too indirect for it to decide that the forbidden situation corresponds to s negative and that only s positive is allowed.

There exists a better modelization of assert(). It was implemented as a value analysis built-in in order to offer the very functionality we are describing here:

$ frama-c -val -val-builtin assert:Frama_C_assert t.c
...
t.c:11:[value] warning: Frama_C_assert: unknown
...
t.c:1:[value] Function sqrt: precondition got status valid.
...
[value] Values at end of function dist:
  s ∈ [-0. .. 9.22337203685e+18]

On the same example, the builtin version of assert detects that it may be passed a null expression and warns about that. These is no improvement there (annotating the example to convince the analyzer that s is always positive is left as an exercise to the reader). Subsequently, and this is an improvement with respect to the previous analysis, the builtin does its best to incorporate the information provided in the assertion, so that it can tell that s is henceforth positive and that the pre-condition of sqrt() is satisfied.

Friday, January 24 2014

Bear-joke security is dead

Likely, you have heard this one before:

Two campers are surprised by an angry bear. One of them starts putting on eir running shoes. Surprised, the other exclaims “What are you doing, Alex? You can't outrun a bear!”

To which Alex replies: “I don't have to outrun the bear. I only have to outrun you.”

This joke used to be popular with security experts, who employed it as a metaphor. I only ever heard it in that context, the first time as a student in the late nineties. The bear joke was still used for this purpose in 2008.


You may also have heard that there is a new bear in town, and the new bear can take both Alex and eir friend, with or without running shoes. Also, the new bear could literally(*) eat a horse. And the bear is part of an organized network of hungry bears with walkie-talkies and sniper guns.

If your approach to security was based on bear jokes with dubious morals, now must really suck. Andy Green's blog post “Cryptography may not be dead, but it is on life support” is representative of the change. One of the quote he takes from Schneier's talk is:

Most of how the NSA deals with cryptography is by getting around it … They exploit bad implementations—we have lots of those.

Yes, we do, but we don't have to use them.


Here is to 2014 being the year of reliable cryptography implementations that cannot be circumvented through defects.


(*) “literally” within the confines of this metaphor

Friday, January 17 2014

Post-conditions and names of arguments

In an ACSL post-condition, any reference to the name of one of the function's arguments is assumed to refer to the initial value of the argument.

/* ensures arg == 1; */
void f(int arg)
{
  arg = 1;
}

For instance, in function f above, Frama-C's value analysis plug-in will typically say that the status of the post-condition is unknown, because arg is taken to mean \old(arg), the value passed to f and thus untouched by the assignment. The rationale is that arg has already ceased to live when the post-condition is evaluated: the program could not observe or otherwise depend on the value of arg after the call to f() anyway. On the other hand, it is convenient to be able to relate results and arguments of the function, and this can be done with the simple syntax “arg”. For a global variable G, one would have to write “\old(G)” to refer in f's post-condition to the value of G just before the call to f. The syntax “G” in a post-condition would refer to the value of G at the end of the function.


But do not worry, if you forget the above subtlety, you can always spend twenty minutes adding debug messages to the value analysis plug-in until you finally remember that said subtlety is what is actually implemented.

Saturday, December 21 2013

Improving the world for 2014, one programmer at a time

I move that all computing devices be equipped with a small explosive charge, set to explode when someone refers someone else to Goldberg's What Every Computer Scientist Should Know About Floating-Point Arithmetic without the referrer emself having read it.

It would not have to be lethal. I could be content if the apparatus maimed the user into harmlessness.


For what it is worth, the advice I wish I had received instead of being suggested Goldberg's essay the first time is:

  1. Print numbers in hexadecimal format as often as you can (%a in C99). Input them that way when it makes sense.
  2. Don't worry, floating-point numbers are just bits. Stop listening to anyone who tries to confuse you with generic reasoning about an arbitrary base β.


YMMV

Sunday, November 24 2013

C-Reduce and Frama-C

Automatic testcase reduction for compilers: the story so far

A previous post linked to a discussion of manual testcase reduction when the program being debugged is a compiler (and the input demonstrating the faulty behavior is thus a C program). Since then, quite a few related events took place:

In a comment to the linked post, someone mentioned automatic testcase reduction with “delta debugging”, and Bruce Dawson, the author of the original post, replied:

Given that the compiler sometimes changed what register it was using I’m not sure that the effort of writing an accurate test for this bug would be worthwhile. It really didn’t take me much time to reduce this bug — half an hour perhaps? Once you don’t need to link the code you can slash-and-burn at a fearsome rate, and my brain is more flexible at recognizing legitimate variations of the bug than any script would be.

And I further commented that on the other hand, a generic “accurate [automatic] test” only needs to be written once, and can serve multiple times. This post intends to show how right Bruce is: subtle pitfalls lurk when writing the acceptance test for automatically reducing a C program. Some of the dangers are theoretical enough when the program being debugged is a compiler, but become more annoying in practice when it is, say, a static analysis framework.

Meanwhile, John Regehr wrote a blog post of his own about the automatic reduction of testcases for compilers. The remarks in his post are quite orthogonal to the remarks in this one. My remarks are in the context of reducing testcases for Frama-C, which is a bit of a misappropriation, and reveals specific issues in the same way that using Csmith to find bugs in Frama-C provided some new insights.

The general idea explained on an example

Consider the example of a file from the Linux kernel that Frama-C's front-end chokes on. The original file causing the problem is 1.5MiB large after pre-processing and we would like automatic reduction to produce a smaller file that exhibits the same problem, so that we can show it to our Frama-C front-end expert.

The symptom of the problem is the error message below:

include/linux/delay.h:39:[kernel] failure: invalid implicit conversion from void to int

We can build a script that tests for the presence of this line in Frama-C's output. This script will then tell the reducer whether the latter is still on the right path or accidentally erased the interesting behavior.

A first obvious remark is that the test should not check for the presence of “include/linux/delay.h:39:[kernel] failure: invalid implicit conversion from void to int” exactly. If it does, then a reduced program can only pass the test by making it look as if the problem is at that file and line, which will prevent removal of #line file directives in the testcase, and also hinder the removal of irrelevant lines that only seem useful because they happen to make the error message come from the expected line.

Instead, the test should be only the presence of “failure: invalid implicit conversion from void to int” in the output of Frama-C. We think we will be happy with any testcase that produces this output, so we should configure the reducer with a test that detects this output and this output only. This idea will be a recurring theme in this post.

C-Reduce is the best reducer of C programs out there. C-Reduce takes as input a C program to reduce and a script implementing the interestingness test. A pre-condition is that the initial program has to be interesting. In exchange, C-Reduce produces the smallest interesting program that it can, removing and moving around bits of the original program and checking that the result remains interesting. When applied with the aforementioned 1.5MiB file from the Linux kernel and with the interestingness test frama-c t.i | grep "failure: invalid implicit conversion…", C-Reduce produces the minimal file below:

fn1 ()
{
    0 ? 0 * 0 ? : 0 : 0;
}

This superficially looks like something that we could show to our Frama-C front-end expert, although we will generate a better reduced program later in the post. If we make the mistake of grepping for "delay.h:39:[[]kernel[]] failure: invalid implicit conversion…" instead, C-Reduce still does a good job, preserving only a single additional line that happens to cause the warning to be localized where the interestingness test expects it:

#37 "include/linux/delay.h"
fn1 ()
{
    0 ? 0 * 0 ? : 0 : 0;
}

C-Reduce's only quirk may be that it sometimes reduces too much. In both reduced programs above, the function fn1() is implicitly typed as returning int according to now obsolescent C90 rules. In addition, the syntax e1 ? : e2 is not standard C but a GCC extension that happens to be accepted by the Frama-C front-end.

We have been lucky

We have in fact been lucky in the above reduction: the reduced program looks like something that should be accepted by Frama-C (despite the int ellipsis and the strange ? : syntax), and instead of accepting it, Frama-C emits an error message. This is exactly what we were hoping to obtain from the reduction process.

Consider the eventuality of the reduced program being an invalid program, incorrectly converting a void expression to int. Frama-C would be behaving correctly by emitting its warning on the reduced program. C-Reduce would have behaved according to specification, but we would be left without anything to show our front-end expert. We would have been had.

This is not a problem in C-Reduce, but a problem with us not using it properly. The reason the original 1.5MiB program is interesting is that it is accepted by GCC and rejected by Frama-C. This is the criterion the interestingness test should implement. When the interestingness test passed to C-Reduce implements an approximation of the notion of interestingness we really desire, we have only ourselves to blame if C-Reduce turns up an invalid reduced file that Frama-C correctly rejects.

Stop reducing so much, C-Reduce!

We do not want the Frama-C front-end expert we report the bug to to be sidetracked by considerations on obsolete syntax or strange extensions. The best testcase to show em is a short but perfectly standard C program. We can make this preference part of our interestingness test.

The first idea may be to reject any file that is not strictly C99-compliant. That would be a file that gcc -std=c99 -pedantic accepts to compile, I think. However, the unreduced testcase here is a file from the Linux kernel. The unreduced testcase does not pass this test.

Delta debugging is for preserving interestingness along size-reduction steps, not for making interestingness appear. Therefore, we cannot demand a reduced file that passes gcc -std=c99 -pedantic if the original file to start from does not pass it either.

Note: techniques borrowed from C-Reduce and from genetic programming might make interestingness appear out of uninteresting C files. Structural coverage of the program under test, beyond the ability to crash it, may make a useful, non-boolean fitness function. Or better, for a chosen assertion inside the program under test, coverage of the statements that lead to the assertion. This process would be different from delta debugging but might generate files that reveal bugs in static analysis frameworks. I hope someone will look into it someday.

However, simply compiling the tentative reduced file we already have with well-chosen compiler options reveals that GCC can already tell this file is not perfect. It emits the warnings snippets forbids omitting the middle term (regarding the syntax extension for ? :) and warning: return type defaults to (regarding implicit return types of functions). Our original file, although it originates from the Linux kernel, is not so ugly that it causes these warnings. Consequently, we can test for these warnings in an interestingness test that accepts the original file and will guide C-Reduce towards a clean reduced file.

If we do this, we obtain:

/*  */ void
fn1 ()
{
    0 ? 0 * 0 ? 0 : 0 : 0;
}

The Frama-C issue at hand was reported using the reduced C program as evidence, and was promptly fixed by Virgile Prevosto, Frama-C front-end expert extraordinaire.

C-Reduce is now parallel

The latest version of C-Reduce can explore several possible reductions in parallel. This is particularly interesting when the program under test takes a long time. A bug deep into the value analysis plug-in may take minutes to occur in the original program, and C-Reduce typically launches thousands of instances in the process of reducing a large program to a small one.

However, it does not pay to run too many instances of memory-bandwidth-limited programs in parallel. The computer I ran this example on has 4 cores and two execution threads by core (it relies on hyper-threading). The system thus sees 8 processors. By default, C-Reduce launches up to 8 instances in parallel. Is this the best choice?

If I use option C-Reduce's option -n to choose different levels of parallelism instead, I obtain, for this particular memory-intensive(*) program being debugged (Frama-C) and this particular initial C program, the following timings:

# instances    Time (s)
 1               620
 2               375
 3               346
 4               367
 5               396
 8               492

For this particular use, C-Reduce's default of using all available execution threads is faster than using only one execution thread. It is also slower than all other parallelization choices. The fastest reduction is obtained for 3 parallel instances, and it is almost twice faster than the non-parallel version.

There is some randomness to the reduction algorithm, however, and the experiment should be repeated with varying initial programs before conclusions are drawn.

(*) Frama-C can sometimes use as much as half the memory a modern web navigator typically uses. Imagine!

Making each instance terminate earlier

Even with an optimal choice of the degree of parallelism, automatic reduction can be time-consuming. A sub-optimal interestingness test for a bug that takes several minutes to appear in Frama-C can make the entire reduction take days, or exhaust the operator's patience altogether. This is only machine time we are talking about, but sometimes a human is waiting on the result of the reduction. In this section, we point out two useful tricks to bear in mind, especially when the program under test can take a long time for some inputs. The fact that the program under test takes a long time can be the problem being investigated, but does not have to.

First trick: using grep option -l

Often, as soon as a particular diagnostic message has been emitted, it is possible to conclude that the testcase is interesting. In the example above, as soon as the message “failure: invalid implicit conversion…” has been seen in Frama-C's output for a C program that GCC otherwise accepts, the input C program is known to be interesting.

By default, a shell script such as frama-c … test.c | grep "interesting message" allows Frama-C to finish its analysis before returning a boolean value. A simple improvement is to use grep's -l option, that causes grep to exit just after the matching line has been seen.

Contrast these two Unix sessions:

$ cat | grep interesting ; echo $?
nothing to see here
still nothing
this is interesting
this is interesting     ← this matching line is repeated by grep
one more line
two more lines
three more lines
cat is taking a long time
...
^D
0   ← error code of grep, printed by echo, indicating an interesting line was seen

In the above session, the grep command was still scanning its input after an interesting line had been seen, until I indicated that the “analysis” was finished with control-D. Below, I repeat the experiment, this time using grep -l to detect the keyword “interesting”.

$ cat | grep -l interesting ; echo $?
nothing to see here
still nothing
this is interesting
(standard input)     ← grep indicates a matching line has been seen on stdin
one more line
0   ← error code of grep, printed by echo, indicating an interesting line was seen

Thanks to the -l option, the cat command was terminated abruptly before it had finished, when emitting the line immediately after the interesting line (cat was terminated for trying to write to a closed file descriptor). Actually, this is a slight chink in the armor: the program whose output is piped into grep needs to write one more line after the interesting line in order to be terminated. If the interesting behavior is Frama-C taking a long time for some inputs, and the interesting log line is one that says “warning: things are going to take an infinite time from now on”, and nothing is ever printed after that line, Frama-C will not be terminated. My Unix-fu is too weak for me to offer a solution where the program under test is terminated as soon as possible, but the next subsection offers a palliative anyway.

Second trick: using a timeout

The initial input program may fail (i.e. “be interesting”) within, say, 5 minutes. This does not prevent slight variations of the program to take forever to fail, or to take forever to succeed. In each case, we do not want the reduction process to waste days of computations testing a single variation. We want the reduction process instead to try as many variations as possible quickly in order to converge to a short one as soon as possible. The strength of delta debugging is in iteration.

A simple idea here is to define the interestingness property as “the C program causes the interesting message to be emitted by Frama-C within 10 minutes”. Any program that has already been analyzed for 11 minutes is uninteresting by this definition. This property can be implemented simply with the Unix command ulimit.

Strictly speaking, the interestingness property should be completely reproducible, whereas this one is not. However, C-Reduce still works very well in practice with such a limit in place in the interestingness test. To limit the risk of breaking internal C-Reduce assertions, the time the initial program takes to be interesting should be multiplied by a safety factor. Because C-Reduce launches several instances in parallel, each instance runs a bit slower than if it was alone, and execution times can vary a bit. I have never had any trouble using as the timeout twice the time the initial program takes to fail.

Automating the manual

Let us take a step back and look at what we have been doing above. We started with one obvious interesting property, and we progressively refined it in order to delineate the really interesting programs. We initially thought we were interested in C programs that cause Frama-C to emit the message “include/linux/delay.h:39:[kernel] failure: invalid implicit conversion from void to int”, but it eventually turned out that we really wanted Frama-C to emit “failure: invalid implicit conversion from void to int” within 10 minutes for a C program that GCC compiles without emitting warnings about ? : syntax extensions or implicit return types.

We have been obtaining the additional sub-properties in a trial-and-error fashion: we did not know that we wanted any of them until C-Reduce produced a program that did not have it. Still, I believe that:

  1. These additional sub-properties would always come from the same repertoire in actual C-Reduce practice. It is possible to build a large catalog of these desirable properties so that the end user does not need to re-discover each of them.
  2. Before the reduction per se even starts, the original program should be tested for all the above niceness properties. All the niceness properties the original program has should be preserved in the reduced program by making them part of the interestingness test. That is, if the original program does not warn about implicit return types, the reduced program should not warn about implicit return types. On the other hand, if the original program does warn about implicit return types, then the bug being investigated may be about implicit return types, so it is not possible to require the reduced program to avoid the warning.
  3. Still, it would be a useful feature if the reducer could detect that a niceness property has appeared during reduction, and strove to preserve it from there on. This was perhaps easy to implement in the early non-parallel version of C-Reduce: it looks like it could be done entirely by side-effects in the interestingness test. Since, C-Reduce has been parallelized, which greatly helps with reduction speed but means that implementing this feature becomes more complicated, if it is at all desirable.

Conclusion

This post collects a number of useful tricks for using C-Reduce, in particular when reducing bugs for Frama-C. C-Reduce is already immensely useful as it is, but this post also indicates some future work directions to make its use more automatic and more convenient yet.

Saturday, October 26 2013

Jakob Engblom on the Toyota Acceleration Case

We interrupt our regularly scheduled program to link to this post by Jakob Engblom. The post was prompted by the jury's conclusion, in a lawsuit over what appears to be an automotive software glitch, that the carmaker is liable.

Monday, October 21 2013

Bruce Dawson on compiler bugs

Bruce Dawson has written a superb blog post on a Visual C++ compiler bug (now fixed), covering every aspect an essay on compiler bugs should cover.

I really like one section, that I am going to quote in full:

Security

In these paranoid days of the NSA subverting every computer system available every bug is a possible security flaw. This bug seems curiously commonplace – why wasn’t it discovered when building Windows or other 64-bit products? The most likely explanation is chance and happenstance, but compiler bugs can be used to make innocuous source code do surprising things. Maybe this bug is used as part of a back door to let code execution go in a direction that the source code says is impossible. This happened accidentally on the Xbox 360 where the low 32 bits of r0 were checked and then all 64 bits were used. This inconsistency allowed a hypervisor privilege escalation that led to arbitrary code execution.

This bug is probably not security related, but security is another reason to ensure that compiler bugs get fixed.


Bruce's post also illustrates the “most compiler bugs aren't” problem and provides a step-by-step tutorial on manual bug reduction. Regarding the latter, my reaction to Bruce's post was to bring up automatic testcase reduction with “delta debugging”, but Gravatar user jrrr beat me to it.

Frama-C contributes to making automatic testcase reduction work in practice for C compiler bugs by providing a reference C implementation that detects the undefined behaviors that could make the (original or reduced) bug reports invalid. We also use delta debugging internally to simplify our own reports of bugs in Frama-C. The next post in this blog will illustrate this use of delta debugging on a concrete example.

Friday, October 18 2013

OCaml's option -compact can optimize for code size and speed

The OCaml compiler can generate native code for various architectures. One of the few code generation options is named -compact:

$ ocamlopt -help
Usage: ocamlopt <options> <files>
Options are:
  ...
  -compact  Optimize code size rather than speed
  ...

One of the effects, perhaps the only effect(?), of this option is on the code generated for allocation. In OCaml, a small block typically gets allocated from the minor heap. When all goes well, allocating in the minor heap means simply decrementing the pointer that separates the already-allocated top of the minor heap from the still-available bottom of the minor heap. Eventually the minor heap becomes full; what happens then is complicated. But the fast path is basically a subtraction and a conditional branch (usually not taken) to the complicated stuff.

With option -compact, an allocation simply calls a routine that does all the above. Without it, the subtraction and conditional jump are inlined at the site where the allocation is requested. OCaml being a typical functional language, there are plenty of these. Inlining a couple of instructions makes the code bigger, but when execution stays on the fast path, which is often, a routine call is avoided completely.

A small example

Consider the one-line OCaml function let f i = Some i. This function takes a value i and returns a Some memory cell that points to i. I intended i to be an integer, but I inadvertently wrote code that works for any type of i. This was possible because in OCaml, all types of values share a uniform representation, as discussed in the introduction of a previous post.

A Some cell occupies 16 bytes (yes, this is huge. Blame 64-bit computing). The fast code generated by ocamlopt -S t.ml, where L102 is the label at which selcom-called code manages the case when the minor heap is full:

	...
	subq	$16, %r15
	movq	_caml_young_limit@GOTPCREL(%rip), %rax
	cmpq	(%rax), %r15
	jb	L102
	...

The -compact option causes more compact code to be emitted for the allocation. When our function f is compiled with ocamlopt -S -compact t.ml, the sequence generated for the allocation is much shorter indeed. There is even a specialized allocation function for allocating 16-byte blocs, so no arguments need to be set up:

	...
	call	_caml_alloc1
	...

Effects of option -compact in the large

When the development version of Frama-C is compiled with option -compact, the binary takes 14.4MB.

Without option -compact, the Frama-C binary weights in at 15.4MB.


One additional megabyte of code is not going to hurt us at this point, right? It is not as if the Frama-C binary had to fit exactly in a box of ten floppy disks. As long as the larger code is faster, it is worth it…

$ time  bin/toplevel.lean_and_mean tests/test/adpcm.c -sparecode > /dev/null

user	0m0.843s

$ time  bin/toplevel.big_and_slow tests/test/adpcm.c -sparecode > /dev/null

user	0m0.856s

… but it turns out that the larger code isn't faster. In the case of Frama-C, and of the computer I am typing on, at least, the code generated with option -compact is smaller and faster.

Conclusion

It is not easy to try to predict performance on modern out-of-order architectures, but if I had to hazard an explanation, I would say that, besides the obvious cache concerns, having many scattered predictable conditional branches may be a losing proposition compared to one single conditional branch visited from many places with call and ret. Every conditional branch that isn't in the branch prediction buffer is at risk of being mispredicted. In addition, each entry taken by the branches that have recently been visited is an entry that isn't available for branches translated from conditionals in the source code.

In contrast, the execution of the call and ret instructions has been optimized over the years. The target of the ret instruction is predicted with a Return Stack Buffer that has a conceptually simpler job than the Branch Predictor (although it is clearly possible to write code for which the ret instructions cause a stall, which could re-establish an advantage for the partially inlined version).


The conclusion is that everyone who uses OCaml should try the native compiler's option -compact, and that perhaps the default should be for this option to be on.

Wednesday, October 9 2013

The overflow when converting from float to integer is undefined behavior

Integer overflows in C

A previous post on this blog was a reminder that in C, signed integer arithmetic overflow is undefined behavior. In contrast, the behavior of overflows in conversions from integer type to signed integer type is implementation-defined. The C99 standard allows for an implementation-defined signal to be raised, but in practice, the widespread compilation platforms provide two's complement behavior. And you can trust that they will continue to do so, because it's implementation-defined. Compiler makers cannot change their mind willy-nilly as if it was undefined behavior:

6.3.1.3 Signed and unsigned integers

1 When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.

2 Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.

3 Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.

Floating-point overflows in C

The C standard does not mandate IEEE 754 floating-point arithmetic. Still, in practice, modern compilation platforms, if they provide floating-point features at all, provide either exactly IEEE 754 binary32 and binary64 formats and computations, or the same formats and a close approximation of the same computations.

IEEE 754 floating-point defines +inf and -inf values, so that any real number can be approximated in the target IEEE 754 format (albeit, when it ends up represented as an infinity, not precisely). This means that for C compilation platforms that implement IEEE 754 for floating-point, the condition “the value can be represented in the new type” is always true. There is no reason to worry of undefined behavior caused by overflow in either floating-point arithmetic or in the conversion of a double to a float.

Or indeed, in a constant. Consider GCC's warning here:

$ cat t.c
#include <stdio.h>

int main()
{
  double big = 0x1.0p5000;
  printf("%f\n", big);
}

$ gcc-172652/bin/gcc -std=c99 -Wall t.c && ./a.out 
t.c: In function ‘main’:
t.c:5:3: warning: floating constant exceeds range of ‘double’ [-Woverflow]
inf

The number 2^5000, represented in C as 0x1.0p5000, is totally in the range of double, which goes up to inf. Clang similarly warns that “magnitude of floating-point constant too large for type double”. A proper warning message would be that 2^5000 cannot be represented precisely, instead of implying that it cannot be represented at all.

Floating-point ↔ integer conversion overflows in C

But enough pedantry contests with compilers. The range of floating-point representations being what it is, we are left with only overflows in conversions from floating-point to integer to consider.


Suspense… (for the reader who did not pay attention to the title)


Overflows in conversions from floating-point to integer are undefined behavior. Clause 6.3.1.4 in the C99 standard make them so:

6.3.1.4 Real floating and integer

1 When a finite value of real floating type is converted to an integer type other than _Bool, the fractional part is discarded (i.e., the value is truncated toward zero). If the value of the integral part cannot be represented by the integer type, the behavior is undefined.

What can happen in practice when a C program invokes this particular flavor of undefined behavior? It is as bad as dereferencing an invalid address, mostly harmless like signed integer arithmetic overflow, or what? Let us find out.

The following program converts the double representation of 2^31, the smallest positive integer that does not fit a 32-bit int, to int.

int printf(const char *, ...);

int main()
{
  int i = 0x1.0p31;
  printf("%d\n", i);
}

Frama-C's value analysis warns about undefined behavior in this program:

$ frama-c -val t.c

warning: overflow in conversion of 0x1.0p31 (2147483648.) 
   from floating-point to integer.
   assert -2147483649 < 0x1.0p31 < 2147483648;

Fine-tuning the assertion -2147483649 < 0x1.0p31 < 2147483648 was a riot, by the way. Do you see why?

My aging (but still valiant) PowerPC-based Mac appears to think that saturation is the way to go: the variable i is set to INT_MAX:

$ gcc -std=c99 t.c && ./a.out 
2147483647

Dillon Pariente was first to draw our attention to overflow in floating-point-to-integer conversions, which caused CPU exceptions on the target CPU for the code he was analyzing. I understood that target CPU to also be a PowerPC, so I suspect the behavior must be configurable on that architecture.

Dillon Pariente's example was along the lines of float f = INT_MAX; int i = f; which is also hilarious if you are into that sort of humor.


In order to really show how weird things can get on Intel processors, I need to modify the test program a bit:

int printf(const char *, ...);

volatile double v = 0;

int main()
{
  int i1 = 0x1.0p31;
  int i2 = 0x1.0p31 + v;
  printf("%d %d\n", i1, i2);
}

The volatile type qualifier precludes optimization, but there is no hardware or thread to change the value of variable v. The two expressions 0x1.0p31 and 0x1.0p31 + v are both expressions of type double that evaluate to 2^31.

Still GCC and Clang, like a single compiler, think that these two expressions needn't result in the same value when converted to int:

$ gcc t.c && ./a.out 
2147483647 -2147483648
$ clang  t.c && ./a.out 
2147483647 -2147483648

The results are different because one conversion was evaluated statically to be placed in %esi (2147483647) whereas the other was evaluated at run-time in %edx with the cvttsd2si instruction:

$ clang -S -O t.c  && cat t.s
...
_main:                                  ## @main
...
	movsd	_v(%rip), %xmm0
	addsd	LCPI0_0(%rip), %xmm0
	cvttsd2si	%xmm0, %edx
	leaq	L_.str(%rip), %rdi
	movl	$2147483647, %esi       ## imm = 0x7FFFFFFF
	xorb	%al, %al
	callq	_printf
...
L_.str:                                 ## @.str
	.asciz	 "%d %d\n"

Only undefined behavior allows GCC and Clang to produce different values for i1 and i2 here: the values of these two variables are computed by applying the same conversion to the same original double number, and should be identical if the program was defined.


Generally speaking, cvttsd2si always produces -0x80000000 in cases of overflow. That is almost like saturation, except that floating-point numbers that are too positive are wrapped to INT_MIN. One may think of it as saturating to either -0x80000000 or 0x80000000, and in the latter case, wrapping around to -0x80000000 because of two's complement. I do not know whether this rationale bears any resemblance to the one Intel's engineers used to justify their choice.


So one might think that this is the end of the story: as long as the conversion is done at run-time on an Intel platform, the compiler uses the cvttsd2si instruction. Overflows, if overflows there are, “saturate to INT_MIN” as the convention is on this platform. This can be confirmed experimentally with the following program variant:

#include <stdio.h>
#include <stdlib.h>

int main(int c, char **v)
{
  int i = 0x1.0p31 + strtod(v[1], 0);
  printf("%d\n", i);
}

This new program takes a number from the command-line and adds it to 2^31, so that there is no opportunity for compile-time evaluation. We expect the conversion to saturate to INT_MIN, and it does:

$ gcc -std=c99 t.c && ./a.out 1234 && ./a.out 12345 && ./a.out 123456
-2147483648
-2147483648
-2147483648


Wait! It gets more amusing still. Let us change the program imperceptibly:

int main(int c, char **v)
{
  unsigned int i = 0x1.0p32 + strtod(v[1], 0);
  printf("%u\n", i);
}

The behavior of run-time overflow in the conversion from double to integer changes completely:

$ gcc -m64 -std=c99 t.c && ./a.out 1234 && ./a.out 123456 && ./a.out 12345678999 
1234
123456
3755744407

But conversion saturates again, at zero this time, for the same program, when targeting IA-32:

$ gcc -m32 -std=c99 t.c && ./a.out 1234 && ./a.out 123456 && ./a.out 12345678999
0
0
0

Do you have an explanation for this one? Leave a message in the comments section below. The fastest author of a complete explanation wins a static analyzer license.

Conclusion

In conclusion, the overflow in the conversion from floating-point to integer is rather on the nasty side of C's undefined behavior spectrum. It may appear to behave consistently if the compilation targets an architecture where the underlying assembly instruction(s) saturate. Saturation is the behavior that compilers GCC and Clang implement when they are able to evaluate the conversion at compile-time. In these conditions, a lucky programmer may not actually observe anything strange.

The idiosyncrasies of other architectures may lead to very different results for overflowing conversions depending on parameters outside the programmer's control (constant propagation, for instance, is more or less efficient depending on the optimization level and may be difficult to predict, as we already complained about when discussing Clang targeting the 387 FPU).


Acknowledgements: In addition to Dillon Pariente, I discussed this topic with Boris Yakobowski, John Regehr, Stephen Canon, and StackOverflow users tenos, Sander De Dycker and Mike Seymour prior to writing this blog post.

Wednesday, September 25 2013

The problem with differential testing is that at least one of the compilers must get it right

A long time ago, John Regehr wrote a blog post about a 3-3 split vote that occurred while he was finding bugs in C compilers through differential testing. John could have included Frama-C's value analysis in his set of C implementations, and then the vote would have been 4-3 for the correct interpretation (Frama-C's value analysis predicts the correct value on the particular C program that was the subject of the post). But self-congratulatory remarks are not the subject of today's post. Non-split votes in differential testing where all compilers get it wrong are.

A simple program to find double-rounding examples

The program below looks for examples of harmful double-rounding in floating-point multiplication. Harmful double-rounding occurs when the result of the multiplication of two double operands differs between the double-precision multiplication (the result is rounded directly to what fits the double format) and the extended-double multiplication (the mathematical result of multiplying two double numbers may not be representable exactly even with extended-double precision, so it is rounded to extended-double, and then rounded again to double, which changes the result).

$ cat dr.c
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <float.h>
#include <limits.h>

int main(){
  printf("%d %a %La\n", FLT_EVAL_METHOD, DBL_MAX, LDBL_MAX);
  while(1){
    double d1 = ((unsigned long)rand()<<32) +
                           ((unsigned long)rand()<<16) + rand() ;
    double d2 = ((unsigned long)rand()<<32) +
                           ((unsigned long)rand()<<16) + rand() ;
    long double ld1 = d1;
    long double ld2 = d2;
    
    if (d1 * d2 != (double)(ld1 * ld2))
      printf("%a*%a=%a but (double)((long double) %a * %a))=%a\n", 
	     d1, d2, d1*d2,
	     d1, d2, (double)(ld1 * ld2));
  }
}

The program is platform-dependent, but if it starts printing something like below, then a long list of double-rounding examples should immediately follow:

0 0x1.fffffffffffffp+1023 0xf.fffffffffffffffp+16380

Results

In my case, what happened was:

$ gcc -v
Using built-in specs.
Target: i686-apple-darwin11
...
gcc version 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.11.00)
$ gcc -std=c99 -O2 -Wall dr.c && ./a.out 
0 0x1.fffffffffffffp+1023 0xf.fffffffffffffffp+16380
^C

I immediately blamed myself for miscalculating the probability of easily finding such examples, getting a conversion wrong, or following while (1) with a semicolon. But it turned out I had not done any of those things. I turned to Clang for a second opinion:

$ clang -v
Apple clang version 4.1 (tags/Apple/clang-421.11.66) (based on LLVM 3.1svn)
Target: x86_64-apple-darwin12.4.0
Thread model: posix
$ clang -std=c99 -O2 -Wall dr.c && ./a.out 
0 0x1.fffffffffffffp+1023 0xf.fffffffffffffffp+16380
^C

Conclusion

It became clear what had happened when looking at the assembly code:

$ clang -std=c99 -O2 -Wall -S dr.c && cat dr.s
...
	mulsd	%xmm4, %xmm5
	ucomisd	%xmm5, %xmm5
	jnp	LBB0_1
...

Clang had compiled the test for deciding whether to call printf() into if (xmm5 != xmm5) for some register xmm5.

$ gcc -std=c99 -O2 -Wall -S dr.c && cat dr.s
...
	mulsd	%xmm1, %xmm2
	ucomisd	%xmm2, %xmm2
	jnp	LBB1_1
...

And GCC had done the same. Although, to be fair, the two compilers appear to be using LLVM as back-end, so this could be the result of a single bug. But this would remove all the salt of the anecdote, so let us hope it isn't.


It is high time that someone used fuzz-testing to debug floating-point arithmetic in compilers. Hopefully one compiler will get it right sometimes and we can work from there.

Monday, September 2 2013

The case for formal verification of existing software

Perry E. Metzger takes a look at formal verification. This is good stuff; there is a lot to agree with here.

However, agreeing with Perry's post alone would not make a very interesting counterpoint. If agreeing was the only thing I intended to do, I might even not have written this post. Instead I intended to add, and I apologize in advance for the predictability of my views, that while creating formally verified software from scratch is useful, verifying existing software is useful too.

Yes, formally verified software written from scratch can now be large enough in scope to be a compiler or a microkernel, but when verifying existing software, we can tackle the problem from the other end: we can pick any useful software component, and verify that. We can pick a software component so useful that it is already used by millions. If we succeed in verifying it, we have put formally verified software in the hands of millions of satisfied users. Transparently.


Take the example of the SSL implementation I am taking a couple of weeks to finish massaging through Frama-C. It is not as wide in scope as Quark, seL4 or CompCert. Neither am I aiming for the same kind of functional correctness as these projects are: I am only verifying the absence of undefined behaviors in the component, and verifying the functional dependencies of the cryptographic primitives(*).

But PolarSSL is useful. Plus, by its very nature, it is exposed to security attacks (SSL is the protocol behind HTTPS). And the former three examples are full-blown research projects, in contrast to my single person.month effort.

The bonus is that the formally verified PolarSSL can be compiled and embedded with all the same compilers and into all the same firmwares as the earlier non-verified version. It is the same as the non-verified version except maybe for a couple of bugfixes, and the confidence that for an identified usage pattern, no more undefined behavior bugs will ever need to be fixed.

All in all, the formal verification of existing code, despite its differences from its “from scratch” counterpart, has too become a force to be reckoned with.


(*) translation: I use Frama-C option -deps and I compare the result to what I expected


Acknowledgement: I got the link to Perry E. Metzger's post through Toby Murray.

Saturday, August 24 2013

Function pointers in C

This post contains a complete list of everything a C program can do with a function pointer, for a rather reasonable definition of “do”. Examples of things not to do with a function pointer are also provided. That list, in contrast, is in no way exhaustive.

What a C program can do with a function pointer

Convert it to a different function pointer type

A function pointer can be converted to a different function pointer type. The C99 standard's clause 6.3.2.3:8 starts:

“A pointer to a function of one type may be converted to a pointer to a function of another type and back again; the result shall compare equal to the original pointer.”

Call the pointed function with the original type

Clause 6.3.2.3:8 continues:

“If a converted pointer is used to call a function whose type is not compatible with the pointed-to type, the behavior is undefined.”

Alright, so the above title is slightly sensationalistic: the pointed function can be called with a compatible type. After typedef int t;, the types t and int are compatible, and so are t (*)(t) and int (*)(int), the types of functions taking a t and returning a t and of functions taking an int and returning an int, respectively.

There is no third thing a C program can do with a function pointer

Seriously. The C99 standard has uintptr_t, a recommended integer type to convert data pointers to, but there is not even an equivalent integer type to store function pointers.

What a C program cannot do with a function pointer

Convert it to an ordinary pointer

Function pointers should not be converted to char * or void *, both of which are intended for pointers to data (“objects” in the vocabulary of the C standard). Historically, there has been plenty of reasons why pointers to functions and pointers to data might not have the same representation. With 64-bit architectures, the same reasons continue to apply nowadays.

Call the pointed function with an incompatible type

Even if you know that type float is 32-bit, the same as int on your platform, the following is undefined:

void f(int x);

int main(){
  void (*p)(float) = f;
  (*p)(3);
}

The line void (*p)(float) = f;, which defines a variable p of type “pointer to function that takes a float”, and initializes it with the conversion of f, is legal as per 6.3.2.3:8. However, the following statement, (*p)(3); is actually equivalent to (*p)((float)3);, because the type of p is used to decide how to convert the argument prior to the call, and it is undefined because p points to a function that requires an int as argument.


Even if you know that the types int and long are both 32-bit and virtually indistinguishable on your platform (you may be using an ILP32 or an IL32P64 platform), the types int and long are not compatible. Josh Haberman has written a nice essay on this precise topic.

Consider the program:

void f(int x);

int main(){
  void (*p)(long) = f;
  (*p)(3);
}

This time the statement is equivalent to (*p)((long)3);, and it is undefined, even if long and int are both 32-bit (substitute long and long long if you have a typical I32LP64 platform).


Lastly, the example that prompted this post was, in a bit of Open-Source code, the creation of a new execution thread. The example can be simplified into:

void apply(void (*f)(void*), void *arg)
{
  f(arg);
}

void fun(int *x){
  // work work
  *x = 1;
}

int data;

int main(){
  apply(fun, &data);
}

The undefined behavior is not visible: it takes place inside function apply(), which is a standard library function (it was pthread_create() in the original example). But it is there: the function apply() expects a pointer to function that takes a void* and applies it as such. The types int * and void * are not compatible, and neither are the types of functions that take these arguments.

Note that gcc -Wall warns about the conversion when passing fun to apply():

t.c:11: warning: passing argument 1 of ‘apply’ from incompatible pointer type

Fixing this warning with a cast to void (*)(void*) is a programmer mistake. The bug indicated by the warning is that there is a risk that fun() will be applied with the wrong type, and this warning is justified here, since fun() will be applied with the wrong type inside function apply(). If we “fix” the program this way:

$ tail -3 t.c
int main(){
  apply((void (*)(void*))fun, &data);
}
$ gcc -std=c99 -Wall t.c
$ 

The explicit cast to (void (*)(void*) silences the compiler, but the bug is still in the same place, in function apply().


Fortunately gcc -std=c99 -Wall is not the only static analyzer we can rely on. Frama-C's value analysis warns where the problem really is, in function apply(), and it warns for both the version with implicit conversion and the version with explicit cast:

$ frama-c -val t.c
…
[value] computing for function apply <- main.
        Called from t.c:14.
t.c:3:[value] warning: Function pointer and pointed function 'fun'  have incompatible types:
        void (void *) vs. void (int *x). assert(function type matches)

The correct way to use function apply() without changing it is to make a function with the correct type for it, and to pass that function to apply():

void stub(void *x){
  fun(x);  
}
…
  apply(stub, &data);

Note that in the above, x is implicitly converted when passed to function fun(), the same way that &data is implicitly converted to void* when passed to apply().

Conclusion

There is almost nothing you can do in C with a function pointer. The feature is still very useful and instills a bit of genericity in an otherwise decidedly low-level language.

Function pointers are not often used in the standard library, considering: qsort() is, with pthread_create(), another of the few functions that requires a function pointer. Like it, it is often misused: it has its own entry in the C FAQ.


Jens Gustedt provided advice in the writing of this post.

Monday, August 5 2013

Exact case management in floating-point library functions

The documentation of the correctly-rounded CRlibm floating-point library states, for the difficult pow() function (p. 159):

Directed rounding requires additional work, in particular in subnormal handling and in exact case management. There are more exact cases in directed rounding modes, therefore the performance should also be inferior.

The phrase “exact case” refers to inputs that need to be treated specially because no number of “Ziv iterations”, at increasing precisions, can ever resolve which way the rounding should go.


Quiz: Isn't an exact case an exact case independently of the rounding mode? How can exact cases vary with the rounding mode?


If you can answer the above quiz without having to rubber duck through the entire question on an internet forum, you have me beat.

Wednesday, July 31 2013

From Pascal strings to Python tuples

Quiz time

What does the program below do?

#include <stdio.h>

int main(){
  struct {
    int t[4];
    int u;
  } v;
  v.u = 3;
  v.t[4] = 4;
  printf("v.u=%d", v.u);
  return 0;
}

Two answers are “it prints v.u=4” and “it prints v.u=3”:

$ gcc t.c && ./a.out 
v.u=4
$ gcc -O2 t.c && ./a.out 
v.u=3

The correct answer is of course that the program invokes undefined behavior. It is not that we are using at any time an lvalue of the wrong type to access memory, breaking the so-called “strict aliasing rules”. It is not that v.t+4 is outside of object v. The problem is that v.t+4 is outside object v.t. So GCC does what it pleases, and when compiling with -O2, optimizes brutally:

$ gcc -S -O2 t.c && cat t.s
.LC0:
	.string	"v.u=%d\n"
…
	movl 	$3, %edx
	movl 	$.LC0, %esi
	movl	 $1, %edi
	xorl	%eax, %eax
	call	__printf_chk

Frama-C's value analysis warns for the above program:

$ frama-c -val t.c
t.c:9:[kernel] warning: accessing out of bounds index {4}. assert 4 < 4;

In general, accessing t[i] when t is an array of size 4 is only valid when i < 4, but here the index is hard-coded as 4, so line 9 is only valid when 4 < 4. That is, never: all executions that reach line 9 encounter undefined behavior there.

Second quiz, same as the first quiz

What does the program below do?

#include "stdlib.h"

typedef struct{
  int tab[1];
} ts;

int main() {
  ts *q = malloc(5*sizeof(int));
  q->tab[2]= 5;
  return 1;
}

If you guessed “invoke undefined behavior”, well done!


The program above was shown to me by facetious colleague Bernard Botella, who is hard at work analyzing Python 2.7.4's runtime in the context of a project named SafePython. The snippet above is his reduced version of a larger piece of C code he found there. The issue Bernard was having started with the type definition below, and I will let you guess the rest:

typedef struct {
   PyObject_VAR_HEAD
   PyObject *ob_item[1];

   /* ob_item contains space for 'ob_size' elements.
    * Items must normally not be NULL, except during construction when
    * the tuple is not yet visible outside the function that builds it.
    */
} PyTupleObject;

In C90, the “array of size 1 as last member of a struct” was a common idiom for implementing things like Pascal strings. And of course it is just as valid for variable-length tuples. The problem is that this is not 1990 any more: compilers now use undefined behavior as an excuse to optimize aggressively, and the idiom is no longer valid at all for either tuples or Pascal strings. On the plus side, in the C99 standard we got “incomplete types”, a safe way to implement tuples and Pascal strings:

typedef struct {
   PyObject_VAR_HEAD
   PyObject *ob_item[];
…

Conclusion

I have encouraged my colleague Bernard to report the above as a bug in Python. This kind of bug report is usually ignored, because it denounces idioms that programmers have used for a long time and that they think they understand. Just remember: if you think you can predict what the program in the second quiz does, you should be able to predict what the program in the first quiz does (or explain what is different about it).

- page 1 of 22