Frama-C news and ideas


Tag - value-builtins


Tuesday, December 13 2016

Frama-C Silicon has been released!

Frama-C 14 (Silicon) has just been released. In this post, we present a few additions that should help everyday usage of EVA.

Value is now EVA

The Value analysis plug-in has been renamed EVA, for Evolved Value Analysis. It has a new architecture that allows plugging in abstract domains, among other features. It is a truly remarkable evolution which this post is too small to contain¹, however, so it will be presented in more detail later.

¹ Facetious reference to Fermat's Last Theorem

Automatic built-in matching for libc functions

One of the new user-friendly features is the -val-builtins-auto option, which avoids having to memorize which built-in to use for each libc function that has one, namely malloc, free, and some floating-point and string-related functions (e.g. pow and strlen).

For instance, consider the following toy program, which simply allocates a buffer, copies a string into it, then allocates a buffer of the right size for the string, and stores it there.

// file.c
#include <stdio.h>
#include <stdlib.h>
#include "string.c" // include Frama-C's implementation of string.h

int main() {
  char *buf = malloc(256); // allocate a large buffer
  if (!buf) exit(1);
  char *msg = "silicon";
  strcpy(buf, msg);
  size_t n = strlen(buf);
  char *str = malloc(n + 1); // allocate a buffer with the exact size (plus '\0')
  if (!str) exit(1);
  strncpy(str, buf, n);
  str[n] = 0;
  size_t n2 = strlen(str);
  //@ assert n == n2;
  free(str);
  free(buf);
  return 0;
}

This program uses dynamic allocation and calls functions from string.h.

The following command-line is enough to obtain an interesting result:

frama-c file.c -val -val-builtins-auto -slevel 7

Without -val-builtins-auto one would need to use this overly long argument:

-val-builtin malloc:Frama_C_alloc_by_stack,free:Frama_C_free,strlen:Frama_C_strlen

For more details about Frama_C_alloc_by_stack, check the EVA manual, section 8.1.

The builtins for free and strlen were automatically chosen by EVA. Note however that strcpy and strncpy do not have builtins. In this case, we include "string.c" (which is actually in share/libc/string.c) to use the implementations available with Frama-C.

Analyzing a program using the implementations in share/libc/string.c is less efficient than using a built-in, but more precise than using only a specification. These implementations are designed to minimize the number of alarms when their inputs are imprecise. Also, because they are not optimized for execution, they are conceptually simpler than the actual libc implementations.
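
For a feel of what these implementations look like, here is a minimal sketch in the spirit of share/libc/string.c; it illustrates the style, and is not the actual contents of the file:

#include <stddef.h>

// A direct, unoptimized loop: the analyzer can unroll it and track
// each copied character individually.
char *strcpy(char *dest, const char *src)
{
  size_t i;
  for (i = 0; src[i] != '\0'; i++)
    dest[i] = src[i];
  dest[i] = '\0'; // do not forget the terminator
  return dest;
}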

When using these functions, -slevel 7 ensures that their loops are fully unrolled in the example. Can you guess why 7 is the right value here?

Inspecting values in the GUI

Another improvement to usability comes in the form of a popup menu in the GUI. To see it, run the following command using the same file as previously:

frama-c-gui file.c -val -val-builtins-auto -slevel 7

In the Frama-C GUI, click on the str expression in the statement str = (char *)malloc(n + (size_t)1); (corresponding to line 11 in the original source code). Then open the Values tab, and you will see something similar to this:

Show pointed values in the GUI

In the Values tab at the bottom, right-clicking on the cell containing the value NULL; &__malloc_main_l11[0] will show a popup menu "Display values for ...". Clicking on it will add a new column displaying its contents.

Before Silicon, this information was already available, but as the result of a long and painful process. The new popup menu shows one entry per pointed variable in the chosen cell, so if there are several different values, there will be several popup menu entries.

malloc may fail

In the previous example, the values of str are those returned by the malloc builtin: NULL and a newly allocated base (__malloc_main_l11). This models the fact that there may not be enough memory, and malloc may fail. The code should definitely handle this case! But for the hurried evaluator, the use of option -no-val-malloc-returns-null can help focus on the other potential run-time errors (before coming back to the malloc-related ones).

Still ways to go

In this example, there are still some warnings, due to the specification of functions strcpy and strncpy, which use logical constructs not yet evaluated by EVA (but very useful for WP). They are not an issue in this example, since we used the actual implementation of these functions, and therefore do not need their specifications, but future work on EVA will help deal with these details and provide a more streamlined experience.

Tuesday, February 4 2014

Assertions

Jesse Ruderman on assertions and fuzzing

Jesse Ruderman has published a blog post on assertions and how they complement fuzzing. Key quote: “Fuzzers make things go wrong. Assertions make sure we find out.”

Readers of this blog are accustomed to me talking about differential testing, where a reference result (say, obtained by compiling a random C program with a quality compiler) is used to detect bugs in the target program (say, a static analysis framework for C programs). Differential testing imposes constraints: the random input must have a definite meaning, and a reference implementation needs to be available to compute this meaning. Often one finds bugs in the reference implementation together with bugs in the target program.

Besides, there are deeply burrowed bugs that are difficult to reveal with such a black-box approach. Assertions simplify the problem: if the code being tested contains enough well-chosen assertions, bugs can be caught without a reference implementation.

Sometimes, an assertion is a reference implementation: some of the assertions in Frama-C are alternative computations of the same intermediate result, followed by a comparison of the normally computed result with the alternately computed result. These assertions initially caught bugs in either computation. Since then, they have caught many more bugs in hash-consing, where the two results are structurally identical but are not shared.
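
As an illustration of the idea, here is a sketch (not actual Frama-C code; sum_fast() and sum_naive() are hypothetical names) of an assertion that doubles as a reference implementation:

#include <assert.h>

static int sum_naive(const int *a, int n)
{
  // slow, obviously-correct recomputation
  int s = 0;
  for (int i = 0; i < n; i++) s += a[i];
  return s;
}

static int sum_fast(const int *a, int n)
{
  // the "normal" computation, reorganized for speed
  int s = 0;
  for (int i = 0; i + 1 < n; i += 2) s += a[i] + a[i + 1];
  if (n % 2) s += a[n - 1];
  return s;
}

int checked_sum(const int *a, int n)
{
  int r = sum_fast(a, n);
  assert(r == sum_naive(a, n)); // the assertion recomputes the result
  return r;
}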

Even when an assertion happens to contain a reference implementation for an intermediate result, it saves a lot of time to write the assertion as opposed to producing a complete reference implementation for the whole problem (say, interpreting C programs). The alternative implementation in the assertion does not have to be efficient: in Frama-C's value analysis, it is considered acceptable to spend 20% of the analysis time executing inefficient reference implementations in assertions.

John Regehr on writing and using assertions

John Regehr just published a blog post too on the subject of assertions. What a coincidence! Key quote: “we have tools that are specifically designed to help us reason about assertions; my favorite one for C code is Frama-C”.

In most of his post, John describes executable assertions written in the same language as the program being debugged. In the section quoted above, he moves to specific annotation languages to write assertions in. The advantages of using the same language as the programming language for assertions require no elaboration: it's the same language! The programmer already knows it, and the reviewer of the code already knows it.

But there is a spectrum of annotation languages for assertions. Eiffel stands its ground somewhere on this spectrum, very close to the underlying programming language but with enough good intentions to be noted. I think that the JML annotation language for Java was initially mostly intended for run-time checking, but ended up being very popular too as the annotation language used by static analyzers (I would be happy to be corrected if I am getting this wrong). Nearby JML lies E-ACSL, an executable subset of ACSL and also a Frama-C plug-in to convert a C program annotated with /*@ ... */ assertions into an instrumented C program that detects at run-time violated assertions. SPARK 2014 aims at making both camps happy.


I should point out for the sake of scientific integrity that I am being slightly cheeky in the choice of the key quote to represent John's post. I recommend you read the whole thing, of which the Frama-C endorsement is only a tiny fraction.

Taking advantage of C assertions with Frama-C

One can also use Frama-C in conjunction with existing executable assertions, typically implemented with the standard function assert() in header assert.h. The function assert() takes a C expression representing a property that should hold.

One way to take advantage of assertions is to fail to establish that they always hold, and warn the user that perhaps they don't. It is easy for anyone who wants this behavior to obtain it. One simply needs to specify assert() thus:

/*@ requires x != 0 ; */
void assert(int x);

With this specification, any existing call to the C function assert(), intended for the programmer to be executed at run-time, is required to have an argument that demonstrably corresponds to a non-null expression. This specification creates a link of sorts between static verification and run-time checking (or expressions intended to be checked at run-time, anyway).

Here is an example:

...
/*@ requires x >= 0 ;*/
double sqrt(double x);

int get_int(void);

double dist(double x, double y)
{
  double s = x * x + y * y;
  assert(s >= 0.0);
  return sqrt(s);
}

int main(int c, char **v){
  dist(get_int(), get_int());
}

When this program is analyzed, the value analysis detects that the assertion s >= 0.0 helpfully inserted by the programmer as an argument to assert() may not hold. In this particular example, the warning is a false positive, but never mind that for the moment.

$ frama-c -val t.c
...
t.c:4:[value] Function assert: precondition got status unknown.
...
t.c:1:[value] Function sqrt: precondition got status unknown.

Even more irritating in the example above is that after producing a false positive for assert(), the analyzer produces a false positive for the pre-condition of sqrt(). This brings us to another way a static checker could use C assertions to its advantage: it could take them as hints, properties that can be assumed to hold in order to avoid warning for problems that would seem to arise later if they did not.

With the user-specified function assert() above, the analyzer computes the truth value of the argument s >= 0.0. Because the set of values computed for s is approximated, the set of values for the expression s >= 0.0 is {0; 1}. The analyzer can tell that the pre-condition for assert() may not hold, but the relationship with the possible values of s is too indirect for it to decide that the forbidden situation corresponds to s negative and that only s positive is allowed.

There exists a better modelization of assert(). It was implemented as a value analysis built-in in order to offer the very functionality we are describing here:

$ frama-c -val -val-builtin assert:Frama_C_assert t.c
...
t.c:11:[value] warning: Frama_C_assert: unknown
...
t.c:1:[value] Function sqrt: precondition got status valid.
...
[value] Values at end of function dist:
  s ∈ [-0. .. 9.22337203685e+18]

On the same example, the builtin version of assert detects that it may be passed a null expression and warns about that. There is no improvement there (annotating the example to convince the analyzer that s is always positive is left as an exercise to the reader). Subsequently, and this is an improvement with respect to the previous analysis, the builtin does its best to incorporate the information provided in the assertion, so that it can tell that s is henceforth positive and that the pre-condition of sqrt() is satisfied.

Thursday, August 23 2012

Technical interlude

The current series of posts is pretty technical, huh? Here is a refreshing diversion in the form of a technical post that also touches on existential questions. What is this blog for? We don't know.

Best solutions, or solutions anyone could have found

If you have been following the current series of posts, you may have noticed that a recurring dilemma is whether the goal is to demonstrate the capabilities of the software tool under consideration (Frama-C's value analysis plug-in) when pushed to its limits by its authors, or to demonstrate what anyone could do without being a creator of the tool, taking advantage of Linux, Mac OS X and sometimes Windows packages for the software, and, most importantly, having only the available documentation to understand what the software makes possible.


The general goal of the blog is to empower users. In fact, much of the aforementioned documentation is available as posts in the blog itself. These posts sometimes describe techniques that were undocumented until then, but describe them so that they wouldn't be undocumented any more. Some of the posts even describe techniques that weren't undocumented at the time of writing. As my colleague Jean-Jacques Lévy would say, “repetition is the essence of teaching”.


However, I do not expect that most software tool competitions are about measuring the availability and learnability of tools. For instance, the SV-COMP competition explicitly forbids anyone other than the authors of the tool to participate with it:

A person (participant) was qualified as competition contributor for a competition candidate if the person was a contributing designer/developer of the submitted competition candidate (witnessed by occurrence of the person’s name on the tool’s project web page, a tool paper, or in the revision logs).

Ostensibly, the tools are the participants in the competition. Involving the authors is only a means to make the competition as fair as possible for the tools. Some colleagues and I have expressed the opinion that, for static analysis, at this time, this was the right way:

In general, the best way to make sure that a benchmark does not misrepresent the strengths and weaknesses of a tool is to include the toolmakers in the process.


Some competitions on the other hand acknowledge that a participant is a person(s)+tool hybrid, and welcome entries from users who did not create the tools they used. This is perfectly fine, especially for the very interactive tools involved in the cited competition, as long as it is understood that it is not a tool alone, but a hybrid participant, that is being evaluated in such a competition. Some of the tools participated in several teams, associated with several users (although the winning teams did tend to be composed of toolmakers using their respective tools).


It is not clear to me where on this axis the RERS competition falls. It is a “free-style” competition. The entries are largely evaluated on results, with much freedom with respect to the way these were obtained. This definitely can also be fun(*).

The competition certainly does not require participating tools to be packaged and documented in such exquisite detail that anyone could have computed the same results. As such, the best description to write is the description of what the authors of the value analysis can do with it with all the shortcuts they trust themselves to take. If some of the questions can be solved efficiently using undocumented features, so be it. It is the free-style style.

But the mission of the blog remains to document, and showing what we have done without explaining how to reproduce it sounds like boasting. I am torn.


(*) An untranslatable French idiom, “bon public”, literally “good audience”, used in an adjective position, conveys a number of subtle, often humorous nuances. I think this applies to me and software competitions. They just all seem fun to me.

A necessary built-in

One value analysis built-in whose usefulness became apparent in the last post, had it only been available, is a built-in that halts the analyzer.

Several analysis log snippets in the previous post show a ^C as their last line, to indicate that the log already contains the answer to the question being investigated, and that the user watching the analysis go by can freely interrupt it then.

I cheated when editing the post, by the way. I am not fast enough to interrupt the analysis just after the first conclusive log message. But I did interrupt these analyses, which otherwise might have gone on for a long time for no purpose.

I thought I would enjoy my colleagues' horrified reaction, so I told everyone about my intention of writing a value analysis built-in, in the vein of those that already exist for memcpy() or memset(), for halting the analyzer. Not halting the currently considered execution, mind you. There is /*@ assert \false; */ for that, and if you need this functionality available as a C function, you can put /*@ assert \false; */ inside your own function, like I did for assert() earlier when prepping the competition programs for Frama-C.
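
Such a wrapper could look like the sketch below (the function name is made up):

// Calling this function on an execution path tells the analysis that
// the path is impossible; the analyzer prunes it and moves on.
void impossible(void)
{
  /*@ assert \false; */
}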


Clearly, such a built-in does not have a contract: its effects are meta-effects that transcend the framework (in what would, for these colleagues, be the bad sense of the word).

A time-saver

Today I set out to write this built-in, because I started to get serious about the competition and it would be convenient to just instrument the program with:

if ( /* current state identical to predecessor state */ )
{
  Frama_C_show_each_cycle_found(1);
  Frama_C_halt_analysis();
}

instead of having an operator, or even a shell script, watch the log for the Frama_C_show_each_cycle_found message and interrupt Frama-C then.


Soon after opening the file and finding my place, I realized that this feature was already available. I am telling you this because this blog is for the benefit of transparency, for documenting things and for empowering users, but mostly because I want to see my colleagues' horrified reaction when they read about this trick. Actually, I would not use it if I were you. It might cease to work in a future version.

if ( /* current state identical to predecessor state */ )
{
  Frama_C_show_each_cycle_found(1);
  Frama_C_cos(0.0, 0.0);
}

You see, Frama-C value analysis built-ins, such as Frama_C_cos(), do not have an intrinsic type. You can include a header that will give them a type if you wish:

double Frama_C_cos(double x);

Otherwise, the old C89 rules for missing function prototypes apply. During parsing, a type is inferred from the types of the arguments at the call site.
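
For illustration, here is roughly the prototype that ends up being inferred for the Frama_C_cos(0.0, 0.0) call above (my reconstruction of the description, not actual front-end output):

int Frama_C_cos(double, double); // return type defaults to int, parameters from the call site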

Only if and when the analysis actually reaches the Frama_C_cos(0.0, 0.0) call will the analyzer realize that it is passed arguments that are incompatible with the OCaml function that implements the built-in. An exception will be raised. Since there is no reason to try to recover from a built-in misuse, this exception is caught only at the highest level. It stops the entire analyzer (it marks the entire analysis results as unreliable, as it should, but of course it does not retroactively unemit the Frama_C_show_each_cycle_found message in the log that says a conclusion has been reached).


The result looks like this, this time without cheating:

[value] Called Frama_C_show_each_cycle_found({1})
...
[value] computing for function any_int <- main.
        Called from Problem6_terminateU_pred.c:9516.
[value] Done for function any_int
[value] Semantic level unrolling superposing up to 1400 states
[value] user error: Invalid argument for Frama_C_cos function
[value] user error: Degeneration occured:
        results are not correct for lines of code
        that can be reached from the degeneration point.
[kernel] Plug-in value aborted because of invalid user input.

This is convenient for timing the analyzer as it tackles some of the harder problems. In the case above, the time was 5m35s answering a liveness question about Problem 6. It would have been a pain to watch the log for that long, fingers poised to interrupt the analyzer.


That's all for this interlude. Expect exciting timings in the next post. There may also be talk of undocumented features that transcend the problem, in the good sense of the word.

Tuesday, July 24 2012

Results are in

A contest and a self-pitying lament

John Regehr was organizing a craziest undefined behavior contest and the results are in. I had an entry in the contest, but I did not win. My entry apparently was too obviously dangerous. As John puts it, “I would have expected a modern C compiler to exploit those undefined behaviors”.

I have explained the problem inherent to constructs of the kind &a - 1 < &a in several different sets of circumstances, to different programmers for different projects, and I was unable to convince any of them that the construct or a variant of it was not harmless and that the code should be changed. But for this contest, the construct was too obviously wrong and too prone to be taken advantage of by an aggressive compiler to win. This is the story of my life, really.
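
For readers who have not met it, here is a minimal reconstruction of the kind of construct in question (my own sketch, not the actual contest entry):

int a;

// Computing &a - 1 is already undefined behavior: pointer arithmetic
// must stay within the object, or one past its end. An aggressive
// compiler may therefore assume this comparison is never evaluated,
// or fold it to whatever is convenient.
int probe(void)
{
  return &a - 1 < &a;
}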

Notable entries

The contest invited C++ entries as well, but fortunately for this blog post, most of the submissions were actually in C.

Runner-up

int main (void) {
  int x = 1;
  while (x) x <<= 1;
  return x;
}

$ bin/toplevel.opt -val t1.c | grep assert
t1.c:3:[kernel] warning: invalid LHS operand for left shift. assert x ≥ 0;

As shown above, Frama-C's value analysis warns for the undefined behavior that made this entry a runner-up in the contest.

Winner #1

enum {N = 32};
int a[N], pfx[N];
void prefix_sum (void)
{
  int i, accum;
  for (i = 0, accum = a[0]; i < N; i++, accum += a[i])
    pfx[i] = accum;
}

$ bin/toplevel.opt -val t2.c -main prefix_sum | grep assert
t2.c:6:[kernel] warning: accessing out of bounds index [1..32]. assert i < N;

In this second example, you need to tell Frama-C to analyze the function prefix_sum. Otherwise, it won't find the undefined behavior: the value analysis is not the sort of heuristic bug-finding tool that detects patterns in code regardless of calling contexts. For precision and soundness, Frama-C's value analysis tries to determine whether prefix_sum is safe in the context(s) in which it is actually called during execution. Without option -main prefix_sum, it looks like it is not called at all, and therefore safe. Indeed, if you try to compile and run the program, the linker complains about a missing main() function, meaning that the code cannot do harm.

Having to use option -main and others to describe your intentions is the sort of quirk you can pick up by going through a tutorial or the documentation.

Winner #2

#include <stdio.h>
#include <stdlib.h>
 
int main() {
  int *p = (int*)malloc(sizeof(int));
  int *q = (int*)realloc(p, sizeof(int));
  *p = 1;
  *q = 2;
  if (p == q)
    printf("%d %d\n", *p, *q);
}

This example involves dynamic allocation, which is one of the C features that Frama-C's value analysis, as a static analyzer, does not claim to handle precisely in all cases.

Here, however, the program is completely deterministic: it does not have inputs of any kind. With an unreleased version of the value analysis, it is possible to interpret such programs for the detection of undefined behaviors. In this interpreter mode, the value analysis handles many more C constructs, including malloc() and realloc() calls.

That version of the value analysis warns about the use of p in the line if (p == q), which is indeed the first use of p after its contents have become indeterminate (in the sense of the C99 standard). The issue here relates to that post on the impossibility of freeing a pointer twice.

My entry

My entry was on the subject of illegal pointer arithmetic, which the value analysis does not warn about because there would be too many programs that it would (correctly) flag. It also involved comparisons of such illegal pointers, which the value analysis does warn about. These comparisons are also commonly found in the wild, but they are very often actually dangerous, and have security implications.

There is nothing much I can say on the subject that was not already in a previous post on pointer comparisons.

Conclusion

You might think I am going to end up with a statement like “Frama-C's value analysis detects* all undefined behaviors because it finds the undefined behaviors in all four examples illustrated in the contest”.


Perish the thought! That would be fallacious. My claim is that Frama-C's value analysis detects* all undefined behaviors that you may care about, and even some you may think you do not need to care about but that might have surprising consequences, as exemplified on some notable entries in John's contest. The reason it finds* all undefined behaviors is not that it does well for a couple of examples but that it was designed to do so.


Incidentally, Frama-C's value analysis was tested on a lot more examples when working on this article on testcase reduction, and results were compared to KCC's. The Csmith random C program generator produces only defined programs, but C-Reduce, one of the reducers described in the article, produces tons of undefined behaviors. Although that was not a goal of the article, the work also served as an application of differential testing to the detectors of undefined behaviors that are KCC and Frama-C's value analysis.


There are problematic features of the C language that were left aside when initially designing the value analysis for its primary target of critical embedded code. This limits some uses, but it remains possible to extract valuable information off a C program, using the value analysis and the right methodology. One workaround is to use the value analysis as a C interpreter, in which case standard library functions are easier to model. Many of them already are available in the unreleased value analysis.


(*) when used properly. See Winner #1 and Winner #2 for examples of what “using properly” may entail.


My thanks to Sven Mattsen for proofreading this post.

Thursday, January 5 2012

free(): revisited already

If Frama-C doesn't work out, we can always make a comedy team

Facetious colleagues ask me how I make Frama-C's value analysis' messages so informative. "Pascal," one of them says, "in this case study, the generated log contains 8GiB of information! It won't open in Emacs...". I helpfully point out that it is probably a 2^32 thing and that he should use a 64-bit Emacs.

Informativeness with uninitialized variables

Consider the program below:

main(){
  int x = 1;
  int y;
  int z = 3;
  int t = 4;
  int u = 5;
  x += y;
  x += z;
  x += t;
  x += u;
  printf("%d\n", x);
}

A less informative analyzer would predict that the printf() call prints any number, or it might call the printed number "Uninitialized". Either way, the user would be dumbfounded. Ey would have to track the origin of this imprecise diagnostic emself. Going backwards from causes to origins, ey would finally end up at the declaration of local variable y, which ey intended to initialize but forgot to.

The value analysis tells you right there at line 7 (x += y;) that the program is doing something wrong, so that the way back up from cause to origin is much shorter. It's more informative to point out that the program is already doing something wrong at line 7 than to silently propagate a value indicating something wrong has happened somewhere.

Informativeness with dangling pointers

My most facetious colleague Anne wonders whether one shouldn't be allowed to increment q in the program from last post:

main(){
  int *p = malloc(12);
  int *q = p;

  Frama_C_dump_each();
  free(p);
  q++;
  ...
}

This program is treated analogously to that program, and by "analogously" I mean that the same functions are called behind the scenes and the same messages are printed:

main(){
  int *p;
  int *q;

  {
    int a[3];
    p = a;
    q = p;
    Frama_C_dump_each();
  }
  q++;
  ...
}

Could we allow q to be incremented?

In the example with free(), perhaps. In the example with a local variable a, the standard clearly calls the contents of q indeterminate, and it's allowable for a static analyzer to emit an alarm when indeterminate memory contents are used for anything except copying.

The standard is surprisingly brief when it comes to defining free().

The free function causes the space pointed to by ptr to be deallocated, that is, made available for further allocation. [...] if the argument does not match a pointer earlier returned by the calloc, malloc, or realloc function, or if the space has been deallocated by a call to free or realloc, the behavior is undefined.

Should we allow q to be incremented?

Well, I don't want to, for the same reason as in the uninitialized variable y example. This would postpone the moment the error is pointed out to the user, and give em extra work to track the programming error (which is to do anything with q after the free() or the end of the block). The informative solution is to warn at q++; and I take informativeness very seriously. (You've been a great audience. I'll be here all week)


I think the standard should say that any pointer previously pointing inside the freed space becomes indeterminate. Anyway, that's how it's treated in Frama-C's value analysis, and if you don't like it, you can use another analyzer, unless you are Anne, in which case you can't and you must use this one.

Double free(), no such thing

I have been able to divert a few hours yesterday and today for programming. It was well worth it, as I have discovered a theorem. It is new to me, and I wonder whether it was ever published. The theorem is, a C program cannot double free() a block even if it tries! I thought this deserved an announcement. Some students I know would be glad to hear that.


Indeed, a C program can at most pass an indeterminate value to a function. This is highly objectionable in itself. Students, don't write programs that do this either!

Allow me to demonstrate on the following program:

...
main(){
  int *p = malloc(12);
  int *q = p;

  free(p);
  free(q);
}

You might think that the program above double-frees the block allocated and referenced by p. But it doesn't: after the first call to free(), the contents of q are indeterminate, so that it's impossible to free the block a second time. I will insert a few calls to primitive function Frama_C_dump_each() to show what happens:

main(){
  int *p = malloc(12);
  int *q = p;

  Frama_C_dump_each();
  free(p);
  Frama_C_dump_each();
  free(q);
  Frama_C_dump_each();
}
$ frama-c -val this_is_not_a_double_free.c

The analysis follows the control flow of the program. First a block is allocated, and a snapshot of the memory state is printed for the first time in the log:

...
[value] computing for function malloc <- main.
        Called from test.c:12.
[value] Recording results for malloc
[value] Done for function malloc
[value] DUMPING STATE of file test.c line 15
        p ∈ {{ &Frama_C_alloc }}
        q ∈ {{ &Frama_C_alloc }}
        =END OF DUMP==
...

Then free() is called and a second snapshot is logged:

...
[value] computing for function free <- main.
        Called from test.c:16.
[value] Recording results for free
[value] Done for function free
[value] DUMPING STATE of file test.c line 17
        p ∈ ESCAPINGADDR
        q ∈ ESCAPINGADDR
        =END OF DUMP==
...

And then, at the critical moment, the program passes an indeterminate value to a function:

...
test.c:18:[kernel] warning: accessing left-value q that contains escaping addresses;
                      assert(Ook)
test.c:18:[kernel] warning: completely undefined value in {{ q -> {0} }} (size:<32>).
test.c:18:[value] Non-termination in evaluation of function call expression argument
        (void *)q
...
[value] Values at end of function main:
          NON TERMINATING FUNCTION

Note how the last call to Frama_C_dump_each() is not even printed. Execution stops at the time of calling the second free(). Generalizing this reasoning to all attempts to call free() twice, we demonstrate that it is impossible to double-free a memory block.

QED


Question: does it show too much that this detection uses the same mechanism as the detection for addresses of local variables escaping their scope? I can change the message if it's too confusing. What would a better formulation be?

As I hope I have made clear, the message will be displayed as soon as the program does anything at all with a pointer that has been freed — and indeed, I have not implemented any special check for double frees. The program below is forbidden just the same, and does not get past the q++;

main(){
  int *p = malloc(12);
  int *q = p;

  Frama_C_dump_each();
  free(p);
  q++;
  ...
}

On the other hand, I still have to detect the case where the program frees a global variable, something the analysis currently allows to great comical effect. (Hey, I have only had a couple of hours!)

Please leave your thoughts in the comments.

Friday, September 23 2011

Safe donut

This post documents the steps I followed in order to finish verifying function compute(), picking up from there.

Previously on this blog

In the last episode, we had found that some sub-cubes in the search space appeared to lead to dangerous value sets for variable N. These sets were:

        N ∈ {10; 11; 12}
        N ∈ {6; 7; 8; 9; 10; 11; 12}
        N ∈ {7; 8; 9; 10; 11; 12}
        N ∈ {7; 8; 9; 10; 11; 12; 13}
        N ∈ {8; 9; 10; 11; 12}
        N ∈ {8; 9; 10; 11; 12; 13}
        N ∈ {9; 10; 11; 12}
        N ∈ {9; 10; 11; 12; 13}

There are only eight dangerous value sets for N, but these correspond to many values of the input variables An, Bn, in, jn in the analysis context we wrote for the occasion. Grepping for each of these eight lines in the original logs allows us to retrieve the corresponding values of An, Bn, in, jn. These are the values for which we have not yet concluded that compute() is safe:

for i in 1 2 3 4
do
    for line in \
	"N ∈ {10; 11; 12}" \
	"N ∈ {6; 7; 8; 9; 10; 11; 12}" \
	"N ∈ {7; 8; 9; 10; 11; 12}" \
	"N ∈ {7; 8; 9; 10; 11; 12; 13}" \
	"N ∈ {8; 9; 10; 11; 12}" \
	"N ∈ {8; 9; 10; 11; 12; 13}" \
	"N ∈ {9; 10; 11; 12}" \
	"N ∈ {9; 10; 11; 12; 13}" 
    do 
        grep "$line" log$i -A4 -B0 | grep -v "N "
        echo --
    done > pb$i
done

The results are well distributed among the four quarters of the search space that I initially chose arbitrarily. That's good news, as it means things can be kept parallel by keeping these files separate. One file looks like:

        An ∈ {-26}
        Bn ∈ {-15}
        in ∈ {-26}
        jn ∈ {18}
--
        An ∈ {-26}
        Bn ∈ {-15}
        in ∈ {-26}
        jn ∈ {19}
--
        An ∈ {-26}
        Bn ∈ {-15}
        in ∈ {-3}
        jn ∈ {6}
--
...

How to re-analyze the problematic subcubes more precisely

Each file pb1, ..., pb4 can be processed a little bit further to look like C code:

for i in 1 2 3 4
do
  sed -e s"/--/f();/" -e s"/∈ {/= /" -e s"/}/;/" < pb$i > pr$i.c
done

The above commands transform the interesting values files into:

        An = -26;
        Bn = -15;
        in = -26;
        jn = 18;
f();
        An = -26;
        Bn = -15;
        in = -26;
        jn = 19;
f();
        An = -26;
        Bn = -15;
        in = -3;
        jn = 6;
f();
...

Each of the four pieces of program weighs in at a bit less than 7 MB of C code. I plan to make good use of this tidbit of information the next time someone asks me what project sizes Frama-C's value analysis can handle.

$ ls -l pr?.c
-rw-r--r-- 1 cuoq cuoq 6788869 2011-09-17 18:52 pr1.c
-rw-r--r-- 1 cuoq cuoq 6551620 2011-09-17 18:52 pr2.c
-rw-r--r-- 1 cuoq cuoq 6655238 2011-09-17 18:52 pr3.c
-rw-r--r-- 1 cuoq cuoq 6486765 2011-09-17 18:52 pr4.c


The files pr1.c, ..., pr4.c are not complete C programs, but they work well with the following prolog (download):

...

int An, Bn, in, jn;

void f(void)
{
  int Ans, Bns, ins, jns;
  float A, B, i, j;
  for (Ans=4*An; Ans<4*(An+1); Ans++)
    {
      A = Frama_C_float_interval(Ans / 32., (Ans + 1) / 32.);
      for (Bns=4*Bn; Bns<4*(Bn+1); Bns++)
        {
          B = Frama_C_float_interval(Bns / 32., (Bns + 1) / 32.);
          for (ins=4*in; ins<4*(in+1); ins++)
            {
              i = Frama_C_float_interval(ins / 32., (ins + 1) / 32.);
              for (jns=4*jn; jns<4*(jn+1); jns++)
                {
                  j = Frama_C_float_interval(jns / 32., (jns + 1) / 32.);
                  compute(A, B, i, j);
                }
            }
        }
    }
}

main(){

A closing bracket } is also needed to make the whole thing a syntactically correct C program.


Alas, a regrettable performance bug in Frama-C's front end prevents it from analyzing such huge generated C functions. We are a bit too close to the Nitrogen release to change the data structures used to represent the AST, so this bug will probably remain for at least one release cycle. To circumvent the issue, I simply split the files into 182 reasonably-sized chunks (reasonably-sized here meaning 10000 lines, a more usual size for a function).

split -l 10000 pr1.c pr1.c.

182 C files to analyze and 4 cores to analyze them with: this is an undreamed-of opportunity to make use of the xargs -n 1 -P 4 command.

ls pr?.c.* | xargs -n 1 -P 4 ./do.sh

Here is the script do.sh for handling one program. It first catenates the prolog, the chunk, and a closing bracket, and then it launches the value analysis on the resulting C program:

#!/bin/sh
( cat prolog.c ; cat $1 ; echo "}" ) > t_$1.c
frama-c -val share/builtin.c t_$1.c -obviously-terminates \
  -no-val-show-progress -all-rounding-modes > log_pass2_$1 2>&1

Results

The above, yada yada yada, produces one log file for each of the 182 chunks:

$ ls -l log_pass2_pr*
-rw-r--r-- 1 cuoq cuoq 500913957 2011-09-17 20:50 log_pass2_pr1.c.aa
-rw-r--r-- 1 cuoq cuoq 502329593 2011-09-17 20:49 log_pass2_pr1.c.ab
-rw-r--r-- 1 cuoq cuoq 503146982 2011-09-17 20:51 log_pass2_pr1.c.ac
...
-rw-r--r-- 1 cuoq cuoq 502560543 2011-09-18 01:22 log_pass2_pr1.c.ay
-rw-r--r-- 1 cuoq cuoq 502283181 2011-09-18 01:23 log_pass2_pr1.c.az
-rw-r--r-- 1 cuoq cuoq 503974409 2011-09-18 01:30 log_pass2_pr1.c.ba
-rw-r--r-- 1 cuoq cuoq 501308298 2011-09-18 01:29 log_pass2_pr1.c.bb
...
-rw-r--r-- 1 cuoq cuoq 502932885 2011-09-18 05:20 log_pass2_pr1.c.bs
-rw-r--r-- 1 cuoq cuoq 422006804 2011-09-18 05:03 log_pass2_pr1.c.bt
-rw-r--r-- 1 cuoq cuoq 502353901 2011-09-18 05:19 log_pass2_pr2.c.aa
-rw-r--r-- 1 cuoq cuoq 502485241 2011-09-18 05:23 log_pass2_pr2.c.ab
-rw-r--r-- 1 cuoq cuoq 503562848 2011-09-18 05:57 log_pass2_pr2.c.ac
...
-rw-r--r-- 1 cuoq cuoq 184986900 2011-09-18 12:28 log_pass2_pr2.c.bs
-rw-r--r-- 1 cuoq cuoq 498627515 2011-09-18 13:11 log_pass2_pr3.c.aa
...
-rw-r--r-- 1 cuoq cuoq 263096852 2011-09-19 05:27 log_pass2_pr4.c.bs

Incidentally, it appears from the above listing that the second pass took about 36 hours (two nights and one day). The inside of one log file looks like:

[value] DUMPING STATE of file t_pr1.c.aa.c line 24
...
        N ∈ {10; 11}
        An ∈ {-26}
        Bn ∈ {-15}
        in ∈ {-26}
        jn ∈ {18}
        Ans ∈ {-104}
        Bns ∈ {-60}
        ins ∈ {-104}
        jns ∈ {72}
        A ∈ [-3.25 .. -3.21875]
        B ∈ [-1.875 .. -1.84375]
        i ∈ [-3.25 .. -3.21875]
        j ∈ [2.25 .. 2.28125]
        =END OF DUMP==
t_pr1.c.aa.c:15:[value] Function compute: postcondition got status valid.
[value] DUMPING STATE of file t_pr1.c.aa.c line 24
...
        N ∈ {10; 11}
        An ∈ {-26}
        Bn ∈ {-15}
        in ∈ {-26}
        jn ∈ {18}
        Ans ∈ {-104}
        Bns ∈ {-60}
        ins ∈ {-104}
        jns ∈ {73}
        A ∈ [-3.25 .. -3.21875]
        B ∈ [-1.875 .. -1.84375]
        i ∈ [-3.25 .. -3.21875]
        j ∈ [2.28125 .. 2.3125]
        =END OF DUMP==
...

Each value set for variable N is computed more precisely than in the first pass, because the values for floating-point variables A, B, i, j are known more precisely. Above, N is determined to be either 10 or 11 (both safe values) for two of the subsubcubes under examination. Below is the complete list of sets of values computed for N:

$ for i in log_pass2_pr* ; do grep "N " $i | sort -u ; done | sort -u
        N ∈ {10}
        N ∈ {10; 11}
        N ∈ {11}
        N ∈ {7; 8; 9}
        N ∈ {8; 9}
        N ∈ {8; 9; 10}
        N ∈ {9}
        N ∈ {9; 10}
        N ∈ {9; 10; 11}

Conclusion

I have already provided the "the function being analyzed is safe after all" conclusion in the previous post. This post is only for showing the "how", so that you can judge for yourself, for instance, how much cheating there was. There was quite a bit of C code written for the purpose of the verification: this code could itself have bugs. And there was a good amount of log processing with shell scripts. There could be bugs there too. Stéphane Duprat would however point out that if we had done the verification with tests, there would have been testing code and log-processing scripts too. The difference between the two approaches is that we used Frama_C_float_interval() in the places where we might have used a random number generator. It does not feel like a large conceptual leap, and we obtained deeper properties in exchange (would you trust the function not to return 12 if, like me, you didn't understand what it computes at all and had only tested it a million times? A billion times? How many tests would be enough for you to sit under a piano?)
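
To make the comparison concrete, here is the shape of the two approaches for one sub-interval of A; rand_float_in() is a made-up stand-in for a random generator, while Frama_C_float_interval() is the builtin used throughout these posts:

float rand_float_in(float lo, float hi);          // hypothetical: one sample per test run
float Frama_C_float_interval(float lo, float hi); // stands for every float in [lo..hi] at once

void one_step(int An)
{
  float A_tested   = rand_float_in(An / 8., (An + 1) / 8.);          // testing
  float A_analyzed = Frama_C_float_interval(An / 8., (An + 1) / 8.); // value analysis
  (void)A_tested;
  (void)A_analyzed;
}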

The possibility that variable N in function compute() held a number larger than 11 was related to the last and most difficult of a series of alarms that we found in a mostly unmodified piece of obfuscated C code. Some of the other alarms disappeared simply by playing with the -slevel setting, which is the setting to think of immediately when the analysis is fast enough and some alarms remain. I explained where the other alarms came from (a call to memset() was not handled precisely enough, etc.) but in actual use, you don't need to: just increase the argument to -slevel and see if the alarms go away, even before starting to think.


The method I have shown here is a little bit laborious, but one reason for us Frama-C developers to do these case studies is to find out about possible uses and make them more comfortable in future releases. This series of posts shows how to do this kind of verification now (well... it describes how to do them soon; the method shown will work with the Nitrogen release). This is absolutely not definitive. A plug-in could automate a lot of what we have done by hand here.


This post was improved by insights from Florent Kirchner.

Saturday, September 17 2011

Probably safe donut

Introduction

In the first post in the obfuscated animated donut series, my colleague Anne pointed out that:

The alarm about : assert \valid(".,-~:;=!*#$@"+tmp_7); seems strange because the analysis tells us that tmp_7 ∈ [0..40] at this point... How can this be valid ?

It is only safe to use an offset between 0 and 11 when accessing inside the quantization string ".,-~:;=!*#$@". The value analysis determined that the offset was certainly between 0 and 40, so the analyzer can't be sure that the memory access is safe, and emits an alarm. Of course, [0..40] is only a conservative approximation for the value of the offset at that point. This approximation is computed with model functions sin() and cos() only known to return values in [-1.0 .. 1.0]. These model functions completely lose the relation between the sine and the cosine of the same angle. The computation we are interested in is more clearly seen in the function below, which was extracted using Frama-C's slicing plug-in.

/*@ ensures 0 <= \result <= 11 ; */
int compute(float A, float B, float i, float j)
{
  float c=sin(i), l=cos(i);
  float d=cos(j), f=sin(j);
  float g=cos(A), e=sin(A);
  float m=cos(B), n=sin(B);

  int N=8*((f*e-c*d*g)*m-c*d*e-f*g-l*d*n);
  return N>0?N:0;
}

Both the sine and the cosine of each argument are computed, and eventually used in the same formula. Surely the analyzer would do better if it could just remember that sin(x)²+cos(x)²=1. Alas, it doesn't. It just knows that each is in [-1.0 .. 1.0].

Making progress

But the Nitrogen release will have more precise sin() and cos() modelizations! Focusing only on the sub-computation we are interested in, we can split the variation domains for arguments A, B, i, j of function compute(), until it is clear whether or not N may be larger than 11.

I suggest using the program below (download), where the code around compute() creates the right analysis context. Just like in unit testing, there are stubs and witnesses in addition to the function under test.

...

/*@ ensures 0 <= \result <= 11 ; */
int compute(float A, float B, float i, float j)
{
  float c=sin(i), l=cos(i);
  float d=cos(j), f=sin(j);
  float g=cos(A), e=sin(A);
  float m=cos(B), n=sin(B);

  int N=8*((f*e-c*d*g)*m-c*d*e-f*g-l*d*n);
Frama_C_dump_each();
  return N>0?N:0;
}

main(){
  int An, Bn, in, jn;
  float A, B, i, j;
  for (An=-26; An<26; An++)
    {
      A = Frama_C_float_interval(An / 8., (An + 1) / 8.);
      for (Bn=-26; Bn<26; Bn++)
	{
	  B = Frama_C_float_interval(Bn / 8., (Bn + 1) / 8.);
	  for (in=-26; in<26; in++)
	    {
	      i = Frama_C_float_interval(in / 8., (in + 1) / 8.);
	      for (jn=-26; jn<26; jn++)
		{
		  j = Frama_C_float_interval(jn / 8., (jn + 1) / 8.);
		  compute(A, B, i, j);
		}
	    }
	}
    }
}

Each for loop only covers the range [-3.25 .. 3.25]. In the real program, A and B vary in much wider intervals. If target functions sin() and cos() use a decent argument reduction algorithm (and incidentally, use the same one), the conclusions obtained for [-3.25 .. 3.25] should generalize to all finite floating-point numbers. This is really a property of the target platform that we cannot do anything about at this level, other than clearly state that we are assuming it.


In fact, the programs I analyzed were not exactly like the above: this task is trivial to run in parallel, so I split the search space into four and ran four analyzers at the same time:

$ for i in 1 2 3 4 ; do ( time bin/toplevel.opt -val share/builtin.c \
an$i.c -obviously-terminates -no-val-show-progress \
-all-rounding-modes ) > log$i 2>&1 & disown %1 ; done

After a bit less than two hours (YMMV), I got files log1,..., log4 that contained snapshots of the memory state just after variable N had been computed:

[value] DUMPING STATE of file an1.c line 24
...
        c ∈ [-0.016591893509 .. 0.108195140958]
        l ∈ [-1. .. -0.994129657745]
        d ∈ [-1. .. -0.994129657745]
        f ∈ [-0.016591893509 .. 0.108195140958]
        g ∈ [-1. .. -0.994129657745]
        e ∈ [-0.016591893509 .. 0.108195140958]
        m ∈ [-1. .. -0.994129657745]
        n ∈ [-0.016591893509 .. 0.108195140958]
        N ∈ {-1; 0; 1}
        An ∈ {-26}
        Bn ∈ {-26}
        in ∈ {-26}
        jn ∈ {-26}
        A ∈ [-3.25 .. -3.125]
        B ∈ [-3.25 .. -3.125]
        i ∈ [-3.25 .. -3.125]
        j ∈ [-3.25 .. -3.125]
        =END OF DUMP==
an1.c:15:[value] Function compute: postcondition got status valid.
[value] DUMPING STATE of file an1.c line 24
...
        c ∈ [-0.016591893509 .. 0.108195140958]
        l ∈ [-1. .. -0.994129657745]
        d ∈ [-0.999862372875 .. -0.989992439747]
        f ∈ [-0.141120016575 .. -0.0165918916464]
        g ∈ [-1. .. -0.994129657745]
        e ∈ [-0.016591893509 .. 0.108195140958]
        m ∈ [-1. .. -0.994129657745]
        n ∈ [-0.016591893509 .. 0.108195140958]
        N ∈ {-2; -1; 0; 1}
        An ∈ {-26}
        Bn ∈ {-26}
        in ∈ {-26}
        jn ∈ {-25}
        A ∈ [-3.25 .. -3.125]
        B ∈ [-3.25 .. -3.125]
        i ∈ [-3.25 .. -3.125]
        j ∈ [-3.125 .. -3.]
        =END OF DUMP==
[value] DUMPING STATE of file an1.c line 24
...

Some input sub-intervals for A, B, i, j make the analyzer uncertain about the post-condition N ≤ 11. Here are the first such inputs, obtained by searching for "unknown" in the log:

...
[value] DUMPING STATE of file an1.c line 24
 ...
        c ∈ [-0.850319802761 .. -0.778073191643]
        l ∈ [-0.628173649311 .. -0.526266276836]
        d ∈ [0.731688857079 .. 0.810963153839]
        f ∈ [0.585097253323 .. 0.681638777256]
        g ∈ [-1. .. -0.994129657745]
        e ∈ [-0.016591893509 .. 0.108195140958]
        m ∈ [-1. .. -0.994129657745]
        n ∈ [-0.016591893509 .. 0.108195140958]
        N ∈ {8; 9; 10; 11; 12}
        An ∈ {-26}
        Bn ∈ {-26}
        in ∈ {-18}
        jn ∈ {5}
        A ∈ [-3.25 .. -3.125]
        B ∈ [-3.25 .. -3.125]
        i ∈ [-2.25 .. -2.125]
        j ∈ [0.625 .. 0.75]
        =END OF DUMP==
an1.c:15:[value] Function compute: postcondition got status unknown.
...

Here is the complete list of value sets obtained for variable N on the entire search space:

$ cat log? | grep "N " | sort | uniq
        N ∈ {0; 1}
        N ∈ {0; 1; 2}
        N ∈ {0; 1; 2; 3}
        N ∈ {0; 1; 2; 3; 4}
        N ∈ {0; 1; 2; 3; 4; 5}
        N ∈ {0; 1; 2; 3; 4; 5; 6}
        N ∈ {-1; 0}
        N ∈ {-1; 0; 1}
        N ∈ {10; 11}
        N ∈ {10; 11; 12}
        N ∈ {-1; 0; 1; 2}
        N ∈ {-1; 0; 1; 2; 3}
        N ∈ {-1; 0; 1; 2; 3; 4}
        N ∈ {-10; -9}
        N ∈ {-10; -9; -8}
        N ∈ {-10; -9; -8; -7}
        N ∈ {-10; -9; -8; -7; -6}
        N ∈ {-10; -9; -8; -7; -6; -5}
        N ∈ {-10; -9; -8; -7; -6; -5; -4}
        N ∈ {-11; -10}
        N ∈ {-11; -10; -9}
        N ∈ {-11; -10; -9; -8}
        N ∈ {-11; -10; -9; -8; -7}
        N ∈ {-11; -10; -9; -8; -7; -6}
        N ∈ {-11; -10; -9; -8; -7; -6; -5}
        N ∈ {1; 2}
        N ∈ {-12; -11; -10}
        N ∈ {-12; -11; -10; -9}
        N ∈ {-12; -11; -10; -9; -8}
        N ∈ {-12; -11; -10; -9; -8; -7}
        N ∈ {-12; -11; -10; -9; -8; -7; -6}
        N ∈ {1; 2; 3}
        N ∈ {1; 2; 3; 4}
        N ∈ {1; 2; 3; 4; 5}
        N ∈ {1; 2; 3; 4; 5; 6}
        N ∈ {1; 2; 3; 4; 5; 6; 7}
        N ∈ {-13; -12; -11; -10; -9}
        N ∈ {-13; -12; -11; -10; -9; -8}
        N ∈ {-13; -12; -11; -10; -9; -8; -7}
        N ∈ {-2; -1}
        N ∈ {-2; -1; 0}
        N ∈ {-2; -1; 0; 1}
        N ∈ {-2; -1; 0; 1; 2}
        N ∈ {-2; -1; 0; 1; 2; 3}
        N ∈ {2; 3}
        N ∈ {2; 3; 4}
        N ∈ {2; 3; 4; 5}
        N ∈ {2; 3; 4; 5; 6}
        N ∈ {2; 3; 4; 5; 6; 7}
        N ∈ {2; 3; 4; 5; 6; 7; 8}
        N ∈ {-3; -2}
        N ∈ {-3; -2; -1}
        N ∈ {-3; -2; -1; 0}
        N ∈ {-3; -2; -1; 0; 1}
        N ∈ {-3; -2; -1; 0; 1; 2}
        N ∈ {3; 4}
        N ∈ {3; 4; 5}
        N ∈ {3; 4; 5; 6}
        N ∈ {3; 4; 5; 6; 7}
        N ∈ {3; 4; 5; 6; 7; 8}
        N ∈ {3; 4; 5; 6; 7; 8; 9}
        N ∈ {-4; -3}
        N ∈ {-4; -3; -2}
        N ∈ {-4; -3; -2; -1}
        N ∈ {-4; -3; -2; -1; 0}
        N ∈ {-4; -3; -2; -1; 0; 1}
        N ∈ {4; 5}
        N ∈ {4; 5; 6}
        N ∈ {4; 5; 6; 7}
        N ∈ {4; 5; 6; 7; 8}
        N ∈ {4; 5; 6; 7; 8; 9}
        N ∈ {4; 5; 6; 7; 8; 9; 10}
        N ∈ {-5; -4}
        N ∈ {-5; -4; -3}
        N ∈ {-5; -4; -3; -2}
        N ∈ {-5; -4; -3; -2; -1}
        N ∈ {-5; -4; -3; -2; -1; 0}
        N ∈ {5; 6}
        N ∈ {5; 6; 7}
        N ∈ {5; 6; 7; 8}
        N ∈ {5; 6; 7; 8; 9}
        N ∈ {5; 6; 7; 8; 9; 10}
        N ∈ {5; 6; 7; 8; 9; 10; 11}
        N ∈ {-6; -5}
        N ∈ {-6; -5; -4}
        N ∈ {-6; -5; -4; -3}
        N ∈ {-6; -5; -4; -3; -2}
        N ∈ {-6; -5; -4; -3; -2; -1}
        N ∈ {-6; -5; -4; -3; -2; -1; 0}
        N ∈ {6; 7}
        N ∈ {6; 7; 8}
        N ∈ {6; 7; 8; 9}
        N ∈ {6; 7; 8; 9; 10}
        N ∈ {6; 7; 8; 9; 10; 11}
        N ∈ {6; 7; 8; 9; 10; 11; 12}
        N ∈ {-7}
        N ∈ {7}
        N ∈ {-7; -6}
        N ∈ {-7; -6; -5}
        N ∈ {-7; -6; -5; -4}
        N ∈ {-7; -6; -5; -4; -3}
        N ∈ {-7; -6; -5; -4; -3; -2}
        N ∈ {-7; -6; -5; -4; -3; -2; -1}
        N ∈ {7; 8}
        N ∈ {7; 8; 9}
        N ∈ {7; 8; 9; 10}
        N ∈ {7; 8; 9; 10; 11}
        N ∈ {7; 8; 9; 10; 11; 12}
        N ∈ {7; 8; 9; 10; 11; 12; 13}
        N ∈ {-8; -7}
        N ∈ {-8; -7; -6}
        N ∈ {-8; -7; -6; -5}
        N ∈ {-8; -7; -6; -5; -4}
        N ∈ {-8; -7; -6; -5; -4; -3}
        N ∈ {-8; -7; -6; -5; -4; -3; -2}
        N ∈ {8; 9}
        N ∈ {8; 9; 10}
        N ∈ {8; 9; 10; 11}
        N ∈ {8; 9; 10; 11; 12}
        N ∈ {8; 9; 10; 11; 12; 13}
        N ∈ {9; 10}
        N ∈ {9; 10; 11}
        N ∈ {9; 10; 11; 12}
        N ∈ {9; 10; 11; 12; 13}
        N ∈ {-9; -8}
        N ∈ {-9; -8; -7}
        N ∈ {-9; -8; -7; -6}
        N ∈ {-9; -8; -7; -6; -5}
        N ∈ {-9; -8; -7; -6; -5; -4}
        N ∈ {-9; -8; -7; -6; -5; -4; -3}

The above is pretty cute, and it is just short enough for manual inspection. The inspection reveals that all the lines with dangerous values for N contain the number 12, so that we can grep for that. We arrive at the following list of dangerous value sets for N:

        N ∈ {10; 11; 12}
        N ∈ {6; 7; 8; 9; 10; 11; 12}
        N ∈ {7; 8; 9; 10; 11; 12}
        N ∈ {7; 8; 9; 10; 11; 12; 13}
        N ∈ {8; 9; 10; 11; 12}
        N ∈ {8; 9; 10; 11; 12; 13}
        N ∈ {9; 10; 11; 12}
        N ∈ {9; 10; 11; 12; 13}

Note: it is normal that each dangerous line contains 12: if the safety property that we are trying to establish is true, then there must be a safe value (say, 10) in each set. And the sets contain several values as the result of approximations in interval arithmetic, so that sets of values for N are intervals of integers. Any interval that contains both a safe value and an unsafe value must contain the largest safe value 11 and the smallest unsafe value 12.

Conclusion

We have improved our estimate of the maximum value of N from 40 to 13. That's pretty good, although it is not quite enough to conclude that the donut program is safe.

In conclusion to this post, now that we have ascertained that a dangerous set for N is a set that contains 12, we can compute the ratio of the search space for which we haven't yet established that the program is safe. We are exploring a dimension-4 cube by dividing it into sub-cubes. There are 52⁴ = 7 311 616 sub-cubes. Of these, this many are still not known to be safe:

$ cat log? | grep "N " | grep " 12" | wc -l
360684

We are 95% there: 360 684 of the 7 311 616 sub-cubes, a little under 5%, remain. The remaining 5% shouldn't take more than another 95% of the time.

Wednesday, September 14 2011

Linux and floating-point: nearly there

In the process of implementing the value analysis built-ins Frama_C_precise_sin() and Frama_C_precise_cos() from the last post, I stumbled on some interesting floating-point results. The sensationalistic title blames Linux, but I haven't fully investigated the problem yet, and it could be somewhere else.

If you have the Frama-C sources lying around, you might be able to reproduce it as below. Otherwise, just read on; it is rather obvious what is wrong, even for a floating-point issue.

$ make bin/toplevel.top
...
$ bin/toplevel.top 
        Objective Caml version 3.12.1

# open Ival ;;        
# set_round_upward () ;;
- : unit = ()
# sin (-3.0) ;;
- : float = -2.75991758268585219
# set_round_nearest_even () ;;
- : unit = ()
# sin (-3.0) ;;
- : float = -0.141120008059867214

2.76 is a large amplitude for the sine of any angle, and 3.0 isn't even unreasonable as an argument to pass to trigonometric functions. Trigonometric functions on my other OS work from OCaml in all rounding modes. On this Linux computer, the result is reasonable in the default nearest-even mode. This all rather points to an issue in libraries rather than OCaml itself.

If even desktop computers and servers provide this kind of third-rate trigonometric functions, what should we assume about embedded code?

Monday, September 12 2011

Better is the enemy of good... sometimes

This post is about widening. This technique was shown in the second part of a previous post about memcpy(), where it was laboriously used to analyze, imprecisely, the function memcpy() as it is usually written. The value analysis in Frama-C has the ability to summarize loops in less time than they would take to execute. Indeed, it can analyze in finite time loops for which it is not even clear whether they terminate at all. This is the result of widening, the voluntary loss of precision in loops. As a side-effect, widening can produce some counter-intuitive results. This post is about these counter-intuitive results.
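
As a reminder of what widening does, here is a minimal sketch of the textbook interval widening operator; Frama-C's actual operator is more gradual (it uses thresholds), so take this as an illustration only:

#include <limits.h>

typedef struct { int lo, hi; } interval;

// If a bound is still moving after an iteration of the loop body,
// jump it to an extreme value so that the analysis of the loop
// terminates in a finite number of steps.
interval widen(interval previous, interval current)
{
  interval result = current;
  if (current.lo < previous.lo) result.lo = INT_MIN; // unstable lower bound
  if (current.hi > previous.hi) result.hi = INT_MAX; // unstable upper bound
  return result;
}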

The expected: better information in, better information out

Users generally expect that the more you know about the memory state before the interpretation of a piece of program, the more you should know about the memory state after this piece of program. Users are right to expect this, and indeed, it is generally true:

int u;

main(){
  u = Frama_C_interval(20, 80);
  if (u & 1)
    u = 3 * u + 1; // odd
  else
    u = u / 2; // even
}

The command frama-c -val c.c share/builtin.c shows the final value of u to be in [10..241]. If the first line of main() is changed to make the input value range over [40..60] instead of [20..80], then the output interval for u becomes [20..181]. This seems normal: you know more about variable u going in (all the values in [40..60] are included in [20..80]), so you know more about u coming out ([20..181] is included in [10..241]). Most of the analyzer's weaknesses, such as, in this case, the inability to recognize that the largest value that can get multiplied by 3 is respectively 79 or 59, do not compromise this general principle. But widening does.

Better information in, worse information out

Widening happens when there is a loop in the program, such as in the example below:

int u, i;

main(){  
  u = Frama_C_interval(40, 60);
  for (i=0; i<10; i++)
    u = (25 + 3 * u) / 4;
}
$ frama-c -val loop.c share/builtin.c
...
[value] Values for function main:
          u ∈ [15..60]

Analyzing this program with the default options produces the interval [15..60] for u. Now, if we change the initial set for u to be [25..60], an interval that contains all the values of [40..60] and more:

$ frama-c -val loop.c share/builtin.c
...
[value] Values for function main:
          u ∈ [25..60]

By using a larger, less precise set as the input of this analysis, we obtain a more precise result. The reason is that [25..60] is already stable for this loop (the body maps 25 to (25 + 75) / 4 = 25 and 60 to 51), so the analysis needs no widening at all, whereas starting from [40..60], the lower bound keeps creeping down towards the fixpoint 25, and widening eventually jumps it below that fixpoint. This is counter-intuitive, although it is rarely noticed in practice.

Better modelization, worse analysis

There are several ways to make this unpleasant situation less likely to be noticed. Only a few of these are implemented in Frama-C's value analysis. Still, it was difficult enough to find a short example to illustrate my next point, and the program is only as short as I was able to make it. Consider the following program:

int i, j;
double ftmp1;

double Frama_C_cos(double);
double Frama_C_cos_precise(double);

#define TH 0.196349540849361875
#define NBC1 14
#define S 0.35355339

double cos(double x)
{
  return Frama_C_cos(x);
}

main(){  
  for (i = 0; i < 8; i++)
    for (j = 0; j < 8; j++)
      {
        ftmp1 = ((j == 0) ? S : 0.5) *
          cos ((2.0 * i + 1.0) * j * TH);
	ftmp1 *= (1 << NBC1);
      }
}

If you analyze the above program with a smartish modelization of function cos(), one that recognizes when its input is an interval [l..u] of floating-point values on which cos() is monotonic, and computes the optimal output in this case, you get the following results:

$ frama-c -val idct.c share/builtin.c
...
          ftmp1 ∈ [-3.40282346639e+38 .. 8192.]

If you replace the cos() modelization with a more naive one that returns [-1. .. 1.] as soon as its input set isn't a singleton, you get the improved result:

$ frama-c -val idct.c share/builtin.c
...
          ftmp1 ∈ [-8192. .. 8192.]

This is slightly different from the [25..60] example, in that here we are not changing the input set but the modelization of one of the functions involved in the program. Still, the causes are the same, and the effects just as puzzling when they are noticeable. And this is why, until version Carbon, only the naive modelization of cos() was provided.
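
For concreteness, here is a rough sketch of what the smartish modelization might look like; this is my reconstruction, not Frama-C's built-in, and a real version would also need directed rounding on the cos() calls:

#include <math.h>

typedef struct { double lo, hi; } dinterval;

dinterval interval_cos(dinterval x)
{
  // cos() is decreasing on [0..pi]: the optimal image of [lo..hi]
  // there is [cos(hi)..cos(lo)].
  if (0.0 <= x.lo && x.hi <= 3.141592653589793) {
    dinterval r = { cos(x.hi), cos(x.lo) };
    return r;
  }
  // otherwise, fall back to the naive answer
  dinterval naive = { -1.0, 1.0 };
  return naive;
}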

In the next release, there will be, in addition to the naive versions, Frama_C_cos_precise() and Frama_C_sin_precise() built-ins for when you want them. Typically, this will be for the more precise analyses that do not involve widening.
