Frama-C news and ideas

To content | To menu | To search | Frama-C Home

Tag - value-builtins

Entries feed - Comments feed

Thursday, August 23 2012

Technical interlude

The current series of posts is pretty technical, huh? Here is a refreshing diversion in the form of a technical post that also touches on existential questions. What is this blog for? We don't know.

Best solutions, or solutions anyone could have found

If you have been following the current series of posts, you may have noticed that a recurring dilemma is whether the goal is to demonstrate the capabilities of the software tool under consideration (Frama-C's value analysis plug-in) when pushed to its limits by its authors, or to demonstrate what anyone could do without being a creator of the tool, taking advantage of Linux, Mac OS X and sometimes Windows packages for the software, and, most importantly, having only the available documentation to understand what the software makes possible.


The general goal of the blog is to empower users. In fact, much of the aforementioned documentation is available as posts in the blog itself. These posts sometimes describe techniques that were undocumented until then, but describe them so that they wouldn't be undocumented any more. Some of the posts even describe techniques that weren't undocumented at the time of writing. As my colleague Jean-Jacques Lévy would say, “repetition is the essence of teaching”.


However, I do not expect that most software tool competitions are about measuring the availability and learnability of tools. For instance, the SV-COMP competition explicitly forbids anyone other than the authors of the tool to participate with it:

A person (participant) was qualified as competition contributor for a competition candidate if the person was a contributing designer/developer of the submitted competition candidate (witnessed by occurrence of the person’s name on the tool’s project web page, a tool paper, or in the revision logs).

Ostensibly, the tools are the participants in the competition. Involving the authors is only a mean to make the competition as fair as possible for the tools. Some colleagues and I have expressed the opinion that, for static analysis, at this time, this was the right way:

In general, the best way to make sure that a benchmark does not misrepresent the strengths and weaknesses of a tool is to include the toolmakers in the process.


Some competitions on the other hand acknowledge that a participant is a person(s)+tool hybrid, and welcome entries from users that did not create the tools they used. This is perfectly fine, especially for the very interactive tools involved in the cited competition, as long as it is understood that it is not a tool alone, but a hybrid participant, that is being evaluated in such a competition. Some of the participating tools participated in several teams, associated to several users (although the winning teams did tend to be composed of toolmakers using their respective tools).


It is not clear to me where on this axis the RERS competition falls. It is a “free-style” competition. The entries are largely evaluated on results, with much freedom with respect to the way these were obtained. This definitely can also be fun(*).

The competition certainly does not require participating tools to be packaged and documented in such exquisite detail that anyone could have computed the same results. As such, the best description to write is the description of what the authors of the value analysis can do with it with all the shortcuts they trust themselves to take. If some of the questions can be solved efficiently using undocumented features, so be it. It is the free-style style.

But the mission of the blog remains to document, and showing what we have done without explaining how to reproduce it sounds like boasting. I am torn.


(*) An untranslatable French idiom, “bon public”, literally “good audience”, used in an adjective position, conveys a number of subtle, often humorous nuances. I think this applies to me and software competitions. They just all seem fun to me.

A necessary built-in

A value analysis built-in whose usefulness, if it were available, became apparent in the last post was a built-in that halts the analyzer.

Several analysis log snippets in the previous post show a ^C as last line, to indicate that the log already contains the answer to the question being investigated, and that the user watching the analysis go by can freely interrupt it then.

I cheated when editing the post, by the way. I am not fast enough to interrupt the analysis just after the first conclusive log message. But I did interrupt these analyses, which otherwise might have gone on for a long time for no purpose.

I thought I would enjoy my colleagues' horrified reaction so I told everyone about my intention of writing a value analysis built-in, like some already exist for memcpy() or memset(), for halting the analyzer. Not halting the currently considered execution, mind you. There is /*@ assert \false; */ for that, and if you need this functionality available as a C function, you can put /*@ assert \false; */ inside your own function, like I did for assert() earlier when prepping the competition programs for Frama-C.


Clearly, such a built-in does not have a contract, its effects are meta-effects that transcend the framework (in what would, for these colleagues, be the bad sense of the word).

A time-saver

Today I set out to write this built-in, because I started to get serious about the competition and it would be convenient to just instrument the program with:

if ( /* current state identical to predecessor state /* )
{
  Frama_C_show_each_cycle_found(1);
  Frama_C_halt_analysis();
}

instead of having an operator, or even a shell script, watch the log for the Frama_C_show_each_cycle_found message and interrupt Frama-C then.


Soon after opening the file and finding my place, I realized that this feature was already available. I am telling you this because this blog is for the benefit of transparency, for documenting things and for empowering users, but mostly because I want to see my colleagues' horrified reaction when they read about this trick. Actually, I would not use it if I were you. It might cease to work in a future version.

if ( /* current state identical to predecessor state /* )
{
  Frama_C_show_each_cycle_found(1);
  Frama_C_cos(0.0, 0.0);
}

You see, Frama-C value analysis built-ins, such as Frama_C_cos(), do not have an intrinsic type. You can include a header that will give them a type if you wish:

double Frama_C_cos(double x);

Otherwise, the old C89 rules for missing function prototypes apply. During parsing, a type is inferred from the types of the arguments at the call site.

Only if and when the analysis actually reaches the Frama_C_cos(0.0, 0.0) call, the analyzer will realize that it is passed arguments that are incompatible with the OCaml function that implement the built-in. An exception will be raised. Since there is no reason to try to recover from a built-in misuse, this exception is caught only at the highest level. It stops the entire analyzer (it marks the entire analysis results as unreliable, as it should, but of course it does not retroactively unemit the Frama_C_show_each_cycle_found message in the log that says a conclusion has been reached).


The result looks like this, this time without cheating:

[value] Called Frama_C_show_each_cycle_found({1})
...
[value] computing for function any_int <- main.
        Called from Problem6_terminateU_pred.c:9516.
[value] Done for function any_int
[value] Semantic level unrolling superposing up to 1400 states
[value] user error: Invalid argument for Frama_C_cos function
[value] user error: Degeneration occured:
        results are not correct for lines of code
        that can be reached from the degeneration point.
[kernel] Plug-in value aborted because of invalid user input.

This is convenient for timing the analyzer as it tackles some of the harder problems. In the case above, the time was 5m35s answering a liveness question about Problem 6. It would have been a pain to watch the log for that long, fingers poised to interrupt the analyzer.


That's all for this interlude. Expect exciting timings in the next post. There may also be talk of undocumented features that transcend the problem, in the good sense of the word.

Tuesday, July 24 2012

Results are in

A contest and a self-pitying lament

John Regehr was organizing a craziest undefined behavior contest and the results are in. I had an entry in the contest, but I did not win. My entry apparently was too obviously dangerous. As John puts it, “I would have expected a modern C compiler to exploit those undefined behaviors”.

I have explained the problem inherent to constructs of the kind &a - 1 < &a in several different sets of circumstances, to different programmers for different projects, and I was unable to convince any of them that the construct or a variant of it was not harmless and that the code should be changed. But for this contest, the construct was too obviously wrong and too prone to be taken advantage of by an aggressive compiler to win. This is the story of my life, really.

Notable entries

The contest invited C++ entries as well, but fortunately for this blog post, most of the submissions were actually in C.

Runner-up

int main (void) {
  int x = 1;
  while (x) x <<= 1;
  return x;
}

$ bin/toplevel.opt -val t1.c | grep assert
t1.c:3:[kernel] warning: invalid LHS operand for left shift. assert x ≥ 0;

As shown above, Frama-C's value analysis warns for the undefined behavior that made this entry a runner-up in the contest.

Winner #1

enum {N = 32};
int a[N], pfx[N];
void prefix_sum (void)
{
  int i, accum;
  for (i = 0, accum = a[0]; i < N; i++, accum += a[i])
    pfx[i] = accum;
}

$ bin/toplevel.opt -val t2.c -main prefix_sum | grep assert
t2.c:6:[kernel] warning: accessing out of bounds index [1..32]. assert i < N;

In this second example, you need to tell Frama-C to analyze the function prefix_sum. Otherwise, it won't find the undefined behavior: the value analysis is not the sort of heuristic bug-finding tool that detects patterns in code regardless of calling contexts. For precision and soundness, Frama-C's value analysis tries to determine whether prefix_sum is safe in the context(s) in which it is actually called during execution. Without option -main prefix_sum, it looks like it is not called at all, and therefore safe. Indeed, if you try to compile and run the program, the linker complains about a missing main() function, meaning that the code cannot do harm.

Having to use option -main and others to describe your intentions is the sort of quirk you can pick up by going through a tutorial or the documentation.

Winner #2

#include <stdio.h>
#include <stdlib.h>
 
int main() {
  int *p = (int*)malloc(sizeof(int));
  int *q = (int*)realloc(p, sizeof(int));
  *p = 1;
  *q = 2;
  if (p == q)
    printf("%d %d\n", *p, *q);
}

This example involves dynamic allocation, which is one of the C features that Frama-C's value analysis, as a static analyzer, does not claim to handle precisely in all cases.

Here, however, the program is completely deterministic: it does not have inputs of any kind. With an unreleased version of the value analysis, it is possible to interpret such programs for the detection of undefined behaviors. In this interpreter mode, the value analysis handles many more C constructs, including malloc() and realloc() calls.

That version of the value analysis warns about the use of p in the line if (p == q), which is indeed the first use of p after its contents have become indeterminate (in the sense of the C99 standard). The issue here relates to that post on the impossibility of freeing a pointer twice.

My entry

My entry was on the subject of illegal pointer arithmetic, which the value analysis does not warn about because there would be too many programs that it would (correctly) flag. It also involved comparisons of such illegal pointers, which the value analysis does warn about. These comparisons are also commonly found in the wild, but these are very very often actually dangerous, and have security implications.

There is nothing much I can say on the subject that was not already in a previous post on pointer comparisons.

Conclusion

You might think I am going to end up with a statement like “Frama-C's value analysis detects* all undefined behaviors because it finds the undefined behaviors in all four examples illustrated in the contest”.


Perish the thought! That would be fallacious. My claim is that Frama-C's value analysis detects* all undefined behaviors that you may care about, and even some you may think you do not need to care about but that might have surprising consequences, as exemplified on some notable entries in John's contest. The reason it finds* all undefined behaviors is not that it does well for a couple of examples but that it was designed to do so.


Incidentally, Frama-C's value analysis was tested on a lot more examples when working on this article on testcase reduction, and results were compared to KCC's. The Csmith random C program generator produces only defined programs, but C-Reduce, one of the reducers described in the article, produces tons of undefined behaviors. Although that was not a goal of the article, the work also served as an application of differential testing to the detectors of undefined behaviors that are KCC and Frama-C's value analysis.


There are problematic features of the C language that were left aside when initially designing the value analysis for its primary target of critical embedded code. This limits some uses, but it remains possible to extract valuable information off a C program, using the value analysis and the right methodology. One workaround is to use the value analysis as a C interpreter, in which case standard library functions are easier to model. Many of them already are available in the unreleased value analysis.


(*) when used properly. See Winner #1 and Winner #2 for examples of what “using properly” may entail.


My thanks to Sven Mattsen for proofreading this post.

Thursday, January 5 2012

free(): revisited already

If Frama-C doesn't work out, we can always make a comedy team

Facetious colleagues ask me how I make Frama-C's value analysis' messages so informative. "Pascal," one of them says, "in this case study, the generated log contains 8GiB of information! It won't open in Emacs...". I helpfully point out that it is probably a 2^32 thing and that he should use a 64-bit Emacs.

Informativeness with uninitialized variables

Consider the program below:

main(){
  int x = 1;
  int y;
  int z = 3;
  int t = 4;
  int u = 5;
  x += y;
  x += z;
  x += t;
  x += u;
  printf("%d\n", x);
}

A less informative analyzer would predict that the printf() call prints any number, or it might call the printed number "Uninitialized". Either way, the user would be dumbfounded. Ey would have to track the origin of this imprecise diagnostic emself. Going backwards from causes to origins, ey would finally end up to the declaration of local variable y, which ey intended to initialize but forgot to.

The value analysis tells you right there at line 7 (x += y;) that the program is doing something wrong, so that the way back up from cause to origin is much shorter. It's more informative to point out that the program is already doing something wrong at line 7 than to silently propagate a value indicating something wrong has happened somewhere.

Informativeness with dangling pointers

My most facetious colleague Anne wonders whether one shouldn't be allowed to increment q in the program from last post:

main(){
  int *p = malloc(12);
  int *q = p;

  Frama_C_dump_each();
  free(p);
  q++;
  ...
}

This program is treated analogously to that program, and by "analogously" I mean that the same functions are called behind the scenes and the same messages are printed:

main(){
  int *p;
  int *q;

  {
    int a[3];
    p = a;
    q = p;
    Frama_C_dump_each();
  }
  q++;
  ...
}

Could we allow q to be incremented?

In the example with free(), perhaps. In the example with a local variable a, the standard clearly calls the contents of q indeterminate, and it's allowable for a static analyzer to emit an alarm when indeterminate memory contents are used for anything except copying.

The standard is surprisingly brief when it comes to defining free().

The free function causes the space pointed to by ptr to be deallocated, that is, made available for further allocation. [...] if the argument does not match a pointer earlier returned by the calloc, malloc, or realloc function, or if the space has been deallocated by a call to free or realloc, the behavior is undefined.

Should we allow q to be incremented?

Well, I don't want to, for the same reason as in the uninitialized variable y example. This would postpone the moment the error is pointed out to the user, and give em extra work to track the programming error (which is to do anything with q after the free()/end of block. The informative solution is to warn at q++; And I take informativeness very seriously. (You've been a great audience. I'll be here all week)


I think the standard should say that any pointer previously pointing inside the freed space becomes indeterminate. Anyway, that's how it's treated in Frama-C's value analysis, and if you don't like it, you can use another analyzer, unless you are Anne, in which case you can't and you must use this one.

Double free(), no such thing

I have been able to divert a few hours yesterday and today for programming. It was well worth it, as I have discovered a theorem. It is new to me, and I wonder whether it was ever published. The theorem is, a C program cannot double free() a block even if it tries! I thought this deserved an announcement. Some students I know would be glad to hear that.


Indeed, a C program can at most pass an indeterminate value to a function. This is highly objectionable in itself. Students, don't write programs that do this either!

Allow me to demonstrate on the following program:

...
main(){
  int *p = malloc(12);
  int *q = p;

  free(p);
  free(q);
}

You might think that the program above double-frees the block allocated and referenced by p. But it doesn't: after the first call to free(), the contents of q are indeterminate, so that it's impossible to free the block a second time. I will insert a few calls to primitive function Frama_C_dump_each() to show what happens:

main(){
  int *p = malloc(12);
  int *q = p;

  Frama_C_dump_each();
  free(p);
  Frama_C_dump_each();
  free(q);
  Frama_C_dump_each();
}
$ frama-c -val this_is_not_a_double_free.c

The analysis follows the control flow of the program. First a block is allocated, and a snapshot of the memory state is printed a first time in the log:

...
[value] computing for function malloc <- main.
        Called from test.c:12.
[value] Recording results for malloc
[value] Done for function malloc
[value] DUMPING STATE of file test.c line 15
        p ∈ {{ &Frama_C_alloc }}
        q ∈ {{ &Frama_C_alloc }}
        =END OF DUMP==
...

Then free() is called and a second snapshot is logged:

...
[value] computing for function free <- main.
        Called from test.c:16.
[value] Recording results for free
[value] Done for function free
[value] DUMPING STATE of file test.c line 17
        p ∈ ESCAPINGADDR
        q ∈ ESCAPINGADDR
        =END OF DUMP==
...

And then, at the critical moment, the program passes an indeterminate value to a function:

...
test.c:18:[kernel] warning: accessing left-value q that contains escaping addresses;
                      assert(Ook)
test.c:18:[kernel] warning: completely undefined value in {{ q -> {0} }} (size:<32>).
test.c:18:[value] Non-termination in evaluation of function call expression argument
        (void *)q
...
[value] Values at end of function main:
          NON TERMINATING FUNCTION

Note how the last call to Frama_C_dump_each() is not even printed. Execution stops at the time of calling the second free(). Generalizing this reasoning to all attempts to call free() twice, we demonstrate it is impossible to double-free a memory block.

QED


Question: does it show too much that this detection uses the same mechanism as the detection for addresses of local variables escaping their scope? I can change the message if it's too confusing. What would a better formulation be?

As I hope I have made clear, the message will be displayed as soon as the program does anything at all with a pointer that has been freed — and indeed, I have not implemented any special check for double frees. The program below is forbidden just the same, and does not get past the q++;

main(){
  int *p = malloc(12);
  int *q = p;

  Frama_C_dump_each();
  free(p);
  q++;
  ...
}

On the other hand, I still have to detect that the program does not free a global variable, something the analysis currently allows to great comical effect. (Hey, I have only had a couple of hours!)

Please leave your thoughts in the comments.

Friday, September 23 2011

Safe donut

This post documents the steps I followed in order to finish verifying function compute(), picking up from there.

Previously on this blog

In last episode, we had found that some sub-cubes in the search space appeared to lead to dangerous value sets for variable N. These sets were:

        N ∈ {10; 11; 12}
        N ∈ {6; 7; 8; 9; 10; 11; 12}
        N ∈ {7; 8; 9; 10; 11; 12}
        N ∈ {7; 8; 9; 10; 11; 12; 13}
        N ∈ {8; 9; 10; 11; 12}
        N ∈ {8; 9; 10; 11; 12; 13}
        N ∈ {9; 10; 11; 12}
        N ∈ {9; 10; 11; 12; 13}

There are only eight dangerous value sets for N, but these correspond to many values of input variables An, Bn, in, jn in the analysis context we wrote for the occasion. Grepping for each of these eight lines in the original logs allows to get the corresponding values of An, Bn, in, jn. These are the values for which we haven't concluded yet that compute() is safe:

for i in 1 2 3 4
do
    for line in \
	"N ∈ {10; 11; 12}" \
	"N ∈ {6; 7; 8; 9; 10; 11; 12}" \
	"N ∈ {7; 8; 9; 10; 11; 12}" \
	"N ∈ {7; 8; 9; 10; 11; 12; 13}" \
	"N ∈ {8; 9; 10; 11; 12}" \
	"N ∈ {8; 9; 10; 11; 12; 13}" \
	"N ∈ {9; 10; 11; 12}" \
	"N ∈ {9; 10; 11; 12; 13}" 
    do 
        grep "$line" log$i -A4 -B0 | grep -v "N "
        echo --
    done > pb$i
done

The results are well distributed among the four quarters of the search space that I initially chose arbitrarily. That's good news, as it means things can be kept parallel by keeping these files separate. One file looks like:

        An ∈ {-26}
        Bn ∈ {-15}
        in ∈ {-26}
        jn ∈ {18}
--
        An ∈ {-26}
        Bn ∈ {-15}
        in ∈ {-26}
        jn ∈ {19}
--
        An ∈ {-26}
        Bn ∈ {-15}
        in ∈ {-3}
        jn ∈ {6}
--
...

How to re-analyze the problematic subcubes more precisely

Each file pb1, ..., pb4 can be processed a little bit further to look like C code:

for i in 1 2 3 4
do
  sed -e s"/--/f();/" -e s"/∈ {/= /" -e s"/}/;/" < pb$i > pr$i.c
done

The above commands transform the interesting values files into:

        An = -26;
        Bn = -15;
        in = -26;
        jn = 18;
f();
        An = -26;
        Bn = -15;
        in = -26;
        jn = 19;
f();
        An = -26;
        Bn = -15;
        in = -3;
        jn = 6;
f();
...

Each of the four pieces of program weights in at a bit less than 7 MB of C code. I plan to make good use of this information tidbit the next time someone asks me what project sizes Frama-C's value analysis can handle.

$ ls -l pr?.c
-rw-r--r-- 1 cuoq cuoq 6788869 2011-09-17 18:52 pr1.c
-rw-r--r-- 1 cuoq cuoq 6551620 2011-09-17 18:52 pr2.c
-rw-r--r-- 1 cuoq cuoq 6655238 2011-09-17 18:52 pr3.c
-rw-r--r-- 1 cuoq cuoq 6486765 2011-09-17 18:52 pr4.c


The files pr1.c, ..., pr4.c are not complete C programs, but they work well with the following prolog (download):

...

int An, Bn, in, jn;

void f(void)
{
  int Ans, Bns, ins, jns;
  float A, B, i, j;
  for (Ans=4*An; Ans<4*(An+1); Ans++)
    {
      A = Frama_C_float_interval(Ans / 32., (Ans + 1) / 32.);
      for (Bns=4*Bn; Bns<4*(Bn+1); Bns++)
        {
          B = Frama_C_float_interval(Bns / 32., (Bns + 1) / 32.);
          for (ins=4*in; ins<4*(in+1); ins++)
            {
              i = Frama_C_float_interval(ins / 32., (ins + 1) / 32.);
              for (jns=4*jn; jns<4*(jn+1); jns++)
                {
                  j = Frama_C_float_interval(jns / 32., (jns + 1) / 32.);
                  compute(A, B, i, j);
                }
            }
        }
    }
}

main(){

A closing bracket } is also needed to make the whole thing a syntactically correct C program.


Alas, a regrettable performance bug in Frama-C's front end prevents from analyzing such huge generated C functions. We are a bit too close to the Nitrogen release to change data structures for representing the AST, so this bug will probably remain for at least one release cycle. To circumvent the issue, I simply split the files into 182 reasonably-sized chunks (reasonably-sized here meaning 10000 lines, a more usual size for a function).

split -l 10000 pr1.c pr1.c.

182 C files to analyze and 4 cores to analyze them with: this is an undreamed-of opportunity to make use of the xargs -n 1 -P 4 command.

ls pr?.c.* | xargs -n 1 -P 4 ./do.sh

Here is the script do.sh for handling one program. It first catenates the prolog, the chunk, and a closing bracket, and then it launches the value analysis on the resulting C program:

#!/bin/sh
( cat prolog.c ; cat $1 ; echo "}" ) > t_$1.c
frama-c -val share/builtin.c t_$1.c -obviously-terminates \
  -no-val-show-progress -all-rounding-modes > log_pass2_$1 2>&1

Results

The above, yada yada yada, produces one log file for each of the 182 chunks:

$ ls -l log_pass2_pr*
-rw-r--r-- 1 cuoq cuoq 500913957 2011-09-17 20:50 log_pass2_pr1.c.aa
-rw-r--r-- 1 cuoq cuoq 502329593 2011-09-17 20:49 log_pass2_pr1.c.ab
-rw-r--r-- 1 cuoq cuoq 503146982 2011-09-17 20:51 log_pass2_pr1.c.ac
...
-rw-r--r-- 1 cuoq cuoq 502560543 2011-09-18 01:22 log_pass2_pr1.c.ay
-rw-r--r-- 1 cuoq cuoq 502283181 2011-09-18 01:23 log_pass2_pr1.c.az
-rw-r--r-- 1 cuoq cuoq 503974409 2011-09-18 01:30 log_pass2_pr1.c.ba
-rw-r--r-- 1 cuoq cuoq 501308298 2011-09-18 01:29 log_pass2_pr1.c.bb
...
-rw-r--r-- 1 cuoq cuoq 502932885 2011-09-18 05:20 log_pass2_pr1.c.bs
-rw-r--r-- 1 cuoq cuoq 422006804 2011-09-18 05:03 log_pass2_pr1.c.bt
-rw-r--r-- 1 cuoq cuoq 502353901 2011-09-18 05:19 log_pass2_pr2.c.aa
-rw-r--r-- 1 cuoq cuoq 502485241 2011-09-18 05:23 log_pass2_pr2.c.ab
-rw-r--r-- 1 cuoq cuoq 503562848 2011-09-18 05:57 log_pass2_pr2.c.ac
...
-rw-r--r-- 1 cuoq cuoq 184986900 2011-09-18 12:28 log_pass2_pr2.c.bs
-rw-r--r-- 1 cuoq cuoq 498627515 2011-09-18 13:11 log_pass2_pr3.c.aa
...
-rw-r--r-- 1 cuoq cuoq 263096852 2011-09-19 05:27 log_pass2_pr4.c.bs

Incidentally, it appears from the above listing that the second pass took about 36 hours (two nights and one day). The inside of one log file looks like:

[value] DUMPING STATE of file t_pr1.c.aa.c line 24
...
        N ∈ {10; 11}
        An ∈ {-26}
        Bn ∈ {-15}
        in ∈ {-26}
        jn ∈ {18}
        Ans ∈ {-104}
        Bns ∈ {-60}
        ins ∈ {-104}
        jns ∈ {72}
        A ∈ [-3.25 .. -3.21875]
        B ∈ [-1.875 .. -1.84375]
        i ∈ [-3.25 .. -3.21875]
        j ∈ [2.25 .. 2.28125]
        =END OF DUMP==
t_pr1.c.aa.c:15:[value] Function compute: postcondition got status valid.
[value] DUMPING STATE of file t_pr1.c.aa.c line 24
...
        N ∈ {10; 11}
        An ∈ {-26}
        Bn ∈ {-15}
        in ∈ {-26}
        jn ∈ {18}
        Ans ∈ {-104}
        Bns ∈ {-60}
        ins ∈ {-104}
        jns ∈ {73}
        A ∈ [-3.25 .. -3.21875]
        B ∈ [-1.875 .. -1.84375]
        i ∈ [-3.25 .. -3.21875]
        j ∈ [2.28125 .. 2.3125]
        =END OF DUMP==
...

Each value set for variable N is computed more precisely than in the first pass, because the values for floating-point variables A, B, i, j are known more precisely. Above, N is determined to be either 10 or 11 (both safe values) for two of the subsubcubes under examination. Below is the complete list of sets of values computed for N:

$ for i in log_pass2_pr* ; do grep "N " $i | sort -u ; done | sort -u
        N ∈ {10}
        N ∈ {10; 11}
        N ∈ {11}
        N ∈ {7; 8; 9}
        N ∈ {8; 9}
        N ∈ {8; 9; 10}
        N ∈ {9}
        N ∈ {9; 10}
        N ∈ {9; 10; 11}

Conclusion

I have already provided the "the function being analyzed is safe after all" conclusion in the previous post. This post is only for showing the "how", so that you can judge for yourself, for instance, how much cheating there was. There was quite a bit of C code written for the purpose of the verification: this code could itself have bugs. And there was a good amount of logs processing with shell scripts. There could be bugs there too. Stéphane Duprat would however point out that if we had done the verification with tests, there would have been testing code and logs-processing-scripts too. The difference between the two approaches is that we used Frama_C_float_interval() in the places where we might have used a random number generator. It does not feel like a large conceptual leap, and we obtained deeper properties in exchange (would you trust the function not to return 12 if, like me, you didn't understand what it computes at all and had only tested it a million times? A billion times? How many tests would be enough for you to sit under a piano?)

The possibility that variable N in function compute() held a number larger than 11 was related to the last and most difficult of a series of alarms that we found in a mostly unmodified piece of obfuscated C code. Some of the other alarms disappeared simply by playing with the -slevel setting, which is the setting to think of immediately when the analysis is fast enough and some alarms remain. I explained where the other alarms came from (a call to memset() was not handled precisely enough, etc.) but in actual use, you don't need to: just increase the argument to -slevel and see if the alarms go away, even before starting to think.


The method I have shown here is a little bit laborious, but one reason for us Frama-C developers to do these case studies is to find out about possible uses and make them more comfortable in future releases. This series of posts shows how to do this kind of verification now (well... it describes how to do them soon; the method shown will work with the Nitrogen release). This is absolutely not definitive. A plug-in could automate a lot of what we have done by hand here.


This post was improved by insights from Florent Kirchner.

Saturday, September 17 2011

Probably safe donut

Introduction

In the first post in the obfuscated animated donut series, my colleague Anne pointed out that:

The alarm about : assert \valid(".,-~:;=!*#$@"+tmp_7); seems strange because the analysis tells us that tmp_7 ∈ [0..40] at this point... How can this be valid ?

It is only safe to use an offset between 0 and 11 when accessing inside the quantization string ".,-~:;=!*#$@". The value analysis determined that the offset was certainly between 0 and 40, so the analyzer can't be sure that the memory access is safe, and emits an alarm. Of course, [0..40] is only a conservative approximation for the value of the offset at that point. This approximation is computed with model functions sin() and cos() only known to return values in [-1.0 .. 1.0]. These model functions completely lose the relation between the sine and the cosine of a same angle. The computation we are interested in is more clearly seen in the function below, which was extracted using Frama-C's slicing plug-in.

/*@ ensures 0 <= \result <= 11 ; */
int compute(float A, float B, float i, float j)
{
  float c=sin(i), l=cos(i);
  float d=cos(j), f=sin(j);
  float g=cos(A), e=sin(A);
  float m=cos(B), n=sin(B);

  int N=8*((f*e-c*d*g)*m-c*d*e-f*g-l*d*n);
  return N>0?N:0;
}

Both the sine and the cosine of each argument are computed, and eventually used in the same formula. Surely the analyzer would do better if it could just remember that sin(x)²+cos(x)²=1. Alas, it doesn't. It just knows that each is in [-1.0 .. 1.0].

Making progress

But the Nitrogen release will have more precise sin() and cos() modelizations! Focusing only on the sub-computation we are interested in, we can split the variation domains for arguments A, B, i, j of function compute(), until it is clear whether or not N may be larger than 11.

I suggest to use the program below (download), where the code around compute() creates the right analysis context. Just like in unit testing, there are stubs and witnesses in addition to the function under test.

...

/*@ ensures 0 <= \result <= 11 ; */
int compute(float A, float B, float i, float j)
{
  float c=sin(i), l=cos(i);
  float d=cos(j), f=sin(j);
  float g=cos(A), e=sin(A);
  float m=cos(B), n=sin(B);

  int N=8*((f*e-c*d*g)*m-c*d*e-f*g-l*d*n);
Frama_C_dump_each();
  return N>0?N:0;
}

main(){
  int An, Bn, in, jn;
  float A, B, i, j;
  for (An=-26; An<26; An++)
    {
      A = Frama_C_float_interval(An / 8., (An + 1) / 8.);
      for (Bn=-26; Bn<26; Bn++)
	{
	  B = Frama_C_float_interval(Bn / 8., (Bn + 1) / 8.);
	  for (in=-26; in<26; in++)
	    {
	      i = Frama_C_float_interval(in / 8., (in + 1) / 8.);
	      for (jn=-26; jn<26; jn++)
		{
		  j = Frama_C_float_interval(jn / 8., (jn + 1) / 8.);
		  compute(A, B, i, j);
		}
	    }
	}
    }
}

Each for loop only covers the range [-3.25 .. 3.25]. In the real program, A and B vary in much wider intervals. If target functions sin() and cos() use a decent argument reduction algorithm (and incidentally, use the same one), the conclusions obtained for [-3.25 .. 3.25] should generalize to all finite floating-point numbers. This is really a property of the target platform that we cannot do anything about at this level, other than clearly state that we are assuming it.


In fact, the programs I analyzed were not exactly like above: this task is trivial to run in parallel, so I split the search space in four and ran four analyzers at the same time:

$ for i in 1 2 3 4 ; do ( time bin/toplevel.opt -val share/builtin.c \
an$i.c -obviously-terminates -no-val-show-progress \
-all-rounding-modes ) > log$i 2>&1 & disown %1 ; done

After a bit less than two hours (YMMV), I got files log1,..., log4 that contained snapshots of the memory state just after variable N had been computed:

[value] DUMPING STATE of file an1.c line 24
...
        c ∈ [-0.016591893509 .. 0.108195140958]
        l ∈ [-1. .. -0.994129657745]
        d ∈ [-1. .. -0.994129657745]
        f ∈ [-0.016591893509 .. 0.108195140958]
        g ∈ [-1. .. -0.994129657745]
        e ∈ [-0.016591893509 .. 0.108195140958]
        m ∈ [-1. .. -0.994129657745]
        n ∈ [-0.016591893509 .. 0.108195140958]
        N ∈ {-1; 0; 1}
        An ∈ {-26}
        Bn ∈ {-26}
        in ∈ {-26}
        jn ∈ {-26}
        A ∈ [-3.25 .. -3.125]
        B ∈ [-3.25 .. -3.125]
        i ∈ [-3.25 .. -3.125]
        j ∈ [-3.25 .. -3.125]
        =END OF DUMP==
an1.c:15:[value] Function compute: postcondition got status valid.
[value] DUMPING STATE of file an1.c line 24
...
        c ∈ [-0.016591893509 .. 0.108195140958]
        l ∈ [-1. .. -0.994129657745]
        d ∈ [-0.999862372875 .. -0.989992439747]
        f ∈ [-0.141120016575 .. -0.0165918916464]
        g ∈ [-1. .. -0.994129657745]
        e ∈ [-0.016591893509 .. 0.108195140958]
        m ∈ [-1. .. -0.994129657745]
        n ∈ [-0.016591893509 .. 0.108195140958]
        N ∈ {-2; -1; 0; 1}
        An ∈ {-26}
        Bn ∈ {-26}
        in ∈ {-26}
        jn ∈ {-25}
        A ∈ [-3.25 .. -3.125]
        B ∈ [-3.25 .. -3.125]
        i ∈ [-3.25 .. -3.125]
        j ∈ [-3.125 .. -3.]
        =END OF DUMP==
[value] DUMPING STATE of file an1.c line 24
...

Some input sub-intervals for A, B, i, j make the analyzer uncertain about the post-condition N ≤ 11. Here are the first such inputs, obtained by searching for "unknown" in the log:

...
[value] DUMPING STATE of file an1.c line 24
 ...
        c ∈ [-0.850319802761 .. -0.778073191643]
        l ∈ [-0.628173649311 .. -0.526266276836]
        d ∈ [0.731688857079 .. 0.810963153839]
        f ∈ [0.585097253323 .. 0.681638777256]
        g ∈ [-1. .. -0.994129657745]
        e ∈ [-0.016591893509 .. 0.108195140958]
        m ∈ [-1. .. -0.994129657745]
        n ∈ [-0.016591893509 .. 0.108195140958]
        N ∈ {8; 9; 10; 11; 12}
        An ∈ {-26}
        Bn ∈ {-26}
        in ∈ {-18}
        jn ∈ {5}
        A ∈ [-3.25 .. -3.125]
        B ∈ [-3.25 .. -3.125]
        i ∈ [-2.25 .. -2.125]
        j ∈ [0.625 .. 0.75]
        =END OF DUMP==
an1.c:15:[value] Function compute: postcondition got status unknown.
...

Here is the complete list of value sets obtained for variable N on the entire search space:

$ cat log? | grep "N " | sort | uniq
        N ∈ {0; 1}
        N ∈ {0; 1; 2}
        N ∈ {0; 1; 2; 3}
        N ∈ {0; 1; 2; 3; 4}
        N ∈ {0; 1; 2; 3; 4; 5}
        N ∈ {0; 1; 2; 3; 4; 5; 6}
        N ∈ {-1; 0}
        N ∈ {-1; 0; 1}
        N ∈ {10; 11}
        N ∈ {10; 11; 12}
        N ∈ {-1; 0; 1; 2}
        N ∈ {-1; 0; 1; 2; 3}
        N ∈ {-1; 0; 1; 2; 3; 4}
        N ∈ {-10; -9}
        N ∈ {-10; -9; -8}
        N ∈ {-10; -9; -8; -7}
        N ∈ {-10; -9; -8; -7; -6}
        N ∈ {-10; -9; -8; -7; -6; -5}
        N ∈ {-10; -9; -8; -7; -6; -5; -4}
        N ∈ {-11; -10}
        N ∈ {-11; -10; -9}
        N ∈ {-11; -10; -9; -8}
        N ∈ {-11; -10; -9; -8; -7}
        N ∈ {-11; -10; -9; -8; -7; -6}
        N ∈ {-11; -10; -9; -8; -7; -6; -5}
        N ∈ {1; 2}
        N ∈ {-12; -11; -10}
        N ∈ {-12; -11; -10; -9}
        N ∈ {-12; -11; -10; -9; -8}
        N ∈ {-12; -11; -10; -9; -8; -7}
        N ∈ {-12; -11; -10; -9; -8; -7; -6}
        N ∈ {1; 2; 3}
        N ∈ {1; 2; 3; 4}
        N ∈ {1; 2; 3; 4; 5}
        N ∈ {1; 2; 3; 4; 5; 6}
        N ∈ {1; 2; 3; 4; 5; 6; 7}
        N ∈ {-13; -12; -11; -10; -9}
        N ∈ {-13; -12; -11; -10; -9; -8}
        N ∈ {-13; -12; -11; -10; -9; -8; -7}
        N ∈ {-2; -1}
        N ∈ {-2; -1; 0}
        N ∈ {-2; -1; 0; 1}
        N ∈ {-2; -1; 0; 1; 2}
        N ∈ {-2; -1; 0; 1; 2; 3}
        N ∈ {2; 3}
        N ∈ {2; 3; 4}
        N ∈ {2; 3; 4; 5}
        N ∈ {2; 3; 4; 5; 6}
        N ∈ {2; 3; 4; 5; 6; 7}
        N ∈ {2; 3; 4; 5; 6; 7; 8}
        N ∈ {-3; -2}
        N ∈ {-3; -2; -1}
        N ∈ {-3; -2; -1; 0}
        N ∈ {-3; -2; -1; 0; 1}
        N ∈ {-3; -2; -1; 0; 1; 2}
        N ∈ {3; 4}
        N ∈ {3; 4; 5}
        N ∈ {3; 4; 5; 6}
        N ∈ {3; 4; 5; 6; 7}
        N ∈ {3; 4; 5; 6; 7; 8}
        N ∈ {3; 4; 5; 6; 7; 8; 9}
        N ∈ {-4; -3}
        N ∈ {-4; -3; -2}
        N ∈ {-4; -3; -2; -1}
        N ∈ {-4; -3; -2; -1; 0}
        N ∈ {-4; -3; -2; -1; 0; 1}
        N ∈ {4; 5}
        N ∈ {4; 5; 6}
        N ∈ {4; 5; 6; 7}
        N ∈ {4; 5; 6; 7; 8}
        N ∈ {4; 5; 6; 7; 8; 9}
        N ∈ {4; 5; 6; 7; 8; 9; 10}
        N ∈ {-5; -4}
        N ∈ {-5; -4; -3}
        N ∈ {-5; -4; -3; -2}
        N ∈ {-5; -4; -3; -2; -1}
        N ∈ {-5; -4; -3; -2; -1; 0}
        N ∈ {5; 6}
        N ∈ {5; 6; 7}
        N ∈ {5; 6; 7; 8}
        N ∈ {5; 6; 7; 8; 9}
        N ∈ {5; 6; 7; 8; 9; 10}
        N ∈ {5; 6; 7; 8; 9; 10; 11}
        N ∈ {-6; -5}
        N ∈ {-6; -5; -4}
        N ∈ {-6; -5; -4; -3}
        N ∈ {-6; -5; -4; -3; -2}
        N ∈ {-6; -5; -4; -3; -2; -1}
        N ∈ {-6; -5; -4; -3; -2; -1; 0}
        N ∈ {6; 7}
        N ∈ {6; 7; 8}
        N ∈ {6; 7; 8; 9}
        N ∈ {6; 7; 8; 9; 10}
        N ∈ {6; 7; 8; 9; 10; 11}
        N ∈ {6; 7; 8; 9; 10; 11; 12}
        N ∈ {-7}
        N ∈ {7}
        N ∈ {-7; -6}
        N ∈ {-7; -6; -5}
        N ∈ {-7; -6; -5; -4}
        N ∈ {-7; -6; -5; -4; -3}
        N ∈ {-7; -6; -5; -4; -3; -2}
        N ∈ {-7; -6; -5; -4; -3; -2; -1}
        N ∈ {7; 8}
        N ∈ {7; 8; 9}
        N ∈ {7; 8; 9; 10}
        N ∈ {7; 8; 9; 10; 11}
        N ∈ {7; 8; 9; 10; 11; 12}
        N ∈ {7; 8; 9; 10; 11; 12; 13}
        N ∈ {-8; -7}
        N ∈ {-8; -7; -6}
        N ∈ {-8; -7; -6; -5}
        N ∈ {-8; -7; -6; -5; -4}
        N ∈ {-8; -7; -6; -5; -4; -3}
        N ∈ {-8; -7; -6; -5; -4; -3; -2}
        N ∈ {8; 9}
        N ∈ {8; 9; 10}
        N ∈ {8; 9; 10; 11}
        N ∈ {8; 9; 10; 11; 12}
        N ∈ {8; 9; 10; 11; 12; 13}
        N ∈ {9; 10}
        N ∈ {9; 10; 11}
        N ∈ {9; 10; 11; 12}
        N ∈ {9; 10; 11; 12; 13}
        N ∈ {-9; -8}
        N ∈ {-9; -8; -7}
        N ∈ {-9; -8; -7; -6}
        N ∈ {-9; -8; -7; -6; -5}
        N ∈ {-9; -8; -7; -6; -5; -4}
        N ∈ {-9; -8; -7; -6; -5; -4; -3}

The above is pretty cute, and it is just short enough for manual inspection. The inspection reveals that all the lines with dangerous values for N contain the number 12, so that we can grep for that. We arrive at the following list of dangerous value sets for N:

        N ∈ {10; 11; 12}
        N ∈ {6; 7; 8; 9; 10; 11; 12}
        N ∈ {7; 8; 9; 10; 11; 12}
        N ∈ {7; 8; 9; 10; 11; 12; 13}
        N ∈ {8; 9; 10; 11; 12}
        N ∈ {8; 9; 10; 11; 12; 13}
        N ∈ {9; 10; 11; 12}
        N ∈ {9; 10; 11; 12; 13}

Note: it is normal that each dangerous line contains 12: if the safety property that we are trying to establish is true, then there must be a safe value (say, 10) in each set. And the sets contain several values as the result of approximations in interval arithmetics, so that sets of values for N are intervals of integers. Any interval that contains both a safe value and an unsafe value must contain the largest safe value 11 and the smallest unsafe value 12.

Conclusion

We have improved our estimate of the maximum value of N from 40 to 13. That's pretty good, although it is not quite enough to conclude that the donut program is safe.

In conclusion to this post, now that we have ascertained that a dangerous set for N was a set that contains 12, we can compute the ratio of the search space for which we haven't yet established that the program was safe. We are exploring a dimension-4 cube by dividing it in sub-cubes. There are 52⁴ = 7 311 616 sub-cubes. Of these, this many are still not known to be safe:

$ cat log? | grep "N " | grep " 12" | wc -l
360684

We are 95% there. The remaining 5% shouldn't take more than another 95% of the time.

Wednesday, September 14 2011

Linux and floating-point: nearly there

In the process of implementing the value analysis built-ins Frama_C_precise_sin() and Frama_C_precise_cos() from last post, I stumbled on some interesting floating-point results. The sensationalistic title blames Linux, but I didn't fully investigate the problem yet, and it could be somewhere else.

If you have the Frama-C sources lying around, you might be able to reproduce it as below. Otherwise, just read on; it is rather obvious what is wrong, even for a floating-point issue.

$ make bin/toplevel.top
...
$ bin/toplevel.top 
        Objective Caml version 3.12.1

# open Ival ;;        
# set_round_upward () ;;
- : unit = ()
# sin (-3.0) ;;
- : float = -2.75991758268585219
# set_round_nearest_even () ;;
- : unit = ()
# sin (-3.0) ;;
- : float = -0.141120008059867214

2.76 is a large amplitude for the sine of any angle, and 3.0 isn't even unreasonable as an argument to pass to trigonometric functions. Trigonometric functions on my other OS work from OCaml in all rounding modes. On this Linux computer, the result is reasonable in the default nearest-even mode. This all rather points to an issue in libraries rather than OCaml itself.

If even desktop computers and servers provide this kind of third-rate trigonometric functions, what should we assume about embedded code?

Monday, September 12 2011

Better is the enemy of good... sometimes

This post is about widening. This technique was shown in the second part of a previous post about memcpy(), where it was laboriously used to analyze imprecisely function memcpy() as it is usually written. The value analysis in Frama-C has the ability to summarize loops in less time than they would take to execute. Indeed, it can analyze in finite time loops for which it is not clear whether they terminate at all. This is the result of widening, the voluntary loss of precision in loops. As a side-effect, widening can produce some counter-intuitive results. This post is about these counter-intuitive results.

The expected: better information in, better information out

Users generally expect that the more you know about the memory state before the interpretation of a piece of program, the more you should know about the memory state after this piece of program. Users are right to expect this, and indeed, it is generally true:

int u;

main(){
  u = Frama_C_interval(20, 80);
  if (u & 1)
    u = 3 * u + 1; // odd
  else
    u = u / 2; // even
}

The command frama-c -val c.c share/builtin.c shows the final value of u to be in [10..241]. If the first line of main() is changed to make the input value range over [40..60] instead of [20..80], then the output interval for u becomes [20..181]. This seems normal : you know more about variable u going in (all the values in [40..60] are included in [20..80]), so you know more about u coming out ([20..181] is included in [10..241]). Most of the analyzer's weaknesses, such as, in this case, the inability to recognize that the largest value that can get multiplied by 3 is respectively 79 or 59, do not compromise this general principle. But widening does.

Better information in, worse information out

Widening happens when there is a loop in the program, such as in the example below:

int u, i;

main(){  
  u = Frama_C_interval(40, 60);
  for (i=0; i<10; i++)
    u = (25 + 3 * u) / 4;
}
$ frama-c -val loop.c share/builtin.c
...
[value] Values for function main:
          u ∈ [15..60]

Analyzing this program with the default options produce the interval [15..60] for u. Now, if we change the initial set for u to be [25..60], an interval that contains all the values from [40..60] and more:

$ frama-c -val loop.c share/builtin.c
...
[value] Values for function main:
          u ∈ [25..60]

By using a larger, less precise set as the input of this analysis, we obtain a more precise result. This is counter-intuitive, although it is rarely noticed in practice.

Better modelization, worse analysis

There are several ways to make this unpleasant situation less likely to be noticed. Only a few of these are implemented in Frama-C's value analysis. Still, it was difficult enough to find a short example to illustrate my next point, and the program is only as short as I was able to make it. Consider the following program:

int i, j;
double ftmp1;

double Frama_C_cos(double);
double Frama_C_cos_precise(double);

#define TH 0.196349540849361875
#define NBC1 14
#define S 0.35355339

double cos(double x)
{
  return Frama_C_cos(x);
}

main(){  
  for (i = 0; i < 8; i++)
    for (j = 0; j < 8; j++)
      {
        ftmp1 = ((j == 0) ? S : 0.5) *
          cos ((2.0 * i + 1.0) * j * TH);
	ftmp1 *= (1 << NBC1);
      }
}

If you analyze the above program with a smartish modelization of function cos(), one that recognizes when its input is an interval [l..u] of floating-point values on which cos() is monotonous, and computes the optimal output in this case, you get the following results:

$ frama-c -val idct.c share/builtin.c
...
          ftmp1 ∈ [-3.40282346639e+38 .. 8192.]

Replacing the cos() modelization by a more naive one that returns [-1. .. 1.] as soon as its input set isn't a singleton, you get the improved result:

$ frama-c -val idct.c share/builtin.c
...
          ftmp1 ∈ [-8192. .. 8192.]

This is slightly different from the [25..60] example, in that here we are not changing the input set but the modelization of one of the functions involved in the program. Still, the causes are the same, and the effects just as puzzling when they are noticeable. And this is why, until version Carbon, only the naive modelization of cos() was provided.

In the next release, there will be, in addition to the naive versions, Frama_C_cos_precise() and Frama_C_sin_precise() built-ins for when you want them. Typically, this will be for the more precise analyses that do not involve widening.

Monday, June 6 2011

Fixing robots, part 1

This blog post is a revised version of part of my submission to the ICPC 2011 Industry Challenge. Please, go ahead and read the challenge description. I could only paraphrase it without adding anything to it, and so I won't.

The study was made with the April development version of Frama-C, which differs from the last released version in a number of bugfixes (list of reported issues) and minor new features. Specifically, it uses the value analysis and dependencies computations (documented in this manual) and code navigation features. The latter reuse building blocks that were implemented for slicing.

The ever important first step: identify missing library functions

The first step, as the faithful reader of this blog would now know, before using the value analysis for anything is to identify library functions the program depends on:

frama-c impls.c main.c roco.c sim.c main_testcase_1.c -metrics -cpp-command "gcc -DTESTCASE=1 -C -E"

...
Undefined functions (10):
 __builtin_inf  (15 calls); __builtin_inff  (15 calls); fprintf  (2 calls);
 __swbuf  (10 calls); __builtin_fabsf  (15 calls); exp  (1 call);
 __builtin_infl  (15 calls); __builtin_fabsl  (15 calls);
 __builtin_fabs  (15 calls); fabs  (9 calls); 
...

The host computer's standard headers have been used, hence the confusing __builtin_ prefixes. Better not rely on them: the robot's operating system probably bears little resemblance to the host's. Placeholder headers provided with Frama-C can be used instead:

frama-c impls.c main.c roco.c sim.c main_testcase_1.c -metrics -cpp-command "gcc -DTESTCASE=1 -C -E -nostdinc -I. -I/usr/local/share/frama-c/libc"

...
Undefined functions (3):
 exp  (1 call); fabs  (9 calls); fprintf  (2 calls); 
...

Standard functions exp() and fabs() are missing from the Carbon release, but have been added to the development version. With the development version, one simply needs to list /usr/local/share/frama-c/libc/math.c as a file of the analysis project. Function exp() is implemented as a value analysis builtin, taking advantage of the host's own exp() function.

If you wish to reproduce at home, you have to provide your own implementation for exp(). It may not need to be very accurate, but it needs to be analyzable with Carbon's value analysis. A good way would be to clean up the implementation from Sun that everyone is using. As I already said in previous posts, accessing the bits of a floating-point representation directly is so 1990s. If someone removed these ugly and obsolete manipulations, we would end up with a better implementation for exp() (and it would be analyzable with the value analysis from Carbon).

For fabs(), you can use the function below. An implementation from a library would probably unset the sign bit in the IEEE 754 representation but, again, Carbon's value analysis doesn't handle this precisely yet. The implementation below is handled precisely in the sense that a singleton for the input x results in a singleton for the returned value.

double fabs(double x){
  if (x==0.0) return 0.0;
  if (x>0.0) return x;
  return -x;
}

Finally, calls to fprintf(), that do not have any repercussions on the continuing execution, can be replaced with calls to the Frama_C_show_each() built-in. We can let the pre-processor to do this for us, passing the option -Dfprintf=Frama_C_show_each to GCC when pre-processing.

Looking for unspecified/undefined behaviors

We are now ready to launch the value analysis. Since we anticipate testcases 2 and 3 to be very similar, we write a short script:

#!/bin/bash

export CPP="gcc -Dfprintf=Frama_C_show_each -DTESTCASE=$1 -C -E -nostdinc -I. -I/usr/local/share/frama-c/libc"
FILES="/usr/local/share/frama-c/libc/math.c impls.c main.c roco.c sim.c"
TESTCASE="main_testcase_$1.c"

exec frama-c ${FILES} ${TESTCASE} -val -slevel 50000 -unspecified-access -val-signed-overflow-alarms -calldeps -save state$1

About ten minutes after running script.sh 1 > log1, we obtain a log (long:~70MiB) and a state (~1MiB). The log is long because it contains progression messages that are only intended to help identify analysis issues, and can be ignored most of the time. The state contains all the information that has been computed about the program, including values of variables, in a very compact format.

The analysis is completely unrolled (because of option -slevel 50000) and precise until the end. This means that the value analysis has in effect simulated the execution of the program with the inputs provided in main_testcase_1.c.

The log, despite its size, does not warn about any of the undefined or unspecified behaviors that the value analysis is guaranteed to identify (uninitialized access, use of a dangling pointer, overflows in signed integer arithmetics, invalid memory access, invalid comparison of pointers, division by zero, undefined logical shift, overflows in conversions from floating-point to integer, infinite or NaN resulting from a floating-point operation, undefined side-effects in expressions). This is very important. It means that we can rest assured that the strange dynamic behavior we are going to investigate is not caused by the misuse of one of C's dangerous constructs. Nothing would be more frustating than having to track the value of a variable which, according to the source code, is not supposed to change, but is modified through a buffer overflow. The value analysis guarantees we won't have to do that for this execution.

What would of course be better would be to verify that there can be none of the above undefined behaviors for any command sequence. This would be a much stronger result, but would also require a lot more work. When simulating a single execution inside the analyzer, we only check that that particular execution is free of undefined behavior, but it does not require any work from us (only a few minutes of work from the computer).

Exploiting the value analysis' results for program comprehension

The values computed and stored in state1 can be observed in Frama-C's GUI, using the command-line frama-c-gui -load state1. The GUI can also be used to identify the definition site(s) of a variable's value: select the variable at the program point you are interested in, and in the contextual menu, invoke Dependencies → Show defs.

Here, it is the value of variable RoCo_engineVoltage as displayed by the call to fprintf() (that we transformed into a call to Frama_C_show_each()) that is wrong, so we request the definition site(s) of that value:

s1.png

The GUI has pointed us to a call to function RoCo_Process() (using the yellow mark), so we now request the definition sites of RoCo_engineVoltage by the return; statement of that function. We obtain the two sites identified below:

s2.png

The condition that decides which branch is executing is the one shown in the screenshot below.

s3.png

The value analysis tell us the value of RoCo_isActive can be either 0 or 1 at this point during execution, but this variable is one of the variables whose value is printed in the analysis log, and its value was 1 at the instant we are interested in. We therefore focus on the definition site where the value assigned to RoCo_engineVoltage is computed in a call to PT1_Filter().

The dependencies of the particular call to PT1_Filter() we are interested in were computed by option -calldeps and can be found in the log. The call we are interested in is at statement 433. The log contains:

call PT1_Filter at statement 433:
 voltageFilter FROM state; x; t1; dt; voltageFilter
 \result FROM state; x; t1; dt; voltageFilter

Apart from reading its arguments state, x, t1, and dt, the call accesses a static variable voltageFilter. The address of voltageFilter is taken, so we have to be careful: this variable could be modified erroneously through a pointer (although the address-taking appears to be only to pass it to PT1_Filter(), which is innocent enough).

In fact, at this point, we have no idea which of the variables involved in the computation of the result of this call to PT1_Filter() is wrong. Clicking on a variable in the GUI provides the set of values for this variable at this program point, but this is still too imprecise here, since it mixes all 10000 or so passages through the statement.

Let us take advantage of the "blackboard" structure of the analyzed program and dump the entire program state at this statement, by inserting a call to Frama_C_dump_each(). See this previous post for a list of advantages of this built-in function over printf() or a debugger.

--- roco.c	(revision 12956)
+++ roco.c	(working copy)
@@ -293,6 +293,7 @@
                 desiredEngineVoltage, Engine_maxVoltage_PARAM);
         limitationActive = (Engine_maxVoltage_PARAM == desiredEngineVoltage) ||
                              (Engine_minVoltage_PARAM == desiredEngineVoltage);
+	Frama_C_dump_each();
         RoCo_engineVoltage = PT1_Filter (&voltageFilter, desiredEngineVoltage,
                 t13, dT);
         wasInit = init;

We need to launch the analysis again and find something to do for 10 minutes. This is a good time to start looking at bugs 2 and 3.

The log now contains state dumps for each passage through the statement where RoCo_engineVoltage is computed.

According to the ReadMe.txt in the challenge package, an order is given at t=50s. The log shows that this order fails to be executed speedily. The state dump at which lastTime contains 50000 and the next few ones show that of RoCo_engineVoltage's dependencies, variable desiredEngineVoltage is the one with the suspicious value: it is only -0.8, whereas parameters in file roco_config_testcase_1.h and values of the variable elsewhere in our log show that this voltage can go much higher. We are therefore left with the sub-problem of identifying why this variable has this value at this program point.

We use the same tools we have already used for RoCo_engineVoltage, this time applied to variable desiredEngineVoltage and this program point. The screenshot below shows the definitions sites for that value.

s4.png

The value of variable desiredEngineVoltage is defined by the call to function Limiter_Out(), whose argument is in turn defined by the call to Interpolate_from_curve() above.

Option -calldeps computed the implicit inputs of this call, which can be found in the log:

call Interpolate_from_curve at statement 423:
 \result FROM curve; x;
              EngineSpeedToVoltage_CURVE{.numPoints; .x[0..4]; .y[1..3]; }

The state dump in which lastTime==50000 shows that a low value for angleDiffRequest is the cause for the low value of desiredEngineVoltage.

The "Show defs" action in the GUI finds three possible definition sites for this value of angleDiffRequest, shown in the screenshots below.

s5.png

s6.png

s7.png

We find in the log that variable rampValue remains at 0 in the cycles that follow instant 50000. The value we observe for angleDiffRequest is compatible with the algorithm and values of variables at lines 264-277 of file roco.c. So it looks like the cause of the issue is the value of variable rampValue. Action "Show defs" in the GUI indicates that this value is computed by the call to Ramp_out() at line 240. The value of rampTarget is computed as 0.0 or 1.0 from a number of variables, and of these variables, RoCo_legAngleValid was always 1, and direction was always 0 or 1. The latter is suspicious, since in this execution, orders are given to move in both directions:

s8.png

The command "Show defs" applied to variable direction shows that indeed, the variable may have been set to 0 or 1 in three different sites.

s9.png

The site that corresponds to the MoveByAngle command, the middle one in the screenshot above, is suspicious: in the file main_testcase_1.c, the angle passed to this command is negative. This is not just a strange convention, because the computation below for the third definition site determined the direction for the other commands, that are all in the opposite direction, and variable direction was assigned 1 again there.

This suggests the fix below.

--- roco.c	(revision 13019)
+++ roco.c	(working copy)
@@ -211,7 +211,7 @@
                     direction = 0;
                 }
                 else {
-                    direction = (RoCo_desiredDeltaAngle > 0.0) ? -1 : 1;
+                    direction = (RoCo_desiredDeltaAngle > 0.0) ? 1 : -1;
                 }
                 RoCo_commandMoveByAngle = FALSE;
             }


Looking at the source code for related issues, one may notice that the value given to direction is also affected by the piece of code below. It would be worth the time testing different values of t9 and RoCo_hasMinMaxAngles_PARAM, although that complex computation is only active when using commands other than MoveByAngle.

            if ((fabs(t9) > t11) && (direction == 0)) {
                direction = ((((!RoCo_hasMinMaxAngles_PARAM) ||
                               (fabs(t9) >= 180.0))
                              ? t9 : -t9) > 0.0) ? 1 : -1;
            }

Thanks, and Next

My colleagues Benjamin Monate and Virgile Prevosto found the challenge and set up the files in the development repository, so that the only thing that was left to do was to identify the bugs, and started looking at them with me. Anne Pacalet and David Mentré have provided feedback for transforming this submission into a blog post. Anne Pacalet also implemented of all the code navigation features used in this study. The organizers Jochen Quante and Andrew Begel have done an impressive job of making this challenge interesting, and indeed, challenging.

I have also submitted my explanation of tasks 2 and 3 in the challenge. However, I thought someone might want to look at these themselves now that I have shown how easy it is on task 1. The only difficult part is finding an exp() function.

To be continued...

Thursday, March 3 2011

This time for real: verifying function cosine

This post follows that post where a typical double-precision implementation for a cosine function is shown. That implementation is not a correctly rounded implementation, but its double result is precise to a level close to what the double-precision floating-point representation allows. It uses low-level tricks that make it faster on nineties computers.

In this post, I offer a slightly simplified version of the same function (download). This function does not compute exactly the same results, but according to a few billion tests I ran, it does not seem more wrong than the original. I expect the simplified version is actually faster than the original on tens computers, but this is not why I modified it. I needed to make changes because the value analysis does not yet precisely handle all the low-level conversions in the original (accessing the 32 most significant bits of a double). Although the value analysis does not say anything false, it is only possible to analyze the function precisely if this idiom is handled precisely.

If it was important to analyze as-is code that uses this low-level idiom, implementation of this treatment could be prioritized according to the results of the experiment in this post. Results on the original function can be expected to be the same after the new feature is implemented as the results for the modified function described here.

I implemented a quick-and-dirty value analysis builtin for the purpose of this verification. It is not provided so much for being re-used as for possible comments: it is large enough that a bug in it could invalidate the verification. This builtin checks that its second argument is close enough to the cosine of its first argument (it uses its third argument as allowable distance). The builtin displays a log line for each call, so that you can check that it was properly called.

Demonstrating how easy it is to write an unsafe builtin, this implementation assumes without verification that it is called in a range where function cosine is decreasing.


I then wrote the following lines of C:

          x = Frama_C_double_interval(..., ...);
          m = mcos(x);
          Frama_C_compare_cos(x, m, 0x1.0p-21);

Next, I arranged to have the value analysis repeatedly analyze these lines with bounds passed to Frama_C_double_interval that, taken together, cover entirely the range from 0 to π/4. I will not say how I did, because it is not very interesting. Besides, the method I used is convenient enough to discourage someone from writing an Emacs Lisp macro to generate sub-division assertions. That would be an unwanted side-effect: it in fact has complementary advantages with the would-be assertion generation macro that I recommend someone look into.

The verification used a few hours of CPU time (I did not try very hard to make it fast). One thing I did was adapt the input sub-interval width to the part of the curve that was being verified, from two millionths for inputs near zero to a quarter of a millionth for inputs near π/4.

During this experiment, I actually used the cosine implementation from CRlibm as reference function. This library, incidentally developed at my alma mater, provides a number of fast and correctly rounded implementations for mathematical functions. The value analysis was configured with the -all-rounding-modes option, so the results are guaranteed for arbitrary rounding modes, and are guaranteed even if the compiler uses supplemental precision for intermediate results (e.g. if the compiler uses the infamous 80-bit historial floating-point unit of Intel processors).

Here are the logs I obtained (warning: I did not make any file of that level of boringness publicly available since my PhD, plus these are really big, even after compression. Handle with care): log1.gz log2.gz log3.gz log4.gz log5.gz log6.gz log7.gz log8.gz log9.gz log10.gz log11.gz log12.gz log13.gz log14.gz log15.gz. To save you the bother of downloading even the first file, here is what the first three lines look like.

[value] CC -0.0000000000000000 0.0000019073486328 0.9999999999981809 1.0000000000000002 0.9999999999981810 1.0000000000000000 OK
[value] CC 0.0000019073486328 0.0000038146972656 0.9999999999927240 0.9999999999981811 0.9999999999927240 0.9999999999981811 OK
[value] CC 0.0000038146972656 0.0000057220458984 0.9999999999836291 0.9999999999927242 0.9999999999836291 0.9999999999927242 OK


From the fact that the builtin did not complain for any sub-interval, we can conclude that on the interval 0..π/4, the proposed implementation for cosine is never farther than 2^-21 (half a millionth) from the real function cosine.

According to the aforementioned billion tests I made, the implementation seems to be much more precise than above (to about 2^-52 times its argument). The key difference is that I am quite sure it is precise to half a millionth, and that I have observed that it seems to be precise to about 2^-52 times its argument.


Guillaume Melquiond provided irreplaceable floating-point expertise in the preparation of this post.

- page 1 of 2