Frama-C news and ideas


Tag - facetious-colleagues


Friday, February 10 2012

Checking for overflows operation by operation

My colleague Bernard Botella pointed out an interesting example in an offline discussion following the last quiz.

The setup

Consider the snippet:

int s;
unsigned u1, u2;
...
s = u1 - u2;

The programmer's intention with the assignment is to compute, in variable s of type int, the mathematical difference between the value of u1 and the value of u2. You would expect that, using automatic plug-ins in Frama-C, ey would be able to check that the initial values for u1 and u2 are in ranges for which this is always what happens.

In fact, this looks like a perfect assignment for Rte and the value analysis. Rte can generate assertions for the various conversions and the value analysis can check whether the conditions are satisfied, following the method outlined in a recent post.

This is, after all, exactly what static analyzers — and coding rules that forbid overflows — are for.

An obvious false positive

The overflow in the subtraction u1 - u2 looks like it can be justified away. After all, the corresponding warning is emitted when u1==3 and u2==5, and s will still receive the expected value -2 after the implementation-defined conversion to int of the value of the expression u1 - u2, which is UINT_MAX - 1.

The programmer may think “okay, I can see why I get this warning, it happens as soon as u2 is larger than u1, but this is defined and, in my case, innocuous. I only want to check that the mathematical difference fits in s. The check about the absence of overflow in the conversion to int is what I am really interested in.”


Say that the programmer is using the Rte plug-in to express the conditions for both the subtraction and the implicit conversion to int to be safe, pretty-printing an annotated program as a result:

$ frama-c -rte-unsigned-ov -rte t.c -print
...
void main(void)
{
  /*@ assert rte: (unsigned int)(u1-u2) ≤ 2147483647; */
  /*@ assert rte: u1-u2 ≥ 0; */
  s = (int)(u1 - u2);
  return;
}

Since the programmer has seen that the second assertion, about an overflow in the subtraction u1 - u2, is too restrictive, preventing use with u1==3 and u2==5, ey removes it. The programmer instead focuses on making sure that there is no overflow in the conversion to int of the difference, and feels confident if ey sees that there isn't any warning about that. Seems reasonable, right?

void main(void)
{
  /*@ assert rte: (unsigned int)(u1-u2) ≤ 2147483647; */
  s = (int)(u1 - u2);
  return;
}

But…

Consider the case where u1 is very small, u2 is very large, and the value analysis has been able to infer this information relatively precisely. We assume 0 ≤ u1 ≤ 10 and 3000000000 ≤ u2 ≤ 4000000000 for this example:

/*@ requires 0 <= u1 <= 10 ;
    requires 3000000000 <= u2 <= 4000000000 ; */
void main(void)
{
  /*@ assert rte: (unsigned int)(u1-u2) ≤ 2147483647; */
  s = (int)(u1 - u2);
  return;
}

In this case, the value analysis does not warn about the conversion to int ...

$ frama-c -val -lib-entry u.c
...
u.c:8:[value] Assertion got status valid.

... but the value in s is not the mathematical difference between u1 and u2.
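For the skeptical, here is a small stand-alone program illustrating both cases. This is my own addition, and it assumes a 32-bit unsigned int and a two's-complement int:

#include <stdio.h>

int main(void)
{
  unsigned u1 = 3u, u2 = 5u;
  int s;

  /* The benign case: 3u - 5u wraps around to UINT_MAX - 1, and the
     implementation-defined conversion to int gives back -2. */
  s = (int)(u1 - u2);
  printf("%d\n", s);   /* -2 on the usual platforms */

  /* The case the remaining assertion does not catch: 5u - 3000000000u
     wraps around to 1294967301, which is at most 2147483647, so the
     assertion holds, yet s does not receive the mathematical
     difference -2999999995. */
  u1 = 5u; u2 = 3000000000u;
  s = (int)(u1 - u2);
  printf("%d\n", s);   /* 1294967301 */
  return 0;
}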

Conclusion

The conclusion is that looking at overflows atomically is not a silver bullet. Granted, if all possible overflows in the program are verified to be impossible, then machine integers in the program observationally behave like mathematical integers. But the more common case is that some overflows do in fact happen in the program and are justified as benign by the programmer. The issue pointed out by Bernard Botella is that it is possible to justify an overflow, for which the rule seems too strict, by exhibiting input values for which the expected result is computed, and then to be surprised by the consequences.

Perhaps in another post, we can look at ways around this issue.

Friday, January 20 2012

Constants quiz

What does the following program print?

long long l;
#include <stdio.h>

int main()
{
  l = -0x80000000;
  printf("%lld\n", l);
}

$ gcc t.c && ./a.out 

And this one? (beware, trick question)

long long l;
#include <stdio.h>

int main()
{
  l = -2147483648;
  printf("%lld\n", l);
}

$ gcc t.c && ./a.out 

What, "it's too difficult!"? Okay, here is a hint in the form of a third question:

long long l;
#include <stdio.h>

int main()
{
  l = -2147483648;
  printf("%lld\n", l);
}

$ gcc -std=c99 t.c && ./a.out 

You are probably reading this on a computer on which the answers are "2147483648", "2147483648" and "-2147483648". On such a computer, int and long are 32 bits wide: the hexadecimal constant 0x80000000 does not fit in int and gets type unsigned int, while the decimal constant 2147483648 gets type unsigned long under the C90 rules that gcc applies by default, but long long under C99 rules. Only in the third program does the negation apply to a value of a signed 64-bit type.
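One way to observe the types of the constants directly is to print their sizes. This demonstration program is my own addition, and it assumes a 32-bit target where int and long are 4 bytes:

#include <stdio.h>

int main(void)
{
  /* 0x80000000 does not fit in a 32-bit int; for a hexadecimal
     constant, the next candidate type is unsigned int: */
  printf("%lu\n", (unsigned long)sizeof(0x80000000));  /* 4 */

  /* 2147483648 is a decimal constant: under C90 rules it gets type
     unsigned long, under C99 rules it gets type long long.
     Prints 4 when compiled as C90 (gcc t.c), 8 as C99 (gcc -std=c99): */
  printf("%lu\n", (unsigned long)sizeof(2147483648));
  return 0;
}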

Wednesday, January 11 2012

Static analysis benchmarks

Reminiscing

The first benchmark Frama-C's value analysis was ever evaluated on was the Verisec suite, described in "A buffer overflow benchmark for software model checkers". My colleague Géraud Canet was doing the work, thence the facetious-colleagues tag of this post. In retrospect, Verisec was a solid piece of work. Being naive when it comes to statistics, I didn't recognize it at the time, but when I later examined other benchmarks, I developed an appreciation for this one. It does a lot of things right.

  • It contains good examples as well as bad. A static analyzer is for separating good programs from bad ones. It should be obvious that you cannot evaluate how well a static analyzer does using only bad programs.
  • The article's title says it's for "Software Model Checkers", but it does a good job of being accessible to a wide range of tools with varying abilities. Some good/bad program pairs are available in "no pointer, only arrays" variants and in "small arrays only" variants, providing more comparison points between different techniques. This may help discover that some technique detects a wrong behavior with small arrays but does not scale to larger ones, whereas another technique does not detect it at all.
  • The entries in the benchmark are moderate-sized snippets of code with unknown inputs. If, by contrast, the inputs were fixed in most or all of the examples, and assuming that these examples did not escape into an infinite loop, then a detection strategy based on concretely executing the program step by step and detecting wrong behaviors on the fly would sail through the benchmark. The benchmark would accurately capture the fact that, for such cases, an analyzer can easily be made correct and complete. What this would miss is that users of static analysis tools turn to them to detect wrong behaviors that can occur with some inputs among many, and not just to confirm that wrong behaviors happen when known (dangerous) inputs are used.


Géraud was using the Verisec benchmark right, too. He spent enough time on each case to uncover some interesting facts. One such interesting fact was a definite bug in one of Verisec's "good" cases, where the targeted bug was supposed to be fixed. The explanation was this: there were two bugs in the real-life snippet that had been used as the "bad" case, and only one had hitherto been noticed. The known bug had been fixed in the corresponding "good" case in the benchmark and in real life. The unknown bug had not, as the song of La Palisse might have said.

When a static analysis benchmark is done right, as the Verisec benchmark is, it tries to measure the ability of tools to solve a hard problem: predicting whether something happens at run-time in a non-trivial program with unknown inputs. Furthermore, the reason there is a need for static analyzers and for a benchmark is that humans are not good at recognizing issues in code either. So, if the benchmark is representative of the problems a static analyzer should solve, no perfect oracle exists that can help decide what the correct answers are on each snippet, and the benchmark can be expected to contain a few bugs classified as "good" code.

Is safety static analysis incredibly immature?

In a recent post I compared Frama-C's value analysis to PolySpace and Astrée. It was a single-point comparison in each case, each time using an example chosen by the people who sell the analyzer, an example used on the analyzer's website to demonstrate the analyzer's precision. In that post's comments, Sylvain regretted "that the field of static error-detection for embedded C (as opposed to security-oriented static analysis, Cf. the [NIST SAMATE] test suite) is incredibly immature from a customer point of view".

His conclusion is the opposite of the idea I was trying to express there. I was trying to say that embedded safety static analysis was mature, and that all that remained to be done to convince people was to stop making unrealistic claims and derisive comparisons. Talk about scoring own goals. This blog post is my penitence for not having been clear the first time.


The field of software security is orders of magnitude more complex and disorganized than that of embedded safety. It deals with programs that are larger and use more difficult constructs (e.g. dynamic allocation). The properties that need to be verified are numerous and widely dispersed:

  • There are the undefined behaviors that cause something to happen that was not intended by the programmer. These can breach any of a system's confidentiality, integrity or availability. They are the same undefined behaviors that Polyspace, Astrée or Frama-C's value analysis detect, give or take that a previous version of Astrée did not detect uninitialized variables or that the Nitrogen version of Frama-C's value analysis (initially intended for embedded code) does not include the modelization of free() I have been working on recently — and therefore does not detect undefined behaviors related to free().
  • There are security issues in code without undefined behaviors, such as the hashing denial of service attack everyone is talking about these days.
  • And as a last illustration of the variety of security properties, timing information leaks, the detection of which I recently discussed, belong to security considerations.

As a digression, we are not worried about the hashing denial of service attack in Frama-C, but the counter-measures in OCaml 3.13 (which allow hash functions to be seeded) are going to be a bit of a bother for Frama-C developers. He that among you never iterated on a hash-table when displaying the results of a regression test, let him first cast a stone.


The point is that software security people need a suite like NIST's to describe by example what their field is about (more than 45 000 C/C++ cases to date). The field of embedded safety does not have a similar suite because it does not need one. AbsInt, the company that sells Astrée, did not wait for me to (gently) make fun of their website before they implemented uninitialized variable detection: they did it of their own volition because they knew Astrée should do it, and they had already released a new version with that feature at the time my colleague Florent Kirchner tried their example and noticed the uninitialized variable.

A non-benchmark

Also, I think it's a slippery slope to call what NIST has a "test suite", because before long someone will call it a "benchmark", and NIST's suite is no benchmark. Many of the examples have definite inputs, making them unrepresentative — specifically, making them too easy to handle with techniques that do not scale to the billions of possible inputs that must be handled routinely in real life. There do not seem to be nearly enough "good" examples in the suite: the few I looked at were all "bad" examples, and based on this sample, an analyzer that always answered "there is an issue in the target code" would get a very good mark indeed if this were a benchmark.


Worse, in some cases, the "bad" line to find is extremely benign and there are more serious issues in the case. Consider entry 1737, one of the few I looked at:

...
void test( char * str )
{
  char * buf;
  char * newbuf;
  char * oldbufaddress;

  buf = malloc( MAXSIZE );
  if ( !buf )
    return;
  strncpy( buf, str, MAXSIZE );
  buf[MAXSIZE - 1] = '\0';
  printf( "original buffer content is : %s\n", buf );
  oldbufaddress = buf;
  buf = realloc( buf, 1024 ); /*BAD */
  printf( "realloced buffer content is : %s\n", buf );
  printf( "original buffer address content is : %s\n", oldbufaddress );
  free( buf );
}

int main( int argc, char * * argv )
{
  char * userstr;

  if ( argc > 1 )
  {
    userstr = argv[1];
    test( userstr );
  }
...

With the proper modelization for printf(), free(), and realloc(), such as I am currently working on, Frama-C's value analysis would warn about both printf() calls. The warning for the first call is because realloc() can fail and return NULL (passing NULL to printf() invokes undefined behavior). In this case, there famously also exists a memory leak problem, as the initial block is not freed, but a memory leak is not an undefined behavior. The warning for the second call is because oldbufaddress becomes indeterminate if realloc() moves the block.
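As an aside, the idiomatic way of calling realloc() avoids both warnings. The sketch below is my own, independent of the benchmark:

#include <stdlib.h>

/* Resize a buffer without ever using a possibly-dangling pointer:
   keep the old pointer until realloc() has succeeded, and stop using
   it afterwards. */
char *resize(char *buf, size_t newsize)
{
  char *newbuf = realloc(buf, newsize);
  if (newbuf == NULL)
  {
    /* realloc() failed: buf is still valid and must still be freed. */
    free(buf);
    return NULL;
  }
  /* From here on, only newbuf may be used: if the block was moved,
     buf is indeterminate. */
  return newbuf;
}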


The above is not what the author of the test case wants to hear. What the author of the test case wants to hear, clarified by the CWE-244 reference, is that the realloc() call is dangerous because if realloc() moves the block, it may leave the confidential information that was contained in the block lying on the heap. This is fine, if the case is intended as a starting point for discussion (although it would be better without an undefined behavior unrelated to the specific issue being tested).

One problem is that there is no indication in the program that the block contains anything secret. A static analyzer that warns for this program should warn for any use of realloc(), since this is how realloc() is intended to be used (it is intended to preserve the contents of the block being resized). An optimal analyzer for this issue is grep realloc *.c.

It does not hurt to zero the secret data before you free() a block, and to avoid realloc(), in a "belt and suspenders" approach to security, but if you use Frama-C's value analysis properly to verify that the program does not have undefined behaviors, you do not have to do this, and the code will still be safe from heap inspection vulnerabilities. Perhaps using the value analysis "properly" is inapplicable in your case; but you should at least be aware of the design choices here. Case 1737, by itself, just seems to pick one specific circumvention method for a general issue, and then confuses the resolution of the general issue with the application of that specific method. The map is not the territory.
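For completeness, here is a minimal sketch of the belt-and-suspenders variant that case 1737 implicitly asks for. The helper is hypothetical and assumes the block is known to hold secrets; note also that an aggressive compiler may optimize away a plain memset() before free(), in which case a volatile-qualified loop is the usual workaround:

#include <stdlib.h>
#include <string.h>

/* Grow a buffer holding secrets without leaving a stale copy on the
   heap: allocate, copy, wipe the old block, then free it. */
void *resize_secret(void *old, size_t oldsize, size_t newsize)
{
  void *newp = malloc(newsize);
  if (newp == NULL)
    return NULL;
  memcpy(newp, old, oldsize < newsize ? oldsize : newsize);
  memset(old, 0, oldsize);   /* wipe the secret before releasing */
  free(old);
  return newp;
}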


Were this case used as a benchmark, it would very much remind me of primary school tests written by teachers without any deep understanding of the matter they were teaching, where you had to guess what was in the teacher's mind at each unclear question. Here, for instance, the value analysis warns that the program may use a dangling pointer in the second printf() — intended to symbolize the information leak. If the program instead called malloc() and tried to use the contents of the new block without having initialized it, then the value analysis would warn about that. Generally speaking, every way I know of by which the contents of the initial block could be read from the heap is flagged at some point as an undefined behavior, but this is not what case 1737 tests, because case 1737 tests "analyzer warns about realloc()", and the correct answer is to warn about realloc().

Conclusion

Coming back to the question of what field is immature from a customer point of view, if your interest is in software security, NIST's suite certainly can be used as a basis to understand what kind of static analyzer you may be after. You can try candidate static analyzers on cases that you have selected as most representative of your own problems. But it won't save you from reading each candidate's documentation, understanding how each of them works, and carefully examining their results. Any mechanical interpretation of the results will lead you to pick grep realloc without even understanding case 1737 and its limitations.


The issue for comparing embedded safety static analyzers is that the design space is too large, and the design space for security static analysis certainly isn't any smaller. And if someone ever boasts that their security-oriented software analyzer covers 95% of the NIST SAMATE "benchmark", they are probably selling snake oil. Just saying.

Thursday, January 5 2012

free(): revisited already

If Frama-C doesn't work out, we can always make a comedy team

Facetious colleagues ask me how I make Frama-C's value analysis' messages so informative. "Pascal," one of them says, "in this case study, the generated log contains 8GiB of information! It won't open in Emacs...". I helpfully point out that it is probably a 2^32 thing and that he should use a 64-bit Emacs.

Informativeness with uninitialized variables

Consider the program below:

main(){
  int x = 1;
  int y;
  int z = 3;
  int t = 4;
  int u = 5;
  x += y;
  x += z;
  x += t;
  x += u;
  printf("%d\n", x);
}

A less informative analyzer would predict that the printf() call prints any number, or it might call the printed number "Uninitialized". Either way, the user would be dumbfounded. Ey would have to track the origin of this imprecise diagnostic emself. Going backwards from causes to origins, ey would finally end up at the declaration of local variable y, which ey intended to initialize but forgot to.

The value analysis tells you right there at line 7 (x += y;) that the program is doing something wrong, so that the way back up from cause to origin is much shorter. It's more informative to point out that the program is already doing something wrong at line 7 than to silently propagate a value indicating something wrong has happened somewhere.

Informativeness with dangling pointers

My most facetious colleague Anne wonders whether one shouldn't be allowed to increment q in the program from the last post:

main(){
  int *p = malloc(12);
  int *q = p;

  Frama_C_dump_each();
  free(p);
  q++;
  ...
}

This program is treated analogously to that program, and by "analogously" I mean that the same functions are called behind the scenes and the same messages are printed:

main(){
  int *p;
  int *q;

  {
    int a[3];
    p = a;
    q = p;
    Frama_C_dump_each();
  }
  q++;
  ...
}

Could we allow q to be incremented?

In the example with free(), perhaps. In the example with a local variable a, the standard clearly calls the contents of q indeterminate, and it's allowable for a static analyzer to emit an alarm when indeterminate memory contents are used for anything except copying.

The standard is surprisingly brief when it comes to defining free().

The free function causes the space pointed to by ptr to be deallocated, that is, made available for further allocation. [...] if the argument does not match a pointer earlier returned by the calloc, malloc, or realloc function, or if the space has been deallocated by a call to free or realloc, the behavior is undefined.

Should we allow q to be incremented?

Well, I don't want to, for the same reason as in the uninitialized variable y example. This would postpone the moment the error is pointed out to the user, and give em extra work to track the programming error (which is to do anything with q after the free()/end of block). The informative solution is to warn at q++. And I take informativeness very seriously. (You've been a great audience. I'll be here all week.)


I think the standard should say that any pointer previously pointing inside the freed space becomes indeterminate. Anyway, that's how it's treated in Frama-C's value analysis, and if you don't like it, you can use another analyzer, unless you are Anne, in which case you can't and you must use this one.

Monday, December 5 2011

Formidable colleagues, patchlevels and new features in Nitrogen

My colleagues are formidable. Over the last few days, three of them fixed four bugs in my code. And one of the fixes was a simple, obviously correct fix for the bug previously discussed here.


In celebration of these terrific colleagues, here is a new patchset for Nitrogen 20111001, patchlevel 2. It replaces patchlevel 1. I have tested it as thoroughly as the previous one, that is, I didn't even check if it applied cleanly to the Nitrogen tarball.


Finally, with the aforementioned bugfix out the door, it is time to boast about another new feature available in Nitrogen: string constants. String constants are now properly treated as read-only: the program is not supposed to write there. If it does, an alarm is emitted, and for the sake of precision when analyzing the rest of the program, it is assumed that the write did not take place in the string constant. If the pointer could also point somewhere writable, then the write is assumed to take place there instead. If the pointer was only pointing to read-only or otherwise unwritable locations, the propagation ends for that state.

If this seems a bit abstract, here is the same example as before, revisited:

char *p = "foo";
char c;
char s = 'F';

void f(int u)
{
  if (u) p = &c;
  *p = s;
}

Treating f as the entry point of the program, the analyzer doesn't know what values may be passed for the argument u. Just before the assignment to *p, it assumes that p may point to c or inside "foo". But as the program tries to write to *p, it warns that p had better point somewhere writable. Since the user will of course make sure of that, the analyzer can then assume that p points to c, and since the write stores the value of s there, it also knows that c contains 'F' at the end of the function:

...
bl.c:8:[kernel] warning: out of bounds write. assert \valid(p);
...
[value] Values at end of function f:
          p ∈ {{ &c }}
          c ∈ {70}

One of the next steps will be to transform the "p points somewhere writable" condition before the assignment into "u is non-null" as a pre-condition of function f(), using weakest pre-condition computation. Quite a lot of effort remains before this works. For one thing, the value analysis does not yet distinguish between "valid for reading" and "valid for writing" in the ACSL predicates it annotates the program with (although it makes the distinction in the log messages).
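To make the goal concrete, here is the kind of annotated function one would hope to obtain eventually. This is a hand-written sketch, not the output of any current plug-in:

/*@ requires u != 0; */
void f(int u)
{
  if (u) p = &c;
  *p = s;
}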

Speaking of string constants, I should also mention that the value analysis in Nitrogen emits a warning for "foo" == "foo". I already boasted about that here, but Nitrogen wasn't released at the time.


This post was brought to you thanks to the formidableness of Philippe Herrmann, Virgile Prevosto, and Boris Yakobowski.

Monday, September 19 2011

Penultimate donut post: the function compute() is safe

Do two jobs, and do them well

In the previous post, I used the command:

$ cat log? | grep "N " | sort | uniq

This may be an inefficient way to get a list of unique lines containing "N ". The command sort does not know that it is sending its results to uniq, so it has to keep all duplicates in memory, or at least keep a count for each of them. As pointed out by my colleague Benjamin Monate, you can make sort's job easier by telling it directly that you are only interested in unique lines. The option is sort -u. Even if sort has an optimization for dealing with duplicate lines efficiently, saying what you mean up front can only make its job simpler.
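With this option, the pipeline becomes:

$ cat log? | grep "N " | sort -u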

Update on the donut verification

The computations for the remaining difficult 5% of the search space are now finished. I was hoping not to have to do a third pass, so I split each side of the examined subcubes into 4. In other words, I divided each subcube into 4⁴ = 256 subsubcubes. A little birdie told me that splitting each side in two wouldn't be sufficient. That birdie had been wrong before, but this time, it was absolutely sure: it got the tip straight from the subcube's mouth.

I have kept the entire logs for this second pass, since I didn't know yet whether I would need the additional information they contain. I would have needed them if a third pass had been necessary, but it won't be. It is now official: function compute() is safe for all arguments between -3.25 and 3.25.

Now I must write the concluding post quickly, because the full logs are taking up 85GB on our computation server.

Saturday, September 17 2011

Probably safe donut

Introduction

In the first post in the obfuscated animated donut series, my colleague Anne pointed out that:

The alarm about assert \valid(".,-~:;=!*#$@"+tmp_7); seems strange because the analysis tells us that tmp_7 ∈ [0..40] at this point... How can this be valid?

It is only safe to use an offset between 0 and 11 when accessing inside the quantization string ".,-~:;=!*#$@". The value analysis determined that the offset was certainly between 0 and 40, so the analyzer can't be sure that the memory access is safe, and emits an alarm. Of course, [0..40] is only a conservative approximation for the value of the offset at that point. This approximation is computed with model functions sin() and cos() only known to return values in [-1.0 .. 1.0]. These model functions completely lose the relation between the sine and the cosine of the same angle. Indeed, with each sine and cosine only known to lie in [-1.0 .. 1.0], the expression (f*e-c*d*g)*m-c*d*e-f*g-l*d*n below can only be bounded by [-5.0 .. 5.0], so eight times it lies in [-40 .. 40], and the final clamp N>0?N:0 leaves [0..40]. The computation we are interested in is more clearly seen in the function below, which was extracted using Frama-C's slicing plug-in.

/*@ ensures 0 <= \result <= 11 ; */
int compute(float A, float B, float i, float j)
{
  float c=sin(i), l=cos(i);
  float d=cos(j), f=sin(j);
  float g=cos(A), e=sin(A);
  float m=cos(B), n=sin(B);

  int N=8*((f*e-c*d*g)*m-c*d*e-f*g-l*d*n);
  return N>0?N:0;
}

Both the sine and the cosine of each argument are computed, and eventually used in the same formula. Surely the analyzer would do better if it could just remember that sin(x)²+cos(x)²=1. Alas, it doesn't. It just knows that each is in [-1.0 .. 1.0].

Making progress

But the Nitrogen release will have more precise sin() and cos() modelizations! Focusing only on the sub-computation we are interested in, we can split the variation domains for arguments A, B, i, j of function compute(), until it is clear whether or not N may be larger than 11.

I suggest using the program below, where the code around compute() creates the right analysis context. Just like in unit testing, there are stubs and witnesses in addition to the function under test.

...

/*@ ensures 0 <= \result <= 11 ; */
int compute(float A, float B, float i, float j)
{
  float c=sin(i), l=cos(i);
  float d=cos(j), f=sin(j);
  float g=cos(A), e=sin(A);
  float m=cos(B), n=sin(B);

  int N=8*((f*e-c*d*g)*m-c*d*e-f*g-l*d*n);
  Frama_C_dump_each();
  return N>0?N:0;
}

main(){
  int An, Bn, in, jn;
  float A, B, i, j;
  for (An=-26; An<26; An++)
    {
      A = Frama_C_float_interval(An / 8., (An + 1) / 8.);
      for (Bn=-26; Bn<26; Bn++)
        {
          B = Frama_C_float_interval(Bn / 8., (Bn + 1) / 8.);
          for (in=-26; in<26; in++)
            {
              i = Frama_C_float_interval(in / 8., (in + 1) / 8.);
              for (jn=-26; jn<26; jn++)
                {
                  j = Frama_C_float_interval(jn / 8., (jn + 1) / 8.);
                  compute(A, B, i, j);
                }
            }
        }
    }
}

Each for loop only covers the range [-3.25 .. 3.25]. In the real program, A and B vary in much wider intervals. If target functions sin() and cos() use a decent argument reduction algorithm (and incidentally, use the same one), the conclusions obtained for [-3.25 .. 3.25] should generalize to all finite floating-point numbers. This is really a property of the target platform that we cannot do anything about at this level, other than clearly state that we are assuming it.


In fact, the programs I analyzed were not exactly like the one above: this task is trivial to parallelize, so I split the search space in four and ran four analyzers at the same time:

$ for i in 1 2 3 4 ; do ( time bin/toplevel.opt -val share/builtin.c \
an$i.c -obviously-terminates -no-val-show-progress \
-all-rounding-modes ) > log$i 2>&1 & disown %1 ; done

After a bit less than two hours (YMMV), I got files log1,..., log4 that contained snapshots of the memory state just after variable N had been computed:

[value] DUMPING STATE of file an1.c line 24
...
        c ∈ [-0.016591893509 .. 0.108195140958]
        l ∈ [-1. .. -0.994129657745]
        d ∈ [-1. .. -0.994129657745]
        f ∈ [-0.016591893509 .. 0.108195140958]
        g ∈ [-1. .. -0.994129657745]
        e ∈ [-0.016591893509 .. 0.108195140958]
        m ∈ [-1. .. -0.994129657745]
        n ∈ [-0.016591893509 .. 0.108195140958]
        N ∈ {-1; 0; 1}
        An ∈ {-26}
        Bn ∈ {-26}
        in ∈ {-26}
        jn ∈ {-26}
        A ∈ [-3.25 .. -3.125]
        B ∈ [-3.25 .. -3.125]
        i ∈ [-3.25 .. -3.125]
        j ∈ [-3.25 .. -3.125]
        =END OF DUMP==
an1.c:15:[value] Function compute: postcondition got status valid.
[value] DUMPING STATE of file an1.c line 24
...
        c ∈ [-0.016591893509 .. 0.108195140958]
        l ∈ [-1. .. -0.994129657745]
        d ∈ [-0.999862372875 .. -0.989992439747]
        f ∈ [-0.141120016575 .. -0.0165918916464]
        g ∈ [-1. .. -0.994129657745]
        e ∈ [-0.016591893509 .. 0.108195140958]
        m ∈ [-1. .. -0.994129657745]
        n ∈ [-0.016591893509 .. 0.108195140958]
        N ∈ {-2; -1; 0; 1}
        An ∈ {-26}
        Bn ∈ {-26}
        in ∈ {-26}
        jn ∈ {-25}
        A ∈ [-3.25 .. -3.125]
        B ∈ [-3.25 .. -3.125]
        i ∈ [-3.25 .. -3.125]
        j ∈ [-3.125 .. -3.]
        =END OF DUMP==
[value] DUMPING STATE of file an1.c line 24
...

Some input sub-intervals for A, B, i, j make the analyzer uncertain about the post-condition N ≤ 11. Here are the first such inputs, obtained by searching for "unknown" in the log:

...
[value] DUMPING STATE of file an1.c line 24
 ...
        c ∈ [-0.850319802761 .. -0.778073191643]
        l ∈ [-0.628173649311 .. -0.526266276836]
        d ∈ [0.731688857079 .. 0.810963153839]
        f ∈ [0.585097253323 .. 0.681638777256]
        g ∈ [-1. .. -0.994129657745]
        e ∈ [-0.016591893509 .. 0.108195140958]
        m ∈ [-1. .. -0.994129657745]
        n ∈ [-0.016591893509 .. 0.108195140958]
        N ∈ {8; 9; 10; 11; 12}
        An ∈ {-26}
        Bn ∈ {-26}
        in ∈ {-18}
        jn ∈ {5}
        A ∈ [-3.25 .. -3.125]
        B ∈ [-3.25 .. -3.125]
        i ∈ [-2.25 .. -2.125]
        j ∈ [0.625 .. 0.75]
        =END OF DUMP==
an1.c:15:[value] Function compute: postcondition got status unknown.
...

Here is the complete list of value sets obtained for variable N on the entire search space:

$ cat log? | grep "N " | sort | uniq
        N ∈ {0; 1}
        N ∈ {0; 1; 2}
        N ∈ {0; 1; 2; 3}
        N ∈ {0; 1; 2; 3; 4}
        N ∈ {0; 1; 2; 3; 4; 5}
        N ∈ {0; 1; 2; 3; 4; 5; 6}
        N ∈ {-1; 0}
        N ∈ {-1; 0; 1}
        N ∈ {10; 11}
        N ∈ {10; 11; 12}
        N ∈ {-1; 0; 1; 2}
        N ∈ {-1; 0; 1; 2; 3}
        N ∈ {-1; 0; 1; 2; 3; 4}
        N ∈ {-10; -9}
        N ∈ {-10; -9; -8}
        N ∈ {-10; -9; -8; -7}
        N ∈ {-10; -9; -8; -7; -6}
        N ∈ {-10; -9; -8; -7; -6; -5}
        N ∈ {-10; -9; -8; -7; -6; -5; -4}
        N ∈ {-11; -10}
        N ∈ {-11; -10; -9}
        N ∈ {-11; -10; -9; -8}
        N ∈ {-11; -10; -9; -8; -7}
        N ∈ {-11; -10; -9; -8; -7; -6}
        N ∈ {-11; -10; -9; -8; -7; -6; -5}
        N ∈ {1; 2}
        N ∈ {-12; -11; -10}
        N ∈ {-12; -11; -10; -9}
        N ∈ {-12; -11; -10; -9; -8}
        N ∈ {-12; -11; -10; -9; -8; -7}
        N ∈ {-12; -11; -10; -9; -8; -7; -6}
        N ∈ {1; 2; 3}
        N ∈ {1; 2; 3; 4}
        N ∈ {1; 2; 3; 4; 5}
        N ∈ {1; 2; 3; 4; 5; 6}
        N ∈ {1; 2; 3; 4; 5; 6; 7}
        N ∈ {-13; -12; -11; -10; -9}
        N ∈ {-13; -12; -11; -10; -9; -8}
        N ∈ {-13; -12; -11; -10; -9; -8; -7}
        N ∈ {-2; -1}
        N ∈ {-2; -1; 0}
        N ∈ {-2; -1; 0; 1}
        N ∈ {-2; -1; 0; 1; 2}
        N ∈ {-2; -1; 0; 1; 2; 3}
        N ∈ {2; 3}
        N ∈ {2; 3; 4}
        N ∈ {2; 3; 4; 5}
        N ∈ {2; 3; 4; 5; 6}
        N ∈ {2; 3; 4; 5; 6; 7}
        N ∈ {2; 3; 4; 5; 6; 7; 8}
        N ∈ {-3; -2}
        N ∈ {-3; -2; -1}
        N ∈ {-3; -2; -1; 0}
        N ∈ {-3; -2; -1; 0; 1}
        N ∈ {-3; -2; -1; 0; 1; 2}
        N ∈ {3; 4}
        N ∈ {3; 4; 5}
        N ∈ {3; 4; 5; 6}
        N ∈ {3; 4; 5; 6; 7}
        N ∈ {3; 4; 5; 6; 7; 8}
        N ∈ {3; 4; 5; 6; 7; 8; 9}
        N ∈ {-4; -3}
        N ∈ {-4; -3; -2}
        N ∈ {-4; -3; -2; -1}
        N ∈ {-4; -3; -2; -1; 0}
        N ∈ {-4; -3; -2; -1; 0; 1}
        N ∈ {4; 5}
        N ∈ {4; 5; 6}
        N ∈ {4; 5; 6; 7}
        N ∈ {4; 5; 6; 7; 8}
        N ∈ {4; 5; 6; 7; 8; 9}
        N ∈ {4; 5; 6; 7; 8; 9; 10}
        N ∈ {-5; -4}
        N ∈ {-5; -4; -3}
        N ∈ {-5; -4; -3; -2}
        N ∈ {-5; -4; -3; -2; -1}
        N ∈ {-5; -4; -3; -2; -1; 0}
        N ∈ {5; 6}
        N ∈ {5; 6; 7}
        N ∈ {5; 6; 7; 8}
        N ∈ {5; 6; 7; 8; 9}
        N ∈ {5; 6; 7; 8; 9; 10}
        N ∈ {5; 6; 7; 8; 9; 10; 11}
        N ∈ {-6; -5}
        N ∈ {-6; -5; -4}
        N ∈ {-6; -5; -4; -3}
        N ∈ {-6; -5; -4; -3; -2}
        N ∈ {-6; -5; -4; -3; -2; -1}
        N ∈ {-6; -5; -4; -3; -2; -1; 0}
        N ∈ {6; 7}
        N ∈ {6; 7; 8}
        N ∈ {6; 7; 8; 9}
        N ∈ {6; 7; 8; 9; 10}
        N ∈ {6; 7; 8; 9; 10; 11}
        N ∈ {6; 7; 8; 9; 10; 11; 12}
        N ∈ {-7}
        N ∈ {7}
        N ∈ {-7; -6}
        N ∈ {-7; -6; -5}
        N ∈ {-7; -6; -5; -4}
        N ∈ {-7; -6; -5; -4; -3}
        N ∈ {-7; -6; -5; -4; -3; -2}
        N ∈ {-7; -6; -5; -4; -3; -2; -1}
        N ∈ {7; 8}
        N ∈ {7; 8; 9}
        N ∈ {7; 8; 9; 10}
        N ∈ {7; 8; 9; 10; 11}
        N ∈ {7; 8; 9; 10; 11; 12}
        N ∈ {7; 8; 9; 10; 11; 12; 13}
        N ∈ {-8; -7}
        N ∈ {-8; -7; -6}
        N ∈ {-8; -7; -6; -5}
        N ∈ {-8; -7; -6; -5; -4}
        N ∈ {-8; -7; -6; -5; -4; -3}
        N ∈ {-8; -7; -6; -5; -4; -3; -2}
        N ∈ {8; 9}
        N ∈ {8; 9; 10}
        N ∈ {8; 9; 10; 11}
        N ∈ {8; 9; 10; 11; 12}
        N ∈ {8; 9; 10; 11; 12; 13}
        N ∈ {9; 10}
        N ∈ {9; 10; 11}
        N ∈ {9; 10; 11; 12}
        N ∈ {9; 10; 11; 12; 13}
        N ∈ {-9; -8}
        N ∈ {-9; -8; -7}
        N ∈ {-9; -8; -7; -6}
        N ∈ {-9; -8; -7; -6; -5}
        N ∈ {-9; -8; -7; -6; -5; -4}
        N ∈ {-9; -8; -7; -6; -5; -4; -3}

The above is pretty cute, and it is just short enough for manual inspection. The inspection reveals that all the lines with dangerous values for N contain the number 12, so that we can grep for that. We arrive at the following list of dangerous value sets for N:

        N ∈ {10; 11; 12}
        N ∈ {6; 7; 8; 9; 10; 11; 12}
        N ∈ {7; 8; 9; 10; 11; 12}
        N ∈ {7; 8; 9; 10; 11; 12; 13}
        N ∈ {8; 9; 10; 11; 12}
        N ∈ {8; 9; 10; 11; 12; 13}
        N ∈ {9; 10; 11; 12}
        N ∈ {9; 10; 11; 12; 13}

Note: it is normal that each dangerous line contains 12. Each displayed set over-approximates the values N actually takes, so if the safety property we are trying to establish is true, each set must contain at least one safe value (11 or less). Moreover, the sets contain several values as the result of approximations in interval arithmetic, so that the sets of values for N are intervals of integers. Any interval that contains both a safe value and an unsafe value must therefore contain the largest safe value, 11, and the smallest unsafe value, 12.

Conclusion

We have improved our estimate of the maximum value of N from 40 to 13. That's pretty good, although it is not quite enough to conclude that the donut program is safe.

In conclusion to this post, now that we have ascertained that a dangerous set for N is a set that contains 12, we can compute the ratio of the search space for which we haven't yet established that the program is safe. We are exploring a dimension-4 cube by dividing it into sub-cubes. There are 52⁴ = 7 311 616 sub-cubes. Of these, this many are still not known to be safe:

$ cat log? | grep "N " | grep " 12" | wc -l
360684

We are 95% there. The remaining 5% shouldn't take more than another 95% of the time.

Friday, September 9 2011

Fun with constants

Another facetious colleague reports a strange behavior with the following C program:

#include <stdio.h>

int main (void) {
  int x = 100;
  int y = 042;
  int z = 005;
  printf ("%d\n", x+y+z);
  return (x+y+z) - 147;
}

Tuesday, July 26 2011

Fun with usual arithmetic conversions

A facetious colleague reports the following program as a bug:

int main () {
  signed char c=0;
  while(1) c++;
  return 0;
}

The command line frama-c -val -val-signed-overflow-alarms charpp.c, he says, does not emit any alarm for the c++; instruction.


Indeed, the c++; above is equivalent to c = (signed char) ((int)c + 1);.

If the implementation-defined (6.3.1.3 §3) conversion from int to signed char is assumed to give the expected modulo result, this instruction never exhibits undefined behavior. The really dangerous operation, signed addition, takes place between arguments of type int and returns an int. Thus, in this particular program and for the default x86_32 target architecture, the result of the addition is always representable in the result type.
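A small stand-alone illustration of this desugaring, added here by way of example; SCHAR_MAX and the wrap-around to -128 assume a typical two's-complement platform:

#include <stdio.h>
#include <limits.h>

int main (void) {
  signed char c = SCHAR_MAX;   /* 127 */
  /* c++ is evaluated as c = (signed char)((int)c + 1): the addition
     happens in int, where 127 + 1 == 128 is representable, so there
     is no undefined signed overflow. Only the conversion of 128 back
     to signed char is implementation-defined (6.3.1.3 §3). */
  c = (signed char)((int)c + 1);
  printf ("%d\n", (int)c);     /* typically prints -128 */
  return 0;
}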
