Frama-C news and ideas

To content | To menu | To search | Frama-C Home

Tag - linking

Entries feed - Comments feed

Friday, July 27 2012

Oxygen is stricter about types and why you should get used to it

I have just sent a list of changewishes (1, 2) to a static analysis competition mailing-list, and that reminded me of a blog post I had to write on the strictness of the type-checker in upcoming Frama-C release Oxygen. This is the blog post.

This post is not about uninitialized variables

The static analysis competition in question evaluates each participating tool on many testcases. A good number of these testcases are rejected by the current Frama-C. This is why I was writing to the mailing-list: Frama-C cannot participate in the competition if the testcases are not changed (Frama-C does not have to participate, of course. It's just that it looks fun and we would probably like to be in it).

In fact, many of the testcases were already problematic for Frama-C version 20111001 (Nitrogen), the version in the current Debian, Ubuntu, FreeBSD, NetBSD and other Unix distributions. Indeed, a lot of the testcases rely on uninitialized variables for entropy, which this post by Xi Wang and this post by Mark Shroyer show is wrong. Instead of the problem that is supposed to be detected (or not), Nitrogen detects the uninitialized use. I covered this already; this blog post is not about uninitialized variables (keep reading!).

This post is about C type-checking

While trying to get a list of uninitialized-variable-using testcases, I realized that something had changed since my last evaluation of the competition's benchmarks. Many of them were now rejected at type-checking!

The new problem is, many testcases in the benchmarks call functions without having provided a prototype, and some ultimately define the called function with a type incompatible with the type inferred at the call site. Frama-C used to be lenient about typing issues, but after fixing one soundness bug too many that was caused by the leniency, we have decided to throw in the towel. Virgile described one of the typing issues for which Frama-C used to be too lenient. It was not the only one.

This is an unusual position to take. In the article A Few Billion Lines of Code Later: Using Static Analysis to Find Bugs in the Real World, the makers of a commercially successful bug-finding tool state:

Our product's requirements roughly form a "least common denominator" set needed by any tool that uses non-trivial analysis to check large amounts of code across many organizations; the tool must find and parse the code, [...]

The above is wise: to be maximally useful, the static analyzer should choke on as few idiosyncrasies as possible. The analyzed code can be assumed to compile with some compiler, and to go through some tests (there must still be a few of these, right)? Why warn when a function is called without a prototype, if the compiler accepted it? Why reject when the function's implementation has a type that is incompatible with the type that was inferred at the call site? This is probably not the issue the user is looking for (if it was, the user would have fixed it when eir compiler warned for it).

Oxygen is strict and that's it

Well, our answer is that we tried and we found that it was too much work to try to be sound and precise with these constraints, as exemplified by Virgile's blog post. Show us a tool that accepts your weakly typed program, and we will show you a tool that probably isn't sound and precise at the same time (we have kept the examples that led to the decision. These examples demonstrate real issues masked by lenient typing. If you think you have found a sound analyzer that is at the same time conveniently permissive on typing, it will be our pleasure to provide the examples for you to try).


We hope that Frama-C will still be useful with these new restrictions on typing. Fortunately for us, there are more real worlds than the expression “the Real World” in the cited article's title might lead you to think (and this quip should not be taken as a reproach towards the authors of “A Few Billion Lines of Code Later: Using Static Analysis to Find Bugs in the Real World”, a very useful document with an appropriately chosen title considering its intended public).

Tuesday, May 15 2012

On arrays vs. pointers

This post is a follow-up of bug 990. When this issue was resolved the following question arose: "Why is it so dangerous to have a global c declared as a pointer in one compilation unit and defined as an array in another one, given the fact that almost anywhere an array decays as a pointer to their first element when needed?" Well, the key to the answer lies in the *almost*. In fact, if a is an array, the associated pointer expression is &a[0], which means that it is not an lvalue, on the contrary to a pointer variable p. In other words, the compilation unit that sees c as a pointer can assign new values into it, while it is not really expected by the compilation unit where c is defined. Worse, this point cannot be checked by a compiler, since there's no type at linking phase, where the issue could be spotted. Take for instance the following two files: main.c

extern char* c;

extern char f(void);

int main () {
  char *oldc;
  char c1[] = "bar"; 
  c = c1;
  char test = f();
  c = oldc;
  switch (test) {
  case 'o': return 0;
  case 'r': return 1;
  default: return test;
  }
 }

and aux.c

char c [] = "foo";

char f() { return c[2]; }

gcc (Ubuntu/Linaro 4.6.3-1ubuntu5 to be precise) compiles them without any message:

gcc -o test aux.c main.c

Yet, what should be the exit status of test? One could imagine that it might be 0 if the compiler optimizes in f the fact that c is at a constant address, or 1 if it uses the actual address of c (treating it effectively as pointer) when f is called. In fact, on my machine, the exit status is a completely arbitrary number that depends on whatever lies in the memory when test is launched:

virgile@is220177:~/tmp$ ./test
virgile@is220177:~/tmp$ echo $?
98
virgile@is220177:~/tmp$ ./test
virgile@is220177:~/tmp$ echo $?
145

In summary, the future behavior of Frama-C, that will, starting from Oxygen, flatly reject such code (instead of just emitting a warning), is not exceedingly pedantic. Such a discrepancy between declarations is really an important issue.