Frama-C news and ideas

To content | To menu | To search | Frama-C Home

Tuesday, February 14 2012

Checking for overflows, revisited once

I do not have any solution I am 100% happy with to the overflow dilemma in the previous post. Here is one of the solutions that does not make me 100% happy.


The first (partial) solution is: program so that overflows correspond exactly to unwanted circumstances (and then it becomes meaningful to verify the absence of overflows. There are great automatic tools for that. Have you heard of them?)

“Program so that…” sounds abstract, but it may be as simple as using larger integer types that won't overflow for all intermediate computations:

  s = (int)((long long) u1 - (long long)u2);

Then the desired property (variable s contains the mathematical difference of u1 and u2) entirely rests on the final conversion from long long to int.


This solution doesn't apply in the common situation where the code is already written and gratuitous changes to it are prohibited. Also, you need to have an integer type large enough to contain all intermediate computations without overflows. In this example, unsigned int is 32-bit and long long is 64-bit, but the standard does not prevent unsigned int to be 64-bit and the same size as long long, in which case the conversion from unsigned int to long long could overflow. Rte would only insert additional assertions for the two intermediate conversions that we have added in the assignment. This wouldn't help; it would only make things worse.


“Won't the generated code be less efficient?”, you rightly ask. That all depends on the compiler. Optimizing compilers have no trouble at all optimizing the above assignment so that it uses only one 32-bit subtraction. On the Linux workstation on which I tried this, both clang -m32 -O0 and gcc -m32 -O0 did. It is amusing, because the code generated by both compilers with -O0 is ugly (gratuitously moving registers to memory and then to registers again) but, even with optimizations disabled, they still both get the fact that it is only necessary to compute the least significant 32 bits of the result.

With optimizations, both Clang and GCC generate clean code that naturally still uses a single 32-bit subtraction:

$ gcc -S -O2 -m32 t.c
$ cat t.s
...
f:
	pushl	%ebp
	movl	%esp, %ebp
	movl	8(%ebp), %eax
        subl	12(%ebp), %eax
	popl	%ebp
	ret

As an anecdote, when generating x86-64 code (and therefore having 64-bit subtraction available as a single instruction, with only the drawback of a slightly larger encoding), gcc -O0 keeps using 32-bit subtraction but clang -O0 uses 64-bit subtraction. This does not reveal anything about the compilers: it does not make sense to compare for efficiency the code they generate with optimization disabled.


That's all for this post.

Friday, February 10 2012

Checking for overflows operation by operation

My colleague Bernard Botella pointed out an interesting example in an offline discussion following the last quiz.

The setup

Consider the snippet:

int s;
unsigned u1, u2;
...
s = u1 - u2;

The programmer's intention with the assignment is to compute, in variable s of type int, the mathematical difference between the value of u1 and the value of u2. You would expect that, using automatic plug-ins in Frama-C, ey would be able to check that the initial values for u1 and u2 are in ranges for which this is always what happens.

In fact, this looks like a perfect assignment for Rte and the value analysis. Rte can generate assertions for the various conversions and the value analysis can check whether the conditions are satisfied, following the method outlined in a recent post.

This is, after all, exactly what static analyzers — and coding rules that forbid overflows — are for.

An obvious false positive

The overflow in the subtraction u1 - u2 looks like it can be justified away. After all, such a warning is emitted when u1==3 and u2==5, and s will still receive the expected value -2 after the implementation-defined conversion from the value of expression u1 - u2, that is, UINT_MAX - 1, to int.

The programmer may think “okay, I can see why I get this warning, it happens as soon as u2 is larger than u1, but this is defined and, in my case, innocuous. I only want to check that the mathematical difference fits in s. The check about the absence of overflow in the conversion to int is what I am really interested in.”


Say that the programmer is using the Rte plug-in to express the conditions for both the subtraction and the implicit conversion to int to be safe, pretty-printing an annotated program as a result:

$ frama-c -rte-unsigned-ov -rte t.c -print
...
void main(void)
{
  /*@ assert rte: (unsigned int)(u1-u2) ≤ 2147483647; */
  /*@ assert rte: u1-u2 ≥ 0; */
  s = (int)(u1 - u2);
  return;
}

Since the programmer has seen that the second assertion, about an overflow in the subtraction u1 - u2, is too restrictive, preventing use with u1==3 and u2==5, ey removes it. The programmer instead focuses on making sure that there is no overflow in the conversion to int of the difference, and feels confident if ey sees that there isn't any warning about that. It seems normal, right?

void main(void)
{
  /*@ assert rte: (unsigned int)(u1-u2) ≤ 2147483647; */
  s = (int)(u1 - u2);
  return;
}

But…

Consider the case where u1 is very small, u2 is very large, and the value analysis has been able to infer this information relatively precisely. We assume 0 ≤ u1 ≤ 10 and 3000000000 ≤ u2 ≤ 4000000000 for this example:

/*@ requires 0 <= u1 <= 10 ;
    requires 3000000000 <= u2 <= 4000000000 ; */
void main(void)
{
  /*@ assert rte: (unsigned int)(u1-u2) ≤ 2147483647; */
  s = (int)(u1 - u2);
  return;
}

In this case, the value analysis does not warn about the conversion to int ...

$ frama-c -val -lib-entry u.c
...
u.c:8:[value] Assertion got status valid.

... but the value in s is not the mathematical difference between u1 and u2.

Conclusion

The conclusion is that looking at overflows atomically is not a silver bullet. Granted, if all possible overflows in the program are verified to be impossible, then machine integers in the program observationally behave like mathematical integers. But the more common case is that some overflows do in fact happen in the program and are justified as benign by the programmer. The issue pointed out by Bernard Botella is that it is possible to justify an overflow, for which the rule seems too strict, by exhibiting input values for which the expected result is computed, and then to be surprised by the consequences.

Perhaps in another post, we can look at ways around this issue.

Saturday, February 4 2012

Using the Rte and value analysis plug-ins to detect overflows

This post is another of the methodological cheat cards that made up much of this blog at its beginnings, before I decided that controversial comparisons between static analyzers were more fun to write.

The problem: detecting semantic coding rules transgressions

By default, Frama-C's value analysis does not warn about integer overflows. You need to use option -val-signed-overflow-alarms for an alarm to be emitted whenever it looks like one of these might happen. Even then, the value analysis only warns about overflows occurring in signed arithmetic operations. It does not warn about overflows in unsigned operations, because these are defined. It does not warn either about overflows in conversions, such as (unsigned int)(-1), that is defined because the destination type is unsigned, or (int) UINT_MAX, that is implementation-defined (and in practice, usually reveals the 2's complement representation used by the target architecture).

In consequence, the value analysis does not warn about the post-decrement operation in the following example, sent in by a faithful user.

main(){
    unsigned int nb_trames = Frama_C_interval (0, 5120);
    nb_trames -- ;
    nb_trames /= 256 ;
}

The faithful user noticed that something was wrong in the function because the range computed by the value analysis for variable nb_trames included the case where nb_trames was initially 0 and eventually UINT_MAX / 256. This did not match the specification, but it was pure luck that the underflow was detected through this side-effect. Sometimes, one would like a possible underflow such as nb_trames -- ; in the above example to be detected as wrong in itself, not because it is forbidden by the C standard (it isn't), but because it is forbidden by specifically applicable coding standards.


Speaking of blogging beginnings and of controversy, the position on coding standards that I expressed in my very first blog post was that "any fool can invent theirs, and some have". Another thing that fools reputedly do is not change their minds. Perhaps coincidentally, my opinion on the subject has mellowed a bit. Still, there are many of these coding standards, and I am unwilling to multiply, in the value analysis, options similar to -val-signed-overflow-alarms but for the various conditions that someone somewhere might think undesirable. In any case, this unwillingness is only my opinion, and this opinion is only in the context of the value analysis plug-in. Anyone is encouraged to write Frama-C plug-ins for verifying any coding standard they like, and some have.


One of the highlights of the last U3CAT meeting was, in an industrial case study, the combination of the value analysis and Rte plug-ins to detect overflows in unsigned arithmetic operations or in any kind of integer conversion.

The Frama-C Rte plug-in

The Rte plug-in can annotate a program with ACSL assertions that exclude unwanted conditions at execution. As the name of the plug-in indicates, the initial motivation was to have assertions that excluded run-time errors, but Rte is actually quite flexible in what it allows to forbid. In particular, it allows to generate assertions that forbid the unsigned underflow from our example:

$ frama-c -rte-unsigned-ov -rte m.c -print
...
extern int ( /* missing proto */ Frama_C_interval)();
int main(void)
{
   int __retres;
   unsigned int nb_trames;
   nb_trames = (unsigned int)Frama_C_interval(0,5120);
   /*@ assert rte: nb_trames-1U ≥ 0; */
   nb_trames --;
   nb_trames /= (unsigned int)256;
   __retres = 0;
   return (__retres);
}

The normalization step applied to any program that goes through Frama-C shows, but the original program is quite recognizable, with the ACSL assertion nb_trames-1U ≥ 0; strategically inserted just before nb_trames is decremented.

Evaluating with -val assertions emitted by Rte

Assigning a truth value to the assertions inserted by Rte is as simple as running the value analysis on the annotated program it produced. It can be done in a single command like so:

$ frama-c -rte-unsigned-ov -rte m.c share/builtin.c -then -val

And it works: the value analysis rightly detects that the conditions for the absence of unsigned overflow are not met.

m.c:3:[value] Assertion got status unknown.

Credits

John Regehr convincingly argued the case of option -val-signed-overflow-alarms at a time when it didn't exist yet. Faithful user Jean-Marc Harang provided the example used in this post, and he also authored the referred presentation at the last U3CAT meeting. The Rte plug-in is developed by Philippe Herrmann.