Frama-C news and ideas

To content | To menu | To search | Frama-C Home

Better is the enemy of good... sometimes

This post is about widening. This technique was shown in the second part of a previous post about memcpy(), where it was laboriously used to analyze imprecisely function memcpy() as it is usually written. The value analysis in Frama-C has the ability to summarize loops in less time than they would take to execute. Indeed, it can analyze in finite time loops for which it is not clear whether they terminate at all. This is the result of widening, the voluntary loss of precision in loops. As a side-effect, widening can produce some counter-intuitive results. This post is about these counter-intuitive results.

The expected: better information in, better information out

Users generally expect that the more you know about the memory state before the interpretation of a piece of program, the more you should know about the memory state after this piece of program. Users are right to expect this, and indeed, it is generally true:

int u;

main(){
  u = Frama_C_interval(20, 80);
  if (u & 1)
    u = 3 * u + 1; // odd
  else
    u = u / 2; // even
}

The command frama-c -val c.c share/builtin.c shows the final value of u to be in [10..241]. If the first line of main() is changed to make the input value range over [40..60] instead of [20..80], then the output interval for u becomes [20..181]. This seems normal : you know more about variable u going in (all the values in [40..60] are included in [20..80]), so you know more about u coming out ([20..181] is included in [10..241]). Most of the analyzer's weaknesses, such as, in this case, the inability to recognize that the largest value that can get multiplied by 3 is respectively 79 or 59, do not compromise this general principle. But widening does.

Better information in, worse information out

Widening happens when there is a loop in the program, such as in the example below:

int u, i;

main(){  
  u = Frama_C_interval(40, 60);
  for (i=0; i<10; i++)
    u = (25 + 3 * u) / 4;
}
$ frama-c -val loop.c share/builtin.c
...
[value] Values for function main:
          u ∈ [15..60]

Analyzing this program with the default options produce the interval [15..60] for u. Now, if we change the initial set for u to be [25..60], an interval that contains all the values from [40..60] and more:

$ frama-c -val loop.c share/builtin.c
...
[value] Values for function main:
          u ∈ [25..60]

By using a larger, less precise set as the input of this analysis, we obtain a more precise result. This is counter-intuitive, although it is rarely noticed in practice.

Better modelization, worse analysis

There are several ways to make this unpleasant situation less likely to be noticed. Only a few of these are implemented in Frama-C's value analysis. Still, it was difficult enough to find a short example to illustrate my next point, and the program is only as short as I was able to make it. Consider the following program:

int i, j;
double ftmp1;

double Frama_C_cos(double);
double Frama_C_cos_precise(double);

#define TH 0.196349540849361875
#define NBC1 14
#define S 0.35355339

double cos(double x)
{
  return Frama_C_cos(x);
}

main(){  
  for (i = 0; i < 8; i++)
    for (j = 0; j < 8; j++)
      {
        ftmp1 = ((j == 0) ? S : 0.5) *
          cos ((2.0 * i + 1.0) * j * TH);
	ftmp1 *= (1 << NBC1);
      }
}

If you analyze the above program with a smartish modelization of function cos(), one that recognizes when its input is an interval [l..u] of floating-point values on which cos() is monotonous, and computes the optimal output in this case, you get the following results:

$ frama-c -val idct.c share/builtin.c
...
          ftmp1 ∈ [-3.40282346639e+38 .. 8192.]

Replacing the cos() modelization by a more naive one that returns [-1. .. 1.] as soon as its input set isn't a singleton, you get the improved result:

$ frama-c -val idct.c share/builtin.c
...
          ftmp1 ∈ [-8192. .. 8192.]

This is slightly different from the [25..60] example, in that here we are not changing the input set but the modelization of one of the functions involved in the program. Still, the causes are the same, and the effects just as puzzling when they are noticeable. And this is why, until version Carbon, only the naive modelization of cos() was provided.

In the next release, there will be, in addition to the naive versions, Frama_C_cos_precise() and Frama_C_sin_precise() built-ins for when you want them. Typically, this will be for the more precise analyses that do not involve widening.