## Better is the enemy of good... sometimes

By pascal on Monday, September 12 2011, 13:38 - Permalink

This post is about widening. This technique was shown in the second part of
a previous post about memcpy(),
where it was laboriously used to analyze imprecisely
function `memcpy()`

as it is usually written.
The value analysis in Frama-C has the ability to summarize loops
in less time than they would take to execute. Indeed, it can
analyze in finite time loops for which it is not clear whether
they terminate at all. This is the result of widening, the
voluntary loss of precision in loops.
As a side-effect, widening can produce some
counter-intuitive results. This post is about these
counter-intuitive results.

## The expected: better information in, better information out

Users generally expect that the more you know about the memory state before the interpretation of a piece of program, the more you should know about the memory state after this piece of program. Users are right to expect this, and indeed, it is generally true:

int u; main(){ u = Frama_C_interval(20, 80); if (u & 1) u = 3 * u + 1; // odd else u = u / 2; // even }

The command `frama-c -val c.c share/builtin.c`

shows the final value of `u`

to be in `[10..241]`

.
If the first line of `main()`

is changed to make the input value range
over `[40..60]`

instead of `[20..80]`

, then the output
interval for `u`

becomes `[20..181]`

. This seems normal :
you know more about variable `u`

going in (all the values in `[40..60]`

are included in `[20..80]`

), so you know more about `u`

coming out (`[20..181]`

is included in `[10..241]`

).
Most of the analyzer's weaknesses, such as, in this case, the inability
to recognize that the largest value that can get multiplied by 3 is
respectively 79 or 59, do not compromise this general principle. But widening
does.

## Better information in, worse information out

Widening happens when there is a loop in the program, such as in the example below:

int u, i; main(){ u = Frama_C_interval(40, 60); for (i=0; i<10; i++) u = (25 + 3 * u) / 4; }

$ frama-c -val loop.c share/builtin.c ... [value] Values for function main: u ∈ [15..60]

Analyzing this program with the
default options produce the interval `[15..60]`

for `u`

.
Now, if we change the initial set for `u`

to
be `[25..60]`

, an interval that contains all the values
from `[40..60]`

and more:

$ frama-c -val loop.c share/builtin.c ... [value] Values for function main: u ∈ [25..60]

By using a larger, less precise set as the input of this analysis, we obtain a more precise result. This is counter-intuitive, although it is rarely noticed in practice.

## Better modelization, worse analysis

There are several ways to make this unpleasant situation less likely to be noticed. Only a few of these are implemented in Frama-C's value analysis. Still, it was difficult enough to find a short example to illustrate my next point, and the program is only as short as I was able to make it. Consider the following program:

int i, j; double ftmp1; double Frama_C_cos(double); double Frama_C_cos_precise(double); #define TH 0.196349540849361875 #define NBC1 14 #define S 0.35355339 double cos(double x) { return Frama_C_cos(x); } main(){ for (i = 0; i < 8; i++) for (j = 0; j < 8; j++) { ftmp1 = ((j == 0) ? S : 0.5) * cos ((2.0 * i + 1.0) * j * TH); ftmp1 *= (1 << NBC1); } }

If you analyze the above program with a smartish
modelization of function `cos()`

, one that recognizes
when its input is an interval `[l..u]`

of floating-point
values on which `cos()`

is monotonous, and computes
the optimal output in this case, you get
the following results:

$ frama-c -val idct.c share/builtin.c ... ftmp1 ∈ [-3.40282346639e+38 .. 8192.]

Replacing the `cos()`

modelization by
a more naive one that returns `[-1. .. 1.]`

as soon as its input set isn't a singleton, you get
the improved result:

$ frama-c -val idct.c share/builtin.c ... ftmp1 ∈ [-8192. .. 8192.]

This is slightly different from the `[25..60]`

example, in that here we are not changing the input
set but the modelization of one of the functions involved
in the program. Still, the causes are the same, and the
effects just as puzzling when they are noticeable.
And this is why, until version Carbon, only
the naive modelization of `cos()`

was
provided.

In the next release, there will be, in addition to the naive versions, `Frama_C_cos_precise()`

and
`Frama_C_sin_precise()`

built-ins for when you want them. Typically, this will be for the more precise analyses that do not involve widening.