## Helping the value analysis — part 2

By pascal on Sunday, March 27 2011, 15:15 - Permalink

## How it started

In the last post, we started a new series about helping the value analysis be more precise. We introduced an interpolation function with the goal of verifying that it is free of run-time errors. Having launched a first value analysis, we got one alarm we need to take care of.

interp.c:28:[kernel] warning: float operation: assert \is_finite((double )((double )((double )(y1*(double )(x-x0))+(double ) (y0*(double )(x1-x)))/(double ) (x1-x0)));

This alarm is emitted because the analyzer is unsure that the value computed by the function is not an infinite or NaN. The suspicion originates from the division by `x1 - x0`

. Furthermore, we informally deduced from `x0 = t[i].x`

and `x1 = t[i+1].x`

, that `x1`

is always strictly greater than `x0`

, and more specifically that `x1 - x0 >= 0.125`

.

In this post we show how to eliminate the alarm formally. For a change of style, we start with a few attempts that are natural but do not work.

## Getting more precise

From the Value Analysis manual and former posts we know already that using the option `-slevel`

with an adequate argument may (in some cases) help the analyzer already in being more precise. This is the first thing to try:

frama-c -val -slevel 10 -main interp interp.c

We obtain the same results as before. The alarm does not disappear. The analyzer still does not know if the variables of `x1`

and `x0`

may contain the same value or (as they are `double`

variables) if they may be very close to each other, which would mean the subtraction `x1 - x0`

may be very close to zero.

We further investigate the output of the analysis:

[value] Values for function interp: x0 ∈ [0. .. 5.] x1 ∈ [0.125 .. 10.] y0 ∈ [1. .. 74.2099485248] y1 ∈ [1.00782267783 .. 11013.2329201] i ∈ {0; 1; 2; 3; 4; 5; 6; } __retres ∈ [-1.79769313486e+308 .. 1.79769313486e+308]

We can see that the expected minimum and maximum return values are very large, occupying the entire range allowed by a double. This is consistent with the fact that the value analysis is unable to guarantee there is no overflow in the computation of this result.

A natural first thought could be to write an assertion containing
what we deduced already from looking at the code, namely `x1 - x0 >= 0.125`

.
In fact, let's modify the program so:

y1 = ...; //@ assert x1 - x0 >= 0.125; Frama_C_show_each_diff(x1 - x0); return (y1 * (x - x0) + y0 * (x1 - x)) / (x1 - x0); }

Inserting this assertion could be useful in two ways:
the analyzer could tell us that it is true, which would be reassuring;
and it could take advantage of the information it contains to improve
the precision of its results. The call to `Frama_C_show_each...`

only
helps to observe the improvements.

frama-c -val -slevel 10 -main interp interp.c ... interp_try.c:29:[value] Assertion got status unknown. [value] Called Frama_C_show_each_diff([-4.875 .. 10.])

Here, neither of the two ways work: the analyzer cannot decide
if this assertion is correct, and it is
also unable to take advantage of the provided information to eliminate the
alarm or, for that matter, to improve its prediction of the value of
`x1 - x0`

, which still contains zero and values close to zero.

But why? The states propagated by the value analysis assign values to individual
variables, such as `x0`

and `x1`

. Looking again at the final values, we see that `x0`

and `x1`

are both interpreted as the
following floating-point ranges:

x0 ∈ [0. .. 5.] x1 ∈ [0.125 .. 10.]

Providing information about `x1 - x0`

when `x0`

and `x1`

are already imprecise has the following drawbacks:

- the analyzer cannot tell whether the assertion is true or false, because it is already manipulating imprecise values for
`x0`

and`x1`

that could make the formula true or false. - Even assuming it was strong at symbolic manipulations and able to transform
`x1 - x0 >= 0.125`

into`x1 >= x0 + 0.125`

(or a similar formula that is actually a correct transformation in double-precision floating-point arithmetics), to improve the value of`x1`

, it would be impaired by the fact that`x0`

is already imprecise, so that the formula doesn't tell it anything new about the value of`x1`

. And similarly when trying to take advantage of the formula to improve the value of`x0`

.

The value analysis does not represent information about the
relationship the values of `x0`

and `x1`

may have. Therefore, it
considers possibilities here that do not exist in a real execution.
Since the states propagated by the analysis assign values to
variables only, the best way to provide information is to write assertions
that directly tell about the values of variables:

x0 = t[i].x; /*@ assert x0 == 0. || x0 == 0.125 || x0 == 0.25 || x0 == 0.5 || x0 == 1. || x0 == 2. || x0 == 5. ; */ y0 = t[i].y; x1 = t[i+1].x; /*@ assert x1 == 0.125 || x1 == 0.25 || x1 == 0.5 || x1 == 1. || x1 == 2. || x1 == 5. || x1 == 10. ; */ y1 = t[i+1].y; Frama_C_show_each_diff(x1 - x0); return (y1 * (x - x0) + y0 * (x1 - x)) / (x1 - x0); }

We augment the option `-slevel`

since the two assertions multiply the number
of possibilities:

frama-c -val -slevel 100 -main interp interp.c ... [value] Called Frama_C_show_each_diff(-4.875) [value] Called Frama_C_show_each_diff(-4.75) [value] Called Frama_C_show_each_diff(-4.5) ... [value] Called Frama_C_show_each_diff({0; }) ... [value] Called Frama_C_show_each_diff(5.) [value] Called Frama_C_show_each_diff(10.) ... interp_try.c:36:[kernel] warning: float operation: assert \is_finite((double )((double )((double )(y1*(double )(x-x0))+(double ) (y0*(double )(x1-x)))/(double ) (x1-x0)));

The alarm remains. This was to be expected, because
the analyzer does not know that certain combinations of values for the pair
`(x0, x1)`

are impossible, which include the cases where `x0 == x1`

. To remove these cases, we can instead provide an assertion about the possible pairs of `x0`

and `x1`

:

/*@ assert x0 == 0. && x1 == 0.125 || x0 == 0.125 && x1 == 0.25 || x0 == 0.25 && x1 == 0.5 || x0 == 0.5 && x1 == 1. || x0 == 1. && x1 == 2. || x0 == 2. && x1 == 5. || x0 == 5. && x1 == 10. ; */

Looking at the output again, we notice that the alarm has disappeared.
The values displayed for `x1 - x0`

show that the assertion was taken into account. Finally, the return value is in a more acceptable range now.

[value] Called Frama_C_show_each_diff(5.) [value] Called Frama_C_show_each_diff(3.) [value] Called Frama_C_show_each_diff(1.) [value] Called Frama_C_show_each_diff(0.5) [value] Called Frama_C_show_each_diff(0.25) [value] Called Frama_C_show_each_diff(0.125) [value] Called Frama_C_show_each_diff(0.125) ... [value] Values for function interp: ... __retres ∈ [-16801.608905 .. 881132.843557]

Still, one problem remains. The assertion can not be verified by the value analysis:

interp_try.c:29:[value] Assertion got status unknown.

The alarm is gone, but only under the assumption that the information provided in the above assertion is correct. And as far as the analyzer is concerned, it is unfortunately not guaranteed.

This attempt is only half-successful because we are trying to reduce
and divide the propagated state too late, at a point where the
values for `x0`

and `x1`

have already been computed, and the information
about their relationship already lost. Ideally, we would like to nudge
the analyzer into finding out about the relationship by itself. Being
non-relational, it cannot
represent the relationship inside a single state, but it can represent
it virtually by using several states. This is, in fact, what we have
been forcing it to do with the last assertion.

## How to improve the provided hint?

The computation of the formula, as investigated so far, is
dependent on the possible values of `x0, x1, y0`

and `y1`

.
Looking again at the function, we can see that the respective
computations of `x0, x1, y0`

and `y1`

are dependent on the value of `i`

immediately before.
Due to the postcondition of the `choose_segment`

function, we know that only integer values between `0`

and `6`

might be assigned to `i`

. This post-condition is taken into account by the value analysis and the information
restituted in the final values displayed for `interp`

:

i ∈ {0; 1; 2; 3; 4; 5; 6; }

Depending on the value of `i`

there are actually only seven possible cases.

x0 == 0. && x1 == 0.125 && y0 == 1. && y1 == 1.00782267782571089 || x0 == 0.125 && x1 == 0.25 && y0 == 1.00782267782571089 && y1 == 1.03141309987957319 || x0 == 0.25 && x1 == 0.5 && y0 == 1.03141309987957319 && y1 == 1.1276259652063807 || x0 == 0.5 && x1 == 1. && y0 == 1.1276259652063807 && y1 == 1.54308063481524371 || x0 == 1. && x1 == 2. && y0 == 1.54308063481524371 && y1 == 3.76219569108363139 || x0 == 2. && x1 == 5. && y0 == 3.76219569108363139 && y1 == 74.2099485247878476 || x0 == 5. && x1 == 10. && y0 == 74.2099485247878476 && y1 == 11013.2329201033244

Now we want to perform a case analysis to make the analyzer propagate only those 7 specific cases.
A new appoach could be to formulate an assertion to make the analysis
be split before the computation of the four variables by using information the analyzer already has, the range of values for `i`

.
The function could now look like this:

/*@ requires 0. <= x <= 10. ; */ double interp(double x) { double x0, x1, y0, y1; int i; i = choose_segment(x); //@ assert i == 0 || i == 1 || i == 2 || i == 3 || i == 4 || i == 5 || i == 6 ; x0 = t[i].x; y0 = t[i].y; x1 = t[i+1].x; y1 = t[i+1].y; Frama_C_show_each_x0_x1(x0, x1); return (y1 * (x - x0) + y0 * (x1 - x)) / (x1 - x0); }

Running Frama-C, this time we achieve success. The alarm is gone and the hint assertion can at the same time be verified.
The call to `Frama_C_show_each_x0_x1(x0, x1)`

confirms the absence of any case in which `x0`

and `x1`

contain the same value. Thereby, it is possible to show that the divisor is not zero or a value too close to zero.

frama-c -val -slevel 100 -main interp interp.c ... interp.c:25:[value] Assertion got status valid. [value] Called Frama_C_show_each_x0_x1(5.,10.) [value] Called Frama_C_show_each_x0_x1(2.,5.) [value] Called Frama_C_show_each_x0_x1(1.,2.) [value] Called Frama_C_show_each_x0_x1(0.5,1.) [value] Called Frama_C_show_each_x0_x1(0.25,0.5) [value] Called Frama_C_show_each_x0_x1(0.125,0.25) [value] Called Frama_C_show_each_x0_x1({0; },0.125) ... [value] Values for function interp: ... __retres ∈ [-11013.2329201 .. 11161.6528172]

Regarding the return value, again we obtain a result in an acceptable range. The difference with the previous approach is that this time the analyzer is able to verify the assertion. Therefore, this last approach is not only less verbose but also safer than the former.

In conclusion, case analysis is a powerful tool to help Frama-C's value analysis, especially when the information required to reach the verification goal is relational in nature. When this technique works, the assertion to split the different cases shouldn't be inserted too late, when the analyzer has no chance to prove that the cases offered cover all the possibilities. And it shouldn't be inserted too early, lest the information be lost again, the number of combinations become too big, or the analyzer waste resources on irrelevant parts of the target program.

This post is co-authored by Kerstin Hartig.