Frama-C news and ideas

To content | To menu | To search | Frama-C Home

Tag - value-builtins

Entries feed - Comments feed

Monday, June 6 2011

Fixing robots, part 1

This blog post is a revised version of part of my submission to the ICPC 2011 Industry Challenge. Please, go ahead and read the challenge description. I could only paraphrase it without adding anything to it, and so I won't.

The study was made with the April development version of Frama-C, which differs from the last released version in a number of bugfixes (list of reported issues) and minor new features. Specifically, it uses the value analysis and dependencies computations (documented in this manual) and code navigation features. The latter reuse building blocks that were implemented for slicing.

The ever important first step: identify missing library functions

The first step, as the faithful reader of this blog would now know, before using the value analysis for anything is to identify library functions the program depends on:

frama-c impls.c main.c roco.c sim.c main_testcase_1.c -metrics -cpp-command "gcc -DTESTCASE=1 -C -E"

Undefined functions (10):
 __builtin_inf  (15 calls); __builtin_inff  (15 calls); fprintf  (2 calls);
 __swbuf  (10 calls); __builtin_fabsf  (15 calls); exp  (1 call);
 __builtin_infl  (15 calls); __builtin_fabsl  (15 calls);
 __builtin_fabs  (15 calls); fabs  (9 calls); 

The host computer's standard headers have been used, hence the confusing __builtin_ prefixes. Better not rely on them: the robot's operating system probably bears little resemblance to the host's. Placeholder headers provided with Frama-C can be used instead:

frama-c impls.c main.c roco.c sim.c main_testcase_1.c -metrics -cpp-command "gcc -DTESTCASE=1 -C -E -nostdinc -I. -I/usr/local/share/frama-c/libc"

Undefined functions (3):
 exp  (1 call); fabs  (9 calls); fprintf  (2 calls); 

Standard functions exp() and fabs() are missing from the Carbon release, but have been added to the development version. With the development version, one simply needs to list /usr/local/share/frama-c/libc/math.c as a file of the analysis project. Function exp() is implemented as a value analysis builtin, taking advantage of the host's own exp() function.

If you wish to reproduce at home, you have to provide your own implementation for exp(). It may not need to be very accurate, but it needs to be analyzable with Carbon's value analysis. A good way would be to clean up the implementation from Sun that everyone is using. As I already said in previous posts, accessing the bits of a floating-point representation directly is so 1990s. If someone removed these ugly and obsolete manipulations, we would end up with a better implementation for exp() (and it would be analyzable with the value analysis from Carbon).

For fabs(), you can use the function below. An implementation from a library would probably unset the sign bit in the IEEE 754 representation but, again, Carbon's value analysis doesn't handle this precisely yet. The implementation below is handled precisely in the sense that a singleton for the input x results in a singleton for the returned value.

double fabs(double x){
  if (x==0.0) return 0.0;
  if (x>0.0) return x;
  return -x;

Finally, calls to fprintf(), that do not have any repercussions on the continuing execution, can be replaced with calls to the Frama_C_show_each() built-in. We can let the pre-processor to do this for us, passing the option -Dfprintf=Frama_C_show_each to GCC when pre-processing.

Looking for unspecified/undefined behaviors

We are now ready to launch the value analysis. Since we anticipate testcases 2 and 3 to be very similar, we write a short script:


export CPP="gcc -Dfprintf=Frama_C_show_each -DTESTCASE=$1 -C -E -nostdinc -I. -I/usr/local/share/frama-c/libc"
FILES="/usr/local/share/frama-c/libc/math.c impls.c main.c roco.c sim.c"

exec frama-c ${FILES} ${TESTCASE} -val -slevel 50000 -unspecified-access -val-signed-overflow-alarms -calldeps -save state$1

About ten minutes after running 1 > log1, we obtain a log (long:~70MiB) and a state (~1MiB). The log is long because it contains progression messages that are only intended to help identify analysis issues, and can be ignored most of the time. The state contains all the information that has been computed about the program, including values of variables, in a very compact format.

The analysis is completely unrolled (because of option -slevel 50000) and precise until the end. This means that the value analysis has in effect simulated the execution of the program with the inputs provided in main_testcase_1.c.

The log, despite its size, does not warn about any of the undefined or unspecified behaviors that the value analysis is guaranteed to identify (uninitialized access, use of a dangling pointer, overflows in signed integer arithmetics, invalid memory access, invalid comparison of pointers, division by zero, undefined logical shift, overflows in conversions from floating-point to integer, infinite or NaN resulting from a floating-point operation, undefined side-effects in expressions). This is very important. It means that we can rest assured that the strange dynamic behavior we are going to investigate is not caused by the misuse of one of C's dangerous constructs. Nothing would be more frustating than having to track the value of a variable which, according to the source code, is not supposed to change, but is modified through a buffer overflow. The value analysis guarantees we won't have to do that for this execution.

What would of course be better would be to verify that there can be none of the above undefined behaviors for any command sequence. This would be a much stronger result, but would also require a lot more work. When simulating a single execution inside the analyzer, we only check that that particular execution is free of undefined behavior, but it does not require any work from us (only a few minutes of work from the computer).

Exploiting the value analysis' results for program comprehension

The values computed and stored in state1 can be observed in Frama-C's GUI, using the command-line frama-c-gui -load state1. The GUI can also be used to identify the definition site(s) of a variable's value: select the variable at the program point you are interested in, and in the contextual menu, invoke Dependencies → Show defs.

Here, it is the value of variable RoCo_engineVoltage as displayed by the call to fprintf() (that we transformed into a call to Frama_C_show_each()) that is wrong, so we request the definition site(s) of that value:


The GUI has pointed us to a call to function RoCo_Process() (using the yellow mark), so we now request the definition sites of RoCo_engineVoltage by the return; statement of that function. We obtain the two sites identified below:


The condition that decides which branch is executing is the one shown in the screenshot below.


The value analysis tell us the value of RoCo_isActive can be either 0 or 1 at this point during execution, but this variable is one of the variables whose value is printed in the analysis log, and its value was 1 at the instant we are interested in. We therefore focus on the definition site where the value assigned to RoCo_engineVoltage is computed in a call to PT1_Filter().

The dependencies of the particular call to PT1_Filter() we are interested in were computed by option -calldeps and can be found in the log. The call we are interested in is at statement 433. The log contains:

call PT1_Filter at statement 433:
 voltageFilter FROM state; x; t1; dt; voltageFilter
 \result FROM state; x; t1; dt; voltageFilter

Apart from reading its arguments state, x, t1, and dt, the call accesses a static variable voltageFilter. The address of voltageFilter is taken, so we have to be careful: this variable could be modified erroneously through a pointer (although the address-taking appears to be only to pass it to PT1_Filter(), which is innocent enough).

In fact, at this point, we have no idea which of the variables involved in the computation of the result of this call to PT1_Filter() is wrong. Clicking on a variable in the GUI provides the set of values for this variable at this program point, but this is still too imprecise here, since it mixes all 10000 or so passages through the statement.

Let us take advantage of the "blackboard" structure of the analyzed program and dump the entire program state at this statement, by inserting a call to Frama_C_dump_each(). See this previous post for a list of advantages of this built-in function over printf() or a debugger.

--- roco.c	(revision 12956)
+++ roco.c	(working copy)
@@ -293,6 +293,7 @@
                 desiredEngineVoltage, Engine_maxVoltage_PARAM);
         limitationActive = (Engine_maxVoltage_PARAM == desiredEngineVoltage) ||
                              (Engine_minVoltage_PARAM == desiredEngineVoltage);
+	Frama_C_dump_each();
         RoCo_engineVoltage = PT1_Filter (&voltageFilter, desiredEngineVoltage,
                 t13, dT);
         wasInit = init;

We need to launch the analysis again and find something to do for 10 minutes. This is a good time to start looking at bugs 2 and 3.

The log now contains state dumps for each passage through the statement where RoCo_engineVoltage is computed.

According to the ReadMe.txt in the challenge package, an order is given at t=50s. The log shows that this order fails to be executed speedily. The state dump at which lastTime contains 50000 and the next few ones show that of RoCo_engineVoltage's dependencies, variable desiredEngineVoltage is the one with the suspicious value: it is only -0.8, whereas parameters in file roco_config_testcase_1.h and values of the variable elsewhere in our log show that this voltage can go much higher. We are therefore left with the sub-problem of identifying why this variable has this value at this program point.

We use the same tools we have already used for RoCo_engineVoltage, this time applied to variable desiredEngineVoltage and this program point. The screenshot below shows the definitions sites for that value.


The value of variable desiredEngineVoltage is defined by the call to function Limiter_Out(), whose argument is in turn defined by the call to Interpolate_from_curve() above.

Option -calldeps computed the implicit inputs of this call, which can be found in the log:

call Interpolate_from_curve at statement 423:
 \result FROM curve; x;
              EngineSpeedToVoltage_CURVE{.numPoints; .x[0..4]; .y[1..3]; }

The state dump in which lastTime==50000 shows that a low value for angleDiffRequest is the cause for the low value of desiredEngineVoltage.

The "Show defs" action in the GUI finds three possible definition sites for this value of angleDiffRequest, shown in the screenshots below.




We find in the log that variable rampValue remains at 0 in the cycles that follow instant 50000. The value we observe for angleDiffRequest is compatible with the algorithm and values of variables at lines 264-277 of file roco.c. So it looks like the cause of the issue is the value of variable rampValue. Action "Show defs" in the GUI indicates that this value is computed by the call to Ramp_out() at line 240. The value of rampTarget is computed as 0.0 or 1.0 from a number of variables, and of these variables, RoCo_legAngleValid was always 1, and direction was always 0 or 1. The latter is suspicious, since in this execution, orders are given to move in both directions:


The command "Show defs" applied to variable direction shows that indeed, the variable may have been set to 0 or 1 in three different sites.


The site that corresponds to the MoveByAngle command, the middle one in the screenshot above, is suspicious: in the file main_testcase_1.c, the angle passed to this command is negative. This is not just a strange convention, because the computation below for the third definition site determined the direction for the other commands, that are all in the opposite direction, and variable direction was assigned 1 again there.

This suggests the fix below.

--- roco.c	(revision 13019)
+++ roco.c	(working copy)
@@ -211,7 +211,7 @@
                     direction = 0;
                 else {
-                    direction = (RoCo_desiredDeltaAngle > 0.0) ? -1 : 1;
+                    direction = (RoCo_desiredDeltaAngle > 0.0) ? 1 : -1;
                 RoCo_commandMoveByAngle = FALSE;

Looking at the source code for related issues, one may notice that the value given to direction is also affected by the piece of code below. It would be worth the time testing different values of t9 and RoCo_hasMinMaxAngles_PARAM, although that complex computation is only active when using commands other than MoveByAngle.

            if ((fabs(t9) > t11) && (direction == 0)) {
                direction = ((((!RoCo_hasMinMaxAngles_PARAM) ||
                               (fabs(t9) >= 180.0))
                              ? t9 : -t9) > 0.0) ? 1 : -1;

Thanks, and Next

My colleagues Benjamin Monate and Virgile Prevosto found the challenge and set up the files in the development repository, so that the only thing that was left to do was to identify the bugs, and started looking at them with me. Anne Pacalet and David Mentré have provided feedback for transforming this submission into a blog post. Anne Pacalet also implemented of all the code navigation features used in this study. The organizers Jochen Quante and Andrew Begel have done an impressive job of making this challenge interesting, and indeed, challenging.

I have also submitted my explanation of tasks 2 and 3 in the challenge. However, I thought someone might want to look at these themselves now that I have shown how easy it is on task 1. The only difficult part is finding an exp() function.

To be continued...

Thursday, March 3 2011

This time for real: verifying function cosine

This post follows that post where a typical double-precision implementation for a cosine function is shown. That implementation is not a correctly rounded implementation, but its double result is precise to a level close to what the double-precision floating-point representation allows. It uses low-level tricks that make it faster on nineties computers.

In this post, I offer a slightly simplified version of the same function (download). This function does not compute exactly the same results, but according to a few billion tests I ran, it does not seem more wrong than the original. I expect the simplified version is actually faster than the original on tens computers, but this is not why I modified it. I needed to make changes because the value analysis does not yet precisely handle all the low-level conversions in the original (accessing the 32 most significant bits of a double). Although the value analysis does not say anything false, it is only possible to analyze the function precisely if this idiom is handled precisely.

If it was important to analyze as-is code that uses this low-level idiom, implementation of this treatment could be prioritized according to the results of the experiment in this post. Results on the original function can be expected to be the same after the new feature is implemented as the results for the modified function described here.

I implemented a quick-and-dirty value analysis builtin for the purpose of this verification. It is not provided so much for being re-used as for possible comments: it is large enough that a bug in it could invalidate the verification. This builtin checks that its second argument is close enough to the cosine of its first argument (it uses its third argument as allowable distance). The builtin displays a log line for each call, so that you can check that it was properly called.

Demonstrating how easy it is to write an unsafe builtin, this implementation assumes without verification that it is called in a range where function cosine is decreasing.

I then wrote the following lines of C:

          x = Frama_C_double_interval(..., ...);
          m = mcos(x);
          Frama_C_compare_cos(x, m, 0x1.0p-21);

Next, I arranged to have the value analysis repeatedly analyze these lines with bounds passed to Frama_C_double_interval that, taken together, cover entirely the range from 0 to π/4. I will not say how I did, because it is not very interesting. Besides, the method I used is convenient enough to discourage someone from writing an Emacs Lisp macro to generate sub-division assertions. That would be an unwanted side-effect: it in fact has complementary advantages with the would-be assertion generation macro that I recommend someone look into.

The verification used a few hours of CPU time (I did not try very hard to make it fast). One thing I did was adapt the input sub-interval width to the part of the curve that was being verified, from two millionths for inputs near zero to a quarter of a millionth for inputs near π/4.

During this experiment, I actually used the cosine implementation from CRlibm as reference function. This library, incidentally developed at my alma mater, provides a number of fast and correctly rounded implementations for mathematical functions. The value analysis was configured with the -all-rounding-modes option, so the results are guaranteed for arbitrary rounding modes, and are guaranteed even if the compiler uses supplemental precision for intermediate results (e.g. if the compiler uses the infamous 80-bit historial floating-point unit of Intel processors).

Here are the logs I obtained (warning: I did not make any file of that level of boringness publicly available since my PhD, plus these are really big, even after compression. Handle with care): log1.gz log2.gz log3.gz log4.gz log5.gz log6.gz log7.gz log8.gz log9.gz log10.gz log11.gz log12.gz log13.gz log14.gz log15.gz. To save you the bother of downloading even the first file, here is what the first three lines look like.

[value] CC -0.0000000000000000 0.0000019073486328 0.9999999999981809 1.0000000000000002 0.9999999999981810 1.0000000000000000 OK
[value] CC 0.0000019073486328 0.0000038146972656 0.9999999999927240 0.9999999999981811 0.9999999999927240 0.9999999999981811 OK
[value] CC 0.0000038146972656 0.0000057220458984 0.9999999999836291 0.9999999999927242 0.9999999999836291 0.9999999999927242 OK

From the fact that the builtin did not complain for any sub-interval, we can conclude that on the interval 0..π/4, the proposed implementation for cosine is never farther than 2^-21 (half a millionth) from the real function cosine.

According to the aforementioned billion tests I made, the implementation seems to be much more precise than above (to about 2^-52 times its argument). The key difference is that I am quite sure it is precise to half a millionth, and that I have observed that it seems to be precise to about 2^-52 times its argument.

Guillaume Melquiond provided irreplaceable floating-point expertise in the preparation of this post.

Wednesday, March 2 2011

On memcpy (part 2: the OCaml source code approach)

When picking up the title for the previous post on function memcpy, I anticipated this second part would describe the Frama_C_memcpy built-in function. The subtitle "part 1: the source code approach" seemed a good idea, since the first part was about using C source code to tell the analyzer what this function memcpy it often encounters is.

I did not think further, and did not realize that I would spend this sequel post explaining that a built-in function is an OCaml function that directly transforms an abstract state into another abstract state without going through the tedious, instruction-by-instruction interpretation of C code. So this second post is still about source code.

It is not source code that it is recommended, or even documented how, to write yourself. But it is a convenient way for value analysis experts to introduce certain new features.

What new features? Smoking hot, blazing fast new features.

Also, features that work completely right neither the first nor the second time. The Formula One of features, if you will. In fact, you could be forgiven for calling them Virgin Racing features, although I prefer to think of them as Ferrari features.

As just stated, builtins are about using OCaml functions to describe the effect of an unavailable C function. For instance, calls to Frama_C_memcpy(d,s,l) are interpreted with an OCaml function that, in a single step, copies a slice of memory of length l*8 bits from s to d. And it is fast. Behold:

char s[10000], d[10000];

  // Make problem representative
  for (int i=0; i<100; i++)
    s[100*i] = i;

  // Time one or the other
  //  c_memcpy(d, s, 10000);
  Frama_C_memcpy(d, s, 10000);

  // Check that d contains the right thing
  return 0;

(download the entire program)

Using the C version of memcpy and sufficient unrolling, you get the expected contents for d:

$ time bin/toplevel.opt -val -slevel 20000 memcpytest.c -no-results
        d[0..99] ∈ {0; }
         [100] ∈ {1; }
         [101..199] ∈ {0; }
         [200] ∈ {2; }
user   0m23.155s

Using the Frama_C_memcpy built-in, you get the same precise result for d:

$ time bin/toplevel.opt -val -slevel 20000 memcpytest.c -no-results
        d[0..99] ∈ {0; }
         [100] ∈ {1; }
         [101..199] ∈ {0; }
         [200] ∈ {2; }
user   0m0.402s

You also get nearly 23 seconds of your life back.

So what are the things that can go wrong with a builtin replacement function?

  1. If the C code that is not analyzed would have caused an alarm to be emitted, the built-in should emit that alarm too, or at the very least be provided with a header file that offers an ACSL pre-condition along with a prototype for the builtin. Both alternatives work, but it is easy to forget to do either.
  2. All cases of imprecise arguments should be handled. Support for an imprecise length was added recently. Very imprecise source or destination pointers may still cause the analysis to abort (it is being done and I am not sure where we were the last time we were working on this).
  3. The builtin function calls the value analysis' internal functions in contexts that differ from the usual, well tested circumstances occurring during the analysis of normal C programs. Strange behaviors or bugs may be uncovered by the user. As an example, I was just preparing the next section of this post, where I would have shown how imprecise lengths were now handled. I was going to suggest modifying the above test program, thus:
  Frama_C_memcpy(d, s, Frama_C_nondet(150,10000));

This brought up the following two remarks: first, there is an internal function for moving slices of the abstract state around that is slower than it should be, and it shows on this test. Secondly, the value analysis in Carbon will tell you that bits 1200 to 79999 of d contain either 0 or a very large number, a number with about 23400 decimal digits that start with 2164197 and end with 85824. The value analysis is right, of course. If the length argument to Frama_C_memcpy was 150, then that slice of memory contains zero. If the length was 10000, the numbers in d that were copied from s can be interpreted as representing exactly that 78800-bit number. This is the value analysis' way of telling you that it's either one or the other: either the length was 150 or it was 10000. This is more informative than telling you cell by cell that the cell contains zero or another number, because then you wouldn't see that it's either all one or all the other. The value analysis is rather cute that way.

Note: if the 78800-bit number explanation seems terribly complicated, do not worry, there will be a better explanation for this issue in a future post. We just need another new feature in order to be able to show you.

In conclusion, not everyone may want a Formula One car. But Frama_C_memcpy should at least work for completely unrolled deterministic programs, where it can speed up analysis quite a bit compared to the interpretation of a C memcpy.

Thanks to Julien for help with motorsports metaphors, and to Boris for help in improving Frama_C_memcpy (it used to be worse).

page 2 of 2 -