Frama-C news and ideas

To content | To menu | To search | Frama-C Home

Tag - derived analysis

Entries feed - Comments feed

Friday, September 21 2012

A value analysis option to reuse previous function analyses

A context-sensitive analysis

Frama-C's value analysis is context-sensitive.

This means that when a function f2() is called from a caller f1(), function f2() is analyzed as many times as the analyzer goes over f1(). Function f2() is analyzed each time with a different program state—the program state corresponding to the specific call. If f2() is called from within a loop, the analyzer may analyze it many times.

This is for the worthy cause of precision. Any attempt to summarize f2(), so that it can be analyzed once and for all, can only result in less precise analysis results.

An example

Consider the following program:

int a1 = 1;
int a2 = 2;
int a3 = 3;

int p1 = -1;
int p2 = -1;
int p3 = -1;

void calculate_output(void)
{
  if (a1 == 1 && a2 == 2 && a3 == 3)
  {
    a1 = 2;
    return;
  }
  if (a1 == 2 && a2 == 2 && a3 == 3)
  {
    a1 = 1;
    return;
  }
}

main()
{
  calculate_output();
  p1 = a1;
  p2 = a2;
  p3 = a3;
  Frama_C_dump_each();
  calculate_output();
  p1 = a1;
  p2 = a2;
  p3 = a3;
  Frama_C_dump_each();
  calculate_output();
  p1 = a1;
  p2 = a2;
  p3 = a3;
  Frama_C_dump_each();
  calculate_output();
}

For the reason stated above, function calculate_output() is analyzed four times when this program is analyzed with the command frama-c -val:

$ frama-c -val calc.c
...
[value] computing for function calculate_output <- main.
        Called from calc.c:25.
...
[value] computing for function calculate_output <- main.
        Called from calc.c:30.
...
[value] computing for function calculate_output <- main.
        Called from calc.c:35.
...
[value] computing for function calculate_output <- main.
        Called from calc.c:40.
...
[value] Values at end of function main:
          a1 ∈ {1}
          p1 ∈ {2}
          p2 ∈ {2}
          p3 ∈ {3}
...

An unnamed option

If you look at the program closely, you may notice that the third call and the first call (and respectively the fourth and second call) have a lot in common. For each of these pairs the analysis follows the same paths inside the function and change the program state in the same way. This is because although the states differ in variables p1, p2, p3, they are identical for the variables a1, a2, a3 that are the only variables read.

There is an undocumented (and indeed, unfinished) value analysis option to take advantage of previous analyses of a function call without loss of precision. It only reuses calls in circumstances similar to those that apply to the third and fourth call to calculate_output() above. When this option is enabled, Frama-C, after analyzing a function call, computes the sets of locations that were effectively read and written for that specific call and records these inputs and outputs, as well as the program state before and after the call, in a cache for the purpose of later re-using the analysis just done.

With that option, the log output during the analysis becomes:

[value] computing for function calculate_output <- main.
        Called from calc.c:25.
...
[value] computing for function calculate_output <- main.
        Called from calc.c:30.
...
calc.c:35:[value] Reusing old results for call to calculate_output
...
calc.c:40:[value] Reusing old results for call to calculate_output
...
[value] Values at end of function main:
          a1 ∈ {1}
          p1 ∈ {2}
          p2 ∈ {2}
          p3 ∈ {3}

History and application to the RERS challenge

The option alluded to in this post was not implemented for the purpose of the RERS 2012 competition. It was implemented earlier in the context of a large case study where it contributed to speed up the analysis greatly. A lot of work would be involved in making this option work well in all contexts (and, actually, in making sure that derived analyses that plug into the value analysis are not confused by bits of analysis being reused).

Still, although it is unfinished, the mysterious option is finished enough to be useful in the context of the RERS competition. It is particularly useful when determining the status of liveness properties such as (F oW). Whether infinite execution traces are detected with a counter or with a quantum superposition of all previous states, the extra program state introduced by the instrumentation does not influence the behavior of function calculate_output(). In that context, it is a huge gain to be able to re-use the analysis of calculate_output(), which would otherwise, for each state, have to be duplicated for all possible predecessors.


Boris Yakobowski invented and implemented the option this post refers to, without which the hardest properties in the RERS challenge would be intractable.

Monday, January 16 2012

Csmith testing again

My presentation Friday at the U3CAT meeting was on the topic of Frama-C Csmith testing. Several posts in this blog already describe facets of this work (it has its own tag). Yet another angle can be found in this short article draft. Said draft, by the way, will soon need to be either finalized or resubmitted, so remarks are welcome if you have them. This webpage provides figures and implementation details that could have been included in the article to improve clarity, but weren't because we were aiming for the 6-page short format. The webpage is not intended to make sense separately from the article.


January 25, 2012: Updated with most recent draft of the article. The companion webpage has been updated, too

Saturday, December 31 2011

Do not use AES in a context where timing attacks are possible

Justification

There recently was a thread in the Frama-C mailing list on verifying the Rijndael cipher, standardized and better-known as AES. Nowadays, AES is mostly famous for being sensitive to timing attacks. An attacker measuring the time it takes to encrypt known plaintext with an unknown key can deduce the key (key quote: "Using secret data as an array index is a recipe for disaster").

This brings me back to an experiment with University of Minho researchers where Frama-C's value analysis was used to verify that cryptographic C functions have constant execution time. The justification and the conclusion are the same as in this blog post that had served as the starting point of that experiment.


There are two aspects to constant-time programming: checking that the execution path does not depend on secrets, and (remember the key quote above) checking that array indices do not depend on secrets. I implemented the first check as an extension of Frama-C's dependencies analysis. Manuel Bernardo Barbosa et al similarly implemented the second one, and they used both analyses to verify that cryptographic functions in the NaCl library were safe from timing attacks. I had a glance at their implementation but I had not made the effort to port it to new Frama-C releases (I assume they had not either).

Example

Consider the function f() below:

int t[10] = {0, 1, 2, 2, 1, 8, 9, 17, 54, 79};

int u[10];

int g;
int a;

void f(int n)
{
  int i;
  int *p;

  for (i = 0; i < t[n] * t[n+1] - 3; i++)
    {
      p = u + i + a;
      if (g)
	*p = i;
    }
}

Execution path dependencies analysis relies on standard dependencies analysis (option -deps), which rely on the value analysis. Say you are interested in the execution times of function f() when array t has its initial values, global variables a and g are each 0 or 1, and the argument n is between 0 and 5. You would then create the following analysis context, as you would for any verification based on the value analysis:

main(){
  a = Frama_C_interval(0, 1);
  g = Frama_C_interval(0, 1);
  f(Frama_C_interval(0, 5));
}

Analyzing this complete program for execution path dependencies:

$ frama-c -experimental-path-deps -deps share/builtin.c ep.c
...
Computing path dependencies for function f
Path dependencies of f: t[0..6]; g; n

Within the initial conditions defined by the function main() (and in particular, n between 0 and 5), the execution path depends on the initial values of t[0..6], g and n at the time f() is called. Note that variables p, a and u are not included in this list, although they are used, because the execution path does not depend on them. So if f() was a cryptographic function and variable a contained a secret, you would not have to worry that this secret can be deduced by observing through timing that the function has different execution paths depending on a.


But does this mean that you are safe from timing attacks, though? As demonstrated in the article linked first in this post, you should also worry that a malicious adversary will infer information from the array indices used in f():

$ frama-c -experimental-mem-deps -deps share/builtin.c ep.c
...
Computing mem dependencies for function f
Mem dependencies of f: t[0..6]; a; n

The above option, that I quickly re-implemented when I saw that the verification of cryptographic functions was being discussed in the mailing list, tells which of f()'s inputs influence the computation of array indices, in the same way that option -experimental-path-deps tells which of f()'s inputs influence the execution path inside f(). In this example, it tells that the contents of input variable a can leak through a cache-timing attack (because a is used in the computation of p, and later p may be dereferenced for writing).

What about AES, then?

$ frama-c rijndael.c -cpp-command "gcc -C -E -I /usr/local/Frama-C/share/frama-c/libc" -no-annot -slevel 9999 mymain.c -experimental-mem-deps -deps
...
Mem dependencies of rijndaelEncrypt: rk; Nr_0; pt; ct; in[0..15]; key[0..39];
                                     Te0[0..255]; Te1[0..255]; Te2[0..255]; Te3[0..255]

The analysis finds that there is a risk that in[0..15] (the input buffer) and key[0..39] (the key) may be leaked through cache-timing attacks. The other variables are Nr_0, the number of rounds, pt and ct, just pointers, and the Tei arrays, the S-boxes. Those aren't secrets so it doesn't matter that they are listed.

Like every other analysis in Frama-C, this analysis always makes approximations on the safe side, so we expected to find this result (since we knew the attack had been realized). The question is, is the analysis precise enough to conclude that another cryptographic function is safe? The answer is yes: the hash function Skein-256 can be shown to have an execution time that depends only on the length of the input buffer, but not of its contents.

Implementation

The option -experimental-path-deps has been available as a hidden, proof-of-concept feature inside the official Frama-C distribution roughly since the Valgrind blog post was published in 2010.

I re-implemented the analysis of memory accesses dependencies analysis demonstrated here as -experimental-mem-deps. It has been available since Frama-C version Oxygen.

Note that the feature was very lightly tested. The new option -experimental-mem-deps was written in 45 minutes and the examples in this post are the only testing it received. If you feel that automatic analyses such as these are too automatic (they don't force you to write invariants for loops in the target code, for instance), you can instead spend your time usefully by devising test suites to make sure they work as intended in all cases.

Conclusion

One must be careful with a posteriori studies. There are plenty of bugs for which, once they have been pointed out, it is tempting to say "hey, I could have found it with this or that static analysis tool! My tool finds these bugs. Everyone is so stupid not to be using it". The point is that, much like a dowser being tested, you only found the bug with your tool when someone else had told you where it was. Had you tried to find it before being told, perhaps you would have given up right in the setup phase, or you would have had so many false positives to review that you would never have gotten round to the bug, or you would have gotten round to it but would have dismissed it quickly as another false positive.

So I want to make it absolutely clear that I am not saying that the AES implementation conveniently posted in the mailing list should have been checked with Frama-C when it was written. This is a causal impossibility. However, timing attacks are a recognized family of attacks now, and cryptographic functions analyze rather well (based on experiments with AES and Skein), so Frama-C is definitely a tool that can be used nowadays to verify that new cryptographic functions are being implemented according to the state of the art.


This blog post owes to Adam Langley for describing the method in the context of Valgrind, to Manuel Bernardo Barbosa and his students for courageously writing the first version of the array indices dependencies analysis (which I am sure took them more than 45 minutes) and applying it, and to 罗婷 for posting on frama-c-discuss a version of AES that was ready to analyze.

Saturday, November 5 2011

What functions does a function use: option -users

Exploring unfamiliar code

Sometimes, one finds oneself in the situation of exploring unfamiliar code. In these circumstances, it is sometimes useful to know which functions a function f() uses. This sounds like something that can be computed from the callgraph, and there exists plenty of tools out there that can extract a callgraph from a C program, but the callgraph approach has several drawbacks:

  1. A static callgraph does not include calls made through function pointers. Therefore, you do not see all the functions that f() uses: the list omits the functions that were directly or indirectly called through a function pointer.
  2. The set of functions computed from the call graph is over-approximated, because if f() calls g() and g() may sometimes call h(), it doesn't necessarily mean that f() uses h(). Indeed, perhaps g() never calls h() when it is called from f(), but only when it is called from another function k().

Example

Here is an example that illustrates both issues.

enum op { ADD, MULT };

void copy_int(int *src, int *dst)
{
  *dst = *src;
}

int really_add(int u, int v)
{
  return u + v;
}

int really_mult(int u, int v)
{
  return u * v;
}

int do_op(enum op op, int u, int v)
{
  if (op == ADD)
    return really_add(u, v);
  else if (op == MULT)
    return really_mult(u, v);
  else 
    return -1;
}

int add(int x, int y)
{
  int a, b, res;
  void (*fun_ptr)(int*, int*);
  fun_ptr = copy_int;
  (*fun_ptr)(&x, &a);
  (*fun_ptr)(&y, &b);
  res = do_op(ADD, a, b);
  return res;
}

Using a syntactic callgraph to compute the functions used by add(), one finds do_op(), really_add(), and really_mult(). This list is over-approximated because add() does not really use really_mult(). More importantly, the list omits function copy_int(), which is used by add().

Frama-C's users analysis

Frama-C's users analysis computes this list instead:

$ frama-c -users -lib-entry -main add example.c
...
[users] ====== DISPLAYING USERS ======
        do_op: really_add 
        add: copy_int really_add do_op 
        ====== END OF USERS ==========

The users analysis exploits the results of the value analysis, so the results hold for the initial conditions the value analysis was configured for. Here, the value analysis was instructed to study the function add() by itself. In these conditions, do_op() only calls really_add(), but if the analysis focused on a larger program it would see that do_op() also sometimes call really_mult(). The users analysis can tell that add() uses copy_int(), really_add(), and do_op(), and does not use really_mult().


This kind of synthetic information is very useful when trying to get a grip on large programs, for instance, when trying to extract a useful function from a large codebase to make it a library. Unsurprisingly, plenty of tools already existed before Frama-C that tried to provide this sort of information. But having information on the dynamic behavior of the program can make a large difference in the value of the synthetic information computed.


My colleagues at Airbus Opérations SAS and Atos SA will present serious applications of the -users option at ERTS² 2012 (next February).

Monday, August 8 2011

".,-~:;=!*#$@"[N>0?N:0]

Hey, I left out one alarm last time:

donut.c:15 ... out of bounds read. assert \valid(".,-~:;=!*#$@"+tmp_7);

This corresponds to ".,-~:;=!*#$@"[N>0?N:0] in the obfuscated code. I wanted to have a blog post about this construct in particular because I was curious whether it would break the content management system's URL generation.

The reason why this alarm is a false alarm is a bit difficult to see in context, so let us focus on the function below, which contains only the relevant computations.

double sin(double), cos(double);

/*@ ensures 0 <= \result <= 11 ; */
int compute(float A, float B, float i, float j)
{
  float c=sin(i), l=cos(i);
  float d=cos(j), f=sin(j);
  float g=cos(A), e=sin(A);
  float m=cos(B), n=sin(B);

  int N=8*((f*e-c*d*g)*m-c*d*e-f*g-l*d*n);
  return N>0?N:0;
}


How would we argue that it is allowable to do this in a real verification? If the target compiler implements IEEE 754 floating-point exactly, then it doesn't matter what instructions are interspersed with the ones we are studying: they always give the same IEEE 754-mandated values. If the compiler only implements IEEE 754 approximately (say, computing some intermediate values in higher precision, and later double-rounding some of them at its whim), then we can analyze the function above with option -all-rounding-modes and argue that while the instructions around these ones may, in the real program, interfere with the computation, all possible results of the real program are captured by the value analysis' predictions for this function.

I do not know whether this kind of argument would be accepted in an industrial context, but on the other hand, the difficulty here, similarly to the case of variable o in the last post, is artificially introduced by the program's obfuscation. So the point is probably moot: industrial code would not be allowed to mix up computations like this in the first place.

If the discussion above seems a bit abstract, perhaps you have not read David Monniaux's report on the pitfalls of floating-point computations for static analysis. Consider it recommended.


Lastly, modifying the program as below and invoking Frama-C's slicing plug-in with the command that follows was useful for extracting the function compute() above.

int          k;double sin()
         ,cos();main(){float A=
       0,B=0,i,j,z[1760];char b[
     1760];printf("\x1b[2J");for(;;
  ){memset(b,32,1760);memset(z,0,7040)
  ;for(j=0;6.28>j;j+=0.07)for(i=0;6.28
 >i;i+=0.02){float c=sin(i),d=cos(j),e=
 sin(A),f=sin(j),g=cos(A),h=d+2,D=1/(c*
 h*e+f*g+5),l=cos      (i),m=cos(B),n=s\
in(B),t=c*h*g-f*        e;int x=40+30*D*
(l*h*m-t*n),y=            12+15*D*(l*h*n
+t*m),o=x+80*y,          N=8*((f*e-c*d*g
 )*m-c*d*e-f*g-l        *d*n);

	  /*@ slice pragma expr N; */

                               if(22>y&&
 y>0&&x>0&&80>x&&D>z[o]){z[o]=D;;;b[o]=
 ".,-~:;=!*#$@"[N>0?N:0];}}/*#****!!-*/
  printf("\x1b[H");for(k=0;1761>k;k++)
   putchar(k%80?b[k]:10);A+=0.04;B+=
     0.02;}}/*****####*******!!=;:~
       ~::==!!!**********!!!==::-
         .,~~;;;========;;;:~-.
             ..,--------,*/
$ bin/toplevel.opt -val share/libc.c -slice-pragma main -slice-print donut.c
...
/* Generated by Frama-C */
extern double sin();
extern double cos();
void main(void)
{
  float A;
  float B;
  float i;
  float j;
  A = (float)0;
  B = (float)0;
  while (1) {
    j = (float)0;
    while (6.28 > (double)j) {
      i = (float)0;
      while (6.28 > (double)i) {
        { float c; float d; float e; float f; float g; float l; float m;
          float n; int N; c = (float)sin(i); d = (float)cos(j);
          e = (float)sin(A); f = (float)sin(j); g = (float)cos(A);
          l = (float)cos(i); m = (float)cos(B); n = (float)sin(B);
          N = (int)((float)8 * ((((f * e - (c * d) * g) * m - (c * d) * e) - 
                                 f * g) - (l * d) * n));
          /*@ slice pragma expr N; */ ;
        }
        
        i = (float)((double)i + 0.02);
      }
      j = (float)((double)j + 0.07);
    }
    A = (float)((double)A + 0.04);
    B = (float)((double)B + 0.02);
  }
  return;
}
...

Aside: in my first attempt to produce the slice above, I made the mistake of including the approximative sin() and cos() functions written earlier:

#include "share/builtin.h"

double sin(double x)
{
  return Frama_C_double_interval(-1.0, 1.0);
}
...

The slicing plug-in, always up for mischief, noticed that sin() and cos() did not use their arguments and removed completely the computations of variables A and B, and their declarations, now useless. The values of these variables do not matter, but we'll need to know that cos() and sin() are applied to the same variables i, j, A, B if we want to have a chance to establish the post-condition. The slicing was removing this crucial piece of knowledge from the sliced program, but it was only taking advantage of the information that we had given it with our function stubs.

Thursday, June 9 2011

Fixing robots, part 2

This post follows that post.

For cases 2 and 3, in order to get more complete execution traces, we again take advantage of Frama_C_dump_each(). Before launching the analysis, we insert a call to this primitive right at the end of RoCo_process(), so that local variables of that function will be displayed too.

--- roco.c.orig	2011-05-16 13:53:57.000000000 +0200
+++ roco.c	2011-05-16 13:55:18.000000000 +0200
@@ -303,4 +303,5 @@
     else {
     	RoCo_engineVoltage = 0.0;
     }
+    Frama_C_dump_each();
 }

The static analysis done by the previously provided shell script (key command below) again confirms that the execution does not run into any of the C99 undefined behaviors that the value analysis detects (uninitialized access, use of a dangling pointer, overflows in signed integer arithmetics, invalid memory access, invalid comparison of pointers, division by zero, undefined logical shift, overflows in conversions from floating-point to integer, infinite or NaN resulting from a floating-point operation, undefined side-effects in expressions). Yes, I will now give the full list every time. One year ago I stumbled on a comparative study between PolySpace and Frama-C's value analysis that gave the full list of alarms for PolySpace but listed only two types of undefined behaviors for the value analysis. The author had lazily copy-pasted a sentence from the documentation's introduction, but had crucially omitted the ellipsis. As a consequence, I will now inflict the full list on everyone every time. Bwa-ha-ha. Also, I plan to add three more categories of detected undefined behaviors in the near future. Bwa-ha-ha-ha-ha.

frama-c ${FILES} ${TESTCASE} -val -slevel 50000 -unspecified-access -val-signed-overflow-alarms -calldeps -save state$1

Case 2

The trace corroborates the "does not stop jiggling when it hits the target angle" diagnostic. We concentrate on relevant lines of the log using the command:

grep "\\(STATE\\|RoCo_legAngle \\|RoCo_engineVoltage\\|lastTime\\)" log2

Here is the bit when the target for the first command is reached:

[value] DUMPING STATE of file roco.c line 306
        RoCo_legAngle ∈ -29.9791361754
        RoCo_engineVoltage ∈ -0.700730737571
        lastTime ∈ {66520}
[value] DUMPING STATE of file roco.c line 306
        RoCo_legAngle ∈ -30.0013178639
        RoCo_engineVoltage ∈ -0.573709804476
        lastTime ∈ {66540}
[value] DUMPING STATE of file roco.c line 306
        RoCo_legAngle ∈ -30.0235995355
        RoCo_engineVoltage ∈ -0.469713860267
        lastTime ∈ {66560}
[value] DUMPING STATE of file roco.c line 306
        RoCo_legAngle ∈ -30.0454486754
        RoCo_engineVoltage ∈ -0.384569182548
        lastTime ∈ {66580}
[value] DUMPING STATE of file roco.c line 306
        RoCo_legAngle ∈ -30.0664935071
        RoCo_engineVoltage ∈ -0.314858616438
        lastTime ∈ {66600}
[value] DUMPING STATE of file roco.c line 306
        RoCo_legAngle ∈ -30.0864858403
        RoCo_engineVoltage ∈ -0.257784432149
        lastTime ∈ {66620}
[value] DUMPING STATE of file roco.c line 306
        RoCo_legAngle ∈ -30.1052716542
        RoCo_engineVoltage ∈ 0.07897475281
        lastTime ∈ {66640}
[value] DUMPING STATE of file roco.c line 306
        RoCo_legAngle ∈ -30.1227678881
        RoCo_engineVoltage ∈ 0.354689853917
        lastTime ∈ {66660}
[value] DUMPING STATE of file roco.c line 306
        RoCo_legAngle ∈ -30.1389442061
        RoCo_engineVoltage ∈ 0.580426286282
        lastTime ∈ {66680}
[value] DUMPING STATE of file roco.c line 306
        RoCo_legAngle ∈ -30.1538087376
        RoCo_engineVoltage ∈ 0.765243645549
        lastTime ∈ {66700}
[value] DUMPING STATE of file roco.c line 306
        RoCo_legAngle ∈ -30.1673969897
        RoCo_engineVoltage ∈ 0.916559301284
        lastTime ∈ {66720}
[value] DUMPING STATE of file roco.c line 306
        RoCo_legAngle ∈ -30.1791116186
        RoCo_engineVoltage ∈ 1.04044608206
        lastTime ∈ {66740}
[value] DUMPING STATE of file roco.c line 306
        RoCo_legAngle ∈ -30.1885670036
        RoCo_engineVoltage ∈ 1.14187599937
        lastTime ∈ {66760}
[value] DUMPING STATE of file roco.c line 306
        RoCo_legAngle ∈ -30.195537956
        RoCo_engineVoltage ∈ 1.22491979196
        lastTime ∈ {66780}
[value] DUMPING STATE of file roco.c line 306
        RoCo_legAngle ∈ -30.1999193366
        RoCo_engineVoltage ∈ 1.29291029881
        lastTime ∈ {66800}
[value] DUMPING STATE of file roco.c line 306
        RoCo_legAngle ∈ -30.2016944051
        RoCo_engineVoltage ∈ 1.34857621768
        lastTime ∈ {66820}
[value] DUMPING STATE of file roco.c line 306
        RoCo_legAngle ∈ -30.2009101399
        RoCo_engineVoltage ∈ 1.39415161736
        lastTime ∈ {66840}
...
[value] DUMPING STATE of file roco.c line 306
        RoCo_legAngle ∈ -29.9951370606
        RoCo_engineVoltage ∈ 0.46939308602
        lastTime ∈ {67120}
...
[value] DUMPING STATE of file roco.c line 306
        RoCo_legAngle ∈ -29.8107877453
        RoCo_engineVoltage ∈ -1.35491972386
        lastTime ∈ {67400}
...

The voltage is still negative when the target angle of -30 is reached. The voltage becomes positive around the angle of -30.1, but the leg is still moving down at that time. We attribute this phenomenon to inertia. The angle reaches about -30.2 before it finally stops decreasing and starts increasing again.

Using the same method as in case 1, it is possible to identify some variables that influence the choice of value for Roco_Enginevoltage. This allows us to determine a number of independent changes that each fix the problem.

The first possible change is to make the robot leg move back more slowly when it has overshot its destination. This fixes the issue, according to the Python script provided to be used as an oracle.

--- roco_config_testcase_2.h~	2011-03-23 10:44:24.000000000 +0100
+++ roco_config_testcase_2.h	2011-05-19 20:15:56.000000000 +0200
@@ -35,7 +35,7 @@
 real64 RoCo_maxAngle_PARAM = 90.0;
 real64 RoCo_minAngle_PARAM = -90.0;
 real64 RoCo_shutdownTimeout_PARAM = 12.0;
-real64 RoCo_stepSpeed_PARAM = 10.0;
+real64 RoCo_stepSpeed_PARAM = 5.0;
 real64 RoCo_TempFltT_PARAM = 0.1;
 real64 RoCo_TimeSlopeNeg_PARAM = 1.0;
 real64 RoCo_TimeSlopePos_PARAM = 1.0;

Another possible fix is to make the robot accept a larger error margin when deciding whether it is far enough of the intended angle to justify moving back.

--- roco_config_testcase_2.h~	2011-03-23 10:44:24.000000000 +0100
+++ roco_config_testcase_2.h	2011-05-19 20:18:12.000000000 +0200
@@ -11,7 +11,7 @@
 real64 RoCo_angleAtMark_PARAM = 25.0;
 real64 RoCo_angleReachedThreshold1_PARAM = 2.0;
 real64 RoCo_angleReachedThreshold1Fast_PARAM = 4.0;
-real64 RoCo_angleReachedThreshold2_PARAM = 0.1;
+real64 RoCo_angleReachedThreshold2_PARAM = 0.3;
 real64 RoCo_batteryLowLimit_PARAM = 15.0;
 real64 RoCo_batteryLowDelay_PARAM = 60.0;
 boolean RoCo_checkBatteryVoltage_PARAM = TRUE;

Finally, my two personal favorite fixes are below. They are (most probably) completely wrong, but they do fix the problem according to the Python script oracle. The first makes the voltage filter so strong that the oscillations become less visible to the script:

--- roco_config_testcase_2.h~	2011-03-23 10:44:24.000000000 +0100
+++ roco_config_testcase_2.h	2011-05-19 20:19:10.000000000 +0200
@@ -39,7 +39,7 @@
 real64 RoCo_TempFltT_PARAM = 0.1;
 real64 RoCo_TimeSlopeNeg_PARAM = 1.0;
 real64 RoCo_TimeSlopePos_PARAM = 1.0;
-real64 RoCo_voltageFilter_PARAM = 0.1;
+real64 RoCo_voltageFilter_PARAM = 0.7;
 
 real64 Engine_maxVoltage_PARAM = 18.0;
 real64 Engine_minVoltage_PARAM = -18.0;

My other favorite fix makes the voltage filter so weak that the motor no longer overshoots systematically when trying to correct its position.

--- roco_config_testcase_2.h~	2011-03-23 10:44:24.000000000 +0100
+++ roco_config_testcase_2.h	2011-05-19 20:20:35.000000000 +0200
@@ -39,7 +39,7 @@
 real64 RoCo_TempFltT_PARAM = 0.1;
 real64 RoCo_TimeSlopeNeg_PARAM = 1.0;
 real64 RoCo_TimeSlopePos_PARAM = 1.0;
-real64 RoCo_voltageFilter_PARAM = 0.1;
+real64 RoCo_voltageFilter_PARAM = 0.02;
 
 real64 Engine_maxVoltage_PARAM = 18.0;
 real64 Engine_minVoltage_PARAM = -18.0;

"Favorite" does not mean that I would recommend either of these two incompatible fixes. The first two seem more reasonable and less likely to break something else.

Case 3

According to the challenge description, we should use our "tacit knowledge gained from several years of employment at RobotControllers.com". Since we have not really been employed at RobotControllers.com for several years, it is only fair to use the tacit knowledge encoded in the provided acceptance Python script. This is probably what the organizers meant. Looking at the script, we see that the robot leg destruction is caused by a sudden variation in the voltage applied to the motor. In practice, the motor, when carried away by the inertia of the mechanism it is supposed to set in motion, becomes a generator. The difference between the voltage this generator provides and the voltage set by the environment should never be too large. A well-designed modern power supply can take the hit, but a large voltage difference also means large intensities going through the motors' coils. The motor turns into a generator, but it doesn't turn into a well-protected one.

The relevant part of the log is extracted as follows:

grep "\\(STATE\\|RoCo_legAngle \\|RoCo_engineVoltage\\|lastTime\\)" log3
...
[value] DUMPING STATE of file roco.c line 306
        RoCo_engineVoltage ∈ -6.93042056043
        lastTime ∈ {109960}
        rampValue ∈ 0.998
[value] DUMPING STATE of file roco.c line 306
        RoCo_engineVoltage ∈ -6.94303317304
        lastTime ∈ {109980}
        rampValue ∈ 1.
[value] DUMPING STATE of file roco.c line 306
        RoCo_engineVoltage ∈ -4.41559004995
        lastTime ∈ {110000}
        rampValue ∈ 1.
[value] DUMPING STATE of file roco.c line 306
        RoCo_engineVoltage ∈ -2.34629463843
        lastTime ∈ {110020}
        rampValue ∈ 1.
[value] DUMPING STATE of file roco.c line 306
        RoCo_engineVoltage ∈ -0.652098847809
        lastTime ∈ {110040}
        rampValue ∈ 1.
[value] DUMPING STATE of file roco.c line 306
        RoCo_engineVoltage ∈ 0.734991347706
        lastTime ∈ {110060}
        rampValue ∈ 1.
[value] DUMPING STATE of file roco.c line 306
        RoCo_engineVoltage ∈ 1.87064474807
        lastTime ∈ {110080}

I have included variable rampValue above because, as one can see by exploring the code in the GUI as for case 1, in addition to the voltage filter, the program has a mechanism to make the voltage increase slowly when transitioning from idle to active or from active to idle. Slice for the statement that computes the value of rampValue to see this mechanism more clearly for yourself: the command frama-c-gui -load state3 can be used to load the values previously computed. In the statement's contextual menu, first select "Enable slicing", and then "Slicing → Slice stmt".

slice_rampValue.png

The log extract above shows that unfortunately, this mechanism does not smooth the voltage when a new order arrives in the middle of the execution of the current order. Here, the motor has just ramped out to full speed (rampValue==1.0) when the command to move in the opposite direction arrives at lastTime==110000. The value of rampValue remains pegged at 1.0 during the transition. This could be considered to be the bug that causes the voltage jump.

Apart from the sleazy way we obtain the numerical limit by reverse-engineering the python script, it is impossible to argue against the inclusion of the patch below. Although it is written without understanding the root cause of the problem, it exactly prevents a situation that is never desirable, the destruction of the robot.

--- challenge/src/roco.c.orig	2011-05-16 13:53:57.000000000 +0200
+++ ppc/tests/ICPC/src/roco_sol3.c	2011-05-02 16:35:52.000000000 +0200
@@ -293,8 +293,17 @@
                 desiredEngineVoltage, Engine_maxVoltage_PARAM);
         limitationActive = (Engine_maxVoltage_PARAM == desiredEngineVoltage) ||
                              (Engine_minVoltage_PARAM == desiredEngineVoltage);
-        RoCo_engineVoltage = PT1_Filter (&voltageFilter, desiredEngineVoltage,
-                t13, dT);
+
+	{ real64 tentativeV = PT1_Filter (&voltageFilter, desiredEngineVoltage,
+					  t13, dT);
+	  if (tentativeV - RoCo_engineVoltage > 0.390625)
+	    RoCo_engineVoltage += 0.390625;
+	  else if (tentativeV - RoCo_engineVoltage < -0.390625)
+	    RoCo_engineVoltage -= 0.390625;
+	  else 
+	    RoCo_engineVoltage = tentativeV;
+	}
+
         wasInit = init;
         wasActive = RoCo_isActive;
         Timer_tick (&shutdownTimer, dT);

Even if/when the root cause of the problem is fixed, this patch should remain in place for the sake of defensive programming. It exactly prevents the voltage jumps that cause the destruction of the motor, and has almost no practical effects otherwise, so it does not hurt.

The best fix would perhaps be to do a ramp down followed by a ramp up when receiving, in the middle of the execution of an order, another order that causes a change in direction. This seems like a major change to the logic of the program, so we won't attempt it and we will instead rely on the patch above to prevent destruction of the robot leg.

After applying the patch above, we run and analyze testcase 3 again and we obtain another complaint from the validation script eval.py provided by the organizers:

src $ make && ./roco_3.exe 2> trace3 && python ../eval.py 3 < trace3
make: Nothing to be done for `all'.
149100: TOO SLOW / TARGET NOT REACHED!
found 1 error(s)

To look on the bright side of life, the robot didn't explode.

The analysis log again conveniently provides all variables' values in a single (if longish) step:

$ grep "\\(RoCo_engineVoltage\\|lastTime\\|STATE\\|rampValue\\|legAngle \\)" log3 
[value] DUMPING STATE of file roco.c line 306
        RoCo_legAngle ∈ 19.970696807
        RoCo_engineVoltage ∈ 6.00755510873
        lastTime ∈ {115740}
        rampValue ∈ 0.874
[value] DUMPING STATE of file roco.c line 306
        RoCo_legAngle ∈ 20.0821939081
        RoCo_engineVoltage ∈ 5.99058545861
        lastTime ∈ {115760}
        rampValue ∈ 0.872
[value] DUMPING STATE of file roco.c line 306
        RoCo_legAngle ∈ 20.1933858392
        RoCo_engineVoltage ∈ 5.9736158
        lastTime ∈ {115780}
        rampValue ∈ 0.87
[value] DUMPING STATE of file roco.c line 306
        RoCo_legAngle ∈ 20.304272565
        RoCo_engineVoltage ∈ 5.95664613443
        lastTime ∈ {115800}
        rampValue ∈ 0.868
[value] DUMPING STATE of file roco.c line 306
        RoCo_legAngle ∈ 20.4148540546
        RoCo_engineVoltage ∈ 5.93967646317
        lastTime ∈ {115820}
        rampValue ∈ 0.866
[value] DUMPING STATE of file roco.c line 306
        RoCo_legAngle ∈ 20.5251302809
        RoCo_engineVoltage ∈ 5.92270678725
        lastTime ∈ {115840}
        rampValue ∈ 0.864
...

As illustrated in the log snippet above, the robot leg overshoots the target position. This can be attributed to the value of variable rampValue being too high, and specifically, decreasing too slowly. Since there is now a piece of code to limit voltage jumps, we can increase the speed at which this variable varies without fear of destroying the robot. In fact, for all we know, this variable was set to vary slowly in a misguided attempt to fix the explosion problem.

Since parameters RoCo_TimeSlopePos_PARAM and RoCo_TimeSlopeNeg_PARAM influence directly the computation of rampValue, we decide to adjust these:

--- roco_config_testcase_3.h	2011-03-23 11:07:48.000000000 +0100
+++ roco_config_testcase_3_sol3.h	2011-05-02 16:35:52.000000000 +0200
@@ -37,8 +37,8 @@
 real64 RoCo_shutdownTimeout_PARAM = 12.0;
 real64 RoCo_stepSpeed_PARAM = 8.0;
 real64 RoCo_TempFltT_PARAM = 0.15;
-real64 RoCo_TimeSlopeNeg_PARAM = 0.1;
-real64 RoCo_TimeSlopePos_PARAM = 0.1;
+real64 RoCo_TimeSlopeNeg_PARAM = 0.5;
+real64 RoCo_TimeSlopePos_PARAM = 0.5;
 real64 RoCo_voltageFilter_PARAM = 0.1;
 
 real64 Engine_maxVoltage_PARAM = 15.0;
$ make && ./roco_3.exe 2> trace3 && python ../eval.py 3 < trace3 
...
found 0 error(s)


The ReadMe.txt reports that the field engineers also complained about voltage variations at time t=50. Since the voltage is perfectly stable around t=50s, we can interpret this part of their report as caused by the hallucinogenic fumes famously released by burning electronics. Perhaps they mean t=66140ms, when indeed the voltage can be seen to drop a bit sharply in the original execution trace. The reason why it drops sharply is clear on this selection of variable values:

[value] DUMPING STATE of file roco.c line 306
        RoCo_legAngle ∈ 12.3506194964
        RoCo_engineVoltage ∈ 4.50997019947
        lastTime ∈ {66120}
        rampValue ∈ 0.614
[value] DUMPING STATE of file roco.c line 306
        RoCo_legAngle ∈ 12.4293066254
        RoCo_engineVoltage ∈ 4.52254504977
        lastTime ∈ {66140}
        rampValue ∈ 0.616
[value] DUMPING STATE of file roco.c line 306
        RoCo_legAngle ∈ 12.5082201017
        RoCo_engineVoltage ∈ 3.90576827098
        lastTime ∈ {66160}
        rampValue ∈ {0}
[value] DUMPING STATE of file roco.c line 306
        RoCo_legAngle ∈ 12.5873599253
        RoCo_engineVoltage ∈ 3.4007941544
        lastTime ∈ {66180}
        rampValue ∈ {0}

Variable rampValue drops from 0.616 to 0.0, and this cannot even be excused by the arrival of an order at that instant. It seems the ramping up/down part of the program might as well be thrown away and re-written from scratch.

Looking at values for instants 66140 and 66160 side-by-side, it becomes clear what changes and causes rampValue to jump to 0. 66140.png

The variable rampTarget changes from 1.0 to 0.0.

We could execute cycle 66160 in a debugger, but that would be banal. Instead, let us use a trick to observe that cycle on its own in Frama-C's GUI. We replace the line RoCo_process(); in main_testcase_3.c with:

                if (lastTime==66160) RoCo_process_66160(); else RoCo_process();

We then copy-paste the body of function RoCo_process() as the definition of an identical function RoCo_process_66160() and launch again the analysis. Yes, this means waiting ten minutes again. You have to make sacrifices to be cool. This maneuver allows us to observe cycle 66160 in the GUI, unmixed with the others, with a lot more code displayed in orange (indicating that this code was not executed at that cycle and can be ignored), and values that are specific for that cycle.

gui_66160.png

Listing the differences between this approach and using a debugger is left as an exercise for the reader. This inspection mostly confirms that the value 0.0 seems normal for variable rampTarget, and that it should be function Ramp_out() that makes sure that rampValue varies slowly even when rampTarget jumps around. Note in passing that the duplication of function RoCo_process() does not help to see what happens inside function Ramp_out() at cycle 66160. But you are probably growing a little restless, so I won't suggest to launch an analysis again. We can perfectly well infer what happens inside function Ramp_out() by reading the code. Doing that, we notice the tests below.

  if (data->dir == -1) {
    data->state = target;
  }
   ...
  if (data->dir == 1) {
    data->state = target;
  }

This doesn't make sense. There is never any reason to jump directly to the target. We suppress these lines, and the voltage jump around 66160 disappears.

Monday, June 6 2011

Fixing robots, part 1

This blog post is a revised version of part of my submission to the ICPC 2011 Industry Challenge. Please, go ahead and read the challenge description. I could only paraphrase it without adding anything to it, and so I won't.

The study was made with the April development version of Frama-C, which differs from the last released version in a number of bugfixes (list of reported issues) and minor new features. Specifically, it uses the value analysis and dependencies computations (documented in this manual) and code navigation features. The latter reuse building blocks that were implemented for slicing.

The ever important first step: identify missing library functions

The first step, as the faithful reader of this blog would now know, before using the value analysis for anything is to identify library functions the program depends on:

frama-c impls.c main.c roco.c sim.c main_testcase_1.c -metrics -cpp-command "gcc -DTESTCASE=1 -C -E"

...
Undefined functions (10):
 __builtin_inf  (15 calls); __builtin_inff  (15 calls); fprintf  (2 calls);
 __swbuf  (10 calls); __builtin_fabsf  (15 calls); exp  (1 call);
 __builtin_infl  (15 calls); __builtin_fabsl  (15 calls);
 __builtin_fabs  (15 calls); fabs  (9 calls); 
...

The host computer's standard headers have been used, hence the confusing __builtin_ prefixes. Better not rely on them: the robot's operating system probably bears little resemblance to the host's. Placeholder headers provided with Frama-C can be used instead:

frama-c impls.c main.c roco.c sim.c main_testcase_1.c -metrics -cpp-command "gcc -DTESTCASE=1 -C -E -nostdinc -I. -I/usr/local/share/frama-c/libc"

...
Undefined functions (3):
 exp  (1 call); fabs  (9 calls); fprintf  (2 calls); 
...

Standard functions exp() and fabs() are missing from the Carbon release, but have been added to the development version. With the development version, one simply needs to list /usr/local/share/frama-c/libc/math.c as a file of the analysis project. Function exp() is implemented as a value analysis builtin, taking advantage of the host's own exp() function.

If you wish to reproduce at home, you have to provide your own implementation for exp(). It may not need to be very accurate, but it needs to be analyzable with Carbon's value analysis. A good way would be to clean up the implementation from Sun that everyone is using. As I already said in previous posts, accessing the bits of a floating-point representation directly is so 1990s. If someone removed these ugly and obsolete manipulations, we would end up with a better implementation for exp() (and it would be analyzable with the value analysis from Carbon).

For fabs(), you can use the function below. An implementation from a library would probably unset the sign bit in the IEEE 754 representation but, again, Carbon's value analysis doesn't handle this precisely yet. The implementation below is handled precisely in the sense that a singleton for the input x results in a singleton for the returned value.

double fabs(double x){
  if (x==0.0) return 0.0;
  if (x>0.0) return x;
  return -x;
}

Finally, calls to fprintf(), that do not have any repercussions on the continuing execution, can be replaced with calls to the Frama_C_show_each() built-in. We can let the pre-processor to do this for us, passing the option -Dfprintf=Frama_C_show_each to GCC when pre-processing.

Looking for unspecified/undefined behaviors

We are now ready to launch the value analysis. Since we anticipate testcases 2 and 3 to be very similar, we write a short script:

#!/bin/bash

export CPP="gcc -Dfprintf=Frama_C_show_each -DTESTCASE=$1 -C -E -nostdinc -I. -I/usr/local/share/frama-c/libc"
FILES="/usr/local/share/frama-c/libc/math.c impls.c main.c roco.c sim.c"
TESTCASE="main_testcase_$1.c"

exec frama-c ${FILES} ${TESTCASE} -val -slevel 50000 -unspecified-access -val-signed-overflow-alarms -calldeps -save state$1

About ten minutes after running script.sh 1 > log1, we obtain a log (long:~70MiB) and a state (~1MiB). The log is long because it contains progression messages that are only intended to help identify analysis issues, and can be ignored most of the time. The state contains all the information that has been computed about the program, including values of variables, in a very compact format.

The analysis is completely unrolled (because of option -slevel 50000) and precise until the end. This means that the value analysis has in effect simulated the execution of the program with the inputs provided in main_testcase_1.c.

The log, despite its size, does not warn about any of the undefined or unspecified behaviors that the value analysis is guaranteed to identify (uninitialized access, use of a dangling pointer, overflows in signed integer arithmetics, invalid memory access, invalid comparison of pointers, division by zero, undefined logical shift, overflows in conversions from floating-point to integer, infinite or NaN resulting from a floating-point operation, undefined side-effects in expressions). This is very important. It means that we can rest assured that the strange dynamic behavior we are going to investigate is not caused by the misuse of one of C's dangerous constructs. Nothing would be more frustating than having to track the value of a variable which, according to the source code, is not supposed to change, but is modified through a buffer overflow. The value analysis guarantees we won't have to do that for this execution.

What would of course be better would be to verify that there can be none of the above undefined behaviors for any command sequence. This would be a much stronger result, but would also require a lot more work. When simulating a single execution inside the analyzer, we only check that that particular execution is free of undefined behavior, but it does not require any work from us (only a few minutes of work from the computer).

Exploiting the value analysis' results for program comprehension

The values computed and stored in state1 can be observed in Frama-C's GUI, using the command-line frama-c-gui -load state1. The GUI can also be used to identify the definition site(s) of a variable's value: select the variable at the program point you are interested in, and in the contextual menu, invoke Dependencies → Show defs.

Here, it is the value of variable RoCo_engineVoltage as displayed by the call to fprintf() (that we transformed into a call to Frama_C_show_each()) that is wrong, so we request the definition site(s) of that value:

s1.png

The GUI has pointed us to a call to function RoCo_Process() (using the yellow mark), so we now request the definition sites of RoCo_engineVoltage by the return; statement of that function. We obtain the two sites identified below:

s2.png

The condition that decides which branch is executing is the one shown in the screenshot below.

s3.png

The value analysis tell us the value of RoCo_isActive can be either 0 or 1 at this point during execution, but this variable is one of the variables whose value is printed in the analysis log, and its value was 1 at the instant we are interested in. We therefore focus on the definition site where the value assigned to RoCo_engineVoltage is computed in a call to PT1_Filter().

The dependencies of the particular call to PT1_Filter() we are interested in were computed by option -calldeps and can be found in the log. The call we are interested in is at statement 433. The log contains:

call PT1_Filter at statement 433:
 voltageFilter FROM state; x; t1; dt; voltageFilter
 \result FROM state; x; t1; dt; voltageFilter

Apart from reading its arguments state, x, t1, and dt, the call accesses a static variable voltageFilter. The address of voltageFilter is taken, so we have to be careful: this variable could be modified erroneously through a pointer (although the address-taking appears to be only to pass it to PT1_Filter(), which is innocent enough).

In fact, at this point, we have no idea which of the variables involved in the computation of the result of this call to PT1_Filter() is wrong. Clicking on a variable in the GUI provides the set of values for this variable at this program point, but this is still too imprecise here, since it mixes all 10000 or so passages through the statement.

Let us take advantage of the "blackboard" structure of the analyzed program and dump the entire program state at this statement, by inserting a call to Frama_C_dump_each(). See this previous post for a list of advantages of this built-in function over printf() or a debugger.

--- roco.c	(revision 12956)
+++ roco.c	(working copy)
@@ -293,6 +293,7 @@
                 desiredEngineVoltage, Engine_maxVoltage_PARAM);
         limitationActive = (Engine_maxVoltage_PARAM == desiredEngineVoltage) ||
                              (Engine_minVoltage_PARAM == desiredEngineVoltage);
+	Frama_C_dump_each();
         RoCo_engineVoltage = PT1_Filter (&voltageFilter, desiredEngineVoltage,
                 t13, dT);
         wasInit = init;

We need to launch the analysis again and find something to do for 10 minutes. This is a good time to start looking at bugs 2 and 3.

The log now contains state dumps for each passage through the statement where RoCo_engineVoltage is computed.

According to the ReadMe.txt in the challenge package, an order is given at t=50s. The log shows that this order fails to be executed speedily. The state dump at which lastTime contains 50000 and the next few ones show that of RoCo_engineVoltage's dependencies, variable desiredEngineVoltage is the one with the suspicious value: it is only -0.8, whereas parameters in file roco_config_testcase_1.h and values of the variable elsewhere in our log show that this voltage can go much higher. We are therefore left with the sub-problem of identifying why this variable has this value at this program point.

We use the same tools we have already used for RoCo_engineVoltage, this time applied to variable desiredEngineVoltage and this program point. The screenshot below shows the definitions sites for that value.

s4.png

The value of variable desiredEngineVoltage is defined by the call to function Limiter_Out(), whose argument is in turn defined by the call to Interpolate_from_curve() above.

Option -calldeps computed the implicit inputs of this call, which can be found in the log:

call Interpolate_from_curve at statement 423:
 \result FROM curve; x;
              EngineSpeedToVoltage_CURVE{.numPoints; .x[0..4]; .y[1..3]; }

The state dump in which lastTime==50000 shows that a low value for angleDiffRequest is the cause for the low value of desiredEngineVoltage.

The "Show defs" action in the GUI finds three possible definition sites for this value of angleDiffRequest, shown in the screenshots below.

s5.png

s6.png

s7.png

We find in the log that variable rampValue remains at 0 in the cycles that follow instant 50000. The value we observe for angleDiffRequest is compatible with the algorithm and values of variables at lines 264-277 of file roco.c. So it looks like the cause of the issue is the value of variable rampValue. Action "Show defs" in the GUI indicates that this value is computed by the call to Ramp_out() at line 240. The value of rampTarget is computed as 0.0 or 1.0 from a number of variables, and of these variables, RoCo_legAngleValid was always 1, and direction was always 0 or 1. The latter is suspicious, since in this execution, orders are given to move in both directions:

s8.png

The command "Show defs" applied to variable direction shows that indeed, the variable may have been set to 0 or 1 in three different sites.

s9.png

The site that corresponds to the MoveByAngle command, the middle one in the screenshot above, is suspicious: in the file main_testcase_1.c, the angle passed to this command is negative. This is not just a strange convention, because the computation below for the third definition site determined the direction for the other commands, that are all in the opposite direction, and variable direction was assigned 1 again there.

This suggests the fix below.

--- roco.c	(revision 13019)
+++ roco.c	(working copy)
@@ -211,7 +211,7 @@
                     direction = 0;
                 }
                 else {
-                    direction = (RoCo_desiredDeltaAngle > 0.0) ? -1 : 1;
+                    direction = (RoCo_desiredDeltaAngle > 0.0) ? 1 : -1;
                 }
                 RoCo_commandMoveByAngle = FALSE;
             }


Looking at the source code for related issues, one may notice that the value given to direction is also affected by the piece of code below. It would be worth the time testing different values of t9 and RoCo_hasMinMaxAngles_PARAM, although that complex computation is only active when using commands other than MoveByAngle.

            if ((fabs(t9) > t11) && (direction == 0)) {
                direction = ((((!RoCo_hasMinMaxAngles_PARAM) ||
                               (fabs(t9) >= 180.0))
                              ? t9 : -t9) > 0.0) ? 1 : -1;
            }

Thanks, and Next

My colleagues Benjamin Monate and Virgile Prevosto found the challenge and set up the files in the development repository, so that the only thing that was left to do was to identify the bugs, and started looking at them with me. Anne Pacalet and David Mentré have provided feedback for transforming this submission into a blog post. Anne Pacalet also implemented of all the code navigation features used in this study. The organizers Jochen Quante and Andrew Begel have done an impressive job of making this challenge interesting, and indeed, challenging.

I have also submitted my explanation of tasks 2 and 3 in the challenge. However, I thought someone might want to look at these themselves now that I have shown how easy it is on task 1. The only difficult part is finding an exp() function.

To be continued...

Monday, February 28 2011

On switch statements

In Carbon 20110201 and earlier versions of Frama-C, if you do not use precautions when analyzing a program with switch statements, you get imprecise results. Consider the example program below.

main(int argc, char **argv){
  switch(argc){
  case 1: 
    Frama_C_show_each_no_args(argc);
    break;
  case 2:
    Frama_C_show_each_exactly_2(argc);    
    /* fall through */
  case 3:
    Frama_C_show_each_2_or_3(argc);
    break;
  default:
    Frama_C_show_each_else(argc);
    break;
  }
  return argc;
}
$ frama-c -val sw.c
...
[value] Called Frama_C_show_each_exactly_2([-2147483648..2147483647])
[value] Called Frama_C_show_each_no_args([-2147483648..2147483647])
[value] Called Frama_C_show_each_else([-2147483648..2147483647])
[value] Called Frama_C_show_each_2_or_3([-2147483648..2147483647])
...

This is correct, but disappointingly imprecise. Frama-C (the framework) can normalize the code to a large extent so that its plug-ins (here, the value analysis) do not need to handle redundant constructs. A switch construct can be encoded using if and goto. Frama-C option -simplify-cfg does exactly that, for the convenience of plug-in authors. As long as they have already programmed a precise analysis for these constructions, they can always get a precise result by requiring the program to be transformed thus.

Crediting where credit is due, the C code normalization philosophy is inherited from CIL, which provided the basis of Frama-C's front-end.


The transformed program looks like this:

$ frama-c -simplify-cfg -print sw.c
...
  if (! (argc == 1)) {
    if (argc == 2) { goto switch_0_2; }
    else {
      if (argc == 3) { goto switch_0_3; }
      else { goto switch_0_default; goto switch_0_break; }
    }
  }
  switch_0_1: /* internal */ Frama_C_show_each_no_args(argc);
  goto switch_0_break;
  switch_0_2: /* internal */ Frama_C_show_each_exactly_2(argc);
  switch_0_3: /* internal */ Frama_C_show_each_2_or_3(argc);
  goto switch_0_break;
  switch_0_default: /* internal */ ;
  Frama_C_show_each_else(argc);
  goto switch_0_break;
  switch_0_break: /* internal */ ;
  return (argc);
...

I did not say the transformed code looked good. But usually, you do not need to look at the generated code; the analysis results are what you are interested in:

$ frama-c -simplify-cfg -val sw.c
...
[value] Called Frama_C_show_each_no_args({1; })
[value] Called Frama_C_show_each_exactly_2({2; })
[value] Called Frama_C_show_each_2_or_3({2; 3; })
[value] Called Frama_C_show_each_else([-2147483648..2147483647])
...

This time, the value analysis' results show that each time the program point at which we placed the Frama_C_show_each_no_args call is traversed, the value of argc is exactly 1. The program point with the call to Frama_C_show_each_exactly_2 is only traversed with argc==2, and so on. These are the expected results.


Sometimes, however, the value analysis is used as a means to a different end, and option -simplify-cfg can conflict with the final objective. Let's say the example program is modified slightly:

int x, t[4];

main(int argc, char **argv){
  t[0] = 10;
  t[1] = 11;
  t[2] = 12;
  t[3] = 13;
  switch(argc){
  case 1: 
    Frama_C_show_each_no_args(argc);
    break;
  case 2:
    Frama_C_show_each_exactly_2(argc);
    /* fall through */
  case 3:
    Frama_C_show_each_2_or_3(argc);
    x = t[argc];
    break;
  default:
    Frama_C_show_each_else(argc);
    break;
  }
  return x;
}

If the task we are actually interested in is obtaining a subset of the statements of the above program, so that this subset, when compiled and executed, returns the same value as the original program, option -simplify-cfg gets in the way. The options to instruct the slicing plug-in to compute this subset are -slice-return main -slice-print. We face the following dilemma:

  1. not use option -simplify-cfg, give the slicing plug-in access to an unmodified abstract syntax tree, but let it rely on imprecise values; or
  2. use option -simplify-cfg, get precise values, but have the slicing plug-in work on an abstract syntax tree representing a mangled version of the original program.

In the first case, we get the sliced program below, which is shaped like the original but retains several statements that could have been removed:

$ frama-c sw.c -slice-return main -slice-print 
...
{
  t[0] = 10;
  t[1] = 11;
  t[2] = 12;
  t[3] = 13;
   switch (argc) {
    case 1: ;
    break;
    case 2: ;
    case 3: ;
    /*@ assert (0 ≤ argc) ∧ (argc < 4);
          // synthesized
       */
    x = t[argc];
    break;
    default: ;
  }
  return (x);
}

In the second case, some useless initializations are removed, but the transformation makes the sliced program harder to follow:

$ frama-c sw.c -slice-return main -slice-print -simplify-cfg
...
{
  t[2] = 12;
  t[3] = 13;
  if (! (argc == 1)) {
    if (argc == 2) { goto switch_0_2; }
    else { if (argc == 3) { goto switch_0_3; } }
  }
  goto switch_0_break;
  switch_0_2: /* internal */ ;
  switch_0_3: /* internal */ ;
  x = t[argc];
  switch_0_break: /* internal */ ;
  return (x);
}


Or... you can use the next version of Frama-C:

$ bin/toplevel.opt sw.c -slice-return main -slice-print 
...
{
  t[2] = 12;
  t[3] = 13;
  switch (argc) {
    case 1: ;
    break;
    case 2: ;
    case 3: ;
    x = t[argc];
    break;
    default: ;
  }
  return (x);
}

As you can see on this example, from the next release onwards, option -simplify-cfg is no longer necessary to get precise results with the value analysis. This means better slicing of programs that use the switch construct — and one option that you may now forget about.


Many thanks to Boris, Taker of our Own Value Analysis Medicine extraordinaire, for implementing the new feature described in this post, and to Miguel Ferreira for providing expertise in English idioms.

Tuesday, January 11 2011

Value-analysis assisted verification of output variables and information flow

A previous post in the value analysis thread in this blog contained this digression:

Actually, we can also prevent function Update to keep trace of a previous input buffer, thanks to secondary analyses derived from the value analysis. One of them computes the set of locations written to by a C function. With this analysis, we can list the entire set of memory locations that Update uses as state, and check that only information that should be kept is. Perhaps there will be a blog post on this topic.

Before the Carbon release, I wasn't in a hurry to talk about this. The value-analysis assisted computations of output variables and information flow are part of the many improvements in Carbon. In the Boron release and before, when applied on the Skein codebase, they gave results that were too approximated to be useful.

Now, they just work.

One example use of these computations is provided as part of the tutorial in the Carbon version of the value analysis manual (it's section 2.5.2 there).

Rather than repeating here this addition to the tutorial, I want to reiterate on the notion of derived analysis. Although the outputs and dependencies computations rely on the results of the value analysis, they do not require you to check each of the alarms generated by the value analysis.

When a sequence of function calls has dependencies (as computed by option -deps) of hash[0..7] FROM msg[0..79]; ONE[bits 0 to 7] (and SELF), it means that if all goes well, array hash after the calls has been computed from array msg and variable ONE. Computing and taking advantage of this information is orthogonal to the verification that all always goes well during the execution of these function calls (which is done by making sure that the value analysis emits no alarm or by checking the emitted alarms with alternative techniques).

In more difficult cases, the latter may require a combination of human reflexion and of hours of computer time. But functional dependencies can be useful even when based on the results of a simpler, faster value analysis.

Saturday, December 4 2010

Unspecified behaviors and derived analyses, part 2

Context

This post is a sequel and conclusion to this remark.

Example of derived analysis: slicing

When writing a Frama-C plug-in to assist in reverse-engineering source code, it does not really make sense to expect the user to check the alarms that are emitted by the value analysis. Consider for instance Frama-C's slicing plug-in. This plug-in produces a simplified version of a program. It is often applied to large unfamiliar codebases; if the user is at the point where he needs a slicer to make sense of the codebase, it's probably a bad time to require him to check alarms on the original unsliced version.

The slicer and other code comprehension plug-ins work around this problem by defining the results they provide as ``valid for well-defined executions''. In the case of the slicer, this is really the only definition that makes sense. Consider the following code snippet:

  x = a;
  y = *p;
  x = x+1;
  // slice for the value of x here.

This piece of program is begging for its second line to be removed, but if p can be the NULL pointer, the sliced program behaves differently from the original: the original program exits abruptly on most architectures, whereas the sliced version computes the value of x.

It is fine to ignore alarms in this context, but the user of a code comprehension plug-in based on the value analysis should study the categorization of alarms in the value analysis' reference manual with particular care. Because the value analysis is more aggressive in trying to extract precise information from the program than other analyses, the user is more likely to observe incorrect results if there is a misunderstanding between him and the tool about what assumptions should be made.

Example of benign alarm: testing p<=q

Everybody agrees that accessing an invalid pointer is an unwanted behavior, but what about comparing two pointers with <= in an undefined manner or assuming that a signed overflow wraps around in 2's complement representation? Function memmove, for instance, typically does the former when applied to two addresses with different bases.

Currently, if there appears to be an undefined pointer comparison, the value analysis propagates a state that contains all possible results for the comparison and for the pointers. This blog post was just trying to scare you. It is possible to take advantage of the value analysis for program comprehension, and all existing program comprehension tools need to make assumptions about undefined behaviors. Most tools do not tell you if they had to make assumptions or not. The value analysis does: each alarm, in general, is also an assumption. Still, as implementation progresses and the value analysis becomes able to extract more information from the alarms it emits, one or several options to configure it either not to emit some alarms, or not to make the corresponding assumptions, will certainly become necessary.

- page 1 of 2