Frama-C news and ideas

To content | To menu | To search | Frama-C Home

Tag - OCaml

Entries feed - Comments feed

Sunday, October 9 2011

Zarith

Perhaps you are looking for something to do pending the upcoming release of Frama-C (someone called it "Nitrozen" once by accident in a commit message. I would like the name to stick, but I anticipate it won't)…


Or you are already a serious user of the value analysis plug-in or of one of the plug-ins that rely on it…


Or you might be in the near future…


Then this blog post is for you, because this blog post is about Zarith, that you should install.


Zarith is a multiprecision library for OCaml created by Antoine Miné. Zarith relies on GMP for the actual computations on big integers. Small integers, and there lies Zarith's advantage over previous multiprecision libraries for OCaml, use the same unboxed representation as OCaml's 31/63-bit integers. For computing on small integers, if you are not using some exotic architecture, Zarith uses the processor's instruction set most efficiently, relying on the processor's flags to detect whether the result can itself be represented as an unboxed integer.

The bottom line is that a multiprecision operation (say, a multiplication) is most of the times executed as the processor's corresponding instruction, with a couple of conditional branches before and after for handling the uncommon cases, and a couple of memory accesses and jumps (including a computed one) for the OCaml ↔ C interface. This may seem like a lot, but it's much less than the other multiprecision libraries. And of course, the unboxed representation saves space, too, as small integers do not have to be allocated at all and just reside in the word that would otherwise have been a pointer to their representation.


Benchmarks comparing Carbon and Nitrogen will come later; these will include the gains from switching from Big_int to Zarith. For minimal fuss, Frama-C will continue to compile with Big_int (the multiprecision library provided as part of the OCaml distribution), and perhaps I will include "Nitrogen compiled with Big_int" in the benchmarks, just to make the point.

Meanwhile, if you plan to use the value analysis for more than ten minutes, trust me, install Zarith now. You can use either the SVN version (instructions) or the 1.0 version. If you already knew about Zarith and had installed it a long time ago, I suggest you upgrade, to benefit from bugfixes and so that Frama-C Nitrogen's -save and -load options work.

[EDIT: deleted compilation instruction for obsolete Frama-C version]

Wednesday, September 28 2011

Summary of a 2010 research article

A number of colleagues and I have been tasked with writing a summary of an article they published in 2010. Each of us has to do this for an article of theirs, that is. I chose the article whose long version title is "A Mergeable Interval Map" (link, PDF available, in English starting with page number 10). The article describes a data structure to associate values to intervals. I like this data structure because it is uncomplicated, and its necessity was made apparent by the implementation of previous ideas. The article on the other hand, may not do as good a job as it should of explaining the constraints that led to invent a new data structure. This is an opportunity to try and improve that part.

Without further information concerning the intended audience, I decided to assume that the reader is vaguely familiar with the low-level aspects of C (he does not need to be a C programmer). It should also help if he is vaguely familiar with, say, one modern programming language (Perl, Python, PHP, C# or OCaml).

A new data structure for static analysis

When working on an analyzer for low-level embedded C, whose selling point is that it automatically and correctly analyzes relatively large programs relatively precisely, your analyzer will inevitably encounter a C program that does something like this:

union U
  {  char c[4];
      int *p; };

The analyzer does not necessarily have to handle type-punning through the above union very precisely (this, for one, requires hypotheses on the target platform's endianness and pointer size), but it should handle it correctly (or "soundly" as some prefer saying). If the union is used to manipulate the bytes of a pointer, and the pointer is copied somewhere else as a consequence, the analyzer should be aware that the member c of the struct, from which the program reads, overlap with member p, through which an address was previously written.

main(){
  u1.p = &i;
  for (i=0; i<4; i++)
    u2.c[i] = u1.c[i];
  *u2.p = 7;
  return i;
}

The program above copies the address of variable i byte by byte from u1 to u2, and finally uses the copied address to surreptitiously write 7 into i. An analyzer may lose track a little of what is happening here, and still remain correct. It may say that i may contain any integer at the end of the program (and ask the programmer to make sure that he is accessing a valid address with *u2.p). This is what frama-c -val does. Alternately, a correct analyzer may precisely follow what is happening in this program, and conclude that the program always returns 7 (this is what frama-c -val -slevel 10 does). But a correct analyzer should never say that the program returns 4, since this is not what happens in an actual run.


To return to the initial topic, in order to be correct for the above program, the value analysis must understand that members c and p of union u1 overlap. The way it does this is by representing the memory range reserved for a variable (in this example, u1, u2 or i) as a map from intervals to values. After the first instruction in main(), the memory for u1 can be represented as a map from interval 0..31 to {{ &i }}, assuming that we use the bit as the basic memory unit and that a pointer occupies 32 bits in memory.


The map is a very useful data structure natively provided in many modern programming languages, where it may also be called "dictionary" or "associative array". A map is a collection of bindings between keys and values. Bindings can be added, removed, and the map can be queried for the value associated to a given key. Typical implementations used in Perl or Python rely on hash tables, an efficient representation that allows all these operations.

It was not possible to use hash tables for the interval-indexed maps needed by the value analysis, and one of the reasons was that the basic set of operations does not include "find the bindings with a key (interval) that intersects the key (interval) I have". Instead, we used a binary search tree (with intervals as keys). Compared to hash tables, binary search trees allow to look in the neighborhood of a key. For instance, when the analyzer looks for the interval 0..7 in the tree that contains a binding for 0..31 (in the first iteration of the for loop in our example), it does not find it, but it finds that the interval 0..7 is contained in 0..31, and that therefore, the binding 0..31 : {{ &i }} is pertinent. This is how the value analysis detects that the address of i that was previously written in memory through u1.p is being copied to u2 in the for loop.


One general consideration with binary search trees is that some sort of heuristic is needed to keep them balanced. A balanced tree has the property that at each node, the number of bindings on the left-hand side is about the same as the number of bindings on the right-hand side. Looking up keys (intervals) in a binary search tree is fast for all keys only if the tree is balanced. The initial implementation of the value analysis used AVL trees, as provided in OCaml's standard library. AVL trees are self-balancing: when the tree changes, existing bindings may be moved about as needed to keep lookups efficient.


Implementation and experimentation revealed that there was another desirable property the ideal data structure should have for our use case.

int t[16];
...
if (... complex condition...)
  t[0] = 3;
else
  t[1] = 4;

In this new example, with AVL trees and their automatic re-balancing operations, the map representing t may end up balanced differently when propagated through the "then" and the "else" branches of the program. Say that the initial map is represented as below:


tri.png


In the memory state after the "then" branch, the contents at index 0 are updated according to the assignment in that branch:


trat.png


In the memory state after the "else" branch, the contents at index 1 are updated. Because of automatic re-balancing operations while this happens, the resulting tree may look like:


trae.png


Next, two very common operations in static analysis happen. The first is checking whether a memory state is included in another, so that only one needs to be kept. Here, neither state from the "then" and "else" branch contains the other. The following operation is to build the union of the two memory states. The result may be represented by the tree below.


trf.png


With AVL trees, it is normal for the tree to be re-balanced automatically when operated on. But both inclusion and union operations would be much easier if the trees coming from the then branch and the else branch were always balanced identically. If they were both balanced in the same way the initial state was, it would be possible to recognize at a glance that the subtree for indices 2, 3, 4 is the same in the states coming from both branches, and to re-use it as a subtree of the result.

Our contribution in the article "A Mergeable Interval Map" is a new kind of binary search tree that uses intervals as keys and that remains balanced through insertions and deletions, but with the additional property that trees that have different operations applied to them remain balanced the same way. We call this property "mergeability", and it makes inclusion and union easier to compute. This means that a static analyzer using this representation is faster and consumes less memory than with a less adequate representation.

Friday, August 26 2011

The OCaml compiler does have some nice optimizations

Many OCaml programmers use it because it offers a reasonable (to them) compromise between expressivity and control over resources use. Serious OCaml users are often heard complaining about relatively simple optimizations that the compiler does not have. But this reveals as much of the kind of programmers that end up using OCaml as it does of OCaml itself.


Case in point: the value analysis in Boron used to have a type similar to:

 type t = bool * bool * v

If this was C, the definition above would be a struct. Here, bool is the predefined type that you expect, v is the type of values a C expression can have in the value analysis, and the type t being defined is the type of values as found in memory. A value in memory can be a normal value (the v part) or it may have reason to be uninitialized (the first bool) or it may be a dangling pointer (the second bool). During analysis, a value in memory can be all of the above at once, as is variable y after the following snippet:

 void f(void)
 {
   int* y;
   if (unknown condition 1)
     y = NULL;
   else if (unknown condition 2)
   {
     int l;
     y = &l;
   }
   ...
 }

After the first instructions of function f, variable y is either uninitialized or a dangling pointer or the normal value NULL. The value analysis "uninitialized" flag can also be seen in action in this post.


An OCaml value of the above type t occupies four words in memory: one header word for automatic memory management, and one for each field. Each field occupies one word to make memory management, genericity, etc. easier.

In the period between Boron and Carbon, I decided to make the representation of type t a bit more compact (this is one of several memory use improvements I boasted about in this blog). The two booleans can be represented as bits inside a same integer; this would save one word. I decided to go further and, in effect, to borrow two unused bits from the header word by using the definition below:

 type t =
     C_uninit_escaping of v
   | C_uninit_noescaping of v
   | C_init_escaping of v
   | C_init_noescaping of v

This type definition is no longer like a C struct. It is more like an union, an union of v or v or v or v, but a discriminated union where it's possible to recognize what kind of v one is talking about. C forces the programmer to manage eir own tag. In OCaml, some bits are reserved in the bloc header for the tag. A value of the new type t always occupies two words, one word for the header (including some bits for the tag), and one word for the contents of type v.

During analysis, when the C program gets a value from memory and uses it in a computation, it is necessary to get the part of type v from a value of type t. With the old definition, the function was:

 let get_v (_, _, v) = v

With the new definition, the same function get_v would be:

 let get_v = 
   function 
      C_uninit_escaping  v
    | C_uninit_noescaping v
    | C_init_escaping v
    | C_init_noescaping v -> v

I was afraid that this would be compiled into an ugly 4-case test, so I used a different implementation, based on low-level OCaml primitives that are not supposed to be used:

 let get_v : t -> v = fun x -> Obj.obj (Obj.field (Obj.repr x) 0)

It is ugly too, but my hope was that the compiled code would be better. It literally means "forget the type of the argument x (something you are never supposed to do in OCaml) and get the first field of that, whatever that is". I was so used to the compiler not optimizing to my liking that I did not even compare the code for the two versions. I should have known better. Julien Signoles discovered that the compiler generates the following code for the first, pure OCaml version:

        movq    (%rax), %rax
        ret

The above is "AT&T" assembly syntax, and corresponds to return *arg; in C. It's even better than the generated code for the second, low-level version because in the first version, the compiler has more type information to rely on. Plus, since I am reading the assembly for the whole module, I can confirm that the function gets inlined where it is called directly (another optimization that the OCaml compiler does very well).

Sunday, August 14 2011

begin-while-repeat

A little while ago, I was commenting somewhere that Forth had the ultimate control structure, the BEGIN (1) WHILE (2) REPEAT construction.

(1) and (2) are holes to be filled with programs. WHILE takes an evaluated condition from the stack, so that (1) can usually be divided into (1a): do some stuff and (1b): determine whether the loop should exit. Regardless of the value of the condition, (2) is run, and then, if the condition was true, REPEAT sends control back to BEGIN, and if it was false, it continues with the rest of the program.

Sometimes you need the full power of the begin-while-repeat. Not coincidentally (because this structure is really useful), this happened to us recently, so we looked for the first time at how to write begin-while-repeat in OCaml. OCaml has only while (e1) do (e2) done, but you can nest expressions arbitrarily. It turns out that the answer is:

while (1a) ; let cond = (1b) in (2) ; cond do () done

One tiny issue is that the OCaml mode for Emacs does not seem to do a very good job of indenting it. What a shame, such a useful idiom.

Wednesday, August 10 2011

Csmith testing reveals that I'm no good at probabilities (and lazy)

Csmith testing

A typical Frama-C Csmith testing script repeats four actions in an infinite loop:

  1. getting a random program from Csmith;
  2. compiling and executing it;
  3. analyzing it with Frama-C;
  4. using the results from step 3, possibly together with those of step 2, to determine whether the program reveals something that would interest a Frama-C developer.

Step four is the difficult part. I definitely want to hear about varied events that could happen during the analysis, and step four represents about half the testing script's line count. One difficulty is that if a Frama-C bug has already been identified, but isn't fixed yet, I do not want to be warned again about that bug until it is supposed to have been fixed. Sometimes it is possible to use Csmith command-line options to generate only programs that do not trigger the bug. Sometimes, this is impossible, and you have to recognize the bug in step 4.


A particular known bug in Frama-C was characterized by the backtrace below, that I am posting in its entirety to give you a chance to identify my mistake before I tell you about it.

         Raised at file "src/memory_state/new_offsetmap.ml", line 361, characters 26-37
         Called from file "src/memory_state/new_offsetmap.ml", line 377, characters 17-64
         Called from file "src/memory_state/new_offsetmap.ml", line 1394, characters 15-34
         Called from file "src/memory_state/new_offsetmap.ml", line 1436, characters 18-114
         Called from file "set.ml", line 293, characters 37-58
         Called from file "src/memory_state/new_offsetmap.ml", line 1433, characters 11-399
         Called from file "src/memory_state/offsetmap.ml", line 1308, characters 18-93
         Called from file "src/memory_state/lmap.ml", line 641, characters 24-260
         Called from file "src/memory_state/lmap.ml", line 626, characters 18-1023
         Called from file "src/memory_state/cvalue.ml", line 1171, characters 12-50
         Called from file "src/value/eval_exprs.ml", line 1399, characters 4-61
         Called from file "src/value/eval_exprs.ml", line 1447, characters 4-75
         Called from file "src/value/eval_exprs.ml", line 1058, characters 8-66
         Called from file "src/value/eval_exprs.ml", line 986, characters 10-62
         Called from file "src/value/eval_exprs.ml", line 1056, characters 8-52
         Called from file "src/value/eval_exprs.ml", line 1207, characters 4-55
         Called from file "src/value/eval_stmts.ml", line 446, characters 6-105
         Called from file "src/value/eval_stmts.ml", line 880, characters 54-69
         Called from file "list.ml", line 74, characters 24-34
         Called from file "src/memory_state/state_set.ml", line 26, characters 2-24
         Called from file "src/value/eval_stmts.ml", line 889, characters 14-232
         Called from file "cil/src/ext/dataflow.ml", line 320, characters 29-47
         Called from file "cil/src/ext/dataflow.ml", line 497, characters 8-21
         Called from file "cil/src/ext/dataflow.ml", line 501, characters 9-22
         Called from file "src/value/eval_funs.ml", line 100, characters 14-37
         Called from file "src/value/eval_funs.ml", line 502, characters 8-63
         Called from file "src/value/eval_stmts.ml", line 819, characters 12-71
         Called from file "src/value/eval_stmts.ml", line 831, characters 10-114
         Called from file "src/value/eval_stmts.ml", line 864, characters 38-62
         Called from file "list.ml", line 74, characters 24-34
         Called from file "src/value/eval_stmts.ml", line 920, characters 26-79
         Called from file "cil/src/ext/dataflow.ml", line 320, characters 29-47
         Called from file "cil/src/ext/dataflow.ml", line 497, characters 8-21
         Called from file "cil/src/ext/dataflow.ml", line 501, characters 9-22
         Called from file "src/value/eval_funs.ml", line 100, characters 14-37
         Called from file "src/value/eval_funs.ml", line 502, characters 8-63
         Called from file "src/value/eval_stmts.ml", line 819, characters 12-71
         Called from file "src/value/eval_stmts.ml", line 831, characters 10-114
         Called from file "src/value/eval_stmts.ml", line 864, characters 38-62
         Called from file "list.ml", line 74, characters 24-34
         Called from file "src/value/eval_stmts.ml", line 920, characters 26-79
         Called from file "cil/src/ext/dataflow.ml", line 320, characters 29-47
         Called from file "cil/src/ext/dataflow.ml", line 497, characters 8-21
         Called from file "cil/src/ext/dataflow.ml", line 501, characters 9-22
         Called from file "src/value/eval_funs.ml", line 100, characters 14-37
         Called from file "src/value/eval_funs.ml", line 482, characters 4-67
         Called from file "src/value/eval_funs.ml", line 564, characters 11-44
         Re-raised at file "src/value/eval_funs.ml", line 578, characters 47-50
         Called from file "src/project/state_builder.ml", line 1076, characters 9-13
         Re-raised at file "src/project/state_builder.ml", line 1080, characters 15-18
         Called from file "src/value/register.ml", line 49, characters 4-24
         Called from file "queue.ml", line 134, characters 6-20
         Called from file "src/kernel/boot.ml", line 36, characters 4-20
         Called from file "src/kernel/cmdline.ml", line 723, characters 2-9
         Called from file "src/kernel/cmdline.ml", line 200, characters 4-8
         
         Unexpected error (New_offsetmap.Make(V).End_reached).

The uncaught exception comes from new_offsetmap:361, in an auxiliary function, but in this particular case, the place whence this auxiliary function is called (line 377 in the same file) is a good way to identify the bug. If the auxiliary function raises the same exception when the function is called from another place, it's another bug. If it's from the same call, it's the same bug.

I adapted my shell script to include the following test in step 4:

if grep "line 377" analysis_log
then true # I do not know how to negate a condition in shell, and I do not want to learn
else ... # continue classifying the analysis

I wanted to grep for file name and line number, but in the backtrace, the file name is surrounded by quotes, and I didn't know how to escape the quotes inside the grep command-line.

Mergeable Interval Maps

The file new_offsetmap.ml contains the implementation of the data structure described in this article. Pages 6 to 33 are in English. Incidentally, Dotclear does not seem to offer any way to specify that a linked document is half in one language and half in another.

Although the new data structure's implementation is far from complete, it is useful, for handling unions, to occasionally translate from the old datastructure to the new one. Csmith generates programs where unions can be written through a member and read from another (the members are only of integral types). The old data structure can handle most sequences of write combinations having occurred before an union is read, but a few complex combinations are not handled. Richard Bonichon and I already wrote the code to handle all possible combinations in the new data structure, and it would be a waste of time to backport this work to the old data structure at this point.


You may have guessed that an exception is raised in this module because it's newer and full of bugs, but in fact, what is currently implemented already works quite well (one of the advantages of having lots of tests and a trusted reference implementation). The reason the new implementation detects bugs is that knowing in advance what it was going to be used for, we included lots of assertion checks for conditions we knew weren't supposed to happen. The bug comes in much earlier, during parsing, and cannot conveniently be worked around when it is detected in this module.

My mistake

What mistake did I make? The backtrace I have shown is 55 lines long. Each of them contains the pattern "xxx lines", where the values for "xxx" are not independent, but we'll assume that they are for the sake of this argument. All files are more than 377 lines long, and some of them are more than 3779 lines long. It is not sufficiently improbable that my script would hide from me another bug, with a completely different backtrace, that just happens to contain "line 377", or even "line 3775".


I noticed this a few days ago. Resolute in my decision not to learn how to escape quotes in shell, I decided to make false negatives much less likely by grepping for "line 377, characters 17-64" instead.

This is when I updated Frama-C from source control. The file new_offsetmap.ml being work in progress, the file had been modified and the line that characterized the bug changed. So I gave up, and started to experiment with grepping special characters. It turns out that it's possible to grep for doubles quotes by using the simple syntax below.

grep 'Raised at file "src/memory_state/new_offsetmap.ml"' analysis_log

What is the moral of this story? What I refused to learn was very simple; I eventually replaced an approximate test by another that is just as approximate, but in a different way; if you see other angles, please write them in the comments.

Aside

Are you thinking "55 frames in the call stack? What a bunch of bloated analysis framework! This thing must be completely useless!"? Then you are making judgments based on criteria you are familiar with and ignoring those that are new to you. A more correct reaction would be "I know how to implement a much lighter analyzer that will have all the same functionality and will be just as precise in 5000 lines of my favorite language over a week-end; I'll do that and show them."


Zaynah Dargaye provided useful comments on an early version of this post.

Monday, August 8 2011

Holiday stuff

I do not expect that many people are diligently working at this time of year, and this is my excuse for posting this series of productivity-wasting links. Some of them are even on-topic.


For those who missed it, GNU/Linux Magazine/France has a summer special on the C programming language, with a 7-page introduction to Frama-C. In French, and as the URL says, "chez votre marchand de journaux". In related news, Dotclear, the Content Management System for this blog, allows to specify a link's target language in its markdown language. I have always wondered what that was good for. Well, the link in this paragraph is dutifully marked as going to a French page. Tell us if you notice any difference.


John Carmack, famous programmer of 3D games, spoke at length about static analysis at QuakeCon 2011 a few days ago. He also says at one point that he wishes he could experiment with, say, OCaml.


In the lab, we are supposed to obtain an administrative authorization before publishing any article. If the article has non-CEA co-authors, we must obtain an authorization in writing from each of them in order to apply for the authorization to publish (I'm simplifying a bit here. In reality, we must submit an intranet form that will automatically be rejected because there the form has no field for co-author authorizations, and then send the co-author authorizations in reply to the rejection mail). I have always assumed this was a CEA-wide thing, and therefore, I wonder whether my colleagues from the physics department had to fulfill this formality when they published this 3469-author article.

Note that I am not expressing any opinion on the scientific value of the article. It is a LHC article, it seems normal for it to have plenty of authors. I am just wondering wondering whether the CEA publishing rule was applied, what proportion of the nearly 200 institutes have a similar rule, and how many e-mails were exchanged in total to appease the gods of red tape.

Thursday, March 3 2011

This time for real: verifying function cosine

This post follows that post where a typical double-precision implementation for a cosine function is shown. That implementation is not a correctly rounded implementation, but its double result is precise to a level close to what the double-precision floating-point representation allows. It uses low-level tricks that make it faster on nineties computers.

In this post, I offer a slightly simplified version of the same function (download). This function does not compute exactly the same results, but according to a few billion tests I ran, it does not seem more wrong than the original. I expect the simplified version is actually faster than the original on tens computers, but this is not why I modified it. I needed to make changes because the value analysis does not yet precisely handle all the low-level conversions in the original (accessing the 32 most significant bits of a double). Although the value analysis does not say anything false, it is only possible to analyze the function precisely if this idiom is handled precisely.

If it was important to analyze as-is code that uses this low-level idiom, implementation of this treatment could be prioritized according to the results of the experiment in this post. Results on the original function can be expected to be the same after the new feature is implemented as the results for the modified function described here.

I implemented a quick-and-dirty value analysis builtin for the purpose of this verification. It is not provided so much for being re-used as for possible comments: it is large enough that a bug in it could invalidate the verification. This builtin checks that its second argument is close enough to the cosine of its first argument (it uses its third argument as allowable distance). The builtin displays a log line for each call, so that you can check that it was properly called.

Demonstrating how easy it is to write an unsafe builtin, this implementation assumes without verification that it is called in a range where function cosine is decreasing.


I then wrote the following lines of C:

          x = Frama_C_double_interval(..., ...);
          m = mcos(x);
          Frama_C_compare_cos(x, m, 0x1.0p-21);

Next, I arranged to have the value analysis repeatedly analyze these lines with bounds passed to Frama_C_double_interval that, taken together, cover entirely the range from 0 to π/4. I will not say how I did, because it is not very interesting. Besides, the method I used is convenient enough to discourage someone from writing an Emacs Lisp macro to generate sub-division assertions. That would be an unwanted side-effect: it in fact has complementary advantages with the would-be assertion generation macro that I recommend someone look into.

The verification used a few hours of CPU time (I did not try very hard to make it fast). One thing I did was adapt the input sub-interval width to the part of the curve that was being verified, from two millionths for inputs near zero to a quarter of a millionth for inputs near π/4.

During this experiment, I actually used the cosine implementation from CRlibm as reference function. This library, incidentally developed at my alma mater, provides a number of fast and correctly rounded implementations for mathematical functions. The value analysis was configured with the -all-rounding-modes option, so the results are guaranteed for arbitrary rounding modes, and are guaranteed even if the compiler uses supplemental precision for intermediate results (e.g. if the compiler uses the infamous 80-bit historial floating-point unit of Intel processors).

Here are the logs I obtained (warning: I did not make any file of that level of boringness publicly available since my PhD, plus these are really big, even after compression. Handle with care): log1.gz log2.gz log3.gz log4.gz log5.gz log6.gz log7.gz log8.gz log9.gz log10.gz log11.gz log12.gz log13.gz log14.gz log15.gz. To save you the bother of downloading even the first file, here is what the first three lines look like.

[value] CC -0.0000000000000000 0.0000019073486328 0.9999999999981809 1.0000000000000002 0.9999999999981810 1.0000000000000000 OK
[value] CC 0.0000019073486328 0.0000038146972656 0.9999999999927240 0.9999999999981811 0.9999999999927240 0.9999999999981811 OK
[value] CC 0.0000038146972656 0.0000057220458984 0.9999999999836291 0.9999999999927242 0.9999999999836291 0.9999999999927242 OK


From the fact that the builtin did not complain for any sub-interval, we can conclude that on the interval 0..π/4, the proposed implementation for cosine is never farther than 2^-21 (half a millionth) from the real function cosine.

According to the aforementioned billion tests I made, the implementation seems to be much more precise than above (to about 2^-52 times its argument). The key difference is that I am quite sure it is precise to half a millionth, and that I have observed that it seems to be precise to about 2^-52 times its argument.


Guillaume Melquiond provided irreplaceable floating-point expertise in the preparation of this post.

Wednesday, March 2 2011

On memcpy (part 2: the OCaml source code approach)

When picking up the title for the previous post on function memcpy, I anticipated this second part would describe the Frama_C_memcpy built-in function. The subtitle "part 1: the source code approach" seemed a good idea, since the first part was about using C source code to tell the analyzer what this function memcpy it often encounters is.

I did not think further, and did not realize that I would spend this sequel post explaining that a built-in function is an OCaml function that directly transforms an abstract state into another abstract state without going through the tedious, instruction-by-instruction interpretation of C code. So this second post is still about source code.

It is not source code that it is recommended, or even documented how, to write yourself. But it is a convenient way for value analysis experts to introduce certain new features.


What new features? Smoking hot, blazing fast new features.

Also, features that work completely right neither the first nor the second time. The Formula One of features, if you will. In fact, you could be forgiven for calling them Virgin Racing features, although I prefer to think of them as Ferrari features.

As just stated, builtins are about using OCaml functions to describe the effect of an unavailable C function. For instance, calls to Frama_C_memcpy(d,s,l) are interpreted with an OCaml function that, in a single step, copies a slice of memory of length l*8 bits from s to d. And it is fast. Behold:

char s[10000], d[10000];

main()
{
  // Make problem representative
  for (int i=0; i<100; i++)
    s[100*i] = i;

  // Time one or the other
  //  c_memcpy(d, s, 10000);
  Frama_C_memcpy(d, s, 10000);

  // Check that d contains the right thing
  Frama_C_dump_each();
  return 0;
}

(download the entire program)


Using the C version of memcpy and sufficient unrolling, you get the expected contents for d:

$ time bin/toplevel.opt -val -slevel 20000 memcpytest.c -no-results
...
        d[0..99] ∈ {0; }
         [100] ∈ {1; }
         [101..199] ∈ {0; }
         [200] ∈ {2; }
         ...
user   0m23.155s


Using the Frama_C_memcpy built-in, you get the same precise result for d:

$ time bin/toplevel.opt -val -slevel 20000 memcpytest.c -no-results
...
        d[0..99] ∈ {0; }
         [100] ∈ {1; }
         [101..199] ∈ {0; }
         [200] ∈ {2; }
         ...
user   0m0.402s

You also get nearly 23 seconds of your life back.


So what are the things that can go wrong with a builtin replacement function?

  1. If the C code that is not analyzed would have caused an alarm to be emitted, the built-in should emit that alarm too, or at the very least be provided with a header file that offers an ACSL pre-condition along with a prototype for the builtin. Both alternatives work, but it is easy to forget to do either.
  2. All cases of imprecise arguments should be handled. Support for an imprecise length was added recently. Very imprecise source or destination pointers may still cause the analysis to abort (it is being done and I am not sure where we were the last time we were working on this).
  3. The builtin function calls the value analysis' internal functions in contexts that differ from the usual, well tested circumstances occurring during the analysis of normal C programs. Strange behaviors or bugs may be uncovered by the user. As an example, I was just preparing the next section of this post, where I would have shown how imprecise lengths were now handled. I was going to suggest modifying the above test program, thus:
  Frama_C_memcpy(d, s, Frama_C_nondet(150,10000));

This brought up the following two remarks: first, there is an internal function for moving slices of the abstract state around that is slower than it should be, and it shows on this test. Secondly, the value analysis in Carbon will tell you that bits 1200 to 79999 of d contain either 0 or a very large number, a number with about 23400 decimal digits that start with 2164197 and end with 85824. The value analysis is right, of course. If the length argument to Frama_C_memcpy was 150, then that slice of memory contains zero. If the length was 10000, the numbers in d that were copied from s can be interpreted as representing exactly that 78800-bit number. This is the value analysis' way of telling you that it's either one or the other: either the length was 150 or it was 10000. This is more informative than telling you cell by cell that the cell contains zero or another number, because then you wouldn't see that it's either all one or all the other. The value analysis is rather cute that way.

Note: if the 78800-bit number explanation seems terribly complicated, do not worry, there will be a better explanation for this issue in a future post. We just need another new feature in order to be able to show you.

In conclusion, not everyone may want a Formula One car. But Frama_C_memcpy should at least work for completely unrolled deterministic programs, where it can speed up analysis quite a bit compared to the interpretation of a C memcpy.


Thanks to Julien for help with motorsports metaphors, and to Boris for help in improving Frama_C_memcpy (it used to be worse).

Sunday, November 21 2010

IEEE 754 single-precision numbers in Frama-C

Every once in a while, someone asks about single-precision floating-point support in Frama-C. Until recently it was often in the context of the value analysis, but actually, thanks to a lot of interesting new results obtained in the context of this project, people working on deductive verification within Frama-C can now look forward to being harassed too.

We are not nearly home and dry yet, though. Frama-C's front-end is CIL (no relation to something from Microsoft with a similar acronym). CIL is written in OCaml (the rest of Frama-C is written in OCaml too, but never mind that). OCaml has a single floating-point type, which corresponds to IEEE 754 double precision.

So guess how floating-point literal constants are represented in the abstract syntax tree built by CIL...

If you guessed "as an IEEE 754 double-precision number", you are partly right. Yay! But also as a string. CIL's authors did think about this, especially since one of CIL's first use was for source-to-source transformation. Specifically, you have all the information in this constructor defined in file frama-c/cil/src/cil_types.mli:

| CReal of float * fkind * string option
    (** Floating point constant. Give the fkind (see ISO 6.4.4.2) and also
     * the textual representation, if available. *)

The above is OCaml code, and the type float means double-precision floating-point number.

So far so good. If there is a 10.0 literal in the program, it stands for the double 10.0. If there is a 0.1 in the program, it stands for one of representable numbers no further than 1ulp from the real 0.1. This is slightly annoying because the representable number chosen by CIL may not be the same as the representable number chosen by your actual C compiler, but in all likelihood both will round the number from the source code to the nearest double-precision floating-point number. Thus the AST analyzed in Frama-C accurately represents the compiled program and all is well.

The picture gets muddied a little if the target program contains a literal such as 3.14f. This is supposed to represent a single-precision floating-point literal in C. I'm willing to bet without really looking that CIL represents that literal in the AST with the double-precision number nearest to the real number written in decimal in the target program. If you are a Frama-C plug-in author willing to do the right thing, you may think of rounding this double-precision number to the nearest single-precision number yourself, but that doesn't work: the double rounding can make your final result different from the compiler's, which will round only once from real to single-precision. To be correct, you have to go back to the string representation that has been saved from the source code.

In fact, Frama-C's front-end should be doing this for all plug-ins, of course. When the literal is single-precision, it should represent it in the AST as an OCaml float containing the single-precision number meant by the programmer (all single-precision numbers can be represented as double-precision numbers). Probably this will be done soon, now that the problem is identified. This was just an example of how far support for single-precision numbers in Frama-C is.

Full disclosure: I said I was betting without looking but I wouldn't take the bet if I were you. I do not just risk public humiliation like that without strong presumptions. First, I was just working on something related and visible effects of this subtlety came up as one annoying unforeseen issue. Secondly, there is no function in OCaml to parse a number as single-precision. There are some OCaml functions to parse double-precision numbers, and this task is enough of a bother that these are defined by calling the C function strtod(). The way to fix the front-end is probably to replicate this OCaml-C interface for strtof(). So my bet is in fact that CIL's authors neither implemented a single-precision parsing function in OCaml nor interfaced it from C.

page 2 of 2 -