Frama-C news and ideas

To content | To menu | To search | Frama-C Home

Tag - argon

Entries feed - Comments feed

Wednesday, January 16 2019

Setting up an analysis with the help of frama-c-script

Frama-C 18 (Argon) includes a new command, frama-c-script, which contains a set of helpful scripts for setting up new analyses. The available commands will likely expand in the future. We present here a quick overview of how to use them efficiently for a very fast setup.

Presentation of frama-c-script

frama-c-script is not a Frama-C plugin, but a command that is installed along with frama-c and frama-c-gui, starting from version 18.0 (Argon). Running frama-c-script on your terminal will output its usage messsage, which includes the list of available commands (which may change in the future):

usage: frama-c-script cmd [args]

  where cmd is:

  - make-template [dir]
      Interactively prepares a template for running analysis scripts,
      writing it to [dir/GNUmakefile]. [dir] is [.] if omitted.

  - make-path
      [for Frama-C developers and advanced users without Frama-C in the path]
      Creates a frama-c-path.mk file in the current working directory.

  - list-files [path/to/compile_commands.json]
      Lists all sources in the given compile_commands.json
      (defaults to './compile_commands.json' if omitted).
      Also lists files defining a 'main' function
      (heuristics-based; neither correct nor complete).

  - flamegraph <flamegraph.txt> [dir]
      Generates flamegraph.svg and flamegraph.html in [dir]
      (or in the FRAMAC_SESSION directory by default).
      Also opens it in a browser, unless variable NOGUI is set.

These commands are often related to the analysis scripts described in previous blog posts, but independent; frama-c-script facilitates the usage of analysis-scripts, but it also has commands which do not require them.

Each command is actually a script in itself, and may thus require other interpreters (e.g. Python or Perl).

Standard usage of frama-c-script

The diagram below illustrates where frama-c-script participates in the workflow of an analysis: in the creation of a makefile template, and also after the analysis, for profiling. Both usages will be detailed in this post.

Integration of frama-c-script in the analysis workflow

Currently, frama-c-script is ideally used in conjunction with a JSON compilation database. As described in a previous blog post, CMake, Build EAR or other tools can be used to produce a compile_commands.json file, containing the list of compilation commands along with their arguments.

Once you have a compile_commands.json file, run:

frama-c-script list-files

It will parse the file and list all sources in it, in a Makefile-compatible syntax. It will also output the list of files containing definitions for a main function. This typically includes the main application itself, but also test cases or secondary applications, which can all be useful entry points for an analysis with Eva.

frama-c-script does not parse the source files to find function definitions; instead, it uses regular expressions to identify likely definitions, which can give both false positives and false negatives. This is a necessary approach since, during this stage, it is too early to have a full, parsable AST to begin with.

With the list of source files, we can now easily create a Makefile template for our application. The make-template command will interactively create a new GNUmakefile in the current directory, after asking for some data:

  • Name of the main make target;
  • List of source files.

The name of the main target is usually the application name. It is also the default Makefile target and the prefix of the directories where the results will be stored (both for parsing and for the analysis with Eva).

The list of source files can be as simple as *.c, or more sophisticated for projects with subdirectories and multiple main functions, in which case some source files must not be included in the list. The list-files command will produce the list of sources when a compile_commands.json file is present.

After these values are filled in, the script will create a GNUmakefile in the current directory. The values are simply written inside the makefile to help kickstart it, but they can be manually edited afterwards.

Ideally, this should be enough to try parsing the source files by running simply make. From then on, the analysis-scripts methodology takes over, and the usual commands (make parse, make, make gui) work as usual.

Easily visualizing flamegraphs

One of the options in frama-c-script is related to flamegraphs.

Flamegraphs are stack trace visualizations which, in Frama-C/Eva, are used to help understand in which functions the analysis spends most of its time. Option -eva-flamegraph has been added in Frama-C 14 (Silicon), and it activates the generation of a flamegraph trace file during the analysis. This file can then be given to the flamegraph.pl Perl script developed by Brendan Gregg to produce an image in SVG format.

The image below provides an example of a flamegraph for the PolarSSL code in Frama-C's open-source-case-studies repository. The stacks start from the top, with the main function, and then each called function is stacked underneath. The length of the bar is the amount of time spent in it. Hovering over a bar shows the function name and the percentage of time spent in the function. Clicking on a bar will zoom on it, increasing the size of its children for better readability. Note that bar colors do not have any special meaning.

Eva flamegraph

The flamegraph command available in frama-c-script is simply a shortcut to help generate the image file for the flamegraph (along with an HTML wrapper) and open it in a browser. The flamegraph.pl script itself is distributed in the Frama-C share directory, which simplifies its usage.

Since the analysis-scripts options already include the generation of flamegraphs by default (option -eva-flamegraph is set by FRAMAC_SHARE/analysis-scripts/frama-c.mk), the user simply needs to run:

frama-c-script flamegraph <target>.eva/flamegraph.txt

And the user's default browser should display the flamegraph generated during the analysis.

Note that flamegraphs can be generated while the analysis is still running; this is useful, for instance, when the analysis is seemingly "frozen" in a given point, without emitting any messages. Running frama-c-script flamegraph will create a flamegraph of the current state of the analysis, while the trace file is still updated by the ongoing analysis.

Conclusion

The features currently available in frama-c-script offer extra comfort when setting up new analyses, by minimizing the amount of repetitive or hard-to-memorize tasks.

list-files requires a JSON compilation database, but it helps obtain the information for creating a GNUmakefile. make-template creates a GNUmakefile which can be manually edited later. flamegraph helps visualize the performance of the analysis, either while it is running, or afterwards.

One of the new commands expected for Frama-C 19 (Potassium) will help find which files in a directory declare and/or define a given function. This will help, for instance, when trying to identify which .c file to include to obtain the body of a function that is called from another source.

Friday, December 14 2018

New loop unroll annotation in Frama-C 18

One of the new features in Frama-C 18 (Argon) is the annotation @loop unroll, used by the Eva plug-in to replace and improve upon the usage of slevel for semantically unrolling loops. This new annotation is more stable, predictable, and overall more efficient. In this post we present the annotation with some examples where it outperforms previous mechanisms.

Loop unrolling mechanisms in Frama-C/Eva

Frama-C has two ways to unroll loops syntactically: either via option -ulevel N (which unrolls all loops in the program N times), or via the loop pragma UNROLL annotation, placed before a loop, which unrolls only that loop. Both have their uses, especially for plug-ins other than Eva; however, Eva has specific mechanisms which are almost always better suited to deal with loops.

When using the Eva plug-in, syntactic loop unrolling is is not recommended: it is more costly than semantic unrolling, and it makes the code harder to read in the GUI.

In Eva, the slevel mechanism is used to improve precision of the analysis inside loops: as long as there is enough slevel to be consumed, each loop iteration will keep consuming it before performing a widening and merging states, which might lead to loss of precision.

This works fine for simple loops, but as soon as there are branches or nested loops, slevel consumption is hard to control: each branch will also consume some slevel, and there is no way to ensure that only outer loop iterations will consume the remaining slevel.

To mitigate the effects of unwanted slevel consumption, options such as -eva-slevel-merge-after-loop force the merging of states at the end of each loop iteration, which minimizes the issue with loops containing branches. However, this option operates at the function level, so nested loops are still an issue.

Plug-ins such as Loop Analysis perform complex estimations to try and indicate values of slevel for each function, accounting for the fact that nested loops and branches require extra slevel. In realistic code bases, this often leads to functions having excessively large values of slevel (in the worst case, such overflows lead to 0, that is, giving up unrolling).

In the end, existing solutions had drawbacks which imposed an extra burden on the parameter refinement process. This motivated the development of the new @loop unroll annotation.

The new loop unroll mechanism

@loop unroll N is a code annotation, to be placed just before a loop in the program (as is the case with all @loop ACSL annotations). It will instruct Eva to avoid merging states after each loop iteration, until either:

  • there are no more states to unroll (i.e. the loop has been completely unrolled), or

  • N unrollings have been performed.

Just like with slevel, a value for N larger than the required to completely unroll the loop will not lead to wasted time.

The annotation currently accepts only constant expressions, that is, those involving literal constants or #define macros. This limitation may be removed in the future.

Nested loops and slevel usage

The main advantage of using @loop unroll is in the presence of nested loops. Consider the following example:

void main() {
  int a[10][10] = {0};
  for (int i = 0; i < 10; i++) {
    for (int j = 0; j < 10; j++) {
      a[i][j] += i + j;
    }
  }
}

In this program, one might naively think that using -eva-slevel 100 would be enough to unroll the loops. This will however result in:

[eva] example1.c:4: starting to merge loop iterations
[eva:alarm] example1.c:5: Warning:
  signed overflow. assert a[i][j] + (int)(i + j) ≤ 2147483647;
...
[eva:final-states] Values at end of function main:
...
   [8][9] ∈ {17}
   [9][0] ∈ {9}
   [9][1..9] ∈ [0..2147483647]

The first message indicates that there was not enough slevel to consume during the analysis, and thus a merge between different loop iterations happened, resulting in loss of precision. Widening causes the last cells of the matrix to contain over-approximated intervals, leading to an alarm.

The reason why 100 is not enough is due to the fact that an extra slevel is consumed for each iteration of the outer loop; for instance, you can try using -eva-slevel 10 and then -eva-slevel 11 and checking the displayed final states: in the first case, the result remains imprecise for the entire matrix:

[eva:final-states] Values at end of function main:
  a[0..9][0..9] ∈ [0..2147483647]

While in the second case (-eva-slevel 11), we have:

[eva:final-states] Values at end of function main:
  a[0][0] ∈ {0}
   [0][1] ∈ {1}
   [0][2] ∈ {2}
   [0][3] ∈ {3}
   [0][4] ∈ {4}
   [0][5] ∈ {5}
   [0][6] ∈ {6}
   [0][7] ∈ {7}
   [0][8] ∈ {8}
   [0][9] ∈ {9}
   [1..9][0..9] ∈ [0..2147483647]

Thus, the actual slevel required to completely unroll both loops is 109 (the last iteration does not require extra slevel). Computing this exact value is not straightforward, and it gets even more complicated when there are branches in the loop.

Using @loop unroll in nested loops

The @loop unroll annotation is local to a loop, and thus completely independent of whether the loop contains or is contained by other loops. Therefore, we can perform simple calculations to obtain the bounds: (max - min) / step in the general case, which here results in (10 - 0) / 1 = 10.

void main() {
  int a[10][10] = {0};
  //@ loop unroll 10;
  for (int i = 0; i < 10; i++) {
    //@ loop unroll 10;
    for (int j = 0; j < 10; j++) {
      a[i][j] += i + j;
    }
  }
}

With the above annotations, simply running Eva, without any slevel required, will result in no alarms at all. The bounds are much simpler to compute. In future Frama-C releases, we hope to obtain this information automatically whenever possible, so the user will not have to add them in simple cases.

There is currently (in Frama-C 18 - Argon) an option to set a minimum value for @loop unroll for all loops in the program: -eva-min-loop-unroll. This may be unwise for realistic case studies with several loops, but in many cases this saves time by not having to add annotations to each loop in the program. In our first example, adding -eva-min-loop-unroll 10 has the same effect as adding the two //@ loop unroll 10; annotations.

Loops with branches and slevel usage

The previous example was comprised of two very simple nested loops. In reality, many loops contain branches and conditional exits. In such cases, slevel usage becomes exponential. For instance, consider this variant of the previous example, with a non-deterministic increment:

void main(int nondet) {
  int a[10][10] = {0};
  for (int i = 0; i < 10; i++) {
    for (int j = 0; j < 10; j++) {
      a[i][j] += i + j;
      if (nondet) a[i][j]++;
    }
  }
}

In this example, instead of 109 as minimum slevel to avoid any warnings, Eva needs 5509. On my machine, the analysis goes from nearly instantaneous (about 0.3s) to visibly perceptible (about 1.7s). And all this for a very modestly-sized loop, with a single branching point.

Eva has an option to help mitigate the issue with slevel: -eva-slevel-merge-after-loop <functions>. This option ensures that, at the end of each loop iteration, all states from different branches in the current iteration will be merged. This avoids the exponential explosion, but still requires extra slevel, in a non-intuitive manner; for instance, the minimum slevel value required to avoid warnings in the above example is 208. Simply trying to understand precisely why this value is necessary and sufficient requires deep understanding of the evaluation mechanism inside Eva.

Again, if we add the same @loop unroll annotations as before, we do not need anything else: the analysis will not generate any warnings, it will run almost as fast as before (with a barely perceptible increase in time due to the branching), and it will be robust to changes: the introduction of extra branches or nested loops will not affect the existing annotations.

Warnings to help the user

The message starting to merge loop iterations is used to indicate insufficient slevel during the analysis of loops. A more specific warning is now available for loop unroll annotations. For instance, suppose we used 9 instead of 10 in our previous annotations. Eva would then report:

[eva:loop-unrolling] example2.c:6: loop not completely unrolled

This warning appears when there is an annotation that is insufficient to completely unroll the loop, which is often a sign that the annotation should contain a larger value. It can be disabled or even turned into an error, to make sure the user will not miss it.

Upgrading to the new mechanism

Existing case studies may benefit from the new annotation without requiring changes to existing parametrization; in most cases, keeping current slevel values (which are still necessary for dealing with branches) while adding such annotations should give results at least as precise, possibly improving the efficiency of the analysis. Then, slevel values can be lowered on a second stage.

The analysis scripts and sources in the Github open-source-case-studies repository have been updated to include several loop unroll annotations throughout the code, to serve as usage examples. During the process, we could confirm their improved reliability, predictability and ease of use. Their only downsides are that some oracles change due to different line numbers after they are introduced, and the fact that some (if not most) of these annotations could be automatically generated.

Conclusion

The slevel mechanism is very generic, but not appropriate in all situations, especially because it does not allow fine-grained control, and is context-dependent. Having a specific mechanism for handling loops, which is the majority of practical use cases, improves reliability and predictability of the analysis. It also prevents exponential blowup of states in more complex loops.

loop unroll annotations are very useful and likely to be present in every meaningful case study. Their usage is straightforward, and future Frama-C versions may end up incorporating more flexible annotations (e.g. using non-constant expressions). The slevel mechanism is still used in some cases, especially for branching, but its usage is likely to diminish over time, replaced with more specialized and expressive mechanisms.