Extracting information from Frama-C programmatically

Virgile Prevosto - 4th Sep 2012

Extending the framework

Some time ago, one of our fellow users asked us whether it was possible to extract the control-flow graph (CFG) of C functions from Frama-C. Fact is, although the CFG is computed during the elaboration of the AST, nothing exists currently to present the information to the user. But Frama-C is a Flexible framework, and a small (50 LoC) script will provide this functionality albeit in a rather crude form. More specifically we'll ask Frama-C to output a file in the dot format that will contain a graph for each of the functions of the program under analysis. Basically our script is composed of three parts: a function to pretty-print the statements as node of the graph a visitor to create the graphs and the boilerplate code that takes care of registering the code into Frama-C kernel.

Pretty-printing

Frama-C already has a function to pretty-print statements but it is not suitable for us as it will recursively print substatements of compound statements (blocks if while ...) while we only want to have a label for the current statement: substatements will be represented by other nodes. We'll use thus the following small function:

let print_stmt out = 
 function 
   | Instr i -> !Ast_printer.d_instr out i 
   | Return _ -> Format.pp_print_string out "<return>" 
   | Goto _ -> Format.pp_print_string out "<goto>" 
   | Break _ -> Format.pp_print_string out "<break>" 
   | Continue _ -> Format.pp_print_string out "<continue>" 
   | If (e _ _ _) -> Format.fprintf out "if %a" !Ast_printer.d_exp e 
   | Switch(e _ _ _) -> Format.fprintf out "switch %a" !Ast_printer.d_exp e 
   | Loop _ -> Format.fprintf out "<loop>" 
   | Block _ -> Format.fprintf out "<enter block>" 
   | UnspecifiedSequence _ -> Format.fprintf out "<enter unspecified sequence>" 
   | TryFinally _ | TryExcept _ -> Format.fprintf out "<try>" 

Creating the graphs

In order to create our output we must make a pass through the whole AST. An easy way to do that is to use CIL's visitor mechanism. There are three kinds of nodes where we have something to do. First at the file level we create the whole graph structure:

method vfile f = 
    Format.fprintf out "@[<hov 2>digraph cfg {@"; 
    ChangeDoChildrenPost (f fun f -> Format.fprintf out "}@."; f) 

Then for each function we encapsulate the CFG in a subgraph (and don't do anything for other globals hence the SkipChildren for the other branch of the pattern-matching).

  method vglob_aux g = 
    match g with 
      | GFun(f loc) -> 
          Format.fprintf out "@[<hov 2>subgraph cluster_%a {@\ 
\ 
                             graph [label=\"%a\"];@"  
            Cil_datatype.Varinfo.pretty f.svar 
            Cil_datatype.Varinfo.pretty f.svar; 
          ChangeDoChildrenPost([g]  fun g -> Format.fprintf out "}@\ 
@]"; g) 
      | _ -> SkipChildren 

Last for each statement we have to create a node of the graph and create the edges toward its successors:

  method vstmt_aux s = 
    Format.fprintf out "s%d@[[label=\"%a\"]@];@" s.sid print_stmt s.skind; 
    List.iter  
      (fun succ -> Format.fprintf out "s%d -> s%d;@" s.sid succ.sid) 
      s.succs; 
    DoChildren 

Registration within Frama-C's main loop

Now we want Frama-C to be aware of our script. Basically this amounts to write a driver function of type unit -> unit that will launch the visitor process and to tell Frama-C that this function should be run as an analysis:

let run () = 
  let chan = open_out "cfg.out" in 
  let fmt = Format.formatter_of_out_channel chan in 
  Visitor.visitFramacFileSameGlobals (new print_cfg fmt) (Ast.get()) 
let () = Db.Main.extend run 

Launching the script

In order to obtain a cfg.out file with our graphs we simply have to call frama-c that way:

frama-c -load-script cfg_print.ml [other_options] file.c [file2.c]  

Then provided graphviz is installed the command

dot -Tpdf -o cfg.pdf cfg.out 

will provide a pdf file similar to this one

What next?

This script is very basic and should be improved in several ways in order to be useful beyond the point of showing how to interact with Frama-C's API (some of these might be the subject of new posts in the future)

  • The graphs could be fancier in particular by distinguishing between branching nodes and plain ones or showing exit of blocks as well as their beginning
  • For now the graphs are generated each time the script is explicitely loaded from the command line. One might eventually want to compile it as a proper plugin systematically loaded by Frama-C with the graph generation commanded by a proper command-line option
  • Some other parameters might be interesting such as choosing the name of the output file including only the CFG of certain functions ...
  • The script could interact with some plugins to give more information (e.g. showing dead paths for a given run of Value)
"
Virgile Prevosto
4th Sep 2012