Frama-C news and ideas

To content | To menu | To search | Frama-C Home

Iterating over the AST

Context

A facetious colleague who claims that he should be better writing his thesis but keeps committing Coq and OCaml files on Frama-C's repository, asked me the following question: Is there a function in Frama-C's kernel that can fold[1] a function over all types that appear in a C program?

As it turns out, nothing is readily available for this exact purpose, but another facetious colleague could have come up with a terse answer: There's a visitor for that! Now, we can be a bit more helpful and actually write that visitor. Our end goal is to come up with a function

fold_typ: (Cil_types.typ -> 'a -> 'a) -> 'a -> 'a

so that fold_typ f init will be f t_1 (f t_2 (... (f t_n init)...)), where the t_i are all the C types appearing in a given C program (in no particular order).

Frama-C's visitors

The visitor pattern is a well-known object-oriented design pattern that can be used to perform a given action over all nodes of a complex data structure (in our case the Abstract Syntax Tree -AST for short- of a C program). Frama-C provides a generic visitor mechanism, built upon CIL's visitor, whose entry points can be found in the aptly named src/kernel/visitor.mli file. It is also documented in section 5.14 of the developer manual. In summary, you first define a class that inherits from the generic visitor and overrides the methods corresponding to the nodes you're interested in (in our case, this will be vtype for visiting uses of Cil_types.typ). Then you apply an object of this class to the function from the Visitor module that correspond to the subpart of the AST that you want to inspect (in our case, this will be the whole AST).

Basic version

The standard visitor does not provide anything to return a value outside of the AST. In fact, all the entry points in Visitor return a node of the same type that the node in which the visit starts (for visitors that don't perform code transformation, this is in fact physically the same node). But if you accept to sneak in a little bit of imperative code into your development -and by using Frama-C you've already accepted that- there is an easy way out: pass to the visitor a reference that it can update, and you just have to read the final value that reference is holding after the visit. The visitor then looks like the following:

class fold_typ_basic f acc =
object
  inherit Visitor.frama_c_inplace
  method vtype ty = acc:= f ty !acc; Cil.DoChildren
end

And that's it. Each time the visitor sees a type, it will apply f to ty and the result of the previous computations, stored in acc. fold_typ then just needs to call the visitor over the whole AST and give it a reference initialized with init:

let fold_typ f init =
  let racc = ref init in
  let vis = new fold_typ_basic f racc in
  Visitor.visitFramacFileSameGlobals vis (Ast.get());
  !racc

Don't do the same work twice

This first version, that is barely more than 10 LoC, works, but we can do a little better. Indeed, f will be called each time a type is encountered in the AST. In most cases, we want to call f once for any given type. This can be done quite simply by memoizing in an instance variable of our visitor the set of types encountered thus far. Frama-C's Cil_datatype module (cil/src/cil_datatype.mli) provides all the needed functions for that:

class fold_typ f acc =
object
  inherit Visitor.frama_c_inplace
  val mutable known_types = Cil_datatype.Typ.Set.empty
  method vtype ty =
    if Cil_datatype.Typ.Set.mem ty known_types then Cil.DoChildren
    else begin
      known_types <- Cil_datatype.Typ.Set.add ty known_types;
      acc:= f ty !acc;
      Cil.DoChildren
    end
end

Testing the infrastructure

It is now time to test if everything works smoothly. The following function will print the name and size of the type who has the biggest size in the analyzed program.

let test () =
  let f ty (maxty,maxsize as acc) =
    try
      let size = Cil.sizeOf_int ty in
      if size > maxsize then (ty,size) else acc
    with Cil.SizeOfError _ -> acc
  in
  let (ty,size) = fold_typ f (Cil.voidType,0) in
  Format.printf "Biggest type is %a,@ with size %d@." 
    !Ast_printer.d_type ty size

Since it is only a quick test, we don't do anything special if Cil complains that it cannot compute the size of a given type: we just stick to the maximal value computed so far.

File fold_typ.ml provides the code for the visitor and the test function. frama-c -load-script fold_typ.ml file.c should output something like

[kernel] preprocessing with "gcc -C -E -I.  file.c"
Biggest type is struct Ts,
with size 44

Exercises

  1. How can you use the fold_typ class to define an iter_typ function that apply a function f returning unit to each type of the AST (val iter_typ: (Cil_types.typ -> unit) -> unit)?
  2. Writing fold_typ is a bit overkill if you're going to apply it once in your plug-in. Write a specialized visitor that will do the same thing as the test function above.

Notes

[1] For those who are not familiar with functional programming: http://en.wikipedia.org/wiki/Fold_%...

Attachments