Iterating over the AST
By virgile on Tuesday, May 22 2012, 10:45 - Permalink
A facetious colleague who claims that he should be better writing his thesis but keeps committing Coq and OCaml files on Frama-C's repository, asked me the following question:
Is there a function in Frama-C's kernel that can fold a function over all types that appear in a C program?
As it turns out, nothing is readily available for this exact purpose, but another facetious colleague could have come up with a terse answer:
There's a visitor for that! Now, we can be a bit more helpful and actually write that visitor. Our end goal is to come up with a function
fold_typ: (Cil_types.typ -> 'a -> 'a) -> 'a -> 'a
fold_typ f init will be
f t_1 (f t_2 (... (f t_n init)...)), where the
t_i are all the C types appearing in a given C program (in no particular order).
The visitor pattern is a well-known object-oriented design pattern that can be used to perform a given action over all nodes of a complex data structure (in our case the Abstract Syntax Tree -AST for short- of a C program). Frama-C provides a generic visitor mechanism, built upon CIL's visitor,
whose entry points can be found in the aptly named
src/kernel/visitor.mli file. It is also documented in section 5.14 of the developer manual. In summary, you first define a class that inherits from the generic visitor and overrides the methods corresponding to the nodes you're interested in (in our case, this will be
vtype for visiting uses of
Cil_types.typ). Then you apply an object of this class to the function from the
Visitor module that correspond to the subpart of the AST that you want to inspect (in our case, this will be the whole AST).
The standard visitor does not provide anything to return a value outside of the AST. In fact, all the entry points in
Visitor return a node of the same type that the node in which the visit starts (for visitors that don't perform code transformation, this is in fact physically the same node). But if you accept to sneak in a little bit of imperative code into your development -and by using Frama-C you've already accepted that- there is an easy way out: pass to the visitor a reference
that it can update, and you just have to read the final value that reference is holding after the visit. The visitor then looks like the following:
class fold_typ_basic f acc = object inherit Visitor.frama_c_inplace method vtype ty = acc:= f ty !acc; Cil.DoChildren end
And that's it. Each time the visitor sees a type, it will apply
ty and the result of the previous computations, stored in
fold_typ then just needs to call the visitor over the whole AST and give it a reference initialized with
let fold_typ f init = let racc = ref init in let vis = new fold_typ_basic f racc in Visitor.visitFramacFileSameGlobals vis (Ast.get()); !racc
Don't do the same work twice
This first version, that is barely more than 10 LoC, works, but we can do a little better. Indeed,
f will be called each time a type is encountered in the AST. In most cases, we want to call
f once for any given type. This can be done quite simply by memoizing in an instance variable of our visitor the set of types encountered thus far. Frama-C's
Cil_datatype module (
cil/src/cil_datatype.mli) provides all the needed functions for that:
class fold_typ f acc = object inherit Visitor.frama_c_inplace val mutable known_types = Cil_datatype.Typ.Set.empty method vtype ty = if Cil_datatype.Typ.Set.mem ty known_types then Cil.DoChildren else begin known_types <- Cil_datatype.Typ.Set.add ty known_types; acc:= f ty !acc; Cil.DoChildren end end
Testing the infrastructure
It is now time to test if everything works smoothly. The following function will print the name and size of the type who has the biggest size in the analyzed program.
let test () = let f ty (maxty,maxsize as acc) = try let size = Cil.sizeOf_int ty in if size > maxsize then (ty,size) else acc with Cil.SizeOfError _ -> acc in let (ty,size) = fold_typ f (Cil.voidType,0) in Format.printf "Biggest type is %a,@ with size %d@." !Ast_printer.d_type ty size
Since it is only a quick test, we don't do anything special if Cil complains that it cannot compute the size of a given type: we just stick to the maximal value computed so far.
File fold_typ.ml provides the code for the visitor and the test function.
frama-c -load-script fold_typ.ml file.c should output something like
[kernel] preprocessing with "gcc -C -E -I. file.c" Biggest type is struct Ts, with size 44
- How can you use the
fold_typclass to define an
iter_typfunction that apply a function
unitto each type of the AST (
val iter_typ: (Cil_types.typ -> unit) -> unit)?
fold_typis a bit overkill if you're going to apply it once in your plug-in. Write a specialized visitor that will do the same thing as the