Explaining why Csmith matters even more than previously anticipated
By pascal on Sunday, December 4 2011, 11:00 - Permalink
Csmith as a static analyzer fuzzer
A new version of Csmith, the generator of random defined C programs, was released a few days ago. This is the version that many functions in Frama-C Nitrogen were debugged against. Conversely, a few bugs in the development versions of Csmith, characterized by programs being generated that were not defined (although they compiled and ran fine), were found by pre-Nitrogen development versions of Frama-C and are fixed in the 2.1.0 Csmith release.
The original intention behind Csmith was to find bugs in compilers. As I pointed out in earlier posts, Csmith can also find bugs in static analysis tools.
Here is the latest bug found by Csmith in Frama-C. To explain: ACSL properties, such as alarms emitted by the value analysis, are attached to the beginning of statements. When the danger is in the implicit cast in the assignment of a function's result to an lvalue, the AST may have been normalized in such a way there is no good place to attach the corresponding alarm: before the assignment-cum-function-call, it is too early to express properties about the function's result, and after it, it is too late. This Frama-C bug was found last because it cannot be found by using the value analysis as a C interpreter on defined programs, my initial testing method. In order to find this bug, alarms must be emitted during the analysis, and there aren't any when interpreting a defined program. The bug was found with a different testing method, where the value analysis is not used as an interpreter, makes approximations, and therefore emit alarms. A simple workaround for this bug is to normalize the program differently, using the pre-existing option
One shouldn't base a research project on another research project
I was Benjamin C. Pierce's intern for a couple of months as an undergraduate. The title of this section is one of the insights he gave me then that stuck. In context, he probably meant the obvious fact that you do not want to spend your time running into bug after bug in the prototypal implementation left from the initial research project. Another, more insidious way this warning holds is that even if the ideas from the first project have a solid implementation, you may encounter a problem finding a public to tell about what you've done.
I find myself in a position where I would like to tell people that Csmith is even more useful than previously anticipated. It can not only find bugs in compilers, but also, with some effort, in static analyzers. Isn't that important? (alright, perhaps I am biased here)
Besides, even when there is no bug to find, Csmith can point out differences in assumptions between the static analyzer and the compiler used to compile safety-critical code. One thing I discovered using Csmith was, for instance, that you shouldn't expect different compilers to handle zero-width bitfields in packed structs the same way. None of the compilers I tried was wrong, they were just using different conventions. Frama-C implements one of the conventions. I have simply added this combination of constructs to the list of features to check either that the analyzed program does not use, or that the target compiler implements exactly the same way as Frama-C.
Sadly, Csmith authors think that generating programs that reveal these differences detracts from Csmith's original goals. When Csmith is used as originally intended, compilers are compared to one another. If they do not give the same result, one of them must be wrong. So they removed these combinations from the generated programs as I was reporting them (and they probably removed many more on their own that would have been interesting in the context of verifying that a static analyzer is well-matched with a specific compiler).
An experience report looking for a public
That Csmith is even more useful than anticipated is only the (luckily positive) conclusion of the experiment. Building up to this conclusion, there are the methods that were used in order to test various Frama-C functions, the categorization of bugs that were found in both Frama-C and Csmith, and perhaps a couple of interesting examples of these. This could fit well in the "experience report" format that some conferences have a tag for in their calls for papers. But I have had little success digging up a conference where submitting would not be a waste of time for me and for the future reviewers.
A Frama-C workshop or a Csmith workshop, if either one existed, would be a perfect fit. One half of the contents would make the report fully on-topic while the other half would give it a touch of exoticism. Unfortunately, both are research tools without the users to justify their own event. John Regehr presented Csmith as the GCC Summit, and I guess this might be a possibility, but this potential report is at most on-topic by association there. On the static analysis side, I do not see too many people being interested in the minutiae of C's pitfalls. One impression I got at ICPC and already commented on in this blog was that the "It's possible to do things on Java, so let's study Java" school of thought has more proponents than the "There are large codebases in C, let's see what we can do about C" school of thought. Even among the latter, I am a bit worried such a report would be perceived as dealing with the details that other researchers have already interiorized as unimportant when they decided to ignore them in their own work.
I am not complaining. In a world in which either Frama-C or Csmith, or comparable tools, were widely used, someone would already have done this experiment before me. I might even have had to read a book discussing the proper way to do this sort of thing before I started. Truly, each researcher probably finds emself in the same situation at one point or another. Experimented researchers are those who have learnt to anticipate that a possible line of work, even if it gives useful results, will give results that are hard to publish.
I am lucky in that "researcher" is only half my job description. Inventing methods to test Frama-C with Csmith as we were going along was much fun, and fixing the bugs was the first and foremost objective.