# Frama-C news and ideas

## Justification

There recently was a thread in the Frama-C mailing list on verifying the Rijndael cipher, standardized and better-known as AES. Nowadays, AES is mostly famous for being sensitive to timing attacks. An attacker measuring the time it takes to encrypt known plaintext with an unknown key can deduce the key (key quote: "Using secret data as an array index is a recipe for disaster").

This brings me back to an experiment with University of Minho researchers where Frama-C's value analysis was used to verify that cryptographic C functions have constant execution time. The justification and the conclusion are the same as in this blog post that had served as the starting point of that experiment.

There are two aspects to constant-time programming: checking that the execution path does not depend on secrets, and (remember the key quote above) checking that array indices do not depend on secrets. I implemented the first check as an extension of Frama-C's dependencies analysis. Manuel Bernardo Barbosa et al similarly implemented the second one, and they used both analyses to verify that cryptographic functions in the NaCl library were safe from timing attacks. I had a glance at their implementation but I had not made the effort to port it to new Frama-C releases (I assume they had not either).

## Example

Consider the function `f()` below:

```int t[10] = {0, 1, 2, 2, 1, 8, 9, 17, 54, 79};

int u[10];

int g;
int a;

void f(int n)
{
int i;
int *p;

for (i = 0; i < t[n] * t[n+1] - 3; i++)
{
p = u + i + a;
if (g)
*p = i;
}
}
```

Execution path dependencies analysis relies on standard dependencies analysis (option `-deps`), which rely on the value analysis. Say you are interested in the execution times of function `f()` when array `t` has its initial values, global variables `a` and `g` are each 0 or 1, and the argument `n` is between 0 and 5. You would then create the following analysis context, as you would for any verification based on the value analysis:

```main(){
a = Frama_C_interval(0, 1);
g = Frama_C_interval(0, 1);
f(Frama_C_interval(0, 5));
}
```

Analyzing this complete program for execution path dependencies:

```\$ frama-c -experimental-path-deps -deps share/builtin.c ep.c
...
Computing path dependencies for function f
Path dependencies of f: t[0..6]; g; n
```

Within the initial conditions defined by the function `main()` (and in particular, `n` between 0 and 5), the execution path depends on the initial values of `t[0..6], g` and `n` at the time `f()` is called. Note that variables `p`, `a` and `u` are not included in this list, although they are used, because the execution path does not depend on them. So if `f()` was a cryptographic function and variable `a` contained a secret, you would not have to worry that this secret can be deduced by observing through timing that the function has different execution paths depending on `a`.

But does this mean that you are safe from timing attacks, though? As demonstrated in the article linked first in this post, you should also worry that a malicious adversary will infer information from the array indices used in `f()`:

```\$ frama-c -experimental-mem-deps -deps share/builtin.c ep.c
...
Computing mem dependencies for function f
Mem dependencies of f: t[0..6]; a; n
```

The above option, that I quickly re-implemented when I saw that the verification of cryptographic functions was being discussed in the mailing list, tells which of `f()`'s inputs influence the computation of array indices, in the same way that option `-experimental-path-deps` tells which of `f()`'s inputs influence the execution path inside `f()`. In this example, it tells that the contents of input variable `a` can leak through a cache-timing attack (because `a` is used in the computation of `p`, and later `p` may be dereferenced for writing).

```\$ frama-c rijndael.c -cpp-command "gcc -C -E -I /usr/local/Frama-C/share/frama-c/libc" -no-annot -slevel 9999 mymain.c -experimental-mem-deps -deps
...
Mem dependencies of rijndaelEncrypt: rk; Nr_0; pt; ct; in[0..15]; key[0..39];
Te0[0..255]; Te1[0..255]; Te2[0..255]; Te3[0..255]
```

The analysis finds that there is a risk that `in[0..15]` (the input buffer) and `key[0..39]` (the key) may be leaked through cache-timing attacks. The other variables are `Nr_0`, the number of rounds, `pt` and `ct`, just pointers, and the `Tei` arrays, the S-boxes. Those aren't secrets so it doesn't matter that they are listed.

Like every other analysis in Frama-C, this analysis always makes approximations on the safe side, so we expected to find this result (since we knew the attack had been realized). The question is, is the analysis precise enough to conclude that another cryptographic function is safe? The answer is yes: the hash function Skein-256 can be shown to have an execution time that depends only on the length of the input buffer, but not of its contents.

## Implementation

The option `-experimental-path-deps` has been available as a hidden, proof-of-concept feature inside the official Frama-C distribution roughly since the Valgrind blog post was published in 2010.

I re-implemented the analysis of memory accesses dependencies analysis demonstrated here as `-experimental-mem-deps`. It has been available since Frama-C version Oxygen.

Note that the feature was very lightly tested. The new option `-experimental-mem-deps` was written in 45 minutes and the examples in this post are the only testing it received. If you feel that automatic analyses such as these are too automatic (they don't force you to write invariants for loops in the target code, for instance), you can instead spend your time usefully by devising test suites to make sure they work as intended in all cases.

## Conclusion

One must be careful with a posteriori studies. There are plenty of bugs for which, once they have been pointed out, it is tempting to say "hey, I could have found it with this or that static analysis tool! My tool finds these bugs. Everyone is so stupid not to be using it". The point is that, much like a dowser being tested, you only found the bug with your tool when someone else had told you where it was. Had you tried to find it before being told, perhaps you would have given up right in the setup phase, or you would have had so many false positives to review that you would never have gotten round to the bug, or you would have gotten round to it but would have dismissed it quickly as another false positive.

So I want to make it absolutely clear that I am not saying that the AES implementation conveniently posted in the mailing list should have been checked with Frama-C when it was written. This is a causal impossibility. However, timing attacks are a recognized family of attacks now, and cryptographic functions analyze rather well (based on experiments with AES and Skein), so Frama-C is definitely a tool that can be used nowadays to verify that new cryptographic functions are being implemented according to the state of the art.

This blog post owes to Adam Langley for describing the method in the context of Valgrind, to Manuel Bernardo Barbosa and his students for courageously writing the first version of the array indices dependencies analysis (which I am sure took them more than 45 minutes) and applying it, and to 罗婷 for posting on frama-c-discuss a version of AES that was ready to analyze.