Frama-C news and ideas

To content | To menu | To search | Frama-C Home

Tag - FLT_EVAL_METHOD

Entries feed - Comments feed

Wednesday, July 24 2013

More on the precise analysis of C programs for FLT_EVAL_METHOD==2

Introduction

It started innocently enough. My colleagues were talking of supporting target compilers with excess floating-point precision. We saw that if analyzing programs destined to be compiled with strict IEEE 754 compilers was a lovely Spring day at the beach, analyzing for compilers that allow excess precision was Normandy in June, 1944. But we had not seen anything, yet.

The notion of compile-time computation depends on the optimization level

One first obvious problem was that of constant expressions that were evaluated at compile-time following rules that differed from run-time ones. And who is to say what is evaluated at compile-time and at run-time? Why, it even depends, for one same compiler, on the optimization level:

#include <stdio.h>

int r1, r2;

double ten = 10.0;

int main(int c, char **v)
{
  ten = 10.0;
  r1 = 0.1 == (1.0 / ten);
  r2 = 0.1 == (1.0 / 10.0);
  printf("r1=%d r2=%d\n", r1, r2);
}

Note how, compared to last time, we make the vicious one-line change of assigning variable ten again inside function main().

$ gcc -v
Target: x86_64-linux-gnu
…
gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5.1) 
$ gcc -mno-sse2 -mfpmath=387 -std=c99 -O2  s.c && ./a.out 
r1=1 r2=1
$ gcc -mno-sse2 -mfpmath=387 -std=c99  s.c && ./a.out 
r1=0 r2=1

So the problem is not just that the static analyzer must be able to recognize the computations that are done at compile-time. A precise static analyzer that went this path would in addition have to model each of the tens of optimization flags of the target compiler and their effects on the definition of constant expression.


Fortunately for us, after many, varied complaints from GCC users, Joseph S. Myers decided that 387 floating-point math in GCC was at least going to be predictable. That would not solve all the issues that had been marked as duplicates of the infamous bug 323 over its lifetime, but it would answer the valid ones.

A ray of hope

Joseph S. Myers provided a reasonable interpretation of the effects of FLT_EVAL_METHOD in the C99 standard. The comparatively old compiler we used in the previous post and in the first section of this one does not contain the patch from that discussion, but recent compilers do. The most recent GCC I have available is SVN snapshot 172652 from 2011. It includes the patch. With this version of GCC, we compile and execute the test program below.

#include <stdio.h>

int r1, r2, r3, r4, r5, r6, r7;

double ten = 10.0;

int main(int c, char **v)
{
  r1 = 0.1 == (1.0 / ten);
  r2 = 0.1 == (1.0 / 10.0);
  r3 = 0.1 == (double) (1.0 / ten);
  r4 = 0.1 == (double) (1.0 / 10.0);
  ten = 10.0;
  r5 = 0.1 == (1.0 / ten);
  r6 = 0.1 == (double) (1.0 / ten);
  r7 = ((double) 0.1) == (1.0 / 10.0);
  printf("r1=%d r2=%d r3=%d r4=%d r5=%d r6=%d r7=%d\n", r1, r2, r3, r4, r5, r6, r7);
}

We obtain the following results, different from the results of the earlier version of GCC, but independent of the optimization level and understandable (all computations are done with FLT_EVAL_METHOD==2 semantics):

$ ./gcc-172652/bin/gcc -mno-sse2 -mfpmath=387 -std=c99  t.c && ./a.out 
r1=1 r2=1 r3=0 r4=0 r5=1 r6=0 r7=0

As per the C99 standard, the choice was made to give the literal 0.1 the value of 0.1L. I am happy to report that this simple explanation for the values of r2 and r7 can be inferred directly from the assembly code. Indeed, the corresponding constant is declared in assembly as:

.LC1:
	.long	3435973837
	.long	3435973836
	.long	16379
	.long	0


Quiz: why is it obvious in the above assembly code for a long double constant that the compiler used the long double approximation for 0.1 instead of the double one?


As described, the semantics of C programs compiled with FLT_EVAL_METHOD==2 are just as deterministic as if they were compiled with FLT_EVAL_METHOD==0. They give different results from the latter, but always the same ones, regardless of optimization level, interference from unrelated statements, and even regardless of the particular compiler generating code with FLT_EVAL_METHOD==2. In the discussion that followed between Joseph Myers and Ian Lance Taylor, this is called “predictable semantics” and it is a boon to anyone who whishes to tell what a program ought to do when executed (including but not limited to precise static analyzers).

Implementation detail: source-to-source transformation or architecture option?

Now that at least one C compiler can be said to have predictable behavior with respect to excess precision, the question arises of supporting FLT_EVAL_METHOD==2 in Frama-C. This could be one more of the architecture-dependent parameters such as the size of type int and the endianness.

The rules are subtle, however, and rather than letting each Frama-C plug-in implement them and get them slightly wrong, it would be less error-prone to implement these rules once and for all as a source-to-source translation from a program with FLT_EVAL_METHOD==2 semantics to a program that when compiled or analyzed with FLT_EVAL_METHOD==0 semantics, computes the same thing as the first one.

The destination of the transformation can be a Frama-C AST

A translated program giving, when compiled with strict IEEE 754 semantics, the FLT_EVAL_METHOD==2 semantics of an existing program can be represented as an AST in Frama-C. Here is how the translation would work on an example:

double interpol(double u1, double u2, double u3)
{ 
  return u2 * (1.0 - u1) + u1 * u3;  
}

Function interpol() above can be compiled with either FLT_EVAL_METHOD==0 or with FLT_EVAL_METHOD==2. In the second case, it actually appears to have slightly better properties than in the first case, but the differences are minor.

A source-to-source translation could transform the function into that below:

double interpol_80(double u1, double u2, double u3)
{ 
  return u2 * (1.0L - (long double)u1) + u1 * (long double)u3;  
}

This transformed function, interpol_80(), when compiled or analyzed with FLT_EVAL_METHOD==0, behaves exactly identical to function interpol() compiled or analyzed with FLT_EVAL_METHOD==2. I made an effort here to insert only the minimum number of explicit conversions but a Frama-C transformation plug-in would not need to be so punctilious.

The source of the transformation cannot be a Frama-C AST

There is however a problem with the implementation of the transformation as a traditional Frama-C transformation plug-in. It turns out that the translation cannot use the normalized Frama-C AST as source. Indeed, if we use a Frama-C command to print the AST of the previous example in textual form:

~ $ frama-c -print -kernel-debug 1 t.c
…
/* Generated by Frama-C */
int main(int c, char **v)
{
  /* Locals: __retres */
  int __retres;
  /* sid:18 */
  r1 = 0.1 == 1.0 / ten;
  /* sid:19 */
  r2 = 0.1 == 1.0 / 10.0;
  /* sid:20 */
  r3 = 0.1 == 1.0 / ten;
  /* sid:21 */
  r4 = 0.1 == 1.0 / 10.0;
  …
}

Explicit casts to a type that an expression already has, such as the casts to double in the assignments to variables r3 and r4, are erased by the Frama-C front-end as part of its normalization. For us, this will not do: these casts, although they convert a double expression to double, change the meaning of the program, as shown by the differences between the values of r1 and r3 and respectively or r2 and r4 when one executes our example.

This setback would not be insurmountable but it means complications. It also implies that FLT_EVAL_METHOD==2 semantics cannot be implemented by individual plug-ins, which looked like a possible alternative.


To conclude this section on a positive note, if the goal is to analyze a C program destined to be compiled to the thirty-year old 8087 instructions with a recent GCC compiler, we can build the version of Frama-C that will produce results precise to the last bit. The amount of work is not inconsiderable, but it is possible.

But wait!

But what about a recent version of Clang? Let us see, using the same C program as previously:

#include <stdio.h>

int r1, r2, r3, r4, r5, r6, r7;

double ten = 10.0;

int main(int c, char **v)
{
  r1 = 0.1 == (1.0 / ten);
  r2 = 0.1 == (1.0 / 10.0);
  r3 = 0.1 == (double) (1.0 / ten);
  r4 = 0.1 == (double) (1.0 / 10.0);
  ten = 10.0;
  r5 = 0.1 == (1.0 / ten);
  r6 = 0.1 == (double) (1.0 / ten);
  r7 = ((double) 0.1) == (1.0 / 10.0);
  printf("r1=%d r2=%d r3=%d r4=%d r5=%d r6=%d r7=%d\n", r1, r2, r3, r4, r5, r6, r7);
}
$ clang -v
Apple LLVM version 4.2 (clang-425.0.24) (based on LLVM 3.2svn)
$ clang -mno-sse2 -std=c99  t.c && ./a.out
r1=0 r2=1 r3=0 r4=1 r5=1 r6=0 r7=1

Oh no! Everything is to be done again… Some expressions are evaluated as compile-time with different results than the run-time ones, as shown by the difference between r1 and r2. The explicit cast to double does not seem to have an effect for r3 and r4 as compared to r1 and r2. This is different from Joseph Myers's interpretation, but if it is because floating-point expressions are always converted to their nominal types before being compared, it may be a good astonishment-lowering move. The value of r5 differs from that of r1, pointing to a non-obvious demarcation line between compile-time evaluation and run-time evaluation. And the values of r5 and r6 differ, meaning that our interpretation “explicit casts to double have no effects” based on the comparison of the values of r1 and r3 on the one hand and r2 and r4 on the other hand, is wrong or that some other compilation pass can interfere.

What a mess! There is no way a precise static analyzer can be made for this recent version of Clang (with these unfashionable options). Plus the results depend on optimizations:

$ clang -mno-sse2 -std=c99 -O2  t.c && ./a.out
r1=0 r2=1 r3=0 r4=1 r5=1 r6=1 r7=1

FLT_EVAL_METHOD is not ready for precise static analysis

In conclusion, it would be possible, and only quite a lot of hard work, to make a precise static analyzer for programs destined to be compiled to x87 instructions by a modern GCC. But for most other compilers, even including recent ones, it is simply impossible: the compiler gives floating-point operations a meaning that only it knows.

This is the sort of problem we tackled in the Hisseo project mentioned last time. One of the solutions we researched was “Just do not make a precise static analyzer”, and another was “Just analyze the generated assembly code where the meaning of floating-point operations has been fixed”. A couple of years later, the third solution, “Just use a proper compiler”, is looking better and better. It could even be a certified one, although it does not have to. Both Clang and GCC, when targeting the SSE2 instruction set, give perfect FLT_EVAL_METHOD==0 results. We should all enjoy this period of temporary sanity until x86-64 processors all sport a fused-multiply-add instruction.

Two things I should point out as this conclusion's conclusion. First, with the introduction of SSE2, the IA-32 platform (and its x86-64 cousin) has gone from the worst platform still in existence for predictable floating-point results to the best. It has correctly rounded operations for the standard single-precision and double-precision formats, and it retains hardware support for an often convenient extended precision format. Second, the fused-multiply-add instruction is a great addition to the instruction set, and I for one cannot wait until I get my paws on a processor that supports it, but it is going to be misused by compilers to compile source-level multiplications and additions. Compilers have not become wiser. The SSE2 instruction set has only made it more costly for them to do the wrong thing than to do the right one. They will break predictability again as soon as the opportunity comes, and the opportunity is already in Intel and AMD's product pipelines.

Saturday, July 6 2013

On the precise analysis of C programs for FLT_EVAL_METHOD==2

There has been talk recently amongst my colleagues of Frama-C-wide support for compilation platforms that define FLT_EVAL_METHOD as 2. Remember that this compiler-set value, introduced in C99, means that all floating-point computations in the C program are made with long double precision, even if the type of the expressions they correspond to is float or double. This post is a reminder, to the attention of these colleagues and myself, of pitfalls to be anticipated in this endeavor.


We are talking of C programs like the one below.

#include <stdio.h>

int r1;

double ten = 10.0;

int main(int c, char **v)
{
  r1 = 0.1 == (1.0 / ten);
  printf("r1=%d\n", r1);
}

With a C99 compilation platform that defines FLT_EVAL_METHOD as 0, this program prints "r1=1", but with a compilation platform that sets FLT_EVAL_METHOD to 2, it prints “r1=0”.

Although we are discussing non-strictly-IEEE 754 compilers, we are assuming IEEE 754-like floating-point: we're not in 1980 any more. Also, we are assuming that long double has more precision than double, because the opposite situation would make any discussion specifically about FLT_EVAL_METHOD == 2 quite moot. In fact, we are precisely thinking of compilation platforms where float is IEEE 754 single-precision (now called binary32), double is IEEE 754 double-precision (binary64), and long double is the 8087 80-bit double-extended format.


Let us find ourselves a compiler with the right properties :

$ gcc -v
Using built-in specs.
Target: x86_64-linux-gnu
…
gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5.1) 
$ gcc -mfpmath=387 -std=c99 t.c && ./a.out 
r1=0

Good! (it seems)

The test program sets r1 to 0 because the left-hand side 0.1 of the equality test is the double-precision constant 0.1, whereas the right-hand side is the double-extended precision result of the division of 1 by 10. The two differ because 0.1 cannot be represented exactly in binary floating-point, so the long double representation is closer to the mathematical value and thus different from the double representation. We can make sure this is the right explanation by changing the expression for r1 to 0.1L == (1.0 / ten), in which the division is typed as double but computed as long double, then promoted to long double in order to be compared to 0.1L, the long double representation of the mathematical constant 0.1. This change causes r1 to receive the value 1 with our test compiler, whereas the change would make r1 receive 0 if the program was compiled with a strict IEEE 754 C compiler.

Pitfall 1: Constant expressions

Let us test the augmented program below:

#include <stdio.h>

int r1, r2;

double ten = 10.0;

int main(int c, char **v)
{
  r1 = 0.1 == (1.0 / ten);
  r2 = 0.1 == (1.0 / 10.0);
  printf("r1=%d r2=%d\n", r1, r2);
}

In our first setback, the program prints “r1=0 r2=1”. The assignment to r2 has been compiled into a straight constant-to-register move, based on a constant evaluation algorithm that does not obey the same rules that execution does. If we are to write a precise static analyzer that corresponds to this GCC-4.4.3, this issue is going to seriously complicate our task. We will have to delineate a notion of “constant expressions” that the analyzer with evaluate with the same rules as GCC's rules for evaluating constant expressions, and then implement GCC's semantics for run-time evaluation of floating-point expressions for non-constant expressions. And our notion of “constant expression” will have to exactly match GCC's notion of “constant expression”, lest our analyzer be unsound.

Clarification: What is a “precise” static analyzer?

This is as good a time as any to point out that Frama-C's value analysis plug-in, for instance, is already able to analyze programs destined to be compiled with FLT_EVAL_METHOD as 2. By default, the value analysis plug-in assumes IEEE 754 and FLT_EVAL_METHOD == 0:

$ frama-c -val t.c
…
t.c:9:[kernel] warning: Floating-point constant 0.1 is not represented exactly.
 Will use 0x1.999999999999ap-4.
 See documentation for option -warn-decimal-float
…
[value] Values at end of function main:
          r1 ∈ {1}
          r2 ∈ {1}


The possibility of FLT_EVAL_METHOD being set to 2 is captured by the option -all-rounding-modes:

$ frama-c -val -all-rounding-modes t.c
…
t.c:9:[kernel] warning: Floating-point constant 0.1 is not represented exactly.
 Will use 0x1.999999999999ap-4.
 See documentation for option -warn-decimal-float
…
[value] Values at end of function main:
          r1 ∈ {0; 1}
          r2 ∈ {0; 1}

The sets of values predicted for variables r1 and r2 at the end of main() each contain the value given by the program as compiled by GCC-4.4.3, but these sets are not precise. If the program then went on to divide r1 by r2, Frama-C's value analysis would warn about a possible division by zero, whereas we know that with our compiler, the division is safe. The warning would be a false positive.

We are talking here about making a static analyzer with the ability to conclude that r1 is 0 and r2 is 1 because we told it that we are targeting a compiler that makes it so.


The above example command-lines are for Frama-C's value analysis, but during her PhD, Thi Minh Tuyen Nguyen has shown that the same kind of approach could be applied to source-level Hoare-style verification of floating-point C programs. The relevant articles can be found in the results of the Hisseo project.

To be continued

In the next post, we will find more pitfalls, revisit a post by Joseph S. Myers in the GCC mailing list, and conclude that implementing a precise static analyzer for this sort of compilation platform is a lot of work.