Frama-C news and ideas

To content | To menu | To search | Frama-C Home

On the redundancy of C99's restrict

The restrict keyword in C99

C99 introduced a restrict keyword. The intention is to let the programmer specify the absence of alias between some inputs of a function ey is writing.

Consider the function:

int f1(int * restrict p, int * restrict q)
{
  *p = 1;
  *q = 2;
  return *p + *q;
}

Thanks to the restrict keyword, GCC can compile function f1() into the following assembly code:

$ ~/gcc-172652/bin/gcc -O3 -std=c99 -S restr.c
$ cat restr.s
...	
	movl	$1, (%rdi)
	movl	$2, (%rsi)
	movl	$3, %eax
	ret
...

Note that the -std=c99 is necessary. The code, parsed as a C89 program, is syntactically incorrect because restrict is not a keyword in C89.

In the generated x86_64 code, %rdi is the first argument p, %rsi is the second argument q, the parentheses dereference and the dollar sign indicates constants. By convention, register %eax contains the return value when the ret instruction is reached.

The above is pretty good code. Since the information is provided that pointers passed to function f1 for p and q never alias, the compiler knows that at the point of the return, *p must be 1, and therefore the returned expression evaluates to 3.

Contrast with the same function compiled without the restrict annotations, say, because all you have is a C89 compiler:

int f(int * p, int * q)
{
  *p = 1;
  *q = 2;
  return *p + *q;
}

...
	movl	$1, (%rdi)
	movl	$2, (%rsi)
	movl	(%rdi), %eax
	addl	$2, %eax
	ret

This generated assembly code is not as good. The compiler knows that *q can only be 2 at the point of the return statement, but it cannot be certain that *p contains 1, because the function could be passed the same address for both arguments. In order for the code to work in all cases, the compiler decided to reload *p from memory for the addition.

My proposal

I claim that the restrict keyword in C99 both lacks expressive power and is redundant with constructs that already existed in C89. There was no need to encumber the language with a new construct, especially a new keyword that breaks the compilation of existing code that uses a variable named restrict.

C89 already had undefined behavior, a powerful tool for instructing the compiler that it does not have to generate code that handles special cases.


A function f2() that works under the same hypotheses as f1() can be written in C89:

int f2(int * p, int * q)
{
  (*p=*p) & (*q=*q);
  *p = 1;
  *q = 2;
  return *p + *q;
}

The first statement in the body informs the compiler that f2() will never be called with aliasing arguments p and q. The compiler is then free to generate the same code as was generated for the restrict-using function f1(). It is completely legal for the compiler to assume that *p + *q at the return statement evaluates to 3 in the function above. GCC does not generate efficient code for the function f2() above, but it could. It generates the same code as for the plain function f(), correctly compiling (*p=*p) & (*q=*q); to nothing, but failing to take advantage of the freedom it provides to generate a function that does not work when p and q alias.

Wrapping up

My proposal is more flexible than restrict, allowing to specify fine-grained non-aliasing relations between specific pointers. It is slightly more verbose, especially when there are many pointers, but this is a normal downside of allowing to specify aliasing relations pointer by pointer. Note that the pairwise absence of alias between p, q and r can be specified with the compact (*p=*p) & (*q=*q) & (*r=*r);


Compiler makers, please support the (*p=*p) & (*q=*q); code annotation in your compilers.

Comments

1. On Wednesday, July 25 2012, 15:19 by alex

How does it "break existing code"? If you use restrict you are writing C99 (or C11 presumably) and it's reasonable that C89 compilers won't compile later standards. And no existing (presuming that means C89) code that compiles (with a C89 compiler) will have the restrict keyword used?

2. On Wednesday, July 25 2012, 15:48 by pascal

@alex: It breaks existing code that has a variable named restrict. It was legal to have such a variable in C89. However, this was just one of several fallacious arguments I decided to put together in this post. I hope I clarified it in the following one.

3. On Wednesday, July 25 2012, 16:14 by John Regehr

Nice!

4. On Wednesday, July 25 2012, 17:36 by Orion

Wouldn't an easier way to introduce undefined behavior be:

if (p == q) {*(NULL);}

But more importantly, pointers can alias without being equal. char* cat = "cat"; strcpy(cat, cat + 1); does it. That particular one _should_ be an infinite loop, filling the memory with 'c'. Basically, there is more information in restrict, otherwise compilers could optimize with a check for equality, and then separately optimizing the two branches.

5. On Wednesday, July 25 2012, 18:00 by pascal

@Orion

We are at the level of hermeneutics, but I believe that (*p=*p) & (*q=*q); protects you against *p and *q having any bit in common, whereas if (p == q) {*(NULL);} only protects against them being equal. You could build a program where there is a difference using an union of packed structs, and I think that program would only invoke implementation-defined behavior in the build-up (but, again, this is probably open to interpretation). Nice suggestion anyway, the technique may be simpler to use than I initially thought, and I might even try to use undefined behavior constructively to provide optimization hints to the compiler in the future—this blog post was not “for real”, but you somewhat convinced me that the idea may have some real applications.

I agree that “restrict” is not exactly equivalent to my proposal, although they should both allow the compiler to optimize the simple function f() I chose as example. My proposal allows to specify only “p does not alias q and r does not alias t”. The coarser grain of “restrict” is both an advantage and a drawback.

6. On Thursday, July 26 2012, 01:35 by Stephen Canon

@Orion: strcpy(cat, cat+1) is undefined behavior. (7.24.2.3 "If copying takes place between objects that overlap, the behavior is undefined.") The only thing it should do is order takeout using your credit card for the poor engineer who implemented strcpy and is going to need to read your bug report.

7. On Sunday, July 29 2012, 15:32 by lm

In the above example also writing to literal string is undefined behavior.