the compiler is always right

I think everybody who programs has had a bug in one of their programs that they were positive was the compiler’s fault.  (Perhaps “it’s the operating system’s fault” or “it’s the hardware’s fault”, though these are less common.) At this point, you learn the rules of programming:

  1. The compiler is always right.
  2. If the compiler is ever wrong, see rule #1.

The corollary, of course, is that it is your program that has the bug, not the compiler.  (The third through sixth laws are restatements of these two with the operating system and the hardware, respectively.)

Yesterday was one of those occasions where I thought the compiler might be wrong and spent a little time remembering the First Rule.  It’s instructive to look at the example and understand why the compiler is always right.  I was looking at bug 611781, reducing the size of the NSS library shipped with Firefox, and ran across this Linux x86-64 assembly code, compiled with gcc -Os:

0000000000000000 <NSSTrustDomain_GenerateSymmetricKeyFromPassword>:
   0:	41 51                	push   %r9
   2:	48 8b 05 00 00 00 00 	mov    0x0(%rip),%rax
			5: R_X86_64_GOTPCREL	NSS_ERROR_NOT_FOUND+0xfffffffffffffffc
   9:	8b 38                	mov    (%rax),%edi
   b:	e8 00 00 00 00       	callq  10 <NSSTrustDomain_GenerateSymmetricKeyFromPassword+0x10>
			c: R_X86_64_PLT32	nss_SetError+0xfffffffffffffffc
  10:	31 c0                	xor    %eax,%eax
  12:	41 5a                	pop    %r10
  14:	c3                   	retq   

You can find a lot of these small functions in lib/pki/trustdomain.c in an NSS source tree. Looking over this, you notice two things if you know a little bit about x86-64 assembler:

  1. The push and pop instructions are suspiciously mismatched; if I save a register to the stack with push, I ought to restore the same register from the stack with a pop. The compiler must have a bug!
  2. Also, there’s an efficiency concern here. Why is the compiler saving and restoring any registers here? We never use %r9 in the body of the function, so we shouldn’t need to save it at all.

What is the compiler doing here? It’s easiest to explain away the second issue first. If you look at the x86-64 ABI document, you’ll see that in section 3.2.2, the stack must be kept 16-byte aligned before calling other functions. Since our function calls another function, our function must ensure that the stack is properly aligned when that other function begins execution. And since the call instruction on x86-64 (which is how we would have arrived at our function) adjusts the stack pointer by 8 bytes (in addition to all the other work it does), our function must adjust by an additional 8 bytes to maintain 16-byte alignment. The compiler has chosen to use a push instruction to manipulate the stack pointer in this case. This instruction subtracts 8 bytes from the stack pointer and stores the indicated register into the memory at the new stack pointer.

Another way to do this would be to subtract 8 bytes from the stack pointer (sub $0x8, %rsp), which avoids writing the register to memory at all. If you compile with -O2, optimizing for speed, instead of -Os, you would indeed see the compiler using sub $0x8, %rsp. But since we compiled this code with -Os, optimizing for size, the compiler knows that the instruction for pushing a register onto the stack (2 bytes) is smaller than the instruction for subtracting 8 bytes from the stack pointer (4 bytes). Likewise, the instruction for popping a register from the stack (2 bytes) is smaller than the instruction for adding 8 bytes to the stack pointer (4 bytes).

These “useless” instructions are therefore doing real work, which is maintaining the contract of the ABI.

OK, so the efficiency claim has been addressed. What about the correctness claim with mismatched registers? Again, if you look at the aforementioned ABI document, section 3.2.3 describes how registers are used for function calls. The registers %r9 and %r10 are caller-saved registers, which means that a called function is free to overwrite any values stored in those registers. It doesn’t matter what value, if any, our function stores in %r10, because we know that if the caller had cared about the value, the caller would have stored that value away somewhere. Since we need spare registers for maintaining stack alignment via push and pop, caller-saved registers are ideal for pushing and popping with abandon.

In this case, it turned out that my understanding of what the program was doing had the bug, not the compiler.  It’s also worth pointing out that if the compiler really was mismatching register saves and restores, lots and lots of things would be broken.  The likelihood of the code produced in this instance being wrong—but the same problem not occurring in the millions of lines of code the compiler has compiled to produce the system on your computer—is vanishingly small.  The next time you see the compiler doing something weird, remember that the compiler is always right and try to figure out why it’s doing that.

(I should say, of course, that compilers, just like many other computer programs, do have bugs; GCC’s bugzilla or LLVM’s bugzilla would not exist otherwise, nor would bugfix releases continue to come out for your favorite compiler.  But in the vast, vast majority of cases, you have a bug to fix, not the compiler.)

23 comments

  1. Benoit Girard

    Informative. Thanks for the write up!

  2. Reuben Morais

    If you ignore PGO entirely, of course ;)

  3. I feel like there should be a “rule 0: glandium is always right” since he bested gcc in the battle that was bug 832942 :)

  4. Heh, us web developers don’t have it so simple, but at least we *do* get to use the “it’s the engine’s fault” excuse. :) Although there have been times I have been tempted to say that two engines have the exact same bug – my own Rule #1 is “Anything Gecko and Blink agree upon is always right”.

  5. I made TempleOS including a 64-bit compiler. I always pray the compiler is right, LOL.

  6. I disagree with your premise. I remember two bug reports I sent to DEC about the PDP-11 running RSX-11M in the mid 1970’s. The first was a two line program (probably FORTRAN) where the compiler decided to crash the whole system (not just the user) during the compile step. The second was quite similar, except it was a 4 line or so assembler macro.

    • Nathan Froyd

      As a compiler writer, getting short test cases for bugs in the compiler was always welcome.

  7. I *have* encountered compiler bugs before, but your point is certainly valid – the odds are much more likely that the bug is in your own code than in the toolchain…

  8. Please change the headline to “I was mistaken in one case because I didn’t think things through” because the current headline is pretty stupid

  9. Wrong – I have seen bugs in Visual Studio where the offset of variables in a class were different from one module to the next. The fix was to turn off optimization. Or is your answer “ya shoudda bin usin’ GCC”?

    • Nathan Froyd

      My first answer is to wonder whether all modules were being compiled with exactly the same optimization settings, include paths, and preprocessor macros, actually.

  10. This may be true today at least as a useful generalization but anyone who believes this as an absolute has never used embedded system compilers from the 90s and early 2000s. I found and reported bugs every few months, which were fixed in later releases. I still never really trust the compiler.

    • I agree and it still happens today with new microcontroller families. I was using a GCC-based complier that was ported to a new microcontroller and saw it’s v1.0 in 2012 and I was developing with this new chip.

      I was using an unusual linker script to accommodate a bootloader that had to handle live updates (it was for battery updates and the code HAD to function while the new code was being loaded.) It turned out it was laying statically allocated variables on top of each other, making very weird behavior.

      Fortunately, the bugfixes were coming fast and often and an upgrade of the compoler fixed my problem.

    • Nathan Froyd

      A healthy amount of skepticism is always an asset in computer programming.

  11. I chuckle a little inside every time this comes up, because my old old job (before Mozilla) was hacking on GCC, which meant I was always dealing with cases where the compiler was wrong.

    And yet – you know what I compiled most? The compiler. How often did the compiler miscompile itself? Almost never.

  12. In about 10 years of professionally being a programmer I’ve only run across 5 compiler bugs, 2 of which weren’t fixed in a newer version by the time I found them. The first (GCC 4.3.0) was a parse bug in a new feature (namely, class<class> was not parsed as double close bracket) and the second (IAR 5.1.1) was a complex corner case in parsing function overloads (namely, having two functions that take a member function parameter that only differs in constness of overloaded function pointer, where the first isn’t defined before the second is declared).

    In both cases the compiler writers are of course forgiven; bleeding edge corner cases are expected to be found to fail. One of the others was more intriguing though – MSVC 2005 doesn’t allow you to have a const and a non-const copy constructor…

  13. Benoit Jacob

    Is there a particular reason, though, for using two different registers, r9 and 10, for this stack alignment work, or is it just incidental? Would it have made any difference if the compiler had decided to use r9 both times?

    • Nathan Froyd

      It would not have made any difference. FWIW, there are similar functions in the file that use different mismatched registers for stack alignment, too.

      • Might it have made a difference in the processor’s dataflow analysis for ILP? It might help the processor decide that the push and pop did not depend on each other, and so could be interleaved in some clever way.

        • Nathan Froyd

          That’s a good point. I think the call instruction between the push/pop pair clobbers any possible ILP; I’m also not sure the processor would execute the instructions in parallel anyway, as they have a dependence on each other through the modified stack pointer. (GCC, for example, uses manual stack pointer adjustment with mov instructions to eliminate this implicit dependence when optimizing for speed.)

  14. The compiler was fascinatingly wrong for me about a year ago — I’d get builds of B2G that completely refused to load images, because the image loader got stack garbage instead of the number of frames, because the instruction that was supposed to write it got dead-store-eliminated, because of a bug that had already been reported to GCC about, if memory serves, the backend’s many internal varieties of stack pointer / frame pointer getting mixed up or mis-aliased somehow. (And, of course, none of this happened for the people doing the official testing.) The exciting(?) details are in https://bugzil.la/873332

  15. We all have had that 1 or 2 bugs that were from the compiler. My personal favorite was mixing tabs and spaces in a macro in visual studio 6.

    However out of the THOUSANDS of bugs I have ever squashed 99.999% of the time it is my code or some sort of mismatch in interfaces. Then when I come across a compiler bug most of the time I can work around it by just rearranging the code a bit. Note it, submit the bug in, and move on.

    Play the odds its usually your code. The wildly rare case is the compiler.