May 14

the compiler is always right

I think everybody who programs has had a bug in one of their programs that they were positive was the compiler’s fault.  (Perhaps “it’s the operating system’s fault” or “it’s the hardware’s fault”, though these are less common.) At this point, you learn the rules of programming:

  1. The compiler is always right.
  2. If the compiler is ever wrong, see rule #1.

The corollary, of course, is that it is your program that has the bug, not the compiler.  (The third through sixth laws are restatements of these two with the operating system and the hardware, respectively.)

Yesterday was one of those occasions where I thought the compiler might be wrong and spent a little time remembering the First Rule.  It’s instructive to look at the example and understand why the compiler is always right.  I was looking at bug 611781, reducing the size of the NSS library shipped with Firefox, and ran across this Linux x86-64 assembly code, compiled with gcc -Os:

0000000000000000 <NSSTrustDomain_GenerateSymmetricKeyFromPassword>:
   0:	41 51                	push   %r9
   2:	48 8b 05 00 00 00 00 	mov    0x0(%rip),%rax
			5: R_X86_64_GOTPCREL	NSS_ERROR_NOT_FOUND+0xfffffffffffffffc
   9:	8b 38                	mov    (%rax),%edi
   b:	e8 00 00 00 00       	callq  10 <NSSTrustDomain_GenerateSymmetricKeyFromPassword+0x10>
			c: R_X86_64_PLT32	nss_SetError+0xfffffffffffffffc
  10:	31 c0                	xor    %eax,%eax
  12:	41 5a                	pop    %r10
  14:	c3                   	retq   

You can find a lot of these small functions in lib/pki/trustdomain.c in an NSS source tree. Looking over this, you notice two things if you know a little bit about x86-64 assembler:

  1. The push and pop instructions are suspiciously mismatched; if I save a register to the stack with push, I ought to restore the same register from the stack with a pop. The compiler must have a bug!
  2. Also, there’s an efficiency concern here. Why is the compiler saving and restoring any registers here? We never use %r9 in the body of the function, so we shouldn’t need to save it at all.

What is the compiler doing here? It’s easiest to explain away the second issue first. If you look at the x86-64 ABI document, you’ll see that in section 3.2.2, the stack must be kept 16-byte aligned before calling other functions. Since our function calls another function, our function must ensure that the stack is properly aligned when that other function begins execution. And since the call instruction on x86-64 (which is how we would have arrived at our function) adjusts the stack pointer by 8 bytes (in addition to all the other work it does), our function must adjust by an additional 8 bytes to maintain 16-byte alignment. The compiler has chosen to use a push instruction to manipulate the stack pointer in this case. This instruction subtracts 8 bytes from the stack pointer and stores the indicated register into the memory at the new stack pointer.

Another way to do this would be to subtract 8 bytes from the stack pointer (sub $0x8, %rsp), which avoids writing the register to memory at all. If you compile with -O2, optimizing for speed, instead of -Os, you would indeed see the compiler using sub $0x8, %rsp. But since we compiled this code with -Os, optimizing for size, the compiler knows that the instruction for pushing a register onto the stack (2 bytes) is smaller than the instruction for subtracting 8 bytes from the stack pointer (4 bytes). Likewise, the instruction for popping a register from the stack (2 bytes) is smaller than the instruction for adding 8 bytes to the stack pointer (4 bytes).

These “useless” instructions are therefore doing real work, which is maintaining the contract of the ABI.

OK, so the efficiency claim has been addressed. What about the correctness claim with mismatched registers? Again, if you look at the aforementioned ABI document, section 3.2.3 describes how registers are used for function calls. The registers %r9 and %r10 are caller-saved registers, which means that a called function is free to overwrite any values stored in those registers. It doesn’t matter what value, if any, our function stores in %r10, because we know that if the caller had cared about the value, the caller would have stored that value away somewhere. Since we need spare registers for maintaining stack alignment via push and pop, caller-saved registers are ideal for pushing and popping with abandon.

In this case, it turned out that my understanding of what the program was doing had the bug, not the compiler.  It’s also worth pointing out that if the compiler really was mismatching register saves and restores, lots and lots of things would be broken.  The likelihood of the code produced in this instance being wrong—but the same problem not occurring in the millions of lines of code the compiler has compiled to produce the system on your computer—is vanishingly small.  The next time you see the compiler doing something weird, remember that the compiler is always right and try to figure out why it’s doing that.

(I should say, of course, that compilers, just like many other computer programs, do have bugs; GCC’s bugzilla or LLVM’s bugzilla would not exist otherwise, nor would bugfix releases continue to come out for your favorite compiler.  But in the vast, vast majority of cases, you have a bug to fix, not the compiler.)

May 14

my code search engine

Christian Legnitto wrote a blog post where he mentioned Firefox developers being forced to deal with “crufty code-search tools” (and many other perceived suboptimalities in the development process).  I’m looking forward to reading his followup, but I also thought it was worth blogging about what I use for my day-to-day code search needs.

I use Emacs’s rgreprgrep (and its cousin commands grep and lgrep) executes grep on a group of files in a given directory, recursively. The grep results are then displayed in a hyperlinked format for easy browsing between matches. Well, being Emacs, I use it with a few modifications of my own.  Here’s my setup.

First, a small utility I wrote for making quick selections with a single keypress:

(defun character-prompt (alist description)
  (message "Select [%s]: "
           (apply #'string (mapcar #'car alist)))
  (let* ((ch (save-window-excursion
               (select-window (minibuffer-window))
         (x (find ch alist :key #'car)))
      ((null x)
       (message (format "No %s for character: %s" description ch))
       (sleep-for 1)
       (character-prompt alist description))
      (t (cdr x)))))

This function gets used in the small wrapper I wrote around rgrep. Some preliminaries first, like where the Firefox tree lives, files that contain overly long lines and therefore mess with Emacs’s hyperlinking, and directories that I generally don’t deal with in my day-to-day work.

(defvar froydnj-mozilla-srcdir (expand-file-name "~/src/gecko-dev.git/"))
(defvar froydnj-mozilla-ignored-files
  (list "searchindex.js"
(defvar froydnj-mozilla-ignored-directories
  (list "nss" "nsprpub" "js/src/tests" "intl/icu"))

Next, the way I select subsets of files to search in. I learned after writing all this that rgrep already has built-in functionality for this (see the grep-files-aliases variable), but I like my setup better.

(defvar froydnj-mozilla-files
  '((?a . "*")                    ; All of it
    (?c . "*.[cm]*")              ; C/C++/Obj-C
    (?C . "*.[cmh]*")             ; Same, plus headers (and HTML, sadly)
    (?h . "*.h")
    (?H . "*.html")
    (?i . "*.idl")
    (?j . "*.js*")
    (?l . "*.lisp")
    (?m . "Makefile.in")
    (?p . "*.py")
    (?v . "*.java")
    (?w . "*.webidl")))

Finally, the wrapper itself, which prompts for the search pattern, the filename pattern, makes sure the directories and files above are ignored, and executes the search.

(defun froydnj-mozilla-rgrep ()
  (let ((regexp (grep-read-regexp))
        (files (character-prompt froydnj-mozilla-files "filename pattern"))
        (grep-find-ignored-files (append grep-find-ignored-files
        (grep-find-ignored-directories (append grep-find-ignored-directories
    (rgrep regexp files froydnj-mozilla-srcdir)))

One other bit that I find useful is a custom name for each buffer. By default, the rgrep results are deposited in a buffer named *grep* (likewise for grep and lgrep; the following advice applies to them automatically), and future greps overwrite this buffer. Having records of your greps lying around is occasionally useful, so I’ve changed the hook that determines the buffer name for rgrep. The comments that refer to compilation are from a previous job where it was useful to launch compilation jobs from within Emacs. I keep meaning to tweak those bits to launch mach in various configurations instead, but I haven’t done so yet.

(defun froydnj-compilation-buffer-name (major-mode-name)
  (let ((cfg-regexp "\\([-0-9_.a-z/+]+\\.cfg\\)"))
      ;; We're doing local compilation, stick the name of the release
      ;; configuration in the buffer name.
      ((or (string-match "^cd /scratch/froydnj/\\([^ ;]+\\)" command)
           (string-match "^build-config" command))
       (string-match cfg-regexp command)
       (concat "*compilation " (match-string 1 command) "*"))
      ;; We're doing remote compilation, note the machine name and
      ;; the release configuration name.
      ((string-match "^ssh \\([^ ]+\\)" command)
       (let ((machine (match-string 1 command)))
         (string-match cfg-regexp command)
         (concat "*compilation@" machine " " (match-string 1 command) "*")))
      ;; grep.el invokes compile, we might as well take advantage of that.
      ((string-equal major-mode-name "grep")
       (if (boundp 'regexp)
           (concat "*grep for " regexp "*")
      ;; We have no idea, just use the default.

(setq compilation-buffer-name-function 'froydnj-compilation-buffer-name)

Search times are comparable to web-based tools, and browsing results is more convenient. It has its shortcomings (overloaded C++ method names can be a pain to deal with, for instance), but it works well enough for 95%+ of my searching needs.