We build a lot of code at Mozilla. Every time someone pushes changes to the code that makes up Firefox, we build the application on multiple platforms in a variety of build configurations. This means we’re constantly looking for ways to make builds faster: to get results from our builds and tests sooner, and to use less machine time so that we can run fewer machines and save money.

A few years ago my colleague Mike Hommey did some work to see if we could deploy a shared compiler cache. We had been using ccache for many of our builds, but because our build machines in AWS are ephemeral and drawn from a large pool, a local on-disk cache doesn’t help nearly as much as it does on a developer’s machine: the cache vanishes along with the machine. If you’re interested in the details, I’d recommend reading his series of blog posts: Shared compilation cache experiment, Shared compilation cache experiment, part 2, Testing shared cache on try and Analyzing shared cache on try. The short version is that the project (which he named sccache) was extremely successful and improved our build times in automation quite a bit. Another nice win was that he added support for Microsoft Visual C++, which ccache does not support, so we were finally able to use a compiler cache on our Windows builds.
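To give a feel for the core idea (this is just an illustrative sketch of a compilation cache in general, not Mike’s implementation; the function and names here are mine): everything that affects the compiler’s output is hashed into a key, and that key is used to look up a previously built object file in storage shared between machines, such as S3.

```rust
use std::collections::HashMap;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Derive a cache key from things that affect the compiler's output.
// (Illustrative only; a real tool has to account for much more than this.)
fn cache_key(compiler: &str, args: &[&str], preprocessed_source: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    compiler.hash(&mut hasher);
    args.hash(&mut hasher);
    preprocessed_source.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    // A local HashMap stands in for remote shared storage keyed by the hash.
    let mut storage: HashMap<u64, Vec<u8>> = HashMap::new();

    let key = cache_key("cc", &["-O2", "-c", "foo.c"], "int main() { return 0; }");
    match storage.get(&key).cloned() {
        Some(object) => println!("cache hit: {} bytes, skip the compiler", object.len()),
        None => {
            println!("cache miss: run the compiler and store the object file");
            storage.insert(key, vec![0u8; 1024]); // pretend compiler output
        }
    }
}
```

Because the storage is shared, even a brand-new build machine can get hits from compilations done elsewhere, which is exactly what a local ccache on an ephemeral machine can’t provide.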

This year we started a concerted effort to drive build times down even more, and we’ve made some great headway. Some of the ideas for improvement we came up with would involve changes to sccache. I started looking at making changes to the existing Python sccache codebase and got a bit frustrated. This is not to say that Mike wrote bad code; he does fantastic work! By the nature of its design, sccache does a lot of concurrent work, and Python just does not excel at that kind of workload. When I talked with Mike he mentioned that he had originally planned to write sccache in Rust, but at the time Rust had not had its 1.0 release and the ecosystem just wasn’t ready for the work he needed to do. I had spent several months learning Rust after attending an “introduction to Rust” training session, and I thought it’d be a good time to revisit that choice. (I went back and looked at some meeting notes, and in late April I wrote a bullet point: “Got distracted and started rewriting sccache in Rust”.)
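To make “a lot of concurrent work” concrete, here’s a minimal toy sketch of the kind of fan-out/fan-in pattern I mean, using nothing but the Rust standard library. It is my own example, not sccache’s actual architecture:

```rust
use std::sync::mpsc;
use std::thread;

// A stand-in for a unit of work: in a real compiler cache each request would
// hash its inputs, check the cache, and fall back to running the compiler.
fn handle_request(id: usize) -> String {
    format!("request {} done", id)
}

fn main() {
    let (tx, rx) = mpsc::channel();

    // Fan out: handle each incoming request on its own thread.
    for id in 0..8 {
        let tx = tx.clone();
        thread::spawn(move || {
            tx.send(handle_request(id)).expect("receiver still listening");
        });
    }
    drop(tx); // close our copy of the channel so the loop below can finish

    // Fan in: collect results as they complete.
    for result in rx {
        println!("{}", result);
    }
}
```

The compiler checks at compile time that anything shared between those threads is handled safely, which is exactly the kind of help Python can’t offer for this workload.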

As with all good software rewrites, the reality of things made it into a much longer project than anticipated. (In fairness to myself, I did set it aside for a few months to spend time on another project.) After seven months of part-time work it’s finally gotten to the point where I’m ready to put it into production usage, replacing the existing Python tool. I did a series of builds on our Try server to compare the performance of the existing sccache and the new version, mostly to make sure that I wasn’t going to regress build times. I was pleasantly surprised to find that the Rust version gave us a noticeable improvement in build times! I hadn’t done any explicit optimization work, but some of the improvement is likely due to process startup overhead being much lower for a Rust binary than for a Python script. It lowered the time we spend running our configure script by about 40% on our Linux builds and 20% on our OS X builds, which makes sense: configure invokes the compiler quite a few times, and when ccache or sccache is in use, every one of those invocations goes through the cache tool.
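The configure numbers make more sense with a picture of what the wrapper model looks like. A compiler cache sits in front of the compiler, so every single compiler invocation pays the wrapper’s startup cost before any real work happens; for a Python script that cost includes starting an interpreter, while a Rust binary starts almost instantly. Here is a deliberately stripped-down sketch of that model (a toy, not the real sccache entry point):

```rust
use std::env;
use std::process::{exit, Command};

// A toy "compiler wrapper": the build is pointed at this binary instead of
// the compiler (e.g. CC="wrapper cc"), the wrapper would do its cache lookup
// (omitted here), and then re-invokes the real compiler with the original
// arguments. Every compiler invocation pays this program's startup cost.
fn main() {
    let mut args = env::args().skip(1); // drop our own executable name
    let compiler = args.next().expect("usage: wrapper <compiler> [args...]");

    let status = Command::new(&compiler)
        .args(args)
        .status()
        .expect("failed to spawn the real compiler");

    // Propagate the compiler's exit code so the build system sees failures.
    exit(status.code().unwrap_or(1));
}
```

Since configure invokes the compiler many times for small test programs, trimming that per-invocation overhead shows up directly in the configure times above.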

My next steps are to tackle the improvements that were initially discussed. One is making sccache usable for local developers; since Windows developers can’t currently use ccache, this should help quite a bit there. We also want to make it possible for developers to use sccache and get cache hits from the builds that our automation has already done. I’d also like to spend some time polishing the tool a bit so that it’s usable by a wider audience outside of Mozilla. It solves real problems that I’m sure other organizations face as well, and it’d be great for others to benefit from our work. Plus, it’s pretty nice to have an excuse to work in Rust. 🙂 You can find the code for the rewritten sccache on GitHub.

Overall I’ve really enjoyed the experience of working in Rust on this project. Compared to working on the Python version of the tool, it was nice to have static typing catch my mistakes at compile time. I’ve really grown to love Rust as a language; I miss things like the match expression when I’m working in other languages now! There were certainly some growing pains: I hit a few cases where the crates.io ecosystem just didn’t have something I expected, or the Rust standard library was missing a feature I needed, but those were not very common occurrences for me. I would definitely reach for Rust again for a project like this!
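For anyone who hasn’t written Rust, this is the sort of match expression I mean; the CacheResult enum is something I made up for the example, not a type from sccache:

```rust
// A made-up enum for illustration. The point is that match forces you to
// handle every case, and the compiler complains if you forget one.
enum CacheResult {
    Hit(String),
    Miss,
    Error(String),
}

fn describe(result: CacheResult) -> String {
    match result {
        CacheResult::Hit(path) => format!("cache hit: {}", path),
        CacheResult::Miss => String::from("cache miss, running the compiler"),
        CacheResult::Error(message) => format!("cache error: {}", message),
    }
}

fn main() {
    println!("{}", describe(CacheResult::Miss));
}
```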

MozillaBuild 1.5

July 22nd, 2010

MozillaBuild 1.5 has been released:
http://ftp.mozilla.org/pub/mozilla.org/mozilla/libraries/win32/MozillaBuildSetup-1.5.exe

The major highlights include:

  • A newer Mercurial (1.5.4)
  • Support for Visual C++ 2010
  • A newer Python (2.6.5)

You can see the full list of dependent bugs, as well as the full list of committed changes.

As usual, bugs can be filed at bugzilla.mozilla.org.

Source Server, back on trunk

October 5th, 2009

Some time ago, Lukas Blakk implemented support for a source server on our Windows builds as a class project in Dave Humphrey’s class at Seneca College. Of course, soon after that we switched our main VCS from CVS to Mercurial, which broke all of her hard work. Thankfully, we got another one of Dave’s students, Jesse Valianes, to fix things to make it work with Mercurial. We landed his patch, but as it turns out we never enabled the setting on our build machines to make it actually work. When we finally tried to do so, I found out that another patch we had landed in the interim had broken things. I finally landed a fix for that, we flipped it back on, and today’s trunk build is source-enabled again.

If you have no idea what any of this means, it means you can download a Windows nightly build, attach a debugger, have it fetch the debug symbols automatically from our symbol server, and have the debugger download the matching source for you as well.

I hope to get this backported to our 1.9.2 and 1.9.1 branches ASAP, so that our 3.5.x and 3.6 release builds will be similarly debuggable.

Firefox Packaging

September 17th, 2009

I recently landed some changes (on trunk and 1.9.2) to the way Firefox packaging works. There are two immediate consequences of this you should be aware of:

  1. Mac builds now use a packaging manifest just like Windows and Linux. If you add a file that you intend to ship on Mac, it needs to wind up in a packaging manifest. (bug 463605)
  2. All the packaging manifest files have been combined into one single file: browser/installer/package-manifest.in. This should save everyone some time and annoyance. (bug 511642)

These changes had no effect on applications other than Firefox.

Getting There from Here

February 9th, 2009

The New Yorker has a great article about reforming the US health care system entitled “Getting There from Here.” It’s not terribly long and if you’re at all interested in the topic I’d recommend reading it. It discusses how other nations with universal health care arrived at their present systems, which is a topic that seems completely absent from debate on the subject in the USA. Anyway, I’m not really posting here to talk about health care, but one part of this article rang true with me in other ways:

There is no dry-docking health care for a few months, or even for an afternoon, while we rebuild it. Grand plans admit no possibility of mistakes or failures, or the chance to learn from them. If we get things wrong, people will die. This doesn’t mean that ambitious reform is beyond us. But we have to start with what we have.

As owner of the Mozilla build system, I hear a lot of complaints. This is understandable: our build system is showing its age, and we are regularly straining against the limits of Autoconf and GNU make. Along with the complaints, I hear a lot of suggestions of the form “why don’t you just use X”, where X is any of a number of alternative tools such as CMake or SCons. The basic answer parallels the quote above: we don’t have time to stop and rewrite everything. Our build system contains tens of thousands of lines of makefiles, as well as a configure.in that’s over 8,000 lines. Converting this much build junk by hand is doomed to failure. Converting it through automated tools might be possible, but we would need a smart plan, and it would likely involve running the new tools in parallel with the existing build system for a while in order to make for a smooth transition. In any case, it’s clear that if we are to find a way forward, it will require building on what we have, and not burning it to the ground and starting over from scratch.

[via]

Our Release Engineering group provides a VMWare VM image of the Linux Reference Platform, which is the VM upon which all of the official Linux builds happen. This is very handy, as you can trade some download time (it’s about 1.2 GB) for the time it would take you to install Linux and set up all the build dependencies. I’m currently running Ubuntu 8.04 64-bit on one of my home machines, and I’ve been using VirtualBox for running VMs on the machine because it was super easy to install in Ubuntu (via apt-get). I found out today that VirtualBox can use VMWare disk images, so you can run the Linux Reference Platform VM pretty much out-of-the-box in VirtualBox.

The steps I took were:

  1. Download the reference platform VM from the link above and unzip it somewhere.
  2. Run VirtualBox, and go to the “Virtual Disk Manager”. Click “Add”, and browse to the directory where you unzipped the VM. Add “CentOS-5.0-ref-tools-vm.vmdk” and “CentOS-5.0-ref-tools-vm_1.vmdk” to the “Hard Disks” list in the manager, then click “Ok”.
  3. Click “New” to create a new VM. Name it whatever you want, and select “Linux 2.6” as the OS. Set the base memory size to something usable on your hardware (but not so small that it can’t compile Mozilla). For the “Boot Hard Disk (Primary Master)”, click “Existing…” and select “CentOS-5.0-ref-tools-vm.vmdk”. Click “Ok”, then “Finish”.
  4. Click on your new VM in the list on the left, and click “Settings”. Click on the “Hard Disks” entry in the list on the left of the Settings dialog. Check “Primary Slave”, click the “Select” button to the right of the drop-down, and choose “CentOS-5.0-ref-tools-vm_1.vmdk”. Click “Ok”.
  5. You should now be able to click “Start” and see your new VM boot. It will complain about a missing disk for the /builds mount; this is normal and shouldn’t be a problem.

You should read the wiki page linked in the first paragraph: by default the VM is not configured to boot into X, but it does provide a VNC server.