For a long time, when our unit tests or Talos performance tests encountered a crash, the result was nothing but frustration. If you were lucky, you could tell that it crashed, but you had no idea where. Poor Blake spent weeks tracking down a crash from his speculative-parsing patch that only seemed to occur on Talos. Up until recently, I figured that fixing this was going to involve a fair amount of work that only I was going to be able to do. A few weeks ago it was determined that this was having a significant impact on development, as patches would get checked in, cause a crash and be backed out, leaving the developer with nothing to go on.

Benjamin Smedberg has been hard at work making it possible to get stacks in this situation, using the same Breakpad utilities we use on our Socorro server, but locally on the machine running the tests. Practically all of the pieces were in place this afternoon when #developers cornered Alice and closed the tree while she landed the final patch to make Talos produce stack traces. Boris then committed a test crash, and as a result we were able to see crash stacks in Mochitest (OS X, Linux) as well as Talos (OS X, Linux).

Thanks to Benjamin for doing most of the heavy lifting here, and to Alice for taking the Talos part across the finish line. The Talos work was mostly in bug 480577, and the unit test work was bug 481732. Note that on the unit test side this currently only works in Mochitest (all 4 varieties); it will work in Reftest/Crashtest after bug 479225 is fixed (which should be soon).
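
The harness side of this can be sketched roughly as follows. This is only an illustration in Python, not the code that landed in the bugs above: the function names and the directory arguments are hypothetical, and it assumes that crashes leave .dmp files behind and that Breakpad's minidump_stackwalk tool is invoked as "minidump_stackwalk <minidump> <symbols directory>".

```python
import subprocess
from pathlib import Path


def stackwalk_command(stackwalk_bin, minidump, symbols_dir):
    """Build the argument list: minidump_stackwalk <minidump> <symbols dir>."""
    return [str(stackwalk_bin), str(minidump), str(symbols_dir)]


def find_minidumps(dump_dir):
    """Return the .dmp files left behind in dump_dir, sorted by name."""
    return sorted(Path(dump_dir).glob("*.dmp"))


def print_crash_stacks(dump_dir, stackwalk_bin, symbols_dir):
    """Run the stack walker on every minidump and print its report.

    Returns the number of minidumps found, so a harness can mark the
    run as failed when the count is non-zero.
    """
    dumps = find_minidumps(dump_dir)
    for dump in dumps:
        cmd = stackwalk_command(stackwalk_bin, dump, symbols_dir)
        # No check=True: a failed stack walk should not hide the crash itself.
        result = subprocess.run(cmd, capture_output=True, text=True)
        print(f"Crash dump: {dump.name}")
        print(result.stdout)
    return len(dumps)
```

A harness would call print_crash_stacks() once the browser process has exited, pointing it at the directory where the crash reporter writes its dumps.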

(Cross-posted in dev.tree-management, but posting here for a wider audience.)

Getting There from Here

February 9th, 2009

The New Yorker has a great article about reforming the US health care system entitled “Getting There from Here.” It’s not terribly long and if you’re at all interested in the topic I’d recommend reading it. It discusses how other nations with universal health care arrived at their present systems, which is a topic that seems completely absent from debate on the subject in the USA. Anyway, I’m not really posting here to talk about health care, but one part of this article rang true with me in other ways:

There is no dry-docking health care for a few months, or even for an afternoon, while we rebuild it. Grand plans admit no possibility of mistakes or failures, or the chance to learn from them. If we get things wrong, people will die. This doesn’t mean that ambitious reform is beyond us. But we have to start with what we have.

As owner of the Mozilla build system, I hear a lot of complaints. This is understandable: our build system is showing its age, and we are certainly straining against the limits of Autoconf and GNU make on a regular basis. Along with complaints, I hear a lot of suggestions of the form “why don’t you just use X”, where X is any of a number of alternative tools such as CMake or SCons. The basic answer parallels the quote above: we don’t have time to stop and rewrite everything. Our build system contains tens of thousands of lines of makefiles, as well as a configure.in that’s over 8,000 lines. Converting this much build junk by hand is doomed to failure. Converting it through automated tools might be possible, but we would need a smart plan, and it would likely involve testing the tools in parallel with the existing build system for a while in order to make for a smooth transition. In any case, it’s clear that if we are to find a way forward, it will require building on what we have, and not burning it to the ground and starting over from scratch.

[via]

Our Release Engineering group provides a VMware VM image of the Linux Reference Platform, which is the VM upon which all of the official Linux builds happen. This is very handy, as you can trade some download time (it’s about 1.2 GB) for the time it would take you to install Linux and set up all the build dependencies. I’m currently running Ubuntu 8.04 64-bit on one of my home machines, and I’ve been using VirtualBox for running VMs on the machine because it was super easy to install in Ubuntu (via apt-get). I found out today that VirtualBox can use VMware disk images, so you can run the Linux Reference Platform VM pretty much out-of-the-box in VirtualBox.

The steps I took were:

  1. Download the reference platform VM from the link above and unzip it somewhere.
  2. Run VirtualBox, and go to the “Virtual Disk Manager”. Click “Add”, and browse to the directory where you unzipped the VM. Add “CentOS-5.0-ref-tools-vm.vmdk” and “CentOS-5.0-ref-tools-vm_1.vmdk” to the “Hard Disks” list in the manager, then click “Ok”.
  3. Click “New” to create a new VM. Name it whatever you want, and select “Linux 2.6” as the OS. Set the base memory size to something usable on your hardware (but not so small that it can’t compile Mozilla). For the “Boot Hard Disk (Primary Master)”, click “Existing…” and select “CentOS-5.0-ref-tools-vm.vmdk”. Click “Ok”, then “Finish”.
  4. Click on your new VM in the list on the left, and click “Settings”. Click on the “Hard Disks” entry in the list on the left of the Settings dialog. Check “Primary Slave”, click the “Select” button to the right of the drop-down, and choose “CentOS-5.0-ref-tools-vm_1.vmdk”. Click “Ok”.
  5. You should now be able to click “Start” and see your new VM boot. It will complain about a missing disk for the /builds mount; this is normal and shouldn’t be a problem.

You should read the wiki page linked in the first paragraph, as by default the VM is not configured to boot into X windows, but does provide a VNC Server.

I was reminded of bug 414049 yesterday, a bug I filed about getting screenshots from our unit test machines after every run so we could see if there was obviously something wrong with the machine (like error dialogs covering the screen). Linux and OS X tend to have built-in tools to grab screenshots (as mentioned in the bug), but Windows does not. I searched around for a free tool to do the job, but all I could find was shareware. It’s possible there’s a free tool out there that I just couldn’t find, but I figured I would just write one. After a bit of poking around on MSDN, I wrote screenshot.cpp. It’s only about 70 lines of C++; it’s hard to believe people pay money for stuff like that. I’ve placed it under a BSD license, since it’s useful code and I couldn’t find a simple self-contained example like this anywhere.

7 things

January 16th, 2009

Ok, you all know the rules, so I’m going to skip them. I’ve been tagged three times already (Dave, Benjamin, and Tomcat), so I guess it’s time to succumb.

7 things:

  1. I have a slightly better than beginner-level knowledge of American Sign Language. My wife’s sister is deaf, and my wife is fluent, so I thought it would be rude to not be able to talk to her sister. It’s amazing how much appreciation just learning a little bit of someone’s native language will get you.
  2. When I was a little over 2 years old, my family moved from New Jersey to a house that my parents built in Northeastern Pennsylvania. It was still under construction when we moved in, and the entryway was some planks crossing the 10-foot drop to what was at the time gravel in the basement. My older sister dared me to ride my Big Wheel over it, and of course I fell in. My mom says she made my dad go look because she was sure I was dead. As far as I’m aware I didn’t suffer any real injuries.
  3. I played the trumpet in marching band for 9 years (4 years in high school, 5 years in college). Yeah, I’m a band geek.
  4. My wife and I spent 2 months in Göteborg, Sweden after we got married in 2006. My former employer’s headquarters is there, and I managed to convince our US president that it was a good idea. Sweden in the summertime is an awesome place to be.
  5. I’m only 5’3″ tall. But if you’ve met me in person you probably already figured that one out.
  6. Mozilla is my third employer since I graduated college in 2002. (Hopefully I’ll stay here longer than my previous two!)
  7. I’ve become addicted to mapping in OpenStreetMap. I mapped my entire town (although not the surrounding area; that’s data that was imported from the US Census’ TIGER data) and got another GPS for Christmas.

I’m not tagging anyone else. I think this meme has reached the saturation point.

more tests, kthx

January 16th, 2009

Josh recently landed a test plugin, with the intent of finally getting some test coverage of our plugin-handling code via mochitests. This is awesome, as plugins are an area of code where we’ve caused lots of regressions in the past, and until then had zero automated test coverage. After it landed, I took a peek at the code and noticed that it would be pretty easy to extend it to make it usable in our layout tests (reftest) as well. I just landed some patches to add this functionality, so we can now test that our layout of plugins doesn’t regress. If you’d like to write some reftests yourself using this, you can check out the basic tests I added along with the patch. (Note: it’s mac-only at the moment, but there’s gtk2 code ready to land any minute now, and a win32 implementation should be forthcoming.)

SSL in Mochitest

September 22nd, 2008

Without a lot of fanfare, a patch landed recently that enables the use of SSL with the test HTTP server we use in our Mochitest test harness.

About five months ago, I read an article about how Fedora wanted to standardize on NSS as the cryptography solution for their distro in order to be able to leverage a common certificate database, among other things. The article went into detail on how they wrote an OpenSSL wrapper around NSS so they could easily port applications that only supported OpenSSL to use NSS instead. As a concrete example, they showed a ported version of stunnel using NSS. This gave me pause, as one of the things we were lacking in our Mochitest harness was SSL support and stunnel would do exactly what we needed in this case. Considering we already build and ship NSS with every copy of Firefox, and it was clearly possible to implement the functionality we needed using NSS, I set out to figure out how to implement a bare-bones version of stunnel from scratch. After a bit of poking through the online NSPR and NSS documentation, I had a proof of concept application which I called “ssltunnel.” After some insightful review comments from NSS developers I committed it to CVS.

Unfortunately, that wasn’t the end. We still needed to hook this program up to the test harness, and I just didn’t have the motivation to do so. I filed the bug, and hoped someone else would do the work (as I often do!). Thankfully, that someone appeared in the person of Honza Bambas, whom I can only describe as a “programming rockstar.” He not only integrated ssltunnel into Mochitest, but he rewrote large sections of it to make it work robustly and made it work as an HTTP proxy while he was at it. After some reviews, a couple of landings and backouts due to unrelated test failures, and some time spent languishing in Bugzilla, we finally made his patch stick.

Of course, now that we have this capability, we need tests to use it! Honza has written some great documentation on what is currently available via Mochitest, and how to add custom servers, certificates, and other things you might want. If you get motivated to write some tests and hit a rough spot, feel free as always to track me down on IRC and ask me about it.
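
For readers who have not used stunnel, the core idea of such a tunnel can be sketched in a few lines. Note that this is not ssltunnel: the real program is built on NSPR and NSS and also acts as an HTTP proxy, whereas this illustration uses Python's standard ssl module instead, skips the proxy part entirely, and uses made-up function names. It accepts encrypted connections, decrypts them, and shuttles the plain bytes to and from an ordinary HTTP server.

```python
import select
import socket
import ssl
import threading


def pump(a, b, bufsize=65536):
    """Shuttle bytes both ways between sockets a and b until either side closes."""
    peers = {a: b, b: a}
    while True:
        readable, _, _ = select.select([a, b], [], [])
        for sock in readable:
            data = sock.recv(bufsize)
            if not data:
                return  # one side closed, so the tunnel is finished
            peers[sock].sendall(data)


def handle_client(tls_conn, target_addr):
    """Bridge one decrypted client connection to the plain-text server."""
    try:
        with tls_conn, socket.create_connection(target_addr) as upstream:
            pump(tls_conn, upstream)
    except OSError:
        pass  # a reset or TLS error from either peer just ends this tunnel


def serve_tunnel(listen_addr, target_addr, certfile, keyfile):
    """Accept TLS connections on listen_addr and forward them to target_addr."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    context.load_cert_chain(certfile=certfile, keyfile=keyfile)
    with socket.create_server(listen_addr) as listener:
        while True:
            raw_conn, _ = listener.accept()
            try:
                tls_conn = context.wrap_socket(raw_conn, server_side=True)
            except OSError:
                raw_conn.close()  # failed handshake, keep serving others
                continue
            threading.Thread(
                target=handle_client, args=(tls_conn, target_addr), daemon=True
            ).start()
```

Something like serve_tunnel(("127.0.0.1", 4443), ("127.0.0.1", 8888), "server.pem", "server.key") would then expose a plain HTTP server through an encrypted port; the addresses and file names here are placeholders, not values taken from the Mochitest harness.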

MozillaBuild 1.3

June 16th, 2008

I’ve just released MozillaBuild 1.3, which you can download at your leisure. Major changes from 1.2 include:

  • Includes Mercurial—so now you should be able to build mozilla-central out of the box
  • Added support for using both the Windows Vista SDK and an older Platform SDK at the same time, if you’re using Visual C++ 2005 Express and playing Microsoft header bingo. See the build prerequisites page for more information.
  • Added manifests to some exe files for better Vista compatibility
  • Startup scripts no longer use the rxvt terminal. I realize that some people may not like this, but we’ve had plenty of complaints about rxvt’s suckiness as a terminal, so I don’t think it’s really any worse. The real impetus here was that trying to use Mercurial to connect to a repository over ssh was essentially broken for the first-run case, when you have to accept the server’s key, which wasn’t acceptable to me. As a plus, if you create a shortcut to whichever start-msvc batch file you’re using, you can use the properties dialog to customize its appearance as you would any other command shell in Windows. I may investigate a better replacement console for a future release.

[Screenshot: MozillaBuild 1.3]

As usual, if you have any issues, you can file a bug in the MozillaBuild component.

MochiTest Maker

April 18th, 2008

Just something I threw together this morning: MochiTest Maker. It’s a pure HTML+JavaScript environment for writing MochiTests. It’s not as full-featured as the real MochiTest, as you can’t set HTTP headers or include external files, but it should serve for a lot of simple web content tests.

Ideally at some point I’d like to add a CGI backend to this so you could specify a directory, and have it generate a patch against current CVS to include your test in that directory. That would lower the bar even further for getting new tests into the tree. Another cool addition would be to integrate this with my regression search buildbot (currently offline), so that you could write a mochitest and then with one click submit it to find out when something regressed. That shouldn’t be hard to do, but my buildbot needs to find a more permanent home first.

I think there’s still a lot more we can (and must) do to lower the bar for writing tests. We need all the tests we can get!

Some time ago, we set up a symbol server for our Windows builds. This was sort of an afterthought; it just happened to be really easy to do in our new crash reporting architecture. It turns out that this is incredibly useful for people. This shouldn’t be surprising, given how difficult it is to build your own Firefox. Some time after we set this up, I found out that Microsoft’s debuggers also supported something called a source server (Note: this page did not contain this much information when this project started). This sounded interesting, but it wasn’t something I had time to work on, so I added some information to Seneca’s wiki, hoping an interested student would pick it up as a class project.

To say that I got more than I hoped for would be an understatement. Lukas Blakk took the project and ran with it, producing a working prototype and fleshing it out to the point where it now works perfectly on current nightly builds. She’s done an incredible job working with a practically undocumented feature of Microsoft’s debugging tools and having the perseverance to stick it out. As a result, you can now debug nightly Windows builds with full source available. We’ve got a handy MDC document available to tell you how. You’ll need a nightly from today (April 15th) or newer, and this will be available in the Firefox 3.0 release builds. Happy debugging!