Fixing security holes without introducing new bugs

When fixing any bug, there is a risk of introducing new bugs, which we call regressions. Regressions caused by security fixes can be especially problematic because shipping a buggy security update can erode user trust for future updates.

Fortunately, we discover most regressions before we ship, thanks in large part to security researchers whose patience gives us time to review and test each patch well. But sometimes security releases are delayed slightly because we notice a regression as we are getting ready to release. Worse, we sometimes discover regressions after shipping a release.

This post explores patterns among regressions and suggests changes that could help us catch them earlier.

Motivation

In six cases since Firefox 2, we decided regressions in minor updates were serious enough to rush a followup regression-fix update. For comparison, we shipped 38 other minor updates, mostly to fix security holes, since releasing Firefox 2 in October 2006.

Regression	Caused followup release	Release date
Several bugs	Firefox 2.0.0.9	November 1, 2007
Bug 405584 – canvas.drawImage broken	Firefox 2.0.0.11	November 30, 2007
Bug 425576 – topsite crash	Firefox 2.0.0.14	April 16, 2008
Bug 454708 – non-ascii password issue	Firefox 3.0.3	September 26, 2008
Bug 489322 – extension crash	Firefox 3.0.10	April 27, 2009
Bug 535193 – proxy auth issue	Firefox 3.0.17 and Firefox 3.5.7	January 5, 2010

It’s hard to see trends and patterns in a list of six bugs, so I looked at a larger set of bug reports: all bug reports marked as regressions caused by security bug fixes. I focused on the 176 such bugs filed between December 2007 and January 2010.

Most of these regressions were fixed before release and many of them were minor. But understanding why they weren’t caught immediately can help prevent major incidents in the future.

Overall regression frequency

Period	Regressions reported that were caused by security fixes
2008 first half	71
2008 second half	28
2009 first half	46
2009 second half	25

Regressions seem to be declining in frequency. This calls for cake!
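
To put a rough number on that trend, the half-year counts from the table can be added up per year. This is only a sketch of the arithmetic; the variable and function names are made up for illustration, and the counts are copied directly from the table above.

```python
# Counts copied from the table above (regressions caused by security fixes).
counts = {
    ("2008", "first half"): 71,
    ("2008", "second half"): 28,
    ("2009", "first half"): 46,
    ("2009", "second half"): 25,
}


def yearly_totals(counts):
    """Sum the half-year counts into one total per year."""
    totals = {}
    for (year, _half), n in counts.items():
        totals[year] = totals.get(year, 0) + n
    return totals


totals = yearly_totals(counts)
# Relative change from the 2008 total to the 2009 total.
change = (totals["2009"] - totals["2008"]) / totals["2008"]

print(totals)            # {'2008': 99, '2009': 71}
print(f"{change:.0%}")   # -28%
```

The half-year numbers bounce around quite a bit, so the yearly totals are probably the fairer comparison.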

I believe regressions are declining in frequency mostly because of improvements in automated testing. The number of automated tests has increased fourfold since Firefox 3.0: every trunk checkin now results in 220,000 tests running on each platform. Regressions caught by automated tests tend to be fixed immediately, without even being filed in Bugzilla.

Further improvements to automated testing could reduce the number of regressions even more. I noticed four patterns in the bugs that suggest areas where automated testing could be improved: (1) web site breakage; (2) Firefox extensions; (3) printing; and (4) document navigation.

Web site breakage

About 48 of the 176 bug reports involved specific web sites. The sites ranged from private intranet sites to the most popular sites on the web. When simply loading a site is enough to reproduce the bug, it may be easy to detect it through automated testing.

Crashes, hangs, and leaks are easy to identify in automated testing. Carsten Book and Bob Clary have set up automated loading of top sites to find these bugs.

Visual problems are trickier to identify in automated testing, because no algorithm can reliably determine when a web site “looks wrong”. But determining whether a site’s appearance has changed can be done automatically. Since security fixes do not intentionally change the appearance of web sites, a tool to detect web site appearance changes could catch these regressions.
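
As a minimal sketch of what "detecting an appearance change" could mean, the snippet below compares two screenshots of the same site, one taken before a patch and one after. Everything here is an assumption made for illustration: the screenshots are plain grids of (r, g, b) tuples, and the function names, `tolerance`, and `max_fraction` are hypothetical rather than part of any existing Mozilla tool.

```python
def changed_fraction(before, after, tolerance=0):
    """Return the fraction of pixels that differ between two screenshots.

    Each screenshot is a list of rows, and each row is a list of (r, g, b)
    tuples. A pixel counts as changed if any channel differs by more than
    `tolerance`.
    """
    same_shape = len(before) == len(after) and all(
        len(a) == len(b) for a, b in zip(before, after)
    )
    if not same_shape:
        raise ValueError("screenshots must have the same dimensions")

    total = 0
    changed = 0
    for row_before, row_after in zip(before, after):
        for p, q in zip(row_before, row_after):
            total += 1
            if any(abs(a - b) > tolerance for a, b in zip(p, q)):
                changed += 1
    return changed / total if total else 0.0


def appearance_changed(before, after, tolerance=0, max_fraction=0.0):
    """Flag a site for review if more than `max_fraction` of pixels changed."""
    return changed_fraction(before, after, tolerance) > max_fraction


# Tiny example: a 2x2 white page where one pixel turns black after the patch.
WHITE, BLACK = (255, 255, 255), (0, 0, 0)
before = [[WHITE, WHITE], [WHITE, WHITE]]
after = [[WHITE, BLACK], [WHITE, WHITE]]

print(changed_fraction(before, after))    # 0.25
print(appearance_changed(before, after))  # True
```

A real tool would likely need to cope with content that changes between loads for unrelated reasons (ads, animations, timestamps), which is why the sketch leaves room for a tolerance and a threshold instead of demanding identical pixels.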

In addition to improving automated testing, we should remove support for some regression-prone, Mozilla-specific web features: signed scripts, enablePrivilege, XUL, and XBL. (Eight of the regressions affected sites using these features.) These features have proven to be not especially useful to web developers, hard to keep stable, and hard to make secure. XUL and XBL, in particular, have together been responsible for over a hundred vulnerabilities.

Firefox extensions

About 21 of the regressions involved Firefox extensions. One way we could detect these bugs is to run automated tests with extensions installed. We should run Firefox’s entire test suite with popular extensions, and we should run at least basic tests with every extension.

Some regressions affect a feature of an extension rather than a feature built into Firefox. To catch these, we would need to let extension developers provide tests for their extension’s features.

We can also let Firefox developers search the source code of Firefox extensions, so when developers change APIs, they can tell which extensions the changes are likely to affect. We’ve already started to do this.
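
The idea can be sketched in a few lines. This is not the search tool mentioned above, just an illustration under stated assumptions: extension sources are given as an in-memory mapping, and `extensions_using`, `exampleApi.doThing`, and the extension names are all hypothetical.

```python
import re


def extensions_using(api_name, sources):
    """Return {extension: [files]} for extensions whose source mentions api_name.

    `sources` maps an extension name to a dict of {file path: source text}.
    The match is on whole identifiers, so searching for "exampleApi.doThing"
    does not match "exampleApi.doThingElse".
    """
    pattern = re.compile(r"\b" + re.escape(api_name) + r"\b")
    hits = {}
    for extension, files in sources.items():
        matching = sorted(
            path for path, text in files.items() if pattern.search(text)
        )
        if matching:
            hits[extension] = matching
    return hits


# Hypothetical extension sources, for illustration only.
sources = {
    "ext-a": {"chrome/content/a.js": "exampleApi.doThing(1);"},
    "ext-b": {"chrome/content/b.js": "var x = exampleApi.doThingElse();"},
    "ext-c": {"chrome/content/c.js": "var y = 1;"},
}

print(extensions_using("exampleApi.doThing", sources))
# {'ext-a': ['chrome/content/a.js']}
```

A plain text search like this will miss dynamic uses (for example, a property name built from strings at runtime), so it can only suggest which extensions deserve a closer look.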

Printing

Five of the regressions involved printing. Printing bugs frequently make it to release users in part because most nightly users do not print web pages frequently.

Document navigation

Five of the regressions involved document navigation. I suspect document navigation is prone to security bugs, and prone to regressions when those security bugs are fixed, for three reasons:

First, it is a place where many web features interact. Loads can be triggered by users (back, reload, link clicks, bookmark clicks, submitting a form) or scripts (setting location, location.replace, history.go). Loads trigger many events (unload, beforeunload, pagehide), and scripts running from those events can trigger additional navigation. Additional layers of complexity arise from restoration of form contents, frames, and in-page navigation between hashes.

Second, expectations are complicated. To prevent accidental or deliberate “traps”, the “Back” button has to skip over sequences of purely-script-triggered navigations, which are often difficult to distinguish from user-triggered navigations. At the same time, AJAX web applications have good reasons to want the “Back” button to only change the hash part of the URL rather than leave the page.
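
The two competing expectations can be shown with a toy model of session history. This is a deliberately simplified sketch, not a description of how Firefox or any specification behaves: the `Entry` fields and the `back_target` rule are assumptions invented for illustration, and real browsers often cannot tell script-triggered and user-triggered navigations apart this cleanly.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    url: str
    by_script: bool = False      # reached via script (e.g. setting location)
    same_document: bool = False  # only the hash part of the URL changed


def back_target(entries, current):
    """Return the index "Back" should land on, or None if there is none.

    Toy rule: step back one entry, then keep stepping back while the entry
    we would land on was left through a cross-document script navigation,
    because landing there would just bounce the user forward again. Hash-only
    navigations are never skipped, so AJAX-style apps keep their Back steps.
    """
    if current <= 0:
        return None
    i = current - 1
    while i > 0 and entries[i + 1].by_script and not entries[i + 1].same_document:
        i -= 1
    return i


# A "trap": the user clicks a link to B, and B's script immediately loads C.
trap = [
    Entry("http://a.example/"),
    Entry("http://b.example/"),
    Entry("http://c.example/", by_script=True),
]
print(back_target(trap, 2))  # 0 (skips B and returns to A)

# An AJAX app that uses script to change only the hash.
app = [
    Entry("http://a.example/"),
    Entry("http://app.example/"),
    Entry("http://app.example/#inbox", by_script=True, same_document=True),
]
print(back_target(app, 2))   # 1 (stays in the app, only the hash changes)
```

Even this tiny model needs a special case to satisfy both expectations at once.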

Third, document navigation semantics have evolved along with the web, rather than being designed in a holistic way. Browser developers tried to fix security problems and web compatibility problems as they were discovered. This resulted in messy expectations and even messier implementations.

I suggested to our resident exhaustive-testing expert that we try to create a specification and a corresponding set of tests for document navigation and related features. In addition, I wonder whether rewriting document navigation code could simplify it and make it less fragile.

Nightly and beta testing

Firefox nightly build testers found more regressions than beta and release users combined. This is a testament to the power of the Mozilla community, which includes over ten thousand Firefox nightly testers and an incredible bug triage team.

But it’s also worrying: it suggests that in cases when a security patch can only have a few days of nightly testing, rather than several weeks, a major regression could easily slip through unnoticed. A patch might get only a few days of nightly testing if a developer feels that landing a fix effectively discloses a vulnerability, or when we’re rushing to fix a security hole that has already been disclosed.

Automated tests will never be able to catch every regression, so we want to help nightly and beta testers identify bugs as effectively as they can, whether a patch undergoes testing for weeks or days. David Boswell has been improving the nightly start page to highlight the best ways to look for bugs. Developers can now post temporary notes on this page to highlight features that need special testing.

We’re also making it easier to be a nightly tester. Automated tests now keep the worst regressions out of nightlies, making nightlies less prone to breaking. Automatic nightly updates now work for all localizations, and we’re planning to make nightly updates smoother.

To expand beta testing, we’re thinking of adding a checkbox to update preferences to let users choose to become beta testers. Beta testers are not always as active at reporting bugs as nightly users, but their numbers allow clear trends to appear in crash statistics.

Raw data

Maybe you can spot patterns that I didn’t, or suggest other methods of automated testing that could catch these kinds of bugs earlier?

I focused on the symptoms and testcases of the regressions. Someone reading the patches might notice different patterns. Would more bugs have been caught by writing new custom static analyses, paying attention to new compiler warnings, or getting some testers to report assertion failures from running debug builds? Could some architectural change have prevented a class of regressions (or the security bugs whose fixes caused them)?

These links show the 176 regressions caused by security bug fixes between December 2007 and January 2010. The spreadsheet contains my guesses as to how we found each bug and how we could have found it earlier.

List of 176 bugs | Spreadsheet on Google Docs | Spreadsheet as CSV

Thanks to Murali Nandigama for helping with Bugzilla queries. Dan Veditz, Melissa Shapiro, and Drew Ruderman provided valuable feedback on drafts of this post.

Jesse Ruderman
Security bug hunter

11 comments on “Fixing security holes without introducing new bugs”

Tony Mechelynck wrote on February 10, 2010 at 6:41 pm:

Ho-hoh!

«In addition to improving automated testing, we should remove support for […] XUL, […].»

So XUL /is/ going away, the cat is out of the box! Now if we (the users) are still believing mconnor’s mealy-mouthed retractations, where he said that when he had said that non-Persona non-Jetpack addons were “deprecated”, he didn’t mean they were going away, we’re in for a big surprise when, oh, with Firefox 4 or 5 maybe, no XUL will be able to run, be it in compiled-in chrome or in an extension.
jruderman wrote on February 10, 2010 at 7:13 pm:

I want to disable XUL and XBL for web content, not remove them from Firefox entirely.
Robert O’Callahan wrote on February 10, 2010 at 7:17 pm:

Jesse’s talking about removing access to those features from untrusted Web content — not from extensions.

Also, that’s his suggestion, not a decision.

Calm down, please.
Robert O’Callahan wrote on February 10, 2010 at 7:18 pm:

Jesse, the HTML5 spec defines document navigation. Maybe not well enough, but that’s where the spec lives.
Daniel Veditz wrote on February 10, 2010 at 7:43 pm:

Nice selective quoting, Tony. In the part you elided Jesse specified he was talking about “web features” and followed with “not especially useful to web developers”.
Tony Mechelynck wrote on February 10, 2010 at 7:50 pm:

ah, OK, then I misunderstood, and maybe overreacted a little, sorry.
Ferdinand wrote on February 11, 2010 at 2:07 am:

The problem of having to fix something an update broke primarily reduces trust because you actually see the update in Firefox. If you see an update every week you remember that. You also will complain about the frequent updates.
When Firefox starts doing automatic background updating far less users will even notice that something was broken. The only indication of updates you should give is the same as the password bar so that if problems suddenly occur the user will know it could be the update.
MarkC wrote on February 11, 2010 at 3:08 am:

How would this affect remote XUL? I know it’s not widely used on the web at large, but it can be extremely useful. I suspect there are a few “XUL dark matter” implementations based on remote XUL, and the company I work for is shortly due to release a significant new product (well, significant to us) written using remote XUL.

For our purposes, allowing it to be re-enabled via a pref in about:config would be sufficient, though having XUL/XBL support switch back on automatically when a page is sent with the XUL mimetype would be even better.

And how does the proposal to disable XBL fit with the development of XBL2? Now that it’s under the auspices of the W3C, there’s a chance that it might actually get ratified and supported in other browsers at some point. The ability to “component-ise” HTML/SVG/JS/CSS into reusable widgets could be particularly useful to the web-app developers of the world. It would be a shame to see it disabled before it has a chance to pick up any momentum.
Daniel Veditz wrote on February 11, 2010 at 11:29 am:

Fear of “XUL dark matter” is what has kept us from killing it (for web content) up until now. But the fact is that any such web-app is “Mozilla browsers only” and we’re not any happier with that than “IE-only” web sites. (Well, maybe a little happier, but we feel guilty about it.)

If making your users flip a pref is acceptable what about having them install an add-on instead? You might want to talk to our developers in the mozilla.dev.platform newsgroup (also available as a mailing list) and discuss what you need and other ways of accomplishing it. http://www.mozilla.org/community/developer-forums.html#dev-platform (I recommend a real newsreader like Thunderbird over Google Groups if you can, otherwise the mailing list form if you prefer that.)

Allowing a mimetype to turn on remote XUL doesn’t help with security, because of course an attacker could do that.

XBL2 has a different security model than XBL and we’re still interested in pursuing that.
MarkC wrote on February 12, 2010 at 3:08 am:

Honestly, even a pref is a PITA. We sell our product to companies who then roll it out to their users – sometimes thousands of them. The basic model is a typical 3-tier web-app (DB, middle tier – ASP.NET in our case, front-end), but we have no control over the IT staff who are installing it at their premises.

In many cases we’re recommending that they use Prism – in which case a pref isn’t too bad, as it can be set on the master copy that’s then rolled out to each desktop. For those that prefer to use Firefox, even a pref might require the IT staff to go round and adjust each machine individually, so won’t go down well.

Re-writing the app as an extension would require us to deal with the issue of updating each client machine. Currently we can update our code on the customer’s server and know that every user has the latest version. With an extension we would have to push the update from the company’s servers and hope that everyone updates when prompted. Pushing it from our own site isn’t really viable, as we often customise code on a per-company basis, so it would quickly get messy if we’re hosting dozens of different versions of the app.

I suppose the ideal for us would be a whitelist of sites that can use XUL/XBL. 99.9% of sites would never trigger the whitelist request, as it would be tied to the XUL mimetype. For those that do trigger it, the prompt would be accompanied by warnings of doom and destruction, and would default to _not_ whitelisting the site. In our situation the end users tend to do what the IT dept tell them to, so either the IT staff could go around and whitelist their site themselves, or they could educate their users that they should click “Yes”, but only for their local intranet, and only for this application.

I’m glad to hear that XBL2 is still on the table though.
Adam Barth wrote on February 15, 2010 at 5:15 pm:

Thanks for a great read. Very interesting stuff.
