The story of a tricky bug

The Bug Report

A few weeks ago I skimmed through /r/firefox and saw a post by a user named DeeDee_Z complaining about high memory usage in Firefox. Somebody helpfully suggested that DeeDee_Z look at about:memory, which revealed thousands of blank windows like this:

  │    │  ├────0.15 MB (00.01%) ++ top(about:blank, id=1001)
  │    │  ├────0.15 MB (00.01%) ++ top(about:blank, id=1003)
  │    │  ├────0.15 MB (00.01%) ++ top(about:blank, id=1005

I filed bug 1041808 and asked DeeDee_Z to sign up to Bugzilla so s/he could join the discussion. What followed was several weeks of back and forth, involving suggestions from no fewer than seven Mozilla employees. DeeDee_Z patiently tried numerous diagnostic steps, such as running in safe mode, pasting info from about:support, getting GC/CC logs, and doing a malware scan. (Though s/he did draw the line at running wireshark to detect if any unusual network activity was happening, which I think is fair enough!)

But still there was no progress. Nobody else was able to reproduce the problem, and even DeeDee_Z had trouble making it happen reliably.

And then on August 12, more than three weeks after the bug report was filed, Peter Van der Beken commented that he had seen similar behaviour on his machine, and by adding some logging to Firefox’s guts he had a strong suspicion that it was related to having the “keep until” setting for cookies set to “ask me every time”. DeeDee_Z had the same setting, and quickly confirmed that changing it fixed the problem. Hooray!

I don’t know how Peter found the bug report — maybe he went to file a new bug report about this problem and Bugzilla’s duplicate detection identified the existing bug report — but it’s great that he did. Two days later he landed a simple patch to fix the problem. In Peter’s words:

The patch makes the dialog for allowing/denying cookies actually show up when a cookie is set through the DOM API. Without the patch the dialog is created, but never shown and so it sticks around forever.

This fix is on track to ship in Firefox 34, which is due to be released in late November.

Takeaway lessons

There are a number of takeaway lessons from this story.

First, a determined bug reporter is enormously helpful. I often see vague complaints about Firefox on websites (or even in Bugzilla) with no responses to follow-up questions. In contrast, DeeDee_Z’s initial complaint was reasonably detailed. More importantly, s/he did all the follow-up steps that people asked her/him to do, both on Reddit and in Bugzilla. The about:memory data made it clear it was some kind of window leak, and although the follow-up diagnostic steps didn’t lead to the fix in this case, they did help rule out a number of possibilities. Also, DeeDee_Z was extremely quick to confirm that Peter’s suggestion about the cookie setting fixed the problem, which was very helpful.

Second, many (most?) problems don’t affect everyone. This was quite a nasty problem, but the “ask me every time” setting is not commonly used because causes lots of dialogs to pop up, which few users have the patience to deal with. It’s very common that people have a problem with Firefox (or any other piece of software), incorrectly assume that it affects everyone else equally, and conclude with “I can’t believe anybody uses this thing”. I call this “your experience is not universal“. This is particular true for web browsers, which unfortunately are enormously complicated and have many combinations of settings get little or no testing.

Third, and relatedly, it’s difficult to fix problems that you can’t reproduce. It’s only because Peter could reproduce the problem that he was able to do the logging that led him to the solution.

Fourth, it’s important to file bug reports in Bugzilla. Bugzilla is effectively the Mozilla project’s memory, and it’s monitored by many contributors. The visibility of a bug report in Bugzilla is vastly higher than a random complaint on some other website. If the bug report hadn’t been in Bugzilla, Peter wouldn’t have stumbled across it. So even if he had fixed it, DeeDee_Z wouldn’t have known and probably would have had been stuck with the problem until Firefox 34 came out. That’s assuming s/he didn’t switch to a different browser in the meantime.

Fifth, Mozilla does care about memory usage, particularly cases where memory usage balloons unreasonably. We’ve had a project called MemShrink running for more than three years now. We’ve fixed hundreds of problems, big and small, and continue to do so. Please use about:memory to start the diagnosis, and add the “[MemShrink]” tag to any bug reports in Bugzilla that relate to memory usage, and we will triage them in our fortnightly MemShrink meetings.

Finally, luck plays a part. I don’t often look at /r/firefox, and I could have easily missed DeeDee_Z’s complaint. Also, it was lucky that Peter found the bug in Bugzilla. Many tricky bugs don’t get resolved this quickly.

5 Responses to The story of a tricky bug

  1. I don’t see the trickiness, (yes, I know, hindsight is 20/20, ;)).

    First: “Safemode” doesn’t really deserve that name anymore, as many custom settings and plugins are not disabled or ignored when using it. Among them is the setting which was responsible for the bug, namely “network.cookie.lifetimePolicy”.

    And second: The answer was already in https://bugzilla.mozilla.org/show_bug.cgi?id=1041808#c34 , where s/he described what has been done to the profile after creating a new one. After changing these settings, the problem came back.

  2. What about some combinatorial testing…

    A short list of web sites to visit, all major options flipped at least once.
    Or better all possible two flip configurations. It could be run just rarely
    to catch the most obvious incompatible errors.
    A

  3. I read the phrase “reproduce a bug” on Firefox related blogs before and I ASSUMED developers are able to import the “about:config-dump” of user X into a firefox instance on their local machine so as to visit themselves sites that caused problems for X with the exact confinguration of X. The bug report you link to makes me think I was misguided. But woudn’t my assumption be the logical way to cope with the complexity you describe in an efficient manner??

    • Nicholas Nethercote

      It’s possible to bundle up an entire profile, but that’s rare, and it contains a lot of sensitive information. A more limited export tool is an interesting idea. It wouldn’t help with all cases; for example, lots of problems are due to graphics drivers. But it could help with quite a few cases.

      • Nicholas Nethercote

        Thinking some more, about:support does list all the important modified preferences. And I think the relevant preference was listed in the bug; I think it’s “network.cookie.cookieBehavior: 1″. But you have to know what that means, which isn’t straighforward.