Category Archives: Firefox

AdBlock Plus’s effect on Firefox’s memory usage

[Update: Wladimir Palant has posted a response on the AdBlock Plus blog. Also, a Chrome developer using the handle “Klathmon” has posted numerous good comments in the Reddit discussion of this post, explaining why ad-blockers are inherently CPU- and memory-intensive, and why integrating ad-blocking into a browser wouldn’t necessarily help.]

AdBlock Plus (ABP) is the most popular add-on for Firefox. AMO says that it has almost 19 million users, which is almost triple the number of the second most popular add-on. I have happily used it myself for years — whenever I use a browser that doesn’t have an ad blocker installed I’m always horrified by the number of ads there are on the web.

But we recently learned that ABP can greatly increase the amount of memory used by Firefox.

First, just enabling ABP incurs a constant overhead of roughly 60–70 MiB. (This is on 64-bit builds; on 32-bit builds the number is probably a bit smaller.) This appears to be mostly due to additional JavaScript memory usage, though some of it is due to extra layout memory.

Second, there’s an overhead of about 4 MiB per iframe, which is mostly due to ABP injecting a giant stylesheet into every iframe. Many pages have multiple iframes, so this can add up quickly. For example, if I load TechCrunch and roll over the social buttons on every story (thus triggering the loading of lots of extra JS code), without ABP, Firefox uses about 194 MiB of physical memory. With ABP, that number more than doubles, to 417 MiB. This is despite the fact that ABP prevents some page elements (ads!) from being loaded.

An even more extreme example is this page, which contains over 400 iframes. Without ABP, Firefox uses about 370 MiB. With ABP, that number jumps to 1960 MiB. Unsurprisingly, the page also loads more slowly with ABP enabled.
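As a rough sanity check, the 400-iframe measurement can be used to back out a per-iframe cost. The sketch below uses only the numbers quoted above. Two things in it are assumptions: the fixed overhead of 65 MiB is simply the midpoint of the 60–70 MiB range, and the iframe count is taken as exactly 400 even though the page has "over 400".

```shell
# Numbers quoted in the post, in MiB.
without_abp=370
with_abp=1960
iframes=400          # assumption: the page has "over 400" iframes
fixed_overhead=65    # assumption: midpoint of the 60-70 MiB constant overhead

# awk does the floating-point division that plain shell arithmetic lacks.
# LC_ALL=C keeps the decimal separator a dot regardless of locale.
per_iframe=$(LC_ALL=C awk -v total="$with_abp" -v base="$without_abp" \
                          -v fixed="$fixed_overhead" -v n="$iframes" \
                          'BEGIN { printf "%.1f", (total - base - fixed) / n }')

echo "Estimated ABP overhead per iframe: ${per_iframe} MiB"
```

The result comes out a little under 4 MiB per iframe, which is in line with the estimate given earlier.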

So, it’s clear that ABP greatly increases Firefox’s memory usage. Now, this isn’t all bad. Many people (including me!) will be happy with this trade-off — they will gladly use extra memory in order to block ads. But if you’re using a low-end machine without much memory, you might have different priorities.

I hope that the ABP authors can work with us to reduce this overhead, though I’m not aware of any clear ideas on how to do so. In the meantime, it’s worth keeping these measurements in mind. In particular, if you hear people complaining about Firefox’s memory usage, one of the first questions to ask is whether they have ABP installed.

[A note about the comments: I have deleted 17 argumentative, repetitive, borderline-spam comments from a single commenter — after giving him a warning via email — and I will delete any further comments from him on this post. As a result, I also had to delete three replies to his comments from others, for which I apologize.]

Generational GC has landed

Big news: late last week, generational garbage collection landed. It was backed out at first due to some test failures, but then re-landed and appears to have stuck.

This helps with performance. There are certain workloads where generational GC makes the code run much faster, and Firefox hasn’t been able to keep up with Chrome on these. For example, it has made Firefox slightly faster on the Octane benchmark, and there is apparently quite a bit of headroom for additional improvements.

Interestingly, its effect on memory usage has been small. I was hoping that the early filtering of many short-lived objects would make the tenured heap grow more slowly and thus reduce memory usage, but the addition of other structures (such as the nursery and store buffers) appears to have balanced that out.

The changes to the graphs at AWSY have been all within the noise, with the exception of the “Fresh start” and “Fresh start [+30s]” measurements in the “explicit” graph, both of which ticked up slightly. This isn’t cause for concern, however, because the corresponding “resident” graph hasn’t increased accordingly, and “resident” is the real metric of interest.

“Compacting Generational GC” is the #1 item on the current MemShrink “Big Ticket Items” list. Hopefully the “compacting” part of that, which still remains to be done, will produce some sizeable memory wins.

DMD now works on Windows

DMD is our tool for improving Firefox’s memory reporting.  It helps identify where new memory reporters need to be added in order to reduce the “heap-unclassified” value in about:memory.

DMD has always worked well on Linux, and moderately well on Mac (it is crashy for some people).  And it works on Android and B2G.  But it has never worked on Windows.

So I’m happy to report that DMD now does work on Windows, thanks to the excellent efforts of Catalin Iacob.  If you’re on Windows and you’ve been seeing high “heap-unclassified” values, and you’re able to build Firefox yourself, please give DMD a try.

MemShrink progress, week 121–124

It’s been a quiet but steady four weeks for MemShrink with 19 bugs fixed, including several leaks.

The only fix that I feel is worth highlighting is bug 918207, in which I added support for fast, coarse-grained measurement of a tab’s memory consumption.  The implemented machinery isn’t currently exposed through the UI, though there are two bugs open that will use it:  a simple one that will implement a command for the developer toolbar, and a more complex one that will implement a constantly-updating memory monitor widget for the devtools pane.

See you next time!

Warning for Firefox devs planning to upgrade to Ubuntu 13.10

I just upgraded from Ubuntu 13.04 to Ubuntu 13.10, and Firefox wouldn’t build with either clang or GCC.

clang was initially failing during configure, complaining about not being able to find joystick.h, though the underlying failure was an inability to find stddef.h.  This Ubuntu bug describes a workaround, which is to do the following.

cd /usr/lib/clang/3.2/
sudo ln -s /usr/lib/llvm-3.2/lib/clang/3.2/include

With that in place, I clobbered and rebuilt, and clang complained about a problem in allocator.h relating to a name __allocator_base, and GCC complained about C++11 support being insufficient.

Both failures had the same underlying cause, which is that both compilers are hardwired to look for some GCC-4.7 headers (which they shouldn’t) as well as GCC-4.8 headers.  I filed a bug with Ubuntu about this.

I worked around the problem just by renaming /usr/include/c++/4.7/ and /usr/include/x86_64-linux-gnu/c++/4.7/.  There may be more elegant workarounds, but that was good enough for me.
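The renaming can be sketched as a small shell function. The function name `hide_dir` and the `.disabled` suffix are illustrative choices, not something from the Ubuntu bug; only the two directory paths come from the workaround described above.

```shell
# Move a directory aside by appending a suffix, so the compilers no longer
# find it but it can easily be restored later.
# hide_dir and the ".disabled" suffix are hypothetical names for this sketch.
hide_dir() {
    dir="${1%/}"                      # strip one trailing slash, if present
    if [ -d "$dir" ]; then
        mv "$dir" "$dir.disabled"
    else
        echo "skipping $dir: not a directory" >&2
    fi
}

# On an affected Ubuntu 13.10 machine this would need a root shell, e.g.:
#   hide_dir /usr/include/c++/4.7/
#   hide_dir /usr/include/x86_64-linux-gnu/c++/4.7/
```

Renaming rather than deleting keeps the workaround reversible: moving the directories back restores the original state.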

How to trigger a child process in desktop Firefox

Firefox is now multi-process, and not just for the plugin-container process.  For example, there is now (present but disabled in Firefox 25, and likely to be released in Firefox 27) a separate process that is used to update the thumbnails shown in a new tab.

As a result, sometimes you might want to test something in the presence of multiple processes.  Here’s how I’ve been doing it.

  • Delete the images in the thumbnails/ directory within the profile’s temporary directory.
    • On Linux it’s ~/.cache/mozilla/firefox/<profile>/thumbnails/.
    • On Mac it’s ~/Library/Caches/Firefox/Profiles/<profile>/thumbnails/.
    • On Windows it’s C:\Users\<username>\AppData\Local\Mozilla\Firefox\Profiles\<profile>\thumbnails\.
    • I’m not sure about Android.
  • Open about:newtab.  This triggers a thumbnails process.  It’ll live for about 60 seconds.  (If you’ve configured about:newtab to be blank rather than showing thumbnails, this might not work, though I’m not sure.)
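The first step above can be scripted. This is a minimal sketch that takes the thumbnails directory as an argument; the function name `clear_thumbnails` is hypothetical, and the profile directory name still has to be filled in by hand.

```shell
# Delete the cached thumbnail images in one thumbnails/ directory.
# clear_thumbnails is a hypothetical helper name, not part of Firefox.
clear_thumbnails() {
    thumbs="$1"
    if [ ! -d "$thumbs" ]; then
        echo "no thumbnails directory at $thumbs" >&2
        return 1
    fi
    # Remove only regular files directly inside the directory.
    find "$thumbs" -maxdepth 1 -type f -exec rm -f {} +
}

# Example for Linux, with <profile> replaced by the real profile directory:
#   clear_thumbnails "$HOME/.cache/mozilla/firefox/<profile>/thumbnails"
```

After running it, opening about:newtab should trigger the thumbnails process as described above.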

Please let me know if there’s a better way!

(And if anyone can give me extra info on the things I’m not sure about, I’ll update the text above accordingly.  Thanks!)

MemShrink progress, week 117–120

Lots of important MemShrink stuff has happened in the last 27 days:  22 bugs were fixed, and some of them were very important indeed.

Images

Timothy Nikkel fixed bug 847223, which greatly reduces peak memory consumption when loading image-heavy pages.  The combination of this fix and the fix from bug 689623 (which Timothy finished earlier this year and which shipped in Firefox 24) has completely solved our longstanding memory consumption problems with image-heavy pages!  This was the #1 item on the MemShrink big ticket items list.

To give you an idea of the effect of these two fixes, I did some rough measurements on a page containing thousands of images, which are summarized in the graph below.

[Graph: Improvements in Firefox's Memory Consumption on One Image-heavy Page]

First consider Firefox 23, which had neither fix, and which is represented by the purple line in the graph.  When loading the page, physical memory consumption would jump to about 3 GB, because every image in the page was decoded (a.k.a. decompressed).  That decoded data was retained so long as the page was in the foreground.

Next, consider Firefox 24 (and 25), which had the first fix, and which is represented by the green line on the graph.  When loading the page, physical memory consumption would still jump to almost 3 GB, because the images were still decoded.  But it would soon drop down to a few hundred MB, as the decoded data for non-visible images was discarded, and stay there (with some minor variations) while scrolling around the page. So the scrolling behaviour was much improved, but the memory consumption spike still occurred, which could still cause paging, out-of-memory problems, and the like.

Finally consider Firefox 26 (currently in the Aurora channel), which has both fixes, and which is represented by the red line on the graph.  When loading the page, physical memory jumps to a few hundred MB and stays there.  Furthermore, the loading time for the page dropped from ~5 seconds to ~1 second, because the unnecessary decoding of most of the images is skipped.

These measurements were quite rough, and there was quite a bit of variation, but the magnitude of the improvement is obvious.  And all these memory consumption improvements have occurred without hurting scrolling performance.  This is fantastic work by Timothy, and great news for all Firefox users who visit image-heavy pages.

[Update: Timothy emailed me this:  “Only minor thing is that we still need to turn it on for b2g. We flipped the pref for fennec on central (it’s not on aurora though). I’ve been delayed in testing b2g though, hopefully we can flip the pref on b2g soon. That’s the last major thing before declaring it totally solved.”]

[Update 2: This has hit Hacker News.]

Nuwa

Cervantes Yu landed Nuwa, which is a low-level optimization of B2G.  Quoting from the big ticket items list (where this was item #3):

Nuwa… aims to give B2G a pre-initialized template process from which every subsequent process will be forked… it greatly increases the ability for B2G processes to share unchanging data.  In one test run, this increased the number of apps that could be run simultaneously from five to nine

Nuwa is currently disabled by default, so that Cervantes can fine-tune it, but I believe it’s intended to ship with B2G version 1.3.  Fingers crossed it makes it!

Memory Reporting

I made some major simplifications to our memory reporting infrastructure, paving the way for future improvements.

First, we used to have two kinds of memory reporters:  uni-reporters (which report a single measurement) and multi-reporters (which report multiple measurements).  Multi-reporters, unsurprisingly, subsume uni-reporters, and so I got rid of uni-reporters, which simplified quite a bit of code.

Second, I removed about:compartments and folded its functionality into about:memory.  I originally created about:compartments at the height of our zombie compartment problem.  But ever since Kyle Huey made it more or less impossible for add-ons to cause zombie compartments, about:compartments has hardly been used.   I was able to fold about:compartments’ data into about:memory, so there’s no functionality loss, and this change simplified quite a bit more code.  If you visit about:compartments now you’ll get a message telling you to visit about:memory.

Third, I removed the smaps (size/rss/pss/swap) memory reporters.  These were only present on Linux, they were of questionable utility, and they complicated about:memory significantly.

Finally, I fixed a leak in about:memory.  Yeah, it was my fault.  Sorry!

Summit

The Mozilla summit is coming up!  In fact, I’m writing this report a day earlier than normal because I will be travelling to Toronto tomorrow.  Please forgive any delayed responses to comments, because I will be travelling for almost 24 hours to get there.

MemShrink progress, week 113–116

It’s been a relatively quiet four weeks for MemShrink, with 17 bugs fixed.  (Relatedly, in today’s MemShrink meeting we only had to triage 10 bugs, which is the lowest we’ve had for ages.)  Among the fixed bugs were lots for B2G leaks and leak-like things, many of which are hard to explain, but are important for the phone’s stability.

Fabrice Desré made a couple of notable B2G non-leak fixes.

On desktop, Firefox users who view about:memory may notice that it now sometimes mentions more than one process.  This is due to the thumbnails child process, which generates the thumbnails seen on the new tab page, and which is occasionally spawned and runs briefly in the background.  about:memory copes with this child process OK, but the mechanism it uses is sub-optimal, and I’m planning to rewrite it to be nicer and scale better in the presence of multiple child processes, because that’s a direction we’re heading in.

Finally, some sad news:  Justin Lebar, whose name should be familiar to any regular reader of these MemShrink reports, has left Mozilla.  Justin was a core MemShrink-er from the very beginning, and contributed greatly to the success of the project.  Thanks, Justin, and best of luck in the future!

Using include-what-you-use

include-what-you-use (a.k.a. IWYU) is a clang tool that tells you which #include statements should be added to and removed from a file.  Nicholas Cameron used it to speed up the building of gfx/layers by 12.5%.  I’ve also used it quite a bit within SpiderMonkey;  I’ve seen smaller build speed improvements but I’ve also been doing it in chunks over time.  Ms2ger started a tracking bug for all places where IWYU has been used in Mozilla code.

Ehsan asked for instructions on setting up IWYU.  There are official instructions, but I thought it might be helpful to document exactly what I did.

First, here is how I installed clang, based on Ehsan’s instructions.  I put the source code under $HOME/local/src and installed the build under $HOME/local.

  mkdir -p $HOME/local/src
  cd $HOME/local/src
  svn co http://llvm.org/svn/llvm-project/llvm/trunk llvm
  cd llvm/tools
  svn co http://llvm.org/svn/llvm-project/cfe/trunk clang
  # Back to the top of the llvm checkout, where configure lives.
  cd ..
  mkdir build
  cd build/
  ../configure --enable-optimized --disable-assertions --prefix=$HOME/local
  make
  sudo make install

Then I followed the “Building in-tree” instructions.  The first part is to get the IWYU code.

   cd $HOME/local/src/llvm/tools/clang/tools
   svn checkout http://include-what-you-use.googlecode.com/svn/trunk/ include-what-you-use

The second part was to do the following steps.

  • Edit tools/clang/tools/Makefile and add |include-what-you-use| to the DIRS variable.
  • Edit tools/clang/tools/CMakeLists.txt and add |add_subdirectory(include-what-you-use)|.
  • Re-build clang as per the above instructions.

After that, configure a Mozilla tree for a clang build and then run make with the following options: -j1 -k CXX=/home/njn/local/src/llvm/build/Release/bin/include-what-you-use. I’m not certain if the -j1 is necessary, but since IWYU spits out lots of output, it seemed wise. The -k tells make to keep building even after errors; for some reason, IWYU triggers compilation failure on every file it looks at.
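Put together, the invocation might look like the sketch below. The wrapper name `run_iwyu` and the log file name are illustrative assumptions; the make options are the ones given above. Both output streams are redirected to the log because the sketch does not assume which one IWYU writes to.

```shell
# Run a build with IWYU standing in for the C++ compiler, saving all output.
# run_iwyu is a hypothetical wrapper name; $1 is the path to the
# include-what-you-use binary and $2 is the log file to write.
run_iwyu() {
    iwyu_bin="$1"
    log_file="$2"
    # -j1 builds serially, so output for different files is not interleaved;
    # -k keeps going even though every file appears to fail to compile.
    make -j1 -k CXX="$iwyu_bin" > "$log_file" 2>&1
}

# Example, run from wherever you normally run make for the clang-configured tree:
#   run_iwyu "$HOME/local/src/llvm/build/Release/bin/include-what-you-use" iwyu.log
```

Expect the wrapper to return a non-zero status, since make reports the per-file failures even with -k.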

Pipe the output to a file, and you’ll see lots of stuff like this.

../jsarray.h should add these lines:
#include <stdint.h>                     // for uint32_t
#include <sys/types.h>                  // for int32_t
#include "dist/include/js/RootingAPI.h"  // for HandleObject, Handle, etc
#include "jsapi.h"                      // for Value, HandleObject, etc
#include "jsfriendapi.h"                // for JSID_TO_ATOM
#include "jstypes.h"                    // for JSBool
#include "vm/String.h"                  // for JSAtom
namespace JS { class Value; }
namespace js { class ExclusiveContext; }
struct JSContext;

../jsarray.h should remove these lines:

The full include-list for ../jsarray.h:
#include <stdint.h>                     // for uint32_t
#include <sys/types.h>                  // for int32_t
#include "dist/include/js/RootingAPI.h"  // for HandleObject, Handle, etc
#include "jsapi.h"                      // for Value, HandleObject, etc
#include "jsfriendapi.h"                // for JSID_TO_ATOM
#include "jsobj.h"                      // for JSObject (ptr only), etc
#include "jspubtd.h"                    // for jsid
#include "jstypes.h"                    // for JSBool
#include "vm/String.h"                  // for JSAtom
namespace JS { class Value; }
namespace js { class ArrayObject; }  // lines 44-44
namespace js { class ExclusiveContext; }
struct JSContext;

I focused on addressing the “should remove these lines” #includes, and I did it manually. There’s also a script you can use to automatically do everything for you;  I don’t know how well it works.
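To pull just the removal suggestions out of a saved log, an awk filter along these lines can help. It relies only on the two section headers visible in the sample output above ("should remove these lines:" and "The full include-list for"). The function name `iwyu_removals` is hypothetical, and the sketch assumes that removal entries are the non-blank lines between those two headers.

```shell
# Print each "should remove these lines:" header that has at least one
# entry under it, followed by those entries. Files with nothing to remove
# are skipped. iwyu_removals is a hypothetical helper name.
iwyu_removals() {
    awk '
        / should remove these lines:/ { header = $0; in_section = 1; next }
        /^The full include-list for / { in_section = 0 }
        in_section && NF {
            # Emit the header lazily, only once an entry is seen.
            if (header != "") { print header; header = "" }
            print
        }
    ' "$1"
}

# Example:
#   iwyu_removals iwyu.log > removals.txt
```

Given that IWYU's suggestions are sometimes wrong, the filtered list is best treated as a to-do list to check by hand rather than something to apply blindly.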

Note that IWYU’s output is just plain wrong about 5% of the time — i.e. it says you can remove #includes that you clearly cannot.  (A lot of the time this seems to be because it hasn’t realized that a macro is needed.)  I also found that, while it produced output for all .cpp files, it only produced output for some of the .h files.  No idea why. Finally, it doesn’t know about local idioms; in particular, if you have platform-dependent code, its suggestions are often terrible because it only sees the files for the platform you are building on.

Good luck!

MemShrink progress, week 109–112

There’s been a lot of focus on B2G memory consumption in the past four weeks.  Indeed, of the 38 MemShrink bugs fixed in that time, a clear majority of them relate in some way to B2G.

In particular, Justin Lebar, Kyle Huey and Andrew McCreight have done a ton of important work tracking down leaks in both Gecko and Gaia.  Many of these have been reported by B2G partner companies doing stress testing such as opening and closing apps hundreds or thousands of times over a long period.  Some examples (including three MemShrink P1s) are here, here, here, here, here, here, here and here.  There are still some P1s remaining (e.g. here, here, here).  This work is painstaking and requires lots of futzing around with low-level tools such as the GC/CC logs, unfortunately.

Relatedly, Justin modified the JS memory reporter to report “notable” strings, which includes smallish strings that are duplicated many times, a case that has occurred on B2G a couple of times.  Justin also moved some of the “heap-*” reports that previously lived in about:memory’s “Other measurements” section into the “explicit” tree.  This makes “explicit” closer to “resident” a lot of the time, which is a useful property.

Finally, Luke Wagner greatly reduced the peak memory usage seen while parsing large asm.js examples.  For the Unreal demo, this reduced the peak from 881 MB to 6 MB, and reduced start-up time by 1.5 seconds!  Luke also slightly reduced the size of JSScript, which is one of the very common structures on the JS GC heap, thus reducing pressure on the GC heap, which is always a good thing.