MemShrink progress, week 121–124

It’s been a quiet but steady four weeks for MemShrink with 19 bugs fixed, including several leaks.

The only fix that I feel is worth highlighting is bug 918207, in which I added support for fast, coarse-grained measurement of a tab’s memory consumption.  The implemented machinery isn’t currently exposed through the UI, though there are two bugs open that will use it:  a simple one that will implement a command for the developer toolbar, and a more complex one that will implement a constantly-updating memory monitor widget for the devtools pane.

See you next time!

Warning for Firefox devs planning to upgrade to Ubuntu 13.10

I just upgraded from Ubuntu 13.04 to Ubuntu 13.10, and Firefox wouldn’t build with either clang or GCC.

clang was initially failing during configure, complaining about not being able to find joystick.h, though the underlying failure was an inability to find stddef.h.  This Ubuntu bug describes a workaround, which is to do the following.

cd /usr/lib/clang/3.2/
sudo ln -s /usr/lib/llvm-3.2/lib/clang/3.2/include

With that in place, I clobbered and rebuilt, and clang complained about a problem in allocator.h relating to a name __allocator_base, and GCC complained about C++11 support being insufficient.

Both failures had the same underlying cause, which is that both compilers are hardwired to look for some GCC-4.7 headers (which they shouldn’t) as well as GCC-4.8 headers.  I filed a bug with Ubuntu about this.

I worked around the problem just by renaming /usr/include/c++/4.7/ and /usr/include/x86_64-linux-gnu/c++/4.7/.  There may be more elegant workarounds, but that was good enough for me.
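For anyone who wants to script that rename, here is a minimal sketch.  The function name, the optional root argument and the ".disabled" suffix are my own inventions for illustration; only the two directory paths come from the text above.

```shell
# Hypothetical helper: move the GCC 4.7 header directories aside so the
# compilers can no longer find them.
# Usage: hide_gcc47_headers [root]
#   root is an optional prefix (default: empty, i.e. the real filesystem),
#   which makes it possible to try the function on a scratch directory first.
hide_gcc47_headers() {
  root="${1:-}"
  for dir in "$root/usr/include/c++/4.7" \
             "$root/usr/include/x86_64-linux-gnu/c++/4.7"; do
    # Only touch directories that actually exist.
    if [ -d "$dir" ]; then
      mv "$dir" "$dir.disabled"
    fi
  done
}
```

Renaming the real directories needs root privileges, so the function would have to be defined and called from a root shell.  Moving the directories back undoes the workaround.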

How to trigger a child process in desktop Firefox

Firefox is now multi-process, and not just for the plugin-container process.  For example, there is now (present but disabled in Firefox 25, and likely to be released in Firefox 27) a separate process that is used to update the thumbnails shown in a new tab.

As a result, sometimes you might want to test something in the presence of multiple processes.  Here’s how I’ve been doing it.

  • Delete the images in the thumbnails/ directory within the profile’s temporary directory.
    • On Linux it’s ~/.cache/mozilla/firefox/<profile>/thumbnails/.
    • On Mac it’s ~/Library/Caches/Firefox/Profiles/<profile>/thumbnails/.
    • On Windows it’s C:\Users\<username>\AppData\Local\Mozilla\Firefox\Profiles\<profile>\thumbnails\.
    • I’m not sure about Android.
  • Open about:newtab.  This triggers a thumbnails process.  It’ll live for about 60 seconds.  (If you’ve configured about:newtab to be blank rather than showing thumbnails, this might not work, though I’m not sure.)
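The first step above can be sketched as a pair of shell functions.  This is only a sketch: the function names are hypothetical, the profile directory name has to be supplied by you, and it only covers the Linux and Mac paths listed above (not Windows or Android).

```shell
# Print the thumbnails cache directory for a profile.
# Usage: thumbnails_dir <profile> [os]
#   <profile> is the profile directory name; [os] defaults to `uname -s`.
thumbnails_dir() {
  profile="$1"
  os="${2:-$(uname -s)}"
  case "$os" in
    Linux)  echo "$HOME/.cache/mozilla/firefox/$profile/thumbnails" ;;
    Darwin) echo "$HOME/Library/Caches/Firefox/Profiles/$profile/thumbnails" ;;
    *)      echo "unsupported OS: $os" >&2; return 1 ;;
  esac
}

# Delete the cached thumbnail images so that opening about:newtab has to
# regenerate them.  Takes the same arguments as thumbnails_dir.
clear_thumbnails() {
  dir="$(thumbnails_dir "$@")" || return 1
  # Nothing to do if the cache directory does not exist yet.
  [ -d "$dir" ] || return 0
  rm -f -- "$dir"/*
}
```

After running clear_thumbnails with your profile name, opening about:newtab should trigger the thumbnails process as described above.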

Please let me know if there’s a better way!

(And if anyone can give me extra info on the things I’m not sure about, I’ll update the text above accordingly.  Thanks!)

MemShrink progress, week 117–120

Lots of important MemShrink stuff has happened in the last 27 days:  22 bugs were fixed, and some of them were very important indeed.

Images

Timothy Nikkel fixed bug 847223, which greatly reduces peak memory consumption when loading image-heavy pages.  The combination of this fix and the fix from bug 689623 (which Timothy finished earlier this year and which shipped in Firefox 24) has completely solved our longstanding memory consumption problems with image-heavy pages!  This was the #1 item on the MemShrink big ticket items list.

To give you an idea of the effect of these two fixes, I did some rough measurements on a page containing thousands of images, which are summarized in the graph below.

Improvements in Firefox's Memory Consumption on One Image-heavy Page

First consider Firefox 23, which had neither fix, and which is represented by the purple line in the graph.  When loading the page, physical memory consumption would jump to about 3 GB, because every image in the page was decoded (a.k.a. decompressed).  That decoded data was retained so long as the page was in the foreground.

Next, consider Firefox 24 (and 25), which had the first fix, and which is represented by the green line on the graph.  When loading the page, physical memory consumption would still jump to almost 3 GB, because the images are still decoded.  But it would soon drop down to a few hundred MB, as the decoded data for non-visible images was discarded, and stay there (with some minor variations) while scrolling around the page. So the scrolling behaviour was much improved, but the memory consumption spike still occurred, which could still cause paging, out-of-memory problems, and the like.

Finally consider Firefox 26 (currently in the Aurora channel), which has both fixes, and which is represented by the red line on the graph.  When loading the page, physical memory jumps to a few hundred MB and stays there.  Furthermore, the loading time for the page dropped from ~5 seconds to ~1 second, because the unnecessary decoding of most of the images is skipped.

These measurements were quite rough, and there was quite a bit of variation, but the magnitude of the improvement is obvious.  And all these memory consumption improvements have occurred without hurting scrolling performance.  This is fantastic work by Timothy, and great news for all Firefox users who visit image-heavy pages.

[Update: Timothy emailed me this:  “Only minor thing is that we still need to turn it on for b2g. We flipped the pref for fennec on central (it’s not on aurora though). I’ve been delayed in testing b2g though, hopefully we can flip the pref on b2g soon. That’s the last major thing before declaring it totally solved.”]

[Update 2: This has hit Hacker News.]

NuWa

Cervantes Yu landed Nuwa, which is a low-level optimization of B2G.  Quoting from the big ticket items list (where this was item #3):

Nuwa… aims to give B2G a pre-initialized template process from which every subsequent process will be forked… it greatly increases the ability for B2G processes to share unchanging data.  In one test run, this increased the number of apps that could be run simultaneously from five to nine

Nuwa is currently disabled by default, so that Cervantes can fine-tune it, but I believe it’s intended to ship with B2G version 1.3.  Fingers crossed it makes it!

Memory Reporting

I made some major simplifications to our memory reporting infrastructure, paving the way for future improvements.

First, we used to have two kinds of memory reporters:  uni-reporters (which report a single measurement) and multi-reporters (which report multiple measurements).  Multi-reporters, unsurprisingly, subsume uni-reporters, and so I got rid of uni-reporters, which simplified quite a bit of code.

Second, I removed about:compartments and folded its functionality into about:memory.  I originally created about:compartments at the height of our zombie compartment problem.  But ever since Kyle Huey made it more or less impossible for add-ons to cause zombie compartments, about:compartments has hardly been used.   I was able to fold about:compartments’ data into about:memory, so there’s no functionality loss, and this change simplified quite a bit more code.  If you visit about:compartments now you’ll get a message telling you to visit about:memory.

Third, I removed the smaps (size/rss/pss/swap) memory reporters.  These were only present on Linux, they were of questionable utility, and they complicated about:memory significantly.

Finally, I fixed a leak in about:memory.  Yeah, it was my fault.  Sorry!

Summit

The Mozilla summit is coming up!  In fact, I’m writing this report a day earlier than normal because I will be travelling to Toronto tomorrow.  Please forgive any delayed responses to comments, because I will be travelling for almost 24 hours to get there.

MemShrink progress, week 113–116

It’s been a relatively quiet four weeks for MemShrink, with 17 bugs fixed.  (Relatedly, in today’s MemShrink meeting we only had to triage 10 bugs, which is the lowest we’ve had for ages.)  Among the fixed bugs were lots of B2G leaks and leak-like things, many of which are hard to explain, but which are important for the phone’s stability.

Fabrice Desré made a couple of notable B2G non-leak fixes.

On desktop, Firefox users who view about:memory may notice that it now sometimes mentions more than one process.  This is due to the thumbnails child process, which generates the thumbnails seen on the new tab page, and which is occasionally spawned and runs briefly in the background.  about:memory copes with this child process OK, but the mechanism it uses is sub-optimal, and I’m planning to rewrite it to be nicer and scale better in the presence of multiple child processes, because that’s a direction we’re heading in.

Finally, some sad news:  Justin Lebar, whose name should be familiar to any regular reader of these MemShrink reports, has left Mozilla.  Justin was a core MemShrink-er from the very beginning, and contributed greatly to the success of the project.  Thanks, Justin, and best of luck in the future!

Using include-what-you-use

include-what-you-use (a.k.a. IWYU) is a clang tool that tells you which #include statements should be added and removed from a file.  Nicholas Cameron used it to speed up the building of gfx/layers by 12.5%.  I’ve also used it quite a bit within SpiderMonkey;  I’ve seen smaller build speed improvements but I’ve also been doing it in chunks over time.  Ms2ger started a tracking bug for all places where IWYU has been used in Mozilla code.

Ehsan asked for instructions on setting up IWYU.  There are official instructions, but I thought it might be helpful to document exactly what I did.

First, here is how I installed clang, based on Ehsan’s instructions.  I put the source code under $HOME/local/src and installed the build under $HOME/local.

  # -p also creates $HOME/local if it doesn't exist yet.
  mkdir -p $HOME/local/src
  cd $HOME/local/src
  svn co http://llvm.org/svn/llvm-project/llvm/trunk llvm
  cd llvm/tools
  svn co http://llvm.org/svn/llvm-project/cfe/trunk clang
  # Back up to the llvm/ directory, which holds the configure script.
  cd ..
  mkdir build
  cd build/
  ../configure --enable-optimized --disable-assertions --prefix=$HOME/local
  make
  sudo make install

Then I followed the “Building in-tree” instructions.  The first part is to get the IWYU code.

   cd $HOME/local/src/llvm/tools/clang/tools
   svn checkout http://include-what-you-use.googlecode.com/svn/trunk/ include-what-you-use

The second part was to do the following steps.

  • Edit tools/clang/tools/Makefile and add |include-what-you-use| to the DIRS variable.
  • Edit tools/clang/tools/CMakeLists.txt and add |add_subdirectory(include-what-you-use)|.
  • Re-build clang as per the above instructions.

After that, configure a Mozilla tree for a clang build and then run make with the following options: -j1 -k CXX=/home/njn/local/src/llvm/build/Release/bin/include-what-you-use. I’m not certain if the -j1 is necessary, but since IWYU spits out lots of output, it seemed wise. The -k tells make to keep building even after errors; for some reason, IWYU triggers compilation failure on every file it looks at.

Pipe the output to a file, and you’ll see lots of stuff like this.

../jsarray.h should add these lines:
#include <stdint.h>                     // for uint32_t
#include <sys/types.h>                  // for int32_t
#include "dist/include/js/RootingAPI.h"  // for HandleObject, Handle, etc
#include "jsapi.h"                      // for Value, HandleObject, etc
#include "jsfriendapi.h"                // for JSID_TO_ATOM
#include "jstypes.h"                    // for JSBool
#include "vm/String.h"                  // for JSAtom
namespace JS { class Value; }
namespace js { class ExclusiveContext; }
struct JSContext;

../jsarray.h should remove these lines:

The full include-list for ../jsarray.h:
#include <stdint.h>                     // for uint32_t
#include <sys/types.h>                  // for int32_t
#include "dist/include/js/RootingAPI.h"  // for HandleObject, Handle, etc
#include "jsapi.h"                      // for Value, HandleObject, etc
#include "jsfriendapi.h"                // for JSID_TO_ATOM
#include "jsobj.h"                      // for JSObject (ptr only), etc
#include "jspubtd.h"                    // for jsid
#include "jstypes.h"                    // for JSBool
#include "vm/String.h"                  // for JSAtom
namespace JS { class Value; }
namespace js { class ArrayObject; }  // lines 44-44
namespace js { class ExclusiveContext; }
struct JSContext;

I focused on addressing the “should remove these lines” #includes, and I did it manually. There’s also a script you can use to automatically do everything for you;  I don’t know how well it works.
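If you want to go through only the “should remove these lines” sections, something like the following sketch can pull them out of the saved output.  The function name is hypothetical, and it assumes only what the sample above shows: each section starts with a “<file> should remove these lines:” header and ends at the next blank line.  It also assumes file paths contain no spaces.

```shell
# Print every line of each "should remove these lines" section in an IWYU
# log, prefixed with the file it applies to.
# Usage: iwyu_removals <logfile>
iwyu_removals() {
  awk '
    # Section header: remember the file name (first field) and start a section.
    / should remove these lines:$/ { file = $1; in_section = 1; next }
    # A blank line ends the current section.
    in_section && NF == 0          { in_section = 0; next }
    # Anything else inside a section is a suggested removal.
    in_section                     { print file ": " $0 }
  ' "$1"
}
```

Files with an empty section, like jsarray.h above, produce no output at all, which makes it easy to see which files have suggested removals.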

Note that IWYU’s output is just plain wrong about 5% of the time — i.e. it says you can remove #includes that you clearly cannot.  (A lot of the time this seems to be because it hasn’t realized that a macro is needed.)  I also found that, while it produced output for all .cpp files, it only produced output for some of the .h files.  No idea why. Finally, it doesn’t know about local idioms; in particular, if you have platform-dependent code, its suggestions are often terrible because it only sees the files for the platform you are building on.

Good luck!

MemShrink progress, week 109–112

There’s been a lot of focus on B2G memory consumption in the past four weeks.  Indeed, of the 38 MemShrink bugs fixed in that time, a clear majority of them relate in some way to B2G.

In particular, Justin Lebar, Kyle Huey and Andrew McCreight have done a ton of important work tracking down leaks in both Gecko and Gaia.  Many of these have been reported by B2G partner companies doing stress testing such as opening and closing apps 100s or 1000s of times over long periods.  Some examples (including three MemShrink P1s) are here, here, here, here, here, here, here and here.  There are still some P1s remaining (e.g. here, here, here).  This work is painstaking and requires lots of futzing around with low-level tools such as the GC/CC logs, unfortunately.

Relatedly, Justin modified the JS memory reporter to report “notable” strings, which includes smallish strings that are duplicated many times, a case that has occurred on B2G a couple of times.  Justin also moved some of the “heap-*” reports that previously lived in about:memory’s “Other measurements” section into the “explicit” tree.  This makes “explicit” closer to “resident” a lot of the time, which is a useful property.

Finally, Luke Wagner greatly reduced the peak memory usage seen while parsing large asm.js examples.  For the Unreal demo, this reduced the peak from 881MB to 6MB, and reduced start-up time by 1.5 seconds!  Luke also slightly reduced the size of JSScript, which is one of the very common structures on the JS GC heap, thus reducing pressure on the GC heap, which is always a good thing.

MemShrink progress, week 105–108

This is the first of the every-four-weeks MemShrink reports that I’m now doing.  The 21 bugs fixed in the past four weeks include 11 leak fixes, which is great, but I won’t bother describing them individually.  Especially when I have several other particularly impressive fixes to describe…

Image Handling

Back in March I described how Timothy Nikkel had greatly improved Firefox’s handling of image-heavy pages.  Unfortunately, the fix had to be disabled in Firefox 22 and Firefox 23 because it caused jerky scrolling on pages with lots of small images, such as Pinterest.

Happily, Timothy has now fixed those problems, and so his previous change has been re-enabled in Firefox 24.  This takes a big chunk out of the #1 item on the MemShrink big ticket items list.  Fantastic news!

Lazy Bytecode Generation

Brian Hackett finished implementing lazy bytecode generation.  This change means that JavaScript functions don’t have bytecode generated for them until they run.  Because lots of websites use libraries like jQuery, in practice a lot of JS functions are never run, and we’ve found this can reduce Firefox’s memory consumption by 5% or more on common workloads!  That’s a huge, general improvement.

Furthermore, it significantly reduces the number of things that are allocated on the GC heap (i.e. scripts, strings, objects and shapes that are created when bytecode for a function is generated).  This reduces pressure on the GC which makes it less likely we’ll have bad GC behaviour (e.g. pauses, or too much memory consumption) in cases where the GC heuristics aren’t optimal.

The completion of this finished off item #5 on the old Memshrink big ticket items list.  Great stuff.  This will be in Firefox 24.

Add-on Memory Reporting

Nils Maier implemented add-on memory reporting in about:memory.  Here’s some example output from my current session.

├───33,345,136 B (05.08%) -- add-ons
│   ├──18,818,336 B (02.87%) ++ {d10d0bf8-f5b5-c8b4-a8b2-2b9879e08c5d}
│   ├──11,830,424 B (01.80%) ++ {59c81df5-4b7a-477b-912d-4e0fdf64e5f2}
│   └───2,696,376 B (00.41%) ++ treestyletab@piro.sakura.ne.jp/js-non-window/zones/zone(0x7fbd7bf53800)

It’s obvious that Tree Style Tabs is taking up 2.7 MB.  What about the other two entries?  It’s not immediately obvious, but if I look in about:support at the “extensions” section I can see that they are AdBlock Plus and ChatZilla.
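To read off sizes like that 2.7 MB figure without doing the division in your head, here is a rough sketch that converts the byte counts in pasted about:memory lines to decimal megabytes.  The function name is hypothetical, and it assumes the line layout shown above: a byte count, then a field containing just “B”, with the entry’s name as the last field.

```shell
# Read about:memory tree lines on stdin and print "<size> MB  <name>" for each.
addon_sizes() {
  awk '
    {
      for (i = 2; i <= NF; i++) {
        if ($i == "B") {
          # The field before "B" holds the byte count, possibly prefixed by
          # tree-drawing characters and containing commas; keep only digits.
          bytes = $(i - 1)
          gsub(/[^0-9]/, "", bytes)
          printf "%.1f MB  %s\n", bytes / 1000000, $NF
          break
        }
      }
    }
  '
}
```

Piping the Tree Style Tabs line from the output above through addon_sizes gives 2.7 MB, matching the figure quoted in the text.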

If you’re wondering why those add-ons are reported as hex strings, it’s due to a combination of the packaging of each individual add-on, and the fact that the memory reporting code is C++ and the add-on identification code is JS and there aren’t yet good APIs to communicate between the two.  (Yes, it’s not ideal and should be improved, but it’s a good start.)  Also, not all add-on memory is reported, just that in JS compartments;  old-style XUL add-ons in particular can have their memory consumption under-reported.

Despite the shortcomings, this is a big deal.  Users have been asking for this information for years, and we’ve finally got it.  (Admittedly, the fact that we’ve tamed add-on leaks makes it less important than it used to be, but it’s still cool.)  This will also be in Firefox 24.

b2g

Gregor Wagner has landed a nice collection of patches to help the Twitter and Notes+ apps on B2G.

While on the topic of B2G, in today’s MemShrink meeting we discussed the ongoing problem of slow memory leaks in the main B2G process.  Such leaks can cause the phone to crash or become flaky after it’s been running for hours or days or weeks, and they’re really painful to reproduce and diagnose.  Our partners are finding these leaks when doing multi-hour stress tests as part of their QA processes.  In contrast, Mozilla doesn’t really have any such testing, and as a result we are reacting, flat-footed, to external reports, rather than catching them early ourselves.  This is a big problem because users will rightly expect to have their phones run for weeks (or even months) without rebooting.

Those of us present at the meeting weren’t quite sure how we can improve our QA situation to look for these leaks.  I’d be interested to hear any suggestions.  Thanks!

MemShrink’s 2nd Birthday

June 14, 2013 is the second anniversary of the first MemShrink meeting.  This time last year I took the opportunity to write about the major achievements from MemShrink’s first year.  Unsurprisingly, since then we’ve been picking fruit from higher in the tree, so the advances have been less dramatic.  But it’s been 11 months since I last updated the “big ticket items” list, so I will take this opportunity to do so, providing a partial summary of the past year at the same time.

The Old Big Ticket Items List

#5: Better Script Handling

This had two parts.  The first part was the sharing of immutable parts of scripts, which Till Schneidereit implemented.  It can save multiple MiBs of memory, particularly if you have numerous tabs open from the same website.

The second part is lazy bytecode generation, which Brian Hackett has implemented and landed, though it hasn’t yet been enabled.  Brian is working out the remaining kinks and hopes to enable it by the end of the current (v24) cycle.  Hopefully he will, because measurements have shown that it can reduce Firefox’s memory consumption by 5% or more on common workloads!  That’s a huge, general improvement.  Furthermore, it significantly reduces the number of things that are allocated on the GC heap (i.e. scripts, strings, objects and shapes that are created when bytecode for a function is generated).  This reduces pressure on the GC which makes it less likely we’ll have bad GC behaviour (e.g. pauses, or too much memory consumption) in cases where the GC heuristics aren’t optimal.

So the first part is done and the second is imminent, which is enough to cross this item off the list.  [Update:  Brian just enabled lazy bytecode on trunk!]

#4: Regain compartment-per-global losses

Bill McCloskey implemented zones, which restructured the heap to allow a certain amount of sharing between zones. This greatly reduced wasted space and reduced memory consumption in common cases by ~5%.

Some more could still be done for this issue.  In particular, it’s not clear if things have improved enough that many small JSMs can be used without wasting memory.  Nonetheless, things have improved enough that I’m happy to take this item off the list.

#3: Boot2Gecko

This item was about getting about:memory (or something similar) working on B2G, and using it to optimize memory.  This was done some time ago and the memory reporters (plus DMD) were enormously helpful in improving memory consumption.  Many of the fixes fell under the umbrella of Operation Slim Fast.

So I will remove this particular item from the list, but memory optimizations for B2G are far from over, as we’ll see below.

#2: Compacting Generational GC

See below.

#1: Better Foreground Tab Image Handling

See below.

The New Big Ticket Items List

#5: pdf.js

pdf.js was recently made the default way of opening PDFs in Firefox, replacing plug-ins such as Adobe Reader.  While this is great in a number of respects, such as security, it’s not as good for memory consumption, because pdf.js can use a lot of memory in at least some cases.  We need to investigate the known cases and improve things.

#4: Dev tools

While we’ve made great progress with memory profiling tools that help Firefox developers, the situation is not nearly as good for web developers.  Google Chrome has heap profiling tools for web developers, and Firefox should too.  (The design space for heap profilers is quite large and so Firefox shouldn’t just copy Chrome’s tools.)  Alexandre Poirot has started work on a promising prototype, though there is a lot of work remaining before any such prototype can make it into a release.

#3: B2G Nuwa

Cervantes Yu and Thinker Li have been working on Nuwa, which aims to give B2G a pre-initialized template process from which every subsequent process will be forked.  This might sound esoteric, but the important part is that it greatly increases the ability for B2G processes to share unchanging data.  In one test run, this increased the number of apps that could be run simultaneously from five to nine, which is obviously a big deal.  The downside is that getting it to work requires some extremely hairy fiddling with low-level code.  Fingers crossed it can be made to work reliably!

Beyond Nuwa, there are still plenty of other ways that B2G can have its memory consumption optimized, as you’d expect in an immature mobile OS.  Although I won’t list anything else in the big ticket items list, work will continue here, as per MemShrink’s primary aim: “MemShrink is a project that aims to reduce the memory consumption of Firefox (on desktop and mobile) and Firefox OS.”

#2: Compacting Generational GC

Generational GC will reduce fragmentation, reduce the working set size, and speed up collections.  Great progress has been made here — the enormous task of exactly rooting the JS engine and the browser is basically finished, helped along greatly by a static analysis implemented by Brian Hackett.  And Terrence Cole and others are well into converting the collector to be generational.  So we’re a lot closer than we were, but there is still some way to go.  So this item is steady at #2.

#1: Better Foreground Tab Image Handling

Firefox still uses much more memory than other browsers on image-heavy pages.  Fortunately, a great deal of progress has been made here.  Timothy Nikkel fixed things so that memory usage when scrolling through image-heavy pages is greatly reduced.  However, this change caused some jank on pages with lots of small images, so it’s currently disabled on the non-trunk branches.  Also, there is still a memory spike when image-heavy pages are first loaded, which needs to be fixed before this item can be considered done.  So this item remains at #1.

Summary

Three items from the old list (#3, #4, #5) have been ticked off.  Two items remain (#1, #2), albeit with good progress having been made, and keep their positions on the list.  Three items have been added to the new list (#3, #4, #5).

Let me know if I’ve omitted anything important!

MemShrink progress, week 103–104

I’m back from vacation.  Many thanks to Andrew McCreight for writing two MemShrink reports (here and here) while I was away, and to Justin Lebar for hosting them on his blog.

Fixes

areweslimyet.com proved its value again, identifying a 40 MiB regression relating to layer animations.  Joe Drew fixed the problem quickly.  Thanks, Joe!

Ben Turner changed the GC heuristics for web workers on B2G so that less garbage is allowed to accumulate.  This was a MemShrink:P1.

Bas Schouten fixed a gfx top-crasher that was caused by inappropriate handling of an out-of-memory condition.  This was also a MemShrink:P1.

Randell Jesup fixed a leak in WebRTC code.

Changes to these progress reports

I’ve written over sixty MemShrink progress reports in the past two years.  When I started writing them, I had two major goals.  First, to counter the notion that nobody at Mozilla cares about memory consumption — this was a common comment on tech blogs two years ago.  Second, to keep interested people informed of MemShrink progress.

At this point, I think the first goal has been comprehensively addressed.  (More due to the  fact that we actually have improved Firefox’s memory consumption than because I wrote a bunch of blog posts!)  As a result, I feel like the value of these posts has diminished.  I’m also tired of writing them;  long-time readers may have noticed that they are much terser than they used to be.

Therefore, going forward, I’m only going to post one report every four weeks.  (The MemShrink team will still meet every two weeks, however.)  I’ll also be more selective about what I write about — I’ll focus on the big improvements, and I won’t list every little leak that has been fixed, though I might post links to Bugzilla searches that list those fixes. Finally, I won’t bother recording the MemShrink bug counts.  (In fact, I’ve started that today.)  At this point it’s pretty clear that the P1 list hovers between 15 and 20 most of the time, and the P2 and P3 lists grow endlessly.

Apologies to anyone disappointed by this, but rest easy that it will give me more time to work on actual improvements to Firefox :)