Categories
B2G Fennec Firefox Garbage Collection Memory consumption MemShrink pdf.js

MemShrink’s 3rd birthday

June 14, 2014 was the third anniversary of the first MemShrink meeting. MemShrink is a mature effort at this point, and many of the problems that motivated its creation have been fixed. Nonetheless, there are still some areas for improvement. So, as I did at this time last year, I’ll take the opportunity to update the “big ticket items” list.

The Old Big Ticket Items List

#5: pdf.js

pdf.js is the PDF viewer that now ships by default in Firefox. I greatly reduced its memory usage in two rounds, first in February, and again in June. The first round of improvements were released in Firefox 29, and the second round is due to be released in Firefox 33, which should be out in mid-October.

pdf.js will still use more memory than native PDF viewers, but the situation has improved enough that it can be removed from this list.

#4: Dev tools

See below.

#3: B2G Nuwa

Cervantes Yu, Thinker Li, and others succeeded in landing Nuwa, an impressive technical achievement that increased the amount of memory sharing between different processes on Firefox OS. This allows many more apps to run at once. I think this is present in Firefox OS 1.3 and later.

#2: Compacting Generational GC

See below.

#1: Better Foreground Tab Image Handling

Timothy Nikkel completely fixed the massive spikes in decoded image data that we used to get on image-heavy pages. This was a fantastic improvement that shipped in Firefox 26.

The New Big Ticket Items List

In a break from tradition, I will present the items in alphabetical order, rather than ranking them. This reflects the fact that I don’t have a clear opinion on the relevant importance of these different items. (I view this as a good thing, because it means that I feel there aren’t any hair-on-fire problems.)

Better regression detection

areweslimyet.com (a.k.a. AWSY) is our best tool for detecting memory usage regressions. It currently tracks Firefox and Firefox for Android, and there are plans to integrate Firefox OS.

Unfortunately, although AWSY has been successful at detecting large regressions, leading to them being fixed, its measurements are noisy enough that detecting smaller regressions is difficult. As a result, the general trend of the graphs has been upward. I do not think this is as bad as it looks at first glance, because Firefox’s behaviour in the worst cases — which matter more than the average case — is much better than it used to be. Nonetheless, improved sensitivity here would be an excellent thing, and Eric Rahm is actively working on this.

Developer tools

Developer-oriented memory profiling tools are under active development by Nick Fitzgerald, Jim Blandy, and others. A lot of the necessary profiling infrastructure is in place, and this seems to be progressing well, though I don’t know when it’s expected to be finished.

By the way, if you haven’t tried Firefox’s dev tools recently, you should! They have improved an incredible amount in the past year or so.

GC Arena Fragmentation

Generational GC landed a few months ago. I had been hoping that this would help reduce GC fragmentation, but the effect was small. It looks like compacting GC is what’s needed to make a big difference here, though some smaller tweaks may help things a little.

Tarako

Tarako is the codename for the “$25 phone” running Firefox OS that will ship in India and other countries later this year. It has a paltry 128 MiB of RAM, so memory usage is critical. Since the hardware first became available to Mozilla employees at the start of the year, Tarako has improved from “almost unusable” to “not bad”. But apps still close due to OOMs fairly frequently, especially when a user does something that involves two apps working together in some way.

There is no single change that will decisively fix this problem for Tarako;  further improvements will require an ongoing grind of work from many engineers. Improvements made for Tarako are also likely to benefit higher-end Firefox OS devices as well.

Windows OOM crashes

The number of out-of-memory (OOM) crashes that occur on Windows is relatively high. A sizeable fraction of these appear to be 32-bit virtual OOM crashes, caused by running out of address space.

Things that will mitigate this include modifications to jemalloc and the JS allocator, as well as Electrolysis. And some additional data and analysis may help us understand the problem better.

The ultimate solution, for users on 64-bit versions of Windows, is 64-bit Firefox builds for Windows, though it’ll be a while before those builds are in good enough shape to ship to regular users.

Summary

Three items from the old list (pdf.js, Nuwa, image handling) have been ticked off.  Two items remain (devtools, GC fragmentation), the latter in altered form, and both are a lot closer to completion than they were. Three new items (regression detection, Tarako, and Windows OOMs) have been added.

Let me know if I’ve omitted anything important!

Categories
about:memory Fennec Firefox Memory consumption MemShrink

MemShrink progress, week 93–94

After lots of activity in the previous month, the past two weeks have been fairly quiet for MemShrink.

AWSY

areweslimyet.com has been proving its worth.

Jonathan Kew reduced the amount of memory taken by fonts on Fennec at start-up, which was detected by AWSY/mobile.  Jonathan also reverted a change that AWSY detected as increasing Fennec memory consumption, and filed a follow-up to investigate further.

Joe Drew fixed a bad regression on AWSY relating to image decoding.  It’s not clear to me if this was a genuine regression that users would have seen, or if it was an artifact of the way AWSY does its measurements.  Either way, it’s good that it was fixed, and props to Joe for doing it so quickly.

Finally, we closed bug 833518, which was for an AWSY regression caused by the new DOM bindings.  This was previously improved by an Aurora-only hack, but enough cases have been translated to the new bindings that we’re naturally down almost to where we were.

Miscellaneous

The mobile team abandoned their goal of making Fennec work on phones with only 256 MiB of memory.  The rationale is that Android phones with only 256 MiB of RAM are uncommon, whereas low-end phones that meet the current minimum of 384 MiB are much more common.  The mobile team will of course continue to look for ways to improve memory consumption in order to make life for users with 384 MiB phones.

I modified the JS engine so that it doesn’t emit bytecode for asm.js functions in the normal case.  This reduced the memory consumption of the Unreal 3 demo used at GDC by about 100 MiB.  I also added a memory reporter for array buffers used by asm.js, which are often quite large and weren’t being measured on 64-bit platforms.

Alexandre Poirot fixed a leak relating to dev tools.

Randell Jesup fixed a small leak in WebRTC.

Help Needed

I’m working on adding a button to about:memory trigger the dumping of memory reporter data to file.  I have a patch awaiting review, but I’m getting a test failure on Windows.  The test saves gzipped memory reports to file, and then immediately loads that saved file (and uncompresses it) and checks the data looks as expected.  This works fine on Mac and Linux, but on Windows I’m sometimes getting incomplete data in the load step.  The file is quite short (just 253 bytes compressed, and 620 bytes uncompressed) and the truncation point varies between runs;  in the most severe occurrence only 9 bytes of uncompressed data were loaded, though the cut-off point seems to vary randomly.

I suspect there’s a file synchronization problem between the save and the load, even though gzclose() is called on the save file before the loading occurs.  If anyone has ideas about what the problem might be, I’d love to hear them.

Update: Nils Maier and an anonymous commenter pointed out the problem — I was using “r” instead of “rb” for the file mode.  On Windows, this causes mangling of EOL chars.

Also, we’re seeing some strange behaviour on Mac OS X where memory managed by jemalloc doesn’t appear to be released back to the OS as it should.  This is both alarming and hard to understand, which is not a good combination.

Good First Bugs

I have two easy bugs assigned to me that I probably won’t get around to for some time.  Both of them would be good first (or second, or third…) bugs.

  • Bug 857382 is about making about:memory handle memory report diffs more elegantly.
  • Bug 798914 is just a minor code clean-up.  Nothing too exciting, but first bugs often aren’t!

Please email or comment in one of the bugs if you are interested in helping.

Bug Counts

Here are the current bug counts.

  • P1: 15 (-0/+2)
  • P2: 138 (-4/+8)
  • P3: 129 (-2/+7)
  • Unprioritized: 1 (-3/+0)
Categories
about:memory B2G DMD Fennec Firefox Memory consumption MemShrink

MemShrink progress, week 79–82

I skipped the last MemShrink report due to Christmas, so we have four weeks’ worth of bug fixes today.

LEAKS

Joe Walker fixed a bad leak found by Jesse Ruderman:  if you closed a browser window with the developer toolbar open it would leak “everything”.  This was a MemShrink P1 bug.

Anton Kovalyov fixed a leak involving scratchpad.  This bug was also found by Jesse Ruderman.

Randell Jesup fixed some WebRTC leaks.

John Schoenick fixed a leak involving plugins.

Josh Aas fixed a leak in some networking code.

DMD

I wrote a more detailed blog post about DMD.  Here is the take-away message.

about:memory is MemShrink’s not-so-secret weapon when it comes to understanding Firefox’s memory consumption… and DMD is how we make about:memory better.

Lots of under-the-hood improvements have been made to DMD since I wrote that.  Users on Mac, Linux and B2G who aren’t afraid of doing their own builds should try it out.  Also, Ehsan Akhgari got it to build on Windows, though it’s not yet clear how well it works on that platform.  If anyone wants to try it out, please let me know how it goes.

Memory Reporters

Ben Turner made the workers memory reporter be able to handle workers that use ctypes.  This was important, especially on B2G, because each process can have one or two or more such workers — this is for Firefox chrome stuff, not web content — and we weren’t measuring them at all, and they can take multiple MiB each.

I fixed the orphan DOM node memory reporter.  The introduction of WebIDL had changed the layout of some paired JS/DOM objects, and such objects weren’t being reported.  (DMD discovered the unreported memory, and Boris Zbarsky helped me interpret what it meant.)  I see this accounting for multiple MiB of orphan nodes when using Gmail.

I added a memory reporter for the event listenener manager’s hash table.  It starts off small, but I’ve seen it go as high as 1.5 MiB after lots of browsing.  (DMD helped me identify this too.)

Kartikaya Gupta added a memory reporter for graphics textures on Android.

I added a memory reporter for any ctypes data that is hanging off JS objects.  I added it because one DMD profile on B2G showed non-trivial amounts of ctypes data, but that seems to have been a fluke and it rarely shows much memory now.  Oh well.

Miscellaneous

Andrew McCreight improved CC shutdown logging, which will make it easier to identify shutdown leaks.

Rail Aliiev and Kartikaya Gupta enabled a new NDK for Fennec builds on releng machines.  This might result in smaller binary sizes, which saves memory.

Bug Counts

Here are the current bug counts.

  • P1: 15 (-2/+0)
  • P2: 116 (-10/+0)
  • P3: 102 (-4/+0)
  • Unprioritized: 18 (-0/+18)

The number of unprioritized bugs is high because we didn’t have a MemShrink meeting this week.  This was because Justin Lebar and Kyle Huey are in Berlin for the B2G work week.  We’ll have our next meeting two weeks from today.

Categories
about:memory add-ons B2G Fennec Firefox Memory consumption MemShrink

MemShrink progress, week 73–74

B2G!  Fennec!  Social API!  They’re all happening, and memory consumption is a big deal for all of them.  Time for a MemShrink report.

B2G

Lots of B2G work is happening, unsurprisingly.  All of these changes have been backported to the Aurora channel.

Kyle Huey landed an important patch that merges system compartments together.  This avoids wasted space caused by having 100s of small compartments.  As Chris Jones said: “It’s a completely different phone with these patches.”  Unfortunately, the preference controlling this behaviour is currently turned off on B2G, because it’s causing some Marionette test failures.  Hopefully they’ll be dealt with soon.  Note that this preference won’t be turned on in desktop builds, and there’s a medium-term plan to avoid this wasted space in a less hacky fashion.

Marco Bonardo disabled Places in B2G, saving about 300 KiB in the main process.

Justin Lebar added code to trigger memory pressure events when lowmemkiller notifications occur.

Justin also tweaked things so that the garbage collector is more likely to run when screenshots are taken.

Jeff Muizelaar fixed a graphics bug that I don’t understand but apparently avoids allocating lots of memory on the B2G unlock screen.

Chris Jones ensured that remote content drop their buffers when they’re hidden.

I tweaked the size of the arena chunks used by XPT, which saved about 120 KiB per process on 64-bit builds, and a bit less on 32-bit builds.

I also shrunk the initial size of the SPS hash table, which saves 48 KiB per process on 64-bit builds and 24 KiB on 32-bit builds.

Memory Reporting

Lots of work happened on the memory reporting front.  This was inspired by  “heap-unclassified” typically being much higher on B2G than on desktop.  Desktop Firefox uses a single process and its memory consumption is usually measured in the 100s of MiBs or more.  In contrast, B2G uses one “main” process and then one process per running app, and the smaller ones are typically around 10 MiB.  As a result, the sundry small per-process things that don’t matter much for desktop loom larger on B2G.  (For the same reason, changes that reduce per-process memory by smallish amounts have outsize value on B2G.)  Most of the following changes have also been backported to the Aurora channel.

Nick Hurley added a “explicit/network/disk-cache” memory reporter.  It typically measures 100s of KiBs in desktop Firefox.  (It doesn’t get used on B2G.)

I added memory reporters “explicit/xpcom/component-manager” and “explicit/xpcom/category-manager”.  Together they measure about 280 KiB per process on 64-bit builds, and a bit less on 32-bit builds.

I added a memory reporter “explicit/script-namespace-manager”.  It measures just over 150 KiB per process on 64-bit builds, and a bit less on 32-bit builds.

I added a memory reporter “explicit/xpcom/effective-TLD-service”.  It measures about 128 KiB per process on 64-bit builds, and a bit less on 32-bit builds.

I added some detail to the JS object memory reporters.  This gives slightly more insight into where this memory is going, but the extra memory measured as a result is usually pretty small.

I fixed the “explicit/atom-tables” memory reporter, which was erroneously always reporting 0 bytes.  It now measures anywhere from a few 100 KiBs to several MiBs of memory.  This was a frustrating bug to find.  DMD exists to check that memory reporters are measuring memory properly, and the reporter was doing its measurements just fine.  But DMD cannot check that the measured amounts are added correctly, and that’s what was going wrong here — a line that should have been this:

return n;

instead was this:

return 0;

In other words, the code I had written was doing the difficult part correctly, but botched the trivial part.  How annoying.

Social API

Felipe Gomes fixed a document leak that occurred when the social sidebar was hidden, and disappeared only when it was unhidden.

Felipe also added code to clear the previous profile when Social is toggled off, which prevents a leak.

The social API still has some significant unresolved memory consumption issues, which are a concern.

Fennec

Kartikaya Gupta (a.k.a. Kats) turned on tab expiration for Fennec.  What this means is that Fennec will now unload the contents of background tabs in certain circumstances — when memory is low, and in some cases when a tab hasn’t been viewed for over an hour — and then reload them when they are brought back into the foreground.  Read Kats’ blog post for more details.

(This change was prompted by Project 256meg, which aims to make Fennec usable on phones with only 256 MiB of RAM.  (Fennec currently requires 512 MiB, according to the official specs.)  Kats has just got to the point where Fennec can actually start on 256 MiB devices.  Although Fennec only uses a single process, there is obviously significant overlap between this goal and B2G’s memory reduction goals.)

An obvious follow-up idea is to add tab expiration to desktop Firefox.  But the bug tracking that idea has seen more heated discussion.  The reason is that tab expiration might cause data loss in some cases.  This is less of a problem on Fennec because (as far as I can tell) there is less of an expectation that in-flight data will be saved on mobile devices.  For example, having processes killed due to memory constraints on mobile is much more common than on desktop.  Still, I expect plenty more arguing before this issue is resolved one way or the other.  One option is to allow it on desktop but have it disabled by default.

Add-ons

Michel Gutierrez fixed a zombie compartment in Video DownloadHelper, which is the 2nd most popular add-on at AMO.   Version 4.9.11 has the fix.

We hit a notable milestone on the add-on front this fortnight:  on November 1st hit bug for tracking known leaks in add-ons — filed 16 months ago — had zero blockers.  In other words, we reached the point where no add-ons were known to leak!  Unfortunately this didn’t last long, and at the time of writing the tracking bug has three blocking bugs.  Still, it’s confirmation that what used to be our biggest memory consumption problem is well under control.

Bug Counts

Here are the current bug counts.

  • P1: 20 (-3/+5)
  • P2: 110 (-10/+7)
  • P3: 102 (-3/+8)
  • Unprioritized: 4 (-2/+4)

As always, the current bug lists can be found by following the links on the MemShrink wiki page.

Categories
Fennec Firefox Garbage Collection Memory consumption MemShrink

MemShrink progress, week 19

Good News

Peter Van der Beken fixed a leak caused by chrome code that injects a function into pages.  This was a MemShrink P1 bug.  The commentary in the bug is confusing, but this may have been affecting numerous add-ons including Firebug. (Bug 669730 is open for tracking leaks in Firebug;  it hasn’t yet been confirmed whether this fix has helped Firebug.)

Andrew McCreight rewrote JS_DumpHeap so that it dumps the complete JS object graph.  This was a MemShrink P1 bug because it’s an important piece of infrastructure for writing leak detection tools.

Justin Lebar fixed the measurement of RSS on Mac.  This was a MemShrink P1 bug because it prevented us from enabling jemalloc on Mac 10.5 machines.

Brian Hackett tweaked the JS engine so that more method JIT code can be discarded periodically, particularly chrome code in system compartments.

Oleg Romashin avoided using transparent layers in Fennec remote offscreen viewports.  This saves 5MB of memory in the active tab.

I added a memory reporter for the startup cache.  This is often around 1MB of memory, but it’s allocated via mmap/VirtualAlloc and so doesn’t change the “heap-unclassified” number in about:memory.

Boris Zbarsky fixed a small leak involving CSS transforms.

Bad News

It appears a bad memory regression has occurred in the past week or so.  Several people have reported multi-second pauses caused by garbage collection and cycle collection.  The problem only seems to strike when many tabs are open and/or the browser has been running for multiple days.  This needs investigation;  if you are experiencing similar problems please report in the bug.  Reliable steps to reproduce this bug will be invaluable.  If you turn on javascript.options.mem.log in about:config you can see when GCs and CCs occur and how long they take, which is helpful for diagnosis.

Bug Counts

The current bug counts are as follows.

  • P1: 35 (-5, +3)
  • P2: 113 (-3, +8)
  • P3: 54 (-0, +1)
  • Unprioritized: 4 (-5, +3)

I want to point out this bug, which presents an idea to help hunt down reproducible leaks that occur when users have multiple add-ons present.  This is important because many of the leaks reported by users recently are due to add-ons, but often the reporter has many add-ons installed which makes finding the culprit painful.  The goal is to write a Firefox add-on that selectively disables installed add-ons, so that a user can bisect them to discover which add-on is responsible for the leak.  This bisecting process is something that people can do manually, but an add-on that automates the process would make things easier and less error-prone.  Mercurial’s ‘hg bisect’ command would serve as a useful comparison.  Ehsan Akhgari has volunteered to mentor anyone who would like to try to implement this.

 

Categories
Fennec Firefox Memory consumption MemShrink

MemShrink Progress, weeks 13–18

I’ve been on vacation, so this report covers (incompletely) the past six weeks’ worth of MemShrink improvements.

Big Things

Paul Biggar and Mike Hommey enabled jemalloc on Mac 10.6.  This will hopefully reduce fragmentation on that platform.  And it brings it in line with Windows, Linux and Android.  Now we just need it for Mac 10.5 and Mac 10.7.

Oleg Romashin found a way to drop some Thebes layers in inactive tabs in Fennec.  I won’t pretend to understand the details of this bug, but if I’ve understood correctly it can saves 12MB or more per background tab.

Jeff Muizelaar turned on decode-on-draw.  This means that if you open a new page in a background tab, none of its images will be decoded until you switch to that tab.  Previously any images would be decoded and then discarded after 10 to 20 seconds (unless you switched to the tab before the time-out occurred).  This change can save a lot of memory (and CPU time) for users browsing image-heavy sites.

Gian-Carlo Pascutto optimized the safe browsing database.  This hopefully has fixed our url-classifier bloat problems.  (I plan to verify this soon.)

Chris Leary and Jonathan “Duke” Leto made regexp compilation lazy.  This potentially saves 10s or even 100s of MBs of memory in some cases by not compiling some regexps, and also allowing regexps to be GC’d more quickly.  There were some possible performance regressions from this patch, it’s unclear from the bug what exactly the state of these are.

Justin Lebar converted some uses of nsTArray to nsAutoTArray (and also here).  These avoided some calls to malloc (in the latter case, around 3% of all malloc calls!) and may help reduce heap fragmentation a little.  Robert O’Callahan did a separate, similar nsAutoTArray change here.  Justin also avoided another 1% of total malloc calls in JSAutoEnterCompartment.

Chris Leary rewrote JSArena, which avoided some wasted memory, as well as replacing some hoary old C code with sleek modern C++.

SMALLER THINGS

A new contributor named Jiten (a.k.a. deLta30) fixed about:memory’s GC and CC buttons so they trigger collections in both the content and chrome process in Fennec.  Previously only the chrome process was affected.  (I’m not sure how this change will be affected by the decision to switch to the native chrome process in Fennec.)  Great work, Jiten!

I avoided some wasted space in the JS code generator, and some more in the parser.  Justin Lebar did something similar in nsTArray_base.

Jonathan Kew added a memory reporter for textruns and associated data.  Justin Lebar added the “history-links-hashtable” memory reporter.

Justin Lebar fixed some bogus “canvas-2d-pixel-bytes” values in about:memory.

Brian Bondy fixed a leak in Windows drag and drop code.

Tim Taubert wrote about finding leaks in browser-chrome mochitests.

Bug Counts

The current bug counts are as follows.  The differences are against the MemShrink week 12 counts.

  • P1: 37 (-3, +11)
  • P2: 108 (-9, +37)
  • P3: 53 (-1, +14)
  • Unprioritized: 6 (-21, +5)

They’re still going up.  The good news is that my gut feeling is that not many of these bugs are problems reported by users.  (And those that are often are caused by add-ons.)  Most of the new reports are ideas for improvements from developers.

Categories
Fennec

You can build and run Fennec on a desktop machine

Yesterday I learned that you can build and run Fennec on a desktop machine:

Fennec running on a Linux desktop

The build instructions are pretty straightforward, just ignore all the stuff about Maemo and Scratchbox, you don’t need them for desktop builds.  Everyone else probably already knows you can do this, but I’m writing this just in case some others don’t.

I’m a smartphone Luddite, so this is the first time I’ve actually tried Fennec myself.  It’s pretty cool!  Though I haven’t worked out how to quit yet.  Well, I can do it on the desktop by closing the window, but that’s not within Fennec itself and so presumably isn’t possible on a smartphone.