
MemShrink progress, week 85–86

Lots of news today.

Fixed Regressions

I wrote last time about a couple of bad regressions that AWSY identified.

The ongoing DOM bindings work will hopefully fully fix the second regression before the end of this development cycle (February 18).


AWSY

John Schoenick made three big improvements to AWSY.

  • It now measures every push to mozilla-inbound.  Previously it measured mozilla-central once per day.  This will make it easier and faster to identify patches responsible for regressions.
  • It’s now possible to trigger an AWSY run for any try build.  Unfortunately John hasn’t yet written instructions on how to do this;  I hope he will soon…
  • AWSY now measures Fennec as well.  Kartikaya Gupta created the benchmark that is used for this.  He also fixed a 4 MB regression that it identified.

Leaks Fixed

Benoit Jacob fixed a CC leak that he found with his refgraph tool.

Johnny Stenback fixed a leak involving SVG that he found with DMD.  This was a very slow leak that Johnny had seen repeatedly, which manifested as slowly increasing “heap-unclassified” values in about:memory over days or even weeks.  It’s a really nice case because it shows that DMD can be used on long-running sessions with minimal performance impact.

Justin Lebar fixed a B2G leak relating to forms.js.

Randall Jesup fixed a leak relating to WebRTC.

Andrew McCreight fixed a leak relating to HTMLButtonElement.

Erik Vold fixed a leak in the Restartless Restart add-on.


Memory Consumption Reductions

Brian Hackett optimized the representation of JS objects that feature both array (indexed) elements and named properties.  Previously, if an object had both elements and named properties, it would use a sparse representation that was very memory-inefficient if many array elements were present.  This performance fault had been known for a long time, and it caused bad memory blow-ups every once in a while, so it’s great to have it fixed.

As a follow-up, Brian also made it possible for objects that use the sparse representation to change back to the dense array representation if enough array elements are subsequently added.  This should also avoid some occasional blow-ups that occur when arrays get filled in in complex ways.
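To make the problem concrete, here is a hedged sketch (the object shape and sizes are made up for illustration) of the kind of JS value that used to trigger the sparse representation:

```javascript
// Hypothetical illustration: an array with many indexed elements plus a
// few named properties.  Before the fix, mixing the two could flip the
// whole object into a sparse, memory-hungry representation.
const samples = [];
for (let i = 0; i < 100000; i++) {
  samples[i] = i * 2;        // dense, indexed elements
}
samples.units = "ms";        // named properties on the same object
samples.label = "latency";

// With the follow-up fix, an object like this can also switch back to
// the dense representation if elements are added after the properties.
console.log(samples.length);   // 100000
console.log(samples.units);    // "ms"
```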

Gregory Szorc reduced the memory consumption of the new Firefox Health Report feature, from ~3 MB to ~1–1.5 MB: here and here and here and here. On a related note, Bill McCloskey is making good progress with reducing compartment overhead, which should be a sizeable win once it lands.

Gregory also reduced the memory consumption of Firefox Sync: here and here.

Jonathan Kew reduced the amount of memory used by textruns when Facebook Messenger is enabled.

The Add-on SDK is now present in mozilla-central, which is a big step towards getting all add-ons that use it to always use the latest version.  This is nice because it will mean that when memory leaks in the SDK are fixed (and there have been many) all add-ons that use it will automatically get the benefit, without having to be repacked.

Generational GC

Generational garbage collection is an ongoing JS engine project that should reap big wins once it’s completed.  I don’t normally write about things that haven’t been finished, but this is a big project and I’ve had various people asking about it recently, so here’s an update.

Generational GC is one of the JS team’s two major goals for the near term.  (The other is benchmark and/or game performance; I can’t remember which.)  You can see from the plan that there are eight people working on it (though not all of them are working on it all the time).

Brian Hackett implemented a static analysis that can determine which functions in the JS engine can trigger garbage collection.  On top of that, he then implemented a static analysis that can identify rooting hazards and unnecessary roots.  This may sound esoteric, but it has massively reduced the amount of work required to complete exact rooting, which is the key prerequisite for generational GC.  To give you an idea:  Terrence Cole estimated that it reduced the number of distinct code locations that need to be looked at and possibly modified from ~10,000 to ~200!  Great stuff.

Another good step was taken when I removed support for E4X from the JS engine.  E4X is an old JavaScript language extension that never gained wide support and was only implemented in Firefox.  The code implementing it was complicated, and an ongoing source of many bugs and security flaws.  The removal cut almost 13,000 lines of code and over 16,000 lines of tests.  It’s been destined for the chopping block for a long time, and its presence has been blocking generational GC, so all the JS team members are glad to see it go.

Bug Counts

Here are the current bug counts.

  • P1: 16 (-5/+0)
  • P2: 119 (-6/+0)
  • P3: 104 (-0/+0)
  • Unprioritized: 22 (-0/+18)

Three of the P1 “fixes” weren’t actual fixes, but cases where a bug was WONTFIXed, marked INVALID, or downgraded.  The unprioritized number is high because we skipped this week’s MemShrink meeting due to the DOM work week in London, which occupied three of our regular contributors.

20 replies on “MemShrink progress, week 85–86”

Wow, that’s a huge amount of updates! Great stuff, and thanks for all the links, starting to read them now.

Are there any plans to make the AWSY graphs look more like the ones on AWFY, where you get both a long term overview and a detailed short term view in the same graph? I had no idea that AWSY is so granular now until you mentioned it, and it would be nice to have some visual indication of the variance and evolution of more recent runs.

I wonder what P1 bugs could change so dramatically they get WONTFIXED… Could you list them (or tell me how I can find them myself on Bugzilla)? Also, is there an ETA known for GGC? I asked in #jsapi some time ago, and the answer was “somewhere mid 2013”, but I don’t know if that was counting in bhackett’s awesome work 🙂 It’d be awesome if GGC was only a few months away!

Here are the 5 closed P1s: one was WONTFIXed because we don’t see any way to make progress; one was marked INVALID; one was downgraded to P3 because we determined that it can only happen in artificial test cases; one was fixed, as mentioned above; and one was marked as a duplicate of another bug that had already been fixed.

Bugs are messy. Stuff like this happens frequently.

As for GenGC schedule… the wiki suggests 3–6 months, though it’s hard to tell how many of those different tasks will overlap. But that’s still just a rough estimate.

Will GGC be able to help with the first one by repacking the heaps as it moves stuff from one generation to the next?

No. jemalloc is the allocator used for the C/C++ heap, and that bug was about reducing fragmentation in that heap. Generational GC affects the JS heap. However, Generational GC will help reduce fragmentation in the JS heap.

Ok. From that I assume even after GGC is completed the “after all tabs closed” resident memory numbers will still be above the fresh-start ones; are there any estimates of how much of the gap will be closed?

First, with a generational GC, allocation happens in the nursery, which is a smallish (e.g. 1 MB) area that is collected every time it fills up. In practice, many objects die young — e.g. 50% — which means that only half as many objects make it to the main GC heap, which is where the (GC) fragmentation happens. So that’s a big win, and the one we’ll get immediately.
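As a back-of-envelope illustration of that reasoning (the total allocation figure is made up; the nursery size and survival rate are the example values above):

```javascript
// Rough arithmetic for the nursery effect described above.
const nurserySize    = 1 * 1024 * 1024;    // e.g. a 1 MB nursery
const totalAllocated = 100 * 1024 * 1024;  // assumed allocation over a session
const survivalRate   = 0.5;                // e.g. 50% of objects survive

// Only survivors of minor (nursery) collections are tenured into the
// main GC heap, which is where the fragmentation happens:
const tenuredBytes = totalAllocated * survivalRate;
console.log(tenuredBytes / (1024 * 1024)); // 50 (MB reaching the main heap)

// The nursery is collected each time it fills up:
const minorCollections = Math.ceil(totalAllocated / nurserySize);
console.log(minorCollections); // 100
```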

Second, in order to implement generational GC, we need a “moving” GC, i.e. one which can move objects. Once we support that, it then should be possible to do compaction during main heap collections, i.e. move objects around in the main heap in order to pack them together better. This isn’t part of the current gen-GC plan, but hopefully will be a follow-up; it’s a sizable additional change. That should get rid of a lot of the remaining GC heap fragmentation.

As for numbers, it depends massively on your workload. The relevant part of about:memory is near the bottom, and looks like this.

76,931,072 B (100.0%) — js-main-runtime-gc-heap-committed
├──52,515,456 B (68.26%) — used
│  ├──49,947,160 B (64.92%) ── gc-things
│  ├───1,490,944 B (01.94%) ── chunk-admin
│  └───1,077,352 B (01.40%) ── arena-admin
└──24,415,616 B (31.74%) — unused
   ├──24,415,616 B (31.74%) ── gc-things
   ├───────────0 B (00.00%) ── arenas
   └───────────0 B (00.00%) ── chunks

The “unused” area is the wasted space, and “gc-things” usually dominates. In theory, generational GC might halve that (it’ll also slightly reduce the chunk-admin and arena-admin numbers), and then compacting GC should get rid of most of what remains. In practice, however, who knows? There may be complications that I’m not thinking of, and these things also vary widely depending on your workload.
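Plugging the figures from the snapshot above into that reasoning (the halving is a hypothetical, not a measurement):

```javascript
// Figures from the about:memory snapshot above, in bytes.
const committed      = 76931072;  // js-main-runtime-gc-heap-committed
const used           = 52515456;
const unusedGcThings = 24415616;  // the wasted space

// Fraction of the committed GC heap that is wasted:
console.log((unusedGcThings / committed * 100).toFixed(2)); // "31.74"

// If generational GC halved the unused gc-things (a hypothetical):
const afterGenGC = used + unusedGcThings / 2;
console.log((afterGenGC / (1024 * 1024)).toFixed(1) + " MB"); // "61.7 MB"
```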

The key thing about generational GC is that it takes pressure off the main GC. Getting GC heuristics right is really tricky, and if you don’t it can lead to pauses and/or excessive memory consumption. Because generational GC collects a large percentage of your objects at very little cost, the effect of the main heap collection heuristics is substantially smaller, which means there’s greater room for error.

Thanks for that explanation, but more generally I was wondering about fragmentation in the JS heap vs the C/C++ heap, since the closing of 746009 implies that that portion of the fragmentation will prevent ever converging the before/after numbers fully.

The before/after numbers probably never will converge, because you can’t compact the C/C++ heap, because you can’t move C/C++ objects arbitrarily. That’s why we closed that bug.

Technically, what Benoit found turned out to not be a CC leak per se, because it involved off-main-thread objects that the CC can’t deal with. 😉

I have had to revert to ff13 for my netbooks. The GC and other improvements of later versions are quite invisible for most browsing, but the increased memory demand is as problematic as ever on memory-economical hardware.
It is likely that developers give away their old hardware and don’t buy economically specced devices, and so only appreciate the latest cutting-edge benchmark scores and features.

AWSY shows clearly: a significant regression since ff13. I hope this is considered a major concern, a hill to get over, and not excused as a necessary requirement for lightning JavaScript performance and capabilities. Like the JavaScript graphs, the ambition for the memory graphs should be to get them down as close to the origin as possible, or even to surpass expectations. (Not to ‘keep it in check’.)

Memory demand is *the* performance bottleneck on old and economical hardware. Prioritising it to anywhere close to the degree that has been done for lightning JavaScript will make ff fly on most of the world’s PCs (which are not new dev or gaming rigs).

Thanks for trying, and more power to your elbows!

Thanks for the info, Tim. A bit of a random appeal from me there, not seeking undue attention, just registering a slightly under-represented (or under-powered) position. I went to ff17 rather than 13 after all – can’t deny progress. ^^

Comments are closed.