Category Archives: DMD

DMD now works on Windows

DMD is our tool for improving Firefox’s memory reporting.  It helps identify where new memory reporters need to be added in order to reduce the “heap-unclassified” value in about:memory.

DMD has always worked well on Linux, and moderately well on Mac (it is crashy for some people).  And it works on Android and B2G.  But it has never worked on Windows.

So I’m happy to report that DMD now does work on Windows, thanks to the excellent efforts of Catalin Iacob.  If you’re on Windows and you’ve been seeing high “heap-unclassified” values, and you’re able to build Firefox yourself, please give DMD a try.

MemShrink progress, week 85–86

Lots of news today.

Fixed Regressions

I wrote last time about a couple of bad regressions that AWSY identified.

The ongoing DOM bindings work will hopefully fully fix the second regression before the end of this development cycle (February 18).

AWSY

John Schoenick made three big improvements to AWSY.

  • It now measures every push to mozilla-inbound.  Previously it measured mozilla-central once per day.  This will make it easier and faster to identify patches responsible for regressions.
  • It’s now possible to trigger an AWSY run for any try build.  Unfortunately John hasn’t yet written instructions on how to do this;  I hope he will soon…
  • AWSY now measures Fennec as well.  Kartikaya Gupta created the benchmark that is used for this.  He also fixed a 4 MB regression that it identified.

Leaks Fixed

Benoit Jacob fixed a CC leak that he found with his refgraph tool.

Johnny Stenback fixed a leak involving SVG that he found with DMD.  This was a very slow leak that Johnny had seen repeatedly, which manifested as slowly increasing “heap-unclassified” values in about:memory over days or even weeks.  It’s a really nice case because it shows that DMD can be used on long-running sessions with minimal performance impact.

Justin Lebar fixed a B2G leak relating to forms.js.

Randall Jesup fixed a leak relating to WebRTC.

Andrew McCreight fixed a leak relating to HTMLButtonElement.

Erik Vold fixed a leak in the Restartless Restart add-on.

Miscellaneous

Brian Hackett optimized the representation of JS objects that feature both array (indexed) elements and named properties.  Previously, if an object had both elements and named properties, it would use a sparse representation that was very memory-inefficient if many array elements were present.  This performance fault had been known for a long time, and it caused bad memory blow-ups every once in a while, so it’s great to have it fixed.

As a follow-up, Brian also made it possible for objects that use the sparse representation to change back to the dense array representation if enough array elements are subsequently added.  This should also avoid some occasional blow-ups that occur when arrays get filled in in complex ways.

Gregory Szorc reduced the memory consumption of the new Firefox Health Report feature, from ~3 MB to ~1–1.5 MB: here and here and here and here. On a related note, Bill McCloskey is making good progress with reducing compartment overhead, which should be a sizeable win once it lands.

Gregory also reduced the memory consumption of Firefox Sync:  https: here and here.

Jonathan Kew reduced the amount of memory used by textruns when Facebook Messenger is enabled.

The Add-on SDK is now present in mozilla-central, which is a big step towards getting all add-ons that use it to always use the latest version.  This is nice because it will mean that when memory leaks in the SDK are fixed (and there have been many) all add-ons that use it will automatically get the benefit, without having to be repacked.

Generational GC

Generational garbage collection is an ongoing JS engine project that should reap big wins once it’s completed.  I don’t normally write about things that haven’t been finished, but this is a big project and I’ve had various people asking about it recently, so here’s an update.

Generational GC is one the JS teams two major goals for the near-term.  (The other is benchmark and/or game performance, I can’t remember which.)  You can see from the plan that there are eight people working on it (though not all of them are working on it all the time).

Brian Hackett implemented a static analysis that can determine which functions in the JS engine can trigger garbage collection.  On top of that, he then implemented a static analysis that can identify rooting hazards and unnecessary roots.  This may sound esoteric, but it has massively reduced the amount of work required to complete exact rooting, which is the key prerequisite for generational GC.  To give you an idea:  Terrence Cole estimated that it reduced the number of distinct code locations that need to be looked at and possibly modified from ~10,000 to ~200!  Great stuff.

Another good step was taken when I removed support for E4X from the JS engine.  E4X is an old JavaScript language extension that never gained wide support and was only implemented in Firefox.  The code implementing it was complicated, and an ongoing source of many bugs and security flaws.  The removal cut almost 13,000 lines of code and over 16,000 lines of tests.  It’s been destined for the chopping block for a long time, and its presence has been blocking generational GC, so all the JS team members are glad to see it go.

Bug Counts

Here are the current bug counts.

  • P1: 16 (-5/+0)
  • P2: 119 (-6/+0)
  • P3: 104 (-0/+0)
  • Unprioritized: 22 (-0/+18)

Three of the P1 “fixes” weren’t actual fixes, but cases where a bug was WONTFIXed, or downgraded.  The unprioritized number is high because we skipped this week’s MemShrink meeting due to the DOM work week in London, which occupied three of our regular contributors.

MemShrink progress, week 79–82

I skipped the last MemShrink report due to Christmas, so we have four weeks’ worth of bug fixes today.

LEAKS

Joe Walker fixed a bad leak found by Jesse Ruderman:  if you closed a browser window with the developer toolbar open it would leak “everything”.  This was a MemShrink P1 bug.

Anton Kovalyov fixed a leak involving scratchpad.  This bug was also found by Jesse Ruderman.

Randell Jesup fixed some WebRTC leaks.

John Schoenick fixed a leak involving plugins.

Josh Aas fixed a leak in some networking code.

DMD

I wrote a more detailed blog post about DMD.  Here is the take-away message.

about:memory is MemShrink’s not-so-secret weapon when it comes to understanding Firefox’s memory consumption… and DMD is how we make about:memory better.

Lots of under-the-hood improvements have been made to DMD since I wrote that.  Users on Mac, Linux and B2G who aren’t afraid of doing their own builds should try it out.  Also, Ehsan Akhgari got it to build on Windows, though it’s not yet clear how well it works on that platform.  If anyone wants to try it out, please let me know how it goes.

Memory Reporters

Ben Turner made the workers memory reporter be able to handle workers that use ctypes.  This was important, especially on B2G, because each process can have one or two or more such workers — this is for Firefox chrome stuff, not web content — and we weren’t measuring them at all, and they can take multiple MiB each.

I fixed the orphan DOM node memory reporter.  The introduction of WebIDL had changed the layout of some paired JS/DOM objects, and such objects weren’t being reported.  (DMD discovered the unreported memory, and Boris Zbarsky helped me interpret what it meant.)  I see this accounting for multiple MiB of orphan nodes when using Gmail.

I added a memory reporter for the event listenener manager’s hash table.  It starts off small, but I’ve seen it go as high as 1.5 MiB after lots of browsing.  (DMD helped me identify this too.)

Kartikaya Gupta added a memory reporter for graphics textures on Android.

I added a memory reporter for any ctypes data that is hanging off JS objects.  I added it because one DMD profile on B2G showed non-trivial amounts of ctypes data, but that seems to have been a fluke and it rarely shows much memory now.  Oh well.

Miscellaneous

Andrew McCreight improved CC shutdown logging, which will make it easier to identify shutdown leaks.

Rail Aliiev and Kartikaya Gupta enabled a new NDK for Fennec builds on releng machines.  This might result in smaller binary sizes, which saves memory.

Bug Counts

Here are the current bug counts.

  • P1: 15 (-2/+0)
  • P2: 116 (-10/+0)
  • P3: 102 (-4/+0)
  • Unprioritized: 18 (-0/+18)

The number of unprioritized bugs is high because we didn’t have a MemShrink meeting this week.  This was because Justin Lebar and Kyle Huey are in Berlin for the B2G work week.  We’ll have our next meeting two weeks from today.

DMD

I recently landed a new version of DMD on mozilla-central.  DMD is a tool helps us understand and thus reduce Firefox’s memory consumption.  But in order to understand DMD, you first have to understand about:memory.

about:memory

The MemShrink project started about 18 months ago, and it has been very successful in reducing Firefox’s memory consumption.  about:memory is MemShrink’s not-so-secret weapon when it comes to understanding Firefox’s memory consumption.

about:memory screenshot

about:memory has some wonderful characteristics:  it provides literally thousands of measurements;  it’s available in ordinary release builds;  and it’s trivial to run (just type “about:memory” in the address bar).  This means that non-expert users can easily provide developers with detailed measurements if they are having problems.

However, about:memory has two shortcomings.  First, it simply visualizes the data provided by the memory reporter infrastructure.  The coverage provided by this infrastructure is good, but there are still some gaps.  These gaps manifest primarily in the “heap-unclassified” number in about:memory, which represents all the heap allocations that the memory reporters didn’t cover.  Unfortunately, about:memory can provide zero insight into what is within “heap-unclassified”.

Second, it’s hard to verify.  There’s no obvious way to tell if the numbers it gives are accurate (with a few exceptions;  e.g. negative numbers are obviously wrong).  We have established good practices for writing memory reporters (measure sizes, don’t compute then;  traverse data structures rather than maintaining size counters) that prevent most errors, but it’s still quite easy to accidentally count a block of memory twice.

This is where DMD comes in.  It helps with both the heap-unclassified and double-counting problems.

DMD version 1

I wrote the first version of DMD over a year ago.  “DMD” is short for “dark matter detector”, because “heap-unclassified” memory is sometimes jokingly called “dark matter”.

DMD works by intercepting all calls to malloc/free/etc.  This lets DMD track extra information about every heap block, such as where it was allocated.  Furthermore, DMD has hooks into the memory reporters (via functions created with the NS_MEMORY_REPORTER_MALLOC_SIZEOF_FUN macro, for those who are interested) so it knows when any heap block is measured by a memory reporter.

With that in place, DMD knows how many times each heap block has been reported.  So, after running the memory reporters, we can up each block into one of the following three groups.

  • Blocks that haven’t been reported indicate gaps in existing memory reporters.  They contribute to “heap-unclassified”.
  • Blocks that have been reported once are good.
  • Blocks that have been reported twice or more indicate defects in existing memory reporters.  They cause some measurements in about:memory to be too high, and “heap-unclassified” to be too low.

DMD presents its results in a sorted fashion that lists the unreported and twice-reported cases that are more important first.

This first version of DMD was implemented as a Valgrind tool, and what’s more, it required a special, patched build of Valgrind to run.  This meant it was difficult to set up, ran very slowly, and could only be used on Linux and Mac.  Despite these shortcomings, it has proved itself extremely useful, helping us get “heap-unclassified” down greatly (on my Linux desktop machine it’s typically between 8 and 12%) and identifying several cases of double-counting.

DMD Version 2

Some time after I wrote the first version of DMD, I realized that all it needed to work was the ability to intercept malloc/free/etc. and to obtain stack traces.  Valgrind was overkill for these purposes — it’s capable of supporting much more invasive tools — and, furthermore, there were existing tools (e.g. trace-malloc) in the Mozilla codebase that did exactly these things.  In other words, it would be possible to build a new version of DMD that integrated directly into the browser.

Work on this new version stalled for a while, because the old one was working well enough.  But then B2G’s Operation Slim Fast started, and it quickly became obvious that “heap-unclassified” on B2G was typically much higher than it was on desktop.  This is primarily because B2G has lots of small processes, and so various unmeasured things that weren’t a big deal on desktop became much more significant on B2G.  And DMD version 1 doesn’t work on B2G devices.

So that provided the impetus to complete the new version.  Mike Hommey finished his new replace-malloc infrastructure recently, which helped greatly, and the new version of DMD landed on mozilla-central last week.  The old version will soon be retired.

DMD now works on Linux, Android, Mac, B2G, and is just shy of working on Windows (where Ehsan Akhgari has been making gradual progress).  It’s also much faster, partly because it avoids the overhead of Valgrind, and partly because it now uses sampling.  The sampling causes blocks smaller than a certain size (4 KiB by default, though you can change that) to be sampled, while blocks larger than that are measured accurately;  this sacrifices a moderate amount of accuracy for a large amount of speed.

Try it

about:memory is wonderful, and DMD is how we make about:memory better.  There are now detailed instructions on how to build DMD, run it, and interpret its output.  It’s quite easy now, requiring only minor changes to the usual build and run steps, and several people have already done it successfully.  Please try it out, and help us further improve about:memory.

MemShrink progress, week 77–78

DMD

The big news this week is the landing of the native version of DMD.  The old version of DMD is a Valgrind-based tool that has been instrumental in reducing the size of about:memory’s “heap-unclassified”.  (For example, with trunk builds on my Linux desktop machine it’s now frequently less than 10%.)

However, the old version of DMD wasn’t easy to run.  For example, it required patching Valgrind’s source code and re-building it, among other things.  As a result, only a handful of people ever ran it.  Furthermore, because it’s a Valgrind tool it is very slow and will never run on Windows.

In contrast, the new version is much easier to use.  It just requires a slight configuration change at build time and a slight change in how the browser is invoked.

It’s also much faster, especially if you use the sampling mode which trades a small amount of precision for a great deal of speed.  Crucially, this means it is usable on B2G.  Also, it should be possible for people to use it for long-running sessions, which can be helpful for identifying slow leaks.

And while the new version doesn’t currently run on Windows, it should be fairly straightforward to get it working.  If anyone is interested in helping with this, please contact me, or take a look at the open bug.

I will write a more detailed post about the new version some time in the next few days, once I have updated the documentation and finalized a few more tweaks that should improve usability.  In the meantime, Justin Lebar is already using it in earnest to better understand B2G’s memory consumpion (e.g. see here, here, here, here, here, here, here).

Thanks to Mike Hommey for his excellent replace-malloc infrastructure, on which the new version is built, and to Justin for lots of help, especially with Android and B2G details.

B2G

The option that allows B2G to merge system compartments, previously implemented and landed by Kyle Huey, was enabled by default.  This is a big deal, because it’s the single biggest memory consumption improvement B2G has seen and is likely to see before version 1 is released.

Gabriel Svelto reduced the number of unused dirty pages kept around by jemalloc on B2G.  This can save ~4 MiB of memory across all processes, at a potential cost of making some allocations a little slower.  On a device as memory-constrained as the B2G phones, this is a good trade-off.

Social API

Felipe Gomes reduced the memory consumption of the social API when multiple browser windows are open.  This change has been backported to the beta channel, so it will be present in Firefox 18, in time for the bigger publicity push for this feature.  (For those of you that don’t like Facebook, please note that if you don’t enable the social API it will not affect Firefox’s performance in any way.)

Miscellaneous

I added a MEMORY_VSIZE telemetry reporters, which measures virtual memory consumption.  This might help us understand how many Windows users would benefit from 64-bit builds.

Bug counts

Here are the current bug counts.

  • P1: 17 (-3/+3)
  • P2: 126 (-2/+14)
  • P3: 106 (-4/+7)
  • Unprioritized: 0 (-2/+0)

MemShrink Progress, week 69–70

It’s been a bumper two weeks for MemShrink.

B2G

Gregor Wagner tweaked the JS garbage collection heuristics on B2G so that collections occur more frequently.  This prevents too much garbage from accruing, which is important on low-memory B2G devices.

Justin Lebar disabled sync on B2G, because it isn’t used.

Justin also implemented the ability to dump GC/CC logs when a particular signal is received, which is particularly useful for B2G.

B2G’s Operation Slim Fast is in full swing.  Please see the tracking bug for details about what is happening.  While there are a number of important fixes in the pipeline, memory consumption is still a critical issue for B2G simply because the devices don’t have much memory in the first place.  Any additional help from Mozilla developers will be greatly appreciated.  For example, if anyone can help identify unnecessary compartments, that would be helpful.

Memory Reporting

Boris Zbarsky improved the reporting coverage of style sheets.  We’ve seen this improved coverage account for over 1% of explicit on desktop and over 2% of explicit on B2G.

I fixed some double-counting of style sheets, and slightly improved the reporting coverage of style sheets and JS compartments.

Nathan Froyd added memory reporting for telemetry.

Justin Lebar fixed some over-reporting of the JS stack.  Justin also added to about:memory the ability to read JSON memory report data from the clipboard.

I renamed DMD as DMDV, in preparation for its in-progress, non-Valgrind replacement to take over the original name.

Testing With Valgrind

Gary Kwong has done some heroic work recently getting Valgrind tests passing on Tinderbox.  (Strictly speaking, the tests run Memcheck, which is the default Valgrind tool and is synonymous with Valgrind)  His blog post about it is well worth reading.

This is important for MemShrink because Memcheck can detect an important class of memory leaks — ones where all references to a block of memory are lost and so the block leaks forever.  But it’s also important for Firefox as a whole because Memcheck can detect a slew of other memory-related errors, such as writing unaddressable memory (e.g. buffer overruns) and using undefined values in dangerous ways.  These are the kinds of defects that cause crashes, security vulnerabilities, and other major problems.

The tests are currently hidden on TBPL because they only run on mozilla-central once a day, which is generally not enough for a test suite to be reliable enough to be worth unhiding.  Having said that, the ability to write suppressions for errors found with Memcheck complicates considerations;  see bug 800435 for further discussion.

Add-ons

Gregg Lind fixed things so that our Aurora/Beta users are certain to get an up-to-date, non-leaky version of the Test Pilot add-on.  (Version 1.1.2 was the leaky version;  version 1.2.0 or later has the fix.)  This was a longstanding MemShrink:P1 bug.

Erik Vold fixed a zombie compartment in Scriptish.  Version 0.1.8 has the fix.

Josep del Rio fixed a zombie compartment in Speed Dial.  Version 0.9.6.10 has the fix.

Miscellaneous

Boris Zbarsky fixed a bad zombie compartment leak relating to expandos.  This was a recent regression, and the fix is present on Aurora and Beta.  The problem was first reported in the comments of my MemShrink report from four weeks ago — many thanks to Alex for reporting it and preventing it from reaching an official Firefox release!

Bug Counts

Here are the current bug counts.

  • P1: 19 (-2/+5)
  • P2: 101 (-2/+6)
  • P3: 91 (-6/+0)
  • Unprioritized: 13 (-1/+11)

The unprioritized count is high because we spent the first part of today’s MemShrink meeting talking with Gavin Sharp, Jared Wein and Felipe Gomes about the upcoming social API, and then spent some additional time discussing B2G with David Clarke.  As a result we didn’t have time to triage the 23 new MemShrink bugs.  We’ve scheduled an out-of-band MemShrink meeting for next week to get on top of the bug list.

Visiting Mountain View

I’ll be in Mountain View all next week for the JS team work week.  I’ll be happy to meet up with anyone who wants to talk about memory consumption!

Notes on Reducing Firefox’s Memory Consumption

I gave a talk yesterday at the linux.conf.au Browser MiniConf, held in Ballarat, Australia.  Its title was “Notes On Reducing Firefox’s Memory Consumption”.

Below are the slides and notes in a SlideShare embedding. If you find that embedding problematic (some people do) you may prefer to download the PDF version directly.

Reducing about:memory’s “heap-unclassified” measurement with DMD

about:memory is a really useful tool, but everyone always complains about the “heap-unclassified” number being too high.  It’s too high because we don’t have enough memory reporters implemented.

DMD is a tool that identifies where new memory reporters should be added to reduce “heap-unclassified”.  I’ve been using it for a couple of months now, and I’ve just written instructions on how to use it.  It’s not easy, but hopefully it’s doable.

It’s probably worth pointing out that we have a pretty good handle on what needs to be done to reduce “heap-unclassified” significantly — check out the list of memory reporters to be implemented.  Almost all of those bugs were filed based on data from DMD.  The single most important bug in that list is the one to add missing style reporters;  I see multiple megabytes of CSS stuff all the time in DMD’s output.