
MemShrink progress, week 91–92

Bill McCloskey landed zones, a.k.a. compartment groups.  This mitigates the overhead of each compartment by allowing all compartments within a zone to share arenas and strings.  Roughly speaking, each tab gets its own zone, and there’s a system zone for Firefox’s internal use.  This was a MemShrink:P1 bug.

The following graph from areweslimyet.com shows the impact of zones — about 5/6 of the way along the graph there’s a distinct drop, most noticeable in the dark blue line.  The light green line (settled start-up) showed a ~6 MiB drop, which is ~10%.  Note that the fraction of JS memory in areweslimyet.com is less than that in typical sites, so the drop in the higher lines is smaller than the improvements normal users should see.

areweslimyet.com graph showing improvements due to zones

Avi Halachmi fixed a problem where badly managed gradients could cause spikes of 10s of MiBs when tab animations occurred.  This was a MemShrink:P1 bug.  The fix has been backported to Aurora.

Jed Parsons fixed excessive memory consumption by Persona after visiting the B2G marketplace.  At least, I think that’s what happened;  I won’t pretend to genuinely understand what went on in that bug.  This was a MemShrink:P1 bug.

Fabrice Desré fixed a bad B2G leak relating to error messages.  This bug was fixed before it was triaged in a MemShrink meeting, but it probably would have been a MemShrink:P1 because it could cause the phone to OOM after a few hours.

I removed all uses of nsFixedSizeAllocator.  This was only a small memory consumption win (a few 10s of KiBs) but it cut several hundred lines of code, and removed another low-quality custom allocator (and attractive nuisance) from the codebase.

I added a “js-main-runtime-temporary-peak” memory reporter, which measures the peak size of the main JSRuntime’s “temporary” pool, which mostly holds parse nodes.  These are short-lived but they can be quite large — 10s of MiBs in normal browsing, and we’ve seen it exceed 1.5 GiB on large asm.js examples.  Relatedly, I reduced the size of parse nodes by 12.5% on 64-bit platforms, and 16.7% on 32-bit platforms.
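For scale, those percentages are consistent with trimming one pointer-sized word from each parse node. The before/after sizes below are back-calculated assumptions for illustration, not figures taken from the actual patch:

```python
# Assumed sizes (bytes): an 8-word node on 64-bit and a 6-word node on
# 32-bit, each losing one pointer-sized word. These are illustrative
# guesses, not the real ParseNode layout.
old64, new64 = 64, 56
old32, new32 = 24, 20
print(round((old64 - new64) / old64 * 100, 1))  # 12.5
print(round((old32 - new32) / old32 * 100, 1))  # 16.7
```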

Interesting Open Bugs

Sometimes, especially on B2G, we have excessive memory consumption due to JS strings.  It would be useful to be able to dump the contents of all the strings in memory, to look for ones that shouldn’t be there.  Bug 852010 has been filed for this.  It shouldn’t be too hard, as it would fit in with the existing JSON memory report dumping infrastructure.  This is currently blocking bug 850893, a B2G MemShrink:P1 bug.  Please comment in the bug or contact me if you want to get involved.

Bug 846616 and bug 850545 are both about high “heap-unclassified” values.  Reducing “heap-unclassified” is getting very difficult, because the memory is often allocated by third-party code, such as Cairo and graphics drivers, in ways we can’t measure.  I suppose in the Cairo case we could put effort into adding some memory profiling infrastructure and try to get it upstreamed, but the driver situation seems pretty hopeless.

Bug Status

Here are the current bug counts.

  • P1: 13 (-3/+4)
  • P2: 134 (-2/+8)
  • P3: 124 (-4/+6)
  • Unprioritized: 4 (-6/+4)

20 replies on “MemShrink progress, week 91–92”

Do the zones also help us to more easily look at a tab and say, “Here is its memory”?

If AWSY does not give a real-world example of JS, can we add some metrics/tests that do? It would be good to better see wins like the ones you described.

“…but the driver situation seems pretty hopeless.”
🙁
I’m wondering how bad the situation is, and on which platforms it is hurting us the most.

No change on understanding a tab’s memory. Just look for the “top(…)” entries.

A better AWSY would be great. The hard part is that most sites have lots of external Google/Facebook/Twitter/etc stuff. You’d need to capture all of that in a way that can be played back deterministically. It’s possible, but difficult.

I have a fair number of Bugzilla tabs pinned and use BugzillaJS, and the landing of zones caused my profile’s memory at startup to drop from ~650 to ~550 MiB 🙂

Yesterday I also noticed I had a very high heap-unclassified value in about:memory, of 55% (bringing total memory usage to ~1250 MiB) – that’s why I was looking into building DMD. However, after updating to the 2013-03-19 Nightly the problem hasn’t reappeared (it’s now sitting at 11%), so I’m not sure I’ll be able to reproduce it.

The heap-unclassified situation on Windows (32-bit) is pretty good. I rarely see it go above 30 MiB, which is currently 8%.
On Linux64, however, it is commonly at 80–120 MiB, or 25–30%. So can that be attributed to Cairo and drivers?

I don’t know why you’re getting high numbers on Linux. On my Linux box I usually see 8–12%, but my workload probably differs significantly from yours.

If you want to investigate and are willing to build Firefox yourself, please do a DMD run following the instructions at https://wiki.mozilla.org/Performance/MemShrink/DMD.

Yup, running with DMD, I see the top unreported allocations coming from the Intel driver. I’m sure there are many more smaller allocations; I stopped counting when the first non-Intel stack trace showed up.

I counted ~27% cumulative, about ~50M.
Out of `87.41 MB (38.23%) ── heap-unclassified`

40–50 MiB is pretty good! Glad to hear it.

Bug 678037 is not waiting on reviews; bhackett hasn’t quite finished it yet. (“WIP” is short for “work in progress”.) I don’t know what his timetable for that is.

Well, this actually happened after this blog post, but is there any news about the recent ~40 MiB regression on AWSY, which looks like it was caused by bug 716140?

Also, the mobile branch of AWSY hasn’t been updated since late February. Any news on that?

Grotty Windows-specific hacks don’t excite me much… 🙁

An easy enough way to dump memory for grepping is gcore (grepping the RAM directly would be faster, but one has to use gdb, or maybe the python-ptrace equivalent). Not as good as the annotated, built-in version, of course.
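A minimal self-inspection sketch along those lines, in Python (Linux-only; the marker string is made up, and inspecting one’s own process via /proc/self sidesteps the ptrace privileges that gdb/gcore would need on another process):

```python
# Sketch: grep a process's readable memory for a byte string, as a
# lightweight stand-in for gcore + grep. Here we inspect our own process
# via /proc/self/maps and /proc/self/mem.
needle = b"memshrink-marker-string"  # hypothetical string to hunt for
found = False
with open("/proc/self/maps") as maps, open("/proc/self/mem", "rb", 0) as mem:
    for line in maps:
        fields = line.split()
        addr, perms = fields[0], fields[1]
        if not perms.startswith("r"):
            continue  # skip unreadable regions
        start, end = (int(x, 16) for x in addr.split("-"))
        try:
            mem.seek(start)
            data = mem.read(end - start)
        except (OSError, OverflowError, ValueError):
            continue  # e.g. [vsyscall]/[vvar] can't be read this way
        if needle in data:
            found = True
            break
print(found)  # the needle bytes themselves live in a readable heap region
```

The real built-in version would of course annotate which JS strings it found rather than raw bytes, but this shows the dump-and-grep idea.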

MemShrink shrinks memory to make better use of the memory bus. Prefetch gives hints to the OS to make better use of the memory bus. In both cases this ultimately improves performance.

Comments are closed.