I’ve been looking at bug 669730 where enabling Firebug on a page (http://nytimes.com/ to be precise) results in the page’s compartment living forever. This is easy to see, now that we have the incredibly useful about:memory and its per-compartment breakdown. (What’s a compartment? It’s a memory space to keep related garbage collectable objects in. See the compartment paper, or for some more detail about how they are used in Firefox, try my contexts vs compartments writeup, though it’s more about contexts than compartments.)
I managed to find the object keeping the compartment alive, and I thought I should document what I did to either help other people hunt down zombie compartments, or beg for better tools, or both. (Note that I haven’t actually fixed the bug, so this is a little premature, but the hunting process is far more likely to be reusable and of general interest than the specifics of this leak.) Oh, and I didn’t actually figure everything out; I just kind of stumbled across the right answer.
I have a zombie compartment, so something in the compartment isn’t getting collected. That means there’s at least one GCthing alive in the compartment that shouldn’t be. The “inner” objects aren’t of much interest, so what really matters is that there’s at least one unwanted root. I want to figure out what that root is.
Things I know of that can be roots are pointers gathered by the conservative stack scanner, cross-compartment wrappers, and explicitly added GC roots.
The conservative roots are unlikely to matter here, because this leak survives returning to the event loop. The cross-compartment wrappers are what I initially suspected. I think that whenever XPCOM points to a JS object, or at least a JS object in a content compartment as this one is, it goes through a cross-compartment wrapper and the wrapped object is considered to be a GC root. So I want to see the objects rooted by cross-compartment wrappers.
I guess the wrappers I care about will actually live in a different compartment and point into the nytimes.com compartment. But it doesn’t seem to matter in this case, because the only function I could see to set a breakpoint in is JSCompartment::markCrossCompartmentWrappers() and in my test runs, it never seemed to hit. It looks like maybe it only gets called when doing a compartmental GC, and we probably don’t do many of those on a zombie compartment. Still, how do those roots get marked? I still don’t know, because while wandering around the code trying to figure out what was going on, I stumbled across the right place to watch for the third set of roots — explicitly added roots — and I took a detour to check those out. (Thinking about it, it’s not the cross-compartment wrappers that are the roots, it’s the objects they point to. Maybe those end up in the explicitly-added roots list? Dunno.)
Specifically, MarkRuntime() in jsgc.cpp iterates through a runtime’s gcRootsHash and calls gc_root_traversal on each one. That grabs out the pointer value and name (yay!) of each root and scans it. So all I needed to do was check each of these roots to see which compartment it’s in, and stop when the compartment is the one I care about. Fortunately, gc_root_traversal calls MarkIfGCThingWord and it already computes the compartment. (It’s just a bit of bit masking and pointer chasing to do manually, so it’s not a big deal anyway.)
Conditional breakpoints are great and everything, but from the name of the function it sounded like it might get called a lot, so I just crammed my own debug code into the routine:
static JSCompartment *interesting_compartment = NULL; if (aheader->compartment == interesting_compartment) printf("root: %p kind %u\n", (void*)addr, thingKind);
Then I reran under gdb. I still needed the address of the compartment (unfortunately, about:memory only shows compartment pointers for chrome compartments). So I looked for something looping over compartments, found several, and set a breakpoint in one of them. I’m not sure which one. They all loop over rt->compartments. For each compartment, I displayed the principals->codebasevalue:
(gdb) display (*c)->principals ? *(*c)->principals : 0
Then I ‘n’exted through until I found the nytimes.com one. With that pointer in hand, I set a breakpoint on my added code, above, and set ‘interesting_compartment’ to my magic pointer value. This printed out the address of the root in question, together with its ‘thingKind‘ which was zero. A quick look at jsgc.h showed me that zero means FINALIZE_OBJECT0, and the code just after my printf showed that I can cast that to JSObject*. A call to js_DumpObject((JSObject*)0x7fffd0a99d78) told me that this was an ‘Error’ object. Even better, when I walked up the stack one level, I could see that the root was labeled “JSDValue”.
So JSD is hanging onto a content Error object, probably one that it grabbed from hooking exception throwing or catching. Is it JSD not discarding something when you turn it off, or Firebug holding onto the object itself? I don’t know yet.
- We need an easy way to get the pointer value of all compartments. Maybe in the ?verbose=1 output of about:memory?
- Enumerating roots is handy. We should expose a function to dump out the roots given a compartment, so we could do this whole analysis via the chrome-privileged Web Console.
- That means we need a way to refer to compartments from JS. Perhaps a weak map from principals’ codebases to JSCompartment* objects?
Related: see Jim Blandy’s bug 672736 for adding a findReferences JS call that gives all of the incoming edges to a JS object. I originally misinterpreted that to mean displaying the full path from a GC root to an object, and I started out trying to use findReferences by grabbing any object in the zombie compartment and calling findReferences on it. But I stopped when I realized that, knowing as little about the memory layout as I do, it was probably easier for me to find the roots themselves than figuring out how to look into the chunks/arenas/arena pools/whatever for the compartment to grab out random GCthings. And all I wanted was the root, so findReferences wouldn’t be of interest unless it crossed the XPConnect boundary and told me what was keeping the JS object alive via some sort of wrapper.
Now please, someone comment and tell me how I could have done this much more easily…