One thing I’ve learnt while working for Mozilla is that a web browser can be characterized as a JavaScript execution environment that happens to have some multimedia capabilities. In particular, if you look at Firefox’s about:memory page, the JS engine is very often the component responsible for consuming the most memory.
Consider the following snapshot from about:memory of the memory used by a single JavaScript compartment.
(For those of you who have looked at about:memory before, some of those entries may look unfamiliar, because I landed a patch to refine the JS memory reporters late last week.)
There is work underway to reduce many of the entries in that snapshot. SpiderMonkey is on a diet.
Objects
Objects are the primary data structure used in JS programs; after all, it is an object-oriented language. Inside SpiderMonkey, each object is represented by a JSObject, which holds basic information, and possibly a slots array, which holds the object’s properties. The memory consumption for all JSObjects is measured by the “gc-heap/objects/non-function” and “gc-heap/objects/function” entries in about:memory, and the slots arrays are measured by the “object-slots” entries.
The size of a non-function JSObject is currently 40 bytes on 32-bit platforms and 72 bytes on 64-bit platforms. Brian Hackett is working to reduce that to 16 bytes and 32 bytes respectively. Function JSObjects are a little larger, being (internally) a sub-class of JSObject called JSFunction. JSFunctions will therefore benefit from the shrinking of JSObject, and Brian is slimming down the function-specific parts as well. In fact, these changes are complete in the JaegerMonkey repository, and will likely be merged into mozilla-central early in the Firefox 11 development period.
As for the slots arrays, they are currently arrays of “fatvals” A fatval is a 64-bit internal representation that can hold any JS value — number, object, string, whatever. (See here for details, scroll down to “Mozilla’s New JavaScript Value Representation”; the original blog entry is apparently no longer available). 64-bits per entry is overkill if you know, for example, that you have an array full entirely of integers that could fit into 32 bits. Luke Wagner and Brian Hackett have been discussing a specialized representation to take advantage of such cases. Variations on this idea have been tried twice before and failed, but perhaps SpiderMonkey’s new type inference support will provide the right infrastructure for it to happen.
Shapes
There are a number of data structures within SpiderMonkey dedicated to making object property accesses fast. The most important of these are Shapes. Each Shape corresponds to a particular property that is present in one or more JS objects. Furthermore, Shapes are linked into linear sequences called “shape lineages”, which describe object layouts. Some shape lineages are shared and live in “property trees”. Other shape lineages are unshared and belong to a single JS object; these are “in dictionary mode”.
The “shapes/tree” and “shapes/dict” entries in about:memory measure the memory consumption for all Shapes. Shapes of both kinds are the same size; currently they are 40 bytes on 32-bit platforms and 64 bytes on 64-bit platforms. But Brian Hackett has also been taking a hatchet to Shape, reducing them to 24 bytes and 40 bytes respectively. This has required the creation of a new auxiliary BaseShape type, but there should be many fewer BaseShapes than there are Shapes. This change will also increase the number of Shapes, but should result in a space saving overall.
SpiderMonkey often has to search shape lineages, and for lineages that are hot it creates an auxiliary hash table, called a “property table”, that makes lookups faster. The “shapes-extra/tree-tables” and “shapes-extra/dict-tables” entries in about:memory measure these tables. Last Friday I landed a patch that avoids building these tables if they only have a few items in them; in that case a linear search is just as good. This reduced the amount of memory consumed by property tables by about 20%.
I mentioned that many Shapes are in property trees. These are N-ary trees, but most Shapes in them have zero or one child; only a small fraction have more than that, but the maximum N can be hundreds or even thousands. So there’s a long-standing space optimization where each shape contains (via a union) a single Shape pointer which is used if it has zero or one child. But if the number of children increases to 2 or more, this is changed into a pointer to a hash table, which contains pointers to the N children. Until recently, if a Shape had a child deleted and that reduced the number of children from 2 to 1, it wouldn’t be converted from the hash form back to the single-pointer. I changed this last Friday. I also reduced the minimum size of these hash tables from 16 to 4, which saves a lot of space because most of them only have 2 or 3 entries. These two changes together reduced the size of the “shapes-extra/tree-shape-kids” entry in about:memory by roughly 30–50%.
Scripts
Internally, a JSScript represents (more or less) the code of a JS function, including things like the internal bytecode that SpiderMonkey generates for it. The memory used by JSScripts is measured by the “gc-heap/scripts” and “script-data” entries in about:memory.
Luke Wagner did some measurements recently that showed that most (70–80%) JSScripts created in the browser are never run. In hindsight, this isn’t so surprising — many websites load libraries like jQuery but only use a fraction of the functions in those libraries. It wouldn’t be easy, but if SpiderMonkey could be changed to generate bytecode for scripts lazily, it could reduce “script-data” memory usage by 60–70%, as well as shaving non-trivial amounts of time when rendering pages.
Trace JIT
TraceMonkey is SpiderMonkey’s original JIT compiler, which was introduced in Firefox 3.5. Its memory consumption is measured by the “tjit-*” entries in about:memory.
With the improvements that type inference made to JaegerMonkey, TraceMonkey simply isn’t needed any more. Furthermore, it’s a big hairball that few if any JS team members will be sad to say goodbye to. (js/src/jstracer.cpp alone is over 17,000 lines and over half a megabyte of code!)
TraceMonkey was turned off for web content JS code when type inference landed. And then it was turned off for chrome code. And now it is not even built by default. (The about:memory snapshot above was from a build just before it was turned off.) And it will be removed entirely early in the Firefox 11 development period.
As well as saving memory for trace JIT code and data (including the wasteful ballast hack required to avoid OOM crashes in Nanojit, ugh), removing all that code will significantly shrink the size of Firefox’s code. David Anderson told me the binary of the standalone JS shell is about 0.5MB smaller with the trace JIT removed.
Method JIT
JaegerMonkey is SpiderMonkey’s second JIT compiler, which was introduced in Firefox 4.0. Its memory consumption is measured by the “mjit-code/*” and “mjit-data” entries in about:memory.
JaegerMonkey generates a lot of code. This situation will hopefully improve with the introduction of IonMonkey, which is SpiderMonkey’s third JIT compiler. IonMonkey is still in early development and won’t be integrated for some time, but it should generate code that is not only much faster, but much smaller.
GC HEAP
There is a great deal of work being done on the JS garbage collector, by Bill McCloskey, Chris Leary, Terrence Cole, and others. I’ll just point out two long-term goals that should reduce memory consumption significantly.
First, the JS heap currently has a great deal of wasted space due to fragmentation, i.e. intermingling of used and unused memory. Once moving GC — i.e. the ability to move things on the heap — is implemented, it will pave the way for a compacting GC, which is one that can move live things that are intermingled with unused memory into contiguous chunks of memory. This is a challenging goal, especially given Firefox’s high level of interaction between JS and C++ code (because moving C++ objects is not feasible), but one that could result in very large savings, greatly reducing the “gc-heap/arena/unused” and “gc-heap-chunk-*-unused” measurements in about:memory.
Second, a moving GC is a prerequisite for a generational GC, which allocates new things in a small chunk of memory called a “nursery”. The nursery is garbage-collected frequently (this is cheap because it’s small), and objects in the nursery that survive a collection are promoted to a “tenured generation”. Generational GC is a win because in practice the majority of things allocated die quickly and are not promoted to the tenured generation. This means the heap will grow more slowly.
Is that all?
It’s all I can think of right now. If I’ve missed anything, please add details in the comments.
There’s an incredible amount of work being done on SpiderMonkey at the moment, and a lot of it will help reduce Firefox’s memory consumption. I can’t wait to see what SpiderMonkey looks like in 6 months!