Know Your Engines at O’Reilly Velocity 2011
GC pauses are a major concern for interactive apps and games, so I had a section on how GCs work, what’s coming from engine developers to control pauses, and what can be done for now. I had to go kind of quickly through that section, so I want to add a few more notes here:
Bill McCloskey has a prototype of incremental GC for Firefox. It can slice the mark phase up finely enough to keep pause times below 1 ms. It doesn’t directly address sweeping, but we do most sweeping on a background thread, anyway (and sweeping can also be interleaved with allocations in order to keep pauses down). Currently the write barrier slows things down too much for production, so he’s still checking that out.
Recommendations for avoiding GC pauses depend on what kind of GC the browser has. For now, if you want to reduce pauses on all browsers, you have to assume a simple stop-the-world mark-sweep collector, in which case there are two key things to know. First, the length of pauses is roughly proportional to the total size of the reachable heap objects (which include JS objects, arrays, strings, and functions), because the mark phase has to traverse them all. Second, the frequency of pauses depends on the allocation rate, that is, the number of bytes allocated per second, because using up memory is what triggers the GC to run.
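To make the second point concrete, here's a sketch of the same computation written two ways. The function names and the `scratch` pattern are my own illustration, not from the talk: the first version allocates a fresh object on every call (raising the allocation rate, and so the GC frequency), while the second writes into a preallocated object so the steady-state allocation rate stays near zero.

```javascript
// Allocation-heavy version: creates a new object per call.
function midpointAlloc(a, b) {
  return { x: (a.x + b.x) / 2, y: (a.y + b.y) / 2 };
}

// Allocation-free version: reuses a caller-supplied scratch object,
// so calling it in a hot loop produces no garbage.
function midpointInto(a, b, out) {
  out.x = (a.x + b.x) / 2;
  out.y = (a.y + b.y) / 2;
  return out;
}

var scratch = { x: 0, y: 0 };  // allocated once, up front
var a = { x: 0, y: 0 };
var b = { x: 4, y: 2 };

var p = midpointInto(a, b, scratch);  // p is scratch: { x: 2, y: 1 }
```

The reuse pattern is ugly compared to returning fresh objects, but in a per-frame game loop it can be the difference between frequent collections and almost none.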
A simple model of a program running under such a collector is that it starts up, allocates a bunch of objects, and then settles down into a steady state. In the steady state, there is a working set of W bytes that stays allocated, and a churn rate R (bytes/sec): every second, R bytes of temporary objects are allocated (e.g., short-term closures, strings being concatenated into results). R determines how often pauses occur, and W determines how long they are. So, in practice, you could avoid noticeable pauses by having either W ≈ 0 (pauses are too short to notice) or R ≈ 0 (pauses happen only during startup, not in the steady state).
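One common way to drive R toward 0 is to preallocate everything during startup and recycle it, so the working set W is fixed and nothing new becomes garbage. Here's a minimal object-pool sketch; the `Pool` type and its `acquire`/`release` methods are my own illustrative names, not an API from the talk.

```javascript
// Preallocate `size` objects at startup; steady-state use allocates nothing.
function Pool(create, size) {
  this.free = [];
  for (var i = 0; i < size; i++) this.free.push(create());
}
Pool.prototype.acquire = function () {
  // Return a recycled object, or null if the pool is exhausted
  // (a real pool might grow here instead).
  return this.free.length ? this.free.pop() : null;
};
Pool.prototype.release = function (obj) {
  this.free.push(obj);  // recycle instead of letting it become garbage
};

var particles = new Pool(function () { return { x: 0, y: 0, live: false }; }, 100);

var part = particles.acquire();  // no allocation in the steady state
part.live = true;
// ... use part during the frame ...
part.live = false;
particles.release(part);
```

The cost is that you give up the convenience of letting the GC manage these objects' lifetimes: forgetting a `release` leaks a slot from the pool rather than producing garbage.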
I finished the talk with some gritty practical bits I discovered by running a whole bunch of microbenchmarks, based on educated guesses about what would turn up performance faults. Probably the most important is that eval, with, and throwing exceptions are not only generally slow operations themselves, but they can deoptimize nearby code and cause much greater slowdowns than you might expect. Another big one is that dense arrays are much faster than sparse arrays; see the slides for details.
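As a rough illustration of the dense-vs-sparse point (the representation details are engine-specific, and this sketch is mine, not from the slides): engines typically store hole-free arrays in a fast, contiguous form, and can fall back to a slower dictionary-like form once an array becomes sparse.

```javascript
// Dense: indices 0..n-1 filled in order, no holes.
// Engines can keep this in a fast contiguous representation.
var dense = [];
for (var i = 0; i < 1000; i++) dense[i] = i;

// Sparse: writing far past the end creates holes, which can force a
// slower dictionary-style representation. Avoid this in hot code.
var sparse = [];
sparse[0] = 0;
sparse[999999] = 1;  // indices 1..999998 are holes, not undefined values
```

If you know the final size in advance, filling indices in order from 0 (or using push) keeps the array dense.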
Thanks to everyone who tweeted praise for/interest in my talk (that I found in my mentions at least): piscisaureus, kamidphish, colinbdclark, rednaxelafx, djco, demisbellot, porteneuve, brendankenny, paulbiggar, slicknet, alex_gaynor, devongovett, juanpin, derek, SkipSauls, and of course Brendan. I truly appreciate your letting me know you enjoyed the talk (or expect to enjoy it!). And thanks again to Steve Souders and the Velocity conference chairs for hosting me and making it all possible!