New Debugging API. We are going to give the JS engine a new debugging API, called the Debug object. The new API will provide a cleaner interface and better isolate debuggers from the programs they are debugging. This should make Firefox debugging tools more stable and easier to work on. The most exciting part is that the new API allows remote connections, so in the future we should be able to do things like debug a web page running on a mobile device from a debugger running on a laptop.
Jim Blandy designed the API last year, so now we just need to implement it. Jim and Jason Orendorff are starting that now.
Incremental and Generational GC. GC (garbage collection) pauses are probably the biggest practical performance issue right now in Firefox. (Note that there are other sources of pauses as well, such as the cycle collector and perhaps IO that happens on the main thread.)
The reason for the pauses is that SpiderMonkey uses an old-school stop-the-world mark-and-sweep collector. Briefly, it works like this:
- Based on some heuristics, the JS engine decides it is time to collect some garbage.
- The GC finds all the GC roots, which are the immediately accessible objects: JS local variables, the JS global object, JS objects stored on the C++ stack, and a few other things.
- The GC marks all objects that can be reached from the roots, by following all the pointers stored in the roots, then all the pointers stored in the objects reached from the roots, and so on.
- The GC sweeps over all allocated objects. If an object is not marked, there is no way for the program to access it, so it can never be used again, and the GC frees it.
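The steps above can be sketched as a toy collector in Python. This is purely an illustration of the algorithm, not SpiderMonkey's actual implementation; the `GCObject` class, heap list, and object names are all invented for the example.

```python
# Toy stop-the-world mark-and-sweep collector.
# Each heap object holds named fields that may point to other objects.

class GCObject:
    def __init__(self, name):
        self.name = name
        self.fields = {}      # outgoing edges to other GCObjects
        self.marked = False

heap = []

def alloc(name):
    obj = GCObject(name)
    heap.append(obj)
    return obj

def mark(obj):
    # Follow pointers transitively, marking every reachable object.
    if obj.marked:
        return
    obj.marked = True
    for child in obj.fields.values():
        mark(child)

def collect(roots):
    # Mark phase: everything reachable from the roots survives.
    for root in roots:
        mark(root)
    # Sweep phase: an unmarked object can never be used again, so free it.
    global heap
    live = [o for o in heap if o.marked]
    for o in live:
        o.marked = False      # reset marks for the next collection
    heap = live

# Example: 'a' is a root and points to 'b'; 'c' is unreachable garbage.
a, b, c = alloc("a"), alloc("b"), alloc("c")
a.fields["next"] = b
collect(roots=[a])
print(sorted(o.name for o in heap))  # ['a', 'b']
```

The pause problem described next falls directly out of the `collect` function: the whole mark phase runs to completion while the program is stopped.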
The main problem with stop-the-world mark-and-sweep GC is that if there are a lot of live objects, it can take a long time to mark all those objects. “A long time” typically means around 100 milliseconds, which is not long in absolute terms, but is enough to disrupt animation and make it noticeably jerky.
Our first step in fixing GC pauses will be incremental GC. Incremental GC means that instead of stopping the program to mark everything, the GC periodically pauses the program to do a little bit of marking, say 3 milliseconds' worth. There is an overhead to starting and stopping a mark phase, so the shorter the pause time, the slower the actual program runs. But we think we can make the pause time unnoticeable without having too much impact on throughput.
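The idea of budgeted mark slices can be sketched like this. Again a toy, with an invented worklist representation; a real incremental collector also needs write barriers so that pointer writes made by the program between slices cannot hide live objects from the marker.

```python
import time

# Toy incremental marker: instead of marking the whole object graph at
# once, process a worklist in small slices bounded by a time budget.

def mark_slice(worklist, marked, budget_ms=3.0):
    # Returns True when marking is finished, False if the budget ran out.
    deadline = time.monotonic() + budget_ms / 1000.0
    while worklist:
        if time.monotonic() >= deadline:
            return False          # pause is over; resume on the next slice
    obj = worklist.pop()
    # (indentation note: the two lines above belong inside the loop)

# corrected loop body, shown as the full function:
def mark_slice(worklist, marked, budget_ms=3.0):
    deadline = time.monotonic() + budget_ms / 1000.0
    while worklist:
        if time.monotonic() >= deadline:
            return False
        obj = worklist.pop()
        if id(obj) in marked:
            continue
        marked.add(id(obj))
        worklist.extend(obj.get("children", []))
    return True

# The program runs between slices; each slice costs at most ~budget_ms.
graph = {"children": [{"children": []}, {"children": []}]}
worklist, marked = [graph], set()
while not mark_slice(worklist, marked):
    pass                          # the mutator would run here between slices
print(len(marked))  # 3
```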
Note that the sweep phase can also take a long time, so we’ll need to do some work there as well, such as sweeping incrementally or concurrently on a different thread.
The longer-term goal is to move to generational GC. It’s more complicated than incremental GC, so I won’t go into details now, but the key benefits of generational GC are that (a) it is very fast at collecting short-lived objects (technically, it actually manages to collect them without looking at them or doing anything to them at all), and (b) it helps make creating objects faster.
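Benefit (a) comes from evacuating only the live objects out of a small nursery and then reclaiming the whole nursery at once, and benefit (b) comes from bump-pointer allocation. Both can be seen in a toy sketch; the `Nursery` class and the simplistic "live means referenced by a root" test are invented for illustration (a real generational GC traces reachability properly and keeps a remembered set for old-to-young pointers).

```python
# Toy nursery illustrating the two benefits of generational GC:
# allocation is a bump of an index, and a minor collection copies out
# only the live objects -- dead nursery objects are never even visited.

NURSERY_SIZE = 4

class Nursery:
    def __init__(self):
        self.slots = [None] * NURSERY_SIZE
        self.top = 0
        self.tenured = []          # survivors promoted to the old space

    def alloc(self, value, roots):
        if self.top == NURSERY_SIZE:
            self.minor_gc(roots)
        self.slots[self.top] = value   # bump allocation: one store, one add
        self.top += 1
        return value

    def minor_gc(self, roots):
        # Tenure (copy out) only objects the roots still reference;
        # everything else is reclaimed wholesale by resetting `top`.
        live = [v for v in self.slots[:self.top] if v in roots]
        self.tenured.extend(live)
        self.slots = [None] * NURSERY_SIZE
        self.top = 0

nursery = Nursery()
roots = {"kept"}
for v in ["a", "b", "kept", "c", "d"]:   # fifth alloc triggers a minor GC
    nursery.alloc(v, roots)
print(nursery.tenured, nursery.top)  # ['kept'] 1
```

Note how the dead objects "a", "b", and "c" are reclaimed without being touched individually: the minor collection only copies `"kept"` and resets the bump pointer.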
Bill McCloskey and Chris Leary are working on the new GCs. Gregor Wagner has also been working independently on specific improvements, like doing more sweeping off the main thread.
Type Inference. Brian Hackett started this work early last year as a research project, and has had the type inference algorithm working for a while now. He is now working with volunteer contributors to adapt the existing JägerMonkey compiler to use the results. As of today, they’re running the major benchmarks in the JS shell just a bit faster overall than trunk, and we expect that to improve. I tried an integer array microbenchmark the other day, and the TI branch was 40% faster than either TraceMonkey or Crankshaft, both of which are very good at that sort of thing.
IonMonkey. IonMonkey is the name of our next JIT compiler. Like Crankshaft, it will feature an SSA-based compiler IR (intermediate representation), which will facilitate advanced optimizations such as type specialization, function inlining, linear-scan register allocation, dead-code elimination, and loop-invariant code motion. (Advanced for untyped-language JITs, at least; Java compiler writers might not consider these advanced.)
This should mesh nicely with type inference, by making it much easier to implement the optimizations that type inference enables. For example, type inference is particularly valuable for inlining. One of the key benefits of inlining is that it allows optimization across call boundaries. But the existing JägerMonkey compiler doesn’t know how to combine functions, so the type inference branch compiles the inlined function separately and drops in the resulting machine code. IonMonkey will be able to splice the inlined function into the other function’s IR and then optimize both together.
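The payoff of inlining at the IR level, rather than splicing in separately compiled machine code, can be shown on a tiny invented IR. Nothing here is IonMonkey's actual representation; the point is just that once the callee's body is part of the caller's IR, a later pass (constant folding here) can see across the old call boundary.

```python
# Mini expression IR, invented for illustration:
#   ("const", n) | ("add", ir, ir) | ("arg", i) | ("call", name, [args])
functions = {"double": ("add", ("arg", 0), ("arg", 0))}

def substitute(ir, args):
    # Replace argument placeholders with the actual argument IR.
    if ir[0] == "arg":
        return args[ir[1]]
    if ir[0] == "add":
        return ("add", substitute(ir[1], args), substitute(ir[2], args))
    return ir

def inline(ir):
    # Replace each call with the callee's body, arguments substituted in.
    if ir[0] == "call":
        args = [inline(a) for a in ir[2]]
        return substitute(functions[ir[1]], args)
    if ir[0] == "add":
        return ("add", inline(ir[1]), inline(ir[2]))
    return ir

def fold(ir):
    # Constant folding: only possible here because inlining exposed the
    # constants on both sides of the former call boundary.
    if ir[0] == "add":
        a, b = fold(ir[1]), fold(ir[2])
        if a[0] == "const" and b[0] == "const":
            return ("const", a[1] + b[1])
        return ("add", a, b)
    return ir

program = ("call", "double", [("const", 21)])
print(fold(inline(program)))  # ('const', 42)
```

Compiling `double` separately and dropping in its machine code, as the type inference branch does today, would leave the addition opaque; folding the whole call down to a constant requires the caller and callee to share one IR.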
IonMonkey is currently in the design stage: David Anderson and I are studying the compiler literature and the competition, and doing experiments to find out just what features IonMonkey needs. Coding is about to start.