The Baseline Compiler Has Landed

Kannan Vijayan

This Wednesday we landed the baseline compiler on Firefox Nightly. After six months of work from start to finish, we are finally able to merge the fruits of our toils into the main release stream.

What Is The Baseline Compiler?

Baseline (no, there is no *Monkey codename for this one) is IonMonkey’s new warm-up compiler. It brings performance improvements in the short term, and opportunities for new performance improvements in the long term. It opens the door for discarding JaegerMonkey, which will enable us to make other changes that greatly reduce the memory usage of SpiderMonkey. It makes it easier and faster to implement first-tier optimizations for new language features, and easier to enhance those into higher-tier optimizations in IonMonkey.

Our scores on the Kraken, SunSpider, and Octane benchmarks have improved by 5-10% on landing, and will continue to improve as we leverage Baseline to make SpiderMonkey better. See the AreWeFastYet website. You can select regions on the graph (by clicking and dragging) to zoom in on them.

Another JIT? Why Another JIT?

Until now, Firefox has used two JITs: JaegerMonkey and IonMonkey. Jaeger is a general purpose JIT that is “pretty fast”, and Ion is a powerful optimizing JIT that’s “really fast”. Initially, hot code gets compiled with Jaeger, and then if it gets really hot, recompiled with Ion. This strategy lets us gradually optimize code, so that the really heavyweight compilation is used for the really hot code. However, the success of this strategy depends on striking a good balance between the time spent compiling at different tiers of compilation, and the actual performance improvements delivered at each tier.

To make a long story short, we’re currently using JaegerMonkey as a stopgap baseline compiler for IonMonkey, and it was not designed for that job. Ion needs a baseline compiler designed with Ion in mind, and that’s what Baseline is.

The fuller explanation, as always, is more nuanced. I’ll go over that in three sections: the way it works in the current release, why that’s a problem, and how Baseline helps fix it.

The Current Reality

In a nutshell, here’s how current release Firefox approaches JIT compilation:

  1. All JavaScript functions start out executing in the interpreter. The interpreter is really slow, but it collects type information for use by the JITs.
  2. When a function gets somewhat hot, it gets compiled with JaegerMonkey. Jaeger uses the collected type information to optimize the generated jitcode.
  3. The function executes using the Jaeger jitcode. When it gets really hot, it is re-compiled with IonMonkey. IonMonkey’s compiler spends a lot more time than JaegerMonkey, generating really optimized jitcode.
  4. If type information for a function changes, then any existing JITcode (both Jaeger’s and Ion’s) is thrown away, the function returns to being interpreted, and we go through the whole JIT lifecycle again.
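
The four steps above can be sketched as a small state machine. This is a toy model, not SpiderMonkey code: the `Function` class, the tier names, and `ION_THRESHOLD` are made up for illustration, and resetting the counter on invalidation is an assumption. The Jaeger threshold of 40 is the figure given in a reply in the comments.

```python
from dataclasses import dataclass

JAEGER_THRESHOLD = 40    # figure quoted in the comments for Jaeger
ION_THRESHOLD = 1000     # hypothetical; the post does not give Ion's threshold


@dataclass
class Function:
    name: str
    tier: str = "interpreter"
    use_count: int = 0
    invalidations: int = 0


def run_once(fn: Function) -> str:
    """Execute fn once and return the tier that ran it."""
    ran_in = fn.tier
    fn.use_count += 1
    # After running, tier up if the function has become hot enough.
    if fn.tier == "interpreter" and fn.use_count >= JAEGER_THRESHOLD:
        fn.tier = "jaeger"
    elif fn.tier == "jaeger" and fn.use_count >= ION_THRESHOLD:
        fn.tier = "ion"
    return ran_in


def type_info_changed(fn: Function) -> None:
    """Step 4: throw away all jitcode and restart the lifecycle."""
    if fn.tier != "interpreter":
        fn.invalidations += 1
    fn.tier = "interpreter"
    fn.use_count = 0  # assumption: warm-up starts over from zero
```

In this model a single type change late in a function's life sends it all the way back to the slowest tier, which is the cost the rest of this post is concerned with.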

There are good reasons why SpiderMonkey’s JIT compilation strategy is structured this way.

You see, Ion takes a really long time to compile, because to generate extremely optimized jitcode, it applies lots of heavyweight optimization techniques. This meant that if we Ion-compiled functions too early, type information was more likely to change after compilation, and Ion code would get invalidated a lot. This would cause the engine to waste a whole lot of time on compiles that would be discarded. However, if we waited too long to compile, then we would spend way too much time interpreting a function before compiling it.

JaegerMonkey’s JIT compiler is not nearly as time consuming as IonMonkey’s JIT compiler. Jaeger uses collected type information to optimize code generation, but it doesn’t spend nearly as much time as Ion in optimizing its generated code. It generates “pretty good” jitcode, but does it way faster than Ion.

So Jaeger was stuck in between the interpreter and Ion, and performance improved because the really hot code would still get Ion-compiled and be really fast, and the somewhat-hot code would get compiled with Jaeger (and recompiled often as type-info changed, but that was OK because Jaeger was faster at compiling).

This approach ensured that SpiderMonkey spent as little time as possible in the interpreter, where performance goes to die, while still gaining the benefits of Ion’s code generation for really hot JavaScript code. So all is well, right?

No. No it is not.

The Problems

The above approach, while a great initial compromise, still posed several significant issues:

  1. Neither JaegerMonkey nor IonMonkey could collect type information, and they generated jitcode that relied on type information. They would run for as long as the type information associated with the jitcode was stable. If that changed, the jitcode would be invalidated, and execution would go back to the interpreter to collect more type information.
  2. Jaeger and Ion’s calling conventions were different. Jaeger used the heap-allocated interpreter stack directly, whereas Ion used the (much faster) native C stack. This made calls between Jaeger and Ion code very expensive.
  3. The type information collected by the interpreter was limited in certain ways. The existing Type-Inference (TI) system captured some kinds of type information very well (e.g. the types of values you could expect to see from a property read at a given location in the code), but other kinds of information very poorly (e.g. the shapes of the objects that the property was being retrieved from). This limited the kinds of optimizations Ion could do.
  4. The TI infrastructure required (and still requires) a lot of extra memory to persistently track type analysis information. Brian Hackett, who originally designed and implemented TI, figured he could greatly reduce that memory overhead for Ion, but it would be much more time consuming to do for Jaeger.
  5. A lot of web-code doesn’t run hot enough for even the Jaeger compilation phase to kick in. Jaeger took less time than Ion to compile, but it was still expensive, and the resulting code could always be invalidated by type information changes. Because of this, the threshold for Jaeger compilation was still set pretty high, and a lot of non-hot code still ran in the interpreter. For example, SpiderMonkey lagged on the SunSpider benchmark largely because of this issue.
  6. Jaeger is just really complex and hard to work with.

The Solution

The Baseline compiler was designed to address these shortcomings. Like the interpreter, Baseline jitcode feeds information to the existing TI engine, while additionally collecting even more information by using inline cache (IC) chains. The IC chains that Baseline jitcode creates as it runs can be inspected by Ion and used to better optimize Ion jitcode. Baseline jitcode never becomes invalid, and never requires recompilation. It tracks and reacts to dynamic changes, adding new stubs to its IC chains as necessary. Baseline’s native compilation and optimized IC stubs also allow it to run 10x-100x faster than the interpreter. Baseline also follows Ion’s calling conventions, and uses the C stack instead of the interpreter stack. Finally, the design of the baseline compiler is much simpler than either JaegerMonkey or IonMonkey, and it shares a lot of common code with IonMonkey (e.g. the assembler, jitcode containers, linkers, trampolines, etc.). It’s also really easy to extend Baseline to collect new type information, or to optimize for new cases.
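
To make the idea of an IC chain concrete, here is a minimal sketch of one chain for a single `+` site. It is a toy model under stated assumptions, not Baseline's actual design: `AddIC`, `Stub`, `KNOWN_CASES`, and `generic_add` are hypothetical names, and real stubs are machine code rather than Python closures. What it tries to show is the general shape: guarded fast paths are tried in order, a fallback handles everything else and attaches new stubs, and the chain can be inspected afterwards.

```python
def generic_add(lhs, rhs):
    """Slow path: handles every case this toy model supports."""
    if isinstance(lhs, str) or isinstance(rhs, str):
        return str(lhs) + str(rhs)
    return lhs + rhs


# Operand type pairs for which the toy model knows a fast path.
KNOWN_CASES = {
    (int, int): lambda a, b: a + b,
    (float, float): lambda a, b: a + b,
    (str, str): lambda a, b: a + b,
}


class Stub:
    """One optimized case: a type guard plus a fast path."""

    def __init__(self, lhs_type, rhs_type, fast_path):
        self.lhs_type = lhs_type
        self.rhs_type = rhs_type
        self.fast_path = fast_path

    def guard(self, lhs, rhs):
        return type(lhs) is self.lhs_type and type(rhs) is self.rhs_type


class AddIC:
    """Inline cache chain for one '+' operation in the code."""

    def __init__(self):
        self.stubs = []
        self.fallback_hits = 0

    def execute(self, lhs, rhs):
        for stub in self.stubs:
            if stub.guard(lhs, rhs):
                return stub.fast_path(lhs, rhs)
        return self._fallback(lhs, rhs)

    def _fallback(self, lhs, rhs):
        # The fallback never fails; it also grows the chain when it can.
        self.fallback_hits += 1
        key = (type(lhs), type(rhs))
        if key in KNOWN_CASES:
            self.stubs.append(Stub(key[0], key[1], KNOWN_CASES[key]))
        return generic_add(lhs, rhs)

    def observed_types(self):
        """What a higher tier could learn by inspecting the chain."""
        return [(s.lhs_type.__name__, s.rhs_type.__name__) for s in self.stubs]
```

Note that nothing here is ever thrown away when a new type shows up: the chain just gets one more stub, which is the sense in which this kind of code "never becomes invalid".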

In effect, Baseline offers a better compromise between the interpreter and a JIT. Like the interpreter, it’s stable and resilient in the face of dynamic code, collects type information to feed to higher-tier JITs, and is easy to update to handle new features. But as a JIT, it optimizes for common cases, offering an order of magnitude speed up over the interpreter.

Where Do We Go From Here?

Baseline will enable a handful of significant changes, which are things to watch for in the upcoming year:

  • Significant memory savings by reducing type-inference memory.
  • Performance improvements due to changes in type-inference enabling better optimization of inlined functions.
  • Further integration of IonMonkey and Baseline, leading to better performance for highly polymorphic object-manipulating code.
  • Better optimization of high-level features like getters/setters, proxies, and generators.

Also, to remark on recent happenings… given the recent flurry of news surrounding asm.js and OdinMonkey, there have been concerns raised (by important voices) about high-level JavaScript becoming a lesser citizen of the optimization landscape. I hope that in some small way, this landing and ongoing work will serve as a convincing reminder that the JS team cares and will continue to care about making high-level, highly-dynamic JavaScript as fast as we can.

Acknowledgements

Baseline was developed by Jan De Mooij and myself, with significant contributions by Tom Schuster and Brian Hackett. Development was greatly helped by our awesome fuzz testers Christian Holler and Gary Kwong.

And of course it must be noted that Baseline by itself would not serve a purpose. The fantastic work done by the IonMonkey team and the rest of the JS team provides a reason for Baseline’s existence.

47 responses

  1. doot0 wrote on ::

    I’m looking forward to seeing how this change will spur more innovation amongst other browser vendors. I’m not expecting anything special from Microsoft, though…

  2. Maguro wrote on :

    It’s like a magic dragon with 3 heads. Is 3 the optimum?

  3. Jim B wrote on :

    Congratulations on getting this landed. Thanks for taking the time to write this up.

  4. Jordan Arentsen wrote on ::

    It seems like v8bench was dropped from AWFY, curious about the reasons for that.

    1. Kannan Vijayan wrote on :

      Octane is a superset of the V8 benches, and a better rounded suite to boot. We had V8 on for a while, but it was simply redundant and cluttered up the page in the end.

      1. Jordan Arentsen wrote on ::

        Gotcha, makes sense, thanks!

  5. Reece H. Dunn wrote on ::

    Congratulations to everyone involved.

    It is fascinating seeing the work progress through bugzilla and blog posts, and seeing the JS engine line inch ever closer to V8 on arewefastyet. It’ll be interesting seeing where Baseline+TI+Ion goes in the future.

  6. Pobrecito Hablador wrote on :

    Maybe I’m dense, but I miss how Baseline will interact with the interpreter. My guess is that Baseline seems to be designed to deprecate the interpreter, or at least it seems like it will kick in sooner than Jaeger.

    1. Kannan Vijayan wrote on :

      It does kick in sooner. Jaeger waits for a function to get called 40 times (or go around a loop 40 times) before compiling it. Baseline waits until 10 iterations. We could lower that to 0, but it saves a bit of memory to wait until 10. The vast majority of code executes fewer times than that, and the memory/performance tradeoff is not really worth it.
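
A rough sketch of the warm-up counting described in this reply, assuming a single counter per script that is bumped on every call and every loop iteration. The class and function names are hypothetical; only the thresholds 10 and 40 come from the reply.

```python
BASELINE_THRESHOLD = 10   # from the reply above
JAEGER_THRESHOLD = 40     # from the reply above


class WarmUpCounter:
    """Counts calls and loop iterations for one script."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.count = 0
        self.compiled = False

    def bump(self):
        """Record one call or loop iteration.

        Returns True only on the bump that triggers compilation.
        """
        self.count += 1
        if not self.compiled and self.count >= self.threshold:
            self.compiled = True
            return True
        return False


def bumps_until_compiled(threshold):
    """How many calls/iterations run in the interpreter-era tier first."""
    counter = WarmUpCounter(threshold)
    bumps = 0
    while True:
        bumps += 1
        if counter.bump():
            return bumps
```

A script that runs only a handful of times never reaches either threshold, so it costs no jitcode memory at all.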

  7. John Vilk wrote on :

    Is this automatically enabled in Firefox Nightly, or is it hidden behind a flag? I want to see if it changes the runtime performance of my web apps.

    Thanks!

    1. Kannan Vijayan wrote on :

      It’s on by default.

      1. Caspy7 wrote on :

        So then if all goes to plan this will likely arrive with Firefox 23?

  8. Manoj Mehta wrote on :

    Congratulations on landing the baseline compiler. Out of curiosity, why was JaegerM designed the way it was? And what evolutionary thinking made Baseline possible today but not when Jaeger was conceived?

    Thanks!
    Manoj

    1. Kannan Vijayan wrote on :

      I think it’s more a consequence of history than anything else.

      When Jaeger was initially being written, it was supposed to be the baseline compiler for TraceMonkey, and this was before type-inference was added to the system. After type-inference was added, Jaeger was modified to use type-inference info in its code generation. This made Jaeger into more of an optimizing JIT.

      An optimizing JIT and a type-info collecting JIT are at odds with each other. The former uses type information, while the latter generates it. These are two very divergent design goals.

      Baseline by itself is actually a lot slower, head to head, compared with Jaeger, because Baseline does not use type-information to optimize its generated code. It ends up speeding up the overall engine not because of raw speed, but because it provides a better scaffolding for Ion.

      Prior to this, if we compiled with Ion and type-info changed and the ion code was invalidated, we would fall back to the interpreter. Because baseline code isn’t invalidated by type-info changes, that’s no longer the case. When Ion invalidates, we just fall back to baseline jitcode instead, which is guaranteed to still be valid. Because of this property, we can be a lot more aggressive about compiling with Ion, because the penalty we pay for invalidating Ion code is now a lot lower.

      It wasn’t thinking that changed so much as the circumstance.
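
The fallback behaviour described in this reply can be sketched as follows. This is an illustrative model only: `Script`, `compile_ion`, and the single type guard are hypothetical simplifications, and real Ion code relies on far richer type information than one argument type.

```python
class Script:
    """A function with always-valid baseline code and optional Ion code."""

    def __init__(self, generic_impl):
        self.baseline = generic_impl   # works for any input, never invalidated
        self.ion = None                # (expected_type, specialized_impl) or None
        self.ion_invalidations = 0

    def compile_ion(self, expected_type, specialized_impl):
        """Install code specialized on an assumed argument type."""
        self.ion = (expected_type, specialized_impl)

    def call(self, arg):
        """Run the script; return (tier_used, result)."""
        if self.ion is not None:
            expected_type, fast = self.ion
            if type(arg) is expected_type:
                return "ion", fast(arg)
            # The type assumption broke: discard only the Ion code.
            self.ion = None
            self.ion_invalidations += 1
        # Baseline code is still valid, so there is no trip back
        # to the interpreter.
        return "baseline", self.baseline(arg)


double = Script(lambda x: x + x)
double.compile_ion(int, lambda x: x << 1)   # specialized for integers
```

Calling `double.call(21)` uses the Ion path; a later `double.call("ab")` invalidates it and is served by the baseline code, as is every call after that until Ion compiles again.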

  9. Aldo_MX wrote on ::

    I really like the way Mozilla is heading, congratulations for your achievements :)

  10. Adam Domurad wrote on ::

    Very cool! Although, in place of these asm.js movements, I’d really like to see a common bytecode for the web. It’s sad that any non-javascript language is necessarily a second-class (or worse) citizen.

  11. Foobar wrote on :

    Is there a plan to get rid of the interpreter entirely? If not, why?

    1. Dave Herman wrote on ::

      See Kannan’s reply above: https://blog.mozilla.org/javascript/2013/04/05/the-baseline-compiler-has-landed/#comment-26433

      1. Jan de Mooij wrote on :

        In addition to the memory wins, the interpreter is a lot more portable than the JITs. It allows us to run Firefox/SpiderMonkey on platforms for which we don’t have a JIT backend, for instance MIPS and PowerPC, or platforms where a JIT is not allowed (iOS). Furthermore, the interpreter is useful for debugging and makes it easier to experiment with new language features. So while there are good reasons to keep the interpreter, with Baseline we spend less time in it and we hope we can simplify it further.

        1. Caspy7 wrote on :

          Admittedly neither we nor our interpreter are allowed on iOS.

  12. Overmind wrote on :

    So if I read it right, the baseline compiler will replace the interpreter + Jaeger. The baseline compiler will lazily compile all of the code and insert profiling code to collect type information.

    That seems to be a great approach.

    1. Tom Schuster wrote on :

      This sounds nearly right. However, we are not going to replace the interpreter. Code that is executed fewer than 10 times will still run in the interpreter. After that we compile in Baseline.

  13. Anon Ray wrote on ::

    How long does the Baseline take to compile? The compile time of Baseline is not mentioned.
    At the beginning of the post you mention that Jaeger takes much less time to compile than Ion, but since Baseline shares a lot of code with Ion, I am curious about the compilation time taken by Baseline and how it affects the entire cycle/system.

    1. Jan de Mooij wrote on :

      I don’t have exact numbers, but Baseline compiles code even faster than Jaeger, because its design is simpler and it does not perform as many optimizations as Jaeger.

      Ion and Baseline use the same MacroAssembler etc, but where Ion compiles bytecode to MIR, MIR to LIR, LIR to native code, Baseline emits native code directly from the bytecode. So we only reuse parts of Ion that we really need, not the MIR/LIR, optimizations and register allocation.

      1. Anon Ray wrote on ::

        Wow! That seems like a great design :-) Maintaining it also becomes much easier I guess.
        Great work! Congrats! and thanks for the insight.

  14. bobbby wrote on :

    Would you make the time/hotness needed for Baseline to kick in (over the interpreter) user-configurable through about:config, at least for Nightlies?

    1. Kannan Vijayan wrote on :

      This is unlikely. It’s hard to write a change that’s enabled only on nightlies and disabled later (except by doing it manually).

      We test heavily with the default settings, so there may be issues with different tunings that fuzzing and other testing didn’t pick up. If we were to make this a tunable setting, then it’s something that would have to be tested to the same extent that the default is, and that takes away precious testing resources from the default configuration.

      Also, from the measurements that we’ve run, changing the constant here back and forth (even from 10 to 0) doesn’t really make a big difference except in pathological cases.

  15. Tim wrote on :

    This is all very cool, but I do have one big question I have difficulties coping with… Why no *Monkey name? :(

    1. Kannan Vijayan wrote on :

      I guess we just never picked one ;)

      Personally, I like to think of Baseline as a part of IonMonkey. It exists to support Ion in situations when jitcode gets deoptimized, and it generates information for Ion to consume and use in its jitcode generation. It shares a bunch of data structures with Ion. It uses Ion’s call ABI. Etc. etc.

      1. Daniel wrote on :

        IonChild

      2. Ffire wrote on :

        Is it too late for another name yet? I propose CodeMonkey.

  16. Neal wrote on :

    When will this land on the release channel?

    1. Kannan Vijayan wrote on :

      It should go out with Firefox 23

  17. CircleCode wrote on :

    As always, good job from mozilla team, and great article.
    This helped me discover things I never thought about, and then led me to some questions about polymorphic functions.
    Take jQuery, for example: it uses a lot of polymorphic functions. Does that mean its cache is regularly invalidated and its code recompiled? (I suppose they took care of it, and it is not so simple…)
    Now, it comes to my own developments (humble ones ;-) ): do you have any advice to allow the functions we write to keep their cache as long as possible, and some indications on what will always invalidate it?

  18. CircleCode wrote on :

    I suppose this one will be moderated, but I don’t know where to say it: the mail input for comments doesn’t allow the use of `+` symbol in mails, which is perfectly valid

  19. Robert O’Callahan wrote on :

    Are we going to remove JM completely? Soon?

    1. Jan de Mooij wrote on :

      Yeah, JM does not compile anything with Baseline enabled and we hope we can remove it completely in the next cycle (Firefox 24). We are keeping it for the current release in case we have to pref it back on for some reason.

  20. J. McNair wrote on :

    Wait, so Baseline will do all of the following
    1. be a much simpler JIT compiler that performs quickly, sacrificing most optimization
    2. compile JS only ONCE; store the compiled code in multiple, branching, inline caches; and quickly execute code exclusively from the caches until Ion turns on.
    3. adapt to wildly dynamic JavaScript, without an explosion in number and size of caches
    4. run most of TI and store type information for Ion in the same caches it uses for #2
    5. use less memory than JM+TI and decrease Ion’s overall memory usage

    You Mozilla people are a bit too clever for me. At this point, it’s worth asking if there’s any way to improve the interpreter’s performance?

    1. Kannan Vijayan wrote on :

      1. Baseline sacrifices optimizations that rely on type-information. E.g. passing around unboxed values because type-inference guarantees that some value will be an integer all the way through a computation. It also doesn’t emit its optimized code inline (as part of the main jitcode) – instead it pushes the optimized logic into the IC chains, which are dynamically modified at runtime.

      2. The main JS compiles once, but that’s because it uses ICs to implement almost every op. The fallback-stub (last stub that does the default slowpath) for each IC chain implements logic to add new IC stubs. This is pretty straightforward IC design. We make some design choices here to make the IC chains more introspectable.

      3. We limit the size of caches. For some caches, we generalize stubs as the chain grows. For example, with Call IC chains, we start off attaching IC stubs which store the actual callee function that’s being invoked. However, if a lot of callees are being called at that site, then the callee-specific stubs are removed, and a generic call stub is added.

      4. We don’t use the caches to store TI information. We interface with TI by calling the usual VM functions to update TI’s data structures. However, since this is a slow C++ call, we use an IC to record what types have already been added to a type-set, so that we can avoid calling the slowpath most of the time.

      5. The memory savings will come about less because of Baseline specifically, and more because Baseline allows us to disable Jaeger, and disabling Jaeger allows us to remove some of the memory-heavy structures that Jaeger needs.
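
Point 3 above (generalizing stubs as a chain grows) might look roughly like this for a call site. It is a sketch under assumptions: `CallIC` and `MAX_CALLEE_STUBS` are hypothetical, the limit of 4 is an arbitrary illustrative value, and the real stubs are generated code rather than list entries.

```python
MAX_CALLEE_STUBS = 4   # hypothetical limit, chosen only for illustration


class CallIC:
    """Inline cache for one call site."""

    def __init__(self):
        self.callee_stubs = []   # callees that have a dedicated stub
        self.generic = False     # True once a generic call stub is attached

    def call(self, callee, *args):
        if not self.generic and not self._has_stub_for(callee):
            self._attach(callee)
        return callee(*args)

    def _has_stub_for(self, callee):
        return any(c is callee for c in self.callee_stubs)

    def _attach(self, callee):
        if len(self.callee_stubs) >= MAX_CALLEE_STUBS:
            # Too many distinct callees: drop the callee-specific
            # stubs and switch to a single generic stub.
            self.callee_stubs.clear()
            self.generic = True
        else:
            self.callee_stubs.append(callee)
```

The chain therefore has a fixed upper bound on its size no matter how many different functions are called from the site.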

      The interpreter is great as a portable, flexible environment where the team can prototype and implement new features without having to worry about how it’ll be jitted or optimized. It allows the feature people to do their work without being bogged down by optimization details, and the JIT people can come in later and make those features run fast.

      1. J. McNair wrote on :

        Thanks for replying! I guess you’ve figured out that I am certainly not a compiler programmer, and I apologize for blowing up your blog post with long comments. This should be the last one!

        I admit I was thinking “inlined code” instead of “inline cache”, so I am sorry I was imagining Baseline as a crazy hybrid of TraceMonkey and Sun’s Hotspot VM. So, every JS operation or byte-code is implemented as an inline cache with stub functions, Baseline uses the stubs to quickly chain those ICs together in a freaky “function pointer linked list” or something, and then the whole chain is executed in sequence because the stubs are already filled in with the next needed IC. More or less it?

        And that should mean that any dynamism is handled by modifying the chain (add/remove/swap ICs and stubs) or the individual “links” (e.g. your example with the CALL IC). This keeps Baseline from ever fully invalidating code, while limiting IC explosion, right?

        Thanks for clarifying baseline’s relationship with type inference, and upon rereading both posts, I now realize that most of the savings comes from disabling JM and using a system that is smarter and more efficient about using the existing type inference system.

        I never advocated getting rid of the interpreter, I just wondered if there was a way to keep it portable, readable and hackable, with slightly better performance than it has. Not turn it into something it isn’t.

        And I’m sorry that I find all this really fascinating, and thank you all for your hard work!

        1. Kannan Vijayan wrote on :

          I think you’re getting close. Every op has its own chain, actually. Jan and I have discussed putting up a blog post discussing the design in more detail. There are some pressing matters to work on right now, but if the time is there we’ll try to write something up.

          If you’re really curious, there’s some ASCII art and long comments in BaselineIC.h in the sources :)

  21. Rob Colburn wrote on ::

    So, as I understand it… Spider hands off code to Baseline (or, sometimes Odin). Based on heat and stability, Baseline will pass code to itself, then to Jaeger, then to Ion. If code regresses (in terms of type stability), then it takes over and holds on to the hot, unstable code. Leaving:

    Cold = Interpreter
    Luke-Warm or Hot and Unstable = Baseline
    Warm and Stable = Jaeger
    Burning Hot and Stable = Ion

    So, long-term is the plan to deprecate Jaeger? My hunch would be that type-stable code tends to be either cold/luke-warm or burning hot, and not so much warm.

    Side note: Would be interesting, if someone made histograms of popular website JS heuristics.

    1. J. McNair wrote on :

      I can help with this one! The developers stated it above, but it’s no trouble to repeat that JaegerMonkey is going away, immediately. JM is already disabled when Baseline is turned on. As they explained, JM became too large and complex, it uses too much memory, and doesn’t efficiently drive the Type Inference engine that guides Ion.

      The new status quo is
      Cold = Interpreter (which will always be improved and shouldn’t go away)
      Room Temperature to Hot = Baseline which can provide type info to prepare Ion, if needed
      Boiling to the Sun = Ion + TI

      I agree that some data on how “hot” popular website JS functions are, from the perspective of the JS engines would be really neat. I bet there’s already some kind of instrumentation in place to do this.

      OdinMonkey is a separate compiler that reuses parts of Ion to statically compile a (restricted but mostly compatible) subset of JavaScript called “asm.js” into really fast, very type stable, machine code. I am hoping this compilation is off the main thread, already, lol. JS developers have to specifically ask for OdinMonkey by including a special directive in function declarations that is easily ignored by JS engines that don’t understand asm.js. So, everyone can opt-in to better performance, for certain types of code, without breaking the web.

  22. Steve wrote on :

    I have to confess most of this is above my head but I’d appreciate it if Mozilla could issue some guidance on how to make sure regularly called functions or loops get passed up to IonMonkey and stay there.

    For example, I gather from the above if I were to initiate a variable as a boolean but later in the function assign a string to it that would bump the function out of the fast running compiled code, is that correct?

    How about arrays, are they more likely to get compiled and stay that way if every element is of the same type?

    How about objects being passed to functions as parameters, should they always have the same members or are only the members the function uses important?

    This is the kind of guidance that would be useful to someone like myself who writes JavaScript scripts but doesn’t really understand what the interpreter does under the hood.

    Thanks.

    1. Kannan Vijayan wrote on :

      Hi Steve,

      Actually, all of those cases will be compiled with Ion. The variables having values of different types only becomes an issue if the change happens after compilation.

      Like, for example, if a function accepts an argument, and the first 100,000 times the function is called, the argument is an integer, and the 100,001th time, the function is called with a string, then it’ll lead to some minor slowdown as the function is recompiled.

      Having arrays with values of different types doesn’t prevent ion compilation either.

      In general, the best plan is to write clear, readable, well-designed JavaScript code. That’s the kind of code that both we and the other JS teams are going to be trying to optimize.

      Cheers.

  23. Devin Rhode wrote on ::

    I wonder about a ‘use types’; feature for javascript

  24. Samuel wrote on :

    This trade-off between low overhead and high execution speed, reminded me of when I watched the stories told in Moon Machines. Such as this one:
    http://science.discovery.com/video-topics/space-videos/moon-machines-3-stage-rocket.htm

    By extension, I’ll put this link also here:
    http://en.wikipedia.org/wiki/Trade-off

    Anyway, I think your post is precious, and talks about creativity at its finest.

    Also, it is a success to be right there with Chrome in its own benchmark.
