IonMonkey in Firefox 18

David Anderson

Today we enabled IonMonkey, our newest JavaScript JIT, in Firefox 18. IonMonkey is a huge step forward for our JavaScript performance and our compiler architecture. It has also been a highly focused, year-long project on the part of the IonMonkey team, and we’re super excited to see it land.

SpiderMonkey has a storied history of just-in-time compilers. Throughout all of them, however, we’ve been missing a key component you’d find in typical production compilers, like those for Java or C++. Both the old TraceMonkey* and the newer JägerMonkey performed a fairly direct translation from JavaScript to machine code. There was no middle step. There was no way for the compilers to take a step back, look at the translation results, and optimize them further.

IonMonkey provides a brand new architecture that allows us to do just that. It essentially has three steps:

  1. Translate JavaScript to an intermediate representation (IR).
  2. Run various algorithms to optimize the IR.
  3. Translate the final IR to machine code.
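The three steps above can be sketched in JavaScript itself. Everything below is an illustrative stand-in, not IonMonkey’s real internals: the IR is just an array of instruction objects, and the pass shown is a toy.

```javascript
// Step 1: "translate" each statement into a pseudo-instruction.
function buildIR(statements) {
  return statements.map((s) => ({ op: s, used: true }));
}

// Step 2 (one toy pass): drop instructions marked as unused.
function deadCodeElimination(ir) {
  return ir.filter((insn) => insn.used);
}

// Step 3: "emit" one machine op per remaining IR instruction.
function codegen(ir) {
  return ir.map((insn) => "emit " + insn.op);
}

// The pipeline: translate, run each optimization pass over the IR,
// then generate code from whatever the passes left behind.
function compile(statements, passes = [deadCodeElimination]) {
  let ir = buildIR(statements);
  for (const pass of passes) {
    ir = pass(ir);
  }
  return codegen(ir);
}

console.log(compile(["x = 1", "y = x + 2"]));
```

The point of the middle step is that `passes` is just a list: a new optimization can be written as one more IR-to-IR function and slotted into the pipeline.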

We’re excited about this not just for performance and maintainability, but also for making future JavaScript compiler research much easier. It’s now possible to write an optimization algorithm, plug it into the pipeline, and see what it does.

Benchmarks

With that said, what exactly does IonMonkey do to our current benchmark scores? IonMonkey is targeted at long-running applications (we fall back to JägerMonkey for very short ones). I ran the Kraken and Google V8 benchmarks on my desktop (a Mac Pro running Windows 7 Professional). On the Kraken benchmark, Firefox 17 runs in 2602ms, whereas Firefox 18 runs in 1921ms, making for roughly a 26% performance improvement. For the graph, I converted these times to runs per minute, so higher is better:
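The conversion behind the graph is simple: a time in milliseconds per run becomes runs per minute.

```javascript
// Convert a benchmark time (milliseconds per run) into runs per
// minute, so that a higher number means a faster browser.
const runsPerMinute = (ms) => 60000 / ms;

console.log(runsPerMinute(2602).toFixed(1)); // Firefox 17: 23.1 runs/min
console.log(runsPerMinute(1921).toFixed(1)); // Firefox 18: 31.2 runs/min
```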

On Google’s V8 benchmark, Firefox 15 gets a score of 8474, and Firefox 17 gets a score of 9511. Firefox 18, however, gets a score of 10188, making it 7% faster than Firefox 17, and 20% faster than Firefox 15.

We still have a long way to go: over the next few months, now with our fancy new architecture in place, we’ll continue to hammer on major benchmarks and real-world applications.

The Team

For us, one of the coolest aspects of IonMonkey is that it was a highly-coordinated team effort. Around June of 2011, we created a somewhat detailed project plan and estimated it would take about a year. We started off with four interns – Andrew Drake, Ryan Pearl, Andy Scheff, and Hannes Verschore – each implementing critical components of the IonMonkey infrastructure, all pieces that still exist in the final codebase.

In late August 2011 we started building out our full-time team, which now includes Jan de Mooij, Nicolas Pierron, Marty Rosenberg, Sean Stangl, Kannan Vijayan, and myself. (I’d also be remiss not to mention SpiderMonkey alumnus Chris Leary, as well as 2012 summer intern Eric Faust.) For the past year, the team has focused on driving IonMonkey forward, building out the architecture, and making sure its design and code quality are the best we can make them, all while improving JavaScript performance.

It’s really rewarding when everyone has the same goals, working together to make the project a success. I’m truly thankful to everyone who has played a part.

Technology

Over the next few weeks, we’ll be blogging about the major IonMonkey components and how they work. In brief, I’d like to highlight the optimization techniques currently present in IonMonkey:

  • Loop-Invariant Code Motion (LICM), or moving instructions outside of loops when possible.
  • Sparse Global Value Numbering (GVN), a powerful form of redundant code elimination.
  • Linear Scan Register Allocation (LSRA), the register allocation scheme used in the HotSpot JVM (and until recently, LLVM).
  • Dead Code Elimination (DCE), removing unused instructions.
  • Range Analysis, eliminating bounds checks (to be enabled after bug 765119).
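To make the first few of these concrete, here is a made-up function optimized by hand the way these passes transform code (IonMonkey applies them to its IR automatically, not to source text):

```javascript
// Original: recomputes the loop-invariant `scale * scale` every
// iteration (LICM's target), evaluates `v * v` twice (GVN's target),
// and computes a value nothing uses (DCE's target).
function sumScaledNaive(values, scale) {
  let total = 0;
  for (const v of values) {
    const unused = v - 1;                      // dead code
    total += v * v * (scale * scale) + v * v;  // invariant + redundancy
  }
  return total;
}

// After hand-applying LICM, GVN, and DCE:
function sumScaledOptimized(values, scale) {
  let total = 0;
  const s2 = scale * scale;   // hoisted out of the loop (LICM)
  for (const v of values) {
    const v2 = v * v;         // computed once, reused (GVN)
    total += v2 * s2 + v2;    // the unused temporary is gone (DCE)
  }
  return total;
}

// Both versions compute the same result; the second does less work.
console.log(sumScaledNaive([1, 2, 3], 2));     // 70
console.log(sumScaledOptimized([1, 2, 3], 2)); // 70
```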

Of particular note, I’d like to mention that IonMonkey works on all of our Tier-1 platforms right off the bat. The compiler architecture is abstracted to require minimal replication of code generation across different CPUs. That means the vast majority of the compiler is shared between x86, x86-64, and ARM (the CPU used on most phones and tablets). For the most part, only the core assembler interface must be different. Since all CPUs have different instruction sets – ARM being totally different than x86 – we’re particularly proud of this achievement.

Where and When?

IonMonkey is enabled by default for desktop Firefox 18, which is currently Firefox Nightly. It will be enabled soon for mobile Firefox as well. Firefox 18 becomes Aurora on Oct 8th, and Beta on November 20th.

* Note: TraceMonkey did have an intermediate layer. It was unfortunately very limited. Optimizations had to be performed immediately and the data structure couldn’t handle after-the-fact optimizations.

44 responses

  1. Jarrod Mosen wrote on ::

    Awesome! How’s this compare to Chrome’s offerings?

    1. Gianluca wrote on ::

      Some performance tests: http://arewefastyet.com/

    2. Ed wrote on :

      I think IM will basically catch up to Chrome’s offering; what Gecko still lacks is in the GC department. (Just my guess.)

      1. Terrence Cole wrote on :

        You are quite correct! SpiderMonkey’s GC team is working hard on a new scavenging collector which should get us the rest of the way there.

  2. DaveB wrote on :

    It doesn’t matter what kind of JavaScript performance IonMonkey can show off on a graph when Gecko can’t keep up rendering the stuff on screen. And lately, Mozilla seems to have stopped any kind of progress they were making on Gecko.

  3. Joe Average wrote on :

    Have you tested memory usage with new JIT compared to the old one?

    1. David Anderson wrote on ::

      Last I measured in the shell, it was roughly in the same ballpark as JM. Before the release we’ll make sure to hook up the about:memory reporters (bug 747202).

  4. Vitaly wrote on :

    Hi David,

    Congrats on this achievement. Just one small nitpick – Hotspot client compiler uses a linear scan RA but its more aggressive server compiler uses a graph coloring RA:
    https://wikis.oracle.com/display/HotSpotInternals/C2+Register+Allocator+Notes

    Thanks

    1. David Anderson wrote on ::

      Thanks, that completely slipped my mind!

  5. Ronak Shah wrote on :

    Chrome x86 Version 23.0.1262.0’s V8 Benchmark Suite – version 7 score is 12381.
    I’m running Windows 8 Pro x64 with 4GB RAM.

    I’d love to see Firefox 18 final release soon.

  6. Will Morgan wrote on ::

    That’s incredibly cool. For developer purposes, I wonder if there is a way to expose the results of GVN and DCE and translate them back, so that you could highlight the original JavaScript regions that were optimised?

    1. David Anderson wrote on ::

      That would be cool – we’d have to see how often it affects high-level code. A lot of what GVN eliminates is just stuff generated by the compiler itself. JS operations break down into smaller components, and these end up being redundant from statement to statement.

  7. Tom wrote on :

    Congratulations for a year of great work!

  8. Mr. S wrote on :

    How do you determine what a “long-running” application is?

    1. David Anderson wrote on ::

      If a function/loop runs enough times (I think right now it’s 10,000), the higher-powered compiler pays off. That’s not really a good definition of “long-running application”, it’s more like, “something that will benefit from the compiler”.

  9. wat wrote on :

    That’s great. However, I am looking forward more to the Australis UI implementation, along with the new, customizable menu button. It’s going to be a much more noticeable change than some javascript benchmarks.

    1. Ferdinand wrote on :

      Then why did you feel the need to post that on a new javascript engine post?

    2. Przemysław Lib wrote on :

      These days even Twitter is a JS-heavy website …

      You will see improvements ;) Unless you only read RSS feeds :P

  10. Daniel Johansson wrote on :

    Great job!! I’m impressed by this team effort.

    Is there a way to see what optimizations have been made to the code after the fact?

  11. vlad wrote on :

    cool. so when are we gonna be able to use something other than javascript? (say python or perl, for example)

  12. Alaa Salman wrote on ::

    Question, is that project plan available publicly? I’d like to see a sample plan for a project like this.

    1. David Anderson wrote on ::

      Yup – https://wiki.mozilla.org/Platform/Features/IonMonkey

      I think all but “Baseline Compiler” and “Debugging” ended up being accurate by the end.

  13. Matt wrote on :

    Great stuff – but why is this blog not syndicated on p.m.o?

  14. Jan wrote on ::

    excited to hear this. the announcement made me run a couple of performance tests myself. here’s a look at how firefox improved in jscript performance since 0.1 (phoenix!) came out

    http://whyeye.org/blog/firefox-performance-history/

    1. herom wrote on :

      seems you were so excited about ‘this’ announcement that you forgot to benchmark firefox nightly with ‘the ionmonkey engine’ this announcement was about.

  15. Steve wrote on :

    Does this mean other languages can start targeting the intermediate representation instead of having to transpile to JavaScript?

    Or is this some sick in-joke between browser vendors? First Chrome, with “everyone asks for a common bytecode format, so let us build a new language combining the worst of JavaScript with the worst of Java instead! TROLOLO!”, and now Firefox: “We built an intermediate representation, but we won’t let you target it.”

    Let’s hope it works out this time …

    1. David Anderson wrote on ::

      You might be interested in some old posts Brendan has about bytecode standards in browsers (a standard IR is the same problem):

      http://www.aminutewithbrendan.com/pages/20101122

      1. Steve wrote on :

        I’m not even asking for “standards” or anything of that sort; I just wanted to know if there is a way to target Mozilla’s “interpreter” without suffering from JavaScript and its inherent slowness.

  16. dumb wrote on :

    IonMonkey noticeably regressed SunSpider. Don’t you care about SunSpider anymore?

    1. David Anderson wrote on ::

      Yeah, we took a 3-5% SunSpider regression. We intend to fix this before Firefox 18 ships.

      IonMonkey is the heavy duty compiler intended for long-running code. SunSpider tests often run in 0-3ms, so compilation time, and choosing which compiler to use, can be a huge factor.

      1. dumb wrote on :

        hmm it’s about 8% on my machine:
        308.3ms +/- 1.0% on nightly 18 (2012-09-13)
        284.0ms +/- 0.8% on aurora 17 (2012-09-07)
        It’s also visible on AWFY that the regression is bigger on older machines.

  17. tom jones wrote on :

    why isn’t this blog on planet.mozilla.org?

    i have to find out about this from ars technica? c.c.c.. :(

  18. Yousif Anwar wrote on :

    I am no developer but so excited about this. So is it already available in the latest build of Firefox Nightly that we can download right now from http://nightly.mozilla.org/

    Thanks a lot for the wonderful work and thanks for answering! :)

  19. Swarnava Sengupta wrote on ::

    good work man \m/

  20. Allen Lee wrote on :

    I’ve tried the latest Firefox 18 beta. On my Mac, it scores 10,932 on the V8 benchmark, whereas Chrome scores 14,000; Firefox 17 scores 9885. Drilling down into the individual tests, I found significant improvements in certain tests like Raytrace and DeltaBlue. But there are also noticeable degradations in the Crypto test, cancelling out some of the gains in score. I suppose the drop is due to the extra time spent in optimisation. RegExp remains very poor compared with Chrome. Another observation is that performance is noticeably affected when the benchmark is run again immediately; memory skyrocketed to about 1G. The GC is definitely one area to look into.

  21. Petr wrote on :

    I’m looking forward to see how IM deals with complex JS libraries, like ExtJS.

  22. Pushker wrote on ::

    Really, I like Firefox 18: great changes in the Web Developer tools. But we want something like Chrome’s developer tools, which are easier to use. The speed of this version is good, but I think still not comparable to Chrome. Anyway, great job; I am waiting for the next version to see more good changes.

  23. Michaela Merz wrote on ::

    I did some prime number calculations (including a bigint work-around) and found that Firefox is, unfortunately, still at least 3-4 times slower than Chrome. But it’s getting better. Are we going to see some more speed improvements?

    Michaela

  24. armakuni wrote on :

    Benchmark scores are simply wasted time.

    Please, use Firefox 18 for a day with some web apps, for example Tine 2.0 or something similar. It’s no fun, because it’s slow. If you use Chrome instead, these web apps react much faster. Not as fast as a native OS application like Kontact etc., but much faster than in Firefox.

    And yes: I already tried a clean installation without my installed extensions. There is virtually no difference between zero installed extensions, 5 installed extensions or 35 installed extensions.

    It’s sad, because I have loved Firefox for years for its extensions, but the performance hasn’t gotten much better. It’s getting harder to work with Firefox.

  25. Raven wrote on ::

    I was trying to run this on IE9 64-bit but, after a rather sluggish start, I think it crashed (“mozilla.org is not responding”). Would there be a test that is runnable on Chrome, IE and Firefox as a head-to-head demo?

  26. atcon wrote on ::

    cool. that’s why we like firefox.

  27. Sean Halle wrote on ::

    BTW, if you get the captcha wrong, it loses your comment text :-(

  28. Sean Halle wrote on ::

    I add parallel behavior to javascript via the proto-runtime system (http://opensourceresearchinstitute.org), which requires the js engine to be thread safe. Multiple cores are running js code from the same application at the same time, sharing heap allocated variables (things created via “new”). This means that the js engine has to protect internal bookkeeping variables, such as the trace cache, heap meta-info, and so on. It also has to make garbage collection be safe, so that one core can be performing garbage collection while other cores are still executing js code that accesses heap objects. Is IonMonkey safe in this way?

    1. jwalden wrote on ::

      SpiderMonkey heaps are each restricted to use on a single thread. You can have multiple heaps, but they have to be on different threads.

      Now, there are some flourishes on that basic description of things. For example, we’re experimenting with “parallel” JS which can in some circumstances perform an apparently-parallel operation across a bunch of threads, falling back to a single-threaded implementation if any thread tries to do something that can’t be parallelized. And we have APIs which allow one runtime to send a value by “structured clone” to another runtime, which lets you do some sorts of message-passing (ideally with small messages). And there are other ideas being researched, which might make their way into JS the language if they pan out well enough in SpiderMonkey (and the other way around).

      But fundamentally, making all of garbage collection safe, all of heap information safe, etc. is 1) really hard to get right, because of how much it touches, and 2) probably impossible to make fast, because you don’t want to pay the cost of atomic operations for JS that’s overwhelmingly single-threaded. It might be interesting to create an engine that implements everything as atomic operations, and see how it does with non-web JS. But on the web, we can’t pay those costs. (We actually attempted to pay them, at one time. A lot of the reason we’ve gained speed is from being able to rip out all that purportedly “thread-safe” code and replace it with fast, single-threaded algorithms.)
