IonMonkey provides a brand-new architecture that allows us to do just that. It essentially has three steps:
- Translate JavaScript to an intermediate representation (IR).
- Run various algorithms to optimize the IR.
- Translate the final IR to machine code.
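To make the shape of such a pipeline concrete, here is a toy sketch in Python: build an IR from source, run one optimization over it, then lower the result to instructions. Nothing here reflects IonMonkey's real data structures. The tuple-based IR, the function names (`build_ir`, `fold_constants`, `generate_code`), and the made-up stack machine are all assumptions chosen for illustration, and the input is a tiny arithmetic expression rather than JavaScript.

```python
import ast

def build_ir(source):
    """Translate an arithmetic expression into a flat list of IR instructions.
    Operands refer to earlier instructions by index (hypothetical IR format)."""
    ir = []

    def emit(node):
        if isinstance(node, ast.Constant):
            ir.append(("const", node.value))
        elif isinstance(node, ast.Name):
            ir.append(("param", node.id))
        elif isinstance(node, ast.BinOp) and isinstance(node.op, (ast.Add, ast.Mult)):
            left = emit(node.left)
            right = emit(node.right)
            op = "add" if isinstance(node.op, ast.Add) else "mul"
            ir.append((op, left, right))
        else:
            raise ValueError("unsupported expression")
        return len(ir) - 1  # index of the instruction just appended

    emit(ast.parse(source, mode="eval").body)
    return ir

def fold_constants(ir):
    """One optimization pass: replace add/mul of two constants with a constant.
    Instructions are rewritten one-for-one, so operand indices stay valid."""
    out = []
    for instr in ir:
        if instr[0] in ("add", "mul"):
            a, b = out[instr[1]], out[instr[2]]
            if a[0] == "const" and b[0] == "const":
                value = a[1] + b[1] if instr[0] == "add" else a[1] * b[1]
                out.append(("const", value))
                continue
        out.append(instr)
    return out

def generate_code(ir):
    """Lower the final IR to text instructions for a made-up stack machine,
    walking backwards from the last instruction so unused values are skipped."""
    code = []

    def lower(index):
        instr = ir[index]
        if instr[0] == "const":
            code.append(f"push {instr[1]}")
        elif instr[0] == "param":
            code.append(f"load {instr[1]}")
        else:
            lower(instr[1])
            lower(instr[2])
            code.append(instr[0])

    lower(len(ir) - 1)
    return code

def compile_expression(source):
    return generate_code(fold_constants(build_ir(source)))

print(compile_expression("x * (2 + 3)"))  # ['load x', 'push 5', 'mul']
```

The point of the separation is that `fold_constants` only ever sees the IR, so further passes can be added or reordered without touching the front end or the code generator.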
With that said, what exactly does IonMonkey do to our current benchmark scores? IonMonkey is targeted at long-running applications (we fall back to JägerMonkey for very short ones). I ran the Kraken and Google V8 benchmarks on my desktop (a Mac Pro running Windows 7 Professional). On the Kraken benchmark, Firefox 17 runs in 2602ms, whereas Firefox 18 runs in 1921ms, which is roughly 26% less time. For the graph, I converted these times to runs per minute, so higher is better.
On Google’s V8 benchmark, Firefox 15 gets a score of 8474, and Firefox 17 gets a score of 9511. Firefox 18, however, gets a score of 10188, making it 7% faster than Firefox 17, and 20% faster than Firefox 15.
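For readers who want to check the arithmetic, the percentages above can be reproduced from the quoted numbers. The helper names below are my own, and the runs-per-minute conversion assumes one "run" is one complete pass of the Kraken suite.

```python
def percent_less_time(old_ms, new_ms):
    """Reduction in run time, as a percentage of the old time."""
    return (old_ms - new_ms) / old_ms * 100

def runs_per_minute(ms_per_run):
    """Convert a time per run into a throughput figure (higher is better)."""
    return 60_000 / ms_per_run

def percent_higher_score(old_score, new_score):
    """Relative gain for a benchmark where higher scores are better."""
    return (new_score / old_score - 1) * 100

# Kraken: Firefox 17 vs. Firefox 18 (times in milliseconds)
print(round(percent_less_time(2602, 1921), 1))     # 26.2
print(round(runs_per_minute(2602), 2))             # 23.06
print(round(runs_per_minute(1921), 2))             # 31.23

# V8: Firefox 18 vs. Firefox 17 and Firefox 15 (scores)
print(round(percent_higher_score(9511, 10188), 1))  # 7.1
print(round(percent_higher_score(8474, 10188), 1))  # 20.2
```

Note that the same Kraken result looks larger when expressed as throughput: going from about 23.06 to about 31.23 runs per minute is roughly a 35% increase, even though the run time dropped by about 26%.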
We still have a long way to go: over the next few months, now with our fancy new architecture in place, we’ll continue to hammer on major benchmarks and real-world applications.
For us, one of the coolest aspects of IonMonkey is that it was a highly coordinated team effort. Around June of 2011, we created a somewhat detailed project plan and estimated it would take about a year. We started off with four interns – Andrew Drake, Ryan Pearl, Andy Scheff, and Hannes Verschore – each implementing critical components of the IonMonkey infrastructure, all pieces that still exist in the final codebase.
It’s really rewarding when everyone shares the same goals and works together to make the project a success. I’m truly thankful to everyone who has played a part.
Over the next few weeks, we’ll be blogging about the major IonMonkey components and how they work. In brief, I’d like to highlight the optimization techniques currently present in IonMonkey:
- Loop-Invariant Code Motion (LICM), or moving instructions outside of loops when possible.
- Sparse Global Value Numbering (GVN), a powerful form of redundant code elimination.
- Linear Scan Register Allocation (LSRA), the register allocation scheme used in the HotSpot JVM (and until recently, LLVM).
- Dead Code Elimination (DCE), removing unused instructions.
- Range Analysis, eliminating bounds checks (will be enabled after bug 765119).
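Two of these ideas, redundant code elimination and dead code elimination, are easy to show on a toy IR. The sketch below is not IonMonkey's implementation: it uses simple local value numbering over straight-line code (a much simpler relative of sparse GVN), and the instruction format `(dest, op, args)`, the op names, and the function names are all invented for illustration. It assumes every name is defined exactly once.

```python
PURE_OPS = {"add", "mul"}  # ops whose result depends only on their operands

def value_number(instrs):
    """Drop instructions that recompute an expression already computed earlier,
    and redirect later uses to the first result."""
    seen = {}   # (op, operands) -> name of the first instruction computing it
    alias = {}  # name of a removed instruction -> name that replaces it
    out = []
    for dest, op, args in instrs:
        args = tuple(alias.get(a, a) for a in args)
        key = (op, args)
        if op in PURE_OPS and key in seen:
            alias[dest] = seen[key]  # redundant: reuse the earlier value
            continue
        if op in PURE_OPS:
            seen[key] = dest
        out.append((dest, op, args))
    return out

def eliminate_dead_code(instrs):
    """Keep only instructions whose results are (transitively) needed by a return."""
    live = set()
    out = []
    for dest, op, args in reversed(instrs):
        if op == "return" or dest in live:
            live.update(a for a in args if isinstance(a, str))
            out.append((dest, op, args))
    out.reverse()
    return out

program = [
    ("a", "param", ()),
    ("b", "param", ()),
    ("t1", "add", ("a", "b")),
    ("t2", "add", ("a", "b")),    # same expression as t1
    ("t3", "mul", ("t1", "t2")),
    ("t4", "mul", ("a", "a")),    # result is never used
    (None, "return", ("t3",)),
]

for instr in eliminate_dead_code(value_number(program)):
    print(instr)
```

After both passes, `t2` has been folded into `t1` (so `t3` becomes `mul t1, t1`) and `t4` is gone. Running value numbering first matters here: it is what turns the second `add` into an unused instruction that no longer needs to exist.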
Of particular note, I’d like to mention that IonMonkey works on all of our Tier-1 platforms right off the bat. The compiler architecture is abstracted to require minimal replication of code generation across different CPUs. That means the vast majority of the compiler is shared between x86, x86-64, and ARM (the CPU used on most phones and tablets). For the most part, only the core assembler interface must be different. Since all CPUs have different instruction sets – ARM being totally different from x86 – we’re particularly proud of this achievement.
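As a rough illustration of that layering, the sketch below writes a piece of code generation once, against a small assembler interface, and plugs in one backend per CPU. The class and method names (`Assembler`, `load_imm`, `add`, `emit_sum`) are hypothetical and are not IonMonkey's actual interface; the backends only produce assembly-like text.

```python
from abc import ABC, abstractmethod

class Assembler(ABC):
    """Hypothetical per-CPU interface: the only part that differs per platform."""

    def __init__(self):
        self.lines = []

    @abstractmethod
    def load_imm(self, reg, value):
        """Load a small integer constant into a register."""

    @abstractmethod
    def add(self, dest, lhs, rhs):
        """Set dest to lhs + rhs."""

class X86Assembler(Assembler):
    def load_imm(self, reg, value):
        self.lines.append(f"mov {reg}, {value}")

    def add(self, dest, lhs, rhs):
        # x86 add takes two operands, so dest must hold lhs first.
        if dest != lhs:
            self.lines.append(f"mov {dest}, {lhs}")
        self.lines.append(f"add {dest}, {rhs}")

class ArmAssembler(Assembler):
    def load_imm(self, reg, value):
        self.lines.append(f"mov {reg}, #{value}")

    def add(self, dest, lhs, rhs):
        # ARM add takes three operands, so no extra move is needed.
        self.lines.append(f"add {dest}, {lhs}, {rhs}")

def emit_sum(masm, regs, a, b):
    """Shared code generation: written once, using only the interface."""
    r0, r1 = regs
    masm.load_imm(r0, a)
    masm.load_imm(r1, b)
    masm.add(r0, r0, r1)
    return masm.lines

print(emit_sum(X86Assembler(), ("eax", "ecx"), 2, 3))
print(emit_sum(ArmAssembler(), ("r0", "r1"), 2, 3))
```

The shared function never mentions a CPU. Differences such as two-operand versus three-operand `add` stay inside the backend, which is the kind of detail an assembler interface can absorb.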
Where and When?
IonMonkey is enabled by default for desktop Firefox 18, which is currently Firefox Nightly. It will be enabled soon for mobile Firefox as well. Firefox 18 becomes Aurora on October 8th, and Beta on November 20th.
* Note: TraceMonkey did have an intermediate layer. It was unfortunately very limited. Optimizations had to be performed immediately and the data structure couldn’t handle after-the-fact optimizations.