HolyJit: A New Hope

tl;dr: We believe there is a safer and easier way of writing a Jit.

Current State

Today, all browsers’ Jits share a similar design. This design makes extending the language or improving its performance time-consuming and complex, especially while trying to avoid security issues.

For instance, at the time of this writing, our Jit relies upon ~15000 lines of carefully crafted, hand-written assembly code (~36000 in Chromium’s v8). The Jit directory represents 5% of all the C++ code of Firefox, and contains 3 of the top 20 largest files of Firefox, all written by hand.

Interestingly, these files all contain code that is derived by hand from the Interpreter and a limited set of built-in functions of the JavaScript engine. But why do this by hand, when we could automate the process, saving time and reducing risk? HolyJit is exploring this possibility.

Introducing HolyJit (prototype)

This week, during the JS Team meetup, we demonstrated the first prototype of a Rust meta-Jit compiler, named HolyJit. If our experiment proves successful, we believe that a strategy based on HolyJit will let us avoid many potential bugs and let us concentrate on strategic issues. This means more time to implement JavaScript features quickly and to further improve the speed of our Jit.

[Figure: The HolyJit library instrumenting the Rust compiler to add a meta-Jit for Rust code.]

For instance, in a recent change, we extended optimization support to Array.prototype.push. What should have been a trivial modification required diving into safety-critical code, adding 135 lines of code, and reading even more code to check that we were not accidentally breaking invariants.

With HolyJit, what should have been a trivial change would effectively have been a trivial change. The following change to a hypothetical JS Jit built with HolyJit does exactly the same thing as the previous patch, i.e. it allows the Jit to inline the Array.prototype.push function when it is called with more than one argument.

 fn array_push(args: &CallArgs) -> Result<JSValue, Error> {
-    jit_inline_if!(args.len() == 1);
+    jit_inline_if!(args.len() >= 1);
     …
 }
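
To give a sense of what such an annotation could look like as plain Rust, here is a minimal, self-contained sketch. The macro body and the surrounding types are stand-ins invented for illustration; HolyJit’s actual macro hooks into the compiler rather than expanding to ordinary code.

    // Stand-in types, not SpiderMonkey's real ones.
    #[derive(Debug, Clone)]
    enum JSValue { Undefined, Int(i32) }

    #[derive(Debug)]
    struct Error;

    struct CallArgs(Vec<JSValue>);

    impl CallArgs {
        fn len(&self) -> usize { self.0.len() }
    }

    // Hypothetical stand-in for HolyJit's annotation macro. In this
    // sketch it merely evaluates the condition, keeping the code
    // ordinary, testable Rust; a meta-Jit consuming the annotation
    // would use the condition to decide whether to inline the call.
    macro_rules! jit_inline_if {
        ($cond:expr) => {
            let _should_inline: bool = $cond;
        };
    }

    fn array_push(args: &CallArgs) -> Result<JSValue, Error> {
        // Inline this builtin for calls with one or more arguments.
        jit_inline_if!(args.len() >= 1);
        // ... actual push logic elided ...
        Ok(JSValue::Undefined)
    }

    fn main() {
        let args = CallArgs(vec![JSValue::Int(1), JSValue::Int(2)]);
        println!("{:?}", array_push(&args));
    }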

By making changes self-contained and simple, we hope that HolyJit will improve the safety of our Jit engine, and let us focus on optimizations.

HolyJit Repository: https://github.com/nbp/holyjit

Thanks to David Teller, Jason Orendorff, Sean Stangl, and Jon Coppeard for proofreading this blog post.

11 responses

  1. Robert Monfera wrote:

    Sounds intriguing, but I don’t fully understand how this approach can lead to competitive performance – if you can combine otherwise interpreted functions A and B, then their automatic combination, even if inlined or running in native code, may not be as efficient as the result of data flow analysis, loop unrolling and a million other optimizations.

    A hypothetical example: the way transducers (see Rich Hickey's talk at Strange Loop) work is such that, when various operations are plumbed together (e.g. map, filter, reduce, map again…) there don’t have to be big arrays created temporarily, as would be needed by the native [].map etc. functions. Now, inspired by that, a JIT may be smart enough to recognize a chain of `someArray.map(…).filter(…).reduce(…).map(…)` as safe for pipelining through individual elements, without having to create as many large arrays as the native interpretation would. Of course data flow analysis is needed to check if it’s safe to do so. It’s a huge win to combine functions this way, and this is just one example (probably hypothetical).
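
    To sketch that pipelining concretely (in Rust, since that is HolyJit’s host language; the iterator chain below only illustrates the fused idea, it is not something HolyJit does today):

        // The fused chain makes a single pass over the elements, with
        // no intermediate vector between the stages.
        fn main() {
            let some_array = vec![1i64, 2, 3, 4, 5];

            // Eager, native-[].map style: each stage allocates a Vec.
            let mapped: Vec<i64> = some_array.iter().map(|x| x * 2).collect();
            let filtered: Vec<i64> = mapped.into_iter().filter(|x| x % 3 != 0).collect();
            let eager: i64 = filtered.into_iter().sum();

            // Fused: one pass, no temporaries, same result.
            let fused: i64 = some_array
                .iter()
                .map(|x| x * 2)
                .filter(|x| x % 3 != 0)
                .sum();

            assert_eq!(eager, fused);
        }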

    Now, how could an automatic solution, as I understand from the gh page, arrive at an optimization like this? Will there be a full-blown optimizing Rust compiler bundled in desktop and mobile Firefox browsers? What am I missing?

    1. Nicolas B. Pierron wrote:

      This approach can lead to competitive performance because providing the same optimization should require less effort. Thus, this would free some developer time to do more optimizations.

      Currently this project is only at the prototype stage, and there is no plan for adding any “deforestation” optimizations to the optimizing compiler yet. The main idea at the moment is to guide the compiler with Rust macros that annotate the code, such as the jit_inline_if macro mentioned in this article.

      1. Rudolf Olah wrote:

        You’re optimizing for development time, and this makes a lot of sense. Each line of code adds time to the code review; multiply that by 3-4 developers and you have yourself a time-sink so bad that there are now startups that want to outsource code reviews!

        You’re getting a lot of value out of this: 1 line of code to review vs. 135 translates into fewer developer-hours spent on review, meaning more time spent on other issues to fix.

        Very cool project!

      2. Steveo wrote:

        It doesn’t seem to add up: I believe Monfera was referring to runtime performance, whereas you are responding to a question about development time.

        One has to think the asm was an architectural decision. With the browser being central to almost everything, performance is critical. Assuming that matching the performance of the asm compiler and its output executable code is easy is like asserting P=NP: you need to understand the question first of all, and why it is relevant; then we can work on hope (hope for a slower browser for everyone?).

        Props to the asm team, FYI, an amazing job. If I’m misunderstanding this blog post, my apologies, but it sounds like Pierron thinks what you are doing is hopeless, and that is ridiculous.

        1. Nicolas B. Pierron wrote:

          Indeed, I forgot to answer the first part of the question.

          The way we gain performance in SpiderMonkey’s JIT is by adding guards / constraints on some values flowing through the graph.
          The idea of this experiment would be to provide the same guards as annotations, and to keep annotating the code until the optimizer can generate the same code we would have written by hand.

          The next stage of this experiment is to see how many annotations are required to provide the same amount of optimization on a dynamic language.
          What we hope to observe is that maintaining these annotations is easier than maintaining the equivalent code written by hand.
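
          As a loose sketch of what guards-as-annotations could look like in plain Rust (every name here is invented for illustration; this is not HolyJit’s actual API):

              #[derive(Debug, Clone, Copy)]
              enum JSValue { Int(i32), Double(f64) }

              // Invented guard macro: evaluates to the unboxed integer
              // when the guard holds, and otherwise returns the supplied
              // slow-path result. A meta-Jit consuming the annotation
              // could emit the same type check followed by specialized
              // integer code.
              macro_rules! jit_guard_int {
                  ($val:expr, $slow:expr) => {
                      match $val {
                          JSValue::Int(i) => i,
                          _ => return $slow,
                      }
                  };
              }

              // Generic slow path, taken whenever a guard fails.
              fn add_generic(a: JSValue, b: JSValue) -> JSValue {
                  let as_f64 = |v: JSValue| match v {
                      JSValue::Int(i) => i as f64,
                      JSValue::Double(d) => d,
                  };
                  JSValue::Double(as_f64(a) + as_f64(b))
              }

              // Guarded fast path, specialized for Int + Int.
              fn js_add(a: JSValue, b: JSValue) -> JSValue {
                  let ia = jit_guard_int!(a, add_generic(a, b));
                  let ib = jit_guard_int!(b, add_generic(a, b));
                  JSValue::Int(ia.wrapping_add(ib))
              }

              fn main() {
                  println!("{:?}", js_add(JSValue::Int(1), JSValue::Int(2)));
                  println!("{:?}", js_add(JSValue::Int(1), JSValue::Double(0.5)));
              }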

  2. Stephen Chung wrote:

    The difference is like using yacc (or bison or whatever) to build a compiler’s parser vs. writing the parser out entirely by hand.

  3. Cristian Vasile wrote:

    How is this different from BEAMJIT, the JIT designed for the Erlang/Elixir VM?

    Quote:
    “The core parts of BEAMJIT are synthesized from the C source code of BEAM, the reference Erlang abstract machine.”

    1. Nicolas B. Pierron wrote:

      The HolyJit experiment shares the same intent as BEAMJIT.

      Apart from the language, one of the differences is that BEAMJIT relies on LLVM to build its JIT compiler, whereas HolyJit has its own back-end, which might be replaced by Cretonne in the future.
      This is not yet a concern, but binary size might be a deal-breaker if the HolyJit experiment graduates.

  4. samlh wrote:

    Some other meta-jit implementations:
    1. PyPy[1] on top of RPython[2] (low-level Python)
    2. Truffle/Graal [3] on top of Java

    Rust is an especially interesting host language for a meta-jit, as it has a reasonably low-level IR (MIR) and, unlike the other meta-jit host languages I know of, does not use garbage collection.

    I am very interested to see how this turns out!

    [1] http://doc.pypy.org/en/latest/introduction.html
    [2] http://rpython.readthedocs.io/en/latest/index.html#index
    [3] https://github.com/graalvm/

    1. Cristian Vasile wrote:

      FYI.
      http://unisonweb.org/2017-04-28/new-runtime.html

      1. samlh wrote:

        Interesting – it seems that this one relies on the Java JIT inlining a tree of lambdas generated by an AST compiler. I can see how that would be an easy way to compile to the JVM without having to write the bytecode directly. I believe Truffle does something similar for compilation, though it uses a tree of method calls instead, plus some extra work to ensure the inlining works well.

        I don’t know whether the creator has gotten to the point of worrying about specializing for runtime values and falling back to more general versions on guard failure – that is one of the areas where the Truffle project has put in a fair amount of work.

        As I understand it, Truffle does specialization by choosing specialized AST node subclasses based on runtime values, and does fallback by swapping out AST nodes to more general versions on guard failures. I’m sure there are details I’m missing, however.

        Still, the Unison runtime seems like a nice waypoint between an interpreter and a specializing JIT, and looks reasonably amenable to implementing specialization later.
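
        As a loose sketch of that node-rewriting idea (all names here are invented; Truffle itself is Java and swaps node subclasses rather than enum variants):

            #[derive(Debug, Clone, Copy)]
            enum Value { Int(i64), Num(f64) }

            // An AST node that starts generic, rewrites itself to an
            // int-specialized variant after observing int operands, and
            // rewrites back to the generic variant on guard failure.
            enum AddNode { Generic, IntSpecialized }

            impl AddNode {
                fn execute(&mut self, a: Value, b: Value) -> Value {
                    match self {
                        AddNode::IntSpecialized => match (a, b) {
                            // Guard holds: stay on the fast path.
                            (Value::Int(x), Value::Int(y)) => Value::Int(x + y),
                            // Guard fails: fall back to the general node.
                            _ => {
                                *self = AddNode::Generic;
                                self.execute(a, b)
                            }
                        },
                        AddNode::Generic => {
                            // Speculate: if both operands are ints,
                            // specialize so the next call is fast.
                            if let (Value::Int(x), Value::Int(y)) = (a, b) {
                                *self = AddNode::IntSpecialized;
                                return Value::Int(x + y);
                            }
                            let as_f64 = |v: Value| match v {
                                Value::Int(i) => i as f64,
                                Value::Num(n) => n,
                            };
                            Value::Num(as_f64(a) + as_f64(b))
                        }
                    }
                }
            }

            fn main() {
                let mut node = AddNode::Generic;
                println!("{:?}", node.execute(Value::Int(1), Value::Int(2)));   // specializes
                println!("{:?}", node.execute(Value::Int(3), Value::Num(0.5))); // deopts
            }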

        Thank you for bringing it to my attention!
