Static Analysis Newslets

And, now that I’m posting again, I should offer a little news about other recent events.

Jason Orendorff has successfully used Treehydra and a few GCC attribute annotations to add a read barrier to the JS frame pointer in TraceMonkey.

I’m not exactly sure what that means myself, but I think the idea is this: Part of how TraceMonkey speeds things up is by constructing activation records for JS function calls only if and when it is necessary. But then TraceMonkey can’t safely call arbitrary native functions, because the native function might try to access those activation records, and thus crash. So TraceMonkey can’t speed up traces that call arbitrary native functions. Jason’s analysis verifies that every function that accesses the activation records is either already in a state where they are available, or first calls a function that makes them available. His analysis is pretty powerful: if a function needs the activation records only on certain paths, the analysis only requires it to have them on those paths, thus allowing it to run faster otherwise. I believe all of this is a first step toward tracing through all sorts of DOM calls, which is essential to speeding up the web as a whole.
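To make the "only on certain paths" point concrete, here is a minimal sketch in C. Every name in it (Context, Frame, ensure_frame, native_lookup) is hypothetical and invented for illustration; none of it is taken from TraceMonkey or from Jason's annotations. It only shows the shape of the property as described above: the frame is built lazily, and a function that reads it must call the barrier first, but only on the path that actually does the read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from TraceMonkey. */
typedef struct Frame {
    int slots[4];
} Frame;

typedef struct Context {
    Frame storage;          /* backing store for the lazily built frame */
    Frame *fp;              /* NULL until the frame has been materialized */
    int materialize_count;  /* how many times the barrier did real work */
} Context;

/* The "read barrier": makes the activation record available on demand. */
static Frame *ensure_frame(Context *cx) {
    if (cx->fp == NULL) {
        cx->storage.slots[0] = 42;  /* pretend to reconstruct the state */
        cx->fp = &cx->storage;
        cx->materialize_count++;
    }
    return cx->fp;
}

/* Needs the frame only on the slow path, so only that path pays for it. */
static int native_lookup(Context *cx, bool need_frame) {
    if (!need_frame)
        return 0;                   /* fast path: cx->fp is never touched */

    Frame *fp = ensure_frame(cx);   /* the barrier precedes the access */
    assert(fp != NULL);
    return fp->slots[0];
}
```

Under this reading, a static check of the kind described would accept native_lookup() as written, and would presumably reject a version that read cx->fp->slots[0] directly without the ensure_frame() call on that path.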

Graydon Hoare is just starting on an analysis application to verify that errors are handled correctly in TraceMonkey/SpiderMonkey. He’s doing the equivalent of Java checked exceptions in C. In Java, if a function is declared like ‘void foo() throws FooException’, then whenever you call foo(), Java requires that you either declare that you too can throw a FooException or catch it in a catch block.

SpiderMonkey uses C-style error reporting, i.e., the return value of a function indicates whether there was an error. So the equivalent of checked exceptions is something like this:

  // The attribute tells us that a JS_FALSE return value indicates an
  // error, and that the possible error conditions are OOMError and
  // AsmError.
  JSBool foo() __attribute__(("failure:JS_FALSE:OOMError,AsmError"));

  JSBool fooCaller() {
    JSBool rv = foo();
    if (!rv) {
      // Error! If foo() fails, we have to call an "error handler" like
      // handleAsmError().
      return JS_FALSE;
    }
    ...
  }

At a high level, this is similar to Jason’s analysis, but all sorts of practical details differ, and practical details are annoyingly significant in static analysis. Graydon is currently bravely trying out the largely undocumented ESP library APIs.
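For contrast with the failing fooCaller() above, here is a sketch of a caller that would presumably pass such a check. It is an assumption-heavy illustration, not Graydon's design: the JSBool stand-ins, the ErrorKind enum, the pending_error variable, the fail_with parameter, and handleOOMError() are all invented so the example compiles on its own. Only handleAsmError() and the two error names come from the example above.

```c
#include <assert.h>

/* Minimal stand-ins for the SpiderMonkey types, so the sketch compiles alone. */
typedef int JSBool;
#define JS_TRUE  1
#define JS_FALSE 0

/* Hypothetical: how the failing function reports which error occurred. */
typedef enum { NoError, OOMError, AsmError } ErrorKind;

static ErrorKind pending_error = NoError;
static int handled_count = 0;

/* Stand-in for foo(); the parameter exists only to force a failure here. */
static JSBool foo(ErrorKind fail_with) {
    pending_error = fail_with;
    return fail_with == NoError ? JS_TRUE : JS_FALSE;
}

/* handleAsmError() is named in the post; handleOOMError() is hypothetical. */
static void handleAsmError(void) { handled_count++; pending_error = NoError; }
static void handleOOMError(void) { handled_count++; pending_error = NoError; }

static JSBool fooCaller(ErrorKind fail_with) {
    JSBool rv = foo(fail_with);
    if (!rv) {
        /* Every declared failure mode gets a handler before we return. */
        switch (pending_error) {
        case AsmError: handleAsmError(); break;
        case OOMError: handleOOMError(); break;
        default: break;
        }
        assert(pending_error == NoError);
        return JS_FALSE;
    }
    return JS_TRUE;
}
```

The other half of the checked-exception analogy would be a caller that handles nothing and instead carries its own failure annotation, passing the obligation up to whoever calls it.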

In other news, a little while back I implemented a simple regexp compiler that translates JS regular expressions to native code by way of nanojit, TraceMonkey’s cross-platform backend. I guess the WebKit team doesn’t have a blog where they would bust me for having a really incomplete implementation, so I’ll have to do it myself. My implementation so far is less complete than theirs, but it is good enough at least to give massive speedups on SunSpider’s regexp-dna.js and simple regular expressions of that kind. I’ll really bust myself by saying I haven’t implemented “dot” yet. I think V8 has a regexp compiler too, now.

I think the really important difference between the TraceMonkey regexp compiler and WebKit’s (WREC) is in the backend. TraceMonkey uses nanojit, a cross-platform compiler that turns linear (or mostly-linear) LIR code sequences into optimized native code. WREC, the last time I looked at it, uses an x86 assembler library. Thus, WREC can very directly implement a human-designed x86 code generation and register allocation pattern, making good x86 code. Also, the assembly process is a lot quicker than nanojit’s compilation process. But unlike nanojit, it doesn’t have the opportunity to automatically optimize the code, and is x86-only. In practice, this means WREC wins on getting to the breakeven point faster (compiling takes time, which you win back as you run the regexp against more and more text characters), and running a hair faster on regexp-dna, while TraceMonkey wins on being cross-platform.
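The breakeven point mentioned above is simple arithmetic, sketched here in C. The function name and the cost model (a fixed compile cost plus a constant per-character cost) are simplifying assumptions made for illustration, and any numbers fed to it are made up; nothing here is a measurement of nanojit or WREC.

```c
#include <assert.h>

/* Number of input characters after which compiling has paid for itself.
 * All three costs are in the same arbitrary unit (say, nanoseconds):
 *   compile_cost     one-time cost of generating native code
 *   interp_per_char  cost per character when interpreting the regexp
 *   native_per_char  cost per character when running compiled code
 */
static double breakeven_chars(double compile_cost,
                              double interp_per_char,
                              double native_per_char) {
    /* If native code is not faster per character, it never breaks even. */
    assert(interp_per_char > native_per_char);
    return compile_cost / (interp_per_char - native_per_char);
}
```

With made-up costs, breakeven_chars(1000.0, 5.0, 1.0) gives 250 characters, while quadrupling the compile cost to 4000.0 pushes it to 1000. That is the tradeoff described above: a quicker assembly step reaches breakeven sooner even when the per-character speed is about the same.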

I guess as the author I’m not really qualified to say so, but I do think my little regexp compiler is pretty simple and clean, so if anyone out there is interested in getting into some compiler hackery, it might be a fun place to start. And there’s lots left to be done.

Comments

Comment from Robert O’Callahan
Time: January 9, 2009, 8:02 pm

Maybe Nick would be interested in taking up the regexp compiler…

Comment from RichB
Time: January 11, 2009, 5:31 am

I believe WREC was converted to their MacroAssembler about a month ago. Some of this was probably in preparation for their x64 port.
