Garburator works!

A few weeks ago I convinced myself that it is possible to rewrite Mozilla to avoid nsCOMPtrs on the stack. Since then I’ve changed my mind a few times and felt like I might not be able to get this rewrite working. However, after three or four false starts, I finally managed to work out a mental model of stack nsCOMPtr usage. With a combination of automatic blacklisting of tricky code, manual demacroing, and lots of help from Benjamin, I got the generated 3.2MB patch to compile.
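
To make that concrete, here is a minimal sketch of the shape of the rewrite; the nsIFoo interface and surrounding code are made up for illustration, not actual garburator output:

    // Before: a refcounting smart pointer on the stack. nsCOMPtr
    // AddRef()s in its constructor and Release()s in its destructor.
    void ProcessFoo(nsIFoo *aFoo)
    {
      nsCOMPtr<nsIFoo> foo = aFoo;
      foo->Bar();
    }

    // After: a plain pointer, so there is no refcounting traffic on
    // function entry and exit. (Illustrative only; the rewrite assumes
    // the object's lifetime is managed elsewhere.)
    void ProcessFoo(nsIFoo *aFoo)
    {
      nsIFoo *foo = aFoo;
      foo->Bar();
    }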

I am sure that there are still lots of bugs to be found, but at least we’ve discovered the pattern that the code follows. I am also sure that there are lots of unpleasant surprises to be discovered and dealt with in the near future.

The bright side is that as a result of these rewrites we should get a less buggy codebase that is easier to work on, more efficient, and compiles to smaller binaries. My other big wish is to significantly reduce the amount of C++ magic in the codebase.

I am happy that garburator works, as it means I can go back to playing with outparamdel. Hopefully, once garburator+outparamdel are applied to all possible methods, we’ll end up with relatively nice-looking C++ code and a healthy performance boost; a sketch of that end state follows.
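
Roughly (the GetRootElement method and types here are stand-ins for illustration, not output from the actual tools): an XPCOM-style getter that returns an nsresult and hands back its real result through an out parameter would instead return the pointer directly, so the caller’s stack nsCOMPtr and getter_AddRefs() glue both disappear.

    // Before: an XPCOM-style getter with an out parameter, and a
    // stack nsCOMPtr plus getter_AddRefs() glue at the call site.
    nsresult GetRootElement(nsIDOMElement **aResult);

    nsCOMPtr<nsIDOMElement> root;
    nsresult rv = GetRootElement(getter_AddRefs(root));

    // After garburator + outparamdel: a direct return and a plain
    // pointer at the call site.
    nsIDOMElement* GetRootElement();

    nsIDOMElement *root = GetRootElement();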

3 comments

  1. That is awesome!

    I’ve spent some time poking at the JavaScript rewriting stuff. Basically, I updated the parser you linked to in the previous post to handle JS1.7 and made it emit a JSON representation of the raw AST. (It may end up being a good idea to rewrite the whole thing, since there are some edge cases that risk turning the current hack into a bowl of spaghetti.)

    Some preliminary testing with traversing the JSON AST in Python looks promising. There are some issues with exact source positions for e.g. function arguments, but I figure it’s possible to further annotate the AST in a second pass for those regions. Right now I’m leaning towards using Python for the first pass, and having it emit a .js file with predetermined fields (e.g. scope information, preprocessor line number offsets, the final AST, etc.) to use as the input to Dehydra. Mostly because doing all that in C/C++ seems to be a lot of unnecessary work which is easily accomplished in Python.

  2. Wow, that sounds pretty awesome. I haven’t had a chance to look at the previous parser in any detail yet, but it’s awesome to hear about so much progress.

    I completely agree on staying the hell away from C/C++ :)

    Is your work available somewhere?

  3. Not currently. It’s in a bit of a shambles: it currently doesn’t extract literal values, the files are in the SpiderMonkey src folder, there is no makefile, etc. As soon as I clean that up, I’ll try to put it somewhere reachable.

    >echo "var x = 3;" | ./parse_test
    {"token": "LC", "start_pos": [0, 0], "end_pos": [1, 10], "arity": -1,
     "expr_list": [
       {"token": "VAR", "start_pos": [1, 0], "end_pos": [1, 5], "arity": -1,
        "expr_list": [
          {"token": "NAME", "start_pos": [1, 4], "end_pos": [1, 5],
           "arity": "name", "value": "x",
           "expr": {"token": "NUMBER", "start_pos": [1, 8],
                    "end_pos": [1, 9], "arity": 0}}]}]}