An API for parsing JavaScript

New builds of the SpiderMonkey shell include an experimental API for parsing JavaScript source code, which landed this week. For now, you have to download and build SpiderMonkey from source to use it, but hopefully we’ll include it in future versions of Firefox.

The parser API provides a single function:

Reflect.parse(src[, filename=null[, lineno=1]])

Reflect.parse takes a source string (and optionally, a filename and starting line number for source location metadata), and produces a JavaScript object representing the abstract syntax tree of the parsed source code, using the built-in parser of SpiderMonkey itself. Straightforward enough, but behind this simple entry point is a thorough API that covers the entirety of SpiderMonkey’s abstract syntax. In short, anything that SpiderMonkey can parse, you can parse, too.

Developer tools generally need a parse tree, and JavaScript isn’t an easy language to parse. With this API, it becomes much easier to write tools like syntax highlighters, static analyses, style checkers, etc. And because Reflect.parse uses the same parser that SpiderMonkey uses, it’s guaranteed to accept exactly the same programs that SpiderMonkey does.

Here’s a simple example:

js> var ast = Reflect.parse("obj.foo + 42");
js> var expr = ast.body[0].expression;
js> expr.left.property
({loc:null, type:"Identifier", name:"foo"})
js> expr.right
({
    loc: {
        source: null,
        start: {
            line: 1,
            column: 10
        },
        end: {
            line: 1,
            column: 12
        }
    },
    type: "Literal",
    value: 42
})
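
Since the result is an ordinary JavaScript object, a tool can walk it with plain property access. Here is a minimal sketch of that idea, not part of the API itself: walk and collectIdentifiers are hypothetical helpers, and sampleAst is a hand-built stand-in (with loc omitted) that assumes the node shapes shown in the transcript above, so the code also runs outside the SpiderMonkey shell.

```javascript
// Hand-built stand-in for the tree of "obj.foo + 42", trimmed to the
// fields used below. Assumption: shapes follow the transcript above.
var sampleAst = {
    type: "Program",
    body: [{
        type: "ExpressionStatement",
        expression: {
            type: "BinaryExpression",
            operator: "+",
            left: {
                type: "MemberExpression",
                object: { type: "Identifier", name: "obj" },
                property: { type: "Identifier", name: "foo" },
                computed: false
            },
            right: { type: "Literal", value: 42 }
        }
    }]
};

// Use the real parser where it exists (the SpiderMonkey shell),
// otherwise fall back to the stand-in tree.
var ast = (typeof Reflect === "object" && typeof Reflect.parse === "function")
    ? Reflect.parse("obj.foo + 42")
    : sampleAst;

// Hypothetical helper: call visit on every object with a string `type`.
function walk(node, visit) {
    if (node === null || typeof node !== "object") return;
    if (Array.isArray(node)) {
        node.forEach(function (child) { walk(child, visit); });
        return;
    }
    if (typeof node.type === "string") visit(node);
    Object.keys(node).forEach(function (key) {
        // Source locations are metadata, not syntax, so skip them.
        if (key !== "loc") walk(node[key], visit);
    });
}

// Hypothetical helper: gather the name of every Identifier node.
function collectIdentifiers(tree) {
    var names = [];
    walk(tree, function (node) {
        if (node.type === "Identifier") names.push(node.name);
    });
    return names;
}

var identifiers = collectIdentifiers(ast);
```

The walker is deliberately generic: it does not list node types, so it should keep working on node kinds it has never seen, at the cost of knowing nothing about scoping or the difference between a variable reference and a property name.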

Try it out, and feel free to give me feedback!

13 Responses to An API for parsing JavaScript

  1. How can I be notified when this feature is available? Now that we have the cryptic marking at https://bugzilla.mozilla.org/show_bug.cgi?id=533874#c59, that bug is ‘fixed’, so I won’t know about future progress. Obviously we’d like to use this in Firebug if it is fast enough.

  2. I’m experimenting with trying to de-minimize code when debugging it (with JSON/JS-based de-minimizing ‘symbols’ generated by the compressor, like YUIC or Google Closure), and this is *exactly* what I’d need so I don’t need to go and include narcissus and do it myself. I don’t suppose there’s any way this’ll make Fx 4?

  3. Is it – or will it be – possible to limit the parsed input to the subset of JavaScript that is also JSON? Being able to parse JSON not only for its actual contents but for its syntactic structure seems handy. And I actually have a current use case where my company could really use a JSON syntax highlighter.

  4. Pingback: Twitter Trackbacks...

  5. Daniel Glazman

    Wow, very nice! What about a serializer of the object result of Reflect.parse()? I have a specific use case in mind for BlueGriffon.

  6. Daniel,

    By “serializer” do you mean “convert to eval-able JavaScript source?” I’ve been meaning to add that functionality but haven’t yet gotten to it. I’ve just filed a bug on it:

    https://bugzilla.mozilla.org/show_bug.cgi?id=590755

    Thanks for the suggestion!

    Dave
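
To give a sense of what such a serializer involves, here is a minimal sketch under stated assumptions. It is not the functionality tracked in the bug above: toSource is a hypothetical helper, it handles only a handful of node types, and it assumes the node shapes from the example in the post (with an operator field on BinaryExpression and a computed flag on MemberExpression).

```javascript
// Hypothetical helper, not part of Reflect: turn a small subset of
// node types back into source text. Anything else throws.
function toSource(node) {
    switch (node.type) {
    case "Program":
        return node.body.map(function (stmt) { return toSource(stmt); }).join("\n");
    case "ExpressionStatement":
        return toSource(node.expression) + ";";
    case "BinaryExpression":
        // Parenthesize every binary expression so operator precedence
        // never has to be computed.
        return "(" + toSource(node.left) + " " + node.operator + " " +
               toSource(node.right) + ")";
    case "MemberExpression":
        return node.computed
            ? toSource(node.object) + "[" + toSource(node.property) + "]"
            : toSource(node.object) + "." + toSource(node.property);
    case "Identifier":
        return node.name;
    case "Literal":
        // Covers strings, numbers, booleans and null; regular
        // expression literals are not handled in this sketch.
        return JSON.stringify(node.value);
    default:
        throw new Error("unsupported node type: " + node.type);
    }
}

// Hand-built tree for "obj.foo + 42" (loc omitted), then a small tweak.
var tree = {
    type: "Program",
    body: [{
        type: "ExpressionStatement",
        expression: {
            type: "BinaryExpression",
            operator: "+",
            left: {
                type: "MemberExpression",
                object: { type: "Identifier", name: "obj" },
                property: { type: "Identifier", name: "foo" },
                computed: false
            },
            right: { type: "Literal", value: 42 }
        }
    }]
};

tree.body[0].expression.right.value = 43;
var regenerated = toSource(tree);   // "(obj.foo + 43);"
```

Always emitting parentheses keeps the sketch short but produces noisier output than the original source; a real serializer would need precedence handling and far more node types.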

  7. Pingback: An API for parsing JavaScript « dherman at mozilla | Firefox Blog

  8. Great stuff!

    Once the serializer part just mentioned works, this could be used to un-minify source code and other stuff. One thing I’d like to use it for myself is a JavaScript ‘optimizer’.

  9. Daniel Glazman

    @dherman yes that’s exactly what I meant. If we programmatically tweak the object’s contents, it could be _really_ cool to get the corresponding JS back :-)

  10. My reaction when reading this was exactly like Daniel’s – great, how do I turn it into source code again? Reason being, I need a way to automatically rewrite Adblock Plus code. And this looks like a simple enough solution.

    PS: Any reason this page says “57 comments” when I only see four?

  11. [Apologies to the commenters whose comments languished in my spam filter and went unnoticed by me till now.]

    @johnjbarton: I’ve created https://bugzilla.mozilla.org/show_bug.cgi?id=590973 for enabling this in chrome. It won’t happen till post-Firefox 4, but it shouldn’t be hard to add.

    @Gijs: I’m afraid not.

    @Stuart: I plan to generalize the API to take an optional builder object:

    https://bugzilla.mozilla.org/show_bug.cgi?id=569487

    You could use the builder to generate custom formats, and you should also be able to use it to restrict the parser to subsets of the grammar. I think this might be a better approach, rather than implementing several subsets natively (JSON, ES3, ES5, …). What do you think?

    @Wladimir: I get hundreds of spam comments a day, and the WP spam net we have reports them as comments until I delete them, even though it doesn’t show them. :(

    Thanks, everyone, for the great feedback!

  12. @dherman Having support for custom builders sounds really useful, but I don’t think it supersedes the usefulness of being able to conveniently specify a well-known subset, especially JSON, which has built-in native support (JSON.parse). How about out-of-the-box builders that represent the well-known subsets? Reflect.parse(…, Reflect.builders.json)?

    On the other hand, what if you want to parse the JSON subset and *also* call a custom builder?

  13. @stuart: My gut feeling is just that it’s putting the cart a little before the horse. Let’s get the first round of functionality down. Once we can specify custom builders in JS, we can experiment with them and see which ones are the most useful.

    Re: subsetting with a custom builder, you should be able to combine/chain builders.