I’ve moved!

My blog has a new home:

http://calculist.org

Hope to see you there!

Now that’s a nice ‘stache

There was something that bugged me about Allen Wirfs-Brock’s proposed “monocle-mustache” syntax for JS: it updates the properties of the object on the left-hand side, but it doesn’t look like it’s updating properties; it looks like some strange combination of creating a new object and accessing properties of the left-hand side:

this.{ foo: 17, bar: "hello", baz: true };

Then this week a couple of things happened that got me thinking. First, Reg Braithwaite (aka @raganwald) posted his proposal for a CoffeeScript syntax to support the “fluent style” of programming, inspired by Smalltalk’s message cascades:

array
    .pop()
    .pop()
    .pop()

path
    .moveTo(10, 10)
    .stroke("red")
    .fill("blue")
    .ellipse(50, 50)

Next I saw Bob Nystrom’s post about a cascade syntax proposal for Dart, which is a small variation on Allen’s monocle-mustache:

document.query('#myTable').{
    queryAll('.firstColumn').{
        style.{
            background = 'red',
            border = '2px solid black'
        },
        text = 'first column'
    },
    queryAll('.lastColumn').{
        style.background = 'blue',
        text = 'last column'
    }
};

This is really just a tiny tweak to Allen’s original syntax, but it makes a world of difference: it uses the = sign for assignment. Much clearer!

Now, assignments with commas don’t really look like JS. But since the point of cascades is to do imperative sequencing — i.e., to ignore the result of intermediate message sends and do each message send on the original object from the left-hand side — it makes perfect sense to use a statement-like syntax:

array.{
    pop();
    pop();
    pop();
};

path.{
    moveTo(10, 10);
    stroke("red");
    fill("blue");
    ellipse(50, 50);
};

this.{
    foo = 17;
    bar = "hello";
    baz = true;
};

Even sweeter, JavaScript’s automatic semicolon insertion kicks in and lets you do this very concisely:

array.{
    pop()
    pop()
    pop()
};

path.{
    moveTo(10, 10)
    stroke("red")
    fill("blue")
    ellipse(50, 50)
};

this.{
    foo = 17
    bar = "hello"
    baz = true
};

What’s so great about a cascade syntax is that when you want to do imperative programming on the same object, you don’t have to rely on the API creator to return this from every method. As the client of the API, you don’t have to care what the method returns; you throw away the result anyway.
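
To see why this matters, consider a hypothetical Path API (the names here are made up for illustration). Chaining only works when every method remembers to return this, and a single omission breaks the client’s chain:

function Path() { this.ops = []; }
Path.prototype.moveTo = function(x, y) {
    this.ops.push(["moveTo", x, y]);
    return this; // chaining works only because the author returns this
};
Path.prototype.stroke = function(color) {
    this.ops.push(["stroke", color]);
    // oops: no return, so new Path().moveTo(0, 0).stroke("red").moveTo(5, 5)
    // throws a TypeError, because stroke() produces undefined
};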

In fact, this is one of the things I don’t like about method chaining in JS today: you can’t actually tell whether a method is mutating the current object and returning it or producing some entirely new object. And when you mix the two styles, it gets even blurrier. Look at Bob’s example from jQuery:

$('#myTable')
    .find('.firstColumn')
        .css({'background': 'red',
              'border': '2px solid black'})
        .text('first column')
    .end()
    .find('.lastColumn')
        .css('background','blue')
        .text('last column');

You just have to know that .find() produces a new object, and .css() and .text() modify it and produce the same object, but nothing about the syntax tells you this. (And personally, I always felt that .end() was where jQuery’s API jumped the shark.)

With a cascade syntax in addition to normal method calls, it becomes much easier to distinguish when you’re doing something to the same object from when you’re selecting new objects:

$('#myTable').{
    find('.firstColumn').{
        style.{
            background = 'red'
            border = '2px solid black'
        }
        text = 'first column'
    }
    find('.lastColumn').{
        style.background = 'blue'
        text = 'last column'
    }
};
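
For comparison, the closest approximation of cascade semantics in today’s JS is a helper function (a made-up sketch) that runs a block with the object as the receiver and always hands back the object, ignoring the block’s result:

function cascade(obj, body) {
    body.call(obj); // run the block with obj as the receiver
    return obj;     // hand back the original object, not the block's result
}

cascade(path, function() {
    this.moveTo(10, 10);
    this.stroke("red");
    this.fill("blue");
    this.ellipse(50, 50);
});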

Gist here. These are just my initial thoughts; I’ll have to work on fleshing out a full proposal, including the full grammar spec.

Liar

I’ve claimed in a couple talks recently that the ES6 expression

new Foo(...args)

is impossible to implement in ES3 and only became possible in ES5 with Function.prototype.bind:

Function.prototype.applyNew = function applyNew(a) {
    // [,] is an array with a hole at element 0, so bind receives undefined
    // as the bound receiver, followed by the elements of a as bound arguments
    return new (this.bind.apply(this, [,].concat(a)))();
};
Foo.applyNew(args);

This works by closing the function over the arguments array a with an undefined receiver (the [,] expression creates an array of length 1 with a hole at element 0). Since a function created by Function.prototype.bind ignores the bound receiver when called with new, this has the same behavior as the ES6 expression.
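
A quick sketch of the bind behavior this relies on (Point is a made-up example constructor):

function Point(x, y) { this.x = x; this.y = y; }
var BoundPoint = Point.bind(undefined, 1); // bound receiver and first argument
var p = new BoundPoint(2); // new ignores the bound receiver, keeps the bound argument
// p instanceof Point === true, p.x === 1, p.y === 2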

But I should not have counted ES3 out so easily — with the magic of eval, many impossible things are possible.

Function.prototype.applyNew = function applyNew(a) {
    // Build the source "new this(a[0],a[1],...)" and evaluate it;
    // inside a direct eval, this and a are still in scope.
    // (A plain loop keeps it ES3-safe; Array.prototype.map is ES5.)
    var args = [];
    for (var i = 0; i < a.length; i++)
        args[i] = "a[" + i + "]";
    return eval("new this(" + args.join(",") + ")");
};
Foo.applyNew(args);

Thanks to Trevor Norris for awakening me from my dogmatic slumber. His approach won’t work for functions like String that have different call and construct behavior, but he reminded me that I’d seen this solved before with eval.
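
For reference, here’s how String’s call and construct behavior differ:

String("hi")            // the primitive string "hi"
new String("hi")        // a String wrapper object
typeof String("hi")     // "string"
typeof new String("hi") // "object"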

Edit: Oops, the applyNew method doesn’t take a function argument; it uses this as the function. That’s what I get for posting without testing!

contracts.coffee

Mozilla Research intern Tim Disney has just released contracts.coffee, a dialect of CoffeeScript with a sweet syntax for wrapping JavaScript objects with contracts that are automatically checked at runtime (“assert on steroids”). Tim’s got a great tutorial with lots of examples, so I won’t try to reproduce it here. Just check it out!

Contracts.coffee is a great demonstrator of the power of JavaScript proxies, which are coming in the ES6 standard, have been shipping in SpiderMonkey since Firefox 4, and are in the works for V8. Contracts.coffee uses proxies to create wrapper objects that do all the proper checking to enforce your contracts.
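
To give a flavor of the mechanism, here’s a minimal made-up sketch in the ES6 Proxy style (not contracts.coffee’s actual implementation), enforcing that one property only ever holds numbers:

function numericProperty(target, prop) {
    return new Proxy(target, {
        set: function(obj, key, value) {
            if (key === prop && typeof value !== "number")
                throw new TypeError(prop + " must be a number");
            obj[key] = value;
            return true; // signal a successful assignment
        }
    });
}

var account = numericProperty({ balance: 0 }, "balance");
account.balance = 100;    // ok
account.balance = "lots"; // throws TypeError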

It’s also a nice experiment in language design (using ideas from Eiffel and Scheme), which we’ll be paying attention to along the road to ES7 and beyond.

JS static analysis projects

Taras was reflecting on his attempts to build community around static analysis tools for open source software. Taras has built some impressive tools for analyzing our massive C++ codebase at Mozilla, and has brought static analysis into the Mozilla lexicon. Lately, our research group has been carrying the torch by creating new tools for analyzing JavaScript.

Last summer, our research intern Dimitris Vardoulakis built Doctor JS, a static analysis for JavaScript based on Dimitris’s award-winning CFA2 algorithm. Our first uses of Doctor JS were a type inference service and js-ctags, which generates output that IDEs can use for auto-completion.

This summer, we’re starting more projects with Doctor JS. With the help of Dimitris and our intern Rezwana Karim, we’re investigating event listener registration patterns in Firefox addons to test for compatibility issues with Electrolysis. Another intern, Vineeth Kashyap, is modifying Doctor JS to do static taint tracking as a way of doing security analyses that detect leaks of chrome-privileged data into content-privileged code.

I’d like Doctor JS to get to a point where it’s more scriptable—a “semantic grep” tool like Dehydra. I’m sure we’ll crib some notes from Taras’s work. But for a first step we’re just going to adapt the tool as needed to the specific applications we’re using it for. Hopefully this will give us a better feel for how to generalize it down the road to be more user-extensible.

The JS parser API has landed

Today I pushed a patch to our development repository to expose the JavaScript Parser API to code with chrome privileges. I’ve fixed all known major bugs, and it’s been holding up under sustained attack from Jesse Ruderman’s fuzz tester without crashing for the better part of a year. So hopefully this means the landing will stick. If so, then this should land in Nightly within the week and in Aurora by next week. If all goes well, the library will ship with Firefox 7.

Once it ships, addons will be able to write:

> Components.utils.import("resource://gre/modules/reflect.jsm");
> var expr = Reflect.parse("obj.foo + 42").body[0].expression
> expr.left.property.name
"foo"
> expr.right.value
42

I hope people find this useful for introspecting on code, building development tools, or writing static analyses. I also hope people come up with cool new applications I haven’t even thought of.
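
As a taste of what a simple analysis might look like (a made-up sketch; findEvalCalls is not part of the API), here’s a recursive walk over the parse tree that records the line of every direct call to eval:

function findEvalCalls(node, lines) {
    if (!node || typeof node !== "object")
        return lines;
    if (node.type === "CallExpression" &&
        node.callee.type === "Identifier" &&
        node.callee.name === "eval")
        lines.push(node.loc.start.line);
    for (var key in node)
        findEvalCalls(node[key], lines); // visit every child node
    return lines;
}

findEvalCalls(Reflect.parse('eval("1 + 1"); var x = 42;'), []); // [1]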

I’m worried about structured clone

When I first heard about web workers using structured clone, I was nervous. The more I look into it, the more I think the whole idea of structured clone — regardless of what it’s used for — is problematic in and of itself.

Implicit copying is rarely what you want

When data is mutable, it needs to be managed by the programmer who created it, because they know what they’re doing with it. When the language or API implicitly copies the data, the programmer has no control over it. Granted, structured clone is only used in a few published places in HTML5, but it would be preferable to have explicit ways to construct immutable data, and only be able to send immutable data between workers. (Or ways to safely transfer ownership of mutable data, but that’s irrelevant to the question of structured clone.)

This raises the question of how to express immutable data in JavaScript. That’s something that Brendan has recently blogged about, and it’s worth adding to the language. But structured clone strikes me as a hack around the problem that we don’t currently have convenient ways of creating structured, immutable data.
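
(ES5’s Object.freeze hints at the idea but is only shallow, as this sketch shows:)

var config = Object.freeze({ retries: 3, endpoints: ["a", "b"] });
config.retries = 10;        // silently ignored (a TypeError in strict mode)
config.endpoints.push("c"); // still allowed: freezing doesn't reach nested objects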

Structured clone ignores huge swathes of JavaScript data

Structured clone is only defined on a handful of built-in JavaScript and DOM object types. JavaScript objects are part of a deeply intertwined, deeply mutable object graph, and structured clone simply ignores most of that graph. An operation that uses structured clone will let you use any Object instance, regardless of what sorts of invariants it’s set up to expect based on its prototype chain, its getters or setters, its connectedness to the object graph… but structured clone will blithely disregard much of that structure.
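
For example, posting an instance of a user-defined constructor to a worker silently strips its prototype (a sketch; worker.js is a stand-in for any worker script):

function Point(x, y) { this.x = x; this.y = y; }
Point.prototype.norm = function() {
    return Math.sqrt(this.x * this.x + this.y * this.y);
};

var worker = new Worker("worker.js");
worker.postMessage(new Point(3, 4));
// In the worker, event.data is the plain object { x: 3, y: 4 }:
// the prototype chain is gone, so event.data.norm is undefined.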

Again, if we had some simple, immutable data structures like tuples and records, these would be totally reasonable things to share between workers.

Automatically traversing mutable data structures is a code smell

There’s a famous paper by Henry Baker that specifically argues that cloning mutable data structures rarely has a “one size fits all” solution, and that mutable data can’t be usefully traversed automatically by general-purpose libraries. I have a sense that whenever some API automatically, deeply traverses mutable data structures, it’s unlikely to be doing the right thing.

Structured clone is not future-proof

Structured clone is simply defined on a grab-bag of built-in datatypes, and the rest are treated as plain old objects. This means it’s going to behave very strangely on new data types that get introduced in future versions of ECMAScript, such as maps and sets, or in user libraries.

Alternatives?

A more adaptable approach might be for ECMAScript to specify “transmittable” data structures. As we add immutable data structures, they could be defined to be transmittable, and we could even specify custom internal properties of certain classes of mutable objects with transmission-safe semantics such as ownership transfer.

Doing these kinds of things well, in a way that’s simple, clear and predictable, deserves built-in language support.

A semantics for JSON

[Warning: here there be PL geekery.]

In a discussion about Lloyd Hilaiel’s cool new JSONSelect library, Rob Sayre pointed out that the library shouldn’t permit ordered access to JSON objects. This got me thinking about the fact that the JSON RFC defines a syntax without a semantics. And yet the introduction makes a semantic claim:

An object is an unordered collection of zero or more name/value pairs, where a name is a string and a value is a string, number, boolean, null, object, or array.

This claim from the introduction isn’t specified anywhere in the RFC. But it’s not too hard to provide a semantics that backs it up. To keep things simple, let me just assume an existing semantics for Unicode strings and IEEE754 double-precision floating-point numbers, where UnicodeValue(s) produces the Unicode string for a JSON string literal s, and DoubleValue(n) produces the IEEE754 double-precision floating-point value for the JSON number literal n.

Here goes!

Values

A JSON value is a member of one of the following disjoint sets:

  • Unicode strings
  • IEEE754 numbers
  • the constants { true, false, null }
  • JSON arrays
  • JSON objects

A JSON array is a finite sequence of JSON values. (I’ll write [] for the empty array, [ v ] for the singleton array containing the JSON value v, and a₁ ⋅ a₂ for array concatenation.)

A JSON object is a finite set of (Unicode string, JSON value) pairs.

Operations

Selecting the keys of a JSON object:

keys(o) = { s | (s, v) ∈ o }

Extending a JSON object:

extend(o, s, v) = (o ∖ { (s, v′) }) ∪ { (s, v) }
if (s, v′) ∈ o
extend(o, s, v) = o ∪ { (s, v) }
if s ∉ keys(o)

Looking up properties on a JSON object:

lookup(o, s) = v
if (s, v) ∈ o

Note that lookup is partial: lookup(o, s) is unspecified when s ∉ keys(o).

Interpretation

Now that we have the data model, we can define the interpretation of the syntax:

Value(string) = UnicodeValue(string)
Value(number) = DoubleValue(number)
Value({}) = ∅
Value({ members }) = ObjectValue(members)
Value([]) = []
Value([ elements ]) = ArrayValue(elements)

ObjectValue(s : v) = { (UnicodeValue(s), Value(v)) }
ObjectValue(members, s : v) = extend(ObjectValue(members), UnicodeValue(s), Value(v))

ArrayValue(v) = [ Value(v) ]
ArrayValue(v, elements) = Value(v) ⋅ ArrayValue(elements)

That’s it!

With this data model, you can now show that the order of object properties doesn’t matter, except that when there are duplicate keys, later ones replace earlier ones.
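
For instance, interpreting the (pathological) literal { "a" : 1, "a" : 2 } and unfolding the definitions (writing Unicode and double values as themselves):

ObjectValue("a" : 1, "a" : 2)
    = extend(ObjectValue("a" : 1), "a", 2)
    = extend({ ("a", 1) }, "a", 2)
    = ({ ("a", 1) } ∖ { ("a", 1) }) ∪ { ("a", 2) }
    = { ("a", 2) }

The later binding wins, and since objects are just sets of pairs, any two orderings of distinct keys produce the same object.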

A failure of imagination

First, let me say that I’m enjoying my first JSConf very much. It’s obvious the organizers and speakers have all done a ton of work to make it a great event. I hope this post doesn’t come across as ungrateful or dyspeptic. But I feel like I should say something, and I don’t think I’m alone in my reaction:

[Embedded tweet from @JennLukas]

There was a little lunchtime performance that just went places it didn’t need to go. There were jokes about “transvestite hipsters in downtown Portland” and about women in technology. They called out women in the audience, and said they were going to bring one up on stage by picking a woman’s name at random. First they brought up a man whose name looks like “Jan” and then asked him questions as if he was a woman (really, this was the level of humor). They asked him how it feels to be a woman in technology, surrounded by “smelly, sweaty men.” They joked about how they only discovered that this guy was a man “late last night” (get it?). Then they did call up a woman; I don’t know if she’d agreed in advance.

The questions they asked her were pretty tame. All in all, it wasn’t particularly out of control. But the whole time I was sitting there just praying it wasn’t going to get worse. And I imagine there were women in the audience feeling nervous that they were going to be called up and embarrassed or humiliated.

I’m sure the people putting on the show were nervous and just trying to give the audience a good show. And I’m sure they felt they didn’t cross any lines (even though the queer jokes pretty much did). But this is the part that I find really sad. It’s a failure of imagination, especially as a performer, not to be able to empathize with the audience: how were we supposed to know in advance how far it was going to go? Why would you make some of your audience feel intimidated or anxious just in the name of cheap laughs?

And let’s face it, this humor is cheap. These are the kinds of jokes you use when you’ve got nothing else. Comedy is really, really hard. If you aren’t a professional comic, maybe you just shouldn’t try. But at least stay away from jokes that isolate and intimidate your own audience. Some places are set up for raunchy or deliberately offensive humor, and that’s fine. But this is a technology conference.

Update: I hope this post won’t lead people to generalize about the JS community or JSConf. I stand by what I said; Chris and Laura work to make JSConf fun and inclusive for everyone, and I don’t think the guys who did this bit yesterday lived up to that. But as I said in my post, JSConf really has been incredible.

Who says JavaScript I/O has to be ugly?

I’m excited to announce the latest version of jsTask, which now supports first-class, synchronizable events. A Sync object represents an event that may or may not have fired yet, and which a task can block on.

jsTask already makes it easy to write tasks that block on I/O:

var foo = yield read("foo.json");
var bar = yield read("bar.json");

But the power lurking behind that code is the fact that read actually returns a Sync value, which can be used to build more interesting synchronizable events. For example, we can do a join on several concurrent operations, so that neither has to wait for the other before starting:

var [ foo, bar ] = yield join(read("foo.json"), read("bar.json"));

Or we can choose from several different concurrent operations, letting whichever completes first produce the result (and cancelling the others):

var file = yield choose(read("a.json"), read("b.json"), read("c.json"));

With combinators like these, you can start to build interesting idioms, such as the timeout method, which I went ahead and built into the library:

try {
    var file = yield read("foo.json").timeout(1000);
} catch (e) {
    // I/O error or timed out
}
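
For the curious, here’s roughly how a scheduler can drive such a task using SpiderMonkey’s JS 1.7 generators (a simplified sketch, not jsTask’s actual implementation; the onComplete callback hook on Sync values is hypothetical):

function spawn(genFunc) {
    var gen = genFunc();
    function step(result) {
        var sync;
        try {
            sync = gen.send(result); // run the task to its next yield
        } catch (e) {
            if (e instanceof StopIteration)
                return;              // the task completed normally
            throw e;
        }
        sync.onComplete(step);       // resume when the event fires
    }
    step(undefined);
}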

This is just the beginning: I’ll be implementing Sync wrappers for all the major DOM I/O events, and I’ll keep experimenting with APIs for common use cases and helpful idioms.

I believe jsTask would also be useful for server-side JS frameworks like node.js, which use non-blocking I/O heavily. But first we have to get generators into V8 and ECMAScript!