Category Archives: Uncategorized

Liar

I’ve claimed in a couple talks recently that the ES6 expression

new Foo(...args)

is impossible to implement in ES3 and only became possible in ES5 with Function.prototype.bind:

Function.prototype.applyNew = function applyNew(a) {
    return new (this.bind.apply(this, [,].concat(a)))();
};
Foo.applyNew(args)

This works by closing the function over arguments array a with an undefined receiver (the [,] expression creates an array of length 1 but a hole at element 0). Since a function created by Function.prototype.bind ignores the bound receiver when called with new, this has the same behavior as the ES6 expression.

But I should not have counted ES3 out so easily — with the magic of eval, many impossible things are possible.

Function.prototype.applyNew = function applyNew(a) {
    return eval("new this(" +
                a.map(function(x, i) {
                          return "a[" + i + "]";
                      }).join(",") +
                ")");
};
Foo.applyNew(args)

Thanks to Trevor Norris for awakening me from my dogmatic slumber. His approach won’t work for functions like String that have different call and construct behavior, but he reminded me that I’d seen this solved before with eval.

Edit: Oops, the applyNew method doesn’t take a function argument, it uses this as the function. That’s what I get for posting without testing!

I’m worried about structured clone

When I first heard about web workers using structured clone, I was nervous. The more I look into it, the more I think the whole idea of structured clone — regardless of what it’s used for — is problematic in and of itself.

Implicit copying is rarely what you want

When data is mutable, it needs to be managed by the programmer who created it, because they know what they’re doing with it. When the language or API implicitly copies the data, the programmer has no control over it. Granted, structured clone is only used in a few published places in HTML5, but it would be preferable to have explicit ways to construct immutable data, and only be able to send immutable data between workers. (Or ways to safely transfer ownership of mutable data, but that’s irrelevant to the question of structured clone.)

This raises the question of how to express immutable data in JavaScript. That’s something that Brendan has recently blogged about, and it’s worth adding to the language. But structured clone strikes me as a hack around the problem that we don’t currently have convenient ways of creating structured, immutable data.

Structured clone ignores huge swathes of JavaScript data

Structured clone is only defined on a handful of JavaScript built-in JavaScript and DOM object types. JavaScript objects are part of a deeply intertwined, deeply mutable object graph, and structured clone simply ignores most of that graph. An operation that uses structured clone will let you use any Object instance, regardless of what sorts of invariants it’s set up to expect based on its prototype chain, its getters or setters, its connectedness to the object graph… but structured clone will simply blithely disregard much of that structure.

Again, if we simply had some simple, immutable data structures like tuples and records, these would be totally reasonable things to share between workers.

Automatically traversing mutable data structures is a code smell

There’s a famous paper by Henry Baker that specifically argues that cloning mutable data structures rarely has a “one size fits all” solution, and that mutable data can’t be usefully traversed automatically by general purpose libraries. I have a sense that whenever some API is automatically, deeply traversing mutable data structures, it’s probably unlikely to be doing the right thing.

Structured clone is not future-proof

Structured clone is simply defined on a grab-bag of built-in datatypes, and the rest are treated as plain old objects. This means it’s going to behave very strangely on new data types that get introduced in future versions of ECMAScript, such as maps and sets, or in user libraries.

Alternatives?

A more adaptable approach might be for ECMAScript to specify “transmittable” data structures. As we add immutable data structures, they could be defined to be transmittable, and we could even specify custom internal properties of certain classes of mutable objects with transmission-safe semantics such as ownership transfer.

Doing these kinds of things well, in a way that’s simple, clear and predictable, deserves built-in language support.

A semantics for JSON

[Warning: here there be PL geekery.]

In a discussion about Lloyd Hilaiel‘s cool new JSONSelect library, Rob Sayre pointed out that the library shouldn’t permit ordered access to JSON objects. This got me thinking about the fact that the JSON RFC defines a syntax without a semantics. And yet the introduction makes a semantic claim:

An object is an unordered collection of zero or more name/value pairs, where a name is a string and a value is a string, number, boolean, null, object, or array.

This claim from the introduction isn’t specified anywhere in the RFC. But it’s not too hard to provide a semantics that does. To keep things simple, let me just assume an existing semantics for Unicode strings and IEEE754 double-precision floating-point numbers, where UnicodeValue(s) produces the Unicode string for a JSON string literal s, and DoubleValue(n) produces the IEEE754 double-precision floating-point number value for the JSON number literal n.

Here goes!

Values

A JSON value is a member of one of the following disjoint sets:

  • Unicode strings
  • IEEE754 numbers
  • the constants { true, false, null }
  • JSON arrays
  • JSON objects

A JSON array is a finite sequence of JSON values. (I’ll write [] for the empty array, [ v ] for the singleton array containing the JSON value v, and a1a2 for array concatenation.)

A JSON object is a finite set of (Unicode string, JSON value) pairs.

Operations

Selecting the keys of a JSON object:

keys(o) = { s | (s, v)o }

Extending a JSON object:

extend(o, s, v) = (o{ (s, v) }) ∪ { (s, v) }
if (s, v)o
extend(o, s, v) = o{ (s, v) }
if skeys(o)

Looking up properties on a JSON object:

lookup(o, s) = v
if (s, v)o

Note that lookup is partial: lookup(o, s) is unspecified when skeys(o).

Interpretation

Now that we have the data model, we can define the interpretation of the syntax:

Value(string) = UnicodeValue(string)
Value(number) = DoubleValue(number)
Value({}) =
Value({ members }) = ObjectValue(members)
Value([]) = []
Value([ elements ]) = ArrayValue(elements)

ObjectValue(s : v) = { (UnicodeValue(s), Value(v)) }
ObjectValue(members, s : v) = extend(ObjectValue(members), UnicodeValue(s), Value(v))

ArrayValue(v) = [ Value(v) ]
ArrayValue(v, elements) = Value(v) ⋅ ArrayValue(elements)

That’s it!

With this data model, you can now show that the order of object properties doesn’t matter, except insofar as when there are duplicate keys, later ones replace earlier ones.

A failure of imagination

First, let me say that I’m enjoying my first JSConf very much. It’s obvious the organizers and speakers have all done a ton of work to make it a great event. I hope this post doesn’t come across as ungrateful or dyspeptic. But I feel like I should say something, and I don’t think I’m alone in my reaction:

@JennLukas tweet

There was a little lunchtime performance that just went places it didn’t need to go. There were jokes about “transvestite hipsters in downtown Portland” and about women in technology. They called out women in the audience, and said they were going to bring one up on stage by picking a woman’s name at random. First they brought up a man whose name looks like “Jan” and then asked him questions as if he was a woman (really, this was the level of humor). They asked him how it feels to be a woman in technology, surrounded by “smelly, sweaty men.” They joked about how they only discovered that this guy was a man “late last night” (get it?). Then they did call up a woman; I don’t know if she’d agreed in advance.

The questions they asked her were pretty tame. All in all, it wasn’t particularly out of control. But the whole time I was sitting there just praying it wasn’t going to get worse. And I imagine there were women in the audience feeling nervous that they were going to be called up and embarrassed or humiliated.

I’m sure the people putting on the show were nervous and just trying to give the audience a good show. And I’m sure they felt they didn’t cross any lines (even though the queer jokes pretty much did). But this is the part that I find really sad. It’s a failure of imagination, especially as a performer, not to be able to empathize with the audience: how were we supposed to know in advance how far it was going to go? Why would you make some of your audience feel intimidated or anxious just in the name of cheap laughs?

And let’s face it, this humor is cheap. These are the kinds of jokes you use when you’ve got nothing else. Comedy is really, really hard. If you aren’t a professional comic, maybe you just shouldn’t try. But at least stay away from jokes that isolate and intimidate your own audience. Some places are set up for raunchy or deliberately offensive humor, and that’s fine. But this is a technology conference.

Update: I hope this post won’t lead people to generalize about the JS community or JSConf. I stand by what I said; Chris and Laura work to make JSConf fun and inclusive for everyone, and I don’t think the guys who did this bit yesterday lived up to that. But as I said in my post, JSConf really has been incredible.

Who says JavaScript I/O has to be ugly?

I’m excited to announce the latest version of jsTask, which now supports first-class, synchronizable events. A Sync object represents an event that may or may not have fired yet, and which a task can block on.

jsTask already makes it easy to write tasks that block on I/O:

var foo = yield read("foo.json");
var bar = yield read("bar.json");

But the power lurking behind that code is the fact that read actually returns a Sync value, which can be used to build more interesting synchronizable events. For example, we can do a join on several concurrent operations, so that one doesn’t have to wait for the other before initiating:

var [ foo, bar ] = yield join(read("foo.json"), read("bar.json"));

Or we can choose from several different concurrent operations, letting whichever completes first produce the result (and cancelling the others):

var file = yield choose(read("a.json"), read("b.json"), read("c.json"));

With combinators like these, you can start to build interesting idioms, such as the timeout method, which I went ahead and built in to the library:

try {
    var file = yield read("foo.json").timeout(1000);
} catch (e) {
    // I/O error or timed out
}

This is just the beginning: I’ll be implementing Sync wrappers for all the major DOM I/O events, and I’ll keep experimenting with API’s for common use cases and helpful idioms.

I believe jsTask would also be useful for server-side JS frameworks like node.js, which use non-blocking I/O heavily. But first we have to get generators into V8 and ECMAScript!

structs.js

I’ve written a first draft prototype of structs.js, the binary data library we’re working on for the ECMAScript standard. It’s implemented using typed arrays, but it’s my hope that it can eventually subsume the important use cases of typed arrays. (Typed arrays let you do questionable things like cast data in platform-specific ways.)

For now the implementation only works in Firefox 4, because it’s using proxies. I have ideas about how to make it work in Safari and Chrome by backing off from proxies if they’re not available, and maybe even IE by shimming typed arrays using ordinary arrays.

Try it out! Feel free to send me feedback directly, or to es-discuss.

A test for proper tail calls

Here are two versions of the same little function — or, I should say, almost the same little function:

function nonTail(n) {
    if (n > 0)
        nonTail(n - 1);
    return;
}

function tail(n) {
    if (n > 0)
        return tail(n - 1);
}

Neither function does anything particularly interesting, but it’s the way they do what they do that’s interesting. They’re both recursive, but only the second one is tail-recursive: it calls itself as the final result of a return statement. So even for very large values of n, in an implementation with proper tail calls, the tail-recursive function shouldn’t die with a stack overflow.

Here’s a test case to determine whether an engine supports proper tail calls. It should be fairly robust, even in the face of engines with different stack implementations and different VM limits: the first part of the test tries to determine experimentally what is a likely maximum stack depth for this function. (It’s doing a binary search, so it should be bounded above by about 31 × k stack frames, where k is the engine’s maximum stack depth for nonTail.) The second part then tries running the tail-recursive function with that depth and checks whether the engine throws an exception.

var lo = 0;
var hi = Math.pow(2, 31) - 1;

while (hi !== lo) {
    var mid = lo + (((hi - lo) / 2) | 0);
    try {
        nonTail(mid);
        lo = mid + 1;
    } catch (e) {
        hi = mid;
    }
}

var min = lo;

try {
    tail(min);
    tail(min * 2);
    print("test passed!");
} catch (e) {
    print("test failed!");
}

As we work on test suites for ECMAScript conformance, this kind of test should be useful for checking whether engines support proper tail calls.

PS I should point out that no current browser JS engines support proper tail calls, so right now this test will fail in all of them.

Proper tail calls in Harmony

In the last ECMAScript committee meeting, we promoted the proper tail calls proposal from “strawman” to official “proposal” status. There’s still work to do to make it into a polished spec, and it ain’t over till it ships, but I’m really excited that we could be making proper tail calls a guaranteed feature of JavaScript.

In short, proper tail calls make it possible for a function to delegate its result to a function call without allocating unnecessary space in the runtime system. In most languages, function calls always require allocating a stack frame, and stack frames are only deallocated on return. But this implementation approach isn’t necessary and even makes it impossible for programmers to write certain kinds of programs. Language implementations that guarantee proper tail calls promise that, whenever a function delegates to a function call to provide its final result (according to a fixed set of rules about what such delegation looks like syntactically), the runtime system will not indefinitely keep the caller’s frame allocated while the callee runs.

Why does this matter? For one, it makes it possible to break up large while or for loops into separate functions that tail-call one another, passing their state around as arguments instead of updating a shared piece of state. This makes it easier to understand the state and to build mock states in order to test the pieces of the algorithm independently. (I wonder if it might even open up opportunities to parallelize pieces of an algorithm into separate web workers. I’ve got another post coming on how some of Brendan’s recent ideas for Harmony might help with web workers.) Generally speaking, tail calls make it possible to modularize program control. Whether it’s just for the occasional one-off delegation (which proper tail calls makes as space-efficient as a goto), or for writing in full-on continuation-passing style (which, without proper tail calls, are more or less destined to blow the stack), this offers more flexibility for programmers to write more modular code.

The other people who stand to gain are compiler-writers. After all, compiling to JavaScript is getting more popular all the time. Tail calls make it possible to simulate lots of different kinds of control constructs in other programming languages. People have tried using setTimeout with a delay of 0, but browsers throttle these events with a minimum timeout of around 5 to 10 milliseconds (apparently, to support web pages that rely on the delay for animations). Having an officially guaranteed tail call mechanism makes it possible to compile control constructs like continuations, coroutines, threads, and actors. And, of course, it’s useful for compiling source languages with tail calls!

A word about stack traces: some people worry that tail calls interfere with getting useful stack traces. And it’s true that it can be useful to track the complete call history for stack traces, including tail calls. Now, when necessary, there are simple workarounds programmers can use for turning tail calls into non-tail calls. But a better answer is for VM writers to provide smarter stack traces which track only a finite number of the most recent tail calls in their stack traces. This satisfies the requirement of bounding the amount of allocation (since it’s a bounded amount of history) but still keeps useful diagnostic information in the stack trace. In the end, it’s true that this is a trade-off. In my experience, this cost is worth the benefit of being able to write better delegation and control abstractions.