How EgotisticalGiraffe was fixed

In October, Bruce Schneier reported that the NSA had discovered a way to attack Tor, a system for online anonymity.

The NSA did this not by attacking the Tor system or its encryption, but by attacking the Firefox web browser bundled with Tor. The particular vulnerability, code-named “EgotisticalGiraffe”, was fixed in Firefox 17, but the Tor browser bundle at the time included an older version, Firefox 10, which was vulnerable.

I’m writing about this because I’m a member of Mozilla’s JavaScript team and one of the people responsible for fixing the bug.

I still don’t know exactly what vulnerability EgotisticalGiraffe refers to. According to Mr. Schneier’s article, it was a bug in a feature called E4X. The security hole went away when we disabled E4X in Firefox 17.

You can read a little about this in Mozilla’s bug-tracking database. E4X was disabled in bugs 753542, 752632, 765890, and 778851, and finally removed entirely in bugs 833208 and 788293. Nicholas Nethercote and Ted Shroyer contributed patches. Johnny Stenback, Benjamin Smedberg, Jim Blandy, David Mandelin, and Jeff Walden helped with code reviews and encouragement. As with any team effort, many more people helped indirectly.

Thank you.


Now I will write as an American. I don’t speak for Mozilla on this or any topic. The views expressed here are my own and I’ll keep my political opinions out of it.

The NSA has twin missions: to gather signals intelligence and to defend American information systems.

From the outside, it appears the two functions aren’t balanced very well. This could be a problem, because there’s a conflict of interest. The signals intelligence folks are motivated to weaponize vulnerabilities in Internet systems. The defense folks, and frankly everyone else, would like to see those vulnerabilities fixed instead.

It seems to me that fixing them is better for national security.

In the particular case of this E4X vulnerability, it was mostly Tor users who were vulnerable. But it has also been reported that the NSA has bought security vulnerabilities “from private malware vendors”.

All I know about this is a line item in a budget ($25.1 million). I’ve seen speculation that the NSA wants these flaws for offensive use. It’s a plausible conjecture—but I sure hope that’s not the case. Let me try to explain why.

The Internet is used in government. It’s used in banks, hospitals, power plants. It’s used in the military. It’s used to handle classified information. It’s used by Americans around the world. It’s used by our allies. If the NSA is using security flaws in widely-used software offensively (and to repeat, no one says they are), then they are holding in their hands major vulnerabilities in American civilian and military infrastructure, and choosing not to fix them. It would be a dangerous bet: that our enemies are not already aware of those flaws, aren’t already using them against us, and can’t independently buy the same information for the same price. Also that deploying the flaws offensively won’t reveal them.

Never mind the other, purely civilian benefits of a more secure Internet. It just sounds like a bad bet.


Ultimately, the NSA is not responsible for Firefox in particular. That honor and privilege is ours. Yours, too, if you want it.

We have work to do. One key step is content process separation and sandboxing. Internet Explorer and Chrome have had this for years. It’s coming to Firefox. I’d be surprised if a single Firefox remote exploit known to anyone survives this work (assuming there are any to begin with). Firefox contributors from four continents are collaborating on it. You can join them. Or just try it out. It’s rough so far, but advancing.

I’m not doing my job unless life is constantly getting harder for the NSA’s SIGINT mission. That’s not a political statement. That’s just how it is. It’s the same for all of us who work on security and privacy. Not only at Mozilla.

If you know of a security flaw in Firefox, here’s how to reach us.

New function, Math.hypot(), in Firefox Nightly

Volunteer David Caabeiro has contributed a patch implementing Math.hypot, a new Math function in ECMAScript Edition 6.

In fact, David contributed code for all the new Math functions in ES6. I think hypot() is the most fun of them all. It was delayed for a few weeks while the standards committee settled on its exact semantics.

What does this new function do? Math.hypot(x, y) returns the length of the hypotenuse of a right triangle with legs x and y. That is, it returns √(x² + y²). This means you can find the distance between two points in the plane, (x1, y1) and (x2, y2), by asking for Math.hypot(x2 - x1, y2 - y1).

Why a new function? Of course we already have Math.sqrt, but the expression Math.sqrt(x*x + y*y) can overflow to Infinity or underflow to 0 even when the correct result is well within the range a floating-point number can represent, because the intermediate squares are not. Math.hypot handles those cases correctly.
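To make the overflow and underflow cases concrete, here is a small sketch. The helper names naiveHypot and distance and the sample values are mine, chosen only for illustration. The comparison assumes an engine whose Math.hypot avoids intermediate overflow and underflow, as described above.

```javascript
// naiveHypot is a hypothetical helper, used only for this comparison.
function naiveHypot(x, y) {
    return Math.sqrt(x * x + y * y);
}

// Distance between two points in the plane, (x1, y1) and (x2, y2).
function distance(x1, y1, x2, y2) {
    return Math.hypot(x2 - x1, y2 - y1);
}

// 1e200 squared is too large for a double, so the naive version overflows.
const bigNaive = naiveHypot(1e200, 1e200);
const bigHypot = Math.hypot(1e200, 1e200);

// 1e-200 squared is too small for a double, so the naive version underflows.
const tinyNaive = naiveHypot(1e-200, 1e-200);
const tinyHypot = Math.hypot(1e-200, 1e-200);

console.log(bigNaive, bigHypot);
console.log(tinyNaive, tinyHypot);
console.log(distance(1, 1, 4, 5));
```

In both extreme cases the true answer is about 1.414 times the input, which a double can represent without trouble.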

But wait, there’s more! Perhaps you’re familiar with three-dimensional space? Math.hypot(x, y, z) has you covered there. It returns the Cartesian distance from the 3-D point (x, y, z) to the origin. Higher dimensions? You can use Math.hypot with as many arguments as you like.

Of course when I was a lad, we didn’t have all these fancy dimensions. For those of you still skeptical of the existence of two-dimensional “space”, Math.hypot(x) returns the same as Math.abs(x); and with no arguments at all, Math.hypot() returns 0.
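A quick sketch of the different argument counts, with made-up sample values. The results are described in comments rather than asserted exactly, since floating-point rounding can vary by a hair.

```javascript
const x = -3, y = 4, z = 12;

// One argument: the same as Math.abs(x).
const oneArg = Math.hypot(x);

// No arguments at all: 0.
const noArgs = Math.hypot();

// Three arguments: distance from (x, y, z) to the origin,
// sqrt(9 + 16 + 144), which is 13 up to rounding.
const threeD = Math.hypot(x, y, z);

// Any number of arguments: spread an array of coordinates.
// sqrt(1 + 4 + 4 + 16) is 5, again up to rounding.
const coords = [1, 2, 2, 4];
const fourD = Math.hypot(...coords);

console.log(oneArg, noArgs, threeD, fourD);
```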

Thanks to David for his persistent work.

String iteration is fixed in Firefox Nightly

Volunteer André Bargull has contributed a patch to fix String iteration.

Up to now, if you did

for (var ch of "\uD83D\uDE80")
    alert(ch);

you would get two alerts, one with the message “�” and one with the message “�”. This is silly. Neither of those is a Unicode character. Together, they form a UTF-16 surrogate pair representing the character U+1F680, ROCKET. (I would paste it in here, but apparently Emoji break WordPress. Nice.)

As of today’s Nightly build, the above code now produces a single alert containing the rocket-ship emoji. Try it in the Web Console.

Likewise, you can use [...str] or new Set(str) to convert a string to an Array or Set of characters. Until now, each half of a surrogate pair would get its own Array element or Set entry. With the fix, surrogate pairs are kept together, so that

[... "Rocket \uD83D\uDE80"]

produces the value

["R", "o", "c", "k", "e", "t", " ", "\uD83D\uDE80"]

This functionality has yet to be specified in the ECMAScript Sixth Edition draft, so there may still be minor changes. I don’t expect anything you’d notice.
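Here is a short sketch that contrasts counting code units with iterating over a string, assuming the fixed behavior described above. The variable names are only illustrative.

```javascript
const str = "Rocket \uD83D\uDE80";

// .length counts UTF-16 code units, so the rocket counts twice.
const codeUnitCount = str.length;

// Iteration keeps the surrogate pair together as one character.
const chars = [...str];

// A Set built from a string holds whole characters, too.
const unique = new Set("\uD83D\uDE80\uD83D\uDE80");

// The for-of loop body runs once for the single rocket character.
let loopCount = 0;
for (const ch of "\uD83D\uDE80") {
    loopCount++;
}

console.log(codeUnitCount, chars.length, unique.size, loopCount);
// prints: 9 8 1 1
```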

Thanks to André for the patch.

Best feature ever

Lightly edited.

    <naveed> Hello Jason. Do you have anything for the platform
             meeting from the past two weeks?
<jorendorff> no, had a crummy couple of weeks i guess
<jorendorff> it's looking better this week, i'm going to land
             iterators for Map and Set, which is the best feature
             basically ever
    <naveed> :)
    <naveed> why do you feel that way about iterators for Map?
    <naveed> We are talking about this right:
             http://wiki.ecmascript.org/doku.php?id=harmony:iterators
<jorendorff> yes
    <naveed> I read it but must admit without using some features
             I may not fully get why they are so great
<jorendorff> it's hard to explain
<jorendorff> iteration is a power tool,
<jorendorff> it's a core part of a language, inasmuch as data
             processing is one of the purposes of programming
<jorendorff> Zipping through a bunch of data efficiently is just
             *so* easy and pleasant in languages like Python and
             Ruby that get these basic data structures and control
             structures right
<jorendorff> they hit the sweet spot.
    <naveed> i do like python language features for this
* naveed doesnt know Ruby
<jorendorff> JS completely and utterly misses the sweet spot, the
             basic data structures basically have never existed,
             Array is kind of a botch and there is no syntax for
             iterating over it
<jorendorff> people use .forEach(), but you can't break out of it
    <naveed> ah true
    <naveed> i have felt that pain before
<jorendorff> people use Object for hash tables
<jorendorff> leading to the occasional bug where you do something
             like,    if (s in obj)
<jorendorff> and it just happens that s == "watch"
<jorendorff> which is the name of a property on all objects in
             Firefox :(
    <naveed> doh!
<jorendorff> and Firefox only *wince*
<jorendorff> up to now, Map and Set have not really been usable,
             because there's no way to get all the data out.
<jorendorff> there's no .keys() .items() .toArray() anything like
             that.
<jorendorff> with iterability it is trivial to write all those
             methods yourself!
<jorendorff> function keys(map) { return [k for ([k, v] of map)]; }
<jorendorff> function items(map) { return [pair for (pair of map)]; }
    <naveed> ok i get it now
    <naveed> the comparison to Python made it very clear
<jorendorff> ok, that's all i got :)
<jorendorff> oh!! the best part
<jorendorff> it's going to work with the DOM too, we're making the
             DOM iterable :) :) :) :)
<jorendorff> JS is going to be so much better
    <naveed> oh that is awesome
    <naveed> DOM standard?
<jorendorff> working on it :)
    <naveed> that will make many things much cleaner
<jorendorff> the relevant patch is r?bz, he can point out any
             problems
    <naveed> 725907 (--/normal): Change for-of loop to work in
             terms of .iterator() and .next()
<jorendorff> that's the one.
    <naveed> now you have me excited !
    <naveed> Thank you for the context
<jorendorff> :)
<jorendorff> mind if I just blit this conversation to my blog?
    <naveed> of course not

I can’t promise these changes will really land this week; they are pending code review in bug 725907, bug 743107, and bug 725909.

I really think this one cluster of new features coming to JavaScript (for-of loops, iterators, generators, Map, and Set) is more important than almost anyone appreciates. But anyone who has used Python or Ruby already knows how important this is—subconsciously.

The syntax for Map and Set is not as sweet in ES6 as it is in Python and Ruby. We’ll see if the new features hit the sweet spot. So far I have a bit of experience with them, just writing unit tests. They feel pretty darn good.
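The keys and items one-liners in the conversation use the array comprehension syntax SpiderMonkey supported at the time. As a rough sketch of the same ideas in syntax that was later standardized, here they are again, along with the two pain points from the chat: breaking out of a loop, and using Object as a hash table. The helper firstLongWord and the sample data are made up for illustration, and "toString" stands in for the Firefox-only "watch" property.

```javascript
// Standard-syntax versions of the keys() and items() helpers from the chat.
function keys(map) {
    return Array.from(map, ([k, v]) => k);
}

function items(map) {
    return [...map];
}

// Unlike .forEach(), a for-of loop can stop early.
function firstLongWord(words, minLength) {
    for (const w of words) {
        if (w.length >= minLength) {
            return w;
        }
    }
    return undefined;
}

const ages = new Map([["ann", 31], ["bob", 27]]);

// A plain object inherits properties, so `in` can report false positives.
const asObject = { ann: 31, bob: 27 };
const falsePositive = "toString" in asObject;   // true: inherited property
const mapHas = ages.has("toString");            // false: a Map holds only its own entries

console.log(keys(ages), items(ages), falsePositive, mapHas);
```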

Rest arguments and default arguments in JavaScript

In the last two weeks, Benjamin Peterson has implemented two new features in the JavaScript engine. You want to know about these. If you write a lot of JS, they will make your life better.

Rest arguments are a nicer substitute for the familiar arguments object. The syntax looks like this:

function f(arg1, arg2, ...rest) {
    alert("You passed " + rest.length + " extra arguments.");
}

This works just like rest-arguments in Python, Lisp, and Ruby. Each time you call f, the first few arguments are assigned to the ordinary arguments, in this case arg1 and arg2. Any extra arguments you pass are stored in an array, rest. If you don’t pass any extra arguments, rest is an empty array.

Unlike the arguments object, a rest-argument is a real Array. It has all the Array methods, including .shift(), .forEach(), and .map(). And unlike arguments, which is re-defined in each nested function whether you want it or not, a rest-argument works exactly as expected in a closure.
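A minimal sketch of both points, using made-up function names (describe, makeJoiner, countExtras): the rest-argument is a real Array, and an inner function can still see the outer function's rest-argument.

```javascript
// Extra arguments land in a real Array, so Array methods work directly.
function describe(label, ...values) {
    const doubled = values.map(function (v) { return v * 2; });
    return label + ": " + doubled.join(", ");
}

// The inner function gets its own `arguments` object, but `defaults`
// still refers to the outer function's rest array through the closure.
function makeJoiner(separator, ...defaults) {
    return function (...extra) {
        return defaults.concat(extra).join(separator);
    };
}

// With no extra arguments, the rest-argument is an empty array.
function countExtras(first, ...rest) {
    return rest.length;
}

console.log(describe("x", 1, 2));            // prints "x: 2, 4"
console.log(makeJoiner("-", "a", "b")("c")); // prints "a-b-c"
console.log(countExtras(1), countExtras(1, 2, 3)); // prints 0 2
```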

Default arguments look like this:

function fade(element, seconds=0.5, targetOpacity=0) {
    $(element).animate({opacity: targetOpacity}, seconds * 1000);
}

When you call this function, any omitted arguments that have defaults take their default values.

fade(form, 0.2, 1);  // default values are not used:
                     // fade in fast
fade(form, 3);       // targetOpacity defaults to 0:
                     // fade out very slowly
fade(form);          // seconds defaults to 0.5,
                     // targetOpacity defaults to 0:
                     // fade out at normal speed

The default value can be any expression. It can use the values of preceding arguments, as in

function logEvent(event, logger=findLoggerFor(event.target)) {
    ...
}

where the default value of logger depends on the event argument.

(Extreme technical details: Note that unlike Python, the default value is computed each time the function is called, so if an argument has =[], it gets a fresh empty Array each time. Also, note that currently you only get the default value if the caller actually leaves off the argument. If the caller explicitly passes undefined, the argument’s value is undefined and the default is ignored. But there is some discussion on the committee about maybe changing that so that the default is also used when the caller explicitly passes undefined. Other programming languages don’t do this, but it would fit with what the DOM already does in many cases. Right now we follow the current proposal; if the proposal changes, we’ll update our implementation.)
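Two of those details can be sketched briefly: the default expression runs on every call, and it can refer to an earlier argument. The functions append and box are hypothetical examples. The sketch deliberately avoids passing undefined explicitly, since that behavior is described above as still under discussion.

```javascript
// The default [] is evaluated on each call, so every call that omits
// `list` gets its own fresh Array.
function append(item, list = []) {
    list.push(item);
    return list;
}

// A default can use the value of a preceding argument.
function box(width, height = width) {
    return { width: width, height: height };
}

const first = append("a");
const second = append("b");

console.log(first, second);     // two separate one-element arrays
console.log(box(3), box(3, 4));
```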

Both of these new features are on track to become part of the next ECMAScript standard.

Benjamin is still going strong. There’s more to come.

Screencast: Debugger in Scratchpad

This screencast shows how to make the JavaScript engine your plaything in just 3 lines of code.

And if you haven’t seen Scratchpad yet, prepare to fall in love with programming all over again.

After that you might want to see Jim’s talk about what it looks like when JS code runs. This talk explains beautifully why the Debugger object API is the way it is (sorry about the echo-y audio):

Loupe

Paul Rouget wanted a tool for closely examining the UI design he was working on. So he made it. It’s called Loupe. It puts a little magnifying glass in your location bar:

When you click it, you get this:

You can look at a design up-close, see if the gap between two boxes is 3 pixels or 4, that sort of thing. You can grab colors, too.

I want to talk about how this tool was made. The whole thing is a single JavaScript file. Paul built it using Scratchpad. The first prototype took about 2 hours to make. Then Paul shared it using Gist.

WARNING: The following pref is off by default for security reasons. Setting it gives Scratchpad code full control over Firefox—your passwords, history, everything. Also, even if someone claiming to be me says it’s OK, please don’t copy and run code you see on the Web. That said, if you did want to check out Loupe, you would go to about:config, set devtools.chrome.enabled to true, select Tools → Web Developer → Scratchpad, select Environment → Browser, and then run Paul’s code. It’s not like I can stop you.

To me this is really exciting. Scratchpad is low-tech in the best way.

Harmony modules and asynchronous script loading and document.write (oh my)

This is my first post about Harmony modules. I hate to start right out of the gate with a long boring post about timing details, but it seems the few people who might eventually be interested in it are actually interested right now! So let’s dive in.

Once we have modules, running a script will sometimes hit the network.

    <script>
        module blottr from "blottr.js";
    </script>

(By the way: you will have to opt in to Harmony syntax somehow—yet to be determined. This post will ignore that consideration.)

When the HTML parser reaches this script, parsing will pause until “blottr.js” is loaded. This matters if “blottr.js” touches the DOM during its initialization, for example. In terms of timing, it behaves just like a <script src=> element.

(As a performance optimization, we plan to change the HTML5 parser to skim ahead, find probable module statements, and pre-fetch the scripts. Again, just like <script src=>.)

But what about this?

    function silly() {
        eval('module blottr from "blottr.js";');
    }

What should happen if silly() is called from an event handler? Should it block until “blottr.js” comes in from the network? That would be a synchronous, blocking call, like synchronous XHR.

That would be gross. So in Harmony, eval will reject such code with a SyntaxError. In short, if a JavaScript caller is waiting, the module X from "url" syntax is banned. If your code needs a module off the network, you’ll just have to run it asynchronously somehow. Two new APIs, SystemLoader.load(url, callback, errorCallback) and SystemLoader.asyncEval(code, callback, errorCallback), are proposed for that.
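To give a feel for callback-style loading from the caller's side, here is a rough sketch. It does not use the proposed SystemLoader. Instead, sketchLoader and fakeNetwork are made-up stand-ins defined right in the example, and only the (url, callback, errorCallback) argument order follows the proposal mentioned above.

```javascript
// fakeNetwork is a hypothetical stand-in for modules living on a server.
const fakeNetwork = {
    "blottr.js": { blot: function (s) { return "[" + s + "]"; } }
};

// sketchLoader is a made-up object, not the proposed SystemLoader API.
const sketchLoader = {
    load(url, callback, errorCallback) {
        // Finish later, on a fresh turn of the event loop, so the
        // caller is never blocked waiting for the "network".
        setTimeout(function () {
            if (Object.prototype.hasOwnProperty.call(fakeNetwork, url)) {
                callback(fakeNetwork[url]);
            } else {
                errorCallback(new Error("could not load " + url));
            }
        }, 0);
    }
};

const events = [];

sketchLoader.load(
    "blottr.js",
    function (blottr) { events.push("loaded: " + blottr.blot("hi")); },
    function (err) { events.push("failed: " + err.message); }
);

// This line runs before either callback: load() returned without blocking.
events.push("load() returned");
```

The point of the shape is that the calling code keeps running, and the module arrives later in a callback.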

Now. Suppose we take the first example above and put it in a document.write call.

    <script>
        document.write(
            '<script>' +
            '    module blottr from "blottr.js";' +
            '</scr' + 'ipt>');
    </script>

Now what happens? (Benjamin Smedberg brought this case to my attention about 12 seconds after I described modules to him. It might have been less.)

To me, this particular flavor of document.write() weirdness was something new. I’m still not totally sure how we will handle it. Boris Zbarsky suggested again treating it the same as a <script src=> element: silently make that script load asynchronously, just because document.write() created it. That sounds plausible enough.

There are still a few more special cases to sort through. What happens if you try to load a module by assigning a string to an event handler attribute of a DOM element? I’m not sure yet. Perhaps we will throw a SyntaxError.

Bored to tears? Sorry! I’ll write again in a few days, explaining what Harmony modules are, who’s working on them, and why.

Why fast reviews happen

Back in April, Jim Blandy and I started hacking on a little project in a user repo.

Last Wednesday, without warning, I posted the results of three months of work for review in bug 672829. The whole patch is 336KB (actually 534KB, if you count tests), so I split it up into sections and requested code reviews from eight different JS hackers.

The first review came in before I was even done posting all the patches.

That afternoon, I sent a brief e-mail to the JS team asking for help getting the reviews done quickly. That was the only thing I did that I wouldn’t normally do. By noon Friday, just 48 hours later, twelve of the thirteen reviews had been granted, and Brendan was about halfway through the thirteenth.

The reviewers were:

David Anderson
Andrew Drake
Brendan Eich
Andreas Gal
Blake Kaplan
Bill McCloskey
Luke Wagner
Jeff Walden

Thanks, guys.

The best part: this is no fluke or one-off effort. It’s like this every day in js/src. How did this happen? Why do you get faster code reviews in the JavaScript engine than anywhere else in the project?

I really don’t know, but I have some guesses. Brendan Eich was the module owner for many years, and he always turned reviews around lightning-fast—and not by skimming, either, as you know if you’ve ever read a /be review. (You can read a totally typical one in this bug.) I think Rob Sayre probably had an influence as well. Maybe when you run a team with that “hey, are we all acting like adults here, and if not, why not” attitude for a few years, you get a culture of fast reviews.

Whatever the reason, I’m grateful. Fast reviews make me more effective. Some days, they make my job really exciting.

A Happy Family of C++ Classes

Luke refactored a bunch of code into js/src/vm/Stack.h. In a comment on the JS engine internals group, he wrote (emphasis mine):

David and Waldo raised the very reasonable question of whether js::StackFrame should be in its own file rather than in vm/Stack.h (it’s big). My reasoning for not wanting to is that FrameRegs + StackFrame + StackSegment + StackSpace + *FrameGuard altogether form a single logical data structure which I’d like to present as a whole [...] The same perspective shows up in math as many-sorted algebra so it’s not just C++ crazy-talk :)

None of us could figure out what that last sentence meant. We prevailed upon Luke to elaborate. I thought his explanation was a nice insight, so I’m sharing it here.

I’ll hazard an answer, knowing full well that there are at least three PhDs on the list who know a lot more about this than I do and may smite my answer with truth. (I’m at a layover in Hong Kong — be gentle :)

A single-sorted algebraic “structure” is something like a monoid, group, ring, field, etc: an abstract domain with a collection of operations (over this domain) and axioms that the operations must satisfy (e.g., distributivity, associativity, commutativity, etc). A single-sorted “algebra” implements a structure by picking a particular domain and set of operations that satisfy the axioms of the structure. (For example, the ring structure specifies an abstract + and * with a couple of axioms (associativity of +, distributivity, inverse for + and *, etc); the integers with arithmetic + and * are an algebra).

An important idea about all this is that, when you prove things about an algebraic structure, the proof is expressed only in terms of the declared operations/axioms of the structure, and not the particular details of any one algebra, so your theorem holds for all algebras of that structure. Now this starts to sound like abstract data types in computer science (s/structure/public interface/, s/axioms/specification/, s/algebra/concrete class/) and we can see that abstract algebraists are kinda like programmers who really really like reusable code.

A many-sorted structure/algebra is just the extension of the concept that can have more than one domain (thus, the operations can include more than one domain in their signature). An example is a vector space (which has a domain of scalars and a domain of vectors).

So then what’s the correspondence of these multi-sorted structures/algebras in programming? Classes/interfaces (of mainstream OOP languages) associate all operations with a single domain of values. I have little doubt there exist languages which solve the problem directly. We can hack multi-sorted-ness in C++ while maintaining some semblance of interface/implementation separation by just having multiple classes (one per domain) and making them all friends of each other. Then there is the question as to how to distribute the operations between the classes (or perhaps as non-member functions), but unless you want runtime polymorphism (which requires something like multi-methods), it’s a question of aesthetics.

I mentioned this originally because recognizing many-sorted algebras as a peer concept to single-sorted algebras helps to avoid a design mindset of “every class must be its own encapsulated island” which I feel can be detrimental when trying to modularize complex data structures like we have in SpiderMonkey. The Stack was one example; I think low-level objects + property tree will be another.