A blast from the past: in early 2014, we enabled Generational Garbage Collection (GGC) for Firefox desktop. But this blog post is not about GGC. It is about my victory dance when we finally pushed to button to let it go live on desktop firefox. Please click on that link; it is, after all, what this blog post is about.

Old-timers will recognize the old TBPL interface, since replaced with TreeHerder. I grabbed a snapshot of the page after I pushed GGC live (yes, green pushes really used to be that green!), then hacked it up a bit to implement the letter fly-in. And that fly-in is what I’d like to talk about now.

At the time I wrote it, I barely knew Javascript, and I knew even less CSS. But that was a long time ago.

Today… er, well, today I don’t know any more of either of those than I did back then.

Which is ok, since my whole goal here is to ask: what is the right way to implement that page? And specifically, would it be possible to do without any JS? Or perhaps minimal JS, just some glue between CSS animations or something?

To be specific: I am very particular about the animation I want. After the letters are done flying in, I want them to cycle through in the way shown. For example, they should be rotating around in the “O”. In general, they’re just repeatedly walking a path that is possibly discontinuous (as with any letter other than “O”). We’ll call this the marquee pattern.

Then when flying in, I want them to go to their appropriate positions within the marquee pattern. I don’t want them to fly to a starting position and only start moving in the marquee pattern once they get there. Oh noes, no no no. That would introduce a visible discontinuity. Plus which, the letters that started out close to their final position would move very slowly at first, then jerk into faster motion when the marquee began. We couldn’t have that now, could we?

I knew about CSS animations at the time I wrote this. But I couldn’t (and still can’t) see how to make use of them, at least without doing something crazy like redefining the animation every frame from JS. And in that case, why use CSS at all?

CSS can animate a smooth path between fixed points. So if I relaxed the constraint above (about the fly-in blending smoothly into the marquee pattern), I could pretty easily set up an animation to fly to the final position, then switch to a marquee animation. But how can you get the full effect? I speculated about some trick involving animating invisible elements with one animation, then having another animation to fly-in from each element’s original location to the corresponding invisible element’s marquee location, but I don’t know if that’s even possible.

You can look at the source code of the page. It’s a complete mess, combining as it does a cut and paste of the TBPL interface, plus all of jquery crammed in, and then finally my hacky code at the end of the file. Note that this was not easy code for me to write, and I did it when I got the idea at around 10pm during a JS work week, and it took me until about 4am. So don’t expect much in the way of comments or sanity or whatnot. In fact, the only comment that isn’t commenting out code appears to be the text “????!”.

The green letters have the CSS class “success”. There are lots of hardcoded coordinates, generated by hand-drawing the text in the Gimp and manually writing down the relevant coordinates. But my code doesn’t matter; the point is that there’s some messy Javascript called every frame to calculate the desired position of every one of those letters. Surely there’s a better way? (I should note that the animation is way smoother in today’s Nightly than it was back then. Progress!)

Anyway, the exact question is: how would someone implement this effect if they actually knew what they were doing?

I’ll take my answer off the air. In the comments, to be precise. Thank you in advance.

Firefox directions

June 5th, 2015

Some time back, I started thinking about what Firefox could do for me and its other users. Here are my thoughts, unashamedly biased and uninformed. If you don’t think something here is an awful idea, either I’ve failed or you aren’t reading closely enough.

Mozilla-specific advantages

Mozilla provides Firefox with a unique situation. We are not directly compelled to monetize Firefox or any other product. We do not need to capture users or wall them in. That doesn’t mean we don’t want to make money or gain market share — eventually, we need both, or Mozilla’s influence on the Web dies — but we have a degree of freedom that no other big player possesses.

Privacy and User Sovereignty

We can afford to give our users as much privacy as they might want. If you ask the vast majority of users about this, I suspect they’ll think it sounds like a theoretically good thing, but they don’t know what it really means, they don’t know what Firefox can do for them that other browsers don’t, and they don’t have any strong reason to care. All three of those are true of me, too.

Let’s tell them. Make an about:privacy that shows a chart of privacy features and behaviors of Firefox, Mozilla and its servers, and our major competitors. Give specific, realistic examples of information that is transmitted, what it might be used for (both probable and far-fetched). Include examples from jealous boyfriends, cautious employers, and restrictive regimes. Expose our own limitations and dirty laundry: “if this server gets hacked or someone you don’t like has privileged access, they will see that you crash a lot on youporn.com“. It won’t all fit on a page, but hopefully some impactful higher-order bits can be presented immediately, with links to go deeper. Imagine a friendly journalist who wants to write an article claiming that Firefox is the clear choice for people who care about controlling their personal data and experiences on the web: our job is to provide all the ammunition they’d need to write a convincing and well-founded article. Don’t bias or sugarcoat it — using Firefox instead of Chrome is going to protect very little from identity theft, and Google has more resources dedicated to securing their servers than we do.

If possible, include the “why”. We won’t track this data because it isn’t useful to us and we err on the side of the user. Chrome will because it’s part of their business model. Mention the positive value of lock-in to a corporation, and point out just how many sources of information Google can tap.

Update: Wait! Hold on! As a commenter pointed out, that is the exact sort of bias I just said we shouldn’t use. Google does not use Chrome to gather data as I implied. I was wrong, and made assumptions based on uninformed opinions about the motivations involved and their ramifications. Google has an incentive to limit its data collection, since not doing so would anger their users. In the end, I still feel like Mozilla is more free to side with the user than Google is, and I have to believe that now or in the future there will be significant real differences in behavior as a result, but collecting the sort of data I was implying through the browser is not one of those differences.

Anyway, back to talking about how Firefox can highlight Mozilla’s privacy advantages:

Point to Lightbeam. Make cookies visible — have them appear temporarily as floating icons when they are sent, and show them in View Source. Notify the user when a password or credit card number is sent unencrypted. Allow the user to delete and modify cookies. Or save them to external files and load them back in. Under page info, enumerate as much identity information as we can (as in, show what the server can figure out, from cookies to OS to GL capabilities.)


I don’t know if it’s just because nobody else needs to care yet, but it seems like we have a lead on gaming in the browser. It’s an area where players would be willing to switch browsers, even if only temporarily, to get a 10% higher frame rate. Until rich web gaming starts capturing a substantial number of user hours, it doesn’t seem like the other browser manufacturers have enough of a reason to care. But if we can pull people out of the extremely proprietary and walled-off systems that are currently used for gaming and get them onto the open web, then not only do we get a market share boost but we also expand the range of things that people commonly do on our open platform. It’ll encourage remixing and collaboration and pushing the envelope, counteracting humanity’s current dull descent into stupefied consumption. The human race will develop more interconnections, develop better ways of resolving problems, and gain a richer and stronger culture for the Borg to destroy when they finally find us.

Er, sorry. Got a little carried away there. Let me just say that gaming represents more than obsessive self-indulgence. It is a powerful tool for communication and education and culture development and improved government. You’ll never truly understand in your bones how the US won its war for independence until you’ve lived it — or at least, simulated it. (Hint: it’s not because our fighters had more hit points.)


Addons are a major differentiator for Firefox. And most of them suck. Even ignoring the obvious stuff (malware, adware, etc.), for which plans are in motion to combat them, it still seems like addons aren’t providing the value they could be. People have great ideas, but sadly Chrome seems to be the main beneficiary these days. Some of that is simply due to audience size, but I don’t think that’s all of it.

I know little about addons, but I have worked on a few. At least for what I was doing, they’re a pain to write. Perhaps I always just happen to end up wanting to work on the trickiest pieces to expose nicely, but my experience has not been pleasant. How do you make a simple change and try it out? Please don’t make me start up (or restart) the browser between every change. What’s the data model of tabs and windows and things? What’s the security model? I periodically try to work on a tab management extension, but everything I do ends up getting silently ignored, probably because it doesn’t have the right privileges. I asked lots of questions at the last Summit but the answers were complicated, and incomprehensible to someone like me who is unfamiliar with how the whole frontend is put together.

And why isn’t there straightforward code that I can read and adapt? It seems like the real code that drives the browser looks rather different from what I’d need for my own addon. Why didn’t it work to take an existing addon, unpack it, modify it, and try it out? Sure, I probably did something stupid and broke it, but the browser wasn’t very good at telling me what.

That’s for complicated addons. Something else I feel is missing is super lightweight addons. Maybe Greasemonkey gives you this; I’ve barely used it. But say I’m on a page, or better yet a one-page app. I want something a little different. Maybe I want to remove a useless sidebar, or add a button that autofills in some form fields, or prevent something from getting grayed out and disabled, or iterate through an external spreadsheet to automatically fill out a form and submit it, or autologin as a particular user, or maybe just highlight every instance of a certain word. And I want this to happen whenever I visit the page. Wouldn’t it be great if I could just right-click and get an “automatic page action” menu or something? Sure, I’d have to tell it how to recognize the page, and it might or might not require writing JavaScript to actually perform the action. But if the overhead of making a simple addon could be ridiculously low, and it gave me a way of packaging it up to share with other people (or other computers of mine), it could possibly make addons much more approachable and frequently used.

It would also be an absolute disaster, in that everyone and her dog would start writing tiny addons to do things that really shouldn’t be done with addons. But so be it. Think of something easy enough to be suggested in a support document as a workaround for some site functionality gap. Even better, I’d like the browser (or, more likely, an addon-generating addon) to automatically do version control (perhaps by auto-uploading to github or another repo?), and make it easy to write self-tests and checks for whether the required page and platform functionality are still present.

Addons also don’t feel that discoverable. I can search by name, but there’s still the matter of guessing how serious (stable, maintained, high quality) an addon is. It turns my stomach to say this, but I kind of want a more social way of browsing and maintaining lists of addons. “People who are mentally disturbed in ways similar to you have left these addons enabled for long periods of time without disabling them or removing them in a fit of anger: …” Yes, this would require a degree of opt-in tracking evil, but how else can I find my true brethren and avoid the polluted mindset of godless vi-using heathens?

Hey, remember when we pissed off our addon authors by publicly shaming them with performance measurements? Could we do something similar, but only expose the real dirt after you’ve installed the addon?

Which brings me to addon blaming. It’s very hard to correctly blame a misbehaving addon, which makes me too conservative about trying out addons. I would be more willing to experiment if I had a “Why Does My Firefox Suck Right Now?” button that popped up an info box saying “because addon DrawMustachesOnCatPictures is serializing all of your image loads”. Ok, that’s probably too hard — how about just “addon X is eating your CPU”?

Why Does My Firefox Suck Right Now?

On a related note, I think a big problem is that Firefox sometimes behaves very badly and the user doesn’t know why. We really need to get better at helping people help themselves in diagnosing these problems. It feels like a shame to me when somebody loves Firefox, but they start running into some misbehavior that they can’t figure out. If we’re really lucky, they’ll try the support forums. If that doesn’t work, or they couldn’t be bothered in the first place, they come to somebody knowledgeable and ask for help. The user is willing to try all kinds of things: install diagnostic tools, email around attachments of log files, or whatever — but as far as I can tell these things are rarely useful. And they should be. We’re not very good at gathering enough data to track the problem down. A few things serve as illustrative counterexamples: restarting in safe mode is enormously helpful, and about:memory is a great tool that can pinpoint problems. Theoretically, the profiler ought to be good for diagnosing slowdowns and hangs, but I haven’t gotten much out of it in practice. (Admittedly, my own machine is Linux, and the stackwalking has never worked well enough here. But it hasn’t been a silver bullet for my friends’ Windows machines either.)

These are the sorts of situations where we are at high risk of losing users. If a PC runs Sunspider 5% faster but opening a new tab mysteriously takes 5 seconds, somebody’s going to switch browsers. Making the good case better is far less impactful than eliminating major suckage. If somebody comes to us with a problem, we should have a very well-worked out path to narrow it down to an addon or tab or swapping or networking or virus scanning or holy mother of bagels, you have *how* many tabs open?! Especially if IE and Chrome do just fine on the same computer (empty profiles or not.)

Browsing the F*ing Web

That’s what Firefox is for, right? So I have some problems there too. What’s the deal with tabs? I like opening tabs. It means I want to look at something.

I’m not fond of closing tabs. I mean, it’s fine if I’m done with whatever I was looking at. But that’s only one tab, and it’s not enough to keep other tabs from accumulating. Closing any other tab means I have to stop what I’m doing to think about whether I still want/need the tab. It’s like picking up trash. I’m willing to accept the necessity in real life, but in a computer-controlled virtual space, do I really have to?

Sadly, that means a tab explosion. Firefox is good about allowing it to happen (as in, large numbers of tabs generally work surprisingly well), but pretty crappy at dealing with the inevitable results. I know lots of people have thought hard on how to improve things here, but none of the solutions I’ve seen proposed felt exactly right.

I don’t have a solution either, but I’ll propose random things anyway:

Tabs vs bookmarks vs history is artificial. They’re all stuff you wanted at some point, some of which you want now, and some of which you’ll want again in the future. I want perfection: I want to open tabs forever without ever needing to close any, but I want the interface to only display the tabs I’m interested in right now.

Bookmarks are just tabs that I claim I might want again in the future, but I don’t want to clutter up my tab bar with right now. History additionally has all the tabs that I claim I don’t need to see again, except maybe sometime when I remember that I’ve seen something before and need it again.

Yes, I am misusing “tabs” to mean “web pages”. Sue me.

So. Let me have active tabs, collected in some number of windows, preferably viewable on the left hand side in a hierarchical organization à la Tree Style Tabs. Give me buttons on the tabs to quickly say “I don’t care at all about this anymore”, “categorize for when I want to browse by topic and find it again”, “queue this up [perhaps in a named queue] for when I am avoiding work”, and “I only want this cluttering my screen as long as these other tabs are still visible”. (Those correspond to “close tab”, “bookmark tab”, “enqueue tab”, and “reparent tab”.) Allow me to find similar tabs and inform the browser about those, too. Right-clicking on a bugzilla tab ought to give me a way to find all bugzilla tabs and close them en masse, or reparent them into a separate group. Make it easy to scan through tab groups, enqueue some, and then close the rest. I should be able to sort all the tabs by the last time I looked at them, so I can kill off the ancient ones — without losing my original sort order.

Some context: I have a lot of tabs open. Many more than fit on one screen (even using Tree Style Tabs.) Cleaning them up is a pain because of the soggy middle: the ones early in the list are things that I’ve had around for a long time and resisted closing because they’re useful or I really really want to get around to reading them. The ones late in the list are recently open and likely to be current and relevant. The stuff in the middle is mostly crap, and I could probably close a hundred in a minute or two, except the tab list keeps jumping around and when I click in the middle I keep having to wait for the unloaded pages to load just so I can kill them.

I want to throw those ancient tabs in a “to read” queue. I want to find all of my pastebin tabs and kill them off, or maybe list them out in age order so I can just kill the oldest (probably expired anyway) ones. I don’t want the “to read” queue in my active tab list most of the time, but I want to move it in (drag & drop?) when I’m in the mood. I want to temporarily group my tabs by favicon and skim through them, deleting large swathes. I want to put the knot-tying tab and origami instruction tab into a separate “to play with” folder or queue. I want to collect my set of wikipedia pages for Jim Blandy’s anime recommendations into a group and move them to a bookmark bar, which I may want to either move or copy back to the active tab list when I’m ready to look at them again. I want to kill off all the bugzilla pages except the ones where I’ve entered something into a form field. I want to skim through my active tab list with j/k keys and set the action for each one, to be performed when I hit the key to commit to the actions. I want undo. I want one of those actions, a single keystroke, to set which window the tabs shows up in the active tabs list. I want to sort the tabs by memory usage or CPU time. I want to unload individual tabs until I select them again.

I want a lot of stuff, don’t I?

Here is the place I originally intended to start talking about loaded and unloaded tabs, the perils and advantages of auto-unloading, and all that, but:

This Post

I just checked the timestamp on this post. I wrote it on August 14, 2014, and have sat on it for nearly a year. It’s been waiting for the day that I’ll finally get around to finishing it up, perhaps splitting it into several blog posts, etc. Thanks to Yoric’s Firefox re-imaginings I came back to look, and realized that what’s going to happen is that this will get old and obsolete and just die. I’d be better off posting it, rough and incomplete as it is. (And I *still* haven’t watched those anime. Where has time gone?!)

Whoa. I just looked at the preview, and this post is *long*. Sorry about that. If I were a decent human being, I would have chopped it up into digestible pieces. I guess I just don’t like you that much.

scratchpad made me happy

March 5th, 2014

I love the Firefox devtools command line. I cut & paste all kinds of crazy code into it. On the other hand, I never did really quite “get” the Scratchpad, though to be honest I also haven’t tried using it much. I’ve been happy just to edit code in an Emacs buffer on the side and cut & paste.

The Problem

But last night, I ran into a problem that Scratchpad turned out to be perfect for. And now I loveses it. My precioussssss….

I am the proud survivor — er, I mean “father” — of two kids, one of whom goes to a school with lots of required “volunteer” time. Since it is required, you have to record your hours via a web-based tool. It’s a fairly primitive interface on some ancient backend ASP monstrosity. It’s tolerable to use to record one entry at a time — you just need to enter a date, a start hour, a start minute, an end hour, an end minute, three options selected from dropdown lists of several dozen items (enough to require scrolling), etc.

Ok, it’s pretty awful even for entering one record.

But entering 80 of the things, most of them differing only by the date, is intolerable. Especially since the d#@n form resets itself completely every time you submit it. And so automation rage kicked in.

Existing Solution

Last year, I did it by capturing the POST request and writing a script to resubmit with different values. It worked, kinda, though I could only get a couple of fields working and it kept timing out my authentication cookie. Or something. I just remember it being a major pain. Even capturing the full request was a little difficult since it’s HTTPS only and I seem to remember some limitation in the Firefox devtools of the time when trying to see the POST body data. (Again, “or something”.)

The Latest Hotness

This year, I was overjoyed to see the option to edit and resubmit a query. I’ve wanted that so many times. And yet… the data was still x-www-form-urlencoded, which means I had to cut & paste from the devtools pane, which is already a challenge due to Linux/xorg/xfce/emacs’s mishandling of cut buffers or clipboards or whatever the heck they are. And then find the field I cared about and update it, and then discover that it overwrote my previous entry because there was some embedded token in one of the other fields that referred to the entry it was creating. Ugh. (Dim memories resurfaced at about this point from when I needed to get around this last year. I still don’t remember the details.)

So then I thought, “hey, I’ll just update the page in place and then click submit! I’ll do it at the HTML level instead of the HTTP level!” So I wrote up some JS code to find and set the various form fields, and clicked submit. Success!

Only it’s still a painful flow. I have to edit the relevant field in emacs (or eventually, I’d probably generate the JS scripts with a shell or Perl or Python script), cut & paste into the little tiny console line (it’s a console prompt, it’s supposed to be small, I have no issue with that.) (Though maybe pasting into the console itself should send it to the prompt? I dunno.), press enter, then click on submit. Not too bad, and definitely well within the “tolerable” zone.

But it’d be easier if I could just define a JS function that finds and fills the fields, and pass in the one value I need to change. Then I can just enter that at the console. Only where can I stash the function? If I put it on the page, I assume it’ll get nuked whenever I submit the page. Hey, I wonder if that Scratchpad thing might help here…

Enter the Scratchpad. Oh yeah.

So I pasted my little script into the Scratchpad. It defines a function to fill out the fields. Final flow: edit one line of JS to change a date, press Ctrl-R to run it. The fields magically update, I click submit.

Obviously, I could’ve done the submit from the script while I was at it, but I like to set things up and fire them off in separate steps. Call it paranoia. I do the same thing with shell scripts — I’ll write a script to echo out a series of commands to perform, run it once to verify that it’s what I want, then run it again piped through bash. I’m just too clumsy to get it right the first time.

But anyway, Scratchpad was the awesome for this task. I’ll be considering it whenever thinking about how to do other things now.

Doers of Good

Thank you robcee and #devtools team. You made my life gooder. More goodish. I am now living goodlier.

My Example

The script I used, if you’re curious:

function enter(date) {
    inputs = document.forms[0].getElementsByTagName("input");
    selects = document.forms[0].getElementsByTagName("select");
    texts = document.forms[0].getElementsByTagName("textarea");
    inputs["VolunteerDate"].value = date;
    inputs["FromHour"].value = 8;
    inputs["FromMin"].value = 45;
    inputs["ToHour"].value = 9;
    inputs["ToMin"].value = 45;
    inputs["nohour"].value = 1;
    selects["MinCombo"].value = 0;
    selects["classcombo"].value = 40;
    selects["activitycombo"].value = 88;
    selects["VolunteerCombo"].value = "Steve Fink";
    texts[0].innerHTML = "Unit study center";

enter("2/27/2014") # I edit this line and bounce on the Ctrl-R key, then click submit

Browser Wars, the game

February 14th, 2013

A monoculture is usually better in the short term. It’s a better allocation of resources (everyone working on the same thing!) If you want to write a rich web app that works today (ie, on the browsers of today), it’s much better.

But the web is a platform. Platforms are different beasts.

Imagine it’s an all-WebKit mobile web. Just follow the incentives to figure out what will happen.

Backwards bug compatibility: There’s a bug — background SVG images with a prime-numbered width disable transparency. A year later, 7328 web sites have popped up that inadvertently depend on the bug. Somebody fixes it. The websites break with dev builds. The fix is backed out, and a warning is logged instead. Nothing breaks, the world’s webkit, nobody cares. The bug is now part of the Web Platform.

Preventing innovation: a gang of hackers makes a new browser that utilizes the 100 cores in 2018-era laptops perfectly evenly, unlike existing browsers that mostly burn one CPU per tab. It’s a ground-up rewrite, and they do heroic work to support 99% of the websites out there. Make that 98%; webkit just shipped a new feature and everybody immediately started using it in production websites (why not?). Whoops, down to 90%; there was a webkit bug that was too gross to work around and would break the threading model. Wtf? 80%? What just happened? Ship it, quick, before it drops more!

The group of hackers gives up and starts a job board/social network site for pet birds, specializing in security exploit developers. They call it “Polly Want a Cracker?”

Inappropriate control: Someone comes up with a synchronization API that allows writing DJ apps that mix multiple remote streams. Apple’s music studio partners freak out, prevent it from landing, and send bogus threatening letters to anyone who adds it into their fork.

Complexity: the standards bodies wither and die from lack of purpose. New features are fine as long as they add a useful new capability. A thousand flowers bloom, some of them right on top of each other. Different web sites use different ones. Some of them are hard to maintain, so only survive if they are depended upon by a company with deep enough pockets. Web developers start playing a guessing game of which feature can be depended upon in the future based on the market cap of the current users.

Confusion: There’s a little quirk in how you have to write your CSS selectors. It’s documented in a ton of tutorials, though, and it’s in the regression test suite. Oh, and if you use it together with the ‘~’ operator, the first clause only applies to elements with classes assigned. You could look it up in the spec, but it hasn’t been updated for a few years because everybody just tries things out to see what works anyway, and the guys who update the spec are working on CSS5 right now. Anyway, documentation is for people who can’t watch tutorials on youtube.

End game: the web is now far more capable than it was way back in 2013. It perfectly supports the features of the Apple hardware released just yesterday! (Better upgrade those ancient ‘pads from last year, though.) There are a dozen ways to do anything you can think of. Some of them even work. On some webkit-based browsers. For now. It’s a little hard to tell what, because even if something doesn’t behave like you expect, the spec doesn’t really go into that much detail and the implementation isn’t guaranteed to match it anyway. You know, the native APIs are fairly well documented and forward-compatible, and it’s not really that hard to rewrite your app a few times, once for each native platform…

Does this have to happen just because everybody standardizes on WebKit? No, no more than it has to happen because we all use silicon or TCP. If something is stable, a monoculture is fine. Well, for the most part — even TCP is showing some cracks. The above concerns only apply to a layer that has multiple viable alternatives, is rapidly advancing, needs to cover unexpected new ground and get used for unpredicted applications, requires multiple disconnected agents to coordinate, and things like that.

bzexport changes released

April 13th, 2012

bzexport –new and hg newbug have landed

My bzexport changes adding a --new flag and an hg newbug command have landed. Ok, they landed months ago. See my previous blog post for details; all of the commands and options described there are still valid in the current version. But please pull from the official repo instead of my testing repo given in the earlier blog post.

Installing bzexport

mkdir -p ~/hg-extensions
cd ~/hg-extensions
hg clone http://hg.mozilla.org/users/tmielczarek_mozilla.com/bzexport

in the [extensions] section of your ~/.hgrc, add:
bzexport = ~/hg-extensions/bzexport/bzexport.py

Note to Windows users: unfortunately, I think the python packaged with MozillaBuild is missing the json.py package that bzexport needs. I think it still works if you use a system Python with json.py installed, but I’m not sure.

Trying it out

For the (understandably) nervous users out there, I’d like you to give it a try and I’ve made it safe to do so. Here are the levels of paranoia available: Read the rest of this entry »

Only pay for the entropy you use

February 22nd, 2012

Log Files Are Boring

Just an idea, based on hearing that build log transfers seem to consume large amounts of bandwidth. (Note that for all I know, this is already being done.)

Logs are pretty dull. In particular, two consecutive log files are usually quite similar. It’d be nice if we could take advantage of this redundancy to reduce the bandwidth/time consumed by log transfers.

rsync likes boring data

The natural thing that springs to mind is rsync. I grabbed two log files that are probably more similar to each than is really fair, but they shouldn’t be horribly unrepresentative. rsyncing one to the other found them to share 32% of their data, based on the |rsync –stat| output lines labeled “Matched data” and “Literal data”, for a speedup of 1.46x.

I suspected that rsync’s default block size is too large, and so most of the commonalities are not found. So I tried setting the block size ridiculously low, to 8 bytes, and it found them to be 98% similar. Which is silly, because it has to retrieve more block hashes at that block size than it saves. The total “speedup” is reported as 0.72x.

But the sweet spot in the middle, with a block size of 192, gives 84% similarity for a speedup of 4.73x.

compression likes boring data too

Take a step back: this only applies to uncompressed files. Simply gzipping the log file before transmitting it gives us a speedup of 14.5x. Oops!

Well, rsync can compress the stuff it sends around too. Adding a -z flag with block size 192 gives a speedup of 16.2x. Hey, we beat basic gzip!

But compression needs decent chunks to work with, so the sweet spot may be different. I tried various block sizes, and managed a speedup of 24.3x with -B 960. An additional 1.7x speedup over simple compression is pretty decent!

To summarize our story so far, let’s say you want to copy over a log file named log123.txt. The proposal is:

  1. Have a vaguely recent benchmark log file, call it log_compare.txt, available on all senders and receivers. (Actually, it’d probably be a different one per build configuration, but whatever.)
  2. On the server, hard link log123.txt to log_compare.txt.
  3. From the client, rsync -z -B 960 log123.txt server:log123.txt

stop repeating what I say!

But it still feels like there ought to be something better. The benchmark log file is re-hashed every time you do this and the hashes are sent back over the wire, costing bandwidth. So let’s eliminate that part. Note that we’ll drop the -z from flag because we may as well compress the data during the transfer instead:

 ssh server 'ln log_compare.txt log123.txt'
 rsync -B 960 log123.txt log_compare.txt --only-write-batch=batch.dat
 ssh -C server 'rsync --read-batch=- argleblargle log132.txt' < batch.dat

Note that “argleblargle” is ignored, since the source file isn’t needed.

So what’s the speedup now? Let’s only consider the bytes transmitted over the network. Assuming the compression from ssh -C has the same effect as gzipping the file locally, I get a speedup of 28.9x, about 2x the speedup of simply compressing the log file in the first place.

But wait. The block size of 960 was based on the cost of retrieving all those hashes from the remote side. We’re not doing that anymore, so a smaller block size should again be more effective. Let’s see… -B 192 gets a total speedup of 139x, which is almost exactly one order of magnitude faster than plain gzipped log files. Now we’re talking!

loose ends

Two things still bug me. One is a minor detail — the above is writing out batch.dat, then reading it back in to send over to the server. This uselessly consumes disk bandwidth. It would be better if rsync could directly read/write compressed batch files to stdin/stdout. (It can read uncompressed batches from stdin, but not write to stdout. You could probably hack it somehow, perhaps with /proc/pidN/fd/…, but it’s not a big deal. And you can just use use /dev/shm/batch.dat for your temporary filename, and remove it right after. It’d still be better if it never had to exist uncompressed anywhere, but whatever.)

The other is that we’re still checksumming that benchmark file locally for every log file we transfer. It doesn’t change the number of bytes spewed over the network, but it slows down the overall procedure. I wonder if librsync would allow avoiding that somehow…? (I think rsync uses two checksums, a fast rolling checksum and a slower precise one, so you’d need to compute both for all offsets. And reading those in would probably cost more than recomputing from the original file. But I haven’t thought too hard about this part.)

not just emacs and debuggers

I sent this writeup to Jim Blandy, who in a typically insightful fashion noticed that (1) this requires some fiddly bookkeeping to ensure that you have a comparison file, and (2) revision control systems already handle all of this. If you have one version of a file checked in and then you check in a modified version of it, the VCS can compute a delta to save storage costs. Then when you transmit the new revision to a remote repository, the VCS will know if the remote already has the baseline revision so it can just send the delta.

Or in other words, you could accomplish all of this by simply checking your log files into a suitable VCS and pushing them to the server. That’s not to say that you’re guaranteed that your VCS will be able to fully optimize this case, just that it’s possible for it to do the “right” thing.

I attempted to try this out with git, but I don’t know enough about how git does things. I checked in my baseline log file, then updated it with the new log file’s contents, then ran git repack to make a pack file containing both. I was hoping to use the increase in size from the original object file to the pack file as an estimate of the incremental cost of the new log file, but the pack file was *smaller* than either original object file. If I make a pack with just the baseline, then I end up with two pack files, but the new one is still smaller.

clients could play too

As a final thought, this idea is not fundamentally restricted to the server. You could do the same thing inside eg tbpl: keep the baseline log(s) in localStorage or IndexedDB. When requesting a log, add a parameter ?I_have_baseline_36fe137a1192. Then, at the server’s discretion, it could compute a delta from that baseline and send it over as a series of “insert this literal data, then copy bytes 3871..17313 from your baseline, then…”. tbpl would reconstruct the resulting log file, the unicorns would do their lewd tap dance, and everyone would profit.

Scenario 1: you have a patch to some bug sitting in our mercurial queue. You want to attach it to a bug, but the bugzilla interface is painful and annoying. What do you do?

Use bzexport. It’s great! You can even request review at the same time.

What I really like about bzexport is that while writing and testing a patch, I’m in an editor and the command line. I may not even have a browser running, if I’m constantly re-starting it to test something out. Needing to go to the bugzilla web UI interrupts my flow. With bzexport, I can stay in the shell and move onto something else immediately.

Scenario 2: You have a patch, but haven’t filed a bug yet. Neither has anybody else. But your patch has a pretty good description of what the bug is. (This is common, especially for small things.) Do you really have to go through the obnoxious bug-filing procedure? It sure is tempting just to roll this fix up into some other vaguely related bug, isn’t it? Surely there’s a simple way to do things the right way without bouncing between interfaces?

Well, you’re screwed. Unless you’re willing to test something out for me. If not, please stop reading.
Read the rest of this entry »

Patch reordering

November 3rd, 2011

I have a patch queue that looks roughly like:


(So my base repo has a patch ‘initial-API-changes’ applied to it, followed by a patch ‘consumer-1’, etc.)

The idea is that I am working on a new API of some sort, and have a couple of independent consumers of that API. The first two are “done”, but when working on the 3rd, I realize that I need to make changes to or clean up the API that they’re all using. So I hack away, and end up with a patch that contains both consumer 3 plus some API changes, and to get it to compile I also update consumers 1 and 2 to accommodate the new changes. All of that is rolled up into a big hairball of a patch.

Now, what I want is:

  consumer-1 (new API)
  consumer-2 (new API)
  consumer-3 (new API)

But how do I do that (using mq patches)? I can use qcrefresh+qnew to fairly easily get to:

  consumer-1 (old API)
  consumer-2 (old API)
  consumer-3 (new API)

or I could split out the consumer 1 & 2 API changes:

  consumer-1 (old API)
  consumer-2 (old API)
  consumer-3 (new API)

which theoretically I could qfold the consumer 1 and consumer 2 patches:

  consumer-1 (new API)
  consumer-2 (new API)
  consumer-3 (new API)

Unfortunately, consumer-1-API-changes collides with API-changes, so the fold will fail. It shouldn’t collide, really, but it does because part of the code to “register” consumer-1 with the new API happens to sit right alongside the API itself. Even worse, how do I “sink” the ‘API-changes’ patch down so I can fold it into initial-API to produce final-API? (Apologies for displaying my stacks upside-down from my terminology!) A naive qfold will only work if the API-changes stuff is separate from all the consumer-* patches.

My manual solution is to start with the initial queue:

  consumer-1 (old API)
  consumer-2 (old API)

and then use qcrefresh to rip the API changes and their effects on consumers 1 & 2 back out, leaving:

  consumer-1 (old API)
  consumer-2 (old API)
  (in working directory) consumer-3 (new API)

I qrename/qmv the current patch to ‘api-change’ and qnew ‘consumer-3’ (its original name), cursing about how my commit messages are now on the wrong patch. Now I have

  consumer-1 (old API)
  consumer-2 (old API)
  api-change (API changes and consumer 1 and 2 updates for new API)
  consumer-3 (new API)

Now I know that ‘unrelated’ doesn’t touch any of the same files, so I can qgoto consumer-2 and qfold api-change safely, producing:

  consumer-1 (old API)
  consumer-2 (new API, but also with API change and consumer 1 updates)
  consumer-3 (new API)

I again qcrefresh,qmv,qnew to pull a reduced version of the api-change patch, giving:

  consumer-1 (old API)
  api-change (with API change and consumer 1 updates)
  consumer-2 (new API)
  consumer-3 (new API)

Repeat. I’m basically taking a combined patch and sinking it down towards its destination, carving off pieces to incorporate into patches as I pass them by. Now I have:

  api-change (with *only* the API change!)
  consumer-1 (new API)
  consumer-2 (new API)
  consumer-3 (new API)

and finally I can qfold api-change into initial-API, rename it to final-API, and have my desired result.

What a pain in the ass! Though the qcrefresh/qmv/qnew step is a lot better than what I’ve been doing up until now. Without qcrefresh, it would be

 % hg qrefresh -X .
 % hg qcrecord api-change
 % hg qnew consumer-n
 % hg qpop
 % hg qpop
 % hg qpop
 % hg qpush --move api-change
 % hg qpush --move consumer-n
 % hg qfold old-consumer-n

which admittedly preserves the change message from old-consumer-n, which is an advantage over my qcrefresh version.
Or alternatively: fold all of the patches together, and qcrecord until you have your desired final result. In this particular case, the ‘unrelated’ patch was a whole series of patches, and they weren’t unrelated enough to just trivially reorder them out of the way.

Without qcrecord, this is intensely painful, and probably involves hand-editing patch files.

My dream workflow would be to have qfold do the legwork: first scan through all intervening patches and grab out the portions of the folded patch that only modify nonconflicting files. Then try to get clever and do the same thing for the portions of the conflicted files that are independent. (The cleverness isn’t strictly necessary, but I’ve found that I end up selecting the same portions of my sinking patch over and over again, which gets old.) Then sink the patch as far as it will go before hitting a still-conflicting file, and open up the crecord UI to pull out just the parts that belong to the patch being folded (aka sunk). Repeat this for every intervening conflicting patch until the patch has sunk to its destination, then fold it in. If things get too hairy, then at any point abort the operation, leaving behind a half-sunk patch sitting next to the unmodified patch it conflicted with. (Alternatively, undo the entire operation, but since I keep my mq repo revision-controlled, I don’t care all that much.)

I originally wanted something that would do 3-way merges instead of the crecord UI invocations, but merges really want to move you “forward” to the final result of merging separate patches/lines of development. Here, I want to go backwards to a patch that, if merged, would produce the result I already have. So merge(base,base+A,base+B) -> base+AB which is the same as base+BA. From that, I could infer a B’ such that base+A+B’ is my merged base+AB, but that doesn’t do me any good.

In my case, I have base+A+B and want B” and A” such that base+B”+A” == base+A+B.

To anyone who made it this far: is there already an easy way to go about this? Is there something wrong with my development style that I get into these sorts of situations? In my case, I had already landed ‘initial-API’; please don’t tell me that the answer is that I always have to get the API right in the first place. Does anyone else get into this mess? (I can’t say I’ve run into this all that often, but it’s happened more than once or twice.)

I suppose if I had landed consumers 1 and 2, I would’ve just had to modify their uses of the API afterwards. So I could do that here, too. But reviews could tangle things up pretty easily — if a reviewer of consumer 1 or 2 notices the API uglinesses that I fixed for consumer 3, then landing the earlier consumers becomes dependent on landing consumer 3, which sucks. But also, none of this is really ready to land, and I’d like to iterate the API in my queue for a while with all the different consumers as test users, *without* lumping everything together into one massive patch.

distcc, ccache, and bacon

October 7th, 2011

This was initially a response to JGriffin’s GoFaster analysis post but grew out of control. Read that first.

Rampant speculation

tl;dr: hey, we could use ccache and distcc on our build system!

Just speculating (as usual), but…

The note about retiring slow slaves, combined with the performance gap between full and incremental builds, suggests something.

Why does additional hardware (the slow slaves) slow things down? Because load is unevenly distributed. Ignoring communication costs, the fastest way to build with a fast machine and a slow one that takes 2x longer would be to compile 2/3 of the files with the fast machine and 1/3 with the slow one. How? Remove all slow slaves from the build pool and convert them to distcc servers.

What about the clobber builds? Well, if you’ve already built a particular file before with the same compiler and options, it would be nice to not have to build it again. That’s what ccache is for. But a ccache per slave means you have to have built the same thing on the same slave. For try builds (which is where most of the clobbers are), that’s not going to happen all the time.

But combine that with the above distcc idea: you could run ccache under distcc on the distcc servers. Now you have a ccache/distcc sandwich: local ccache first, then distcc, then remote ccache, then finally some bacon. Because everything’s better with bacon.

ts;wm: (too short; want more)

You know, in terms of data sources, the above picture is wrong. It’s really local ccache, then remote ccache (via distcc), then remote compile, and only then bacon. But the configuration-centric ccache/distcc/ccache description makes for better visuals. Or would if I put the bacon on the inside, anyway.

Let’s walk through a clobber build. The stuff the local slave has built before gets pulled from local distcc. Some of the remaining stuff gets built locally. The rest gets sent over to various machines in the distcc pool. We can break those things down into 3 categories: (1) stuff that’s never been built anywhere, (2) stuff that’s been built on a different distcc host, and (3) stuff that’s been built on the same distcc host. #3 is a win. #1 is unavoidable, it’s the basic cost of doing business. (Actually, there’s another dimension, which is whether something has been built before on a non-distcc host. I’ll ignore that for now. Conceptually, you can make it go away by making every slave a distcc server.)

#2 is waste. But it’s less waste than we have now, if the distcc pool is smaller than the whole build pool, because you’re doing one redundant build per distcc host rather than one per builder. And it’s self-limiting: a distcc host that has a build cached returns it immediately, meaning it’s more likely to get stuck with something it needs to build, which sucks but at least it populates its ccache so it won’t have to do it again.

Now, I am assuming here that compile costs are greater than communication + ccache lookup costs, which is an insanely flawed assumption. But it’s very very true for my personal builds — I have my own distcc server, and my clobber builds (actually, *all* my builds) feel way way faster when I’m using it. So I don’t think the question is so much “would this work?” as it is “what would we need to do to make this work?”

For starters, do no harm: it would be great if we could partition the network so that distcc servers are separate from the current communication channels. Every build host would sit on two VLANs, say: the regular one and the distcc one. That would reduce chances of infrastructure meltdown through excessive distcc traffic. (I am not a network engineer, nor do I play one on TV, and this may require separate physical networks and possibly Pringles cans.)

On a related note, it might be wise to start out by restricting the slaves from doing too many distcc jobs at a time, to prevent the distcc jobs from getting bogged down through congestion. I do this for my own builds through a ~/.distcc/hosts file containing: “localhost/4”. That means you can use -j666, and it’ll still only do 4 jobs on localhost and 7 jobs on simultaneously. (Actually, that’s my home ~/.distcc/hosts file. My server at work is beefier, and there I allow the remote to do 12 jobs at once. I have a cron job that checks every 5 minutes to see what network I’m on and sets a ~/.distcc/hosts symlink accordingly. But I digress.)

More worrying is the reason behind all that clobbering. If a slave turns to the dark side, runs amok, gets hit by a cosmic ray, or is just having a bad day, do we really want to use its ccached builds? More to the point, when something goes wrong, what do we need to clobber? Right now everything is local to a slave, so it’s straightforward to pull a slave from the pool, take it out behind the garage, and beat the crap out of it with a stick. With distcc and ccache, it’s harder to tell which server to blame.

Still, how often does this happen? (I have no idea. I’m just a troublemaking developer, dammit.) We can always wipe the ccache on the whole distcc pool. It’d be nice to be able to track problems to their source, though. Maybe we could use the distcc pool redundancy to our advantage: have them cross-check the checksums of their builds with each other. Same input, same output. But that’s even more speculative.

It’s not all bad, though — I’m guessing that most clobbers result from the build system not being able to handle various types of change. If the ccache/distcc/ccache sandwich makes clobbers substantially cheaper, we can be a lot freer with them. Someone accidentally cancelled an m-c build partway through? Clobber the world! Let’s make bacon!!

wtf;yai;bdb: (what the f#@; you’re an idiot; been done before)

Reality check
  • We use local ccache already – see bug 488412
  • distcc has been proposed a number of times, but for the life of me I cannot find the bug. There are most likely some very valid reasons not to use it. Such as making a complete interdependent hairball out of our build system where one machine can kill everything.
  • Given the results in bug 488412, it’s very plausible that remote ccaches would be of no benefit or a net loss. (Though those numbers were using NFS to retrieve remote ccache results, and I deeply distrust NFS.)

Screw Reality. What has it ever done for me?

Hey, if we really needed to conceal network latency and redundant rebuilds across different hosts, we could stream out ccache results before they were even needed! But that’s crazy talk.

JS Probes

September 21st, 2011

Have you ever had your browser mysteriously stall periodically and wondered “what the f#@$! is it doing?!!” Or perhaps you’re working on something, say the garbage collector, and you’d like to see what effect your changes are having. Or maybe even write a little analysis that postprocesses some sort of trace of what is going on, and figures out what the optimal pattern of actions would be. (“If I’d thrown this big chunk of data out of the cache here, then I would’ve had room for all of these little things that got evicted instead, and would have had way fewer misses…”)

The usual way to do things like this is to manually add some instrumentation code (probably just logging a bunch of events) and postprocess the results. This works fine, but it has a few drawbacks: (1) you have to figure out where to insert your instrumentation, often in unfamiliar code; (2) you’ll need to recompile, possibly several times; (3) the logs can get very large very quickly; and (4) you’ll probably end up writing a very special-purpose postprocessor that (5) dumps stuff to a text file that only you know how to interpret, and even you will only remember what it all means for a week or two. The next time you need to do something similar, you’ll find that all of your instrumentation code is severely bitrotted and misses some paths that have been added in the meantime, so you’ll start everything over from scratch.

Well, tough luck. Sometimes those are just facts of life and you’ll need to suck it up. Quit whining, dammit.

But many times, the events of interest (or more precisely, “probe points”) are of general interest. If you can manage to slip them into the code and so get other developers to maintain them for you as they make changes, then everyone can rely on those probes being in roughly the right place permanently. That’s #1 above, and depending on how they’re implemented there’s a good chance you won’t even need to recompile, so that’s #2.

I’ve done an implementation of these sorts of probes in the SpiderMonkey Javascript engine. There are probe points like “a GC is starting (and it’s local to one compartment)”, “the heap has been resized”, and “javascript function F is being called/is returning.” Some of these are straightforward to place into the code — the start of a GC wasn’t hard to figure out, for example. Some weren’t so straightforward, such as JS function calls (they might seem simple, but what if you’re running JITted? Which JIT? Are you still running JITted by the time you return from the function?) I’ve delivered the probe information to various backends — anything from Windows’ ETW (blog post forthcoming whenever I manage to implement the start/stop functionality), to dtrace/systemtap (another blog post, probably coming sooner since I recently scraped together a demo), to a simple callback mechanism (see JS_SetFunctionCallback on MDN) and other special-purpose ones that only care about a small subset of probes.

#3 (log it all vs online handling) ventures into religious territory. It is easiest to mindlessly log everything of interest and postprocess it. But what if you want realtime updates? Or if you want to track different information depending on what you learn from other probe points? Or what if the volume of your log writing interferes with whatever you’re trying to measure (eg disk I/O)? Or maybe you need to track some sort of state in order to give the probes meaning. (GC when idle => good. Avoidable GC when the user is waiting => bad.)

Those arguments are what led to the creation of tools like DTrace and Systemtap. Both give you a scripting environment that can aggregate information from probes as they fire, control exactly what information gets tracked as things are happening, and can be attached/detached at any time. They’re pretty cool, and invaluable once you get familiar with them. They’re also extremely system-dependent and generally require root access or special builds or kernel debuginfo or something, which ends up meaning that you often can’t just hand off analysis scripts to other people and have those people get some use out of them. And even you may not be able to take them to another environment.

Still, they deal pretty well with #4 (avoiding one-use, special-purpose processors), at least for environments matching the one they were written for. And if they can draw from statically-inserted probe points (the type I was talking about above), they can actually be pretty general. #5 is still a killer, though — at least the way I write systemtap scripts, they all end up with idiosyncratic ways of dumping out the results of some particular analysis, and nobody else is going to get much enlightenment without studying the script for a while first.

What if we could do better? What if we could insert these static probes, but rather than feeding the information to some niche tool that is usable by only a handful of people, we make the data available to a plain old Firefox addon? You could collect, aggregate, summarize, mutilate, fold, spindle, or crush the data directly in JS code. Then we could let addon authors go crazy with visualizations and analysis libraries. That’d be cool, right?

Graph GC behavior. Warn the user when slow or suspicious stuff is happening. Figure out what’s going on during long event handlers. Graph the percentage of time spent in different subsystems. Correlate performance/trace data with user-meaningful actions. Make a flight-recording of various metrics and let the user walk through history. Your ideas here.

Ok, so I tricked you. I’m not going to tell you how to do any of that. This blog post is a tease, an advertisement for the work that Brian Burg did this summer during his Mozilla internship. If you’re interested, he’ll be giving his internship final presentation tomorrow (today when you’re reading this, or perhaps yesterday or last month for those of you who have fallen behind on your Planet reading.) That’s 1:30PM PDT on Thursday, September 22 at the Mountain View Mozilla headquarters, and I’m 97.2% sure it will be broadcast over Air Mozilla as well. And taped, I think? (Sadly, I can’t find where those are archived. Somebody please tell me and I’ll update this post.) There will be a demo. With pretty pictures! And he’ll be writing it up on his own blog Real Soon Now. I’m not going to say any more for now — I’d get it wrong anyway.

Update: Argh! I got the date wrong! It’s not Wednesday, September 21 as I originally wrote. It’s today, Thursday, September 22. Sorry for the confusion!