Tales of a Project Leader 2: arewefastyet
This continues the JägerMonkey project series. Part 1 is here.
arewefastyet (aka AWFY) seems to be the most famous single thing in the project (with the possible exception of the project art). The story starts in spring 2010, late March or so, when apparently I said something like: we should have a web page that shows our benchmark scores improving over time. The next thing I knew, David surprised me by showing me AWFY.
The funny thing is that I don’t actually remember suggesting it: David told me in the project debrief that I had. I’m pretty sure that when I mentioned the idea I was channeling Jeff Naughton, my databases professor at Wisconsin. He once told me about working on some database performance problem (possibly large-scale sorting), and how first you have to build up the basic infrastructure, but after that you can start implementing optimizations, and the fun begins as you watch your score improve. I (probably) thought that it would also be a lot of fun (and motivational) for us to watch our score improve once we had built a basic compiler.
And the next thing we knew after that, AWFY had gone viral. We hadn’t intended it to–we didn’t even think we had told anyone about it, but somehow it was on Reddit. (David tells me there were also a bunch of inflammatory posts there that he just had to respond to.) But we didn’t mind: it built a lot of excitement around our efforts, even impatience, as the “fans” would often ask us what was going on if the line hadn’t moved for a few days. I tried to build on the excitement by tweeting about optimizations as they landed, which combined well with AWFY since anyone could go there to see what I was tweeting about. The attention also drew in a couple of people, wx24 and Michael Clackler, who created better layout and UI for the site.
But we liked AWFY most of all as a motivational scoreboard. It was also up on an old laptop next to the JS pit. With AWFY, once you landed a patch to improve our performance, just a few minutes later it showed up on the scoreboard. Every win, even the small ones, counted for the project, and they all counted visibly on AWFY. (In the debrief, Chris Leary connected this to the concept of the Big Visible Chart.) And you could look at the recent history and be reminded of how much good stuff was happening–I loved starting my day by walking into the pit and seeing the recent string of wins.
Since then, a few other areweXyet sites have appeared; I don’t know how useful they’ve been. I may be wrong about this, but talking to people now, I often sense a faint belief of “and we shall build an areweXyet, and X shall come to pass”, and I don’t think it really works that way.
AWFY was great, as both viral marketing and a motivational tool, but other projects won’t automatically be able to replicate that experience. On the viral marketing side, I have no idea how it happened, so I guess I can’t say that it can’t be replicated. But I suspect there were many hidden ingredients that might not be present again: maybe people were primed to be excited about Firefox getting a big boost in JS performance, or maybe it was the novelty factor of a performance scoreboard with a joke name and a huge “NO” to answer the question. So I wouldn’t count on it.
The motivational value is solid, but applies to some projects and not others. The great thing about AWFY was that the score was almost entirely under our control (modulo a few ms of noise): land an optimization patch, and it definitely gets faster. If it gets slower when you’re not looking, just back out the regressing patch. (The ‘almost’ is because it’s possible some correctness fix would need to be made that would necessarily regress us, but that was a rare event.) Another great thing was that we had an achievable target: cross the lines. That one wasn’t entirely under our control, as the other JS engines were doing things too, but they weren’t doing a major new JIT like we were, so they weren’t changing so fast. Finally, there was really one measure we were most focused on, SunSpider, so we didn’t have to think about too many things and could really concentrate our efforts.
The opposite of AWFY, in my experience, was the blocker count for the Firefox 4 release. The stated goal was to fix all the blocking bugs, so that there were zero left, and then we would get to release. (And finally get JägerMonkey in the hands of all our users!) I think there were sometimes graphs of the blocker count showing on the big monitors here in MV. I also had my own tools to track things. The problem was that the blocker count could go up as well as down, so there was no notion of getting closer to the goal, or of efforts getting “locked in”, as they were with AWFY. I can’t remember the exact numbers, but the JS team was fixing something like 5 blockers per day, with 4 more arriving each day. Trying to get the blocker count down was like being on a fast treadmill. That was actually pretty motivating for me, but only in a negative way: I wanted off that exhausting treadmill. The goal wasn’t really achievable either: 0 blocking bugs represented a kind of perfection and was not realistic. Knowing you’ll definitely never really reach the goal is not encouraging.
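To make the treadmill arithmetic concrete, here is a minimal sketch. The rates are the approximate ones recalled above (about 5 fixed and 4 arriving per day); the starting backlog of 100 blockers and the function name `days_to_clear` are purely hypothetical, chosen only for illustration.

```python
import math


def days_to_clear(backlog, fixed_per_day, arriving_per_day):
    """Days until the blocker count reaches zero, or None if it never does."""
    if backlog <= 0:
        return 0
    net = fixed_per_day - arriving_per_day  # net progress per day
    if net <= 0:
        return None  # the count never shrinks
    return math.ceil(backlog / net)


backlog = 100  # hypothetical starting count, not a real figure

# If every fix were "locked in" (no new blockers arriving):
print(days_to_clear(backlog, 5, 0))  # 20

# With roughly 4 new blockers arriving per day:
print(days_to_clear(backlog, 5, 4))  # 100
```

Under these assumed numbers, the same daily effort takes five times as long to reach zero, which is roughly what the treadmill felt like.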
So, to be a positive motivational tool, I think a programming scoreboard ideally is super-simple and easy to read, has the score fully under the control of the developers, generally moves only in a positive direction, and focuses on building up wins. Some projects have goals that obviously map onto that, others might fit with some cleverness, and others just won’t. I’m sure not all of those elements are necessary: it’s not about following a recipe, it’s about designing your own tool, for yourself and your team, that fits you and is motivational to you. If posting scores to it feels good, it’s working.
AWFY could perhaps be seen as an instance of gamification, and I think games or the book Reality is Broken would be good sources of inspiration for project scoreboards. I wouldn’t recommend actually turning a software engineering project into a game, though: there’s always tons of stuff to pay attention to other than the score, and you definitely don’t want people to start trying to game the system. (I have heard horror stories of management at company X setting up a system to score developers on the number of source checkins, with the predictable results.) We never had that problem because we didn’t use AWFY to control or evaluate our work (we did that through thought and discussion): we just used it to make looking at the results of our work more fun.