Apr 11

Measuring Startup Speed Correctly

Until recently our state of the art method for measuring startup was to subtract a timestamp passed via commandline from a new Date() timestamp within a <script> tag. Vlad pioneered this approach, me and others adopted it.

Turns out there are two problems with this approach:

  1. It is cumbersome, especially on Windows where there is no easy way to pass a timestamp via the commandline.
  2. It is wrong. Turns out that Firefox starts loading web pages before the UI is shown. One can’t be sure that the page being loaded is within a visible browser

Our oldest startup benchmark, ts,  has been gathering wrong numbers all along.  This resulted in a class of perverse optimizations that decreased the ts number, but increased the time taken for UI to appear (ie bug 641691). The new tpaint (bug 612190) benchmark should should address this. On my machine measuring pageload vs paint-time results in a 50-100ms difference. See the graph server for more data.

This is why AMO’s complicated method of measuring startup is wrong. Please use our shiny new about:startup extension or if you absolutely want to avoid adding any overhead use getStartupInfo API directly.

Apr 11

Arguing about startup: Communicating performance

Recently the addon team started working towards penalizing addons that penalize our startup. We have solid data shows that startup gets worse as more addons installed so the effort is justified. Justin published this picture to illustrate the problem:

However some technical mistakes were made. Wladimir (AdBlock+ guy!) has been busy exposing them on his blog. Thanks Wladimir! I spent a couple of years understanding Firefox startup, so I really appreciate Wladimir’s remarkable speed/quality in poking holes in AMO approach.

Rating Addon Performance

Wlad’s latest point is regarding how addon impact is measured on warm startup with an empty profile. I agree that addon impact on warm startup with clean profile is going to be different from that of cold startup with a dirty real-world profile. I also agree that it is weird to primarily measure warm startup given that our data clearly indicates that most users are experiencing cold startup.

However startup is an irritating multidimensional problem influenced by a large number of factors. Should we suck it up and let addons kill warm startup (and risk ruining firefox upgrade times, pissing off web developers)? How does one choose a typical dirty profile? Creating a “typical” dirty profile is a tough problem (ie due to privacy, disk fragmentation) and there is no reason to let clean profile performance be degraded.

So an addon performance rating will not be perfect. Time added by addon during warm startup is the easiest starting point and is one of the more deterministic measures (Wlad’s blog says otherwise, but measuring other startup kinds have even more noise). This is no worse than choosing browsers based on their JavaScript benchmark performance.

I think a much awesomer addon performance rating may be obtainable by clever statistics boffins by analyzing our aggregated startup data. Perhaps the Mozilla Metrics will make it a reality, but in the meantime we’ll have to make do with an imperfect approach.

Coming soon: Why is the measurement approach in Jorge’s post is overly complicated and somewhat incorrect.