Builtin Startup Measurement

I got used to measuring startup the complicated way (example here). It’s complicated enough that many people prefer to use stopwatches.

Turns out modern operating systems can help applications self-diagnose startup speed. Thanks to landing bug 522375 we now provide an API for measuring startup speed. For example, now I know that xpcshell takes forever to startup on mac

./xpcshell -e 'print(new Date() - Components.classes["@mozilla.org/toolkit/app-startup;1"].getService(Components.interfaces.nsIAppStartup_MOZILLA_2_0).getStartupInfo().process)'

At any point one can now go to the error console and type in

uneval(Components.classes["@mozilla.org/toolkit/app-startup;1"].getService(Components.interfaces.nsIAppStartup_MOZILLA_2_0).getStartupInfo())

or

var si=Components.classes["@mozilla.org/toolkit/app-startup;1"].getService(Components.interfaces.nsIAppStartup_MOZILLA_2_0).getStartupInfo(); si.sessionRestored-si.process

to get various interesting timestamps. So next time you see a startup take a surprising amount of time, you can go and poke around to see where that time was spent. At the moment there are 4 datapoints:

  1. .process – Process creation timestamp. This is cool because this happens before any library code is executed.
  2. .main – XRE_main timestamp. My favourite thing to do is to subtract .process from .main. This demonstrates huge overheads that many application programmers refuse to believe in.
  3. .firstPaint – Timestamp of the first intended paint. This coincides with when the user sees the first sign of life.
  4. .sessionRestore – Timestamp of session restore, ie when the browser becomes useful.

Lots of people helped in getting this feature landed, but two people stand out. Daniel Brooks originally figured how to expose Windows/Linux startup speed in Mozilla. Mike Hommey devised a morally-reprehensible way to get precise startup speed out of the idiotic way that Linux presents it.

Linux Sucks

Mac and Windows expose process creation times via human time. It’s just like any other API with time in it. Linux provides process startup speed in imbecile jiffies-since-boot. That’d be irritating enough, but to piss people off further there is no way to convert that into human time. Clever ps developers resort to calculating jiffies/second by comparing against seconds-since-start in /proc/uptime. Unfortunately that does not even come close to providing anything close to millisecond resolution needed for useful numbers. Mike’s idea of timing startup of another thread/task to get a known jiffy-stamp got us precise-enough numbers (around 10ms resolution with most kernels, see patch for details). At least Linus was kind enough to let mere user-space devs obtain the current tick-rate.

Update:

Linux doesn’t suck anymore, a commenter pointed me to a relatively new(2006) taskstats interface which appears to work sanely like the BSD one.

Mike Hommey made an about:startup extension using this API.

15 comments

  1. Firee pas,Fox,bĂȘta ne marche pas avec Android…

  2. Why on earth are you parsing /proc/anything when you could just be calling times() and/or getrusage()? Which exposes time in units that can be retrieved (sysconf(_SC_CLK_TCK)) — actually I would have assumed that that call gave you the jiffies-per-second multiplier if you *had* to be parsing /proc, but I don’t actually *know* that.

  3. @Zack

    Cos getrusage() does give useful time? it’s cpu-time, not wallclock time. It exposes time in sane units.

  4. Okay, so if you want wall-clock time, what’s wrong with gettimeofday()? Or good old PR_Now()?

    (less snarkily, you might could get some mileage out of the stuff in js/src/perf)

  5. @Zack,
    nothing wrong with gettimeofday() other than something needs to run it precisely at the point of process creation so we can include library-loading overhead. This is what the starttime jiffies in /proc tell me.

  6. @tglek: then the proper way to do it is mark the time from the parent process, by calling gettimeofday() right before fork()/exec() is called. You can then write this time to a file if you need it from within the program.

  7. @jimis that’s called a workaround, and is similar to what we did.
    The proper way to do it is to provide a proper api.

  8. > Clever ps developers resort to calculating jiffies/second
    > by comparing against seconds-since-start in /proc/uptime.
    > Unfortunately that does not even come close to providing
    > anything close to millisecond resolution needed for useful
    > numbers.

    I’ll add that it is so unprecise that we’ve observed timings off by more than half a second, which is just huge considering the times we’re trying to measure. I’ll also add that ps developer didn’t have to care as much as we do about precision, since what ps displays is a startup time in hh:mm form.

  9. What about adding a CTOR or some init handler, and make sure that function is the first thing run by the linker?
    Then you would just need to store the time in some global var..

  10. Nuno: that’d need to be a ctor in ld.so.

  11. Wow, the task stats stuff looks nice. Looks a lot like the bsd proc_info on BSDs. It does indeed contain startup info.

  12. Could You tell me how can I mesure start time for Firefox 3.x to compre it with Firefox 4 and Firefox 4 with your add-on?

  13. Regarding the incorrect Linux comments…

    While times() & getrusage() report clock virtual ticks (API sets the virtual tick to 100Hz as kernel clock tick can be anything from 128 to >1000)…

    …Why one wouldn’t just use the standard POSIX-2001 clock_gettime(CLOCK_PROCESS_CPUTIME_ID) high resolution timer?

    An end-user measurement would also be nice i.e. time from user clicking the app icon until:
    * the application UI comes visible
    * the UI is usable (takes user input etc)
    * UI has stopped drawing/updating itself
    (default page can affect this a lot)

    Xresponse can be used for that:
    http://maemo.gitorious.org/maemo-tools/xresponse

    It monitors & times (using X server’s timestamps) both user input and screen updates.

    Note that end-user startup values can of course differ a lot from run to run because there’s:
    * scheduling (if other processes like X are busy, you need to wait up until next scheduling tick before next process is sheduled)
    * disk access (if you’re trying to read disk while it’s being written by your or other processes)
    * and in browser case maybe also network access (say dhcp, which you can catch with strace)

    So things that measure CPU time taken just by single process (PROCESS_CPUTIME) give more reproducible information, but it shouldn’t be the only measurement as otherwise you can be misled. You need to measure both to know which one’s more of a problem or has regressions. :-)

  14. Eero, interesting pointers. However in the case of startup, process cpu time is largely irrelevant since so much of the processing happens outside of the process.
    I agree that general it’s good to keep track of multiple numbers, but in this case the disparity between cpu time and actual time spent is so huge that it doesn’t make sense to focus on cpu time until we significantly reduce time spent in the operating system(which we are working on)