25
Jun 12

Slow startup logging: help wanted

We have an intern working on a tool to analyze how other Windows applications and services affect Firefox startup.

If you run Windows and experience slow startups, please see Nicolas’ blog post and submit the data his add-on gathers.

P.S. I will not have time to post a Snappy update for last week. The next update will cover two weeks.


18
Jun 12

40-60% of startups are warm?

Note: click on the images if they get clipped by other content. Cold startups are those where data has to be read in from disk, warm ones are subsequent startups where the OS already has Firefox files in memory.

I’m really surprised by the number of warm startups done by Firefox users. Somewhere between 40% and 60% of startups are warm. On Linux you can see that by watching whether page faults occur while loading the firefox binary, via the EARLY_GLUE_STARTUP_HARD_FAULTS histogram.
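To make the Linux check concrete, here is a minimal sketch of the idea, not the actual telemetry code. It assumes that a "hard fault" corresponds to a major page fault as reported by `getrusage` (Unix-only), and that any hard fault during loading means a cold startup. The function names and the zero-fault cutoff are my own assumptions for illustration.

```python
import resource  # Unix-only standard library module


def hard_faults_so_far() -> int:
    """Major (disk-backed) page faults taken by this process so far."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_majflt


def classify_startup(hard_faults: int) -> str:
    """Assumed rule: any hard fault while loading means data came from disk."""
    return "cold" if hard_faults > 0 else "warm"


def warm_fraction(hard_fault_samples) -> float:
    """Share of startups classified as warm, given one fault count per startup."""
    samples = list(hard_fault_samples)
    if not samples:
        return 0.0
    warm = sum(1 for faults in samples if classify_startup(faults) == "warm")
    return warm / len(samples)
```

Feeding `warm_fraction` one fault count per recorded startup gives the kind of warm/cold split discussed above.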

On Windows we do not have a good metric for distinguishing cold startups from warm ones. However, we can look at the distribution of the firstpaint histogram and see that faster startups are more common than slower ones. Only a small minority of machines should be able to cold start a browser in <3 seconds. We have a lot of startups of various degrees of warmness.
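As a rough sketch of that kind of reading, the share of startups that paint within a cutoff can be estimated from bucketed counts. The histogram layout here (a mapping from bucket upper bound in milliseconds to a count) is an assumption for illustration, not the real telemetry format, and `share_at_or_below` is a hypothetical helper.

```python
def share_at_or_below(histogram: dict, threshold_ms: int) -> float:
    """Fraction of samples in buckets whose upper bound is <= threshold_ms.

    `histogram` maps a bucket's upper bound (ms) to the number of startups
    that fell into that bucket (assumed layout).
    """
    total = sum(histogram.values())
    if total == 0:
        return 0.0
    fast = sum(count for upper_ms, count in histogram.items()
               if upper_ms <= threshold_ms)
    return fast / total
```

With a 3000 ms cutoff this gives a crude upper estimate of how many startups were too fast to be plausibly cold.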

I have no explanation for why people restart Firefox so much. We know that fewer than 10% of our shutdowns are unclean (most of those appear to be due to OS shutdown not waiting on Firefox, i.e. us shutting down too slowly), so users aren’t crashing their browser and starting again. They are voluntarily closing the browser and then starting it again soon after (i.e. the OS doesn’t get a chance to flush Firefox out of the filesystem cache).

These patterns are pretty consistent across all of the Firefox release channels I checked, so I can’t blame warm startups on nightly users getting barraged with upgrade prompts. Can someone come up with a good theory (preferably with some evidence) for this?

Note that telemetry only collects data once a day and requires the browser to be open for a few minutes before submitting data, so the data could be skewed here.


14
Jun 12

Snappy, June 14th: Telemetry Investigations

There is no news from the Firefox frontend team this week.

Adventures in Measuring Changes

The Necko team spent this week investigating why the recent big cache fix was not showing up as a win in telemetry.

We were on the verge of a big backout when Saptarshi Guha’s analysis in bug 762576 suggested that we might actually be winning. It’s frustrating to have data point us in different directions. However, it is better to try to make sense of data than to have no data at all, as was the case only a year ago. I’ll have more on this next week.

William McCloskey landed a fix to turn on incremental GC for real (bug 761739). This might fix the mysterious recent user-responsiveness regression spotted by telemetry (bug 761722). He also landed another GC speedup in bug 743396.

Mark Cote met with metrics analysts to discuss reporting peptest results robustly. The goal is to avoid noise in reporting, so that responsiveness regressions are acted upon.

Interactivity Profiler

Benoit Girard added badges to mark known stacks in the profiler; see his blog post. A few weeks ago Vladan taught the symbolication server to serve data from local .pdb files, allowing developers to use Benoit’s profiler in their own builds. Mike Conley added incomplete Thunderbird support to the profiler.

 


11
Jun 12

Snappy, June 7

Notes.

Justin’s FUEL fix will help add-ons avoid leaks and shutdown hangs: bug 750454.

Jared plans to start landing the Australis tab strip (bug 738491) on the UX branch this week. Australis is our new, faster UI theme.

We landed a cache locking fix recently (bug 722034), but telemetry is now showing a regression (bug 761736), so this will likely be backed out and reworked.

Vladan blogged about the first results from our non-destructive chromehang. Last year we briefly caused our nightly to crash if it hung for over 30 seconds, which got us a lot of useful data (and some of the initial Snappy bugs). This piggybacked on our crash-handling infrastructure, so it was a very effective experiment (a bit brutal, though). Vladan spent time this year working on plumbing to get the same sort of data non-destructively. As a result, we are looking to turn on frame pointers in nightly builds and dial down hang detection to 5 seconds (bug 763124).
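The general shape of non-destructive hang detection can be sketched as a watchdog thread that notices when the monitored thread stops checking in, and records the stall instead of crashing. This is only an illustration of the pattern, not how chromehang is implemented. `HangMonitor`, its parameters, and the callback are hypothetical names.

```python
import threading
import time


class HangMonitor:
    """Hypothetical watchdog: reports a stall once, without killing anything."""

    def __init__(self, threshold_s, on_hang, poll_s=0.01):
        self._threshold_s = threshold_s  # e.g. 5.0 for a 5 second cutoff
        self._on_hang = on_hang          # called with the stall length in seconds
        self._poll_s = poll_s
        self._last_beat = time.monotonic()
        self._reported = False
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._watch, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def heartbeat(self):
        """Called by the monitored thread each time it processes an event."""
        with self._lock:
            self._last_beat = time.monotonic()
            self._reported = False

    def _watch(self):
        # Poll until stop() is called; report each stall at most once.
        while not self._stop.wait(self._poll_s):
            with self._lock:
                stalled_for = time.monotonic() - self._last_beat
                should_report = (stalled_for >= self._threshold_s
                                 and not self._reported)
                if should_report:
                    self._reported = True
            if should_report:
                self._on_hang(stalled_for)
```

In a real browser the callback would capture a stack of the hung thread, which is where frame pointers become useful.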


04
Jun 12

Snappy, May 31st – Less lag

On Friday, the Necko team finally landed a fix that makes the cache less likely to freeze the UI thread during reads: bug 722034. Cache writes and other less common cache use cases remain problematic (tracked by bug 717761). Poor cache/main-thread interactions are one of the main causes of UI lag tracked by the Snappy project, so this is very exciting. Barring the need to back it out, this fix will appear in Firefox 15.

Help Wanted: The Necko team is looking for some help to determine the optimal disk cache size; please see Nick’s post. We need users to install an extension and submit detailed stats on our cache lifecycle.

There are various Firefox frontend fixes in progress: improving session restore (working towards bugs 669603 and 669034), FUEL (bug 750454), the search service (bug 722332) and the new theme (bug 732583). I will blog about these in more detail as they land.

Bill turned on incremental GC again. Hopefully it will stay on in Firefox 15.

Andrew is making progress on reducing CC pauses while closing tabs: bug 754495.

Brian has instrumented our event loop to measure the extent of Firefox lag when responding to user events (bug 759449). This is different from measuring general event-loop lag in that it focuses on lag that the user would actually notice. Look for the EVENTLOOP_UI_LAG_EXP_MS histogram in our telemetry dashboard (yes, we are the only browser vendor to make this sort of data public). This should help us track progress as we tweak heuristics to delay background processing during user interaction (e.g. bug 712478).

Brian also landed a way to bypass the Windows prefetch service via our privileged silent update service; see bug 692255. In my testing, Windows prefetch tends to prefetch too many files, slowing down startup for complex apps like Firefox. Hopefully we can do better with our own prefetch.
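To illustrate the difference between general event-loop lag and user-visible lag, here is a small sketch that only counts delays on user input events and files them into exponentially growing buckets. The event tuple layout, the 50 ms first bucket, and the doubling scheme are assumptions chosen for the example; they are not taken from the actual instrumentation.

```python
from collections import Counter


def bucket_for(lag_ms: float, first_bound_ms: int = 50) -> int:
    """Lower bound of the exponential bucket (50, 100, 200, ...) for a lag.

    Lags below the first bound go into bucket 0 (assumed to be unnoticeable).
    """
    if lag_ms < first_bound_ms:
        return 0
    bound = first_bound_ms
    while lag_ms >= bound * 2:
        bound *= 2
    return bound


def ui_lag_histogram(events) -> dict:
    """Build a lag histogram from (queued_ms, handled_ms, is_user_input) tuples.

    Only user input events are counted; background events are skipped because
    the user never waits on them directly.
    """
    hist = Counter()
    for queued_ms, handled_ms, is_user_input in events:
        if not is_user_input:
            continue
        hist[bucket_for(handled_ms - queued_ms)] += 1
    return dict(hist)
```

A slow background event therefore leaves this histogram untouched, while the same delay on a click or keypress shows up in one of the higher buckets.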