Slow Sessions – Tabs-on-Demand
Armed to the teeth with about:jank, I was testing session restore scenarios that people reported. While at it I came up with a testcase for bug 711193. At first we were going to use telemetry to debate the merits of tabs on demand by default, but I feel my example illustrates responsiveness problems with session-restore well enough. Gavin is looking into this so we can make a decision this week.
On my machine about:jank indicated that most lag was caused by our Direct2D accelerated drawing code, bug 721273. Turning off graphics acceleration (Options/Advanced/use hardware acceleration) made things a lot less slow. It would be nice if people experiencing lots of lag in their sessions (on YouTube, blogs with high quality backgrounds, etc.) could try about:jank. This requires running a very recent nightly.
Install the extension, go to about:jank, browse around, then refresh about:jank. In the case of gfx lag, DrawThebesLayers shows up on top.
Imminent Cycle Collector + GC Improvements
Olli is landing huge cycle collector improvements (bug 705582, bug 717500); half of the patches have landed so far. If that doesn’t solve all CC problems by Tuesday, Andrew is standing by with bug 710496 to limit how often CC can run. If we are lucky, incremental JS GC will land before Tuesday too (bug 641025). Landing by Tuesday means that these improvements have a good chance of showing up in Firefox 12. CC + GC are the most well-known causes of pauses in Firefox, so this is very exciting.
Telemetry histograms should now survive restarts (so we can do shutdown telemetry, etc), bug 707320.
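To make "surviving restarts" concrete, here is a minimal sketch of a histogram that saves its bucket counts to disk and picks them up again on the next run. This is an illustration in Python, not the code from bug 707320; the `Histogram` class, the JSON file format and the bucket bounds are all assumptions made for the example.

```python
import json
from bisect import bisect_right
from pathlib import Path


class Histogram:
    """Hypothetical bucketed histogram that can be persisted between runs."""

    def __init__(self, bucket_bounds):
        self.bounds = sorted(bucket_bounds)
        # One bucket below the first bound, one per gap, one above the last bound.
        self.counts = [0] * (len(self.bounds) + 1)

    def add(self, value):
        # bisect_right picks the first bucket whose upper bound exceeds value.
        self.counts[bisect_right(self.bounds, value)] += 1

    def save(self, path):
        Path(path).write_text(json.dumps({"bounds": self.bounds, "counts": self.counts}))

    @classmethod
    def load(cls, path, bucket_bounds):
        """Restore saved counts if the file exists and the buckets still match."""
        hist = cls(bucket_bounds)
        path = Path(path)
        if path.exists():
            saved = json.loads(path.read_text())
            if saved.get("bounds") == hist.bounds:
                hist.counts = list(saved["counts"])
        return hist
```

The idea is that `save` runs during shutdown and `load` runs on the next startup, so samples recorded late in a session (including shutdown timings) are not lost.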
We are now transitioning from identifying issues to fixing identified issues. It’s exciting to move from speculation as to what sucks to actual results. For more details see meeting notes.
Network Cache Horrors
Last week we discovered that our cache uses main thread locks to successfully block on off-main thread IO. See Bug 695399 and Bug 717761. QA did an experiment which confirmed that our disk cache is performing poorly.
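The pattern is easy to reproduce in miniature. The sketch below is a Python toy, not the actual cache code: `cache_lock`, `slow_disk_io` and the durations are hypothetical. A background thread holds a lock for the whole of a simulated disk operation, and the "main thread" stalls for about that long as soon as it asks for the same lock.

```python
import threading
import time

cache_lock = threading.Lock()   # hypothetical lock guarding cache state
io_started = threading.Event()  # lets the main thread know the lock is taken


def slow_disk_io(duration_s):
    """Background thread: holds the lock for the whole simulated disk operation."""
    with cache_lock:
        io_started.set()
        time.sleep(duration_s)  # stands in for a slow read or write


def main_thread_stall(duration_s=0.1):
    """Return how long the 'main thread' waited for the lock, in seconds."""
    io_started.clear()
    worker = threading.Thread(target=slow_disk_io, args=(duration_s,))
    worker.start()
    io_started.wait()  # make sure the IO thread already owns the lock
    start = time.monotonic()
    with cache_lock:   # the main thread blocks here until the IO finishes
        stalled = time.monotonic() - start
    worker.join()
    return stalled


if __name__ == "__main__":
    print(f"main thread stalled for {main_thread_stall() * 1000:.0f} ms")
```

Moving the IO to another thread buys nothing if the main thread then waits on a lock that thread holds; the stall is simply relocated.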
We are looking into reports of Flash lag, tracked in Bug 720000. Initial QA data shows a significant slowdown when a page is first loaded and smaller slowdowns later. There are also long browser pauses when the Flash container progress freezes.
Vlad continued work on non-destructive chromehang, Bug 712109. Client-side is ready to land and he is wrapping up symbolification for the server-side.
Jeff posted an early preview of the about:jank addon. He is also working on measuring painting speed via telemetry. Note this addon is buggy and requires a very recent nightly.
Last week I asked for some laggy session restore profiles. I’m behind on reproducing those (will be done today or next week). I’ve been in email contact with several of the commenters. I really appreciate the data gathered so far.
Jared landed smooth scrolling, Bug 198964. He is now working on hooking it up to scrolling via scrollbar, Bug 710373. Up next: fixing fallout from turning on smooth scrolling, hooking it up to the refresh driver and tweaking scrolling physics.
Marco landed inline autocomplete, Bug 566489 and is now fixing fallout from that too.
Saptashi did some analysis on the impact of running sqlite in async mode on mobile. Turns out it’s only a win for DELETEs. Expect a blog post from him soon.
Dave discovered that we sometimes wait on locks on the main thread.
Jeff and Bas are looking into diagnosing when d2d causes a slowdown.
There was discussion of a 4x reduction in cycle collection times landing soon, focusing on having the cycle collector run less, etc. Lots of work (chromehang, profiler, …) is continuing from last week.
I have been working under the assumption that the browser gets less snappy as more tabs are opened. More tabs increase the chances of having an ill-behaved website in the background. An ill-behaved tab (or a couple of them) can in theory ruin scrolling, typing, clicking, etc. in active tabs. However, I do not have anything beyond anecdotal evidence for this. There are bugs on specific websites in bugzilla, but it would be nice to get them mixed into a realistic set of tabs.
Would someone be willing to contribute a list of webpages they use often that cause Firefox to lag (maybe a session restore file?)? I am a low-tab person myself, so I can’t easily reproduce this. Please make sure that Firefox is slow with your list of tabs even when all addons are disabled, and include a description of the slowness encountered.
I expected a slow week, but there was a surprising amount of progress. I take this as further evidence that having managers go on vacation does wonders for engineer productivity.
Interactivity with lots of tabs
We spent a lot of time pondering how to approach browser sluggishness in light of having tons of tabs open. On one hand, people should understand that one can’t expect the browser to perform the same whether 1 tab is open or infinitely many. On the other hand, we should do more to a) make the browser punish background tab hogs and b) communicate hogs to the user.
For now we will look at throttling background setTimeouts (bug 715376, 715378, 715380), XMLHttpRequest loops, etc. more aggressively. We also plan to make more use of interactive state so Firefox can suspend non-critical tasks (bug 712478).
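The basic shape of timer throttling is a floor on the delay a tab may request, with a much higher floor for tabs in the background. Here is a minimal sketch of that policy, modelled in Python rather than taken from Gecko; the `Tab` type, the function names and both floor values are assumptions for illustration, not what the bugs above implement.

```python
from dataclasses import dataclass

FOREGROUND_MIN_MS = 4     # assumed floor for the active tab
BACKGROUND_MIN_MS = 1000  # assumed, much higher floor for background tabs


@dataclass
class Tab:
    name: str
    in_background: bool


def effective_delay_ms(tab, requested_ms):
    """Clamp a requested timer delay to the floor for the tab's current state."""
    floor = BACKGROUND_MIN_MS if tab.in_background else FOREGROUND_MIN_MS
    return max(requested_ms, floor)


def max_fires_per_second(tab, requested_ms):
    """Upper bound on how often a repeating timer can fire under the clamp."""
    return 1000 / effective_delay_ms(tab, requested_ms)
```

With these numbers, a page that re-arms a 10 ms timer in a loop is limited to one callback per second once its tab goes to the background, while long timers (say 5000 ms) are unaffected.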
Occasionally the cycle collector misbehaves. Andrew will look into running cycle collection less frequently when it is slow: bug 710496. Olli has been fixing many of the cycle-collection extremes. I don’t have bug #s for that, but apparently the improvements are dramatic.
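One way to "run less often when slow" is to make the wait before the next collection grow with the duration of the previous one. The sketch below shows that idea in Python; it is not the approach from bug 710496, and the `CollectorThrottle` class, its thresholds and its back-off factor are made up for the example.

```python
class CollectorThrottle:
    """Hypothetical policy: wait longer between collections after a slow one."""

    def __init__(self, base_interval_ms=5000, slow_threshold_ms=100,
                 backoff_factor=50, max_interval_ms=60000):
        self.base_interval_ms = base_interval_ms
        self.slow_threshold_ms = slow_threshold_ms
        self.backoff_factor = backoff_factor
        self.max_interval_ms = max_interval_ms
        self.interval_ms = base_interval_ms
        self.last_run_end_ms = None  # no collection has run yet

    def record_run(self, now_ms, duration_ms):
        """Note a finished collection and choose the wait before the next one."""
        self.last_run_end_ms = now_ms
        if duration_ms > self.slow_threshold_ms:
            # A slow run earns a wait proportional to how long it took, capped.
            scaled = duration_ms * self.backoff_factor
            self.interval_ms = min(self.max_interval_ms,
                                   max(self.base_interval_ms, scaled))
        else:
            self.interval_ms = self.base_interval_ms

    def may_run(self, now_ms):
        """True if enough time has passed since the last collection ended."""
        if self.last_run_end_ms is None:
            return True
        return now_ms - self.last_run_end_ms >= self.interval_ms
```

With these made-up defaults, a 20 ms collection allows the next one after 5 seconds, while a 400 ms collection pushes the next one out to 20 seconds, so the worst-case share of time spent collecting stays bounded.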
Thanks to telemetry we now know that some users experience tragic startup times ranging from 30 seconds to 34 hours (bug 701872). Our network cache is to blame for some of these (bug ). Another theory is that an unfortunate turn of events causes us to start loading webpages before the UI is shown (bug 715402).
Vlad will post some of his analysis and interested people can help us with telemetry forensics.
Being able to profile interactivity bugs is key to making the browser snappier. Large parts of Benoit’s interactivity profiler have landed (bug 713227). Using this extension on nightly win/mac should give you an idea of what it will look like when completed.
We make heavy use of compiler optimizations. Unfortunately one of them omits the frame pointer, which makes it hard for a profiler to walk the stack. Ehsan has set up a developer-friendly profiling branch.
Vlad is making progress on non-destructive chromehang (bug ). Traditionally we could not do this, but with a combination of telemetry + cycling Ehsan’s shiny new profiling branch on the nightly channel… we’ll be in developer heaven.
Peptest should be landing on try soon, Aki is wrapping stuff up. This should enable us to catch responsiveness regressions on our infrastructure.
Jared is almost done fixing tests to land smooth scrolling to gather feedback and move on to fancy physics (bug 710372).
Other ongoing projects with nothing specific to link to: Vlad’s slow-sql telemetry, Rafael’s quest to close sql connections so we can exit(0), QA browser-cache-effectiveness comparisons.