Slow Sessions – Tabs-on-Demand
Armed to the teeth with about:jank, I was testing session restore scenarios that people reported. While at it I came up with a testcase for bug 711193. At first we were going to use telemetry to debate the merits of tabs on demand by default, but I feel my example illustrates responsiveness problems with session-restore well enough. Gavin is looking into this so we can make a decision this week.
Laggy Sessions
On my machine about:jank indicated that most lag was caused by our Direct2D accelerated drawing code, bug 721273. Turning off graphics acceleration (Options/Advanced/use hardware acceleration) made things a lot less slow. It would be nice if people experiencing lots of lag in their sessions (on YouTube, blogs with high-quality backgrounds, etc.) could try about:jank. This requires running a very recent nightly.
Install the extension, go to about:jank, browse around, then refresh about:jank. In the case of gfx lag, DrawThebesLayers shows up on top.
Imminent Cycle Collector + GC Improvements
Olli is landing huge cycle collector improvements (half of the patches landed so far), bug 705582, bug 717500. If that doesn’t solve all CC problems by Tuesday, Andrew is standing by with bug 710496 to limit how often CC can run. If we are lucky, incremental JS GC will land before Tuesday too (bug 641025). Landing by Tuesday means that these improvements have a good chance of showing up in Firefox 12. CC + GC are the most well-known causes of pauses in Firefox, so this is very exciting.
Other stuff
Profiling tools are moving along at a good clip. Benoit’s profiler works well on Mac now; hopefully Windows support will happen next week. Non-destructive chromehang has almost landed.
Telemetry histograms should now survive restarts (so we can do shutdown telemetry, etc), bug 707320.
Peptest didn’t manage to survive deployment on try due to bugs 719618 and 719511.
We are now transitioning from identifying issues to fixing identified issues. It’s exciting to move from speculation as to what sucks to actual results. For more details see meeting notes.
So how do I submit the data I get from about:jank?
It currently looks like this after a few minutes of usage (Linux):
about:jank records samples that occurred during periods when we did not service the event loop for more than 100 ms.
NOTE: about:jank doesn’t interact well with the Gecko Profiler Addon
about:jank results (3531 samples)
1192 – c-GC::GarbageCollectNow
470 – c-gfx::DrawThebesLayer
414 – c-JS::EvaluateString
392 – c-JS::CallEventHandler
258 – c-GC::CycleCollectNow
237 – c-nsEventListenerManager::HandleEventInternal
204 – c-layout::DoReflow
87 – c-layout::FlushPendingNotifications
62 – c-network::nsHttpChannel::OnDataAvailable
61 – c-html5::RunFlushLoop
51 – c-network::nsStreamLoader::OnStopRequest
31 – c-CSS::ProcessRestyles
15 – c-bookmarks::RunInBatchMode
14 – c-bookmarks::RemoveFolderChilder
12 – c-storage::Statement::ExecuteStep
11 – c-event::nsViewManager::DispatchEvent
6 – c-network::nsHttpChannel::OnStopRequest
4 – c-nsInputStreamPump::OnStateStart
3 – c-AnnotationService::SetItemAnnotation
3 – c-storage::Connection::initialize
2 – c-Paint::PresShell::Paint
1 – c-nsHttpChannel::OnStartRequest
1 – c-image::imgFrame::Draw
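Reports like the one above are easier to compare once the raw counts are turned into shares of the total (the counts listed above do add up to the 3531 samples in the header). What follows is only a minimal sketch in Python, not part of the about:jank add-on: the function names `parse_jank_report` and `rank_by_share` are hypothetical, and the only assumption about the format is the "count – label" line shape seen in the pasted reports.

```python
import re

# Matches lines such as "1192 – c-GC::GarbageCollectNow".
# Accepts either an en dash or a plain hyphen as the separator.
LINE_RE = re.compile(r"^\s*(\d+)\s*[–-]\s*(\S.*?)\s*$")


def parse_jank_report(text):
    """Return {label: sample_count} for every line shaped like 'N – label'."""
    counts = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            label = match.group(2)
            counts[label] = counts.get(label, 0) + int(match.group(1))
    return counts


def rank_by_share(counts):
    """Return [(label, count, percent)] sorted by descending count.

    The percentage is relative to the lines that were parsed, so an
    excerpt of a report gives shares of the excerpt, not of the session.
    """
    total = sum(counts.values())
    if total == 0:
        return []
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return [(label, n, 100.0 * n / total) for label, n in ranked]


# Excerpt (top five lines) of the report pasted above.
REPORT = """\
about:jank results (3531 samples)
1192 – c-GC::GarbageCollectNow
470 – c-gfx::DrawThebesLayer
414 – c-JS::EvaluateString
392 – c-JS::CallEventHandler
258 – c-GC::CycleCollectNow
"""

if __name__ == "__main__":
    for label, n, pct in rank_by_share(parse_jank_report(REPORT)):
        print(f"{pct:5.1f}%  {n:5d}  {label}")
```

The header line is skipped because it does not start with a number, so the same function can be fed a whole pasted report.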
I don’t get it. What else should/could be on top? Isn’t that the primary purpose of a browser, to render pages to my screen?
How does that happen without graphics!?!
Tom, I meant accelerated graphics. Yes, that’s the main and imho most complex part of the browser.
Also this spams my terminal. Can you update the addon to not do that please?
@Tom
If I understand correctly, “nothing” should be on top — ideally the entire list is empty. It isn’t a profiler, it’s a counter of “janks” and if your browser is smooth as butter it never records anything.
Someone correct me if my understanding is wrong.
Here is what I see after a little bit of browsing the web:
about:jank results (20804 samples)
5023 – c-GC::GarbageCollectNow
4625 – c-CSS::ProcessRestyles
3614 – c-GC::CycleCollectNow
2552 – c-nsEventListenerManager::HandleEventInternal
2211 – c-JS::CallEventHandler
795 – c-JS::EvaluateString
759 – c-layout::FlushPendingNotifications
636 – c-layout::DoReflow
203 – c-gfx::DrawThebesLayer
119 – c-html5::RunFlushLoop
82 – c-event::nsViewManager::DispatchEvent
60 – c-Timer::Fire
53 – c-Paint::PresShell::Paint
41 – c-plugin::DoStopPlugin
12 – c-image::imgFrame::Draw
12 – c-network::nsHttpChannel::OnStopRequest
3 – c-JS::EvaluateStringWithValue
2 – c-network::nsHttpChannel::OnDataAvailable
1 – c-storage::Statement::ExecuteStep
1 – c-network::nsStreamLoader::OnStopRequest
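The "counter of janks" reading above can be illustrated with a small simulation. This is a sketch under stated assumptions, not the add-on's actual implementation: the 100 ms threshold comes from the about:jank description, while the 10 ms sampling interval, the task labels, and the function name `count_jank_samples` are made up for illustration.

```python
from collections import Counter

JANK_THRESHOLD_MS = 100  # from the about:jank description (stalls over 100 ms)
SAMPLE_INTERVAL_MS = 10  # assumed sampling period, chosen only for illustration


def count_jank_samples(tasks, threshold_ms=JANK_THRESHOLD_MS,
                       interval_ms=SAMPLE_INTERVAL_MS):
    """Simulate a periodic sampler over back-to-back event-loop tasks.

    tasks: list of (label, duration_ms), run one after another.
    A sample is attributed to a task's label only if the sampler fires
    while that task has already blocked the loop for more than threshold_ms.
    """
    samples = Counter()
    now = 0
    for label, duration in tasks:
        start, end = now, now + duration
        # First sampler tick at or after the task start (integer ceiling).
        tick = -(-start // interval_ms) * interval_ms
        while tick < end:
            if tick - start > threshold_ms:
                samples[label] += 1
            tick += interval_ms
        now = end
    return samples


if __name__ == "__main__":
    # Hypothetical workload: two long stalls between short paints.
    janky = [("paint", 16), ("gc", 250), ("paint", 16), ("script", 130)]
    print(count_jank_samples(janky))   # gc: 15 samples, script: 3 samples

    # A session made only of short tasks records nothing at all.
    smooth = [("paint", 16)] * 50
    print(count_jank_samples(smooth))  # empty Counter
```

In this model a short task never shows up no matter how often it runs, which is why a perfectly smooth session would produce an empty list rather than a list topped by drawing code.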
I have a concern with using tabs on demand –by default– with a slow to intermediate internet connection.
On startup, the selected tab is responsive due to other tabs not being loaded in the meantime.
But when I switch to another tab, I have to wait for the data to download and the tab to render. This is pretty annoying compared to the case where all tabs are already loaded (although this means I had to wait longer at startup).
So I end up having tabs on demand activated at home with my super fast internet connection, and not activated at work where we (currently and temporarily) have a poor connection.
First, congratulations to the whole Snappy team. The speed of development has been much faster than the Memshrink project. The list of bugs identified in such a short space of time is amazing, considering the Xmas and New Year holidays were in between. Snappy took less than a month to reach this stage.
Let’s just hope more bugs will be found and fixed in the near future.
I have a concern with using tabs on demand by default as well. Could the page be loaded but not rendered until the tab is clicked?
What are the chances that we’ll see some forward movement on electrolysis in 2012 and getting tabs in their own processes?
@geeknik
There will be movement towards making the browser a lot more concurrent in 2012. I expect snappy to wrap up by September and jump into e10s/threading/etc. Should know more in a few months.
@Ed, memshrink made amazing progress since ff7. I hope snappy can do as well.
@Ed with regards to doing everything but rendering, we are looking into all options. We obviously can’t do anything super-sophisticated (i.e. risky) right away.
On an almost new profile, I browsed WebGL sites, JavaScript-heavy news sites, and YouTube. There wasn’t much lag, however:
706 – c-JS::EvaluateString
387 – c-plugin::DoStopPlugin (when switching YouTube videos)
382 – c-gfx::DrawThebesLayer
284 – c-Timer::Fire
278 – c-layout::FlushPendingNotifications
216 – c-GC::CycleCollectNow
210 – c-layout::DoReflow
149 – c-CSS::ProcessRestyles
137 – c-GC::GarbageCollectNow
135 – c-content::nsXMLHttpRequest::OnStopRequest
113 – c-JS::CallEventHandler
74 – c-nsInputStreamPump::OnStateStart
42 – c-html5::RunFlushLoop
36 – c-JS::EvaluateStringWithValue
33 – c-storage::Statement::ExecuteStep
23 – c-Paint::PresShell::Paint
21 – c-nsObjectFrame::InstantiatePlugin
11 – c-event::nsViewManager::DispatchEvent
2 – c-storage::Connection::initialize
2 – c-network::nsHttpChannel::OnDataAvailable
2 – c-plugin::nsObjectFrame::Instantiate
1 – c-Input::nsInputStreamPump::OnStateTransfer
1 – c-image::imgFrame::Draw
NVIDIA 9800 GT, drivers updated, Windows 7, Direct3D 10 + Direct2D.
Hi o/
I have a question: what do we do with about:jank results?
For example, I can easily reproduce multi-second janks by making the hard disk on which ff’s cache resides busy. I *suspect* this might be bug 717761. about:jank indicates:
c-network::nsHttpChannel::OnStopRequest
Is there anything I can do to confirm if one corresponds to the other? Would it make sense to document about:jank signatures in corresponding snappy bugs?
Also, the jank signature c-JS::CallEventHandler looks a bit meaningless. Any way to figure out what the event was?
Thanks, and congratulations on great progress!
sysKin, we are working on the cache issue. There is no place to post these reports yet other than this blog. about:jank is still experimental, so the results aren’t super-useful yet.
It might be worth mentioning that hangs are much more noticeable when the profile dir is on a slow medium, e.g. a USB1/2 thumbdrive.
In such cases, the hangs can be as long as a few seconds, and are relatively frequent (more than once a minute) when scrolling a ‘heavy’ page up and down. Tested with Iceweasel 9.01 on Linux with http://ynet.co.il.
P.S. Flashblock and adblock plus were on.
Please have a look at the Customer Model Bug. I recently tried to isolate a good testcase. This might expose an issue of the kind you are searching.
here is the bug:
https://bugzilla.mozilla.org/show_bug.cgi?id=610347
and the jank-output:
741 – c-gfx::DrawThebesLayer
281 – c-Paint::PresShell::Paint
2 – c-event::nsViewManager::DispatchEvent
1 – c-nsEventListenerManager::HandleEventInternal
17743 – c-GC::CycleCollectNow
4823 – c-nsEventListenerManager::HandleEventInternal
4821 – c-GC::GarbageCollectNow
1765 – c-Timer::Fire
523 – c-CSS::ProcessRestyles
469 – c-JS::CallEventHandler
279 – c-gfx::DrawThebesLayer
248 – c-layout::FlushPendingNotifications
154 – c-layout::DoReflow
144 – c-html5::RunFlushLoop
127 – c-JS::EvaluateString
102 – c-bookmarks::RunInBatchMode
68 – c-bookmarks::RemoveFolderChilder
51 – c-storage::Statement::ExecuteStep
33 – c-event::nsViewManager::DispatchEvent
27 – c-Paint::PresShell::Paint
20 – c-image::imgFrame::Draw
9 – c-network::nsHttpChannel::OnStopRequest
9 – c-network::nsHttpChannel::OnDataAvailable
5 – c-AnnotationService::SetItemAnnotation
3 – c-Input::DispatchMouseEvent
2 – c-JS::EvaluateStringWithValue
1 – c-nsHttpChannel::OnStartRequest
1 – c-PluginModuleParent::NPP_NewStream
Tell me to stop if I am doing this wrong, but I have two more extreme cases of DrawThebesLayer jank. I think fixing these (and the one above) could be a good starting point for improving DrawThebesLayer.
https://bugzilla.mozilla.org/show_bug.cgi?id=718453
https://bugzilla.mozilla.org/show_bug.cgi?id=724027