Snappy in Warsaw: pierogy-fueled hackfest

After MozCamp, we held a Snappy meet-up at NoaCowork in Warsaw. I believe this was one of the most productive weeks I have had the pleasure of participating in since I started at Mozilla. My only regret is that I was not motivated to organize any memorable after-work activities while suffering from the MozCamp.EU plague (Mozilla gatherings are great for exchanging global influenza strains).

Profiler

Benoit Girard went through existing and upcoming profiler features. We made sure that everyone in attendance knew how to use the profiler. We also discussed potential UX improvements.
Markus Stange is a community contributor who originally designed and implemented the current profiler UI. He attended MozCamp and spent most of Monday with us planning future profiler improvements with Benoit.

Bas-tool: Azure Drawing Tracer

Bas Schouten presented his work-in-progress graphics tracing tool. Our graphics people have been using the Microsoft PIX tool to debug accelerated drawing issues with Direct2D. I believe Bas got fed up with the bugginess and limitations of an otherwise excellent tool and wrote a similar Azure-specific tool with some special Bas-sauce.

Bas-tool presents a graphics trace so one can see how Firefox draws on the screen. Seeing how something is drawn step by step helps us see when we are not using efficient graphics primitives, are doing redundant invalidations, etc. The tool can also do tricks like brute-forcing graphics operations to find redundant ones.
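
To make the idea of brute-forcing a trace for redundant operations concrete, here is a minimal sketch. It is not Bas's tool and makes no claim about how that tool works: the `FillRect` tuple, `replay` and `find_redundant_ops` are hypothetical names, and the "trace" is just a list of solid-color rectangle fills on a tiny integer canvas. The approach is to replay the trace once as a reference, then replay it again with each operation left out and flag the operations whose removal does not change the final image.

```python
from typing import List, Tuple

# Hypothetical trace entry: fill a rectangle (x, y, width, height) with a color.
FillRect = Tuple[int, int, int, int, int]


def replay(ops: List[FillRect], width: int, height: int) -> List[List[int]]:
    """Replay fill operations onto a blank canvas and return the final pixels."""
    canvas = [[0] * width for _ in range(height)]
    for x, y, w, h, color in ops:
        # Clip each rectangle to the canvas bounds before painting.
        for row in range(max(y, 0), min(y + h, height)):
            for col in range(max(x, 0), min(x + w, width)):
                canvas[row][col] = color
    return canvas


def find_redundant_ops(ops: List[FillRect], width: int, height: int) -> List[int]:
    """Return indices of operations whose individual removal leaves the image unchanged."""
    reference = replay(ops, width, height)
    redundant = []
    for i in range(len(ops)):
        without_op = ops[:i] + ops[i + 1:]
        if replay(without_op, width, height) == reference:
            redundant.append(i)
    return redundant


if __name__ == "__main__":
    trace = [
        (0, 0, 4, 4, 1),  # background fill, completely painted over by the next op
        (0, 0, 4, 4, 2),  # second full-canvas fill
        (1, 1, 2, 2, 3),  # small square on top
    ]
    print(find_redundant_ops(trace, 4, 4))  # [0]
```

Each operation is tested on its own, so two operations that are individually redundant (for example, two identical fills) are not necessarily removable together. The cost is one full replay per operation, which is why this is brute force.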

I expect Bas will present this tool + accompanying patches soon.

OMTC & Tab Strip

The current Firefox tab-strip implementation is crufty. It uses expensive graphics primitives and inefficient CSS transitions, implements scrolling/overflow animations in JS, and does other non-performant things (tracked by bug 593680). These things happen when one keeps adding features without having good profiling/tracing tools.

Tim Taubert led the effort to prototype a new tab strip that is implemented without JS animations and uses OMTC-friendly, efficient graphics primitives. Bas-tool was used heavily to see whether CSS transitions were animating efficiently. We sorely missed having a layout person around to help diagnose layerizing issues, etc. It turns out CSS transition scheduling is very jank-sensitive. We may also need to come up with and implement some new CSS transitions to make an attractive tab strip. The good news is that any backend improvements we make in this area should make it easier to implement fluid, responsive web apps.

Tim Taubert, Benoit Girard & Jared Wein cobbled together a desktop OMTC throbber demo where the tab throbber was implemented using CSS rotations, which made it animate smoothly through content jank.

Chromehangs

Josh Aas, Vladan Djeric, Lawrence Mandel and I went through our new non-destructive chromehang report. Chromehangs are multi-second browser stalls that we report via telemetry. See the complete list that we went through here.

It looks like our recently discovered synchronous proxy code and Flash are to blame for most of our temporary hangs. The proxy stuff should disappear once bug 769764 is fixed. Click-to-play will help with some of the plugin-caused hangs. We will be discussing how to deal with the rest of the plugin jank in the coming weeks.

My favourite chromehang was the one that pinpointed why downloads jank Firefox so much: bug 789932. We tried to pin this on anti-virus scans and download manager SQLite activity, but the main reason turned out to be very simple. It turns out we do network traffic on a networking thread, only to write out file contents to disk on the main thread.
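
The general shape of the remedy for this kind of hang is to hand the received data to a dedicated I/O thread instead of writing it on the thread that drives the UI. The sketch below only illustrates that pattern; it is not the patch for the bug and is not Firefox code. `BackgroundFileWriter` and `demo_download` are hypothetical names, and Python's standard `queue` and `threading` modules stand in for the browser's own threading primitives.

```python
import os
import queue
import tempfile
import threading


class BackgroundFileWriter:
    """Writes chunks to a file on a dedicated thread so callers never block on disk I/O."""

    _DONE = object()  # sentinel that tells the writer thread to stop

    def __init__(self, path: str) -> None:
        self._path = path
        self._chunks: queue.Queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def write(self, chunk: bytes) -> None:
        """Queue a chunk for writing; returns immediately without touching the disk."""
        self._chunks.put(chunk)

    def close(self) -> None:
        """Flush the remaining chunks and wait for the writer thread to finish."""
        self._chunks.put(self._DONE)
        self._thread.join()

    def _run(self) -> None:
        # All blocking file I/O happens here, off the caller's thread.
        with open(self._path, "wb") as out:
            while True:
                chunk = self._chunks.get()
                if chunk is self._DONE:
                    break
                out.write(chunk)


def demo_download() -> bytes:
    """Simulate a download arriving in chunks and return what ended up on disk."""
    path = os.path.join(tempfile.mkdtemp(), "download.bin")
    writer = BackgroundFileWriter(path)
    for chunk in (b"abc", b"def"):  # stand-in for data handed over by a network thread
        writer.write(chunk)
    writer.close()
    with open(path, "rb") as saved:
        return saved.read()


if __name__ == "__main__":
    print(demo_download())  # b'abcdef'
```

A single queue consumed by one writer thread keeps the chunks in arrival order. In a real UI, `close()` would also need to be non-blocking (for example, by signalling completion through a callback), since joining the thread waits for the disk.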

Other

Paolo Amadini, Lawrence Mandel, Gavin Sharp and I made plans to get rid of main-thread SQL usage in the download and add-on managers.

Vladan Djeric explained his plans to speed up DOM Local Storage and reduce the jank it causes.

Margaret Leibovic worked on removing synchronous cache API usage and added pageload telemetry. She also filed a bug that resulted in 20% faster link navigation in Fennec (bug 789889). Perhaps we should do the same on our Metro build?

Olli Pettay & Felipe Gomes worked on making our Social API features not leak memory.

Julian Seward, Mike Hommey, Benoit Girard worked on improving our profiling infrastructure and making it work on Android, B2G, Linux.

Josh Aas, Lawrence & I coordinated on Snappy priorities for the Necko team.

I’m sure I missed a few projects; I hope other attendees blog about their work last week.

14 comments

  1. It seems like one of Firefox Snappy’s biggest benefits so far is the increased use of profiling at every level, which is good.
    Some progress has been made and a lot remains, but better tools are available, which should definitely help.

    What’s your feeling about Firefox Snappy’s status after more than 9 months? I mean, do you expect to be half the way through the process before you resume Electrolysis effort?

    I guess that even when Electrolysis work can be resumed, Firefox Snappy will continue in some form?

    I’d like to hear your thoughts about this after more than 9 months of efforts.

    Keep up the good work and thanks a lot for previous and upcoming improvements!

  2. IIUC, E10 is not a cure-all. It is just about making the Mozilla framework multiprocess, nothing more. After implementing E10, we might still encounter unsnappy use cases. Project Snappy is about correcting the obvious WRONG-DONE things in Firefox’s code base. They’re independent.
    Snappy is more important to UI responsiveness; if we don’t fix the obvious WRONG-DONE things, E10 alone can help little.

  3. I’m running Aurora on both an old XP system with low resources and a Windows 7 system with a quad-core Xeon, 8GB of RAM and a ‘workstation’ class graphics card. Despite this, typing in gmail is like using a computer in the early 1990s when you had to keep waiting for the computer to catch up with what you had already typed. Making errors is great fun. You wait for the lag to catch up then try to backspace them but then you go too far because the lag doesn’t let you see what you’re doing.

    Broadly speaking Firefox has been going much better for me lately but this is a really silly bug. It’s up there in the super-bug stakes with when the browser hangs for what feels like 30-60 seconds every time I close a google maps tab. Another ‘fun’ bug is when I’m trying to open a new tab by typing in the location bar and hitting enter. Now this might be an issue with Tab Mix Plus but that Firefox actually goes into a hang (Not Responding) state for a couple of seconds every time I do this is just horrid.

    NJN will probably crucify me for this my first post about Mozilla in many weeks but this is the reality I face and I know Mozilla is the underdog and working very hard to solve issues but really … *sigh* I’ll say no more.

    Sincerely good luck with your snappy work Taras et al. This post sounds quite promising, especially how the profiler will allow for much better monitoring of performance impact from changes but geez it seems like there’s a long way to go :(

    Would logs from MemChaser be any help to you in identifying what might be causing the gmail lag I’m getting? I’m guessing that will not be enough but I’m not sure I really have the time and/or skills to run the profiler or other tools.

  4. Thanks for the fantastic update.
    The comment about Metro made me wonder: Are we running tests on Metro just like we do with desktop & mobile?
    The code is differentiated, and increasingly so, such that we can have bugs that are unique to Metro, no?

  5. @pd

    Install the profiler add-on, record a profile, and post a link to it in a bug. Something other than cycle collection or GC might be causing this problem.

  6. No new updates from the dlbi bug (539356) since 28 aug.
    Is it on hold ?

  7. Is there work being done on Box Shadows? This seems to be the No.1 reason for slow drawing for me. I occasionally record a profile when scrolling on the page stutters and almost always a box shadow is the culprit (like 718453).

    I never found a bug for this… is there one?

  8. @kamulos, if you cannot find a bug, file one. the worst thing that can happen is the bug will be marked a duplicate of another.

  9. ok: Bug 792527, I always remembered reading about it somewhere but could never find a matching bug.

  10. Hi Taras, thanks for your response. Which Add-on are you referring to? One of these?

    https://addons.mozilla.org/en-US/firefox/addon/xul-profiler-by-yoono/

    https://addons.mozilla.org/en-us/firefox/addon/memory-profiler/

    ?

    Neither of those are marked for compatibility with the version of Aurora I’m currently using. Is it safe to try one of them by dismissing the compatibility check?

  11. @JC_Yang I didn’t mean to say Snappy and E10 have the same purpose. I may not have expressed myself clearly.
    I just remember Taras was working on other subjects before Snappy, mainly E10 at that time, and wondered how he expects his work to evolve with Snappy.
    Is he expected to focus mainly on Snappy for another 3 months, 6 months, one year? And how will Snappy be handled when the main efforts are complete and the biggest part of his work shifts to other activities?
    But maybe Snappy’s birthday will be a more appropriate time to draw this wider picture of the project.

  12. Hi Taras, I have been noticing (often severe) jank on a lot of google services’ pages.

    The worst is when opening many blogspot sites (it will hang the browser). Most time seems to be spent in the YARR interpreter. This has been profiled (with the gecko profiler) here and reported here:

    https://bugzilla.mozilla.org/show_bug.cgi?id=792797

    Another user reported a bug on it (with profile) here:

    https://bugzilla.mozilla.org/show_bug.cgi?id=791627

    Have also profiled and reported jank when opening a google groups page:

    https://bugzilla.mozilla.org/show_bug.cgi?id=793359

    A mozilla dev recently reported several seconds of jank when opening a google doc:

    https://bugzilla.mozilla.org/show_bug.cgi?id=790270

    Cheers,

    Tim.

  13. Just wondering whether the following bug would lead to significantly faster startups and count as “snappy” too?

    https://bugzilla.mozilla.org/show_bug.cgi?id=776928