The Dehydra installation instructions had gotten to the point where they were more confusing than helpful. I spent this morning cutting out irrelevant crud; please let me know if there are any further cleanups that need to be done.
I have been told that it should be possible to control the way the GNU linker lays out binaries. Unfortunately until recently I couldn’t figure out the right incantations to convince ld to do my bidding. Turns out what I needed was to be stranded on a beach in Fiji with nothing better to do than to reread the ld info page a few times.
- Produce 2 Mozilla builds:
    A tracing build with -finstrument-functions in CXXFLAGS/CFLAGS
    A release build with -ffunction-sections and -fdata-sections in CXXFLAGS/CFLAGS, to allow the linker to move stuff at function or static-data (mostly variables) granularity
- Link my profile.cpp into libxul in the tracing build (without the -finstrument-functions flag)
- Run the tracing build, capturing the spew from profile.cpp into a log file
- Feed the log file to my script to produce a linker script. This produces library.so.script files for all of the Mozilla libraries.
- Rebuild the relevant libraries in the release build with the -T library.so.script linker flag
- Enjoy faster startup
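profile.cpp itself isn't reproduced here, but a minimal sketch of the kind of instrumentation it relies on looks roughly like this (a trimmed-down illustration, not the actual file; the real thing would buffer and deduplicate its output):

```cpp
#include <cassert>
#include <cstdio>

// Counter so this sketch can be sanity-checked; the real hooks would
// just write the trace straight out.
static unsigned long g_entries = 0;

extern "C" {

// GCC calls these at every function entry/exit when the surrounding
// code is built with -finstrument-functions. The hooks themselves must
// be excluded from instrumentation, or they would recurse.
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *fn, void * /*call_site*/) {
    ++g_entries;
    // Log the raw function address; a post-processing script can map it
    // back to a symbol name (and thus to its .text.<symbol> section)
    // with nm or addr2line.
    std::fprintf(stderr, "enter %p\n", fn);
}

__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *fn, void * /*call_site*/) {
    std::fprintf(stderr, "exit %p\n", fn);
}

} // extern "C"
```

Capturing the stderr spew from a build instrumented this way is what yields the function-order trace that the linker-script generator consumes.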
This results in 200ms faster startup on my 7200rpm laptop hard drive, which is about 10% of my startup time. I think that’s pretty good for a proof of concept. Unfortunately there isn’t a measurable win on the SSD (not surprising), nor a reduction in memory usage (I expected one due to not having to page in code that isn’t needed for Firefox startup).
I suspect the problem is that data sections need to be laid out adjacent to functions that refer to them. I started sketching out a treehydra script to extract that info.
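The generated library.so.script files aren't shown here, but the basic shape of a linker script that reorders -ffunction-sections/-fdata-sections output looks something like the fragment below. The symbol names are made up for illustration; with those flags the compiler emits one .text.&lt;symbol&gt; and one .data.&lt;symbol&gt; input section per function/variable, which is what gives the linker the freedom to reorder them:

```ld
SECTIONS
{
  .text : {
    /* Hot functions first, in the order the startup trace visited
       them (hypothetical names): */
    *(.text.startup_init)
    *(.text.startup_parse_prefs)
    /* Everything else afterwards. */
    *(.text .text.*)
  }
  .data : {
    /* Same idea for data: pull variables touched during startup
       together, mirroring the hot-function order: */
    *(.data.startup_prefs_table)
    *(.data .data.*)
  }
}
```

The adjacency problem described above would mean teaching the generator which .data.* sections to pull forward alongside each hot .text.* section, rather than ordering the two lists independently.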
I posted the relevant testcase and scripts. Do hg clone http://people.mozilla.com/~tglek/startup/ld to see the simple testcase and various WIP Firefox scripts.
The majority of Firefox startup overhead (prior to rendering of web pages) comes from frustrating areas such as inefficient libraries (eg fontconfig, gtk) and the mess caused by crappy layout of binaries and overuse of dynamic libraries. This post describes one small step towards fixing the crappy layout of our binaries.
I would like to end up in a world where our binaries are static and laid out such that they are read sequentially on startup (so that we can take advantage of the massive sequential read speeds provided by modern storage media). Laying out code/data properly should also result in memory usage reductions, which should be especially welcome on Fennec (especially on Windows Mobile).
I am hoping to see 30-50% startup time improvements from this work if everything goes according to plan.
A really good ACM article about static analysis from Coverity’s perspective has been making the rounds in Mozilla. What struck me most was the following paragraph:
At the most basic level, errors found with little analysis are often better than errors found with deeper tricks. A good error is probable, a true error, easy to diagnose; best is difficult to misdiagnose. As the number of analysis steps increases, so, too, does the chance of analysis mistake, user confusion, or the perceived improbability of event sequence. No analysis equals no mistake.
My personal view has been that “dumb” analyses are the most effective ones in terms of mistakes spotted vs time wasted writing/landing the analysis. It is interesting to see that sophisticated analyses are difficult to deploy even for Coverity.
In other news, LCA 2010 was my favourite conference so far. I met a number of awesome developers there. Mozilla’s static analysis work finally got mentioned in LWN!