Over the last month, I’ve been working on some patches to address the relocation issues I’ve blogged about, along with general space wastage in libxul. The upshot is that Firefox 13 will carry almost 100K less data and relocations, which means smaller binaries and slightly faster load times:
- Bug 717311 was the big one; we were wasting 70K+ of space in some Unicode tables due to structure padding. Fixing this was just a matter of using the right bitfield datatype so the compiler would pack things correctly.
- Bug 717501 was much the same: use smaller datatypes when the data you have fits in them.
- Bug 711563 was more in line with the invasive relocation changes discussed earlier. The patches there rearranged the storage for the stub names to avoid relocations and generally packed things more efficiently.
Not going in this cycle, but worthy of mention is bug 704848 for rearranging the effective TLD name table to avoid relocations; that bug will save 40-50K of space in data and relocations. Jason Duell and I talked on IRC yesterday and while he’s in favor of the idea, he’d like to see if gperf makes things any better. No sense in constructing a table at runtime if you can construct it at compile time!
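For anyone who hasn't seen the relocation-avoiding trick, here is a rough sketch of the general idea, not the actual patch for either bug. All names (`kNamesWithPointers`, `kNameBlob`, `kNameOffsets`, `NameAt`) are made up for illustration. An array of string pointers in a shared library needs one relocation per entry at load time; a single character blob plus integer offsets needs none.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Pointer-based table: in a shared library, each of these pointers must be
// fixed up by the dynamic linker, costing one relocation per entry.
const char* const kNamesWithPointers[] = { "alpha", "beta", "gamma" };

// Relocation-free alternative: all names live in one blob, separated by NULs
// (the string literal supplies the final terminator)...
const char kNameBlob[] = "alpha\0beta\0gamma";

// ...and the table stores small integer offsets into the blob instead of
// pointers. Offsets are plain constants, so nothing needs fixing up at load.
const std::uint8_t kNameOffsets[] = { 0, 6, 11 };

const std::size_t kNameCount = sizeof(kNameOffsets) / sizeof(kNameOffsets[0]);

// Recover an ordinary C string for entry i.
inline const char* NameAt(std::size_t i) {
  assert(i < kNameCount);
  return kNameBlob + kNameOffsets[i];
}
```

The cost is that the offsets have to be kept in sync with the blob, which is why generating both from a script (or a tool like gperf) is attractive.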
Are there more instances of things that could be compressed like this? Yes, but the savings from them are likely to be much smaller on a case-by-case basis:
- Anything that uses nsStaticCaseInsensitiveNameTable could be tweaked to use less space and fewer relocations by providing the entry data in a different format.
- nsElementTable.cpp:gHTMLElements could probably be rearranged for the same sort of savings.
- Initialization of atoms could stand some rearrangement for relocation reduction, at the cost of some ugliness elsewhere.
- YARR, used by the JS engine, has some fabulous bit vectors represented as character arrays; squashing those would shave 100K of data, but would likely be tricky.
- Eliminating null entries from the tables in table-driven QueryInterface methods would help, possibly at the cost of some extra parameter traffic to NS_TableDrivenQI.
- Actually, rewriting IIDs for XPCOM interfaces to fit in 32 bits could save a number of relocations in the tables for the aforementioned methods.
- The tables required by PR_ErrorInstallTable are breeding grounds for relocations; changing them is unlikely to happen anytime soon. (Those tables do account for a significant chunk of the relocations in libnss and libnspr, though.)
- Any number of places where we use pointers in constant data structures; there are quite a few of them, but converting any one of them saves only 30-50 bytes. Best to save these for when you are really bored.
All of the above might amount to 200K of data+relocation savings.
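As an illustration of the bit vector item, here is a small sketch of squashing a one-byte-per-flag array into real bits. The data and names are invented for the example and have nothing to do with YARR's actual tables; it only shows where the 8x reduction comes from and what lookups look like afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

const std::size_t kFlagCount = 16;

// One byte per flag: 16 bytes of data for 16 bits of information.
const unsigned char kFlagsAsBytes[kFlagCount] = {
  0, 1, 1, 0, 0, 0, 0, 1,
  1, 0, 0, 0, 0, 0, 1, 0
};

// The same flags packed eight to a byte; flag i lives in bit (i % 8) of
// byte (i / 8). This takes 2 bytes instead of 16.
const std::uint8_t kFlagsAsBits[kFlagCount / 8] = { 0x86, 0x41 };

// Look up flag i in the packed representation.
inline bool TestBit(std::size_t i) {
  assert(i < kFlagCount);
  return ((kFlagsAsBits[i / 8] >> (i % 8)) & 1) != 0;
}

// Sanity check that both representations agree on every flag.
inline bool PackedMatchesBytes() {
  for (std::size_t i = 0; i < kFlagCount; ++i) {
    if (TestBit(i) != (kFlagsAsBytes[i] != 0)) {
      return false;
    }
  }
  return true;
}
```

The tricky part in real code is less the packing itself and more finding every place that indexes the array directly and teaching it to go through something like `TestBit`.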
Really, though, there’s not that much left to trim (or perhaps what is left is decidedly non-trivial to trim). If you do something like:
readelf -sW dist/bin/libxul.so | grep OBJECT | awk '$3 > 1000 { print }' | c++filt | less
in your objdir on a Linux-y system, you’ll see that quite a lot of the largest objects come from tables for character conversion or character detection. However, the authors of said code were already conscious of the space required by these tables and they have tended to use the smallest datatypes necessary. vtables make a number of appearances (hard to get rid of at the moment). There are also some tables for media codecs, which presumably are difficult to trim down.