27
Jan 12

DataShrink work

Over the last month, I’ve been working on some patches to address the relocation issues I’ve blogged about, as well as general space wastage in libxul.  The upshot is that Firefox 13 will have shaved almost 100K off of data and relocations, for smaller binaries and slightly faster load times:

  • Bug 717311 was the big one; we were wasting 70K+ of space in some Unicode tables due to structure padding.  Fixing this was just a matter of using the right bitfield datatype so the compiler would pack things correctly.
  • Bug 717501 was much the same: use smaller datatypes when the data you have fits in them.
  • Bug 711563 was more in line with the invasive relocation changes discussed earlier.  The patches there rearranged the storage for the stub names to avoid relocations and generally packed things more efficiently.

Not going in this cycle, but worthy of mention is bug 704848 for rearranging the effective TLD name table to avoid relocations; that bug will save 40-50K of space in data and relocations.  Jason Duell and I talked on IRC yesterday and while he’s in favor of the idea, he’d like to see if gperf makes things any better.  No sense in constructing a table at runtime if you can construct it at compile time!

Are there more instances of things that could be compressed like this?  Yes, but the savings from them are likely to be much smaller on a case-by-case basis:

  • Anything that uses nsStaticCaseInsensitiveNameTable could be tweaked to use less space and fewer relocations by providing the entry data in a different format.
  • nsElementTable.cpp:gHTMLElements could probably be rearranged for the same sort of savings.
  • Initialization of atoms could stand some rearrangement for relocation reduction, at the cost of some ugliness elsewhere.
  • YARR, used by the JS engine, has some fabulous bit vectors represented as character arrays; squashing those would shave 100K of data, but would likely be tricky.
  • Eliminating null entries from the tables in table-driven QueryInterface methods would help, possibly at the cost of some extra parameter traffic to NS_TableDrivenQI.
  • Actually, rewriting IIDs for XPCOM interfaces to fit in 32 bits could save a number of relocations in the tables for the aforementioned methods.
  • The tables required by PR_ErrorInstallTable are breeding grounds for relocations; changing them is unlikely to happen anytime soon.  (Those tables do account for a significant chunk of the relocations in libnss and libnspr, though.)
  • Any number of places where we use pointers in constant data structures; there are quite a few of them, but converting each one saves only 30-50 bytes.  Best to save these for when you are really bored.

All of the above might amount to 200K of data+relocation savings.

Really, though, there’s not that much left to trim (or perhaps what is left is decidedly non-trivial to remove).  If you do something like:

readelf -sW dist/bin/libxul.so | grep OBJECT | awk '$3 > 1000 { print }' | c++filt | less

in your objdir on a Linux-y system, you’ll see that quite a lot of the largest objects come from tables for character conversion or character detection.  However, the authors of said code were already conscious of the space required by these tables and they have tended to use the smallest datatypes necessary.  vtables make a number of appearances (hard to get rid of at the moment).  There are also some tables for media codecs, which presumably are difficult to trim down.


26
Jan 12

compressing strings in JS

As we keep increasing the amount of information we send via Telemetry, we need to start thinking about how to keep the size of the ping packets containing that information as small as possible. The packets are just JSON, so the first thing to try is to compress the data with gzip prior to sending it.

This is how you compress a string in a language like Python:

compressed = zlib.compress(data)

(Yes, yes, this is not gzip compression.  Close enough for pedagogical purposes.)

Short and simple. Boy, I hope it’s that easy in JS. Hm, let’s see, there’s this nsIStreamConverter interface; that looks promising:

let converter = Cc["@mozilla.org/streamconv;1?from=uncompressed&to=gzip"].createInstance(Ci.nsIStreamConverter);
let stream = Cc["@mozilla.org/stringinputstream;1"].createInstance(Ci.nsIStringInputStream);
stream.data = string;
// Hm, having to respecify input/output types is a bit weird.
let gzipStream = converter.convert(stream, "uncompressed", "gzip", null);

OK, we wound up with a stream, rather than a string, but that’s OK, because nsIXMLHttpRequest.send will happily accept a stream. So, nothing to worry about. (This is a little white lie; please hold your comments until the end.)

Hm, that doesn’t seem to work. I get NS_ERROR_NOT_IMPLEMENTED. Oh, look, nsDeflateConverter doesn’t implement nsIStreamConverter.convert. In fact, none of the stream converters in the tree seem to implement convert. What a bummer.

Hey, here’s nsIStreamConverterService! Maybe he can help. His convert method just punts to nsIStreamConverter.convert, though, so that won’t work either. Ah, nsIStreamConverter has an asyncConvertData method; let’s try that:

function Accumulator() {
  this.buffer = "";
}
Accumulator.prototype = {
  onStartRequest(request, context) {},
  onStopRequest(request, context, statusCode) {},
  onDataAvailable(request, context, inputStream, offset, count) {
    let stream = Cc["@mozilla.org/binaryinputstream;1"].createInstance(Ci.nsIBinaryInputStream);
    stream.setInputStream(inputStream);
    let input = stream.readByteArray(count);
    this.buffer += String.fromCharCode.apply(null, input);
  }
};

let accumulator = new Accumulator();
let converter = Cc["@mozilla.org/streamconv;1?from=uncompressed&to=gzip"].createInstance(Ci.nsIStreamConverter);
// More respecifying input/output types.
converter.asyncConvertData("uncompressed", "gzip", accumulator, null);
// Oh, that method doesn't actually convert anything, it just prepares
// the instance for doing conversion.
let stream = Cc["@mozilla.org/stringinputstream;1"].createInstance(Ci.nsIStringInputStream);
stream.data = string;
converter.onStartRequest(null, null);
converter.onDataAvailable(null, null, stream, 0, string.length);
converter.onStopRequest(null, null, Cr.NS_OK);
compressed = accumulator.buffer;

Well, it’s not as simple as I hoped for, but I guess it works.

FWIW, I do understand why the input/output types have to be respecified.  But I think the above is about the best way to do this currently, and that’s kind of frightening. It’s one of those instances where you start to understand why people complain about things being so crufty.