Yesterday’s post on space-saving techniques generated a few comments. It seemed worthwhile to highlight some of them for a wider audience.
- Various people have pointed out that clang and GCC support a -Wpadded option to warn when padding is inserted into a structure. Visual C++ supports warning C4820, which does the same thing. The warning is off by default; you can enable it in Visual C++ by passing /w14820 on the compiler command line (or /we4820 to have it reported as an error). I’m fairly certain this warning would generate a lot of output, but it might be worthwhile to comb through the output and see if anything interesting turns up.
- David Major pointed out the /d1reportAllClassLayout switch for Visual C++, which prints the class layout of all the classes in the compilation unit. If you’re only interested in a single class, you can use /d1reportSingleClassLayout$NAME to narrow the report down to the class with $NAME. GCC used to have something similar in -fdump-class-hierarchy, but that option has been removed (GCC 8 and later provide it under the name -fdump-lang-class instead).
- Benoit Girard asked if he could see a list of the 50 largest things on a Linux build. Forthwith, I present the 50 largest objects and the 50 largest functions in libxul for an x86-64 Linux optimized build. One thing to note about the objects is that they’re not all in the same section; the seventh field in readelf output tells you the section index. So for the list of objects linked above, section 15 is .rodata (read-only, shareable data), section 22 is .data (read-write non-shareable data), section 27 is .data.rel.ro (data that needs to have relocations applied at load time, but can be read-only thereafter, e.g. virtual function tables), and section 29 is .bss (zero-initialized memory). Unsurprisingly, string encoding/decoding tables are the bulk of the large objects, with various bits from WebRTC, JavaScript, and the Gecko profiler also making an appearance. Among the functions, media codec-related code appears to take up a large amount of space, along with some JavaScript functions and a few things related to the HTML parser.
- A commenter by the name of “AnimalFriend” correctly pointed out that what you really want to know is which structures both have a lot of instances hanging around and have holes that you could fill. I don’t know of a good way to answer the first part without adding a lot of instrumentation (though perhaps you could catch a lot of cases by augmenting the MOZ_COUNT_CTOR macro to tell you which structures get allocated a lot). The second part can be answered by something like pahole.
- Alternatively, you could use something like access_profiler to tell you what fields in your objects get accessed and how often, then carefully pack those fields into the same cache line. The techniques access_profiler uses are also applicable to counting allocations of individual objects. Maybe we should start using something more access_profiler-like instead of MOZ_COUNT_CTOR and friends! Definitely more C++-ish, more flexible, and it eliminates the need to write the corresponding MOZ_COUNT_DTOR.
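
To make the last point a bit more concrete, here is a minimal sketch of what a template-based replacement for the MOZ_COUNT_CTOR/MOZ_COUNT_DTOR pair might look like. This is not how access_profiler is actually implemented, and none of it exists in the tree: `InstanceCounted`, `Node`, and `ReportNodeCounts` are hypothetical names made up for illustration. The idea is just that a base class can do the counting in its own constructor and destructor, so the counted class never has to remember the matching destructor call.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdio>

// Hypothetical mixin (not an existing Mozilla API): counts how many T objects
// have been constructed and how many are currently alive.
template <typename T>
class InstanceCounted {
public:
  static std::size_t Live() { return sLive.load(std::memory_order_relaxed); }
  static std::size_t Total() { return sTotal.load(std::memory_order_relaxed); }

protected:
  InstanceCounted() { Increment(); }
  // Copies are new objects too. Declaring this also suppresses the implicit
  // move constructor, so moves are counted through this path as well.
  InstanceCounted(const InstanceCounted&) { Increment(); }
  InstanceCounted& operator=(const InstanceCounted&) = default;
  // The decrement lives here, so there is no separate "DTOR" call to forget.
  ~InstanceCounted() { sLive.fetch_sub(1, std::memory_order_relaxed); }

private:
  static void Increment() {
    sLive.fetch_add(1, std::memory_order_relaxed);
    sTotal.fetch_add(1, std::memory_order_relaxed);
  }

  static std::atomic<std::size_t> sLive;
  static std::atomic<std::size_t> sTotal;
};

template <typename T>
std::atomic<std::size_t> InstanceCounted<T>::sLive{0};
template <typename T>
std::atomic<std::size_t> InstanceCounted<T>::sTotal{0};

// Example type opting in: inheriting from the mixin is the only change needed.
struct Node : InstanceCounted<Node> {
  int value = 0;
};

// Prints how many Nodes were ever created and how many are still alive.
void ReportNodeCounts() {
  std::printf("Node: %zu constructed, %zu live, %zu bytes each\n",
              Node::Total(), Node::Live(), sizeof(Node));
}
```

Calling `ReportNodeCounts()` at shutdown (or from a debugger) would then give a rough answer to the "which structures have a lot of instances" question, one type at a time. Whether the atomic increments are cheap enough to leave on in anything but debug builds is a separate question this sketch doesn't try to answer.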
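
For anyone who hasn't stared at the kind of holes that -Wpadded, C4820, and pahole report, the effect is easy to reproduce by hand. The sketch below uses two made-up structs (`Loose` and `Tight` are hypothetical, not taken from Gecko) with the same three fields in different orders, and prints their sizes and how many bytes of each are padding. The exact numbers depend on the ABI; on typical x86-64 targets I'd expect 24 bytes versus 16.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical layout with holes: each char drags alignment padding behind it.
struct Loose {
  char tag;       // 1 byte, then padding so `weight` is suitably aligned
  double weight;
  char flag;      // 1 byte, then tail padding to round up the struct size
};

// Same fields, largest first, so the two chars sit next to each other.
struct Tight {
  double weight;
  char tag;
  char flag;
};

// Bytes of T that are padding, given that its fields sum to `payload` bytes.
template <typename T>
constexpr std::size_t PaddingBytes(std::size_t payload) {
  return sizeof(T) - payload;
}

// Prints the size, padding, and one field offset for both layouts.
void ReportLayouts() {
  constexpr std::size_t payload =
      sizeof(char) + sizeof(double) + sizeof(char);
  std::printf("Loose: size %zu, padding %zu, weight at offset %zu\n",
              sizeof(Loose), PaddingBytes<Loose>(payload),
              offsetof(Loose, weight));
  std::printf("Tight: size %zu, padding %zu, tag at offset %zu\n",
              sizeof(Tight), PaddingBytes<Tight>(payload),
              offsetof(Tight, tag));
}
```

Multiply the difference by the number of live instances and you have a rough estimate of what reordering the fields would save, which is exactly the combination of numbers the comment above was asking for.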