There is a common fallacy that since linkers and compilers are written by really smart people, there aren’t any huge performance wins left in the toolchain. My theory is that the efficiency of any given codebase varies inversely with the number of people who tried to optimize it.
I have long complained about the suboptimal binaries generated from our code. Modern profiling tools such as SystemTap and icegrind have made the problem painfully obvious. Mike Hommey opted for actually doing something about it. What started as a simple ld.so hack grew into a badass binary-rewriting tool (and the most interesting blog post I’ve read this year).