Good compilers are complicated, and generating good code is hard. Clever
optimisations play their part, but it can be equally important to not do
things that are stupid.
Back in March I noticed this awful code being generated by TraceMonkey for
access-fannkuch.js, one of the SunSpider benchmarks:
    ld16 = ld sp[-152]
    sti sp[-152] = ld16
    ld17 = ld sp[-128]
    sti sp[-128] = ld17
    ld18 = ld sp[-88]
    sti sp[-88] = ld18
    ld19 = ld sp[-80]
    sti sp[-80] = ld19
    sti sp[-72] = addxov1
    ld20 = ld sp[-64]
    sti sp[-64] = ld20
    ld21 = ld sp[-48]
    sti sp[-48] = ld21
    sti sp[-32] = ld15
    ld22 = ld sp[-8]
    sti sp[-8] = ld22
    j -> label1
A number of the trace fragments compiled for that benchmark ended similarly.
I filed a bug and investigated a little at the time but didn’t understand
the responsible code well enough to fix it.
This week I took another look, and was glad that I did. Thanks to a better
understanding of the responsible code, I was able to fix it with only a
29-line patch (10 lines of which were comments). Even better, it turns out
this fix not only affected the small number of egregiously stupid cases such
as the one above; it also removed a lot of redundant stores that were less
obvious. The net result was about a 2% speedup for the SunSpider benchmark
suite and 3% for the V8 benchmark suite. That’s a satisfying patch!
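To make the pattern concrete, here is a minimal sketch in Python of the kind of check involved. It is not the patch itself: the function name remove_redundant_stores, the regular expressions, and the idea of working on the textual form of the trace are all assumptions made for illustration. The sketch drops any store that writes a value back to the same stack slot it was just loaded from, provided nothing else has been stored to that slot in between.

```python
import re

# Hypothetical textual forms of the two instructions seen in the listing.
LOAD = re.compile(r"^(\w+) = ld sp\[(-?\d+)\]$")
STORE = re.compile(r"^sti sp\[(-?\d+)\] = (\w+)$")


def remove_redundant_stores(lines):
    """Drop stores that write a value back to the slot it was loaded from."""
    loaded_from = {}  # value name -> stack offset it was loaded from
    out = []
    for line in lines:
        m = LOAD.match(line)
        if m:
            loaded_from[m.group(1)] = int(m.group(2))
            out.append(line)
            continue
        m = STORE.match(line)
        if m:
            offset, value = int(m.group(1)), m.group(2)
            if loaded_from.get(value) == offset:
                continue  # the slot already holds this value; skip the store
            # A real store changes the slot, so earlier loads from it are stale.
            loaded_from = {v: o for v, o in loaded_from.items() if o != offset}
        out.append(line)
    return out


trace = [
    "ld16 = ld sp[-152]",
    "sti sp[-152] = ld16",
    "sti sp[-72] = addxov1",
    "sti sp[-32] = ld15",
    "ld22 = ld sp[-8]",
    "sti sp[-8] = ld22",
    "j -> label1",
]
cleaned = remove_redundant_stores(trace)
```

In this sketch the stores of ld16 and ld22 disappear, while the stores of addxov1 and ld15 survive because those values did not come from the slots they are written to. The loads left behind are then unused, so a separate dead-code pass (not shown) could remove them as well.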