Fennec A2 – Performance

Static Analysis vs Performance

Two months ago I got the feeling that I had to take a break from static analysis and do something that obviously affects Firefox at runtime. Luckily that coincided with the ramp-up on Fennec performance work.

I find that I enjoy fixing existing code a lot more than other sorts of programming, so I was extremely happy to switch focus from the static analysis way of fixing code to my other favourite: optimization. Both are peculiar programming endeavours because after a bunch of gruntwork the program ends up doing the exact same thing as before, but better.

In static analysis I focus more on how different pieces fit together, whereas in optimization I get to focus on what the various pieces are trying to achieve, so I learned a lot more random Mozilla mysteries.

Fennec

Fennec is pure joy to optimize because it runs in such a constrained Linux environment (compared to desktop Linux). Things seem to happen roughly 10x slower on the ARM processor than on my Core 2 Duo laptop. Thus performance details that are hard to spot on the desktop are almost trivial to discover.

There are no hard drive seeks to introduce unpleasant surprise latency. This simplifies things a lot – there is a lot less variance between hot and cold start on the N810 than on a desktop with a hard drive.

Unfortunately the N810 Linux environment also leaves a lot to be desired. Compiling stuff is a chore. It turns out that oprofile produces nonsense results when a compiler of recent vintage is used (and an ancient one can’t really compile Mozilla).

I had a lot of fun digging deep into Mozilla code and dealing with mischievous timestamps, misbehaving caches and rude GC interruptions. All this was done using stone-age instrumentation techniques on the N810.

Mark Finkle blogged some details on Fennec Alpha2 performance. Alpha2 is magnitudes faster than Alpha1, and I expect more of the same in subsequent releases.

Software Improvements That Santa Claus Should Get Me

Even though oprofile is useless on the N810, one can get a pretty good idea of what the performance issues are from running it on x86. OProfile is a little rough to use, but I’ve learned to love it when sugared with gprof2dot and xdot. It’s great for locating places in the code to stick printf()s into.
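For the curious, the printf() approach boils down to something like the sketch below. This is not the actual instrumentation I used in Mozilla code, just a minimal illustration; the helper names (elapsed_ms, report, demo) are made up for this example, and it assumes a POSIX system where clock_gettime() with CLOCK_MONOTONIC is available.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdio.h>
#include <time.h>

/* Milliseconds between two clock readings (hypothetical helper). */
double elapsed_ms(struct timespec start, struct timespec end)
{
    return (end.tv_sec - start.tv_sec) * 1000.0
         + (end.tv_nsec - start.tv_nsec) / 1e6;
}

/* Print how long a labelled region has taken since `start`,
 * and return the figure so callers can also log or compare it. */
double report(const char *label, struct timespec start)
{
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);
    double ms = elapsed_ms(start, now);
    fprintf(stderr, "[timing] %s: %.3f ms\n", label, ms);
    return ms;
}

/* Example: wrap a suspected hot spot with a start reading and a report. */
double demo(void)
{
    struct timespec t0;
    clock_gettime(CLOCK_MONOTONIC, &t0);

    /* Stand-in for the code the profiler pointed at. */
    volatile unsigned long sink = 0;
    for (unsigned long i = 0; i < 1000000UL; i++)
        sink += i;

    return report("busy loop", t0);
}
```

The monotonic clock is used so that the numbers are not thrown off if the wall-clock time gets adjusted mid-run. On older glibc versions clock_gettime() may need linking with -lrt.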

OProfile has taught me that what I really want is DTrace (or some knockoff) running on the N810.

Also, I really hate how embedded Linux takes away one of the coolest things about desktop Linux: the ability to compile your own kernel. I haven’t been able to get a more modern kernel to run on the N810, which means I can’t try a newer version of oprofile or the new OMAP high-res timers. I would also like to get a working image of the N810 under qemu, but success has avoided me there too.

Static Stuff

Unfortunately I found that I can’t effectively work on static analysis stuff without giving it my full and undivided attention. Right now I’m hoping to set aside time to focus on writing a more general dead code finder and catch up on other misc things sometime in January or February.
