LUL: A Lightweight Unwinder Library for profiling Gecko

Last August I asked the question “How fast can CFI/EXIDX-based stack unwinding be?” At the time I was experimenting with native unwinding using our in-tree Breakpad copy, but getting dismal performance results. The posting observed that Breakpad’s CFI unwinder is around 30 times slower than Valgrind’s CFI unwinder, and looked in detail at the reasons for this slowness.

Based on that analysis, I wrote a new lightweight unwinder library. LUL, as it became known, is aimed directly at doing unwinding for profiling. It is fast, robust, fairly accurate, and designed to allow a pool of worker threads to do unwinding, if that’s somewhere we want to go. It is also set up to facilitate the space-saving schemes discussed in “How compactly can CFI/EXIDX stack unwinding info be represented?”, although those have not been implemented yet. LUL stores unwind information in a simple, quick-to-use format, which could conceivably be generated by the JavaScript JITs so as to facilitate transparent unwinding through JavaScript as well as C++.

LUL has been integrated into the SPS profiler, and landed a couple of weeks back.

It currently provides unwinding on x86_64-linux, x86_32-linux and arm-android, using the DWARF CFI and ARM EXIDX unwind formats. Unwinding by stack scanning is also supported, although that should rarely be needed. Compared to the Breakpad unwinder, there is a very substantial performance increase: 1000 unwinds/second, from leaf frames all the way back to XRE_Main(), costs about 40% of a 1.2 GHz Cortex A9.
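As a back-of-envelope reading of that figure, the cost per unwind can be worked out from the three numbers quoted above. This is only a sketch: it assumes that "40% of a 1.2 GHz Cortex A9" means 40% of the cycles of a single core, and the function name is made up for illustration.

```python
# Figures quoted in the post.
CPU_HZ = 1.2e9           # 1.2 GHz Cortex A9
CPU_FRACTION = 0.40      # about 40% of the CPU (assumed: one core)
UNWINDS_PER_SEC = 1000   # unwinds per second


def cost_per_unwind(cpu_hz, cpu_fraction, unwinds_per_sec):
    """Return (cycles, microseconds) spent on each unwind."""
    cycles = cpu_hz * cpu_fraction / unwinds_per_sec
    micros = cpu_fraction / unwinds_per_sec * 1e6
    return cycles, micros


cycles, micros = cost_per_unwind(CPU_HZ, CPU_FRACTION, UNWINDS_PER_SEC)
print(f"{cycles:.0f} cycles, {micros:.0f} us per unwind")
```

Under that assumption, each full unwind costs roughly 480,000 cycles, or about 400 microseconds.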

To use LUL, build with --enable-profiling --enable-optimize="-g -O2". I then start the desktop builds with the following environment variable settings:

  MOZ_PROFILER_INTERVAL=1 MOZ_PROFILER_NEW=1
  MOZ_PROFILER_VERBOSE=1 MOZ_PROFILER_MODE=native

Setting MOZ_PROFILER_MODE=help gives more details about these settings.
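For the desktop case, the settings can also be applied from a small launcher script instead of being typed on the command line each time. The following is a minimal sketch, not part of LUL or the profiler: the helper names `profiler_env` and `launch` are hypothetical, and the path to the browser binary is left to the caller.

```python
import os
import subprocess

# The four settings listed above. Environment values are always strings.
PROFILER_SETTINGS = {
    "MOZ_PROFILER_INTERVAL": "1",
    "MOZ_PROFILER_NEW": "1",
    "MOZ_PROFILER_VERBOSE": "1",
    "MOZ_PROFILER_MODE": "native",
}


def profiler_env(base=None):
    """Return a copy of `base` (default: the current environment)
    with the profiler settings added."""
    env = dict(os.environ if base is None else base)
    env.update(PROFILER_SETTINGS)
    return env


def launch(command):
    """Run `command` (a list: binary path plus any arguments) with the
    profiler settings applied, and return its exit code."""
    return subprocess.call(command, env=profiler_env())
```

Calling `launch([path_to_binary])`, with `path_to_binary` pointing at your own build, starts the browser with native unwinding enabled. Copying the environment rather than modifying `os.environ` keeps the settings from leaking into anything else the script runs.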

On Android, a suitable magic incantation is:

  adb logcat -c ; \
  adb shell sh /system/bin/am start -S -n \
    org.mozilla.fennec_sewardj/.App \
      --es env0 MOZ_PROFILER_INTERVAL=1 \
      --es env1 MOZ_PROFILER_MODE=native \
      --es env2 MOZ_PROFILER_NEW=1 \
      --es env3 MOZ_PROFILER_VERBOSE=1 \
      --es env4 MOZ_PROFILER_STARTUP=1 ; \
  adb logcat 2>&1 | tee logfile.txt

What next for LUL? I’d like to implement the space-saving schemes mentioned earlier. But more important, it would be nice to have developers using the SPS/LUL combination, so as to give real-use feedback. That will help to move it forward in the most immediately useful direction.

2 responses

  1. glandium wrote:

    I think the best way to have people use it is to make it the default. Porting it to mac would also add a lot more potential users.

  2. njn wrote:

    Yes! Default is good.