Jun 11

Dehydra/Treehydra Static Analysis Thoughts

I was pleased to see Mozilla static analysis mentioned on LWN. Yes indeed, the mailing list has been pretty dead (most of our communication happens on irc.mozilla.org #static). I completely failed to build a community around my static analysis tools. Perhaps more people will try Dehydra now that it’s getting into Debian. The hydras are still alive; evidence can be seen in the Mercurial commit log. Development has slowed because the hydras are now considered feature-complete and my primary focus is now elsewhere in Mozilla.

As to why open source static analysis has failed to take off, I have a few theories. I think the main problem is that static analysis requires a compiler/correctness/type-system-nerd/large-scale-development-nerd type of personality. That’s a pretty rare intersection of hobbies to begin with. One also has to hate the stone age that the C/C++ ecosystem is stuck in, but not move on to the shiny new Haskell/OCaml/whatever communities.

Have I failed at igniting the static analysis revolution?

  1. My primary goal was to provide a way to analyze Mozilla source code to speed up our development + refactoring efforts.
  2. My secondary goal was to make sure that whatever work I do, nobody else has to suffer through the unbelievably sucky infrastructure cruft I had to work through.
  3. Lastly, I did put in some effort at promoting open source static analysis (by giving talks at conferences, etc) since working in an active community is more fun.

Mozilla side:

I’m happy to report that I achieved a culture shift at Mozilla. Instead of people saying “oh god, I can’t find all instances of ___ issue in 3 million lines of C++ code”, it’s pretty common to hear “let’s solve this through static analysis”. Dehydra was designed to take the bitchwork (boilerplate of compiler integration, etc) out of static analysis so one can focus on the analysis part. New Dehydra users within Mozilla seem to confirm that.
Instead of pondering whether certain tool-assisted refactorings are feasible, we plan to embark on some now (it turned out we were understaffed to keep up with tool output and overburdened by API compatibility before; more on this in a future blog post).

No More Static Analysis Bitchwork:

The worst aspect of dealing with C++ is parsing it. The second worst aspect is dealing with the preprocessor. With respect to parsing C++, we went from weirdo custom frontends (i.e. Elsa, EDG, etc) and “GCC will never allow plugins, don’t waste your time” to GCC adopting a plugin architecture that suited my static analysis needs. I also implemented source-location transformation tracking (-K) in mcpp, so nobody has to suffer through undoing braindamage inflicted by the C preprocessor again.
I hear at least a couple of people benefited from the mcpp work and I take partial credit for every new analysis GCC plugin. I suspect I saved a few person-months for somebody :)

Btw, I think Chris Lattner’s from-scratch effort on Clang is way awesomer than anything I could ever accomplish.

Conferences & Stuff:

I admit complete and utter failure in this regard. Most open source people have low regard for static analysis. Linus seems to take a million-monkeys-with-typewriters approach (à la the open source eyeballs approach to security) to ensuring kernel code quality (which is a reasonable approach when you have mobs of contributors). Most other projects do not have the resources to spare on unproven tech such as static analysis.

To make matters worse, at first people thought JavaScript was a toy language worth only cut’n’pasting from recipes online. Then just as JavaScript was getting more popular, SpiderMonkey embedding got buggier and made for some unpleasant first experiences with the Hydras.


There isn’t much to show for my work outside of Mozilla; that’s fine since my primary goal was Mozilla :) The Hydras aren’t dead, they are in maintenance mode.

I’m glad to see the Python-as-GCC-plugin approach; it seems to fill the same niche as Treehydra. I regret not starting out with Python (I think it’s slightly better than JavaScript for this task). I hope David Malcolm succeeds in attracting wider interest.

PS. I’m super-excited about the new DXR work. DXR is something that makes my daily life easier. DXR is by far the smartest code-indexing system out there; it’s bound to transform my life as a developer far more than any static analysis ever could :)

Jun 11

Telemetry is on in Nightly Firefox builds

Telemetry went live in Firefox Nightly builds over the weekend. Everyone who wants to contribute to making Firefox better now has an easy new way: opt in to telemetry.
There are two ways to opt in:
A) Click yes when prompted to report performance
B) Enable it by going to Options/Preferences, then the Advanced/General tab

You can check on the data collected by installing my about:telemetry extension.


Jun 11

Developers: How To Submit Telemetry Data

Telemetry is a way to gather stats about Firefox. Currently histograms are the main mechanism for gathering data. There are 3 histogram types currently supported: exponential, linear and boolean. Exponential and linear histograms can accumulate numbers between 0 and a user-defined integer maximum; they differ in how the bucket size grows from one bucket to the next. Boolean histograms are meant to store 0 or 1.
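
To make the difference in bucket increments concrete, here is a rough sketch of how linear and exponential bucket edges could be computed. This is an illustration only and not the code Firefox uses: the helper names LinearBucketEdges and ExponentialBucketEdges are made up, and the sketch ignores details a real implementation has to handle, such as underflow and overflow buckets.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper (not Mozilla's code): `count` bucket edges spaced
// evenly from `min` to `max`, so every bucket has roughly the same width.
std::vector<int> LinearBucketEdges(int min, int max, int count) {
  assert(min >= 0 && max > min && count >= 2 && count <= max - min + 1);
  std::vector<int> edges;
  for (int i = 0; i < count; ++i) {
    long long offset = static_cast<long long>(max - min) * i / (count - 1);
    edges.push_back(min + static_cast<int>(offset));
  }
  return edges;
}

// Hypothetical helper (not Mozilla's code): `count` bucket edges with a
// constant ratio between neighbours, so buckets widen as values grow.
std::vector<int> ExponentialBucketEdges(int min, int max, int count) {
  assert(min >= 1 && max > min && count >= 2 && count <= max - min + 1);
  std::vector<int> edges;
  const double ratio = static_cast<double>(max) / min;
  for (int i = 0; i < count; ++i) {
    double exponent = static_cast<double>(i) / (count - 1);
    int edge = static_cast<int>(std::lround(min * std::pow(ratio, exponent)));
    // Rounding can repeat small values; keep the edges strictly increasing.
    if (!edges.empty() && edge <= edges.back()) edge = edges.back() + 1;
    edges.push_back(edge);
  }
  return edges;
}
```

With a minimum of 1, a maximum of 10000 and 5 buckets, the linear sketch yields the edges 1, 2500, 5000, 7500, 10000, while the exponential one yields 1, 10, 100, 1000, 10000. The exponential layout spends most of its buckets on small values, which tends to suit timings where most samples are short and a few are very long.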

Steps to Add a New Metric

1) Add a histogram definition to TelemetryHistograms.h specifying a histogram id, parameters and a description. For example, HISTOGRAM(MYHGRAM, 1, 10000, 50, EXPONENTIAL, "Time (ms) taken by shiny new metric") defines a MYHGRAM histogram with a minimum value of 1, a maximum of 10000 and 50 buckets that grow exponentially; the description states that the unit is milliseconds.
2a) C++:

#include <mozilla/Telemetry.h>

Telemetry::Accumulate(Telemetry::MYHGRAM, <some interesting integer value goes here>);

2b) JS:

var Telemetry = Cc["@mozilla.org/base/telemetry;1"].getService(Ci.nsITelemetry);
var h = Telemetry.getHistogramById("MYHGRAM");
h.add(<some interesting value goes here>);

3) Install the about:telemetry addon, then go to about:telemetry to check that the chosen histogram type and parameters summarize your data in a useful way.

4) Commit and wait for results to come in.

Q: What about those UMA_* macros?
A: Don’t use them. I added TelemetryHistograms.h so telemetry can be easily reviewed by security and privacy police. It also avoids a few pitfalls such as accidentally initializing the same histogram with different parameters, etc.
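
One way a single list of definitions can rule out mismatched parameters is the X-macro pattern: the list is expanded once into an enum of ids and once into a table of parameters, so each id has exactly one set of parameters. The sketch below is an assumption about how such a header could be consumed, not a copy of the Firefox implementation. HISTOGRAM_LIST stands in for the contents of a header like TelemetryHistograms.h, and TelemetrySketch, HistogramInfo, InfoFor and the MYFLAG entry are made up for illustration.

```cpp
#include <cassert>

namespace TelemetrySketch {

enum HistogramType { EXPONENTIAL, LINEAR, BOOLEAN };

// Stand-in for the header's contents: one line per histogram.
// MYFLAG is a hypothetical second entry added for this example.
#define HISTOGRAM_LIST(HISTOGRAM)                                  \
  HISTOGRAM(MYHGRAM, 1, 10000, 50, EXPONENTIAL,                    \
            "Time (ms) taken by shiny new metric")                 \
  HISTOGRAM(MYFLAG, 0, 1, 2, BOOLEAN, "Hypothetical on/off probe")

// First expansion: turn every entry into an enum id.
#define AS_ENUM(id, min, max, buckets, type, desc) id,
enum ID { HISTOGRAM_LIST(AS_ENUM) HistogramCount };
#undef AS_ENUM

struct HistogramInfo {
  const char* name;
  int min;
  int max;
  int bucketCount;
  HistogramType type;
  const char* description;
};

// Second expansion: turn the same entries into a parameter table.
#define AS_INFO(id, min, max, buckets, type, desc) \
  {#id, min, max, buckets, type, desc},
static const HistogramInfo kHistograms[] = {HISTOGRAM_LIST(AS_INFO)};
#undef AS_INFO

// Both expansions come from one list, so they cannot drift apart.
static_assert(sizeof(kHistograms) / sizeof(kHistograms[0]) == HistogramCount,
              "every histogram id needs exactly one parameter entry");

// Look up the single set of parameters registered for an id.
inline const HistogramInfo& InfoFor(ID id) {
  assert(id < HistogramCount);
  return kHistograms[id];
}

}  // namespace TelemetrySketch
```

Because callers only ever name an id such as TelemetrySketch::MYHGRAM, there is no call site where a second, conflicting minimum, maximum or bucket count could be supplied. Keeping all definitions in one place is also what makes them easy to review in a single pass.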

Q: When will telemetry be deployed?
A: We will land the UI to turn it on as soon as an updated privacy policy is posted. It should show up in nightly builds by the end of the week.

Q: How do I access the gathered telemetry data?
A: TBD. Data is stored on the metrics server; we will figure this out soon.

Q: What about addons?
A: TBD. At the moment addons are free to inspect telemetry data in their browser. We haven’t decided on a process to let addon authors add new probes and access stats. For now, addons should not participate in telemetry.

Jun 11

Telemetry Updates

My previous post was too optimistic. There will be no telemetry in Firefox 6. Due to the multitude of reviews involved we slipped and are now aiming for Firefox 7. Bug 659396 tracks various ongoing telemetry tasks.

I updated my about:telemetry extension to work with Firefox 7 nightlies. Additionally, my friend, David, helped me apply some nasty CSS tricks to make the histograms look like histograms. I’m open to further CSS contributions. I haven’t listed the extension on AMO because we plan to have this functionality integrated into Firefox soon (hopefully 7).

To turn on telemetry, we have to: a) finish up the telemetry opt-in UI and b) update our privacy policy.

Thanks to everybody who manually flipped the pref to turn on telemetry. Having early feedback on this feature is awesome.