Extracting useful data from crash reports

Thursday, August 2nd, 2012

For a while, I’ve been extracting data about Firefox’s graphics features from crash reports. I’ve recently expanded and updated the results, which you can see here:

http://people.mozilla.org/~bjacob/gfx_features_stats/

In particular, besides what users of this page already know, new questions answered include:

Extracting this kind of data from crash reports is very easy. Here’s a nano-tutorial.

The public crash report data is available here:

https://crash-analysis.mozilla.com/crash_analysis/

You want the -pub-crashdata files. There is one per day. Download one of them, for example:

https://crash-analysis.mozilla.com/crash_analysis/20120801/20120801-pub-crashdata.csv.gz

I suggest keeping this file compressed on disk and only decompressing on-the-fly, as shown below.

Each line in this file represents one crash report. For example, to know how many crash reports there were on 20120801:

$ zcat 20120801-pub-crashdata.csv.gz | wc -l
461744

To know how many of them came from Windows XP SP2 users:

$ zcat 20120801-pub-crashdata.csv.gz | grep Windows.NT.5.1 | grep Service.Pack.2 | wc -l
45837

Note that I’m using the dot to match any character, including actual dots as in “5.1”, as it doesn’t make a difference here and I’m too lazy to properly escape dots.
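For cases where the lazy dot would make a difference, here is a minimal sketch of the two usual ways to match a literal dot: escaping it, or asking grep for fixed strings with -F. The two sample lines are made up for illustration and are not taken from a real -pub-crashdata file.

```shell
# Made-up sample lines (NOT real crash data): only the first contains a literal "5.1".
sample='Windows NT 5.1.2600 Service Pack 2
Windows NT 6.1.7601 build 5x1'

# Unescaped dot: matches any character, so "5x1" counts too.
loose=$(printf '%s\n' "$sample" | grep -c '5.1')

# Escaped dot: matches only a real dot.
strict=$(printf '%s\n' "$sample" | grep -c '5\.1')

# -F treats the whole pattern as a plain string, no escaping needed.
fixed=$(printf '%s\n' "$sample" | grep -cF '5.1')

echo "loose=$loose strict=$strict fixed=$fixed"
```

On the real data the loose pattern is unlikely to produce false positives, which is why the shortcut is harmless there.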

Just look for yourself at the first few lines of a -pub-crashdata file to see what data is in there. In particular, you get the crash report’s AppNotes, where Gecko code can write custom annotations: that is how we get to know how many users have WebGL or Layers Acceleration working, for example. You also get the crash signature, so you could plot how crashy a symbol has been over time. You also get CPU info, while GPU info is typically found in the AppNotes. And you also get the HTTP link to the full crash report, which is easy to extract with the cut command, so you could make tools that give you right away the crash links relevant to your interests.
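As a rough sketch of the cut idea, the snippet below builds a tiny stand-in file and prints the report links for one signature. The layout is an assumption made for illustration: tab-separated fields with the signature in column 1 and the link in column 2. The signatures and example.invalid URLs are invented; check the first lines of a real file for the actual column positions before reusing this.

```shell
# Build a tiny compressed stand-in file. ASSUMED layout (hypothetical):
# tab-separated, column 1 = signature, column 2 = report URL, column 3 = app notes.
sample=$(mktemp)
printf 'foo::Bar\thttp://example.invalid/report/1\tWebGL+\n'  > "$sample"
printf 'baz::Qux\thttp://example.invalid/report/2\tWebGL-\n' >> "$sample"
printf 'foo::Bar\thttp://example.invalid/report/3\tWebGL-\n' >> "$sample"
gzip -f "$sample"   # leaves "$sample.gz" on disk

# Decompress on the fly (gzip -dc does the same job as zcat),
# keep the lines mentioning the signature, and cut out column 2.
links=$(gzip -dc "$sample.gz" | grep -F 'foo::Bar' | cut -f2)
echo "$links"

rm -f "$sample.gz"
```

cut splits on tabs by default; for a different separator, pass it with -d.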

The data in my graphs was extracted by a C++ program, itself run by a Bash script.
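This is not that program. For simple per-day counts, though, a plain shell loop over the daily files may be enough. The sketch below uses two tiny made-up files named like the real ones and counts the lines mentioning a hypothetical signature, foo::Bar, in each.

```shell
# Stand-in data: two tiny "daily" files with invented contents.
workdir=$(mktemp -d)
printf 'foo::Bar\tx\nbaz::Qux\ty\nfoo::Bar\tz\n' | gzip > "$workdir/20120801-pub-crashdata.csv.gz"
printf 'baz::Qux\ty\nfoo::Bar\tz\n'              | gzip > "$workdir/20120802-pub-crashdata.csv.gz"

# One output line per day: "<date> <number of matching lines>".
report=$(
  for f in "$workdir"/*-pub-crashdata.csv.gz; do
    name=${f##*/}      # strip the directory part
    day=${name%%-*}    # keep the date before the first "-"
    count=$(gzip -dc "$f" | grep -cF 'foo::Bar')
    echo "$day $count"
  done
)
echo "$report"

rm -rf "$workdir"
```

The resulting two-column output can be fed directly to a plotting tool to see how a signature evolves over time.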