Extracting useful data from crash reports

For a while, I’ve been extracting data about Firefox’s graphics features from crash reports. I’ve recently expanded and updated the results, which you can see here:

http://people.mozilla.org/~bjacob/gfx_features_stats/

In particular, besides what users of this page already know, new questions answered include:

Extracting this kind of data from crash reports is very easy. Here’s a nano-tutorial.

The public crash report data is available here:

https://crash-analysis.mozilla.com/crash_analysis/

You want the -pub-crashdata files. There is one per day. Download one of them, for example:

https://crash-analysis.mozilla.com/crash_analysis/20120801/20120801-pub-crashdata.csv.gz

I suggest keeping this file compressed on disk and only decompressing on-the-fly, as shown below.

Each line in this file represents one crash report. For example, to know how many crash reports there were on 20120801:

$ zcat 20120801-pub-crashdata.csv.gz | wc -l
461744

To know how many of them were Windows XP SP2 users,

$ zcat 20120801-pub-crashdata.csv.gz | grep Windows.NT.5.1 | grep Service.Pack.2 | wc -l
45837

Note that I’m using the dot to match any character, including actual dots as in “5.1”, as it doesn’t make a difference here and I’m too lazy to properly escape dots.
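If you would rather not rely on the dot trick, grep's -F option treats the pattern as a literal string, so nothing needs escaping. Here is a minimal sketch of that variant. It assumes the file really contains the literal strings "Windows NT 5.1" and "Service Pack 2" (with spaces where the command above uses dots), and count_with_both is a hypothetical helper name, not something from the original scripts.

```shell
# Count the lines of a gzipped crash-data file that contain both literal strings.
# Usage: count_with_both FILE FIRST_STRING SECOND_STRING
# The file is decompressed on the fly and stays compressed on disk.
count_with_both() {
    # -F: match the string literally, so "." only matches an actual dot
    # -c: print the number of matching lines instead of the lines themselves
    gzip -dc -- "$1" | grep -F -- "$2" | grep -c -F -- "$3"
}
```

It would be called like this:

$ count_with_both 20120801-pub-crashdata.csv.gz 'Windows NT 5.1' 'Service Pack 2'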

Just look for yourself at the first few lines of a -pub-crashdata file to see what data is in there. In particular, you get the crash report’s AppNotes where Gecko code can write custom annotations: that is how we get to know how many users have WebGL or Layers Acceleration working, for example. You also get the crash signature, so you could plot how crashy a symbol has been over time. You also get CPU info, while GPU info is typically found in the AppNotes. And you also get the HTTP link to the full crash report, which is easy to extract with the cut command, so you could make tools giving you right away the crash links that are relevant to your interests.
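As a rough sketch of the cut idea: the two helpers below peek at the first lines of a file and print one column from the lines that contain a given string. Two things here are assumptions rather than facts from this post: that the fields are tab-separated (cut's default delimiter), and the field number you pass in, which you have to find by looking at the file yourself. The names peek and column_where are made up for illustration.

```shell
# Show the first few lines of a gzipped file (default: 3) to see which columns exist.
# Usage: peek FILE [COUNT]
peek() {
    gzip -dc -- "$1" | head -n "${2:-3}"
}

# Print one column from every line that contains a literal string.
# Usage: column_where FILE STRING FIELD_NUMBER
# Assumption: fields are tab-separated, which is what cut expects by default.
column_where() {
    gzip -dc -- "$1" | grep -F -- "$2" | cut -f "$3"
}
```

For example, with N standing for whichever field number turns out to hold the crash report link:

$ peek 20120801-pub-crashdata.csv.gz
$ column_where 20120801-pub-crashdata.csv.gz 'Service Pack 2' N

If the fields turn out to use another separator, cut's -d option selects it.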

The data in my graphs was extracted by a C++ program, itself run by a Bash script.
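That program and script are not reproduced here. For simple questions, a plain shell loop over the daily files can already give a per-day series. The sketch below is a swapped-in illustration, not the actual code behind the graphs: it assumes you have downloaded several -pub-crashdata.csv.gz files into one directory, and count_per_day is a hypothetical name.

```shell
# Print "DATE COUNT" for each daily file in a directory, where COUNT is the
# number of lines containing a literal string.
# Usage: count_per_day DIRECTORY STRING
count_per_day() {
    for f in "$1"/*-pub-crashdata.csv.gz; do
        [ -e "$f" ] || continue            # no matching files: skip the unexpanded pattern
        day=${f##*/}                       # strip the directory part
        day=${day%-pub-crashdata.csv.gz}   # strip the suffix, leaving e.g. 20120801
        # grep -c exits non-zero when the count is 0, hence the "|| true"
        n=$(gzip -dc -- "$f" | grep -c -F -- "$2" || true)
        printf '%s %s\n' "$day" "$n"
    done
}
```

For example, with SomeSymbol standing in for a crash signature you care about:

$ count_per_day . 'SomeSymbol'

The output has one line per day, in date order, which is easy to feed to a plotting tool.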

 

2 Responses to “Extracting useful data from crash reports”

  1. I can explain the drop of WinXP and corresponding rise of Win7 in your OS stats: Flash 11.3 was released to the full public around that time, and its Protected Mode on Vista and higher causes far more crashes and hangs than older Flash versions did. So we had a dramatic increase of crashes on Windows 7, while other OSes stayed at roughly the same levels.

  2. bjacob says:

    Thanks! Updated the page.
