For a while, I’ve been extracting data about Firefox’s graphics features from crash reports. I’ve recently expanded and updated the results, which you can see here:
http://people.mozilla.org/~bjacob/gfx_features_stats/
In particular, besides what users of this page already know, new questions answered include:
- How many crash reports did we receive each day? The recent surge of Flash-related crashes is very visible.
- What are the OS market shares among Firefox crashes? It is nice to see Android rising. More surprising is a sudden decrease of Windows XP around the time of the Firefox 13 release. That drop is too big to be entirely explained by the end of support for non-SP2 XP, at least assuming that our data about how many users are on pre-SP2 XP is accurate.
- How many users visit Websites that try to use WebGL? Nice to see the continuous increase, now approaching 5%, up from 1% not long ago. Some interesting sudden spikes.
- New: Android and Windows 8 results.
Extracting this kind of data from crash reports is very easy. Here’s a nano-tutorial.
The public crash report data is available here:
https://crash-analysis.mozilla.com/crash_analysis/
You want the -pub-crashdata files. There is one per day. Download one of them, for example:
https://crash-analysis.mozilla.com/crash_analysis/20120801/20120801-pub-crashdata.csv.gz
I suggest keeping this file compressed on disk and only decompressing on-the-fly, as shown below.
Each line in this file represents one crash report. For example, to find out how many crash reports there were on 20120801:
$ zcat 20120801-pub-crashdata.csv.gz | wc -l
461744
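To get a per-day series rather than a single number, the same count can be looped over several downloaded files. This is only a minimal sketch: it assumes the files keep their original names (so the date is the first 8 characters of the file name), and `count_reports_per_day` is a helper name made up for this example.

```shell
#!/bin/sh
# count_reports_per_day FILE...
# Hypothetical helper: prints "<date> <line count>" for each compressed file.
count_reports_per_day() {
    for f in "$@"; do
        # Assumption: file names look like 20120801-pub-crashdata.csv.gz,
        # so the first 8 characters of the base name are the date.
        day=$(basename "$f" | cut -c1-8)
        # gzip -dc decompresses on the fly, like zcat; the file stays
        # compressed on disk. tr strips the padding some wc versions add.
        n=$(gzip -dc "$f" | wc -l | tr -d ' ')
        echo "$day $n"
    done
}
```

Calling `count_reports_per_day *-pub-crashdata.csv.gz` in the download directory would then print one line per day, ready to feed to a plotting tool.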
To find out how many of the reports in that file came from Windows XP SP2 users:
$ zcat 20120801-pub-crashdata.csv.gz | grep Windows.NT.5.1 | grep Service.Pack.2 | wc -l
45837
Note that I’m using the dot to match any character, including actual dots as in “5.1”, as it doesn’t make a difference here and I’m too lazy to properly escape dots.
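For the less lazy, here is a sketch of the same count with the one real dot escaped, plus a small helper to turn two counts into a percentage. Both function names are invented for this example. The other dots are deliberately left as "any character", because this sketch does not assume which separators (spaces or tabs) sit between the words in the file.

```shell
#!/bin/sh
# count_xp_sp2 FILE
# Hypothetical helper: counts the lines of a compressed crash data file
# that look like Windows XP SP2 reports.
count_xp_sp2() {
    # '\.' matches only a literal dot, so "5.1" no longer matches e.g. "501".
    # The remaining dots still match any single character.
    # grep -c prints the number of matching lines, replacing "| wc -l".
    gzip -dc "$1" | grep 'Windows.NT.5\.1' | grep -c 'Service.Pack.2'
}

# share_percent PART TOTAL
# Hypothetical helper: prints PART as a percentage of TOTAL, one decimal.
share_percent() {
    LC_ALL=C awk -v n="$1" -v total="$2" 'BEGIN { printf "%.1f\n", 100 * n / total }'
}
```

With the two numbers above, `share_percent 45837 461744` prints 9.9, so roughly one crash report in ten that day came from XP SP2.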
Just look for yourself at the first few lines of a -pub-crashdata file to see what data is in there. In particular, you get the crash report’s AppNotes, where Gecko code can write custom annotations: that is how we get to know how many users have WebGL or Layers Acceleration working, for example. You also get the crash signature, so you could plot how crashy a symbol has been over time. You also get CPU info, while GPU info is typically found in the AppNotes. And you also get the HTTP link to the full crash report, which is easy to extract with the cut command, so you could make tools that give you right away the crash links relevant to your interests.
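As a sketch of the cut approach, the helpers below pull one column out of a file by name. They rest on two assumptions you should check against the first lines of a real file: that fields are separated by tabs, and that the first line is a header naming the columns. The function names are invented for this example, and the column names are whatever the header actually says.

```shell
#!/bin/sh
# column_index FILE NAME
# Hypothetical helper: prints the 1-based position of column NAME,
# assuming the first line of FILE is a tab-separated header.
column_index() {
    gzip -dc "$1" | head -n 1 | tr '\t' '\n' \
        | grep -n -x -F -- "$2" | head -n 1 | cut -d: -f1
}

# extract_column FILE NAME
# Hypothetical helper: prints the values of column NAME, one per line,
# skipping the assumed header line.
extract_column() {
    idx=$(column_index "$1" "$2")
    if [ -z "$idx" ]; then
        echo "no column named $2" >&2
        return 1
    fi
    # cut splits on tabs by default.
    gzip -dc "$1" | tail -n +2 | cut -f "$idx"
}
```

Piping the output of `extract_column` for the signature column through `sort | uniq -c | sort -rn | head` would list the most frequent signatures of the day, and the same call with the name of the link column gives the crash report links.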
The data in my graphs was extracted by a C++ program, itself run by a Bash script.