2012 Mozilla DB Year in Graphs

Sheeri

I’m not a wizard with infographics, but I can do a few pie charts. I copied the data to the right of the pie charts for those who want to see the numbers. Overall, there are almost 400 databases at Mozilla, in 11 different categories. Here is how each category fares in number of databases:

Mozilla DBs in 2012

Here is how each category measures up with regard to database size – clearly, our crash-stats database (which is on Postgres, not MySQL) is the largest:

2012 size of all Mozilla databases

Because crash-stats dwarfs everything else, here is another pie chart with the relative sizes of just the MySQL databases:
2012 size of MySQL databases at Mozilla

I’m sure I’ve miscategorized some things (for instance, are metrics on AMO classified under AMO/Marketplace or “internal tools”?), but here are the categories I used:

Categories:
air.m.o – air.mozilla.org
AMO/Marketplace – addons/marketplace
blog/web page – A db behind a blog or mostly static web page.
bugzilla – Bugzilla
Crash-stats – Socorro, crash-stats.mozilla.com – Where apps like Firefox send crash details.
Internal tool – If the db behind this is down, moco/mofo people may not be able to do their work. This covers applications from graphs.mozilla.org to inventory.mozilla.org to the PTO app.
release tool – If this db is down, releases cannot happen (but this db is not a tree-closing db).
SUMO – support.mozilla.org
Tree-closing – If this db is down, the tree closes (and releases can’t happen).
World-facing – If this db is down, non-moco/mofo people will notice. These are specifically tools that folks interact with, including the Mozilla Developer Network and sites like gameon.mozilla.org.
World-interfacing – This db is critical to tools we use to interface with the world, though not necessarily world-visible. Examples include basket.mozilla.org and Mozillians.

The count of databases includes all production/dev/stage servers. The size, however, counts only one machine per environment rather than every server. For example, Bugzilla has 6 servers in use (4 in production and 2 in stage), and its size is the size of the master in production and the master in stage, combined. This way we have not grossly inflated the size of the database, even though technically speaking we do have to manage the data on each of the servers.
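To make the counting rule concrete, here is a minimal Python sketch of it. This is not the script used to produce the charts (the numbers were gathered by hand). The `Server` class, the `summarize` function, and the sizes are all hypothetical; only the Bugzilla server layout (4 in production, 2 in stage) comes from the paragraph above, and the sketch assumes one master per environment.

```python
from dataclasses import dataclass


@dataclass
class Server:
    category: str      # e.g. "bugzilla"
    environment: str   # "production", "stage", or "dev"
    is_master: bool    # assumption: exactly one master per environment
    size_gb: float     # made-up placeholder, not a real measurement


def summarize(servers):
    """Return {category: (database_count, size_gb)}.

    Every server adds to the count, but only masters add to the size,
    so replicas do not inflate the total.
    """
    summary = {}
    for s in servers:
        count, size = summary.get(s.category, (0, 0.0))
        count += 1
        if s.is_master:
            size += s.size_gb
        summary[s.category] = (count, size)
    return summary


# Bugzilla layout from the post: 4 production servers and 2 stage servers.
# The first server in each environment is treated as the master.
servers = (
    [Server("bugzilla", "production", i == 0, 100.0) for i in range(4)]
    + [Server("bugzilla", "stage", i == 0, 40.0) for i in range(2)]
)

summary = summarize(servers)
print(summary["bugzilla"])  # (6, 140.0)
```

With these placeholder sizes, Bugzilla counts as 6 databases but contributes only 140 GB (one production master plus one stage master), not the 480 GB you would get by adding up every server.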

For next year, I hope to be able to gather this kind of information automatically, and have easily accessible comprehensive numbers for bandwidth, number of queries per day on each server, and more.

2 responses

  1. Laura Thomson wrote on :

    Heh. I didn’t realize crash-stats was SO much bigger than everything else!

    1. Sheeri wrote on :

      Laura, in fact, the reason I used pie charts instead of some other format is to show the difference between number of databases and size of databases. Bandwidth numbers will be more difficult, but they’ll show more clearly where our efforts are really needed.