How big can a table in a database get? Well, the answer varies by database, for most modern databases, the answer is “really huge”. That’s what we’ve going in Socorro, some honking big tables. Queries can get slow on big tables. Sure you can add indexing to prevent having to scan the whole thing for the most common queries, but you can’t index every column without slowing performance elsewhere and using up ever more disk space. Indexes can get really huge, too.
- tables divided into sub tables
- queries optimized to use smaller subtables
- ad hoc queries sped up
- summary queries ignore irrelevant data
- smaller indexes
- simplified data retirement
- fresh lemon scent*
Partitioning, a feature offered by many RDMSs, is a trick to help manage titanic tables. You take a table and break it up into smaller tables, each containing a part of the whole. Say we have a data set called ‘reports’. Rather than storing them all in one table, we store them in several smaller tables. The data in each smaller table share a common trait. For example, for reports from Week 1, Week 2, and Week 3 would each have their own table. The master table ‘reports’ physically has no data in it at all. The database knows that when we reference ‘reports’ it means the union of all the smaller tables. If we do a query on the ‘reports’ table and we ask for reports from January, the database is clever enough to just look to the weekly sub-tables for January instead of looking at all the sub-tables.
We’re currently converting the Socorro database into a partitioned database (okay, technically it was already partitioned, but its partitioning was degenerate and didn’t work properly). We’re testing a Python script that is going to take the ‘reports’ table, along with the associated ‘frames’ and ‘dumps’ tables, and start breaking them into little one week chunks. Unfortunately, because of the massive size of the tables, we cannot afford to have two copies of the data in the database at the same time. The chunking of the data will be destructive. After a week of data is copied into a new partition, that corresponding week of data will be deleted from the original table.
The database, the Socorro Web App, breakpad crash processing and aggregate analysis will be down during the migration process. However, data collection will not be down: we’re not going to lose new crash data.
If we were to chunk the entire dataset, the migration process is estimated to take more than twenty hours. As a compromise, we’re going to chunk only the most recent four weeks of data and leave the rest as a single oversize partition. This will significantly reduce the time that migration takes and, therefore reduce the down time. We can get away with this because most of the aggregate reports only look at the most recent few weeks of data anyway.
Another advantage to partitioning is in the retirement of old data. In the future, we’re probably only going to keep at most one hundred twenty days of history. Any more than that and our storage needs would require its own building. To get rid of the old data, all we need to do is delete the oldest partitions. That action is fast because it doesn’t even require looking at indexes or scanning tables.
Partitioning is going to allow Socorro to scale much more smoothly. At the same time, it will make our aggregate reporting much more efficient.
This repartitioning process will happen within the next week. We will announce the scheduled down time in advance. And be assured, because of the file system changes what we made during our last big Socorro update, data collection will not be down while we’re repartitioning.
* also available in fresh wintergreen and sparkling pumpkin