Bouncer Updates

Bouncer had a few updates last night:

  • Database library rewritten to be lightweight and use memcache (see tests)
  • Overall requests per second performance increased 2x
  • Database usage down to about zero because of memcache

To see the difference, I’ll show a couple of database graphs. Note that logging is still turned on, so there are still database connections used for updating product and mirror counts. We plan on disabling this in the future because these stats are backfilled from HTTP logs.

Pretty CPU graph:

Database server CPU usage

Pretty traffic graph:

So CPU and traffic data was reduced significantly. Here’s why:

  • Bouncer is a special type of app – high amount of reads over a short period of time
  • The majority of these database reads are repeated over and over, especially recent releases (Firefox 2.0.0.6 for example)

Again, database traffic flatlined during testing when logging was disabled so what you’re seeing is basically download count updates and the connections needed to run them.

To get an idea of exactly how much memcache is used, here are stats from memcache as of this morning (single server, after approximately 12 hours of usage):

  • Gets: 29884170
  • Misses: 199515
  • Total Gets: 30083685
  • Hit %: 99.33679999641

This means that of all queries, 99.33% were read from the cache. Not bad.

This patch took two weeks longer than expected. Challenges we faced:

  • Getting random mirror selection to work when not using “ORDER BY 1” as the way to randomize mirror selection — this was replaced by a simple hash and array sort algorithm that uses array values to weight overall probability of selection when combined with a random seed. In general, random selection via SQL is inefficient and you’re better off using another method besides just RAND() — stored procedures, app code, whatever — but when you’re doing “ORDER BY 1” you’re in for some pain. Use EXPLAIN — it does not lie.
  • Being stingy with MySQL connections. The database library rewrite was required because we did not want to even connect to the database when memcache had all of our results cached. However, mysql_real_escape_string() was used to escape queries and it requires an open database handle to work. That meant that we had to only escape if we knew we were going to perform a query. So we had to move cleaning of SQL inputs to inside the query callback function. This was done by…
  • Breaking away from the PHP4 mysql habit. PHP 5.x has better support for this via mysqli but since we’re still on PHP4 for a little while, in order to mimic prepared statements without using a database interface we used a method much like what was discussed among WordPress developers. It takes us from a connect() ->clean -> concatenate sql -> query() approach to a prepared sql and args -> query() -> autoclean approach.

Overall, this is another positive experience with memcache as a query cache. We use the same method in AMO and plan on using it in other apps that will have to scale to any reasonable level. It saves money in hardware and turns little apps that can sort-of do the job into supercharged workhorses. Yay.