Load Balancer performance issues, fxfeeds.mozilla.org & versioncheck


I mentioned briefly in Monday’s meeting the performance issues we’re having with our load balancers.  Since then, we’ve been hustling to turn up something in the short term to handle Thursday’s Major Update and Firefox 3.0.5/ release (see here).

For a number of months we’ve been looking at Zeus and their ZXTM product.  It has some advantages (and some disadvantages), and one of the biggest advantages is that it’s all software, so I can quickly deploy it on any available hardware we have.  Zeus has been extremely helpful in quickly issuing unlimited node eval license keys on short notice (so thanks Jasper/Chris) and providing the level of support I’d more likely expect to be given to a paying customer.

We’ve shifted two of the highest traffic back-end infrastructure services for Firefox over to a Zeus ZXTM cluster:

  1. fxfeeds.mozilla.org – Live Bookmarks feed URL in a default Firefox install.
  2. versioncheck.addons.mozilla.org – URL Firefox uses to check for Add-On updates.

fxfeeds.mozilla.org is nothing more than an RSS feed.  The site itself has no real content – there isn’t even a DocumentRoot defined in Apache.  In fact, the config is nothing but 40 HTTP redirects:

[root@mradm02 domains]# egrep 'Redirect|Rewrite' fxfeeds.mozilla.org.conf  | wc -l
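To give a feel for what a redirect-only config looks like, here is a minimal sketch. The sample file, hostnames, and URLs are invented for illustration and are not taken from the real fxfeeds.mozilla.org.conf; `count_redirects` is a hypothetical helper that does the same count as the command above using `grep -c`.

```shell
# Hypothetical helper: count lines mentioning Redirect or Rewrite directives.
count_redirects() {
    # -c prints the number of matching lines, replacing the `| wc -l` step.
    grep -Ec 'Redirect|Rewrite' "$1"
}

# Invented stand-in for a vhost with no DocumentRoot (assumption, not the real file).
sample_conf=$(mktemp)
cat > "$sample_conf" <<'EOF'
<VirtualHost *:80>
    ServerName fxfeeds.example.org
    # No DocumentRoot: every request is bounced to another URL.
    Redirect 302 /firefox/headlines.xml http://feeds.example.org/headlines.xml
    RewriteEngine On
    RewriteRule ^/en-US/firefox/livebookmarks/?$ http://feeds.example.org/en-US.xml [R=302,L]
</VirtualHost>
EOF

count_redirects "$sample_conf"   # prints 3 for this sample
rm -f "$sample_conf"
```

Note that a pattern this broad also counts lines such as `RewriteEngine On`, so a count produced this way is an upper bound on the number of actual redirect rules.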

versioncheck.addons.mozilla.org is my one Firefox 3 claim-to-fame.  Without that change, a lot of the scalability work we’re doing would be extremely difficult and require a lot more QA time.

Since Firefox checks for Add-On updates after updating itself (and periodically on its own), versioncheck.addons.mozilla.org is one of the sites that sees a large spike in traffic around release time.  Its content is also highly cacheable, so it is easy to scale with CPU (for SSL) and memory (for cache).

I moved fxfeeds.mozilla.org over yesterday and was astonished to see what sort of traffic that site alone generated – staggering might be a better way to say it.  Nearly 40Mbps of basically 302 redirects and upwards of 400,000 connections/second (and incidentally, moving that site off the Netscalers significantly eased their resource issues).
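A rough back-of-envelope check on those figures: dividing bandwidth by connection rate gives the average payload per connection. The sketch below uses a hypothetical `bytes_per_connection` helper, treats 40Mbps as exactly 40,000,000 bits per second, and ignores protocol overhead. The second call uses the 200,000-hits-per-minute reading of the graph suggested in the comments below.

```shell
# Hypothetical helper: average bytes per connection.
# Arguments: megabits per second, connection count, interval in seconds.
bytes_per_connection() {
    awk -v mbps="$1" -v count="$2" -v secs="$3" \
        'BEGIN { printf "%.1f\n", mbps * 1000000 / 8 * secs / count }'
}

# Figures as written in the post: 40 Mbps at 400,000 connections per second.
bytes_per_connection 40 400000 1    # prints 12.5

# Same bandwidth with the graph read as 200,000 hits per minute.
bytes_per_connection 40 200000 60   # prints 1500.0
```

At 12.5 bytes there is not enough room for even the headers of a 302 response, while roughly 1,500 bytes per hit is plausible for a redirect, so the per-minute reading fits the bandwidth figure much better.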

versioncheck.addons.mozilla.org was moved onto a test ZXTM cluster Monday morning and moved again later this afternoon onto a more production-capable ZXTM cluster.  It’ll push 30Mbps on its own during non-release periods and 2-3x that during a full release.

Since no post is good without pictures, I grabbed two graphs from the ZXTM cluster.  The first shows connections/second and the other shows bandwidth (bits per second).


Categories: load balancing, Mozilla

4 responses

  1. Archaeopteryx wrote on :

    I run a script which collects data from versioncheck.addons.mozilla.org, and it suddenly takes three times as long to finish as it did in the past. This started a few weeks ago already, probably around November 17th.
    Hint: I’m in Germany and my AMO is in Amsterdam.

  2. Neil Rashbrook wrote on :

    Your screenshot actually shows 200,000 hits per minute, not second, which makes much more sense.

  3. mrz wrote on :

    @Archaeopteryx: that matches the time frame when versioncheck was pulled away from GSLB & Amsterdam because of performance problems.

    Right now, versioncheck is only being served from San Jose. I deemed this okay since it’s a backend service and not as noticeable to users as other sites.

    It is my intention to put it back in some form of GSLB.

  4. mrz wrote on :

    @Neil – good point. It was late when I was looking at that data and posting 🙂