Shortly before 12:30am PDT I had to roll back the DNS changes to AMO and serve it only out of San Jose. Around this time, Europe started coming online and pushed traffic loads up, exhausting the capabilities of the Netscalers in Amsterdam.
Unfortunately, when SSL load hit nearly 900 transactions/second, the CPU was pegged at 100% and the box started failing external health checks and performing “oddly”.
I mentioned elsewhere that the pair in Amsterdam is a pair of Netscaler 7000s without hardware SSL offloading. The glossy material from Citrix says I should be able to get 4400 SSL trans/second. Admittedly the box is doing more than just SSL (caching, compressing, RTT probes), but not even getting to 1/4 of that number sucks.
(We had exactly the same problem with the 9000s (4400 SSL tps) and 10ks (8800 SSL tps): during release periods we’d easily top out at more than 3k SSL trans/sec, below their 4400/8800 mark, and the boxes would fall over on themselves. We’re now running on the 12ks, which have two SSL hardware cards and two CPUs and perform much better, but I’m not sure where Citrix gets its numbers.)
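For a rough sense of the gap, here is a back-of-the-envelope sketch in Python. The rated figures are the ones quoted above; the observed figures are approximations (about 900 for the 7000s, and 3000 taken as a floor for "more than 3k" on the 9000s and 10ks), and the names are made up for illustration.

```python
# Rated SSL transactions/second, as quoted from the vendor material above.
RATED_TPS = {"7000": 4400, "9000": 4400, "10k": 8800}

# Approximate load at which each model fell over (assumed values:
# ~900 for the 7000s, 3000 as a lower bound for "more than 3k").
OBSERVED_TPS = {"7000": 900, "9000": 3000, "10k": 3000}


def fraction_of_rated(model):
    """Return observed SSL tps as a fraction of the rated SSL tps."""
    return OBSERVED_TPS[model] / RATED_TPS[model]


for model in RATED_TPS:
    print(f"Netscaler {model}: {fraction_of_rated(model):.0%} of rated SSL tps")
```

On these numbers the 7000s gave out at roughly a fifth of their rated figure, which is where the "not even 1/4" above comes from.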
On the success side, AMO quite quickly started pushing a significant amount of bandwidth out of Amsterdam.
I rolled back before peak traffic, but while the change was live a good 11% of AMO traffic was served out of Amsterdam, and I got a lot of feedback from other channels that performance was better.
So what’s the next step? I’ll be shipping out a replacement pair of Netscaler 9000s this week that do have an SSL offload card, and we’ll retry this in a couple of weeks when they’re online.
While the Netscaler clearly failed to keep up with the load, I should point out that I’m a huge fan of the product. If I had to build out some non-commercial solution using lighttpd or squid or something else to handle AMO (and the SSL traffic and load balancing and GSLB and HA), I’d have spent more than I spent on the Netscalers.
P.S. Anyone more local to Amsterdam who wants to help with racking?