We run our web farm behind a pair of Citrix Netscalers in both San Jose and Amsterdam. What really hits these boxes hard is the SSL-offloaded traffic, which in some instances has caused the Netscalers to fall over.
Right now our setup looks like:
(Pardon the shapes, it’s what I have to work with.)
So I’ve been toying with the idea of how to scale our web infrastructure horizontally. What I really want is some sort of “simple” high performance caching reverse proxy that can terminate a large number of SSL client connections (and compress too!).
I’ve been looking at lighttpd (I’d look at Varnish, but it’s not yet where I need it to be). My thinking is to load balance across a pool of lighttpd servers, which would terminate SSL sessions and proxy back to the Netscaler. That way I can still take advantage of its caching and caching policy engine, and of its global load balancing, which I’m tied to because we’re using dynamic proximity probes to load balance. The alternative would be to hack up BIND with some sort of geo-IP database, which wouldn’t account for brownouts.
That setup might look like this:
Of course, in places like Amsterdam, the backend servers would be nearly on the other side of the planet, in San Jose.
The question is whether lighttpd could handle the SSL load we currently see, which during a non-release cycle hits around 3,000 SSL transactions/second (and nearly double that during a release), or how many lighttpd front-ends I’d need to run to match that.
One option would be hardware SSL accelerators. The Citrix Netscaler 12000 appears to use two Cavium Networks NITROX cards; presumably even one of those could hit anywhere between 14k and 28k SSL transactions/second. The guys over at Zeus seem to think that a dual dual-core Opteron running a 64-bit OS could match or outperform hardware accelerators.
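To get a feel for how many front-ends that implies, here’s a back-of-the-envelope calculation. The per-box figure below is a pure placeholder until I can actually benchmark a DL360 doing software SSL — don’t quote it:

```shell
#!/bin/sh
# Peak load: roughly double the ~3000 TPS non-release baseline.
peak_tps=6000
# Assumed software-SSL capacity per lighttpd box -- a guess, to be measured.
per_box_tps=800
# Ceiling division: how many front-ends cover the peak.
frontends=$(( (peak_tps + per_box_tps - 1) / per_box_tps ))
echo "$frontends frontends needed"
```

With those made-up numbers it comes out to 8 boxes; the real answer depends entirely on the measured per-box handshake rate.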
I suppose the only real way to find out is to test it!
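For a quick first-order number, `openssl speed` gives a ceiling on a box’s software-SSL capacity, since RSA private-key operations roughly bound full handshakes/second; `openssl s_time` can then give an end-to-end figure against a live front-end (the host below is a placeholder):

```shell
#!/bin/sh
# RSA private-key signs/sec roughly bound full SSL handshakes/sec,
# so this gives a quick upper bound for one box.
openssl speed rsa1024

# For an end-to-end number, hammer a frontend directly from a nearby
# machine (so latency doesn't dominate). -new forces a full handshake
# per connection; compare with -reuse to see session caching help.
# openssl s_time -connect some-frontend:443 -new -time 30
```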
I’m still trying to deploy this, and since I’m still waiting for the new Netscalers to arrive in Amsterdam (and don’t have any Cavium cards), I went ahead and set up two lighttpd servers (both HP DL360s running 32-bit RHEL 4) behind the current Netscaler 7000s in Amsterdam. The Netscaler passes all traffic to lighttpd, which terminates SSL and proxies back to the Netscaler (which has better caching mechanisms), which in turn proxies back to the real addons.mozilla.org site. And yes, that was a confusing setup to configure.
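The lighttpd side of that setup boils down to something like the sketch below — terminate SSL on :443 and proxy everything back to the Netscaler on plain HTTP. The certificate path and backend address are placeholders, not our actual config:

```
# Hypothetical lighttpd 1.4 sketch: SSL termination + reverse proxy
server.modules += ( "mod_proxy" )

$SERVER["socket"] == "0.0.0.0:443" {
  ssl.engine  = "enable"
  ssl.pemfile = "/etc/lighttpd/ssl/addons.mozilla.org.pem"
  # Hand everything back to the Netscaler VIP over plain HTTP
  proxy.server = ( "" => ( ( "host" => "10.0.0.10", "port" => 80 ) ) )
}
```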
I’d be really interested in feedback from folks (and performance numbers on lighttpd + SSL, if anyone has them) on how this setup works. If you want to play:
The IP there is 220.127.116.11; you should be able to test by changing your hosts file.
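Concretely, a hosts-file entry like this (using the IP above) should route you through the Amsterdam front-ends:

```
# /etc/hosts (or C:\WINDOWS\system32\drivers\etc\hosts on Windows)
220.127.116.11   addons.mozilla.org
```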
Notes about the install:
- public pages are cached entries delivered from the local NS
- logged-in pages are actually from the sjc cluster
- admin/dev/editor pages are all still from the sjc cluster