Followup to "SSL Session Caching (in nginx)"
Posted: Sun, 24 July 2011
My article a few weeks ago about SSL Session Caching in nginx garnered a lot more interest than I was expecting, including a mention on Hacker News. The comments on Hacker News raised a few issues that I thought I’d respond to.
Firstly, several people questioned my assertion that session affinity is the root of much evil; this is a fairly big and (to my mind) important issue, so I’m going to save that whole ball of wax for a dedicated post (or two, probably) in the near future.
In the meantime, though, let’s take the other topics from the HN comments in order of appearance.
First off the line is mentat, who wrote:
Crypto hardware acceleration is commodity. I believe that Broadcom sold the chips for <$5 over 5 years ago.
That’s great, but where can I buy one of these $5 crypto accelerator chips? I’m guessing that price is probably for lots of a bazillion or more – and even if I can buy just one of them, a chip by itself isn’t much use to anyone.
Crypto accelerator cards (far more useful than the mere chips), on the other hand, aren’t cheap. The first one I found with a quick search for “crypto accelerator card” was from Sun/Oracle (Suracle?) with a list price of US$9,950.00. Sure, it’s PCIe and RoHS-6 compliant, but that’s still a fair chunk of coin for something that, according to Google (via sugarcode), doesn’t make any difference with modern hardware (more on that later).
To say they don’t scale horizontally doesn’t make sense either. If they’re integrated with PCIe then you can get a lot of crypto processing in a single chasis.
I think mentat and I might be working from different ideas of what constitutes “horizontal scalability”. Stuffing a box full of PCIe crypto accelerators is, to me, just “build a bigger box”, that is, vertical scaling.
mentat finishes with:
(Disclosure: Used to make crypto accelerators and load balancers)
Uncleeeean! Uncleeeeean! (Just kidding; it’s good to know where people are coming from).
Next, we move on to nwmcsween, who points out:
You’ll usually run out of entropy before cpu usage becomes relevant with SSL processing, I’ve seen old versions of apache hang with little or no entropy to process SSL connections.
I must admit that I’ve never taken much notice of entropy levels, simply because it’s never been an (observed) problem on a site I’ve run (SSL-heavy or otherwise). He adds:
I recommend some sort of RNG or a poor mans software version such as http://www.issihosts.com/haveged/
Certainly, hardware RNG (done properly) is a Good Thing if you need more entropy, and haveged looks interesting if you’re into that (poor man’s pure-software entropy generation). Of course, load-balancing SSL onto your web servers is another option, as it spreads the entropy drain across multiple independent systems rather than concentrating it in your load balancer, which means there’s less likelihood of running out of entropy without the need for dedicated hardware. Just sayin’…
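If you'd rather take notice of entropy levels than ignore them as I have, here is a minimal sketch of a check, assuming a Linux box that exposes the kernel's entropy estimate at `/proc/sys/kernel/random/entropy_avail`. The function names and the threshold of 200 bits are my own illustrative choices, not anything nwmcsween or haveged prescribes, and newer kernels account for entropy differently, so treat the number as a rough indicator only.

```python
from pathlib import Path

# Linux exposes the kernel's entropy estimate (in bits) at this path.
ENTROPY_PATH = "/proc/sys/kernel/random/entropy_avail"


def entropy_available(path=ENTROPY_PATH):
    """Return the kernel's entropy estimate in bits, or None if it can't be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def entropy_is_low(bits, threshold=200):
    """True when the estimate is below an arbitrary, illustrative threshold."""
    return bits is not None and bits < threshold


if __name__ == "__main__":
    bits = entropy_available()
    if bits is None:
        print("entropy estimate not available on this system")
    else:
        print(f"entropy_avail: {bits} bits (low: {entropy_is_low(bits)})")
```

Run from cron or a monitoring agent on both the load balancer and the web servers, something like this would at least show where the entropy drain is concentrated.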
mmaunder steps up next, saying:
Nginx is becoming the standard for front end load balancing for many high traffic sites and this helps.
I’ve used nginx as a load balancer, and it’s not pretty. All of the nginx load-balancing modules I’ve used, or seen used (I can think of at least five off the top of my head), have fallen apart under load, or not balanced intelligently, or just been plain bad. The modules I’ve used also tend to be hard to instrument, which makes working out exactly why they’re failing a bit of an adventure. I’m sure (I’d hope, at least) the load-balancing modules in nginx have improved over time, but I’d still be very wary of them. In short: nginx is a kick-arse webserver, and I highly recommend it for that purpose, but as a load balancer I’d find something else.
My preference is for IPVS almost everywhere, as it runs at the IP layer and completely avoids all the ugly problems you just can’t avoid with a proxy. If you do feel the need to use a proxy, though, I would strongly recommend HAProxy over nginx. There are (narrow) circumstances in which a proxy is the best solution for the job, and I think HAProxy is the best of the bunch.
Coming in at an appropriate moment in this (sub-)topic is sugarcode, who remarks:
One downside of this approach (without some funky iptables/networking-fu) is that you loose the source IP from the original request. Adding headers like X-Forwarded-For only works after the request has been decrypted, so all the traffic will appear to source from the load balancer, which can present its own issues.
Which is just one (more) reason I don’t do L7 proxying…
IMO (and I believe Google agrees - http://www.imperialviolet.org/2010/06/25/overclocking-ssl.html) the advantages of terminating SSL at the load balancer outweigh the horizontal scalability of this approach, at least in most cases.
I’m not sure how the article that sugarcode links to supports his/her assertion that terminating SSL at the load balancer is better. The way I read it, the article is more an indictment of hardware SSL acceleration than an argument for or against any particular method of load balancing. (The tl;dr version of the article, in case you’re in a hurry, is that CPUs are fast enough that SSL for everything is now quite practical, and you should enable SSL for everything, everywhere, all the time. I don’t disagree.)
One particular gem from the article is this statistic:
Modern hardware can perform 1500 handshakes/second/core.
That’s a decent number of connections per second, to be sure. As I write this, I can see about 220 SSL connections per second on one fairly busy site whose load balancer stats I have access to. I don’t have long-term stats in front of me to know what the site does at peak, but even if there’s a 10x increase at peak, that’s well within the realms of what a single not-so-powerful server could handle (SSL-wise) these days.
However, I’ve previously worked on a site that was doing, at peak, 6000 (HTTP, thankfully) connections per second, some 6+ years ago (when CPUs were slower, a dual-core, dual-CPU server was a fairly schmick bit of kit, and quad-core CPUs a geek’s wet dream). There was plenty of demand for more traffic; the limitation was the size of the backend cluster and the efficiency of the software on those backends, so it could have been doing a lot more than 6000/second if those other bottlenecks could have been removed. If the site had been running HTTPS exclusively, the poor load balancer would have been having a fine time of it terminating all that SSL traffic.
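To put rough numbers on those figures, here is a back-of-the-envelope sketch. It assumes the worst case, where every new connection costs a full handshake (no session resumption at all), and it applies the article's 1500 handshakes/second/core figure across the board. That figure is for modern hardware, so it flatters the older site considerably; the function name is just my own for illustration.

```python
import math

# Figure quoted from the linked article; real numbers depend on key size,
# cipher suite and CPU, so treat this as an assumption.
HANDSHAKES_PER_SEC_PER_CORE = 1500


def cores_needed(connections_per_sec, per_core=HANDSHAKES_PER_SEC_PER_CORE):
    """Cores required if every new connection costs one full SSL handshake."""
    return math.ceil(connections_per_sec / per_core)


if __name__ == "__main__":
    scenarios = [
        ("currently observed", 220),
        ("hypothetical 10x peak", 2200),
        ("older site at peak", 6000),
    ]
    for label, rate in scenarios:
        print(f"{label}: {rate}/s -> {cores_needed(rate)} core(s)")
```

Even on generous assumptions, the 6000/second site would have tied up four modern cores doing nothing but handshakes, and it had nothing like modern cores to spare.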
Finally, to wrap things up, tobylane asks:
Sounds useful, but how many visitors do you need to have for this to be worth doing?
To which WALoeIII replies:
If you have any visitors you are doing them a huge disservice if you do not have SSL session caching.
You would use external SSL caching like this if you have more than one SSL termination point (typically a webserver like nginx/Apache) behind a load-balancer.
Which I think sums the whole thing up nicely. You should always be doing SSL session caching with HTTPS, and if you’ve got a load-balanced setup using nginx, you now have the technology to do it scalably.
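To see why a shared cache matters once there is more than one termination point, here is a toy simulation. It is a sketch under loud assumptions: the `Backend` class and `simulate` function are hypothetical, a client's number stands in for its SSL session ID, requests are spread round-robin with no affinity, and nothing ever expires. It is not how nginx (or any real SSL stack) implements its cache.

```python
import itertools


class Backend:
    """A toy SSL termination point holding a session cache (a plain set)."""

    def __init__(self, cache):
        self.cache = cache
        self.full_handshakes = 0
        self.resumed = 0

    def connect(self, session_id):
        # A cache hit means the session can be resumed cheaply;
        # a miss means a full handshake, after which the session is cached.
        if session_id in self.cache:
            self.resumed += 1
        else:
            self.full_handshakes += 1
            self.cache.add(session_id)


def simulate(num_backends, clients, rounds, shared):
    """Return (full_handshakes, resumed) across all backends."""
    shared_cache = set()
    backends = [
        Backend(shared_cache if shared else set()) for _ in range(num_backends)
    ]
    round_robin = itertools.cycle(backends)  # no session affinity
    for _ in range(rounds):
        for client in range(clients):
            next(round_robin).connect(client)
    return (
        sum(b.full_handshakes for b in backends),
        sum(b.resumed for b in backends),
    )


if __name__ == "__main__":
    print("per-backend caches:", simulate(3, 4, 3, shared=False))
    print("shared cache:      ", simulate(3, 4, 3, shared=True))
```

With three backends and four clients each connecting three times, the per-backend caches never get a hit in this run (every connection lands on a backend that hasn't seen that client), while the shared cache needs only one full handshake per client.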
From: Willy TARREAU
I completely agree with your points. SSL is becoming cheaper with larger CPUs. I still see a benefit for HW crypto, though, but that’s in a very limited area : when you want to boost SSL performance of performance-limited devices (eg: small appliances). For instance, it might be worth offloading SSL handshakes to a cheap card when the main CPU is an atom or pentium-m that perfectly fits the purpose except for SSL. But whenever you can afford bigger machines and higher power usage, software-based SSL will be a lot cheaper than hardware-based, and more importantly it’s easier to scale.
Cheap enough hardware randomness
From: Matt Palmer
36 pounds plus VAT and shipping… I guess “cheap enough” is in the eye of the beholder. If you need a lot of entropy, then it’s priceless.