OK, so I was curious and looked into the headers of healthcare.gov, and to my surprise it's powered by, err, Apache, and I see NO evidence of accelerators such as Varnish or Nginx. Who thought it possible for Apache alone to serve millions, or even hundreds of thousands, of visitors per day on a heavy site like this? Also, as explained below, Apache was not configured correctly, which meant it had no chance!
Here are Healthcare.gov’s main headers:
HTTP/1.1 200 OK
… the only good news here is that at least gzip is being used.
The next problem, also indicated by the headers, or a lack thereof, is no caching of static files. For example, all of these files (and many others) are missing caching headers:
… this means each of those files has to be served by Apache again every time a user visits a page, refreshes, etc. The lack of caching becomes an even more critical issue when there are programming issues and site errors: as users refresh and revisit pages repeatedly because of those errors, the load on the server multiplies greatly! For example, 1 million visitors retrying a “single” page just 2 to 3 times would have resulted in 2 to 3 million requests for “each” static file! A cache TTL of even 1 hour could have lowered the load on Apache significantly.
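To make the arithmetic above concrete, here is a rough back-of-the-envelope sketch in Python. The function name and the model are my own simplification, not anything measured on the real site: it assumes every visitor starts with an empty browser cache and that all of a visitor's retries fall inside the cache TTL (which an hour-long TTL would plausibly cover).

```python
def origin_requests_per_file(visitors, loads_per_visitor, cached):
    """Estimate how many requests the server sees for ONE static file.

    Simplified model (an assumption, not a measurement):
    - uncached: every page load re-requests the file
    - cached:   only a visitor's first load reaches the server, because
                all retries are assumed to fall inside the cache TTL
    """
    if cached:
        return visitors
    return visitors * loads_per_visitor


visitors = 1_000_000
for loads in (2, 3):
    uncached = origin_requests_per_file(visitors, loads, cached=False)
    cached = origin_requests_per_file(visitors, loads, cached=True)
    print(f"{loads} loads each: {uncached:,} uncached vs {cached:,} cached")
```

Under these assumptions the cached case stays flat at one request per visitor per file, no matter how often people hit refresh, while the uncached case grows with every retry.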
They’re also using the CDN service by Optimizely.com, and assets.healthcare.gov is being served from Akamai. This is fine, but not enough of the assets were being served by these CDNs. Ultimately, there should have been a proxy web server in front of Apache for static files to ensure it’s not overwhelmed.
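If you want to repeat this kind of check on any site, the idea can be sketched in a few lines of Python. This is only an illustration: the function name and the sample headers are made up, and the list of "proxy hint" headers (Via, X-Varnish, X-Cache) is a heuristic, since a proxy or CDN is free to omit or rename them.

```python
# Headers that tell browsers and proxies how long a response may be cached.
CACHE_HEADERS = ("cache-control", "expires")
# Headers that often (not always) reveal a proxy or accelerator in front.
PROXY_HINT_HEADERS = ("via", "x-varnish", "x-cache")


def audit_headers(headers):
    """Summarize caching and proxy hints from a dict of response headers."""
    # Header names are case-insensitive, so normalize before comparing.
    lower = {name.lower(): value for name, value in headers.items()}
    return {
        "server": lower.get("server", ""),
        "missing_cache_headers": [h for h in CACHE_HEADERS if h not in lower],
        "proxy_hints": [h for h in PROXY_HINT_HEADERS if h in lower],
    }


# Hypothetical response headers, invented for illustration only.
sample = {"Server": "Apache", "Content-Encoding": "gzip"}
print(audit_headers(sample))
```

Feed it the headers of each static file: an empty proxy_hints list together with missing cache headers is the pattern described in this post, though the absence of a hint header alone does not prove there is no proxy.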