This outage was broad and severe, and we’re truly sorry for the impact to our customers and everyone who relies on them.
– Nick Rockwell, Senior VP of Engineering and Infrastructure, Fastly Inc.
The incident occurred at around 10:00 UTC (06:00 EDT) and prompted mass “Error 503” messages. Fastly identified it in less than a minute and patched it within an hour.
Initial analysis indicates that the whole episode was triggered by a single customer updating their settings (in a perfectly valid way). You know those nightmares you have about clicking the wrong button and deleting the whole Web? Yeah, imagine being that person. That precise combination of settings triggered a bug that had been introduced in a software update, missed by Fastly’s QA, and sitting in production code since May 12th.
If you’ve ever visited a serious server center, you’ll know the kind of security they employ to defend against potential criminal attacks. The only center I’ve visited in person was inside a nuclear-proof bunker and involved multiple security checks, and I wasn’t even allowed into the really secure part. But it turns out that all the terrorists need to do to crash the global economy is open a CDN account and update their settings.