CloudFlare Suffers an Hour-Long Outage while Mitigating a DDoS Attack

Loredana BOTEZATU

March 05, 2013

What started as a small-scale DDoS attack on a computer server wiped out a chunk of the Internet managed by DDoS protection company CloudFlare.

“The outage affected all of CloudFlare’s services including DNS and any services that rely on our web proxy. During the outage, anyone accessing CloudFlare.com or any site on CloudFlare’s network would have received a DNS error,” writes CloudFlare co-founder and CEO Matthew Prince in a blog post.

The outage occurred when CloudFlare's edge routers failed, cutting the 23 CloudFlare data centers off from the rest of the Internet. The routers could no longer announce the paths that data packets need to take to reach their destination. Some 785,000 websites, including Wikileaks, 4chan and Metallica.com, were affected.

It started with a DDoS attack against the servers of one of CloudFlare's customers, something that CloudFlare is extremely good at detecting and fending off. After profiling the attack, the team wrote a traffic-filtering rule (drop attack packets of a considerably large size, between 99,971 and 99,985 bytes) and began propagating it to the company's Juniper edge routers via the Flowspec protocol.
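To make the shape of such a rule concrete, here is a minimal Python sketch of a length-based drop rule and a sanity check that could run before a rule is pushed out. It is purely illustrative and is not how CloudFlare or Juniper handle Flowspec rules: `PacketLengthRule` and `validate_rule` are hypothetical names, and the 65,535-byte ceiling is the maximum total length an IPv4 header can express (a 16-bit field), used here as an assumed upper bound.

```python
from dataclasses import dataclass
from typing import List

# Largest total length an IPv4 header can express (16-bit field).
# Assumed here as the ceiling for a "plausible" packet size.
MAX_IPV4_PACKET_BYTES = 65_535


@dataclass(frozen=True)
class PacketLengthRule:
    """Hypothetical stand-in for a 'drop packets in this length range' rule."""
    min_bytes: int
    max_bytes: int

    def matches(self, packet_length: int) -> bool:
        # A packet is dropped when its length falls inside the inclusive range.
        return self.min_bytes <= packet_length <= self.max_bytes


def validate_rule(rule: PacketLengthRule,
                  max_packet_bytes: int = MAX_IPV4_PACKET_BYTES) -> List[str]:
    """Return a list of problems; an empty list means the rule looks sane."""
    problems = []
    if rule.min_bytes < 0:
        problems.append("min_bytes is negative")
    if rule.min_bytes > rule.max_bytes:
        problems.append("min_bytes is greater than max_bytes")
    if rule.min_bytes > max_packet_bytes:
        problems.append(
            f"range starts above the largest possible packet ({max_packet_bytes} bytes)"
        )
    return problems


if __name__ == "__main__":
    # The range quoted in the article.
    rule = PacketLengthRule(min_bytes=99_971, max_bytes=99_985)
    for problem in validate_rule(rule):
        print("rejecting rule:", problem)
```

Under these assumptions, the range quoted above fails the check because it begins above the IPv4 maximum. A pre-deployment check of this kind is one possible example of the "controls" an operator might add, not a description of what CloudFlare actually implemented.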

It is unknown why the rule, instead of dropping the attack traffic, caused the routers to exhaust their RAM. This left them half-crashed: unable to route any traffic and unable to respond to remote management requests for a soft reboot.

With many of the edge routers unable to reboot automatically, the remaining routers were hit with traffic from across the entire CloudFlare network and became overloaded. The operations team had to manually unplug edge routers in the CloudFlare data centers, a time-consuming physical reboot that caused the hour-long outage.

“We let our customers down this morning, but we will learn from the incident and put more controls in place to eliminate problems like this in the future,” Prince said in the official company announcement.
