
Facebook, Instagram, WhatsApp Suffer Widespread Outage

Social Media Giant Confirms Incident via Twitter; Analysis Suggests DNS Issue

Update (Oct. 8): The site DownDetector.com began tracking additional outages on Friday, starting around 3 p.m. Eastern. The disruption did not appear to be as widespread as the six-hour incident on Oct. 4, and reports subsided by 4:30 p.m. ISMG's team was able to access Facebook via app and browser on Friday. The platform acknowledged the outage on its Twitter page.


Update (Oct. 4): As of 5:57 p.m. Eastern Time on Monday, Facebook service had been restored.

Social media giant Facebook experienced a global outage on Monday that also affected its other properties, including Instagram, Messenger and WhatsApp, according to outage tracker DownDetector.com.

According to ThousandEyes, Cisco's internet analysis division, the tech giant experienced an issue with the Domain Name System, or DNS, the service that translates human-readable domain names into numeric IP addresses. The incident reportedly hindered access to Facebook's tools and apps.
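For readers who want to see what a failed DNS lookup looks like from a client's side, the following is a minimal Python sketch. It is illustrative only and does not reflect how ThousandEyes or Facebook diagnosed the incident. The function name `resolve` and its `lookup` parameter are assumptions introduced here; the default lookup is the standard library's `socket.getaddrinfo`.

```python
import socket


def resolve(hostname, lookup=socket.getaddrinfo):
    """Return the sorted IP addresses for hostname, or [] if the DNS lookup fails.

    `lookup` is a hypothetical injection point so the function can be
    exercised without network access; it defaults to socket.getaddrinfo.
    """
    try:
        results = lookup(hostname, None)
    except socket.gaierror:
        # Name resolution failed, which is what a client sees when the
        # authoritative DNS servers for a domain cannot be reached.
        return []
    # Each result is (family, type, proto, canonname, sockaddr);
    # the first element of sockaddr is the IP address.
    return sorted({entry[4][0] for entry in results})
```

An empty list from `resolve` means the name could not be translated into an address, so a browser or app would have nowhere to connect.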

Late on Monday, Santosh Janardhan, Facebook's vice president of infrastructure, apologized for the problems in a short blog post. He attributed the outage to a configuration change on the backbone routers that coordinate network traffic between the company's data centers.

That change then "impacted many of the internal tools and systems we use in our day-to-day operations, complicating our attempts to quickly diagnose and resolve the problem," Janardhan writes.

"This disruption to network traffic had a cascading effect on the way our data centers communicate, bringing our services to a halt," Janardhan writes.

No user data was at risk, Janardhan writes. "We apologize to all those affected, and we're working to understand more about what happened today so we can continue to make our infrastructure more resilient," he writes.

Starting at around 11 a.m. Eastern on Monday, DownDetector.com began displaying user-submitted problem reports that soon snowballed to tens of thousands, and then to around 125,000 by noon Eastern. By around 3 p.m. Eastern, that number had subsided to some 34,000 users reporting issues. The total number of affected users was likely higher than the tracking service's tally, which is collated from several sources, including user-submitted reports.

The Facebook website returned an error message at the time of initial publication.

In a prepared statement shared with Information Security Media Group on Monday afternoon, Facebook CTO Mike Schroepfer said, "Sincere apologies to everyone impacted by outages of Facebook-powered services. We are experiencing networking issues and teams are working as fast as possible to debug and restore."

BGP Issue

Dane Knecht, vice president of the web infrastructure and security firm Cloudflare, took to Twitter midday Monday, suggesting that Facebook's Border Gateway Protocol, or BGP, routes "have been withdrawn from the internet."

BGP is the protocol that networks on the internet use to exchange routing information with one another. If Facebook's BGP routes were withdrawn through a malfunction or misconfiguration, its DNS servers would remain unreachable from the rest of the internet.
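The dependency between BGP routes and DNS reachability can be illustrated with a toy model in Python. This is a simplified sketch, not a representation of Facebook's network or of a real BGP implementation. The `RoutingTable` class, the `dns_lookup` function, the documentation-range prefixes and the AS number `AS64500` are all hypothetical values chosen for illustration.

```python
import ipaddress


class RoutingTable:
    """Toy stand-in for the routes the rest of the internet has learned."""

    def __init__(self):
        self.routes = {}  # maps an announced network prefix to its origin label

    def announce(self, prefix, origin):
        self.routes[ipaddress.ip_network(prefix)] = origin

    def withdraw(self, prefix):
        self.routes.pop(ipaddress.ip_network(prefix), None)

    def reachable(self, address):
        addr = ipaddress.ip_address(address)
        return any(addr in network for network in self.routes)


def dns_lookup(table, nameserver_ip, records, hostname):
    """Answer a query only if a route to the name server exists."""
    if not table.reachable(nameserver_ip):
        return None  # the query never arrives, so no answer comes back
    return records.get(hostname)


if __name__ == "__main__":
    table = RoutingTable()
    table.announce("192.0.2.0/24", "AS64500")  # hypothetical prefix and ASN
    records = {"www.example.test": "198.51.100.7"}

    print(dns_lookup(table, "192.0.2.53", records, "www.example.test"))
    table.withdraw("192.0.2.0/24")  # simulate the routes being withdrawn
    print(dns_lookup(table, "192.0.2.53", records, "www.example.test"))
```

In this model the DNS records themselves never change. Once the prefix that contains the name server is withdrawn, queries have no path to the server, so every lookup fails even though the server may still be running.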

A Reddit user who claims to be a Facebook employee with knowledge of the outage and recovery suggested that the incident likely stemmed from a configuration change. The fix, the user said, would need to come from technicians with physical access to the routers, as reported by Ars Technica.

Facebook Takes to Twitter

Following the outage, Facebook spokesman Andy Stone took to competitor platform Twitter, saying: "We're aware that some people are having trouble accessing our apps and products. We're working to get things back to normal as quickly as possible, and we apologize for any inconvenience."

The same message was posted to Facebook's main Twitter profile. Similar messages appeared on WhatsApp and Instagram's Twitter handles.

Twitter users were quick to bemoan the Facebook outage, using the now-trending hashtag #facebookdown.

Bill Lawrence, a former cybersecurity instructor at the U.S. Naval Academy and currently CISO with the firm SecurityGate, says of the incident, "Outages like this show that, for all that was learned since the DDoS attack on Dyn in October of 2016, five years later the internet remains fragile when services like DNS get interrupted for some reason."

Jake Williams, formerly of the National Security Agency's elite hacking team and currently CTO at BreachQuest, says, "The Facebook outages are certainly BGP-related. What we don't yet know is what happened. … [But] the fact that Facebook hasn't corrected the issue yet is odd."

User-reported Facebook outages shown via DownDetector.com

Others Affected

Third parties that rely on Facebook credentials, including games such as Match Masters, also experienced issues. Match Masters took to Twitter on Monday to write: "Hold on tight! If [your] game isn't running as usual please note that there's been an issue with Facebook login servers and the moment this gets fixed all will be back to normal!"

Facebook suffered similar outages in March and July, news wire service Reuters previously reported.

John Bambenek, principal threat hunter at the firm Netenrich, adds that aging internet protocols "were not designed with the scale of the internet as it exists today," and thus are "very susceptible to human error," which is the speculated cause here.

Bambenek notes, "This problem will [likely] get worse as these protocols are taken for granted."

Facebook shares fell more than 5% on Monday, one of the stock's worst days in nearly a year.

The outage comes just one day after "60 Minutes" aired a segment featuring Facebook whistleblower Frances Haugen, who referenced the company's internal research and its alleged knowledge of, and alleged inaction around, hateful and/or violent content and misinformation shared across the platform.

Similar Outage

In July, a similar outage struck the content delivery network supplier Akamai, temporarily knocking several corporate websites offline, including those of Delta Air Lines, Amazon Web Services and AT&T (see: Resiliency Is Key to Surviving a CDN Outage).

At the time, the company said its rollout of a new software configuration for its Edge DNS service triggered a bug in the DNS system, which caused a disruption affecting the availability of some customers' websites. After about an hour, the company resumed normal operations.


About the Author

Dan Gunderman


News Desk Staff Writer

As staff writer on the news desk at Information Security Media Group, Gunderman covers governmental/geopolitical cybersecurity updates from across the globe. Previously, he was the editor of Cyber Security Hub, or CSHub.com, covering enterprise security news and strategy for CISOs, CIOs and top decision-makers. He also formerly was a reporter for the New York Daily News, where he covered breaking news, politics, technology and more. Gunderman has also written and edited for such news publications as NorthJersey.com, Patch.com and CheatSheet.com.



