July 19, 2024 Incident: When an Update Has Global Impacts
On July 19, 2024, a defective update from CrowdStrike, an American cybersecurity company, triggered a major IT crisis for many Microsoft customers.
In this article, we cover some elements of this global IT incident, arguably one of the most significant to date.
The IT Outage Explained
On July 19, 2024, a defective update from CrowdStrike, an American cybersecurity company, triggered a major IT outage affecting an estimated 8.5 million devices running Microsoft Windows.
This update caused blue screen errors on Windows devices, commonly referred to as the “blue screen of death” (BSOD), leading to significant interruptions in Microsoft 365 services, including Outlook, Teams, and OneDrive.
More specifically, CrowdStrike reports that the outage was caused by a defect in the content update that went undetected during validation and resulted in an out-of-bounds memory read.
The update affected Windows systems running the Falcon sensor version 7.11 and later that were online between 04:09 and 05:27 UTC.
Mac and Linux hosts were not affected. (Source: CrowdStrike)
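For readers who want to place the exposure window in their own incident timeline, the short sketch below computes how long the defective update was being distributed and converts its start time to Eastern Daylight Time. It is only an illustration: the two UTC timestamps come from this article, while the fixed UTC-4 offset for EDT and all variable names are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

# Timestamps reported in this article (CrowdStrike figures), expressed in UTC.
window_start = datetime(2024, 7, 19, 4, 9, tzinfo=timezone.utc)
window_end = datetime(2024, 7, 19, 5, 27, tzinfo=timezone.utc)

# Assumption: EDT modelled as a fixed UTC-4 offset, valid for July 2024.
EDT = timezone(timedelta(hours=-4), "EDT")

# Length of the window during which online hosts could receive the update.
window_minutes = int((window_end - window_start).total_seconds() // 60)

# Same start instant, shown in Eastern Daylight Time.
start_edt = window_start.astimezone(EDT)

print(f"Exposure window: {window_minutes} minutes")
print(f"Start of deployment: {start_edt.strftime('%H:%M %Z')}")
```

The same approach can be reused to translate the window into any local time zone when reconstructing which of an organization's sites were online at the time.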
Significant Global Repercussions
Within hours, organizations worldwide were impacted, disrupting daily operations and causing widespread panic.
Essential services such as hospital networks and emergency services (911), as well as the border and transportation sectors (airports, air surveillance, etc.), were affected. Telecommunications, banking services, and the manufacturing sector also experienced operational disruptions.
However, CrowdStrike quickly confirmed that this was neither a security breach nor a cyberattack.
(Source: CrowdStrike)
A Gradual Recovery
Microsoft and CrowdStrike quickly collaborated to release patches and restore the affected systems. Within just a few hours, guides were published to help users manually resolve the situation.
However, it took several days before all systems were fully restored. For example, as of July 24 (5:00 p.m. PT), just over 97% of the affected devices had been restored. (Source: CrowdStrike)
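To give a sense of scale for the recovery figure above, the following sketch estimates how many devices could still have been down on July 24. It is a rough illustration under stated assumptions: it applies the 97% restoration rate to the 8.5 million device estimate cited in this article, although the two figures may not have been measured against exactly the same base, and the function name is hypothetical.

```python
# Figures quoted in this article; combining them is an assumption of this sketch.
AFFECTED_DEVICES = 8_500_000   # estimated number of affected devices
RESTORED_PERCENT = 97          # "just over 97%" restored as of July 24


def max_devices_still_down(affected: int, restored_percent: int) -> int:
    """Upper bound on devices not yet restored, using integer arithmetic."""
    return affected * (100 - restored_percent) // 100


remaining = max_devices_still_down(AFFECTED_DEVICES, RESTORED_PERCENT)
print(f"At most about {remaining:,} devices still awaiting recovery")
```

Because the restoration rate was reported as just over 97%, the result is best read as an upper bound rather than an exact count.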
To manage the crisis, CrowdStrike published the following:
- A statement on the outage, including a letter from the CEO of CrowdStrike
- The technical details of the July 19, 2024 outage
- A statement from Shawn Henry, Chief Security Officer of CrowdStrike
- A preliminary incident report (PIR) and its executive summary
You can find these communications on the CrowdStrike website.
(Source: CrowdStrike)
Highlights of the Outage
• Date of Incident: July 19, 2024
• Time of Deployment of Defective Update: 04:09 UTC (00:09 EDT)
• Responsible Company: CrowdStrike
• Impacted Services: Microsoft 365, including Outlook, Teams, and OneDrive
• Number of Affected Devices Worldwide: Estimated at 8.5 million (less than 1% of Windows computers, but the majority of these provided essential services in their sectors)
• Types of Affected Clients: Governments, businesses and individuals
• Sectors of Affected Clients: All sectors
• Crisis Management Time: Many systems were restored the same day, though some took days or weeks to fully recover
The Impacts of the Outage
The defective update from CrowdStrike caused a global disruption that affected all sectors of activity.
Here are some of the services affected by the July 19, 2024 outage:
Financial Sector
- Global stock markets trended downward.
- Online banking services were disrupted. Card payment services were also affected in some restaurants.
- Some trading platforms, such as E*Trade, Schwab, and Merrill Edge in the United States, were impacted.
- Banks quickly responded and informed their customers about the disruptions.
Healthcare Sector
- 911 systems in several countries, including remote regions, were impacted.
- Multiple health networks were affected, including those in Toronto and British Columbia in Canada.
- Several hospitals had to revert to paper systems during the outage.
- Access to patient records was difficult, if not impossible. Appointment scheduling was also disrupted.
- Some emergency rooms were closed, and surgeries were postponed.
Transportation Sector
- Over 1,100 flights were canceled and more than 2,000 delayed in the United States alone.
- In the U.S., United, Delta, and American Airlines issued a “global ground stop” for all their flights. The Canadian airline Porter also canceled numerous flights, affecting thousands of passengers until 3:00 PM the same day.
- Longer-than-usual waits were reported at customs, particularly at the Canada-U.S. border, with delays exceeding an hour and a half.
Manufacturing Sector
- Large corporations such as FedEx, UPS, and Amazon reported substantial disruptions in their operations.
- Amazon warehouse employees had difficulty managing their schedules.
Telecommunications Sector
- Telecommunications and information services were disrupted globally.
- In Canada, national broadcasters such as Radio-Canada were affected by the outage, which prevented the broadcast of certain radio programs.
It should be noted that this list is not exhaustive and only serves to illustrate the extent of the situation. Many other services were also impacted.
Continue reading with our interview with Philippe Tassé-Gagné, Vice President of Consulting Services at Premier Continuum and Consultant of the Year 2024 at the BCI Americas Awards.
As a continuity or resilience professional, what lessons do you take away from this incident?
Continue your reflection with an interview with Philippe Tassé-Gagné, Vice President of Consulting Services and Talent Development at Premier Continuum.
Mr. Tassé-Gagné has over 25 years of experience in the fields of business continuity management, emergency measures, and crisis and incident management. He is also nominated in the Consultant of the Year category at the prestigious BCI Americas Awards 2024.
See the interview now: (link to be posted shortly)
Sources of the article
CBC News. “Day of disruptions, dashed plans for many Canadians after global tech outage”, July 19, 2024, https://www.cbc.ca/news/world/cyberstrike-worldwide-outage-1.7268863
CrowdStrike. “Remediation and Guidance Hub: Falcon Content Update for Windows Hosts”, version updated on July 31, 2024, https://www.crowdstrike.com/falcon-content-update-remediation-and-guidance-hub/
MarketWatch. “Trading platforms like Schwab and E*Trade affected by Crowdstrike outage”, Gordon Gottsegen, July 20, 2024, https://www.marketwatch.com/livecoverage/stock-market-today-dow-futures-point-to-further-pressure-as-computer-outages-hit-globally/card/trading-platforms-like-schwab-and-e-trade-affected-by-crowdstrike-outage-NLUnwNqhiH9A76x7c6cv?mod=mw_quote_news
Radio-Canada. “Global Computing Outage: Highlights”, published July 19, 2024, last updated July 20, 2024, https://ici.radio-canada.ca/info/en-direct/1011768/panne-informatique-entreprises-microsoft
The BCI. “Microsoft’s Global IT Outage: Strategies to manage IT downtime”, July 19, 2024, https://www.thebci.org/news/microsoft-s-global-it-outage-strategies-to-manage-it-downtime.html