Last July, CrowdStrike, an $83 billion cybersecurity giant, caused one of the largest Windows outages the world has seen in years. Ironically, the company known for protecting Windows machines ended up taking down a huge number of them with a faulty update.
This wasn’t a small blip—it was a reminder of just how much the world still runs on Windows. More importantly, it shows how fragile systems can be when a single file or platform fails.
Successful cybersecurity keeps your entire software estate secure. It's not only about stronger passwords: critical software outages cost your business too. Leaders need a multi-layered approach and diversified cybersecurity strategies that address threats to daily operations, access to enterprise data, and much more.
Here are 3 takeaways and actions every founder should consider to prepare for outages and increase cybersecurity:
1) Resilience Is Key
CrowdStrike markets itself as the leader in keeping companies safe, but this incident exposed a larger issue in the tech ecosystem: the lack of resilience. When a company that promises to protect against the worst threats causes such widespread disruption, we must rethink how we design these systems. A resilient system should withstand a bad update without taking down the world’s infrastructure.
Data backups, high availability (HA) systems, and failover solutions help ensure that data can be recovered quickly in an outage and that backup systems are ready to take over essential services if one system goes down.
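To make the failover idea concrete, here is a minimal Python sketch. It assumes each service can be wrapped in a simple callable; the names `call_with_failover`, `primary_db`, and `replica_db` are hypothetical and only illustrate the pattern of trying a primary system first and falling back to a backup when it fails.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


class AllBackendsFailed(Exception):
    """Raised when neither the primary nor any backup could serve the request."""


def call_with_failover(backends: Sequence[Tuple[str, Callable[[], T]]]) -> Tuple[str, T]:
    """Try each backend in order and return (name, result) from the first that works."""
    errors: List[str] = []
    for name, fetch in backends:
        try:
            return name, fetch()
        except Exception as exc:  # a real system would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise AllBackendsFailed("; ".join(errors))


# Hypothetical stand-ins for a primary database and its replica.
def primary_db() -> dict:
    raise ConnectionError("primary is down")


def replica_db() -> dict:
    return {"orders": 42}


served_by, data = call_with_failover([("primary", primary_db), ("replica", replica_db)])
print(served_by, data)  # replica {'orders': 42}
```

The ordering of the list is the design choice here: the caller never needs to know which system answered, only that one of them did.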
2) Cybersecurity Isn’t Just About Attacks
We often think of cybersecurity as defending against hackers, but this incident shows that internal processes, updates, and human error can be just as disruptive. In fact, this outage caused more chaos than many major cyberattacks.
Prepare your organization with a Disaster Recovery Plan (DRP) and Runbooks. A well-documented and tested DRP outlines steps to recover from outages, including communication protocols, technical recovery steps, and prioritization of critical services.
3) Systemic Over-Reliance
Windows remains the backbone for many of the world’s critical operations. While that speaks to the platform’s dominance, it also highlights a vulnerability: over-reliance on a single ecosystem. One update can ripple globally, and we need more diversified, fault-tolerant infrastructure to handle these situations.
Time your updates strategically, whether you are shipping your own tool or updating the tools you rely on. Automated patch management and staged rollouts help keep software up to date while limiting the chance that a faulty update triggers or worsens an outage when it is rolled out more broadly.
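A staged rollout can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular vendor ships updates: the `Stage` class, the `staged_rollout` function, the host names, and the health check are all hypothetical. The point is that the update reaches a small canary group first and the rollout halts before touching the wider fleet if that group turns unhealthy.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass
class Stage:
    name: str
    hosts: Sequence[str]


def staged_rollout(
    stages: Sequence[Stage],
    apply_update: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    max_failure_rate: float = 0.0,
) -> Tuple[List[str], Optional[str]]:
    """Update one stage at a time; stop at the first stage that exceeds the failure budget.

    Returns (names of completed stages, name of the stage that halted the rollout or None).
    """
    completed: List[str] = []
    for stage in stages:
        failures = 0
        for host in stage.hosts:
            apply_update(host)
            if not is_healthy(host):
                failures += 1
        failure_rate = failures / len(stage.hosts) if stage.hosts else 0.0
        if failure_rate > max_failure_rate:
            return completed, stage.name  # halt: later stages are never touched
        completed.append(stage.name)
    return completed, None


stages = [
    Stage("canary", ["canary-1", "canary-2"]),
    Stage("early", ["early-1", "early-2", "early-3"]),
    Stage("fleet", [f"host-{i}" for i in range(100)]),
]

updated: List[str] = []
completed, halted_at = staged_rollout(
    stages,
    apply_update=updated.append,                 # stand-in for a real deployment step
    is_healthy=lambda host: host != "canary-2",  # simulate one machine breaking after the update
)
print(completed, halted_at, len(updated))  # [] canary 2
```

In this simulated bad update, only the two canary machines receive the change and the other 103 hosts are left alone.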
This is more than a lesson in disaster recovery. It’s a call to rethink how we build truly resilient digital ecosystems. The biggest lesson: don’t let the next disruption come from within.
To learn more about the value of the cybersecurity industry and other insights, see the 2024 Cybersecurity Industry Report published by Raidel’s team at FE International.
For more tips on optimizing your SaaS operations, check out our recent post from Ray de Leon, “How EOS Maximizes SaaS Growth.”
Raidel Ruiz is the CTO of FE International. Raidel has 15 years of experience in software engineering, UI/UX, DevSecOps, QA, Cloud Computing, Microservices, and AI/ML technologies. He leads teams of engineers who design and develop cutting-edge software and SaaS solutions for governments, Fortune 500 companies, and international giants like Samsung, MasterCard, Cartier, and more.