In the early hours of Friday, July 19th, airline flights were halted, hospitals couldn’t serve patients, and critical infrastructure was disrupted—all because of a security software update gone wrong. Systems became inoperable due to a bug nobody was prepared for. Unlike the Y2K panic of over two decades ago, which was overblown and largely averted through extensive preparation, this event transpired and caused widespread chaos. It posed critical questions at many companies: “What was the impact?” or “What would the impact have been?”
Even though the impacts were not due to a cyber-attack (there’s been a lot of heat online about this, mostly because CrowdStrike is a cybersecurity software vendor), it’s worth reiterating the importance of modeling the impact of technology-related events before they happen. To put it bluntly, our physical world depends on digital systems, and the consequences stretch much farther than they did two decades ago when the world was worried about the year 2000. You can’t party like it’s 1999 anymore. You need to prepare for the worst-case scenarios beforehand: the kinds that can cripple your business.
CrowdStrike is actively working with customers impacted by a defect found in a single content update for Windows hosts. Mac and Linux hosts are not impacted. This is not a security incident or cyberattack. The issue has been identified, isolated and a fix has been deployed. We…
— George Kurtz (@George_Kurtz) July 19, 2024
The issue now is that most companies are not doing this. They are too focused on modeling events based on statistics that provide an estimated probability of occurrence. Assessing probability can be a time-consuming and challenging process that produces, at best, false justification for inaction. That inaction then gets in the way of preparedness, hurting companies in the long run.
The CrowdStrike event: A brief overview
Most know what happened with the CrowdStrike event, but to provide a little context, we will look at what caused it and the impact that followed. A faulty update pushed by CrowdStrike’s team introduced a logic error that resulted in the infamous “blue screen of death” on 8.5 million devices around the world. The error required manual remediation, which was even more difficult for those with BitLocker encryption enabled: the recovery key needed to fix the issue was stored on servers that were also down. On top of all this, malicious attempts to exploit companies scrambling to recover soon followed. Phishing emails disguised as quick fixes circulated widely, making an already stressful time worse.
Differentiating between immediate and long-term impacts
While many companies are working diligently to recover, there are many out there left watching from the sidelines. It might not have impacted you this time, but you have this feeling that you should be doing something to address it. Your board might be asking questions, and you need to answer. The problem is that you don’t know what to do right now, and there are people impatiently waiting for some sort of explanation. Start by asking yourself a few questions about the event:
- What does an event like this look like for our organization?
- What would the financial and operational impacts be?
- Are we prepared to handle and recover from the event?
- And even, would we survive an event like this?
These questions lay the groundwork for understanding the true costs, planning proper risk mitigation, and allocating the resources that turn these plans into preparedness. To understand the impact, we’ll need to think about two kinds of costs:
- The immediate costs of recovery
- The long-term costs of lost business and reputation damage
Airlines and cyber events: many layers of impact
Using airlines as an example, we can now see what this kind of assessment might look like. As a result of the outage, thousands of flights were grounded or delayed across many airlines. This is of course a massive inconvenience, but what does this mean in terms of impact?
Immediate costs
Looking at immediate costs, we have significant lost revenue from canceled flights, ticket refunds, and additional costs for accommodating stranded passengers. Airlines faced the direct costs of manually processing check-ins, additional staffing needs to manage the backlog, and technical recovery expenses to restore systems. These are costs that can be assessed and quantified to a likely range.
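To illustrate what quantifying these costs to a likely range could look like, here is a minimal sketch in Python. The line items mirror the costs named above, but every dollar figure and the `total_range` helper are hypothetical assumptions for illustration, not data from any airline or from Axio’s methodology.

```python
# Hypothetical immediate-cost line items for an airline outage, in USD.
# Each entry is a (low, high) estimate; all figures are illustrative only.
IMMEDIATE_COSTS = {
    "lost_revenue_canceled_flights": (20_000_000, 60_000_000),
    "ticket_refunds": (5_000_000, 15_000_000),
    "stranded_passenger_accommodation": (2_000_000, 8_000_000),
    "manual_check_in_processing": (500_000, 2_000_000),
    "additional_staffing": (1_000_000, 4_000_000),
    "technical_recovery": (1_500_000, 6_000_000),
}


def total_range(costs):
    """Sum the low and high ends of every line item into one overall range."""
    low = sum(lo for lo, _ in costs.values())
    high = sum(hi for _, hi in costs.values())
    return low, high


low, high = total_range(IMMEDIATE_COSTS)
print(f"Immediate cost range: ${low:,} to ${high:,}")
```

Summing the low and high ends separately gives a deliberately wide range. It is a simple starting point for the conversation rather than a statistical estimate.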
Long-term costs
Unfortunately, the costs of an event rarely end with remediation and recovery. We also need to look at the future cash flow that will be lost as a result. While 62% of airline customers typically stick to one airline, an event like this can break brand loyalty. An article by PhocusWire highlights that customer trust is going to be damaged, for both the industry and individual companies.
We also need to consider the costs of litigation. Businesses rely on air travel to carry out their work. What if a delayed or canceled flight hurt a business engagement, or caused an event to be missed? Lawsuits are almost inevitable in situations like this.
To cover all of this, companies might be considering how much their insurance will compensate. Insurance stress tests are crucial here, as they provide a better understanding of what the payout is going to be. What if your insurance policy has carveouts? It might not cover events originating with a third party, or it might not pay out if your organization was found negligent in its assessment of the third party. There is a lot to consider when looking at the impact, but the information this exercise equips your organization with is incredibly valuable.
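An insurance stress test of the kind described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the `Policy` fields, the `estimate_payout` function, and all dollar amounts are hypothetical, and real policy wording is far more nuanced than two yes/no carveouts.

```python
from dataclasses import dataclass


@dataclass
class Policy:
    """Hypothetical cyber policy terms; field names are illustrative."""
    limit: int                       # maximum payout in USD
    retention: int                   # amount the insured pays before coverage applies
    covers_third_party_events: bool  # False models a third-party carveout
    excludes_negligence: bool        # True models a negligence exclusion


def estimate_payout(loss, policy, third_party_origin, found_negligent):
    """Estimate the payout for a loss, applying carveouts before limits."""
    if third_party_origin and not policy.covers_third_party_events:
        return 0
    if found_negligent and policy.excludes_negligence:
        return 0
    return max(0, min(loss - policy.retention, policy.limit))


# Assumed scenario: a $50M loss that originated with a third-party vendor.
loss = 50_000_000
full_cover = Policy(25_000_000, 5_000_000, True, True)
carveout = Policy(25_000_000, 5_000_000, False, True)

for label, policy in [("no carveout", full_cover), ("third-party carveout", carveout)]:
    payout = estimate_payout(loss, policy, third_party_origin=True, found_negligent=False)
    print(f"{label}: payout ${payout:,}, uninsured ${loss - payout:,}")
```

With these assumed numbers, the policy without a carveout pays its $25M limit and leaves $25M uninsured, while the same policy with a third-party carveout pays nothing. Running the same loss through each policy variant is what exposes the gap.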
Preparing for impact
After assessing the immediate and long-term costs of an event like this, the airline understands what this truly means in terms of the impact on their organization. They can then begin to ask questions about the costs:
- Do we have the cash on hand to cover the short-term costs?
- Are we willing to self-insure, or do we want to take out a policy?
- Does the lost revenue change our long-term financial plans?
- Can we allocate funds for mitigating these risks?
The importance of an impact-oriented cyber risk assessment
The CrowdStrike event is a perfect example of the need for impact-oriented risk quantification. This event was largely unprecedented, so its assessed probability would have been extremely low. When an approach emphasizes probability over impact, the proper mitigation steps are less likely to be taken for massive events like this one. Probability has its place, but for organizations looking to get the most value out of cyber risk quantification, impact should take priority. Placing too much emphasis on probability can result in a company overlooking an event as significant as this one. The idea behind impact-oriented cyber risk assessment is to look at the events that will cause the most damage and address them first. The information you get from looking at the overall impact of an event can save an organization. It will expose scenarios that could result in financial damage your company might not be able to endure, and it prompts the question of whether to insure against those risks or mitigate them.
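The difference between the two approaches can be shown with a small Python sketch. The scenario names, probabilities, and impact figures below are hypothetical assumptions chosen only to illustrate the ranking effect; they are not estimates for any real organization.

```python
# Hypothetical scenarios: (name, assumed annual probability, assumed impact in USD).
SCENARIOS = [
    ("phishing_credential_theft", 0.30, 2_000_000),
    ("ransomware_single_site", 0.05, 20_000_000),
    ("vendor_update_global_outage", 0.001, 500_000_000),
]


def rank_by_expected_loss(scenarios):
    """Probability-led view: order by probability x impact, largest first."""
    ordered = sorted(scenarios, key=lambda s: s[1] * s[2], reverse=True)
    return [name for name, _, _ in ordered]


def rank_by_impact(scenarios):
    """Impact-led view: order by impact alone, largest first."""
    ordered = sorted(scenarios, key=lambda s: s[2], reverse=True)
    return [name for name, _, _ in ordered]


print("By expected loss:", rank_by_expected_loss(SCENARIOS))
print("By impact:       ", rank_by_impact(SCENARIOS))
```

Under these assumptions, the vendor-update outage ranks last when scenarios are ordered by probability-weighted loss and first when they are ordered by impact, which is how a low-probability event of this size can end up overlooked.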
Cyber risk quantification (CRQ)
Cyber risk quantification (CRQ) is the most effective way to quantify financial and operational impact. Bringing all the immediate and long-term costs into one place allows organizations to see the costs broken down, providing an itemized receipt for an event like this. Recall the two questions being asked after this event, “What was the impact?” and “What would the impact have been?” CRQ is the process of answering them.
What makes Axio’s CRQ methodology unique is its ease of use and its ability to provide actionable intel quickly. A key element of our risk identification methodology is to ‘assume disruption’ and then find the most plausible way it could occur. By placing an emphasis on impact, we avoid missing a scenario because of its low probability. This event could very well have been assessed as unlikely, which would have resulted in it being overlooked. That kind of oversight only keeps organizations from properly preparing for an event.
Enterprise-wide implications
Roughly 60% of Fortune 500 companies use CrowdStrike in their daily operations. This event underscores the fact that the risks associated with events like this are no longer isolated to the technology side of an organization: they’re enterprise-wide. They can bring business to a halt and do long-term damage. Organizations need to start conducting these quantification exercises with a greater emphasis on impact if they want to protect continuity. How can an organization prepare for an event if it isn’t looking at the worst-case scenario? There is nothing wrong with hoping for the best, as long as you’re preparing for the worst at the same time.
By taking proactive steps and focusing on the impact of potential cyber events, organizations can better prepare for the unexpected. When the digital and physical worlds are inextricably linked, readiness is not just a strategy; it’s a necessity.
If you are interested in learning more about Axio’s cyber risk quantification methodology, designed to model low-probability, high-impact events with ease, reach out to us for a demo.