
Outage Outrage: Lessons from the CrowdStrike Failure

Steve Gilde August 1, 2024

 

The recent CrowdStrike outage sent shockwaves through the global economy, with an estimated direct cost of more than $5 billion to Fortune 500 companies alone. More difficult to calculate is the brand damage done to many of these organizations as they struggled to recover from the debacle.

Beyond the financial toll on businesses around the globe, the incident exposed vulnerabilities in critical infrastructure that also impacted millions of individuals, disrupting daily life and causing immeasurable inconvenience. It is simply not possible to put a dollar figure on the pain and heartache experienced by all the people who could not travel, see a doctor, buy groceries, or access cash.

While the size of this outage was unprecedented, it is just the latest in an ever-increasing number of incidents taking place across all business sectors, including payments. According to a recent survey of IT leaders and decision-makers by PagerDuty, customer-facing incidents increased by 43% during the past year, with each incident costing nearly $800,000.

 

IT Outages Are Inevitable

It might be tempting to think that all this disruption, complication, and expense is caused by coding bugs. And while software defects certainly play a role in many situations, including the CrowdStrike incident, there is much, much more to the software development, delivery, and deployment process.

Modern software development has become a labyrinth of interconnected systems and dependencies. Developers must juggle a dizzying array of tools, frameworks, and libraries, each with its own unique quirks and nuances. Open-source components offer speed and flexibility while introducing their own elements of risk as they evolve independently. 

Additionally, the integration of APIs from third-party services creates intricate webs of dependencies, increasing the probability of unexpected interactions and points of failure.

At the same time, the relentless march of technology demands constant adaptation. New programming languages, cloud platforms, and security protocols emerge at a rapid pace, forcing developers to stay current while maintaining existing systems. This dynamic environment, combined with the pressure to deliver software quickly, makes it nearly impossible to eliminate all defects.

We all try our best, and no software company wants to fail, but sometimes bugs are released into the wild.

 

Figure: The Cost of Defects (sources: functionize.com and nist.gov)

 

A Good Offense is Your Best Defense

Because defects are inevitable, the next steps in the deployment process are critical. Since it is highly probable that any software application will contain a defect at some point, every IT operation must take its own steps to protect itself.

It is incumbent on every organization that uses third-party software to ensure that newly released code actually works as intended outside the vendor’s shop and doesn’t misbehave or break anything when deployed in a client’s unique environment.

At Paragon, we recognize that all of our customers in the highly regulated financial services industry have incredibly complex IT environments and should take extreme measures to ensure the integrity of their systems and data. We recommend that every Paragon client maintain a staging environment and run each newly delivered software release through it to confirm that everything operates normally in their specific environment.

In fact, we feel strongly about this point, and every license agreement for our Web FASTest enterprise testing platform allows the client to deploy and run a staging system at no charge.
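
As a rough illustration of the staging step described above, the sketch below models a simple promotion gate: a new release goes to production only if every smoke check passes in staging. The names `SmokeCheck` and `evaluate_release`, and the example checks, are hypothetical and are not part of Web FASTest or any Paragon product. Real checks would exercise the actual staging system.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class SmokeCheck:
    """One named check to run against the staging environment (hypothetical)."""
    name: str
    run: Callable[[], bool]  # returns True when the check passes


def evaluate_release(checks: List[SmokeCheck]) -> Tuple[bool, List[str]]:
    """Run every check and report whether the release may be promoted.

    Returns (approved, failed_check_names). A check that raises an
    exception is counted as a failure instead of crashing the gate.
    """
    failed = []
    for check in checks:
        try:
            passed = check.run()
        except Exception:
            passed = False
        if not passed:
            failed.append(check.name)
    return (len(failed) == 0, failed)


if __name__ == "__main__":
    # Placeholder checks for illustration only.
    checks = [
        SmokeCheck("service starts", lambda: True),
        SmokeCheck("sample transaction approved", lambda: True),
        SmokeCheck("existing test scripts still load", lambda: False),
    ]
    approved, failed = evaluate_release(checks)
    print("promote to production" if approved else f"hold release: {failed}")
```

The gate deliberately runs every check before deciding, so a held release comes with a complete list of what failed in the client's own environment.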

It is also important for our customers to manage their own deployment schedules. To stay as current as possible with changes in the payment industry, Paragon provides new software releases every two weeks. However, we understand that not every organization is able or willing to consume every release at this frequency, so each client chooses whether to deploy any individual release.

Since every release is backward compatible, there is no penalty for skipping a release or even several releases. As you might imagine, some clients do deploy every release and others deploy only two or three times per year. Their choice.

Even when the software is operating properly, the hardware and networking components that support it may fail. The physical and logical infrastructure at a bank or other financial services company is no less complex than the software development environment. And just as with software, all physical components, such as servers, routers, and firewalls, are subject to failure.

This means a third line of defense is required. A disaster recovery system that is physically separate from the production systems and managed on a separate release schedule should be deployed and maintained in a constant state of readiness so it can pick up the workload from other failing systems at a moment’s notice.
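
The failover idea above can be sketched in a few lines, assuming each site reports a simple health flag and the release it is running. The `Site` type, the function names, and the release labels are hypothetical and only illustrate the reasoning. A real DR setup would rely on monitoring and orchestration tooling.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Site:
    name: str
    healthy: bool
    release: str  # identifier of the release deployed at this site


def choose_active_site(production: Site, dr: Site) -> Optional[Site]:
    """Pick the site that should carry the workload.

    Production is preferred; the DR site takes over only when production
    is unhealthy. None means neither site can serve traffic.
    """
    if production.healthy:
        return production
    if dr.healthy:
        return dr
    return None


def shares_release(production: Site, dr: Site) -> bool:
    """True when both sites run the same release, so a defect in that
    release could affect both at the same time."""
    return production.release == dr.release


if __name__ == "__main__":
    # Made-up release labels for illustration.
    production = Site("production", healthy=False, release="release-B")
    dr = Site("disaster-recovery", healthy=True, release="release-A")
    active = choose_active_site(production, dr)
    print("active site:", active.name if active else "none available")
    print("same release on both sites:", shares_release(production, dr))
```

Keeping the DR site on a separate release schedule is what makes `shares_release` return False most of the time, which is the point of the design: a defective release that takes down production is less likely to be present on the standby system.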

 

“The Crowdstrike outages are a stark reminder of the importance of proactive planning and regular testing to ensure business resilience in the face of unforeseen challenges.”

- David Varney, Partner, UK law firm Burges Salmon

 

As with staging systems, Paragon recognizes that most financial services companies need a disaster recovery environment, so our license agreement for Web FASTest also allows clients to deploy and maintain a DR system at no additional charge.

 

Testing Best Practices Help Protect Your Bottom Line

As we have seen from the recent CrowdStrike incident, an unplanned outage can occur at any moment and have catastrophic results. It is therefore incumbent on every payment industry participant to protect itself to the maximum extent possible against potential threats, both internal and external. 

By developing a comprehensive strategy for testing and deploying software, every organization is better equipped to anticipate and prevent expensive outages before they occur.

 


 

And when outages inevitably do occur, you will be better prepared to deal with the disruption and get your systems back online as quickly as possible, minimizing any negative impacts on your customers, your staff, and your shareholders.

As we have seen, many of the legacy systems and processes still in use are no longer appropriate for dealing with the challenges of increasingly complex operating environments, alternative payment flows, and sophisticated cybercriminals. A comprehensive and strategic approach to testing, risk management, and security is necessary to keep your payment systems safe, secure, and always available.

Paragon Application Systems has been delivering innovative testing solutions to the largest and most sophisticated financial services companies worldwide for more than 30 years. We partner with banks, card networks, retailers, and payment processors to ensure their payment systems are always functional, reliable, and running at peak efficiency.

The team at Paragon is available to review your current payment testing strategy and provide advice and guidance on how to optimize your testing capabilities and adopt industry best practices that will help prevent and respond to unwanted outages. Contact us today to learn more.

 
