Leveraging Security Chaos Engineering for Cloud Cyber Resilience - Part I

Kennedy Torkura
Co-Founder & CTO

In today's rapidly evolving threat landscape, cyber resilience is crucial for organizations to defend against and thwart modern cyber-attacks. However, cyber resilience remains largely conceptual and theoretical; the industry is far behind in demonstrating cyber resilience capabilities. These capabilities have the potential to frustrate, degrade and ultimately stop attacks while ensuring minimum impact on business operations. This blog discusses some challenges to adopting practical cyber resilience and provides insights into how security chaos engineering offers game-changing opportunities for adopting empirical cyber resilience. This first part of the article differentiates the concepts of cyber security and cyber resilience, then evaluates the notion of practical cyber resilience through the lens of the People, Process and Technology framework.

Note: This article summarizes the author's talks at several events, including the German-American IT Forum, RSA 2023, AWS Summit Berlin and AWS Summit Stockholm. Make sure to subscribe to get notified once subsequent articles are published.

Demystifying Cyber Resilience

Cyber resilience falls under the larger umbrella of resilience, which has several sub-domains, including operational resilience, systems resilience and economic resilience. The US National Institute of Standards and Technology (NIST) defines cyber resilience as follows:

“Cyber resilience is the ability to anticipate, withstand, recover from, and adapt to adverse conditions, stresses, attacks or compromises on systems that include cyber resources.”

Despite the clarity of this definition and the explicit `cyber` prefix, the practical understanding and implementation of cyber resilience remain underdeveloped. A major reason for the stunted adoption of practical cyber resilience is its considerable overlap with cyber security. This overlap causes a number of misconceptions. Let us attempt to distinguish these two concepts.

What is Cyber Security?

Cyber security refers to the practice of protecting computer systems, networks, and data from unauthorized access, damage, or disruption. It involves implementing preventive measures, such as firewalls, encryption, and access controls, to defend against cyber threats like malware, hacking, and data breaches. The primary goal of cyber security is to prevent attacks and minimize the impact of security incidents.

What is Cyber Resilience?

Cyber resilience is a broader concept that encompasses preventing cyber attacks, responding to them effectively, and recovering from any resulting damage or disruption. It refers to an organization's ability to withstand and adapt to adverse cyber events, ensuring continuity of operations, minimizing downtime, and quickly restoring normal functionality.

Cyber Resilience Versus Cyber Security

Cyber resilience encapsulates several aspects of cyber security but also touches on several advanced topics, including proactive planning, risk assessment, incident response planning, and robust backup and recovery strategies. A key difference between these two concepts is that cyber resilience recognizes that, despite the best cyber security measures, organizations may still experience cyber incidents. Thus, it focuses on anticipating these possibilities and building the capacity to respond, recover, and adapt to cyber incidents while drastically minimizing the impact on business operations.

Overcoming the Cyber Resilience Hype

The cyber security industry is flooded with material about cyber resilience. Much of it comes from cyber security vendors claiming to offer products that enable different dimensions of cyber resilience. Sadly, the rate of successful attacks continues to increase, which suggests that attackers are not yet being withstood or resisted by this security tooling. Ideally, resilient organizations will suffer little to no impact from a cyber attack. This raises the question: what does real cyber resilience look like? What components form the hard evidence that a system is cyber resilient? Let's look for answers through the lens of the People, Process and Technology (PPT) framework.

Cyber Resilience Through the Lens of The People, Process & Technology Framework

The People Aspects of Cyber Resilience  

The people aspect of cyber resilience recognizes that individuals are critical to a comprehensive cyber resilience strategy. By empowering and equipping people with the right knowledge, skills, and resources, organizations can better protect their systems, data, and operations from cyber threats and respond effectively when incidents occur. A key differentiator of cyber resilience training is its emphasis on realistic training for incident response. Because resilience is associated with stress and with accurate decision-making under uncertainty, personnel tasked with these responsibilities must be regularly exposed to scenarios that closely resemble this reality. This is why Amazon (the e-commerce company) adopted Gamedays over 11 years ago, an approach highly influenced by firefighter training.
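One way to make such exercises yield evidence rather than anecdotes is to record when each simulated incident was injected, detected and resolved. The following is a minimal sketch under stated assumptions: the `GamedayRun` record, its field names and the `summarize` helper are hypothetical names introduced here for illustration, not part of any tool mentioned in this article.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class GamedayRun:
    """One simulated incident from a game day (hypothetical record layout)."""
    scenario: str
    injected_at: datetime   # when the simulated attack was started
    detected_at: datetime   # when the team noticed it
    recovered_at: datetime  # when normal service was restored

    @property
    def time_to_detect(self) -> timedelta:
        return self.detected_at - self.injected_at

    @property
    def time_to_recover(self) -> timedelta:
        return self.recovered_at - self.injected_at


def summarize(runs: list[GamedayRun]) -> dict[str, timedelta]:
    """Return the mean time to detect and mean time to recover across runs."""
    if not runs:
        raise ValueError("at least one run is required")
    n = len(runs)
    return {
        "mean_time_to_detect": sum((r.time_to_detect for r in runs), timedelta()) / n,
        "mean_time_to_recover": sum((r.time_to_recover for r in runs), timedelta()) / n,
    }


# Illustrative, made-up timings; real values would come from exercise logs.
start = datetime(2023, 6, 1, 10, 0)
runs = [
    GamedayRun("leaked access key", start,
               start + timedelta(minutes=12), start + timedelta(minutes=50)),
    GamedayRun("publicly exposed storage bucket", start,
               start + timedelta(minutes=4), start + timedelta(minutes=30)),
]
print(summarize(runs))
```

Tracking these two numbers across repeated exercises gives a simple trend line for how well the team performs under realistic pressure.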

There is a lot of well-known theoretical knowledge about the concepts mentioned above; however, the most important aspect is adopting a culture grounded in cyber security and cyber resilience. Culture plays a significant role in cyber resilience, shaping an organization's attitudes, behaviours, and practices. A strong cyber security culture fosters an environment where individuals are aware of the importance of cyber resilience, are actively engaged in security practices, and share a collective responsibility to protect organizational assets from cyber threats.

The Process Aspects of Cyber Resilience

Cyber resilience requires processes that provide structured approaches for managing cyber risks and incidents, enabling organizations to effectively prevent, detect, respond to, and recover from cyber threats. These processes encompass risk assessment, incident response planning, continuous monitoring, vulnerability management, change management, employee training, backup and recovery, and a culture of continuous improvement. Implementing well-defined processes allows organizations to establish solid foundations for cyber resilience and improve their overall security posture.

Though there is a lot of overlap between cyber security and cyber resilience processes, a key difference is that the processes for cyber resilience accept the inevitability of successful attacks and proactively plan for adaptive countermeasures. More importantly, the cyber resilience goals and design principles outlined in the NIST Cyber Resiliency Engineering Framework (CREF) are core ingredients for effectively implementing cyber resilience. You can view an interactive version of the cyber resiliency mind map on this link.

A Condensed Version of the Cyber Resiliency Mind Map
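The four goals named in the NIST definition quoted earlier (anticipate, withstand, recover, adapt) can double as a simple coverage checklist for an organization's processes. The sketch below is illustrative only: the `PROCESS_GOALS` mapping is an assumption made for this example and is not taken from NIST or the CREF, and `uncovered_goals` is a hypothetical helper name.

```python
from enum import Enum


class Goal(Enum):
    """The four goals from the NIST definition of cyber resilience."""
    ANTICIPATE = "anticipate"
    WITHSTAND = "withstand"
    RECOVER = "recover"
    ADAPT = "adapt"


# Assumed, simplified mapping of processes to the goals they mainly support.
# An organization would replace this with its own assessment.
PROCESS_GOALS: dict[str, set[Goal]] = {
    "risk assessment": {Goal.ANTICIPATE},
    "incident response planning": {Goal.ANTICIPATE, Goal.WITHSTAND},
    "continuous monitoring": {Goal.WITHSTAND},
    "backup and recovery": {Goal.RECOVER},
    "continuous improvement": {Goal.ADAPT},
}


def uncovered_goals(implemented: set[str]) -> set[Goal]:
    """Return the goals that none of the implemented processes supports."""
    covered: set[Goal] = set()
    for process in implemented:
        # Unknown process names contribute nothing to coverage.
        covered |= PROCESS_GOALS.get(process, set())
    return set(Goal) - covered


gaps = uncovered_goals({"risk assessment", "continuous monitoring"})
print(sorted(goal.value for goal in gaps))
```

A non-empty result points at goals for which the organization has prevention-style processes but no plan for what happens after an attack succeeds.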

The Technology Aspects of Cyber Resilience  

Technology is central to digital innovation; hence the technology aspects of cyber resilience involve leveraging advanced tools, systems, and technologies to support and enhance the overall resilience of an organization's digital infrastructure. In addition, these technological components are crucial in preventing, detecting, responding to, and recovering from cyber threats.

Understanding that technology should become the major driver for cyber resilience is critical. Companies across various industries have transformed into technology companies or rely heavily on technology to conduct their operations. This mindset aligns with the current digital transformation, in which concepts such as DevOps have been adopted quickly because technology took the leading role. However, among the three aspects of the PPT framework, the technological aspects of cyber resilience are the least advanced. The consequence is reduced adoption of cyber resilience and an increasing rate of successful cyber-attacks. More insights on these aspects will be discussed in the second part of this blog.


Cyber resilience is critical to overcoming the rapid increase in successful cyber attacks. However, despite the widespread adoption of other resilience sub-disciplines, cyber resilience remains in its infancy. Viewed through the lens of the PPT framework, the technological aspects of cyber resilience are the least mature. Therefore, adopting a cyber resilience engineering approach is a critical catalyst for fast-tracking the realization and adoption of cyber resilience. The second part of this blog post will provide details on cyber resilience engineering and on how security chaos engineering can be leveraged to enable cloud cyber resilience in practice.

