Security Chaos Engineering

Supercharging Cloud Detection & Response with Security Chaos Engineering

Jul 05, 2023 | 7 min read


Effective Cloud Detection & Response (CDR) strategies are imperative for promptly identifying and responding to cloud security events. However, enabling efficient CDR strategies is challenging for several reasons, including cloud complexities, insufficient expertise, and cloud misconfiguration. This article makes a case for leveraging security chaos engineering to address these challenges. Defenders can leverage security chaos engineering for threat-hunting efforts to identify CDR blindspots proactively. Some practical examples are illustrated using Mitigant Cloud Immunity and a hybrid CDR system.

Overview of Cloud Detection & Response

Cloud Detection and Response (CDR) is an evolving approach to proactively defending cloud infrastructure against cyberattacks. CDR borrows many approaches from traditional Threat Detection and Incident Response (TDIR) and applies them specifically to cloud-native infrastructure. The motivation for CDR is to evolve strategies designed to fit the cloud-native threat landscape, given the limitations of traditional TDIR in cloud-native infrastructure.

CDR strategies bring together cloud threat detection and cloud incident response. Hence the specific techniques include active monitoring, log analytics, threat intelligence, incident response, forensic analysis, and threat analysis. It is important to note that CDR is becoming an essential tool for security teams focused on protecting cloud-native infrastructure, including detection engineers, cloud security engineers, cloud incident responders, and SOC teams.

Challenges to Effective CDR Strategies

Despite the importance of CDR in keeping cloud-native infrastructure secure, there are several challenges to enabling an efficient CDR strategy. Some of these challenges are briefly discussed below:

Cloud Complexity

Cloud environments are often complex, consisting of multiple services, resources, and configurations. This complexity can make it challenging to gain complete visibility and accurately detect security events across the entire cloud infrastructure. Furthermore, integrating security tools (e.g., log analytics tools and threat intelligence feeds) within the cloud environment can be difficult due to the inherent cloud complexity. Ensuring compatibility, interoperability, and seamless data flow between security tools and services is a further obstacle to effective CDR implementation.

Lack of Standardization

Cloud Service Providers (CSP) often use different logging formats, APIs, and security controls, making it challenging to create standardized detection and response processes across multiple cloud platforms. This standardization issue sometimes exists even within the services of the same cloud service provider, thus requiring the implementation of custom solutions and workarounds.

Rapidly Changing Cloud Landscape

Cloud technologies, services, and resources continuously evolve. CSPs often roll out updates, new features, and configuration changes. Similarly, cloud resources are designed for agility and scale up or down in response to varying factors, e.g., customer traffic. Staying up-to-date with these changes and adapting CDR mechanisms accordingly is challenging.

Cloud Misconfigurations

Cloud misconfigurations are a common source of security incidents and vulnerabilities. The dynamic nature of cloud environments and the potential for human error complicate efforts to maintain consistent and secure configurations. This challenge has a ripple effect on practical CDR efforts.

Skill Gap and Resource Availability

Organizations are challenged with acquiring and retaining skilled security personnel with expertise in cloud security, threat detection, and incident response. Using consultants and Managed Security Service Providers is a common approach adopted to address these challenges. However, this approach is often expensive; only some organizations can afford these services.

**SCE Optimizes Cloud Threat Hunting, An Imperative For Effective CDR Strategies**

Leveraging Security Chaos Engineering for Effective CDR

The ROI of effective CDR strategies is immense: they allow organizations to avoid costs that would otherwise accrue from post-breach activities, protect enterprise reputation, keep customers happy, and build confidence in deployed security systems. However, to achieve this ROI, organizations need to enhance the effectiveness of CDR. One approach for improving CDR effectiveness is leveraging security chaos engineering.

What is Security Chaos Engineering?

Security chaos engineering (SCE) is an evolving approach to cyber security that proactively employs empirical evaluation of security controls to gain evidence about their effectiveness via quick feedback loops. These feedback loops, a core element of systems thinking, allow for quick analysis and adaptation of security systems to stay ahead of cyber attackers. We covered several details about SCE in our `Security Chaos Engineering 101` series of blog posts, including Fundamentals, The MindMap & Feedback Loop, and Getting Your Hands Dirty.

**Security Chaos Engineering Is Based On Scientific Experimentation**

SCE Meets CDR

Implementing measures that address the limitations and challenges of CDR mechanisms is essential. Some of these challenges can be addressed by leveraging SCE. Security teams can use SCE to verify several security assumptions by gathering appropriate evidence that empowers objective decision-making. This approach is scientifically based and is suitable for evaluating the effectiveness of security tools and controls before they fail to prevent, detect, or recover from cyber-attacks.

One practical and proven approach for improving CDR is leveraging threat hunting. While threat detection and response are useful, threat-hunting efforts proactively search for threats that might be slipping through blindspots in CDR systems. This is where SCE fits in CDR strategies; given the inherent proactive benefits of SCE, it can be leveraged to address some of the CDR challenges, including cloud complexity, the skills gap, and the evolving nature of cloud infrastructure. SCE is designed to be a natural fit in cloud-native infrastructure and empowers teams to craft hypotheses easily and simulate attacks to verify the effectiveness of CDR technologies.

SCE can be used for both hypothesis-based and TTP/IoC-based threat-hunting efforts. During a hypothesis-based threat hunt, defenders formulate a hypothesis about one or more threats and implement the necessary steps to prove or disprove this hypothesis. Conversely, in a TTP/IoC-based threat hunt, defenders start from threat intelligence or information about adversary TTPs/IoCs.
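To make the hypothesis-based workflow concrete, here is a minimal Python sketch of how a hypothesis and its outcome could be recorded. The `Hypothesis`, `ExperimentResult`, and `evaluate` names are our own illustrative assumptions and are not part of any product or library; the alerts are passed in as plain strings rather than pulled from a real CDR.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Hypothesis:
    """A security assumption to verify (hypothetical structure)."""
    statement: str
    # Event names we expect the CDR to surface if the hypothesis holds.
    expected_detections: List[str]


@dataclass
class ExperimentResult:
    hypothesis: Hypothesis
    detected: List[str] = field(default_factory=list)
    missed: List[str] = field(default_factory=list)

    @property
    def confirmed(self) -> bool:
        # The hypothesis holds only if nothing expected was missed.
        return not self.missed


def evaluate(hypothesis: Hypothesis, cdr_alerts: List[str]) -> ExperimentResult:
    """Compare the alerts raised by the CDR with what the hypothesis expects."""
    seen = set(cdr_alerts)
    detected = [e for e in hypothesis.expected_detections if e in seen]
    missed = [e for e in hypothesis.expected_detections if e not in seen]
    return ExperimentResult(hypothesis, detected, missed)


if __name__ == "__main__":
    h = Hypothesis("CDR will detect when AWS CloudTrail is stopped", ["StopLogging"])
    result = evaluate(h, cdr_alerts=["StopLogging", "ConsoleLogin"])
    print(result.confirmed, result.missed)
```

Anything that lands in `missed` is a candidate blindspot and feeds the next iteration of the feedback loop.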

Example Use-Cases

Let us go through some practical examples where SCE is used as a means for threat hunting to enhance CDR capability. The examples are based on an AWS infrastructure, and we use a hybrid CDR composed of Datadog Cloud SIEM, AWS GuardDuty, and AWS Detective. Mitigant Cloud Immunity is used to implement the hypothesis; it constructs and orchestrates the actual attacks.

**Mitigant Cloud Immunity Orchestrating The `Stop Cloudtrail` Attack**

The first example is based on the hypothesis “CDR will detect when AWS CloudTrail is stopped.” We simulate an attack against our AWS account to turn off a CloudTrail trail. All CloudTrail trails are enumerated during the attack, and one is randomly selected and stopped.

**`Stop Cloudtrail` Attack Detected By Datadog Cloud SIEM**

The next step is to check if our CDR detects the security event. The event is detected because the CDR pulls CloudTrail events from the AWS account via a dedicated S3 bucket configured for CloudTrail log collection.
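For readers who want to verify this hypothesis without a SIEM, the sketch below scans CloudTrail log records for the `StopLogging` event emitted when a trail is stopped. It is not how Datadog Cloud SIEM implements its rule. The sample log, trail name, and user ARN are made up for illustration, and we assume the usual CloudTrail log layout (a top-level `Records` list with `eventSource`, `eventName`, `requestParameters`, and `userIdentity` fields).

```python
import json
from typing import Any, Dict, Iterator, List

# Made-up sample shaped like a CloudTrail log file; values are placeholders.
SAMPLE_LOG = """
{"Records": [
  {"eventTime": "2023-07-05T10:00:00Z",
   "eventSource": "cloudtrail.amazonaws.com",
   "eventName": "StopLogging",
   "requestParameters": {"name": "example-trail"},
   "userIdentity": {"arn": "arn:aws:iam::111122223333:user/example-user"}},
  {"eventTime": "2023-07-05T10:01:00Z",
   "eventSource": "s3.amazonaws.com",
   "eventName": "ListBuckets"}
]}
"""


def find_stop_logging(records: List[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Yield a short summary for every StopLogging event in the records."""
    for record in records:
        if (record.get("eventSource") == "cloudtrail.amazonaws.com"
                and record.get("eventName") == "StopLogging"):
            params = record.get("requestParameters") or {}
            identity = record.get("userIdentity") or {}
            yield {
                "time": record.get("eventTime"),
                "trail": params.get("name"),
                "actor": identity.get("arn"),
            }


if __name__ == "__main__":
    for hit in find_stop_logging(json.loads(SAMPLE_LOG)["Records"]):
        print(hit)
```

If the experiment stops a trail and this check (or the equivalent SIEM rule) returns nothing, the hypothesis is disproved and a detection gap has been found.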

**An Overview of the Amazon S3 Replication Service**

Let us run a more complex attack: the dreaded bucket replication attack, which abuses the Amazon S3 replication service. Amazon S3 replication is a service designed to replicate S3 objects cheaply and efficiently. In a bucket replication attack, however, the attacker misuses this legitimate service to exfiltrate objects from a target bucket. The objects can be exfiltrated to another bucket in the same account or to a bucket in an attacker-controlled account.

**S3 Object Exfiltration Attack Scenario Abuses Amazon Replication Service**

There are two challenges here. First, the defenders might not suspect the replication is malicious since it uses a supposedly benign approach. Second, there might be no record or log entry showing that the replication occurred. The steps for the bucket replication attack are implemented in the Mitigant SCE platform and launched against the test cloud account.
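One way to reason about the first challenge is to compare a bucket's replication rules against the destinations the team actually expects. The sketch below does this over a dictionary shaped like an S3 replication configuration (`Rules`, `Status`, `Destination.Bucket`). The allowlist, bucket names, and role ARN are hypothetical, and the function does not call AWS; it only inspects a configuration that has already been fetched.

```python
from typing import Any, Dict, List, Set

# Made-up configuration shaped like an S3 replication configuration.
EXAMPLE_CONFIG: Dict[str, Any] = {
    "Role": "arn:aws:iam::111122223333:role/example-replication-role",
    "Rules": [
        {"ID": "backup", "Status": "Enabled",
         "Destination": {"Bucket": "arn:aws:s3:::backup-bucket"}},
        {"ID": "suspicious", "Status": "Enabled",
         "Destination": {"Bucket": "arn:aws:s3:::unknown-bucket"}},
        {"ID": "old", "Status": "Disabled",
         "Destination": {"Bucket": "arn:aws:s3:::retired-bucket"}},
    ],
}


def unexpected_destinations(replication_config: Dict[str, Any],
                            allowed_buckets: Set[str]) -> List[str]:
    """Return destination ARNs of enabled rules that are not on the allowlist."""
    flagged = []
    for rule in replication_config.get("Rules", []):
        if rule.get("Status") != "Enabled":
            continue  # Disabled rules do not replicate objects.
        dest_arn = rule.get("Destination", {}).get("Bucket", "")
        # The bucket name is the last segment of an ARN like arn:aws:s3:::name.
        bucket_name = dest_arn.rsplit(":", 1)[-1]
        if bucket_name not in allowed_buckets:
            flagged.append(dest_arn)
    return flagged


if __name__ == "__main__":
    print(unexpected_destinations(EXAMPLE_CONFIG, {"backup-bucket"}))
```

A check like this treats replication as benign only when the destination is known, which is the assumption the attack exploits.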

**Only Two Security Events Are Detected By the CDR**

The result is mixed, with both good and bad news. The CDR detects some of the events that lead to the bucket replication attack, e.g., the modification of bucket policies and of IAM roles and policies. However, there are a couple of blindspots. First, the actual exfiltration of objects is not captured. In the first security event, the MITRE ATT&CK tactic flagged is TA0010-Exfiltration, but the STATUS column indicates INFO for both events. This implies that the events will likely be swept under the carpet due to the low INFO status, and a real attack could potentially succeed! Looking at the same attack in AWS Detective, we see a high-severity GuardDuty finding, Exfiltration:S3/AnomalousBehavior. Hence, there is a solid learning point here that could further enhance the CDR capability. For example, the detection rules could be customized using joint queries of security events occurring within a specific time window to indicate a High severity security event. Another blindspot is the fact that the Cloud SIEM did not detect the S3 bucket logging attack. This blindspot can be remediated by using Amazon S3 Event Notifications.
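The idea behind joint queries over a time window can be sketched in plain Python as follows. This is not Datadog's rule syntax. The set of precursor event names, the 15-minute window, and the threshold of three distinct events are assumptions chosen for illustration and would need tuning against real data.

```python
from datetime import datetime, timedelta
from typing import Dict, List

# Event names assumed, for illustration, to precede a bucket replication attack.
PRECURSORS = {"PutBucketPolicy", "CreateRole", "PutRolePolicy", "PutBucketReplication"}


def correlate(events: List[Dict[str, str]],
              window: timedelta = timedelta(minutes=15),
              threshold: int = 3) -> str:
    """Return "HIGH" when enough distinct precursor events share one time window."""
    hits = sorted(
        (datetime.fromisoformat(e["time"]), e["name"])
        for e in events
        if e["name"] in PRECURSORS
    )
    for i, (start, _) in enumerate(hits):
        # Distinct precursor names seen within `window` of this starting event.
        names = {name for t, name in hits[i:] if t - start <= window}
        if len(names) >= threshold:
            return "HIGH"
    return "INFO"


if __name__ == "__main__":
    burst = [
        {"time": "2023-07-05T10:00:00", "name": "CreateRole"},
        {"time": "2023-07-05T10:03:00", "name": "PutBucketPolicy"},
        {"time": "2023-07-05T10:07:00", "name": "PutBucketReplication"},
    ]
    print(correlate(burst))
```

Taken individually, each event might reasonably stay at INFO; it is their co-occurrence in a short window that justifies the higher severity.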

**AWS Detective Visualization of the Bucket Replication Attack**

Get Started With Security Chaos Engineering

Getting started with SCE is not as complicated as it seems. You can start today with a basic hypothesis, for example, making an S3 bucket public, as described in this blog post. However, you can fast-track the pace of adoption and handle other important aspects, such as automatic rollbacks, documentation, and analysis, by leveraging Mitigant Cloud Immunity.

Mitigant Cloud Immunity implements SCE for AWS infrastructure and includes over 30 attacks that can be leveraged independently or chained together to form multi-step, complex attacks. It provides a suitable platform for enhancing the effectiveness of CDR for organizations of every size. Several exciting features can be leveraged to address the limitations of CDR highlighted in this article. You can quickly sign up for a 30-day free trial on the Mitigant sign-up page.

Kennedy Torkura

Co-Founder & CTO, Mitigant. | Contributing Author - O'Reilly Security Chaos Engineering Book. | AWS Community Builder
