Conducting Robust Learning for Empire Command and Control Detection



Executive Summary

PowerShell Empire is a popular post-exploitation framework used by threat actors, and it remains an ongoing threat. Using machine learning (ML) and artificial intelligence (AI) methods, we have developed an extremely effective system to detect Empire's command and control (C2) traffic.

In this article, we review the Empire framework, examine Empire C2 traffic and discuss issues affecting ML-based C2 detection. The primary issue is adversarial attacks, a category of AI attack that threat actors can use to poison or evade ML-based detection. We solved this challenge by developing a learning system using a more robust model with adversarial training. This robust learning system will effectively counter a threat actor’s attempt to evade ML-based C2 detection.

We review the concepts behind our Empire C2 robust learning system, and we discuss the effectiveness of training this model with both real-world traffic and AI-generated data.

Palo Alto Networks customers receive protection from and mitigations for Empire C2-based attacks through our Next-Generation Firewall with an Advanced Threat Prevention subscription, WildFire, Cortex XDR and Prisma Cloud.

Related Unit 42 Topics: Malleable C2 Profile, Evasion, C2

Table of Contents

The Threat: PowerShell Empire
Why We Focused on the Empire C2 Framework
Characteristics of Empire C2 Traffic
Malleable C2 Profile Format
Example of Empire C2 Traffic
The Challenge of ML-Based Empire C2 Traffic Detection
ML-Based Detection
Countering ML Through Adversarial Attacks
How an Attacker Might Use Empire Framework To Evade Detection
Empire C2 Robust Learning System
Empire C2 Fuzzer
Empire C2 Data Quality Monitoring
Empire C2 Traffic Generation Engine
Empire C2 Model Training
Conclusion

The Threat: PowerShell Empire

First seen in 2015 at BSides Las Vegas, PowerShell Empire is a post-exploitation framework designed for red team personnel. This framework is used by penetration testers to emulate adversary behavior. As seen with other red team tools, real-world adversaries have also used the Empire framework. While the original Empire project ceased development in 2019, at least one other organization maintains its own fork of the framework, so Empire remains an ongoing concern.

Why We Focused on the Empire C2 Framework

Empire is powerful, flexible and easy to use, making it one of the most popular C2 frameworks in recent years. Various threat actors have been publicly documented using Empire, and this framework has also been associated with high-profile ransomware cases. Organizations that fail to detect and prevent Empire C2 in their networks face a huge potential impact.

Due to these factors, we tested against the Empire C2 framework when developing our ML-based robust learning system. Furthermore, effective detection of Empire C2 should also work well against similar C2 frameworks.

Characteristics of Empire C2 Traffic

Empire C2 traffic is web-based, and the characteristics of this traffic are defined in a Malleable C2 profile. We previously reviewed Malleable C2 profiles used in Cobalt Strike, and the same concept applies to Empire. In this case, “Malleable” means easy to modify to meet different requirements.

Malleable C2 profiles follow an easily customizable template that allows attackers to tailor C2 traffic to their exact specifications. A Malleable C2 profile lends versatility to its associated framework or tool.

Malleable C2 Profile Format

A Malleable C2 profile defines variables used in C2 communications. Figure 1 shows the format of a Malleable C2 profile.

Image 1 is a screenshot of the code making up a malleable command and control profile.
Figure 1. Format of a malleable C2 profile.

Variables associated with the Malleable C2 profile include sleeptime, useragent, uri, and various header lines for web traffic generated by both the client and the server. Figure 2 shows an example of these variables defined in a partial Malleable C2 profile.

Image 2 is a screenshot of the code making up a partial malleable command and control profile. It includes the headers, server and other information.
Figure 2. A partial malleable C2 profile.

The partial profile in Figure 2 defines variables for HTTP GET requests used in the C2 web traffic. The section starting with http-get defines characteristics of traffic between a client and the C2 server.

The set uri statement assigns the URI that the client sends to the server. The terms netbiosu and uri-append indicate that the session key information will be encoded and appended to the URI of an HTTP request. The Cobalt Strike user guide provides more details on the Malleable C2 format, which Empire also uses.
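To make the netbiosu and uri-append steps more concrete, here is a minimal Python sketch. It assumes netbiosu is the NetBIOS-style nibble encoding with an uppercase alphabet (each byte becomes two letters from A through P), as described in the Cobalt Strike documentation. The function names and the sample values are hypothetical and are not taken from Empire's code.

```python
def netbiosu_encode(data: bytes) -> str:
    """Encode bytes as two uppercase letters (A-P) per byte, high nibble first."""
    out = []
    for byte in data:
        out.append(chr(ord("A") + (byte >> 4)))    # high 4 bits
        out.append(chr(ord("A") + (byte & 0x0F)))  # low 4 bits
    return "".join(out)


def netbiosu_decode(text: str) -> bytes:
    """Reverse netbiosu_encode; reject input that cannot be a valid encoding."""
    if len(text) % 2:
        raise ValueError("encoded text must have an even length")
    if any(not ("A" <= ch <= "P") for ch in text):
        raise ValueError("encoded text may only contain the letters A-P")
    out = bytearray()
    for hi, lo in zip(text[0::2], text[1::2]):
        out.append(((ord(hi) - ord("A")) << 4) | (ord(lo) - ord("A")))
    return bytes(out)


def append_to_uri(base_uri: str, session_data: bytes) -> str:
    """Mimic uri-append: place the encoded session data at the end of the URI."""
    return base_uri + netbiosu_encode(session_data)


# Hypothetical base URI and session data, for illustration only.
example_uri = append_to_uri("/example/path/", b"\x01\xfe")
```

Because the output alphabet is limited to 16 letters, the encoded value doubles in length but blends into a URI path as a plain ASCII string.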

Example of Empire C2 Traffic

Below, Figure 3 shows the HTTP traffic generated using the Malleable C2 profile from Figure 2.

Image 3 is a screenshot of two separate chunks of code. The first is highlighted in red. The second is highlighted in blue. It is the HTTP traffic generated by an Empire C2 based on the malleable C2 profile. The information includes the servers, user agent, GET, date and more.
Figure 3. HTTP traffic generated by Empire C2 based on the malleable C2 profile.

The first line in Figure 3 is a GET request from the HTTP request headers. This HTTP GET request reveals a session key as an ASCII string embedded in the URI between /CWoNaJLBo/VTNeW11212/ and /UTWOqV132/. This session key has been encoded by netbiosu as defined in the Malleable C2 profile from Figure 2.

The Challenge of ML-Based Empire C2 Traffic Detection

Understanding Empire’s use of Malleable C2 profiles is essential when developing a robust learning system that can detect Empire C2 traffic. Because the possible customizations are practically endless, a signature-based approach is not fully effective, so we developed an ML-based model for Empire C2 detection.

ML-Based Detection

When developing an ML-based detection model for Empire C2 traffic, we reviewed different options. Security defenders are using a growing number of ML and AI techniques for C2 detection.

One possible option is using traditional ML-based approaches. In the traditional approach, researchers extract characteristics of C2 activity from the traffic, a process known as feature engineering.

For example, researchers can extract features from network traffic such as URI parameters, cookie data and various HTTP headers to identify C2 traffic. We could build a model on these features using traditional ML algorithms such as logistic regression or support vector machine (SVM) to differentiate malicious C2 traffic from benign traffic.
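As an illustration of this kind of feature engineering, the sketch below turns one HTTP request into a small numeric feature dictionary. The specific features (URI length, path depth, entropy of the longest path segment and so on) and the function names are our own assumptions for this example, not the feature set of any production model. A vector like this could then be passed to a classifier such as logistic regression or an SVM.

```python
import math
from collections import Counter
from typing import Dict
from urllib.parse import parse_qsl, urlsplit


def shannon_entropy(text: str) -> float:
    """Bits of entropy per character; encoded session data tends to score high."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def extract_features(method: str, uri: str, headers: Dict[str, str]) -> Dict[str, float]:
    """Map one HTTP request to hypothetical numeric features."""
    parts = urlsplit(uri)
    segments = [s for s in parts.path.split("/") if s]
    longest = max(segments, key=len, default="")
    params = parse_qsl(parts.query, keep_blank_values=True)
    return {
        "is_get": 1.0 if method.upper() == "GET" else 0.0,
        "uri_length": float(len(uri)),
        "path_depth": float(len(segments)),
        "longest_segment_entropy": shannon_entropy(longest),
        "query_param_count": float(len(params)),
        "cookie_length": float(len(headers.get("Cookie", ""))),
        "header_count": float(len(headers)),
        "user_agent_length": float(len(headers.get("User-Agent", ""))),
    }


# Hypothetical request, for illustration only.
features = extract_features(
    "GET",
    "/api/v1/users/QUJDRA==?id=1&x=",
    {"User-Agent": "Mozilla/5.0", "Cookie": "sid=abc"},
)
```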

Another option for C2 detection is deep learning. Deep learning is a subset of ML that relies on artificial neural networks to enhance performance, and it is effective at analyzing large datasets.

But developing a robust learning system for C2 detection requires countermeasures against adversaries who would exploit any vulnerabilities in the finished product.

Countering ML Through Adversarial Attacks

Adversarial attacks in AI and ML systems are not new. As more AI systems arise, adversaries are starting to target these services using various methods like evasion or poisoning attacks. Various sources have reported evasion as the most common AI attack method.

Evasion attacks target online AI systems by crafting inputs that mislead the AI model into making incorrect predictions. ML-based C2 detectors are particularly vulnerable to evasion attacks, as open-source frameworks with high configurability make it easier for attackers to generate a large number of inputs and identify those that can bypass AI systems.

Attackers can conduct evasion attacks by using Malleable C2 profiles. These profiles allow threat actors to more easily launch large-scale evasion attacks that explore the limits of Empire C2 detectors in a test environment. After discovering a profile that can bypass detection, threat actors can launch their attacks on real targets.

How an Attacker Might Use Empire Framework To Evade Detection

The following example shows how a Malleable C2 profile can control the HTTP indicators seen in C2 traffic.

In this case, an attacker intends to launch a privilege escalation attack on a victim using the profile shown below in Figure 4.

Image 4 is a diagram of the HTTP header and its corresponding parts in the mal.profile. A rectangle and arrow highlight the /api/v1/user section of code. The user name is highlighted with an arrow to the base64 and uri-append lines in the code.
Figure 4. Mapping between the HTTP header and mal.profile.

Figure 4 shows both the profile configuration and the actual traffic generated during the C2 session. The profile sets the URI for its HTTP request with the value /api/v1/users. The profile also sets Base64 as the encoding algorithm for its session key and appends it to the URI.

During an attack, a threat actor could generate different HTTP requests by periodically updating their Malleable C2 profile. For example, the same attack could also use the profile shown below in Figure 5.

Image 5 is a screenshot of many lines of code. One line is redacted. Some of the information corresponds from the HTTP header to the mal.profile as indicated by black arrows. The redacted information is related to the /ucD line.
Figure 5. Updated mapping between the HTTP header and mal.profile.

Figure 5 shows HTTP traffic that is completely different from the example in Figure 4. As a result, a signature-based approach would miss the second attack if it only relies on a signature based on values seen in Figure 4.
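The short Python sketch below illustrates why a static signature breaks when the profile changes. The /api/v1/users URI and the Base64 encoding come from the Figure 4 description. The second URI prefix, the session key and the regular expression are hypothetical stand-ins, since we do not reproduce the actual values from Figure 5 here.

```python
import base64
import re


def build_request_line(uri_prefix: str, session_key: bytes) -> str:
    """Append a Base64-encoded session key to the URI, as the profile describes."""
    token = base64.b64encode(session_key).decode("ascii")
    return f"GET {uri_prefix}{token} HTTP/1.1"


# A hypothetical signature written from the values seen in the first profile.
SIGNATURE = re.compile(r"^GET /api/v1/users\S* HTTP/1\.1$")


def signature_matches(request_line: str) -> bool:
    return SIGNATURE.match(request_line) is not None


session_key = b"hypothetical-session-key"

# Same session key, two different profiles.
first = build_request_line("/api/v1/users/", session_key)
second = build_request_line("/static/img/", session_key)  # hypothetical updated URI

first_detected = signature_matches(first)    # the signature fits the first profile
second_detected = signature_matches(second)  # the updated profile slips past it
```

The underlying C2 behavior is identical in both requests. Only the values set by the profile changed, and that is enough to defeat the signature.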

Fortunately, an ML-based approach can train a reliable model to tolerate these variances if we provide enough training data. This is the approach we used for our robust learning system.

Empire C2 Robust Learning System

Our Empire C2 robust learning system provides a detection model capable of defeating evasion and other potential adversarial attacks. Our Empire C2 detection is built on top of a Convolutional Neural Network (CNN). A CNN is a deep learning algorithm that takes an input, assigns importance to various aspects of that input and differentiates one input from another.
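For readers unfamiliar with CNNs, the following pure-Python sketch shows the basic building blocks on a single HTTP request: a fixed-length numeric encoding, one 1D convolution filter with a ReLU activation, and global max pooling. This is only a toy illustration. The byte-level encoding, the kernel values and all function names are assumptions for this example and do not describe the architecture or input format of our model.

```python
from typing import List


def encode_request(text: str, length: int = 64) -> List[float]:
    """Truncate or zero-pad the request bytes and scale them to [0, 1]."""
    raw = text.encode("utf-8", errors="replace")[:length]
    padded = raw + b"\x00" * (length - len(raw))
    return [b / 255.0 for b in padded]


def conv1d_relu(signal: List[float], kernel: List[float], bias: float = 0.0) -> List[float]:
    """Slide one filter over the signal (no padding, stride 1) and apply ReLU."""
    k = len(kernel)
    return [
        max(0.0, sum(signal[i + j] * kernel[j] for j in range(k)) + bias)
        for i in range(len(signal) - k + 1)
    ]


def global_max_pool(feature_map: List[float]) -> float:
    """Keep the strongest response of the filter anywhere in the input."""
    return max(feature_map)


# Hypothetical request line and filter weights. In a trained CNN the weights are learned.
signal = encode_request("GET /api/v1/users/QUJDRA== HTTP/1.1")
feature_map = conv1d_relu(signal, kernel=[0.5, -1.0, 0.5])
strongest_response = global_max_pool(feature_map)
```

In a real network, many such filters are learned from data, and their pooled responses feed later layers that produce the final classification.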

Our learning system starts with an Empire C2 fuzzer we developed that generates Empire C2 traffic simulating adversarial attacks. Traffic generated by our fuzzer is reviewed through a data quality monitor.

After traffic passes the quality check, we train our model on the results. Ultimately, we train our model both on collected Empire C2 traffic and internally generated Empire C2 traffic that simulates adversarial attacks.

Figure 6 provides an overview of our Empire C2 Robust Learning System.

Image 6 is a diagram of the Empire C2 robust learning system and how it works. The top panel shows the standard training method, in which the model is trained on Empire C2 packet captures collected in the wild, resulting in low robustness. The middle panel shows the adversarial training method, in which the model is trained on adversarial Empire C2 packet captures, resulting in high robustness. This training includes the data quality monitor and the Empire C2 fuzzer.
Figure 6. The overview of our Empire C2 Robust Learning System.

The effectiveness of this system is based on our Empire C2 Fuzzer.

Empire C2 Fuzzer

Fuzzing is an automated approach to software testing and vulnerability detection. It feeds various inputs into a system and identifies those inputs that result in a failure, meaning the tested system did not work as intended. In this case, we leverage fuzzing to generate more good-quality C2 profiles for robust learning.

We developed an Empire C2 fuzzer using the Empire C2 framework. Our fuzzer takes a known Empire C2 profile as input and executes relevant functions in the Empire C2 framework to generate listeners, stagers and the associated Empire C2 traffic.

To generate more diverse results, the fuzzer applies random mutation and similar techniques to the original profile. Random mutation involves assigning a randomized value to a selected field in the profile. By manipulating the inputs, our fuzzer can generate a large number of examples for Empire C2 traffic, effectively simulating adversarial attacks.
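A minimal sketch of the random mutation step might look like the following. It represents a profile as a flat Python dictionary, which is a simplification of the real Malleable C2 format, and the seed values, value ranges and function names are hypothetical. The transform names in the encoding list are data transforms from the Malleable C2 format.

```python
import copy
import random
import string
from typing import Dict, List

# Hypothetical, flattened seed profile (the real format is nested).
SEED_PROFILE: Dict[str, str] = {
    "sleeptime": "5000",
    "useragent": "Mozilla/5.0 (hypothetical)",
    "uri": "/api/v1/users",
    "encoding": "base64",
}


def random_token(rng: random.Random, length: int) -> str:
    alphabet = string.ascii_letters + string.digits
    return "".join(rng.choice(alphabet) for _ in range(length))


# One generator of randomized values per mutable field.
MUTATORS = {
    "sleeptime": lambda rng: str(rng.randint(100, 60000)),
    "useragent": lambda rng: "Agent/" + random_token(rng, 8),
    "uri": lambda rng: "/" + random_token(rng, 6) + "/" + random_token(rng, 10),
    "encoding": lambda rng: rng.choice(
        ["base64", "base64url", "netbios", "netbiosu", "mask"]
    ),
}


def mutate_profile(profile: Dict[str, str], rng: random.Random) -> Dict[str, str]:
    """Return a copy of the profile with one selected field randomized."""
    mutated = copy.deepcopy(profile)
    field = rng.choice(sorted(MUTATORS))
    mutated[field] = MUTATORS[field](rng)
    return mutated


def generate_variants(seed: Dict[str, str], count: int, rng: random.Random) -> List[Dict[str, str]]:
    return [mutate_profile(seed, rng) for _ in range(count)]


variants = generate_variants(SEED_PROFILE, 20, random.Random(0))
```

Each variant would then be handed to the framework to produce listeners, stagers and traffic. That step is outside the scope of this sketch.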

Empire C2 Data Quality Monitoring

The feedback mechanism for our Empire C2 fuzzer provides code coverage for the current C2 profile being analyzed. Code coverage is a metric that measures the number of lines of code executed by our Empire C2 fuzzer with the input of a profile.

If two profiles exercise different logic, their code coverage should differ, since they trigger different code paths. This means the C2 traffic generated by these two profiles should also be different.

We use code coverage to assess the quality of a newly generated profile. A profile that passes our quality check triggers a code coverage change. We define this as an unseen profile, meaning it is totally new to our generated dataset. Our goal is to collect as many unseen profiles as possible to train our Empire C2 detection model.

To meet this goal, we use an Empire C2 data quality monitoring engine. Our monitoring engine determines when to retrain the model by monitoring the number of unseen C2 profiles. Once the number of unseen profiles reaches a certain limit, our quality monitoring engine signals the need for retraining.
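The coverage-based quality check and the retraining signal can be sketched as follows. This example assumes coverage is reported as a set of "file:line" strings and treats any previously unseen covered line as a coverage change. The class name, the threshold value and the coverage format are hypothetical.

```python
from typing import Iterable, Set


class DataQualityMonitor:
    """Track code coverage across profiles and signal when to retrain."""

    def __init__(self, retrain_threshold: int) -> None:
        self.retrain_threshold = retrain_threshold
        self.seen_lines: Set[str] = set()
        self.unseen_since_retrain = 0

    def observe(self, covered_lines: Iterable[str]) -> bool:
        """Record one profile's coverage; return True if the profile is unseen."""
        new_lines = set(covered_lines) - self.seen_lines
        if not new_lines:
            return False  # no coverage change, so the profile fails the check
        self.seen_lines |= new_lines
        self.unseen_since_retrain += 1
        return True

    def should_retrain(self) -> bool:
        return self.unseen_since_retrain >= self.retrain_threshold

    def mark_retrained(self) -> None:
        """Reset the counter after a retraining run; coverage history is kept."""
        self.unseen_since_retrain = 0


monitor = DataQualityMonitor(retrain_threshold=2)
first_is_unseen = monitor.observe({"listener.py:10", "listener.py:11"})
repeat_is_unseen = monitor.observe({"listener.py:10"})
second_is_unseen = monitor.observe({"stager.py:42"})
retrain_now = monitor.should_retrain()
```

Only profiles that add coverage count toward the threshold, so repeated or near-duplicate profiles do not trigger retraining on their own.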

Empire C2 Traffic Generation Engine

We built our C2 traffic generation engine on top of the Empire C2 framework. We used this framework to generate C2 traffic for our dataset.

Our traffic generation engine sets up a client environment and mimics C2 communication with a C2 server. At the same time, it collects C2 traffic samples and saves them as packet capture (pcap) files for our model training.

Empire C2 Model Training

Our objective to improve C2 detection capability is twofold. We want to detect real-world cases of Empire activity like default C2 profiles used in red team engagements, and we need to simulate adversarial settings as discussed earlier.

To achieve these goals, we conducted the initial training on the original Empire C2 dataset with default C2 profiles. Subsequently, we harnessed the Empire C2 fuzzer to gather additional Empire C2 traffic that we incorporated into the training dataset for updating the model weights using various optimizers.

We categorize our dataset using three labels: “benign,” “non-adv” and “adv.”

“Benign” is C2 traffic from real-world uses, like penetration testing Empire C2 activity. “Non-adv” stands for non-adversarial attacks, indicating the datasets are from default profiles. Finally, the “adv” datasets are from simulated adversarial attack profiles.

We balanced our model detection on both “non-adv” and “adv” categories, designing our loss function to consider these labels. Our loss function is designed to give more weight to the “adv” samples we create, which should increase our detection rate for adversarial attacks.

We employed a periodic training approach, relying on the feedback from our data quality monitoring engine. Whenever our data quality monitoring engine generated a signal, we captured the newly generated C2 traffic for model retraining.

The model is retrained on the new “adv” samples together with the original dataset. Since our loss function pays more attention to “adv” samples, the retrained model should have a good detection rate on adversarial (“adv”) samples.
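One common way to make a loss function pay more attention to a category is to weight each sample's loss by its label. The sketch below uses a weighted binary cross-entropy for this purpose. It is an illustration under our own assumptions, not the exact loss used in our system: the weight values, the function name and the mapping from categories to detection targets (0 for “benign”, 1 for the other two) are hypothetical.

```python
import math
from typing import List, Tuple

# Hypothetical per-category weights; "adv" samples count three times as much.
CATEGORY_WEIGHTS = {"benign": 1.0, "non-adv": 1.0, "adv": 3.0}

# One sample: (category, target label, predicted probability of C2).
Sample = Tuple[str, int, float]


def weighted_bce(samples: List[Sample]) -> float:
    """Weighted average of per-sample binary cross-entropy."""
    eps = 1e-12  # keeps log() away from zero
    total_loss = 0.0
    total_weight = 0.0
    for category, target, prob in samples:
        p = min(max(prob, eps), 1.0 - eps)
        loss = -(target * math.log(p) + (1 - target) * math.log(1.0 - p))
        weight = CATEGORY_WEIGHTS[category]
        total_loss += weight * loss
        total_weight += weight
    return total_loss / total_weight


# The same two predictions, but the poor one lands on a different category.
miss_on_adv = weighted_bce([("adv", 1, 0.2), ("non-adv", 1, 0.9)])
miss_on_non_adv = weighted_bce([("adv", 1, 0.9), ("non-adv", 1, 0.2)])
```

With these weights, missing an “adv” sample produces a larger loss than missing a “non-adv” sample with the same predicted probability, so gradient updates push harder toward detecting adversarial traffic.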

Conclusion

Empire is a widely used C2 framework employed by both red team personnel and malicious threat actors. Empire's support for Malleable C2 profiles further enhances configurability and makes detection more difficult. We reviewed how Empire C2 works and discussed why we picked this framework to train our ML-based robust learning system.

Our approach trains the model with both real-world examples and simulated adversarial attacks, improving the robustness of our Empire C2 detection.

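Palo Alto Networks customers receive protections from and mitigations for Empire C2-based attacks through our Next-Generation Firewall with an Advanced Threat Prevention subscription, WildFire, Cortex XDR and Prisma Cloud.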

If you think you might have been compromised or have an urgent matter, get in touch with the Unit 42 Incident Response team or call:

  • North America Toll-Free: 866.486.4842 (866.4.UNIT42)
  • EMEA: +31.20.299.3130
  • APAC: +65.6983.8730
  • Japan: +81.50.1790.0200

Palo Alto Networks has shared these findings with our fellow Cyber Threat Alliance (CTA) members. CTA members use this intelligence to rapidly deploy protections to their customers and to systematically disrupt malicious cyber actors. Learn more about the Cyber Threat Alliance.