Executive Summary
This article examines obfuscation techniques used in popular malware families and identifies opportunities for automating the unpacking of these malware samples.
We will examine these behaviors in samples we have observed, showing how to extract their configuration parameters through unpacking each stage. Performing this same process through automation would allow a sandbox performing static analysis to extract crucial malware configuration parameters from such samples.
Malware authors increasingly use advanced obfuscation techniques to evade sandbox detection, enabling widespread distribution. Static analysis is a process performed by sandboxes for examining samples, without directly executing them.
Adversaries use the following techniques to deliver popular malware families like Agent Tesla, XWorm and FormBook/XLoader:
- Code virtualization
- Staged payload delivery
- Dynamic code loading to introduce new code at runtime
- Advanced Encryption Standard (AES) encryption
- Creating multi-stage payloads that are self-contained within the original sample
Palo Alto Networks customers are better protected from the threats discussed in this article through the following products or services:
- Our Network Security solutions including Advanced WildFire, Advanced URL Filtering and Advanced DNS Security
- Cortex XDR and XSIAM
If you think you might have been compromised or have an urgent matter, contact the Unit 42 Incident Response team.
Related Unit 42 Topics | .NET, Agent Tesla, Anti-Analysis Techniques |
Introduction
Malware authors use obfuscation techniques to hinder sandboxes from using static analysis, increasing the possibility that a malicious file will evade detection. This allows adversaries to distribute malware samples more effectively at scale.
This article examines obfuscation techniques used to deliver malware families like Agent Tesla, XWorm and FormBook/XLoader.
The obfuscation techniques can be grouped by objective and technique, as shown in Table 1.
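Objective | Techniques |
Payload protection | AES cryptography; code virtualization |
Payload delivery | Staged payloads; payloads stored in the PE overlay; dynamic code loading, deobfuscation and execution via .NET reflection |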
Table 1. Classification of obfuscation techniques
Let us explain these techniques in more detail:
- AES cryptography: AES stands for Advanced Encryption Standard. This is a block cipher that uses the same symmetric key to encrypt plaintext or decrypt ciphertext. This is more secure than eXclusive OR (XOR) cryptography, which relies on a simple bitwise operation for lightweight obfuscation.
- Code virtualization: Code virtualization is a software protection technique that works by transforming code into specialized instructions. These instructions are written in such a way that they can only be executed by the accompanying custom interpreter. To unravel the program, one has to first understand the internals of the interpreter implementation. This extra step adds another layer of complexity to the analysis.
- Staged payloads: Staging is the practice of wrapping a core payload with multiple layers. Detonating several payloads in sequence, versus doing everything all at once, is an attempt by the attacker to evade detection. The malware sample could abort its infection chain partway through, if its initial payload fails or is detected, thus preventing exposure of its later stage payloads. As this design is modular, the malware author can also customize the combination of different payloads when creating the original malware sample.
- Payloads stored in the PE overlay: The PE overlay describes extra bytes appended to a file that are not included in its metadata. Attackers often hide their payloads inside the PE overlay, because some static analysis tools skip processing this area. This practice can also be used by .NET malware, not just by standard PE files.
- Dynamic code loading, deobfuscation and execution via .NET reflection: Reflection is a feature of certain programming environments, including the .NET Framework, that lets a program inspect types and load and invoke code at runtime, for example by resolving a method from its name given as a string. Reflection allows the running malicious process to introduce new objects into the system or inspect and manipulate existing objects already in the system. Reflection can also be abused to bypass access security restrictions, as it can modify private attributes or invoke internal methods not typically accessible otherwise.
This article will discuss in greater detail the chain of obfuscation techniques shown in Figure 1.
Figure 1. Three stages of payload transformation. Stage 1: encrypted .NET payload protected with Advanced Encryption Standard (AES). Stage 2: virtualized payload associated with KoiVM. Stage 3: final payload (Agent Tesla, XWorm or FormBook/XLoader).
A 2023 article by K7 Labs discusses a first-stage .NET downloader that seems to be an earlier variant of samples we have observed. It contacts a command-and-control (C2) server to download a second-stage KoiVM dropper, which delivers payloads like Agent Tesla and Remcos RAT.
In the cluster of malware samples we observed, there were updated features such as using AES encryption instead of XOR encryption. This cluster also had multi-stage payloads that were self-contained within the original malware sample distributed.
Technical Analysis
In the following sections, we will discuss the various stages of activity this multi-staged malware undergoes.
Stage 1: Encrypted Payload (in PE Overlay)
The samples we observed concealed their payload within the PE overlay. They also contained an ASCII string (gXQstjDplQeg), which we will refer to as a marker. This marker delimited the AES encryption parameters. This marker was referenced from the main .NET code itself, usually by the ldstr instruction. It was also present several times within the PE overlay.
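Locating the overlay can be sketched in a few lines of Python. This is a minimal sketch, not the tooling used in this research: it assumes a well-formed PE file and treats everything after the end of the last section's raw data as overlay, ignoring edge cases such as an Authenticode signature stored in the same region. The function names are hypothetical.

```python
import struct


def pe_overlay_offset(data: bytes) -> int:
    """Return the file offset where the PE overlay starts.

    Simplification (assumption): the overlay begins right after the
    highest PointerToRawData + SizeOfRawData among all sections.
    """
    if data[:2] != b"MZ":
        raise ValueError("missing MZ header")
    # e_lfanew at offset 0x3C points to the "PE\0\0" signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # COFF header follows the 4-byte signature.
    (num_sections,) = struct.unpack_from("<H", data, e_lfanew + 6)
    (opt_header_size,) = struct.unpack_from("<H", data, e_lfanew + 20)
    section_table = e_lfanew + 24 + opt_header_size

    end_of_sections = 0
    for i in range(num_sections):
        header = section_table + i * 40  # each section header is 40 bytes
        size_raw, ptr_raw = struct.unpack_from("<II", data, header + 16)
        end_of_sections = max(end_of_sections, ptr_raw + size_raw)
    return end_of_sections


def extract_overlay(data: bytes) -> bytes:
    """Return the bytes appended after the last section (may be empty)."""
    return data[pe_overlay_offset(data):]
```

A file with no appended data yields an empty overlay, which is a quick way to rule out samples that do not use this technique.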
AES encryption operating in cipher block chaining (CBC) mode uses a symmetric key and an initialization vector (IV) to encrypt plaintext into ciphertext (and vice versa for decryption), as shown in Figure 2.
Figure 2. Cipher block chaining (CBC) mode encryption, starting from the initialization vector and continuing through each encryption stage with its key and plaintext inputs.
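The chaining that CBC mode performs can be written out explicitly. The notation here is our own and not taken from the samples: $P_i$ and $C_i$ are the $i$-th plaintext and ciphertext blocks, $E_K$ and $D_K$ are AES block encryption and decryption under key $K$, and $\oplus$ is bitwise XOR.

```latex
\begin{aligned}
C_0 &= \mathrm{IV} \\
C_i &= E_K\left(P_i \oplus C_{i-1}\right), \quad i = 1, \dots, n \\
P_i &= D_K\left(C_i\right) \oplus C_{i-1}, \quad i = 1, \dots, n
\end{aligned}
```

Because decrypting block $i$ needs only $C_i$, $C_{i-1}$ and the key, an analyst who recovers the key and IV from the overlay has everything required to decrypt the payload offline.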
The PE overlay contains the Stage 2 payload ciphertext and another notable ciphertext: a sequence of strings delimited by dollar signs ($). The presence of the following strings indicates the malware can perform an Antimalware Scan Interface (AMSI) bypass:
- AmsiInitialize
- AmsiOpenSession
- AmsiScanBuffer
Other tokens (such as the following) indicate dynamic .NET Framework code execution via reflection:
- Assembly
- Load
- GetMethod
- Invoke
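Once the token ciphertext has been decrypted, checking it for these indicators is straightforward to automate. The sketch below is illustrative only: the function name is hypothetical, and it assumes the decrypted blob is text that decodes cleanly as UTF-8, which the source does not state.

```python
from typing import Dict, List

# Token names as listed in this article.
AMSI_TOKENS = {"AmsiInitialize", "AmsiOpenSession", "AmsiScanBuffer"}
REFLECTION_TOKENS = {"Assembly", "Load", "GetMethod", "Invoke"}


def classify_tokens(decrypted: bytes) -> Dict[str, List[str]]:
    """Split a '$'-delimited token string and report known indicators."""
    # Assumption: the plaintext is UTF-8 compatible text.
    tokens = set(decrypted.decode("utf-8", errors="replace").split("$"))
    return {
        "amsi_bypass": sorted(tokens & AMSI_TOKENS),
        "reflection": sorted(tokens & REFLECTION_TOKENS),
    }
```

A non-empty "amsi_bypass" or "reflection" list is only a hint that the corresponding behavior is present; it does not prove the code path is reached at runtime.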
Additionally, padding envelops the AES cryptographic material (key, IV and ciphertext): a prefix of \x00 characters of arbitrary length and a suffix made of a repeating string (PAPADDINGXX). This padding helps evade signature-based defenses.
Table 2 shows the various parts of the file layout of the PE overlay.
<Padding: Sequence of \x00's> |
<marker> |
Key |
<marker> |
IV |
<marker> |
Ciphertext: Stage 2 payload |
<marker> |
Ciphertext: Token1$Token2$… (for reflection) |
<marker> |
… |
<Padding: Sequence of PAPADDINGXX's> |
Table 2. File layout of the PE overlay of a Stage 1 payload sample.
To recover the Stage 2 payload, we first extracted the marker from the .NET program code. Then, using the extracted marker as the delimiter, we split the PE overlay into parts, decrypting the ciphertexts using the provided key and initialization vector. One of these decrypted ciphertexts (often the largest) is the Stage 2 payload.
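The recovery steps above can be sketched as follows. This is a hedged illustration rather than the exact procedure used in this research: it assumes the overlay matches the layout in Table 2 exactly, that the key and IV are stored as raw bytes between markers, and that the third-party `cryptography` package is available for the AES step. The helper names are hypothetical, and padding is not stripped from the decrypted output.

```python
from typing import Callable, List, NamedTuple


class OverlayParts(NamedTuple):
    key: bytes
    iv: bytes
    ciphertexts: List[bytes]


def split_overlay(overlay: bytes, marker: bytes) -> OverlayParts:
    """Split the overlay on the marker, following the Table 2 layout."""
    fields = overlay.split(marker)
    # fields[0] is the \x00 prefix padding; fields[-1] is the suffix padding.
    if len(fields) < 5:
        raise ValueError("overlay does not match the expected layout")
    key, iv = fields[1], fields[2]
    ciphertexts = [f for f in fields[3:-1] if f]
    return OverlayParts(key, iv, ciphertexts)


def aes_cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    """AES-CBC decryption (assumes the 'cryptography' package is installed)."""
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    decryptor = Cipher(algorithms.AES(key), modes.CBC(iv)).decryptor()
    return decryptor.update(ciphertext) + decryptor.finalize()


def recover_stage2(
    overlay: bytes,
    marker: bytes,
    decrypt: Callable[[bytes, bytes, bytes], bytes] = aes_cbc_decrypt,
) -> bytes:
    """Decrypt every ciphertext and return the largest plaintext."""
    parts = split_overlay(overlay, marker)
    plaintexts = [decrypt(parts.key, parts.iv, ct) for ct in parts.ciphertexts]
    return max(plaintexts, key=len)
```

Returning the largest plaintext mirrors the heuristic described above. A stricter check, such as testing for an MZ header at the start of the plaintext, would be a reasonable addition.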
Stage 2: Virtualized Payload
After Stage 1, a more complex virtual machine (VM)-based obfuscation awaits in Stage 2. This second stage is meant to hide the final payload.
The VM we are referring to here is not the kind that runs multiple operating systems (OS) on a single host machine. It is KoiVM, a plugin for the ConfuserEx obfuscation tool.
VM-based obfuscation operates on the idea of a custom VM interpreter, which consists of a central dispatcher at its core. An input program is a list of virtual instructions written in a particular instruction set (called the intermediate language).
The dispatcher executes this program by routing each instruction to its respective handler. An instruction consists of a command (called an opcode), optionally with associated arguments, which are most often passed either in registers or on the stack. Program execution terminates when it reaches the VM-exit handler.
Figure 3 shows a diagram that gives an overview of how VM-based obfuscation works.
Figure 3. Process flow of a virtual machine (VM) interpreter: virtual instructions enter at the VM entry, pass through a dispatcher and are then processed by multiple handlers.
Standard disassemblers would not be able to easily analyze programs written this way, making it difficult for a malware analyst to decipher what the program is trying to accomplish. Furthermore, the malware author might have remapped the opcodes in this case, so that an off-the-shelf devirtualization tool like OldRod would fail.
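The dispatcher model, and the effect of opcode remapping, can be illustrated with a toy stack-based interpreter. This is not KoiVM: the opcode values, handler set and class name are all hypothetical and chosen only to show the idea.

```python
from typing import Dict, List, Optional

# Hypothetical opcode numbering for the toy VM.
PUSH, ADD, MUL, EXIT = 0, 1, 2, 3


class ToyVM:
    def __init__(self, opcode_map: Optional[Dict[int, int]] = None) -> None:
        canonical = {
            PUSH: self._push,
            ADD: self._add,
            MUL: self._mul,
            EXIT: self._exit,
        }
        # opcode_map: opcode value found in the program -> canonical opcode.
        opcode_map = opcode_map or {op: op for op in canonical}
        self.handlers = {raw: canonical[op] for raw, op in opcode_map.items()}
        self.stack: List[int] = []
        self.pc = 0
        self.running = False

    def _push(self, program: List[int]) -> None:
        self.stack.append(program[self.pc])  # inline argument follows opcode
        self.pc += 1

    def _add(self, program: List[int]) -> None:
        b, a = self.stack.pop(), self.stack.pop()
        self.stack.append(a + b)

    def _mul(self, program: List[int]) -> None:
        b, a = self.stack.pop(), self.stack.pop()
        self.stack.append(a * b)

    def _exit(self, program: List[int]) -> None:
        self.running = False  # VM-exit handler ends the dispatch loop

    def run(self, program: List[int]) -> Optional[int]:
        self.stack, self.pc, self.running = [], 0, True
        while self.running:
            opcode = program[self.pc]      # fetch
            self.pc += 1
            self.handlers[opcode](program)  # dispatch to the handler
        return self.stack[-1] if self.stack else None


# (2 + 3) * 4 with the default opcode numbering.
print(ToyVM().run([PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, EXIT]))  # 20

# The same program with remapped opcodes needs the matching map.
remap = {7: PUSH, 9: ADD, 4: MUL, 1: EXIT}
print(ToyVM(opcode_map=remap).run([7, 2, 7, 3, 9, 7, 4, 4, 1]))  # 20
```

Running the remapped program on a VM built with the default numbering raises a KeyError on the first instruction. This is analogous to how a devirtualizer that expects stock opcode values can fail on a customized build.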
For the samples we are examining, the VM program is actually just a dropper. A dropper is responsible for loading, decrypting and executing the final payload in memory. The decryption key is decoded from a Base64 string, while the ciphertext exists as an embedded resource in the Stage 2 payload.
One mitigation approach for this obfuscation uses the .NET Framework's ICorDebugManagedCallback Interface to create a debugger which hooks the following API functions:
- System.Convert::FromBase64String: Extracts the decryption key
- System.Resources.ManifestBasedResourceGroveler::GetManifestResourceStream: Extracts the ciphertext
Hooking here means placing breakpoints, which temporarily pause program execution at chosen points so that values of interest can be extracted.
After obtaining the ciphertext and decryption key this way, we can quickly recover the Stage 3 final payload.
Stage 3: Final Payload
As mentioned earlier, the final payload exists as an embedded AES-encrypted resource in the previous Stage 2 payload, which is decrypted in memory at runtime before execution. In the malware sample dataset we analyzed, the final payload belonged mainly to the Agent Tesla or XWorm family, except for one sample delivering shellcode identified as belonging to the FormBook/XLoader family.
While the Stage 3 payload code was no longer obfuscated, the XWorm samples' configuration parameters were encrypted using AES in Electronic Codebook (ECB) mode. The hard-coded AES key is stored in a variable named Mutex. Other variables besides Mutex can then be decrypted independently with this key, to restore the original set of malware configuration parameter values, especially the remote C2 endpoint.
Conclusion
Malware authors commonly use obfuscation techniques like encryption and code virtualization to hide their malicious intent. This allows them to evade security mechanisms and sandbox detection.
The cluster of malware samples we have highlighted uses staged payloads encrypted with strong AES cryptography at each layer. This fileless malware is loaded, decrypted and executed in memory via reflection.
Because these advanced techniques are among the toughest to overcome, we hope to make more people aware of this issue. We would like to see more researchers put effort and resources into finding a principled way to defeat code virtualization protection schemes.
Palo Alto Networks Protection and Mitigation
Palo Alto Networks customers are better protected from the threats discussed above through the following products:
- The Advanced WildFire machine-learning models and analysis techniques have been reviewed and updated in light of the IoCs shared in this research.
- Advanced URL Filtering and Advanced DNS Security identify known URLs and domains associated with this activity as malicious.
- Cortex XDR and XSIAM are designed to:
- Prevent the execution of known malware, and also prevent the execution of unknown malware using Behavioral Threat Protection and machine learning based on the Local Analysis module.
- Protect against credential gathering tools and techniques using the new Credential Gathering Protection available from Cortex XDR 3.4.
- Protect from threat actors dropping and executing commands from web shells using Anti-Webshell Protection, newly released in Cortex XDR 3.4.
- Protect against exploitation of different vulnerabilities including ProxyShell and ProxyLogon using the Anti-Exploitation modules as well as Behavioral Threat Protection.
- Detect post-exploit activity, including credential-based attacks, with behavioral analytics, through Cortex XDR Pro.
If you think you may have been compromised or have an urgent matter, get in touch with the Unit 42 Incident Response team or call:
- North America: Toll Free: +1 (866) 486-4842 (866.4.UNIT42)
- UK: +44.20.3743.3660
- Europe and Middle East: +31.20.299.3130
- Asia: +65.6983.8730
- Japan: +81.50.1790.0200
- Australia: +61.2.4062.7950
- India: 00080005045107
Palo Alto Networks has shared these findings with our fellow Cyber Threat Alliance (CTA) members. CTA members use this intelligence to rapidly deploy protections to their customers and to systematically disrupt malicious cyber actors. Learn more about the Cyber Threat Alliance.
Indicators of Compromise
Agent Tesla Activity
SHA-256 hashes
- a02bdd3db4dfede3d6d8db554a266bf9f87f4fa55ee6cde5cbe1ed77c514cdee
- 3d8187853d481c74408d56759f427e2c3446e9310c2d109fd38a0f200696c32d
Process name
- lrfRT.exe
- uaAWu.exe
User-Agent string
- Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:99.0) Gecko/20100101 Firefox/99.0
SMTP
1) SHA256 - a02bdd3db4dfede3d6d8db554a266bf9f87f4fa55ee6cde5cbe1ed77c514cdee
- Server: mail[.]iaa-airferight[.]com:25
- Sender: admin@iaa-airferight[.]com
- Password: manlikeyou88
- Receiver: admin@iaa-airferight[.]com
2) SHA256 - 3d8187853d481c74408d56759f427e2c3446e9310c2d109fd38a0f200696c32d
- Server: mail[.]iaa-airferight[.]com:25
- Sender: web@iaa-airferight[.]com
- Password: webmaster
- Receiver: mail@iaa-airferight[.]com
XWorm Activity
SHA-256 hashes
- 098a18e96c4fb250ffadb3f01d601240c74a4d9f5df94cb72bd44cc81b80b2af
- 695e038452a656d58471f284edb8d81754b78258a6afd3d8f62ae8a47c3130d9
C2 traffic
- 66[.]63[.]168[.]133:7000
- weidmachane[.]zapto[.]org:7000
FormBook/XLoader Activity
SHA-256 hashes
- d72f4ef2e5caea42749d542384b6634e65e29f3aef5d09a9c231cc09e76e4988
Additional Resources
- AMSI Bypass – Fluid Attacks
- KoiVM Virtualization – yck1509 on GitHub
- Agent Tesla family – Unit 42, Palo Alto Networks
- XWorm family – Unit 42, Palo Alto Networks