In recent weeks, Unit 42 has discovered three documents crafted to exploit the InPage program. InPage is a word processor program that supports languages such as Urdu, Persian, Pashto, and Arabic. The three InPage exploit files are linked through their use of very similar shellcode, which suggests that either the same actor is behind these attacks, or the attackers have access to a shared builder. The documents were found to drop the following malware families:
- The previously discussed CONFUCIUS_B malware family
- A backdoor previously not discussed in the public domain, commonly detected by some antivirus solutions as “BioData”
- A previously unknown backdoor that we have named MY24
The use of InPage as an attack vector is not commonly seen, with the only previously noted attacks being documented by Kaspersky in late 2016.
The decoy documents used by the InPage exploits suggest that the targets are likely to be politically or militarily motivated. They contained subjects such as intelligence reports and political situations related to India, the Kashmir region, or terrorism being used as lure documents.
In the blog below, we analyze and present our findings on three of these malicious InPage documents:
We also include analysis of the new backdoor we discovered: MY24.
Cyber Advisory No 91.inp
We discovered the first InPage exploit to have the following attributes:
SHA256 | 1d1e7a6175e6c514aaeca8a43dabefa017ddc5b166ccb636789b6a767181a022 |
Original Filename | Cyber Advisory No 91.inp |
The exploit for this document is the same one described by described by Kaspersky late last year. This exploit was unsuccessful in the latest version in InPage (Version 3.60), and as such the underlying vulnerability has likely been patched.
Overall, the entire execution flow of this malware from start to finish can be summarized as follows:
Figure 1 InPage exploit document execution flow
When the malicious .INP file is opened using a vulnerable version of InPage, it will execute the shellcode that is embedded within it.
This particular shellcode, along with the shellcode found within another InPage exploit document that will be discussed later on, began with a marker of ‘LuNdLuNd’, followed by a series of NOPs. It continues to identify an offset to an embedded executable file, which will eventually be run on the victim machine.
This particular shellcode uses a unique hashing mechanism for identifying and loading Microsoft Windows libraries and functions. It uses this method to load a series of functions, as seen below:
Figure 2 Shellcode loading functions using custom hashing algorithm
The hashing algorithm in question can be represented in Python as follows:
1 2 3 4 5 6 7 8 9 10 11 12 13 |
def hashAlgo(string): hsh = 0 for c in string: v1 = ord(c) | 0x60 hsh = 2 * (hsh + v1) return hsh library = "ntdll.dll" function = "memcpy" print "[+] '{}' Library: 0x{:x}".format(library, hashAlgo(library)) print "[+] '{}' Function: 0x{:x}".format(function, hashAlgo(function)) Output: [+] 'ntdll.dll' Library: 0x1b708 [+] 'memcpy' Function: 0x347a |
This particular hashing algorithm does not appear to be widely used, however, in our searches using the YARA rule provided at the end of this blog, we were able to identify roughly 70 PE32 samples that have recently employed this same hashing technique.
The shellcode then proceeds to attempt to create a mutex with a value of “QPONMLKJIH” to ensure only one instance of the shellcode is running at a given time. Finally, the shellcode will copy the embedded payload into newly allocated memory before executing it.
This newly dropped payload is a DLL with the following attributes:
SHA256 | 7bbf14ced3ca490179d3727b7287eb581c3a730131331be042d0f0510bc804f9 |
Compile Timestamp | 2015-05-08 12:51:54 UTC |
PDB String | c:\users\mz\documents\visual studio 2013\Projects\Shellcode\Release\Shellcode.pdb |
This particular DLL acts as a dropper, and has two embedded resource files—an executable payload that will be used to ultimately drop the final payload, as well as a decoy InPage file. It begins by spawning a new thread that loads the two files from embedded resources with names of ‘BIN’ and ‘BIN2’ respectively. The executable is dropped to the following path before it is executed:
- %TEMP%\winopen.exe
The InPage decoy document is dropped to the following path before it is run:
- %TEMP\SAMPLE.INP
The decoy document in question looks like the following. The rough translation to English has been provided in red:
Figure 3 Decoy InPage file with rough translation
Based on the rough translation of this document, it appears to deal with current issues within the Kashmir region. This of course is not consistent with the original filename, and it is unclear why this is the case. Perhaps the attacker forgot to change the lure from a previous exploit, or simply didn’t find it necessary. This lure, while inconsistent with the original filename, is in line with the other InPage exploit file that also looked to be of the same subject matter.
The executable file in the ‘%TEMP%\winopen.exe’ path has the following attributes:
SHA256 | 692815d06b720669585a71bc8151b89ca6748f882b35e365e08cfaf6eda77049 |
Compile Timestamp | 2017-07-31 06:03:42 UTC |
This particular executable is made to resemble the legitimate application Putty. Unlike other files we witnessed up to this point, this sample has rudimentary anti-debugging and anti-analysis techniques in place prior to the main execution flow.
It proceeds to decrypt an embedded resource object using the RC4 algorithm. The following key is used for decryption:
- VACqItywGR1v3qGxVZQPYXxMZV0o2fzp
After this data is decrypted, the following registry key is written to ensure persistence. Again, we see the malware mimic the appearance of the legitimate Putty application.
- HKCU\SOFTWARE\Microsoft\Windows\CurrentVersion\RunOnce\Putty - %TEMP%\winopen.exe
Finally, the malware will spawn a new suspended instance of itself, where the decrypted data is written and subsequently executed.
This next stage of malware has the following properties:
SHA256 | bb5540fe0bbc0cda08865aad891a585cd465b224bfe84762216cd04178087516 |
Compile Timestamp | 2017-05-17 05:47:05 UTC |
This malware operates almost identical to the previously witnessed sample. However, this time the embedded resource object is decrypted using the following RC4 key:
- kRPAnN2DN6vfrxsJ55Lntnh7Mma8E68s
The next, and last stage of this malware execution has the following attributes:
SHA256 | d1a14bc3160f5ed6232ceaf40de1959d7dba3eae614efd2882b04d538cda825b |
Compile Timestamp | 2016-10-31 02:41:09 UTC |
This final payload is an instance of the CONFUCIUS_B malware family, which we have previously discussed. This particular sample attempts to connect to the following host for C2 operations:
- 151.80.14[.]194
Intelligence Report-561 (1).inp
We identified this malicious InPage document as having the following attributes:
SHA256 | 35c5f6030513f11fd1dcf9bd232de457ba7f3af3aedc0e2e976895b296a09df6 |
Original Filename | Intelligence Report-561 (1).inp |
This particular exploit file uses the exact same shellcode witnessed previously, where an embedded DLL is loaded into memory. Again, this executable drops and executes two files—a Microsoft Windows executable payload and an InPage decoy document.
The embedded payload within the shellcode has the following attributes:
SHA256 | 83e3b2938ee6a3e354c93c6ec756da96b03cc69118b5748b07aee4d900da1844 |
Compile Timestamp | 2015-05-08 12:51:54 UTC |
PDB String | c:\users\mz\documents\visual studio 2013\Projects\Shellcode\Release\Shellcode.pdb |
Again, we see the executable payload and decoy document dropped to the following respective locations:
- %TEMP%\winopen.exe
- %TEMP%\SAMPLE.inp
The dropped executable is a previously undocumented backdoor written in Delphi that has been named BioData by multiple antivirus organizations.
This InPage exploit document follows a much simpler execution flow, as seen in the following diagram.
Figure 4 InPage exploit execution flow
The decoy InPage file dropped by this malware looks like the following. The language used within it appears to be a mix of Arabic and Urdu. A rough translation has been provided in red in the image below.
Figure 5 Decoy InPage document dropped by malware
The Biodata payload has the following attributes:
SHA256 | 5716509e4cdbf8ffa5fbce02b8881320cb852d98e590215455986a5604a453f7 |
Compile Timestamp | 1992-06-19 22:22:17 UTC |
Note that the timestamp above is the result of this sample being compiled in Delphi, which uses the same hardcoded compilation timestamp for all samples that are generated.
Throughout the execution of this sample, numerous strings are decoded using a customized 94-character substitution table. BioData will go through each character of the obfuscated string, and will replace each character based on the following table:
Figure 6 Substitution table used by BioData
The malware proceeds to generate and create a ‘Document’ folder within the %USERPROFILE% directory. This folder will contain all of the malware’s files throughout its execution. In order to maintain persistence, the malware will generate the following file in the startup folder, which points to the current path of the BioData executable:
- Adobe creative suit.lnk
BioData proceeds to generate a randomized 30-character string of uppercase and lowercase letters. This string is written to the following file:
- %USERPROFILE%\Document\users.txt
This 30-character string is used by the malware to act as a unique identifier for the victim, and will be used for all network communication with a remote server.
The username and computer name are identified, and are written to a string of the following format:
- User name and System Name :- [Username]_[Computer Name]
This data is obfuscated and written to the following file:
- %USERPROFILE%\Document\SyLog.log
In order to obfuscate this data, the malware uses a unique algorithm. Represented in Python, the following script will decode this file:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 |
import sys from binascii import * file = sys.argv[1] fh = open(file, 'rb') fd = fh.read() fh.close() def bit_not(n, numbits=8): return (1 << numbits) - 1 - n def decode(data): c = 0 output = "" for d in data: o = bit_not((0x6121 >> c) & 0xFF) output += chr(ord(d) ^ o) c += 1 if c == 32: c = 0 return output print decode(fd) |
BioData sends both GET and POST requests to the following URL:
- http://errorfeedback[.]com/MarkQuality455/developerbuild.php
POST requests are made with a hardcoded User-Agent, shown below in Figure 7. Additionally, a ‘b’ GET parameter is included that contains the victim’s previously generated unique identifier. The contents of the POST requests are the obfuscated SyLog.log file. The remote C2 server has been observed responding to these requests with ‘Success’. These requests simply act as a beacon, including the basic victim information that was previously obtained.
Figure 7 HTTP POST request by BioData
GET requests are made in a slightly different fashion. These requests contain an empty User-Agent, and are also found to be missing a number of HTTP headers that are commonly seen.
Figure 8 HTTP GET request by BioData
Unlike the POST requests, the malware both looks for and makes use of the response given, if any, by the C2 server. The malware parses any response given by first hex-decoding it. It then base64-decodes the resulting string. The final string is used to form a subsequent GET request.
If for instance, the malware responded with a decoded string of ‘malware.exe’, the subsequent GET request would look like the following:
- http://errorfeedback[.]com/MarkQuality455/bzGwXILtkMRZaJxzciXAeCYviduBuy/malware.exe
The request above uses the same victim identifier that has been observed in the previous examples provided.
This hypothetical ‘malware.exe’ request contains the raw contents of the payload that BioData will drop to disk and execute. The contents are placed in the following file path for this hypothetical:
- %USERPROFILE%\Document\malware.exe
Finally, after this dropped payload is successfully executed, the malware will send a GET request such as the following:
- http://errorfeedback[.]com/MarkQuality455/developerbuild.php?f=62574673643246795a53356c654755&b=bzGwXILtkMRZaJxzciXAeCYviduBuy
In the above example, the ‘b’ parameter is the victim identifier, and the ‘f’ parameter is the string of ‘malware.exe’ after it has been base64-encoded and hex-encoded. This request alerts the attack that the hypothetical payload of ‘malware.exe’ has been run.
Tehreek-E-Kashmir Mujahaid List.inp
We identified this malicious InPage document as having the following attributes:
SHA256 | 3e410397955d5a127182d69e019dbc8bbffeee864cd9c96e577c9c13f05a232f |
Original Filename | Tehreek-E-Kashmir Mujahaid List.inp |
Unfortunately, no decoy document was included with this exploit file. However, the filename provides clues as to the context that may have been present when this file was delivered to the intended recipient. The phrase ‘Tehreek-E-Kashmir’ is most likely related to the conflict in the Kashmir region of India. Additionally, the term ‘Mujahaid’ may be a misspelling of the word ‘Mujahid’, a term used to describe an individual engaged in Jihad.
This particular InPage shellcode looks to be near identical to the two others previously discussed, however, it appears as though the attackers simply partially overwrote the original shellcode that was present to substitute their own. This results in the shellcode acting as a downloader, instead of loading an embedded payload. We can see the modifications visually in the following image:
Figure 9 Differences between InPage exploit documents
In the image above, the ‘Cyber Advisory No 91.inp’ exploit file has the large additional size, as it included the payload. The ‘Tehreek-E-Kashmir Mujahaid List.inp’ exploit file instead has removed this. However, original artifacts from the original shellcode are still present, including the function that loads Microsoft Windows API calls using the unique hashing algorithm.
The shellcode begins by iterating through the Process Environment Block (PEB), searching for a loaded module that has a ‘3’ in the 7th position. In other words, the shellcode uses a simple trick to search for kernel32.dll. It proceeds to iterate through kernel32’s functions, looking for the GetProcAddress function. In order to find this function it will compare the first four letters against ‘GetP’, and the third set of four letters against ‘ddre’.
The shellcode then gets the address of the WinExec function, which in turn is used to execute the following command:
- cmd /c mkdir C:\Wins
It then performs the following:
- Gets the address of the LoadLibraryA function
- Loads the urlmon.dll library
- Gets the address of the URLDownloadToFileA function
The shellcode then proceeds to make a request to the following URL and download the response to ‘C:\Wins\cnh’.
- http://zmwardrobe[.]com/wp-sign
Finally, the shellcode will execute this downloaded file via a call to WinExec.
The response from this webserver returned a payload, that we have named MY24, with the following attributes:
SHA256 | 71b7de2e3a60803df1c3fdc46af4fd8cfb7c803a53c9a85f7311348f6ff88cbe |
Compile Timestamp | 2017-05-18 05:26:54 UTC |
It should also be noted that a malicious Microsoft Word document with the following properties was observed downloading and executing the same payload.
SHA256 | 3f1d3d02e7707b2bc686b5add875e1258c65a0facd5cf8910ba0f321e230e17c |
Original Filename | Las Vegas ISIS Claim Proof.doc |
First Seen | 2017-10-05 05:53:27 |
MY24 Analysis
This backdoor begins by decoding a series of embedded strings by adding 33 to each character. The following example within the Python interpreter demonstrates this:
Figure 10 Example string decoding within Python interpreter
The malware proceeds to execute a function that is responsible for generating the following path:
- %APPDATA%\Startup\wintasks.exe
However, this path is never used, leading us to believe that the malware author had the intention of copying the payload to this destination and likely setting persistence, but seemingly forgot to.
MY24 proceeds to spawn two timers where the functions are responsible for resolving the C2 domain of userveblog.ddns[.]net, as well as connecting to this domain.
Two new threads are then created—one for handling any data that is received from the connection to the C2 and one that is responsible for sending out data.
Finally, a function is called that is responsible for collecting information about the victim machine. The following information is collected:
- Version of Microsoft Windows
- Username
- Computer name
The MY24 instance expects to receive a command initially from the remote server of userveblog.ddns[.]net on port 9832. All communication is performed using raw sockets via a custom communication protocol. The packets received by the malware have the following format:
Figure 11 Received packet format for MY24 malware
All data received and sent by MY24 is encrypted using a 13-byte XOR key of "t6%9k$2Ri9ctv". The data portion of the received command will include one of the following commands:
Received Command | Description |
2000 | Return victim information |
2001 | Get drive information |
2002 | List files |
2004 | Unknown |
2005 | Create file handle to append data |
2006 | Write appended data to previously created file handle |
2007 | Create file handle for reading data |
2009 | Read data from previously created file handle |
2012 | Spawn a shell of cmd.exe |
2013 | Interact with previously spawned shell |
2015 | Unknown |
2016 | Kill previously spawned shell |
2019 | List current process network communication on the victim machine |
2021 | Unknown |
2022 | Kill process |
2023 | Enumerate processes |
2025 | Unknown |
Responses sometimes vary in size, but are primarily sent with a size of 9084 bytes. The author of this tool did not allocate proper buffer size when sending out the data, resulting in part of the stack being included in the response by the MY24 malware. Examples of commands being sent and received may be seen below. A custom server was written to interact with the MY24 malware, which is seen in the following image.
Figure 12 Interacting with MY24 backdoor
Conclusion
While documents designed to exploit the InPage software are rare, they are not new - however in recent weeks Unit42 has observed numerous InPage exploits leveraging similar shellcode, suggesting continued use of the exploit previously discussed by Kaspersky.
The decoy documents dropped suggest that the targets are likely to be politically or militarily motivated, with subjects such as Intelligence reports and political situations being used as lure documents. The variety of malware payloads dropped suggests the attackers behind these attacks have a reasonable development resource behind them and Unit42 continues to observe new versions of these malware families being created.
Palo Alto Networks customers are protected against these threats in a number of ways:
- All domains observed in these malware families have been flagged as malicious.
- All payloads are appropriately categorized as malicious within the WildFire platform and blocked by Traps.
- The payloads witnessed have been tagged in AutoFocus as Confucius_B, MY24, and BioData for continued tracking and observation.
Appendix
YARA Rules
1 2 3 4 5 6 7 |
rule InPageShellcodeHashing { strings: $hashingFunction = {55 8B EC 51 53 52 33 C9 33 DB 33 D2 8B 45 08 8A 10 80 CA 60 03 DA D1 E3 03 45 10 8A 08 84 C9 E0 EE 33 C0 8B 4D 0C 3B D9 74 01 40 5A 5B 59 8B E5 5D C2 0C 00} condition: $hashingFunction } |