Executive Summary

This article discusses recent samples of BadPack Android malware and examines how this threat’s tampered headers can obstruct malware analysis. We also review the effectiveness of various freely available tools for analyzing BadPack Android Package Kit (APK) files.

The cybersecurity landscape has seen a dramatic increase in malicious Android applications in recent years. One major contributor to this trend is APK samples bundled as BadPack files.

BadPack is an APK file intentionally packaged in a malicious way. In most cases, this means an attacker has maliciously altered header information used in the compressed file format for APK files.

These tampered headers are a key feature of BadPack, and such samples typically pose a challenge for Android reverse engineering tools. Many Android-based banking Trojans like BianLian, Cerberus and TeaBot use BadPack.

Palo Alto Networks customers receive better protection from these BadPack APK samples through our Next-Generation Firewall with Cloud-Delivered Security Services, including Advanced WildFire, Advanced DNS Security and Advanced URL Filtering.

Palo Alto Networks reported these findings to Google. Based on Google’s current detection, no apps containing this malware are found on Google Play. Android users are automatically protected against known versions of this malware by Google Play Protect, which is on by default on Android devices with Google Play Services. Google Play Protect can warn users or block apps known to exhibit malicious behavior, even when those apps come from sources outside of Play.

If you think you might have been compromised or have an urgent matter, contact the Unit 42 Incident Response team.

Related Unit 42 Topics Android APK

Background

APK files are applications used by the Android operating system (OS). APK applications are packages that use the ZIP archive format. These packages contain a file named AndroidManifest.xml. This is the Android Manifest that stores data and instructions for the archive's content.

AndroidManifest.xml contains valuable information about an APK-based application, especially for APK malware samples. In a BadPack APK file, attackers have tampered with its ZIP header data, attempting to prevent analysis of its content.

Analysis tools like Apktool and Jadx often struggle with extracting content from BadPack APK files. For example, we found Apktool failed to extract AndroidManifest.xml from one of the BadPack APK samples we review later in this article.

We reviewed our Advanced WildFire detection telemetry from June 2023 through June 2024 for BadPack APK files, and we discovered almost 9,200 matching samples. The graph in Figure 1 lists detections by month, illustrating BadPack trends during this time frame.

Image 1 is a column graph of the count of BadPack observed in Advanced WildFire from June 2023 to June 2024. There was a leap in May 2024.
Figure 1. BadPack observations in Advanced WildFire, June 2023 through June 2024.

The number of samples we found through Advanced WildFire indicates that BadPack APK malware is a notable threat. To combat this threat, we must better understand BadPack.

BadPack defeats normal extraction techniques. Because the most critical component of an APK archive is its Android Manifest, we should first understand the role of AndroidManifest.xml in an APK archive.

Android Manifest

The Android Manifest file AndroidManifest.xml is a crucial configuration file embedded within the APK sample. This manifest provides essential information about the mobile application to the Android device operating system.

This information includes package components to handle activities initiated by the user and services run by the application. The manifest also includes the permissions the user must grant the application for it to run correctly and the versions of Android the application runs on.

Extracting, reading and processing the Android Manifest is the first step in static analysis of an APK sample. As such, malware authors make it their goal to prevent security analysts from performing these activities. Malware authors achieve this by tampering with headers used in the ZIP archive format of the APK file.

ZIP File Structure

The ZIP format allows users to compress and archive content into a single file. The layout of a ZIP file contains two main types of headers that specify the archive's structure and content:

  • Local file headers
  • Central directory file headers

Malware authors can alter fields within these headers to prevent analysts from extracting an APK file's content, while the APK file can still install and run on an Android device.

Local File Headers

Local file headers represent the individual files contained in a ZIP archive. A ZIP archive contains at least one file, and the first bytes of a ZIP archive always start with a local file header.

If the ZIP archive contains another file, this local file header structure is repeated later in the ZIP archive. These local file headers always start with a 4-byte signature, with the first 2 bytes as the ASCII characters PK, which represent the initials of ZIP archive format creator Phillip Katz. Figure 2 shows the layout of a local file header.

Image 2 is a chart of the local file header layout.
Figure 2. Layout of the local file header structure. Source: Florian Buchholz, The structure of a PKZip file.

Figure 3 shows an example of the first bytes from a ZIP archive.

Image 3 is the hexadecimal dump of the ZIP archive.
Figure 3. Hexadecimal dump of a ZIP archive. Source: Florian Buchholz, The structure of a PKZip file.

We can map these byte values to the corresponding fields of a local file header as shown below in Figure 4.

Image 4 is an example of a local file header structure with field values added. These include the signature, version, version needed, flags, compression, mod time and more.
Figure 4. Field values populated into the local file header structure. Source: adapted from Florian Buchholz, The structure of a PKZip file.

The compression field of a local file header occupies byte offsets 0x08 and 0x09. A value of 0x0000 means the file was not compressed (STORE). In Figure 4 above, the example shows a value of 0x0800, as the bytes appear in the file. Because ZIP header fields are stored in little-endian byte order, these bytes encode the integer 8, which represents the DEFLATE compression algorithm, the most common method used for ZIP archives.

Figure 4 above shows the compressed size at byte offset 0x12 through 0x15 is 0x45, which translates to 69 bytes. The uncompressed size at byte offset 0x16 through 0x19 is 0x4a, which is 74 bytes. The compressed item's filename is 0x66696c6531, which translates to file1 in ASCII text.

In Figure 4, the file header for this ZIP archive ends at 0x37, and the content of the compressed file would begin at 0x38.
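
The field offsets described above can be checked with a short script. The following is a minimal Python sketch, not part of the original analysis. It builds a small in-memory ZIP archive holding a 74-byte entry named file1 (illustrative data chosen to resemble the example in Figure 4) and decodes the fixed 30-byte portion of its local file header. The names LOCAL_HEADER, parse_local_header and make_sample_zip are hypothetical helpers introduced here.

```python
import io
import struct
import zipfile

# Fixed 30-byte portion of a local file header; all fields are little-endian.
# Order: signature, version needed, flags, compression, mod time, mod date,
# CRC-32, compressed size, uncompressed size, filename length, extra length.
LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")


def parse_local_header(data: bytes, offset: int = 0) -> dict:
    """Decode the local file header that starts at `offset`."""
    fields = LOCAL_HEADER.unpack_from(data, offset)
    signature, method = fields[0], fields[3]
    compressed_size, uncompressed_size, name_len, extra_len = fields[7:11]
    if signature != b"PK\x03\x04":
        raise ValueError("no local file header at this offset")
    name_start = offset + LOCAL_HEADER.size
    return {
        "method": method,                        # bytes 0x08-0x09
        "compressed_size": compressed_size,      # bytes 0x12-0x15
        "uncompressed_size": uncompressed_size,  # bytes 0x16-0x19
        "name": data[name_start:name_start + name_len].decode("utf-8", "replace"),
        # File content starts after the filename and the optional extra field.
        "data_offset": name_start + name_len + extra_len,
    }


def make_sample_zip() -> bytes:
    """Build a tiny DEFLATE-compressed archive in memory (illustrative data only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("file1", b"A" * 74)
    return buffer.getvalue()


header = parse_local_header(make_sample_zip())
print(header)
```

The compression method prints as 8 rather than 0x0800 because struct reads the two bytes 08 00 in little-endian order.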

Central Directory File Headers

The central directory sits near the end of a ZIP archive, after the last local file header and its file data. It contains one central directory file header for each file in the archive, and each of these headers records where the matching local file header is located.

In APK files, we sometimes find an optional APK Signing Block between the last local file header and the central directory header. Figure 5 shows the layout of a central directory file header.

Image 5 is an example of the layout of the central directory file header. The information includes the signature, version, flags, compression, external attributes, file name, extra field and more.
Figure 5. Layout of the central directory file header. Source: Florian Buchholz, The structure of a PKZip file.

Using the same file from Figure 3, we must scroll down to the bytes beginning at 0x09a2 to find the first central directory file header. Figure 6 below shows the content of this header.

Image 6 is an example of the central directory file header structure values in hexadecimal, displayed in columns and rows.
Figure 6. Hexadecimal dump showing the values of a central directory file header structure. Source: Florian Buchholz, The structure of a PKZip file.

In the example of the central directory header mapped in Figure 7, we find the same compression-related values as the local file header shown earlier in Figure 4. However, the byte offsets for these fields are different from those shown in Figure 4.

Image 7 is an example of the central directory file header structure with field values added. These include the signature, version, version needed, flags, compression, mod time and more.
Figure 7. Field values populated into the central directory file header structure. Source: adapted from Florian Buchholz, The structure of a PKZip file.

For the central directory header in Figure 7, the byte offset for the compression value is at 0x0a to 0x0b, and the value is 0x0800, representing the same DEFLATE compression algorithm we discussed immediately after Figure 4.

Figure 7 also shows the compressed size at byte offset 0x14 through 0x17 is 0x45, which translates to 69 bytes. The uncompressed size at byte offset 0x18 through 0x1b is 0x4a, which is 74 bytes. These are the same values as the local file header in Figure 4, but at different byte offsets.

The compressed item's filename is 0x66696c6531, which translates to file1 in ASCII text.

In Figure 7, the central directory file header ends at 0x5b. Unlike a local file header, it is not followed by compressed file content; the next record in the central directory would begin at 0x5c.

In the ZIP archive format used by an APK file, values in the local file header and central directory file header should be consistent with each other. This means that information for a specific item within an APK file like compression method, compressed size and uncompressed size are the same in each header. We saw this when comparing the values for a compressed item named file1 in the example from Figure 4 and Figure 7.

The BadPack technique alters these values for malicious APK files, making a mismatch between the local file header and the central directory file header.
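
A consistency check between the two header types can be sketched in a few lines. The following Python example is a minimal illustration under stated assumptions, not a production detector: it ignores ZIP64 archives, assumes the end of central directory signature does not also occur inside file data or a trailing comment, and the function name find_mismatches and the bogus value 0x1234 are hypothetical. It walks the central directory, reads the local file header each entry points to, and reports entries whose compression method or sizes differ.

```python
import io
import struct
import zipfile

EOCD = struct.Struct("<4sHHHHIIH")              # end of central directory record
CENTRAL = struct.Struct("<4sHHHHHHIIIHHHHHII")  # fixed 46 bytes of a central header
LOCAL = struct.Struct("<4sHHHHHIIIHH")          # fixed 30 bytes of a local header


def find_mismatches(data: bytes) -> list:
    """List entries whose local and central header values disagree."""
    eocd_pos = data.rfind(b"PK\x05\x06")
    if eocd_pos < 0:
        raise ValueError("end of central directory record not found")
    eocd = EOCD.unpack_from(data, eocd_pos)
    total_entries, pos = eocd[4], eocd[6]
    mismatches = []
    for _ in range(total_entries):
        central = CENTRAL.unpack_from(data, pos)
        if central[0] != b"PK\x01\x02":
            raise ValueError("bad central directory file header")
        name_len, extra_len, comment_len = central[10:13]
        name_start = pos + CENTRAL.size
        name = data[name_start:name_start + name_len].decode("utf-8", "replace")
        local = LOCAL.unpack_from(data, central[16])  # field 16: local header offset
        # Compare (compression method, compressed size, uncompressed size).
        central_values = [central[4], central[8], central[9]]
        local_values = [local[3], local[7], local[8]]
        if local[2] & 0x0008:
            # Sizes are deferred to a data descriptor, so compare the method only.
            central_values, local_values = central_values[:1], local_values[:1]
        if central_values != local_values:
            mismatches.append((name, local_values, central_values))
        pos += CENTRAL.size + name_len + extra_len + comment_len
    return mismatches


buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("AndroidManifest.xml", b"<manifest/>" * 20)
clean = buffer.getvalue()

# Overwrite the compression method in the first local header with a bogus value.
tampered = bytearray(clean)
struct.pack_into("<H", tampered, 0x08, 0x1234)

print(find_mismatches(clean))
print(find_mismatches(bytes(tampered)))
```

A mismatch alone does not prove an APK is malicious, but it is a cheap signal that the headers deserve a closer look before running stricter tools.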

Analyzing the BadPack Technique

In a malicious BadPack sample, the authors have tampered with the ZIP structure headers so that analysis tools fail to extract and decode AndroidManifest.xml from the APK. This causes a chain reaction of errors downstream in the static analysis pipeline, and the file cannot be read or fully processed.

Malware authors can manipulate these values in any of the following ways:

  1. Specifying the correct compression method STORE, but accompanied by an invalid compressed size.
  2. Specifying any compression method value that is not DEFLATE, when the actual compression method of the payload is STORE.
  3. Specifying any compression method value in the local file header only, when the actual compression method of the payload is DEFLATE.

Android malware static analysis tools like Apktool or Jadx are generally stricter than the Android system runtime on Android devices. For these analysis tools, an APK sample must adhere to ZIP file format specifications. Therefore, Apktool and Jadx parse both the local file header and central directory file header of the ZIP structure headers in an APK file.

However, Android devices are not as strict about the official file format as these analysis tools. An APK file may contain invalid values that do not fully adhere to the official file format specification, and it may still run. This is because the Android system runtime only inspects the central directory file header. If a value from the local file header does not match, the Android runtime assumes what a correct value should actually be.

It is precisely this difference in behavior that causes analysis tools like Apktool and Jadx to fail to analyze a BadPack APK sample that installs and runs properly without issue on an Android device.

We can successfully analyze BadPack APK samples by reversing these changes to restore the original ZIP structure header values before using APK analysis tools.

Tracing the Android Codebase Implementation

We can trace back the essential implementation responsible for the difference in behavior between malware analysis tools and the Android system runtime to a section of code in the Android framework dealing with extracting content from an APK file.

In code, a method declares input parameters and has a body of instructions that transforms those parameters into a result returned as one or more values.

A method body is much like a recipe in cooking: it is written once and followed every time it is used. Each time the program calls the method at runtime, that function call supplies concrete arguments for the parameters the method declares.

At runtime, invocation of this function with the string "AndroidManifest.xml" as the path argument triggers this code execution path. Figure 8 below outlines key steps of the routine (e.g., omitting error handling), simplified for readability.

Image 8 is a screenshot of the main routine for APK extraction in Android runtime. It includes three steps in total (labeled as comments).
Figure 8. Main routine in Android runtime for APK extraction. Source: The Android Open Source Project.

The logic of the code in Figure 8 consists of the following steps, with the main if-condition line highlighted:

Step 1: The central directory file header of the AndroidManifest.xml entry is retrieved. This succeeds because the header structure is still intact, although certain values have been manipulated.

Step 2: The Compression method field in this header is compared numerically against 8 (DEFLATE). If it matches, the Compressed size field in this header is used to extract the payload data.

Step 3: Otherwise, the payload data is assumed to be merely stored (STORE), and the Uncompressed size field in this header is used for extraction instead.
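
The decision in Steps 2 and 3 can be paraphrased in a few lines of Python. This is a sketch of the logic as described above, not the Android Open Source Project code itself. The function name extract_like_runtime, its parameters and the stand-in payload bytes are assumptions made for illustration.

```python
import zlib

DEFLATE = 8


def extract_like_runtime(archive: bytes, data_offset: int, method: int,
                         compressed_size: int, uncompressed_size: int) -> bytes:
    """Apply Steps 2 and 3 to values taken from a central directory file header."""
    if method == DEFLATE:
        # Step 2: DEFLATE payloads are read using the Compressed size field.
        raw = archive[data_offset:data_offset + compressed_size]
        return zlib.decompress(raw, -15)  # negative wbits = raw DEFLATE stream
    # Step 3: any other method value is treated as STORE, so the
    # Uncompressed size field decides how many bytes to copy.
    return archive[data_offset:data_offset + uncompressed_size]


payload = b"\x03\x00\x08\x00" + b"\x00" * 60  # stand-in bytes, not a real manifest

# Case 1: a genuinely deflated payload with an honest method value.
compressor = zlib.compressobj(9, zlib.DEFLATED, -15)
deflated = compressor.compress(payload) + compressor.flush()
from_deflate = extract_like_runtime(deflated, 0, 8, len(deflated), len(payload))

# Case 2: a stored payload with a bogus method and a bogus compressed size.
from_bogus = extract_like_runtime(payload, 0, 27941, 6042, len(payload))

print(from_deflate == payload, from_bogus == payload)
```

In the second case the bogus compressed size is never consulted, which is why a tampered value there does not stop extraction under this logic.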

We can carry out the following two-part experiment to verify that the code shown in Figure 8 is what handles extraction when an APK sample file is installed onto an Android device:

Part One:

  1. Select an APK file whose "AndroidManifest.xml" payload data is actually compressed by the DEFLATE algorithm
  2. Install the APK file mentioned in Step 1 onto an Android device
  3. It will succeed with the following output message:

Part Two:

  1. Now, with the AndroidManifest.xml entry of the APK file:
    1. Go to the central directory file header
    2. Look for the Compression method field
    3. Modify its 2-byte little-endian integral value to 0 (STORE).
  2. It will now fail installation with the following output message, reporting the reason for failure as a "Corrupt XML binary file" error:

Manifestation of the BadPack Technique

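Malware authors can manipulate an APK file using any of the three methods listed below. In the field listings for each example, where a field shows two values, the first is the tampered value found in the sample and the second is the correction needed for recovery.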

Method 1: Specify the correct compression method STORE, but accompanied by an invalid compressed size.

This breaks analysis tools processing the APK sample file, but the Android device system runtime uses the Uncompressed size field from the central directory file header when the Compression method is STORE. An example is shown below.

SHA-256 hash:
0003445778b525bcb9d86b1651af6760da7a8f54a1d001c355a5d3ad915c94cb
Local File Header - Fields

Compression method = 0 (STORE)

Compressed size = 14417 (tampered), corrected to 41192

Uncompressed size = 41192

Data = \x00\x00\x08\x00 ...

Central Directory File Header - Fields

Compression method = 0 (STORE)

Compressed size = 14417 (tampered), corrected to 41192

Uncompressed size = 41192

Method 2: Specify any compression method value that is not DEFLATE, when the actual compression method of the payload is STORE.

This breaks analysis tools processing the APK sample file, but the Android device system runtime treats the unknown compression method as STORE and reads the Uncompressed size field from the central directory file header. An example is shown below.

SHA-256 hash:
015bd2e799049f5e474b80cbbdcd592ce4e2dfbfae183bada86a9b6ec103e25e
Local File Header - Fields

Compression method = 27941 (tampered), corrected to 0 (STORE)

Compressed size = 6042 (tampered), corrected to 17264

Uncompressed size = 17264

Data = \x00\x00\x08\x00 ...

Central Directory File Header - Fields

Compression method = 38402 (tampered), corrected to 0 (STORE)

Compressed size = 6042 (tampered), corrected to 17264

Uncompressed size = 17264

Method 3: Specify any compression method value in the local file header only, when the actual compression method of the payload is DEFLATE.

This breaks analysis tools processing the APK sample file. However, the Android device system runtime only relies on the fields from the central directory file header to perform its extraction successfully. In this case, the compression method is correctly set as DEFLATE.

SHA-256 hash:
131135a7c911bd45db8801ca336fc051246280c90ae5dafc33e68499d8514761
Local File Header - Fields

Compression method = -2221 (tampered), corrected to 8 (DEFLATE)

Compressed size = 2254

Uncompressed size = 8380

Data = \xad\x58\x39\x73 ...

Central Directory File Header - Fields

Compression method = 8 (DEFLATE)

Compressed size = 2254

Uncompressed size = 8380
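
The corrections in the three examples above follow a single rule that mirrors the runtime logic. The Python sketch below states that rule under the assumption that the central directory header decides how the payload is read; the function name corrected_fields is hypothetical, and a real repair tool would also need to write the values back into both headers.

```python
DEFLATE, STORE = 8, 0


def corrected_fields(cd_method: int, cd_compressed_size: int,
                     cd_uncompressed_size: int) -> tuple:
    """Return (method, compressed size, uncompressed size) to write back
    into both the local and central headers of a tampered entry."""
    if cd_method == DEFLATE:
        # Method 3: the central header is intact, so its values are copied
        # over the tampered local header.
        return DEFLATE, cd_compressed_size, cd_uncompressed_size
    # Methods 1 and 2: the payload is stored, so the method becomes STORE
    # and the compressed size must equal the uncompressed size.
    return STORE, cd_uncompressed_size, cd_uncompressed_size


# Central directory values taken from the three samples listed above.
examples = {
    "Method 1": (0, 14417, 41192),
    "Method 2": (38402, 6042, 17264),
    "Method 3": (8, 2254, 8380),
}
for label, fields in examples.items():
    print(label, fields, "->", corrected_fields(*fields))
```

For each sample, the result matches the corrected values shown in the field listings.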

Android Malware Analysis Tools

This section highlights how the BadPack technique works as an anti-analysis evasion mechanism, focusing on how this manifests in file extractors and Android static analysis tools. Our example uses the APK malware sample with a SHA-256 hash of 90c41e52f5ac57b8bd056313063acadc753d44fb97c45c2dc58d4972fe9f9f21. This example uses Method 2 from the BadPack techniques listed in the previous section.

7-Zip

The file archiver 7-Zip is unable to extract the AndroidManifest.xml file from the APK sample, citing the reason for failure as a "Headers Error" as shown in Figure 9 below.

Image 9 is a screenshot of many lines of code. Highlighted in a red box is the error code where the ZIP program failed to unpack the bundled APK sample.
Figure 9. 7-Zip failed to unpack the BadPack-bundled APK sample (command output created on CodeSnap).

Apktool

Advertised as "a powerful tool designed for reverse engineering Android applications," Apktool can decode an application's resources to nearly their original form. It also allows users to modify the application before rebuilding it.

The error message "Invalid CEN header (bad compression method: 19466)" in Figure 10 below suggests that the APK sample may have been compressed using some nonstandard or proprietary compression method, which Apktool does not recognize.

Image 10 is a screenshot of many lines of code. Highlighted in a red box is the code that notes the failure to decompress the APK sample.
Figure 10. Apktool failed to decompress the APK sample (command output created on CodeSnap).

Jadx

Jadx is another popular reverse engineering tool for Android applications. When attempting to load the APK malware sample into Jadx, it produces the same error message as Apktool, as depicted in Figure 11.

Image 11 is a screenshot of many lines of code. Highlighted in a red box is the line showing the error where Jadx could not process the sample.
Figure 11. Jadx was unable to process the same APK sample (command output created on CodeSnap).

This error message clearly indicates that the APK sample has an issue with its specified compression method. This arises from its author intentionally changing the compression method field value.

JAR

Strictly speaking, an APK sample follows the Java ARchive (JAR) file format specification because it contains the additional META-INF/MANIFEST.MF file on top of the standard ZIP file format requirements. Yet the Java Development Kit's JAR tool cannot extract the AndroidManifest.xml file. Figure 12 illustrates this.

Image 12 is a screenshot of many lines of code. Highlighted in a red box is the line showing the error where JAR could not extract the XML file. Invalid compression method.
Figure 12. Error message showing JAR cannot extract the AndroidManifest.xml file (command output created on CodeSnap).

Unzip

The error message "unsupported compression method 19466" shown in Figure 13 indicates that the Unzip tool does not support or recognize the compression method specified for the AndroidManifest.xml file. This error can occur when files within an archive are compressed using a nonstandard or proprietary compression method, but here it results from the tampered compression method field. All other files in the archive extract or inflate successfully without errors.

Image 13 is a screenshot of many lines of code. Highlighted in a red box is the line showing the error where the Unzip tool could not unpack the XML file. Unsupported compression method 19466.
Figure 13. The Unzip tool cannot unpack the AndroidManifest.xml file (command output created on CodeSnap).

Apksigner

Shipped with the official Android SDK, the Apksigner tool is often used to sign APK files and verify their signatures. However, it fails to verify the signature of the BadPack-bundled APK sample. Figure 14 below shows that Apksigner could not read the AndroidManifest.xml file because of the header tampering.

Image 14 is a screenshot of many lines of code. Highlighted in a red box is the line showing the error where Apksigner could not read the XML file. Data of entry AndroidManifest.xml malformed.
Figure 14. Apksigner failed to read AndroidManifest.xml (command output created on CodeSnap).

apkInspector

While researching this topic, we came across an open-source tool that was able to extract the AndroidManifest.xml file.

First released on Dec. 31, 2023, apkInspector is an open-source tool that provides detailed insights into the low-level ZIP structure of raw APK files. It can also extract APK content and even decode the AndroidManifest.xml file, since the original AndroidManifest.xml file is in a binary, non-human-readable format. We executed this on our APK sample and verified it is indeed capable of both extracting and decoding the AndroidManifest.xml file.

Figure 15 below shows that apkInspector was able to extract AndroidManifest.xml. This is because it can handle tampered DEFLATE or STORE compression methods, as seen in its Python code for extraction.

Image 15 is a screenshot of many lines of code. Highlighted in a red box is the line showing where the binary XML file was successfully extracted. “Extraction successful.”
Figure 15. apkInspector extracting binary AndroidManifest.xml at 17,244 bytes (command output created on CodeSnap).
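
To illustrate why a tolerant extractor succeeds where stricter tools fail, the following Python sketch extracts a single entry while trusting only the central directory file header, in the spirit of the runtime behavior described earlier. It is not apkInspector's code. The function name tolerant_extract is hypothetical, the archive is built in memory with placeholder bytes, and the tampered values are borrowed from the Method 2 example. ZIP64 and archive comments containing the end of central directory signature are not handled.

```python
import io
import struct
import zipfile
import zlib

EOCD = struct.Struct("<4sHHHHIIH")              # end of central directory record
CENTRAL = struct.Struct("<4sHHHHHHIIIHHHHHII")  # fixed 46 bytes of a central header


def tolerant_extract(data: bytes, wanted: str) -> bytes:
    """Extract one entry, trusting only the central directory file header."""
    eocd_pos = data.rfind(b"PK\x05\x06")
    if eocd_pos < 0:
        raise ValueError("end of central directory record not found")
    eocd = EOCD.unpack_from(data, eocd_pos)
    pos = eocd[6]
    for _ in range(eocd[4]):
        central = CENTRAL.unpack_from(data, pos)
        name_len, extra_len, comment_len = central[10:13]
        name_start = pos + CENTRAL.size
        name = data[name_start:name_start + name_len].decode("utf-8", "replace")
        if name == wanted:
            local_offset = central[16]
            # Only the two length fields of the local header are needed
            # to locate the start of the payload.
            local_name_len, local_extra_len = struct.unpack_from(
                "<HH", data, local_offset + 26)
            start = local_offset + 30 + local_name_len + local_extra_len
            method, compressed_size, uncompressed_size = (
                central[4], central[8], central[9])
            if method == 8:  # DEFLATE
                return zlib.decompress(data[start:start + compressed_size], -15)
            return data[start:start + uncompressed_size]  # treated as STORE
        pos += CENTRAL.size + name_len + extra_len + comment_len
    raise KeyError(wanted)


# Build a stored entry, then tamper with both headers in the style of Method 2.
payload = b"\x03\x00\x08\x00" + b"\x00" * 100  # placeholder, not a real manifest
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_STORED) as archive:
    archive.writestr("AndroidManifest.xml", payload)
apk = bytearray(buffer.getvalue())
cd_offset = EOCD.unpack_from(apk, apk.rfind(b"PK\x05\x06"))[6]
struct.pack_into("<H", apk, 8, 27941)               # local header: compression method
struct.pack_into("<I", apk, 18, 6042)               # local header: compressed size
struct.pack_into("<H", apk, cd_offset + 10, 38402)  # central header: compression method
struct.pack_into("<I", apk, cd_offset + 20, 6042)   # central header: compressed size

recovered = tolerant_extract(bytes(apk), "AndroidManifest.xml")
print(len(recovered), recovered == payload)
```

The extracted bytes would still be in Android's binary XML format, so a separate decoding step is needed before the manifest becomes human-readable.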

Conclusion

The increasing number of Android devices presents a growing target and poses a significant challenge in combating malware attacks on the platform. APK files using BadPack reflect the increasing sophistication of APK malware samples. This not only presents a formidable challenge for security analysts, but it also underscores the need for continuous development of innovative techniques and tools to identify and mitigate these threats.

People should be suspicious of Android applications requiring unusual permissions not aligned with their advertised functionality, like an Android flashlight app requesting permissions to access the device's phonebook. We recommend that people also refrain from installing applications that originate from third-party sources onto their devices.

Palo Alto Networks customers receive protection from BadPack APK samples through Next-Generation Firewall with our Cloud-Delivered Security Services, including Advanced WildFire, Advanced DNS Security and Advanced URL Filtering.

If you think you might have been compromised or have an urgent matter, contact the Unit 42 Incident Response team or call:

  • North America Toll-Free: 866.486.4842 (866.4.UNIT42)
  • EMEA: +31.20.299.3130
  • APAC: +65.6983.8730
  • Japan: +81.50.1790.0200

Palo Alto Networks has shared our findings with our fellow Cyber Threat Alliance (CTA) members. CTA members use this intelligence to rapidly deploy protections to their customers and to systematically disrupt malicious cyber actors. Learn more about the Cyber Threat Alliance.

Indicators of Compromise

SHA256 hashes of BadPack malware samples:

  • 0003445778b525bcb9d86b1651af6760da7a8f54a1d001c355a5d3ad915c94cb
  • 015bd2e799049f5e474b80cbbdcd592ce4e2dfbfae183bada86a9b6ec103e25e
  • 131135a7c911bd45db8801ca336fc051246280c90ae5dafc33e68499d8514761
  • 90c41e52f5ac57b8bd056313063acadc753d44fb97c45c2dc58d4972fe9f9f21

Updated July 16, 2024, at 6:40 a.m. PT to update Figure 4.

Updated July 17, 2024, at 6:20 a.m. PT to correct byte numbers in text.
