In March 2023, Unit 42 researchers discovered six malicious packages on the Python Package Index (PyPI) package manager. The malicious packages were intended to steal Windows users’ application credentials, personal data and tracking information for their crypto wallets. The attack was an attempted imitation of the attack group W4SP, which had previously launched several supply chain attacks using malicious packages.
We will discuss the ease with which threat actors can use malicious packages to release malicious code in an open-source ecosystem. The behavior we observed is not an organized campaign planned by an attack group, but most likely an imitator who read technical reports of previous campaigns to execute their own attack. We will walk through a technical analysis of the malicious code and unravel what the threat actor tried to achieve in the attack.
We will describe the indicators used by the Palo Alto Networks Prisma Cloud modules that identified the malicious packages discussed here. Palo Alto Networks customers receive protection against open-source packages containing malicious code via Prisma Cloud.
Related Unit 42 Topics: Cloud, Python, Open Source
New Malicious Packages Discovered in PyPI
The Attack: Custom Package Entry Point
The Second Stage: W4SP Stealer
PyPI as a Convenient Platform for Uploading Malicious Packages
Palo Alto Networks Product Protections
Indicators of Compromise
Malicious packages are software components that were intentionally designed to cause harm to computer systems or the data they process. Such packages can be distributed through various means, including phishing emails, compromised websites or even legitimate software repositories.
Malicious packages can have far-reaching consequences, from surreptitiously pilfering sensitive data to causing system disruptions and even assuming control over entire systems. Moreover, these nefarious packages possess the ability to spread to other interconnected systems, instigating widespread damage and impeding productivity. Thus, it is important to exercise utmost caution when engaging in software downloads and installations, especially when the source is unfamiliar or untrusted.
By remaining vigilant and discerning, users can safeguard their systems and prevent potential harm from threat actors infiltrating their technological environment.
In March 2023, Prisma Cloud researchers discovered six malicious packages on the PyPI package manager targeting Windows users. The malicious packages were intended to steal application credentials, personal data and cryptocurrency wallet information.
Our Prisma Cloud engine, designed to detect malicious PyPI packages, identified several packages with suspicious attributes that were uploaded within a short time frame:
- The packages lacked an associated GitHub repository, which is commonly found with legitimate packages. This could indicate a desire to hide the code from view. This, coupled with a limited number of downloads, further raised our suspicions.
- When executed, the packages performed malicious actions such as collecting sensitive data and sending it to third-party URLs.
- The packages contained a malicious code pattern (as will be demonstrated later), which was detected by our engine.
- Because the author accounts were newly created, had each uploaded only one package, and did not provide supporting information such as links to other projects or to any repository, they were not considered reputable.
- Finally, the usernames of the packages’ author(s) were created within minutes of each other, following a distinctive pattern (e.g., Anne1337, Richard1337). Each of these usernames had uploaded only a single package.
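The indicators listed above can be sketched as a simple checklist over package metadata. This is a minimal illustration only, not the Prisma Cloud engine: the `PackageInfo` record, its field names and the thresholds are all hypothetical, and real metadata would have to be collected separately.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class PackageInfo:
    """Hypothetical metadata record; the field names are illustrative."""
    name: str
    author: str
    author_created: datetime
    uploaded: datetime
    repo_url: Optional[str]
    downloads: int
    author_package_count: int


def suspicion_signals(
    pkg: PackageInfo,
    low_download_threshold: int = 1000,                    # assumed cutoff
    new_author_window: timedelta = timedelta(days=1),      # assumed cutoff
) -> List[str]:
    """Return the names of the metadata indicators that this package triggers."""
    signals = []
    if not pkg.repo_url:
        signals.append("no-repository")
    if pkg.downloads < low_download_threshold:
        signals.append("low-downloads")
    if pkg.uploaded - pkg.author_created <= new_author_window:
        signals.append("new-author")
    if pkg.author_package_count <= 1:
        signals.append("single-package-author")
    return signals
```

No single signal is conclusive on its own, since many legitimate packages are new or lack a repository link. The signals are more useful in combination, alongside the code-level and behavioral indicators.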
The second stage of the attack was similar to previous attacks we have seen by the W4SP attack group. This group specializes in exploiting vulnerabilities in the open-source ecosystem, targeting organizations and spreading malware. Their primary goal is to gain unauthorized access to sensitive information, such as user credentials and financial data. They often use automated tools to scan for vulnerabilities and attempt to exploit them. In addition to traditional attacks, W4SP attackers were also seen executing supply chain attacks.
The Prisma Cloud engine flagged these packages as potentially containing malicious code. Each package was uploaded by a different user, each of whom uploaded only that one package, and each contained a link to a suspicious remote URL from which it tried to download content.
The accounts that uploaded the packages were created just before the upload and had no history of uploading other packages. Each package reached hundreds of downloads before PyPI removed the packages and the fraudulent user accounts, after our research team reported the abuse.
We analyzed the code and tried to track down the package authors. We discovered a pattern in every package author’s username where they used “1337” as a suffix, which suggests that an automated process created those accounts, as shown in Table 1. Figure 1 shows an author page for one of these usernames.
|Package Name||Author||Malicious Link||Number of Downloads|
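The username pattern described above (a shared numeric suffix on accounts created within minutes of each other) can be checked mechanically. The following is a minimal sketch under assumptions: the function name, the ten-minute window and the three-digit minimum suffix length are all illustrative choices, and account creation times would have to come from whatever registry data is available.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Dict, Iterable, List, Tuple


def cluster_by_suffix(
    accounts: Iterable[Tuple[str, datetime]],
    window: timedelta = timedelta(minutes=10),  # assumed threshold
    min_size: int = 2,
) -> Dict[str, List[str]]:
    """Group usernames that share a trailing run of digits and were all
    created within `window` of each other."""
    groups = defaultdict(list)
    for username, created in accounts:
        match = re.search(r"(\d{3,})$", username)
        if match:
            groups[match.group(1)].append((created, username))

    flagged = {}
    for suffix, members in groups.items():
        members.sort()  # oldest account first
        span = members[-1][0] - members[0][0]
        if len(members) >= min_size and span <= window:
            flagged[suffix] = [name for _, name in members]
    return flagged
```

Given accounts such as Anne1337 and Richard1337 created a few minutes apart, the function groups them under the suffix "1337". Accounts without a numeric suffix, or with a suffix that no other account shares, are ignored.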
The attack was similar to previous attacks we have seen by the W4SP attack group, which were covered in detail in the Bleeping Computer article, “Devs targeted by W4SP Stealer malware in malicious PyPi packages.” These similarities led us to believe this was a copycat attack.
There were aspects of this case that were less complex than we would expect for a true W4SP attack, for example:
- There was no targeting of any organization.
- The malicious packages did not use typosquatting for common popular packages, as expected for W4SP attacks.
- The second phase of the attack was not encrypted. In true W4SP attacks, this phase was encrypted, which made detection more difficult.
- Most of the code used by W4SP for previous attacks was already available for download, so it could be easily accessed and repurposed.
In the previous attack, W4SP Stealer was delivered as a second stage payload downloaded from a free file hosting service, which allowed the attacker to avoid detection at the package repository itself.
Rather than containing overtly malicious code, these packages were crafted to have specific entry points that would be triggered during the installation or execution process. By leveraging free file hosting services and employing custom entry points, the attackers aimed to remain undetected, posing a significant challenge to security professionals and researchers tasked with detecting and mitigating such threats.
These attacks are easy to implement and can be launched with little security expertise. However, they can be very effective, as the installation process automatically executes the attack as opposed to a threat actor needing to launch the attack when the imported module is used.
When a software developer wants to use a Python package, they have to perform two actions: install the package, and then import it in their code. As we discuss in the next section, the attack code starts in the installation file (setup.py), which means that the attack is carried out as soon as the package is installed, before any import takes place.
In the case we investigated, the attacker called themselves @EVIL$ STEALER. However, the attacker’s name changed with every attack. Below is the collection of names, as found in the code signature:
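Because setup.py runs at install time, one cautious habit is to read it before installing anything. The sketch below is an illustration under assumptions: it presumes the source distribution archive was already fetched as raw bytes without invoking an installer, and that setup.py sits directly under a single top-level directory, as is conventional. The helper name `read_setup_py` is hypothetical.

```python
import io
import tarfile
from typing import Optional


def read_setup_py(sdist_bytes: bytes) -> Optional[str]:
    """Return the text of setup.py from a tar-based source distribution
    without extracting to disk or executing anything."""
    with tarfile.open(fileobj=io.BytesIO(sdist_bytes), mode="r:*") as archive:
        for member in archive.getmembers():
            # Expect a layout like "<name>-<version>/setup.py".
            parts = member.name.split("/")
            if member.isfile() and len(parts) == 2 and parts[1] == "setup.py":
                extracted = archive.extractfile(member)
                return extracted.read().decode("utf-8", errors="replace")
    return None
```

The returned text can then be reviewed by hand or passed to a static check. The archive is only read in memory, so none of the package's code runs.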
- ANGEL Stealer
- Celestial Stealer
- Fade Stealer
- Leaf $tealer
- PURE Stealer
- Satan Stealer
- @skid Stealer
The setup.py file was the same in all packages and contained the following code snippet that downloads content from a remote URL before executing it:
Figure 2 (above) shows the following activities:
- The attacker uses the _ffile object to create a temporary file, and the contents of the file are written using the write method.
- The usage of a temporary file with the NamedTemporaryFile function is a well-known technique to hide malicious code from detection by antivirus software or other security measures.
- The code written to the temporary file downloads the contents of a URL using the urlopen function from the urllib.request module, and then executes those contents using the exec function.
- After the contents of the temporary file have been written, the file is closed and an attempt is made to execute it using the start command in the system shell. If this is successful, the setup function is called to create the package.
- The attacker then uses the start command to start the Python engine's executable (pythonw.exe).
- This executable then runs the script file that is passed to it as a parameter. The malicious package targeted Windows users, and if the script file is not signed, it will not be subject to SmartScreen (a Windows security feature that detects and prevents the execution of potentially malicious files) or to signature checking.
- This means it will execute malicious code on a target computer, even if the target computer has SmartScreen and signature checking enabled.
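The pattern described in this list (temporary file, remote download, execution) can be looked for statically, without running the file. The following is a minimal sketch and not the detection logic used by Prisma Cloud. The set of risky names, the function names and the string-token check are our own assumptions. The sketch resolves import aliases such as `_ffile`, and it also inspects string and bytes literals, because a payload may be written to the temporary file as a literal rather than called directly.

```python
import ast
from typing import Dict, Set

# Call names treated as risky inside an installation script (an assumption).
RISKY = {"exec", "eval", "urlopen", "NamedTemporaryFile", "system", "Popen"}


def _alias_map(tree: ast.AST) -> Dict[str, str]:
    """Map each imported local name back to the name it was imported from."""
    aliases = {}
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                original = alias.name.split(".")[-1]
                aliases[alias.asname or original] = original
    return aliases


def risky_calls(source: str) -> Set[str]:
    """Return the risky names that are called in, or embedded in literals of,
    the given source. The source is parsed, never executed."""
    tree = ast.parse(source)
    aliases = _alias_map(tree)
    found = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Name):
                name = aliases.get(func.id, func.id)
            elif isinstance(func, ast.Attribute):
                name = func.attr
            else:
                continue
            if name in RISKY:
                found.add(name)
        elif isinstance(node, ast.Constant) and isinstance(node.value, (str, bytes)):
            value = node.value
            text = value.decode("utf-8", "replace") if isinstance(value, bytes) else value
            # Code hidden inside a literal will not appear as a Call node.
            for token in ("urlopen", "exec("):
                if token in text:
                    found.add(token.rstrip("("))
    return found


def looks_like_download_and_run(source: str) -> bool:
    """True when the source both fetches remote content and executes something."""
    found = risky_calls(source)
    return "urlopen" in found and bool(found & {"exec", "eval", "system", "Popen"})
```

This is a heuristic: a docstring that merely mentions urlopen would also match, and obfuscated code can evade it. A hit is a reason for manual review rather than proof of malice.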
According to our research, for the second stage, the attacker used a configured version of W4SP Stealer 1.1.6. This version is similar to previous ones where the code imports several libraries, including requests, Crypto.Cipher, json and sqlite3. Then, it uses various techniques to extract and decrypt stored browser credentials, including passwords and cookies, and sends this information to a Discord webhook.
This stage was similar across all of the malicious packages found. The W4SP Stealer is well known to the cybersecurity community, so this section gives only a short overview.
The main body of the code defines a class DATA_BLOB, which is used to store data for the CryptUnprotectData function. This function decrypts a value protected by the Windows Data Protection API (DPAPI), which is used to protect sensitive data such as passwords and API keys. The code attempts to decrypt a value using the CryptUnprotectData and DecryptValue functions, and then sends it to a remote server via a Discord webhook, as shown in Figure 3.
The figures below show several examples from the malicious code in which the attacker tries to collect information about the victim. In Figure 4, the attacker tries to collect information about the victim including IP address, username, country and country code.
In Figure 5, the attacker interacts with the Discord API to retrieve a user's friend list and extract information about the badges they own.
Later in the code, they try to use a Discord webhook prepared in advance (shown in Figure 6) and then try to send the victim information via HTTP requests to the associated Discord channel.
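Because the exfiltration channel is a Discord webhook, hard-coded webhook URLs are a useful indicator to pull out of a suspicious script. The sketch below is our own illustration: the regular expression is an assumption based on the commonly seen webhook URL shape (host, `/api/webhooks/`, numeric ID, token), and it is not an official or exhaustive pattern.

```python
import re
from typing import List

# Assumed URL shape: https://discord.com/api/webhooks/<numeric id>/<token>,
# including the older discordapp.com host and optional canary/ptb subdomains.
WEBHOOK_RE = re.compile(
    r"https://(?:(?:canary|ptb)\.)?discord(?:app)?\.com"
    r"/api/(?:v\d+/)?webhooks/\d+/[A-Za-z0-9_-]+"
)


def find_discord_webhooks(text: str) -> List[str]:
    """Return the unique Discord webhook URLs found in `text`, sorted."""
    return sorted(set(WEBHOOK_RE.findall(text)))
```

Any URL found this way can be reported for takedown and added to blocklists. An empty result does not clear a script, since the URL may be assembled at runtime or obfuscated.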
Finally, as shown in Figure 7, the attacker tries to verify whether the victim's machine is suitable for carrying out the attack. If it’s determined that the machine is suitable, the DETECTED variable will be set to true and the information from the victim's machine will be sent to the remote server.
PyPI is a widely used repository that hosts a very large number of Python packages. That popularity has also attracted threat actors who try to exploit its large user base by distributing malicious packages.
This trend poses a serious challenge, because the decentralized nature of PyPI makes monitoring for and detecting these packages difficult. Falling victim to such a package can have severe consequences for users and enterprises alike, such as data, credential or crypto theft. Strengthening the security of PyPI's package ecosystem is therefore critical, and it requires robust countermeasures against this persistent threat.
Given this, it was not surprising that on May 20, 2023, PyPI announced that it had temporarily suspended new user registration and the upload of new projects due to a concerning rise in malicious activity, malicious users and malicious projects on the platform. No specific information about the nature of the malware or the threat actors involved has been disclosed at this time.
This decision to freeze new user and project registrations reflects the ongoing challenges software registries like PyPI face, as they have become attractive targets for attackers seeking to compromise developer environments and tamper with the software supply chain.
The rise and prevalence of open-source software and the proliferation of package managers have made it easier than ever for attackers to slip these dangerous packages into unsuspecting systems. With software becoming increasingly pervasive in our daily lives, the threat of malicious packages has become more significant. Attackers can disguise these packages as legitimate software and exploit vulnerabilities in unsuspecting systems, causing significant damage such as data theft, system shutdown and network control.
To combat this threat, software developers and organizations must prioritize software security in their development process. Implementing robust security measures, such as code reviews, automated testing and penetration testing, can help identify and remediate vulnerabilities before deployment. Additionally, staying up to date with security patches and updates can prevent attackers from exploiting known vulnerabilities.
In addition to technical measures, increasing awareness and education around software security can also help mitigate the risk of malicious packages. Regular training for developers, IT staff and end users on best security practices (e.g., password hygiene and suspicious email identification) can help prevent successful attacks. The collective effort of security professionals, developers and end users is necessary to ensure that malicious packages are identified and prevented from causing harm to systems and networks.
Publishing technical reports about malware and malicious packages also plays a critical role in advancing our understanding of these threats and improving our ability to protect against them. Technical reports provide an in-depth analysis of the behavior and functionality of malware, which helps researchers and security professionals develop more effective countermeasures. Additionally, sharing technical details about malware can help organizations stay up to date with the latest threats and vulnerabilities, allowing them to improve their security posture and reduce the risk of cyberattacks.
Prisma Cloud Runtime Protection Solution
Prisma Cloud runtime defenses are a configurable set of features that provide both predictive and threat-based protections for cloud virtual machines, containers and serverless functions. Predictive protections include capabilities like determining when a cloud or container instance executes an unknown process or creates an unexpected network socket.
Threat-based protections include capabilities like detecting when malware is added to a workload, or when a workload connects to a botnet. This detection of malicious binaries is made possible through the integration of the Palo Alto Networks WildFire cloud-based malware protection engine. These runtime incidents can also be detected predeployment, using the Image Analysis Sandbox.
Prisma Cloud Malicious Package Solution
Prisma Cloud offers an advanced end-to-end cybersecurity solution that encompasses a comprehensive suite of proactive measures designed to safeguard Prisma customers against a wide array of potential attacks, mitigating risks well in advance of their public exposure.
At the heart of this defense system lies the Prisma CVE Viewer, a resource that operates in real time and is continuously updated to reflect the latest threat landscape.
Incorporating a unique identification system known as Prisma-IDs, this viewer serves as a central repository for public vulnerabilities meticulously unearthed by our diligent research team. Leveraging this proprietary identifier, it bridges the gap for vulnerabilities that have yet to receive a Common Vulnerabilities and Exposures (CVE) ID, ensuring no security gaps go unnoticed or unaddressed. Moreover, the Prisma CVE Viewer diligently archives any malicious packages discovered by our expert researchers, empowering Prisma customers with unparalleled visibility into potential threats lurking within their software supply chain.
As shown in Figure 8, by arming customers with Prisma-IDs and equipping them with preemptive protection, Prisma Cloud protects against possible exploits before they have an opportunity to materialize. This robust security approach serves as a formidable deterrent against supply chain attacks, proactively thwarting threats and preserving the integrity and resilience of our customers' systems and infrastructure.
On top of that, the new Prisma Cloud CI/CD Security module (shown in Figure 9) offers comprehensive protection against emerging methods of executing code in CI pipelines and developers’ environments through malicious software dependencies (e.g., dependency confusion, typosquatting and compromised known packages). This is done by analyzing the different settings, configurations and interconnections of the various systems across the software delivery pipelines, such as the version control and CI systems.
Cortex EDR analytics
Cortex XDR provides EDR analytics capabilities on top of Cortex XDR agents, enabling analytics-based detection of command and control (C2) activity from malicious Python packages at runtime. An example of EDR analytics C2 coverage is discussed in our Cortex XDR technical documentation. See the Variant subsection “UNIX LOLBIN process connected to a rare external host” under the "A process connected to a rare external host" section.
Updated July 14, 2023, at 9:15 a.m. PT to add more context to the code in Figure 2.