Exploit Kits (EK), arguably the most impactful malicious infrastructure on the Internet, constantly evolve to evade detection by security technology. Tremendous effort has been spent on tracking new variations of different EK families. In this report, we look at an EK from an operational point of view. Specifically, we have been tracking the activity of the notorious Angler Exploit Kit and have uncovered traces of what we believe to be a large underground industry behind this EK.
Given the numerous existing reports from Sophos, Malwarebytes, and USENIX that cover different variants of Angler, we will focus on the new findings in terms of the global operation of Angler in this work. All of the findings are based on the results of our malicious web content detection system.
Key findings include:
- Detected over 90,000 compromised websites involved in Angler’s operation. Of these, 30 are within the Alexa top 100,000 rankings. We estimate the number of monthly visits to these 30 compromised websites to be over 11 million, based on visit counts from TrafficEstimate.com.
- Discovered a highly organized operation that periodically updates the malicious content across all of the compromised websites and all of the EK gate sites at the same time. This indicates a sophisticated and persistent command and control channel between attackers and compromised websites.
- Discovered fine-grained control over the distribution of malicious content. The injected scripts can stay invisible for days to evade detection, and the compromised websites can choose to target only certain victim IP ranges and certain configurations. This has led to very low detection rates from the scanners used by VirusTotal (VT). Even weeks after our initial discovery, most of the compromised sites we found were not listed as malicious in VT.
- Found potential connections between scanning for vulnerable websites and using the scanned websites as entry points for the EK. This suggests an industry chain behind the operation of this EK.
Overview and Impact
Between November 5 (when we started to scan highly vulnerable websites for similar injections) and November 16, we discovered a total of 90,558 unique domains that had been compromised and used by Angler EK.
The compromised domains resolve to a total of 29,531 unique IPs. Among these, 1,457 IPs hosted more than 10 compromised domains. The IP address 220.127.116.11 hosted a total of 422 compromised sites. Some of the compromised sites were very popular, with 177 domains (30 FQDNs) in the Alexa top 100,000 and 40 in the top 10,000.
Most of the compromised sites remain undetected by VT. We tested early scanning results (5,235 malicious sites discovered at that time) with VT on November 16 and found that VT only reported 226 sites as malicious. At midnight of November 17 we repeated this experiment and VT still only found 232 sites. This amounts to a less than 5% detection rate. On December 14 we tested our full list (all 90,558) against VT and it only found 2,850 compromised sites – a 3% detection rate.
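The detection rates quoted above can be re-derived from the reported counts. The following is a minimal sketch of that arithmetic; the function name `detection_rate` and the experiment labels are ours, and only the numbers come from the text.

```python
def detection_rate(detected: int, total: int) -> float:
    """Return the share of submitted sites flagged as malicious, in percent."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * detected / total


# (label, sites flagged by VT, sites submitted), as reported in the paragraph above
experiments = [
    ("Nov 16, early list", 226, 5235),
    ("Nov 17, early list", 232, 5235),
    ("Dec 14, full list", 2850, 90558),
]

for label, detected, total in experiments:
    print(f"{label}: {detection_rate(detected, total):.1f}%")
```

Both early-list checks land below 5 percent, and the full-list check lands at roughly 3 percent.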
Figure 1: Angler EK Compromise Topology
Figure 1 outlines the redirection flow of a full compromise. Victims visit one of the compromised WordPress/Apache hosts and get redirected to the malicious server hosting the EK, either directly or via a middle layer that is commonly referred to as an ‘EK gate’. The final malicious payload varies and includes ransomware such as Cryptowall, as well as spyware or botnets that connect to a C2 server. A more concrete redirection chain example and its Fiddler packet capture are shown in Figures 2 and 3. The redirection from the EK gate to the malicious file hosting server can happen within the same domain (shown in red in Figure 2) or cross-domain (shown in blue in Figure 2).
Figure 2: Redirection Chain
Figure 3: Fiddler packet captured during redirection (excerpt)
Figure 4 shows some of the post-infection traffic we obtained using Fiddler. In this case the infected VM sent out a C2-like request and received back a long encrypted response.
Figure 4: Post infection traffic
Figures 5 and 6 give an overview of all compromised hosts’ IP information.
Figure 5: Compromised host ISP distribution
Figure 6: Compromised IP country distribution (truncated to show major items)
We can see in the figures that the compromised sites are primarily hosted inside the United States, with a few exceptions in Europe and Asia. Among the systems in the United States, most of the identified sites are hosted on GoDaddy’s infrastructure and a few other popular hosting services.
Angler EK Evasion Techniques
In this section, we will highlight some of the featured behaviors of the malicious scripts that attackers injected into the compromised websites.
Static/Signature analysis evasion
Figure 7: injected script excerpt
Browser emulator evasion
Figure 8: injected script excerpt (cont'd)
|Malicious content trigger condition: zfglugdvsvhpmstz – hladygwivaoha == 2
|“zfglugdvsvhpmstz”(Browser Quirk Testing)
|“hladygwivaoha”(UserAgent pattern match)
|Old IE User
|New IE User
Table 1: Malicious content trigger condition
In this table, we can see that the malware authors target IE users and attempt to avoid security researchers; however, they left out one scenario: a non-IE browser mimicking IE 11. In this scenario the malicious behavior is actually exposed, and this is how we were able to automatically extract a number of next-hop redirections (i.e. EK gate URLs), listed in Table 2.
IP address filter evasion
This URL structure resembles others that were previously disclosed by sources like malware-traffic-analysis.net; however, as mentioned earlier, the way this iframe is injected is entirely different in this campaign compared to their previous mechanism, which simply injected a flash file (<object>) into the HTML code.
It’s not easy to obtain the malicious content of these iframes, because when we visit the compromised URLs from an IP address that belongs to Palo Alto Networks, the attacker’s server either does not respond or returns an empty 200 response. We saw the same results when we used different browsers with different versions, residential IPs in California, and an Amazon EC2 instance. On November 16 we used a proxy service to route our traffic through IP blocks across the world and found that when we used an IP block from Turkey, the server returned the Angler EK’s landing page. The landing page looked similar to previously published ones, and eventually redirected the victim browser to download a Flash file.
It is also interesting to note that many domain names hosting the EK gate pages, like filchnerkunstkring.diversityadvice[.]com or ullshift-vastreden.avimiller.org, have legitimate and benign root level domains diversityadvice.com and avimiller.org. We suspect:
1) that the DNS nameservers of these domains were compromised and a rogue DNS record was created to point the malicious subdomain to the attacker’s server; and
2) that the credentials that allow registering subdomains have been stolen. This kind of DNS compromise is popularly known as domain shadowing.
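One way to surface candidates for this pattern is to group observed EK gate hostnames by their parent domain. The sketch below is only an illustration: it assumes the last two labels of a hostname form the registered domain, which is wrong for suffixes such as `.com.au` (a real implementation would consult the Public Suffix List), and the function names are hypothetical.

```python
from collections import defaultdict


def normalize(hostname: str) -> str:
    """Lower-case a hostname and undo the '[.]' defanging used in this report."""
    return hostname.strip().lower().replace("[.]", ".").rstrip(".")


def registered_domain(hostname: str) -> str:
    """Naive assumption: the last two labels are the registered domain."""
    return ".".join(normalize(hostname).split(".")[-2:])


def group_shadowed_subdomains(gate_hosts):
    """Map each parent domain to the subdomains seen serving EK gate pages."""
    groups = defaultdict(set)
    for host in gate_hosts:
        clean = normalize(host)
        root = registered_domain(clean)
        if clean != root:  # skip the parent domain itself
            groups[root].add(clean)
    return dict(groups)


# The two gate hostnames mentioned in the text above
gates = [
    "filchnerkunstkring.diversityadvice[.]com",
    "ullshift-vastreden.avimiller.org",
]
print(group_shadowed_subdomains(gates))
```

A parent domain that suddenly accumulates many random-looking subdomains resolving to unrelated IPs is a reasonable candidate for manual review.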
In addition to the IP filtering at the EK gate, the compromised hosts appear to serve the malicious redirection scripts using similar IP filtering rules. We initiated requests from two clean machines using different outgoing IP addresses and the same user agent at almost the same time. The machine behind one IP address consistently received a malicious page, while the other received only clean HTML. It is particularly worth noting that the attackers perform IP cloaking adaptively: we used one IP address range to scan the web for compromised sites, and after approximately two weeks of scanning the attacker stopped serving malicious content to these IPs. We suspect that the attackers detected abnormal scanning behavior from the IPs and therefore cloaked themselves to avoid detection.
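The two-vantage-point comparison described above can be approximated by diffing the iframe targets found in each response. This is a rough sketch, not the detection system used in this research: it assumes both responses have already been fetched (and, since the iframe is written by an obfuscated script, rendered) and are available as HTML strings. The class and function names are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class IframeCollector(HTMLParser):
    """Collect the hostnames that <iframe src=...> elements point to."""

    def __init__(self):
        super().__init__()
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        if tag != "iframe":
            return
        src = dict(attrs).get("src")
        if src:
            host = urlparse(src).hostname
            if host:
                self.hosts.add(host)


def iframe_hosts(html: str) -> set:
    """Return the set of iframe target hostnames found in an HTML string."""
    collector = IframeCollector()
    collector.feed(html)
    collector.close()
    return collector.hosts


def cloaking_difference(html_a: str, html_b: str) -> set:
    """Iframe hosts served to one vantage point but not to the other."""
    return iframe_hosts(html_a) ^ iframe_hosts(html_b)
```

A non-empty difference for requests sent at nearly the same time with the same user agent suggests that the response depends on the source IP.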
It also appeared to us that the injected content turns on and off over the duration of our scan. After we discovered this behavior, we picked ten sites and increased their scanning frequency to once every ten minutes. Figure 10 shows the infection status of three of these ten sites over the course of 24 hours. Markings in the top portion indicate that the site’s malicious code was active during that time slot, while markings in the bottom portion indicate the site was benign, or dormant, during that time slot. Nine out of the ten sites appear to share a similar (but not identical) dormant/active pattern, as shown by the orange and blue dots, while the remaining site (www.grillman[.]com.au) shows a somewhat different pattern. We are not sure why the injection behaves this way over time, but our guess is that the malicious code is trying to hide itself and give the website owner or security companies the illusion that the threat has been cleaned up.
Figure 10: Time-based cloaking of compromised hosts – Pacific Standard Time.
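A timeline like the one in Figure 10 can be summarized by collapsing the per-scan samples into runs of consecutive active or dormant observations. The sketch below assumes each scan yields a `(timestamp, is_active)` pair; the sample data is invented for illustration and is not taken from our scans.

```python
from datetime import datetime, timedelta
from itertools import groupby


def status_runs(observations):
    """Collapse (timestamp, is_active) samples into runs of equal status.

    Returns a list of (is_active, first_timestamp, last_timestamp, sample_count).
    """
    ordered = sorted(observations, key=lambda obs: obs[0])
    runs = []
    for status, group in groupby(ordered, key=lambda obs: obs[1]):
        samples = list(group)
        runs.append((status, samples[0][0], samples[-1][0], len(samples)))
    return runs


# Hypothetical samples taken every ten minutes (not real scan data)
start = datetime(2015, 11, 13, 0, 0)
flags = [True, True, True, False, False, True]
samples = [(start + timedelta(minutes=10 * i), flag) for i, flag in enumerate(flags)]

for status, first, last, count in status_runs(samples):
    label = "active" if status else "dormant"
    print(f"{label}: {first:%H:%M} to {last:%H:%M} ({count} scans)")
```

Comparing the run boundaries across sites is one way to check whether their dormant/active patterns line up.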
User Agent-based evasion
Finally, the compromised site sets the user’s cookie the first time the victim visits the site, and never sends the injected code a second time to a browser if it detects the same cookie on subsequent requests. We consider this as one of the many mechanisms to cloak the threat against security researchers that may employ dynamic analysis approaches to visit the compromised sites repetitively.
Detection evasion techniques are crucial to an attacker’s operation, but in time researchers will identify and expose them. To avoid being caught, attackers constantly evolve the compromised sites to further complicate detection and prevention. We list some of the more important changes we observed below.
EK Gate URL evolution
Continuous monitoring of the EK gate URLs (a result of domain shadowing) shows that they change frequently, roughly every half hour to an hour. Our large-scale continuous scanning reveals that, at any given time, almost all compromised sites point to the same next-hop domain, but within roughly half an hour to an hour this domain changes completely, and all infected hosts make the change at approximately the same time. Table 2 shows our scanning results for the source hostnames of the injected iframe URLs. Although we cannot get hold of a compromised host and capture the traffic ourselves, we suspect that this synchronized behavior indicates that malicious C2 server(s) are actively and continuously communicating with the compromised hosts to trigger the switch to new EK gates.
|Approximate switching time
|src host of injected iframe
|2015-11-13 00:38 AM
|2015-11-13 01:48 AM
|2015-11-13 02:17 AM
|2015-11-13 02:48 AM
|2015-11-13 03:50 AM
Table 2: EK gate domain changes
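The switching period can be read directly off the timestamps in Table 2. The sketch below computes the gaps between consecutive switches; the timestamps are copied from the table with the redundant "AM" suffix dropped so that they parse as 24-hour times, and the function name is ours.

```python
from datetime import datetime

# Approximate switching times from Table 2 (24-hour clock)
switch_times = [
    "2015-11-13 00:38",
    "2015-11-13 01:48",
    "2015-11-13 02:17",
    "2015-11-13 02:48",
    "2015-11-13 03:50",
]


def switch_intervals_minutes(timestamps):
    """Return the gaps, in whole minutes, between consecutive timestamps."""
    parsed = [datetime.strptime(t, "%Y-%m-%d %H:%M") for t in timestamps]
    return [int((b - a).total_seconds() // 60) for a, b in zip(parsed, parsed[1:])]


intervals = switch_intervals_minutes(switch_times)
print(intervals)                        # [70, 29, 31, 62]
print(sum(intervals) / len(intervals))  # 48.0
```

The observed gaps range from about half an hour to a little over an hour.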
Although the hostname changes frequently, we are able to confirm using passive DNS data that the IPs these domains resolved to are relatively limited, including 18.104.22.168 and 22.214.171.124. This indicates the attacker is reusing some IP resources behind the subdomain-fluxing mechanism.
In each of the evolutionary steps, almost all compromised sites we had identified presented the update at the same time. This timing provides further evidence that a C2 channel is likely maintained at all times between the attacker and the compromised hosts.
SWF/Binary file evolution
Continuous monitoring of the Flash files served by the EK revealed that they change slightly on a daily basis, and that VT has never seen these samples by the time we obtain them. We submitted the SWF file distributed on November 16 to VT, and the immediate detection score was 3, with low-confidence verdicts such as ‘behaves like Flash Exploit’. On December 3 we requested a rescan of the same file; this time VT gave a score of 11, with many major AV vendors picking up the detection.
In this section we explore some interesting common properties that the compromised hosts share. We demonstrate how we use this information to discover many more compromised websites.
Inferred infection vectors
Generally speaking, we found that the infections fell into two categories, indicating that there may be two infection vectors used to compromise the websites.
1) For a small portion of sites, the malicious script is injected at the very top of the HTML source code, before the opening <html> tag. One example is www.cxda[.]gov.cn. We think this is because the compromised host has an Apache or system-level vulnerability that was exploited by the attacker.
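The placement described in this first category is easy to test for in raw page source. The following sketch flags pages where a script tag appears before the opening `<html>` tag. It is a simple heuristic of our own, not the detection logic used in this research, and it will miss or misjudge unusual markup (for example, pages with no `<html>` tag at all).

```python
import re

HTML_OPEN = re.compile(r"<html\b", re.IGNORECASE)
SCRIPT_OPEN = re.compile(r"<script\b", re.IGNORECASE)


def script_before_html(source: str) -> bool:
    """Return True if a <script> tag occurs before the opening <html> tag."""
    html_match = HTML_OPEN.search(source)
    if html_match is None:
        return False  # no <html> tag, so the heuristic does not apply
    # Only search the portion of the source that precedes <html>
    return SCRIPT_OPEN.search(source, 0, html_match.start()) is not None


injected = "<script>/* obfuscated */</script><!DOCTYPE html><html><head></head></html>"
normal = "<!DOCTYPE html><html><head><script>var a = 1;</script></head></html>"
print(script_before_html(injected), script_before_html(normal))
```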
Based on our injection vector inference, we extended our scan from newly registered domains to two additional categories of websites, greatly increasing the number of detections.
Since the attacker may exploit the same Apache/web server vulnerability on the same machine, we believe the hosts collocated with the known compromised sites have a higher chance to be compromised as well. Many hosting services host multiple websites on the same host and IP address (i.e. virtual hosting). During our daily scan of newly registered domains, we found a large number of compromised sites served by popular hosting services including GoDaddy, and found that some of them share the same IPs. Using passive DNS data, we are able to retrieve a sizable list of likely-vulnerable sites – those that are hosted on the same IPs as ones we already detected. The list contains a total of 82,000 unique domains. Of these, approximately 65,000 domains are actively hosting websites and at least 3,880 of them are compromised.
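The expansion step described above amounts to a join between passive DNS records and the set of already-detected domains. Here is a minimal sketch under the assumption that passive DNS data is available as `(domain, ip)` pairs; the record format, function name, and example values are hypothetical.

```python
from collections import defaultdict


def cohosted_candidates(pdns_records, known_compromised):
    """Return domains that share an IP with at least one known compromised domain.

    pdns_records: iterable of (domain, ip) pairs from passive DNS.
    known_compromised: iterable of domains already detected as compromised.
    """
    domains_by_ip = defaultdict(set)
    for domain, ip in pdns_records:
        domains_by_ip[ip].add(domain)

    compromised = set(known_compromised)
    candidates = set()
    for domains in domains_by_ip.values():
        if domains & compromised:  # this IP hosts a known compromised site
            candidates |= domains - compromised
    return candidates


# Made-up records using documentation IP ranges
records = [
    ("bad.example", "192.0.2.10"),
    ("neighbor-a.example", "192.0.2.10"),
    ("neighbor-b.example", "192.0.2.10"),
    ("unrelated.example", "198.51.100.7"),
]
print(sorted(cohosted_candidates(records, {"bad.example"})))
```

The resulting candidates still have to be scanned; sharing an IP only raises the prior that a site is compromised.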
Based on the high percentage of WordPress websites in the compromised site list, it is highly likely that the attacker is exploiting one or more WordPress vulnerabilities. However, to compromise these websites the attackers would first have to perform some type of reconnaissance. We theorized that the malware sample behavior collected in Palo Alto Networks WildFire could help us discover more of these infected websites. In WildFire scans, we identified many malware samples actively probing vulnerable WordPress sites by requesting their xmlrpc.php file. This file is linked to several previously disclosed vulnerabilities and hazards. We collected such probing behavior from WildFire history, which amounts to a total of 201K unique domains. We determined that 174K URLs were still alive and responding, and of these, our malicious web detection system identified 535 additional compromised sites.
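Extracting probe targets from recorded sample traffic can be sketched as a filter over requested URLs. The example below assumes the sandbox output is available as a flat list of URL strings, which is our simplification and not a description of WildFire's actual data format; the function name and sample URLs are hypothetical.

```python
from urllib.parse import urlparse


def probed_wordpress_hosts(requested_urls):
    """Return the hostnames whose xmlrpc.php endpoint was requested."""
    hosts = set()
    for url in requested_urls:
        parsed = urlparse(url)
        if parsed.hostname and parsed.path.lower().endswith("/xmlrpc.php"):
            hosts.add(parsed.hostname)
    return hosts


# Made-up request log
requests_seen = [
    "http://blog.example.com/xmlrpc.php",
    "http://shop.example.net/wp/xmlrpc.php?rsd",
    "http://blog.example.com/index.php",
    "http://news.example.org/feed/",
]
print(sorted(probed_wordpress_hosts(requests_seen)))
```

The deduplicated hostnames can then be checked for liveness and passed to the scanner.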
Following the success of this scan, we further obtained a large list of websites using WordPress which contained almost 17 million sites and scanned them using the same system. This revealed over 84,000 compromised WordPress websites in total.
Looking at how many websites are being compromised and how quickly their operators detect and remove the infections helps us better understand the lifecycle of Angler EK infections.
First, our daily scan reveals tens to hundreds of new compromised sites that have never previously been detected, as seen in Table 3 and Figure 12. These numbers suggest that this is still a very active threat.
Table 3: Unique new compromised sites detected every day
Figure 12: Unique new and total compromised websites each day.
Before we make any statements about how quickly compromised websites are cleaned up, we would like to point out that the numbers discussed here are upper-bound estimates, because the injected scripts may simply be dormant for a long time. For example, one site, ‘seorewolucja[.]pl’, was first observed as infected on November 6 and followed the on-and-off infection pattern until 05:30 PST on November 15. After that the site remained clean for three days, until the morning of November 18, when the injected script appeared again. Although the injected code looks similar before November 15 and after November 18, we cannot be sure whether the site owner disinfected their system and it was later re-compromised, or whether the infection simply stayed dormant for three days. This demonstrates how long an infection may stay dormant, and that we should not make hasty decisions about whether a site has been cleaned up and patched appropriately to prevent future infections.
To get a rough idea of the cleanup rate and status, we rescanned the entire infected population collected through November 16, a total of 5,234 unique URLs, every six hours. We aggregated the scanning results on November 18, and our system found that 5,002 URLs were still infected at least once during the scanning period, and that 396 sites never showed any infection behavior throughout the scans from noon on November 16 to November 18. Even if we consider all of these sites disinfected, they account for less than 8 percent of the entire infected population. When we checked this number again on November 19, the total number of clean sites had dropped to 377. This means that some of the 396 sites that we thought had been cleaned up were simply dormant from November 16 to 18. Since many of these sites were discovered in early November and were possibly infected even earlier, the scanning results indicate that disinfection is happening very slowly.
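The aggregation described above can be sketched as follows. The per-URL result format (a list of booleans, one per rescan) and the function names are assumptions made for illustration; only the counts 396, 377 and 5,234 come from the text.

```python
def never_infected(scan_results):
    """Return URLs that showed no injection in any rescan of the window.

    scan_results: dict mapping URL -> list of bools, one per rescan
                  (True means the injected script was observed).
    """
    return {url for url, seen in scan_results.items() if not any(seen)}


def clean_share(clean_count: int, population: int) -> float:
    """Share of the infected population that looks clean, in percent."""
    return 100.0 * clean_count / population


# Made-up rescans for three URLs
results = {
    "http://a.example/": [True, False, True],
    "http://b.example/": [False, False, False],
    "http://c.example/": [False, True, False],
}
print(never_infected(results))

# Upper-bound cleanup share using the counts reported above
print(f"{clean_share(396, 5234):.1f}%")  # 7.6%
print(f"{clean_share(377, 5234):.1f}%")  # 7.2%
```

Because a dormant site is indistinguishable from a cleaned one within a short window, extending the window can only shrink the clean set, as the drop from 396 to 377 shows.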
Modern exploit kits are becoming harder to catch as they maneuver to avoid detection by security researchers. Particularly, the Angler EK boasts the following features:
- Cloaking against researchers: The constantly evolving injected scripts are trying their best to identify malware researchers’ sandboxes. They hide their malicious behavior from sandbox/emulated environments. The techniques used include browser fingerprinting using browser quirks, as well as IP and UserAgent
- Frequent evolution and persistent control: Large scale tracking of many compromised domains revealed that the attackers have persistent control over the compromised machines. We saw three major version changes in injected scripts as well as hourly switches of the malicious EK gate domains over the course of one month. These actions cannot occur without continuous control of the compromised hosts. This contradicts the common assumption that the hosts are compromised only at one point and injected malicious code once.
- Growing number of infections: According to our observations, newly compromised sites appear at a consistent rate of over 100 sites per day (this is a lower bound, as we can only scan a limited number of websites per day), while older compromised sites do not seem to be disinfected promptly, if at all. This results in a steady increase in the total number of active compromised sites, and this threat is still a long way from elimination.
Despite these challenges, we also found some consistent behavior patterns and limitations of this attack:
- Suspicious redirections: Although the redirection script may change, the redirection chain stays relatively stable. The EK is always served from a different WordPress-like domain and a flash file is downloaded soon afterwards.
- Infrastructure reuse: Exploiting known WordPress vulnerabilities and weak DNS configurations for domain shadowing may be easy for the attackers, but changing the exploit kit’s hosting server is relatively hard, because it requires the attacker to physically control a new machine or move an existing one. At least for now, we have never seen the attacker serve the actual EK file from a compromised machine, possibly to avoid bandwidth spikes or AV detection on the compromised sites.
We will continue to track down the compromised sites, learn more about modern exploit kits and offer maximum protection for our customers.
Get a list of the compromised domains analyzed in this research.