
Executive Summary

In May 2021, Palo Alto Networks launched a proactive detector employing state-of-the-art methods to recognize malicious domains at the time of registration, with the aim of identifying them before they are able to engage in harmful activities. The system scans newly registered domains (NRDs) and detects potential network abuses. However, the proactive detector has limitations. Because it was built to focus only on new domains, it cannot obtain and analyze malicious indicators that appear after a domain's creation. In addition, when adversaries leverage or compromise aged domains to carry attack traffic, the proactive detector fails to capture the emerging threats because those domains no longer qualify as NRDs.

In addition to scanning for potential abuses at the time of registration, we have another opportunity to detect malicious domains proactively: the moment they start carrying attack traffic. A malicious domain may be registered long before it is used in an attack campaign and exposes indicators of abuse. Once the domain starts carrying malicious traffic, we can observe its DNS requests in passive DNS. To block network threats at this early stage, we developed a new proactive detector that ingests newly observed domains (NODs) to discover potential threats among them. The new detector leverages various machine learning techniques to expose suspicious behaviors based on information about NODs, including their latest WHOIS records and DNS traffic.

This blog will illustrate how we collect and analyze the enriched features available for NODs to detect emerging threats. Our detector scans 2.6 million NODs and captures around 2,300 suspicious domains every day. To evaluate the performance, we cross-checked the detected domains against other threat intelligence from VirusTotal. Of the NODs detected by our system, 33.08% were later labeled as malicious by other sources as well, and our detector discovered them on average 4.79 days earlier than any VirusTotal vendor. Furthermore, we will explain the new system's benefits with case studies about various network abuses such as command and control (C2), phishing and unethical search engine optimization (SEO) practices. We will discuss how the proactive detector captured and blocked these threats based on different indicators of cybercriminal activity.

Once the proactive detector captures a potentially harmful domain, the knowledge is distributed from DNS Security to other Palo Alto Networks Cloud-Delivered Security Services, including URL Filtering and WildFire.


Detection Methods and High-level Statistics

Palo Alto Networks collects passive DNS data from multiple sources, including our DNS Security service, as well as external providers from all around the world. Our cloud-based passive DNS system can ingest and process about 13 million DNS logs each day. The data ingestion pipeline catches the latest DNS data every hour and extracts the domains that haven't been seen carrying traffic before. These domains will be forwarded to the proactive detector to identify emerging threats. Our system can capture and scan about 2.6 million NODs daily.
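The extraction step described above can be sketched roughly as follows. This is a minimal illustration, not the production pipeline: the function name `extract_nods`, the in-memory `seen` set and the assumption that each log entry has already been reduced to its registered domain are all hypothetical.

```python
from typing import Iterable, List, Set


def extract_nods(hourly_domains: Iterable[str], seen: Set[str]) -> List[str]:
    """Return domains from this batch that were never observed before.

    `seen` stands in for whatever persistent store tracks known domains
    (an assumption for this sketch); it is updated in place.
    """
    nods = []
    for domain in hourly_domains:
        # Normalize case and drop the trailing dot of fully qualified names.
        name = domain.strip().lower().rstrip(".")
        if name and name not in seen:
            seen.add(name)
            nods.append(name)
    return nods


# Illustrative batch using reserved example names, not real traffic.
seen_domains = {"example.com"}
batch = ["Example.com.", "new-shop.example", "new-shop.example", "another.example"]
print(extract_nods(batch, seen_domains))  # ['new-shop.example', 'another.example']
```

Each domain is reported once, in the first batch where it appears, so a downstream detector would scan it only at the moment it becomes newly observed.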

Number of newly observed domains shown in blue bars. Detection percentage shown in red line. Period covered is April 27, 2022-May 23, 2022
Figure 1. Daily NOD amount and the percentage flagged as suspicious.

For each NOD, our centralized data collector will actively crawl all related information, including the latest WHOIS record and all DNS traffic requesting the domain and its subdomains. To leverage a variety of malicious indicators, we developed individual machine learning models to analyze different information. Specifically, we built a reputation system to evaluate WHOIS records, applied multiple classification models to DNS-related features, and used a bigram model to analyze hostnames. These models captured about 2,323 unique potentially malicious NODs every day. Figure 1 shows the daily NOD amount and detection rate from April 27-May 23, 2022.
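To make the hostname-analysis idea more concrete, here is a minimal sketch of a character-bigram scorer. The post does not describe how the actual bigram model is built, so the class name `BigramScorer`, the add-one smoothing and the tiny training list are assumptions chosen for illustration only.

```python
import math
from collections import Counter
from typing import Iterable


class BigramScorer:
    """Character-bigram model trained on names assumed to be benign."""

    def __init__(self, benign_names: Iterable[str]):
        self.pair_counts = Counter()
        self.first_counts = Counter()
        self.alphabet = set()
        for name in benign_names:
            padded = f"^{name.lower()}$"  # mark start and end of the name
            self.alphabet.update(padded)
            for a, b in zip(padded, padded[1:]):
                self.pair_counts[(a, b)] += 1
                self.first_counts[a] += 1

    def score(self, name: str) -> float:
        """Average log-probability per bigram; lower means less familiar."""
        padded = f"^{name.lower()}$"
        vocab = len(self.alphabet) + 1  # one extra slot for unseen characters
        pairs = list(zip(padded, padded[1:]))
        total = 0.0
        for a, b in pairs:
            # Add-one smoothing keeps unseen bigrams from scoring -infinity.
            p = (self.pair_counts[(a, b)] + 1) / (self.first_counts[a] + vocab)
            total += math.log(p)
        return total / len(pairs)


# Toy training set; a real model would need a far larger benign corpus.
scorer = BigramScorer(["google", "paypal", "microsoft", "amazon", "apple", "online", "payment"])
print(scorer.score("payment") > scorer.score("xqzkvjw"))  # True
```

Names made of rarely seen character pairs receive a lower score, which is the kind of lexical signal a classifier could combine with WHOIS and DNS features.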

Number of days between when a domain is considered a newly registered domain (NRD) and when it is considered a newly observed domain (NOD). The graph shows the cumulative distribution function of their dormant periods.
Figure 2. Malicious NOD Dormant Period CDF.

To analyze attackers' behaviors, we compare the registration date of potentially malicious NODs with the date when they start hosting DNS traffic to see how long they stay silent before activation. Figure 2 presents the cumulative distribution function (CDF) of their dormant periods. The malicious domains start carrying traffic 5.57 days after their registration on average. However, during the period studied, our detector captured 152 NODs involved in network abuse more than one year after creation. Some domains can lie dormant for a significant amount of time before beginning malicious activity.
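The dormant-period measurement can be sketched as below: subtract the registration date from the first-observed date, then build an empirical CDF over the results. The function names and the sample values are hypothetical and are not taken from the data behind Figure 2.

```python
from datetime import date
from typing import List, Tuple


def dormant_days(registered: date, first_seen: date) -> int:
    """Days between registration and the first observed DNS traffic."""
    return (first_seen - registered).days


def empirical_cdf(values: List[int]) -> List[Tuple[int, float]]:
    """Return (value, fraction of samples <= value) for each distinct value."""
    ordered = sorted(values)
    n = len(ordered)
    cdf: List[Tuple[int, float]] = []
    for i, v in enumerate(ordered, start=1):
        if cdf and cdf[-1][0] == v:
            cdf[-1] = (v, i / n)  # repeated value: keep the highest fraction
        else:
            cdf.append((v, i / n))
    return cdf


# Made-up dormant periods, in days, for four domains.
print(empirical_cdf([0, 0, 3, 400]))  # [(0, 0.5), (3, 0.75), (400, 1.0)]
print(dormant_days(date(2022, 4, 23), date(2022, 5, 22)))  # 29
```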

The cumulative distribution function of how many days before any vendor of VirusTotal the proactive detector was able to flag malicious domains among newly observed domains.
Figure 3. Early discovery time CDF.

Of all suspicious NODs detected by the new proactive system, 33.08% were labeled as confirmed malicious 30 days later by Palo Alto Networks or other threat intelligence vendors in VirusTotal. Figure 3 shows the CDF of how many days before any vendor on VirusTotal the proactive detector was able to flag malicious domains. On average, our detector can capture these malicious domains and isolate their traffic 4.79 days before any VirusTotal vendor blocks them. Furthermore, we can discover 19.47% of malicious NODs more than a week earlier than others.
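The two summary figures in this paragraph (average lead time and the share of domains flagged more than a week early) are simple aggregates over per-domain lead times. A small sketch, with hypothetical function names and invented sample values:

```python
from datetime import date
from statistics import mean
from typing import List, Tuple


def lead_days(our_detection: date, first_vendor_detection: date) -> int:
    """Days by which our detection preceded the earliest vendor verdict."""
    return (first_vendor_detection - our_detection).days


def summarize_leads(leads: List[int]) -> Tuple[float, float]:
    """Return (average lead in days, share of leads longer than 7 days)."""
    over_a_week = sum(1 for d in leads if d > 7)
    return mean(leads), over_a_week / len(leads)


# Invented lead times for four domains, not the measurements behind Figure 3.
average, share = summarize_leads([1, 2, 8, 9])
print(average, share)  # 5 0.5
```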

Broader Visibility Into Emerging Internet Threats

One major benefit of our new proactive malicious NOD detector is that it extends visibility into emerging attack domains. The previous proactive detector scans for threats only among NRDs. However, not all top-level domains (TLDs) disclose their new domains to the public. For example, hundreds of country-level TLDs are maintained by governments, and access to their complete domain lists or WHOIS databases is restricted.

Let's take a malicious domain within the .ga TLD, for instance. Our proactive detector captured and labeled the NOD payment-downlaods[.]ga as grayware on March 4. .ga is the country code TLD for Gabon. This TLD offers free domain registration, but its domains’ creation dates are not available in the WHOIS records. Therefore, we cannot directly confirm .ga NRDs based on the registration information. Monitoring passive DNS data is the primary way to detect recently active .ga domains. We caught payment-downlaods[.]ga carrying C2 traffic 12 days after we first observed its DNS traffic. The domain served Android Package Kit (APK) spyware that attempted to steal private information including SMS messages (SHA256: e9ad04ae0201307e061cdae350c392a6b4537876991b2c97857ea71086fa0496).

Besides textual characteristics, the WHOIS record is another important feature that can be used for proactive malicious domain detection. It can expose various network abuse warning signs such as registrants, registrars and name servers. Our malicious NOD detector will actively crawl new domains’ WHOIS records for analysis once we observe their DNS traffic.

For example, our detector blocked a phishing domain within the .ml TLD as soon as we observed it in passive DNS data on May 10. The centralized WHOIS database for .ml is not publicly available, so the detectors focusing on NRDs failed to inspect this domain. However, once it began to carry traffic, the proactive malicious NOD detector crawled its WHOIS record and found the name server is offshoreracks[.]com, which provides an offshore and anonymous hosting service. Besides this questionable name server, the NOD's registrar also has a bad reputation. The NOD is a squatting domain mimicking a major international banking group based in Italy. The phishing website copied the text from the official site but with fake contact information. Interestingly, despite mimicking an Italian bank, the website uses Turkish, so it's likely intended to target Turkish victims.
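One simple way to think about a WHOIS reputation signal like the name server and registrar reputation mentioned above is a smoothed abuse rate per attribute value. The sketch below is an assumption-laden illustration: the class `AttributeReputation`, its prior parameters and the sample name server are hypothetical, and the post does not describe how the actual reputation system is computed.

```python
from collections import defaultdict


class AttributeReputation:
    """Smoothed share of known-malicious domains per WHOIS attribute value."""

    def __init__(self, prior_rate: float = 0.01, prior_weight: float = 10.0):
        # The prior pulls rarely seen values toward a baseline abuse rate.
        self.prior_rate = prior_rate
        self.prior_weight = prior_weight
        self.total = defaultdict(int)
        self.malicious = defaultdict(int)

    def observe(self, value: str, is_malicious: bool) -> None:
        key = value.lower()
        self.total[key] += 1
        self.malicious[key] += int(is_malicious)

    def risk(self, value: str) -> float:
        key = value.lower()
        bad = self.malicious.get(key, 0) + self.prior_rate * self.prior_weight
        seen = self.total.get(key, 0) + self.prior_weight
        return bad / seen


# Hypothetical history: 90 malicious and 10 benign domains on one name server.
name_servers = AttributeReputation()
for i in range(100):
    name_servers.observe("ns1.bad-hosting.example", is_malicious=i < 90)

print(round(name_servers.risk("ns1.bad-hosting.example"), 3))  # 0.819
print(round(name_servers.risk("ns.never-seen.example"), 3))    # 0.01
```

A heavily abused name server scores close to its observed abuse rate, while an unseen one falls back to the prior, so a single new observation cannot swing the score.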

Capture More Malicious Indicators

Unlike the WHOIS record that is available once a domain is created, some indicators for cybercriminal activities will only be exposed after a malicious domain starts carrying attacking traffic. Therefore, our detector also analyzes the DNS traffic of NODs to capture any suspicious behaviors.

Example of a scam page hosted on a DGA subdomain that asks for notification permissions. The presence of a large number of subdomains produced by DGAs can be an indicator that a newly observed domain is suspicious.
Figure 4. Black hat SEO page hosted on DGA subdomain of twtyowq[.]tk.

One of the abnormal DNS traffic patterns that is highly related to network threats is the presence of a large number of subdomains produced by domain generation algorithms (DGAs). Attackers could use these subdomains to exfiltrate stolen information or perform black hat search engine optimization (SEO) with wildcard DNS. Our proactive detector leverages this indicator to identify potentially abused NODs.

Let's take a pop-up advertising campaign that we detected as an example. This campaign was distributed through .tk domains such as twtyowq[.]tk, bsdybwo[.]tk and bwafduj[.]tk. The creation time of .tk domains is unavailable so we cannot obtain any NRDs under this zone. However, when we first saw these domains' DNS traffic, each of them had hundreds of DGA subdomains hosting scam pages asking for notification permission and redirecting visitors to unwanted ads (see Figure 4).
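As a rough sketch of the "many DGA subdomains" indicator, the snippet below counts distinct random-looking labels directly under a domain using Shannon entropy. The length and entropy thresholds, the function names and the sample hostnames are assumptions for illustration. A production DGA classifier would likely use richer features.

```python
import math
from collections import Counter
from typing import Iterable


def shannon_entropy(label: str) -> float:
    """Shannon entropy of the characters in a label, in bits per character."""
    counts = Counter(label)
    n = len(label)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def count_random_looking_subdomains(
    hostnames: Iterable[str],
    domain: str,
    min_len: int = 8,          # hypothetical threshold
    min_entropy: float = 3.0,  # hypothetical threshold
) -> int:
    """Count distinct high-entropy labels sitting directly under `domain`."""
    suffix = "." + domain.lower()
    labels = set()
    for host in hostnames:
        host = host.lower().rstrip(".")
        if host.endswith(suffix):
            label = host[: -len(suffix)].split(".")[-1]
            if len(label) >= min_len and shannon_entropy(label) >= min_entropy:
                labels.add(label)
    return len(labels)


# Invented hostnames under a reserved example domain.
observed = [
    "x7k2q9vbz1.nod.example",
    "p0m4zr8tqa.nod.example",
    "www.nod.example",
    "x7k2q9vbz1.nod.example",
]
print(count_random_looking_subdomains(observed, "nod.example"))  # 2
```

A domain whose count reaches the hundreds on the first day it is observed, as with the .tk campaign above, would stand out against domains that only serve a handful of conventional hostnames.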

Traffic from gateway domains such as jxc786[.]com will reach this gambling website, but if a visitor opens the subdomain directly, they are redirected to a search engine for cloaking.
Figure 5. Gambling website hosted on jxc786[.]com.

Besides explicit DGA subdomains, our detector digs deeper into NODs' DNS logs to expose DGA traffic hidden behind them. For example, the domain jxc786[.]com was registered on April 23, but we didn't see any evidence of cybercriminal activity at that time. However, it started pointing to b136jishiang01hy.bakbitionb[.]com on May 22. bakbitionb[.]com is the infrastructure domain for a gambling campaign. In passive DNS, this domain has hundreds of DGA subdomains associated with different gateway domains through CNAME records. If visitors directly open these DGA subdomains in their browsers, they will be redirected to baidu[.]com for cloaking. However, traffic from gateway domains such as jxc786[.]com will reach the gambling website shown in Figure 5.
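The gateway-to-infrastructure relationship described above can be illustrated by grouping CNAME records by the domain they point to. This is a simplified sketch: `naive_registered_domain` just keeps the last two labels (a real system would consult the Public Suffix List), and the threshold and sample records are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def naive_registered_domain(host: str) -> str:
    """Keep the last two labels; an assumption that ignores multi-label suffixes."""
    return ".".join(host.lower().rstrip(".").split(".")[-2:])


def find_shared_infrastructure(
    cname_records: Iterable[Tuple[str, str]], min_gateways: int = 3
) -> Dict[str, List[str]]:
    """Map infrastructure domains to the gateway domains that CNAME into them.

    Only infrastructure domains reached by at least `min_gateways` distinct
    gateways through at least `min_gateways` distinct subdomains are returned.
    """
    gateways = defaultdict(set)
    targets = defaultdict(set)
    for source, target in cname_records:
        src = naive_registered_domain(source)
        infra = naive_registered_domain(target)
        if src == infra:
            continue  # a CNAME inside the same domain is not a gateway relation
        gateways[infra].add(src)
        targets[infra].add(target.lower().rstrip("."))
    return {
        infra: sorted(found)
        for infra, found in gateways.items()
        if len(found) >= min_gateways and len(targets[infra]) >= min_gateways
    }


# Invented CNAME records using reserved example names.
records = [
    ("gw1.example", "k3j9x01.infra-host.example"),
    ("gw2.example", "p8w2m77.infra-host.example"),
    ("gw3.example", "z5t1q40.infra-host.example"),
    ("www.shop.example", "shop.example"),
    ("blog.site.example", "site.cdn.example"),
]
print(find_shared_infrastructure(records))
# {'infra-host.example': ['gw1.example', 'gw2.example', 'gw3.example']}
```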

A phishing page hosted on a subdomain as part of an attempt to use levelsquatting to impersonate Apple.
Figure 6. Fake iCloud account recovery page hosted on asuna-sao[.]us.

The proactive detector can also discover and block levelsquatting subdomains from NODs’ passive DNS records. The levelsquatting technique includes a legitimate website’s domain as a subdomain in order to trick visitors into thinking they have arrived at the legitimate website. It is commonly used in conjunction with phishing attacks.

For example, our system recognized multiple subdomains of asuna-sao[.]us as levelsquatting hostnames masquerading as Apple Inc and labeled the domain as dangerous. The domain was registered on April 12 and started receiving traffic for subdomains like www.flnd-appleld.asuna-sao[.]us, www.lcloud-supoort.asuna-sao[.]us and www.apple-flnd.asuna-sao[.]us on the same day. These hostnames all deliver the same phishing page, which tries to steal Apple ID credentials (Figure 6).
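A minimal sketch of how brand look-alikes in subdomain labels might be spotted is shown below. It is not the detector's actual method: the function name, the brand keyword list and the 0.8 similarity threshold are assumptions, and the similarity measure is the standard library's `difflib.SequenceMatcher`. Hostnames are accepted in the defanged `[.]` form used in this post.

```python
import re
from difflib import SequenceMatcher
from typing import Iterable, List


def levelsquatting_hits(
    hostname: str,
    registered_domain: str,
    brands: Iterable[str],
    min_ratio: float = 0.8,  # hypothetical similarity threshold
) -> List[str]:
    """Return brand keywords imitated in the subdomain part of `hostname`."""
    host = hostname.lower().replace("[.]", ".").rstrip(".")
    suffix = "." + registered_domain.lower().replace("[.]", ".")
    if not host.endswith(suffix):
        return []
    # Only the labels in front of the registered domain are inspected.
    tokens = [t for t in re.split(r"[.\-]", host[: -len(suffix)]) if t]
    hits = set()
    for token in tokens:
        for brand in brands:
            similar = SequenceMatcher(None, token, brand).ratio() >= min_ratio
            if brand in token or similar:
                hits.add(brand)
    return sorted(hits)


brand_keywords = ["apple", "icloud"]  # illustrative list only
print(levelsquatting_hits("www.lcloud-supoort.asuna-sao[.]us", "asuna-sao[.]us", brand_keywords))
# ['icloud']
print(levelsquatting_hits("www.flnd-appleld.asuna-sao[.]us", "asuna-sao[.]us", brand_keywords))
# ['apple']
```

The fuzzy comparison is what catches character swaps such as "lcloud" for "icloud", while the substring check covers labels that embed the brand name outright.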

Capture Aged Malicious Domains

In previous writing on strategically aged domains, we reported that some had been registered years before they were actively involved in cybercriminal campaigns. These domains didn't reveal any indicators of network abuse when they were created. Monitoring NODs gives us a second chance to capture aged malicious domains.

A rogue advertising page hosted shortly after the domain's WHOIS record was updated, illustrating how it can be important to check newly observed domains for malicious indicators even if they were benign at the time of registration.
Figure 7. Advertisement page hosted on createruler[.]com.

When crawling NODs' WHOIS records, we discover that many aged domains have recent WHOIS record changes before they are involved in network abuses. For example, createruler[.]com was registered in May 2022. Its WHOIS was updated on June 3, 2022. Then the domain started hosting a rogue advertising page, as shown in Figure 7. Our proactive detector analyzed its latest WHOIS record and classified it as suspicious based on its use of a highly abused name server. The name server is a parking service provider that monetizes domains' traffic through advertisement networks. This kind of parking site could expose visitors to various threats, such as malware distribution, potentially unwanted program (PUP) distribution and phishing scams.

The proactive detector also captured some domains repeatedly leveraged by network threats. For example, we captured a squatting domain mimicking a major digital payment network based in the United States. It used to serve a phishing campaign in 2020 and expired in 2021. But the adversary registered it again on March 13, 2022. Our detector observed it started carrying traffic and recognized it as a potentially malicious NOD on Sept. 2, 2022. The domain hosts a rogue website that tags the legitimate target domain on the index page and tries to collect the visitor's contact information. This website is highly suspicious and likely to be engaged in network fraud.

Conclusion

At Palo Alto Networks, we extract NODs from passive DNS and proactively detect potential cybercriminal activities among them. The new detector leverages various machine learning techniques to capture indicators of network abuse from WHOIS records, DNS traffic and lexical features. The system extends our visibility into emerging network threats and identifies new kinds of suspicious behaviors. As a result, it discovers about 2,323 potentially malicious domains every day as soon as they become active, and it protects our customers on average 4.79 days before the domains are confirmed to be involved in attack campaigns.

Palo Alto Networks identifies the detected domains with the grayware category through our cloud-delivered security services for Next-Generation Firewalls, including URL Filtering and DNS Security. Our customers receive protections against damage from risky domains mentioned in this blog, as well as additional risky domains captured by our system.

Indicators of Compromise

C2 Domain

payment-downlaods[.]ga

Phishing Domains

asuna-sao[.]us
intesa-sanpaola[.]ml
zellesupport[.]info

Grayware Domains

bakbitionb[.]com
bsdybwo[.]tk
bwafduj[.]tk
createruler[.]com
jxc786[.]com
twtyowq[.]tk

SHA256

e9ad04ae0201307e061cdae350c392a6b4537876991b2c97857ea71086fa0496

 
