1. Introduction
Websites are broadly classified into two categories based on payment requirements: free content websites (FCWs) and premium content websites (PCWs) [1]. As the name suggests, FCWs provide content such as books, music, movies, software, and games without requiring payment. PCWs, on the other hand, charge a premium for accessing similar types of content. Both categories are prevalent across the web, though FCWs are increasingly popular due to their accessibility and convenience. However, this very appeal exposes them to significant security and privacy threats. FCWs have become an integral part of the Internet, but their widespread use amplifies associated risks. Prior studies have shown that FCWs often lack robust privacy policies to protect users’ data and rights. Furthermore, FCWs host more malicious content than both PCWs and the general web population (e.g., Alexa’s top million websites) [1,2,3,4].
Research Gap. Despite extensive research on the FCW ecosystem and its security implications, little attention has been paid to the infrastructure supporting FCWs and how it differs from that of PCWs. To understand the interplay between FCWs and Internet infrastructure, it is essential to: (1) investigate the networks that host FCWs and assess whether network size correlates with their security, (2) analyze the role of Cloud Service Providers (CSPs), including their attributes and associations with FCWs’ security posture, and (3) explore the spatial distribution of FCWs in comparison to PCWs, at the country level. This work addresses these gaps by modeling FCWs in contrast to PCWs and the general web population, thereby providing a better understanding of their ecosystem and associated risks.
Our Approach and Rationale. Our study evaluates several dimensions of the FCW ecosystem, each guided by a distinct motivation. Examining the network characteristics of FCWs relative to PCWs provides insights into Internet-scale vulnerabilities. Analyzing the distribution of FCWs across networks, CSPs, and countries, especially in regions with concentrated malicious content, helps identify risks and informs mitigation strategies. Finally, studying FCW deployment at the country level highlights how national cybersecurity policies influence the prevalence and management of harmful content. We employ network analysis methods to characterize FCW hosting by classifying networks into four categories (small, medium, large, and very large) based on their subnet masks. A subnet mask indicates the number of addresses reserved by a hosting provider, which in turn bounds the number of publicly accessible hosts associated with FCWs.
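To make the scale classification concrete, the following minimal Python sketch maps a hosting prefix to one of the four scales using its prefix length. The cutoffs in `SCALE_CUTOFFS` and the function names are illustrative assumptions, not the thresholds used in this study, and the sketch considers IPv4 prefixes only.

```python
import ipaddress

# ASSUMED cutoffs for illustration only; not the thresholds used in the paper.
# Each entry is (minimum prefix length, scale label), checked in order.
SCALE_CUTOFFS = [
    (24, "small"),       # /24 and longer: at most 256 addresses
    (16, "medium"),      # /16 to /23: up to 65,536 addresses
    (8, "large"),        # /8 to /15
    (0, "very large"),   # shorter than /8
]


def address_count(cidr: str) -> int:
    """Number of addresses reserved by the prefix (2 ** host bits)."""
    return ipaddress.ip_network(cidr, strict=False).num_addresses


def network_scale(cidr: str) -> str:
    """Map an IPv4 prefix to a scale label using the assumed cutoffs."""
    net = ipaddress.ip_network(cidr, strict=False)
    if net.version != 4:
        raise ValueError("this sketch handles IPv4 prefixes only")
    for min_prefix, label in SCALE_CUTOFFS:
        if net.prefixlen >= min_prefix:
            return label
    raise AssertionError("unreachable: the last cutoff is 0")


if __name__ == "__main__":
    # Documentation and private prefixes used purely as examples.
    for prefix in ("192.0.2.0/24", "198.51.100.0/22", "10.0.0.0/8"):
        print(prefix, address_count(prefix), network_scale(prefix))
```

Keeping the cutoffs in a single table makes it straightforward to substitute the study's actual thresholds without touching the classification logic.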
Understanding the distribution of FCWs across networks is critical for designing practical defenses. Given that FCWs are reported to be more malicious than other website categories [1,4], identifying the most common network scales enables containment and isolation techniques to be applied more effectively. For example, if malicious FCWs are concentrated in small networks, isolating an entire network prefix may be the most effective strategy with minimal collateral disruption. Conversely, when malicious FCWs are hosted in very large networks, broad isolation becomes impractical, and filtering must instead target individual hosts. This perspective also supports prioritization under limited resources, allowing containment efforts to focus on the networks hosting the majority of malicious FCWs. Similarly, profiling FCWs at the CSP level provides a foundation for applying targeted risk-prevention procedures without inadvertently disrupting benign services.
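The prefix-versus-host trade-off described above can be sketched as a small decision rule. This is a hypothetical illustration, not a procedure from this study: the function name and the `max_block_size` threshold (256 addresses) are assumptions, and a real deployment would also weigh how many benign sites share the prefix.

```python
import ipaddress


def containment_entries(prefix, malicious_hosts, max_block_size=256):
    """Return blocklist entries for malicious hosts inside `prefix`.

    If the prefix is small enough (ASSUMED threshold: `max_block_size`
    addresses), block the whole prefix; otherwise emit one host-length
    entry per malicious address to limit collateral disruption.
    """
    net = ipaddress.ip_network(prefix, strict=False)
    hosts = [ipaddress.ip_address(h) for h in malicious_hosts]
    # Keep only hosts that actually fall inside this prefix.
    inside = sorted(h for h in hosts if h in net)
    if not inside:
        return []
    if net.num_addresses <= max_block_size:
        return [str(net)]
    return [f"{h}/{h.max_prefixlen}" for h in inside]


if __name__ == "__main__":
    # Small network: isolate the entire prefix.
    print(containment_entries("192.0.2.0/24", ["192.0.2.10", "192.0.2.20"]))
    # Very large network: filter individual hosts instead.
    print(containment_entries("10.0.0.0/8", ["10.1.2.3", "10.0.0.5"]))
```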
Analyzing the geographical distribution of CSPs and hosting networks reveals how policies and regulations shape security. Understanding cross-border hosting patterns is vital, as FCWs often operate beyond the legal jurisdiction of their users. Users victimized by a malicious FCW hosted abroad may face significant challenges in pursuing legal remedies, such as requesting the removal of harmful or fraudulent content. These insights inform not only individual protection strategies but also regulatory and cooperative actions against CSPs linked to high concentrations of malicious FCWs.
Contributions. Building on a dataset of 1562 FCWs and PCWs from prior studies [1,4], augmented with multiple infrastructure and security dimensions and complemented with an independent sample from Alexa’s top one million websites [5], our work makes the following contributions:
Hosting and Network-Level Analysis. We systematically measure, analyze, and contrast the hosting patterns of FCWs and PCWs across the four network scales defined in our framework (Section 4.1). We further examine per-category hosting trends (Section 4.2) and quantify maliciousness using the indicators introduced in Section 3.3, namely Malicious Count (MC), Malicious Percentage (MP), and Malicious-Per-Feature Percentage (MPFP), to evaluate how malicious websites are distributed across network sizes and to identify network scales associated with disproportionately higher malicious activity.
Hosting Networks Spatial Analysis. We identify malicious FCWs and PCWs and analyze their relationships with key infrastructure attributes. Beyond network scale, we enumerate the hosting countries associated with both website types and uncover heavy-tailed, highly concentrated geographical hosting patterns (Section 4.3).
Cloud Service Providers (CSPs) Analysis. We enumerate the major CSPs hosting FCWs and PCWs, characterize which providers host benign and malicious websites, and contrast these patterns with those in general web infrastructure using the Alexa-based benchmark dataset (Section 4.4).
Temporal Dataset Analysis. We perform a temporal reassessment of FCWs and PCWs using updated VirusTotal scans [6] to evaluate how maliciousness and hosting parameters (network scale, CSP distribution, and geographical location) change over time. This temporal dimension reveals the dynamic evolution of the FCW/PCW ecosystem and its implications for infrastructure-level security (Section 5).
The rest of the paper is organized as follows. Related work is presented in Section 2, data collection and methodology are described in Section 3, analysis results are reported in Section 4, temporal analysis is given in Section 5, discussion is provided in Section 6, and conclusions are drawn in Section 7.
2. Related Work
Prior works have examined the security, privacy, and modeling of free content websites (FCWs) [1,3,7,8]. Other studies investigated the role of infrastructure, including the use of content management systems (CMSs) in FCWs [4], and explored the security and network-level characteristics of widely used websites [9,10,11,12,13,14,15,16,17,18]. In addition, prior works have statistically analyzed domain-specific security breaches in web services, such as those affecting healthcare providers, along with their associated network characteristics [19]. Related efforts have investigated the role of infrastructure in securing web services [20,21,22,23,24]. Furthermore, multiple studies have analyzed various security features of web infrastructure [25,26,27,28].
Free Content Websites Analysis. The security and privacy of FCWs have been a significant concern in recent research. Alabduljabbar et al. [
3] examined FCW security through the analysis of SSL certificates, including their issuers, validity, and signatures. They assessed the authenticity of these certificates and evaluated their coverage terms and overall website security, finding that 36% of FCWs relied on invalid, expired, or fake SSL certificates. Alqadhi et al. [
4] studied the impact of content management systems (CMSs) on FCW security. They built a database of CMSs used by FCWs and cross-validated it against online resources such as CMS-detector and W3Techs. Through frequency analysis of CMS-based versus custom-coded FCWs, coupled with VirusTotal annotations, they reported that over 56% of FCWs employed custom code and are more likely to be malicious. Their findings suggest that FCWs, relying on popular CMSs, may exhibit increased malicious behavior. The security of free hosting infrastructure has also been investigated. Roy et al. [
7] analyzed phishing attacks hosted on free web hosting domains (FHDs), which can evade detection and takedown mechanisms by anti-phishing entities. Based on a large-scale analysis of 8800 FHD URLs shared on Twitter and Facebook, they showed that such phishing attacks remained active 1.5 times longer than regular phishing URLs, had 1.7 times lower coverage on blocklists, and took 3.8 times longer to be detected by security tools.
General Websites Analysis. Several studies have examined the security of widely used websites at scale. Kontaxis et al. [
9] investigated cross-domain policies in Rich Internet Applications (RIAs) such as Microsoft Silverlight and Adobe Flash, which are extensively deployed but prone to malicious exploitation. Their study, conducted on Alexa’s top 100 K websites and the websites of Fortune 500 companies at both global and country levels, identified more than 6500 vulnerable websites exposed to cross-domain security threats. Li et al. [
29] focused on malicious advertising activities within websites by analyzing 90,000 leading domains. They demonstrated how attackers infiltrated advertising networks and revealed the role of malicious nodes in online advertising, characterizing their behaviors and interactions in detail.
More recently, Lindkvist et al. [
30] analyzed secure communication practices across malicious and benign domains. Their findings indicate that while HTTPS and strong cipher suites are often adopted, they do not necessarily guarantee trustworthiness, as phishing domains may exhibit stronger protections than many benign ones. In contrast, our work examines the broader ecosystem of websites, emphasizing hosting networks, cloud- and country-level distributions, and category-specific risks. Whereas their work highlights protocol-level practices, ours emphasizes structural and ecosystem-level characteristics.
Websites Content Analysis. Several works have investigated the relationships between content, usability, service quality, and website security. Figueras-Martín [
31] analyzed website connectivity, relationships, and content within the Freenet darknet. The results revealed widespread website availability, key structural nodes in the network, and a predominance of illegal content. Samarasinghe et al. [
32] studied privacy risks in religious websites and mobile apps, highlighting the extensive use of trackers that compromise user data and erode user trust. Chen et al. [
33] examined the impact of migration stress on risky Internet behaviors, showing how it increased scam victimization among Chinese migrant workers. Hernandez-Suarez et al. [
34] proposed a methodological approach using text transformers and dense neural networks to detect websites hosting infringing content. In contrast, our work emphasizes network-level affinities and distributions to better understand the ecosystem from an infrastructure perspective.
Network Security Analysis. Website infrastructure security is a fundamental factor in safeguarding networks, especially since 50% of all websites rely on specific content management systems [
4,
35]. Noroozian et al. [
36] conducted a longitudinal study of broadband CSPs to evaluate their role in mitigating IoT malware, with a focus on Mirai. By analyzing infection rates across 342 global CSPs, they found that 55% of the observed variation was explained by the number of subscribers per CSP. Wickramasinghe et al. [
18] analyzed the hosting patterns of malicious domains by examining the hosting types of IP addresses. Their results showed that more than 95% of malicious websites were hosted on regular hosting IPs, and 97.1% of these websites shared infrastructure with unrelated benign websites. They further identified Cloudflare, Amazon, Google, OVH, and Microsoft as the top five hosting providers of malicious domains. Their findings underscore the need for stronger security measures by hosting providers to safeguard shared hosting infrastructures.
The Role of Infrastructure in Website Security. Fryer et al. [
37] investigated malicious web pages and proposed mitigation strategies that hosting providers might implement to strengthen their defenses. Liao et al. [
38] examined long-tail search engine optimization (SEO) spam on cloud service providers (CSPs). By analyzing 15,774 cloud directories across 10 major providers, they identified 3186 abusive directories used for long-tail SEO spam. Their study revealed the monetization strategies of spammers and their evasion techniques, including obfuscation through link shorteners and client-side JavaScript in cases where server-side scripting was unavailable.
Tajalizadehkhoob et al. [
39] explored the distribution of web security features and patching practices in shared hosting providers to assess their influence on website compromise. Wang et al. [
23] investigated the growing consolidation of DNS and web hosting providers, a trend with significant implications for Internet security, reliability, and availability. Their findings showed that Amazon and Cloudflare exclusively host the name servers for more than 40% of domains, while only five organizations (Cloudflare, Amazon, Akamai, Fastly, and Google) collectively hosted approximately 62% of the Tranco top 10 K index pages along with most external resources. A comparison between our work and related studies, in terms of scope and focus, is presented in
Table 1.
4. Analysis Results
This section presents the findings of our distribution analysis pipeline applied to the extracted dataset. We first compare the trends of free content websites (FCWs) and premium content websites (PCWs) across different network scales, CSPs, and countries. We then examine their distribution within the top one million most-visited websites. Finally, we provide a per-category analysis for books, games, movies, music, and software, highlighting similarities, differences, and security implications.
4.1. General Network Scale Analysis
The distribution analysis over the network scale yields several important insights, summarized as follows. (1) Most websites reside in medium-scale networks, which account for 81.24% of the studied FCWs, PCWs, and general websites. (2) PCWs are more likely to use large networks, where the proportion of secure websites is higher than in medium networks. (3) FCWs in medium networks are the riskiest group, with nearly 90% of FCWs hosted there and 40% classified as malicious. (4) Our per-category analysis shows that books, movies, and software websites rely more on large networks than games and music websites. These categories are also generally less malicious, except for the free software category, where most websites are in medium networks and exhibit the highest MP. This result is expected, as attackers often recruit victim devices by convincing users to install unauthenticated free software, which ultimately influences the security classification. (5) Across all categories (books, games, movies, music, and software), both FCWs and PCWs primarily reside in medium networks. Medium networks host ≈75% to ≈85% of premium websites, compared to ≈84% to over 97% of free websites, with the games category showing the highest concentration in both. (6) Most CSPs are fairly evenly distributed between medium- and large-scale networks. (7) Hosting of FCWs and PCWs is concentrated mainly in the United States, where ≈58% of websites reside. (8) Large-scale networks are predominantly located in the United States, which hosts ≈71% of them.
4.1.1. Dataset Versus Benchmark
As shown in Table 3, malicious websites are concentrated in medium-scale networks in both the combined FCWs/PCWs dataset and the general dataset, with MP values of 23.06% and 3.89%, respectively. Specifically, 27.38% of medium-network websites in the FCWs/PCWs dataset are malicious when measured against the number of websites at that network scale (MPFP). In contrast, the general dataset reports a significantly lower rate of 4.92% for the same feature. This notable difference supports our hypothesis that the FCWs/PCWs dataset hosts a higher proportion of malicious websites.
Consequently, it is essential to account for both network scale and the degree of malicious activity when managing network security risks. These findings highlight the importance of considering such factors when comparing and analyzing datasets. Failing to address them could result in inadequate strategies for addressing security threats, ultimately compromising safety and integrity.
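As a minimal sketch of how the three indicators relate, the snippet below computes MC, MP, and MPFP per feature value. It assumes that MP divides the malicious count by all websites in the group, while MPFP divides it by the websites sharing that feature value, which is how the figures in this section appear to relate. The record layout and the toy data are hypothetical and are not drawn from the dataset.

```python
from collections import Counter


def malicious_metrics(records, feature):
    """Compute MC, MP, and MPFP for each value of `feature`.

    `records` is a list of dicts holding the feature key and a boolean
    'malicious' flag (an ASSUMED layout for illustration).
    """
    total = len(records)
    per_value = Counter(r[feature] for r in records)
    malicious = Counter(r[feature] for r in records if r["malicious"])
    metrics = {}
    for value, count in per_value.items():
        mc = malicious[value]
        metrics[value] = {
            "MC": mc,                      # malicious count
            "MP": 100.0 * mc / total,      # share of all websites
            "MPFP": 100.0 * mc / count,    # share within this feature value
        }
    return metrics


# Toy data: 8 medium-network sites (2 malicious) and 2 large-network sites.
toy = (
    [{"scale": "medium", "malicious": True}] * 2
    + [{"scale": "medium", "malicious": False}] * 6
    + [{"scale": "large", "malicious": False}] * 2
)
result = malicious_metrics(toy, "scale")
print(result["medium"])  # MC=2, MP=20.0, MPFP=25.0
```

Under these assumed definitions, MP for a feature value equals that value's share of websites multiplied by its MPFP, which is why MP is always at most MPFP.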
4.1.2. Free Versus Premium Websites
As shown in
Table 4a, the majority of websites are hosted in medium networks, accounting for ≈89.1% and ≈78.9% of FCWs and PCWs, respectively. The MPFP for FCWs is nearly double that of PCWs, with ≈40.5% compared to ≈22.2%. The highest MP values in both groups are observed in medium networks, at ≈37.7% for FCWs and ≈19.8% for PCWs. These findings highlight the need for targeted defenses against websites hosted on medium-sized networks that contain malicious content. Furthermore, ≈20% of PCWs are hosted in large networks, which may provide improved security given the relatively lower presence of FCWs in these networks. Overall, the results support our hypothesis that malicious websites in FCWs and PCWs follow similar hosting patterns. A more thorough examination of these patterns is necessary to mitigate vulnerabilities and enhance security.
4.2. Per-Category Network Scale Analysis
In this section, we review the results and findings of our measurements through a per-category analysis of websites associated with books, games, movies, music, and software.
4.2.1. Book Websites
As shown in
Table 4b, clear trends emerge across different network scales hosting FCWs and PCWs. Approximately 85% of FCWs and 80.1% of PCWs are hosted in medium networks, which together account for ≈82.4% of both types of websites. The MPFP is ≈30% for FCWs and ≈27.8% for PCWs. Notably, ≈31.7% of FCWs were found malicious compared to ≈30% of PCWs, indicating a substantial issue with book websites in medium networks. Within these networks, ≈27% of FCWs and ≈24% of PCWs contribute to the total malicious website MP. It is also worth noting that ≈17.3% of PCWs are hosted in large networks compared to only 10% of FCWs. Furthermore, ≈28.6% of FCWs in small networks were identified as malicious, compared to 0% in PCWs. While the difference is less pronounced in large networks, it is quite significant in small networks, potentially explaining the overall MP gap between FCWs and PCWs that offer book content.
4.2.2. Games Websites
As shown in
Table 4c, a significant concentration of game websites is hosted in medium networks, with ≈97.4% of FCWs and ≈85.6% of PCWs. In total, ≈90.5% of both types of websites are hosted in medium networks. This suggests that organizations providing gaming content prefer medium networks, possibly to ensure high network speeds for users worldwide. Additionally, ≈12.6% of PCWs are hosted in large networks compared to just ≈1.3% of FCWs. This aligns with the earlier finding that large networks enhance the security of PCWs, as they contribute only 1.8% MP despite the total MP of PCWs being ≈31.5%. In contrast, the MP of FCWs reaches 64.1%. These results highlight the elevated risk associated with free gaming websites compared to premium gaming websites, emphasizing the broader vulnerability of game-related platforms.
4.2.3. Movie Websites
As shown in Table 5a, most FCWs and PCWs in the movie category are hosted in medium networks, similar to the games category. Specifically, 91.61% of FCWs and 75.66% of PCWs are hosted at this network scale, meaning that roughly 9 out of 10 FCWs and 3 out of 4 PCWs are hosted in medium networks. Large networks are particularly appealing to PCWs: 23.03% of PCWs are hosted there, compared to only 6.77% of FCWs.
Within medium networks, 26.41% of FCWs were identified as malicious, contributing 24.19% of the total 26.45% MP. In contrast, PCWs exhibit a lower MP of 15.13%, with 13.82% attributed to websites hosted in medium networks. Notably, 18.26% of PCWs hosted in medium networks were identified as malicious. Small and very large networks are relatively uncommon for both FCWs and PCWs. A striking disparity is observed in small networks, where 50% of PCWs were found malicious compared to 0% of FCWs, although the overall count remains small.
These findings suggest that the movie category lags behind the book and games categories in terms of security. Movie websites frequently rely on cross-domain video players, which may increase the number of reported security threats and their susceptibility to malicious attacks. Overall, the results emphasize the dominance of medium networks in hosting movie websites and the higher MP among FCWs compared to their premium counterparts. The preference of PCWs for large networks may be explained by improved security or superior performance.
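The medium-network figures reported for movie websites can be cross-checked with a short calculation. The sketch below assumes that a scale's contribution to MP equals the share of websites hosted at that scale multiplied by the malicious rate within it, and it treats 18.26% as the malicious rate of PCWs within medium networks, the reading under which the reported numbers agree. The helper name is hypothetical.

```python
def mp_contribution(share_pct: float, mpfp_pct: float) -> float:
    """MP contributed by one network scale, in percent.

    share_pct: percentage of the group's websites hosted at this scale.
    mpfp_pct:  percentage of those websites flagged as malicious.
    """
    return share_pct * mpfp_pct / 100.0


# Movie websites hosted in medium networks, using the values quoted above.
fcw_medium = mp_contribution(91.61, 26.41)
pcw_medium = mp_contribution(75.66, 18.26)
print(f"FCW: {fcw_medium:.2f}%  PCW: {pcw_medium:.2f}%")  # FCW: 24.19%  PCW: 13.82%
```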
4.2.4. Music Websites
As shown in Table 5b, the distribution of music websites across network scales is dominated by medium networks. More than 90% of FCWs and 75% of PCWs are hosted in medium-sized networks. Large networks host approximately 8.8% of FCWs and 22.1% of PCWs, indicating that PCWs have a stronger preference for these networks.
An apparent disparity emerges in the malicious percentage (MP) between FCWs and PCWs. Nearly 40% of FCWs are classified as malicious, compared to only ≈17% of PCWs. Medium networks account for much of this difference, as 43% of FCWs in these networks are malicious, versus ≈18% of PCWs. In other words, medium-sized networks account for 100% of the total MP for FCWs and 80% for PCWs. Large networks, by contrast, play a critical role in PCW security, exhibiting a lower MP of only ≈3.5%.
Interestingly, no malicious FCWs are hosted in large networks, despite the overall MP of FCWs being more than double that of PCWs. This suggests that large networks provide a safer environment for FCWs. Common to both movie and music websites is the reliance on shared content players across multiple domains, which may increase the risk of malicious exploitation.
4.2.5. Software Websites
As shown in
Table 5c, FCWs in the software category exhibit the highest malicious concentration among all categories. Approximately 84% of FCWs are hosted in medium networks, where the MPFP reaches ≈70%. This aligns with expectations, as software applications often require system-level access when installed from FCWs, making them highly susceptible to malicious activity. In comparison, ≈77% of PCWs are hosted in medium networks, but with a much lower MPFP of ≈22%.
PCWs make greater use of large networks, with ≈21% compared to ≈9.7% of FCWs. FCWs also show significant reliance on small networks, with ≈6.8% of websites, of which ≈33.3% are identified as malicious, compared to 0% for PCWs. Within large networks, FCWs demonstrate a high MPFP of ≈41.2%, contributing to a total MP of ≈64.2%, in contrast to only ≈18.8% for PCWs.
These findings highlight the severity of the risks associated with software websites, particularly FCWs hosted in medium networks, which exhibit a very high MPFP of ≈69.4%. The results underscore the critical need for tightened scrutiny of free software sources given their disproportionate contribution to malicious activity.
4.3. Networks’ Spatial Analysis
While the abstract network-level distribution analysis sheds light on the structure of FCW and PCW networks, annotation at the CSP and country level provides additional insight into interdependence within this ecosystem. We begin with an examination of cloud service providers (CSPs) and their hosting countries as part of the spatial analysis.
As shown in Table 6, most websites (≈84%) are hosted in medium networks, with large networks accounting for only ≈13% and small networks for ≈2.5%. A negligible fraction (<0.1%) of websites is hosted in very large networks. FCWs and PCWs are distributed across multiple CSPs, with the highest concentrations in Cloudflare (≈27%) and Amazon (≈16%), both of which operate predominantly in medium- to large-sized networks based on their IP allocations. Liquid Web (4.8%), Trellian (2.8%), and Google (2.7%) also host a significant number of websites, primarily on medium-sized networks.
Other CSPs each host fewer than 3% of websites and are associated only with medium networks. The “Others” category represents ≈36.6% of websites, dispersed across all network scales, with a notable concentration of ≈30.8% in medium networks. Further analysis of CSP distributions across countries is essential, as regional variations may influence the overall security posture of hosted websites.
4.3.1. Countries of Networks
As shown in
Table 7, the distribution of hosting countries for FCWs and PCWs across network scales reveals several notable patterns. A significant fraction of websites (≈84.2%) are hosted in medium networks, with the United States accounting for the majority at ≈56.5%. Overall, the United States hosts 58.7% of websites distributed across small, medium, large, and very large networks. Belgium and the Netherlands also host a considerable share, primarily in medium networks, with ≈6.6% and ≈6.3% of websites, respectively.
This analysis highlights the diversity of hosting patterns across countries and network scales, while reaffirming the dominance of medium networks. These findings underscore the importance of evaluating the maturity of national security policies, as most malicious websites appear to depend on medium-scale networks for their operation. By focusing on these networks and examining scale-level practices, it may be possible to curb the proliferation of malicious websites more effectively.
4.3.2. CSPs over Countries
As shown in Table 8, the distribution of the most commonly used CSPs across the top hosting countries reveals clear trends. Cloudflare leads with 410 websites, primarily hosted in the United States (296 websites) and Belgium (98 websites). Amazon follows with 240 websites, the majority of which (191 websites, 79.6%) are hosted in the United States.
Other leading CSPs include Liquid Web and Trellian, which host 72 and 42 websites, respectively. All Liquid Web websites are hosted in the United States (72 websites), while all Trellian websites are hosted in Australia (42 websites). Google hosts 41 websites, of which 34 are in the United States. LeaseWeb serves 37 websites, most of them (34) in the Netherlands. SP-Team hosts 35 websites, all located in Germany. Akamai hosts 33 websites, with 28 in the Netherlands, while Fastly accounts for 26 websites, 12 of which are in the United States. Microsoft hosts 21 websites, with 16 located in the United States.
The “Others” category in
Table 8 includes 552 websites, with 260 hosted in the United States. In total, 1509 websites were analyzed, and the United States accounts for the largest share at ≈58.6%, followed by Belgium, the Netherlands, and Germany.
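The per-CSP counts quoted in this subsection can be tallied to confirm the total and to derive hosting shares. The sketch below uses only the counts stated in the text; the dictionary layout and the helper names `share` and `top_k_share` are assumptions for illustration.

```python
# Website counts per CSP as quoted in the text (Table 8 totals).
CSP_COUNTS = {
    "Cloudflare": 410,
    "Amazon": 240,
    "Liquid Web": 72,
    "Trellian": 42,
    "Google": 41,
    "LeaseWeb": 37,
    "SP-Team": 35,
    "Akamai": 33,
    "Fastly": 26,
    "Microsoft": 21,
    "Others": 552,
}

TOTAL = sum(CSP_COUNTS.values())


def share(csp: str) -> float:
    """Percentage of all analyzed websites hosted by `csp`."""
    return 100.0 * CSP_COUNTS[csp] / TOTAL


def top_k_share(k: int) -> float:
    """Combined percentage hosted by the k largest named CSPs."""
    named = sorted(
        (count for name, count in CSP_COUNTS.items() if name != "Others"),
        reverse=True,
    )
    return 100.0 * sum(named[:k]) / TOTAL


print(TOTAL, round(share("Cloudflare"), 1), round(share("Amazon"), 1))
```

Calling `top_k_share(2)` gives the combined share of Cloudflare and Amazon, which is one simple way to quantify how concentrated the hosting is.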
4.3.3. Network Distribution Heatmaps
We generate heatmaps to illustrate the distribution of network scales across countries based on data from Table 7. Figure 3a highlights the distribution of small networks (SN column), identifying the United States as the primary host. Figure 3b shows the distribution of FCWs and PCWs within medium networks (MN column), where the United States, Belgium, and the Netherlands emerge as the main host countries. Figure 3c presents the distribution of FCWs and PCWs within large networks (LN column), with the United States, Germany, and France as the main hosting countries.
These visualizations address RQ1 and RQ2 by providing detailed information on the geographical distribution of network scales and reaffirming the dominance of medium networks in hosting FCWs and PCWs. The results are consistent with earlier findings, revealing hosting patterns and their potential impact on website security and reliability. Importantly, medium networks are identified as less secure or reliable than large networks. A closer examination of different types of medium networks is necessary to better understand the most severe hosting patterns. Implementing robust defensive measures against websites hosted on medium networks, particularly those that offer FCWs, and identifying the primary locations of malicious websites are crucial steps toward enhancing online security.
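A heatmap of this kind is typically driven by a country-by-scale count matrix. The following sketch shows one way such a matrix could be aggregated from per-website records. The column labels SN, MN, and LN follow the table columns mentioned above, while `VLN`, the record layout, and the toy records are assumptions and do not reproduce the dataset.

```python
from collections import defaultdict

# SN/MN/LN follow the table columns; VLN (very large) is an ASSUMED label.
SCALES = ("SN", "MN", "LN", "VLN")


def country_scale_matrix(records):
    """Count websites per (country, network scale).

    `records` is an iterable of (country, scale) pairs; every country row
    is initialized with a zero for each scale so rows are directly plottable.
    """
    counts = defaultdict(lambda: dict.fromkeys(SCALES, 0))
    for country, scale in records:
        if scale not in SCALES:
            raise ValueError(f"unknown network scale: {scale!r}")
        counts[country][scale] += 1
    return dict(counts)


# Toy records for illustration only; not taken from the study's data.
toy_records = [("US", "MN"), ("US", "MN"), ("US", "LN"), ("BE", "MN")]
matrix = country_scale_matrix(toy_records)
print(matrix)
```

Each column of the resulting matrix corresponds to one heatmap panel, with the per-country counts providing the color intensity.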
4.4. Cloud Service Providers Analysis
As highlighted in
Section 4.3, a CSP-level analysis provides deeper insight into the ecosystem of FCWs and PCWs, particularly for malicious websites. To this end, we extend our analysis to examine the affinities between different categories of websites and the major cloud providers across our assessment metrics. The distribution of FCWs and PCWs across CSPs reveals several key aspects: (1) Most FCWs, PCWs, and general websites are hosted on Cloudflare. Furthermore, Cloudflare exhibits the highest concentration of malicious websites among all CSPs in the three categories. (2) Amazon, while one of the largest providers, has the lowest concentration of malicious websites. Although this cannot be concluded definitively, one possible explanation is the stronger measures Amazon employs to mitigate security risks in shared infrastructure, compared to more permissive providers. (3) In the per-category analysis, FCWs predominantly rely on Cloudflare, whereas PCWs use Cloudflare only in the game category. For the remaining categories of PCWs, Amazon is the most frequently used hosting provider. (4) Providers with the highest concentration of malicious websites are located in the United States and Belgium. This can be attributed to providers such as Cloudflare, which primarily operate in these countries. (5) Overall, there is a strong affinity between the state of a website (malicious or benign) and its hosting provider.
Clarifying Cloudflare Risk Interpretation. It is important to distinguish between Cloudflare’s Edge/Proxy infrastructure and the Origin infrastructure that hosts the actual content. Because Cloudflare frequently operates as a reverse proxy, malicious actors often use its edge nodes to obscure the true origin of their infrastructure. Consequently, the “hosting risk” associated with Cloudflare in our measurements predominantly reflects proxy abuse rather than physical content hosting by Cloudflare itself. Our results should therefore be interpreted as identifying the surface through which malicious traffic is routed, not the physical server where the malicious content resides.
Regional Subsidiaries of CSPs. Although large CSPs operate through multiple regional subsidiaries, the number of region-specific CSP entries (e.g., Amazon US, Amazon CN) in our dataset is extremely small and statistically insignificant. Aggregating these subsidiaries under their parent CSP does not affect the validity of our findings. Moreover, regional variation is already accounted for by our country-level hosting analysis in
Section 4.3, where differences in geographical deployment are reflected in the associated hosting countries.
4.4.1. Free and Premium Websites Comparison
The hosting pattern follows a heavy-tailed distribution. For both FCWs and PCWs, the top eight providers (Cloudflare, Amazon, Liquid Web, LeaseWeb, SP-Team, Akamai International, Fastly, and Microsoft) host 63.42% of the websites, while the remaining websites are distributed across 290 providers.In particular, 80.59% of malicious websites are hosted in these top CSPs.
The top five providers in terms of MPFP are Cloudflare, Liquid Web, LeaseWeb, SP-Team, and Trellian. Interestingly, Amazon, the second largest provider in terms of hosting volume, has a relatively low MPFP (≈12.9%) and a smaller MP compared to the other top providers. Fastly, which hosts 26 websites, does not have malicious websites. The category “Others” includes 552 websites (≈36.6%) and shows an MPFP of ≈16.9% with an MP of 6.16%. These results highlight the variation in security levels among CSPs.
Although Cloudflare and Amazon are the most popular providers for FCWs and PCWs, they differ significantly in terms of MPFP and MP. Cloudflare, the leading provider, has an MPFP of ≈68.5% and accounts for ≈18.6% of the total MP. In contrast, Liquid Web ranks second in MP, contributing only ≈2.1% of all malicious websites. This distinction underscores two points: a low MP may simply reflect a small share of hosted websites, while significant differences in MPFP reflect the relative security posture of individual CSPs.
4.4.2. Benchmark Websites
A comparison of
Table 9a and
Table 10a provides a unified perspective, summarized as follows. First,
Cloudflare and Amazon emerge as the most popular CSPs. Cloudflare hosts the largest number of websites and records the highest MC count. According to
Table 9a, Cloudflare hosts ≈27.2% of websites, while in
Table 10a it hosts ≈16.4%. Amazon, second on the list, hosts ≈15.9% of websites in the first table and ≈10.9% in the second.
Liquid Web consistently ranks second in terms of MPFP in both tables. In
Table 9a, it is the third largest provider, hosting ≈4.8% of websites, while in
Table 10a it ranks sixth, hosting ≈1.9%.
Fastly stands out for its unique characteristics. In
Table 9a, it does not host MC websites, yet in the benchmark results it records an MPFP of ≈4.2% and an MP of only 0.05%. The category
Others represents a substantial share of websites, ranging from ≈36.6% to 53.5% across the tables.
In conclusion, Cloudflare and Amazon are the most popular CSPs among FCWs and PCWs, with Cloudflare hosting the highest number of MC websites and exhibiting the highest MPFP. Liquid Web ranks second in MPFP, while Fastly is notable as a provider without any MC websites in
Table 9a.
4.4.3. Free Websites
The distribution of FCWs across various CSPs is presented in
Table 9b. We observe that Cloudflare dominates the market by hosting ≈33.8% of FCWs, with ≈64.3% of its hosted websites identified as malicious, resulting in an MP of ≈21.7%. Liquid Web and Amazon are the second and third most popular CSPs, respectively, hosting 8.5% and ≈6.9% of FCWs. Liquid Web has an MP of ≈4%, while Amazon’s MP stands at 1.9%. CSPs such as Trellian, LeaseWeb, and Sp-Team each host ≈5% of FCWs and exhibit similar MC and MP values. Notably, the “Others” category, encompassing a variety of CSPs, hosts ≈30% of FCWs and presents an MP of ≈7.2%. Of the 788 FCWs in total, 319 (≈40.5%) were malicious.
4.4.4. Premium Websites
In
Table 9c, which represents the distribution of PCWs in different CSPs, Amazon emerges as the most prominent host, accommodating 25.8% of the total PCWs. Among the websites hosted by Amazon, 8.6% are malicious, resulting in an overall MP of ≈2.2%. Cloudflare ranks as the second largest host with ≈20% of PCWs, with a higher proportion (≈76.4%) of malicious websites, leading to an MP of ≈15.3%. Other notable CSPs include Akamai, Google, Fastly, and Microsoft, which host around 2% to 4% of PCWs. Regarding malicious content, Google and Microsoft show MPs of ≈0.4% and ≈0.3%, respectively, while Akamai has a similarly low MP of ≈0.3%. Fastly and eBay host ≈3.2% and ≈1.1% of PCWs, respectively, but neither hosts any malicious content. Interestingly, Sp-Shopify hosts only ≈1.7% of PCWs but has a high proportion (≈83.3%) of malicious websites, resulting in an MP of ≈1.4%. Wal-Mart and OVH each host about 1% of PCWs and have MPs of ≈0.1%. Lastly, the “Others” category, which includes a variety of CSPs, hosts ≈35.1% of the total PCWs. With only ≈6% of its hosted websites classified as malicious, the category exhibits an MP of ≈2.1%. In total, the table covers 721 PCWs, of which ≈22.2% are malicious.
4.4.5. Free Versus Premium Websites
Upon comparing the distribution of FCWs and PCWs in different CSPs, as shown in
Table 9b and Table 9c, several insights are drawn. First, Cloudflare is the most prominent host for FCWs, with ≈33.8% of the total, while Amazon is the most prominent host for PCWs, with 25.8%. Cloudflare’s MP is higher for FCWs (≈21.7%) than for PCWs (≈15.3%) because it hosts a larger share of FCWs; its MPFP, however, is higher for PCWs (≈76.4%) than for FCWs (≈64.3%), indicating that a larger proportion of the PCWs it hosts are malicious. Amazon shows the opposite pattern: its MP is slightly lower for FCWs (1.9%) than for PCWs (≈2.2%), yet it hosts a much smaller share of FCWs (≈6.9%) than of PCWs (25.8%), so a larger proportion of the FCWs it hosts are malicious. Google has a relatively low MP for both FCWs (≈0.5%) and PCWs (≈0.4%), implying that it hosts a smaller proportion of malicious websites than other CSPs. The total number of websites is higher for FCWs (788) than for PCWs (721), with ≈40.5% and ≈22.2% being malicious, respectively, indicating that FCWs have a higher overall prevalence of malicious content than PCWs. Some CSPs, for example, Liquid Web, Trellian, LeaseWeb, and Sp-Team, host only FCWs, whereas others, such as Akamai, Fastly, Microsoft, Sp-Shopify, eBay, and Wal-Mart, host only PCWs. This suggests that CSPs differ in the types of websites they attract, or that operators of FCWs and PCWs favor different providers.
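The figures reported in Sections 4.4.3 and 4.4.4 are consistent with reading MP as the product of a provider's hosting share and its MPFP. The sketch below checks this for Cloudflare; the relation is an assumption inferred from the reported percentages rather than a definition quoted from the methodology, and the function name is hypothetical.

```python
def implied_mp(hosting_share_pct: float, mpfp_pct: float) -> float:
    """MP (in %) implied by a provider's hosting share (%) and its MPFP (%)."""
    return hosting_share_pct * mpfp_pct / 100.0


# (hosting share %, MPFP %) for Cloudflare, as reported for FCWs and PCWs.
cloudflare = {"FCW": (33.8, 64.3), "PCW": (20.0, 76.4)}

implied = {group: round(implied_mp(share, mpfp), 1)
           for group, (share, mpfp) in cloudflare.items()}
print(implied)
```

Under this reading, a provider can have a higher MP in one group simply because it hosts more of that group, even when its MPFP there is lower, which is why the two metrics should be compared separately.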
4.5. Per-Category Cloud Service Providers Analysis
4.5.1. Book Websites
Table 10b shows the distribution of Books FCWs and PCWs on CSPs. In FCWs, Cloudflare hosts the most, with 39 websites representing ≈27% of the total. Amazon follows with 11 websites (≈7.6%), Liquid Web with 10 websites (≈7%), Trellian and Sp-Team with 6 and 5 websites, respectively, and others collectively hosting 73 websites (≈50.7%). For PCWs, Amazon tops the list with 41 websites (≈21.5%), followed by Cloudflare with 40 websites (≈21%). Other CSPs in this category include Google, Sp-Shopify, Fastly, and others, with varying counts and percentages. Regarding MC, Cloudflare dominates in both FCWs and PCWs with 28 and 32 instances, respectively. The MPFP is highest for Cloudflare among FCWs (≈71.8%) and Sp-Shopify among PCWs (≈75%). The MP is fairly distributed between Cloudflare (≈19.4% for FCWs and ≈16.8% for PCWs) and other CSPs.
4.5.2. Games Websites
Table 10c presents the distribution of Games FCWs and PCWs on different CSPs. For FCWs, Cloudflare is the dominant CSP, hosting 42 websites, which account for 53.85% of the total. Other CSPs in this category include Mivocloud with 5 websites (≈6.4%), LeaseWeb and Liquid Web with 3 websites each (≈3.9% each), Amazon with two websites (≈2.6%), and others collectively hosting 23 websites (≈30%). For PCWs, Cloudflare is the leading CSP, hosting 37 websites (≈33.3%). Amazon comes next with 22 websites (≈20%), Akamai with 11 websites (≈10%), Fastly with 5 websites (4.50%), Google with 4 websites (3.6%), and “Others” hosting 32 websites (≈28.8%). Considering the MC aspect, Cloudflare has the highest count for both FCWs and PCWs, with 39 and 29 instances, respectively. The MPFP for Liquid Web is highest among FCWs at 100%, while Cloudflare leads among PCWs at ≈78.4%. The MP is distributed between various CSPs, with Cloudflare accounting for 50% in FCWs and ≈26.1% in PCWs.
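The per-provider percentages in this subsection can be reproduced from the raw counts. The following is a minimal sketch under the assumption that the hosting share is hosted/total, MPFP is MC/hosted, and MP is MC/total; the class and function names are hypothetical, and the category totals (78 FCWs, 111 PCWs) are obtained by summing the counts listed above.

```python
from dataclasses import dataclass


@dataclass
class ProviderRow:
    hosted: int     # websites hosted by the provider in this category
    malicious: int  # of those, how many were flagged (MC)


def metrics(row: ProviderRow, category_total: int) -> tuple[float, float, float]:
    """Return (hosting share %, MPFP %, MP %) for one provider row."""
    share = 100.0 * row.hosted / category_total
    mpfp = 100.0 * row.malicious / row.hosted if row.hosted else 0.0
    mp = 100.0 * row.malicious / category_total
    return round(share, 2), round(mpfp, 2), round(mp, 2)


# Cloudflare in the Games category: 42 FCWs (39 flagged), 37 PCWs (29 flagged).
fcw = metrics(ProviderRow(hosted=42, malicious=39), category_total=78)
pcw = metrics(ProviderRow(hosted=37, malicious=29), category_total=111)
print(fcw, pcw)
```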
4.5.3. Movie Websites
Table 11a shows the distribution of Movies FCWs and PCWs on different CSPs. For FCWs, Cloudflare is the leading CSP, followed by Liquid Web (≈11.6%), Trellian (≈9.7%), Sp-Team (≈7.7%), Amazon (≈6.1%), and all others hosting 118 websites (≈38.1%). Regarding PCWs, Amazon leads with 56 websites (≈36.8%), followed by Cloudflare with 18 websites (≈11.8%), Akamai with 9 websites (≈5.9%), Google with 7 websites (≈4.6%), Fastly with 5 websites (≈3.3%), and “Others” hosting 57 websites (37.5%). In terms of MC, Cloudflare has the largest count in both FCWs (17) and PCWs (15). Sp-Team has the highest MPFP among FCWs (37.5%), while Cloudflare tops PCWs with ≈83.3%. The MP is distributed among various CSPs, with Cloudflare accounting for ≈5.5% in FCWs and ≈9.9% in PCWs.
4.5.4. Music Websites
Table 11b presents the distribution of Music FCWs and PCWs across different CSPs. For FCWs, Cloudflare is the dominant CSP, hosting 22 websites (27.5% of the total). Sp-Team follows with 6 websites (7.5%), then Google with 4 websites (5%), Amazon with 3 websites (≈3.8%), Liquid Web with 2 websites (2.5%), and the “Others” category with 43 websites (≈53.8%). In the case of PCWs, Amazon leads with 30 websites (≈35%), followed by Cloudflare with 12 websites (≈14%), Fastly and Google each with 4 websites (≈4.7%), Apple with 2 websites (≈2.3%), and “Others” with 34 websites (≈39.5%). Regarding MC, Cloudflare has the highest count for both FCWs (16) and PCWs (9). In terms of MPFP, Liquid Web has the highest percentage in FCWs with 100%, while Cloudflare takes the lead in PCWs with 75%. The MP is distributed among various CSPs, with Cloudflare accounting for 20% in FCWs and ≈10.5% in PCWs.
4.5.5. Software Websites
Table 11c shows the distribution of Software FCWs and PCWs across various CSPs. In the case of FCWs, Cloudflare is the leading CSP with 80 websites (≈45.5%), followed by Amazon with 19 websites (10.8%), Liquid Web with 16 websites (≈9.1%), LeaseWeb with 11 websites (≈6.3%), Voxility LLP with 4 websites (≈2.3%), and “Others” with 46 websites (≈26.1%). For PCWs, Amazon and Cloudflare are the most prominent CSPs, each hosting 37 websites (≈20.4%), followed by Microsoft with 9 websites (≈5%), Akamai and Google each with 6 websites (≈3.3%), and “Others” with 86 websites (≈47.5%). In terms of MC, Cloudflare has the highest count for FCWs (71) and the second highest for PCWs (25). Cloudflare also has the highest MPFP in both groups: ≈88.8% for FCWs and 67.57% for PCWs. Regarding MP, Cloudflare has the highest percentage in FCWs at ≈40.3% and the second highest in PCWs at ≈13.8%.
6. Discussion
The results of the network-scale distribution, spatial analysis, CSP evaluation, and temporal re-scans reveal consistent yet nuanced patterns across the benchmark, FCWs, and PCWs datasets. Below, we summarize the key takeaways, followed by the shortcomings and limitations of this study and the overall recommendations.
6.1. Main Takeaways
Medium-Scale Networks as the Core Risk Zone. Across categories, FCWs and PCWs are disproportionately concentrated in medium-scale networks, which consistently show the highest malicious presence (MP). These networks provide the optimal balance of availability, cost, and oversight gaps for attackers. However, isolating medium-scale networks is ineffective, as many benign websites also reside there. A finer-grained tiering of medium networks is necessary to separate malicious from benign clusters. PCWs utilize larger-scale networks than FCWs, highlighting the link between reliable networks and stronger security. Overall, large networks enforce higher standards, while medium networks remain the weakest link.
Business Model as the Primary Risk Driver. The divide between FCWs and PCWs is a stronger determinant of maliciousness than geography or CSP alone. FCWs are consistently more malicious across all categories, regardless of the host country or provider. Premium services enforce stricter compliance and security practices, resulting in significantly lower MPFP values. The ranking of categories by average MP confirms this divide: games (47.82%), software (41.49%), books (28.81%), music (28.1%), and movies (20.79%), with an overall average MP of 31.34%. This order also holds for the share of malicious websites hosted in medium-scale networks, addressing RQ1 and RQ2.
CSP Affinities Shape Hosting Risk. A small set of CSPs dominates hosting, but their risk profiles differ sharply. Cloudflare is both the most widely used CSP and the one with the highest malicious concentration, while Amazon consistently shows lower MPFP values, likely reflecting stronger enforcement. Liquid Web ranks second in MPFP, and Fastly hosted no malicious websites in the combined FCW/PCW dataset, although it did in the benchmark. Despite these contrasts, overlaps between malicious and benign websites across providers complicate efforts to isolate the “riskiest” CSPs. Notably, FCWs and PCWs follow heavy-tailed CSP distributions similar to the benchmark (top one million websites), though benign websites in PCWs cluster around a smaller set of providers. These findings provide answers for RQ5.
Geographic Risk is Uneven and Contextual. The United States dominates hosting overall (≈59%) and for both FCWs (≈51%) and PCWs (≈67%). Yet risk profiles diverge: FCWs in the U.S. exhibit high malicious counts (MC = 218, MPFP ≈ 55%), while PCWs remain comparatively clean (MPFP ≈ 4.5%). Other countries, such as Belgium, Germany, Romania, and Australia, host fewer websites but show disproportionately high MPFP values for FCWs, making them peripheral hot spots. In contrast, PCWs are concentrated in a few countries that report near-zero malicious footprints (e.g., Ireland, India, Germany, France, Belgium), reinforcing the role of regulatory maturity. More than half of the top CSPs (58.58%) are U.S.-based, while others are distributed across Belgium, the Netherlands, Germany, and Australia. These results contribute to addressing RQ3.
Category-Specific Vulnerability Profiles. Software FCWs are the most dangerous, with the highest MPFP values, reflecting the inherent risks of executable downloads. Games and music FCWs also show elevated maliciousness, often linked to cross-domain players. Books and movies show relatively lower rates but still contribute significant malicious shares, showing that distribution models (e.g., downloads vs. streaming) affect vulnerability levels.
Benchmark Comparison. Comparing FCWs and PCWs against the benchmark dataset highlights answers to RQ4. Network-scale distributions appear similar across datasets, but CSP distributions diverge. Cloudflare and Amazon dominate across all three datasets, though FCWs and PCWs have significantly higher malicious rates than the top one million websites. This elevated rate is largely driven by FCWs hosted on top CSPs. Liquid Web, for example, hosts the highest proportion of malicious websites in the benchmark (MPFP 23.08%), and ranks second in the combined FCW/PCW dataset. Heavy-tailed patterns are evident in all datasets, but benign websites are far more concentrated in PCWs.
Temporal Analysis. The re-scanning of FCWs and PCWs using VirusTotal provides insights into RQ6. FCWs reveal increasing maliciousness over time, while PCWs improve their security posture. Free movie websites show the highest increase in malicious activity, whereas premium games and premium books exhibit the strongest improvements. Overall, the security of content websites improved with time, as did the resilience of their infrastructure entities (networks, CSPs, and hosting countries).
Structural Insights for Future Defenses. Overall, the business model (free versus premium) is the most reliable predictor of maliciousness. Defensive strategies should focus on medium-scale networks and mid-tier hosting countries that repeatedly emerge as high-risk. Static blacklisting is inadequate, as attackers exploit dynamic infrastructure (e.g., shared players and free software) that requires behavioral and contextual defenses. Effective risk management must integrate fine-grained network tiering, CSP monitoring, and longitudinal scanning to capture evolving malicious patterns.
6.2. Limitations
This study has several limitations that should be taken into account when interpreting the results. First, the dataset represents a snapshot of FCWs and PCWs collected at a specific point in time. Because hosting infrastructures, CSP policies, and regional regulations evolve rapidly, the findings may not fully generalize to other timeframes or future ecosystem conditions. Although we perform an updated VirusTotal reassessment in
Section 5, long-term longitudinal measurements remain an important direction for future work.
Second, manual annotation was necessary to distinguish FCWs from PCWs and to determine content categories. While we followed a structured, keyword-based protocol with a verification pass to minimize ambiguity (
Section 3.2.1), manual labeling inevitably introduces the possibility of human error. Automated or ML-based classification was not used due to the semantic and context-dependent nature of business-model identification, which current automated methods cannot reliably perform without first relying on manually annotated training data.
Third, maliciousness detection relies on VirusTotal’s multi-engine evaluation. VirusTotal aggregates approximately 90 independent security engines, which provides strong cross-validation, yet individual engines may still produce false positives or false negatives. We employ the standard “≥1” threshold used in prior FCW/PCW studies to identify whether a website has been flagged by at least one engine. This heuristic supports a consistent comparison across FCWs and PCWs, but it does not imply causality, nor does it guarantee that every flagged website is malicious. These interpretive constraints are acknowledged in our analyses.
Fourth, our infrastructure-focused framework leverages IP-level metadata obtained through the ipdata and IPSHU APIs, including subnet masks, prefix lengths, ASN information, CSP identifiers, and geolocation. While these sources provide standardized and machine-readable infrastructure attributes, they do not include detailed WHOIS-registration data. WHOIS records are often incomplete, inconsistent across registrars, and heavily redacted, making them unsuitable for large-scale, reproducible infrastructure measurements. As a result, our classification and enumeration do not incorporate domain-registration attributes such as ownership, administrative contacts, or registrar-level history.
Fifth, although our analysis considers hosting geography at the country level, we treat regional subsidiaries of major CSPs (e.g., Amazon Data Services Canada, Amazon Data Services France) as part of their parent CSP. This simplification facilitates aggregate CSP-level characterization, but it may obscure potential regional differences in security practices or operational policies. A more granular examination of region-specific CSP subsidiaries represents a promising direction for future research.
Finally, while we perform correlation analyses to understand how infrastructure attributes relate to maliciousness, the study does not attempt predictive modeling or causal inference. Additional data sources—such as behavioral traces, flow-level logs, or longitudinal snapshots—would be required to build predictive or causal frameworks. Our results should therefore be interpreted as descriptive associations rather than causal mechanisms.
On Predictive and Causal Modeling. The goal of this study is descriptive rather than predictive: we characterize how maliciousness is associated with infrastructure attributes such as network scale, CSP distribution, and hosting geography. While these correlations highlight conditions under which malicious FCWs and PCWs are more prevalent, they do not establish causality. Developing predictive or causal models would require additional longitudinal, behavioral, or content-level data and a modeling framework beyond the scope of the present measurement study. Our empirical findings, however, provide a useful foundation on which such future modeling efforts can build.
Relation to SDN-Based Traffic Classification. Software-Defined Networking (SDN) offers flow-level programmability and centralized policy enforcement that can support fine-grained traffic classification. Such methods require packet- or flow-level data and programmable control-plane environments, which differ fundamentally from the infrastructure-level attributes examined in this study. Because our focus is on hosting networks, CSPs, and geographical distributions rather than traffic-flow behaviors, SDN-based classification techniques are complementary but outside the scope of our measurement framework.
Dataset Scale and Annotation Feasibility. An important consideration in interpreting our findings is the scale and nature of the dataset used in this study. Our dataset includes 1562 websites, a substantial yet deliberately bounded scope given the requirement for detailed manual annotation of access models, content categories, and infrastructure attributes. Manual inspection is essential for achieving accurate FCW/PCW labeling and fine-grained semantic categorization, but it does not scale linearly and cannot be reliably replaced by automated classifiers without first relying on similar human-labeled ground truth. Within these constraints, our dataset captures a broad and diverse cross-section of the FCW/PCW ecosystem, spanning multiple hosting networks, geographies, and content domains, and provides sufficient coverage to reveal consistent structural and security-related patterns. While larger datasets would be valuable in future work, the present scale reflects a balance between methodological rigor, feasibility, and the goal of characterizing key infrastructural dynamics with high labeling fidelity.
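The “≥1” labeling rule discussed under the third limitation can be sketched as follows. This is an illustrative sketch only: it assumes the per-engine verdicts for a website have already been collected into a mapping, the engine names are placeholders, and no VirusTotal API call is shown.

```python
def is_flagged(engine_verdicts: dict[str, bool], threshold: int = 1) -> bool:
    """True when at least `threshold` engines mark the website as malicious."""
    return sum(engine_verdicts.values()) >= threshold


# Hypothetical verdicts for one website (engine names are placeholders).
verdicts = {"engine_a": False, "engine_b": True, "engine_c": False}

print(is_flagged(verdicts))               # default "≥1" rule
print(is_flagged(verdicts, threshold=2))  # a stricter, hypothetical rule
```

Raising the threshold trades false positives for false negatives, which is why a flag under the “≥1” rule should be read as an indicator rather than a confirmed verdict.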
6.3. Recommendations
As in [1,3,4], our study found that FCWs are consistently more malicious and vulnerable than PCWs. Similar to other works that analyzed the security of the most used websites [10,11,29,43], our contrast analysis shows that the top one million websites are less malicious than FCWs. This suggests that, even when benign, FCWs may be more susceptible to security breaches. Furthermore, prior research examined security factors of CSPs and networks [20,21,23,36,37,38,39,44,45,46,47,48,49,50] and proposed techniques to strengthen networks with strong CSP affinities. Our findings echo this, highlighting CSPs frequently used to host malicious FCWs.
The results suggest that network administrators should adopt more stringent security measures to defend against malicious activities. Specifically, organizations should focus on addressing risks of medium-scale networks, as these are often linked to malicious websites. Examining the CSPs associated with FCWs can also help identify providers that host disproportionately high numbers of malicious websites, enabling targeted defensive or legal actions where appropriate. To improve classification accuracy, additional security annotations should be incorporated using tools beyond VirusTotal, such as Google Safe Browsing, PhishTank, and other security services.
While medium-scale networks exhibit a higher concentration of malicious activity, isolating them outright is operationally challenging and could lead to substantial collateral damage, as approximately 80% of benign websites also reside within these networks. To minimize such side effects, we refine our recommendation to emphasize stricter inspection, risk-weighted monitoring, and reputation-based throttling for medium-scale networks, rather than blanket isolation. These targeted controls preserve the benefits of prioritizing medium-risk segments while maintaining acceptable availability for benign services.
Further research is needed to deepen the understanding of the relationship between hosting patterns and malicious content. For instance, future studies could analyze factors such as website age or domain registration date, which may influence classification outcomes. Examining the dynamic code of FCWs could also reveal the severity of their vulnerabilities, thereby improving classification accuracy and offering deeper insight into their functionality.
Researchers should also investigate alternative methods for detecting malicious activity within medium-scale networks to strengthen internet security. Although the study shows that most websites fall within medium-scale networks, additional work is needed to determine which specific ranges pose the most significant risks. To address this, we propose dividing medium-scale networks into multiple tiers and assessing the security posture of websites within each tier. Such an approach would allow organizations to focus defenses on networks with heightened vulnerability, supporting more effective risk management.
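The proposed tiering could be prototyped along the following lines. This is a sketch under stated assumptions: the prefix-length range treated as “medium-scale” (/17 to /24) and the number of tiers are hypothetical parameters, not the boundaries used in this study, and the observations are made-up examples drawn from documentation address ranges.

```python
import ipaddress
from collections import defaultdict


def medium_tier(cidr, medium_range=(17, 24), n_tiers=4):
    """Tier index (0 = largest networks) inside a hypothetical medium-scale range."""
    lo, hi = medium_range
    prefix = ipaddress.ip_network(cidr, strict=False).prefixlen
    if not lo <= prefix <= hi:
        return None  # network falls outside the medium-scale range
    return (prefix - lo) * n_tiers // (hi - lo + 1)


def tier_malicious_rate(observations, **kwargs):
    """Percentage of flagged websites per tier, from (cidr, flagged) pairs."""
    totals = defaultdict(lambda: [0, 0])  # tier -> [websites, flagged]
    for cidr, flagged in observations:
        tier = medium_tier(cidr, **kwargs)
        if tier is None:
            continue
        totals[tier][0] += 1
        totals[tier][1] += int(flagged)
    return {tier: 100.0 * bad / n for tier, (n, bad) in sorted(totals.items())}


# Made-up observations: (hosting network, flagged by at least one engine).
sample = [("203.0.113.0/24", True), ("192.0.2.0/24", False),
          ("198.51.100.0/22", True), ("10.0.0.0/8", True)]
print(tier_malicious_rate(sample))
```

Comparing the per-tier rates would indicate whether risk is spread evenly across medium-scale networks or concentrated in a narrower band of prefix lengths.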
Finally, it is critical to study how attackers exploit free services hosted by trusted providers to launch attacks. Identifying these exploitation methods will enable organizations to detect and mitigate attacks more effectively, thereby reducing their duration and impact. By understanding attacker strategies, defenses can be proactively strengthened, either preventing attacks altogether or minimizing their consequences.
7. Conclusions and Future Work
Building on the missing insights and interpretive gaps outlined in the previous subsection, we now turn to the broader conclusions of this study. Our results show that FCWs and PCWs are concentrated in medium-scale networks, similar to malicious websites, implying that isolating this type of network alone may not be an effective solution. Furthermore, we identified Cloudflare (≈68.9%), Liquid Web (≈44.4%), LeaseWeb (≈29.4%), SP-Team (≈28.6%), and Trellian (≈23.8%) as the most common CSPs with high overlap between malicious and benign websites. This indicates a need for further investigation of their distribution and potential weaknesses in security protocols or policies in the countries where they operate.
Future work should examine how the distribution of FCWs and PCWs changes over time and whether these changes follow specific patterns. It is also essential to identify effective strategies to contain and limit the spread of malicious FCWs, considering factors such as network scale, CSPs, and hosting countries. Additionally, comparing the distribution and hosting patterns of FCWs with other cyber threats, such as phishing, scams, or ransomware attacks, is essential to uncover commonalities or differences in their spread and impact.
This study highlights the ongoing need to enhance the security of FCWs. Future research could explore vulnerability enumeration in FCWs to raise user awareness by identifying weak points in their infrastructure before attackers can exploit them.