1. Introduction
Cryptocurrencies like Bitcoin have revolutionized the financial landscape by generating digital currencies and processing financial transactions in a distributed and decentralized manner without the traditional centralized banking systems. Such cryptocurrency invention and development have been motivated by anonymity and the ability to bypass censorship and regulation by the centralized system. To ensure application-layer anonymity, cryptocurrency users generate and sign their own public keys to create a pseudonym account ID without needing registration. However, the anonymity provided at Bitcoin’s application layer is compromised by its network layer, as anyone can potentially link network-layer IP addresses to Bitcoin users’ real-world identities. To address this, Bitcoin adopts and supports the anonymous routing of Tor, I2P, and CJDNS in the underlying P2P network. (Even though CJDNS is supported by the default Bitcoin Core software, our Bitcoin prototype connected to the Mainnet observed no peers participating and zero networking via CJDNS from May 2023 to April 2024. Because Bitcoin networking practice currently does not use CJDNS anonymous routing, our paper focuses on non-anonymous routing vs. Tor vs. I2P.)
While anonymous routing protects anonymity (required by financial transactions at the Bitcoin application layer), adversaries can falsify and spoof their routing type, thereby undermining the trustworthiness of network interactions. The integrity of cryptocurrency networks can be compromised because adversaries can manipulate their routing types to forgo the overheads of anonymous routing: the relay nodes that anonymous routing adds by design incur performance and latency costs. Additionally, the security of these networks can be jeopardized because spoofing attacks can deceive the traditional methods used for identifying the networking type of peer nodes. Profile detection approaches, such as those using the deterministic features of anonymous routing, are vulnerable to an adversary spoofing its anonymous-routing type, which is particularly easy and high-risk in permissionless cryptocurrency environments (no registration control).
In response to these challenges, our research proposes a novel approach to network fingerprinting using network sensing and machine learning. Instead of relying on spoofing-vulnerable deterministic features or IP address-based identification, we use networking behaviors to detect and distinguish between peers using non-anonymous routing, Tor, or I2P. Our work therefore promotes anonymity, which is important in cryptocurrency, as described in Section 3.1, by detecting profile spoofing and ensuring anonymous networking use.
In addition to processing data to detect and classify anonymous routing types, we implemented an active Bitcoin node connected to the Mainnet using various routing protocols, including non-anonymous IP, Tor, I2P, and CJDNS, to collect diverse networking traffic data. We collected both networking data (from the Connection Manager and the backend Address Manager) and consensus data to adopt a systems approach for training our machine learning model. This data-driven approach distinguishes our work from most prior studies, which primarily focused on blockchain ledger data.
The rest of this paper is organized as follows. Section 2 provides an overview of the related literature and highlights the research gap. Section 3 presents the research background, including cryptocurrency for anonymity and anonymous routing. Section 4 discusses the motivation regarding spoofing threats against profile detection. Section 5 presents the prototype implementation and data collection. Section 6 presents the network fingerprinting using machine learning along with a machine learning model description. Section 7 discusses the machine learning primer, implementation, empirical analyses, feature engineering, confusion matrix, and performance metrics. Finally, Section 9 summarizes the findings, outlines the limitations, and suggests future research directions in the field of network fingerprinting for anonymous networking detection in cryptocurrency.
2. Related Work
2.1. Data Analysis and Machine Learning in Cryptocurrency and Blockchain
Machine learning techniques have been widely applied to enhance the security and privacy of cryptocurrency and blockchain systems. In this section, we discuss the relevant literature regarding the use of machine learning for network connectivity estimation and anomaly detection in blockchain networks. Martin et al. [1] and Kim et al. [2] proposed machine learning-based approaches for anomaly detection in blockchain networks. Kim et al. [3] used machine learning to estimate networking connectivity health without relying on identities, which are easily spoofable in permissionless cryptocurrency. Patel et al. [4] applied graph neural networks for anomaly detection in blockchain transaction graphs, while Zhou et al. [5] evaluated ML-based methods for vulnerability detection in smart contracts, particularly for blockchain IoT contexts. However, these works focused on the IP protocol without Tor or I2P (no anonymity protection). In contrast, our research specifically targets the analysis of anonymous routing, including a comparison with non-anonymous routing.
Other previous research analyzed networking data without machine learning. Sakib et al. [6] analyzed networking performance under the anonymous routing of Tor and I2P, compared it with non-anonymous routing, and showed that the networking overheads can be significant, resulting in partitioning. Fan et al. [7] and Hong et al. [8] analyzed networking behaviors without using identity information (as it can be easily manipulated in permissionless cryptocurrency environments) for robust anomaly detection and connectivity estimation. Biryukov et al. [9] revealed how anonymity systems like Tor could be exploited to deanonymize Bitcoin users, while Koshy et al. [10] conducted a network-level analysis of Bitcoin clients, highlighting vulnerabilities in P2P communications. These previous studies are related to this paper, as our research also analyzes networking data. However, these works did not detect or classify networking types and differ from our work in the research objective. To the best of our knowledge, our work is the first to classify networking types to detect anonymous routing. This new approach helps us better understand how peers behave and makes networks more secure in open, decentralized blockchain systems.
2.2. Anonymous Routing Detection Beyond Cryptocurrency
Network traffic analysis of anonymous routing protocols has been practiced since as early as 2001 [11], not long after the invention of packet sniffing. Even today, the detection of anonymous routing remains an active field of research, especially for those working in security monitoring and law enforcement agencies. The most commonly targeted protocols are Tor and I2P.
A fundamental method for detecting anonymous routing is traffic analysis, which is widely used. In this method, an attacker detects the use of anonymous routing from regular network traffic by analyzing a combination of timing and packet size features [12,13,14,15,16,17]. Because timing and packet size reveal enough information for attackers to perform fingerprinting attacks, researchers have obfuscated the anonymous traffic to make these features less effective, leading to the development of counter-obfuscation detection techniques [18,19,20]. While I2P and Tor differ in terms of traffic flow, i.e., Tor is bidirectional and I2P is unidirectional, the detection of I2P traffic still utilizes a similar combination of time and packet size features [21,22,23,24]. The difference is that I2P traffic is split, and calculating the weight can be more challenging because the incoming packets may originate from different destinations. While previous works have proven that machine learning can detect anonymous routing traffic, we used machine learning to detect whether a Bitcoin peer sent its traffic over Tor or I2P networks. Because there was no publicly available dataset appropriate for our research goal, we collected data from our Bitcoin node connected to the real-world Mainnet to train the model.
2.3. Spoofing in Permissionless Networks
Spoofing attacks in permissionless networks present a significant challenge to maintaining security and anonymity. In decentralized networks like cryptocurrency, adversaries can pretend to be nodes, falsify routing identities, and manipulate network behavior to avoid detection.
One of the primary spoofing threats in cryptocurrency networks involves routing obscuration, where adversaries falsely claim to use anonymous routing protocols like Tor or I2P while forwarding traffic directly over non-anonymous IP-based connections. Such attacks can undermine privacy protections by allowing adversaries to avoid the extra routing steps through relays while appearing anonymous. Traditional detection methods that rely on IP addresses and deterministic network features are easily fooled by such deceptive tactics [25,26,27]. Recent advancements in network fingerprinting have improved the ability to detect spoofing attempts. Techniques such as deep learning classifiers can distinguish between genuine anonymous traffic and falsified claims [28,29,30].
While previous works depend on unsupervised learning, our approach uses supervised learning for behavioral network fingerprinting to detect spoofing attacks dynamically. We train on real-time traffic patterns and classify networking types based on their unique networking behaviors instead of depending on predefined characteristics that adversaries can easily manipulate.
3. Background
3.1. Cryptocurrency for Anonymity
Cryptocurrencies, inspired by the cypherpunk movement’s vision, were designed to ensure anonymity and avoid censorship. These digital currencies operate on a decentralized blockchain system, eliminating the need for a central authority. This setup allows anyone to participate anonymously (e.g., users can join the network, validate transactions, or mine new blocks without needing to register or disclose their identity). Users create and use unique digital signatures that generate pseudonymous accounts, unlinking their transaction activities from real-world identities. This permissionless design is different from other blockchains that require user registration and identity checks. Moreover, Satoshi Nakamoto, the creator of Bitcoin [31], encourages users to change their transaction addresses frequently for anonymity purposes. Furthermore, to enhance this anonymity, Bitcoin supports connections through anonymous routing protocols such as Tor, I2P, and CJDNS, allowing users to obfuscate their online activities and prevent them from being linked to their real-world identities.
3.2. Anonymous Routing
Anonymous routing is designed to protect users’ identities and online activities from surveillance and tracking by applying multiple layers of encryption to internet traffic and routing it through various relay nodes. Well-known anonymous routing applications include Tor and I2P.
When a user initiates a Tor session, the Tor client software constructs a circuit, a path through the Tor network, by randomly selecting and negotiating connections with a series of relays. As the traffic is routed through this circuit, each relay only knows the identity of the relay immediately preceding it and the relay immediately following it, but not the entire path. In addition to using multiple relay nodes, the client software adds multiple layers of encryption to the traffic, such that only the final node is aware of the destination [32]. Together, they ensure the anonymity of the user. People can use Tor as a proxy to visit public websites, as well as hidden services that are only accessible on the Tor network.
I2P uses a similar path scheme but applies it differently from Tor. Instead of using a single path (circuit) for traffic to a destination, I2P uses two separate paths (tunnels): one for outgoing traffic and another for incoming traffic [33]. This approach enhances anonymity since each path involves multiple relay nodes, but it simultaneously reduces performance. Unlike Tor, I2P cannot be used as a proxy to visit public websites because it does not allow traffic to exit its network.
4. Motivation: Spoofing Threat Against Profile Detection
Profile detection based on deterministic feature analysis finds and compares network traffic features against known signatures. Signatures of traffic direction and pattern, as shown in Table 1, can be used to distinguish Tor and I2P traffic from non-anonymous traffic [21,22,23].
IP addresses can also be used to detect if the traffic relates to Tor because there are commonly known Tor relay nodes, which are P2P nodes volunteering to serve in the Tor network as relays. These Tor relay nodes are known through the public consensus directory, which can be downloaded by any node (typically those using the Tor service).
IP spoofing involves manipulating the IP address in packet headers to disguise the sender’s true identity or location, while profile spoofing involves creating a fake online profile or identity that mimics a real person’s or entity’s profile. Profile spoofing also uses IP spoofing to help hide who is really behind the fake profile.
However, such profile detection and deterministic feature analysis are not effective against an attacker capable of profile spoofing. An attacker can fabricate networking behavior to match the known deterministic features of Tor and I2P traffic and trigger false positives (traffic detected as Tor/I2P when it is not), a threat studied as early as 2001 by Patton et al. [39]. Spoofing in our paper therefore refers to profile spoofing, in which an attacker deceives the victim by mimicking others’ actions, habits, etc. For example, an attacker spoofs Tor traffic, causing the firewall identifier to generate a false-positive alert, indicating that it is Tor traffic when, in reality, it consists of meaningless content. This is different from, and often more sophisticated than, IP spoofing, where an attacker pretends to be another node based on its identity.
5. Bitcoin Prototype Implementation and Mainnet Data Collection
We implemented an active Bitcoin node prototype connected to the Mainnet. In our implementation, the peer node connections varied from 0 to 10, and we collected the data for individual peer connections. We collected networking traffic data utilizing a range of routing protocols that are supported by the Bitcoin Core, including non-anonymous IP, Tor, I2P, and CJDNS. Our data collection included both networking data (i.e., from the Connection Manager and the backend Address Manager) and consensus data, which we utilized to train the machine learning model. This comprehensive data collection process was designed to support our research objectives of detecting anonymous routing and classifying the networking types of peer nodes. Furthermore, unlike many previous studies that focused primarily on blockchain ledger data that can be downloaded [40,41,42], we sensed and collected real-time networking traffic data.
Figure 1 illustrates the various logical components from which our Bitcoin node logged and collected data to provide a clear view of our data-driven approach.
To classify peer nodes based on their networking behaviors, we collected and analyzed network traffic data generated during interactions of nodes once the connection socket was established, as indicated on the left of Figure 1 by the blue bidirectional arrow representing the connection to the Bitcoin Mainnet. We used Bitcoin Core v0.26.0 with the default settings to collect the network traffic data. The data were measured with respect to each individual peer connection, logging the packet-level information, including the bytes sent, the message type, and the timestamp of each packet. Thus, our data collection process captured the dynamic behaviors of peer nodes during network interactions. The collected data were further fed to a supervised machine learning model, as shown on the right in Figure 1, which detected and classified the networking types. We implemented the different routing types (IPv4, Tor, and I2P) and injected that traffic ourselves; therefore, we had the ground truth for each sample. Since our Bitcoin node was manually configured to use a specific routing method for each connection, we knew exactly which type of network (IPv4, Tor, or I2P) was used.
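As an illustration of how per-connection measurements can be turned into behavioral features, the following minimal sketch derives a ping RTT and average uplink/downlink rates from peer records. It assumes records shaped like the output of Bitcoin Core's `getpeerinfo` RPC (fields `id`, `network`, `pingtime`, `bytessent`, `bytesrecv`, `conntime`); the text does not state that this RPC was used, the sample values are invented, and the helper name `peer_features` is hypothetical.

```python
import json

# Illustrative getpeerinfo-style records; the values are made up for this sketch.
SAMPLE = json.loads("""
[
  {"id": 1, "network": "ipv4",  "pingtime": 0.045, "bytessent": 120000,
   "bytesrecv": 480000, "conntime": 1690000000},
  {"id": 2, "network": "onion", "pingtime": 0.610, "bytessent": 60000,
   "bytesrecv": 150000, "conntime": 1690000400}
]
""")

def peer_features(peer, now):
    """Turn one peer record into per-connection behavioral features."""
    duration = max(now - peer["conntime"], 1)  # seconds connected; avoid division by zero
    return {
        "peer_id": peer["id"],
        "label": peer["network"],             # routing type, used as the training label
        "ping_rtt_s": peer.get("pingtime"),   # None if no ping has completed yet
        "uplink_Bps": peer["bytessent"] / duration,
        "downlink_Bps": peer["bytesrecv"] / duration,
    }

NOW = 1690001000  # fixed "current time" so the sketch is reproducible
rows = [peer_features(p, NOW) for p in SAMPLE]
for row in rows:
    print(row)
```

Keeping the routing type only as a label, and not as an input feature, mirrors the idea that the classifier should rely on behavior and not on a declared type.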
Our Bitcoin prototypes observed and sensed the real-world IP (non-anonymous), Tor, and I2P networking simultaneously from June 2023 to August 2023. The virtual machines were hosted within the same physical machine so that they sensed and collected the networking data from the same place in the networking topology. The dataset consisted of 35,080 labeled peer connection samples, collected from our Bitcoin prototype running on the Mainnet. The node was configured to accept connections using IPv4, Tor, and I2P protocols. We split the dataset into 70% training, 15% validation, and 15% testing. The validation set was used for hyperparameter tuning, while the test set was used for performance reporting.
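The 70/15/15 split can be sketched as follows. This is a minimal illustration that assumes the split is stratified by routing type, which the text does not state; the function name `stratified_split`, the seed, and the toy labels are all hypothetical.

```python
import random
from collections import defaultdict

def stratified_split(labels, fractions=(0.70, 0.15, 0.15), seed=0):
    """Return (train, val, test) index lists that preserve class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test

# Toy labels (not the paper's data): 100 samples per routing type.
labels = ["ipv4"] * 100 + ["tor"] * 100 + ["i2p"] * 100
train_idx, val_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(val_idx), len(test_idx))
```

The validation indices would then drive hyperparameter tuning, and the test indices would be held back for the final performance report.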
6. Network Fingerprinting Using Machine Learning
In our research, we tackled the challenge of classifying networking types of cryptocurrency peer nodes, specifically focusing on distinguishing between non-anonymous routing, Tor (The Onion Router), and I2P (Invisible Internet Project). Cryptocurrencies like Bitcoin rely on anonymous routing protocols such as Tor and I2P to ensure privacy and censorship resistance. However, adversaries may attempt to deceive network monitoring systems by spoofing their routing type, posing a threat to the integrity of the network.
To address this issue, we propose a novel approach that leverages networking behaviors rather than deterministic features or IP addresses for identification. Traditional methods relying on deterministic features are vulnerable to spoofing attacks, making them unreliable for accurate classification. Instead, we utilize machine learning techniques to analyze the dynamic behaviors exhibited by peer nodes during network interactions.
Our approach involves the application of supervised machine learning algorithms to classify peer nodes based on their networking behaviors. By training our models on labeled datasets that capture the distinct characteristics of non-anonymous routing, Tor, and I2P, we enabled them to learn and recognize patterns indicative of each networking type. This enabled us to differentiate between legitimate anonymous routing and potential spoofing attempts effectively.
6.1. Machine Learning Model
We used supervised learning. Supervised learning was suitable for our work as it aligned with the nature of our problem, which involved classifying networking types in cryptocurrency networks based on observed behaviors. By providing the model with labeled examples of network data and their corresponding networking types (such as Tor, I2P, or non-anonymous routing), we could train it to distinguish between different classes and generalize its classifications to new data.
Our work used the CatBoost, Random Forest, and HistGradientBoosting classifiers, all supervised learning models, due to their effectiveness and efficiency in handling our specific dataset characteristics and problem domain. For instance, these classifiers were well-suited for our work due to their ability to handle NaN (missing) values, which were present in our data. Additionally, CatBoost and HistGradientBoosting use gradient-boosting techniques to sequentially improve the performance of decision trees, resulting in effective classification and efficient training. By leveraging these models, we could effectively address the challenges of classifying networking types in cryptocurrency networks and achieve accurate and reliable classifications.
6.2. Overfitting Prevention Strategies
To address concerns about overfitting, particularly given the moderate dataset size and limited number of features, we employed several techniques to ensure generalization and model stability. First, we applied stratified 5-fold cross-validation to maintain class distribution across folds and provide a more reliable evaluation of the model’s performance. Additionally, we performed hyperparameter tuning using grid search for tree-based models such as CatBoost and HistGradientBoosting, including adjustments to L2 regularization strength and learning rate to penalize overly complex models. Furthermore, we incorporated early stopping during training to monitor validation loss and halt training when performance no longer improved, effectively preventing overfitting to the training data. These measures collectively enhanced our model’s ability to generalize to unseen data while preserving high accuracy.
7. Empirical Analyses
We present the empirical analyses conducted to evaluate the effectiveness and performance of our networking fingerprinting approach using machine learning in a cryptocurrency network. The empirical analyses were structured to assess various aspects of our proposed solution, including classification accuracy, detection capabilities, and scalability.
7.1. Feature Engineering and Significance
Feature engineering plays a crucial role in the effectiveness of machine learning models for networking fingerprinting in cryptocurrency networks. The selection of informative features directly influences a model’s ability to distinguish between different networking types, ultimately impacting its performance metrics. This analysis identified several key features that were important for our classification, shown in Figure 2. These included the Average Ping RTT, Uplink Bandwidth, Downlink Bandwidth, CPU Utilization, and Memory Utilization. These selected features collectively contributed to the robustness and effectiveness of the machine learning model for networking fingerprinting in cryptocurrency networks. By leveraging these informative features, the model demonstrated high accuracy, precision, recall, and F1-score in classifying networking types, thereby enabling improved network analysis and security measures within the cryptocurrency ecosystem. Within these five features, Memory Utilization played a significant role, as the accuracy was greatly degraded without it. The reason for this was that Tor and I2P networks require more memory for encrypted communication and routing through relays. We also tested other features to compare the accuracy with these five features, such as the Average Number of Blocks Behind, Redundant Transactions Rate, and Number of Peers with Blockchain Out-of-Date. Nevertheless, those features did not improve model accuracy because their values were always zero and therefore carried no information. We excluded other features, such as the number of Tor peer connections and I2P connections, because they were not trustworthy; excluding them also increased the performance of our model. Although Memory Utilization appeared to be the least significant among the five top features, ablation testing revealed that removing it caused the largest drop in model accuracy: 81% for IPv4, 84% for Tor, and 89% for I2P. This indicated that while it ranked low in individual importance, it contributed unique information not captured by other features, making it crucial in combination with them.
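The ablation procedure mentioned above (remove one feature, retrain, and compare against the full-feature baseline) can be sketched generically. The function `ablation_drops`, the feature identifiers, and the scorer `toy_evaluate` with its numbers are hypothetical placeholders; in practice `evaluate` would train and score a real classifier on the given feature subset.

```python
def ablation_drops(features, evaluate):
    """Accuracy drop caused by removing each feature in turn.

    `evaluate` is a caller-supplied function that trains and scores a model
    on the given feature subset and returns its accuracy.
    """
    baseline = evaluate(list(features))
    drops = {}
    for removed in features:
        kept = [f for f in features if f != removed]
        drops[removed] = baseline - evaluate(kept)
    return drops

FEATURES = ["ping_rtt", "uplink_bw", "downlink_bw", "cpu_util", "mem_util"]

def toy_evaluate(subset):
    # Placeholder scorer with made-up numbers, only to make the sketch runnable:
    # each feature adds a fixed amount of accuracy on top of a 0.5 floor.
    gain = {"ping_rtt": 0.05, "uplink_bw": 0.08, "downlink_bw": 0.08,
            "cpu_util": 0.07, "mem_util": 0.15}
    return 0.5 + sum(gain[f] for f in subset)

drops = ablation_drops(FEATURES, toy_evaluate)
most_critical = max(drops, key=drops.get)
print(most_critical, drops)
```

Because ablation measures what the remaining features cannot compensate for, its ranking can differ from a per-feature importance ranking, which is the situation described for Memory Utilization.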
7.2. Comparison of Confusion Matrices: CatBoost vs. Random Forest vs. HistGradientBoosting
The comparison of the confusion matrices shown in Figure 3a–c for the CatBoost, Random Forest, and HistGradientBoosting classifiers revealed significant differences in their classification performance for I2P, IPv4, and Tor networks. CatBoost demonstrated a slight advantage in classifying I2P networks, correctly identifying 10,855 instances, compared to 10,815 for Random Forest and 10,367 for HistGradientBoosting. CatBoost misclassified fewer I2P instances as IPv4 or Tor. However, Random Forest performed better in classifying IPv4 and Tor networks, correctly classifying 10,502 IPv4 instances compared to 10,468 for CatBoost and 10,048 for HistGradientBoosting. Similarly, for Tor classification, Random Forest correctly identified 10,522 instances, compared to 10,486 for CatBoost and 10,183 for HistGradientBoosting.
When examining misclassification patterns, CatBoost showed fewer misclassifications for I2P, while Random Forest and HistGradientBoosting showed slightly fewer misclassifications in specific IPv4 and Tor error categories. For example, CatBoost misclassified 348 I2P instances as IPv4, compared to 360 for Random Forest and 1032 for HistGradientBoosting. Similarly, CatBoost misclassified 478 I2P instances as Tor, compared to 506 for Random Forest; HistGradientBoosting misclassified only 282 I2P instances as Tor, but its 1032 I2P-to-IPv4 errors left it with the most I2P misclassifications overall. Conversely, Random Forest made fewer Tor-to-I2P misclassifications than CatBoost (609 vs. 657, with 594 for HistGradientBoosting) and fewer IPv4-to-Tor misclassifications (726 vs. 770, with 649 for HistGradientBoosting).
Overall, the models demonstrated strong performance, correctly classifying the majority of the 35,080 samples across I2P, IPv4, and Tor networks. Of these samples, CatBoost misclassified 3271, Random Forest misclassified 3241, and HistGradientBoosting misclassified 4482.
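The totals above follow directly from the per-class counts quoted in this section, as the short check below shows. It uses only numbers stated in the text; the recall computation assumes that every I2P sample was predicted as I2P, IPv4, or Tor.

```python
TOTAL_SAMPLES = 35080

# Correctly classified counts per class, as reported in the text.
correct = {
    "CatBoost": {"I2P": 10855, "IPv4": 10468, "Tor": 10486},
    "Random Forest": {"I2P": 10815, "IPv4": 10502, "Tor": 10522},
    "HistGradientBoosting": {"I2P": 10367, "IPv4": 10048, "Tor": 10183},
}
# I2P instances predicted as another class, as reported in the text.
i2p_errors = {
    "CatBoost": {"IPv4": 348, "Tor": 478},
    "Random Forest": {"IPv4": 360, "Tor": 506},
    "HistGradientBoosting": {"IPv4": 1032, "Tor": 282},
}

def misclassified(model):
    """Total errors = all samples minus the correctly classified ones."""
    return TOTAL_SAMPLES - sum(correct[model].values())

def i2p_recall(model):
    """Recall for I2P = true positives / (true positives + false negatives)."""
    tp = correct[model]["I2P"]
    fn = sum(i2p_errors[model].values())
    return tp / (tp + fn)

for model in correct:
    print(model, misclassified(model), round(i2p_recall(model), 4))
```

For CatBoost and Random Forest, the resulting I2P recall values round to 0.9293 and 0.9259, matching the recall figures reported in Section 7.3.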
7.3. Accuracy and F-1 Performance Analyses
The performance metrics provided important information about how effective the classification model was in distinguishing between I2P, IPv4, and Tor networks operating in cryptocurrency for network fingerprinting. Without our scheme, the peer node was vulnerable to profile spoofing and would have 0% accuracy against profile spoofing threats.
In Figure 4, the horizontal axis represents the networking types (IPv4, Tor, and I2P), whereas the vertical axis denotes the classification accuracy. The CatBoost classifier, shown in blue, achieved the highest accuracy for I2P (0.9474), with 0.9367 for IPv4 and 0.9294 for Tor. In comparison, the Random Forest classifier, shown in green, achieved slightly better accuracy for IPv4 (0.9387) and Tor (0.9308) but slightly lower accuracy for I2P (0.9457). The HistGradientBoosting classifier, shown in purple, demonstrated the lowest accuracy across all network types, with its highest accuracy at 0.9199 for Tor. The differences between CatBoost and Random Forest were small: Random Forest performed better for IPv4 and Tor, whereas CatBoost performed better on I2P traffic.
Table 2 provides a deeper breakdown of precision, recall, and F1-score. The precision values indicated how well each model minimized false positives, with CatBoost ranging from 0.8936 to 0.9143 and Random Forest from 0.8952 to 0.9155. The HistGradientBoosting classifier exhibited a wider precision range with a lower minimum, from 0.8647 to 0.9165, performing best on Tor traffic. Similarly, the recall values highlighted each model’s ability to correctly identify instances of each network type, where CatBoost achieved up to 0.9293 recall for I2P, compared to 0.9259 for Random Forest. HistGradientBoosting showed a lower recall, where it scored 0.8720, indicating a higher rate of missed detections in this network type.
The F1-score, which balances precision and recall, remained consistently high for CatBoost and Random Forest, exceeding 0.89 for all networking types. CatBoost achieved its best F1-score (0.9217) for I2P, showing its strength in classifying this network type. In contrast, HistGradientBoosting demonstrated a lower F1-score of 0.8730 for I2P, reflecting its comparatively weaker classification capability for this network type.
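As a worked check of how the F1-score combines the two metrics, the sketch below computes the harmonic mean of precision and recall. It assumes that CatBoost's highest reported precision (0.9143) belongs to the I2P class, as its highest recall (0.9293) does; the text gives the precision range but not the per-class assignment.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Assumption: CatBoost's top precision (0.9143) is for I2P, like its top recall (0.9293).
catboost_i2p_f1 = f1_score(0.9143, 0.9293)
print(round(catboost_i2p_f1, 4))
```

Under that assumption the result rounds to 0.9217, which agrees with the best CatBoost F1-score reported for I2P.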
Overall, these performance metrics underscore the robustness of the CatBoost and Random Forest classifiers in accurately distinguishing between networking types within cryptocurrency networks. CatBoost outperformed Random Forest in I2P classification by a margin of 0.17 percentage points in accuracy and 0.34 percentage points in recall, demonstrating a slightly better ability to identify I2P traffic patterns. Conversely, Random Forest performed better for IPv4 and Tor, with an accuracy advantage of 0.20 percentage points for IPv4 and 0.14 percentage points for Tor. Meanwhile, HistGradientBoosting lagged behind in overall performance, particularly in I2P classification, where it recorded the lowest accuracy and recall values. These differences, though small, suggest that CatBoost is better suited for applications requiring precise I2P detection, Random Forest is preferable for distinguishing IPv4 and Tor traffic, and HistGradientBoosting may not be a good choice for distinguishing between these network types. These findings show the importance of selecting a classification model based on the specific network types that need to be prioritized for accurate detection and monitoring.
7.4. Time Overhead Performance Analysis
While both the CatBoost and Random Forest classifiers achieved comparable accuracy in detecting network types, computational efficiency was an important factor in selecting the model with the best performance. Training and testing speed is particularly important in permissionless networks where rapid classification is necessary to counter spoofing attempts.
Our experimental results are shown in Figure 5, where the horizontal axis represents two categories, “Training” and “Testing”; the vertical axis denotes the time in seconds, and the testing times are scaled for better visibility. The dotted area with the blue border represents the CatBoost classifier, while the diagonal striped area with the orange border represents the Random Forest classifier. HistGradientBoosting is represented by a green crosshatch pattern. Error bars at the top of each bar indicate confidence intervals. The length of the error bars shows the range within which the actual time values were expected to fall, providing a visual representation of data reliability.
Our experimental results show that CatBoost required the longest training time, with a mean of 355.19 s and a confidence interval (CI) of s. In contrast, Random Forest completed training in 241.65 s (CI = s), demonstrating a significantly lower computational cost. HistGradientBoosting, however, was the most efficient in training, requiring only 6.12 s (CI = s), making it highly advantageous in resource-constrained environments.
In terms of testing speed, CatBoost significantly outperformed Random Forest, achieving a mean testing time of 0.80 s (CI = s), whereas Random Forest required 4.07 s (CI = s). This suggests that CatBoost is more suitable for real-world classification due to its strengthened decision tree structure. HistGradientBoosting maintained a lower testing time of 1.05 s (CI = s), making it an efficient alternative for fast predictions. Since all three models yielded high accuracy, the selection of the best model depends on computational efficiency. CatBoost is preferable for real-time classification due to its superior testing speed, while Random Forest offers a balance between training time and accuracy. HistGradientBoosting is the best choice for scenarios requiring minimal training times with relatively fast predictions.
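A timing measurement with a confidence interval can be sketched as follows. The number of repetitions, the 95% level, and the normal-approximation formula are assumptions, since the text does not state how the intervals were computed; `dummy_workload` is a hypothetical stand-in for a model's training or prediction call.

```python
import math
import statistics
import time

def time_runs(fn, repeats=10):
    """Run fn repeatedly and return (mean seconds, 95% CI half-width)."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    mean = statistics.mean(samples)
    # Normal-approximation interval; a t-quantile would be more exact for few runs.
    half_width = 1.96 * statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, half_width

def dummy_workload():
    # Stand-in for model.fit(...) or model.predict(...).
    sum(i * i for i in range(50_000))

mean_s, ci_s = time_runs(dummy_workload)
print(f"{mean_s:.4f} s (CI half-width = {ci_s:.4f} s)")
```

Repeating the measurement matters most for the sub-second testing times, where scheduling noise can be comparable to the quantity being measured.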
7.5. ROC–AUC Analysis Across Classifiers
Figure 6 presents the Receiver Operating Characteristic (ROC) curves for classifying the three routing types (IPv4, Tor, and I2P) using the CatBoost, Random Forest, and HistGradientBoosting classifiers. Interestingly, the ROC curves for all three classifiers are nearly identical, showing minimal visual or performance difference. Each model achieved a high true positive rate with a low false positive rate across all thresholds, indicating strong classification capability. The Area Under the Curve (AUC) values were consistently high across all routing types: 0.99 for I2P and 0.98 for both IPv4 and Tor. This consistency suggests that the classification results were not only accurate but also robust across different machine learning models. It also highlights the effectiveness of the selected behavioral features, which enabled the reliable detection of spoofed anonymous routing behaviors regardless of the specific model used. The dashed diagonal line represents the performance of a random classifier (AUC = 0.5), which serves as a baseline; all three models significantly outperform this, indicating effective and meaningful classification.
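For a three-class problem, a per-class AUC is commonly obtained by treating one class as positive and the others as negative. The sketch below assumes that one-vs-rest scheme, which the text does not spell out, and computes AUC as the probability that a positive sample receives a higher score than a negative one. The labels and scores are toy values, not the paper's data.

```python
def binary_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def one_vs_rest_auc(true_classes, class_scores, target):
    """Treat `target` as positive and every other class as negative."""
    labels = [1 if c == target else 0 for c in true_classes]
    return binary_auc(labels, class_scores)

# Toy example: predicted probability of the class "tor" for four peers.
true_classes = ["ipv4", "i2p", "tor", "tor"]
tor_scores = [0.10, 0.40, 0.35, 0.80]
auc_tor = one_vs_rest_auc(true_classes, tor_scores, "tor")
print(auc_tor)
```

In this toy case three of the four positive/negative pairs are ranked correctly, so the AUC is 0.75; a random scorer would average 0.5, which is the dashed baseline in the figure.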
7.6. Bitcoin Application Implications
Bitcoin provides a degree of application-layer anonymity to its users, as it utilizes pseudonyms that do not directly reveal its users’ real-world identities. However, Bitcoin’s application-layer anonymity is compromised by its lack of network-layer anonymity, as anyone can potentially link network-layer IP addresses to Bitcoin users’ real-world identities. Hence, Bitcoin users often use anonymous routing protocols like Tor, I2P, and CJDNS, which encrypt internet traffic and route it through multiple relays to obscure users’ IP addresses and enhance privacy. However, the integrity of anonymous routing can be undermined by adversaries, who can falsify their routing type. For instance, a Bitcoin peer node might claim to be using Tor when it actually uses a non-anonymous IP route. Moreover, traditional methods for identifying the network types of Bitcoin nodes often rely on deterministic features or IP addresses, which can be spoofed, leading to inaccurate classifications.
In response to these challenges, in this work, we introduced a machine learning-based approach to network fingerprinting that analyzes the dynamic behaviors of network traffic from Bitcoin nodes. Our findings demonstrate that our ML model can identify the networking types of Bitcoin peer nodes with high precision. For example, it detects fake claims of Tor usage by nodes employing an IP-based route with 93% accuracy and false claims of I2P usage with 94% accuracy, as detailed in Section 7.3.
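To make the spoofing check concrete, the sketch below compares the routing type a peer claims with the type inferred from its traffic behavior and flags any mismatch. It is an illustration under stated assumptions rather than our implementation: the `PeerObservation` record, the `flag_spoofed` helper, and the peer identifiers are hypothetical, and the predicted labels are assumed to come from a separately trained classifier.

```python
from dataclasses import dataclass

# The three routing types considered in this work.
ROUTING_TYPES = ("IPv4", "Tor", "I2P")


@dataclass(frozen=True)
class PeerObservation:
    peer_id: str    # hypothetical identifier for a peer connection
    claimed: str    # routing type the peer claims to use
    predicted: str  # routing type inferred by the classifier


def flag_spoofed(observations):
    """Return the observations whose claimed type disagrees with the prediction."""
    flagged = []
    for obs in observations:
        if obs.claimed not in ROUTING_TYPES or obs.predicted not in ROUTING_TYPES:
            raise ValueError(f"unknown routing type for peer {obs.peer_id}")
        if obs.claimed != obs.predicted:
            flagged.append(obs)
    return flagged


if __name__ == "__main__":
    observations = [
        PeerObservation("peer-a", claimed="Tor", predicted="Tor"),
        PeerObservation("peer-b", claimed="Tor", predicted="IPv4"),  # fake Tor claim
        PeerObservation("peer-c", claimed="I2P", predicted="I2P"),
    ]
    for obs in flag_spoofed(observations):
        print(f"{obs.peer_id}: claims {obs.claimed}, behaves like {obs.predicted}")
```

Because the classifier is not perfect, a mismatch is best treated as a signal for further scrutiny rather than as proof of spoofing.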
Thus, our ML model enhances the security and integrity of Bitcoin applications by providing reliable networking-type detection and counteracting spoofing attempts. By accurately identifying the network types in use, it helps maintain the essential anonymity and censorship resistance that Bitcoin aims to provide.
8. Discussion and Future Direction
While our current work demonstrates strong detection accuracy and model comparisons using data collected from a single Bitcoin node, we acknowledge that further validation under more diverse conditions can strengthen the generalization of our findings.
We do not anticipate that the qualitative insights, e.g., the comparison across the ML models, will change. Our implementation leaves the active functionalities of the Bitcoin implementation unmodified; our additions are passive mechanisms for sensing, monitoring, and processing data. However, the measured values, e.g., the performance figures, may vary, so validating generalization would be helpful.
Additionally, although CJDNS is theoretically supported in Bitcoin Core, we observed no active peers using this protocol during our observation period. As a result, CJDNS traffic could not be included in our dataset. In the future, we plan to simulate and collect real CJDNS-based peer connections to further expand the scope of anonymous routing analysis.
9. Conclusions
We used supervised machine learning for network fingerprinting in cryptocurrency systems to detect anonymous routing and defend against faked anonymous routing (e.g., profile spoofing to pretend to use anonymous routing). Our research included active Bitcoin implementation, sensing, and experimentation, showing that our approach is practical and providing real-world validation. Using machine learning to classify peer connections into Tor, I2P, and non-anonymous routing groups, our scheme detected fake Tor usage with 93% accuracy and fake I2P usage with 94% accuracy. While CatBoost and Random Forest achieved similar accuracy, CatBoost took an average of 0.80 s for testing, compared to 4.07 s for Random Forest and 1.05 s for HistGradientBoosting, making CatBoost the best choice for real-time classification.