Identifying Illicit Activities in Blockchain Transaction Graph Networks

Adam, Tomáš; Babič, František

doi:10.3390/electronics14234599

by Tomáš Adam * and František Babič

Department of Cybernetics and Artificial Intelligence, Faculty of Electrical Engineering and Informatics, Technical University of Košice, Letná 1/9, 042 00 Košice, Slovakia

* Author to whom correspondence should be addressed.

Electronics 2025, 14(23), 4599; https://doi.org/10.3390/electronics14234599

Submission received: 8 October 2025 / Revised: 17 November 2025 / Accepted: 18 November 2025 / Published: 24 November 2025

(This article belongs to the Special Issue Digital Security and Privacy Protection: Trends and Applications, 2nd Edition)

Abstract

In recent years, blockchain technology has gained widespread attention for its immutable and distributed ledger mechanism that ensures security and transparency among all participants. However, the decentralized nature of the blockchain network presents unique challenges in detecting fraudulent activities carried out by malicious actors. Traditional detection methods, such as rule-based systems, may not be sufficient to capture the complex and evolving nature of these activities. This paper explores the application of machine learning and transaction graph representation to detect suspicious accounts on the World Asset Exchange (WAX) blockchain. By leveraging dynamic subgraph embedding generation and contrastive representation learning, the proposed approach primarily targets the identification of suspicious transaction behaviors indicative of fraudulent activity. The contrastive representation learning approach refines the subgraph embeddings through a contrastive loss function that optimizes distances in the embedding space, so that potentially fraudulent and legitimate transaction behavior can be discriminated effectively. This process significantly enhances the classification accuracy, particularly for the imbalanced minority class often seen in fraud detection scenarios. The results of the experimental validations indicate the presence of potentially fraudulent activities and highlight the effectiveness of the anomaly ranking mechanism in identifying new, previously unseen cases.

Keywords: blockchain; transaction graph; contrastive learning; fraud detection

1. Introduction

The realm of traditional digital finance has long been the cornerstone of modern economic systems, facilitating the seamless flow of capital across international borders. Governed by central banks and financial institutions, these structures aid monetary transactions, asset management, and credit services through heavily regulated, centralized architectures [1]. While effective in providing financial stability, these traditional systems often utilize complex, multi-layered processes and numerous intermediaries, leading to inefficiencies such as high transaction costs, time-consuming money transfers, and a general lack of flexibility that can delay innovation [2]. Additionally, the centralized nature of traditional financial systems makes them particularly susceptible to systemic risks, including fraud, cyber-attacks, and data breaches, which not only pose significant financial threats but also undermine public confidence in centralized financial systems [3,4].

In response to the limitations of traditional financial systems and asset management, blockchain technology has emerged as a significant innovation, transforming the way transactions are executed. Built on a decentralized framework, blockchain technology offers a scalable, transparent, and immutable transaction record-keeping approach, inherently resistant to manipulation [5]. This mechanism ensures the integrity of transaction data and facilitates the traceability and verification of each user interaction. By eliminating the necessity for third-party oversight, blockchain technology reduces transaction costs and increases processing limits, significantly improving the efficiency of monetary exchanges. A key element enabling these properties is the underlying decentralized consensus mechanism, which coordinates how multiple distributed nodes agree on the validity of the transactions. By requiring economic (Proof-of-Stake) or computational (Proof-of-Work) commitments from validators and enforcing a common block selection rule, these protocols ensure that all honest nodes converge on a consistent state, making large-scale ledger manipulation or double-spending economically infeasible [6].

The influence of blockchain technology extends well beyond the financial sector, affecting areas such as healthcare by securing patient records [7], or intellectual property management, by verifying the authenticity and ownership [8,9]. One of the most notable advancements enabled by blockchain technology is the development of smart contracts, offering a mechanism to execute and enforce agreements autonomously [10]. Smart contracts automate transactions and contractual agreements by reducing the need for intermediaries and thereby decreasing transaction times and costs. Moreover, by encoding contract terms on the blockchain, these smart contracts ensure that all parties to the contract adhere to the agreed terms without the possibility of dispute.

Blockchain technology has played a pivotal role in the introduction and development of both fungible and non-fungible digital assets, fundamentally transforming asset management and ownership [11]. Cryptocurrencies such as Bitcoin and Ethereum are prime examples of fungible assets, characterized by the interchangeability of each unit with others of the same type. This fungibility ensures their viability as a medium for commercial transactions and value storage. In contrast, Non-Fungible Tokens (NFTs) represent a unique class of digital assets with blockchain technology providing a robust transaction framework for their indisputable ownership [12]. NFTs have rapidly gained prominence by facilitating the tokenization of a wide array of unique digital and tangible assets, ranging from artwork and musical compositions to virtual real estate and beyond [13,14].

The economic significance of blockchain technology is further underscored by the substantial growth in market capitalization and trading volumes observed in both the cryptocurrency and NFT markets. Moreover, the liquidity of these markets has improved with the introduction of numerous platforms [15] that facilitate the trading of both fungible and non-fungible blockchain digital assets. These platforms enhance market accessibility, while contributing significantly to price discovery and market efficiency.

As the blockchain ecosystem continues to mature and expand, its attributes of anonymity and decentralization have inevitably become fertile ground for cybercriminals. These malicious actors intentionally exploit the system’s features to engage in illegal activities, from executing complex fraudulent schemes [16,17] to disseminating malware and initiating security breaches [18]. As both the technology and the fraudulent schemes advance, detecting fraud within the blockchain becomes increasingly challenging.

Among the most disruptive of these illicit activities, phishing attacks are particularly prevalent. Phishing exploits the blockchain’s pseudonymity by deceiving users into exposing sensitive key credentials or making unauthorized transactions. Cybercriminals often create sophisticated fake websites or social media profiles that mimic trusted entities such as popular cryptocurrency exchanges [19] or wallet providers. Unsuspecting users, lured by the appearance of legitimacy, may enter private keys or initiate transfers that grant attackers direct access to their assets. These phishing schemes often exhibit a clear pattern of transferring funds into a set of intermediary wallets and subsequently routing through blockchain bridges to other networks in order to obfuscate tracing efforts (Figure 1). The irreversible nature of blockchain transactions means that once funds are misappropriated, the likelihood of recovery is minimal, resulting in substantial financial losses.

This paper provides an in-depth exploration of the complexities inherent in fraudulent behavior in blockchain ecosystems, with a particular emphasis on the growing prevalence of NFTs. By leveraging state-of-the-art machine learning algorithms and graph-based representation learning, the proposed approach targets transactions of fungible and non-fungible assets to detect and classify suspicious and fraudulent activities. To address the challenges of data scale, temporal dynamics, and severe class imbalance, this paper makes the following contributions:

A dynamic subgraph sampling framework for the WAX blockchain transaction network, where accounts are represented as nodes and transactions as time-stamped edges. The framework constructs temporally ordered, account-centric subgraphs that capture both local structural context and short-term transactional dynamics within the ledger. By focusing computation on high-relevance neighborhoods around candidate accounts and prioritizing graph regions with elevated fraud risk, this design supports scalable learning on large graphs under severe class imbalance.
The representation learning component employs a Graph Attention Network (GAT) to encode the structural neighborhood of each account and an LSTM layer to capture temporal dependencies in its transaction history. The resulting embeddings are optimized with a triplet-based contrastive loss, encouraging intra-class compactness and inter-class separation between fraudulent and legitimate accounts.
Building on the learned embeddings, this paper introduces a hybrid detection pipeline that combines a supervised classifier with multiple unsupervised anomaly detection models. Classifier outputs and anomaly scores are aggregated into a unified ranking of potentially suspicious accounts, supporting effective prioritization in label-scarce environments characteristic of blockchain investigations.
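
The aggregation of classifier outputs and anomaly scores into a unified ranking can be sketched as follows. The aggregation rule is not specified in this section, so the sketch assumes a plain mean of min-max-normalized scores; the function names, account names, and score values are hypothetical.

```python
from statistics import mean

def min_max(scores):
    """Rescale a {account: score} mapping to [0, 1]; constant inputs map to 0."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {k: (v - lo) / span if span else 0.0 for k, v in scores.items()}

def unified_ranking(classifier_prob, anomaly_scores):
    """Rank accounts by the mean of normalized classifier and anomaly scores.

    classifier_prob: {account: probability of the fraudulent class}
    anomaly_scores:  list of {account: score}, higher meaning more anomalous
    Assumption: every score source is weighted equally.
    """
    normalized = [min_max(classifier_prob)] + [min_max(s) for s in anomaly_scores]
    combined = {acc: mean(n[acc] for n in normalized) for acc in classifier_prob}
    return sorted(combined, key=combined.get, reverse=True)

# Hypothetical scores for three accounts from one classifier and two detectors.
clf = {"alice": 0.10, "bob": 0.85, "carol": 0.40}
detector_a = {"alice": 0.20, "bob": 0.70, "carol": 0.90}
detector_b = {"alice": 1.0, "bob": 4.0, "carol": 2.0}
ranking = unified_ranking(clf, [detector_a, detector_b])
```

Normalizing each source first keeps a detector with a large raw score range from dominating the combined ranking; a weighted mean or a rank-based rule would be equally plausible choices.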

2. Related Work

The domain of blockchain fraud detection has evolved significantly in recent years, propelled by the increasing sophistication of fraudulent schemes and the expansion of blockchain technologies across various sectors. This field leverages the distinctive characteristics of blockchain, such as decentralization, immutability, and transparency to develop methodologies and tools aimed at identifying and mitigating suspicious, potentially fraudulent activities.

2.1. Traditional Finance Fraud Detection

Fraud detection in traditional finance has been a well-researched area for many decades. Financial institutions have employed a variety of techniques to identify and prevent fraudulent activities and financial crimes such as money laundering, credit card fraud, and insider trading [20]. Early approaches relied on rule-based systems, where predefined rules and patterns were used to flag suspicious transactions. These systems, while effective to some extent, often resulted in high false-positive rates and required constant updates and labeling to adapt to new fraud strategies [20,21].

In recent years, machine learning models and data mining techniques have been increasingly adopted to enhance fraud detection capabilities. Learning algorithms such as logistic regression, decision trees, and clustering [22,23] have been widely used to separate fraudulent from legitimate transactions, with the supervised models trained on historical labeled data. However, the scarcity of labeled data represents a significant challenge in developing effective fraud detection models. The process of data labeling is resource-intensive and requires extensive domain expertise, making it difficult to acquire large, high-quality datasets for training machine learning models [21]. This limitation often leads to biased models that are less effective in identifying new and evolving fraud patterns. Advanced methods such as ensemble learning and deep learning further improve detection accuracy by modeling complex patterns and sequential transaction data. These techniques can better handle the intricacies of financial transactions and detect anomalies indicative of fraud [24].

2.2. Blockchain Fraud Detection

Regulatory frameworks play a critical role in guiding and enforcing fraud detection measures in decentralized finance. Organizations like the Financial Action Task Force (FATF) provide guidelines and recommendations to combat money laundering and terrorist financing. In 2020, the FATF issued official guidelines outlining red-flag indicators of suspicious activities related to the use of virtual assets (FATF (2020), Money Laundering and Terrorist Financing Red Flag Indicators Associated with Virtual Assets, FATF, Paris, France: gafi.org/media/fatf/documents/recommendations/VirtualAssets-Red-Flag-Indicators.pdf (accessed on 10 April 2025)). These guidelines help financial institutions identify and report potentially fraudulent activities [25]. While the presence of one or more of these indicators does not necessarily imply fraudulent activity, it warrants a more thorough analysis of the user’s behavior. These red-flag indicators include the following:

Structuring transactions in small amounts or amounts below threshold values for reporting.
Creating trade orders on a decentralized exchange and subsequently withdrawing them without execution.
Incoming transactions from many unrelated wallets in relatively small amounts with subsequent transfers to another wallet.
Creating multiple accounts to circumvent restrictions or limits imposed by virtual service providers.
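
As an illustration of how one of these indicators could be turned into a rule, the sketch below flags accounts that receive small amounts from many distinct wallets and later forward funds. It is a simplified heuristic: the thresholds `small_amount` and `min_senders`, the tuple layout, and the account names are assumptions made for the example and are not taken from the FATF guidance.

```python
from collections import defaultdict

def fan_in_then_forward(transfers, small_amount, min_senders):
    """Return accounts that receive small amounts from many distinct senders
    and send funds onward after the first such inflow.

    transfers: iterable of (timestamp, sender, receiver, amount) tuples
    """
    small_senders = defaultdict(set)   # receiver -> senders of small transfers
    first_small_in = {}                # receiver -> time of first small inflow
    last_out = {}                      # sender -> time of latest outgoing transfer
    for ts, sender, receiver, amount in transfers:
        if amount <= small_amount:
            small_senders[receiver].add(sender)
            first_small_in[receiver] = min(ts, first_small_in.get(receiver, ts))
        last_out[sender] = max(ts, last_out.get(sender, ts))
    return sorted(
        acc for acc, senders in small_senders.items()
        if len(senders) >= min_senders
        and last_out.get(acc, float("-inf")) > first_small_in[acc]
    )

# Hypothetical transfers: three wallets pay "hub", which then forwards to "exit".
transfers = [
    (1, "a", "hub", 5), (2, "b", "hub", 4), (3, "c", "hub", 6),
    (4, "hub", "exit", 15), (5, "a", "shop", 5),
]
flagged = fan_in_then_forward(transfers, small_amount=10, min_senders=3)
```

A match is a reason for closer inspection rather than proof of fraud, in line with the caveat above.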

In the domain of blockchain network fraud detection, the deployment of machine learning techniques has emerged as instrumental in addressing the ever-increasing complexities associated with fraudulent activities. Unlike traditional financial systems, where identities are typically linked to transaction records, blockchain data are pseudonymous [25]. Although transactions are publicly recorded, the identities associated with these transactions are not transparent. This pseudonymity, while integral to the privacy and security benefits of blockchain technology, prevents investigators from reliably linking multiple addresses to a single individual or organization. Moreover, new addresses can be created and discarded at negligible cost, allowing adversaries to distribute their activity across large sets of addresses and to separate funding, execution, and cash-out paths.

Supervised learning algorithms have exhibited significant potential in detecting fraudulent activities within blockchain transactions. These algorithms utilize labeled datasets to train models capable of discerning the fraudulent minority class from the legitimate majority [24]. Among traditional machine learning approaches, logistic regression, decision trees, support vector machines (SVMs), and gradient boosting machines (GBMs) [26,27,28] have demonstrated considerable performance in accurately classifying blockchain transactions. However, the scarcity of labeled data remains a significant challenge, often requiring innovative solutions such as semi-supervised learning [29] to enhance model performance.

Another pivotal area of focus in blockchain fraud detection is transaction network analysis, which examines the structure and dynamics of the graph-based network to identify patterns indicative of fraudulent behavior [30]. Graph Convolutional Networks (GCNs) are particularly effective in this context, as they apply convolutional operations to graph data, enabling the detection of sophisticated fraud patterns [31]. Similarly, Graph Attention Networks (GATs) enhance fraud detection accuracy by incorporating temporal attention mechanisms that prioritize the most relevant nodes and edges [32].

Temporal analysis examines the timing of transactions to identify sequences that may indicate fraudulent behavior. Recurrent Neural Networks (RNNs) [33], particularly Long Short-Term Memory (LSTM) networks, are well suited for modeling sequential data and learning temporal dependencies in transactions to identify fraud schemes that unfold over time. Additionally, Temporal Graph Networks (TGNs) enhance the detection capabilities by incorporating both temporal and structural information from the transaction network [34], improving the accuracy of identifying complex fraudulent behaviors.

2.3. Network Representation Learning

Graph Representation Learning has emerged as a significant methodology within the broader field of machine learning, specifically addressing the natural challenges associated with analyzing data structured as graphs [35]. This is particularly relevant to blockchain technology, where the user transaction network naturally forms a graph with accounts as nodes and transactions as edges. Graph representation learning aims to capture the complex structure of graphs, transforming the nodes, edges, and their features into a lower-dimensional space enabling the effective analysis of relationships and interactions within data [36].

Methods such as DeepWalk and Node2Vec utilize random walks to explore networks, allowing these algorithms to learn rich representations that preserve node similarity based on connectivity patterns. The generated embeddings are vital for tasks that rely on understanding node relationships within the graph, such as community detection [37] or link prediction [38]. Inspired by the success of convolutional neural networks in the field of image processing [39], GCNs adapt the concept of convolution to graph-structured data. By aggregating feature information from a node’s immediate neighbors, GCNs can learn spatial features of the graph [40]. This approach has proven effective for node classification and graph classification tasks, where the goal is to leverage the local graph structure and node features [41]. Building on the concept of GCNs, GATs introduce attention mechanisms that allow nodes to weigh the significance of their neighbors’ contributions dynamically. This adaptability enables GATs to focus on more relevant parts of the graph structure during the feature aggregation process, enhancing the model’s ability to learn discriminative features for each node [42]. Graph-based autoencoders learn by encoding the transaction graph into a compact representation and then reconstructing the original graph structure from this embedding. This approach can be particularly useful for unsupervised learning tasks, such as anomaly or community detection [43,44].
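
The attention-weighted neighbor aggregation used by GATs can be illustrated with a single-head, single-node sketch in plain Python. It makes several simplifying assumptions: features are taken as already linearly transformed (the learned weight matrix is omitted), self-loops are left out, and the attention vector is fixed by hand rather than learned. All names and values are hypothetical.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def attention_aggregate(features, neighbors, attn_vector, node):
    """One attention-weighted aggregation step for a single node.

    features:    {node: feature vector, assumed already linearly transformed}
    neighbors:   {node: list of neighbor ids}
    attn_vector: weights applied to the concatenation [h_node || h_neighbor]
    """
    h_i = features[node]
    scores = []
    for j in neighbors[node]:
        concat = h_i + features[j]  # list concatenation, not addition
        scores.append(leaky_relu(sum(a * x for a, x in zip(attn_vector, concat))))
    # Softmax over the neighborhood turns raw scores into attention weights.
    top = max(scores)
    exp_scores = [math.exp(s - top) for s in scores]
    total = sum(exp_scores)
    weights = [e / total for e in exp_scores]
    aggregated = [
        sum(w * features[j][d] for w, j in zip(weights, neighbors[node]))
        for d in range(len(h_i))
    ]
    return weights, aggregated

# Hypothetical two-dimensional features for an account "u" and two neighbors.
features = {"u": [1.0, 0.0], "v": [0.0, 1.0], "w": [2.0, 2.0]}
neighbors = {"u": ["v", "w"]}
weights, aggregated = attention_aggregate(
    features, neighbors, attn_vector=[0.5, 0.5, 1.0, 1.0], node="u"
)
```

With these values the neighbor "w" receives the larger attention weight, so the aggregated vector for "u" is pulled toward the features of "w".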

Given the vast size of blockchain networks, analyzing the entire graph at once is computationally infeasible. Subgraph sampling methods [45] address this challenge by focusing on smaller, more manageable portions of the graph. By selecting relevant subgraphs, this approach can reduce computational demands while preserving the essential properties of the larger graph. This targeted approach is particularly effective in fraud detection [46], where specific clusters can reveal illicit activities around suspicious nodes.

2.4. Contrastive Representation Learning

Contrastive Representation Learning has emerged as a highly effective approach in machine learning, particularly in domains requiring robust feature extraction and discrimination capabilities [47]. This strategy refines model performance by training algorithms to identify and emphasize differences and similarities between data points. Contrastive learning has gained prominence for its effectiveness in various domains, including credit card fraud detection [48] and image recognition [49]. By training on selected pairs of legitimate and fraudulent transactions, the model learns embeddings that can clearly distinguish between classes. This helps in creating a feature space where potentially fraudulent activities are separated from legitimate ones. The learned embeddings can be used to detect anomalies [50] by identifying transactions that do not conform to the learned legitimate patterns. Given the scarcity of labeled fraud data in blockchain, contrastive learning can leverage the available labeled data more effectively [51]. By focusing on the differences between positive and negative pairs, the model can learn robust representations even with limited labeled examples. Optimizing a contrastive loss function sharpens the model’s discriminative capabilities: embeddings representing similar behaviors are pulled closer together, while embeddings of dissimilar nodes are pushed farther apart [47].
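
One common form of contrastive objective is the triplet loss, which the contributions listed in the Introduction name as the loss used for the embeddings. The sketch below shows the standard margin-based formulation with Euclidean distance; the distance function, the margin value, and the example vectors are assumptions for illustration.

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """max(0, d(anchor, positive) - d(anchor, negative) + margin)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

# Hypothetical two-dimensional embeddings.
anchor = [0.0, 0.0]        # embedding of an account
same_class = [0.0, 1.0]    # embedding of an account with the same label
far_other = [3.0, 4.0]     # other-class embedding, already far away
near_other = [0.0, 1.5]    # other-class embedding, too close to the anchor

well_separated = triplet_loss(anchor, same_class, far_other)
violating = triplet_loss(anchor, same_class, near_other)
```

The loss is zero once the other-class sample is at least `margin` farther from the anchor than the same-class sample, so training effort concentrates on triplets that still violate the margin.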

In the context of blockchain phishing detection, contrastive representation learning has shown promise in addressing the unique challenges of transaction data. Recent work on the Ethereum blockchain has shown that contrastive representation learning on transaction graphs can substantially improve detection performance. Li et al. [51] propose the Transaction Graph Contrast Network (TGC), which constructs local transaction subgraphs around each address and samples multiple views of these subgraphs via random walks. TGC then applies node-level and context-level contrastive objectives to learn discriminative embeddings that separate phishing accounts from their predominantly benign neighbors and capture shared structural patterns among sparsely distributed phishing nodes. This study reported a substantial increase in precision, recall, and F1 metrics for the minority class, underscoring the potential of a contrastive learning approach in enhancing blockchain fraud detection frameworks. Moreover, the flexibility of contrastive learning allows for integration with other advanced techniques such as RNNs [52]. This hybrid approach can further enhance the model’s ability to effectively capture temporal dependencies in transaction data.

Building on this line of work, this paper applies contrastive learning to a different blockchain environment with distinct operational characteristics. The focus is on the WAX blockchain and NFT-centric markets, where low transaction fees encourage frequent on-chain interactions and lead to substantially higher transaction volumes per active account compared to the Ethereum blockchain. The proposed approach organizes each account’s neighborhood into time-ordered subgraphs and leverages structural and temporal information, aiming to produce account embeddings that remain effective under high-frequency transaction dynamics.

2.5. Anomaly Detection

Anomaly detection refers to a set of machine learning approaches used to identify deviations from expected values or sudden changes in trends within a given dataset [53]. In the context of blockchain networks, anomaly detection can be directed toward identifying unusual and suspicious events that may indicate fraudulent behavior alongside other malicious activities [54]. Unsupervised learning models like Isolation Forest or DBSCAN are particularly effective in this context, as they do not require labeled data to operate [55]. Isolation Forest [56] operates on the principle of isolating anomalies rather than profiling normal data points. Clustering-Based Local Outlier Factor (CBLOF) [57] combines clustering with local outlier detection by grouping similar data points into clusters and evaluating the degree of deviation based on the proximity to the nearest cluster. Additionally, these models cope well with the severe class imbalance often found in fraud detection scenarios, where fraudulent transactions represent a small fraction of the total data but have significant impact. The ability of unsupervised models to generalize from and adapt to unseen data helps them remain effective as new types of fraud emerge.

Deep learning methods have further advanced the field of anomaly detection within blockchain networks. Autoencoders operate by minimizing the reconstruction error, effectively learning to differentiate between normal transaction patterns and anomalies [58,59]. By focusing on the reconstruction loss, autoencoders can flag potentially suspicious transactions that are difficult to compress and reconstruct based on the learned representations, which is typically indicative of anomalous or fraudulent activities.

3. System Architecture

The architecture of the proposed system (Figure 2) is founded upon a comprehensive analysis of blockchain transactions, aimed at identifying and categorizing suspicious and fraudulent activities. This initiative is driven by the objectives of enhancing trust and transparency among all participants within blockchain networks. By integrating machine learning approaches with principles of graph representation learning, the model seeks to identify transaction patterns within blockchain ecosystems, differentiating legitimate users from potentially malicious actors.

The system’s architecture is structured as a multi-step process, with each stage designed to build upon the insights and data transformations provided by its predecessors. The initial phase focuses on the comprehensive collection and systematic organization of blockchain transaction data. This step involves the aggregation of transaction records, their transformation into a structured format, and the conversion of the monetary value of transactions at their timestamps into a unified unit suitable for further processing. After initial data collection, the system advances to a data preprocessing stage, dedicated to cleaning the dataset of missing values, extracting relevant features, and normalizing them to ensure uniformity. Following the data preprocessing, a graph-based representation of the transaction data is constructed. This dynamic graph representation, which evolves in response to ongoing blockchain transactions, maps accounts to nodes and transactions to edges.

The next step involves feature preprocessing and dynamic subgraph generation. During this phase, the model extracts and processes the graph features that are essential for identifying fraudulent patterns. These features include transaction frequencies, monetary values, and characteristics that are intrinsic to each node and edge within the graph.

Subgraph generation is a strategy employed to manage the complexity and size of the final blockchain transaction graph. By creating subgraphs, the model isolates segments of the network that are statistically significant. Segmental processing allows the system to handle large, imbalanced datasets efficiently, focusing computational resources on the selected sections of the graph. A fundamental layer of the system is the embeddings generation, which involves encoding the complex features of the transaction subgraphs into a dense, lower-dimensional space. Contrastive learning is applied to fine-tune these embeddings by contrasting similar and dissimilar transaction class samples. Such an approach significantly enhances the model’s discriminative capabilities within large imbalanced datasets with very low occurrence of minority class samples.

The final phase of the model architecture is dedicated to decision making. By utilizing advanced approaches such as supervised learning and anomaly detection, the proposed system classifies accounts as fraudulent or legitimate based on the distinct patterns encoded in the embeddings.

3.1. Transaction Graph Construction

The construction of a transaction graph begins with the acquisition of raw transaction data from the blockchain network. The acquisition process involves interaction with the selected blockchain’s public nodes for cryptocurrency transactions and NFT transfers. These public APIs serve as gateways to the extensive ledger of historical transaction records, enabling the retrieval of data samples. To ensure comprehensive coverage, the data is collected over multiple extended periods, capturing a wide array of transaction types. The retrieved transaction data, which includes sender and receiver addresses, transaction amounts, and timestamps, are essential for further analysis.

A foundational aspect of transaction graph construction involves a primary differentiation between transaction types, specifically distinguishing between those involving fungible assets and those involving NFTs. By categorizing transactions based on their type, the system can better quantify the monetary value of each transaction, which is needed to accurately assess its economic impact. For transactions involving fungible assets, the monetary value is assessed by converting the value of cryptocurrencies into a standard metric using exchange rates from publicly available sources. This asset conversion process is necessary because different cryptocurrencies can have varying prices and quantities. By standardizing these values, the transaction graph ensures a consistent quantification of monetary value across various types of fungible cryptocurrency and stablecoin transactions.
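
The conversion of fungible amounts into a single reference unit can be sketched as a lookup of the latest exchange rate available at the transaction time. The rate table, token symbols, and prices below are invented for illustration, and using the most recent earlier quote is an assumption rather than the rule described in the paper.

```python
def to_reference_value(amount, token, timestamp, rates):
    """Convert a token amount to the reference currency.

    rates: {token: list of (timestamp, price in reference currency),
            sorted by timestamp}
    Uses the most recent quote at or before the transaction time.
    """
    quotes = [price for ts, price in rates[token] if ts <= timestamp]
    if not quotes:
        raise ValueError(f"no quote for {token} at or before {timestamp}")
    return amount * quotes[-1]

# Hypothetical rate table with integer timestamps and made-up prices.
rates = {
    "WAXP": [(100, 0.04), (200, 0.05)],
    "USDT": [(0, 1.0)],
}
early = to_reference_value(1000, "WAXP", 150, rates)   # priced at 0.04
late = to_reference_value(1000, "WAXP", 250, rates)    # priced at 0.05
```

Pricing each transfer at its own timestamp keeps edge values comparable across tokens and across time, which the later feature aggregation relies on.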

Given the unique nature of each NFT on the market and the considerable fluctuations in their values, transactions involving NFTs necessitate a different approach. NFTs often represent high-value items such as digital art, collectibles, or virtual real estate, and their value can vary significantly. NFTs generally involve more complex transaction processes compared to fungible assets. Therefore, an essential part of preprocessing these transactions is the accurate assessment of the monetary value of each transfer. This is achieved by determining the average price of the NFT during the transfer window, incorporating data from historical sale records and current market trends. When a direct sale occurs, the sale price is linked to the NFT transaction to ensure precision and to maintain a consistent monetary valuation across all fungible and non-fungible transactions.
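
A minimal sketch of this valuation rule is given below: a recorded sale price takes precedence, and otherwise the mean of sale prices near the transfer time is used. Matching sales by `asset_id`, the symmetric `window`, and the record layout are assumptions; market-trend information is not modeled here.

```python
from statistics import mean

def nft_transfer_value(transfer, sales, window):
    """Estimate the monetary value of one NFT transfer.

    transfer: dict with 'asset_id', 'timestamp', and optional 'sale_price'
    sales:    list of (timestamp, asset_id, price) historical sale records
    window:   half-width of the time window around the transfer
    """
    if transfer.get("sale_price") is not None:
        return transfer["sale_price"]        # direct sale: use the recorded price
    t = transfer["timestamp"]
    prices = [p for ts, aid, p in sales
              if aid == transfer["asset_id"] and abs(ts - t) <= window]
    return mean(prices) if prices else None  # None: no price evidence in window

# Hypothetical sale history.
sales = [(90, "nft1", 10.0), (110, "nft1", 14.0), (500, "nft1", 99.0), (100, "nft2", 1.0)]
estimated = nft_transfer_value({"asset_id": "nft1", "timestamp": 100}, sales, window=20)
direct = nft_transfer_value(
    {"asset_id": "nft1", "timestamp": 100, "sale_price": 15.0}, sales, window=20
)
```

Returning `None` when no sale falls inside the window leaves the decision about a fallback (for example a collection-level average) to the caller.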

Following the acquisition of transaction data, the next step involves the transformation of this data into a graph representation. This representation takes the form of a directed graph, where nodes correspond to individual accounts on the blockchain network, and edges symbolize transactions between these accounts.
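
The mapping from transaction records to a directed graph can be sketched with nested dictionaries. The record fields (`sender`, `receiver`, `timestamp`, `value`, `type`) and the example accounts are hypothetical; repeated transfers between the same pair of accounts are kept as parallel edges.

```python
from collections import defaultdict

def build_transaction_graph(transactions):
    """Build a directed multigraph as {sender: {receiver: [edge attributes]}}."""
    graph = defaultdict(lambda: defaultdict(list))
    for tx in transactions:
        graph[tx["sender"]][tx["receiver"]].append(
            {"timestamp": tx["timestamp"], "value": tx["value"], "type": tx["type"]}
        )
        graph[tx["receiver"]]  # touch the key so receive-only accounts become nodes
    return graph

# Hypothetical records: two transfers from "a" to "b" and one NFT transfer to "c".
transactions = [
    {"sender": "a", "receiver": "b", "timestamp": 1, "value": 5.0, "type": "fungible"},
    {"sender": "a", "receiver": "b", "timestamp": 7, "value": 2.0, "type": "fungible"},
    {"sender": "b", "receiver": "c", "timestamp": 9, "value": 12.0, "type": "nft"},
]
graph = build_transaction_graph(transactions)
```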

3.2. Subgraph Sampling

The main objective of subgraph sampling within the system architecture is the generation of dynamic subgraphs centered around selected nodes, capturing transaction activities within specific time frames. Given the vast scale and complexity of blockchain transaction graphs, analyzing the entire network at once is computationally intensive and may not yield usable insights. Subgraph sampling targets a manageable yet informative portion of the graph, focusing on areas around selected nodes (Figure 3).

Dynamic subgraphs for selected nodes are created by systematically filtering the transaction history associated with each node. This process involves segmenting the transaction data by time intervals, which are determined through clustering of timestamps. Each cluster corresponds to a time slice containing selected transactions. As a result, each generated subgraph represents a temporal snapshot of the transactional ecosystem surrounding the node within a defined time window, effectively capturing the dynamics of the neighborhood.
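
The time-slicing step can be sketched as follows. The clustering method for timestamps is not named in this section, so the sketch assumes a simple gap-based rule (a new slice starts when the gap to the previous transaction exceeds `max_gap`) and restricts each subgraph to transactions that directly involve the central node. The edge layout and names are hypothetical.

```python
def time_slices(timestamps, max_gap):
    """Group timestamps into clusters separated by gaps larger than max_gap."""
    slices = []
    for ts in sorted(timestamps):
        if slices and ts - slices[-1][-1] <= max_gap:
            slices[-1].append(ts)
        else:
            slices.append([ts])
    return slices

def dynamic_subgraphs(edges, center, max_gap):
    """Return one edge list per time slice for the transactions touching `center`.

    edges: list of (timestamp, sender, receiver, value)
    """
    own = [e for e in edges if center in (e[1], e[2])]
    subgraphs = []
    for cluster in time_slices([e[0] for e in own], max_gap):
        start, end = cluster[0], cluster[-1]
        subgraphs.append([e for e in own if start <= e[0] <= end])
    return subgraphs

# Hypothetical edges: account "a" is active around t=1..2 and again around t=50..52.
edges = [
    (1, "x", "a", 5.0), (2, "a", "y", 5.0),
    (50, "z", "a", 1.0), (52, "a", "x", 1.0),
    (3, "p", "q", 9.0),
]
snapshots = dynamic_subgraphs(edges, center="a", max_gap=10)
```

Each element of `snapshots` is one temporal snapshot of the neighborhood of "a"; ordering them by time yields a sequence of subgraphs that a temporal model can consume.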

Generating multiple subgraphs for each central node facilitates the identification of patterns that may not be evident in a static analysis. This method enhances the ability to detect suspicious activities by incorporating a temporal context, allowing for the observation of changes in a node’s behavior over time. Employing this dynamic approach to subgraph sampling ensures the system’s scalability and efficiency, even when processing extensive and complex blockchain networks.

3.3. Graph Feature Preprocessing

The preprocessing of node features within the transaction graph is a crucial step in transforming raw transaction data into a structured format that is amenable to machine learning algorithms. This process involves extraction, filtration, aggregation, and normalization of features from both nodes and their connecting edges. The primary task in feature preprocessing is the extraction and calculation of relevant features for each node. These features encompass various dimensions of a user’s transactional behavior and network connectivity, including the following:

Transaction Volume: The total amount of transferred assets to and from the selected node, split into incoming and outgoing transaction volumes.
Degree Information: The degree of a node, divided into in-degree (number of incoming transactions) and out-degree (number of outgoing transactions).
Unique Transaction Neighbors: The number of distinct entities a node interacts with, obtained by counting the unique senders and receivers associated with the node.

To enhance the representation, the Recency, Frequency, and Monetary (RFM) variables are added, as they are widely considered effective and are frequently used in traditional financial fraud detection scenarios [60]:

Recency: The time between the last ongoing transaction and the end of the interval.
Frequency: The number of transactions involving the node within a given period.
Monetary: The total value of transactions involving the node within a given period.
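The node features listed above can be computed with a sketch such as the following. It assumes transactions are `(sender, receiver, timestamp, amount)` tuples restricted to one time interval, and it measures recency from the node's most recent transaction in either direction; both are illustrative assumptions, as are the function and key names.

```python
def node_features(node, transactions, interval_end):
    """Compute per-node features from (sender, receiver, timestamp, amount) tuples."""
    incoming = [tx for tx in transactions if tx[1] == node]
    outgoing = [tx for tx in transactions if tx[0] == node]
    involved = incoming + outgoing  # note: a self-transfer would be counted twice
    last_seen = max((tx[2] for tx in involved), default=None)
    return {
        "in_volume": sum(tx[3] for tx in incoming),
        "out_volume": sum(tx[3] for tx in outgoing),
        "in_degree": len(incoming),
        "out_degree": len(outgoing),
        "unique_senders": len({tx[0] for tx in incoming}),
        "unique_receivers": len({tx[1] for tx in outgoing}),
        # Assumption: recency uses the latest transaction of any direction.
        "recency": None if last_seen is None else interval_end - last_seen,
        "frequency": len(involved),
        "monetary": sum(tx[3] for tx in involved),
    }

# Hypothetical interval ending at t = 1000
txs = [
    ("a", "hub", 100, 5.0),
    ("b", "hub", 200, 7.0),
    ("a", "hub", 300, 1.0),
    ("hub", "c", 400, 10.0),
]
features = node_features("hub", txs, interval_end=1000)
```

In practice these raw values would still need the normalization step mentioned earlier before being passed to a model.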

The preprocessing phase also extends to the attributes of edges connecting the nodes. Edges in the transaction graph are annotated with transaction amounts, timestamps, and types, reflecting the flow and nature of assets transferred between nodes.

3.4. Embedding Generation

The embedding generation in the proposed system is a pivotal component that transforms the high-dimensional feature space of the subgraph-based representations into a compact, lower-dimensional space suitable for further processing. This transformation is achieved using a hybrid combination of GATs and LSTM networks. Together, these two approaches form an embedding model capable of capturing the complex relationships within the transaction subgraphs and the temporal dynamics across different states.

The initial stage of the embedding generation process employs multiple GAT layers, which are well suited to handling the irregular structure of graph data through attention mechanisms. Each node in a subgraph is processed using an attention mechanism that computes the weights of the edges connecting it to the nodes in its neighborhood. This attention mechanism allows the model to focus dynamically on the most relevant parts of the graph, emphasizing features that are more predictive of the central node’s behavior. To ensure the stability of the learning process and to capture information from various representations of the data, the GAT layers utilize multiple attention heads. Each head independently computes attention weights and node features, which are then combined to provide a comprehensive representation of each node’s local neighborhood.
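To make the attention step concrete, the following sketch computes single-head attention weights for one central node in plain Python. It follows the usual GAT formulation (a LeakyReLU over a learned vector applied to the concatenated features, then a softmax over the neighborhood), but it omits the learned linear projection and uses hand-picked numbers; the vector `a` and all feature values are hypothetical and not taken from the trained model.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def attention_weights(center, neighbors, a):
    """Softmax-normalized attention of `center` over `neighbors` (single head)."""
    scores = []
    for h in neighbors:
        concat = center + h  # concatenation [h_i || h_j] of two feature lists
        scores.append(leaky_relu(sum(w * x for w, x in zip(a, concat))))
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate(neighbors, weights):
    """Attention-weighted sum of neighbor features."""
    dim = len(neighbors[0])
    return [sum(w * h[d] for w, h in zip(weights, neighbors)) for d in range(dim)]

# Hypothetical 2-dimensional node features and attention vector
center = [1.0, 0.0]
neighbors = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
a = [0.5, -0.5, 1.0, -1.0]
weights = attention_weights(center, neighbors, a)
updated = aggregate(neighbors, weights)
```

With several heads, this computation would be repeated with a separate vector per head and the per-head outputs concatenated or averaged.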

Following the extraction of spatial features by the GAT layers, the embeddings are passed into an LSTM network to capture temporal dependencies and dynamics. The LSTM layer processes sequences of node embeddings generated for each time-sliced subgraph, allowing the model to capture temporal patterns and dependencies across different states of the graph. Each time slice corresponds to a specific temporal window, allowing the LSTM to track and learn from the progression and evolution of node states over time. By modeling structural and temporal dependencies, the proposed GAT–LSTM approach attains a higher representational capacity than either component in isolation. The temporally aware embeddings produced by the LSTM are further processed to reduce dimensionality and prepare for downstream tasks such as contrastive learning and the final fraudulent address classification.

3.5. Learning Process

The model training process for the proposed architecture, aimed at analyzing blockchain transactions for suspicious account detection, leverages a contrastive loss mechanism to fine-tune the embedding generation process. This approach enhances the model’s discrimination capabilities by focusing on the differences between similar (positive) and dissimilar (negative) node pairs. The application of contrastive learning specifically addresses the challenges posed by the class imbalance common in blockchain transaction data, where fraudulent nodes are vastly outnumbered by legitimate ones.

By focusing on representative and informative triplets, contrastive learning ensures that even a limited amount of labeled data from the minority class can effectively enhance the model’s detection capabilities. The cornerstone of the learning process is the utilization of a contrastive loss mechanism, designed to optimize the model by enhancing the distinction between embeddings of fraudulent and non-fraudulent nodes.

Central to the contrastive learning strategy is the use of triplet loss (Figure 4). This approach involves three key elements: an anchor (a node under examination), a positive sample (a node similar to the anchor), and a negative sample (a node dissimilar from the anchor). The anchor and positive sample share the same class, while the negative sample belongs to the opposite class. The triplet loss function is designed to decrease the distance between the anchor and the positive sample while increasing the distance between the anchor and the negative sample in the embedding space, enhancing the separability between classes.
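The triplet objective can be written as max(d(a, p) − d(a, n) + m, 0), where d is a distance in the embedding space and m is a margin. The sketch below implements this standard form; the Euclidean distance, the margin of 1.0, and the example embeddings are assumptions, since the text does not state which distance or margin was used.

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: max(d(a, p) - d(a, n) + margin, 0)."""
    return max(euclidean(anchor, positive) - euclidean(anchor, negative) + margin, 0.0)

# Hypothetical 2-dimensional embeddings
anchor = [0.0, 0.0]
positive = [0.0, 1.0]        # same class as the anchor, distance 1
easy_negative = [3.0, 4.0]   # opposite class, distance 5
hard_negative = [0.0, 1.5]   # opposite class, distance 1.5

easy = triplet_loss(anchor, positive, easy_negative)   # already separated: loss 0
hard = triplet_loss(anchor, positive, hard_negative)   # too close: positive loss
```

Only triplets whose negative lies within the margin of the positive contribute a non-zero loss, which is why selecting informative triplets matters when labeled fraud cases are scarce.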

Through this refined contrastive learning approach, the model undergoes a continuous self-enhancement process. The optimization of the contrastive loss function leads to a more robust graph embedding space where the proximity of embeddings accurately reflects the positive similarity in node behavior, thereby clustering similar behaviors more closely and pushing dissimilar ones farther apart.

3.6. Fraudulent Address Classification

The classification of fraudulent addresses represents the final component in the detection of suspicious activities within the blockchain transaction graph. This phase employs the embeddings generated from transaction data, incorporating the static and dynamic features of transactions, to classify nodes as either fraudulent or legitimate. The classification process is underpinned by traditional machine learning models that are trained on labeled embedding data, where each node is marked as fraudulent or non-fraudulent based on historical evidence and analysis.

In the evaluation of various machine learning models for classifying suspicious accounts, the focus has been primarily on their performance in accurately identifying minority class nodes. A diverse set of algorithms was considered, including XGBoost [61] and SVM. However, Random Forest [62] was chosen for implementation within this system due to its capabilities in managing high-dimensional data and class imbalance. Random Forest, an ensemble learning method that combines multiple decision trees, is particularly noted for its classification accuracy. The strength of this model lies in its ensemble approach, which averages the outputs of individual decision trees, thereby reducing the potential for overfitting.

4. Experiments

Within the diverse landscape of blockchain platforms, the Worldwide Asset Exchange (WAX) blockchain (https://www.wax.io/ (accessed on 10 April 2025)) distinguishes itself with features specifically designed for blockchain gaming and trading virtual assets, particularly NFTs. Utilizing a delegated proof-of-stake consensus mechanism with 21 block producers, the WAX blockchain offers a specialized environment tailored for the dynamics of NFT markets. A defining feature of the WAX blockchain platform is its integration of the Atomic Assets (AtomicAssets: https://github.com/pinknetworkx/atomicassets-contract (accessed on 10 April 2025)) standard for NFT utilities. This protocol establishes a uniform framework for the creation, purchase, sale, and exchange of digital assets across decentralized applications. It ensures seamless interoperability among applications and services, facilitating the broad adoption and functional expansion of digital collectibles. The Atomic Assets standard also enforces a standardized method for recording these transactions, enhancing the transparency, reliability and consistency of data for analytical processes.

Additionally, the WAX blockchain’s implementation of short cryptocurrency wallet addresses significantly improves the user accessibility by simplifying transaction processes and reducing the cognitive load associated with managing complex wallet addresses. Another notable advantage of the WAX ecosystem is its elimination of transaction fees, which serves as a significant incentive for frequent and voluminous trading activities. This unique feature is relatively rare in blockchain ecosystems and is pivotal in promoting active user interaction. The absence of fees not only stimulates trading activities but also generates extensive transactional data, providing a rich resource for in-depth analysis of network behaviors and transactional dynamics.

These distinctive characteristics make the WAX blockchain an exemplary platform for conducting experimental research into user transaction behaviors within decentralized digital marketplaces, especially in the evolving and dynamic landscape of NFTs and blockchain gaming. Although this study focuses on the WAX blockchain, the proposed fraud-detection framework is not restricted to this network. Any account- or address-based blockchain with a transactional history can be transformed into a transaction graph and processed by dynamic subgraph sampling.

4.1. Dataset Creation

Given the scarcity and challenges associated with obtaining robust and accurately labeled data for blockchain fraud detection, the proposed system involves the establishment of a Hyperion History API (Hyperion History API: https://github.com/eosrio/hyperion-history-api (accessed on 10 April 2025)) node targeting the WAX blockchain. This specialized node enables data collection from specific smart contracts, enhancing the capability to monitor a wide range of data and standards across the blockchain network. This systematic approach facilitates the acquisition of a diverse transaction dataset, which currently encompasses over 5 million accounts. Every record within the dataset includes several distinct attributes, providing details about each transaction. The attributes include the following:

simple_actions.transaction_id: Unique identifier for each stored transaction, enabling the tracking and analysis of individual transactions across the network.
simple_actions.timestamp: The execution timestamp of the transaction.
simple_actions.data.from: The sender of the transaction.
simple_actions.data.to: The recipient of the transaction.
simple_actions.data.amount: The amount involved in the transaction, providing a quantitative value.
simple_actions.data.symbol: The currency token symbol, which indicates the type of asset transferred.
simple_actions.data.memo: Information included by the sender, serving as a description of the transaction.
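A record with the attributes listed above can be mapped to a graph edge along the following lines. The sketch assumes that each record is a flat dictionary keyed by the dotted attribute names and that the amount arrives as a numeric string; the record layout and all sample values are hypothetical and do not reflect the actual export format.

```python
def record_to_edge(record):
    """Convert one transaction record into an edge description."""
    return {
        "tx_id": record["simple_actions.transaction_id"],
        "source": record["simple_actions.data.from"],
        "target": record["simple_actions.data.to"],
        "timestamp": record["simple_actions.timestamp"],
        "amount": float(record["simple_actions.data.amount"]),
        "symbol": record["simple_actions.data.symbol"],
        # The memo is free text supplied by the sender and may be absent.
        "memo": record.get("simple_actions.data.memo", ""),
    }

# Hypothetical record; every value is invented for illustration.
record = {
    "simple_actions.transaction_id": "tx-0001",
    "simple_actions.timestamp": "2024-01-01T00:00:00.000",
    "simple_actions.data.from": "sender.wam",
    "simple_actions.data.to": "receiver.wam",
    "simple_actions.data.amount": "12.5",
    "simple_actions.data.symbol": "WAX",
    "simple_actions.data.memo": "trade",
}
edge = record_to_edge(record)
```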

For the transaction dataset construction, approximately 9000 accounts have been identified and incorporated based on their blacklisting by AtomicHub (AtomicHub Digital Asset Marketplace: https://atomichub.io/ (accessed on 17 November 2025)) smart contracts due to fraudulent activities. Additionally, 450 accounts were selected from the WAX fraud prevention database (Wax Fraud Prevention: https://github.com/wax-fraud-prevention/ (accessed on 17 November 2025)), which is maintained by the community and regularly updated with new instances of reported activities. The dataset was further enriched by analyzing transactions associated with these blacklisted and suspicious accounts to determine their periods of activity. Correspondingly, normal nodes active within previously identified periods were also included to ensure a comprehensive representation of network behavior.

To facilitate diverse experimental scenarios and validate the effectiveness of the detection approach, multiple datasets (Table 1) were constructed by random walks to simulate real-world conditions. These datasets vary by the number of nodes and the proportion of labeled instances, reflecting different levels of class imbalance typical in the context of fraud detection scenarios.

4.2. Experimental Setup

To assess the efficacy of the proposed detection model, a robust experimental setup was implemented to ensure the reliability of the findings. Cross-validation [63] is employed to train and evaluate learning algorithms by dividing data into distinct sets for training and validation. This approach is beneficial in datasets characterized by class imbalances as it optimizes the usage of available labeled data by stratified selection for training and validation while ensuring that the model’s performance is not biased and accurately reflects the ability to generalize across unseen data. For each dataset, accounts are partitioned into k = 10 equally sized subsets while preserving the fraud/legitimate ratio in each subset. At each iteration, one fold serves as the test set, and the remaining k − 1 folds form the training pool. This approach reduces dependence on a single arbitrary split and makes the evaluation more robust in the presence of severe class imbalance, with the reported metrics reflecting the average performance across all k folds.
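The stratified partitioning can be sketched as follows. The example deals the indices of each class round-robin into k folds so that every fold keeps roughly the same fraud/legitimate ratio; the function name, the seed, and the toy label vector are assumptions made for illustration.

```python
import random

def stratified_folds(labels, k=10, seed=0):
    """Return k lists of indices with a similar class ratio in each fold."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % k].append(index)  # deal round-robin within each class
    return folds

# Hypothetical labels: 20 fraudulent (1) and 180 legitimate (0) accounts
labels = [1] * 20 + [0] * 180
folds = stratified_folds(labels, k=10)

for test_fold in folds:
    # The remaining k - 1 folds form the training pool for this iteration.
    train = [i for fold in folds if fold is not test_fold for i in fold]
```

Metrics would be computed on each `test_fold` in turn and averaged over the k iterations.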

In the evaluation process, a range of machine learning models was systematically deployed. It included feature-based methods relying only on node attributes, reflecting a straightforward approach. Additionally, graph representation learning techniques such as Node2Vec, which incorporates topological information of the network, and advanced deep learning-based methods such as GAT and GraphSage [64] were evaluated.

The performance of the previously mentioned models was quantitatively measured using three metrics specifically chosen to address the high imbalance in the dataset and real-world fraud scenarios [65]. Recall measures the percentage of actual fraudulent nodes correctly identified by the model. Precision assesses the proportion of nodes classified as fraudulent that are indeed fraudulent. Additionally, F1-score is a harmonic mean of precision and recall, providing a single metric that balances both aspects. A high F1-score for the minority class indicates that the model maintains a good balance between correctly identifying fraudulent nodes and minimizing false positives.
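For reference, the three metrics can be computed for the minority class as in the sketch below, where label 1 is assumed to denote a fraudulent node; the function name and the example predictions are hypothetical.

```python
def minority_class_metrics(y_true, y_pred, positive=1):
    """Precision, recall, and F1-score for the class labeled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Hypothetical predictions: 3 true positives, 1 false positive, 1 false negative
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
precision, recall, f1 = minority_class_metrics(y_true, y_pred)
```

Overall accuracy is deliberately left out: with a very small share of fraudulent labels, a model that predicts "legitimate" for every node would still score close to 100% accuracy.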

The model architecture utilizes two GAT layers. Each layer uses 32 hidden units and four attention heads to diversify the focus on various subgraph parts. Architectural choices were determined via a grid search on D₁. For the GAT, {1, 2, 3} layers, {16, 32, 64} hidden units, and {2, 4, 8} attention heads were considered. The configuration with two GAT layers, 32 hidden units, and four attention heads per layer achieved the best minority-class F1-score on D₁ and was subsequently reused for D₂–D₄. For training, a batch size of 128 is used to balance computational efficiency and effective learning. The model is optimized using the Adam optimizer, initialized with a learning rate of 0.01. To fine-tune the learning process, a learning rate scheduler is employed, reducing the learning rate if no improvement in validation loss is observed.
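The search space described above contains 3 × 3 × 3 = 27 candidate configurations, which can be enumerated as in the sketch below. The dictionary keys and helper names are illustrative, and `score_fn` is a hypothetical placeholder for training a model with a given configuration and returning its minority-class F1-score on D₁.

```python
from itertools import product

search_space = {
    "gat_layers": [1, 2, 3],
    "hidden_units": [16, 32, 64],
    "attention_heads": [2, 4, 8],
}

def grid(space):
    """Yield every combination of the listed hyperparameter values."""
    keys = list(space)
    for values in product(*(space[key] for key in keys)):
        yield dict(zip(keys, values))

def select_best(space, score_fn):
    """Return the configuration with the highest score.

    `score_fn` is a hypothetical callable standing in for a full
    train-and-validate run that returns the minority-class F1-score.
    """
    return max(grid(space), key=score_fn)

configs = list(grid(search_space))  # all candidate configurations
```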

All models were implemented in Python (3.10) using PyTorch (2.1.0) and standard graph-learning libraries. Experiments were conducted on a dedicated server equipped with a single GPU with 32 GB of device memory, a 2.10 GHz multi-core CPU, and 128 GB of RAM. Table entries marked “OOM” indicate that the corresponding full-graph method ran out of GPU memory on this device.

4.3. Under-Sampled Scale Dataset Results

The evaluation of the proposed fraud detection model on small-scale datasets provides valuable insights into its performance under different conditions of data scarcity and class imbalance. These smaller datasets, designed to mimic real-world scenarios with limited labeled data, serve as a manageable scale for assessing the model’s robustness and adaptability. The datasets used for this evaluation are D1, D2, and D3 (Table 2), reflecting a label percentage of 0.35%. Each dataset encompasses a varied distribution of transaction records, including those from the WAX fraud prevention database and normal nodes, ensuring a representative sample of WAX blockchain transaction behavior. By focusing on nodes exhibiting moderate levels of connectivity, we eliminated outliers that could negatively impact the analysis. This filtering kept the focus on the most relevant entities, excluding smart contracts, to understand the network’s fraudulent and non-fraudulent behaviors.

The results of the evaluation on these small-scale datasets highlight the effectiveness of the model’s contrastive learning approach. Despite the limited number of labeled instances, the model demonstrates a strong ability to differentiate between fraudulent and legitimate nodes. The use of contrastive loss ensures that even a sparse amount of labeled data is sufficient to significantly enhance the model’s discriminative capabilities. The proposed method achieves its best performance on the D3 dataset, with approximately 95.50% F1-score, 95.81% recall, and 95.19% precision for the minority class.

To complement the quantitative analysis, the visualized results (Figure 5) illustrate the separated distribution of nodes from D1. In the visualization, red points represent fraudulent accounts, while blue points represent legitimate ones. This visualization demonstrates the model’s ability to distinguish between fraudulent and legitimate nodes, as evidenced by the clustering patterns observed in the plot. The distinct separation of red and blue points indicates that the model effectively isolates suspicious nodes, providing a visual confirmation of its accuracy and reliability in fraud detection.

Analysis of the classification results shows that the observed suspicious addresses follow a clear two-step transaction pattern. Initially, these illicit addresses acquire substantial amounts of digital assets through deceptive means, which include phishing links, scam promotions, or direct messages through social media platforms. Following the acquisition, the stolen assets are quickly converted to the platform’s native currency and subsequently transferred to different accounts, which may be controlled by the attackers or used to obscure the trail of stolen funds (Figure 6).

The transactional behavior of these fraudulent addresses is also characterized by rapid movements of funds, which are usually inconsistent with the normal transaction patterns observed in regular user accounts. For instance, a typical reported address may engage in sudden and uncharacteristic accumulation of assets followed by equally rapid dispersion to multiple accounts. This behavior is indicative of the ‘cash-out’ phase [66] in fraud schemes, where the accumulated digital assets are converted into untraceable forms through mixing services.

One of the most prevalent fraudulent schemes identified involves the creation of fake websites that closely mimic legitimate entities, often differing by only minor variations in the domain names. Given that decentralized applications require a signature for authentication during login, many users inadvertently overlook these subtle discrepancies and unwittingly authorize fraudulent transactions. The attackers often use tactics such as fake giveaways and time-sensitive promotions to create a sense of urgency, prompting users to act quickly without further due diligence. This results in the transfer of all digital assets from their accounts without explicit consent, leading to significant financial losses.

Analyzing false positives and false negatives can help to understand the nature of the problem and the shortcomings of the model. In this experiment, 35 misclassified addresses were thoroughly examined to determine the reasons for misclassification. For false negatives, the analysis revealed that the transactional patterns of some misclassified nodes did not align with the typical fraudulent patterns identified in this study. A notable example includes addresses involved in minimal activity, such as having only two transactions (one incoming and one outgoing) and interacting with just two other addresses. These addresses did not exhibit the typical behavior patterns that the model was trained to recognize, leading to their misclassification as legitimate nodes.

False positives, on the other hand, involved addresses that exhibited similar characteristics to suspicious addresses. For instance, certain star addresses were noted for receiving transfers from multiple distinct users while sending out to only a few. These addresses can be mistakenly identified as fraudulent due to their high volume of incoming transactions from diverse sources. However, these activities might not necessarily indicate malicious intent but could represent a hub in a legitimate business or community network.

4.4. Large-Scale Dataset Results

The scale and volume of the transaction data on the WAX blockchain are significantly larger than what is typically observed in smaller, more balanced datasets. This highly imbalanced scale presents unique challenges in data analysis and processing. This section undertakes a comprehensive data scale experiment using dataset D4, specifically chosen to assess the processing capabilities and scalability of various transaction data processing methods. The selected dataset reflects a label percentage of 0.06%, highlighting the imbalance between the classes.

As Table 3 outlines, the findings highlight the relative performances of conventional methods versus the proposed dynamic subgraph sampling contrastive training approach. Graph-based learning methods, which typically involve processing the entire graph for training, exhibit significant limitations as the data scale increases. For instance, methods such as GraphSage and GAT face challenges due to the extensive volume of data, which exceeds typical processing and memory limits and compromises their ability to identify suspicious accounts within a reasonable timeframe.

In contrast, the subgraph sampling method remains computationally lightweight even as the dataset expands. This approach leverages the structural segmentation of blockchain transactions by focusing on localized subgraphs, which are smaller, manageable sections of the larger network. By concentrating on the local neighborhood of the central nodes, the method significantly reduces the computational complexity, making it feasible to process large-scale datasets without overwhelming system resources. As a result, the model can accommodate growing datasets while maintaining high performance levels.

Through clustering and analysis of blacklisted accounts, we have uncovered another fraudulent pattern exploiting the decentralized and transparent nature of NFT market trade offers (Figure 7). This scheme involves a multi-stage process that mimics legitimate trading activities, making it particularly challenging to detect. The fraudulent operation involves continuous surveillance of the public blockchain network for transactions featuring high-value NFTs. Upon identifying a viable transaction, the perpetrators initiate a duplicate fake transaction.

This duplicate or mirror transaction is designed to closely resemble the original, legitimate NFT buy offer, maintaining its visual and structural integrity while also incorporating a deceptive modification. Specifically, the required cryptocurrency component of this fraudulent buy offer transaction is altered either in its nominal valuation or denominative characteristics to a variant with significantly reduced value, misleadingly presented as equivalent to more legitimate counterparts.

This fraudulent tactic capitalizes on the inherent trust typically present in direct NFT transactions between users. In a common scenario, a seller who intends to transact an NFT to a specific buyer receives a purchase offer which they are prepared to accept. However, perpetrators exploit this transactional trust by deploying a mirror offer that is meticulously designed to emulate the legitimate buyer’s offer. Sellers, who may be under time pressure or may not conduct a thorough due diligence of the offers, are at risk of mistakenly accepting the fraudulent offer in place of the genuine one. This strategic manipulation targets the seller’s primary focus on the NFT’s attributes and the urgency to finalize the sale, consequently leading to an oversight of critical discrepancies in the cryptocurrency component.

Moreover, the deeper analysis of false-positive accounts uncovered multiple networks of accounts associated with botnets (Figure 8), which are primarily involved in systematic mining activities within decentralized applications. These botnets exhibit a well-coordinated pattern of behavior, where they frequently interact with specific smart contracts on the blockchain network and subsequently transfer accrued cryptocurrencies to predetermined external addresses. This pattern of behavior, while not inherently illegal, suggests a coordinated effort that could potentially distort market dynamics through concentrated resource allocation.

Creating a vast number of wallets is a common practice that is not fraudulent by itself. However, the proliferation of such wallets, particularly when linked to a decentralized project, can artificially inflate the user metrics and distort project standings in public and investor perceptions [67]. This inflation can misrepresent a project’s true popularity and success, potentially attracting new investors under false pretenses. In more severe scenarios, these false actions may mimic characteristics of wash trading [68], where the intent is to create a deceptive appearance of activity by generating numerous transactions to simulate high trading volumes.

4.5. Unsupervised Learning

Anomaly detection has a significant role in the identification of suspicious and potentially fraudulent activities within blockchain transactions. This approach leverages a diverse array of machine learning algorithms designed to identify deviations from expected values, fluctuations, or sudden changes in trends that may indicate malicious intent or irregular behavior. This analytical process is essential for detecting unusual and suspicious events, potentially serving as indicators of novel or previously unrecognized behavioral patterns [54,55]. For this study a selection of unsupervised machine learning algorithms was employed due to their robust capability to process the embeddings derived from the transaction data. These embeddings were crafted using contrastive learning and graph representation to capture the transaction patterns, enhancing the model’s ability to discern normal from anomalous activities. These algorithms include Isolation Forest (IF), Clustering-Based Local Outlier Factor (CBLOF) and Autoencoder (AE).

In the detection process, each selected algorithm assigns an anomaly score to individual accounts based on their embeddings, which encapsulate a range of transactional data and features. To maintain consistency and comparability of the results across different models, all anomaly scores are normalized to a uniform scale ranging from 0 to 1. The primary anomaly score for each selected account is then computed by averaging the scores assigned by the three algorithms. This method establishes a robust mechanism for ranking accounts, in which accounts with higher scores are flagged because they indicate a greater deviation from typically observed transaction patterns.
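The score combination can be sketched as follows. The text states only that scores are normalized to the range 0 to 1 and then averaged, so min-max normalization is assumed here; the raw scores are invented and the detectors themselves are not implemented.

```python
def min_max(scores):
    """Rescale a list of scores linearly to the interval [0, 1]."""
    low, high = min(scores), max(scores)
    if high == low:
        return [0.0 for _ in scores]  # all scores identical: nothing stands out
    return [(s - low) / (high - low) for s in scores]

def combined_anomaly_scores(raw_scores_by_model):
    """Average the normalized scores of all models, account by account."""
    normalized = [min_max(scores) for scores in raw_scores_by_model.values()]
    return [sum(column) / len(column) for column in zip(*normalized)]

# Hypothetical raw scores for three accounts from the three detectors
raw = {
    "IF": [0.1, 0.4, 0.9],
    "CBLOF": [2.0, 3.0, 10.0],
    "AE": [0.01, 0.02, 0.05],
}
combined = combined_anomaly_scores(raw)
# Account indices ordered from most to least anomalous
ranking = sorted(range(len(combined)), key=lambda i: combined[i], reverse=True)
```

Normalizing before averaging prevents a detector with a wide raw score range, such as the second one in this example, from dominating the combined score.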

The distribution of these anomaly scores is depicted in Figure 9, where red points represent fraudulent, blacklisted accounts, while blue points represent legitimate ones. The alignment of individual anomaly scores along the diagonal of the graph shows a consistent evaluation across all models. The visual representation aids in identifying accounts with the highest anomaly scores, which are prioritized for further manual analysis and verification by experts. This visual and analytical approach aids in the prompt identification of high-risk accounts and enhances the decision-making process for subsequent investigations.

To ensure the reliability of the detected anomalies, a threshold was established to categorize the anomaly scores. Accounts with scores above this threshold were flagged for further investigation. The detected anomalies were then cross-verified with labeled accounts in the dataset, identified from the public blacklists or community reports. This cross-verification process, detailed in Table 4, confirmed that the detected anomalies correspond to labels of known fraudulent nodes, thereby validating the effectiveness of the approach.
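The thresholding and comparison against known labels can be illustrated with the sketch below. The threshold value is not given in the text, so 0.7 is a hypothetical placeholder, as are the account names and scores.

```python
def flag_accounts(scores, threshold):
    """Return the accounts whose anomaly score exceeds the threshold."""
    return {account for account, score in scores.items() if score > threshold}

def overlap_with_labels(flagged, blacklisted):
    """Return the flagged accounts that are blacklisted and their share."""
    confirmed = flagged & blacklisted
    share = len(confirmed) / len(flagged) if flagged else 0.0
    return confirmed, share

# Hypothetical combined anomaly scores and blacklist
scores = {"acc1": 0.91, "acc2": 0.15, "acc3": 0.78, "acc4": 0.40}
blacklisted = {"acc1", "acc5"}

flagged = flag_accounts(scores, threshold=0.7)
confirmed, share = overlap_with_labels(flagged, blacklisted)
```

Flagged accounts that are not on any blacklist, such as `acc3` here, are the candidates for manual review, since they may be either new fraud cases or legitimate outliers.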

The anomaly detection process not only validated the effectiveness of the contrastive learning embedding generation approach but also provided significant insights into the broader transaction ecosystem. Furthermore, the detailed analysis revealed that the anomaly detection algorithms identified multiple legitimate accounts exhibiting unusual transaction behavior (Figure 10). These accounts include large trading accounts, known as high-frequency trading (HFT) bots (HFT: http://academy.binance.com/glossary/high-frequency-trading-hft (accessed on 14 August 2025)), that perform high-value transactions at a rapid pace. Additionally, automated accounts executing transactions to exploit price differences across markets, known as arbitrage bots, were also identified [69,70]. Although these accounts are normal and not fraudulent, their atypical frequency of transactions flagged them as suspicious. Identifying and categorizing these legitimate outliers is essential, as their removal from the dataset can further enhance the accuracy of fraud detection models. By filtering out these legitimate outliers, the anomaly detection process ensures that the model focuses more accurately on genuinely suspicious activities, excluding smart contracts and accounts with similar behavior.

This integrated approach facilitated a refined validation of suspicious accounts, affirming the robustness of both the contrastive learning embedding generation and the anomaly detection algorithms. By providing deeper insights into transactional behavior patterns, this approach enhances the precision of fraud detection by effectively filtering out outliers that can mislead the learning process.

5. Conclusions and Future Work

This paper has provided a comprehensive overview of the experimental setup and results for detecting fraudulent transactions on the WAX Blockchain. Using a multi-faceted approach integrating various machine learning models, the experiments have effectively demonstrated the ability to identify and analyze patterns of fraudulent behavior within a complex blockchain environment. The WAX Blockchain, notable for its unique features tailored for NFT transactions and a user-friendly wallet address system, provided a rich dataset for the proposed research. The absence of transaction fees encouraged a high volume of interactions, which was instrumental in facilitating a detailed study of the transaction patterns and potential anomalies.

This research gathered detailed transaction data from approximately 5745 blacklisted accounts and more than 5 million normal accounts, enabling a targeted analysis of suspicious versus normal, non-fraudulent behavior. The experiments employed multiple machine learning methods to process and analyze the data, including a feature-based approach, random walk-based graph representation learning, deep learning-based methods, and contrastive learning-based embedding generation. These methods were evaluated on multiple datasets of various scales based on their ability to capture the complex characteristics of the minority class in blockchain transaction records.

The performance of these methods was assessed through precision, recall, and F1-score metrics, with the proposed approach demonstrating superior performance across all datasets. This indicates the effectiveness of the proposed approach in capturing complex interactions and subtle indicators of suspicious behavior that more conventional analysis techniques might miss. Moreover, the robustness of the proposed approach was further validated through anomaly detection, which tested the model’s performance under constrained data availability. This analysis highlighted the model’s resilience and its ability to maintain high levels of accuracy even with limited labeled data, a common challenge in real-world blockchain environments.
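For reference, the three metrics named above can be computed from confusion-matrix counts as in the following minimal sketch, which treats the fraudulent (minority) class as the positive class. The counts in the usage example are hypothetical and do not correspond to any result reported in this paper.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Precision, recall, and F1 for the positive (fraudulent) class."""
    # Precision: share of flagged accounts that are truly fraudulent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of truly fraudulent accounts that were flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts: 90 true positives, 10 false positives, 30 false negatives.
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
```

Because true negatives do not enter any of the three formulas, these metrics remain informative when normal accounts vastly outnumber fraudulent ones.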

Manual analysis of the observed transaction data is time-consuming, and the models do not offer a straightforward interpretation of their decisions. Nevertheless, the WAX blockchain community is highly active, and its members can be considered experts in the related domain. Relevant discussions regarding suspicious accounts and unusual transactions were sourced from the official fraud reporting channels on community forums, which were further used for better data labeling and transaction analysis.

Expanding the scope of the methodology to specifically focus on NFT transactions, this study was able to quantify transfers and detect suspicious activities related to NFT trades on the WAX blockchain. NFT transactions, which typically exhibit clear patterns of acquiring and subsequently transferring digital assets, provided pronounced behavioral embeddings that facilitated the detection of fraudulent accounts. The distinct pattern of these transactions offered a significant advantage in identifying suspicious activities within NFT markets, as accounts involved in fraudulent actions exhibited anomalous, contrastive behavior compared to typical user transactions.

Although the empirical evaluation in this study focuses on the WAX blockchain, the proposed fraud-detection framework is not restricted to this network. Any account-based or address-based blockchain with a transactional history can be transformed into a transaction graph and processed by dynamic subgraph sampling. Despite these strengths, several limitations remain, which also outline important directions for future research. In particular, training graph-neural-network-based models with temporal components and contrastive learning on large transaction graphs remains computationally demanding.
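To make the portability claim concrete, the following minimal sketch builds a transaction graph from generic (sender, receiver, timestamp) records and extracts a time-bounded k-hop neighborhood around a central account. It is an illustration under simplifying assumptions, not the sampling procedure used in the experiments: edges are treated as undirected, the record format and the function names `build_graph` and `sample_subgraph` are hypothetical, and the account names are invented.

```python
from collections import defaultdict, deque
from typing import Dict, Iterable, List, Set, Tuple

Transfer = Tuple[str, str, int]  # (sender, receiver, unix timestamp)


def build_graph(transfers: Iterable[Transfer]) -> Dict[str, List[Tuple[str, int]]]:
    """Adjacency list keyed by account; every edge keeps its timestamp."""
    adj: Dict[str, List[Tuple[str, int]]] = defaultdict(list)
    for src, dst, ts in transfers:
        # Assumption: direction is ignored, so the edge is stored both ways.
        adj[src].append((dst, ts))
        adj[dst].append((src, ts))
    return adj


def sample_subgraph(adj: Dict[str, List[Tuple[str, int]]], center: str,
                    hops: int, t_start: int, t_end: int) -> Set[str]:
    """Breadth-first k-hop neighborhood using only edges in [t_start, t_end]."""
    seen = {center}
    queue = deque([(center, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == hops:
            continue  # do not expand beyond the hop limit
        for nbr, ts in adj.get(node, []):
            if t_start <= ts <= t_end and nbr not in seen:
                seen.add(nbr)
                queue.append((nbr, depth + 1))
    return seen


# Hypothetical transfers between invented accounts.
transfers = [("alice", "bob", 100), ("bob", "carol", 150),
             ("carol", "dave", 900), ("alice", "erin", 950)]
adj = build_graph(transfers)
early = sample_subgraph(adj, "alice", hops=2, t_start=0, t_end=200)
late = sample_subgraph(adj, "alice", hops=2, t_start=800, t_end=1000)
```

Sampling the same central account over different time windows yields different subgraphs, which is what makes the sampling dynamic in this sketch.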

A promising direction for future research involves broadening the scope of detection to more comprehensively identify and analyze anomalous NFT sales on digital markets that may indicate complex fraudulent activities such as money laundering or wash trading. Refining the detection algorithms to recognize the contrast patterns typical of these illicit activities, and broadening the neighborhood analysis around central nodes, can significantly improve the capability to detect and address these complex schemes.

Additionally, the clustering of accounts based on their transaction behaviors, which has proven effective in identifying suspicious activities, can be leveraged to develop targeted detection strategies for specific types of fraud. This approach can utilize the transactional networks within these clusters to fine-tune the contrastive learning approach, improving the capability to distinguish between legitimate and illegitimate nodes by decreasing the distance between similar nodes within a cluster while positioning dissimilar ones further apart. This targeted approach improves detection accuracy and aids in the development of more robust multi-class classification models that address various categories of fraud more effectively.
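The idea of pulling similar nodes together and pushing dissimilar ones apart can be illustrated with a standard margin-based pairwise contrastive loss. This is a generic textbook form used here only as an example. It is not claimed to be the exact loss function of the proposed model, and the embeddings and the `margin` value are hypothetical.

```python
import math
from typing import Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pairwise_contrastive_loss(a: Sequence[float], b: Sequence[float],
                              same_class: bool, margin: float = 1.0) -> float:
    """Margin-based contrastive loss for a single pair of embeddings."""
    d = euclidean(a, b)
    if same_class:
        # Similar pair: any remaining distance is penalized.
        return d ** 2
    # Dissimilar pair: penalized only while closer than the margin.
    return max(0.0, margin - d) ** 2


# Hypothetical 2-D embeddings.
anchor = [0.0, 0.0]
near = [0.3, 0.4]   # distance 0.5 from the anchor
far = [3.0, 4.0]    # distance 5.0 from the anchor

loss_similar = pairwise_contrastive_loss(anchor, near, same_class=True)
loss_dissimilar_near = pairwise_contrastive_loss(anchor, near, same_class=False)
loss_dissimilar_far = pairwise_contrastive_loss(anchor, far, same_class=False)
```

A dissimilar pair that is already farther apart than the margin contributes zero loss, so optimization effort concentrates on pairs that are still confusable.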

Another significant area of development is the integration of an Explainable AI (XAI) approach [71,72]. The current pipeline provides limited model transparency, as decisions are primarily driven by high-dimensional embeddings that are difficult to interpret without additional forensic tools, underscoring the need for systematic integration of explainable AI techniques. This integration can provide clearer insights into the decisions made by the fraud detection algorithms, aligning the model’s outputs with, for example, Financial Action Task Force red flags [73] and other rule-based regulatory frameworks. This alignment enhances the transparency of the fraud detection process while also helping experts respond quickly and effectively to potential threats, thus reducing the time required for expert review.

Future work could also extend to the dynamic analysis of subgraphs to identify specific transactions as fraudulent. This approach would allow for a more granular view of the transactional ecosystem, identifying specific instances of fraud within a network of interactions. By dynamically analyzing the subgraphs in which transactions occur, detection systems can evolve to recognize temporal patterns and sudden anomalies in users’ transaction behavior, providing a more immediate and precise detection capability.
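As a rough illustration of what detecting sudden anomalies in transaction behavior could look like at its simplest, the sketch below counts one account's transactions per fixed time window and flags windows whose count lies far above the account's own average. The z-score rule, the window size, and the threshold are assumptions chosen for the example and are not part of the proposed method.

```python
from collections import Counter
from statistics import mean, pstdev
from typing import List


def window_counts(timestamps: List[int], window: int) -> List[int]:
    """Number of transactions per fixed-size window, including empty windows."""
    if not timestamps:
        return []
    start = min(timestamps)
    buckets = Counter((ts - start) // window for ts in timestamps)
    return [buckets.get(i, 0) for i in range(max(buckets) + 1)]


def burst_windows(counts: List[int], threshold: float = 2.0) -> List[int]:
    """Indices of windows whose count exceeds mean + threshold * std."""
    if len(counts) < 2:
        return []
    mu, sigma = mean(counts), pstdev(counts)
    if sigma == 0:
        return []  # perfectly regular activity, nothing to flag
    return [i for i, c in enumerate(counts) if (c - mu) / sigma > threshold]


# Hypothetical account: one transaction per minute for nine minutes,
# followed by twenty transactions within the tenth minute.
timestamps = [i * 60 for i in range(9)] + [9 * 60 + j for j in range(20)]
counts = window_counts(timestamps, window=60)
bursts = burst_windows(counts, threshold=2.0)
```

A transaction-level detector would go further and score individual edges inside the flagged window, which is the granularity this paragraph points to.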

Author Contributions

Conceptualization, T.A. and F.B.; methodology, T.A. and F.B.; software, T.A.; validation, T.A. and F.B.; formal analysis, T.A.; investigation, T.A.; data curation, T.A.; writing—original draft preparation, T.A.; writing—review and editing, T.A. and F.B.; visualization, T.A.; supervision, F.B.; project administration, F.B.; funding acquisition, F.B. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by The Scientific Grant Agency of the Ministry of Education, Science, Research and Sport of the Slovak Republic under grant no. VEGA 1/0259/24.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Wronka, C. Central bank digital currencies (CBDCs) and their potential impact on traditional banking and monetary policy: An initial analysis. Digit. Financ. 2023, 5, 613–641.
2. Agarwal, N.; Wongthongtham, P.; Khairwal, N.; Coutinho, K. Blockchain Application to Financial Market Clearing and Settlement Systems. J. Risk Financ. Manag. 2023, 16, 452.
3. Eisenbach, T.M.; Kovner, A.; Lee, M.J. Cyber risk and the U.S. financial system: A pre-mortem analysis. J. Financ. Econ. 2022, 145, 802–826.
4. Dafri, W.; Al-Qaruty, R. Challenges and opportunities to enhance digital financial transformation in crisis management. Soc. Sci. Humanit. Open 2023, 8, 100662.
5. Tripathi, G.; Ahad, M.A.; Casalino, G. A comprehensive review of blockchain technology: Underlying principles and historical background with future challenges. Decis. Anal. J. 2023, 9, 100344.
6. Dutta, P.; Choi, T.M.; Somani, S.; Butala, R. Blockchain technology in supply chain operations: Applications, challenges and research opportunities. Transp. Res. Part E Logist. Transp. Rev. 2020, 142, 102067.
7. Merlo, V.; Pio, G.; Giusto, F.; Bilancia, M. On the exploitation of the blockchain technology in the healthcare sector: A systematic review. Expert Syst. Appl. 2023, 213, 118897.
8. Lu, W.; Wu, L. A blockchain-based deployment framework for protecting building design intellectual property rights in collaborative digital environments. Comput. Ind. 2024, 159–160, 104098.
9. Bonnet, S.; Teuteberg, F. Impact of blockchain and distributed ledger technology for the management of the intellectual property life cycle: A multiple case study analysis. Comput. Ind. 2023, 144, 103789.
10. Hewa, T.; Ylianttila, M.; Liyanage, M. Survey on blockchain based smart contracts: Applications, opportunities and challenges. J. Netw. Comput. Appl. 2021, 177, 102857.
11. Kaisto, J.; Juutilainen, T.; Kauranen, J. Non-fungible tokens, tokenization, and ownership. Comput. Law Secur. Rev. 2024, 54, 105996.
12. Wilson, K.B.; Karg, A.; Ghaderi, H. Prospecting non-fungible tokens in the digital economy: Stakeholders and ecosystem, risk and opportunity. Bus. Horiz. 2022, 65, 657–670.
13. Chalmers, D.; Fisch, C.; Matthews, R.; Quinn, W.; Recker, J. Beyond the bubble: Will NFTs and digital proof of ownership empower creative industry entrepreneurs? J. Bus. Ventur. Insights 2022, 17, e00309.
14. Far, S.B.; Bamakan, S.M.H.; Qu, Q.; Jiang, Q. A Review of Non-fungible Tokens Applications in the Real-world and Metaverse. Procedia Comput. Sci. 2022, 214, 755–762.
15. Zhang, W.; Liu, T.; Zhang, Y.; Hamori, S. Can NFTs hedge the risk of traditional assets after the COVID-19 pandemic? North Am. J. Econ. Financ. 2024, 72, 102149.
16. Aibai, A.; Julaiti, J.; Gou, S. The asymmetric effects of upside and downside risks in cryptocurrency markets: Insights from the LUNA and FTX crises. Financ. Res. Lett. 2024, 67, 105750.
17. Maras, M.-H.; Ives, E.R. Deconstructing a form of hybrid investment fraud: Examining ‘pig butchering’ in the United States. J. Econ. Criminol. 2024, 5, 100066.
18. Cheraghali, H.; Molnár, P.; Storsveen, M.; Veliqi, F. The impact of cryptocurrency-related cyberattacks on return, volatility, and trading volume of cryptocurrencies and traditional financial assets. Int. Rev. Financ. Anal. 2024, 95, 103439.
19. Xia, P.; Wang, H.; Zhang, B.; Ji, R.; Gao, B.; Wu, L.; Luo, X.; Xu, G. Characterizing cryptocurrency exchange scams. Comput. Secur. 2020, 98, 101993.
20. Ngai, E.W.T.; Hu, Y.; Wong, Y.H.; Chen, Y.; Sun, X. The application of data mining techniques in financial fraud detection: A classification framework and an academic review of literature. Decis. Support Syst. 2011, 50, 559–569.
21. Hilal, W.; Gadsden, S.A.; Yawney, J. Financial Fraud: A Review of Anomaly Detection Techniques and Recent Advances. Expert Syst. Appl. 2022, 193, 116429.
22. Bhattacharyya, S.; Jha, S.; Tharakunnel, K.; Westland, J.C. Data mining for credit card fraud: A comparative study. Decis. Support Syst. 2011, 50, 602–613.
23. Madhurya, M.J.; Gururaj, H.L.; Soundarya, B.C.; Vidyashree, K.P.; Rajendra, A.B. Exploratory analysis of credit card fraud detection using machine learning techniques. Glob. Transit. Proc. 2022, 3, 31–37.
24. Habibpour, M.; Gharoun, H.; Mehdipour, M.; Tajally, A.R.; Asgharnezhad, H.; Shamsi, A.; Khosravi, A.; Nahavandi, S. Uncertainty-aware credit card fraud detection using deep learning. Eng. Appl. Artif. Intell. 2023, 123, 106248.
25. Kirimhan, D. Importance of anti-money laundering regulations among prosumers for a cybersecure decentralized finance. J. Bus. Res. 2023, 157, 113558.
26. Elbaghdadi, A.; Mezroui, S.; El Oualkadi, A. SVM: An Approach to Detect Illicit Transaction in the Bitcoin Network. In Innovations in Smart Cities Applications Volume 4; SCA 2020; Lecture Notes in Networks and Systems; Ben Ahmed, M., Rakıp Karaș, İ., Santos, D., Sergeyeva, O., Boudhir, A.A., Eds.; Springer: Cham, Switzerland, 2021; Volume 183.
27. Garin, L.; Gisin, V. Machine learning in classifying bitcoin addresses. J. Financ. Data Sci. 2023, 9, 100109.
28. Saxena, R.; Arora, D.; Nagar, V. Classifying Transactional Addresses using Supervised Learning Approaches over Ethereum Blockchain. Procedia Comput. Sci. 2023, 218, 2018–2025.
29. Sanjalawe, Y.K.; Al-E’mari, S.R. Abnormal Transactions Detection in the Ethereum Network Using Semi-Supervised Generative Adversarial Networks. IEEE Access 2023, 11, 98516–98531.
30. Khan, A. Graph Analysis of the Ethereum Blockchain Data: A Survey of Datasets, Methods, and Future Work. In Proceedings of the 2022 IEEE International Conference on Blockchain (Blockchain), Espoo, Finland, 22–25 August 2022; pp. 250–257.
31. Huang, T.; Lin, D.; Wu, J. Ethereum Account Classification Based on Graph Convolutional Network. IEEE Trans. Circuits Syst. II Express Briefs 2022, 69, 2528–2532.
32. Wang, L.; Xu, M.; Cheng, H. Phishing scams detection via temporal graph attention network in Ethereum. Inf. Process. Manag. 2023, 60, 103412.
33. Wang, L.; Cheng, H.; Zheng, Z.; Yang, A.; Xu, M. Temporal transaction information-aware Ponzi scheme detection for ethereum smart contracts. Eng. Appl. Artif. Intell. 2023, 126, 107022.
34. Li, S.; Gou, G.; Liu, C.; Hou, C.; Li, Z.; Xiong, G. TTAGN: Temporal Transaction Aggregation Graph Network for Ethereum Phishing Scams Detection. In Proceedings of the WWW 2022-the ACM Web Conference 2022, Lyon, France, 25–29 April 2022; pp. 661–669.
35. Wu, Z.; Pan, S.; Chen, F.; Long, G.; Zhang, C.; Yu, P.S. A Comprehensive Survey on Graph Neural Networks. IEEE Trans. Neural Netw. Learn. Syst. 2021, 32, 4–24.
36. Ju, W.; Fang, Z.; Gu, Y.; Liu, Z.; Long, Q.; Qiao, Z.; Qin, Y.; Shen, J.; Sun, F.; Xiao, Z.; et al. A Comprehensive Survey on Deep Graph Representation Learning. Neural Netw. 2024, 173, 106207.
37. You, X.; Ma, Y.; Liu, Z.; Liu, J.; Zhang, M. Representation method of cooperative social network features based on Node2Vec model. Comput. Commun. 2021, 173, 21–26.
38. Zhou, M.; Han, Q.; Li, M.; Li, K.; Qian, Z. Nearest neighbor walk network embedding for link prediction in complex networks. Phys. A Stat. Mech. Its Appl. 2023, 620, 128757.
39. Abut, S.; Okut, H.; Kallail, K.J. Paradigm shift from Artificial Neural Networks (ANNs) to deep Convolutional Neural Networks (DCNNs) in the field of medical image processing. Expert Syst. Appl. 2024, 244, 122983.
40. Mansoor, H.; Gull, M.S.; Rauf, H.; Shaikh, I.u.H.; Khalid, M.; Arshad, N. Graph Convolutional Networks based short-term load forecasting: Leveraging spatial information for improved accuracy. Electr. Power Syst. Res. 2024, 230, 110263.
41. Chen, H.; Huang, Z.; Xu, Y.; Deng, Z.; Huang, F.; He, P.; Li, Z. Neighbor enhanced graph convolutional networks for node classification and recommendation. Knowl.-Based Syst. 2022, 246, 108594.
42. Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Liò, P.; Bengio, Y. Graph Attention Networks. arXiv 2017, arXiv:1710.10903.
43. Cherif, A.; Ammar, H.; Kalkatawi, M.; Alshehri, S.; Imine, A. Encoder–decoder graph neural network for credit card fraud detection. J. King Saud Univ.-Comput. Inf. Sci. 2024, 36, 102003.
44. Wang, X.; Shen, G.; Zhang, Z.; Lai, S.; Zhu, S.; Chen, Y.; Quan, D. Self-supervised graph autoencoder with redundancy reduction for community detection. Neurocomputing 2024, 590, 127703.
45. Wang, J.; Chen, P.; Xu, X.; Wu, J.; Shen, M.; Xuan, Q.; Yang, X. TSGN: Transaction Subgraph Networks Assisting Phishing Detection in Ethereum. arXiv 2022, arXiv:2208.12938.
46. Nguyen, T.T.; Phan, T.C.; Pham, H.T.; Nguyen, T.T.; Jo, J.; Nguyen, Q.V.H. Example-based explanations for streaming fraud detection on graphs. Inf. Sci. 2023, 621, 319–340.
47. Le-Khac, P.H.; Healy, G.; Smeaton, A.F. Contrastive Representation Learning: A Framework and Review. IEEE Access 2020, 8, 193907–193934.
48. Kong, M.; Li, R.; Wang, J.; Li, X.; Jin, S.; Xie, W.; Hou, M.; Cao, C. CFTNet: A robust credit card fraud detection model enhanced by counterfactual data augmentation. Neural Comput. Appl. 2024, 36, 8607–8623.
49. Zhang, Y.; Xu, Y.; Wei, S.; Wang, Y.; Li, Y.; Shang, X. Doubly contrastive representation learning for federated image recognition. Pattern Recognit. 2023, 139, 109507.
50. Chen, B.; Zhang, J.; Zhang, X.; Dong, Y.; Song, J.; Zhang, P.; Xu, K.; Kharlamov, E.; Tang, J. GCCAD: Graph Contrastive Coding for Anomaly Detection. IEEE Trans. Knowl. Data Eng. 2023, 35, 8037–8051.
51. Li, S.; Gou, G.; Liu, C.; Xiong, G.; Li, Z.; Xiao, J.; Xing, X. TGC: Transaction Graph Contrast Network for Ethereum Phishing Scam Detection. In Proceedings of the Annual Computer Security Applications Conference, Austin, TX, USA, 4–8 December 2023; ACM International Conference Proceeding Series; pp. 352–365.
52. Xiao, S.; Bai, T.; Cui, X.; Wu, B.; Meng, X.; Wang, B. A graph-based contrastive learning framework for medicare insurance fraud detection. Front. Comput. Sci. 2023, 17, 172341.
53. Ahmed, M.; Naser Mahmood, A.; Hu, J. A survey of network anomaly detection techniques. J. Netw. Comput. Appl. 2016, 60, 19–31.
54. Poutré, C.; Chételat, D.; Morales, M. Deep unsupervised anomaly detection in high-frequency markets. J. Financ. Data Sci. 2024, 10, 100129.
55. Adam, T.; Babič, F. Anomaly Detection on Distributed Ledger Using Unsupervised Machine Learning. In Proceedings of the 2023 IEEE International Conference on Omni-layer Intelligent Systems (COINS), Berlin, Germany, 23–25 July 2023; pp. 1–4.
56. Hariri, S.; Kind, M.C.; Brunner, R.J. Extended Isolation Forest. IEEE Trans. Knowl. Data Eng. 2021, 33, 1479–1489.
57. He, Z.; Xu, X.; Deng, S. Discovering cluster-based local outliers. Pattern Recognit. Lett. 2003, 24, 1641–1650.
58. Crisostomo, J.; Lobo, V.; Bacao, F. Detecting Fraudulent Wallets in Ethereum Blockchain Combining Supervised and Unsupervised Techniques-Using Autoencoders and XGboost. In Blockchain and Applications, 5th International Congress; BLOCKCHAIN 2023; Lecture Notes in Networks and Systems; Machado, J.M., Prieto, J., Vieira, P., Peixoto, H., Abelha, A., Arroyo, D., Vigneri, L., Eds.; Springer: Cham, Switzerland, 2023; Volume 778.
59. Bank, D.; Koenigstein, N.; Giryes, R. Autoencoders. arXiv 2020, arXiv:2003.05991.
60. Baesens, B.; Van Vlasselaer, V.; Verbeke, W. Fraud Analytics Using Descriptive, Predictive, and Social Network Techniques: A Guide to Data Science for Fraud Detection; John Wiley & Sons: Hoboken, NJ, USA, 2015.
61. Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
62. Biau, G.; Scornet, E. A random forest guided tour. TEST 2016, 25, 197–227.
63. Stone, M. Cross-validation: A review. Ser. Stat. 1978, 9, 127–139.
64. Hamilton, W.; Ying, Z.; Leskovec, J. Inductive representation learning on large graphs. arXiv 2017, arXiv:1706.02216.
65. Christen, P.; Hand, D.J.; Kirielle, N. A Review of the F-Measure: Its History, Properties, Criticism, and Alternatives. ACM Comput. Surv. 2023, 56, 73.
66. Yan, C.; Zhang, C.; Shen, M.; Li, N.; Liu, J.; Qi, Y.; Lu, Z.; Liu, Y. Aparecium: Understanding and detecting scam behaviors on Ethereum via biased random walk. Cybersecurity 2023, 6, 46.
67. Le Pennec, G.; Fiedler, I.; Ante, L. Wash trading at cryptocurrency exchanges. Financ. Res. Lett. 2021, 43, 101982.
68. Serneels, S. Detecting wash trading for nonfungible tokens. Financ. Res. Lett. 2023, 52, 103374.
69. Cernera, F.; la Morgia, M.; Mei, A.; Mongardini, A.M.; Sassi, F. Ready, Aim, Snipe! Analysis of Sniper Bots and their Impact on the DeFi Ecosystem. In Proceedings of the ACM Web Conference 2023-Companion of the World Wide Web Conference, WWW 2023, Austin, TX, USA, 30 April–4 May 2023; pp. 1093–1102.
70. Angerer, M.; Neugebauer, T.; Shachat, J. Arbitrage bots in experimental asset markets. J. Econ. Behav. Organ. 2023, 206, 262–278.
71. Zhou, Y.; Li, H.; Xiao, Z.; Qiu, J. A user-centered explainable artificial intelligence approach for financial fraud detection. Financ. Res. Lett. 2023, 58, 104309.
72. Lin, K.; Gao, Y. Model interpretability of financial fraud detection by group SHAP. Expert Syst. Appl. 2022, 210, 118354.
73. Stefánsson, H.P.; Grímsson, H.S.; Þórðarson, J.K.; Oskarsdottir, M. Detecting potential money laundering addresses in the Bitcoin blockchain using unsupervised machine learning. In Proceedings of the HICSS 2022, Maui, HI, USA, 4–7 January 2022; pp. 1–10.

Figure 1. Example of a phishing cash-out pattern on the WAX transaction graph.

Figure 2. The overall system architecture.

Figure 3. Temporal dynamics of fraudulent activities.

Figure 4. Embedding generation and contrast learning process.

Figure 5. Visualization of Fraudulent (Red) and Normal (Blue) Nodes.

Figure 6. Example pattern of phishing website addresses.

Figure 7. NFT Buy Offer Mirror Attack Sequence.

Figure 8. Example Botnet Transaction Pattern.

Figure 9. Average Anomaly Score for Suspicious (Red) and Normal (Blue) Nodes.

Figure 10. Example of High-Frequency Trading (HFT) Bot Pattern.

Table 1. Evaluation Datasets.

Dataset	Nodes	Labeled	Edges
Dataset 1 (D1)	42,857	150	32,415,410
Dataset 2 (D2)	85,714	300	51,135,114
Dataset 3 (D3)	12,571	450	62,013,474
Dataset 4 (D4)	5,574,215	6,344	182,443,217

Table 2. Performance comparison of different embedding methods.

Dataset	Metric	Only Features	Node2Vec	GAT	GraphSage	Proposed Model
Dataset 1 (D1)	F1	0.78	0.72	0.81	0.80	0.92
	Recall	0.76	0.74	0.76	0.78	0.91
	Precision	0.79	0.73	0.85	0.83	0.93
Dataset 2 (D2)	F1	0.75	0.79	0.85	0.82	0.93
	Recall	0.78	0.81	0.84	0.86	0.93
	Precision	0.71	0.76	0.86	0.79	0.94
Dataset 3 (D3)	F1	0.78	0.74	0.84	0.82	0.95
	Recall	0.74	0.80	0.81	0.80	0.95
	Precision	0.81	0.70	0.87	0.84	0.95

Table 3. Performance comparison of different embedding methods (D4).

Method	Recall	Precision	F1
Only Features	0.76	0.79	0.78
Node2Vec	0.74	0.75	0.75
GAT	OOM	OOM	OOM
GraphSage	OOM	OOM	OOM
Proposed Model	0.95	0.96	0.95

Table 4. Comparison results of anomaly detection algorithms.

Method	Recall	Precision	F1
Isolation Forest	0.99	0.90	0.94
Autoencoder	0.99	0.92	0.95
Clustering-Based LOF	0.99	0.91	0.93
Combined Model	0.99	0.91	0.93


© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

