CAA-RF: An Anomaly Detection Algorithm for Computing Power Blockchain Networks

Jia, Shifeng; Zhao, Yating; Zhang, Yang; Jia, Bin; Lian, Wenjuan

doi:10.3390/app15115804


by Shifeng Jia, Yating Zhao *, Yang Zhang, Bin Jia and Wenjuan Lian

College of Computer Science and Engineering, Shandong University of Science and Technology, No. 579 Qianwan Port Road, Huangdao District, Qingdao 266590, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(11), 5804; https://doi.org/10.3390/app15115804

Submission received: 12 April 2025 / Revised: 17 May 2025 / Accepted: 20 May 2025 / Published: 22 May 2025


Abstract

As a distributed communication and storage system, blockchain forms a Computing Power Blockchain Network (CPBN) by integrating computing nodes and network resources. However, its open architecture faces major security threats such as Sybil attacks, computational fraud, and distributed denial-of-service (DDoS) attacks. Traditional detection methods often fail in dynamic environments where domain data are scarce. To address the data scarcity, we developed a lightweight blockchain simulator to generate Sybil and DDoS attack scenarios and constructed a 14-dimensional feature dataset. To detect these attacks, this paper proposes CAA-RF, an anomaly detection method that combines a convolutional neural network with adaptive attention and a random forest. Our approach uses multi-layer convolutional operations to capture high-order data correlations, combines attention mechanisms for global dependency modeling, and employs random forest for robust anomaly detection, enabling effective real-time security protection for blockchain systems.

Keywords:

random forest; convolutional neural network; computing power network; anomaly detection; blockchain network; self-attention mechanism

1. Introduction

Distributed blockchain technology, with its inherent decentralization, transparency, and immutability, has introduced a new paradigm of trust for the Internet. Relying on consensus mechanisms to ensure transaction integrity and data authenticity, blockchain has demonstrated broad application prospects across domains such as finance, the Internet of Things, and energy. However, alongside these advantages, blockchain networks are not immune to vulnerabilities. In particular, due to their heavy dependence on inter-node communication for consensus, the security of the network communication layer has emerged as a critical concern.

Although blockchain was originally designed to resist single points of failure, its open and distributed nature renders it susceptible to attacks such as denial-of-service (DoS) and Sybil attacks. These threats not only disrupt normal transaction processing but may also compromise the overall stability of the network. To mitigate these risks, it is imperative to detect and respond to such attacks promptly and effectively to minimize their impact. Notably, while much attention has been given to the underlying technologies of computing power-based blockchain systems, relatively little research has been devoted to addressing threats at the network layer. Consequently, existing defense mechanisms are often inadequate against evolving threats.

To address this gap, the present study aims to explore the detection of anomalous activities within Computing Power Blockchain Networks using deep learning and machine learning techniques. Specifically, we propose an adaptive anomaly detection algorithm—CAA-RF—that integrates convolutional neural networks, random forests, and attention mechanisms to enhance detection accuracy and efficiency. By simulating representative attack scenarios in a lightweight computing power proof-of-stake blockchain environment, we validate the effectiveness of the proposed method. Compared with a variety of models, including DBNs, GANs, LSTMs, and VAEs, our approach achieves an F1 score of 0.9565 and a testing accuracy of up to 0.95674, demonstrating a significant advantage in F1 performance. This provides both theoretical and technical support for the development of more secure and reliable blockchain applications in the future.

1.1. Related Work

With the rapid development of artificial intelligence, there has been a shift from using traditional machine learning to integrating AI models for anomaly detection in network traffic. In 2021, Zhang Huabing et al. proposed a real-time anomaly detection method for mobile network traffic based on user behavior security monitoring [1]. This method captures user behavior data, analyzes protocols, and identifies applications to extract user behavior features. A Bayesian statistical classifier is built, and the extracted user behavior features are used for detection. Experimental results show that the detection time is as low as 104 s, and the detection accuracy is better than traditional methods, with good detection performance. However, the detection time of this method is hard to improve, as it requires some accumulation of user behavior data. The model’s effectiveness may need to be enhanced when facing more complex and variable user behavior patterns, especially in the case of more covert or novel attack methods. In 2023, Tosin Ige and Christopher Kiekintveld compared the performance of several Bayesian variants, including Multinomial, Bernoulli, and Gaussian, in network intrusion detection [2]. The study showed that the Gaussian Bayesian classifier performed the best due to its assumption that features follow a normal distribution, achieving a test accuracy of 81.69% in anomaly detection. On the other hand, the Multinomial Bayesian classifier performed poorly due to its assumption of discrete distributions, and the Bernoulli variant was not suitable for certain tasks due to its use of feature occurrence frequency information. In 2023, Latha R. Saveetha and S. John Justin Thangaraj proposed a method to detect denial-of-service (DoS) attacks using logistic regression and Naive Bayes models [3]. By learning from the KDDcup dataset, the method distinguished normal network traffic from DDoS attacks. 
The results showed that the logistic regression model outperformed the Naive Bayes model in accuracy, precision, recall, and F1 score. In 2024, Jeyakumar Samantha Tharani, Zhé Hóu, Eugene Yugarajah Andrew Charles, and others proposed a unified feature engineering method to detect malicious entities in blockchain networks [4]. The features were used to train multiple classifier models, making a significant contribution to improving classification accuracy and AUC values. However, these models are highly dependent on the quality and representativeness of the dataset, and their performance may decline in dynamic network environments. They also require continuous updates to adapt to new attack methods. Furthermore, the models may face delays and efficiency issues when handling large-scale real-time network traffic. Traditional machine learning and statistical methods may not fully adapt to highly dynamic and complex network environments, especially when dealing with unknown or minority category attacks, which may limit their detection performance.

Deep learning is gradually replacing traditional machine learning in certain areas, achieving better results. Mikhail Grekov described a system that combines Generative Adversarial Networks (GANs) and machine learning [5]. By mimicking attacker behavior to optimize traffic processing efficiency, it can effectively detect multi-stage attacks while reducing the computational burden of processing each data packet. S. Joshua Kumaresan’s research demonstrated the potential of Recurrent Neural Networks (RNNs) in anomaly detection [6]. After adjusting hyperparameters such as learning rate, batch size, and sequence length, the model achieved good performance in accuracy, recall, F1 score, and AUC. However, when handling unconventional network traffic patterns, further optimization of the model’s flexibility and adaptability is needed. Patrice Kisanga and Isaac Woungang proposed a Graph Neural Network (GNN) model based on Graph Convolutional Networks (GCNs), utilizing a two-layer GCN structure [7]. For the computation of Graph Edit Distance (GED), the model may need to better match differences between clusters and optimize the accuracy of cluster detection. Finally, the model’s performance in handling complex and long-term threats still requires further observation and improvement. Gaodi Xu’s CL-IDS model combined convolutional neural networks (CNNs) and Long Short-Term Memory (LSTM) Networks, performing outstandingly on the KDDCUP99 dataset [8]. In 2023, Xu Wang, Li Han, Shimin Sun, and Guowei Xu proposed a network traffic anomaly detection method capable of tolerating concept drift [9]. By integrating multiple deep learning models to effectively address the issue of concept drift in data streams, the method enhances the ability to detect anomalous traffic. 
Each sub-classifier offers a unique perspective to learn data features, and through this approach, the solution can accurately capture changes in data concepts and apply weight layers to help the model adapt to changes caused by concept drift. Donghun Yang’s approach combined autoencoders and Mahalanobis distance, using unsupervised learning and ensemble methods to improve anomaly detection accuracy [10]. Although deep learning models can capture complex patterns, they often require large amounts of training data and are prone to overfitting, especially in cases of data imbalance. Additionally, deep models tend to have poor interpretability and robustness.

In addition to purely using deep models, innovation on existing models and the integration of multiple models is also a new approach. In 2023, Pritika Mehra proposed methods that combine deep learning, clustering algorithms, time series analysis, and other techniques, aiming to improve the overall performance of detection systems [11]. Zeyi Li proposed an unsupervised deep learning strategy combining Generative Adversarial Networks (GANs) and autoencoders, though it performed poorly in handling minority and unknown attacks [12]. Xiaoran Yang introduced a novel machine learning-based method for detecting and classifying botnets, primarily using deep learning techniques, particularly the ResNet residual network, and converting network traffic into grayscale images for recognition [13]. In 2022, Jinoh Kim and others proposed a new semi-supervised learning method based on autoencoders (AE), achieving significant progress in improving detection efficiency and reducing computational costs [14]. In 2021, Yuwei Sun proposed a multi-type anomaly detection method based on raw network traffic, and the visualization approach can directly identify anomalies from raw data, though it faces risks of data imbalance and overfitting [15]. In 2024, Chuang Li proposed a deep learning-based DDoS attack detection solution, relying on a two-layer GRU gated recurrent unit model to detect DDoS attacks, combining an encoder/decoder architecture to reduce data dimensionality and recover data through the decoding layer [16]. In 2021, Xiaowei Li introduced the entropy rate of change as a new metric, enhancing the sensitivity of entropy-based anomaly detection methods, especially in detecting covert attacks [17]. While innovative and hybrid methods aim to overcome the limitations of single technologies, they may introduce additional challenges due to increased complexity, such as reduced model interpretability and potential resource consumption issues in practical deployment.

In recent years, significant progress has been made in blockchain anomaly detection research worldwide, with related methods continually being enriched and refined. In 2022, Bianka Bosnyaková et al. proposed an unsupervised anomaly detection method that identifies price anomalies in blockchain NFT transactions through clustering and statistical analysis, effectively uncovering potential money laundering and market manipulation behaviors [18]. In terms of enhancing security, in 2021, Zhan Xin et al. combined blockchain technology with traditional network intrusion detection systems (IDS), leveraging the immutability of blockchain data to enhance the security and traceability of detection systems. This integration of blockchain features with detection technologies provides a reference for this paper in optimizing anomaly detection methods by incorporating underlying blockchain mechanisms [19].

Additionally, regarding the intelligence of detection models, in 2023, Yuansheng Dong et al. proposed a deep learning model based on autoencoders and an improved AlexNet, achieving efficient real-time detection of network intrusions. Although their application scenario was traditional network security, the deep learning architecture offers insights for improving the accuracy and real-time performance of blockchain anomaly detection [20]. In the field of ensemble learning, in 2021, Sidharth V. et al. proposed a network intrusion detection system combined with a parallel stacking ensemble method, which obtained the highest accuracy of 80.10% on the NSL-KDD dataset, effectively improving the detection performance [21]. In 2022, Noha E. El-Attar et al. systematically compared the effectiveness of various machine learning methods, such as SVM, random forest, and XGBoost, in blockchain anomaly detection, pointing out that ensemble learning algorithms perform excellently in large-scale data analysis [22]. These achievements provide a theoretical foundation for model selection and ensemble optimization in this paper. In 2023, Zhi Chen et al. proposed a supervised anomaly detection method based on conditional Generative Adversarial Networks and ensemble active learning [23]. By employing a parallel architecture of multiple discriminators and an active sampling strategy, detection performance was effectively improved. This method offers valuable insights for addressing data challenges in blockchain anomaly detection in this paper.

Moreover, LSTM, Variational Autoencoders, deep belief networks, Generative Adversarial Networks, and Recurrent Neural Networks have also been applied to anomaly detection in recent years. In 2021, Sergei K. Alabugin et al. proposed an industrial process anomaly detection method based on Recurrent Neural Networks (RNNs) [24]. By predicting the states of industrial processes and comparing them with actual values, this method achieved efficient intrusion and anomaly detection without requiring training on anomalous samples. In 2023, Haoyuan Tang et al. proposed a transformer oil temperature anomaly detection method based on Long Short-Term Memory (LSTM) Networks and attention mechanisms [25]. This method effectively learns long-term dependencies in time series and improves prediction accuracy through the attention mechanism, enabling efficient detection of oil temperature anomalies. In 2022, Jia Li et al. proposed an e-commerce data anomaly detection method based on Variational Autoencoders (VAEs) [26]. By learning the latent distribution of data, this method effectively extracts features and identifies anomalous data, outperforming traditional detection methods in terms of accuracy and robustness. In 2021, Jiali Liu et al. proposed a power system state estimation and anomaly detection method based on deep belief networks (DBNs) [27]. Leveraging the powerful feature extraction ability of DBNs, this method achieved high-precision fault detection and recovery time prediction in power systems. In 2022, Rohit Raturi et al. proposed a time series anomaly detection method based on Generative Adversarial Networks (GANs) [28]. By generating synthetic data and capturing complex patterns with GANs, this method significantly improved the accuracy and robustness of time series anomaly detection.

Although these Variational Autoencoders and Generative Adversarial Networks were originally mainly used for generative tasks such as image generation and data augmentation, many studies have shown that, with appropriate adjustments and design, they can also be applied to classification tasks, especially in anomaly detection and classification problems in specific domains.

Variational Autoencoders (VAEs) can learn the latent representations of data, mapping complex high-dimensional features to a lower-dimensional space, which can then be used for classification tasks. This approach is suitable for scenarios that require feature extraction and dimensionality reduction. The discriminator in Generative Adversarial Networks (GANs) can be regarded as a powerful classifier, and its effectiveness has been demonstrated in many tasks. With appropriate adjustments, the discriminator of a GAN can be used for classification with labeled data. Below are relevant literature references and our explanations.

Studies on VAEs for classification tasks: In 2023, Jia Li et al. proposed a VAE-based anomaly detection method for e-commerce data [26]. This method extracts features by learning the latent distribution of the data and, combined with a lightweight classification network, classifies anomalous data, demonstrating the effectiveness of VAEs in classification tasks. In 2023, Ce Wang et al. proposed a VAE-based anomaly detection method for supercomputing center indicators in power grids [29]. By preprocessing and extracting features from indicator data using VAEs and combining with a classifier, they achieved efficient anomaly detection, ultimately obtaining a high F1 score of 0.9091. In 2023, Wanzhen Zhang et al. proposed a multi-channel Variational Autoencoder (MCVAE) for anomaly detection and classification tasks on multimodal data, further validating the applicability of VAEs in classification problems [30].

Studies on GANs for classification tasks: In 2023, Hong Nhung Nguyen et al. proposed a GAN-based network intrusion detection method for SCADA systems [31]. This method uses the GAN discriminator to classify normal and abnormal traffic, demonstrating its potential in classification tasks. In 2023, Yu Zhang et al. proposed a deep convolutional GAN (DCGAN)-based method for bearing anomaly detection and fault diagnosis [32]. This study extracted features from bearing data using GANs and combined them with the discriminator to achieve anomaly detection and classification tasks, proving the effectiveness of GANs in industrial fault diagnosis. In 2023, Rohit Raturi et al. proposed a GAN-based anomaly detection method for time series data [28]. By generating synthetic data with the generator and classifying normal and abnormal data with the discriminator, their method achieved excellent performance in F1 score, AUC, and other metrics.

1.2. Contributions

In this study, we make the following substantial contributions to anomaly detection in computational blockchain networks:

To address the issue of dataset scarcity, we designed and implemented a distributed, lightweight proof-of-stake blockchain model tailored for computational networks. This model not only simulates the operation, communication, and consensus processes of blockchain in a computational network environment but also enables the collection of key performance indicators and instrumental data during the normal operation of the blockchain system through the simulation process.
In the research field of computational blockchain technology, the absence of publicly available datasets remains a fundamental issue. To tackle this, we designed and carried out a series of security attack experiments on computational blockchain networks, including Sybil attacks and denial-of-service (DoS) attacks. These experiments were designed to simulate various real-world attack scenarios, thereby compensating for the lack of data in existing studies. Through these experiments, we successfully captured and recorded a large volume of anomalous behavioral data from blockchain systems under attack. The collected data span multiple dimensions, including abnormal error rates and packet sizes, and provide detailed information on the system’s data reception during attack periods.
We propose an adaptive anomaly detection method that integrates attention mechanisms, random forests, and convolutional neural networks. This method is trained on the dataset collected from the aforementioned distributed computational blockchain model. Furthermore, we designed a series of comparative experiments to evaluate the performance of our model against traditional machine learning methods and six deep learning approaches. The results show that our proposed method achieves more balanced performance in terms of both accuracy and F1-score.

1.3. Organization

The remainder of this paper is organized as follows. Section 2 introduces the distributed computing power network blockchain model, detailing the structure and design of the distributed blockchain system. Section 3 presents the anomaly detection architecture based on the computing power network blockchain, including the design of the blockchain network anomaly detection model. Section 4 provides comparative experimental analysis, offering performance evaluation and detailed comparison. Section 5 concludes the paper.

2. Distributed Computing Power Blockchain Network Model

As shown in Figure 1, the architecture of the computing power blockchain network adopts a multi-layered collaborative design, with efficient resource scheduling and intelligent circulation of computing power as its core objectives. Through hierarchical decoupling and vertical integration, a comprehensive service system is constructed.

The system is defined from top to bottom as a three-dimensional framework comprising business-driven, global control, hardware support, trusted verification, and network interconnection. The business layer focuses on value circulation, the network layer reinforces the transmission foundation, and blockchain technology serves as the intermediary to enable trustworthy resource scheduling. Together, they establish an intelligent computing ecosystem characterized by vision-aware perception, dynamic orchestration, and flexible deployment.

As the strategic gateway directly addressing industrial needs, the business layer establishes an evaluation system centered on multi-dimensional metrology. Through technologies such as dynamic power calibration, computing power encapsulation, and computing power perception, the computing power trading market transforms computing resources into divisible and composable digital commodities. Unlike the coarse-grained approach of traditional resource pools, this layer enables a paradigm shift from power metering to service value orientation.

The control layer utilizes a digital twin + intent network to construct the decision-making center. As the peripheral neural network of the system, it perceives data related to blockchain computing power network connections, transactions, anomalies, etc., including timestamps, duration, message length, connection frequency, error rates, and so on, and writes these data to local files in real time. By decomposing computing power from heterogeneous resources using specialized tools and integrating scheduling optimization algorithms with autonomous computing engines, the system can dynamically generate computational power routing paths. Notably, the integration of human/machine collaborative intent recognition allows non-technical users to issue natural-language commands for professional computing power scheduling.
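The real-time logging performed by the control layer can be pictured with a small sketch. The paper does not specify the on-disk format, so the JSON Lines layout, the units, and the names `ConnectionRecord`, `append_record`, and `read_records` below are all assumptions made for illustration; only the five field names come from the description above.

```python
import json
from dataclasses import dataclass, asdict
from pathlib import Path


@dataclass
class ConnectionRecord:
    """One observation of a node connection (field set taken from the text;
    units are assumed: seconds for time values, bytes for message length)."""
    timestamp: float
    duration: float
    message_length: int
    connection_frequency: float
    error_rate: float


def append_record(path: Path, record: ConnectionRecord) -> None:
    """Append one record as a JSON line, so the file can be tailed while it grows."""
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(asdict(record)) + "\n")


def read_records(path: Path) -> list[ConnectionRecord]:
    """Load every record back, skipping blank lines."""
    with path.open(encoding="utf-8") as fh:
        return [ConnectionRecord(**json.loads(line)) for line in fh if line.strip()]
```

An append-only, line-per-record file is one simple choice here because a detector can read new lines as they arrive without parsing the whole file again.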

The infrastructure layer breaks away from traditional centralized deployment by leveraging optoelectronic collaborative technologies to build a distributed computing base. From bandwidth reservation in lossless networks to application-aware resource perception, and from privacy-preserving edge computing at peripheral nodes to computing power loading on terminal devices, a three-tier cloud/edge/terminal flexible architecture is formed. Especially through high-speed optoelectronic interconnection, the system enables zero-latency computing power reconfiguration across domains.

Serving as the architectural core, the CPBN blockchain platform innovatively integrates the proof-of-stake (PoS) consensus mechanism with a distributed verification framework [33]. This mechanism dynamically allocates block validation weight according to the number of CPN tokens held by each node. Token holders gain governance authority over computing power scheduling through staking, thereby maintaining decentralization while using economic incentives to suppress Sybil attacks. The consensus logic is given in Algorithm 1. In our proposed consensus algorithm, each blockchain node independently generates a random number (R) during the current block generation round. Within a specified time window, each node (Node) transmits its generated random number to the other nodes (toNode). Once enough nodes have successfully completed the message transmission, the system proceeds to the next consensus phase. In this phase, each node calculates an average value based on the collected random numbers. To further enhance the unpredictability and security of the selection process, the average value then undergoes an additional randomized diffusion process (E), yielding a new parameter, e. Finally, e serves as the node selection parameter, and through a decision function (Select), the block validator for the current round is determined from all consensus nodes. This process ensures fairness and randomness in validator selection while effectively improving the robustness and security of the consensus process. On this basis, the platform builds a collaborative verification network composed of computing nodes, routing nodes, and crossover nodes, achieving a three-tier trusted scheduling of computational resources. Its original grid-based verification mechanism ensures that cross-operator scheduling maintains privacy isolation while achieving global consensus.

Algorithm 1. Validator selected by a random number

Input: Node T

Output: Validator

for node = A, B, C, …, T do
    for toNode = A, B, C, …, T do
        R ← send(node, toNode, R)
    end for
end for

for node = A, B, C, …, T do
    average ← AVG(R)
    e ← E(average)
    Validator ← Select(e)
end for
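The round described by Algorithm 1 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the paper does not define the diffusion process E or the decision function Select, so the SHA-256 hash used for `diffuse` and the stake-weighted lookup in `select_validator` are assumptions, as are the function names and the in-memory "broadcast".

```python
import hashlib
import random


def diffuse(average: float) -> int:
    """Assumed stand-in for E: hash the average into a large integer."""
    digest = hashlib.sha256(repr(average).encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def select_validator(e: int, stakes: dict[str, int]) -> str:
    """Assumed stand-in for Select: map e onto the cumulative stake distribution."""
    total = sum(stakes.values())
    ticket = e % total
    running = 0
    for node in sorted(stakes):  # fixed order so every node reaches the same answer
        running += stakes[node]
        if ticket < running:
            return node
    raise RuntimeError("unreachable when total stake is positive")


def run_round(stakes: dict[str, int], rng: random.Random) -> str:
    # Each node draws its own random number R.
    draws = {node: rng.random() for node in stakes}
    # Broadcast phase: every node collects the draws of all nodes.
    inbox = {node: [draws[sender] for sender in sorted(stakes)] for node in stakes}
    # Each node averages what it collected, diffuses the average, and selects.
    choices = {
        node: select_validator(diffuse(sum(values) / len(values)), stakes)
        for node, values in inbox.items()
    }
    # All honest nodes saw the same inputs, so they must agree on one validator.
    assert len(set(choices.values())) == 1
    return next(iter(choices.values()))
```

Calling `run_round({"A": 10, "B": 30, "C": 60}, random.Random(0))` returns one of the three node names. Under the assumed Select, a node's chance of being chosen grows with its stake, which matches the stake-weighted validation described in the text.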

As the foundational layer, the network layer innovatively integrates Time-Sensitive Networking (TSN) and Network Function Virtualization (NFV) to establish deterministic latency-guaranteed communication channels. By decoupling the control plane and deploying via cloud-native methods, the intelligent routing system can dynamically generate edge network elements in response to business demands. The combination of deterministic networking and distributed architecture significantly reduces the transmission latency across the core-cloud, communication-edge, and terminal-computing chain.

The most significant breakthrough of this system lies in breaking up the traditional segmented resource-network-business architecture: value units defined at the business layer are directly mapped into scheduling strategies within the control layer.

3. Network Anomaly Detection Strategy for Computing Power Blockchain Networks

The anomaly detection architecture centers on CPBN anomaly detection. By constructing the CPBN anomaly detection system and designing the CAA-RF anomaly detection algorithm, a six-dimensional integrated security system is established, as shown in Figure 2. Driven by both information security and network security, the system concretizes three core mechanisms (governance, response, and recovery) through a dedicated security capability middleware. A federated dynamic identity authentication mechanism constructs a zero-trust access system, while fully homomorphic encryption provides mathematically rigorous protection for the secure data exchange service center. A specially designed quantum-resilient transitional system serves as a futuristic shield, reserving upgrade potential for defense against post-quantum era threats. At the top level, a risk correlation hub enables visualization of threat landscapes, forming standardized, externally exportable security assurance capability packages.

The foundational security capability layer adopts a cellular defense layout, deploying differentiated technological arsenals across six major domains, including terminals, applications, and systems. On the terminal side, immunity barriers are established through node protection and automated penetration testing. At the network boundary, firewalls and traffic scrubbing mechanisms form dynamic defensive perimeters. Notably, the data security protection framework exhibits a three-dimensional depth: vulnerability scanning systems act as all-weather sentinels, transparent data desensitization technologies ensure data usability without visibility, and log auditing systems construct a comprehensive operational gene library. This layer extends downward into a data management hub, enforcing full lifecycle control over six categories of data flows, including metadata and device data, thus creating a multi-dimensional protection net that spans from physical resource pools to third-party components.

As the tactical execution center, the CPBN anomaly detection system leverages blockchain technology to construct a real-time cyber defense sandbox. The intranet monitoring module anticipates lateral movement risks through traffic fingerprint analysis, while the extranet monitoring unit employs machine learning to model DDoS attack patterns. Upon detection of a Sybil attack, the system rapidly identifies anomalous nodes via consensus algorithms and synchronously activates a collaborative mitigation channel for on-chain isolation. The specially designed threat hunting system transforms security incidents into traceable blockchain data and interacts with the risk correlation hub via smart contracts, thereby realizing a closed-loop defense mechanism from detection to recovery.

This architecture achieves a spiral-evolving security ecosystem through a three-tier synergy of strategic planning, capability support, and execution feedback. Blue data streams between modules function as a neural network transmitting threat intelligence, while white functional blocks clearly delineate responsibilities. Whether handling a DDoS attack through a comprehensive chain from traffic scrubbing to source tracing or responding to a Sybil attack via coordinated identity authentication and node protection, the system embodies the design philosophy of defense-in-depth and intelligent linkage. It provides a predictive and adaptive security foundation for blockchain-based networks.

Unlike most existing studies that adopt parallel ensemble strategies or single deep learning models, this paper proposes a novel framework that serially integrates traditional machine learning methods with deep learning models. By combining the structural advantages of machine learning models with the feature extraction capabilities of deep learning models, the overall generalization performance and prediction accuracy of the model are improved.

In 2022, Bianka Bosnyaková et al. proposed an unsupervised anomaly detection method that identifies abnormal behaviors in WAX blockchain NFT transactions through clustering and statistical analysis [18]. This method mainly relies on unsupervised clustering algorithms for anomaly detection and belongs to the category of single machine learning models. In 2023, Zhan Xin et al. proposed a system that combines blockchain and traditional network intrusion detection technologies, leveraging distributed and immutable characteristics to enhance detection security [19]. This method is mainly based on traditional machine learning and rule-based approaches and does not involve serial or parallel integration with deep learning. In 2019, Yuansheng Dong et al. proposed a serial deep learning model based on autoencoders and AlexNet for network intrusion detection [20]. This method automatically extracts features and performs classification through a multi-layer serial structure of deep neural networks, representing a typical deep learning serial model. In 2021, Sidharth V. et al. proposed a network intrusion detection system combined with a parallel stacking ensemble method, which obtained the highest accuracy of 80.10% on the NSL-KDD dataset, effectively improving the detection performance [21]. In 2024, Noha E. El-Attar et al. conducted a parallel comparison of machine learning algorithms such as SVM, random forest, and isolation forest, evaluating their performance in blockchain anomaly detection [22]. This study focuses on the parallel comparison and integration of different machine learning models. In 2023, Zhi Chen et al. proposed a supervised anomaly detection method based on conditional Generative Adversarial Networks and ensemble active learning, which improves detection performance through a parallel architecture of multiple discriminators and an active sampling strategy [23]. This method belongs to the category of parallel deep learning models, emphasizing multi-model collaboration and active learning.

The core of the CPBN anomaly detection system is the CAA-RF model, whose overall structure is divided into three parts. During preprocessing, we first normalized the data to eliminate the influence of different feature scales and bring all features to a comparable level, thereby improving the effectiveness of model training. Because the collected data often exhibit class imbalance in the labels, we then applied the Adaptive Synthetic Sampling (ADASYN) method. ADASYN generates synthetic samples for the minority class and dynamically adjusts the number of generated samples to balance the data distribution, which enhances the model’s ability to learn from the minority class and mitigates the negative impact of data imbalance on model performance. To accelerate convergence and improve training efficiency, we chose the Adam optimizer, a widely used adaptive learning rate optimization algorithm. For the loss function, we used the cross-entropy loss (CrossEntropyLoss) as the objective function, because it is suitable for multi-class classification problems and operates directly on the class labels and the network’s unnormalized output scores (logits). The workflow of Algorithm 2 is as follows:

Algorithm 2. Random forest classification algorithm based on convolutional features

Input: Network traffic data D = {X₁, X₂, X₃, X₄, …, X₁₄, Y}

Output: Random forest classification accuracy A₂

Standardize the features of D
Apply ADASYN oversampling to D
for i = 1 to n do
    O ← CAA(X_i)
    Loss ← LF(O, Y)
    Backward
    Update W
    A₁ ← Predict(O)
    F ← CAA(X_i)
    RF.Fit(F, Y)
    A₂ ← RF(F)
end for

Where D denotes the input dataset, which will be specifically introduced in the experimental section; LF represents the loss function; CAA refers to the Convolutional Adaptive Attention neural network model; O indicates the prediction results of the CAA; X represents the data features; Y denotes the dataset labels; F represents the features extracted at the output of the CAA network. RF stands for random forest. W denotes the weight parameters of the CAA, and Fit represents the process of training the random forest model. A₁ represents the accuracy of the CAA model, and A₂ represents the accuracy of the CAA-RF model.
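The ADASYN step of Algorithm 2 adapts the number of synthetic samples to how hard each minority sample is to learn. The following is a minimal sketch of that allocation rule only, written from the published ADASYN procedure rather than from the authors’ code. It treats every class other than the chosen one as "majority" (a binary simplification), and the function name `adasyn_allocation`, the toy coordinates, and the settings `k=3` and `beta=1.0` are assumptions made for illustration.

```python
import numpy as np

def adasyn_allocation(X, y, minority_label, k=3, beta=1.0):
    """Return how many synthetic samples to create around each minority sample."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    minority_idx = np.flatnonzero(y == minority_label)
    n_min = minority_idx.size
    n_maj = y.size - n_min
    # Total synthetic budget; beta = 1.0 aims for a fully balanced dataset.
    budget = (n_maj - n_min) * beta

    ratios = np.empty(n_min)
    for j, i in enumerate(minority_idx):
        dist = np.linalg.norm(X - X[i], axis=1)
        dist[i] = np.inf                      # exclude the sample itself
        neighbours = np.argsort(dist)[:k]     # k nearest neighbours in the full set
        # Share of neighbours from other classes (local learning difficulty).
        ratios[j] = np.mean(y[neighbours] != minority_label)

    if ratios.sum() == 0:                     # minority samples have no foreign neighbours
        return np.zeros(n_min, dtype=int)
    weights = ratios / ratios.sum()           # normalise to a distribution
    return np.rint(weights * budget).astype(int)

# Toy data (hypothetical): a 3x3 grid of majority points, one minority point
# inside the grid, and a tight minority cluster far away from it.
majority = [(a, b) for a in range(3) for b in range(3)]
minority = [(0.5, 0.5), (5, 5), (5, 6), (6, 5), (6, 6)]
X = np.array(majority + minority, dtype=float)
y = np.array([0] * len(majority) + [1] * len(minority))
allocation = adasyn_allocation(X, y, minority_label=1)
```

In this toy case the whole budget goes to the minority point that sits among majority samples, while the well-separated cluster receives none, which is the "dynamic adjustment" the text refers to. A full implementation would then interpolate new points between each minority sample and its minority-class neighbours.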

In the first part of the model, as shown in Figure 3, we used convolutional operations [34] to extract features, setting the size of each convolutional kernel to 3, denoted as M. To prevent information loss due to repeated convolutions, we applied a padding of 1 at the edges in each convolution. The filters in the convolutional layer slide over the input data with a stride of 1, computing a dot product between each local region and the filter to perceive local features in the data and thus detect specific patterns in the input. The number of convolutional kernels affects which features are captured. In the first convolution, 32 convolutional kernels were configured to enhance the model’s expressive ability. After the convolution operation, the data pass through a ReLU activation function, denoted as σ. ReLU sets all negative values to zero while passing positive values through unchanged. This nonlinear transformation helps the model learn more complex patterns, increases network sparsity, and mitigates the vanishing gradient problem, making the training process more efficient. Dropout then randomly deactivates some neurons and their connections, so that the model avoids over-reliance on certain nodes, which enhances robustness and generalization and reduces the risk of overfitting. Next, the filters perform feature extraction again; the second and third convolutional operations both use 64 convolutional kernels, further abstracting the data into the (K + 1)th layer based on the previously extracted Kth-layer features. The ReLU activation function is applied again to ensure that the model can capture higher-level feature representations while maintaining its nonlinearity. Finally, another convolution operation is performed on the obtained features. This series of operations allows the model to extract multi-level information from the raw input, which is crucial for understanding complex datasets. The formulas are as follows:

\begin{matrix} H_{i, c, t}^{(l)} = \sigma \left( \sum_{c' = 1}^{C_{l - 1}} \sum_{k = 1}^{K} W_{c, c', k}^{(l)} \cdot H_{i, c', t + k - [K / 2]}^{(l - 1)} + b_{c}^{(l)} \right) \end{matrix}

(1)

\begin{matrix} H_{i, c, t}^{(l)} = \sum_{c' = 1}^{C_{l - 1}} \sum_{k = 1}^{K} W_{c, c', k}^{(l)} \cdot H_{i, c', t + k - [K / 2]}^{(l - 1)} + b_{c}^{(l)} \end{matrix}

(2)

where H^{(l)} denotes the output of the l-th convolutional layer, C_{l} represents the number of output channels in the l-th convolutional layer, K refers to the kernel size, W^{(l)} denotes the weights of the convolutional kernels in the l-th layer, b^{(l)} indicates the bias term, σ(·) denotes the activation function, i is the sample index, c is the output channel index, c' is the input channel index, and t indicates the position within the feature sequence.
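To make the convolution step concrete, the following NumPy sketch applies one 1-D convolutional layer to a single sample with the settings stated in the text (kernel size 3, padding 1, stride 1, optional ReLU). It is an illustration under assumptions, not the authors’ implementation: the function name `conv1d_same`, the treatment of the 14 features as a one-channel sequence, and the toy weights are hypothetical. Like most deep learning frameworks, it slides the kernel without flipping it.

```python
import numpy as np

def conv1d_same(x, weight, bias, activation=True):
    """x: (C_in, T); weight: (C_out, C_in, K); bias: (C_out,). Stride 1, zero padding."""
    c_out, c_in, k = weight.shape
    pad = k // 2                                  # padding of 1 for K = 3
    x_pad = np.pad(x, ((0, 0), (pad, pad)))       # zero-pad the sequence axis only
    t_len = x.shape[1]
    out = np.empty((c_out, t_len))
    for t in range(t_len):
        window = x_pad[:, t:t + k]                # inputs t-1, t, t+1 when K = 3
        # Sum over input channels and kernel positions, then add the bias.
        out[:, t] = np.tensordot(weight, window, axes=([1, 2], [0, 1])) + bias
    return np.maximum(out, 0.0) if activation else out

# Toy check with one input channel and one kernel that computes x[t-1] - x[t+1].
x = np.array([[1.0, 2.0, 3.0, 4.0]])
w = np.array([[[1.0, 0.0, -1.0]]])
b = np.array([0.0])
pre_activation = conv1d_same(x, w, b, activation=False)   # linear output, no sigma
post_activation = conv1d_same(x, w, b)                    # ReLU zeroes the negatives

# Shape check for a first layer with 32 kernels over 14 features (assumed layout).
first_layer = conv1d_same(np.ones((1, 14)), np.ones((32, 1, 3)), np.zeros(32))
```

Because padding 1 is paired with kernel size 3 and stride 1, the output keeps the input length, so several such layers can be stacked without shrinking the feature sequence.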

In the second part of the model, as shown in Figure 4, the convolutional layers extract meaningful local features from the raw data, which are then passed to the fully connected layer. There, the features are compressed into a vector, which facilitates subsequent processing. The feature vector generated by the fully connected layer is fed into the attention mechanism. The core idea of the attention mechanism is to enable the model to focus on the most important parts of the input when processing sequence data or other types of data, rather than treating all input elements equally [35]. In this case, the attention mechanism works through three learned projections of the input: the query (XW^Q), the key (XW^K), and the value (XW^V), where W^Q, W^K, and W^V are the Query, Key, and Value Matrices. The query represents the question or focus that the model cares about at the current moment; the key contains the key information for all potential answers; and the value holds the actual content associated with each key. By calculating the similarity scores between the queries and the keys, the attention mechanism determines which values should be assigned higher weights, and therefore which parts of the input features are most relevant to the final decision. It then computes a sum of the values weighted by these attention weights. In effect, this process dynamically filters and reassembles the input features, allowing the model to focus on the truly important pieces of information when tackling complex tasks. The feature vector produced by the attention mechanism forms a new representation that integrates important information from different parts of the input, and it is used as the input for downstream tasks. The formula is as follows:

\begin{matrix} O = \mathrm{softmax} \left( \frac{(X W^{Q}) {(X W^{K})}^{T}}{\sqrt{d_{k}}} \right) (X W^{V}) \end{matrix}

(3)

where O denotes the attention output, X denotes the input features, W^Q, W^K, and W^V denote the weight matrices, and d_k denotes the dimension of the key [36].
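The attention computation can be sketched in a few lines of NumPy. This is the standard scaled dot-product form, in which the softmax is applied to the scaled query-key scores and the resulting weights multiply the values; the function name `self_attention`, the random weights, and the toy sizes (5 input rows, 8 features, d_k = 4) are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def self_attention(X, W_q, W_k, W_v):
    """Scaled dot-product self-attention over the rows of X."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # similarity of every query to every key
    scores -= scores.max(axis=-1, keepdims=True)     # stabilise the softmax numerically
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # each row now sums to 1
    return weights @ V, weights                      # weighted sum of the values

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))                          # toy input features
W_q, W_k, W_v = (rng.normal(size=(8, 4)) for _ in range(3))
O, A = self_attention(X, W_q, W_k, W_v)
```

Each row of `A` is a probability distribution over the input rows, so `O` is a convex combination of the value vectors; dividing by the square root of d_k keeps the scores in a range where the softmax does not saturate.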

To obtain the features extracted by the convolution, we first train the model for several epochs. After training is completed, we replace the final classification layer with an identity layer, so that the network outputs the extracted features instead of class scores. As shown in Figure 5, these features are then used as the input of a random forest model. A random forest integrates multiple decision trees, each of which learns independently on a different subset of samples and features. Although a single tree may overfit the training set, voting over or averaging the predictions of multiple trees effectively offsets the randomness and noise of individual trees, resulting in a stronger overall generalization ability and significantly alleviating overfitting. We trained and evaluated the random forest model on these features without pruning the tree branches, in order to better highlight the optimization effect of the random forest.
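The hand-off from the network to the random forest can be sketched as follows, assuming NumPy and scikit-learn are available. Everything here is a stand-in: the data are synthetic, the "network" is a single ReLU layer with random weights in place of the trained CAA model, and names such as `network`, `W_hidden`, and `W_head` are hypothetical. The point is only the mechanism: drop the classification head, keep the features, and fit unpruned trees on them.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for 14-feature traffic data with three well-separated classes.
centers = rng.normal(scale=4.0, size=(3, 14))
y = np.repeat(np.arange(3), 100)
X = centers[y] + rng.normal(size=(300, 14))

# Hypothetical network: a hidden layer plus a classification head.
# Random weights stand in for trained ones.
W_hidden = rng.normal(size=(14, 32))
W_head = rng.normal(size=(32, 3))

def network(X, head=None):
    hidden = np.maximum(X @ W_hidden, 0.0)     # ReLU features
    # head=None plays the role of the identity layer: features pass through unchanged.
    return hidden if head is None else hidden @ head

features = network(X)                          # extracted features, shape (300, 32)

# Simple even/odd split into training and test parts.
train, test = np.arange(0, 300, 2), np.arange(1, 300, 2)
# max_depth=None (the default) grows each tree fully, i.e. without pruning.
forest = RandomForestClassifier(n_estimators=50, random_state=0)
forest.fit(features[train], y[train])
test_accuracy = forest.score(features[test], y[test])
```

In a PyTorch model the same effect is usually obtained by assigning an identity module to the final layer and running the network in evaluation mode before collecting the features.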

4. Experimental Results and Analysis

4.1. Introduction to Datasets

For the Computing Power Blockchain Network application scenario, we collected a series of attack-affected data from the traffic of the Computing Power Blockchain Network. The goal is to suppress malicious behavior and promptly detect issues that may affect network performance. As shown in Table 1, the CPBNT dataset consists of 14 features and one label, which represent the following: time series, connection duration, data byte size, number of connections to the target host in the last 100 connections, ratio of connections to the target host to the total in the last 100 connections, number of connection errors for the target host, ratio of target host connection errors to the total of 100 recent errors and 100 recent correct connections, ratio of connection errors to the target host in the last 2 s, ratio of correct connections to the target host in the last 2 s, ratio of connection errors to the target host in the last 2 s, ratio of connection errors for the target host to the total in the last 2 s, ratio of connection errors for the target host in the last 2 s to its own connections, operation code, and data label. As shown in Table 2, the data label has three categories: 0, 1, and 2, which represent normal traffic, DDoS attacks, and Sybil attacks, respectively. In a Sybil attack, attackers create many fake nodes to control most of the decision-making power in the network, thereby undermining the decentralization of the blockchain. This type of attack is particularly prevalent in blockchain networks, especially during the consensus and node verification processes. In a DDoS attack, attackers send a large number of fake requests to the network from distributed sources, causing node resources to be exhausted or services to be interrupted. This type of attack poses a serious threat to the availability and stability of blockchain networks.

The basic characteristics of the dataset are as follows.

4.2. Experimental Setting and Assessment Criteria

All experiments were conducted on a personal computer manufactured by Compal Electronics in Nanjing, China, equipped with a 12th-generation Intel i7-12650H processor, 64 GB of RAM, and an NVIDIA RTX 4060 GPU, running Python 3.9. Various attacks, including DDoS attacks and Sybil attacks, were simulated within the network environment, and data were collected under both normal and abnormal network conditions. Accuracy evaluations were carried out using several models, including autoencoder, DBN, GANs, LSTM, RNN, VAE, CAA, and CAA-RF. The deep learning neural networks were trained on this dataset, and after every training epoch the accuracy was assessed on the test set and recorded. Accuracy reflects the model’s overall predictive ability on all samples and serves as a fundamental metric for assessing the global performance of the model. Through accuracy, we can intuitively understand the proportion of correct classifications on the entire dataset, which facilitates comparative analysis with other methods. However, in anomaly detection tasks with extremely imbalanced data, accuracy can be dominated by the majority class of normal samples. Therefore, we use it only as a reference for overall performance. The F1 score, as the harmonic mean of precision and recall, comprehensively measures the model’s ability to identify anomalous samples. In anomaly detection scenarios, relying solely on accuracy may mask the omission of anomalous samples, while the F1 score balances the two types of errors: missing anomalies and misclassifying normal samples as anomalies. Thus, the F1 score can more accurately reflect the model’s actual performance in detecting minority anomalies, which is especially suitable for the class imbalance problem faced in this study.
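A small worked example shows why accuracy alone can mislead on imbalanced labels. The class counts below are invented for illustration, and macro-averaging the per-class F1 scores is an assumption, since the text does not state which averaging is used. The detector in the example never reports an attack, yet still reaches 90% accuracy.

```python
def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_per_class(y_true, y_pred, label):
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0                          # no correct detections for this class
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def macro_f1(y_true, y_pred, labels):
    # Unweighted mean, so every class counts equally regardless of its size.
    return sum(f1_per_class(y_true, y_pred, c) for c in labels) / len(labels)

# 90 normal (0), 5 DDoS (1), 5 Sybil (2); the detector always answers "normal".
y_true = [0] * 90 + [1] * 5 + [2] * 5
y_pred = [0] * 100
acc = accuracy(y_true, y_pred)              # 0.90
f1 = macro_f1(y_true, y_pred, [0, 1, 2])    # about 0.316
```

The normal class scores an F1 of about 0.947, but both attack classes score 0, so the macro F1 drops to about 0.316 and exposes the missed anomalies that the accuracy figure hides.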

4.3. Performance Evaluation and Comparative Analysis

Considering the computational cost of the model, we have evaluated our model from multiple perspectives, including CPU utilization, memory consumption, average training time per epoch, space complexity, time complexity, and FLOPS (Floating Point Operations Per Second). Overall, the CAA-RF model achieves a good balance between computational resource consumption and performance. Considering that some experimental models cannot be trained on GPUs, all models in this study are trained on CPUs to ensure fair and convenient comparison. We specifically compared the computational resource usage of different models. Our CAA-RF model demonstrates moderate CPU utilization, memory consumption, and FLOPS, with only the average training time per epoch being relatively high, mainly due to the higher model complexity. The specific results are shown in Table 3 below.

In Figure 6, the eight bar charts represent anomaly detection on vehicular blockchain traffic data. The first is the VAE model, which has the worst performance among all models. Only two models outperform our proposed model, which incorporates random forest, and the proposed model also achieves a higher accuracy than random forest alone.

In this experiment, we compared the F1 scores, training accuracy, and testing accuracy of various deep learning models, as shown in Figure 7, Figure 8 and Figure 9. In each of these figures, subfigure (a) shows the change in model performance with respect to training epochs, while subfigure (b) shows the change in model performance over time. The differences between these plots depend on the training duration per epoch for each model, which helps demonstrate the models’ operational efficiency and computational overhead. Within the 500-s time window, the number of epochs completed by each model is as follows: VAE: 910; RNN: 981; autoencoder: 146; LSTM: 702; GANs: 98; DBN: 589; CAA: 25; and CAA-RF: 25. As shown in Figure 7b, the first data point for some models does not appear at 0 s. This is because the model undergoes initialization and requires a relatively long training time during the first 50 s after training begins, so the initial data point of the curve appears at the 50-s mark instead of at 0 s. Figure 7 shows the F1 score versus training rounds and time (s). The horizontal axis represents the number of training iterations or the training time (s), while the vertical axis indicates the F1 score. The results reveal that Generative Adversarial Networks (GANs), Long Short-Term Memory (LSTM) Networks, and CAA-RF consistently demonstrated excellent F1 scores throughout the training process, particularly after a certain number of iterations, where their advantages became more apparent. In contrast, models such as Variational Autoencoders (VAEs) and Recurrent Neural Networks (RNNs) performed less effectively in the early stages but showed significant improvement as training progressed.
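The epoch counts for the 500-s window translate directly into an average time per epoch, which is what separates the per-epoch curves from the per-time curves. The short calculation below uses only the counts reported above; the resulting values are averages over the window and ignore any one-off initialization time.

```python
WINDOW_SECONDS = 500
epochs_in_window = {
    "VAE": 910, "RNN": 981, "autoencoder": 146, "LSTM": 702,
    "GANs": 98, "DBN": 589, "CAA": 25, "CAA-RF": 25,
}

# Average wall-clock seconds spent on one epoch within the window.
seconds_per_epoch = {name: WINDOW_SECONDS / n for name, n in epochs_in_window.items()}

for name, sec in sorted(seconds_per_epoch.items(), key=lambda kv: kv[1]):
    print(f"{name:12s} {sec:6.2f} s/epoch")
```

CAA and CAA-RF average 20 s per epoch, compared with roughly 0.5 s for RNN and VAE, so a model that looks strong after 20 epochs has used about 400 s of the window when plotted against time.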

Figure 8 shows the curves of training accuracy versus training rounds and time. The left graph shows that the CAA-RF model performs excellently in terms of both training rounds and training time. Initially, its training accuracy reached 100%. In the early stages of training, because only a few epochs have been completed, the CAA-RF model is unable to extract complex features effectively, and at this point the testing accuracy remains low. However, the secondary training by the random forest temporarily raises the accuracy on the training set to 1.0. As training progresses, the CAA model gradually adjusts its parameters to extract multiple complex features from the data more accurately. The random forest fits these more complex features less tightly, resulting in a decrease in training accuracy. However, this helps improve the model’s generalization ability and thus increases the testing accuracy. This phenomenon is a direct result of the dual-structure design of the CAA-RF model. In contrast, the training accuracy curves of the other, single-structure models are more stable and conventional. Figure 9 shows the curves of testing accuracy versus training rounds and time. The left graph shows that at around 20 training rounds, the CAA-RF model achieved its highest testing accuracy, slightly lower than that of LSTM but still significantly better than the other models.

In addition, the right graph, whose x-axis represents time in seconds, shows poorer results. In the curve of testing accuracy versus time shown in Figure 9, the CAA-RF model’s testing accuracy is lower than that of the other models in the early stages. This is mainly due to the longer training time required for each epoch of the CAA-RF model. The structure of the CAA-RF model is relatively complex: its design combines convolutional neural networks, adaptive mechanisms, self-attention mechanisms, and random forests to perform multi-dimensional detection of anomalous behaviors in blockchain networks. The synergy of these modules enhances the model’s detection performance but inevitably increases computational complexity, resulting in longer training times. Within the same observation period, the model has therefore not yet completed sufficient training. Although the CAA-RF model theoretically has higher potential, its testing performance did not match that of the other models in the short term because of this limitation in training efficiency. As shown in Figure 7b and Figure 9b, the CAA-RF model exhibits a significantly smaller decrease in both metrics than the CAA model during the middle and later stages of training. At around 250 s, the CAA model shows a noticeable decline on the test set and develops overfitting as training progresses, whereas the decline of the CAA-RF model is much less pronounced, indicating that the CAA-RF structure can effectively alleviate overfitting during training.

To further verify the effectiveness of integrating random forest structures with other deep learning models, we conducted additional experiments. To enhance representativeness, we selected one lower-performing model and one high-performing model, namely the Variational Autoencoder (VAE) and the Long Short-Term Memory (LSTM) Network. Figure 8a and Figure 9a display the F1 score and accuracy over the first 60 training epochs, respectively. By comparison, it can be observed that the VAE-RF and LSTM-RF models, which combine random forest with VAE and LSTM, respectively, exhibit significant improvements in both metrics compared to using VAE and LSTM alone. This demonstrates that integrating random forest with deep learning models can effectively enhance classification performance.

To enhance the accuracy and robustness of anomaly detection in blockchain networks, we have introduced multiple innovations in our model design. Our model utilizes convolutional neural networks to extract high-level features and captures complex spatial characteristics within blockchain networks through multiple layers of sophisticated convolutional structures. By incorporating a self-attention mechanism, the model can dynamically focus on key features while suppressing irrelevant ones. This mechanism is particularly important for anomaly detection, as it effectively addresses the issue of imbalanced data distribution in blockchain networks. The ever-changing environment of blockchain leads to feature distribution drift; therefore, we adopt an adaptive mechanism that dynamically adjusts the learning rate, enabling the model to reduce the learning rate, alleviate overfitting, and enhance robustness to changes. Random forest demonstrates excellent performance in classification tasks, especially in scenarios with small sample sizes and imbalanced data, due to its inherent robustness. By combining high-level features extracted through deep learning with random forest, classification performance is further improved, and the risk of overfitting that may occur with a single deep learning model is reduced. In terms of computational cost balance, although the training time of CAA-RF is slightly higher than that of a standalone random forest model, it maintains moderate CPU usage, memory consumption, and floating-point operations per second (FLOPS), thus achieving a good balance between performance and resource consumption.

Considering the actual blockchain network environment, data may vary, and the performance of the model may decline. However, our model is designed with this issue in mind and can mitigate the degradation caused by such differences. To address real-world scenarios, we have adopted a combination of CNNs and self-attention mechanisms, feature transfer ensemble learning, and adaptive training mechanisms. Real network traffic may include a wider variety of attack types, noise, abnormal patterns, and diverse normal business activities, as well as network fluctuations, latency, and data loss, resulting in more complex data distributions and higher requirements for model scalability. Blockchain nodes, protocols, and network topology may change over time, leading to feature distribution drift. In real-world scenarios, attacks are only a minority, which also causes data imbalance. CNNs can learn and extract local temporal features from data, while self-attention mechanisms can suppress irrelevant features and focus on key features. The combination of these two allows the model to capture abnormal local features. Random forest is robust to small samples, imbalanced data, and feature noise. Feeding high-level features extracted by deep models into the random forest for secondary classification helps improve the model’s generalization ability in complex real-world environments through the integration of multiple decision trees. When distribution drift leads to increased loss, the adaptive mechanism dynamically adjusts the learning rate, reducing it to alleviate overfitting and enhance robustness to changes.
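The text does not specify how the adaptive mechanism decides when to lower the learning rate. One common realisation is a reduce-on-plateau rule, sketched below in plain Python as an assumption rather than as the authors’ method. The class name `ReduceOnPlateau`, the values `factor=0.5` and `patience=2`, and the loss sequence are all hypothetical.

```python
class ReduceOnPlateau:
    """Scale the learning rate by `factor` once the loss has failed to improve
    for more than `patience` consecutive epochs."""

    def __init__(self, lr, factor=0.5, patience=2, min_lr=1e-6):
        self.lr, self.factor, self.patience, self.min_lr = lr, factor, patience, min_lr
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, loss):
        if loss < self.best:
            self.best, self.bad_epochs = loss, 0      # improvement: reset the counter
        else:
            self.bad_epochs += 1
            if self.bad_epochs > self.patience:
                self.lr = max(self.lr * self.factor, self.min_lr)
                self.bad_epochs = 0
        return self.lr

scheduler = ReduceOnPlateau(lr=1e-3)
# Simulated drift: the loss stops improving and rises for three epochs in a row.
losses = [0.9, 0.7, 0.6, 0.65, 0.66, 0.7, 0.5]
history = [scheduler.step(loss) for loss in losses]
```

In this run the learning rate stays at 1e-3 for five epochs and is halved on the sixth, after the third consecutive epoch without improvement. PyTorch provides a scheduler with this behaviour, `torch.optim.lr_scheduler.ReduceLROnPlateau`, which could be paired with the Adam optimizer mentioned earlier.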

5. Conclusions

This paper proposes a method for traffic anomaly detection in blockchain-based computing power networks, which combines the strengths of convolutional neural networks, self-attention mechanisms, and random forest models to achieve effective anomaly detection. The CAA-RF model demonstrates superior performance in terms of F1 score and training accuracy. However, its relatively long training time limits its short-term performance. In contrast, the LSTM model shows stable performance during both training and testing. To verify that random forest can effectively classify features extracted by neural networks and to avoid the contingency of a single experiment, we also used VAE and LSTM deep learning models in combination with random forest for classification, which similarly outperformed single deep learning models.

In addition, we constructed a lightweight blockchain simulation model, which was used to simulate Sybil and DDoS attack scenarios and generate a dedicated dataset with 14 network features to provide data support for research on anomaly detection algorithms. We have proposed future improvements, including simulating more types of attacks, collecting higher-dimensional data, and exploring more consensus algorithms to further enhance the robustness and applicability of our approach.

Despite achieving good experimental results, we recognize the challenges in our current research. The model requires a long training time and consumes significant computational resources; furthermore, distributed training may lead to the omission of certain data features. In the future, we plan to optimize the algorithm structure by integrating the feature extraction and classification processes into a unified training framework to enhance model performance and training efficiency. Meanwhile, we will further improve the simulation model, expand the range of attack scenarios and feature dimensions, and provide more comprehensive data support for subsequent research.

Author Contributions

Writing—original draft preparation: S.J.; Supervision: Y.Z. (Yating Zhao), W.L. and B.J.; Writing—review and editing: S.J., Y.Z. (Yating Zhao) and Y.Z. (Yang Zhang); Funding acquisition: Y.Z. (Yating Zhao). All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Postdoctoral Fellowship Program of CPSF, grant number GZC20240925, the China Postdoctoral Science Foundation, grant number 2024M751855, the Shandong Postdoctoral Science Foundation, grant number SDCXRS-202400018, and the Qingdao Postdoctoral Project, project number QDBSH20240102189.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

You can visit https://github.com/Maple1129/CAA-RF (accessed on 20 March 2025) to obtain the resources and further details about the project.

Acknowledgments

This research was funded by the Postdoctoral Fellowship Program of CPSF, the China Postdoctoral Science Foundation, the Shandong Postdoctoral Science Foundation, and the Qingdao Postdoctoral Project. The support is gratefully acknowledged.

Conflicts of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. Zhang, H.; Ye, S.; Cao, X.; Lin, Z. Real-time detection method for mobile network traffic anomalies considering user behavior security monitoring. In Proceedings of the 2021 International Conference on Computer, Blockchain and Financial Development (CBFD), Nanjing, China, 23–25 April 2021; pp. 11–16.
2. Ige, T.; Kiekintveld, C. Performance Comparison and Implementation of Bayesian Variants for Network Intrusion Detection. In Proceedings of the 2023 IEEE International Conference on Artificial Intelligence, Blockchain, and Internet of Things (AI-BThings), Mount Pleasant, MI, USA, 16–17 September 2023; pp. 1–5.
3. Latha, R.; Justin Thangaraj, S.J. Machine Learning Approaches for DDoS Attack Detection: Naive Bayes vs. Logistic Regression. In Proceedings of the 2023 Second International Conference on Smart Technologies for Smart Nation (SmartTechCon), Singapore, Singapore, 18–19 August 2023; pp. 1043–1048.
4. Tharani, J.S.; Hóu, Z.; Charles, E.Y.A.; Rathore, P.; Palaniswami, M.; Muthukkumarasamy, V. Unified Feature Engineering for Detection of Malicious Entities in Blockchain Networks. IEEE Trans. Inf. Forensics Secur. 2024, 19, 8924–8938.
5. Grekov, M. Architecture of a Multistage Anomaly Detection System in Computer Networks. In Proceedings of the 2022 International Siberian Conference on Control and Communications (SIBCON), Tomsk, Russia, 17–19 November 2022; pp. 1–5.
6. Kumaresan, S.J.; Senthilkumar, C.; Kongkham, D.; Beenarani, B.B.; Nirmala, P. Investigating the Effectiveness of Recurrent Neural Networks for Network Anomaly Detection. In Proceedings of the 2024 International Conference on Intelligent and Innovative Technologies in Computing, Electrical and Electronics (IITCEE), Bangalore, India, 24–25 January 2024; pp. 1–5.
7. Kisanga, P.; Woungang, I.; Traore, I.; Carvalho, G.H.S. Network Anomaly Detection Using a Graph Neural Network. In Proceedings of the 2023 International Conference on Computing, Networking and Communications (ICNC), Honolulu, HI, USA, 20–22 February 2023; pp. 61–65.
8. Xu, G.; Zhou, J.; He, Y. Network Malicious Traffic Detection Model Based on Combined Neural Network. In Proceedings of the 2022 6th Asian Conference on Artificial Intelligence Technology (ACAIT), Changzhou, China, 9–11 December 2022; pp. 1–6.
9. Wang, X.; Han, L.; Sun, S.; Xu, G. A Concept Drift Tolerant Abnormal Detection Method for Network Traffic. In Proceedings of the 2023 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), Suzhou, China, 2–4 November 2023; pp. 323–330.
10. Yang, D.; Hwang, M. Unsupervised and Ensemble-based Anomaly Detection Method for Network Security. In Proceedings of the 2022 14th International Conference on Knowledge and Smart Technology (KST), Chon buri, Thailand, 26–29 January 2022; pp. 75–79.
11. Mehra, P.; Ahuja, M.S.; Aeri, M. Time Series Anomaly Detection System with Linear Neural Network and Autoencoder. In Proceedings of the 2023 International Conference on Device Intelligence, Computing and Communication Technologies (DICCT), Dehradun, India, 17–18 March 2023; pp. 659–662.
12. Li, Z.; Wang, Y.; Wang, P.; Su, H. PGAN: A Generative Adversarial Network based Anomaly Detection Method for Network Intrusion Detection System. In Proceedings of the 2021 IEEE 20th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), Shenyang, China, 20–22 October 2021; pp. 734–741.
13. Yang, X.; Guo, Z.; Mai, Z. Botnet Detection Based on Machine Learning. In Proceedings of the 2022 International Conference on Blockchain Technology and Information Security (ICBCTIS), Huaihua, China, 15–17 July 2022; pp. 213–217.
14. Kim, J.; Nakashima, M.; Fan, W.; Wuthier, S.; Zhou, X.; Kim, I.; Chang, S.Y. A Machine Learning Approach to Anomaly Detection Based on Traffic Monitoring for Secure Blockchain Networking. IEEE Trans. Netw. Serv. Manag. 2022, 19, 3619–3632.
15. Sun, Y.; Ochiai, H.; Esaki, H. Multi-Type Anomaly Detection Based on Raw Network Traffic. In Proceedings of the 2021 IEEE 18th Annual Consumer Communications & Networking Conference (CCNC), Las Vegas, NV, USA, 9–12 January 2021; pp. 1–2.
16. Li, C.; Huo, D.; Wang, Y.; Wang, S.; Deng, Y.; Zhou, Q. A deep learning based detection scheme towards DDos Attack in permissioned blockchains. In Proceedings of the 2024 27th International Conference on Computer Supported Cooperative Work in Design (CSCWD), Tianjin, China, 8–10 May 2024; pp. 2644–2649.
17. Li, X.; Wang, C.; Tang, A. Entropy Change Rate for Traffic Anomaly Detection. In Proceedings of the 2021 IEEE 18th International Conference on Mobile Ad Hoc and Smart Systems (MASS), Denver, CO, USA, 4–7 October 2021; pp. 570–571.
18. Bosnyaková, B.; Babič, F.; Adam, T.; Biceková, A. Anomaly Detection in Blockchain Network Using Unsupervised Learning. In Proceedings of the 2025 IEEE 23rd World Symposium on Applied Machine Intelligence and Informatics (SAMI), Stará Lesná, Slovakia, 23–25 January 2025; pp. 221–224.
19. Zhan, X.; Yuan, H.; Wang, X. Research on Block Chain Network Intrusion Detection System. In Proceedings of the 2019 International Conference on Computer Network, Electronic and Automation (ICCNEA), Xi’an, China, 27–29 September 2019; pp. 191–196.
20. Dong, Y.; Wang, R.; He, J. Real-Time Network Intrusion Detection System Based on Deep Learning. In Proceedings of the 2019 IEEE 10th International Conference on Software Engineering and Service Science (ICSESS), Beijing, China, 18–20 October 2019; pp. 1–4.
21. Sidharth, V.; Kavitha, C.R. Network Intrusion Detection System Using Stacking and Boosting Ensemble Methods. In Proceedings of the 2021 Third International Conference on Inventive Research in Computing Applications (ICIRCA), Coimbatore, India, 2–4 September 2021; pp. 357–363.
22. El-Attar, N.E.; Salama, M.H.; Abdelfattah, M.; Taha, S. A Comparative Analysis for Anomaly Detection in Blockchain Networks Using Machine Learning Techniques. In Proceedings of the 2024 34th International Conference on Computer Theory and Applications (ICCTA), Alexandria, Egypt, 14–16 December 2024; pp. 171–176.
23. Chen, Z.; Duan, J.; Kang, L.; Qiu, G. Supervised Anomaly Detection via Conditional Generative Adversarial Network and Ensemble Active Learning. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 7781–7798.
24. Alabugin, S.K.; Sokolov, A.N. Applying of Recurrent Neural Networks for Industrial Processes Anomaly Detection. In Proceedings of the 2021 Ural Symposium on Biomedical Engineering, Radioelectronics and Information Technology (USBEREIT), Yekaterinburg, Russia, 13–14 May 2021; pp. 0467–0470.
25. Tang, H.; Xiong, W.; Deng, M.; Guo, Y. A Transformer Oil Temperature Anomaly Detection Method Based on LSTM-Attention. In Proceedings of the 2024 4th International Symposium on Artificial Intelligence and Intelligent Manufacturing (AIIM), Chengdu, China, 20–22 December 2024; pp. 310–313.
26. Li, J.; Liu, S.; Zou, J. E-Commerce Data Anomaly Detection Method Based on Variational Autoencoder. In Proceedings of the 2024 3rd International Conference on Artificial Intelligence, Internet of Things and Cloud Computing Technology (AIoTC), Wuhan, China, 13–15 September 2024; pp. 231–234.
27. Liu, J.; Qiao, M.; Du, L.; Zhang, W.; Wang, M.; Jin, K. Utilizing Deep Belief Networks for Power System State Estimation and Anomaly Detection. In Proceedings of the 2025 IEEE 5th International Conference on Power, Electronics and Computer Applications (ICPECA), Shenyang, China, 17–19 January 2025; pp. 250–255.
28. Raturi, R.; Kumar, A.; Vyas, N.; Dutt, V. A Novel Approach for Anomaly Detection in Time-Series Data using Generative Adversarial Networks. In Proceedings of the 2023 International Conference on Sustainable Computing and Smart Systems (ICSCSS), Coimbatore, India, 14–16 June 2023; pp. 1352–1357.
Wang, C.; Wei, W.; Long, Y.; Liu, J.; Luo, Z. Variational Autoencoder Based Anomaly Detection for AIOps of Power Grid Supercomputing Center. In Proceedings of the 2023 3rd International Conference on Energy Engineering and Power Systems (EEPS), Dali, China, 28–30 July 2023; pp. 1055–1058. [Google Scholar] [CrossRef]
Zhang, W.; Xu, L.; Yu, Z.; Zhang, Z.; Liu, T.; Liu, S. MCVAE: Multi-channel Variational Autoencoder for Anomaly Detection. In Proceedings of the 2022 IEEE 13th International Symposium on Parallel Architectures, Algorithms and Programming (PAAP), Beijing, China, 25–27 November 2022; pp. 1–6. [Google Scholar] [CrossRef]
Nguyen, H.N.; Lan-Phan, T.; Song, C.-J. Generative Adversarial Network-Based Network Intrusion Detection System for Supervisory Control and Data Acquisition System. In Proceedings of the 2024 IEEE International Conference on Consumer Electronics-Asia (ICCE-Asia), Danang, Vietnam, 3–6 November 2024; pp. 1–3. [Google Scholar] [CrossRef]
Zhang, Y.; Hao, H.; Zhang, T. Abnormal Detection and Fault Diagnosis Method of Bearing Based on Deep Convolutional Generative Adversarial Network. In Proceedings of the 2023 IEEE 11th Joint International Information Technology and Artificial Intelligence Conference (ITAIC), Chongqing, China, 8–10 December 2023; pp. 1604–1608. [Google Scholar] [CrossRef]
Yan, S. Analysis on Blockchain Consensus Mechanism Based on Proof of Work and Proof of Stake. In Proceedings of the 2022 International Conference on Data Analytics, Computing and Artificial Intelligence (ICDACAI), Zakopane, Poland, 15–16 August 2022; pp. 464–467. [Google Scholar] [CrossRef]
Li, X.; Yang, Y.; Li, B.; Li, M.; Zhang, J.; Li, T. Blockchain cryptocurrency abnormal behavior detection based on improved graph convolutional neural networks. In Proceedings of the 2023 International Conference on Data Security and Privacy Protection (DSPP), Xi’an, China, 16–18 October 2023; pp. 216–222. [Google Scholar] [CrossRef]
Vubangsi, M.; Abidemi, S.U.; Akanni, O.; Mubarak, A.S.; Al-Turjman, F. Applications of Transformer Attention Mechanisms in Information Security: Current Trends and Prospects. In Proceedings of the 2022 International Conference on Artificial Intelligence of Things and Crowdsensing (AIoTCs), Nicosia, Cyprus, 26–28 October 2022; pp. 101–105. [Google Scholar] [CrossRef]
Wang, W. Algorithm Research on Representation Learning and Graph Convolutional Networks in Anticancer Drug Response Prediction. Ph.D. Dissertation, Air Force Medical University, Xi’an, China, 2024; pp. 1–164. [Google Scholar] [CrossRef]

Figure 1. Computing Power Blockchain Network.

Figure 2. Network Anomaly Detection Architecture for Blockchain Computing Power Networks.

Figure 3. Part 1: Convolutional feature extraction.

Figure 4. Part 2: Attention Weighting.

Figure 5. Feature classification.

Figure 6. Comparison of the accuracy of multiple neural networks.

Figure 7. (a) F1 score curve with round and (b) F1 score curve with time.

Figure 8. (a) Training accuracy curves with round and (b) Training accuracy curves with time.

Figure 9. (a) Test accuracy curves with round and (b) Test accuracy curves with time.

Table 1. Dataset Details.

Type	Data 1	Data 2	Data 3	Data 4	Data 5	Data 6	Data 7
Time series *	39,444,544	46,552,288	34,195,072	4,560,320	40,707,168	41,166,464	55,643,680
Duration (10⁻⁶ s) *	73,008	10,013	0	10,018	29,335	0	5,142,993
Length (B) *	347	113	114	118	108	115	364
cc_flow *	19	15	1	17	16	4	16
r1 (10⁻³) *	190	150	12	170	160	40	160
ec_flow *	0	15	1	11	14	0	0
r2 (10⁻³) *	0	150	90	110	518	0	0
r3 (10⁻³) *	0	0	0	0	0	0	0
cc_time *	3	3	1	3	6	1	3
r4 (10⁻³) *	250	428	47	300	315	76	300
ec_time *	0	1	1	1	4	0	0
r5 (10⁻³) *	0	750	100	1000	1500	200	0
r6 (10⁻³) *	1000	750	500	750	600	1000	1000
Control code *	20	10	10	20	511	10	20
Attack label *	0	1	1	1	2	2	0

* Feature definitions: Time series is the time at which the data was generated. Duration is the duration of the TCP connection. Length is the length of the transmitted data. cc_flow is the number of correct connections made by the currently connected node among the last 100 connection requests, and r1 is the ratio of those correct connections to the total connections in the last 100 connection requests. ec_flow is the number of connection errors for the currently connected node among the last 100 connection requests; r2 is the ratio of that node's errors to the total errors, and r3 is the ratio of that node's connection errors to its total number of correct and incorrect connections. cc_time is the number of connections for the node among correctly connected nodes in the last 3 s, and r4 is the ratio of the node's correct connections to the total correct connections in the last 3 s. r5 is the ratio of the node's connection errors to the total incorrect connections in the last 3 s, and r6 is the ratio of the node's connection errors to its total correct and incorrect connections in the last 3 s.
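
The window-based features in the footnote can be illustrated with a minimal sketch. It is not the authors' simulator code: the `(node_id, ok)` record layout, the `flow_features` helper, and the sample traffic are all assumptions made for illustration, and r2 is assumed to be computed over the same 100-request window as the other flow features.

```python
from collections import deque

WINDOW = 100  # "last 100 connection requests" from the Table 1 footnote


def flow_features(window, node):
    """Compute cc_flow, r1, ec_flow, r2 and r3 for `node`.

    `window` holds (node_id, ok) pairs for the most recent requests.
    This record layout is an assumption, not the paper's data format.
    """
    total = len(window)
    cc = sum(1 for n, ok in window if n == node and ok)       # correct connections
    ec = sum(1 for n, ok in window if n == node and not ok)   # connection errors
    total_err = sum(1 for _, ok in window if not ok)          # errors from all nodes
    return {
        "cc_flow": cc,
        "r1": cc / total if total else 0.0,
        "ec_flow": ec,
        "r2": ec / total_err if total_err else 0.0,
        "r3": ec / (cc + ec) if (cc + ec) else 0.0,
    }


# Hypothetical traffic: node "A" makes 19 good connections, node "B" the other 81.
recent = deque(maxlen=WINDOW)
recent.extend([("A", True)] * 19 + [("B", True)] * 81)
features = flow_features(recent, "A")
```

With this made-up window, cc_flow is 19 and r1 is 0.19, which is 190 in the table's 10⁻³ units, the same pair of values as the Data 1 column. The time-based features (cc_time, r4, r5, r6) would follow the same pattern with a 3 s window in place of the 100-request window.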

Table 2. Dataset statistics.

Split	Attack label 0	Attack label 1	Attack label 2
Train	11,334	4753	5365
Test	2810	1171	1382
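
As a quick check on the split, the counts in Table 2 can be summed to recover the train/test ratio and the share of each attack label. The short sketch below uses only the numbers in the table; the variable and function names are illustrative.

```python
# Record counts per attack label (0, 1, 2), copied from Table 2.
train = {0: 11_334, 1: 4_753, 2: 5_365}
test = {0: 2_810, 1: 1_171, 2: 1_382}

train_total = sum(train.values())
test_total = sum(test.values())
train_fraction = train_total / (train_total + test_total)


def shares(counts):
    """Return each label's share of the split, rounded to three decimals."""
    total = sum(counts.values())
    return {label: round(n / total, 3) for label, n in counts.items()}


train_shares = shares(train)
test_shares = shares(test)
```

The counts correspond to an 80/20 train/test split, with label 0 accounting for about 53% of the training records.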

Table 3. Model computational cost.

Model	CPU Utilization (%)	Memory (MB)	Average Training Time Per Epoch (s)	Space Complexity	Time Complexity	FLOPS
CAA	14.2	3271.8	20.16657	636,187	25,600,000,000	1,269,427,394
CAA-RF	15.4	3498.3	20.40496	674,587	27,358,208,000	1,340,762,638
Auto	11.4	3414.7	0.408983	4117	246,180,000	601,931,492
GANs	9.7	3428.5	4.951626	246,186	352,480,000	71,184,693
LSTM	19.2	5546.3	0.810934	206,211	8,207,360,000	10,120,866,978
RNN	36.4	3532.9	0.050418	32,103	1,276,000,000	25,308,412,560
VAE	10.4	3461.7	0.615854	2322	87,040,000	141,332,137
DBN	11.7	5154.2	0.764012	12,355	485,120,000	634,963,849
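
To read the CAA and CAA-RF rows of Table 3 side by side, the relative overhead of CAA-RF can be computed directly from the tabulated values. This is only arithmetic on the table; the dictionary keys are shorthand for the column names and are not identifiers from the paper.

```python
# Values copied from the CAA and CAA-RF rows of Table 3.
caa = {"space": 636_187, "epoch_s": 20.16657, "flops": 1_269_427_394}
caa_rf = {"space": 674_587, "epoch_s": 20.40496, "flops": 1_340_762_638}

# Relative increase of CAA-RF over CAA for each column.
overhead = {key: caa_rf[key] / caa[key] - 1 for key in caa}

for key, value in overhead.items():
    print(f"{key}: +{value:.1%}")
```

On these figures, CAA-RF costs roughly 6% more in the space-complexity and FLOPS columns and about 1% more training time per epoch than CAA.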

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

