AI-Driven Security for Blockchain-Based Smart Contracts: A GAN-Assisted Deep Learning Approach to Malware Detection

Bourian, Imad; Hassine, Lahcen; Chougdali, Khalid

doi:10.3390/jcp5030053


by Imad Bourian ¹,*, Lahcen Hassine ² and Khalid Chougdali ¹

¹ Engineering Sciences Laboratory, Ibn Tofail University, Kenitra 14000, Morocco

² Laboratory Engineering System-SIRC (LaGeS), Hassania School of Public Works (EHTP), Casablanca 20200, Morocco

* Author to whom correspondence should be addressed.

J. Cybersecur. Priv. 2025, 5(3), 53; https://doi.org/10.3390/jcp5030053

Submission received: 23 April 2025 / Revised: 24 June 2025 / Accepted: 7 July 2025 / Published: 1 August 2025


Abstract

In the modern era, the use of blockchain technology has been growing rapidly, where Ethereum smart contracts play an important role in securing decentralized application systems. However, these smart contracts are also susceptible to a large number of vulnerabilities, which pose significant threats to intelligent systems and IoT applications, leading to data breaches and financial losses. Traditional detection techniques, such as manual analysis and static automated tools, suffer from high false positives and undetected security vulnerabilities. To address these problems, this paper proposes an Artificial Intelligence (AI)-based security framework that integrates Generative Adversarial Network (GAN)-based feature selection and deep learning techniques to classify and detect malware attacks on smart contract execution in the blockchain decentralized network. After an exhaustive pre-processing phase yielding a dataset of 40,000 malware and benign samples, the proposed model is evaluated and compared with related studies on the basis of a number of performance metrics including training accuracy, training loss, and classification metrics (accuracy, precision, recall, and F1-score). Our combined approach achieved a remarkable accuracy of 97.6%, demonstrating its effectiveness in detecting malware and protecting blockchain systems.

Keywords: blockchain; artificial intelligence; smart contract; deep learning; security; IoT

1. Introduction

In recent years, data integrity and security have become increasingly important, particularly with the rapid expansion of smart systems that handle enormous volumes of private information across diverse applications [1]. Various advanced technologies are integrated to ensure data security, privacy, and integrity. In this context, blockchain technology is one of the most innovative methods; it has gained massive attention and has become an emerging technique for addressing security challenges in intelligent environments such as smart homes [2], smart cities [3], healthcare [4,5], and smart traffic control [6]. According to recent industry reports, the global blockchain market size is projected to grow from USD 20.1 billion in 2024 to USD 248.9 billion by 2029 at a Compound Annual Growth Rate (CAGR) of 65.5% during the forecast period [7]. Compared to traditional centralized systems, which rely on a single point of control and are more susceptible to unauthorized access, data breaches, and single points of failure, blockchain's distributed and immutable architecture makes it significantly more resistant to data tampering, since each transaction is recorded across multiple nodes in a decentralized network. These benefits establish blockchain as a cutting-edge solution for safe, trustless systems in a variety of fields. The worldwide research community is drawn to its distinctive features, which include immutability, decentralization, transparency, and high reliability [8]. Among its key applications, smart contracts play a pivotal role in modernizing transactions, driving extensive research into their functionalities and potential vulnerabilities [9]. These self-executing contracts verify and authenticate data that must be stored in the blockchain network, and their key attributes make them important to many decentralized applications [10]. 
A key property of smart contracts is immutability, which guarantees that they cannot be changed once they are deployed. This necessitates rigorous testing before deployment, since a contract cannot be amended afterwards [11]. Additionally, ownership mechanisms give transaction creators the autonomy to establish rules for access and privacy, with immutable conditions in smart contracts optimizing system functionality [12,13]. Beyond their structural advantages, smart contracts improve security by addressing blockchain constraints, imposing conditions, and limiting unrestricted access to network transactions [11]. They also enhance privacy by restricting access to contract contents to parties that satisfy specific conditions [14]. However, the security of smart contracts and the operational limits of blockchain networks, such as capacity, latency, and scalability, remain obstacles to widespread adoption [15]. This complexity has also led to the emergence of critical security vulnerabilities. In addition, Solidity, the programming language of Ethereum smart contracts, has design flaws that require developers to prioritize security and code integrity. Moreover, the immutable nature of blockchain means that once a vulnerable contract has been deployed in the network, it may cause irreversible, disastrous losses. In this regard, vulnerabilities in smart contracts can be detected and classified from different forms of input, including source code, bytecode, opcodes, and even images [16,17]. In the Ethereum platform, an opcode (operation code) is a primitive instruction that the Ethereum Virtual Machine (EVM) executes [18]. Each opcode is an individual operation, ranging from arithmetic computation and stack manipulation to control flow handling. The bytecode is a low-level representation of a smart contract: a sequence of opcodes, each accompanied by its operands [18]. 
As an illustration, the DAO (Decentralized Autonomous Organization), launched in 2016 on the Ethereum blockchain, was hacked due to vulnerabilities in its code base. Consequently, an amount of USD 60 million of ether was stolen, reducing its market capitalization by 40% [19]. Further, in March 2022, a decentralized exchange called DODO DEX was hacked by a smart contract attack, losing almost USD 3.8 million of cryptocurrencies. Thus, smart contracts could be hacked by various cyberattacks that lead to huge losses in funds. Table 1 presents an overview of the most relevant attacks on smart contracts based on vulnerability, target, and method used. From the literature, most of the blockchain-based security solutions are centered on secure data storage and task offloading for IoT applications [20]. Several studies have addressed the use of some forms of smart contracts and the impact of using this form on vulnerability detection results. However, few studies have focused on vulnerability classification using deep learning technologies [21].

This work attempts to address the above security challenges by proposing an AI-driven approach for malware detection and classification using deep learning techniques to enhance the security of smart contracts based on blockchain. Using thorough experimentation on a large sample of malware attacks, we compare our approach to existing work and demonstrate its improved performance and stability in identifying malicious attacks. Hence, the main objective of this work is to enhance the security of smart contracts and ensure the resilience of blockchain networks by identifying and classifying different vulnerabilities using deep learning algorithms and experimentally evaluating them in terms of accuracy, F1-score, precision, and recall.

The main contributions of this work are as follows:

To propose an AI-based security solution integrating deep learning models to detect malware that targets the execution of smart contracts.
To introduce a GAN-algorithm-based feature selection mechanism to optimize the dataset and improve classification performance by selecting the most useful features.
To provide a comparative evaluation of three deep learning models (ANN, CNN, and GAT) specialized for different data structures derived from malware behavior.

2. Related Works

This section critically evaluates recent research papers that propose the integration of AI and blockchain technology, including blockchain-based smart contracts, for security enhancement in smart systems. The authors of [32] discuss the use of AI and blockchain for enhancing cybersecurity, covering issues such as intrusion detection, malware monitoring, and data privacy. The outcome offers improved intrusion prevention and network security, with emphasis on policy structures. A recent study [33] automates cybersecurity compliance and threat response using AI, blockchain, and smart contracts. It addresses human error and the lack of real-time monitoring. The results include real-time monitoring, improved auditability, reduced human intervention, and 91% accuracy in threat classification. The method proposed in [34] uses blockchain and AI to improve cybersecurity in developing healthcare systems, focusing on data breaches and unauthorized access. The results show secure data storage, proactive threat detection, and improved patient data integrity, which together improve the security of healthcare systems. The work in [35] focuses on improving the security of IoT systems using blockchain and AI. It proposes SecureChainAI, a model that enhances detection rates and reduces error percentages, achieving higher detection scores (precision, recall, and F1-score). The approach presented in [36] improves security in voting systems based on blockchain and AI by addressing weaknesses and cyberattacks in such systems. Reported benefits include better threat detection and transaction volumes, ensuring a more secure and efficient election process. The study in [37] enhances the security and privacy of AI systems through the use of blockchain, targeting cyberattacks and data governance issues. It achieves decentralized, tamper-proof storage, audits independent of any authority, and verification, thereby strengthening trust in AI systems and their security. The work in [38] focuses on enhancing intrusion detection systems in IoT networks through blockchain and AI. Its findings point toward a scalable, transparent, immutable, and decentralized system, leading to improved security in IoT systems. In [39], blockchain and machine learning are combined to improve IoT cybersecurity by addressing anomaly detection and resource management; the results indicate improved security and privacy in 130 use cases. Finally, the research in [40] addresses compliance with the EU AI Act through the use of blockchain in AI cybersecurity, tackling data poisoning and data governance issues. The results are a tamper-evident infrastructure, autonomous audits, and additional security, supporting compliance with the EU AI Act.

As previously mentioned, a great deal of effort has been expended in integrating AI models within blockchain technology to enhance security across diverse application domains, such as healthcare, IoT, and e-voting systems. However, a critical gap remains unaddressed, and security analysis of the smart contracts executed on blockchain networks remains unexplored, particularly their susceptibility to malicious manipulation and malware injection during execution. Most prior works rely on static analysis or focus on external threat landscapes, overlooking the nuanced behavioral patterns of smart contracts that may signal compromise. To bridge this gap, it is essential to design an architecture that offers filtered access in which only authenticated users can deploy smart contracts. Through an analysis of external network data, an AI-driven model can efficiently identify various cyber threats and unauthorized access attempts. After it verifies the lack of malicious intentions, the smart contract processing takes place. This dynamic, data-driven approach adds a layer to smart contract security that complements existing static or rule-based models.

3. Proposed Approach

In this section, we present our proposed design that integrates blockchain-based smart contracts and deep-learning-based malware detection to protect smart systems. As illustrated in Figure 1, a smart contract possesses a structured life-cycle on the Ethereum blockchain. This life-cycle consists of a number of phases:

Creation: the smart contract is programmed and compiled.
Deployment: the contract is deployed to the Ethereum blockchain, so it becomes public and immutable.
Execution: the contract interacts with external systems and users through calling the contract’s functions.
Completion: the contract finishes executing normally, or it reaches an end state (i.e., it self-destructs or becomes obsolete).

Throughout this life-cycle, smart contracts are vulnerable to security threats, including malware attacks that attempt to alter their behavior or exploit weaknesses. To illustrate, consider a simple scenario in which malware modifies a blockchain node to falsify recorded data. The nodes are installed on the internal servers of an organization that uses a blockchain for transaction traceability. The attacker sends a phishing email containing a malware-infected attachment to a system administrator. Once opened, the malware silently installs itself on the machine and gains access to the file system. It identifies directories related to the operation of the local blockchain node, then modifies the configuration files to redirect traffic to a node controlled by the attacker. Next, it alters local data to falsify transactions or the transaction history. It can also remotely control the machine or retrieve all files on the network. To counter such threats, our proposed solution, depicted in Figure 2, integrates an AI-powered security component into the blockchain network. Our framework employs a deep-learning-based model for detecting and preventing malicious behavior that attempts to compromise smart contracts at any stage of their life-cycle. The blockchain module ensures secure and transparent transaction management between various smart system networks. Our solution is also strengthened with a GAN-based feature selection technique to enhance the malware detection rate.

Figure 1. Smart contract life-cycle over Ethereum blockchain.

Figure 2. Proposed approach design.

3.1. Data and Pre-Processing

A 100,000-sample dataset was collected from publicly accessible GitHub repositories dedicated to malware research and education. These include repositories such as the JavaScript malware collection master, malware collection master, malware database main/master, my malware collection, the malware repo, and trashers malware repo. These collections contain various types of malware files, such as PDF, RTF, XLS, JS, ELF, and JAR, which are representative of different real-world attack patterns and malicious activities. Benign samples, used to balance the dataset, consist of common formats such as EXE, XLS, DOC, RTF, ZIP, 7Z, RAR, JAR, PDF, and ELF. The dataset supports not only the examination of malware signatures but also the study of propagation patterns and frequently used attack vectors, including those that can target blockchain-related environments and smart contracts. It must be acknowledged, however, that the dataset is not specific to blockchain malware; rather, it is meant to facilitate the creation of generalized malware detection models. Compared to widely used network intrusion datasets such as UNSW-NB15 or CIC-IDS-2017, which focus on traffic-level features, our dataset is built upon static file-based malware instances. This approach supports complementary detection views suitable for smart contract environments interacting with external data inputs. Algorithm 1 defines the pre-processing pipeline: data cleaning, removal of duplicates, handling of missing values, normalization and standardization of features, class balancing, and feature engineering. Each file is first converted into an appropriate representation, such as hexadecimal or binary, depending on its type, and statistical and content-based features are then extracted. The resulting dataset is formatted and validated in preparation for model training and performance assessment.

Algorithm 1 Algorithm for Pre-Processing.

procedure PreprocessDataset(D)
    Initialize Processed_Dataset $\leftarrow \emptyset$
    for each row $r \in D$ do
        Clean the row r
        Skip r if it duplicates a row already in Processed_Dataset
        Normalize values
        Transform data if necessary
        Validate the row r
        Processed_Dataset.add(r)
    end for
    return Processed_Dataset
end procedure

The following equations describe the main pre-processing operations applied to the malware and benign file dataset:

3.1.1. Data Cleaning

Removal of null or invalid values:

$D' = \{x_i \in D \mid x_i \neq \text{null} \land x_i \text{ is valid}\}$

3.1.2. Duplicate Removal

Given a dataset $D = \{x_1, x_2, \dots, x_n\}$, duplicates are removed:

$D' = \{x_i \in D \mid \nexists\, j < i,\ x_j = x_i\}$
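As a minimal sketch of the rule above (keep a row only if no earlier row equals it), the following Python function removes duplicates while preserving order. The function name `remove_duplicates` and the row-as-list representation are assumptions for illustration, not part of the paper.

```python
def remove_duplicates(rows):
    """Keep the first occurrence of each row, preserving the original order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)  # lists are not hashable, so compare rows as tuples
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique
```

Keeping the first occurrence mirrors the condition that no index j < i holds an equal sample.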

3.1.3. Normalization (Min–Max Scaling)

Each feature $x_i$ is scaled to the range $[0, 1]$:

$x_i^{norm} = \frac{x_i - \min(x)}{\max(x) - \min(x)}$
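The min–max formula can be sketched in plain Python as follows. The function name `min_max_scale` is hypothetical, and mapping a constant feature to 0.0 (where the denominator would be zero) is an assumption the paper does not address.

```python
def min_max_scale(values):
    """Scale a list of numbers to [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant feature carries no information, so map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```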

3.1.4. Standardization (Z-Score Scaling)

Each feature $x_i$ is standardized as follows:

$x_i^{std} = \frac{x_i - \mu_x}{\sigma_x}$

where $\mu_x$ is the mean and $\sigma_x$ is the standard deviation of feature x.
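A small Python sketch of z-score scaling is given below. It assumes the population standard deviation (dividing by n), which matches the standard deviation formula used later in this section; the function name `z_score` and the handling of zero variance are illustrative assumptions.

```python
import math


def z_score(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    n = len(values)
    mu = sum(values) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / n)
    if sigma == 0:
        # Assumption: a constant feature is mapped to zeros.
        return [0.0] * n
    return [(v - mu) / sigma for v in values]
```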

3.1.5. Class Balancing

Let $N_0$ and $N_1$ be the number of samples in classes 0 and 1, respectively. After balancing, both classes contain the same number of samples $N'$:

$N' = \min(N_0, N_1)$ (undersampling)

or $N' = \max(N_0, N_1)$ (oversampling)
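The undersampling variant can be illustrated with the sketch below, which randomly keeps min(N_0, N_1) samples from each class. The paper does not state which balancing strategy or sampling procedure was used, so the random selection, the fixed seed, and the name `undersample` are assumptions.

```python
import random


def undersample(samples, labels, seed=0):
    """Return (sample, label) pairs with equal counts for classes 0 and 1."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    by_class = {0: [], 1: []}
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    n = min(len(by_class[0]), len(by_class[1]))
    balanced = []
    for label in (0, 1):
        # Draw n samples without replacement from each class.
        for sample in rng.sample(by_class[label], n):
            balanced.append((sample, label))
    rng.shuffle(balanced)  # avoid leaving the classes in two contiguous blocks
    return balanced
```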

3.1.6. File Transformation

Files are converted to byte sequences, then into byte frequency vectors. Let $F = \{b_1, b_2, \dots, b_n\}$ be the byte sequence:

$f_k = \frac{\mathrm{count}(b_i = k)}{n}, \quad k \in [0, 255]$
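The byte frequency vector can be computed directly from a file's raw bytes, as in this minimal sketch. The function name `byte_frequencies` and the all-zero vector returned for an empty file are assumptions made for illustration.

```python
def byte_frequencies(data: bytes):
    """Return a 256-element list where entry k is the relative frequency of byte k."""
    n = len(data)
    if n == 0:
        return [0.0] * 256  # assumption: an empty file yields an all-zero vector
    counts = [0] * 256
    for b in data:  # iterating over bytes yields integers in 0..255
        counts[b] += 1
    return [c / n for c in counts]
```

In practice, `data` could be obtained with `open(path, "rb").read()`, which gives each file a fixed-length representation regardless of its size or type.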

3.1.7. Statistical Feature Extraction

Entropy:

$H(x) = -\sum_{i=1}^{n} p(x_i) \log_2 p(x_i)$

Mean:

$\mu = \frac{1}{n} \sum_{i=1}^{n} x_i$

Standard deviation:

$\sigma = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2}$
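The three statistics can be sketched for a byte sequence as follows. The sketch assumes that the entropy sum runs over the distinct byte values observed in the file, with p estimated from their relative frequencies; the paper does not spell this out, and the name `byte_statistics` is hypothetical.

```python
import math
from collections import Counter


def byte_statistics(data: bytes):
    """Return (Shannon entropy in bits, mean, population standard deviation)."""
    n = len(data)
    if n == 0:
        raise ValueError("cannot compute statistics of an empty byte sequence")
    counts = Counter(data)
    # Entropy over observed byte values; unobserved values contribute zero.
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    mean = sum(data) / n
    std = math.sqrt(sum((b - mean) ** 2 for b in data) / n)
    return entropy, mean, std
```

Byte-level entropy ranges from 0 bits (a single repeated byte) to 8 bits (all 256 values equally likely).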

3.1.8. Validation

Each feature vector $x \in \mathbb{R}^{d}$ must lie within expected bounds:

$\forall j \in [1, d],\ x_j \in [a_j, b_j]$
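The bounds check amounts to a per-feature interval test, sketched below. The name `within_bounds` and the representation of the bounds as (a_j, b_j) pairs are assumptions; the paper does not list the actual bounds.

```python
def within_bounds(x, bounds):
    """Return True if every feature x[j] lies in its closed interval bounds[j]."""
    if len(x) != len(bounds):
        return False  # a vector of the wrong dimension is treated as invalid
    return all(a <= value <= b for value, (a, b) in zip(x, bounds))
```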

3.2. Features Selection

We adopt a new GAN-based approach to identify the most relevant features in the generated dataset containing 40,000 malware and legitimate samples. The GAN learns the underlying data distribution and evaluates the importance of each feature by leveraging the adversarial training mechanism between the generator and discriminator. During the training process, the generator attempts to produce synthetic data samples that resemble the original data, while the discriminator tries to distinguish between real and generated samples. Features that contribute significantly to the discriminator's accuracy are deemed important, as they play a key role in differentiating real data from generated data. As explained in Algorithm 2, this process ensures that the selected features are informative and robust, leading to a refined dataset that enhances the performance of deep learning models.

Algorithm 2 GAN-based Feature Selection.

procedure FeatureSelection(D, k)
    Encode categorical variables
    Normalize features using StandardScaler
    Split D into training and testing sets $D_{train}$ and $D_{test}$
    Build Generator G
    Build Discriminator D
    Combine G and D to create the GAN model
    Compile D with binary cross-entropy loss and Adam optimizer
    Freeze D and compile the GAN
    for each epoch E do
        Sample $B / 2$ real data from $D_{train}$
        Generate $B / 2$ fake data using G
        Train D on real data (1) and fake data (0)
        Generate B noise samples and train GAN
    end for
    Use G to generate synthetic features from noise N of size $D_{test}$
    Combine generated features with $D_{test}$ to obtain $D_{augm}$
    Use SelectKBest to select top k features based on $D_{augm}$
    Extract selected features and target column
    Save $D_{augm}$: selected features and associated labels to a CSV file
end procedure
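The final selection step of Algorithm 2 relies on SelectKBest, but the paper does not state which scoring function was used. The sketch below assumes the one-way ANOVA F-statistic, which is the default score function of scikit-learn's `SelectKBest` (`f_classif`), and reimplements it in plain Python so that the example is self-contained. The function names `f_scores` and `select_top_k` are hypothetical, and the GAN training itself is not reproduced here.

```python
def f_scores(X, y):
    """One-way ANOVA F-statistic for each feature column of X given labels y."""
    n = len(X)
    classes = sorted(set(y))
    df_between, df_within = len(classes) - 1, n - len(classes)
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        grand_mean = sum(column) / n
        between = within = 0.0
        for c in classes:
            group = [v for v, label in zip(column, y) if label == c]
            group_mean = sum(group) / len(group)
            between += len(group) * (group_mean - grand_mean) ** 2
            within += sum((v - group_mean) ** 2 for v in group)
        if within == 0:
            # Perfectly separated (or constant) feature: avoid division by zero.
            scores.append(float("inf") if between > 0 else 0.0)
        else:
            scores.append((between / df_between) / (within / df_within))
    return scores


def select_top_k(X, y, k):
    """Return the sorted column indices of the k highest-scoring features."""
    scores = f_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])
```

With k = 10, `select_top_k` would return the indices of the ten columns to retain, corresponding to the top-10 selection described in the text.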

Figure 3 shows the importance of different features as evaluated by the GAN model. The hash_encoded feature stands out with a very high value, far exceeding all others, indicating its large contribution to the model. Next are n_rpte and prio, with much lower but still high values, suggesting that they also play important roles. The remaining features are less significant; we retained the 10 most influential features to optimize the model's efficiency and relevance.

The GAN model converges effectively, with both the generator and discriminator reaching a stable equilibrium after sufficient training epochs. Figure 4 and Figure 5 illustrate that the discriminator loss (D-Loss) starts relatively low, increases slightly, decreases gradually, and stabilizes over the course of training. This indicates that the discriminator initially learns to distinguish real from generated data, and that its performance gets better up to the point where it stabilizes. Conversely, the generator loss (G-Loss) begins high, representing the initial difficulty of the generator in producing realistic data. However, it quickly decreases in the early epochs, showing that the generator is learning and refining its outputs. The G-Loss becomes stable with minimal variations after around 200 epochs, which shows a balance between the discriminator and generator.

3.3. Experimentation

In the experimentation stage, we implement three models: Artificial Neural Networks (ANNs), Convolutional Neural Networks (CNNs), and Graph Attention Networks (GATs), which are recent advances in deep learning technologies, most useful in malware detection and addressing complex problems. Each of these models was selected based on its suitability for learning different characteristics of our dataset, which consists of 40,000 labeled samples (malware and benign). The three models each have different strengths:

The ANN model provides a flexible architecture for learning non-linear patterns. ANNs are built from fully connected layers in which every neuron connects to each neuron in adjacent layers through learned weights and activation functions. They are chosen for their ability to learn non-linear relationships in structured tabular data, which is essential for capturing the diverse patterns exhibited by different types of malware in our dataset. This flexibility allows ANNs to effectively distinguish between benign and malicious behaviors, even when malware variants present overlapping characteristics.
The CNN model excels at automatically extracting features from spatial or structured data. CNNs use convolutional filters to extract local features from structured data, such as images, while progressively reducing the dimensionality of the data. They are employed for their capability to capture local spatial dependencies, which makes them effective in detecting specific malware signatures. Their ability to reduce dimensionality while retaining essential information improves the detection of malicious behavior.
The GAT model exploits the connectivity of graph-represented data, using attention mechanisms to dynamically weight the importance of each neighbor when information is propagated through the graph. GATs are suitable for malware detection because they exploit the graph structure of malware, modeling the relationships between different signatures. Using the attention mechanism, they weigh the importance of connections to identify malicious behavior in complex structures.
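To make the attention mechanism of the GAT concrete, the sketch below computes the attention weights of one node over its neighbors in the standard GAT formulation: a LeakyReLU applied to the dot product of an attention vector with the concatenated node features, followed by a softmax over the neighbors. It assumes the feature vectors have already been linearly transformed; the function names, the slope of 0.2, and the toy vectors in the usage note are illustrative assumptions, not the configuration used in the paper.

```python
import math


def leaky_relu(x, slope=0.2):
    """LeakyReLU activation; the slope value is an assumption."""
    return x if x > 0 else slope * x


def attention_weights(h_i, neighbors, a):
    """Softmax-normalized attention of node i over its neighbor feature vectors.

    h_i and each neighbor vector are assumed to be already transformed features;
    a is the attention vector of length len(h_i) + len(h_j).
    """
    scores = []
    for h_j in neighbors:
        concat = list(h_i) + list(h_j)
        scores.append(leaky_relu(sum(w * v for w, v in zip(a, concat))))
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

For example, `attention_weights([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [0.5, 0.5, 1.0, -1.0])` assigns the larger weight to the first neighbor, and the weights always sum to 1.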

This experiment aims to compare these three models in malware detection, a field where data complexity and diversity demand robust and innovative solutions. Such a comparison will evaluate their performance on accuracy, strength, and ability to handle heterogeneous data, ultimately deciding the best approach for modern-day cybersecurity environments. Thus, we used four evaluation metrics to verify the performance of the models, which include accuracy, precision, recall, and the F1-score. All these are calculated by using the following equations:

$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$ (1)

$\text{Precision} = \frac{TP}{FP + TP}$ (2)

$\text{Recall} = \frac{TP}{FN + TP}$ (3)

$\text{F1-Score} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$ (4)
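Equations (1) to (4) can be computed from the four confusion-matrix counts as in the following sketch. The function name `classification_metrics` and the convention of returning 0.0 when a denominator is zero are assumptions made for illustration; the counts in the usage note are made-up values, not results from the paper.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1-score from confusion counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

For instance, with the hypothetical counts TP = 8, TN = 9, FP = 2, and FN = 1, the accuracy is 17/20 = 0.85, the precision is 8/10 = 0.8, the recall is 8/9, and the F1-score is 16/19.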

The main technologies and tools used to process the implementation and experimentation phases are presented and explained in Table 2.

4. Simulation and Results Discussions

In this section, we briefly present some metrics to evaluate our proposed architecture implemented in a basic environment and examine the fundamental security considerations underlying our suggested methodology in comparison with related works.

The outcomes derived for the ANN model, as depicted in Figure 6, Figure 7 and Figure 8, show a good trend of improvement in the performance metrics with 100 epochs. The accuracy evolution graph shows a continuous convergence between testing and training accuracy to 97.60%, which is a sign of the good generalization of the model. In the learning curve (loss), there is a steady decrease in the loss value, indicating effective learning without apparent signs of over-fitting. Lastly, the precision, recall, and F1-score curves confirm these observations, with consistent improvements and stabilization around high values after approximately 20 epochs, highlighting an excellent balance between precision and recall in malware detection.

The values obtained for the CNN model, as presented in Figure 9, Figure 10 and Figure 11, indicate a gradual increase in the performance measures over the 100 epochs. The accuracy plot shows stable performance in the early epochs, followed by a sharp improvement in both training and test accuracy after epoch 80, reaching 80%. The learning curve (loss) decreases steadily, indicating efficient training with no marked over-fitting. The precision, recall, and F1-score all improve, with precision stabilizing at higher values than the recall and F1-score. This implies that the CNN model's malware predictions are reliable, but that it needs further training to improve recall and overall balance.

The GAT model performs very well on all the metrics, as indicated in Figure 12, Figure 13 and Figure 14. Over the 100 epochs, accuracy improves steadily on both the training and test sets, and the model achieves 96.66% training accuracy and 96.84% test accuracy. The loss keeps decreasing, reaching 0.0774 at the final epoch. Precision, recall, and F1-score follow the same upward trend, all reaching values above 96%. An F1-score of 97.78% is particularly indicative of the good balance the model maintains between precision and recall, supporting its ability to correctly classify malware data with minimal loss.

4.1. Analysis and Discussion

The ANN model showed steady improvement during the training period. The reduction in the loss function and the steady rise in accuracy confirm that the model learned adequately without significant signs of over-fitting. The precision, recall, and F1-score plots corroborate this through adequate convergence in identifying true positives and true negatives. With 97.60% accuracy, this excellent performance suggests that the ANN is particularly well adapted to classification problems such as malware detection.

The CNN model also improved over training, although its recall and F1-score were less stable than those of the ANN. Although the loss curve decreased, the best accuracy was only 80%. This suggests that, while the model can classify malware, it could benefit from hyperparameter tuning or changes to the network structure in order to achieve a more stable balance between recall and precision and hence more consistent overall performance.

The GAT model was impressive, with precision rates of 96.84% and an F1-score of 97.78%. These results show not only strong generalization but also a robust balance between precision and recall. The decline in loss and the consistent increase in performance indicators show the efficiency of the GAT in malware classification, delivering high accuracy and responsiveness. However, despite this strong performance, applying a GAT is more complex because it requires graph-structured input, which could weigh against its adoption over simpler models like ANNs.

The comparative analysis of the three models, summarized in Table 3, confirms that the ANN model stands out as the most effective for malware detection, particularly due to its ability to maintain a good balance between precision and recall. To explain this outcome, we note that our dataset is structured and composed of various numerical and categorical features derived from malware. ANN models are particularly well suited for capturing complex and non-linear patterns, which aligns well with the nature of our feature-engineered malware dataset. This allowed the ANN to maintain a stable balance between precision and recall and achieve the highest generalization among the tested models. In contrast, the CNN and GAT models, while effective in other contexts, were less suited to the specific structure and dimensionality of our dataset. This stability makes the ANN well suited to mission-critical use cases such as protecting blockchain systems and the smart contract development environment. Blockchain transaction security relies on the continuous and stable monitoring of malware that may disrupt smart contracts. Because of its ability to generalize, the ANN can accurately identify malicious behavior in such systems, thus offering an effective means of protecting the integrity of data and processes in the blockchain environment.

4.2. Comparison with Other Studies

Table 4 gives a comparative analysis based on accuracy levels achieved in some recent research studies on malware detection, a topic closely related to the topic of our research work. We see that Venkatasubramanian et al. [41] achieved an accuracy level of 95%, while Abirind et al. [42] achieved a slightly higher level of 96.6%. These findings point toward excellent performance levels in solving present-day cybersecurity issues related to malware. It is important to note that although the studies listed in Table 4 operate in different domains such as IoT device data and general-purpose malware detection, they have a common goal of improving security against malware threats by using deep learning models. These works vary in terms of dataset types, model architectures, and evaluation settings; however, they all report performance outcomes using standard classification metrics, particularly accuracy. In this comparison, we focus on the accuracy metric as a common point of reference to position our approach.

Our proposed methodology, which combines blockchain technology with deep learning architectures to enhance the security of intelligent systems, achieved an accuracy of 97.6%. This exceeds the accuracy reported in the studies discussed above and supports the effectiveness of our hybrid approach, keeping in mind that those studies use different datasets and evaluation settings. By integrating the complementary capabilities of blockchain and deep learning, our model improves both security and performance within complex smart environments.

5. Conclusions

In this research, we proposed a security mechanism that integrates blockchain-based smart contracts with deep learning-based malware detection to enhance the security of smart systems. We examined the effectiveness of Artificial Neural Networks (ANNs), Convolutional Neural Networks (CNNs), and Graph Attention Networks (GATs) in detecting malware attacks targeting smart contracts. The ANN model achieved the highest accuracy (97.60%) with balanced key performance metrics, and is therefore the most suitable for efficient security throughout the smart contract life cycle. We also applied a GAN-based feature selection approach to optimize the dataset and improve model performance. The results of our experiments indicate that combining a blockchain-based smart contract with an AI-driven malware detection model using GAN-based feature selection offers a scalable and reliable way to counter cyberthreats and malicious activity in decentralized networks and smart systems. Future research will focus on further optimizing the deep learning models, on adaptive threat detection, and on scalability improvements for real-world deployments.

Author Contributions

Conceptualization, I.B., L.H. and K.C.; methodology, I.B., L.H. and K.C.; software, I.B., L.H. and K.C.; validation, I.B., L.H. and K.C.; formal analysis, I.B., L.H. and K.C.; investigation, I.B., L.H. and K.C.; resources, I.B., L.H. and K.C.; data curation, I.B., L.H. and K.C.; writing—original draft preparation, I.B., L.H. and K.C.; writing—review and editing, I.B., L.H. and K.C.; visualization, I.B., L.H. and K.C.; supervision, I.B., L.H. and K.C.; project administration, I.B., L.H. and K.C.; funding acquisition, I.B., L.H. and K.C. All authors have read and agreed to the published version of the manuscript.

Funding

The APC was funded by Ibn Tofail University.

Data Availability Statement

The original data presented in the study shall be made available upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Wen, Y.; Lu, F.; Liu, Y.; Huang, X. Attacks and countermeasures on blockchains: A survey from layering perspective. Comput. Netw. 2021, 191, 107978.
2. Bourian, I.; Sebbar, A.; Chougdali, K.; Amhoud, E.M. SSHCEth: Secure Smart Home Communications based on Ethereum Blockchain and Smart Contract. In Proceedings of the GLOBECOM 2023—2023 IEEE Global Communications Conference, Kuala Lumpur, Malaysia, 4–8 December 2023; pp. 2674–2679.
3. Imad, B.; Anass, S.; Mounir, A.; Khalid, C. Blockchain Based Smart Contract to Enhance Security in Smart City. In Proceedings of the 2024 11th International Conference on Wireless Networks and Mobile Communications (WINCOM), Leeds, UK, 23–25 July 2024; pp. 1–6.
4. Raju, K.; Ramshankar, N.; Shathik, J.A.; Lavanya, R. Blockchain assisted cloud security and privacy preservation using hybridized encryption and deep learning mechanism in iot-healthcare application. J. Grid Comput. 2023, 21, 45.
5. El Filali, C.; Bourian, I.; Chougdali, K. Privacy-Preserving and Access Control Scheme for IoT-Based Healthcare Systems Using Ethereum Blockchain. In Proceedings of the 2024 7th International Conference on Advanced Communication Technologies and Networking (CommNet), Rabat, Morocco, 4–6 December 2024; pp. 1–6.
6. Chen, Y.; Qiu, Y.; Tang, Z.; Long, S.; Zhao, L.; Tang, Z. Exploring the synergy of blockchain, IoT, and edge computing in smart traffic management across urban landscapes. J. Grid Comput. 2024, 22, 45.
7. Blockchain Market Size Report. 2023. Available online: https://www.marketsandmarkets.com/Market-Reports/blockchain-technology-market-90100890.html (accessed on 10 June 2025).
8. Gupta, R.; Kumari, A.; Tanwar, S. Fusion of blockchain and artificial intelligence for secure drone networking underlying 5G communications. Trans. Emerg. Telecommun. Technol. 2021, 32, e4176.
9. Nguyen, T.N. Smart contract: Revolutionizing transactions in the digital age. HPU2 Nat. Sci. Technol. 2024, 3, 30–38.
10. Kayade, P.; Pardeshi, A.; Patil, S.; Raut, P.; Shetkar, P.; Barhate, M. Decentralized Application using Blockchain. In Proceedings of the 2024 5th International Conference on Image Processing and Capsule Networks (ICIPCN), Dhulikhel, Nepal, 3–4 July 2024; pp. 906–911.
11. Rouhani, S.; Deters, R. Security, performance, and applications of smart contracts: A systematic survey. IEEE Access 2019, 7, 50759–50779.
12. Hu, B.; Zhang, Z.; Liu, J.; Liu, Y.; Yin, J.; Lu, R.; Lin, X. A comprehensive survey on smart contract construction and execution: Paradigms, tools, and systems. Patterns 2021, 2, 100179.
13. Hu, T.; Liu, X.; Chen, T.; Zhang, X.; Huang, X.; Niu, W.; Lu, J.; Zhou, K.; Liu, Y. Transaction-based classification and detection approach for Ethereum smart contract. Inf. Process. Manag. 2021, 58, 102462.
14. Zaidi, S.Y.A.; Shah, M.A.; Khattak, H.A.; Maple, C.; Rauf, H.T.; El-Sherbeeny, A.M.; El-Meligy, M.A. An attribute-based access control for IoT using blockchain and smart contracts. Sustainability 2021, 13, 10556.
15. Wu, C.; Xiong, J.; Xiong, H.; Zhao, Y.; Yi, W. A review on recent progress of smart contract in blockchain. IEEE Access 2022, 10, 50839–50863.
16. Li, D.; Wong, W.E.; Wang, X.; Pan, S.; Koh, L.S. Smart Contract Vulnerability Detection based on Static Analysis and Multi-Objective Search. arXiv 2024, arXiv:2410.00282.
17. Kumar, N.K.; Honnungar, N.V.; Prakash, M.S.; Lohith, J. Vulnerabilities in Smart Contracts: A Detailed Survey of Detection and Mitigation Methodologies. In Proceedings of the 2024 International Conference on Emerging Technologies in Computer Science for Interdisciplinary Applications (ICETCS), Bengaluru, India, 22–23 April 2024; pp. 1–7.
18. Aldweesh, A.; Alharby, M.; Mehrnezhad, M.; Van Moorsel, A. OpBench: A CPU performance benchmark for Ethereum smart contract operation code. In Proceedings of the 2019 IEEE International Conference on Blockchain (Blockchain), Atlanta, GA, USA, 14–17 May 2019; pp. 274–281.
19. del Castillo, M. The DAO Attacked: Code Issue Leads to $60 Million Ether Theft. Article Online, CoinDesk. 2016. Available online: https://www.coindesk.com/markets/2016/06/17/the-dao-attacked-code-issue-leads-to-60-million-ether-theft (accessed on 10 June 2025).
20. Singh, R.; Tanwar, S.; Sharma, T.P. Utilization of blockchain for mitigating the distributed denial of service attacks. Secur. Priv. 2020, 3, e96.
21. Soud, M.; Qasse, I.; Liebel, G.; Hamdaqa, M. Automesc: Automatic framework for mining and classifying ethereum smart contract vulnerabilities and their fixes. In Proceedings of the 2023 49th Euromicro Conference on Software Engineering and Advanced Applications (SEAA), Durres, Albania, 6–8 September 2023; pp. 410–417.
22. Kumar, A.S.L.; Mishra, S. Ransomware Criminal Smart Contract. In Proceedings of the 2024 IEEE International Conference on Blockchain (Blockchain), Copenhagen, Denmark, 19–22 August 2024; pp. 219–226.
23. Güler, O. A Model Design Using Blockchain and Smart Contracts Against Cyberattacks in Smart Home Systems. Acta Infologica 2024, 8, 11–22.
24. Khoa, T.V.; Son, D.H.; Nguyen, C.H.; Hoang, D.T.; Nguyen, D.N.; Trung, N.L.; Quynh, T.T.T.; Hoang, T.M.; Ha, N.V.; Dutkiewicz, E. Securing Blockchain Systems: A Novel Collaborative Learning Framework to Detect Attacks in Transactions and Smart Contracts. arXiv 2024, arXiv:2308.15804.
25. DeCusatis, C.; Gormanly, B.; Iacino, J.; Percelay, R.; Pingue, A.; Valdez, J. Cybersecurity Test Bed for Smart Contracts. Cryptography 2023, 7, 15.
26. Alkhalifah, A.; Ng, A.; Watters, P.A.; Kayes, A. A mechanism to detect and prevent Ethereum blockchain smart contract reentrancy attacks. Front. Comput. Sci. 2021, 3, 598780.
27. Motaghi, Z.; Yazdani, N.; Bahrak, B. A Framework for Collaborative Attack based on Criminal Smart Contract. arXiv 2020, arXiv:2010.12280.
28. Li, Z.; Wang, Y.; Wen, S.; Ding, Y. Evil chaincode: Apt attacks based on smart contract. In Proceedings of the International Conference on Frontiers in Cyber Security, Tianjin, China, 15–17 November 2020; Springer: Berlin/Heidelberg, Germany, 2020; pp. 178–196.
29. Sebastian-Cardenas, D.J.; Gourisetti, S.N.G.; Saha, S.S.; Khan, K.; Tillman, L.C.; Cali, U.; Hughes, T. Cybersecurity and privacy aspects of smart contracts in the energy domain. In Proceedings of the 2022 IEEE 1st Global Emerging Technology Blockchain Forum: Blockchain & Beyond (iGETblockchain), Irvine, CA, USA, 7–11 November 2022; pp. 1–6.
30. Swaminathan, K.; Saravanan, S. A Criminal Smart Contract for Distributed Denial of Service Attacks. In Proceedings of the 2021 6th International Conference on Communication and Electronics Systems (ICCES), Coimbatore, India, 8–10 July 2021; pp. 853–862.
31. Sayeed, S.; Marco-Gisbert, H.; Caira, T. Smart contract: Attacks and protections. IEEE Access 2020, 8, 24416–24427.
32. Pathak, J.P.; Singh, K.; Roy, S. Role of Artificial Intelligence and Blockchain on Cyber Security. Adv. Bus. Inf. Syst. Anal. 2024.
33. Alevizos, L. Automated cybersecurity compliance and threat response using AI, blockchain and smart contracts. Int. J. Inf. Technol. 2024, 17, 767–781.
34. Adeniyi, A.E.; Jimoh, R.G.; Awotunde, J.B.; Aworinde, H.O.; Falola, P.B.; Ninan, D.O. Blockchain for Secured Cybersecurity in Emerging Healthcare Systems; IET: London, UK, 2024; pp. 335–361.
35. ThamaraiSelvi, K.; Pushpalatha, A.; Chidambarathanu, K.; Wankhede, J.P.; Alagumuthukrishnan, S.; Sarveshwaran, V. SecureChainAI: Integrating Blockchain and Artificial Intelligence for Enhanced Security in IoT Environments. In Proceedings of the 2024 5th International Conference on Smart Electronics and Communication (ICOSEC), Trichy, India, 18–20 September 2024; pp. 781–789.
36. Ainur, J.; Gulzhan, M.; Amandos, T.; Venera, R.; Bulat, S.; Zauresh, Y.; Aizhan, S. The impact of blockchain and artificial intelligence technologies in network security for e-voting. Int. J. Electr. Comput. Eng. (IJECE) 2024, 14, 6723–6733.
37. Saleh, A.M.S. Blockchain for secure and decentralized artificial intelligence in cybersecurity: A comprehensive review. Blockchain Res. Appl. 2024, 5, 100193.
38. Ahakonye, L.A.C.; Nwakanma, C.I.; Kim, D.S. Tides of Blockchain in IoT Cybersecurity. Sensors 2024, 24, 3111.
39. Karnwal, V.; Chaurasia, A.; Agarwal, D. Blockchain Security and Privacy using Machine Learning and Internet of Things-A Review. In Proceedings of the 2024 First International Conference on Pioneering Developments in Computer Science & Digital Technologies (IC2SDT), Delhi, India, 2–4 August 2024; pp. 19–22.
40. Ramos, S.; Ellul, J. Blockchain for Artificial Intelligence (AI): Enhancing compliance with the EU AI Act through distributed ledger technology. A cybersecurity perspective. Int. Cybersecur. Law Rev. 2024, 5, 1–20.
41. Venkatasubramanian, M.; Lashkari, A.H.; Hakak, S. Federated Learning Assisted IoT Malware Detection Using Static Analysis. In Proceedings of the 2022 12th International Conference on Communication and Network security, Beijing, China, 1–3 December 2022; pp. 191–198.
42. Abirind, K.; Vijai, K.; Nandakumar, R. Malware Detection: A Comparison of Different Machine Learning and Deep Learning Networks. In Proceedings of the 2024 15th International Conference on Computing Communication and Networking Technologies (ICCCNT), Kamand, India, 24–28 June 2024; pp. 1–6.

Figure 3. The top features selected.

Figure 4. Discriminator loss graph.

Figure 5. Generator loss graph.

Figure 6. ANN accuracy.

Figure 7. ANN loss.

Figure 8. ANN performance scores.

Figure 9. CNN accuracy.

Figure 10. CNN loss.

Figure 11. CNN performance scores.

Figure 12. GAT accuracy.

Figure 13. GAT loss.

Figure 14. GAT performance scores.

Table 1. Overview of the main cyberattacks on smart contracts (SCs).

Ref.	Attack	Vulnerability	Target	Method Used
[22]	Ransomware via SC	SC Automation	Data	Criminal SCs
[23]	Cyberattacks	Insufficient Security	Smart Home	SC
[24]	Transaction attacks	SC	Transactions	Collaborative learning
[25]	Social engineering attacks	Unsecure code	SC	Test bed
[26]	Reentrancy attacks	Reentrancy vulnerabilities	Ethereum SC	Detection mechanism
[27]	Criminal SCs	SC Exploitation	SC	Criminal SCs
[28]	APT	Open system	Blockchain	Data theft
[29]	SC for energy	Attack vectors	SC	Best practices
[30]	DDoS	Open system	Resources	DDoS
[31]	Exploitation	Security patches	SC	Classification and analysis

Table 2. Experimental setup.

Technology	Description
	An open-source platform used to develop and manage data science and scientific computing libraries required for deep learning and data processing tasks.
	An open-source development environment used for writing, testing, and debugging Python v3.13.2 code, bundled with Anaconda.
	The main programming language used to implement deep learning models, GAN-based feature selection, and experiment pipelines.

Table 3. Comparison of model outputs.

Metrics	ANN	CNN	GAT
Accuracy	97.60%	80%	96.84%
Precision, recall, and F1-score	High value after 20 epochs	High value after 80 epochs	High value after 20 epochs

Table 4. Comparison with other related studies.

References	Objective	Accuracy Score
[41]	Propose a Federated Learning-based approach that employs a random forest model for detecting IoT malware samples.	95%
[42]	Evaluate the effectiveness of deep learning models for malware detection.	96.6%
Our work	Combined blockchain and deep learning architecture to improve security in smart systems.	97.6%

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
