Design of Efficient Based Artificial Intelligence Approaches for Sustainable of Cyber Security in Smart Industrial Control System

Alzahrani, Ali; Aldhyani, Theyazn H. H.

doi:10.3390/su15108076

by Ali Alzahrani ¹ and Theyazn H. H. Aldhyani ²,*

¹ Department of Computer Engineering, King Faisal University, P.O. Box 400, Al Hofuf 31982, Saudi Arabia
² Applied College in Abqaiq, King Faisal University, P.O. Box 400, Al Hofuf 31982, Saudi Arabia
* Author to whom correspondence should be addressed.

Sustainability 2023, 15(10), 8076; https://doi.org/10.3390/su15108076

Submission received: 7 March 2023 / Revised: 5 May 2023 / Accepted: 11 May 2023 / Published: 16 May 2023

(This article belongs to the Special Issue Information, Cybersecurity and Modeling in Sustainable Future)

Abstract:

Online food security, industrial environments, and sustainability-related industries handle highly confidential information and urgently need network traffic analysis to obtain the security information required to avoid attacks from anywhere in the world. The integration of cutting-edge technology such as the Internet of things (IoT) has resulted in a gradual increase in the number of vulnerabilities that may be exploited in supervisory control and data acquisition (SCADA) systems. In this research, we present a network intrusion detection system for SCADA networks that is based on deep learning. The goal of this system is to defend industrial control systems (ICSs) against network-based attacks that are both conventional and SCADA-specific. This paper reports an empirical evaluation of a number of classification techniques, including k-nearest neighbors (KNN), linear discriminant analysis (LDA), random forest (RF), convolutional neural network (CNN), and integrated gated recurrent unit (GRU). The suggested algorithms were tested on data from genuine industrial control (SCADA) systems, namely the WUSTL-IIoT-2018 and WUSTL-IIoT-2021 datasets. We developed a re-training method to handle previously unforeseen instances of network attacks, which allows SCADA system operators to augment the proposed machine learning and deep learning models with site-specific network attack traces. The empirical results, using realistic SCADA traffic datasets, show that the proposed machine learning and deep-learning-based approach is well-suited for network intrusion detection in SCADA systems, achieving high detection accuracy and providing the capability to handle newly emerging threats. The KNN and RF algorithms attained the best accuracy, with a near-perfect score of 99.99%, whereas the CNN-GRU model scored an accuracy of 99.98% using WUSTL-IIoT-2018. The RF and GRU algorithms achieved >99.75% using the WUSTL-IIoT-2021 dataset.
In addition, a statistical analysis method was developed in order to anticipate the error that exists between the target values and the prediction values. According to the findings of the statistical analysis, the KNN, RF, and CNN-GRU approaches were successful in achieving an R² > 99%. This was demonstrated by the fact that the approach was able to handle previously unknown threats in the industrial control systems (ICSs) environment.

Keywords:

intrusion detection system; industrial control systems; deep learning; cybersecurity

1. Introduction

The term “control” is used to refer to both the physical components and the computer programs that are a part of an industrial control and automation control system (ICACS). In an industrial control system, some of the most important components are a human–machine interface (HMI), programmable logic controllers (PLCs), remote terminal units (RTUs), and distributed control systems (DCSs). The data from field sensors may be input into a SCADA system, which then enables us to run the system via the use of HMI software.

ICSs are essential to the operation of critical infrastructures, which provide important services such as water, electricity, and communications. The services provided by ICSs are the foundation of any efficient system for monitoring and administering production [1]. The monitoring component uses sensors to gather information and to verify that operations are functioning correctly and effectively. The controlling component, in turn, is responsible for making decisions that instruct actuators to follow certain courses of action. If this process is interfered with in any way, for instance because of a disruption in electrical power or communications, a large number of people may be adversely affected [2]. Figure 1 shows the process flow of ICS data in an Internet of things (IoT) environment.

An energy shield and SDN microSENSE are two innovative technologies that are designed to improve energy efficiency and sustainability. Energy shield is a smart energy management system that helps organizations reduce their energy consumption and costs by monitoring and controlling their energy usage in real time. On the other hand, SDN microSENSE is a software-defined networking solution that enables the efficient management of distributed renewable energy resources, such as solar panels and wind turbines. Both of these technologies have the potential to revolutionize the way we use and manage energy, making it more sustainable, cost-effective, and environmentally friendly [3,4].

A cyber-physical system (CPS) is defined as a system in which computing, information and communication technology functions, and physical processes and dynamics, interact in close coordination with each other to achieve a common goal [5,6,7]. Today’s culture is rife with CPSs. System sizes range from the very small to those that cover whole countries. Perhaps the most intricate and important CPS is the smart grid.

SCADA is a type of ICS that is utilized in a wide variety of industries for the purpose of monitoring and controlling operations such as oil and gas pipelines, water distribution, electrical power grids, and other similar operations. These kinds of configurations make it possible to automate the administration of frequently used services and monitor them remotely. Water pressure in pipes, water storage levels, and water distribution are all things that may be managed and controlled using SCADA systems, which are employed by state and municipal governments. In a SCADA system, it is typical to have computing workstations, a HMI, PLCs, sensors, and actuators [8,9,10,11]. In the past, these sorts of systems were dependent on solitary and specialized network configurations. As a result of the extensive implementation of remote management, however, open IP networks (such as the Internet) are increasingly used for the communication of SCADA systems. This renders SCADA systems vulnerable to assaults originating in cyberspace. The methodologies of machine learning (ML) and artificial intelligence (AI) have been used extensively in the development of intelligent and efficient intrusion detection systems (IDSs) for ICS. Network traces obtained from publicly available datasets are often used by academics in the process of training machine-learning-based security solutions. However, these datasets are not adequate to protect the system against new types of attacks since malware changes and attack strategies vary. As a consequence, the benchmark datasets need to be updated on a regular basis [12,13].

While cybersecurity solutions for information technology (IT) have been created and improved upon for some time, operational technology (OT) cybersecurity has received far less attention. While the accessibility of sensitive data is highly appreciated in the operational technology and information systems (OT-ICS) sector, the security of sensitive data is of the utmost significance in the IT business. OT systems are more susceptible to assaults because their cultures do not prioritize cybersecurity, and the digital world is becoming more widespread. While identity theft, monetary loss, invasions of privacy, and the exposure of sensitive information are all examples of threats to IT, risks to OT include dangers to human health and the environment. In contrast to IT networks, which are equipped with stringent security regulations, OT networks presently have security policies and standards that are applied in a haphazard manner [14]. While the IT industry has standardized on things like email, the Internet, and video, among other things, the OT industry’s SCADA, HMI, and DCS applications and protocols are highly personalized.

When it comes to traditional IT, IDSs have proven to be a reliable security tool for the detection of anomalies. These systems examine all incoming and outgoing network traffic for indications of security breaches and then take appropriate action if they are discovered. A warning signal will go out in the event that a match is not found. Network intrusion detection systems (NIDS) scan the whole network and identify malicious traffic behavior, whereas host-based intrusion detection systems (HIDS) focus on analyzing a single host and monitoring all system events. Although it is true that IT security systems and IDSs may work together, the fact is that IT systems simply cannot offer the degree of protection that is necessary in industrial settings. This is because IT systems are not designed to do so. Nevertheless, the dangers posed by cyberattacks on ICSs are growing at a worrying rate. As a result of the essential nature of many ICSs, a breakdown could potentially put people in danger, put the environment at risk, or decrease overall production [15].

When it comes to detecting SCADA breaches, the standard procedure calls for a signature-based IDS. It identifies patterns in the traffic data that indicate malicious conduct, which may subsequently be used as policy rules in intrusion detection systems such as Snort. A significant drawback of signature-based solutions is that they are unable to recognize zero-day threats [16]. The anomaly-based technique instead establishes a profile of standard operating behavior. The IDS then detects attacks by quantitatively examining deviations from that behavior. Zero-day threats may be discovered with the anomaly detection approach [17], because the IDS is able to learn the typical pattern of behavior under regular conditions [18]. Many different methods and techniques have been developed for detecting irregularities in ICSs. The design of IDSs has made significant progress, but there is still a lot of work to be done [19]. Because of their computational efficiency, ML techniques are widely applied in numerous disciplines. Based on features learned during training, these techniques primarily perform classification, clustering, and regression [20]. The support vector machine (SVM), k-nearest neighbor, and decision tree are three of the most widely used machine learning methods. For example, in [21], the authors built a model for cybercrime categorization using SVM, an ML approach that has been employed for classification applications. Ref. [22] proposed a text classification model based on naive Bayes and k-nearest neighbors. Refs. [23,24] developed a model for anomaly detection that may be used to identify and categorize cyber-physical system threats.
In contrast to deep learning (DL) algorithms, which streamline feature extraction, ML algorithms detail the procedures used to arrive at a conclusion.
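
The anomaly-based idea described above, learning a profile of normal behavior and flagging quantitative deviations from it, can be illustrated with a minimal sketch. It is not the detection method used in this paper: the single traffic feature, the sample readings, the function names, and the threshold of three standard deviations are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def fit_baseline(normal_values):
    """Learn a simple profile (mean, standard deviation) from attack-free traffic."""
    return mean(normal_values), stdev(normal_values)


def is_anomalous(value, baseline, k=3.0):
    """Flag a value lying more than k standard deviations from the baseline mean."""
    mu, sigma = baseline
    if sigma == 0:
        # A constant baseline: any different value counts as a deviation.
        return value != mu
    return abs(value - mu) > k * sigma


# Hypothetical packets-per-second readings captured during normal operation.
normal_rate = [100, 102, 98, 101, 99, 100, 103, 97]
baseline = fit_baseline(normal_rate)

print(is_anomalous(101, baseline))  # close to the learned profile
print(is_anomalous(450, baseline))  # far outside the learned profile
```

A signature-based IDS would instead compare traffic against a fixed list of known attack patterns, which is why it cannot flag behavior it has never seen.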

The dearth of studies reviewing SCADA-specific intrusion detection systems built with machine learning and deep learning techniques inspired us to conduct this study. Earlier work has largely failed to address the limitations and difficulties of employing DL strategies to safeguard SCADA systems. In contrast to that research, we focus on the challenges of constructing DL-based IDS models for SCADA systems, as well as the datasets utilized to train these models. Our study adds to the body of literature and helps to fill a gap in the field.

The originality of this review article lies in the fact that it concentrates on contemporary applications that are based on artificial intelligence techniques and are employed in IDSs that are implemented in ICS SCADA systems. The following are, nonetheless, the primary contributions made by this paper:

Ensuring the safety of SCADA networks by using machine learning (ML) and deep learning (DL).
Examining the performance of ML and DL on the two public SCADA datasets that are the most often used for IDS training and assessment.
Utilizing statistical analysis to investigate potential relationships between the characteristics of the network in order to enhance classification algorithms.
Discussing the outstanding problems in the development of SCADA-IDS in order to set the scene for the creation of a robust and efficient IDS using DL algorithms.

2. Background

There has been research showing that SIEM (security information and event management) systems can be used in ICS/SCADA environments. The structure and methodology proposed by the authors [25,26] are innovative and comprehensive; they have already been used to analyze the condition of a system using data from several sources. Agents at the correspondent and administration layer are constrained by the scalable method, which employs a decentralized strategy for a safe ICS. Another study [27] shows how important it is to protect the ICS/SCADA system from cyber threats to avoid potential dangers. Signature-based abuse detection systems built using various ML approaches to counteract a wide range of intrusions were the subject of a comprehensive study published by the authors [28,29]. Authors [30] also look at honeypots, which function in a similar way by mimicking critical smart grid functions in order to identify illegal access.

A number of studies have been published as a result of research projects that aimed to determine ways to identify attacks on ICSs. Rule-based IDSs and deterministic finite automata (DFA) are two kinds of systems that come under this category. At this point in time, the primary focus of research endeavors is on the development of innovative methodologies that make use of cutting-edge technologies such as big data and ML/DL. ML and DL are two types of AI that are becoming more useful in the industrial sector for the purpose of detecting cyberattacks. Here, we take a look at 16 of the most prominent articles on anomaly detection in commercial contexts using ML and DL.

When conducting research on ML algorithms, university ICSs are more likely to make use of datasets that are open to the general public. The following is a summary of some of the current issues that might arise while utilizing a public database. Only components that are based on the Modbus/TCP protocol suite [31] are allowed to be utilized in the implementation of the recommended architecture. At the time [32] was published, the only ML algorithms that had been developed as a direct consequence of online testing activities were multi-ML algorithms. This was the case despite the fact that very little data had been acquired during these tests. The database used in [33] suffered from the same issue, in that it did not consider contemporary threats. In contrast, the dataset presented in [34] only included a single incidence and was of a very low quality overall (about 1000 instances). According to a second source, the database was not up to date, and the assaults were linked to the IT industry [35]. At the Singapore University of Technology, a testbed for the purification of water was built [36]. This includes assault scenarios and SCADA network traffic, both of which are pertinent to the water treatment sector. Recent years have seen the development of real-time datasets for the purpose of training and testing ML algorithms. These datasets contain instances of both common cyberattacks as well as 35 other types of cyberattacks. For the purpose of determining which algorithms are most effective, only supervised ML techniques have been used. The data demonstrate that algorithms have a significant potential for producing false positives, which is in line with the conclusions drawn from the study. The quality of results obtained by applying ML algorithms to a publicly available dataset is heavily reliant on the quality of the dataset [37]. 
While the payload and data frame are handled via the dataset labels, attacks are purposefully randomized and parameterized to simulate the operational and assault environment [38].

Rosa et al. [39] provide a comprehensive analysis of the effects of cyberattacks on a simulated power system. A mixed-architecture testbed is comprised of the SCADA assets (PLCs, HMIs, process control servers, and the like) that are used in the simulation of the power grid. In the study, their assaults are broken down in depth, and some of the challenges that an attacker might have in putting those attacks into action are investigated. One such strategy is an attack on the reconnaissance network. The authors claim that this kind of attack may be used to not only detect devices and service types but even PLCs that are concealed behind gateways. Therefore, in order to discover their existence in our work, we made use of advanced reconnaissance methods and ML algorithms.

Keliris et al. [40] created a process-aware supervised learning protection approach for ICSs that takes into consideration the operational behavior of an ICS. This method may be used for real-time ICS attack detection. We used a reference chemical technique, and many different kinds of assaults on the hardware controllers were taken into consideration. Their trained SVM model was used to identify abnormalities in real time and distinguish between safe and risky behaviors. Within the scope of this research, five different ML algorithms were applied to identify abnormal behavior in real time, and the efficiency of these methods was evaluated. Tomin et al. [41] provided a semi-automated way for assessing network security in the digital environment. They accomplished this by making use of ML algorithms. They explained their work at the Melentiev Energy Networks Institute in Russia, where they created ML-based approaches for diagnosing unstable power networks. This study was carried out in Russia. Using resampling and cross-validation, a number of ML algorithms were trained offline. After that, the ML approach with the best results was chosen for online deployment. They assert that the challenges of predicting and maintaining the security of future industrial systems may be circumvented with the use of ML techniques. Cherdantseva et al. [42] recently looked at the question of how cybersecurity risks should be evaluated for SCADA systems. Despite the widespread application of ML strategies in the field of ICS security research, it appears that there is a shortage of standardized datasets for the purpose of training and testing ML algorithms. These findings are based on the findings of this analysis, which suggests that this shortage exists. It is challenging to develop trustworthy ML models for anomaly detection in ICS without having consistent data to work with. 
We used the testbed that was described in this study to develop a new dataset that can be used for training and testing ML algorithms.

A wide variety of DL and ML methods, including convolutional neural networks (CNNs) and CNNs combined with long short-term memory networks (CNN-LSTM), have been reported in the academic literature. According to [43], unsupervised ML might be used to detect anomalous behavior in cyber-physical systems (CPSs). The authors looked at support vector machine (SVM) and deep neural network (DNN) algorithms since they were developed specifically to deal with time-series data. They evaluated 23 different approaches, and these are two of them. They used the mean and standard deviation from the validation dataset to scale the data. In [44], it was proposed that autoencoders (AEs) and one-dimensional convolutional neural networks (CNNs) might be used as DL techniques for detecting anomalies in ICSs. To further narrow down the offered DL models to those most suited for anomaly identification, the authors proposed filtering the models’ properties. They developed a method of feature extraction that used the discrete Fourier transform (DFT) to enable feature calculations in the frequency domain. However, some features could not be extracted using the DFT because it considered only the strongest energy bands, thereby discarding important information from the remaining signal. The authors also used a threshold for anomaly detection that was calculated using the mean and standard deviation of the training and testing datasets, respectively. In [45], which focuses on massive amounts of real-time data, the authors provide an unsupervised technique for finding outliers. This system comprised three main parts: update triggers, tree growth, and mass weighting algorithms. The authors of [46] reported the application of clustering analysis to expose the underlying patterns in the dataset in order to carry out an additional unsupervised anomaly identification. The next step that the researchers took was to extract characteristics from the clusters by using cluster intra-distances and inter-distances.
In conclusion, an inference procedure was utilized in order to ascertain whether or not an irregularity had been triggered. An unsupervised learning method that was founded on stacked denoising AEs was developed by the authors of the paper [47]. Because their approach made use of the existing network stream, it did not call for any specialized knowledge or abilities on the part of the user.

The anomaly detection approach begins with an estimation of the discrepancies between the abnormal profile and the normal profile. There are several different types of data sources that may be used to generate the normal profile. As a data source for modeling the typical communication, the authors of [48,49,50] took advantage of the network traffic that occurs within ICS. In order to model packets that were transferred between devices for the purpose of intrusion detection, the Hidden Markov model (HMM) [51] was utilized. In addition, ref. [52] suggested a technique for learning the Modbus/TCP traffic transactions with the request message as the sole data source. The authors of [53,54,55] used deep learning and machine learning to detect attacks from the SCADA environment—the IIoT—in which feature selection techniques are influenced by biological systems. A metaheuristic technique for feature selection and deep learning for classification was proposed by Keserwani et al. [56] to detect attacks on a virtualized cloud network. Important characteristics from the cloud network connections are detected using a mix of Gray Wolf Optimization (GWO) and PSO, and then classified using a deep sparse auto-encoder. In this paper, the authors extend their earlier work on fetch attacks in the Internet of things (IoT) from [57]. To further enhance the accuracy of threat detection, the hybrid GWO-PSO is also used to extract crucial aspects of an IoT network. On the KDDCup99, NSL-KDD, and CICIDS2017 datasets, the proposed model scored an impressive 99.66% in terms of accuracy.

This dataset was also used by Awotunde et al. [58], together with the NSL-KDD dataset, to develop a hybrid rule-based feature selection approach. To collect useful information that may be used to build an intelligent NIDS (i.e., information extracted from TCP/IP packets), the proposed study integrates a deep feedforward neural network model with rule-based feature selection and IIoT applications. Using a rule-based approach for feature selection and a genetic tool to develop the most valuable traits, this study proposes a three-stage technique for intrusion detection in IIoT systems. The last step is to feed the features into the ANN so it may utilize them throughout its training. To evaluate the efficacy of their suggested IDS method, the authors of [59] used the Aquila optimizer (AQU) to pick features from the CIC2017, NSL-KDD, BoT-IoT, and KDD99 datasets. The purpose of this study was to extract useful features from the datasets used by using a lightweight feature extraction technique based on CNN. The AQU technique is then used to choose a collection of features representative of the dataset’s attributes. Most SCADA applications rely on the Modbus protocol, and its insecurity has prompted the authors of [60] to propose a secure Role-Based Access Control (RBAC) architecture to grant permission to both the client and the Modbus frame. After verifying certificates on both ends of the connection, the system was able to authenticate using the Transport Layer Security (TLS) protocol.

The authors of [61] suggested a unique attribute-based access control (ABAC) that is more adaptable to fulfill the demands of IoT use cases, such as smart devices, and make the data exchange more secure in a cloud-IoT context, therefore offsetting some of the drawbacks of RBAC.

In [62], authors describe yet another ABAC-based architecture for controlling public IoT infrastructure in modern metropolises. In this architecture, users establish smart contracts with various entities to store their attributes and make authorization requests. Each attribute’s value is determined by the total trust of all authorizing entities, which is determined at the moment of access. Access control problems in the IIoV are addressed formally, where an ABAC model called the attribute-based access control system (ITS-ABACG) is offered. The suggested approach introduces the idea of groups, which may be used to classify various intelligent entities according to characteristics like location, direction, speed, and others.

3. Materials and Methods

A modern ICS is a complex system that relies on a wide variety of components and technologies to monitor and control physical processes. In addition, a modern ICS is responsible for a significant number of the managerial, administrative, and regulatory responsibilities that are associated with carrying out this task. The OT that ensures the availability and safety of crucial processes is at the core of ICSs. IT has been implemented into contemporary ICSs in order to fulfill the system functions that are required in the overall system. Therefore, developing a security system that can help to protect the ICS is becoming necessary for the industrial sector. It is challenging to implement preset security techniques due to the wide range of communication software that is now available. A learning-based application is required in order to construct a secure application that takes into consideration heterogeneity and diversity. Applications for learning-based security should be built on the most relevant learning models possible. During this research, we developed learning-based security solutions that are capable of taking changes into account and protecting the ICS. Figure 2 displays the SCADA based on the cybersecurity system framework.

3.1. Dataset

3.1.1. WUSTL-IIoT-2018 Dataset

The dataset that was used for our study on SCADA cybersecurity is presented here. Our SCADA system testbed, which is discussed in [3], was used in the construction of the dataset. The objective of our testbed was to provide a realistic simulation of industrial systems in the actual world. Consequently, we were able to simulate authentic cyberattacks. In this research, the emphasis was on reconnaissance attacks, which include searching a computer network for potential security flaws that might then be exploited in subsequent assaults. We utilized scan tools to examine the topology of the victim network, which was our testbed in this instance. This allowed us to determine the devices that were connected to the network as well as the vulnerabilities that each device had. The features of the SCADA dataset are presented in Table 1. The number of instances of attacks and normal classes are shown in Figure 3. The dataset had a total of 1,048,575 rows and six characteristics.

3.1.2. WUSTL-IIoT-2021 Dataset

For the purpose of cybersecurity research, the WUSTL-IIoT-2021 dataset contains network data collected from the IIoT in the industrial sector. Our IIoT testbed is used to create the dataset. Our testbed is designed to be as realistic as feasible in its simulation of industrial systems, with the added benefit of facilitating actual cyberattacks. We have amassed 2.7 GB of data, equivalent to around 53 h of recording time. We have cleaned the dataset by removing severe outliers, erroneous entries, and rows with missing or damaged data. The dataset that we used and uploaded is a subset of that, and it is a little over 400 MB in size. Table 2 shows samples of the WUSTL-IIoT-2021 dataset.

Figure 4 is a feature correlation diagram, created to give a deeper understanding of the interrelationships between the attributes. The image represents the degree of correlation between the seven dimensions. A correlation graph may be set up using feature X as the x-axis and feature Y as the y-axis to further explore whether features X and Y are correlated in the plane distribution. To facilitate the study and presentation of the material, this work separates the features into two groups: those with correlation indices greater than or equal to 0.9 and those with correlation indices below 0.9. It is clear that the features are highly correlated with one another.
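
As a rough illustration of how feature pairs can be split at a 0.9 correlation threshold, the sketch below computes Pearson correlation coefficients in plain Python. The column names and values are hypothetical and are not taken from the WUSTL-IIoT datasets, and comparing the absolute value of the coefficient with the threshold is an assumption, since the text does not say how negative correlations are treated.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no correlation information
    return cov / sqrt(var_x * var_y)


def split_by_correlation(features, threshold=0.9):
    """Split feature pairs into strongly and weakly correlated groups."""
    strong, weak = [], []
    for (name_a, col_a), (name_b, col_b) in combinations(features.items(), 2):
        r = pearson(col_a, col_b)
        # Assumption: the sign is ignored, so strong negative correlation counts.
        group = strong if abs(r) >= threshold else weak
        group.append((name_a, name_b, round(r, 3)))
    return strong, weak


# Hypothetical feature columns, for illustration only.
features = {
    "total_packets": [10, 20, 30, 40, 50],
    "total_bytes": [105, 198, 310, 402, 495],
    "duration": [5, 1, 4, 2, 3],
}

strong, weak = split_by_correlation(features)
print("strong:", strong)
print("weak:", weak)
```

Pairs in the strongly correlated group carry largely redundant information, which is one reason such an analysis is useful before training a classifier.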

3.2. Data Preprocessing

Before employing any ML and DL applications, the first and most critical step is to preprocess any data that will be used. Eliminating unwanted data, converting data, scaling data, deleting erroneous data, and so on are all activities within the larger context of data preparation. This framework now allows users to compare inputs with comparable attributes using both traditional and min-max scaling of unlabeled data. After that, similar inputs are sorted together into the same category. To put it another way, the algorithm looks for hidden patterns in the input data in order to make predictions about how it will react to test inputs. These predictions are made on the basis of the patterns found. Figure 5 shows the preprocessing steps.

The one-hot encoding approach was used to convert the categorical classes, namely normal and attack. The min-max technique, one of the most common approaches to normalizing data, was employed in the normalization process. For each feature, the smallest value is transformed into 0, the largest value is transformed into 1, and every other value is transformed into a decimal between 0 and 1. Min-max normalization is applied using the following equation:

\mathrm{Norm}' = \frac{\mathrm{norm} - \min (B)}{\max (B) - \min (B)}

(1)

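Here, norm is the original feature value, B is the set of values observed for that feature, and min(B) and max(B) are its minimum and maximum; the scaled values lie in the range [0, 1].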
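
The two preprocessing steps described above, min-max scaling as in Equation (1) and one-hot encoding of the class labels, can be sketched in a few lines of plain Python. The function names and sample values are illustrative assumptions, not the authors' implementation; mapping a constant column to 0.0 is likewise a choice made here to avoid division by zero.

```python
def min_max_scale(column):
    """Scale a numeric column to [0, 1] using (v - min) / (max - min)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant column is mapped to 0.0 to avoid dividing by zero.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def one_hot(labels):
    """Return the sorted class list and one indicator vector per label."""
    classes = sorted(set(labels))
    index = {name: i for i, name in enumerate(classes)}
    vectors = [
        [1 if index[label] == j else 0 for j in range(len(classes))]
        for label in labels
    ]
    return classes, vectors


# Hypothetical feature column and class labels.
packet_counts = [12, 40, 26, 68]
labels = ["normal", "attack", "normal", "normal"]

print(min_max_scale(packet_counts))  # [0.0, 0.5, 0.25, 1.0]
print(one_hot(labels))
```

In practice the minimum and maximum should be taken from the training split only and then reused for the test split, so that no information from the test data leaks into preprocessing.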

3.3. Classification Algorithms

This section presents the theoretical underpinnings of the machine learning and deep learning methods used in this investigation. Machine learning has attracted a lot of attention recently, and engineers have been using ML and DL models to solve all sorts of real-world problems. Popular ML techniques for intrusion detection include the k-nearest neighbors (KNN) and random forest (RF) algorithms.

3.3.1. Linear Discriminant Analysis (LDA)

LDA is a method for reducing the number of dimensions in a dataset. In ML and other applications that include pattern classification, it is performed as a step in the preprocessing phase. The objective of LDA is to decrease the amount of resources needed, as well as the dimensional costs, by mapping the characteristics of higher-dimensional spaces onto lower-dimensional spaces. This helps to avoid the problem known as the “curse of dimensionality”. Figure 6 shows the LDA technique for analyzing normal and abnormal packets, with the red line denoting a linear separation between the black circle class and green circle class.
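As a rough illustration of the linear separation described above, the following sketch computes Fisher's linear discriminant for two classes in two dimensions using only the Python standard library. The sample points, the class names, and the midpoint threshold are assumptions made for the example; the paper does not give these details.

```python
def mean(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]


def scatter(points, m):
    """2x2 within-class scatter matrix: sum of (x - m)(x - m)^T."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - m[0], p[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_direction(class0, class1):
    """Return w = Sw^{-1} (m1 - m0) together with both class means."""
    m0, m1 = mean(class0), mean(class1)
    s0, s1 = scatter(class0, m0), scatter(class1, m1)
    sw = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    return w, m0, m1


def project(w, p):
    """Map a 2-D point onto the 1-D discriminant axis."""
    return w[0] * p[0] + w[1] * p[1]


# Hypothetical, already-scaled feature pairs for each class.
normal = [(1.0, 2.0), (2.0, 1.0), (2.0, 3.0), (3.0, 2.0)]
attack = [(6.0, 5.0), (7.0, 4.0), (7.0, 6.0), (8.0, 5.0)]

w, m0, m1 = fisher_direction(normal, attack)
# Assumed decision rule: threshold halfway between the projected class means.
threshold = (project(w, m0) + project(w, m1)) / 2


def classify(p):
    return "attack" if project(w, p) > threshold else "normal"


print(w)                     # [1.25, 0.75]
print(classify((2.5, 2.5)))  # normal
print(classify((6.5, 5.0)))  # attack
```

Each two-dimensional point is reduced to a single projected value, which is the dimensionality reduction the section refers to.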

3.3.2. k-Nearest Neighbors Algorithm (KNN)

The k-nearest neighbor (KNN) algorithm is a form of instance-based learning. One of the most straightforward ML algorithms, KNN follows the supervised learning approach. The method assumes that a new case is similar to existing instances and places it in the category it most resembles among those already available. KNN stores all of the available data and categorizes a new data point according to how similar it is to the stored data, so newly available data can be assigned to an appropriate category rapidly. The KNN method may be used for both classification and regression; however, it is more often used for classification. The technique is non-parametric, which means that it makes no assumptions about the underlying distribution of the data. KNN builds the test tuple's nearest-neighbor list by computing the distance, z_{i}, between the test tuple, z = (a, c), and each of the training tuples, which are denoted by D. For a test tuple (a_{1}, c_{1}) and a training tuple (a_{2}, c_{2}), the Euclidean distance is given by Equation (2), and the test tuple is then categorized by a majority vote among its nearest neighbors:

z_{i} = \sqrt{{(a_{1} - a_{2})}^{2} + {(c_{1} - c_{2})}^{2}}

(2)
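The distance computation and majority vote can be sketched as follows. This is a minimal illustration, assuming K = 5 and Euclidean distance as stated in the Results section; the training tuples and the function names are hypothetical.

```python
import math
from collections import Counter


def euclidean(p, q):
    """Euclidean distance between two equal-length feature tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def knn_predict(train, query, k=5):
    """Label `query` by majority vote among its k nearest training tuples.

    `train` is a list of (features, label) pairs.
    """
    neighbors = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical scaled training tuples (a, c) with their classes.
train = [
    ((0.10, 0.20), "normal"), ((0.20, 0.10), "normal"), ((0.15, 0.25), "normal"),
    ((0.80, 0.90), "attack"), ((0.90, 0.80), "attack"), ((0.85, 0.95), "attack"),
]

print(euclidean((0, 0), (3, 4)))         # 5.0
print(knn_predict(train, (0.12, 0.18)))  # normal
print(knn_predict(train, (0.88, 0.86)))  # attack
```

Sorting every training tuple for each query is the simplest approach and costs O(n log n) per prediction; library implementations typically use spatial indexes instead.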

3.3.3. Random Forest (RF)

Decision trees are a kind of algorithm that classifies instances according to the values of their features. Each node in a decision tree represents a feature of an instance that is awaiting classification. To partition the training data, a feature is chosen, and this feature becomes the root node. Following the same approach, the tree continues to grow deeper, creating sub-trees, until subsets containing a single class are generated. Decision trees have a number of drawbacks, one of which is diagonal partitioning: because each split tests a single feature, diagonal class boundaries are difficult to represent.

The RF method is gaining popularity and has been utilized extensively in the categorization of land use and land cover, notably for mangrove forests. The RF is an example of an ML algorithm that has the potential to significantly enhance the accuracy of pattern recognition. In ecological studies, RF has been used effectively on a number of occasions. Each decision tree presents computations based on the most dominant class unit for the purpose of classifying specific classes in accordance with the input training data. The information obtained from satellite data and field data is combined in order to serve as the foundation for the preparation of training data, which will later serve as input into the classification model. An RF classifier is constructed from a large number of classification trees. The classification tree is a kind of classifier that is indicated by an unlabeled input vector and a vector that was randomly produced by selecting random characteristics from the training data for each node. The random vector of distinct classification trees in the forest that are formed by the same distribution technique are not connected to one another in any way, but are generated randomly nevertheless. For data that has not been labelled, each tree will make a prediction or cast a vote, and this will result in labeling being completed. There are a large number of other algorithms, such as the J48 approach and others, that may be employed.
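The voting step of a random forest can be illustrated with a short sketch. The three "trees" below are hand-written single-feature rules standing in for trained decision trees, and the feature names and thresholds are invented for the example; they are not taken from the datasets used in the paper.

```python
from collections import Counter


def forest_predict(trees, sample):
    """Return the majority label and the vote counts over all trees."""
    votes = Counter(tree(sample) for tree in trees)
    return votes.most_common(1)[0][0], dict(votes)


# Hypothetical stand-ins for trained trees; each one tests a single feature.
trees = [
    lambda s: "attack" if s["packets_per_second"] > 100 else "normal",
    lambda s: "attack" if s["mean_packet_size"] < 60 else "normal",
    lambda s: "attack" if s["duration_s"] > 30 else "normal",
]

sample = {"packets_per_second": 250, "mean_packet_size": 48, "duration_s": 2}
label, votes = forest_predict(trees, sample)
print(label)  # attack
print(votes)  # {'attack': 2, 'normal': 1}
```

Using an odd number of trees for a two-class problem avoids tied votes. In a real forest, each tree would be trained on a bootstrap sample with a random subset of features considered at each node.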

3.3.4. Convolution Neural Network (CNN)

The convolutional layer, the pooling layer, and the output layer are the three primary components of the fundamental CNN architecture. The pooling layer is optional. In image classification, the traditional CNN structure, which consists of three convolution layers, is often used. It has one input layer, multiple hidden layers (including convolutional, pooling, and normalizing layers), and an output layer that is fully connected to the layer before it. Neurons in one layer are connected to neurons in the adjacent layers. Pooling and sub-sampling are carried out to reduce the dimensions of the input. The input images are passed to the CNN classifier as a collection of small sub-sections referred to as receptive fields. The convolution operations of the first layer, also known as the input layer, are used to calculate the response passed to the following layer [63,64,65,66]. Figure 7 illustrates the CNN's basic architecture.

Recurrent neural networks (RNNs) improve on their feed-forward predecessors for sequential data. Because RNNs can recall the input at each step, they can provide better results than feed-forward neural networks on such data. In an RNN, the neurons have their outputs linked to the inputs of both other neurons and themselves. As a result, RNNs are able to represent data sequences and time series by making use of their own internal memory. RNNs are often used in IDSs to extract temporal correlations between security threats and harmful actions (also known as temporal features), while CNNs are used to extract spatial characteristics.

The gated recurrent unit (GRU) is a structure that was developed to address the problem of vanishing or exploding gradients. It can be viewed as a simplified variant of the LSTM framework. Like an LSTM, a GRU contains a gate structure that controls the flow of information. In contrast to LSTM, however, GRU does not have an output gate, so its content is completely exposed. The GRU has just two gates: the reset gate and the update gate. The update gate combines the roles of the input and forget gates of the LSTM architecture. GRUs have a simpler structure and fewer parameters than LSTMs, which contributes to their performance. Figure 8 depicts the structure of the GRU.

r_{t} = σ (W_{r} x_{t} + μ_{r} h_{t - 1} + b_{r})

(3)

z_{t} = σ (W_{z} x_{t} + μ_{z} h_{t - 1} + b_{z})

(4)

h_{t} = (1 - z_{t}) ⊙ h_{t - 1} + z_{t} ⊙ \tanh (W_{h} x_{t} + μ_{h} (r_{t} ⊙ h_{t - 1}) + b_{h})

(5)

where r_{t} represents the reset gate and z_{t} represents the update gate at time t. The value x_{t} at a given time t is the input value, W and μ are weights, and b is the bias. The hidden state at time t is denoted by h_{t}.
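A single GRU time step following Equations (3) to (5) can be sketched in plain Python. In this illustration the recurrent weights μ are stored under the keys `U_r`, `U_z`, and `U_h`; the parameter dictionary, the helper names, and the all-zero weights used for the sanity check are assumptions made for the example, not the trained model from the paper.

```python
import math


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def add(*vectors):
    """Element-wise sum of equal-length vectors."""
    return [sum(items) for items in zip(*vectors)]


def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-x)) for x in v]


def gru_step(x_t, h_prev, p):
    """One GRU update; returns the new hidden state h_t."""
    # Equation (3): reset gate
    r = sigmoid(add(matvec(p["W_r"], x_t), matvec(p["U_r"], h_prev), p["b_r"]))
    # Equation (4): update gate
    z = sigmoid(add(matvec(p["W_z"], x_t), matvec(p["U_z"], h_prev), p["b_z"]))
    # Candidate state: tanh(W_h x_t + U_h (r * h_prev) + b_h)
    gated = [ri * hi for ri, hi in zip(r, h_prev)]
    cand = [math.tanh(v) for v in
            add(matvec(p["W_h"], x_t), matvec(p["U_h"], gated), p["b_h"])]
    # Equation (5): blend the previous state with the candidate state
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h_prev, cand)]


def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]


# Sanity check with input size 3 and hidden size 2. With all weights zero,
# both gates equal 0.5 and the candidate is 0, so the state is simply halved.
params = {
    "W_r": zeros(2, 3), "U_r": zeros(2, 2), "b_r": [0.0, 0.0],
    "W_z": zeros(2, 3), "U_z": zeros(2, 2), "b_z": [0.0, 0.0],
    "W_h": zeros(2, 3), "U_h": zeros(2, 2), "b_h": [0.0, 0.0],
}
h_t = gru_step([0.3, 0.1, 0.7], [1.0, -2.0], params)
print(h_t)  # [0.5, -1.0]
```

Processing a sequence amounts to calling `gru_step` once per time step and feeding each returned state back in as `h_prev`.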

As in LSTM, σ and tanh denote the sigmoid and hyperbolic tangent activation functions. Both LSTM and GRU can handle long-range dependencies; nonetheless, they differ in operational efficacy. We used both frameworks in this research to compare and contrast their ability to detect intrusion.

When detecting attacks, traditional intrusion detection models focus more on the time-series characteristics of the traffic and overlook its spatial characteristics. The CNN structure excels at extracting the spatial features and short-range dependencies of the data traffic, but it struggles to extract long-range dependent information. The GRU structure, on the other hand, captures long-range dependencies well and can prevent forgetting during the learning process, but it has a larger number of parameters and takes longer to train. This research combines the two to increase the model's feature-learning capacity, allowing for complete feature extraction across the spatial and temporal dimensions and improved classification accuracy.

The suggested model for network intrusion detection, which combines CNN and GRU, is known as the CNN-GRU model. This model comprises three primary stages. The first is the preprocessing stage, in which the initial input is normalized. The second is the training stage, during which the convolutional block assigns varying weights to the various features of the preprocessed data. The CNN module extracts the spatial features, after which the spatial information is aggregated further by combining average pooling and max pooling. The temporal properties are then extracted by a number of GRU units. The third is the testing stage, during which the test set is fed into the trained model, and the final classification is carried out using the Softmax function. Figure 9 illustrates the basic framework of the model discussed in this research.
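The spatial part of the pipeline and the final Softmax step can be sketched as below. This is an illustration under several assumptions: a single hand-picked kernel of length 5 replaces the 32 learned filters, the two pooled values are simply concatenated (the paper does not state how average and max pooling are combined), and the GRU stage between pooling and Softmax is omitted.

```python
import math


def conv1d(signal, kernel):
    """'Valid' 1-D convolution in the cross-correlation form used by CNNs."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def global_max_pool(feature_map):
    return max(feature_map)


def global_avg_pool(feature_map):
    return sum(feature_map) / len(feature_map)


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


signal = [0, 0, 0, 0, 2, 4, 4]  # hypothetical preprocessed feature sequence
kernel = [-1, 0, 0, 0, 1]       # hypothetical filter of size 5

feature_map = conv1d(signal, kernel)
# Assumed combination: concatenate the max-pooled and average-pooled values.
pooled = [global_max_pool(feature_map), global_avg_pool(feature_map)]

print(feature_map)          # [2, 4, 4]
print(pooled[0])            # 4
print(softmax([0.0, 0.0]))  # [0.5, 0.5]
```

In the full model the pooled features would be passed through the GRU units, and the Softmax would be applied to the output of the final dense layer.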

3.4. Evaluation Metrics

Metrics for evaluating network security intrusion detection models center on four major indicators: precision, accuracy, recall (sensitivity), and F1-score. In addition, the discrepancy between the desired and predicted values was calculated using the mean square error (MSE), root mean square error (RMSE), coefficient of determination (R²), and Pearson's correlation coefficient (R) performance metrics.

MSE = \frac{1}{n} \sum_{i = 1}^{n} {(y_{i, \mathrm{exp}} - y_{i, \mathrm{pred}})}^{2}

(6)

RMSE = \sqrt{\sum_{i = 1}^{n} \frac{{(y_{i, \mathrm{exp}} - y_{i, \mathrm{pred}})}^{2}}{n}}

(7)

R^{2} = 1 - \frac{\sum_{i = 1}^{n} {(y_{i, \mathrm{exp}} - y_{i, \mathrm{pred}})}^{2}}{\sum_{i = 1}^{n} {(y_{i, \mathrm{exp}} - y_{\mathrm{avg}, \mathrm{exp}})}^{2}}

(8)

Accuracy = \frac{TP + TN}{TP + FP + FN + TN} \times 100\%

(9)

Sensitivity = \frac{TP}{TP + FN} \times 100\%

(10)

Precision = \frac{TP}{TP + FP} \times 100\%

(11)

Fscore = \frac{2 \times Precision \times Sensitivity}{Precision + Sensitivity} \times 100\%

(12)
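Equations (9) to (12) can be computed directly from the four confusion-matrix counts, where TP, TN, FP, and FN are the numbers of true positives, true negatives, false positives, and false negatives. The sketch below is an illustration only; the counts are invented and do not come from the paper's experiments. Precision and sensitivity are kept as fractions until the final conversion, so the F-score is multiplied by 100 exactly once.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, sensitivity, precision, and F-score in percent."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    sensitivity = tp / (tp + fn)  # also called recall
    precision = tp / (tp + fp)
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy * 100,
        "sensitivity": sensitivity * 100,
        "precision": precision * 100,
        "f_score": f_score * 100,
    }


# Hypothetical confusion-matrix counts for a two-class problem.
metrics = classification_metrics(tp=90, fp=10, fn=5, tn=895)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}%")
```

The function raises `ZeroDivisionError` if a denominator is zero, for example when the model predicts no positives at all; a production implementation would need to decide how to report that case.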

R\% = \frac{n (\sum_{i = 1}^{n} y_{i, \mathrm{exp}} \times y_{i, \mathrm{pred}}) - (\sum_{i = 1}^{n} y_{i, \mathrm{exp}}) (\sum_{i = 1}^{n} y_{i, \mathrm{pred}})}{\sqrt{[n \sum_{i = 1}^{n} y_{i, \mathrm{exp}}^{2} - {(\sum_{i = 1}^{n} y_{i, \mathrm{exp}})}^{2}] [n \sum_{i = 1}^{n} y_{i, \mathrm{pred}}^{2} - {(\sum_{i = 1}^{n} y_{i, \mathrm{pred}})}^{2}]}} \times 100

(13)
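The error-based metrics in Equations (6) to (8) and (13) can be sketched with the standard library alone. The expected and predicted label vectors below are hypothetical and are used only to show the arithmetic.

```python
import math


def mse(y_exp, y_pred):
    """Mean square error, Equation (6)."""
    return sum((e - p) ** 2 for e, p in zip(y_exp, y_pred)) / len(y_exp)


def rmse(y_exp, y_pred):
    """Root mean square error, Equation (7)."""
    return math.sqrt(mse(y_exp, y_pred))


def r_squared(y_exp, y_pred):
    """Coefficient of determination, Equation (8)."""
    avg = sum(y_exp) / len(y_exp)
    ss_res = sum((e - p) ** 2 for e, p in zip(y_exp, y_pred))
    ss_tot = sum((e - avg) ** 2 for e in y_exp)
    return 1 - ss_res / ss_tot


def pearson_r_percent(y_exp, y_pred):
    """Pearson's correlation coefficient in percent, Equation (13)."""
    n = len(y_exp)
    s_e, s_p = sum(y_exp), sum(y_pred)
    s_ep = sum(e * p for e, p in zip(y_exp, y_pred))
    s_ee = sum(e * e for e in y_exp)
    s_pp = sum(p * p for p in y_pred)
    numerator = n * s_ep - s_e * s_p
    denominator = math.sqrt((n * s_ee - s_e ** 2) * (n * s_pp - s_p ** 2))
    return numerator / denominator * 100


y_exp = [0, 1, 1, 0, 1]   # hypothetical true labels
y_pred = [0, 1, 0, 0, 1]  # hypothetical predictions

print(mse(y_exp, y_pred))                # 0.2
print(round(rmse(y_exp, y_pred), 4))     # 0.4472
print(round(r_squared(y_exp, y_pred), 4))          # 0.1667
print(round(pearson_r_percent(y_exp, y_pred), 2))  # 66.67
```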

4. Results

Experiments and comparisons with various intrusion detection models were run on a 64-bit Windows computer equipped with an Intel(R) Core i7 CPU (2.80 GHz) and 8 GB of RAM. All experiments were written in Python using the Anaconda distribution; the TensorFlow library was used to build the CNN and GRU models, and the Sklearn library was used for machine learning. The data were split into training and testing sets: 80% of each SCADA dataset was used for training, while 20% was used for testing. For all datasets, the test set size was proportional to the total number of data points. The training parameters for the CNN and GRU are described in Table 3.
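The 80/20 split can be sketched as follows. The paper does not say whether the data were shuffled or stratified before splitting, so the seeded shuffle, the function name, and the placeholder rows here are assumptions.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of `rows` and return (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(100))  # placeholder for 100 dataset records
train_rows, test_rows = train_test_split(rows)
print(len(train_rows), len(test_rows))  # 80 20
```

For heavily imbalanced traffic, where attacks are a small share of the records, a stratified split that preserves the class ratio in both parts is often preferred.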

The LDA model's outputs are shown in Table 4. The LDA had a 93.11% accuracy rate, a 94% precision rate, a 93% recall rate, and an F1 score of 93% when using WUSTL-IIoT-2018. The LDA method achieved 95% accuracy using WUSTL-IIoT-2021. The LDA technique successfully identified the normal class with a high detection percentage. The confusion matrices for the LDA model using the WUSTL-IIoT-2018 and WUSTL-IIoT-2021 datasets can be seen in Figure 10 and Figure 11.

The accuracy of detecting normal behavior and attacks in the SCADA dataset was computed and summarized based on the confusion matrix. It should be noted that the false positive (FP) count of the LDA approach was very high (1464) using WUSTL-IIoT-2018, and its false positive (FP) count when detecting attacks in WUSTL-IIoT-2021 was 5544.

The KNN approach is a common ML algorithm used for classification in a number of real-life applications. In this research, KNN was applied to detect attacks in the SCADA dataset to protect ICS applications, which can often be very important for human life. The KNN algorithm depends on calculating the distance between a data point and its closest neighbors; we selected K = 5 nearest neighbors in the training data and used the Euclidean distance. Table 5 shows the results of the KNN algorithm in detecting attacks in the ICS environment using two ICS datasets, namely WUSTL-IIoT-2018 and WUSTL-IIoT-2021. It can be seen that the KNN approach scored 99.99% in terms of accuracy using WUSTL-IIoT-2018, whereas the accuracy of the KNN method using WUSTL-IIoT-2021 was 100%.

The confusion matrices for the KNN approach to detecting attacks from the WUSTL-IIoT-2018 and WUSTL-IIoT-2021 datasets are displayed in Figure 12 and Figure 13. There are fewer than three false positives and false negatives.

The random forest (RF) method begins by constructing a decision tree on each data sample and then obtains the prediction made by each tree. Finally, the answer is selected by voting over the predictions of all the trees. The results of the RF approach detecting ICS attacks from the WUSTL-IIoT-2018 and WUSTL-IIoT-2021 datasets are shown in Table 6. The RF approach achieved 99.99% accuracy on the WUSTL-IIoT-2018 dataset, whereas it achieved 100% on the WUSTL-IIoT-2021 dataset. Figure 14 and Figure 15 show the confusion matrices when using the RF approach to detect SCADA attacks from the WUSTL-IIoT-2018 and WUSTL-IIoT-2021 datasets.

Results of Deep Learning

To verify the efficiency of the proposed model for identifying intrusions, we conducted performance analysis tests on the intrusion detection model that fuses CNN and GRU. We raised the number of recurrent blocks from 10 to 120 in order to observe the best performance; however, the results did not change substantially, while the training time increased. The parameters of the CNN in the combination model were the same as in the single-CNN model, which consisted of 32 filters with a size of 5 for each filter. Because of the poor performance of the other recurrent models, we could only combine it with the LSTM model, which consisted of 10 recurrent blocks. We concluded that the CNN model should have 32 filters, with a size of 5 for the convolution operation. Furthermore, the GRU models should each have 10 recurrent blocks, and the Adam optimizer should use learning rates of 0.0001 and 0.001.

Table 7 demonstrates that, compared with a single GRU model, the CNN-GRU model was superior in its ability to extract the properties of the original data and conduct intrusion detection effectively. The detection accuracy, recall, precision, and F1 score of the GRU were 99.93%, 99.95%, 99.95%, and 99.95%, respectively, and the classification rate was improved by combining the CNN approach with the GRU model: a score of 99.98% was achieved for accuracy, recall, precision, and F1 score using the WUSTL-IIoT-2018 dataset. Using the WUSTL-IIoT-2021 dataset, the GRU model achieved higher accuracy than the CNN-GRU model (accuracy = 99.75%).

Figure 16 and Figure 17 depict the GRU and CNN-GRU models in the train and test accuracy scores, which were obtained using the WUSTL-IIoT-2018 and WUSTL-IIoT-2021 datasets. After around 10 and 20 epochs, the accuracy curve of the training data reached approximately <99.95% for both datasets, but the accuracy of the CNN-GRU model’s test data reached approximately 99.98% using WUSTL-IIoT-2018, whereas the GRU model attained an accuracy of 99.75% using the WUSTL-IIoT-2021 dataset. In comparison, the ML approaches and combination models require less training time. The accuracy loss at the testing phase is 0.0031 and 0.00070 for the GRU and CNN-GRU models, respectively.

Figure 18 and Figure 19 show graphically the accuracy obtained by the GRU and CNN-GRU approaches when detecting attacks from the WUSTL-IIoT-2018 and WUSTL-IIoT-2021 datasets. The y-axes represent the percentages for the GRU and CNN-GRU approaches. The term “training accuracy” refers to the degree to which the validation process achieved its intended purpose. We observed that the system stopped the optimization process once accuracy had been obtained for up to 10 epochs. The performance of the GRU model rose from 94% to 99.75%. The accuracy of the CNN-GRU models started at 94.50% and reached 98.18%. The categorical cross-entropy function was found to be the most suitable for calculating the testing loss of the proposed system, which was 0.015.
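The categorical cross-entropy loss mentioned above can be written out in a few lines. This is a minimal sketch using hypothetical one-hot targets and predicted probabilities; the clipping constant `eps` is an assumption added to avoid taking the logarithm of zero.

```python
import math


def categorical_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean cross-entropy between one-hot targets and predicted probabilities."""
    total = 0.0
    for target, probs in zip(y_true, y_prob):
        # Only the true class contributes, because the other targets are 0.
        total -= sum(t * math.log(max(p, eps)) for t, p in zip(target, probs))
    return total / len(y_true)


y_true = [[1, 0], [0, 1]]          # hypothetical one-hot labels (normal, attack)
y_prob = [[0.9, 0.1], [0.2, 0.8]]  # hypothetical Softmax outputs

loss = categorical_cross_entropy(y_true, y_prob)
print(round(loss, 4))  # 0.1643
```

The loss approaches 0 as the probability assigned to the true class approaches 1, which is why a small testing loss accompanies high accuracy.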

5. Discussion

Supervisory control and data acquisition, or SCADA, systems are used in a wide variety of industries and economic sectors, including water treatment facilities, power plants, railways, and gas pipelines, among others, to monitor and regulate industrial control systems. The integration of SCADA systems with the Internet and corporate enterprise networks for various reasons related to economics makes SCADA systems vulnerable to attacks by hackers. These hackers could remotely exploit and gain access to SCADA systems in order to damage the infrastructure and, as a result, cause harm to people’s lives.

Intrusions into control systems may result in harm to the environment, threats to people’s safety, poor quality, and lost output. The proposed cybersecurity system can help any industrial plants like power generation, oil and gas processing and water systems when investigated with regard to the four primary types of vulnerabilities in cyber security, namely network vulnerabilities, operating system vulnerabilities, human vulnerabilities, and process vulnerabilities. This study proposes techniques for determining and reducing the susceptibility of networked control systems to inadvertent and malicious intrusions by using advanced artificial intelligence. The methods may be used to identify and minimize the vulnerability of networked control systems. In this study, we identify effective techniques for dealing with the cybersecurity challenges that are present in ICS.

In order to provide an accurate assessment of the proposed methods for intrusion detection, significance tests were carried out on a SCADA dataset. These tests were carried out with reference to methods that have been utilized in previously published research. The statistical analysis to determine the significance of the indicators produced by the suggested techniques is presented in Table 8, and in doing so, we confirmed that the findings that were achieved were not the product of random chance. It was observed that the KNN, RF, GRU, and CNN-GRU approaches achieved an R² > 99% using the WUSTL-IIoT-2018 dataset.

These results show that the tested accuracy values of the proposed ML and DL models are larger than the assumed prediction errors of the RF approach, which has a minimal value of MSE = 3.1789 × 10⁻⁶. Furthermore, the proposed models demonstrated fewer prediction errors. Overall, the suggested model's accuracy was found to be better than the minimum prediction error value. In Figure 20, a receiver operating characteristic (ROC) curve shows the performance of the GRU and CNN-GRU deep learning models on the WUSTL-IIoT-2018 dataset over all classification thresholds.
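An ROC curve is obtained by sweeping the classification threshold and recording the false positive rate and true positive rate at each setting. The sketch below illustrates this with hypothetical labels and scores; it is not the procedure or the data behind Figure 20, and the area under the curve is computed with the trapezoidal rule.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points, one per distinct threshold, starting at (0, 0)."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        # A sample is predicted positive when its score reaches the threshold.
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


y_true = [1, 1, 0, 1, 0, 0]              # hypothetical labels (1 = attack)
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]  # hypothetical attack probabilities

points = roc_points(y_true, scores)
print(points[-1])             # (1.0, 1.0)
print(round(auc(points), 4))  # 0.8889
```

An area of 1.0 corresponds to a model that ranks every attack above every normal sample, while 0.5 corresponds to random ranking.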

System stability, in the context of AI-based SCADA intrusion detection, refers to the ability of the system to maintain its performance and accuracy despite changes in the environment or attacks from intruders. The complexity of the system is determined by the number of variables, algorithms, and models used to detect and respond to threats.

To ensure stability, the SCADA intrusion detection system must be designed with robust algorithms that can handle large amounts of data and adapt to changing conditions. The use of artificial intelligence techniques such as machine learning, deep learning, and neural networks can enhance the stability of the system by enabling it to learn from past experiences and improve its performance over time.

Moreover, regular updates and maintenance are essential for maintaining stability in a SCADA intrusion detection system. This includes updating software, patching vulnerabilities, and monitoring network traffic for any suspicious activity.

In summary, a stable SCADA intrusion detection system using artificial intelligence should be designed with robust algorithms that can handle complex data sets and adapt to changing conditions. Regular updates and maintenance are also crucial for ensuring long-term stability.

Using the SCADA dataset, Table 9 compares the accuracies of the proposed ML and DL models with those of advanced IDS models in detecting various kinds of attacks. The KNN, RF, and CNN-GRU models performed better than the others according to the results of the analysis. Figure 21 is a visual depiction of the comparison between the outcomes our system obtained and those of other current systems based on the accuracy measures. We find that our proposed method yields the highest accuracy among the compared systems. The authors of [67] reported an accuracy of 99.9%, but at a very high time cost.

Generally, artificial-intelligence-based intrusion detection systems require significant computational resources to analyze large amounts of data in real time. The complexity of these systems can range from simple rule-based systems to more complex machine learning algorithms that require significant processing power. Overall, the computation complexity of SCADA intrusion detection using artificial intelligence is high but can be optimized by using efficient algorithms and hardware accelerators such as GPUs or FPGAs.

The use of deep learning algorithms such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs) can significantly increase the computation complexity due to their large number of parameters and layers. However, these algorithms have shown promising results in detecting complex attacks on SCADA systems.

The main implementation challenge of SCADA intrusion detection using artificial intelligence is data processing, due to SCADA systems generating a large amount of data which can be difficult to process. The data must be cleaned, normalized, and transformed into a format that can be used by the AI algorithms.

SCADA systems require real-time processing to detect anomalies and intrusions quickly. This requires high-performance computing resources that can handle large amounts of data in real-time. Furthermore, integrating AI-based intrusion detection with existing SCADA systems can be challenging due to differences in protocols, communication standards, and hardware configurations.

In this research, we suggested using a number of different artificial intelligence methods, including machine learning and deep learning, to identify security threats in a SCADA IIoT system. Two different SCADA IIoT dataset versions were used to test these algorithms. We evaluated the potential dangers of using SCADA IIoT systems and how artificial-intelligence-based solutions may help mitigate them. To demonstrate the continuing need for securing SCADA systems, a literature assessment of current anomaly detection methodologies utilizing ML was then presented. Our case study demonstrated how the proposed algorithms may address this need by protecting SCADA IIoT systems from emerging threats. Overall, the proposed model for SCADA intrusion detection using artificial intelligence represents a powerful tool for protecting critical infrastructure systems from cyber threats. By leveraging advanced machine learning techniques, it can help organizations stay ahead of evolving security risks and ensure the integrity of their SCADA systems.

6. Conclusions, Limitations and Future Research

Modern critical infrastructures (CI) include things such as water treatment facilities, oil refineries, power grids, and nuclear and thermal power plants. Industrial control systems (ICSs) are an essential component of these contemporary CIs. ICS describes a system that regulates a physical process by combining computational and communication components. An ICS is made up of a variety of components and subsystems, including sensors, actuators, programmable logic controllers (PLCs), human–machine interfaces (HMIs), and a supervisory control and data acquisition (SCADA) system, among others.

SCADA network packets are often examined in order to detect these types of attacks. The purpose of this research was to make SCADA systems safer by using freely accessible machine learning and deep learning frameworks to obtain actionable insights from the SCADA datasets, namely WUSTL-IIoT-2018 and WUSTL-IIoT-2021. The main goal of this research was to classify the attacks against a SCADA system by using a big data ecosystem. In the future, large datasets should be utilized in real-time SCADA systems, like the one employed in this research, which is mostly for comparative reasons. Using k-nearest neighbors (KNN), linear discriminant analysis (LDA), random forest (RF), convolution neural network (CNN), and integrated gated recurrent units (GRU), this research establishes a foundation for combating SCADA intrusions on a big data framework.

The empirical results show that the machine learning KNN and RF approaches were able to achieve a high accuracy of 99.99%, whereas the deep learning CNN-GRU model achieved 99.98% accuracy on the WUSTL-IIoT-2018 dataset. On the WUSTL-IIoT-2021 dataset, the RF and GRU models achieved >99.75% accuracy. Furthermore, to confirm the results obtained from the machine learning and deep learning models, a statistical analysis was applied. This showed that the prediction errors of the KNN, RF, and CNN-GRU models were considerably lower. Because of this, it is necessary for ICS attack solutions to use a combination of machine learning and deep learning strategies to combat the many different types of attacks that the ICS must defend against. This work aims to shed light on new research avenues in machine learning and deep learning algorithms for efficient and scalable ICS attack detection.

The limitation of this research is that the performance of AI-based intrusion detection systems heavily relies on the quality and quantity of training data. However, obtaining a large and diverse dataset for SCADA systems is challenging due to the sensitive nature of the data. AI-based intrusion detection systems may generate false alarms, which can lead to the unnecessary disruption of operations and loss of productivity. AI-based intrusion detection systems are often considered as black boxes, making it difficult to understand how they arrive at their decisions.

In order to attain higher levels of performance, we want to concentrate our efforts in the future on the use of a combined design that incorporates a number of different algorithms. It is anticipated that the hybrid model will be able to provide findings that are more precise than those produced by any of the component models. In addition, integrating domain-specific knowledge into AI models can enhance their performance in detecting anomalies in SCADA systems, and developing real-time monitoring capabilities for SCADA intrusion detection using AI can enable the timely response to security threats.

Author Contributions

Conceptualization, T.H.H.A. and A.A.; resources, T.H.H.A.; data curation, A.A.; writing—original draft preparation, T.H.H.A. and A.A.; writing—review and editing, A.A.; visualization, T.H.H.A. and A.A.; supervision, T.H.H.A.; project administration, T.H.H.A. and A.A.; funding acquisition, T.H.H.A. and A.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research and the APC were funded by the Deputyship for Research & Innovation, Ministry of Education in Saudi Arabia, through project number INST035.

Data Availability Statement

https://www.cse.wustl.edu/~jain/iiot/index.html (accessed on 12 January 2023).

Acknowledgments

The authors extend their appreciation to the Deputyship for Research & Innovation, Ministry of Education in Saudi Arabia for funding this research work through the project number INST035.

Conflicts of Interest

The authors declare no conflict of interest.

References

Elsisi, M.; Tran, M.Q.; Mahmoud, K.; Lehtonen, M.; Darwish, M.M. Deep Learning-Based Industry 4.0 and Internet of Things towards Effective Energy Management for Smart Buildings. Sensors 2021, 21, 1038. [Google Scholar] [CrossRef] [PubMed]
Khalid, H.; Hashim, S.J.; Ahmad, S.M.S.; Hashim, F.; Chaudhary, M.A. SELAMAT: A New Secure and Lightweight Multi-Factor Authentication Scheme for Cross-Platform Industrial IoT Systems. Sensors 2021, 21, 1428. [Google Scholar] [CrossRef] [PubMed]
Odema, M.; Ferlez, J.; Vaisi, G.; Shoukry, Y.; Faruque, M.A.A. EnergyShield: Provably-Safe Offloading of Neural Network Controllers for Energy Efficiency. arXiv 2023, arXiv:2302.06572. [Google Scholar]
Grammatikis, P.R.; Sarigiannidis, P.; Dalamagkas, C.; Spyridis, Y.; Lagkas, T.; Efstathopoulos, G.; Sesis, A.; Pavon, I.L.; Burgos, R.T.; Diaz, R.; et al. Sdn-based resilient smart grid: The sdn-microsense architecture. Digital 2021, 1, 173–187. [Google Scholar] [CrossRef]
Mladenov, V.; Chobanov, V.; Sarigiannidis, P.; Radoglou-Grammatikis, P.I.; Hristov, A.; Zlatev, P. Defense against cyber-attacks on the Hydro Power Plant connected in parallel with Energy System. In Proceedings of the 2020 12th Electrical Engineering Faculty Conference (BulEF), Varna, Bulgaria, 9–12 September 2020.
Ahakonye, L.A.C.; Nwakanma, C.I.; Lee, J.-M.; Kim, D.-S. SCADA intrusion detection scheme exploiting the fusion of modified decision tree and Chi-square feature selection. Internet Things 2023, 21, 100676.
Balla, A.; Habaebi, M.H.; Elsheikh, E.A.A.; Islam, R.; Suliman, F.M. The Effect of Dataset Imbalance on the Performance of SCADA Intrusion Detection Systems. Sensors 2023, 23, 758.
Zhao, H.; Liu, G.; Sun, H.; Zhong, G.; Pang, S.; Qiao, S.; Lv, Z. An enhanced intrusion detection method for AIM of smart grid. J. Ambient. Intell. Humaniz. Comput. 2023, 1–13.
Efiong, J.E.; Akinyemi, B.O.; Olajubu, E.A.; Aderounmu, G.A.; Degila, J. CyberSCADA Network Security Analysis Model for Intrusion Detection Systems in the Smart Grid. In Advances in Intelligent Systems, Computer Science and Digital Economics IV; Springer: Cham, Switzerland, 2023; pp. 481–499.
Sheng, C.; Yao, Y.; Li, W.; Yang, W.; Liu, Y. Unknown Attack Traffic Classification in SCADA Network Using Heuristic Clustering Technique. IEEE Trans. Netw. Serv. Manag. 2023.
Bhati, B.S.; Dikshita; Bhati, N.S.; Chugh, G. A Comprehensive Study of Intrusion Detection and Prevention Systems. In Wireless Communication Security; John Wiley & Sons: Hoboken, NJ, USA, 2023; p. 115.
Zhu, Q.; Zhang, G.; Luo, X.; Gan, C. An industrial virus propagation model based on SCADA system. Inf. Sci. 2023, 630, 546–566.
Aragó, A.S.; Martínez, E.R.; Clares, S.S. SCADA laboratory and test-bed as a service for critical infrastructure protection. In Proceedings of the 2nd International Symposium on ICS & SCADA Cyber Security Research, St Pölten, Austria, 11–12 September 2014.
National Communications Systems (NCS). Supervisory Control and Data Acquisition (SCADA) Systems, Technical Information Bulletin 04-1. 2004. Available online: https://www.cedengineering.com/userfiles/SCADA%20Systems.pdf (accessed on 12 January 2023).
ISA. Security for Industrial Automation and Control Systems, Part 3-3: System Security Requirements and Security Levels. 2013. Available online: https://www.isa.org/products/ansi-isa-62443-3-3-99-03-03-2013-security-for-indu.pdf (accessed on 12 January 2023).
Alkahtani, H.; Aldhyani, T.H.H. Developing Cybersecurity Systems Based on Machine Learning and Deep Learning Algorithms for Protecting Food Security Systems: Industrial Control Systems. Electronics 2022, 11, 1717.
Wang, C.; Wang, B.; Liu, H.; Qu, H. Anomaly Detection for Industrial Control System Based on Autoencoder Neural Network. Wirel. Commun. Mob. Comput. 2020, 2020, 8897926.
Aldhyani, T.H.H.; Alkahtani, H. Attacks to Automatous Vehicles: A Deep Learning Algorithm for Cybersecurity. Sensors 2022, 22, 360.
Najafabadi, M.M.; Villanustre, F.; Khoshgoftaar, T.M.; Seliya, N.; Wald, R.; Muharemagic, E. Deep learning applications and challenges in big data analytics. J. Big Data 2015, 2, 1.
Hassan, M.M.; Gumaei, A.; Alsanad, A.; Alrubaian, M.; Fortino, G. A hybrid deep learning model for efficient intrusion detection in big data environment. Inf. Sci. 2020, 513, 386–396.
Xu, C.; Shen, J.; Du, X.; Zhang, F. An Intrusion Detection System Using a Deep Neural Network With Gated Recurrent Units. IEEE Access 2018, 6, 48697–48707.
Zolfi, H.; Ghorbani, H.; Ahmadzadegan, M.H. Investigation and classification of cyber-crimes through IDS and SVM algorithm. In Proceedings of the 2019 Third International conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC), Palladam, India, 12–14 December 2019; pp. 180–187.
Onan, A.; Korukoğlu, S. A feature selection model based on genetic rank aggregation for text sentiment classification. J. Inf. Sci. 2017, 43, 25–38.
Abokifa, A.A.; Haddad, K.; Lo, C.; Biswas, P. Real-time identification of cyber-physical attacks on water distribution systems via machine learning–based anomaly detection techniques. J. Water Resour. Plan. Manag. 2019, 145, 04018089.
Zeng, P.; Zhou, P. Intrusion detection in SCADA system: A survey. In Intelligent Computing and Internet of Things; Springer: Berlin/Heidelberg, Germany, 2018; pp. 342–351.
Upadhyay, D.; Manero, J.; Zaman, M.; Sampalli, S. Intrusion detection in SCADA based power grids: Recursive feature elimination model with majority vote ensemble algorithm. IEEE Trans. Netw. Sci. Eng. 2021, 8, 2559–2574.
Zolanvari, M.; Teixeira, M.A.; Jain, R. Effect of imbalanced datasets on security of industrial IoT using machine learning. In Proceedings of the 2018 IEEE International Conference on Intelligence and Security Informatics (ISI), Miami, FL, USA, 9–11 November 2018; pp. 112–117.
Moustafa, N.; Adi, E.; Turnbull, B.; Hu, J. A new threat intelligence scheme for safeguarding industry 4.0 systems. IEEE Access 2018, 6, 32910–32924.
Alimi, O.A.; Ouahada, K.; Abu-Mahfouz, A.M.; Rimer, S.; Alimi, K.O.A. A Review of Research Works on Supervised Learning Algorithms for SCADA Intrusion Detection and Classification. Sustainability 2021, 13, 9597.
Rakas, S.V.B.; Stojanović, M.D.; Marković-Petrović, J.D. A review of research work on network-based SCADA intrusion detection systems. IEEE Access 2020, 8, 93083–93108.
Almalawi, A.; Yu, X.; Tari, Z.; Fahad, A.; Khalil, I. An unsupervised anomaly-based detection approach for integrity attacks on SCADA systems. Comput. Secur. 2014, 46, 94–110.
Albulayhi, K.; Abu Al-Haija, Q.; Alsuhibany, S.A.; Jillepalli, A.A.; Ashrafuzzaman, M.; Sheldon, F.T. IoT Intrusion Detection Using Machine Learning with a Novel High Performing Feature Selection Method. Appl. Sci. 2022, 12, 5015.
Zaman, M.; Lung, C. Evaluation of machine learning techniques for network intrusion detection. In Proceedings of the IEEE/IFIP Network Operations and Management Symposium, Taipei, Taiwan, 23–27 April 2018; pp. 1–5.
Teixeira, M.A.; Salman, T.; Zolanvari, M.; Jain, R.; Meskin, N. SCADA system testbed for cybersecurity research using machine learning approach. Future Internet 2018, 10, 76.
Almseidin, M.; Alzubi, M.; Kovacs, S.; Alkasassbeh, M. Evaluation of machine learning algorithms for intrusion detection system. In Proceedings of the IEEE 15th International Symposium on Intelligent Systems and Informatics (SISY), Subotica, Serbia, 14–16 September 2017; pp. 277–282.
Mathur, A.; Tippenhauer, N. SWaT: A water treatment testbed for research and training on ICS security. In Proceedings of the International Workshop on Cyber-Physical Systems for Smart Water Networks (CySWater), Vienna, Austria, 11 April 2016; pp. 31–36.
Perez, R.L.; Adamsky, F.; Soua, R.; Engel, T. Machine learning for reliable network attack detection in SCADA systems. In Proceedings of the 17th IEEE International Conference on Trust, Security and Privacy in Computing and Communications, New York, NY, USA, 1–3 August 2018; pp. 633–638.
Jicha, A.; Patton, M.; Chen, H. SCADA honeypots: An in-depth analysis of Conpot. In Proceedings of the IEEE Conference on Intelligence and Security Informatics (ISI), Tucson, AZ, USA, 28–30 September 2016; pp. 196–198.
Rosa, L.; Cruz, T.; Simões, P.; Monteiro, E.; Lev, L. Attacking SCADA systems: A practical perspective. In Proceedings of the IFIP/IEEE Symposium on Integrated Network and Service Management (IM), Lisbon, Portugal, 8–12 May 2017.
Keliris, A.; Salehghaffari, H.; Cairl, B. Machine learning-based defense against process-aware attacks on industrial control systems. In Proceedings of the IEEE International Test Conference (ITC), Fort Worth, TX, USA, 15–17 November 2016.
Tomin, N.V.; Kurbatsky, V.G.; Sidorov, D.N.; Zhukov, A.V. Machine learning techniques for power system security assessment. In Proceedings of the IFAC Workshop on Control of Transmission and Distribution Smart Grids (CTDSG), Prague, Czech Republic, 11–13 October 2016.
Cherdantseva, Y.; Burnap, P.; Blyth, A.; Eden, P.; Jones, K.; Soulsby, H.; Stoddart, K. A review of cyber security risk assessment methods for SCADA systems. Comput. Secur. 2016, 56, 1–27.
Almomani, O. A hybrid model using bio-inspired metaheuristic algorithms for network intrusion detection system. Comput. Mater. Contin. 2021, 68, 409–429.
Kravchik, M.; Shabtai, A. Efficient cyber attacks detection in industrial control systems using lightweight neural networks. arXiv 2019, arXiv:1907.01216.
Liu, L.; Hu, M.; Kang, C.; Li, X. Unsupervised Anomaly Detection for Network Data Streams in Industrial Control Systems. Information 2020, 11, 105.
Tomlin, L.; Farnam, M.R.; Pan, S. A clustering approach to industrial network intrusion detection. In Proceedings of the 2016 Information Security Research and Education (INSuRE) Conference (INSuRECon-16), Huntsville, AL, USA, 30 September 2016.
Schneider, P.; Böttinger, K. High-performance unsupervised anomaly detection for cyber-physical system networks. In Proceedings of the 2018 Workshop on Cyber-Physical Systems Security and Privacy, Toronto, ON, Canada, 19 October 2018; pp. 1–12.
Stefanidis, K.; Voyiatzis, A.G. An HMM-based anomaly detection approach for SCADA systems. In Information Security Theory and Practice; Foresti, S., Lopez, J., Eds.; WISTP 2016; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2016; Volume 9895, pp. 85–99.
Kim, B.-K.; Kang, D.-H.; Na, J.-C.; Chung, T.-M. Detecting abnormal behavior in scada networks using normal traffic pattern learning. In Computer Science and Its Applications; Lecture Notes in Electrical Engineering; Springer: Berlin/Heidelberg, Germany, 2015; Volume 330.
Yoon, K.; Ciocarlie, G. Communication pattern monitoring: Improving the utility of anomaly detection for industrial control systems. In Proceedings of the 2014 Workshop on Security of Emerging Networking Technologies, San Diego, CA, USA, 23 February 2014.
Formby, D.; Srinivasan, P.; Leonard, A.; Rogers, J.; Beyah, R. Who’s in control of your control system? Device fingerprinting for cyber-physical systems. In Proceedings of the 2016 Network and Distributed System Security Symposium, San Diego, CA, USA, 21–24 February 2016.
He, Z.; Raghavan, A.; Hu, G.; Chai, S.; Lee, R. Power-grid controller anomaly detection with enhanced temporal deep learning. In Proceedings of the 2019 18th IEEE International Conference on Trust, Security and Privacy in Computing and Communications/13th IEEE International Conference on Big Data Science and Engineering (TrustCom/BigDataSE), Rotorua, New Zealand, 5–8 August 2019; pp. 160–167.
Kravchik, M.; Shabtai, A. Detecting Cyber Attacks in Industrial Control Systems Using Convolutional Neural Networks. In Proceedings of the 2018 Workshop on Cyber-Physical Systems Security and PrivaCy, Toronto, ON, Canada, 15–19 October 2018; pp. 72–83.
Shalyga, D.; Filonov, P.; Lavrentyev, A. Anomaly detection for water treatment system based on neural network with automatic architecture optimization. arXiv 2018, arXiv:1807.07282.
Zizzo, G.; Hankin, C.; Maffeis, S.; Jones, K. Intrusion Detection for Industrial Control Systems: Evaluation Analysis and Adversarial Attacks. arXiv 2019, arXiv:1911.04278.
Keserwani, P.K.; Govil, M.C.; Pilli, E.S. An optimal intrusion detection system using GWO-CSA-DSAE model. Cyber-Phys. Syst. 2021, 7, 197–220.
Keserwani, P.K.; Govil, M.C.; Pilli, E.S.; Govil, P. A smart anomaly-based intrusion detection system for the Internet of Things (IoT) network using GWO–PSO–RF model. J. Reliab. Intell. Environ. 2021, 7, 3–21.
Awotunde, J.B.; Chakraborty, C.; Adeniyi, A.E. Intrusion detection in industrial internet of things network-based on deep learning model with rule-based feature selection. Wirel. Commun. Mob. Comput. 2021, 2021, 7154587.
Fatani, A.; Dahou, A.; Al-qaness, M.A.A.; Lu, S.; Abd Elaziz, M. Advanced feature extraction and selection approach using deep learning and Aquila optimizer for IoT intrusion detection system. Sensors 2021, 22, 140.
Bhatt, S.; Pham, T.K.; Gupta, M.; Benson, J.; Park, J.; Sandhu, R. Attribute-based access control for AWS Internet of Things and secure Industries of the Future. IEEE Access 2021, 9, 107200–107223.
Dramé-Maigné, S.; Laurent, M.; Castillo, L. Distributed access control solution for the IoT based on multi-endorsed attributes and smart contracts. In Proceedings of the 2019 15th International Wireless Communications & Mobile Computing Conference (IWCMC), Tangier, Morocco, 24–28 June 2019; pp. 1582–1587.
Gupta, M.; Awaysheh, F.M.; Benson, J.; Alazab, M.; Patwa, F.; Sandhu, R. An attribute-based access control for cloud enabled industrial smart vehicles. IEEE Trans. Ind. Inform. 2020, 17, 4288–4297.
Aldhyani, T.H.H.; Alkahtani, H. Cyber Security for Detecting Distributed Denial of Service Attacks in Agriculture 4.0: Deep Learning Model. Mathematics 2023, 11, 233.
Alzahrani, A.; Aldhyani, T.H.H. Artificial Intelligence Algorithms for Detecting and Classifying MQTT Protocol Internet of Things Attacks. Electronics 2022, 11, 3837.
Alkahtani, H.; Aldhyani, T.H.H. Artificial Intelligence Algorithms for Malware Detection in Android-Operated Mobile Devices. Sensors 2022, 22, 2268.
Almaiah, M.A.; Almomani, O.; Alsaaidah, A.; Al-Otaibi, S.; Bani-Hani, N.; Hwaitat, A.K.A.; Al-Zahrani, A.; Lutfi, A.; Awad, A.B.; Aldhyani, T.H.H. Performance Investigation of Principal Component Analysis for Intrusion Detection System Using Different Support Vector Machine Kernels. Electronics 2022, 11, 3571.
Zolanvari, M.; Teixeira, M.A.; Gupta, L.; Khan, K.M.; Jain, R. Machine Learning-Based Network Vulnerability Analysis of Industrial Internet of Things. IEEE Internet Things J. 2019, 6, 6822–6834.
Inoue, J.; Yamagata, Y.; Chen, Y.; Poskitt, C.M.; Sun, J. Anomaly Detection for a Water Treatment System Using Unsupervised Machine Learning. In Proceedings of the 2017 IEEE International Conference on Data Mining Workshops (ICDMW), New Orleans, LA, USA, 18–21 November 2017; pp. 1058–1065.

Figure 1. Industry 4.0 technology for developing the ICS sector.

Figure 2. SCADA cybersecurity system based on AI.

Figure 3. Numbers of SCADA classes.

Figure 4. Correlation between the features.

Figure 5. Preprocessing steps.

Figure 6. LDA approach.

Figure 7. CNN structure.

Figure 8. GRU structure.

Figure 9. CNN-GRU structure.

Figure 10. Confusion matrix for the LDA algorithm using the WUSTL-IIoT-2018 dataset.

Figure 11. Confusion matrix for the LDA algorithm using the WUSTL-IIoT-2021 dataset.

Figure 12. Confusion matrix for the KNN algorithm using the WUSTL-IIoT-2018 dataset.

Figure 13. Confusion matrix for the KNN algorithm using the WUSTL-IIoT-2021 dataset.

Figure 14. Confusion matrix for the RF algorithm using the WUSTL-IIoT-2018 dataset.

Figure 15. Confusion matrix for the RF algorithm using the WUSTL-IIoT-2021 dataset.

Figure 16. Performance of GRU: (a) GRU model accuracy, and (b) GRU model loss in the WUSTL-IIoT-2018 dataset.

Figure 17. Performance of the CNN-GRU model: (a) model accuracy, and (b) model loss in the WUSTL-IIoT-2018 dataset.

Figure 18. Performance of the GRU model: (a) model accuracy, and (b) model loss in the WUSTL-IIoT-2021 dataset.

Figure 19. Performance of the CNN-GRU model: (a) model accuracy, and (b) model loss in the WUSTL-IIoT-2021 dataset.

Figure 20. ROC curves of the (a) GRU and (b) CNN-GRU models in the WUSTL-IIoT-2018 dataset.

Figure 21. Performance of proposed system.

Table 1. Features of the SCADA dataset.

Feature	Description
Source port (Sport)	Source port number
Total packets (TotPkts)	Total packet count
Total bytes (TotBytes)	Total bytes in the transaction
Source packets (SrcPkts)	Source packet count
Destination packets (DstPkts)	Destination packet count
Source bytes (SrcBytes)	Source transaction bytes

Table 2. Samples of the WUSTL-IIoT-2021 dataset.

Property	Value
Number of observations	1,194,464
Number of features	41
Number of attack samples	87,016
Number of normal samples	1,107,448
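
The counts in Table 2 describe a strongly imbalanced dataset. As a minimal sketch (the variable names are illustrative and not taken from the paper), the class proportions and the accuracy of a trivial "always normal" classifier can be derived from those counts, which gives a baseline for reading the accuracy figures reported later:

```python
# Counts copied from Table 2 (WUSTL-IIoT-2021).
n_attack = 87_016
n_normal = 1_107_448
n_total = n_attack + n_normal

# Share of attack records in the whole dataset.
attack_share = n_attack / n_total

# Accuracy of a trivial classifier that labels every record "normal".
majority_baseline = n_normal / n_total

print(f"total observations: {n_total:,}")
print(f"attack share: {attack_share:.2%}")
print(f"majority-class baseline accuracy: {majority_baseline:.2%}")
```

Attack records make up roughly 7% of the observations, so accuracy alone says little about attack detection; the per-class precision and recall columns are the more informative ones.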

Table 3. Main parameters of the deep learning model.

Parameters	Values
Num_epochs	20
Learning_rate	0.001
Optimizer	Adam
Layer	128

Table 4. Performance results of the LDA model for detecting attacks in the SCADA datasets.

WUSTL-IIoT-2018
Labels	Precision Metric %	Recall Metric %	F1-Score Metric %
Attack class	98	79	88
Normal class	91	99	95
Accuracy %	93.11
Average of metrics	94	93	93
WUSTL-IIoT-2021
Labels	Precision Metric %	Recall Metric %	F1-Score Metric %
Normal class	96	98	97
Attack class	69	54	60
Accuracy %	95
Average of metrics	94	95	95
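
The F1-score column can be related to the precision and recall columns through the standard harmonic-mean definition. The following is a small sketch under that assumption; the function name `f1_score` and the dictionary layout are illustrative, and all values are written on a 0–100 scale. Because the table entries are rounded, the recomputed values are only expected to agree to within about one point.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) per class for the LDA model,
# copied from Table 4 and expressed in percent.
table4_rows = {
    "2018 attack": (98, 79, 88),
    "2018 normal": (91, 99, 95),
    "2021 normal": (96, 98, 97),
    "2021 attack": (69, 54, 60),
}

for label, (p, r, reported) in table4_rows.items():
    recomputed = f1_score(p, r)
    print(f"{label}: recomputed F1 = {recomputed:.1f}, reported = {reported}")
```

The low recall of the attack class on WUSTL-IIoT-2021 is what pulls its F1-score down, even though overall accuracy stays high.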

Table 5. Performance results of the KNN model for detecting attacks in the SCADA datasets.

WUSTL-IIoT-2018
Labels	Precision Metric %	Recall Metric %	F1-Score Metric %
Normal class	100	100	100
Attack class	100	100	100
Accuracy %	99.99
Average of metrics	100	100	100
WUSTL-IIoT-2021
Labels	Precision Metric %	Recall Metric %	F1-Score Metric %
Normal class	100	100	100
Attack class	99	99	99
Accuracy %	100
Average of metrics	99	99	99

Table 6. Performance results of the RF model for detecting attacks in the SCADA datasets.

WUSTL-IIoT-2018
Labels	Precision Metric %	Recall Metric %	F1-Score Metric %
Normal class	100	100	100
Attack class	100	100	100
Accuracy %	99.99
Average of metrics	100	100	100
WUSTL-IIoT-2021
Labels	Precision Metric %	Recall Metric %	F1-Score Metric %
Normal class	100	100	100
Attack class	100	100	100
Accuracy %	100
Average of metrics	100	100	100

Table 7. Results of the GRU and CNN-GRU approaches.

WUSTL-IIoT-2018
Models	Accuracy %	Loss	Precision %	Recall %	F1-Score %
GRU	99.93	0.0031	99.95	99.94	99.95
CNN-GRU	99.98	0.00070	99.98	99.98	99.98
WUSTL-IIoT-2021
Models	Accuracy %	Loss	Precision %	Recall %	F1-Score %
GRU	99.75	0.015	99.76	99.43	99.50
CNN-GRU	98.18	0.039	99.10	98.95	98.85

Table 8. Statistical analysis of the machine learning and deep learning models for detecting attacks.

WUSTL-IIoT-2018
Model	MSE	RMSE	R² %
LDA	0.0688	0.2623	69.59
KNN	1.9073 × 10⁻⁵	0.00436	99.99
RF	3.1789 × 10⁻⁶	0.00178	99.99
CNN-GRU	0.000184	0.0135	99.91
GRU	0.000638	0.0252	99.70
WUSTL-IIoT-2021
Model	MSE	RMSE	R² %
LDA	0.0515	0.227	50.23
KNN	0.0009	0.0310	98.57
RF	0.0065	0.0087	99.99
CNN-GRU	0.0011	0.010	98.88
GRU	0.0024	0.0496	98.85
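
For readers who want to reproduce the kind of statistics listed in Table 8, the sketch below computes MSE, RMSE and R² for a pair of label vectors. It assumes the standard definitions of these metrics; the function name `regression_metrics` and the toy labels are hypothetical and are not data from the paper.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (MSE, RMSE, R^2) for two equal-length sequences of numbers."""
    n = len(y_true)
    # Mean squared error between true and predicted labels.
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    # RMSE is by definition the square root of MSE.
    rmse = math.sqrt(mse)
    # R^2 = 1 - SS_res / SS_tot (undefined when y_true is constant).
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_res = mse * n
    r2 = 1.0 - ss_res / ss_tot
    return mse, rmse, r2


# Hypothetical toy labels (1 = attack, 0 = normal), not data from the paper.
y_true = [0, 0, 0, 1, 1, 0, 1, 0, 0, 1]
y_pred = [0, 0, 0, 1, 0, 0, 1, 0, 0, 1]

mse, rmse, r2 = regression_metrics(y_true, y_pred)
print(f"MSE={mse:.4f} RMSE={rmse:.4f} R2={r2:.4f}")
```

Since RMSE is the square root of MSE, the same relation offers a quick consistency check on any reported (MSE, RMSE) pair.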

Table 9. Comparison of the proposed system with existing systems on SCADA datasets.

Reference	Datasets	Model	Precision (%)	Time (s)
Ref. [53]	SCADA dataset	DNN	98.25	214
Ref. [54]	SCADA dataset	RNN	93.03	Not reported
Ref. [55]	SCADA dataset	MLP	81.02	Not reported
Ref. [68]	SCADA dataset	DNN	98.29	600
Ref. [67]	SCADA dataset	Random Forest	99.9	Not reported
Proposed model	SCADA dataset	RF	99.98	62.20
Proposed model	SCADA dataset	CNN-GRU	100	32.15

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
