Survey on Intrusion Detection Systems Based on Machine Learning Techniques for the Protection of Critical Infrastructure

Industrial control systems (ICSs), supervisory control and data acquisition (SCADA) systems, and distributed control systems (DCSs) are fundamental components of critical infrastructure (CI). CI supports the operation of transportation and health systems, electric and thermal plants, and water treatment facilities, among others. These infrastructures are no longer isolated, and their connection to fourth industrial revolution technologies has expanded the attack surface. Thus, their protection has become a priority for national security. Cyber-attacks have become more sophisticated, and criminals are able to bypass conventional security systems; therefore, attack detection has become a challenging area. Defensive technologies such as intrusion detection systems (IDSs) are a fundamental part of the security systems that protect CI. IDSs have incorporated machine learning (ML) techniques that can deal with a broader range of threats. Nevertheless, the detection of zero-day attacks and the availability of technological resources to implement proposed solutions in the real world remain concerns for CI operators. This survey aims to provide a compilation of the state of the art of IDSs that have used ML algorithms to protect CI. It also analyzes the security datasets used to train ML models. Finally, it presents some of the most relevant pieces of research on these topics that have been developed in the last five years.


Introduction
Modern society depends on sophisticated infrastructures (cyber and physical) to carry out its day-to-day activities. These infrastructures are classified as critical assets to protect services not only in the physical but also in the digital world. Their protection has become a national security concern [1]. ICSs have evolved rapidly over the last few decades. New technologies from the fourth industrial revolution have increased efficiency while, at the same time, saving resources. The connection of ICSs to the internet and the incorporation of protocols such as TCP/IP have expanded the attack surface and made CI vulnerable to a wider range of attacks [2]. In particular, the incorporation of the industrial internet of things (IIoT) to connect devices at an industrial level, such as sensors and actuators, has increased cybersecurity risks [3]. A variety of security solutions have been developed to enhance security control in ICSs. Technologies that incorporate machine learning (ML), a type of artificial intelligence, have become essential in the identification of cyber-attacks. In particular, ML can identify patterns, outliers, or anomalies connected to a particular attack [4], which helps to prevent such attacks from succeeding again. Nevertheless, cybersecurity measures are less effective at identifying zero-day attacks [5,6]. This kind of attack exploits a vulnerability that has not been disclosed; therefore, no specific security measures can be taken in advance.
ML as a part of intrusion detection systems (IDSs) has had positive results using different kinds of learning, including supervised, unsupervised, and reinforcement learning [5,7,8]. Supervised learning can identify well-known attacks with a high level of accuracy and a low level of false positives. However, it has mostly been tested on outdated datasets that do not represent real-world security scenarios, so the generalization of these results can be questioned, and the models may not be able to detect unknown attacks. Unlike supervised learning, unsupervised learning has had better results in identifying zero-day attacks through techniques such as clustering or association; however, the number of false positives has increased significantly [9,10]. Finally, reinforcement learning, which is the most recent type of learning, can handle the complexity of cybersecurity threats if the required time to learn is available [8]. More complex techniques used as a part of IDSs are leading to better results. Some additional techniques that are used in IDSs are meta-learning, layered models, artificial neural networks, and deep learning networks.
This paper provides a review of the most remarkable research works that have been developed in the IDS field. Specifically, it covers IDSs that aim to enhance the cybersecurity level of critical infrastructure with solutions based on ML techniques, as shown in Table 1. It should be taken into consideration that the characteristics of CI do not normally coincide with those of information technology networks. This work will help to identify the key aspects of intrusion detection in industrial systems. The challenges of developing IDSs for CI are also discussed. Although some surveys provide information about the application of ML to IDSs, they tend not to highlight the applications of these systems in industrial networks. This survey will also help to establish the most up-to-date IDSs for CI.

Research Objectives
The main objective of our research was to perform a systematic review of IDSs to improve the cybersecurity level of CI through ML techniques. The review covered the last five years. Additionally, there are two specific objectives:
• Synthesize and analyze the most representative research works that have been conducted to develop IDSs for industrial systems through ML techniques;
• Generate a discussion and a critical evaluation of the existing foundation of knowledge in the development of IDSs using ML techniques for the protection of CI.

Methodology
To ensure a systematic and representative review of IDSs that use ML technology in CI, the literature review adapts the methodology presented in 2016 by Antonio Tavares, Luiz Scavarda, and Annibal Scavarda [16]. This methodology has eight steps: "(1) planning and formulating the problem, (2) searching the literature review, (3) data gathering, (4) quality evaluation, (5) data analysis and synthesis, (6) interpretation, (7) presenting results, and (8) updating the review".
Following the chosen methodology, various combinations of keywords were used when searching the Scopus database, as shown in Table 2. After the methodology was applied, 166 documents were selected for in-depth analysis. Finally, 98 documents that positively contributed to the survey were included.

Fundamental Concepts
The following section aims to introduce the main concepts that are part of this study. First, this research focuses on the characteristics that make CI a type of infrastructure that must be secured. Then, this work explains ML techniques used in IDSs to protect CI. Finally, this study analyzes the cybersecurity datasets used to prove theories in real-world scenarios, or scenarios as close as possible to the real world.

Critical Infrastructure Concept
Concepts of critical infrastructure differ depending on the source [17]. From an academic perspective, there is a consensus on defining CI as an essential national asset that keeps society functioning [6,18], and its disruption could impact a nation or nations, causing a socioeconomic and political crisis [19]. Furthermore, nations have released their own definitions of CI that align with their characteristics and interests. For instance, the United States (US) government (by means of the Cybersecurity and Infrastructure Security Agency (CISA)) defines CI as any system and its components that are vital to the country, where their incapacitation or destruction could affect national security. The European Union Agency for Network and Information Security (ENISA) defines CI as any system, total or partial, that is vital to maintaining societal functions. In addition, a set of sectors that constitute CI has been established. While there is not a consolidated list that applies to every country, it is feasible to identify a list of sectors that are usually included. For example, energy was included as a critical sector by 17 countries in the European Union and the US. A CI category that is included by 15 countries in the European Union and identified by the US is the Information and Communication Technology sector. In the United States, this category is divided into two sectors, known as the Communication and Information Technology sectors. Water, food, financial, and transport sectors are other key sectors that are identified as CI [20,21]. Although each critical industry has its own infrastructure, these infrastructures are usually composed of industrial control systems (ICSs). These ICSs allow electronic control of the industrial process, as shown in Figure 1.
The proper functioning of our current society depends on CI [22]. Emerging technologies have become key to providing a high quality of life to citizens [23], and industrialized nations rely on information and communication technologies for transport, communication, money management, food production, and even health technology systems. Consequently, CI needs to be protected from a variety of risks [24], both physical and digital.
From a digital perspective, a cyber-attack could have a significant impact on the security sector, the economy, and public health, among others [25]. If an attack involves the communication networks that support CI (a CI category of its own), it has the potential to cause a ripple effect, resulting in a significant disruption of vital services in different CI sectors [19,20].

ML and IDSs to Protect CI
ICSs are now interconnected with information and communication technology (ICT) and with technology from the fourth industrial revolution (4IR), including the industrial internet of things (IIoT), 5G communications, and artificial intelligence (AI). Although this interconnectivity has brought numerous advantages to CI's performance, such as the quality of products and services, operational efficiencies, automation, and cost reductions [26], the number of cyber-attacks has increased since these industrial instruments have been connected to the internet [27]. Therefore, the attack surface used by attackers to compromise CI has expanded.
The authors of [28] describe a wide range of non-AI techniques that have been used to detect cyber-attacks on technology systems, and some of these techniques have been partially adapted for industrial systems and their particular attack vectors. These include game theory, rate control, heuristics, intrusion detection systems (anomaly-based and signature-based), autonomous systems, and end-user security controls. Machine learning (ML) has become one of the most useful methods to improve cybersecurity in CI. This is due to ML's capacity to manage enormous amounts of data and to its ability to detect anomalies, patterns, or outliers, which has dramatically improved [29]. Therefore, one of the most important applications of ML in cybersecurity has been in IDSs [30].
IDSs are classified in different ways depending on the criterion used. There are two well-known criteria: scope and methodology. By scope, IDSs are host-based or network-based. By methodology, they are signature-based, anomaly-based, or hybrid [31], as shown in Table 3. IDSs have implemented ML algorithms to obtain better performance than regular security systems. Regular systems lack accuracy in identifying and detecting unknown cyber-attacks and have some limitations in dealing with significant amounts of data [5], whereas ML models largely avoid these issues. Most ML algorithms have been tested as a part of IDSs. This started with supervised and unsupervised ML and more recently moved on to reinforcement learning, as shown in Figure 2. The results of these tests vary depending on the ML algorithm used and its configuration, namely the parameters and hyperparameters. However, a common obstacle is that previous studies were tested using inadequate datasets [5]. Although there are online datasets available for research purposes, they do not accurately represent the current security challenges and threats. Additionally, operators of CI avoid having data extracted from their networks as a security measure, as extracted data could expose their vulnerabilities. These limitations in the data used to train ML models could affect the outcome of the research, considering that a model can perform particularly well with one dataset but poorly with another [32].

Cybersecurity Datasets to Test IDSs
The most popular cybersecurity dataset to test IDSs is KDD-99 [33-35]. This collection of data originated in 1999 with the aim of correcting some of the weaknesses identified in its predecessor, CUP-99, which were the redundancy of data and the bias in some classes [35]. Although the NSL-KDD dataset was created in 2009 to offer an improved and updated version of KDD-99, more than a decade has passed since its release, and a decade is a considerable amount of time in the cybersecurity area, as threats and vulnerabilities mutate or evolve steadily. In [5], the authors compared the ML models used in IDSs. They found that 26 out of 65 articles used KDD-99 to prove their theory, 18 out of 65 articles used NSL-KDD, 9 out of 65 articles used KDD-CUP 99, and only two articles used customized datasets. Therefore, research to test previous theories in more accurate scenarios is still needed, as ML models depend on datasets to learn and their results are directly affected by the quality of the dataset [36]. Currently, there is no reliable dataset that represents both common and novel attacks [37], and the differences among security datasets have limited the evaluation methods [38]. To support the testing of new hypotheses, a variety of institutions and laboratories have released their datasets, as illustrated in Table 4. This does not solve the difficulties in testing, since the datasets are often not up to date, are not always freely available, lack diversity in the logs, and have incomplete documentation [39]. Despite this, these data collections are still helping researchers test new hypotheses. In general, most cybersecurity datasets cannot represent the networking behavior of CI. Most of them were created with standard architectures, protocols, and technologies that differ from those that are part of CI [57,58].
However, some datasets, such as NGIDS-DS [59,60], consist of both conventional and unconventional logs of network activities occurring at infrastructure levels in diverse industries. Datasets that represent the traffic between IIoT and CI are also available, such as TON_IoT [52,61], MQTT-IOT-IDS [62], X-IIoTID [63], and Edge-IIoTset [64]. These have logs for normal operation and for several attack types. For TON_IoT, the attack types are DoS, DDoS, and ransomware. MQTT-IOT-IDS has the following: aggressive scan, UDP scan, Sparta SSH brute force, and MQTT brute force. In the case of X-IIoTID, the attack types are brute force, dictionary attack, malicious insider, reverse shell, and man-in-the-middle. Edge-IIoTset has DoS, DDoS, information gathering, man-in-the-middle, injection, and malware [65].
Cybersecurity datasets usually have imbalanced data because normal traffic constitutes the majority of the datasets' logs [66]. This class imbalance can reduce the effectiveness of ML algorithms in identifying intrusions. There are three main techniques to deal with imbalanced data: oversampling, undersampling, and hybrid sampling. Moreover, there is a shortage of available datasets that represent ICSs and SCADA systems [67]. For instance, in [57], the authors developed a testbed of network traffic extracted from a water system to provide data on physical and network systems and to keep the dataset balanced. In [58], the authors focused on creating a testbed that represents physical components, such as controllers, sensors, and actuators. These components are usually part of CI and must be taken into consideration when developing any defensive solution such as an IDS. Therefore, if a cybersecurity dataset does not provide information from a cyber-physical environment, it should not be considered for testing cybersecurity measures for CI [68].
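The three sampling techniques named above can be illustrated with a minimal sketch. It uses plain random resampling, which is only one of several possible variants and is not taken from any of the surveyed works; the function name `resample`, the `strategy` values, and the toy 95/5 split are assumptions made for illustration.

```python
import random
from collections import Counter


def resample(rows, labels, strategy="over", seed=0):
    """Balance a labeled dataset by random resampling (illustrative only).

    strategy "over":   duplicate minority rows up to the largest class size
    strategy "under":  drop majority rows down to the smallest class size
    strategy "hybrid": bring every class to the mean class size
    """
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    counts = [len(members) for members in by_class.values()]
    if strategy == "over":
        target = max(counts)
    elif strategy == "under":
        target = min(counts)
    elif strategy == "hybrid":
        target = round(sum(counts) / len(counts))
    else:
        raise ValueError(f"unknown strategy: {strategy}")

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        if len(members) >= target:
            # Too many rows: keep a random subset (undersampling).
            chosen = rng.sample(members, target)
        else:
            # Too few rows: add random duplicates (oversampling).
            chosen = members + rng.choices(members, k=target - len(members))
        out_rows.extend(chosen)
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Hypothetical toy data: 95 "normal" flows and 5 "attack" flows.
rows = [[i] for i in range(100)]
labels = ["normal"] * 95 + ["attack"] * 5

for strategy in ("over", "under", "hybrid"):
    _, balanced = resample(rows, labels, strategy)
    print(strategy, dict(Counter(balanced)))
```

Random duplication does not add new information about attacks, so a sketch like this only changes the class proportions seen by the learner; synthetic-sample methods would be a separate design choice.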

Machine Learning in Intrusion Detection Systems (IDSs) to Protect CI
ML is a category of AI and is focused on helping computers to learn. This learning is based on previous knowledge from experiences, patterns, and behaviors [28]. Since the 1950s, when AI research started, a considerable amount of research has been conducted in almost every area of investigation, from agriculture to space. In the cybersecurity area, the ability to identify and learn from patterns is used to detect similar attacks. For instance, signature-based IDSs use ML to detect attacks whose signatures have been previously learned [31]. Although this kind of identification has produced excellent results in identifying well-known attacks, its performance is poor when applied to zero-day attacks. Furthermore, a small modification to an attack would change its signature, thus making it difficult for a signature-based IDS to identify the attack [5]. In the case of anomaly-based IDSs, an ML algorithm models the normal behavior of the network and identifies everything outside of the learned model as an anomaly. This kind of IDS is better at detecting unknown and zero-day attacks. However, the false positive rate is considerably higher, because abnormal behavior is not always an indication of an attack. For example, a plastic bag can block or alter the digital measurements of a sensor in a hydroelectric system, and while this is not a cyber-attack, it would be detected as an anomaly. More recent research has shown the benefits of a hybrid approach, i.e., combining the potential of both kinds of IDS. While a hybrid approach has some benefits, its use results in a complex system that is difficult to implement [7,31,34,69].
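The anomaly-based idea described above (learn normal behavior, flag everything outside it) can be sketched with a deliberately simple statistical model. The z-score rule, the threshold of 3.0, and the sensor readings are assumptions chosen for illustration; the surveyed IDSs use far richer models.

```python
import statistics


class ZScoreDetector:
    """Toy anomaly detector: models 'normal' as a mean and standard deviation."""

    def __init__(self, threshold=3.0):
        self.threshold = threshold  # how many standard deviations count as abnormal
        self.mean = 0.0
        self.std = 0.0

    def fit(self, normal_readings):
        # Learn the normal profile from attack-free readings only.
        self.mean = statistics.fmean(normal_readings)
        self.std = statistics.pstdev(normal_readings)
        return self

    def is_anomaly(self, value):
        if self.std == 0:
            return value != self.mean
        return abs(value - self.mean) / self.std > self.threshold


# Hypothetical water-level readings recorded during normal operation.
normal = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 10.3, 9.7]
detector = ZScoreDetector().fit(normal)

print(detector.is_anomaly(10.1))  # within the learned profile
print(detector.is_anomaly(2.0))   # far outside it, whatever the cause
```

The second reading is flagged whether it comes from an attacker tampering with the sensor or from a physical obstruction such as the plastic bag in the example, which is exactly why anomaly-based detection tends to raise more false positives.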
ML algorithms have been used to attack and defend in cyberspace [5,70]. From a protection point of view, ML classifiers have advantages for security systems. These advantages include the following: (1) decision trees can find an accurate set of "best" rules that are used to classify network traffic; (2) k-nearest neighbors (an interesting solution in IDSs) can learn patterns from new traffic to classify zero-day attacks as an unseen class; (3) support vector machines; and (4) artificial neural networks can adapt to new forms of communication, learn from incidents without retraining all models, and adjust their neurons' weights to identify unseen attacks [28]. All the previous examples share common characteristics: they depend on the quality of the dataset to learn to identify a cyber-attack, they conduct supervised learning, and they need periodic updates (there are different updating techniques depending on the trained model and particular needs). Nevertheless, the need to update the model applies not just to ML classifiers but to any ML model.
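As a rough illustration of point (2), the sketch below shows a k-nearest-neighbors classifier with a rejection rule: when even the closest training sample is farther away than a chosen distance, the query is reported as an unseen class. The rejection rule, the `max_distance` parameter, the label `"unseen"`, and the two-feature toy data are assumptions for this example and are not the mechanism described in [28].

```python
import math
from collections import Counter


def knn_predict(train, query, k=3, max_distance=None):
    """Classify `query` by majority vote among its k nearest training samples.

    train: list of (feature_vector, label) pairs.
    max_distance: optional rejection threshold (hypothetical); if the nearest
    neighbor is farther than this, the query is labeled "unseen".
    """
    ranked = sorted((math.dist(x, query), label) for x, label in train)
    nearest = ranked[:k]
    if max_distance is not None and nearest[0][0] > max_distance:
        return "unseen"
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical two-feature traffic samples (e.g., scaled rate and size).
train = [
    ((1.0, 1.0), "normal"), ((1.2, 0.9), "normal"), ((0.8, 1.1), "normal"),
    ((9.0, 9.5), "dos"), ((9.2, 9.1), "dos"), ((8.8, 9.3), "dos"),
]

print(knn_predict(train, (1.1, 1.0)))
print(knn_predict(train, (9.0, 9.0)))
print(knn_predict(train, (50.0, -20.0), max_distance=5.0))
```

Without the threshold, the last query would simply be assigned to whichever known class happens to be closest, which shows why the quality and coverage of the training data matter so much for these classifiers.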
The incorporation of the fourth industrial revolution's technologies, such as the internet of things, has exponentially increased the amount of diverse data that CI is generating [8]. Additionally, SCADA systems, which are the core of most CI, have implemented TCP/IP communication protocols [32], resulting in a wider attack surface with the possibility of more complex and diverse attacks [5]. There is a need to develop new technologies to cope with changing and novel risks. ML solutions have established a strong resistance against security threats [8]. Nonetheless, depending on experts' labeling is becoming impractical, as attackers are always changing their methods, and the exponential increase in real-time network traffic [28] has made it impossible to keep security rules updated. Additionally, it can be difficult to recognize patterns in unbalanced, noisy, or incomplete data [71], and these features are normally present in CI's network traffic. Consequently, unsupervised learning (UL) and reinforcement learning (RL) have become the most suitable solutions to cope with these problems. UL helps to uncover hidden characteristics, patterns, and structures from datasets to establish indicators of cyber-attacks [31,60] and, through clustering, has enhanced its capacity to identify novel attacks. RL learns from its own experience, and it is the closest to human learning. RL performs well when working in real-time adversarial scenarios [8], and its characteristics make it attractive as a cybersecurity solution.
Typical security solutions tend not to identify vulnerabilities that arise from the interaction of IT and physical systems [72,73]. There is a need to develop IDSs with specific characteristics that take into consideration CI requirements: (1) industrial control systems (ICSs) operate continuously and cannot be interrupted for long periods to carry out security management tasks, and the highest service availability is usually mandatory; (2) in industrial networks, jitter and delay are kept at lower levels than in IT networks; (3) the physical process is carried out by sensors, actuators, and programmable logic controllers (PLCs), which are key components for ICS operation, and their security is a priority [1]; (4) a cyber-attack on CI could scale and generate economic losses and social or political issues, and could even impact human lives [19]; and (5) ICS traffic is more stable, the payload depends on system specifications, and these systems usually use their own communication protocols [74].
Details of some of the ML algorithms used in IDSs are presented in Figure 2. The most frequently used ML method is supervised learning. This method has shown meaningful results in measures such as accuracy. Nevertheless, comparing results is not straightforward, since they are calculated using different measures, algorithms, and training datasets. In [60], the authors based their evaluation on the area under the curve (AUC) and obtained the best possible result (1.0). This measure does not allow the minimization of one type of error. Thus, the AUC is not useful if the optimization of false positives or false negatives is needed. More complex metrics are also used to evaluate the performance of IDSs, such as the Matthews correlation coefficient (MCC) and F1-score. The latter is becoming popular since it is computed as the harmonic mean of precision and recall [75]. There is a limitation in the parametric comparison of ML algorithms used in IDSs, and most of the analyzed works do not evaluate the results with a variety of measures [35,76,77]. The most common measure is accuracy, followed by precision, recall, and F1-score, as shown in Table 5. Calculating metrics such as the MCC, confusion matrices, specificity, sensitivity, and the kappa coefficient helps to understand the behavior of ML algorithms and to interpret the research results in depth, as in the case of [72], in which the authors report their results in more than five metrics. In the case of anomaly-based IDSs, the detection rate and false alarm rate are the most common metrics used to evaluate the detectors. Nonetheless, these cannot fully assess a detector designed to work in CI. For instance, detection latency is a key factor [74], because operators of CI need to know about a cyber-attack as soon as possible.
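The metrics discussed above can all be derived from the four cells of a binary confusion matrix. The following sketch uses the standard textbook definitions; the function name `binary_metrics` and the confusion-matrix counts are hypothetical and do not come from any of the surveyed works.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute common IDS evaluation metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0  # detection rate / sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)  # harmonic mean of precision and recall
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "false_alarm_rate": 1.0 - specificity,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical imbalanced test set: 10 attacks among 1000 flows,
# of which the detector finds only 5 and raises no false alarms.
for name, value in binary_metrics(tp=5, fp=0, tn=990, fn=5).items():
    print(f"{name}: {value:.3f}")
```

With these counts the accuracy is 0.995 even though half of the attacks are missed, while recall is 0.5 and the F1-score is about 0.667. This is the reason why accuracy alone is a weak basis for comparing IDSs trained on imbalanced data.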
Ensemble models obtained positive results using the F1-score as an evaluation metric; however, the training dataset could not represent the current threats because it dates from 1990 [31]. Models that used decision trees, neighbor-based models [27,76], and recurrent neural networks [82] obtained accuracy results over 0.96 with more up-to-date datasets. The problem to solve using an ML model is not always the same. In some cases, it is a binary classification, while in others, it is a multiclass classification. The number of classes depends on the security information available in the dataset and the model's purpose. From the cybersecurity perspective, it is not enough to detect a cyber-attack (binary classification). It would be better to know which kind of intrusion was detected in the system (multiclass classification), since this knowledge can determine how the incident is managed. There has been a surge in new techniques such as the clustering-based classification methodology named perceptual pigeon galvanized optimization (PPGO) [72]. Although this technique proposes a binary classification, it has good results not only in metrics such as accuracy but also in other evaluations such as MCC, confusion matrices, sensitivity, specificity, and F1-score. This kind of technique is a better candidate for implementation in industrial networks than some multiclass solutions with less accurate results. Additionally, PPGO is also a method for choosing the optimal features, which is always a challenge when working with ML. An analysis of some previous works on the development of IDSs using ML is shown in Table 6.
Feature selection (FS) is a demanding task, not only in the development of ML algorithms for industrial systems but also in any solution that implements ML. In classification problems, an adequate FS technique finds the features that best solve the problem, increases the classification accuracy, and decreases the training and testing time. There are different techniques for FS, and some of the most common are wrapper methods, which include forward, backward, and stepwise selection; filter methods, which include measures such as Pearson's correlation and analysis of variance (ANOVA); and embedded methods, in which FS takes place as part of building the model, as in decision trees. Additional methods or tools that can be used for FS have been developed, such as principal component analysis (PCA). Although a deep analysis of the FS techniques is out of the scope of this review, it is necessary to highlight their importance. An example of an FS algorithm for IDSs in CI was developed in [73], where the authors present a wrapper method composed of the bat algorithm and support vector machines (SVMs). The results were positive in different measures; however, the study was carried out with the benchmark KDD Cup dataset from 1999, which might limit its implementation in real-world scenarios, since the dataset cannot represent the characteristics of current attacks and only has data from four kinds of attacks: denial-of-service attacks, which prevent users from accessing services; probe attacks, which scan for vulnerabilities; remote-to-local attacks, which obtain access from remote connections; and user-to-root attacks, which obtain root access from a normal user account.
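To make the filter family of FS methods more concrete, the sketch below ranks features by the absolute value of their Pearson correlation with a binary label and keeps the top ones. It is not the wrapper method of [73]; the feature names, the toy values, and the `select_features` helper are hypothetical.

```python
import math
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for constant inputs."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def select_features(columns, labels, top_k):
    """Filter method: score each feature independently of any classifier."""
    scores = {name: abs(pearson(values, labels))
              for name, values in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_k], scores


# Hypothetical flow features; label 1 marks attack traffic.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "packet_rate": [10, 12, 11, 9, 95, 102, 99, 97],
    "duration": [5, 3, 6, 4, 7, 5, 8, 6],
    "port_parity": [0, 1, 0, 1, 0, 1, 0, 1],
    "constant_flag": [1, 1, 1, 1, 1, 1, 1, 1],
}

selected, scores = select_features(columns, labels, top_k=2)
print(selected)
```

A filter of this kind is cheap because it never trains a classifier, but it scores each feature in isolation; wrapper methods such as the one in [73] evaluate feature subsets with the classifier itself at a higher computational cost.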

Table 6 (fragments; only parts of two rows survive). CICDS: the dataset contains network flows; low false positives, low computational cost, and low detection time; the model was applied just to binary classification and assumes that the variables are independent. NF-BoT-IoT (V2): low false positives, low computational cost, and low detection time.
As shown in [33,71,75], the detection time is a factor that should be considered and calculated. Although proper identification is mandatory to protect CI, the detection time is key in avoiding escalation, mitigating the major effects of a cyber-attack, and being able to continue to offer the service.
Currently, to overcome the identified setbacks related to the application of ML algorithms in IDSs, there has been a tendency to use hierarchical, layered [33], hybrid, or meta-learning algorithms. These algorithms improve the capacity to detect unseen and infrequent attacks while preserving accuracy in the detection of well-known attacks. In general, the output of one model is used as the input for the next one, and multiple combinations of models have been shown to produce positive results in measures such as accuracy, as shown in Table 5. The results are generally well accepted and much better than those of a classical approach. However, some of them have been validated on datasets that, for the most part, do not represent current threats, thus diminishing the capacity to generalize the results and raising doubts about their behavior in the real world. Furthermore, there is a concern about the technical requirements needed to develop and support the models. Additionally, they have not been successful at identifying all types of intrusions [34]. Most models lack proper adaptivity [83], as the attackers' changing patterns are usually not identified. In some cases, they require human intervention to introduce new vulnerabilities; however, the number of new vulnerabilities could outpace the capacity for such manual updates.
In [84], the authors present a hybrid approach that focuses on dealing with highly imbalanced data in SCADA. This proposal combines a customized content-level detector, a Bloom filter, with an instance-based learner (k-nearest neighbor (KNN)). The detector is signature-based; therefore, it cannot detect attacks that were not previously identified. To overcome this issue, the authors used KNN. However, the performance is highly dependent on the number of neighbors considered for classification. Implementing hybrid algorithms with unsupervised learning is also an option, as presented in [85], where a mutated self-organizing map algorithm (MUSOM) deployed an agent that identified the node behavior as malicious or normal. MUSOM aims to reduce the learning rate, which is a positive characteristic when developing security systems for SCADA because it decreases the training time without increasing the memory requirements.
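The general idea of pairing a signature stage with an instance-based learner can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the design in [84]: the Bloom filter is assumed to hold known attack signatures, and the payload strings, feature vectors, filter size, and k = 3 are all hypothetical.

```python
import hashlib
import math


class BloomFilter:
    """Fixed-size Bloom filter; membership tests may yield false positives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = [False] * size_bits

    def _positions(self, item):
        # Derive several bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def __contains__(self, item):
        return all(self.bits[pos] for pos in self._positions(item))


def knn_predict(train, query, k=3):
    """train is a list of (feature_vector, label); majority label of the k nearest."""
    nearest = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)


def classify(payload, features, bloom, train, k=3):
    """Stage 1: signature lookup. Stage 2: KNN on features for unknown payloads."""
    if payload in bloom:
        return "attack"
    return knn_predict(train, features, k)


# Hypothetical signatures and labelled feature vectors.
bloom = BloomFilter()
for signature in ["write_coil:0xFFFF", "force_listen_only"]:
    bloom.add(signature)

TRAIN = [
    ((1.0, 1.0), "normal"), ((1.2, 0.9), "normal"), ((0.9, 1.1), "normal"),
    ((5.0, 5.0), "attack"), ((5.2, 4.8), "attack"), ((4.9, 5.1), "attack"),
]

known = classify("write_coil:0xFFFF", (1.0, 1.0), bloom, TRAIN)
unseen_attack = classify("unseen_payload_a", (5.1, 5.0), bloom, TRAIN)
unseen_normal = classify("unseen_payload_b", (1.1, 1.0), bloom, TRAIN)
```

The dependence on the number of neighbors noted above is visible in `knn_predict`: changing `k` changes which training points vote, and with imbalanced data a large `k` lets the majority class dominate.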
In [60], meta-learning approaches (bagging, boosting, stacking, cascading, delegating, voting, and arbitrating) with unsupervised learning were tested on 21 datasets, and the authors concluded that no algorithm outperformed the others during the research. Despite this, they were able to recognize that some factors would improve the results, such as implementing accurate parameter tuning or using a better feature extractor.
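Of the meta-learning schemes listed above, voting is the simplest to illustrate: several base detectors each emit a label, and the ensemble returns the most common one. The sketch below uses three hypothetical rule-based detectors in place of the unsupervised learners tested in [60]; the thresholds, field names, and the set of allowed ports (502 for Modbus TCP, 20000 for DNP3) are assumptions made for the example.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most common label among the base predictions."""
    return Counter(predictions).most_common(1)[0][0]


def rate_detector(flow):
    return "attack" if flow["packets_per_second"] > 500 else "normal"


def size_detector(flow):
    return "attack" if flow["mean_packet_size"] < 100 else "normal"


def port_detector(flow):
    # Assumed allow-list of industrial protocol ports for this toy example.
    return "attack" if flow["dst_port"] not in {502, 20000} else "normal"


DETECTORS = [rate_detector, size_detector, port_detector]


def ensemble_predict(flow):
    """Hard-voting ensemble over the base detectors."""
    return majority_vote([detector(flow) for detector in DETECTORS])


flood = {"packets_per_second": 900, "mean_packet_size": 60, "dst_port": 502}
benign = {"packets_per_second": 10, "mean_packet_size": 400, "dst_port": 8080}

flood_label = ensemble_predict(flood)
benign_label = ensemble_predict(benign)
```

An odd number of base detectors avoids ties in the binary case; stacking would instead feed the base predictions to a further model rather than counting them.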
Another approach is to develop models that detect specific attacks, as shown in Table 7. This approach mainly focuses on the most frequent and high-impact attacks on CI, such as distributed denial-of-service (DDoS) attacks, which affect a service's availability. In detecting DDoS attacks, classification accuracy above 0.97 has been obtained [32]. Interrupting the availability of CI tends to have the most severe impact on people's daily lives, as it interferes with access to everyday commodities such as energy, communications, and water. Although the other information security properties, integrity and confidentiality, are also vital, CI operators always prioritize availability over all other considerations [22,24]. In previous research, as illustrated in Table 5, there are positive results for IDSs that implement ML techniques, some of which exceed 0.99 in accuracy. Nonetheless, the training datasets do not contain logs from cyber-physical components such as sensors or actuators. These components are essential for the operation of CI and have specific characteristics [92]. Therefore, the results can be imprecise because the datasets used to train the models are inaccurate and outdated. Additionally, the kinds of cyber-attacks that CI is a victim of differ from the typical attacks on other infrastructure mainly due to (1) the physical components that are involved, (2) the real-time data transmission, (3) the geographically distributed components [93], (4) the kind of attacker, and (5) the attack motivation. When these characteristics are taken into consideration, a different set of threats is analyzed, as shown in Table 7. These types of attacks include elements such as the alteration or disruption of the information issued by specific sensors [87,88].
Finally, from the cybersecurity point of view, the design of new ML-based IDSs should consider their robustness against adversarial attacks. These attacks exploit the vulnerabilities of ML systems to bypass IDSs [94]. Adversarial attacks use different attack vectors, for instance, the alteration of the classifier to change the output, the modification of the input data, and an adversarial honeypot. Some of the techniques used to develop an adversarial attack are the fast gradient sign method (FGSM) and projected gradient descent (PGD), which add noise to the original data [95]. These attacks are particularly challenging, as some authors argue that the maximum mean discrepancy (MMD) might not be effective in distinguishing legitimate from malicious traffic. However, previous research works have found that, if modifications are made to the original implementation, MMD would help in the identification of adversarial attacks [96]. Defense techniques were also implemented to improve the security of ML-based IDSs in [94,97], where the authors proposed three categories: (1) modifying the input data, that is, augmenting the original dataset to improve the capacity of generalization (Gaussian data augmentation); (2) modifying the classifier, that is, changing the loss function or adding more layers (gradient masking); and (3) adding an external model, that is, adding one or more models during the test while keeping the original (generative adversarial networks (GANs)).
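To make the FGSM idea concrete, the sketch below applies a single FGSM step to a logistic-regression detector: the input is shifted by a small amount epsilon in the direction of the sign of the loss gradient with respect to the input. The weights, bias, sample, and epsilon are hypothetical values chosen so that the effect is visible; they do not come from [95] or any cited model.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """Probability that x is an attack under a logistic-regression model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def sign(value):
    return (value > 0) - (value < 0)


def fgsm(weights, bias, x, y_true, epsilon):
    """One FGSM step: x_adv = x + epsilon * sign(dLoss/dx), binary cross-entropy loss."""
    p = predict_proba(weights, bias, x)
    # For logistic regression with cross-entropy, dLoss/dx_i = (p - y) * w_i.
    grad = [(p - y_true) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]


# Hypothetical trained model and a sample it correctly flags as an attack.
WEIGHTS = [2.0, -1.0]
BIAS = -0.5
x = [1.0, 0.5]
y = 1  # true label: attack

x_adv = fgsm(WEIGHTS, BIAS, x, y, epsilon=0.5)
p_before = predict_proba(WEIGHTS, BIAS, x)
p_after = predict_proba(WEIGHTS, BIAS, x_adv)
```

With these assumed values, the perturbed sample falls below the 0.5 decision threshold even though each feature moved by only epsilon. PGD repeats such steps several times while projecting the result back into an allowed perturbation range.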

Conclusions and Future Direction
In this paper, we have presented a survey on IDSs that have been developed for the protection of CI, based on data from the last five years. These IDSs use ML techniques as a principal component to detect cyber-attacks. Although there are meaningful advances in the development of detection tools for the accurate identification of known attacks, there are still challenges, such as the detection of zero-day attacks, model updating, and the high rate of false positives. Future research could focus on addressing these challenges. This work highlights the weaknesses and strengths of (1) the ML techniques used to improve the cybersecurity level of CI; (2) the cybersecurity datasets; and (3) the CI security requirements. Finally, it serves as a starting point for forthcoming studies.
The protection of CI is a national security concern [1], and its cybersecurity models depend on traditional approaches that typically utilize standalone security solutions [98]. Systems such as IDSs incorporate ML solutions to improve the prediction capacity, and different kinds of learning methods have been implemented, but their results do not cover all the protection levels required to secure CI. On the one hand, supervised learning has been producing positive results when identifying well-known attacks, but it struggles to detect zero-day attacks. On the other hand, unsupervised learning, which is better at detecting unknown attacks, does not obtain the same results on known attack vectors. Additionally, reinforcement learning has been incorporated to resolve high-dimensional cyber defense problems [8]. More complex approaches are being developed, and meta-learners and artificial neural networks have been tested.
Although the results seem promising in the anomaly detection field, most of the testing was carried out with datasets that do not represent CI network traffic under either past or present cyber threats, which calls into question the algorithms' generalization capacity in real-world scenarios. There is a need for accurate characterization of data extracted from CI networks, not only to train network-based IDSs but also to help in the development of host-based IDSs. Developing a more accurate dataset is an open area of research that would contribute greatly to closing the gap between academic findings and real-world applications.
Comparing results with previous works is challenging. Tables 5 and 6 show some works that detect cyber-attacks using ML techniques; however, comparing them is not easy because they used different datasets and techniques and, in some cases, calculated different metrics or only the accuracy of the model [30,72], and accuracy alone is not enough to evaluate an ML model. Particularly in ICSs, the detection time is a factor that must be calculated. Additionally, the works might not provide enough information to replicate the model. Thus, advances in how to compare ML models are considered an encouraging research area. Finally, there is a need to close the gap between cybersecurity systems and incident management, so that organizations can undertake appropriate control measures to mitigate risk proactively [18].
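The point that accuracy alone can mislead is easiest to see on imbalanced data. The sketch below computes accuracy, precision, recall, F1, and the false positive rate for a hypothetical detector that labels every sample as normal; the 95/5 class split is an assumption chosen for illustration and is not drawn from any dataset in Tables 5 and 6.

```python
def evaluate(y_true, y_pred, positive=1):
    """Return common binary classification metrics as a dictionary."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
    }


# Hypothetical imbalanced test set: 95 normal samples (0) and 5 attacks (1).
y_true = [0] * 95 + [1] * 5
# A degenerate detector that never raises an alert.
y_pred = [0] * 100

report = evaluate(y_true, y_pred)
```

Here the detector reaches an accuracy of 0.95 while its recall is 0, meaning it misses every attack, which is why reporting several metrics makes comparisons between works more meaningful.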

Conflicts of Interest:
The authors declare no conflict of interest.