Adaptive IDS for Cooperative Intelligent Transportation Systems Using Deep Belief Networks

Abstract: The adoption of cooperative intelligent transportation systems (cITSs) improves road safety and traffic efficiency. Vehicles connected to cITS form vehicular ad hoc networks (VANET) to exchange messages. Like other networks and systems, cITSs are targeted by attackers intent on compromising and disrupting system integrity and availability. They can repeatedly spoof false information, causing bottlenecks, traffic jams and even road accidents. The existing security infrastructure assumes that the network topology and/or attack behavior is static. However, the cITS is inherently dynamic in nature. Moreover, attackers may have the ability and resources to change their behavior continuously. Assuming a static IDS security model for VANETs is not suitable and can lead to low detection accuracy and high false alarms. Therefore, this paper proposes an adaptive security solution based on deep learning and contextual references that can cope with the dynamic nature of the cITS topologies and increasingly common attack behaviors. In this study, deep belief networks (DBN) modeling was used to train the detection model. Binary cross entropy was used as a loss function to measure the prediction error. Two activation functions were used, ReLU and Sigmoid, for input–output mapping. The ReLU was used in the hidden layers, while the Sigmoid was used in the last layer to map the real vector to output between 0 and 1. The adaptation mechanism was incorporated into the detection model using a moving average that monitors predicted values within a time window. In this way, the model can readjust the classification thresholds on-the-fly as appropriate. The proposed model was evaluated using the Next Generation Simulation (NGSIM) dataset, which is commonly used in such related works. The result is improved accuracy, demonstrating that the adaptation mechanism used in this study was effective.


Introduction
Cooperative intelligent transportation systems (cITSs) collect data from the end nodes (i.e., endpoints). These data are stored locally and shared with the other nodes [1][2][3]. The cITS adopts one of two information-sharing standards, the European standard [4] or the American standard [4]. On the one hand, the European standard defines two types of messages, the Cooperative Awareness Message (CAM) and the Decentralized Environmental Notification Message (DENM) [5]. The CAMs are sent periodically and carry information about the vehicles such as their position, size, speed, and steering wheel angle. The DENM messages carry information about events that occur on road sections, such as lane changes and (sudden) braking. On the other hand, the American standard defines context information messages called basic safety messages (BSMs), which carry different information such as position, heading, speed, acceleration, steering wheel angle, vehicle role, vehicle size and status of vehicle lights [6]. If an event happens, then the BSM also carries the related event information.
cITSs thus enable information sharing among neighboring nodes (i.e., vehicles). Unfortunately, this comes at the cost of needing to address several threats that target data and system integrity [7,8]. These threats could be imposed by either human-crafted attacks or malware [7,9-11]. Threats that target cITS systems can disable or disrupt the function of one or more components in the vehicle's navigation system [12]. For example, attackers can spoof the exchanged data to inject false mobility information, which is then exchanged among neighboring vehicles, causing erroneous actions and calamitous outcomes.
Threat actors use sophisticated strategies and employ malware to carry out various attacks against cITSs [13,14]. These attacks could come from nodes inside or outside the network. Outside attacks by threat actors that are not part of the network are easy to detect, whereas inside attacks are usually carried out via legitimate but compromised vehicles. Such inside attacks are more challenging to detect. Typical cITS targeted attacks include jamming, replay, Sybil, and data falsification.
Jamming is carried out by overwhelming individual cITS nodes with an enormous number of messages, which disrupts connectivity with the cITS; it is a type of denial-of-service attack [15]. The consequences include message loss within the cITS, causing a data insufficiency situation that adversely affects the accuracy of the intrusion detection systems (IDS) trained on such data. Replay attacks occur if the attacker can impersonate an original node, enabling the interception of messages exchanged between the vehicles and thereby the injection of false data by re-sending them to a victim node [16]. Likewise, a Sybil attack creates several identities and uses them to send poisoned (fake) BSM messages that deceive victim nodes; as such, a Sybil attack compromises network services when an attacker subverts the service's reputation system by creating a large number of pseudonymous identities and then using them to gain a disproportionately large influence. Thus, false data injection can be used to share and promote false information about the current traffic situation on the road for the purpose of disrupting traffic flow and triggering congestion.
Data falsification is another type of attack that can be conducted to compromise BSM messages exchanged between cITS nodes. The first step is to compromise a legitimate node and employ it to share false data with neighboring vehicles. Since the compromised node has been previously authenticated, a trust relationship has already been established with other nodes in the cITS network. Attackers can utilize this fact to spread the false data using the compromised node [5]. Attackers thus manipulate the BSM and inject false data, which is then shared with neighboring nodes [17]. The false data may cause a vehicle to take unexpected actions such as sudden braking, lane changing, and/or sudden acceleration. Therefore, taking security measures to protect BSM messages is crucial [6].

Related Works
The current solutions proposed for protecting the cITSs can be categorized into node-centric and data-centric IDSs. Some of these solutions try to protect the system against threats coming from the outside caused by Sybil, malware, and DoS attacks. By comparing the patterns from incoming traffic with the patterns of normal applications, those solutions can detect suspicious threats and raise alarms. Other solutions focus on detecting misbehaving nodes in cITSs. These solutions aim to protect the system against threats carried out by legitimate yet compromised nodes, which is more challenging as those nodes are trusted and thus less suspicious [18]. Nonetheless, most of these solutions assume that the cITS is stationary. Such an assumption is not realistic, as the ephemeral nature of cITSs makes their topology very dynamic and constantly changing. Developing data-driven detection solutions on presumed stationary data prevents them from handling the numerous and rapid changes typical inside the cITS. These solutions quickly become outdated and, consequently, their accuracy decreases. Some studies have tried to rectify the issue by adopting solutions with the dynamic nature of the operating environment in mind [8]. These solutions, again, are typically categorized into node-centric and data-centric.
Existing IDS proposals for cITS rely on the BSM messages exchanged between the communicating vehicles as well as the contextual metadata that describes the operating environment. Such data in many studies are static, which might not be suitable for dynamic cITSs where the node's operational environment changes continuously. Therefore, static security thresholds quickly become outdated. This represents a major issue for existing IDS solutions. To address this issue, some studies have proposed solutions, such as the context-aware data-centric misbehavior detection scheme (CA-DC-MDS) developed by [13]. In this solution, static thresholds are replaced by a dynamic threshold statistically determined using a contextual model, which is constructed and updated online. The sequential analysis of temporal and spatial correlation is conducted using Kalman and Hampel filters to assess the consistency of mobility data exchanged between neighboring vehicles. The Kalman filter tracks mobility data from the neighboring vehicles, while the Hampel filter assesses the consistency of these data. Based on its proximity to the threshold, the message containing the data is classified as either normal or suspicious. However, the scheme assumes that data collected at the early phases after the model has updated its profile are sufficient for consistency assessment. This is not realistic in most cases, as the contextual data that describe the new situation are not yet ready for a variety of reasons as described below.
Node-centric IDSs determine whether a vehicle is malicious based on how it behaves on the road section [19]. The trustworthiness of legitimate vehicles is also assessed based on such behavior, which can be perceived by observing the number and validity of BSM messages shared by the vehicle [20,21]. Reputation-based evaluation is usually adopted for the trustworthiness estimation of each node in the cITS. The estimation is performed by a voting strategy whose outcome relies on the majority concept. However, relying on node behavior is sub-optimal because the cITS is non-stationary and since nodes change their behavior as the topology changes [22,23]. Moreover, relying on a voting approach for the trustworthiness estimation is always biased towards the majority, which in some cases, can be compromised when the attacker gains a majority foothold. A case in point occurs when attackers use advanced and sophisticated attack strategies such as malware and botnets to create a majority of rogue nodes enabling them to control the trustworthiness estimation. Consequently, such reputation-based mechanisms used by node-centric solutions cannot be trusted for the early identification of misbehaving or faulty vehicles [6].
Another set of IDSs for cITS adopt the data-centric detection approach by inspecting the BSM messages exchanged between the neighboring vehicles. These solutions perform several checks to determine whether the messages are falsified. BSM messages are checked against several criteria such as consistency and plausibility to determine whether they are trustworthy [6]. The consistency checks that BSM messages undergo in data-centric solutions determine whether the data shared by the node are consistent with the general context from the particular cITS. By vetting these BSMs, data-centric solutions can also identify the plausibility of the shared data to help in determining validity (i.e., whether they are in-line with those coming from other nodes in the cITS system).
The node-centric and data-centric approaches adopted in existing IDS solutions for cITS rely on estimating the reputation of the nodes and trustworthiness of the data they share with each other. However, both approaches have inherent weaknesses and may not be suitable for tumultuous environments such as cITSs. In such dynamic systems, the nodes join and leave the network frequently, which creates an unstable topology. This makes it difficult to capture sufficient and consistent patterns that represent all behavioral aspects of the nodes. Existing security solutions with rigid thresholds are therefore not suitable, as they do not have the sufficient data needed for accurate decisions, and they suffer from a high rate of false alarms. Data insufficiency also makes it difficult for the adaptive mechanisms used by some solutions to accurately calculate the new thresholds, which has a further negative effect on IDS accuracy.
The contribution of this study is two-fold:
• A bi-variate moving average (BiMAV) technique was proposed. Unlike existing methods that only rely on the values estimated at the output layer, BiMAV correlates the changes of the output layer with the averaged input variables. Such an approach provides precise change detection by avoiding the instantaneous changes that could compromise the stability of the detection model.
• The proposed method was incorporated into the detection model, which helps to prevent the unnecessary re-adjustment of security thresholds at the output layer of the DBN classifier thanks to the bivariate-based moving average used to monitor and detect the change in the classification accuracy estimation.
The rest of the paper is organized as follows. Section 3 presents the methodology in which we describe the proposed solution. The results are analyzed and discussed in Section 4 along with a comparison with existing related work. Section 5 concludes the paper with a summary of the contribution and findings.

Methodology
Given the literature reviewed above, we have concluded that the ephemeral nature of cITSs is a major challenge that makes many existing solutions ineffective. To overcome this challenge, we propose an adaptive IDS for cITS. Our adaptive approach can cope with the dynamic nature of the cITS operating environment. A bi-variate moving average (BiMAV) method was developed to detect potential deviation from the existing threshold used by the detection model. Unlike existing methods that rely only on the values estimated at the output layer, BiMAV correlates the change of the output layer with the averaged input variables. Such an approach provides precise change detection by avoiding the instantaneous changes that would eventually compromise the stability of the detection model. The proposed method prevents the unnecessary readjustment of security thresholds at the output layer of the DBN classifier thanks to the bivariate-based moving average used to monitor and detect the change in the classification accuracy estimation. This is important for dynamic environments such as cITSs where sufficient data might not be available. Adaptation is triggered based on the amount of change. In other words, if the difference exceeds a certain limit (i.e., according to the standard deviation), retraining the model is triggered. Model retraining is performed based on the new data. If the difference does not exceed the threshold, there is no need for retraining.
The proposed solution relies on the supervised learning approach. The deep belief network (DBN), a well-known deep learning algorithm, is used to train the IDS based on data collected from the BSM messages. Before training, the data are pre-processed to make them suitable for ingestion by the DBN. As part of the preparation, noisy data are removed, and data normalization is carried out. During data normalization, the values of all attributes are converted to a range of 0-1. This ensures that all attributes are on the same scale and prevents those with higher ranges from having undue influence over the model's output decision.
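The normalization step described above can be sketched as a per-attribute min-max scaling. This is a minimal illustration, not the authors' preprocessing code; the function name `min_max_normalize` and the sample records are hypothetical.

```python
from typing import List


def min_max_normalize(rows: List[List[float]]) -> List[List[float]]:
    """Scale every column (attribute) of `rows` to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    normalized = []
    for row in rows:
        scaled = []
        for value, low, high in zip(row, lows, highs):
            span = high - low
            # A constant column carries no information; map it to 0.0.
            scaled.append((value - low) / span if span else 0.0)
        normalized.append(scaled)
    return normalized


# Hypothetical BSM-like records: [speed (m/s), acceleration (m/s^2)]
records = [[10.0, -2.0], [20.0, 0.0], [30.0, 2.0]]
print(min_max_normalize(records))
```

After scaling, speed and acceleration share the same range, so neither dominates the model input purely because of its units.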
The data are now ready for the mutual information feature selection (MIFS) process, which selects discriminative features to reduce data dimensionality. This avoids the overfitting problem that negatively affects the accuracy of the IDS [24,25]. By selecting the most relevant features, the model also generates fewer false alarms, which contributes to higher precision. Furthermore, reducing data dimensionality helps decrease the model complexity, which is more favorable for ephemeral environments such as cITSs. The MIFS ranks the features based on the entropy, such that those with a higher entropy value correspond to a lower rank. Then, the MIFS selects the top-n ranking features (with n chosen experimentally to give higher accuracy). The selected features are then used as input for the DBN algorithm.
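As a rough illustration of the ranking idea, the sketch below scores each feature by its mutual information with the class label and keeps the n best. It is a simplified, relevance-only ranking and not the full MIFS criterion; it also assumes the features have already been discretized. The names `mutual_information` and `select_top_n` and the toy data are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence


def mutual_information(feature: Sequence[int], labels: Sequence[int]) -> float:
    """Mutual information I(X; Y) in bits between two discrete sequences."""
    total = len(labels)
    joint = Counter(zip(feature, labels))
    px = Counter(feature)
    py = Counter(labels)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / total
        mi += p_xy * math.log2(p_xy / ((px[x] / total) * (py[y] / total)))
    return mi


def select_top_n(columns: List[Sequence[int]], labels: Sequence[int], n: int) -> List[int]:
    """Return the indices of the n features sharing the most information with the label."""
    scores = [mutual_information(col, labels) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda i: scores[i], reverse=True)
    return ranked[:n]


labels = [0, 0, 1, 1]          # 0 = normal, 1 = attack (toy labels)
columns = [
    [0, 0, 1, 1],              # perfectly informative about the label
    [0, 1, 0, 1],              # independent of the label
    [0, 0, 0, 1],              # partially informative
]
print(select_top_n(columns, labels, n=2))
```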
During the model's training phase, the DBN is trained using the data and features selected by the MIFS. The DBN model is composed of several layers, namely input, hidden and output. The number of input layer nodes is determined by the number of features selected by the MIFS. These nodes receive data and pass them to the hidden layers after scaling (i.e., multiplying) them by an input weight. In our methodology, the hidden part of the DBN is constructed from three layers. The number of hidden layers is determined based on an overfitting factor during the training phase. The number of nodes in each hidden layer is determined based on the bias factor during the training phase as well. The value of the bias factor was set to 0.25, multiplied by the standard deviation σ(W) of the previous window. Therefore, the number of nodes in the hidden layers was taken as a percentage of the original number. We start with 18 nodes, because the number of nodes in a hidden layer should be lower than the number of nodes in the input layer. In the hidden layers, the data are processed based on the activation function used by the hidden nodes. The ReLU function is used as the activation function in all nodes in the hidden layers of the DBN, except the layer that precedes the output, where the sigmoid function is used. The sigmoid maps the node outputs into values between 0 and 1, which are needed for prediction. The output layer receives the data from the sigmoid functions in the last hidden layer and determines whether the instance is malicious or normal based on a threshold σ, where values greater than σ are considered as attacks.
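The activation and thresholding steps can be illustrated with a small forward pass. This sketch covers only inference through ReLU hidden layers followed by a single sigmoid unit; it does not implement DBN pre-training or the authors' architecture, and the weights, the function names, and the default threshold of 0.5 are all assumptions chosen for illustration.

```python
import math
from typing import List, Sequence, Tuple

Layer = Tuple[List[List[float]], List[float]]  # (weights[j][i], biases[j])


def relu(values: Sequence[float]) -> List[float]:
    return [max(0.0, v) for v in values]


def sigmoid(value: float) -> float:
    return 1.0 / (1.0 + math.exp(-value))


def dense(inputs: Sequence[float], weights: List[List[float]], biases: List[float]) -> List[float]:
    """Weighted sum per unit; weights[j][i] connects input i to unit j."""
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, biases)]


def classify(features: Sequence[float], hidden_layers: List[Layer],
             out_weights: List[float], out_bias: float,
             threshold: float = 0.5) -> Tuple[float, bool]:
    """Return (score in (0, 1), is_attack) for one instance."""
    activation = list(features)
    for weights, biases in hidden_layers:
        activation = relu(dense(activation, weights, biases))
    score = sigmoid(sum(w * a for w, a in zip(out_weights, activation)) + out_bias)
    # Scores above the threshold are treated as attacks.
    return score, score > threshold


# Toy weights, chosen by hand purely for illustration.
hidden_layers = [([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0])]
score, is_attack = classify([1.0, 0.0], hidden_layers, [2.0, -2.0], 0.0)
print(score, is_attack)
```

Because the final score is bounded between 0 and 1, moving the threshold is enough to trade detection rate against false alarms without touching the network weights.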

Training and Testing
The DBN model was trained using the 10-fold cross-validation method. In each round of the training/testing process, the data were divided into two sets, i.e., training and testing. The training set builds the model, while the testing set evaluates its accuracy. The size of the training set was 90% of the data and the testing set was the remaining 10%. This process was repeated 10 times and the accuracy of the model was recorded each time. At the end of the training/testing process, the averaged accuracy was calculated, which determines the overall model accuracy.

Model Adaptation Using Bi-Variate Moving Average
Our proposed model, as described above, is aimed at improving detection within the dynamic cITS environment. Therefore, here we describe the adaptation capability needed to ensure that the model can better handle the constantly changing network topology. We propose a bi-variate moving average (BiMAV) model adaptation method that observes the model performance and adapts to the change in the operating environment. The proposed method follows the progressive modeling used by works that rely on time series data [26]. The method uses a two-dimensional window for change detection. That is, the window defines two variables, the aggregated input values and the estimated output. Within this window, the accuracy trend is monitored against a threshold calculated based on the standard deviation from previous windows. Equation (1) implements the BiMAV method, where X_i and Y_j are the input features and estimated output values, respectively. The variable n represents the number of features, while l represents the number of instances in the window. The retraining is triggered if the value of BiMAV is higher than the standard deviation of the previous windows, as expressed by Equation (2), where σ(W) represents the standard deviation of the previous windows. The decision that Equation (2) makes is binary, as it determines whether re-training is needed or not based on the threshold σ(W).
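One possible reading of this retraining trigger is sketched below. The exact form of the window statistic is an assumption: here each instance contributes the product of its averaged input features and its estimated output, and retraining is triggered when the current window drifts from the mean of the previous windows by more than their standard deviation. The names `window_statistic` and `needs_retraining` are hypothetical and the code is not the authors' implementation.

```python
from statistics import mean, pstdev
from typing import List, Sequence, Tuple

Instance = Tuple[Sequence[float], float]  # (n input features, estimated output)


def window_statistic(window: List[Instance]) -> float:
    """ASSUMED form: average over the l instances of mean(inputs) * output."""
    return mean(mean(features) * output for features, output in window)


def needs_retraining(previous: List[List[Instance]], current: List[Instance]) -> bool:
    """Binary decision: True when the drift exceeds sigma(W) of previous windows."""
    history = [window_statistic(w) for w in previous]
    drift = abs(window_statistic(current) - mean(history))
    return drift > pstdev(history)


# Toy windows with one instance each, for illustration only.
previous = [
    [([0.2, 0.4], 0.5)],
    [([0.2, 0.4], 0.4)],
    [([0.2, 0.4], 0.6)],
]
stable = [([0.2, 0.4], 0.52)]   # close to what earlier windows looked like
shifted = [([0.8, 0.9], 0.9)]   # inputs and outputs both moved sharply
print(needs_retraining(previous, stable), needs_retraining(previous, shifted))
```

Tying the decision to both the inputs and the output means a brief spike in the output alone moves the statistic less than a sustained shift in both, which is the stability property the method aims for.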

The Dataset
The dataset used for this study was the Next Generation Simulation (NGSIM) Vehicle Trajectories Dataset [7]. NGSIM is an open-source, publicly available dataset with a collection of real-world vehicles' trajectories collected by smart vehicles. It contains detailed vehicle trajectory data on southbound US 101 and Lankershim Boulevard in Los Angeles, CA, eastbound I-80 in Emeryville, CA and Peachtree Street in Atlanta, Georgia. Data in NGSIM were collected through a network of synchronized digital video cameras. NGVIDEO, a customized software application developed for the NGSIM program, transcribed the vehicle trajectory data from the video. These vehicle trajectory data provide the precise location of each vehicle within the study area every one-tenth of a second, resulting in detailed lane positions and locations relative to other vehicles. Moreover, NGSIM consists of many patterns representing different driving situations and driver behavior [7]. In addition, NGSIM provides high-quality contextual data that describe realistic real-world scenarios on different road sections [19]. In particular, NGSIM was built by collecting data from vehicles moving on a 500 m-long, seven-lane highway section. For each vehicle, the data are collected (recorded) for 45 min using 16 sensors. Each record in the dataset contains a set of basic elements regarding the vehicle, such as position, speed, time, direction, and acceleration. Although there are similar datasets such as the Connected Vehicles Pilot (CVP), the NGSIM dataset was chosen in this study for consistency when comparing with the related works, as they used NGSIM as well.
The dataset represents the ground truth information and each vehicle represents a cITS node. In real-world deployment, the dataset needs to be fed to each cITS node. That is, each node should have a copy of the dataset to run its own applications and adjust its communication or driving behavior. As such, the collection of accurate and reliable context information is crucial. The context information in the dataset combines two types of messages, the cooperative awareness message (CAM) and the decentralized environmental notification message (DENM), into a basic safety message (BSM). While CAMs are sent periodically, DENMs are event-driven and are only sent when an event has occurred. The CAM consists of information about the vehicles such as the position, size, speed, and steering wheel angle.
In contrast, DENM contains information about a certain event such as lane changing and sudden braking. BSM combines CAM and DENM messages. The first part of BSM, as well as CAM in the European standard, carries information about position, heading, speed, acceleration, steering wheel angle, vehicle role, vehicle size, and the status of vehicle lights [4,27,28]. Unlike the first part of BSM that is included in all BSM messages, the second part of BSM (which corresponds to DENM in the European standard) is only included when an event happens, to carry information about such an event.

Experimental Environment Setup
To implement the different components of the proposed model and evaluate its performance, the development and experimental evaluation were conducted using several tools and software libraries including Python, TensorFlow, Scikit Learn, SKFeature, and NumPy. These tools and libraries are all included in the Anaconda development platform. The preparation of data samples, implementation of algorithms, and the analysis of the results were carried out on a machine with an Intel(R) Core(TM) i7-4790 CPU @ 3.60 GHz and 16 GB RAM.

Evaluation Metrics
To evaluate the performance of the proposed IDS for cITS, this paper uses the accuracy, detection rate, and false alarm rate, as these metrics are widely used in the existing research. Equations (3)-(6) are used to calculate the detection accuracy, detection rate, precision, false positive rate, and the F measure, respectively.
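Assuming these metrics follow their standard confusion-matrix definitions (an assumption; the ordering and numbering may not match Equations (3)-(6) exactly), they can be written as:

```latex
\begin{align*}
\mathrm{ACC} &= \frac{TP + TN}{TP + TN + FP + FN} \\
\mathrm{DR} &= \frac{TP}{TP + FN} \\
\mathrm{Precision} &= \frac{TP}{TP + FP} \\
\mathrm{FPR} &= \frac{FP}{FP + TN} \\
F_1 &= \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{DR}}{\mathrm{Precision} + \mathrm{DR}}
\end{align*}
```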
where TP, TN, FP, and FN denote the true positive, true negative, false positive, and false negative counts, respectively. Table 1 shows the accuracy (ACC), detection rate (DR), false positive rate (FPR), and F1 measure of the proposed adaptive deep belief network-based IDS (ADBN-IDS). In addition, Tables 2 and 3 show the results of the IDS built using conventional machine learning classifiers, namely support vector machines (SVMs) and logistic regression (LR). As pointed out previously, the ACC, DR, FPR, and F1 were calculated based on Equations (3)-(6). In each table, the first column lists the accuracy of the model, the second lists the detection rate, the third lists the false positive rate, and the fourth lists the F1 measure. The tables' rows list feature sets with different sizes. The feature sizes range between 5 and 25 incremented by 3. The results show that the proposed ADBN-IDS achieved higher accuracy than the other two classifiers (i.e., SVM and LR) [28,29]. This is attributed to the ability of the BiMAV method (incorporated into ADBN-IDS) to detect the degradation in the model's performance and trigger the retraining at the right time. This contributes to keeping the model up to date and prevents concept drift from affecting the accuracy of the model.

Experimental Results
The results also show that the accuracy increased when more features were added, until the number of features reached 20. After that, the model experienced a decrease in accuracy. This can also be observed from the other evaluation metrics, namely DR, FPR, and F1. The same trend was observed not only for the ADBN-IDS, but also for SVM and LR. The reason is that the model needs sufficient features to make correct decisions. However, when the number of features exceeds a certain limit, the model suffers from high variance that makes it prone to overfitting. The situation is exacerbated when the incoming observations lack the attack patterns necessary for clear and accurate decisions. This results in a model that can only recognize the patterns that it has seen; if new patterns that have less similarity with the known ones are encountered, the likelihood that the model misses the true classification becomes high. Figures 1-4 show the comparison between the proposed ADBN-IDS and the models built using the SVM and LR, in terms of accuracy, detection rate, false positive rate, and F measure, respectively. The x axis represents the number of features used for training, and the y axis represents the value of the performance measure achieved. The comparison was conducted between the ADBN-IDS that employed the BiMAV for adaptation and the conventional approach used in the existing studies [28,29]. As depicted in the figures, the proposed ADBN-IDS outperformed the related techniques in terms of accuracy, detection rate, false positive rate, and the F measure. It can also be observed that the ADBN-IDS maintains a stable increase in performance for the four measures as the number of features increases, until it reaches 20 features, where the performance shows a declining trend.
This is attributed to the efficacy of the BiMAV incorporated for the model adaptation and the reliance on the combination of output and averaged inputs for calculating the proximity to the threshold. Such an approach makes the change detection mechanism robust, which avoids unnecessary re-training and only triggers it if the change in the cITS topology or attack behavior is significant. It is also worth noting that the frequency of adaptation varies based on the threshold. When the threshold is set to a higher value, adaptation becomes less frequent. When the threshold is set to a lower value, the adaptation frequency increases. Moreover, Figure 5 shows the area under the curve of the proposed model under several thresholds. The x axis represents the false positive rate while the y axis represents the true positive rate. It can be observed that the false positive rate decreases when the detection rate increases.

Conclusions
In this paper, the adaptive deep belief network-based intrusion detection system (ADBN-IDS) for cITS is described. The model is composed of three components: preprocessing, feature selection, and training/testing. Our model is created using the deep belief network (DBN) classifier, and includes the bi-variate moving average (BiMAV) method as our adaptation technique. This inclusion allows the model to cope with the dynamic nature of the cITS environment. The classifier is trained using the NGSIM dataset and tested using 10-fold cross validation. The performance of the model is evaluated using several metrics including accuracy, detection rate, false positive rate, and the F1 measure. The evaluation results show that the proposed ADBN-IDS achieved higher performance in terms of accuracy, detection rate, false positive rate, and F1, which indicates the importance of the BiMAV adaptation mechanism in keeping the model updated to achieve a safer, more resilient cITS.

Data Availability Statement:
The Next Generation Simulation (NGSIM) dataset that was used in this study is publicly available online at the following link: https://ops.fhwa.dot.gov/trafficanalysistools/ngsim.htm (accessed on 10 May 2022), and can be downloaded directly from the following link: https://data.transportation.gov/Automobiles/Next-Generation-Simulation-NGSIM-Vehicle-Trajector/8ect-6jqj (accessed on 10 May 2022).

Conflicts of Interest:
The authors declare no conflict of interest.