Internet of Things Meets Vehicles: Sheltering In-Vehicle Network through Lightweight Machine Learning

: An internet of vehicles allows intelligent automobiles to interchange messages with other cars, tra ﬃ c management departments, and data analysis companies about vehicle identiﬁcation, accident detection, and danger warnings. The implementation of these features requires Internet of Things system support. Smart cars are generally equipped with many (hundreds or even thousands of) sensors and microcomputers so that drivers gain more information about travel. The connection between the in-vehicle network and the Internet can be leveraged by the attackers in a malicious manner and thus increases the number of ways the in-vehicle network can now be targeted. Protecting increasingly intelligent vehicle systems becomes more di ﬃ cult, especially because a network of many di ﬀ erent devices makes the system more vulnerable than ever before. The paper assumes a generic threat model in which attackers can access the controller area network (CAN) bus via common access points (e.g., Bluetooth, OBD-II, Wi-Fi, physical access, and cellular communication, etc). A machine learning based simpliﬁed attention (SIMATT)-security control unit (SECCU) symmetry framework is proposed towards a novel and lightweight anomaly detecting mechanism for the in-vehicle network. For this framework, we propose two new models, SECCU and SIMATT, and obtain state-of-the-art anomaly detecting performance when fusing the former to the latter. Regardless of the training phase or the detection phase, we strive to minimize the computational cost and thereby obtain a lightweight anomaly detection method. In particular, the SECCU model has only one layer of 500 computing cells and the SIMATT model has been improved to reduce its computational costs. Through substantial experiment comparisons (with various classical algorithms, such as LSTM, GRU, GIDS, RNN, or their variations), it is demonstrated that the SIMATT-SECCU framework achieves an almost optimal accuracy and recall rate.


Internet of Things Meets Vehicles
Internet of vehicles (IoV) allows intelligent automobiles to interchange messages with other cars, traffic management departments, and data analysis companies about vehicle identification, accident detection, and danger warnings. The implementation of these features requires Internet of Things system support, as shown in Figure 1. Traffic sensors collect road information and send it to the driver's mobile phone through network centers. There are some Internet of Things' devices, for example Austria's Autobahn, which uses Cisco's connected devices to link about 80,000 microprocessors and We are witnessing the upgrade from related websites (i.e., the data collected by the sensors can be obtained via the Internet.) to the Internet of Things (ie., the components embedded in the sensor can communicate with each other via the network, making use of the sensor more intelligently.). The IOT links to multiple smart terminals via the Internet. Driverless vehicles can update their algorithms with user data via links, such as IOT. These driverless cars require a large number of high quality data collection and processing systems. In this case, road information obtained using the IOT can be transmitted to driverless cars. This information includes navigation data and road traffic conditions. All of this data are shared with IOTs on driverless cars and cloud systems to improve intelligent algorithms.
Many (hundreds or even thousands of) electronic and intelligent sensors are equipped in modern vehicles to enhance driver experience. For example, GPS units provide vehicle location information, and the vehicle sound equipment is connected to the phone for functions such as calling and playing music. The Internet of vehicles meets the communication of various devices in the vehicle. The connection between the in-vehicle network and the Internet can be leveraged by attackers in a malicious manner and thus increases the number of ways the in-vehicle network can now be targeted [1]. As an example, through a cellular connection, one may likely gain a vehicle's GPS coordinates, vehicle identification number, its IP address, etc. Protecting increasingly intelligent vehicle systems becomes more difficult, especially because a network of many different devices makes the system more vulnerable than ever before. Recent work [2,3] show that attackers can take advantage of the vulnerabilities of an in-vehicle network to manipulate a vehicle. For example, attackers can attack a vehicle with wireless networks [4].

Attack Surfaces
If an attacker wants to crack a car they will first consider attack surfaces. Smart cars have many components that can be controlled by an attacker, for example, bluetooth, transaction processing management system (TPMS), Wi-Fi, cellular, infotainment console, USB, controller area network (CAN) bus, and onboard diagnostics II (OBD-II) connector. These components include the external and internal types, as shown in Figure 2, and it may be exploited by an attacker to change the behavior of vehicles. For example, by accessing communication modules such as GPS and cellular networks, an attacker can track the path of vehicles. Also, they can establish a connection via Wi-Fi to access the remote in-vehicle network, (for up to 500 yards) or to crack the Wi-Fi password to install malware on the smart device.

Can Bus
Currently, electronic control units (ECUs) play an important role in intelligent cars, controlling multiple subsystems. Generally speaking, ECUs can replace traditional mechanical components to simplify automotive design [5]. In order to achieve vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communication, the functions of ECUs are further imposed to facilitate larger-scale intra-vehicle [6] and inter-vehicle [7] communication.
A large number of communication protocols are designed for information exchange. The controller area network (CAN) is the de facto standard for communication networks of various vehicles on the market today [8]. On the CAN bus, an ECU can communicate with multiple sensors to accomplish different complex tasks. Important vehicle driving information such as diagnostics, sensors, and control data is transmitted via CAN bus for a variety of vehicle driving functions such as autonomous driving or assisted driving. Therefore, safe driving must be based on the safe transmission of information. However, with the complication of intelligent car components, the number of ECUs has increased exponentially. Protecting increasingly intelligent vehicle systems becomes more difficult, especially because vehicle data spans a variety of different connected devices to further weaken system security. Because vehicle communication security is not the primary concern of the original design of CAN, the in-vehicle network is therefore subject to multiple security threats [9]. For example, an ECU can obtain information from any other ECU on the CAN bus but cannot identify the real identity of the sending ECU. This can make the packet injection and data manipulation attacks easier and endanger the safety of the driver [10].
There are currently two approaches to solving the problem of CAN protocol security. One approach is the use of message authentication codes (MACs), and the other approach is to install an anomaly detecting system on the vehicle [11]. The MAC approach is more reliable, but occupies data fields in a CAN packet. The data fields in a CAN packet are only 8 bits and the maximum transmission rate is only 1 M/s, which does not meet MAC requirements [12]. Therefore, the application of MACs is bound to change the existing vehicle structure and increase production costs, which are currently difficult to widely popularize. In contrast, anomaly detecting [13] does not need to change the structure of the vehicle and can be installed in the vehicle's gateway to detect an attacker's malicious behavior in real time. Therefore, anomaly detecting is more suitable in practice for improving the security of an in-vehicle network.
We assume that the attacker can access the CAN bus and other device interfaces connected to it such as Bluetooth, OBD-II, Wi-Fi, physical access, and USB ports, as shown in Figure 3. Once an attacker gains access to in-vehicle network, it means that the attacker can perform many different modes of attack.

Our Contributions
In this paper, considering the computational power available in vehicle electronic equipment, we present a lightweight detecting model with the aim of ensuring detection accuracy. The proposed detecting model performs well and can monitor the status of the in-vehicle network in real time. In many anomaly detecting methods, performance is highly dependent on the attack data structure. Namely, most existing methods need to extract appropriate features from the attack data at the training stage and make judgments based on these data during the detecting process. However, the training attack data are generally constructed in an artificially simulated way, and thus quite likely to be different from actual attack data, which may hinder the practical performance of these methods. In contrast, the model we proposed is based on real data collected from the normal operation of a vehicle. The attack data are only used to create a criterion for model evaluation. Our contributions include the following: 1. To suit the actual situation of an in-vehicle network, we present a new recurrent neural network (RNN) model named SECCU. 2. We propose the simplified attention (SIMATT) model that can be computed using the computing power of vehicle electronics and improve detection accuracy. 3. The experimental results demonstrate that our proposed SIMATT-SECCU symmetry framework has the best performance in comparison with many classic models. 4. Our detecting model is fully data-driven and does not require any domain knowledge about the CAN protocol. 5. Our proposed model focuses on a lightweight anomaly detection method for in-vehicle network based on machine learning, which solves the problem of insufficient computing power of networked vehicle electronic devices when applying machine learning algorithms.
The detecting process of the SIMATT-SECCU framework is shown in Figure 4. First, the SIMATT-SECCU model is trained with attack-free (normal operation) and attack data sets to discover the feature distributions of the different data sets. Then, the trained SIMATT-SECCU model can be used for anomaly detecting. As shown in Table 1, the following abbreviations are used in this manuscript and the rest of this paper is organized as follows. In Section 2, we review related work of the anomaly detecting for in-vehicle network. In Section 3, we present the RNN model and our proposed SECCU model. We further introduce attention models and propose our SIMATT model in Section 4. We descibe the CAN bus and the data sets used for evaluation in Section 5. Section 6 is devoted to the experiments and evaluation results. Finally, we conclude this paper in Section 7.

Related Work
The industry and academia have begun to focus on security issues on the CAN bus and has proposed some defense options. The traditional method is to use the heuristic of in-vehicle network behavior anomaly detecting, which is an extension of the detecting mechanism of intrusion detection systems (IDS) for traditional computer networks. Such a detecting system can be installed in different parts on the in-vehicle network. (e.g., the OBD-II port, gateway, or T-box). The IDS monitor the CAN traffic and then obtains a detecting model. When anomalous behavior is detected and the system outputs warning information.
Due to the limitations of the internal computing units of the vehicle, the simple lightweight method is one of the research hotspots. Han et al. [14] proposed an anomaly detecting algorithm based on survival analysis. Song et al. [15] suggested statistical CAN information time interval in anomaly detecting. Although these methods have low computational overhead, they can only detect specific attacks and find unfavorable performance for many attack modes. Machine-learning-based anomaly detecting can solve this problem.
There are several machine-learning-based anomaly detecting methods in the literature. One significant advantage of this approach is that attacks are effectively detected without changing the existing in-vehicle network structure. Seo et al. [16] proposed a double discriminator using the generative adversarial networks (GAN). The method uses two different data sets to train two discriminators, which greatly improves the detection accuracy of IDS. Tariq et al. [17] proposed an IDS combining heuristic algorithm and recurrent neural network (RNN). In this method, some features of CAN data are first counted to make a preliminary judgment, and then RNN is used for the final judgment. Larson [18] proposed a CAN attack detecting method based on security rules. The method is based on the object dictionary of the CANopen protocol; it uses protocol-level security rules to detect illegal ECU behavior, and Larson provides a set of example security rules. Wang et al. [19] proposed a time series prediction model that trains different types of instructions in the CAN protocol separately before combining them in the final system. Müter et al. [13] detect attacks by calculating the entropy of normal traffic and abnormal traffic on the CAN bus. Hu et al. [20] used a support vector machine model to detect the abnormal state of a vehicle. In [21], Ji et al. uses the clock drift of an ECU to detect abnormal conditions in abnormal vehicles. In [22], Xiao et al. proposed an early warning using convLSTM to predict time series deviations. Table 2 shows some state-of-the-art methods. However, these methods have inherent drawbacks. For example, in [19], the instruction ID of the CAN frame is separated, resulting in a loss of part of the information. In [13], the entropy-based method can only perform preliminary statistics and detect only some attack methods and missing many others. In [20], it was found that directly detecting the abnormal state of a vehicle with a support vector machine requires excessive computational resources, and it is difficult to ensure real-time monitoring. In [21], it was found that vehicles produced by different manufacturers have different ECU clock drift features, so it would be indispensable to reanalyze these for each vehicle brand. An extremely important problem is that the computational cost of machine-learning-based IDSs is considerably high. In contrast, we propose an IDS for in-vehicle network that can address these problems and the required computation shows advantage over preceding methods [16,17]. Table 2. Summary of the intrusion detection systems (IDS) for controller area network (CAN) bus system literature in the automotive domain.

Recurrent Neural Network
In this section, we will introduce two classic recurrent neural networks (long short-term memory (LSTM) and gated recurrent unit (GRU)) and proposed a new structure (SECCU) for recurrent neural networks.

Long Short-Term Memory
Long short-term memory (LSTM) units are used in an RNN [23] to create a network that is often called an LSTM network (created to solve long-term memory dependence problems). A traditional LSTM network consists of multiple gates with different functions ( Figure 5). The gate can remember the value of time interval, and the information flows in and out of the rules in the three gates. LSTM networks are well-suited to the classification, processing, and prediction of time series data because There are undetectable delays in certain important events in the time series. The vanishing gradient problem in training traditional RNNs can be solved on LSTM units. Relative insensitivity to time intervals is an advantage of the LSTM network for other time series algorithms such as RNNs and hidden Markov models. Multiple LSTM networks can be stacked and temporally concatenated to form more complex structures.
The main innovation of LSTM is that memory cells c t are added to collect state information. When new input data arrive, the information is accumulated in the cell if inputs gate i t is activated. Simultaneously, if the forget gate f t is open, past information c t is forgotten. Output gate o t determines whether the final output c t is transmitted to h t and j t as the update coefficient of c t . The main formulas of an LSTM cell are as follows: In Equation (1), W * denotes the weight matrices, b * denotes the biases, and denotes the element-wise vector product.

Gated Recurrent Unit
In recent years, Cho et al. proposed the gated recurrent unit (GRU), which has a structure similar to that of an LSTM unit [24]. Chung et al. found that GRUs performed better than LSTM units on multiple data sets [25]. The GRU cell model is shown in Figure 6. The main innovation of a GRU is that it combines the input gate and forget gate. A GRU is defined by the following equations: Here, r denotes the reset gate, z denotes the update gate, H denotes the hidden state of the cell, and H denotes the current state.

Security Control Unit
In [26], it was found that GRU and its many variants perform better than LSTM units on most tasks. However, LSTM units are more stable in most tasks. As mentioned in [26,27], the LSTM structure is almost optimal and cannot be improved, but the GRU uses multiple structures on different tasks to obtain better performance. Therefore, one can contemplate a model (as a variant of the GRU and combined with the structure of refined LSTM) that achieves stunning performance in dealing with CAN anomaly detecting tasks.
We combine the advantages of LSTM units and GRUs to propose the security control unit (SECCU). The SECCU cell model is shown in Figure 7. The SECCU is defined by the following equations: Here, r, z, H, and H have the same meanings as they do in the GRU formulas.

Attention Model
In this section, we first introduce the hierarchical attention model and then proposed the simplified attention model to reduce computational overhead.

Hierarchical Attention Model
An RNN usually calculates a hidden state h t as a function at time t of previous hidden state h t−1 , capturing the correlation and absolute position information along the time dimension directly through its sequential structure. In recent years, attention models based on position coding or distance deviation have performed well on time-series tasks [28]. In particular, the hierarchical attention (HIEATT) model has made great strides in natural language processing. In this paper, we introduce attention models into in-vehicle network for anomaly detecting.
Recently, natural language processing has made substantial advances thanks to the support of the HIEATT model [29]. The instruction sequence generated by the CAN protocol can be regarded as a language, so we can adopt the HIEATT model for vehicle network anomaly detecting. The HIEATT model (shown in Figure 8) is defined by the following equations: That is, we first feed instruction sequence h it through a single-layer RNN to obtain u it as a hidden representation of h it . We then measure the similarity of u it with instruction sequence vector u w , which indicates its importance. We compute instruction sequence vector a i as a weighted sum of the instruction sequence. The context vector u w can be thought of as a high-level representation in memory networks [30].

Simplified Attention Model
Although the HIEATT model has led to outstanding progress in natural language processing tasks, its application is limited by the computing power of the vehicle electronic devices and the characteristics of the CAN protocol itself. Hence, for an IDS, the HIEATT model must be modified for the environment of an in-vehicle network.
The CAN protocol is not as complex as a natural language, and a better result can be obtained with a slightly simplified attention model (the SIMATT model) for analyzing the instruction sequence generated by the CAN protocol, as shown in Figure 9. The vector of hidden layer state h t is used to produce a probability vector a of weights. Vector c is the weighted sum of h t using a. The SIMATT model is defined by the following equations: Here, a denotes the tensor after the time step axis is interchanged with the sequence length axis in the a tensor and denotes the element-wise vector product. In contrast to the HIEATT model, we added the softmax function and removed the exponential calculation to simplify it and reduce the computational cost.

Can Bus and Dataset
In this section, we first briefly describe the implementation principle of CAN BUS, then introduce the datasets and attack methods used in the paper.

Can Bus
A CAN is a half-duplex high-speed communication bus network with a maximum transmission rate of 1 Mbps. Most automotive companies use the CAN protocol to communicate among ECUs and embedded devices. Figure 10 shows an example CAN packet. The arbitration ID field consists of 11 bits for the ID, and The ID is the identity of the ECU. The arbitration ID field provides two main functions: (1) The priority of the packet transmission is determined by the ID number. (smaller ID numbers have higher priority) and (2) an ECU can use it to filter corresponding messages. There are data information of 8 bytes in the data field. The control field is used to identify the size of the data field. The cyclic redundancy check field is used to detect errors in data packet transmission. Finally, the acknowledgment field is used to confirm the receipt of a valid CAN packet.
In a CAN, without the concept of address, the data transmitted on the bus can be received by all ECUs. The information is identified by the value of the arbitration fields. The arbitration ID indicates the target device ID of the received message, and any device can communicate with multiple devices as long as multiple arbitration IDs are attached to the package. When multiple packets are transmitted on the bus, packets with a lower arbitration ID would be sent first.
The CAN bus considers the possibility of adding new nodes.Therefore, all messages is transmitted on the bus. Due to CAN is transmitting data by broadcast, the controller on the network can accept all packets without regard to the sender's arbitration ID. Therefore, Malicious devices on the bus can easily send malicious packets.

Can Bus Dataset
Data is a vital part of every machine learning model. To train these models, it is necessary to employ a data set. However, every modern business must also protect its users' data. To solve the problem of data availability, many institutions and organizations are providing publicly available data sets so that anyone can get them free to build models for educational or commercial use.
Our experimental data is downloaded from a public datasets at http://ocslab.hksecurity.net/ Dataset/CAN-intrusion-dataset [31]. The selected data were constructed by logging CAN traffic via the OBD-II port from a real vehicle while message injection attacks were performed.
In this study, we consider two types of attacks, one that directly modifies a vehicle's ECU node and the other that injects unauthenticated information directly into the in-vehicle network. First, we assume that the attacker can physically control and modify the target ECU node in the vehicle so that it stops transmitting information and allows the malicious impersonation node to send information instead. With respect to the overall information, the system has not changed. We call this attack an impersonation attack [31,32]. In the second type of attack, we assume that the attacker cannot control the target ECU node, but can inject malicious information to induce a fault or remotely maneuver the vehicle. We consider two types of injection attacks, DoS and fuzzy attacks [31], described in detail below.
The data are divided into the following four data sets: • Attack-free: This data set consists of CAN data generated under normal conditions after the vehicle has been started. • Impersonation attack: After an attacker has attacked an ECU and disabled it, the attacker inserts his/her ECU for a specific purpose. The inserted ECU is disguised as the disabled ECU, and it can periodically reply to request from other ECU [31,32]. Impersonation attack does not change the original frequency of CAN messages, so IDSs based on frequency characteristics cannot detect this attack. If the attacker does not change the contents of the CAN messages, the inserted ECU looks like a legitimate ECU, and the attacker can then launch other types of attacks. We refer the reader to [31] for further details about the construction and distribution characteristics of the impersonation attack data set. This attack mode is shown in Figure 11. • DoS attack: This dataset consists of data generated while an attacker periodically injected high-priority CAN instructions so that legitimate instructions are not sent in time. In this data set, the attacker injected the highest priority instruction (ID 0000). This attack mode is shown in Figure 12.

•
Fuzzy attack: This dataset consists of data generated while an attacker randomly sends instructions to cause the vehicle to perform unexpectedly. To implement a fuzzy attack, the attacker needs to collect detailed information about the target vehicle. For example, instruction ID that can produce unexpected behavior. It is different from DoS attack by occupancy bus to delay normal information transmission. The Fuzzy attack caused the vehicle function to be unavailable. This attack mode is shown in Figure 13.

Experiment Procedure and Results
In this section, we explain the method of data normalization and detailed the experimental process, introduced the evaluation method of the model, and finally compared the various models.

Experiment Procedure
The formatting of the CAN data set used in this paper is illustrated in Figure 14. The timestamp denotes the time elapsed from the start of ignition of the vehicle to the generation of the instruction, and the ID indicates the identity of the instruction. The data length code (DCL) indicates the length of the data field of the instruction. The remainder of the information is the data carried by the instruction. The attack-free data set consists of instructions 10,001 to 2,369,000 (the first 10,000 instructions were removed because they were used to train the SIMATT-SECCU model). We removed the timestamp field in the public dataset, as we only need to ensure that the instructions flow out in order, and the exact time of the outflow is beyond our concern. Nothing is changed in the rest of the public dataset. The DoS attack, fuzzy attack, and impersonation attack data sets employ the first 6,56,500, 591,000, and 995,000 instructions of their original data sets, respectively. All data set values were normalized to a range between 0 and 1 by the following formula: (22) where x denotes the data that needs to be normalized, x min denotes the minimum value of an attribute in the data set, and x max denotes the maximum value of that attribute in the data set. As shown in Figure 14, in addition to the timestamp field, other fields are entered as features into the SIMATT-SECCU model. An instruction has 10 feature attributes (the data field has 8 feature attributes). We used 50 instructions (500 feature attributes) as an input sample and used the first 10,000 instructions as the training set for the SIMATT-SECCU model. The model was iteratively trained 50 times. Only one layer of 500 SECCU cells is employed. All CAN instruction sequences were further analyzed by the SIMATT-SECCU model, and finally classified using a random forest [33] to complete the attack detecting task.
We used TensorFlow to build the SIMATT-SECCU model and scikit-learn toolkit to implement a random forest, in which all parameters were set to their default values. We use ten-fold cross-validation [34,35] methods to verify the performance of the model, which consists of the following specific steps:

1.
We take the first 80% of the instruction sequences in all data sets and divide them into 10 equal parts.

2.
Each part is treated as a test set, and the remaining nine parts are used as a training set. A total of 10 test results are obtained.
To evaluate the performance of our SIMATT-SECCU model, we conducted a series of experiments, as follows: 1.
The SIMATT model was removed and the SECCU cell was replaced with an LSTM cell or a GRU cell to compare the proposed model with the LSTM and GRU models, respectively. Only the SIMATT model was removed to obtain the SECCU-based model.

2.
The SIMATT model was replaced by the HIEATT model, and the SECCU cell was replaced by an LSTM cell or a GRU cell to obtain the HIEATT-LSTM and HIEATT-GRU models, respectively. The HIEATT-SECCU model was obtained by replacing the SIMATT model with the HIEATT model.

3.
The SECCU cell was replaced with an LSTM cell or GRU cell to obtain the SIMATT-LSTM and SIMATT-GRU models, respectively.
The training parameter values of the comparative models were set to be the same as those of the SIMATT-SECCU model.

Introduction to Experimental Evaluation Indicators
Precision, accuracy, recall, F1, the receiver operating characteristic (ROC) curve, and the area under the ROC curve (AUC) are commonly used as performance criteria for evaluating a detecting model. We evaluate the performance of the models using the four terms defined in Table 3.
Accuracy reflects the ability of the classifier to determine the entire sample. Accuracy can be defined as follows: Recall rate reflects the proportion of positive cases that are correctly judged to the total positive case. In addition, recall can be defined as follows: F1 reflects the harmonic mean of the accuracy and recall rate. F1 can be defined as follows: The true positive rate (TPR) is defined as the ratio of true positives to all positive samples. Its formula is the same as that of the recall rate: The false positive rate (FPR) is defined as the ratio of false positives to all negative samples. Its formula is as follows: The ROC curve is a plot of the FPR on the X axis and the TPR on the Y axis. In addition, the AUC value is the area under the ROC curve. Obviously, a larger AUC indicates better classification results.

Model Performance Analysis
The ten-fold cross-validation accuracy results for the comparison models are shown in Table 4. The results show that SIMATT-SECCU has the highest accuracy, followed by SIMATT-LSTM. In addition, the models that include SIMATT have a high degree of accuracy. The models that include HIEATT and the SECCU cell have different levels of performance degradations. This may be because the HIEATT model is too complex for relatively simple CAN protocol instructions, and this degrades its performance. Table 4. The ten-fold cross-validation accuracy results for the comparison models.  000 1.000 0.993 1.000 1.000 1.000 0.993 1.000 1.000 0 For the detecting task, the recall rate of the model is considered more important. Table 5 shows the recall rate of each model. The SIMATT-SECCU recall rate is only greater than that of SIMATT-LSTM by a value of 0.003. However, the CAN is usually able to generate 685,000 instructions in 300 s, which can reduce false positives by more than 40 times.   1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1 The ROC curve ( Figure 15) enables us to intuitively evaluate the performance of each model. As the figure shows, the SIMATT model substantially improves the performance of the models (the three curves closely follow each other). In contrast, the HIEATT model improves the performance of the SECCU cell only: the performance of the LSTM and GRU cells declines when HIEATT is used.
We used the first 80% of all datasets as the training set and the last 20% as the test set to evaluate the performance of each model. The experimental results are shown in Figure 16. The impersonation attack is difficult to detect for most models because it mimics a normal ECU, sending packets without generating much change in the overall instruction sequence. In contrast, the detecting of an impersonation attack when the SIMATT model is employed is no longer a challenge. Most models also have some detection errors on the attack-free dataset. This may be because the sudden braking of the vehicle during driving can caused the features of the attack-free dataset to change and affect the detecting results. Each model performs well on the DoS and fuzzy attack datasets because the data these data sets have obvious features. For example, in the DoS attack data set, the same data packet is periodically sent and is easily detected. In addition, although the performance of the SECCU cell is slightly worse, it has higher plasticity. When HIEATT is added to the models, the performance of LSTM and GRU decreased, whereas the performance of SECCU increased. Moreover, when SIMATT is added to the models, the performance of SIMATT-SECCU exceeds those of all other models.  Based on the above experimental results, it can be concluded that the SIMATT model can help SECCU-, LSTM-, and GRU-based RNNs substantially improve anomaly detecting. Moreover, the SIMATT-SECCU model obtains state-of-the-art detection performance.
To further demonstrate the performance advantages of SIMATT-SECCU, we compare it with the GIDS [16] and HE-RNN [17] models. In the comparison of GIDS, we use the data set provided in [16] (which added two attack data sets related to RPM attack and GEAR attack) to check the ability of SIMATT-SECCU in detecting multiple attacks. Detailed attack steps of GIDS can be found in [16]. We take the first 1,000,000 instructions in the data set and the first 80% of all datasets as the training set and the rest as the test set to evaluate the performance. The recall of GIDS and SIMATT-SECCU are shown in Figure 17. The detection performance of SIMATT-SECCU for three attacks is better than that of GIDS, and particularly the detection capability of GEAR attack is significantly improved. Whereas, SIMATT-SECCU does not show comparable performance in detecting fuzzy attacks.  [17] uses the same data set of this paper to directly compare the detection capabilities of the model. From Tables 6 and 7, we can see that the SIMATT-SECCU model is more powerful for detecting both fuzzy and DoS attacks. To compare the computational costs of each models, we take 20,000 instructions generated in about 10 s for time testing. Our configuration is as follows: CPU: Intel Core i7-6700 3.40 GHZ, RAM: 8 G, operating system: Windows 7 without GPU computing components. We built these programs using python 3.6 and TensorFlow 1.12. Figure 18 shows the detecting time consumption of the 11 models. It can be seen that the SIMATT optimally adapts to the SECCU because SIMATT-SECCU only has 0.003 s more detecting time than SECCU, and HIEATT-X about 0.02 s slower than SIMATT-X (X represents SECCU, LSTM, or GRU, respectively). Most time consumptions of the GIDS detecting are spent on one-hot-vector encoding conversion. The arbitration ID in each CAN instruction needs to be converted into three one-hot-vectors, which not only idles away time but also increases space overhead. HE-RNN leads to excessive detection time overhead due to the complex neural network structure. The RNN network of HE-RNN has four layers, with a total of 1280 computing cells, and contrarily we only have one layer of 500 computing cells. Moreover, in HE-RNN, we only calculate the time spent by data passing through RNN, and the heuristic algorithm for measuring CAN data features is not covered yet, meaning that HE-RNN actually needs more calculations. Putting the above analysis altogether, we can argue that the proposed SIMATT-SECCU model can significantly reduce the calculation cost as well as ensure the detection accuracy.

Conclusions
Internet of vehicles allows intelligent automobiles to interchange messages with other cars, traffic management departments, and data analysis companies about vehicle identification, accident detection, and danger warnings. The implementation of these features requires Internet of Things system support. smart cars are generally equipped with many (hundreds or even thousands of) sensors and microcomputers so that drivers gain a more information about travel. The connection between the in-vehicle network and the Internet can be leveraged by the attackers in a malicious manner and thus increases the number of ways the in-vehicle network can now be targeted. Protecting increasingly intelligent vehicle systems becomes more difficult, especially because a network of many different devices makes the system more threatening than ever before.
In this paper, the SIMATT-SECCU framework was introduced that can be used to detect instruction sequence anomalies on in-vehicle network. This IDS monitors vehicle status in real time and does not require expert knowledge of the CAN protocol. For this framework, we propose a high-plasticity recurrent neural network called the security control unit (SECCU) that can achieve higher performance through the addition of the attention model. At the same time, we also propose the simplified attention (SIMATT) model, which has low computational cost and can significantly improve the performance of LSTM, GRU, and SECCU models in vehicle network anomaly detecting tasks.
Our contributions include the following.

1.
To suit the actual situation of an in-vehicle network, we present a new recurrent neural network (RNN) model named SECCU.

2.
We propose the simplified attention (SIMATT) model that can be computed using the computing power of vehicle electronics and improve detection accuracy. 3.
The experimental results demonstrate that our proposed SIMATT-SECCU symmetry framework has the best performance in comparison with many classic models. 4.
Our detecting model is fully data-driven and does not require any domain knowledge about the CAN protocol. 5.
Our proposed model focuses on a lightweight anomaly detecting for in-vehicle network based on machine learning, which solves the problem of insufficient computing power of networked vehicle electronic devices when applying machine learning algorithms.
More work should be done to improve the performance of the in-vehicle network IDS in more complex situations. One significant direction of future research is to determine whether the detection accuracy of the model can be further improved and the false positive rate reduced. Another interesting direction would be to construct a lightweight heuristic that matches the computational requirements of real-time detecting. Furthermore, when IDS detects that the vehicle is behaving abnormally, how to launch a reasonable response is another challenge. All these aspects are worth further study.