Applied System Innovation
Open Access Article

26 March 2024

Secure Aviation Control through a Streamlined ADS-B Perception System

1 Department of Cybersecurity, Faculty of Computer & Information Technology, Jordan University of Science and Technology, P.O. Box 3030, Irbid 22110, Jordan
2 Department of Cybersecurity, King Hussein School of Computing Sciences, Prince Sumaya University for Technology, P.O. Box 1438, Amman 11941, Jordan
* Author to whom correspondence should be addressed.
This article belongs to the Special Issue Industrial Cybersecurity

Abstract

Automatic dependent surveillance-broadcast (ADS-B) is the future of aviation surveillance and traffic control, allowing different aircraft types to exchange information periodically. Despite this protocol’s advantages, it is vulnerable to flooding, denial-of-service, and injection attacks. In this paper, we join the initiative to secure this protocol and propose an efficient method for detecting exploitation attempts in which messages are injected with false information. We focus mainly on three attacks: path modification, ghost aircraft injection, and velocity drift. This paper aims to provide a methodology that can successfully detect injected messages even in the face of new (zero-day) attacks. Its main advantage is the use of a recent dataset to create more reliable and adaptive training and testing materials, which were preprocessed before applying different machine learning algorithms to build the most accurate and time-efficient model feasible. The best binary classification outcomes were 99.14% accuracy, an F1-score of 99.14%, and a Matthews correlation coefficient (MCC) of 0.982, while the best multiclass classification outcomes were 99.41% accuracy, an F1-score of 99.37%, and an MCC of 0.988. Our best results outperform existing models, but we believe the model would benefit from testing against other attack types and a bigger dataset.

1. Introduction

The usage of drones is becoming increasingly relevant and continues to grow for both commercial and military purposes, and some drones operate without a human in the loop. This created the need for a framework that minimizes the chance of air collisions between different types of aircraft; thus, automatic dependent surveillance-broadcast (ADS-B) was introduced [1]. ADS-B is an air traffic control system based on information such as the position, identity, time, heading, and velocity received by ground sensors and transmitted periodically by an aircraft for surveillance purposes. ADS-B is considered a crucial asset in air traffic control, and it is estimated that 42.9% of commercial aircraft are equipped with hardware to provide ADS-B [2].
ADS-B was designed to improve air traffic control by simplifying it and eliminating the limitations of previous air traffic control models; it serves as a replacement for secondary surveillance radar (SSR), bearing in mind cyberattacks against aircraft and the aviation industry [3]. Still, despite all of the advantages that ADS-B brings to the aircraft control field, it suffers from security limitations. It is vulnerable to multiple types of attacks, such as eavesdropping, jamming, and message injection, and it lacks any security measures to ensure the safety of the data transmitted. The inability to formally distinguish between genuine and spoofed ADS-B messages is one of the most frequent issues leading to misunderstandings when communicating with an aircraft or a sensor. Additionally, attacks on ADS-B systems take many forms. Figure 1 demonstrates a variety of attacks that could affect ADS-B systems [3,4].
Figure 1. Attacks Against ADS-B.
Nowadays, as stated earlier, around half of commercial aircraft are equipped with ADS-B hardware to improve aircraft traffic control as a whole. With the evident orientation towards using ADS-B and the rapid increase in the sophistication and diversity of cyber-attacks and their techniques, this paper will focus on providing security solutions for detecting false information gathered by sensors to improve aircraft traffic control and security.
Unfortunately, no previous studies have introduced a dataset or a detection model in a realistic environment. This has led us to take the initiative to find a dataset that contains both genuine messages and injected messages that were transmitted and received by a sensor. We also used different classification approaches to provide a higher accuracy rate and enhance the overall security of the detection systems.
The primary contribution of this paper is the proposal of a machine learning-based detection model for classifying injected messages using a specialized dataset. The following is a list of the specific contributions.
  • We outlined earlier research related to ADS-B, its threats and vulnerabilities, and some of the proposed solutions so we could determine the gap and select the best method to apply our study to maximize the benefits for the aviation industry.
  • We rendered the qualified dataset for ease of use. We also analyzed and performed an in-depth study of the different types of injected messages directed against aircraft and potential defenses against such attacks to aid and optimize our proposed solution.
  • We developed an efficient classification model using multiple techniques, and we achieved solid results in detecting injection within the messages used in path modification, ghost aircraft injection, and velocity drift attacks, bearing in mind that it needed to be performed in a rapid manner and with the least amount of training required. Our best results are listed below:
    o
    Binary classification: accuracy of 99.14%, F1-score of 99.14%, and MCC of 0.982 using a random forest classifier.
    o
    Multiclass classification: accuracy of 99.41%, F1-score of 99.37%, and MCC of 0.988 using a random forest classifier.

2. Background Review

Automatic dependent surveillance-broadcast (ADS-B) is increasingly utilized within the aircraft industry. Because aircraft, whether piloted or remotely controlled, must avoid misinformation that could cause a collision, there is growing demand for a security solution that mitigates most of the threats and vulnerabilities from which ADS-B suffers.
ADS-B is an air traffic control system based on information such as position, identity, time, heading, and velocity received by ground sensors and transmitted periodically by an aircraft for surveillance purposes. The ADS-B messages in this study were acquired from the OpenSky Network, which was created to make aviation data, such as air traffic information, available to the public. Aircraft transponders broadcast ADS-B messages, and any nearby receiver on the same communication channel or within transmission range can capture these messages and the information they contain.

2.1. ADS-B Security Threats and Vulnerabilities

ADS-B suffers from various attacks and vulnerabilities. The best-known attacks that threaten an ADS-B system are described below [4].
  • Message injection: with the right equipment, an attacker can generate a counterfeit ADS-B message that appears legitimate and inject it into an ADS-B system.
  • Message deletion: physically deleting ADS-B messages using constructive or destructive interference differs from jamming because it drops the whole message rather than blocking it.
  • Message modification: altering the contents of an ADS-B message using different techniques such as message injection and bit flipping.
  • Signal jamming: interrupting the signals of ADS-B communications channels using radio frequency devices that send multiple requests leads to jamming and disrupting the communication between the aircraft and ground stations.
  • False alarm attack: modifying a message to indicate a false alarm. Such an attack could indicate that an aircraft has been hijacked while operating normally.
  • Virtual trajectory modification: changing the received trajectory of an aircraft by using message manipulation techniques.
  • Ghost aircraft injecting: this attack creates an imaginary aircraft by faking ADS-B messages within a communication channel.
  • Ghost aircraft flooding: like ghost aircraft injecting, this attack injects multiple imaginary aircraft simultaneously to cause disruption and confusion and could cause a denial of service in different surveillance systems.
  • Aircraft disappearance: all messages received from an aircraft are suddenly deleted and no longer transmitted, with the objective of causing collision avoidance systems to fail and confusing ground stations; this could force an aircraft to perform an emergency landing to avoid any risks.
  • Aircraft spoofing: the objective of this attack is to spoof an aircraft’s ICAO number, a special identifier for the aircraft transmitter; the main risk here is that any aircraft, even if unauthorized, could pass into the premises of a country as a normal aircraft without triggering any alarms.

2.2. Machine Learning Classifiers

The machine learning classifiers used in this study to create a machine learning model are described below, along with their definitions.
  • Random forest: originally introduced by T. K. Ho in 1995 and later extended by L. Breiman and A. Cutler. Also known as random decision forests, this ensemble supervised learning technique builds several decision trees during the training phase and is typically used for classification and regression [5].
  • AdaBoost: created by Y. Freund and R. Schapire in 1995, this model can be combined with various machine learning methods to improve performance. It is an ensemble supervised learning approach that iteratively strengthens weak classifiers by learning from their errors [6].
  • Naive Bayes (NB): a supervised machine learning method for classification founded on the Bayes theorem. Under the naive assumption that pairs of features are independent, this algorithm is regarded as a fast and good classifier, though a comparatively weak estimator [7].
  • Neural networks (NN): a subset of machine learning that uses supervised learning. It consists of algorithms that attempt to imitate the workings of the human brain to uncover relationships in datasets [8].
  • Support vector machine (SVM): introduced by V. Vapnik et al. in 1993, it is another supervised machine learning technique and is regarded as one of the most reliable prediction algorithms. It is a non-probabilistic classifier because it separates the samples into two groups [9].
  • K-nearest neighbors (kNN): E. Fix and J. Hodges created the initial version in 1951. kNN, or nearest neighbor, is another supervised machine learning model that classifies an object according to its closest neighbors, on the assumption that similar objects lie close together; neighbor contributions can be weighted by distance [10].
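As an illustration, the six classifier families above can be instantiated in scikit-learn. The library choice and every hyperparameter below are our assumptions for a minimal sketch, not values taken from the study:

```python
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

# The six classifier families described above, with illustrative
# (not paper-specified) hyperparameters.
classifiers = {
    "Random forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "AdaBoost": AdaBoostClassifier(n_estimators=50, random_state=42),
    "Naive Bayes": GaussianNB(),
    "Neural network": MLPClassifier(hidden_layer_sizes=(64,), max_iter=500,
                                    random_state=42),
    "SVM": SVC(kernel="rbf"),
    "kNN": KNeighborsClassifier(n_neighbors=5),
}
```

Each model exposes the same `fit`/`predict` interface, which makes it straightforward to evaluate all six on the same train/test split.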

4. Methodology

A machine learning-based approach is adopted to achieve efficient detection of malicious ADS-B messages. The machine learning approach requires a dataset to train on; a testing dataset that the model has never seen before is then used to assess the model’s performance. As previously mentioned, the dataset by H. O. Slimane et al. was published in 2022 and has barely been used by academics. Therefore, several machine learning techniques will be used to preprocess, balance, and test this unique dataset to verify that the detection model performs at its best [33]. This section covers the methodology of the proposed model and is divided into four subsections. Section 4.1 provides a detailed view of the dataset, followed by Section 4.2, which highlights the data preprocessing steps. Section 4.3 describes feature selection, and Section 4.4 covers the detection and classification of the various machine learning classifiers used and tested in this study. Figure 2 presents an overview of the proposed model. Our main focus using this model is to detect and classify path modification, ghost aircraft injection, and velocity drift attacks.
Figure 2. An overview of the proposed model.

4.1. The Dataset

This subsection discusses the dataset in terms of features, records, and content. The dataset is publicly available on Mendeley Data and includes authentic ADS-B message samples acquired from the OpenSky Network and injected messages simulated using PyCharm (version 2021.3.1). The dataset contains no personally identifiable information. The adopted dataset by H. O. Slimane et al. comprises two files totaling 3 MB, containing authentic ADS-B message samples acquired from the OpenSky Network and messages injected for path modification, ghost aircraft injection, and velocity drift attacks.
The dataset authors used PyCharm, an integrated development environment for Python, to simulate malicious messages with the required attack intentions. The dataset is balanced and contains 22,316 messages, of which 11,158 are authentic and 11,158 were injected. CSV files were specifically chosen because they readily convey data distribution information.
After removing authentic messages from the dataset, we noticed that the injected message numbers were unequal, implying that the dataset was imbalanced. This required us to use different over- and under-sampling techniques to balance it and improve the model’s quality to help classify imbalanced data and eliminate bias. The following over- and under-sampling techniques were utilized in this study:
  • Random oversampling: randomly selecting samples from the minority class and duplicating them in the training dataset.
  • Random undersampling: the opposite of random oversampling; randomly selected samples from the majority class are removed from the training dataset.
  • Synthetic minority oversampling technique (SMOTE): an oversampling technique for imbalanced datasets that aids machine learning classification by generating new synthetic samples from the existing minority data and adding them to the dataset [34].

4.2. Data Preprocessing

This subsection discusses data preprocessing: the processes carried out on the raw data to prepare it for the operations to be conducted on it. Data preprocessing is a crucial first stage in our study. These methods transform the data into a format that machine learning and other data science techniques can process more quickly and efficiently than raw data, and they are typically applied at the beginning of the machine learning development pipeline to ensure reliable findings. The dataset is available online as program-generated CSV files. The data preprocessing consisted of four main processes, outlined below.
  • Data normalization: standardizing features so that they can be utilized with the machine learning models.
  • Data cleansing: removing errors and duplications from the data to avoid bias and conflicting results during processing.
  • Data encoding: converting the values of selected features from textual values (such as strings and boolean data) to numerical values (integer/float) so that they can be used with the machine learning models.
  • Data shuffling: using different techniques to randomize the training and testing datasets for more reliable results.
The procedures used during data preprocessing are outlined below.
  • We applied label encoding to convert the categorical icao24 and callsign columns into numerical values. We also removed the remaining categorical features to decrease the margin of error.
  • We converted all data to integer/float values using the one-hot encoding technique, which creates a one-hot numeric array to encode categorical information, so that the machine learning model can interpret the data [35].
  • We split the dataset into two parts: one representing all features and the other representing only the label feature (message type).
  • We utilized pandas.drop, pandas.resample, and train_test_split to manipulate the dataset before performing any training or testing. We also used multiple balancing techniques to aid the classification process.
  • We produced multiple training and testing subsets of different sizes, randomly shuffled and regenerated at each iteration, to initiate the learning process.
As discussed, data preprocessing is a crucial stage in machine learning since the effectiveness of the machine learning model is directly impacted by the data quality. We processed, updated, and cleaned the dataset during data preprocessing so it fits the machine learning model.
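The preprocessing steps above (label encoding, feature/label split, shuffled train/test split) can be sketched as follows. The toy rows are invented for illustration; only the column names follow the paper:

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split

# Toy frame standing in for the dataset; values are invented.
df = pd.DataFrame({
    "icao24": ["a1b2c3", "d4e5f6", "a1b2c3", "d4e5f6"],
    "callsign": ["RYR123", "DLH456", "RYR123", "DLH456"],
    "velocity": [250.0, 410.5, 252.3, 300.0],
    "message_type": ["authentic", "injected", "authentic", "injected"],
})

# Label-encode the categorical columns into numerical values.
for col in ("icao24", "callsign", "message_type"):
    df[col] = LabelEncoder().fit_transform(df[col])

# Split into a feature matrix and the label column (message type),
# then into shuffled training and testing subsets.
X = df.drop(columns=["message_type"])
y = df["message_type"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.3, shuffle=True, random_state=42)
```

Re-running the split with a different `random_state` regenerates a fresh shuffled subset, matching the iteration strategy described above.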

4.3. Feature Selection

The dataset consists of 18 features; they characterize what a typical ADS-B message contains and give more detail and understanding of its structure. Table 5 introduces the features and briefly defines what the values represent. We omitted two features (icao24 and callsign) because these attributes are strings that correlate to the aircraft type, and such information has no impact on detecting or classifying an injected message. Moreover, omitting them decreases the margin of error that encoding those strings into numerical values before testing would introduce.
Table 5. Dataset features.

4.4. Detection and Classification

A machine learning classifier is a model that forecasts the class of supplied data points. In supervised learning, test datasets that the machine learning model has never seen before are used to test and evaluate the model after it has been trained on potential attacks using training data. Multiple classifiers were evaluated while creating the ensemble machine learning model to attain the best outcomes. We utilized Jupyter Notebook to test the algorithms mentioned previously in the overview section (random forest, AdaBoost, naive Bayes (NB), neural networks, SVM, and kNN), and the comparison results are presented as confusion matrices for the top three performing models. The algorithms were tested sequentially, without parallelization, for both detection and classification.
  • Detection: the models’ first test is performed on the dataset containing both original and malicious ADS-B messages while ignoring the different types of malicious messages. This dataset was balanced, so no balancing techniques were needed; we then tested the model’s ability to detect a fake message without necessarily classifying it into its relevant attack type. The dataset was split into multiple training/testing configurations, and the model was tested using different training dataset sizes to achieve the highest results while lowering the amount of training required. Efficiency matters here because decisions that depend on these messages can change in a matter of seconds.
  • Classification: here, we omit authentic messages from the dataset, leaving only the injected messages; this enables testing the classification of the three attack types discussed above. Unfortunately, this subset is imbalanced, so we tested different balancing techniques to maximize the model’s efficiency. After balancing the dataset, and before further testing the models’ classification, we split the experiment into different stages to obtain better results with the lowest amount of training needed, hence proposing an efficient model. Although the model will naturally perform better with larger training data, we aim for efficiency and a fast response to zero-day attacks on ADS-B messages, so we moved forward with lower training dataset sizes. We did not notice much improvement beyond 30%, but this still needs further research with a larger dataset.
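The training-size sweep described above can be sketched on synthetic data. The data generator and classifier settings are illustrative assumptions, not the study’s configuration:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Synthetic binary data standing in for authentic vs. injected messages.
X, y = make_classification(n_samples=2000, n_features=16, random_state=42)

# Sweep the training fraction to find the smallest size that still
# gives acceptable accuracy, as in the efficiency experiments above.
for frac in (0.1, 0.2, 0.3):
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, train_size=frac, shuffle=True, random_state=42)
    clf = RandomForestClassifier(n_estimators=100,
                                 random_state=42).fit(X_tr, y_tr)
    acc = accuracy_score(y_te, clf.predict(X_te))
    print(f"train={frac:.0%}  accuracy={acc:.4f}")
```

On the real dataset, the sweep would stop at the fraction beyond which accuracy plateaus (30% in the experiments above).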

5. Analysis and Discussion

The primary objective of our study is to propose a reliable and efficient model for detecting injection within ADS-B messages. We tested the model to work even with a low-volume dataset and minimal training to be fast and responsive without affecting the detection quality. The adopted dataset is relatively new, was only published in 2022, and has only been used once by academics. The dataset will be tested using a variety of machine learning algorithms to guarantee the detection model performs as well as possible.
This section describes the methods implemented to build our proposed detection model. The section will be divided into four main sub-sections. Section 5.1 presents the environment where the experiment took place. Section 5.2 showcases the model testing process. Section 5.3 contains details of the model evaluation metrics. Section 5.4 presents a comparison with state-of-the-art models. Firstly, Table 6 represents the experimental environment where the model was tested.
Table 6. Experimental environment.

5.1. Model Evaluation

This section presents the performance evaluation results for the proposed detection model based on random forest in various indicators. A confusion matrix is used to confirm the performance of the suggested machine learning model. The confusion matrix uses the following criteria to assess the model’s prediction:
  • True positives (TP): an injected message that the model correctly predicted as injected.
  • True negatives (TN): a genuine message that the model correctly predicted as genuine.
  • False positives (FP): a genuine message that the model incorrectly predicted as injected.
  • False negatives (FN): an injected message that the model incorrectly predicted as genuine.
The confusion matrix can be used to build several equations that can be used to understand and assess the performance of the proposed model, including the following ones:
  • Accuracy indicates the percentage of real positives and real negatives that the model correctly identified.
    Accuracy = (TP + TN) / (TP + TN + FP + FN)
  • Precision indicates the percentage of real positives among the model’s positive predictions. Precision measures how likely the method is to produce accurate results.
    Precision = TP / (TP + FP)
  • Recall calculates the proportion of actual positives that the model correctly identified, accounting for the positives the model missed. Therefore, a high recall value is desired for an IDS model.
    Recall = TP / (TP + FN)
  • F1-score is the weighted harmonic mean of precision and recall, summarizing both in a single value.
    F1-score = 2 × (Precision × Recall) / (Precision + Recall)
  • Matthews correlation coefficient (MCC) is one of the best measures for summarizing a confusion matrix and the quality of a binary classification. It is a correlation coefficient between the observed and predicted classifications, and it is high only if all four categories of the confusion matrix have good results.
    MCC = (TP × TN − FP × FN) / √((TP + FP)(TP + FN)(TN + FP)(TN + FN))
We will use accuracy, F1-score, and Matthews correlation coefficient to compare the findings of this study with the previous related literature since the accuracy statistic is frequently used to assess the effectiveness of machine learning algorithms. Accuracy could be misleading when working with imbalanced datasets because it would predict based on the majority class, but this was addressed using different sampling techniques to remove bias. The F1-score and MCC metrics were used due to their ability to showcase the quality of the results.
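All five metrics can be computed directly with scikit-learn. The toy labels below are illustrative only (1 = injected, 0 = genuine):

```python
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, matthews_corrcoef)

# Toy ground truth and predictions: 1 = injected, 0 = genuine.
# Here TP = 3, TN = 3, FP = 1, FN = 1.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

print("accuracy :", accuracy_score(y_true, y_pred))    # (3+3)/8 = 0.75
print("precision:", precision_score(y_true, y_pred))   # 3/(3+1) = 0.75
print("recall   :", recall_score(y_true, y_pred))      # 3/(3+1) = 0.75
print("F1-score :", f1_score(y_true, y_pred))          # 0.75
print("MCC      :", matthews_corrcoef(y_true, y_pred)) # (9-1)/16 = 0.5
```

Working the MCC by hand from the counts above: (3 × 3 − 1 × 1) / √(4 × 4 × 4 × 4) = 8/16 = 0.5, matching the library result.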

5.2. Model Estimation and Optimization

We split the dataset into training and testing datasets using k-fold cross-validation with ten folds. The k-fold cross-validation technique splits the dataset into multiple subsets and, at each iteration, chooses different sets to serve as the training and testing datasets. This verifies a model’s performance while utilizing the whole dataset, shuffling the data, and removing any bias. K is the parameter that indicates the number of folds. Figure 3 displays the k-fold cross-validation method [36].
Figure 3. K-fold cross-validation technique [37].
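A minimal sketch of ten-fold cross-validation with scikit-learn, run here on synthetic stand-in data rather than the study’s dataset:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, KFold

# Synthetic stand-in data for the ADS-B message features.
X, y = make_classification(n_samples=1000, n_features=16, random_state=42)

# Ten shuffled folds: each iteration holds out a different tenth
# of the data for testing and trains on the remaining nine tenths.
cv = KFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_val_score(RandomForestClassifier(random_state=42), X, y, cv=cv)
print(f"mean accuracy over 10 folds: {scores.mean():.4f}")
```

Because every sample appears in a test fold exactly once, the mean score reflects the whole dataset without bias toward any single split.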
We also performed dimensionality reduction: the transformation of data from a high-dimensional space into a low-dimensional space such that the low-dimensional representation retains meaningful properties of the original dataset, ideally close to its intrinsic dimension. Working with data in a high-dimensional space is often undesirable and computationally hard, so lowering the dimensionality of the data can boost a model’s performance.
We used principal component analysis (PCA), which is considered the main linear technique for reducing dimensionality. PCA performs a linear mapping of the data to a lower-dimensional space in such a way that the variance of the data in the low-dimensional representation is maximized.
In principle, this will boost the effectiveness of our model and lower the needed features. PCA was applied, where 16 principal components were identified. Unfortunately, in our case, it caused a loss of information and problems with data interpretation since it uses linear combinations of original variables; this led to a drop in accuracy from 99.41% to 93.01%. Likewise, F1-score also dropped from 99.37% to 92.15%. Lastly, MCC also dropped from 0.988 to 0.888. The model performed better by utilizing all features of the ADS-MI-2022 dataset with no linear combinations or dimensionality reductions. So, we continued testing our model by utilizing all of the features of the ADS-MI-2022 dataset.
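The comparison between using all features and a PCA-reduced representation can be sketched as follows. The synthetic data and the component count are assumptions, so the accuracy gap will not mirror the paper’s exact figures:

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Synthetic stand-in data; 16 features mirrors the retained feature count.
X, y = make_classification(n_samples=2000, n_features=16, n_informative=10,
                           random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=0.3,
                                          random_state=42)

# Baseline on all features vs. a PCA-reduced representation.
base = RandomForestClassifier(random_state=42).fit(X_tr, y_tr)
pca = PCA(n_components=8).fit(X_tr)  # component count is illustrative
red = RandomForestClassifier(random_state=42).fit(pca.transform(X_tr), y_tr)

print("all features:", accuracy_score(y_te, base.predict(X_te)))
print("PCA-reduced :", accuracy_score(y_te, red.predict(pca.transform(X_te))))
```

As in the experiment above, the reduced model can lose information because each principal component is a linear combination of the original variables, which also complicates interpreting which ADS-B fields drive a detection.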

5.3. Model Analysis

5.3.1. Binary Classification

A machine learning classifier is a model that forecasts the class of supplied data points. In supervised learning, test datasets that the machine learning model has never seen before are used to test and evaluate the model after it has been trained on potential attacks using training data. Multiple classifiers were tested while the suggested machine learning model was constructed to attain the best outcomes. The best outcome was found between 20% and 30% training set size. We decided to continue with 30% to have an optimal training set size and to be more robust to unknown attacks, as the model is intended to be as efficient as possible; this showed positive effects on the different binary classification results. The top three ML classifiers evaluated for this work are listed in Table 7. Random forest was utilized to form the ensemble machine learning classifier from the classifiers described since it produced the best overall results.
Table 7. Binary classification report.

5.3.2. Multiclass Classification

Authentic messages were removed to train the model on the different types of attacks and improve its accuracy and detection rate. This left the dataset imbalanced; thus, we used three different data sampling techniques. Since we are proposing an efficient detection model, only results using a training dataset of 10–30% of the whole dataset size are showcased in this section. Table 8 showcases the top three algorithms and their performance in each train-test split. To conclude this phase, which included training the ML models, using 30% of the dataset balanced with random oversampling gave us the best results. Therefore, we continued testing based on those results and used the random forest classifier.
Table 8. Performance of the top three ML models.

5.4. Discussion

The study’s primary goal is binary classification to detect injected ADS-B messages and counter path modification, ghost aircraft injection, and velocity drift attacks. By identifying the sort of attack that was launched, we can better prepare our defenses. Below is a full examination of the findings using the proposed model compared with the related literature.
  • The authors in [4] used radio frequency transmitters to attack ADS-B communication channels. They used 36 different ADS-B in message configurations (13 hardware devices and 22 software). They attacked them using BladeRF and HackRF as attacking devices. They achieved 90% accuracy in detecting spoofed messages and showed that their method could help distinguish the different ADS-B messages.
  • M. Leonardi et al. [21] proposed a detection model that detects malicious messages by fingerprinting wireless radio signals. However, although their proposed method is novel, only 50 percent of the malicious signals were detected.
  • The authors of [38] proposed employing multiple variants of the SVM algorithm to detect and classify attacks on ADS-B messages and then compare the results. They produced the dataset used to test our proposed model.
  • N. Li et al. [39] proposed a model utilizing supervised deep learning to detect attacks on ADS-B messages using a fake data generator.
After showcasing the results of the proposed system with a set of different attributes, Table 9 compares the proposed model with existing state-of-the-art models in a similar field of study. The comparison was based on the following performance factors: analysis algorithm, accuracy, F1-score, and Matthews correlation coefficient (MCC).
Table 9. Comparison with some of the related work.
Our results show that it is possible to build precise detection models for this kind of data that can operate across various stages of the life cycle of these messages by relying on the analysis of transmitted signals. The suggested detection model outperformed the majority of those in the literature in terms of results. This section reviewed the experiment results, evaluated the suggested approach, and compared the output with previous pioneering work in this field. The experimental results are outstanding and impactful in the field. However, further improvements can be made to improve the quality of these results.

6. Conclusions and Remarks

Automatic dependent surveillance-broadcast (ADS-B) is considered the future of aviation surveillance and aircraft traffic control, yet the protocol still lacks security. Researchers are trying to improve the security of ADS-B to avoid risks such as collision avoidance system failure, reporting the wrong status of an aircraft, or even aircraft theft. Use of this technology is growing, but the data transmitted are unencrypted with no real method of authentication, tempting hackers to exploit its flaws. We examined most of the prior research in the field for this study.
Firstly, the literature addressed some key issues and vulnerabilities related to ADS-B technology. However, attacks still occur using low-cost equipment supported by technological advancement, creating new obstacles to overcoming these multiple risks.
Secondly, this study aims to suggest a robust and efficient detection model for injected ADS-B messages to aid the security of air traffic management and offer answers to some of the challenges associated with identifying those messages and the attacks related to them. The unique dataset has just over 11,000 injected samples and 11,000 benign samples, totaling 3 MB in size. The dataset had to be cleaned before conducting any research because its raw form obstructed the classification process; a set of related features was then chosen to identify the existing samples. The experiment required testing six algorithms for better prediction results. We then compared the top three results from the binary classification stage.
Thirdly, our test results were as follows: random forest, the multilayer perceptron classifier, and k-nearest neighbor achieved accuracies of 99.14%, 93.45%, and 96.50%, respectively. Random forest also outperformed the multilayer perceptron classifier and k-nearest neighbor with an F1-score of 99.14% as opposed to 93.45% and 96.50%, respectively, and with an MCC of 0.982 as opposed to 0.870 and 0.930, respectively.
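The three reported metrics (accuracy, F1-score, and the Matthews correlation coefficient) can be computed directly with scikit-learn. The toy prediction vector below is illustrative only; its values are unrelated to the paper's results.

```python
# Minimal illustration of the evaluation metrics on a toy binary prediction.
from sklearn.metrics import accuracy_score, f1_score, matthews_corrcoef

y_true = [1, 1, 0, 0, 1, 0, 1, 0]  # 1 = injected message, 0 = benign
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]  # one injected message missed (a false negative)

acc = accuracy_score(y_true, y_pred)     # 7 of 8 correct = 0.875
f1 = f1_score(y_true, y_pred)            # 2TP / (2TP + FP + FN) = 6/7
mcc = matthews_corrcoef(y_true, y_pred)  # in [-1, 1]; 1 means perfect prediction
```

MCC is particularly informative here because, unlike accuracy, it stays low when a classifier ignores the minority (injected) class.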
Then, these models were further tested for multiclass classification; since the dataset was imbalanced at this stage, three sampling techniques were used. The results were as follows: random forest, the multilayer perceptron classifier, and k-nearest neighbor achieved accuracies of 99.41%, 96.78%, and 94.80%, respectively. Random forest also outperformed the multilayer perceptron classifier and k-nearest neighbor with an F1-score of 99.37% as opposed to 96.53% and 94.61%, respectively, and with an MCC of 0.988 as opposed to 0.925 and 0.843, respectively.
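One widely used rebalancing technique for imbalanced multiclass stages of this kind is SMOTE, which synthesizes new minority-class samples by interpolating between a minority sample and one of its nearest minority-class neighbors. The NumPy sketch below shows only that core idea in simplified form; it is not the paper's implementation, and production work would normally use a maintained library such as imbalanced-learn.

```python
# Simplified SMOTE-style oversampling: synthesize n_new minority samples by
# interpolating toward randomly chosen same-class nearest neighbors.
import numpy as np

def smote_oversample(X, y, target_class, n_new, k=5, seed=0):
    """Return n_new synthetic samples for target_class (core SMOTE idea)."""
    rng = np.random.default_rng(seed)
    Xm = X[y == target_class]            # minority-class samples only
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(Xm))
        d = np.linalg.norm(Xm - Xm[i], axis=1)   # distances within the class
        neighbors = np.argsort(d)[1:k + 1]       # k nearest, excluding itself
        j = rng.choice(neighbors)
        lam = rng.random()                       # interpolation factor in [0, 1)
        synthetic.append(Xm[i] + lam * (Xm[j] - Xm[i]))
    return np.vstack(synthetic)
```

Because each synthetic point lies on a segment between two real minority samples, the oversampled class stays inside the region the minority data already occupies.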
While our research focuses on building and evaluating a standalone IDS to improve the security of aviation control systems against message injection attacks (specifically path modification, ghost aircraft injection, and velocity drift), the proposed IDS still needs to be integrated within current aviation control infrastructures. The proposed sequence of events for implementing the detection system is to fine-tune the approach, integrate it with aviation systems, obtain regulatory approval, and then deploy it and train the crew. Several integration challenges may arise, among them assuring data quality, achieving real-time processing, and reducing false alarms. Integration with current protocols improves aviation security; however, modifications are required for standardization, affordability, and scalability before broad implementation.

Author Contributions

Conceptualization, Q.A.A.-H. and A.A.-T.; methodology, Q.A.A.-H. and A.A.-T.; software, A.A.-T.; validation, Q.A.A.-H.; formal analysis, Q.A.A.-H. and A.A.-T.; investigation, Q.A.A.-H. and A.A.-T.; resources, Q.A.A.-H. and A.A.-T.; data curation, Q.A.A.-H.; writing—original draft preparation, Q.A.A.-H. and A.A.-T.; writing—review and editing, Q.A.A.-H. and A.A.-T.; visualization, A.A.-T.; supervision, Q.A.A.-H.; project administration, Q.A.A.-H.; funding acquisition, Q.A.A.-H. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The dataset associated with this research is publicly available through the Mendeley-Elsevier data repository and can be accessed via https://doi.org/10.17632/6fhw732ccz.1 (accessed on 12 November 2023).

Conflicts of Interest

The authors declare no conflict of interest.
