Transfer and CNN-Based De-Authentication (Disassociation) DoS Attack Detection in IoT Wi-Fi Networks

Abstract: The Internet of Things (IoT) is a network of billions of interconnected devices embedded with sensors, software, and communication technologies. Wi-Fi is one of the main wireless communication technologies essential for establishing connections and facilitating communication in IoT environments. However, IoT networks face major security challenges due to various vulnerabilities, including de-authentication and disassociation DoS attacks that exploit IoT Wi-Fi network vulnerabilities. Traditional intrusion detection systems (IDSs) have improved their cyberattack detection capabilities by adopting machine learning approaches, especially deep learning (DL). However, DL-based IDSs still need improvements in accuracy, efficiency, and scalability to properly address the security challenges of IoT environments, including de-authentication and disassociation DoS attacks. The main purpose of this work was to overcome these limitations by designing a transfer learning (TL) and convolutional neural network (CNN)-based IDS for de-authentication and disassociation DoS attack detection with better overall accuracy than various current solutions. The distinctive contributions include a novel data pre-processing technique and a de-authentication/disassociation attack detection model, accompanied by effective real-time data collection, parsing, analysis, and visualization to generate our own dataset, namely, the Wi-Fi Association_Disassociation Dataset. To that end, a complete experimental setup and extensive research were carried out with performance evaluation through multiple metrics, and the results reveal that the suggested model is more efficient and exhibits improved performance, with an overall accuracy of 99.360% and a low false negative rate of 0.002. The findings from the intensive training and evaluation of the proposed model, and the comparative analysis with existing models, show that this work allows improved early detection and prevention of de-authentication and disassociation attacks, resulting in an overall improved network security posture for all Wi-Fi-enabled real-world IoT infrastructures.


• We propose an end-to-end IDS solution for de-authentication/disassociation DoS attack detection in IoT Wi-Fi networks.
• We design a complete testbed for collecting real-time network traffic and modules for parsing unstructured network traffic, analyzing structured data, and generating datasets for the proposed attack detection solution.
• We propose a novel data pre-processing technique to prepare our Wi-Fi Association_Disassociation dataset and make it suitable for TL- and CNN-based attack detection.
• We evaluate our solution's performance using different metrics, including confusion matrix, accuracy, precision, recall, F1-score, and ROC/AUC. Then, we compare it with state-of-the-art solutions that involve both traditional machine learning (TML) and DL models. We show that our solution can effectively detect de-authentication/disassociation attacks with high accuracy.
The rest of the paper is organized as follows: Section II discusses some background knowledge. Section III presents the current state of research on intrusion detection focusing on convolutional neural networks and transfer learning. Section IV discusses the details of the proposed solution while Section V describes the experimental configurations, results, and analysis. Section VI explains and interprets the results of the solution. Finally, the conclusion of the paper is presented in Section VII.

Intrusion Detection System (IDS)
An intrusion in cyber security refers to unauthorized access or entry into a node or network system. This can be accomplished through various methods such as hacking, malware, or social engineering. An intrusion's goal could be to steal sensitive information, disrupt operations, or gain control of a targeted node or an entire network. To protect against these types of threats, intrusion detection and prevention are critical aspects of cyber security. An intrusion detection system (IDS) detects malicious users' unauthorized access to systems or networks. The primary duties of an IDS are to monitor hosts and networks, evaluate computer system activity, produce warnings, and respond to suspicious behavior. The basic architecture of an IDS, as shown in Figure 1 (adapted from [24]), consists of a data collection device (sensor), an intrusion detection engine, a knowledge base (database), a configuration device, and a response component [25]. Based on different factors, IDSs can be classified into various types, among which the network IDS (NIDS) is widely deployed; our solution falls under this category. A NIDS typically examines network traffic for patterns or anomalies that might reveal an intrusion. It is often deployed on the border router or switches and monitors network traffic flow to identify threats that occur across the network connection [26]. It operates at the Network Layer or Data Link Layer of the OSI model. Most NIDSs are independent of the operating system (OS), allowing them to be used in various OS scenarios. Furthermore, NIDSs can identify specific protocol types and network attacks. The disadvantage is that they only monitor traffic moving through a particular network segment.

Machine and Deep Learning for IDS
Both traditional machine learning (TML) and deep learning approaches prove their capability for the advancement of IDSs. However, because of the speed and amount of IoT-produced data, TML approaches continue to face several challenges in extracting relevant features from the massive and unstructured data created by IoT devices as they need well-crafted feature engineering. Deep learning methods have become the preferred approach in recent years due to their ability to learn features automatically, efficiently handle large-scale datasets, and adapt and learn new data with less extensive retraining, as well as their flexibility and capability of capturing nonlinear relationships in data.

Convolutional Neural Networks
A convolutional neural network (CNN) is one of the most widely used forms of neural networks in computer vision. It has proved effective in many tasks such as face recognition [27], object detection [28], picture classification [25], image restoration [29], image captioning [30], industrial applications [31], audio recognition [32], and so on, which makes it particularly useful in image analysis. The basic architecture of a CNN, as shown in Figure 2 (adapted from [33]), consists of different types of layers, among which the main ones are the convolutional, pooling, rectification, and flatten layers. At the core of a CNN is the convolutional layer, whose units are organized into output feature maps produced by the convolution operation using filters or kernels. The output feature maps are then passed through the pooling layer to reduce their dimensions and the number of parameters, which limits overfitting and leads to faster training. At the same time, pooling keeps the majority of the dominant information (or features) at every stage. Furthermore, pooling filters aid in learning the most important features and removing outliers and inconsistencies. Classifiers are often composed of fully connected layers, and they carry out classification tasks according to the identified features. In general, in contrast to other DL or ML algorithms that can be over-fitted with massive amounts of data, CNNs can quickly determine the type of attack.
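The layer sequence described above can be illustrated with a minimal sketch in plain Python. It is not the model used in this work: the 5 x 5 input, the 2 x 2 kernel, and the function names are assumptions chosen only to show how convolution, rectification, pooling, and flattening transform the data step by step.

```python
# Toy forward pass: convolution -> rectification (ReLU) -> max pooling -> flatten.
# All values and names here are illustrative assumptions, not the paper's model.

def conv2d(image, kernel):
    """Valid (no padding), stride-1 sliding-window product sum.

    As in most deep learning libraries, the kernel is not flipped,
    so this is strictly a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def relu(fmap):
    """Rectification: negative activations are set to zero."""
    return [[max(0, v) for v in row] for row in fmap]

def max_pool(fmap, size=2):
    """Non-overlapping max pooling; incomplete border windows are dropped."""
    return [
        [
            max(fmap[i + a][j + b] for a in range(size) for b in range(size))
            for j in range(0, len(fmap[0]) - size + 1, size)
        ]
        for i in range(0, len(fmap) - size + 1, size)
    ]

def flatten(fmap):
    """Turn a 2D feature map into the 1D vector fed to fully connected layers."""
    return [v for row in fmap for v in row]

# A 5x5 input with a bright vertical stripe in columns 2 and 3.
image = [[0, 0, 1, 1, 0] for _ in range(5)]
kernel = [[-1, 1], [-1, 1]]  # responds to dark-to-bright vertical edges

feature_map = relu(conv2d(image, kernel))   # 4x4 feature map
pooled = max_pool(feature_map)              # 2x2 after pooling
features = flatten(pooled)                  # 4 values for a classifier
```

Pooling shrinks the 4 x 4 feature map to 2 x 2 while keeping the strongest edge response, which is the dimensionality reduction the paragraph refers to.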

Transfer Learning
Transfer learning (TL) [34] is a powerful method for exploiting deep neural networks on smaller datasets. The goal of transfer learning is to transfer the structure and parameters of large-scale dataset-trained models (e.g., AlexNet, VGGNet, and ResNet) to new tasks and use the weights trained on the large dataset as the initial weights for the new task [18]. As a result, TL leverages the knowledge learned from the source domain to solve the problem in the target domain without the need to learn from scratch with a massive amount of data. As opposed to training new models from the beginning using new datasets, we may leverage the patterns learned from datasets like ImageNet (millions of photos of various things) as the foundation of the new problem, as shown in Figure 3.
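The idea of reusing learned parameters can be sketched with a toy example. The sketch below is an assumption-laden illustration rather than the procedure of this work: `PRETRAINED_W` stands in for weights learned on a large source dataset, the tiny target dataset is made up, and only a new logistic-regression head is trained while the base stays frozen.

```python
import math

# Hypothetical "pre-trained" base weights, standing in for parameters learned
# on a large source dataset. They are frozen: nothing below updates them.
PRETRAINED_W = [[1.0, -1.0], [-1.0, 1.0]]

def base_features(x):
    """Frozen feature extractor: one linear layer followed by ReLU."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in PRETRAINED_W]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_head(samples, labels, epochs=200, lr=0.5):
    """Train only the new classification head on the small target dataset."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            f = base_features(x)
            p = sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b)
            err = p - y  # gradient of the log-loss with respect to the logit
            w = [wi - lr * err * fi for wi, fi in zip(w, f)]
            b -= lr * err
    return w, b

def predict(x, w, b):
    f = base_features(x)
    return 1 if sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b) >= 0.5 else 0

# Made-up target task: label 1 when the first value exceeds the second.
samples = [[2.0, 0.5], [1.5, 0.2], [0.3, 1.8], [0.1, 2.2], [3.0, 1.0], [0.5, 2.5]]
labels = [1, 1, 0, 0, 1, 0]

w, b = train_head(samples, labels)
predictions = [predict(x, w, b) for x in samples]
```

Because the frozen base already produces useful features, six labeled samples are enough to fit the head. With a real pre-trained network such as VGGNet or ResNet, the same pattern applies at a much larger scale, and some base layers may additionally be fine-tuned.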

IoT and the Wi-Fi Protocol
The Internet of Things (IoT) is a network of billions of interconnected physical devices embedded with sensors, software, and communication technologies with the capabilities of gathering and exchanging data via the Internet, allowing them to communicate with one another and execute a variety of functions independently. Wi-Fi has become part of our daily lives and more popular for use in many areas of application including smart home equipment due to its low cost, ubiquity, and ease of connecting. The continuous decline in the price of Wi-Fi chipsets has contributed significantly to its expansion. With the rapidly increasing IoT advancement and billions of connected objects, Wi-Fi applications reached new heights. More than 37 billion Wi-Fi-enabled devices were shipped in 2021, and the global value of Wi-Fi is estimated to be $4.9 trillion by 2025 [35]. The increasing deployment of IoT devices is the primary driver of this market growth.
Wi-Fi-enabled devices communicate via radio signals with an intermediary device such as an access point (AP), a networking device linked to a wired or cellular network. The AP effectively turns Internet data into radio waves and transmits them into the surrounding area.

De-Authentication and Disassociation Attacks
An IoT Wi-Fi network has many vulnerabilities such as an open wireless medium, lack of robust encryption protocols, the unprotected nature of the management frames, and insufficient validation of de-authentication frames. These security challenges make an IoT Wi-Fi network vulnerable to various attacks such as Denial of Service (DoS), spoofing, eavesdropping, Man-In-The-Middle Attack, and many others [36]. The de-authentication and disassociation DoS attacks are the attack scenarios that this work focuses on. The security evaluation of a network generally involves different phases such as exploration, analysis, attack, and operation. For the attack scenarios, network traffic associated with the association/disassociation process of the 802.11 Wi-Fi networks is the focus of the analysis and attack detection.

De-Authentication DoS Attack
De-authentication attacks are classified as management frame attacks and fall under the Denial of Service category; they aim to disrupt the communication between users (stations) and the Wi-Fi AP [37]. Normally, a de-authentication frame is used to gracefully end a connection between a connected client and an access point. Either the AP or the station can send the de-authentication frame because of a shutdown, loss of coverage, or other reasons. When the AP receives this frame, it also transmits a de-authentication frame back to the client. An attacker can exploit this regular process by first waiting for a client to authenticate with the AP and then spoofing the MAC address of the target client and sending a de-authentication frame to the AP on behalf of the victim. This terminates the station's connection to the AP. The entire process is shown in Figure 4 (adapted from [38]).

Disassociation DoS Attack
An existing association is terminated via a disassociation frame. Either of the two associated stations can initiate a disassociation notice. Disassociation cannot be denied since it is a notification rather than a request. The receiving station clears the appropriate states and keys from its memory in response to the disassociation notice. Stations often disassociate when they leave the network or when they relocate and want to join another network. If an AP cannot manage all of its associated stations or is restarting, it sends a broadcast disassociation frame to disconnect all of them. Because the disassociation frame is neither encrypted nor authenticated, it is vulnerable to spoofing. Anyone may spoof the source address (SA) using MAC spoofing tools and techniques such as Aircrack-ng, MAC changer, Scapy, or custom scripts. By sending a forged disassociation frame, the attacker can force the targeted client to disassociate, as shown in Figure 5 (adapted from [38]).
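To make the two frame types concrete, the following sketch identifies de-authentication and disassociation frames from raw bytes. It relies on the standard IEEE 802.11 MAC header layout (management frame type 0, subtype 10 for disassociation and 12 for de-authentication). The function name and the sample frames are hypothetical, and the input is assumed to start at the MAC header with no radiotap or other capture header in front of it.

```python
import struct

# IEEE 802.11 management-frame subtypes (frame type 0).
SUBTYPES = {0x0A: "disassociation", 0x0C: "deauthentication"}

def format_mac(raw):
    return ":".join(f"{byte:02x}" for byte in raw)

def parse_management_frame(frame):
    """Return details of a de-authentication/disassociation frame, else None.

    Assumes `frame` begins at the 802.11 MAC header (no capture header).
    """
    if len(frame) < 26:  # 24-byte management header + 2-byte reason code
        return None
    frame_control, _duration = struct.unpack_from("<HH", frame, 0)
    frame_type = (frame_control >> 2) & 0x3   # bits 2-3
    subtype = (frame_control >> 4) & 0xF      # bits 4-7
    if frame_type != 0 or subtype not in SUBTYPES:
        return None
    (reason_code,) = struct.unpack_from("<H", frame, 24)
    return {
        "kind": SUBTYPES[subtype],
        "destination": format_mac(frame[4:10]),   # address 1
        "source": format_mac(frame[10:16]),       # address 2, trivially spoofable
        "bssid": format_mac(frame[16:22]),        # address 3
        "reason_code": reason_code,
    }

# Hypothetical sample frames: header fields followed by reason code 7.
deauth = bytes.fromhex(
    "c0003a01" "ffffffffffff" "aabbccddeeff" "aabbccddeeff" "0000" "0700"
)
disassoc = bytes.fromhex(
    "a0003a01" "112233445566" "aabbccddeeff" "aabbccddeeff" "0000" "0700"
)
beacon_like = bytes.fromhex("8000" + "00" * 24)  # subtype 8, not of interest

deauth_info = parse_management_frame(deauth)
disassoc_info = parse_management_frame(disassoc)
```

Nothing in the frame proves who sent it: the source address is simply a field the sender fills in, which is why a spoofed frame is indistinguishable from a legitimate one at this level.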

Log Collection and Parsing
Systems of every scale generate logs on a regular basis to record system states and runtime information; each log entry includes a timestamp and a message describing what happened. This information can be used for a variety of purposes (for example, anomaly detection), and thus logs are gathered first for later use. Logs are unstructured plain text and consist of constant and variable parts [39]. The constant parts are predefined in the source code and remain the same across occurrences. The variable parts are generated dynamically and change from one occurrence to another. Log parsing extracts a group of event templates from the raw logs to create a structured and well-established log format. More specifically, each log message can be parsed into an event template (constant part) with some specific parameters (variable part).
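The split into an event template and parameters can be sketched with a regular expression. This is a simplified illustration, not the parser used in this work: the sample AP log lines are hypothetical, and only MAC addresses and standalone integers are treated as variable parts.

```python
import re

# Variable parts recognized by this sketch: MAC addresses and standalone
# integers. Digits inside tokens such as "wlan0" or "802.11" are left alone.
VARIABLE_RE = re.compile(
    r"(?:[0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2}"   # MAC address
    r"|(?<![\w.])\d+(?![\w.])"                # standalone integer
)

def parse_log(message):
    """Split a log message into (event template, list of parameters)."""
    params = []

    def capture(match):
        params.append(match.group(0))
        return "<*>"

    template = VARIABLE_RE.sub(capture, message)
    return template, params

# Hypothetical AP log messages (format assumed for illustration only).
line_a = "wlan0: STA aa:bb:cc:dd:ee:ff IEEE 802.11: associated (aid 1)"
line_b = "wlan0: STA 11:22:33:44:55:66 IEEE 802.11: associated (aid 2)"
line_c = "wlan0: STA aa:bb:cc:dd:ee:ff IEEE 802.11: disassociated"

template_a, params_a = parse_log(line_a)
template_b, params_b = parse_log(line_b)
template_c, params_c = parse_log(line_c)
```

Here `line_a` and `line_b` map to the same template, `wlan0: STA <*> IEEE 802.11: associated (aid <*>)`, and differ only in their parameters, while `line_c` yields a separate disassociation template.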

Elastic Stack
The Elastic Stack is an open-source project that primarily consists of Elasticsearch, Logstash, and Kibana (ELK), used for data preprocessing, data search, and data visualization [40]. ELK is preferred because it is open-source and supports log collection, processing, and visualization. These features help in developing effective and efficient vendor-independent solutions including log collection, log parsing, visualization, and feature extraction.
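As a rough illustration of how parsed records can be handed to Elasticsearch, the sketch below builds the newline-delimited JSON body that the Elasticsearch bulk API accepts: one action line followed by one document line per record. The index name and record fields are hypothetical, and the sketch only constructs the payload; in the pipeline described here, a tool such as Logstash would typically perform this step.

```python
import json

def to_bulk_payload(records, index_name):
    """Build a newline-delimited JSON body for the Elasticsearch bulk API."""
    lines = []
    for record in records:
        lines.append(json.dumps({"index": {"_index": index_name}}))  # action
        lines.append(json.dumps(record))                             # document
    # The bulk body must end with a newline character.
    return "\n".join(lines) + "\n"

# Hypothetical structured records produced by a log parser.
records = [
    {"station": "aa:bb:cc:dd:ee:ff", "event": "associated"},
    {"station": "aa:bb:cc:dd:ee:ff", "event": "disassociated"},
]

payload = to_bulk_payload(records, "wifi-assoc-logs")
```

Once indexed, such records can be filtered and aggregated in Kibana, for example by event type per station.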

Related Work
In recent years, several studies have been conducted on machine learning approaches for intrusion detection with some of them focusing on DL methods. Due to their capacity to learn features automatically, effectively handle large-scale datasets, and adapt and learn new data with less extensive retraining, flexibility, and capability of capturing nonlinear relationships in data, deep learning (DL) approaches have recently gained popularity in IDS. As a result, various studies have focused on employing deep learning techniques to propose novel solutions addressing two separate technological and regulatory viewpoints, such as anomaly and malware detection; nevertheless, the findings are still unconvincing. Furthermore, most IDSs are based on existing computer networks, wireless sensor networks, and mobile ad hoc networks. However, because of the unique properties of IoT environments, such as access to the global Internet, heterogeneity, computationally limited resources, and being dynamic and constantly evolving areas with new and regularly emerging attack techniques and vulnerabilities, the IDS recommended for these networks is less effective with IoT applications [19,20].
In the solution proposed by Satman et al. [18], a Wireless Intrusion Detection System (WIDS) uses an anomaly behavior analysis approach to detect attacks on Wi-Fi networks with a 99% detection rate and a 0.1% false alarm rate. The approach models the normal behavior of the Wi-Fi protocol using n-grams, which capture continuous sequences of n items, and uses machine learning models to classify Wi-Fi traffic flows as normal or malicious. The approach has been extensively tested on multiple datasets collected locally at the University of Arizona and on the AWID family of datasets. The study by Thing et al. [22] provides a deep learning strategy for detecting anomalies and classifying attacks in IEEE 802.11 networks. To detect network anomalies and properly classify attacks, the suggested system employs a self-learning methodology. The approach is based on a deep neural network architecture known as a stacked autoencoder (SAE). The SAE is trained on the dataset to learn the characteristics required for the accurate detection and classification of network anomalies. The classification is treated as a multi-class problem, and the suggested approach classified the attacks with an overall accuracy of 98.6688%, which is lower than the overall accuracy of our solution. The paper [23] proposes an intrusion detection system for wireless networks that uses a conditional random field and linear correlation-coefficient-based feature selection algorithm. The proposed system achieves an overall detection accuracy of 98.88%. However, its performance leaves room for improvement, and it is not compared with other existing IDSs, a comparison that would give a better understanding of its effectiveness.
The study [41] presents a three-layer hybrid intrusion detection approach for malicious attacks on smart homes. The model, which is suited to large amounts of data, employs a two-layer feature processing technique based on random forest and principal component analysis to minimize data information loss. With binary classifiers, the three-layer detection model can detect four frequent threats and substantially enhance accuracy. The suggested model's experimental assessment is carried out using a real smart home traffic dataset, and it achieves a classification accuracy of 95.90%. The experimental findings demonstrate that the suggested model performs well in detecting and classifying malicious attacks in a smart home. The authors of [42] proposed an IDS for detecting Distributed Denial of Service (DDoS) attacks in IoT networks. The suggested IDS employs a hybrid approach that combines deep learning and multi-objective optimization: the Jumping Gene modified NSGA-II multi-objective optimization approach is used for data dimension reduction, and a convolutional neural network (CNN) incorporating long short-term memory (LSTM) is used for attack classification. The experiment was performed using the CICIDS2017 datasets on DDoS attacks on a high-performance computer (HPC) and achieved an accuracy of 99.03% with a 5-fold reduction in training time.
The authors of [43] proposed the design and implementation of a deep-learning-based model for detecting anomalies in IoT networks. The presented model used CNNs for multiclass and binary classification of network intrusion. Several datasets, including BoT-IoT, IoT Network Intrusion, MQTT-IoT-IDS2020, and IoT-23 intrusion detection datasets, were used to evaluate the model. The model's performance was measured using accuracy, precision, recall, and F1 score. When compared to existing deep learning implementations, the suggested binary and multiclass classification models exhibited good accuracy, precision, recall, and F1 scores. The solution achieved an overall accuracy of 87%. The study [33] presented the use of transfer learning to update deep learning-based intrusion detection systems (DL-IDS). The authors created a CNN-based IDS using the Bot-IoT dataset and updated it with small data from a new dataset called TON-IoT. The results achieved showed promising improvements in multiple metrics regarding detection rate and training between the initial training for the original model and the updated model, in terms of detecting new attack behaviors and improving the detection rate for some classes due to a lack of labeled data. However, the paper does not provide any specific numerical values for the obtained results.
Masum et al. [44] investigated transfer learning for detecting new intrusions. Their method is based on a two-stage process in which the first phase employs the VGG-16 pretrained on the ImageNet dataset, and the second applies a deep neural network (DNN) to extract features. They evaluated the method on the NSL-KDD dataset as well, achieving an accuracy of 70.97% in detecting novel intrusions (KDDTest-21). In 5G IoT contexts, Fan et al. [45] combined transfer and federated learning and proposed federated learning for securely collecting data from several IoT networks. They implemented transfer learning using a CNN to create a personalized intrusion detection model for each IoT network. They evaluated the solution using CICIDS2017 as their source dataset and a different custom dataset as the target dataset. The proposed solution achieved an average accuracy of 91.93%. The paper [46] proposed an IDS that uses a bidirectional long short-term memory (BiLSTM) and CNN hybrid model to detect anomalies in a smart home network. The proposed model uses BiLSTM to preserve learned information across time and a CNN to extract data features. The model was trained and evaluated using the NSL-KDD dataset and achieved an accuracy of 98.93%.
Huong et al. [47] introduced an IDS for IoT systems based on a CNN. The suggested technique extracts log information from an IoT system, such as location, service, and address, into an original feature set, enhances and encodes it, and feeds it into a CNN for training and detection. The approach has a 98.9% average accuracy. The study [48] presents a deep learning and transfer learning-based intrusion detection system. The suggested technique presents network data in the form of a grayscale image using stream data visualization, and then a deep learning method is developed to detect network intrusion based on texture features in the grayscale image. Finally, transfer learning is applied to improve the model's iterative efficiency and flexibility. The experimental findings reveal that the suggested method achieved an accuracy of 97.9%. Using the NSL-KDD dataset as a benchmark, the study [49] presents two deep learning models for intrusion detection systems. The first model is LSTM-only, which solely employs long short-term memory (LSTM) layers while the second model combines CNN with layers of LSTM. Both models are compared to the existing approach for intrusion detection, which employs recurrent neural networks (RNNs). The experimental results show that the maximum accuracy achieved is by the CNN-LSTM model, which is 94.12%. The recent state-of-the-art IDSs based on TML and DL are summarized in Table 1.
Existing DL-based intrusion detection approaches in IoT Wi-Fi networks still need improvements in detection accuracy and lack an end-to-end solution consisting of a complete testbed, traffic data collection, parsing, analysis, visualization, generating dataset, and attack detection with the main focus of de-authentication/disassociation DoS attacks using TL and DL-based IDS systems. As a result, the purpose of this article was to bridge that gap and examine the most effective and efficient application of DL techniques in safeguarding the IoT Wi-Fi network environment.

Proposed Intrusion Detection Method
The aim of the present work was to develop a transfer learning and convolutional neural network-based IDS model that protects IoT Wi-Fi networks from being breached by de-authentication and disassociation DoS attacks. Figure 6 demonstrates the architecture of the proposed system, comprising the following main modules: (1) attack and normal traffic module; (2) log parsing module; (3) indexing and analysis module; (4) visualization and dataset generation module; (5) data preprocessing module; and (6) attack detection module. In the attack and normal traffic module, a complete testbed setup is developed to generate both traffic types, involving legitimate clients and an attacker along with an AP. The log collection and parsing module gathers the plain log data of both traffic types from the AP, parses them, and passes them to the storage location. In the storage module, the structured log data are indexed and stored in Elasticsearch. Analysis, visualization, and monitoring of the stored data are performed by the analysis and visualization module, mainly through the Elastic Stack's Kibana tool. This module also generates a well-structured and filtered dataset that is used for the subsequent deep learning tasks. The generated dataset needs further pre-processing to be suitable for a CNN, which is performed by the data pre-processing module. We named the generated dataset the Wi-Fi Association_Disassociation dataset [50]. The last module performs de-authentication and disassociation attack detection using TL and CNN approaches. Each module of the proposed architecture is explained in detail in the subsequent sections.

Attack and Normal Traffic Generator Module
The availability of datasets is one of the biggest obstacles for ML/DL intrusion detection methods. Privacy is the primary reason for the inadequate availability of datasets in the intrusion detection field: network traffic carries very sensitive information, and making it accessible might disclose consumer and corporate secrets or even private communications. Although many researchers have generated their own data to fill this gap, the majority of the datasets created in these circumstances are not exhaustive, and the samples taken into account are insufficient to represent the latest behaviors. For these and related reasons, we set up our own testbed to generate the Wi-Fi Association_Disassociation dataset for the specific Wi-Fi application, with a focus on de-authentication and disassociation DoS attacks.
This module consists of Wi-Fi client devices, an attacker, and an AP, as shown in Figure 7. Smartphones, tablets, laptops, and a Raspberry Pi (RPI) are used to generate normal traffic, while Kali Linux and NodeMCU-based clients are used to generate attack traffic. We used the RPI to represent other Wi-Fi-enabled IoT devices. Wi-Fi hacking has always relied on a few pieces of hardware. First, it requires a computing device (computer, laptop, Raspberry Pi, etc.) that can execute whatever attack application is attempted. Second, it requires a wireless network adapter with a chipset that supports whatever nefarious Wi-Fi behavior is to be conducted.
Kali Linux is a Debian-based security auditing distribution and the most popular and commonly utilized platform for hacking and penetration testing. Kali comes with over 600 pre-installed tools by default, enabling experts to use these specialized tools for various objectives, including reverse engineering, malware analysis, penetration testing, security research, digital forensics, and many more. Aircrack-ng is one such tool; it is itself a suite of tools used to assess Wi-Fi networks through attacking, monitoring, testing, and cracking. Although it is primarily designed to crack Wi-Fi encryption keys (WEP, WPA, and WPA2), it also carries out other attacks such as replay attacks, de-authentication, disassociation, and establishing fake access points.
The cost of hacking Wi-Fi has dropped considerably, and low-cost microcontrollers are increasingly being transformed into inexpensive yet powerful hacking tools. As a result, such attacks are increasingly within reach even of non-experts. The NodeMCU ESP8266, an Arduino-programmable chip on which the Wi-Fi Deauther project [51] is based, is one of the most popular. Through a very user-friendly web interface, a hacker may establish false networks, clone actual ones, or block all Wi-Fi in an area using this low-cost hardware. Thus, the de-authentication and disassociation attacks in this work were conducted using Aireplay-ng and the ESP8266 NodeMCU. Aireplay-ng is part of the Aircrack-ng suite, which consists of many tools used for Wi-Fi security and comes pre-installed in the Kali Linux open-source distribution. An AR9271 chipset-based Alfa wireless adapter with monitor mode and packet injection capability is used in this process.
Because the communications are broadcast over the air in a public medium, an attacker may watch and collect all non-encrypted data traveling from a client to an AP by deploying low-cost scanning devices known as Wi-Fi sniffers. There are several compact and portable off-the-shelf Wi-Fi hardware sniffers. Such a capability can be easily obtained by attaching a Wi-Fi adapter (like the Atheros AR9271) to Kali, enabling monitor mode on its wireless network interface card (NIC), and installing packet-capturing software (like Wireshark [17]). Similarly, attacks are carried out by plugging the NodeMCU ESP8266 chip into a laptop and accessing the ESP8266 Deauther through a web interface. However, instead of performing limited activities with Wireshark, this work ships the traffic to a different setup for further analysis.
Thus, the attacks in this work are carried out using these two different sets of tools. The attack based on the Aircrack-ng suite is carried out by setting up the latest Kali Linux distribution in VirtualBox, attaching an Atheros AR9271 adapter, and enabling monitor mode on the wireless network card using Airmon-ng. We then scan for nearby access points and their associated clients by listening to (sniffing) the 802.11 beacon frames broadcast by nearby wireless routers or access points. Airodump-ng is used for this purpose; it displays a list of every detected access point in the area as well as a list of connected clients. From the list of available APs, we identify the network on which authorization to perform a penetration test has been granted, as shown in Figure 8. The next step is to identify the target clients that are connected to the target AP, with both being monitored as shown in Figure 9.
Finally, the de-authentication/disassociation attacks are carried out using another Aircrack-ng suite tool called Aireplay-ng, as depicted in Figure 10 (single client) and Figure 11 (multiple clients). Aireplay-ng is a wireless frame injector included in the Aircrack-ng package. Its primary function is to generate traffic for use in Aircrack-ng to crack encryption keys. Aireplay-ng implements several attacks, including de-authentication and disassociation of wireless clients in order to capture encryption data, fake authentication, interactive packet replay, hand-crafted ARP request injection, and ARP request reinjection. When an attack targets a single client, a directed de-authentication is sent to the specific MAC address, whereas attacking all the connected clients involves sending the de-authentication/disassociation frames as broadcast frames, which disconnects all of them from the victim AP.
Figure 9. Scanning clients connected to the victim AP to carry out de-authentication and disassociation DoS attacks.
The de-authentication/disassociation attacks carried out by the ESP8266 NodeMCU Deauther are similar to the steps described above for Kali, except that the procedures are automated through a graphical web interface. The ESP8266 NodeMCU is connected to the laptop's USB port. We then connect to the ESP8266 NodeMCU Deauther's Wi-Fi and access it in the browser using its IP address, as shown in Figure 12. Once in the GUI of the Deauther, we scan for available nearby APs and clients, select one, several, or all of the clients connected to the victim AP, and launch the attacks, as shown in Figures 13 and 14.

Log Collection, Parsing, Storing, Analysis, and Generating Dataset
Network devices such as home Wi-Fi APs routinely generate a huge number of logs to record their states and runtime information, each comprising a timestamp and a log message indicating what has happened [52]. This valuable information can be utilized for multiple purposes; this study uses it for attack detection, with a focus on association and disassociation logs. Log collection involves configuring the AP to send its logs to a central server for further usage and processing. The architecture of the testbed for log collection, parsing, indexing and storing, analysis, and visualization to generate a dataset is shown in Figure 15. It collects Wi-Fi traffic data from the access point, which runs OpenWrt firmware installed on a Raspberry Pi B+. OpenWrt is a Linux-based open-source project for embedded operating systems, primarily used on embedded devices to route network traffic [53,54]. The OpenWrt system logging function is a crucial debugging and monitoring capability. The OpenWrt-based AP is configured to send its logs to the remote server (ELK server) by specifying the remote server's protocol (TCP/UDP), IP, and port. The message format in the OpenWrt-based AP varies depending on the destination (local log read, local file, remote socket), but it is generally represented as follows:
<time stamp> <router name> <subsystem name/pid> <log_prefix>: <message body>
The logging message facility and priority are similar to those found in syslog implementations. Sample logs of the AP are shown in Figure 16. In this work, a parsing algorithm is developed to structure these raw (unstructured) logs. It is a grok-based parsing algorithm built on top of Logstash that takes the raw logs through the input plugin, parses them, and sends them to the Elasticsearch storage server through the output plugin.
A high-level, simplified version of this parsing algorithm is given in Algorithm 1, which is built on the three parts of a Logstash pipeline: input, filters, and output. The input section is in charge of specifying and accessing the input data source (the AP), while Grok-based parsing is a Logstash filter that converts unstructured data into structured and queryable data. The parsing algorithm accepts the raw Wi-Fi traffic log data of the AP through the input and analyses it line by line to identify patterns for the extraction of relevant information based on MAC address, timestamp, association-disassociation, and the type of data (normal or attack). As a result, the raw log entries are parsed according to these matching patterns to generate the structured data, which are indexed and stored in Elasticsearch.
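The line-by-line extraction described above can be approximated outside Logstash with a plain regular expression. The following is a minimal Python sketch, not the paper's Algorithm 1 or its grok patterns: the field layout follows the generic message format given earlier, while the exact regular expressions, the sample log line, the event keywords, and the idea of passing the traffic label (`normal` or `attack`) as a parameter are all assumptions made for illustration.

```python
import re

# Assumed layout: <time stamp> <router name> <subsystem name/pid> <log_prefix>: <message body>
LOG_LINE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<router>\S+)\s+"
    r"(?P<subsystem>\S+?):?\s+"
    r"(?P<prefix>\S+):\s+"
    r"(?P<message>.*)$"
)
MAC = re.compile(r"(?P<mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2})", re.IGNORECASE)
# Hypothetical event keywords; the real AP messages may be worded differently.
EVENT = re.compile(r"\b(?P<event>associated|disassociated|deauthenticated)\b")


def parse_line(line, label):
    """Return a structured record for one raw log line, or None if it does not match."""
    m = LOG_LINE.match(line.strip())
    if not m:
        return None
    mac = MAC.search(m["message"])
    event = EVENT.search(m["message"])
    if not (mac and event):
        return None  # not an association/disassociation entry
    return {
        "timestamp": m["timestamp"],
        "router": m["router"],
        "mac": mac["mac"].lower(),
        "event": event["event"],
        "type": label,  # "normal" or "attack", supplied by the caller (assumption)
    }


# Hypothetical sample line, for illustration only.
sample = "Jan  2 10:15:30 OpenWrt hostapd: wlan0: STA AA:BB:CC:DD:EE:FF IEEE 802.11: disassociated"
record = parse_line(sample, "attack")
```

In a Logstash deployment the same named fields would be produced by a grok filter and shipped to Elasticsearch by the output plugin; the sketch only illustrates which fields are extracted.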

Visualization and Dataset Generation Module
By displaying data in a more intuitive and easy-to-understand style, log data visualizations assist users in understanding, interpreting, and gaining insights from the data. During visualization, we identified patterns for normal and attack traffic. The visualizations showed that flooding with attack traffic occurred at a noticeably higher frequency than normal traffic, suggesting a potential system issue that must be addressed. Various log data visualization techniques can be used, including bar charts, line charts, pie charts, scatter plots, heat maps, and more. With Kibana, we adapted an easy-to-use interface for browsing, visualizing, and analyzing the structured log data stored in Elasticsearch, as shown in Figure 17. After parsing, indexing, storing, and analyzing the structured log data, we generated a structured CSV file, Wi-Fi Association_Disassociation, that is used as the dataset for the de-authentication and disassociation attack detection process; its composition is described in Table 2.

Data Pre-Processing Module
The process of cleaning, converting, and preparing raw data before it can be evaluated is known as data pre-processing. It is a set of procedures and processes used to assure the quality, accuracy, completeness, and consistency of raw data. With data cleaning, we identified and corrected errors, and addressed missing values and inconsistencies in the data. Data transformation converts the data into a format suitable for analysis.
The foundation of anomaly detection is the extraction of features from the given log data. The primary features we focused on while transforming our Wi-Fi Association_Disassociation dataset to be suitable for transfer learning and CNN-based attack detection were the timestamp, association, disassociation, MAC address, and data type, described in Table 3. Considering the network traffic log data as a sequence of events, we defined a fixed window size. Using the predefined window size, association and disassociation events that fall in the same sliding window are treated as an itemset in a single transaction. We implemented this per device (MAC address), where each client's association and disassociation time duration per day is split into windows of the defined size.
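The per-device, per-day windowing described above can be sketched as follows. This is a simplified illustration and not the paper's Algorithm 2: the record layout `(mac, timestamp, event)`, the function name `group_into_windows`, and the sample events are hypothetical, and the sketch uses non-overlapping windows aligned to midnight, which is an assumption.

```python
from collections import defaultdict
from datetime import datetime


def group_into_windows(events, window_min=10):
    """Group (mac, timestamp, event) records per device, per day, and per fixed window."""
    windows = defaultdict(list)
    for mac, ts, event in sorted(events, key=lambda e: (e[0], e[1])):
        minute_of_day = ts.hour * 60 + ts.minute + ts.second / 60
        index = int(minute_of_day // window_min)  # which window of the day
        windows[(mac, ts.date(), index)].append((minute_of_day, event))
    return dict(windows)


# Hypothetical events for one device.
mac = "aa:bb:cc:dd:ee:ff"
events = [
    (mac, datetime(2023, 1, 2, 10, 3, 0), "associated"),
    (mac, datetime(2023, 1, 2, 10, 7, 0), "disassociated"),
    (mac, datetime(2023, 1, 2, 10, 12, 0), "associated"),
]
by_10 = group_into_windows(events, window_min=10)
by_5 = group_into_windows(events, window_min=5)
```

With a 10 min window the first two events share a window and the third falls into the next one; with a 5 min window each event lands in its own window.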
During attacks, the number of associations and disassociations generated is huge compared to normal network traffic flow. Considering a 10 min window of connection time, we transformed the association and disassociation durations to behave like a digital signal: the association duration is high (digital value 1) and the disassociation duration is low (digital value 0). With these assumptions, we converted our entire dataset into images suitable for the proposed IDS model. For comparison purposes, we also used a 5 min window size. Table 4 shows the transformed feature set derived from the initial feature set. We transformed the association and disassociation network traffic for each device (per MAC address). Our pre-processing solution is implemented based on Algorithm 2. In essence, Algorithm 2 derives the day from the timestamp, calculates the total time duration that each device is associated and disassociated per day, sets the window size (5 or 10 min), plots a figure per window by mapping association to an upper limit (digital high or 1) and disassociation to a lower limit (digital low or 0), and saves each image under its own file name. Sample data after conversion to images are shown in Figure 18 (normal) and Figure 19 (attack). Other tasks such as sorting, conversion to minutes, and figure formatting are also performed in this preprocessing approach. After preprocessing the dataset with the novel algorithm, the composition of attack and normal images is as shown in Table 5.
Figure 18. Sample dataset after conversion to images (for normal data).
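To make the high/low encoding concrete, the sketch below samples one window as a 0/1 square wave and counts its transitions. It is an illustration under stated assumptions rather than the paper's Algorithm 2: the function names, the sampling step, the initial state, and the sample events are hypothetical, and the plotting and file-naming steps are omitted because the image size and styling are not specified here.

```python
def to_binary_signal(window_events, window_start, window_min=10, initial=0, step_s=60):
    """Sample one window as a square wave: 1 while associated, 0 while disassociated.

    window_events: list of (minute_of_day, event) tuples inside the window.
    window_start:  minute of day at which the window begins.
    """
    n_samples = window_min * 60 // step_s
    pending = sorted(window_events)
    signal, state, i = [], initial, 0
    for k in range(n_samples):
        t = window_start + k * step_s / 60  # minute of day of this sample
        while i < len(pending) and pending[i][0] <= t:
            # Any event other than an association is treated as low (assumption).
            state = 1 if pending[i][1] == "associated" else 0
            i += 1
        signal.append(state)
    return signal


def count_transitions(signal):
    """Number of high/low flips; floods of de-authentication frames produce many."""
    return sum(a != b for a, b in zip(signal, signal[1:]))


# Hypothetical window starting at 10:00 (minute 600) with one short association.
window = [(601.0, "associated"), (603.0, "disassociated")]
signal = to_binary_signal(window, window_start=600, window_min=10, step_s=60)
flips = count_transitions(signal)
```

Plotting `signal` as a step curve and saving one figure per device and window would give images comparable in spirit to Figures 18 and 19; a normal window shows few transitions, while an attack window shows many.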

Figure 19. Sample dataset after conversion to images (for attack data).

Attack Detection Module
The transformed images are fed into the designed model to discover the faulty behaviors (attacks) hidden in the analyzed log dataset. Our TL and CNN-based IDS architecture, depicted in Figure 20, involves preprocessing the dataset to suit CNN-based deep learning models, using models trained on the source dataset (ImageNet), and using TL to transfer the learned knowledge to the target dataset, the Wi-Fi Association_Disassociation dataset. The CNN-based deep learning models are trained with the transformed, labeled log data to establish an efficient model. Attack detection is mainly based on the window size of the association/disassociation logs: an attack is detected when there is a large number of digital signal transitions in an image (one window). With this approach, identifiers such as the MAC address, the time in minutes per day, the duration that a given device remains high (associated) and low (disassociated), and the number of occurrences of these signals (high/association and low/disassociation) are used in the detection process. In the attack detection module, four different classification models (VGG16, Inception V3, ResNet50, and Xception) are trained, evaluated, and compared to obtain the best-performing model. All four models are fine-tuned with different hyperparameters and trained with 5 and 10 min window size images during experimentation to obtain the best model.
The hyperparameters of the CNN models are tuned and optimized in order to better fit the base models to the specified datasets and increase the models' performance. We performed several rounds of hyperparameter tuning to evaluate the models and achieve optimal performance, after which the final IDS model for the detection of de-authentication and disassociation attacks was chosen. Finally, the detection performance of the proposed model was compared with both traditional machine learning models and deep learning models.
Figure 20. Architecture for data pre-processing, and de-authentication/disassociation attack detection using the proposed IDS model.

Results and Analysis
In this section, we summarize the key results of the proposed intrusion detection model, developed based on TL and CNN deep learning models, for detecting de-authentication and disassociation attacks in IoT Wi-Fi networks. Each step of the experiment, from launching attacks, collecting Wi-Fi network traffic, parsing, analysis, and visualization to generating structured datasets, preprocessing, and attack detection, was conducted and assessed. We analyzed the overall performance of the proposed model using a variety of analytic scenarios with different measurement indicators, including accuracy, precision, recall, receiver operating characteristic (ROC), area under the ROC curve (AUC), and F1 score. We also focused on optimizing the hyperparameters for better performance, and conducted a comparative analysis of our proposed model against TML and DL models that are widely used in IDS implementations according to our survey of state-of-the-art solutions.

Experimental Setup
The proposed end-to-end TL and CNN-based IDS testbed comprises an AP running OpenWrt, a number of genuine clients, an attacker, and the overall IDS infrastructure, as seen in Figure 6. To conduct de-authentication and disassociation attacks, the attacker computer is equipped with Kali Linux, including the Aircrack-ng suite, and an ESP8266 NodeMCU Deauther. The attacker's main goal is to flood the target client(s) with a huge number of de-authentication and disassociation frames, causing the client(s) to disconnect. The experiments were conducted using the scikit-learn and TensorFlow/Keras libraries in Python. In the experiments, the proposed DL model and the models implemented for the comparative analysis were trained on a Dell XPS 15 9510 with the specifications listed in Table 6.

Training the Proposed Model
The Wi-Fi Association_Disassociation dataset was divided into three subsets: 70% for training, 15% for validation, and 15% for testing, with the actual number of samples in each shown in Table 7 for both window sizes (5 and 10 min). We built the model by adjusting the weights of the neural network using the training set. The validation set was used to fine-tune the experiment parameters, such as the number of hidden layers in the proposed model. The test set was used to estimate the model's accuracy or performance. In this work, the final dense layer uses softmax to categorize the incoming data as normal or attack traffic. We utilized the ReLU activation function in the other layers, along with the Adam optimizer and categorical cross-entropy as the loss function. Finally, as described in the Results section, numerous performance indicators were utilized to evaluate the overall performance of the selected models. Several TML models are widely applied to IDSs and have shown good performance; some of them are surveyed as part of the state-of-the-art solutions in Section 3. Considering their extensive application in the IDS domain, we compared our work with the TML models random forest (RF), decision tree (DT), support vector machine (SVM), and XGBoost for classification analysis.
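A 70/15/15 split of this kind can be sketched in a few lines of Python. The helper below is illustrative only: the function name, the fixed seed, and the placeholder file names are assumptions, and it performs a simple shuffled split without stratifying by class, which may differ from the procedure actually used to produce Table 7.

```python
import random


def split_dataset(items, train=0.70, val=0.15, seed=42):
    """Shuffle items reproducibly and cut them into train/validation/test subsets."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = round(len(items) * train)
    n_val = round(len(items) * val)
    train_set = items[:n_train]
    val_set = items[n_train:n_train + n_val]
    test_set = items[n_train + n_val:]  # the remainder, about 15%
    return train_set, val_set, test_set


# Placeholder image file names, for illustration only.
images = [f"img_{i:03d}.png" for i in range(100)]
train_set, val_set, test_set = split_dataset(images)
```

Keeping the test subset disjoint from the training subset is what allows the reported accuracy to reflect performance on unseen windows.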

Hyperparameter Tuning
Hyperparameter tuning, in this context, is the process of optimizing the hyperparameters of a model that was pre-trained on a large dataset and is used as the starting point for a new but related task with a limited dataset. The hyperparameter values directly control the behavior of the model, and picking appropriate values is critical to the success of neural network design. However, determining appropriate hyperparameter values still depends largely on best practice and human expertise.
CNN models, like other DL models, include a large number of hyperparameters that must be tuned. Throughout the model design process of the proposed solution, a number of hyperparameters, such as the number of frozen layers, learning rate, and dropout rate, were tuned. Batch size, number of epochs, and early stop patience are among the hyperparameters tuned during model training to balance training speed and model performance. Moreover, the hidden layer activation function, output layer activation function, loss function, and optimizer were selected to be ReLU, softmax, categorical cross-entropy, and Adam, respectively. Some of the common hyperparameters we fine-tuned are listed in Table 8. The combination of hyperparameter values that gave the best performance during evaluation was used for the final IDS model for the detection of de-authentication and disassociation attacks. In addition, we carried out a general comparison of the models with and without fine-tuning.
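One common way to organize such tuning is an exhaustive grid search over candidate values. The sketch below is illustrative only: the candidate values for `frozen_layers`, `learning_rate`, and `dropout_rate` are placeholders rather than the values in Table 8 (the batch sizes 16, 32, and 48 are the ones evaluated in this work), and `toy_score` is a hypothetical stand-in for training a model and returning its validation accuracy. It does not reproduce any result reported here.

```python
from itertools import product

# Illustrative search space; only the batch sizes come from the paper.
SEARCH_SPACE = {
    "frozen_layers": [0, 2, 4],
    "learning_rate": [1e-3, 1e-4],
    "dropout_rate": [0.2, 0.5],
    "batch_size": [16, 32, 48],
}


def expand_grid(space):
    """Yield one dict per combination of hyperparameter values."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def pick_best(space, validation_score):
    """Return the configuration with the highest validation score.

    `validation_score` is a caller-supplied function that would train a
    model with the given configuration and return its validation accuracy.
    """
    return max(expand_grid(space), key=validation_score)


def toy_score(config):
    # Stand-in for real training: rewards closeness to an arbitrary target.
    lr_gap = abs(config["learning_rate"] - 1e-4)
    batch_gap = abs(config["batch_size"] - 32) / 100
    return -lr_gap - batch_gap


best = pick_best(SEARCH_SPACE, toy_score)
print(best)
```

In practice `toy_score` would be replaced by a function that builds, trains, and validates the model, which makes the cost of the search proportional to the number of grid points (36 in this sketch).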

Performance Evaluation
Evaluation measures the performance of the model, and accuracy, precision, recall, and F1-measure are widely used for this purpose. These metrics are used to evaluate the proposed solution, and they are constructed from the counts of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). The performance of classification methods is determined not only by the technique used but also by how the training and testing data are split. Various previous studies showed that using 70% of the input data for training provided the best performance results. To create a balanced dataset, we used the split of 70% for training, 15% for validation, and 15% for testing discussed in Section 5.2. To obtain a realistic detection rate, we used attack records from the testing set that were not included in the training set. Table 9 shows the overall accuracy of the four CNN models, with the fine-tuned hyperparameters, for both window sizes (5 and 10 min) and for batch sizes of 16, 32, and 48.
Since the batch size of 32 and the window size of 10 min provided the best overall performance, the models' performance on the other evaluation metrics (precision, recall, F1-score, and ROC area) is shown in Table 10 for this configuration. The ROC/AUC result is shown in Figure 21. The false negative rate (FNR) is another metric, which measures the proportion of positive samples that are incorrectly classified as negative. FNR is calculated using Equation (1), and our model achieves a low FNR of 0.002.
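As a brief illustration of how accuracy, precision, recall, and F1-measure follow from the TP, TN, FP, and FN counts, the sketch below computes them with the standard definitions. The function name and the example counts are hypothetical and are not taken from the experiments.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1 from confusion counts."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one sample is required")

    accuracy = (tp + tn) / total
    # Guard against empty denominators when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=90, tn=95, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Here the attack class is treated as the positive class, so a false negative corresponds to an attack that was classified as normal traffic.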

$$\mathrm{FNR} = \frac{FN}{FN + TP} \qquad (1)$$

Figure 22 shows the training vs. validation accuracy and the training vs. validation loss. Validation accuracy is a model's accuracy on new data, whereas training accuracy is a model's accuracy on the data it was trained on. Because the model has never seen the validation data before, validation accuracy is often lower than training accuracy. The training loss measures how well the model fits the training data, whereas the validation loss measures how well the model fits new data.
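The validation loss is also the quantity typically monitored by early stopping, for which a patience value was among the tuned hyperparameters. The following is a minimal sketch of that rule, assuming the common convention that training stops once the validation loss has failed to improve for `patience` consecutive epochs; the function name, the `min_delta` parameter, and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the 0-based epoch at which training would stop, or None.

    Training stops once the validation loss has not improved on the best
    value seen so far by more than `min_delta` for `patience` epochs in a row.
    """
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best = loss   # new best validation loss: reset the counter
            waited = 0
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return None  # the run finished without triggering early stopping


# Hypothetical validation losses, for illustration only.
losses = [0.90, 0.70, 0.60, 0.61, 0.62, 0.63, 0.50]
print(early_stopping_epoch(losses, patience=3))
```

A larger patience tolerates longer plateaus in the validation curve at the cost of extra training epochs.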

Comparison with State-of-the-Art Models
The rapid expansion of IoT Wi-Fi networks has created several opportunities for hackers, which can result in serious loss of sensitive personal information. This study presents a transfer and deep learning-based model for IoT Wi-Fi network intrusion detection, with the main focus on de-authentication and disassociation DoS attacks. The suggested approach leverages a combination of transfer and deep learning approaches to achieve better classification performance. The suggested approach's performance was evaluated by extensive tests on a local testbed-generated dataset. During the comparative analysis, the hyperparameters of each TML model were tuned to achieve better performance. The criterion, max_features, min_samples_leaf, and min_samples_split hyperparameters are among those commonly tuned in the RF and DT models. For XGBoost, n_estimators, gamma, max_depth, max_leaves, and min_child_weight, among other hyperparameters, were tuned. While analyzing the SVM, its parameters including x, y, and z were tuned. Table 11 shows the comparative analysis results of the selected TML models and the proposed solution. The results of the comparison are also illustrated in Figure 23.

Several variants of DL models have been implemented to address intrusion detection classification problems. We consider LSTM and RNN models in our comparison analysis because of their wide application and strong performance; their results are shown in Table 12 and illustrated in Figure 24.

Discussion
The results and analysis show that our proposed IDS model offers better performance than its predecessors. In addition, the overall framework provides an end-to-end implementation, including a local testbed setup to launch attacks and collect Wi-Fi traffic data, parse and store the data, visualize and analyze it, and generate the Wi-Fi Association_Disassociation dataset for further preprocessing in the IDS process. In the previous section, we demonstrated how our IDS model was able to detect de-authentication and disassociation attacks in IoT Wi-Fi networks and compared it, using our dataset, to existing IDS solutions that adapted both TML and DL models. The findings from the results of the proposed model and the comparative analysis indicate that our model can effectively detect de-authentication and disassociation DoS attacks in Wi-Fi networks, which improves the overall security of such networks. However, our model was trained and evaluated using our Wi-Fi Association_Disassociation dataset and not on public datasets. Although an extensive comparative analysis of all the TML and DL models was conducted with our Wi-Fi Association_Disassociation dataset, and the models showed better performance on it than on the public datasets they were originally trained on, evaluating our model on at least one public dataset could provide additional analytic perspectives.
Our model has several potential uses that can be further expanded for more complex scenarios. First, because of the targeted use case, namely the illegal de-authentication and disassociation of clients, we focused the collection and preparation of the Wi-Fi Association_Disassociation dataset on the authentication/association and de-authentication/disassociation processes; evaluating the model with multiple public datasets and more attack types would make it applicable to more complex scenarios. Second, our testbed and the entire process considered a local setup, but this can be expanded to cover the cloud, which requires additional security measures including data anonymization. Third, exploring options to deploy a trained model on the access point or at the edge of the IoT Wi-Fi network layer could reduce the detection time. Last but not least, due to the dynamic and heterogeneous nature of IoT environments, training the model on online network traffic data with methods such as incremental learning could help the model adapt to changing traffic.

Conclusions
The rapid expansion of IoT Wi-Fi networks has created several opportunities for hackers, which can result in serious loss of sensitive personal information. This study presents a transfer and deep learning-based model for IoT Wi-Fi network intrusion detection, with the main focus on de-authentication and disassociation DoS attacks. The suggested approach leverages a combination of transfer and deep learning approaches to achieve better classification performance. The suggested approach's performance was evaluated by extensive tests on a local testbed-generated dataset.
The experimental findings reveal that the suggested model outperforms existing models. The suggested technique can also identify threats in a binary (normal vs. attack) classification in a shorter time. We evaluated the suggested model with several hyperparameter settings, and the outcomes indicate that the best results were achieved with the Adam optimizer, a learning rate of 0.0001, a batch size of 32, and a window size of 10 min. In this paper, we employed a train-test split approach to evaluate the suggested system's performance. According to the experimental data, the proposed model achieved an overall accuracy of 99.36% for the binary classification, with a low false negative rate of 0.002. These results imply that our model can effectively detect targeted attacks in all Wi-Fi network environments, leading to overall improved network security.
Funding: This research has been supported by the BT Ireland Innovation Centre (BTIIC) project, funded by BT, and Invest Northern Ireland.

Data Availability Statement:
The data reported here were captured using the testbed setup described in the study. The dataset is publicly available on GitHub [51] and is referenced in this paper. The details of the tools used in the experimental setup have also been referenced within the paper.