IP Spoofing Detection Using Deep Learning

Çekiş, İsmet Kaan; Ayrancı, Buğra; Salman, Fezayim Numan; Özçelik, İlker

doi:10.3390/app15179508

¹ Department of Computer Engineering, Faculty of Engineering and Architecture, Eskisehir Osmangazi University, Eskişehir 26480, Türkiye

² Department of Software Engineering, Faculty of Engineering and Architecture, Eskisehir Osmangazi University, Eskişehir 26480, Türkiye

* Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(17), 9508; https://doi.org/10.3390/app15179508

Submission received: 28 July 2025 / Revised: 24 August 2025 / Accepted: 26 August 2025 / Published: 29 August 2025

(This article belongs to the Section Computing and Artificial Intelligence)

Abstract

IP spoofing is a critical component in many cyberattacks, enabling attackers to evade detection and conceal their identities. This study rigorously compares eight deep learning models (LSTM, GRU, CNN, MLP, DNN, RNN, ResNet1D, and xLSTM) for their efficacy in detecting IP spoofing attacks. Overfitting was mitigated through techniques such as dropout, early stopping, and normalization. Models were trained using binary cross-entropy loss and the Adam optimizer. Performance was assessed via accuracy, precision, recall, F1 score, and inference time, with each model executed 15 times to account for stochastic variability. Results indicate strong performance across all models, with LSTM and GRU demonstrating superior detection efficacy. After ONNX conversion, the MLP and DNN models retained their performance while achieving significant reductions in inference time, smaller model sizes, and platform independence. These advancements facilitate the effective use of the developed systems in real-time network security applications. The comprehensive performance metrics presented are crucial for selecting optimal IP spoofing detection strategies tailored to diverse application requirements, serving as a valuable reference for network anomaly monitoring and targeted attack detection.

Keywords: IP spoofing; deep learning; ONNX; CIC-DDoS2019

1. Introduction

In today’s rapidly advancing internet landscape, infrastructures are continually evolving and becoming increasingly complex. A report by Cisco for the period 2018–2023 indicates that the average number of internet-connected devices per capita increased from 2.4 to 3.6. Based on this data, it was inferred that the total number of internet-connected devices would rise from 18.4 billion to 29.3 billion. The report also observed that the global average network bandwidth more than doubled, from 45.9 Mbps in 2018 to 110.4 Mbps in 2023. This combined increase in connected devices and network bandwidth escalated the volume and complexity of network traffic, making the detection of cybersecurity threats more challenging [1]. Detecting attacks in increasingly dense internet traffic is becoming more difficult. In particular, IP spoofing-based attacks are hidden within heavy traffic. Because IP spoofing attacks use standard protocol structures and resemble legitimate traffic patterns, they are not readily detected by traditional network security systems.

IP spoofing attacks constitute a special threat to network security. IP spoofing is a method where attackers create a false identity by altering the source IP addresses of packets within the network traffic. Through this technique, attackers can communicate with the target systems by appearing as legitimate users. IP spoofing is frequently used effectively in Distributed Denial of Service (DDoS) attacks, especially in reflection and amplification-based attacks such as NTP amplification and DNS amplification. In these attacks, a small request packet is sent using a spoofed source IP to servers that generate a much larger response, thereby rendering the target system unable to provide service under heavy traffic [2]. The fact that the source addresses of packets are not verified by the IP protocol makes it possible to conduct such attacks. Therefore, the detection of IP spoofing attacks is critical for the security and integrity of networks.

According to Cloudflare’s Q1 2025 DDoS report, IP spoofing techniques (SYN flood, DNS amplification, and UDP reflection) were actively used in approximately 80% of DDoS attacks [3]. Attackers evaded detection systems by concealing the traffic source. Furthermore, the SMap study indicates that approximately 69.8% of autonomous systems on the internet still do not filter spoofed IP packets, meaning they are vulnerable to spoofing-based attack generation [4].

Signature-based detection methods are knowledge-based approaches that examine network traffic to recognize known attack patterns; they can only detect threats that match existing signatures [5]. Consequently, these methods require continuous signature updates to capture unknown or previously unobserved attack techniques. In dynamic and high-volume network traffic conditions, this need for updates reduces system effectiveness. It also complicates the detection of spoofed packets that conform to protocol standards but are maliciously crafted. Given these limitations, the detection of sophisticated attacks like IP spoofing requires more flexible, learnable, and scalable methods.

Anomaly-based detection approaches issue alerts when the monitored system or network traffic deviates from the patterns defined as normal. This allows for the detection of previously unencountered attacks. Within the last decade, there has been an increase in the use of machine and deep learning-based methods in anomaly detection studies, with Razzaq and Shah (2023) finding that the intersection of cybersecurity and these advanced computational techniques has experienced significant growth and global collaboration from 2016 to 2025 [6]. Within the scope of this study, the detection of IP spoofing attacks using different deep learning methods has been examined in detail.

For deep learning models developed to operate efficiently in real-time systems, low latency and platform independence have become crucial in addition to high accuracy. In this context, Open Neural Network Exchange (ONNX), an open-source model representation format, enables models to be easily integrated into production environments by providing interoperability among different deep learning libraries. The ONNX format allows models to run on CPUs, GPUs, or edge devices, independent of frameworks such as TensorFlow and PyTorch, offering significant advantages in terms of inference time. Additionally, optimizations applied during ONNX conversion result in substantial reductions in model size, providing further benefits in terms of memory usage. The ONNX-supported deep learning-based IP spoofing detection models used in this study were converted to the ONNX format, and their performance was evaluated in this universal execution environment, thereby analyzing not only their accuracy but also their production readiness. To the best of our knowledge, there is no comparative study in the literature on the detection of IP spoofing attacks using deep learning methods. The outputs of this study are expected to contribute to efforts in detecting the many attacks that use the IP spoofing method.

The remainder of this article is organized as follows. Section 2 reviews existing literature on IP spoofing detection. Section 3 classifies the types of spoofing attacks and their corresponding characteristics within network traffic. Section 4 presents the deep learning methodologies employed and the rationale for their selection. Section 5 details the experimental setup, including the datasets utilized and the feature selection process. Section 6 introduces the performance metrics applied for evaluation. Section 7 discusses the machine learning models, their architectural specifics, and the training procedures. Section 8 evaluates the performance of these models. Finally, Section 9 discusses the obtained findings and offers recommendations for future research.

2. Literature Review

The existing literature contains a comparatively limited number of dedicated studies and datasets specifically addressing IP spoofing detection. Consequently, this section reviews detection studies relevant to IP spoofing, primarily referencing research where its role is emphasized within the broader context of Distributed Denial of Service (DDoS) attack detection.

To counter IP spoofing attacks, Haining Wang et al. (2007) [7] introduced the Hop Count Filtering (HCF) technique. This method involves calculating the actual hop count from the Time to Live (TTL) value of an incoming packet and comparing it to the expected hop count associated with its source IP address. This comparison serves to identify spoofed IP packets. Wang et al. reported that the HCF technique achieves a 90% accuracy rate, while also noting the possibility of minor deviations. This suggests that despite its high technical accuracy, an increase in network complexity may lead to false positives.
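The hop-count comparison described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation from [7]: the list of candidate initial TTL values, the function names, the `hop_table` mapping, and the example address are all introduced here for illustration.

```python
# Candidate initial TTL values (an assumption for this sketch; a real
# deployment would tune this list to the operating systems it expects).
INITIAL_TTLS = (32, 64, 128, 255)


def infer_hop_count(observed_ttl: int) -> int:
    """Estimate hops travelled as (smallest initial TTL >= observed) - observed."""
    if not 0 <= observed_ttl <= 255:
        raise ValueError("TTL must be in the range 0..255")
    for initial in INITIAL_TTLS:
        if observed_ttl <= initial:
            return initial - observed_ttl
    raise ValueError("unreachable for TTL values in 0..255")


def is_spoofed_candidate(src_ip: str, observed_ttl: int,
                         hop_table: dict, tolerance: int = 0) -> bool:
    """Flag a packet whose inferred hop count disagrees with the stored one.

    hop_table is a hypothetical mapping from source IP to the hop count
    learned earlier from traffic assumed to be legitimate.
    """
    expected = hop_table.get(src_ip)
    if expected is None:
        return False  # no baseline for this source, so no verdict
    return abs(infer_hop_count(observed_ttl) - expected) > tolerance


hop_table = {"192.0.2.10": 12}  # documentation-range address, illustrative
print(is_spoofed_candidate("192.0.2.10", 52, hop_table))   # inferred 12 hops
print(is_spoofed_candidate("192.0.2.10", 120, hop_table))  # inferred 8 hops
```

The `tolerance` parameter is one way to absorb the minor deviations mentioned above, at the cost of letting some spoofed packets through.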

R.A. Sowah et al. (2019) [8] developed an Artificial Neural Network (ANN) based method to detect and prevent Man-in-the-Middle (MITM) spoofing attacks within Mobile Ad-hoc Networks (MANETs). Their experiments, utilizing a 5-node network architecture, successfully detected MITM attacks and enabled the identification of spoofed IP addresses. This research highlighted the efficacy of ANN algorithms for IP spoofing detection and demonstrated an 88.23% accuracy rate for the proposed method. Nevertheless, these findings are subject to certain limitations stemming from the dynamic characteristics of MANETs and inherent network variability.

Heena Kousar et al. (2021) [9] devised a solution using the Apache Spark platform to address DDoS attacks, encompassing those that incorporate IP spoofing. Their research demonstrated that the Random Forest algorithm outperformed the Decision Tree algorithm, and that distributed processing offered substantial benefits regarding preprocessing and training duration. This method achieved an efficacy of 90.86% accuracy. While this study did not directly focus on IP spoofing detection, the identified DDoS attack types included those that involved IP spoofing.

IM-Shield, a system proposed by Hua Wu et al. (2022) [10], was designed to defend against DDoS attacks that utilize IP spoofing. This system verifies the authentic source identity of network packets by analyzing pairings of router interface MAC addresses and destination IP addresses. IM-Shield detects and filters DDoS attacks without necessitating modifications to existing network protocols. Experimental results demonstrated a 99.9% accuracy rate.

To detect and prevent IP spoofing-based DDoS attacks, Varsha Parekh and Saravanan M (2022) [11] formulated a hybrid strategy integrating distance calculation with machine learning-based methods. Evaluations on the CAIDA 2007 dataset indicated that this hybrid approach achieved the highest performance, with an accuracy rate of 99.86%. This strategy not only facilitates protection against attacks via a Snort-based prevention mechanism but also yields superior outcomes in metrics like Precision, Recall, and F1-Score when compared to alternative methods.

K.A. Dhanya et al. (2023) [12] proposed machine learning and deep learning-based models for the detection of network attacks. In experiments performed on the UNSW-NB15 dataset, the Decision Tree algorithm achieved the highest performance, with an accuracy of 99.05%. While this study did not directly target IP spoofing detection, it indirectly classified the use of spoofed IP addresses in network traffic.

Sharmistha Majumder, Mrinal Kanti Deb Barma, and Ashim Saha (2025) [13] developed a dynamic machine learning-based anomaly detection approach for the real-time detection of ARP spoofing-based Man-in-the-Middle (MITM) attacks. This method relies on the continuous verification of IP and MAC addresses and their cross-validation with gateway information. Experimental results demonstrated an F1-Score of 99.26%. This system safeguards network integrity by identifying ARP spoofing attacks, potentially stemming from IP spoofing, through the detection of fraudulent IP-MAC address pairings.

While IP spoofing is often addressed in the literature indirectly in conjunction with other malicious activities, this study uniquely defines it as the primary attack for detection. Accordingly, eight distinct deep learning models were trained to detect IP spoofing, and their performances were comprehensively compared. The findings of this work are expected to serve as a benchmark for future research aimed at developing cyber threat detection systems against attacks employing IP spoofing techniques.

3. Spoofing Attacks

Spoofing refers to the act of impersonating another person or computer system by providing false information (e.g., an email name, URL, or IP address). In the domain of information technology, spoofing manifests in various forms, all of which involve some type of misinformation intended to deceive users. These methods involve the misrepresentation of information in diverse manners, which consequently leads to diverse types of fraudulent activities. Spoofing attacks in which the IP address is directly utilized during the fraudulent process include the following:

IP Spoofing
ARP Spoofing
DNS Spoofing

3.1. IP Spoofing

In computer networks, IP address spoofing (or IP spoofing) involves the creation of Internet Protocol (IP) packets with a falsified source address. This technique aims to either conceal the sender’s identity or impersonate another computer system. Network attackers also employ IP spoofing to circumvent security mechanisms such as IP address-based authentication. The efficacy of such attacks is notably heightened when trust relationships exist between machines. Router behavior amplifies the network’s susceptibility to IP spoofing: routers typically inspect only the destination address for forwarding purposes, whereas authentication may rely on the source address, and modifying the source address field in an IP packet’s header is straightforward [14].

3.2. ARP Spoofing

The Address Resolution Protocol (ARP) facilitates the mapping between IP addresses and Media Access Control (MAC) addresses, while maintaining these associations in an ARP cache. If a packet is to be delivered to an IP address within the local network segment, ARP queries its cache for the associated MAC address. In the event this mapping is absent, ARP issues a broadcast request across the network. This mechanism is susceptible to deception through falsified ARP responses. ARP spoofing entails the generation of counterfeit ARP requests or replies, thereby misdirecting a target host’s traffic to an unauthorized machine. This malicious activity is commonly termed “ARP poisoning.” The implementation of MAC binding or static ARP tables can serve as countermeasures against such attacks; however, these solutions are often impractical in large-scale, dynamic network environments. Conversely, tools such as ARPWATCH monitor modifications to the ARP cache and alert administrators to potential anomalies [15].

3.3. DNS Spoofing

DNS spoofing is the manipulation of Domain Name System (DNS) records to redirect traffic to an IP address different from the legitimate one. This technique can lead to the compromise of a server’s identity by rerouting its DNS record to an unauthorized IP address. With modern BIND (Berkeley Internet Name Domain) services, successfully executing such an attack typically necessitates infiltrating the server or its underlying network infrastructure (e.g., routers, switches), a task that presents considerable difficulty. Despite all these difficulties, DNS attacks are quite common [14].

4. Selected Deep Learning Models

In this study, eight deep learning models, each with distinct architectural approaches, were employed for the detection of IP spoofing. Each model was evaluated based on its specific structural features and learning capacity and was implemented with strategies designed to prevent overfitting and enhance accuracy. This section briefly outlines the fundamental structures of the utilized models and the rationale behind their selection.

RNN (Recurrent Neural Network): RNN is a classical recurrent neural network architecture designed to work with sequential data. Although effective in learning short-term dependencies, it is not as successful as advanced architectures like LSTM and GRU in learning long-term dependencies. In this study, the RNN was included for evaluation due to its fundamental and established capability in sequential data modeling.

LSTM (Long Short-Term Memory): LSTM is a type of Recurrent Neural Network (RNN) distinguished by its capacity to learn long-term dependencies in sequential data. In this research, LSTM was utilized to capture temporal dependencies within network traffic, applied for IP spoofing detection by interpreting sequential relationships in data flows [16].

xLSTM (Extended Long Short-Term Memory): An advanced variant of the LSTM architecture, xLSTM may incorporate deeper structures and refined cell architectures. This model aims to learn long-term dependencies more robustly but exhibits a higher computational cost. In this study, it was evaluated as an extended version of LSTM.

GRU (Gated Recurrent Unit): Like LSTM in its effectiveness with time-dependent data, the GRU reduces training duration and offers a lighter-weight architecture due to its fewer parameters. In this study, it was evaluated as an alternative time series model to LSTM [17].

MLP (Multi-Layer Perceptron): MLP is a type of feedforward neural network designed to transform input data into appropriate output representations using a layered architecture of interconnected neurons. In this study, a four-layer MLP was employed and configured for non-sequential data, with its generalization performance enhanced by the application of regularization techniques to mitigate the risk of overfitting [18].

DNN (Deep Neural Network): The DNN model was employed as an extended and more regularized version of the conventional MLP architecture. It incorporated multiple hidden layers augmented with techniques such as dropout and batch normalization, aiming for a balance between accuracy and generalization performance during the learning process. The DNN model was assessed for its capacity, attributable to its deep architecture, to learn complex patterns representing IP spoofing [19].

CNN (Convolutional Neural Network): Predominantly utilized in image processing, CNNs were employed in this study in a one-dimensional configuration (1D-CNN). The objective was to extract local patterns from data flows and develop filters capable of automatically learning discriminative features indicative of spoofed traffic [20].

ResNet1D (Residual Neural Network 1D): This model features deeper layers than classical deep neural networks and facilitates training while preventing overfitting through skip connections between layers. In this study, a one-dimensional (1D) architecture was employed to enable more effective learning of discriminative features in network traffic.

5. Experiment Setup

Existing studies in the literature, despite employing machine learning and deep learning methods, often do not sufficiently address the critical issues of data imbalance and overfitting. This study addresses these issues by utilizing a balanced dataset for both training and testing, which was obtained through down-sampling of the original data.

The dataset used is the CIC-DDoS2019, developed by the Canadian Institute for Cybersecurity (CIC), Fredericton, NB, Canada. This dataset, compiled in a laboratory setting to emulate real-world attack scenarios, encompasses contemporary Distributed Denial of Service (DDoS) attack types. It also includes both benign and various malicious traffic types. For this study, reflection attacks were specifically identified as instances of IP spoofing, thereby framing the detection task as a binary classification problem [21]. The dataset comprises two days of data featuring thirteen distinct attacks, of which only the reflection attacks were selected for analysis. A reflection attack is a cyber-attack that leverages responses from a third-party server to flood a victim with traffic. As attackers send requests to a server using a forged source IP address that belongs to the victim, these attacks can be categorized as a form of IP spoofing.

To prevent data leakage and ensure that traffic from the same attack episode does not appear in both the training and testing sets, a time-based split was employed. Specifically, attacks from the second day (TFTP, LDAP, MSSQL, NetBIOS, NTP, SNMP, SSDP, DNS) were allocated for the training set, while attacks from the first day (LDAP, MSSQL, NetBIOS, Portmap) were reserved for the testing set. This methodology ensures that no duplicate flow entries exist between the training and testing data.
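A day-based split of this kind can be illustrated with a few lines of code. The sketch below is only an assumption about how such a split might be expressed: the record layout and the `flow_id`, `day`, and `attack` field names are hypothetical and are not the dataset's column names.

```python
# Hypothetical flow records; the field names are illustrative only.
flows = [
    {"flow_id": "a1", "day": 1, "attack": "LDAP"},
    {"flow_id": "a2", "day": 1, "attack": "Portmap"},
    {"flow_id": "b1", "day": 2, "attack": "NTP"},
    {"flow_id": "b2", "day": 2, "attack": "LDAP"},
    {"flow_id": "b3", "day": 2, "attack": "TFTP"},
]


def split_by_day(records, train_day=2, test_day=1):
    """Assign whole capture days to training or testing, never both."""
    train = [r for r in records if r["day"] == train_day]
    test = [r for r in records if r["day"] == test_day]
    return train, test


def shared_flow_ids(train, test):
    """Return flow identifiers present in both sets (should be empty)."""
    return {r["flow_id"] for r in train} & {r["flow_id"] for r in test}


train_set, test_set = split_by_day(flows)
print(len(train_set), len(test_set), shared_flow_ids(train_set, test_set))
```

Splitting on the capture day rather than shuffling individual flows is what keeps traffic from one attack episode on a single side of the split.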

To enhance model accuracy and generalization capacity, this study integrated deep neural network (DNN) architectures with established overfitting prevention techniques, such as dropout and early stopping. The feature selection phase employed the Chi-Square method to identify and select the most significant attributes for model training. Furthermore, going beyond approaches in the existing literature that focus solely on model accuracy, the ONNX-supported models developed in this study were converted to the ONNX (Open Neural Network Exchange) format and evaluated in terms of performance metrics critical for production environments, such as inference time and model size. The ONNX conversion enabled the models to run faster, use less memory, and be deployed platform-independently.

The dataset is structured on a flow basis. Each instance within the dataset corresponds to a specific network flow, and the analysis was conducted directly at this flow level. During the data preprocessing phase, records containing missing or infinite values were removed, non-numerical features were suitably converted, and the ‘Label’ column was binarized, representing normal traffic as ‘0’ and attack traffic as ‘1’. In the model training, a balanced training dataset made up of 50,609 benign and 50,609 spoofing flows was utilized to mitigate potential biases arising from the observed class imbalance in the dataset. During the feature engineering phase, a Chi-Square-based feature selection algorithm was employed to identify features anticipated to directly contribute to the classification task [22]. This statistical method highlighted features with high discriminative power among classes by assessing the independence between each feature and the target variable.
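The preprocessing steps in this paragraph (cleaning, label binarization, down-sampling, and Chi-Square scoring) can be sketched in plain Python as follows. This is a simplified illustration, not the authors' pipeline: the sample rows, the `BENIGN` label string, the attack label strings, and all function names are assumptions, and the Chi-Square score uses one common formulation that compares per-class feature sums with their expected values for non-negative features.

```python
import math
import random


def clean(rows, feature_names):
    """Drop rows whose features are missing or non-finite (NaN, inf)."""
    return [
        row for row in rows
        if all(row.get(n) is not None and math.isfinite(row[n])
               for n in feature_names)
    ]


def binarize(rows, benign_label="BENIGN"):
    """Map the label to 0 (normal) or 1 (attack)."""
    return [dict(row, Label=0 if row["Label"] == benign_label else 1)
            for row in rows]


def downsample(rows, seed=0):
    """Randomly keep an equal number of rows from each class."""
    rng = random.Random(seed)
    benign = [r for r in rows if r["Label"] == 0]
    attack = [r for r in rows if r["Label"] == 1]
    n = min(len(benign), len(attack))
    return rng.sample(benign, n) + rng.sample(attack, n)


def chi2_score(rows, feature):
    """Chi-square statistic of a non-negative feature against the label."""
    total = sum(r[feature] for r in rows)
    if total == 0:
        return 0.0
    score = 0.0
    for cls in (0, 1):
        members = [r for r in rows if r["Label"] == cls]
        observed = sum(r[feature] for r in members)
        expected = total * len(members) / len(rows)
        if expected > 0:
            score += (observed - expected) ** 2 / expected
    return score


# Synthetic rows for illustration only; not taken from the dataset.
FEATURES = ["Flow Duration", "Total Fwd Packets"]
rows = [
    {"Flow Duration": 10.0, "Total Fwd Packets": 2.0, "Label": "BENIGN"},
    {"Flow Duration": 12.0, "Total Fwd Packets": 3.0, "Label": "BENIGN"},
    {"Flow Duration": 11.0, "Total Fwd Packets": 4.0, "Label": "BENIGN"},
    {"Flow Duration": float("inf"), "Total Fwd Packets": 1.0, "Label": "BENIGN"},
    {"Flow Duration": 3.0, "Total Fwd Packets": 50.0, "Label": "DrDoS_LDAP"},
    {"Flow Duration": 4.0, "Total Fwd Packets": 60.0, "Label": "DrDoS_NTP"},
]

balanced = downsample(binarize(clean(rows, FEATURES)))
scores = {name: chi2_score(balanced, name) for name in FEATURES}
print(len(balanced), sorted(scores, key=scores.get, reverse=True))
```

Features would then be ranked by score and the top-ranked ones retained; how many to keep is a design choice the sketch leaves open.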

Following this selection process, two new features, designated ‘Asymmetry_Ratio’ and ‘OneWay,’ were derived and subsequently incorporated into the final feature set for IP spoofing detection. The final set of features employed for model training and testing is categorized and presented in Table 1. The overall data-processing pipeline is summarized in Figure 1.

The novel derived features, Asymmetry_Ratio and OneWay, have been identified as attributes capable of providing significant distinctions in the detection of IP spoofing.

Asymmetry_Ratio is defined as the ratio of the total number of forward (Fwd) packets to the sum of both forward and backward (Bwd) packets within a flow. This ratio is calculated as follows:

\mathrm{Asymmetry\_Ratio} = \frac{\text{Total Fwd Packets}}{\text{Total Fwd Packets} + \text{Total Bwd Packets} + \epsilon}

Here, ϵ represents a small positive number employed to prevent division-by-zero errors, ensuring the numerical stability of computations. This feature quantitatively expresses the directional imbalance within network traffic. During IP spoofing attacks, packets flow in a single direction (e.g., solely transmission) because a response from the target system is typically not received. In such instances, the Asymmetry_Ratio value approaches one, indicating a pronounced asymmetry.

OneWay is a binary derived feature that indicates whether a flow occurs in only one direction. If a flow contains exclusively forward or exclusively backward packets, meaning the opposing direction is entirely absent, this feature’s value becomes one; otherwise, it is zero. This feature is formulated as follows:

\mathrm{OneWay} = \begin{cases} 1, & \text{if Total Fwd Packets} = 0 \text{ or Total Bwd Packets} = 0 \\ 0, & \text{otherwise} \end{cases}

The observation of unidirectional traffic is considered a distinguishing indicator, particularly in attack scenarios involving the transmission of spoofed packets from which no response is received from the target system.
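Both derived features follow directly from the two definitions above and can be computed per flow as in the sketch below. The value of `EPSILON` and the function names are assumptions; the source does not state which value of ϵ was used.

```python
EPSILON = 1e-9  # assumed value; the text only says "a small positive number"


def asymmetry_ratio(fwd_packets: float, bwd_packets: float,
                    eps: float = EPSILON) -> float:
    """Share of forward packets among all packets in the flow."""
    return fwd_packets / (fwd_packets + bwd_packets + eps)


def one_way(fwd_packets: float, bwd_packets: float) -> int:
    """1 if either direction carries no packets, otherwise 0."""
    return 1 if fwd_packets == 0 or bwd_packets == 0 else 0


# A forward-only flow versus a balanced two-way flow.
print(asymmetry_ratio(10, 0), one_way(10, 0))
print(asymmetry_ratio(5, 5), one_way(5, 5))
```

For the forward-only flow the ratio is close to one and OneWay is 1, while the balanced flow yields a ratio close to 0.5 and OneWay of 0.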

6. Performance Metrics

In this study, four fundamental measurement parameters are used:

True Positive (TP): Represents the count of instances correctly identified by the model as positive (i.e., attack).

True Negative (TN): Represents the count of instances correctly identified by the model as negative (i.e., normal).

False Positive (FP): Denotes the count of instances that are negative (normal) but are erroneously classified by the model as positive (attack).

False Negative (FN): Denotes the count of instances that are positive (attack) but are erroneously classified by the model as negative (normal).

These four fundamental parameters serve as the basis for calculating the ensuing evaluation metrics:

6.1. Accuracy

This is the ratio of the samples correctly predicted by the model to the total count of samples.

\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}

(1)

6.2. Precision

It indicates the proportion of instances classified as positive that are actually positive.

\mathrm{Precision} = \frac{TP}{TP + FP}

(2)

6.3. Recall

Recall (Sensitivity), also referred to as the Detection Rate, measures the proportion of actual positive instances that are correctly identified.

\mathrm{Recall} = \frac{TP}{TP + FN}

(3)

6.4. F1-Score

It is the harmonic mean of the Precision and Recall values. It provides a more reliable success criterion, especially in imbalanced datasets.

\mathrm{F1\text{-}Score} = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}

(4)

A combined evaluation of these metrics allows for a detailed examination of attack detection performance, beyond relying solely on general accuracy. For IP spoofing detection, it is crucial to effectively manage both false negative and false positive rates.
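The four metrics in Equations (1) to (4) can be computed from the confusion-matrix counts as sketched below. The function name and the example counts are illustrative assumptions and are not results from this study; returning zero when a denominator is zero is a convention chosen for the sketch.

```python
def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute accuracy, precision, recall, and F1 from confusion counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up counts for illustration only.
metrics = classification_metrics(tp=90, tn=95, fp=5, fn=10)
print({name: round(value, 4) for name, value in metrics.items()})
```

With these made-up counts the accuracy is 0.925 and the recall is 0.9, which shows how a model can look strong on accuracy while still missing one attack in ten.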

7. Model Training and Performance Test

The model training process was conducted using a normalized dataset that had been previously prepared through feature selection and data cleansing procedures. The architectures of the employed models were designed, incorporating strategies to mitigate overfitting and enhance learning efficacy for each model.

In the training phase, all deep learning models were structured for binary classification, employing a sigmoid activation function in their output layers. The binary_crossentropy function was selected for loss calculation, and the Adam algorithm was utilized for optimization. Throughout training, validation performance was monitored, and an early stopping criterion halted training if no improvement was observed over four consecutive epochs. Furthermore, the learning rate was halved by the “ReduceLROnPlateau” callback when the validation loss stagnated. To further improve model performance and enhance generalization, Dropout layers were incorporated in all models. Batch Normalization was applied to select intermediate layers, and L2 regularization was employed to prevent the uncontrolled growth of weights. The specific application of these architectural enhancements for each model is detailed in Table 2.
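The interaction between early stopping and learning-rate halving can be illustrated with a framework-free simulation. The sketch below is a simplification and does not reproduce the exact behavior of the framework callbacks: the early-stopping patience of four epochs and the halving factor come from the text, whereas the initial learning rate, the plateau patience of two epochs, and the validation-loss sequence are assumptions made here.

```python
def simulate_schedule(val_losses, initial_lr=1e-3, stop_patience=4,
                      lr_patience=2, factor=0.5):
    """Replay validation losses; return (stop_epoch, final_lr, history)."""
    best = float("inf")
    epochs_since_best = 0   # drives early stopping
    lr_wait = 0             # drives learning-rate reduction
    lr = initial_lr
    history = []
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            epochs_since_best = 0
            lr_wait = 0
        else:
            epochs_since_best += 1
            lr_wait += 1
            if lr_wait >= lr_patience:
                lr *= factor      # halve the learning rate on a plateau
                lr_wait = 0
        history.append((epoch, lr))
        if epochs_since_best >= stop_patience:
            return epoch, lr, history  # stop: no improvement for 4 epochs
    return len(val_losses), lr, history


# Synthetic validation losses: improvement stalls after epoch 3.
losses = [0.50, 0.40, 0.35, 0.36, 0.37, 0.36, 0.38, 0.30]
stop_epoch, final_lr, _ = simulate_schedule(losses)
print(stop_epoch, final_lr)
```

In this replay the learning rate is halved twice before training stops at epoch 7, so the final loss value in the list is never reached.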

Figure 2 and Figure 3 for the ResNet1D model, and Figure 4 and Figure 5 for the CNN model, present the accuracy and loss curves obtained during their respective training processes. To save figure space, these visualizations were confined to the ResNet1D and CNN methods, as the other models yielded similar performance trends. Results pertaining to the other methods are detailed in Table 3. An examination of these graphs reveals that both models consistently achieved high levels of training and validation accuracy. Moreover, their training and validation loss curves exhibit closely aligned trends, decreasing in parallel from the early stages of training. This pattern indicates that the models generalize effectively with high accuracy and do not show a tendency towards overfitting. Notably, the absence of abrupt increases in the validation loss curves supports the conclusion that overfitting was avoided, underscoring the efficacy of the applied strategies such as early stopping, dropout, and L2 regularization.

In addition, small differences in performance metrics, such as accuracy, can be observed each time the models are run. This is due to the random initialization of weights during the training process, the stochastic nature of the optimization process, and the varying ways the dataset can be split into training and testing sets. For this reason, to obtain an average performance measurement, each model was run independently 15 times, and the average of the results was taken.
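Averaging over repeated runs can be sketched as follows. The `run_once` function is a placeholder that returns a synthetic score; in practice it would train and evaluate one model with the given seed. The score range, the seeds, and the function names are assumptions, and none of the numbers produced are results from this study.

```python
import random
import statistics


def run_once(seed: int) -> float:
    """Placeholder for one train/evaluate cycle; returns a synthetic accuracy."""
    rng = random.Random(seed)
    return 0.95 + rng.uniform(-0.01, 0.01)


def summarize(n_runs: int = 15):
    """Run n_runs independent repetitions and report mean and spread."""
    scores = [run_once(seed) for seed in range(n_runs)]
    return statistics.mean(scores), statistics.stdev(scores), scores


mean_acc, std_acc, all_scores = summarize()
print(f"runs={len(all_scores)} mean={mean_acc:.4f} std={std_acc:.4f}")
```

Reporting the standard deviation alongside the mean indicates whether a gap between two models exceeds their run-to-run variation.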

8. Results

A comprehensive summary of the performance of the tested deep learning models is presented in Table 3. The results indicate that the LSTM model achieved the highest overall performance among all models, demonstrating 99.22% accuracy, a 99.84% attack detection rate, and a 99.59% normal traffic detection rate. The MLP model emerged as one of the fastest models, with 98.98% accuracy and an average inference time of merely 0.5 µs, rendering it a highly suitable architecture for real-time applications. The DNN model also demonstrated high classification success and low latency, achieving 98.3% accuracy and a latency of 0.6 µs per sample.

Conversely, sequential architectures, namely xLSTM, LSTM, and GRU, exhibited slower inference. Although GRU achieved an accuracy rate of 99.1%, it was the slowest model with an average prediction time of 16.5 microseconds. While the xLSTM and LSTM models achieved comparable accuracy, the former’s inference time was significantly longer, at 16.2 microseconds compared to the latter’s 8.2 microseconds.

Conversion of the developed models to the ONNX (Open Neural Network Exchange) format enabled a reduction in inference time and model size, along with more efficient utilization of system resources. MLP, DNN, CNN, and ResNet1D models were successfully converted to ONNX, facilitating their integration into both server and edge devices. However, sequential architectures, such as LSTM, GRU, xLSTM, and RNN, were not supported in the ONNX conversion due to their inclusion of TensorFlow-specific custom GPU operators (e.g., CudnnRNN); consequently, these models could not be included in the quantization processes.

The results obtained after the conversion are presented in Table 4. According to these results, both the MLP and DNN models maintained high accuracy rates and demonstrated full compatibility with the quantization processes. For the MLP model, accuracy remained at 98.98%, while the model size was reduced from 0.186 MB to 0.02 MB, and the inference time decreased from 0.45 microseconds to 0.02 microseconds. Similarly, the DNN model preserved 98% accuracy, with its size reduced from 0.208 MB to 0.024 MB, and a significant reduction in inference time.
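The sizes and times quoted in this paragraph can be turned into reduction factors with a short calculation. The sketch uses only the figures stated above; the helper name is illustrative.

```python
def reduction_factor(before: float, after: float) -> float:
    """How many times smaller (or faster) the converted model is."""
    return before / after


mlp_size_factor = reduction_factor(0.186, 0.02)    # model size in MB
dnn_size_factor = reduction_factor(0.208, 0.024)   # model size in MB
mlp_time_factor = reduction_factor(0.45, 0.02)     # inference time in µs

print(f"MLP size: {mlp_size_factor:.1f}x smaller")
print(f"DNN size: {dnn_size_factor:.1f}x smaller")
print(f"MLP inference: {mlp_time_factor:.1f}x faster")
```

On these figures, both models shrink by roughly a factor of nine, and the MLP's per-sample inference time drops by a factor of about 22.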

In contrast, while the CNN and ResNet1D models were successfully converted to ONNX, quantization operations could not be applied. The primary reason for this limitation is that certain specialized layers within the architectures of both models are not yet fully supported by ONNX Runtime. Specifically, layer combinations used with Conv1D, BatchNormalization, and Residual blocks are incompatible with ONNX operators such as QLinearConv or ConvInteger, which are utilized during quantization. Due to these problems, both dynamic and static quantization processes resulted in compilation errors for the CNN and ResNet1D models. Consequently, only the original and structurally optimized (Processed) ONNX versions of these two models could be deployed. Although the CNN and ResNet1D models did not experience performance degradation after ONNX conversion, they lagged behind the MLP and DNN models in terms of inference time and model size.

To assess the model’s performance on individual attack types, the confusion matrices for the CNN model on the LDAP, MSSQL, NetBIOS, and Portmap attacks are presented in Figure 6, Figure 7, Figure 8 and Figure 9. These results indicate that the trained model is highly effective at detecting all these attacks, achieving a high detection rate with a very low false alarm rate. The per-attack confusion matrices for the remaining models are provided in the Supplementary Materials.
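
The detection rate and false alarm rate read from a confusion matrix follow from its four cells. The sketch below is a minimal illustration; the class name is an assumption, and the counts in the usage example are placeholders chosen for readability, not values taken from Figures 6 to 9.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    tp: int  # attack samples classified as attack
    fn: int  # attack samples classified as normal (missed attacks)
    fp: int  # normal samples classified as attack (false alarms)
    tn: int  # normal samples classified as normal

    def detection_rate(self) -> float:
        """Share of attack samples that were detected (recall)."""
        return self.tp / (self.tp + self.fn)

    def false_alarm_rate(self) -> float:
        """Share of normal samples that were wrongly flagged as attack."""
        return self.fp / (self.fp + self.tn)

    def normal_detection_rate(self) -> float:
        """Share of normal samples that were correctly passed."""
        return self.tn / (self.tn + self.fp)


if __name__ == "__main__":
    # Placeholder counts, NOT the values reported in the figures.
    cm = ConfusionMatrix(tp=990, fn=10, fp=2, tn=998)
    print(f"detection rate:   {cm.detection_rate():.3f}")
    print(f"false alarm rate: {cm.false_alarm_rate():.3f}")
```

The normal traffic detection rate is the complement of the false alarm rate, so a model with a very low false alarm rate necessarily passes almost all benign traffic.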

9. Discussion & Conclusions

IP spoofing is a method frequently employed by cybercriminals in contemporary network attacks. While existing network anomaly detection studies often treat IP spoofing as one of several attack symptoms, this study targets its effective and efficient detection directly, recognizing its role as a component of cyberattacks. To this end, we conducted a comparative evaluation of deep learning architectures for detecting IP spoofing attacks in network traffic. Eight models were assessed: Gated Recurrent Unit (GRU), Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN), Multilayer Perceptron (MLP), Deep Neural Network (DNN), one-dimensional Residual Network (ResNet1D), Recurrent Neural Network (RNN), and Extended Long Short-Term Memory (xLSTM). Their performance was evaluated using accuracy, precision, recall, F1 score, attack detection rate, AUC, and inference time per packet. Furthermore, the ONNX-compatible models were converted to the ONNX format, and the relationship between their detection performance, inference time, and model size was examined.

All the trained models performed strongly and at a comparable level. The LSTM model achieved the highest overall performance, with an accuracy and F1 score of 99.22%. The GRU model followed with 99.05% accuracy, and the CNN model achieved 99.02%. Regarding inference time, the MLP model recorded the shortest average per-sample inference time at 0.5 microseconds, followed by DNN (0.6 microseconds), ResNet1D (2.3 microseconds), and CNN (3.1 microseconds). In contrast, the GRU model exhibited the longest inference time at 16.5 microseconds, and the other sequential models, xLSTM (16.2 microseconds), LSTM (8.2 microseconds), and RNN (4.4 microseconds), were also comparatively slow. The normal traffic detection rate was 99.9% for the RNN model, 99.8% for GRU, 99.7% for xLSTM and MLP, and 99.5% for LSTM. For the attack detection rate, the LSTM and ResNet1D models achieved the highest value at 99%. From an application perspective, the CNN, MLP, ResNet1D, and DNN models are notable for combining high accuracy with short inference times and efficient use of system resources. This indicates that effective attack detection can be achieved without complex preprocessing steps, which makes these systems more applicable in real-time environments. Overall, the findings demonstrate that deep learning models can distinguish between spoofing-based attacks and normal traffic with high accuracy.
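
One way to read the accuracy and inference-time figures together is to keep only the models that no other model beats on both criteria at once (a Pareto front). The sketch below applies this to the Table 3 values. It is an illustrative reading rather than the selection procedure used in this study: it considers only two criteria and ignores run-to-run variance, and the function names are assumptions.

```python
# (accuracy, inference time in microseconds per sample), taken from Table 3.
MODELS = {
    "LSTM": (0.9922, 8.2),
    "GRU": (0.9905, 16.5),
    "CNN": (0.9902, 3.1),
    "MLP": (0.9898, 0.5),
    "RNN": (0.9886, 4.4),
    "ResNet1D": (0.9874, 2.3),
    "xLSTM": (0.9839, 16.2),
    "DNN": (0.9827, 0.6),
}


def dominates(a, b):
    """True if a is at least as accurate and as fast as b, and differs from b."""
    return a[0] >= b[0] and a[1] <= b[1] and a != b


def pareto_front(models):
    """Names of the models that are not dominated by any other model."""
    return [
        name
        for name, metrics in models.items()
        if not any(dominates(other, metrics) for other in models.values())
    ]


if __name__ == "__main__":
    print(pareto_front(MODELS))
```

Under these two criteria alone, LSTM (most accurate), CNN (intermediate), and MLP (fastest) remain on the front. Adding further criteria, such as model size or quantization support, would change which models are retained.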

As illustrated in Figure 6, Figure 7, Figure 8 and Figure 9, the trained model successfully detects all four spoofing-based attacks. Notably, although the training data contained no Portmap attack samples, the model detected Portmap traffic with a high detection rate and a low false alarm rate. This result indicates that the model learned the fundamental behavior of IP spoofing rather than merely memorizing the training dataset.

To evaluate models not only by their accuracy but also by criteria such as speed, size, and platform independence, an ONNX (Open Neural Network Exchange) conversion process was performed. ONNX facilitates model portability across different deep learning frameworks, simplifying deployment on edge devices while simultaneously offering the advantages of reduced inference time and smaller model size. Within this scope, MLP, DNN, CNN, and ResNet1D models were converted to the ONNX format, with “Original,” “Processed,” “Quantized Dynamic,” and “Quantized Static” versions targeted for each. In this study, MLP and DNN emerged as the most suitable architectures for ONNX conversion, whereas quantization operations could not be performed on CNN and ResNet1D models due to technical limitations. These findings provide a significant practical evaluation of ONNX regarding architectural compatibility and improvements in inference speed. Although CNN and ResNet1D models exhibited more successful results prior to conversion, models supporting ONNX conversion, such as MLP and DNN, were observed to gain prominence due to their advantages in inference time and model size.

This study demonstrates the efficacy of the developed deep learning-based spoofing detection systems for real-time network security applications. This is attributed to the substantial reduction in inference times, miniaturization of model sizes, and the attainment of platform independence facilitated by ONNX conversion. These findings are expected to provide a valuable reference for selecting suitable methodologies in the detection of IP spoofing-based network attacks.

However, the study also has some limitations. The dataset used was created in a laboratory environment and may not fully reflect the diversity of real-world network threats; more diverse, real-time datasets would improve the generalizability of the models. In addition, ONNX conversion should be extended to all models, including the recurrent architectures that could not be converted here, and the resulting changes in detection performance and inference time should be evaluated. Finally, it would be beneficial to assess the models that best balance detection performance and inference time in high-density networks such as data centers and Internet exchange points (IXPs).
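
To give a sense of what the per-sample inference times imply for high-density networks, the sketch below converts a latency into an upper bound on throughput and into a minimum number of parallel workers for a given load. It is a rough back-of-the-envelope estimate under stated assumptions: the offered load of one million flows per second is hypothetical, the function names are illustrative, and feature extraction, batching, and I/O costs are ignored.

```python
import math

# Per-sample inference times in microseconds, taken from Table 3.
LATENCY_US = {"MLP": 0.5, "DNN": 0.6, "LSTM": 8.2, "GRU": 16.5}


def max_throughput(latency_us: float) -> float:
    """Upper bound on samples per second for a single sequential worker."""
    return 1_000_000 / latency_us


def workers_needed(flows_per_second: float, latency_us: float) -> int:
    """Minimum number of parallel workers needed to keep up with the load."""
    return math.ceil(flows_per_second * latency_us / 1_000_000)


if __name__ == "__main__":
    offered_load = 1_000_000  # hypothetical flows per second, not from the study
    for name, latency in LATENCY_US.items():
        print(
            f"{name}: up to {max_throughput(latency):,.0f} samples/s per worker, "
            f"{workers_needed(offered_load, latency)} worker(s) for {offered_load:,} flows/s"
        )
```

Under these assumptions a single MLP worker could in principle keep up with the hypothetical load, whereas the GRU model would need on the order of seventeen workers, which is why inference time matters when selecting a model for such environments.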

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/app15179508/s1, File S1: IP Spoofing ROC & Confusion Matrix.

Author Contributions

Conceptualization, İ.Ö.; methodology, İ.Ö.; formal analysis, İ.K.Ç., B.A., F.N.S. and İ.Ö.; investigation, İ.K.Ç., B.A. and F.N.S.; writing—original draft preparation, İ.K.Ç., B.A. and F.N.S.; writing—review and editing, İ.Ö.; visualization, İ.K.Ç. and B.A.; supervision, İ.Ö.; project administration, İ.Ö. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Doruk İletişim ve Otomasyon Sanayi ve Ticaret A.Ş. (DORUKNET) under funding number DRK.BGDT.DDOS.001.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

This material is based upon work supported by Doruk İletişim ve Otomasyon Sanayi ve Ticaret A.Ş. (DORUKNET). The authors gratefully acknowledge this support and take responsibility for the contents of this report. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of Doruk İletişim ve Otomasyon Sanayi ve Ticaret A.Ş. (DORUKNET).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

IP	Internet Protocol
MAC	Media Access Control
ARP	Address Resolution Protocol
DNS	Domain Name System
DDoS	Distributed Denial of Service
ONNX	Open Neural Network Exchange
LSTM	Long Short-Term Memory
xLSTM	Extended Long Short-Term Memory
GRU	Gated Recurrent Unit
CNN	Convolutional Neural Network
MLP	Multilayer Perceptron
DNN	Deep Neural Network
RNN	Recurrent Neural Network
AUC	Area Under the Curve

References

Cisco. Cisco Annual Internet Report (2018–2023), [Online]. Available online: https://www.cisco.com/c/en/us/solutions/collateral/executive-perspectives/annual-internet-report/white-paper-c11-741490.html (accessed on 27 July 2025).
Özçelik, I.; Brooks, R. Distributed Denial of Service Attacks: Real-World Detection and Mitigation; CRC Press: Boca Raton, FL, USA, 2020.
Cloudflare. DDoS Threat Report for 2025 Q1, [Online]. Available online: https://blog.cloudflare.com/ddos-threat-report-for-2025-q1/ (accessed on 27 July 2025).
Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. arXiv 2020, arXiv:2003.05813.
Ashiku, L.; Dagli, C. Network intrusion detection system using deep learning. Procedia Comput. Sci. 2021, 185, 239–247.
Razzaq, K.; Shah, M. Advancing cybersecurity through machine learning: A scientometric analysis of global research trends and influential contributions. J. Cybersecur. Priv. 2025, 5, 12.
Wang, H.; Jin, C.; Shin, K.G. Defense against spoofed IP traffic using hop-count filtering. IEEE/ACM Trans. Netw. 2007, 15, 40–53.
Sowah, R.A.; Ofori-Amanfo, K.B.; Mills, G.A.; Koumadi, K.M. Detection and prevention of man-in-the-middle spoofing attacks in MANETs using predictive techniques in artificial neural networks (ANN). J. Comput. Netw. Commun. 2019, 1, 4683982.
Kousar, H.; Mulla, M.M.; Shettar, P.; DG, N. DDoS attack detection system using Apache Spark. In Proceedings of the 2021 International Conference on Computer Communication and Informatics (ICCCI), Coimbatore, India, 27–29 January 2021; pp. 1–5.
Wu, H.; Zhang, X.; Chen, T.; Cheng, G.; Hu, X. IM-Shield: A Novel Defense System against DDoS Attacks under IP Spoofing in High-speed Networks. In Proceedings of the ICC 2022—IEEE International Conference on Communications, Seoul, Republic of Korea, 16–20 May 2022; pp. 4168–4173.
Parekh, V.; Saravanan, M. A hybrid approach to protect server from IP spoofing attack. In Proceedings of the 2022 International Conference on Innovative Computing, Intelligent Communication and Smart Electrical Systems (ICSES), Erode, India, 15–16 July 2022; pp. 1–9.
Dhanya, K.; Vajipayajula, S.; Srinivasan, K.; Tibrewal, A.; Kumar, T.S.; Kumar, T.G. Detection of network attacks using machine learning and deep learning models. Procedia Comput. Sci. 2023, 218, 57–66.
Majumder, S.; Deb Barma, M.K.; Saha, A. ARP spoofing detection using machine learning classifiers: An experimental study. Knowl. Inf. Syst. 2025, 67, 727–766.
Jindal, K.; Dalal, S.; Sharma, K.K. Analyzing spoofing attacks in wireless networks. In Proceedings of the 2014 4th International Conference on Advanced Computing and Communication Technologies, Rohtak, India, 8–9 February 2014; pp. 398–402.
Babu, P.R.; Bhaskari, D.L.; Satyanarayana, C.H. A comprehensive analysis of spoofing. Int. J. Adv. Comput. Sci. Appl. 2010, 1, 157–162.
Sherstinsky, A. Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network. Phys. D Nonlinear Phenom. 2020, 404, 132306.
Dey, R.; Salem, F.M. Gate-variants of gated recurrent unit (GRU) neural networks. In Proceedings of the 2017 IEEE 60th International Midwest Symposium on Circuits and Systems (MWSCAS), Boston, MA, USA, 6–9 August 2017; pp. 1597–1600.
Esmaily, J.; Moradinezhad, R.; Ghasemi, J. Intrusion detection system based on multi-layer perceptron neural networks and decision tree. In Proceedings of the 2015 7th Conference on Information and Knowledge Technology (IKT), Urmia, Iran, 26–28 May 2015; pp. 1–5.
Yi, H.; Shiyu, S.; Xiusheng, D.; Zhigang, C. A study on deep neural networks framework. In Proceedings of the 2016 IEEE Advanced Information Management, Communicates, Electronic and Automation Control Conference (IMCEC), Xi’an, China, 3–5 October 2016; pp. 1519–1522.
Paolini, E.; Valcarenghi, L.; Maggiani, L.; Andriolli, N. Real-time network packet classification exploiting computer vision architectures. IEEE Open J. Commun. Soc. 2024, 5, 1155–1166.
Canadian Institute for Cybersecurity. CIC-DDoS2019 Dataset, [Online]. Available online: https://www.unb.ca/cic/datasets/ddos-2019.html (accessed on 27 July 2025).
Al-Na’amneh, Q.I.; Aljaidi, M.; Gharaibeh, H.; Nasayreh, A.; Al Mamlook, R.E.; Almatarneh, S.; Alzu’bi, D.; Husien, A.S. Feature selection for robust spoofing detection: A Chi-square-based machine learning approach. In Proceedings of the 2023 2nd International Engineering Conference on Electrical, Energy, Artificial Intelligence (EICEEAI), Amman, Jordan, 27–28 December 2023; pp. 1–7.

Figure 1. Functional Block Diagram of the IP-Spoofing Detection Pipeline.

Figure 2. Training and Validation Accuracy Curves for the ResNet1D Model.

Figure 3. Training and Validation Loss Curves for the ResNet1D Model.

Figure 4. Training and Validation Accuracy Curves for the CNN Model.

Figure 5. Training and Validation Loss Curves for the CNN Model.

Figure 6. CNN Model Confusion Matrix for LDAP Reflection Attack.

Figure 7. CNN Model Confusion Matrix for MSSQL Reflection Attack.

Figure 8. CNN Model Confusion Matrix for NetBIOS Reflection Attack.

Figure 9. CNN Model Confusion Matrix for Portmap Reflection Attack.

Table 1. Features used in model training.

Category	Features
Basic Traffic Features	Source Port, Destination Port, Protocol
Packet Size and Intensity	Fwd Packet Length Min, Fwd Packet Length Mean, Flow Bytes/s, Min Packet Length, Packet Length Mean, Average Packet Size, Avg Fwd Segment Size
Time-Based Metrics	Bwd IAT Total
Flag Information (Flags)	Fwd PSH Flags, RST Flag Count, ACK Flag Count, URG Flag Count, CWE Flag Count
Transmission Direction and Asymmetry	Init_Win_bytes_forward, Init_Win_bytes_backward, Asymmetry_Ratio, OneWay

Table 2. Architectural Characteristics of the Employed Models.

Feature	LSTM	GRU	CNN	MLP	DNN	ResNet1D	xLSTM	RNN
Input Shape	3D	3D	3D	2D	2D	3D	3D	3D
Input Size	(timestep, 1)	(timestep, 1)	(timestep, 1)	(n_features,)	(n_features,)	(timestep, 1)	(timestep, 1)	(timestep, 1)
Hidden Layer Activation	tanh (LSTM) + ReLU (Dense)	tanh (GRU) + ReLU (Dense)	ReLU	ReLU	ReLU	ReLU	tanh (BiLSTM) + ReLU (Dense)	tanh (RNN) + ReLU (Dense)
Output Activation	Sigmoid	Sigmoid	Sigmoid	Sigmoid	Sigmoid	Sigmoid	Sigmoid	Sigmoid
Dropout Rate (%)	40/40/50	20/20/20	20/20/30	30/30	30/30	0	30/30	30/30/30
Number of Layers	4	4	4	4	4	6	4	4
L2 Regularization	0.01	0.003	0.001	Absent	Absent	Absent	0.01	0.01
Normalization	BatchNorm	LayerNorm	LayerNorm	Absent	BatchNorm	BatchNorm	BatchNorm	BatchNorm

Table 3. Summary of Deep Learning Model Performance.

Model	F1-Score	Accuracy	Precision	Recall	AUC	Inference Time (ms)
LSTM	0.9922	0.9922	0.9959	0.9885	0.9992	0.0082
GRU	0.9903	0.9905	0.9983	0.9826	0.9990	0.0165
CNN	0.9902	0.9902	0.9944	0.9860	0.9992	0.0031
MLP	0.9897	0.9898	0.9971	0.9825	0.9993	0.0005
RNN	0.9885	0.9886	0.9985	0.9787	0.9991	0.0044
ResNet1D	0.9875	0.9874	0.9875	0.9874	0.9984	0.0023
xLSTM	0.9836	0.9839	0.9971	0.9705	0.9990	0.0162
DNN	0.9825	0.9827	0.9933	0.9719	0.9989	0.0006

Table 4. Summary of Deep Learning Model Performance with ONNX.

Model	Type	Inference Time (ms)	F1-Score	Accuracy	Precision	Recall	AUC	File Size (MB)
MLP	Keras (.h5)	0.000451	0.9897	0.9898	0.9971	0.9825	0.9993	0.1860
MLP	ONNX (Original)	0.000232	0.9897	0.9898	0.9971	0.9825	0.9993	0.0519
MLP	ONNX (Processed)	0.000212	0.9897	0.9898	0.9971	0.9825	0.9993	0.0527
MLP	ONNX (Quantized Dynamic)	0.000203	0.9905	0.9906	0.9983	0.9828	0.9994	0.0198
MLP	ONNX (Quantized Static)	0.000248	0.9879	0.9879	0.9920	0.9838	0.9959	0.0252
DNN	Keras (.h5)	0.000628	0.9825	0.9827	0.9933	0.9719	0.9989	0.2084
DNN	ONNX (Original)	0.000285	0.9825	0.9827	0.9937	0.9715	0.9989	0.0546
DNN	ONNX (Processed)	0.000398	0.9825	0.9827	0.9937	0.9715	0.9989	0.0558
DNN	ONNX (Quantized Dynamic)	0.000263	0.9850	0.9852	0.9937	0.9765	0.9991	0.0234
DNN	ONNX (Quantized Static)	0.000491	0.9817	0.9819	0.9884	0.9752	0.9939	0.0330
ResNet1D	Keras (.h5)	0.002330	0.9875	0.9874	0.9875	0.9874	0.9984	0.6771
ResNet1D	ONNX (Original)	0.019720	0.9875	0.9874	0.9876	0.9874	0.9984	0.2021
ResNet1D	ONNX (Processed)	0.020720	0.9875	0.9874	0.9876	0.9874	0.9984	0.2058
CNN	Keras (.h5)	0.003107	0.9902	0.9902	0.9944	0.9860	0.9992	0.3264
CNN	ONNX (Original)	0.022694	0.9902	0.9902	0.9944	0.9860	0.9992	0.1001
CNN	ONNX (Processed)	0.025479	0.9902	0.9902	0.9944	0.9860	0.9992	0.1043

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Çekiş, İ.K.; Ayrancı, B.; Salman, F.N.; Özçelik, İ. IP Spoofing Detection Using Deep Learning. Appl. Sci. 2025, 15, 9508. https://doi.org/10.3390/app15179508


