Article

IP Spoofing Detection Using Deep Learning

by İsmet Kaan Çekiş 1, Buğra Ayrancı 1, Fezayim Numan Salman 1 and İlker Özçelik 2,*
1 Department of Computer Engineering, Faculty of Engineering and Architecture, Eskisehir Osmangazi University, Eskişehir 26480, Türkiye
2 Department of Software Engineering, Faculty of Engineering and Architecture, Eskisehir Osmangazi University, Eskişehir 26480, Türkiye
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(17), 9508; https://doi.org/10.3390/app15179508 (registering DOI)
Submission received: 28 July 2025 / Revised: 24 August 2025 / Accepted: 26 August 2025 / Published: 29 August 2025
(This article belongs to the Section Computing and Artificial Intelligence)

Abstract

IP spoofing is a critical component in many cyberattacks, enabling attackers to evade detection and conceal their identities. This study compares eight deep learning models (LSTM, GRU, CNN, MLP, DNN, RNN, ResNet1D, and xLSTM) for their efficacy in detecting IP spoofing attacks. Overfitting was mitigated through techniques such as dropout, early stopping, and normalization. Models were trained using binary cross-entropy loss and the Adam optimizer. Performance was assessed via accuracy, precision, recall, F1 score, and inference time, with each model run 15 times to account for stochastic variability. Results indicate strong performance across all models, with LSTM and GRU demonstrating superior detection efficacy. After ONNX conversion, the MLP and DNN models retained their performance while achieving significant reductions in inference time, smaller model sizes, and platform independence. These improvements support the use of the developed systems in real-time network security applications. The performance metrics presented can guide the selection of IP spoofing detection strategies tailored to diverse application requirements, serving as a reference for network anomaly monitoring and targeted attack detection.

1. Introduction

In today’s rapidly advancing internet landscape, infrastructures are continually evolving and becoming increasingly complex. A report by Cisco for the period 2018–2023 indicates that the average number of internet-connected devices per capita increased from 2.4 to 3.6. Based on this data, it was inferred that the total number of internet-connected devices would rise from 18.4 billion to 29.3 billion. The report also observed that the global average network bandwidth more than doubled, from 45.9 Mbps in 2018 to 110.4 Mbps in 2023. This combined increase in connected devices and network bandwidth escalated the volume and complexity of network traffic, making the detection of cybersecurity threats more challenging [1]. IP spoofing-based attacks in particular are easily hidden within this heavy traffic. Because they exhibit standard protocol structures and legitimate traffic patterns, they are not readily detected by traditional network security systems.
IP spoofing attacks constitute a special threat to network security. IP spoofing is a method where attackers create a false identity by altering the source IP addresses of packets within the network traffic. Through this technique, attackers can communicate with the target systems by appearing as legitimate users. IP spoofing is frequently used effectively in Distributed Denial of Service (DDoS) attacks, especially in reflection and amplification-based attacks such as NTP amplification and DNS amplification. In these attacks, a small request packet is sent using a spoofed source IP to servers that generate a much larger response, thereby rendering the target system unable to provide service under heavy traffic [2]. The fact that the source addresses of packets are not verified by the IP protocol makes it possible to conduct such attacks. Therefore, the detection of IP spoofing attacks is critical for the security and integrity of networks.
According to Cloudflare’s Q1 2025 DDoS report, IP spoofing techniques (SYN flood, DNS amplification, and UDP reflection) were actively used in approximately 80% of DDoS attacks [3]. Attackers evaded detection systems by concealing the traffic source. Furthermore, the SMap study indicates that approximately 69.8% of autonomous systems on the internet still do not filter spoofed IP packets, meaning they are vulnerable to spoofing-based attack generation [4].
Signature-based detection methods are knowledge-based approaches that examine network traffic to recognize existing or known attack patterns and can only detect threats that match existing signatures [5]. These methods therefore require continuous signature updates to capture unknown or previously unobserved attack techniques. In dynamic and high-volume network traffic conditions, this need for updates reduces system effectiveness and further complicates the detection of spoofed packets that conform to protocol standards but are maliciously crafted. Given these limitations, the detection of sophisticated attacks like IP spoofing requires more flexible, learnable, and scalable methods.
Anomaly-based detection approaches issue alerts when the monitored system or network traffic deviates from the patterns defined as normal. This allows for the detection of previously unencountered attacks. Within the last decade, there has been an increase in the use of machine and deep learning-based methods in anomaly detection studies, with Razzaq and Shah (2023) finding that the intersection of cybersecurity and these advanced computational techniques has experienced significant growth and global collaboration from 2016 to 2025 [6]. Within the scope of this study, the detection of IP spoofing attacks using different deep learning methods has been examined in detail.
For deep learning models developed to operate efficiently in real-time systems, low latency and platform independence have become crucial in addition to high accuracy. In this context, Open Neural Network Exchange (ONNX), an open-source model representation format, enables models to be easily integrated into production environments by providing interoperability among different deep learning libraries. The ONNX format allows models to run on CPUs, GPUs, or edge devices, independent of frameworks such as TensorFlow and PyTorch, offering significant advantages in terms of inference time. Additionally, optimizations applied during ONNX conversion result in substantial reductions in model size, providing further benefits in terms of memory usage. The deep learning-based IP spoofing detection models in this study that support ONNX were converted to the ONNX format, and their performance was evaluated in this universal execution environment, so that both their accuracy and their production readiness could be analyzed. To the best of our knowledge, there is no comparative study in the literature on the detection of IP spoofing attacks using deep learning methods. The outputs of this study are expected to contribute to efforts in detecting the many attacks that use the IP spoofing method.
The remainder of this article is organized as follows. Section 2 reviews existing literature on IP spoofing detection. Section 3 classifies the types of spoofing attacks and their corresponding characteristics within network traffic. Section 4 presents the deep learning methodologies employed and the rationale for their selection. Section 5 details the experimental setup, including the datasets utilized and the feature selection process. Section 6 introduces the performance metrics applied for evaluation. Section 7 discusses the machine learning models, their architectural specifics, and the training procedures. Section 8 evaluates the performance of these models. Finally, Section 9 discusses the obtained findings and offers recommendations for future research.

2. Literature Review

The existing literature contains a comparatively limited number of dedicated studies and datasets specifically addressing IP spoofing detection. Consequently, this section reviews detection studies relevant to IP spoofing, primarily referencing research where its role is emphasized within the broader context of Distributed Denial of Service (DDoS) attack detection.
To counter IP spoofing attacks, Haining Wang et al. (2007) [7] introduced the Hop Count Filtering (HCF) technique. This method involves calculating the actual hop count from the Time to Live (TTL) value of an incoming packet and comparing it to the expected hop count associated with its source IP address. This comparison serves to identify spoofed IP packets. Wang et al. reported that the HCF technique achieves a 90% accuracy rate, while also noting the possibility of minor deviations. This suggests that despite its high technical accuracy, an increase in network complexity may lead to false positives.
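The hop-count comparison described above can be illustrated with a minimal sketch. It assumes that senders start from one of a few common initial TTL values, so the hop count is the gap between the observed TTL and the nearest initial value at or above it. The list of initial TTLs, the function names, the tolerance parameter, and the IP-to-hop-count table are illustrative assumptions, not details taken from Wang et al.

```python
# Common initial TTL values (an assumption made for this sketch).
INITIAL_TTLS = (32, 64, 128, 255)


def infer_hop_count(observed_ttl: int) -> int:
    """Estimate hops travelled: nearest initial TTL at or above the observed TTL, minus it."""
    for initial in INITIAL_TTLS:
        if observed_ttl <= initial:
            return initial - observed_ttl
    raise ValueError("TTL must be in the range 0..255")


def looks_spoofed(src_ip: str, observed_ttl: int, expected_hops: dict, tolerance: int = 0) -> bool:
    """Flag a packet whose inferred hop count disagrees with the stored value for src_ip.

    Sources without a stored hop count are not flagged in this sketch.
    """
    expected = expected_hops.get(src_ip)
    if expected is None:
        return False
    return abs(infer_hop_count(observed_ttl) - expected) > tolerance


# Hypothetical table mapping a source address to its previously learned hop count.
expected_hops = {"203.0.113.7": 12}

genuine = looks_spoofed("203.0.113.7", 52, expected_hops)   # 64 - 52 = 12 hops, matches
forged = looks_spoofed("203.0.113.7", 120, expected_hops)   # 128 - 120 = 8 hops, mismatch
```

A non-zero tolerance would trade some detection strength for fewer false positives when routes change, which is the kind of deviation the reported accuracy figure leaves room for.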
R.A. Sowah et al. (2019) [8] developed an Artificial Neural Network (ANN) based method to detect and prevent Man-in-the-Middle (MITM) spoofing attacks within Mobile Ad-hoc Networks (MANETs). Their experiments, utilizing a 5-node network architecture, successfully detected MITM attacks and enabled the identification of spoofed IP addresses. This research highlighted the efficacy of ANN algorithms for IP spoofing detection and demonstrated an 88.23% accuracy rate for the proposed method. Nevertheless, these findings are subject to certain limitations stemming from the dynamic characteristics of MANETs and inherent network variability.
Heena Kousar et al. (2021) [9] devised a solution using the Apache Spark platform to address DDoS attacks, encompassing those that incorporate IP spoofing. Their research demonstrated that the Random Forest algorithm outperformed the Decision Tree algorithm, and that distributed processing offered substantial benefits regarding preprocessing and training duration. This method achieved an efficacy of 90.86% accuracy. While this study did not directly focus on IP spoofing detection, the identified DDoS attack types included those that involved IP spoofing.
IM-Shield, a system proposed by Hua Wu et al. (2022) [10], was designed to defend against DDoS attacks that utilize IP spoofing. This system verifies the authentic source identity of network packets by analyzing pairings of router interface MAC addresses and destination IP addresses. IM-Shield detects and filters DDoS attacks without necessitating modifications to existing network protocols. Experimental results demonstrated a 99.9% accuracy rate.
To detect and prevent IP spoofing-based DDoS attacks, Varsha Parekh and Saravanan M (2022) [11] formulated a hybrid strategy integrating distance calculation with machine learning-based methods. Evaluations on the CAIDA 2007 dataset indicated that this hybrid approach achieved the highest performance, with an accuracy rate of 99.86%. This strategy not only facilitates protection against attacks via a SNORT-based prevention mechanism but also yields superior outcomes in metrics like Precision, Recall, and F1-Score when compared to alternative methods.
K.A. Dhanya et al. (2023) [12] proposed machine learning and deep learning-based models for the detection of network attacks. In experiments performed on the UNSW-NB15 dataset, the Decision Tree algorithm achieved the highest performance, with an accuracy of 99.05%. While this study did not directly target IP spoofing detection, it effectively managed to indirectly classify the use of spoofed IP addresses in network traffic.
Sharmistha Majumder, Mrinal Kanti Deb Barma, and Ashim Saha (2025) [13] developed a dynamic machine learning-based anomaly detection approach for the real-time detection of ARP spoofing-based Man-in-the-Middle (MITM) attacks. This method relies on the continuous verification of IP and MAC addresses and their cross-validation with gateway information. Experimental results demonstrated an F1-Score of 99.26%. This system safeguards network integrity by identifying ARP spoofing attacks, potentially stemming from IP spoofing, through the detection of fraudulent IP-MAC address pairings.
While IP spoofing is often addressed in the literature indirectly in conjunction with other malicious activities, this study uniquely defines it as the primary attack for detection. Accordingly, eight distinct deep learning models were trained to detect IP spoofing, and their performances were comprehensively compared. The findings of this work are expected to serve as a benchmark for future research aimed at developing cyber threat detection systems against attacks employing IP spoofing techniques.

3. Spoofing Attacks

Spoofing refers to the act of impersonating another person or computer system by providing false information (e.g., an email name, URL, or IP address). In the domain of information technology, spoofing manifests in various forms, all of which involve some type of misinformation intended to deceive users. These methods involve the misrepresentation of information in diverse manners, which consequently leads to diverse types of fraudulent activities. Spoofing attacks in which the IP address is directly utilized during the fraudulent process include the following:
  • IP Spoofing
  • ARP Spoofing
  • DNS Spoofing

3.1. IP Spoofing

In computer networks, IP address spoofing (or IP spoofing) involves the creation of Internet Protocol (IP) packets with a falsified source address. This technique aims to either conceal the sender’s identity or impersonate another computer system. Network attackers also employ IP spoofing to circumvent security mechanisms such as IP address-based authentication. The efficacy of such attacks is notably heightened when trust relationships exist between machines. Router behavior amplifies the network’s susceptibility to IP spoofing: routers typically inspect only the destination address when forwarding, authentication may rely on the source address, and modifying the source address field in an IP packet’s header is straightforward [14].

3.2. ARP Spoofing

The Address Resolution Protocol (ARP) facilitates the mapping between IP addresses and Media Access Control (MAC) addresses, while maintaining these associations in an ARP cache. If a packet is to be delivered to an IP address within the local network segment, ARP queries its cache for the associated MAC address. In the event this mapping is absent, ARP issues a broadcast request across the network. This mechanism is susceptible to deception through falsified ARP responses. ARP spoofing entails the generation of counterfeit ARP requests or replies, thereby misdirecting a target host’s traffic to an unauthorized machine. This malicious activity is commonly termed “ARP poisoning.” The implementation of MAC binding or static ARP tables can serve as countermeasures against such attacks; however, these solutions are often impractical in large-scale, dynamic network environments. Conversely, tools such as ARPWATCH monitor modifications to the ARP cache and alert administrators to potential anomalies [15].
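The cache-monitoring idea can be sketched in a few lines. This is not how ARPWATCH is implemented; it is a hypothetical illustration in which `check_arp_reply` remembers the first MAC address seen for each IP address and reports any later reply that contradicts it.

```python
def check_arp_reply(cache: dict, ip: str, mac: str) -> bool:
    """Record an IP-to-MAC binding; return True if it conflicts with an earlier binding."""
    mac = mac.lower()  # compare MAC addresses case-insensitively
    known = cache.get(ip)
    if known is None:
        cache[ip] = mac  # first sighting: learn the binding
        return False
    return known != mac  # a different MAC for a known IP is suspicious


cache = {}
first = check_arp_reply(cache, "192.168.1.1", "AA:BB:CC:00:00:01")
repeat = check_arp_reply(cache, "192.168.1.1", "aa:bb:cc:00:00:01")
conflict = check_arp_reply(cache, "192.168.1.1", "de:ad:be:ef:00:02")
```

In a dynamic network a changed binding can also be legitimate (for example, after a hardware replacement), so a monitor of this kind raises an alert for review rather than proving an attack.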

3.3. DNS Spoofing

DNS spoofing is the manipulation of Domain Name System (DNS) records to redirect traffic to an IP address different from the legitimate one. This technique can lead to the compromise of a server’s identity by rerouting its DNS record to an unauthorized IP address. With modern BIND (Berkeley Internet Name Domain) services, successfully executing such an attack typically necessitates infiltrating the server or its underlying network infrastructure (e.g., routers, switches), a task that presents considerable difficulty. Despite all these difficulties, DNS attacks are quite common [14].

4. Selected Deep Learning Models

In this study, eight deep learning models, each with distinct architectural approaches, were employed for the detection of IP spoofing. Each model was evaluated based on its specific structural features and learning capacity and was implemented with strategies designed to prevent overfitting and enhance accuracy. This section briefly outlines the fundamental structures of the utilized models and the rationale behind their selection.
RNN (Recurrent Neural Network): RNN is a classical recurrent neural network architecture designed to work with sequential data. Although effective in learning short-term dependencies, it is not as successful as advanced architectures like LSTM and GRU in learning long-term dependencies. In this study, the RNN was included for evaluation due to its fundamental and established capability in sequential data modeling.
LSTM (Long Short-Term Memory): LSTM is a type of Recurrent Neural Network (RNN) distinguished by its capacity to learn long-term dependencies in sequential data. In this research, LSTM was utilized to capture temporal dependencies within network traffic, applied for IP spoofing detection by interpreting sequential relationships in data flows [16].
xLSTM (Extended Long Short-Term Memory): An advanced variant of the LSTM architecture, xLSTM may incorporate deeper structures and refined cell architectures. This model aims to learn long-term dependencies more robustly but exhibits a higher computational cost. In this study, it was evaluated as an extended version of LSTM.
GRU (Gated Recurrent Unit): Like LSTM in its effectiveness with time-dependent data, the GRU reduces training duration and offers a lighter-weight architecture due to its fewer parameters. In this study, it was evaluated as an alternative time series model to LSTM [17].
MLP (Multi-Layer Perceptron): MLP is a type of feedforward neural network designed to transform input data into appropriate output representations using a layered architecture of interconnected neurons. In this study, a four-layer MLP was employed and configured for non-sequential data, with its generalization performance enhanced by the application of regularization techniques to mitigate the risk of overfitting [18].
DNN (Deep Neural Network): The DNN model was employed as an extended and more regularized version of the conventional MLP architecture. It combined multiple hidden layers with dropout and batch normalization, aiming for a balance between accuracy and generalization performance during the learning process. The DNN model was assessed for its capacity, attributable to its deep architecture, to learn complex patterns representing IP spoofing [19].
CNN (Convolutional Neural Network): Predominantly utilized in image processing, CNNs were employed in this study in a one-dimensional configuration (1D-CNN). The objective was to extract local patterns from data flows and develop filters capable of automatically learning discriminative features indicative of spoofed traffic [20].
ResNet1D (Residual Neural Network 1D): This model features deeper layers than classical deep neural networks and facilitates training while preventing overfitting through skip connections between layers. In this study, a one-dimensional (1D) architecture was employed to enable more effective learning of discriminative features in network traffic.

5. Experiment Setup

Existing studies in the literature, despite employing machine learning and deep learning methods, often do not sufficiently address the critical issues of data imbalance and overfitting. This study addresses these issues by utilizing a balanced dataset for both training and testing, which was obtained through down-sampling of the original data.
The dataset used is the CIC-DDoS2019, developed by the Canadian Institute for Cybersecurity (CIC), Fredericton, NB, Canada. This dataset, compiled in a laboratory setting to emulate real-world attack scenarios, encompasses contemporary Distributed Denial of Service (DDoS) attack types. It also includes both benign and various malicious traffic types. For this study, reflection attacks were specifically identified as instances of IP spoofing, thereby framing the detection task as a binary classification problem [21]. The dataset comprises two days of data featuring thirteen distinct attacks, of which only the reflection attacks were selected for analysis. A reflection attack is a cyber-attack that leverages responses from a third-party server to flood a victim with traffic. As attackers send requests to a server using a forged source IP address that belongs to the victim, these attacks can be categorized as a form of IP spoofing.
To prevent data leakage and ensure that traffic from the same attack episode does not appear in both the training and testing sets, a time-based split was employed. Specifically, attacks from the second day (TFTP, LDAP, MSSQL, NetBIOS, NTP, SNMP, SSDP, DNS) were allocated for the training set, while attacks from the first day (LDAP, MSSQL, NetBIOS, Portmap) were reserved for the testing set. This methodology ensures that no duplicate flow entries exist between the training and testing data.
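As a rough illustration of this day-based split, the sketch below partitions a handful of made-up flow records by capture day and checks that no flow identifier lands in both sets. The record layout, identifiers, and label strings are hypothetical.

```python
# Hypothetical flow records: (flow_id, capture_day, label).
flows = [
    ("f1", 1, "LDAP"), ("f2", 1, "BENIGN"), ("f3", 1, "Portmap"),
    ("f4", 2, "NTP"), ("f5", 2, "BENIGN"), ("f6", 2, "DNS"),
]

TRAIN_DAY, TEST_DAY = 2, 1  # second day for training, first day for testing

train = [flow for flow in flows if flow[1] == TRAIN_DAY]
test = [flow for flow in flows if flow[1] == TEST_DAY]

# Leakage check: the two sets must not share any flow identifier.
overlap = {flow[0] for flow in train} & {flow[0] for flow in test}
```

Because the test day contains Portmap traffic that never appears in the training day, a split of this kind also probes how well a model handles an attack type it was not trained on.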
To enhance model accuracy and generalization capacity, this study integrated deep neural network (DNN) architectures with established overfitting prevention techniques, such as dropout and early stopping. The feature selection phase employed the Chi-Square method to identify and select the most significant attributes for model training. In addition, going beyond approaches in the existing literature that focus solely on model accuracy, the models in this study that support ONNX (Open Neural Network Exchange) were converted to that format and evaluated in terms of performance metrics critical for production environments, such as inference time and model size. The ONNX conversion enabled the models to run faster, use less memory, and be deployed platform-independently.
The dataset is structured on a flow basis. Each instance within the dataset corresponds to a specific network flow, and the analysis was conducted directly at this flow level. During the data preprocessing phase, records containing missing or infinite values were removed, non-numerical features were suitably converted, and the ‘Label’ column was binarized, representing normal traffic as ‘0’ and attack traffic as ‘1’. In the model training, a balanced training dataset made up of 50,609 benign and 50,609 spoofing flows was utilized to mitigate potential biases arising from the observed class imbalance in the dataset. During the feature engineering phase, a Chi-Square-based feature selection algorithm was employed to identify features anticipated to directly contribute to the classification task [22]. This statistical method highlighted features with high discriminative power among classes by assessing the independence between each feature and the target variable.
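The preprocessing steps above (cleaning, label binarization, down-sampling, and Chi-Square scoring) can be sketched with the standard library alone. The column names, label strings, and toy values below are illustrative, and the Chi-Square score shown compares per-class feature totals with the totals expected under independence, which is one common formulation for non-negative features; the source does not state which implementation the authors used.

```python
import math
import random


def clean_and_binarize(rows, label_key="Label", benign="BENIGN"):
    """Drop rows with missing or infinite values; map benign to 0 and everything else to 1."""
    cleaned = []
    for row in rows:
        values = [v for k, v in row.items() if k != label_key]
        if any(v is None or not math.isfinite(v) for v in values):
            continue
        out = dict(row)
        out[label_key] = 0 if row[label_key] == benign else 1
        cleaned.append(out)
    return cleaned


def downsample(rows, label_key="Label", seed=0):
    """Randomly shrink the larger class to the size of the smaller one."""
    rng = random.Random(seed)
    by_class = {0: [], 1: []}
    for row in rows:
        by_class[row[label_key]].append(row)
    n = min(len(by_class[0]), len(by_class[1]))
    return rng.sample(by_class[0], n) + rng.sample(by_class[1], n)


def chi_square_score(rows, feature, label_key="Label"):
    """Chi-square statistic between a non-negative feature and the binary label."""
    total = sum(r[feature] for r in rows)
    score = 0.0
    for cls in (0, 1):
        members = [r for r in rows if r[label_key] == cls]
        observed = sum(r[feature] for r in members)
        expected = total * len(members) / len(rows)
        if expected > 0:
            score += (observed - expected) ** 2 / expected
    return score


# Toy flows with made-up values; one row carries an infinite value and is dropped.
raw = [
    {"Fwd Packets": 2.0, "Flow Duration": 10.0, "Label": "BENIGN"},
    {"Fwd Packets": 3.0, "Flow Duration": 12.0, "Label": "BENIGN"},
    {"Fwd Packets": 40.0, "Flow Duration": 11.0, "Label": "DrDoS_NTP"},
    {"Fwd Packets": 50.0, "Flow Duration": 9.0, "Label": "DrDoS_DNS"},
    {"Fwd Packets": 45.0, "Flow Duration": float("inf"), "Label": "DrDoS_DNS"},
    {"Fwd Packets": 60.0, "Flow Duration": 10.0, "Label": "DrDoS_LDAP"},
]

balanced = downsample(clean_and_binarize(raw))
scores = {f: chi_square_score(balanced, f) for f in ("Fwd Packets", "Flow Duration")}
```

In this toy data the forward packet count differs sharply between classes while the flow duration does not, so the first feature receives a much higher score and would be kept by a selection step that ranks features by score.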
Following this selection process, two new features, designated ‘Asymmetry_Ratio’ and ‘OneWay,’ were derived and subsequently incorporated into the final feature set for IP spoofing detection. The final set of features employed for model training and testing is categorized and presented in Table 1. The overall data-processing pipeline is summarized in Figure 1.
The novel derived features, Asymmetry_Ratio and OneWay, have been identified as attributes capable of providing significant distinctions in the detection of IP spoofing.
Asymmetry_Ratio is defined as the ratio of the total number of forward (Fwd) packets to the sum of both forward and backward (Bwd) packets within a flow. This ratio is calculated as follows:
$$\text{Asymmetry\_Ratio} = \frac{\text{Total Fwd Packets}}{\text{Total Fwd Packets} + \text{Total Bwd Packets} + \epsilon}$$
Here, ϵ represents a small positive number employed to prevent division-by-zero errors, ensuring the numerical stability of computations. This feature quantitatively expresses the directional imbalance within network traffic. During IP spoofing attacks, packets flow in a single direction (e.g., solely transmission) because a response from the target system is typically not received. In such instances, the Asymmetry_Ratio value approaches one, indicating a pronounced asymmetry.
OneWay is a binary derived feature that indicates whether a flow occurs in only one direction. If a flow contains exclusively forward or exclusively backward packets, meaning the opposing direction is entirely absent, this feature’s value becomes one; otherwise, it is zero. This feature is formulated as follows:
$$\text{OneWay} = \begin{cases} 1 & \text{if Total Fwd Packets} = 0 \ \text{or Total Bwd Packets} = 0 \\ 0 & \text{otherwise} \end{cases}$$
The observation of unidirectional traffic is considered a distinguishing indicator, particularly in attack scenarios involving the transmission of spoofed packets from which no response is received from the target system.
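A minimal sketch of the two derived features follows. The function names and the value chosen for the small constant are assumptions; the source does not state the constant that was used.

```python
EPSILON = 1e-9  # assumed small positive constant to avoid division by zero


def asymmetry_ratio(fwd_packets: int, bwd_packets: int, eps: float = EPSILON) -> float:
    """Share of forward packets among all packets in the flow."""
    return fwd_packets / (fwd_packets + bwd_packets + eps)


def one_way(fwd_packets: int, bwd_packets: int) -> int:
    """1 if either direction of the flow is empty, otherwise 0."""
    return 1 if fwd_packets == 0 or bwd_packets == 0 else 0


# A two-way flow versus a flow with no backward packets.
balanced_flow = (asymmetry_ratio(10, 10), one_way(10, 10))
reflected_flow = (asymmetry_ratio(200, 0), one_way(200, 0))
```

For the two-way flow the ratio is close to 0.5 and OneWay is 0, while the forward-only flow gives a ratio close to one and OneWay equal to 1, matching the behaviour described for spoofed traffic.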

6. Performance Metrics

In this study, four fundamental measurement parameters are used:
True Positive (TP): Represents the count of instances correctly identified by the model as positive (i.e., attack).
True Negative (TN): Represents the count of instances correctly identified by the model as negative (i.e., normal).
False Positive (FP): Denotes the count of instances that are negative (normal) but are erroneously classified by the model as positive (attack).
False Negative (FN): Denotes the count of instances that are positive (attack) but are erroneously classified by the model as negative (normal).
These four fundamental parameters serve as the basis for calculating the ensuing evaluation metrics:

6.1. Accuracy

This is the ratio of the samples correctly predicted by the model to the total count of samples.
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

6.2. Precision

It indicates the proportion of instances classified as positive that are actually positive.
$$\text{Precision} = \frac{TP}{TP + FP}$$

6.3. Recall

Recall (Sensitivity), also referred to as the Detection Rate, measures the proportion of actual positive instances that are correctly identified.
$$\text{Recall} = \frac{TP}{TP + FN}$$

6.4. F1-Score

It is the harmonic mean of the Precision and Recall values. It provides a more reliable success criterion, especially in imbalanced datasets.
$$\text{F1 Score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
A combined evaluation of these metrics allows for a detailed examination of attack detection performance, beyond relying solely on general accuracy. For IP spoofing detection, it is crucial to effectively manage both false negative and false positive rates.
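The four metrics can be computed directly from the confusion counts, as in the sketch below. The counts in the example are made up for illustration and are not results from this study.

```python
def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Illustrative counts: 90 attacks caught, 10 missed, 5 false alarms, 95 normal flows passed.
metrics = classification_metrics(tp=90, tn=95, fp=5, fn=10)
```

With these counts the accuracy is 0.925, yet recall (0.90) is lower than precision (about 0.947), which shows how the individual metrics expose missed attacks that the overall accuracy alone would hide.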

7. Model Training and Performance Test

The model training process was conducted using a normalized dataset that had been previously prepared through feature selection and data cleansing procedures. The architectures of the employed models were designed, incorporating strategies to mitigate overfitting and enhance learning efficacy for each model.
In the training phase, all deep learning models were structured for binary classification, employing a sigmoid activation function in their output layers. The binary_crossentropy function was selected for loss calculation, and the Adam algorithm was utilized for optimization. Throughout training, validation performance was monitored, and early stopping was triggered if no improvement was observed over four consecutive epochs. Furthermore, the learning rate was halved using the “ReduceLROnPlateau” callback when the validation loss stagnated. To further improve model performance and enhance generalization capabilities, Dropout layers were incorporated in all models. Batch Normalization was applied to select intermediate layers, and L2 regularization was employed to prevent the uncontrolled growth of weights. The specific application of these architectural enhancements for each model is detailed in Table 2.
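The interaction between early stopping and learning-rate halving can be illustrated by replaying a sequence of validation losses in plain Python. This is a simplified stand-in for the callback logic, not the framework callbacks themselves. The early-stopping patience of four epochs and the halving factor come from the text; the initial learning rate, the plateau patience of two epochs, and the loss values are assumptions.

```python
def run_schedule(val_losses, lr=1e-3, stop_patience=4, lr_patience=2, factor=0.5):
    """Replay validation losses; return (epochs_run, final_lr, best_loss)."""
    best = float("inf")
    since_best = 0   # epochs since the best validation loss (early stopping)
    plateau = 0      # epochs without improvement since the last learning-rate cut
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, since_best, plateau = loss, 0, 0
            continue
        since_best += 1
        plateau += 1
        if plateau >= lr_patience:
            lr *= factor  # halve the learning rate on a plateau
            plateau = 0
        if since_best >= stop_patience:
            return epoch, lr, best  # stop: no improvement for stop_patience epochs
    return len(val_losses), lr, best


# Made-up validation losses: improvement for three epochs, then stagnation.
losses = [0.50, 0.40, 0.35, 0.36, 0.37, 0.36, 0.38, 0.30]
epochs_run, final_lr, best_loss = run_schedule(losses)
```

In this replay training stops after epoch 7, so the lower loss at epoch 8 is never reached; this is the usual trade-off when choosing a short patience.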
Figure 2 and Figure 3 for the ResNet1D model, and Figure 4 and Figure 5 for the CNN model, visually present the accuracy and loss curves obtained during their respective training processes. To optimize figure space, these visualizations were confined to the ResNet1D and CNN methods, as the other models yielded similar performance trends. Results pertaining to the other methods are detailed in Table 3. An examination of these graphs reveals that both models consistently achieved prominent levels of training and validation accuracy. Moreover, their training loss and the validation loss curves exhibit closely aligned trends, decreasing in parallel from the early stages of training. This pattern indicates that the models generalize effectively with high accuracy and do not show a tendency towards overfitting. Notably, the absence of abrupt increases in the validation loss curves substantiates the successful prevention of overfitting, underscoring the efficacy of the applied strategies such as early stopping, dropout, and L2 regularization.
In addition, small differences in performance metrics, such as accuracy, can be observed each time the models are run. This is due to the random initialization of weights during the training process, the stochastic nature of the optimization process, and the varying ways the dataset can be split into training and testing sets. For this reason, to obtain an average performance measurement, each model was run independently 15 times, and the average of the results was taken.
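A sketch of this repeated-run protocol is shown below. The `fake_run` function is a stand-in that returns a made-up accuracy for a given seed; in practice it would be replaced by a full training and evaluation run. Reporting the standard deviation alongside the mean is an addition of this sketch, not something the text states was done.

```python
import random
import statistics


def summarize_runs(train_and_score, n_runs=15):
    """Run the experiment n_runs times with different seeds; return mean and standard deviation."""
    scores = [train_and_score(seed) for seed in range(n_runs)]
    return statistics.mean(scores), statistics.stdev(scores)


def fake_run(seed: int) -> float:
    """Stand-in for one training run: a made-up accuracy that varies with the seed."""
    return 0.99 + random.Random(seed).uniform(-0.002, 0.002)


mean_acc, std_acc = summarize_runs(fake_run)
```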

8. Results

A comprehensive summary of the performance of the tested deep learning models is presented in Table 3. The results indicate that the LSTM model achieved the highest overall performance among all models, demonstrating 99.22% accuracy, a 99.84% attack detection rate, and a 99.59% normal traffic detection rate. The MLP model emerged as one of the fastest models, with 98.98% accuracy and an average inference time of merely 0.5 µs, rendering it a highly suitable architecture for real-time applications. The DNN model also demonstrated high classification success and low latency, achieving 98.3% accuracy and a latency of 0.6 µs per sample.
Conversely, the sequential architectures xLSTM, LSTM, and GRU exhibited slower inference. Although GRU achieved an accuracy rate of 99.1%, it was the slowest model, with an average prediction time of 16.5 µs. While the xLSTM and LSTM models achieved comparable accuracy, the former’s inference time was significantly longer, at 16.2 µs compared to the latter’s 8.2 µs.
Conversion of the developed models to the ONNX (Open Neural Network Exchange) format enabled a reduction in inference time and model size, along with more efficient utilization of system resources. MLP, DNN, CNN, and ResNet1D models were successfully converted to ONNX, facilitating their integration into both server and edge devices. However, sequential architectures, such as LSTM, GRU, xLSTM, and RNN, were not supported in the ONNX conversion due to their inclusion of TensorFlow-specific custom GPU operators (e.g., CudnnRNN); consequently, these models could not be included in the quantization processes.
The results obtained after the conversion are presented in Table 4. According to these results, both the MLP and DNN models maintained high accuracy rates and demonstrated full compatibility with the quantization processes. For the MLP model, accuracy remained at 98.98%, while the model size was reduced from 0.186 MB to 0.02 MB, and the inference time decreased from 0.45 microseconds to 0.02 microseconds. Similarly, the DNN model preserved 98% accuracy, with its size reduced from 0.208 MB to 0.024 MB, and a significant reduction in inference time.
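The reductions quoted above can be expressed as ratios with a few lines of arithmetic. The figures are the ones reported in the text; the dictionary keys are only labels for this sketch.

```python
# (before, after) values as reported in the text.
reported = {
    "MLP size (MB)": (0.186, 0.02),
    "DNN size (MB)": (0.208, 0.024),
    "MLP inference (microseconds)": (0.45, 0.02),
}

# Reduction factor = value before conversion / value after conversion.
reduction = {name: before / after for name, (before, after) in reported.items()}
```

This gives roughly a 9.3-fold size reduction for the MLP, about 8.7-fold for the DNN, and about a 22.5-fold reduction in MLP inference time.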
In contrast, while the CNN and ResNet1D models were successfully converted to ONNX, quantization operations could not be applied. The primary reason for this limitation is that certain specialized layers within the architectures of both models are not yet fully supported by ONNX Runtime. Specifically, layer combinations used with Conv1D, BatchNormalization, and Residual blocks are incompatible with ONNX operators such as QLinearConv or ConvInteger, which are utilized during quantization. Due to these problems, both dynamic and static quantization processes resulted in compilation errors for the CNN and ResNet1D models. Consequently, only the original and structurally optimized (Processed) ONNX versions of these two models could be deployed. Although the CNN and ResNet1D models did not experience performance degradation after ONNX conversion, they lagged behind the MLP and DNN models in terms of inference time and model size.
To assess performance on individual attack types, the confusion matrices of the CNN model for the LDAP, MSSQL, NetBIOS, and Portmap attacks are presented in Figures 6–9. These results indicate that the trained model detects all four attack types with a high detection rate and a very low false alarm rate. The per-attack confusion matrices for the remaining models are provided in the Supplementary Materials.
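As a reminder of how the detection rate and false alarm rate are read off a binary confusion matrix, the following sketch computes them from the four cell counts. The function name `confusion_rates` and the example counts are hypothetical and are not taken from Figures 6–9; the attack class is assumed to be the positive class.

```python
def confusion_rates(tp, fn, fp, tn):
    """Derive common rates from binary confusion-matrix counts.

    tp: attacks flagged as attacks      fn: attacks missed
    fp: normal traffic flagged as attack  tn: normal traffic passed
    """
    detection_rate = tp / (tp + fn)        # recall on the attack class
    false_alarm_rate = fp / (fp + tn)      # share of normal traffic flagged
    precision = tp / (tp + fp)
    f1 = 2 * precision * detection_rate / (precision + detection_rate)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return {
        "detection_rate": detection_rate,
        "false_alarm_rate": false_alarm_rate,
        "precision": precision,
        "f1": f1,
        "accuracy": accuracy,
    }


# Hypothetical counts, chosen only to illustrate the calculation.
example = confusion_rates(tp=990, fn=10, fp=5, tn=995)
print(example)
```

The sketch does not guard against empty classes; a confusion matrix with no attack samples or no normal samples would need explicit handling of the zero denominators.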

9. Discussion & Conclusions

IP spoofing is frequently employed by cybercriminals in contemporary network attacks. While existing network anomaly detection studies often treat IP spoofing as one symptom of an attack, this study targets its effective and efficient detection directly, recognizing its role as a component of cyberattacks. To this end, we conducted a comparative evaluation of eight deep learning architectures for detecting IP spoofing attacks in network traffic: Gated Recurrent Unit (GRU), Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN), Multilayer Perceptron (MLP), Deep Neural Network (DNN), ResNet1D (Residual Neural Network), Recurrent Neural Network (RNN), and Extended Long Short-Term Memory (xLSTM). Their performance was evaluated using accuracy, precision, recall, F1 score, attack detection rate, AUC, and inference time per packet. Furthermore, the ONNX-compatible models were converted to ONNX, and the relationship between their detection performance, inference time, and model size was examined.
All the trained models achieved comparable and strong performance. The LSTM model performed best overall, with 99.22% accuracy and F1 score, followed by GRU (99.05% accuracy) and CNN (99.02% accuracy). Regarding inference time, the MLP model recorded the shortest average per-sample inference time at 0.5 microseconds, followed by DNN (0.6 microseconds), ResNet1D (2.3 microseconds), and CNN (3.1 microseconds). In contrast, the GRU model exhibited the longest inference time at 16.5 microseconds, and the other sequential models, xLSTM (16.2 microseconds), LSTM (8.2 microseconds), and RNN (4.4 microseconds), were also comparatively slow. The normal traffic detection rate was 99.9% for the RNN model, 99.8% for GRU, 99.7% for xLSTM and MLP, and 99.5% for LSTM. For the attack detection rate, the LSTM and ResNet1D models achieved the highest value at 99%. From an application perspective, the CNN, MLP, ResNet1D, and DNN models are notable for combining high accuracy with short inference times and efficient use of system resources. This indicates that effective attack detection can be achieved without complex preprocessing steps, which improves the applicability of these systems in real-time environments. Overall, the findings demonstrate that deep learning models can distinguish spoofing-based attacks from normal traffic with high accuracy.
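The trade-off between detection quality and inference time described above can be turned into a simple selection rule. The sketch below is only an illustration: the F1 scores and per-sample inference times are taken from Table 3 (converted to microseconds), while the function `best_within_budget` and the idea of a fixed per-sample latency budget are assumptions introduced here, not part of the study.

```python
# (model, F1 score, inference time in microseconds), values from Table 3.
TABLE3 = [
    ("LSTM", 0.9922, 8.2),
    ("GRU", 0.9903, 16.5),
    ("CNN", 0.9902, 3.1),
    ("MLP", 0.9897, 0.5),
    ("RNN", 0.9885, 4.4),
    ("ResNet1D", 0.9875, 2.3),
    ("xLSTM", 0.9836, 16.2),
    ("DNN", 0.9825, 0.6),
]


def best_within_budget(budget_us):
    """Return the model with the highest F1 whose latency fits the budget."""
    candidates = [row for row in TABLE3 if row[2] <= budget_us]
    if not candidates:
        return None  # no model is fast enough for this budget
    return max(candidates, key=lambda row: row[1])[0]


for budget in (1.0, 5.0, 10.0):
    print(f"budget {budget} us -> {best_within_budget(budget)}")
```

A real deployment would also weigh false alarm rate, model size, and hardware, so this rule should be read as a starting point rather than a recommendation.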
As illustrated in Figures 6–9, the trained models successfully detect all spoofing-based attacks. Notably, although the training data contained no Portmap attack samples, the model achieved a high detection rate and a low false alarm rate on this attack. This result indicates that the trained model learned the fundamental behavior of IP spoofing rather than merely memorizing the training dataset.
To evaluate the models not only by accuracy but also by speed, size, and platform independence, an ONNX (Open Neural Network Exchange) conversion process was performed. ONNX makes models portable across deep learning frameworks, which simplifies deployment on edge devices and can also reduce inference time and model size. Within this scope, the MLP, DNN, CNN, and ResNet1D models were converted to the ONNX format, with “Original,” “Processed,” “Quantized Dynamic,” and “Quantized Static” versions targeted for each. MLP and DNN emerged as the most suitable architectures for ONNX conversion, whereas quantization could not be performed on the CNN and ResNet1D models due to technical limitations. These findings provide a practical evaluation of ONNX in terms of architectural compatibility and inference speed. Although the CNN and ResNet1D models achieved competitive detection results prior to conversion, the MLP and DNN models, which fully support ONNX conversion and quantization, gained prominence due to their advantages in inference time and model size.
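To give an intuition for why quantized models are smaller, the following sketch quantizes a list of floating-point weights to 8-bit integers and reconstructs them. It uses a simple symmetric int8 scheme chosen for illustration; this is an assumption and not necessarily the scheme applied by ONNX Runtime in the study. The function names and the example weights are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric int8 quantization: map [-max_abs, max_abs] onto [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in quantized]


# Hypothetical weights, used only to illustrate the round trip.
weights = [-0.82, -0.10, 0.0, 0.33, 0.75]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each weight is stored in one byte instead of the four bytes of a 32-bit float, at the cost of a rounding error of at most half the scale per weight; this is the size versus precision trade-off reflected in the quantized rows of Table 4.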
This study demonstrates that the developed deep learning-based spoofing detection systems are suitable for real-time network security applications, owing to the reduced inference times, smaller model sizes, and platform independence obtained through ONNX conversion. These findings are expected to serve as a reference for selecting suitable methods for detecting IP spoofing-based network attacks.
However, the study has some limitations. The dataset was created in a laboratory environment and may not fully reflect the diversity of real-world network threats; using more diverse, real-time datasets would improve the models’ generalizability. In addition, ONNX conversion should be extended to all models, and the resulting performance loss and inference time gains should be evaluated. Finally, models that balance detection performance and inference time well should be assessed in high-density networks such as data centers and Internet exchange points (IXPs).

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/app15179508/s1, File S1: IP Spoofing ROC & Confusion Matrix.

Author Contributions

Conceptualization, İ.Ö.; methodology, İ.Ö.; formal analysis, İ.K.Ç., B.A., F.N.S. and İ.Ö.; investigation, İ.K.Ç., B.A. and F.N.S.; writing—original draft preparation, İ.K.Ç., B.A. and F.N.S.; writing—review and editing, İ.Ö.; visualization, İ.K.Ç. and B.A.; supervision, İ.Ö.; project administration, İ.Ö. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Doruk İletişim ve Otomasyon Sanayi ve Ticaret A.Ş. (DORUKNET) under funding number DRK.BGDT.DDOS.001.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

This material is based upon work supported by Doruk İletişim ve Otomasyon Sanayi ve Ticaret A.Ş. (DORUKNET). The authors gratefully acknowledge this support and take responsibility for the contents of this report. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of Doruk İletişim ve Otomasyon Sanayi ve Ticaret A.Ş. (DORUKNET).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
IP: Internet Protocol
MAC: Media Access Control
ARP: Address Resolution Protocol
DNS: Domain Name System
DDoS: Distributed Denial of Service
ONNX: Open Neural Network Exchange
LSTM: Long Short-Term Memory
xLSTM: Extended Long Short-Term Memory
GRU: Gated Recurrent Unit
CNN: Convolutional Neural Network
MLP: Multilayer Perceptron
DNN: Deep Neural Network
RNN: Recurrent Neural Network
AUC: Area Under the Curve

References

  1. Cisco. Cisco Annual Internet Report (2018–2023), [Online]. Available online: https://www.cisco.com/c/en/us/solutions/collateral/executive-perspectives/annual-internet-report/white-paper-c11-741490.html (accessed on 27 July 2025).
  2. Özçelik, I.; Brooks, R. Distributed Denial of Service Attacks: Real-World Detection and Mitigation; CRC Press: Boca Raton, FL, USA, 2020. [Google Scholar]
  3. Cloudflare. DDoS Threat Report for 2025 Q1, [Online]. Available online: https://blog.cloudflare.com/ddos-threat-report-for-2025-q1/ (accessed on 27 July 2025).
  4. Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. arXiv 2020, arXiv:2003.05813. [Google Scholar] [CrossRef] [PubMed]
  5. Ashiku, L.; Dagli, C. Network intrusion detection system using deep learning. Procedia Comput. Sci. 2021, 185, 239–247. [Google Scholar] [CrossRef]
  6. Razzaq, K.; Shah, M. Advancing cybersecurity through machine learning: A scientometric analysis of global research trends and influential contributions. J. Cybersecur. Priv. 2025, 5, 12. [Google Scholar] [CrossRef]
  7. Wang, H.; Jin, C.; Shin, K.G. Defense against spoofed IP traffic using hop-count filtering. IEEE/ACM Trans. Netw. 2007, 15, 40–53. [Google Scholar] [CrossRef]
  8. Sowah, R.A.; Ofori-Amanfo, K.B.; Mills, G.A.; Koumadi, K.M. Detection and prevention of man-in-the-middle spoofing attacks in MANETs using predictive techniques in artificial neural networks (ANN). J. Comput. Netw. Commun. 2019, 1, 4683982. [Google Scholar] [CrossRef]
  9. Kousar, H.; Mulla, M.M.; Shettar, P.; DG, N. DDoS attack detection system using Apache Spark. In Proceedings of the 2021 International Conference on Computer Communication and Informatics (ICCCI), Coimbatore, India, 27–29 January 2021; pp. 1–5. [Google Scholar]
  10. Wu, H.; Zhang, X.; Chen, T.; Cheng, G.; Hu, X. IM-Shield: A Novel Defense System against DDoS Attacks under IP Spoofing in High-speed Networks. In Proceedings of the ICC 2022—IEEE International Conference on Communications, Seoul, Republic of Korea, 16–20 May 2022; pp. 4168–4173. [Google Scholar]
  11. Parekh, V.; Saravanan, M. A hybrid approach to protect server from IP spoofing attack. In Proceedings of the 2022 International Conference on Innovative Computing, Intelligent Communication and Smart Electrical Systems (ICSES), Erode, India, 15–16 July 2022; pp. 1–9. [Google Scholar]
  12. Dhanya, K.; Vajipayajula, S.; Srinivasan, K.; Tibrewal, A.; Kumar, T.S.; Kumar, T.G. Detection of network attacks using machine learning and deep learning models. Procedia Comput. Sci. 2023, 218, 57–66. [Google Scholar] [CrossRef]
  13. Majumder, S.; Deb Barma, M.K.; Saha, A. ARP spoofing detection using machine learning classifiers: An experimental study. Knowl. Inf. Syst. 2025, 67, 727–766. [Google Scholar] [CrossRef]
  14. Jindal, K.; Dalal, S.; Sharma, K.K. Analyzing spoofing attacks in wireless networks. In Proceedings of the 2014 4th International Conference on Advanced Computing and Communication Technologies, Rohtak, India, 8–9 February 2014; pp. 398–402. [Google Scholar]
  15. Babu, P.R.; Bhaskari, D.L.; Satyanarayana, C.H. A comprehensive analysis of spoofing. Int. J. Adv. Comput. Sci. Appl. 2010, 1, 157–162. [Google Scholar] [CrossRef]
  16. Sherstinsky, A. Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network. Phys. D Nonlinear Phenom. 2020, 404, 132306. [Google Scholar] [CrossRef]
  17. Dey, R.; Salem, F.M. Gate-variants of gated recurrent unit (GRU) neural networks. In Proceedings of the 2017 IEEE 60th International Midwest Symposium on Circuits and Systems (MWSCAS), Boston, MA, USA, 6–9 August 2017; pp. 1597–1600. [Google Scholar]
  18. Esmaily, J.; Moradinezhad, R.; Ghasemi, J. Intrusion detection system based on multi-layer perceptron neural networks and decision tree. In Proceedings of the 2015 7th Conference on Information and Knowledge Technology (IKT), Urmia, Iran, 26–28 May 2015; pp. 1–5. [Google Scholar]
  19. Yi, H.; Shiyu, S.; Xiusheng, D.; Zhigang, C. A study on deep neural networks framework. In Proceedings of the 2016 IEEE Advanced Information Management, Communicates, Electronic and Automation Control Conference (IMCEC), Xi’an, China, 3–5 October 2016; pp. 1519–1522. [Google Scholar]
  20. Paolini, E.; Valcarenghi, L.; Maggiani, L.; Andriolli, N. Real-time network packet classification exploiting computer vision architectures. IEEE Open J. Commun. Soc. 2024, 5, 1155–1166. [Google Scholar] [CrossRef]
  21. Canadian Institute for Cybersecurity. CIC-DDoS2019 Dataset, [Online]. Available online: https://www.unb.ca/cic/datasets/ddos-2019.html (accessed on 27 July 2025).
  22. Al-Na’amneh, Q.I.; Aljaidi, M.; Gharaibeh, H.; Nasayreh, A.; Al Mamlook, R.E.; Almatarneh, S.; Alzu’bi, D.; Husien, A.S. Feature selection for robust spoofing detection: A Chi-square-based machine learning approach. In Proceedings of the 2023 2nd International Engineering Conference on Electrical, Energy, Artificial Intelligence (EICEEAI), Amman, Jordan, 27–28 December 2023; pp. 1–7. [Google Scholar]
Figure 1. Functional Block Diagram of the IP-Spoofing Detection Pipeline.
Figure 2. Training and Validation Accuracy Curves for the ResNet1D Model.
Figure 3. Training and Validation Loss Curves for the ResNet1D Model.
Figure 4. Training and Validation Accuracy Curves for the CNN Model.
Figure 5. Training and Validation Loss Curves for the CNN Model.
Figure 6. CNN Model Confusion Matrix for LDAP Reflection Attack.
Figure 7. CNN Model Confusion Matrix for MSSQL Reflection Attack.
Figure 8. CNN Model Confusion Matrix for NetBIOS Reflection Attack.
Figure 9. CNN Model Confusion Matrix for Portmap Reflection Attack.
Table 1. Features used in model training.

| Category | Features |
| --- | --- |
| Basic Traffic Features | Source Port, Destination Port, Protocol |
| Packet Size and Intensity | Fwd Packet Length Min, Fwd Packet Length Mean, Flow Bytes/s, Min Packet Length, Packet Length Mean, Average Packet Size, Avg Fwd Segment Size |
| Time-Based Metrics | Bwd IAT Total |
| Flag Information (Flags) | Fwd PSH Flags, RST Flag Count, ACK Flag Count, URG Flag Count, CWE Flag Count |
| Transmission Direction and Asymmetry | Init_Win_bytes_forward, Init_Win_bytes_backward, Asymmetry_Ratio, OneWay |
Table 2. Architectural Characteristics of the Employed Models.

| Feature | LSTM | GRU | CNN | MLP | DNN | ResNet1D | xLSTM | RNN |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Input Shape | 3D | 3D | 3D | 2D | 2D | 3D | 3D | 3D |
| Input Size | (timestep, 1) | (timestep, 1) | (timestep, 1) | (n_features,) | (n_features,) | (timestep, 1) | (timestep, 1) | (timestep, 1) |
| Hidden Layer Activation | tanh (LSTM) + ReLU (Dense) | tanh (GRU) + ReLU (Dense) | ReLU | ReLU | ReLU | ReLU | tanh (BiLSTM) + ReLU (Dense) | tanh (RNN) + ReLU (Dense) |
| Output Activation | Sigmoid | Sigmoid | Sigmoid | Sigmoid | Sigmoid | Sigmoid | Sigmoid | Sigmoid |
| Dropout Rate (%) | 40/40/50 | 20/20/20 | 20/20/30 | 30/30 | 30/30 | 0 | 30/30 | 30/30/30 |
| Number of Layers | 4 | 4 | 4 | 4 | 4 | 6 | 4 | 4 |
| L2 Regularization | 0.01 | 0.003 | 0.001 | Absent | Absent | Absent | 0.01 | 0.01 |
| Normalization | BatchNorm | LayerNorm | LayerNorm | Absent | BatchNorm | BatchNorm | BatchNorm | BatchNorm |
Table 3. Summary of Deep Learning Model Performance.

| Model | F1-Score | Accuracy | Precision | Recall | AUC | Inference Time (ms) |
| --- | --- | --- | --- | --- | --- | --- |
| LSTM | 0.9922 | 0.9922 | 0.9959 | 0.9885 | 0.9992 | 0.0082 |
| GRU | 0.9903 | 0.9905 | 0.9983 | 0.9826 | 0.9990 | 0.0165 |
| CNN | 0.9902 | 0.9902 | 0.9944 | 0.9860 | 0.9992 | 0.0031 |
| MLP | 0.9897 | 0.9898 | 0.9971 | 0.9825 | 0.9993 | 0.0005 |
| RNN | 0.9885 | 0.9886 | 0.9985 | 0.9787 | 0.9991 | 0.0044 |
| ResNet1D | 0.9875 | 0.9874 | 0.9875 | 0.9874 | 0.9984 | 0.0023 |
| xLSTM | 0.9836 | 0.9839 | 0.9971 | 0.9705 | 0.9990 | 0.0162 |
| DNN | 0.9825 | 0.9827 | 0.9933 | 0.9719 | 0.9989 | 0.0006 |
Table 4. Summary of Deep Learning Model Performance with ONNX.

| Model | Type | Inference Time (ms) | F1-Score | Accuracy | Precision | Recall | AUC | File Size (MB) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MLP | Keras (.h5) | 0.000451 | 0.9897 | 0.9898 | 0.9971 | 0.9825 | 0.9993 | 0.1860 |
| MLP | ONNX (Original) | 0.000232 | 0.9897 | 0.9898 | 0.9971 | 0.9825 | 0.9993 | 0.0519 |
| MLP | ONNX (Processed) | 0.000212 | 0.9897 | 0.9898 | 0.9971 | 0.9825 | 0.9993 | 0.0527 |
| MLP | ONNX (Quantized Dynamic) | 0.000203 | 0.9905 | 0.9906 | 0.9983 | 0.9828 | 0.9994 | 0.0198 |
| MLP | ONNX (Quantized Static) | 0.000248 | 0.9879 | 0.9879 | 0.9920 | 0.9838 | 0.9959 | 0.0252 |
| DNN | Keras (.h5) | 0.000628 | 0.9825 | 0.9827 | 0.9933 | 0.9719 | 0.9989 | 0.2084 |
| DNN | ONNX (Original) | 0.000285 | 0.9825 | 0.9827 | 0.9937 | 0.9715 | 0.9989 | 0.0546 |
| DNN | ONNX (Processed) | 0.000398 | 0.9825 | 0.9827 | 0.9937 | 0.9715 | 0.9989 | 0.0558 |
| DNN | ONNX (Quantized Dynamic) | 0.000263 | 0.9850 | 0.9852 | 0.9937 | 0.9765 | 0.9991 | 0.0234 |
| DNN | ONNX (Quantized Static) | 0.000491 | 0.9817 | 0.9819 | 0.9884 | 0.9752 | 0.9939 | 0.0330 |
| ResNet1D | Keras (.h5) | 0.002330 | 0.9875 | 0.9874 | 0.9875 | 0.9874 | 0.9984 | 0.6771 |
| ResNet1D | ONNX (Original) | 0.019720 | 0.9875 | 0.9874 | 0.9876 | 0.9874 | 0.9984 | 0.2021 |
| ResNet1D | ONNX (Processed) | 0.020720 | 0.9875 | 0.9874 | 0.9876 | 0.9874 | 0.9984 | 0.2058 |
| CNN | Keras (.h5) | 0.003107 | 0.9902 | 0.9902 | 0.9944 | 0.9860 | 0.9992 | 0.3264 |
| CNN | ONNX (Original) | 0.022694 | 0.9902 | 0.9902 | 0.9944 | 0.9860 | 0.9992 | 0.1001 |
| CNN | ONNX (Processed) | 0.025479 | 0.9902 | 0.9902 | 0.9944 | 0.9860 | 0.9992 | 0.1043 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Çekiş, İ.K.; Ayrancı, B.; Salman, F.N.; Özçelik, İ. IP Spoofing Detection Using Deep Learning. Appl. Sci. 2025, 15, 9508. https://doi.org/10.3390/app15179508