Article

Anomaly Detection Against Fake Base Station Threats Using Machine Learning

Department of Computer Science, University of Colorado Colorado Springs, Colorado Springs, CO 80918, USA
*
Author to whom correspondence should be addressed.
J. Cybersecur. Priv. 2025, 5(4), 94; https://doi.org/10.3390/jcp5040094
Submission received: 28 September 2025 / Revised: 18 October 2025 / Accepted: 28 October 2025 / Published: 3 November 2025
(This article belongs to the Section Security Engineering & Applications)

Abstract

Mobile networking in 4G and 5G remains vulnerable to fake base stations. A fake base station can inject and manipulate radio resource control (RRC) protocol messages to disable the user equipment’s connectivity. To motivate our research, we empirically show, using our fake base station prototype against an off-the-shelf phone, that such a fake base station can hold the user equipment’s connectivity indefinitely. To defend against such threats, we design and build an anomaly detection system to detect fake base stations. It detects any base station’s deviations from the 4G/5G RRC protocol, which supports both the connectivity provision case (all works well and the user receives connectivity) and the connection-release case (the base station cannot provide connectivity at the time and thus releases the connection). Our scheme, based on unsupervised machine learning, dynamically and automatically controls and sets the detection parameters, which vary with mobility and the communication channel, and utilizes greater information to improve its effectiveness. Using software-defined radios and srsRAN, we implement a prototype of our scheme from sensing to data collection to machine-learning-based detection processing. Our empirical evaluations demonstrate the detection effectiveness and adaptability; i.e., our scheme accurately detects fake base stations deviating from the set protocol in mobile scenarios by adapting its model parameters. Our scheme achieves 100% accuracy in static scenarios against the fake base station threats. If the dynamic control is disabled, i.e., not adapting to mobility and different channel environments, the accuracy drops to 65–76%, but our scheme adjusts the model via dynamic training to recover to 100% accuracy.

1. Introduction

The mobile cellular network (4G/5G) relies on a base station, which serves as the bridge gateway between the wireless and wired communications. Fake base stations with malicious intent against the user equipment are a critical security issue. The 4G/5G authentication and key agreement (AKA) authenticates the user equipment and provides some protection in the communication between the user equipment and the backend core network (this control communication protocol is called non-access stratum (NAS) in 3GPP standardization of 4G LTE and 5G NR), including against fake base stations injecting digital spam messages and breaching the user equipment privacy to track the user equipment. However, the radio resource control (RRC) communication between the user equipment and base station, which precedes the NAS communication to/from the core network, remains vulnerable.
RRC in particular remains vulnerable against fake base stations’ threats to availability. We empirically show that a current 4G/5G mobile phone (the user equipment) connected to a real-world mobile network operator can no longer connect to the mobile network operator and its service is disrupted by well-timed injections and withdrawals by our simulated fake base station implementation.
In this paper, we design and build an anomaly detection system to detect such active injection threats to the victim user equipment’s availability. Our detection scheme implements network sensors to collect the networking packets and behaviors on the 4G/5G RRC protocol. For the detection processing, we use unsupervised learning to model the protocol-compliant networking behaviors (the normal behaviors following the protocol) and detect any deviations from the protocol compliance. Our machine learning application is also motivated by the dynamic nature of mobile cellular networking; e.g., the user equipment moves and its channel changes. Our scheme automatically re-trains the detection model when the user equipment moves and the channel changes.
We validate and evaluate our detection scheme using a software-defined radio (SDR) and open-source 4G/5G software (srsRAN and Open5GS). For our experimentation, we simulate and implement varying base station scenarios, including those which are protocol-compliant (no attack) and the various injection threats resulting in availability deprivation (under attack). Although we test against many injections, this paper focuses on the simulations of three threats: Withhold, Withhold-and-Release, and Reject. A fake base station can deprive availability in other ways (we test against them and briefly discuss those simulations), but we select these three threats for our presentation because the other threats cause the same availability-deprivation effect on current 4G/5G user equipment, are derivatives of these three, and only require greater attacker effort. We also simulate user equipment mobility by physically moving the user equipment.
Our experimental results show that our detection system is effective in detecting such threats. In cases where the wireless channel has not changed since the training, our detection system detects the fake base station’s threats with 100% accuracy. In cases where the training is lagging (and the user equipment has moved, changing its channel environment), we observe a decrease in the accuracy to 65–76%. Our paper demonstrates that machine learning and continuous training can effectively adjust the detection based on the dynamic mobility and channel changes in mobile cellular networking.
The rest of the paper is organized as follows. Section 2 provides a review of related work in anomaly detection using machine learning against fake base station threats. Section 3 discusses the background of the fake base station. Section 4 introduces the motivation regarding the fake base station. Section 5 discusses the details of our scheme. Section 6 outlines the prototype implementation from radio sensing to anomaly detection. Section 7 discusses the experimental results, and Section 8 gives future directions regarding our scheme. Finally, Section 9 concludes the paper with a summary of key importance and contributions.

2. Related Work

A fake base station presents a malicious attacker assuming the base station role in cellular networks, including 4G and 5G. Previous research has advanced the 4G/5G design and implementations to secure cellular networking against fake base stations to enhance privacy (user equipment ID of IMSI, e.g., [1,2,3]), integrity (RRC communication payloads including the highly critical, universally accessible warning alerts, e.g., [4,5,6], and others, e.g., [1,7,8,9,10,11]), and availability (e.g., [1,2,3,12,13,14,15]). Since our work focuses on the detection of fake base stations, in this section, we describe the relevant research in detection by using machine learning and beyond.

2.1. Unsupervised Learning for Fake Base Station

Unsupervised learning has been one of the most effective methods for fake base station detection because it adapts to both known and novel attacks when labeled data is unavailable. Nakarmi et al. proposed a reference signal received power (RSRP)-based machine learning detection approach [16], and Jin et al. [17] used the widely adopted LightGBM algorithm to identify false base stations by analyzing base station parameters against a fixed threshold. Similarly, Mubasshir et al. [18] proposed a technique to detect fake base stations in cellular networking using machine learning with Sequential-LSTM, achieving an accuracy of 92% and a false positive rate of 5.96%. However, those methods were applied to earlier mobile communication generations, and it is not known whether they are effective in the more complex 5G setting.
Recent studies have explored advanced machine learning techniques for fake base station (FBS) detection. Sun et al. [19] proposed an ensemble model using Temporal Graph Isolation Forest (TGIF) and Local Outlier Factors (LOF), though its reliance on complex temporal graphs increases computational cost and limits real-world applicability. Bolcek et al. [20] utilized deep learning on in-phase and quadrature (I/Q) signal data to detect FBS, but their SDR-based experiments lacked channel variation and mobility considerations. Kriaa et al. [21] integrated knowledge graphs with machine learning, but their model depended solely on SINR features. Park et al. [22] compared multiple supervised algorithms, achieving high accuracy but with notable computational overhead and higher noise sensitivity. In contrast, we use unsupervised learning for anomaly detection, which can detect any deviations from normal behavior, together with automatic threshold adjustment. Unsupervised learning models provide a robust defense against both known and emerging FBS attacks by continuously learning from new data and adapting to changes in network behavior.
Our previous work [23] utilizes a variational autoencoder for anomaly detection in 5G using the publicly available 5G-NIDD dataset [24]. More specifically, it analyzes the cellular networking packets between the user equipment and core network to detect anomalous behaviors and tests against denial-of-service (DoS) threats against the core network. This paper differs from this previous research in that it focuses on the fake base station threats on the RRC control communications between the user equipment and base station, which involves only a wireless communication link and occurs before the communications from/to the backend core network. Our scheme therefore employs the anomaly detection engine on the user equipment, senses the wireless communication signals, and detects the threats in the lower layers of the OSI network model in physical and link layers.

2.2. Detecting Fake Base Station Threat Beyond Machine Learning

Different methods have been applied to identify fake base stations using physical signal-based and digital communication features. Physical signal-based features include signal characteristics and metadata such as radio frequency (RF) fingerprints, signal strength, and cell identity information, representing communication’s most common physical behaviour. For example, Ali et al. [25] presented a method to detect fake base stations using RF fingerprinting to distinguish legitimate base stations. Similarly, Li et al. [26] developed an FBS-Radar system to gather spam messages and metadata from user equipment (UEs) to locate fake base stations. Signal-based methods also include Bin et al.’s [27] approach, which applies Density-Based Spatial Clustering of Applications with Noise (DBSCAN) to identify fake base stations through signal strength analysis.
On the other hand, digital communication features rely on network protocol data. For instance, a network rule-based false base station identification method was presented by Nakarmi et al. [28] to find cell identities that are compatible with 3GPP Radio Access Technologies (RATs) but unauthorized. Zhang et al. characterize fake base stations in China from real-world spam messages using machine learning [9]. Additionally, Purification et al. [12] proposed a fake base station detection scheme based on the time durations of the RRC and NAS procedures. The scheme achieved 100% accuracy, zero false positives, and zero false negatives by manually selecting multiple thresholds. Wen et al. [29] focus on Layer-3 attack detection using O-RAN-compliant infrastructure and real-time inspection. However, these existing approaches have limitations. Research on signal-based features and digital features to identify false base stations lacks experimental results for various scenarios, such as noise interference, automatic threshold setting, multiple features, and various real-world conditions.

3. 5G/4G Networking Background

3.1. 5G/4G Networking Entities

The 4G/5G cellular network consists of three major entities: core network, base station, and user equipment. The core network and the base stations belong to the operator network that provides cellular service (voice/data), and the user equipment is the service recipient. The user equipment receives cellular services from the core network through the base station, which acts as a gateway between them. The base stations are located at the edge of the service provider network and spread over a large geographical area. They connect to the user equipment using a wireless radio channel and to the core network using a wired connection.

3.2. Base Stations and Radio Resource Control (RRC)

Before receiving the cellular service from the core network, the user equipment establishes the RRC connection with the base station and the NAS connection with the core network through a sequence of protocol messages defined by 3GPP. Since the user equipment needs to set up the RRC connection with the base station before proceeding to the NAS connection, the base station needs to handle different connection scenarios, such as a connection failure with the core network or poor radio link quality with the user equipment, to ensure network access availability for the user equipment. 3GPP defines protocol messages for such base station behavior. Figure 1 shows the protocol messages among the entities for both the functional base station (Figure 1a) and the connection-release case (Figure 1b), in which the base station currently cannot provide connectivity. We consider the connection-release case in which the base station has a connection failure with the core network after establishing the RRC connection with the user equipment.
In both cases, the RRC communication starts with the user equipment listening to the broadcast MIB/SIB1 messages from the base stations, which contain the radio parameters needed to access the network. Upon receiving the network access parameters from the broadcast message, the user equipment selects the base station with the highest received signal power and initiates the radio resource control connection by sending an RRC Setup Request message. The base station responds to the setup request with an RRC Setup message, which assigns the radio network identity to the user equipment for further communication such as NAS connection setup.

3.3. Protocol-Compliant Case 1: When Base Station Provides Connectivity

When there is connectivity and a functional base station, the user equipment communicates with the core network to set up a NAS connection; its protocol is described in Figure 1a. At the beginning of the NAS connection, the user equipment and the core network perform mutual authentication and then establish security parameters (such as symmetric keys, algorithms, etc.) for further secure communication, which is called authentication and key agreement (AKA, e.g., 5G-AKA). The user equipment initiates the mutual authentication by sending a Registration Request, which contains a cryptographic challenge message for the core network. The base station plays a blind role in AKA by translating the messages received at the RRC and forwarding them towards the core network using a different set of protocols, which is out of the scope of this research.

3.4. Protocol-Compliant Case 2: When Base Station Cannot Provide Connectivity and Releases Connection

For the connection-release case of the base station (refer to Figure 1b), i.e., when there is a connection failure between the base station and the core network during the NAS connection, the base station releases the RRC connection by sending an RRC Release message and deletes the radio network identity assignment from its storage. Until it receives the RRC Release message from the base station, the user equipment can retransmit the initial Registration Request message at regular intervals. After receiving the release message from the base station, the user equipment restarts the RRC connection process by listening to the MIB/SIB1 messages from other nearby base stations.

4. Motivation: Fake Base Station Threat

4.1. Threat Model

We consider a highly feasible (and therefore high-security-risk) threat which can be implemented using software-defined radios and public, open-source software, as demonstrated with our implementation and simulations of the fake base station. The fake-base-station attacker has 4G/5G protocol implementation capabilities, transmitting as a base station. The RRC protocol interacting with the victim user equipment is sufficient, and the attacker does not need to communicate with the rest of the entities in the mobile network operator. Since the attacker communicates wirelessly to the victim user equipment, we assume that the attacker is within the communication range of the victim user equipment; i.e., it can transmit and receive the victim user equipment’s communications.
We focus on the attacker injecting coded transmissions with the goal of depriving the victim user equipment’s availability, as opposed to a passive attacker against privacy or confidentiality. Our threat focus, injection threats against availability, has higher threat feasibility and security risk than the other injection threats targeting integrity or authenticity, as our threats target RRC and therefore precede the NAS-layer control communications involving AKA protocol [30]. If the attacker is successful and deprives availability (i.e., no connectivity), then the victim user equipment cannot communicate to the core network via NAS-layer control communications. In contrast, other injection threats against integrity or authenticity require breaking the cryptographic protections in 5G AKA, e.g., [3,7,31]. Section 4.3 demonstrates that our threats targeting RRC are effective in depriving the victim user equipment’s availability.

4.2. Threat Vectors

The current 4G/5G implementation practices are vulnerable to the fake base station. We observe that any deviation from the RRC protocol (the standard protocol is described in Section 3) is effective in depriving availability. The attacker indefinitely disables the victim user equipment’s connectivity, which is consistent with [1,3,12,13,14].
In this research, we consider three fake base station threat vectors that deviate from the standard protocol, described in Section 3, to deprive the user equipment connection availability, which is illustrated in Figure 2. The three threat vectors are Withhold, Withhold-and-Release, and Reject. We experimentally verify that threats other than these three also have the same availability impact on the victim’s user equipment; i.e., the victim cannot connect indefinitely. However, we focus only on these three threats because other threats can be derived from these three threats. In our case, we achieve the same availability impact with our three threats that occur during the RRC connection. A fake base station can withhold any request message from the user equipment after the RRC connection to make it retransmit the message. It can also release the RRC after withholding it for a certain amount of time. Moreover, it can deviate from the standard RRC by rejecting the RRC connection after successfully setting up the RRC. The fake base station deviates in RRC because it cannot proceed with NAS connection due to 5G-AKA (described in Section 3.3).
These threats diverge from the 4G/5G protocol, described in Section 3 and depicted in Figure 1. The threats are different from the protocol-compliant and connection-release base station, since the base station releases the connection in a timely manner in the protocol-compliant case. We describe the three threat vectors which are the focuses and targets of our anomaly detection.

4.2.1. Withhold Threat

As shown in Figure 2a, the fake base station implements the Withhold threat by not forwarding the Registration Request message from the user equipment to the core network, which makes the user equipment keep resending the message at regular intervals indefinitely. In contrast with the connection-release case (described in Section 3.4), in the Withhold threat, the fake base station does not send the RRC Release message to release the connection with the user equipment.

4.2.2. Withhold-and-Release Threat

While in the Withhold threat, the fake base station does not send the RRC release at all, in Withhold-and-Release threat, the fake base station sends the RRC release after withholding for a certain amount of time. More specifically, Withhold-and-Release threat spoofs the connection-release case behavior of the base station, as illustrated in Figure 2b. The only difference is the fake base station has control over the withholding time duration but the base station in the connection-release case does not. We consider this threat to demonstrate that our anomaly detection scheme can efficiently detect this threat in a dynamic channel environment while the previous research [12] cannot because of manual threshold selection in a static channel environment.

4.2.3. Reject Threat

As the third threat vector, we consider a protocol deviation where the fake base station sends the RRC Reject message after receiving the RRC Setup Complete message from the user equipment. The base station denies the RRC connection in such a case. Figure 2c shows this threat vector, which we call the fake base station Reject threat. A protocol-compliant base station sends an RRC Reject message only when it is unable to provide network resources (e.g., radio network identity) during RRC setup. However, in this threat, the fake base station rejects the RRC connection when the user equipment has already established the RRC connection and is proceeding to set up a NAS connection.
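To make the three deviations concrete, the sketch below labels a user-equipment-side message trace by the behavior it exhibits. It is only an illustration of the threat vectors, not the detection scheme of this paper: the trace format, the function name `describe_session`, the use of `Authentication Request` as a marker of NAS progress, and the fixed `max_hold_s` cut-off are all assumptions introduced here. The fixed cut-off in particular is the kind of manually selected threshold that the learning-based scheme is meant to avoid.

```python
from typing import List, Tuple

# A trace is a list of (seconds since session start, message name) pairs.
# This representation is hypothetical and chosen only for illustration.
Trace = List[Tuple[float, str]]


def describe_session(trace: Trace, max_hold_s: float) -> str:
    """Label a UE-side trace by the protocol behavior it exhibits."""
    names = [name for _, name in trace]

    # Reject threat: RRC Reject arrives after RRC setup has already completed.
    if "RRC Setup Complete" in names and "RRC Reject" in names:
        if names.index("RRC Reject") > names.index("RRC Setup Complete"):
            return "reject"

    # NAS progress (assumed marker message) means connectivity is provided.
    if "Authentication Request" in names:
        return "connectivity-provided"

    requests = [t for t, name in trace if name == "Registration Request"]
    releases = [t for t, name in trace if name == "RRC Release"]
    if not requests:
        return "incomplete"
    if not releases:
        # Requests keep being retransmitted and no release is ever observed.
        return "withhold"

    # A release was sent: only the holding time separates the two cases.
    hold = releases[0] - requests[0]
    return "withhold-and-release" if hold > max_hold_s else "connection-release"


setup = [(0.0, "RRC Setup Request"), (0.1, "RRC Setup"), (0.2, "RRC Setup Complete")]
withhold = setup + [(0.3, "Registration Request"), (5.3, "Registration Request")]
late_release = setup + [(0.3, "Registration Request"), (60.3, "RRC Release")]
timely_release = setup + [(0.3, "Registration Request"), (3.3, "RRC Release")]
reject = setup + [(0.3, "RRC Reject")]

for trace in (withhold, late_release, timely_release, reject):
    print(describe_session(trace, max_hold_s=10.0))
```

Because the Withhold-and-Release trace differs from the connection-release trace only in timing, any fixed value of `max_hold_s` would need re-tuning whenever the channel changes the timing of the normal case.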

4.3. Threat Implementation: Depriving Availability in Real-World 4G/5G

We empirically demonstrate that the threat is effective against off-the-shelf, unmodified phones connected to a real-world mobile network operator. Our fake base station simulation based on software-defined radios and open-source srsRAN is described in greater detail in Section 6.1. Figure 3 shows that the Withhold threat can keep an off-the-shelf Android phone connected to itself for an indefinite time in a 4G/LTE network, depriving the phone’s connectivity availability. It shows the victim mobile phone connects to the fake base station because it has the highest signal strength but receives no service. The fake base station holds the RRC connection for an indefinite amount of time and withholds forwarding the registration request messages to the core network, as discussed in Section 4.2.1. Withhold-and-Release threat and Reject threat are also effective against the phone-based user equipment’s availability.

5. Our Scheme

In our research, we use an unsupervised learning algorithm for anomaly detection within 4G and 5G networks to detect fake base station threats. Because we build anomaly detection, detecting any deviations from the 4G/5G protocol including zero-day threats, we use unsupervised learning featuring an autoencoder. Unsupervised learning models the normal behaviors in training and, once deployed, detects any deviations from it. The normal behaviors for training include the connectivity-provision case (Figure 1a) and the connection-release case (Figure 1b). We test our anomaly detection scheme using these cases as well as additional cases introducing a fake base station. More specifically, the additional cases involving security threats are the Withhold threat (Figure 2a), Withhold-and-Release threat (Figure 2b), and Reject threat (Figure 2c). We simulate and implement such threats described in Section 4.2, which are successful against a real-world mobile phone.
The user equipment is equipped with a radio to process the RF signals on the frontend and the processor on the backend. More specifically, as shown in Figure 4, user equipment is equipped with a software-defined radio (SDR; the dotted box on the left in Figure 4) for the radio processing and a Mini PC for the digital backend processing (the dotted box on the right in Figure 4).
The user equipment is capable of communicating (i.e., transmitting and receiving) in 4G/5G networking. Our scheme builds on such existing capability, which is drawn in the top row of Figure 4. In SDR processing, the user equipment is equipped with a radio which includes an RF (radio frequency) module and a Demodulator/Decoding module. The RF module captures RF signals from a functional case, a connection-release case, and fake base stations. Then the captured signal is passed to the Demodulator/Decoding module where the RF signals are decoded into digital format for analysis. The digital data later enters the RRC layer. This layer extracts the signalling information like control messages.
Our scheme, including the sensing and the detection engine, is drawn in the bottom row in Figure 4. We sense and collect data to input to our detection engine from both the SDR radio and the Mini PC computer. The collected data from our sensing is then preprocessed to extract important features such as packet number, size (B), layers, Packet Information Summary and time differences between packets with normalization to ensure uniform scaling.
Our scheme dynamically controls and adapts the autoencoder parameters. To adapt to the mobility and the change in communication channels, the autoencoder is trained on normal network traffic data from functional- and connection-release-case base stations to minimize the reconstruction errors for normal behaviour. Once trained, the autoencoder processes new network traffic data, and any significant deviation from the normal patterns learned is classified as an anomaly.

6. Prototype Implementation from Radio Sensing to Anomaly Detection

6.1. 4G/5G Prototype Implementation

We implement our prototype base stations, user equipment, and core network to generate our dataset for anomaly detection using available open-source software and hardware. We use 5G base station prototype software, srsRAN [32], to simulate a functional case, a connection-release case, and three fake base station threat vectors as discussed in Section 3.3, Section 3.4 and Section 4.2 respectively. We modify the srsRAN code of RRC to implement fake base station threat vectors. We use the 4G/5G prototype suite, srsRAN_4G [33], to simulate user equipment and collect data by tracing the RRC and NAS packets using the Wireshark packet capturing tool. Figure 5a shows the experimental setup of our prototype implementation for sensing the protocol messages. We use two software-defined radios (Ettus USRP B210) connected to the base station and the user equipment for the radio connection between them. We use Open5GS [34] to implement the core network in the same machine as the base station. For UE backend processing, we implement UE on a Mini PC (Intel 12th Gen N95 3.4 GHz, 8 GB RAM, Ubuntu 24.04 LTS), while the core network and base station were hosted on a desktop computer (4GHz Quad-Core Intel Core i7, 16 GB RAM, Ubuntu 24.04 LTS).

6.2. Mobility Experiment

Figure 5 shows our experimental setup. Figure 5a shows the hardware simulating the entities, with the core network co-located with the base station on the same computer. Figure 5b shows the LOS vs. NLOS channels, varying the user equipment locations.
We simulate the user equipment mobility by physically moving the user equipment, which affects the communication channels between the user equipment and the base station. The mobility simulation is important since it changes the anomaly detection parameter, motivating the machine-learning-based automatic control of such parameters. We collect packet trace data from user equipment at two different locations while it is connected to a base station (either functional, connection-release case or fake). To vary the radio channels between the user equipment and the base station, we use line-of-sight (LOS) and non-line-of-sight (NLOS) communication while keeping them 5 m apart. In LOS channel conditions, there is a clear and uninterrupted path between the user equipment and the base station that allows strong signal transmission. In NLOS channel conditions, barriers such as building walls hinder a direct signal path, resulting in signal loss; in our setup, the NLOS nodes were placed in different rooms with a physical wall in between. Figure 5 shows the location and distance of the user equipment and the base station. The user equipment moves between these locations to vary the communication channel and environment. In each environment, we vary and simulate the different base station scenarios: functional, connection-release case, fake with Withhold threat, fake with Withhold-and-Release threat, and fake with Reject threat.

6.3. Dataset Collection

We use our implementation described in Section 6.1 to collect data. For our data collection, we vary the base station implementation simulation between normal (no attack and protocol-compliant) and attacker (varying the threat vectors as described in Section 4.2). We also vary the wireless channel conditions to simulate mobility.
The dataset is divided into LOS and NLOS sections, each containing five categories: legitimate, connection-release case, Withhold threat, Withhold-and-Release threat and Reject threat. In the LOS setting, data was collected over 23 h, with each sample representing a full communication session between the user equipment and the base station, lasting approximately 4.76 min. This resulted in 582 training and 295 testing samples per threat type. Similarly, in the NLOS setting, data was collected over two days, with each sample taking 8.55 min, yielding 528 training and 319 testing samples per threat.
Each sample covers a complete RRC session, from the initial broadcast (MIB/SIB1) to connection release, capturing all control and retransmission events. The session durations of 4.76 min for LOS and 8.55 min for NLOS were determined to represent full protocol cycles under stable and obstructed channel conditions. The longer duration in NLOS reflects increased retransmission and timeouts due to signal attenuation and interference.

6.4. Machine Learning: Preprocessing Data

Raw network traffic data is preprocessed to extract the relevant features. These include packet number, size (B), layers, Packet Information Summary and time difference (s), which capture the behaviour of the base stations while communicating with the user equipment. The time difference between consecutive packets is a new, derived feature created to capture dynamic patterns in network activities.
In this work, the extracted features are derived primarily from RRC-layer packet metadata rather than from physical-layer signal characteristics such as reference signal received power (RSRP). These metadata-based features effectively represent the behaviour of signaling exchanges between the base station and the UE. To learn efficiently, the features are normalized to ensure uniform scaling, which is essential for the autoencoder. Normalization lessens the impact of outliers and prevents certain features from unnecessarily affecting the learning process.
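The two preprocessing steps described above, deriving the inter-packet time difference and normalizing each feature, can be sketched as follows. This is a minimal illustration under stated assumptions: the text does not specify the normalization method, so min-max scaling to [0, 1] is assumed here, and the packet records, field names, and helper names (`add_time_differences`, `min_max_normalize`) are hypothetical.

```python
from typing import List


def add_time_differences(timestamps: List[float]) -> List[float]:
    """Return the gap in seconds between each packet and the previous one."""
    if not timestamps:
        return []
    # The first packet has no predecessor, so its gap is set to 0.0.
    return [0.0] + [b - a for a, b in zip(timestamps, timestamps[1:])]


def min_max_normalize(values: List[float]) -> List[float]:
    """Scale values to [0, 1]; a constant feature maps to all zeros."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical packet records: capture time (s) and size (B).
packets = [
    {"time": 0.0, "size": 64},
    {"time": 0.5, "size": 128},
    {"time": 2.5, "size": 64},
]

time_diff = add_time_differences([p["time"] for p in packets])
features = {
    "time_diff": min_max_normalize(time_diff),
    "size": min_max_normalize([float(p["size"]) for p in packets]),
}
print(features)
```

In practice the scaling bounds would be fitted on the training data only and reused at test time, so that attack traffic does not influence the normalization.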

6.5. Autoencoder ML Training

Our scheme, based on an autoencoder model, is designed to learn the normal behavior of network traffic by encoding and reconstructing key features extracted from RRC-layer messages. We therefore use the normal (protocol-compliant) data to train the model (while the threat events with the fake base station attacker are later used for testing), making it unsupervised learning.
We use an autoencoder in our anomaly detection system. An autoencoder is an unsupervised learning algorithm trained to encode input data into a lower-dimensional representation and subsequently reconstruct the input from this latent representation. We trained the autoencoder on normal network traffic data so that it learns the structure of legitimate base station behaviour. The goal of the training phase is to minimize the reconstruction error so that the model accurately reproduces normal behaviour.
The architecture comprises three hidden layers with dimensions 128, 64, and 128, using ReLU activation functions to introduce non-linearity, as summarized in Table 1. The model is trained using the Adam optimizer with a learning rate of 0.001, batch size of 32, and 50 epochs, minimizing the mean squared error (MSE) between the input and its reconstruction.
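For reference, the mean squared error objective can be written out as below. The notation is introduced here only for illustration and does not appear in the original text: $x_n \in \mathbb{R}^d$ is the $n$-th normalized feature vector, $\hat{x}_n$ is its reconstruction by the autoencoder, $d$ is the number of features, and $N$ is the number of training samples.

```latex
\mathcal{L}_{\mathrm{MSE}}
  = \frac{1}{N} \sum_{n=1}^{N} \frac{1}{d} \sum_{i=1}^{d}
    \left( x_{n,i} - \hat{x}_{n,i} \right)^{2}
```

The inner average is the per-sample reconstruction error, which is the quantity later compared against the detection threshold.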
Figure 6 illustrates the training loss and validation loss for each epoch during the training process for the Withhold threat with NLOS. Each line represents one epoch during training, where an epoch is one complete pass over all the training data, and shows the training loss and validation loss for that epoch. The number of epochs is set to 50. Figure 6 reports the average loss over the entire training and validation datasets. The training loss decreases steadily, indicating that the model learns and fits the training data well. The validation loss also decreases consistently and stays close to the training loss, indicating that the model performs well on both the training and validation datasets and is suitable for anomaly detection against fake base stations.

6.6. Anomaly Detection

An anomaly is defined as any notable difference between the original input and its reconstruction by the trained autoencoder. Since fake base stations deviate from the usual network pattern, they yield higher reconstruction errors than legitimate base stations. An input whose anomaly score exceeds a predefined threshold is flagged as a potential fake base station.
The latent space size, the optimizer’s learning rate, and the batch size during training are the key hyperparameters for the autoencoder model. These parameters were tuned through cross-validation to optimize performance on the training dataset, focusing on minimizing false positives for fake base stations.

7. Experimental Results for Detection

We present experimental results that evaluate the effectiveness of anomaly detection against fake base station threats using an unsupervised autoencoder. The experiments were conducted in both line-of-sight (LOS) and non-line-of-sight (NLOS) environments, as shown in Figure 5b. The results are structured to assess various aspects of our solution’s accuracy and detection capabilities.

7.1. Mobility Affects the Channel Environment

In a cellular network, the user equipment mobility affects the channel performance between the user equipment and the base station. More specifically, the channel performance changes when the user equipment moves between the LOS and NLOS channels, degrading when it moves from LOS to NLOS. Table 2 shows channel performance measurements with received signal power, signal-to-noise ratio, and bit error rate for the LOS and NLOS channel environments. As shown in the table, the LOS channel has better channel performance than the NLOS channel. In the LOS channel, the user equipment receives higher signal power than in the NLOS channel. 3GPP defines the received signal power measurement as reference signal received power (RSRP); a higher RSRP signifies a better channel. Signal quality is measured by the signal-to-noise ratio (SNR), and a higher SNR indicates higher signal quality. The table shows that the LOS channel has an approximately 2.9 dB higher SNR than the NLOS channel (13.724 dB vs. 10.848 dB). Bit error rate (BER) is another channel performance measure; the lower the BER, the better the channel. As shown in Table 2, the LOS channel has a lower BER than the NLOS channel (2.985% vs. 3.624%).
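As a quick worked check of the SNR gap, the dB difference reported in Table 2 can be converted to a linear power ratio. The snippet below is only illustrative arithmetic on the reported values; the variable names are chosen here for the example.

```python
# SNR values reported in Table 2 (dB).
snr_los_db = 13.724
snr_nlos_db = 10.848

# A difference in dB corresponds to a ratio in linear power:
# ratio = 10 ** (delta_dB / 10).
delta_db = snr_los_db - snr_nlos_db
linear_ratio = 10 ** (delta_db / 10)

print(f"SNR gap: {delta_db:.3f} dB, linear ratio: {linear_ratio:.2f}x")
```

The gap of about 2.9 dB corresponds to a linear SNR in the LOS channel that is slightly below twice that of the NLOS channel.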

7.2. Accuracy Analyses for Anomaly Detection

We experimentally measure the accuracy when the user equipment is static vs. dynamically moving. To measure detection performance, we use the standard metric of accuracy, $\text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$, where TP = true positives, FP = false positives, FN = false negatives, and TN = true negatives. Other accuracy-related measures such as F1 score, recall, and precision are omitted in this paper.
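The accuracy metric defined above can be computed directly from the confusion-matrix counts. The following is a minimal sketch; the function name and the example counts are hypothetical and are not taken from our experiments.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of instances classified correctly: (TP + TN) / (TP + TN + FP + FN)."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


# Hypothetical counts for illustration only: no false positives, 20 missed anomalies.
example = accuracy(tp=30, tn=50, fp=0, fn=20)
print(f"accuracy = {example:.2%}")
```

With zero false positives, as in this example, every loss of accuracy comes from false negatives, i.e., anomalous instances that are classified as normal.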
We also analyze the confusion matrix and the reconstruction error distribution for all cases. However, we only show these for the static case (the user equipment does not move) and against the Withhold threat (the most challenging case in our experimentation) in Figure 7 and Figure 8. These analyses for the other cases are omitted, and we focus on the accuracy results. We detect anomalies using a Z-score-based threshold. The Z-score is the difference between the reconstruction error of the current instance and the average reconstruction error, divided by the standard deviation of the reconstruction errors.
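The Z-score thresholding described above can be sketched as follows. This is an illustration under stated assumptions rather than our exact implementation: the function names, the threshold value of 3.0, the use of the population standard deviation, and the choice to estimate the mean and standard deviation from reconstruction errors on normal traffic are all assumptions, and the error values are made up for the example.

```python
from statistics import mean, pstdev
from typing import List, Tuple


def fit_reference(errors: List[float]) -> Tuple[float, float]:
    """Mean and (population) standard deviation of reference reconstruction errors."""
    return mean(errors), pstdev(errors)


def z_scores(errors: List[float], mu: float, sigma: float) -> List[float]:
    """Z-score of each reconstruction error relative to the reference statistics."""
    if sigma == 0:
        raise ValueError("standard deviation is zero; Z-score is undefined")
    return [(e - mu) / sigma for e in errors]


def flag_anomalies(errors: List[float], mu: float, sigma: float,
                   threshold: float = 3.0) -> List[bool]:
    """Flag an instance as anomalous when its Z-score exceeds the threshold."""
    return [z > threshold for z in z_scores(errors, mu, sigma)]


# Hypothetical reconstruction errors (MSE) on normal traffic.
normal_errors = [0.010, 0.012, 0.011, 0.009, 0.013]
mu, sigma = fit_reference(normal_errors)

# One error close to the normal range and one far above it.
flags = flag_anomalies([0.011, 0.050], mu, sigma)
print(flags)
```

Only large positive deviations are flagged in this sketch, since a fake base station is expected to increase the reconstruction error rather than decrease it.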

7.2.1. User Equipment Is Static

The performance metrics show how well the model detects anomalies caused by fake base station threats. The anomaly detection model for both LOS and NLOS with three threat vectors achieved an accuracy of 1.00, indicating that 100% of instances are correctly classified as normal or anomalous. Our scheme produces zero false positives and zero false negatives in both LOS and NLOS channel conditions, and therefore the accuracy is 100%. The confusion matrices for the Withhold threat are shown in Figure 7. These results suggest that anomaly detection based on the autoencoder and the threshold performs well on the dataset across the threat vectors.

7.2.2. User Moves: Channel Changes from LOS to NLOS

To evaluate our scheme in the mobile scenario, we use LOS data for training and NLOS data for testing. The accuracy drops, especially for the Withhold threat, because the signal characteristics and noise differ substantially between the LOS and NLOS channel conditions. If our scheme’s dynamic control is disabled, the autoencoder trained on LOS data does not generalize to the unseen NLOS data, and the accuracy drops to 75.43%, as shown in Figure 9. Our scheme produced zero false positives but 110,699 false negatives, meaning that the model missed 24.76% of anomalous instances during cross-testing between the LOS and NLOS channel conditions for the Withhold threat.

7.2.3. User Moves: Channel Changes from NLOS to LOS

We use NLOS data for training and LOS data for testing to evaluate our scheme in another mobile situation. The larger drop in accuracy for the Withhold threat occurs because the model learns patterns shaped by signal interference and obstacles during training. When tested with clear LOS data, the model struggles to identify anomalies because it was fitted to the noisier conditions. Our scheme does not perform well when the model is trained under noisier conditions (NLOS) and tested under clear conditions (LOS). Compared with the previous scenario, the accuracy drops further, to 64.69%, as shown in Figure 9. Our scheme flags zero false positives, but the number of false negatives is 185,992, meaning that 35.35% of anomalous instances were misclassified for the Withhold threat. The accuracy would stay at this level if the dynamic control were disabled.

7.2.4. User Moves and Dynamically Updates Training

If the user equipment moves and our scheme enables the dynamic control to update the training, the accuracy recovers to 100% from the degraded levels in Section 7.2.2 and Section 7.2.3. The dynamic training overhead is between 17.20 s and 20.41 s, depending on the threat vector.

8. Future Directions

8.1. Advancing Detection Defense

Our current scheme uses RRC packet information, i.e., information from demodulated and decoded packets. We can advance our detection by including more information in our machine learning. In particular, signal-based wireless channel information, such as received signal power and signal-to-noise ratio (SNR), can be used for our detection. Furthermore, we can combine our scheme with the existing approaches described in Section 2.1, which will require comparison and tradeoff analyses.

8.2. More Threats Beyond Our Three Threats to Availability

In this research, we focus on the fake base station threats to user connection availability at the RRC layer with three threat vectors. Although we have not implemented and tested other protocol-deviant behavior discovered by previous research (discussed in Section 4.2), in principle our scheme can detect those attacks as well.
We focus on the protocol-deviant behavior of the fake base station that disrupts user equipment service availability. However, the protocol messages in the RRC layer do not have confidentiality or integrity protection. The attacker cannot go beyond RRC because the upper layers are protected using 5G-AKA at the NAS layer. Within RRC, however, the attacker can perform attacks beyond availability while complying with the protocol messages, such as revealing the user equipment identity by IMSI catching [35], redirecting the connection from 5G/4G to 3G/2G [8,31] (also known as bidding down), and device capability identification [31]. In these cases, the attacker modifies the contents of the protocol messages. For example, the attacker can send an RRC release message with an instruction to downgrade the connection from 4G to 2G. A future research direction is the detection of such anomalies that comply with the protocol but modify the contents of the protocol messages.

8.3. Modularity, Standardization, and Transition to Practice

Our anomaly detection scheme is designed for modularity; i.e., it can be implemented as a module that is separate from the existing cellular networking, e.g., as a mobile application. While our scheme uses the cellular networking sensing and design to inform and take defense against fake base stations, it can be enabled or disabled without interfering with the cellular networking operations. Such modularity can facilitate practicality.
3GPP standardization has traditionally focused on the cellular infrastructure (e.g., base station and core network) or the cellular-essential communication functionalities (e.g., the baseband protocol on the user equipment). However, awareness and initiatives are increasing for future generations of cellular networking, including beyond-5G or 6G. 3GPP is considering architectural enhancements on the network operator side to use machine learning for data analytics [30]. We can build on such an approach and architecture but add mechanisms for security, i.e., to protect the availability against fake base stations. Further developing and standardizing our scheme within such a scope at the user equipment can facilitate the transition of our research to practice.

9. Conclusions

In this study, we design and build an anomaly detection scheme to defend against fake base stations in 4G and 5G networks. Our scheme uses unsupervised learning based on an autoencoder to detect any deviations from the 4G/5G RRC protocol. We train and model the detection profile based on the protocol-compliant behaviors of the base station, which include the connectivity-provision case as well as the connection-release case (which corresponds to the case of legitimately not having connectivity, often temporarily). We simulate and implement the fake base station threats, including the Withhold, Withhold-and-Release, and Reject threats; we show that our threat prototypes effectively disable connectivity against real-world off-the-shelf mobile phones. We test our scheme against these threat prototypes. We validate our scheme using a prototype implementation with software-defined radios and the open-source tools srsRAN and Open5GS. The experimental results demonstrate 100% accuracy when the user equipment does not move. When the user equipment moves (in our case, changing between LOS and NLOS wireless channels), our experiments show that the accuracy can drop to 65–76%. However, our scheme can recover to 100% accuracy by dynamically training to adjust to the new channel, the overhead of which is 17.20 s to 20.41 s, depending on the channel and threat vector.

Author Contributions

Conceptualization, A.I. and S.-Y.C.; Methodology, A.I. and S.-Y.C.; Validation and Software, S.P.; Writing – Original Draft Preparation, A.I. and S.-Y.C.; Review and Editing, S.P. and S.-Y.C.; Supervision and Funding, S.-Y.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research did not receive any external funding.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

  1. Shaik, A.; Borgaonkar, R.; Asokan, N.; Niemi, V.; Seifert, J.P. Practical attacks against privacy and availability in 4G/LTE mobile communication systems. arXiv 2015, arXiv:1510.07563.
  2. Hussain, S.; Chowdhury, O.; Mehnaz, S.; Bertino, E. LTEInspector: A systematic approach for adversarial testing of 4G LTE. In Proceedings of the Network and Distributed Systems Security (NDSS) Symposium, San Diego, CA, USA, 18–21 February 2018.
  3. Hussain, S.R.; Echeverria, M.; Karim, I.; Chowdhury, O.; Bertino, E. 5GReasoner: A Property-Directed Security and Privacy Analysis Framework for 5G Cellular Network Protocol. In Proceedings of the ACM SIGSAC Conference on Computer and Communications Security, London, UK, 11–15 November 2019; pp. 669–684.
  4. Bitsikas, E.; Pöpper, C. You have been warned: Abusing 5G’s Warning and Emergency Systems. In Proceedings of the 38th Annual Computer Security Applications Conference, Austin, TX, USA, 5–9 December 2022; pp. 561–575.
  5. Lee, G.; Lee, J.; Lee, J.; Im, Y.; Hollingsworth, M.; Wustrow, E.; Grunwald, D.; Ha, S. This is your president speaking: Spoofing alerts in 4G LTE networks. In Proceedings of the 17th Annual International Conference on Mobile Systems, Applications, and Services, Seoul, Republic of Korea, 17–21 June 2019.
  6. Purification, S.; Chang, S.Y. Verifiable Alerts for 4G/5G Public Warning System. In Proceedings of the IEEE Conference on Communications and Network Security (CNS), Avignon, France, 8–11 September 2025.
  7. Kim, H.; Lee, J.; Lee, E.; Kim, Y. Touching the Untouchables: Dynamic Security Analysis of the LTE Control Plane. In Proceedings of the IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 19–23 May 2019; pp. 1153–1168.
  8. Karakoc, B.; Fürste, N.; Rupprecht, D.; Kohls, K. Never Let Me Down Again: Bidding-Down Attacks and Mitigations in 5G and 4G. In Proceedings of the 16th ACM Conference on Security and Privacy in Wireless and Mobile Networks, WiSec ’23, Guildford, UK, 29 May–1 June 2023.
  9. Zhang, Y.; Liu, B.; Lu, C.; Li, Z.; Duan, H.; Hao, S.; Liu, M.; Liu, Y.; Wang, D.; Li, Q. Lies in the Air: Characterizing Fake-Base-Station Spam Ecosystem in China. In Proceedings of the ACM SIGSAC Conference on Computer and Communications Security, Virtual, 9–13 November 2020; pp. 521–534.
  10. Wen, H.; Porras, P.; Yegneswaran, V.; Lin, Z. Thwarting Smartphone SMS Attacks at the Radio Interface Layer. In Proceedings of the 30th Annual Network and Distributed System Security Symposium (NDSS), San Diego, CA, USA, 27 February–3 March 2023.
  11. Purification, S.; Wuthier, S.; Kim, J.; Kim, I.; Chang, S.Y. Base Station Certificate and Authentication for 5G Radio Control Security. In Proceedings of the 2025 IEEE 22nd International Conference on Mobile Ad-Hoc and Smart Systems (MASS), Chicago, IL, USA, 6–8 October 2025.
  12. Purification, S.; Wuthier, S.; Kim, J.; Kim, J.; Chang, S.Y. Fake Base Station Detection and Blacklisting. In Proceedings of the 2024 33rd International Conference on Computer Communications and Networks (ICCCN), Kailua-Kona, HI, USA, 29–31 July 2024; pp. 1–9.
  13. Bitsikas, E.; Pöpper, C. Don’t Hand It Over: Vulnerabilities in the Handover Procedure of Cellular Telecommunications. In Proceedings of the 37th Annual Computer Security Applications Conference, Virtual, 6–10 December 2021; pp. 900–915.
  14. Chang, S.Y.; Purification, S. Securing Cellular Availability: The Wireless Blackhole Threat and Defense. In Proceedings of the IEEE Conference on Communications and Network Security (CNS), Avignon, France, 8–11 September 2025.
  15. Purification, S.; Kim, J.; Kim, J.; Chang, S.Y. Fake Base Station Detection and Link Routing Defense. Electronics 2024, 13, 3474.
  16. Nakarmi, P.K.; Sternby, J.; Ullah, I. Applying Machine Learning on RSRP-Based Features for False Base Station Detection. In Proceedings of the 17th International Conference on Availability, Reliability and Security, Vienna, Austria, 23–26 August 2022; pp. 1–7.
  17. Jin, J.; Lian, C.; Xu, M. Rogue Base Station Detection Using a Machine Learning Approach. In Proceedings of the 2019 28th Wireless and Optical Communications Conference (WOCC), Beijing, China, 9–10 May 2019; pp. 1–5.
  18. Mubasshir, K.S.; Karim, I.; Bertino, E. FBSDetector: Fake Base Station and Multi-Step Attack Detection in Cellular Networks Using Machine Learning. arXiv 2024, arXiv:2401.04958.
  19. Sun, S.; Abualhaol, I.; Poitau, G.; Esswie, A.; Repeta, M. An Ensemble Approach for Fake Base Station Detection Using Temporal Graph Analysis and Anomaly Detection. In Proceedings of the 2024 Wireless Telecommunications Symposium (WTS), Oakland, CA, USA, 10–12 April 2024; pp. 1–6.
  20. Bolcek, J.; Kufa, J.; Harvanek, M.; Polak, L.; Kral, J.; Marsalek, R. Deep Learning-Based Radio Frequency Identification of False Base Stations. In Proceedings of the 2023 Workshop on Microwave Theory and Technology in Wireless Communications (MTTW), Riga, Latvia, 4–6 October 2023; pp. 45–49.
  21. Kriaa, S.; Feki, A.; Papillon, S.; Chene, T.; Ouattara, I. Detecting Fake Base Stations Using Knowledge Graphs and ML-Based Techniques. In Proceedings of the 2023 IEEE Virtual Conference on Communications (VCC), Virtual, 28–30 November 2023; pp. 37–42.
  22. Park, H.; Son, D.; Kim, G.; You, I. A Study on Machine Learning-Based False Base Station Detection Method in 5G. In Proceedings of the 6th International Symposium on Mobile Internet Security (MobiSec’22), Jeju Island, Republic of Korea, 15–17 December 2022; pp. 1–7.
  23. Islam, A.; Chang, S.Y.; Kim, J.; Kim, J. Anomaly Detection in 5G Using Variational Autoencoders. In Proceedings of the 2024 Silicon Valley Cybersecurity Conference (SVCC), Seoul, Republic of Korea, 17–19 June 2024; pp. 1–6.
  24. Siriwardhana, Y.; Samarakoon, S.; Porambage, P.; Liyanage, M.; Chang, S.Y.; Kim, J.; Kim, J.; Ylianttila, M. Descriptor: 5G Wireless Network Intrusion Detection Dataset (5G-NIDD). IEEE Data Descr. 2025, 1–12.
  25. Ali, A.; Fischer, G. Enabling Fake Base Station Detection Through Sample-Based Higher Order Noise Statistics. In Proceedings of the 2019 42nd International Conference on Telecommunications and Signal Processing (TSP), Budapest, Hungary, 1–3 July 2019; pp. 695–700.
  26. Li, Z.; Wang, W.; Wilson, C.; Chen, J.; Qian, C.; Jung, T.; Liu, Y. FBS-Radar: Uncovering Fake Base Stations at Scale in the Wild. In Proceedings of the Network and Distributed System Security Symposium (NDSS), San Diego, CA, USA, 26 February–1 March 2017.
  27. Bin, Q.; Cai, Z.; Xiao, Y.; Liang, H.; Su, S. Rogue Base Stations Detection for Advanced Metering Infrastructure Based on Signal Strength Clustering. IEEE Access 2019, 8, 158798–158805.
  28. Nakarmi, P.K.; Ersoy, M.A.; Soykan, E.U.; Norrman, K. Murat: Multi-RAT False Base Station Detector. arXiv 2021, arXiv:2102.08780.
  29. Wen, H.; Porras, P.; Yegneswaran, V.; Gehani, A.; Lin, Z. 5G-Spector: An O-RAN Compliant Layer-3 Cellular Attack Detection Service. In Proceedings of the 31st Annual Network and Distributed System Security Symposium (NDSS), San Diego, CA, USA, 26 February–1 March 2024.
  30. 3GPP TS 23.288 V18.5.0. Architecture Enhancements for 5G System (5GS) to Support Network Data Analytics Services. 2024. Available online: https://www.3gpp.org/DynaReport/23288.htm (accessed on 27 October 2025).
  31. Shaik, A.; Borgaonkar, R.; Park, S.; Seifert, J.P. New Vulnerabilities in 4G and 5G Cellular Access Network Protocols: Exposing Device Capabilities. In Proceedings of the 12th Conference on Security and Privacy in Wireless and Mobile Networks, Miami, FL, USA, 15–17 May 2019; pp. 221–231.
  32. Software Radio Systems. srsRAN Project. 2024. Available online: https://github.com/srsran/srsRAN_Project/releases/tag/release_24_10_1 (accessed on 27 October 2025).
  33. Software Radio Systems. srsRAN 4G. 2023. Available online: https://github.com/srsran/srsRAN_4G/releases/tag/release_23_11 (accessed on 27 October 2025).
  34. Lee, S. Open5GS. 2022. Available online: https://github.com/open5gs (accessed on 27 October 2025).
  35. Strobel, D. IMSI Catcher. Chair for Communication Security; Ruhr-Universität Bochum: Bochum, Germany, 2007; p. 14.
Figure 1. Radio resource control (RRC) and non-access stratum (NAS) connection setup messages with protocol-compliant base stations. There are two cases: (a) the connectivity-provision case with functional base station and (b) the connection-release case when the base station cannot support connectivity at the time.
Figure 2. Fake base station threat vectors in radio resource control (RRC) connection. The core network is not involved in RRC and thus omitted from these diagrams. (a) Withhold threat. (b) Withhold-and-Release threat. (c) Reject threat.
Figure 3. A fake base station successfully attacks an off-the-shelf phone.
Figure 4. The signal and data processing chain of our scheme for anomaly detection.
Figure 5. 4G/5G prototype implementation for sensing and experimental setup for data collection. (a) The hardware setup. (b) The experimental setup varying channels and user equipment locations.
Figure 6. Training history of Withholding threat with NLOS.
Figure 7. The confusion matrices when user equipment does not move and against Withholding threat. (a) LOS channel. (b) NLOS channel.
Figure 8. The reconstruction error distribution when user equipment does not move and against Withholding threat. (a) LOS channel. (b) NLOS channel.
Figure 9. Accuracy performance of different threats with cross-test LOS-NLOS and NLOS-LOS.
Table 1. Autoencoder training hyperparameters.
| Parameter | Description | Value |
|---|---|---|
| Latent dimension | Size of the encoded feature space | 64 |
| Optimizer | Optimization algorithm used during training | Adam |
| Learning rate | Step size for parameter updates | 0.001 |
| Batch size | Number of samples per gradient update | 32 |
| Activation | Non-linear activation function | ReLU |
| Epochs | Number of complete training iterations | 50 |
Table 2. Channel performance measurements in LOS and NLOS channels at the user equipment.
| Measurement | LOS | NLOS |
|---|---|---|
| BER (%) | 2.985 | 3.624 |
| SNR (dB) | 13.724 | 10.848 |
| RSRP (dBm) | −1.284 | −9.545 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Islam, A.; Purification, S.; Chang, S.-Y. Anomaly Detection Against Fake Base Station Threats Using Machine Learning. J. Cybersecur. Priv. 2025, 5, 94. https://doi.org/10.3390/jcp5040094

