Review

Data Traffic Prediction for 5G and Beyond: Emerging Trends, Challenges, and Future Directions: A Scoping Review

by Evangelos Lykakis, Ioannis O. Vardiambasis * and Evangelos Kokkinos
Department of Electronic Engineering, Hellenic Mediterranean University, 73133 Chania, Crete, Greece
* Author to whom correspondence should be addressed.
Electronics 2025, 14(23), 4611; https://doi.org/10.3390/electronics14234611
Submission received: 15 October 2025 / Revised: 17 November 2025 / Accepted: 17 November 2025 / Published: 24 November 2025

Abstract

Accurate forecasting of data traffic plays a critical role in the administration of 5G networks, as it enables optimal routing, detection of network anomalies, and improved management of network resources. The latter aspect contributes significantly to energy conservation, quality of experience (QoE), and quality of service (QoS). This article offers a thorough analysis of the extant literature on data traffic prediction. It commences with an investigation of the primary obstacles associated with predicting data traffic within cellular networks. Subsequently, an in-depth analysis of data traffic patterns is conducted, considering their unique attributes, and the prediction methodologies applicable to each pattern are detailed in relation to the prevailing literature. Following this, a critique of contemporary methodologies for predicting data traffic in mobile networks is presented, accentuating their respective impacts on network management. These methodologies are classified into traditional approaches (statistical and time series techniques) and contemporary approaches that exploit machine learning. In conclusion, this review not only investigates the nascent trends in mobile data traffic prediction but also proposes a novel framework for future research intended to increase the predictive accuracy and computational efficiency of the predictions while concurrently protecting personal information.

1. Introduction

In recent times, following the advent of intelligent mobile devices, the global transmission of data via cellular networks has experienced rapid growth [1,2,3]. The incessant increase in mobile network traffic generates a substantial volume of data. The 5G network, characterized by its dense heterogeneous architecture comprising macrocells and microcells, has been investigated to identify flexible approaches for balancing network traffic load, thereby enhancing spectral efficiency. Functioning as a next-generation connectivity foundation, 5G accommodates workloads with extreme throughput needs, tight latency budgets, and rigorous reliability constraints. The concurrent surge in handheld devices, massive IoT integration, and sophisticated technologies (VR, autonomous transportation, smart urban infrastructure) has consequently yielded explosive growth in mobile data traffic [4,5,6,7,8,9]. Precise forecasting of data traffic is crucial for guaranteeing an exceptional user experience, efficient use of the network, and optimum allocation of resources. One example of how accurate traffic forecasting helps optimize the 5G network and beyond is the Coordinated Multi-Point (CoMP) technique, where forecasts of where and when traffic is expected to increase allow cooperation to be activated and let the network decide which Base Stations (BSs) should cooperate (the Coordination Set) [10]. In this work, we conduct a comprehensive examination of the key challenges and emerging directions in 5G data traffic prediction, with a specific focus on the diversity of traffic patterns and the algorithms utilized for predictive modeling.
5G communication systems are architected to efficiently handle a wide variety of data traffic types, ranging from conventional voice and multimedia services to highly interactive applications such as gaming and IoT-based M2M communications [11,12]. The inherent diversity of these traffic categories is reflected in their unique distributions, bandwidth demands, and differentiated QoS requirements, all of which impose significant implications for network optimization and management [7,13].
Conventional forecasting methodologies, including statistical models and time series analysis, may prove insufficient for addressing the complex and dynamic characteristics of 5G networks [14,15,16]. Consequently, innovative data-driven strategies, particularly those utilizing machine learning (ML) paradigms, have gained significant traction for their capacity to manage the massive data throughput and structural heterogeneity inherent to contemporary communication networks.
Machine learning (ML) algorithms, including supervised learning [17,18,19,20], deep learning [21], and neural network-based models, are particularly well-suited for handling large and complex data streams derived from heterogeneous network environments. Their analytical capacity allows for robust modeling and precise estimation of future traffic distributions and temporal patterns [22,23,24,25]. Specifically, deep learning techniques demonstrate the ability to uncover latent structures and temporal correlations within historical network data. These methodologies are also capable of incorporating external variables that may influence data traffic, including meteorological conditions, public events, and holidays [24,26,27,28]. Furthermore, hybrid approaches that amalgamate traditional forecasting techniques with machine learning methods can yield more accurate predictions of data traffic within 5G networks [24,29,30,31].
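To make concrete the sliding-window formulation that underlies most of these data-driven predictors, the minimal sketch below casts traffic forecasting as supervised learning on lagged samples. A least-squares autoregressor stands in for the deep models discussed above; the synthetic diurnal series and the 24-step lag are illustrative assumptions, not a method from the surveyed literature.

```python
import numpy as np

def make_windows(series, lag):
    """Turn a 1-D traffic series into (X, y) supervised pairs:
    X[i] = the last `lag` samples, y[i] = the next sample."""
    X = np.stack([series[i:i + lag] for i in range(len(series) - lag)])
    y = series[lag:]
    return X, y

def fit_ar_model(series, lag=24):
    """Least-squares autoregressive predictor (stand-in for an LSTM)."""
    X, y = make_windows(series, lag)
    # Append a bias column and solve min ||Xw - y||^2.
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return w

def predict_next(series, w, lag=24):
    x = np.append(series[-lag:], 1.0)  # latest window plus bias term
    return float(x @ w)

# Synthetic diurnal traffic: a 24-step daily period plus noise.
rng = np.random.default_rng(0)
t = np.arange(24 * 30)
traffic = 100 + 40 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 2, t.size)
w = fit_ar_model(traffic, lag=24)
print(round(predict_next(traffic, w, lag=24), 1))
```

The same windowed (X, y) pairs feed an LSTM or CNN directly; only the fitted function changes.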
This research investigates the various classifications of data traffic within 5G networks and their associated characteristics. It delineates contemporary methodologies for traffic prediction and privacy safeguarding, while also emphasizing the principal challenges and emerging trends. Additionally, it examines how precise traffic forecasting can substantially enhance the performance of 5G networks.

2. Related Works

For effective resource allocation in cellular networks, operators must obtain accurate predictions of network data traffic patterns to ensure optimal performance and efficient management. Considering the subject matter’s importance, this section provides an array of related literature reviews concerning the prediction of network data traffic. Naboulsi et al. conducted a review focusing on the characterization of network data traffic from both provider and user standpoints [32]. User mobility patterns were scrutinized, and scholarly articles were classified according to anonymization, social analysis, user behavior, and demographics [33]. Joshi et al. (2015) delivered a review encompassing various methodologies, including machine learning techniques, neural networks, and linear and non-linear models for analyzing and predicting network traffic [34]. Ahad et al. (2016) presented a survey on the utilization of Neural Networks in wireless networks, particularly in data traffic categorization and prediction techniques [35]. Klaine et al. (2017) offered an overview of prevalent machine learning techniques in cellular networks, categorizing each method based on its learning approach [6]. Hajirahimi et al. (2019) provided a review on hybrid time series techniques for forecasting, specifically focusing on network traffic prediction [36]. Mohammed et al. (2019) introduced ML and Deep Learning (DL) approaches for classification and prediction within SDNs, addressing challenges and prospects such as dataset attributes, data volume, DL implementation methods, security concerns due to SDN architecture, and flow encryption for traffic prediction [37]. Li et al. (2020) conducted an extensive review of network traffic classification using deep learning, comparing it with alternative methods such as port-based, deep packet inspection, and machine learning approaches [38]. Selvamanju et al. 
(2021) conducted a thorough review of existing ML models for predicting mobile data traffic in 5G networks, evaluating these techniques based on various criteria like primary objectives, underlying methodologies, advantages, implications, and performance metrics [24]. Chen et al. (2021) introduced machine learning solutions for traffic prediction in communication networks, distinguishing between short-term and long-term traffic predictions [39]. Abbasi et al. (2021) explored deep learning techniques like Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) models for data traffic classification, prediction, and anomaly detection, discussing the strengths and weaknesses of these methods in the context of Network Traffic Monitoring and Analysis (NTMA) applications [40]. Lohrasbinasab et al. (2022) discussed statistical and Machine Learning (ML) approaches for Network Traffic Prediction (NTP) and outlined the challenges associated with NTP [41]. Wang et al. (2023) provided a comprehensive survey spanning from 2017 to 2022 on deep learning-based traffic prediction in mobile networks, along with an in-depth examination of issues and solutions related to deep learning-based mobile traffic prediction [42]. Ferreira et al. (2023) introduced statistical and artificial neural network techniques for forecasting network traffic, offering a comprehensive tutorial on these prediction methods [43]. Wang et al. (2024) conducted an in-depth survey on cellular traffic prediction based on deep learning, along with a brief overview of challenges and potential future directions to guide upcoming researchers in their investigations [44].
Although numerous reviews exist, none of them analyzes the different data traffic patterns (such as Burst, One-Way Streaming, Interactive Multimedia, IoT, and Background) together with the prediction methods suited to each. Existing approaches often fail to jointly optimize prediction accuracy, computational efficiency, and privacy protection, indicating a notable research gap. To address this gap, a schematic diagram of the proposed framework is presented, with its key components (input data, privacy module, prediction engine, and network management output).

3. Key Challenges for Data Traffic Prediction

The task of predicting data traffic within 5G networks comes with numerous challenges, which can be broadly classified into four categories, namely, Data Heterogeneity, Data Privacy and Security, Model and Computational Complexity, and Wireless Channel Interference, as illustrated in Figure 1.

3.1. Data Heterogeneity

In 5G systems, variability in QoS targets, coexistence of divergent traffic modalities (IoT, streaming, interactive multimedia), and complex network architectures create intrinsically heterogeneous data distributions, thereby posing a core challenge to accurate spatiotemporal traffic prediction. Traditional statistical models are often incapable of capturing the dynamic, non-linear correlations embedded in this multi-source data environment. Consequently, Machine Learning (ML) algorithms and their advanced Hybrid ML counterparts have become indispensable for processing these complex patterns.

3.1.1. Architectural Complexity and Traffic Diversity

The architectural intricacy of 5G, characterized by layered virtualization, poses substantial challenges for accurately forecasting the behavior of heterogeneous network components [45]. To obtain high-precision data traffic forecasts in 5G infrastructures, operators should integrate complementary methodologies spanning analytical modeling, traffic engineering, performance telemetry, and data-driven inference over graph-structured topology information. Specifically, network modeling develops mathematically grounded representations that expose capacity constraints, localize bottlenecks, and predict end-to-end performance under varying loads [46]. Traffic engineering formulates routing and resource-allocation as constrained optimization problems (e.g., multi-commodity flow, segment routing), selecting paths and slice resources to sustain stable, low-latency throughput across heterogeneous demands [47]. Network performance monitoring (active and passive measurements, KPI/KQI telemetry) continuously observes spatiotemporal dynamics of the network state to supply high-fidelity datasets for calibration and validation [48]. Finally, applying machine learning [49,50,51], including hybrid pipelines that couple statistical priors with neural predictors [52,53,54,55], to topology-aware features (e.g., graph embeddings of nodes/links/slices) enables the discovery of latent traffic patterns and supports counterfactual forecasting of network behavior.
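As a toy illustration of topology-aware features, the sketch below derives a spectral embedding from a hypothetical four-node adjacency matrix; real pipelines would use learned graph embeddings over far larger slice/link topologies, so both the graph and the embedding choice here are illustrative assumptions.

```python
import numpy as np

# Hypothetical 4-node backhaul topology (symmetric adjacency matrix).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 1],
              [0, 1, 1, 0]], float)

def spectral_embedding(A, dim):
    """Use the leading eigenvectors of the adjacency matrix as per-node,
    topology-aware features for a downstream traffic predictor."""
    vals, vecs = np.linalg.eigh(A)          # eigh: ascending eigenvalues
    order = np.argsort(vals)[::-1]          # largest eigenvalues first
    return vecs[:, order[:dim]]

Z = spectral_embedding(A, 2)
print(Z.shape)  # one 2-D feature vector per node
```

Each row of Z can be concatenated with per-node traffic histories before training, which is one simple way to expose graph structure to a predictor.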
Equally significant is the diversity of traffic generated by heterogeneous applications such as “video streaming, online gaming, social media, and IoT” [7]. Each produces distinct traffic characteristics, including voice, video, data, and control of city traffic via IoT, which complicates accurate forecasting [56]. Therefore, predictive approaches must adapt to diverse traffic data types and their real-time variations. Large-scale datasets and advanced traffic models are required to capture this diversity [50,51,56].

3.1.2. Data Scarcity and Dynamic Conditions

Spatiotemporal variability in network conditions represents another significant challenge [57]. Swift changes in mobility profiles, network graph structure, resource availability, and user traffic dynamics render 5G traffic highly volatile, thereby degrading the predictability of future states. To surmount this, advanced predictive models (including supervised ML and hybrid approaches [52,53,54,57,58,59,60]) must be employed to comprehend traffic patterns under these diverse and rapidly changing circumstances.
In the early stages of 5G network implementation, there was limited availability of historical data. Constrained longitudinal coverage in 5G required cross-domain adaptation using LTE datasets to counteract data scarcity and distributional shift. Moreover, rigorous simulation studies across multi-dimensional configuration spaces, together with cooperative data exchange and co-development initiatives among network operators, were crucial.

3.1.3. Data Challenges and Preprocessing

The prediction of data traffic in 5G networks presents a significant challenge, with Big Data requirements being one of the most crucial aspects. For machine learning algorithms to effectively learn data traffic patterns, extensive datasets are necessary. However, curating extensive 5G datasets encounters substantive barriers, including user-privacy constraints necessitating robust anonymization, cross-regime variability in network states, and pronounced traffic volatility. As a mitigation strategy, the deployment of synthetic data offers a feasible and scalable solution [61,62,63]. Synthetic datasets are generated by machine learning models that are trained on original data to learn its pertinent attributes, relationships, and statistical trends, and can thus imitate the behavior of real-world applications and users. It should be noted, however, that synthetic data may underapproximate real-world distributional complexity and regime shifts in network demand.
A significant proportion of real-world wireless communication data is unlabeled, which contributes to challenges related to data heterogeneity. To mitigate these challenges, algorithms can be employed to automatically generate labels that categorize data, thereby facilitating organization and reducing heterogeneity. However, this process often introduces a time delay due to the computational overhead required for data processing. Labeling can be performed either manually or through synthetic approaches. Dimensionality reduction techniques such as Principal Component Analysis (PCA) can further alleviate heterogeneity by transforming high-dimensional feature sets into a smaller number of uncorrelated variables, referred to as principal components. Additionally, Active Learning techniques allow a model to iteratively query a user or external source to label the most informative data instances. This approach provides an effective means of mitigating the limitations of unlabeled data [31,41].
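A minimal numpy sketch of the PCA step described above, applied to a synthetic feature matrix with two latent degrees of freedom; the data, dimensions, and noise level are all illustrative assumptions.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project feature matrix X onto its top principal components."""
    Xc = X - X.mean(axis=0)             # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T     # scores in the reduced space

rng = np.random.default_rng(1)
# 200 samples of 10 correlated traffic features driven by only
# 2 underlying factors (plus a little measurement noise).
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + rng.normal(scale=0.01, size=(200, 10))

Z = pca_reduce(X, 2)
print(Z.shape)  # 10 correlated features compressed to 2 components
```

Because the 10 features were built from 2 factors, two principal components retain essentially all of the variance, which is the heterogeneity-reduction effect the text describes.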

3.2. Data Privacy and Security

The prediction of data traffic in 5G networks faces significant challenges related to privacy and security, as these networks are inherently vulnerable to malicious activities and sensitive data leakage. One critical concern is the susceptibility of networks to cyber-attacks, which can alter data traffic patterns and consequently reduce the accuracy of prediction models [31,64]. A typical example is website fingerprinting, a sophisticated traffic analysis attack that exploits packet metadata (timing, size, direction) to bypass encryption [65]. Although this technique does not inherently alter data traffic, such attacks have repeatedly manifested as performance anomalies: sudden, repeated disconnects or reconnections, spikes in latency, and retransmissions across sessions. Moreover, the specific characteristics of cellular networks, such as latency jitter, significantly influence both the effectiveness of these attacks and the development of defensive strategies. Defensive strategies such as blocking and data traffic modification often increase bandwidth consumption and latency, and if the attacker transitions from passive to active, the data traffic itself will be affected. Another type of attack is the side-channel attack that leverages radio frequency (RF) energy harvesting signals to monitor mobile application activities. A passive RF energy harvester converts ambient Wi-Fi radio transmissions into a small electrical voltage, and the pattern of that voltage over time contains fingerprints of nearby phone activity, since different apps send and receive distinct traffic patterns [66]. A passive side-channel RF harvester simply observes characteristics of data traffic without changing them; however, the attacker can easily switch to active methods (inducing traffic, social engineering, etc.) that alter the data traffic itself, with consequences for both performance and privacy. Even if the attacker neither reads nor changes data, RF interference can affect data flow: data packets can be lost or delayed due to noise or intentional jamming, which may manifest as slower speeds, higher latency, or frequent disconnections. Addressing this issue requires continuous network monitoring, anomaly detection using machine learning (ML) methods, and the implementation of strict access control policies to safeguard network resources [67].
Mobile traffic data encompasses various aspects of subscribers’ lives, including their activities, interests, schedules, movements, and preferences. Nevertheless, the utilization of such a vast resource also gives rise to concerns regarding potential violations of the privacy rights of mobile customers. The regulatory authorities have been actively engaged in the development of legislation aimed at safeguarding the privacy of mobile users. For instance, the European Data Protection Directive 95/46/EC stipulates that all mobile traffic datasets must undergo anonymization to ensure that no individuals can be identified before any cross-processing of the data. Additionally, Directive 2002/58/EC specifies that the analysis of anonymized data should only be conducted for the duration necessary to provide the intended value-added service.
Another major challenge arises from the risk of user privacy leakage during traffic prediction. Since traffic data often contains sensitive user information, inappropriate handling or exposure could compromise user confidentiality. Initially, the notion of k-anonymity was introduced in [68,69,70,71]. Subsequently, the authors in [72] devised an additional privacy concept known as t-closeness, a refinement of the l-diversity approach [73,74]: it requires that the distance between the distribution of a sensitive attribute within an equivalence class and its distribution in the overall table not exceed a specified threshold, denoted as t. Some years later, researchers proposed the km-anonymity model in [75], which restricts the probability of identity disclosure by employing the Euclidean distance. The authors in [76] proposed k-automorphism for the anonymization of personal data. Other methods, such as pseudonymization, differential privacy, and onion routing, are presented in [77,78]. These anonymization algorithms are commonly used for conventional databases that involve arrays with static features. However, such databases differ greatly in nature from spatiotemporal mobile traffic data, and numerous studies have therefore been conducted to develop anonymization techniques specifically designed for spatiotemporal mobile data. One approach for mobile data anonymization was location anonymization, proposed in [79,80]. Cheng et al. (2016) proposed a system called ANTW with hardware and software components for anonymizing mobile real-time data [81]. Anonymization of mobile data traffic is also proposed in [82,83,84].
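As a concrete illustration of the k-anonymity notion, the toy check below verifies that every combination of quasi-identifier values is shared by at least k records. The record fields are hypothetical, and real schemes additionally perform the generalization or suppression that produces such records in the first place.

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """A dataset is k-anonymous if every combination of quasi-identifier
    values appears in at least k records."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    return min(Counter(keys).values()) >= k

# Hypothetical CDR-like records with already-generalized quasi-identifiers.
records = [
    {"age_band": "20-30", "cell_area": "A", "bytes": 512},
    {"age_band": "20-30", "cell_area": "A", "bytes": 2048},
    {"age_band": "30-40", "cell_area": "B", "bytes": 128},
    {"age_band": "30-40", "cell_area": "B", "bytes": 4096},
]
print(is_k_anonymous(records, ["age_band", "cell_area"], k=2))  # True
print(is_k_anonymous(records, ["age_band", "cell_area"], k=3))  # False
```

Refinements such as l-diversity and t-closeness add constraints on the sensitive attribute ("bytes" here) within each equivalence class, beyond this simple group-size check.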
As it can be observed, there exists no universally applicable resolution for the act of anonymizing mobile data. Due to this circumstance, mobile operators employ a set of data anonymization techniques in conjunction with pseudonymization techniques (which involve the substitution of one or more identifiers with pseudonyms) as well as the protection and segregation of supplementary information from the pseudonymized data.
Furthermore, the difficulty of analyzing encrypted traffic presents a technical obstacle. Conventional encryption methods (e.g., AES) prevent direct analysis without decryption, which reintroduces privacy risks. A promising solution is the use of homomorphic encryption, which allows computation to be performed on encrypted traffic without requiring decryption [85,86]. Given the high computational cost associated with traditional homomorphic encryption schemes, the authors in [87,88,89,90] introduced lightweight homomorphic encryption approaches designed to minimize this overhead. The hybrid approach combining pseudonymization with homomorphic encryption is a particularly promising method for protecting private personal data while still allowing computation on the encrypted data for data mining without decryption [91].
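To illustrate the additively homomorphic property such schemes rely on, the sketch below implements a toy Paillier cryptosystem with deliberately tiny primes; real deployments use 2048-bit-plus moduli (or the lightweight variants cited above), so this is purely a demonstration of the algebra, not a usable scheme.

```python
import math
import random

def paillier_keygen(p=293, q=433):
    """Toy-sized primes; n = p*q is far too small for real security."""
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    g = n + 1                      # standard simple choice of generator
    mu = pow(lam, -1, n)           # valid when g = n + 1
    return (n, g), (lam, mu)

def encrypt(pub, m):
    n, g = pub
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:     # r must be invertible mod n
        r = random.randrange(1, n)
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)

def decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n  # L(x) * mu mod n

pub, priv = paillier_keygen()
c1, c2 = encrypt(pub, 41), encrypt(pub, 17)
# Multiplying ciphertexts adds the plaintexts: Dec(c1 * c2) = 41 + 17.
print(decrypt(pub, priv, (c1 * c2) % (pub[0] ** 2)))  # 58
```

This additive property is exactly what lets aggregate traffic statistics (sums, counts) be computed on encrypted measurements without ever decrypting individual records.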

3.3. Model and Computational Complexity

The trade-off between prediction accuracy and speed is a fundamental challenge. Higher accuracy typically demands a greater computational load and prolonged training time, while prioritizing speed often compromises precision. To mitigate the associated computational risks, achieving a balance is crucial. This is commonly addressed by employing hybrid prediction methods and optimizing data retraining strategies [92,93,94].
The high computational cost of retraining predictive models in 5G networks represents another critical challenge. Stepwise retraining methods, which incrementally update models with only the most recent observations, effectively mitigate this overhead [41]. Alternatively, Auto-Adaptive Machine Learning (AAML) offers a solution by automatically adjusting to dynamic data and network environments [95].
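A minimal sketch of the stepwise-retraining idea: instead of refitting on the full history, an online linear predictor is updated one observation at a time. The linear model, learning rate, and synthetic traffic relation are illustrative stand-ins for the incremental and auto-adaptive systems cited above.

```python
import numpy as np

class OnlineSGDPredictor:
    """Minimal online linear predictor updated one observation at a time,
    avoiding a full retraining pass as new traffic samples arrive."""
    def __init__(self, n_features, lr=0.01):
        self.w = np.zeros(n_features + 1)  # weights plus a bias term
        self.lr = lr

    def update(self, x, y):
        xb = np.append(x, 1.0)
        err = xb @ self.w - y
        self.w -= self.lr * err * xb       # one SGD step on squared error

    def predict(self, x):
        return float(np.append(x, 1.0) @ self.w)

model = OnlineSGDPredictor(n_features=2, lr=0.05)
rng = np.random.default_rng(2)
for _ in range(2000):
    x = rng.normal(size=2)
    y = 3.0 * x[0] - 1.5 * x[1] + 0.5      # hypothetical traffic relation
    model.update(x, y)                     # incremental, per-sample update
print(np.round(model.w, 2))
```

Each update costs O(number of features), so the model tracks drifting traffic statistics at a small fraction of the cost of periodic full retraining.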
Faster execution is essential for real-time resource allocation in 5G, yet traditional machine learning struggles with the scale and velocity of network data. Researchers therefore propose integrating hybrid machine learning techniques with parallel programming [41,86]. Hybrid ML enhances accuracy, while parallel programming significantly reduces execution time by distributing tasks. This combination is a promising solution for balancing precision and speed in the dynamic 5G environment.

3.4. Impact of Wireless Interference on Prediction Accuracy

A critical limiting factor in traffic prediction is the dynamics of the wireless channel, particularly interference and noise. Interference, especially Inter-Cell Interference (ICI), in dense 5G networks leads to unpredictable changes in actual cell capacity. In Ref. [10], it is emphasized that 5G networks and beyond will use multiple cutting-edge technologies simultaneously, such as Massive MIMO, Ultra-Dense Networks (UDN), Millimeter Wave (mmWave), and Non-Orthogonal Multiple Access (NOMA). The combination of these technologies multiplies the sources of interference. The extremely high density of base stations in UDNs leads to increased ICI. The use of high frequencies (e.g., mmWave) creates different interference patterns due to their sensitivity to obstacles (blocking). The adoption of new access techniques such as NOMA intentionally introduces interference, which must be managed accurately through successive interference cancelation. These fluctuations introduce non-stationarity and noise into the historical traffic data used for training, drastically reducing the prediction accuracy of ML/DL models. To address this, future research should focus on robust prediction models that can isolate or model the effect of external factors, such as channel state information (CSI), by incorporating them as additional features. Traffic prediction must therefore consider the quality of the radio channel as an external factor that affects the final measured traffic (throughput).
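The suggestion to incorporate channel state as an additional feature can be illustrated with a synthetic experiment: a linear model that sees only the offered demand is compared against one that also sees SINR, for a throughput that (by construction) depends on both. All numbers and the channel model are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
demand = 50 + 10 * np.sin(2 * np.pi * np.arange(n) / 24)    # offered load
sinr_db = rng.uniform(0, 20, n)                             # channel quality
# Measured throughput depends on both the demand and the channel state.
throughput = (demand * np.log2(1 + 10 ** (sinr_db / 10)) / 4
              + rng.normal(0, 1, n))

def fit_rmse(X, y):
    """Least-squares fit; return the training RMSE."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return float(np.sqrt(np.mean((Xb @ w - y) ** 2)))

rmse_traffic_only = fit_rmse(demand[:, None], throughput)
rmse_with_csi = fit_rmse(np.column_stack([demand, sinr_db]), throughput)
print(rmse_with_csi < rmse_traffic_only)  # CSI explains extra variance
```

The CSI-augmented model fits the measured throughput markedly better, mirroring the argument that channel quality must be treated as an explanatory feature rather than left as unexplained noise.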

3.4.1. Interference Classification

Interference in wireless communication systems can be categorized based on its source and impact on network performance. Inter-Cell Interference (ICI) arises between adjacent base stations operating on overlapping frequency resources, often leading to signal degradation and reduced data rates for users located near cell edges. Inter-User Interference (IUI) occurs among users connected to the same base station, which becomes particularly critical in Non-Orthogonal Multiple Access (NOMA) systems where multiple users share identical time–frequency resources [10]. A notable subcategory of IUI is Inter-Beam Interference, as analyzed by Kelner et al., where the directionality and beamwidth of neighboring beams significantly influence the level of mutual interference [96]. Inter-Tier Interference is observed in heterogeneous network architectures, where macro-cells and small cells coexist, often resulting from overlapping coverage areas and transmission power disparities. Finally, Inter-System Interference occurs between different wireless technologies, such as satellite and cellular systems, when they operate within shared or adjacent spectrum bands [97]. Proper classification and understanding of these interference types are crucial for designing efficient interference mitigation strategies, thereby enhancing overall network capacity and reliability.

3.4.2. Interference Management Techniques

In Beyond 5G (B5G) and emerging 6G networks, effective interference management is essential to ensure high reliability, low latency, and improved spectral efficiency. Interference management can be categorized into three main approaches: avoidance, cancelation, and mitigation. Interference avoidance focuses on preventing interference through intelligent resource allocation, as exemplified by techniques such as Fractional Frequency Reuse (FFR). Interference cancelation aims to suppress unwanted signals at the receiver, for example, through Successive Interference Cancelation (SIC) as implemented in Non-Orthogonal Multiple Access (NOMA). Interference mitigation, exemplified by Coordinated Multi-Point (CoMP) transmission, reduces the power and impact of interfering signals by coordinating multiple transmission points [10]. Accurate traffic forecasting plays a vital role in enabling these methods, as networks need to anticipate spatial and temporal variations in demand to proactively implement interference management. Furthermore, B5G systems face new challenges arising from interference between systems, especially between terrestrial and satellite communication services, which require additional protection mechanisms such as Guard Band Protection [97]. Future research directions focus not only on enhancing existing techniques such as FFR and CoMP, but also on integrating smart, adaptive technologies such as Reconfigurable Intelligent Surfaces (RIS), which can dynamically shape the radio environment through artificial intelligence, representing a significant step towards fully intelligent interference control in 6G networks [98]. Comprehensive overviews of the challenges are presented in Table 1.
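The SIC principle behind NOMA can be shown with simple SINR arithmetic for a two-user power-domain example; all powers, channel gains, and the noise figure below are illustrative assumptions.

```python
import math

def sinr_db(p_signal, p_interference, p_noise):
    """Signal-to-interference-plus-noise ratio in decibels."""
    return 10 * math.log10(p_signal / (p_interference + p_noise))

# Two-user power-domain NOMA on one resource block (illustrative numbers).
p_total, noise = 1.0, 0.01
p_far, p_near = 0.8 * p_total, 0.2 * p_total   # more power to the far user
g_far, g_near = 0.1, 1.0                        # channel gains

# Far user decodes its own signal, treating the near user's as noise.
sinr_far = sinr_db(p_far * g_far, p_near * g_far, noise)
# Near user first decodes (and cancels) the far user's stronger signal
# via SIC, then decodes its own signal interference-free.
sinr_near_after_sic = sinr_db(p_near * g_near, 0.0, noise)

print(round(sinr_far, 1), round(sinr_near_after_sic, 1))  # 4.3 13.0
```

The arithmetic makes the trade-off explicit: the deliberately superimposed interference is tolerable for the far user and fully removable for the near user, provided the SIC decoding order is chosen correctly.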

4. Methods

This article presents an extensive examination of the literature on the prediction of data traffic in cellular networks. Initially, the various types and patterns of data traffic are analyzed. Subsequently, the current methodologies employed for data anonymization in cellular networks are evaluated. Finally, the different techniques, significant obstacles, and emerging trends associated with the prediction of data traffic in mobile networks are expounded upon. Our protocol was drafted using the Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR). The final protocol was registered prospectively with the Open Science Framework at https://osf.io/preprints/metaarxiv/hyv9w_v1 (accessed on 16 November 2025).
The inclusion and exclusion criteria for selecting sources of evidence were established in accordance with the objectives of this scoping review, which focuses on data traffic prediction for 5G and beyond cellular networks. Eligible studies were required to employ advanced computational approaches, including Statistical and Time-Series methods, Machine Learning (ML), Deep learning, or Hybrid prediction models. Only peer-reviewed journal articles and full-length papers from major conference proceedings published in English up to the year 2025 were considered. Studies that did not address data traffic prediction, such as those focusing on unrelated network tasks, were explicitly excluded.
Sources of evidence were drawn from scholarly articles, books, and indexed databases (e.g., IEEE Xplore, SpringerLink, ScienceDirect, MDPI, Elsevier, Scopus, etc.), focusing on 5G network data traffic prediction and data privacy preservation until July 2025. The search results were imported to Mendeley, and duplicates were removed before the screening process commenced.
The electronic search for relevant literature was conducted in the Scopus database. The core search concepts were grouped and combined using the Boolean operator AND, restricting the search to the Keywords fields (5G and beyond; Cellular networks; Data traffic patterns; Deep learning; Machine learning; Mobile data traffic prediction) to ensure focused retrieval. The terms related to data traffic patterns and mobile data anonymity were incorporated into the search string. The same core search logic and criteria were translated and applied to all other selected databases (e.g., IEEE Xplore, SpringerLink, Science Direct, MDPI).
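The exact query string is not reproduced in the text; an illustrative Scopus-style query combining the stated keyword groups with AND might look like the following (the field codes and grouping shown are assumptions, not the authors' literal string):

```
KEY("5G" OR "beyond 5G" OR "cellular networks")
AND KEY("mobile data traffic prediction" OR "data traffic patterns")
AND KEY("machine learning" OR "deep learning")
```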
The selection of sources of evidence was a systematic and transparent process conducted by the three authors to ensure consistency and minimize bias; it is fully documented in the PRISMA Flow Diagram (Figure 2). All records retrieved from the various database searches (e.g., IEEE Xplore, SpringerLink, ScienceDirect, MDPI, Scopus, etc.) were pooled, and two duplicate records were identified and removed. The remaining unique records were subjected to an initial screening based on titles and abstracts: twenty records were excluded by human screening because they clearly did not address data traffic prediction, and one further record was excluded at the title-and-abstract stage. All potentially relevant records proceeded to the full-text assessment stage. Any disagreements between authors at the screening or full-text stage were resolved through discussion and consensus. Only those sources that satisfied all established criteria were included for subsequent data extraction and synthesis in this scoping review. The data-charting process was jointly developed by the three authors to determine which variables to extract; the authors discussed the results, and the process was updated periodically. The following variables were used:
  • Type of data traffic (The 5G service category the traffic prediction model addresses): (eMBB, URLLC, mMTC).
  • Data Traffic Pattern (The intrinsic characteristic or nature of the data flow being analyzed and predicted): (Burst, One-Way streaming, interactive multimedia, IoT, background).
  • Dataset characteristics (Attributes defining the source and nature of the traffic data used for training and evaluation): (network vs. single user data traffic, time sensitivity, spatial sensitivity, other characteristics (data CDRs, protocols, access technology)).
  • Predictive models (The fundamental algorithmic approach used for the traffic prediction task): (Traditional (statistical, time-series), Contemporary (ML, DL, hybrid)).
  • Evaluation metrics (The performance and accuracy of the predictive model): (MSE, RMSE, MAE, MAPE, R2, etc.).
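Where helpful, the evaluation metrics listed above can be computed directly. The following sketch (a minimal illustration in NumPy; the function and variable names are ours, not drawn from any reviewed study) shows how MSE, RMSE, MAE, MAPE, and R2 are obtained from paired true/predicted traffic series:

```python
import numpy as np

def traffic_metrics(y_true, y_pred):
    """Common error metrics used to compare traffic predictors."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = np.mean(err ** 2)
    rmse = np.sqrt(mse)
    mae = np.mean(np.abs(err))
    mape = np.mean(np.abs(err / y_true)) * 100.0   # assumes y_true has no zeros
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # assumes y_true not constant
    r2 = 1.0 - ss_res / ss_tot
    return {"MSE": mse, "RMSE": rmse, "MAE": mae, "MAPE": mape, "R2": r2}
```

A perfect predictor yields zero error and R2 = 1, while a predictor no better than the mean yields R2 near 0, which is why R2 complements the absolute-error metrics.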
Figure 2. PRISMA flow diagram (* The number of records identified from each database or register searched. ** Where automation tools were used, the number of records excluded by a human.).
In this study, a comprehensive assessment of each source was conducted without numerical scoring, guided instead by the rationale for each source’s inclusion in or exclusion from the literature review. Data extracted from the sources were systematically grouped based on the review’s primary objectives (e.g., classification of data traffic). Findings were organized into key conceptual categories such as data traffic types and patterns, dataset characteristics, and prediction methodologies. The synthesis employed both narrative summary and tabular presentation to map the evidence effectively. The narrative summary was used to describe trends, recurring challenges, and the evolution of research over the review period. Tabulated results were designed to provide a clear, structured overview.

5. Results

Initially, 184 records were identified across major databases (IEEE Xplore, ScienceDirect, SpringerLink, MDPI, etc.) and 5 through other search methods. After removing 2 duplicates, 20 records were excluded by human reviewers because they did not report on data traffic prediction, and 1 further record was excluded during title and abstract screening. The remaining 166 studies met the inclusion criteria for full-text review. Following the eligibility assessment, all 166 studies were retained as the final corpus of evidence for this review (Figure 2). A further 104 studies were used in the other parts of the text (introduction, related works, key challenges, etc.).
The selected studies were published until 2025, covering a diverse range of methodologies for data traffic prediction in 5G networks. The studies were grouped according to:
  • Data traffic types (n = 12),
  • Data traffic patterns (n = 57),
  • Dataset characteristics (n = 8),
  • Prediction methods (n = 89).
(where n is the number of included articles in each category).
Figure 3 illustrates the schematic representation of the sources that were selected and those that were rejected (excluded by human reviewers or removed as duplicates).
The 166 reviewed sources were analyzed according to their methodological focus on data traffic types, patterns, data privacy strategies, dataset characteristics, and prediction methodologies. Data traffic types are presented in Table 2, data traffic patterns in Table 3, and dataset characteristics in Table 4. Prediction methodologies are presented in Table 5 (statistical), Table 6 (time series), Table 7 (machine learning), Table 8 (deep learning), and Table 9 (hybrid methods). The primary focus was on advanced computational methodologies (ML/hybrid) applied to 5G and beyond networks. Standard evaluation metrics (MSE, RMSE, MAE, MAPE, R2, etc.) were employed to assess the precision of the methods.
5G networks were consistently categorized into three principal data traffic types: enhanced mobile broadband (eMBB), ultra-reliable and low-latency communications (URLLC), and massive machine-type communications (mMTC). eMBB studies emphasized applications requiring high throughput and wide coverage, such as VR/AR, UHD streaming, and cloud gaming, focusing on bandwidth optimization and resource slicing. URLLC research addressed latency-critical scenarios (autonomous driving, telesurgery, industrial control) demanding sub-millisecond responsiveness and deterministic reliability. mMTC studies concentrated on scalable connectivity for dense IoT deployments, highlighting lightweight communication protocols and energy-efficient scheduling. Prediction accuracy and model selection were found to differ substantially among these categories owing to heterogeneous temporal and spatial traffic behaviors.
Five dominant traffic behavior patterns were identified: burst, one-way streaming, interactive multimedia, IoT-driven, and background traffic. Burst traffic exhibited intermittent surges in transmission rate, posing forecasting challenges mitigated by recurrent neural networks (LSTM, gated recurrent units (GRU), etc.). Streaming traffic showed stable unidirectional flows, for which multi-task Convolutional LSTM (ConvLSTM) models achieved superior temporal prediction. Interactive multimedia traffic required low latency and adaptive modeling; hybrid AutoRegressive Integrated Moving Average (ARIMA) + LSTM and CNN + LSTM approaches performed best. IoT traffic consisted of small, irregular uplink packets, for which hybrid ML frameworks and gradient-boosting models improved classification accuracy. Background traffic demonstrated periodic, low-amplitude activity captured effectively by spatiotemporal graph networks and real-time hybrid DL predictors.
The analyzed datasets varied across four principal dimensions. Network vs. user-level data: network-level aggregation enabled large-scale pattern analysis, while user-centric datasets allowed detailed behavioral modeling. Time sensitivity: finer temporal granularity enhanced model sensitivity but increased variance and storage requirements. Spatial sensitivity: studies incorporated location-aware analysis distinguishing rural, suburban, and urban cells, revealing distinct diurnal and weekend usage profiles. Other characteristics: the datasets contained heterogeneous records (voice, video, IoT, background services), often collected via crowdsourcing or network logs; for measurement purposes, the employed protocol (TCP, UDP) and the access technology (VoIP, LTE, 3G, 4G, 5G) can be identified.
For the prediction methodologies, statistical models such as Hidden Markov Models, Naïve Bayes, and probabilistic frameworks served as baselines but lacked scalability. Time-series models, including ARIMA, Seasonal ARIMA (SARIMA), Holt–Winters, Kalman Filtering, etc., captured temporal regularities yet struggled with non-stationary 5G data. Machine learning, deep learning, and hybrid models demonstrated the highest accuracy and adaptability. Hybrid methods integrating statistical and deep learning paradigms (e.g., ARIMA–LSTM, Prophet–GPR–ADMM) achieved RMSE reductions of 20–40% compared to single-model baselines. Performance was uniformly assessed using MSE, RMSE, MAE, MAPE, NRMSE, R2, etc. Hybrid deep learning frameworks consistently reported the lowest error rates and strongest correlation coefficients, confirming their suitability for heterogeneous, high-velocity 5G traffic.
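The decomposition idea behind hybrid methods such as ARIMA–LSTM can be sketched without any deep learning stack: fit a linear autoregressive model first, then let a simple nonlinear learner correct its residuals. The sketch below is a deliberately simplified stand-in, substituting a least-squares AR(p) model for ARIMA and a k-nearest-neighbour residual regressor for the LSTM; it illustrates the architecture of the hybrid, not any specific published pipeline:

```python
import numpy as np

def fit_ar(series, p=2):
    """Least-squares AR(p) fit: the 'linear' stage of a hybrid predictor."""
    X = np.column_stack([series[i:len(series) - p + i] for i in range(p)])
    y = series[p:]
    coef, *_ = np.linalg.lstsq(np.column_stack([X, np.ones(len(y))]), y, rcond=None)
    return coef

def ar_predict(series, coef, p=2):
    """One-step-ahead in-sample predictions of the linear AR stage."""
    X = np.column_stack([series[i:len(series) - p + i] for i in range(p)])
    return np.column_stack([X, np.ones(len(X))]) @ coef

def hybrid_forecast(series, p=2, k=3):
    """Hybrid scheme: linear AR forecast plus a k-NN model of the AR residuals."""
    coef = fit_ar(series, p)
    linear = ar_predict(series, coef, p)
    resid = series[p:] - linear
    # Nonlinear stage: correct each linear forecast by the mean residual of the
    # k historical time steps whose lag vectors are most similar to the current one.
    lags = np.column_stack([series[i:len(series) - p + i] for i in range(p)])
    corrections = np.empty_like(resid)
    for t in range(len(resid)):
        d = np.linalg.norm(lags - lags[t], axis=1)
        d[t] = np.inf                      # exclude the point itself
        nearest = np.argsort(d)[:k]
        corrections[t] = resid[nearest].mean()
    return linear + corrections
```

On a purely linear series the residual stage contributes nothing and the hybrid collapses to the AR forecast; its benefit appears only when the residuals carry nonlinear structure the linear stage missed, which is precisely the rationale the hybrid literature gives.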
Collectively, the literature shows that accurate prediction of 5G data traffic depends on adapting models to the traffic type, incorporating privacy-preserving analytics, and maintaining computational efficiency. Machine learning, and in particular hybrid deep neural architectures, has outperformed classical approaches in predictive accuracy and generalization. However, challenges remain regarding model interpretability, real-time scalability, and secure access to high-resolution network datasets.

5.1. Traffic Categories and Behavioral Patterns in 5G Networks

Representing the subsequent stage of cellular evolution, 5G is optimized to manage significantly increased data volumes promptly and efficiently compared with preceding technologies. The 5G networks comprise three distinct types of data traffic: “enhanced mobile broadband (eMBB)”, “ultra-reliable and low-latency communications (URLLC)”, and “massive Machine-Type Communications (mMTC)” (Table 2) [7,99]. Data traffic in cellular networks is contingent upon the actions of users. The behavior of users is frequently shaped by significant occurrences (such as the surge in video conferencing usage during the COVID-19 pandemic) [100]. User events and behavior in 5G networks can introduce a high degree of unpredictability in data traffic patterns, affecting overall predictability [101]. From the point of view of data traffic, mobility, and application usage, mobile users can be categorized into high-traffic users, high-mobility users, and those who mainly use applications for data and radio resources [102]. User behavior is the determining factor in shaping data traffic patterns, which are contingent upon the applications that users utilize [103]. By analyzing the data traffic patterns, it can be observed that several users consume more resources than they are allocated [104]. Various factors such as user mobility, subscription plans, network congestion, and coverage exert influence on data traffic patterns [105,106]. Within this section, a detailed examination of data traffic patterns ensues, accompanied by existing techniques for predicting the data traffic of each pattern and by proposed methods for greater accuracy, drawing upon relevant literature.
Enhanced Mobile Broadband (eMBB) within 5G networks offers improved mobile broadband services characterized by exceptionally high data rates, minimal latency, heightened reliability, extended coverage, and enhanced spectral efficiency [107,108,109]. eMBB services are primarily designed to cater to the requirements of augmented reality (AR), virtual reality (VR), ultra-high definition (UHD) video, and online-cloud gaming, ensuring an acceptable level of reliability [107,110]. Ultra-Reliable Low Latency Communications (URLLC), on the other hand, delivers highly responsive connections characterized by ultra-low latency, exceptional reliability, and robust availability. URLLC transmissions occur sporadically, involving short packet sizes and relatively lower data rates, while offering extensive mobility. The intended applications for URLLC include industrial automation, autonomous driving, and remote healthcare [99,107]. Lastly, massive Machine Type Communications (mMTC) in 5G networks represents a massively connected Internet of Things (IoT) platform accommodating a large volume of devices. This platform supports ultra-low latency, high throughput, reduced reliability, minimal complexity, high connection density, extended coverage, low data rates, and low power consumption [109,111,112]. Prediction of data traffic for each data type is of paramount significance for the optimization of bandwidth allocation [113], resource allocation [114], and network slicing [115,116,117,118]. Various machine learning techniques [7,113,114,115,116,117,118] have been introduced to predict data traffic of eMBB, URLLC, and mMTC types.
5G traffic processes are exhibiting increasing heterogeneity and structural complexity, as the technology is tasked with supporting a broad spectrum of implementations with disparate service requirements. Distinct traffic modalities arise, including bursty traffic, with brief, sporadic bursts of activity [119,120] and one-way streaming traffic, which entails a continuous flow of data [121,122,123,124,125]. Additionally, two-way interactive multimedia traffic is latency-critical and requires near-real-time responsiveness, whereas chat messaging (WeChat, WhatsApp, and Messenger) is generally insensitive to modest delays [126,127,128]. Furthermore, IoT traffic is associated with the flow of data from devices that are interconnected and have a connection to the Internet [129]. Lastly, autonomous background services contribute to traffic even during idle states [130].
The burst factor in network data traffic was introduced by Lam, S. (1978) [131], although Ephremides, A. (1978) questioned its merits relative to existing measures such as the peak-to-average ratio or the duty cycle [132]. Burst data traffic constitutes a data traffic pattern characterized by intermittent bursts of data, typically of brief duration [119,120]. Bursty behavior is commonly encountered in applications such as video streaming, online gaming [133], virtual reality [134], IoT burst data [135], social media updates, and email synchronization [130]. In the context of cellular networks, the occurrence of burst traffic can be attributed to various factors, including periodic data transmissions, sporadic data flows, and abrupt changes in user behavior [136]. These bursts can lead to resource-inefficient mobile applications and impact the performance of high-capacity cellular systems [137]. Burst traffic in a cellular network can manifest as sudden surges in data transfer rates, irregular patterns of data traffic, and variable packet sizes [120]. One approach to modeling burst traffic in a cellular network is the compound Poisson process [138]. Queuing algorithms [139] and access class barring (ACB) [140] have been proposed to mitigate bursts in cellular networks. The unpredictability of burst traffic poses a challenge for prediction, as it can lead to sudden spikes in data traffic. The presence of bursty data traffic can undermine the accuracy of conventional time series forecasting techniques in predicting data traffic within 5G networks, as these techniques may prove inadequate in capturing such traffic patterns. Nevertheless, machine learning [141,142], deep learning [135], and machine learning-based hybrid [54,143] approaches can enhance the precision of data traffic prediction in 5G networks.
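As a toy illustration of the compound Poisson view of bursty traffic (our own simplified simulation, not the formulation of [138]), per-slot volume below is the sum of exponentially sized payloads over a Poisson number of arrivals, and burstiness is summarized by the peak-to-average ratio debated in [131,132]:

```python
import random

def compound_poisson_traffic(slots, rate, mean_payload, seed=0):
    """Per-slot traffic volume: Poisson arrival count times exponential payloads."""
    rng = random.Random(seed)
    volumes = []
    for _ in range(slots):
        # Poisson sampling via exponential inter-arrival times within a unit slot.
        arrivals = 0
        t = rng.expovariate(rate)
        while t < 1.0:
            arrivals += 1
            t += rng.expovariate(rate)
        volumes.append(sum(rng.expovariate(1.0 / mean_payload)
                           for _ in range(arrivals)))
    return volumes

def peak_to_average(volumes):
    """Classic burstiness indicator: peak slot volume over mean slot volume."""
    avg = sum(volumes) / len(volumes)
    return max(volumes) / avg if avg > 0 else float("inf")
```

Raising `rate` smooths the series toward its mean (peak-to-average near 1), while sparse, heavy payloads drive the ratio up, which is the qualitative behavior the burst-factor debate concerns.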
Authors in [141] proposed Random Forest, Decision Tree, k-Nearest Neighbors, Logistic Regression, and Gaussian Processes for the classification of bursts in encrypted data traffic with an accuracy of 94–95%. In Ref. [142], the authors proposed a supervised machine learning technique for predicting burst data traffic of IoT devices to achieve ultra-reliable low latency. Authors in [135] proposed a long short-term memory (LSTM) network for predicting IoT burst data. A hybrid machine learning method combining LSTM and gated recurrent units (GRU) was proposed in [143] for predicting data traffic, and especially data traffic bursts, in cellular networks. Authors in [54] presented a hybrid machine learning method (Prophet Algorithm + GPR + ADMM) applied to real-time data traffic, which predicted the burst data traffic patterns.
The distinguishing features of one-way streaming traffic in 5G networks, including a continuous flow of data, robust bandwidth requirements, consistent data rates, prolonged durations, and susceptibility to delays, have been extensively studied in [121,122,123,124,125]. These features are crucial for video and audio live streaming. This traffic pattern has been a major driver of the recent surge in data usage in cellular networks [144]. The performance of cellular networks in handling streaming traffic is influenced by factors such as user mobility speed, which can impact the blocking and cut probabilities [145]. Traditional time series forecasting methods struggle to accurately anticipate these data traffic flow patterns, as they are frequently impacted by real-time occurrences and user behavior [146]. To tackle this challenge, the literature reports machine learning [147] and deep learning techniques for live streaming caused by massive events [27,50] for the accurate prediction of streaming data traffic patterns. In particular, authors in [27] proposed a multi-task Convolutional LSTM network (MT-ConvLSTM) for data traffic prediction for live streaming caused by massive events. In Ref. [50], real-time data prediction incorporating streaming data was presented using a deep learning method (echo state network, ESN). Based on the existing body of literature, we propose hybrid machine learning techniques [52,148,149,150] for more accurate prediction of this data traffic pattern.
In cellular networks, two-way real-time data traffic patterns are referred to as interactive multimedia traffic [126], with variable data rates [134] and occasional sudden bursts [134,151,152]. Interactive multimedia traffic can be categorized into three subcategories. The first encompasses applications requiring very high bandwidth and low latency, such as virtual and augmented reality [134,153,154,155]. The second includes applications with low latency but not necessarily very high bandwidth, like online gaming [156,157], remote telesurgery [158], video conferencing applications (Zoom, Google Meet, Skype, etc.) [100], as well as social media real-time video and audio applications (WeChat, WhatsApp, Messenger, etc.) [127]. Lastly, the third consists of chatting applications that are not delay-sensitive (text chatting and file transfers) [127,128]. More than 90% of chat (WhatsApp) streams contain a very small amount of data [159]. Furthermore, in the WeChat application, each real-time chat initiates with several small-sized, short-term W-UDP flows to multiple servers and later utilizes one or two long-term W-UDP flows for chatting [160]. Traditional time series prediction models may not suffice to accurately predict interactive multimedia data traffic patterns due to their dynamic and complex nature [161]. For a more precise prediction of this pattern, machine learning [162] and hybrid machine learning techniques [163] have been presented. Authors in [162] suggested machine learning techniques (SVM, Bayes Net, Naïve Bayes) for data traffic classification of WeChat with very high accuracy. In Ref. [163], authors introduced a hybrid machine learning approach (ARIMA + LSTM) designed for the analysis of real-time data traffic, incorporating augmented reality data traffic.
Based on the existing body of literature [115,116,164,165,166,167,168,169,170,171,172], our proposition involves the utilization of advanced deep learning techniques aimed at enhancing the precision of predicting this data traffic pattern.
The Internet of Things (IoT) is an emerging paradigm characterized by a network consisting of physical objects and everyday items, such as vehicles, devices, sensors, smart homes, and various other entities. It represents a swiftly expanding network of interconnected devices that autonomously exchange data [129]. Within the realm of 5G networks, IoT assumes a pivotal role by emphasizing end-to-end communication among devices, thereby shaping the IoT data traffic patterns within cellular networks [173]. According to Macriga et al. (2021), the 5G infrastructure offers enhanced data rates and reduced latency for IoT data traffic in comparison to earlier generations of cellular networks like 3G and 4G [174]. The escalating proliferation of IoT devices is resulting in heightened traffic volumes, leading to packet loss and augmented data transmission delays [175,176]. IoT devices usually send small data payloads [109]. To tackle this issue, a dynamically shared connectivity framework has been advocated to manage dense IoT traffic effectively, culminating in enhanced resource allocation efficiency and diminished signaling expenses [174]. Furthermore, the concept of distributed caching has been proposed as a strategy to alleviate peak traffic loads in ultra-dense IoT networks [176]. According to Finley et al. (2019), IoT uplink data traffic exceeds downlink data traffic, and peak traffic volumes are sometimes small and sometimes large [177]. The battery condition of Internet of Things (IoT) devices plays a pivotal role in maintaining efficient IoT data traffic. When the battery is in good health, data transmission and reception are typically stable and fast. However, as the battery level decreases, these operations become limited or may even be suspended, resulting in degraded network performance.
To address this challenge, the authors in [178] proposed a multi-receiver wireless charging system with a single transmitter (MF-STMR-WC). This system enhances the charging efficiency and power distribution among multiple IoT devices, thereby improving overall IoT data traffic stability in cellular networks. The authors in [179] present a promising scheme for short-range, low-power, and low-cost wireless communications, which can improve IoT data traffic. Given the increase in IoT devices, the influence of IoT data traffic on the prediction of data traffic in 5G networks should not be underestimated. Since IoT devices can generate data traffic in sizable bursts, it becomes challenging to anticipate the timing and volume of such traffic. Moreover, the data traffic generated by IoT devices can display notable fluctuations due to varying communication patterns and data consumption requirements across different devices [177]. Machine learning techniques (Decision Tree (DT), K-Nearest Neighbors (K-NN), Naïve Bayes (NB), Gradient Boosting (GRB)) have also been proposed for the classification of IoT data traffic [180]. For predicting the IoT data traffic pattern, time-series models (ARIMA, VARMA) [181], machine learning (SVR) [182], deep learning (NARX neural network, LSTM, FFNN, Flow2graph, GRU) [135,181,183,184,185], and hybrid machine learning (TFVPtime-LSH) [186] techniques have been used. The LSTM and FFNN methods were more accurate than the time-series models (ARIMA, VARMA) [181]. In particular, the authors in [186] presented a hybrid machine learning method for real-time data traffic prediction in the Internet of Vehicles.
Background data traffic in the cellular network comprises traffic generated by devices or applications even when they are not actively being used [130]. The proliferation of smartphones and their diverse range of applications has led to a significant increase in background data traffic in cellular networks. This traffic, which includes activities such as system upgrades, backups, social media updates, and email synchronization, can lead to high signaling overhead, resource wastage, and battery drain [130,187,188,189]. To address these issues, various studies have proposed power-saving mechanisms and tools for detecting and managing background applications [130,187,188]. Background traffic exhibits a periodic behavior identifiable from unlabeled communication traffic [190]. It can usually be classified as light background traffic (e.g., Facebook) or heavy background traffic (e.g., Skype); both are delay-tolerant but will sometimes produce sudden data traffic bursts [130]. Background data traffic is significant and cannot be ignored [187]. Accordingly, data traffic prediction in a cellular network must not ignore the background traffic pattern. For this pattern, real-time data incorporating background traffic has been used for prediction with machine learning (Graph Attention Spatial-Temporal Network (GASTN), Random Forest) [51,191], deep learning (LSTM, GRU) [192], and hybrid machine learning (HSTNet) [59] techniques.
For each of the aforementioned traffic patterns, numerous prediction methods have been proposed. Figure 4 represents the data traffic patterns in conjunction with the predictive methodologies corresponding to each identified pattern. Table 3 exhibits the traffic patterns alongside their data sources, characteristics, and prediction methods.
It is imperative to anticipate these spatiotemporal traffic profiles within 5G to orchestrate and optimize resource utilization [193,194]. By accurately predicting the data traffic for each pattern, network operators can facilitate congestion management, regulate admission, allocate bandwidth to the system, and detect any irregularities [195]. As a result, the advancement and utilization of machine learning, deep learning, and hybrid machine learning methodologies are in progress to improve the precision of data traffic prediction of each traffic pattern.
Table 2. Data Traffic Types.
Author | Year | Data Traffic Type
Alsenwi et al. [99] | 2021 | eMBB, URLLC, mMTC
Zhang et al. [113] | 2022 | eMBB, URLLC, mMTC
Abdelsadek et al. [114] | 2020 | eMBB, URLLC, mMTC
Kumar et al. [117] | 2022 | eMBB, URLLC, mMTC
Thantharate et al. [118] | 2019 | eMBB, URLLC, mMTC
Lykakis et al. [7] | 2023 | eMBB, URLLC, mMTC
Siddiqi et al. [107] | 2019 | eMBB, URLLC
Hsu et al. [108] | 2022 | eMBB
Sohaib et al. [110] | 2023 | eMBB
Popovski et al. [109] | 2018 | eMBB, mMTC
Ray et al. [111] | 2020 | mMTC
Belhadj et al. [112] | 2021 | mMTC
Table 3. Data Traffic Patterns.
Pattern | Data | Characteristics | Method
Burst Data Traffic [119,120,130,131,132,133,134,135,136,137,138,139,140,141,142] | Video Streaming, Online Gaming, Virtual Reality, IoT, Social Media Updates and Email Synchronization | Sudden Surges in Data Transfer Rates, Brief Duration, Irregular Patterns, Variable Packet Sizes, Periodic Data Transmissions, Sporadic Data Flows | Random Forest, Decision Tree, k-Nearest Neighbors, Logistic Regression, Gaussian Processes, LSTM + GRU, Prophet Algorithm + GPR + ADMM, FFNN, Naïve Bayes (NB), Gradient Boosting (GRB)
One-Way Streaming Data Traffic [27,121,122,123,124,125,144,145,146] | Video and Audio Live Streaming | Continuous Flow of Data, Robust Bandwidth Requirements, Consistent Data Rates | SVM, ESN, MT-ConvLSTM
Interactive Multimedia Data Traffic [100,126,127,128,134,151,152,153,154,155,156,157,159,160,161,162] | Online Games, Virtual and Augmented Reality, Remote Telesurgery, Social Media Chat | Variable Data Rates | SVM, Bayes Net, Naïve Bayes, ARIMA + LSTM, TDNN
IoT Data Traffic [129,135,173,174,175,176,177,178,179,180,181,182,183,184,185,186] | Smart Home, Sensors, Vehicles, Devices, etc. | Data Traffic Fluctuations, Peak Traffic Volumes Small or Large | ARIMA, VARMA, SVR, TFVPtime-LSH, GRU, LSTM, FFNN, NARX NN, Flow2graph
Background Data Traffic [130,187,188,189,190,192] | System Upgrades, Backups, Social Media Updates and Email Synchronization | Generated Without Active Use, Separated into Light and Heavy Data Traffic, Can Lead to High Signaling Overhead | GASTN, Random Forest, LSTM, GRU, HSTNet

5.2. Dataset Characteristics in Cellular Network

A mobile network dataset must adhere to a predetermined set of well-defined attributes. Existing datasets for mobile traffic analytics are heterogeneous in their time granularity, aggregation depth, and traffic classification. This disparity stems directly from the methodology employed in data collection and the objectives of the operator, particularly when the data is made publicly available [196]. A comprehensive examination of mobile datasets can be found in [33]. As a means of general classification, datasets can be categorized based on the subsequent characteristics:
The first classification differentiates between assessments performed across the entire network and those centered on single users. At the network level, data traffic is analyzed within a specific geographical region for all users accessing a particular cellular network. At the user level, data traffic is assessed individually for each user. In general, the closer the data collection process is to the user, the more comprehensive the information that can be obtained. However, obtaining user-side data generally requires substantial effort to capture a perspective that encompasses the entire network. To address this issue, a practical approach is to employ crowdsourcing: applications installed on user devices collect network measurements and periodically transmit them to a central server, yielding a continually refreshed repository of network information [197]. Operators may consider employing network-centric aggregation systems to mitigate potential privacy concerns that may arise from sharing per-user information. In such instances, network managers can gather communication data for all users within each cell regularly. This enables network-wide analysis, supporting the detection and characterization of usage patterns [198]. Aggregating above the level of individual users removes explicit identifiers, enhancing privacy guarantees.
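The cell-level aggregation strategy can be made concrete with a short sketch (record fields and names are illustrative, not from any operator system): per-user measurements enter, but only per-cell totals and anonymous user counts leave the aggregation step.

```python
from collections import defaultdict

def aggregate_per_cell(records):
    """Collapse per-user traffic records into per-cell totals.

    Each record is a (user_id, cell_id, nbytes) tuple. The output deliberately
    drops user_id, keeping only cell_id -> {total_bytes, active_users}.
    """
    acc = defaultdict(lambda: {"total_bytes": 0, "users": set()})
    for user_id, cell_id, nbytes in records:
        acc[cell_id]["total_bytes"] += nbytes
        acc[cell_id]["users"].add(user_id)
    # Replace the user set by a count so that no identifier survives aggregation.
    return {cell: {"total_bytes": v["total_bytes"],
                   "active_users": len(v["users"])}
            for cell, v in acc.items()}
```

The design choice mirrors the text: analyses that only need per-cell load and user counts never handle identifiers at all, which is what gives the aggregated dataset its privacy advantage over user-side collection.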
The second category pertains to time sensitivity, wherein the temporal resolution of the traffic data assumes utmost significance. High-resolution temporal data enhances analytical depth and promotes the discovery of optimization solutions. As demonstrated in [199,200], base-station utilization follows a diurnal profile that recurs across weeks, showing minimal demand at night and increased demand during the day. Studies have also noted a weekend effect, with traffic volumes falling below weekday values [200,201]. As shown in [202], data usage is seasonal, increasing by around 20% in the final months of the year relative to the summer months.
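Because of this recurring diurnal profile, a seasonal-average baseline is often the first predictor tried on hourly traffic data: estimate the mean load for each hour of the day and repeat that profile forward. A minimal sketch (plain Python; names are illustrative, assuming the series covers at least one full period):

```python
def hourly_profile(loads, period=24):
    """Mean load per hour-of-day over an hourly load series."""
    sums, counts = [0.0] * period, [0] * period
    for t, load in enumerate(loads):
        sums[t % period] += load
        counts[t % period] += 1
    return [s / c for s, c in zip(sums, counts)]

def seasonal_forecast(profile, horizon, start=0, period=24):
    """Forecast future hours by repeating the learned diurnal profile."""
    return [profile[(start + h) % period] for h in range(horizon)]
```

Such a baseline captures the night/day cycle described above but, by construction, misses weekend effects and bursts, which is exactly the gap the learned models in later sections target.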
The third category refers to spatial sensitivity, where it becomes imperative to identify dynamic patterns and relationships between user behaviors and network usage. By specifying the characteristics of the regions from which network data are gathered, it becomes feasible to engineer locally optimized approaches. Authors in [200,201,203] have observed that data traffic peaks vary depending on the region. More specifically, rural areas exhibit different daily data traffic peaks compared to cities, which experience distinct peaks during the day. Sun et al. (2000) identify spatiotemporal structure in traffic demand that can inform predictive models of load [199]. Several studies report that adjacent regions exhibit similar average weekday demand but show pronounced divergence on weekends [200,201].
The fourth category pertains to other characteristics. Traffic data exhibits heterogeneity. Mobile operators frequently generate data CDRs that measure text, voice, and internet data communications for billing purposes. For measurement purposes, the employed protocol (TCP, UDP) and the access technology (VoIP, LTE, 3G, 4G, 5G) can be identified. By measuring these various elements mentioned above, it becomes possible to determine the diverse requirements of users, thereby contributing to the enhancement of Quality of Service (QoS) [196].
Table 4. Dataset Characteristics.
Author | Year | Dataset Characteristics
Shafiq et al. [198] | 2012 | Network vs. Single User Data Traffic
Cardona et al. [202] | 2014 | Temporal Sensitivity
Zhang et al. [203] | 2012 | Spatial Sensitivity
Sun et al. [199] | 2000 | Spatiotemporal Sensitivity
Paul et al. [200] | 2011 | Spatiotemporal Sensitivity
Wang et al. [201] | 2013 | Spatiotemporal Sensitivity
Trinh et al. [196] | 2020 | Other Characteristics
Naboulsi [33] | 2015 | Other Characteristics

5.3. Current Approaches for Forecasting Cellular Network Data Traffic

According to [204], it is projected that internet data traffic will experience a tenfold increase by the year 2027. This substantial growth is anticipated to have a significant impact on the architectural design of the next generation of cellular networks. Accurate traffic prediction is critical for effective optimization and management of communication networks. This prediction plays a crucial role in areas such as optimal routing, energy conservation, and the detection of network anomalies [204]. Furthermore, the effective management of congestion is widely recognized as a key element of the 5G/6G technology. This technology enables users to carry out a multitude of tasks using a single infrastructure while enjoying improved quality of service.

5.3.1. Traditional Methods

According to the current body of published work, traditional techniques for forecasting data traffic can be organized into two principal classifications. The first comprises statistical prediction techniques, such as Hidden Markov models, Markov chains, and Naive Bayes [39] (Table 5). These approaches have been widely used owing to their simplicity and interpretability. However, studies have shown that their performance declines when applied to modern cellular networks characterized by non-linear and non-stationary traffic patterns, making them less suitable for complex and rapidly evolving network environments.
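As an illustration of the statistical family, the following minimal sketch (not drawn from any of the cited works; the discretized load states and the toy sequence are assumptions) shows how a first-order Markov chain can be fitted to a traffic-state sequence and used to predict the most likely next state:

```python
import numpy as np

# Hypothetical traffic states: 0 = low, 1 = medium, 2 = high load.
# A first-order Markov chain estimates transition probabilities from an
# observed state sequence and predicts the most likely next state.

def fit_transition_matrix(states, n_states=3):
    """Count state-to-state transitions and normalize each row."""
    counts = np.zeros((n_states, n_states))
    for a, b in zip(states[:-1], states[1:]):
        counts[a, b] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1          # avoid division by zero for unseen states
    return counts / row_sums

def predict_next(P, current_state):
    """Most probable next state given the current one."""
    return int(np.argmax(P[current_state]))

# Toy hourly load sequence (illustrative, not from the surveyed datasets).
seq = [0, 0, 1, 1, 2, 2, 1, 0, 0, 1, 2, 2, 1, 0]
P = fit_transition_matrix(seq)
nxt = predict_next(P, current_state=2)
```

In this toy sequence, high load (state 2) is most often followed by medium load, so the predictor returns state 1; richer statistical models such as Hidden Markov models add latent states on top of this same transition machinery.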
The second classification concerns time-series–based approaches for predicting data traffic (Table 6). Time-series methodologies can be categorized as linear (including ARIMA, AR, MA, GARMA, ARMA, HOLT–WINTERS, SARIMA, Fractional Auto-Regressive Integrated Moving Average (FARIMA)/Autoregressive Fractionally Integrated Moving Average (ARFIMA), and Kalman filtering) [36,39,205,206,207,208,209,210,211], non-linear (GARCH) [39], probabilistic [212], hybrid [213,214,215,216], and the SHADOW CLUSTER [217] grouping method. Empirical comparisons reveal that while linear time-series models such as ARIMA provide reasonable short-term forecasts, they tend to be suboptimal on non-linear and non-stationary data. For instance, Zhang et al. (2020) reported that ARIMA deviated substantially from the true values, fit the traffic peaks poorly, and yielded worse RMSE and MAE than their proposed hybrid spatiotemporal network (HSTNet) when forecasting highly variable cellular data traffic (SMS, call, internet) [59]. Similarly, Wang et al. (2017) showed that their hybrid machine learning method (GSAE + LSTM) achieves around 40.4% lower MSE and 28.4% lower MAE than ARIMA [149]. Although hybrid time-series models partially mitigate these issues by combining linear and non-linear representations, they still require extensive parameter tuning and may not scale efficiently for large, heterogeneous 5G datasets. As 5G networks mature and data heterogeneity intensifies, these limitations underscore the need for more adaptive and scalable forecasting frameworks.
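To make the linear time-series family concrete, the sketch below fits an AR(p) model by ordinary least squares on a synthetic periodic traffic series and rolls it forward; the series, the order p = 24, and the train/test split are illustrative assumptions, not taken from the cited studies:

```python
import numpy as np

def fit_ar(series, p):
    """Fit AR(p) coefficients (intercept + p lags) by least squares."""
    series = np.asarray(series, dtype=float)
    n = len(series)
    # Each row: [1, x_{t-1}, ..., x_{t-p}]; target: x_t.
    X = np.array([[1.0] + list(series[t - p:t][::-1]) for t in range(p, n)])
    y = series[p:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def forecast(series, coef, steps):
    """Iteratively roll the fitted model forward `steps` points."""
    p = len(coef) - 1
    hist = list(series)
    for _ in range(steps):
        lags = hist[-p:][::-1]           # [x_{T-1}, ..., x_{T-p}]
        hist.append(coef[0] + float(np.dot(coef[1:], lags)))
    return hist[-steps:]

# Synthetic hourly traffic with a daily-like periodic component.
t = np.arange(200)
traffic = 50 + 10 * np.sin(2 * np.pi * t / 24)
coef = fit_ar(traffic[:150], p=24)
pred = forecast(traffic[:150], coef, steps=24)
```

On this perfectly periodic series the AR(24) fit extrapolates the next day almost exactly; on real, non-stationary cellular traces the same model degrades, which is precisely the weakness the hybrid methods above target.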
Table 5. Statistical Methods for Data Traffic Prediction.
Author | Year | Model | Method | Advantages | Disadvantages
Chen et al. [39] | 2021 | Hidden Markov, Markov Chain, Naive Bayes | Statistical | 1. Low computational cost; 2. Excellent for stationary data | 1. Not useful for spatiotemporal and non-stationary data; 2. Lack data protection techniques
Table 6. Time Series Models for Data traffic Prediction.
Author | Year | Model | Method | Advantages | Disadvantages
Chen et al. [39] | 2021 | GARCH | Non-linear | 1. Low computational cost; 2. Useful for spatiotemporal data; 3. Good for IoT data traffic patterns, especially from IoT sensors | 1. Low prediction accuracy on heterogeneous data; 2. Lack data protection techniques
Chen et al. [39] | 2021 | ARIMA | Linear
Levine et al. [217] | 1997 | SHADOW CLUSTER | Grouping Method
Sadek et al. [207] | 2004 | AR, MA, GARMA | Linear
Tan et al. [206] | 2010 | ARMA | Linear
Tikunov et al. [205] | 2007 | HOLT–WINTERS | Linear
Sciancalepore et al. [208] | 2017 | HOLT–WINTERS | Linear
Hajirahimi et al. [36] | 2019 | ARFIMA | Linear
Whittaker et al. [211] | 1997 | Kalman Filtering | Linear
Medhn et al. [209] | 2017 | SARIMA | Linear
AsSadhan et al. [210] | 2017 | FARIMA | Linear
Mitchell et al. [214] | 2001 | MULTI–CELL + CLASS MODEL | Hybrid
Mehdi et al. [215] | 2022 | Fuzzy ARIMA | Hybrid
Tran et al. [216] | 2019 | Holt–Winters Multiplicative Seasonal (HWMS) | Hybrid
Zhou et al. [213] | 2006 | GARCH + ARIMA | Hybrid
Choi et al. [212] | 2002 | PROBAB. | Probabilistic

5.3.2. Contemporary Methods

To mitigate deficits in time-series models, machine learning is adopted for enhanced forecasting performance. The third classification focuses on ML-based predictions and includes supervised ML trained on past traffic traces, deep learning, and hybrid schemes that couple statistical or time-series techniques with supervised ML or combine multiple supervised predictors.
Supervised machine learning techniques (Table 7) have demonstrated superior accuracy in predicting data traffic compared to traditional methods [147,218,219]. Bouzidi et al. (2018) employed supervised machine learning to predict latency from data traffic [220]. Yue et al. (2017) utilized mobility data to perform real-time bandwidth prediction [51]. However, these methods are less accurate than DL and hybrid methods and lack data protection techniques.
Deep learning techniques (Table 8) achieve better accuracy than ML and traditional methods [49,50,56,115,116,164,165,166,167,168,169,170,171,221,222,223,224,225,226]. Another study employing deep learning methods achieved optimization of quality of service (QoS) [172]. Huang et al. (2017) predicted the minimum, average, and maximum traffic for the subsequent hour based on the previous hour's traffic [58]. Pfülb et al. (2019) estimated data flow size using supervised machine learning and a hybrid approach [227]. However, these methods exhibit high computational complexity and do not incorporate data protection techniques.
Through the utilization of hybrid methods (Table 9) in various studies, an improvement in the accuracy of data traffic prediction was observed in comparison to other methods [52,53,54,57,59,60,143,149,150,228,229,230,231,232,233,234,235,236,237,238,239,240,241,242,243,244,245,246]. Authors in [20,31,247] improved prediction accuracy while reducing computation time. In Ref. [248], the researchers produced synthetic datasets that closely emulate the original data, with a high degree of predictive accuracy. Other studies have reported network performance optimization [60,249,250,251], quality of service (QoS) optimization [252,253], energy consumption reduction [231,249,254,255], latency reduction [250,251], and improvement in the quality of user experience (QoE) [148]. In their research, Garrido et al. [55] reduced the time between resource demand and orchestration by employing a hybrid method. Some studies have identified a trade-off between the accuracy of data traffic prediction and the execution time of each method [92,94]. Yadav et al. (2021) predicted the utilization of mobile telephony for the next 10 years based on data traffic using a hybrid approach [163]. Nan et al. (2022) combined personal data protection with machine learning using a hybrid approach, achieving a prediction accuracy of 86.02% [256]. Uyan et al. (2022) forecasted data traffic for the upcoming two weeks using a hybrid method incorporating machine learning [257]. As these findings make evident, the adoption of hybrid methods incorporating machine learning yields superior outcomes in terms of prediction accuracy and its contribution to enhancing the management of mobile networks. Despite significant progress, prior research has not yet established a balanced approach that simultaneously ensures high accuracy, strong data privacy, and fast computation.
Predictive analytics of traffic in 5G networks supports efficient resource allocation, slicing policies, routing optimization, and anomaly surveillance, delivering higher QoS/QoE alongside improved energy and latency metrics [7]. In order to achieve energy savings through data traffic prediction, the sleep strategy [231,258,259,260,261,262,263,264] is employed, whereby inactive cellular network resources are disabled based on the prediction of data traffic. However, due to the complexity of 5G networks and the escalating demand for data, research in this area remains open, as the development of traffic prediction frameworks necessitates the integration of privacy, high prediction accuracy, and minimal execution time.
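A minimal sketch of the prediction-driven sleep strategy described above is shown below; the wake threshold, cell names, and per-cell forecasts are hypothetical assumptions, not values from the cited works:

```python
# Minimal sketch of a prediction-driven sleep strategy: cells whose
# predicted load stays below a threshold for the whole forecast horizon
# are candidates for sleeping. Threshold and data are illustrative.

SLEEP_THRESHOLD = 0.15   # fraction of capacity below which a cell may sleep
predicted_load = {       # hypothetical per-cell load forecasts (next 3 hours)
    "cell_A": [0.62, 0.71, 0.80],
    "cell_B": [0.05, 0.08, 0.04],
    "cell_C": [0.30, 0.12, 0.22],
}

def select_sleeping_cells(forecasts, threshold):
    """Return cells whose forecast never exceeds the threshold."""
    return sorted(
        cell for cell, loads in forecasts.items()
        if max(loads) < threshold
    )

sleeping = select_sleeping_cells(predicted_load, SLEEP_THRESHOLD)
```

Only `cell_B` qualifies here; in practice the decision would also weigh coverage guarantees and the energy cost of waking a cell back up, which is why prediction accuracy matters for this strategy.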
Recent developments in machine learning and deep learning have advanced data traffic prediction for 5G networks and beyond, but significant obstacles remain. Interpretability (deep learning models often produce very accurate predictions, yet offer administrators no easy explanation of how these results compare to statistical and spatiotemporal methods), scalability (growing data volumes, increasing numbers of users, and expanding network infrastructures), and energy efficiency, all critical for real-time application in complex cellular environments, are often neglected in many current models in favor of higher accuracy. The lack of standardized large datasets further limits meaningful comparability and reproducibility across methods. Privacy-preserving methods such as homomorphic encryption are gaining ground, although they remain computationally expensive and are not yet fully integrated into predictive frameworks. Research should focus on developing explainable, lightweight models that maintain high predictive performance while preserving operational sustainability and data privacy.
Table 7. Supervised Machine Learning Models for Data traffic Prediction.
Author | Year | Model | Method | Advantages | Disadvantages
Khan et al. [147] | 2022 | SVM | Supervised ML | 1. Better accuracy than time-series methods; 2. Good for all data traffic patterns; for interactive multimedia traffic, excellent mainly for online chat (WeChat, etc.); 3. Lower computational complexity than deep learning methods | 1. Not very accurate for interactive multimedia patterns such as augmented reality; 2. Less accurate than DL and hybrid methods; 3. Lack data protection techniques
Aceto et al. [218] | 2021 | Markov Chains | Supervised ML
Dash et al. [219] | 2019 | HMM | Supervised ML
Yue et al. [51] | 2017 | Random Forest | Supervised ML
Bouzidi et al. [220] | 2018 | ILF | Supervised ML
Table 8. Deep Learning Models for Data traffic Prediction.
Author | Year | Model | Method | Advantages | Disadvantages
Guo et al. [115] | 2019 | GRU | DL | 1. Better accuracy than statistical and ML methods; 2. Very good for all data traffic patterns; 3. Improve QoS and data flow size (the advantages apply to all rows below) | 1. Higher computational cost than statistical and ML methods; 2. Lack data protection techniques; 3. Less accurate than hybrid contemporary methods (the disadvantages apply to all rows below)
Bega et al. [221] | 2019 | 3D-CNN | DL
Zhang et al. [49] | 2018 | CNN | DL
Liang et al. [222] | 2019 | CNN | DL
Cui et al. [50] | 2014 | ESN | DL
Nikravesh et al. [223] | 2016 | MLP, MLPWD | DL
Zhao et al. [265] | 2022 | BP | DL
Yimeng et al. [226] | 2022 | Transformers | DL
Pfülb et al. [227] | 2019 | DNN | DL
Chen et al. [164] | 2018 | LSTM | DL
Zhou et al. [165] | 2018 | LSTM | DL
Zhao et al. [166] | 2019 | LSTM | DL
Trinh et al. [167] | 2018 | LSTM | DL
Chen et al. [168] | 2019 | LSTM | DL
Azzouni et al. [169] | 2017 | LSTM | DL
Dalgkitsis et al. [170] | 2018 | LSTM | DL
Alawe et al. [171] | 2018 | LSTM | DL
Xiao et al. [116] | 2018 | LSTM | DL
Jaffry et al. [224] | 2020 | FFNN | DL
Gao [56] | 2022 | SLSTM | DL
Guerra-Gomez et al. [172] | 2020 | TDNN | DL
Selvamanju et al. [225] | 2022 | DLMTFP | DL
Table 9. Hybrid Models for Data traffic Prediction.
Author | Year | Model | Method | Advantages | Disadvantages
Paul et al. [150] | 2019 | k-means + Weiszfeld + LSTM-GRU | Hybrid | 1. Better accuracy than other methods; 2. Network performance optimization; 3. Quality of Service (QoS) optimization; 4. Energy consumption reduction; 5. Excellent performance, especially for burst, interactive multimedia, IoT (IoT bursts), and background data traffic (the advantages apply to all rows below) | Lack of balance between accuracy, data privacy, and computational efficiency (applies to all rows below)
Andreoletti et al. [233] | 2019 | DCRNN | Hybrid
Pelekanou et al. [234] | 2018 | ILP + LSTM + MLP | Hybrid
Gong et al. [240] | 2024 | KGDA | Hybrid
Zang et al. [57] | 2015 | k-means + Wavelet transform + Elman-NN | Hybrid
Zheng et al. [148] | 2016 | RBMs + NN | Hybrid
Chen et al. [58] | 2018 | LSTM + CNN | Hybrid
Fang et al. [237] | 2022 | Wavelet Denoising + Deep Gaussian Process | Hybrid
Le et al. [52] | 2018 | Naïve Bayes + AR + NN + GP | Hybrid
Zhang et al. [59] | 2020 | HSTNet | Hybrid
Dommaraju et al. [250] | 2020 | ECMCRR-MPDNL | Hybrid
Wang et al. [143] | 2020 | LSTM + GPR | Hybrid
Gao et al. [251] | 2021 | DRL | Hybrid
Uyan et al. [257] | 2022 | k-means + n-beans | Hybrid
Wang et al. [60] | 2019 | DU-AAU | Hybrid
Xu et al. [94] | 2019 | ADMM + Cross-Validation + GP | Hybrid
Shawel et al. [229] | 2020 | Double Seasonal ARIMA | Hybrid
Yadav et al. [163] | 2021 | ARIMA + LSTM | Hybrid
Aldhyani et al. [236] | 2020 | FCM + LSTM + ANFIS | Hybrid
Li et al. [235] | 2020 | LSTM + CNN | Hybrid
Alsaade et al. [228] | 2021 | SES-LSTM | Hybrid
Selvamanju et al. [239] | 2022 | AOADBN-MTP | Hybrid
Li et al. [238] | 2022 | EEMD + GAN | Hybrid
Garrido et al. [55] | 2021 | CATP | Hybrid
Zeb et al. [232] | 2021 | Encoder–Decoder LSTM | Hybrid
Su et al. [31] | 2024 | Lightweight Hybrid Attention Deep Learning | Hybrid
Pandey et al. [248] | 2024 | 5GT-GAN-NET | Hybrid
Huang et al. [249] | 2019 | DQN | Hybrid
Mehri et al. [241] | 2024 | FLSP | Hybrid
Nashaat et al. [20] | 2024 | AML-CTP Framework | Hybrid
Hua et al. [230] | 2018 | CLSTM | Hybrid
Zhu et al. [254] | 2021 | LR + DNN | Hybrid
Bouzidi et al. [253] | 2019 | ILP + DRL + LSTM | Hybrid
Zhao et al. [53] | 2020 | STGCN-HO | Hybrid
Zeng et al. [252] | 2020 | Fusion-transfer + STC-N | Hybrid
Liu et al. [54] | 2021 | Prophet algorithm + GPR + ADMM | Hybrid
Jiang et al. [246] | 2024 | CNN + Graph Neural Network (GNN) | Hybrid
Zorello et al. [255] | 2022 | LR + LSTM + FFNN + MILP | Hybrid
Nan et al. [256] | 2022 | FedRU | Hybrid
Zhou et al. [245] | 2024 | Patch-based Neural Network | Hybrid
Wang et al. [149] | 2017 | GSAE + LSTM | Hybrid
Zhang et al. [231] | 2017 | SARIMA + top-K + Regression Tree Random Forest | Hybrid
Cai et al. [242] | 2024 | DBSTGNN-Att | Hybrid
Hao et al. [243] | 2024 | NCP | Hybrid
Cao et al. [244] | 2024 | HAN | Hybrid
Wu et al. [247] | 2024 | CLPREM | Hybrid
Chen et al. [92] | 2020 | DBLS | Hybrid

5.4. Evaluation Metrics for the Data Traffic Prediction

Evaluating the efficacy of a machine learning model is a pivotal stage in constructing a capable ML model. To appraise the quality of the model, diverse metrics, known as performance or evaluation metrics, are employed. These metrics indicate how well the model fits the data. The evaluation metrics utilized in the aforementioned studies were as follows [7]:
  • “MSE (Mean Square Error)” [245,256], Equation (1):
MSE = \frac{1}{n}\sum_{t=1}^{n}\left(x_t-\hat{x}_t\right)^2
  • “RMSE (Root Mean Square Error)” [49,218,242,246], Equation (2):
RMSE = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(x_t-\hat{x}_t\right)^2}
  • “ARMSE (Average Root Mean Square Error)” [231], Equation (3):
ARMSE = \frac{1}{n_r}\sum_{r=1}^{n_r}\sqrt{\frac{1}{n}\sum_{t=1}^{n}\left|x_t-\hat{x}_t\right|^2}
  • “RRMSE (Relative RMSE)” [53], Equation (4):
RRMSE = \frac{RMSE}{X_{max}-X_{min}}\times 100
  • “NMSE (Normalized Mean Square Error)” [50], Equation (5):
NMSE = \frac{\frac{1}{n}\sum_{t=1}^{n}\left(x_t-\hat{x}_t\right)^2}{\frac{1}{n}\sum_{t=1}^{n}\left(x_t-\frac{1}{n}\sum_{t=1}^{n}x_t\right)^2}
  • “NRMSE (Normalized Root Mean Square Error)” [228], Equation (6):
NRMSE = \frac{\sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(x_t-\hat{x}_t\right)^2}}{\frac{1}{n}\sum_{t=1}^{n}x_t}
  • “RE (Relative Error)” [115], Equation (7):
RE = \frac{\left|x-\hat{x}\right|}{\left|x\right|}
  • “MRE (Mean Relative Error)” [225], Equation (8):
MRE = \frac{1}{n}\sum_{t=1}^{n}\frac{\left|x_t-\hat{x}_t\right|}{\left|x_t\right|}
  • “NMAE (Normalized Mean Absolute Error)” [57], Equation (9):
NMAE = \frac{\sum_{t=1}^{n}\left|x_t-\hat{x}_t\right|}{\sum_{t=1}^{n}x_t}
  • “MAE (Mean Absolute Error)” [242,243,246], Equation (10):
MAE = \frac{1}{n}\sum_{t=1}^{n}\left|x_t-\hat{x}_t\right|
  • “MAPE (Mean Absolute Percentage Error)” [20,255], Equation (11):
MAPE = \frac{1}{n}\sum_{t=1}^{n}\left|\frac{x_t-\hat{x}_t}{x_t}\right|\times 100
  • “MA (Mean Accuracy)” [58], Equation (12):
MA = 1-\frac{MAPE}{100}
  • “SMAPE (Symmetric Mean Absolute Percentage Error)” [150], Equation (13):
SMAPE = \frac{1}{n}\sum_{t=1}^{n}\frac{\left|\hat{x}_t-x_t\right|}{\left|\hat{x}_t\right|+\left|x_t\right|}\times 100
  • “R² (Squared Correlation)” [56,228,246,255], Equation (14):
R^2 = \left(1-\frac{\sum_{t=1}^{n}\left(x_t-\hat{x}_t\right)^2}{\sum_{t=1}^{n}\left(x_t-\frac{1}{n}\sum_{t=1}^{n}x_t\right)^2}\right)\times 100
  • “Percentage Tolerance” [219], Equation (15):
tolerance = \frac{x-\hat{x}}{\hat{x}_{max}-\hat{x}_{min}}\times 100
  • “True Predicted Rate (TPR)” [250], Equation (16):
TPR = \frac{\text{Number of correctly predicted}}{n}\times 100
  • “False Positive Rate (FPR)” [250], Equation (17):
FPR = \frac{\text{Number of incorrectly predicted}}{n}\times 100
  • “r (Pearson Coefficient)” [172], Equation (18):
r = \frac{\sum_{t=1}^{n}\left(x_t-\bar{x}\right)\left(\hat{x}_t-\bar{\hat{x}}\right)}{\sqrt{\sum_{t=1}^{n}\left(x_t-\bar{x}\right)^2}\sqrt{\sum_{t=1}^{n}\left(\hat{x}_t-\bar{\hat{x}}\right)^2}}
  • “R (Spearman’s Correlation Coefficient)” [254], Equation (19):
R = 1-\frac{6\sum_{t=1}^{n}\left(x_t-\hat{x}_t\right)^2}{n^3-n}
where x_t is the historical data, x̂_t the predicted data, x̄ the average of the historical data, x̄̂ the average of the predicted data, n the total number of data points, n_r the number of regions, X_max and X_min the maximum and minimum values of the historical data, and x̂_max and x̂_min the maximum and minimum values of the predicted data [7]. These evaluation metrics are useful for all contemporary methods because they quantify the accuracy of the prediction methods and thus make each model more reliable. The most commonly used metrics are MSE (Mean Squared Error), RMSE (Root Mean Squared Error), MAE (Mean Absolute Error), MAPE (Mean Absolute Percentage Error), and R² (Squared Correlation). MSE emphasizes large deviations but is sensitive to outliers. RMSE is very common in network traffic forecasting but is likewise sensitive to outliers. MAE gives a good measure of general accuracy but does not emphasize large errors. MAPE is appropriate only when traffic values are always positive and not close to zero. R² is a complementary metric that does not reveal the magnitude of errors and is usually reported alongside metrics such as RMSE and MSE. In practice, reporting three or four of these common metrics is sufficient to characterize the performance of a prediction method.
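For reference, the most common of these metrics (RMSE, MAE, MAPE, and R²) can be computed directly from their definitions; the toy actual/predicted series below is illustrative, and R² is reported here as a fraction rather than the percentage form of Equation (14):

```python
import numpy as np

def rmse(x, xhat):
    """Root mean square error."""
    x, xhat = np.asarray(x, float), np.asarray(xhat, float)
    return float(np.sqrt(np.mean((x - xhat) ** 2)))

def mae(x, xhat):
    """Mean absolute error."""
    x, xhat = np.asarray(x, float), np.asarray(xhat, float)
    return float(np.mean(np.abs(x - xhat)))

def mape(x, xhat):
    """Mean absolute percentage error (in %); x must be nonzero."""
    x, xhat = np.asarray(x, float), np.asarray(xhat, float)
    return float(np.mean(np.abs((x - xhat) / x)) * 100)

def r_squared(x, xhat):
    """Squared correlation as a fraction (1 = perfect fit)."""
    x, xhat = np.asarray(x, float), np.asarray(xhat, float)
    ss_res = np.sum((x - xhat) ** 2)
    ss_tot = np.sum((x - np.mean(x)) ** 2)
    return float(1 - ss_res / ss_tot)

# Toy check on a small series (values are illustrative).
actual    = [100.0, 120.0, 90.0, 110.0]
predicted = [ 98.0, 125.0, 92.0, 108.0]
```

With these inputs, MAE is 2.75 and R² is 0.926, showing why the two are reported together: MAE gives the typical error magnitude while R² expresses the explained variance.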

6. General Discussion of Future Directions

In order to tackle the competing demands of predictive accuracy, scalability, and rigorous privacy in contemporary cellular networks, we introduce a unified framework that combines privacy-aware methods with state-of-the-art modeling. Our methodology addresses the shortcomings of prevailing approaches, which frequently compromise data utility in favor of privacy or encounter inefficiencies in processing substantial volumes of rapidly changing network data. Through the integration of evolutionary optimization, cryptographic safeguards, and parallelized machine learning paradigms, the proposed framework provides a robust solution capable of functioning efficiently within both existing 5G and nascent 6G network architectures. This section delineates the framework’s architecture and methodology, elucidating how it accomplishes privacy-sensitive, scalable, and accurate data traffic prediction.

6.1. Framework Overview

As cellular networks become increasingly intricate and concerns regarding user data privacy intensify, contemporary methodologies for data traffic prediction encounter substantial constraints with respect to accuracy, scalability, and security. Conventional predictive models frequently neglect the significance of data privacy or encounter difficulties in processing the extensive, heterogeneous, and high-velocity data engendered by contemporary 5G networks. To address these obstacles, we introduce an innovative three-stage framework that guarantees robust privacy preservation, attains elevated prediction accuracy, and diminishes computational latency through the implementation of parallelization and advanced machine learning techniques. This framework is meticulously crafted to be scalable and adaptable for forthcoming cellular network generations, including 6G infrastructures.

6.2. Detailed Methodology

The proposed framework encompasses three distinct stages:
Input Data: As input data, we will use network, spatiotemporal, mobility, and external indicators. Network indicators are raw operational data records generated continuously by cellular network equipment (eNodeB/gNodeB) and core network elements. They record network behavior over time and form the primary data source for the prediction framework. Per-cell/per-sector counters are numeric values that describe how many connections, bits, or signaling messages occurred in a specific time interval (total uplink (UL) and downlink (DL) bits transferred, number of active users (UEs), call attempts, dropped calls, average throughput, etc.). Handover events occur when a user’s device (UE) moves from one cell to another (due to mobility, load balancing, or radio quality) and are recorded as handover success/failure counts. Radio Resource Control (RRC) events capture the signaling between the user equipment (UE) and the base station (BS) (RRC Connection Request, RRC Connection Setup Complete, RRC Connection Release, etc.). The Physical Resource Block (PRB) is the smallest resource scheduling unit per Transmission Time Interval (TTI); its utilization is reported for uplink (UL) and downlink (DL). Channel quality metrics indicate channel conditions (Signal-to-Interference-plus-Noise Ratio (SINR), Channel Quality Indicator (CQI), Reference Signal Received Power (RSRP), Reference Signal Received Quality (RSRQ), Precoding Matrix Indicator (PMI), and Rank Indicator (RI)). Spatiotemporal indicators are log-file records that represent the spatiotemporal view of the network data (timestamps, latitude, longitude, cell ID, etc.). Mobility indicators are quantitative metrics (statistical calculations) that describe how, when, and how fast users move between network cells or regions. The Handover Rate (HOR) represents the number of successful handovers per time unit, user, or cell. Cell Dwell Time (CDT) is computed by subtracting the connection start time from the end time per user per cell.
The Average Speed of Users (ASU) estimates how fast users (or devices) move through the network area. If the training data contain special events (fires, earthquakes, etc.) that affect mobile network data traffic, external indicators should also be used.
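As a sketch of how two of these mobility indicators could be derived from connection logs (the log schema, field order, and values below are hypothetical assumptions, not an operator format), CDT and HOR can be computed as follows:

```python
from datetime import datetime

# Hypothetical connection log: (user, cell_id, start, end).
log = [
    ("u1", "cell_A", "2025-01-01 08:00:00", "2025-01-01 08:12:00"),
    ("u1", "cell_B", "2025-01-01 08:12:00", "2025-01-01 08:30:00"),
    ("u1", "cell_C", "2025-01-01 08:30:00", "2025-01-01 09:00:00"),
]

FMT = "%Y-%m-%d %H:%M:%S"

def dwell_times_minutes(records):
    """Cell Dwell Time: connection end minus start, per record."""
    return [
        (datetime.strptime(e, FMT) - datetime.strptime(s, FMT)).total_seconds() / 60
        for _, _, s, e in records
    ]

def handover_rate_per_hour(records):
    """Handovers (same user changing cell) divided by observed hours."""
    handovers = sum(
        1 for a, b in zip(records[:-1], records[1:])
        if a[0] == b[0] and a[1] != b[1]
    )
    total_hours = sum(dwell_times_minutes(records)) / 60
    return handovers / total_hours

cdt = dwell_times_minutes(log)     # [12.0, 18.0, 30.0] minutes
hor = handover_rate_per_hour(log)  # 2 handovers over 1 observed hour
```

ASU could be obtained analogously by dividing the distance between consecutive cell coordinates (from the spatiotemporal indicators) by the corresponding dwell times.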
Stage 1: Data Anonymization. In the initial stage, sensitive user information undergoes anonymization through a Genetic Algorithm (GA) [68,69,70,71,72,73,74,76], which can balance data utility against privacy safeguarding. The GA optimizes the trade-off between minimizing information loss and maximizing k-anonymity. Furthermore, pseudonymization techniques can be employed to substitute direct personal identifiers with pseudonyms, thereby diminishing the risk of identity disclosure while maintaining data usability.
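The two ingredients of the GA’s fitness, the k-anonymity level reached and the information lost through generalization, can be sketched as follows; the records, quasi-identifiers, and generalization operators are toy assumptions for illustration:

```python
from collections import Counter

# Stage 1 sketch: the GA would search over generalization parameters
# (here: age bucket width, ZIP truncation length) to balance k-anonymity
# against information loss. Records and fields are toy assumptions.

records = [
    {"age": 23, "zip": "73133"},
    {"age": 27, "zip": "73134"},
    {"age": 41, "zip": "73100"},
    {"age": 45, "zip": "73101"},
]

def generalize(rec, age_bucket, zip_digits):
    """Coarsen quasi-identifiers: bucket the age, truncate the ZIP code."""
    return (rec["age"] // age_bucket * age_bucket, rec["zip"][:zip_digits])

def k_anonymity(recs, age_bucket, zip_digits):
    """Smallest equivalence-class size under the given generalization."""
    groups = Counter(generalize(r, age_bucket, zip_digits) for r in recs)
    return min(groups.values())

# Fine generalization keeps detail but yields k = 1 (no anonymity)...
k_fine = k_anonymity(records, age_bucket=5, zip_digits=5)
# ...while coarser generalization reaches k = 2 at some information loss.
k_coarse = k_anonymity(records, age_bucket=20, zip_digits=3)
```

A GA fitness function could then reward larger k while penalizing wider age buckets and shorter ZIP prefixes, which is exactly the utility-versus-privacy trade-off the stage is meant to optimize.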
Stage 2: Lightweight Homomorphic Encryption for Secure Computation. This stage augments data privacy at low computational cost. Homomorphic encryption enables computations to be conducted directly on encrypted data without necessitating decryption, thereby ensuring comprehensive data confidentiality throughout the prediction process. By capitalizing on homomorphic encryption [85,86], sensitive attributes such as geographic coordinates and user behavioral patterns can be securely analyzed without revealing raw data to potential security threats. Using lightweight homomorphic encryption rather than the original homomorphic encryption minimizes the computational cost and execution time [87,88,89,90,266], and hence the execution time of the framework, particularly of the data traffic prediction step.
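To illustrate the homomorphic property this stage relies on (computation on ciphertexts without decryption), the following toy implements textbook Paillier encryption, which is additively homomorphic; the key size is deliberately insecure, and this is an illustration only, not one of the lightweight schemes cited in [87,88,89,90,266]:

```python
import math
import random

# Toy Paillier cryptosystem (textbook construction, insecure key sizes)
# illustrating the additive homomorphic property: Enc(a) * Enc(b) mod n^2
# decrypts to a + b, so traffic counts can be aggregated while encrypted.

p, q = 1000003, 1000033          # small primes, for illustration only
n, n2 = p * q, (p * q) ** 2
lam = math.lcm(p - 1, q - 1)     # Carmichael function lambda(n)
mu = pow(lam, -1, n)             # valid because the generator g = n + 1

_rng = random.Random(7)          # fixed seed keeps the sketch deterministic

def encrypt(m):
    r = _rng.randrange(2, n)
    while math.gcd(r, n) != 1:   # r must be invertible modulo n
        r = _rng.randrange(2, n)
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    x = pow(c, lam, n2)
    return ((x - 1) // n * mu) % n

a, b = 1234, 5678                        # e.g., traffic counts of two cells
c_sum = (encrypt(a) * encrypt(b)) % n2   # addition performed on ciphertexts
```

Here `decrypt(c_sum)` recovers the sum without either plaintext being exposed during aggregation; lightweight schemes pursue the same property at much lower computational cost.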
Stage 3: Lightweight Hybrid Deep Learning with Parallel Programming. In the concluding phase, the proposed lightweight hybrid Deep Learning framework can be deployed in two distinct substages. In the first substage, a lightweight machine learning algorithm is employed to perform anomaly detection on network traffic, ensuring that anomalous data are excluded before the prediction process [67]. The second substage focuses on enhancing model training and inference efficiency. To achieve this, parallel programming frameworks for deep learning acceleration can be integrated, thereby reducing computational overhead and improving overall system performance [85,86,267].
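The first substage can be sketched with a lightweight, median-based anomaly filter applied before training; the threshold and traffic values are illustrative assumptions:

```python
import numpy as np

# Substage 1 sketch: a lightweight modified z-score filter (median/MAD
# based, robust to the very outliers it hunts) flags anomalous samples so
# they are excluded before the predictor is trained. The threshold (3.5)
# and the traffic values are illustrative assumptions.

def filter_anomalies(series, z_thresh=3.5):
    """Return (clean_values, anomaly_indices) using a modified z-score."""
    x = np.asarray(series, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med))        # median absolute deviation
    z = 0.6745 * np.abs(x - med) / mad      # modified z-score
    mask = z < z_thresh
    return x[mask], np.flatnonzero(~mask)

traffic = [50, 52, 49, 51, 48, 500, 50, 53]   # one obvious burst outlier
clean, anomalies = filter_anomalies(traffic)
```

The burst sample is excluded while the rest of the series passes through; the second substage would then train the hybrid deep model on `clean`, with data parallelism across workers supplying the speed-up.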
Network Management Output: The proposed framework will predict data traffic for resource allocation with a balance between accuracy, data privacy, and computational speed. Figure 5 presents the flowchart of the proposed method for data traffic prediction.

6.3. Validation Plan

The proposed framework can be validated through data from providers for different regions of Greece (Urban, Suburban, Rural). The results can be compared with other datasets like the Milan Telecom Dataset or alongside synthetic datasets produced via Generative Adversarial Networks (GANs) to replicate a variety of traffic patterns while ensuring adherence to privacy regulations. The assessment can encompass three fundamental dimensions: Prediction Accuracy can be evaluated through the application of “Root Mean Square Error (RMSE)”, “Mean Absolute Error (MAE)”, and “Mean Absolute Percentage Error (MAPE)” [7] to gauge the precision of traffic predictions across various scenarios and data traffic configurations. Computational Efficiency can be quantified by documenting model training and inference durations across different parallelization strategies employing Apache Spark and GPU acceleration. The scalability of the framework can be examined by incrementally expanding dataset sizes. Privacy Preservation effectiveness can be assessed through the evaluation of k-anonymity compliance levels, l-diversity indices, and analyses of differential privacy leakage. Additional privacy validation efforts will be undertaken by quantifying the success rate of re-identification attempts on anonymized data samples.
Moreover, ablation studies will be executed to scrutinize the impact of each component of the framework (anonymization, encryption, and hybrid modeling) on the overall efficacy. Stress testing under conditions of significant traffic variability will evaluate the resilience and scalability of the framework in the context of actual cellular network operations. This exhaustive validation plan guarantees that the proposed framework attains an optimal equilibrium between prediction precision, computational efficiency, and robust data privacy assurances.

7. Discussion and Analysis

Evidence from the literature shows that accurate prediction of data traffic patterns remains challenging due to the increasing diversity and complexity of traffic, real-time processing requirements, gaps in historical datasets, privacy concerns, the imperative for fast prediction, the scarcity of labeled samples, and the necessity of frequent retraining as models and workloads change. Several emerging trends in data traffic forecasting have been identified by researchers in an attempt to address these challenges.
A prominent development is the rise of machine learning models that employ advanced algorithms and data analytics to accurately forecast network traffic patterns. “Machine learning methods”, including “decision trees” and “support vector machines”, have demonstrated their effectiveness for estimating future traffic states in 5G infrastructures [7]. Moreover, deep learning frameworks have been effectively deployed across heterogeneous traffic classes (video, voice, and data). Concretely, “Convolutional Neural Networks (CNNs)” are utilized to learn spatiotemporal structure for video-traffic forecasting, whereas “recurrent neural networks (RNNs)” capture temporal dependencies characteristic of voice-traffic data.
Recent developments also emphasize hybrid machine learning frameworks that combine complementary algorithms to offset the limitations of individual models. These hybrid approaches often integrate statistical time-series or ML models with deep neural architectures, achieving superior prediction accuracy and robustness across diverse traffic scenarios. By aligning spatial and temporal inference, such frameworks enhance network adaptability, particularly in highly dynamic environments with fluctuating user demands. Although hybrid machine learning approaches generally achieve superior predictive accuracy compared to traditional and other Contemporary models, they often remain constrained by computational complexity or the absence of built-in data privacy mechanisms. This limitation underscores the necessity for developing lightweight, privacy-preserving hybrid frameworks capable of maintaining high forecasting precision while ensuring scalability and compliance with data protection standards.
Prediction accuracy is pivotal for efficient resource orchestration, the design of advanced network-slicing mechanisms, routing optimality, and robust anomaly detection. Consequently, it yields improved Quality of Experience (QoE), enhanced Quality of Service (QoS), lower energy consumption, and reduced end-to-end latency. Increasing distributional complexity in traffic data and the elaborate topology of 5G networks pose a challenge for researchers, making it necessary to explore new prediction methods. In the proposed framework, it is necessary to explore high prediction accuracy, ensuring low complexity and short execution times, while respecting the principles of personal data protection.
Consequently, the design of future predictive frameworks should prioritize high forecasting precision, low algorithmic complexity, and short execution time, all while preserving compliance with personal data protection regulations. Approaches incorporating parallel computing, Contemporary hybrid models, and encryption methods (such as lightweight homomorphic encryption) hold particular promise in achieving this balance. These technologies enable models to handle large-scale, real-time data without compromising user confidentiality or system latency.
In summary, the evolution of data traffic forecasting for 5G and beyond reveals a clear shift toward intelligent, hybrid, and privacy-aware learning architectures. Continued research is required to refine these frameworks, with emphasis on scalability, interpretability, and secure model adaptation in dynamic communication environments. Such advancements will be essential to sustain efficient resource management, network resilience, and user-centric service delivery in next-generation mobile networks.

8. Conclusions

The current article undertook an initial investigation into the diverse patterns of data traffic (Burst, One-Way Streaming, Interactive Multimedia, Internet of Things, and Background Data) observed in 5G networks. Within this article, we commence by examining previous surveys of methods utilized for predicting data traffic within cellular networks. Subsequently, we outline the obstacles encountered in predicting data traffic within cellular networks and propose a three-tier structure to efficiently provide rapid and precise forecasting outcomes while upholding data integrity. Following this, we scrutinize each data traffic pattern and provide an overview of the existing prediction methods tailored to each specific pattern. The article then delves into a comprehensive discussion of the current strategies for forecasting data traffic within cellular networks, along with their individual contributions towards effective management. Upon examination of these techniques, it became apparent that they can be classified into traditional (statistical and time-series prediction) methods and contemporary (machine learning, deep learning, hybrid) prediction methods. Among these, techniques that utilize hybrid machine learning approaches were found to provide superior accuracy in predicting data traffic compared to other existing methods. The significance of data traffic prediction in the management of 5G networks was emphasized, as precise predictions enable more effective allocation of network resources and bandwidth. This, in turn, leads to improved Quality of Service (QoS), enhanced Quality of Experience (QoE), and energy conservation through the deactivation of inactive cellular network resources. One potential avenue for future investigation involves developing a framework based on geo-localization data (longitude and latitude) that can predict data traffic in specific areas using hybrid machine learning approaches.
The primary focus of this research will be to strike a balance between high prediction accuracy, fast prediction, and the protection of personal mobile data through homomorphic encryption and anonymization techniques.
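As an illustration of why homomorphic encryption is attractive here, the sketch below implements a toy additively homomorphic Paillier scheme in pure Python: two encrypted traffic counts can be summed without ever decrypting them. This is our own demonstration, not part of the proposed framework; the tiny fixed primes provide no real security (deployments use primes of 1024+ bits), and all helper names are hypothetical.

```python
from math import gcd

def lcm(a, b):
    return a * b // gcd(a, b)

def paillier_keygen(p=1789, q=1861):
    """Toy Paillier keypair from small fixed primes (demo only)."""
    n = p * q
    lam = lcm(p - 1, q - 1)
    g = n + 1                       # standard simplified generator
    n2 = n * n
    # mu = (L(g^lam mod n^2))^{-1} mod n, where L(x) = (x - 1) // n
    mu = pow((pow(g, lam, n2) - 1) // n, -1, n)
    return (n, g), (lam, mu)

def encrypt(pub, m, r=42):
    """c = g^m * r^n mod n^2; r must be coprime with n."""
    n, g = pub
    n2 = n * n
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    n2 = n * n
    return ((pow(c, lam, n2) - 1) // n) * mu % n

def add_encrypted(pub, c1, c2):
    """Additive homomorphism: E(m1) * E(m2) mod n^2 = E(m1 + m2)."""
    n, _ = pub
    return (c1 * c2) % (n * n)
```

For example, a base station could encrypt its per-hour traffic counts, a central server could aggregate the ciphertexts with `add_encrypted`, and only the key holder could decrypt the total, so no individual measurement is exposed in transit.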

Author Contributions

All authors have contributed equally to this work. All authors have read and agreed to the published version of the manuscript.

Funding

This work has not been funded by any research project, grant or fund and is solely the work of the researchers mentioned.

Data Availability Statement

Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Zhang, C.; Patras, P.; Haddadi, H. Deep learning in mobile and wireless networking: A survey. IEEE Commun. Surv. Tutor. 2019, 21, 2224–2287. [Google Scholar] [CrossRef]
  2. Mohan, R.R.; Vijayalakshmi, K.; Augustine, P.J.; Venkatesh, R.; Nayagam, M.G.; Jegajohi, B. A comprehensive survey of machine learning based mobile data traffic prediction models for 5G cellular networks. In Proceedings of the AIP Conference 2024, Greater Noida, India, 20–21 November 2024; p. 100002. [Google Scholar]
  3. Tyokighir, S.S.; Mom, J.; Ukhurebor, K.E.; Igwue, G. New developments and trends in 5G technologies: Applications and concepts. Bull. Electr. Eng. Inform. 2024, 13, 254–263. [Google Scholar] [CrossRef]
  4. Attaran, M. The impact of 5G on the evolution of intelligent automation and industry digitization. J. Ambient. Intell. Humaniz. Comput. 2023, 14, 5977–5993. [Google Scholar] [CrossRef]
  5. Pons, M.; Valenzuela, E.; Rodríguez, B.; Nolazco-Flores, J.A.; Del-Valle-Soto, C. Utilization of 5G technologies in IoT applications: Current limitations by interference and network optimization difficulties—A review. Sensors 2023, 23, 3876. [Google Scholar] [CrossRef]
  6. Klaine, P.V.; Imran, M.A.; Onireti, O.; Souza, R.D. A survey of machine learning techniques applied to self-organizing cellular networks. IEEE Commun. Surv. Tutor. 2017, 19, 2392–2431. [Google Scholar] [CrossRef]
  7. Lykakis, E.; Kokkinos, E. Data Traffic Prediction in Cellular Networks. In Proceedings of the 4th International Conference in Electronic Engineering, Information Technology & Education (EEITE 2023), Chania, Greece, 15–18 May 2023. [Google Scholar]
  8. Pawar, V.; Zade, N.; Vora, D.; Khairnar, V.; Oliveira, A.; Kotecha, K.; Kulkarni, A. Intelligent Transportation System with 5G Vehicle-to-Everything (V2X): Architectures, Vehicular Use Cases, Emergency Vehicles, Current Challenges and Future Directions. IEEE Access 2024, 12, 183937–183960. [Google Scholar] [CrossRef]
  9. Shafiq, S.; Rahman, M.S.; Shaon, S.A.; Mahmud, I.; Hosen, A.S. A Review on Software-Defined Networking for Internet of Things Inclusive of Distributed Computing, Blockchain, and Mobile Network Technology: Basics, Trends, Challenges, and Future Research Potentials. Int. J. Distrib. Sens. Netw. 2024, 2024, 9006405. [Google Scholar] [CrossRef]
  10. Alzubaidi, O.T.H.; Hindia, M.N.; Dimyati, K.; Noordin, K.A.; Wahab, A.N.A.; Qamar, F.; Hassan, R. Interference challenges and management in B5G network design: A comprehensive review. Electronics 2022, 11, 2842. [Google Scholar] [CrossRef]
  11. Navarro-Ortiz, J.; Romero-Diaz, P.; Sendra, S.; Ameigeiras, P.; Ramos-Munoz, J.J.; Lopez-Soler, J.M. A survey on 5G usage scenarios and traffic models. IEEE Commun. Surv. Tutor. 2020, 22, 905–929. [Google Scholar] [CrossRef]
  12. Bhivgade, A.; Puri, C. 5G Wireless Communication and IoT: Vision, Applications, and Challenges. In Proceedings of the 2024 2nd DMIHER International Conference on Artificial Intelligence in Healthcare, Education and Industry (IDICAIEI), Wardha, India, 29–30 November 2024; pp. 1–6. [Google Scholar]
  13. Melikhov, E.O.; Stroganova, E.P. Intelligent Management of Combined Traffic in Promising Mobile Communication Networks. In Proceedings of the 2024 Systems of Signal Synchronization, Generating and Processing in Telecommunications (SYNCHROINFO), Vyborg, Russia, 1–3 July 2024; pp. 1–5. [Google Scholar]
  14. Papagiannaki, K.; Taft, N.; Zhang, Z.-L.; Diot, C. Long-Term Forecasting of Internet Backbone Traffic: Observations and Initial Models. In Proceedings of the 22nd Annual Joint Conference of the IEEE Computer and Communications Societies, San Francisco, CA, USA, 1–3 April 2003; pp. 1178–1188. [Google Scholar]
  15. Chakraborty, P.; Corici, M.; Magedanz, T. System Failure Prediction within Software 5G Core Networks using Time Series Forecasting. In Proceedings of the 2021 IEEE International Conference on Communications Workshops (ICC Workshops), Montreal, ON, Canada, 14–23 June 2021; pp. 1–7. [Google Scholar]
  16. Alzalam, I.; Lipps, C.; Schotten, H.D. Time-Series Forecasting Models for 5G Mobile Networks: A Comparative Study in a Cloud Implementation. In Proceedings of the 2024 15th International Conference on Network of the Future (NoF), Barcelona, Spain, 2–4 October 2024; pp. 54–62. [Google Scholar]
  17. Hong, W.-C. Application of seasonal SVR with chaotic immune algorithm in traffic flow forecasting. Neural Comput. Appl. 2012, 21, 583–593. [Google Scholar] [CrossRef]
  18. Xu, Y.; Xu, W.; Yin, F.; Lin, J.; Cui, S. High-accuracy wireless traffic prediction: A GP-based machine learning approach. In Proceedings of the 2017 IEEE Global Communications Conference (GLOBECOM 2017), Singapore, 4–8 December 2017; pp. 1–6. [Google Scholar]
  19. Caiyu, S.; Jinri, W.; Jie, D.; Shanyun, W. Prediction of 5th Generation Mobile Users Traffic Based on Multiple Machine Learning Models. In Proceedings of the 2022 IEEE Conference on Telecommunications, Optics and Computer Science (TOCS), Virtual, 11–12 December 2022; pp. 1174–1179. [Google Scholar]
  20. Nashaat, H.; Mohammed, N.H.; Abdel-Mageid, S.M.; Rizk, R.Y. Machine Learning-based Cellular Traffic Prediction Using Data Reduction Techniques. IEEE Access 2024, 12, 58927–58939. [Google Scholar] [CrossRef]
  21. Vinayakumar, R.; Soman, K.P.; Poornachandran, P. Applying deep learning approaches for network traffic prediction. In Proceedings of the 2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI), Udupi, India, 13–16 September 2017; pp. 2353–2358. [Google Scholar]
  22. Girardin, F.; Vaccari, A.; Gerber, A.; Biderman, A.; Ratti, C. Towards estimating the presence of visitors from the aggregate mobile phone network activity they generate. In Proceedings of the International Conference on Computers in Urban Planning and Urban Management, Hong Kong, China, 16–18 June 2009. [Google Scholar]
  23. Aronsson, L.; Bengtsson, A. Machine Learning Applied to Traffic Forecasting. Bachelor’s Thesis, University of Gothenburg, Gothenburg, Sweden, 2019. [Google Scholar]
  24. Selvamanju, E.; Shalini, V.B. Machine learning based mobile data traffic prediction in 5G cellular networks. In Proceedings of the 2021 5th International Conference on Electronics, Communication and Aerospace Technology (ICECA), Coimbatore, India, 2–4 December 2021; pp. 1318–1324. [Google Scholar]
  25. Guesmi, L.; Mejri, A.; Radhouane, A.; Zribi, K. Advanced Predictive Modeling for Enhancing Traffic Forecasting in Emerging Cellular Networks. In Proceedings of the 2024 15th International Conference on Network of the Future (NoF), Barcelona, Spain, 2–4 October 2024; pp. 209–213. [Google Scholar]
  26. Kuber, T.; Seskar, I.; Mandayam, N. Traffic prediction by augmenting cellular data with non-cellular attributes. In Proceedings of the 2021 IEEE Wireless Communications and Networking Conference (WCNC), Nanjing, China, 29 March–1 April 2021; pp. 1–6. [Google Scholar]
  27. Bejarano-Luque, J.L.; Toril, M.; Fernandez-Navarro, M.; Gijon, C.; Luna-Ramirez, S. A deep-learning model for estimating the impact of social events on traffic demand on a cell basis. IEEE Access 2021, 9, 71673–71686. [Google Scholar] [CrossRef]
  28. Zhu, T.; Boada, M.J.L.; Boada, B.L. Adaptive Graph Attention and Long Short-Term Memory-Based Networks for Traffic Prediction. Mathematics 2024, 12, 255. [Google Scholar] [CrossRef]
  29. Deeban, N.; Bharathi, P.S. A Robust and Efficient Traffic Analysis for 5G Network Based on Hybrid LSTM comparing with XGBoost to Improve Accuracy. In Proceedings of the 2023 International Conference on Artificial Intelligence and Knowledge Discovery in Concurrent Engineering (ICECONF), Chennai, India, 5–7 January 2023; pp. 1–8. [Google Scholar]
  30. Dangi, R.; Lalwani, P. A novel hybrid deep learning approach for 5G network traffic control and forecasting. Concurr. Comput. Pract. Exp. 2023, 35, e7596. [Google Scholar] [CrossRef]
  31. Su, J.; Cai, H.; Sheng, Z.; Liu, A.X.; Baz, A. Traffic prediction for 5G: A deep learning approach based on lightweight hybrid attention networks. Digit. Signal Process. 2024, 146, 104359. [Google Scholar] [CrossRef]
  32. Naboulsi, D.; Stanica, R.; Fiore, M. Classifying call profiles in large-scale mobile traffic datasets. In Proceedings of the IEEE Conference on Computer Communications (INFOCOM 2014), Toronto, ON, Canada, 27 April–2 May 2014; pp. 1806–1814. [Google Scholar]
  33. Naboulsi, D.; Fiore, M.; Ribot, S.; Stanica, R. Large-scale mobile traffic analysis: A survey. IEEE Commun. Surv. Tutor. 2015, 18, 124–161. [Google Scholar] [CrossRef]
  34. Joshi, M.; Hadi, T.H. A review of network traffic analysis and prediction techniques. arXiv 2015, arXiv:1507.05722. [Google Scholar] [CrossRef]
  35. Ahad, N.; Qadir, J.; Ahsan, N. Neural networks in wireless networks: Techniques applications and guidelines. J. Netw. Comput. Appl. 2016, 68, 1–27. [Google Scholar] [CrossRef]
  36. Hajirahimi, Z.; Khashei, M. Hybrid structures in time series modeling and forecasting: A review. Eng. Appl. Artif. Intell. 2019, 86, 83–106. [Google Scholar] [CrossRef]
  37. Mohammed, A.R.; Mohammed, S.A.; Shirmohammadi, S. Machine learning and deep learning-based traffic classification and prediction in software defined networking. In Proceedings of the 2019 IEEE International Symposium on Measurements & Networking (M&N), Catania, Italy, 8–10 July 2019; pp. 1–6. [Google Scholar]
  38. Li, J.; Pan, Z. Network traffic classification based on deep learning. KSII Trans. Internet Inf. Syst. 2020, 14, 4246–4267. [Google Scholar] [CrossRef]
  39. Chen, A.; Law, J.; Aibin, M. A Survey on Traffic Prediction Techniques Using Artificial Intelligence for Communication Networks. Telecom 2021, 2, 518–535. [Google Scholar] [CrossRef]
  40. Abbasi, M.; Shahraki, A.; Taherkordi, A. Deep learning for network traffic monitoring and analysis (NTMA): A survey. Comput. Commun. 2021, 170, 19–41. [Google Scholar] [CrossRef]
  41. Lohrasbinasab, I.; Shahraki, A.; Taherkordi, A.; Delia Jurcut, A. From statistical-to machine learning-based network traffic prediction. Trans. Emerg. Telecommun. Technol. 2022, 33, e4394. [Google Scholar] [CrossRef]
  42. Wang, X.; Wang, Z.; Yang, K.; Song, Z.; Feng, J.; Zhu, L.; Deng, C. Deep Learning Based Traffic Prediction in Mobile Network-A Survey. Authorea Preprints 2023.
  43. Ferreira, G.O.; Ravazzi, C.; Dabbene, F.; Calafiore, G.C.; Fiore, M. Forecasting network traffic: A survey and tutorial with open-source comparative evaluation. IEEE Access 2023, 11, 6018–6044. [Google Scholar] [CrossRef]
  44. Wang, X.; Wang, Z.; Yang, K.; Song, Z.; Bian, C.; Feng, J.; Deng, C. A Survey on Deep Learning for Cellular Traffic Prediction. Intell. Comput. 2024, 3, 0054. [Google Scholar] [CrossRef]
  45. Sanchez-Navarro, I.; Mamolar, A.S.; Wang, Q.; Calero, J.M.A. 5gtoponet: Real-time topology discovery and management on 5g multi-tenant networks. Future Gener. Comput. Syst. 2021, 114, 435–447. [Google Scholar] [CrossRef]
  46. ElSawy, H.; Sultan-Salem, A.; Alouini, M.S.; Win, M.Z. Modeling and analysis of cellular networks using stochastic geometry: A tutorial. IEEE Commun. Surv. Tutor. 2016, 19, 167–203. [Google Scholar] [CrossRef]
  47. Mkocha, K.; Kissaka, M.M.; Hamad, O.F. Trends and Opportunities for Traffic Engineering Paradigms Across Mobile Cellular Network Generations. In Proceedings of the International Conference on Social Implications of Computers in Developing Countries, Dar es Salaam, Tanzania, 1–3 May 2019; pp. 736–750. [Google Scholar]
  48. Motlagh, N.H.; Kapoor, S.; Alhalaseh, R.; Tarkoma, S.; Hätönen, K. Quality of Monitoring for Cellular Networks. IEEE Trans. Netw. Serv. Manag. 2021, 19, 381–391. [Google Scholar] [CrossRef]
  49. Zhang, C.; Zhang, H.; Yuan, D.; Zhang, M. Citywide cellular traffic prediction based on densely connected convolutional neural networks. IEEE Commun. Lett. 2018, 22, 1656–1659. [Google Scholar] [CrossRef]
  50. Cui, H.; Yao, Y.; Zhang, K.; Sun, F.; Liu, Y. Network traffic prediction based on Hadoop. In Proceedings of the 2014 International Symposium on Wireless Personal Multimedia Communications (WPMC 2014), Sydney, Australia, 7–10 September 2014; pp. 29–33. [Google Scholar]
  51. Yue, C.; Jin, R.; Suh, K.; Qin, Y.; Wang, B.; Wei, W. LinkForecast: Cellular link bandwidth prediction in LTE networks. IEEE Trans. Mob. Comput. 2017, 17, 1582–1594. [Google Scholar] [CrossRef]
  52. Le, L.V.; Sinh, D.; Tung, L.P.; Lin, B.S.P. A practical model for traffic forecasting based on big data, machine-learning, and network KPIs. In Proceedings of the 2018 15th IEEE Annual Consumer Communications & Networking Conference (CCNC 2018), Las Vegas, NV, USA, 12–15 January 2018; pp. 1–4. [Google Scholar]
  53. Zhao, S.; Jiang, X.; Jacobson, G.; Jana, R.; Hsu, W.L.; Rustamov, R.; Talasila, M.; Aftab, S.A.; Chen, Y.; Borcea, C. Cellular Network Traffic Prediction Incorporating Handover: A Graph Convolutional Approach. In Proceedings of the 2020 17th Annual IEEE International Conference on Sensing, Communication and Networking (SECON 2020), Como, Italy, 22–25 June 2020; pp. 1–9. [Google Scholar]
  54. Liu, C.; Wu, T.; Li, Z.; Wang, B. Individual traffic prediction in cellular networks based on tensor completion. Int. J. Commun. Syst. 2021, 34, 4952. [Google Scholar] [CrossRef]
  55. Garrido, L.A.; Mekikis, P.V.; Dalgkitsis, A.; Verikoukis, C. Context-aware traffic prediction: Loss function formulation for predicting traffic in 5G networks. In Proceedings of the IEEE International Conference on Communications (ICC 2021), Montreal, ON, Canada, 14–18 June 2021; pp. 1–6. [Google Scholar]
  56. Gao, Z. 5G Traffic Prediction Based on Deep Learning. Comput. Intell. Neurosci. 2022, 2022, 3174530. [Google Scholar] [CrossRef] [PubMed]
  57. Zang, Y.; Ni, F.; Feng, Z.; Cui, S.; Ding, Z. Wavelet transform processing for cellular traffic prediction in machine learning networks. In Proceedings of the 2015 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP), Chengdu, China, 12–15 July 2015; pp. 458–462. [Google Scholar]
  58. Huang, C.W.; Chiang, C.T.; Li, Q. A study of deep learning networks on mobile traffic forecasting. In Proceedings of the 2017 IEEE 28th Annual International Symposium on Personal Indoor and Mobile Radio Communications (PIMRC 2017), Montreal, ON, Canada, 8–13 October 2017; pp. 1–6. [Google Scholar]
  59. Zhang, D.; Liu, L.; Xie, C.; Yang, B.; Liu, Q. Citywide cellular traffic prediction based on a hybrid spatiotemporal network. Algorithms 2020, 13, 20. [Google Scholar] [CrossRef]
  60. Wang, S.; Li, F.; Ni, H.; Xu, L.; Jing, M.; Yu, J.; Wang, X. Rush Hour Capacity Enhancement in 5G Network Based on Hot Spot Floating Prediction. In Proceedings of the 2019 IEEE International Conferences on Ubiquitous Computing & Communications (IUCC) and Data Science and Computational Intelligence (DSCI) and Smart Computing, Networking and Services (SmartCNS), Shenyang, China, 21–23 October 2019; pp. 657–662. [Google Scholar]
  61. Stadler, T.; Oprisanu, B.; Troncoso, C. Synthetic data–anonymisation Groundhog Day. In Proceedings of the 31st USENIX Security Symposium (USENIX Security 22), Boston, MA, USA, 10–12 August 2022; pp. 1451–1468. [Google Scholar]
  62. Raghunathan, T.E. Synthetic data. Annu. Rev. Stat. Appl. 2021, 8, 129–140. [Google Scholar] [CrossRef]
  63. Ayala-Rivera, V.; Portillo-Dominguez, A.O.; Murphy, L.; Thorpe, C. COCOA: A synthetic data generator for testing anonymization techniques. In Proceedings of the International Conference on Privacy in Statistical Databases (PSD 2016), Dubrovnik, Croatia, 14–16 September 2016; pp. 163–177. [Google Scholar]
  64. Atat, R.; Liu, L.; Chen, H.; Wu, J.; Li, H.; Yi, Y. Enabling cyber-physical communication in 5G cellular networks: Challenges, spatial spectrum sensing, and cyber-security. IET Cyber Phys. Syst. Theory Appl. 2017, 2, 49–54. [Google Scholar] [CrossRef]
  65. Wang, T. High precision open-world website fingerprinting. In Proceedings of the 2020 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 18–21 May 2020. [Google Scholar]
  66. Ni, T.; Lan, G.; Wang, J.; Zhao, Q.; Xu, W. Eavesdropping mobile app activity via {Radio-Frequency} energy harvesting. In Proceedings of the 32nd USENIX Security Symposium, Anaheim, CA, USA, 9–11 August 2023; pp. 3511–3528. [Google Scholar]
  67. Mane, D.T.; Sangve, S.; Upadhye, G.; Kandhare, S.; Mohole, S.; Sonar, S.; Tupare, S. Detection of anomaly using machine learning: A comprehensive survey. Int. J. Emerg. Technol. Adv. Eng. 2022, 12, 134–152. [Google Scholar] [CrossRef]
  68. Gramaglia, M.; Fiore, M. On the anonymizability of mobile traffic datasets. arXiv 2014, arXiv:1501.0010. [Google Scholar]
  69. Stenneth, L.; Phillip, S.Y.; Wolfson, O. Mobile systems location privacy: “MobiPriv” a robust k anonymous system. In Proceedings of the 2010 IEEE 6th International Conference on Wireless and Mobile Computing, Networking and Communications, Niagara Falls, ON, Canada, 11–13 October 2010; pp. 54–63. [Google Scholar]
  70. Sweeney, L. Achieving k-anonymity privacy protection using generalization and suppression. Int. J. Uncertain. Fuzziness Knowl. Based Syst. 2002, 10, 571–588. [Google Scholar] [CrossRef]
  71. Madan, S.; Goswami, P. A privacy preserving scheme for big data publishing in the cloud using k-anonymization and hybridized optimization algorithm. In Proceedings of the 2018 International Conference on Circuits and Systems in Digital Enterprise Technology (ICCSDET), Kottayam, India, 21–22 December 2018; pp. 1–7. [Google Scholar]
  72. Li, N.; Li, T.; Venkatasubramanian, S. t-closeness: Privacy beyond k-anonymity and l-diversity. In Proceedings of the 2007 IEEE 23rd International Conference on Data Engineering, Istanbul, Turkey, 15–20 April 2007; pp. 106–115. [Google Scholar]
  73. Machanavajjhala, A.; Kifer, D.; Gehrke, J.; Venkitasubramaniam, M. l-diversity: Privacy beyond k-anonymity. ACM Trans. Knowl. Discov. Data 2007, 1, 3-es. [Google Scholar] [CrossRef]
  74. Abdrashitov, A.; Spivak, A. Sensor data anonymization based on genetic algorithm clustering with L-Diversity. In Proceedings of the 2016 18th Conference of Open Innovations Association and Seminar on Information Security and Protection of Information Technology (FRUCT-ISPIT), St. Petersburg, Russia, 18–22 April 2016; pp. 3–8. [Google Scholar]
  75. Poulis, G.; Skiadopoulos, S.; Loukides, G.; Gkoulalas-Divanis, A. Apriori-based algorithms for k^m-anonymizing trajectory data. Trans. Data Priv. 2014, 7, 165–194. [Google Scholar]
  76. Medková, J.; Hynek, J. HAkAu: Hybrid algorithm for effective k-automorphism anonymization of social networks. Soc. Netw. Anal. Min. 2023, 13, 63. [Google Scholar] [CrossRef]
  77. Bolognini, L.; Bistolfi, C. Pseudonymization and impacts of Big (personal/anonymous) Data processing in the transition from the Directive 95/46/EC to the new EU General Data Protection Regulation. Comput. Law Secur. Rev. 2017, 33, 171–181. [Google Scholar] [CrossRef]
  78. Pawar, A.; Ahirrao, S.; Churi, P.P. Anonymization techniques for protecting privacy: A survey. In Proceedings of the 2018 IEEE Punecon, Pune, India, 30 November–2 December 2018; pp. 1–6. [Google Scholar]
  79. Rajesh, N.; Abraham, S.; Das, S.S. Personalized trajectory anonymization through sensitive location points hiding. Int. J. Inf. Technol. 2019, 11, 461–465. [Google Scholar] [CrossRef]
  80. Mano, M.; Ishikawa, Y. Anonymizing user location and profile information for privacy-aware mobile services. In Proceedings of the 2nd ACM SIGSPATIAL International Workshop on Location Based Social Networks, San Jose, CA, USA, 2 November 2010; pp. 68–75. [Google Scholar]
  81. Cheng, M.; Zhao, B.; Su, J. A Real-Time Processing System for Anonymization of Mobile Core Network Traffic. In Proceedings of the Security, Privacy and Anonymity in Computation, Communication and Storage: SpaCCS 2016 International Workshops, TrustData, TSP, NOPE, DependSys, BigDataSPT, and WCSSC, Zhangjiajie, China, 16–18 November 2016; pp. 229–237. [Google Scholar]
  82. Chaddad, L.; Chehab, A.; Elhajj, I.H.; Kayssi, A. Mobile traffic anonymization through probabilistic distribution. In Proceedings of the 22nd Conference on Innovation in Clouds, Internet and Networks and Workshops (ICIN 2019), Paris, France, 18–21 February 2019; pp. 242–248. [Google Scholar]
  83. Martinez, E.B.; Ficek, M.; Kencl, L. Mobility data anonymization by obfuscating the cellular network topology graph. In Proceedings of the 2013 IEEE International Conference on Communications (ICC), Budapest, Hungary, 9–13 June 2013; pp. 2032–2036. [Google Scholar]
  84. Barak, O.; Cohen, G.; Toch, E. Anonymizing mobility data using semantic cloaking. Pervasive Mob. Comput. 2016, 28, 102–112. [Google Scholar] [CrossRef]
  85. Acar, A.; Aksu, H.; Uluagac, A.S.; Conti, M. A survey on homomorphic encryption schemes: Theory and implementation. ACM Comput. Surv. 2018, 51, 1–35. [Google Scholar] [CrossRef]
  86. Pulido-Gaytan, L.B.; Tchernykh, A.; Cortés-Mendoza, J.M.; Babenko, M.; Radchenko, G. A survey on privacy-preserving machine learning with fully homomorphic encryption. In Proceedings of the Latin American High Performance Computing Conference (LA-HCC 2020), Cuenca, Ecuador, 2–4 September 2020; pp. 115–129. [Google Scholar]
  87. Biksham, V.; Vasumathi, D. A lightweight fully homomorphic encryption scheme for cloud security. Int. J. Inf. Comput. Secur. 2020, 13, 357–371. [Google Scholar] [CrossRef]
  88. Ullah, S.; Li, J.; Chen, J.; Ali, I.; Khan, S.; Hussain, M.T.; Ullah, F.; Leung, V.C. Homomorphic encryption applications for IoT and light-weighted environments: A review. IEEE Internet Things J. 2024, 12, 1222–1246. [Google Scholar] [CrossRef]
  89. Praveen, R.; Pabitha, P. Improved Gentry–Halevi’s fully homomorphic encryption-based lightweight privacy preserving scheme for securing medical Internet of Things. Trans. Emerg. Telecommun. Technol. 2023, 34, e4732. [Google Scholar] [CrossRef]
  90. Du, D.; Zhao, W.; Wei, L.; Lu, S.; Wu, X. A lightweight homomorphic encryption federated learning based on blockchain in IoV. In Proceedings of the 2022 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Scalable Computing & Communications, Digital Twin, Privacy Computing, Metaverse, Autonomous & Trusted Vehicles (SmartWorld/UIC/ScalCom/DigitalTwin/PriComp/Meta), Haikou, China, 15–18 December 2022; pp. 1001–1007. [Google Scholar]
  91. Chandrakar, I.; Hulipalled, V.R. Privacy Preserving Big Data mining using Pseudonymization and Homomorphic Encryption. In Proceedings of the 2021 2nd Global Conference for Advancement in Technology (GCAT), Bangalore, India, 1–3 October 2021; pp. 1–4. [Google Scholar]
  92. Chen, M.; Wei, X.; Gao, Y.; Huang, L.; Chen, M.; Kang, B. Deep-broad Learning System for Traffic Flow Prediction toward 5G Cellular Wireless Network. In Proceedings of the 2020 International Wireless Communications and Mobile Computing (IWCMC 2020), Limassol, Cyprus, 15–19 June 2020; pp. 940–945. [Google Scholar]
  93. Ayoubi, S.; Limam, N.; Salahuddin, M.A.; Shahriar, N.; Boutaba, R.; Estrada-Solano, F.; Caicedo, O.M. Machine learning for cognitive network management. IEEE Commun. Mag. 2018, 56, 158–165. [Google Scholar] [CrossRef]
  94. Xu, Y.; Yin, F.; Xu, W.; Lin, J.; Cui, S. Wireless traffic prediction with scalable Gaussian process: Framework, algorithms, and verification. IEEE J. Sel. Areas Commun. 2019, 37, 1291–1306. [Google Scholar] [CrossRef]
  95. Diethe, T.; Borchert, T.; Thereska, E.; Balle, B.; Lawrence, N. Continual learning in practice. arXiv 2019, arXiv:1903.05202. [Google Scholar] [CrossRef]
  96. Kelner, J.M.; Ziółkowski, C. Interference in multi-beam antenna system of 5G network. Int. J. Electron. Telecommun. 2020, 66, 17–23. [Google Scholar] [CrossRef]
  97. Liu, S.; Wei, Y.; Hwang, S.H. Guard band protection for coexistence of 5G base stations and satellite earth stations. ICT Express 2023, 9, 1103–1109. [Google Scholar] [CrossRef]
  98. Zaoutis, E.A.; Liodakis, G.S.; Baklezos, A.T.; Nikolopoulos, C.D.; Ioannidou, M.P.; Vardiambasis, I.O. 6G wireless communications and artificial intelligence-controlled reconfigurable intelligent surfaces: From supervised to federated learning. Appl. Sci. 2025, 15, 3252. [Google Scholar] [CrossRef]
  99. Alsenwi, M.; Tran, N.H.; Bennis, M.; Pandey, S.R.; Bairagi, A.K.; Hong, C.S. Intelligent resource slicing for eMBB and URLLC coexistence in 5G and beyond: A deep reinforcement learning based approach. IEEE Trans. Wirel. Commun. 2021, 20, 4585–4600. [Google Scholar] [CrossRef]
  100. Ali, M.; Chakraborty, S. Enabling video conferencing in low bandwidth. In Proceedings of the 2022 IEEE 19th Annual Consumer Communications & Networking Conference (CCNC), Virtual, 8–11 January 2022; pp. 487–488. [Google Scholar]
  101. Soós, G.; Ficzere, D.; Varga, P. User group behavioral pattern in a cellular mobile network for 5G use-cases. In Proceedings of the NOMS 2020–2020 IEEE/IFIP Network Operations and Management Symposium, Budapest, Hungary, 20–24 April 2020; pp. 1–7. [Google Scholar]
  102. Yang, J.; Qiao, Y.; Zhang, X.; He, H.; Liu, F.; Cheng, G. Characterizing user behavior in mobile internet. IEEE Trans. Emerg. Top. Comput. 2014, 3, 95–106. [Google Scholar] [CrossRef]
  103. Jiang, S.; Wei, B.; Wang, T.; Zhao, Z.; Zhang, X. Big data enabled user behavior characteristics in mobile internet. In Proceedings of the 2017 9th International Conference on Wireless Communications and Signal Processing (WCSP), Nanjing, China, 11–13 October 2017; pp. 1–5. [Google Scholar]
  104. Blackburn, J.; Stanojevic, R.; Erramilli, V.; Iamnitchi, A.; Papagiannaki, K. Last call for the buffet: Economics of cellular networks. In Proceedings of the 19th Annual International Conference on Mobile Computing & Networking (MobiCom’13), Miami, FL, USA, 30 September–4 October 2013; pp. 111–122. [Google Scholar]
  105. Halepovic, E.; Williamson, C. Characterizing and modeling user mobility in a cellular data network. In Proceedings of the 2nd ACM International Workshop on Performance Evaluation of Wireless Ad Hoc, Sensor, and Ubiquitous Networks (PE-WASUN’05), Montreal, ON, Canada, 10–13 October 2005; pp. 71–78. [Google Scholar]
  106. Walelgne, E.A.; Asrese, A.S.; Manner, J.; Bajpai, V.; Ott, J. Understanding data usage patterns of geographically diverse mobile users. IEEE Trans. Netw. Serv. Manag. 2020, 18, 3798–3812. [Google Scholar] [CrossRef]
  107. Siddiqi, M.A.; Yu, H.; Joung, J. 5G Ultra-Reliable Low-Latency Communication Implementation Challenges and Operational Issues with IoT Devices. Electronics 2019, 8, 981. [Google Scholar] [CrossRef]
  108. Hsu, Y.H.; Liao, W. eMBB and URLLC Service Multiplexing Based on Deep Reinforcement Learning in 5G and Beyond. In Proceedings of the 2022 IEEE Wireless Communications and Networking Conference (WCNC 2022), Austin, TX, USA, 10–13 April 2022; pp. 1467–1472. [Google Scholar]
  109. Popovski, P.; Trillingsgaard, K.F.; Simeone, O.; Durisi, G. 5G wireless network slicing for eMBB, URLLC, and mMTC: A communication-theoretic view. IEEE Access 2018, 6, 55765–55779. [Google Scholar] [CrossRef]
  110. Sohaib, R.M.; Onireti, O.; Sambo, Y.; Swash, R.; Ansari, S.; Imran, M.A. Intelligent Resource Management for eMBB and URLLC in 5G and beyond Wireless Networks. IEEE Access 2023, 11, 65205–65221. [Google Scholar] [CrossRef]
  111. Ray, S.; Bhattacharyya, B. Machine learning based cell association for mMTC 5G communication networks. Int. J. Mob. Netw. Des. Innov. 2020, 10, 10–16. [Google Scholar] [CrossRef]
  112. Belhadj, S.; Lakhdar, A.M.; Bendjillali, R.I. Performance comparison of channel coding schemes for 5G massive machine type communications. Indones. J. Electr. Eng. Comput. Sci. 2021, 22, 902. [Google Scholar] [CrossRef]
  113. Zhang, X.; Liang, K.; Chen, J.J.; Liu, J. Bandwidth Allocation for eMBB and mMTC Slices Based on AI-Aided Traffic Prediction. In Proceedings of the International Conference on Internet of Things as a Service (IoTaaS 2022), Cham, Switzerland, 17–18 November 2022; pp. 117–126. [Google Scholar]
  114. Abdelsadek, M.Y.; Gadallah, Y.; Ahmed, M.H. Resource allocation of URLLC and eMBB mixed traffic in 5G networks: A deep learning approach. In Proceedings of the GLOBECOM 2020–2020 IEEE Global Communications Conference, Taipei, Taiwan, 7–11 December 2020; pp. 1–6. [Google Scholar]
  115. Guo, Q.; Gu, R.; Wang, Z.; Zhao, T.; Ji, Y.; Kong, J.; Jue, J.P. Proactive dynamic network slicing with deep learning based short-term traffic prediction for 5G transport network. In Proceedings of the Optical Fiber Communication Conference, San Diego, CA, USA, 3–7 March 2019; p. W3J-3. [Google Scholar]
  116. Xiao, S.; Chen, W. Dynamic allocation of 5G transport network slice bandwidth based on LSTM traffic prediction. In Proceedings of the 2018 IEEE 9th International Conference on Software Engineering and Service Science (ICSESS 2018), Beijing, China, 23–25 November 2018; pp. 735–739. [Google Scholar]
  117. Kumar, N.; Ahmad, A. Machine learning-based QoS and traffic-aware prediction-assisted dynamic network slicing. Int. J. Commun. Netw. Distrib. Syst. 2022, 28, 27–42. [Google Scholar] [CrossRef]
  118. Thantharate, A.; Paropkari, R.; Walunj, V.; Beard, C. DeepSlice: A deep learning approach towards an efficient and reliable network slicing in 5G networks. In Proceedings of the 2019 IEEE 10th Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON), New York, NY, USA, 10–12 October 2019; pp. 0762–0767. [Google Scholar]
  119. Kresch, E.; Kulkarni, S. A poisson based bursty model of internet traffic. In Proceedings of the IEEE 11th International Conference on Computer and Information Technology, Paphos, Cyprus, 31 August–2 September 2011; pp. 255–260. [Google Scholar]
  120. Xu, Y.; Wang, Z.; Leong, W.K.; Leong, B. An end-to-end measurement study of modern cellular data networks. In Proceedings of the 15th International Conference on Passive and Active Measurement: (PAM 2014), Los Angeles, CA, USA, 10–11 March 2014; pp. 34–45. [Google Scholar]
  121. Uitto, M.; Heikkinen, A. Evaluation of live video streaming performance for low latency use cases in 5G. In Proceedings of the 2021 Joint European Conference on Networks and Communications & 6G Summit (EuCNC/6G Summit), Porto, Portugal, 8–11 June 2021; pp. 431–436. [Google Scholar]
  122. Qi, Y.; Hunukumbure, M.; Nekovee, M.; Lorca, J.; Sgardoni, V. Quantifying data rate and bandwidth requirements for immersive 5G experience. In Proceedings of the 2016 IEEE International Conference on Communications Workshops (ICC), Kuala Lumpur, Malaysia, 23–27 May 2016; pp. 455–461. [Google Scholar]
  123. Sharma, L.; Javali, A.; Routray, S.K. An overview of high-speed streaming in 5G. In Proceedings of the 2020 International Conference on Inventive Computation Technologies (ICICT), Coimbatore, India, 26–28 February 2020; pp. 557–562. [Google Scholar]
  124. Uitto, M.; Heikkinen, A. Evaluating 5G uplink performance in low latency video streaming. In Proceedings of the 2022 Joint European Conference on Networks and Communications & 6G Summit (EuCNC/6G Summit), Grenoble, France, 7–10 June 2022; pp. 393–398. [Google Scholar]
  125. Rath, A.; Goyal, S.; Panwar, S. Streamloading: Low-cost high-quality video streaming for mobile users. In Proceedings of the 5th Workshop on Mobile Video (MoVid’13), Oslo, Norway, 27 February 2013; pp. 1–6. [Google Scholar]
126. Keshav, K.; Pradhan, A.K.; Srinivas, T.; Venkataram, P. Bandwidth allocation for interactive multimedia in 5G networks. In Proceedings of the 2021 6th International Conference on Communication and Electronics Systems (ICCES), Coimbatore, India, 8–10 July 2021; pp. 840–845. [Google Scholar]
  127. Keshvadi, S.; Karamollahi, M.; Williamson, C. Traffic characterization of instant messaging apps: A campus-level view. In Proceedings of the 2020 IEEE 45th Conference on Local Computer Networks (LCN), Sydney, Australia, 16–19 November 2020; pp. 225–232. [Google Scholar]
128. Liu, Y.; Guo, L. An empirical study of video messaging services on smartphones. In Proceedings of the Network and Operating System Support for Digital Audio and Video Workshop (NOSSDAV’14), Singapore, 19–21 March 2014; pp. 79–84. [Google Scholar]
  129. Al-Barrak, A. Internet of Things (IoT). In Proceedings of the 2019 2nd International Conference on Engineering Technology and its Applications (IICETA), Najaf, Iraq, 27–28 August 2019; pp. 1–2. [Google Scholar]
  130. Baghel, S.K.; Keshav, K.; Manepalli, V.R. An investigation into traffic analysis for diverse data applications on smartphones. In Proceedings of the 2012 National Conference on Communications (NCC), Kharagpur, India, 3–5 February 2012; pp. 1–5. [Google Scholar]
131. Lam, S. A New Measure for Characterizing Data Traffic. IEEE Trans. Commun. 1978, 26, 137–140. [Google Scholar] [CrossRef]
  132. Ephremides, A. On the “Bursty Factor” as a Measure for Characterizing Data Traffic. IEEE Trans. Commun. 1978, 26, 1791–1792. [Google Scholar] [CrossRef]
133. Li, H.; Yang, A.X.; Zhao, Y.Q.; Wang, Q. Application of Flash in the Analysis of Making Webpages and Mechanical Motions. Appl. Mech. Mater. 2014, 496, 2078–2081. [Google Scholar] [CrossRef]
134. Lecci, M.; Zanella, A.; Zorzi, M. An ns-3 implementation of a bursty traffic framework for virtual reality sources. In Proceedings of the 2021 Workshop on ns-3 (WNS3 ’21), Kolding, Denmark, 23–24 June 2021; pp. 73–80. [Google Scholar]
135. Weerasinghe, T.N.; Balapuwaduge, I.A.; Li, F.Y. Preamble transmission prediction for mMTC bursty traffic: A machine learning based approach. In Proceedings of the 2020 IEEE Global Communications Conference (GLOBECOM 2020), Taipei, Taiwan, 7–11 December 2020; pp. 1–6. [Google Scholar]
  136. Qian, F.; Wang, Z.; Gao, Y.; Huang, J.; Gerber, A.; Mao, Z.; Sen, S.; Spatscheck, O. Periodic transfers in mobile applications: Network-wide origin, impact, and optimization. In Proceedings of the 21st International Conference on World Wide Web (WWW 2012), Lyon, France, 16–20 April 2012; pp. 51–60. [Google Scholar]
137. Falciasecca, G.; Frullone, M.; Riva, G.; Serra, A.M. On the impact of traffic burst on performances of high-capacity cellular systems. In Proceedings of the 40th IEEE Conference on Vehicular Technology (VTC ’90), Orlando, FL, USA, 7–9 May 1990; pp. 646–651. [Google Scholar]
  138. Chousainov, I.A.; Moscholios, I.; Sarigiannidis, P.; Kaloxylos, A.; Logothetis, M. An analytical framework of a C-RAN supporting bursty traffic. In Proceedings of the 2020 IEEE International Conference on Communications (ICC 2020), Dublin, Ireland, 7–11 June 2020; pp. 1–6. [Google Scholar]
  139. Jiang, Z.; Chang, L.F.; Shankaranarayanan, N.K. Providing multiple service classes for bursty data traffic in cellular networks. In Proceedings of the IEEE International Conference on Computer Communications (INFOCOM 2000)—19th Annual Joint Conference of the IEEE Computer and Communications Societies, Tel Aviv, Israel, 26–30 March 2000; pp. 1087–1096. [Google Scholar]
  140. Shah, S.W.H.; Riaz, A.T.; Iqbal, K. Congestion control through dynamic access class barring for bursty MTC traffic in future cellular networks. In Proceedings of the 2018 International Conference on Frontiers of Information Technology (FIT), Islamabad, Pakistan, 17–19 December 2018; pp. 176–181. [Google Scholar]
  141. Anamuro, C.V.; Lagrange, X. Mobile traffic classification through burst traffic statistical features. In Proceedings of the 97th IEEE Vehicular Technology Conference (VTC 2023), Florence, Italy, 20–23 June 2023; pp. 1–5. [Google Scholar]
  142. Weerasinghe, T.N.; Balapuwaduge, I.A.; Li, F.Y. Supervised learning-based arrival prediction and dynamic preamble allocation for bursty traffic. In Proceedings of the IEEE INFOCOM 2019-IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), Paris, France, 29 April–2 May 2019; pp. 1–6. [Google Scholar]
  143. Wang, W.; Zhou, C.; He, H.; Wu, W.; Zhuang, W.; Shen, X. Cellular traffic load prediction with LSTM and Gaussian process regression. In Proceedings of the 2020 IEEE International Conference on Communications (ICC 2020), Virtual, 7–11 June 2020; pp. 1–6. [Google Scholar]
144. Erman, J.; Gerber, A.; Ramakrishnan, K.K.; Sen, S.; Spatscheck, O. Over the top video: The gorilla in cellular networks. In Proceedings of the 2011 ACM SIGCOMM Conference on Internet Measurement (IMC’11), Berlin, Germany, 2–4 November 2011; pp. 127–136. [Google Scholar]
  145. Blaszczyszyn, B.; Karray, M.K. Impact of mean user speed on blocking and cuts of streaming traffic in cellular networks. In Proceedings of the 2008 14th European Wireless Conference (EW 2008), Prague, Czech Republic, 22–25 June 2008; pp. 1–7. [Google Scholar]
  146. Yao, L.; Bao, J.; Ding, F.; Zhang, N.; Tong, E. Research on traffic flow forecast based on cellular signaling data. In Proceedings of the 2021 IEEE International Conference on Smart Internet of Things (SmartIoT), Jeju Island, Republic of Korea, 13–15 August 2021; pp. 193–199. [Google Scholar]
147. Khan, S.; Hussain, A.; Nazir, S.; Khan, F.; Oad, A.; Alshehri, M.D. Efficient and reliable hybrid deep learning-enabled model for congestion control in 5G/6G networks. Comput. Commun. 2022, 182, 31–40. [Google Scholar] [CrossRef]
  148. Zheng, K.; Yang, Z.; Zhang, K.; Chatzimisios, P.; Yang, K.; Xiang, W. Big data-driven optimization for mobile networks toward 5G. IEEE Netw. 2016, 30, 44–51. [Google Scholar] [CrossRef]
  149. Wang, J.; Tang, J.; Xu, Z.; Wang, Y.; Xue, G.; Zhang, X.; Yang, D. Spatiotemporal modeling and prediction in cellular networks: A big data enabled deep learning approach. In Proceedings of the IEEE Conference on Computer Communications (INFOCOM 2017), Atlanta, GA, USA, 1–4 May 2017; pp. 1–9. [Google Scholar]
  150. Paul, U.; Liu, J.; Troia, S.; Falowo, O.; Maier, G. Traffic-profile and machine learning based regional data center design and operation for 5G network. J. Commun. Netw. 2019, 21, 569–583. [Google Scholar] [CrossRef]
  151. Suznjevic, M.; Matijasevic, M. Trends in evolution of the network traffic of massively multiplayer online role-playing games. In Proceedings of the 2015 13th International Conference on Telecommunications (ConTEL), Graz, Austria, 13–15 July 2015; pp. 1–8. [Google Scholar]
  152. Qi, C.; Zhao, Z.; Li, R.; Zhang, H. Characterizing and modeling social mobile data traffic in cellular networks. In Proceedings of the 2016 IEEE 83rd Vehicular Technology Conference (VTC Spring), Nanjing, China, 15–18 May 2016; pp. 1–5. [Google Scholar]
  153. Park, J.; Popovski, P.; Simeone, O. Minimizing latency to support VR social interactions over wireless cellular systems via bandwidth allocation. IEEE Wirel. Commun. Lett. 2018, 7, 776–779. [Google Scholar] [CrossRef]
  154. Prasad, A.; Uusitalo, M.A.; Navrátil, D.; Säily, M. Challenges for enabling virtual reality broadcast using 5G small cell network. In Proceedings of the 2018 IEEE Wireless Communications and Networking Conference Workshops (WCNCW), Barcelona, Spain, 15–18 April 2018; pp. 220–225. [Google Scholar]
  155. Taleb, T.; Nadir, Z.; Flinck, H.; Song, J. Extremely interactive and low-latency services in 5G and beyond mobile systems. IEEE Commun. Stand. Mag. 2021, 5, 114–119. [Google Scholar] [CrossRef]
  156. Chen, D.Y.; Lin, P.C.; Chen, K.T. Does online mobile gaming overcharge you for the fun? In Proceedings of the 2013 12th Annual Workshop on Network and Systems Support for Games (NetGames), Denver, CO, USA, 9–10 December 2013; pp. 1–2. [Google Scholar]
  157. Drajic, D.; Krco, S.; Tomic, I.; Popovic, M.; Zeljkovic, N.; Nikaein, N.; Svoboda, P. Impact of online games and M2M applications traffic on performance of HSPA radio access networks. In Proceedings of the 2012 Sixth International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing (IMIS), Palermo, Italy, 4–6 July 2012; pp. 880–885. [Google Scholar]
  158. Barba, P.; Stramiello, J.; Funk, E.K.; Richter, F.; Yip, M.C.; Orosco, R.K. Remote telesurgery in humans: A systematic review. Surg. Endosc. 2022, 36, 2771–2777. [Google Scholar] [CrossRef]
  159. Fiadino, P.; Casas, P.; Schiavone, M.; D’Alconzo, A. Online Social Networks anatomy: On the analysis of Facebook and WhatsApp in cellular networks. In Proceedings of the 2015 IFIP Networking Conference (IFIP Networking), Toulouse, France, 20–22 May 2015; pp. 1–9. [Google Scholar]
  160. Huang, Q.; Lee, P.P.; He, C.; Qian, J.; He, C. Fine-grained dissection of WeChat in cellular networks. In Proceedings of the 2015 IEEE 23rd International Symposium on Quality of Service (IWQoS), Portland, OR, USA, 15–16 June 2015; pp. 309–318. [Google Scholar]
  161. Sun, F.; Wang, P.; Zhao, J.; Xu, N.; Zeng, J.; Tao, J.; Song, K.; Deng, C.; Lui, J.C.; Guan, X. Mobile data traffic prediction by exploiting time-evolving user mobility patterns. IEEE Trans. Mob. Comput. 2021, 21, 4456–4470. [Google Scholar] [CrossRef]
  162. Shafiq, M.; Yu, X.; Laghari, A.A. WeChat text messages service flow traffic classification using machine learning technique. In Proceedings of the 2016 6th International Conference on IT Convergence and Security (ICITCS), Prague, Czech Republic, 26–29 September 2016; pp. 1–5. [Google Scholar]
  163. Yadav, A.; Singh, H.; Mala, S.; Shankar, A. Recognizing Massive Mobile Traffic Patterns to Understand Urban Dynamics. In Proceedings of the 11th International Conference on Cloud Computing, Data Science & Engineering (Confluence 2021), Noida, India, 28–29 January 2021; pp. 894–898. [Google Scholar]
  164. Chen, L.; Yang, D.; Zhang, D.; Wang, C.; Li, J. Deep mobile traffic forecast and complementary base station clustering for C-RAN optimization. J. Netw. Comput. Appl. 2018, 121, 59–69. [Google Scholar] [CrossRef]
  165. Zhou, Y.; Fadlullah, Z.M.; Mao, B.; Kato, N. A deep-learning-based radio resource assignment technique for 5G ultra dense networks. IEEE Netw. 2018, 32, 28–34. [Google Scholar] [CrossRef]
  166. Zhao, X.; Yang, K.; Chen, Q.; Peng, D.; Jiang, H.; Xu, X.; Shuang, X. Deep learning based mobile data offloading in mobile edge computing systems. Future Gener. Comput. Syst. 2019, 99, 346–355. [Google Scholar] [CrossRef]
  167. Trinh, H.D.; Giupponi, L.; Dini, P. Mobile traffic prediction from raw data using LSTM networks. In Proceedings of the 29th IEEE Annual International Symposium on Personal Indoor and Mobile Radio Communications (PIMRC 2018), Bologna, Italy, 9–12 September 2018; pp. 1827–1832. [Google Scholar]
  168. Chen, M.; Miao, Y.; Gharavi, H.; Hu, L.; Humar, I. Intelligent traffic adaptive resource allocation for edge computing-based 5G networks. IEEE Trans. Cogn. Commun. Netw. 2019, 6, 499–508. [Google Scholar] [CrossRef]
  169. Azzouni, A.; Pujolle, G. A long short-term memory recurrent neural network framework for network traffic matrix prediction. arXiv 2017, arXiv:1705.05690. [Google Scholar] [CrossRef]
  170. Dalgkitsis, A.; Louta, M.; Karetsos, G.T. Traffic forecasting in cellular networks using the LSTM RNN. In Proceedings of the 22nd Pan-Hellenic Conference on Informatics, Athens, Greece, 29 November–1 December 2018; pp. 28–33. [Google Scholar]
  171. Alawe, I.; Ksentini, A.; Hadjadj-Aoul, Y.; Bertin, P. Improving traffic forecasting for 5G core network scalability: A machine learning approach. IEEE Netw. 2018, 32, 42–49. [Google Scholar] [CrossRef]
  172. Guerra-Gomez, R.; Ruiz-Boque, S.; Garcia-Lozano, M.; Bonafe, J.O. Machine learning adaptive computational capacity prediction for dynamic resource management in C-RAN. IEEE Access 2020, 8, 89130–89142. [Google Scholar] [CrossRef]
173. Mostafa, A.M. IoT Architecture and Protocols in 5G Environment. In Powering the Internet of Things With 5G Networks; Mohanan, V., Budiarto, R., Aldmour, I., Eds.; IGI Global: Hershey, PA, USA, 2018; pp. 105–130. [Google Scholar]
  174. Macriga, G.A.; Sakthy, S.S.; Niranjan, R.; Sahu, S. An Emerging Technology: Integrating IoT with 5G Cellular Network. In Proceedings of the 2021 4th International Conference on Computing and Communications Technologies (ICCCT), Chennai, India, 16–17 December 2021; pp. 208–214. [Google Scholar]
175. Ardita, M.; Orisa, M. Wi-Fi-Based Internet of Things (IoT) Data Communication Performance in Dense Wireless Network Traffic Conditions. J. Electr. Eng. Mechatron. Comput. Sci. 2021, 4, 31–36. [Google Scholar] [CrossRef]
  176. Sharma, S.K.; Wang, X. Distributed caching enabled peak traffic reduction in ultra-dense IoT networks. IEEE Commun. Lett. 2018, 22, 1252–1255. [Google Scholar] [CrossRef]
177. Finley, B.; Vesselkov, A. Cellular IoT traffic characterization and evolution. In Proceedings of the 2019 IEEE 5th World Forum on Internet of Things (WF-IoT), Limerick, Ireland, 15–18 April 2019; pp. 622–627. [Google Scholar]
  178. Gong, Y.; Zhang, Z.; Wang, K.; Gu, Y.; Wu, Y. IoT-Oriented Single-Transmitter Multiple-Receiver Wireless Charging Systems Using Hybrid Multi-Frequency Pulse Modulation. IEEE Trans. Magn. 2024, 60, 1–6. [Google Scholar] [CrossRef]
  179. Ma, H.; Tao, Y.; Fang, Y.; Chen, P.; Li, Y. Multi-Carrier Initial-Condition-Index-aided DCSK Scheme: An Efficient Solution for Multipath Fading Channel. IEEE Trans. Veh. Technol. 2025, 74, 15743–15757. [Google Scholar] [CrossRef]
  180. Alzahrani, R.J.; Alzahrani, A. Survey of traffic classification solution in IoT networks. Int. J. Comput. Appl. 2021, 183, 37–45. [Google Scholar] [CrossRef]
  181. Khedkar, S.P.; Canessane, R.A.; Najafi, M.L. Prediction of traffic generated by IoT devices using statistical learning time series algorithms. Wirel. Commun. Mob. Comput. 2021, 2021, 5366222. [Google Scholar] [CrossRef]
  182. Chen, X.; Liu, Y.; Zhang, J. Traffic prediction for Internet of Things through support vector regression model. Internet Technol. Lett. 2022, 5, e336. [Google Scholar] [CrossRef]
  183. Abdellah, A.R.; Mahmood, O.A.K.; Paramonov, A.; Koucheryavy, A. IoT traffic prediction using multi-step ahead prediction with neural network. In Proceedings of the 2019 11th International Congress on Ultra-Modern Telecommunications and Control Systems and Workshops (ICUMT), Dublin, Ireland, 28–30 October 2019; pp. 1–4. [Google Scholar]
  184. Wang, R.; Zhang, Y.; Peng, L.; Fortino, G.; Ho, P.H. Time-varying-aware network traffic prediction via deep learning in IoT. IEEE Trans. Ind. Inform. 2022, 18, 8129–8137. [Google Scholar] [CrossRef]
  185. Lu, Y.; Liu, L.; Panneerselvam, J.; Yuan, B.; Gu, J.; Antonopoulos, N. A GRU-based prediction framework for intelligent resource management at cloud data centres in the age of 5G. IEEE Trans. Cogn. Commun. Netw. 2019, 6, 486–498. [Google Scholar] [CrossRef]
  186. Hu, C.; Fan, W.; Zeng, E.; Hang, Z.; Wang, F.; Qi, L.; Bhuiyan, M.Z.A. Digital twin-assisted real-time traffic data prediction method for 5G-enabled internet of vehicles. IEEE Trans. Ind. Inform. 2021, 18, 2811–2819. [Google Scholar] [CrossRef]
  187. Abdelmotalib, A.; Wu, Z.; Zhou, P. Background traffic analysis for social media applications on smartphones. In Proceedings of the 2012 2nd International Conference on Instrumentation, Measurement, Computer, Communication and Control (IMCCC), Harbin, China, 8–10 December 2012; pp. 817–818. [Google Scholar]
  188. Gupta, S.; Garg, R.; Jain, N.; Naik, V.; Kaul, S. Android phone-based appraisal of app behavior on cell networks. In Proceedings of the 1st International Conference on Mobile Software Engineering and Systems (MobileSoft 2014), Hyderabad, India, 2–3 June 2014; pp. 54–57. [Google Scholar]
  189. Venkataramani, A.; Kokku, R.; Dahlin, M. TCP Nice: A mechanism for background transfers. In Proceedings of the 5th Symposium on Operating Systems Design and Implementation, Boston, MA, USA, 9–11 December 2002; pp. 329–343. [Google Scholar]
  190. Liu, C.; Zeng, L.; Shi, J.; Xu, F.; Xiong, G.; Yiu, S.M. Auto-identification of background traffic based on autonomous periodic interaction. In Proceedings of the 2017 IEEE 36th International Performance Computing and Communications Conference (IPCCC), San Diego, CA, USA, 10–12 December 2017; pp. 1–8. [Google Scholar]
  191. He, K.; Chen, X.; Wu, Q.; Yu, S.; Zhou, Z. Graph attention spatial-temporal network with collaborative global-local learning for citywide mobile traffic prediction. IEEE Trans. Mob. Comput. 2020, 21, 1244–1256. [Google Scholar] [CrossRef]
  192. Do, Q.H.; Doan, T.T.H.; Nguyen, T.V.A.; Duong, N.T.; Linh, V.V. Prediction of data traffic in telecom networks based on deep neural networks. J. Comput. Sci. 2020, 16, 1268–1277. [Google Scholar] [CrossRef]
193. Mehmeti, F.; La Porta, T.F. Resource allocation for improved user experience with live video streaming in 5G. In Proceedings of the 17th ACM Symposium on QoS and Security for Wireless and Mobile Network (Q2SWinet ’21), Alicante, Spain, 22–26 November 2021; pp. 69–78. [Google Scholar]
  194. Martin, A.; Egaña, J.; Flórez, J.; Montalban, J.; Olaizola, I.G.; Quartulli, M.; Viola, R.; Zorrilla, M. Network resource allocation system for QoE-aware delivery of media services in 5G networks. IEEE Trans. Broadcast. 2018, 64, 561–574. [Google Scholar] [CrossRef]
195. Feng, H.; Ma, M. Traffic Prediction over Wireless Networks. In Wireless Network Traffic and Quality of Service Support: Trends and Standards; Lagkas, T.D., Angelidis, P., Georgiadis, L., Eds.; IGI Global: Hershey, PA, USA, 2010; pp. 87–112. [Google Scholar]
  196. Trinh, H.D. Data Analytics for Mobile Traffic in 5G Networks Using Machine Learning Techniques. Ph.D. Thesis, Polytechnic University of Catalonia, Barcelona, Spain, 2020. [Google Scholar]
  197. OpenSignal.com. Available online: https://opencellid.org/ (accessed on 5 December 2023).
  198. Shafiq, M.Z.; Ji, L.; Liu, A.X.; Pang, J.; Wang, J. Characterizing geospatial dynamics of application usage in a 3G cellular data network. In Proceedings of the 2012 IEEE INFOCOM, Orlando, FL, USA, 25–30 March 2012; pp. 1341–1349. [Google Scholar]
199. Sun, H.; Halepovic, E.; Williamson, C.; Wu, Y. Characterization of CDMA2000 Cellular Data Network Traffic. Networks 2000, 7, 10. [Google Scholar]
  200. Paul, U.; Subramanian, A.P.; Buddhikot, M.M.; Das, S.R. Understanding traffic dynamics in cellular data networks. In Proceedings of the IEEE INFOCOM 2011, Shanghai, China, 10–15 April 2011; pp. 882–890. [Google Scholar]
  201. Wang, Y.; Faloutsos, M.; Zang, H. On the usage patterns of multimodal communication: Countries and evolution. In Proceedings of the IEEE INFOCOM 2013, Turin, Italy, 14–19 April 2013; pp. 3135–3140. [Google Scholar]
  202. Cardona, J.C.; Stanojevic, R.; Laoutaris, N. Collaborative consumption for mobile broadband: A quantitative study. In Proceedings of the 10th ACM International on Conference on Emerging Networking Experiments and Technologies, Sydney, Australia, 2–5 December 2014; pp. 307–318. [Google Scholar]
203. Zhang, Y.; Arvidsson, A. Understanding the characteristics of cellular data traffic. In Proceedings of the 2012 ACM SIGCOMM Workshop on Cellular Networks: Operations, Challenges, and Future Design, Helsinki, Finland, 13 August 2012; pp. 13–18. [Google Scholar]
  204. Li, R.; Zhao, Z.; Zheng, J.; Mei, C.; Cai, Y.; Zhang, H. The learning and prediction of application-level traffic data in cellular networks. IEEE Trans. Wirel. Commun. 2017, 16, 3899–3912. [Google Scholar] [CrossRef]
  205. Tikunov, D.; Nishimura, T. Traffic prediction for mobile network using Holt-Winter’s exponential smoothing. In Proceedings of the 2007 15th International Conference on Software, Telecommunications and Computer Networks, Split-Dubrovnik, Croatia, 27–29 September 2007; pp. 1–5. [Google Scholar]
  206. Tan, I.K.; Hoong, P.K.; Keong, C.Y. Towards forecasting low network traffic for software patch downloads: An ARMA model forecast using CRONOS. In Proceedings of the 2010 2nd International Conference on Computer and Network Technology, Bangkok, Thailand, 23–25 April 2010; pp. 88–92. [Google Scholar]
  207. Sadek, N.; Khotanzad, A. Multi-scale high-speed network traffic prediction using k-factor Gegenbauer ARMA model. In Proceedings of the 2004 IEEE International Conference on Communications, Paris, France, 20–24 June 2004; pp. 2148–2152. [Google Scholar]
  208. Sciancalepore, V.; Samdanis, K.; Costa-Perez, X.; Bega, D.; Gramaglia, M.; Banchs, A. Mobile traffic forecasting for maximizing 5G network slicing resource utilization. In Proceedings of the IEEE Conference on Computer Communications (INFOCOM 2017), Atlanta, GA, USA, 1–4 May 2017; pp. 1–9. [Google Scholar]
  209. Medhn, S.; Seifu, B.; Salem, A.; Hailemariam, D. Mobile data traffic forecasting in UMTS networks based on SARIMA model: The case of Addis Ababa, Ethiopia. In Proceedings of the 2017 IEEE AFRICON, Cape Town, South Africa, 18–20 September 2017; pp. 285–290. [Google Scholar]
  210. AsSadhan, B.; Zeb, K.; Al-Muhtadi, J.; Alshebeili, S. Anomaly detection based on LRD behavior analysis of decomposed control and data planes network traffic using SOSS and FARIMA models. IEEE Access 2017, 5, 13501–13519. [Google Scholar] [CrossRef]
  211. Whittaker, J.; Garside, S.; Lindveld, K. Tracking and predicting a network traffic process. Int. J. Forecast. 1997, 13, 51–61. [Google Scholar] [CrossRef]
  212. Choi, S.; Shin, K.G. Adaptive bandwidth reservation and admission control in QoS-sensitive networks. IEEE Trans. Parallel Distrib. Syst. 2002, 13, 882–897. [Google Scholar] [CrossRef]
  213. Zhou, B.; He, D.; Sun, Z. Traffic Modeling and Prediction Using ARIMA/GARCH Model; Springer: Boston, MA, USA, 2006; pp. 101–121. [Google Scholar]
214. Mitchell, K.; Sohraby, K. An analysis of the effects of mobility on bandwidth allocation strategies in multi-class cellular wireless networks. In Proceedings of the IEEE INFOCOM 2001, Anchorage, AK, USA, 22–26 April 2001; Volume 2, pp. 1005–1011. [Google Scholar]
  215. Mehdi, H.; Pooranian, Z.; Vinueza Naranjo, P.G. Cloud traffic prediction based on fuzzy ARIMA model with low dependence on historical data. Trans. Emerg. Telecommun. Technol. 2022, 33, e3731. [Google Scholar] [CrossRef]
  216. Tran, Q.T.; Hao, L.; Trinh, Q.K. Cellular network traffic prediction using exponential smoothing methods. J. Inf. Commun. Technol. 2019, 18, 1–18. [Google Scholar] [CrossRef]
  217. Levine, D.A.; Akyildiz, I.F.; Naghshineh, M. A resource estimation and call admission algorithm for wireless multimedia networks using the shadow cluster concept. IEEE/ACM Trans. Netw. 1997, 5, 1–12. [Google Scholar] [CrossRef]
  218. Aceto, G.; Bovenzi, G.; Ciuonzo, D.; Montieri, A.; Persico, V.; Pescapé, A. Characterization and prediction of mobile-app traffic using Markov modeling. IEEE Trans. Netw. Serv. Manag. 2021, 18, 907–925. [Google Scholar] [CrossRef]
219. Dash, S.; Maheshwari, S.; Mahapatra, S. Traffic prediction in future mobile networks using hidden Markov model. In Proceedings of the IEICE Smart Wireless Communications (SmartCom 2019), New Jersey, USA, 4–6 November 2019. [Google Scholar]
  220. Bouzidi, E.H.; Luong, D.H.; Outtagarts, A.; Hebbar, A.; Langar, R. Online-based learning for predictive network latency in software-defined networks. In Proceedings of the 2018 IEEE Global Communications Conference (GLOBECOM 2018), Abu Dhabi, United Arab Emirates, 9–13 December 2018; pp. 1–6. [Google Scholar]
  221. Bega, D.; Gramaglia, M.; Fiore, M.; Banchs, A.; Costa-Perez, X. DeepCog: Cognitive network management in sliced 5G networks with deep learning. In Proceedings of the IEEE Conference on Computer Communications (INFOCOM 2019), Paris, France, 29 April–2 May 2019; pp. 280–288. [Google Scholar]
  222. Liang, D.; Zhang, J.; Jiang, S.; Zhang, X.; Wu, J.; Sun, Q. Mobile traffic prediction based on densely connected CNN for cellular networks in highway scenarios. In Proceedings of the 11th International Conference on Wireless Communications and Signal Processing (WCSP 2019), Xi’an, China, 23–25 October 2019; pp. 1–5. [Google Scholar]
  223. Nikravesh, A.Y.; Ajila, S.A.; Lung, C.H.; Ding, W. Mobile network traffic prediction using MLP, MLPWD, and SVM. In Proceedings of the 2016 IEEE International Congress on Big Data (Big Data Congress), San Francisco, CA, USA, 27 June–2 July 2016; pp. 402–409. [Google Scholar]
  224. Jaffry, S.; Hasan, S.F. Cellular traffic prediction using recurrent neural networks. In Proceedings of the IEEE 5th International Symposium on Telecommunication Technologies (ISTT), Shah Alam, Malaysia, 9–11 November 2020; pp. 94–98. [Google Scholar]
  225. Selvamanju, E.; Shalini, V.B. Deep Learning based Mobile Traffic Flow Prediction Model in 5G Cellular Networks. In Proceedings of the 3rd International Conference on Smart Electronics and Communication (ICOSEC 2022), Trichy, India, 20–22 October 2022; pp. 1349–1353. [Google Scholar]
  226. Yimeng, S.; Jianhua, L.; Jian, M.; Yaxing, Q.; Zhe, Z.; Chunhui, L. A Prediction Method of 5G Base Station Cell Traffic Based on Improved Transformer Model. In Proceedings of the 2022 IEEE 4th International Conference on Civil Aviation Safety and Information Technology (ICCASIT), Dali, China, 12–14 October 2022; pp. 40–45. [Google Scholar]
  227. Pfülb, B.; Hardegen, C.; Gepperth, A.; Rieger, S. A study of deep learning for network traffic data forecasting. In Proceedings of the 28th International Conference on Artificial Neural Networks (ICANN 2019), Munich, Germany, 17–19 September 2019; pp. 497–512. [Google Scholar]
  228. Alsaade, F.W.; Al-Adhaileh, M.H. Cellular traffic prediction based on an intelligent model. Mob. Inf. Syst. 2021, 2021, 1–15. [Google Scholar] [CrossRef]
  229. Shawel, B.S.; Debella, T.T.; Tesfaye, G.; Tefera, Y.Y.; Woldegebreal, D.H. Hybrid Prediction Model for Mobile Data Traffic: A Cluster-level Approach. In Proceedings of the 2020 International Joint Conference on Neural Networks (IJCNN 2020), Glasgow, UK, 19–24 July 2020; pp. 1–8. [Google Scholar]
  230. Hua, Y.; Zhao, Z.; Liu, Z.; Chen, X.; Li, R.; Zhang, H. Traffic prediction based on random connectivity in deep learning with long short-term memory. In Proceedings of the 2018 IEEE 88th Vehicular Technology Conference (VTC 2018), Chicago, IL, USA, 27–30 August 2018; pp. 1–6. [Google Scholar]
  231. Zhang, S.; Zhao, S.; Yuan, M.; Zeng, J.; Yao, J.; Lyu, M.R.; King, I. Traffic prediction-based power saving in cellular networks: A machine learning method. In Proceedings of the 25th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, Redondo Beach, CA, USA, 7–10 November 2017; pp. 1–10. [Google Scholar]
232. Zeb, S.; Rathore, M.A.; Mahmood, A.; Hassan, S.A.; Kim, J.; Gidlund, M. Edge intelligence in softwarized 6G: Deep learning-enabled network traffic predictions. In Proceedings of the 2021 IEEE Globecom Workshops (GC Wkshps), Madrid, Spain, 7–11 December 2021; pp. 1–6. [Google Scholar]
233. Andreoletti, D.; Troia, S.; Musumeci, F.; Giordano, S.; Maier, G.; Tornatore, M. Network traffic prediction based on diffusion convolutional recurrent neural network. In Proceedings of the IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS 2019), Paris, France, 29 April–2 May 2019; pp. 246–251. [Google Scholar]
  234. Pelekanou, A.; Anastasopoulos, M.; Tzanakaki, A.; Simeonidou, D. Provisioning of 5G services employing machine learning techniques. In Proceedings of the 2018 International Conference on Optical Network Design and Modeling (ONDM), Dublin, Ireland, 14–17 May 2018; pp. 200–205. [Google Scholar]
  235. Li, M.; Wang, Y.; Wang, Z.; Zheng, H. A deep learning method based on an attention mechanism for wireless network traffic prediction. Ad Hoc Netw. 2020, 107, 102258. [Google Scholar] [CrossRef]
  236. Aldhyani, T.H.; Alrasheedi, M.; Alqarni, A.A.; Alzahrani, M.Y.; Bamhdi, A.M. Intelligent hybrid model to enhance time series models for predicting network traffic. IEEE Access 2020, 8, 130431–130451. [Google Scholar] [CrossRef]
  237. Fang, Z.; Zhao, R.; Yang, H. 5G Network Traffic Prediction Based on WP-Deep Gaussian. In Proceedings of the 2022 IEEE 4th International Conference on Civil Aviation Safety and Information Technology (ICCASIT), Dali, China, 12–14 October 2022; pp. 938–942. [Google Scholar]
  238. Li, J.; Li, X. 5G Network Traffic Prediction based on EEMD-GAN. In Proceedings of the 7th International Conference on Cyber Security and Information Engineering, Brisbane, Australia, 23–25 September 2022; pp. 408–412. [Google Scholar]
  239. Selvamanju, E.; Shalini, V.B. Archimedes optimization algorithm with deep belief network based mobile network traffic prediction for 5G cellular networks. In Proceedings of the 2022 4th International Conference on Smart Systems and Inventive Technology (ICSSIT), Tirunelveli, India, 20–22 January 2022; pp. 370–376. [Google Scholar]
240. Gong, J.; Li, T.; Wang, H.; Liu, Y.; Wang, X.; Wang, Z.; Deng, C.; Feng, J.; Jin, D.; Li, Y. KGDA: A knowledge graph driven decomposition approach for cellular traffic prediction. ACM Trans. Intell. Syst. Technol. 2024, 15, 1–22. [Google Scholar] [CrossRef]
  241. Mehri, H.; Chen, H.; Mehrpouyan, H. Cellular Traffic Prediction Using Online Prediction Algorithms. arXiv 2024, arXiv:2405.05239. [Google Scholar] [CrossRef]
242. Cai, Z.; Tan, C.; Zhang, J.; Zhu, L.; Feng, Y. DBSTGNN-Att: Dual branch spatio-temporal graph neural network with an attention mechanism for cellular network traffic prediction. Appl. Sci. 2024, 14, 2173. [Google Scholar] [CrossRef]
  243. Hao, M.; Sun, X.; Li, Y.; Zhang, H. Edge-side cellular network traffic prediction based on trend graph characterization network. IEEE Trans. Netw. Sci. Eng. 2024, 11, 6118–6129. [Google Scholar] [CrossRef]
  244. Cao, S.; Wu, L.; Zhang, R.; Lu, J.; Wu, D.; Zhang, Z. Hypergraph attention recurrent network for cellular traffic prediction. IEEE Trans. Netw. Serv. Manag. 2024, 22, 1760–1774. [Google Scholar] [CrossRef]
  245. Zhou, J.; Luo, A.; Zhou, N. Mobile network traffic prediction based on cross-patch feature fusion. In Proceedings of the 2024 4th International Conference on Neural Networks, Information and Communication Engineering (NNICE), Guangzhou, China, 19–21 January 2024; pp. 1006–1010. [Google Scholar]
  246. Jiang, W.; Zhang, Y.; Han, H.; Huang, Z.; Li, Q.; Mu, J. Mobile traffic prediction in consumer applications: A multimodal deep learning approach. IEEE Trans. Consum. Electron. 2024, 70, 3425–3435. [Google Scholar] [CrossRef]
  247. Wu, X.; Wu, C. CLPREM: A real-time traffic prediction method for 5G mobile network. PLoS ONE 2024, 19, e0288296. [Google Scholar] [CrossRef]
  248. Pandey, C.; Tiwari, V.; Rodrigues, J.J.; Roy, D.S. 5GT-GAN-NET: Internet traffic data forecasting with supervised loss based synthetic data over 5G. IEEE Trans. Mob. Comput. 2024, 23, 10694–10705. [Google Scholar] [CrossRef]
  249. Huang, C.W.; Chen, P.C. Mobile traffic offloading with forecasting using deep reinforcement learning. arXiv 2019, arXiv:1911.07452. [Google Scholar] [CrossRef]
  250. Dommaraju, V.S.; Nathani, K.; Tariq, U.; Al-Turjman, F.; Kallam, S.; Patan, R. ECMCRR-MPDNL for Cellular Network Traffic Prediction with Big Data. IEEE Access 2020, 8, 113419–113428. [Google Scholar] [CrossRef]
  251. Gao, Z.; Yan, S.; Zhang, J.; Han, B.; Wang, Y.; Xiao, Y.; Ji, Y. Deep reinforcement learning-based policy for baseband function placement and routing of RAN in 5G and beyond. J. Light. Technol. 2021, 40, 470–480. [Google Scholar] [CrossRef]
  252. Zeng, Q.; Sun, Q.; Chen, G.; Duan, H.; Li, C.; Song, G. Traffic prediction of wireless cellular network based on deep transfer learning and cross-domain data. IEEE Access 2020, 8, 172387–172397. [Google Scholar] [CrossRef]
  253. Bouzidi, E.H.; Outtagarts, A.; Langar, R. Deep reinforcement learning application for network latency management in software defined networks. In Proceedings of the 2019 IEEE Global Communications Conference (GLOBECOM 2019), Waikoloa, HI, USA, 9–13 December 2019; pp. 1–6. [Google Scholar]
  254. Zhu, M.; Gu, J.; Shen, T.; Shi, C.; Ren, X. Energy-efficient and QoS guaranteed BBU aggregation in CRAN based on heuristic-assisted deep reinforcement learning. J. Light. Technol. 2021, 40, 575–587. [Google Scholar] [CrossRef]
  255. Zorello, L.M.M.; Bliek, L.; Troia, S.; Guns, T.; Verwer, S.; Maier, G. Baseband-Function Placement with Multi-Task Traffic Prediction for 5G Radio Access Networks. IEEE Trans. Netw. Serv. Manag. 2022, 19, 5104–5119. [Google Scholar] [CrossRef]
  256. Nan, J.; Ai, M.; Liu, A.; Duan, X. Regional-union based federated learning for wireless traffic prediction in 5G-Advanced/6G network. In Proceedings of the 2022 IEEE/CIC International Conference on Communications in China (ICCC Workshops), Foshan, China, 11–13 August 2022; pp. 423–427. [Google Scholar]
  257. Uyan, U.; Isyapar, M.T.; Ozturk, M.U. 5G Long-Term and Large-Scale Mobile Traffic Forecasting. arXiv 2022, arXiv:2212.10869. [Google Scholar]
  258. Cao, B.; Fan, J.; Yuan, M.; Li, Y. Toward accurate energy-efficient cellular network: Switching off excessive carriers based on traffic profiling. In Proceedings of the 31st Annual ACM Symposium on Applied Computing, Pisa, Italy, 4–8 April 2016; pp. 546–551. [Google Scholar]
  259. Li, R.; Zhao, Z.; Zhou, X.; Zhang, H. Energy savings scheme in radio access networks via compressive sensing-based traffic load prediction. Trans. Emerg. Telecommun. Technol. 2014, 25, 468–478. [Google Scholar] [CrossRef]
  260. Peng, C.; Lee, S.B.; Lu, S.; Luo, H.; Li, H. Traffic-driven power saving in operational 3G cellular networks. In Proceedings of the 17th Annual International Conference on Mobile Computing and Networking, Las Vegas, NV, USA, 19–23 September 2011; pp. 121–132. [Google Scholar]
  261. Wang, G.; Guo, C.; Wang, S.; Feng, C. A traffic prediction-based sleeping mechanism with low complexity in femtocell networks. In Proceedings of the 2013 IEEE International Conference on Communications Workshops (ICC), Budapest, Hungary, 9–13 June 2013; pp. 560–565. [Google Scholar]
  262. Islam, N.; Kandeepan, S.; Scott, J. Energy efficiency of cellular base stations with ternary-state transceivers. In Proceedings of the 9th International Conference on Signal Processing and Communication Systems (ICSPCS 2015), Cairns, Australia, 14–16 December 2015; pp. 1–7. [Google Scholar]
  263. Li, R.; Zhao, Z.; Chen, X.; Palicot, J.; Zhang, H. TACT: A transfer actor-critic learning framework for energy saving in cellular radio access networks. IEEE Trans. Wirel. Commun. 2014, 13, 2000–2011. [Google Scholar] [CrossRef]
  264. Saker, L.; Elayoubi, S.E.; Combes, R.; Chahed, T. Optimal control of wake-up mechanisms of femtocells in heterogeneous networks. IEEE J. Sel. Areas Commun. 2012, 30, 664–672. [Google Scholar] [CrossRef]
  265. Zhao, B.; Wu, T.; Fang, F.; Wang, L.; Ren, W.; Yang, X.; Ruan, Z.; Kou, X. Prediction method of 5G high-load cellular based on BP neural network. In Proceedings of the 8th International Conference on Mechatronics and Robotics Engineering (ICMRE 2022), Munich, Germany, 10–12 February 2022; pp. 148–151. [Google Scholar]
  266. Thabit, F.; Can, O.; Alhomdy, S.; Al-Gaphari, G.H.; Jagtap, S. A novel effective lightweight homomorphic cryptographic algorithm for data security in cloud computing. Int. J. Intell. Netw. 2022, 3, 16–30. [Google Scholar] [CrossRef]
  267. Kwon, W.; Yu, G.I.; Jeong, E.; Chun, B.G. Nimble: Lightweight and parallel GPU task scheduling for deep learning. In Proceedings of the 2020 Advances in Neural Information Processing Systems (NeurIPS 2020), Vancouver, BC, Canada, 6–12 December 2020; pp. 8343–8354. [Google Scholar]
Figure 1. Key challenges for data traffic prediction in 5G.
Figure 3. Schematic representation of selected and rejected sources.
Figure 4. Data traffic patterns with their prediction methods.
Figure 5. Proposed method.
Table 1. Challenges in Data Traffic Prediction.
| Category | Challenges | Proposed Solutions |
|---|---|---|
| Data Heterogeneity | Architectural Complexity and Traffic Diversity | Network Modeling, Traffic Engineering, Performance Monitoring, ML/Hybrid ML for Topology Data [45,46,47,48]; Application-Specific Traffic Models, Real-Time Adaptive Prediction Algorithms, Large Datasets [50,51,56] |
| Data Heterogeneity | Data Scarcity and Dynamic Conditions | Supervised and Hybrid ML Methods, Extensive Datasets, Simulation-Based Training [51,52,53,54,57,58,59,60]; Use of 4G Data, Simulations, Operator Data-Sharing, Real-Time ML [50,51,53,54] |
| Data Heterogeneity | Data Challenges and Pre-processing | Synthetic Datasets, PCA, Manual/Synthetic Labeling, Active Learning [31,41,61,62,63] |
| Data Privacy & Security | Cyber-Attacks | Continuous Monitoring, ML for Anomaly Detection, Strict Access Control [31,64,65,66,67] |
| Data Privacy & Security | User Privacy Risk | Data Anonymization, Pseudonymization, Encryption [68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84] |
| Data Privacy & Security | Difficulty of Analyzing Encrypted Traffic | Homomorphic Encryption, Lightweight Homomorphic Encryption, Pseudonymization + Homomorphic Encryption [85,86,87,88,89,90,91] |
| Model and Computational Complexity | Trade-off Between Accuracy and Speed | Hybrid ML Methods, Optimization of Training [92,93,94] |
| Model and Computational Complexity | High Computational Cost of Retraining in Dynamic 5G | Stepwise Retraining with New Observations Only, Auto-Adaptive Machine Learning (AAML) [41,95] |
| Model and Computational Complexity | Faster Execution | Hybrid ML with Parallel Programming [41,86] |
| Wireless Channel Interference | Inter-Cell Interference (ICI), Inter-User Interference (IUI), Inter-Tier Interference, Inter-System Interference (Satellite, etc.) | Interference Avoidance (e.g., Fractional Frequency Reuse, FFR); Interference Cancelation (e.g., Successive Interference Cancelation, SIC); Interference Mitigation (e.g., Coordinated Multi-Point, CoMP); Guard Band Protection; Reconfigurable Intelligent Surfaces (RIS) [10,96,97,98] |

Share and Cite

Lykakis, E.; Vardiambasis, I.O.; Kokkinos, E. Data Traffic Prediction for 5G and Beyond: Emerging Trends, Challenges, and Future Directions: A Scoping Review. Electronics 2025, 14, 4611. https://doi.org/10.3390/electronics14234611
