Review

Machine Learning-Enabled 5G and 6G Networks: Methods, Challenges, and Opportunities

Department of Electrical and Electronic Engineering Technology, University of Johannesburg, Auckland Park, P.O. Box 524, Johannesburg 2006, South Africa
* Author to whom correspondence should be addressed.
Appl. Sci. 2026, 16(4), 2071; https://doi.org/10.3390/app16042071
Submission received: 20 January 2026 / Revised: 13 February 2026 / Accepted: 17 February 2026 / Published: 20 February 2026
(This article belongs to the Section Computing and Artificial Intelligence)

Abstract

Fifth-generation (5G) and sixth-generation (6G) wireless communications aim to achieve significantly higher data rates, remarkably low latency, and substantial improvements in base station efficiency. With the rapid increase in broadband data utilization driven by Internet of Things (IoT) devices, smart home systems, autonomous vehicles, and virtual reality devices, 5G and 6G networks are set to overcome the limitations of earlier telecommunication technologies and serve as key enablers for future IoT applications. Anticipated as the primary infrastructure for delivering emerging services, 5G cellular networks introduce new requirements and challenges that complicate the achievement of desired objectives. This paper provides a comprehensive overview of machine learning (ML) methods and their application in 5G and 6G wireless networks, covering supervised, unsupervised, and reinforcement learning (RL) approaches. ML is set to play a central role in 6G systems. Finally, this paper explores a series of challenges within the domain of 5G and 6G networks and examines research opportunities for applying ML techniques to address them.
Keywords:
5G; 6G; IoT; machine learning

1. Introduction

The advent of cutting-edge mobile technology, such as cloud gaming, virtual reality apps, and video streaming, has led to an exponential increase in mobile network traffic [1]. Several challenges must be addressed in fifth-generation (5G) and sixth-generation (6G) wireless networks to meet the growing demand for ultra-low latency and high data rates [2]. Many cutting-edge technologies, including the Internet of Things (IoT), autonomous vehicles, and unmanned aerial systems, rely on 5G and 6G networks [3]. To meet the increasing demands of diverse networks and users, 5G and 6G wireless networks need to demonstrate a high degree of flexibility in their design, resource management, and allocation [4]. The architecture of 5G and 6G networks is designed to support ultra-reliable, high-speed communication across diverse applications, from autonomous vehicles and smart cities to industrial IoT [5]. At a high level, these networks consist of multiple components that work together to provide connectivity, process data, and manage resources efficiently [6]. The Radio Access Network (RAN) manages wireless communication between user devices and base stations, allowing users to connect to the network [7]. The Massive Multi-Input Multi-Output (Massive MIMO) technique in 5G improves the RAN by using numerous antennas to increase network capacity and coverage. In 6G, the RAN will evolve into an intelligent network enhanced by machine learning (ML). Figure 1 shows the evolution of mobile networks from 1G to 6G, emphasizing a shift from basic connectivity toward intelligent networking. Earlier generations focused on voice and broadband capacity; the figure highlights that 5G and especially 6G introduce increasing system complexity, service diversity, and performance demands. The explicit association of ML with 5G and 6G indicates that learning-based intelligence can become a core enabler rather than an auxiliary feature.
ML is the revolutionary technology set to deliver 5G and 6G with the required flexibility and intelligence. Machine learning makes it possible to learn from, optimize, and evaluate data quantities that may exceed systematic human capacity. It provides the ability not only to proactively prevent issues but also to accurately predict their occurrence in real time [9]. Machine learning is highly effective for addressing complex problems that demand significant manual adjustments in existing solutions, or for situations where traditional approaches provide no solution. These problems can be solved by using ML techniques that automatically learn from historical data, which can replace traditional software with long rule lists [10]. Mobile and wireless networks involve numerous parameters, many of which are determined using heuristic methods. This is often due to the lack of closed-form solutions or the high cost of extensive measurement campaigns. In scenarios like these, a machine learning algorithm can be valuable for forecasting the parameters and deriving estimations for functions using the data at hand [11]. Next-generation wireless communication technologies will also require optimization approaches to maximize or minimize particular objective functions. Many problems in mobile and wireless communications are polynomial or non-linear in nature and must therefore be approximated [12,13]. Machine learning techniques are appropriate for expressing the objective functions of non-linear problems requiring estimation or optimization [14,15]. This paper presents the core concepts of 5G and 6G wireless networks and their applications. It also reviews machine learning techniques for 5G and 6G, including supervised learning, unsupervised learning, and reinforcement learning (RL). Figure 2 shows the hierarchical relationship of Artificial Intelligence (AI) technologies applied to 5G and 6G wireless networks, showing that practical intelligence is primarily achieved through machine learning and deep learning, which enable data-driven optimization and adaptive network control.

Motivation and Contributions

The integration of ML with 5G and 6G wireless networks is crucial for several reasons. ML can significantly enhance the efficiency and performance of these networks by optimizing resource allocation, managing interference, and ensuring robust security. Traditional methods fall short in addressing the challenges caused by the exponential growth in connected devices and the increasing complexity of network operations [16]. Therefore, ML techniques offer promising solutions to meet the high demands of future wireless communication. Several surveys have explored the intersection of 5G networks and machine learning. Siriwardhana et al. [1] analyze mobile augmented reality supported by 5G and edge computing, focusing on system architectures and applications. Erunkulu et al. [2] compare various use cases of 5G mobile communication applications but do not delve deeply into machine learning techniques. Arjoune et al. [3] discuss the opportunities and challenges of AI in 5G, with an emphasis on future research directions, but their survey lacks a comprehensive analysis of practical applications. Xu et al. [4] provide a thorough analysis of resource distribution for 5G networks, highlighting current research and future trends. However, their work is primarily focused on resource allocation, without covering the broader spectrum of ML applications in 5G. Morocho-Cayamcela et al. [10] explore the potential, limitations, and future directions of ML for 5G and 6G mobile and wireless communications, providing a good theoretical background but lacking practical implementation insights. Zhang et al. [11] provide a broad survey on deep learning (DL) in wireless networks, but their focus is more on the potential of deep learning rather than specific applications within 5G networks. In contrast, our survey provides a holistic view by covering supervised, unsupervised, and reinforcement learning techniques, and their practical applications in both 5G and 6G networks. This paper also addresses the integration challenges and potential solutions, providing a more comprehensive resource for researchers and practitioners. Moreover, unlike previous surveys, our paper dedicates a section to the integration challenges of ML with 5G and 6G networks, such as algorithm complexity, data gathering, resource allocation, reliability, security, and privacy. This comprehensive approach ensures that our survey not only reviews current technologies but also prepares for future developments. The contributions are as follows:
  • We analyze the complexity of the systems and examine key technologies for advancing 5G and 6G networks.
  • We provide a comprehensive overview of ML techniques.
  • We offer a foundational overview of current machine learning solutions applied to tackle challenges within the context of 5G and 6G networks.
  • We perform a comprehensive analysis of the applicability of supervised learning, unsupervised learning, and RL techniques in advancing 5G and 6G wireless networks.
The subsequent sections of this paper are arranged as follows: Section 2 presents an overview of 5G and 6G networks, and Section 3 explores ML techniques. Section 4 presents ML applications for 5G and 6G wireless networks. Section 5 presents ML challenges for 5G and 6G networks. Section 6 presents the future research directions and Section 7 concludes our paper.

2. Background of 5G and 6G Wireless Networks

5G and 6G communication systems are expected to support a wide range of applications with a broad spectrum of communication needs, which presents significant challenges for mobile operators due to the rapid growth in connected devices and service demands. These demands are linked to three general service categories, namely, enhanced mobile broadband (eMBB), massive machine-type communications (mMTC), and ultra-reliable and low-latency communications (URLLC). The goal is to enhance the performance of 5G and 6G applications by leveraging 5G and 6G technologies [17,18]. The foundational aspects of 5G and 6G wireless networks are well-established. eMBB and URLLC remain critical components, but innovative applications such as holographic communications and digital twins are pushing the boundaries of 5G and 6G [19]. The integration of 5G and 6G with edge computing and AI is creating new opportunities for ultra-low-latency applications, particularly in autonomous driving and industrial automation [2]. These advancements highlight the dynamic and evolving nature of 5G and 6G technology beyond its foundational principles [3]. Figure 3 shows the trade-off among eMBB, mMTC, and URLLC services in 5G and beyond, highlighting the diverse and often conflicting requirements of modern applications. This motivates the need for ML to dynamically optimize throughput, latency, and reliability across heterogeneous services. The key technologies of 5G and 6G are as follows:

2.1. Enhanced Mobile Broadband (eMBB)

eMBB is designed to meet the high data rate transmission requirements in 5G and 6G. This is critical for many human-interaction applications, such as video streaming and virtual reality; the target for user data traffic latency is set at 4 milliseconds [21]. Two key technologies, mmWave and Massive MIMO, are employed to facilitate these higher data rates and to enable the ultra-dense networks that provide them [22].

2.2. Massive Machine-Type Communications (mMTC)

The rise of massive IoT drives mMTC, which in 5G and 6G can link a significant number of devices. Numerous applications, including resource utility management, asset tracking, traffic monitoring, environmental observation, and the smart grid, among others, require extensive connectivity. This service has the potential to facilitate the adoption of the IoT by offering effective Internet connectivity [23]. Certain devices are battery-powered and are designed to operate for an extended period, for instance, a couple of years, without requiring battery replacement. Other devices, like smart meters, may be deployed in areas with significant penetration loss [24,25].

2.3. Ultra-Reliable and Low-Latency Communications (URLLC)

URLLC is a critical component of 5G and 6G that imposes strict criteria regarding availability, latency, and reliability [26]. In essence, the communication link must be consistently accessible to swiftly transmit data within a brief timeframe and ensure high reliability. This service is essential to the continued operation of cutting-edge mission-critical applications, such as vehicle safety communications and remote surgery. High reliability (i.e., a packet error probability of 10⁻⁵), high availability (e.g., 99.9999%), and low latency (e.g., <1 ms) are required for URLLC [27,28].

2.4. Massive Multi-Input Multi-Output (Massive MIMO)

Massive MIMO possesses the ability to increase signal strength, improve performance at the cell edges, and enhance overall cell throughput [29]. Massive MIMO enhances the capacity and coverage of 5G using a large number of antennas, but it faces challenges regarding energy consumption and hardware complexity. ML, particularly supervised learning models such as support vector machines (SVMs) and neural networks (NNs), is applied to dynamically optimize antenna configurations, reducing interference and improving signal strength. The models are trained on signal-to-noise ratio (SNR), user behavior, and environmental data, and their performance is evaluated using throughput, energy efficiency, and SNR metrics [30]. For 6G, RL will be incorporated into massive MIMO systems, enabling the AI-driven optimization of antenna configurations in real time based on network conditions and user demand, with energy consumption and performance metrics such as throughput and latency as key evaluation criteria. This technology has the capability to enhance both 5G and 6G, leading to improved energy efficiency and increased capacity. It decreases air interface latency and simplifies the transceiver design [31,32].
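To make the supervised workflow above concrete, the following minimal sketch trains a regressor to predict cell throughput from SNR, user load, and a candidate antenna-configuration index, and then scores the candidates for the current cell state. All features, the throughput model, and the configuration ids are synthetic placeholders invented for illustration, not data from any cited study.

```python
# Minimal sketch: supervised throughput prediction for antenna-configuration
# selection in a massive-MIMO cell. Features and data are synthetic
# placeholders, not measurements from a real deployment.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
n = 2000
snr_db = rng.uniform(-5, 30, n)          # per-user average SNR
n_users = rng.integers(1, 64, n)         # active users in the cell
config = rng.integers(0, 4, n)           # candidate antenna-configuration id
# Hypothetical ground-truth throughput with noise (stand-in for field data).
throughput = (np.log2(1 + 10**(snr_db / 10)) * (8 + 4 * config)
              / np.sqrt(n_users) + rng.normal(0, 0.5, n))

X = np.column_stack([snr_db, n_users, config])
X_tr, X_te, y_tr, y_te = train_test_split(X, throughput, random_state=0)

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("MAE:", mean_absolute_error(y_te, model.predict(X_te)))

# At run time, score every candidate configuration for the current cell
# state and pick the one with the highest predicted throughput.
state = np.array([[15.0, 32, c] for c in range(4)])
best = int(np.argmax(model.predict(state)))
print("selected configuration:", best)
```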

2.5. Orthogonal Frequency-Division Multiplexing (OFDM)

OFDM is an essential modulation and multiplexing method widely used in wireless networks, especially in cutting-edge 5G and 6G technology. OFDM is widely used in 5G for high-speed data transmission but struggles at higher frequencies (e.g., mmWave) due to signal attenuation. To address this, ML, particularly supervised learning models such as SVMs and DNNs, has been applied to optimize resource allocation and subcarrier assignment. These models are trained on network traffic data and channel state information (CSI), enabling the dynamic adjustment of subcarriers based on real-time conditions. The performance of these models is evaluated using spectral efficiency, latency, and throughput metrics [33]. For 6G, ML techniques will further evolve to replace traditional OFDM with more efficient schemes like filter bank multi-carrier and universal filtered multi-carrier, optimizing the modulation schemes in real time to handle high-frequency bands, with performance metrics focusing on spectral efficiency and energy consumption [34,35].

2.6. Beamforming

Within 5G and 6G wireless networks, beamforming is indispensable for improving communication efficiency and reliability. Beamforming in 5G improves network efficiency by focusing radio signals in specific directions, but traditional beamforming is static and cannot dynamically adapt to network changes. To overcome this limitation, RL has been applied to optimize beamforming directions. RL agents are trained on user mobility, signal strength, and network traffic data, adjusting beam directions to improve throughput and reduce interference. The models are evaluated using SNR, throughput, and latency metrics [36]. In 6G, AI-based beamforming will enable real-time adjustments based on user density and network conditions, using DRL to continuously improve beam direction and energy efficiency, with evaluation based on real-time performance [37,38,39].
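As a small illustration of the learning loop just described, the sketch below uses an epsilon-greedy bandit, a deliberately simplified stand-in for the full RL beamforming agents cited above, to learn which of a set of beam indices yields the highest average SNR. The per-beam SNR model is synthetic.

```python
# Minimal sketch: epsilon-greedy learning of the best beam index, a
# simplified stand-in for RL-based beamforming. The reward model
# (mean SNR per beam) is a synthetic assumption.
import numpy as np

rng = np.random.default_rng(1)
n_beams = 16
true_mean_snr = rng.uniform(0, 20, n_beams)   # hypothetical per-beam SNR

q = np.zeros(n_beams)        # estimated value (average SNR) of each beam
counts = np.zeros(n_beams)
epsilon = 0.1

for step in range(5000):
    # Explore a random beam with probability epsilon, else exploit.
    beam = rng.integers(n_beams) if rng.random() < epsilon else int(np.argmax(q))
    reward = true_mean_snr[beam] + rng.normal(0, 2.0)  # noisy SNR feedback
    counts[beam] += 1
    q[beam] += (reward - q[beam]) / counts[beam]       # incremental mean

print("learned best beam:", int(np.argmax(q)),
      "true best beam:", int(np.argmax(true_mean_snr)))
```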

2.7. Machine-to-Machine (M2M) Communication

M2M communication plays a central role in 5G and 6G networks, encapsulating the seamless interaction among devices without the need for human intervention. Leveraging the network’s ability to manage massive device connectivity concurrently, M2M communication ensures real-time communication for a variety of applications [40]. In practical situations, this synergy finds application in smart factories, smart cities, and healthcare environments, where devices communicate to enhance industrial processes, optimize urban management, and remotely monitor patients [41]. It meets diverse 5G and 6G requirements by supporting numerous devices at low data rates and achieving extremely low latency [42].

2.8. Device-to-Device (D2D) Communication

D2D communication is a foundational feature in 5G and 6G wireless networks, enabling direct communication between devices located in close proximity [43]. In contrast to conventional network communication, this method enhances efficiency by reducing latency and easing network traffic [6]. Taking advantage of device proximity, D2D communication proves especially beneficial in situations such as public safety, allowing for prompt coordination during emergencies. Furthermore, D2D communication plays a crucial role in content-sharing and collaborative edge computing, enabling direct and energy-efficient data exchanges. In the realm of 5G and 6G, it enhances network efficiency by optimizing resource utilization and fostering direct communication between devices. This approach not only improves overall connectivity but also reduces reliance on centralized network infrastructure [26,44,45].

2.9. Cloud Computing

Cloud computing is essential for the progress of 5G and 6G wireless networks, offering centralized processing, storage, and management capabilities. ML has been applied to optimize resource allocation between cloud and edge resources by predicting the optimal distribution of computational tasks. These models are trained on network load data and traffic patterns, and evaluated based on latency, throughput, and resource utilization metrics [45]. In 6G, AI-driven cloud-edge architectures will further optimize task offloading decisions, using RL to make real-time resource allocation decisions, aiming to reduce latency and improve energy efficiency while maintaining high data throughput. Ultimately, this contributes to the fulfillment of the 5G and 6G vision of high-speed, low-latency, and flexible wireless communication [46].

2.10. Edge Computing

In 5G and 6G wireless networks, edge computing is essential because it moves processing power closer to consumers and devices, lowering latency and improving overall application performance [47]. Low latency is critical for 5G and 6G applications such as virtual reality and real-time IoT services; edge computing enables faster data processing by siting computing resources at the network’s edge. ML, specifically RL, is applied to dynamically allocate resources at the edge based on network conditions. The models are trained using real-time traffic data, such as task requirements and network load, and evaluated using latency, throughput, and energy efficiency metrics. In 6G, AI-driven edge computing will enable the further optimization of resource allocation by continuously adjusting resources based on real-time feedback. ML models will be trained on network load, device behavior, and application demands, with latency and data throughput as key performance metrics [48,49].

2.11. Wireless Network Virtualization

Wireless network virtualization enables heightened flexibility, resource efficiency, and the dynamic management of network services [50]. This technique provides virtualized resources for high-level applications by abstracting the actual hardware functionalities. In order to operate ultra-dense networks that provide high data rates for small cell users together with comprehensive coverage, mobility management, and monitoring, wireless network virtualization is an essential procedure [51]. In 5G and 6G, network virtualization is frequently accomplished through technologies such as network function virtualization (NFV), which enables the virtualization of network functions that were traditionally handled by dedicated hardware, transforming them into software-based components deployable on standard servers. In the context of 5G, NFV helps to reduce operational and capital expenditure [52,53]. NFV can effectively handle a high number of connections and manage spatiotemporal load changes, which will enable M2M communications and IoT in 5G and 6G networks [54]. This architecture empowers operators to effortlessly control all network elements [55]. This virtualized approach enables efficient resource allocation (RA) and the dynamic scaling of network resources. Software-defined networking (SDN) lowers the 5G network’s complexity by increasing its intelligence [56,57]. Two essential technologies required to realize 5G networks are NFV and SDN.

2.12. Full Duplex Wireless Communication

Full-duplex (FD) technology allows data to be sent and received concurrently across the same frequency range; in traditional half-duplex communication, a device can perform either data transmission or reception at any given time, but not both simultaneously [59]. As a result, FD is considered a promising technology in the context of 5G and 6G [58]. FD communication improves spectral efficiency and throughput in wireless networks, thereby enhancing overall performance and capacity; crucial requirements include high data rates and low latency [60]. ML models optimize power control and interference cancellation parameters based on real-time feedback. Performance is evaluated using interference suppression gain, SINR, and bit error rate, demonstrating improved robustness compared with traditional cancellation techniques.
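For context on what interference cancellation means operationally, the following sketch shows a classical least-mean-squares (LMS) adaptive filter canceling linear self-interference from the device’s own known transmit signal. It is a textbook baseline under an assumed linear self-interference channel, not the learned canceller of any specific paper cited here.

```python
# Minimal sketch: LMS adaptive self-interference cancellation for
# full-duplex operation, assuming a linear self-interference channel.
# This is a classical adaptive-filter baseline for illustration only.
import numpy as np

rng = np.random.default_rng(2)
n, taps, mu = 20000, 4, 0.01
tx = rng.choice([-1.0, 1.0], n)                  # known transmit symbols
h_si = np.array([0.8, -0.3, 0.1, 0.05])          # unknown SI channel (assumed)
desired = rng.choice([-1.0, 1.0], n) * 0.05      # weak signal of interest
rx = np.convolve(tx, h_si)[:n] + desired + rng.normal(0, 0.01, n)

w = np.zeros(taps)                               # adaptive filter weights
err = np.zeros(n)
for i in range(taps, n):
    x = tx[i - taps + 1:i + 1][::-1]             # recent transmit samples
    y_hat = w @ x                                # estimated self-interference
    err[i] = rx[i] - y_hat                       # residual after cancellation
    w += mu * err[i] * x                         # LMS weight update

print("estimated SI channel:", np.round(w, 3))   # should approach h_si
```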

2.13. Network Slicing

Network slicing is a foundational concept in 5G and 6G wireless networks, facilitating virtualized, isolated networks tailored to specific applications [61]. This innovative approach enables the efficient sharing of a common physical infrastructure while providing dedicated, customized slices of the network. Each slice comes with unique characteristics that meet the diverse requirements of various use cases and can be optimized for specific attributes, encompassing but not limited to latency, bandwidth, security, and reliability. Achieving programmability, flexibility, modularity, and robust virtualization capabilities primarily relies on technologies such as NFV, SDN, and network softwarization. This enables the division of the physical network into several logical virtual networks, commonly referred to as network slices [9,62]. ML has a vital role in the efficient management and orchestration of network slices, ensuring optimal performance and resource utilization [63].

2.14. Millimeter-Wave

Millimeter-wave (mmWave) technology serves as a fundamental element in the architecture of 5G and 6G wireless communications. Leveraging mmWave, 5G delivers unparalleled data rates, reduced latency, and enhanced capacity. ML is used to optimize beam selection and tracking. AI models are trained using CSI, signal strength, and user mobility data to predict optimal beam directions. Model performance is evaluated using throughput, beam misalignment probability, and handover latency, showing clear gains over static beamforming methods [64,65].

2.15. Terahertz (THz) Communications

For 6G to reach ultra-high-speed data transmission rates of 100 Gbps, terahertz frequencies, ranging from 100 GHz to 10 THz, will be essential. These frequencies enable new uses such as holographic communication and real-time immersive experiences. However, because high frequencies cause signal deterioration over longer distances, issues like propagation loss and the requirement for sophisticated materials and beamforming techniques still exist [66].

2.16. Reconfigurable Intelligent Surfaces (RIS)

RIS allows for the dynamic management of electromagnetic waves to increase signal coverage, energy efficiency, and data throughput. By deploying these surfaces on infrastructure, 6G networks will enhance performance in contexts with physical impediments or interference, such as crowded metropolitan regions and indoor environments [67].

3. Machine Learning Techniques

Machine learning techniques have been extensively studied, yet their application in 5G and 6G networks continues to reveal new insights and opportunities. Without machine learning, network operators may struggle to efficiently deliver 5G and 6G services with their diverse requirements [68]. Machine learning techniques have the most promise in their ability to predict future events, learn from system experiences on their own, and adjust to changing surroundings [69,70]. Supervised learning techniques, such as DNN, have shown promise in optimizing network performance and managing resources efficiently [4]. Recent research has focused on using reinforcement learning for dynamic spectrum management, which is crucial for maximizing the efficiency of 5G and 6G networks. For example, multi-agent RL has been applied to manage interference in dense urban environments, where traditional methods fall short [9]. Furthermore, the combination of machine learning with blockchain technology is emerging as a novel approach to enhance security and trust in 5G and 6G networks, addressing concerns related to data privacy and cyber threats [10]. These recent advancements demonstrate the potential of ML to transform 5G and 6G networks, offering new solutions to persistent challenges [11].

3.1. Supervised Learning

Supervised learning is a type of ML in which an algorithm is trained on a labeled dataset, where each input data point corresponds to an output label, as shown in Figure 4.
The goal of supervised learning is to instruct the algorithm on how to map or relate the input data to the intended output. This allows the algorithm to make predictions and find optimal solutions or classifications when presented with new, unseen data [10]. Regression and classification are the two main subtypes of supervised learning. Regression is an algorithm that predicts a continuous output [71]. The classification algorithm makes predictions about the class or category to which the input belongs [72]. Several widely used supervised methods are presented as follows:
  • Artificial neural networks (ANNs): ANNs draw inspiration from the natural world and aim to replicate the intricate workings of biological neural networks (NNs). Through extensive training on intricate datasets, ANNs gain the ability to grasp the network architecture of wireless communication systems and make predictions about user behavior. ANNs prove invaluable in addressing a diverse array of challenges, including resource allocation, spectrum utilization, and cell association [63]. The emergence of deep neural networks (DNNs) has further amplified the capabilities and effectiveness of ANNs [73].
  • Support vector machine (SVM): In supervised machine learning, SVM is used for both regression tasks and classification tasks. Its main objective is to identify the optimal hyperplane in a high-dimensional setting, effectively distinguishing data points belonging to different classes [74]. SVM plays a vital role in wireless networks by aiding in tasks such as signal classification and detecting interference. SVM competently categorizes signals, identifies and minimizes interference, improves channel allocation, and anticipates service quality by evaluating data trends. This improves network performance and efficiency [75].
  • Naive Bayes: The probabilistic machine learning algorithm Naive Bayes calculates the likelihood of an event based on past knowledge of the circumstances surrounding it. Despite its apparent simplicity, Naive Bayes is a powerful and frequently employed algorithm for classification tasks, especially in fields like natural language processing and spam filtering. A significant number of independent continuous or categorical features can be effectively managed using Naive Bayes classifiers [76,77].
  • Convolutional neural networks (CNNs): A CNN is a form of ANN designed specifically to process and evaluate complex problems such as images or videos. Neurons in these models are capable of self-optimization and unsupervised learning. The convolutional layer, the pooling layer, and the fully connected layer comprise the entire CNN architecture [78].
  • Recurrent neural network (RNN): RNNs, unlike standard feedforward neural networks, feature directed cycle connections that allow them to maintain a memory of prior inputs within their internal state. As a result, RNNs can be specifically constructed to handle data sequences [79]. RNNs are frequently used to handle ordinal or temporal issues in a variety of disciplines, including language translation, natural language processing, photo captioning, and speech recognition. Long short-term memory (LSTM) network applications include assessing, categorizing, and predicting results from time series data [80].
  • K-Nearest Neighbor (KNN): KNN is a flexible supervised ML method that may be used for both classification and regression problems. The fundamental idea behind KNN is to predict a data point’s class or value by taking into account the majority class or average value of its k nearest neighbors in the feature space. To classify a data point, the algorithm finds the k training cases whose feature values are nearest to the input data point and then selects the class that appears most frequently among these neighbors. For regression, the continuous output for the new data point is forecast by averaging the target values of its k nearest neighbors [81,82]. The algorithm boasts numerous advantages, including its insensitivity to outliers, ease of implementation, and suitability for multi-class classification. One notable disadvantage of this technique is that it becomes highly time-consuming, especially for large input datasets [83].
  • Decision tree: The decision tree iteratively separates the data into subsets based on the input feature values and is used for both classification and regression applications. At each node in the tree, a decision is made using a specific attribute, resulting in branches that represent alternate outcomes or more decision points. Homogeneous subsets are finally formed by making decisions at each node and building a tree that correctly predicts the target variable. The key advantages of this strategy are its ease of deployment and high classification accuracy [84].
  • Random Forest: The Random Forest technique amalgamates predictions from numerous decision trees to enhance overall accuracy and resilience. Every tree within the forest is constructed separately using randomly selected input variables, and the ultimate forecast is established by combining the outcomes of these trees. Whereas the average of the individual tree forecasts is computed for regression tasks, the mode of individual tree predictions is taken into account for classification tasks [85,86]. To discover the ideal split, the algorithm evaluates only a subset of features, emphasizing the necessity of maintaining a low correlation between trees to avoid the dominance of a small number of relevant traits [87].
Supervised learning algorithms such as random forests and SVM are commonly used for traffic prediction and channel state estimation. These techniques can classify network traffic into predictable patterns, which is crucial for dynamic spectrum allocation in 5G and 6G networks. By analyzing historical data, these algorithms can predict periods of high traffic, allowing the network to proactively adjust resources and prevent congestion. Table 1 provides a comparative overview of the strengths and weaknesses of the most widely used supervised ML methods [88,89,90,91].
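As a concrete illustration of the traffic-prediction use case above, the sketch below trains a random forest to flag likely high-load periods from simple traffic features. The features, the labeling rule, and all data are synthetic placeholders invented for illustration.

```python
# Minimal sketch: random-forest traffic classification for proactive
# resource adjustment. Features (hour of day, packet rate, mean packet
# size) and the "high-load" labels are synthetic placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
n = 5000
hour = rng.integers(0, 24, n)
pkt_rate = rng.gamma(2.0, 500.0, n)              # packets per second
pkt_size = rng.normal(800, 200, n)               # bytes
# Hypothetical rule: evening hours with high packet rates mean congestion.
high_load = ((hour >= 18) & (pkt_rate > 900)).astype(int)

X = np.column_stack([hour, pkt_rate, pkt_size])
clf = RandomForestClassifier(n_estimators=100, random_state=0)
print("cv accuracy:", cross_val_score(clf, X, high_load, cv=5).mean())
```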

3.2. Unsupervised Learning

Unsupervised learning, as applied in ML, refers to the process by which algorithms study and detect patterns in unlabeled data without explicit direction, as shown in Figure 5 [94].
When explicit labels are unavailable, unsupervised learning can be used to detect abnormalities, reveal hidden patterns in data, and conduct exploratory data analysis. Unsupervised learning can find patterns and correlations within datasets, which is useful for applications like pattern recognition and dimensionality reduction. Unsupervised learning strategies include dimensionality reduction, which streamlines complicated datasets by lowering the number of features, and clustering, which groups related data points based on internal patterns [95,96]. Clustering is an essential technique in unsupervised learning that groups related data points based on predefined criteria; its purpose is to find underlying structures or patterns in a collection when labels are not present, and K-Means is among the most significant clustering algorithms [97,98]. The main objective is to streamline intricate data and eliminate irrelevant or redundant features, leading to improved computational efficiency, noise reduction, and a decreased risk of overfitting in machine learning models [99,100]. A number of popular unsupervised methods are presented as follows [101,102]:
  • K-means: K-means is one of the most commonly used clustering algorithms in unsupervised ML. Its purpose is to divide a dataset into distinct and non-overlapping groups or clusters. The algorithm operates iteratively by assigning data points to clusters based on their similarity and subsequently updating the cluster centroids through the calculation of the mean of the points within each cluster. Indeed, in K-Means, the “K” denotes the predetermined number of clusters that the algorithm aims to identify within the dataset. The process involves initializing centroids and assigning data points to clusters iteratively until convergence, where the goal is to reduce the total squared distance between each data point and the centroid of its assigned cluster [103,104].
  • Autoencoders: Autoencoders serve as a neural network architecture employed in unsupervised learning and dimensionality reduction. Their fundamental objective is to acquire an efficient representation or encoding of the input data. The architecture comprises an encoder network responsible for compressing the input into a lower-dimensional representation, referred to as the encoding or bottleneck layer. The primary objective of the autoencoder is to reduce the reconstruction error, hence encouraging the model to effectively capture the most important properties of the input data [105,106].
  • Self-Organizing Map (SOM): SOM, also referred to as a Kohonen map, is an unsupervised learning technique used for high-dimensional visualization of data and clustering. SOM is a form of ANN that organizes input data into a grid of neurons or nodes, maintaining the topological relationships of the input space. In this grid, each node represents a weight vector, and throughout training, the SOM adjusts these weight vectors based on the input data, ensuring that similar input patterns are mapped to nearby nodes. This process facilitates the creation of a meaningful and organized representation of the input data on the SOM grid [107,108].
Unsupervised learning methods, such as clustering algorithms like K-Means and DBSCAN, are used in anomaly detection within wireless networks. These algorithms can detect abnormal network behaviors or congestion patterns, helping to optimize the QoS and ensuring that the network runs smoothly even under heavy load. Table 2 provides a comparative overview of the strengths and weaknesses of the most widely used unsupervised ML methods [109,110,111,112].
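The following minimal sketch illustrates the anomaly-detection idea described above: K-Means clusters synthetic per-cell KPI vectors, and points far from every centroid are flagged as anomalous. The KPI values and the 99th-percentile threshold are illustrative assumptions.

```python
# Minimal sketch: K-Means-based anomaly flagging for network KPIs.
# Points far from every cluster centroid are treated as anomalous.
# The KPI data here is synthetic.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(4)
normal = rng.normal([50, 5], [10, 1], size=(1000, 2))   # throughput, latency
anomalies = rng.normal([10, 40], [5, 5], size=(10, 2))  # congested cells
X = np.vstack([normal, anomalies])

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
dist = np.min(km.transform(X), axis=1)       # distance to nearest centroid
threshold = np.percentile(dist, 99)          # flag the farthest 1 percent
print("flagged points:", int(np.sum(dist > threshold)))
```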

3.3. Reinforcement Learning

RL is a type of ML approach in which an entity, referred to as an agent, acquires decision-making skills through its engagement with an environment, as shown in Figure 6.
The agent makes decisions according to its existing comprehension of the environment, obtains responses in the form of rewards or penalties, and refines its strategy, known as a policy, to optimize the accumulation of rewards throughout its learning process [113]. The procedure entails a delicate equilibrium between exploration, aimed at uncovering optimal actions, and exploitation, focused on leveraging already-identified successful actions [114]. RL represents a fusion of supervised and unsupervised learning. It incorporates elements of supervised learning, as supervision is essential for the model to comprehend and learn the optimal performance of a system [83]. Numerous algorithms in the domain of reinforcement learning have been created to tackle diverse facets of learning and decision-making across a variety of environments [115]. Some RL methods are as follows:
Table 2. Strengths and weaknesses of unsupervised ML techniques.

| ML Method | Strengths | Weaknesses | Real-World Limitations |
| --- | --- | --- | --- |
| K-Means [103,104] | Simple and fast; efficient for large datasets; easy to interpret | Assumes spherical clusters; sensitive to initial centroids and outliers | Not suitable for irregular-shaped clusters in network data; performs poorly with noisy or unclean data; limited in complex, non-linear networks |
| Hierarchical Clustering [116] | Dendrogram helps visualize data structure; no need to specify number of clusters | Computationally expensive; not scalable to large datasets | Not scalable for large, real-time network data; not efficient for high-dimensional network traffic |
| DBSCAN [109,116] | Can find arbitrarily shaped clusters; robust to outliers | Difficult to choose parameters; struggles with varying densities | Sensitive to parameter selection in dynamic network data; performance can drop in dense, diverse networks |
| PCA [111,117] | Reduces dimensionality; removes noise and redundancy | Linear method; may lose interpretability of features | May miss non-linear relationships in complex network data; can oversimplify critical data, losing important insights |
| Autoencoders [105,106] | Effective for complex feature extraction; suitable for anomaly detection | Requires tuning; hard to interpret; needs a large amount of data | Needs large datasets; lack of interpretability makes troubleshooting difficult; high computational cost limits use on edge devices |
| t-SNE [112,118] | Great for visualizing high-dimensional data | Not scalable; not suitable for clustering | Inefficient for large datasets in real-time applications; primarily used for visualization, not clustering tasks |
| GMM [110,112] | Provides soft clustering (probabilistic); can model complex distributions | Sensitive to initialization; assumes data is generated from Gaussians | Assumes Gaussian distribution, limiting its use with non-Gaussian network data; not suitable for all network data distributions |
| SOM [107,108] | Intuitive visualization; captures topological relationships | Limited scalability; hard to tune map size and learning rate | Inefficient for large-scale, real-time network data; requires extensive tuning, impractical for large datasets |
| Isolation Forest [109,112] | Efficient anomaly detection; handles high-dimensional data | Less interpretable; not suitable for clustering | Best for anomaly detection, not clustering; limited use for tasks requiring data grouping in networks |
  • Q-learning: Q-learning is a model-free RL algorithm applied in situations where an agent engages with an environment to acquire an optimal policy. Functioning within discrete state and action spaces, Q-learning systematically updates its action-value function, Q(s,a), by considering observed rewards and transitions. The algorithm utilizes a straightforward update rule, involving a learning rate, immediate rewards, and a discount factor for future rewards [119,120].
  • Double Q-learning: Double Q-learning serves as an extension of the conventional Q-learning algorithm, designed to counteract overestimation bias in the estimation of action values. In traditional Q-learning, a single set of Q-values is employed for both action selection and evaluation, potentially resulting in optimistic overestimation, especially during the initial learning phases. Double Q-learning resolves the issue by maintaining two sets of Q-values, employing one set for action selection and the other for action evaluation. This algorithm contributes to the enhancement of Q-value estimate accuracy and promotes learning stability, particularly in environments characterized by extensive and intricate state–action spaces [121,122].
  • State–Action–Reward–State–Action (SARSA): SARSA is a model-free reinforcement learning technique used for environments featuring discrete state and action spaces. Employing an on-policy approach, SARSA systematically updates its state–action values through iterations that take into account the current state, the action taken, the immediate reward, the next state, and the subsequent action chosen by the policy [123,124].
  • Deep Reinforcement Learning (DRL): DRL represents a subset of ML that integrates RL methods with DNN. This amalgamation empowers agents to learn and make decisions in environments characterized by intricate and high-dimensional state spaces. By utilizing deep learning (DL) techniques, these agents can automatically identify and represent hierarchical features from unprocessed sensory inputs, such as photos or sensor data, doing away with the necessity for manual feature engineering [125,126].
  • Policy Gradient: The policy gradient method, a form of RL approach, directly enhances the policy—the decision-making strategy of an agent. To achieve this improvement, the policy parameters must be changed to maximize the projected cumulative reward. Unlike value-based methods, policy gradient techniques include calculating and updating the gradient of the predicted reward in relation to the policy parameters. The policy is guided toward acts that result in larger rewards through this iterative optimization process [127,128].
  • Actor–Critic: The Actor–Critic architecture combines value-based and policy-based RL techniques. Its two main components are the actor, which chooses actions in accordance with the present policy, and the critic, which evaluates the selected actions and provides feedback to improve the policy. The actor’s objective is to acquire a policy dictating actions in various states, while the critic estimates the value function, serving as a gauge for the anticipated cumulative reward. The actor utilizes this feedback to enhance its policy, and the critic is updated to better approximate the true value function [129,130].
RL is particularly well-suited to resource management in network slicing. In 5G and 6G, network slices are created to cater to the specific needs of different services, such as IoT, autonomous driving, or critical communications. RL agents can autonomously decide how to allocate resources to different slices based on real-time feedback, thereby optimizing network performance without manual intervention. Table 3 provides a comparative overview of the strengths and weaknesses of RL techniques [131,132,133,134].
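To ground the slice-management example above, the following sketch runs tabular Q-learning, with the standard update Q(s,a) ← Q(s,a) + α[r + γ max_a′ Q(s′,a′) − Q(s,a)], on a toy slice whose state is its queue backlog and whose action is the number of bandwidth blocks allocated. The dynamics and reward are invented for illustration.

```python
# Minimal sketch: tabular Q-learning for allocating bandwidth blocks to
# a network slice. The environment (queue states, arrivals, reward) is
# a toy model, not a real slicing testbed.
import numpy as np

rng = np.random.default_rng(5)
n_states, n_actions = 10, 4          # queue occupancy levels, block counts
alpha, gamma, epsilon = 0.1, 0.9, 0.1
Q = np.zeros((n_states, n_actions))

def step(state, action):
    # Toy dynamics: allocating more blocks drains the queue but costs more.
    arrivals = rng.integers(0, 3)
    next_state = int(np.clip(state + arrivals - action, 0, n_states - 1))
    reward = -next_state - 0.5 * action   # penalize backlog and spend
    return next_state, reward

state = 0
for _ in range(50000):
    a = rng.integers(n_actions) if rng.random() < epsilon else int(np.argmax(Q[state]))
    nxt, r = step(state, a)
    # Q-learning update: Q(s,a) += alpha * (r + gamma * max Q(s',.) - Q(s,a))
    Q[state, a] += alpha * (r + gamma * np.max(Q[nxt]) - Q[state, a])
    state = nxt

print("greedy allocation per queue level:", np.argmax(Q, axis=1))
```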

4. Machine Learning Techniques for 5G and 6G Wireless Networks

ML methods are essential for improving the effectiveness, efficiency, and dependability of wireless communication in the domains of 5G and 6G wireless networks. The application of ML is evident in several critical domains within the context of 5G and 6G, influencing their performance and capabilities. The problems addressed by ML in these contexts include, but are not limited to, resource allocation, interference management, beamforming, and security. The following section presents a comprehensive discussion of significant problems.

4.1. Supervised Learning Methods

In 5G and 6G wireless networks, supervised learning methods can be utilized for optimizing networks, managing resources, and improving performance. Supervised learning can be deployed in the following domains.

4.1.1. Resource Allocation

The application of supervised learning for RA in 5G and 6G wireless communications entails utilizing historical datasets that contain labeled information regarding network conditions and the corresponding optimal resource allocation decisions. In [136], a resource allocation system is proposed that takes into account the uncertain connection between partial channel state information (CSI) and proportional fairness using DL. The authors gather data and build a dataset for supervised learning. The numerical results demonstrate that the suggested method outperforms the traditional resource allocation scheme in terms of performance. In [137], the authors present a supervised approach to addressing the problem using a framework built on graph neural networks. In terms of average sum rate and sample efficiency, the suggested framework performs better than benchmark systems. In [138], the authors introduce a novel approach, termed generalization-representation learning (GRL), to tackle the challenges associated with power allocation. The authors propose that there is a function that can be used to indicate the relationship between the parameters of the network and the best way to allocate resources, and they solve the issues by optimizing this function. This method combines training approaches that are data-driven (supervised) with model-driven approaches (unsupervised), leading to accurate predictions of the optimal function, with satisfactory results.
The authors in [139] suggest a modified RA strategy that uses a learning-based resource segmentation method to solve the RA problem. This approach involves the use of a modified random forest algorithm and positional coordinates to derive the location coordinates of end-users. The simulation analysis illustrates the efficiency of the proposed method concerning throughput and energy efficiency. The authors in [140] suggest a technique that utilizes DL for power allocation in massive MIMO. Supervised learning is employed to obtain power allocations through a particle swarm optimization-based algorithm, and power prediction is performed using the trained DNN. Deep learning improves the computational complexity and optimizes the sum-rate. In [141], the authors conduct a comparison of various supervised ML algorithms, including random forest, SVM, and ANN, to predict data rates. The findings indicate that the random forest technique achieves the minimum prediction error, with the error being more significant in the downlink transmission direction but notably lower in the uplink transmission direction. In [142], a combination of two renowned beamforming techniques, specifically zero-forcing and maximum ratio transmission, is used in a MISO (Multiple-Input Single-Output) channel with K users. The suggested technique utilizes a deep neural network, achieving 99% of the total achievable rate, with input nodes that consider the transmit power and channel vector. The output provides the factors used to combine signals for beamforming at the transmitter.

4.1.2. Channel Allocation

In [143], the problem of task-oriented channel allocation is formulated as a decentralized partially observable Markov decision process, and the authors present a supervised DNN approach for adaptive bit allocation in heterogeneous networks, considering imperfect CSI. Accurate CSI estimation in such networks plays a crucial role in system performance; additionally, minimizing feedback overhead represents a notable challenge within the framework of heterogeneous networks. In [144], the power and channel allocation problem aims to maximize the sum rate while guaranteeing the lowest user-achievable rate. Due to the non-convex nature of the constructed model and the high dimensionality of the response variables, a DRL approach, specifically distributed proximal policy optimization, is proposed for RA. The proposed method enhances performance in heterogeneous networks.

4.1.3. Interference Management

Interference is a significant issue in densely populated wireless environments. Effective interference management is crucial for maintaining the quality and reliability of communication. Leveraging ML techniques for interference management in 5G and 6G wireless communications marks a notable stride in wireless communication technology. In this context, interference management encompasses tackling challenges stemming from signal interference within the shared frequency spectrum. ML approaches, such as deep learning and multi-agent systems, have been used to predict and mitigate interference by learning from historical data and real-time network states [11]. In [145], the author suggests employing DRL for interference management by integrating joint power control and beamforming in multi-cell networks. The algorithm proposed is designed to operate effectively for any number of cells, aiming to enhance network performance, as assessed by the achievable signal-to-interference-plus-noise ratio (SINR) and sum-rate. In [146], a smart interference management approach is proposed, utilizing DRL. Power management and joint sub-band masking are used in this strategy, and the sub-band resource masking problem is formulated as a Markov decision process. DRL is employed to approximate the policy functions, mitigating the computational and storage challenges associated with conventional tabular-based approaches.

4.1.4. Beamforming

Beamforming is essential for enhancing signal quality and extending the coverage area of 5G networks. ML algorithms, including supervised and reinforcement learning, have been employed to optimize beamforming parameters, resulting in improved signal strength and reduced interference [10]. Supervised learning for beamforming in 5G and 6G wireless networks involves utilizing labeled historical data to train models. These models are designed to predict optimal beamforming strategies based on specific network conditions. In [147], a pioneering deep learning approach is introduced, utilizing an RNN for beam selection. The model is capable of predicting the serving base station (BS) and beam for individual drones by leveraging their previous locations and trajectories. According to the simulation data, the suggested method predicts beams with an accuracy of more than 90%. In [148], a deep neural network is introduced for optimizing downlink beams in mm-wave networks, aiming to improve the data rate. The suggested model has improved performance and resilience, as demonstrated by the simulation results; remarkably, traditional methods frequently rely significantly on sub-6GHz data, especially in areas with poor signal-to-noise ratios (SNRs). In [149], to minimize the search space for the initial beam, a DNN approach for mm-wave beam selection is provided. The results demonstrate that the recommended technique for beam selection can reduce beam overhead by as much as 79.3%.
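In the spirit of the trajectory-based beam prediction above, the following sketch trains a small feedforward classifier (a deliberately simplified stand-in for the cited RNN) to map a short sequence of user positions to a beam index. The geometry, the beam-labeling rule, and all data are synthetic.

```python
# Minimal sketch: position-based beam prediction. A feedforward
# classifier maps the last three 2-D positions of a user to a beam
# index. Labels come from a hypothetical bearing-to-beam rule.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(6)
n, n_beams = 4000, 8
# Three consecutive 2-D positions per sample (a short trajectory).
traj = rng.uniform(-100, 100, size=(n, 3, 2))
# Hypothetical label: beam index from the bearing of the latest position.
angle = np.arctan2(traj[:, -1, 1], traj[:, -1, 0])
beam = ((angle + np.pi) / (2 * np.pi) * n_beams).astype(int) % n_beams

X = traj.reshape(n, -1)
X_tr, X_te, y_tr, y_te = train_test_split(X, beam, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500,
                    random_state=0).fit(X_tr, y_tr)
print("beam prediction accuracy:", clf.score(X_te, y_te))
```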

4.1.5. Security

Security in 5G networks encompasses various aspects, including authentication, data privacy, and protection against cyber-attacks. ML techniques, such as anomaly detection and adversarial training, are increasingly being integrated into security frameworks to identify and mitigate potential threats in real time [17]. Supervised learning for security in 5G and 6G wireless networks entails using labeled datasets to train models. These models are developed to predict and mitigate security threats based on specific network conditions. In [150], a hybrid approach is employed to identify various attacks on machine learning mechanisms. The study addresses threats such as the unfair use of resources, Denial-of-Detection, and Denial-of-Service, utilizing an enhanced ML methodology. The suggested method incorporates a long short-term memory model to enhance accuracy in decision-making and the classification of attacks in 5G and 6G networks. In [151], the authors use explainable AI in intrusion detection with a decision tree to improve trust management. The intrusion detection system uses a simple decision tree approach to decompose sub-choices, simulating a human decision-making process. The findings show that the suggested approach’s accuracy is on par with cutting-edge algorithms. Table 4 presents the supervised ML techniques for 5G and 6G wireless networks.

4.2. Unsupervised Learning Methods

Unsupervised learning techniques in machine learning are instrumental for uncovering patterns and relationships within data without labeled outputs. In 5G and 6G wireless communication, these unsupervised learning methods find application in various aspects. Here are some potential use cases:

4.2.1. Resource Allocation

Unsupervised learning for RA in 5G and 6G wireless communication involves utilizing unlabeled data to identify patterns and structures within the network. This approach enables more adaptive and data-driven resource management. In [152], the authors present a distributed resource allocation system that blends an unsupervised learning network with a deep Q network (DQN). An optimal channel power control method employs a DNN for power control, trained with an unsupervised learning technique. The objective is to maximize the sum-rate while accounting for the associated constraints. The methodology used in this study performed better than traditional algorithms and maximized the sum-rate. In [153], a graph-NN-based strategy is used to adaptively optimize resource allocations, and an unsupervised approach is used to efficiently train the model. The algorithm presented efficiently addresses the resource allocation problem. Moreover, the suggested unsupervised training approach demonstrates superior convergence speed and performance when compared to training methods based on supervised learning and RL. In [154], a multi-agent RL algorithm is proposed for RA, incorporating unsupervised learning. The reinforcement model utilizes a concise representation of Q-values to address the problem, reducing computational complexity, expediting convergence, and optimizing energy efficiency.
In [155], the authors explore anomaly detection within an unsupervised framework and introduce algorithms based on LSTM NNs, achieving substantial performance improvements. The authors in [156] devised an unsupervised learning approach utilizing aggregation graph neural networks (GNNs) to address challenges in optimal resource allocation. The method relies on information present at each network node. Learning-based approaches are expected to replace traditional optimization methods because of their better performance and low computational complexity. In [157], the authors focus on addressing power allocation in cloud radio access networks (CRAN) using an unsupervised deep learning approach, which can potentially improve the effectiveness of DL-based power allocation. The proposed method maintains minimal computational complexity while achieving performance very near the optimal level. To optimize transmit powers in cellular wireless networks, the authors in [158] use an unsupervised-trained feedforward neural network, taking into account both uplink and downlink scenarios.

4.2.2. Channel Allocation

The allocation of channels in 5G and 6G wireless networks is challenging because of the increasing complexity associated with these advanced communication technologies. In [159], the authors present a DL-based channel estimation method for massive MIMO systems. The suggested estimator employs a DNN specifically designed based on the deep image prior network. It begins by denoising the received signal using this DNN and subsequently applies conventional least-squares estimation. The suggested approach eliminates the need for training and utilizes significantly fewer parameters compared to conventional DNNs. Furthermore, the proposed deep channel estimator demonstrates robustness against pilot contamination, and under specific conditions, it can effectively eliminate such contamination. In [160], the authors introduce a pilotless channel estimation scheme that does not rely on any pilot signals; the proposed scheme exclusively leverages data signals and incorporates the K-means clustering algorithm. In [161], the authors optimize RA for channel estimation in the context of URLLC. The objective is to create innovative unsupervised deep learning algorithms, both model-based and model-free, to train a DNN for the purposes of RA and data transmission. The deep learning algorithms outperform conventional channel estimation. In [162], the authors introduce a novel channel estimation approach that incorporates deep learning to enhance least-squares estimation. Least-squares estimation is a cost-effective method that tends to incur relatively high channel estimation errors, and the proposed method aims to address this limitation. This objective is accomplished by employing a MIMO system featuring a multi-path channel profile, which is utilized for simulations within the context of 5G networks.
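For reference, the sketch below implements the classical least-squares baseline that the DL-enhanced estimators above build on: pilot symbols are divided out of the received signal at pilot subcarriers, and the estimate is interpolated across the band. The OFDM dimensions, pilot spacing, and channel model are illustrative assumptions.

```python
# Minimal sketch: least-squares (LS) channel estimation on pilot
# subcarriers of an OFDM symbol, the classical baseline that
# DL-enhanced estimators try to improve. All values are synthetic.
import numpy as np

rng = np.random.default_rng(7)
n_sc = 64                                     # subcarriers
pilots = np.arange(0, n_sc, 8)                # pilot positions
x_pilot = np.exp(1j * np.pi / 2 * rng.integers(0, 4, pilots.size))  # QPSK

taps = (rng.normal(size=4) + 1j * rng.normal(size=4)) / np.sqrt(8)
h_true = np.fft.fft(taps, n_sc)               # frequency-smooth channel
noise = 0.05 * (rng.normal(size=pilots.size) + 1j * rng.normal(size=pilots.size))
y_pilot = h_true[pilots] * x_pilot + noise    # received pilot symbols

h_ls = y_pilot / x_pilot                      # LS estimate at pilot tones
# Interpolate to all subcarriers (real and imaginary parts separately).
h_hat = (np.interp(np.arange(n_sc), pilots, h_ls.real)
         + 1j * np.interp(np.arange(n_sc), pilots, h_ls.imag))
mse = np.mean(np.abs(h_hat - h_true) ** 2)
print("LS estimation MSE:", round(float(mse), 4))
```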

4.2.3. Interference Management

The effective management of interference is a pivotal consideration in the operation and design of wireless networks, particularly within the framework of 5G and 6G networks, and unsupervised learning emerges as a valuable tool for interference management, offering adaptive and efficient solutions. In [163], the authors explore the application of federated learning in a multi-cell wireless network to manage interference for both uplink and downlink. To address the interdependence between downlink and uplink transmissions, along with the inter-cell coupling, a multi-cell federated learning optimization is introduced. The author of [164] introduces an unsupervised learning algorithm designed for power allocation in interference management; additionally, a transfer learning approach is outlined to learn interference management in the context of transmit beamforming and improve overall performance. In [165], the author describes a DL-based technique for interference minimization: DL decreases interference by learning interference characteristics directly from the data, avoiding the need for expert systems. In [166], the author focuses on enhanced inter-cell interference coordination using DL, and the simulation findings show that DL-based interference management outperforms expectations.
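A minimal numpy sketch of the federated averaging step underlying multi-cell schemes such as [163] is given below: each cell trains locally and only model weights are aggregated, so raw measurements never leave the cell. The local update and sample counts are toy stand-ins.

import numpy as np

def fedavg(local_weights, n_samples):
    """Server-side FedAvg: sample-weighted average of per-cell model weights."""
    total = sum(n_samples)
    return sum(w * (n / total) for w, n in zip(local_weights, n_samples))

rng = np.random.default_rng(1)
global_w = np.zeros(8)  # global model (toy 8-parameter linear model)
for round_ in range(5):
    locals_, counts = [], []
    for cell in range(3):
        # Stand-in for local SGD on each cell's private interference data.
        locals_.append(global_w + 0.1 * rng.standard_normal(8))
        counts.append(int(rng.integers(50, 200)))
    global_w = fedavg(locals_, counts)  # aggregate without exchanging raw data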

4.2.4. Beamforming

Beamforming stands as a pivotal technology within the realm of 5G and 6G wireless networks. This methodology involves directing the transmission of radio signals toward a specific direction, thereby amplifying signal quality and bolstering the overall performance of communication systems. In [167], the authors investigate an ML beamforming method based on a KNN approximation, designed to learn and produce appropriate beamforming setups by analyzing the distribution of throughput demand. The achieved energy and spectral efficiency values are comparable to other approaches, while algorithmic complexity is reduced because the learned mapping makes per-user beamforming calculations unnecessary. In [168], a low-complexity orthogonal hybrid beamforming design is proposed, assisted by an ML-based beam-user selection scheme to optimize performance.
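The hedged sketch below illustrates the KNN flavor of [167] under the assumption that user position is the learned feature and that labels come from past exhaustive beam sweeps; the 16-beam codebook and synthetic geometry are purely illustrative.

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(2)

# Historical data: user positions and the beam index that maximized throughput
# during past beam sweeps (synthetic stand-ins for measured data).
positions = rng.uniform(-50, 50, size=(1000, 2))
angles = np.arctan2(positions[:, 1], positions[:, 0])
best_beam = np.digitize(angles, np.linspace(-np.pi, np.pi, 16))  # 16-beam codebook

knn = KNeighborsClassifier(n_neighbors=5).fit(positions, best_beam)

# At run time the base station picks a beam from position alone, skipping
# the per-user beamforming computation.
print(knn.predict([[12.0, -3.5]]))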

4.2.5. Security

It is critical to ensure the security of 5G and 6G networks, given the heightened complexity, scale, and potential vulnerabilities associated with this next-generation wireless technology. Traditional security measures remain pertinent, and the application of unsupervised learning techniques can enhance the ability to address evolving threats and vulnerabilities in a more dynamic and adaptive manner. In [169], the authors used CNNs and stacked autoencoders in an unsupervised manner to identify intrusions. In [170], the author suggests an unsupervised Gaussian mixture model technique for enhancing the performance of the physical layer security model. In [171], the author proposes the utilization of software-defined security (SDS) as a tactic to create an automated, adaptable, and scalable network defense system. SDS intends to make use of current advances in ML by creating a CNN, employing an NN architecture search to detect abnormal network traffic. The model attains an accuracy of 100% in identifying benign traffic and a detection rate of 96.4% for abnormal traffic, underscoring the promising potential of this approach. In [172], the author addresses security challenges and proposes a DRL technique that uses unsupervised learning to detect multiple attack scenarios. In [173], the author introduces a blockchain-based secure federated learning approach for creating smart contracts and preventing unreliable participants and malicious activities. The suggested framework is adept at deterring poisoning and membership inference attacks, consequently enhancing the security of federated learning in 5G and 6G wireless networks. Table 5 presents some unsupervised ML methods for 5G and 6G wireless networks.
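As a concrete example of unsupervised intrusion detection in the spirit of the autoencoder approach in [169], the sketch below trains a small autoencoder on benign traffic features only and flags records whose reconstruction error exceeds a high benign-percentile threshold; the feature dimension, architecture, and threshold rule are assumptions.

import torch
import torch.nn as nn

D = 20  # traffic features per flow record (assumed)
ae = nn.Sequential(nn.Linear(D, 8), nn.ReLU(), nn.Linear(8, D))
opt = torch.optim.Adam(ae.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

benign = torch.randn(5000, D)  # stand-in for normalized benign traffic
for epoch in range(50):
    loss = loss_fn(ae(benign), benign)
    opt.zero_grad(); loss.backward(); opt.step()

# Threshold at a high percentile of benign reconstruction error; records above
# it are flagged as potential intrusions.
with torch.no_grad():
    err = ((ae(benign) - benign) ** 2).mean(1)
    thresh = torch.quantile(err, 0.99)
    unseen = torch.randn(10, D) * 3  # stand-in for traffic under test
    flags = ((ae(unseen) - unseen) ** 2).mean(1) > thresh
print(flags)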

4.3. Reinforcement Learning Methods

Reinforcement learning has been investigated as a potential strategy for enhancing the performance and administration of wireless networks. The challenges posed by 5G and 6G wireless networks include dynamic and diverse environments, fluctuating traffic patterns, and a range of Quality of Service (QoS) requirements. RL presents a viable solution to tackle these challenges, facilitating autonomous real-time decision-making and optimization. There are several applications of reinforcement learning within the domain of 5G and 6G wireless networks.

4.3.1. Resource Allocation

Resource allocation problems in 5G and 6G wireless communications can be successfully addressed through the application of RL. In [176], the authors introduce a scheme known as deep transfer RL, designed for the concurrent allocation of radio and cache resources in a 5G network. The suggested schemes involve learner agents leveraging the knowledge of expert agents to enhance their performance on current tasks, improving overall performance in the context of URLLC and eMBB. In [177], a dynamic resource allocation framework is put forward that combines blockchain technology with multi-agent DRL, with the aim of allocating resources efficiently to mobile users and minimizing costs in 5G and 6G networks with multiple unmanned aerial vehicles (UAVs). In [125], a network slicing technique is developed based on DRL to determine the RA policy that maximizes long-term throughput and adherence to QoS in 5G and 6G networks. The introduced method demonstrates its efficacy in optimizing overall throughput over an extended period and effectively handling the coexistence of various use cases in 5G and 6G networks.
In [178], the author provides a DRL-based control strategy for meeting the demanding QoS criteria of URLLC and eMBB via resource allocation in the context of 5G. The suggested strategy aims to decrease user equipment energy usage while optimizing system utility, a performance metric defined by spectral efficiency that highlights how efficiently the available spectrum resources are utilized. The author in [179] presents a DRL model with an actor–critic architecture for optimizing resource allocation and addressing joint network control challenges in IoT, where the actor–critic algorithm adjusts the data rate allotted to each IoT device. The proposed model demonstrates superior performance compared to conventional approaches across various network parameters and metrics. In [180], the author presents a DRL-based approach for maximizing the downlink signal-to-noise ratio in intelligent reflecting surface (IRS) communications; according to the simulation findings, the system not only minimizes time consumption but also reaches almost the maximum received signal-to-noise ratio. For resource allocation, a single-agent Q-learning approach is used in [181], which makes use of past performance to provide acceptable outcomes, while a multi-agent Q-learning method is used to make task offloading choices. The suggested algorithm's efficacy is confirmed by contrasting its results with those of traditional algorithms in a range of network settings. In [182], an optimization problem is formulated to maximize the overall data rate across all D2D users through power allocation and simultaneous uplink–downlink subcarrier assignment of D2D pairs. Using a model-free double-deep Q-network approach and a distributed DRL technique, each D2D pair solves the joint optimization problem as an agent. The proposed double deep Q-network exhibits minimal computational cost and rapidly converges to near-optimal performance, making it suitable for implementation in D2D underlay wireless cellular networks.
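To illustrate the mechanism that distinguishes the double deep Q-network used in [182], the sketch below computes the double-DQN regression target: the online network selects the next action while a separate target network evaluates it, mitigating Q-value overestimation. Dimensions and the discount factor are illustrative.

import torch
import torch.nn as nn

N_STATE, N_ACTION, GAMMA = 10, 5, 0.95  # assumed dimensions and discount

def make_q():  # small Q-network: state -> per-action values
    return nn.Sequential(nn.Linear(N_STATE, 64), nn.ReLU(), nn.Linear(64, N_ACTION))

online, target = make_q(), make_q()
target.load_state_dict(online.state_dict())  # periodically synced copy

def double_dqn_targets(r, s_next, done):
    """Double DQN: argmax from the online net, value from the target net."""
    with torch.no_grad():
        a_star = online(s_next).argmax(1, keepdim=True)       # action selection
        q_next = target(s_next).gather(1, a_star).squeeze(1)  # action evaluation
        return r + GAMMA * q_next * (1 - done)

# Example replay-buffer batch (random stand-ins).
r, s_next, done = torch.rand(32), torch.randn(32, N_STATE), torch.zeros(32)
y = double_dqn_targets(r, s_next, done)  # regression targets for the online net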

4.3.2. Channel Allocation

Reinforcement learning can be successfully utilized to address the channel allocation problem in 5G and 6G networks. A hybrid optimization problem including power control, channel allocation, and beamforming is defined in [183], considering minimal secrecy rate and SINR constraints; a multi-agent RL method is used to achieve the highest long-term reward, and the suggested approach improves overall performance. The authors of [184] provide a model-based approach for channel allocation that is based on RL; simulations show that the method holds considerable promise for 6G systems, performing strongly on real long-term evolution and 5G channels. In [185], the authors put forth a multi-agent RL solution for this challenge; when compared to baseline techniques, the suggested method has the potential to improve the efficacy and efficiency of multi-agent communication. In [186], an accurate channel estimation technique for 5G networks is proposed using a Monte Carlo approach. The method combines the best aspects of traditional and DL approaches by integrating conventional pilot-based channel estimates as a prior into the structure of the DL model. In [187], a method is proposed for pilot allocation and channel estimation in reconfigurable intelligent surface systems using learning techniques: a masked autoencoder (MAE) is trained to accomplish precise channel estimation with a constrained number of pilots, and a DRL agent then learns pilot allocation policies based on the insights gained from the MAE to optimize the channel estimation.
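For readers unfamiliar with the underlying machinery, the minimal tabular Q-learning loop below shows a single agent learning to pick among a few channels from noisy rate feedback. The environment is a synthetic stand-in rather than any scheme from [183,184,185].

import numpy as np

rng = np.random.default_rng(3)
C, EPS, ALPHA, GAMMA = 4, 0.1, 0.2, 0.9   # channels, exploration, step size, discount
quality = np.array([0.2, 0.9, 0.5, 0.4])  # hidden mean rate per channel (unknown to agent)

Q = np.zeros(C)  # single-state Q-table over channel choices
for t in range(5000):
    a = rng.integers(C) if rng.random() < EPS else int(Q.argmax())  # epsilon-greedy
    reward = quality[a] + 0.05 * rng.standard_normal()              # noisy achieved rate
    Q[a] += ALPHA * (reward + GAMMA * Q.max() - Q[a])               # Q-learning update
print(int(Q.argmax()))  # converges to the best channel (index 1 here)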

4.3.3. Interference Management

The utilization of RL stands out as a powerful strategy for efficiently addressing interference management challenges in 5G wireless networks. In [188], the joint optimization of interference management, power control, and beamforming is described as a non-convex optimization problem with an SINR-maximization goal, and DRL is employed to solve it and maximize the sum-rate. In [189], the authors introduce an interference-aware path planning scheme for a network of cellular-connected UAVs; a DRL approach is proposed to decrease the computational complexity of the method and minimize interference. Enhancing downlink throughput in the 5G network relies primarily on the effective suppression of co-channel interference. To this end, interference whitening is a valuable, low-complexity linear technique for mitigating colored interference within a MIMO-OFDM system, and the author in [190] suggests a reinforcement learning-based interference whitening approach that dynamically regulates the interference whitening mode through a learning algorithm, adapting to the specific requirements of the environment.

4.3.4. Beamforming

In 5G and 6G wireless networks, where beamforming is an essential method to enhance communication between base stations and user devices, reinforcement learning may be applied to optimize beamforming. In [191], a multi-agent DRL framework was proposed to jointly optimize the transmit beamforming at the BS and the reflection beam at the reconfigurable intelligent surface, with the optimization driven solely by received power measurements. The learning framework successfully learns optimized base station beamforming and reconfigurable intelligent surface configurations. The author introduces RL-based methods for beamforming and end-to-end channel prediction in a multi-user downlink system [192]: an actor–critic-based beamforming layer is presented to enable autonomous learning of the beamforming policy with the aim of maximizing the transmission sum rate. In [193], the author proposes a DL-integrated RL approach designed to enhance intelligent beam-steering in 6G networks. The main strategy is to use RL between the user and the BS to optimize the beam direction, which significantly raises the SNR.

4.3.5. Security

RL has the potential to make a substantial contribution to enhancing the security of 5G and 6G wireless networks by facilitating intelligent and adaptive responses to emerging threats and vulnerabilities. The most recent advancements in 5G and 6G networks have expanded the capabilities of IoT applications by delivering unprecedented levels of connectivity, speed, and latency, but they also bring notable security risks with the potential to cause widespread harm. DRL enables the construction of intelligent security solutions capable of adapting to the dynamic and complicated nature of IoT applications connected to 5G and 6G networks. In [194], the author proposes an approach aimed at enhancing the security of IoT applications in 5G and 6G networks through an intrusion detection system that utilizes the DRL method. In addressing attacks such as dynamic jamming and swept jamming, the authors of [195] employ a multi-agent RL method for an efficient defense strategy; based on simulations, the algorithm successfully avoids these sophisticated jamming attempts, with its agents cooperating to share the spectrum. In [196], the authors investigate a wireless secure communication system aided by the IRS, where the reflecting components of the IRS are strategically controlled to ensure secure communication for numerous authorized users in the face of many eavesdroppers. Given the highly dynamic and complex nature of the system, the resulting non-convex optimization problem is challenging, so the authors introduce a DRL-based approach for secure beamforming that seeks the best beamforming policy to thwart eavesdroppers in dynamic situations. The suggested secure beamforming approach markedly enhances the secrecy rate of the system and Quality of Service satisfaction in IRS-aided secure communication systems. The RL methods for 5G and 6G wireless networks are shown in Table 6.

4.4. Synthesis and Comparative Analysis

To synthesize the above techniques, Table 7 maps representative 5G and 6G network tasks to the ML paradigm most aligned with their decision structure and data constraints. This comparison clarifies why certain methods are preferred in practice and provides a compact guide for selecting models under latency, observability, and control requirements.

5. Machine Learning Challenges for 5G and 6G Wireless Networks

The utilization of ML applications has the potential to introduce novel avenues for research and solutions within wireless networks. ML plays an important role in facilitating the implementation of 6G networks. Although considerable research has been conducted on ML in wireless networks, numerous challenges and unresolved issues persist. The incorporation of ML into 5G and 6G wireless communication systems encounters several obstacles, which can be outlined as follows:

5.1. Complexity

The eventual implementation of deep learning algorithms in wireless devices is imperative. Nonetheless, numerous wireless devices face constraints such as limited memory and computing capabilities, rendering them unsuitable for complex algorithms. Gathering large samples and training DL models is a time-intensive task, creating a notable challenge for implementing them on wireless devices with limited power and storage capabilities. In certain scenarios, an increase in the number of samples and extended training time correlates with improved accuracy in recognizing signal and network features. However, acquiring more samples and prolonging the training process results in slower feedback. Consequently, there is a need to design DL models that can achieve optimal accuracy with fewer samples and within a shorter timeframe. Recent studies highlight the need for optimization techniques that reduce computational overhead while maintaining model accuracy and efficiency [173]. Approaches such as model pruning, quantization, and efficient neural architectures are crucial for deploying ML on resource-constrained devices [4].
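As a hedged illustration of two of these techniques, the sketch below applies PyTorch's built-in L1 unstructured pruning and dynamic int8 quantization to a placeholder model; a real deployment would prune iteratively with fine-tuning and validate accuracy after each step.

import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

# L1 unstructured pruning: zero out the 30% smallest-magnitude weights per layer.
for module in model:
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.3)
        prune.remove(module, "weight")  # make the pruning permanent

# Dynamic quantization: store Linear weights as int8, cutting memory roughly 4x.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)
print(quantized(torch.randn(1, 128)).shape)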

5.2. Resource Allocation

Resource allocation remains a key challenge for ML in 5G and 6G networks due to dynamic traffic conditions and heterogeneous service requirements. ML-based approaches must operate under strict real-time constraints and limited computational resources, and ensuring scalability and reliable generalization across diverse network scenarios remains difficult.

5.3. Reliability

ML-based techniques may exhibit lower reliability than traditional methods for certain wireless communication tasks. For example, in massive MIMO, DL can rival least squares and minimum mean square error methods in wireless channel estimation, but DL-based approaches suffer from slow feedback: system response time may be prolonged during deep learning inference. This delay arises because most wireless devices lack cloud-scale computing, and even where access to cloud servers exists, communication with them introduces additional delays. A network comprising a large and varied user population operates dynamically, as users possess diverse QoS and Quality of Experience requirements; for example, users conducting financial transactions through payment software prioritize high security. Developing a cross-layer, action-based machine learning protocol specific to various applications is crucial to fulfilling distinct requirements and maintaining a balance in network resource utilization [200].

5.4. Real-Time Processing and Latency

Real-time processing is critical for many 5G applications, such as autonomous driving and industrial automation. ML models must be capable of making rapid decisions with minimal latency. Edge computing, which brings computational power closer to the data source, plays a pivotal role in achieving low-latency ML inference [1]. Techniques like edge-cloud collaboration and distributed learning are essential for balancing the trade-offs between latency and computational load [2].

5.5. Scalability and Deployment

Scalability is a key concern when deploying ML models in large-scale 5G and 6G networks. The dynamic nature of these networks requires models that can adapt to changing conditions and scale efficiently with the number of connected devices. Hierarchical ML models and transfer learning are promising strategies to enhance scalability and adaptability [10]. Moreover, containerization and microservices architectures can facilitate the deployment and management of ML models in diverse network environments [63].

5.6. Data Availability and Quality

The performance of ML models heavily relies on the availability and quality of data. In the context of 5G and 6G, data can be sparse, heterogeneous, and subject to privacy constraints, so ensuring high-quality data collection and preprocessing is essential for training robust ML models. However, gathering extensive datasets for training AI models is often challenging due to the sensitive nature of user information contained within these datasets; mobile service providers are frequently unable to release such data without risking consumer privacy violations [173]. Synthetic data generation and data augmentation techniques can help mitigate the challenges of limited data availability [11]. Even with transfer learning, which leverages models trained on previous datasets, adapting these models to specific networks and contexts demands re-training. These restrictions severely hinder the development of wireless AI.
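A simple augmentation sketch for scarce complex-baseband datasets is shown below: it generates noisy, randomly phase-rotated copies of each sample, transformations under which many wireless labels (e.g., modulation class) are approximately invariant. That invariance must be verified per task, and the SNR and copy count here are illustrative.

import numpy as np

rng = np.random.default_rng(4)

def augment_iq(samples, n_copies=4, snr_db=20):
    """Expand a small complex-baseband dataset with noisy, phase-rotated copies."""
    out = [samples]
    sigma = np.sqrt(np.mean(np.abs(samples) ** 2) / 10 ** (snr_db / 10) / 2)
    for _ in range(n_copies):
        phase = np.exp(1j * rng.uniform(0, 2 * np.pi, (samples.shape[0], 1)))
        noise = sigma * (rng.standard_normal(samples.shape)
                         + 1j * rng.standard_normal(samples.shape))
        out.append(samples * phase + noise)
    return np.concatenate(out, axis=0)

small = rng.standard_normal((100, 64)) + 1j * rng.standard_normal((100, 64))
print(augment_iq(small).shape)  # (500, 64): five times more training examples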

5.7. Intelligent Reflecting Surface

The timely and accurate acquisition of CSI is vital in IRS-enhanced wireless networks, particularly MIMO-IRS and MISO-IRS networks; however, obtaining CSI is a difficult process involving significant training overhead. In IRS-assisted non-orthogonal multiple access networks, users within each cluster need to share CSI, presenting additional challenges because of the passive nature of the IRS. Using ML and DL methods to leverage CSI thus becomes a challenging issue, especially in cases that go beyond linear correlations [201].

5.8. Security

Ensuring the security of DL models poses an additional challenge, as NNs are vulnerable to adversarial attacks. Attackers can tamper with the training process by injecting fictitious training data, decreasing model accuracy and producing erroneous models that might degrade network performance. Research on deep learning security, and ML security in general, is still in its infancy. Robust defense mechanisms, such as adversarial training and anomaly detection, are crucial for enhancing the resilience of ML models [17].
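The sketch below shows one adversarial-training step using the fast gradient sign method (FGSM), a standard hardening technique; the classifier, data, perturbation budget, and clean/adversarial mixing ratio are illustrative assumptions.

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(32, 16), nn.ReLU(), nn.Linear(16, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
EPS = 0.05  # perturbation budget (assumed)

x, y = torch.randn(64, 32), torch.randint(0, 2, (64,))

# Craft FGSM examples: perturb inputs along the sign of the loss gradient.
x.requires_grad_(True)
loss_fn(model(x), y).backward()
x_adv = (x + EPS * x.grad.sign()).detach()

# Train on a mix of clean and adversarial inputs to improve robustness.
opt.zero_grad()
loss = 0.5 * loss_fn(model(x.detach()), y) + 0.5 * loss_fn(model(x_adv), y)
loss.backward(); opt.step()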

5.9. Privacy

Safeguarding user privacy stands as the foremost priority for mobile and service providers. In wireless artificial intelligence, a major challenge lies in enabling training on user-owned datasets without divulging input data or jeopardizing personal user information. A robust security approach is essential to enhance the seamless integration of deep learning into wireless communications [10].

5.10. Non-Stationarity

One of the main problems with ML-enabled 5G and 6G management is that the environment is inherently non-stationary: channel statistics, user mobility, traffic demand, interference patterns, and service needs all change over time and location. As a result, the data distribution used for training may differ from the one encountered at deployment. This mismatch is not merely theoretical; in real networks, QoS goals vary from application to application and can change quickly (for example, streaming services prioritize throughput over latency, while financial services have strict security requirements), so the "optimal" policy changes over time and fixed models become less reliable. Non-stationarity specifically presents as: (i) degraded inference accuracy when models are applied to unobserved cells, user densities, or propagation environments; (ii) unstable control actions in response to rapidly changing CSI or mobility, leading to oscillatory scheduling and power decisions; and (iii) performance degradation during distribution shifts [21].
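A lightweight way to surface such shifts is to monitor a scalar KPI with a two-sample Kolmogorov–Smirnov test, as in the sketch below; the window sizes and significance level are assumptions, and a triggered alarm would typically initiate retraining or a fallback controller.

import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(5)

def drift_alarm(reference, recent, alpha=0.01):
    """Flag a distribution shift between a reference and a recent KPI window."""
    _, p_value = ks_2samp(reference, recent)
    return p_value < alpha

reference = rng.normal(10.0, 1.0, 500)  # KPI samples from the training regime
stable = rng.normal(10.0, 1.0, 200)     # same regime: no alarm expected
shifted = rng.normal(12.5, 1.5, 200)    # traffic/channel regime change: alarm
print(drift_alarm(reference, stable), drift_alarm(reference, shifted))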

5.11. Data Scarcity

Despite massive amounts of network telemetry, many learning tasks still suffer from data scarcity in practice. High-quality labels for supervision (e.g., "ground-truth optimal allocation", fine-grained congestion states, or security incident labels) are costly to generate, and data are typically heterogeneous across vendors, deployments, and slices. Moreover, centralizing raw data is typically hampered by privacy constraints and communication difficulties, which hinders broad data sharing and delays model progress. These restrictions often cause: (a) overfitting to constrained situations, (b) weak generalization to under-represented corner cases, and (c) biased learning under class imbalance and missing data [202].

5.12. Overhead

Beyond accuracy, ML introduces overhead that directly conflicts with strict latency and efficiency requirements. Many wireless control tasks involve regular CSI acquisition or coordination, producing considerable pilot/signaling overhead, particularly for RIS systems, where estimating cascaded channels and configuring large numbers of reflecting elements can be expensive. The computational cost of training and the latency/energy cost of inference can become bottlenecks at the edge, rendering the naive deployment of large models infeasible for real-time control loops [91].

5.13. Critical Analysis

In this section, we provide a critical analysis of these challenges, offering insights into their real-world implications and trade-offs. Table 8 summarizes each challenge and reflects on its impact on the practical deployment of ML models in next-generation networks.

6. Future Research Directions

Future research on ML-enabled 5G and 6G networks should move beyond identifying challenges and focus on concrete research questions, technical solutions, and measurable performance objectives.

6.1. Efficient and Lightweight ML Model Design

How to design lightweight ML models suitable for edge deployment remains an open question. Techniques such as model compression, pruning, and lightweight reinforcement learning should be explored and evaluated in terms of latency, energy consumption, and convergence speed.

6.2. Data-Efficient and Privacy-Aware Learning

Future work should investigate how ML models can be trained effectively under limited, distributed, and privacy-sensitive data conditions. Techniques such as federated learning, self-supervised learning, and synthetic data generation offer potential solutions.

6.3. Real-Time and Low-Latency Learning Frameworks

Meeting ultra-low latency requirements remains an open challenge for ML-driven 5G and 6G systems. Research is needed on edge–cloud collaborative learning, online reinforcement learning, and distributed inference architectures that support real-time decision-making.

6.4. Scalability and Network Adaptability

Developing ML models that can dynamically adapt to changing network conditions and scale efficiently with the number of connected devices is crucial for future 5G and 6G deployments.

6.5. Security and Robustness of ML-Driven Networks

ML-enabled wireless networks are vulnerable to adversarial threats such as poisoning and evasion attacks. Future work should integrate adversarial training, secure model deployment, and anomaly detection into ML pipelines. Performance should be evaluated using attack success rate, robustness, and detection accuracy.

7. Conclusions

In this paper, we presented a structured and analytical overview of ML techniques applied to 5G and emerging 6G wireless networks. The key components of next-generation wireless systems were outlined, and representative applications of ML across different network functions and learning paradigms were discussed. Supervised, unsupervised, and reinforcement learning methods were reviewed in the context of addressing major challenges in 5G and 6G communications, with an emphasis on their role in improving system efficiency and supporting complex network operations. Furthermore, we highlighted several open challenges related to the deployment of ML in next-generation wireless networks, including algorithm complexity, data quality, real-time processing constraints, scalability, and security. By synthesizing existing studies and identifying these challenges, this work underscores the potential of ML-enabled 5G and 6G networks as an important direction for future research and development. Continued investigation into efficient learning frameworks, data management strategies, low-latency operation, scalability, and robust security mechanisms will be essential to fully realize the benefits of ML in next-generation wireless networks.

Author Contributions

Conceptualization, M.O. and T.S.; methodology, M.O.; resources, T.S.; investigation, M.O.; writing—original draft preparation, M.O.; writing—review and editing, T.S.; supervision, T.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

No new data were created or analyzed in this study.

Acknowledgments

The authors would like to thank the University of Johannesburg for financial funding.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
5G  Fifth-Generation
6G  Sixth-Generation
IoT  Internet of Things
ML  Machine Learning
RL  Reinforcement Learning
DL  Deep Learning
eMBB  Enhanced Mobile Broadband
mMTC  Massive Machine-Type Communications
URLLC  Ultra-Reliable and Low-Latency Communication
MIMO  Multi-Input Multi-Output
RAN  Radio Access Network
SINR  Signal-to-Interference-plus-Noise Ratio
OFDM  Orthogonal Frequency-Division Multiplexing
M2M  Machine-to-Machine
D2D  Device-to-Device
NFV  Network Function Virtualization
SDN  Software-Defined Networking
FD  Full-Duplex
mmWave  Millimeter-Wave
ANN  Artificial Neural Network
NN  Neural Network
DNN  Deep Neural Network
SVM  Support Vector Machine
DBSCAN  Density-Based Spatial Clustering of Applications with Noise
PCA  Principal Component Analysis
t-SNE  t-Distributed Stochastic Neighbor Embedding
GMM  Gaussian Mixture Model
CNN  Convolutional Neural Network
PPO  Proximal Policy Optimization
RNN  Recurrent Neural Network
LSTM  Long Short-Term Memory
KNN  K-Nearest Neighbor
SOM  Self-Organizing Map
SARSA  State–Action–Reward–State–Action
DRL  Deep Reinforcement Learning
CSI  Channel State Information
MISO  Multiple-Input Single-Output
BS  Base Station
CRAN  Cloud Radio Access Network
L2O  Learning-to-Optimize
SDS  Software-Defined Security
QoS  Quality of Service
THz  Terahertz
RIS  Reconfigurable Intelligent Surface
UAV  Unmanned Aerial Vehicle

References

  1. Siriwardhana, Y.; Porambage, P.; Liyanage, M.; Ylianttila, M. A survey on mobile augmented reality with 5G mobile edge computing: Architectures, applications, and technical aspects. IEEE Commun. Surv. Tutor. 2021, 23, 1160–1192. [Google Scholar] [CrossRef]
  2. Erunkulu, O.O.; Zungeru, A.M.; Lebekwe, C.K.; Mosalaosi, M.; Chuma, J.M. 5G mobile communication applications: A survey and comparison of use cases. IEEE Access 2021, 9, 97251–97295. [Google Scholar] [CrossRef]
  3. Arjoune, Y.; Faruque, S. Artificial intelligence for 5G wireless systems: Opportunities, challenges, and future research direction. In Proceedings of the 2020 10th Annual Computing and Communication Workshop and Conference (CCWC), Las Vegas, NV, USA, 6–8 January 2020; pp. 1023–1028. [Google Scholar] [CrossRef]
  4. Xu, Y.; Gui, G.; Gacanin, H.; Adachi, F. A survey on resource allocation for 5G heterogeneous networks: Current research, future trends, and challenges. IEEE Commun. Surv. Tutor. 2021, 23, 668–695. [Google Scholar] [CrossRef]
  5. Mushtaq, M.U.; Hong, J.; Owais, M.; Danso, S.A. Enhancing security and energy efficiency in wireless sensor network routing with IoT challenges: A thorough review. LC Int. J. STEM 2023, 4, 1–24. [Google Scholar] [CrossRef]
  6. Muhammad, O.; Jiang, H.; Muhammad, B.; Umer, M.M.; Ahtsam, N.M.; Dasno, S. A comprehensive review of D2D communication in 5G and B5G networks. LC Int. J. STEM 2023, 4, 25–46. [Google Scholar] [CrossRef]
  7. Polese, M.; Bonati, L.; D’oro, S.; Basagni, S.; Melodia, T. Understanding O-RAN: Architecture, interfaces, algorithms, security, and research challenges. IEEE Commun. Surv. Tutor. 2023, 25, 1376–1411. [Google Scholar] [CrossRef]
  8. Sanjalawe, Y.; Fraihat, S.; Abualhaj, M.; Makhadmeh, S.; Alzubi, E. A review of 6G and AI convergence: Enhancing communication networks with artificial intelligence. IEEE Open J. Commun. Soc. 2025, 6, 2308–2355. [Google Scholar] [CrossRef]
  9. Singh, R.; Mehbodniya, A.; Webber, J.L.; Dadheech, P.; Pavithra, G.; Alzaidi, M.S.; Akwafo, R. Analysis of network slicing for management of 5G networks using machine learning techniques. Wireless Commun. Mobile Comput. 2022, 2022, 9169568. [Google Scholar] [CrossRef]
  10. Morocho-Cayamcela, M.E.; Lee, H.; Lim, W. Machine learning for 5G/B5G mobile and wireless communications: Potential, limitations, and future directions. IEEE Access 2019, 7, 137184–137206. [Google Scholar] [CrossRef]
  11. Zhang, C.; Patras, P.; Haddadi, H. Deep learning in mobile and wireless networking: A survey. IEEE Commun. Surv. Tutor. 2019, 21, 2224–2287. [Google Scholar] [CrossRef]
  12. Umer, M.M.; Venter, H.; Muhammad, O.; Shafique, T.; Awwad, F.A.; Ismail, E.A.A. Cognitive strategies for UAV trajectory optimization: Ensuring safety and energy efficiency in real-world scenarios. Ain Shams Eng. J. 2025, 16, 103301. [Google Scholar] [CrossRef]
  13. Umer, M.M.; Jiang, H.; Muhammad, O.; Awwad, F.A.; Ismail, E.A.A. Energy-efficient and resilient secure routing in energy harvesting wireless sensor networks with transceiver noises: EcoSecNet design and analysis. J. Sens. 2024, 2024, 3570302. [Google Scholar] [CrossRef]
  14. Villarrubia, G.; Paz, J.F.D.; Chamoso, P.; la Prieta, F.D. Artificial neural networks used in optimization problems. Neurocomputing 2018, 272, 10–16. [Google Scholar] [CrossRef]
  15. Mushtaq, M.M.; Venter, H.; Singh, A.; Owais, M. Advances in energy harvesting for sustainable wireless sensor networks: Challenges and opportunities. Hardware 2025, 3, 1. [Google Scholar] [CrossRef]
  16. Niknam, S.; Dhillon, H.S.; Reed, J.H. Federated learning for wireless communications: Motivation, opportunities, and challenges. IEEE Commun. Mag. 2020, 58, 46–51. [Google Scholar] [CrossRef]
  17. You, X.; Zhang, C.; Tan, X.; Jin, S.; Wu, H. AI for 5G: Research directions and paradigms. Sci. China Inf. Sci. 2019, 62, 21301. [Google Scholar] [CrossRef]
  18. Pokhrel, S.R.; Ding, J.; Park, J.; Park, O.-S.; Choi, J. Towards enabling critical mMTC: A review of URLLC within mMTC. IEEE Access 2020, 8, 131796–131813. [Google Scholar] [CrossRef]
  19. Muhammad, O.; Jiang, H.; Bilal, M.; Umer, M.M. Optimizing power allocation for URLLC-D2D in 5G networks with Rician fading channel. PeerJ Comput. Sci. 2025, 11, e2712. [Google Scholar] [CrossRef] [PubMed]
  20. Amjad, Z.; Nsiah, K.A.; Hilt, B.; Lauffenburger, J.-P.; Sikora, A. Latency reduction for narrowband URLLC networks: A performance evaluation. Wirel. Netw. 2021, 27, 2577–2593. [Google Scholar] [CrossRef]
  21. Navarro-Ortiz, J.; Romero-Diaz, P.; Sendra, S.; Ameigeiras, P.; Ramos-Munoz, J.J.; Lopez-Soler, J.M. A survey on 5G usage scenarios and traffic models. IEEE Commun. Surv. Tutor. 2020, 22, 905–929. [Google Scholar] [CrossRef]
  22. Umoh, V.; Ekpe, U.; Davidson, I.; Akpan, J. Mobile broadband adoption, performance measurements and methodology: A review. Electronics 2023, 12, 1630. [Google Scholar] [CrossRef]
  23. Bilal, M.; Zeng, R.; Muhammad, O.; Adil, M.; Shoukat, S. A cyber-resilient and explainable intrusion detection system for Internet of Things networks. Clust. Comput. 2025, 28, 308. [Google Scholar] [CrossRef]
  24. Wang, F.; Ma, G. Introduction on Massive Machine-Type Communications (mMTC). In Massive Machine Type Communications: Multiple Access Schemes; Springer: Berlin/Heidelberg, Germany, 2019; pp. 1–3. [Google Scholar] [CrossRef]
  25. Kovalchukov, R.; Moltchanov, D.; Pirskanen, J.; Säe, J.; Numminen, J.; Koucheryavy, Y.; Valkama, M. DECT-2020 new radio: The next step toward 5G massive machine-type communications. IEEE Commun. Mag. 2022, 60, 58–64. [Google Scholar] [CrossRef]
  26. Muhammad, O.; Jiang, H.; Umer, M.M.; Muhammad, B.; Ahtsam, N.M. Optimizing power allocation for D2D communication with URLLC under Rician fading channel: A learning-to-optimize approach. Intell. Autom. Soft Comput. 2023, 37, 3193–3212. [Google Scholar] [CrossRef]
  27. Khan, B.S.; Jangsher, S.; Ahmed, A.; Al-Dweik, A. URLLC and eMBB in 5G industrial IoT: A survey. IEEE Open J. Commun. Soc. 2022, 3, 1134–1163. [Google Scholar] [CrossRef]
  28. Ali, R.; Zikria, Y.B.; Bashir, A.K.; Garg, S.; Kim, H.S. URLLC for 5G and beyond: Requirements, enabling incumbent technologies and network intelligence. IEEE Access 2021, 9, 67064–67095. [Google Scholar] [CrossRef]
  29. Dilli, R. Performance of multi-user massive MIMO in 5G NR networks at 28 GHz band. Telecommun. Radio Eng. 2021, 80, 61–74. [Google Scholar] [CrossRef]
  30. Anbarasu, M.; Nithiyanantham, J. Performance analysis of highly efficient two-port MIMO antenna for 5G wearable applications. IETE J. Res. 2023, 69, 3594–3603. [Google Scholar] [CrossRef]
  31. Biradar, A.; Murthy, N.S.; Awasthi, P.; Srivastava, A.K.; Akram, P.S.; Lakshminarayana, M.; Abidin, S.; Vadi, V.R.; Sisay, A. Massive MIMO wireless solutions in backhaul for the 5G networks. Wirel. Commun. Mob. Comput. 2022, 2022, 3813610. [Google Scholar] [CrossRef]
  32. Yao, M.; Sohul, M.; Marojevic, V.; Reed, J.H. Artificial intelligence defined 5G radio access networks. IEEE Commun. Mag. 2019, 57, 14–20. [Google Scholar] [CrossRef]
  33. Devrari, A.; Kumar, A.; Kuchhal, P. Global aspects and overview of 5G multimedia communication. Multimedia Tools Appl. 2023, 83, 26439–26484. [Google Scholar] [CrossRef]
  34. Hammed, Z.S.; Ameen, S.Y.; Zeebaree, S.R.M. Massive MIMO-OFDM performance enhancement on 5G. In Proceedings of the 2021 International Conference on Software, Telecommunications and Computer Networks (SoftCOM), Split, Croatia, 23–25 September 2021; pp. 1–6. [Google Scholar] [CrossRef]
  35. Mhedhbi, M.; Boukour, F.E. Analysis and evaluation of pattern division multiple access scheme jointed with 5G waveforms. IEEE Access 2019, 7, 21826–21833. [Google Scholar] [CrossRef]
  36. Umer, M.M.; Jiang, H.; Zhang, Q.; Liu, M.; Muhammad, O. Time-slot based architecture for power beam-assisted relay techniques in CR-WSNs with transceiver hardware inadequacies. Bull. Pol. Acad. Sci. Tech. Sci. 2023, 71, 1–11. [Google Scholar] [CrossRef]
  37. Rao, L.; Pant, M.; Malviya, L.; Parmar, A.; Charhate, S.V. 5G beamforming techniques for the coverage of intended directions in modern wireless communication: In-depth review. Int. J. Microw. Wireless Technol. 2021, 13, 1039–1062. [Google Scholar] [CrossRef]
  38. Saraereh, O.A.; Ali, A. Beamforming performance analysis of millimeter-wave 5G wireless networks. Comput. Mater. Contin. 2022, 70, 5383–5397. [Google Scholar] [CrossRef]
  39. Sharma, A.; Jha, R.K. A comprehensive survey on security issues in 5G wireless communication network using beamforming approach. Wirel. Pers. Commun. 2021, 119, 3447–3501. [Google Scholar] [CrossRef]
  40. Jiang, C.; Zhang, H.; Ren, Y.; Han, Z.; Chen, K.-C.; Hanzo, L. Machine learning paradigms for next-generation wireless networks. IEEE Wirel. Commun. 2016, 24, 98–105. [Google Scholar] [CrossRef]
  41. Salem, M.A.; El-Kader, S.M.A.; Youssef, M.I.; Tarrad, I.F. M2m in 5g communication networks: Characteristics, applications, taxonomy, technologies, and future challenges. In Fundamental and Supportive Technologies for 5G Mobile Networks; IGI Global: Hershey, PA, USA, 2020; pp. 309–321. [Google Scholar] [CrossRef]
  42. Fourati, H.; Maaloul, R.; Chaari, L. A survey of 5G network systems: Challenges and machine learning approaches. Int. J. Mach. Learn. Cybern. 2021, 12, 385–431. [Google Scholar] [CrossRef]
  43. Hussein, H.H.; Elsayed, H.A.; El-kader, S.M.A. Intensive benchmarking of D2D communication over 5G cellular networks: Prototype, integrated features, challenges, and main applications. Wireless Netw. 2020, 26, 3183–3202. [Google Scholar] [CrossRef]
  44. Laguidi, A.; Hachad, T.; Hachad, L. Mobile network connectivity analysis for device to device communication in 5G network. Int. J. Electr. Comput. Eng. 2023, 13, 680–687. [Google Scholar] [CrossRef]
  45. Jayakumar, S. A review on resource allocation techniques in D2D communication for 5G and B5G technology. Peer-to-Peer Netw. Appl. 2021, 14, 243–269. [Google Scholar] [CrossRef]
  46. Ullah, A.; Aznaoui, H.; Sahin, C.B.; Sadie, M.; Dinler, Ö.B.; Laassar, I. Cloud computing and 5G challenges and open issues. Int. J. Adv. Appl. Sci. 2022, 11, 187–193. [Google Scholar] [CrossRef]
  47. Hassan, N.; Yau, K.-L.A.; Wu, C. Edge computing in 5G: A review. IEEE Access 2019, 7, 127276–127289. [Google Scholar] [CrossRef]
  48. Pham, Q.V.; Fang, F.; Ha, V.N.; Piran, M.J.; Le, M.; Le, L.B.; Hwang, W.J.; Ding, Z. A survey of multi-access edge computing in 5G and beyond: Fundamentals, technology integration, and state-of-the-art. IEEE Access 2020, 8, 116974–117017. [Google Scholar] [CrossRef]
  49. Narayanan, A.; De Sena, A.S.; Gutierrez-Rojas, D.; Melgarejo, D.C.; Hussain, H.M.; Ullah, M.; Bayhan, S.; Nardelli, P.H. Key advances in pervasive edge computing for industrial internet of things in 5g and beyond. IEEE Access 2020, 8, 206734–206754. [Google Scholar] [CrossRef]
  50. Feng, Z.; Qiu, C.; Feng, Z.; Wei, Z.; Li, W.; Zhang, P. An effective approach to 5G: Wireless network virtualization. IEEE Commun. Mag. 2015, 53, 53–59. [Google Scholar] [CrossRef]
  51. Ramakrishnan, J.; Shabbir, M.S.; Kassim, N.M.; Nguyen, P.T.; Mavaluru, D. A comprehensive and systematic review of the network virtualization techniques in the IoT. Int. J. Commun. Syst. 2020, 33, e4331. [Google Scholar] [CrossRef]
  52. Basu, D.; Datta, R.; Ghosh, U. Softwarized network function virtualization for 5G: Challenges and opportunities. In Internet of Things and Secure Smart Environments: Successes and Pitfalls; CRC: Boca Raton, FL, USA, 2020; p. 147. [Google Scholar] [CrossRef]
  53. Vaezpour, E. Deep learning-driven multi-objective dynamic switch migration in software defined networking (SDN)/network function virtualization (NFV)-based 5G networks. Eng. Appl. Artif. Intell. 2023, 125, 106714. [Google Scholar] [CrossRef]
  54. Midya, P.; Acharya, T. Wireless network virtualization in cellular IoT networks for smart city application. In Proceedings of the 2025 National Conference on Communications (NCC), New Delhi, India, 6–9 March 2025; pp. 1–6. [Google Scholar] [CrossRef]
  55. Syed-Yusof, S.K.; Numan, P.E.; Yusof, K.M.; Din, J.B.; Marsono, M.N.; Onumanyi, A.J. Software-defined networking (SDN) and 5G network: The role of controller placement for scalable control plane. In Proceedings of the 2020 IEEE International RF and Microwave Conference (RFM), Kuala Lumpur, Malaysia, 14–16 December 2020; pp. 1–6. [Google Scholar] [CrossRef]
  56. Ghaffar, A.A.A.; Mahmoud, A.; Sheltami, T.; Abu-Amara, M. A survey on software-defined networking-based 5G mobile core architectures. Arab. J. Sci. Eng. 2023, 48, 2313–2330. [Google Scholar] [CrossRef]
  57. Barakabitze, A.A.; Ahmad, A.; Mijumbi, R.; Hines, A. 5G network slicing using SDN and NFV: A survey of taxonomy, architectures and future challenges. Comput. Netw. 2020, 167, 106984. [Google Scholar] [CrossRef]
  58. Rajput, M.; Malathi, P. A review of in band full duplex communication in 5G cellular network. In Proceedings of the 2021 12th International Conference on Computing Communication and Networking Technologies (ICCCNT), Kharagpur, India, 6–8 July 2021; pp. 1–7. [Google Scholar] [CrossRef]
  59. Gazestani, A.H.; Ghorashi, S.A.; Mousavinasab, B.; Shikh-Bahaei, M. A survey on implementation and applications of full duplex wireless communications. Phys. Commun. 2019, 34, 121–134. [Google Scholar] [CrossRef]
  60. Smida, B.; Sabharwal, A.; Fodor, G.; Alexandropoulos, G.C.; Suraweera, H.A.; Chae, C.-B. Full-duplex wireless for 6G: Progress brings new opportunities and challenges. IEEE J. Sel. Areas Commun. 2023, 41, 2729–2750. [Google Scholar] [CrossRef]
  61. Nadeem, L.; Azam, M.A.; Amin, Y.; Al-Ghamdi, M.A.; Chai, K.K.; Khan, M.F.N.; Khan, M.A. Integration of D2D, network slicing, and MEC in 5G cellular networks: Survey and challenges. IEEE Access 2021, 9, 37590–37612. [Google Scholar] [CrossRef]
  62. Zhang, S. An overview of network slicing for 5G. IEEE Wirel. Commun. 2019, 26, 111–117. [Google Scholar] [CrossRef]
  63. Dahrouj, H.; Alghamdi, R.; Alwazani, H.; Bahanshal, S.; Ahmad, A.A.; Faisal, A.; Shalabi, R.; Alhadrami, R.; Subasi, A.; Al-Nory, M.T.; et al. An overview of machine learning-based techniques for solving optimization problems in communications and signal processing. IEEE Access 2021, 9, 74908–74938. [Google Scholar] [CrossRef]
  64. Khan, S.K.; Naseem, U.; Siraj, H.; Razzak, I.; Imran, M. The role of unmanned aerial vehicles and mmWave in 5G: Recent advances and challenges. Trans. Emerg. Telecommun. Technol. 2021, 32, e4241. [Google Scholar] [CrossRef]
  65. Al-Shammari, B.K.J.; Hburi, I.; Idan, H.R.; Khazaal, H.F. An overview of mmWave communications for 5G. In Proceedings of the 2021 International Conference on Communication & Information Technology (ICICT), Basrah, Iraq, 5–6 June 2021; pp. 133–139. [Google Scholar] [CrossRef]
  66. Othman, W.M.; Ateya, A.A.; Nasr, M.E.; Muthanna, A.; ElAffendi, M.; Koucheryavy, A.; Hamdi, A.A. Key enabling technologies for 6G: The role of UAVs, terahertz communication, and intelligent reconfigurable surfaces in shaping the future of wireless networks. J. Sens. Actuator Netw. 2025, 14, 30. [Google Scholar] [CrossRef]
  67. Zhang, S.; Huang, W.; Liu, Y. A systematic survey on physical layer security oriented to reconfigurable intelligent surface empowered 6G. Comput. Secur. 2025, 148, 104100. [Google Scholar] [CrossRef]
  68. Janiesch, C.; Zschech, P.; Heinrich, K. Machine learning and deep learning. Electron. Mark. 2021, 31, 685–695. [Google Scholar] [CrossRef]
  69. Samanta, R.K.; Sadhukhan, B.; Samaddar, H.; Sarkar, S.; Koner, C.; Ghosh, M. Scope of machine learning applications for addressing the challenges in next-generation wireless networks. CAAI Trans. Intell. Technol. 2022, 7, 395–418. [Google Scholar] [CrossRef]
  70. Kaur, J.; Khan, M.A.; Iftikhar, M.; Imran, M.; Haq, Q.E.U. Machine learning techniques for 5G and beyond. IEEE Access 2021, 9, 23472–23488. [Google Scholar] [CrossRef]
  71. Jagannath, J.; Polosky, N.; Jagannath, A.; Restuccia, F.; Melodia, T. Machine learning for wireless communications in the Internet of Things: A comprehensive survey. Ad Hoc Netw. 2019, 93, 101913. [Google Scholar] [CrossRef]
  72. Sen, P.C.; Hajra, M.; Ghosh, M. Supervised classification algorithms in machine learning: A survey and review. In Emerging Technology in Modelling and Graphics: Proceedings of IEM Graph 2018; Springer: Singapore, 2020; pp. 99–111. [Google Scholar] [CrossRef]
  73. Shubyn, B.; Lutsiv, N.; Syrotynskyi, O.; Kolodii, R. Deep learning based adaptive handover optimization for ultra-dense 5G mobile networks. In Proceedings of the 2020 IEEE 15th International Conference on Advanced Trends in Radioelectronics, Telecommunications and Computer Engineering (TCSET), Lviv-Slavske, Ukraine, 25–29 February 2020; pp. 869–872. [Google Scholar] [CrossRef]
  74. Pisner, D.A.; Schnyer, D.M. Support vector machine. In Machine Learning; Elsevier: Amsterdam, The Netherlands, 2020; pp. 101–121. [Google Scholar] [CrossRef]
  75. Yang, H.; Xie, X.; Kadoch, M. Machine learning techniques and a case study for intelligent wireless networks. IEEE Netw. 2020, 34, 208–215. [Google Scholar] [CrossRef]
  76. Wang, W.; Duan, Y.; Cao, L.; Jiang, Z. Application of improved Naive Bayes classification algorithm in 5G signaling analysis. J. Supercomput. 2023, 79, 6941–6964. [Google Scholar] [CrossRef]
  77. Vijayalakshmi, A.; Abdulsamath, G.; Saravanan, N. 5G network slicing algorithm development using bagging based-Gaussian naive Bayes. In Proceedings of the 2023 Second International Conference on Electrical, Electronics, Information and Communication Technologies (ICEEICT), Trichirappalli, India, 5–7 April 2023; pp. 1–5. [Google Scholar] [CrossRef]
  78. Sirohi, D.; Kumar, N.; Rana, P.S. Convolutional neural networks for 5G-enabled intelligent transportation system: A systematic review. Comput. Commun. 2020, 153, 459–498. [Google Scholar] [CrossRef]
  79. Yu, Y.; Si, X.; Hu, C.; Zhang, J. A review of recurrent neural networks: LSTM cells and network architectures. Neural Comput. 2019, 31, 1235–1270. [Google Scholar] [CrossRef]
  80. Staudemeyer, R.C.; Morris, E.R. Understanding LSTM? A tutorial into long short-term memory recurrent neural networks. arXiv 2019, arXiv:1909.09586. [Google Scholar] [CrossRef]
  81. Gupta, R.K.; Pannu, P.; Misra, R. Towards ultra-latency using deep learning in 5G network slicing applying approximate k-nearest neighbor graph construction. Wirel. Pers. Commun. 2021, 1–19. [Google Scholar] [CrossRef]
  82. Preciado-Velasco, J.E.; Gonzalez-Franco, J.D.; Anias-Calderon, C.E.; Nieto-Hipolito, J.I.; Rivera-Rodriguez, R. 5G/B5G service classification using supervised learning. Appl. Sci. 2021, 11, 4942. [Google Scholar] [CrossRef]
  83. Rekkas, V.P.; Sotiroudis, S.; Sarigiannidis, P.; Wan, S.; Karagiannidis, G.K.; Goudos, S.K. Machine learning in beyond 5G/6G networks—state-of-the-art and future trends. Electronics 2021, 10, 2786. [Google Scholar] [CrossRef]
  84. Charbuty, B.; Abdulazeez, A. Classification based on decision tree algorithm for machine learning. J. Appl. Sci. Technol. Trends 2021, 2, 20–28. [Google Scholar] [CrossRef]
  85. Schonlau, M.; Zou, R.Y. The random forest algorithm for statistical learning. Stata J. 2020, 20, 3–29. [Google Scholar] [CrossRef]
  86. Caiyu, S.; Jinri, W.; Jie, D.; Yuan, L. Mining potential 5G mobile users based on logistic regression and random forest. In Proceedings of the 2022 IEEE 5th International Conference on Information Systems and Computer Aided Education (ICISCAE), Dalian, China, 23–25 September 2022; pp. 351–356. [Google Scholar] [CrossRef]
  87. Zhou, I.; Makhdoom, I.; Shariati, N.; Raza, M.A.; Keshavarz, R.; Lipman, J.; Abolhasan, M.; Jamalipour, A. Internet of things 2.0: Concepts, applications, and future directions. IEEE Access 2021, 9, 70961–71012. [Google Scholar] [CrossRef]
  88. Hassan, D.O.; Hassan, B.A. A comprehensive systematic review of machine learning in the retail industry: Classifications, limitations, opportunities, and challenges. Neural Comput. Appl. 2025, 37, 2035–2070. [Google Scholar] [CrossRef]
  89. Zhao, Z.; Alzubaidi, L.; Zhang, J.; Duan, Y.; Gu, Y. A comparison review of transfer learning and self-supervised learning: Definitions, applications, advantages and limitations. Expert Syst. Appl. 2024, 242, 122807. [Google Scholar] [CrossRef]
  90. Khaldi, M.I.; Erraissi, A.; Hain, M.; Banane, M. Multicriteria evaluation of supervised classification algorithms: Strengths, limitations and practical recommendations. In International Conference on Intelligent Systems and Digital Applications; Springer: Cham, Switzerland, 2025; pp. 277–286. [Google Scholar] [CrossRef]
  91. Ahmed, S.F.; Alam, M.S.B.; Hassan, M.; Rozbu, M.R.; Ishtiak, T.; Rafa, N.; Mofijur, M.; Ali, A.B.M.S.; Gandomi, A.H. Deep learning modelling techniques: Current progress, applications, advantages, and challenges. Artif. Intell. Rev. 2023, 56, 13521–13617. [Google Scholar] [CrossRef]
  92. Malekzadeh, M. Performance prediction and enhancement of 5G networks based on linear regression machine learning. EURASIP J. Wirel. Commun. Netw. 2023, 2023, 74. [Google Scholar] [CrossRef]
  93. Varotto, M.; Heinrichs, F.; Schürg, T.; Tomasin, S.; Valentin, S. Detecting 5G narrowband jammers with CNN, k-nearest neighbors, and support vector machines. In Proceedings of the 2024 IEEE International Workshop on Information Forensics and Security (WIFS), Rome, Italy, 2–5 December 2024; pp. 1–6. [Google Scholar] [CrossRef]
  94. James, G.; Witten, D.; Hastie, T.; Tibshirani, R.; Taylor, J. Unsupervised learning. In An Introduction to Statistical Learning: With Applications in Python; Springer: Berlin/Heidelberg, Germany, 2023; pp. 503–556. [Google Scholar] [CrossRef]
  95. Ige, A.O.; Noor, M.H.M. A survey on unsupervised learning for wearable sensor-based activity recognition. Appl. Soft Comput. 2022, 127, 109363. [Google Scholar] [CrossRef]
  96. Lefoane, M.; Ghafir, I.; Kabir, S.; Awan, I.-U. Unsupervised learning for feature selection: A proposed solution for botnet detection in 5G networks. IEEE Trans. Ind. Inform. 2022, 19, 921–929. [Google Scholar] [CrossRef]
  97. Ezugwu, A.E.; Ikotun, A.M.; Oyelade, O.O.; Abualigah, L.; Agushaka, J.O.; Eke, C.I.; Akinyelu, A.A. A comprehensive survey of clustering algorithms: State-of-the-art machine learning applications, taxonomy, challenges, and future research prospects. Eng. Appl. Artif. Intell. 2022, 110, 104743. [Google Scholar] [CrossRef]
  98. Khan, M.U.; Azizi, M.; García-Armada, A.; Escudero-Garzás, J.J. Unsupervised clustering for 5G network planning assisted by real data. IEEE Access 2022, 10, 39269–39281. [Google Scholar] [CrossRef]
  99. Jia, W.; Sun, M.; Lian, J.; Hou, S. Feature dimensionality reduction: A review. Complex Intell. Syst. 2022, 8, 2663–2693. [Google Scholar] [CrossRef]
  100. Pena, J.M.; Lozano, J.A.; Larranaga, P.; Inza, I. Dimensionality reduction in unsupervised learning of conditional Gaussian networks. IEEE Trans. Pattern Anal. Mach. Intell. 2001, 23, 590–603. [Google Scholar] [CrossRef]
  101. Ebenezer, E.; Djouani, K.; Kurien, A.M. Integrating artificial intelligence internet of things and 5G for next-generation smartgrid: A survey of trends challenges and prospect. IEEE Access 2022, 10, 4794–4831. [Google Scholar] [CrossRef]
  102. Zhang, S.; Zhu, D. Towards artificial intelligence enabled 6G: State of the art, challenges, and opportunities. Comput. Netw. 2020, 183, 107556. [Google Scholar] [CrossRef]
  103. Ikotun, A.M.; Ezugwu, A.E.; Abualigah, L.; Abuhaija, B.; Heming, J. K-means clustering algorithms: A comprehensive review, variants analysis, and advances in the era of big data. Inf. Sci. 2023, 622, 178–210. [Google Scholar] [CrossRef]
  104. Natarajan, J.; Rebekka, B. An energy efficient dynamic small cell on/off switching with enhanced k-means clustering algorithm for 5G HetNets. Int. J. Commun. Netw. Distrib. Syst. 2023, 29, 209–237. [Google Scholar] [CrossRef]
  105. Bank, D.; Koenigstein, N.; Giryes, R. Autoencoders. In Machine Learning for Data Science Handbook: Data Mining and Knowledge Discovery Handbook; Springer: Berlin/Heidelberg, Germany, 2023; pp. 353–374. [Google Scholar] [CrossRef]
  106. Başaran, O.T.; Başaran, M.; Turan, D.; Bayrak, H.G.; Sandal, Y.S. Deep autoencoder design for RF anomaly detection in 5G O-RAN near-RT RIC via xApps. In Proceedings of the 2023 IEEE International Conference on Communications Workshops (ICC Workshops), Rome, Italy, 28 May–1 June 2023; pp. 549–555. [Google Scholar] [CrossRef]
  107. Papidas, A.G.; Polyzos, G.C. Self-organizing networks for 5G and beyond: A view from the top. Future Internet 2022, 14, 95. [Google Scholar] [CrossRef]
  108. Fourati, H.; Maaloul, R.; Chaari, L.; Jmaiel, M. Comprehensive survey on self-organizing cellular network approaches applied to 5G networks. Comput. Netw. 2021, 199, 108435. [Google Scholar] [CrossRef]
  109. Prince-Tritto, P.; Ponce, H. Exploring the challenges and limitations of unsupervised machine learning approaches in legal concepts discovery. In Mexican International Conference on Artificial Intelligence; Springer: Cham, Switzerland, 2023; pp. 52–67. [Google Scholar] [CrossRef]
  110. Singh, S.; Hooda, S. A study of challenges and limitations to applying machine learning to highly unstructured data. In Proceedings of the 2023 7th International Conference On Computing, Communication, Control And Automation (ICCUBEA), Pune, India, 18–19 August 2023; pp. 1–6. [Google Scholar] [CrossRef]
  111. Priyadarshi, R.; Ranjan, R.; Vishwakarma, A.K.; Yang, T.; Rathore, R.S. Exploring the frontiers of unsupervised learning techniques for diagnosis of cardiovascular disorder: A systematic review. IEEE Access 2024, 12, 139253–139272. [Google Scholar] [CrossRef]
  112. Tyagi, K.; Rane, C.; Sriram, R.; Manry, M. Unsupervised learning. In Artificial Intelligence and Machine Learning for Edge Computing; Academic Press: Cambridge, MA, USA, 2022; pp. 33–52. [Google Scholar] [CrossRef]
  113. Ladosz, P.; Weng, L.; Kim, M.; Oh, H. Exploration in deep reinforcement learning: A survey. Inf. Fusion 2022, 85, 1–22. [Google Scholar] [CrossRef]
  114. Li, S.E. Deep reinforcement learning. In Reinforcement Learning for Sequential Decision and Optimal Control; Springer: Berlin/Heidelberg, Germany, 2023; pp. 365–402. [Google Scholar] [CrossRef]
  115. Mollel, M.S.; Abubakar, A.I.; Ozturk, M.; Kaijage, S.F.; Kisangiri, M.; Hussain, S.; Imran, M.A.; Abbasi, Q.H. A survey of machine learning applications to handover management in 5G and beyond. IEEE Access 2021, 9, 45770–45802. [Google Scholar] [CrossRef]
  116. Nawej, C.M.; Owolawi, P.A.; Walingo, T.M. Advanced clustering for mobile network optimization: A systematic literature review. Sensors 2025, 25, 7370. [Google Scholar] [CrossRef] [PubMed]
  117. Lykakis, E.; Vardiambasis, I.O.; Kokkinos, E. Data traffic prediction for 5G and beyond: Emerging trends, challenges, and future directions: A scoping review. Electronics 2025, 14, 4611. [Google Scholar] [CrossRef]
  118. Helal, S.; Sarieddeen, H.; Dahrouj, H.; Al-Naffouri, T.Y.; Alouini, M.-S. Signal processing and machine learning techniques for terahertz sensing: An overview. IEEE Signal Process. Mag. 2022, 39, 42–62. [Google Scholar] [CrossRef]
  119. Zhao, F.; Wang, Q.; Wang, L. An inverse reinforcement learning framework with the Q-learning mechanism for the metaheuristic algorithm. Knowl.-Based Syst. 2023, 265, 110368. [Google Scholar] [CrossRef]
  120. Iqbal, M.U.; Ansari, E.A.; Akhtar, S.; Khan, A.N. Improving the QoS in 5G HetNets through cooperative Q-learning. IEEE Access 2022, 10, 19654–19676. [Google Scholar] [CrossRef]
  121. Shokrnezhad, M.; Taleb, T.; Dazzi, P. Double deep q-learning-based path selection and service placement for latency-sensitive beyond 5G applications. IEEE Trans. Mobile Comput. 2023, 23, 5097–5110. [Google Scholar] [CrossRef]
  122. Tan, K.; Bremner, D.; Kernec, J.L.; Sambo, Y.; Zhang, L.; Imran, M.A. Intelligent handover algorithm for vehicle-to-network communications with double-deep Q-learning. IEEE Trans. Veh. Technol. 2022, 71, 7848–7862. [Google Scholar] [CrossRef]
  123. Malta, S.; Pinto, P.; Fernández-Veiga, M. Using reinforcement learning to reduce energy consumption of ultra-dense networks with 5G use cases requirements. IEEE Access 2023, 11, 5417–5428. [Google Scholar] [CrossRef]
  124. Ahsan, W.; Yi, W.; Liu, Y.; Nallanathan, A. Reliable reinforcement learning based NOMA schemes for URLLC. In Proceedings of the 2021 IEEE Global Communications Conference (GLOBECOM), Madrid, Spain, 7–11 December 2021; pp. 1–6. [Google Scholar] [CrossRef]
  125. Suh, K.; Kim, S.; Ahn, Y.; Kim, S.; Ju, H.; Shim, B. Deep reinforcement learning-based network slicing for beyond 5G. IEEE Access 2022, 10, 7384–7395. [Google Scholar] [CrossRef]
  126. Liang, R.; Lyu, H.; Fan, J. A deep reinforcement learning-based power control scheme for the 5G wireless systems. China Commun. 2023, 20, 109–119. [Google Scholar] [CrossRef]
  127. Ribeiro, D.A.; Melgarejo, D.C.; Saadi, M.; Rosa, R.L.; Rodríguez, D.Z. A novel deep deterministic policy gradient model applied to intelligent transportation system security problems in 5G and 6G network scenarios. Phys. Commun. 2023, 56, 101938. [Google Scholar] [CrossRef]
  128. He, J. 5G communication resource allocation strategy for mobile edge computing based on deep deterministic policy gradient. J. Eng. 2023, 2023, e12250. [Google Scholar] [CrossRef]
  129. Kwon, H. Learning-based power delay profile estimation for 5G NR via advantage actor-critic (A2C). In Proceedings of the 2022 IEEE 95th Vehicular Technology Conference: (VTC2022-Spring), Helsinki, Finland, 19–22 June 2022; pp. 1–6. [Google Scholar] [CrossRef]
  130. Javadpour, A.; Jafari, F.; Taleb, T.; Benzaïd, C. Enhancing 5G network slicing: Slice isolation via actor-critic reinforcement learning with optimal graph features. In Proceedings of the GLOBECOM 2023—2023 IEEE Global Communications Conference, Kuala Lumpur, Malaysia, 4–8 December 2023. [Google Scholar] [CrossRef]
  131. Terven, J. Deep reinforcement learning: A chronological overview and methods. AI 2025, 6, 46. [Google Scholar] [CrossRef]
  132. Su, T.; Wu, T.; Zhao, J.; Scaglione, A.; Xie, L. A review of safe reinforcement learning methods for modern power systems. Proc. IEEE 2025, 113, 213–255. [Google Scholar] [CrossRef]
  133. Calvo-Fullana, M.; Paternain, S.; Chamon, L.F.O.; Ribeiro, A. State augmented constrained reinforcement learning: Overcoming the limitations of learning with rewards. IEEE Trans. Autom. Control 2023, 69, 4275–4290. [Google Scholar] [CrossRef]
  134. Gu, S.; Yang, L.; Duan, Y.; Chen, G.; Walter, F.; Wang, J.; Knoll, A. A review of safe reinforcement learning: Methods, theories and applications. IEEE Trans. Pattern Anal. Mach. Intell. 2024, 46, 11216–11235. [Google Scholar] [CrossRef]
  135. Valcarce, A.; Kela, P.; Mandelli, S.; Viswanathan, H. The role of AI in 6G MAC. In Proceedings of the 2024 Joint European Conference on Networks and Communications & 6G Summit (EuCNC/6G Summit), Antwerp, Belgium, 3–6 June 2024; pp. 723–728. [Google Scholar] [CrossRef]
  136. Jang, J.; Park, J.H.; Yang, H.J. Supervised-learning-based resource allocation in wireless networks. In Proceedings of the 2020 International Conference on Information and Communication Technology Convergence (ICTC), Jeju, Republic of Korea, 21–23 October 2020; pp. 1022–1024. [Google Scholar] [CrossRef]
  137. Chen, T.; Zhang, X.; You, M.; Zheng, G.; Lambotharan, S. A GNN-based supervised learning framework for resource allocation in wireless IoT networks. IEEE Internet Things J. 2021, 9, 1712–1724. [Google Scholar] [CrossRef]
  138. Zhang, X.; Zhang, Z.; Yang, L. Joint user association and power allocation in heterogeneous ultra dense network via semi-supervised representation learning. arXiv 2021, arXiv:2103.15367. [Google Scholar] [CrossRef]
  139. Jayaraman, R.; Manickam, B.; Annamalai, S.; Kumar, M.; Mishra, A.; Shrestha, R. Effective resource allocation technique to improve QoS in 5G wireless network. Electronics 2023, 12, 451. [Google Scholar] [CrossRef]
  140. Koc, A.; Wang, M.; Le-Ngoc, T. Deep learning based multi-user power allocation and hybrid precoding in massive MIMO systems. In Proceedings of the ICC 2022—IEEE International Conference on Communications, Seoul, Republic of Korea, 16–20 May 2022; pp. 5487–5492. [Google Scholar] [CrossRef]
  141. Sliwa, B.; Adam, R.; Wietfeld, C. Client-based intelligence for resource efficient vehicular big data transfer in future 6G networks. IEEE Trans. Veh. Technol. 2021, 70, 5332–5346. [Google Scholar] [CrossRef]
  142. Kwon, H.J.; Lee, J.H.; Choi, W. Machine learning-based beamforming in K-user MISO interference channels. IEEE Access 2021, 9, 28066–28075. [Google Scholar] [CrossRef]
  143. Beyazıt, E.A.; Özbek, B.; Ruyet, D.L. Deep learning based adaptive bit allocation for heterogeneous interference channels. Phys. Commun. 2021, 47, 101364. [Google Scholar] [CrossRef]
  144. He, X.; Mao, Y.; Liu, Y.; Ping, P.; Hong, Y.; Hu, H. Channel assignment and power allocation for throughput improvement with PPO in B5G heterogeneous edge networks. Digit. Commun. Netw. 2023, 10, 109–116. [Google Scholar] [CrossRef]
  145. Dahal, M.; Vaezi, M. Deep reinforcement learning for interference management in millimeter-wave networks. In Proceedings of the 2022 56th Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, CA, USA, 31 October–2 November 2022; pp. 1064–1069. [Google Scholar] [CrossRef]
  146. Eskandari, M.; Kapoor, S.; Briggs, K.; Shojaeifard, A.; Zhu, H.; Mourad, A. Smart interference management xApp using deep reinforcement learning. arXiv 2022, arXiv:2204.09707. [Google Scholar] [CrossRef]
  147. Abuzainab, N.; Alrabeiah, M.; Alkhateeb, A.; Sagduyu, Y.E. Deep learning for THz drones with flying intelligent surfaces: Beam and handoff prediction. In Proceedings of the 2021 IEEE International Conference on Communications Workshops (ICC Workshops), Montreal, QC, Canada, 14–23 June 2021; pp. 1–6. [Google Scholar] [CrossRef]
  148. Gao, F.; Lin, B.; Bian, C.; Zhou, T.; Qian, J.; Wang, H. FusionNet: Enhanced beam prediction for mmWave communications using sub-6 GHz channel and a few pilots. IEEE Trans. Commun. 2021, 69, 8488–8500. [Google Scholar] [CrossRef]
  149. Sim, M.S.; Lim, Y.-G.; Park, S.H.; Dai, L.; Chae, C.-B. Deep learning-based mmWave beam selection for 5G NR/6G with sub-6 GHz channel information: Algorithms and prototype validation. IEEE Access 2020, 8, 51634–51646. [Google Scholar] [CrossRef]
  150. Keserwani, H.; Rastogi, H.; Kurniullah, A.Z.; Janardan, S.K.; Raman, R.; Rathod, V.M.; Gupta, A. Security enhancement by identifying attacks using machine learning for 5G network. Int. J. Commun. Netw. Inf. Secur. 2022, 14, 124–141. [Google Scholar] [CrossRef]
  151. Mahbooba, B.; Timilsina, M.; Sahal, R.; Serrano, M. Explainable artificial intelligence (XAI) to enhance trust management in intrusion detection systems using decision tree model. Complexity 2021, 2021, 6634811. [Google Scholar] [CrossRef]
  152. Sun, M.; Jin, Y.; Wang, S.; Mei, E. Joint deep reinforcement learning and unsupervised learning for channel selection and power control in D2D networks. Entropy 2022, 24, 1722. [Google Scholar] [CrossRef]
  153. Wang, X.; Cheng, N.; Fu, L.; Quan, W.; Sun, R.; Hui, Y.; Luan, T.; Shen, X.S. Scalable resource management for dynamic MEC: An unsupervised link-output graph neural network approach. In Proceedings of the 2023 IEEE 34th Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC), Toronto, ON, Canada, 5–8 September 2023; pp. 1–6. [Google Scholar] [CrossRef]
  154. Sharma, N.; Kumar, K. Energy efficient clustering and resource allocation strategy for ultra-dense networks: A machine learning framework. IEEE Trans. Netw. Service Manag. 2022, 20, 1884–1897. [Google Scholar] [CrossRef]
  155. Ergen, T.; Kozat, S.S. Unsupervised anomaly detection with LSTM neural networks. IEEE Trans. Neural Netw. Learn. Syst. 2019, 31, 3127–3141. [Google Scholar] [CrossRef]
  156. Wang, Z.; Eisen, M.; Ribeiro, A. Unsupervised learning for asynchronous resource allocation in ad-hoc wireless networks. In Proceedings of the ICASSP 2021—2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, ON, Canada, 6–11 June 2021; pp. 8143–8147. [Google Scholar] [CrossRef]
  157. Labana, M.; Hamouda, W. Unsupervised deep learning approach for near optimal power allocation in CRAN. IEEE Trans. Veh. Technol. 2021, 70, 7059–7070. [Google Scholar] [CrossRef]
  158. Nikbakht, R.; Jonsson, A.; Lozano, A. Unsupervised learning for cellular power control. IEEE Commun. Lett. 2020, 25, 682–686. [Google Scholar] [CrossRef]
  159. Balevi, E.; Doshi, A.; Andrews, J.G. Massive MIMO channel estimation with an untrained deep neural network. IEEE Trans. Wireless Commun. 2020, 19, 2079–2090. [Google Scholar] [CrossRef]
  160. Jung, K.; Wang, H. Pilotless channel estimation scheme using clustering-based unsupervised learning. In Proceedings of the 2018 15th International Symposium on Wireless Communication Systems (ISWCS), Lisbon, Portugal, 28–31 August 2018; pp. 1–5. [Google Scholar] [CrossRef]
  161. Zhang, L.; She, C.; Ying, K.; Li, Y.; Vucetic, B. Unsupervised learning for ultra-reliable and low-latency communications with practical channel estimation. IEEE Trans. Wireless Commun. 2023, 23, 3633–3647. [Google Scholar] [CrossRef]
  162. Le Ha, A.; Van Chien, T.; Nguyen, T.H.; Choi, W.; Nguyen, V.D. Deep learning-aided 5G channel estimation. In Proceedings of the 2021 15th International Conference on Ubiquitous Information Management and Communication (IMCOM), Seoul, Republic of Korea, 4–6 January 2021; pp. 1–7. [Google Scholar] [CrossRef]
  163. Wang, Z.; Zhou, Y.; Shi, Y.; Zhuang, W. Interference management for over-the-air federated learning in multi-cell wireless networks. IEEE J. Sel. Areas Commun. 2022, 40, 2361–2377. [Google Scholar] [CrossRef]
  164. Liu, X.; Zhang, H.; Long, K.; Nallanathan, A.; Leung, V.C.M. Distributed unsupervised learning for interference management in integrated sensing and communication systems. IEEE Trans. Wirel. Commun. 2023, 22, 9301–9312. [Google Scholar] [CrossRef]
  165. Oyedare, T.; Shah, V.K.; Jakubisin, D.J.; Reed, J.H. Interference suppression using deep learning: Current approaches and open challenges. IEEE Access 2022, 10, 66238–66266. [Google Scholar] [CrossRef]
  166. Ahmad, I.; Hussain, S.; Mahmood, S.N.; Mostafa, H.; Alkhayyat, A.; Marey, M.; Abbas, A.H.; Rashed, Z.A. Co-channel interference management for heterogeneous networks using deep learning approach. Information 2023, 14, 139. [Google Scholar] [CrossRef]
  167. Lavdas, S.; Gkonis, P.K.; Zinonos, Z.; Trakadas, P.; Sarakis, L.; Papadopoulos, K. A machine learning adaptive beamforming framework for 5G millimeter wave massive MIMO multicellular networks. IEEE Access 2022, 10, 91597–91609. [Google Scholar] [CrossRef]
  168. Ahmed, I.; Shahid, M.K.; Khammari, H.; Masud, M. Machine learning based beam selection with low complexity hybrid beamforming design for 5G massive MIMO systems. IEEE Trans. Green Commun. Netw. 2021, 5, 2160–2173. [Google Scholar] [CrossRef]
  169. Yu, Y.; Long, J.; Cai, Z. Network intrusion detection through stacking dilated convolutional autoencoders. Secur. Commun. Netw. 2017, 2017, 4184196. [Google Scholar] [CrossRef]
  170. Sattiraju, R.; Weinand, A.; Schotten, H.D. AI-assisted PHY technologies for 6G and beyond wireless networks. arXiv 2019, arXiv:1908.09523. [Google Scholar] [CrossRef]
  171. Lam, J.; Abbas, R. Machine learning based anomaly detection for 5G networks. arXiv 2020, arXiv:2003.03474. [Google Scholar] [CrossRef]
  172. Chen, Y.; Zhang, Y.; Maharjan, S.; Alam, M.; Wu, T. Deep learning for secure mobile edge computing in cyber-physical transportation systems. IEEE Netw. 2019, 33, 36–41. [Google Scholar] [CrossRef]
  173. Liu, Y.; Peng, J.; Kang, J.; Iliyasu, A.M.; Niyato, D.; El-Latif, A.A.A. A secure federated learning framework for 5G networks. IEEE Wireless Commun. 2020, 27, 24–31. [Google Scholar] [CrossRef]
  174. Yu, W.; He, H.; Yu, X.; Song, S.; Zhang, J.; Murch, R.; Letaief, K.B. Bayes-optimal unsupervised learning for channel estimation in near-field holographic MIMO. IEEE J. Sel. Top. Signal Process. 2024, 18, 714–729. [Google Scholar] [CrossRef]
  175. Gao, J.; Zhong, C.; Chen, X.; Lin, H.; Zhang, Z. Unsupervised learning for passive beamforming. IEEE Commun. Lett. 2020, 24, 1052–1056. [Google Scholar] [CrossRef]
  176. Zhou, H.; Erol-Kantarci, M.; Poor, H.V. Learning from peers: Deep transfer reinforcement learning for joint radio and cache resource allocation in 5G RAN slicing. IEEE Trans. Cogn. Commun. Netw. 2022, 8, 1925–1941. [Google Scholar] [CrossRef]
  177. Seid, A.M.; Erbad, A.; Abishu, H.N.; Albaseer, A.; Abdallah, M.; Guizani, M. Blockchain-empowered resource allocation in multi-UAV-enabled 5G-RAN: A multi-agent deep reinforcement learning approach. IEEE Trans. Cogn. Commun. Netw. 2023, 9, 991–1011. [Google Scholar] [CrossRef]
  178. Yun, J.; Goh, Y.; Yoo, W.; Chung, J.-M. 5G multi-RAT URLLC and eMBB dynamic task offloading with MEC resource allocation using distributed deep reinforcement learning. IEEE Internet Things J. 2022, 9, 20733–20749. [Google Scholar] [CrossRef]
  179. Shah, H.A.; Zhao, L.; Kim, I.-M. Joint network control and resource allocation for space-terrestrial integrated network through hierarchal deep actor-critic reinforcement learning. IEEE Trans. Veh. Technol. 2021, 70, 4943–4954. [Google Scholar] [CrossRef]
  180. Feng, K.; Wang, Q.; Li, X.; Wen, C.-K. Deep reinforcement learning based intelligent reflecting surface optimization for MISO communication systems. IEEE Wirel. Commun. Lett. 2020, 9, 745–749. [Google Scholar] [CrossRef]
  181. Yang, Z.; Liu, Y.; Chen, Y. Distributed reinforcement learning for NOMA-enabled mobile edge computing. In Proceedings of the 2020 IEEE International Conference on Communications Workshops (ICC Workshops), Dublin, Ireland, 7–11 June 2020; pp. 1–6. [Google Scholar] [CrossRef]
  182. Kaihong, C.; Xiaowei, M.; Linsheng, M.; Wei, H. Multi-agent reinforcement learning based joint uplink–downlink subcarrier assignment and power allocation for D2D underlay networks. Wirel. Netw. 2023, 29, 891–907. [Google Scholar] [CrossRef]
  183. Sharma, H.; Kumar, N.; Tekchandani, R.K. SecBoost: Secrecy-aware deep reinforcement learning based energy-efficient scheme for 5G HetNets. IEEE Trans. Mobile Comput. 2023, 23, 1401–1415. [Google Scholar] [CrossRef]
  184. Zafaruddin, S.M.; Bistritz, I.; Leshem, A.; Niyato, D. Multiagent autonomous learning for distributed channel allocation in wireless networks. In Proceedings of the 2019 IEEE 20th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), Cannes, France, 2–5 July 2019; pp. 1–5. [Google Scholar] [CrossRef]
  185. He, G.; Cui, S.; Dai, Y.; Jiang, T. Learning task-oriented channel allocation for multi-agent communication. IEEE Trans. Veh. Technol. 2022, 71, 12016–12029. [Google Scholar] [CrossRef]
  186. Catak, F.O.; Cali, U.; Kuzlu, M.; Sarp, S. Uncertainty aware deep learning model for secure and trustworthy channel estimation in 5G networks. In Proceedings of the 2023 12th Mediterranean Conference on Embedded Computing (MECO), Budva, Montenegro, 6–10 June 2023; pp. 1–4. [Google Scholar] [CrossRef]
  187. Kim, K.; Munir, Y.T.M.S.; Saad, W.; Hong, C.S. Deep reinforcement learning for channel estimation in RIS-aided wireless networks. IEEE Commun. Lett. 2023, 27, 2053–2057. [Google Scholar] [CrossRef]
  188. Mismar, F.B.; Evans, B.L.; Alkhateeb, A. Deep reinforcement learning for 5G networks: Joint beamforming, power control, and interference coordination. IEEE Trans. Commun. 2019, 68, 1581–1592. [Google Scholar] [CrossRef]
  189. Challita, U.; Saad, W.; Bettstetter, C. Cellular-connected UAVs over 5G: Deep reinforcement learning for interference management. arXiv 2018, arXiv:1801.05500. [Google Scholar] [CrossRef]
  190. Park, K.; Kim, H.; Kwon, D.; Kim, H.; Kang, H.; Shin, M.-H.; Kim, J.; Hur, W. The reinforcement learning based interference whitening scheme for 5G. In Proceedings of the 2021 IEEE 93rd Vehicular Technology Conference (VTC2021-Spring), Helsinki, Finland, 25–28 April 2021; pp. 1–5. [Google Scholar] [CrossRef]
  191. Abdallah, A.; Celik, A.; Mansour, M.M.; Eltawil, A.M. Deep reinforcement learning based beamforming codebook design for RIS-aided mmWave systems. In Proceedings of the 2023 IEEE 20th Consumer Communications & Networking Conference (CCNC), Las Vegas, NV, USA, 8–11 January 2023; pp. 1020–1026. [Google Scholar] [CrossRef]
  192. Chu, M.; Liu, A.; Lau, V.K.N.; Jiang, C.; Yang, T. Deep reinforcement learning based end-to-end multiuser channel prediction and beamforming. IEEE Trans. Wireless Commun. 2022, 21, 10271–10285. [Google Scholar] [CrossRef]
  193. Eappen, G.; Cosmas, J.; Nilavalan, R.; Thomas, J. Deep learning integrated reinforcement learning for adaptive beamforming in B5G networks. IET Commun. 2022, 16, 2454–2466. [Google Scholar] [CrossRef]
  194. Moudoud, H.; Cherkaoui, S. Empowering security and trust in 5G and beyond: A deep reinforcement learning approach. IEEE Open J. Commun. Soc. 2023, 4, 2410–2420. [Google Scholar] [CrossRef]
  195. Wang, X.; Xu, Y.; Chen, J.; Li, C.; Liu, X.; Liu, D.; Xu, Y. Mean field reinforcement learning based anti-jamming communications for ultra-dense internet of things in 6G. In Proceedings of the 2020 International Conference on Wireless Communications and Signal Processing (WCSP), Nanjing, China, 21–23 October 2020; pp. 195–200. [Google Scholar] [CrossRef]
  196. Yang, H.; Xiong, Z.; Zhao, J.; Niyato, D.; Xiao, L.; Wu, Q. Deep reinforcement learning-based intelligent reflecting surface for secure wireless communications. IEEE Trans. Wireless Commun. 2020, 20, 375–388. [Google Scholar] [CrossRef]
  197. Stylianopoulos, K.; Merluzzi, M.; Lorenzo, P.D.; Alexandropoulos, G.C. Lyapunov-driven deep reinforcement learning for edge inference empowered by reconfigurable intelligent surfaces. In Proceedings of the ICASSP 2023—2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 4–10 June 2023; pp. 1–5. [Google Scholar] [CrossRef]
  198. Liu, M.; Wang, R.; Xing, Z.; Soto, I. Deep reinforcement learning based dynamic power and beamforming design for time-varying wireless downlink interference channel. In Proceedings of the 2022 IEEE Wireless Communications and Networking Conference (WCNC), Austin, TX, USA, 10–13 April 2022; pp. 471–476. [Google Scholar] [CrossRef]
  199. Seguin, M.; Omer, A.; Koosha, M.; Malandra, F.; Mastronarde, N. Deep reinforcement learning for downlink scheduling in 5G and beyond networks: A review. In Proceedings of the 2023 IEEE 34th Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC), Toronto, ON, Canada, 5–8 September 2023; pp. 1–6. [Google Scholar] [CrossRef]
  200. Tang, F.; Mao, B.; Kawamoto, Y.; Kato, N. Survey on machine learning for intelligent end-to-end communication toward 6G: From network access, routing to traffic control and streaming adaption. IEEE Commun. Surv. Tutor. 2021, 23, 1578–1598. [Google Scholar] [CrossRef]
  201. Rihan, M.; Zappone, A.; Buzzi, S.; Fodor, G.; Debbah, M. Passive vs. active reconfigurable intelligent surfaces for integrated sensing and communication: Challenges and opportunities. IEEE Netw. 2023, 38, 218–226. [Google Scholar] [CrossRef]
  202. Hasan, M.K.; Habib, A.A.; Islam, S.; Safie, N.; Ghazal, T.M.; Khan, M.A.; Alzahrani, A.I.; Alalwan, N.; Kadry, S.; Masood, A. Federated learning enables 6G communication technology: Requirements, applications, and integrated with intelligence framework. Alex. Eng. J. 2024, 91, 658–668. [Google Scholar] [CrossRef]
Figure 1. Evolution of mobile communication networks from 1G to 6G and the role of ML [8].
Figure 2. AI-based learning approaches in next-generation wireless networks [10].
Figure 3. Practical implementations of 5G technology [20].
Figure 4. Supervised learning workflow.
Figure 5. Unsupervised learning model.
Figure 6. Reinforcement learning framework.
Table 1. Strengths and weaknesses of supervised ML techniques.
| ML Method | Strengths | Weaknesses | Real-World Limitations |
|---|---|---|---|
| Linear Regression [89,92] | Simple and fast; easy to interpret; performs well for linearly separable data | Poor performance with non-linear data; sensitive to outliers | Limited applicability in dynamic, non-linear network environments; scalability issues with high-dimensional network data |
| Logistic Regression [86,90] | Effective for binary classification; outputs probability estimates | Assumes a linear relationship; limited to linearly separable classes | Difficulty handling complex network dynamics; challenges with data sparsity in large-scale networks |
| Decision Tree [84,88] | Easy to understand and visualize; handles non-linear relationships | Prone to overfitting; unstable with small data changes | Overfits noisy, complex network data; requires frequent retraining in dynamic networks |
| Random Forest [85,87] | Reduces overfitting; handles high-dimensional data; robust to noise | Less interpretable; computationally intensive | High computational cost makes it impractical for real-time networks; performance can degrade in large-scale networks |
| KNN [81,93] | Simple to implement; no training phase required; good for multi-class problems | Slow for large datasets; sensitive to irrelevant features and noise | Inefficient for large-scale network data; computationally expensive during inference; struggles with real-time applications on large data |
| SVM [74] | Effective in high-dimensional spaces; works well with clear margin separation | Not suitable for large datasets; kernel parameters are hard to tune | Scalability challenges in real-time network deployments; parameter tuning is difficult in dynamic networks |
| Naive Bayes [76,77] | Fast and scalable; works well with high-dimensional data | Assumes feature independence; poor performance with correlated data | Difficulty with correlated features in real-world data; the feature-independence assumption is unrealistic in networks |
| ANN [63,91] | Can model complex patterns; scalable for large datasets | Requires large amounts of data; computationally expensive; less interpretable | Demands large datasets; high cost limits use in edge computing or low-resource environments; interpretability issues in mission-critical applications |
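To ground the trade-offs in Table 1, the following minimal Python sketch trains a shallow decision tree on a toy link-adaptation task. Everything here is an illustrative assumption rather than a method from the cited works: the features (SINR, user speed, cell load), the labeling rule, and the hyperparameters are synthetic stand-ins chosen only to demonstrate the supervised workflow of Figure 4.

```python
# Minimal sketch: decision tree for a toy link-adaptation task.
# All data and thresholds below are synthetic/hypothetical.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n = 2000

# Hypothetical features: [SINR (dB), user speed (m/s), cell load (0-1)]
X = np.column_stack([
    rng.uniform(-5, 30, n),   # SINR
    rng.uniform(0, 30, n),    # speed
    rng.uniform(0, 1, n),     # load
])

# Hypothetical label: use a "high" modulation-and-coding scheme (1) when
# the SINR is strong and the cell is lightly loaded, otherwise "low" (0).
y = ((X[:, 0] > 15) & (X[:, 2] < 0.7)).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# A shallow tree is fast and interpretable (cf. Table 1 strengths) but would
# need retraining as channel statistics drift (cf. Table 1 limitations).
clf = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_tr, y_tr)
print(f"test accuracy: {accuracy_score(y_te, clf.predict(X_te)):.3f}")
```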
Table 3. Strengths and weaknesses of reinforcement learning techniques.
| RL Method | Strengths | Weaknesses | Real-World Limitations |
|---|---|---|---|
| Q-Learning [119,120] | Simple to implement; model-free; good for discrete action spaces | Not suitable for large or continuous state spaces; convergence can be slow | Slow convergence in dynamic networks; not ideal for real-time, large-scale networks; limited in complex, evolving environments |
| SARSA [123,124] | On-policy learning; safer exploration | Slower learning; sensitive to policy changes | Slow adaptation to network changes; less efficient in unstable networks |
| DQN [121,122] | Handles high-dimensional state spaces; effective in complex environments | Requires large memory; sensitive to hyperparameters; requires significant computational power | High memory demand limits use on resource-constrained devices; requires significant computational resources; not suitable for real-time deployment |
| Policy Gradient [127,128] | Suitable for continuous action spaces; can learn stochastic policies | High variance in updates; requires careful tuning of the learning rate | High variance can lead to instability; sensitive to tuning parameters in real-time applications |
| Actor-Critic [129,130] | Combines value-based and policy-based methods; lower variance than pure policy gradients | Complex implementation; stability issues during training | High computational cost and complexity; can be unstable in dynamic environments |
| PPO [131,135] | Stable and efficient; easy to implement and tune; widely used in practice | May be sample-inefficient; computationally intensive | Inefficient in real-time, large-scale networks; high computational cost for real-time deployment; requires significant resources for adaptation |
| Monte Carlo [132,134] | Simple to implement; no need for an environment model; suitable for episodic tasks | High variance in returns; inefficient for long episodes | Inefficient for long-term network planning; better suited to episodic tasks than continuous networks; not ideal for dynamic network environments |
| Temporal Difference [131,133] | Learns online; more efficient than Monte Carlo | May converge slowly; requires balanced exploration–exploitation | Slow convergence in rapidly changing networks; the exploration–exploitation balance is challenging in fluctuating network states |
| Model-Based RL [133,134] | Sample-efficient; can simulate future states; useful for planning | Requires an accurate environment model; high modeling complexity | Needs precise models, which may not be available; high complexity makes real-time deployment challenging; requires significant computational resources |
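The Q-learning entry in Table 3 can be made concrete in a few lines. The sketch below runs tabular Q-learning on a hypothetical three-channel selection problem in which the state is simply the last channel used; the success probabilities, learning rate, discount factor, and exploration rate are toy values chosen to expose the off-policy temporal-difference update, not a deployable scheme.

```python
# Minimal sketch: tabular Q-learning for a toy channel-selection task.
# The 3-channel environment and its success probabilities are hypothetical.
import numpy as np

rng = np.random.default_rng(1)

n_channels = 3
success_prob = np.array([0.2, 0.5, 0.8])  # hypothetical per-channel quality

alpha, gamma, eps = 0.1, 0.9, 0.1          # step size, discount, exploration
Q = np.zeros((n_channels, n_channels))     # state = last channel used

state = 0
for step in range(5000):
    # Epsilon-greedy action selection
    if rng.random() < eps:
        action = int(rng.integers(n_channels))
    else:
        action = int(np.argmax(Q[state]))
    # Reward: 1 if the transmission on the chosen channel succeeds
    reward = float(rng.random() < success_prob[action])
    next_state = action
    # Q-learning (off-policy temporal-difference) update
    Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
    state = next_state

print("greedy channel per state:", Q.argmax(axis=1))  # expected: channel 2
```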
Table 4. Supervised machine learning techniques for 5G and 6G wireless networks.
| Reference | ML Technique | Objective | Description |
|---|---|---|---|
| [137] | Graph Neural Network | Resource Allocation | Optimize efficiency and system sum rate |
| [139] | Random Forest Algorithm | Resource Allocation | Optimize throughput and energy efficiency |
| [140] | Deep Learning | Power Allocation | Optimize power allocation in massive MIMO systems |
| [144] | Deep Learning | Power and Channel Allocation | Improve power and channel allocation in heterogeneous networks |
| [145] | Deep Learning | Interference Management | Enhance overall network performance and sum rate |
| [147] | Recurrent Neural Network | Beam Selection | Optimize beam prediction accuracy |
| [149] | Deep Neural Network | Beam Selection | Enable efficient beam selection |
| [150] | Decision Tree | Security | Optimize trust management mechanisms |
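As a rough illustration of the supervised beam-selection studies in Table 4 [147,149], the sketch below trains a small neural network to map low-band channel features to one of eight beam sectors. The geometry, the sub-6 GHz features, and the sector labels are hypothetical simplifications; works such as [149] use measured or ray-traced channels rather than this toy model.

```python
# Minimal sketch: a classifier mapping low-band features to a beam index.
# The synthetic geometry and features below are illustrative assumptions.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(6)

n, n_beams = 3000, 8
angle = rng.uniform(-60, 60, n)               # hypothetical user angle (deg)

# Hypothetical sub-6 GHz features correlated with the angle, plus noise
X = np.column_stack([np.sin(np.radians(angle)),
                     np.cos(np.radians(angle)),
                     rng.normal(0, 0.05, n)])

# Best beam = the angular sector the user falls into (toy labeling rule)
y = np.digitize(angle, np.linspace(-60, 60, n_beams + 1)[1:-1])

clf = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=800, random_state=0)
clf.fit(X[:2400], y[:2400])
print(f"top-1 beam accuracy: {clf.score(X[2400:], y[2400:]):.3f}")
```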
Table 5. Unsupervised machine learning methods for 5G and 6G wireless networks.
| Reference | ML Technique | Objective | Description |
|---|---|---|---|
| [152] | Unsupervised Deep Learning (objective-based DNN) | Resource Allocation | Maximize the system sum rate |
| [153] | Unsupervised Graph Neural Networks | Resource Management | Optimize resource allocation efficiency |
| [154] | Unsupervised Dynamic Clustering | Resource Management | Improve energy efficiency |
| [156] | Unsupervised Agg-GNN | Resource Allocation | Joint optimization of power and resource allocation |
| [157] | Unsupervised Deep Learning | Power Allocation | Optimize power allocation in cloud radio access networks (CRAN) |
| [160] | K-Means Clustering | Channel Allocation | Channel estimation and grouping |
| [161] | Unsupervised Deep Learning | Channel Estimation | Optimize resource allocation for accurate channel estimation |
| [174] | Unsupervised Score-Based Learning | Channel Estimation | Improve wireless channel estimation |
| [164] | Unsupervised Deep Learning | Interference Management | Enhance interference mitigation performance |
| [175] | Unsupervised Deep Learning | Beamforming | Improve energy and spectrum efficiency |
| [170] | Gaussian Mixture Model | Security | Enhance overall network security |
| [155] | Unsupervised LSTM | Anomaly Detection | Optimize anomaly detection |
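To illustrate the clustering entry of Table 5 [160], the following sketch performs pilot-free channel-gain estimation on a toy flat-fading QPSK link: the received symbol cloud is clustered with K-means, and the centroid magnitudes recover |h| without any pilots. The single-tap channel, noise level, and constellation are illustrative assumptions; note that blind methods of this kind leave a 90° phase ambiguity from the QPSK symmetry.

```python
# Minimal sketch: clustering-based, pilot-free channel estimation (toy model).
# The flat-fading QPSK setup and noise level are hypothetical assumptions.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)

# True (unknown) complex channel gain and the unit-energy QPSK constellation
h = 0.8 * np.exp(1j * 0.6)
qpsk = np.exp(1j * (np.pi / 4 + np.pi / 2 * np.arange(4)))

# Received symbols: y = h * s + noise, with no pilots transmitted
s = qpsk[rng.integers(4, size=4000)]
y = h * s + 0.05 * (rng.standard_normal(4000) + 1j * rng.standard_normal(4000))

# Cluster the received cloud; centroids approximate h * constellation points
pts = np.column_stack([y.real, y.imag])
centroids = KMeans(n_clusters=4, n_init=10, random_state=0).fit(pts).cluster_centers_
c = centroids[:, 0] + 1j * centroids[:, 1]

# |h| is recoverable directly; the phase only up to the 90-degree ambiguity
print(f"true |h| = {abs(h):.3f}, estimated |h| = {np.mean(np.abs(c)):.3f}")
```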
Table 6. Reinforcement learning techniques for 5G and 6G wireless networks.
| Reference | ML Technique | Objective | Description |
|---|---|---|---|
| [176] | Deep Transfer Reinforcement Learning | Resource Allocation | Optimize resource allocation under URLLC and eMBB requirements |
| [185] | Multi-Agent Reinforcement Learning | Channel Allocation | Improve communication efficiency and system performance |
| [125] | Deep Reinforcement Learning | Resource Allocation | Learn resource allocation policies that maximize throughput |
| [178] | Deep Reinforcement Learning | Resource Allocation | Improve energy and spectrum efficiency |
| [179] | Deep Reinforcement Learning | Resource Allocation | Enhance overall network performance |
| [182] | Multi-Agent Deep Reinforcement Learning | Power Allocation | Optimize power allocation with reduced computational complexity |
| [183] | Multi-Agent Reinforcement Learning | Channel Estimation and Energy Efficiency | Joint optimization of beamforming, channel allocation, and power control |
| [187] | Deep Reinforcement Learning with Autoencoder | Channel Estimation | Accurate channel estimation in reconfigurable intelligent surface (RIS) systems |
| [188] | Deep Reinforcement Learning | Interference Management | Joint optimization of power control, beamforming, and interference mitigation |
| [197] | Lyapunov-Driven Deep Reinforcement Learning | Interference Management | Enable low-latency, energy-efficient, and accurate inference for RIS-aided networks |
| [198] | Deep Reinforcement Learning | Beamforming | Joint optimization of power control and beamforming |
| [193] | Deep Learning–Integrated Reinforcement Learning | Beamforming | Optimize beam direction selection |
| [194] | Deep Reinforcement Learning | Security | Enhance security in IoT-enabled 5G and 6G networks |
| [195] | Multi-Agent Reinforcement Learning | Security | Improve network resilience against security threats |
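Many of the deep RL schemes in Table 6 build on the policy-gradient principle. The sketch below applies REINFORCE, a basic policy-gradient method, to a hypothetical one-state power-control problem with a softmax policy over four discrete power levels. The Shannon-style utility and the energy-cost weight are illustrative assumptions, not the formulations used in the cited papers; a running-mean baseline is included to reduce gradient variance.

```python
# Minimal sketch: REINFORCE policy gradient for toy power control.
# The reward model (rate minus a power penalty) is a hypothetical stand-in.
import numpy as np

rng = np.random.default_rng(3)

power_levels = np.array([0.1, 0.5, 1.0, 2.0])  # hypothetical choices (W)
theta = np.zeros(len(power_levels))            # softmax policy parameters
lr, baseline = 0.05, 0.0

def reward(p, noise=0.1):
    # Toy utility: Shannon-style rate minus a linear energy cost
    return np.log2(1.0 + p / noise) - 1.5 * p

for step in range(3000):
    pi = np.exp(theta - theta.max())
    pi /= pi.sum()
    a = rng.choice(len(power_levels), p=pi)
    r = reward(power_levels[a])
    baseline += 0.01 * (r - baseline)          # running-mean baseline
    grad_logpi = -pi
    grad_logpi[a] += 1.0                       # d log softmax / d theta
    theta += lr * (r - baseline) * grad_logpi  # REINFORCE update

print("learned power choice:", power_levels[int(theta.argmax())])
```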
Table 7. Comparative analysis of ML paradigms across 5G and 6G wireless network optimization domains.
| Optimization Problem | Recommended ML Method | Discussion |
|---|---|---|
| Link adaptation [26,140] | Supervised learning (DNN, tree-based) | Labels can be obtained from measurements/simulators; fast inference; learns nonlinear physical-layer mappings better than hand-crafted rules |
| Channel estimation/CSI denoising [159,162] | CNNs/Transformers (supervised or self-supervised) | High-dimensional structured inputs; deep feature extraction handles noise and non-idealities better than linear estimators |
| Beam selection/beam management [149,168] | Supervised and sequence models (RNN/Transformer) | Decisions depend on context (mobility/blockage); supervised learning gives a stable baseline; bandits adapt online with limited exploration |
| Scheduling [199] | Imitation learning/supervised from solvers; constrained RL when needed | Hard real-time constraints; supervised learning is stable and low-latency; RL helps when long-horizon objectives dominate |
| Power control/interference coordination [126,152,188] | GNNs and supervised learning; multi-agent RL for coordination | Interference is relational (graph-structured); GNNs generalize across topologies; multi-agent RL captures coupled multi-cell decisions |
| Load balancing/cell association [63,138] | Learning-to-optimize (supervised); RL (long-horizon) | Trades immediate throughput against long-term congestion; RL models delayed effects; L2O approximates solver outputs efficiently |
| Handover/mobility management [73,115] | RL/contextual bandits and sequence modeling | Sequential decision with delayed reward; bandits are lightweight; sequence models capture mobility patterns |
| Traffic prediction/demand forecasting [117,192] | LSTMs, Transformers; hybrid statistical and ML | Strong temporal patterns; Transformers capture long dependencies; hybrids improve stability and robustness |
| Network slicing/admission control [125,130] | RL; supervised learning-to-optimize | Explicit service-level agreement constraints; constrained RL optimizes long-term utility under QoS; L2O enables fast allocation decisions |
| Fault detection/anomaly detection [155,171] | Unsupervised/self-supervised (autoencoders, contrastive); tree-based | Labels are scarce; anomaly discovery benefits from representation learning; tree models work well on tabular key performance indicators and are interpretable |
| Security (intrusion/attack detection) [150,194] | Supervised and self-supervised pretraining; graph-based | Attacks are rare/imbalanced; self-supervision improves features; graphs capture flow/host relationships |
| Edge offloading/computation placement [48,181] | RL/combinatorial bandits; supervised approximations | Stochastic and context-dependent (channel, queue, energy); RL/bandits handle uncertainty; supervised models enable low latency |
| RIS/IRS [67,187,191] | Supervised and RL fine-tuning; model-based and learning | Huge action space; supervised learning reduces search; RL fine-tunes under real conditions; learning helps with imperfect CSI |
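For the fault/anomaly-detection row of Table 7 [155,171], a minimal reconstruction-based detector can be sketched with a bottlenecked regressor trained to reproduce "normal" key performance indicator (KPI) vectors; samples with large reconstruction error are flagged as anomalies. The synthetic low-rank KPI model, the linear bottleneck, and the 99th-percentile threshold are all assumptions chosen for illustration.

```python
# Minimal sketch: reconstruction-based anomaly detection on synthetic KPIs.
# The low-rank data model, bottleneck width, and threshold are hypothetical.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(4)

# "Normal" KPI vectors living near a 3-dimensional subspace of R^6
z = rng.normal(size=(1500, 3))
W = rng.normal(size=(3, 6))
normal = z @ W + 0.1 * rng.normal(size=(1500, 6))

# Linear autoencoder: a width-3 bottleneck trained to reproduce its input
ae = MLPRegressor(hidden_layer_sizes=(3,), activation="identity",
                  max_iter=3000, random_state=0)
ae.fit(normal, normal)

def score(x):
    # Reconstruction error serves as the anomaly score
    return np.mean((ae.predict(x) - x) ** 2, axis=1)

threshold = np.percentile(score(normal), 99)   # calibrated on normal data only
faults = rng.normal(size=(10, 6)) * 2.0        # off-subspace "faulty" samples
print("flagged:", int((score(faults) > threshold).sum()), "of", len(faults))
```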
Table 8. Critical analysis of ML in 5G and 6G wireless networks.
| Challenge | Critical Analysis | Discussion |
|---|---|---|
| Computational Complexity [4,63,173,200] | Deep learning models require substantial computational power, which is impractical for resource-constrained wireless devices. Efficient models are needed to balance accuracy with real-time processing. | Optimization techniques such as model pruning and quantization are crucial for minimizing the computational load while maintaining accuracy in resource-constrained environments. |
| Data Availability and Quality [8,11] | ML models depend on high-quality labeled data, which is difficult to obtain due to privacy concerns and data heterogeneity. More data-efficient methods are required for better performance. | Techniques such as synthetic data generation and transfer learning could help mitigate data scarcity and improve model generalization. |
| Scalability [10,63,88] | As networks scale, ML models must adapt to dynamic conditions. Federated learning offers potential but faces challenges in data heterogeneity and model aggregation. | Future research should focus on scalable federated learning and decentralized model updates to address the growing scale of 5G and 6G networks. |
| Security [17,113] | ML models are vulnerable to adversarial attacks, posing risks to network integrity. Developing robust defenses and privacy-preserving solutions is critical. | Adversarial training and robust defense mechanisms are necessary to secure ML systems, especially in mission-critical applications such as autonomous driving. |
| Real-Time Processing and Latency [112,121,161] | Real-time applications, such as autonomous driving, demand low-latency decisions. Edge computing helps, but balancing latency with accuracy remains a challenge. | Edge-based processing and low-latency inference models are essential to meet the stringent requirements of real-time applications. |
| Non-Stationarity [21,131] | Changing network conditions can degrade model performance. Adaptive models are needed to handle non-stationary environments effectively. | Self-adjusting models capable of learning in dynamic, real-time environments are needed to improve the robustness of ML systems in 5G and 6G networks. |
| Overhead and Efficiency [91,202] | ML models introduce computational and communication overhead, especially in RIS-assisted networks. Reducing latency and energy consumption is essential for real-time applications. | Energy-efficient strategies and reduced-overhead models are necessary to optimize the performance of ML applications in resource-constrained environments. |
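Finally, the pruning and quantization remedies named in the first row of Table 8 can be illustrated in a few lines. The sketch below applies symmetric post-training 8-bit quantization to a hypothetical dense layer and measures the relative output error; the layer size, weight distribution, and per-tensor scheme are illustrative assumptions rather than a production pipeline, but they show why int8 storage trades a small accuracy loss for a 4x memory reduction on constrained devices.

```python
# Minimal sketch: symmetric post-training int8 quantization of one layer.
# The layer weights and input below are hypothetical, for illustration only.
import numpy as np

rng = np.random.default_rng(5)
W = rng.normal(0, 0.5, size=(64, 32))   # hypothetical trained layer weights
x = rng.normal(size=(1, 64))            # hypothetical input activation

def quantize_int8(w):
    # Symmetric per-tensor quantization: map [-max|w|, +max|w|] onto int8
    scale = np.abs(w).max() / 127.0
    return np.clip(np.round(w / scale), -127, 127).astype(np.int8), scale

q, scale = quantize_int8(W)
W_deq = q.astype(np.float32) * scale    # dequantize for comparison

full = x @ W
quant = x @ W_deq
rel_err = np.linalg.norm(full - quant) / np.linalg.norm(full)
print(f"int8 storage: 4x smaller than float32; relative output error: {rel_err:.4f}")
```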