Telecom
  • Article
  • Open Access

13 February 2025

A Semi-Distributed Scheme for Mode Selection and Resource Allocation in Device-to-Device-Enabled Cellular Networks Using Matching Game and Reinforcement Learning

1
School of Electrical and Electronic Engineering, Universiti Sains Malaysia, Nibong Tebal 14300, Malaysia
2
Department of Electrical Engineering, Faculty of Engineering, University of Malaya, Kuala Lumpur 50603, Malaysia
*
Author to whom correspondence should be addressed.
This article belongs to the Special Issue Advances in Wireless Communication: Applications and Developments

Abstract

Device-to-Device (D2D) communication is a promising technology that is expected to have a substantial impact on the next generation of wireless communication systems. Fifth-generation (5G) and beyond-5G (B5G) wireless networks must handle an increasing number of connected devices that require higher data rates at relatively low power consumption. In this study, we address the joint mode selection, channel assignment, and power allocation problem in a semi-distributed D2D scheme (SD-scheme) that underlays cellular networks. The objective is to enhance the data rate, Spectrum Efficiency (SE), and Energy Efficiency (EE) of the network while maintaining the performance of cellular users (CUs) by enforcing a data rate threshold for each CU in the network. We propose a centralized approach to the mode selection and channel assignment problems, employing greedy and matching algorithms, respectively, and a State-Action-Reward-State-Action (SARSA)-based reinforcement learning (RL) algorithm for distributed power allocation. Furthermore, the sub-channel of each CU is shared among several D2D pairs, and the optimum power is determined for each D2D pair sharing the same sub-channel, taking into consideration all types of interference in the network. The simulation results show that the proposed scheme outperforms the benchmark schemes in terms of data rate, SE, and EE.

1. Introduction

The growing demand for improved broadband services for mobile devices and the rapid growth of different fields of application, such as vehicle-to-vehicle communication, automating factories, cellular healthcare services, and augmented and virtual reality solutions, require the development of new (B5G) network architectures. These architectures need to be capable of supporting lower energy consumption and higher area capacity compared to current networks [1]. Considering these demands, the problem of limited network capacity is an important challenge for the evolution of emerging wireless networks. In addition to the limited availability of the spectrum, the development of the cellular network is leading to concerning levels of energy consumption [2]. The importance of EE has been growing as a consequence of economic, operational, and environmental considerations [3]. Furthermore, the rapid increase in data rate demands and power-intensive mobile systems and applications, in combination with the scarcity of available spectrum resources, necessitates the exploration of innovative networking solutions.
D2D communication is a promising technology among the emerging B5G solutions, providing potential solutions for the aforementioned demands. D2D communication enables direct transmission of data traffic instantaneously via the D2D transmitter toward a D2D receiver without passing via a cellular gNB [4]. Moreover, it requires low power transmission that enhances EE and allows for the spectrum sharing of the available resources, ultimately improving spectral efficiency. D2D communications utilizing the resources of wireless cellular networks will be crucial in enhancing the capacity of incoming B5G systems [5]. The advantages of implementing D2D communication over wireless cellular networks are becoming widely recognized, particularly for data offloading, content sharing, EE, coverage expansion, and enhanced utilization of the spectrum. Furthermore, an additional demand in B5G networks is the capacity to deal with extensive communications resulting from the rapid increase in connected devices in conventional cellular networks [6].
Mode selection and resource allocation are crucial issues for creating and maintaining direct connections between D2D users within cellular networks. Furthermore, the distribution of network resources among D2D pairs and cellular users can be achieved efficiently, which could improve SE and EE by controlling interferences in the network.
Mode selection is an essential step in D2D communications that can enhance EE and average data rate and reduce interference in the network. It decides whether D2D users operate in direct mode or cellular mode. Flexible mode selection decreases latency and increases spectrum resource utilization. The gNB can assign one of three D2D communication modes to each D2D pair: direct D2D mode (DM), relay-assisted D2D mode (RM), or local route D2D mode (LM) [7].
D2D communication can utilize either a licensed or unlicensed spectrum for channel assignment to establish direct connections, assigned as in-band and out-band D2D communications, respectively [8]. According to out-band D2D communication, users communicate via direct communication utilizing an unlicensed spectrum, separated from that utilized by cellular users in the network. On the other hand, in-band D2D communication may be categorized into two classifications: underlay and overlay. According to the framework of underlaying in-band communication, the spectrum is shared and assigned to D2D pairs and CUs. Moreover, in-band communication overlay identifies specific parts of the whole spectrum for D2D communications and the other portion for CUs. This study investigates the idea of underlaying in-band D2D communication, focusing on improving the system’s performance. However, D2D pairs experience interference when implemented with the underlay in-band architecture, caused by the utilization of shared sub-channels [9]. Consequently, the control of interference becomes a significant research concern within the field of D2D communication networks.
Despite the improvements in cellular networks enabled by D2D communication, several concerns remain. Conventional cellular networks are significantly affected by the interference imposed by D2D communications due to the sharing of cellular user resources. This interference degrades the performance of cellular user communications and could hinder the future development of D2D technology. Therefore, it is essential to find efficient strategies for allocating resources to D2D pairs without excessively increasing the complexity of the network. Following the mode selection procedure, optimal sub-channel resource allocation is necessary to achieve the goal of D2D technology [10].
Power control and interference mitigation should be considered through the resource allocation process. Additionally, to enhance EE and guarantee QoS for both D2D pairs and CUs, appropriate power allocation approaches need to be designed. Blindly implementing power control to D2D communications in a cellular network could reduce system efficiency. D2D transmitters should optimize their transmission power to enhance EE, SINR demands, and system performance.
Machine Learning (ML) has been demonstrated to be highly beneficial in multiple areas due to its ability to precisely predict future scenarios and address complex problems with huge datasets [11]. Artificial intelligence approaches have been indicated to be an excellent technique for tackling complicated non-convex optimization problems in communication networks. In the field of wireless communication networks, reinforcement learning (RL) based on ML techniques has been utilized to solve the issue of power allocation [12].

3. Research Contributions

This paper introduces joint mode selection, channel assignment, and power allocation issues of D2D communications underlaying cellular networks in an uplink scenario, where a single sub-channel can be shared via a CU and several D2D pairs. The main goal of this study is to enhance the data rate, SE, and EE of the proposed network, while simultaneously satisfying the minimal QoS demands for CUs and D2D pairs. The suggested approach is a semi-distributed architecture, in which mode selection and channel assignment issues are centralized, while the power management issue is solved with a distributed technique. The complexity of this technique is lower in comparison to centralized solutions. Firstly, the optimal D2D mode can be achieved by utilizing a greedy algorithm that chooses the best mode among DM, RM, and LM based on maximum SINR. Furthermore, the provided channel assignment method is based on a matching algorithm that takes into consideration the priorities of both the D2D pairs and the CUs, which differs from most of the existing research that mainly investigates the preferences of D2D pairs only. The channel assignment technique based on two-sided preference achieves stable matching with minimal complexity. Moreover, the channel assignment method enables the reuse of a single CU sub-channel throughout several D2D pairs, resulting in higher SE. A higher number of D2D pairs may be served with limited spectrum resources through the implementation of this type of resource-sharing scheme.
In addition, power management is achieved for each D2D pair by applying the SARSA-based RL algorithm. This low-complexity distributed RL algorithm calculates, for each D2D pair, the power that enhances the EE of the network. Several studies have addressed resource allocation among D2D pairs, but they have either insufficiently accounted for mutual interference between D2D users or assumed that D2D pairs transmit at constant power. The proposed approach removes both assumptions. The main contributions of the introduced scheme are as follows:
  • The problem of joint mode selection, channel assignment, and power allocation is formulated for D2D communications underlaying cellular networks by utilizing the uplink resources of CUs. The optimization problem is formulated to enhance the data rate, SE, and EE of the network while considering QoS characteristics related to D2D pairs and CUs simultaneously.
  • By employing a greedy algorithm, a mode selection technique is introduced to choose the optimum mode throughout DM, RM, and LM across every D2D pair in the network. The computational representation is formulated based on the highest SINR.
  • A channel assignment method is introduced, using a matching algorithm to assign the optimal sub-channel to the D2D pairs in the network. The channel assignment approach based on two-sided preference lists provides stable matching with low complexity. The first preference list consists of the data rates of D2D pairs arranged in descending order, starting from the highest value. The second preference list consists of the interference effect of CUs on D2D pairs when they share the same sub-channel, arranged in ascending order, starting from the lowest impact value.
  • A low-complexity distributed SARSA-based RL algorithm is implemented to address the issue of power control and allocate the optimum power level for each D2D pair to enhance the EE of the network.
  • The effectiveness of the suggested method has been shown by simulations, particularly in terms of the data rate, SE, and EE of the network in comparison to conventional systems.
The remainder of this paper is organized as follows: Section 4 provides a comprehensive clarification about the system model. Section 5 defines the mode selection, channel assignment, and power allocation problem. Section 6 discusses the challenge and illustrates the methodologies utilized for joint mode selection, channel assignment, and the power allocation scheme. Section 7 presents and discusses the simulation parameters as well as the simulation results. Section 8 concludes the paper.

4. System Model

In this study, we explore the principle of the D2D communication scheme underlaying cellular networks, which involves spectrum sharing across multiple D2D users. For every D2D pair, we analyze three different modes, which include DM, RM, and LM, as illustrated in Figure 1. The DM enables direct transmission of data from the transmitter to the receiver of each D2D pair. Moreover, the second D2D mode is RM which aims to establish relay communication between two distant devices. An idle user is used as a relay node in this mode to facilitate the creation of a connection between the transmitter and receiver of the D2D pairs. Regarding the LM, it is suggested that auxiliary antennas be installed on the gNB to enhance the connections of the LM [31]. The data are transferred via the gNB instead of entering the core, which means that the gNB acts as a node to support the connections of faraway users to establish a D2D pair.
Figure 1. System model.
In this study, we examine an environment where $N$ represents the number of D2D pairs. Furthermore, $T_n$, $R_n$, and $RU_n$ represent the D2D transmitter, receiver, and relay of pair $n$, respectively. Let us suppose $W$ is the total bandwidth in the network, which is partitioned into a number of sub-channels denoted $K$. Table 1 illustrates the scheme symbols.
Table 1. Scheme symbols.
Let $P_n = \{p_1, p_2, \ldots, p_n, \ldots, p_N\}$, $P_C$, and $P_Z$ denote the transmission power of $T_n$, the CU, and the gNB, respectively. The transmitted power for every D2D pair $P_n$ is determined from the set of available power levels, ranging from $p_{min}$ to $p_{max}$, while the transmission power of CUs is supposed to be fixed. In the proposed scheme, $\Pi$ is the binary mode sub-channel assignment indicator matrix with $\Pi \in \{0, 1\}$, where $\Pi_{n,k}^{1}$, $\Pi_{n,k}^{2}$, and $\Pi_{n,k}^{3}$ represent the binary mode sub-channel assignment indicators for DM, RM, and LM, respectively. If the $n$th D2D pair is utilizing sub-channel $k$ of CU $m$, then $\Pi = 1$; otherwise, $\Pi = 0$. The SINR of the $n$th D2D pair utilizing the $k$th shared sub-channel of the $m$th CU at time slot $t$ is given as follows:
$$\Upsilon_{n,k}^{t} = \frac{P_n G_{TR}^{t}\,\Pi_{n,k}^{1} + \left(P_n G_{TRU}^{t} + P_n^{t} G_{RUR}^{t}\right)\Pi_{n,k}^{2} + \left(P_n G_{TZ}^{t} + P_Z G_{ZR}^{t}\right)\Pi_{n,k}^{3}}{I_{D2D} + P_C^{t} G_{mn} + \sigma},$$
where $G_{TR}$, $G_{TRU}$, and $G_{TZ}$ represent the channel gain from $T_n$ to $R_n$, $RU_n$, and the gNB, respectively. $G_{RUR}$ and $G_{ZR}$ refer to the channel gain from $RU_n$ and the gNB to $R_n$. Furthermore, $G_{mn}$ refers to the channel gain between the $n$th D2D pair and the $m$th CU, and $\sigma$ denotes the noise power. The expression $I_{D2D}$ can be written as follows:
$$I_{D2D} = I_D + I_R + I_L,$$
where $I_D$, $I_R$, and $I_L$ describe the interference experienced by the $n$th D2D pair in DM, RM, and LM caused by the gNB, other D2D transmitters $T_n$, and other relays $RU_n$. The mathematical representation of $I_D$, $I_R$, and $I_L$ is as follows:
$$I_D = P_Z G_{ZR}^{t}\,\Pi_{n,k}^{3} + \sum_{n=1}^{N} P_n G_{TR}\,\Pi_{n,k}^{1} + \sum_{n=1}^{N} P_n G_{RUR}^{t}\,\Pi_{n,k}^{2},$$
$$I_R = \left(P_Z G_{ZRU}^{t} + P_Z G_{ZR}^{t}\right)\Pi_{n,k}^{3} + \sum_{n=1}^{N}\left(P_n G_{TRU}^{t} + P_n G_{TR}^{t}\right)\Pi_{n,k}^{1} + \sum_{n=1}^{N}\left(P_n G_{RURU}^{t} + P_n G_{RUR}^{t}\right)\Pi_{n,k}^{2},$$
$$I_L = \sum_{n=1}^{N}\left(P_n G_{TZ}^{t} + P_n G_{TR}^{t}\right)\Pi_{n,k}^{1} + \sum_{n=1}^{N}\left(P_n G_{RUZ}^{t} + P_n G_{RUR}^{t}\right)\Pi_{n,k}^{2},$$
where $G_{TR}$, $G_{TRU}$, and $G_{TZ}$ are the channel gain from other D2D transmitters to $R_n$, $RU_n$, and the gNB, respectively. Moreover, $G_{RUR}$, $G_{RURU}$, and $G_{RUZ}$ represent the channel gain from other D2D relays to $R_n$, $RU_n$, and the gNB, respectively. The SINR of the $m$th CU utilizing the $k$th sub-channel at time slot $t$ may be expressed as follows:
$$\Upsilon_{m,k}^{t} = \frac{P_C^{t} G_{mn}}{\sum_{n=1}^{N} p_n^{t} G_n + \sum_{n=1}^{N} p_n^{t} G_{RUm} + \sigma},$$
The data rate of D2D pair $n$ utilizing the uplink $k$th sub-channel can be determined at time slot $t$ as follows:
$$R_{n,k}(t) = W \log_2\left(1 + \Upsilon_{n,k}(t)\right),$$
Furthermore, the data rate of CU $m$ utilizing the $k$th channel can be determined at time slot $t$ as follows:
$$R_{m,k}(t) = W \log_2\left(1 + \Upsilon_{m,k}(t)\right),$$
SE shows the effectiveness of using the available spectrum in terms of the data rate obtained regarding a given bandwidth. Thus, the SE for the $n$th D2D communication pair can be given as follows:
$$SE_{n,k}(t) = \frac{\sum_{n=1}^{N}\sum_{k=1}^{K} R_{n,k}(t)}{W},$$
Based on the obtained data rate and energy consumption, the EE of the D2D communications scheme at time slot $t$ is given as follows:
$$EE_{n,k}(t) = \frac{\sum_{n=1}^{N}\sum_{k=1}^{K} R_{n,k}^{t}}{\sum_{n=1}^{N} p_n^{t} + p_{cir}},$$
where $p_{cir}$ denotes the D2D pair circuit power consumption.
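The rate, SE, and EE definitions above can be evaluated numerically once the SINR terms are known. The following is a minimal Python sketch under simplifying assumptions: the function names and argument layout are hypothetical, and the desired-signal and interference powers are passed in as already-computed scalars rather than derived from the channel gains.

```python
import math

def d2d_sinr(signal_power, d2d_interference, cu_interference, noise_power):
    """SINR of a D2D pair: desired signal over (I_D2D + CU interference + noise)."""
    return signal_power / (d2d_interference + cu_interference + noise_power)

def data_rate(bandwidth_hz, sinr):
    """Shannon-type rate W * log2(1 + SINR), in bit/s."""
    return bandwidth_hz * math.log2(1.0 + sinr)

def spectral_efficiency(rates, bandwidth_hz):
    """Sum of the D2D rates divided by the bandwidth W."""
    return sum(rates) / bandwidth_hz

def energy_efficiency(rates, tx_powers, circuit_power):
    """Sum of the D2D rates divided by total transmit power plus circuit power."""
    return sum(rates) / (sum(tx_powers) + circuit_power)

# Illustrative values only (not taken from the simulation parameters).
rates = [data_rate(180e3, 3.0), data_rate(180e3, 1.0)]
se = spectral_efficiency(rates, 180e3)
ee = energy_efficiency(rates, [0.1, 0.1], 0.05)
```

Keeping the SINR inputs as scalars makes the dependence of SE and EE on the transmit powers explicit, which is the quantity the power allocation stage later adjusts.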

5. Problem Formulation

In this study, an optimization problem in D2D networks is investigated, specifically concentrating on joint mode selection, channel assignment, and power allocation. In our proposed system, D2D users choose among the three available D2D modes (DM, RM, or LM) based on maximum SINR. The network is assumed to be fully loaded, so no dedicated channels are available for D2D pairs to utilize. The optimum power level is selected from the range $P_{min}$ to $P_{max}$. This paper aims to optimize the sum data rate, SE, and EE of the proposed D2D communications scheme while guaranteeing the QoS demands for both D2D pairs and CUs. The optimization problem is formulated as follows:
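$$\max_{M,\,\Pi,\,P}\ \sum_{n=1}^{N}\sum_{k=1}^{K} R_{n,k}^{t},\quad SE_{n,k}(t),\quad \text{and}\quad EE_{n,k}(t), \tag{11}$$
s.t.
$$\sum_{n=1}^{N}\sum_{k=1}^{K} R_{n,k}^{t} \ge R_{min}^{th}(t), \qquad \forall k \in \{1, \ldots, K\}, \tag{11a}$$
$$\sum_{n} M = 1, \qquad \forall n \in N, \tag{11b}$$
$$\sum_{n=1}^{N}\sum_{k=1}^{K} \Pi_{n,k}^{1},\ \Pi_{n,k}^{2},\ \text{or}\ \Pi_{n,k}^{3} \le K, \qquad \forall k \in \{1, \ldots, K\}, \tag{11c}$$
$$\sum_{m} \Pi_{m} = 1, \qquad \forall m \in UEs, \tag{11d}$$
$$\sum_{n=1}^{N} \Pi_{k,n} \le 3, \qquad \forall n \in N, \tag{11e}$$
$$P_{min} \le P_n \le P_{max}, \qquad \forall n \in N,\ RU, \tag{11f}$$
$$P_C,\ P_Z = P_{max}, \tag{11g}$$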
Constraint (11a) specifies the minimal data rate for the $n$th D2D pair in shared sub-channel $k$. Constraint (11b) denotes that each D2D pair $n$ chooses one mode among the D2D modes DM, RM, or LM. Constraint (11c) indicates that the binary mode sub-channel indicator matrix for each D2D pair, including DM, RM, and LM, is less than or equal to the total number of sub-channels $K$. Moreover, constraint (11d) indicates that every cellular user $m$ utilizes a distinct sub-channel $k$. Constraint (11e) indicates that each sub-channel may be utilized a maximum of three times. Constraints (11f) and (11g) denote that $T_n$, CUs, and the gNB utilize specific transmission power.
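As an illustration of how some of these constraints could be checked for a candidate allocation, the sketch below tests the minimum-rate, power-range, and sub-channel reuse conditions. It is a simplified reading, not the authors' procedure: the function name, the dictionary layout, and the per-pair interpretation of the rate threshold are assumptions made for the example.

```python
from collections import Counter

def is_feasible(rates, powers, channel_of, r_min, p_min, p_max, max_reuse=3):
    """Check a candidate allocation against simplified versions of (11a), (11e), (11f).

    rates[n]      : achieved rate of D2D pair n (hypothetical layout)
    powers[n]     : transmit power chosen for D2D pair n
    channel_of[n] : sub-channel assigned to D2D pair n
    """
    # (11a), read per pair: every D2D pair meets the minimum rate threshold.
    rate_ok = all(r >= r_min for r in rates.values())
    # (11f): every D2D transmit power lies within [p_min, p_max].
    power_ok = all(p_min <= p <= p_max for p in powers.values())
    # (11e): no sub-channel is reused by more than max_reuse D2D pairs.
    load = Counter(channel_of.values())
    reuse_ok = all(count <= max_reuse for count in load.values())
    return rate_ok and power_ok and reuse_ok
```

A check of this kind only validates a given allocation; finding a good allocation is the task of the decomposition described next.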
To sufficiently address the optimization problem expressed in (11), it should be divided into two sub-issues: joint mode selection and channel assignment, as well as power management. Since it is an MINLP problem, the optimization problem is NP-hard and involves computational difficulties.

6. Proposed Joint Mode Selection and Resource Allocation Scheme (SD-Scheme)

In the present part, we introduce an SD-scheme underlaying cellular networks. Joint mode selection, channel assignment, and power allocation are considered with the aim of optimizing the sum data rate, SE, and EE. First, the mode selection issue is tackled by employing a greedy algorithm based on maximum SINR to select the optimum mode among DM, RM, and LM for every D2D pair. After that, the matching algorithm is implemented to tackle the problem of sub-channel assignment by exploiting the two-sided preference lists to optimize the utilization of spectrum resources. Finally, the power allocation issue is solved by introducing SARSA-based RL to obtain the optimum power for each D2D pair in the proposed scheme.
Several factors motivated the choice of the SARSA algorithm for this study. Firstly, building an explicit model of the network in the proposed resource allocation scenario for D2D communication is impractical, so the model-free nature of SARSA is particularly advantageous here. Moreover, the SARSA algorithm is well suited to decentralized decision-making and enables agents to learn policies independently, which aligns with the architecture of the proposed network. The proposed approach utilizes a SARSA algorithm to train and update the power level states by considering the environment feedback information.

6.1. Mode Selection and Channel Assignment Scheme (C-Scheme)

Firstly, this study examines a mode selection technique for D2D communication by employing a greedy algorithm. The method focuses on direct, relay, and local route D2D modes to enhance the performance of the D2D scheme. The mode selection issue is tackled based on the maximum SINR to calculate the best mode for each D2D pair, while considering the distance between the transmitter and the receiver. This will guarantee the chosen mode optimizes signal quality while taking into account the physical relationship of the communication devices, thereby maximizing overall performance. However, a threshold level is considered for the distance between the transmitter and the receiver for each D2D communication in the proposed network.
Let us suppose that $M$ is a set of 0 and 1 elements that represent which mode is chosen. The following formula is applied to select the best mode:
$$M_{D2D}^{n} = M_D + M_R + M_L,$$
where $M_D$, $M_R$, and $M_L$ represent the mode selection sets of DM, RM, and LM, respectively. If the data rate in DM is greater than in RM and LM for a given D2D pair $n$, then $M_D = 1$ and $M_R = M_L = 0$, and similarly for the other cases. When a value in the mode selection set $M_{D2D}^{n}$ is 1, the associated mode is selected ($M_D$, $M_R$, or $M_L$), and the data rate of that D2D pair $n$ is added to the sum data rate of the network. Conversely, when a value in the mode selection set $M_{D2D}^{n}$ is 0, that mode contributes nothing to the sum data rate for D2D pair $n$.
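The greedy choice described above amounts to picking, for each D2D pair, the mode with the largest SINR and encoding the result as a one-hot indicator. The sketch below illustrates this; the function name and the dictionary keys are hypothetical, and the distance threshold mentioned earlier is omitted because its exact form is not specified here.

```python
def select_mode(sinr_by_mode):
    """Return one-hot indicators (M_D, M_R, M_L style) for the highest-SINR mode.

    sinr_by_mode maps a mode label ("DM", "RM", "LM") to the SINR that the
    D2D pair would achieve in that mode.
    """
    best_mode = max(sinr_by_mode, key=sinr_by_mode.get)
    return {mode: int(mode == best_mode) for mode in sinr_by_mode}

# Example: the relay-assisted mode offers the highest SINR for this pair.
indicators = select_mode({"DM": 4.0, "RM": 6.5, "LM": 2.1})
# indicators == {"DM": 0, "RM": 1, "LM": 0}
```

Because the data rate grows monotonically with SINR for a fixed bandwidth, selecting by maximum SINR and selecting by maximum rate give the same mode.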
Once the optimal D2D mode is determined in a scenario including DM, RM, and LM, the gNB employs the matching method to assign optimum reused sub-channels for the D2D pairs, which increases the spectrum utilization in the proposed network. This part introduces the model of a channel assignment issue to optimize the sum data rate and SE accordingly. We define the channel assignment formula in which D2D pair n shares the sub-channel k with CU m at time slot t as follows:
$$\Pi = \left(\Pi_{n,k}^{1},\ \Pi_{n,k}^{2},\ \Pi_{n,k}^{3}\right)_{N \times 3K},$$
The vectors $\Pi_{n,k}^{1}$, $\Pi_{n,k}^{2}$, and $\Pi_{n,k}^{3}$ indicate the possibility that the D2D pair is assigned to DM, RM, or LM, using the shared sub-channel $k$ with CUs. For each D2D mode in the proposed scheme, the quota ($Q$) can be determined. $Q$ represents the threshold of mode-channel assignment for every D2D pair within the framework. The properties of $Q$ are as follows:
$$\sum_{k=1}^{K}\sum_{n=1}^{N} \Pi_{n,k}^{1} \le Q_D,\ \ \forall n \in M_D; \qquad \sum_{k=1}^{K}\sum_{n=1}^{N} \Pi_{n,k}^{2} \le Q_R,\ \ \forall n \in M_R; \qquad \sum_{k=1}^{K}\sum_{n=1}^{N} \Pi_{n,k}^{3} \le Q_L,\ \ \forall n \in M_L$$
Based on the above criteria, the overall set of D2D pairs across the modes that are assigned to sub-channel $k$ should not exceed $Q$. In the matching game, the two-sided preference lists can be represented as follows:
$$l_p = \left\{R_{n,k}^{D}(t),\ R_{n,k}^{R}(t),\ R_{n,k}^{L}(t)\right\},$$
$$l_s = P_C^{t} G_{nm},$$
where $l_p$ and $l_s$ denote preference lists that consist of D2D pairs arranged in descending order of data rate and CUs arranged in ascending order of interference effect, respectively.
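A small sketch may help make the two lists concrete. It assumes the per-pair rates and the CU interference terms have already been computed and stored in nested dictionaries; the function name and this data layout are hypothetical choices for illustration.

```python
def build_preference_lists(rates, cu_interference):
    """Build two-sided preference lists for the matching game.

    rates[n][k]           : rate of D2D pair n on sub-channel k
    cu_interference[k][n] : interference between the CU on sub-channel k and pair n
    """
    # l_p: each D2D pair ranks sub-channels by descending data rate.
    l_p = {n: sorted(r, key=r.get, reverse=True) for n, r in rates.items()}
    # l_s: each sub-channel ranks D2D pairs by ascending interference.
    l_s = {k: sorted(i, key=i.get) for k, i in cu_interference.items()}
    return l_p, l_s

rates = {"d1": {"k1": 2.0, "k2": 5.0}, "d2": {"k1": 3.0, "k2": 1.0}}
cu_interference = {"k1": {"d1": 0.4, "d2": 0.1}, "k2": {"d1": 0.2, "d2": 0.3}}
l_p, l_s = build_preference_lists(rates, cu_interference)
# l_p == {"d1": ["k2", "k1"], "d2": ["k1", "k2"]}
# l_s == {"k1": ["d2", "d1"], "k2": ["d1", "d2"]}
```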
During the matching game, every D2D pair within the cell proposes to the sub-channel that ranks highest in its preference list. The gNB then admits the D2D pairs with the highest priority and rejects the remaining pairs. For further clarification, if D2D pair $n$ proposes to sub-channel $k$ based on its greatest utility in $l_p$, then sub-channel $k$ is assigned to D2D pair $n$ according to the least interference impact utility in $l_s$. The matching process continues until all devices in the network are paired.
To provide a more detailed explanation of the resource allocation process, we now describe the matching game-based sub-channel assignment in greater depth.
Matching Process Execution:
  • Each D2D pair initially proposes to the sub-channel that provides the highest data rate based on its preference list.
  • The gNB evaluates all proposals and initially assigns sub-channels to D2D pairs while ensuring that the total number of assigned pairs does not exceed the predefined threshold $Q$.
  • If a sub-channel receives multiple proposals, the gNB selects the D2D pairs that maximize SE and rejects lower priority requests.
  • Rejected D2D pairs then propose to their next preferred sub-channel, and this process iterates until a stable matching is achieved, meaning no further changes can improve the overall network performance.
This iterative matching ensures an efficient and interference-aware sub-channel allocation strategy that enhances both spectral efficiency and system stability. The complete mode selection and channel assignment approach is detailed in Algorithm 1.
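The propose-and-reject procedure outlined above resembles a many-to-one deferred acceptance matching. The sketch below is one possible rendering under stated assumptions, not the exact procedure of Algorithm 1: the function name is hypothetical, every sub-channel is assumed to rank all D2D pairs, a single quota applies to every sub-channel, and the sub-channel side uses its preference list as the priority rule.

```python
def match_subchannels(pair_prefs, channel_prefs, quota):
    """Many-to-one deferred acceptance: D2D pairs propose, sub-channels keep the best `quota`.

    pair_prefs[n]    : sub-channels ordered from most to least preferred by pair n
    channel_prefs[k] : D2D pairs ordered from most to least preferred by sub-channel k
    """
    rank = {k: {n: i for i, n in enumerate(prefs)} for k, prefs in channel_prefs.items()}
    next_choice = {n: 0 for n in pair_prefs}   # index of the next sub-channel to try
    assigned = {k: [] for k in channel_prefs}
    free = list(pair_prefs)
    while free:
        n = free.pop(0)
        if next_choice[n] >= len(pair_prefs[n]):
            continue  # pair n has exhausted its list and stays unmatched
        k = pair_prefs[n][next_choice[n]]
        next_choice[n] += 1
        assigned[k].append(n)
        assigned[k].sort(key=lambda p: rank[k][p])
        if len(assigned[k]) > quota:
            free.append(assigned[k].pop())  # reject the least preferred pair
    return assigned

pair_prefs = {"d1": ["k1", "k2"], "d2": ["k1", "k2"], "d3": ["k1", "k2"]}
channel_prefs = {"k1": ["d2", "d3", "d1"], "k2": ["d1", "d2", "d3"]}
result = match_subchannels(pair_prefs, channel_prefs, quota=2)
# result == {"k1": ["d2", "d3"], "k2": ["d1"]}
```

In this example all three pairs first propose to `k1`; the quota of two forces `k1` to reject its least preferred pair `d1`, which then proposes to `k2` and is accepted.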
Algorithm 1. C-scheme algorithm
Input: $M$, $N$, $K$
Output: $\Pi_{n,k}$
1: Initialization: $\Pi_{n,k} = \mathrm{zeros}(N, 3K)$
2: for $m = 1$ to $M$
3:   determine $\Upsilon_{m,k}^{t}$
4:   determine $R_{m,k}(t)$
5: end for
6: for $n = 1$ to $N$
7:   calculate $\Upsilon_{n,k}^{t}$
8:   calculate $R_{n,k}^{t}$, $SE_{n,k}(t)$, $EE_{n,k}(t)$
9: end for
10: find $M_{D2D}^{n}$ based on maximum $\Upsilon_{n,k}^{t}$
11: find $Q$ from the matrix $M_{D2D}^{n}$
12: for $n = 1$ to $N$
13:   for $k = 1$ to $K$
14:     calculate $R_{n,k}^{D}(t)$, $R_{n,k}^{R}(t)$, $R_{n,k}^{L}(t)$ with regard to $M_{D2D}^{n}$
15:     calculate the interference impact of CUs
16:   end for
17: end for
18: sort $l_p$ in descending order
19: sort $l_s$ in ascending order
20: the most preferred sub-channel $k$ is matched by D2D pair $n$ based on $l_p$ and $l_s$
21: if $R_{n,k} \ge R_{th}$
22:   set $\Pi_{n,k} = 1$
23: else
24:   set $\Pi_{n,k} = 0$
25: end if

6.2. Proposed RL-Based Power Allocation Scheme

The power allocation optimization can be achieved using a dynamic distribution scheme. A wireless network with real-time communications demands immediate training and learning of the D2D pairs to provide distributed power allocation without imposing a significant load over the gNB. Thus, ML provides a potential solution with a wide range of applications in the execution of dynamic resource allocation and tackles many challenges associated with prospective communications networks. The transmitter of the D2D pair, performing as the intelligent agent in this scenario, has the ability to learn and make the most suitable decision to enhance the network performance.
One of the most advanced ML techniques is RL. RL utilizes an approach based on trial and error to determine the best resource allocation decisions. Moreover, RL works efficiently without any previous information about the system environment, in contrast to the traditional techniques. RL may enhance performance by facilitating rapid detection of optimum solutions or decisions in comparison with conventional centralized techniques.
SARSA is a reinforcement learning approach utilized to identify the best action in a dynamic resource allocation system. This research presents a SARSA-based approach to solving the issue of power distribution, including the following elements:
Agent: the agent is the D2D transmitter and serves as a crucial element in the power allocation issue.
State: The state of the SARSA algorithm includes essential network information, such as interference levels, channel conditions, and user location. These factors describe the present state of the environment, operating as inputs to an agent’s decision-making in power distribution. In this case, the agent indicates the connection of D2D pairs.
Action: The action is an activity performed by the agent. The power levels selected by the D2D pairs constitute the action, which comprises a range of powers from $P_{min}$ to $P_{max}$.
Reward: The reward function in the SARSA-based power management approach is defined as the EE of each D2D pair within the system. The agent’s interactions with the environment in SARSA are illustrated in Figure 2.
Figure 2. Agent–environment interactions in SARSA.
In state $s_t$, the action $a_t$ is selected, and the reward $R_t$ is allocated to the D2D pair (agent) for each action performed. The agent subsequently transitions to the new state $s_{t+1}$ and executes another action, $a_{t+1}$, for its present state $s_{t+1}$. The pattern $s_t$-$a_t$-$R_t$-$s_{t+1}$-$a_{t+1}$ defines the sequence of steps for the suggested SARSA algorithm. The Q-values are first initialized to zero, and the proposed algorithm then updates the Q-table in accordance with the current policy.
In the SARSA framework, the agent runs through several episodes. At each time step $t$, the agent in state $s_t$ chooses an action $a_t$ based on the ε-greedy strategy. Afterwards, the agent receives the reward $R_t$ and proceeds to the following state $s_{t+1}$, where it chooses the action $a_{t+1}$ according to the Q-table. The state–action update can be represented as:
$$Q(s_t, a_t) = (1 - \alpha)\, Q(s_t, a_t) + \alpha \left[ R_{t+1} + Y\, Q(s_{t+1}, a_{t+1}) \right],$$
Here, $\alpha$ represents the learning rate of the agent, $R_{t+1}$ denotes the reward function of the next state, and $Y$ signifies the discount factor. In the framework of RL, the agent aims to optimize the reward by adopting an optimal policy. The optimum policy can be calculated through the Bellman formula:
$$V(s) = \max_{a \in A} Q(s, a),$$
Subsequently, the value function is identified by the following equation:
$$V(s) = \max_{a \in A} Q(s, a),$$
The following equation is utilized to identify the optimal action value in order to optimize $Q(s_t, a_t)$ for each state involved.
$$a = \arg\max_{a \in A} Q(s, a),$$
To decide what action a is going to be chosen during a particular time t, the Exploration and Exploitation Policy (EEP) feature is employed as follows:
$$a_t = \begin{cases} \arg\max_{a \in A} Q(s, a) & \text{exploitation} \\ \operatorname{rand}_{a \in A}(a) & \text{exploration}, \end{cases} \tag{20}$$
In Equation (20), the ε-greedy strategy is used while performing EEP, meaning that the probabilities of exploitation and exploration are $\varepsilon$ and $1 - \varepsilon$, respectively. Furthermore, a Markov Decision Process model of D2D communications underlaying cellular networks is provided to allocate power for each D2D pair using SARSA-based RL. The actions of the agents, denoted $a$, form the set of transmission powers $P_n$ allocated to the D2D transmitters $T_n$.
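The action-selection rule and the SARSA update can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code: the function names are hypothetical, the Q-table is a plain list of lists indexed by state and action, and the probability of taking a random action is passed explicitly as `explore_prob` so that the example does not depend on which branch is labeled with ε.

```python
import random

def choose_action(q_row, explore_prob, rng=random):
    """Pick an action index: random with probability explore_prob, otherwise greedy."""
    if rng.random() < explore_prob:
        return rng.randrange(len(q_row))                    # exploration
    return max(range(len(q_row)), key=q_row.__getitem__)    # exploitation

def sarsa_update(q, s, a, reward, s_next, a_next, alpha, discount):
    """On-policy update: Q(s,a) <- (1 - alpha) Q(s,a) + alpha [R + Y Q(s',a')]."""
    q[s][a] = (1 - alpha) * q[s][a] + alpha * (reward + discount * q[s_next][a_next])
    return q[s][a]

q = [[0.0, 1.0], [2.0, 0.0]]
updated = sarsa_update(q, s=0, a=1, reward=1.0, s_next=1, a_next=0, alpha=0.5, discount=0.9)
# updated is approximately 1.9: 0.5 * 1.0 + 0.5 * (1.0 + 0.9 * 2.0)
```

The update uses the action actually chosen in the next state, which is what makes SARSA on-policy, in contrast to Q-learning, which would use the maximum Q-value of the next state.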
The reward function is formulated based on EE for each D2D pair n employing sub-channel k as follows:
$$R_n = \begin{cases} EE_{n,k}\left(R_{n,k}^{t} - R_{min}^{th}\right) & \text{if } R_{n,k}^{t} \ge R_{min}^{th} \\ EE_{n,k}\left(e^{\left(R_{n,k}^{t} - R_{min}^{th}\right)}\right) & \text{if } R_{n,k}^{t} < R_{min}^{th}, \end{cases}$$
Exploration involves a comprehensive examination of the network, gathering information, and randomly selecting actions to evaluate their efficiency. However, exploitation exploits the advantage of previous decisions according to the Q-table. Exploration and exploitation have trade-off features in power distribution techniques that utilize RL.
SARSA is more appropriate for our case than Q-learning because of its on-policy nature: it updates action values according to the policy currently being followed, hence offering a more adaptive response to network changes. This is especially beneficial in D2D communication systems, where interference levels and network topology may vary. Q-learning, while efficient in static situations, is off-policy: it updates toward the greedy action regardless of the action actually taken, which can limit its flexibility in dynamic contexts. Our investigation demonstrates that SARSA surpasses Q-learning in EE and power consumption, particularly under changing interference levels, highlighting its enhanced capability to handle real-time interference in high-density networks. Algorithm 2 outlines the overall structure of the SARSA strategy.
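The on-policy versus off-policy distinction comes down to the bootstrap term in the temporal-difference target. The following sketch contrasts the two standard tabular targets; the function names are illustrative only.

```python
def sarsa_target(reward, gamma, q_next_row, next_action):
    """On-policy target: bootstrap from the action actually selected next."""
    return reward + gamma * q_next_row[next_action]


def q_learning_target(reward, gamma, q_next_row):
    """Off-policy target: bootstrap from the greedy next action."""
    return reward + gamma * max(q_next_row)


def td_update(q_old, target, alpha):
    """Move the stored Q-value a fraction alpha toward the target."""
    return q_old + alpha * (target - q_old)
```

When the next action is exploratory rather than greedy, the two targets differ, so SARSA's estimates reflect the behavior the agent actually exhibits, exploration included.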
Algorithm 2. SARSA algorithm for power allocation issue
1: Initialize: $N$, $M$, $P_{max}$, $P_{min}$, $\Pi_{n,k}$, $Q(s, a)$ table, $\gamma$, $\varepsilon$, $\alpha$
2: For episode $= 1, \dots, EP$ do
3: Reset $s$, $t = 0$
4: Select the power level between $P_{min}$ and $P_{max}$ using the policy derived from $Q$ ($\varepsilon$-greedy)
5: For $t = 0, \dots, T - 1$ do
6: Every agent performs an action $a \in A$ and observes $R$ and $s_{t+1}$
7: Check (11a), (11f), and (11g)
8: If the conditions are satisfied, then
9: Establish action $a \in A$, $R$, $s_{t+1}$
10: End if
11: Each agent takes an action $a_{t+1} \in A$ and observes $R$ and $s_{t+1}$
12: Update the Q-table
13: $s \leftarrow s_{t+1}$, $a \leftarrow a_{t+1}$
14: End when all D2D pairs are connected or the total number of iterations is reached
15: End for
16: Output: optimum power for each D2D pair
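The loop structure of Algorithm 2 can be illustrated with a minimal single-agent tabular SARSA sketch. Several simplifications are assumptions of this sketch rather than features of the proposed scheme: there is one agent instead of one per D2D pair, the state is taken to be the index of the power level currently in use, the constraint checks (11a), (11f), and (11g) are omitted, and `toy_energy_efficiency` is a hypothetical stand-in for the reward.

```python
import math
import random


def sarsa_power_control(power_levels, reward_fn, episodes=200, steps=20,
                        alpha=0.1, gamma=0.9, explore_prob=0.1, seed=0):
    """Tabular SARSA for one agent over a discrete set of power levels.

    Assumption: the state is the index of the power level currently in use,
    and the action is the index of the level to switch to.
    """
    rng = random.Random(seed)
    n = len(power_levels)
    q = [[0.0] * n for _ in range(n)]

    def choose(state):
        # Explore with probability explore_prob, otherwise act greedily.
        if rng.random() < explore_prob:
            return rng.randrange(n)
        return max(range(n), key=q[state].__getitem__)

    for _ in range(episodes):
        state = rng.randrange(n)      # reset the environment
        action = choose(state)        # initial action from the current policy
        for _ in range(steps):
            r = reward_fn(power_levels[action])
            next_state = action       # the chosen level becomes the new state
            next_action = choose(next_state)
            # On-policy update: bootstrap from the action actually chosen.
            td_error = r + gamma * q[next_state][next_action] - q[state][action]
            q[state][action] += alpha * td_error
            state, action = next_state, next_action

    # Greedy power level for each state after training.
    policy = [power_levels[max(range(n), key=q[s].__getitem__)] for s in range(n)]
    return q, policy


def toy_energy_efficiency(p_watts, gain=1e-3, noise=1e-6, circuit=0.05):
    """Hypothetical EE-style reward: spectral rate divided by consumed power."""
    return math.log2(1.0 + gain * p_watts / noise) / (p_watts + circuit)
```

Calling `sarsa_power_control([0.01, 0.05, 0.1, 0.2], toy_energy_efficiency)` returns the learned Q-table and the greedy power level per state. Setting `explore_prob=0.0` shows why exploration matters: with an all-zero initial table and positive rewards, the agent never leaves the first power level.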

7. Simulation Results

This section evaluates the performance of the proposed SD-scheme in terms of sum data rate, EE, SE, outage probability, and power saving. The SD-scheme combines the centralized mode selection and channel assignment scheme (C-scheme), based on greedy and matching algorithms, respectively, with the distributed power allocation scheme based on the SARSA algorithm. The suggested approach is compared with several traditional schemes, including a channel allocation scheme based on matching theory with the goal of optimizing EE [24], a channel allocation scheme based on the greedy algorithm [30], and power allocation schemes based on Q-learning in [17,23,29]. Table 2 contains the parameters employed in the simulation.
Table 2. Simulation parameters.
Figure 3 shows the EE of the proposed SD-scheme versus the number of D2D pairs, compared with the suggested C-scheme, a channel allocation scheme based on matching theory with the goal of optimizing EE [24], a traditional channel allocation scheme based on the greedy algorithm [30], and power allocation schemes based on Q-learning RL in [17,23,29]. The EE rises as the number of D2D pairs increases. As illustrated in Figure 3, the introduced approach outperforms the benchmark schemes. The substantial gain in EE shows that the suggested technique manages resources effectively as the number of D2D pairs grows, allocating optimal transmission power to each pair based on its requirements. Conversely, conventional schemes provide limited improvement and level off at lower EE values as the number of D2D pairs increases. The schemes in [24,30] yield the lowest EE because they rely on conventional centralized approaches, which lead to high control overhead, delayed or inflexible decisions, and suboptimal power allocation. The proposed SARSA-based RL also outperforms the Q-learning schemes [17,23,29], especially when the number of pairs is 10 or higher. The reason is that Q-learning often explores excessively in some scenarios, aggressively seeking behaviors that enhance throughput, which may increase energy consumption. This behavior can adversely affect EE, particularly in D2D networks where keeping energy consumption low is essential. SARSA, in contrast, offers a more appropriate balance between exploration (trying new actions) and exploitation (using the currently best action). This balance keeps power consumption low during long exploratory stages, since the SARSA policy is continually adjusted according to actual performance.
Figure 3. EE versus number of D2D pairs [17,23,24,29,30].
Figure 4 shows the total power saving of the proposed SD-scheme versus the number of D2D pairs in the network. The power saving of both the suggested scheme and the benchmark schemes increases as the number of D2D pairs increases. The total power saving obtained by the suggested technique improves steadily and substantially, especially at the largest number of D2D pairs, and the introduced scheme outperforms the conventional schemes, as shown in the figure. The reason is that SARSA is an on-policy learning algorithm that responds quickly to varying network conditions. Moreover, the SARSA algorithm continually optimizes the transmission power of every D2D pair in response to the channel conditions and the interference caused by the CUs or other D2D pairs inside the cell.
Figure 4. Total power saving versus number of D2D pairs [17,23,24,29,30].
Figure 5 compares the sum data rate of the SD-scheme with the benchmark algorithms for various numbers of D2D pairs. The sum data rate grows with the number of D2D pairs. The approaches in [24,30] outperform the proposed scheme in terms of sum data rate because they employ maximum transmission power, which in turn reduces the EE of the system. In comparison to the remaining benchmark schemes, however, the SD-scheme provides a higher sum data rate. The reason is that optimizing mode selection and channel assignment mitigates the effects of co-tier interference resulting from spectrum sharing between CUs and other D2D pairs, and the sum data rate of the network is enhanced accordingly. The sum data rate increases only slightly when the number of D2D pairs is between 25 and 40, because the second reuse of the CUs’ spectrum increases the interference in the network. Furthermore, the suggested strategy enhances network performance by effectively balancing resource utilization and interference mitigation.
Figure 5. Sum data rate versus the number of D2D pairs [17,23,24,29,30].
Table 3 summarizes the performance values, namely EE, total power saving, and sum data rate, to elucidate the benefits of the SD-scheme over the C-scheme. The findings indicate that the SD-scheme achieves higher EE, total power saving, and sum data rate, demonstrating its efficacy in resource allocation for D2D communications within cellular networks.
Table 3. Performance values.
Figure 6 shows the SE of the SD-scheme and the benchmark schemes versus the number of D2D pairs. SE rises with the number of D2D pairs because the sub-channel reuse indicator increases correspondingly; consequently, the co-tier interference among the shared channels of CUs also increases. The SD-scheme demonstrates superior efficiency relative to the other approaches. It achieves better SE when the number of D2D pairs ranges from 2 to 25, owing to the minimal interference across the shared channels during the single reuse of the CUs’ uplink channel. The SE decreases when there are 25–40 D2D pairs because the uplink spectrum of the CUs is reused multiple times. When the transmission power of a D2D pair is set to the maximum value, a higher data rate and SE are obtained owing to the stronger signal, but interference increases and EE decreases accordingly. The efficacy of the proposed strategy is attributed to its adaptive resource allocation mechanism, which responds flexibly to changing network conditions. This flexibility supports improvements in SE, which is especially important in heavily loaded D2D environments where interference management is critical.
Figure 6. SE versus the number of D2D pairs [17,23,24,29,30].
Figure 7 compares the outage probability for various numbers of D2D pairs. The outage probability of the SD-scheme increases with the number of D2D pairs because of the increased network interference. Nonetheless, the suggested method and other channel-optimized algorithms reduce the outage probability by using the resources as efficiently as possible to meet the minimum QoS demands of each D2D pair. By optimizing the reused channels, the suggested scheme efficiently allows multiple D2D pairs to use the sub-channel of one particular CU. Even as the number of D2D pairs grows, this approach maintains a low outage probability and reduced interference. In the benchmark methods, by contrast, the outage probability increases significantly because the interference on resources shared among D2D pairs, and between D2D pairs and CUs, is not managed.
Figure 7. Outage probability versus the number of D2D pairs [17,23,24,29,30].
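For reference, the metrics compared in Figures 3 to 7 can be computed from per-pair rates and powers as in the sketch below. It uses common textbook definitions (EE as sum rate over total consumed power, SE as sum rate over bandwidth, outage as the fraction of pairs below the rate threshold); the exact definitions used in the simulation may differ, and the function name, argument names, and the optional `circuit_power_w` term are assumptions.

```python
def network_metrics(rates_bps, tx_powers_w, bandwidth_hz, rate_min_bps,
                    circuit_power_w=0.0):
    """Aggregate sum rate, EE, SE, and outage probability over D2D pairs."""
    sum_rate = sum(rates_bps)
    # Total consumed power: transmit power plus an optional per-pair circuit term.
    total_power = sum(tx_powers_w) + circuit_power_w * len(tx_powers_w)
    # A pair is in outage when its rate falls below the minimum threshold.
    outages = sum(1 for r in rates_bps if r < rate_min_bps)
    return {
        "sum_rate_bps": sum_rate,
        "ee_bits_per_joule": sum_rate / total_power,
        "se_bps_per_hz": sum_rate / bandwidth_hz,
        "outage_probability": outages / len(rates_bps),
    }
```

For example, three pairs with rates of 2, 1, and 0.5 Mbit/s against a 1 Mbit/s threshold give an outage probability of one third, since only the last pair falls below the threshold.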
The EE for varying transmission power of CUs is illustrated in Figure 8. The EE decreases slightly as the CU transmission power increases, because of the high interference created by CUs and the higher transmission power that D2D pairs require to satisfy their QoS demands. The suggested SD-scheme achieves better EE than the traditional schemes. The reason is that traditional schemes such as [24,30] lack scalability and flexibility at higher power levels: they show a better initial EE, which decreases rapidly as the CU power rises. The suggested approach, on the other hand, maintains better EE despite changes in power levels, demonstrating its better energy management and adaptation to varied network conditions.
Figure 8. EE versus transmission power of CUs [17,23,24,29,30].
Figure 9 compares the sum data rate of the suggested SD-scheme with the traditional schemes versus the transmission power of CUs. The sum data rate decreases slightly as the CU transmission power increases. First, the D2D pairs that share the spectrum of the CUs experience increased interference. Second, the D2D pairs face stronger competition for the limited resources in the network. The conventional approaches fail to allocate resources efficiently, which results in a low SINR and, accordingly, a low sum data rate.
Figure 9. Sum data rate versus transmission power of CUs [17,23,24,29,30].
Figure 10 shows the total power saving of the proposed SD-scheme over varying transmission power levels of CUs. The total power saving remains stable up to 15 dBm and increases sharply thereafter, for several reasons. First, up to that point the pairs can operate at the same power level without needing to adapt their transmission power strategies. Furthermore, beyond that point the interference from CUs substantially affects the D2D links; consequently, the D2D pairs tend to overcome the interference threshold. As illustrated in Figure 10, the proposed SD-scheme outperforms the benchmark approaches. The reason is that the proposed SARSA-based RL allocates the optimum power to each D2D pair according to the immediate demands of users, ensuring that the D2D pairs use the lowest possible power to prevent co-tier interference while guaranteeing communication between the transmitter and receiver of the D2D link.
Figure 10. Power saving versus transmission power of CUs [17,23,24,29,30].
Figure 11 compares the SE of the different schemes versus the CU transmission power. As the transmission power of CUs increases, the SE decreases to specific thresholds, except for the methods in [24,30], since these approaches employ maximum transmission power. The decrease is caused by the conflict in sub-channel reuse as the transmission power of CUs increases: the D2D pairs experience higher interference on the resources they share.
Figure 11. SE versus transmission power of CUs [17,23,24,29,30].
Figure 12 shows the probability of connection for the D2D modes DM, RM, and LM versus the maximum distance of relays to D2D pairs. For DM, the probability is constant across relay distances, since the distance between relays and D2D pairs has no impact on the direct D2D mode. The probability of connection in RM decreases as the distance increases, highlighting the dependence of D2D pairs on the proximity of relays to establish efficient communication links. When the relay distance increases, the D2D pairs tend to communicate with each other through LM. Consequently, the probability of LM increases with the distance between relays and D2D pairs.
Figure 12. Probability of D2D connections versus maximum distance of relays to D2D pairs (m).
Figure 13 shows the probability of connection of D2D pairs in DM, RM, and LM versus the maximum distance between the transmitter and receiver of a D2D pair. DM reaches its highest probability of connection when the maximum distance within the D2D pair is small. The probability of connection for DM decreases as the distance increases, since DM depends on the proximity between transmitter and receiver, which allows the D2D pair to operate efficiently. On the other hand, the probability of connection for both RM and LM increases with the maximum distance within the D2D pair. The reason is that when the distance is medium, the pair tends to choose RM to maintain the QoS threshold. When the distance is large, the pair prefers LM for the same reason, since the power of the gNB is greater than that of the D2D transmitter. Consequently, the D2D pair chooses DM, RM, or LM depending on which mode offers the maximum SINR, which is in turn affected by the distance.
Figure 13. Probability of D2D connections versus max distance between D2D pairs.
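The max-SINR rule described above can be sketched as follows. This is only an illustration of that rule, not the full centralized greedy mode selection of the scheme. The helper names are hypothetical, and treating the relay-mode SINR as the weaker of its two hops is an assumption typical of decode-and-forward relaying.

```python
def relay_sinr(sinr_hop1, sinr_hop2):
    """End-to-end SINR of a two-hop relay link (assumed: limited by the weaker hop)."""
    return min(sinr_hop1, sinr_hop2)


def select_mode(sinr_by_mode, sinr_min):
    """Return the mode with the highest SINR, or None if it misses the threshold.

    sinr_by_mode: mapping from mode name ("DM", "RM", "LM") to linear SINR.
    sinr_min:     minimum SINR needed to satisfy the QoS requirement.
    """
    mode, sinr = max(sinr_by_mode.items(), key=lambda kv: kv[1])
    return mode if sinr >= sinr_min else None
```

For instance, `select_mode({"DM": 4.0, "RM": relay_sinr(9.0, 12.0), "LM": 6.0}, 5.0)` selects RM, while a pair whose best mode falls below the threshold is left unconnected.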
Figure 14 shows the number of disconnected pairs versus the number of D2D pairs in the network, comparing the performance of the proposed SD-scheme with that of the C-scheme. As the number of D2D pairs increases, the number of disconnected D2D pairs increases for both schemes because of the increased interference between D2D pairs and CUs. The proposed SD-scheme achieves slightly better performance than the C-scheme because of its better ability to mitigate interference and utilize resources effectively. The performance gap between the two schemes widens with the number of D2D pairs, indicating that the SD-scheme exhibits more adaptability and stability under higher network loads. This enhancement is due to the SD-scheme’s optimized resource allocation technique, which reduces connection failures and guarantees superior connectivity for D2D users.
Figure 14. Number of disconnected pairs versus number of D2D pairs.

8. Conclusions

This paper presents an SD-scheme for mode selection, channel assignment, and power allocation in D2D communications underlaying cellular networks. The primary goal of this research is to enhance D2D performance while maintaining the QoS demands of CUs. The sub-channel of each CU can be reused by several D2D pairs, and the detrimental interference among D2D pairs is taken into account during resource allocation. The joint problem of mode selection, channel assignment, and power allocation is a mixed-integer nonlinear programming (MINLP) problem that is NP-hard and difficult to solve. Therefore, we designed a hybrid scheme in two phases: a centralized phase, combining greedy-based mode selection with channel assignment based on two-sided preference lists, followed by a distributed power control phase. For the latter, a SARSA-based RL power control method is proposed to iteratively update the transmission power of each D2D pair sharing the sub-channel assigned to an individual CU, with the goal of improving the EE of the D2D communications in the network. The simulation findings demonstrate that the introduced scheme yields better performance with low complexity and outperforms traditional and Q-learning schemes in terms of data rate, SE, and EE. Future research may examine the influence of other networking factors, such as user mobility, and incorporate modern equipment such as unmanned aerial vehicles and satellites into the network.

Author Contributions

Conceptualization, N.M.M.; Data curation, N.M.M.; Formal analysis, I.S.A.; Investigation, I.S.A.; Methodology, I.S.A. and M.H.D.N.H.; Project administration, N.M.M.; Software, I.S.A.; Supervision, N.M.M.; Validation, M.H.D.N.H.; Writing—original draft, I.S.A.; Writing—review and editing, M.H.D.N.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

No new data were created in this study. All relevant data are included within the paper.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Siddiqui, M.U.A.; Qamar, F.; Tayyab, M.; Hindia, M.N.; Nguyen, Q.N.; Hassan, R. Mobility Management Issues and Solutions in 5G-and-Beyond Networks: A Comprehensive Review. Electronics 2022, 11, 1366.
  2. Alhashimi, H.F.; Hindia, M.N.; Dimyati, K.; Hanafi, E.B.; Safie, N.; Qamar, F.; Azrin, K.; Nguyen, Q.N. A Survey on Resource Management for 6G Heterogeneous Networks: Current Research, Future Trends, and Challenges. Electronics 2023, 12, 647.
  3. Gismalla, M.S.M.; Azmi, A.I.; Bin Salim, M.R.; Abdullah, M.F.L.; Iqbal, F.; Mabrouk, W.A.; Othman, M.B.; Ashyap, A.Y.I.; Supa’At, A.S.M. Survey on Device to Device (D2D) Communication for 5GB/6G Networks: Concept, Applications, Challenges, and Future Directions. IEEE Access 2022, 10, 30792–30821.
  4. Alibraheemi, A.M.H.; Hindia, M.N.; Dimyati, K.; Izam, T.F.T.M.N.; Yahaya, J.; Qamar, F.; Abdullah, Z.H. A Survey of Resource Management in D2D Communication for B5G Networks. IEEE Access 2023, 11, 7892–7923.
  5. Alibraheemi, A.M.H.; Hindia, M.N.; Izam, T.F.T.M.N.; Dimyati, K. Spectrum Efficient Mode Selection and Resource Allocation Optimization for D2D Communication in HetNet: A Multi-Agent Q-Learning Approach. IEEE Access 2024, 12, 131217–131229.
  6. Alhashimi, H.F.; Hindia, M.N.; Dimyati, K.; Hanafi, E.B.; Izam, T.F.T.M.N. Joint Optimization Scheme of User Association and Channel Allocation in 6G HetNets. Symmetry 2023, 15, 1673.
  7. Chen, C.-Y.; Sung, C.-A.; Chen, H.-H. Capacity maximization based on optimal mode selection in multi-mode and multi-pair D2D communications. IEEE Trans. Veh. Technol. 2019, 68, 6524–6534.
  8. Attar, I.S.; Mahyuddin, N.M.; Hindia, M.H.D.N. Joint mode selection and resource allocation for underlaying D2D communications: Matching theory. Telecommun. Syst. 2024, 87, 663–678.
  9. Jayakumar, S.; Nandakumar, S. A review on resource allocation techniques in D2D communication for 5G and B5G technology. Peer-to-Peer Netw. Appl. 2021, 14, 243–269.
  10. Gu, W.; Zhu, Q. Social-aware-based resource allocation for NOMA-Enhanced D2D communications. Appl. Sci. 2020, 10, 2446.
  11. Hassan, A.N.; Al-Chlaihawi, S.; Khekan, A.R. Artificial intelligence techniques over the fifth generation mobile networks. Indones. J. Electr. Eng. Comput. Sci. 2021, 24, 317–328.
  12. Alhashimi, H.F.; Hindia, M.N.; Dimyati, K.; Hanafi, E.B.; Tengku Mohmed Noor Izam, T.F. Reinforcement Learning Based Power Allocation for 6G Heterogenous Networks. In Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer Science and Business Media Deutschland GmbH: Berlin, Germany, 2024; pp. 128–141.
  13. Sheng, J.; Liu, S.; Huang, T.; Wu, Y. Overlapping Coalition Game for Resource Allocation in Many-to-Many D2D Communication. Wirel. Commun. Mob. Comput. 2022, 2022, 1738530.
  14. Rosas, A.A.; Shokair, M.; Dessouky, M.I. Genetic Based Approach for Optimal Power and Channel Allocation to Enhance D2D Underlaied Cellular Network Capacity in 5G. Comput. Mater. Contin. 2022, 72, 3751–3762.
  15. Sun, Y.; Miao, M.; Wang, Z.; Liu, Z. Resource Allocation Based on Hierarchical Game for D2D Underlaying Communication Cellular Networks. Wirel. Pers. Commun. 2021, 117, 281–291.
  16. Dejen, A.A.; Wondie, Y.; Forster, A. Distributed Throughput and Energy Efficient Resource Optimization When D2D and Massive MIMO Coexist. J. Commun. Inf. Netw. 2022, 7, 278–295.
  17. Jiang, S.; Zheng, J. A Q-learning based Dynamic Power Control Algorithm for D2D Communication Underlaying Cellular Networks. In Proceedings of the 13th International Conference on Wireless Communications and Signal Processing, WCSP 2021, Changsha, China, 20–21 October 2021; IEEE: Piscataway, NJ, USA, 2021.
  18. Chang, H.-H.; Liu, L.; Bai, J.; Pidwerbetsky, A.; Berlinsky, A.; Huang, J.; Ashdown, J.D.; Turck, K.; Yi, Y. Resource Allocation for D2D Cellular Networks with QoS Constraints: A DC Programming-Based Approach. IEEE Access 2022, 10, 16424–16438.
  19. Xing, X.; Cao, J.; Zhou, H. Improving Quality of Service for Cell-Edge Users in D2D-Relay Networks. Wirel. Pers. Commun. 2022, 126, 1789–1804.
  20. Wei, Y.; Qu, Y.; Zhao, M.; Zhang, L.; Yu, F.R. Resource allocation and power control policy for device-to-device communication using multi-agent reinforcement learning. Comput. Mater. Contin. 2020, 63, 1515–1532.
  21. Wang, H.; Wang, Y.; Tang, L.; Xia, Y. D2D Social Selection Relay Algorithm Combined with Auction Principle. Sensors 2022, 22, 9265.
  22. Hamid, A.K.; Al-Wesabi, F.N.; Nemri, N.; Zahary, A.; Khan, I. An optimized algorithm for resource allocation for D2D in heterogeneous networks. Comput. Mater. Contin. 2022, 70, 2923–2936.
  23. Jiang, F.; Zhang, L.; Sun, C.; Yuan, Z. Clustering and resource allocation strategy for D2D multicast networks with machine learning approaches. China Commun. 2021, 18, 196–211.
  24. Awad, M.K.; Baidas, M.W.; El-Amine, A.A.; Al-Mubarak, N. A matching-theoretic approach to resource allocation in D2D-enabled downlink NOMA cellular networks. Phys. Commun. 2022, 54, 101837.
  25. Gao, J.; Meng, X.; Yang, C.; Zhang, B.; Yi, X. Resource Allocation for D2D Communication Underlaying Cellular Networks: A Distance-Based Grouping Strategy. Wirel. Commun. Mob. Comput. 2023, 2023, 8594323.
  26. Pei, E.; Zhu, B.; Li, Y. A Q-learning based Resource Allocation Algorithm for D2D-Unlicensed communications. In Proceedings of the IEEE Vehicular Technology Conference, Helsinki, Finland, 25–28 April 2021.
  27. El-Nakhla, O.M.; Obayya, M.I.; Kishk, S.E. Stable Matching Relay Selection (SMRS) for TWR D2D Network With RF/RE EH Capabilities. IEEE Access 2022, 10, 22381–22391.
  28. Jayakumar, S.; Nandakumar, S. Reinforcement learning based distributed resource allocation technique in device-to-device (D2D) communication. Wirel. Netw. 2023, 29, 1843–1858.
  29. Lee, S.-H.; Shi, X.-P.; Tan, T.-H.; Lee, Y.-L.; Huang, Y.-F. Performance of Q-learning based resource allocation for D2D communications in heterogeneous networks. ICT Express 2023, 9, 1032–1039.
  30. Gour, R.; Tyagi, A. Joint uplink–downlink resource allocation for energy efficient D2D underlaying cellular networks with many-to-one matching. Phys. Commun. 2023, 58, 102016.
  31. Wang, H.; Xiao, P.; Li, X. Channel Parameter Estimation of mmWave MIMO System in Urban Traffic Scene: A Training Channel-Based Method. IEEE Trans. Intell. Transp. Syst. 2024, 25, 754–762.
