Search Results (23)

Search Parameters:
Keywords = Deep Q-learning (DQL)

18 pages, 1138 KiB  
Article
Intelligent Priority-Aware Spectrum Access in 5G Vehicular IoT: A Reinforcement Learning Approach
by Adeel Iqbal, Tahir Khurshaid and Yazdan Ahmad Qadri
Sensors 2025, 25(15), 4554; https://doi.org/10.3390/s25154554 - 23 Jul 2025
Viewed by 263
Abstract
Efficient and intelligent spectrum access is crucial for meeting the diverse Quality of Service (QoS) demands of Vehicular Internet of Things (V-IoT) systems in next-generation cellular networks. This work proposes RL-PASM, a novel centralized, self-learning, reinforcement learning (RL)-based priority-aware spectrum management framework operating through Roadside Units (RSUs). RL-PASM dynamically allocates spectrum resources across three traffic classes: high-priority (HP), low-priority (LP), and best-effort (BE). This work compares four RL algorithms: Q-Learning (QL), Double Q-Learning, Deep Q-Network (DQN), and Actor-Critic (AC) methods. The environment is modeled as a discrete-time Markov Decision Process (MDP), and a context-sensitive reward function guides fairness-preserving decisions for access, preemption, coexistence, and hand-off. Extensive simulations conducted under realistic vehicular load conditions evaluate performance across key metrics, including throughput, delay, energy efficiency, fairness, blocking probability, and interruption probability. Unlike prior approaches, RL-PASM introduces a unified multi-objective reward formulation and centralized RSU-based control to support adaptive priority-aware access in dynamic vehicular environments. Simulation results confirm that RL-PASM balances throughput, latency, fairness, and energy efficiency, demonstrating its suitability for scalable and resource-constrained deployments. The results also show that DQN achieves the highest average throughput, followed by vanilla QL; DQL and AC maintain high fairness and low average interruption probability; and QL delivers the lowest average delay and the highest energy efficiency, making it a suitable candidate for edge-constrained vehicular deployments. With the appropriate RL method selected, RL-PASM offers a robust and adaptable solution for scalable, intelligent, and priority-aware spectrum access in vehicular communication infrastructures.
(This article belongs to the Special Issue Emerging Trends in Next-Generation mmWave Cognitive Radio Networks)
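The abstract does not spell out the reward function; as a rough illustration of what a context-sensitive, priority-weighted reward over HP/LP/BE traffic could look like, here is a minimal Python sketch (the weights, penalty terms, and coefficients are assumptions, not the paper's formulation):

```python
# Hypothetical multi-objective reward for priority-aware spectrum access.
# Weights and penalty terms are illustrative, not the paper's actual design.
PRIORITY_WEIGHT = {"HP": 1.0, "LP": 0.6, "BE": 0.3}

def reward(traffic_class, served, delay_s, energy_j, preempted_lower_class):
    """Reward successful service, scaled by priority; penalise delay,
    energy use, and preempting lower-priority traffic (fairness term)."""
    r = 0.0
    if served:
        r += PRIORITY_WEIGHT[traffic_class]
    r -= 0.1 * delay_s          # latency penalty
    r -= 0.05 * energy_j        # energy-efficiency penalty
    if preempted_lower_class:
        r -= 0.2                # fairness-preserving preemption cost
    return r

print(reward("HP", served=True, delay_s=0.4, energy_j=1.2, preempted_lower_class=True))
```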

35 pages, 3671 KiB  
Article
Robust UAV-Oriented Wireless Communications via Multi-Agent Deep Reinforcement Learning to Optimize User Coverage
by Mahfizur Rahman Khan, Gowtham Raj Veeraswamy Premkumar and Bryan Van Scoy
Drones 2025, 9(5), 321; https://doi.org/10.3390/drones9050321 - 22 Apr 2025
Viewed by 1331
Abstract
In this study, we deploy drones as dynamic base stations to optimize user coverage in areas without fixed base station infrastructure. To optimize drone placement, we employ Deep Q-Learning, beginning with a centralized approach due to its simplicity and ease of training; in this centralized approach, all drones are trained simultaneously. We also employ a decentralized technique in which each drone acts autonomously while sharing a common neural network, allowing for individualized learning. In addition, we explore the impacts of jamming on UAVs and provide a reliable approach for mitigating this interference. To boost robustness, we employ stochastic user distributions, which train our policy to respond successfully to a wide range of user situations.
(This article belongs to the Special Issue UAV-Assisted Mobile Wireless Networks and Applications)
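A minimal sketch of the decentralized variant described above, in which every drone acts on its own observation but queries one shared Q-network; the state contents, action set, and network sizes are assumptions:

```python
import torch
import torch.nn as nn

# Illustrative sketch: several drones act independently but share one
# Q-network, as in the paper's decentralised variant. Dimensions assumed.
STATE_DIM, N_ACTIONS, N_DRONES = 4, 5, 3  # e.g. (x, y, covered_users, jammed); move N/S/E/W/stay

shared_q = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, N_ACTIONS))

def act(state, eps=0.1):
    """Epsilon-greedy action from the shared network for one drone."""
    if torch.rand(1).item() < eps:
        return torch.randint(N_ACTIONS, (1,)).item()
    with torch.no_grad():
        return shared_q(state).argmax().item()

# Each drone queries the same network with its own local observation.
states = torch.randn(N_DRONES, STATE_DIM)
actions = [act(states[i]) for i in range(N_DRONES)]
print(actions)
```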

15 pages, 1472 KiB  
Article
Deep Q-Network (DQN) Model for Disease Prediction Using Electronic Health Records (EHRs)
by Nabil M. AbdelAziz, Gehan A. Fouad, Safa Al-Saeed and Amira M. Fawzy
Sci 2025, 7(1), 14; https://doi.org/10.3390/sci7010014 - 7 Feb 2025
Cited by 4 | Viewed by 2327
Abstract
Many efforts have proved that deep learning models are effective for disease prediction using electronic health records (EHRs). However, these models are not yet precise enough to predict diseases reliably. Additionally, ethical concerns and the use of clustering and classification algorithms on small datasets limit their effectiveness. The complexity of data processing further complicates the interpretation of patient representation learning models, even though data augmentation strategies may help. Incomplete patient data also hinder model accuracy. This study aims to develop and evaluate a deep learning model that addresses these challenges. Our proposed approach is a disease prediction model based on deep Q-learning (DQL), which replaces the lookup table of the traditional Q-learning reinforcement learning algorithm with a deep neural network, exploiting the mapping capabilities of the Q-network. We conclude that the proposed model achieves the best accuracy (98%) compared with other models.
(This article belongs to the Section Computer Sciences, Mathematics and AI)
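A minimal sketch of the core idea: a neural network stands in for the Q-table, mapping an encoded EHR record to Q-values over candidate diagnoses. Feature and label counts are placeholders, not the paper's:

```python
import torch
import torch.nn as nn

# Minimal sketch: a neural network replaces the tabular Q-function, mapping
# an EHR feature vector to Q-values over candidate disease labels.
N_FEATURES, N_DISEASES = 32, 10   # placeholder dimensions

q_net = nn.Sequential(
    nn.Linear(N_FEATURES, 128), nn.ReLU(),
    nn.Linear(128, N_DISEASES),   # one Q-value per "predict disease d" action
)

patient = torch.randn(1, N_FEATURES)   # encoded EHR record
prediction = q_net(patient).argmax(dim=1).item()
print(f"predicted disease index: {prediction}")
```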

39 pages, 892 KiB  
Article
Evaluating Artificial Intelligence Models for Resource Allocation in Circular Economy Digital Marketplace
by Arifuzzaman (Arif) Sheikh, Steven J. Simske and Edwin K. P. Chong
Sustainability 2024, 16(23), 10601; https://doi.org/10.3390/su162310601 - 3 Dec 2024
Cited by 3 | Viewed by 3048
Abstract
This study assesses the application of artificial intelligence (AI) algorithms for optimizing resource allocation, demand-supply matching, and dynamic pricing within circular economy (CE) digital marketplaces. Five AI models—autoregressive integrated moving average (ARIMA), long short-term memory (LSTM), random forest (RF), gradient boosting regressor (GBR), and neural networks (NNs)—were evaluated based on their effectiveness in predicting waste generation, economic growth, and energy prices. The GBR model outperformed the others, achieving a mean absolute error (MAE) of 23.39 and an R² of 0.7586 in demand forecasting, demonstrating strong potential for resource flow management. In contrast, the NNs encountered limitations in supply prediction, with an MAE of 121.86 and an R² of 0.0151, indicating challenges in adapting to market volatility. Reinforcement learning methods, specifically Q-learning and deep Q-learning (DQL), were applied for price stabilization, resulting in reduced price fluctuations and improved market stability. These findings contribute a conceptual framework for AI-driven CE marketplaces, showcasing the role of AI in enhancing resource efficiency and supporting sustainable urban development. While synthetic data enabled controlled experimentation, this study acknowledges its limitations in capturing full real-world variability, marking a direction for future research to validate findings with real-world data. Moreover, ethical considerations, such as algorithmic fairness and transparency, are critical for responsible AI integration in circular economy contexts.
(This article belongs to the Section Economic and Business Aspects of Sustainability)
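For the price-stabilization component, a plain tabular Q-learning update of the kind the study applies might look like the following sketch; the state discretization, action set, and hyperparameters are assumptions:

```python
import numpy as np

# Illustrative tabular Q-learning for dynamic pricing; states discretise
# the demand-supply gap and actions nudge the price. Parameters assumed.
N_STATES, N_ACTIONS = 5, 3            # gap bins; {lower, hold, raise}
ALPHA, GAMMA = 0.1, 0.95
Q = np.zeros((N_STATES, N_ACTIONS))

def update(s, a, r, s_next):
    """Standard Q-learning temporal-difference update."""
    td_target = r + GAMMA * Q[s_next].max()
    Q[s, a] += ALPHA * (td_target - Q[s, a])

# One hypothetical transition: large shortage, price raised, volatility fell.
update(s=4, a=2, r=+1.0, s_next=2)
print(Q[4])
```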

26 pages, 8622 KiB  
Article
Using Deep Q-Learning to Dynamically Toggle between Push/Pull Actions in Computational Trust Mechanisms
by Zoi Lygizou and Dimitris Kalles
Mach. Learn. Knowl. Extr. 2024, 6(3), 1413-1438; https://doi.org/10.3390/make6030067 - 27 Jun 2024
Cited by 1 | Viewed by 1238
Abstract
Recent work on decentralized computational trust models for open multi-agent systems has resulted in the development of CA, a biologically inspired model which focuses on the trustee's perspective. This new model addresses a serious unresolved problem in existing trust and reputation models, namely the inability to handle constantly changing behaviors and agents' continuous entry into and exit from the system. In previous work, we compared CA to FIRE, a well-known trust and reputation model, and found that CA is superior when the trustor population changes, whereas FIRE is more resilient to changes in the trustee population. Thus, in this paper, we investigate how trustors can detect the presence of several dynamic factors in their environment and then decide which trust model to employ in order to maximize utility. We frame this as a machine learning problem in a partially observable environment, where the presence of several dynamic factors is not known to the trustor, and we describe how an adaptable trustor can rely on a few measurable features to assess the current state of the environment and then use Deep Q-Learning (DQL), in a single-agent reinforcement learning setting, to learn how to adapt to a changing environment. We ran a series of simulation experiments to compare the performance of the adaptable trustor with that of trustors using only one model (FIRE or CA), and we show that an adaptable agent is indeed capable of learning when to use each model and, thus, of performing consistently in dynamic environments.
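A minimal sketch of the adaptable trustor's decision step, assuming a small hypothetical feature set: a Q-network scores the two actions, "use FIRE" and "use CA":

```python
import torch
import torch.nn as nn

# Sketch of the adaptable trustor: a small Q-network maps a few measurable
# environment features to two actions, "use FIRE" or "use CA". The feature
# set here is an assumption for illustration.
FEATURES = 3   # e.g. recent utility, trustor turnover, trustee turnover
policy = nn.Sequential(nn.Linear(FEATURES, 16), nn.ReLU(), nn.Linear(16, 2))

obs = torch.tensor([[0.42, 0.9, 0.1]])       # partially observed environment state
model = ("FIRE", "CA")[policy(obs).argmax(dim=1).item()]
print(f"trustor selects: {model}")
```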

21 pages, 4825 KiB  
Article
Optimizing EV Battery Management: Advanced Hybrid Reinforcement Learning Models for Efficient Charging and Discharging
by Sercan Yalçın and Münür Sacit Herdem
Energies 2024, 17(12), 2883; https://doi.org/10.3390/en17122883 - 12 Jun 2024
Cited by 5 | Viewed by 3806 | Correction
Abstract
This paper investigates the application of hybrid reinforcement learning (RL) models to optimize lithium-ion batteries' charging and discharging processes in electric vehicles (EVs). By integrating two advanced RL algorithms—deep Q-learning (DQL) and actor-critic learning—within the framework of battery management systems (BMSs), this study aims to harness the combined strengths of these techniques to improve battery efficiency, performance, and lifespan. The hybrid models are evaluated through simulation and experimental validation, demonstrating their capability to devise optimal battery management strategies. These strategies effectively adapt to variations in battery state of health (SOH) and state of charge (SOC) relative error, combat battery voltage aging, and adhere to complex operational constraints, including charging/discharging schedules. The results underscore the potential of RL-based hybrid models to enhance BMSs in EVs, offering tangible contributions towards more sustainable and reliable electric transportation systems.
(This article belongs to the Special Issue Electrochemical Conversion and Energy Storage System)
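The abstract does not give the reward design; one plausible shape for a degradation-aware charging/discharging reward, sketched with made-up SOC bounds and coefficients (not the paper's hybrid DQL/actor-critic formulation):

```python
import numpy as np

# Minimal sketch: a Q-learning agent picks a charge/discharge rate from a
# discrete set given a binned SOC. Bins, rates, and reward are illustrative.
SOC_BINS = 10
RATES = [-1.0, -0.5, 0.0, 0.5, 1.0]   # C-rate: negative = discharge
Q = np.zeros((SOC_BINS, len(RATES)))

def reward(soc, rate):
    """Penalise leaving a healthy SOC window and aggressive currents."""
    window_penalty = 0.0 if 0.2 <= soc <= 0.8 else -1.0
    aging_penalty = -0.1 * abs(rate)   # high currents accelerate degradation
    return window_penalty + aging_penalty

print(reward(0.15, -1.0), reward(0.5, 0.5))
```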

19 pages, 8310 KiB  
Article
Reinforcement Learning-Based Energy Management for Fuel Cell Electrical Vehicles Considering Fuel Cell Degradation
by Qilin Shuai, Yiheng Wang, Zhengxiong Jiang and Qingsong Hua
Energies 2024, 17(7), 1586; https://doi.org/10.3390/en17071586 - 26 Mar 2024
Cited by 7 | Viewed by 1998
Abstract
The service life and fuel consumption of the fuel cell system (FCS) are the main factors limiting the commercialization of fuel cell electric vehicles (FCEVs). Effective energy management strategies (EMSs) can reduce fuel consumption over the driving cycle and prolong the service life of the FCS. This paper proposes an energy management strategy based on a deep reinforcement learning (DRL) algorithm, deep Q-learning (DQL). Considering the unstable performance of conventional DQL during training, a new algorithm called Double Deep Q-Learning (DDQL) is introduced. DDQL uses a target evaluation network to evaluate output actions and a delayed update strategy to improve the convergence and stability of DRL. This article trains the strategy on the UDDS cycle, tests it on a combined UDDS-WLTC-NEDC cycle, and compares it with a traditional ECM-based EMS. The results demonstrate that under the combined cycle, the strategy effectively reduced FCS voltage degradation by 50%, maintained fuel economy, and ensured consistency between the initial and final state of charge (SOC) of the lithium-ion battery (LIB).
(This article belongs to the Section D2: Electrochem: Batteries, Fuel Cells, Capacitors)
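A sketch of the DDQL mechanics summarized above: the online network selects the next action, the delayed target network evaluates it, and the target network is re-synchronized only periodically (network sizes and the sync schedule are assumptions):

```python
import torch
import torch.nn as nn

# Sketch of DDQL: online network picks the next action, the delayed target
# network evaluates it. Sizes are assumed.
S, A, GAMMA = 6, 4, 0.99
online = nn.Sequential(nn.Linear(S, 32), nn.ReLU(), nn.Linear(32, A))
target = nn.Sequential(nn.Linear(S, 32), nn.ReLU(), nn.Linear(32, A))
target.load_state_dict(online.state_dict())   # start synchronised

def ddql_target(r, s_next, done):
    """y = r + gamma * Q_target(s', argmax_a Q_online(s', a))"""
    with torch.no_grad():
        a_star = online(s_next).argmax(dim=1, keepdim=True)
        q_next = target(s_next).gather(1, a_star).squeeze(1)
    return r + GAMMA * q_next * (1.0 - done)

def sync():
    """Delayed update: copy online weights into the target network,
    e.g. every N training steps."""
    target.load_state_dict(online.state_dict())

y = ddql_target(torch.tensor([0.5]), torch.randn(1, S), torch.tensor([0.0]))
print(y)
```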

19 pages, 1005 KiB  
Article
Asynchronously H∞ Tracking Control and Optimization for Switched Flight Vehicles with Time-Varying Delay
by Xing Yang, Bin Fu, Xiaochuan Ma, Yu Liu, Dongyu Yuan and Xintong Wu
Aerospace 2024, 11(2), 107; https://doi.org/10.3390/aerospace11020107 - 24 Jan 2024
Viewed by 1378
Abstract
This paper investigates the asynchronous H∞ control and optimization problem for flight vehicles with a time-varying delay. The nonlinear dynamic model and Jacobian linearization establish the flight vehicle's switched model. An asynchronous H∞ tracking controller is designed, considering the asynchronous switching between the controllers and the corresponding subsystems. To improve transient performance, the tracking controller comprises a model-based part and a learning-based part: the model-based part guarantees stability and prescribed performance, and the learning-based part compensates for undesirable uncertainties. The multiple Lyapunov function (MLF) and mode-dependent average dwell time (MDADT) methods are utilized to guarantee stability and the specified attenuation performance. The existence conditions and solutions of the model-based sub-controllers are expressed as linear matrix inequalities (LMIs). The deep Q-learning (DQL) algorithm provides the learning-based part. Unlike the conventional method, the controller parameters are scheduled online, so robustness, stability, and dynamic performance can be met simultaneously. A numerical example illustrates the effectiveness and advantages of the presented approach.
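One way to picture the learning-based part, sketched with assumed dimensions: a Q-network that schedules online which pre-solved, LMI-based sub-controller gain set to apply:

```python
import torch
import torch.nn as nn

# Illustrative only: the learning-based part as a Q-network that schedules
# which pre-solved (LMI-based) sub-controller gain set to apply online.
# State contents and the number of gain sets are assumptions.
STATE = 5          # e.g. tracking error, delay estimate, active subsystem
N_GAIN_SETS = 3    # sub-controllers solved offline via LMIs

scheduler = nn.Sequential(nn.Linear(STATE, 32), nn.ReLU(), nn.Linear(32, N_GAIN_SETS))

x = torch.randn(1, STATE)
k = scheduler(x).argmax(dim=1).item()
# Model-based part guarantees stability; the RL part compensates uncertainty.
print(f"apply gain set K_{k}")
```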

21 pages, 4682 KiB  
Article
Enhancing Energy Management Strategies for Extended-Range Electric Vehicles through Deep Q-Learning and Continuous State Representation
by Christian Montaleza, Paul Arévalo, Jimmy Gallegos and Francisco Jurado
Energies 2024, 17(2), 514; https://doi.org/10.3390/en17020514 - 20 Jan 2024
Cited by 5 | Viewed by 1993
Abstract
The efficiency and dynamics of hybrid electric vehicles are inherently linked to effective energy management strategies. However, complexity is heightened by uncertainty and variations in real driving conditions. This article introduces an innovative strategy for extended-range electric vehicles, grounded in the optimization of driving cycles, prediction of driving conditions, and predictive control through neural networks. First, the challenges of the energy management system are addressed by merging deep reinforcement learning with strongly convex objective optimization, giving rise to a method called DQL-AMSGrad. The DQL algorithm was implemented with temporal-difference updates that adjust Q-values to maximize the expected cumulative reward; the loss function is the mean squared error between the current estimate and the calculated target. The AMSGrad optimization method was applied to efficiently adjust the weights of the artificial neural network, and hyperparameters such as the learning rate and discount factor were tuned using data collected during real-world driving tests. This strategy tackles the "curse of dimensionality" and demonstrates a 30% improvement in adaptability to changing environmental conditions. With a 20%-faster convergence speed and a 15%-superior effectiveness in updating neural network weights compared to conventional approaches, it also achieves an 18% reduction in fuel consumption in a case study with the Nissan Xtrail e-POWER system, validating its practical applicability.
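A compact sketch of the training step described above: TD target, MSE loss, and AMSGrad (exposed in PyTorch as Adam with amsgrad=True) updating the network weights. Sizes and hyperparameters are illustrative:

```python
import torch
import torch.nn as nn

# Sketch of a DQL-AMSGrad-style update: TD target, MSE loss, AMSGrad step.
S, A, GAMMA = 8, 4, 0.95
q_net = nn.Sequential(nn.Linear(S, 64), nn.ReLU(), nn.Linear(64, A))
opt = torch.optim.Adam(q_net.parameters(), lr=1e-3, amsgrad=True)  # AMSGrad variant

def train_step(s, a, r, s_next, done):
    with torch.no_grad():
        target = r + GAMMA * q_net(s_next).max(dim=1).values * (1.0 - done)
    q_sa = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    loss = nn.functional.mse_loss(q_sa, target)   # MSE between estimate and target
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

batch = (torch.randn(2, S), torch.tensor([0, 3]), torch.tensor([1.0, -0.2]),
         torch.randn(2, S), torch.tensor([0.0, 1.0]))
print(train_step(*batch))
```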

18 pages, 11299 KiB  
Article
Experimental Research on Avoidance Obstacle Control for Mobile Robots Using Q-Learning (QL) and Deep Q-Learning (DQL) Algorithms in Dynamic Environments
by Vo Thanh Ha and Vo Quang Vinh
Actuators 2024, 13(1), 26; https://doi.org/10.3390/act13010026 - 9 Jan 2024
Cited by 9 | Viewed by 2819
Abstract
This study provides simulation and experimental results on techniques for avoiding static and dynamic obstacles using a deep Q-learning (DQL) reinforcement learning algorithm for a two-wheel mobile robot with independent control. This method integrates the Q-learning (QL) algorithm with a neural network, where the neural network in the DQL algorithm acts as an approximator of the Q-table over state–action pairs. The effectiveness of the proposed solution was confirmed through simulation, programming, and practical experimentation. A comparison was drawn between the DQL and QL algorithms. The mobile robot was connected to the control script using the Robot Operating System (ROS) and programmed in Python, and the DQL controller was developed in the Gazebo simulation environment. The robot was then tested in a workshop under various experimental scenarios. The DQL controller showed improvements in computation time, convergence time, trajectory-planning accuracy, and obstacle avoidance, surpassing the QL algorithm in overall performance.
(This article belongs to the Section Actuators for Robotics)
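A minimal skeleton of such a control loop, assuming a ROS 1 (rospy) setup with a standard /cmd_vel topic; the action set and the select_action stub are placeholders for the trained Q-network:

```python
import rospy
from geometry_msgs.msg import Twist

# Minimal ROS 1 skeleton (assumed setup): a trained DQL policy maps a
# discretised laser scan to one of a few wheel commands.
ACTIONS = {0: (0.2, 0.0), 1: (0.1, 0.5), 2: (0.1, -0.5)}  # (linear m/s, angular rad/s)

def select_action(scan_bins):
    return 0  # placeholder: argmax of the trained Q-network would go here

rospy.init_node("dql_avoider")
pub = rospy.Publisher("/cmd_vel", Twist, queue_size=1)
rate = rospy.Rate(10)
while not rospy.is_shutdown():
    v, w = ACTIONS[select_action(scan_bins=None)]
    cmd = Twist()
    cmd.linear.x, cmd.angular.z = v, w
    pub.publish(cmd)
    rate.sleep()
```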

18 pages, 3412 KiB  
Article
Ultra-Reliable Deep-Reinforcement-Learning-Based Intelligent Downlink Scheduling for 5G New Radio-Vehicle to Infrastructure Scenarios
by Jizhe Wang, Yuanbing Zheng, Jian Wang, Zhenghua Shen, Lei Tong, Yahao Jing, Yu Luo and Yong Liao
Sensors 2023, 23(20), 8454; https://doi.org/10.3390/s23208454 - 13 Oct 2023
Cited by 5 | Viewed by 1914
Abstract
Fifth-generation mobile communication technology (5G) imposes higher standards for reliability and efficiency on the connection between vehicle terminals and infrastructure. Vehicle-to-infrastructure systems use NR-V2I (New Radio-Vehicle to Infrastructure) communication, which relies on Link Adaptation (LA) to operate over constantly changing V2I channels and thereby increase the efficacy and reliability of information transmission. This paper proposes a Double Deep Q-learning (DDQL) LA scheduling algorithm for optimizing the modulation and coding scheme (MCS) of autonomous vehicles in V2I communication. Doppler shift and complex, fast time-varying channels reduce the reliability of information transmission in V2I scenarios. The proposed scheduler combines Space Division Multiplexing (SDM) with MCS selection for autonomous vehicles in V2I communications. To address the overestimation of Deep Q-learning (DQL) in the Q-network learning process, the approach integrates a Deep Neural Network (DNN) with a Double Q-Network (DDQN). The findings demonstrate that the suggested algorithm adapts to complex channel environments with varying vehicle speeds in V2I scenarios by choosing the best scheduling scheme for road information transmission from combinations of MCS. SDM not only increases the accuracy of road-safety information transmission but also fosters cooperation and communication between vehicle terminals to realize cooperative driving.
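A toy numerical illustration of the overestimation problem the DDQN addresses: taking the max over noisy Q-estimates (plain DQL) is biased upwards, while selecting the action with one estimate and evaluating it with another removes the bias:

```python
import numpy as np

# Why double Q-learning helps: with noisy estimates, max_a Q(s', a) is
# biased upwards (E[max] > 0 even when all true values are 0), while
# decoupling selection and evaluation removes that bias. Toy numbers.
rng = np.random.default_rng(0)
true_q = np.zeros(8)                       # all MCS actions equally good
noisy_a = true_q + rng.normal(0, 1, 8)     # online network estimate
noisy_b = true_q + rng.normal(0, 1, 8)     # independent second estimate

dql_estimate = noisy_a.max()                   # overestimates
ddqn_estimate = noisy_b[noisy_a.argmax()]      # select with A, evaluate with B
print(dql_estimate, ddqn_estimate)
```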

36 pages, 6949 KiB  
Article
AI-Enabled Interference Mitigation for Autonomous Aerial Vehicles in Urban 5G Networks
by Anirudh Warrier, Saba Al-Rubaye, Gokhan Inalhan and Antonios Tsourdos
Aerospace 2023, 10(10), 884; https://doi.org/10.3390/aerospace10100884 - 13 Oct 2023
Cited by 8 | Viewed by 3922
Abstract
Integrating autonomous unmanned aerial vehicles (UAVs) with fifth-generation (5G) networks presents a significant challenge due to network interference. UAVs' high altitude and propagation conditions increase vulnerability to interference from neighbouring 5G base stations (gNBs) in the downlink direction. This paper proposes a novel deep reinforcement learning algorithm to address interference through power control. By formulating and solving a signal-to-interference-and-noise ratio (SINR) optimization problem with the deep Q-learning (DQL) algorithm, interference is effectively mitigated and link performance is improved. A performance comparison with existing interference mitigation schemes, such as fixed power allocation (FPA), tabular Q-learning, particle swarm optimization, and game theory, demonstrates the superiority of the DQL algorithm, which outperforms the next best method by 41.66% and converges to an optimal solution faster. At higher speeds, the UAV sees only a 10.52% decrease in performance, indicating that the algorithm remains effective at high speed. The proposed solution effectively integrates UAVs with 5G networks, mitigates interference, and enhances link performance, offering a significant advancement in this field.
(This article belongs to the Special Issue Global Navigation Satellite System for Unmanned Aerial Vehicle)
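A sketch of the quantity being optimized: the downlink SINR at the UAV under gNB interference. The powers, gains, and noise figure are made-up values:

```python
import numpy as np

# Sketch: downlink SINR at the UAV under interference from neighbouring
# gNBs. All powers and channel gains are illustrative.
def sinr_db(p_serving_w, g_serving, p_interf_w, g_interf, noise_w=1e-12):
    """SINR = P*g / (sum of interfering P*g + noise), in dB."""
    sinr = (p_serving_w * g_serving) / (np.dot(p_interf_w, g_interf) + noise_w)
    return 10 * np.log10(sinr)

# A DQL power-control action changes the interfering transmit powers;
# the reward can simply be the resulting SINR (or a thresholded version).
print(sinr_db(1.0, 1e-9, np.array([1.0, 1.0]), np.array([2e-10, 5e-11])))
```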

24 pages, 1244 KiB  
Article
Deep Reinforcement Learning for Workload Prediction in Federated Cloud Environments
by Zaakki Ahamed, Maher Khemakhem, Fathy Eassa, Fawaz Alsolami, Abdullah Basuhail and Kamal Jambi
Sensors 2023, 23(15), 6911; https://doi.org/10.3390/s23156911 - 3 Aug 2023
Cited by 6 | Viewed by 2747
Abstract
The Federated Cloud Computing (FCC) paradigm provides scalability advantages to Cloud Service Providers (CSPs) in preserving their Service Level Agreements (SLAs) as opposed to single Data Centers (DCs). However, existing research has primarily focused on Virtual Machine (VM) placement, with less emphasis on energy efficiency and SLA adherence. In this paper, we propose a novel solution, Federated Cloud Workload Prediction with Deep Q-Learning (FEDQWP). Our solution addresses the complex VM placement problem, energy efficiency, and SLA preservation, making it comprehensive and beneficial for CSPs. By leveraging the capabilities of deep learning, our FEDQWP model extracts underlying patterns and optimizes resource allocation. Real-world workloads are extensively evaluated to demonstrate the efficacy of our approach compared to existing solutions. The results show that our DQL model outperforms other algorithms in terms of CPU utilization, migration time, finished tasks, energy consumption, and SLA violations. Specifically, it achieves efficient CPU utilization with a median value of 29.02, completes migrations in an average of 0.31 units, finishes an average of 699 tasks, consumes the least energy with an average of 1.85 kWh, and exhibits the lowest number of SLA violations, with an average proportional rate of 0.03. These quantitative results highlight the superiority of our proposed method in optimizing performance in FCC environments.
(This article belongs to the Section Communications)
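A hypothetical scoring of placement decisions along the lines described above, trading CPU utilization against energy and SLA penalties; the weights and host statistics are illustrative, not FEDQWP's actual reward:

```python
# Illustrative reward for a VM placement decision: balance CPU utilisation
# against energy and SLA penalties. Weights and host stats are assumptions.
def placement_reward(cpu_util, energy_kwh, sla_violations):
    return 1.0 * cpu_util - 0.5 * energy_kwh - 2.0 * sla_violations

# Candidate hosts scored for one VM migration; pick the best.
hosts = {"dc1-h3": (0.29, 1.85, 0.03), "dc2-h1": (0.55, 2.40, 0.10)}
best = max(hosts, key=lambda h: placement_reward(*hosts[h]))
print(best)
```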

20 pages, 789 KiB  
Article
Deep Q-Learning-Based Buffer-Aided Relay Selection for Reliable and Secure Communications in Two-Hop Wireless Relay Networks
by Cheng Zhang, Xuening Liao, Zhenqiang Wu, Guoyong Qiu, Zitong Chen and Zhiliang Yu
Sensors 2023, 23(10), 4822; https://doi.org/10.3390/s23104822 - 17 May 2023
Cited by 1 | Viewed by 2217
Abstract
This paper investigates the problem of buffer-aided relay selection to achieve reliable and secure communications in a two-hop amplify-and-forward (AF) network with an eavesdropper. Due to the fading of wireless signals and the broadcast nature of wireless channels, signals transmitted over the network may be undecodable at the receiver or intercepted by eavesdroppers. Most available buffer-aided relay selection schemes consider either reliability or security in wireless communications; rarely are both addressed together. This paper proposes a buffer-aided relay selection scheme based on deep Q-learning (DQL) that considers both reliability and security. Through Monte Carlo simulations, we verify the reliability and security performance of the proposed scheme in terms of the connection outage probability (COP) and secrecy outage probability (SOP), respectively. The simulation results show that a two-hop wireless relay network can achieve reliable and secure communications using our proposed scheme. We also performed comparison experiments against two benchmark schemes, which indicate that our proposed scheme outperforms the max-ratio scheme in terms of the SOP.
(This article belongs to the Special Issue Cognitive Radio Networks: Technologies, Challenges and Applications)
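A sketch of the decision the DQL agent faces, with illustrative thresholds: pick the relay whose buffer state and channel conditions best avoid both connection and secrecy outages:

```python
import numpy as np

# Sketch: per-relay reward trading off connection outage (weak main link or
# unavailable buffer) against secrecy outage (small main-eavesdropper SNR
# margin). Thresholds and weights are illustrative.
def reward(main_snr_db, eave_snr_db, buffer_ok, snr_min=5.0, secrecy_min=3.0):
    connection_outage = (main_snr_db < snr_min) or not buffer_ok
    secrecy_outage = (main_snr_db - eave_snr_db) < secrecy_min
    return -1.0 * connection_outage - 1.0 * secrecy_outage + 0.5

relays = [(9.0, 4.0, True), (12.0, 11.0, True), (3.0, 0.0, True)]
best = int(np.argmax([reward(*r) for r in relays]))
print(f"select relay {best}")
```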

19 pages, 8275 KiB  
Article
Retrospective-Based Deep Q-Learning Method for Autonomous Pathfinding in Three-Dimensional Curved Surface Terrain
by Qidong Han, Shuo Feng, Xing Wu, Jun Qi and Shaowei Yu
Appl. Sci. 2023, 13(10), 6030; https://doi.org/10.3390/app13106030 - 14 May 2023
Cited by 3 | Viewed by 1994
Abstract
Path planning in complex environments remains a challenging task for unmanned vehicles. In this paper, we propose a decoupled path-planning algorithm based on deep reinforcement learning that separates path evaluation from the planning algorithm, enabling unmanned vehicles to consider environmental factors in real time. We use a 3D surface map to represent the path cost, where the elevation information represents the integrated cost. The peaks function simulates the path cost, which is processed and used as the algorithm's input. Furthermore, we enhance the double deep Q-learning algorithm (DDQL) with a variant called retrospective DDQL (R-DDQL) to improve performance. R-DDQL utilizes global information and incorporates a retrospective mechanism that employs fuzzy logic to evaluate the quality of selected actions and identify better states for inclusion in the memory. Our simulation studies show that R-DDQL offers better training speed and stability than both the deep Q-learning and double deep Q-learning algorithms, and we demonstrate its effectiveness under both static and dynamic tasks.
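A loose sketch of the retrospective idea, with an assumed triangular membership function and threshold: fuzzy-score a transition's quality and store only the good ones in memory:

```python
# Loose sketch of the retrospective mechanism: score a completed transition
# with a fuzzy membership over its cost reduction and keep only "good"
# transitions in the replay memory. Shape and threshold are assumptions.
def triangular(x, lo, peak, hi):
    """Standard triangular fuzzy membership function."""
    if x <= lo or x >= hi:
        return 0.0
    return (x - lo) / (peak - lo) if x < peak else (hi - x) / (hi - peak)

def keep_in_memory(cost_reduction, threshold=0.5):
    quality = triangular(cost_reduction, lo=0.0, peak=1.0, hi=2.0)
    return quality >= threshold

print(keep_in_memory(0.9), keep_in_memory(0.1))
```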
