Mobile Charging Scheduling Approach for Wireless Rechargeable Sensor Networks Based on Multiple Discrete-Action Space Deep Q-Network
Abstract
1. Introduction
- (1)
- The joint mobile charging sequence scheduling and charging upper threshold control problem (JSSTC) is studied in this paper. The mobile charger (MC) can adaptively adjust the charging upper threshold for each sensor according to the state information at each time step, where a time step is the instant at which a charging decision is made (a minimal sketch of selecting a joint action over the two resulting discrete action spaces follows this list).
- (2)
- A bidirectional gated recurrent unit (Bi-GRU) is integrated into the Q-network, guiding the agent to determine the charging action of the next time step based on both past and future state information.
- (3)
- Simulation results show that the proposed method can effectively prolong the network lifetime compared to existing approaches.
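To make contribution (1) concrete, the sketch below shows how an agent could pick a joint action over the two discrete action spaces (charging destination and charging upper threshold) under an ε-greedy policy. This is a minimal illustration under assumed interfaces: `select_joint_action`, its parameters, and the convention that `q_net` returns one Q-vector per action space are not taken from the paper.

```python
import random
import torch

def select_joint_action(q_net, state_seq, eps, n_sensors, n_thresholds=10):
    """Hypothetical epsilon-greedy selection over two discrete action spaces."""
    if random.random() < eps:
        # Explore: draw a destination and an upper threshold independently.
        return random.randrange(n_sensors), random.randrange(n_thresholds)
    with torch.no_grad():
        # Exploit: act greedily with respect to each head's Q-values.
        q_dest, q_thresh = q_net(state_seq)  # one Q-vector per action space
        return int(q_dest.argmax()), int(q_thresh.argmax())
```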
2. Related Work
2.1. Periodic Charging and On-Demand Charging Scheduling Approaches
2.2. Mobile Charging Scheduling Approaches Based on DRL
3. System Model and Problem Formulation
3.1. System Model
3.2. The Models of Energy Consumption and Charging
3.3. Problem Formulation
4. The Details of MDDRL-JSSTC
4.1. Formulation of MDDRL-JSSTC
- (1)
- State.
- (2)
- Action.
- (3)
- Transition.
- (4)
- Reward.
- (5)
- Policy.
- The MC can visit any position in the network while its remaining energy stays above the minimum energy threshold; once its remaining energy drops to this threshold, it terminates the current charging operation and returns to the charging service station to replace its battery;
- All sensors with charging demand greater than 0 can be selected as the next charging destination;
- If the remaining energy of a sensor has dropped to 0 (i.e., the sensor is dead), it will not be visited by the MC;
- Two adjacent time steps cannot choose the same charging destination (these constraints are illustrated by the action-masking sketch after this list).
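As an illustration of how these feasibility rules could be enforced in practice, the following is a minimal action-masking sketch in Python. It is an assumption-laden reading of the policy rules above, not code from the paper; the function name and all parameter names are hypothetical.

```python
import numpy as np

def feasible_destination_mask(demands, sensor_energy, last_dest, mc_energy, mc_min_energy):
    """Boolean mask over candidate next charging destinations.
    All names are illustrative; the rules follow the policy constraints above."""
    mask = np.ones(len(demands), dtype=bool)
    mask &= np.asarray(demands) > 0        # only sensors with positive charging demand
    mask &= np.asarray(sensor_energy) > 0  # dead sensors are never visited
    if last_dest is not None:
        mask[last_dest] = False            # adjacent steps cannot repeat a destination
    if mc_energy <= mc_min_energy:
        mask[:] = False                    # MC must return to the service station
    return mask
```

Such a mask can then be applied to the destination head's Q-values (e.g., setting infeasible entries to a very large negative number) before taking the argmax during action selection.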
4.2. The Details of MDDRL-JSSTC
- (1)
- The state information of each time step is embedded to obtain a high-dimensional vector, and this vector is used as the input of the Bi-GRU;
- (2)
- Feeding this input into the Bi-GRU yields the hidden state information and output information of each time step, where the hidden state is the concatenation of the forward and backward hidden states, $h_k = [\overrightarrow{h}_k; \overleftarrow{h}_k]$;
- (3)
- Two fully connected layers follow the Bi-GRU layer to produce the Q-values (see the sketch after this list).
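A minimal PyTorch sketch of the described architecture is given below. Layer sizes follow Table 3 where they survived extraction; `state_dim`, `n_sensors`, and the exact wiring of the table's "q-value output size = 1" head are assumptions, so this two-head arrangement is an approximation rather than the paper's exact network.

```python
import torch
import torch.nn as nn

class BiGRUQNetwork(nn.Module):
    """Embedding -> Bi-GRU -> fully connected layers -> Q-values per action space."""
    def __init__(self, state_dim, n_sensors, n_thresholds=10, embed_dim=64, hidden=64):
        super().__init__()
        # Embedding: hidden sizes 128/128, output size 64 (Table 3)
        self.embed = nn.Sequential(
            nn.Linear(state_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, embed_dim),
        )
        # Bi-GRU: input size 64, hidden size 64 (Table 3); bidirectional, so the
        # forward and backward hidden states concatenate to 2 * hidden
        self.bigru = nn.GRU(embed_dim, hidden, batch_first=True, bidirectional=True)
        self.fc = nn.Sequential(nn.Linear(2 * hidden, 128), nn.ReLU())
        self.q_dest = nn.Linear(128, n_sensors)       # action1: charging destination
        self.q_thresh = nn.Linear(128, n_thresholds)  # action2: charging upper threshold

    def forward(self, states):          # states: (batch, seq_len, state_dim)
        x = self.embed(states)
        h, _ = self.bigru(x)            # (batch, seq_len, 2 * hidden)
        z = self.fc(h[:, -1])           # hidden state at the last time step
        return self.q_dest(z), self.q_thresh(z)
```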
Algorithm 1 MDDRL-JSSTC

Input: the state information s; the maximum number of episodes; the maximum number of training steps K in one episode; the target network parameter update interval; the discount factor γ
Output: the learned parameters of the Q-network and the target network

1. Initialization: initialize the capacity of the experience replay buffer D and randomly initialize the weights of the Q-network
2. for episode = 1 to the maximum number of episodes do
3.  Initialize the network environment
4.  for k = 1 to K do
5.   Select an action a_k^1 (charging destination) from the first action space based on the ε-greedy policy at the k-th time step
6.   Select an action a_k^2 (charging upper threshold) from the second action space based on the ε-greedy policy at the k-th time step
7.   Combine a_k^1 and a_k^2 into an action pair
8.   a_k = (a_k^1, a_k^2)
9.   Perform the action a_k on the environment; the environment feeds back a reward r_k and a new state s_(k+1)
10.  Save the interaction process as the tuple (s_k, a_k, r_k, s_(k+1)) and store it in D
11.  Repeat steps 5–10
12.  Sample a minibatch randomly from D
13.  if any condition in (6) is not satisfied at the (j + 1)-th time step then
      break
14.  else
15.   Obtain the two-dimensional action values with (14)
      Update the parameters via (18)
16.  end if
17.  Update the parameters of the target network every fixed number of steps
18. end for
19. end for
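For concreteness, here is a minimal sketch of one update step of Algorithm 1 in PyTorch. Since Equations (6), (14), and (18) are not reproduced above, the sketch assumes standard DQN targets (one per action head, sharing the reward); `Transition`, `dqn_update`, and all field names are illustrative.

```python
import random
from collections import namedtuple
import torch
import torch.nn.functional as F

Transition = namedtuple("Transition", "state action1 action2 reward next_state done")

def dqn_update(q_net, target_net, buffer, optimizer, batch_size=1024, gamma=0.9):
    """One gradient step; `buffer` is a list of Transition tuples."""
    batch = random.sample(buffer, batch_size)
    s  = torch.stack([t.state for t in batch])
    s2 = torch.stack([t.next_state for t in batch])
    a1 = torch.tensor([t.action1 for t in batch])
    a2 = torch.tensor([t.action2 for t in batch])
    r  = torch.tensor([t.reward for t in batch], dtype=torch.float32)
    done = torch.tensor([t.done for t in batch], dtype=torch.float32)

    q1, q2 = q_net(s)                    # Q-values for both action spaces
    with torch.no_grad():                # bootstrap from the target network
        tq1, tq2 = target_net(s2)
        y1 = r + gamma * (1 - done) * tq1.max(dim=1).values
        y2 = r + gamma * (1 - done) * tq2.max(dim=1).values
    loss = F.mse_loss(q1.gather(1, a1.unsqueeze(1)).squeeze(1), y1) \
         + F.mse_loss(q2.gather(1, a2.unsqueeze(1)).squeeze(1), y2)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The target network would then be synchronized with the Q-network every fixed number of steps, as in line 17 of Algorithm 1.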
5. Experimental Results
5.1. Experimental Setup
5.2. Comparison Results of Charging Performance against the Baseline Approaches
6. Discussion
6.1. Impact of Sensor Energy Capacity on Charging Performance
6.2. Impact of MC Energy Capacity on Charging Performance
6.3. Impact of the Charging Speed on Charging Performance
6.4. Impact of the Moving Speed on Charging Performance
6.5. Comparison of Network Lifetime
7. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Kaswan, A.; Jana, P.K.; Das, S.K. A Survey on Mobile Charging Techniques in Wireless Rechargeable Sensor Networks. IEEE Commun. Surv. Tutor. 2022, 24, 1750–1779.
2. Kuila, P.; Jana, P.K. Energy Efficient Clustering and Routing Algorithms for Wireless Sensor Networks: Particle Swarm Optimization Approach. Eng. Appl. Artif. Intell. 2014, 33, 127–140.
3. Zhang, C.; Zhou, G.; Li, H.; Cao, Y. Manufacturing Blockchain of Things for the Configuration of a Data- and Knowledge-Driven Digital Twin Manufacturing Cell. IEEE Internet Things J. 2020, 7, 11884–11894.
4. Wang, Y.; Rajib, S.M.S.M.; Collins, C.; Grieve, B. Low-Cost Turbidity Sensor for Low-Power Wireless Monitoring of Fresh-Water Courses. IEEE Sens. J. 2018, 18, 4689–4696.
5. Shi, Y.; Xie, L.; Hou, Y.T.; Sherali, H.D. On Renewable Sensor Networks with Wireless Energy Transfer. In Proceedings of the 2011 IEEE INFOCOM, Shanghai, China, 10–15 April 2011; pp. 1350–1358.
6. Wang, C.; Li, J.; Ye, F.; Yang, Y. A Mobile Data Gathering Framework for Wireless Rechargeable Sensor Networks with Vehicle Movement Costs and Capacity Constraints. IEEE Trans. Comput. 2016, 65, 2411–2427.
7. Li, J.; Jiang, C.; Wang, J.; Xu, T.; Xiao, W. Mobile Charging Sequence Scheduling for Optimal Sensing Coverage in Wireless Rechargeable Sensor Networks. Appl. Sci. 2023, 13, 2840.
8. Lin, C.; Wu, Y.; Liu, Z.; Obaidat, M.S.; Yu, C.W.; Wu, G. GTCharge: A Game Theoretical Collaborative Charging Scheme for Wireless Rechargeable Sensor Networks. J. Syst. Softw. 2016, 121, 88–104.
9. He, L.; Kong, L.; Gu, Y.; Pan, J.; Zhu, T. Evaluating the On-Demand Mobile Charging in Wireless Sensor Networks. IEEE Trans. Mob. Comput. 2015, 14, 1861–1875.
10. Lin, C.; Zhou, J.; Guo, C.; Song, H.; Wu, G.; Obaidat, M.S. TSCA: A Temporal-Spatial Real-Time Charging Scheduling Algorithm for On-Demand Architecture in Wireless Rechargeable Sensor Networks. IEEE Trans. Mob. Comput. 2018, 17, 211–224.
11. Tomar, A.; Muduli, L.; Jana, P.K. A Fuzzy Logic-Based On-Demand Charging Algorithm for Wireless Rechargeable Sensor Networks with Multiple Chargers. IEEE Trans. Mob. Comput. 2021, 20, 2715–2727.
12. Cao, X.; Xu, W.; Liu, X.; Peng, J.; Liu, T. A Deep Reinforcement Learning-Based On-Demand Charging Algorithm for Wireless Rechargeable Sensor Networks. Ad Hoc Netw. 2021, 110, 102278.
13. Yang, M.; Liu, N.; Zuo, L.; Feng, Y.; Liu, M.; Gong, H.; Liu, M. Dynamic Charging Scheme Problem with Actor–Critic Reinforcement Learning. IEEE Internet Things J. 2021, 8, 370–380.
14. Bui, N.; Nguyen, P.L.; Nguyen, V.A.; Do, P.T. A Deep Reinforcement Learning-Based Adaptive Charging Policy for Wireless Rechargeable Sensor Networks. arXiv 2022.
15. Tang, L.; Chen, Z.; Cai, J.; Guo, H.; Wu, R.; Guo, J. Adaptive Energy Balanced Routing Strategy for Wireless Rechargeable Sensor Networks. Appl. Sci. 2019, 9, 2133.
16. Jiang, C.; Wang, Z.; Chen, S.; Li, J.; Wang, H.; Xiang, J.; Xiao, W. Attention-Shared Multi-Agent Actor–Critic-Based Deep Reinforcement Learning Approach for Mobile Charging Dynamic Scheduling in Wireless Rechargeable Sensor Networks. Entropy 2022, 24, 965.
17. Shu, Y.; Shin, K.G.; Chen, J.; Sun, Y. Joint Energy Replenishment and Operation Scheduling in Wireless Rechargeable Sensor Networks. IEEE Trans. Ind. Inform. 2017, 13, 125–134.
18. Malebary, S. Wireless Mobile Charger Excursion Optimization Algorithm in Wireless Rechargeable Sensor Networks. IEEE Sens. J. 2020, 20, 13842–13848.
19. Zhong, P.; Xu, A.; Zhang, S.; Zhang, Y.; Chen, Y. EMPC: Energy-Minimization Path Construction for Data Collection and Wireless Charging in WRSN. Pervasive Mob. Comput. 2021, 73, 101401.
20. Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv 2017.
Periodic charging: [5,6]; on-demand charging: [7–10]; DRL-based approaches: [12–14] and this work.

| | [5] | [6] | [7] | [8] | [9] | [10] | [12] | [13] | [14] | This Work |
|---|---|---|---|---|---|---|---|---|---|---|
| Is the MC involved? | √ | √ | √ | √ | √ | √ | √ | √ | √ | √ |
| Are sensors charged with WET? | √ | √ | √ | √ | √ | √ | √ | √ | √ | √ |
| Are energy consumption rates dynamic? | | | | | √ | √ | √ | √ | √ | √ |
| Is charging upper threshold adaptively adjusted? | | | | | | | | | | √ |
| Is the energy capacity of MC limited? | | | | | | √ | √ | √ | √ | √ |
| Is there charging sequence scheduling? | √ | √ | √ | √ | √ | √ | √ | √ | √ | √ |
| Is there a joint optimization? | | | | | | | | √ | √ | √ |
| Are there multiple action spaces? | | | | | | | | | | √ |
| Parameter | Value |
|---|---|
| Size of experience replay buffer | |
| Size of mini-batch | 1024 |
| Learning rate | 5 × 10⁻⁴ (JSSTC50, JSSTC100); 5 × 10⁻⁵ (JSSTC200) |
| Number of episodes | |
| Target network parameter update interval | 200 |
| Optimizer method | Adam [20] |
| Reward discount factor | 0.9 |
| Penalty coefficient | 5 |
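As a usage illustration, the snippet below wires the recoverable settings from this table to the `BiGRUQNetwork` sketch of Section 4.2. The buffer size, episode count, and network input sizes did not survive extraction, so the values marked as assumptions are placeholders.

```python
import torch

GAMMA = 0.9            # reward discount (Table 2)
BATCH_SIZE = 1024      # mini-batch size (Table 2)
LR = 5e-4              # JSSTC50/JSSTC100; use 5e-5 for JSSTC200 (Table 2)
TARGET_UPDATE = 200    # target network parameter update interval (Table 2)
BUFFER_SIZE = 100_000  # assumption: value not recoverable from the extracted table

# assumption: state_dim and n_sensors are placeholders for a 50-sensor network
q_net = BiGRUQNetwork(state_dim=104, n_sensors=50)
target_net = BiGRUQNetwork(state_dim=104, n_sensors=50)
target_net.load_state_dict(q_net.state_dict())
optimizer = torch.optim.Adam(q_net.parameters(), lr=LR)  # Adam [20]
```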
| Approach | Actor network: encoder | Actor network: decoder | Critic network |
|---|---|---|---|
| The proposed approach | Embedding (input size = , hidden size1 = 128, hidden size2 = 128, output size = 64); Q-network (Bi-GRU input size = 64, hidden size = 64, action1 output size = , action2 output size = 10, q-value output size = 1) (a single Q-network; no actor–critic split) | | |
| DRL-TCC | 2-layer MLP * (input size = , hidden size1 = 128, hidden size2 = 128, output size = 128) | Pointer network (input size = 128, hidden size = 256, output size = ) | 4-layer MLP * (input size = , hidden size1 = 256, hidden size2 = 128, hidden size3 = 20, output size = 1) |
| ACRL | 1-D CNN (input size = , hidden size = 128, output size = 128) | GRU (input size = 128, hidden size = 128, output size = ) | 3-layer MLP * (input size = , hidden size1 = 128, hidden size2 = 128, output size = 1) |
Approach | JSSTC10 | JSSTC50 | JSSTC100 | JSSTC200 |
---|---|---|---|---|
DRL-TCC | 0.9 | 0.7 | 0.5 | 0.4 |
ACRL | 0.9 | 0.7 | 0.6 | 0.5 |
TSCA | 0.9 | 0.8 | 0.7 | 0.6 |
NJNP | 0.9 | 0.8 | 0.7 | 0.6 |
GREEDY | 1 | 0.8 | 0.8 | 0.7 |
| Environment | Approach | | Std | | | Std | | | Std | |
|---|---|---|---|---|---|---|---|---|---|---|
| JSSTC50 | MDDRL-JSSTC | 11.16 | 0.956 | 0 | 18.58 | 1.532 | 5 | 23.11 | 2.571 | 9 |
| | DRL-TCC | 8.19 | 0.863 | 0 | 16.73 | 1.486 | 6 | 20.61 | 2.887 | 14 |
| | ACRL | 7.56 | 0.751 | 0 | 15.98 | 1.317 | 7 | 19.18 | 2.735 | 16 |
| | GREEDY | 5.16 | 1.032 | 0 | 10.66 | 1.458 | 8 | 14.03 | 2.933 | 20 |
| | TSCA | 6.37 | 0.764 | 0 | 11.62 | 1.237 | 8 | 15.87 | 2.717 | 17 |
| | NJNP | 4.79 | 0.813 | 0 | 9.51 | 1.519 | 9 | 12.88 | 2.997 | 19 |
| JSSTC100 | MDDRL-JSSTC | 11.16 | 1.131 | 0 | 23.07 | 1.508 | 7 | 25.76 | 2.813 | 19 |
| | DRL-TCC | 8.19 | 0.961 | 0 | 18.33 | 1.913 | 9 | 22.62 | 3.217 | 23 |
| | ACRL | 7.56 | 0.836 | 0 | 17.26 | 1.703 | 10 | 20.18 | 3.008 | 25 |
| | GREEDY | 5.16 | 0.968 | 0 | 11.87 | 1.522 | 15 | 15.56 | 3.115 | 32 |
| | TSCA | 6.37 | 0.825 | 0 | 13.91 | 1.415 | 12 | 17.01 | 2.916 | 27 |
| | NJNP | 4.79 | 0.912 | 0 | 10.78 | 1.768 | 14 | 14.63 | 3.101 | 30 |
| JSSTC200 | MDDRL-JSSTC | 11.16 | 1.212 | 0 | 24.68 | 2.353 | 22 | 27.13 | 2.987 | 40 |
| | DRL-TCC | 8.19 | 1.056 | 0 | 19.03 | 2.513 | 28 | 24.56 | 3.356 | 47 |
| | ACRL | 7.56 | 0.993 | 0 | 18.17 | 2.257 | 31 | 22.06 | 3.115 | 53 |
| | GREEDY | 5.16 | 1.186 | 0 | 12.12 | 2.391 | 43 | 16.18 | 2.834 | 88 |
| | TSCA | 6.37 | 0.866 | 0 | 14.62 | 2.213 | 39 | 18.22 | 2.971 | 69 |
| | NJNP | 4.79 | 0.783 | 0 | 11.97 | 2.465 | 36 | 15.79 | 3.213 | 72 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).