
A Reinforcement-Learning-Based Distributed Resource Selection Algorithm for Massive IoT

by Jing Ma 1,*,†, So Hasegawa 1,†, Song-Ju Kim 2 and Mikio Hasegawa 1,†
1 Department of Electrical Engineering, Tokyo University of Science, Tokyo 125-8585, Japan
2 Graduate School of Media and Governance, Keio University, Fujisawa, Kanagawa 252-0882, Japan
* Author to whom correspondence should be addressed.
† Current address: Department of Electrical Engineering, Graduate School of Engineering, Katsushika Campus, 6-3-1 Niijyuku, Katsushika-ku, Japan.
Appl. Sci. 2019, 9(18), 3730;
Received: 27 June 2019 / Revised: 2 September 2019 / Accepted: 2 September 2019 / Published: 6 September 2019
Massive IoT, comprising very large numbers of resource-constrained IoT devices, has gained great attention. These devices generate enormous traffic, which causes network congestion. To manage network congestion, multi-channel-based algorithms have been proposed. However, most existing multi-channel algorithms require strict synchronization and extra overhead for negotiating channel assignment, which pose significant challenges to resource-constrained IoT devices. In this paper, a distributed channel selection algorithm utilizing tug-of-war (TOW) dynamics is proposed for improving successful frame delivery across the whole network by letting IoT devices adaptively select suitable channels for communication. The proposed TOW dynamics-based channel selection algorithm has a simple reinforcement learning procedure that needs only the acknowledgment (ACK) frame as feedback, and requires minimal memory and computation capability. Thus, the proposed TOW dynamics-based algorithm can run on resource-constrained IoT devices. We prototype the proposed algorithm on an extremely resource-constrained single-board computer, hereafter called the cognitive-IoT prototype. Moreover, the cognitive-IoT prototype is densely deployed in a frequently-changing radio environment for evaluation experiments. The evaluation results show that the cognitive-IoT prototype accurately and adaptively selects a suitable channel as the real environment varies, and accordingly the successful frame ratio of the network is improved.
Keywords: reinforcement learning; multi-armed bandit; IoT; distributed channel selection
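The abstract describes a TOW dynamics-based bandit that learns from ACK frames alone. The sketch below is a minimal illustration of that idea, not the paper's implementation: it assumes each channel keeps a single accumulator that gains +1 on a received ACK and loses a penalty ω on a missed ACK, and the device transmits on the channel whose accumulator plus a small random fluctuation is largest. The class name, parameter values, and the toy success probabilities are all hypothetical.

```python
import random

class TowChannelSelector:
    """Sketch of a tug-of-war (TOW) style bandit for channel selection.

    Each channel k keeps one accumulator q[k]: +1 when the transmission
    is acknowledged (ACK received), -omega when it is not. Selection is
    the argmax of q[k] plus a small fluctuation, so the device needs only
    O(#channels) memory and a few additions per slot.
    """

    def __init__(self, n_channels, omega=1.0, noise=0.1, rng=None):
        self.q = [0.0] * n_channels
        self.omega = omega          # penalty for a missed ACK (hypothetical value)
        self.noise = noise          # fluctuation amplitude for exploration
        self.rng = rng or random.Random(0)

    def select(self):
        # Pick the channel with the largest accumulator plus fluctuation.
        scores = [q + self.rng.uniform(-self.noise, self.noise) for q in self.q]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, channel, ack):
        # Learning step: the only feedback is whether an ACK arrived.
        if ack:
            self.q[channel] += 1.0
        else:
            self.q[channel] -= self.omega

# Toy run: channel 1 delivers frames 90% of the time, channel 0 only 20%.
env_rng = random.Random(42)
sel = TowChannelSelector(2, rng=random.Random(1))
success_prob = [0.2, 0.9]
picks = []
for _ in range(500):
    ch = sel.select()
    ack = env_rng.random() < success_prob[ch]
    sel.update(ch, ack)
    picks.append(ch)
print(sum(picks[-100:]) / 100)  # fraction of the last 100 slots spent on channel 1
```

In this toy setting the accumulator of the better channel drifts upward while the worse channel's drifts downward, so after a short transient nearly every transmission uses channel 1; the fluctuation term keeps a small amount of exploration so the device can re-adapt if the channel qualities change.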
MDPI and ACS Style

Ma, J.; Hasegawa, S.; Kim, S.-J.; Hasegawa, M. A Reinforcement-Learning-Based Distributed Resource Selection Algorithm for Massive IoT. Appl. Sci. 2019, 9, 3730.

Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.

