Open Access Article
Algorithms 2018, 11(2), 13; https://doi.org/10.3390/a11020013

muMAB: A Multi-Armed Bandit Model for Wireless Network Selection

1 Amadeus S.A.S., 485 Route du Pin Montard, 06902 Sophia Antipolis CEDEX, France
2 Department of Information Engineering, Electronics and Telecommunications, Sapienza University of Rome, Via Eudossiana 18, 00184 Rome, Italy
3 Laboratoire des Signaux et Systèmes (L2S), CentraleSupélec-CNRS-Université Paris-Sud, Université Paris-Saclay, 3, rue Joliot Curie, 91192 Gif-sur-Yvette, France
* Author to whom correspondence should be addressed.
Received: 6 December 2017 / Revised: 22 January 2018 / Accepted: 23 January 2018 / Published: 26 January 2018

Abstract

Multi-armed bandit (MAB) models are a viable approach to describe the problem of best wireless network selection by a multi-Radio Access Technology (multi-RAT) device, with the goal of maximizing the quality perceived by the final user. The classical MAB model does not, however, properly describe the problem of wireless network selection by a multi-RAT device, in which a device typically performs a set of measurements in order to collect information on available networks before a selection takes place. In fact, the classical MAB model foresees only one possible action for the player, namely the selection of one among different arms at each time step; existing arm selection algorithms thus mainly differ in the rule according to which a specific arm is selected. This work proposes a new MAB model, named measure-use-MAB (muMAB), aiming at providing higher flexibility, and thus better accuracy, in describing the network selection problem. The muMAB model extends the classical MAB model in a twofold manner: first, it foresees two different actions, to measure and to use; second, it allows actions to span multiple time steps. Two new algorithms designed to take advantage of the higher flexibility provided by the muMAB model are also introduced. The first one, referred to as measure-use-UCB1 (muUCB1), is derived from the well-known UCB1 algorithm, while the second one, referred to as Measure with Logarithmic Interval (MLI), is specifically designed for the new model so as to take advantage of the new measure action while aggressively using the best arm. The new algorithms are compared against existing ones from the literature in the context of the muMAB model, by means of computer simulations using both synthetic and captured data. Results show that the performance of the algorithms heavily depends on the Probability Density Function (PDF) of the reward received on each arm, with different algorithms leading to the best performance depending on the PDF. Results highlight, however, that as the ratio between the time required to use an arm and the time required to measure it increases, the proposed algorithms guarantee the best performance, with muUCB1 emerging as the best candidate when the arms are characterized by similar mean rewards, and MLI prevailing when one arm is significantly more rewarding than the others. This calls for the introduction of an adaptive approach capable of adjusting the behavior of the algorithm, or of switching algorithms altogether, depending on the acquired knowledge on the PDF of the reward on each arm.
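To make the measure/use distinction concrete, the following is a minimal, hypothetical Python sketch: it combines a classical UCB1 index with an explicit per-arm measurement phase and multi-step use actions. The arm means, the Gaussian reward model, and the ratio between the use and measure durations are illustrative assumptions only; the actual muUCB1 and MLI rules are specified in the full paper.

```python
import math
import random

# Minimal sketch (not the paper's exact muUCB1/MLI): a UCB1-style index
# combined with an explicit "measure" action and multi-step "use" actions.
# All parameters below (arm means, durations, horizon) are hypothetical.

ARM_MEANS = [0.5, 0.6, 0.8]      # assumed mean rewards of the arms (networks)
MEASURE_STEPS = 1                # duration of a measure action, in time steps
USE_STEPS = 10                   # duration of a use action, in time steps
HORIZON = 1000                   # total number of time steps

def sample_reward(arm):
    """Hypothetical reward model: Gaussian noise around the arm's mean."""
    return random.gauss(ARM_MEANS[arm], 0.1)

def ucb1_index(mean, pulls, total_pulls):
    """Classical UCB1 index: empirical mean plus exploration bonus."""
    return mean + math.sqrt(2.0 * math.log(total_pulls) / pulls)

counts = [0] * len(ARM_MEANS)    # number of observations per arm
means = [0.0] * len(ARM_MEANS)   # empirical mean reward per arm
t, total_reward = 0, 0.0

# Initialization: measure each arm once (no reward is accumulated here).
for arm in range(len(ARM_MEANS)):
    counts[arm] = 1
    means[arm] = sample_reward(arm)
    t += MEASURE_STEPS

while t < HORIZON:
    # Pick the arm with the highest UCB1 index and use it for USE_STEPS steps.
    best = max(range(len(ARM_MEANS)),
               key=lambda a: ucb1_index(means[a], counts[a], sum(counts)))
    for _ in range(USE_STEPS):
        r = sample_reward(best)
        total_reward += r
        counts[best] += 1
        means[best] += (r - means[best]) / counts[best]
        t += 1
        if t >= HORIZON:
            break

print(f"Accumulated reward over {HORIZON} steps: {total_reward:.1f}")
```

Increasing USE_STEPS relative to MEASURE_STEPS mimics the regime discussed in the abstract, where longer use actions make dedicated measurement comparatively cheap and favor the proposed measure-aware algorithms.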
Keywords: cognitive networking; wireless network selection; MAB

This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Share & Cite This Article

MDPI and ACS Style

Boldrini, S.; De Nardis, L.; Caso, G.; Le, M.T.P.; Fiorina, J.; Di Benedetto, M.-G. muMAB: A Multi-Armed Bandit Model for Wireless Network Selection. Algorithms 2018, 11, 13.


Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.
