Abstract
In this article, we address the problem of prolonging the battery life of Internet of Things (IoT) nodes by introducing a smart energy harvesting framework for IoT networks supported by femtocell access points (FAPs), based on the principles of Contract Theory and Reinforcement Learning. Initially, the IoT nodes’ social and physical characteristics are identified and captured through the concept of IoT node types. Then, Contract Theory is adopted to capture the interactions between the FAPs, which provide personalized rewards, i.e., charging power, and the IoT nodes, which are incentivized to invest their effort, i.e., transmission power, to report their data to the FAPs. The IoT nodes’ and FAPs’ contract-theoretic utility functions are formulated, following the network economic concept of the involved entities’ personalized profit. A contract-theoretic optimization problem is introduced to determine the optimal personalized contract for each IoT node connected to a FAP, i.e., a pair of transmission and charging power, aiming to jointly guarantee the optimal satisfaction of all the involved entities in the examined IoT system. An artificial intelligence framework based on reinforcement learning is introduced to support the IoT nodes’ autonomous association to the most beneficial FAP in terms of long-term gained rewards. Finally, detailed simulation and comparative results are presented to show the pure operation performance of the proposed framework, as well as its drawbacks and benefits, compared to other approaches. Our findings show that the personalized contracts offered to the IoT nodes outperform a type-agnostic approach by a factor of four in terms of the achieved IoT system’s social welfare.
1. Introduction
The Internet of Things (IoT) has gained great research and industrial interest in the last decade, as it enables the operation and collaboration of a large number of devices with different communication and computing capabilities, such as sensors, actuators, smartphones, and others [1]. Those IoT devices collect and report information to several types of applications in order to support the end-users’ needs and deliver meaningful services, such as environmental monitoring, social networking and surveillance systems [2]. The exploitation of the IoT devices’ physical and social characteristics can create efficient coalitions among them, to better serve a common goal in the system, e.g., crowdsourcing, surveillance of an area of interest, and in-home healthcare [3]. A common characteristic of the IoT devices is their frequent transmission of data to a receiver, e.g., an access point or a multi-access edge computing server, for further processing and planning of the delivered services [4]. Even though the amount of transmitted data is usually small, the frequent transmissions reduce the battery life of the IoT devices, which often have limited power resources, and their battery replacement is a difficult and costly task [5]. Thus, energy harvesting from radio frequency signals by deploying a wireless powered communication system has arisen as a suitable means to prolong the IoT devices’ battery life [6]. In this paper, we introduce a smart energy harvesting method by exploiting the principles of Contract Theory, and an artificial intelligence model to support the autonomous IoT nodes’ association to femtocell access points based on Reinforcement Learning.
1.1. Related Work
The topic of energy harvesting by IoT devices has been thoroughly studied in the literature, mainly focusing on the technical and implementation aspects of the problem [7]. The authors in [8] identify the problem of the limited battery of the IoT nodes and they provide a short survey regarding the existing energy harvesting technologies, and the corresponding power management techniques to sparingly use the harvested energy. In [9], the authors aim to jointly optimize the data flow from the IoT nodes to the users and the IoT nodes’ battery usage, by deploying IoT gateways and energy transmitters to save the energy used for the transmissions and charge the IoT nodes in parallel, respectively. Furthermore, a game-theoretic approach is adopted based on the theory of Stackelberg games, where the IoT gateways optimize the data caching and incentivize the energy transmitters to charge the IoT nodes, by determining their optimal transmission power strategy. A detailed survey study is presented in [10] that identifies the currently available IoT energy harvesting systems, the corresponding energy distribution approaches, and the energy storage devices and control units that facilitate the IoT nodes’ energy harvesting process. The provided categorization of the energy harvesting systems enables the reader to identify the differences among the existing energy harvesting techniques and the corresponding energy distribution approaches, concluding with the most appropriate selection per realistic use case scenario. A predictive energy harvesting model is introduced in [11] by exploiting the extended Kalman filtering method and jointly guaranteeing the Quality of Service (QoS) requirements and several security protection levels in the IoT system. 
The proposed predictive energy harvesting model can enable the IoT system to plan its energy harvesting needs per connected IoT node and proactively adapt its operation and energy consumption based on the trade-off of energy demand and energy availability.
The exploitation of multiple energy harvesting sources and techniques, such as solar, radio frequency, thermal, artificial light, is studied in [12], by introducing a hybrid energy harvesting model for the IoT nodes that can jointly support the energy harvesting from several sources of energy. The authors provide a mathematical analysis to prove the energy harvesting benefits in terms of the amount of the harvested energy and the efficiency in the energy harvesting process via multiple and hybrid energy harvesting sources compared to a single source of energy harvesting. Focusing on wireless powered communications systems and radio frequency energy harvesting, the authors, in [13], describe a game-theoretic and labor economics-based approach to deal with the optimal energy harvesting under complete and incomplete information scenarios, respectively, regarding the channel conditions among the IoT nodes and the energy transmitters. A Lyapunov optimization-based approach is formulated in [14] to jointly optimize the frequency and the stability of the sampling rate of the IoT energy harvesting nodes showing the increase in the amount of harvested energy. The main novelty of the proposed method is its real time operation and adaptation to the IoT system’s conditions and energy availability of the nodes and the access points without making any assumptions nor predictions on future energy availability patterns. Furthermore, an on-demand energy harvesting model is proposed in [15] towards improving the delay performance of the radio frequency energy harvesting process by introducing two associated discrete time Markov chain models that jointly optimize the average packet delay, the packet loss probability, and the network throughput. The novel concept of directed radio frequency signals charging in a unicast manner each IoT node is also introduced in [15]. 
The proposed method can charge each IoT node in a personalized manner by transmitting directed radio-frequency beams to the node, thus, increase the amount of harvested energy by the IoT node.
A deep reinforcement learning approach of the actor-critic deep Q-network reinforcement learning algorithms [16] is presented in [17] to jointly address the access and power transmission and harvesting problem of the IoT nodes by considering the sum rate and prediction loss. The importance of IoT energy harvesting nodes in public safety scenarios is discussed in [18], where the IoT nodes create coalitions among each other based on their physical and socio-technical characteristics, which are further exploited by a mobile Unmanned Aerial Vehicle (UAV) in order to select the IoT node cluster that will be charged. This research work is extended in [19] by jointly optimizing the nodes’ transmission power to further report their data to an access point. Additionally, an energy-harvesting-aware routing algorithm is presented in [20] to jointly improve the IoT nodes’ battery life and the IoT network’s Quality of Service under different traffic loads and energy availability conditions. A practical application on IoT energy harvesting nodes is introduced in [21], where the IoT nodes measure the vibration conditions of railway tracks, and report them to a reader, in order to monitor the railway track conditions. The IoT nodes are installed on the railway tracks and harvest radio frequency energy from a reader installed on the train.
Following the above analysis, it is concluded that great attention has been devoted to the technical and implementation aspects of energy harvesting in IoT systems. Specifically, the most recent approaches that have been reviewed above mainly focus on the improvement in the amount of harvested energy, either by providing directed radio-frequency beams from the transmitter to the receiver, or by improving the efficiency of the allocated charging power to the IoT nodes, or even by optimizing the energy consumption of the IoT nodes; thus, greater energy availability is achieved. However, the reviewed approaches have not fully exploited the IoT nodes’ physical and social characteristics during the energy harvesting process, and their interactions with the energy transmitters [22], in order to ultimately optimize the energy harvesting process.
To address these issues, in this paper, we design a contract-theoretic approach to capture the interactions among the IoT energy harvesting nodes and the energy transmitters [23,24]. Our goal is to determine the optimal IoT nodes’ harvested energy with respect to the amount of data that they transmit, and the energy transmitters’ optimal charging power. We also introduce an artificial-intelligence-based mechanism to enable the IoT devices to select the most beneficial energy transmitter based on their energy harvesting experience [25].
1.2. Contributions & Outline
The increasing number of Internet of Things (IoT) nodes and their corresponding need to extend their battery life in order to support IoT services have elevated the need to address the problem of energy harvesting from radio frequency signals in a wireless powered communication system. The ultimate goal of this approach is to guarantee the smooth operation of the overall IoT system and prolong its seamless operation. To the best of our knowledge, this is the first research work that systematically studies the energy harvesting process in an IoT system from a techno-economics and artificial intelligence point of view. We introduce the concept of IoT energy harvesting node types, which are expressed as a function of their communication interest, proximity to the energy transmitter and each other, and their energy conversion efficiency. The IoT nodes’ and the access points’ utility functions are designed to represent the profit of the different entities from the energy harvesting and data acquisition process, respectively. The main contributions of this paper are summarized as follows:
- Based on the principles of Contract Theory, an optimization problem is formulated and solved to determine the IoT nodes’ transmission power, transmitted data to the associated access point, and the energy transmitters’ optimal charging power, in order for the overall system to converge to an optimal and stable point of operation;
- An artificial-intelligence-based reinforcement learning mechanism is introduced, which targets the most beneficial long-term energy transmitter selection from each IoT energy harvesting node in an autonomous and distributed manner.
The rest of this paper is organized as follows. The system model is discussed in Section 2. The IoT node types and all the involved entities’ utility functions are presented in Section 3.1. The contract-theoretic optimization problem is formulated in Section 3.2 and solved in Section 3.3. The artificial intelligent energy transmitters’ selection by the IoT nodes is discussed in Section 4. Numerical results are presented in Section 5, and the conclusions are drawn in Section 6.
2. System Model
A femtocell-based communications network is considered consisting of femtocells with overlapping coverage range in the examined communications environment and their set is . The femtocell access points (FAPs) jointly act as data receivers from the connected IoT nodes and energy transmitters [26]. A set of IoT energy harvesting nodes is considered. The distance among two IoT nodes is denoted as , while the distance of an IoT node from a FAP is . The overall system operates as a wireless powered communication network (WPCN), where the Wireless Energy Transfer (WET), and the Wireless Information Transmission (WIT) phases are executed within a timeslot . The WET and WIT phases’ duration is denoted as and , respectively, with . The considered system model is presented in Figure 1.
Figure 1.
Smart Energy Harvesting for Internet of Things Networks.
The IoT nodes can communicate with each other in order to exchange the information needed to perform a task, e.g., temperature sensors measuring the temperature in a smart building [27]. We define the relationship factor between two IoT nodes. A higher value of the relationship factor shows a higher level of communication interest between two IoT nodes. The communication channel gain conditions between two IoT nodes and between an IoT node and a FAP are defined as , , respectively, where capture the fading phenomena. At each timeslot , each IoT node has some available energy , which indicates its maximum possible transmission power during the WIT phase, as . Each IoT node harvests energy during the WET phase, and invests energy to transmit its data to the FAP during the WIT phase. Thus, the available energy of each IoT node for the next timeslot , is determined as . The transmission power of the IoT node i, in order to report its data to the FAP f, is denoted as , while the FAP’s personalized charging power for the IoT node i is . The FAP uses directional beams in order to improve the efficiency of energy harvesting [15]. Considering the non-orthogonal multiple access (NOMA) technique in the uplink communication from the IoT nodes to the FAPs, and the Successive Interference Cancellation (SIC) technique implemented at the FAPs, each IoT node’s achievable data rate is given as follows based on Shannon’s formula [28]
where W [Hz] is the system’s bandwidth and is the power of zero-mean Additive White Gaussian Noise (AWGN). It is noted that without loss of generality, we consider , thus, by implementing the SIC technique, the signal of the IoT node with the highest channel gain is decoded first at the corresponding FAP, as presented in Equation (1). Given that the IoT devices reside in a small area, we account for the interference stemming from all the IoT nodes’ transmissions, even if they are connected in different FAPs [29]. The acronyms and the notation adopted in this paper are presented in Table 1 and Table 2, respectively.
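The decoding order described above can be illustrated with a minimal Python sketch. It is not a reproduction of Equation (1): it assumes a single receiver, nodes sorted by decreasing channel gain, and that the node decoded first experiences interference from all nodes decoded after it. The function name `noma_sic_rates` and its parameters are hypothetical.

```python
import math


def noma_sic_rates(powers, gains, bandwidth_hz, noise_power):
    """Shannon rates for uplink NOMA with SIC (illustrative assumption).

    Nodes must be sorted by decreasing channel gain. The strongest node is
    decoded first and sees the signals of all weaker nodes as interference;
    each decoded signal is then removed before decoding the next node.
    """
    if any(b > a for a, b in zip(gains, gains[1:])):
        raise ValueError("gains must be sorted in decreasing order")
    received = [p * g for p, g in zip(powers, gains)]
    rates = []
    for i, signal in enumerate(received):
        # Interference comes only from nodes that have not been decoded yet.
        interference = sum(received[i + 1:])
        sinr = signal / (noise_power + interference)
        rates.append(bandwidth_hz * math.log2(1.0 + sinr))
    return rates


if __name__ == "__main__":
    print(noma_sic_rates([1.0, 1.0], [2.0, 1.0], bandwidth_hz=1.0, noise_power=1.0))
```

Interference from nodes attached to other FAPs, which the system model also accounts for, could be added to the denominator as an extra term; it is omitted here for brevity.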
Table 1.
List of Acronyms.
Table 2.
Summary of Key Notations.
3. Contract Theoretic Energy Harvesting
In this section, we will exploit the principles of Contract Theory towards capturing the interactions among the IoT energy harvesting nodes and the FAPs, in terms of transmitting data and harvesting energy and charging the nodes, respectively. Assuming that each IoT node has selected the FAP that it will communicate with and harvest energy from (details in Section 4), each FAP acts as a virtual “employer”, offering personalized rewards to each connected IoT node, in terms of charging power towards incentivizing the nodes, which act as virtual “employees”, to invest an effort—translated in their transmission power—to report their collected data to the FAP for further exploitation by the IoT service that is offered to the end-users, e.g., smart heating systems.
3.1. Types, Utility Functions, and Contracts
Each IoT node is characterized by its type, which depends on the node’s physical and social characteristics within the IoT network. Those characteristics are summarized in the socio-physical factor , the proximity factor , and the energy conversion efficiency factor . Towards building the socio-physical factor for each node i, we initially consider the channel gain symmetric matrix , and create the channel quality vector . The latter is a simple and indicative factor of the communication channel conditions of each node i with all the other IoT nodes within the examined IoT network. We normalize the channel quality vector, as , where and . Furthermore, we consider the communication interest factor among two IoT nodes , capturing the need of two IoT nodes to exchange information among each other in order to perform an IoT service. We define the communication interest symmetric matrix and create the communication interest vector . Thus, we obtain the normalized communication interest vector , where shows the relative communication interest of each node i with all the other IoT nodes in the network. By jointly combining the normalized communication interest and channel quality indicators, we conclude with the socio-physical factor .
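The construction of the socio-physical factor can be sketched in Python as follows. The exact aggregation, normalization, and combination rules are not recoverable from the text above, so this sketch assumes row sums (excluding the diagonal) for both vectors, normalization by the maximum entry, and an element-wise product for the final combination. All function names are hypothetical.

```python
def row_sums_excluding_self(matrix):
    """Aggregate each node's row of a symmetric matrix, skipping the diagonal."""
    return [
        sum(value for j, value in enumerate(row) if j != i)
        for i, row in enumerate(matrix)
    ]


def normalize_by_max(vector):
    """Scale a non-negative vector so that its largest entry equals 1 (assumed rule)."""
    peak = max(vector)
    if peak <= 0:
        raise ValueError("vector must contain a positive entry")
    return [value / peak for value in vector]


def socio_physical_factors(channel_gain, interest):
    """Combine normalized channel quality and communication interest per node.

    The element-wise product is an assumption made for illustration only.
    """
    quality = normalize_by_max(row_sums_excluding_self(channel_gain))
    social = normalize_by_max(row_sums_excluding_self(interest))
    return [q * s for q, s in zip(quality, social)]


if __name__ == "__main__":
    gains = [[0, 2, 2], [2, 0, 0], [2, 0, 0]]
    interests = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
    print(socio_physical_factors(gains, interests))
```

Under these assumptions, a node scores highly only when it has both good channel conditions towards its peers and a strong communication interest with them.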
Additionally, each node i being associated with FAP f is characterized by the proximity factor , which expresses the node i’s normalized distance from the FAP f, with respect to the FAP’s maximum coverage range. Each node is characterized by its energy conversion efficiency factor , which shows how efficiently the node can convert the harvested energy from the FAP’s directed radio frequency beam to energy that can be exploited for its operations, e.g., data transmission. Considering the aforementioned three factors, the type of each IoT node is defined as follows
Each node invests an effort in order to transmit its data to the FAP, which is translated to its uplink transmission power . For simplicity in the notation, we have omitted the timeslot indicator in the rest of the analysis. Furthermore, the FAP incentivizes each IoT node, which is connected to this FAP, to report its data by charging it with directed radio frequency beams. The FAP’s personalized reward to the node i is denoted as , and the corresponding power of the directed radio frequency beam is , where is the FAP f’s available charging power. Thus, the IoT node’s harvested energy in a timeslot during the WET phase, as discussed in Section 2, is , while the corresponding energy invested to its data transmission during the WIT phase is .
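The per-timeslot energy bookkeeping of a node can be sketched as below. The sketch assumes that the harvested energy is the product of the conversion efficiency, the channel gain, the charging power, and the WET duration, that the invested energy is the transmission power times the WIT duration, and that the transmission power is capped by the energy available at the start of the timeslot. The function name and argument names are hypothetical.

```python
def next_available_energy(energy, charging_power, tx_power,
                          conversion_efficiency, channel_gain,
                          wet_duration, wit_duration):
    """Return the node's available energy for the next timeslot (assumed model)."""
    # The available energy bounds the transmission power during the WIT phase.
    max_tx_power = energy / wit_duration
    if tx_power > max_tx_power:
        raise ValueError("transmission power exceeds the available energy budget")
    # Energy harvested from the FAP's directed beam during the WET phase.
    harvested = conversion_efficiency * channel_gain * charging_power * wet_duration
    # Energy invested in reporting data during the WIT phase.
    invested = tx_power * wit_duration
    return energy + harvested - invested


if __name__ == "__main__":
    print(next_available_energy(1.0, 2.0, 0.5, 0.5, 0.1, 0.4, 0.6))
```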
Each IoT node evaluates the received reward from the FAP based on the evaluation function on , which is a strictly increasing function with respect to the received reward, e.g., . In practice, the evaluation function captures the node’s required charging power. Therefore, each IoT node’s utility function is defined by the revenue that the IoT node enjoys from the charging process (first term of Equation (3)), while considering the cost of its data transmission due to its invested transmission power (second term of Equation (3))
where is the IoT node’s experienced cost to transmit its data by investing its transmission power.
Focusing on the benefit of each FAP from collecting data from the IoT nodes, we express its utility as the profit gained from the IoT nodes’ invested effort, while considering the cost to provide the rewards. Each FAP is not aware of the IoT nodes’ type; thus, we define the probability , with , that node i is of type . Therefore, each FAP’s , utility function is defined as follows
where are the IoT node types, rewards and effort vectors, respectively, and is the FAP’s cost of providing the rewards, due to the energy spent on charging the nodes.
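The two utility functions can be illustrated with a short Python sketch. The functional forms are assumptions commonly used in contract-theoretic models and are not taken from Equations (3) and (4): the node's evaluation function is the square root of the reward, the node's cost is linear in its effort, and the FAP's profit per node is the effort minus a linear reward cost, weighted by the type probability. All names are hypothetical.

```python
import math


def node_utility(node_type, reward, effort, unit_cost):
    """Node utility: type-weighted evaluation of the reward minus the effort cost."""
    return node_type * math.sqrt(reward) - unit_cost * effort


def fap_expected_utility(type_probs, rewards, efforts, reward_cost):
    """FAP utility: expected profit from the efforts minus the cost of the rewards."""
    return sum(
        prob * (effort - reward_cost * reward)
        for prob, reward, effort in zip(type_probs, rewards, efforts)
    )


if __name__ == "__main__":
    print(node_utility(2.0, 4.0, 1.0, 0.5))
    print(fap_expected_utility([0.5, 0.5], [4.0, 9.0], [1.0, 2.0], 0.1))
```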
3.2. Problem Formulation
In this section, we will formulate the problem of optimal energy harvesting and charging as a contract-theoretic optimization problem, as follows.
In the following description, we discuss the physical meaning of the optimization problem formulated above in detail. To determine the optimal harvested power by the IoT nodes, and the optimal charging power provided by each FAP to each connected IoT node, the profit/benefits of the FAPs and the IoT nodes should be jointly optimized, as presented in (5a)–(5d). Each FAP aims to optimize its utility function (5a) towards determining the optimal contract .
It should be noted that the optimization problem (5a)–(5d) is solved by each FAP and the corresponding IoT nodes connected to it. Thus, we solve as many optimization problems as the number of FAPs in the examined system, while considering that each IoT node should at least receive a positive utility (Equation (5b)) in order to be incentivized to participate in the IoT network. The latter condition (Equation (5b)) is referred to as Individual Rationality (IR). Furthermore, each node achieves a higher utility when receiving the contract designed for its unique characteristics, i.e., type, as compared to any other contract designed for another node (Equation (5c)). This condition is referred to as Incentive Compatibility (IC).
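The IR and IC conditions can be checked numerically for any candidate menu of contracts. The sketch below assumes a node utility of the form type times the square root of the reward, minus a linear effort cost; this form, the tolerance, and the function names are illustrative assumptions rather than the paper's exact formulation.

```python
import math


def utility(node_type, reward, effort, unit_cost):
    """Assumed node utility: type * sqrt(reward) - unit_cost * effort."""
    return node_type * math.sqrt(reward) - unit_cost * effort


def is_individually_rational(types, rewards, efforts, unit_cost, tol=1e-9):
    """IR: every node obtains a non-negative utility from its own contract."""
    return all(
        utility(t, r, q, unit_cost) >= -tol
        for t, r, q in zip(types, rewards, efforts)
    )


def is_incentive_compatible(types, rewards, efforts, unit_cost, tol=1e-9):
    """IC: no node strictly prefers a contract designed for another type."""
    for i, node_type in enumerate(types):
        own = utility(node_type, rewards[i], efforts[i], unit_cost)
        for j in range(len(types)):
            other = utility(node_type, rewards[j], efforts[j], unit_cost)
            if j != i and other > own + tol:
                return False
    return True


if __name__ == "__main__":
    types, rewards, efforts = [1.0, 2.0], [1.0, 4.0], [1.0, 3.0]
    print(is_individually_rational(types, rewards, efforts, 1.0))
    print(is_incentive_compatible(types, rewards, efforts, 1.0))
```

In the example menu, the lower type receives exactly zero utility and the higher type is indifferent between its own contract and the lower type's contract, so both conditions hold with equality.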
Additionally, for notation convenience, we sort the types of the IoT nodes as . Towards further elaborating on the constraint of Equation (5d), we analyze and prove the conditions of fairness, monotonicity, and rationality in the following three propositions.
Proposition 1.
(Fairness) An IoT node of higher (or the same) type will receive a higher (or the same) reward, i.e., .
Proof.
See Appendix A.1. □
Based on the fairness condition, an IoT node of a higher type, i.e., improved socio-physical characteristics, will enjoy a higher reward from the FAP, i.e., increased charging power.
Proposition 2.
(Monotonicity) An IoT node of higher type, i.e., , will invest a higher effort, i.e., .
Proof.
See Appendix A.2. □
The physical meaning of the monotonicity property is that an IoT node of better socio-physical characteristics, i.e., type , is expected to report a greater amount of information by investing more uplink transmission power, i.e., effort . Thus, the FAP will provide a greater reward through an increased charging power. The last condition that is examined is rationality.
Proposition 3.
(Rationality) An IoT node of higher type, i.e., , will eventually experience higher utility, i.e., .
Proof.
See Appendix A.3. □
The conditions of fairness, monotonicity, and rationality are presented in a combined manner in Equation (5d).
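The three conditions can be verified on a candidate menu once the types are sorted in increasing order. The following sketch assumes the same illustrative utility form as before (type times the square root of the reward, minus a linear effort cost); the function names are hypothetical.

```python
import math


def is_non_decreasing(values, tol=1e-9):
    """True when each entry is at least as large as the previous one."""
    return all(b >= a - tol for a, b in zip(values, values[1:]))


def check_propositions(types, rewards, efforts, unit_cost):
    """Check fairness, monotonicity, and rationality for types sorted ascending."""
    if not is_non_decreasing(types):
        raise ValueError("types must be sorted in increasing order")
    utilities = [
        t * math.sqrt(r) - unit_cost * q  # assumed utility form
        for t, r, q in zip(types, rewards, efforts)
    ]
    return {
        "fairness": is_non_decreasing(rewards),       # Proposition 1
        "monotonicity": is_non_decreasing(efforts),   # Proposition 2
        "rationality": is_non_decreasing(utilities),  # Proposition 3
    }


if __name__ == "__main__":
    print(check_propositions([1.0, 2.0], [1.0, 4.0], [1.0, 3.0], 1.0))
```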
3.3. Problem Solution
In this section, our goal is to solve the contract-theoretic optimization problem, as presented in Equations (5a)–(5d), under the scenarios of complete and incomplete information from the FAPs perspective regarding the IoT nodes’ socio-physical characteristics, i.e., types. The solution of the contract-theoretic optimization problems, which are solved by each FAP along with its connected IoT nodes, will result in determining the optimal contracts . Based on this solution, the optimal charging power of each FAP to each connected node will be determined, as well as the optimal transmission power of each IoT node.
Complete Information Scenario: In this scenario, the FAPs know the types of the IoT nodes in a deterministic manner, thus, the contract-theoretic optimization problem (5a)–(5d) can be rewritten, as follows.
Theorem 1.
(Optimal Contract under Complete Information) The optimal contract among an IoT node i connected to the FAP f considering complete information of the IoT nodes’ types is .
Proof.
See Appendix A.4. □
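To make the complete information case concrete, the following sketch works through a closed-form contract under assumed functional forms; it is not the expression of Theorem 1. It assumes a node utility of type times the square root of the reward minus a linear effort cost, a FAP profit per node equal to the effort minus a linear reward cost, and no upper bound on the charging power. With the IR constraint binding (zero node utility), the effort is `theta * sqrt(r) / kappa`, and maximizing the FAP's profit over `r` gives `sqrt(r) = theta / (2 * kappa * c)`. The symbols `theta`, `kappa`, and `c` are hypothetical names.

```python
import math


def complete_information_contract(theta, kappa, cost):
    """Closed-form (reward, effort) pair under the assumed utility forms."""
    sqrt_reward = theta / (2.0 * kappa * cost)
    reward = sqrt_reward ** 2
    # Binding IR: theta * sqrt(reward) - kappa * effort = 0.
    effort = theta * sqrt_reward / kappa
    return reward, effort


def fap_profit(theta, kappa, cost, reward):
    """FAP profit from one node when the node is held at zero utility."""
    effort = theta * math.sqrt(reward) / kappa
    return effort - cost * reward


if __name__ == "__main__":
    r_star, q_star = complete_information_contract(theta=2.0, kappa=1.0, cost=0.5)
    print(r_star, q_star, fap_profit(2.0, 1.0, 0.5, r_star))
```

Under these assumptions the node's utility is exactly zero at the optimum, which agrees with the remark that the FAPs offer the minimum sufficient reward when they know the nodes' types.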
The complete information scenario is an ideal case, and will mainly be used for benchmarking purposes. In practice, the FAPs have limited information regarding the IoT nodes’ socio-physical characteristics, i.e., types. Thus, in the following analysis, we examine the scenario of incomplete information regarding the IoT nodes’ types.
Incomplete Information Scenario: In the following analysis, we examine the contract-theoretic optimization problem that was presented in (5a)–(5d) under the incomplete information scenario. Initially, we perform a reduction in the individual rationality conditions in Equation (5b). Based on the monotonicity and incentive compatibility conditions, we have that: . Given that , we can rewrite the above inequality as follows: . Thus, we conclude that the individual rationality condition holds true for all the IoT nodes, if holds true. The latter constraint can be further reduced to , as the FAP will provide the minimum sufficient reward to the IoT nodes to participate in the IoT network. Thus, the constraint (5b) is equivalent to .
Next, our goal is to reduce the incentive compatibility (IC) constraints, as presented in Equation (5c). The following terminology is used in order to represent the IC constraints: (i) : downward IC constraints; (ii) : local downward IC constraints; (iii) : upward IC constraints; and (iv) : local upward IC constraints.
Lemma 1.
All the downward IC constraints are equivalent to the local downward IC constraint.
Proof.
See Appendix A.5. □
Following the same philosophy, we state the following Lemma.
Lemma 2.
All the upward IC constraints are equivalent to the local downward IC constraint.
Proof.
See Appendix A.6. □
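The effect of the reduced constraints can be illustrated as follows. The sketch assumes, as is standard in adverse selection models, that the IR constraint of the lowest type and every local downward IC constraint hold with equality, and it reuses the illustrative utility form of type times the square root of the reward minus a linear effort cost. Given a non-decreasing reward vector, these equalities determine the efforts recursively. The sketch does not compute the optimal rewards; the function name is hypothetical.

```python
import math


def efforts_from_binding_constraints(types, rewards, unit_cost):
    """Derive efforts from binding IR (lowest type) and local downward IC.

    types and rewards must be sorted by increasing type. Assumed relations:
      q_1 = theta_1 * e(r_1) / kappa
      q_i = q_{i-1} + theta_i * (e(r_i) - e(r_{i-1})) / kappa
    with the illustrative evaluation function e(r) = sqrt(r).
    """
    efforts = []
    for i, (theta, reward) in enumerate(zip(types, rewards)):
        if i == 0:
            efforts.append(theta * math.sqrt(reward) / unit_cost)
        else:
            gain = math.sqrt(reward) - math.sqrt(rewards[i - 1])
            efforts.append(efforts[-1] + theta * gain / unit_cost)
    return efforts


if __name__ == "__main__":
    print(efforts_from_binding_constraints([1.0, 2.0, 3.0], [1.0, 4.0, 9.0], 1.0))
```

With these efforts, the lowest type obtains zero utility and each higher type is indifferent between its own contract and the contract of the type immediately below it.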
Based on the above analysis of reducing the constraints, we can rewrite the initial contract-theoretic optimization problem as follows:
4. Artificial Intelligent Association
In this section, we introduce an artificial-intelligence-based reinforcement learning mechanism to enable the IoT nodes to make the most beneficial long-term energy transmitter (i.e., FAP) selection in an autonomous and distributed manner. Our study focuses on the Log-Linear reinforcement learning algorithms, such as the Max Log-Linear and the Binary Log-Linear algorithms, which are able to converge to the best equilibrium point (if one exists) of the system with high probability. Additionally, the Log-Linear algorithms allow the IoT nodes to deviate from their probabilistically optimal decisions and make some suboptimal decisions in order to thoroughly explore their available actions. In this paper, we adopt the Max Log-Linear mechanism that requires no exchange of information among the IoT nodes and the FAPs. Each IoT node aims to learn, in the long term, the most beneficial choice of FAP; thus, its strategy space is . Initially, each IoT node selects a strategy with equal probability , where denotes the iteration of the Max Log-Linear algorithm. Then, at each iteration, one IoT node is randomly selected to explore an alternative strategy with equal probability , and receives a corresponding utility . The selected IoT node updates its strategy following the probabilistic learning rules in Equation (8a) and Equation (8b), while the rest of the IoT nodes keep their previously selected strategies unchanged, i.e., learning phase.
The pseudo-code of the introduced Max Log-Linear algorithm that enables the IoT nodes to select a FAP, which they can harvest energy from and communicate with the selected FAP, is presented in Algorithm 1. The outcome of the Max Log-Linear algorithm will be the stable selection of FAPs from the IoT nodes.
Algorithm 1.
Max Log-Linear Algorithm.
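The learning loop can be illustrated with a minimal Python sketch. Since the update rules of Equations (8a) and (8b) are not reproduced here, the sketch substitutes the Binary Log-Linear rule, in which the selected node keeps the explored FAP with probability exp(beta * U_trial) / (exp(beta * U_current) + exp(beta * U_trial)). The toy utility, in which nodes attached to the same FAP share its charging power equally, as well as the parameter values and function names, are hypothetical and stand in for the contract-based utility.

```python
import math
import random


def switch_probability(current_utility, trial_utility, beta):
    """Binary log-linear probability of adopting the explored strategy."""
    x = beta * (trial_utility - current_utility)
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def learn_association(num_nodes, num_faps, utility, beta=5.0,
                      iterations=2000, seed=0):
    """One randomly selected node explores per iteration; the others stay put."""
    rng = random.Random(seed)
    assoc = [rng.randrange(num_faps) for _ in range(num_nodes)]
    for _ in range(iterations):
        node = rng.randrange(num_nodes)
        trial = rng.randrange(num_faps)
        if trial == assoc[node]:
            continue
        candidate = assoc.copy()
        candidate[node] = trial
        p = switch_probability(utility(node, assoc), utility(node, candidate), beta)
        if rng.random() < p:
            assoc = candidate
    return assoc


def shared_power_utility(node, assoc, fap_power=(1.0, 1.0)):
    """Toy utility: the chosen FAP's power is split equally among its nodes."""
    fap = assoc[node]
    return fap_power[fap] / assoc.count(fap)


if __name__ == "__main__":
    print(learn_association(num_nodes=4, num_faps=2, utility=shared_power_utility))
```

A larger `beta` makes the nodes more likely to keep only improving choices, while a smaller `beta` allows more exploration of suboptimal FAPs.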
5. Numerical Results
In this section, a detailed numerical evaluation analysis is presented based on simulations in order to show the effectiveness and performance of the proposed smart energy harvesting framework for Internet of Things networks. First, in Section 5.1, we focus on validating the operation of the proposed contract-theoretic energy-harvesting mechanism, in terms of determining the optimal contracts under the scenarios of complete and incomplete information regarding the IoT nodes’ socio-physical characteristics. The benefits of adopting Contract Theory and exploiting the IoT nodes’ characteristics are presented in Section 5.2. Having verified and analyzed the pure operation of the proposed framework, a detailed comparative evaluation is presented in Section 5.3 to show the superior performance of the overall system by enabling the IoT nodes with artificial intelligence, against other approaches that have been used in the literature.
Throughout our evaluation, we consider , , s, , , m, m, , , , Hz, , representing a typical IoT system consisting of IoT nodes, such as temperature sensors [31]. The proposed framework’s evaluation was conducted on an Acer laptop with an Intel Core i7 3.9 GHz processor and 16 GB of available RAM. In the following results, unless otherwise explicitly stated, the above values of the simulation parameters are used.
5.1. Pure Operation Performance
In this section, we present the pure operation performance of the proposed contract-theoretic energy harvesting model by examining the scenarios of complete and incomplete information of the IoT nodes’ characteristics from the FAPs’ perspective. The results presented below are derived from one indicative timeslot, where the overall framework was executed, i.e., IoT nodes’ association to FAPs, and determining the IoT nodes’ transmission power (effort) and the FAPs’ charging power (reward) based on the introduced contract-theoretic model.
Figure 2a–c present the IoT nodes’ effort , the FAPs’ reward , and the IoT nodes’ achieved utility as a function of the IoT nodes’ types considering the scenarios of complete and incomplete information. It is noted that the IoT nodes’ types are sorted for presentation purposes, i.e., . The results reveal that the IoT nodes of higher type, i.e., better socio-physical conditions, invest more effort (Figure 2a) by transmitting with higher transmission power to report more data to the corresponding FAPs that they are associated with. Thus, following the fairness (Proposition 1) and monotonicity (Proposition 2) conditions, the IoT nodes of higher type enjoy a higher reward (Figure 2b) from the FAPs, i.e., higher charging power. Therefore, based on the rationality (Proposition 3) condition, the IoT nodes of higher type achieve a higher utility, as shown in Figure 2c. Furthermore, it should be noted that the FAPs provide the minimum possible rewards to the IoT nodes under the complete information scenario given that they know their socio-physical characteristics; thus, (Figure 2c).
Figure 2.
IoT nodes’ (a) invested effort, (b) gained reward, and (c) achieved utility under the proposed contract-theoretic energy harvesting framework—Complete versus Incomplete information scenarios.
Additionally, Figure 3a,b illustrate the FAPs’ cumulative utility and the overall IoT system’s social welfare, respectively. The results show that the overall IoT system operates better under the complete information scenario. Specifically, it is observed that the social welfare of the overall IoT system is reduced, on average, by 67% under the incomplete information scenario, which is the realistic situation in an IoT system. This observation confirms that the proposed smart energy harvesting framework operates in an acceptable manner under the realistic conditions of complete lack of information regarding the IoT nodes’ socio-physical conditions.
Figure 3.
(a) FAPs’ cumulative utility and (b) the overall IoT system’s social welfare under the proposed contract-theoretic energy harvesting framework—Complete versus Incomplete information scenarios.
5.2. Benefits of Socio-Physical Approach
In this section, a detailed comparative analysis is presented in order to highlight the benefits of introducing a contract-theoretic approach to perform the smart energy harvesting and considering the unique socio-physical characteristics of each IoT node. The realistic incomplete information scenario is considered and compared to a type agnostic scenario, where each FAP offers proportional rewards to the IoT nodes based on their invested effort, i.e., .
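The type-agnostic baseline can be sketched as a proportional split of a FAP's available charging power according to the nodes' invested efforts. The exact rule used in the comparison is not recoverable from the text above, so the proportional form and the function name below are assumptions made for illustration.

```python
def proportional_rewards(efforts, available_charging_power):
    """Split the FAP's charging power in proportion to each node's effort."""
    total_effort = sum(efforts)
    if total_effort <= 0:
        raise ValueError("at least one node must invest a positive effort")
    return [available_charging_power * q / total_effort for q in efforts]


if __name__ == "__main__":
    print(proportional_rewards([1.0, 3.0], available_charging_power=8.0))
```

Because this rule ignores the node types, a node's reward depends only on its share of the total effort and not on its socio-physical characteristics.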
Figure 4a,b present the IoT nodes’ received rewards and their corresponding achieved utility, respectively, as a function of the IoT nodes’ IDs. Figure 5a,b depict the FAPs’ cumulative utility and the overall IoT system’s social welfare, respectively, as a function of the number of IoT nodes in the examined system. The results reveal that the proposed contract-theoretic smart energy harvesting model exploits the nodes’ socio-physical characteristics in a personalized manner, in contrast to the type agnostic model. Thus, the IoT nodes receive rewards tailored to their type (Figure 4a), and the IoT nodes that invest a higher effort, given their higher type, receive higher rewards. The benefits are also reflected in the IoT nodes’ achieved utility (Figure 4b), which respects the individual rationality condition under the proposed contract-theoretic model. Thus, the IoT nodes always achieve a positive utility for their invested effort, in contrast to the type agnostic scenario. The FAPs’ cumulative utility is similar in both cases (Figure 5a), because in the type agnostic scenario the FAPs gain from under-rewarding some IoT devices but spend a great amount of charging power over-rewarding others. For the overall IoT system (Figure 5b), we observe that the contract-theoretic smart energy harvesting framework outperforms the type agnostic approach in terms of social welfare by a factor of four on average, owing to the personalized rewards that are tailored to the IoT nodes’ needs. Thus, the transmission and charging power are intelligently exploited in the system.
Figure 4.
(a) IoT nodes’ reward and (b) achieved utility under the contract-theoretic versus the type agnostic framework of energy harvesting.
Figure 5.
(a) FAPs’ cumulative utility and (b) the overall IoT system’s social welfare under the contract-theoretic versus the type agnostic framework of energy harvesting.
5.3. Comparative Evaluation
In this section, we demonstrate the benefits of introducing an artificial intelligence method based on reinforcement learning to facilitate the intelligent association of the IoT nodes to the FAPs. Five comparative scenarios are considered for how the IoT nodes select the FAP that they will be associated with: (i) the proposed reinforcement learning mechanism (RL), as introduced in Section 4; (ii) the closest FAP (Min Distance); (iii) the FAP that offered the maximum charging power in the previous timeslot (Max Charging Power); (iv) the FAP to which the minimum number of IoT nodes were connected in the previous timeslot (Min Nodes); and (v) a random FAP (Random). It is noted that all the IoT nodes are within the coverage area of all the considered FAPs. The overall results were derived by performing a detailed Monte Carlo analysis of 1000 executions of the overall framework for all the comparative scenarios.
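The four non-learning selection rules listed above are simple enough to sketch directly. The snippet below is an illustrative sketch only, not the authors' simulation code: the function names, the node-by-FAP distance table, and the previous-timeslot values are all hypothetical.

```python
import random
from typing import Sequence


def select_min_distance(distances: Sequence[float]) -> int:
    """Min Distance: index of the FAP closest to this node."""
    return min(range(len(distances)), key=lambda f: distances[f])


def select_max_charging_power(prev_charging: Sequence[float]) -> int:
    """Max Charging Power: FAP that offered the most power last timeslot."""
    return max(range(len(prev_charging)), key=lambda f: prev_charging[f])


def select_min_nodes(prev_node_counts: Sequence[int]) -> int:
    """Min Nodes: FAP with the fewest connected nodes last timeslot."""
    return min(range(len(prev_node_counts)), key=lambda f: prev_node_counts[f])


def select_random(num_faps: int, rng: random.Random) -> int:
    """Random: a uniformly chosen FAP."""
    return rng.randrange(num_faps)


# Hypothetical inputs: 2 nodes, 3 FAPs.
distances = [[12.0, 5.0, 9.0],   # node 0 to FAPs 0..2
             [3.0, 8.0, 7.5]]    # node 1 to FAPs 0..2
prev_charging = [0.4, 0.9, 0.6]  # charging power offered last timeslot
prev_counts = [4, 1, 3]          # nodes connected last timeslot

min_dist_choices = [select_min_distance(d) for d in distances]
max_power_choices = [select_max_charging_power(prev_charging) for _ in distances]
min_nodes_choices = [select_min_nodes(prev_counts) for _ in distances]
random_choice = select_random(3, random.Random(0))

print(min_dist_choices, max_power_choices, min_nodes_choices)
```

Note that Max Charging Power and Min Nodes depend only on system-wide values from the previous timeslot, so in this sketch every node arrives at the same FAP, whereas Min Distance depends on each node's own position.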
Figure 6a–c present the IoT nodes’ invested effort, gained reward, and achieved utility, respectively, as a function of the IoT nodes’ IDs. Figure 7a,b illustrate the FAPs’ cumulative utility and the overall IoT system’s social welfare, respectively, as a function of the number of IoT nodes within the overall system. The results show that the proposed framework outperforms all other scenarios in terms of the IoT nodes’ invested effort (Figure 6a), gained reward (Figure 6b), and achieved utility (Figure 6c), the FAPs’ cumulative utility (Figure 7a), and the system’s social welfare (Figure 7b). This observation stems from the inherent characteristics of the proposed reinforcement learning mechanism, which enable the IoT nodes to select the FAPs that holistically provide them with a superior utility in the long term, as opposed to considering only fragmented selection criteria, such as the minimum distance, the maximum charging power, or the minimum number of IoT nodes connected to the FAPs. It is also observed that FAP selection based on the minimum distance presents the next best results after our proposed reinforcement learning-based framework, as the communication distance is a dominant factor in the power attenuation of both the transmission and the charging signals. The random selection scenario presents the worst results, as the IoT nodes make an unsophisticated selection of FAPs without considering their physical and social characteristics. The Max Charging Power and Min Nodes FAP selection scenarios present similarly mediocre results, as all the IoT nodes tend to select only one FAP per timeslot, which burdens the selected FAP and prevents it from serving all the connected IoT nodes efficiently.
Figure 6.
(a) IoT nodes’ invested effort, (b) gained reward, and (c) achieved utility — A Comparative Evaluation.
Figure 7.
(a) FAPs’ cumulative utility and (b) the overall IoT system’s social welfare—A Comparative Evaluation.
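To make the idea of learning a FAP from long-term rewards concrete, the sketch below uses a generic epsilon-greedy learner. This is a stand-in technique chosen for brevity and is not the mechanism of Section 4; the class name, the parameter values, and the per-FAP mean utilities are hypothetical.

```python
import random


class EpsilonGreedyFapSelector:
    """Generic epsilon-greedy learner over FAPs (illustrative stand-in)."""

    def __init__(self, num_faps: int, epsilon: float = 0.1, seed: int = 0):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = [0] * num_faps
        self.avg_reward = [0.0] * num_faps

    def choose(self) -> int:
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        return max(range(len(self.counts)), key=lambda f: self.avg_reward[f])

    def update(self, fap: int, reward: float) -> None:
        # Incremental running mean of the utility observed at this FAP.
        self.counts[fap] += 1
        self.avg_reward[fap] += (reward - self.avg_reward[fap]) / self.counts[fap]


# Hypothetical environment: each FAP yields a noisy utility around a fixed mean.
true_mean_utility = [0.2, 0.8, 0.5]
env_rng = random.Random(1)
selector = EpsilonGreedyFapSelector(num_faps=3, epsilon=0.1, seed=0)

for _ in range(2000):
    fap = selector.choose()
    utility = env_rng.gauss(true_mean_utility[fap], 0.1)
    selector.update(fap, utility)

best_fap = max(range(3), key=lambda f: selector.avg_reward[f])
print("FAP with the highest estimated long-term utility:", best_fap)
```

Unlike the single-criterion rules, a learner of this kind ranks FAPs by the utility actually experienced over many timeslots, which is the property the comparison above attributes to the RL scenario.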
Furthermore, Figure 8a–c illustrate the total transmission power and utility of all the IoT nodes, and the total charging power of all the FAPs, respectively, for all the examined comparative scenarios. The results demonstrate that the proposed reinforcement learning-based FAP selection mechanism enables the IoT nodes to transmit with low power (Figure 8a) and efficiently exploit the FAPs’ charging power (Figure 8c), in order to achieve superior utility (Figure 8b) within the examined IoT system. On the other hand, the scenarios that select FAPs based on a single criterion present worse results, as they give the IoT nodes a myopic view of the IoT system and of the most beneficial FAP to connect to for transmitting information and harvesting power. Additionally, the random scenario provides the lowest utility to the IoT nodes, as it is not able to efficiently balance the trade-off between the energy spent to transmit the IoT nodes’ data and the corresponding energy harvested from the FAPs’ radio frequency signals.
Figure 8.
(a) Total transmission power and (b) utility of all the IoT nodes, and (c) total charging power of all the FAPs — A Comparative Evaluation.
Additionally, Figure 9a,b illustrate the IoT nodes’ total achieved data rate and their corresponding total achieved energy efficiency under all the examined comparative scenarios. The results show that the intelligent association of the IoT nodes to the FAPs, enabled by the introduced artificial intelligence framework, makes better use of the low transmission power (Figure 8a) in order to achieve a superior data rate (Figure 9a) and improved energy efficiency (Figure 9b), compared to the rest of the examined scenarios. It is also shown that the comparative scenarios, which perform a myopic selection of FAP for the IoT nodes, achieve a low data rate and energy efficiency. Thus, it is concluded that considering multiple parameters in the selection of the FAP, and providing the IoT nodes with the intelligence needed to perform that selection, yields better results in terms of the transmission power and achieved data rate, and correspondingly improves the overall energy efficiency of the IoT nodes.
Figure 9.
(a) Total achieved data rate and (b) energy efficiency of all the IoT nodes — A Comparative Evaluation.
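The link between low transmission power, data rate, and energy efficiency discussed above can be illustrated with a short calculation. This sketch assumes the common definition of energy efficiency as achieved data rate divided by transmission power (bits per joule) and assumes the total is the sum over nodes; the paper's exact definition is not restated in this section, and the function names and values below are hypothetical.

```python
from typing import Sequence


def energy_efficiency(rate_bps: float, tx_power_w: float) -> float:
    """Bits delivered per joule spent on transmission (assumed definition)."""
    if tx_power_w <= 0:
        raise ValueError("transmission power must be positive")
    return rate_bps / tx_power_w


def total_energy_efficiency(rates_bps: Sequence[float],
                            tx_powers_w: Sequence[float]) -> float:
    """Sum of per-node energy efficiencies (assumed aggregation)."""
    if len(rates_bps) != len(tx_powers_w):
        raise ValueError("one transmission power is needed per data rate")
    return sum(energy_efficiency(r, p) for r, p in zip(rates_bps, tx_powers_w))


# Hypothetical per-node values: data rate in bit/s, transmission power in W.
rates = [2.0e5, 1.0e5]
powers = [0.5, 0.25]

total_ee = total_energy_efficiency(rates, powers)
print(f"total energy efficiency: {total_ee:.0f} bit/J")
```

Under this definition, a node that reaches the same data rate with half the transmission power doubles its energy efficiency, which is why the scenario with the lowest total power and the highest rate also leads in Figure 9b.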
6. Conclusions
In this paper, a smart energy harvesting framework for the Internet of Things is introduced based on Contract Theory and Reinforcement Learning. Initially, a wireless powered communication system model is introduced, which exploits the IoT nodes’ physical and social characteristics in order to define their types. Then, the IoT nodes’ transmission power and the FAPs’ personalized charging power, based on the IoT nodes’ characteristics, are determined by introducing a contract-theoretic framework that captures the interactions among the IoT nodes and the FAPs. The scenarios of incomplete and complete information regarding the IoT nodes’ types are examined in detail. Furthermore, an artificial intelligence mechanism based on reinforcement learning is proposed in order to enable the IoT nodes to select the most beneficial FAP to connect to in the long term. Finally, detailed simulation and comparative results are presented to show the pure operation performance of the proposed framework, as well as its drawbacks and benefits, compared to other approaches.
Our current and future work aims to extend the proposed framework to a 6G wireless environment enriched with reconfigurable intelligent surfaces, in order to improve the channel conditions among the IoT nodes and the FAPs. To quantify the benefits introduced by adopting the reconfigurable intelligent surfaces, we are performing a detailed experimental analysis to measure the transmission and charging power savings.
Author Contributions
All authors contributed extensively to the work presented in this paper. F.S. contributed to the design of the algorithm, developed the code of the overall framework, executed the evaluation experiment and contributed to the discussions and analysis of the comparative evaluation results. N.I. contributed to the original idea on which we have based our current work and contributed to the design of the proposed framework. E.E.T. was responsible for the overall orchestration of the performance evaluation work and had the overall coordination in the writing of the article. All authors have read and agreed to the published version of the manuscript.
Funding
This material is based upon work supported by the National Science Foundation EPSCoR Program under Award #OIA-1757207.
Institutional Review Board Statement
Not Applicable.
Informed Consent Statement
Not Applicable.
Data Availability Statement
Not Applicable.
Conflicts of Interest
The authors declare no conflict of interest.
Appendix A. Proofs
Appendix A.1. Proof of Proposition 1
In the following proof, we examine both the sufficiency, i.e., , and the necessity, i.e., of the condition. Starting with the sufficiency of the fairness condition and based on the Incentive Compatibility (IC) condition Equation (5c), we have :
By recognizing the terms in Equation (A3), we have: , and given that , we conclude that . Given that the evaluation function is strictly increasing with respect to the reward , we conclude that . Thus, we have shown so far that holds true.
Continuing with the necessity condition, we know that and the evaluation function is strictly increasing; thus, . Based on Equation (A3), we have: ; thus, we conclude that . Therefore, we have shown that holds true. Finally, it is noted that , by easily following the above reasoning.
Appendix A.2. Proof of Proposition 2
Based on Proposition 1, we have . The reward is defined as a strictly increasing function of the IoT node’s invested effort. The physical meaning of this definition is that an IoT node which spends more energy to transmit its data to the FAP, should be charged more by the FAP in order to remain active in the IoT network. Thus, we can easily conclude with the outcome .
Appendix A.3. Proof of Proposition 3
For two representative IoT nodes, , we write their incentive compatibility conditions as follows: . Thus, we concluded that . By generalizing this analysis, we conclude that .
Appendix A.4. Proof of Theorem 1
Based on the individual rationality constraint Equation (6b), we consider the minimum achieved utility that is acceptable by the IoT node, i.e., , in order to participate in the IoT network. Thus, for , we have . Based on Equation (6a), by taking the first-order derivative with respect to equal to zero, we have , thus, .
Appendix A.5. Proof of Lemma 1
We consider three indicative IoT nodes: , and we write the IC conditions, as follows
The evaluation function is strictly increasing; thus, we have: . We also have: . We recursively perform the latter analysis and we have: . Thus, we conclude with the equivalent local downward IC constraint.
Appendix A.6. Proof of Lemma 2
We consider three indicative IoT nodes: , and we write the following IC constraints:
Based on Equation (A8) and the fairness condition, we have:
References
- Li, S.; Da Xu, L.; Zhao, S. The internet of things: A survey. Inf. Syst. Front. 2015, 17, 243–259.
- Huang, X.; Yu, R.; Kang, J.; Xia, Z.; Zhang, Y. Software defined networking for energy harvesting internet of things. IEEE Internet Things J. 2018, 5, 1389–1399.
- Tsiropoulou, E.E.; Paruchuri, S.T.; Baras, J.S. Interest, energy and physical-aware coalition formation and resource allocation in smart IoT applications. In Proceedings of the 2017 51st Annual Conference on Information Sciences and Systems (CISS), Baltimore, MD, USA, 22–24 March 2017; pp. 1–6.
- Wei, Z.; Zhao, B.; Su, J.; Lu, X. Dynamic edge computation offloading for internet of things with energy harvesting: A learning method. IEEE Internet Things J. 2018, 6, 4436–4447.
- Zorbas, D.; Raveneau, P.; Ghamri-Doudane, Y.; Douligeris, C. The charger positioning problem in clustered RF-power harvesting wireless sensor networks. Ad Hoc Netw. 2018, 78, 42–53.
- Vamvakas, P.; Tsiropoulou, E.E.; Vomvas, M.; Papavassiliou, S. Adaptive power management in wireless powered communication networks: A user-centric approach. In Proceedings of the 2017 IEEE 38th Sarnoff Symposium, Newark, NJ, USA, 18–20 September 2017; pp. 1–6.
- Caillouet, C.; Razafindralambo, T.; Zorbas, D. Optimal placement of drones for fast sensor energy replenishment using wireless power transfer. In Proceedings of the 2019 Wireless Days (WD), Manchester, UK, 24–26 April 2019; pp. 1–6.
- Adila, A.S.; Husam, A.; Husi, G. Towards the self-powered Internet of Things (IoT) by energy harvesting: Trends and technologies for green IoT. In Proceedings of the 2018 2nd International Symposium on Small-scale Intelligent Manufacturing Systems (SIMS), Cavan, Ireland, 16–18 April 2018; pp. 1–5.
- Yao, J.; Ansari, N. Caching in energy harvesting aided Internet of Things: A game-theoretic approach. IEEE Internet Things J. 2018, 6, 3194–3201.
- Zeadally, S.; Shaikh, F.K.; Talpur, A.; Sheng, Q.Z. Design architectures for energy harvesting in the Internet of Things. Renew. Sustain. Energy Rev. 2020, 128, 109901.
- Mao, B.; Kawamoto, Y.; Kato, N. AI-based joint optimization of QoS and security for 6G energy harvesting internet of things. IEEE Internet Things J. 2020, 7, 7032–7042.
- Akan, O.B.; Cetinkaya, O.; Koca, C.; Ozger, M. Internet of hybrid energy harvesting things. IEEE Internet Things J. 2017, 5, 736–746.
- Hou, Z.; Chen, H.; Li, Y.; Vucetic, B. Incentive mechanism design for wireless energy harvesting-based Internet of Things. IEEE Internet Things J. 2017, 5, 2620–2632.
- Loreti, P.; Bracciale, L.; Bianchi, G. StableSENS: Sampling time decision algorithm for IoT energy harvesting devices. IEEE Internet Things J. 2019, 6, 9908–9918.
- Guntupalli, L.; Gidlund, M.; Li, F.Y. An on-demand energy requesting scheme for wireless energy harvesting powered IoT networks. IEEE Internet Things J. 2018, 5, 2868–2879.
- Huang, X.L.; Ma, X.; Hu, F. Machine learning and intelligent communications. Mob. Netw. Appl. 2018, 23, 68–70.
- Chu, M.; Liao, X.; Li, H.; Cui, S. Power control in energy harvesting multiple access system with reinforcement learning. IEEE Internet Things J. 2019, 6, 9175–9186.
- Sikeridis, D.; Tsiropoulou, E.E.; Devetsikiotis, M.; Papavassiliou, S. Socio-spatial resource management in wireless powered public safety networks. In Proceedings of the MILCOM 2018-2018 IEEE Military Communications Conference (MILCOM), Los Angeles, CA, USA, 29–31 October 2018; pp. 810–815.
- Sikeridis, D.; Tsiropoulou, E.E.; Devetsikiotis, M.; Papavassiliou, S. Energy-efficient orchestration in wireless powered internet of things infrastructures. IEEE Trans. Green Commun. Netw. 2018, 3, 317–328.
- Nguyen, T.D.; Khan, J.Y.; Ngo, D.T. A distributed energy-harvesting-aware routing algorithm for heterogeneous IoT networks. IEEE Trans. Green Commun. Netw. 2018, 2, 1115–1127.
- Li, P.; Long, Z.; Yang, Z. RF Energy Harvesting for Battery-Less and Maintenance-Free Condition Monitoring of Railway Tracks. IEEE Internet Things J. 2020, 8, 3512–3523.
- Tsiropoulou, E.E.; Mitsis, G.; Papavassiliou, S. Interest-aware energy collection & resource management in machine to machine communications. Ad Hoc Netw. 2018, 68, 48–57.
- Bolton, P.; Dewatripont, M. Contract Theory; MIT Press: Cambridge, MA, USA, 2005.
- Smith, S.A. Contract Theory; OUP Oxford: Oxford, UK, 2004.
- Ertel, W. Introduction to Artificial Intelligence; Springer: Berlin/Heidelberg, Germany, 2018.
- Tsiropoulou, E.E.; Vamvakas, P.; Katsinis, G.K.; Papavassiliou, S. Combined power and rate allocation in self-optimized multi-service two-tier femtocell networks. Comput. Commun. 2015, 72, 38–48.
- Gunawan, T.S.; Yaldi, I.R.H.; Kartiwi, M.; Mansor, H. Performance evaluation of smart home system using internet of things. Int. J. Electr. Comput. Eng. 2018, 8, 400.
- Diamanti, M.; Fragkos, G.; Tsiropoulou, E.E.; Papavassiliou, S. Unified User Association and Contract-Theoretic Resource Orchestration in NOMA Heterogeneous Wireless Networks. IEEE Open J. Commun. Soc. 2020, 1, 1485–1502.
- Dai, L.; Wang, B.; Yuan, Y.; Han, S.; Chih-Lin, I.; Wang, Z. Non-orthogonal multiple access for 5G: Solutions, challenges, opportunities, and future research trends. IEEE Commun. Mag. 2015, 53, 74–81.
- Boyd, S.P.; Vandenberghe, L. Convex Optimization; Cambridge University Press: Cambridge, UK, 2004.
- Adu-Manu, K.S.; Adam, N.; Tapparello, C.; Ayatollahi, H.; Heinzelman, W. Energy-Harvesting Wireless Sensor Networks (EH-WSNs): A Review. ACM Trans. Sens. Netw. (TOSN) 2018, 14, 1–50.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
