Power Allocation Based on Data Classification in Wireless Sensor Networks

Limited node energy in wireless sensor networks is a crucial factor affecting the monitoring of equipment operation and working conditions in coal mines. In addition, due to heterogeneous nodes and different data acquisition rates, the numbers of packets arriving at the queues of the network can differ, which may lead some queue lengths to reach their maximum value earlier than others. To tackle these two problems, an optimal power allocation strategy based on classified data is proposed in this paper. Arriving data is classified into different classes depending on the number of arriving packets. The problem is formulated as a Lyapunov drift optimization with the objective of minimizing the weighted sum of the average power consumption and the average data class. As a result, a suboptimal distributed algorithm that requires no knowledge of system statistics is presented. Simulations, conducted for both the perfect channel state information (CSI) case and the imperfect CSI case, reveal that the utility can be pushed arbitrarily close to optimal by increasing the parameter V, but with a corresponding growth in the average delay, and that the other tunable parameter W and the classification method inside the utility function can trade power optimality for increased average data class. These results show that data in a high class has a higher priority to be processed than data in a low class, and that energy consumption can be minimized under this resource allocation strategy.


Introduction
Wireless sensor networks typically consist of distributed battery-powered sensors which monitor a given area by sensing relevant physical information. One practical example of such networks is monitoring in coal mines: condition monitoring and failure diagnosis are essential to environmental security and equipment operation safety in coal mines, and wireless sensor networks can better adapt to complex environments without wiring [1][2][3]. In practice, the memory, power supply and processing capacity of sensor nodes are limited, and the environment in coal mines is very harsh, making it difficult to replace and recharge batteries manually, so the nodes' energy can easily be exhausted. Although there are increasingly innovative energy harvesting methods, such as wireless energy harvesting [4,5] and wind energy harvesting [6,7], which can potentially prolong the lifetime of wireless sensor networks, how to allocate power and use energy effectively remains a problem.
Aiming to resolve this problem, a number of studies have been conducted. Xu formulated a new energy-efficient packet scheduling problem by adopting a recently developed channel capacity model. Although our study appears similar to [25] at first glance, we in fact aim to control and constrain queue lengths in advance by transmitting classified data of higher priority first. As a result, Lyapunov drift optimization is utilized to minimize the weighted sum of the average power and the average data class.
The paper is organized as follows: in the next section, a wireless sensor network model is described and the optimization problem is formulated. The proposed framework is presented in Section 3. Performance evaluation results under the scenarios with perfect CSI and imperfect CSI are discussed in Section 4. Finally, in Section 5, we draw the main conclusions of our work.

Problem Definition
A network model with multiple transmitters and one receiver, as commonly adopted in wireless sensor networks [26][27][28], is considered. Provided that data can be sent to the fusion center directly over N nodes and L links, each source node puts packets into its transmission queue according to the FIFO principle, as shown in Figure 1.
Let A(t) = (A_1(t), . . . , A_L(t)) be given as the number of packets arriving for transmission over the links during slot t. The incoming data is assumed to leave the network once it is sent over a transmission link l ∈ {1, . . . , L}. Let S_l(t) represent the channel state over link l, which is time-varying and finite. Let µ_l(P_l(t), S_l(t)) represent the transmission rate over link l during slot t; µ_l(P_l(t), S_l(t)) is a continuous function of the power P_l(t) for each channel state S_l(t), abbreviated as µ_l(P_l, S_l), and for the whole network µ(P, S) = (µ_1(P_1, S_1), . . . , µ_L(P_L, S_L)). Let P(t) = (P_1(t), . . . , P_L(t)) represent the power allocated over the links during every slot t, where P_l(t) is limited by a peak value P_peak according to the discrete ON/OFF power constraint P_l ∈ {0, P_peak}. If P_l > 0 for some link l, then P_l' = 0 for every other link l' ≠ l, namely P(t) ∈ I. The arriving data and channel states are assumed to be ergodic, with arrival rate λ = (λ_1, . . . , λ_L) and a given channel state probability distribution. The data is classified according to the following principle: r is the class level, a, b, c are the class boundaries, and Ψ = f(A(t)) = (Ψ_1, . . . , Ψ_L) is the class vector. When the channel state over link l is S_i, the probability of choosing P_k^(S_i) is α_k^(S_i), equal to the probability of µ_l(P_k^(S_i), S_i), and the probability that data of a given class is admitted into a queue is β_k^(S_i). An ideal power allocation strategy should then solve the following problem. Our target in (2) is to minimize the power consumption and the class of the data received by the fusion center (the larger the number of packets, the smaller the class function Ψ_l).
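Under the definitions above, the objective (2) and constraint (3) can be sketched as follows. This is a reconstruction consistent with the weighted-sum utility Y(t) = P(t) + WΨ(t) minimized in Section 3, not the original typeset formulas:

```latex
\begin{aligned}
\text{(2)}\quad \min_{P(t)\in I}\quad & \overline{P} + W\,\overline{\Psi},
\qquad\text{where}\quad
\overline{P} \triangleq \lim_{T\to\infty}\frac{1}{T}\sum_{t=0}^{T-1}
  \mathbb{E}\Big\{\textstyle\sum_{l=1}^{L}P_l(t)\Big\},\quad
\overline{\Psi} \triangleq \lim_{T\to\infty}\frac{1}{T}\sum_{t=0}^{T-1}
  \mathbb{E}\Big\{\textstyle\sum_{l=1}^{L}\Psi_l(t)\Big\},\\
\text{(3)}\quad \text{s.t.}\quad & \boldsymbol{\lambda}\in\Lambda
\quad\text{(all queues } U_l(t) \text{ stable)}.
\end{aligned}
```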
W is a weight factor representing how much we emphasize the average class level of the admitted data. Constraint (3) guarantees the stability of the network, namely that the rate vector λ lies strictly in the interior of the network capacity region Λ [29]. Although the objective function and constraints can be formulated, it is hard to solve the problem without any prior probability knowledge. Even if these parameters were obtained, the large amount of calculation would make such an approach unsuitable [30].

Power Allocation Based on Data Classification
In this paper, a Lyapunov optimization algorithm is applied to solve the aforementioned problems. Define the queue backlog U(t) = (U_1(t), . . . , U_L(t)), representing the queue lengths waiting to be sent. The l-th queueing dynamics are:

U_l(t + 1) = max[U_l(t) − µ_l(P_l(t), S_l(t)), 0] + A_l(t).

Define the Lyapunov function:

L(U(t)) = (1/2) Σ_{l=1}^{L} U_l(t)^2.

Now, define ∆U(t) as the Lyapunov drift for slot t, ∆U(t) = E{L(U(t + 1)) − L(U(t)) | U(t)}. The drift plus penalty is obtained as:

∆U(t) + V E{P(t) + WΨ(t) | U(t)},    (6)

where V is the penalty factor and W is the weight factor. V and W can be considered weight factors of the power and the data class, respectively; when V and W are changed, the relative importance of P and Ψ is adjusted. According to the Lyapunov drift theorem, by minimizing (6) we can achieve a stable network system and the minimization of the objective function. Any control decision meeting constraint (3) satisfies the following inequality at every slot t:

∆U(t) + V E{P(t) + WΨ(t) | U(t)} ≤ B − Σ_{l=1}^{L} U_l(t) E{µ_l(P_l, S_l) − A_l(t) | U(t)} + V E{P(t) + WΨ(t) | U(t)},    (8)

where B is a fixed constant that satisfies the following inequality:

B ≥ (1/2) Σ_{l=1}^{L} E{A_l(t)^2 + µ_l(P_l, S_l)^2 | U(t)}.

To minimize the Lyapunov bound is to minimize the right-hand side of inequality (8). Hence, a suboptimal strategy is acquired: observe U(t), S(t) and Ψ(t) every slot t, and allocate the power vector P(t) = (P_1(t), . . . , P_L(t)) according to the following rule:

maximize Σ_{l=1}^{L} [U_l(t) µ_l(P_l, S_l(t)) − V P_l − V W Ψ_l(t)] subject to P(t) ∈ I.    (10)

Although (10) gives a solution to the power allocation problem of (2) and (3), an algorithm providing the execution structure and the executing entity still needs to be designed. Therefore, we propose Algorithm 1 as an implementation of our resource scheduling solution.

Algorithm 1. Distributed power allocation based on classified data.
1. Every node measures the channel state S_l and the data class Ψ_l
2. Calculate Z_l
...
9. If max(Z_l, Z_i) == Z_l
10.   P_l = 1
11. else
12.   P_l = 0
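A minimal sketch of the per-slot decision in Algorithm 1 follows. The index Z_l used below is an assumption derived from the drift-plus-penalty rule (10), not the paper's exact formula, and the rate table mirrors the three-state channel used later in the simulations:

```python
# Sketch of the per-slot decision in Algorithm 1. The index Z_l is an
# ASSUMED instance of rule (10); the paper's exact Z_l formula is not
# reproduced here. ON/OFF power: at most one link transmits at P_peak.
P_PEAK = 1.0        # peak power (units assumed)
V, W = 1.0, 1.0     # penalty factor and class weight

def rate(p, s):
    """mu_l(P_l, S_l): packets/slot for channel states 'G', 'B', 'M' when ON."""
    return {"G": 3, "B": 2, "M": 1}[s] if p > 0 else 0

def allocate(U, S, Psi):
    """Serve the single link maximizing Z_l = U_l*mu_l - V*(P_peak + W*Psi_l);
    all links stay OFF when no index is positive."""
    Z = [U[l] * rate(P_PEAK, S[l]) - V * (P_PEAK + W * Psi[l]) for l in range(len(U))]
    best = max(range(len(U)), key=Z.__getitem__)
    P = [0.0] * len(U)
    if Z[best] > 0:
        P[best] = P_PEAK
    return P

# Example slot: link 0 has the longer queue and a good channel, so it is ON.
print(allocate(U=[12, 3], S=["G", "B"], Psi=[1, 3]))  # [1.0, 0.0]
```

Since each node only compares its own index against the others', the decision can be made distributively, which is what keeps the per-slot complexity at O(L).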

Stability Proof and Complexity Analysis
If λ is strictly interior to the network capacity region Λ, there exists a positive value ε such that λ + ε ∈ Λ (where ε denotes the L-dimensional vector with all entries equal to ε), and a stationary power allocation rule exists [25] that serves every link at a rate of at least λ_l + ε with time-average power P* and time-average class Ψ*. Plugging this rule into inequality (8) gives:

∆U(t) + V E{Y(t) | U(t)} ≤ B + V Y* − ε Σ_{l=1}^{L} U_l(t),

where the utility is defined as Y(t) = P(t) + WΨ(t), and our target is to make the time-average utility arbitrarily close to the target cost level Y* = P* + WΨ*. Because the drift condition is satisfied at every slot, summing over a total of T slots, rearranging, and taking limits as T → ∞ yields the time-average bounds:

lim sup_{T→∞} (1/T) Σ_{t=0}^{T−1} Σ_{l=1}^{L} E{U_l(t)} ≤ (B + V Y*)/ε,    (11)

Y = lim sup_{T→∞} (1/T) Σ_{t=0}^{T−1} E{Y(t)} ≤ Y* + B/V.    (12)

Further expanding formula (12) with Y = P + WΨ yields:

Ψ ≤ Ψ* + B/(VW) + (P* − P)/W,    (13)

P ≤ P* + B/V + W(Ψ* − Ψ).    (14)

In [22,31], it is shown that the time-average backlog bound in Equation (11) implies system stability. The performance bounds in Equations (11) and (12) reveal a utility-backlog tradeoff: an arbitrarily large V makes B/V arbitrarily small, so that Equation (12) implies the time-average utility Y is arbitrarily close to the optimum Y*. This comes with a tradeoff: the average queue backlog bound in Equation (11) is O(V). In Algorithm 1, the calculation of max(Z_l, Z_i) in step 9 entails L operations for every sensor. Therefore, the complexity of the proposed distributed algorithm is O(L).
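The O(V) versus O(1/V) tradeoff in bounds (11) and (12) can be illustrated numerically. The constants B, Y* and ε below are hypothetical placeholders, not values from the paper:

```python
# Numeric illustration of the utility-backlog tradeoff in (11)-(12).
# B, Y_star and eps are HYPOTHETICAL constants for illustration only.
B, Y_star, eps = 5.0, 0.5, 0.25

def bounds(V):
    backlog_bound = (B + V * Y_star) / eps   # Equation (11): grows as O(V)
    utility_gap = B / V                      # Equation (12): shrinks as O(1/V)
    return backlog_bound, utility_gap

for V in (0.01, 1.0, 100.0):
    print(V, bounds(V))
```

Increasing V tightens the utility gap while inflating the backlog bound, which is exactly the behavior observed in the simulations of Section 4.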

Simulation Analysis
To analyze how the control parameters realize a utility-backlog trade-off and a power-class trade-off, let N = 2 in Figure 1, i.e., a single-hop network with two transmitters and one receiver. A channel has three states, "G", "B" and "M", in which the maximum numbers of packets transmitted per slot are 3, 2 and 1, respectively. The simulations are conducted under both perfect CSI and imperfect CSI. The probability distribution of the three channel states for the perfect CSI scenario is shown in Table 1, where all the simulation parameters are equal to those given in [25]. In the imperfect CSI case, the probabilities of correct detection and false detection of the channel state are 90% and 10%, respectively. Two nodes transmit data to a fusion center according to Poisson processes with rates λ1 = 8/9 and λ2 = 5/9. The arriving data is divided into three classes according to the number of packets, as shown in Table 2; the larger the number of data packets, the smaller the class and the higher the priority. The algorithm was simulated under four different values of the control parameter V, six different values of the weight parameter W and 20 different classification methods. Each simulation was run for 10 million time slots. Next, we observe how V, W and the classification method determine the average backlog, average utility, average power and average class.
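The arrival and classification setup can be sketched as follows. The class boundaries a and b are hypothetical stand-ins, since Table 2's boundary values are not reproduced in the text; the rule "more packets, smaller class number, higher priority" is from the paper:

```python
import math
import random

# Hypothetical class boundaries (Table 2's actual values are not shown here).
a, b = 1, 3

def classify(n_packets):
    """Map a slot's arriving packet count to a class level (1 = highest priority)."""
    if n_packets > b:
        return 1          # largest bursts: highest priority
    if n_packets > a:
        return 2
    return 3              # few packets: lowest priority

def poisson(lam, rng=random):
    """Draw one Poisson(lam) arrival count (Knuth's multiplication method)."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

random.seed(1)
n = poisson(8 / 9)        # node 1's arrival rate from the simulation setup
print(n, classify(n))
```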

Parameters V and W
According to Little's theorem, the average delay is equal to the average queue backlog divided by the average arrival rate, so the average queue backlog reflects the network delay to some extent. In Figure 2, the average queue backlog increases with V and W, from a minimum of 2.521 (V = 0.01, W = 0.01) to a maximum of 12,530 (V = 100, W = 100) in the perfect CSI case, as suggested by Equation (11). Table 3 shows that the difference between the average backlog in the imperfect CSI case and that in the perfect CSI case increases with V, and Table 4 shows that V also increases the backlog differences among classes. It can be found in Figure 3 that the delay in a high class is larger than that in a low class, that the average backlog at each level increases with V and W with a trend similar to that of the total average queue backlog, and that the delay differences among the levels keep growing.

Table 1. Probability distribution of the channel states: Pr[(S1, S2)] = 1/3, 2/9, 1/9, 2/9 and 1/9.

Table 2. The method of classification (classes assigned by the number of arriving packets).
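As a quick worked example of the Little's-theorem estimate, using the total arrival rate of the two nodes and the smallest reported backlog:

```python
# Little's theorem: average delay = average queue backlog / average arrival rate.
# Values: total arrival rate 8/9 + 5/9 packets/slot and the smallest reported
# backlog, 2.521 packets (V = W = 0.01, perfect CSI).
lam_total = 8 / 9 + 5 / 9        # packets per slot
avg_backlog = 2.521              # packets
avg_delay = avg_backlog / lam_total
print(round(avg_delay, 3))       # average delay in slots
```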


Actually, W itself is an internal parameter of the utility, so its growth naturally increases the utility, as shown in Figure 4b. In terms of V, the average utility decreases with V and tends to converge in Figure 4a, which means that adjusting V makes a trade-off between the queue backlog and the utility. The difference between the average utility with different classes in the imperfect CSI case and the perfect CSI case increases with V and W, as shown in Table 5.

Inequalities (13) and (14) show that the average power and the average class are both determined by V and W. Meanwhile, the average power and the average class influence each other, owing to the average power term P in inequality (13) and the average class term Ψ in inequality (14). With the increase of V and W, the average data class rapidly decreases and eventually stabilizes at 1.3102 in Figure 5. In Table 6, the difference between the average data class in the imperfect CSI case and the perfect CSI case is very small, ranging from 0.0016 to 0.04906. The average power declines monotonically with the parameter V, but does not decrease monotonically with the parameter W; instead, it shows a fluctuating reduction in Figure 6b. This phenomenon may be caused by the term W(Ψ* − Ψ) in inequality (14): with growing W, Ψ decreases while Ψ* − Ψ increases, which makes the variation tendency of W(Ψ* − Ψ) uncertain and leads to the result in Figure 6. In Table 7, the difference between the average power in the imperfect CSI case and the perfect CSI case is also very small, ranging from 0.0006 to 0.0162.
The above simulations were conducted with data divided into three classes, using the classification method shown in Table 2.
Thus, the number of classes and the classification methods should be studied; here, the effect of these two factors on the average power is of particular interest. All possible classification methods are considered and simulated in this section. Since the arriving data follows a Poisson distribution, the probability of A_l(t) ≥ 5 is very small according to the probability density function, so the data is not split further when A_l(t) ≥ 5 in the simulations. The arriving data is divided into 2, 3 and 5 class levels, respectively, in the corresponding simulations. For example, when the number of classes is 2, all the classification methods are listed in Table 8. When the number is chosen as 3 or 5, the classification methods are similar in principle, so we do not go into detail.
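The count of candidate classification methods can be verified by enumerating boundary placements, assuming each class is a contiguous range of packet counts:

```python
from itertools import combinations

# Packet counts 0..5 (arrivals with A_l(t) >= 5 are not split further) are
# divided into contiguous classes by placing boundaries in the 5 gaps
# between consecutive counts. Splitting into 2, 3 and 5 classes gives
# C(5,1) + C(5,2) + C(5,4) = 5 + 10 + 5 = 20 methods in total, matching
# the "20 different classification methods" stated in the setup.
def methods(n_classes, n_values=6):
    """All boundary placements splitting n_values counts into n_classes classes."""
    gaps = range(1, n_values)  # a boundary can sit in any of the 5 gaps
    return list(combinations(gaps, n_classes - 1))

counts = {k: len(methods(k)) for k in (2, 3, 5)}
print(counts, sum(counts.values()))  # {2: 5, 3: 10, 5: 5} 20
```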
The error bar graph of the power versus the number of classes is obtained with V = 1, W = 1, chosen for convenience, in Figure 7. When the number of classes changes from 2 to 5, the average power drops by 16.47% in the perfect CSI case and by 20.44% in the imperfect CSI case. The variation range is minimal when the number of classes equals 2, with a standard deviation of 6.08 × 10^−4 in the perfect CSI case and 1.45 × 10^−3 in the imperfect CSI case. In contrast, when the number of classes equals 3, the variation is maximal, with a standard deviation of 0.033 in the perfect CSI case and 0.028 in the imperfect CSI case. From Tables 3-7 and Figure 7, we can see that the differences between the simulation values in the perfect CSI case and the imperfect CSI case are relatively small.

Comparisons
Here, simulation results of our network control algorithm are compared with EECA [25] in the perfect CSI case. Three curves are acquired with W = 0.1, W = 1 and W = 10 in Figure 8. There are five points in each curve, corresponding to V = 1, V = 10, V = 50, V = 100 and V = 1000. The convergence values are 0.518, 0.525 and 0.528 for W = 0.1, W = 1 and W = 10, respectively, while the convergence value of EECA is 0.518. A small W pushes the curve close to EECA, because as W → 0 our algorithm nearly reduces to EECA. Besides, the proposed algorithm converges faster than EECA as the backlog increases.

In our algorithm, the delay is different for different levels of data, and the larger the number of data packets, the higher the priority. It should also be observed that the delay in the proposed