Joint Optimization of Control Strategy and Energy Consumption for Energy Harvesting WSAN

With the rapid development of wireless sensor technology, recent progress in wireless sensor and actuator networks (WSANs) with energy harvesting has made various real-time applications possible. Meanwhile, extensive research has been carried out on efficient energy allocation and control strategy design. However, the joint design of physical plant control, energy harvesting, and energy consumption is rarely considered in existing works. In this paper, in order to enhance control stability and improve the quality of service and energy efficiency of the WSAN, a novel three-step joint optimization algorithm is proposed through control strategy and energy management analysis. First, the optimal sampling interval is obtained based on the harvested, consumed, and remaining energy. Then, the control gain for each sampling interval is derived by a backward iteration. Finally, the optimal control strategy is determined as a linear function of the current plant states and previous control strategies. An application to a UAV formation flight system demonstrates that the proposed joint optimization design achieves better system performance and control stability under poor, sufficient, and general energy harvesting scenarios.


• In the discrete-time domain, the architecture of the WSAN system with an energy harvesting controller, considering both energy consumption and control strategy design, is proposed, and the WSAN dynamics with network-induced delay are modeled. Based on the analysis of energy harvesting and consumption, the joint optimization problem for the energy harvesting WSAN is formulated;
• The joint optimization problem is decomposed into two subproblems. In particular, it becomes an optimal control strategy design problem for a given sampling interval, while it is equivalent to an adaptive sampling interval design problem once the control strategy is determined;
• A novel three-step joint optimization algorithm is proposed. First, the optimal sampling interval is obtained based on the desired energy level, the harvested energy, and the remaining energy. Then, the control gain is derived by a backward iteration. Finally, the optimal control strategy is determined;
• Numerical experiment results based on the UAV formation flight system are provided to verify the effectiveness of the proposed three-step optimization algorithm for energy harvesting WSANs. Better control stability and lower energy consumption are achieved.
The remainder of this article is organized as follows. We review the related works about WSAN dynamics control and adaptive sampling, and energy optimization for Energy Harvesting WSAN in Section 2. Next, the proposed energy harvesting WSAN model is presented in Section 3. In Section 4, the joint optimization algorithm considering both energy consumption and network-induced latency is formulated, followed by the proposed three-step joint optimization algorithm. Then, the application of the unmanned aerial vehicle (UAV) system is provided to show the effectiveness of the proposed algorithm for energy harvesting WSANs in Section 5. Finally, we conclude this work in Section 6.

WSAN Control and Adaptive Sampling
In recent years, the growing maturity of integrated electronics technology has promoted the further development of WSANs, and how to design the optimal control strategy to improve system performance has become a research hotspot. In [15], an optimal control and scheduling design problem over deterministic real-time networks was studied, which minimizes a quadratic cost function in order to evaluate the control performance and the ability of the adaptive scheduling. However, that study assumes a perfect system in which the communication delay is completely ignored. A networked control system model considering network-induced delays through the wireless communication network is presented in [16], and an optimal controller is designed to compensate for the delay. Then, in [17], a linear quadratic optimal control algorithm is proposed for the discrete-time system when long network delays are considered. In [18], a linear quadratic Gaussian control algorithm was proposed for the multi-hop WSAN to address the collaborative optimization design of control routing and scheduling under energy constraints. More recently, the joint optimization design of system cost and plant control was investigated in [19] to both reduce the power consumption and improve the control stability. However, the above works focus on a fixed sampling interval. In practice, a fixed sampling interval cannot guarantee energy usage efficiency for the networked control system in many application scenarios [20][21][22]. In [23], an adaptive sampling algorithm that estimates the optimal sampling frequencies for sensors online was proposed to minimize the energy consumption of the sensors. Two adaptive sampling algorithms were proposed in [24] in order to increase the lifetime of the WSN by using an optimal sampling rate for monitoring.
In [25], the authors provide an energy-aware adaptive sampling algorithm for WSN with power-hungry sensors and harvesting capabilities, an energy management technique that can be implemented on any WSN platform with enough processing power to execute the proposed algorithm. In [26], the authors investigated the variable sampling method to mitigate the effects of time delays in wireless networked control systems using an observer-based control system model. In [27], in order to improve the performance of the networked control system, a variable sampling period scheduling method for the networked control system under resource constraints was presented based on the network operation state.

Energy Optimization for Energy Harvesting WSAN
In recent years, much progress has been made in understanding how to use energy harvesting technology in networking and communications applications [28][29][30][31]. However, there are few works detailing how energy harvesting sensors can be used in control applications, where the closed-loop system's dynamical behavior is significant. In [32], in order to achieve energy-neutral operation and improve system performance, a linear quadratic tracking problem was used to minimize the loss function so that the computed duty cycle maintains the specified battery level while all harvested energy is optimally used. In [33], the authors proposed a greedy battery management policy that is sufficient for plant stability and demonstrated that the optimal control design can be examined by a linear program. However, most current works focus on the battery management of WSNs with sensors powered by energy harvesting, and the joint design of energy management and control strategy in WSANs with energy harvesting capacity is only beginning to attract researchers' attention. In [34], an optimal linear quadratic Gaussian (LQG) control problem with feedback coming from an energy-harvesting sensor was studied. In [35], the optimal LQG controller was obtained by solving the Bellman dynamic programming equation, and a Q-learning algorithm was used to approximate the optimal energy allocation policy in case the system parameters were unknown. A closed-form dynamic energy harvesting and dynamic MIMO precoding solution was proposed for networked control systems with energy harvesting sensors in [36]. Different from energy harvesting sensor nodes, a scenario-based model predictive control approach was exploited in [37] to stabilize the plant's state with the actuator powered by harvested energy.
Unfortunately, few works consider controllers with energy harvesting functions in WSANs. In addition, most existing optimal control algorithms assume a perfect network in which communication delays are ignored. In this paper, considering the network-induced delay as well as the transmission energy consumption of the communication network with an energy harvesting controller, the optimal control strategy design and adaptive sampling selection policy for WSANs are addressed.

Energy Harvesting WSAN Modeling
As shown in Figure 1, a typical WSAN consisting of the controller, plant, actuator, and a number of sensors connected through a shared wireless network is considered [34,35]. In particular, in contrast to the traditional controller powered by non-rechargeable batteries, a controller with energy harvesting capability is considered, which is equipped with energy harvesting devices such as solar panels and micro wind turbines so that the controller energy can be harvested from the surrounding environment.


Figure 1. (Diagram labels: Actuator, Plant, Sensor.)

In the energy harvesting WSAN, the plant states are periodically sampled and transmitted to the controller through the shared wireless communication network. Once the sampled state information is received, the controller immediately calculates the control strategy and forwards it to the actuator. Finally, the actuator executes the control signal to ensure the dynamic stability of the plant. During the closed-loop control, the controller energy is continuously consumed for information reception, storage, calculation, transmission, etc. At the same time, the energy of the controller is replenished by energy harvesting, so as to balance the energy consumption. In the energy harvesting WSAN, some typical key assumptions are also used [38]: (1) the battery capacity of the controller is assumed to be infinite, because the capacity of even a small button battery is usually sufficient for energy harvesting scenarios; (2) energy may be harvested at any time, but the harvested energy can only be used from the next control frame.

WSAN Dynamics Model
In the control process of the WSAN, the effect of network-induced delays introduced by the shared wireless network cannot be ignored, since they can cause significant performance degradation or even a system crash. The network-induced delay mainly consists of the sensor-to-controller delay, the signal processing time, and the controller-to-actuator delay. Therefore, the dynamics model for the WSAN in the continuous-time domain can be expressed as [19]

$$\dot{s}(t) = A s(t) + B c(t - \tau), \tag{1}$$

where $s(t)$ is the K-dimensional state vector, which is typically defined as the plant state error, $c(t)$ is the N-dimensional control signal vector, $A$ and $B$ are determined system parameters, and $\tau$ is the network-induced delay, which is typically assumed to be smaller than one sampling interval. Then, the corresponding discrete-time dynamics in the i-th sampling interval are given by [16]

$$s_{i+1} = A_d s_i + B_{d,1} c_i + B_{d,2} c_{i-1}, \tag{2}$$

where $A_d = e^{A \Delta T}$, $B_{d,1} = \int_0^{\Delta T - \tau} e^{At}\,dt\,B$, $B_{d,2} = \int_{\Delta T - \tau}^{\Delta T} e^{At}\,dt\,B$, and $\Delta T$ denotes the sampling interval.
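As a minimal numerical sketch of this discretization step, the following Python code computes the three discrete-time matrices for a delay shorter than one sampling interval. The function names, the truncated Taylor series used for the matrix exponential, and the double-integrator plant in the usage note are illustrative assumptions and are not taken from the paper.

```python
def mat_mul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def mat_add(X, Y):
    """Add two matrices of the same shape."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def mat_scale(X, c):
    """Multiply every entry of a matrix by the scalar c."""
    return [[c * x for x in row] for row in X]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def expm_and_integral(A, t, terms=30):
    """Return (e^{A t}, integral_0^t e^{A u} du) via truncated Taylor series."""
    n = len(A)
    phi = identity(n)                    # running sum for e^{A t}
    gamma = mat_scale(identity(n), t)    # running sum for the integral
    power = identity(n)                  # holds A^k t^k / k!
    for k in range(1, terms):
        power = mat_scale(mat_mul(power, A), t / k)
        phi = mat_add(phi, power)
        gamma = mat_add(gamma, mat_scale(power, t / (k + 1)))
    return phi, gamma


def discretize_with_delay(A, B, dt, tau):
    """Discretize ds/dt = A s(t) + B c(t - tau), assuming 0 <= tau < dt.

    Returns (A_d, B_d1, B_d2) such that
    s_{i+1} = A_d s_i + B_d1 c_i + B_d2 c_{i-1}.
    """
    if not 0.0 <= tau < dt:
        raise ValueError("this sketch assumes 0 <= tau < dt")
    A_d, gamma_full = expm_and_integral(A, dt)
    _, gamma_head = expm_and_integral(A, dt - tau)
    B_d1 = mat_mul(gamma_head, B)        # weight on the current control c_i
    gamma_tail = mat_add(gamma_full, mat_scale(gamma_head, -1.0))
    B_d2 = mat_mul(gamma_tail, B)        # weight on the previous control c_{i-1}
    return A_d, B_d1, B_d2
```

For example, calling `discretize_with_delay([[0.0, 1.0], [0.0, 0.0]], [[0.0], [1.0]], 0.1, 0.02)` treats position and speed errors as a hypothetical double integrator; with zero delay, the weight on the previous control vanishes and the usual delay-free discretization is recovered.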
The objective of the optimal control strategy design is to ensure the stability of the WSAN by minimizing a cost function, which is typically defined in a normalized quadratic form as in [35], where R and Q are determined system parameters and M is the finite time horizon.

Energy Harvesting and Consumption
In this subsection, we describe how the controller collects, stores, and consumes energy. As shown in Figure 2, energy may arrive at any time, but the harvested energy can only be released at the beginning of the next control frame, where the k-th control frame includes $M_k$ sampling intervals. Meanwhile, the controller consumes energy for signal processing and transmission in each sampling interval. In addition, the battery capacity is usually assumed to be infinite because even a small button battery has enough capacity to meet the needs of most energy harvesting schemes [38]. The objective of energy harvesting and consumption is to improve the system stability through the joint design of the control strategy and the adaptive sampling interval, making effective use of the harvested energy. In general, the energy consumption of the controller in the k-th control frame is mainly determined by the signal transmission and follows the model in [11], where $J^C_k$ denotes the energy consumption for the signal transmission from the controller to the next network node, $M_k = T_f/\Delta T_k$ denotes the number of sampling intervals in the k-th control frame of length $T_f$, $d$ is the transmission distance, $r \in [2, 4]$ is the signal attenuation factor, and $\lambda$ and $\mu$ are parameters determined by the path loss and signal amplitude, respectively.

Figure 2. Energy harvesting and consumption model for the controller.
Define $J^H_k$ and $J^R_k$ as the harvested energy and remaining energy of the k-th control frame, respectively. Then, the remaining energy in the controller evolves from frame to frame according to the harvested and consumed energy. In order to make full use of the energy of the controller, the remaining energy of the controller is expected to be maintained at the desired level $J^*_k$.
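The energy bookkeeping described above can be illustrated with a short Python sketch for a fixed sampling interval. Everything specific here is an assumption made for illustration: the per-transmission energy is taken as λ·d^r + µ, the frame consumption as the number of samples times that value, and the remaining energy as the previous remaining energy plus the energy released at the start of the frame minus the frame consumption. The function names are hypothetical.

```python
def per_sample_energy(d, r, lam, mu):
    """Assumed per-transmission energy: a path-loss term plus a constant term."""
    return lam * d ** r + mu


def fixed_sampling_energy(j_r0, harvests, t_f, dt_fixed, e_tx):
    """Remaining energy at the end of each control frame with a fixed interval.

    j_r0      initial remaining energy
    harvests  energy released at the start of each control frame
    t_f       control frame length
    dt_fixed  fixed sampling interval
    e_tx      energy spent per transmission (assumed constant)
    """
    m = round(t_f / dt_fixed)            # samples per control frame
    remaining = j_r0
    trace = []
    for j_h in harvests:
        # Assumed evolution: add the released harvest, subtract the frame consumption.
        remaining = remaining + j_h - m * e_tx
        trace.append(remaining)
    return trace
```

With a small harvest per frame, the trace returned by `fixed_sampling_energy` decreases steadily, which mirrors the depletion that a fixed sampling interval causes under poor energy harvesting conditions.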

Joint Optimization Algorithm Design
In this section, the joint optimization problem for energy harvesting WSAN is formulated. Then, a three-step optimal algorithm is proposed to jointly design the control strategy and adaptive sampling interval.

Joint Optimization Problem
Based on (3) and (6), the utility function of the joint optimization problem in the k-th control frame is defined as a weighted cost function, where β and γ are weight coefficients.

The objective of the joint optimization is to minimize the utility function subject to the system dynamics and the evolution of the remaining energy, through the design of both the control strategy and the adaptive sampling interval; this defines the joint optimization problem (8) of the k-th control frame. At each control frame, the harvested energy is released at the beginning of the frame, and the controller then gradually consumes energy in each sampling interval, so the remaining energy decreases continuously. In other words, the remaining energy is always larger than the desired energy level $J^*$ within a control frame. Therefore, the joint optimization problem in (8) can be rewritten as the equivalent minimization problem (9).

The optimization problem (9) is a typical NP-hard problem and is difficult to solve directly. Fortunately, it can be decomposed into two subproblems: (1) for a given sampling interval $T_f/M^*_k$, it becomes an optimal control strategy design problem, denoted S1; (2) when the control strategy $\{c^*_{i,k}\}$ is determined, it becomes an adaptive sampling interval design problem, denoted S2.

In general, the control strategy and the sampling interval would be calculated from subproblems S1 and S2, respectively, and then iterated until they converge to the joint optimization result. However, such an iteration has extremely high computational complexity. Fortunately, the adaptive sampling interval selection and the optimal control strategy design can be completely decoupled. For a given sampling interval, the corresponding optimal control strategy can first be derived as a function of that sampling interval. Then, the optimal sampling interval can be determined from the energy harvesting, consumption, and remaining level requirements.

Control Strategy Design
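We first address the optimal control strategy design problem (10) for a given sampling interval. Define the new state vector

$$\bar{s}_{i,k} = \begin{bmatrix} s_{i,k} \\ c_{i-1,k} \end{bmatrix},$$

which stacks the current plant state and the previous control signal. Then, the discrete-time dynamics can be rewritten as

$$\bar{s}_{i+1,k} = \bar{A}\,\bar{s}_{i,k} + \bar{B}\,c_{i,k},$$

where $\bar{A}$ and $\bar{B}$ are block matrices assembled from the discrete-time system matrices together with zero and identity blocks, and $0_{i \times j}$ and $I_{i \times i}$ denote the i × j zero matrix and i × i identity matrix, respectively. By using the new state vector $\bar{s}_{i,k}$, the optimization problem (10) is equivalent to problem (14), in which $\bar{R}$ and $\bar{Q}$ denote the weighting matrices on the new state vector and the control signal, respectively. Define the residual cost $J^{Re}_{i,k}$ as in (15). The optimal control strategy for problem (14) is given by

$$c^*_{i,k} = -g_{i,k}\,\bar{s}_{i,k}, \tag{16}$$

where $g_{i,k}$ can be iteratively calculated as

$$g_{i,k} = \left(\bar{B}^T l_{i+1,k} \bar{B} + \bar{Q}\right)^{-1} \bar{B}^T l_{i+1,k} \bar{A}, \qquad l_{i,k} = \bar{A}^T l_{i+1,k} \bar{A} + \bar{R} - \bar{A}^T l_{i+1,k} \bar{B}\, g_{i,k}, \tag{17}$$

and the corresponding residual cost in (15) can be derived in a quadratic form as

$$J^{Re}_{i,k} = \bar{s}_{i,k}^T\, l_{i,k}\, \bar{s}_{i,k}. \tag{18}$$

Proof. The optimal control strategy can be deduced by a backward recursion approach. Assume that $J^{Re}_{j,k}$, $j > i$, has the same quadratic form as (18), that is,

$$J^{Re}_{j,k} = \bar{s}_{j,k}^T\, l_{j,k}\, \bar{s}_{j,k}. \tag{19}$$

Then, the residual cost function $J^{Re}_{i,k}$ is given by

$$J^{Re}_{i,k} = \min_{c_{i,k}} \left\{ \bar{s}_{i,k}^T e^{1,1}_{i,k} \bar{s}_{i,k} + 2\, c_{i,k}^T e^{2,1}_{i,k} \bar{s}_{i,k} + c_{i,k}^T e^{2,2}_{i,k} c_{i,k} \right\}, \tag{20}$$

where $e^{1,1}_{i,k} = \bar{A}^T l_{i+1,k} \bar{A} + \bar{R}$, $e^{2,2}_{i,k} = \bar{B}^T l_{i+1,k} \bar{B} + \bar{Q}$, and $e^{2,1}_{i,k} = \bar{B}^T l_{i+1,k} \bar{A}$. It can be seen that the expression in (20) is a quadratic form of $c_{i,k}$. Setting its gradient with respect to $c_{i,k}$ to zero, the optimal control strategy can be deduced as

$$c^*_{i,k} = -\left(e^{2,2}_{i,k}\right)^{-1} e^{2,1}_{i,k}\, \bar{s}_{i,k}, \tag{21}$$

which gives $g_{i,k} = (e^{2,2}_{i,k})^{-1} e^{2,1}_{i,k}$, and substituting (21) into (20) yields the residual cost function in the quadratic form (18).

Thus, the optimal control strategy $c^*_{i,k}$ can be obtained online as a linear function of the current plant states and previous control signals, as given by (16), in which the corresponding control gain $g_{i,k}$ is derived offline by a backward iteration based on (17).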
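The backward iteration for the control gain can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the paper's implementation: it handles a scalar control input only, it starts the recursion from a terminal weight equal to the state weighting matrix, and the names `augment`, `backward_gains`, `A_d`, `B_d1`, and `B_d2` (the discretized state matrix and the weights on the current and previous control) are hypothetical.

```python
def mat_mul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(X):
    return [list(col) for col in zip(*X)]


def augment(A_d, B_d1, B_d2):
    """Build the system for the stacked state [s_i; c_{i-1}] with a scalar input."""
    k = len(A_d)
    A_bar = [A_d[r] + [B_d2[r][0]] for r in range(k)] + [[0.0] * (k + 1)]
    B_bar = [[B_d1[r][0]] for r in range(k)] + [[1.0]]
    return A_bar, B_bar


def backward_gains(A_bar, B_bar, R_bar, q, horizon):
    """Backward recursion for the gains of a single-input system.

    Returns gains[i] as a row vector (list) for i = 0..horizon-1, so that the
    control at step i is minus the dot product of gains[i] and the stacked state.
    R_bar is assumed symmetric; q is the scalar weight on the control signal.
    """
    n = len(A_bar)
    l = [row[:] for row in R_bar]        # assumed terminal weight
    gains = [None] * horizon
    A_t, B_t = transpose(A_bar), transpose(B_bar)
    for i in range(horizon - 1, -1, -1):
        lA = mat_mul(l, A_bar)
        lB = mat_mul(l, B_bar)
        e11 = [[a + r for a, r in zip(ra, rr)]
               for ra, rr in zip(mat_mul(A_t, lA), R_bar)]
        e21 = mat_mul(B_t, lA)[0]        # 1 x n row vector
        e22 = mat_mul(B_t, lB)[0][0] + q # scalar because the input is scalar
        gains[i] = [x / e22 for x in e21]
        # Quadratic weight of the residual cost after minimizing over the control.
        l = [[e11[r][c] - e21[r] * e21[c] / e22 for c in range(n)] for r in range(n)]
    return gains
```

Because the gains depend only on the system matrices, the weights, and the horizon, they can be computed once per sampling interval choice and then applied online to the stacked state; for long horizons the early gains approach a steady value.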

Adaptive Sampling Interval Design
Once the optimal control strategy is determined, the joint optimization problem (9) simplifies to the adaptive sampling interval design problem (23). At each control frame, the harvested energy is released at the beginning of the frame, and the controller then gradually consumes energy in each sampling interval, so the remaining energy decreases continuously. In other words, the remaining energy is always larger than the desired energy level $J^*$ within a control frame. Therefore, the adaptive sampling interval design problem (23) is equivalent to problem (24), and the optimal number of sampling intervals $M^*_k$ in (25) is obtained when the remaining energy equals the desired energy level at the end of the control frame. Based on (25), the optimal sampling interval is given by

$$\Delta T^*_k = \frac{T_f}{M^*_k}.$$

Thus, the joint optimization design of the energy consumption and control strategy for the energy harvesting WSAN can be summarized in Algorithm 1 as a three-step procedure. First, the adaptive sampling interval $\Delta T^*_k$ is determined by (25) based on the harvested energy, the remaining energy of the last control frame, the desired energy level, and the transmission environment. Then, the optimal control gain $g_{i,k}$ is iteratively calculated offline by (17). Finally, the optimal control strategy $\{c^*_{i,k}\}$ is derived by (16) in real time for each sampling interval based on the current plant states, the optimal control gain, and the previous control signals.
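The sampling interval selection can be illustrated with the following Python sketch. It assumes, for illustration only, that each transmission costs a constant energy `e_tx`, so that the number of samples is the largest integer that leaves the remaining energy at or above the desired level at the end of the frame. The lower and upper bounds `m_min` and `m_max` are hypothetical safeguards that are not part of the paper.

```python
import math


def adaptive_samples(j_prev, j_h, j_star, e_tx, m_min=1, m_max=1000):
    """Number of sampling intervals for one control frame.

    j_prev  remaining energy at the end of the previous frame
    j_h     harvested energy released at the start of this frame
    j_star  desired energy level at the end of this frame
    e_tx    assumed constant energy per transmission
    """
    budget = j_prev + j_h - j_star       # energy that may be spent in this frame
    m = math.floor(budget / e_tx) if budget > 0 else m_min
    return max(m_min, min(m_max, m))


def adaptive_energy_trace(j_r0, harvests, j_star, t_f, e_tx, m_min=1, m_max=1000):
    """Return (sampling interval, remaining energy) for each control frame."""
    remaining = j_r0
    out = []
    for j_h in harvests:
        m = adaptive_samples(remaining, j_h, j_star, e_tx, m_min, m_max)
        remaining = remaining + j_h - m * e_tx
        out.append((t_f / m, remaining))
    return out
```

When the harvest is plentiful, the rule shortens the sampling interval so that the surplus is spent on control; when the budget is exhausted, it falls back to the minimum number of samples, which lengthens the interval and slows the depletion.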

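Algorithm 1 (final step). Calculate the optimal control $c^*_{i,k} = -g_{i,k}\, s_{i,k}$, with $s_{i,k}$ here denoting the new state vector used in the control strategy design.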

Simulations and Discussion
The application of the UAV formation flight system with an energy harvesting controller is provided to show the effectiveness of the proposed three-step optimization algorithm for WSANs. The UAV formation flight system, including a solar-powered UAV controller, a UAV leader, and multiple UAV followers, is shown in Figure 3. The UAV controller collects the position and speed information of the leader. Once the UAV controller receives the state information of the leader, it immediately calculates the control strategy and selects the optimal sampling period according to the solar energy charging and energy consumption conditions, in order to keep the UAV formation flight system stable and efficient. As a case study, a typical three-UAV platoon traveling on a horizontal path is considered; the UAV formation flight system has one UAV follower, one leader, and a solar-powered UAV controller. The states of the UAV formation flight system are given by

$$s(t) = [h(t),\ v(t)]^T,$$

where $h(t)$ and $v(t)$ represent the UAV follower's position error and speed error, respectively.

The purpose of UAV formation flight control is to maintain the formation of the follower when the UAV state is disturbed by the external environment, such as wind and state noise. That is, the control signal must keep the state deviation within a limited range. In the simulations, the initial position and velocity errors are set to zero and are then disturbed by random noise. The control frame is set as $T_f = 5$ s, the fixed sampling interval is set as 0.083 s, the initial energy of the UAV controller is $J^R_0 = 20$, the desired energy level is $J^* = 10$, the minimum energy level is $J_{min} = 5$, and the system parameters are set as follows.

In order to demonstrate the effectiveness of the proposed algorithm, three energy harvesting cases, including poor, sufficient, and general energy harvesting conditions, are considered, and performance comparisons with the existing work [19], which uses a traditional fixed sampling interval, are shown.

First, the poor energy harvesting condition, such as cloudy weather, where the harvested energy is not enough, is investigated. As seen in Figure 4, the energy of the controller using the traditional fixed sampling method decreases rapidly and then drops below the minimum energy level, which causes the controller to stop working. This is because the fixed sampling interval causes more energy to be consumed than harvested, so that the remaining energy level gradually decreases and may even be exhausted, shutting down the control system. With the adaptive sampling interval, the energy of the controller is also difficult to keep at the expected value due to the insufficient harvested energy, but it remains higher than the minimum energy level, so the system continues to work normally. This is because the sampling interval is automatically enlarged to save energy when the remaining energy level is low. The control performance comparison is shown in Figure 5; a significant performance improvement is achieved compared to the fixed sampling interval. In particular, when the energy level falls below the minimum energy level, the controller cannot work properly, so severe control stability degradation occurs in the fixed sampling interval case.

Then, the performance of the proposed algorithm under sufficient energy harvesting conditions is shown in Figures 6 and 7. It can be seen that the remaining energy with the traditional fixed sampling interval gradually increases. This is because the harvested energy is always greater than the energy consumed in each sampling interval, so the remaining energy cannot be effectively utilized with a fixed sampling interval. In contrast, with the adaptive sampling interval algorithm, the controller energy is maintained near the required energy level, so that the remaining energy and harvested energy in each control frame can be fully used to improve the system control performance. Similarly, Figure 7 shows that the adaptive sampling interval strategy reduces the oscillation of the relative distance between the follower and the leader, especially when that oscillation is large.

Finally, the general energy harvesting condition is considered in Figures 8 and 9. It can be observed that the performance of the adaptive sampling interval is slightly better when the remaining energy level is high, which is similar to the case of sufficient energy harvesting. When the remaining energy level is low, the traditional fixed sampling period suffers significant performance degradation, which is similar to the poor energy harvesting condition.

To sum up, the proposed joint optimization design of control strategy and energy consumption can guarantee the system performance and control stability under poor, sufficient, and general energy harvesting conditions. Compared to the traditional fixed sampling interval approach, the proposed joint optimization algorithm avoids serious control instability when the remaining energy level is low and efficiently uses up the harvested energy when the remaining energy level is high.

Conclusions
In this paper, a joint optimization algorithm for physical plant control, energy harvesting, and energy consumption in the WSAN system is proposed, taking into account the network-induced delays caused by wireless communications. The architecture of the WSAN system with an energy harvesting controller, considering both energy consumption and control strategy design, is modeled, and the joint optimization problem is then formulated based on the collaborative utility function and the WSAN dynamics. With the objective of minimizing the utility function subject to the system dynamics and the evolution of the remaining energy, a three-step algorithm is proposed for the closed-loop feedback control. The sampling interval is first determined from the desired energy level, the harvested energy, and the remaining energy. Then, the control gain is obtained by a backward iteration. Finally, the optimal control strategy is derived to meet the requirements of both control stability and energy efficiency. A case study of the UAV formation flight system demonstrates the effectiveness of the proposed joint optimization design: serious control instability is avoided when the remaining energy level is low, while the harvested energy is efficiently used up when the remaining energy level is high.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.