1. Introduction
Environmental data acquisition in forests, which are critical ecosystems and resource reservoirs of Earth, has long relied on traditional methods such as manual surveys, remote sensing satellites, and watchtower observations. These approaches suffer from low spatiotemporal resolution, poor data continuity, and high labor costs, which makes it impossible to meet the demand for fine-grained, real-time monitoring of forest environments [
1]. With the advancement of digital transformation in forestry, the required scope and precision of forest environment monitoring have steadily increased [
2,
3]. At the same time, with the rapid advancement of Internet of Things (IoT) technology, wireless sensor networks (WSNs), a critical component of the IoT perception layer, play a pivotal role in constructing intelligent environmental monitoring systems. Particularly in complex and harsh environments like forests, where sustained power supply is challenging, WSNs are responsible for automated data collection and preliminary fusion, acting as the underlying neural endpoints connecting the physical world to the digital forest. Therefore, designing efficient WSN routing protocols that can adapt to the dynamic characteristics of forest environments and achieve energy self-sufficiency is important for enabling the large-scale, long-term, and stable application of IoT in forestry informatization. Related studies have shown that deploying WSNs can effectively enable continuous collection and transmission of ecological parameters in forest environments, which significantly improves monitoring efficiency and data quality [
4].
Due to the complex terrain, dense vegetation, and harsh climatic conditions in forests, sensor nodes typically rely on battery power. As battery energy is limited and replacement is often impractical in some environments, it is challenging to maintain long-term stable network operation and achieve sustained effective data collection [
5]. Therefore, overcoming energy constraints and realizing efficient energy management and self-sustaining power supply for nodes have become core challenges in promoting the practical application and large-scale deployment of WSNs, especially in the forest environment [
6].
Since the energy of a WSN is primarily consumed by data transmission, designing high-efficiency data transmission schemes has been recognized as a key approach to improving the energy efficiency and lifetime of the WSN [
7], such as the energy-efficient media access control policy [
8] and energy-aware routing protocols [
9]. Efficient routing protocols can not only reduce network energy consumption but also strike a system-level trade-off among energy consumption, network lifetime, and data transmission performance. Thus, they are widely adopted to minimize energy consumption and extend network lifetime [
10,
11]. Clustering routing algorithms are commonly used in WSNs. By reducing long-distance communication and employing data fusion to eliminate redundant data, clustering can reduce energy consumption. For example, LEACH and its improved versions [
12] have demonstrated that clustering can effectively enhance network lifetime and data transmission performance. To further reduce data transmission energy consumption, ref. [
13] proposed a K-hop spectrum aware clustering for bi-channel edge contraction (K-SACB-EC) algorithm. By optimizing the cluster formation process through a K-hop mechanism, this algorithm significantly reduces network energy consumption while enhancing cluster structure stability and alleviating energy shortages. However, it does not consider the real-time energy status of nodes during cluster head election, so nodes with insufficient residual energy may be selected as cluster heads, which degrades network stability. In [
14], an efficient adaptive clustering routing algorithm was proposed, which is termed multi-parent differential evolution (MPDE) and variable step size local search–swarm intelligence adaptive clustering routing (VSSLS-SIACR), to minimize energy consumption and extend network lifetime. Similar to [
13], it fails to consider the real-time energy status of nodes and does not optimize the intra-cluster communication.
Although existing clustering routing algorithms can effectively reduce energy consumption and improve network performance, the energy of sensor nodes keeps decreasing as the network operates, which makes long-term sustainable operation impossible. Energy-harvesting technology, which allows nodes to convert various ambient energy sources, such as solar, wind, and biomass energy, into electrical energy stored in batteries, offers a feasible solution to address the energy shortage in WSNs [
15]. WSNs incorporating energy-harvesting technology are referred to as Energy-Harvesting WSNs (EH-WSNs). By replenishing their energy through harvesting, sensor nodes can operate over the long term, which has made EH-WSNs a hot research topic [
16,
17,
18,
19]. For example, the cooperative energy-harvesting mechanism proposed in [
17] effectively improves network throughput. In [
18], the mismatch between energy harvesting and consumption was considered, while [
19] introduced a TDMA-based dynamic time allocation mechanism to increase the energy efficiency of the network.
Unlike in traditional WSNs, the energy of nodes in EH-WSNs does not decrease monotonically but varies dynamically, which renders traditional routing protocols unsuitable. Moreover, the environmental energy is random and time-varying. To fully leverage harvested energy and enhance network performance, many routing protocols for EH-WSNs have been proposed [
20,
21,
22,
23,
24]. In [
20], the authors considered two types of sensor nodes, energy-harvesting nodes and battery-powered nodes, along with unreliable wireless channels, and proposed a dynamic path selection strategy to improve end-to-end transmission reliability. Unlike the direct path transmission in [
20], ref. [
21] designed a clustering routing strategy. Considering the different energy consumption rates of cluster heads and cluster members, an energy-reserved clustering algorithm based on distance and node energy status was proposed, which mitigates energy shortages in cluster heads caused by rapid energy depletion. To further improve the utilization of harvested energy, ref. [
22] introduced a centralized management mechanism into the clustering algorithm, which significantly increases the number of active nodes. Considering the distributed nature of sensor nodes, ref. [
23] employed a convex optimization method to design a distributed algorithm for solving the fair packet rate problem in multi-hop EH-WSNs. Concurrently, a heuristic clustering routing algorithm was proposed in [
24] to address the trade-off among multiple performance metrics in EH-WSNs. By taking multiple performance metrics into account, the algorithm achieves significant improvements in network lifetime and sleep–wake switching rate without compromising other required performance criteria.
Simultaneously, with the development of IoT architecture and communication technologies, WSNs have been increasingly integrated with edge computing, massive access, and network intelligence. Younas et al. [
25] proposed a framework to connect IoT edge networks through 3D Massive MIMO, offering a new perspective for reliable communication in high-density node deployments. In specific industrial IoT scenarios, ref. [
26] optimized a low-power clustering routing protocol for substation wireless LAN authentication and privacy infrastructure (WAPI) environments. The results demonstrate the need for protocol design to be deeply integrated with specific application contexts and communication standards. Furthermore, from a network topology optimization perspective, Tulu et al. [
27] identified influential nodes based on community structure to accelerate information dissemination in complex networks, providing theoretical insights for efficient routing and load balancing in WSNs. These works indicate that modern WSN protocol design must comprehensively consider physical-layer communication technologies, specific application constraints, and network topology intelligence.
Existing studies have improved EH-WSN performance, but acquiring real-time node energy information, which is required because of the dynamic nature of ambient energy [
24], incurs substantial additional communication overhead and imposes a significant burden on resource-constrained networks. While ambient energy exhibits high dynamic variability, it often follows discernible patterns. Leveraging these patterns to predict harvestable energy and subsequently utilizing the predictions for scheduling and routing can reduce redundant communication while maintaining network performance. For instance, ref. [
28] introduced a novel multi-algorithm fusion framework to enhance prediction accuracy by integrating the results from multiple forecasting algorithms. Based on this framework, a triple-algorithm fusion solar radiation predictor is designed, which reduces the error rate in solar energy prediction effectively. Separately, ref. [
29] proposed an improved pro-energy solar energy prediction technique, which predicts energy by correlating the most similar historical energy states of nodes within the network.
Routing protocols for EH-WSNs, including prediction-based designs, have been extensively studied. However, existing research primarily focuses on solar-powered sensor networks. In practice, if sensor nodes rely solely on solar energy, energy shortages may still occur. Moreover, existing clustering protocols neglect the particularities of forest environments: they often assume that a single energy-harvesting method is adopted, which is insufficient to keep sensor nodes powered. In fact, multiple energy-harvesting options are available to sensor nodes in the forest. Some sensor nodes can collect solar energy, while others may harvest biomass energy [
30]. However, research on biomass energy harvesting for WSNs remains limited, and a hybrid energy management framework that integrates solar, biomass, and other energy sources has not yet been explored.
Therefore, although routing protocols for EH-WSNs have been studied, limitations remain, including reliance on a single energy source, inadequate dynamic adaptation, and unoptimized intra-cluster communication. To address these limitations, we investigate integrated energy management strategies for both solar and biomass energy in the clustering algorithm. The main contributions are as follows:
Establishing a multi-source dynamic switching model tailored to forest environments, which incorporates both solar and biomass energy and adaptively selects the optimal source based on real-time conditions.
Designing a dynamic weight-based cluster head election mechanism that introduces node residual energy and real-time harvesting rate as key weighting factors to prevent low-energy nodes from being selected as cluster heads.
Proposing a Q-learning-based adaptive hybrid transmission mechanism, which dynamically selects between single-hop and multi-hop modes according to real-time channel state and node energy levels, thereby optimizing intra-cluster communication energy consumption.
Conducting systematic simulations to comprehensively evaluate the proposed algorithm in terms of total energy consumption, network lifetime, and energy balance.
2. System Model and Discussion
We first introduce the system model, which includes the network model and the energy-harvesting models for forest environments. Then, by analyzing existing clustering algorithms and the problem to be solved, we propose a new framework to address the energy consumption problem of EH-WSNs deployed in forests.
2.1. Network Model
The monitoring area is assumed to be a two-dimensional rectangular plane, denoted by A, with an area of
. Within this plane,
N sensor nodes are deployed randomly, and are denoted by a set
. The coordinate of sensor node
is given by
,
. It is assumed that there is a common channel among sensor nodes. The data channel between two sensor nodes is assumed to be an independent Rayleigh fading channel. The distance between node
and its neighboring node
is denoted as
. The transmission radius of the node is
R. A schematic diagram of the network model is illustrated in
Figure 1.
2.2. Solar Energy Harvesting Model
In EH-WSNs, the solar energy-harvesting model aims to quantify the dynamic process whereby sensor nodes harvest solar energy from the environment and convert it into electrical energy. This model integrates several key factors that impact the energy-harvesting rate, such as solar irradiance, sunlight duration, energy conversion efficiency, and seasonal environmental variations. The proposed model is formulated in (1) [
28]:
where
is the energy-harvesting conversion efficiency and is assigned a value of 0.25,
is the effective light-receiving area of the sensor node, and
represents the environmental attenuation factors (e.g., shading, cloud cover, and atmospheric dust)
. Unlike a simple linear model, the adopted solar energy-harvesting model
more accurately reflects the actual solar irradiance profile.
is the daily sunshine duration. Given the significant seasonal variations in
, its value is parameterized separately for Spring, Summer, Autumn, and Winter, and seasonal factors
are added to adjust it; these factors vary across seasons [
28].
2.3. Biomass Energy Harvesting Model
In the forest environment, the biomass energy-harvesting rate of sensor nodes is also subject to significant seasonal variations. According to relevant research [
31], this rate is modeled as shown in Equation (2):
where
ηsys is the system efficiency, i.e., the overall conversion efficiency from biomass to electrical energy.
ηsys varies with seasons, reaching its peak in Summer due to heightened microbial activity and its nadir in Winter [
32]. The parameter
Msub(t) is the substrate mass, defined as the available organic mass per unit area (kg/m²). Its calculation is given separately in (3) [
31].
where
is usually at its peak during Autumn leaf shedding and at its low value during Spring.
represents the oxidation rate, which is also seasonally dependent. The parameter is given by the following expression (4) [
33]:
where
represents the activation energy, assigned a typical value of
in this model.
R is the gas constant and is assigned a value of
, and
is the seasonal coefficient. As indicated in reference [
33], the value of
varies seasonally; the specific values are given in
Section 4.1 of the article.
Finally,
is the environmental correction factor, expressed as follows (5):
where
is the soil temperature (°C) beneath the forest canopy, with an optimal range of 20–30 °C in Summer; Winter temperatures of −5 to 5 °C suppress the reaction.
is the soil moisture, with an optimal range of 40%–60%. During rainy seasons (Spring/Autumn), it may exceed this optimal range and reduce the energy-harvesting efficiency. The pH value of forest soil is usually between 4.5 and 6.5 [
34].
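The seasonal behaviour of the oxidation rate in (4) can be illustrated with a minimal Python sketch. It assumes the standard Arrhenius form, seasonal coefficient × exp(−Ea/(R·T)), which is consistent with the quantities named above (activation energy, gas constant, seasonal coefficient) but is not confirmed by the text. The function name, the activation energy, and the coefficient values are hypothetical placeholders, not the values used in this paper.

```python
import math

GAS_CONSTANT = 8.314  # J/(mol*K), universal gas constant


def oxidation_rate(soil_temp_c, activation_energy, seasonal_coeff):
    """Assumed Arrhenius-type rate: seasonal_coeff * exp(-Ea / (R * T)).

    soil_temp_c is in degrees Celsius and is converted to kelvin.
    """
    temp_k = soil_temp_c + 273.15
    return seasonal_coeff * math.exp(-activation_energy / (GAS_CONSTANT * temp_k))


# Hypothetical values, for illustration only.
EA = 50_000.0  # J/mol
summer = oxidation_rate(25.0, EA, seasonal_coeff=1.0)
winter = oxidation_rate(0.0, EA, seasonal_coeff=1.0)
print(summer > winter)  # warmer soil gives a higher rate under this form
```

Under this assumed form, the rate grows with soil temperature, which matches the qualitative statement that the reaction is suppressed in Winter.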
2.4. Dynamic Multi-Source Energy-Switching Strategy
In forest environments, relying on a single energy source is often insufficient to sustain continuous node operation. Solar energy is susceptible to weather conditions, vegetation occlusion, and diurnal cycles, while biomass energy is influenced by temperature, humidity, and seasonal variations. Based on the models discussed above, this section proposes a dynamic energy-switching strategy that enables nodes to autonomously select the optimal energy source in real-time according to their instantaneous harvesting rates.
The energy-switching decision mechanism operates as follows: Each node periodically monitors its current solar energy collection rate and biomass energy collection rate . is the switching threshold. Energy source switching is triggered when one of the following conditions is met:
If , switch to solar energy mode;
If , switch to biomass energy mode.
If the ratio lies within
, the current mode is maintained to avoid frequent switching and associated energy overhead. (In this study, the switching threshold
was initially set to 0.2. In
Section 4.6 of this article, a detailed sensitivity analysis will be conducted to explore the impact of different
values on network energy efficiency, stability, and lifespan.)
Upon triggering a switch, the node reconfigures its energy-harvesting circuitry and power management system to operate under the new energy source in the next communication cycle. This strategy ensures that nodes maintain high energy-harvesting efficiency in complex forest environments, thereby providing dynamic and sustainable energy support for subsequent cluster head election and routing optimization.
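The switching rule can be sketched in Python as a hysteresis band. Because the exact conditions are not reproduced here, the sketch assumes that the decision is made on the ratio of the solar to the biomass harvesting rate, with a switch to solar above 1 + δ, a switch to biomass below 1 − δ, and no change inside the band. The function name, the mode labels, and the handling of a zero biomass rate are hypothetical; only the initial threshold of 0.2 comes from the text.

```python
def select_source(current_mode, p_solar, p_biomass, delta=0.2):
    """Return 'solar' or 'biomass' using an assumed hysteresis band around ratio 1."""
    if p_biomass <= 0.0:
        # Assumption: with no biomass harvest, use solar if any is available.
        return "solar" if p_solar > 0.0 else current_mode
    ratio = p_solar / p_biomass
    if ratio > 1.0 + delta:
        return "solar"
    if ratio < 1.0 - delta:
        return "biomass"
    return current_mode  # inside the band: keep the mode to avoid switching overhead


mode = "biomass"
for p_s, p_b in [(1.1, 1.0), (1.5, 1.0), (0.9, 1.0), (0.5, 1.0)]:
    mode = select_source(mode, p_s, p_b)
    print(p_s, p_b, mode)
```

The band keeps a node from toggling between sources when the two harvesting rates are close, which is the stated purpose of the threshold.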
2.5. Energy Consumption Model
We assume that each sensor node is equipped with the same initial energy, denoted by
. The wireless channel between any pair of nodes is modeled as a Rayleigh channel, and its channel coefficient g follows the Nakagami distribution with parameter
, where m is the shape parameter and
is the control parameter. The expression for the control parameter is
. According to Shannon’s formula, the expression for the node information transmission rate R is as follows:
$$R = B \log_2\left(1 + \mathrm{SNR}\right)$$
where
B represents bandwidth, and
SNR is the signal-to-noise ratio at the receiving end. When the transmission rate and bandwidth of a node are constant, the signal-to-noise ratio is also constant, and the transmission power
of the node can be calculated by the signal-to-noise ratio:
where
is the noise of the channel, and the energy
consumed by node
to send information is expressed as follows:
where
T is the time required for node
i to transmit data.
is the energy consumed for sending or receiving each bit of information, and A is the energy consumed for receiving L-bit messages, expressed as follows:
The cluster head (CH) is responsible for receiving and merging information from ordinary nodes within the cluster, and for sending the information to the sink after data fusion. The energy consumed by the CH is divided into three parts:
consumed by reception,
consumed by fusion, and
consumed by transmission. The energy consumed by reception [
35] is represented as follows:
where
represents the number of ordinary nodes in the mth cluster. In practical network scenarios, CH itself also needs to perceive information; so, in addition to receiving the
information sent by ordinary nodes, CH also needs to process its own information. Therefore, CH needs to fuse and process the total information of
, and the energy consumed by fusion is:
where
represents the energy consumption of each bit of data processed by CH. The fused data is of size l. Based on (11), the energy required for CH transmission of this data can be calculated. However, due to the existence of multi-hop transmission in the network, cluster head nodes may need to relay data for other cluster head nodes. Therefore, it is necessary to determine the energy consumption
(towards the base station or next hop) of cluster head transmission by combining the routing protocol used by sensor nodes.
Both intra-cluster communication and communication between cluster heads consume energy from sensor nodes. When the energy of sensor nodes is depleted, they will die, thereby affecting network performance. Let
be the total energy consumption of the network, and the formula for total energy consumption is as follows:
where
M is the number of clusters, and
is the number of nodes within the cluster
[
36].
is control message energy consumption (including neighbor discovery, cluster head election, etc.).
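The cluster head energy accounting described in this subsection can be sketched in Python under stated assumptions. The sketch assumes that reception costs a fixed per-bit energy for each L-bit message from every ordinary node, that fusion is charged per bit over the messages of all ordinary nodes plus the CH's own message, and that transmission energy is transmit power multiplied by the time needed to send the fused data at the Shannon rate. All function and parameter names are hypothetical, the numeric inputs are placeholders, and any additional circuit or relay terms are ignored.

```python
def required_snr(rate_bps, bandwidth_hz):
    """Invert Shannon's formula R = B * log2(1 + SNR) to get the SNR."""
    return 2.0 ** (rate_bps / bandwidth_hz) - 1.0


def cluster_head_energy(n_members, bits_per_msg, e_per_bit, e_fusion_per_bit,
                        tx_power_w, fused_bits, rate_bps):
    """Return (reception, fusion, transmission) energy of one CH, in joules."""
    e_rx = n_members * bits_per_msg * e_per_bit                  # one message per member
    e_fuse = (n_members + 1) * bits_per_msg * e_fusion_per_bit   # members plus own data
    e_tx = tx_power_w * (fused_bits / rate_bps)                  # power times airtime
    return e_rx, e_fuse, e_tx


# Placeholder numbers chosen only to make the arithmetic easy to follow.
print(required_snr(1e6, 1e6))
print(cluster_head_energy(4, 1000, 1.0, 1.0, 2.0, 500, 1000))
```

Relay traffic from other cluster heads would add further transmission terms that depend on the routing protocol, so it is left out of this sketch.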
2.6. Problem Analysis
Given the unique characteristics of forest environments and the aforementioned dual-source energy-harvesting models, routing protocols designed for a single energy source—often based on predictive algorithms—are no longer suitable. Moreover, due to the heterogeneous and variable nature of harvested energy, existing routing protocols for sensor networks, such as K-SACB-EC, also prove inadequate. Taking K-SACB-EC as an example, its cluster head election mechanism fails to adequately consider the real-time energy status and dynamic harvesting rates of nodes. This oversight can lead to the election of low-energy nodes as cluster heads, compromising network stability and sustainability. Furthermore, the protocol lacks intra-cluster communication optimizations, especially under uneven node distributions, which can cause excessive energy consumption and shorten overall network lifetime. Therefore, a new routing protocol is required. This protocol must enhance both the clustering algorithm and the intra-cluster transmission mechanism to better utilize harvested energy, reduce total consumption, and extend network lifetime.
To address these issues, this paper proposes an improved K-SACB-EC routing protocol tailored for forest environments. The proposed approach comprises two key enhancements:
Dynamic weight-based cluster head election: A new election algorithm is introduced. It integrates both residual energy and dynamic energy-harvesting rates into a weighted election function. This prevents nodes with low energy prospects from becoming cluster heads.
Adaptive intra-cluster communication: An adaptive communication strategy is developed. For clusters where members are distant from the head, a rational path selection algorithm constructs an adaptive single-hop or multi-hop scheme. This optimizes communication energy consumption within the cluster.
3. Proposed Work
Although the K-SACB-EC algorithm demonstrates good performance in spectrum sensing and energy optimization, it fails to account for the dynamic node energy states and the diversity of energy sources in forest environments. To address these limitations, this paper proposes an enhanced K-SACB-EC algorithm. Its main innovations lie in the following three aspects:
An energy-aware weight function for cluster head election: a novel weighting function is designed that incorporates both the residual energy and the real-time energy-harvesting rate of nodes, enabling energy-sensitive clustering.
A multi-source adaptive switching mechanism: by integrating the solar and biomass energy-harvesting models, the algorithm achieves dynamic and optimal selection of energy sources for each node.
A Q-learning-based adaptive transmission strategy: this strategy dynamically selects between single-hop and multi-hop transmission modes to optimize intra-cluster communication.
Thus, the proposed method introduces significant innovations in energy awareness, dynamic adaptation, and transmission decision-making.
To optimize the cluster formation and select the most appropriate cluster heads, this section introduces an enhanced version of the K-SACB-EC algorithm. In light of the distinctive challenges posed by forest environments, both residual energy and the energy-harvesting rate are incorporated as critical factors into the cluster head election weight function, leading to a more efficient clustering strategy. Additionally, an adaptive Q-learning-based hybrid single- and multi-hop transmission mechanism is proposed for intra-cluster data communication to further enhance overall network performance.
3.1. Neighbor Discovery Protocol
To enable nodes to discover their neighbors, an efficient neighbor discovery protocol is proposed, which allows nodes to dynamically discover neighboring nodes and obtain key parameters such as residual energy and location. A periodic sleep–wake cycling mechanism is adopted to meet the stringent energy conservation requirements of WSNs. Considering the dependency of node states on wireless transceiver scheduling, a state transition model for the sensor nodes is defined in (13):
where
, which indicates that the node is in a sleep state, and
indicates that the node is in an active state. For two nodes within communication range to discover each other, they must be awake simultaneously. However, in practice, unsynchronized sleep schedules can prevent neighboring nodes from discovering each other, as their active (awake) periods may not overlap.
To address this issue, this section introduces a deterministic neighbor discovery protocol. By designing specific wake-up scheduling patterns, this protocol theoretically guarantees that all neighboring nodes can discover each other within a bounded time. The scheduling strategy for deterministic neighbor discovery can be formalized as follows: for any pair of neighboring nodes
, there exists a time instant.
where
, and
is the time phase difference in the node pair
.
To ensure mutual discoverability between any neighboring nodes, a neighbor discovery protocol based on the pigeonhole principle is designed. This protocol ensures that the duration of the active state within one cycle is greater than half of the total cycle length. The total time is divided into n slots, and each node remains active in at least ⌊n/2⌋ + 1 of them. Under this condition, any two neighboring nodes are guaranteed to share at least one common active slot, enabling mutual discovery.
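The pigeonhole argument can be checked exhaustively for small cycle lengths with a short Python sketch. It assumes slot-aligned schedules in which every node is active in more than half of the n slots; the function name is hypothetical. Since a slot-aligned cyclic shift of a schedule is simply another subset of the same size, enumerating all subsets also covers arbitrary slot offsets between two nodes.

```python
from itertools import combinations


def always_overlap(n):
    """Check that any two active-slot sets of size floor(n/2) + 1 in n slots intersect."""
    k = n // 2 + 1
    schedules = [set(c) for c in combinations(range(n), k)]
    return all(a & b for a in schedules for b in schedules)


print(all(always_overlap(n) for n in range(1, 9)))
```

With exactly half of the slots active, the guarantee fails: for n = 4, the schedules {0, 1} and {2, 3} never overlap, which is why the active duration has to exceed half the cycle.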
3.2. Algorithm for Identifying Cluster Members with Maximum Weight
Following neighbor discovery, each node proceeds to optimize network energy allocation by constructing a Maximum Weighted Complete Bipartite Graph (MWCBG). The proposed clustering algorithm extends the core bipartite graph model of the K-SACB-EC algorithm to achieve an optimized network topology. To mitigate the high computational complexity inherent in processing complete bipartite graphs, the algorithm incorporates a novel node weight evaluation mechanism. The procedure consists of three main steps:
Local bipartite graph construction: each node constructs a local bipartite graph based on its real-time residual energy, energy-harvesting rate, and the discovered neighborhood relationships.
Global graph integration: the local graphs from all nodes are integrated to form a unified bipartite graph model representing the entire network.
MWCBG solution: within this unified graph, the MWCBG corresponding to each node is computed to determine the optimal energy-aware clustering strategy.
For each node, the algorithm constructs a complete bipartite graph using its information (e.g., inter-node distance, residual energy, energy-harvesting rate, and neighbor node information). From this graph, the complete bipartite subgraph with the maximum weight is identified. To construct the bipartite graph for a specific node, its attributes and relationships are extracted, and the bipartite graph is represented as follows:
where
denotes the residual energy of the node
,
are the neighboring nodes,
is the real-time energy absorption rate at the node, and
is the existence of common channels within the transmission range of the node. After constructing the complete bipartite graph of node
, its weight is calculated with the following formulas, and the bipartite subgraph with the highest weight is then identified. The calculation of the weights
is shown in (16) and (17):
where
is the weight factor between network stability and energy, and is assigned a value of 0.5.
is the total number of nodes
in a complete bipartite graph,
is the set of nodes
in a complete bipartite graph,
is the distance between a node
and its neighboring nodes
,
is the initial energy of the node,
is the energy consumed for energy conservation, and
t is the time it takes for the node to absorb energy. The remaining variables are defined in the previous model.
This multi-factor dynamic weighting-based cluster head election mechanism incorporates the node’s residual energy and energy-harvesting rate as key parameters alongside conventional factors such as node proximity. This strategy prioritizes nodes with higher energy availability for the cluster head role, which helps prevent low-energy nodes from becoming cluster heads and contributes to extended network lifetime. The proposed algorithm employs a local connection strategy with K = 1, where each node constructs a bipartite graph based only on its one-hop neighbors. K = 1 is chosen because communication beyond one hop significantly increases control message overhead and energy consumption during the cluster formation phase, which is counterproductive for energy-harvesting networks. The global clustering structure emerges from the integration of these local graphs. Due to the complexity of depicting the entire network topology,
Figure 2 illustrates a representative local subgraph.
As analyzed in
Figure 2, multiple sensor nodes may simultaneously belong to different maximum-weighted bipartite subgraphs. To resolve this issue, the algorithm employs the following affiliation determination strategy: for any given node, the system traverses all bipartite subgraphs containing that node and selects the subgraph with the highest weight as its final affiliation. This mechanism ensures that each node is assigned to exactly one distinct subgraph, thereby satisfying the uniqueness requirement of the clustering algorithm.
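The affiliation rule can be sketched in Python as follows. The data layout (a dictionary mapping each subgraph's head node to its weight and member set) and the function name are hypothetical. Ties between equal weights are resolved here by keeping the first subgraph encountered, which is a simplification of the tie-handling rules given in the next subsection.

```python
def assign_affiliation(subgraphs):
    """Map each node to the head of the highest-weight subgraph that contains it.

    subgraphs: dict head_id -> (weight, set of member node ids).
    """
    best = {}  # node -> (weight, head) of the best subgraph seen so far
    for head, (weight, members) in subgraphs.items():
        for node in members | {head}:
            if node not in best or weight > best[node][0]:
                best[node] = (weight, head)
    return {node: head for node, (_, head) in best.items()}


subgraphs = {"a": (3.0, {"b", "c"}), "d": (5.0, {"c", "e"})}
print(assign_affiliation(subgraphs))
```

In this example node c appears in both subgraphs and is assigned to the one headed by d, whose weight is higher, so every node ends up in exactly one subgraph.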
3.3. Cluster Formation Process
Based on the bipartite graph model described above, sensor nodes in the network accomplish their final role assignment through a state transition mechanism. In the initial phase, all nodes reside in the “Initial” state. As the clustering algorithm proceeds, nodes undergo state transitions based on the results of local weight calculations within the bipartite graph: if a node’s weight meets the candidate criteria for cluster head, its state changes to “Intermediate CH” (intermediate cluster head state); otherwise, the node transitions into the “Clustered CM” state (clustered member state). This process is illustrated in
Figure 3.
Step 1: Initialization: all nodes in the network are in the Initial state.
Step 2: Local weight calculation and candidate cluster head generation: each node calculates its maximum weight based on its bipartite graph. If the weight of a node is the highest in its local neighborhood, its state changes to Intermediate CH. This node sends a “cluster join invitation” to the other nodes in its maximum-weight bipartite subgraph, and the state of each node receiving the invitation changes to Clustered CM (a member of the inviting cluster).
Step 3: Intermediate cluster head competition and final cluster head election: if multiple Intermediate CH nodes are within the communication range of each other, they form a “competition area”. In this area, the weights of the Intermediate CH nodes are compared. The node with the highest weight is promoted to the final Clustered CH, and the other Intermediate CH nodes are demoted to Clustered CM. If an Intermediate CH node has no other Intermediate CH nodes within its communication range, it directly becomes a Clustered CH.
Step 4: Handling ties and multiple invitations: if two neighboring nodes have the same weight, the node with the higher sum of residual energy and energy-harvesting rate becomes Intermediate CH. If an ordinary node simultaneously receives multiple cluster join invitations with the same weight, it joins the cluster with fewer members.
The specific process of cluster formation is as follows:
(1) Each node in the network constructs a bipartite graph using , where represents neighboring nodes with 1 hop. Find the maximum weight bipartite subgraph of point and define the set of other points in the bipartite subgraph with the maximum weight of node as follows: .
(2) For node , if , it performs the following operations:
(3) If node is in the initial state, it changes its state to intermediate CH and sends information to other points in . The other points in become Clustered CM, and the intermediate CH set is defined as follows:
(4) If node is in intermediate CH state, it saves this state to build a larger cluster, or node becomes Clustered CM and terminates the algorithm. The basis for judging the two situations is whether the distance between the nodes in set meets the transmission range. If there are nodes in that have intermediate CH within the transmission range of , further compare their weight sizes. The ones with larger weight values become the final cluster head Clustered CH of the cluster, and the ones with smaller weight values become Clustered CM. If there are no other intermediate CH nodes within the transmission range of , it becomes the final cluster head Clustered CH and terminates the algorithm. This process can be seen as the fusion between clusters, where the final cluster head is selected by comparing the weight values of each intermediate CH node.
(5) If a neighbor node of has the same maximum weight as , the node with higher residual energy and energy-harvesting rate becomes intermediate CH, while the other becomes Clustered CM.
(6) When node receives invitations from multiple nodes with identical weights, it joins the cluster with the lower number of nodes.
The flowchart of this process is shown in
Figure 4.
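The competition and tie-breaking rules described in Steps 3 and 4 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Node` class, its field names, the function names, and the use of the node ID as a last-resort tie-breaker are assumptions made for illustration, and the clustering weight is taken as already computed. The sketch performs a single comparison pass among intermediate CHs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    node_id: int
    x: float
    y: float
    weight: float           # clustering weight, assumed already computed
    residual_energy: float  # residual energy of the node
    harvest_rate: float     # energy-harvesting rate of the node


def in_range(a: Node, b: Node, radius: float) -> bool:
    """True if the two nodes are within communication range of each other."""
    return (a.x - b.x) ** 2 + (a.y - b.y) ** 2 <= radius ** 2


def rank_key(n: Node):
    # Higher weight wins; equal weights fall back to the sum of residual
    # energy and harvesting rate (Step 4). The node ID is a hypothetical
    # last-resort tie-breaker so that the ordering is always strict.
    return (n.weight, n.residual_energy + n.harvest_rate, -n.node_id)


def elect_final_heads(intermediate, radius):
    """One pass of Step 3: return (final_ch_ids, demoted_ids)."""
    final, demoted = set(), set()
    for n in intermediate:
        rivals = [m for m in intermediate
                  if m.node_id != n.node_id and in_range(n, m, radius)]
        # A node with no rivals in range is promoted directly.
        if all(rank_key(n) > rank_key(m) for m in rivals):
            final.add(n.node_id)
        else:
            demoted.add(n.node_id)
    return final, demoted


def choose_cluster(invitations):
    """Pick a cluster from (ch_id, ch_weight, member_count) invitations:
    highest weight first, then the cluster with fewer members."""
    return min(invitations, key=lambda inv: (-inv[1], inv[2]))[0]
```

Because the pass is not repeated, a node that loses to a rival which is itself demoted stays demoted; an iterative version would re-run the comparison until no state changes.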
Based on the above, the pseudocode of the clustering algorithm is presented as follows (Algorithm 1):
| Algorithm 1: Improved K-SACB-EC-Based Clustering Algorithm |
Input: The node coordinates are initialized with identical energy levels denoted as , and their residual energy are , energy harvesting rates are . Distance between node and is . is the weight of ,. The transmission range of each node is R. N is the preset maximum number of rounds.
Output: The final cluster head set, cluster structure (member list for each cluster).
Initialize the particle swarm;
Phase 1: Initialization and Neighbor Discovery
For each node do
    Execute deterministic neighbor discovery protocol to obtain a hop neighbor set and exchange location and information with neighbors.
End for
Phase 2: Calculate weights and elect intermediate cluster heads
For each node do
    Compute the bipartite graph weight for each node based on (16)–(19);
    Find the maximum-weight bipartite subgraph for each node;
End for
For each node do
    If then
        Elect as an intermediate CH;
    Else
        Elect as an intermediate CH;
    End if
End for
Phase 3: Handling invitations and determining cluster members
For each node with status == Initial and with intermediate cluster head nodes within the transmission range R do
    If received CH_invitation from one or more intermediate CH nodes then
        Select the inviting node with the highest weight;
        Status becomes Clustered CM;
    End if
End for
Phase 4: Competition among intermediate cluster heads, resulting in the final cluster head
For each intermediate CH , do
    If
        The node , with higher weight becomes the Clustered CH;
    Else
        If
            Node becomes the Clustered CH;
        Else if
            Node becomes the Clustered CH;
        Else if
            If
                Node becomes the Clustered CH;
            Else
                Node becomes the Clustered CH;
            End if
        End if
    End if
End for
Phase 5: Dealing with ties and conflicts
For nodes with tied weights or receiving multiple invitations do
    Resolve conflicts based on the rules of “higher comprehensive energy state” or “fewer cluster members”;
    Update its own status and cluster relationships accordingly;
End for
Termination condition: the algorithm terminates when all nodes have a state in {Clustered CH, Clustered CM} or after a maximum of N rounds.
Return Clustered CH set and complete Clustered CM mapping table.
|
To validate the effectiveness of the proposed improved algorithm in forest environments,
Figure 5 illustrates its specific clustering process. By incorporating node energy status as a key parameter, the proposed algorithm achieves significant improvements in both energy consumption efficiency and network lifetime compared to conventional clustering algorithms. The following section provides a detailed explanation and analysis of the algorithm’s execution flow through a concrete example.
As shown in
Figure 5, the network forms clusters using the improved clustering algorithm, where yellow nodes represent cluster heads (CHs), blue nodes denote cluster members (CMs), and red nodes indicate Intermediate cluster heads (ICHs). As discussed earlier, the final CHs are selected from the ICHs through a secondary election mechanism. This innovative cluster head election strategy incorporates key parameters—namely, the node’s residual energy and energy-harvesting rate—to optimize the CH selection process. By adopting this dynamic weight allocation strategy, the algorithm effectively balances energy consumption across the network, preventing certain nodes from premature failure due to excessive load, thereby significantly extending the overall network lifetime.
In the final cluster structure, considerable variations in distance may exist between cluster members and their respective CHs. To address this issue, the intra-cluster communication mechanism is further optimized in the subsequent section by adaptively selecting between single-hop and multi-hop transmission modes. This approach minimizes energy consumption while maintaining network connectivity, leading to improved overall network performance.
3.4. Q-Learning-Based Adaptive Single/Multi-Hop Optimization Algorithm
As indicated by the preceding analysis, optimizing communication between cluster members and the cluster head requires a novel K-Hop adaptive routing algorithm. This algorithm holistically incorporates dynamic network parameters—such as inter-node distance, channel quality, and neighbor node density—to intelligently select the optimal communication mode, thereby reducing intra-cluster communication energy consumption and extending the network lifetime.
Building upon an existing energy consumption model, this paper develops a Q-learning-based adaptive communication algorithm and evaluates its performance through simulation experiments. The results demonstrate that, compared to conventional fixed single-hop or multi-hop transmission schemes, the proposed algorithm achieves higher energy efficiency, prolongs network lifetime, and offers improved scalability and adaptability.
3.4.1. Q-Learning Framework Design
To optimize intra-cluster communication path selection, this study designs a Q-learning-based adaptive transmission mechanism. This framework treats each sensor node as an intelligent agent that learns the optimal single-hop/multi-hop transmission strategy through interaction with the environment. The core components of the Q-learning framework are defined as follows:
1. State Space (S)
The state of a node at time step t is defined as follows:
$$S_t = \left(E_i(t),\ d_{CH},\ \gamma_i,\ \rho_{neigh}(t)\right)$$
where
$E_i(t)$ is the current residual energy of the node;
$d_{CH}$ is the Euclidean distance from the node to its cluster head;
$\gamma_i$ is the current channel gain (based on the Rayleigh fading model in this article);
$\rho_{neigh}(t)$ is the density of neighboring nodes (reflecting the number of available relay nodes).
2. Action Space
The action at each step is to choose a transmission mode:
action value 0: single-hop direct transmission to the CH;
action value K: multi-hop transmission with K relays.
3. Reward Function
The reward function is designed to jointly optimize energy efficiency and link reliability. Its expression is given as follows:
where
is the energy consumption for executing action
;
is the energy absorption rate of the current node, which is described in detail in (21);
is the transmission success indicator function (success = 1, failure = 0);
is the number of hops (1 for a single-hop and ≥2 for multiple hops);
,
and
are weight coefficients used to balance energy, reliability, and path length and their values are, respectively, 0.5, 0.3, 0.2.
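To make the role of the three weight coefficients concrete, the following minimal Python sketch evaluates a reward of one plausible form. The functional form is an assumption made for illustration, not the paper's expression: the energy term is taken as the ratio of action energy cost to harvesting rate (penalized), the success indicator is rewarded, and the hop count is penalized. Only the weights 0.5, 0.3, and 0.2 come from the text; the function and parameter names are hypothetical.

```python
# Weights for energy, reliability, and path length as stated in the text.
W_ENERGY, W_SUCCESS, W_HOPS = 0.5, 0.3, 0.2


def reward(energy_cost, harvest_rate, success, hops, eps=1e-9):
    """Reward of one transmission under an ASSUMED functional form.

    energy_cost  : energy consumed by the chosen action
    harvest_rate : energy absorption rate of the current node
    success      : True if the transmission succeeded
    hops         : number of hops (1 for single-hop, >= 2 for multi-hop)
    eps          : small constant avoiding division by zero (assumption)
    """
    # Energy cost relative to harvesting: low harvesting makes the same
    # action more expensive, so it is penalized more strongly.
    energy_term = energy_cost / (harvest_rate + eps)
    success_term = 1.0 if success else 0.0
    return -W_ENERGY * energy_term + W_SUCCESS * success_term - W_HOPS * hops
```

Under this assumed form, the same action earns a lower reward when the harvesting rate drops, which matches the qualitative behavior the text attributes to the energy term.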
The specific Q-learning update rule is as follows:
$$Q(s, a) \leftarrow Q(s, a) + \alpha\left[R + \gamma \max_{a'} Q(s', a') - Q(s, a)\right]$$
where
$\alpha$ is the learning rate, which controls the rate at which new information overrides old Q-values. When $\alpha = 1$, the update fully replaces the previous value; when $\alpha = 0$, no update is performed. In this article, $\alpha$ is set to 0.1.
$Q(s, a)$ denotes the Q-value of selecting transmission path $a$ in state $s$.
R represents the reward returned by the environment after executing the current communication path.
$\gamma$ is the discount factor, reflecting the importance of future rewards. A value of $\gamma = 0$ implies a focus solely on immediate rewards, while $\gamma = 1$ emphasizes long-term returns. In this article, $\gamma$ is set to 0.9.
$\max_{a'} Q(s', a')$ refers to the maximum Q-value among all possible transmission paths in the next state $s'$ [37].
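The update described above corresponds to the standard tabular Q-learning rule. The following minimal Python sketch applies one update with the stated learning rate of 0.1 and discount factor of 0.9; the dictionary-based Q-table, the function name, and the state and action labels are assumptions made for illustration.

```python
from collections import defaultdict

ALPHA = 0.1  # learning rate stated in the text
GAMMA = 0.9  # discount factor stated in the text


def q_update(q, state, action, reward, next_state, actions,
             alpha=ALPHA, gamma=GAMMA):
    """Apply one tabular Q-learning update and return the new Q-value.

    q       : mapping (state, action) -> Q-value, defaulting to 0.0
    actions : iterable of actions available in next_state
    """
    # Best value achievable from the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    # Move the old estimate a fraction alpha toward the target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Usage: two actions (0 = single-hop, 2 = two-hop relay), hypothetical states.
q_table = defaultdict(float)
q_table[("s1", 0)] = 1.0
q_table[("s1", 2)] = 2.0
new_value = q_update(q_table, "s0", 0, reward=1.0,
                     next_state="s1", actions=(0, 2))
```

With `alpha=1` the old estimate is replaced by the target, and with `alpha=0` it is left unchanged, in line with the description of the learning rate.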
To empirically verify the convergence of the Q-learning algorithm, we recorded the number of rounds required for each node to reach the convergence condition under the aforementioned parameters.
Table 1 summarizes the statistical data collected from 100 nodes during 10 independent runs. The average number of convergence rounds is 158, with a standard deviation of 22 rounds, indicating that most nodes are stable within 200 rounds.
Adaptation under Q-learning works as follows: before each communication, a node probes the channel to obtain its current attenuation coefficient g, and then selects the optimal transmission mode based on this coefficient and the transmission distance. The instantaneous channel gain is
, satisfying the
distribution.
,
, where
m is the shape parameter and
is the control parameter. Since this article is conducted under the Rayleigh channel, $m = 1$ [38]. After the formation of clusters, nodes in the forest environment are trained with this Q-learning model to evaluate the reward of each communication link and choose the best communication path. Through Q-learning, nodes adaptively choose between single-hop and multi-hop modes, dynamically optimizing network lifetime and reliability.
Based on the Q-learning framework described above, before each transmission a node perceives the environmental state and obtains its current energy, channel state, and neighbor table. The flowchart of the proposed Q-learning-based adaptive single-hop/multi-hop routing algorithm is shown in
Figure 6.
3.4.2. Theoretical Analysis of Forest Environment Hybrid Transmission Mode Based on Q-Learning
To supplement the empirical simulation results and demonstrate why the proposed Q-learning based hybrid transmission mechanism is optimal for forest environments, this section discusses and analyzes two key aspects: energy-harvesting variability and channel dynamics.
Firstly, forest energy sources (solar and biomass) exhibit strong seasonal and diurnal variations. Let $V_{solar}(t)$ and $V_{bio}(t)$ denote the harvesting rates at time t. The Q-learning reward function incorporates both energy consumption and harvesting awareness:
where
represents the energy cost relative to harvesting. In periods of low harvesting (e.g., winter nights), this term penalizes high-energy actions more severely, steering the policy toward energy-conserving multi-hop paths when necessary. Thus, the reward function inherently captures the energy sustainability constraint, which is critical in energy-harvesting WSNs.
Secondly, the Q-learning state representation $S_t = (E_i(t), d_{CH}, \gamma_i, \rho_{neigh}(t))$ explicitly includes the instantaneous channel gain $\gamma_i$. This allows the nodes to perceive real-time channel conditions and select the transmission mode that minimizes the expected energy consumption given the current fading state. Theoretically, for a given distance d and channel gain $\gamma_i$, the optimal transmission mode satisfies:
Equation (24) estimates the optimal (minimum) energy consumption for transmission mode selection under the given distance d and channel gain γ.
3.5. Algorithm Complexity and Communication Overhead Analysis
To thoroughly evaluate the practicality of the proposed protocol for resource-constrained forest WSNs, a theoretical analysis of its computational complexity and communication overhead is essential.
3.5.1. Time Complexity Analysis
The time complexity of the protocol stems from three core procedures: neighbor discovery, dynamic weight-based clustering, and Q-learning-based path selection.
Neighbor discovery protocol: Each node executes a deterministic wake-up schedule to discover its one-hop neighbors. For a network with N nodes, maintaining and comparing neighbor lists incurs a complexity of per discovery cycle.
Dynamic weight-based clustering algorithm: Each node constructs a local bipartite graph and calculates weights for its neighbors. Let k be the average number of neighbors per node. The weight calculation and local graph processing for one node require times. As this process is executed by all N nodes during cluster formation or re-clustering, the overall worst-case complexity for this phase is . In typical forest WSN deployments where nodes are sparsely distributed, k is much smaller than N, making this computation manageable.
Q-learning-based adaptive transmission: In each communication round, a node selects an action based on its current Q-table. The Q-value update operation has a constant complexity per state–action pair. By discretizing the state space and using a table-lookup mechanism, the online decision-making overhead per transmission is minimal and independent of network size.
Overall time complexity: The dominant factor is the clustering phase with complexity. Given the sparse connectivity and the fact that clustering is not performed every round but at longer intervals, the proposed protocol exhibits scalable and acceptable time complexity for practical implementation on sensor nodes.
3.5.2. Memory Overhead Analysis
To ensure the proposed Q-learning framework is feasible for memory-constrained sensor nodes, we adopt a state discretization strategy that keeps the Q-table compact. The continuous state variables are discretized as follows:
Residual energy ($E_i$) is divided into five levels according to percentage intervals. Distance to the cluster head ($d_{CH}$) is partitioned into four zones relative to the communication radius R. Channel gain ($\gamma_i$) is categorized into three quality levels based on the received signal-to-noise ratio. Neighbor density is classified into three grades according to the number of one-hop neighbors.
After discretization, the state space size is reduced to $5 \times 4 \times 3 \times 3 = 180$. With the action space corresponding to transmission mode selection ($|A| = 4$), the complete Q-table contains $180 \times 4 = 720$ entries. Storing each Q-value as a 4-byte float results in a maximum static Q-table memory footprint of approximately 2.88 KB. In practical deployment, only the state-action pairs actually visited are dynamically maintained, thereby further reducing the average runtime memory usage.
In summary, the overall memory consumption is within the memory budget of mainstream low-power sensor nodes (5–10 KB), which verifies the practical feasibility of deploying this protocol on resource-limited hardware.
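The memory figures above can be checked with a short Python sketch that also shows one possible discretization. The level counts (5, 4, 3, 3), the 4-byte entry size, and the 720-entry total come from the text; the action count of 4 is inferred from 720 / 180, and the equal-width bins, the SNR thresholds, and the neighbor-count thresholds are hypothetical choices made only for illustration.

```python
ENERGY_LEVELS = 5     # residual energy levels (from the text)
DISTANCE_ZONES = 4    # distance-to-CH zones (from the text)
CHANNEL_LEVELS = 3    # channel quality levels (from the text)
DENSITY_GRADES = 3    # neighbor density grades (from the text)
NUM_ACTIONS = 4       # inferred from 720 entries / 180 states
BYTES_PER_Q = 4       # one 4-byte float per Q-value (from the text)


def bucket(value, upper, levels):
    """Map a value in [0, upper] to an equal-width bin index 0..levels-1."""
    if upper <= 0:
        raise ValueError("upper bound must be positive")
    frac = min(max(value / upper, 0.0), 1.0)
    return min(int(frac * levels), levels - 1)


def discretize(energy, e_max, dist, radius, snr_db, neighbors):
    """Return the discrete state tuple; all thresholds are hypothetical."""
    e_bin = bucket(energy, e_max, ENERGY_LEVELS)
    d_bin = bucket(dist, radius, DISTANCE_ZONES)
    c_bin = 0 if snr_db < 5 else (1 if snr_db < 15 else 2)
    n_bin = 0 if neighbors <= 2 else (1 if neighbors <= 5 else 2)
    return (e_bin, d_bin, c_bin, n_bin)


num_states = ENERGY_LEVELS * DISTANCE_ZONES * CHANNEL_LEVELS * DENSITY_GRADES
q_entries = num_states * NUM_ACTIONS
q_bytes = q_entries * BYTES_PER_Q  # 2880 bytes, i.e. about 2.88 KB
```

Storing the table as a dictionary keyed by visited state-action pairs, rather than as a dense array, is one way to realize the reduced runtime footprint mentioned in the text.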
3.5.3. Communication Overhead Analysis
The communication overhead comprises control packets for network management and data packets for information delivery.
Control overhead:
Neighbor discovery: Each node periodically broadcasts a “Hello” packet containing its ID, location, and residual energy. This contributes packets per discovery cycle.
Cluster head election: Candidate CHs broadcast invitation messages, and member nodes send join-request or acknowledgment packets. If M clusters are formed with an average of members each, this phase generates approximately control packets. Our dynamic weight mechanism helps stabilize cluster structures, reducing the frequency of re-election and thus the amortized control overhead.
Q-learning state synchronization: Nodes occasionally exchange link-state or reward information with one-hop neighbors to aid learning. This overhead is limited to per node and occurs infrequently after learning converges.
Data transmission overhead: The adaptive single/multi-hop mechanism minimizes redundant packet forwarding by dynamically choosing the most energy-efficient path. Compared to a fixed multi-hop scheme, it reduces the average number of relay transmissions, especially for members located at moderate distances from the CH.
The proposed protocol introduces moderate control overhead during the setup and learning phases. However, by significantly optimizing the data transmission paths and enhancing network stability (reducing frequent re-clustering), it achieves a favorable trade-off, leading to lower total energy consumption per delivered data packet, as validated by simulations.
3.6. Algorithm Performance Analysis
A systematic simulation study was conducted under the Rayleigh channel model. To ensure statistical significance, a large-scale repeated experiment approach (
M = 10,000) (N = 10,000) was adopted, and all performance metrics were estimated using sample means. This section focuses on comparing the energy efficiency of three communication schemes: single-hop transmission, fixed multi-hop transmission, and the proposed Q-learning-based adaptive single/multi-hop optimization algorithm. The impact of transmission distance on system energy consumption is also thoroughly investigated. The complete simulation parameters are provided in
Table 2.
3.6.1. Energy Consumption Comparison
Figure 7 shows the energy consumption of single-hop and multi-hop communication modes under different transmission distances. Analysis of the experimental data leads to the following key conclusions:
(1) In short-distance scenarios (d < 22.5 m), multi-hop communication consumes more energy than single-hop due to additional packet reception processing and more frequent transmission operations;
(2) As the transmission distance increases (d > 22.5 m), the energy consumption of single-hop communication rises sharply because path loss grows as a power of the link distance;
(3) Notably, under the current experimental configuration, the energy consumption curves of the two modes intersect at d = 22.5 m, indicating a balance in energy cost between single-hop and multi-hop transmission.
This critical finding underscores the necessity of adopting an adaptive communication strategy. By dynamically selecting the optimal transmission mode, energy consumption can be minimized across all transmission distances.
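The existence of such a crossover distance can be illustrated with a minimal Python sketch based on a first-order radio energy model. This is not the paper's simulation: the model, the per-bit electronics energy, the amplifier coefficient, the path-loss exponent, and the assumption of a single relay placed at the midpoint are all hypothetical, so the resulting crossover differs from the 22.5 m reported for the paper's Rayleigh-channel configuration.

```python
E_ELEC = 50e-9       # electronics energy per bit, J/bit (hypothetical)
EPS_AMP = 100e-12    # amplifier energy, J/bit/m^n (hypothetical)
PATH_LOSS_EXP = 2.0  # path-loss exponent n (hypothetical)


def single_hop_energy(d):
    """Per-bit transmit energy for sending directly over distance d."""
    return E_ELEC + EPS_AMP * d ** PATH_LOSS_EXP


def two_hop_energy(d):
    """Per-bit energy with one relay at the midpoint: two transmissions
    over d/2 plus one extra reception at the relay."""
    tx = E_ELEC + EPS_AMP * (d / 2) ** PATH_LOSS_EXP
    return 2 * tx + E_ELEC


def crossover_distance():
    """Distance at which both modes cost the same, obtained by solving
    single_hop_energy(d) == two_hop_energy(d) for d (requires n > 1)."""
    n = PATH_LOSS_EXP
    return (2 * E_ELEC / (EPS_AMP * (1 - 2 ** (1 - n)))) ** (1 / n)


d_star = crossover_distance()  # about 44.7 m with the values above
```

Below `d_star` the extra reception and second transmission make relaying more expensive; above it the power-law growth of the single-hop amplifier term dominates, which is the qualitative pattern described for Figure 7.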
3.6.2. Intra-Cluster Communication Results Based on Q-Learning Adaptive Transmission
The Q-learning-based adaptive transmission strategy achieves significantly lower energy consumption compared to both pure single-hop and multi-hop modes. Therefore, this approach is adopted for communication between cluster members and the cluster head to optimize intra-cluster energy efficiency.
Figure 8 illustrates the optimized intra-cluster communication topology after applying the adaptive strategy.
The left subfigure shows the original communication structure within the cluster. In this configuration, some cluster members are located far from the cluster head, offering multiple potential transmission paths (e.g., direct single-hop or multi-hop relay). To minimize energy consumption, an adaptive transmission mode selection mechanism is introduced to dynamically switch between single-hop and multi-hop strategies. The red paths in the right subfigure represent the adaptive communication links established between the cluster head and remote members. By selecting the optimal transmission mode, the proposed algorithm significantly reduces the energy consumption of intra-cluster communication.
4. Analysis of Simulation Experiment Results
To comprehensively evaluate the performance of the proposed Dynamic Multi-Source Clustering Routing Protocol, we systematically compare it with two categories comprising seven state-of-the-art algorithms in a forest-based energy-harvesting wireless sensor network environment. All comparisons are conducted under identical simulation conditions to ensure fairness, including unified node deployment, channel models, dual-source energy-harvesting models (solar and biomass), and data traffic loads. The first category comprises classical and enhanced clustering algorithms, including (1) the K-SACB-EC algorithm [
13], which employs K-hop bipartite graph contraction for clustering but does not consider real-time node energy status or optimize intra-cluster communication; (2) the traditional LEACH protocol [
12], serving as a classic probabilistic clustering benchmark; (3) the Stable Perception Clustering (NSAC) protocol, which incorporates both energy consumption and spectral dynamics during clustering; and (4) the Spectrum-Aware Clustering based on a Weighted Clustering Metric (SAC-WEN) algorithm. The second category consists of routing protocols specifically designed for hybrid energy systems, including (5) the Hybrid Energy-Aware Routing Protocol (HEARP) [
39]; (6) the Dynamic Energy Harvesting Clustering protocol (DEH-Cluster) [
40]; and (7) the Multi-source Hybrid Energy Routing protocol (MHER) [
41]. These three represent distinct design philosophies in contemporary energy-aware routing. In terms of implementation, we faithfully reconstruct the core mechanisms of each algorithm according to their original publications and carefully map and adapt their parameters to our unified simulation framework. For instance, algorithms not originally designed for energy harvesting are integrated with our dynamic energy-harvesting models, while those involving other energy sources are adapted to operate with our dual-source solar/biomass energy model.
To ensure the statistical validity and reproducibility of our simulation results, we have adopted the following rigorous methodology:
Random seed management: All simulations were conducted using a fixed set of random seeds (e.g., seeds 1 through 10 for 10 independent runs) to initialize node deployment, channel fading, and energy-harvesting patterns. The results presented in
Section 4 are the average values obtained from these 10 independent runs. The use of fixed seeds allows any researcher to exactly reproduce the sequence of “random” events in our experiments.
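The seed-management procedure can be sketched in Python as follows. The paper's simulations were run in MATLAB, so this is only an illustration of the idea: the function names are hypothetical, and the metric computed per run (mean node distance to the base station at (300, 150) in a 300 m × 300 m field of 100 nodes) is a stand-in for the real performance metrics.

```python
import math
import random
import statistics

SEEDS = range(1, 11)  # seeds 1 through 10, one per independent run


def run_once(seed, num_nodes=100, side=300.0, base_station=(300.0, 150.0)):
    """Run one placeholder 'simulation' with a dedicated seeded generator."""
    rng = random.Random(seed)  # private RNG, so runs do not interfere
    nodes = [(rng.uniform(0.0, side), rng.uniform(0.0, side))
             for _ in range(num_nodes)]
    bx, by = base_station
    # Placeholder metric: mean distance from the nodes to the base station.
    return sum(math.hypot(x - bx, y - by) for x, y in nodes) / num_nodes


def averaged_metric():
    """Average the metric over all seeded runs and report its spread."""
    results = [run_once(seed) for seed in SEEDS]
    return statistics.mean(results), statistics.stdev(results)
```

Because each run draws from its own `random.Random(seed)` instance, repeating the experiment reproduces the same deployments and therefore the same averages.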
To further clarify the experimental procedure and ensure reproducibility, the following execution parameters and termination conditions were applied uniformly to all compared algorithms:
Simulation runs per scenario: Each configuration (e.g., different seasons, node counts) was simulated for 10 independent runs as stated above.
Convergence criterion for Q-learning: For our proposed algorithm, the intra-cluster Q-learning process was considered converged, and its policy was fixed for performance evaluation, when the maximum change in the Q-value table between consecutive rounds was less than 0.01 for a period of 50 consecutive rounds. This ensures the adaptive transmission strategy has reached a stable state before its energy efficiency is measured.
Data collection phase: After the cluster formation and (for our algorithm) Q-learning convergence phases, network performance metrics were collected over a stable monitoring period of 500 consecutive communication rounds. The results presented are averages over this period and across all independent runs.
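The convergence criterion stated above (maximum Q-value change below 0.01 for 50 consecutive rounds) can be expressed as a small monitor object. The following Python sketch is one possible realization; the class name, the dictionary representation of the Q-table, and the treatment of unseen entries as 0.0 are assumptions made for illustration.

```python
class ConvergenceMonitor:
    """Tracks whether the Q-table has been stable for enough rounds."""

    def __init__(self, tol=0.01, patience=50):
        self.tol = tol            # maximum allowed change per round
        self.patience = patience  # consecutive stable rounds required
        self.stable_rounds = 0
        self._prev = None

    def update(self, q_table):
        """Call once per round with a dict (state, action) -> Q-value.
        Returns True once the criterion has been met."""
        if self._prev is not None:
            keys = set(q_table) | set(self._prev)
            max_change = max(
                (abs(q_table.get(k, 0.0) - self._prev.get(k, 0.0))
                 for k in keys),
                default=0.0,
            )
            if max_change < self.tol:
                self.stable_rounds += 1
            else:
                self.stable_rounds = 0  # any large change restarts the count
        self._prev = dict(q_table)  # snapshot for the next comparison
        return self.stable_rounds >= self.patience
```

Once `update` returns `True`, the learned policy can be frozen and the 500-round data collection phase started.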
4.1. Parameter Settings for Simulation Experiments
The performance of the proposed algorithm was evaluated through simulations conducted in MATLAB 2019b. The simulation parameters are categorized into two groups: system-fixed parameters and seasonally varied parameters, as detailed in
Table 3 and
Table 4, respectively. Among them, seasonal parameters integrate research results from multiple fields of forest ecology and energy harvesting. For example, the
(environmental attenuation factor) in the solar energy-harvesting model refers to the measured data on canopy shading effect in [
28]. The peak value of matrix mass
was set based on dynamic observation data of subtropical forest litter [
31]. The value of seasonal coefficient
is associated with the study of the seasonal response of soil enzyme activity to temperature [
33]. System efficiency
refers to the latest design report of the forest environment mixed energy harvester [
32].
In a forest area of 300 m × 300 m, 100 sensor nodes were randomly deployed with a sensing radius R and communication radius r of 40 m, and the initial energy of each sensor node was 2 J. The nodes adopted an adaptive energy supply strategy based on the maximum energy-harvesting rate: when the solar energy-harvesting rate exceeds the biomass energy-harvesting rate, the solar harvesting mode is prioritized; otherwise, the node switches to the biomass harvesting mode. The base station is fixed at (300, 150). In the simulation result figures, blue dots represent cluster members, red dots represent intermediate cluster heads, and yellow dots represent final cluster heads.
4.2. Network Simulation Result Diagram
To evaluate the network optimization performance of the algorithm proposed in this article,
Figure 9 shows the network structure after intra-cluster optimization. A comprehensive performance comparison experiment was conducted between the algorithm proposed in this article and the four communication algorithms mentioned above under identical parameter configuration conditions. Through quantitative comparison of multidimensional indicators, the superiority of the proposed algorithm in energy-harvesting wireless sensor networks in forest environments has been verified.
4.3. Comparison with Clustering Algorithms
4.3.1. Comparison of Network Energy Consumption
The simulation results demonstrate that the proposed algorithm significantly improves energy efficiency in energy-harvesting wireless sensor networks deployed in forest environments. As illustrated in
Figure 10, the total energy consumption of our algorithm consistently remains lower than that of the four benchmark algorithms across different communication rounds, validating the theoretical hypothesis regarding energy optimization. Furthermore,
Figure 11 and
Figure 12 show that under a fixed number of communication rounds (e.g., 200 rounds), the proposed algorithm maintains the lowest total energy consumption as the number of nodes increases from sparse to dense deployments and as the network area expands. Notably, the algorithm sustains its performance advantage even under high node density and large-scale network conditions, demonstrating remarkable scalability and environmental adaptability. This capability allows it to effectively address communication challenges arising from dense node deployments. Compared to existing algorithms, the proposed method exhibits superior overall performance in terms of energy efficiency.
4.3.2. Energy Consumption Balance and Stability Analysis
This article uses the Energy Consumption Balance Index (ECBI), defined as the ratio of the standard deviation to the mean of node energy consumption, to measure how uniformly energy consumption is distributed among the nodes in a network. Its expression is as follows:
$$ECBI = \frac{\sigma_E}{\bar{E}}$$
where $\sigma_E$ is the standard deviation of energy consumption for all nodes, expressed as follows:
$$\sigma_E = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(E_i - \bar{E}\right)^2}$$
The average energy consumption of all nodes $\bar{E}$ is expressed as follows:
$$\bar{E} = \frac{1}{N}\sum_{i=1}^{N} E_i$$
$E_i$ is the total energy consumption of the i-th node, and N is the total number of nodes. The ECBI values of the compared algorithms at different rounds are shown in the figure below.
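As a quick numerical illustration of the index, the following Python sketch computes the ratio of standard deviation to mean for a list of per-node energy consumptions. The function name is hypothetical, and the use of the population standard deviation (dividing by N rather than N − 1) is an assumption.

```python
import math


def ecbi(consumptions):
    """Energy Consumption Balance Index: standard deviation over mean.

    consumptions : per-node total energy consumption values
    Uses the population standard deviation (an assumption).
    """
    n = len(consumptions)
    if n == 0:
        raise ValueError("at least one node is required")
    mean = sum(consumptions) / n
    if mean == 0:
        return 0.0  # no energy consumed anywhere: treat as perfectly balanced
    variance = sum((e - mean) ** 2 for e in consumptions) / n
    return math.sqrt(variance) / mean


balanced = ecbi([1.0, 1.0, 1.0])  # identical consumption on every node
skewed = ecbi([1.0, 3.0])         # one node consumes three times as much
```

A lower value indicates a more even distribution of energy consumption, so perfectly uniform consumption yields an index of zero.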
The results from
Figure 13 indicate that the algorithm proposed in this paper exhibits significant advantages in energy balance. Compared to other algorithms, this algorithm can achieve a more uniform distribution of energy consumption, effectively avoiding the common energy overload problem of cluster head nodes in traditional solutions.
Figure 14 illustrates the proportion of nodes with residual energy above 20% as a function of communication rounds. Nodes with residual energy below 20% are defined as energy-critical nodes, which face a high risk of depletion. A higher proportion of such nodes indicates poorer network stability. The experimental results demonstrate that as the number of communication rounds increases, the proposed algorithm maintains the proportion of high-energy nodes consistently above 90%, significantly outperforming the compared algorithms. This indicates a notable advantage of our method in enhancing network stability and extending the network lifespan.
Figure 15 and
Figure 16 further illustrate the relationship between the proportion of nodes with residual energy above 20% and the network edge length, as well as the round in which the first low-energy node appears under each algorithm. Consistently, the proposed algorithm demonstrates superior performance compared to the other methods in both metrics.
By optimizing the energy allocation mechanism, the proposed algorithm balances energy consumption among nodes effectively, thereby substantially reducing the risk of premature node failures and subsequent degradation in network performance. This balanced energy consumption profile not only improves the reliability of network communications but also remarkably extends the overall network lifetime, offering enhanced operational sustainability for energy-constrained wireless sensor networks.
4.4. Comparison with Hybrid Energy-Aware Algorithms
To comprehensively evaluate the performance of the proposed Dynamic Clustering Routing Protocol for Multi-Source networks, we compare it with three representative hybrid energy-aware routing algorithms: HEARP (Hybrid Energy-Aware Routing Protocol) [
39], which employs static hybrid energy supply; DEH-Cluster (Dynamic Energy Harvesting Clustering Protocol) [
40], which features dynamic energy prediction but uses fixed single-hop transmission; and MHER (Multi-source Hybrid Energy Routing) [
41], which supports solar and biomass energy switching but relies on a fixed multi-hop routing strategy. All compared algorithms were executed within the same simulation environment, with consistent node deployment, energy-harvesting models, and channel conditions.
Figure 17 illustrates the energy consumption comparison of the four algorithms under four typical forest environmental conditions: “Summer,” “Winter,” “Spring,” and “Autumn.” In the energy-sufficient and ideal environment of Summer, the energy consumption of the proposed algorithm is already at the lowest level. Its performance advantage further expands as environmental conditions deteriorate. For instance, in the “Winter” scenario, characterized by severely insufficient solar irradiance, the energy consumption of the proposed algorithm is significantly lower than that of HEARP, DEH-Cluster, and MHER. This indicates that the integrated dynamic energy-switching strategy and adaptive routing mechanism proposed in this work can effectively mitigate network instability caused by the intermittency of a single energy source, enabling the network to maintain high energy efficiency under various harsh conditions.
The comparative analysis presented in
Figure 18,
Figure 19 and
Figure 20 comprehensively demonstrates the superiority of the proposed protocol across several dimensions critical for forest Energy-Harvesting Wireless Sensor Networks (EH-WSNs). As shown in
Figure 18, the proposed algorithm significantly delays the emergence of the first low-energy node compared to HEARP, DEH-Cluster, and MHER, thereby extending the network’s stable operational lifetime. This translates directly to enhanced network longevity and reliability for long-term monitoring missions. Complementing this,
Figure 19 reveals that the proposed algorithm achieves the lowest Energy Consumption Balance Index (ECBI), indicating its exceptional capability in distributing communication loads and energy expenditure evenly among nodes. This balanced energy consumption mitigates the formation of network hotspots and prevents premature cluster head failures. Finally, the temporal evolution of the network’s health status, depicted in
Figure 20, confirms that the proposed algorithm maintains the highest proportion of nodes with sufficient energy reserves (>20%) throughout the entire simulation. The consistently high and stable curve for our algorithm contrasts sharply with the steeper declines observed for the benchmark protocols, highlighting its effectiveness in sustaining network vitality.
In essence, these results collectively validate the comprehensive performance advantage conferred by the integrated design of the proposed protocol, which combines dynamic multi-source energy switching, energy-aware cluster head election, and Q-learning-based adaptive transmission. It not only minimizes the total energy consumption but, more importantly, it also ensures network robustness, stability, and extended operational lifetime when faced with the dynamic and intermittent energy-harvesting conditions characteristic of forest environments.
4.5. Sensitivity Analysis: Impact of the Switching Threshold θ
To validate the rationale behind the selection of the threshold parameter θ in the dynamic multi-source energy-switching strategy and to assess its impact on the performance of the proposed protocol, a systematic sensitivity analysis is conducted. In the experiments, the threshold θ is varied within the range [0.1, 0.5] with a step size of 0.1, giving five configurations: θ = 0.1, 0.2, 0.3, 0.4, and 0.5. The simulation environment remains consistent with that described in Section 4.1. The performance metrics include (1) total network energy consumption, (2) the round in which the first node’s residual energy drops below 20% (FNB20), and (3) the Energy Consumption Balance Index (ECBI). The specific results are shown in Table 5.
Energy efficiency and network health: The configuration with θ = 0.2 achieves the lowest total energy consumption and the latest FNB20 (Round 136). This indicates that this threshold best balances energy-efficient operation against the energy health of network nodes, thereby effectively delaying network performance degradation. This result is consistent with the overall algorithm performance (FNB20 = 136 rounds) reported in Section 4.3, supporting the choice of θ = 0.2 in complex, integrated environments.
Energy balance: The ECBI is also minimized at θ = 0.2, suggesting the most uniform energy consumption distribution among nodes under this setting, which helps prevent specific nodes from prematurely entering a low-energy critical state.
A threshold that is too small (e.g., θ = 0.1) makes the switching condition overly sensitive, causing nodes to switch frequently in response to minor fluctuations in harvesting rates. A threshold that is too large (e.g., θ = 0.4 or 0.5) makes the switching condition too lax, causing nodes to persist with the current energy mode even when another source would serve them better.
Robustness in the integrated environment: θ = 0.2 demonstrates the best overall performance under the complex, energy-fluctuating integrated four-season scenario, confirming its effectiveness and adaptability for practical forest-monitoring applications.
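The trade-off described above can be illustrated with a toy model. The sketch below assumes a simplified switching rule, in which a node changes source only when the alternative source's harvesting rate exceeds the current one by a relative margin θ; this rule, the function name `count_switches`, and the two rate series are hypothetical and serve only to show the qualitative effect of θ, not the switching condition defined in this work.

```python
from typing import Sequence

def count_switches(rates_a: Sequence[float],
                   rates_b: Sequence[float],
                   theta: float) -> int:
    """Count source switches under an assumed relative-margin rule:
    switch when the alternative rate > current rate * (1 + theta)."""
    current = "a"  # assume the node starts on source a
    switches = 0
    for ra, rb in zip(rates_a, rates_b):
        cur, alt = (ra, rb) if current == "a" else (rb, ra)
        if alt > cur * (1.0 + theta):
            current = "b" if current == "a" else "a"
            switches += 1
    return switches

# Hypothetical normalized harvesting rates per round (not measured data).
solar   = [1.00, 0.95, 1.05, 0.90, 1.10, 0.60, 0.55, 1.00]
biomass = [0.90, 1.08, 0.92, 1.05, 0.95, 0.80, 0.80, 0.75]

for theta in (0.1, 0.2, 0.5):
    print(f"theta={theta}: {count_switches(solar, biomass, theta)} switches")
```

On this toy series, the smallest threshold reacts to every minor crossover of the two rates, while the largest never leaves the initial source even during the sustained dip in rounds 6 and 7; an intermediate value switches only for that sustained dip and the subsequent recovery.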
4.6. Discussion and Simulation Limitations
While the simulation results demonstrate the efficacy of the proposed protocol, several limitations inherent to our simulation model should be acknowledged. First, the channel model assumes ideal Rayleigh fading without considering persistent shadowing effects from large, static obstacles (e.g., massive tree trunks), which could create permanent poor-connectivity zones. Second, the energy-harvesting models, though seasonally varied, are based on average daily rates and do not capture ultra-short-term stochastic fluctuations (e.g., sudden cloud cover for solar, or localized microbial activity bursts for biomass). Finally, the simulation is conducted in a two-dimensional plane; extending to three-dimensional terrain typical of forests would introduce additional challenges in distance calculation and link quality. These limitations outline important directions for future research, including integration with more granular environmental models and hardware-in-the-loop testing.