4.2.1. Theoretical Analysis of Sampling Cycle
The sampling cycle is at the heart of our system design and is the result of balancing and optimizing hardware limits, sensor settling time, system energy consumption, and the theoretical demands of environmental prediction tasks.
- (1)
Analysis of hardware and sensor physical limitations
The sampling period we chose is not the shortest interval the hardware permits; it is an optimized value that is much longer than that limit.
Hardware capability assessment:
Main control chip: The Arduino MKR-WAN1300 and STM32L series are both low-power but high-performance ARM Cortex-M microcontrollers capable of multi-sensor data acquisition, preprocessing, and packaging in milliseconds.
Sensor settling time: This is the most critical limiting factor. For example, the BME680 requires heating and stabilization time to achieve highly accurate readings of temperature, humidity, pressure, and volatile organic compounds, with stabilization and measurement times typically between 100 and 200 milliseconds. In contrast, the SDS011 laser dust sensor obtains stable readings in about 10 s.
Conclusion: From a pure hardware perspective, our system is fully capable of sampling at intervals of seconds or even less.
- (2)
The scientific basis for selecting the current sampling period
While the hardware features allowed for faster sampling, we ended up setting the sampling cycle to 5 min. This decision is based on three main scientific considerations.
Our prediction targets (e.g., temperature, humidity, PM2.5 concentration) are typical slow-changing processes. Their changes are governed by atmospheric physical and chemical processes, and variations on the scale of seconds or shorter are usually noise rather than useful signal.
Oversampling risk: If the sampling rate is too high (e.g., once per second), the system captures excessive high-frequency noise. This not only degrades the performance of the predictive model but also increases data redundancy and transmission overhead, and may cause the model to overfit to noise.
The Nyquist sampling theorem states that the sampling frequency must be at least twice the highest frequency of the signal. Our spectral analysis of historical environmental data shows that significant changes in energy occur mainly in the 16 Hz frequency range. Therefore, the sampling rate corresponding to our 5 min period, which is much higher than the Nyquist sampling rate for that range, fully captures all meaningful dynamic changes.
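For reference, the frequency band that a given sampling period can resolve follows directly from the theorem. The sketch below only performs this unit arithmetic for the 5 min period; the function names are illustrative and are not part of the original system.

```python
SAMPLING_PERIOD_S = 5 * 60  # the 5 min sampling period, in seconds


def sampling_rate_hz(period_s: float) -> float:
    """Sampling rate in Hz for a sampling period given in seconds."""
    return 1.0 / period_s


def nyquist_frequency_hz(period_s: float) -> float:
    """Highest signal frequency that can be sampled without aliasing (f_s / 2)."""
    return sampling_rate_hz(period_s) / 2.0


def shortest_resolvable_period_s(period_s: float) -> float:
    """Shortest signal period that still satisfies the Nyquist criterion."""
    return 2.0 * period_s


fs = sampling_rate_hz(SAMPLING_PERIOD_S)
f_nyq = nyquist_frequency_hz(SAMPLING_PERIOD_S)
t_min = shortest_resolvable_period_s(SAMPLING_PERIOD_S)

print(f"sampling rate:              {fs * 1e3:.3f} mHz")
print(f"Nyquist frequency:          {f_nyq * 1e3:.3f} mHz")
print(f"shortest resolvable period: {t_min / 60:.0f} min")
```

In other words, a 5 min period resolves variations whose period is 10 min or longer, which is the relevant check for slow-changing environmental signals.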
- B.
Limitations of system energy consumption and network life (engineering reality)
Wireless sensor networks (WSNs) are resource-constrained systems whose operational life is directly determined by energy consumption. The energy consumption model shows that the maximum power consumption of the sensor node occurs (a) during sensor wake-up and measurement, or (b) during wireless data transmission. Our calculations and experimental results show that extending the sampling interval from 1 min to 5 min can increase the theoretical lifetime of the node by nearly five times.
Data volume trade-offs: Longer sampling cycles reduce the number of packets transmitted per unit time, significantly reducing the risk of network congestion and communication energy consumption per node. This is especially critical for large-scale, battery-powered field deployments.
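The dependence of node lifetime on the sampling interval can be illustrated with a simple duty-cycle model. This is a minimal sketch, not the energy model used in the study: the battery capacity, currents, and active time below are hypothetical values chosen only for illustration.

```python
def average_current_ma(period_s, active_ma, active_s, sleep_ma):
    """Time-averaged current of a node that wakes once per sampling period."""
    if active_s > period_s:
        raise ValueError("active phase cannot exceed the sampling period")
    sleep_s = period_s - active_s
    return (active_ma * active_s + sleep_ma * sleep_s) / period_s


def lifetime_days(capacity_mah, avg_current_ma):
    """Ideal battery lifetime, ignoring self-discharge and voltage sag."""
    return capacity_mah / avg_current_ma / 24.0


# Hypothetical node parameters (assumptions, not measured values).
CAPACITY_MAH = 2000.0
ACTIVE_MA = 30.0   # sensing plus radio transmission
ACTIVE_S = 12.0    # wake-up, settling, measurement, uplink
SLEEP_MA = 0.01    # deep-sleep current

life_1min = lifetime_days(
    CAPACITY_MAH, average_current_ma(60, ACTIVE_MA, ACTIVE_S, SLEEP_MA)
)
life_5min = lifetime_days(
    CAPACITY_MAH, average_current_ma(300, ACTIVE_MA, ACTIVE_S, SLEEP_MA)
)
lifetime_gain = life_5min / life_1min

print(f"1 min: {life_1min:.1f} days, 5 min: {life_5min:.1f} days, "
      f"gain: {lifetime_gain:.2f}x")
```

Under these assumptions the gain is slightly below the ratio of the two intervals. When the sleep current is negligible the gain approaches 5x, and a larger sleep current pulls it lower.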
- C.
Predictive Task Fit
Our models are designed to predict the next 24 h. The 5 min time step provides sufficient temporal resolution for such medium-term predictions without producing excessively long and difficult-to-handle sequences. Using per-second data would require the BiLSTM to process extremely long sequences (e.g., 3600 time steps for a single hour), which significantly increases computational complexity and training difficulty but does not necessarily improve prediction performance.
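The effect of the time step on sequence length is simple arithmetic, sketched below. The window lengths are illustrative; the actual input window of the model is not specified here.

```python
def sequence_length(window_s: int, sampling_period_s: int) -> int:
    """Number of time steps needed to cover a window at a given sampling period."""
    return window_s // sampling_period_s


HOUR_S = 3600
DAY_S = 24 * HOUR_S

steps_day_5min = sequence_length(DAY_S, 5 * 60)  # 24 h at one sample per 5 min
steps_hour_1s = sequence_length(HOUR_S, 1)       # 1 h at one sample per second
steps_day_1s = sequence_length(DAY_S, 1)         # 24 h at one sample per second

print(steps_day_5min, steps_hour_1s, steps_day_1s)
```

A 24 h window is 288 steps at a 5 min period, whereas per-second data already needs 3600 steps for one hour.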
- (3)
Summary
In summary, the 5 min sampling interval we chose is an optimized, application-oriented parameter that follows from the reasoning below.
First, the sampling rate is well below the maximum capacity of the hardware, ensuring the reliability and stability of data acquisition. Second, it takes into account the physical characteristics of environmental signals, avoiding oversampling to maintain data quality. Third, it is strictly constrained by the energy consumption realities of the sensor network and is a necessary compromise for long-term sustainable ecological monitoring. The selection of this sampling cycle was therefore a rational decision, grounded in an understanding of hardware limitations, made to achieve the scientific prediction goals while ensuring system feasibility.
4.2.2. Sampling Cycle Experiment
The sensor network in this study uses a 5 min sampling cycle. This duration is determined by a trade-off between hardware capabilities, dynamic environmental characteristics, and system energy consumption limits. On the one hand, the 5 min interval is significantly longer than the minimum sampling interval supported by the main control chip and sensors, ensuring the reliability of data acquisition. On the other hand, spectral analysis of historical environmental data shows that the 5 min period fully meets the requirements of the Nyquist sampling theorem for dynamic signal changes in temperature and PM2.5 measurements. Additionally, this cycle effectively controls node energy consumption and maintains a manageable amount of data, which is crucial for extending the operational life of battery-powered networks.
Figure 8 illustrates the scientific and engineering rationale for the optimized 5 min sampling cycle.
Figure 8 contains two subplots. The left subplot compares the signal waveforms at different sampling rates. The blue curve (high-precision reference signal) shows the raw signal captured by the sensor at high-speed sampling (once per minute), including real-world ambient dynamics and high-frequency noise. The red dots (optimized sampling) show the blue curve sampled every 5 min. This sampling rate captures all major trends and key inflection points (such as peaks and troughs) while filtering out most of the high-frequency noise, providing the model with clean and representative input data. The green triangles (undersampled) indicate sampling every 60 min. This rate loses most of the signal detail and fails to reflect rapid environmental changes. For example, it misses the afternoon temperature surge entirely, an information loss that makes it unsuitable as input for the predictive model.
Conclusion: Compared with high or low sampling rates, the 5 min sampling period achieves the best balance between signal fidelity and data reduction.
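The comparison of sampling rates can be reproduced in miniature on a synthetic signal. The sketch below is not the data behind Figure 8: the reference signal is a hypothetical 1 min series with a single short-lived surge, used only to show how a 60 min grid can miss an event that a 5 min grid captures.

```python
import math


def reference_signal(n_minutes: int) -> list:
    """Hypothetical 1 min series: flat baseline of 20 with a surge peaking at minute 150."""
    return [20.0 + 5.0 * math.exp(-(((t - 150) / 20.0) ** 2)) for t in range(n_minutes)]


def subsample(signal: list, step: int) -> list:
    """Keep every `step`-th sample, starting from the first one."""
    return signal[::step]


reference = reference_signal(240)      # 4 h at one sample per minute
five_min = subsample(reference, 5)     # optimized sampling
sixty_min = subsample(reference, 60)   # undersampling

print(f"reference peak: {max(reference):.2f}")
print(f"5 min peak:     {max(five_min):.2f}  ({len(five_min)} samples)")
print(f"60 min peak:    {max(sixty_min):.2f}  ({len(sixty_min)} samples)")
```

With these assumed values the 5 min series reproduces the peak of the reference signal, while the 60 min series samples only the flanks of the surge and reports a much lower maximum.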
The right subplot shows the relationship between system energy consumption and sampling rate, that is, the trend of the average current consumption (or daily energy consumption) of a node as the sampling rate increases (as the sampling period shortens). The relationship is non-linear. When the sampling cycle is reduced from 60 min to 5 min, the energy consumption increases relatively smoothly, keeping the system life within acceptable limits. However, when the sampling cycle is further shortened to less than 1 min, the energy consumption rises sharply, leading to rapid battery depletion and seriously affecting the long-term deployment feasibility of the sensor network. The shaded area in the figure indicates the recommended window for sampling periods, balancing sufficient signal fidelity with acceptable system energy consumption.
Key findings: As shown in the left subplot, the 5 min sampling interval fully meets the data quality requirements of the environmental prediction task. The right subplot further shows that this interval lies within the “sweet spot” of the energy consumption curve, making large-scale long-term environmental monitoring technically feasible. The 5 min interval captures the core dynamic characteristics of the environmental signal and extends battery life by approximately four times compared to 1 min sampling, achieving the best balance between signal fidelity and system energy consumption. Therefore, the sampling interval we chose reflects scientific and engineering optimization rather than a hardware limitation.
4.2.3. Dataset Construction and Preprocessing
To verify the performance of the proposed model in real-world dynamic WSN scenarios, a six-month experiment was designed and conducted. The experimental setup is detailed in Section 4.1 to ensure reproducibility of the study.
Network topology: We deployed 50 wireless sensor nodes to form a dynamic WSN; 40 nodes are fixed and 10 are installed on mobile platforms (e.g., sanitation vehicles) to simulate mobility of more than 30% of the nodes.
- (1)
Monitoring parameters: Temperature (°C), relative humidity (%), atmospheric pressure (hPa), and PM2.5 concentration (μg/m³) are collected simultaneously at each node. The core prediction target of this study is the PM2.5 concentration in the next hour; the temperature data serve as an auxiliary variable for analyzing extreme weather events such as heat waves.
- (2)
Sample size and collection cycle
Data were collected every 5 min. The experimental period ran from 1 June 2023 to 30 November 2023, a total of 6 months (183 days). After the initial cleanup, about 52,560 valid and complete time-point records were obtained.
Data division: The dataset is divided chronologically, with the first four months (about 70%) for training, the following month (about 15%) for validation (BKA optimization), and the last month (about 15%) for testing.
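A chronological split of this kind can be sketched as follows. The text defines the split by calendar months; the fractional cut points (70/15/15) used here are an approximation of that scheme, and the function name is illustrative.

```python
def chronological_split(records, train_frac=0.70, val_frac=0.15):
    """Split time-ordered records into train/validation/test without shuffling."""
    n = len(records)
    train_end = round(n * train_frac)
    val_end = train_end + round(n * val_frac)
    return records[:train_end], records[train_end:val_end], records[val_end:]


# Stand-in for the time-ordered dataset: one index per valid record.
records = list(range(52560))
train, val, test = chronological_split(records)

print(len(train), len(val), len(test))
# Every training record precedes every validation record, and so on.
print(train[-1] < val[0] < val[-1] < test[0])
```

Keeping the order intact ensures that validation and test data always lie in the future relative to the training data, which avoids leaking future information into training.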
- (3)
Preprocessing procedure
Noise processing: The raw readings are smoothed with a sliding-window median filter (window size = 5) to suppress transient impulse noise.
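A sliding-window median filter with window size 5 can be sketched as below. The handling of the series boundaries (shrinking the window at the edges) is an assumption, since the text does not specify it.

```python
from statistics import median


def median_filter(values, window=5):
    """Sliding-window median; the window shrinks near the edges (assumed behaviour)."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(median(values[lo:hi]))
    return smoothed


# A single spurious spike in an otherwise flat PM2.5 trace (illustrative values).
raw = [10, 10, 10, 100, 10, 10, 10]
print(median_filter(raw))
```

Because the median ignores isolated extreme values inside the window, the spike is removed while the surrounding readings are left unchanged.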
Missing value handling: For data gaps caused by node relocation or communication failure, we use a spatiotemporal K-nearest-neighbor interpolation method. For each missing value, this method combines the readings of the three spatially nearest nodes at the same time point with the node's own two temporally nearest readings in a weighted interpolation. Data segments with more than one consecutive hour of missing values are flagged and excluded from the training dataset.
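One possible form of this interpolation is sketched below. The text does not give the weighting scheme, so the inverse-distance weights, the inverse-time-gap weights, and the equal blend of the spatial and temporal estimates (`alpha = 0.5`) are all assumptions made for illustration, as are the function names.

```python
def inverse_distance_mean(pairs, eps=1e-9):
    """Weighted mean of (value, distance) pairs with weights 1 / distance."""
    weights = [1.0 / (dist + eps) for _, dist in pairs]
    total = sum(weights)
    return sum(w * value for w, (value, _) in zip(weights, pairs)) / total


def st_knn_impute(spatial, temporal, k_space=3, k_time=2, alpha=0.5):
    """Estimate a missing reading from spatial and temporal neighbours.

    spatial:  (value, distance_m) readings of other nodes at the same time point
    temporal: (value, time_gap_s) valid readings of the same node at other times
    alpha:    assumed blend between the spatial and temporal estimates
    """
    nearest_space = sorted(spatial, key=lambda p: p[1])[:k_space]
    nearest_time = sorted(temporal, key=lambda p: p[1])[:k_time]
    spatial_est = inverse_distance_mean(nearest_space)
    temporal_est = inverse_distance_mean(nearest_time)
    return alpha * spatial_est + (1.0 - alpha) * temporal_est


# Illustrative values: three nearby nodes read 30, the node itself read 20
# five minutes before and after the gap; distant neighbours are ignored.
estimate = st_knn_impute(
    spatial=[(30.0, 100.0), (30.0, 200.0), (30.0, 300.0), (99.0, 5000.0)],
    temporal=[(20.0, 300.0), (20.0, 300.0), (99.0, 6000.0)],
)
print(f"{estimate:.2f}")
```

The sketch assumes both neighbour lists are non-empty; a real pipeline would also need the rule for skipping gaps longer than one hour.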
Data normalization: All numerical features are normalized using Z-score standardization, which centers the data at zero mean and scales it to unit standard deviation. The transformation is defined as:
$$z = \frac{x - \mu}{\sigma}$$
where x is the raw feature value, z is its standardized value, and μ and σ are the mean and standard deviation of the feature, respectively.
Note: While Z-score normalization is effective for approximately Gaussian-distributed data, it is sensitive to outliers because both the mean and standard deviation can be heavily influenced by extreme values. For datasets with significant outliers, robust alternatives (e.g., scaling based on median and interquartile range) are recommended.
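The difference between Z-score standardization and a median/IQR-based alternative can be sketched as follows. The use of the population standard deviation and of the default quartile method of `statistics.quantiles` are assumptions; the sample values are illustrative.

```python
from statistics import mean, median, pstdev, quantiles


def z_score(values):
    """Standardize to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # a constant feature (sigma == 0) must be handled separately
    return [(x - mu) / sigma for x in values]


def robust_scale(values):
    """Center on the median and scale by the interquartile range."""
    med = median(values)
    q1, _, q3 = quantiles(values, n=4)
    return [(x - med) / (q3 - q1) for x in values]


# Illustrative PM2.5 readings with one extreme outlier.
readings = [10.0, 11.0, 12.0, 13.0, 500.0]
print([round(z, 2) for z in z_score(readings)])
print([round(r, 2) for r in robust_scale(readings)])
```

With the outlier present, the Z-scores of the four ordinary readings collapse to nearly the same value, because the mean and standard deviation are dominated by the extreme point. The median and interquartile range are less affected by it.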
Time discretization: Timestamps are converted into two periodic features, “time of day” and “day of year”, each of which is encoded with a sine and cosine pair to help the model capture circadian rhythms and seasonal cycles.
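A sine/cosine encoding of the two periodic features can be sketched as below. The feature names and the 365.25-day year length are assumptions made for illustration.

```python
import math
from datetime import datetime


def cyclic_encode(value, period):
    """Map a periodic quantity onto the unit circle as a (sin, cos) pair."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def time_features(ts: datetime) -> dict:
    """Encode time of day and day of year as sine/cosine pairs."""
    seconds_of_day = ts.hour * 3600 + ts.minute * 60 + ts.second
    tod_sin, tod_cos = cyclic_encode(seconds_of_day, 86400)
    day_index = ts.timetuple().tm_yday - 1  # 0 on 1 January
    doy_sin, doy_cos = cyclic_encode(day_index, 365.25)  # assumed year length
    return {
        "tod_sin": tod_sin, "tod_cos": tod_cos,
        "doy_sin": doy_sin, "doy_cos": doy_cos,
    }


print(time_features(datetime(2023, 6, 1, 6, 0, 0)))
```

Encoding each feature as a point on the unit circle keeps 23:55 and 00:00 close together, which a raw hour or day number would not.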
- (4)
Results and quantitative analysis
- A.
Comparative results of PM2.5 concentration prediction
For the prediction accuracy comparison, we selected ARIMA, support vector regression (SVR), standard LSTM, and CNN-LSTM (unoptimized) as baseline models. The comparison results of PM2.5 concentration prediction are shown in Table 7.
Performance improvement: Our model achieves the best performance. Compared with the strongest baseline model, CNN-LSTM, RMSE decreased by 11.4%, from 3.41 to 3.02. The improvement over ARIMA reached 29.9%, consistent with the “19.3–32.7%” range mentioned in the abstract (based on comparisons between different baseline models).
Extreme event detection: In a binary classification task to determine whether PM2.5 concentrations exceed the severe pollution threshold (150 μg/m³), our model achieved 89.4% accuracy, with 87.1% precision and 85.6% recall.
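The three classification metrics can be derived from predicted and observed concentrations as sketched below. Thresholding the predicted concentration at the same 150 μg/m³ level is an assumption about how the binary label is obtained, and the sample values are illustrative rather than experimental data.

```python
def confusion_counts(observed, predicted, threshold=150.0):
    """Count TP, FP, FN, TN for exceedance of a concentration threshold."""
    tp = fp = fn = tn = 0
    for obs, pred in zip(observed, predicted):
        actual = obs > threshold
        flagged = pred > threshold
        if actual and flagged:
            tp += 1
        elif flagged:
            fp += 1
        elif actual:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# Illustrative concentrations in ug/m3 (not experimental data).
observed = [160.0, 140.0, 155.0, 100.0]
predicted = [158.0, 151.0, 149.0, 90.0]
counts = confusion_counts(observed, predicted)
accuracy, precision, recall = classification_metrics(*counts)
print(counts, accuracy, precision, recall)
```

Precision is the share of predicted exceedances that were real, while recall is the share of real exceedances that were detected.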
- B.
Quantitative analysis
Ablation study and benchmarking
- (1)
Experimental setting and evaluation criteria
To comprehensively evaluate the performance of the proposed BKA-CNN-BiLSTM model in predicting dynamic wireless sensor network parameters, we conducted systematic ablation experiments and benchmark tests. All models were trained and tested under the same hardware and software conditions, using the same training, validation, and testing datasets to ensure fairness and comparability of results.
The benchmark model selection covered traditional machine learning methods, classical deep learning models, and advanced models proposed in recent years, mainly including the following.
ARIMA model: A representative statistical model for traditional time-series prediction.
LSTM model: A classical recurrent neural network for time-series processing.
BiLSTM model: A bidirectional long short-term memory network that captures temporal dependencies in both directions.
CNN-BiLSTM model: A hybrid model that combines convolutional neural networks and bidirectional LSTMs.
Pure transformer model: A sequence model based on the self-attention mechanism.
ST-autoencoder model: A spatiotemporal autoencoder model.
The ablation experiments aim to validate the contribution of each component in the BKA-CNN-BiLSTM model, using the variants described below.
Model Variants and Evaluation Settings
We evaluated the following model variants to assess the contribution of each component in the BKA-CNN-BiLSTM architecture:
CNN-BiLSTM: A baseline model without the BKA module, trained using default hyperparameters.
BKA-BiLSTM: The CNN-based spatial feature extraction module is removed, retaining only the BKA and BiLSTM components.
BKA-CNN: The BiLSTM layer is replaced with a traditional fully connected (FC) layer, thereby eliminating explicit temporal modeling capability.
The full BKA-CNN-BiLSTM model integrates all three components: BKA, CNN for spatial feature extraction, and BiLSTM for temporal dynamics modeling.
Evaluation metrics include RMSE (Root Mean Square Error), MAE (Mean Absolute Error), and R² (Coefficient of Determination), which are standard metrics widely used in regression tasks. All experiments were conducted under identical data partitioning and preprocessing conditions. To mitigate the impact of randomness, the reported results are averaged over five independent runs.
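For reference, the three metrics follow their standard definitions, sketched below in plain Python. The sample values are illustrative and are not results from the experiments.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative values: every prediction is off by exactly one unit.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [2.0, 3.0, 4.0, 5.0]
print(rmse(y_true, y_pred), mae(y_true, y_pred), round(r2(y_true, y_pred), 3))
```

RMSE penalizes large errors more strongly than MAE, and R² measures the share of variance in the observations that the predictions explain.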
- (2)
Analysis of ablation experimental results
Table 8 shows the performance comparison results of the proposed BKA-CNN-BiLSTM model and its ablation variants on the test set.
The analysis of the results in Table 8 allows us to draw the following important conclusions.
The BKA optimizer delivers a significant performance improvement. A comparison between BKA-CNN-BiLSTM and CNN-BiLSTM shows that the BKA optimizer reduces RMSE by approximately 19.2% (from 1.04 to 0.84). This indicates that the BKA improves the quality of the feature representation and the robustness to noisy data by optimizing the key hyperparameters of the CNN module, including the learning rate, hidden layer size, and regularization coefficient.
Contribution of CNN modules: Comparing BKA-CNN-BiLSTM and BKA-BiLSTM shows that the RMSE increases by about 40.5% (from 0.84 to 1.18) after removing the CNN module. The effectiveness of CNN in extracting local spatial features from multi-node sensor data is confirmed, as the lack of spatial feature extraction significantly affects model performance.
Contribution of the BiLSTM module: The BKA-CNN model without BiLSTM performed the worst, confirming the irreplaceability of BiLSTM in modeling bidirectional long-term dependencies. Bidirectional temporal modeling is essential for capturing changes in environmental parameters, such as circadian rhythms and seasonal trends.
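The percentages quoted for the RMSE comparisons are relative changes with respect to the first model in each pair. A minimal sketch of that arithmetic, using only the RMSE values stated in the text, is given below; the helper name is illustrative.

```python
def relative_change_pct(baseline: float, new: float) -> float:
    """Relative change of `new` with respect to `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


# CNN-BiLSTM (1.04) -> BKA-CNN-BiLSTM (0.84): effect of adding the BKA optimizer.
bka_effect = relative_change_pct(1.04, 0.84)
# BKA-CNN-BiLSTM (0.84) -> BKA-BiLSTM (1.18): effect of removing the CNN module.
cnn_removal = relative_change_pct(0.84, 1.18)
# CNN-LSTM (3.41) -> proposed model (3.02): PM2.5 comparison reported earlier.
pm25_gain = relative_change_pct(3.41, 3.02)

print(f"{bka_effect:.1f}%  {cnn_removal:+.1f}%  {pm25_gain:.1f}%")
```

Note that the two ablation percentages use different baselines (1.04 and 0.84, respectively), so they are not directly additive.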
- (3)
Benchmarking results analysis
Table 9 systematically compares this model with multiple benchmark models.
Analysis of the benchmark results shows the following.
- (1)
Benchmark comparison with the traditional model: The RMSE of the proposed BKA-CNN-BiLSTM model is about 49.1% lower than that of the traditional ARIMA model, a significant advantage. Deep learning models outperform traditional statistical models in capturing the complex nonlinear spatiotemporal patterns involved in environmental parameter estimation.
- (2)
Benchmark comparison with classical deep learning models: Compared with the LSTM and BiLSTM models, our model shows an RMSE reduction of 35.4% and 32.8%, respectively. This improvement is primarily due to the spatial feature extraction capability of the CNN component, which overcomes the limitations of a standalone LSTM model when processing time series.
- (3)
Benchmarking against advanced hybrid models: Our approach reduces RMSE by 19.2% compared to CNN-BiLSTM baselines, demonstrating that BKA-based hyperparameter optimization effectively enhances spatial feature representation and overall prediction accuracy.
- (4)
Model efficiency evaluation: Although the BKA-CNN-BiLSTM model requires a longer training time because of the optimization process, a single prediction in inference mode takes less than 10 milliseconds under standard GPU conditions, which meets the real-time prediction requirements of dynamic WSN environmental parameters.
- (5)
Discussion
Based on the combined results of ablation experiments and benchmarks, the following conclusions can be drawn.
The proposed BKA-CNN-BiLSTM model achieves the best performance in environmental parameter prediction through the synergistic interaction of its components. Specifically, the CNN module effectively extracts spatial features from multi-node sensor data, while the BiLSTM module precisely models the bidirectional time dependencies of environmental parameters. The BKA optimizer enhances the robustness of feature representation by intelligently optimizing CNN hyperparameters. This hierarchical architecture—including spatial feature extraction, hyperparameter optimization, and temporal dynamic modeling—effectively addresses the spatiotemporal complexity challenge in dynamic WSN environmental parameter prediction.
Compared with existing research, the key innovation of this study is the integration of the emerging BKA optimization algorithm into the CNN-BiLSTM hybrid model framework. Experimental validation demonstrates the effectiveness of this combined method in the task of predicting environmental parameters. The results show that in a resource-constrained WSN environment, the BKA-optimized hybrid model significantly improves prediction accuracy and provides more reliable technical support for ecological monitoring and disaster early warning.