1. Introduction
Weather prediction using recurrent neural networks such as Long Short-Term Memory (LSTM) has proven effective in learning temporal dependencies [1]; however, its implementation on systems such as Microcontroller Units (MCUs) presents challenges, including the model size and the stability of performance over longer prediction windows [2]. Although solutions such as TPA-LSTM [3] or the Legendre Memory Unit (LMU) [4] improve accuracy, their complexity makes them unfeasible for these devices. In contrast, combining AI and the IoT can address this problem [5,6]. Other architectures, such as CNN-LSTM [7] and TCN [8], can be implemented on MCUs and are considered in this study for comparative purposes. This research prioritizes a completely local solution, performing inferences on an MCU. This provides advantages such as privacy, by avoiding data transfer to the cloud; minimal latency, due to on-premises execution; computational resource efficiency; and accessibility in environments with limited connectivity, as demonstrated in comparative studies of distributed architectures [9] and in model optimization solutions for constrained hardware [10]. This approach enables more autonomous and efficient solutions in environments with low connectivity or without a robust infrastructure. By performing processing locally, the dependence on third parties is minimized, and the low energy consumption of MCUs is leveraged.
In this article, we present an LSTM model with embedding layers, a technique that, as demonstrated in [11], allows the size of the neural network to be reduced without sacrificing performance. This technique is similar to how embeddings are used in natural language processing (NLP) to represent semantic relationships between words [12]. For this application, the embeddings are learned as vectors during the training phase, capturing the relationships between seasons based on the temperature and humidity series. In addition, to explicitly capture seasonality, each sample was labeled with its season, grouping the months into quarters according to the meteorological convention and taking into account the hemisphere where each meteorological station is located. In the southern hemisphere, the distribution is as follows: December–February (summer), March–May (autumn), June–August (winter), September–November (spring); in the northern hemisphere, the grouping is reversed: December–February (winter), March–May (spring), June–August (summer), September–November (autumn) [13]. This category was encoded using embeddings, allowing the neural network to learn latent representations of seasonality and its influence on climatic variables such as temperature and humidity. As in other sequential tasks, such as time series modeling or natural language processing, the categorical variables are represented by embedding vectors that capture relationships and patterns over time. Once training is complete, these vectors are integrated as input to the LSTM. Thus, the network reduces the number of parameters and the final model size while maintaining the quality and accuracy of a larger architecture.
We performed three main comparisons, all evaluating the MAE, MSE, RMSE, MAPE, RSE, and R2 metrics over the time windows t + 1, t + 3, and t + 6. All of the results shown in the tables are normalized to facilitate comparison between models. The first comparison measured how much the model size can be reduced without losing performance relative to other networks, namely LMU, TPA-LSTM, CNN-LSTM, TCN, and the standard LSTM. The second comparison focused on the implementation of these models on various MCU boards, also considering the inference time and power consumption, which are important aspects in real-world applications [14]. For the compatibility reasons explained in the results and implementation sections, TPA-LSTM and LMU were not included. Finally, the third comparison validated the effect of using embeddings in the LSTM, contrasting the base model with the proposed SE-LSTM model that integrates these vectors. This comparison was conducted in 13 cities at different latitudes.
The model maintains consistency in predictions when run on different hardware platforms, validating its portability. All resources (dataset, code, and trained models) are available as indicated in the Data Availability Statement to facilitate their adoption in practical applications.
2. Materials and Methods
This study used temperature and humidity data obtained from NASA’s POWER [15], at latitude −16.4333 and longitude −71.5617, covering the period from January 2020 to March 2025 with a sampling frequency of one hour, yielding a total of 45,864 data points. These data points were distributed as follows: 80% for training, 15% for validation, and 5% for testing. Temperature and humidity were selected as the main variables due to their high availability and temporal continuity in different geographic locations, allowing for a comparison of the model performance in different contexts. Furthermore, missing values marked as −999 are very rare in these variables, which facilitated their processing and maintained temporal continuity without affecting the model. Other meteorological variables, such as solar radiation and wind speed, were excluded due to their lower availability and continuity in the datasets used. Each sample contains the date and time, as well as the temperature and humidity values, which were normalized between 0 and 1. Discordant data (values of −999), caused by missing measurements on certain dates as indicated by NASA POWER, were eliminated. In cases where the value −999 was not repeated more than twice consecutively, the missing data were replaced by the average of the immediately adjacent values in order to preserve the continuity of the time series without introducing distortions.
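A simplified preprocessing sketch of these steps is shown below. It assumes a pandas DataFrame with the hypothetical column names T2M (temperature) and RH2M (relative humidity) from the POWER hourly export, and the interpolation call is only an approximation of the gap-filling rule described above.
```python
import numpy as np
import pandas as pd

df = pd.read_csv("power_hourly_2020_2025.csv")      # hypothetical file name

for col in ["T2M", "RH2M"]:
    s = df[col].replace(-999, np.nan)
    # Short gaps (up to two consecutive missing values) are filled by
    # interpolating between the immediately adjacent readings.
    df[col] = s.interpolate(limit=2, limit_area="inside")

df = df.dropna(subset=["T2M", "RH2M"])               # discard longer gaps

# Min-max normalization to the [0, 1] range used throughout the paper.
for col in ["T2M", "RH2M"]:
    df[col] = (df[col] - df[col].min()) / (df[col].max() - df[col].min())

# Chronological 80/15/5 split into training, validation, and test sets.
n = len(df)
train = df[: int(0.80 * n)]
val = df[int(0.80 * n): int(0.95 * n)]
test = df[int(0.95 * n):]
```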
Because both temperature and humidity are related to the climatic seasons, the seasons were coded as an integer (0–3) and incorporated using a four-dimensional embedding layer. The coding process groups the months into four seasons according to the meteorological convention: December–February (0), March–May (1), June–August (2), September–November (3). This layer initializes its vectors randomly and, during training, adjusts them by backpropagation of the error, along with the rest of the network weights, to minimize the loss function. In this way, the model automatically learns to locate seasons with similar weather patterns close to each other in the vector space, improving the LSTM’s ability to capture temporal and seasonal relationships. The optimal number of dimensions was determined by modifying the parameters of the embedding layer and evaluating the model’s overall performance.
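A minimal Keras sketch of this encoding and of the trainable four-dimensional embedding is given below; the month-to-index rule follows the grouping described above (the hemisphere only changes the season name, not the index), and the layer name is illustrative.
```python
import tensorflow as tf

def season_index(month: int) -> int:
    """Dec-Feb -> 0, Mar-May -> 1, Jun-Aug -> 2, Sep-Nov -> 3."""
    return (month % 12) // 3

season_embedding = tf.keras.layers.Embedding(
    input_dim=4,    # four seasons, encoded as integers 0-3
    output_dim=4,   # four-dimensional vectors, tuned as described in the text
    name="season_embedding",
)
# The layer starts from random vectors; backpropagation adjusts them together
# with the rest of the network weights to minimize the loss function.
```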
As illustrated in Figure 1, the relationships between the seasons are represented by vectors projected onto 2D and 3D planes.
The arrangement of the points on a semicircle illustrates how the embedding vectors captured seasonal variations in the climate variables. Using PCA dimensionality reduction, these four-dimensional vectors allow the connections between seasons to be visualized. Opposite seasons, such as summer and winter, tend to appear in opposite areas of the plane, as the embeddings have learned to represent their contrasting weather patterns (such as high temperatures and low humidity under opposite circumstances). Spring and autumn, which correspond to periods of greater climatic stability, appear in the transitional areas. This illustration shows how the model was able to gather seasons with similar climatic behaviors close to each other in the vector space.
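A short sketch of the PCA projection behind Figure 1 is given below; the embedding matrix here is a random placeholder, whereas in practice it would be taken from the trained layer (e.g., via get_weights()), and the season order is illustrative.
```python
import numpy as np
from sklearn.decomposition import PCA

# One row per season; in practice:
# embedding_matrix = model.get_layer("season_embedding").get_weights()[0]
embedding_matrix = np.random.randn(4, 4)   # placeholder for illustration

coords_2d = PCA(n_components=2).fit_transform(embedding_matrix)  # Figure 1a
coords_3d = PCA(n_components=3).fit_transform(embedding_matrix)  # Figure 1b

for name, (x, y) in zip(["summer", "autumn", "winter", "spring"], coords_2d):
    print(f"{name}: ({x:.3f}, {y:.3f})")
```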
The architecture is structured as an LSTM network organized into three levels. The initial layer has 64 units and uses return_sequences to pass the temporal information for the entire sequence to the next layer. The second layer has 32 units and also uses return_sequences, and the third LSTM layer is composed of 8 units. All of the layers use the ReLU activation function, which introduces nonlinearity. The output layer is a dense layer with 6 units and a linear activation for the 3 temperature and 3 humidity predictions, corresponding to the time windows t + 1 h, t + 3 h, and t + 6 h. Thus, the input to the model consists of 24 samples of both temperature and humidity, concatenated with the 4-dimensional embedding vector learned during training.
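A minimal Keras sketch of this architecture is shown below. The way the 4-D season vector is repeated across the 24 time steps before concatenation is our assumption, since the exact wiring is not detailed in the text.
```python
import tensorflow as tf
from tensorflow.keras import layers

series_in = layers.Input(shape=(24, 2), name="temp_hum")        # 24 h of T and RH
season_in = layers.Input(shape=(1,), dtype="int32", name="season")

season_vec = layers.Embedding(input_dim=4, output_dim=4,
                              name="season_embedding")(season_in)  # (batch, 1, 4)
season_vec = layers.Flatten()(season_vec)                          # (batch, 4)
season_seq = layers.RepeatVector(24)(season_vec)                   # (batch, 24, 4)

x = layers.Concatenate(axis=-1)([series_in, season_seq])           # (batch, 24, 6)
x = layers.LSTM(64, activation="relu", return_sequences=True)(x)
x = layers.LSTM(32, activation="relu", return_sequences=True)(x)
x = layers.LSTM(8, activation="relu")(x)
out = layers.Dense(6, activation="linear")(x)   # T and RH at t+1, t+3, t+6

model = tf.keras.Model([series_in, season_in], out)
```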
The following flowchart, shown in Figure 2, illustrates how the season embeddings are integrated with the LSTM network.
The integration of embeddings with the LSTM network allows the model to learn the context of the seasons and the relationship between temperature and humidity over the following hours.
The model was trained for 115 epochs using the Adam optimizer, with a learning rate of 0.001, a batch size of 64, and the MSE as the loss function. The hyperparameters were defined progressively, taking into account the behavior of the performance metrics on the test set, with the goal of achieving good performance.
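Continuing the sketch above, the reported training configuration could be expressed as follows; the arrays are random placeholders standing in for the real windowed dataset and are included only to make the call signature concrete.
```python
import numpy as np

x_series = np.random.rand(1000, 24, 2).astype("float32")   # placeholder windows
x_season = np.random.randint(0, 4, size=(1000, 1))          # placeholder season indices
y = np.random.rand(1000, 6).astype("float32")               # placeholder targets

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3), loss="mse")
model.fit([x_series, x_season], y,
          validation_split=0.15,      # chronological split used in the paper
          epochs=115, batch_size=64)
```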
Once the training is complete, the model is able to predict future temperature and humidity levels in the aforementioned time windows. At the same time, the learned vectors are also obtained; in each training run, these can change in magnitude, since they may give more weight to a certain variable during a certain season.
3. Results
This section is divided into four main parts. The design stage compares the performance of the proposed model, SE-LSTM, against other neural networks in the aforementioned city, namely CNN-LSTM, TCN, TPA-LSTM, LMU, and the standard LSTM. The second part examines the learning dynamics of the embeddings. The third part compares the performance on various MCUs, and the fourth part validates the model in various cities around the world other than the city used in the first part.
3.1. Comparison at the Design Stage
The models compared were CNN-LSTM, TCN, LSTM, TPA-LSTM, LMU, and the proposed SE-LSTM model. The performance metrics used for the comparison were the MAE, MSE, RMSE, MAPE, RSE, and R2, together with the trained network weight. They were evaluated over three time horizons: t + 1 h, t + 3 h, and t + 6 h, where “t” is the time after the collection of the previous 24 samples. An analysis of the variation in the metrics in each window and a summary of the overall performance are presented. The images accompanying the tables correspond to the model’s predictions on the test data. All of the metrics were calculated on the test set and applied to all of the networks analyzed.
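The following sketch shows how these metrics can be computed on the normalized test set; here RSE is taken as the relative squared error, which is consistent with RSE ≈ 1 − R2 in the tables, and the function name is illustrative.
```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Metrics on normalized (0-1) values; y_true is assumed to be nonzero for MAPE."""
    err = y_true - y_pred
    mse = np.mean(err ** 2)
    mae = np.mean(np.abs(err))
    rmse = np.sqrt(mse)
    mape = 100 * np.mean(np.abs(err / y_true))                       # in percent
    rse = np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2)   # relative squared error
    r2 = 1 - rse
    return {"MSE": mse, "MAE": mae, "RMSE": rmse,
            "MAPE": mape, "RSE": rse, "R2": r2}
```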
3.1.1. First Window (t + 1)
In the first time window, the TPA-LSTM model shows the lowest errors (MSE, MAE, RMSE) and a very high R2, indicating its accuracy in nowcasting.
The SE-LSTM model also performs well, with competitive metrics and an R2 for temperature of 0.9781, demonstrating that the use of embeddings based on meteorological representations helps capture short-term patterns. The SE-LSTM predictions for temperature and humidity are illustrated in Figure 3 and Figure 4, respectively, while the performance metrics are summarized in Table 1.
3.1.2. Second Window (t + 3)
In this window, the errors tend to increase for all of the models. SE-LSTM remains consistent, while the other models show greater performance degradation. This demonstrates that the use of embeddings in SE-LSTM helps maintain stability in medium-term predictions, particularly for the temperature variable. The SE-LSTM predictions for temperature and humidity are illustrated in Figure 5 and Figure 6, respectively, while the performance metrics are summarized in Table 2.
3.1.3. Third Window (t + 6)
As the prediction window grows, models like TPA-LSTM and TCN offer a good balance between accuracy and generalization, while LSTM and CNN-LSTM show a more noticeable performance drop. Overall, SE-LSTM establishes itself as the model most resilient to degradation in environments with high temporal uncertainty. The SE-LSTM predictions for temperature and humidity are illustrated in Figure 7 and Figure 8, respectively, while the performance metrics are summarized in Table 3.
As seen in Table 1, Table 2, Table 3 and Table 4, the use of embeddings in SE-LSTM is aimed at maintaining an optimal balance between accuracy and efficiency, unlike TPA-LSTM and LSTM, which seek to maximize performance at the expense of greater complexity (more neurons or layers), which also explains their larger weights. Overall, for this specific application, SE-LSTM slightly outperforms LSTM at a lower weight and LMU at a similar weight, proving to be more suitable for MCU applications. Models such as TCN and CNN-LSTM show a competitive performance but with an intermediate weight.
3.2. Learning Dynamics of the Embeddings
In the methodology section, it was explained that each season has an embedding vector that is adjusted during training. To verify that these vectors learn useful representations, we visualize how they change over the epochs. As seen in Figure 9, they take shape from epoch 0 to 80. This allows us to observe how the model organizes its internal representation space as it trains.
3.3. Comparison in MCUs
The LMU and TPA-LSTM networks are not available in TFLite due to their use of advanced functions; therefore, in this section, we compare the performance of the CNN-LSTM, LSTM, TCN, and SE-LSTM networks. The embedding operation is not available in TFLite either, but we can emulate it by adding the learned vectors (generated by learning the relationships between temperature and humidity during training) as an additional input. These vectors vary in magnitude between training runs; for example, in a given training session, the values shown in Table 5 were obtained.
The implementation is then as follows: 6 features per time step, composed of the temperature, the humidity, and the season (a 4-dimensional vector), over 24 steps, i.e., a tensor of shape (None, 24, 6), which is finally flattened into an input of shape (None, 144).
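A sketch of this input construction is shown below, using the example season vectors from Table 5; the function name is illustrative.
```python
import numpy as np

SEASON_VECTORS = {            # example values from Table 5 (one training run)
    "winter": [-0.07277188, 0.01109327, -0.09172057, -0.08308954],
    "spring": [-0.05201762, -0.11797583, 0.06664423, 0.03551533],
    "summer": [0.08841707, -0.01442201, 0.06266572, -0.04614694],
    "autumn": [-0.04601293, 0.10943006, 0.0083359, 0.05554458],
}

def build_input(temp_24h, hum_24h, season):
    """Concatenate the season vector to each of the 24 (T, RH) samples and flatten."""
    vec = SEASON_VECTORS[season]
    steps = [[t, h, *vec] for t, h in zip(temp_24h, hum_24h)]   # shape (24, 6)
    return np.asarray(steps, dtype=np.float32).reshape(-1)      # shape (144,)
```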
The architecture was also modified to a single LSTM layer with the ReLU activation function, followed by two dense layers with 128 and 36 neurons, respectively, both with ReLU activation, and an output layer with 6 neurons and a linear activation function for the prediction horizons of each variable. The batch size was 32, with 80 epochs, and the remaining training settings were kept the same. This configuration was applied to both the SE-LSTM and LSTM models. As can be seen in Table 6, using embeddings improves performance while the network weight is kept the same.
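A sketch of this MCU-oriented variant is given below; the number of LSTM units is not stated in the text, so 64 is an assumption, and reshaping the flat 144-value input back to (24, 6) before the LSTM is our interpretation.
```python
import tensorflow as tf
from tensorflow.keras import layers

mcu_model = tf.keras.Sequential([
    layers.Input(shape=(144,)),            # flattened (24 steps x 6 features) input
    layers.Reshape((24, 6)),
    layers.LSTM(64, activation="relu"),    # single LSTM layer; unit count assumed
    layers.Dense(128, activation="relu"),
    layers.Dense(36, activation="relu"),
    layers.Dense(6, activation="linear"),  # t+1, t+3, t+6 for temperature and humidity
])
mcu_model.compile(optimizer="adam", loss="mse")   # batch size 32, 80 epochs in fit()

# Standard TFLite conversion step prior to deployment with Edge Impulse.
converter = tf.lite.TFLiteConverter.from_keras_model(mcu_model)
tflite_bytes = converter.convert()
```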
3.3.1. Edge Impulse
To upload the model to a board, several environments can be used; we chose the Edge Impulse platform [16] due to the ease of including the necessary libraries in the same downloadable .zip. TensorFlow Lite was used for the compilation, and as shown in Table 7, the inference time is around 55 ms on an ESP32-S3, 671 ms on a Raspberry Pi Pico, and 74 ms on an ESP32. The inference time was measured with the same input data, a vector of size 144. To estimate the electrical charge consumption per inference (µAh), a Keweisi KWS-V20 USB tester was used, and the inferences were made continuously, without delay, for 20 min. The total accumulated charge is divided by the number of inferences made in that period, which is calculated from the average inference time per device.
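An illustrative reconstruction of this calculation for the ESP32-S3 figures in Table 7 is shown below; the accumulated charge reading is hypothetical and chosen only to make the arithmetic concrete.
```python
test_duration_s = 20 * 60            # 20 min of continuous inference
inference_time_s = 0.055             # average inference time for SE-LSTM on ESP32-S3
accumulated_charge_uAh = 82_700      # hypothetical KWS-V20 reading over the test

inferences = test_duration_s / inference_time_s              # ~21,818 inferences
charge_per_inference = accumulated_charge_uAh / inferences    # ~3.79 µAh (cf. Table 7)
print(f"{inferences:.0f} inferences, {charge_per_inference:.2f} µAh per inference")
```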
3.3.2. Implementation
For the physical implementation, there are aspects to consider, such as how to obtain the month or season at the time the inference is made, since it is another input to the network. The connection of the DHT22 sensor to the ESP32 is shown in Figure 10. To obtain this input variable, the ESP32 can retrieve the date via a local Wi-Fi connection, an RTC module, or the millis function, or the embedding value can even be set to a specific month.
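A minimal MicroPython-style sketch of one of these options (NTP over local Wi-Fi) is given below; it assumes the Wi-Fi connection has already been configured and is an illustration, not the firmware used in this study.
```python
import ntptime, time

ntptime.settime()              # sync the RTC over NTP (requires an active Wi-Fi link)
month = time.localtime()[1]    # 1-12
season = (month % 12) // 3     # 0: Dec-Feb, 1: Mar-May, 2: Jun-Aug, 3: Sep-Nov
```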
The network will function exactly the same as in the simulation under the same data; that is, a 144-value input vector will have the same output whether the inference is on a computer or an MCU. For the proper operation, an algorithm is required to sort the data and provide accurate updates at a given time.
As seen in the flowchart in Figure 11, with a default month set, every hour is taken as an update to a circular memory buffer. The number of buffers can be increased to measure in real time every minute or every 10, 15, or 20 min, where each buffer (independent of the other buffers) is an input to the network, but the SRAM usage must be taken into consideration. Each buffer occupies 576 bytes (144 values of 4 bytes, float-type variables), so with six active buffers, the total RAM consumption of the buffers alone would be 3456 bytes. Once a buffer is filled, or every time it is updated with a new value, it produces one output per buffer, so only at the beginning does the system require a 24 h calibration to begin predicting correctly.
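A sketch of this buffer bookkeeping is shown below, in plain Python for clarity; the class and method names are illustrative.
```python
from collections import deque

class InputBuffer:
    """One circular buffer: 24 (T, RH, season-vector) steps = 144 float32 values (576 bytes)."""

    def __init__(self, season_vector):
        self.steps = deque(maxlen=24)          # circular memory of the last 24 samples
        self.season_vector = season_vector     # 4-D embedding for the current season

    def update(self, temperature, humidity):
        self.steps.append([temperature, humidity, *self.season_vector])

    def ready(self):
        return len(self.steps) == 24           # needs 24 h of calibration at start-up

    def as_input(self):
        return [v for step in self.steps for v in step]   # flat 144-value network input
```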
3.4. Robustness Validation and Geographic Generalization
To evaluate the generalization capability of the proposed model, a geographic validation was performed using data from different cities around the world, selected to represent a wide variety of climates and latitudinal locations. In this evaluation, the performance of the SE-LSTM model was compared against a standard LSTM network, using the RMSE and the coefficient of determination R2 for temperature and humidity as the metrics.
The results are summarized in Table 8. In terms of temperature, SE-LSTM generally performs better, with a lower RMSE and a higher R2 in most cities. In terms of humidity, however, the performance is more variable. SE-LSTM achieves lower errors in some cases, while the classic LSTM achieves better R2 values in several cities.
4. Discussion
For each of the state-of-the-art architectures considered, the network parameters were optimized to obtain the best performance in this application, using the same dataset and test data in order to assess how much the overall performance varies between them and the proposed model.
It was observed that the fixed embedding approach limits the adaptability to extreme weather events (e.g., El Niño), so it would be desirable to explore online learning mechanisms that adjust the vectors during the operation.
This network could still be optimized to achieve higher performance, but the constraint is that it must remain able to operate on low-power embedded systems. As described in the implementation, six circular buffers consume approximately 3.4 KB of SRAM, an amount manageable on an MCU.
Although TPA-LSTM demonstrated better performance in the design phase, its exclusion from the embedded evaluation was due to its incompatibility with lightweight deployment frameworks, such as TensorFlow Lite, used in edge computing environments. This technical limitation reinforces the need for optimized architectures such as SE-LSTM, which maintain good predictive performance while meeting the local execution, energy efficiency, and low latency requirements of embedded systems.
Future work could test variables other than temperature and humidity, such as pressure, wind direction, and wind speed. The approach could also be extrapolated to other fields, where one variable per neural network model or multiple variables need to be predicted and where local execution or short inference times are required.
However, the model performs optimally in contexts similar to those of the training; changing to a greenhouse or a very different biome requires retraining or readjusting the embeddings.
5. Conclusions
This research demonstrated that the use of embeddings in LSTM networks can be essential for improving climate projections in systems such as MCUs, reducing the network size without losing performance. This can be vital for fields such as precision agriculture or resource management, where issues such as latency, privacy, or energy efficiency are key components.
Although improvements in accuracy were noted, it was also evident that more complex models, such as TPA-LSTM, can offer even better performance. However, the device on which the inference will be performed, along with the environment being monitored, must also be taken into account. This solution offers an approach for remote scenarios such as rural, agricultural, or low-connectivity areas.
In tests on MCUs, SE-LSTM ran with inference times between 55 ms and 671 ms, and its charge consumption per inference was lower than that of the less accurate TCN and CNN-LSTM networks.
In short, while the study demonstrates progress in climate prediction, it opens the door to new research that could not only improve predictions but also have the potential to be applied in other areas, thanks to the power of edge computing and data processing. This is a powerful tool for a variety of future applications.
Author Contributions
Conceptualization, J.P.P.M.Y.; methodology, J.P.P.M.Y.; software, J.P.P.M.Y., D.J.C.Q., G.A.E.E., M.A.V.S., D.D.Y.A.C. and A.O.S.; validation, J.P.P.M.Y., D.J.C.Q., G.A.E.E., M.A.V.S., D.D.Y.A.C. and A.O.S.; formal analysis, J.P.P.M.Y., D.J.C.Q., G.A.E.E., M.A.V.S., D.D.Y.A.C. and A.O.S.; investigation, J.P.P.M.Y.; resources, J.P.P.M.Y., D.J.C.Q., G.A.E.E., M.A.V.S., D.D.Y.A.C. and A.O.S.; data curation, J.P.P.M.Y.; writing—original draft preparation, J.P.P.M.Y., D.J.C.Q., G.A.E.E., M.A.V.S., D.D.Y.A.C. and A.O.S.; writing—review and editing, J.P.P.M.Y., D.J.C.Q., G.A.E.E., M.A.V.S., D.D.Y.A.C. and A.O.S.; visualization, J.P.P.M.Y., D.J.C.Q., G.A.E.E., M.A.V.S., D.D.Y.A.C. and A.O.S.; supervision, G.A.E.E., M.A.V.S., D.D.Y.A.C. and A.O.S.; project administration, D.J.C.Q., G.A.E.E., M.A.V.S. and D.D.Y.A.C.; funding acquisition, D.J.C.Q., G.A.E.E., D.D.Y.A.C. and A.O.S. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by Universidad Nacional de San Agustín de Arequipa (UNSA) through UNSA INVESTIGA (Contract No. PI-01-2024-UNSA).
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Acknowledgments
We would like to thank the National University of San Agustín de Arequipa.
Conflicts of Interest
The authors declare no conflicts of interest.
Abbreviations
LSTM | Long Short-Term Memory |
MCUs | Microcontroller Units |
RTC | Real-Time Clock |
SE-LSTM | Seasonal Embedding Long Short-Term Memory |
TPA-LSTM | Temporal Pattern Attention Long Short-Term Memory |
LMU | Legendre Memory Unit |
MAE | Mean Absolute Error |
MSE | Mean Squared Error |
RMSE | Root Mean Squared Error |
MAPE | Mean Absolute Percentage Error |
RSE | Residual Sum of Squares Error |
R2 | Coefficient of Determination |
IoT | Internet of Things |
PCA | Principal Component Analysis |
GitHub | Collaborative development platform |
NASA | National Aeronautics and Space Administration |
POWER | NASA database with solar energy and meteorological data |
RAM | Random Access Memory |
µAh | Microampere-hour |
References
- Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef] [PubMed]
- Codeluppi, G.; Davoli, L.; Ferrari, G. Forecasting air temperature on edge devices with embedded AI. Sensors 2021, 21, 3973. [Google Scholar] [CrossRef] [PubMed]
- Guo, Z.; Feng, L. Multi-step prediction of greenhouse temperature and humidity based on temporal position attention LSTM. Stoch. Environ. Res. Risk Assess. 2024, 38, 4907–4934. [Google Scholar] [CrossRef]
- Voelker, A.R.; Kajić, I.; Eliasmith, C. Legendre memory units: Continuous-time representation in recurrent neural networks. Adv. Neural Inf. Process. Syst. 2019, 32, 15544–15553. [Google Scholar]
- Shi, W.; Cao, J.; Zhang, Q.; Li, Y.; Xu, L. Edge computing: Vision and challenges. IEEE IoT J. 2016, 3, 637–646. [Google Scholar] [CrossRef]
- Srivastava, A.; Das, D.K. A comprehensive review on the application of Internet of Things (IoT) in smart agriculture. Wirel. Pers. Commun. 2022, 122, 1807–1837. [Google Scholar] [CrossRef]
- Elmaz, F.; Eyckerman, R.; Casteels, W.; Latré, S.; Hellinckx, P. CNN-LSTM architecture for predictive indoor temperature modeling. Build. Environ. 2021, 203, 108327. [Google Scholar] [CrossRef]
- Hewage, P.; Behera, A.; Trovati, M.; Pereira, E.; Ghahremani, M.; Palmieri, F.; Liu, Y. Temporal convolutional neural (TCN) network for an effective weather forecasting using time-series data from the local weather station. Soft Comput. 2020, 24, 16453–16482. [Google Scholar] [CrossRef]
- Andriulo, F.C.; Fiore, M.; Mongiello, M.; Traversa, E.; Zizzo, V. Edge computing and cloud computing for Internet of Things: A review. Informatics 2024, 11, 71. [Google Scholar] [CrossRef]
- King, T.; Zhou, Y.; Röddiger, T.; Beigl, M. MicroNAS for memory and latency constrained hardware aware neural architecture search in time series classification on microcontrollers. Sci. Rep. 2025, 15, 7575. [Google Scholar] [CrossRef] [PubMed]
- Liu, Z.; Song, Q.; Li, L.; Choi, S.H.; Chen, R.; Hu, X. PME: Pruning-based multi-size embedding for recommender systems. Front. Big. Data. 2023, 6, 1195742. [Google Scholar] [CrossRef] [PubMed]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention Is All You Need. Adv. Neural Inf. Process. Syst. 2017, 30, 5998–6008. [Google Scholar]
- National Oceanic and Atmospheric Administration (NOAA). Meteorological Versus Astronomical Seasons. Available online: https://www.ncei.noaa.gov/news/meteorological-versus-astronomical-seasons. (accessed on 26 May 2025).
- Rzepka, K.; Szary, P.; Cabaj, K.; Mazurczyk, W. Performance evaluation of Raspberry Pi 4 and STM32 Nucleo boards for security-related operations in IoT environments. Comput. Netw. 2024, 242, 110252. [Google Scholar] [CrossRef]
- NASA Langley Research Center. POWER NASA—Prediction Of Worldwide Energy Resources Data Access Viewer. Available online: https://power.larc.nasa.gov/data-access-viewer/. (accessed on 26 May 2025).
- Edge Impulse—Machine Learning for Edge Devices. Available online: https://www.edgeimpulse.com/ (accessed on 26 May 2025).
Figure 1.
A PCA-based visualization of the 4D embeddings. (a) 2D projection. (b) 3D projection.
Figure 2.
A flowchart of the model.
Figure 3.
Temperature prediction in first window.
Figure 4.
Humidity prediction in first window.
Figure 5.
Temperature prediction in second window.
Figure 6.
Humidity prediction in second window.
Figure 7.
Temperature prediction in third window.
Figure 8.
Humidity prediction in third window.
Figure 9.
Evolution of embedding vectors during training.
Figure 10.
Connection of DHT22 sensor to ESP32 GPIO pin 4.
Figure 11.
Operation flow for real application.
Table 1.
First time window (t + 1).
Model | Variable | MSE | MAE | RMSE | MAPE | RSE | R2 |
---|---|---|---|---|---|---|---|
SE-LSTM | Temperature | 0.0005 | 0.0164 | 0.0234 | 3.48 | 0.0219 | 0.9781 |
| Humidity | 0.0022 | 0.0349 | 0.0469 | 5.66 | 0.0551 | 0.9449 |
LSTM | Temperature | 0.0005 | 0.0165 | 0.0228 | 3.48 | 0.0209 | 0.9791 |
| Humidity | 0.0010 | 0.0226 | 0.0318 | 3.52 | 0.0252 | 0.9748 |
LMU | Temperature | 0.0010 | 0.0224 | 0.0309 | 4.73 | 0.0381 | 0.9619 |
| Humidity | 0.0015 | 0.0295 | 0.0384 | 4.54 | 0.0367 | 0.9633 |
TPA-LSTM | Temperature | 0.0002 | 0.0108 | 0.0151 | 2.38 | 0.0091 | 0.9909 |
| Humidity | 0.0004 | 0.0139 | 0.0194 | 2.12 | 0.0094 | 0.9906 |
CNN-LSTM | Temperature | 0.0005 | 0.0152 | 0.0221 | 3.27 | 0.0196 | 0.9804 |
| Humidity | 0.0009 | 0.0216 | 0.0304 | 3.34 | 0.0231 | 0.9769 |
TCN | Temperature | 0.0005 | 0.0159 | 0.0224 | 3.37 | 0.0200 | 0.9800 |
| Humidity | 0.0010 | 0.0224 | 0.0309 | 3.50 | 0.0238 | 0.9762 |
Table 2.
Second time window (t + 3).
Model | Variable | MSE | MAE | RMSE | MAPE | RSE | R2 |
---|---|---|---|---|---|---|---|
SE-LSTM | Temperature | 0.0010 | 0.0224 | 0.0311 | 3.54 | 0.0242 | 0.9758 |
| Humidity | 0.0017 | 0.0301 | 0.0413 | 6.29 | 0.0687 | 0.9313 |
LSTM | Temperature | 0.0012 | 0.0250 | 0.0350 | 5.20 | 0.0492 | 0.9508 |
| Humidity | 0.0024 | 0.0369 | 0.0495 | 5.83 | 0.0613 | 0.9387 |
LMU | Temperature | 0.0015 | 0.0281 | 0.0384 | 5.83 | 0.0593 | 0.9407 |
| Humidity | 0.0026 | 0.0386 | 0.0505 | 6.15 | 0.0639 | 0.9361 |
TPA-LSTM | Temperature | 0.0010 | 0.0228 | 0.0324 | 4.81 | 0.0420 | 0.9580 |
| Humidity | 0.0019 | 0.0315 | 0.0431 | 4.98 | 0.0463 | 0.9537 |
CNN-LSTM | Temperature | 0.0013 | 0.0248 | 0.0355 | 5.20 | 0.0506 | 0.9494 |
| Humidity | 0.0024 | 0.0361 | 0.0494 | 5.68 | 0.0611 | 0.9389 |
TCN | Temperature | 0.0013 | 0.0257 | 0.0356 | 5.32 | 0.0511 | 0.9489 |
| Humidity | 0.0023 | 0.0356 | 0.0478 | 5.62 | 0.0573 | 0.9427 |
Table 3.
Third time window (t + 6).
Model | Variable | MSE | MAE | RMSE | MAPE | RSE | R2 |
---|---|---|---|---|---|---|---|
SE-LSTM | Temperature | 0.0013 | 0.0257 | 0.0358 | 5.28 | 0.0505 | 0.9495 |
| Humidity | 0.0038 | 0.0476 | 0.0614 | 7.46 | 0.0949 | 0.9051 |
LSTM | Temperature | 0.0017 | 0.0301 | 0.0412 | 6.26 | 0.0683 | 0.9317 |
| Humidity | 0.0039 | 0.0478 | 0.0626 | 7.56 | 0.0986 | 0.9014 |
LMU | Temperature | 0.0018 | 0.0309 | 0.0423 | 6.46 | 0.0722 | 0.9278 |
| Humidity | 0.0036 | 0.0467 | 0.0604 | 7.42 | 0.0917 | 0.9083 |
TPA-LSTM | Temperature | 0.0017 | 0.0292 | 0.0410 | 6.18 | 0.0679 | 0.9321 |
| Humidity | 0.0037 | 0.0460 | 0.0609 | 7.13 | 0.0932 | 0.9068 |
CNN-LSTM | Temperature | 0.0018 | 0.0298 | 0.0420 | 6.19 | 0.0711 | 0.9289 |
| Humidity | 0.0039 | 0.0477 | 0.0627 | 7.50 | 0.0987 | 0.9013 |
TCN | Temperature | 0.0017 | 0.0307 | 0.0418 | 6.37 | 0.0705 | 0.9295 |
| Humidity | 0.0038 | 0.0470 | 0.0613 | 7.40 | 0.0944 | 0.9056 |
Table 4.
Overall performance.
Model | Network Weight (KB) | MSE | MAE | RMSE | MAPE | RSE | R2 |
---|---|---|---|---|---|---|---|
SE-LSTM | 426 | 0.0017 | 0.0295 | 0.0399 | 5.29 | 0.0525 | 0.9475 |
LSTM | 1114 | 0.0018 | 0.0298 | 0.0425 | 5.31 | 0.0406 | 0.9461 |
LMU | 522 | 0.0020 | 0.0327 | 0.0445 | 5.86 | 0.0446 | 0.9398 |
TPA-LSTM | 1460 | 0.0015 | 0.0257 | 0.0353 | 4.60 | 0.0447 | 0.9554 |
CNN-LSTM | 856 | 0.0018 | 0.0292 | 0.0404 | 5.19 | 0.0540 | 0.9460 |
TCN | 785 | 0.0018 | 0.0296 | 0.0419 | 5.27 | 0.0394 | 0.9472 |
Table 5.
Vectors obtained after training.
Season | x1 | x2 | x3 | x4 |
---|---|---|---|---|
Winter | −0.07277188 | 0.01109327 | −0.09172057 | −0.08308954 |
Spring | −0.05201762 | −0.11797583 | 0.06664423 | 0.03551533 |
Summer | 0.08841707 | −0.01442201 | 0.06266572 | −0.04614694 |
Autumn | −0.04601293 | 0.10943006 | 0.0083359 | 0.05554458 |
Table 6.
Performance by prediction window.
Window | Model | Variable | MSE | MAE | RMSE | MAPE | RSE | R2 |
---|---|---|---|---|---|---|---|---|
t + 1 | SE-LSTM | Temperature | 0.0005 | 0.0158 | 0.0226 | 3.34 | 0.0205 | 0.9795 |
| | Humidity | 0.0024 | 0.0361 | 0.0487 | 5.69 | 0.0593 | 0.9407 |
| LSTM | Temperature | 0.0006 | 0.0164 | 0.0240 | 3.55 | 0.0230 | 0.9770 |
| | Humidity | 0.0025 | 0.0371 | 0.0501 | 5.85 | 0.0630 | 0.9370 |
| TCN | Temperature | 0.0006 | 0.0171 | 0.0236 | 3.61 | 0.0223 | 0.9777 |
| | Humidity | 0.0011 | 0.0252 | 0.0333 | 3.80 | 0.0277 | 0.9723 |
| CNN-LSTM | Temperature | 0.0005 | 0.0162 | 0.0227 | 3.39 | 0.0205 | 0.9795 |
| | Humidity | 0.0009 | 0.0221 | 0.0308 | 3.43 | 0.0236 | 0.9764 |
t + 3 | SE-LSTM | Temperature | 0.0010 | 0.0221 | 0.0309 | 3.43 | 0.0237 | 0.9763 |
| | Humidity | 0.0017 | 0.0296 | 0.0409 | 6.16 | 0.0675 | 0.9325 |
| LSTM | Temperature | 0.0010 | 0.0227 | 0.0314 | 3.51 | 0.0245 | 0.9755 |
| | Humidity | 0.0018 | 0.0308 | 0.0430 | 6.54 | 0.0744 | 0.9256 |
| TCN | Temperature | 0.0013 | 0.0261 | 0.0365 | 5.36 | 0.0534 | 0.9466 |
| | Humidity | 0.0025 | 0.0374 | 0.0497 | 5.83 | 0.0618 | 0.9382 |
| CNN-LSTM | Temperature | 0.0012 | 0.0253 | 0.0352 | 5.19 | 0.0498 | 0.9502 |
| | Humidity | 0.0024 | 0.0360 | 0.0487 | 5.68 | 0.0595 | 0.9405 |
t + 6 | SE-LSTM | Temperature | 0.0013 | 0.0256 | 0.0356 | 5.31 | 0.0508 | 0.9492 |
| | Humidity | 0.0037 | 0.0470 | 0.0610 | 7.43 | 0.0936 | 0.9064 |
| LSTM | Temperature | 0.0013 | 0.0256 | 0.0360 | 5.41 | 0.0522 | 0.9478 |
| | Humidity | 0.0044 | 0.0511 | 0.0666 | 8.00 | 0.1117 | 0.8883 |
| TCN | Temperature | 0.0018 | 0.0299 | 0.0420 | 6.18 | 0.0711 | 0.9289 |
| | Humidity | 0.0038 | 0.0475 | 0.0619 | 7.37 | 0.0962 | 0.9038 |
| CNN-LSTM | Temperature | 0.0018 | 0.0315 | 0.0427 | 6.50 | 0.0734 | 0.9266 |
| | Humidity | 0.0039 | 0.0480 | 0.0628 | 7.59 | 0.0991 | 0.9009 |
Average general | SE-LSTM | - | 0.0017 | 0.0294 | 0.0399 | 5.23 | 0.0526 | 0.9472 |
| LSTM | - | 0.0019 | 0.0306 | 0.0419 | 5.48 | 0.0581 | 0.9419 |
| TCN | - | 0.0018 | 0.0305 | 0.0429 | 5.36 | 0.0415 | 0.9446 |
| CNN-LSTM | - | 0.0018 | 0.0299 | 0.0405 | 5.30 | 0.0543 | 0.9457 |
Table 7.
Inference time and current cost of various MCUs with SE-LSTM, TCN, CNN-LSTM, and LSTM.
Model/Peak RAM Usage | MCUs | Processor | Inference Time | Current Cost Per Inference |
---|---|---|---|---|
SE-LSTM | Raspberry Pi Pico | RP2040 | 671 ms | 41.90 µAh |
85.8 KB | ESP32 | Xtensa LX6 | 74 ms | 5.12 µAh |
| ESP32-S3 | Xtensa LX7 | 55 ms | 3.79 µAh |
TCN | Raspberry Pi Pico | RP2040 | 1804 ms | 112.7 µAh |
43.6 KB | ESP32 | Xtensa LX6 | 210 ms | 14.47 µAh |
| ESP32-S3 | Xtensa LX7 | 171 ms | 11.87 µAh |
CNN-LSTM | Raspberry Pi Pico | RP2040 | 1638 ms | 102.38 µAh |
30.0 KB | ESP32 | Xtensa LX6 | 202 ms | 13.62 µAh |
| ESP32-S3 | Xtensa LX7 | 152 ms | 10.55 µAh |
LSTM | Raspberry Pi Pico | RP2040 | 644 ms | 40.25 µAh |
85.7 KB | ESP32 | Xtensa LX6 | 73 ms | 5.05 µAh |
| ESP32-S3 | Xtensa LX7 | 54 ms | 3.69 µAh |
Table 8.
A comparison of the performance of the SE-LSTM and LSTM models in the prediction of temperature and humidity in different cities around the world.
City | Model | RMSE Temperature | RMSE Humidity | R2 Temperature | R2 Humidity |
---|---|---|---|---|---|
Longyearbyen | SE-LSTM | 0.0261 | 0.0659 | 0.9510 | 0.6768 |
| LSTM | 0.0298 | 0.0690 | 0.9465 | 0.6687 |
Oslo | SE-LSTM | 0.0220 | 0.0333 | 0.8937 | 0.6325 |
| LSTM | 0.0303 | 0.0270 | 0.8853 | 0.6373 |
Madrid | SE-LSTM | 0.0329 | 0.0701 | 0.9106 | 0.7791 |
| LSTM | 0.0348 | 0.0821 | 0.8138 | 0.8053 |
Beijing | SE-LSTM | 0.0319 | 0.0698 | 0.9380 | 0.8263 |
| LSTM | 0.0285 | 0.0809 | 0.8952 | 0.8387 |
Cairo | SE-LSTM | 0.0295 | 0.0635 | 0.9645 | 0.9132 |
| LSTM | 0.0261 | 0.0717 | 0.9502 | 0.9240 |
Cancún | SE-LSTM | 0.0273 | 0.0507 | 0.9546 | 0.8767 |
| LSTM | 0.0320 | 0.0484 | 0.9271 | 0.9073 |
Manila | SE-LSTM | 0.0202 | 0.0365 | 0.9717 | 0.9269 |
| LSTM | 0.0204 | 0.0367 | 0.9708 | 0.9262 |
San José | SE-LSTM | 0.0249 | 0.0398 | 0.9730 | 0.9356 |
| LSTM | 0.0298 | 0.0391 | 0.9600 | 0.9414 |
Quito | SE-LSTM | 0.0308 | 0.0487 | 0.9666 | 0.9358 |
| LSTM | 0.0359 | 0.0452 | 0.9484 | 0.9551 |
Piura | SE-LSTM | 0.0224 | 0.0379 | 0.9873 | 0.9689 |
| LSTM | 0.0279 | 0.0363 | 0.9795 | 0.9769 |
East London | SE-LSTM | 0.0210 | 0.0443 | 0.9363 | 0.7563 |
| LSTM | 0.0298 | 0.0397 | 0.8666 | 0.8243 |
Sydney | SE-LSTM | 0.0290 | 0.0626 | 0.9067 | 0.7542 |
| LSTM | 0.0307 | 0.0636 | 0.8475 | 0.8276 |
Invercargill | SE-LSTM | 0.0335 | 0.0751 | 0.9185 | 0.7295 |
| LSTM | 0.0406 | 0.0787 | 0.8265 | 0.7903 |