3.2. Seasonal Analysis
Although the random forest method can effectively capture the overall nonlinear relationships between air temperature and various pollutants, and quantitatively assess feature importance, it still has certain limitations in revealing spatiotemporal heterogeneity. In particular, meteorological conditions and pollutant emission mechanisms vary significantly across seasons, which can profoundly influence the interactions between temperature and pollutants. For example, during autumn and winter, increased heating demand, rising energy consumption, and a lowered atmospheric boundary layer height lead to intensified pollutant emissions and poorer dispersion conditions. These changes enhance the feedback effect of pollutants on temperature variation. Therefore, taking seasonal factors into account is crucial for a deeper understanding of the complex relationships between pollutants and temperature and for improving the generalizability and predictive accuracy of the model.
To further explore the linear relationships between air temperature and the six major pollutants, this study conducted a correlation analysis at the seasonal scale. The results (see Figure 6) show that the correlations between temperature and each pollutant vary significantly across seasons. Specifically, during spring and summer, the correlations are generally weak, with most failing to reach statistical significance, indicating a limited influence of pollutants on temperature during these periods. As the season transitions into autumn, ozone (O3) exhibits a significant positive correlation with temperature (r = 0.58). This may be attributed to the gradual decrease in temperature and the stabilization of atmospheric conditions, which favor the accumulation of ozone; the stable atmospheric environment reduces ozone dispersion, thereby enhancing its impact on local temperature. In winter, the correlation pattern becomes more complex: nitrogen dioxide (NO2), particulate matter (PM10), carbon monoxide (CO), and sulfur dioxide (SO2) show significant positive correlations with temperature. This is likely related to increased emissions from coal combustion and industrial activities during the heating season. Additionally, the stable atmospheric stratification and frequent temperature inversions in winter limit the vertical dispersion of pollutants, leading to their accumulation near the surface and further strengthening their feedback effects on temperature variations. These findings suggest that temperature variations are not only influenced by seasonal factors but are also significantly modulated by air pollutants.
Figure 7 shows that the concentrations of PM2.5, PM10, NO2, CO, and SO2 are relatively high in autumn and winter and lower in spring and summer, while O3 exhibits the opposite seasonal trend, peaking in summer and reaching its lowest levels in winter. Correlation analysis indicates that the relationship between pollutants and temperature is generally weak in spring and summer, whereas a significant positive correlation is observed between O3 and temperature in autumn (r = 0.58, p < 0.05). It is noteworthy that, although O3 concentrations are highest in summer, their correlation with temperature is not statistically significant, suggesting that temperature is not the sole or primary driver of O3 variation; O3 formation is also influenced by other factors, such as solar radiation intensity, precursor gas concentrations, and relative humidity. In winter, concentrations of PM10, NO2, CO, and SO2 increase markedly and exhibit statistically significant positive correlations with temperature. However, this does not imply that higher pollutant concentrations directly cause increases in temperature; rather, it more likely reflects a co-varying trend under similar meteorological conditions. During winter, temperature inversion layers frequently occur, vertical atmospheric mixing is suppressed, and pollutants tend to accumulate near the surface. In addition, low wind speeds, poor dispersion conditions, and high humidity contribute to the so-called "pollutant retention effect".
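As a hedged illustration, the seasonal correlation analysis above can be reproduced along the following lines; the sketch assumes a pandas DataFrame df with a datetime index, a temperature column, and one column per pollutant (all column names hypothetical):

```python
import pandas as pd
from scipy.stats import pearsonr

# Map calendar months to meteorological seasons.
SEASONS = {12: "winter", 1: "winter", 2: "winter",
           3: "spring", 4: "spring", 5: "spring",
           6: "summer", 7: "summer", 8: "summer",
           9: "autumn", 10: "autumn", 11: "autumn"}

df["season"] = df.index.month.map(SEASONS)

# Pearson r and p-value between temperature and each pollutant, per season.
for season, group in df.groupby("season"):
    for pollutant in ["PM2.5", "PM10", "NO2", "CO", "SO2", "O3"]:
        r, p = pearsonr(group["temperature"], group[pollutant])
        print(f"{season:6s} {pollutant:5s} r = {r:+.2f}, p = {p:.3f}")
```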
3.3. Analysis of Temperature Forecasting Accuracy
The temperature data, other meteorological factors, and six major air pollutant datasets were all obtained from the Huiju Atmospheric Platform. The data span from 1 January 2015 to 31 December 2023, with daily observations, totaling 3280 valid records. To ensure the reliability of model training and prediction, the data were chronologically divided into a training set (80%, from 1 January 2015 to 14 March 2022) and a testing set (20%, from 15 March 2022 to 31 December 2023).
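The chronological split can be expressed in a few lines; this is a minimal sketch assuming the cleaned records sit in a date-sorted pandas DataFrame data:

```python
# Chronological 80/20 split: no shuffling, so the test period
# strictly follows the training period in time.
split = int(len(data) * 0.8)
train, test = data.iloc[:split], data.iloc[split:]
```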
An initial analysis of the raw temperature series was conducted. As shown in Figure 8a, the original temperature sequence exhibits a certain trend but contains several data points with significant deviations. To eliminate the influence of scale and enhance model training efficiency, normalization was applied using the method defined in Equation (2). The normalized sequence is shown in Figure 8b; it retains the overall trend of the data while effectively reducing the dispersion among data points and compressing the values into the range [0, 1]. This normalization improves the training stability of the CNN-LSTM model and accelerates convergence toward optimal parameters.
$$x_{\text{norm}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{2}$$

where $x_{\text{norm}}$ is the normalized value, $x$ is the original value, $x_{\max}$ denotes the maximum, and $x_{\min}$ denotes the minimum within the dataset.
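A minimal sketch of Equation (2) applied to the temperature series, assuming a NumPy array temps:

```python
import numpy as np

def min_max_normalize(x):
    """Rescale a series into [0, 1] per Equation (2)."""
    return (x - x.min()) / (x.max() - x.min())

temps_norm = min_max_normalize(temps)  # values now lie in [0, 1]
```

In practice, the minimum and maximum should be taken from the training split only and reused on the test split, so that no information leaks from the test period into training.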
To ensure fairness in the performance comparison among the models, this study standardized the input features uniformly: all models employed a feature set comprising two meteorological factors, atmospheric pressure and visibility, along with historical temperature data. This eliminates performance bias caused by differences in feature information across models. In addition, recognizing that the lookback window (i.e., the time step) may significantly affect prediction accuracy, this study determined the optimal lookback value by combining autocorrelation function (ACF) analysis with empirical validation via grid search, as shown in Figure 9. Specifically, the Mean Squared Error (MSE) of model predictions on the testing set was evaluated under various lookback settings; as shown in Figure 9a, a lookback of 7 yielded the lowest MSE (0.0267). The ACF plot (Figure 9b) also exhibited significant autocorrelation at lag = 7, reflecting the long-memory characteristics of the temperature series. These findings provide both theoretical justification and empirical support for the subsequent CNN-LSTM-RF model design and temporal dependency modeling. A lookback of 7 means that the model inputs consist of the observations from the previous seven consecutive days. To further improve generalization and mitigate overfitting, dropout regularization was applied during training with a rate of 0.2, meaning that 20% of the neurons are randomly deactivated in each training iteration, which enhances the model's stability and generalization performance on unseen data.
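The lookback selection can be sketched as follows; fit_cnn_lstm is a hypothetical stand-in for the training routine, and train_series/test_series are assumed to be the normalized NumPy arrays (shown univariate for brevity):

```python
import numpy as np
from statsmodels.tsa.stattools import acf

def make_windows(series, lookback):
    """Slice a 1-D series into (samples, lookback) inputs and next-day targets."""
    X = np.stack([series[i:i + lookback] for i in range(len(series) - lookback)])
    y = series[lookback:]
    return X, y

# ACF of the training series: lags with strong autocorrelation are
# natural lookback candidates (Figure 9b highlights lag = 7).
autocorr = acf(train_series, nlags=30)

# Grid search: retrain for each candidate lookback and keep its test-set MSE.
mse_by_lookback = {}
for lookback in range(2, 15):
    X_tr, y_tr = make_windows(train_series, lookback)
    X_te, y_te = make_windows(test_series, lookback)
    model = fit_cnn_lstm(X_tr, y_tr)  # hypothetical helper
    mse_by_lookback[lookback] = float(np.mean((model.predict(X_te) - y_te) ** 2))

best_lookback = min(mse_by_lookback, key=mse_by_lookback.get)  # 7 in Figure 9a
```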
Table 2 provides a detailed overview of the model architecture along with the output shapes and parameter configurations. The input layer has an output shape of (None, 7, 3), indicating that it accepts sequential data with a time step length of 7 and a feature dimension of 3, and contains no trainable parameters. Next, the Conv1D layer employs 32 one-dimensional convolutional filters of size 3 for feature extraction. With 3 input channels and 32 output channels, a stride of 1, and no padding (padding = 0), the output shape becomes (None, 32, 5), a reduction of 2 time steps compared to the input; this layer introduces 320 trainable parameters. The ReLU activation function provides nonlinear transformation capability while maintaining the same output shape as the convolutional layer (None, 32, 5); it effectively mitigates the vanishing gradient problem and accelerates convergence. Following this, the MaxPool1D layer performs downsampling with a pool size of 3 and a stride of 1, compressing the output shape to (None, 32, 3), which reduces computational complexity while enhancing feature representation. Subsequently, a dimension permutation rearranges the tensor into the input format required by the LSTM layer, i.e., (batch, seq_len, features). The LSTM component consists of two layers, each with 32 units, producing an output shape of (None, 3, 32) and containing 16,640 trainable parameters; it captures the long-term dependencies in the temperature time series. The following fully connected layer applies a linear mapping to transform the LSTM output into a single prediction value, with an output shape of (None, 1) and 33 parameters. Finally, the CNN-LSTM module outputs the temperature prediction with a shape of (None, 1). The total number of parameters is 16,993, reflecting the model's efficiency in temporal feature extraction and sequence modeling.
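The architecture in Table 2 maps onto a PyTorch sketch like the following. Layer sizes are taken from the table; everything else (framework, training details) is an assumption. Note that PyTorch's LSTM keeps two bias vectors per layer and therefore counts 16,896 LSTM parameters, whereas the 16,640 in Table 2 follows the single-bias convention.

```python
import torch
import torch.nn as nn

class CNNLSTM(nn.Module):
    """CNN-LSTM module following the shapes in Table 2 (input: 7 steps x 3 features)."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv1d(in_channels=3, out_channels=32, kernel_size=3)  # 320 params
        self.pool = nn.MaxPool1d(kernel_size=3, stride=1)
        self.lstm = nn.LSTM(input_size=32, hidden_size=32, num_layers=2,
                            dropout=0.2, batch_first=True)  # dropout between LSTM layers
        self.fc = nn.Linear(32, 1)  # 33 params

    def forward(self, x):                         # x: (batch, 7, 3)
        x = x.permute(0, 2, 1)                    # -> (batch, 3, 7), channels first
        x = self.pool(torch.relu(self.conv(x)))   # -> (batch, 32, 5) -> (batch, 32, 3)
        x = x.permute(0, 2, 1)                    # -> (batch, 3, 32) for the LSTM
        out, _ = self.lstm(x)                     # -> (batch, 3, 32)
        return self.fc(out[:, -1, :])             # -> (batch, 1)
```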
To enhance the model’s environmental awareness, the PM2.5 and O3 concentration features are introduced as supplementary inputs. The output of the CNN-LSTM model, along with these two pollutant concentration features, forms the input feature set for the random forest model, with dimensions of (None, 3). This multi-source feature fusion strategy effectively improves the model’s ability to predict temperature changes.
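A hedged sketch of this fusion step; the array names are hypothetical, and n_estimators/random_state are illustrative defaults rather than reported settings:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Stack the CNN-LSTM temperature prediction with the aligned PM2.5 and O3
# concentrations to form the (n_samples, 3) feature matrix described above.
X_rf = np.column_stack([cnn_lstm_pred, pm25, o3])

rf = RandomForestRegressor(n_estimators=100, random_state=42)
rf.fit(X_rf[:n_train], y_temp[:n_train])  # chronological split as before
y_pred = rf.predict(X_rf[n_train:])
```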
The choice of loss function plays a crucial role in determining the predictive accuracy of the model. In this study, the Mean Squared Error (MSE) is employed as the loss function. MSE evaluates model performance by calculating the average of the squared differences between the predicted and actual values. Due to its high sensitivity to large errors, MSE effectively amplifies significant deviations, making it well suited for tasks that demand high predictive precision. This, in turn, enables the model to more accurately capture the variation characteristics of the target variable.
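For reference, the loss is

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2,$$

where $y_i$ is the observed temperature, $\hat{y}_i$ is the predicted temperature, and $n$ is the number of samples.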
Figure 10 illustrates the fitting performance of the CNN-LSTM model during the training phase, as well as the evolution of the loss function. The left panel shows the training set fitting curve, where the model’s predicted values (red curve) closely match the actual observations (blue curve). The model demonstrates excellent fitting ability, particularly in capturing the periodic fluctuations and extreme values of temperature, indicating its effectiveness in modeling the temporal patterns and seasonal characteristics of temperature data. The right panel displays the variation trend of the loss function (measured by Mean Squared Error, MSE) throughout the training process. As shown in the figure, the model achieved a substantial reduction in loss during the early training epochs, dropping rapidly from an initial value above 20 to nearly 1. The loss then gradually stabilized and eventually settled around 0.90, demonstrating good convergence behavior. This process reflects the model’s strong feature learning capability in the early training stage, efficiently capturing the main structural changes and intrinsic patterns in the data. Moreover, the loss function remains stable in the later stages of training without significant oscillations, suggesting that the model does not suffer from overfitting. In other words, the model has not simply memorized the training data but exhibits a certain degree of abstract generalization. This indicates that the model is likely to perform well on unseen data, validating the effectiveness and robustness of the CNN-LSTM hybrid architecture for modeling highly complex temperature time series.
Specifically, the CNN module extracts local variation patterns from the input features (atmospheric pressure, visibility, and historical temperature) through convolution operations, while the LSTM module captures both short- and long-term dependencies along the time dimension via its gating mechanism. The synergy between these two components significantly enhances the model's adaptability to non-stationary climate data. Furthermore, the steady decline and stabilization of the training loss provide additional evidence of the robustness and reliability of the CNN-LSTM structure in handling complex temperature series.
To further validate the CNN-LSTM model, Figure 11 presents the prediction results on the test set. The predicted curve aligns closely with the actual values, confirming the model's predictive capability. The output of this deep learning model is subsequently used as one of the input features for the random forest model, providing a reliable foundation for temperature prediction based on the concentrations of major air pollutants.
Finally, this study integrates the prediction results of the CNN-LSTM model with the two key pollutant features identified earlier (PM2.5 and O3) to construct a three-dimensional input feature set, which is then used as input to the random forest (RF) model to further enhance temperature prediction performance. This hybrid approach leverages the strengths of deep learning in time series modeling and the powerful nonlinear feature extraction capability of the RF model, significantly improving the accuracy and robustness of temperature forecasts. As shown in Figure 12a, the predicted values on the test set closely match the actual observations, indicating that the model possesses strong fitting capability and generalization performance. The residual time series plot in Figure 12b demonstrates that, although some fluctuations exist, the overall distribution of residuals is relatively stable, implying that the prediction errors are minor and within a controllable range; some deviations may result from inevitable systematic errors that cannot be fully eliminated during model fitting. Furthermore, the autocorrelation function (ACF) and partial autocorrelation function (PACF) plots in Figure 12c show no significant autocorrelation beyond lag 1, suggesting that the model has effectively captured the underlying patterns in the data and that the remaining residuals resemble white noise. The QQ plot in Figure 12d reveals that the residuals are approximately normally distributed. Taken together, the residual analysis confirms that the CNN-LSTM-RF model achieves a satisfactory fit, with prediction errors primarily reflecting random disturbances, demonstrating the model's strong stability and reliability in temperature forecasting tasks.
Table 3 presents the performance comparison of the different models. The analysis shows that, compared with the individual models, the CNN-LSTM-RF hybrid model achieves a significant improvement in prediction performance, highlighting the advantages of deep learning approaches in temperature forecasting. Specifically, the random forest (RF) model achieved RMSE = 5.58, MAE = 4.42, and R² = 0.60, indicating some strength in nonlinear feature extraction but also revealing limitations in capturing the complexity of temperature patterns. The convolutional neural network (CNN) model shows a notable improvement, with RMSE = 3.43, MAE = 2.78, and R² = 0.86, demonstrating strong capability in extracting spatial features from the temperature data. The Long Short-Term Memory (LSTM) network further enhances predictive accuracy, reducing RMSE to 2.64 and MAE to 1.91 and increasing R² to 0.91, validating its superiority in modeling sequential data and capturing long-term dependencies. The CNN-LSTM model, which integrates the spatial feature extraction of CNNs with the temporal dependency modeling of LSTM, achieves RMSE = 2.39, MAE = 1.76, and R² = 0.93, indicating its comprehensive modeling ability in time series prediction tasks.
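The three metrics in Table 3 (and later Table 4) can be computed with scikit-learn; a minimal sketch:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

def evaluate(y_true, y_pred):
    """Return (RMSE, MAE, R²) as reported in Tables 3 and 4."""
    rmse = float(np.sqrt(mean_squared_error(y_true, y_pred)))
    mae = float(mean_absolute_error(y_true, y_pred))
    r2 = float(r2_score(y_true, y_pred))
    return rmse, mae, r2
```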
Ultimately, the CNN-LSTM-RF hybrid model demonstrated the best predictive performance, with the RMSE reduced to 0.88, the MAE lowered to 0.66, and the R² reaching 0.99. These results validate the model's superior capability in capturing both the spatial features and temporal dynamics of temperature data. The hybrid model effectively integrates the strengths of convolutional neural networks (CNNs) in spatial feature extraction and Long Short-Term Memory (LSTM) networks in temporal sequence modeling. The output of the CNN-LSTM structure is then combined with key pollutant features and fed into the random forest (RF) model, which significantly enhances the accuracy of temperature prediction influenced by atmospheric pollutants. By integrating deep learning with traditional machine learning methods, this study systematically explores the mechanisms through which air pollutants affect temperature variation. The proposed framework provides a novel approach for improving weather prediction accuracy and offers useful insights for research and applications in climate-related fields.
3.4. Comparison of Temperature Prediction Accuracy at Multiple Time Scales
Section 3.3 analyzed the performance of the CNN-LSTM-RF model for temperature prediction at the daily scale. To further explore the model's applicability and stability across temporal scales, especially its predictive capability for medium- to long-term forecasting (e.g., monthly average temperature), a multi-scale comparative analysis was conducted. Specifically, the preprocessed daily temperature data were aggregated into a monthly average temperature series using a moving average method. On this basis, the optimal time step (lookback) for the monthly scale was determined through autocorrelation function (ACF) analysis and empirical validation via grid search (see Figure 13). The results indicate that a lookback of 7 yields the lowest test set MSE (0.00796), while the ACF plot shows significant autocorrelation at lag = 7, revealing the long-memory characteristics of the monthly temperature sequence. This provides both theoretical justification and practical support for the subsequent construction of the monthly-scale CNN-LSTM-RF model and its modeling of temporal dependencies.
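The monthly aggregation can be sketched as follows; daily is a hypothetical pandas Series with a datetime index, and since the text describes a moving-average method, both a calendar-month mean and a 30-day moving average are shown for illustration:

```python
# Calendar-month means via resampling ...
monthly = daily.resample("MS").mean()

# ... or a 30-day moving average, closer to the moving-average
# aggregation described in the text.
smoothed = daily.rolling(window=30, min_periods=1).mean()
```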
Keeping the model structure selected at the daily scale unchanged, this study adjusted the temporal granularity of the input data to the monthly scale and retrained the model using combined input features, including PM2.5 and O3 concentrations as well as historical temperature data. To prevent overfitting during training, the same dropout regularization strategy as in the daily-scale modeling was adopted, with the dropout rate set to 0.2, aiming to improve the model's generalization ability and robustness in monthly-scale temperature prediction. On this basis, the output of the CNN-LSTM model was again used as a feature input to the random forest (RF) model. This fusion strategy effectively combines the advantages of deep learning in temporal sequence modeling with the strong generalization capability of ensemble learning in nonlinear regression tasks, enhancing the overall prediction accuracy and stability of the model.
Figure 14 presents the temperature sequence prediction results for Wuhan based on the monthly-scale CNN-LSTM-RF model. The overall trend of the predicted values is highly consistent with the actual observations, further verifying the effectiveness and robustness of the model in medium- to long-term temperature forecasting tasks.
Finally, the evaluation of the monthly-scale CNN-LSTM-RF hybrid model yields an RMSE of 1.0097, an MAE of 0.8771, and an R² of 0.9841. A detailed comparison with the daily-scale model is presented in Table 4. The results indicate that the CNN-LSTM-RF model maintains high predictive accuracy across temporal scales. Specifically, the RMSE and MAE at the daily (short-term) scale are lower than those at the monthly (medium- to long-term) scale, and the R² score is higher, indicating that the model captures short-term temperature fluctuations more accurately and stably. Although errors increase somewhat at the monthly scale, the model still maintains strong fitting ability and high predictive performance, showing good adaptability and reliability in capturing medium- to long-term temperature trends. Overall, the CNN-LSTM-RF model performs well across multiple temporal scales, demonstrating strong generalization capability and application potential.
It is worth noting that there are observable differences between the daily- and monthly-scale prediction results. To examine whether these differences are statistically significant, this study conducted a paired-sample t-test comparing the predicted annual average temperatures derived from the two time scales. The test yielded t = −3.5299 and p = 0.0242, which is significant at the 0.05 level, indicating that the predictions at the two temporal scales differ significantly. These findings imply that temperature prediction models constructed at different temporal scales may produce systematic deviations in their estimates. Researchers are therefore advised to select time scales and time steps appropriate to the application at hand and to adjust model structures and parameter settings accordingly, in order to enhance the scientific validity and reliability of climate forecasting outcomes.
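The significance test can be reproduced along these lines; annual_mean_daily and annual_mean_monthly are hypothetical arrays holding one paired per-year mean prediction from each model:

```python
from scipy import stats

# Paired-sample t-test on annual mean temperatures predicted
# at the daily and monthly scales (one paired value per year).
t_stat, p_value = stats.ttest_rel(annual_mean_daily, annual_mean_monthly)
print(f"t = {t_stat:.4f}, p = {p_value:.4f}")  # reported: t = -3.5299, p = 0.0242
```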