This section presents a comprehensive analysis of model performance across different temporal scales, starting with the raw data as a baseline and progressing to component-specific and integrated modeling approaches.
4.1. Analysis of Raw Data
All variables were standardized to have a mean of zero and a standard deviation of one, ensuring comparability across different scales. We first applied a multiple linear regression model, allowing the influence of each independent variable on the dependent variable to be quantified through the regression coefficients, as shown in Equation (2).
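Equation (2) has the standard multiple linear regression form; the rendering below uses generic notation, so the symbols may differ from those in the published equation:

$$ y_t = \beta_0 + \sum_{i=1}^{p} \beta_i\, x_{i,t} + \varepsilon_t, $$

where $y_t$ is the standardized target at time $t$, $x_{i,t}$ are the $p$ standardized predictors, $\beta_i$ their coefficients, and $\varepsilon_t$ the error term.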
This interpretable and straightforward framework allows the contribution of each predictor to be assessed; however, it may be too restrictive for complex real-world data. To address this, we extend the analysis with more flexible methods. Ridge and Lasso regression enhance the linear model by mitigating multicollinearity and selecting the most relevant variables, while Random Forest captures nonlinear relationships and interactions through an ensemble of decision trees. XGBoost, in turn, builds trees sequentially to reduce residual errors and uncover subtle patterns, making it particularly effective for modeling the raw data. Model performance was evaluated using walk-forward validation, which trains on past observations and tests on future data to simulate real-world forecasting. The results (Figure 4) indicate that Random Forest and XGBoost outperform the linear models, achieving lower errors and higher predictive accuracy.
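For concreteness, the walk-forward procedure can be sketched as follows. This is a minimal illustration on synthetic data rather than the study's actual records, and the hyperparameters (tree counts, learning rate, regularization strengths) are placeholders, not the values used in the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import TimeSeriesSplit
from xgboost import XGBRegressor

# Synthetic stand-in for the standardized monthly predictors and target.
rng = np.random.default_rng(0)
X = rng.standard_normal((240, 5))
y = 0.8 * X[:, 0] + np.sin(2 * np.pi * np.arange(240) / 12) + rng.normal(0, 0.3, 240)

models = {
    "Linear": LinearRegression(),
    "Ridge": Ridge(alpha=1.0),
    "Lasso": Lasso(alpha=0.01),
    "Random Forest": RandomForestRegressor(n_estimators=300, random_state=0),
    "XGBoost": XGBRegressor(n_estimators=300, learning_rate=0.05, max_depth=3),
}

# Walk-forward validation: every split trains only on past observations
# and evaluates on the block that immediately follows them.
tscv = TimeSeriesSplit(n_splits=5)
for name, model in models.items():
    y_true, y_pred = [], []
    for train_idx, test_idx in tscv.split(X):
        model.fit(X[train_idx], y[train_idx])
        y_true.append(y[test_idx])
        y_pred.append(model.predict(X[test_idx]))
    y_true, y_pred = np.concatenate(y_true), np.concatenate(y_pred)
    mae = mean_absolute_error(y_true, y_pred)
    rmse = np.sqrt(mean_squared_error(y_true, y_pred))
    r = np.corrcoef(y_true, y_pred)[0, 1]
    print(f"{name}: MAE={mae:.2f} RMSE={rmse:.2f} r={r:.2f}")
```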
Moreover, as shown in Figure 5, solar radiation (ALLSKY_SFC_SW_DWN) consistently emerged as the most influential predictor across all models, while GWETROOT exhibited the strongest negative effect. Precipitation (PRECTOTCORR) and soil moisture gained importance in XGBoost, suggesting potential nonlinear effects, whereas WS2M consistently had minimal influence.
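The cross-model comparison in Figure 5 can be reproduced along these lines. The sketch below continues from the previous snippet (reusing `models`, `X`, and `y`); note that absolute linear coefficients and tree-based importances are not strictly equivalent measures, so the common normalization here is an illustrative choice, not necessarily the paper's.

```python
import numpy as np
import pandas as pd

# Refit each model on the full record so the interpretation uses all data.
for model in models.values():
    model.fit(X, y)

feature_names = ["ALLSKY_SFC_SW_DWN", "GWETROOT", "PRECTOTCORR", "RH2M", "WS2M"]

def normalized_importance(model):
    """Put linear coefficients and tree importances on a comparable 0-1 scale."""
    if hasattr(model, "feature_importances_"):  # Random Forest, XGBoost
        raw = np.asarray(model.feature_importances_, dtype=float)
    else:                                       # Linear, Ridge, Lasso
        raw = np.abs(model.coef_)
    return raw / raw.sum()

importance_table = pd.DataFrame(
    {name: normalized_importance(m) for name, m in models.items()},
    index=feature_names,
)
print(importance_table.round(3))
```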
Figure 6 compares the actual time series with predictions from five models: Linear, Ridge, Lasso, Random Forest, and XGBoost. The Linear, Ridge, and Lasso models each achieved a correlation coefficient of 0.78, capturing the general pattern but missing finer fluctuations. In contrast, Random Forest and XGBoost performed better, each reaching a correlation of nearly 0.90 and closely following the actual values, particularly around peaks and troughs. These results suggest that ensemble methods are better suited to capturing complex or highly variable patterns in the data.
4.2. Long-Term Component Analysis
This section focuses on the long-term component of the target variable, analyzed using regression and ensemble models to capture underlying trends over time. Equation (3) presents the mathematical formulation of the linear regression model for the long-term component.
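In the notation of Equation (2), a generic rendering of this formulation is (the published symbols may differ):

$$ y^{\text{long}}_t = \beta_0 + \sum_{i=1}^{p} \beta_i\, x^{\text{long}}_{i,t} + \varepsilon_t. $$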
The long-term equation above uses the same notation as the raw data equation, with terms applied exclusively to the long-term component.
As can be seen in Figure 7, tree-based models (RF, XGBoost) outperformed the linear ones, achieving the lowest errors and the highest R² (0.91).
Figure 8 identifies ALLSKY_SFC_SW_DWN and GWETROOT as the strongest long-term drivers, while PRECTOTCORR and RH2M exert negative effects. The tree-based models attenuate these effects but confirm solar radiation as the dominant driver and highlight a nonlinear role for RH2M.
Figure 9 compares the actual long-term values with predictions from five models. Random Forest and XGBoost provide the closest match to the observed data, each achieving a correlation of 0.96, with their curves closely aligned across the entire time range. The Linear and Ridge models also track the overall pattern well, each reaching a correlation of 0.94, while Lasso performs slightly worse (0.93), with more noticeable deviations around turning points. Nevertheless, all models capture the main shape of the data, indicating reasonable predictive performance.
4.3. Seasonal-Term Component Analysis
This section focuses on evaluating model performance in capturing the seasonal variation of the target variable. Equation (4) presents the mathematical formulation of the linear regression model for the seasonal component.
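A generic rendering, paralleling the long-term case (symbols may differ from the published equation):

$$ y^{\text{seasonal}}_t = \beta_0 + \sum_{i=1}^{p} \beta_i\, x^{\text{seasonal}}_{i,t} + \varepsilon_t. $$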
The seasonal-term equation uses the same notation as the raw data equation, with all terms applied specifically to the seasonal component.
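As a minimal sketch of how such component series can be obtained, the snippet below applies an additive STL decomposition to a synthetic monthly series; the study's actual decomposition procedure is defined earlier in the paper and may differ.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import STL

# Synthetic monthly series standing in for T2M.
rng = np.random.default_rng(1)
idx = pd.date_range("2000-01-01", periods=240, freq="MS")
t2m = pd.Series(
    15.0
    + 0.01 * np.arange(240)                          # slow warming trend
    + 8.0 * np.sin(2 * np.pi * np.arange(240) / 12)  # annual cycle
    + rng.normal(0, 0.5, 240),
    index=idx,
)

# Additive decomposition: raw = long-term + seasonal + short-term.
result = STL(t2m, period=12).fit()
t2m_long_term = result.trend         # long-term component
t2m_seasonal_term = result.seasonal  # seasonal component
t2m_short_term = result.resid        # short-term (residual) component
```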
Figure 10 presents walk-forward cross-validation results for T2M_seasonal_term. The Linear and Ridge models perform similarly (error of 0.14), while Lasso performs slightly worse. Random Forest and XGBoost achieve the lowest errors and the highest R², closely tracking the observed seasonal patterns.
As shown in Figure 11, in the linear models ALLSKY_SFC_SW_DWN_seasonal exhibits the strongest positive effect, WS2M_seasonal has a moderate negative effect, and the other predictors have minor influence. In Random Forest and XGBoost, ALLSKY_SFC_SW_DWN_seasonal remains the most important predictor, WS2M_seasonal becomes positive, and the remaining variables contribute little, indicating that seasonal variation is driven primarily by solar radiation and wind speed.
Figure 12 compares the actual seasonal values with predictions from five models. The linear models (Linear, Ridge, Lasso) show strong alignment with the observed data, whereas Random Forest and XGBoost achieve nearly perfect fits. The combined plot confirms that the tree-based models track the seasonal pattern most accurately, while the linear models still perform well, exhibiting only slight deviations.
4.5. Additive Model Analysis
In this section, we present an additive framework for representing the surface air temperature series by integrating its short-term variations, seasonal pattern, and long-term trend. Equation (6) provides the mathematical formulation for linear regression; a similar approach can be applied to Ridge and Lasso regression, with their respective regularization constraints taken into account. Unlike Ridge and Lasso, which extend linear regression through explicit regularization terms, Random Forest and XGBoost do not rely on a single analytical equation. Instead, they generate predictions through ensembles of decision trees, governed by algorithmic rules and hyperparameter settings.
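A generic rendering of this additive formulation, using the component notation defined below (the published symbols may differ):

$$ y_t = \beta_0 + \sum_{i=1}^{p} \beta^{\text{long}}_i\, x^{\text{long}}_{i,t} + \sum_{j=1}^{q} \beta^{\text{seasonal}}_j\, x^{\text{seasonal}}_{j,t} + \sum_{k=1}^{r} \beta^{\text{short}}_k\, x^{\text{short}}_{k,t} + \varepsilon_t. $$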
Here, $y_t$ represents the raw dependent variable; $x^{\text{long}}_{i,t}$ are the long-term component variables with corresponding coefficients $\beta^{\text{long}}_i$; $x^{\text{seasonal}}_{j,t}$ represent the seasonal-term component variables with coefficients $\beta^{\text{seasonal}}_j$; and $x^{\text{short}}_{k,t}$ represent the short-term component variables with coefficients $\beta^{\text{short}}_k$. The term $\varepsilon_t$ denotes the error at time $t$.
Tree-based ensembles clearly outperform the linear methods. XGBoost shows the best performance (MAE = 0.22, RMSE = 0.29), closely followed by Random Forest (MAE = 0.23, RMSE = 0.30) (Figure 17), highlighting the effectiveness of ensemble models in capturing complex nonlinear relationships. Among the linear models, Ridge Regression performs best, slightly outperforming Lasso and matching the accuracy of Linear Regression. In summary, tree-based ensembles excel at modeling complex, nonlinear dependencies, whereas Ridge offers a simpler and more interpretable alternative.
The feature importance analysis (Figure 18) provides key insights into how different predictors contribute to model performance across temporal scales, physical categories, and learning algorithms. At the individual-feature level (top-left panel), solar radiation-related variables, particularly the seasonal component of ALLSKY_SFC_SW_DWN, consistently emerge as the most influential across all models. This dominance reflects the primary role of radiative forcing in driving temperature variability, especially at seasonal frequencies. Wind speed and groundwater-related predictors also show moderate contributions, whereas short-term flow-related variables display relatively minor influence.
When aggregated by physical category (top-right panel), solar radiation exhibits the highest mean importance, followed by wind speed and groundwater indices, reaffirming their strong explanatory power in the additive framework. Precipitation and humidity yield lower contributions, suggesting that their effects on temperature may be more indirect or scale-dependent.
The bottom-left panel compares overall feature importance across model families. Ridge and Linear Regression demonstrate the highest average importance, indicating a strong reliance on key predictors and suggesting that regularization enhances the interpretive stability of linear models. Tree-based models (Random Forest and XGBoost), though strong in predictive performance, distribute importance more diffusely owing to their ability to model interactions and nonlinearities.
Finally, the time-scale analysis (bottom-right panel) reveals that seasonal components contribute substantially more than long-term and short-term features, highlighting the comparatively high predictability and signal strength at seasonal scales. Long-term predictors retain a moderate influence, likely reflecting broad climatic trends, whereas short-term components show the lowest importance, consistent with their lower predictability and higher stochasticity observed in model evaluation.
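The category- and scale-level views in Figure 18 amount to grouped averages over the per-feature importances. The sketch below illustrates the aggregation with placeholder values and mappings, not the study's actual numbers:

```python
import pandas as pd

# Placeholder per-feature importances; the real values come from the fitted models.
imp = pd.DataFrame({
    "feature": ["ALLSKY_SFC_SW_DWN_seasonal", "ALLSKY_SFC_SW_DWN_long",
                "WS2M_seasonal", "GWETROOT_long", "PRECTOTCORR_short"],
    "importance": [0.42, 0.20, 0.15, 0.13, 0.10],
    "category": ["solar radiation", "solar radiation", "wind speed",
                 "groundwater", "precipitation"],
    "scale": ["seasonal", "long-term", "seasonal", "long-term", "short-term"],
})

# Mean importance by physical category (cf. the top-right panel of Figure 18).
print(imp.groupby("category")["importance"].mean().sort_values(ascending=False))

# Mean importance by temporal scale (cf. the bottom-right panel).
print(imp.groupby("scale")["importance"].mean().sort_values(ascending=False))
```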
Overall, these findings validate the decomposition-based approach by demonstrating that model learning is strongly weighted toward physically interpretable seasonal drivers and that regularized linear models effectively capture dominant relationships while retaining parsimony.
Figure 19 summarizes the numerical results, illustrating how each component contributes to the model’s output across different methods. In the top row, the three linear models—Linear Regression, Ridge, and Lasso—exhibit very similar behaviors, each achieving a correlation coefficient of 0.93 with the observed values. All three models capture the overall wave-like pattern of the data well, but tend to miss sharper turning points, particularly during steep rises and drops, where predictions either lag slightly or fail to reach the full peaks and troughs of the actual series.
In the lower row, Random Forest and XGBoost provide a noticeably closer fit to the observed data, with correlation coefficients of 0.95 and 0.96, respectively. Their prediction lines nearly overlap the actual series throughout the entire period. XGBoost, in particular, captures the full amplitude and direction of variation, even at the extremes, demonstrating a superior ability to learn the underlying structure, especially in regions where the signal exhibits rapid shifts or nonlinear changes.