4.1. Statistical and Visual Analysis of Data Structure
To quantitatively assess the relationship between agroclimatic conditions and pest infestation in grain crops, a panel dataset was created in which the basic unit of observation is the “year–spatial unit” pair. For each year and region (or agrounit), the dataset records average monthly temperature and precipitation values for April–August, the area of crops surveyed in summer, and summer phytosanitary indicators, including the infestation index and pest abundance. This data format allows for the simultaneous consideration of interannual dynamics and the spatial heterogeneity of pest damage and serves as the basis for subsequent statistical analysis and the training of predictive models.
Table 6 shows that all climate indicators are generated strictly for the growing season and include only the months from April to August. Autumn indicators are absent: they reflect the pest’s preparation for overwintering and influence the dynamics of the following year, but are not informative for predicting current summer infestation levels. The target variables for the summer period are the infestation index zasel_l, which serves as the primary target for modeling, and the kol_l indicator, used for descriptive analysis and biological interpretation. The additional indicator audan_l accounts for the influence of the surveyed area on the recorded infestation level, thereby increasing the accuracy of subsequent analysis.
Each row corresponds to a unique combination of observation year (Year) and spatial identifier (ID). For each pair (Year, ID), the average monthly temperature (t_*) and precipitation (p_*) for April–August, the area of surveyed crops in summer (audan_l), the summer infestation index (zasel_l), and the pest abundance (kol_l) are presented. The first five observations are shown; the full dataset contains 4500 rows.
Table 7 shows the number of observations (count), mean (mean), standard deviation (std), minimum value (min), quartile values (25%, 50%, 75%), and maximum (max) for each variable. This describes the range and variability of agroclimatic conditions and phytosanitary indicators used in the modeling.
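The layout of Table 7 (count, mean, std, min, quartiles, max) can be reproduced for any single column with a few lines of code. The following is a minimal sketch using only the Python standard library; the sample values are invented for illustration and are not taken from the dataset, and the choice of the inclusive (linear-interpolation) quantile method and of the sample standard deviation are assumptions.

```python
import statistics

def describe(values):
    """Return count, mean, sample std, min, quartiles, and max for one column."""
    # "inclusive" treats the data as the full population range and
    # interpolates linearly between neighbouring order statistics.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "count": len(values),
        "mean": statistics.fmean(values),
        "std": statistics.stdev(values),  # sample standard deviation (n - 1)
        "min": min(values),
        "25%": q1,
        "50%": q2,
        "75%": q3,
        "max": max(values),
    }

# Invented, right-skewed toy values standing in for an index such as zasel_l.
toy_index = [0.1, 0.3, 0.5, 0.7, 12.0]
summary = describe(toy_index)
```

In a right-skewed column such as this one, the mean lies well above the median, which is the pattern the summary statistics are meant to expose.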
The data cover 18 years (2005–2022) and include approximately 4500 observations, which supports a reliable assessment of interannual dynamics. Average monthly temperatures (t_april–t_august) demonstrate the expected seasonal profile for Kazakhstan: spring values are in the range of ~15–21 °C, and the summer maximum is reached in July (~27 °C) with moderate interannual variability. Precipitation indices (p_april–p_august) vary mainly within the range [0, 1], reflecting significant fluctuations in moisture between months, which is essential for the formation of favorable or unfavorable conditions for pest development. The surveyed area indicator (audan_l) varies widely, from 0.001 to 141.367, indicating significant differences in the monitoring scale across regions and years, and is therefore used as one of the predictors. The distributions of the target variables zasel_l and kol_l are strongly right-skewed: the medians remain relatively low, while the maxima can be very high, indicating rare but intense pest outbreaks. This property is taken into account when choosing models and analyzing prediction errors.
To ensure the validity of the evaluation protocol across time and to check for unintended distribution shifts, descriptive statistics were calculated separately for the training, validation, and test sets. Because the dataset was split chronologically, differences between splits may reflect genuine interannual variability rather than random selection effects. Therefore, statistical comparisons across splits provide an additional diagnostic layer for assessing the consistency of the dataset and the generalization of the model.
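A chronological split and the per-split diagnostics described above can be sketched as follows. This is an illustration under stated assumptions: the boundary years `train_end` and `val_end`, the field names `year` and `id`, and all values in the toy panel are hypothetical.

```python
import statistics

def chronological_split(rows, train_end, val_end):
    """Split by year: train <= train_end < validation <= val_end < test."""
    train = [r for r in rows if r["year"] <= train_end]
    val = [r for r in rows if train_end < r["year"] <= val_end]
    test = [r for r in rows if r["year"] > val_end]
    return train, val, test

def split_summary(rows, column):
    """Median and maximum of one column, used to spot distribution shift."""
    values = [r[column] for r in rows]
    return {"n": len(values), "median": statistics.median(values), "max": max(values)}

# Invented toy panel: one row per (year, spatial unit) pair.
rows = [
    {"year": 2005, "id": 1, "zasel_l": 0.4},
    {"year": 2005, "id": 2, "zasel_l": 35.0},
    {"year": 2012, "id": 1, "zasel_l": 0.6},
    {"year": 2017, "id": 1, "zasel_l": 0.5},
    {"year": 2017, "id": 2, "zasel_l": 2.0},
    {"year": 2021, "id": 1, "zasel_l": 0.5},
]

# Hypothetical boundary years, chosen only for this example.
train, val, test = chronological_split(rows, train_end=2014, val_end=2018)
summaries = {
    name: split_summary(part, "zasel_l")
    for name, part in [("train", train), ("val", val), ("test", test)]
}
```

Because no year appears in more than one partition, differences between the summaries reflect interannual variability rather than random sampling.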
Table 8 presents summary statistics for the target variable and key climate predictors across the three data partitions. The comparison confirms that the climate predictors remain largely comparable across sample splits. Average temperature values for April–June exhibit only moderate fluctuations, and precipitation indices maintain similar ranges and scatter, indicating a stable distribution of climate parameters throughout the observation period. In contrast, the target variable (zasel_l) exhibits noticeable differences in extreme values. Although the median remains virtually identical across sample splits (≈0.5), the training set contains significantly higher maximum values compared to the validation and testing periods. This suggests that rare but intense outbreaks were observed in earlier years, which are less common in later seasons.
Given the central role of early- and mid-season agroclimatic conditions in shaping summer pest dynamics, a targeted statistical analysis of key climatic factors was conducted for the period from April to June. These months correspond to critical phases of the early growing season and early biological activity, during which temperature and humidity conditions can significantly influence pest development and survival rates.
Table 9 presents descriptive statistics for the selected climatic variables, allowing for a structured assessment of their central tendencies, variability, and observed ranges across the entire panel dataset.
In addition to monthly temperature and precipitation indices (April–August), aggregated seasonal characteristics were constructed to reflect conditions at key stages of the growing season. For temperature, mean values were calculated for the early, middle, and late growing-season subsegments, as well as the mean temperature for the entire April–August period and its intraseasonal range. Similarly, for precipitation, summary indices were obtained for the early, middle, and late parts of the season, together with the total precipitation for April–August and the maximum monthly value within the season. These aggregates provide a compact description of integral and extreme climatic conditions that may affect pest development and survival and are used in subsequent regression and machine learning analyses as predictors of summer infestation.
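The construction of these aggregates can be illustrated with a short sketch. The assignment of months to sub-seasons, the use of sums for the precipitation sub-season indices, and the output names (`t_early`, `p_total`, and so on) are assumptions made for this example; only the monthly input names t_* and p_* come from the dataset description.

```python
MONTHS = ["april", "may", "june", "july", "august"]

# Assumed grouping of months into sub-seasons (hypothetical).
SEGMENTS = {"early": ["april", "may"], "mid": ["june", "july"], "late": ["august"]}

def seasonal_aggregates(row):
    """Derive seasonal temperature and precipitation features from monthly values."""
    temps = [row[f"t_{m}"] for m in MONTHS]
    precs = [row[f"p_{m}"] for m in MONTHS]
    out = {}
    for name, months in SEGMENTS.items():
        # Mean temperature and summed precipitation within each sub-season.
        out[f"t_{name}"] = sum(row[f"t_{m}"] for m in months) / len(months)
        out[f"p_{name}"] = sum(row[f"p_{m}"] for m in months)
    out["t_mean"] = sum(temps) / len(temps)   # April-August mean temperature
    out["t_range"] = max(temps) - min(temps)  # intraseasonal temperature range
    out["p_total"] = sum(precs)               # April-August total precipitation
    out["p_max"] = max(precs)                 # wettest month of the season
    return out

# Invented monthly values for a single (year, spatial unit) observation.
row = {
    "t_april": 16.0, "t_may": 20.0, "t_june": 24.0, "t_july": 27.0, "t_august": 25.0,
    "p_april": 0.2, "p_may": 0.4, "p_june": 0.1, "p_july": 0.6, "p_august": 0.3,
}
features = seasonal_aggregates(row)
```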
Table 10 shows that the mean seasonal temperature is within a relatively narrow range (22–24 °C, with a moderate standard deviation), while the intraseasonal temperature range exhibits significantly higher variability, reflecting years with abnormally cold spring months or hot summer months. The columns of Table 10 correspond to these features: five temperature aggregates (early, middle, late, seasonal mean, and intraseasonal range) and five precipitation aggregates (early, middle, late, seasonal total, and maximum monthly value). All the listed derived variables were used both in descriptive and correlation analyses (to study distributions and their relationships with the zasel_l indicator) and as additional input features in constructing models to predict the summer infestation index.
The precipitation indices for the early, middle, and late parts of the season indicate a noticeable spread between relatively dry and wet seasons, while the maximum monthly indicator characterizes the intensity of the wettest month within the growing season. Together, these derived indicators allow for a more compact and biologically meaningful incorporation of the seasonal climate background into models for predicting summer pest infestations.
As part of the spatial data analysis, long-term variability in the summer pest infestation index between administrative districts was assessed. For each district, the average summer pest infestation index was calculated for 2005–2022, and the 15 districts with the highest values were selected. This ranking allows us to identify persistent “hot spots” of phytosanitary risk and to verify the extent to which the training and validation samples cover the full range of spatial conditions, including both high- and low-infestation areas.
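The ranking step amounts to a group-by-district mean followed by a sort. A minimal sketch, assuming a hypothetical `district` field and invented values:

```python
from collections import defaultdict

def top_districts(rows, k=15):
    """Rank districts by their long-term mean infestation index (highest first)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for r in rows:
        totals[r["district"]] += r["zasel_l"]
        counts[r["district"]] += 1
    means = {d: totals[d] / counts[d] for d in totals}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)[:k]

# Invented toy observations; district labels are placeholders.
rows = [
    {"district": "A", "year": 2005, "zasel_l": 2.0},
    {"district": "A", "year": 2006, "zasel_l": 4.0},
    {"district": "B", "year": 2005, "zasel_l": 1.0},
    {"district": "C", "year": 2005, "zasel_l": 5.0},
    {"district": "C", "year": 2006, "zasel_l": 6.0},
]
ranking = top_districts(rows, k=2)
```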
Figure 2 shows the districts with the highest average summer infestation index zasel_l, aggregated across the entire study period. The leading districts (Esil, Zhaksy, Albasar, Kostanay, and Osakarovka) form a group of territories with significantly higher infestation levels, while districts on the right side of the diagram (e.g., Balkashino and Astana) demonstrate significantly lower average index values.
The bar labels show precise long-term averages, highlighting the severalfold differences between individual districts. This spatial ranking is used both to interpret biogeographic patterns of pest infestation and to subsequently analyze the robustness of forecasting models to varying levels of pest pressure.
To assess the statistical properties of the target variable, a preliminary univariate analysis of the summer pest infestation index zasel_l was conducted. Particular attention was paid to the distribution shape and the presence of extreme values, as these features determine the choice of quality metrics, the robustness of regression models, and the need for transformations or robust methods.
Figure 3 shows the distribution of zasel_l index values as a boxplot, where the central box represents the interquartile range (from the 25th to the 75th percentile), the horizontal line inside represents the median, and the whiskers represent the “typical” range of observations. As shown in the graph, most values are concentrated near the lower end of the scale. At the same time, numerous outliers are found above the upper whisker, including isolated extreme outbreaks with indices above 70–100. This pronounced right skew and heavy upper tail indicate rare but very intense pest outbreaks, which we interpret as genuine episodes of high pest pressure rather than measurement artifacts. This result was taken into account when selecting the combination of metrics (MSE, RMSE, MAE, and R²) and when developing hybrid models capable of adequately reproducing both the “background” level of infestation and rare extreme events.
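The outliers visible in a boxplot can be identified numerically. The sketch below assumes the common Tukey convention in which whiskers extend 1.5 interquartile ranges beyond the box; the text does not state which whisker rule was used for Figure 3, and the values are invented.

```python
import statistics

def boxplot_outliers(values, whisker=1.5):
    """Return values outside the Tukey fences together with the fences."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - whisker * iqr, q3 + whisker * iqr
    outliers = [v for v in values if v < lower or v > upper]
    return outliers, (lower, upper)

# Invented right-skewed sample: a low background plus one extreme outbreak.
toy_index = [0.2, 0.3, 0.4, 0.5, 0.5, 0.6, 0.8, 1.0, 75.0]
outliers, fences = boxplot_outliers(toy_index)

# A mean far above the median is a simple symptom of right skew.
skew_gap = statistics.fmean(toy_index) - statistics.median(toy_index)
```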
To quantify the contribution of agroclimatic factors to variation in the summer pest infestation index, a linear correlation analysis was first performed. Pearson correlation coefficients were calculated for the training set between average monthly temperature and relative moisture indices in April–August and the target variable, zasel_l. The resulting values were used as the primary criterion for feature significance in the subsequent construction and interpretation of hybrid models.
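For reference, the Pearson coefficient used here is the covariance of the two series divided by the product of their standard deviations. A minimal standard-library sketch, with invented values chosen so that warmer Aprils coincide with lower infestation:

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Invented training-set values for illustration only.
t_april = [12.0, 14.0, 15.0, 17.0, 19.0]
zasel_l = [3.0, 2.2, 2.0, 1.1, 0.4]
r = pearson(t_april, zasel_l)
```

Computing the coefficient on the training set only, as described above, keeps the validation and test years out of feature screening.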
Figure 4 shows horizontal bar graphs displaying the Pearson correlation coefficients between average monthly temperatures (upper panel) and precipitation indices (lower panel) for April–August and the zasel_l index.
Temperature indicators exhibit predominantly negative relationships with the target variable; the highest absolute correlation is observed for April temperature, underscoring the critical role of early spring conditions in subsequent pest infestations during the growing season. May and July temperatures are also associated with the infestation index, but to a somewhat lesser extent, whereas the correlation for June temperature is close to zero, which may indicate a more complex, possibly nonlinear, infestation response in mid-season.
Precipitation predictors are characterized by a relatively weak negative correlation with zasel_l: July precipitation shows the strongest association, while precipitation indices in other months have only a modest moderating effect on infestation density. Taken together, these patterns confirm that early spring temperatures are the leading linearly related risk factor, while precipitation plays a weaker but systematic modifying role.
Figure 5 shows how the XGBoost model redistributes “weight” between monthly climate features: the top panel shows temperature predictors, and the bottom panel shows humidity indicators. The diagram clearly demonstrates that among the temperatures, April temperature makes the most significant contribution to forecast quality, followed by June and August, while May and July play a more supporting role.
In the precipitation group, June has the highest importance, followed by April and August, with May and July precipitation being less significant. This importance distribution suggests that the model relies primarily on early spring temperature background and early summer moisture conditions to reconstruct the interannual variability of the zasel_l index, while the other months contribute additional, but less critical, information.
Figure 6 shows how the mutual information between zasel_l and the climate indicators is distributed across months during the growing season. The top panel shows that the most information about the target index is contained in April and May temperatures, while the contribution of summer temperatures (especially June and August) is significantly lower, indicating the critical role of early spring thermal conditions in determining subsequent pest infestations.
The bottom panel shows the mutual information values for precipitation indices: April and May precipitation prove to be the most informative, while June–August precipitation is characterized by lower values. Taken together, these results indicate that, for both temperature and moisture, the dominant influence on summer pest infestation index variations occurs during the preceding spring period, which justifies the inclusion of these features in the forecast models.
To analyze the contribution of individual temperature predictors to the summer pest infestation index forecast, a post hoc analysis of the XGBoost model was performed using the SHAP (SHapley Additive exPlanations) method. This approach allows us to move from a “black box” to a quantitative assessment of how specific temperature values in different months of the growing season shift the forecast toward an increase or decrease in infestation.
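To make the idea behind SHAP concrete, the sketch below computes exact Shapley values by averaging each feature's marginal contribution over all feature orderings. It is an illustration only: the toy linear model and its coefficients are invented, "absent" features are replaced by a single baseline observation, and this brute-force enumeration is not the tree-specific algorithm that SHAP libraries typically apply to XGBoost.

```python
from itertools import permutations

def toy_model(features):
    # Hypothetical response surface; coefficients are invented for illustration.
    t_april, t_june, p_april = features
    return 5.0 - 0.2 * t_april + 0.01 * t_june + 0.5 * p_april

def shapley_values(model, x, baseline):
    """Exact Shapley values: mean marginal contribution over all feature orders."""
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)   # start from the reference observation
        previous = model(current)
        for i in order:
            current[i] = x[i]      # switch feature i to its actual value
            value = model(current)
            phi[i] += value - previous
            previous = value
    return [p / len(orders) for p in phi]

x = [18.0, 24.0, 0.3]         # invented observation: warm, fairly dry April
baseline = [15.0, 24.0, 0.5]  # invented reference ("average") conditions
phi = shapley_values(toy_model, x, baseline)
```

In this toy case the warmer-than-baseline April receives a negative attribution, the unchanged June temperature receives zero, and the attributions sum to the difference between the prediction and the baseline prediction.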
Figure 7 shows how individual observations for t_april–t_august (color scale from low to high) are distributed along the SHAP axis, which reflects the feature’s contribution to the model output. Points shifted to the right correspond to increases in the predicted infestation index, while points shifted to the left correspond to decreases relative to the baseline. The widest “cloud” of points is observed for t_april, indicating the dominant influence of April temperature on forecast variability: high April temperatures (crimson) are predominantly associated with negative SHAP values, meaning they reduce expected infestation, while cooler years (blue) are more often associated with positive shifts in the prediction. For t_may and t_july, the contribution remains noticeable, but smaller in amplitude. For t_june and t_august, the spread of SHAP values is close to zero, indicating a relatively weak effect of these months in the temperature-oriented XGBoost module compared to April conditions.
To assess the contribution of monthly precipitation indices to the summer pest infestation forecast, the XGBoost gradient boosting model was interpreted using the SHAP (SHapley Additive exPlanations) methodology. This analysis allows us to track which combinations of moisture conditions in April–August cause the model to adjust the expected infestation level upward or downward relative to the average scenario.
Figure 8 shows the distribution of SHAP values for the p_april-p_august predictors, which characterize precipitation in individual months of the warm period. Each point represents one observation: the position on the horizontal axis shows the contribution of a specific predictor value to the XGBoost model output (SHAP value), and the color scale reflects the magnitude of the initial precipitation index (from drier conditions to abnormally wet periods).
The widest spread of SHAP values is observed for p_april and p_may, meaning that spring precipitation most often leads to a significant adjustment in the predicted summer infestation level, while precipitation values in June–August generally produce small absolute shifts around zero. This diagram indicates that the moisture regime at the beginning of the growing season has a stronger effect on the modeled infestation risk, while the overall effect of precipitation remains less pronounced than that of temperature factors and acts more as a modifying factor than a dominant driver of the model.
4.2. Evaluation of the Model’s Forecasting Quality
The evaluation was conducted on a held-out sample split by observation year, allowing us to test the models’ ability to predict years not used in training. Standard regression metrics were calculated for all approaches: mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), and the coefficient of determination (R²). The lower the MSE/RMSE/MAE values and the higher the R², the more accurately the corresponding model approximates the actual infestation index. The comparative analysis includes five “flat” baseline models (linear and ridge regression, random forest, XGBoost, and a fully connected neural network (MLP)) and five specialized temporal architectures using the wavelet transform of the annual series and/or hybridization with ARIMA/SARIMA, recurrent units (LSTM, GRU), and the proposed spatiotemporal transformer. This allows us to evaluate how much the gradual increase in architectural complexity and the explicit modeling of the time structure of the series improve forecast accuracy compared to traditional machine learning methods.
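The four metrics follow directly from their textbook definitions, with R² computed as one minus the ratio of the residual sum of squares to the total sum of squares. A minimal sketch with invented values:

```python
import math

def regression_metrics(y_true, y_pred):
    """MSE, RMSE, MAE, and coefficient of determination for paired sequences."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    mse = ss_res / n
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(e) for e in errors) / n,
        "R2": 1.0 - ss_res / ss_tot,
    }

# Invented observed and predicted index values.
metrics = regression_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
```

Because MSE and RMSE square the errors, they penalize missed outbreaks more heavily than MAE does, which is relevant for a heavy-tailed target.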
To complement the importance analysis for each model and ensure a unified interpretable framework, SHAP (SHapley Additive ExPlanations) values were calculated for all tabular models. Unlike raw coefficient values or impurity-based estimates, SHAP values quantify the marginal contribution of each predictor to the model’s output in a consistent and theoretically sound manner. This allows for direct comparison of feature influence across models, independent of internal algorithmic mechanisms.
Figure 9 presents a normalized SHAP importance matrix for different models, where each cell reflects the mean absolute SHAP value for a given feature for a specific model. Normalization ensures comparability across models with different output scales and allows for the identification of stable, consistently influential predictors.
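One way to build such a matrix is to average the absolute SHAP values per feature and then rescale each model's column so that it sums to one. The text does not specify the normalization used for Figure 9, so the sum-to-one rule below is an assumption, and the SHAP values are invented.

```python
def normalized_importance(shap_rows, feature_names):
    """Mean |SHAP| per feature, rescaled so one model's importances sum to 1."""
    n = len(shap_rows)
    mean_abs = [
        sum(abs(row[j]) for row in shap_rows) / n
        for j in range(len(feature_names))
    ]
    total = sum(mean_abs)
    return {name: value / total for name, value in zip(feature_names, mean_abs)}

features = ["t_april", "p_april"]

# Invented per-observation SHAP values for two models with different output scales.
shap_by_model = {
    "model_a": [[-0.4, 0.1], [0.2, -0.1]],
    "model_b": [[-4.0, 1.0], [2.0, -1.0]],
}
matrix = {
    model: normalized_importance(rows, features)
    for model, rows in shap_by_model.items()
}
```

The two toy models differ tenfold in output scale but yield the same normalized profile, which is the comparability the normalization is meant to provide.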
Figure 10 shows the SHAP-based feature importance for the proposed hybrid branching model, focusing on the ten most influential predictors. Values represent normalized mean absolute SHAP contributions, allowing for a direct assessment of the influence of each variable on the model output. The results indicate a clear dominance of recent temporal components. The largest contribution is observed for lag_1_raw, followed by lag_1_approx, confirming that the most recent raw and approximated signals provide the greatest predictive power. Variables corresponding to the second lag (lag_2_raw and lag_2_approx) also demonstrate a significant influence, although their contribution gradually decreases with temporal distance. Features associated with the third lag demonstrate noticeably lower importance, while the detail components (lag_*_detail) contribute only insignificantly to the final prediction. This pattern suggests that the hybrid architecture effectively prioritizes short-term temporal dynamics, while long-term and high-frequency, granular components play a supporting but secondary role. Overall, the SHAP analysis confirms that the proposed hybrid branching model captures a hierarchical temporal structure, emphasizing that the primary factors determining forecast accuracy are recent aggregated signals.
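The feature names in Figure 10 suggest that each lagged value enters the model as a raw value together with a smooth (approximation) and a residual (detail) component. The sketch below shows one way such features could be built; the two-point Haar-style averaging, the single decomposition level, and the helper names are assumptions, since the wavelet family and depth are not specified here.

```python
def haar_split(series):
    """Split a series into a two-point moving average and the remaining detail."""
    approx = [series[0]] + [
        (series[i] + series[i - 1]) / 2 for i in range(1, len(series))
    ]
    detail = [s - a for s, a in zip(series, approx)]
    return approx, detail

def lag_features(series, t, n_lags=3):
    """Features for predicting series[t] from the n_lags preceding years."""
    # Decompose only the years before t so the target year cannot leak in.
    approx, detail = haar_split(series[:t])
    feats = {}
    for k in range(1, n_lags + 1):
        feats[f"lag_{k}_raw"] = series[t - k]
        feats[f"lag_{k}_approx"] = approx[t - k]
        feats[f"lag_{k}_detail"] = detail[t - k]
    return feats

# Invented annual index values; the last entry is the year to be predicted.
annual_index = [2.0, 4.0, 3.0, 5.0, 4.0]
feats = lag_features(annual_index, t=4)
```

By construction each raw value equals the sum of its approximation and detail components, so the three feature groups separate the smooth level of the series from its year-to-year fluctuation.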
Table 11 shows how the quality metrics differ across the entire range of tested models. The baseline algorithms that work directly with agroclimatic features (LinearRegression, Ridge, RandomForest, XGBoost, MLP) explain only a moderate share of the variation in the infestation index and have significantly higher errors. Incorporating a wavelet decomposition of the annual series and residual learning (Wavelet_CNN_ARIMA, Wavelet_CNN_SARIMA models) significantly reduces RMSE and MAE, reflecting the benefit of explicitly modeling the temporal structure. Even greater gains are achieved with recurrent heads (Wavelet-CNN-LSTM, Wavelet-CNN-GRU). The highest accuracy is demonstrated by the proposed multi-branch hybrid model Wavelet_CNN_SARIMA_GRU_STTrans, which records the minimum error values (RMSE = 0.0223; MAE = 0.0178) and the maximum coefficient of determination R², indicating an almost complete reproduction of the dynamics of the summer infestation index in the validation sample.
Further analysis revealed that the differences in forecast performance are consistent with how each model exploits the structure of the input data and its temporal smoothness. A brief discussion of the performance of all approaches is provided below.
Linear regression and ridge regression with α = 0.1 yield virtually identical values for MSE, RMSE, MAE, and R². This suggests that the feature matrix is well-conditioned and the level of multicollinearity between agroclimatic parameters is low: mild L2 regularization barely changes the OLS solution. However, the limitations of the linear approximation prevent these models from fully capturing the nonlinear response of pest infestations to temperature and precipitation extremes, which is reflected in a moderate R².
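The near-equivalence of OLS and ridge at small α is easy to see in the single-predictor case, where the ridge slope is the OLS slope shrunk by the factor Sxx / (Sxx + α). The sketch below assumes an unstandardized predictor and an unpenalized intercept; the data are invented.

```python
def fitted_slope(x, y, alpha=0.0):
    """Slope of a one-feature linear fit; alpha > 0 gives the ridge solution."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    # The penalty adds alpha to the predictor's sum of squares.
    return sxy / (sxx + alpha)

# Invented values: April temperature against the infestation index.
t_april = [12.0, 14.0, 15.0, 17.0, 19.0]
zasel_l = [3.0, 2.2, 2.0, 1.1, 0.4]

ols = fitted_slope(t_april, zasel_l)
ridge = fitted_slope(t_april, zasel_l, alpha=0.1)
relative_change = abs(ols - ridge) / abs(ols)
```

When the predictor's sum of squares is large relative to α, as in this example, the shrinkage is well below one percent, so the two models produce almost the same forecasts.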
Random forests and XGBoost, despite their more flexible, tree-like structure, are inferior to linear models in terms of accuracy (especially XGBoost, which has the worst R²). This can be explained by a combination of two factors: first, the relatively small number of training years, on which high-capacity ensemble models begin to overfit and fit noise; second, the high aggregation of features (annual means and sums), which means that complex nonlinear partitions of the feature space do not provide significant additional gains compared to a simple linear relationship. As a result, trees capture local fluctuations but do not enhance the model’s predictive ability in later years.
The fully connected MLP neural network achieves RMSE and R² values close to those of ensembles, while its mean absolute error is slightly higher. With a relatively small number of annual observations, MLP, like other highly parametric architectures without explicit consideration of temporal structure, is sensitive to the choice of initial weights and to random noise in the target variable. The lack of built-in mechanisms for modeling sequences (recurrent or temporal convolutional) means the neural network solves a static regression problem, limiting its potential.
The Wavelet-CNN model with an ARIMA base (Wavelet_CNN_ARIMA) demonstrates a significant performance improvement: a sharp decrease in RMSE and an increase in R² to 0.81. This result is logical, since the two-stage “ARIMA + residual CNN” scheme decomposes the problem into a forecast of the smooth component of the time series (base ARIMA) and a forecast of high-frequency deviations (wavelet convolutions). ARIMA describes the inertial trend and weak autocorrelation of the annual index well, and the convolutional block compensates for local anomalies associated with individual extreme seasons.
Replacing ARIMA with SARIMA in the Wavelet_CNN_SARIMA model results in a slight performance degradation. Given the relatively short annual series, introducing a seasonal component with a 3-year period increases the number of parameters and complicates model estimation, which can lead to overfitting and less stable seasonality estimates. As a result, the residuals fed to the CNN branch become noisier, making it more difficult for the convolutional part to recover the useful signal.
Adding recurrent layers in the Wavelet-CNN-LSTM and Wavelet-CNN-GRU architectures partially improves the results compared to purely residual models. LSTM and GRU accumulate information from several previous years, thereby helping the CNN unit align high-frequency fluctuations with longer-term trends. The LSTM head yields a slightly higher R², whereas the GRU variant achieves a somewhat lower MAE. This is consistent with the known properties of these units: LSTM retains long-term memory better, while GRU is more compact and less prone to overfitting on small samples.
The best performance is achieved by the proposed multi-branch hybrid model Wavelet_CNN_SARIMA_GRU_STTrans. Here, the base SARIMA reproduces the regular component of the series, the first CNN branch refines local wavelet characteristics, the second CNN-GRU branch captures medium-term dynamics, and the spatiotemporal transformer focuses on complex long-term dependencies and interactions across scales. Joint training of these three representations and their subsequent fusion in a shared fully connected block enables multi-stream signal processing and effective noise suppression. This multi-level decomposition leads to minimal error values (RMSE and MAE are an order of magnitude lower than those of other models) and a very high R², indicating near-optimal use of information about the temporal structure of the infestation index.
Table 12 presents the results of the ablation study, quantifying the contribution of key architectural modules to forecasting performance. The full Wavelet_CNN_SARIMA_GRU_STTransformer model demonstrates the best results (R² = 0.8866; RMSE = 0.0325) and serves as the benchmark configuration. A slight performance degradation after removing SARIMA indicates that wavelet coding combined with GRU is already able to capture significant nonlinear dynamics, while the model with SARIMA alone shows significantly worse results, confirming the limitations of purely linear modeling. Removing the ST-Transformer leads to a noticeable decrease in accuracy, highlighting the importance of the attention mechanism for integrating contextual dependencies. The most significant reduction occurs when removing the wavelet decomposition, demonstrating the critical role of time-frequency feature extraction. Overall, the results confirm that stable prediction performance is achieved through the joint interaction of wavelet preprocessing, sequential modeling, and attention-based integration.
Figure 11 shows the change in the average summer infestation index of grain crops by the pest Phyllotreta vittula over the training years (blue dots) and validation years (green dots), as well as the corresponding predicted values of the linear regression model for the period 2019–2022 (orange dots).
The model generally reproduces the level and interannual variability of the infestation index, providing close agreement with observations in the validation interval and only moderately smoothing the extreme values of the series.
As shown in Figure 12, the Random Forest model satisfactorily describes the dynamics of the infestation index during the training interval. During the validation period (2019–2022), however, its predictive ability is limited: the predicted values systematically overestimate the observed index and poorly reflect interannual variability, smoothing out local minima and maxima.
This indicates that the model is not adequate for accurately predicting summer infestation with the given configuration of agroclimatic predictors.
As shown in Figure 13, the XGBoost model fits the training set satisfactorily. Its predictive ability over the 2019–2022 validation interval, however, is limited: the predicted values systematically overestimate the observed infestation index and poorly reflect the actual interannual variability.
The model significantly smooths out fluctuations in the series, fails to reproduce declining dynamics in individual years, and exhibits a bias toward higher risk levels than suggested by monitoring observations. Taken together, the results for the two tree-based models, Random Forest and XGBoost, show that decision tree ensemble algorithms in this setting do not provide sufficiently accurate forecasts of summer infestation of grain crops by the pest Phyllotreta vittula. This is likely due to a combination of the relatively small volume and high noise of the initial data, a weak and nonlinear relationship between agroclimatic predictors and the infestation index, and the tendency of tree-based models to overfit to limited samples with complex feature structures. As a result, they approximate training years well but produce biased, less robust estimates in independent validation years, making their use as the primary tool for operational forecasting risky without additional structure simplification or regularization.
As shown in Figure 14, the baseline multilayer perceptron model satisfactorily approximates the summer infestation index levels during the training interval without significant overfitting and retains the general shape of the interannual dynamics.
Over the 2019–2022 validation interval, the predicted values are close to the observed values, but a systematic tendency to underestimate the index relative to actual monitoring estimates is noticeable across all years. The amplitude of interannual fluctuations in the forecast is also smoothed: the model underestimates peaks and slightly overestimates minimum values, reflecting the MLP’s tendency to smooth given the limited sample size and the high noise of the agroclimatic predictors.
Overall, the results in Figure 14 indicate that the basic multilayer perceptron can extract robust nonlinear relationships between a set of temperature and precipitation parameters and the infestation index. Still, with the current architecture and data volume, the model provides only moderately accurate forecasts. The presence of a systematic downward bias and the smoothing of interannual variability indicate the need for additional tuning (regularization, changing the network depth, revising the feature set) or a transition to more specialized architectures better suited to small, noisy agroclimatic time series.
Figure 15 shows the dynamics of the average annual index of summer infestation of grain crops by the pest Phyllotreta vittula in the years of training the Wavelet_CNN_SARIMA model (blue dots) and validation (green dots), as well as the corresponding predicted values for the period 2019–2022 (orange dots). The hybrid wavelet–convolution–SARIMA model satisfactorily reproduces both the level and the shape of the index’s interannual variability in the training interval, without significant overfitting, and preserves sharp increases and subsequent decreases in infestation. In the validation interval, the predicted values practically coincide with the observed estimates, showing only minor systematic deviations and correctly reflecting the weak downward trend and the decrease in the amplitude of interannual fluctuations.
This behavior indicates the successful combination of the advantages of wavelet decomposition, which allows the extraction of informative frequency components of the index; the parametric SARIMA component, which describes the temporal dependence and residual seasonality; and the convolutional block, which extracts nonlinear relationships between agroclimatic predictors and the transformed infestation series. Taken together, the results presented in Figure 15 demonstrate the high predictive power of Wavelet_CNN_SARIMA and its superiority in describing the annual infestation index dynamics compared to baseline linear, tree-based, and simple neural network models.
As shown in Figure 16, the hybrid Wavelet_CNN_ARIMA model accurately reproduces the interannual dynamics of the average annual summer infestation index in both the training interval and the 2019–2022 independent validation period. The predicted values (orange line) are virtually identical to the observed index estimates (green line), showing only a slight systematic tendency to underestimate infestation levels in the final years of the series. Moreover, the model accurately captures the index’s smoothly declining trajectory and the amplitude of its interannual fluctuations, without over-smoothing or creating artificial extremes.
The results indicate that the combination of wavelet decomposition, a convolutional block, and an ARIMA component enables effective extraction of local temporal patterns in the infestation series, with a parametric description of the linear time dependence. Taken together, this provides higher predictive accuracy than basic linear, tree-based, and simple neural network models, particularly in reproducing the structure of interannual index variability with a limited time series length.
As shown in Figure 17, the Wavelet_CNN_LSTM hybrid model reproduces the interannual dynamics of the average annual index with fair accuracy.
The predicted values (orange line) lie close to the observed index estimates (green line), with only moderate smoothing of the amplitude of interannual fluctuations and a slight downward shift in the final years of the series. Moreover, the model accurately captures the weak downward trend and relative stability of the index during the validation period, without producing clearly over- or underestimated population peaks. This forecast quality demonstrates that the combination of wavelet decomposition, convolutional local pattern extraction, and the LSTM recurrent unit effectively accounts for both short-term and longer-term time dependencies in the population index dynamics with a limited time series length. Overall, the results for Wavelet_CNN_LSTM confirm the high predictive power of this architecture and its superior stability and accuracy compared to baseline linear, tree-based, and simple MLP models. However, the degree of smoothing of extreme values must be taken into account when interpreting risks for plant protection systems.
As Figure 18 shows, the Wavelet_CNN_GRU hybrid model satisfactorily reproduces the interannual dynamics of the average annual summer infestation index of grain crops by the pest Phyllotreta vittula both in the training interval and during the 2019–2022 independent validation period. During validation, the predicted values (orange line) are close to the observed estimates (green line), slightly overestimating the index levels in the first years of the interval and smoothly converging to the actual values toward the end of the series. The model correctly reproduces the weak downward trend and relatively small amplitude of interannual fluctuations, without creating artificial spikes in infestation or excessively smoothing the dynamics.
The results demonstrate that the combination of wavelet decomposition, a convolutional block, and a GRU recurrent unit effectively extracts informative temporal patterns from short, noisy infestation index series. The wavelet transform highlights significant frequency components, the CNN block generalizes local structures, and the GRU units capture both short- and long-term dependencies. Together, this ensures high predictive accuracy and robust estimates, surpassing baseline linear and tree-based models while keeping the risk dynamics interpretable and consistent with biological concepts.
As shown in Figure 19, the proposed Wavelet-SARIMA-GRU Spatio-Temporal Hybrid model almost completely reproduces the dynamics of the average annual summer infestation index over the entire observation interval.
During the training period, the model adequately captures both the index levels and the shape of interannual fluctuations, showing neither obvious overfitting nor excessive smoothing of extremes. Over the 2019–2022 validation interval, the predicted values (orange line) are virtually identical to the observed index estimates (green line): there is no consistent upward or downward bias, and the point errors are small and comparable to the uncertainty of the field observations themselves. According to the summary quality assessment results (metrics table), the Wavelet-SARIMA-GRU hybrid model exhibits the lowest error values (MSE, RMSE, MAE) and the highest coefficient of determination (R²) among all the options considered, including linear regression, decision tree ensembles, the baseline MLP, and other wavelet-hybrid architectures. This advantage is explained by the combination of three complementary components: a wavelet decomposition, which identifies informative multiscale components of the index; a SARIMA block, which describes the linear trend-seasonal structure of the time series; and a recurrent GRU module, which captures residual nonlinear and long-term dependencies due to variations in agroclimatic factors and spatial heterogeneity. As a result, the model provides the best compromise between forecast accuracy and robustness of estimates, making it the most promising tool for operational forecasting of the seasonal risk of crop infestation by the Phyllotreta vittula pest in conditions with limited historical data.
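For reference, the four quality metrics named above can be computed as in the following sketch. It uses the standard textbook definitions; the function name `regression_metrics` and the sample values are hypothetical and are not taken from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R^2 for paired observations and forecasts."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)              # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    mse = ss_res / n
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(e) for e in errors) / n,
        # R^2 is undefined when the observed series is constant.
        "R2": 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
    }


# Hypothetical observed and predicted index values (illustrative only).
observed = [1.0, 2.0, 3.0]
predicted = [2.0, 2.0, 2.0]
metrics = regression_metrics(observed, predicted)
```

In this toy case the forecast equals the mean of the observations at every point, so R² is zero: the model explains no more variance than a constant baseline. With only a few validation years, R² is sensitive to single points and is best read alongside the absolute error measures.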
The temporal evaluation strategy used in this study strictly preserved the chronological structure of the data. Future years were never included in the training set, and each test year was evaluated only with models trained on previous observations, without any temporal shuffling. For tabular machine learning models (Ridge, Random Forest, and XGBoost), this procedure was implemented as an 8-fold expanding-window cross-validation, in which the training window was successively extended and each fold corresponded to a subsequent test year. For hybrid and neural network models, evaluation was based on rolling time sequences constructed solely from past observations, followed by a fixed hold-out validation on subsequent years. Consequently, although chronological logic was consistently preserved across all experiments and no temporal leakage occurred, the validation scheme was not identical for all model classes. Cross-validation was used for hyperparameter selection and stability assessment for tabular models, while hybrid and deep architectures were evaluated using training sequences containing only historical data, combined with a fixed temporal hold-out test. This clarification resolves an apparent discrepancy between the previously described cross-validation procedure and the fixed-time evaluation used in the final comparative experiments.
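The expanding-window scheme described for the tabular models can be sketched as follows. This is a minimal illustration, not the implementation used in the study: the helper name `expanding_window_folds` and the 2010–2022 year range are assumptions chosen so that eight test years are available.

```python
def expanding_window_folds(years, n_folds):
    """Yield (train_years, test_year) pairs in chronological order.

    Each of the last n_folds years serves once as the test year, and the
    training set contains every earlier year, so it grows by one year
    per fold and never includes the test year or any later year.
    """
    ordered = sorted(set(years))
    if n_folds < 1 or n_folds >= len(ordered):
        raise ValueError("need at least one training year before the first test year")
    for test_pos in range(len(ordered) - n_folds, len(ordered)):
        yield ordered[:test_pos], ordered[test_pos]


# Hypothetical observation years (assumed range, for illustration only).
years = range(2010, 2023)
folds = list(expanding_window_folds(years, n_folds=8))

for train_years, test_year in folds:
    print(f"train {train_years[0]}-{train_years[-1]} -> test {test_year}")
```

In a panel setting, rows would then be selected by year, so that all spatial units of a given year fall on the same side of the split. This is what prevents leakage between regions observed in the same season.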