Improving PDSI Z-Index Prediction with Ensemble Learning: A Case Study from the Troy Region of Türkiye

Mucan, Umut; Arslantaş Civelekoğlu, Ebru Elif

doi:10.3390/su18041752

by Umut Mucan ¹,* and Ebru Elif Arslantaş Civelekoğlu ²

¹ Department of Agricultural Structures and Irrigation, Faculty of Agriculture, Çanakkale Onsekiz Mart University, Çanakkale 17100, Türkiye

² Department of Biosystems Engineering, Faculty of Agriculture, Aydın Adnan Menderes University, Aydın 09970, Türkiye

* Author to whom correspondence should be addressed.

Sustainability 2026, 18(4), 1752; https://doi.org/10.3390/su18041752

Submission received: 10 November 2025 / Revised: 2 February 2026 / Accepted: 4 February 2026 / Published: 9 February 2026

(This article belongs to the Section Sustainable Water Management)

Abstract

Climate change is expected to intensify droughts, thereby increasing the need for reliable predictive tools. In this study, one-month-ahead forecasts of the Palmer Z-Index were generated using long-term monthly data from two meteorological stations (17112 Çanakkale and 18084 Biga) located in the Troy region. The input features included current and lagged meteorological variables, multi-month rolling statistics, and seasonal encodings. Eight machine learning models, including linear and ensemble tree-based approaches, were evaluated using time series cross-validation. Drought events were defined based on Palmer Z-Index and standardized drought indicators, and model performance was assessed using commonly adopted accuracy and detection measures. Shapley Additive Explanations (SHAP) analysis was used to quantify the feature contributions. Gradient Boosting achieved the highest predictive accuracy at the main station, while XGBoost and CatBoost also performed strongly. High accuracy was maintained at the second station, demonstrating the spatial robustness of the model. The machine learning-predicted Palmer Z-Index values showed strong agreement with observed hydrological drought conditions; severe drought events were detected with high confidence and low false alarm rates. SHAP results identified precipitation inputs as the most dominant driver of Z-Index variability. Overall, the findings suggest that ML-based models can provide timely and interpretable forecasts for operational drought early warning systems. Nonetheless, further research is needed to test the generalizability of these findings under different climate regimes and data conditions.

Keywords: Palmer drought severity index (PDSI); machine learning; drought monitoring; time series prediction; water resource planning

1. Introduction

In recent years, the frequency and severity of extreme climate events such as droughts and floods have increased worldwide. These events cause serious harm to people and are expected to continue as the climate changes [1,2]. Risks are rising in particular for the environment, the economy, and society, all of which depend on water resources.

Drought is defined as a prolonged condition of water scarcity caused by precipitation deficits and intensified atmospheric evaporative demand driven by elevated temperatures [3]. It is a natural phenomenon that adversely affects land, water resources, and production systems, leading to significant hydrological imbalances [4]. Drought monitoring is critical because it provides early warning of adverse conditions such as reduced water resources, declines in agricultural production, ecosystem imbalance, and an increase in extreme weather events like floods, thereby helping to mitigate their impacts [5,6,7]. More than 20 drought indices have been developed to monitor and evaluate the different dimensions of drought (meteorological, hydrological, agricultural, and socioeconomic), and some are commonly used at different stages of the hydrological cycle [1,2]. Among these indices, the Standardized Precipitation Index (SPI), Standardized Precipitation Evapotranspiration Index (SPEI), Palmer Drought Severity Index (PDSI), and Standardized Runoff Index (SRI) are widely referenced and applied in regional and global studies [8,9]. These four core indices provide complementary information for identifying different types of drought, forming an important scientific basis for regional water resource management, early warning systems, and drought risk planning.

Several widely used drought indices differ in terms of their input variables and the aspects of drought they represent. The SPI characterizes meteorological drought based solely on precipitation data, whereas the SPEI also incorporates evaporation and temperature effects, thus providing a more comprehensive reflection of the impacts of climate change [9,10,11]. The PDSI is effective in assessing the severity and duration of long-term droughts by relying on soil moisture balance [12]. The SRI, on the other hand, utilizes streamflow data to determine the severity and duration of hydrological drought [13]. Although SPI and SPEI offer valuable insights into short- and medium-term drought processes, the PDSI stands out as a more comprehensive tool because it integrates multiple climatic and hydrological variables. The PDSI is not only based on precipitation but also relies on the components of precipitation, temperature, potential evapotranspiration, and soil moisture. In this respect, it offers a more holistic assessment than other indices. The onset, duration, and severity of drought events can be determined in detail by focusing on the soil water balance [14,15]. It stands out as a strong indicator in long-term drought analyses and trend examinations. Additionally, because of its capacity to evaluate both meteorological and agricultural droughts [16], it is frequently preferred in studies on basin management, water resource planning, and modeling the impacts of climate change. Since drought is dependent on temperature and precipitation, PDSI is considered more suitable than other indices for assessing the potential impacts of climate change on future droughts [12,15].

The Palmer Drought Severity Index (PDSI) is widely recognized for its strong physical basis and comprehensive representation of soil–water balance processes [8]. However, its operational implementation involves a multi-step water balance framework, several intermediate variables, and careful parameterization. Although improved versions such as the scPDSI have enhanced its climatological consistency [17,18], the practical application of the Palmer framework in large datasets, multi-station analyses, and real-time or predictive contexts may still require substantial preprocessing, computational organization, and methodological expertise. These aspects can pose challenges when the PDSI is used as part of data-intensive modeling workflows. Recognizing these challenges, this study proposes machine learning-based approaches to complement the traditional Palmer framework by providing a data-driven approximation of the PDSI. Rather than replacing the physically based formulation, the proposed framework is intended to support PDSI-oriented analyses by offering a flexible and interpretable modeling alternative that can be integrated into data-intensive drought studies. In this sense, the focus is placed on methodological robustness and predictive reliability rather than on explicit reductions in computational burden. The proposed approach aims to facilitate the analysis of large, multi-station hydroclimatic datasets and to improve the practical usability of PDSI-based drought indicators in predictive applications without altering the underlying physical assumptions of the Palmer methodology.

The increasing impacts of global climate change have made the development of new, practical, and reliable approaches for evaluating drought conditions an urgent necessity [19,20]. In this context, data-driven methods such as artificial neural networks (ANN), support vector machines (SVM), linear regression (LR), and decision trees (DT) have been intensively investigated for monitoring, assessing, and predicting droughts [21,22]. Data-driven models have become increasingly common and effective tools for drought prediction owing to their ability to handle the nonlinear processes encountered in the calculation of indices such as the Palmer Drought Severity Index (PDSI) [22,23].

In this study, various machine learning algorithms were applied to explore alternative modeling approaches that can better capture and predict the nonlinear structure of the PDSI [24]. The study also presents a modeling framework that uses data-driven approaches to efficiently approximate drought conditions represented by the Palmer drought family. Here, one-month lead-time prediction refers to forecasting the Palmer Z-Index for the subsequent month, denoted as Z(t + 1), which is the primary prediction target of the proposed framework. Based on long-term monthly hydrometeorological records, the proposed approach aims not only to reproduce current drought conditions but also to provide short-term predictive capability. The framework enables the evaluation of evolving drought conditions from observed meteorological variability, supporting timely drought monitoring and early warning. The developed models therefore contribute both to monitoring past and current conditions and to short-term forecasting of drought dynamics, with significant potential for strengthening decision-support mechanisms in water resources planning [25,26,27]. Focusing on Çanakkale (1940–2024), we integrated seasonal encodings, lagged variables, and rolling aggregates with machine learning models to capture both short-term anomalies and multi-scale persistence in drought dynamics. This approach is designed not only to improve computational efficiency but also to enhance the operational applicability of the PDSI for timely decision-making. It is intended to complement traditional, computation-heavy PDSI workflows, thereby supporting timely monitoring, early warning, and climate-risk-aware water resource planning.

The analysis utilizes long-term data from the Çanakkale Central and Biga meteorological stations, with the latter located near the Bakacak Dam catchment, thereby improving the spatial representativeness and robustness of regional drought characterization. By combining multi-station meteorological data with dam–catchment interactions, the proposed methodology provides a more reliable assessment of drought conditions, particularly in agriculturally managed regions.

2. Materials and Methods

2.1. Case Study

The province of Çanakkale is located in the northwest of Türkiye, between 39°27′–40°45′ north latitudes and 25°40′–27°30′ east longitudes, with an area of approximately 9933 km². The region lies within the Marmara transitional climate zone and is characterized by rainy winters and dry summers under the combined influence of the Mediterranean and Black Sea climate systems [28]. The topography of Çanakkale is heterogeneous. Elevation begins with the Kaz Mountains in the north (highest point: 1767 m) and gradually decreases toward the low plains, deltas, and coastal flats along the Aegean and Marmara Sea shorelines. This topographic diversity leads to notable differences in the spatial distribution of precipitation, evaporation, and hydrological responses. Thus, drought analyses in this region must consider both temporal and spatial variability. From a hydrological perspective, the province of Çanakkale has a complex water system comprising wetlands, streams, and dam reservoirs. The main streams are Karamenderes, Sarıçay, Tuzla, Umurbey and Kocabaş. The Atikhisar, Bakacak, Bayramdere, and Çokal dams are strategically important for drinking water supply, irrigation, and flood control. In particular, the Atikhisar Dam, with a storage capacity of 54 hm³ and an irrigation area of 3069 ha, provides most of the drinking water for Çanakkale. The Bakacak Dam, located on the Biga Plain, has a total storage capacity of approximately 136 hm³ and is used for the irrigation of approximately 9000 ha of agricultural land [28,29]. Çanakkale province hosts an extensive network of wetlands that are inhabited by more than 317 bird species. The Kavak Delta, Suvla (Tuz) Lakes, Gökçeada Lagoon, Biga Stream, Çardak Lagoon, Sarıçay Delta, Umurbey Lagoon, and Kumkale Marshes are among the region’s most important wetland systems (Figure 1). These areas help regulate water levels, support ecological balance, and serve as natural buffers against hydrological changes [28].

Çanakkale is a sensitive region for drought analyses owing to its topographic diversity, numerous dams and wetlands, irrigation-based agricultural activities, and variable climate conditions. Long-term analyses of drought conditions showed that severe droughts occurred, especially in 1997, 2009, and 2020, whereas moderate-to-severe drought events during the 2017–2018 period had a marked negative impact on reservoir levels and irrigation efficiency [29,30]. Evaluations based on the Standardized Precipitation Index (SPI) reveal that irregularities in the precipitation regime directly affect both agricultural production and water resource management.

In this study, long-term data from the Çanakkale Central Meteorological Station and Biga Meteorological Station were used to examine the region’s drought dynamics more representatively. The main rationale for including the Biga Meteorological Station in the analysis was its spatial proximity to the Bakacak Dam Basin and the critical role of this dam in agricultural irrigation activities in the region. This approach aims to reduce the limitations of regional generalizations made using data from a single station and strengthen the representativeness of hydro-meteorological variables in different sub-basins. The use of multiple meteorological stations in this study and the consideration of dam–basin relationships contribute to making the findings more representative and interpretable at the regional scale.

2.2. Data

The meteorological data used in this study were obtained from the General Directorate of State Meteorological Services (TSMS) in Türkiye. Long-term monthly data from two meteorological stations were utilized in the analyses: Çanakkale Central Meteorological Station (ID: 17112) and Biga Meteorological Station (ID: 18084) (Figure 1).

The dataset from the Çanakkale Central Meteorological Station covers the period from January 1940 to December 2024, representing 85 years of uninterrupted monthly data. The data from the Biga Meteorological Station covered the period from January 1984 to December 2024, comprising a 41-year monthly observation series.

The meteorological parameters used for both stations included monthly total precipitation (mm), monthly average temperature (°C), relative humidity (%), atmospheric pressure (hPa), and wind speed (m/s). Additionally, the available soil water capacity (AWC, mm), which is required for Palmer-based drought calculations, was incorporated into the model. The predominant soil type in the Çanakkale and Biga regions is clay–loam, which is characterized by its high moisture retention capacity. Accordingly, the AWC was fixed at 100 mm, a value commonly adopted in the literature for fine-textured soils [31].

In this study, we adopted the Palmer drought framework and calculated the Palmer Z-Index using monthly water balance components. The main reason for selecting the Z-Index is its rapid response to short-term monthly moisture anomalies and its direct relationship with hydro-meteorological variables. Within this scope, model inputs were composed of the relevant month’s hydro-meteorological variables and Z(t) values; machine learning models were structured to predict the Palmer Z-Index value of the following month, Z(t + 1).

All datasets were harmonized into monthly time steps, and missing data, outliers, and temporal consistency were subjected to quality control. The resulting dataset enables a joint assessment of hydro-climatic variability in the province of Çanakkale and the Biga sub-basin and forms the basis for subsequent Palmer-based drought computations and the machine learning modeling framework.

2.3. Computation of Palmer Drought Indices

In this study, the Palmer drought framework was used as a basis, and the Palmer Z-Index was calculated using monthly water balance components. The Palmer approach is based on a water balance accounting system that considers the deviation of actual precipitation from climatically appropriate precipitation [32,33]. In this system, total precipitation (P) and potential evapotranspiration (PE) are the main climatic inputs, whereas the available soil water capacity (AWC) represents the effective moisture-holding capacity of the soil.

Because the predominant soil texture around Çanakkale and Biga is clay–loam, the AWC was fixed at 100 mm, based on a value widely used in the literature for fine-textured soils [31,32]. In the Palmer method, potential evapotranspiration (PE) was calculated using the temperature-based empirical formulation proposed by Thornthwaite [34], as it is suitable for situations where radiation data are limited and long-term temperature data are available.

$$PE = 16\,C\left(\frac{10\,T}{I}\right)^{a} \qquad (1)$$

Here, T represents the monthly average temperature (°C), I is the annual heat index, a is the empirical exponential value derived from the index, and C is the latitude-daylength correction factor. In the Palmer algorithm, the soil profile is conceptualized as a two-layer structure (surface and subsoil), and the water storage in each layer is updated based on monthly precipitation, recharge, surface runoff, and evapotranspiration processes [35]. Within this framework, for each month, the components of recharge (PR), surface runoff (RO), loss (L), and actual evapotranspiration (ET) were calculated; using these terms, climatically appropriate precipitation (P_CAFEC), which is needed to sustain normal soil moisture under local climate conditions, was estimated [36]. The difference between the observed precipitation and this value is defined as the monthly moisture anomaly (d):

$$d = P - P_{CAFEC} \qquad (2)$$

Monthly moisture anomalies were standardized using the climatic weighting factor (K), which considers regional precipitation variability, to obtain the Palmer Z-Index. The Z-Index is an anomaly indicator that reflects short-term (monthly) moisture surplus and deficit conditions and was preferred in this study because its structure can be directly linked to hydro-meteorological variables.

In this study, only the Palmer Z-Index was used in the machine learning models. The model inputs were composed of hydro-meteorological variables and Z(t) values for the relevant month, and the models were structured to predict the Palmer Z-Index value for the following month, Z(t + 1).

All Palmer-based calculations were performed in the R software environment (R version 4.0.4, 2021) using the scPDSI package (version 0.1.3) published on CRAN. The package is based on the algorithmic framework developed by [32,33], which allows the calculation of Palmer components from monthly precipitation and potential evapotranspiration data. Data processing, temporal alignment, and quality control steps were performed using customized R scripts.
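
As an illustration of Equation (1), the following Python sketch evaluates the Thornthwaite formula for twelve monthly mean temperatures. It is not the code used in this study, where the Palmer calculations were performed in R. The function name `thornthwaite_pe` and its arguments are hypothetical, the correction factor C is assumed to be supplied by the caller, and the heat index I and exponent a follow the standard Thornthwaite definitions. Thornthwaite's adjustment for very high temperatures is omitted for brevity.

```python
import numpy as np

def thornthwaite_pe(monthly_temp_c, correction):
    """Monthly potential evapotranspiration (mm) following Eq. (1).

    monthly_temp_c: 12 monthly mean temperatures in deg C (hypothetical input)
    correction:     12 latitude/day-length factors C, assumed to be precomputed
    """
    # Months at or below 0 deg C contribute no heat and yield PE = 0
    t = np.clip(np.asarray(monthly_temp_c, dtype=float), 0.0, None)
    c = np.asarray(correction, dtype=float)

    # Annual heat index I: sum of the twelve monthly indices (T / 5)^1.514
    heat_index = np.sum((t / 5.0) ** 1.514)
    if heat_index == 0.0:
        return np.zeros_like(t)

    # Empirical exponent a, a cubic polynomial in I
    a = (6.75e-7 * heat_index**3 - 7.71e-5 * heat_index**2
         + 1.792e-2 * heat_index + 0.49239)

    return 16.0 * c * (10.0 * t / heat_index) ** a
```

With all correction factors set to one, the function returns the unadjusted Thornthwaite estimate; an operational calculation would replace them with latitude- and month-specific values.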

2.4. Feature Engineering

In this study, feature engineering was designed to represent the temporal continuity, seasonality, and hydroclimatic memory characteristics of monthly hydrometeorological variables, and the problem was addressed within the framework of a one-month-ahead forecast Z(t + 1). All features were generated using only the information available up to time t. The feature space was created in five groups: basic meteorological variables, lagged features, moving window statistics, seasonality encodings, and forward-shifted target definitions. The structures of the feature sets used in the different modeling pipelines are summarized in Table 1.

Monthly total precipitation, average temperature, relative humidity, atmospheric pressure, and wind speed were used as primary inputs. Lags of 1, 3, 6, and 12 months were generated in all pipelines; additionally, a 2-month lag was included within the multi-model framework to reinforce short-term continuity (Table 1). To represent cumulative hydroclimatic effects, moving window statistics of 3, 6, and 12 months were derived; in the tree-based pipeline with hyperparameter tuning, rolling features were calculated after applying shift(1), so that each window ends at month t − 1 and information leakage is prevented (Table 1). Seasonality was modeled using sine and cosine transformations of the month information (month_sin, month_cos); in the multi-model pipeline, the calendar year was also used to represent long-term variability (Table 1). In all pipelines, the target variable was defined by shifting the Z-Index series forward by one month, Z(t) → Z(t + 1). Scaling was applied only for models sensitive to scale (linear regression, Elastic Net, SVR) using StandardScaler implemented in the scikit-learn library (version 1.6.1); for tree-based methods, no scaling was performed (Table 1).
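
The feature groups described above can be sketched with pandas as follows. This is a minimal illustration rather than the pipeline used in the study: the column names `precip` and `z`, the function name `build_features`, and the choice of rolling sums and means are assumptions, and only precipitation is processed to keep the example short.

```python
import numpy as np
import pandas as pd

def build_features(df, lags=(1, 3, 6, 12), windows=(3, 6, 12)):
    """Build lagged, rolling, and seasonal features plus the Z(t+1) target.

    df: monthly DataFrame with a DatetimeIndex and hypothetical columns
        'precip' (monthly precipitation) and 'z' (Palmer Z-Index).
    """
    out = pd.DataFrame(index=df.index)
    out["precip"] = df["precip"]
    out["z"] = df["z"]

    # Lagged copies of the inputs (values from k months earlier)
    for k in lags:
        out[f"precip_lag{k}"] = df["precip"].shift(k)
        out[f"z_lag{k}"] = df["z"].shift(k)

    # Rolling statistics computed on the series shifted by one month,
    # so every window ends at t-1
    past_precip = df["precip"].shift(1)
    for w in windows:
        out[f"precip_sum{w}"] = past_precip.rolling(w).sum()
        out[f"precip_mean{w}"] = past_precip.rolling(w).mean()

    # Cyclical encoding of the calendar month
    month = df.index.month.to_numpy()
    out["month_sin"] = np.sin(2 * np.pi * month / 12)
    out["month_cos"] = np.cos(2 * np.pi * month / 12)

    # Forward-shifted target: the Z-Index of the following month
    out["target"] = df["z"].shift(-1)

    # Drop rows without a full history or without a target
    return out.dropna()
```

Rows at the start of the record (no 12-month history) and the final month (no following Z value) are dropped, so the feature table is slightly shorter than the raw series.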

2.5. Model Selection

In this study, eight regression models, covering linear, regularized linear, and nonlinear tree-based ensemble methods, were selected to represent the relationships between meteorological variables and the Palmer Z-Index using different modeling approaches. Model selection was based on methods commonly used in the literature for modeling drought and hydroclimatic time series, as well as preliminary analyses.

The selected models were linear regression, Elastic Net, Support Vector Regression (SVR), Random Forest, Gradient Boosting Regressor (GBR), Extreme Gradient Boosting (XGBoost), CatBoost, and LightGBM. This set of models represents a wide range of approaches, from simple linear assumptions to advanced ensemble methods, that can capture complex nonlinear relationships.

2.5.1. Linear Regression (Ordinary Least Squares)

Linear regression is a fundamental supervised learning method that models the relationship between a dependent variable and one or more independent variables in a linear framework. The model optimizes its parameters using the Ordinary Least Squares (OLS) method, which minimizes the squared differences between the actual and predicted values.

The main advantages of linear regression are its simplicity and direct interpretability of model coefficients. However, its performance relies on the assumption of linear relationships and may be limited in the presence of complex or nonlinear patterns. In this study, linear regression was used to assess linear relationships between meteorological variables and the Palmer Z-Index and to serve as a baseline for comparison with more complex models [37,38,39].

2.5.2. Support Vector Regression (SVR)

Support Vector Regression (SVR) is a powerful regression method that can capture both linear and nonlinear relationships. By defining a specific error tolerance (ε), the SVR penalizes deviations outside this margin, thereby limiting overfitting. This structure provides an advantage, especially for generating generalizable predictions in noisy datasets [40,41,42].

SVR can model nonlinear relationships using kernel functions and demonstrate effective performance in high-dimensional feature spaces. In this study, SVR was preferred to capture the nonlinear patterns between meteorological variables and the Palmer Z-Index.
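
Because SVR is sensitive to feature scale, it is typically combined with standardization in a single pipeline. The sketch below shows one way to do this with scikit-learn; the kernel and the values of `epsilon` and `C` are placeholders chosen for illustration, not the settings used in this study, and `make_scaled_svr` is a hypothetical helper name.

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

def make_scaled_svr(epsilon=0.1, C=1.0):
    """Return an SVR model preceded by feature standardization.

    epsilon: half-width of the error-insensitive tube (placeholder value)
    C:       regularization strength (placeholder value)
    """
    # The scaler is fitted on the training data only when the pipeline is fitted
    return make_pipeline(StandardScaler(), SVR(kernel="rbf", epsilon=epsilon, C=C))
```

Placing the scaler inside the pipeline means that, under cross-validation, the scaling statistics are re-estimated on each training fold and never see the validation period.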

2.5.3. Elastic Net Regression

Elastic Net is a linear regression method that combines the Ridge and Lasso regularization approaches. This approach increases model stability while reducing the risk of overfitting, especially in datasets containing many highly correlated variables [12].

The Elastic Net establishes a more streamlined model structure by suppressing unnecessary variables and balancing the negative effects of multicollinearity among variables. In this study, we used the Elastic Net to more stably assess the relative importance of meteorological variables and to improve the generalization capability of linear models.

2.5.4. Random Forest

Random Forest is an ensemble learning method in which a large number of decision trees are trained on random subsamples and feature subsets and then combined. This approach offers lower variance and higher generalization performance than individual decision trees. Random Forest is widely used in hydroclimatic problems because of its ability to capture nonlinear relationships, relative robustness to outliers, and automatic modeling of interactions between features. In this study, the Random Forest (RF) method was evaluated as a reference tree-based method for modeling nonlinear structures [43,44].

2.5.5. Gradient Boosting Regressor (GBR)

Gradient Boosting is an ensemble learning method in which weak learners (usually shallow decision trees) are trained sequentially, with each new model focusing on learning the residuals of the previous models. This approach ensures high accuracy by gradually reducing the prediction error [45,46,47]. In this study, hyperparameters such as the learning rate, tree depth, number of trees, and subsampling ratio for GBR were tuned using 5-fold time series cross-validation and the grid search method. The resulting configuration provided a balanced performance between bias and variance.
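
A grid search over the hyperparameters named above, evaluated with 5-fold time series cross-validation, can be sketched as follows. The grid values and the RMSE-based scoring are illustrative assumptions, since the searched ranges are not listed here, and `tune_gbr` is a hypothetical helper name.

```python
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

def tune_gbr(X, y, param_grid=None):
    """Grid search for a Gradient Boosting Regressor with chronological folds."""
    if param_grid is None:
        # Illustrative grid only; not the values searched in the study
        param_grid = {
            "learning_rate": [0.05, 0.1],
            "max_depth": [2, 3],
            "n_estimators": [100, 300],
            "subsample": [0.8, 1.0],
        }
    search = GridSearchCV(
        GradientBoostingRegressor(random_state=0),
        param_grid,
        cv=TimeSeriesSplit(n_splits=5),          # training folds always precede validation folds
        scoring="neg_root_mean_squared_error",   # assumed selection criterion
    )
    search.fit(X, y)
    return search
```

After fitting, `search.best_params_` holds the selected configuration and `search.best_estimator_` is refitted on all supplied data.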

2.5.6. Extreme Gradient Boosting (XGBoost)

XGBoost is an optimized and regularized version of the Gradient Boosting algorithm. Owing to the inclusion of both L1 and L2 regularization terms, it can effectively control model complexity and reduce the risk of overfitting [25,26,27]. XGBoost was evaluated in this study because of its computational efficiency, parallel processing capability, and capacity to model complex nonlinear relationships.

2.5.7. CatBoost Regressor

CatBoost is a gradient boosting algorithm that was specifically developed for the effective handling of categorical variables. However, it can also deliver strong performance on datasets composed of continuous variables owing to its symmetric tree structure and ordered learning approach [48,49,50]. In this study, CatBoost was comparatively evaluated among tree-based ensemble methods because of its relatively low sensitivity to hyperparameters and its robust structure against overfitting.

2.5.8. LightGBM Regressor

LightGBM is a gradient boosting method that offers high computational efficiency through a histogram-based splitting strategy and leaf-wise tree growth approach. It is characterized by fast training times and strong prediction performance on large datasets [51,52,53]. In this study, LightGBM was evaluated among tree-based methods because of its fast training and high potential accuracy.

2.6. Model Training and Evaluation

In this study, a five-fold time series cross-validation approach was used for the training and validation of the models. The TimeSeriesSplit method in the scikit-learn library allows training with earlier period data and validation in subsequent periods while preserving the chronological order of data. This structure prevents information leakage, which is critical in time series problems [54,55].

To evaluate the applicability of the TimeSeriesSplit approach, an Augmented Dickey–Fuller (ADF) stationarity test was conducted for the target variable, the Palmer Z-Index time series. The test results showed that the Z-Index series does not contain a strong deterministic trend and exhibits largely stationary behavior at the monthly scale. This finding supports the methodological suitability of the time-based split cross-validation approach, which predicts future periods using past values of the series. Accordingly, a time series split-based training and validation strategy was preferred in this study.

In each fold, approximately 80% of the dataset was used for training, and 20% was used for validation. While the training set expands over time with each fold, the validation set is constructed to cover the immediately following time segment. This approach aims to evaluate the models’ ability to generate future forecasts using historical data.
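
The expanding-window behavior of TimeSeriesSplit can be inspected directly. The sketch below lists the first and last row index of the training and validation segments in each fold, using scikit-learn's default settings; the fold sizes it produces are those of the defaults and are not claimed to match the configuration used in this study. The helper name `describe_folds` is hypothetical.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

def describe_folds(n_samples, n_splits=5):
    """Return (train_start, train_end, val_start, val_end) for each fold."""
    placeholder = np.zeros((n_samples, 1))  # only the number of rows matters
    folds = []
    for train_idx, val_idx in TimeSeriesSplit(n_splits=n_splits).split(placeholder):
        folds.append((int(train_idx[0]), int(train_idx[-1]),
                      int(val_idx[0]), int(val_idx[-1])))
    return folds
```

Each training segment starts at the first record and ends immediately before its validation segment, so no validation month is ever preceded by unseen training data from the future. The `test_size` and `max_train_size` arguments of TimeSeriesSplit can be used to change the segment lengths.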

Out-of-Fold (OOF) predictions were obtained for each model. In this setup, each observation in the validation set was predicted using a training set that contained only the preceding time steps. Combining the OOF predictions generated across all folds yielded an unbiased and realistic estimate of generalization performance for the entire dataset [56,57,58].

All machine learning analyses, feature engineering procedures, model training, evaluation processes, and visualization tasks were conducted in Python on the Google Colaboratory platform (Google LLC, Mountain View, CA, USA; https://colab.research.google.com, accessed on 3 February 2026). The computational environment was based on Python version 3.12.12 and included the following libraries: NumPy (version 2.0.2), pandas (version 2.2.2), scikit-learn (version 1.6.1), XGBoost (version 3.1.3), LightGBM (version 4.6.0), and SHAP. These tools were used for data preprocessing, model development, interpretability analysis, and result visualization.
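
The out-of-fold procedure can be sketched as a loop over chronological folds in which a fresh copy of the model is fitted on each training segment and used to predict the following validation segment. The function name `out_of_fold_predictions` is hypothetical, and the sketch uses scikit-learn's default TimeSeriesSplit, which may differ from the exact fold layout used in this study.

```python
import numpy as np
from sklearn.base import clone
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import TimeSeriesSplit

def out_of_fold_predictions(model, X, y, n_splits=5):
    """Collect predictions made only by models trained on earlier data."""
    X, y = np.asarray(X), np.asarray(y)
    # With the default split, the earliest block is used for training only
    # and therefore keeps NaN in the output
    oof = np.full(len(y), np.nan)
    for train_idx, val_idx in TimeSeriesSplit(n_splits=n_splits).split(X):
        fitted = clone(model).fit(X[train_idx], y[train_idx])
        oof[val_idx] = fitted.predict(X[val_idx])
    return oof
```

A call such as `out_of_fold_predictions(GradientBoostingRegressor(), X, y)` returns one prediction per validation month, which can then be compared against the observed Z(t + 1) values.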

Evaluation Criteria

Three commonly used and complementary performance metrics were employed to assess the predictive performance of the models: the Coefficient of Determination (R²), Root Mean Square Error (RMSE), and Mean Absolute Error (MAE).

The Coefficient of Determination (R²) indicates how much of the total variance in the target variable is explained by the model, with higher values implying better explanatory power.
The Root Mean Square Error (RMSE) is a metric based on the square of the prediction errors and gives greater weight to larger errors, expressing the magnitude of the error in the same units as the target variable.
The Mean Absolute Error (MAE) represents the average of the absolute values of the prediction errors. Compared to RMSE, it is less sensitive to outliers and offers a metric that is easier to interpret.
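
For reference, the three metrics can be computed from paired observations and predictions as in the following sketch. The function name `regression_metrics` is hypothetical; the formulas are the standard definitions of R², RMSE, and MAE.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE, and MAE for paired observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred

    rmse = float(np.sqrt(np.mean(err ** 2)))   # penalizes large errors more strongly
    mae = float(np.mean(np.abs(err)))          # average absolute error

    ss_res = float(np.sum(err ** 2))                        # unexplained variation
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))   # total variation
    r2 = 1.0 - ss_res / ss_tot

    return {"r2": r2, "rmse": rmse, "mae": mae}
```

Applying the function to the validation segment of each fold gives the per-fold values whose mean and standard deviation are used for model comparison.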

These metrics were calculated separately for each fold and then averaged to represent the overall performance of the model. In addition to the mean values of R², RMSE, and MAE, model comparisons considered the standard deviation of these metrics across folds to evaluate performance stability over time.

Additionally, to assess the hydrological relevance of the Palmer Z-Index in context, monthly storage volume data from the Atikhisar and Bakacak dams were utilized. These data were not included as inputs for the machine learning models. From the dam data, a three-month Standardized Reservoir Index (SRI₃), based on monthly volume anomalies and representing short-term anomalies in reservoir storage, was derived. The relationship between the Z-Index and SRI₃ was evaluated not for quantitative modeling purposes but to examine the temporal alignment of drought events. This analysis does not aim to validate the main model results; rather, it serves as a supportive assessment for interpreting the hydrological consistency of the Z-Index in context [59,60].

2.7. Model Interpretation and Advanced Validation Framework

In this study, we recognized that an approach based solely on classical regression performance metrics is insufficient for evaluating a phenomenon such as drought, which is threshold-based and related to physical processes. In this context, an integrated assessment framework was adopted that collectively addresses model interpretability, event detection skill within different drought severity classes, and consistency with hydroclimatic processes [61]. Within this framework, the contributions of input variables to the model were investigated using explainable machine learning methods, category-based validation metrics for drought events were calculated, and the temporal consistency between Palmer Z-Index predictions and hydrological storage anomalies was evaluated. In this way, the model performance was comprehensively analyzed not only in terms of statistical accuracy but also considering physical meaningfulness and decision support.

2.7.1. Model Interpretability Using Shapley Additive Explanations (SHAP)

In this study, the Shapley Additive Explanations (SHAP) approach, based on game theory, was used to render the decision mechanisms of machine learning models interpretable. SHAP quantitatively expresses the marginal contribution of each input variable to the model output using Shapley values calculated for all possible feature combinations [62,63]. This method, particularly for tree-based ensemble models with nonlinear decision structures, enables a consistent evaluation of the relative importance of the variables.

SHAP values were evaluated in the original output space of the models, preserving the physical interpretability of the direction and magnitude of the variable contributions.
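To make the idea of a marginal contribution averaged over feature combinations concrete, the following sketch computes exact Shapley values by brute force for a two-feature toy function. It is not the algorithm used by the SHAP library for tree ensembles, and the toy model, its coefficients, and the baseline values are hypothetical.

```python
from itertools import permutations

def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline).

    Features outside a coalition are held at their baseline value.
    The cost grows factorially with the number of features, so this
    is only practical for a handful of inputs.
    """
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        previous = predict(current)
        for i in order:
            current[i] = x[i]            # add feature i to the coalition
            updated = predict(current)
            phi[i] += updated - previous  # marginal contribution of i
            previous = updated
    return [p / len(orders) for p in phi]

def toy_model(features):
    """Hypothetical nonlinear response to precipitation and temperature."""
    precip, temp = features
    return 0.02 * precip - 0.1 * temp + 0.001 * precip * temp

x = [20.0, 25.0]         # a dry, warm month (illustrative values)
baseline = [60.0, 15.0]  # reference conditions (illustrative values)
phi = shapley_values(toy_model, x, baseline)
```

The contributions sum to the difference between the prediction at `x` and the prediction at the baseline, which is why the values can be read in the units of the model output.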

2.7.2. Assessing the Hydroclimatic Consistency of Z-Index with Reservoir Storage Anomalies

To assess the hydroclimatic response lag between the Palmer Z-Index and reservoir storage anomalies, an event-based lead–lag analysis was performed between the Z-Index and the Standardized Reservoir Index derived at different time scales (SRI₁, SRI₃, and SRI₆) and representing short- and medium-term anomalies in reservoir storage volumes. The Z-Index was used as the meteorological drought signal, and SRI series derived from the monthly storage volume data of the Atikhisar and Bakacak reservoirs for the period 2004–2024 were used as the hydrological response indicator (Figure 2).

The Z-Index and SRI series were combined on a common time axis, and the onset of a drought event was defined as the first time step at which the threshold defined for each series (Z ≤ −1 and SRI ≤ −1) was reached. Consecutive periods of continuous negative values were treated as a single event, and comparisons were made based on the event onset [64,65].
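One possible reading of this onset rule is sketched below. It assumes that an event opens when the index first reaches the threshold and stays open while the values remain negative; this interpretation, the function name, and the sample series are illustrative rather than taken from the study.

```python
def event_onsets(series, threshold=-1.0):
    """Return the indices at which drought events begin.

    An event opens at the first step with value <= threshold and is
    considered ongoing until the series returns to a non-negative value,
    so a continuous negative spell counts as a single event.
    """
    onsets = []
    in_event = False
    for t, value in enumerate(series):
        if not in_event and value <= threshold:
            onsets.append(t)
            in_event = True
        elif in_event and value >= 0:
            in_event = False
    return onsets

# Hypothetical monthly index values
z = [0.3, -1.2, -0.5, -1.4, 0.2, -0.8, -1.1, -2.0, 0.5]
onsets = event_onsets(z)
```

With this series, the negative spell in months 1 to 3 yields a single onset, and the dip to −0.8 in month 5 does not open an event because it stays above the threshold.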

For each Z-Index event, it was investigated whether an SRI event occurred within a maximum 12-month search window in the subsequent months; the difference between the onset of a Z-Index event and the matched SRI event was defined as the hydrological response delay (lead time). Events occurring within the same month were included in the analysis to evaluate the probability of simultaneous response [66,67].
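The matching step can be sketched as follows, assuming onsets are given as month indices on the common time axis. Pairing each Z-Index onset with the earliest SRI onset in the window, and allowing one SRI onset to serve several Z-Index events, are assumptions; the onset lists are hypothetical.

```python
from statistics import mean, median

def match_events(z_onsets, sri_onsets, max_lag=12):
    """Lag (in months) from each Z-Index onset to the first SRI onset
    occurring 0..max_lag months later; unmatched events are skipped."""
    lags = []
    for z in z_onsets:
        candidates = [s - z for s in sri_onsets if 0 <= s - z <= max_lag]
        if candidates:
            lags.append(min(candidates))
    return lags

# Hypothetical onset months
z_onsets = [3, 20, 41, 70]
sri_onsets = [8, 20, 60]

lags = match_events(z_onsets, sri_onsets)
match_rate = len(lags) / len(z_onsets)
mean_lag = mean(lags)
median_lag = median(lags)
```

A lag of zero corresponds to a simultaneous response, which the analysis retains rather than discards.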

To determine whether the observed event matches were coincidental, a Monte Carlo-based permutation approach that preserves the distributional characteristics of the time series was applied. Within this framework, descriptive statistics for the number of matching events, the match rate, and the delay duration were calculated for each SRI time scale. The analysis served as a model-independent, complementary hydroclimatic validation to examine at which time scales and with what lags the meteorological drought signal of the Z-Index is reflected in reservoir storage dynamics.
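A generic version of such a significance check is sketched below. It shuffles the SRI series, which preserves its marginal distribution but not its autocorrelation, and counts how often chance alone produces at least as many matches as observed. The exact permutation scheme used in the study is not specified, so the shuffling strategy, the simplified threshold-crossing onset rule, and the synthetic data are all assumptions.

```python
import random

def onsets(series, threshold=-1.0):
    """Indices where the series first drops to or below the threshold."""
    return [t for t, v in enumerate(series)
            if v <= threshold and (t == 0 or series[t - 1] > threshold)]

def count_matches(z_on, s_on, max_lag=12):
    """Number of Z onsets followed by an SRI onset within 0..max_lag steps."""
    return sum(any(0 <= s - z <= max_lag for s in s_on) for z in z_on)

def permutation_p_value(z, sri, n_perm=1000, seed=0):
    """Return (observed matches, one-sided Monte Carlo p-value)."""
    rng = random.Random(seed)
    z_on = onsets(z)
    observed = count_matches(z_on, onsets(sri))
    shuffled = list(sri)
    at_least = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # keeps the values, destroys the timing
        if count_matches(z_on, onsets(shuffled)) >= observed:
            at_least += 1
    # Add-one correction so the estimated p-value is never exactly zero
    return observed, (at_least + 1) / (n_perm + 1)

# Synthetic series for illustration only
gen = random.Random(42)
z = [gen.gauss(0, 1) for _ in range(120)]
sri = [gen.gauss(0, 1) for _ in range(120)]
observed, p_value = permutation_p_value(z, sri, n_perm=200)
```

Because plain shuffling removes persistence, a block-based or phase-preserving resampling scheme may be more appropriate for strongly autocorrelated storage series.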

2.7.3. Evaluation of Drought Detection Skill Across Severity Categories

To assess the ability of the Z-Index predictions to distinguish between different drought severity categories and detect event onsets, an event-based and category-specific validation approach was applied. Drought events were defined based on the thresholds of Z ≤ −1 and Z ≤ −2, and event onsets were determined using an event-merging approach that combined short interruptions (maximum of 1 month). Event validation was conducted by comparing the onset times of observed and predicted drought events; for each observed event, we examined whether a prediction occurred within a maximum search window of 12 months. If the prediction occurred before the observed event, the difference was defined as the lead time (early warning period); simultaneous events were also included in the analysis. Within this framework, measures such as Probability of Detection (POD), False Alarm Ratio (FAR), and Critical Success Index (CSI) were calculated to evaluate the event detection skills of the models across different drought severity classes. In addition, at the event level, the representation of peak timing and severity characteristics was comparatively examined between the predicted and observed events [68,69,70].
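The standard contingency-table definitions of the three scores, together with one way of bridging short interruptions, can be sketched as follows. The counts are hypothetical and are not values from Table 5; the gap-filling helper is an illustrative implementation of the one-month merging rule.

```python
def merge_short_gaps(flags, max_gap=1):
    """Fill interruptions of at most max_gap steps that lie between
    two drought steps, so that they do not split a single event."""
    merged = list(flags)
    n = len(flags)
    t = 0
    while t < n:
        if not flags[t]:
            start = t
            while t < n and not flags[t]:
                t += 1
            bounded = start > 0 and t < n  # drought on both sides of the gap
            if bounded and (t - start) <= max_gap:
                for k in range(start, t):
                    merged[k] = True
        else:
            t += 1
    return merged

def detection_scores(hits, misses, false_alarms):
    """Return (POD, FAR, CSI) from event counts."""
    nan = float("nan")
    pod = hits / (hits + misses) if hits + misses else nan
    far = false_alarms / (hits + false_alarms) if hits + false_alarms else nan
    total = hits + misses + false_alarms
    csi = hits / total if total else nan
    return pod, far, csi

# Hypothetical counts: 8 observed events detected, 2 missed, 2 false alarms
pod, far, csi = detection_scores(hits=8, misses=2, false_alarms=2)

drought = [False, True, False, True, True, False, False, True]
merged = merge_short_gaps(drought)
```

CSI penalizes both misses and false alarms, so it falls below POD whenever any false alarm is present.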

3. Results and Discussion

In this study, eight machine learning models, including linear, regularized, and tree-based ensemble approaches, were comparatively evaluated using time series-preserving five-fold cross-validation with out-of-fold predictions. The R², RMSE, and MAE metrics were used as evaluation criteria. This approach is not limited to assessing the consistency between the observed current-month Z-Index values calculated using the Palmer method and the model-predicted Z(t) values for the same period; it also examines, within a holistic framework, the models’ generalization capacity and forward-looking predictive performance for the following month’s Palmer Z-Index, Z(t + 1).
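The distinction between the Z(t) and Z(t + 1) targets comes down to how features and targets are aligned in time. The sketch below pairs features known at month t (current and lagged precipitation plus a rolling mean) with the following month's index. The feature set, lag count, and window length are illustrative assumptions and do not reproduce the study's full feature list.

```python
def build_samples(precip, z, n_lags=2, window=3):
    """Pair features available at month t with the target Z(t + 1).

    Each row holds [P(t), P(t-1), ..., P(t-n_lags), rolling mean of P]
    and the target z[t + 1]; no feature uses information after month t.
    """
    rows = []
    start = max(n_lags, window - 1)
    for t in range(start, len(z) - 1):
        lags = [precip[t - k] for k in range(n_lags + 1)]
        rolling_mean = sum(precip[t - window + 1 : t + 1]) / window
        rows.append((lags + [rolling_mean], z[t + 1]))
    return rows

# Hypothetical monthly precipitation (mm) and Z-Index values
precip = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
z = [0.1, -0.2, -1.0, -1.5, 0.3, 0.8]
samples = build_samples(precip, z)
```

For the current-month target Z(t), the same features would simply be paired with `z[t]` instead of `z[t + 1]`.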

3.1. Results of the Augmented Dickey–Fuller Test

Augmented Dickey–Fuller (ADF) test results indicate that the Z-Index series are stationary for both Çanakkale and Biga stations. The null hypothesis of a unit root is strongly rejected for the original series (Çanakkale: ADF = −30.674, p < 0.001; Biga: ADF = −20.832, p < 0.001), with test statistics exceeding all critical thresholds in magnitude. Consistently, the first-differenced series (ΔZ) are also stationary for both stations (p < 0.001). These findings suggest that the statistical properties of the Z-Index remain stable over time, reducing the risk of spurious modeling. Therefore, employing a time series cross-validation strategy (e.g., 5-fold time series split), where models are trained on past observations and tested on future observations, is methodologically appropriate and consistent with real-world forecasting settings (Table 2).
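A forward-chaining split of this kind, together with the collection of out-of-fold predictions, can be sketched as follows. The splitting logic is hand-written in the spirit of an expanding-window scheme, and the mean predictor is a hypothetical placeholder for the actual models.

```python
def time_series_splits(n_samples, n_splits=5):
    """Yield (train_idx, test_idx) pairs in which every training index
    precedes every test index (expanding training window)."""
    test_size = n_samples // (n_splits + 1)
    first_test = n_samples - n_splits * test_size
    for start in range(first_test, n_samples, test_size):
        yield list(range(start)), list(range(start, start + test_size))

def out_of_fold_predictions(y, n_splits=5):
    """Collect OOF predictions from a placeholder model that predicts
    the mean of the training targets (illustration only)."""
    oof = [None] * len(y)
    for train_idx, test_idx in time_series_splits(len(y), n_splits):
        train_mean = sum(y[i] for i in train_idx) / len(train_idx)
        for i in test_idx:
            oof[i] = train_mean
    return oof

# Hypothetical 24-month target series
y = [float(i % 6) - 2.5 for i in range(24)]
folds = list(time_series_splits(len(y)))
oof = out_of_fold_predictions(y)
```

The earliest observations are used only for training, so they never receive an out-of-fold prediction and are excluded when OOF metrics are computed.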

3.2. Comparative Evaluation of the Prediction Performance of Machine Learning Models for the Palmer Z-Index

The results indicate that increased model flexibility and the ability to capture nonlinear structures substantially enhance predictive accuracy and temporal generalization. Tree-based ensemble methods outperform linear and regularized models in representing complex, multi-scale hydroclimatic interactions underlying monthly drought anomalies.

Table 3 summarizes the OOF performance results for the current month Z(t) and next month Z(t + 1) predictions of the Palmer Z-Index at the Çanakkale Central (17112) and Biga (18084) meteorological stations. In general, marked performance differences were observed among the models at both stations, and these differences depended on both the target time step and the hydroclimatic characteristics of the station (Figure 3).

For the Çanakkale Central station, Gradient Boosting produced the highest OOF R² value (0.841) in Z(t) predictions and achieved the lowest error metrics (RMSE = 0.731, MAE = 0.451). This finding indicates that the nonlinear structure of the Z-Index in Çanakkale Central can be more effectively represented by ensemble methods. A similar pattern was observed in Z(t + 1) predictions as well; Gradient Boosting provided the highest explanatory power with an OOF R² of 0.828.

However, the model behavior at the Biga station was somewhat different. For Z(t) predictions, Elastic Net achieved the highest OOF R² value (0.806), followed by Gradient Boosting, XGBoost, and CatBoost, with only slight differences in performance. This suggests that in Biga, the relationship between meteorological variables and the Z-Index may contain more regular and linear components. For Z(t + 1) predictions, Elastic Net, Gradient Boosting, and linear regression models produced results that were quite close to each other, and the advantage of ensemble methods was more limited compared to Çanakkale Central. This difference can be explained by Biga’s shorter observation period (1984–2024) and the relatively low complexity of the local hydroclimatic dynamics.

The findings align with the literature, which emphasizes that nonlinear methods are more successful than traditional linear models in modeling drought indices. In Türkiye, ANN has been reported to provide higher accuracy (R ≈ 0.98) than linear regression, SVM, and decision trees in Z-Index prediction; likewise, the superior performance of wavelet–fuzzy hybrid models has been demonstrated in Northwestern Türkiye [18]. Similar results have also been observed internationally: XGBoost combined with signal decomposition achieved Nash–Sutcliffe efficiencies of nearly 0.98 in short-term scPDSI predictions in semi-arid regions, and ensemble tree-based methods exhibited more robust performance than deep learning under data-limited conditions [71,72].

In this study, the superiority of Gradient Boosting can be explained by its ability to capture high-order and nonlinear dependencies between meteorological inputs and drought response by iteratively reducing errors. The observed weak pairwise correlations support the notion that purely additive linear models are limited in their ability to represent the complex structure of the Z-Index. In contrast, ensemble tree models reduce bias by adaptively partitioning the feature space and increasing the generalization power.

Methodologically, time series cross-validation preserves temporal dependencies to prevent information leakage and provides more reliable generalization metrics. Although some studies report higher R² values using a single train–test split [18], the stricter validation scheme followed in this study offers a more realistic performance assessment. A low variance across folds indicates that the model remained stable during different periods. Moreover, lagged and moving window-based features increased the predictive power by capturing continuity effects, which is consistent with the literature [71]. In terms of application, model outputs can be used to improve irrigation allocation plans and strengthen early warning systems, thereby supporting the decision-making potential of ensemble-based approaches in drought management [73].

3.3. Explaining the Decision Mechanisms of Models with SHAP

The SHAP findings obtained for the Çanakkale Central and Biga meteorological stations revealed that the model output at both stations was predominantly determined by precipitation variables.

In both the current time step and one-step-ahead predictions, both the instantaneous and lagged precipitation components produced the highest SHAP contributions, which clearly demonstrated that the target variable exhibited a strong dependence on hydrological processes. The main difference between the stations was the relative importance of seasonality and temperature components. The prominence of the month_sin and month_cos variables at the Biga station indicates that the model output is sensitive to a distinct annual cycle, whereas the seasonality effect was more limited at the Çanakkale Central station, where the model responded more to short-term meteorological conditions. Temperature variables played a secondary role at both stations; however, delayed effects were more prominent at Çanakkale Central, whereas instantaneous and short-term lagged effects were prominent at Biga (Figure 4).

According to the SHAP analysis, the highest feature contributions were from meteorological inputs, especially variables related to precipitation and soil moisture. This finding aligns with previous studies that revealed that variables directly tied to the water budget play a dominant role in drought prediction. Indeed, Ref. [74], using TerraClimate data for PDSI prediction, reported that soil moisture and precipitation variables were the most influential inputs in the model output. Similarly, Ref. [75] showed that, in an XGBoost-based hydrological drought (streamflow classification) prediction across China, SPI, a precipitation-based indicator, had the highest SHAP scores, and that this effect was further enhanced by soil moisture and potential evapotranspiration variables depending on seasonal conditions. Additionally, in their work addressing groundwater drought with explainable artificial intelligence (XAI) and SHAP analysis, Ref. [76] demonstrated that the duration of precipitation-deficit-driven meteorological drought and the intensity of temperature-induced meteorological drought play critical roles. The SHAP distributions obtained in this study also clearly show that precipitation deficiency is a decisive factor in the decline of the Palmer Z-Index.
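The month_sin and month_cos variables mentioned above are cyclical encodings of the calendar month. A common construction is sketched below; the exact phase convention used in the study (for example, whether January maps to an angle of zero) is not stated, so this form is an assumption.

```python
import math

def encode_month(month):
    """Map a calendar month (1..12) to a point on the unit circle so
    that December and January are adjacent in feature space."""
    angle = 2.0 * math.pi * (month - 1) / 12.0
    return math.sin(angle), math.cos(angle)

def distance(a, b):
    """Euclidean distance between two encoded months."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

january = encode_month(1)
february = encode_month(2)
december = encode_month(12)
```

With this encoding, the distance between December and January equals the distance between January and February, which a plain integer month index (12 versus 1) would not achieve.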

3.4. Analysis of the Relationship Between Palmer Z-Index Estimates and SRI

The temporal relationship between the Palmer Z-Index and SRI₃, a mid-term hydrological drought indicator, was examined to evaluate the transfer of drought signals from the meteorological to the hydrological stage. The results presented in Table 4 show that the Palmer Z-Index consistently signals SRI₃-based hydrological drought conditions months in advance in both basins. In the Çanakkale Central–Atikhisar Basin, the average and median lead times for the Z–SRI₃ relationship were 5.19 and 5 months, respectively. An advance capture rate of 81.3% indicates that the Palmer Z-Index can provide a meaningful early warning of the development of mid-term hydrological droughts. In the Biga–Bakacak Basin, the average lead time reached 6.39 months and the advance capture rate was 95.7%, indicating that meteorological drought signals are reflected more distinctly in the hydrological system, with a greater delay.

These results reveal that, owing to the structure of the Palmer Z-Index based on the meteorological water balance, it systematically provides an early signal for medium-term hydrological drought processes represented by SRI₃. Therefore, it can be concluded that the Palmer Z-Index values predicted by machine learning can be used as an effective indicator in operational early warning systems for the early detection and monitoring of medium-term hydrological droughts. The time-lagged relationship between the Z-Index and SRI supports the notion that the meteorological drought signal is the primary triggering mechanism for initiating hydrological droughts.

The observed lag findings are consistent with the literature. It is widely known that hydrological drought emerges a few months after meteorological drought; Ref. [7] emphasized this process within a cause-and-effect framework, showing this in the context of Türkiye. Ref. [77] also found high correlations between meteorological (SPI/SPEI) and hydrological (SRI) indices in a similar basin and showed that indices with the same time scale, in particular, produced stronger relationships. In this study, the strongest Z–SRI relationship occurred a few months after the meteorological drought signal. However, the magnitude of the lag may vary depending on regional climate conditions and soil–water relationships, as supported by Ref. [78], which drew attention to the process of meteorological deficits manifesting in streamflow in the Godavari River Basin. Studies directly addressing the relationship between indicators representing short-term moisture anomalies, such as the Palmer Z-Index, and reservoir-based hydrological drought indices, such as the SRI, are relatively limited. Nevertheless, Ref. [79] found that the Palmer Z-Index strongly reflects short-term soil moisture anomalies and exhibits statistically significant, albeit delayed, relationships with reservoir-based drought indicators. This suggests that the Z-Index should be considered a precursor indicator representing the early stages of stress on the hydrological system rather than a direct descriptor of hydrological drought. The case of Brazil is noteworthy in this context; studies conducted for the Jucazinho Reservoir reported that indices such as SPI, SPEI, and SRI were insufficient to fully capture fluctuations in the reservoir water level [80].

When these findings are evaluated together, it is evident that although the lag structure of the Z-Index’s precursory relationship to hydrological droughts may vary between basins, the index provides a strong early warning link between meteorological and hydrological systems and may play an important role, especially in short-term drought monitoring and forecasting studies.

3.5. Event-Based Drought Analysis: The Ability of Models to Capture Drought Periods

The performance of machine learning-based Palmer Z-Index predictions in capturing drought events was evaluated through a combined analysis of time series comparisons (Figure 5) and category-based validation metrics (Table 5). The time series results presented in Figure 5 indicate that both the current-month Z(t) and one-month-ahead Z(t + 1) predictions generally follow the temporal dynamics of the observed Palmer Z-Index with a high degree of consistency. In particular, during periods when the index dropped below the drought threshold (Z ≤ −1), the predicted series were able to capture the timing of threshold crossings largely synchronously with observations, which is essential for identifying the onset of drought events. These visual findings are further supported by the quantitative validation results summarized in Table 5, demonstrating that the proposed approach has operational relevance not only in terms of overall predictive accuracy but also when evaluated from a threshold-based, event-oriented perspective [81].

Under mild drought conditions (Z ≤ −1), the current-month forecasts Z(t) exhibited strong detection skill at both stations. At the Çanakkale Central station, a POD value of 0.777 and a CSI value of 0.685 were obtained, while FAR remained low at 0.148, indicating a favorable balance between sensitivity and reliability. Similarly, at the Biga station, POD and CSI values of 0.740 and 0.655, respectively, were achieved. The close agreement of performance metrics between the two stations suggests that the proposed framework yields consistent event-detection capability across different locations, supporting its potential spatial generalizability. These results highlight that, for drought monitoring purposes, models must not only exhibit high sensitivity (high POD) but must also maintain controlled false alarm rates (low FAR) to ensure operational usefulness.

For the one-month-ahead forecasts Z(t + 1), a moderate decrease in POD and CSI values is observed for mild droughts; however, the increase in CSI to 0.669 at the Biga station indicates that short-term forecasts remain effective in tracking the persistence and evolution of ongoing drought conditions. This finding suggests that the model is capable of representing not only the initiation of drought events but also their short-term continuation, which is particularly relevant for monitoring applications.

In the severe drought category (Z ≤ −2), the performance metrics exhibit greater variability, primarily reflecting the relatively low frequency of such extreme events. Nevertheless, the Z(t + 1) forecasts consistently outperform Z(t) in terms of POD and CSI, underscoring the added value of short-term prediction for early warning. At the Çanakkale station, CSI increased from 0.493 for Z(t) to 0.536 for Z(t + 1), while at the Biga station, POD increased from 0.600 to 0.680, demonstrating that one-month-ahead forecasts can effectively capture the development of severe drought conditions. However, the elevated FAR values observed for severe droughts, particularly at the Biga station (FAR = 0.452 for Z(t + 1)), indicate an unavoidable trade-off between early warning capability and forecast reliability when predicting rare, high-impact events. This behavior is consistent with the drought literature, which emphasizes that detection skill for extreme events must be evaluated jointly with the cost of false alarms [82].

The combined evaluation of time series-based visual analysis (Figure 5) and category-based validation metrics (Table 5) demonstrates that the proposed modeling framework provides robust performance for mild drought monitoring and meaningful early warning potential for severe drought conditions. Forecasting the Palmer Z-Index for both the current and subsequent months using machine learning thus emerges as an effective tool for detecting both the onset and short-term evolution of drought events. The contribution of this study lies not only in modeling the Palmer Z-Index as a continuous time series but also in operationalizing it through an event-based verification framework based on POD–FAR–CSI metrics. While event-based approaches in drought research have traditionally been applied to frequency–duration–severity analyses using meteorological indices [83], machine learning studies have more often focused on forward classification of drought categories. For instance, DroughtCast has demonstrated skill in predicting USDM categories at lead times of 1–12 weeks [84], and [85] reported high F1 scores by framing threshold-defined drought events as a binary classification problem. In contrast, the present study integrates continuous index-based forecasting with event-based evaluation within a unified framework, enabling simultaneous real-time monitoring via Z(t) and short-term early warning via Z(t + 1) at two meteorological stations. The strong consistency between the time series behavior (Figure 5) and the event-based validation metrics (Table 5) therefore reinforces the applicability of the proposed approach for operational drought monitoring and early warning systems.

4. Conclusions

In this study, we present a machine learning-based integrated framework for short-term meteorological drought prediction using the Palmer Z-Index. This study is based on long-term monthly data from two meteorological stations (Çanakkale Central and Biga) in northwestern Türkiye, which have different hydroclimatic characteristics. Current month Z(t) and one-month-ahead Z(t + 1) drought predictions were evaluated using validation strategies that preserved the time series structure.

The findings indicate that, owing to the Palmer Z-Index’s high sensitivity to short-term moisture anomalies, drought can be predicted with high accuracy when appropriate feature engineering and time-aware validation approaches are employed. Tree-based ensemble models, particularly Gradient Boosting, XGBoost, and CatBoost, generally offered higher explanatory power and lower error values than linear and regularized linear models, most clearly at the Çanakkale Central station; at Biga, Elastic Net was competitive. This result suggests that short-term drought dynamics are shaped by nonlinear interactions and multi-scale hydroclimatic continuity processes that linear models cannot fully represent.

Among the models, the Gradient Boosting algorithm exhibited the highest and most consistent generalization performance at the Çanakkale Central station—where hydroclimatic variability and nonlinear interactions are more pronounced—for Z(t + 1) (OOF R² ≈ 0.83). In contrast, at the Biga station, which has a shorter data period and a relatively more regular hydroclimatic structure, the fact that the Elastic Net and linear regression models performed competitively with ensemble methods demonstrates that model performance depends not only on algorithmic complexity but is also strongly tied to the station-specific data structure and climatic dynamics. This finding highlights the methodological necessity of a comparative and multi-model approach, rather than relying on a single universal model for drought prediction.

In this study, an advanced validation framework that prioritizes physical consistency was applied, going beyond classical regression performance metrics. The SHAP analysis results indicated that at both stations, the main determinants of the model outputs were precipitation and its lagged components, whereas temperature and seasonality variables contributed at a secondary level, and variables such as relative humidity, wind speed, and atmospheric pressure played more limited yet complementary roles. These findings reveal that machine learning models, through the Palmer water balance approach, develop physically consistent decision-making mechanisms.

The hydrological relevance of the Palmer Z-Index values predicted by machine learning was further evaluated using reservoir storage anomalies from the Atikhisar and Bakacak dams. Event-based time lag analyses showed that the Z-Index, as a meteorological drought signal, systematically preceded medium-term hydrological drought conditions, represented by SRI₃, by approximately 5–6 months in both basins. The fact that the advance capture rates exceeded 80% in the Çanakkale Central Basin and 95% in the Biga Basin demonstrates the strong potential of Palmer Z-Index predictions based on machine learning for the early detection of hydrological droughts.

Category-based validation analyses support the proposed approach’s ability to distinguish between different drought severity levels and capture event onsets. For mild drought conditions (Z ≤ −1), high probabilities of detection (POD ≈ 0.74–0.78) were achieved for both current and forward predictions, whereas for rarer severe drought events (Z ≤ −2), forecasts one month ahead offered significant early warning capacity. The relatively high false alarm rates for severe events reflect the inevitable sensitivity–reliability trade-off in early warning systems.

In conclusion, this study demonstrates that machine learning-based nowcasting and short-term forecasting of the Palmer Z-Index can produce results that are not only statistically robust but also consistent with hydroclimatic processes and are operationally meaningful. The proposed framework, with its computational efficiency, interpretability, and adaptability to different hydroclimatic conditions, provides a powerful decision-support tool for irrigation planning, reservoir management, and drought early warning systems. In future studies, evaluating multi-step forecasting horizons, testing spatial generalization with gridded datasets, and integrating large-scale climate oscillations into the model will further enhance the predictive capability and operational value of this method.

Author Contributions

The authors contributed equally to this work. Conceptualization, U.M. and E.E.A.C.; methodology, U.M. and E.E.A.C.; software, U.M.; validation, U.M. and E.E.A.C.; formal analysis, U.M.; investigation, U.M. and E.E.A.C.; data curation, U.M.; writing—original draft preparation, U.M.; writing—review and editing, U.M. and E.E.A.C.; visualization, U.M.; supervision, E.E.A.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was conducted without any external financial support.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data used in this study are available from the corresponding author upon reasonable request.

Acknowledgments

The authors would like to thank the Turkish State Meteorological Service for providing the meteorological data and the General Directorate of State Hydraulic Works (DSİ) for providing the reservoir data used in this study.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

SPEI	Standardized Precipitation Evapotranspiration Index
SPI	Standardized Precipitation Index
PDSI	Palmer Drought Severity Index
SRI	Standardized Reservoir Index
ANN	Artificial Neural Networks
SVM	Support Vector Machines
LR	Linear Regression
DT	Decision Trees
scPDSI	self-calibrating Palmer Drought Severity Index
PHDI	Palmer Hydrological Drought Index
PE	Potential Evapotranspiration
P	Total Precipitation
AWC	Available Water Capacity
RO	Runoff
PR	Potential Values of Recharge
ET	Actual Evapotranspiration
SVR	Support Vector Regression
GBR	Gradient Boosting Regressor
RMSE	Root Mean Square Error
MAE	Mean Absolute Error
OOF	Out-of-Fold
CV	Cross-Validation
R²	Coefficient of Determination

References

Muheki, D.; Zscheischler, J.; Messori, G.; Deijns, A.A.J.; Thiery, W.; Bevacqua, E. The perfect storm? Co-occurring climate extremes in East Africa. Earth Syst. Dyn. 2024, 15, 429–466. [Google Scholar] [CrossRef]
Wu, X.; Hao, Z.; Tang, Q.; Feng, S.; Zhang, X.; Hao, F. Population exposure to compound dry and hot events in China under 1.5 and 2 °C global warming. Int. J. Climatol. 2021, 41, 5766–5775. [Google Scholar] [CrossRef]
IPCC. Climate Change 2021: The Physical Science Basis; Cambridge University Press: Cambridge, UK, 2021. [Google Scholar] [CrossRef]
Estrela, T.; Pérez-Martin, M.A.; Vargas, E. Impacts of climate change on water resources in Spain. Hydrol. Sci. J. 2012, 57, 1154–1167. [Google Scholar] [CrossRef]
Sam, T.T.; Nhi, P.T.T.; Hoan, N.X.; Thao, N.T.T.; Nguyen, V.T.; Khoi, D.N.; Quan, N.T. Impact of climate change on meteorological, hydrological and agricultural droughts in the Lower Mekong River Basin: A case study of the Srepok Basin, Vietnam. Water Environ. J. 2018, 33, 547–559. [Google Scholar] [CrossRef]
Oertel, M.; Meza, F.J.; Gironás, J. Observed trends and relationships between ENSO and standardized hydrometeorological drought indices in central Chile. Hydrol. Process. 2019, 34, 159–174. [Google Scholar] [CrossRef]
Shukla, S.; Wood, A.W. Use of a standardized runoff index for characterizing hydrologic drought. Geophys. Res. Lett. 2008, 35, L02405. [Google Scholar] [CrossRef]
Hoffmann, D.; Arblaster, J.M.; Gallant, A.J.E. Uncertainties in drought from index and data selection. J. Geophys. Res. Atmos. 2020, 125, e2019JD031946. [Google Scholar] [CrossRef]
Vicente-Serrano, S.M.; López-Moreno, J.I.; Beguería, S. A multiscalar drought index sensitive to global warming: The standardized precipitation evapotranspiration index. J. Clim. 2010, 23, 1696–1718. [Google Scholar] [CrossRef]
Azizi, H.; Nejatian, N. Evaluation of the climate change impact on the intensity and return period for drought indices of SPI and SPEI (study area: Varamin plain). Water Supply 2022, 22, 4373–4386. [Google Scholar] [CrossRef]
Soydan Oksal, N.G. Comparative analysis of the influence of temperature and precipitation on drought assessment in the Marmara region of Turkey: An examination of SPI and SPEI indices. J. Water Clim. Change 2023, 14, 3096–3111.
Liu, L.; Hocker, J.E.; Yong, B.; Shafer, M.A.; Bednarczyk, C.N.; Hong, Y.; Riley, R. Hydro-climatological drought analyses and projections using meteorological and hydrological drought indices: A case study in Blue River Basin, Oklahoma. Water Resour. Manag. 2012, 26, 2761–2779.
Sadeghfam, S.; Farmani, H.; Mirabbasi, R. Developing reservoir drought index and conducting copula-based frequency analysis for Lake Urmia basin in Iran. J. Hydrol. Reg. Stud. 2025, 60, 102476.
Keyantash, J.; Dracup, J.A. The Quantification of Drought: An Evaluation of Drought Indices. Bull. Amer. Meteor. Soc. 2002, 83, 1167–1180.
Yang, Y.; Yang, D.; Zhang, S.; Roderick, M.L.; Liu, W.; Li, X.; McVicar, T.R. Comparing Palmer Drought Severity Index drought assessments using the traditional offline approach with direct climate model outputs. Hydrol. Earth Syst. Sci. 2020, 24, 2921–2930.
Wilhite, D.A.; Glantz, M.H. Understanding the drought phenomenon: The role of definitions. Water Int. 1985, 10, 111–120.
Jacobi, J.; Hornberger, G.; Perrone, D.; Duncan, L.L. A tool for calculating the Palmer drought indices. Water Resour. Res. 2013, 49, 6086–6089.
Tufaner, F.; Özbeyaz, A. Estimation and easy calculation of the Palmer Drought Severity Index from meteorological data using advanced machine learning algorithms. Environ. Monit. Assess. 2020, 192, 576.
Isia, I.; Shahedan, N.F.; Syafrudin, M.; Bhattacharjya, R.K.; Jusoh, M.N.H.; Hadibarata, T.; Fitriyani, N.L.; Bouaissi, A. Drought analysis based on standardized precipitation evapotranspiration index and standardized precipitation index in Sarawak, Malaysia. Sustainability 2022, 15, 734.
Ma, M.; Liu, Y.; Ren, L.; Kong, H.; Jiang, S.; Gong, L.; Yuan, F. A new standardized Palmer drought index for hydro-meteorological use. Hydrol. Process. 2013, 28, 5645–5661.
Mokhtarzad, M.; Eskandari, F.; Arabasadi, A.; Jamshidi Vanjani, N. Drought forecasting by ANN, ANFIS, and SVM and comparison of the models. Environ. Earth Sci. 2017, 76, 729.
Sundararajan, K.; Srinivasan, K.; Kumaran Selvaraj, S.; Garg, L.; Kashif Bashir, A.; Pattukandan Ganapathy, G.; Kaliappan, J.; Meena, T. A contemporary review on drought modeling using machine learning approaches. Comput. Model. Eng. Sci. 2021, 128, 447–487.
Zhao, Y.; Zhang, S.; Seka, A.M.; Bai, Y.; Nanzad, L.; Yang, S.; Henchiri, M.; Zhang, J. Drought monitoring and performance evaluation based on machine learning fusion of multi-source remote sensing drought factors. Remote Sens. 2022, 14, 6398.
Aghelpour, P.; Mehdizadeh, S.; Duan, Z.; Bahrami-Pichaghchi, H.; Mohammadi, B. A novel hybrid dragonfly optimization algorithm for agricultural drought prediction. Stoch. Environ. Res. Risk Assess. 2021, 35, 2459–2477.
Chen, L.; Lu, J.; Huang, J.; Gnyawali, K.R.; Miao, L.; Zhan, M.; Amankwah, S.O.Y.; Li, S.; Wang, G. Future drought in CMIP6 projections and the socioeconomic impacts in China. Int. J. Climatol. 2021, 41, 4151–4170.
Fagariba, C.J.; Soule Baoro, S.K.G.; Song, S. Climate change adaptation strategies and constraints in Northern Ghana: Evidence of farmers in Sissala West District. Sustainability 2018, 10, 1484.
Li, Y.; Wang, M.; Ye, W.; Yan, X. Climate change and drought: A risk assessment of crop-yield impacts. Clim. Res. 2009, 39, 31–46.
Ilgar, R. Çanakkale ilinin sulak alanları [Wetlands of the Çanakkale Province]. Strat. Ve Sos. Araştırmalar Derg. 2021, 5, 613–629.
Gökhan, C.; Taş, İ. Sulama alanlarına saptırılan sulama suyunun yeterlilik durumu: Çanakkale–Biga Bakacık Barajı örneği [Adequacy of irrigation water diverted to irrigation areas: The case of the Çanakkale–Biga Bakacık Dam]. Turk. J. Agric. Nat. Sci. 2024, 11, 463–474.
Mucan, U.; Yıldırım, M. Drought analysis of Çanakkale Province based on long-term climate data. ÇOMÜ J. Agric. Fac. 2023, 11, 339–350.
Taşova, H.; Akın, A. Marmara bölgesi topraklarının bitki besin maddesi kapsamlarının belirlenmesi, veri tabanının oluşturulması ve haritalanması [Determination of the plant nutrient contents of the soils of the Marmara region, establishment of a database, and mapping]. T.C. Gıda Tarım ve Hayvancılık Bakanlığı, Toprak Su Dergisi 2013, 2, 83–95.
Palmer, W.C. Meteorological Drought; Weather Bureau Research Paper No. 45; U.S. Department of Commerce: Washington, DC, USA, 1965; 58p.
Wells, N.; Goddard, S.; Hayes, M.J. A self-calibrating Palmer Drought Severity Index. J. Clim. 2004, 17, 2335–2351.
Thornthwaite, C.W. An approach toward a rational classification of climate. Geogr. Rev. 1948, 38, 55–94.
Crow, W.T.; Liu, Q.; Xia, Y.; Chen, F.; Reichle, R.H. Exploiting soil moisture, precipitation and streamflow observations to evaluate soil moisture/runoff coupling in land surface models. Geophys. Res. Lett. 2018, 45, 4869–4878.
Kendy, E.; Walter, M.T.; Gérard-Marchant, P.; Zhang, Y.; Liu, C.; Steenhuis, T.S. A soil-water-balance approach to quantify groundwater recharge from irrigated cropland in the North China Plain. Hydrol. Process. 2003, 17, 2011–2031.
Borji, M.; Malekian, A.; Salajegheh, A.; Ghadimi, M. Multi-time-scale analysis of hydrological drought forecasting using support vector regression (SVR) and artificial neural networks (ANN). Arab. J. Geosci. 2016, 9, 725.
Fung, K.F.; Koo, C.H.; Mirzaei, M.; Huang, Y.F. Improved SVR machine learning models for agricultural drought prediction at downstream of Langat River Basin, Malaysia. J. Water Clim. Change 2019, 11, 1383–1398.
Li, Z.; Xia, G.; Chi, D.; Wu, Q.; Chen, T. Application of penalized linear regression and ensemble methods for drought forecasting in Northeast China. Meteorol. Atmos. Phys. 2019, 132, 113–130.
Awad, M.; Khanna, R. Support vector regression. In Efficient Learning Machines; Apress: Berkeley, CA, USA, 2015; pp. 67–80.
Tuia, D.; Camps-Valls, G.; Verrelst, J.; Alonso, L.; Perez-Cruz, F. Multioutput support vector regression for remote sensing biophysical parameter estimation. IEEE Geosci. Remote Sens. Lett. 2011, 8, 804–808.
Xu, J.; Geng, Z.; Ma, L.; Jiang, W.; Li, M.; Yu, Z. Augmented time-delay twin support vector regression-based behavioral modeling for digital predistortion of RF power amplifier. IEEE Access 2019, 7, 59832–59843.
Danandeh Mehr, A.; Jabarnejad, M.; Safari, M.J.S.; Nourani, V.; Torabi Haghighi, A. A new evolutionary hybrid random forest model for SPEI forecasting. Water 2022, 14, 755.
Salman, H.A.; Steiti, A.; Kalakech, A. Random forest algorithm overview. Babylonian J. Mach. Learn. 2024, 2024, 69–79.
Nyirandayisabye, R.; Li, H.; Dong, Q.; Hakuzweyezu, T.; Nkinahamira, F. Automatic pavement damage predictions using various machine learning algorithms: Evaluation and comparison. Results Eng. 2022, 16, 100657.
Imani, M.; Arabnia, H.R. Hyperparameter optimization and combined data sampling techniques in machine learning for customer churn prediction: A comparative analysis. Technologies 2023, 11, 167.
Mienye, I.D.; Jere, N. Optimized ensemble learning approach with explainable AI for improved heart disease prediction. Information 2024, 15, 394.
Alazba, A.; Aljamaan, H. Software Defect Prediction Using Stacking Generalization of Optimized Tree-Based Ensembles. Appl. Sci. 2022, 12, 4577.
Kumar, P.S.; Kumari, K.A.; Mohapatra, S.; Naik, B.; Nayak, J.; Mishra, M. CatBoost ensemble approach for diabetes risk prediction at early stages. In Proceedings of the IEEE Odisha International Conference on Electrical Power Engineering, Communication and Computing Technology (ODICON), Bhubaneswar, India, 8–9 January 2021.
Sahin, E.K. Comparative analysis of gradient boosting algorithms for landslide susceptibility mapping. Geocarto Int. 2020, 35, 2441–2465.
Boldini, D.; Grisoni, F.; Kuhn, D.; Friedrich, L.; Sieber, S.A. Practical guidelines for the use of gradient boosting for molecular property prediction. J. Cheminform. 2023, 15, 73.
Su, Y.; Jiang, X. Prediction of tide level based on variable weight combination of LightGBM and CNN-BiGRU model. Sci. Rep. 2023, 13, 9.
Xia, H.; Lv, H.; Gao, Y.; Wei, X. Traffic prediction based on ensemble machine learning strategies with bagging and LightGBM. In Proceedings of the IEEE International Conference on Communications Workshops (ICC Workshops), Shanghai, China, 20–24 May 2019; pp. 1–6.
Cai, R.; Xie, S.; Xu, D.; He, Y.; Yang, R.; Wang, B. Wind speed forecasting based on extreme gradient boosting. IEEE Access 2020, 8, 175063–175069.
Abhinaya, P.; Ozer, O.; Reddy, C.K.K.; Ranjan, A. Explicit monitoring and prediction of hailstorms with XGBoost classifier for sustainability. In Machine Learning for Sustainable Development; IGI Global: Hershey, PA, USA, 2024; pp. 107–132.
Gianola, D.; Schön, C.C. Cross-validation without doing cross-validation in genome-enabled prediction. G3 Genes Genomes Genet. 2016, 6, 3107–3128.
Tsamardinos, I.; Greasidou, E.; Borboudakis, G. Bootstrapping the out-of-sample predictions for efficient and accurate cross-validation. Mach. Learn. 2018, 107, 1895–1922.
Varoquaux, G.; Colliot, O. Evaluating machine learning models and their diagnostic value. In Artificial Intelligence in Medicine; Springer: New York, NY, USA, 2023; pp. 601–630.
Tang, H.; Shi, P.; Qu, S.; Wen, T.; Li, Q.; Zhao, L. Analysis of characteristics of hydrological and meteorological drought evolution in Southwest China. Water 2021, 13, 1846.
Xiang, Y.; Bai, Y.; Chen, Y.; Wang, Y.; Zhang, Q.; Zhang, L. Hydrological drought risk assessment using a multi-dimensional copula function approach in arid inland basins, China. Water 2020, 12, 1888.
Li, Y.; Huang, Y.; Li, Y.; Zhang, H.; Fan, J.; Deng, Q.; Wang, X. Spatiotemporal heterogeneity in meteorological and hydrological drought patterns and propagations influenced by climatic variability, LULC change, and human regulations. Sci. Rep. 2024, 14, 56526.
Shapley, L.S. A value for n-person games. In Contributions to the Theory of Games; Kuhn, H.W., Tucker, A.W., Eds.; Princeton University Press: Princeton, NJ, USA, 1953; Volume II, pp. 307–317.
Lundberg, S.M.; Erion, G.; Chen, H.; DeGrave, A.; Prutkin, J.M.; Nair, B.; Katz, R.; Himmelfarb, J.; Bansal, N.; Lee, S.I. From local explanations to global understanding with explainable AI for trees. Nat. Mach. Intell. 2020, 2, 56–67.
Babre, A.; Kalvāns, A.; Avotniece, Z.; Retiķe, I.; Bikše, J.; Popovs, K.; Jemeljanova, M.; Zelenkevičs, A.; Dēliņa, A. The use of predefined drought indices for the assessment of groundwater drought episodes in the Baltic States over the period 1989–2018. J. Hydrol. Reg. Stud. 2022, 40, 101049.
Bayissa, Y.; Van Andel, S.; Tadesse, T.; Van Griensven, A.; Moges, S.; Maskey, S.; Solomatine, D. Comparison of the performance of six drought indices in characterizing historical drought for the Upper Blue Nile Basin, Ethiopia. Geosciences 2018, 8, 81.
Cammalleri, C.; Vogt, J.; Salamon, P. Development of an operational low-flow index for hydrological drought monitoring over Europe. Hydrol. Sci. J. 2016, 62, 346–358.
Rodríguez-Blanco, M.L.; Taboada-Castro, M.M.; Taboada-Castro, M.T. Rainfall–runoff response and event-based runoff coefficients in a humid area (northwest Spain). Hydrol. Sci. J. 2012, 57, 445–459.
Abu Arra, A.; Şişman, E.; Gazioğlu, Ş.A.; Birpınar, M.E. Critical drought characteristics: A new concept based on dynamic time period scenarios. Atmosphere 2024, 15, 768.
Lohani, V.K.; Loganathan, G.V.; Mostaghimi, S. Long-term analysis and short-term forecasting of dry spells by Palmer drought severity index. Hydrol. Res. 1998, 29, 21–40.
Xu, L.; Du, W.; Zhang, C.; Chen, N.; Yu, H.; Wu, T.; Zhang, X. Global prediction of flash drought using machine learning. Geophys. Res. Lett. 2024, 51, e2024GL111134.
Ekmekcioğlu, Ö. Drought forecasting using integrated variational mode decomposition and extreme gradient boosting. Water 2023, 15, 3413.
Tanrıverdi, İ.; Batmaz, İ. AI-driven U.S. drought prediction using machine learning and deep learning. Clim. Dyn. 2025, 63, 249.
Das, P.; Zhang, Z.; Ghosh, S.; Hang, R. A hybrid ensemble learning merging approach for enhancing the super drought computation over Lake Victoria Basin. Sci. Rep. 2024, 14, 13870.
Melese, T.E.; Assefa, G.; Terefe, B.; Belay, T.; Bayable, G.; Senamew, A. Machine learning-based drought prediction using Palmer drought severity index and TerraClimate data in Ethiopia. PLoS ONE 2025, 20, e0326174.
Li, M.; Yao, Y.; Feng, Z.; Ou, M. Hydrological drought prediction and its influencing features analysis based on a machine learning model. Nat. Hazards Earth Syst. Sci. 2025, 25, 4299–4316.
Başağaoğlu, H.; Sharma, C.; Chakraborty, D.; Yoosefdoost, I.; Bertetti, F.P. Heuristic data-inspired scheme to characterize meteorological and groundwater droughts in a semi-arid karstic region under a warming climate. J. Hydrol. Reg. Stud. 2023, 48, 101481.
Dikici, M. Drought analysis with different indices for the Asi Basin (Turkey). Sci. Rep. 2020, 10, 20739.
Kadapala, B.K.R.; Farsana, M.A.; Vimala, C.H.G.; Joshi, S.; Hakeem, K.A.; Raju, P.V. A grid-wise approach for accurate computation of standardized runoff index (SRI). Sci. Total Environ. 2024, 946, 174472.
Vasiliades, L.; Loukas, A. Hydrological response to meteorological drought using the Palmer drought indices in Thessaly, Greece. Desalination 2009, 237, 3–21.
Araújo, L.M., Jr.; Souza Filho, F.A.; Cid, D.A.C.; Oliveira da Silva, S.M.; Silveira, C.S. Avaliação de índices de seca meteorológica e hidrológica em relação ao impacto de acumulação de água em reservatório: Um estudo de caso para o reservatório de Jucazinho-PE [Evaluation of meteorological and hydrological drought indices in relation to the impact on reservoir water storage: A case study for the Jucazinho-PE reservoir]. Rev. AIDIS Ing. Cienc. Ambient. 2020, 13, 382–398.
Cammalleri, C.; Acosta Navarro, J.C.; Bavera, D.; De Jager, A.; Barbosa, P.; Vogt, J. An event-oriented database of meteorological droughts in Europe based on spatio-temporal clustering. Sci. Rep. 2023, 13, 3145.
Kulkarni, S.; Sawada, Y. Near-global agro-climatological drought monitoring dataset. Sci. Data 2025, 12, 2038.
Lee, S.; Moriasi, D.N.; Mehr, A.D.; Mirchi, A. Sensitivity of standardized precipitation evapotranspiration index (SPEI) to the choice of probability distribution and evapotranspiration method. J. Hydrol. Reg. Stud. 2024, 53, 101761.
Brust, C.; Kimball, J.S.; Maneta, M.P.; Jencso, K.; Reichle, R.H. DroughtCast: A machine learning forecast of the United States drought monitor. Front. Big Data 2021, 4, 773478.
Wang, Y.; Zhang, C.; Meng, F.-R.; Bourque, C.P.-A.; Zhang, C. Evaluation of the suitability of six drought indices in naturally growing, transitional vegetation zones in Inner Mongolia (China). PLoS ONE 2020, 15, e0233525.

Figure 1. Location map of the study area showing the topography, river network, dam reservoirs, and the two meteorological stations (17112 Çanakkale and 18084 Biga).

Figure 2. Comparative display of the monthly reservoir storage volume time series for the Atikhisar and Bakacak dams.

Figure 3. Out-of-fold (OOF) prediction performance for the Palmer Z-Index at Z(t) and Z(t + 1) for the Çanakkale (17112) and Biga (18084) stations.

Figure 4. SHAP summary plots for Z(t) and Z(t + 1) predictions at the Çanakkale Central and Biga meteorological stations.

Figure 5. Time series comparison of the observed Palmer Z-Index with the machine learning-based current-month Z(t) and next-month Z(t + 1) predictions at the Çanakkale and Biga stations, with drought events marked.

Table 1. Consolidated feature engineering across modeling pipelines used in this study.

Feature Category	HP Tree Pipeline (CatBoost + GB Tuning)	Multi-Model ML Pipeline (Linear–SVR–RF–GB–XGB)	Fast LightGBM Pipeline
Base meteorological variables	Precipitation, temperature, humidity, pressure, wind speed	Same	Same
Lag features	Lags of 1, 3, 6, and 12 months for each base variable	Lags of 1, 2, 3, 6, and 12 months for each base variable	Lags of 1, 3, 6, and 12 months for each base variable
Rolling window features	Rolling mean over 3, 6, and 12 months, computed on shift(1) values to prevent information leakage	Rolling sum and rolling mean over 3, 6, and 12 months	Rolling mean over 3, 6, and 12 months
Seasonal encoding	Month-of-year encoding: month_sin, month_cos (+ month as integer)	month_sin, month_cos	month_sin, month_cos
Long-term trend feature	Not included	Calendar year included as a proxy for long-term trend	Not included
Target lead formulation	Targets defined as Z(t) → Z(t + 1)	Targets defined as Z(t) → Z(t + 1)	Targets defined as Z(t) → Z(t + 1)
Normalization/scaling	Not applied (tree-based models)	StandardScaler applied only to linear regression, Elastic Net, and SVR	Not applied (tree-based models)
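
The feature categories in Table 1 can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names (`lag`, `rolling_mean_shifted`, `month_encoding`, `build_rows`), the column names, and the use of precipitation as the only base variable are assumptions made for illustration. The rolling mean is taken over values strictly before the current month, which mirrors the shift(1) safeguard against information leakage described in the table.

```python
import math

def lag(values, k):
    """Return the series shifted by k steps; the first k entries are None."""
    return ([None] * k + list(values))[:len(values)]

def rolling_mean_shifted(values, window):
    """Mean of the `window` values strictly before each time step.

    Mirrors a shift-by-one followed by a rolling mean, so the feature at
    time t never sees the observation at time t.
    """
    out = []
    for t in range(len(values)):
        if t < window:
            out.append(None)  # not enough history yet
        else:
            out.append(sum(values[t - window:t]) / window)
    return out

def month_encoding(month):
    """Map month 1..12 onto the unit circle so December sits next to January."""
    angle = 2.0 * math.pi * month / 12.0
    return math.sin(angle), math.cos(angle)

def build_rows(precip, months, z_index, lags=(1, 3), windows=(3,)):
    """Assemble one feature dict per month with next-month Z as the target.

    Hypothetical helper: the real study uses several base variables and
    longer lag/window sets (see Table 1).
    """
    lagged = {k: lag(precip, k) for k in lags}
    rolled = {w: rolling_mean_shifted(precip, w) for w in windows}
    rows = []
    for t in range(len(precip) - 1):  # the last month has no Z(t + 1) target
        m_sin, m_cos = month_encoding(months[t])
        row = {"precip": precip[t], "month_sin": m_sin, "month_cos": m_cos}
        for k in lags:
            row[f"precip_lag{k}"] = lagged[k][t]
        for w in windows:
            row[f"precip_roll{w}"] = rolled[w][t]
        row["target_z_next"] = z_index[t + 1]
        if all(v is not None for v in row.values()):
            rows.append(row)  # keep only rows with a full feature history
    return rows

rows = build_rows(
    precip=[10, 20, 30, 40, 50, 60],
    months=[1, 2, 3, 4, 5, 6],
    z_index=[0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
)
```

Rows without a complete lag or rolling history are dropped here; a real pipeline might instead impute them or leave the handling to the model.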

Table 2. Augmented Dickey–Fuller (ADF) test results for the Z-Index series (Çanakkale and Biga stations).

Station	Series	ADF Statistic	p Value	Lags	Observations	1% Critical Value	5% Critical Value	10% Critical Value	Stationary (p < 0.05)
Çanakkale	Z (original)	−30.674	<0.001	0	1019	−3.4368	−2.8644	−2.5683	Yes
Çanakkale	ΔZ (first difference)	−12.156	<0.001	20	998	−3.4369	−2.8644	−2.5683	Yes
Biga	Z (original)	−20.832	<0.001	0	503	−3.4434	−2.8673	−2.5698	Yes
Biga	ΔZ (first difference)	−11.893	<0.001	11	491	−3.4437	−2.8674	−2.5699	Yes

Table 3. Time series cross-validated performance metrics of the machine learning models for Palmer Z-Index prediction, computed from out-of-fold (OOF) predictions on the held-out test folds.

Station	Target	Model	CV R² (±std)	OOF R²	OOF RMSE	OOF MAE
Çanakkale	Z(t)	Gradient Boosting	0.857 ± 0.104	0.841	0.731	0.451
		XGBoost	0.850 ± 0.112	0.833	0.748	0.462
		CatBoost	0.836 ± 0.096	0.821	0.776	0.472
		Elastic Net	0.793 ± 0.058	0.801	0.821	0.578
		Linear Regression	0.782 ± 0.080	0.792	0.835	0.601
		LightGBM	0.802 ± 0.127	0.783	0.854	0.511
		Random Forest	0.766 ± 0.136	0.744	0.926	0.559
		SVR	0.704 ± 0.139	0.688	1.023	0.684
	Z(t + 1)	Gradient Boosting	0.843 ± 0.107	0.828	0.762	0.491
		XGBoost	0.834 ± 0.109	0.817	0.784	0.521
		CatBoost	0.833 ± 0.096	0.815	0.745	0.452
		Elastic Net	0.805 ± 0.055	0.813	0.793	0.566
		Linear Regression	0.797 ± 0.063	0.805	0.811	0.591
		LightGBM	0.789 ± 0.121	0.771	0.877	0.555
		Random Forest	0.741 ± 0.145	0.721	0.969	0.634
		SVR	0.690 ± 0.125	0.677	1.042	0.711
Biga	Z(t)	Elastic Net	0.800 ± 0.053	0.8061	0.918	0.637
		Gradient Boosting	0.795 ± 0.071	0.7979	0.937	0.668
		XGBoost	0.788 ± 0.072	0.7928	0.949	0.675
		CatBoost	0.785 ± 0.073	0.7891	0.957	0.671
		Linear Regression	0.769 ± 0.075	0.7773	0.984	0.706
		Random Forest	0.702 ± 0.059	0.7067	1.129	0.796
		LightGBM	0.678 ± 0.226	0.6853	1.171	0.852
		SVR	0.583 ± 0.143	0.5865	1.341	0.993
	Z(t + 1)	Elastic Net	0.813 ± 0.054	0.8199	0.884	0.612
		Gradient Boosting	0.796 ± 0.052	0.8017	0.928	0.657
		Linear Regression	0.794 ± 0.060	0.8014	0.927	0.672
		XGBoost	0.792 ± 0.064	0.7965	0.939	0.673
		CatBoost	0.795 ± 0.040	0.7958	0.941	0.664
		Random Forest	0.699 ± 0.051	0.7059	1.129	0.794
		LightGBM	0.671 ± 0.255	0.6786	1.181	0.856
		SVR	0.573 ± 0.139	0.5734	1.361	1.009
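
Table 3 reports two kinds of scores: a per-fold R² summarized as mean ± standard deviation, and metrics computed once on the pooled out-of-fold predictions. The sketch below shows how the two can differ in construction. It is an illustration under stated assumptions, not the study's code: the expanding-window split scheme, the number of folds, the population standard deviation, and the `fit_predict` callback are all hypothetical choices.

```python
import math
import statistics

def expanding_window_splits(n_samples, n_splits):
    """Yield (train_idx, test_idx) pairs where training always precedes testing."""
    test_size = n_samples // (n_splits + 1)
    if test_size == 0:
        raise ValueError("too few samples for the requested number of splits")
    first_test = n_samples - n_splits * test_size
    for start in range(first_test, n_samples, test_size):
        yield list(range(start)), list(range(start, start + test_size))

def r2(y_true, y_pred):
    """Coefficient of determination (undefined if y_true is constant)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean_y) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

def evaluate_oof(y, fit_predict, n_splits=5):
    """Score a model fold by fold and on the pooled out-of-fold predictions.

    `fit_predict(train_idx, test_idx)` is a hypothetical callback that fits
    on the training indices and returns predictions for the test indices.
    """
    fold_r2, oof_true, oof_pred = [], [], []
    for train_idx, test_idx in expanding_window_splits(len(y), n_splits):
        preds = fit_predict(train_idx, test_idx)
        truth = [y[i] for i in test_idx]
        fold_r2.append(r2(truth, preds))
        oof_true.extend(truth)
        oof_pred.extend(preds)
    return {
        "cv_r2_mean": statistics.mean(fold_r2),
        "cv_r2_std": statistics.pstdev(fold_r2),
        "oof_r2": r2(oof_true, oof_pred),
        "oof_rmse": rmse(oof_true, oof_pred),
        "oof_mae": mae(oof_true, oof_pred),
    }
```

Because the pooled score weights every held-out month equally while the fold mean weights every fold equally, the two R² columns need not agree, which is consistent with the small gaps between them in Table 3.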

Table 4. Temporal lead times and event detection performance of the Palmer Z-Index compared to hydrological drought indicators (SRI₁, SRI₃, and SRI₆).

Station/Dam	Index Pair	Average Lead (Months)	Median Lead (Months)	Previous Capture Rate (%)	Number of Cases (Z → SRI)
Çanakkaleonu—Atikhisar	Z → SRI₁	5.53	5.0	86.7	15/46
	Z → SRI₃	5.19	5.0	81.3	16/46
	Z → SRI₆	6.07	6.5	92.9	14/46
Biga—Bakacak	Z → SRI₁	6.46	7.0	95.8	24/48
	Z → SRI₃	6.39	6.0	95.7	23/48
	Z → SRI₆	6.20	7.0	95.0	20/48

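Table 5. Category-based validation scores for Palmer Z-Index forecasts in different drought severity classes at the Çanakkale Central and Biga stations: probability of detection (POD), false alarm ratio (FAR), and critical success index (CSI).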

Station	Category	Series	POD	FAR	CSI
Çanakkale	Z ≤ −1 (Mild)	Z(t)	0.777	0.148	0.685
	Z ≤ −1 (Mild)	Z(t + 1)	0.742	0.189	0.632
	Z ≤ −2 (Severe)	Z(t)	0.571	0.217	0.493
	Z ≤ −2 (Severe)	Z(t + 1)	0.597	0.159	0.536
Biga	Z ≤ −1 (Mild)	Z(t)	0.740	0.149	0.655
	Z ≤ −1 (Mild)	Z(t + 1)	0.725	0.103	0.669
	Z ≤ −2 (Severe)	Z(t)	0.600	0.500	0.375
	Z ≤ −2 (Severe)	Z(t + 1)	0.680	0.452	0.436
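
The scores in Table 5 follow from a contingency count of hits, misses, and false alarms at a given Z threshold. The sketch below uses the standard definitions POD = hits / (hits + misses), FAR = false alarms / (hits + false alarms), and CSI = hits / (hits + misses + false alarms). The function name `detection_scores` and the sample values are hypothetical, and the study's exact event-matching rules may differ.

```python
def detection_scores(observed, predicted, threshold):
    """Compute POD, FAR and CSI for events defined as value <= threshold."""
    hits = misses = false_alarms = 0
    for obs, pred in zip(observed, predicted):
        obs_event = obs <= threshold
        pred_event = pred <= threshold
        if obs_event and pred_event:
            hits += 1          # drought observed and predicted
        elif obs_event:
            misses += 1        # drought observed but not predicted
        elif pred_event:
            false_alarms += 1  # drought predicted but not observed

    def ratio(num, den):
        # Return NaN when the score is undefined (no relevant cases).
        return num / den if den else float("nan")

    return {
        "POD": ratio(hits, hits + misses),
        "FAR": ratio(false_alarms, hits + false_alarms),
        "CSI": ratio(hits, hits + misses + false_alarms),
    }

# Illustrative values only, not data from the study.
observed = [-2.5, -1.2, 0.3, -1.5, 0.8, -0.4]
predicted = [-2.1, -0.8, -1.1, -1.6, 0.5, 0.1]
scores = detection_scores(observed, predicted, threshold=-1.0)
```

With these definitions, CSI = 1 / (1/POD + 1/(1 − FAR) − 1). The reported values are consistent with this identity: for example, POD = 0.600 and FAR = 0.500 give CSI = 0.375, as listed for severe drought at Biga for Z(t).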

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

