Leveraging Sentinel-2 Data and Machine Learning for Drought Detection in India: The Process of Ground Truth Construction and a Case Study

Sharma, Shubham Subhankar; Mukherjee, Jit; Dell’Acqua, Fabio

doi:10.3390/rs17183159


by Shubham Subhankar Sharma, Jit Mukherjee and Fabio Dell’Acqua *

Department of Electrical, Computer & Biomedical Engineering, University of Pavia, 27100 Pavia, Italy

* Author to whom correspondence should be addressed.

Remote Sens. 2025, 17(18), 3159; https://doi.org/10.3390/rs17183159

Submission received: 27 June 2025 / Revised: 25 August 2025 / Accepted: 8 September 2025 / Published: 11 September 2025

(This article belongs to the Special Issue Deep Learning and Foundation Models: Advancing Remote Sensing Applications)


Highlights

What are the main findings?

Ensemble learning models trained on Sentinel-2 multispectral indices reliably classified regional drought conditions in India during the Rabi season, with Bagging Classifier and Random Forest yielding accuracies above 83%, and seasonal majority voting raising performance to 94%.
SHAP-based feature attribution consistently identified the Normalized Multi-band Drought Index (NMDI) and Day of the Season (DOS) as dominant predictors, with RECI, EVI, NDMI, and RDI emerging as additional key contributors across models.

What is the implication of the main finding?

Integrating multispectral drought-sensitive indices with ensemble classifiers provides a scalable and robust methodological framework for regional drought detection and monitoring, complementing conventional ground-based drought assessments.
Feature importance rankings demonstrate that vegetation stress and soil-moisture–related indices are central for model generalization, offering transferable insights for agricultural risk management and operational drought early warning systems.

Abstract

Droughts significantly impact agriculture, water resources, and ecosystems. Their timely detection is essential for implementing effective mitigation strategies. This study explores the use of multispectral Sentinel-2 remote sensing indices and machine learning techniques to detect drought conditions in three distinct regions of India, namely Jodhpur, Amravati, and Thanjavur, during the Rabi season (October–April). Twelve remote sensing indices were studied to assess different aspects of vegetation health, soil moisture, and water stress, and their possible joint use and influence as indicators of regional drought events. Reference data used to define drought conditions in each region were primarily sourced from official government drought declarations and regional and national news publications, which provide seasonal maps of drought conditions across the country. Based on this information, a district vs. year (3 × 10) ground truth table was created, indicating the presence or absence of drought (Drought/No Drought) for each region across the ten-year period. Using this ground truth table, we extended the remote sensing dataset by adding a binary drought label for each observation: 1 for “Drought” and 0 for “No Drought”. The dataset is organized by year (2016–2025) in a two-dimensional format, with indices as columns and observations as rows. Each observation represents a single measurement of the remote sensing indices. This enriched dataset serves as the foundation for training and evaluating machine learning models aimed at classifying drought conditions based on spectral information. The resulting dataset was used to predict drought events through various machine learning models, including Random Forest, XGBoost, Bagging Classifier, and Gradient Boosting. Among the models, XGBoost achieved the highest accuracy (84.80%), followed closely by the Bagging Classifier (83.98%) and Random Forest (82.98%). 
In terms of precision, Bagging Classifier and Random Forest performed comparably (82.31% and 81.45%, respectively), while XGBoost achieved a precision of 81.28%. We applied a seasonal majority voting strategy, assigning a final drought label for each region and Rabi season based on the majority of predicted monthly labels. Using this method, XGBoost and Bagging Classifier achieved 96.67% accuracy, precision, and recall, while Random Forest and Gradient Boosting reached 90% and 83.33%, respectively, across all metrics. Shapley Additive Explanation (SHAP) analysis revealed that Normalized Multi-band Drought Index (NMDI) and Day of Season (DOS) consistently emerged as the most influential features in determining model predictions. This finding is supported by the Borda Count and Weighted Sum analyses, which ranked NMDI and DOS as the top features across all models. Additionally, Red-edge Chlorophyll Index (RECI), Normalized Difference Water Index (NDWI), Normalized Difference Moisture Index (NDMI), and Ratio Drought Index (RDI) were identified as important features contributing to model performance. These features help reveal the underlying spatiotemporal dynamics of drought indicators, offering interpretable insights into model decisions. To evaluate the impact of feature selection, we further conducted a feature ablation study. We trained each model using different combinations of top features: Top 1, Top 2, Top 3, Top 4, and Top 5. The performance of each model was assessed based on accuracy, precision, and recall. XGBoost demonstrated the best overall performance, especially when using the Top 5 features.

Keywords:

copernicus; agricultural applications; Sentinel-2; SHAP; drought detection; Borda Count; XGBoost; India; machine learning; remote sensing indices; bagging classifier

1. Introduction

Droughts are naturally occurring climatic events that play a significant role in shaping ecosystems by influencing species adaptation, water availability, and vegetation dynamics. Despite their ecological importance, droughts frequently lead to severe consequences for human populations, including widespread suffering, loss of livelihoods, and adverse environmental and economic impacts [1]. These effects are particularly pronounced in the agricultural sector, where reduced water availability and declining soil moisture levels directly threaten crop yields and food security. Droughts therefore pose a significant threat to India’s agricultural sector, which is highly sensitive to variations in rainfall and water availability. Recent studies have shown that drought in India has affected resource availability (especially water and food), population dynamics, and ecological balance [2,3]. Effective drought monitoring is critical for ensuring food security and minimizing economic losses, particularly in a country where agriculture supports a substantial portion of the population. The principal cause of drought is a deficit of precipitation [4]. Droughts can be categorized as meteorological (deficit in precipitation), hydrological (deficit in groundwater or total water storage), agricultural (deficit in soil moisture), or socioeconomic (impact of drought conditions on socioeconomic goods) [5]. These categories are highly correlated. The transition from meteorological drought to agricultural drought is characterized by a significant decline in soil moisture levels compared to normal conditions [6]. A further prolonged deficit in soil moisture may lead to hydrological drought [6]. Agricultural drought, in turn, has both direct and indirect impacts on food-related industries, ultimately weakening the overall economy of a country [7]. India is an agrarian country with a diverse climate and socio-economic structure. 
Different parts of India have endured multiple agricultural droughts lasting between 5 and 17 years, leading to severe famine and widespread loss of human and livestock populations in the last hundred years [7]. Several regions of India, especially the Bundelkhand region, have faced severe hydrological droughts [8]. Ref. [8] provides an elaborate review of the effects of droughts on various factors of society, such as population density, agricultural labor, cultivators, and others. The propagation of meteorological droughts to the agricultural droughts of India has been studied in different works through trend analysis [9]. An event-based approach is employed in [10] to study the propagation of meteorological drought to hydrological drought at the Krishna River Basin in semi-arid India. Given India’s vast and varied agricultural landscapes, accurately assessing drought conditions remains a significant challenge. The country’s diverse climatic conditions, ranging from arid and semi-arid zones to tropical and subtropical regions, influence the frequency and intensity of droughts. Additionally, varying cropping patterns across different agro-climatic zones further complicate drought monitoring efforts. The lack of comprehensive, consolidated ground truth data for validation adds another layer of complexity, making it difficult to establish reliable drought assessment models. Addressing these challenges requires analyzing drought conditions through different modalities, such as satellite-based remote sensing, climate modeling, and on-ground observations to enhance drought detection and monitoring across India’s agricultural sector. Different multi-modal analyses using satellite data, meteorological indices, soil moisture records, and machine learning techniques have been studied for agricultural drought monitoring of India [11]. 
In [12], forty parameters spanning meteorological, geo-environmental, social, and remote-sensing data are considered for identifying drought-prone areas and informing targeted mitigation and policy decisions. However, agricultural drought detection that applies machine learning to satellite-derived indices alone, and that examines the significance of those indices, remains largely underexplored.

This gap is critical from an application-oriented perspective, as such spectral indices can be computed from open-access satellite data such as Landsat, Sentinel, and others. This has the potential to offer a low-cost, scalable solution for drought detection in resource-constrained regions. A regional study in Tamil Nadu, India, uses different spectral indices, including NDVI, NMDI, NDWI, and others, to effectively capture vegetation and moisture variation, reinforcing the value of spectral indicators in drought analysis [13]. NMDI and NDWI are used in [14] for mapping drought conditions in Vietnam. Such studies showcase the feasibility of relying solely on remote sensing-based indices from Sentinel-2 for drought detection. Further, ref. [15] shows that no single index works universally, and multi-indicator strategies are necessary for drought detection. Thus, the proposed work tries to identify and aggregate the most influential features, a problem that is currently somewhat underexplored in the literature. Results from this study have the potential to advance machine learning-based techniques beyond algorithmic performance, providing agronomically interpretable insights by identifying which indices consistently signal drought across diverse agro-climatic zones. Such application-oriented contributions remain limited in existing studies and can provide substantial value for operational drought monitoring systems.

Related Work

Several studies have employed a variety of approaches for drought detection and monitoring, utilizing different methodologies such as remote sensing, climate modeling, hydrological analysis, and ground-based observations to assess drought conditions across various spatial and temporal scales. Different remote sensing technologies, such as multispectral, thermal infrared, and microwave data, are widely employed to retrieve key drought indicators such as precipitation [16], soil moisture [5], and evaporation [17]. Insufficient precipitation can significantly impact plant health by reducing photosynthetic capacity [5]. When water availability decreases, plants experience physiological stress, which results in lower photosynthetic activity. This reduction affects the absorption of solar radiation in photosynthetically active wavelengths, such as visible and near-infrared. Thus, vegetation may exhibit changes in spectral reflectance, which can be detected through remote sensing techniques [5]. Therefore, drought detection and monitoring have been increasingly conducted using multiple indicators derived from satellite imagery. Normalized difference vegetation index (NDVI) has been proven to be an effective indicator of vegetation stress; however, it may not provide the underlying cause of the stress, which may relate to different factors such as plant disease, flood, and others [18]. Hence, NDVI or vegetation indices, along with other indices such as land surface temperature [19], are applied for drought monitoring. Short-wave infrared bands are also found to be susceptible to soil moisture and leaf water content [5]. Thus, Normalized Difference Water Index (NDWI) and combinations of NDWI and NDVI are studied to detect drought monitoring [20]. Different climate-based indices and biophysical parameters, along with vegetation conditions, are used to form an index named Vegetation Drought Response Index [21]. 
However, it is complex to compute and its performance varies widely in different regions [21]. Several indices are proposed for the identification and monitoring of plant stress and subsequently drought such as Crop Water Stress Index (CWSI) [22], Water Deficit Index (WDI) [23], Evaporative Stress Index (ESI) [24], Drought Severity Index (DSI) [25], and several others [5]. Multivariate drought indices are typically based on four factors such as vegetation health, soil moisture levels, hydroclimatic variables, and crop stress status [7]. One such combined drought index using hydro-climatic and biophysical variables which indicate anomalies in soil moisture conditions, rainfall, and crop-sown area progression, is proposed using synthetic aperture radar (SAR) images of Sentinel for monitoring early-season agricultural drought in South Asia [26]. A regional agricultural drought index (RegCDI) based on crop water stress, soil moisture deficits, and vegetation health is used in the detection of a regional drought of India [7]. Such indices have been proven insightful in various circumstances. However, traditional drought indices based assessments often struggle to adapt across diverse environmental conditions, which may lead to inconsistencies in detection and prediction. As an example, NDVI has certain limitations in detecting early-season drought for reduced sensitivity to fluctuations in soil moisture, and a delayed response to rainfall variations [26]. The challenges associated with generalization and sensitivity to various atmospheric and climatic factors create an opportunity for the application of machine learning in drought monitoring. With their ability to analyze complex, non-linear relationships, machine learning techniques offer a promising approach to enhance the accuracy and robustness of drought monitoring by integrating multiple data sources, adjusting for regional variations, and improving predictive capabilities.

Sentinel-2 has been found instrumental to detect the drought related land surface changes [27,28]. However, the lack of thermal bands in Sentinel-2 and its limited direct applicability for drought monitoring impose certain constraints on its effectiveness [28]. These limitations can hinder the satellite’s ability to provide comprehensive data for drought analysis, as thermal information is crucial for assessing parameters like soil moisture and evapotranspiration. Nevertheless, these challenges are addressed through multi-modal analysis, which integrates data from multiple sources or sensors. Fusion of Landsat-8 and Sentinel-2 data has provided a significant leap in drought analysis, particularly by perpendicular drought index [29]. In [1], a multi-modal analysis is conducted by integrating radar-based surface moisture estimation and multi-spectral vegetation indices derived from Sentinel-1, Sentinel-2, and Landsat-8 imagery for the savanna ecosystem in South Africa. SAR imagery has proven to be highly effective in regions with persistent cloud cover. However, a major limitation of using SAR for drought assessment is the need to accurately parameterize surface roughness [26]. Nonetheless, the added complexity of multi-modal drought detection is hindered by two key challenges: the intricate processing required for SAR images and the discrepancy in spatial resolution when integrating Sentinel-2 data with other multi-spectral satellites like Landsat-8. These factors can complicate the analysis and limit the effectiveness of combining such datasets for drought monitoring. On the other hand, vegetation indices derived from Sentinel-2, such as NDVI and others, have demonstrated a stronger correlation with drought conditions and have yielded more reliable outcomes, as highlighted in recent studies [28]. 
Hence, this paper focuses on utilizing Sentinel-2-derived indices and harnessing the power of machine learning to enhance the accuracy and reliability of drought detection. The prolonged impact of drought can lead to substantial changes in land use and land cover (LULC). Machine learning techniques have been applied in the literature to spatio-temporal LULC analysis in order to detect drought and visualize its transformative effects over time [28,30]. The focus area of this work is the agricultural regions in different parts of India. The high spatial resolution of Sentinel-2, ranging from 10 to 20 m, is well-suited for monitoring small-scale field crops, which are prevalent across India [31].

Multiple linear regression (MLR), long short-term memory (LSTM), and Random Forest (RF) have been employed to detect flash drought in China [32]. Random Forest (RF) has been found most effective among different machine learning algorithms in drought stress detection of wheat [33]. Four machine learning models of RF, the Extreme Gradient Boost (XGB), the convolutional neural network (CNN), and the long-short term memory (LSTM) are used for the estimation of meteorological drought in [34]. In [35], the naïve Bayes classifier has been found to be more suitable than decision trees to characterize droughts. A deep neural network was employed in [36] to estimate soil moisture for agricultural drought monitoring in South Korea. In [37], three advanced machine learning techniques, bias-corrected Random Forest, support vector machine (SVM), and multi-layer perceptron neural network, were employed to detect and analyze agricultural drought in South-Eastern Australia. It is observed that machine learning techniques are widely used in recent paradigms in monitoring and forecasting of meteorological, hydrological, and agricultural droughts [38]. Still, the application of machine learning techniques in different climatic conditions of India through multiple spectral and temporal indices to detect drought with high accuracy is underexplored. A significant research gap is also observed in the literature in quantifying the influence of different spectral and temporal parameters on detecting drought in different climatic locations of India. This work contributes to addressing these gaps by evaluating the potential of a range of indices based on Sentinel-2 data toward drought detection in India with higher accuracy. While the study does not introduce novel machine learning algorithms, its novelty lies in the effective application and integration of existing methods to analyze Sentinel-2-derived indices and their significance for drought monitoring. 
Notably, this application-oriented perspective can improve operational drought monitoring frameworks by highlighting which indices matter most across distinct agro-climatic zones. Thus, the novelty of the proposed study lies in the advancement beyond methodological development and offering feasible insights towards agricultural planning and drought resilience strategies through identifying influential features from different machine learning algorithms. Sentinel-2 offers open multispectral and multitemporal data under the Copernicus open access scheme. These data are well-suited for detecting agricultural drought indicators such as vegetation health and water stress. By leveraging key indices from Sentinel-2 (S-2) data, we aim to develop a scalable drought detection system tailored to India’s diverse regions. In our experiments, we focused on three districts, i.e., Jodhpur (Rajasthan), Amravati (Maharashtra), and Thanjavur (Tamil Nadu), which experience distinct climatic conditions and cultivate different crops, enabling a comprehensive evaluation of drought detection during the critical Rabi season. To enhance the reliability and interpretability of the machine learning models, SHAP, i.e., Shapley Additive Explanation [39], analysis is employed to determine the weight and importance of individual features. For aggregating outputs from multiple models, the Borda Count method [40] and the Weighted Sum approach are further utilized to identify the top features that most closely relate to drought conditions. Furthermore, the top-ranked features, from the first to the fifth, are systematically evaluated to analyze their individual and collective performance trends across the study regions. 
This approach not only addresses an important challenge posed by India’s diverse agricultural landscapes but also provides relevant clues for identifying and prioritizing the factors driving drought conditions, paving the way for more targeted and effective drought mitigation strategies.

2. Preliminaries

A few established techniques are used in this work, described below for the readers’ convenience.

2.1. Remote Sensing Indices

Spectral indices are widely used with multispectral images to enhance and identify specific spectral features of a relevant land cover class. The spectral indices used in this work are briefly described below, with the rationale for including each of them in a drought identification study.

2.1.1. Normalized Difference Vegetation Index

The Normalized Difference Vegetation Index (NDVI) is a widely used metric to assess vegetation health using near-infrared (NIR) and red (Red) bands as shown in Equation (1):

NDVI = \frac{NIR - Red}{NIR + Red}

(1)

NDVI values range between [-1, 1], where higher values denote denser or healthier vegetation. NDVI is easily interpretable and widely applicable. Although it is affected by soil reflectance in sparsely vegetated areas, NDVI also performs well in extreme conditions [41].

2.1.2. Enhanced Vegetation Index

The enhanced vegetation index (EVI) is regarded as an enhanced version of the NDVI. It provides greater sensitivity in high-biomass regions, thus improving vegetation monitoring capabilities. It achieves this by decoupling the canopy background signal and reducing the impact of atmospheric interference [42]. It is demonstrated in [43] that EVI can be effectively used to monitor water stress: EVI showed a high correlation with patch pressure (Pp), a measure of leaf water status [43]. Moreover, EVI is sensitive to changes in plant water status; hence, it can detect temporary changes in leaf hydration. Its values range between [-1, 1] [44], and it is computed using the Red, NIR, and Blue bands as per Equation (2):

EVI = 2.5 \cdot (\frac{NIR - Red}{NIR + 6 \cdot Red - 7.5 \cdot Blue + 1})

(2)

2.1.3. Atmospherically Resistant Vegetation Index

The Atmospherically Resistant Vegetation Index (ARVI) is an effective tool to mitigate the effect of high atmospheric aerosol content [45] when evaluating vegetation status. ARVI is computed as shown in Equation (3):

ARVI = \frac{NIR - Red - y \cdot (Red - Blue)}{NIR + Red - y \cdot (Red - Blue)}, \quad \text{where } y = 0.1

(3)

Here, y is a coefficient tuned to compensate for the effects of atmospheric aerosols, estimated through relative values of red and blue bands [45]. Simulations show that, while ARVI features a dynamic range similar to NDVI, it is four times less sensitive to atmospheric effects than NDVI [46]. A self-correction process on the red channel is employed in ARVI to achieve this robustness [47].

2.1.4. Normalized Difference Water Index

The Normalized Difference Water Index (NDWI), which is primarily a water body index, can also be utilized to monitor drought stress in agricultural areas, providing timely information on crop quality and development [48]. It is particularly valuable for precision agriculture and forest health monitoring, as it is sensitive to changes in plant water content, which is insightful for drought detection in various vegetation types [48]. It ranges between [-1, 1] [49] and is computed as shown in Equation (4):

NDWI = \frac{Green - NIR}{Green + NIR}

(4)

2.1.5. Soil-Adjusted Vegetation Index

The Soil-Adjusted Vegetation Index (SAVI) introduces a soil brightness correction factor (L) to mitigate the soil reflectance factor in vegetation indices [50] as shown in Equation (5). It demonstrates superior stability in time-series analysis of vegetation [51]. Hence, SAVI can be found invaluable for drought detection, particularly in areas where soil background might bias other vegetation indices [51].

SAVI = \frac{(NIR - Red)}{(NIR + Red + L)} \cdot (1 + L), L = 0.8

(5)

2.1.6. Transformed Vegetative Index

The Transformed Vegetative Index (TVI) is used as an indicator for vegetational coverage and its health [52,53]. TVI can amplify subtle changes in vegetation health, which is crucial for drought monitoring [53]. TVI is computed as follows.

TVI = \sqrt{NDVI + 0.5}

(6)

2.1.7. Normalized Difference Moisture Index

The Normalized Difference Moisture Index (NDMI) is a dynamic indicator that characterizes the moisture content of vegetation, making it valuable for drought detection [54]. It has a close correlation with NDVI and effectively tracks changes in vegetation health related to water stress, providing insights into drought conditions in urban and natural environments [55]. It utilizes the first short-wave infrared (SWIR-I) band along with the NIR band as shown in Equation (7):

NDMI = \frac{NIR - \text{SWIR-I}}{NIR + \text{SWIR-I}}

(7)

2.1.8. Normalized Multi-Band Drought Index

The Normalized Multi-band Drought Index (NMDI) demonstrated enhanced sensitivity to drought severity by integrating data from NIR and two short-wave infrared bands (SWIR-I and SWIR-II) [56]; it was found to be effective in estimating both soil and vegetation moisture content. It is computed as shown in Equation (8).

NMDI = \frac{NIR - (\text{SWIR-I} - \text{SWIR-II})}{NIR + (\text{SWIR-I} - \text{SWIR-II})}

(8)

2.1.9. Modified Normalized Difference Water Index

The Modified Normalized Difference Water Index (MNDWI) is used to map soil moisture conditions across large areas, offering real-time assessment of moisture distribution in agricultural regions [57,58]. MNDWI is computed as shown in Equation (9):

MNDWI = \frac{Green - \text{SWIR-I}}{Green + \text{SWIR-I}}

(9)

2.1.10. Modified Normalized Difference Vegetation Index

The Modified Normalized Difference Vegetation Index (MNDVI) utilizes the mid-infrared (MIR) band [59]. Because it can accurately capture changes in vegetation photosynthetic activity, which is often affected by water availability [59], it can provide a deeper understanding of drought conditions. It is computed as shown in Equation (10):

MNDVI = \frac{NIR - MIR}{NIR + MIR}

(10)

2.1.11. Ratio Drought Index

The Ratio Drought Index (RDI) is a simple ratio of the second short-wave infrared band (SWIR-II, band 12 in Sentinel-2) to the NIR band, as shown in Equation (11). It is highly sensitive to changes in vegetation health, making it effective for detecting early signs of drought stress [60].

RDI = \frac{\text{SWIR-II}}{NIR}

(11)

2.1.12. Red-Edge Chlorophyll Index

The Red-edge Chlorophyll Index (RECI) is computed using the narrow spectral band between red and near-infrared reflectance, i.e., red-edge, making it sensitive to the cellular structure of plants, which correlates with their greenness [61,62] (Equation (12)):

RECI = \frac{NIR}{\text{Red-Edge}} - 1

(12)

Here, Band 5 (B5) of Sentinel-2 is used as the red-edge band input. Unlike NDVI, which can saturate at high biomass levels, the red-edge band provides more accurate vegetation maps and is particularly effective for assessing crop health during late growth stages when canopy closure exceeds 80% [61,62].

These twelve indices—NDVI, EVI, ARVI, NDWI, SAVI, TVI, NDMI, NMDI, MNDWI, MNDVI, RDI, and RECI—are used in the proposed work as feature vector. All these indices are dimensionless. The selection of these indices is based on the existing literature and their proven relevance in drought monitoring and vegetation stress assessment across diverse agro-climatic zones. These indices cover diverse aspects of vegetation health, water content, and soil moisture, which are critical to drought. As an example, NDWI, MNDWI, NDMI, and NMDI emphasize moisture availability. NDVI, EVI, ARVI, SAVI, MNDVI, RECI, and TVI are well-established vegetation indices that can be used to monitor canopy vigor, chlorophyll concentration, and photosynthetic activity. Further, RECI brings additional sensitivity to dry conditions and plant stress. These indices collectively use different bands, including visible, NIR, SWIR, and red-edge bands, and thus provide a multi-dimensional spectral view of the land surface. This selection of bands also aligns with different works in the literature [11,12]. NDVI and NDWI have been used in [63] to map drought conditions in 2015 (drought year) and 2020 (normal year), which demonstrates the sensitivity of these indices to rainfall anomalies. Spatiotemporal variations in NDVI, SAVI, EVI are found effective to monitor crop health and environmental stress factors in Punjab, India [64]. Similarly, various such indices are used in [65] for drought event mapping across South Asia. While several of these twelve indices have been studied individually or in combination for drought detection and monitoring, they have rarely been used collectively in a single comprehensive analysis. Hence, the proposed work studies these twelve indices with different data balancing techniques to understand their influence on drought detection.
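As an illustration of Equations (1)–(12), the following minimal Python sketch computes the twelve indices for a single observation from surface reflectance values. It is not the authors' implementation: the function name `drought_indices`, the argument names, and the sample reflectances are hypothetical. The mid-infrared input (`mir`) is passed separately because the text does not state which Sentinel-2 band is used for it, and the lower clamp inside the TVI square root is an assumption added so that the sketch stays defined when NDVI falls below -0.5.

```python
import math


def drought_indices(blue, green, red, red_edge, nir, swir1, swir2, mir,
                    L=0.8, y=0.1):
    """Return the twelve indices of Equations (1)-(12) for one observation.

    All inputs are reflectances (floats). `red_edge` is Sentinel-2 Band 5 and
    `swir2` is Band 12, as stated in the text; the remaining band choices,
    including `mir`, are left to the caller (assumption).
    """
    ndvi = (nir - red) / (nir + red)
    return {
        "NDVI": ndvi,                                                   # Eq. (1)
        "EVI": 2.5 * (nir - red) / (nir + 6 * red - 7.5 * blue + 1),    # Eq. (2)
        "ARVI": (nir - red - y * (red - blue))
                / (nir + red - y * (red - blue)),                       # Eq. (3)
        "NDWI": (green - nir) / (green + nir),                          # Eq. (4)
        "SAVI": (nir - red) / (nir + red + L) * (1 + L),                # Eq. (5)
        # Eq. (6); clamping at 0 is an assumption for NDVI < -0.5
        "TVI": math.sqrt(max(ndvi + 0.5, 0.0)),
        "NDMI": (nir - swir1) / (nir + swir1),                          # Eq. (7)
        "NMDI": (nir - (swir1 - swir2)) / (nir + (swir1 - swir2)),      # Eq. (8)
        "MNDWI": (green - swir1) / (green + swir1),                     # Eq. (9)
        "MNDVI": (nir - mir) / (nir + mir),                             # Eq. (10)
        "RDI": swir2 / nir,                                             # Eq. (11)
        "RECI": nir / red_edge - 1,                                     # Eq. (12)
    }


# Hypothetical reflectances for a vegetated pixel (illustrative only).
sample = drought_indices(blue=0.04, green=0.08, red=0.06, red_edge=0.15,
                         nir=0.30, swir1=0.20, swir2=0.10, mir=0.10)
for name, value in sample.items():
    print(f"{name}: {value:.3f}")
```

Each returned dictionary corresponds to one row of the feature table described in the abstract, before the Day of Season and drought label columns are attached.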

2.2. Machine Learning Classifier

The machine learning models used to classify droughts in this work are Random Forest (RF), Bagging (BGN), Gradient Boost (GB), and XGBoost (XGB). These models are well-suited for remote sensing applications due to their ability to handle complex interactions among variables and to provide a good basis for feature importance analysis [66,67].

2.2.1. Random Forest

Random Forest is an ensemble learning algorithm based on the bagging technique that builds a large number of decision trees [68]. Each tree is trained on a random subset of the data and a random subset of the features. The final prediction is made by aggregating the predictions of individual trees, typically through a majority vote. Random Forests are known for their robustness to overfitting, high accuracy, and ability to handle high-dimensional data [68].

2.2.2. Gradient Boosting Classifier

Gradient Boosting is also an ensemble learning algorithm. It uses the boosting technique to build trees sequentially, with each new tree attempting to correct the mistakes made by the preceding trees [69]. It aims to minimize a loss function by iteratively adding new trees that are specifically trained to predict the negative gradient of the loss function. The technique is robust and powerful, as it can work with different loss functions.
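The sequential fitting of negative gradients can be sketched for a binary label with the log-loss, for which the negative gradient with respect to the raw score is simply y - p. The code below is a simplified illustration under several assumptions: a single feature, one-split regression stumps as weak learners, leaf values set to the mean residual, and hypothetical names (`fit_gradient_boosting`, `predict_proba`) and toy data. It is not the configuration used in the study.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_regression_stump(x, r):
    """Least-squares stump: best split point and mean residual on each side."""
    best = None
    for t in sorted(set(x))[:-1]:          # keep at least one point on the right
        left = [ri for xi, ri in zip(x, r) if xi <= t]
        right = [ri for xi, ri in zip(x, r) if xi > t]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((ri - ml) ** 2 for ri in left)
               + sum((ri - mr) ** 2 for ri in right))
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    return best[1:]


def fit_gradient_boosting(x, y, n_rounds=50, learning_rate=0.5):
    score = [0.0] * len(x)                 # raw log-odds, initialised at 0
    stumps = []
    for _ in range(n_rounds):
        # Negative gradient of the log-loss w.r.t. the raw score: y - p.
        residual = [yi - sigmoid(si) for yi, si in zip(y, score)]
        t, ml, mr = fit_regression_stump(x, residual)
        step = (t, learning_rate * ml, learning_rate * mr)
        stumps.append(step)
        score = [si + (step[1] if xi <= t else step[2])
                 for xi, si in zip(x, score)]
    return stumps


def predict_proba(stumps, xi):
    """Probability of class 1: sigmoid of the summed stump outputs."""
    return sigmoid(sum(lo if xi <= t else hi for t, lo, hi in stumps))


# Toy one-feature data (illustrative only): label 1 for larger feature values.
x_toy = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
y_toy = [0, 0, 0, 1, 1, 1]
model = fit_gradient_boosting(x_toy, y_toy)
print(predict_proba(model, 0.15), predict_proba(model, 0.85))
```

Each round moves the raw score a small step in the direction that reduces the loss, which is why the learning rate and the number of rounds are usually tuned together.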

2.2.3. Extreme Gradient Boosting (XGBoost)

XGBoost enhances the gradient boosting algorithm with high efficiency and performance [70]. XGBoost algorithm, which is also a boosting technique, creates trees sequentially, with each tree fixing its predecessor’s mistakes. It integrates a regularization objective function with a penalty for model complexity, and thus avoids overfitting. XGBoost often outperforms traditional Gradient Boosting due to its regularization and optimization techniques.
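The regularized objective mentioned above is commonly written in the XGBoost literature in the form below. It is reproduced here as a sketch in generic notation rather than as the exact formulation used in this study. The symbols are not defined in the original text and are introduced only for illustration: l is a differentiable loss, f_k is the k-th tree, T is its number of leaves, w is its vector of leaf weights, and \gamma and \lambda are penalty coefficients.

```latex
\mathcal{L} = \sum_{i} l\left(y_i, \hat{y}_i\right) + \sum_{k} \Omega\left(f_k\right),
\qquad
\Omega(f) = \gamma T + \frac{1}{2} \lambda \lVert w \rVert^{2}
```

The first term rewards fitting the training labels, while the second penalizes trees with many leaves or large leaf weights, which is the mechanism by which model complexity is controlled.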

2.2.4. Bagging Classifier

Bagging is an ensemble learning technique that aggregates a variety of models in an attempt to make a prediction with increased accuracy [71]. In Bagging (BGN), subsets of training data are generated through bootstrapping. For each subset, a model is constructed, and the final prediction is derived by aggregating the model outputs, typically through voting in a classification problem. Bagging reduces variance effectively and helps in overcoming overfitting, especially in complex models.
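The bootstrapping and voting steps can be sketched as follows. This is a minimal illustration and not the Bagging Classifier configuration used in this work: the base learner is a hypothetical single-threshold rule on one feature, and the toy data (low feature values labeled 1) are invented for the example.

```python
import random
from collections import Counter


def bootstrap(rows, rng):
    """Draw len(rows) samples with replacement."""
    return [rng.choice(rows) for _ in rows]


def fit_threshold(rows):
    """Tiny base learner: the single threshold rule with the fewest training errors."""
    best = None
    for t, _ in rows:
        for above in (0, 1):          # label predicted when x > t
            errors = sum((above if x > t else 1 - above) != y for x, y in rows)
            if best is None or errors < best[0]:
                best = (errors, t, above)
    _, t, above = best
    return lambda x: above if x > t else 1 - above


def bagging_fit(rows, n_models=25, seed=0):
    rng = random.Random(seed)
    return [fit_threshold(bootstrap(rows, rng)) for _ in range(n_models)]


def bagging_predict(models, x):
    """Aggregate the individual predictions by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (feature value, label).
rows = [(0.1, 1), (0.2, 1), (0.3, 1), (0.7, 0), (0.8, 0), (0.9, 0)]
models = bagging_fit(rows)
low_prediction = bagging_predict(models, 0.05)
high_prediction = bagging_predict(models, 0.95)
```

Using an odd number of models avoids ties in a binary vote.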

3. Feature Ranking and Aggregation Techniques

This work also aims at assessing the influence of different features; in the following, a brief description is provided of a few feature ranking and aggregation techniques used in this work.

3.1. Shapley Additive Explanation Analysis

Shapley Additive Explanation (SHAP) values offer a game-theoretic approach to explain the output of a machine learning model [72,73]. It computes the contribution of each player, i.e., each feature, for the outcome of a game, i.e., the prediction. Formally, the SHAP value for feature i in instance x is calculated as:

\phi_{i}(x) = \sum_{S \subseteq F \setminus \{i\}} \frac{|S|! \, (|F| - |S| - 1)!}{|F|!} \left[ f(S \cup \{i\}) - f(S) \right]

(13)

Here F, S, and f(S) are defined as the set of all features, a subset of features, and the model’s prediction using only the features in S, respectively, while |S| and |F| denote the number of features in S and F. It computes the weighted average change in the model’s output when feature i is added to all possible subsets of other features [73,74]. SHAP values can be used for both global and local feature importance analysis. Global importance can be determined by aggregating the absolute SHAP values for all instances. A key advantage of SHAP is its ability to provide both magnitude and direction (positive or negative) of feature influence on the model’s output.
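As an illustration of Equation (13), the following sketch computes exact Shapley values by enumerating all subsets. It assumes a hypothetical value function f(S) given as a plain Python function; the feature names are taken from this work, but the numerical contributions are invented for the example. Practical SHAP implementations estimate f(S) from a trained model and use faster approximations, so this is not the procedure applied here.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values by enumerating every subset S of F without i."""
    n = len(features)
    phi = {}
    for i in features:
        others = [f for f in features if f != i]
        total = 0.0
        for size in range(n):         # |S| ranges from 0 to |F| - 1
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (value_fn(s | {i}) - value_fn(s))
        phi[i] = total
    return phi


# Hypothetical value function: additive contributions plus one interaction term.
CONTRIBUTION = {"NMDI": 0.30, "DOS": 0.20, "NDVI": 0.05}


def toy_value(subset):
    value = 0.40 + sum(CONTRIBUTION[f] for f in subset)
    if {"NMDI", "DOS"} <= subset:
        value += 0.10                 # interaction shared between NMDI and DOS
    return value


features = list(CONTRIBUTION)
phi = shapley_values(toy_value, features)
```

In this toy case the interaction term is split equally between the two interacting features, and the values sum to f(F) − f(∅), which is the efficiency property of Shapley values.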

3.2. Borda Count

The Borda Count is a voting mechanism designed to aggregate preferences from multiple voters [75]. Each voter ranks the candidates, i.e., features, in order of their preferences. For a group of n candidates, a voter assigns n - 1 points to their most preferred candidate, n - 2 to their second, and continues in a similar fashion to 0 for the least preferred candidate. Next, each candidate’s overall score is computed by adding together all of the points received from all voters. The winner is then determined to be the one with the largest overall score. In feature ranking, the voters can be considered different evaluation metrics or different runs of a feature selection algorithm. The Borda Count aggregates these rankings to produce a consensus [75].
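The scoring rule above can be sketched in a few lines. The three rankings below are hypothetical and only illustrate the mechanics; they are not results of this work.

```python
def borda_count(rankings):
    """Aggregate rankings: position p (0-based) among n candidates earns n - 1 - p points."""
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, candidate in enumerate(ranking):
            scores[candidate] = scores.get(candidate, 0) + (n - 1 - position)
    # Sort by descending score; ties are broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical rankings from three "voters" (e.g., three models), best feature first.
rankings = [
    ["NMDI", "DOS", "EVI", "NDVI"],
    ["DOS", "NMDI", "NDVI", "EVI"],
    ["NMDI", "EVI", "DOS", "NDVI"],
]
consensus = borda_count(rankings)
```

Here NMDI collects 3 + 2 + 3 = 8 points and DOS collects 2 + 3 + 1 = 6 points, so NMDI heads the consensus ranking.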

3.3. Weighted Sum

The Weighted Sum method combines multiple feature importance scores or rankings [76]. Given a set of n features and m different importance scores or rankings for each feature, a weight w_{j} is assigned to each of the m scores. Here, \sum_{j = 1}^{m} w_{j} is taken as 1. The combined score for feature i is then calculated as shown in Equation (14).

S_{i} = \sum_{j = 1}^{m} w_{j} \cdot R_{ij}

(14)

R_{ij} is the rank or score of feature i according to the j-th criterion. S_{i} determines the final influence of each feature. The weights (w_{j}) reflect the relative importance of the different criteria. This technique is simple to implement and interpret; however, the selection of weights can significantly impact the final ranking. Appropriate weight selection is crucial and often depends on the specific application and the nature of the input scores.
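Equation (14) can be illustrated with a short sketch. The scores and the equal weights below are hypothetical; the weights actually used in this work are not assumed here.

```python
def weighted_sum(scores, weights):
    """Combine m scores per feature as S_i = sum_j w_j * R_ij, with weights summing to 1."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    combined = {}
    for feature, row in scores.items():
        if len(row) != len(weights):
            raise ValueError(f"expected {len(weights)} scores for {feature}")
        combined[feature] = sum(w * r for w, r in zip(weights, row))
    return combined


# Hypothetical normalized importance scores from two criteria.
scores = {"NMDI": [0.9, 0.8], "DOS": [0.7, 0.9], "NDVI": [0.2, 0.1]}
combined = weighted_sum(scores, weights=[0.5, 0.5])
ranking = sorted(combined, key=combined.get, reverse=True)
```

Changing the weights, for example to [0.2, 0.8], would place DOS ahead of NMDI in this toy case, which shows how strongly the final ranking can depend on the weight selection.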

4. Resampling Techniques

This section discusses the resampling techniques employed in this study to address potential class imbalance in the drought data.

4.1. Synthetic Minority Over-Sampling Technique

Synthetic Minority Over-Sampling Technique (SMOTE) is an oversampling technique that generates synthetic instances for the minority class through its k-nearest neighbors [77]. Synthetic samples are produced along the line segments connecting a selected instance and its neighbors. This helps in attaining a balanced distribution of classes and, as a consequence, it improves the performance of machine learning algorithms on imbalanced datasets. It avoids simply duplicating minority class instances, which may lead to overfitting.
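The interpolation step can be sketched as follows. This is a simplified illustration of the idea in [77], written with the standard library only; the function names and the toy minority samples are hypothetical, and an established library implementation would normally be preferred in practice.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points on segments between minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of the selected sample.
        neighbour_ids = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: squared_distance(minority[j], base),
        )[:k]
        neighbour = minority[rng.choice(neighbour_ids)]
        gap = rng.random()            # position along the connecting segment
        synthetic.append(tuple(b + gap * (q - b) for b, q in zip(base, neighbour)))
    return synthetic


# Hypothetical minority-class samples with two normalized features each.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
synthetic = smote(minority, n_new=5)
```

Every synthetic point lies between two existing minority samples, so no point falls outside the range spanned by the original minority data.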

4.2. Borderline SMOTE

Borderline SMOTE enhances traditional SMOTE, with a focus placed on generating synthetic samples near the borderline cases of the minority class [78]. This aims at sharpening the decision boundary and, in turn, enhancing classifiers’ effectiveness.

4.3. Adaptive Synthetic Sampling Approach

The Adaptive Synthetic Sampling Approach (ADASYN) is an oversampling technique that adaptively generates synthetic samples for the minority class based on the density of neighboring majority class instances [79]. The algorithm prioritizes generating more synthetic samples in regions where the minority class is harder to learn, i.e., surrounded by more majority class instances. This is particularly beneficial when the minority class is sparsely distributed or when there are notable differences in the density of the majority and minority classes.
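The adaptive allocation can be sketched as follows. The sketch covers only the step that decides how many synthetic samples each minority instance receives, based on the share of majority instances among its k nearest neighbors; the generation step itself is interpolation, as in SMOTE. Function names and toy points are hypothetical.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def adasyn_allocation(minority, majority, n_new, k=3):
    """Return the number of synthetic samples to generate per minority instance."""
    labelled = [(p, 0) for p in minority] + [(p, 1) for p in majority]
    ratios = []
    for idx, point in enumerate(minority):
        others = [(squared_distance(point, q), is_majority)
                  for j, (q, is_majority) in enumerate(labelled) if j != idx]
        others.sort(key=lambda item: item[0])
        # Share of majority-class points among the k nearest neighbours.
        ratios.append(sum(is_majority for _, is_majority in others[:k]) / k)
    total = sum(ratios)
    if total == 0:
        # No minority sample has majority neighbours: fall back to an even split.
        return [n_new // len(minority)] * len(minority)
    return [round(n_new * r / total) for r in ratios]


# Hypothetical one-dimensional toy data.
minority = [(0.0,), (0.1,), (1.0,)]
majority = [(0.9,), (1.1,), (1.2,), (2.0,)]
allocation = adasyn_allocation(minority, majority, n_new=8, k=2)
```

The minority instance at 1.0 is surrounded by majority instances and therefore receives twice as many synthetic samples as each of the other two.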

Several techniques exist that can help rebalance data. In this scenario, data-driven approaches are prioritized as they are more flexible and scalable than algorithm-driven data balancing approaches, where existing learning algorithms are modified to reduce their bias toward the majority class [80]. Typical data-driven approaches can be categorized into two major techniques: undersampling and oversampling. In undersampling, instances of majority classes are strategically reduced, which may lead to information loss [80,81]. In oversampling, new instances in the form of synthetic data are generated to extend the minority class [81]. However, while the extended dataset may be sufficient to feed machine learning techniques, it is less dense in actual data-sourced information; there is a high potential for information loss in further processing stages. Hybrid techniques attempt to combine the advantages of undersampling and oversampling. They have been found useful for noisy and large data with significant overlaps [80]; on the other hand, hybrid techniques, in general, are more computationally expensive than canonical undersampling or oversampling techniques [80]. Considering that the concerned dataset is sparse and the amount of available computing power is limited, oversampling techniques are prioritized here for a less computationally expensive solution.

5. Data and Study Area

Three districts in India were selected as our study area: Jodhpur in Rajasthan state, Amravati in Maharashtra state, and Thanjavur in Tamil Nadu state, as shown in Figure 1. These districts feature distinct climates, different agricultural practices, and slightly varying cropping seasons. Each district typically follows two annual cropping cycles: the Kharif (monsoon) season, with sowing in June–July and harvesting in September–October, and the Rabi (winter) season, with sowing in October and harvesting in March–April. We focused exclusively on the Rabi season, as optical satellite data from the Kharif season is often unreliable due to frequent and persistent cloud cover. Sentinel-2 data are used here to collect time series of 12 remote sensing indices across the three districts from 2016 to 2025 through the Google Earth Engine (GEE) cloud infrastructure. Images are selected with <20% cloud cover, covering the period from 1 October of one year to 30 April of the next year. As an example, the year “2017” indicates the duration from October 2016 to March 2017. A drought season is defined as the period spanning from October of the previous year to April of the current year, with each temporal sample in that period labeled as a drought sample. The number of data points therefore differs from year to year, as the <20% cloud cover filter discards a varying number of acquisitions. These data are not corrected for atmospheric effects such as aerosol scattering, water vapor absorption, or ozone interference. Level-2A products are not used here because, before March 2018, they were not systematically available from ESA, and users needed to generate them locally using tools like the Sen2Cor processor. In March 2018, ESA started the systematic production of Level-2A data; this, however, results in only partial coverage of our relevant timeframe, which spans from October 2016 to April 2025. 
Since homogeneous data covering this entire timeframe was required, Sentinel-2 Level-1C top-of-atmosphere reflectance data was used instead. These data were acquired for each year (2016–2025) using the “COPERNICUS/S2” image collection in GEE. The collection was filtered by date and cloud cover (<20%). The cloud masking function was applied to each image using the QA60 band. District boundaries for Jodhpur, Amravati, and Thanjavur were defined as regions of interest (ROIs) within GEE using the FAO GAUL dataset [82]. Administrative level 2 (ADM2) boundaries are primarily used here. In order to concentrate on agricultural areas, another land cover mask was generated using agricultural land classes from the Copernicus Global Land Cover dataset [83]. The mask was clipped to each district’s boundary to isolate agricultural land within each ROI.
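The logic of a QA60-based cloud mask can be sketched outside GEE as follows. The sketch assumes the commonly documented QA60 layout for Sentinel-2 Level-1C, in which bit 10 flags opaque clouds and bit 11 flags cirrus; whether both flags were masked in this work is an assumption, and the function names and pixel values are hypothetical.

```python
CLOUD_BIT = 1 << 10    # QA60 bit 10: opaque clouds
CIRRUS_BIT = 1 << 11   # QA60 bit 11: cirrus clouds


def is_clear(qa60):
    """A pixel is kept only when neither cloud flag is set."""
    return (qa60 & CLOUD_BIT) == 0 and (qa60 & CIRRUS_BIT) == 0


def masked_mean(values, qa60_values):
    """Mean of an index over the clear pixels only; None if every pixel is cloudy."""
    clear = [v for v, qa in zip(values, qa60_values) if is_clear(qa)]
    return sum(clear) / len(clear) if clear else None


# Hypothetical index values for three pixels and their QA60 flags.
index_values = [0.2, 0.4, 0.6]
qa60_flags = [0, 1024, 2048]   # clear, opaque cloud, cirrus
district_mean = masked_mean(index_values, qa60_flags)
```

Only the first pixel is clear in this toy case, so the masked mean equals its value.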

We organized the data by year (2016–2025) in a two-dimensional format. One example is shown in Table 1, with indices as columns and observations as rows. Note that there can be multiple observations recorded on the same day in this dataset, each representing a single measurement of the remote sensing indices. The dataset comprises a total of 13 features per row: 12 spectral indices and a temporal “Day of Season” (DOS) feature, starting from 1 on 1 October; the DOS feature is derived later, in the preprocessing stage, from the date column. The DOS feature conveys information about the time of the year, i.e., it relates other variables to the advancement of the season. The rows span all selected cloud-free days in the mentioned time period for each year and district, creating a comprehensive temporal record for analysis. These rows serve as individual data points for model training and testing, linking the observed remote sensing patterns to drought outcomes. Each value in a row reflects the mean value of a specific index (e.g., NDVI) for an entire district (e.g., Thanjavur) on a particular date at a particular time.
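The DOS feature can be derived from an acquisition date as in the following sketch. The function name is hypothetical, and rejecting dates outside the October–April window is an assumption of this example.

```python
from datetime import date


def day_of_season(acquisition_date):
    """Day number within the Rabi season, with 1 October counted as day 1."""
    if 5 <= acquisition_date.month <= 9:
        raise ValueError("date lies outside the 1 October to 30 April season")
    # Dates from January to April belong to the season that started the previous October.
    start_year = (acquisition_date.year if acquisition_date.month >= 10
                  else acquisition_date.year - 1)
    return (acquisition_date - date(start_year, 10, 1)).days + 1


first_day = day_of_season(date(2016, 10, 1))
new_year = day_of_season(date(2017, 1, 1))
last_day = day_of_season(date(2017, 4, 30))
```

With this convention the season runs from day 1 to day 212, or to day 213 when it contains 29 February.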

5.1. Drought Declaration Process in India

In India, drought declarations follow a formal, multitiered process outlined in the national drought manual [84,85], as shown in Figure 2. The definition and criteria for agricultural drought used in our study follow the Manual for Drought Management issued by the Ministry of Agriculture & Farmers Welfare, Government of India. This manual was created in 2016 and revised in 2020. The process begins with routine monitoring of agro-meteorological indicators by specialized agencies: the India Meteorological Department (IMD) analyzes rainfall and temperature data, the National Remote Sensing Centre (NRSC) provides satellite-derived soil moisture and vegetation indices, and the Central Research Institute for Dryland Agriculture (CRIDA) offers agronomic assessments [86]. District-level drought committees, typically chaired by the District Collector with members of the agriculture and finance departments, further review these data against predefined criteria (for example, specified rainfall deficits or Standardized Precipitation Index thresholds). When these quantitative triggers are met, field verification teams are deployed to inspect crop conditions and local water availability. The findings of the committees (including any field reports) are submitted to the state government for review. If the state agrees that a district meets the drought criteria, it issues a formal declaration for the affected districts, usually by notification in the state gazette. In severe cases, the state can also request central relief assistance or activate national contingency plans for drought relief.

Assembling a consolidated ground truth for drought declarations is extremely challenging given the process’s decentralized, multi-tiered nature. Each state (and often each district within it) follows its own procedure and publishes drought notifications in various formats and languages (for example, in state gazettes, local newspapers, or agriculture department bulletins), with no single central repository. Researchers must therefore compile data from many scattered sources. Declarations often rely on qualitative field reports rather than uniform quantitative metrics, and different states may apply different rainfall deficit or SPI thresholds. This heterogeneity makes retrospective interpretation difficult. It is also difficult to confirm that a district did not experience drought, as the absence of an official declaration is not explicitly recorded. Timing adds further complexity: some districts may declare drought only after a significant delay or at subdistrict levels, creating inconsistencies between district-wide and local reports. Together, these factors make it extremely challenging to build an accurate and unified ground truth data set of drought occurrences; still, even with some difficulties, it was possible to assemble a reference dataset as reported in the following subsection.

5.2. Ground Truth Table

The ground truth data was prepared primarily based on meteorological reports, government declarations, and relevant news articles covering regional drought impacts. Table 2 presents an annual summary of drought occurrences in the Rabi season in the three districts from 2016 to 2025. This tabular consolidation aids interpretation and future use. For example, severe drought was widespread in different regions in 2016 and 2019, affecting agricultural productivity and water resources. Other years, however, saw droughts in some districts but not in others. The table captures these regional patterns in drought frequency over the ten-year period.

An additional column representing the drought label was added to the time-series data to enable the application of machine learning techniques. This label was derived from the consolidated ground truth data presented in Table 2, which indicates whether a drought occurred during the Rabi season for each district–year pair. In particular, each observation received a binary label: “Drought” encoded as 1 and “No Drought” encoded as 0. For example, since a drought was reported in Jodhpur during the 2016 Rabi season, all corresponding observations for Jodhpur in 2016 were labeled with 1. This practice guarantees that each row in the dataset not only captures the temporal and spectral characteristics of the season but also carries the correct drought classification, rendering it suitable for training and evaluating supervised machine learning models.
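The label assignment can be sketched as a lookup on the (district, season year) pair. The sketch uses plain dictionaries instead of a DataFrame, and its ground-truth excerpt contains only the two Jodhpur entries stated in the text (drought in 2016, no drought in 2017); the function names, the row layout, and the index values are hypothetical.

```python
from datetime import date

# Excerpt of the ground truth: (district, season year) -> drought label.
GROUND_TRUTH = {("Jodhpur", 2016): 1, ("Jodhpur", 2017): 0}


def season_year(acquisition_date):
    """October to December acquisitions belong to the season named after the next year."""
    if acquisition_date.month >= 10:
        return acquisition_date.year + 1
    return acquisition_date.year


def add_drought_label(rows, ground_truth):
    """Attach the binary label; rows without a ground-truth entry are left out."""
    labelled = []
    for row in rows:
        key = (row["District"], season_year(row["date"]))
        if key in ground_truth:
            labelled.append({**row, "Drought": ground_truth[key]})
    return labelled


# Hypothetical observations (index values are invented).
rows = [
    {"District": "Jodhpur", "date": date(2015, 11, 5), "NDVI": 0.21},
    {"District": "Jodhpur", "date": date(2017, 2, 1), "NDVI": 0.34},
]
labelled_rows = add_drought_label(rows, GROUND_TRUTH)
```

Every observation of a district and season thus inherits the same label, which matches the district-level granularity of the ground truth.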

We assign a positive drought label to a (district, year, season) entry only if the evidence matches all three identifiers and the occurrence is corroborated by at least two independent and reliable sources. These sources include formal government declarations and bulletins, such as state-level drought notifications or central advisories; quantitative meteorological indicators, such as severely negative SPI values or significant rainfall deviations; and credible public documentation, such as state gazette publications, parliament Q&A transcripts, or regionally verified news reports. All these sources must be specific to the same district and the Rabi season to qualify. Entries lacking such corroboration are either labeled as non-drought or excluded if the evidence is inconclusive. This rigorous, cross-validated labelling framework enhances the reliability of the dataset and ensures its suitability for training and validating supervised machine learning models aimed at drought prediction.

5.2.1. Jodhpur

In the Rabi season of 2016, Rajasthan faced severe drought conditions, with 19 out of 33 districts being officially declared drought-affected. The Hindu newspaper reported that the state grappled with a serious water crisis, prompting the government to deploy water trains to parched Bhilwara and water tankers to other regions [87]. Districts such as Ajmer, Banswara, Baran, Barmer, Bhilwara, Chittorgarh, Churu, Dungarpur, Hanumangarh, Jaipur, Jaisalmer, Jalore, Jhunjhunu, Jodhpur, Nagaur, Pali, Rajsamand, Udaipur, and Pratapgarh were the most affected. A follow-up report by [88] also confirmed that Rajasthan was among 11 drought-affected states during 2015–2016. Furthermore, a parliamentary document corroborated the declaration and listed Rajasthan’s drought-affected status during the Rabi season [89].

In 2017, no conclusive evidence was found to support a drought declaration for the Jodhpur district or other parts of Rajasthan. On the contrary, there was positive news of increased agricultural production throughout the state. Additionally, official records from the parliament [89] confirm that no funds were allocated for drought relief for either the Kharif or Rabi crops in Rajasthan during 2017, further indicating a relatively stable agricultural season. From the same document, it can be seen that the funds were only allocated for the Kharif season of 2018.

In the Rabi season of 2019, there is substantial evidence that the Rajasthan government declared drought before the start of summer. According to [90], more than 5000 villages across nine districts, including Barmer, Churu, Pali, Bikaner, Jaisalmer, Jalore, Jodhpur, Hanumangarh, and Nagaur, were declared drought-affected by the state government. This severe drought badly hit the region, causing a decline in employment opportunities. Another source in [91] also confirms this news; the government officially declared 5555 villages as drought-affected, an indication of the intensity of the adverse situation.

In the Rabi season of 2020, official notifications confirm that the Government of Rajasthan declared 1388 villages across 13 tehsils in four districts (Barmer, Jaisalmer, Jodhpur, and Hanumangarh) as drought-affected. According to a report by [92], the notification, issued on 11 November 2019, classified 13 villages in Jodhpur district as “severely drought-prone” and 297 villages as “moderately drought-prone”. Furthermore, as confirmed by [93], the provisions regarding the drought declaration would remain in force for six months from the date of notification, covering the Rabi season. Central assistance was also disbursed for drought relief during this period: as per official documentation [94], financial aid was allocated to drought-affected regions, confirming the severe impact of drought on agriculture and livelihoods during this season. In contrast, for the Rabi season of 2021, no drought declaration was made in Jodhpur district, as documented in the same official record [94].

In the subsequent Rabi season of 2022, another official notification issued by the Rajasthan Government’s Department of Disaster Management, Relief and Civil Defence [95,96] listed Jodhpur district once again among 13 drought-affected districts. The notification, based on Drought Management Code 2016 and informed by indicators such as rainfall deficiency, declining groundwater levels, surface water scarcity, poor crop conditions, and remote sensing data, declared several tehsils across these districts, including Jodhpur, as either severely or moderately drought-affected. This declaration invoked legal provisions under the Rajasthan Affected Areas (Suspension of Proceedings) Act, 1952, and the drought status was to remain in effect for six months from the date of the circular’s publication.

However, for the Rabi seasons of 2023 and 2025, no such declaration was made for the Jodhpur district. A similar circular issued by the same department [97], dated 2 December 2022, limited the drought-affected classification to a single tehsil in one district, with Jodhpur notably absent from the list.

For the Rabi season of 2024, evidence from both official records and media reporting confirms that Jodhpur was once again included in the list of drought-affected districts. A government circular [98], dated 21 November 2023 and extended on 17 May 2024, declared multiple districts, including Jodhpur and Phalodi, as severely or moderately drought-affected based on crop losses during the Kharif season (Samvat 2080). This notification, grounded in the Drought Management Code 2016, remained in force until 31 July 2024, following a six-month extension due to persistent drought conditions. Additionally, news reports corroborate that funds were sanctioned for transporting water to the 13 drought-affected districts, which included Jodhpur [99]. These actions underscore the significant and ongoing impact of drought on local agriculture, water access, and livestock in the district.

5.2.2. Amravati

During the Rabi season of 2016 [100], the Maharashtra government declared drought in more than 29,000 villages of the state. Most of these villages were from the parched Marathwada and Vidarbha regions, including Amravati. The state government stated that in these regions, the anewari (i.e., the proportion of failed crops) was below 50 percent in both Kharif and Rabi seasons. In the Rabi season of 2019, the Amravati district in Maharashtra faced severe drought conditions. On 1 November 2018, the Maharashtra government declared drought in 151 tehsils spread across 26 districts, including Amravati, as part of its drought relief program [101]. The program was in effect for six months from the date of declaration, covering a major part of the Rabi season.

According to the National Centre for Crop Forecasting (NCCF), 180 tehsils were identified as vulnerable based on remote sensing data, groundwater table index, reservoir storage, vegetation index, and deficient rainfall [102]. In Vidarbha, which includes Amravati, drought conditions were particularly severe, with only 425 mm of rainfall received instead of the usual 900 mm. This significantly affected orange flowering (Mrig Bahar), which happens from February onwards. By February 2019, the drought had reduced the area under Rabi crop cultivation in Maharashtra by 40%, according to government estimates [103]. This significant drop in agricultural output highlighted the lasting impact of the drought on farmers’ livelihoods and regional agriculture.

For the Rabi seasons of 2022 and 2025, no official drought declaration or substantial reporting specific to the Amravati district was found. However, during the Rabi seasons of 2023 and 2024, clear signs of drought emerged in Amravati. News sources reported a sharp decline in Rabi sowing due to poor soil moisture and inadequate water availability during the winters of 2022–2023 and 2023–2024 [104]. In particular, for the Rabi season of 2024, state-level drought assessments led to formal drought declarations in several parts of the district. According to reports from Agrowon and Hindustan Times, the Maharashtra government officially recognized drought conditions in numerous mandals and revenue circles across the Amravati division [105,106,107]. These reports detail how multiple blocks within the district were included in the drought list, citing acute water shortages, low dam storage levels, and failed winter crops. Additionally, the state extended support measures such as water tankers and fodder assistance under drought-relief norms.

Similarly, for the Rabi season of 2023, drought was declared in the district after the final assessment by the authorities, although no concessions were announced after the district was added to the drought-affected list [108].

5.2.3. Thanjavur

In the Rabi season of 2016, no significant evidence for drought was found in Thanjavur district. On the contrary, reports indicate that heavy and continuous rains lashed the delta districts, including Thanjavur, as a low-pressure system intensified over the Bay of Bengal. According to a report dated 16 November 2015, standing samba paddy crops remained submerged in waterlogged fields due to widespread rainfall [109]. Kollidam registered 175.5 mm of rainfall, while Sirkali recorded 172.5 mm during a 24 h period. Furthermore, Sansad reports on drought-affected states confirm that there was no drought declaration for either the Rabi or Kharif season in 2016 [110].

In the Rabi season of 2017, the Tamil Nadu government declared a drought on 10 January 2017, due to the severe impact of the retreating monsoon. According to a report by [111], Tamil Nadu, which relies heavily on the Northeast monsoon for its winter crops (Rabi), saw a significant 33 percent drop in winter rice sowing. Nagapattinam, Thiruvarur, and Thanjavur were the worst-hit districts. A Tamil Nadu government document [112] also supports these claims, detailing the severe deviation in rainfall: the state received only 168.3 mm of rainfall, which was 62 percent below the normal, leaving 21 districts with large deficiencies in rainfall. This poor monsoon was the main cause of a shortfall in crop coverage. As early as 7 January 2017, there were indications that Thanjavur would be declared drought-hit.

For the Rabi season of 2018, conclusions similar to those for 2016 can be drawn. No drought declaration was issued by the government, as per Sansad reports [110]. Further, a report dated 6 November 2017 mentions that the Thanjavur and neighboring Tiruvarur districts received heavy rainfall, surpassing even the flood-battered Nagapattinam district during a 24 h period [113]. This rainfall likely alleviated drought concerns for the subsequent agricultural seasons.

In the Rabi season of 2019, Thanjavur district in Tamil Nadu faced severe drought conditions due to the widespread failure of the Northeast monsoon. According to a report dated 21 March 2019, the Tamil Nadu government declared 24 districts, including Thanjavur, as drought-affected [114]. The failed Northeast monsoon, which normally lasts from October to December, significantly impacted the Rabi crops in Tamil Nadu.

During the Rabi season of 2020, reports indicate that the Northeast monsoon continued to bring substantial rainfall to the delta districts, including Thanjavur. A weather forecast from 12 December 2019 predicted heavy rains from 13 December for Thanjavur, Nagapattinam, Tiruvarur, and Pudukottai districts. This followed an already active Northeast monsoon, which brought 14 cm more rainfall than usual [115]. No substantial evidence of drought was found in this season.

In the Rabi season of 2021, there were no major indications of drought in Thanjavur. There were observations of heavy rains due to the onset of the Northeast monsoon in October 2020. The Northeast monsoon set in on 28 October 2020, and there was heavy rainfall in South India, which included Thanjavur [116]. Besides this, the Southwest monsoon also contributed to heavy rainfall in the area, thereby enhancing the agrarian conditions. Additional confirmation from other reports indicates that the region underwent harsh weather conditions; however, there was no indication of a drought event. Cyclone Nivar, which hit the coastal areas, brought flooding throughout Tamil Nadu, especially in Thanjavur, which also experienced heavy rains [117,118]. While these incidents do not completely eliminate the chances of agricultural drought-like situations, they strongly indicate that the 2021 Rabi season in Thanjavur did not see a drought.

For the Rabi seasons of 2022 and 2023, no conclusive evidence was found to suggest drought in Thanjavur district. There were no official declarations or widespread media reports indicating significant agricultural or hydrological stress during these seasons.

In contrast, the Rabi season of 2024 presented clear evidence of drought-like conditions in Thanjavur. A Tamil Nadu government order [119] confirmed that during the 2023 Northeast monsoon (October–December), Thanjavur district received 345.5 mm of rainfall compared to the normal 579.4 mm, a deficit of about 40% that places it in the “deficient” category. While the overall rainfall across Tamil Nadu was moderate, block-level analysis revealed that parts of Thanjavur experienced significant water scarcity due to uneven rainfall distribution. The report emphasized the impact on groundwater recharge and water availability for drinking and agriculture. Furthermore, an official gazette notification [120] corroborated this assessment, highlighting the broader recognition of drought in several blocks.

Supportive media reports reinforce this conclusion. According to a New Indian Express article dated 19 December 2024 [121], a joint survey by the Agriculture and Revenue Departments found that paddy cultivated on 26,508 acres in Thanjavur was damaged beyond 33% due to inundation and poor field drainage conditions following erratic monsoon rainfall. Additionally, another report from November 2024 [122] quoted the Agriculture Minister confirming crop inundation across 947 hectares in Thanjavur district, with compensation to be paid to affected farmers under the State Disaster Relief Fund (SDRF). These developments indicate that both rainfall deficiency and waterlogging disrupted the Rabi crop cycle, especially samba paddy, leading to considerable agricultural loss. For the Rabi season of 2025, no drought has been reported or declared in Thanjavur district. There are no official records or media accounts confirming any rainfall deficit or drought declaration during this period, suggesting normal agricultural conditions prevailed.

5.2.4. Limitations of Ground Truth Data

It must be noted that although ground truth data were compiled from multiple sources to improve coverage and representativeness, variations in data quality, spatial and temporal resolution, and classification criteria may introduce certain inherent uncertainties. Resolving such inconsistencies is one of the future directions of the proposed work. Further, there exist drought monitoring studies that distinguish between multiple levels of drought severity (e.g., moderate, severe, extreme). In this case, however, the primary constraint was inconsistency and incompleteness in the available ground-truth data. Although some official sources and historical records specify the severity of the drought (e.g., “moderate drought” or “severe drought”), many others only reported the occurrence of drought without indicating its severity. In several districts and years, it is difficult to reliably determine the precise classification of the drought event. To ensure consistency and avoid introducing subjective assumptions or regional bias in labeling, a binary classification approach is adopted, distinguishing only between drought and non-drought, and considering the entire district as a single classification unit. This approach allowed us to retain a broader and more uniform dataset, avoiding the risk of mislabeling due to incomplete or ambiguous records.

5.3. Temporal Coverage

As discussed, the dataset covers the Rabi season each year from 1 October to 30 April. The temporal resolution in each year varies with the cloud coverage filter (<20%), which ensures data quality, with seasonal weather conditions, particularly during monsoon transitions, which can reduce image availability, and with other factors. Thus, while we aim for regular temporal sampling, gaps can occur during periods of sustained cloud cover or due to satellite and sensor limitations. These are well-known constraints in optical remote sensing. One example of such temporal variation is shown in Figure 3 to illustrate the distribution pattern in Jodhpur. Here, all potential acquisition dates are shown as gray dots, whereas actual observations meeting the quality threshold are shown as blue dots.

6. Methodology

In this work, drought conditions in different regions of India are identified by different machine learning algorithms based on remote sensing indices. The methodology can be divided into three stages: data preprocessing, feature engineering, and model training, as discussed below.

6.1. Data Acquisition and Preprocessing

First, twelve vegetation and drought indices, namely NDVI, EVI, ARVI, NDWI, SAVI, TVI, NDMI, NMDI, MNDWI, MNDVI, RDI, and RECI, were computed for each image using the appropriate spectral bands. Next, the agricultural land mask, clipped to the district boundary, was applied to each image to retain only data from agricultural areas. Data both with and without the agricultural land mask are studied. For each image and each index, the mean value was computed over the whole district and over the district’s agricultural area. Further, the date of each image acquisition was extracted, and a “Day of Season” (DOS) feature was created, representing the day number within the Rabi season (1 October to 30 April). Next, data rows containing any NaN (Not a Number) or empty values were removed to ensure data quality and prevent issues during model training. Yearly data for each district were concatenated into a single DataFrame. Unnecessary columns, including “system:index” (related to GEE data management) and geolocation data (“.geo”), were removed. A “District” column was added to identify the district. A “Drought” column was added to hold the ground-truth label from Table 2. The process flow for the preprocessing stage is shown in Figure 4. The resulting time-series data for each district and year, consisting of date, “Day of Season”, and the twelve spectral indices, were further studied for feature engineering as explained below.
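The DOS feature and the row-cleaning step described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and counting 1 October as day 1 (rather than day 0) is an assumption, since the text does not state the starting index.

```python
from datetime import date
import math


def day_of_season(acquired: date) -> int:
    """Day number within the Rabi season that starts on 1 October.

    Assumption: 1 October is counted as day 1. Dates outside the
    1 October to 30 April window are not validated here.
    """
    start_year = acquired.year if acquired.month >= 10 else acquired.year - 1
    return (acquired - date(start_year, 10, 1)).days + 1


def drop_incomplete(rows):
    """Keep only rows (dicts) in which no value is None or NaN."""
    def present(value):
        return value is not None and not (
            isinstance(value, float) and math.isnan(value)
        )
    return [row for row in rows if all(present(v) for v in row.values())]
```

With this convention, an acquisition on 30 April of a non-leap year falls on day 212 of the season.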

6.2. Feature Engineering

Following the data acquisition and initial preprocessing steps described in the previous subsection, the data underwent further processing and feature engineering. The combined data was shuffled randomly to ensure that the training and testing sets were representative of the overall data distribution. The first 12 spectral indices were normalized to a range between 0 and 1. Min–Max linear scaling was selected to avoid changing ratios among values of the same index. It keeps the internal consistency of values in each index. This is expected to conserve the information linked to the physical status of the observed land surface. Min–Max normalization prevents features with larger magnitudes from dominating the learning process. Further, unlike Z-score normalization (standardization), Min–Max scaling does not center the data around zero. This is advantageous for remote sensing applications, where many indices are inherently non-negative and are better interpreted within their original value ranges. This step is crucial as a significant number of machine learning algorithms are sensitive to feature scaling, such as gradient-based techniques. The DOS, Year, Month, District, and Drought values were not scaled. The data was split into training and testing sets using an 80/20 split. All training and validation were conducted on the training and validation data, respectively, and the final performance results reported in this work are obtained using the held-out test data. The test set remained untouched throughout model training and validation. To strengthen the statistical robustness of the evaluation, a k-fold cross-validation where $k = 5$ is also included. The methodology considered addressing potential class imbalance using techniques like SMOTE, Borderline SMOTE, or ADASYN. These methods generate synthetic samples for the minority class to balance the dataset.
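The scaling, splitting, and cross-validation steps above can be sketched with Scikit-learn as follows. The data here are random placeholders, and the array names, sample count, and `random_state` values are assumptions made for illustration only. The scaler is fitted before the split, following the order in which the steps are described.

```python
import numpy as np
from sklearn.model_selection import KFold, train_test_split
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)
X_indices = rng.normal(size=(200, 12))      # placeholder for the 12 spectral indices
dos = rng.integers(1, 213, size=(200, 1))   # placeholder Day-of-Season values
y = rng.integers(0, 2, size=200)            # placeholder drought labels (1 = drought)

# Min-Max scale only the index columns; DOS is left unscaled, as in the text.
X_scaled = MinMaxScaler().fit_transform(X_indices)
X = np.hstack([X_scaled, dos])

# Shuffled 80/20 split into training and held-out test data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=True, random_state=0
)

# 5-fold cross-validation indices over the training portion only.
folds = list(KFold(n_splits=5, shuffle=True, random_state=0).split(X_train))
```

A common variant fits the scaler on the training rows only and then applies it to the test rows, which avoids passing test-set minima and maxima into training.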

6.3. Machine Learning Model Training and Evaluation

Four machine learning models were trained and evaluated for drought classification: XGBoost (XGB), Random Forest (RF), Bagging Classifier (BGN), and Gradient Boosting (GB) Classifier. Each model was trained on the training dataset. Hyperparameters for each model (e.g., number of estimators, random state) were empirically determined through experimentation. The trained models were used to predict drought occurrences on the test dataset. The performance of each model was evaluated using accuracy, precision, and recall metrics; precision and recall were calculated with the positive label (drought) designated as class 1. An additional evaluation step was performed to assess the models’ ability to correctly classify drought at the district-year level. The test data was grouped by district and year. For each group, a majority vote was taken based on the individual predictions of a machine learning technique within the group. This majority prediction was then compared to the actual drought label for that district and year. Similarly, the accuracy, precision, and recall of this group-level classification were also computed.
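The district-year majority vote described above can be sketched in a few lines. The record layout, the function name, and the rule that ties resolve to non-drought are assumptions for illustration; the text does not specify how ties are handled.

```python
from collections import defaultdict


def majority_vote(records):
    """Aggregate per-image predictions into one label per (district, year).

    `records` is an iterable of (district, year, predicted_label) tuples,
    with label 1 for drought and 0 for non-drought.
    Assumption: a tie is resolved to non-drought (0).
    """
    votes = defaultdict(list)
    for district, year, predicted in records:
        votes[(district, year)].append(predicted)
    return {
        key: int(2 * sum(preds) > len(preds))
        for key, preds in votes.items()
    }


# Hypothetical predictions, for illustration only.
example = [
    ("Jodhpur", 2019, 1), ("Jodhpur", 2019, 1), ("Jodhpur", 2019, 0),
    ("District B", 2020, 0), ("District B", 2020, 1),
]
season_labels = majority_vote(example)
```

The resulting dictionary can then be compared against the ground-truth label of each district-year to compute group-level accuracy, precision, and recall.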

A Shapley Additive Explanation (SHAP) analysis was performed for each model to understand feature importance and the impact of individual features on model predictions. Both types of SHAP summary plots, i.e., bar charts and beeswarm plots, were studied to understand the contribution of each feature. The mean absolute SHAP values are used to represent the feature importance for each model. Borda Count and Weighted Sum were used, separately, to aggregate the feature importance rankings obtained from SHAP values for the XGBoost, Random Forest, and Bagging models. Both methods assign points to features based on their rank in each model’s feature importance list. Two separate lists of the Top 5 features were created. The feature with the highest total Borda Count or Weighted Sum was considered the most important. The models were further evaluated using only the top-ranked features identified by the Borda Count and Weighted Sum. The models were trained and tested using the Top 1, Top 2, Top 3, Top 4, and Top 5 features, and their performance metrics were further analyzed.
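A rank-aggregation step of this kind can be sketched as below. The point scheme (the top-ranked of n features receives n points, the next n - 1, and so on) is one common Borda variant and is an assumption here, since the exact scheme is not stated. The importance values are made up for illustration and are not results from this study.

```python
def borda_count(importance_by_model):
    """Aggregate per-model feature importances into one Borda ranking.

    `importance_by_model` maps a model name to {feature: mean |SHAP|}.
    Assumption: with n features, rank 1 earns n points, rank 2 earns n - 1, ...
    """
    totals = {}
    for scores in importance_by_model.values():
        ranked = sorted(scores, key=scores.get, reverse=True)
        n = len(ranked)
        for position, feature in enumerate(ranked):
            totals[feature] = totals.get(feature, 0) + (n - position)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Made-up mean |SHAP| values, for illustration only.
toy = {
    "xgb": {"NMDI": 0.30, "DOS": 0.10, "NDWI": 0.12},
    "rf": {"NMDI": 0.25, "DOS": 0.20, "NDWI": 0.05},
    "bagging": {"NMDI": 0.22, "DOS": 0.21, "NDWI": 0.08},
}
ranking = borda_count(toy)
```

In this toy case the feature ranked first by every model collects the maximum score, while a feature ranked second by only one model falls behind one ranked second by two.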

Because a drought manifests itself, and evolves, over a time span, time-aware deep learning techniques such as convolutional neural networks (CNNs) and long short-term memory (LSTM) networks could be used for drought detection. However, machine learning algorithms were preferred over deep learning algorithms for three main reasons:

Data Volume Requirements: Deep learning models require a huge amount of data to generalize effectively. The size of our dataset, while sufficient for robust machine learning techniques, is unsuitable for training deep learning models without a high risk of overfitting.
Temporal Irregularity: Our time-series data is irregular due to persistent cloud cover and the satellite’s revisit cycle. Sequence-based deep learning models like LSTMs perform best with consistent, high-frequency temporal patterns, which our data cannot provide.
Interpretability and Efficiency: A core objective of this work is to understand feature influence through interpretable techniques like SHAP. Machine learning models, particularly tree-based ensembles, offer greater explainability, faster training times, and significantly lower computational demands than deep learning alternatives.

For the convenience of the reader, Figure 5 illustrates the overall workflow.

6.4. Error Analysis

Confusion matrix analysis was performed to provide a more detailed understanding of the classification performance of each model. For each model (XGBoost, Random Forest, Bagging Classifier, and Gradient Boosting Classifier), a confusion matrix was generated. The axes of the confusion matrices were labeled to represent the true and predicted classes (No Drought and Drought).

This analysis allowed for a more in-depth examination of the types of errors made by each model, providing insights into their strengths and weaknesses in classifying drought conditions. For example, the confusion matrices can reveal if a model is more prone to false positives (predicting drought when there is none) or false negatives (failing to predict drought when it occurs). This information is valuable for understanding the practical implications of using each model for drought monitoring.

6.5. Software and Libraries

The data processing and machine learning analysis were conducted using Python 3.9 with the following libraries: Pandas, NumPy, Scikit-learn, XGBoost, SHAP, and Imbalanced-learn. Matplotlib 3.5.3 was used for plotting graphs.

6.6. Evaluation Metrics

All models were evaluated based on the traditional metrics: accuracy, precision, and recall. Drought prediction was treated as the positive (P) class, whereas no drought as the negative (N) class; as per prediction results, $TP$, $FP$, $FN$, and $TN$ are defined as true positive, false positive, false negative, and true negative cases, respectively. Based on the above definition, the metrics are as follows:

Accuracy: The percentage of correct predictions. It is defined as per Equation (15).

$Accuracy = \frac{T P + T N}{T P + F P + F N + T N}$ (15)

Precision: The fraction of true drought predictions among all predicted droughts. It is defined as shown in Equation (16).

$Precision = \frac{T P}{T P + F P}$ (16)

Recall: The fraction of actual droughts that were correctly identified. It is defined by Equation (17).

$Recall = \frac{T P}{T P + F N}$ (17)
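Equations (15) to (17) translate directly into code. The sketch below uses a hypothetical function name and made-up counts chosen only to make the arithmetic easy to follow.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, and recall with drought as the positive class."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)   # Equation (15)
    precision = tp / (tp + fp)                   # Equation (16)
    recall = tp / (tp + fn)                      # Equation (17)
    return accuracy, precision, recall


# Made-up counts for illustration: 200 samples, 60 of them true droughts.
accuracy, precision, recall = classification_metrics(tp=40, fp=10, fn=20, tn=130)
```

With these counts, accuracy is 170/200 = 0.85, precision is 40/50 = 0.80, and recall is 40/60, or about 0.67, which shows how a model can look accurate overall while still missing a third of the drought cases.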

7. Results and Discussion

As discussed, Sentinel-2 data with less than 20% cloud cover were utilized for experimentation. Additionally, an “agricultural land” mask was considered to focus specifically on cropland areas. Different machine learning techniques over 12 remote sensing indices and a temporal data item (DOS, or Day-of-Season) are applied to distinguish drought and non-drought. Ground truth data on drought conditions (Table 2) was used to define class labels. A drought season is defined as the period spanning from October of the previous year to April of the current year. Consequently, Sentinel-2 data from these periods are used to train the machine learning models.

7.1. Model Performance

Four decision-tree-based ensemble classifiers, RF, GB, BGN, and XGB, are utilized. It can be observed from Table 2 that occurrences of drought, i.e., “Yes” labels, are less frequent than non-occurrences, i.e., “No” labels. Hence, there is a data imbalance between drought and non-drought data, which may hamper the performance of the machine learning techniques. Different strategies, primarily oversampling strategies, are applied in this paper to tackle this imbalance. Sentinel images were downloaded for the considered months of each year for each selected region. The images that met the cloud-cover threshold were used to train the models.

To assess the impact of oversampling techniques that were implemented to reduce imbalance, performance assessment is divided into two parts: one carried out on results without oversampling and the other after oversampling on the data was implemented, as detailed in the following sections.

7.1.1. Before Oversampling

The accuracy of the machine learning techniques without oversampling is shown in Table 3. Throughout the paper, the terms accuracy, precision, and recall refer to the overall accuracy of the model, the precision of drought detection, and the recall of drought detection, respectively. Among the evaluated models, the Bagging Classifier demonstrates the most balanced performance across all three metrics. Random Forest closely follows, particularly in precision and accuracy. XGBoost shows competitive performance, especially in recall, while Gradient Boosting lags in all categories, particularly in recall, suggesting its struggle to correctly identify drought conditions.

These trends are further confirmed by the ROC curves and AUC values shown in Figure 6. XGBoost achieves the highest AUC (0.9192), indicating excellent discriminative power, even outperforming Bagging (AUC = 0.9151) despite having slightly lower precision and accuracy. This highlights that XGBoost is highly effective in distinguishing between drought and non-drought cases but may be more sensitive to class imbalance, which might explain the discrepancies in precision. The Bagging Classifier, on the other hand, maintains high AUC while being more stable across metrics, suggesting it handles variance and overfitting more effectively.
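The observation that a model can lead in AUC while trailing in precision follows from how the two quantities are defined: AUC depends only on how scores rank positives against negatives, whereas precision depends on the chosen decision threshold. The pure-Python sketch below illustrates this with made-up scores; the function names are hypothetical and the pairwise AUC formula counts ties as one half.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def precision_at(labels, scores, threshold):
    """Precision of the drought class when predicting 1 for score >= threshold."""
    predicted = [label for label, s in zip(labels, scores) if s >= threshold]
    return sum(predicted) / len(predicted)


# Made-up labels and scores, for illustration only.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.3, 0.2]
auc = roc_auc(labels, scores)
precision_half = precision_at(labels, scores, 0.5)
precision_strict = precision_at(labels, scores, 0.85)
```

Here the AUC is fixed at 0.75, while precision moves from 0.5 to 1.0 as the threshold rises from 0.5 to 0.85.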

Gradient Boosting, with the lowest AUC (0.8134), also records the weakest precision and recall. This performance pattern suggests that Gradient Boosting is more prone to overfitting the majority class, thus underperforming in minority class detection (i.e., drought). The Random Forest model, while slightly trailing Bagging in AUC (0.9075), offers a solid trade-off between precision and recall, making it a dependable baseline. Overall, Bagging and XGBoost appear to be the most promising models in terms of both ROC analysis and classification metrics, with the former being more robust and the latter showing strong potential with proper handling of class imbalance.

To further ensure the reliability of model performance, cross-validation was conducted, and the results are presented in Table 4. The trends observed in the initial evaluation (Table 3) are largely consistent with the cross-validated results. XGBoost continues to exhibit strong performance with the highest mean accuracy (0.8352), precision (0.8271), and recall (0.7747), albeit with slightly higher variance across folds. Bagging and Random Forest follow closely, demonstrating stable and competitive performance with narrower confidence intervals, which indicates robustness across different data partitions. In contrast, Gradient Boosting shows the lowest performance and least stability, especially in recall, reinforcing its limitations in detecting drought instances under class imbalance. These observations are also visually supported by the ROC curves shown in Figure 6, where XGBoost and Bagging achieve the highest AUC values, while Gradient Boosting remains the weakest classifier. The consistency between standard evaluation and cross-validation results strengthens confidence in the models’ generalizability and reinforces the advantage of ensemble techniques, particularly Bagging and XGBoost, in imbalanced classification tasks like drought detection.

7.1.2. SMOTE

SMOTE randomly generates synthetic minority class instances by interpolating between existing samples, thereby balancing the class distribution in the dataset. The impact of SMOTE on the model performances is reported in Table 5. After applying SMOTE, all models show a noticeable drop in both accuracy and precision on the test data. However, there is a consistent improvement in recall, indicating better detection of drought cases (minority class). This suggests that while SMOTE improves the model’s sensitivity to minority instances, it does so at the cost of misclassifying more non-drought cases, thus reducing precision.
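The interpolation idea behind SMOTE can be sketched with NumPy as follows. This is a simplified illustration and not the Imbalanced-learn implementation; the function name, the brute-force neighbor search, and the default parameter values are assumptions.

```python
import numpy as np


def smote_like(X_minority, n_new, k=3, seed=0):
    """Create n_new synthetic points between minority samples and their neighbors.

    Each synthetic point is x_i + lam * (x_j - x_i), where x_j is one of the
    k nearest minority neighbors of x_i and lam is drawn uniformly from [0, 1).
    Requires more than k minority samples.
    """
    rng = np.random.default_rng(seed)
    X_minority = np.asarray(X_minority, dtype=float)
    dist = np.linalg.norm(X_minority[:, None, :] - X_minority[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)              # a point is not its own neighbor
    neighbors = np.argsort(dist, axis=1)[:, :k]
    base = rng.integers(0, len(X_minority), size=n_new)
    chosen = neighbors[base, rng.integers(0, k, size=n_new)]
    lam = rng.random((n_new, 1))
    return X_minority[base] + lam * (X_minority[chosen] - X_minority[base])


# Four made-up minority samples at the corners of the unit square.
corners = [[0, 0], [1, 0], [0, 1], [1, 1]]
synthetic = smote_like(corners, n_new=10)
```

Because every synthetic point lies on a segment between two existing minority samples, new points can land in regions occupied by the majority class, which is consistent with the drop in precision reported after SMOTE.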

Table 6 presents the cross-validated results, where all models, particularly XGBoost, Random Forest, and Bagging, demonstrate a significant boost in all three metrics: precision, recall, and accuracy. These values far exceed the performance observed on the test set, raising concerns about generalizability. This discrepancy likely stems from the way SMOTE operates on the training data alone. While cross-validation evaluates performance on resampled (synthetic) balanced data, the real-world test set remains imbalanced and unaltered. Consequently, the models tend to overfit the synthetic minority patterns seen during training and fail to generalize to actual unseen data. SMOTE may therefore introduce synthetic noise and overfitting, necessitating more data-aware oversampling techniques such as Borderline SMOTE (BSMOTE) or ADASYN for robust performance. ROC curves in Figure 7 support this overall inference.

7.1.3. Borderline SMOTE

Borderline SMOTE improves upon standard SMOTE by focusing on generating synthetic samples near the decision boundary, where minority and majority class instances are more difficult to separate. As shown in Table 7, models such as XGBoost, Bagging, and Random Forest show an improvement in precision and overall accuracy compared to standard SMOTE, while still maintaining decent recall. The recall values are slightly lower than in pure SMOTE, but this trade-off appears beneficial as the precision is much higher, suggesting a better balance between detecting drought events and avoiding false positives. Gradient Boosting again lags in all metrics, continuing its trend of underperformance in imbalanced settings.
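The selection of boundary samples can be sketched as below, following the usual Borderline-SMOTE rule in which a minority sample is "in danger" when at least half, but not all, of its m nearest neighbors belong to the majority class. The function name, the brute-force neighbor search, and the toy data are assumptions for illustration; this is not the Imbalanced-learn implementation.

```python
import numpy as np


def danger_mask(X, y, m=3, minority=1):
    """Flag minority samples whose m nearest neighbors are mostly,
    but not entirely, majority-class points."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    nearest = np.argsort(dist, axis=1)[:, :m]
    n_majority = (y[nearest] != minority).sum(axis=1)
    in_danger = (n_majority >= m / 2) & (n_majority < m)
    return in_danger[y == minority]


# Made-up 1-D data: three minority points far from the majority class,
# two minority points next to it, and four majority points.
X_toy = [[0.0], [0.1], [0.2], [1.0], [1.05], [1.1], [1.2], [3.0], [3.1]]
y_toy = [1, 1, 1, 1, 1, 0, 0, 0, 0]
mask = danger_mask(X_toy, y_toy)
```

Only the two minority points adjacent to the majority cluster are flagged, so synthetic samples would be generated around them rather than around the well-separated minority points.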

Cross-validation results (Table 8) indicate strong consistency across folds, with XGBoost and Bagging achieving the best balance between accuracy, precision, and recall. These findings are supported by the test ROC-AUC values in Figure 8, where Bagging achieves the highest AUC (0.9097), followed closely by XGBoost (0.9071) and Random Forest (0.8974). These high AUC values indicate strong class separability, confirming that Borderline SMOTE effectively enhances the model’s decision boundaries without overfitting to synthetic patterns. The combination of strong AUC and balanced test metrics supports the conclusion that Borderline SMOTE offers more reliable generalization than basic SMOTE, especially for real-world drought detection scenarios.

7.1.4. ADASYN

ADASYN (Adaptive Synthetic Sampling) is employed as a further enhancement over SMOTE-based techniques. It adaptively generates more synthetic samples in regions where the minority class is harder to learn, thus focusing on more informative data space. As reported in Table 9, models such as XGBoost, Bagging, and Random Forest demonstrate improved recall compared to previous oversampling methods, with XGBoost reaching the highest recall (0.8608). However, this improvement in recall comes at the cost of lower precision—indicating more false positives—especially evident in Random Forest and Bagging. Despite this, the overall accuracy remains relatively high, showing that ADASYN offers a well-balanced performance across classes.
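The adaptive allocation can be sketched as follows: each minority sample receives a share of the synthetic samples proportional to the fraction of majority-class points among its k nearest neighbors. The function name, the rounding rule, and the toy data are assumptions for illustration; this is not the Imbalanced-learn implementation.

```python
import numpy as np


def adasyn_allocation(X, y, n_total, k=3, minority=1):
    """Number of synthetic samples to generate around each minority sample."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    nearest = np.argsort(dist, axis=1)[:, :k]
    # Fraction of majority-class neighbors for each minority sample.
    ratio = (y[nearest] != minority).mean(axis=1)[y == minority]
    if ratio.sum() == 0:                    # no minority sample is hard to learn
        return np.zeros(len(ratio), dtype=int)
    return np.rint(ratio / ratio.sum() * n_total).astype(int)


# Made-up 1-D data: three easy minority points, two near the majority class.
X_toy = [[0.0], [0.1], [0.2], [1.0], [1.05], [1.1], [1.2], [3.0], [3.1]]
y_toy = [1, 1, 1, 1, 1, 0, 0, 0, 0]
allocation = adasyn_allocation(X_toy, y_toy, n_total=10)
```

All ten synthetic samples are assigned to the two minority points that sit next to majority points, which is the behavior that raises recall while admitting more false positives.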

Cross-validation results in Table 10 confirm strong generalization capabilities, with consistent gains in all three metrics. Furthermore, the ROC-AUC values on the test set (Figure 9) reinforce these findings: XGBoost achieves the highest AUC (0.9232), closely followed by Bagging (0.9215) and Random Forest (0.9089), confirming their excellent class separability. Although precision is somewhat lower than in Borderline SMOTE, ADASYN yields a better compromise between recall and accuracy, particularly in challenging decision regions. Overall, ADASYN helps models form more complex and adaptive decision boundaries, improving drought detection while maintaining competitive overall performance. An error analysis is further conducted to better understand misclassifications and identify opportunities for future refinement.

7.2. Error Analysis

The confusion matrices of Bagging, XGBoost, Random Forest, and Gradient Boosting are visualized in Table 11 to facilitate a detailed error analysis. Among the models, XGBoost achieves the lowest type-II error (false negatives) of 18.35%, indicating it is more effective at correctly identifying drought conditions, an essential goal of this study. However, this comes with a trade-off in type-I error (false positives), which is relatively higher at 13.00%.

Random Forest and Bagging both offer lower type-I error rates of 11.93% and 11.50%, respectively, indicating better specificity, but at the cost of increased type-II errors of 24.47% and 22.51%, respectively. While their overall performance is strong, these elevated false negatives can be critical in drought detection applications. Gradient Boosting, on the other hand, yields the highest type-II error (39.60%), highlighting its difficulty in identifying drought conditions. This is consistent with its tendency to overfit the majority (non-drought) class. The consistently higher type-II error across all models suggests that the “Drought” class is inherently more complex and harder to detect, potentially due to subtle patterns in the spectral indices. These results strongly support the need for targeted class-balancing strategies to improve minority class detection.

The confusion matrices of the machine learning models trained with SMOTE are summarized in Table 12 to facilitate a comparative error analysis. Overall, the type-II error (false negatives) has decreased across all classifiers, XGBoost, Random Forest, Bagging, and Gradient Boosting, to 19.94%, 20.66%, 19.80%, and 24.82%, respectively. This improvement highlights the benefit of oversampling in reducing missed detections of drought conditions, particularly for models like Gradient Boosting which previously struggled with high false negatives due to class imbalance.

XGBoost and Bagging now show the lowest type-II errors, confirming their enhanced ability to detect minority class instances in the balanced dataset. However, this gain in sensitivity is accompanied by increased type-I errors (false positives), with Gradient Boosting experiencing the largest rise at 27.51%, followed by Random Forest (18.27%), Bagging (17.78%), and XGBoost (17.67%). These elevated false positive rates suggest that while SMOTE effectively addresses minority class underrepresentation, the random generation of synthetic samples may lead to noise and overlap between classes.

Therefore, these findings motivate the use of more sophisticated, data-aware oversampling methods, such as Borderline SMOTE and ADASYN, which strategically generate synthetic points near decision boundaries to further improve classification performance without inflating false positive errors.

Table 13 presents the confusion matrices and corresponding error rates for the models trained with Borderline SMOTE. XGBoost, Random Forest, Bagging, and Gradient Boosting show type-II errors of 22.72%, 28.72%, 25.56%, and 36.44%, respectively. These values are higher than those obtained with standard SMOTE for every model, indicating a trade-off in minority class detection.

Conversely, type-I errors have decreased across all models, with values of 9.24% for XGBoost, 9.35% for Random Forest, 8.24% for Bagging, and 17.15% for Gradient Boosting, reflecting improved specificity. These results suggest that while Borderline SMOTE effectively enhances classification near the decision boundaries and reduces false positives, the overall minority class detection is not solely governed by boundary samples. The presence of ambiguous or complex decision regions likely contributes to classification challenges, highlighting the need for further exploration of oversampling techniques tailored to such intricacies.

ADASYN not only generates synthetic samples near the decision boundary but also prioritizes harder-to-classify regions, adapting the sampling density according to data complexity. The confusion matrices for XGBoost, Random Forest, Bagging, and Gradient Boosting after ADASYN oversampling are presented in Table 14.

From these, the type-I and type-II error rates are calculated as follows: XGBoost exhibits a type-I error of 15.45% and a type-II error of 13.93%, Random Forest shows 16.41% and 17.06%, Bagging has 15.51% and 14.51%, and Gradient Boosting presents 28.82% and 24.22%. Relative to standard SMOTE, both error types improve for XGBoost, Random Forest, and Bagging, which corresponds with higher overall accuracy and reflects a better understanding of the complex data distribution.
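The two error rates can be derived from a confusion matrix as sketched below. The definitions used here, type-I as FP / (FP + TN) and type-II as FN / (FN + TP), are an assumption; the type-II definition agrees, to rounding, with the reported ADASYN figures for XGBoost (recall 0.8608, type-II error 13.93%). The counts in the example are made up.

```python
def error_rates(tp, fp, fn, tn):
    """Type-I (false positive) and type-II (false negative) error rates.

    Assumed definitions: type-I = FP / (FP + TN), type-II = FN / (FN + TP),
    so that type-II equals 1 - recall.
    """
    type_1 = fp / (fp + tn)
    type_2 = fn / (fn + tp)
    return type_1, type_2


# Made-up counts, for illustration only.
type_1, type_2 = error_rates(tp=40, fp=10, fn=20, tn=130)
```

For these counts the type-I error is 10/140, about 7.1%, and the type-II error is 20/60, about 33.3%, a pattern similar to the one reported before oversampling, where missed droughts outnumber false alarms in relative terms.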

The results suggest that the combination of the 12 spectral indices along with the temporal feature DOS creates an overlapping class distribution with intricate decision boundaries. This complexity partly explains the comparatively suboptimal performance of Gradient Boosting, which continues to exhibit higher error rates despite the adaptive oversampling.

This study focuses on agricultural drought during the Rabi season, with data collected across all phenological phases, from sowing to harvesting. Spectral indices vary throughout these phases, creating complex decision boundaries between drought and non-drought. Since drought is declared on a seasonal basis, detecting it effectively requires considering data across the entire season. However, using only seasonal data would result in a small dataset, unsuitable for machine learning. To address this, a seasonal majority voting strategy is applied, where individual observations within each Rabi season are aggregated to determine the overall drought condition. This approach balances the need for sufficient data with the ability to capture temporal patterns within the season.

7.3. Model Performance (Season Majority Voting Strategy)

In the next step, the performance of the machine learning models was studied per Rabi season. For each region in a given year (i.e., Rabi season), all predicted labels were aggregated, and a final label was assigned based on majority voting. The results of this seasonal majority voting strategy without oversampling are presented in Table 15. It can be observed that the XGBoost and Bagging classifiers achieve the highest accuracy, precision, and recall values of 96.30%, while Random Forest attains a slightly lower performance at 90.00%. Gradient Boosting lags behind with 83.33%. The limited size of the yearly sample pool results in a small number of distinct accuracy values, but overall, these models show strong season-level classification performance, with few misclassifications across the ten-year dataset from three different locations (Table 16).

The impact of SMOTE oversampling on season-wise voting is summarized in Table 17. Here, Random Forest and Bagging classifiers show improved accuracies of 93.33%, surpassing XGBoost, which drops to 90.00%. Gradient Boosting sees a slight decrease in accuracy to 80.00%. This indicates that while SMOTE helps balance the data, it may not uniformly improve performance across all models (Table 18).

Table 19 reports the results of Borderline SMOTE. Borderline SMOTE offers consistent performance across models, with XGBoost achieving 93.33% accuracy, and Random Forest and Bagging classifiers at 90.00%. Gradient Boosting remains at 83.33% (Table 20). Table 21 reports the results of ADASYN.

ADASYN oversampling yields the best results overall, with XGBoost reaching perfect classification accuracy of 100%, and Random Forest and Bagging both achieving 96.67%. Gradient Boosting improves modestly to 86.67%. These results suggest that ADASYN’s adaptive strategy effectively addresses data complexity, enhancing minority class detection and improving overall season-level predictions (Table 22).

7.4. SHAP Analysis

SHAP analysis was conducted on machine learning algorithms before and after oversampling to better understand the most influential features.

7.4.1. Before Oversampling

A comparison of four machine learning models using SHAP values showed both common patterns and some unique differences in how they predicted drought occurrence or lack thereof. Across all models, the Normalized Multi-band Drought Index (NMDI) stood out as the most important factor, showing that vegetation moisture and plant health play a big role in determining crop productivity. However, the strength of NMDI’s effect was not the same for every model.

In XGBoost (Figure 10a), NMDI shows a long horizontal stretch in both directions and the highest contribution. This wide horizontal spread indicates a high variance in SHAP values and a strong and varied impact on the model’s output across samples. The red points (high NMDI values) being spread mostly on the positive SHAP value side suggests that higher NMDI values contribute significantly to predicting drought, while the blue points (low NMDI values) on the negative side indicate an association with non-drought conditions.

There is a mix of red and blue around zero, but the points mostly remain on their respective sides (Figure 10a). This suggests a stable direction of impact even when NMDI values are closer to average. In contrast, NDWI shows a narrower SHAP value range concentrated near zero, with high (red) values leaning toward a positive influence. The broadening of the plot around zero with red points indicates that although NDWI’s contribution is generally smaller than NMDI’s, it is still meaningful.

Despite its limited SHAP value range, NDWI appears as the second most important feature, possibly due to its consistent but moderate contribution across many samples rather than isolated strong effects. For DOS and NDMI, lower values contribute towards detecting droughts. However, the blue values for NDMI spread over the negative side, which demonstrates the importance of DOS over NDMI (Figure 10a). Similar observations can be made for RECI, although in its case higher values contribute towards the detection of drought. DOS, NDMI, and RECI have a nearly similar impact. RDI and ARVI show a moderate stretch; however, a significant number of their values are around zero. Thus, their contribution is much lower.
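The point about NDWI, a narrow SHAP range yet a high rank, follows from using the mean absolute SHAP value as the importance measure. The sketch below uses made-up SHAP values and hypothetical helper names to show that a feature with a small but consistent contribution can outrank one with a few large, isolated contributions.

```python
def mean_abs(values):
    """Mean absolute SHAP value, the importance measure used for ranking."""
    return sum(abs(v) for v in values) / len(values)


def value_range(values):
    """Horizontal extent of the feature in a beeswarm plot."""
    return max(values) - min(values)


# Made-up SHAP values for ten samples, for illustration only.
steady = [0.15, -0.15] * 5          # moderate contribution on every sample
spiky = [0.6, -0.6] + [0.0] * 8     # strong contribution on two samples only

steady_importance = mean_abs(steady)
spiky_importance = mean_abs(spiky)
```

The steady feature spans a range of only 0.3 against 1.2 for the spiky one, yet its mean absolute value is about 0.15 against 0.12, so it ranks higher in a bar chart of mean absolute SHAP values.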

Similarly, NMDI shows prominent values with Random Forest (Figure 10b). The higher and lower values have a wide horizontal spread in the positive and negative directions, respectively. The plot is wider near zero; however, the higher and lower values of NMDI remain in their respective directions, as above. Notably, in RF, DOS is the second most influential feature. Lower values of DOS contribute to detecting more of the droughts. DOS shows a clear separation of lower and higher values (blue and red) even at zero. NDWI exhibits a distribution where red appears on both the positive and negative SHAP value sides, while blue clusters near the center (Figure 10b). This pattern suggests that high NDWI values can have both positive and negative effects on the model’s prediction of drought, whereas low NDWI values have minimal or neutral influence. RDI and NDMI show similar distributions.

In the Bagging Classifier, wider horizontal spreads can be observed for different indices. Similar to RF and XGBoost, NMDI has a higher magnitude and a long horizontal spread (Figure 10c). Notably, the two top-most features are NMDI and DOS. They both show a long horizontal spread with fewer values near zero. However, other features, such as RDI, NDWI, and NDMI, show a higher width close to zero. Hence, they are significant, but less so than NMDI and DOS. The beeswarm plot for Gradient Boosting shows that NMDI is the most contributing feature. For RECI, the pattern suggests that low values can have both positive and negative effects on the model’s prediction of drought, whereas high RECI values have minimal or neutral influence. A similar pattern can be observed with NDWI, where high values can have both positive and negative effects and lower values have a neutral influence.

The techniques show the strong connection between NMDI and drought occurrence, i.e., these models were more sensitive to changes in vegetation moisture. DOS, which tracks the timing within the growing season, and NDWI were also important across all models, though their level of influence varied. DOS ranked as the second most important factor. These results point to the fact that different models pick up on different aspects of the environment, and understanding these differences can help create better predictions. It also confirms that factors like vegetation moisture, seasonal timing, and plant health work together in complex ways to affect crop yields.

7.4.2. SMOTE

From the magnitude of the SHAP values resulting from our analysis (Figure 11), we observe that NMDI emerges as the most significant feature, followed closely by DOS. NMDI is the top contributor for the XGBoost, RF, and Bagging models. RECI and EVI also demonstrate consistent importance in all models. Notably, RECI is one of the top contributors for the Bagging classifier and remains consistently among the top features in other models. SAVI, NDVI, and TVI consistently show narrow spreads around zero, indicating minimal impact across predictions for all the models. Their influence is low in magnitude and consistent across data points, making them the least important features. SHAP values are mostly concentrated around zero in the Gradient Boosting model. This suggests that these features may have both positive and negative impacts on predictions across different samples, resulting in an overall distribution centered around zero. Similarly, both NDMI and RDI exhibit a highly concentrated distribution of SHAP values. NDMI is slightly negative on average, while higher RDI contributes more to the positive results. NMDI provides a marginal contribution towards the classification of non-drought conditions. In contrast, RDI displays a more positive SHAP value, which contributes significantly to the classification of drought conditions. With the Bagging Classifier, RDI exhibits the widest spread in SHAP values, which suggests that variations in this feature can lead to both strong positive and negative contributions to the prediction.

7.4.3. Borderline SMOTE

For Borderline SMOTE, DOS and NMDI each emerge as the top contributor for two models when ranked by SHAP magnitude, followed by NDWI. In contrast, SAVI, NDVI, and TVI consistently rank as the least-contributing features across the models.

As shown in Figure 12a, NMDI is by far the most influential feature in XGBoost: it has the highest contribution and a long horizontal stretch in both directions. This wide horizontal spread indicates a high variance in SHAP values and a strong, varied impact on the model’s output across samples. The positive SHAP values suggest that higher NMDI values contribute significantly to predicting drought, while low NMDI values on the negative side indicate an association with non-drought conditions. There is a mix of red and blue around zero, but the points mostly stay on their respective sides (Figure 12a), which suggests a stable direction of impact even when NMDI values are closer to average. DOS is the second most influential feature, although its magnitude is much lower than that of NMDI. Lower values of DOS contribute to detecting drought, and higher values to detecting non-drought. The distribution is wide at zero, yet lower and higher values remain clearly separated even there. RECI shows a magnitude of influence close to that of DOS, and NDWI emerges as the fourth most influential feature.

Similar to XGBoost, NMDI appears as the most influential feature in RF (Figure 12b). DOS appears as the second most influential feature; however, its magnitude is higher than in XGBoost. Notably, some of the higher values of NMDI are found to be counterproductive, as shown by the greater width and the negative red values in Figure 12b. The third and fourth most influential features are NDWI and RDI. Most of the other features have a short horizontal stretch and a greater width at the center, which indicates their lower influence.

For the Bagging Classifier, NMDI and DOS have nearly equal impacts, as shown in Figure 12c. RDI appears as the third most influential feature. All the indices have a larger horizontal spread, as observed in most of the Bagging Classifier examples in this paper (Figure 12c). NMDI and DOS both have the longest horizontal stretch, and both show a greater width at zero with significant overlap. A portion of the higher NMDI values contributes to the detection of non-drought, whereas most of the higher values contribute to detecting drought. Similarly, a portion of the higher DOS values contributes to the detection of drought, whereas most of the higher values contribute to detecting non-drought. This observation suggests that the relationship between the features and drought classification is non-linear and context-dependent. Although NMDI generally captures vegetation and soil moisture conditions, its higher values sometimes indicate non-drought (healthy moisture levels), while in most cases they are associated with drought. This dual behavior implies that, with Bagging Classifiers, NMDI alone cannot fully disambiguate drought conditions without contextual cues (e.g., timing, crop stage). A similar inference can be drawn for DOS.

For Gradient Boosting, as shown in Figure 12d, NMDI appears to be the most influential feature and has a large horizontal spread with minimal overlap of higher and lower values. However, there is significant width at the center. Hence, although NMDI is influential, it cannot differentiate drought from non-drought on its own. RECI appears as the second most influential feature; its higher values have minimal significance, as they are concentrated near zero, whereas its lower values are spread in the negative direction, with a significant number of values on the positive side (Figure 12d). DOS and NDMI are among the other impactful features. DOS shows a strong relationship: its lower values are effective for drought and its higher values for non-drought, and the distinction is clear even at zero. These patterns suggest that while DOS provides a clear directional signal for drought prediction, RECI contributes more subtly.
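The directional reading used above (whether high or low feature values push the prediction toward drought) can be approximated numerically by comparing the average SHAP value of samples with high feature values against that of samples with low feature values. The sketch below splits at the median of the feature; this split, the function name `direction_of_effect`, and the toy numbers are assumptions made for illustration and are not part of the study.

```python
from statistics import mean, median


def direction_of_effect(feature_values, shap_values):
    """Mean SHAP value for samples above vs. at-or-below the feature median."""
    cut = median(feature_values)
    high = [s for x, s in zip(feature_values, shap_values) if x > cut]
    low = [s for x, s in zip(feature_values, shap_values) if x <= cut]
    return {
        # None marks an empty group (e.g., when all feature values are equal).
        "high_mean_shap": mean(high) if high else None,
        "low_mean_shap": mean(low) if low else None,
    }


# Toy values mimicking a feature whose high values push toward the positive class.
feature = [0.2, 0.3, 0.6, 0.7]
shap_for_feature = [-0.30, -0.20, 0.25, 0.40]

effect = direction_of_effect(feature, shap_for_feature)
print(effect)
```

A clear sign difference between the two means corresponds to a clean red/blue separation in a summary plot, whereas means close to each other correspond to the overlapping, context-dependent pattern noted for the Bagging Classifier.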

7.4.4. ADASYN

When ADASYN is used for oversampling, NMDI, NDWI, RECI, and RDI emerge as the most influential features across all models. They have high SHAP values, indicating a large impact on the model output. For example, in the XGBoost model, NMDI exhibits the highest SHAP values, ranging up to 1.5, suggesting a notable positive influence on predictions. NDWI and RECI show nearly the same magnitude (Figure 13a). Higher values of RECI are concentrated at zero, whereas lower values are spread over both positive and negative SHAP values, indicating that they contribute variably to both drought and non-drought classifications. Low RECI values carry more predictive significance, but their influence is context-dependent (Figure 13a).

Similarly, in the RF model, NMDI has a significant impact. Further, NDWI and RDI show high SHAP values with a broader range, indicating both positive and negative impacts depending on feature values (Figure 13b). Lower NDWI values are concentrated close to zero. Higher NDWI values stretch further into the positive range, with a significant number of values on the negative side (Figure 13b), reflecting context-dependent contributions. RDI, the third most influential feature, shows a clear distinction between higher and lower values; however, a significant number of values are concentrated at zero. This indicates that while RDI is influential, its effect is not uniformly strong across all instances. DOS, the fourth most influential feature, has a longer stretch and also shows a clear distinction between low (positive) and high (negative) values (Figure 13b). This reflects a meaningful agronomic interpretation in which the timing within the season correlates with drought conditions, likely due to moisture availability and crop response.

The Bagging model shows tendencies similar to the previous models, with a longer horizontal stretch, but it also has a narrower range of SHAP values (Bagging: −0.4 to 0.6; RF: −0.2 to 0.2), suggesting a more moderate influence of features (Figure 13c). NMDI appears to be the most influential feature. RDI and DOS are the second and third most impactful features, both having a long horizontal stretch and a clear distinction between lower and higher values along the axis (Figure 13c). The impact of the remaining features decreases gradually, with notable contributions from NDWI, RECI, and NDMI.

As shown in Figure 13d, NMDI is the most influential feature for Gradient Boosting. RECI and NDWI are the second and third most influential features. RDI and ARVI are among the other impactful features, at comparable levels (Figure 13d). TVI, SAVI, and NDVI consistently have the lowest SHAP values across all models, which implies minimal impact on predictions. The SHAP summary plots also stress the variability in feature importance across models. This detailed analysis highlights the critical role of NMDI, RDI, NDWI, RECI, and DOS in driving model predictions, while also revealing model-specific differences in feature sensitivity and impact magnitude.

7.5. Model Aggregation for Most Relevant Features

One of the objectives of this work is to find the most influential spectral indices for the separation of drought and non-drought. Spectral indices vary in their sensitivity to drought-related changes in vegetation, soil moisture, and canopy structure. Different features influence machine learning algorithms differently, depending on the algorithms’ internal mechanisms. Further, each machine learning algorithm has its own distinctive advantages. As an example, XGB is well-suited for non-linear relationships, whereas RF and BGN use ensemble averaging and random subspace sampling to reduce variance, making them more robust to noise. Hence, relying on a single algorithm risks overlooking critical features that are algorithm-specific or context-dependent. A consensus is therefore required that aggregates feature importance rankings across algorithms, so that the features consistently influential across diverse model assumptions can be identified. Borda Count and Weighted Sum are used here for model aggregation. Borda Count is simple to interpret and works with ranks, not with the associated scores. Hence, it is less sensitive to extreme values and treats all models equally, avoiding bias toward any single algorithm. It is better suited when all the models are equally trusted, which is relevant here because XGB, RF, and BGN perform similarly in some cases. However, Borda Count ignores the magnitude of importance: for example, a feature ranked 1st in one model and 9th in another receives the same total as a feature ranked 5th in both. Weighted Sum incorporates the magnitudes of importance in the aggregation. Although Weighted Sum can be sensitive to outliers, its performance improves when unbiased contributions can be summarized, as with SHAP. Hence, both Borda Count and Weighted Sum are used, as discussed below.
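The two aggregation schemes described above can be sketched in a few lines. This is an illustrative sketch only: the scoring convention for Borda Count (a feature at position p among n features earns n − 1 − p points), the per-model normalization in the Weighted Sum, the equal model weights, and the toy importance values are all assumptions, since this section does not state the exact conventions used in the study.

```python
def borda_count(rankings):
    """rankings: dict model -> list of features, most important first."""
    scores = {}
    for ordered in rankings.values():
        n = len(ordered)
        for position, feature in enumerate(ordered):
            # Rank-based points only; importance magnitudes are ignored.
            scores[feature] = scores.get(feature, 0) + (n - 1 - position)
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


def weighted_sum(importances, weights=None):
    """importances: dict model -> dict feature -> mean |SHAP| (non-negative)."""
    scores = {}
    for model, per_feature in importances.items():
        w = 1.0 if weights is None else weights[model]
        total = sum(per_feature.values()) or 1.0  # guard against an all-zero model
        for feature, value in per_feature.items():
            # Normalizing per model keeps one model's scale from dominating (assumption).
            scores[feature] = scores.get(feature, 0.0) + w * value / total
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy mean |SHAP| values per model; illustrative only, not results from the study.
importances = {
    "XGB": {"NMDI": 0.50, "DOS": 0.25, "NDWI": 0.15, "NDVI": 0.10},
    "RF": {"NMDI": 0.40, "DOS": 0.30, "NDWI": 0.20, "NDVI": 0.10},
    "BGN": {"DOS": 0.35, "NMDI": 0.30, "NDWI": 0.25, "NDVI": 0.10},
}
rankings = {m: sorted(v, key=v.get, reverse=True) for m, v in importances.items()}

borda = borda_count(rankings)
weighted = weighted_sum(importances)
print("Borda Count: ", borda)
print("Weighted Sum:", weighted)
```

In this toy case both schemes produce the same order, but the Weighted Sum retains how far apart the features are, whereas Borda Count only retains their positions.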

7.5.1. Before Oversampling

The top five features based on Weighted Sum and Borda Count are shown in Figure 14a and Figure 14b, respectively. By Borda Count, the top two features are NMDI and DOS, with NDWI being the third most valuable feature. Weighted Sum ranks NMDI, NDWI, and DOS as the most valuable features, in this order. In Weighted Sum, NDMI is the fourth most influential feature, whereas it is fifth in Borda Count. RDI appears in the top five only in Borda Count, whereas RECI appears in the top five only in Weighted Sum. The two aggregators agree on four of the top five features, i.e., NMDI, DOS, NDWI, and NDMI.

To further understand their influence, the top features are employed by the machine learning techniques to assess their performance, as shown in Table 23 (Weighted Sum) and Table 24 (Borda Count). It can be observed that the top feature (NMDI) contributes significantly in both cases. NDWI improves the performance significantly, except in XGB. The inclusion of DOS provides >70% accuracy in XGB, RF, and BGN in all cases. Hence, NMDI, NDWI, and DOS contribute significantly to the classification. The top five features in Borda Count and Weighted Sum achieve an accuracy level close to that of using all features. Borda Count and Weighted Sum have four features in common, whereas Borda Count includes RDI and Weighted Sum includes RECI.

7.5.2. SMOTE

Aggregations were also studied with SMOTE. The top two features in Weighted Sum and Borda Count are the same; however, the top five features differ. NMDI and DOS are the most influential features in both cases (Figure 15a,b). NMDI contributes more significantly than DOS in Weighted Sum, as shown in Figure 15a, whereas the two contribute nearly equally in Borda Count (Figure 15b). NMDI and DOS were also among the top three influential features before oversampling. NDWI was found influential both before oversampling and with SMOTE.

Notably, NDMI, which was influential before oversampling, is absent in Borda Count. RECI, which was absent from the top five features in Borda Count before oversampling, is among the top five features in both cases. RDI is found influential in Borda Count, as it was before oversampling. The performance of the top features in Weighted Sum and Borda Count is shown in Table 25 and Table 26, respectively. The top two features contribute significantly to the performance in both cases, and the inclusion of the third feature improves it further. Notably, the performance obtained with the top-ranked features of Weighted Sum and Borda Count is not on par with that of all the features.

7.5.3. Borderline SMOTE

As shown in Figure 16a,b, the top five features are the same as before oversampling (Figure 14a,b), but with a different ranking. NMDI and DOS are the two most influential features in both cases, with high rankings. NDMI and NDWI are also found among the most important features, with different rankings in Borderline SMOTE.

As shown in Table 27 (Weighted Sum) and Table 28 (Borda Count), these two features capture the essential signal for accurate predictions, providing ≈70% accuracy. In addition, the inclusion of all five features boosts the performance to the level obtained using all the features.

7.5.4. ADASYN

The top features in ADASYN differ from those in SMOTE and Borderline SMOTE (Figure 17). The ranking of the features also differs between Weighted Sum and Borda Count. Hence, NMDI, which is present among the top features under all the oversampling techniques, provides significance to the majority class. NMDI and NDWI provide significant performance in both Weighted Sum and Borda Count. Notably, DOS, which is among the top three in all the other cases, is in fifth position in both Weighted Sum and Borda Count. Feature subsets selected via weighted ranking (Table 29) and Borda Count (Table 30) yield accuracy comparable to using all features, demonstrating that dimensionality reduction can be achieved without sacrificing predictive power.

Notably, in all cases, the top feature alone provides accuracy only slightly better than chance or random guessing (0.5). However, accuracy improves significantly with the addition of the second and third best features. This suggests that no single feature alone is sufficient to reliably predict drought; hence, detection of drought requires combining multiple factors. The most influential feature alone does not separate classes well, but it still plays a key role when combined with others. The second and third most influential features may complement or refine the information provided by the most influential feature, as supported by the accuracy boost. However, the second and third most influential features alone may not perform well without the first. The significant leap in accuracy suggests a synergistic effect where these features work better together. In most of the experiments, NMDI and DOS emerge as the two most influential features. NMDI was designed specifically for drought monitoring [56]. An exponential model proposed in [123] describes the change in soil reflectance due to moisture, as shown in Equation (18).

$$R = f \times R_{dry} + (1 - f) \times R_{dry} \times e^{-c \times \theta} \tag{18}$$

Here, $R$, $f$, $\theta$, $R_{dry}$, and $c$ are the soil reflectance at a particular wavelength, the ratio of the saturated to the dry reflectance, the volumetric soil water content, the reflectance of dry soil (at $\theta = 0$), and the rate of soil reflectance change with moisture, respectively. Based on the insights from Equation (18), NMDI was proposed, which uses the NIR, SWIR1, and SWIR2 bands [56]. The NIR band is insensitive to changes in leaf water content [56]. Further, NMDI uses the slope between two liquid water absorption bands [56], and such a combination has shown, in the literature, high potential to estimate water content in vegetation and soil. Therefore, NMDI is regarded as a well-established indicator of vegetation water status and soil water availability [56]. Thus, NMDI shows strong evidence of distinguishing between drought and non-drought, which underscores the central role of vegetation and soil moisture stress in drought detection. The emergence of the combination of NMDI and DOS as dominant predictors also aligns with their ecological and agronomic relevance. DOS captures the temporal position of an observation within the crop season: it implicitly reflects phenological development stages and the interaction of crop phenological dynamics with climatic variability. Lower DOS values typically align with early-season conditions, where sowing, germination, and initial rainfall distribution are critical. Higher DOS values coincide with later phenological stages such as flowering or grain filling, which are highly sensitive to moisture deficits. The strong influence of DOS in the SHAP analysis suggests that drought is expressed through spectral vegetation stress in combination with the timing of agricultural cycles. Therefore, anomalies revealed by DOS, such as delayed emergence or early senescence, can be associated with moisture stress or temperature extremes. Thus, DOS provides strong evidence for the drought classification. This interpretation highlights that the dominance of DOS and NMDI is not merely an algorithmic artifact but indicates the dual nature of drought impacts, i.e., biophysical stress on vegetation (NMDI) and temporal alignment of agricultural activities (DOS). Further insights and experimentation are regarded as future work.
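As a small numerical illustration of the two ingredients discussed above, the sketch below evaluates the exponential soil reflectance model of Equation (18) and the widely used form of NMDI, (NIR − (SWIR1 − SWIR2)) / (NIR + (SWIR1 − SWIR2)). The parameter values (`r_dry`, `f`, `c`) and the reflectances are hypothetical, and the mapping of Sentinel-2 bands to NIR, SWIR1, and SWIR2 is left to the caller because it is not restated in this section.

```python
import math


def soil_reflectance(r_dry, f, c, theta):
    """Equation (18): R = f*R_dry + (1 - f)*R_dry*exp(-c*theta)."""
    return f * r_dry + (1.0 - f) * r_dry * math.exp(-c * theta)


def nmdi(nir, swir1, swir2):
    """NMDI from NIR and the difference (slope) of the two SWIR bands."""
    slope = swir1 - swir2
    denominator = nir + slope
    if denominator == 0:
        return float("nan")  # undefined for this combination of reflectances
    return (nir - slope) / denominator


# Hypothetical soil parameters: dry reflectance 0.30, saturated/dry ratio 0.5, rate 10.
for theta in (0.0, 0.1, 0.3):
    r = soil_reflectance(r_dry=0.30, f=0.5, c=10.0, theta=theta)
    print(f"theta={theta:.1f} -> R={r:.4f}")

# Hypothetical surface reflectances for one pixel.
print(f"NMDI={nmdi(nir=0.30, swir1=0.25, swir2=0.15):.3f}")
```

With these toy parameters the modelled reflectance equals the dry reflectance at zero water content and decreases toward `f * r_dry` as the soil gets wetter, which is the behaviour that motivates the use of the SWIR bands in NMDI.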

7.6. Statistical Comparison of Oversampling Methods

To rigorously evaluate the impact of oversampling techniques on model performance, we conducted comprehensive pairwise comparisons using McNemar’s test across four ensemble classifiers: XGBoost, Random Forest, Bagging, and Gradient Boosting. The four sampling methods compared were No Sampling, SMOTE, BSMOTE, and ADASYN, assessed on key classification metrics (Table 31).
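For readers unfamiliar with McNemar’s test, the following minimal sketch shows the exact (binomial) form of the test applied to two sets of predictions on the same test samples. It is an illustration under stated assumptions: the study does not specify here whether the exact or the chi-square variant was used, and the labels and predictions below are toy values, not data from the experiments.

```python
from math import comb


def mcnemar_exact(y_true, pred_a, pred_b):
    """Exact two-sided McNemar test on the discordant pairs of two classifiers."""
    # b: A correct and B wrong; c: A wrong and B correct.
    b = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a == t and p != t)
    c = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a != t and p == t)
    n = b + c
    if n == 0:
        return b, c, 1.0  # the classifiers never disagree in correctness
    k = min(b, c)
    # Two-sided binomial tail with success probability 0.5, capped at 1.
    p_value = min(1.0, 2.0 * sum(comb(n, i) for i in range(k + 1)) / 2 ** n)
    return b, c, p_value


# Toy labels (1 = drought) and predictions from two hypothetical training setups.
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
pred_no_sampling = [0, 0, 0, 0, 0, 1, 0, 0, 0, 1]
pred_oversampled = [1, 1, 1, 1, 1, 1, 0, 0, 1, 1]

b_count, c_count, p_value = mcnemar_exact(y_true, pred_no_sampling, pred_oversampled)
print(f"b={b_count}, c={c_count}, p={p_value:.5f}")
```

Only the samples on which the two classifiers disagree in correctness enter the statistic, which is why the test can remain non-significant even when a metric such as the F1-score improves.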

McNemar’s test primarily revealed significant differences when compared to other methods. However, this test may not fully capture improvements in more nuanced metrics such as the F1-score. As illustrated in Figure 18 and Table 32, ADASYN consistently outperformed No Sampling and SMOTE across most classifiers, yielding F1-score improvements ranging from +0.0118 (XGBoost) to +0.0244 (Bagging). BSMOTE also demonstrated promising results, notably with a near-significant accuracy improvement in Gradient Boosting (p = 0.0501) and a +0.0175 F1-score gain over No Sampling. In Table 32, the best F1 scores for a model among oversampling methods are highlighted in green.

The relatively poorer performance of SMOTE may be attributed to the generation of synthetic samples that are less representative or noisier, which can degrade ensemble classifier performance. In contrast, ADASYN and BSMOTE focus on generating samples in more challenging or borderline regions, enhancing classifier learning in these areas. Overall, these findings indicate that ADASYN is the most effective oversampling method in this setting, providing practically meaningful improvements in F1-score supported by statistically significant pairwise differences where detected. BSMOTE also shows potential benefits, particularly for Gradient Boosting. Future work could incorporate statistical tests directly on F1-score differences to further substantiate these conclusions.
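The interpolation step shared by these oversampling methods can be sketched as follows. This is a simplified, SMOTE-like illustration written for this text, not the implementation used in the study: every minority sample is equally likely to be chosen as a base point, whereas Borderline SMOTE restricts the base points to minority samples near the class boundary and ADASYN generates more synthetic samples for minority points surrounded by majority neighbours. The function name, the parameters, and the toy samples are hypothetical.

```python
import random


def smote_like_samples(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority points by interpolating toward a near neighbour."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the other minority samples, sorted by squared distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: sum((a - b) ** 2 for a, b in zip(base, minority[j])))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Toy minority (drought) samples in a two-feature space; illustrative values only.
drought_samples = [(0.30, 40.0), (0.35, 55.0), (0.28, 60.0), (0.40, 35.0)]
new_points = smote_like_samples(drought_samples, n_new=5)
print(len(new_points), "synthetic samples generated")
```

In practice, library implementations such as the `SMOTE`, `BorderlineSMOTE`, and `ADASYN` classes of imbalanced-learn would normally be preferred over a hand-written sketch like this one; features on very different scales would also need to be standardized before computing distances.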

Beyond the algorithmic performance, our statistical findings offer direct pathways to enhance real-world drought monitoring operations. The exceptional accuracy (96.67%) achieved by the seasonal majority voting strategy for models like XGBoost and Bagging Classifier is not merely a metric but indicates a high potential for operational deployment. Such a model could serve as a reliable, automated tool for generating district-level drought alerts early in the Rabi season, providing valuable lead time for state governments to initiate mitigation measures and resource allocation. Furthermore, the feature ablation study reveals that near-optimal performance can be maintained using only the top five—or even top three—features. This is a significant finding towards operational efficiency, as it suggests monitoring programs can focus computational resources and bandwidth on calculating and analyzing a subset of key indices (e.g., NMDI and DOS) without a substantial loss of predictive power, simplifying the workflow for agencies like the NRSC, CRIDA, or IMD. Finally, the comparative analysis of oversampling techniques provides a practical solution to a common problem in operational settings: the severe class imbalance between drought and non-drought years. The demonstrated effectiveness of ADASYN in improving model sensitivity offers a validated methodological choice for building robust monitoring systems that are less likely to miss critical drought events, despite their rarity in the historical record. These connections underscore that our study provides not just a methodological framework, but also practical insights for building more efficient, accurate, and reliable drought monitoring systems.

7.7. Limitations of the Proposed Work

In this work, twelve multispectral indices are considered: NDVI, EVI, ARVI, NDWI, SAVI, TVI, NDMI, NMDI, MNDWI, MNDVI, RDI, and RECI, along with different machine learning and oversampling models. Using such indices and machine learning models poses certain limitations, as follows. These indices may behave differently across diverse agro-climatic zones; their sensitivity may depend upon crop type, soil background, atmospheric conditions, and land management practices. These variations call for region-specific fine-tuning for a better outcome. A major constraint is the limited availability of ground-based observations needed for validating satellite-derived indices and enhancing model generalizability, which will be addressed in the future. Further, although Sentinel-2 provides open-access, high-quality data, obtaining cloud-free acquisitions (<20% cloud cover) can be challenging in certain regions or seasons. Additionally, the processing of large-scale remote sensing data combined with machine learning models demands substantial computational resources and storage capacity. This presents a limitation of the current work, as is common with AI-driven modeling approaches.

8. Conclusions

Droughts can have severe consequences, particularly in the agricultural sector, where reduced water availability may catastrophically impact crop yield and food quality. In India, an agrarian nation with climatic heterogeneity, the repercussions of agricultural droughts are especially profound. Early detection of drought, particularly in India, is useful yet challenging. Spectral indices and machine learning have become widely used in drought studies, yet the combined influence of multiple spectral and temporal parameters across diverse climatic zones remains insufficiently explored. This study investigates the use of multispectral Sentinel-2 remote sensing indices and machine learning techniques to detect drought conditions in three distinct regions of India, namely Jodhpur, Amravati, and Thanjavur, during the Rabi season (October to April).

One of the key contributions of this work lies in procuring a structured ground truth drought dataset for the Rabi seasons across ten years and three diverse regions. The drought dataset was compiled by corroborating official drought declarations with regional and national reports, consulting various sources, and navigating the local declaration process. To enable the application of machine learning models, a district–year-level ground truth table and a seasonal aggregation strategy were used: for each district and Rabi season, acquisitions have been considered separately and then combined through a majority voting process. This enables training of machine learning models on a time-series dataset with high temporal resolution, while preserving the integrity of seasonal drought definitions in the labeling process. Using twelve spectral indices and one temporal index, we trained several machine learning models and used them to identify droughts. While individual-date predictions achieved accuracies above 82%, seasonal aggregation substantially improved performance, exceeding 90% for most models.
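The seasonal aggregation step summarized above can be illustrated with a short sketch: per-acquisition predictions are grouped by district and Rabi season, and the majority label becomes the seasonal prediction. The function name, the season labels, the toy predictions, and the tie-breaking rule (ties resolved toward drought) are assumptions made for illustration; this section does not specify how ties were handled in the study.

```python
from collections import Counter, defaultdict


def seasonal_majority_vote(predictions):
    """predictions: iterable of (district, season, label), label 1 = drought, 0 = non-drought."""
    grouped = defaultdict(list)
    for district, season, label in predictions:
        grouped[(district, season)].append(label)
    seasonal = {}
    for key, labels in grouped.items():
        counts = Counter(labels)
        # Assumption: a tie is resolved toward drought (1).
        seasonal[key] = 1 if counts[1] >= counts[0] else 0
    return seasonal


# Toy per-acquisition predictions; district names are from the study, labels are invented.
per_date_predictions = [
    ("Jodhpur", "2015-16", 1),
    ("Jodhpur", "2015-16", 1),
    ("Jodhpur", "2015-16", 0),
    ("Thanjavur", "2015-16", 0),
    ("Thanjavur", "2015-16", 0),
    ("Thanjavur", "2015-16", 1),
    ("Thanjavur", "2015-16", 0),
]

seasonal_labels = seasonal_majority_vote(per_date_predictions)
print(seasonal_labels)
```

Because isolated misclassified acquisitions are outvoted by the rest of the season, the seasonal label can be correct even when some individual dates are not, which is consistent with the accuracy gain reported for seasonal aggregation.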

In addition to the accuracy assessment, we conducted a detailed feature importance analysis using SHAP values, which assign an interpretable impact score to each feature. To consolidate rankings across models, we used a consensus-based Borda count method and a magnitude-based Weighted Sum of SHAP values. According to the Borda count ranking, the five most influential features were NMDI, DOS, NDWI, RDI, and NDMI. In contrast, the Weighted Sum of SHAP values identified NMDI, NDWI, DOS, NDMI, and RECI as the top contributors. Feature importance analysis consistently highlighted NMDI and DOS as the most influential features. This finding aligns with the ecological and agronomic relevance of such indices, and it showcases the complementary roles of spectral indices and temporal information in capturing drought-related signals. These features exhibit such strong predictive power that their use alone yields performance nearly equivalent to that obtained with the full feature set. As drought occurrences often lead to imbalanced datasets, data balancing becomes necessary. Such challenges were addressed using different oversampling techniques such as SMOTE, BSMOTE, and ADASYN, which revealed a complex, nonlinear set of boundaries between drought and non-drought conditions, suggesting prioritizing machine learning over classical approaches.

Findings from our work provide useful clues on how to assess drought conditions in the Rabi season across diverse regions of India, leveraging multispectral optical spaceborne data. While these results are encouraging, it is important to acknowledge practical limitations. Such an approach depends on the availability of high-quality, cloud-free optical imagery and curated ground truth datasets, neither of which is always available across all regions and seasons. Hence, future work will move towards multi-modal analysis that integrates optical indices with complementary data streams such as Sentinel-1 SAR images, soil moisture datasets, and climate indices. In addition, incorporating deep learning approaches for spatio-temporal modeling may further enhance robustness.

Author Contributions

Conceptualization, J.M., S.S.S. and F.D.; methodology, J.M., S.S.S. and F.D.; software, J.M. and S.S.S.; validation, J.M. and S.S.S.; formal analysis, J.M. and S.S.S.; investigation, J.M. and S.S.S. and F.D.; data curation, J.M. and S.S.S.; writing—original draft preparation, J.M. and S.S.S.; writing—review & editing, F.D.; visualization, J.M. and S.S.S.; supervision, F.D.; project administration, F.D.; funding acquisition, F.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the “Nord Ovest Digitale e Sostenibile (NODES)” project, which has been granted funding through the MUR—M4C2 1.5 of PNRR, under the European Union’s NextGenerationEU initiative (Grant agreement No. ECS00000036).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Correction Statement

This article has been republished with a minor correction to the Data Availability Statement. This change does not affect the scientific content of the article.

List of Abbreviations

Abbreviation	Full Form
NIR	Near Infrared
SWIR	Shortwave Infrared
DOS	Day of Season
NDVI	Normalized Difference Vegetation Index
NDWI	Normalized Difference Water Index
EVI	Enhanced Vegetation Index
ARVI	Atmospherically Resistant Vegetation Index
SAVI	Soil-Adjusted Vegetation Index
TVI	Transformed Vegetative Index
NDMI	Normalized Difference Moisture Index
NMDI	Normalized Multi-band Drought Index
MNDWI	Modified Normalized Difference Water Index
MNDVI	Modified Normalized Difference Vegetation Index
RDI	Ratio Drought Index
RECI	Red-edge Chlorophyll Index
SAR	Synthetic Aperture Radar
RF	Random Forest
GB	Gradient Boosting
XGB	Extreme Gradient Boosting
BGN	Bagging Classifier
SHAP	SHapley Additive exPlanations
SMOTE	Synthetic Minority Oversampling Technique
ADASYN	Adaptive Synthetic Sampling
BSMOTE	Borderline Synthetic Minority Oversampling Technique
GEE	Google Earth Engine
IMD	India Meteorological Department
CRIDA	Central Research Institute for Dryland Agriculture
NRSC	National Remote Sensing Centre
ROC	Receiver Operating Characteristic
AUC	Area Under the Curve

References

Urban, M.; Berger, C.; Mudau, T.E.; Heckel, K.; Truckenbrodt, J.; Onyango Odipo, V.; Smit, I.P.; Schmullius, C. Surface moisture and vegetation cover analysis for drought monitoring in the southern Kruger National Park using Sentinel-1, Sentinel-2, and Landsat-8. Remote Sens. 2018, 10, 1482.
Aadhar, S.; Mishra, V. Challenges in drought monitoring and assessment in India. Water Secur. 2022, 16, 100120.
Mishra, V.; Thirumalai, K.; Jain, S.; Aadhar, S. Unprecedented drought in South India and recent water scarcity. Environ. Res. Lett. 2021, 16, 054007.
Gimeno-Sotelo, L.; Sorí, R.; Nieto, R.; Vicente-Serrano, S.M.; Gimeno, L. Unravelling the origin of the atmospheric moisture deficit that leads to droughts. Nat. Water 2024, 2, 242–253.
AghaKouchak, A.; Farahmand, A.; Melton, F.S.; Teixeira, J.; Anderson, M.C.; Wardlow, B.D.; Hain, C.R. Remote sensing of drought: Progress, challenges and opportunities. Rev. Geophys. 2015, 53, 452–480.
Zhang, Y.; Hao, Z.; Feng, S.; Zhang, X.; Xu, Y.; Hao, F. Agricultural drought prediction in China based on drought propagation and large-scale drivers. Agric. Water Manag. 2021, 255, 107028.
Satapathy, T.; Dietrich, J.; Ramadas, M. Agricultural drought monitoring and early warning at the regional scale using a remote sensing-based combined index. Environ. Monit. Assess. 2024, 196, 1132.
Bhatt, S.C.; Singh, V.K.; Singh, M. Perspectives of drought in bundelkhand, central India; Causes, effects, and mitigation: A review. In Geospatial Technologies for Integrated Water Resources Management: Mapping, Modelling, and Decision-Making; Springer: Cham, Switzerland, 2024; pp. 103–114.
Pachore, A.B.; Remesan, R.; Kumar, R. Multifractal characterization of meteorological to agricultural drought propagation over India. Sci. Rep. 2024, 14, 18889.
Gupta, A.; Jain, M.K.; Pandey, R.P. The changing characteristics of propagation time from meteorological drought to hydrological drought in a semi-arid river basin in India. Hydrol. Processes 2024, 38, e15266.
Pandey, V.; Srivastava, P.K.; Singh, A.K.; Suman, S.; Maurya, S. Techniques and tools for monitoring agriculture drought: A review. In Geographical Information Science; Elsevier: Amsterdam, The Netherlands, 2024; pp. 497–519.
Senapati, U.; Das, T.K. Geospatial assessment of agricultural drought vulnerability using integrated three-dimensional model in the upper Dwarakeshwar river basin in West Bengal, India. Environ. Sci. Pollut. Res. 2024, 31, 54061–54088.
Kumaraperumal, R.; Pazhanivelan, S.; Ragunath, K.; Kannan, B.; Prajesh, P.; Mugilan, G. Agricultural drought monitoring in Tamil Nadu in India using Satellite-based multi vegetation indices. J. Appl. Nat. Sci. 2021, 13, 414–423.
Nguyen, N.; Nguyen, B. Potential of drought monitoring using Sentinel-2 data. In Proceedings of the Conference: GeoInformatics for Spatial-Infrastructure Development in Earth and Allied Sciences (GISIDEAS), Hanoi, Vietnam, 12–15 November 2016.
Behifar, M.; Kakroodi, A.A.; Kiavarz, M.; Azizi, G. Satellite-based drought monitoring using optimal indices for diverse climates and land types. Ecol. Inform. 2023, 76, 102143.
Sorooshian, S.; AghaKouchak, A.; Arkin, P.; Eylander, J.; Foufoula-Georgiou, E.; Harmon, R.; Hendrickx, J.M.; Imam, B.; Kuligowski, R.; Skahill, B.; et al. Advanced concepts on remote sensing of precipitation at multiple scales. Bull. Am. Meteorol. Soc. 2011, 92, 1353–1357.
Anderson, M.C.; Hain, C.; Wardlow, B.; Pimstein, A.; Mecikalski, J.R.; Kustas, W.P. Evaluation of drought indices based on thermal remote sensing of evapotranspiration over the continental United States. J. Clim. 2011, 24, 2025–2044.
Heim, R.R., Jr. A review of twentieth-century drought indices used in the United States. Bull. Am. Meteorol. Soc. 2002, 83, 1149–1166.
Xie, F.; Fan, H. Deriving drought indices from MODIS vegetation indices (NDVI/EVI) and Land Surface Temperature (LST): Is data reconstruction necessary? Int. J. Appl. Earth Obs. Geoinf. 2021, 101, 102352.
Gu, Y.; Brown, J.F.; Verdin, J.P.; Wardlow, B. A five-year analysis of MODIS NDVI and NDWI for grassland drought assessment over the central Great Plains of the United States. Geophys. Res. Lett. 2007, 34.
Nam, W.H.; Tadesse, T.; Wardlow, B.D.; Hayes, M.J.; Svoboda, M.D.; Hong, E.M.; Pachepsky, Y.A.; Jang, M.W. Developing the vegetation drought response index for South Korea (VegDRI-SKorea) to assess the vegetation condition during drought events. Int. J. Remote Sens. 2018, 39, 1548–1574.
Parkash, V.; Singh, S. A review on potential plant-based water stress indicators for vegetable crops. Sustainability 2020, 12, 3945.
Martínez-Fernández, J.; González-Zamora, A.; Sánchez, N.; Gumuzzio, A.; Herrero-Jiménez, C. Satellite soil moisture for agricultural drought monitoring: Assessment of the SMOS derived Soil Water Deficit Index. Remote Sens. Environ. 2016, 177, 277–286.
Anderson, M.C.; Zolin, C.A.; Sentelhas, P.C.; Hain, C.R.; Semmens, K.; Yilmaz, M.T.; Gao, F.; Otkin, J.A.; Tetrault, R. The Evaporative Stress Index as an indicator of agricultural drought in Brazil: An assessment based on crop yield impacts. Remote Sens. Environ. 2016, 174, 82–99.
Mu, Q.; Zhao, M.; Kimball, J.S.; McDowell, N.G.; Running, S.W. A remotely sensed global terrestrial drought severity index. Bull. Am. Meteorol. Soc. 2013, 94, 83–98.
Dilip, T.; Kumari, M.; Murthy, C.; Neelima, T.; Chakraborty, A.; Devi, M.U. Monitoring early-season agricultural drought using temporal Sentinel-1 SAR-based combined drought index. Environ. Monit. Assess. 2023, 195, 925.
Volden, E. New Capabilities in Earth Observation for Agriculture; European Space Agency: Budapest, Hungary, 2017. [Google Scholar]
Varghese, D.; Radulović, M.; Stojković, S.; Crnojević, V. Reviewing the potential of Sentinel-2 in assessing the drought. Remote Sens. 2021, 13, 3355. [Google Scholar]
Wang, Q.; Blackburn, G.A.; Onojeghuo, A.O.; Dash, J.; Zhou, L.; Zhang, Y.; Atkinson, P.M. Fusion of Landsat 8 OLI and Sentinel-2 MSI data. IEEE Trans. Geosci. Remote Sens. 2017, 55, 3885–3899. [Google Scholar]
Thanh Noi, P.; Kappas, M. Comparison of random forest, k-nearest neighbor, and support vector machine classifiers for land cover classification using Sentinel-2 imagery. Sensors 2017, 18, 18. [Google Scholar] [CrossRef]
Ferrant, S.; Selles, A.; Le Page, M.; Ahmad, A.B.; Mermoz, S.; Gascoin, S.; Bouvet, A.; Ahmed, S.; Kerr, Y.H. Sentinel-1&2 for near real time cropping pattern monitoring in drought prone areas. application to irrigation water needs in telangana, south-india. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2019, 42, 285–292. [Google Scholar]
Zhang, L.; Liu, Y.; Ren, L.; Teuling, A.J.; Zhu, Y.; Wei, L.; Zhang, L.; Jiang, S.; Yang, X.; Fang, X.; et al. Analysis of flash droughts in China using machine learning. Hydrol. Earth Syst. Sci. 2022, 26, 3241–3261. [Google Scholar] [CrossRef]
Gupta, A.; Kaur, L.; Kaur, G. Drought stress detection technique for wheat crop using machine learning. PeerJ Comput. Sci. 2023, 9, e1268. [Google Scholar] [CrossRef] [PubMed]
Mokhtar, A.; Jalali, M.; He, H.; Al-Ansari, N.; Elbeltagi, A.; Alsafadi, K.; Abdo, H.G.; Sammen, S.S.; Gyasi-Agyei, Y.; Rodrigo-Comino, J. Estimation of SPEI meteorological drought using machine learning algorithms. IEEE Access 2021, 9, 65503–65523. [Google Scholar] [CrossRef]
Sriram, K.; Suresh, K. Machine learning perspective for predicting agricultural droughts using Naïve Bayes algorithm. Middle-East J. Sci. Res. 2016, 24, 178–184. [Google Scholar]
Lee, C.S.; Sohn, E.; Park, J.D.; Jang, J.D. Estimation of soil moisture using deep learning based on satellite data: A case study of South Korea. GIScience Remote Sens. 2019, 56, 43–67. [Google Scholar] [CrossRef]
Feng, P.; Wang, B.; Li Liu, D.; Yu, Q. Machine learning-based integration of remotely-sensed drought factors can improve the estimation of agricultural drought in South-Eastern Australia. Agric. Syst. 2019, 173, 303–316. [Google Scholar] [CrossRef]
Prodhan, F.A.; Zhang, J.; Hasan, S.S.; Sharma, T.P.P.; Mohana, H.P. A review of machine learning methods for drought hazard monitoring and forecasting: Current research trends, challenges, and future research directions. Environ. Model. Softw. 2022, 149, 105327. [Google Scholar] [CrossRef]
Bowen, D.; Ungar, L. Generalized SHAP: Generating multiple types of explanations in machine learning. arXiv 2020, arXiv:2006.07155. [Google Scholar] [CrossRef]
Saari, D.G. Selecting a voting method: The case for the Borda count. Const. Political Econ. 2023, 34, 357–366. [Google Scholar] [CrossRef]
West, H.; Quinn, N.; Horswell, M.; White, P. Assessing vegetation response to soil moisture fluctuation under extreme drought using sentinel-2. Water 2018, 10, 838. [Google Scholar] [CrossRef]
Huete, A.; Justice, C.; Van Leeuwen, W. MODIS vegetation index (MOD13). Algorithm Theor. Basis Doc. 1999, 3, 295–309. [Google Scholar]
Jopia, A.; Zambrano, F.; Pérez-Martínez, W.; Vidal-Páez, P.; Molina, J.; De la Hoz Mardones, F. Time-series of vegetation indices (VNIR/SWIR) derived from Sentinel-2 (A/B) to assess turgor pressure in kiwifruit. ISPRS Int. J. Geo-Inf. 2020, 9, 641. [Google Scholar] [CrossRef]
Jiang, Z.; Huete, A.R.; Didan, K.; Miura, T. Development of a two-band enhanced vegetation index without a blue band. Remote Sens. Environ. 2008, 112, 3833–3845. [Google Scholar] [CrossRef]
Sentinel Hub. Atmospherically Resistant Vegetation Index (ARVI). 2024. Available online: https://custom-scripts.sentinel-hub.com/sentinel-2/arvi/ (accessed on 27 January 2024).
Marshall, G.; Zhou, X. Drought detection in semi-arid regions using remote sensing of vegetation indices and drought indices. In Proceedings of the IGARSS 2004, 2004 IEEE International Geoscience and Remote Sensing Symposium, Anchorage, AK, USA, 20–24 September 2004; Volume 3, pp. 1555–1558. [Google Scholar]
Kaufman, Y.J.; Tanre, D. Atmospherically resistant vegetation index (ARVI) for EOS-MODIS. IEEE Trans. Geosci. Remote Sens. 1992, 30, 261–270. [Google Scholar] [CrossRef]
Ahmed, T.; Javed, N.; Faisal, M.; Sadia, H. A framework for smart agriculture system to monitor the crop stress and drought stress using sentinel-2 satellite image. In Proceedings of the 3rd International Conference on Artificial Intelligence: Advances and Applications: ICAIAA 2022, Jaipur, India, 23–24 April 2022; Springer: Berlin/Heidelberg, Germany, 2023; pp. 345–361. [Google Scholar]
McFeeters, S.K. The use of the Normalized Difference Water Index (NDWI) in the delineation of open water features. Int. J. Remote Sens. 1996, 17, 1425–1432. [Google Scholar] [CrossRef]
Huete, A.R. A soil-adjusted vegetation index (SAVI). Remote Sens. Environ. 1988, 25, 295–309. [Google Scholar] [CrossRef]
Sun, C.; Li, J.; Cao, L.; Liu, Y.; Jin, S.; Zhao, B. Evaluation of vegetation index-based curve fitting models for accurate classification of salt marsh vegetation using sentinel-2 time-series. Sensors 2020, 20, 5551. [Google Scholar] [CrossRef] [PubMed]
Nellis, M.D.; Briggs, J.M. Transformed vegetation index for measuring spatial variation in drought impacted biomass on Konza Prairie, Kansas. In Transactions of the Kansas Academy of Science (1903); Kansas Academy of Science: Lawrence, KS, USA, 1992; pp. 93–99. [Google Scholar]
Gu, Z.; Zeng, Z.; Shi, X.; Yu, D.; Zheng, W.; Zhang, Z.; Hu, Z. Estimating models of vegetation fractional coverage based on remote sensing images at different radiometric correction levels. Front. For. China 2009, 4, 402–408. [Google Scholar] [CrossRef]
Strashok, O.; Ziemiańska, M.; Strashok, V. Evaluation and Correlation of Sentinel-2 NDVI and NDMI in Kyiv (2017–2021). J. Ecol. Eng. 2022, 23. [Google Scholar] [CrossRef] [PubMed]
Sentinel Hub. Normalized Difference Moisture Index (NDMI). 2024. Available online: https://custom-scripts.sentinel-hub.com/sentinel-2/ndmi/ (accessed on 27 January 2024).
Wang, L.; Qu, J.J. NMDI: A normalized multi-band drought index for monitoring soil and vegetation moisture with satellite remote sensing. Geophys. Res. Lett. 2007, 34. [Google Scholar] [CrossRef]
Zhang, H.w.; Chen, H.l. The Application of Modified Normalized Difference Water Index (MNDWI) by Leaf Area Index in the Retrieval of Regional Drought Monitoring. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2015, 40, 141–147. [Google Scholar] [CrossRef]
Xu, H. Modification of normalised difference water index (NDWI) to enhance open water features in remotely sensed imagery. Int. J. Remote Sens. 2006, 27, 3025–3033. [Google Scholar] [CrossRef]
Jurgens, C. The modified normalized difference vegetation index (mNDVI) a new index to determine frost damages in agriculture based on Landsat TM data. Int. J. Remote Sens. 1997, 18, 3583–3594. [Google Scholar] [CrossRef]
Dong, Z.; Wang, L.; Gao, M.; Zhu, X.; Feng, W.; Li, N. Ratio Drought Index (RDI): A soil moisture index based on new NIR-red triangle space. Int. J. Remote Sens. 2024, 45, 6976–6989. [Google Scholar] [CrossRef]
EOS Data Analytics. Chlorophyll Index: Overview, Calculation, and Application. 2024. Available online: https://eos.com/make-an-analysis/chlorophyll-index/ (accessed on 27 January 2024).
Sentinel Hub. Red-edge Chlorophyll Index (RECI). 2024. Available online: https://custom-scripts.sentinel-hub.com/custom-scripts/sentinel-2/chl_rededge/ (accessed on 27 January 2024).
Patil, P.P.; Jagtap, M.P.; Khatri, N.; Madan, H.; Vadduri, A.A.; Patodia, T. Exploration and advancement of NDDI leveraging NDVI and NDWI in Indian semi-arid regions: A remote sensing-based study. Case Stud. Chem. Environ. Eng. 2024, 9, 100573. [Google Scholar] [CrossRef]
Shibani, N.; Pandey, A.; Satyam, V.K.; Bhari, J.S.; Karimi, B.A.; Gupta, S.K. Study on the variation of NDVI, SAVI and EVI indices in Punjab State, India. In IOP Conference Series: Earth and Environmental Science; IOP Publishing: Bristol, UK, 2023; Volume 1110, p. 012070. [Google Scholar]
Ali, S.; Tong, D.; Xu, Z.T.; Henchiri, M.; Wilson, K.; Siqi, S.; Zhang, J. Characterization of drought monitoring events through MODIS-and TRMM-based DSI and TVDI over South Asia during 2001–2017. Environ. Sci. Pollut. Res. 2019, 26, 33568–33581. [Google Scholar] [CrossRef]
Jafarzadeh, H.; Mahdianpari, M.; Gill, E.; Mohammadimanesh, F.; Homayouni, S. Bagging and boosting ensemble classifiers for classification of multispectral, hyperspectral and PolSAR data: A comparative evaluation. Remote Sens. 2021, 13, 4405. [Google Scholar] [CrossRef]
Shao, Z.; Ahmad, M.N.; Javed, A. Comparison of Random Forest and XGBoost Classifiers Using Integrated Optical and SAR Features for Mapping Urban Impervious Surface. Remote Sens. 2024, 16, 665. [Google Scholar] [CrossRef]
Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
Friedman, J.H. Greedy function approximation: A gradient boosting machine. Ann. Stat. 2001, 29, 1189–1232. [Google Scholar] [CrossRef]
Chen, T.; Guestrin, C. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd ACM Sigkdd International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794. [Google Scholar]
Breiman, L. Bagging predictors. Mach. Learn. 1996, 24, 123–140. [Google Scholar] [CrossRef]
Shapley, L.S. A value for n-person games. In Contribution to the Theory of Games; Princeton University Press: Princeton, NJ, USA, 1953; Volume 2. [Google Scholar]
Lundberg, S.M.; Lee, S.I. A unified approach to interpreting model predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; NIPS’17. pp. 4768–4777. [Google Scholar]
Lundberg, S.M.; Erion, G.; Chen, H.; DeGrave, A.; Prutkin, J.M.; Nair, B.; Katz, R.; Himmelfarb, J.; Bansal, N.; Lee, S.I. From local explanations to global understanding with explainable AI for trees. Nat. Mach. Intell. 2020, 2, 56–67. [Google Scholar] [CrossRef] [PubMed]
Pacuit, E. Voting Methods. 2011. Available online: https://plato.stanford.edu/eNtRIeS/voting-methods/ (accessed on 1 September 2025).
Churchman, C.W.; Ackoff, R.L. An approximate measure of value. J. Oper. Res. Soc. Am. 1954, 2, 172–187. [Google Scholar] [CrossRef]
Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic minority over-sampling technique. J. Artif. Intell. Res. 2002, 16, 321–357. [Google Scholar] [CrossRef]
Han, H.; Wang, W.Y.; Mao, B.H. Borderline-SMOTE: A new over-sampling method in imbalanced data sets learning. In Proceedings of the International Conference on Intelligent Computing, Hefei, China, 23–26 August 2005; Springer: Berlin/Heidelberg, Germany, 2005; pp. 878–887. [Google Scholar]
He, H.; Bai, Y.; Garcia, E.A.; Li, S. ADASYN: Adaptive synthetic sampling approach for imbalanced learning. In Proceedings of the 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on COMPUTATIONAL Intelligence), Hong Kong, 1–8 June 2008; pp. 1322–1328. [Google Scholar]
Krawczyk, B. Learning from imbalanced data: Open challenges and future directions. Prog. Artif. Intell. 2016, 5, 221–232. [Google Scholar] [CrossRef]
Kumari, N.; Kanungo, S.; Mukherjee, J. An Elementary Cellular Automata Based Two-Class Data Imbalance Problem: Initial Study and Observations. In Proceedings of the Asian Symposium on Cellular Automata Technology, Ranchi, India, 6–8 March 2025; Springer: Berlin/Heidelberg, Germany, 2025; pp. 57–68. [Google Scholar]
Food and Agriculture Organization of the United Nations. FAO Global Administrative Unit Layers (GAUL) 2015. Available online: https://www.fao.org/about/about-fao/en/ (accessed on 27 January 2024).
Copernicus. Copernicus Global Land Cover. Available online: https://land.copernicus.eu/global/lc (accessed on 27 January 2024).
ISRO. Drought Assessment Using Remote Sensing and GIS: Drought Manual 2020; Technical Report, National Remote Sensing Centre (NRSC); ISRO: Bengaluru, India, 2020.
ISRO. Agri-DSS Help Manual; Technical Report, National Remote Sensing Centre (NRSC); ISRO: Bengaluru, India, 2020.
Mishra, V.; Singh, M.; Ghosh, S. Monitoring Agricultural Drought in India Using Multisource Remote Sensing Indicators. Environ. Challenges 2021, 4, 100021. [Google Scholar]
The Hindu. 19 Districts in Rajasthan Drought Hit. 2016. Available online: https://www.thehindu.com/news/national/other-states/19-districts-in-rajasthan-droughthit/article8491809.ece (accessed on 7 February 2025).
Factly. 266 Districts in 11 Different States Declared Drought-Affected (2015-16). 2016. Available online: https://factly.in/266-districts-11-different-states-drought-affected-2015-16/ (accessed on 7 February 2025).
Parliament of India. Drought Fund Allocation Status (2017). 2017. Available online: https://sansad.in/getFile/loksabhaquestions/annex/14/AU4057.pdf?source=pqals (accessed on 7 February 2025).
Firstpost Editor. Drought in Rajasthan: Over Rs 7000 Crore Spent on Projects But Not Much Water Has Flown Through Western Region. 2019. Available online: https://www.firstpost.com/india/drought-in-rajasthan-over-rs-7000-crore-spent-on-projects-but-not-much-water-has-flown-through-western-region-6331911.html (accessed on 7 February 2025).
ANI. Rajasthan Govt Declares 5555 Villages as Drought-Affected. 2019. Available online: https://www.aninews.in/news/national/general-news/rajasthan-govt-declares-5555-villages-as-drought-affected20190306224744/ (accessed on 7 February 2025).
The Statesman. 1388 Villages in Rajasthan Declared Drought-Affected by State Govt. 2019. Available online: https://www.thestatesman.com/india/1388-villages-in-rajasthan-declared-drought-affected-by-state-govt-1502820817.html (accessed on 7 February 2025).
NDTV. More Than 1000 Villages in 4 Districts of Rajasthan Affected by Drought. 2019. Available online: https://www.ndtv.com/india-news/more-than-1-000-villages-in-4-districts-of-rajasthan-affected-by-draught-2130998 (accessed on 7 February 2025).
Government of India. Annexure to Lok Sabha Question AU454: Central Assistance for Drought. 2020. Available online: https://sansad.in/getFile/loksabhaquestions/annex/177/AU454.pdf?source=pqals (accessed on 7 February 2025).
State Level Bankers’ Committee, Rajasthan. Agenda Notes for the 152nd SLBC Meeting. 2025. Available online: https://slbcrajasthan.in/uploads/CEDocuments/Agenda152.pdf (accessed on 3 August 2025).
Disaster Management & Relief Department, Government of Rajasthan. Notification Regarding Drought Declaration (Dated: 29/10/2021). 2021. Available online: https://dmrelief.rajasthan.gov.in/documents/Notification_Drought_29102021.pdf (accessed on 3 August 2025).
Disaster Management & Relief Department, Government of Rajasthan. Notification Regarding Drought Declaration for Pali District (2022). 2022. Available online: https://dmrelief.rajasthan.gov.in/documents/Notifidroughtpali2022.pdf (accessed on 3 August 2025).
Disaster Management & Relief Department, Government of Rajasthan. Directions Regarding Operation of Fodder Depots (Dated: 28/05/2024). 2024. Available online: https://dmrelief.rajasthan.gov.in/documents/fodder_depot_direction_28052024.pdf (accessed on 3 August 2025).
Times of India. Funds Okayed for Transporting Water to 13 Drought-Hit Districts. 2024. Available online: https://timesofindia.indiatimes.com/city/jaipur/funds-okayed-for-transporting-water-to-13-drought-hit-districts/articleshow/108427789.cms (accessed on 3 August 2025).
Economic Times. Maharashtra Government Declares Drought in 29,000 Villages. 2016. Available online: https://economictimes.indiatimes.com/news/politics-and-nation/maharashtra-government-declares-drought-in-29000-villages/articleshow/52238372.cms?from=mdr (accessed on 7 February 2025).
Hindustan Times. Maharashtra Declares Drought; 26 Districts Hit. 2018. Available online: https://www.hindustantimes.com/mumbai-news/maharashtra-declares-drought-26-districts-hit/story-ETaPfo9owb7yVW8EQ1lQGL.html (accessed on 7 February 2025).
Times of India. Eight of 11 Vidarbha Districts Declared Drought-Hit. 2018. Available online: https://timesofindia.indiatimes.com/city/nagpur/8-of-11-vid-dists-declared-drought-hitryots-say-need-more-sops-to-tackle-crisis/articleshow/66595394.cms (accessed on 7 February 2025).
Times of India. Drought Brings down Rabi Crop Area by 40% in 2018-19. 2019. Available online: https://timesofindia.indiatimes.com/city/pune/drought-brings-down-rabi-crop-area-by-40-in-2018-19/articleshow/67949533.cms (accessed on 7 February 2025).
The News Dirt. Drought-Stricken Vidarbha Faces Falling Dam Levels and Water Supply Gaps. 2024. Available online: https://www.thenewsdirt.com/post/drought-stricken-vidarbha-faces-falling-dam-levels-and-water-supply-gaps (accessed on 3 August 2025).
Hindustan Times. Drought in 66% of Maharashtra as State Includes 224 More Revenue Circles in the List. 2024. Available online: https://www.hindustantimes.com/cities/mumbai-news/drought-in-66-of-maharashtra-as-state-includes-more-224-revenue-circles-in-the-list-101708281986009.html (accessed on 3 August 2025).
Agrowon. Drought-Like Condition Declared in 78 Mandals of Amravati. 2024. Available online: https://agrowon.esakal.com/agro-special/drought-like-condition-declared-in-78-mandals-of-amravati-2 (accessed on 3 August 2025).
Agrowon. Drought Sealed in Amravati District. 2024. Available online: https://agrowon.esakal.com/agro-special/drought-sealed-in-amravati-district (accessed on 3 August 2025).
Agrowon. Forget the Government to Announce Drought Relief. 2024. Available online: https://agrowon.esakal.com/agro-special/forget-the-government-to-announce-drought-relief?utm_source=chatgpt.com (accessed on 3 August 2025).
The Hindu. Rain Causes Immense Damage to Huts and Paddy Fields. 2015. Available online: https://www.thehindu.com/news/national/tamil-nadu/rain-causes-immense-damage-to-huts-and-paddy-fields/article7882286.ece (accessed on 7 February 2025).
Drought-Affected States Report AU981. 2016. Available online: https://sansad.in/getFile/loksabhaquestions/annex/15/AU981.pdf?source=pqals (accessed on 7 February 2025).
Moneylife. Retreating Monsoon Worst in 140 Years, TN Declares Drought as 144 Farmers Die. 2017. Available online: https://www.moneylife.in/article/retreating-monsoon-worst-in-140-years-tn-declares-drought-as-144-farmers-die/49433.html (accessed on 7 February 2025).
Tamil Nadu Agricultural Department. Government Order on Drought Declaration. 2017. Available online: https://www.tnagrisnet.tn.gov.in/fcms/documents/go/20-GO.No.29-2(2).pdf (accessed on 7 February 2025).
The Hindu. Heavy Rain in Tiruvarur and Thanjavur Districts. 2017. Available online: https://www.thehindu.com/news/cities/Tiruchirapalli/heavy-rain-in-tiruvarur-and-thanjavur-districts/article19991157.ece (accessed on 7 February 2025).
New Indian Express. 24 Districts Declared as Drought-Hit; Number to Rise in Coming Months. 2019. Available online: https://www.newindianexpress.com/states/tamil-nadu/2019/Mar/21/24-districts-declared-as-drought-hit-number-to-rise-in-coming-months-1953962.html (accessed on 7 February 2025).
News Bricks. Tamil Nadu Weather Forecast 12 December 2019. 2019. Available online: https://www.newsbricks.com/tamil-nadu/tamil-nadu-weather-forecast-december-12-2019/67258 (accessed on 7 February 2025).
Weather.com. Northeast Monsoon to Commence Over South India From 28 October 2020. 2020. Available online: https://weather.com/en-IN/india/news/news/2020-10-27-northeast-monsoon-commence-over-south-india-from-october-28 (accessed on 7 February 2025).
Times of India. Disasters That Struck India in 2020. 2020. Available online: https://timesofindia.indiatimes.com/india/disasters-that-struck-india-in-2020/articleshow/79954339.cms (accessed on 7 February 2025).
Mongabay. Though Cyclone Nivar Had a Soft Landing, Floods Hit Coastal Districts. 2020. Available online: https://india.mongabay.com/2020/12/though-cyclone-nivar-had-a-soft-landing-floods-hit-coastal-districts/ (accessed on 7 February 2025).
Government of Tamil Nadu. G.O. Ms. No. 111: Government Order Regarding Drought or Relief Measures. 2024. Available online: https://www.slbctn.com/Uploads/G.O.Ms.No.111.pdf?utm_source (accessed on 3 August 2025).
Government of Tamil Nadu. Tamil Nadu Government Gazette (Extraordinary), Part II—Section 1, Issue No. 105, Dated 13 March 2024. 2024. Available online: https://archive.org/details/in.gazette.tamilnadu.2024-03-13.Extraordinary_105_Part-II_Section-1/mode/2up (accessed on 3 August 2025).
The New Indian Express. Paddy on 26,508 Acres in Thanjavur Damaged in Northeast Monsoon Rains, Finds Survey. 2024. Available online: https://www.newindianexpress.com/states/tamil-nadu/2024/Dec/19/paddy-on-26508-acres-in-thanjavur-damaged-in-northeast-monsoon-rains-finds-survey?utm_source=chatgpt.com (accessed on 3 August 2025).
The New Indian Express. Crops in 13,749 Hectares in Tamil Nadu Inundated Due to Rains: Agriculture Minister Panneerselvam. 2024. Available online: https://www.newindianexpress.com/states/tamil-nadu/2024/Nov/28/crops-in-13749-hectares-in-tamil-nadu-inundated-due-to-rains-agriculture-minister-panneerselvam?utm_source=chatgpt.com (accessed on 3 August 2025).
Lobell, D.B.; Asner, G.P. Moisture effects on soil reflectance. Soil Sci. Soc. Am. J. 2002, 66, 722–727. [Google Scholar] [CrossRef]

Figure 1. Location of the ground truth districts.

Figure 2. Drought declaration workflow in India.

Figure 3. Temporal distribution of data for Jodhpur district for Rabi season 2022 (1 October 2021–30 April 2022).

Figure 4. Data acquisition and preprocessing workflow with Google Earth Engine. This shows the processing of raw Sentinel-2 satellite images into a machine-learning-ready dataset of spectral indices, focused on agricultural areas.

Figure 5. End-to-end machine learning pipeline, from splitting the data and handling class imbalance to model training, evaluation, and interpreting the results.

Figure 6. Comparison of model performance before applying any oversampling. XGBoost demonstrates the highest ability to distinguish between drought and non-drought conditions (AUC = 0.9192).

Figure 7. Model performance after applying SMOTE oversampling. While recall improves, the overall discriminative power on the imbalanced test set decreases compared to Figure 6, indicating potential overfitting to synthetic samples.

Figure 8. Model performance after applying Borderline SMOTE. A recovery in AUC values is observed compared to SMOTE, suggesting a better balance between detecting drought events and minimizing false alarms.

Figure 9. Model performance after applying ADASYN oversampling. XGBoost and Bagging achieve the highest AUC values here, indicating this method effectively improves model sensitivity to the minority class without substantial loss of overall performance.

Figure 10. SHAP beeswarm plots comparing feature impacts across four models before oversampling. While XGBoost, Random Forest, and Bagging strongly emphasize NMDI and DOS, Gradient Boosting reveals weaker and less consistent dependencies, aligning with its lower predictive performance. (a) XGBoost: Feature impact on predictions. NMDI shows the strongest influence. (b) Random Forest: Predictions driven mainly by NMDI and DOS. (c) Bagging: Similar reliance on NMDI and DOS for classification. (d) Gradient Boosting: More complex and weaker feature relationships.

Figure 11. SHAP beeswarm plots comparing feature impacts across four models with SMOTE oversampling. The dominance of NMDI and DOS is consistent across XGBoost, Random Forest, and Bagging, while Gradient Boosting continues to show weaker and more diffuse feature relationships. (a) XGBoost: NMDI remains the strongest predictor after SMOTE oversampling. (b) Random Forest: NMDI and DOS continue to dominate, with clearer separation. (c) Bagging: Similar reliance on NMDI and DOS, consistent with pre-SMOTE trends. (d) Gradient Boosting: Feature effects remain weaker and more dispersed.

Figure 12. SHAP beeswarm plots comparing feature impacts across four models with Borderline SMOTE oversampling. NMDI and DOS consistently dominate XGBoost, Random Forest, and Bagging predictions, while Gradient Boosting shows weaker and less structured dependencies. (a) XGBoost: NMDI remains the dominant feature, with high values linked to drought. (b) Random Forest: NMDI and DOS continue to drive predictions under Borderline SMOTE. (c) Bagging: Consistent emphasis on NMDI and DOS, mirroring pre-oversampling behavior. (d) Gradient Boosting: Feature influences remain weaker and more scattered.

Figure 13. SHAP beeswarm plots comparing feature impacts across four models with ADASYN oversampling. XGBoost highlights NMDI, NDWI, and RECI, while Random Forest and Bagging emphasize NMDI and DOS. Gradient Boosting continues to show less structured dependencies. (a) XGBoost: NMDI, NDWI, and RECI emerge as the top predictors under ADASYN. (b) Random Forest: Predictions remain dominated by NMDI and DOS after ADASYN. (c) Bagging: Consistent reliance on NMDI and DOS, with patterns stable post-ADASYN. (d) Gradient Boosting: Feature contributions remain weaker and more dispersed.

Figure 14. Model aggregation for the 5 most relevant features before oversampling. Aggregation of results across all models via (a) Weighted Sum and (b) Borda Count methods confirms NMDI, DOS, and NDWI as the most critical features.

Figure 15. Model aggregation for the 5 most relevant features with SMOTE. The top features remain consistent, though their order changes slightly between the two aggregation methods.

Figure 16. Model aggregation for the 5 most relevant features with Borderline SMOTE.

Figure 17. Model aggregation for the 5 most relevant features with ADASYN. A shift in the top features is observed, with NMDI, NDWI, and RDI gaining prominence in the aggregated ranking; DOS remains in the top 5.

Figure 18. Comparison of the average F1-score for each oversampling method across models. ADASYN shows consistently high F1-scores across all classifiers.
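
The two aggregation schemes named in Figures 14–17 (Weighted Sum and Borda Count) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the per-model importance values below are hypothetical placeholders, and weighting each model by its test accuracy from Table 3 is an assumption made here purely for illustration.

```python
from collections import defaultdict

# Hypothetical per-model feature importances (e.g., mean |SHAP| values).
# These numbers are illustrative only and are NOT taken from the paper.
importances = {
    "XGBoost": {"NMDI": 0.30, "DOS": 0.22, "NDWI": 0.15, "RECI": 0.10, "EVI": 0.08},
    "Random Forest": {"NMDI": 0.28, "DOS": 0.25, "NDWI": 0.12, "EVI": 0.11, "RECI": 0.09},
    "Bagging": {"NMDI": 0.29, "DOS": 0.24, "NDWI": 0.13, "RECI": 0.10, "EVI": 0.09},
    "Gradient Boosting": {"DOS": 0.20, "NMDI": 0.18, "RECI": 0.14, "NDWI": 0.12, "EVI": 0.10},
}

# Assumed model weights: test accuracies from Table 3 (the choice of weight
# is an assumption of this sketch).
weights = {
    "XGBoost": 0.8480,
    "Random Forest": 0.8298,
    "Bagging": 0.8398,
    "Gradient Boosting": 0.7424,
}


def weighted_sum(importances, weights):
    """Sum each feature's normalized importance, weighted per model."""
    scores = defaultdict(float)
    for model, feats in importances.items():
        total = sum(feats.values())
        for feature, value in feats.items():
            scores[feature] += weights[model] * value / total
    return dict(scores)


def borda_count(importances):
    """Give n-1 points to a model's top feature, n-2 to the next, and so on."""
    scores = defaultdict(int)
    for feats in importances.values():
        ranked = sorted(feats, key=feats.get, reverse=True)
        n = len(ranked)
        for position, feature in enumerate(ranked):
            scores[feature] += n - 1 - position
    return dict(scores)


def top_k(scores, k=5):
    """Return the k highest-scoring features, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


print("Weighted sum:", top_k(weighted_sum(importances, weights)))
print("Borda count: ", top_k(borda_count(importances)))
```

Borda Count uses only each model's ranking, so it is insensitive to differences in importance scale between models, whereas the weighted sum keeps the magnitudes and lets better-performing models count for more.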

Table 1. Part of the exported data set from time-series data of Amravati in 2018.

ARVI	EVI	MNDVI	MNDWI	NDMI	NDVI	NDWI	NMDI	RDI	RECI	SAVI	TVI	Date
0.2713	0.4561	−0.2863	−0.2797	−0.0222	0.2668	−0.2600	0.4326	1.8608	0.5180	0.4802	0.8747	2 January 2018
0.2912	0.6662	−0.3054	−0.3240	−0.0341	0.2881	−0.2937	0.4220	1.9493	0.5633	0.5184	0.8864	7 January 2018
0.2950	0.7153	−0.3604	−0.3653	−0.0481	0.2945	−0.3228	0.4089	2.2248	0.6098	0.5299	0.8894	17 January 2018

Table 2. Sample data of district-specific responses over the years.

Year/District	Jodhpur	Amravati	Thanjavur
2016	Drought	Drought	No Drought
2017	No Drought	No Drought	Drought
2018	No Drought	No Drought	No Drought
2019	Drought	Drought	Drought
2020	Drought	No Drought	No Drought
2021	No Drought	No Drought	No Drought
2022	Drought	No Drought	No Drought
2023	No Drought	Drought	No Drought
2024	Drought	Drought	Drought
2025	No Drought	No Drought	No Drought

Table 3. Before oversampling: comparison of classification methods based on accuracy, precision, and recall.

Methods/Metrics	Accuracy	Precision	Recall
XGBoost	0.8480	0.8129	0.8164
Random Forest	0.8298	0.8145	0.7561
Bagging Classifier	0.8398	0.8232	0.7747
Gradient Boosting	0.7424	0.7209	0.6040

Table 4. Before oversampling: comparison of classification methods based on accuracy, precision, and recall, with confidence intervals from cross-validation.

Methods/Metrics	Accuracy	Precision	Recall
XGBoost	0.8352 ± 0.0089	0.8271 ± 0.0136	0.7747 ± 0.0157
Random Forest	0.8190 ± 0.0032	0.8122 ± 0.0058	0.7471 ± 0.0137
Bagging Classifier	0.8232 ± 0.0040	0.8138 ± 0.0087	0.7578 ± 0.0137
Gradient Boosting	0.7403 ± 0.0054	0.7339 ± 0.0120	0.6111 ± 0.0092

Table 5. SMOTE: Comparison of classification methods based on accuracy, precision, and recall.

Methods/Metrics	Accuracy	Precision	Recall
XGBoost	0.8140	0.7582	0.8006
Random Forest	0.8075	0.7503	0.7934
Bagging Classifier	0.8140	0.7575	0.8020
Gradient Boosting	0.7359	0.6542	0.7518

Table 6. SMOTE: comparison of classification methods based on accuracy, precision, and recall, with confidence intervals from cross-validation.

Methods/Metrics	Accuracy	Precision	Recall
XGBoost	0.9173 ± 0.0064	0.9124 ± 0.0080	0.9293 ± 0.0099
Random Forest	0.8837 ± 0.0126	0.8722 ± 0.0146	0.9081 ± 0.0116
Bagging Classifier	0.8922 ± 0.0063	0.8835 ± 0.0090	0.9117 ± 0.0105
Gradient Boosting	0.8104 ± 0.0085	0.7960 ± 0.0151	0.8519 ± 0.0130

Table 7. Borderline SMOTE: comparison of classification methods based on accuracy, precision, and recall.

Methods/Metrics	Accuracy	Precision	Recall
XGBoost	0.8527	0.8528	0.7733
Random Forest	0.8275	0.8409	0.7131
Bagging Classifier	0.8468	0.8621	0.7446
Gradient Boosting	0.7494	0.7192	0.6356

Table 8. Borderline SMOTE: comparison of classification methods based on accuracy, precision, and recall, with confidence intervals from cross-validation.

Methods/Metrics	Accuracy	Precision	Recall
XGBoost	0.8856 ± 0.0042	0.8731 ± 0.0041	0.8645 ± 0.0097
Random Forest	0.8687 ± 0.0060	0.8646 ± 0.0120	0.8305 ± 0.0034
Bagging Classifier	0.8754 ± 0.0067	0.8676 ± 0.0095	0.8446 ± 0.0071
Gradient Boosting	0.7756 ± 0.0094	0.7714 ± 0.0089	0.6934 ± 0.0214

Table 9. ADASYN: comparison of classification methods based on accuracy, precision, and recall.

Methods/Metrics	Accuracy	Precision	Recall
XGBoost	0.8521	0.7947	0.8608
Random Forest	0.8333	0.7779	0.8293
Bagging Classifier	0.8492	0.7926	0.8551
Gradient Boosting	0.7306	0.6455	0.7575
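
Figure 18 is summarized in terms of the F1-score, which the tables do not list directly. As a hedged illustration, the sketch below derives it as the harmonic mean of the precision and recall values in Table 9. The helper name `f1_score` is introduced here for illustration and is not from the paper, and the averaging procedure behind Figure 18 is not reproduced.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from Table 9 (ADASYN, held-out test set).
adasyn = {
    "XGBoost": (0.7947, 0.8608),
    "Random Forest": (0.7779, 0.8293),
    "Bagging Classifier": (0.7926, 0.8551),
    "Gradient Boosting": (0.6455, 0.7575),
}

f1 = {model: f1_score(p, r) for model, (p, r) in adasyn.items()}

for model, value in f1.items():
    print(f"{model}: F1 = {value:.4f}")
```

Because the harmonic mean is pulled toward the smaller of the two inputs, a model cannot reach a high F1-score by trading away precision for recall or vice versa.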

Table 10. ADASYN: comparison of classification methods based on accuracy, precision, and recall, with confidence intervals from cross-validation.

Methods/Metrics	Accuracy	Precision	Recall
XGBoost	0.8856 ± 0.0042	0.8731 ± 0.0041	0.8645 ± 0.0097
Random Forest	0.8687 ± 0.0060	0.8646 ± 0.0120	0.8305 ± 0.0034
Bagging Classifier	0.8754 ± 0.0067	0.8676 ± 0.0095	0.8446 ± 0.0071
Gradient Boosting	0.7756 ± 0.0094	0.7714 ± 0.0089	0.6934 ± 0.0214

Table 11. Confusion Matrix before oversampling. The matrices for (a) XGBoost, (b) Random Forest, (c) Bagging, and (d) Gradient Boosting show that Gradient Boosting struggled most with identifying drought events (high false negatives).

(a) XGBoost			(b) Random Forest
	Predicted			Predicted
Actual	No Drought	Drought	Actual	No Drought	Drought
No Drought	876	131	No Drought	887	120
Drought	128	569	Drought	170	527
(c) Bagging			(d) Gradient Boosting
	Predicted			Predicted
Actual	No Drought	Drought	Actual	No Drought	Drought
No Drought	891	116	No Drought	844	163
Drought	157	540	Drought	276	421
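
The accuracy, precision, and recall values reported for each model follow directly from counts like these. As a minimal sketch (the function name `metrics_from_confusion` is illustrative and not taken from the paper), the XGBoost matrix in panel (a) can be converted as follows, treating Drought as the positive class:

```python
def metrics_from_confusion(tn, fp, fn, tp):
    """Return accuracy, precision, recall, and F1 for a 2x2 confusion matrix."""
    total = tn + fp + fn + tp
    return {
        "accuracy": (tn + tp) / total,
        "precision": tp / (tp + fp),      # share of predicted droughts that are real
        "recall": tp / (tp + fn),         # share of real droughts that are detected
        "f1": 2 * tp / (2 * tp + fp + fn),
    }

# Counts copied from Table 11(a), XGBoost before oversampling
xgb = metrics_from_confusion(tn=876, fp=131, fn=128, tp=569)
for name, value in xgb.items():
    print(f"{name}: {value:.4f}")
```

The F1 value obtained this way rounds to 0.8146, which agrees with the XGBoost entry under No_Sampling in Table 32.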

Table 12. Confusion Matrix after SMOTE oversampling. Missed drought events (false negatives) decrease for Random Forest, Bagging, and Gradient Boosting but rise slightly for XGBoost (128 to 139), while false alarms (false positives) increase for all four models.

(a) XGBoost			(b) Random Forest
	Predicted			Predicted
Actual	No Drought	Drought	Actual	No Drought	Drought
No Drought	829	178	No Drought	823	184
Drought	139	558	Drought	144	553
(c) Bagging			(d) Gradient Boosting
	Predicted			Predicted
Actual	No Drought	Drought	Actual	No Drought	Drought
No Drought	828	179	No Drought	730	277
Drought	138	559	Drought	173	524

Table 13. Confusion Matrix after Borderline SMOTE oversampling. This technique produces fewer false alarms than SMOTE for every model, but missed detections exceed those on the original imbalanced data for all models except Gradient Boosting.

(a) XGBoost			(b) Random Forest
	Predicted			Predicted
Actual	No Drought	Drought	Actual	No Drought	Drought
No Drought	914	93	No Drought	913	94
Drought	158	539	Drought	200	497
(c) Bagging			(d) Gradient Boosting
	Predicted			Predicted
Actual	No Drought	Drought	Actual	No Drought	Drought
No Drought	924	83	No Drought	834	173
Drought	178	519	Drought	254	443

Table 14. Confusion Matrix after ADASYN oversampling. XGBoost, Random Forest, and Bagging show the most balanced performance, with the fewest false negatives of any configuration and fewer false positives than under SMOTE.

(a) XGBoost			(b) Random Forest
	Predicted			Predicted
Actual	No Drought	Drought	Actual	No Drought	Drought
No Drought	852	155	No Drought	842	165
Drought	97	600	Drought	119	578
(c) Bagging			(d) Gradient Boosting
	Predicted			Predicted
Actual	No Drought	Drought	Actual	No Drought	Drought
No Drought	851	156	No Drought	717	290
Drought	101	596	Drought	169	528
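
The trade-off between missed droughts and false alarms across Tables 11 to 14 is easier to compare as rates. The sketch below does this for XGBoost only, using the counts from panel (a) of each table; the helper name `error_rates` is an illustrative choice, and the class totals are simply the row sums of the matrices.

```python
# (false positives, false negatives) for XGBoost, panel (a) of Tables 11-14
xgb_errors = {
    "No sampling": (131, 128),
    "SMOTE": (178, 139),
    "Borderline SMOTE": (93, 158),
    "ADASYN": (155, 97),
}
N_NO_DROUGHT = 1007  # row total of actual "No Drought" samples
N_DROUGHT = 697      # row total of actual "Drought" samples

def error_rates(fp, fn):
    """Return (false-alarm rate, miss rate) for one confusion matrix."""
    return fp / N_NO_DROUGHT, fn / N_DROUGHT

rates = {method: error_rates(fp, fn) for method, (fp, fn) in xgb_errors.items()}
for method, (false_alarm, miss) in rates.items():
    print(f"{method:17s} false-alarm rate = {false_alarm:.3f}, miss rate = {miss:.3f}")

lowest_miss = min(rates, key=lambda m: rates[m][1])
lowest_false_alarm = min(rates, key=lambda m: rates[m][0])
print("Lowest miss rate:", lowest_miss)
print("Lowest false-alarm rate:", lowest_false_alarm)
```

For XGBoost, ADASYN gives the lowest miss rate and Borderline SMOTE the lowest false-alarm rate, so the preferred method depends on which error type is more costly in a given application.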

Table 15. Group-wise detection before oversampling: comparison of classification methods based on accuracy, precision, and recall.

Methods/Metrics	Accuracy	Precision	Recall
XG Boost	0.9630	0.9630	0.9630
Random Forest	0.9000	0.9000	0.9000
Bagging Classifier	0.9630	0.9630	0.9630
Gradient Boosting	0.8333	0.8333	0.8333

Table 16. Number of misclassifications (errors) in group detection without oversampling, by district (rows) and model (columns), over 2016–2025. Lower is better (0 = all years correct).

District	XGB	RF	Bagging	GB
Amravati	0	0	0	2
Jodhpur	0	0	0	0
Thanjavur	1	3	1	3
Total errors	1	3	1	5

Notes: (i) In Thanjavur, 2019 is consistently misclassified by all models; 2017 and 2024 are frequently misclassified (RF and GB). (ii) In Amravati, only GB makes errors (2016, 2019); other models behave perfectly. (iii) Jodhpur is correctly classified by all models across all years.
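
Group-wise detection assigns one label per district and season by voting over the per-image predictions. The following is only a sketch of that idea under stated assumptions: the input records are hypothetical, ties are broken in favour of drought, and the number of groups is taken to be 30 (3 districts times 10 seasons over 2016–2025). None of these details are confirmed here as the authors' exact procedure.

```python
from collections import Counter, defaultdict

def season_majority_vote(records):
    """Collapse per-image predictions into one label per (district, year).

    `records` holds (district, year, predicted_label) tuples, with 1 = drought.
    Ties go to label 1; this tie rule is an assumption of the sketch.
    """
    votes = defaultdict(Counter)
    for district, year, label in records:
        votes[(district, year)][label] += 1
    return {key: 1 if c[1] >= c[0] else 0 for key, c in votes.items()}

def group_accuracy(n_errors, n_groups):
    """Fraction of district-seasons classified correctly."""
    return (n_groups - n_errors) / n_groups

# Hypothetical per-image predictions, for illustration only
records = [
    ("Jodhpur", 2018, 1), ("Jodhpur", 2018, 1), ("Jodhpur", 2018, 0),
    ("Thanjavur", 2019, 0), ("Thanjavur", 2019, 1), ("Thanjavur", 2019, 0),
]
season_labels = season_majority_vote(records)
print(season_labels)

# Error totals from Table 16, with 30 groups assumed
print(f"RF: {group_accuracy(3, 30):.4f}")
print(f"GB: {group_accuracy(5, 30):.4f}")
```

With 30 groups, the error totals of 3 and 5 give 0.9000 and 0.8333, the values reported for Random Forest and Gradient Boosting in Table 15.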

Table 17. Group-wise detection after SMOTE: comparison of classification methods based on accuracy, precision, and recall.

Methods/Metrics	Accuracy	Precision	Recall
XG Boost	0.9000	0.9000	0.9000
Random Forest	0.9333	0.9333	0.9333
Bagging Classifier	0.9333	0.9333	0.9333
Gradient Boosting	0.8000	0.8000	0.8000

Table 18. Number of misclassifications (errors) in group detection with SMOTE, by district (rows) and model (columns), over 2016–2025. Lower is better (0 = all years correct).

District	XGB	RF	Bagging	GB
Amravati	0	0	0	2
Jodhpur	0	0	0	2
Thanjavur	3	2	2	2
Total errors	3	2	2	6

Notes: (i) In Thanjavur, SMOTE does not fully resolve misclassifications; errors occur most often in 2017 and 2019, and occasionally in 2024. (ii) In Amravati and Jodhpur, XGB, RF, and Bagging perform perfectly, while GB introduces isolated mistakes. (iii) Compared to the no-oversampling case, SMOTE slightly increases errors in Thanjavur but stabilizes Amravati and Jodhpur for tree-based ensembles.

Table 19. Group-wise detection after borderline SMOTE: comparison of classification methods based on accuracy, precision, and recall.

Methods/Metrics	Accuracy	Precision	Recall
XG Boost	0.9333	0.9333	0.9333
Random Forest	0.9000	0.9000	0.9000
Bagging Classifier	0.9000	0.9000	0.9000
Gradient Boosting	0.8333	0.8333	0.8333

Table 20. Number of misclassifications (errors) in group detection with Borderline-SMOTE, by district (rows) and model (columns), over 2016–2025. Lower is better (0 = all years correct).

District	XGB	RF	Bagging	GB
Amravati	0	0	0	2
Jodhpur	0	0	0	0
Thanjavur	2	3	3	3
Total errors	2	3	3	5

Notes: (i) In Thanjavur, errors remain concentrated, especially in 2019 and 2024, across all models. (ii) Amravati is perfectly classified by XGB, RF, and Bagging, but GB fails in two years (2016, 2019). (iii) Jodhpur is always classified correctly, regardless of model. (iv) Compared to plain SMOTE, Borderline-SMOTE slightly increases errors in Thanjavur, while not improving Amravati or Jodhpur.

Table 21. Group-wise detection after ADASYN: comparison of classification methods based on accuracy, precision, and recall.

Methods/Metrics	Accuracy	Precision	Recall
XG Boost	1.0000	1.0000	1.0000
Random Forest	0.9667	0.9667	0.9667
Bagging Classifier	0.9667	0.9667	0.9667
Gradient Boosting	0.8667	0.8667	0.8667

Table 22. Number of misclassifications (errors) in group detection with ADASYN, by district (rows) and model (columns), over 2016–2025. Lower is better (0 = all years correct).

District	XGB	RF	Bagging	GB
Amravati	0	0	0	1
Jodhpur	0	0	0	2
Thanjavur	0	1	1	1
Total errors	0	1	1	4

Notes: (i) XGB achieves perfect classification across all districts and years. (ii) RF and Bagging make a single error each in Thanjavur (2019). (iii) GB shows scattered failures: Amravati (2016), Jodhpur (2017–2018), and Thanjavur (2019). (iv) Compared to SMOTE and Borderline-SMOTE, ADASYN is the most stable oversampling method, eliminating errors in Amravati and Jodhpur for tree ensembles and fully resolving issues in Thanjavur with XGB.

Table 23. Model performance of Top 1 to Top 5 features using Weighted Sum before oversampling.

	XGBoost	Random Forest	Bagging	Gradient Boosting
Top 1	Acc: 0.6015 Prec: 0.5204 Rec: 0.3300	Acc: 0.5452 Prec: 0.4470 Rec: 0.4720	Acc: 0.5452 Prec: 0.4470 Rec: 0.4720	Acc: 0.5939 Prec: 0.5065 Rec: 0.2812
Top 2	Acc: 0.6050 Prec: 0.5198 Rec: 0.4519	Acc: 0.6097 Prec: 0.5257 Rec: 0.4692	Acc: 0.6162 Prec: 0.5336 Rec: 0.4892	Acc: 0.6080 Prec: 0.5315 Rec: 0.3515
Top 3	Acc: 0.7441 Prec: 0.6986 Rec: 0.6585	Acc: 0.7676 Prec: 0.7482 Rec: 0.6643	Acc: 0.7541 Prec: 0.7132 Rec: 0.6671	Acc: 0.6696 Prec: 0.6309 Rec: 0.4634
Top 4	Acc: 0.8058 Prec: 0.7691 Rec: 0.7504	Acc: 0.8163 Prec: 0.7883 Rec: 0.7532	Acc: 0.8163 Prec: 0.7900 Rec: 0.7504	Acc: 0.7072 Prec: 0.6911 Rec: 0.5136
Top 5	Acc: 0.8327 Prec: 0.7901 Rec: 0.8049	Acc: 0.8345 Prec: 0.8130 Rec: 0.7733	Acc: 0.8275 Prec: 0.8039 Rec: 0.7647	Acc: 0.7289 Prec: 0.7094 Rec: 0.5710

Table 24. Model performance of Top 1 to Top 5 features using Borda Count before oversampling.

	XGBoost	Random Forest	Bagging	Gradient Boosting
Top 1	Acc: 0.6015 Prec: 0.5204 Rec: 0.3300	Acc: 0.5452 Prec: 0.4470 Rec: 0.4720	Acc: 0.5452 Prec: 0.4470 Rec: 0.4720	Acc: 0.5939 Prec: 0.5065 Rec: 0.2812
Top 2	Acc: 0.6854 Prec: 0.6292 Rec: 0.5624	Acc: 0.6884 Prec: 0.6258 Rec: 0.5925	Acc: 0.6737 Prec: 0.6060 Rec: 0.5782	Acc: 0.6485 Prec: 0.6324 Rec: 0.3357
Top 3	Acc: 0.7441 Prec: 0.6986 Rec: 0.6585	Acc: 0.7664 Prec: 0.7325 Rec: 0.6758	Acc: 0.7523 Prec: 0.7106 Rec: 0.6657	Acc: 0.6696 Prec: 0.6309 Rec: 0.4634
Top 4	Acc: 0.8104 Prec: 0.7895 Rec: 0.7317	Acc: 0.8075 Prec: 0.7869 Rec: 0.7260	Acc: 0.8104 Prec: 0.7799 Rec: 0.7475	Acc: 0.7031 Prec: 0.6792 Rec: 0.5194
Top 5	Acc: 0.8345 Prec: 0.8120 Rec: 0.7747	Acc: 0.8351 Prec: 0.8210 Rec: 0.7633	Acc: 0.8298 Prec: 0.8107 Rec: 0.7618	Acc: 0.7224 Prec: 0.6972 Rec: 0.5681

Table 25. Model performance of Top 1 to Top 5 features using Weighted Sum with SMOTE.

	XGBoost	Random Forest	Bagging	Gradient Boosting
Top 1	Acc: 0.5370 Prec: 0.4498 Rec: 0.5911	Acc: 0.5546 Prec: 0.4637 Rec: 0.5681	Acc: 0.5546 Prec: 0.4637 Rec: 0.5681	Acc: 0.5352 Prec: 0.4505 Rec: 0.6198
Top 2	Acc: 0.6808 Prec: 0.5975 Rec: 0.6729	Acc: 0.6690 Prec: 0.5888 Rec: 0.6327	Acc: 0.6808 Prec: 0.6035 Rec: 0.6399	Acc: 0.6127 Prec: 0.5213 Rec: 0.6499
Top 3	Acc: 0.7406 Prec: 0.6658 Rec: 0.7346	Acc: 0.7289 Prec: 0.6594 Rec: 0.6973	Acc: 0.7365 Prec: 0.6671 Rec: 0.7102	Acc: 0.6450 Prec: 0.5554 Rec: 0.6614
Top 4	Acc: 0.7805 Prec: 0.7128 Rec: 0.7762	Acc: 0.7788 Prec: 0.7145 Rec: 0.7647	Acc: 0.7741 Prec: 0.7074 Rec: 0.7633	Acc: 0.6696 Prec: 0.5772 Rec: 0.7188
Top 5	Acc: 0.8016 Prec: 0.7365 Rec: 0.8020	Acc: 0.7952 Prec: 0.7326 Rec: 0.7862	Acc: 0.7934 Prec: 0.7309 Rec: 0.7834	Acc: 0.7119 Prec: 0.6250 Rec: 0.7389

Table 26. Model performance of Top 1 to Top 5 features using Borda Count with SMOTE.

	XGBoost	Random Forest	Bagging	Gradient Boosting
Top 1	Acc: 0.5370 Prec: 0.4498 Rec: 0.5911	Acc: 0.5546 Prec: 0.4637 Rec: 0.5681	Acc: 0.5546 Prec: 0.4637 Rec: 0.5681	Acc: 0.5352 Prec: 0.4505 Rec: 0.6198
Top 2	Acc: 0.6808 Prec: 0.5975 Rec: 0.6729	Acc: 0.6690 Prec: 0.5888 Rec: 0.6327	Acc: 0.6808 Prec: 0.6035 Rec: 0.6399	Acc: 0.6127 Prec: 0.5213 Rec: 0.6499
Top 3	Acc: 0.7406 Prec: 0.6658 Rec: 0.7346	Acc: 0.7289 Prec: 0.6594 Rec: 0.6973	Acc: 0.7365 Prec: 0.6671 Rec: 0.7102	Acc: 0.6450 Prec: 0.5554 Rec: 0.6614
Top 4	Acc: 0.7923 Prec: 0.7340 Rec: 0.7719	Acc: 0.7752 Prec: 0.7169 Rec: 0.7446	Acc: 0.7705 Prec: 0.7090 Rec: 0.7446	Acc: 0.6702 Prec: 0.5822 Rec: 0.6858
Top 5	Acc: 0.8146 Prec: 0.7592 Rec: 0.8006	Acc: 0.8028 Prec: 0.7469 Rec: 0.7834	Acc: 0.7946 Prec: 0.7310 Rec: 0.7877	Acc: 0.7077 Prec: 0.6194 Rec: 0.7403

Table 27. Model performance of Top 1 to Top 5 features using Weighted Sum with Borderline SMOTE.

	XGBoost	Random Forest	Bagging	Gradient Boosting
Top 1	Acc: 0.5728 Prec: 0.4724 Rec: 0.3802	Acc: 0.5288 Prec: 0.4295 Rec: 0.4634	Acc: 0.5288 Prec: 0.4295 Rec: 0.4634	Acc: 0.5839 Prec: 0.4864 Rec: 0.3070
Top 2	Acc: 0.6995 Prec: 0.6529 Rec: 0.5667	Acc: 0.6972 Prec: 0.6496 Rec: 0.5638	Acc: 0.6866 Prec: 0.6330 Rec: 0.5567	Acc: 0.6455 Prec: 0.5987 Rec: 0.4046
Top 3	Acc: 0.7377 Prec: 0.7029 Rec: 0.6212	Acc: 0.7394 Prec: 0.7185 Rec: 0.5968	Acc: 0.7418 Prec: 0.7110 Rec: 0.6212	Acc: 0.6749 Prec: 0.6248 Rec: 0.5136
Top 4	Acc: 0.7934 Prec: 0.7851 Rec: 0.6815	Acc: 0.8046 Prec: 0.8085 Rec: 0.6844	Acc: 0.8052 Prec: 0.8007 Rec: 0.6973	Acc: 0.7077 Prec: 0.6639 Rec: 0.5782
Top 5	Acc: 0.8327 Prec: 0.8399 Rec: 0.7303	Acc: 0.8275 Prec: 0.8375 Rec: 0.7174	Acc: 0.8210 Prec: 0.8213 Rec: 0.7188	Acc: 0.7377 Prec: 0.7090 Rec: 0.6083

Table 28. Model performance of Top 1 to Top 5 features using Borda Count with Borderline SMOTE.

	XGBoost	Random Forest	Bagging	Gradient Boosting
Top 1	Acc: 0.5728 Prec: 0.4724 Rec: 0.3802	Acc: 0.5288 Prec: 0.4295 Rec: 0.4634	Acc: 0.5288 Prec: 0.4295 Rec: 0.4634	Acc: 0.5839 Prec: 0.4864 Rec: 0.3070
Top 2	Acc: 0.6995 Prec: 0.6529 Rec: 0.5667	Acc: 0.6972 Prec: 0.6496 Rec: 0.5638	Acc: 0.6866 Prec: 0.6330 Rec: 0.5567	Acc: 0.6455 Prec: 0.5987 Rec: 0.4046
Top 3	Acc: 0.7388 Prec: 0.7114 Rec: 0.6083	Acc: 0.7588 Prec: 0.7563 Rec: 0.6055	Acc: 0.7770 Prec: 0.7673 Rec: 0.6528	Acc: 0.6766 Prec: 0.6448 Rec: 0.4663
Top 4	Acc: 0.7993 Prec: 0.7983 Rec: 0.6815	Acc: 0.8034 Prec: 0.8121 Rec: 0.6758	Acc: 0.8046 Prec: 0.8054 Rec: 0.6887	Acc: 0.7072 Prec: 0.6749 Rec: 0.5481
Top 5	Acc: 0.8263 Prec: 0.8314 Rec: 0.7217	Acc: 0.8275 Prec: 0.8353 Rec: 0.7202	Acc: 0.8245 Prec: 0.8284 Rec: 0.7202	Acc: 0.7312 Prec: 0.7057 Rec: 0.5882

Table 29. Model performance of Top 1 to Top 5 features using Weighted Sum with ADASYN.

	XGBoost	Random Forest	Bagging	Gradient Boosting
Top 1	Acc: 0.5464 Prec: 0.4556 Rec: 0.5595	Acc: 0.5252 Prec: 0.4350 Rec: 0.5380	Acc: 0.5252 Prec: 0.4350 Rec: 0.5380	Acc: 0.5293 Prec: 0.4497 Rec: 0.6729
Top 2	Acc: 0.5692 Prec: 0.4778 Rec: 0.5710	Acc: 0.5904 Prec: 0.4994 Rec: 0.5739	Acc: 0.5910 Prec: 0.5000 Rec: 0.5725	Acc: 0.5851 Prec: 0.4941 Rec: 0.6011
Top 3	Acc: 0.6620 Prec: 0.5761 Rec: 0.6571	Acc: 0.6802 Prec: 0.5964 Rec: 0.6743	Acc: 0.6725 Prec: 0.5906 Rec: 0.6499	Acc: 0.6180 Prec: 0.5253 Rec: 0.6844
Top 4	Acc: 0.7283 Prec: 0.6448 Rec: 0.7475	Acc: 0.7453 Prec: 0.6667 Rec: 0.7547	Acc: 0.7371 Prec: 0.6578 Rec: 0.7446	Acc: 0.6684 Prec: 0.5766 Rec: 0.7131
Top 5	Acc: 0.8316 Prec: 0.7649 Rec: 0.8494	Acc: 0.8263 Prec: 0.7691 Rec: 0.8221	Acc: 0.8187 Prec: 0.7526 Rec: 0.8293	Acc: 0.7048 Prec: 0.6160 Rec: 0.7389

Table 30. Model performance of Top 1 to Top 5 features using Borda Count with ADASYN.

	XGBoost	Random Forest	Bagging	Gradient Boosting
Top 1	Acc: 0.5464 Prec: 0.4556 Rec: 0.5595	Acc: 0.5252 Prec: 0.4350 Rec: 0.5380	Acc: 0.5252 Prec: 0.4350 Rec: 0.5380	Acc: 0.5293 Prec: 0.4497 Rec: 0.6729
Top 2	Acc: 0.5692 Prec: 0.4778 Rec: 0.5710	Acc: 0.5904 Prec: 0.4994 Rec: 0.5739	Acc: 0.5910 Prec: 0.5000 Rec: 0.5725	Acc: 0.5851 Prec: 0.4941 Rec: 0.6011
Top 3	Acc: 0.6702 Prec: 0.5810 Rec: 0.6944	Acc: 0.6866 Prec: 0.6049 Rec: 0.6743	Acc: 0.6843 Prec: 0.6023 Rec: 0.6714	Acc: 0.6320 Prec: 0.5418 Rec: 0.6514
Top 4	Acc: 0.7283 Prec: 0.6448 Rec: 0.7475	Acc: 0.7494 Prec: 0.6709 Rec: 0.7604	Acc: 0.7394 Prec: 0.6599 Rec: 0.7489	Acc: 0.6684 Prec: 0.5766 Rec: 0.7131
Top 5	Acc: 0.8316 Prec: 0.7649 Rec: 0.8494	Acc: 0.8269 Prec: 0.7666 Rec: 0.8293	Acc: 0.8187 Prec: 0.7526 Rec: 0.8293	Acc: 0.7048 Prec: 0.6160 Rec: 0.7389
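
Tables 23 to 30 evaluate feature subsets chosen by aggregating per-model feature rankings with either Borda Count or Weighted Sum. The sketch below illustrates the two schemes in their generic form. The importance values are hypothetical, the equal model weights are an assumption, and the exact scoring and weighting used in the paper are not restated here; only the feature names (NMDI, DOS, RECI, EVI) come from the article.

```python
def borda_count(rankings):
    """Aggregate ranked feature lists; the top of n features earns n - 1 points."""
    scores = {}
    for ranking in rankings.values():
        n = len(ranking)
        for position, feature in enumerate(ranking):
            scores[feature] = scores.get(feature, 0) + (n - 1 - position)
    # Highest total first; alphabetical order breaks ties
    return sorted(scores, key=lambda f: (-scores[f], f))

def weighted_sum(importances, weights):
    """Aggregate importances after normalising each model's values to sum to 1."""
    scores = {}
    for model, imp in importances.items():
        total = sum(imp.values())
        for feature, value in imp.items():
            scores[feature] = scores.get(feature, 0.0) + weights[model] * value / total
    return sorted(scores, key=lambda f: (-scores[f], f))

# Hypothetical mean |SHAP| values per model, for illustration only
importances = {
    "XGBoost":       {"NMDI": 0.40, "DOS": 0.30, "RECI": 0.28, "EVI": 0.02},
    "Random Forest": {"NMDI": 0.35, "DOS": 0.36, "RECI": 0.09, "EVI": 0.20},
    "Bagging":       {"NMDI": 0.45, "DOS": 0.25, "RECI": 0.10, "EVI": 0.20},
}
rankings = {m: sorted(imp, key=imp.get, reverse=True) for m, imp in importances.items()}
weights = {m: 1 / len(importances) for m in importances}  # equal weights (assumed)

top_k = 3
borda_top = borda_count(rankings)[:top_k]
weighted_top = weighted_sum(importances, weights)[:top_k]
print("Borda Count :", borda_top)
print("Weighted Sum:", weighted_top)
```

In this toy input the two schemes agree on the first two features but differ on the third, because Borda Count uses only rank positions while Weighted Sum also reflects how large each importance value is. That is one way the Top-k rows of paired tables can diverge.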

Table 31. Comparisons between oversampling methods using McNemar’s test. All pairs are shown, with significance determined at p < 0.05.

Model	Method 1	Method 2	p-Value	Effect Size	Significant
XGBoost	No Sampling	SMOTE	$6.94 \times 10^{-18}$	0.0340	Yes
XGBoost	No Sampling	BSMOTE	0.3961	0.0047	No
XGBoost	No Sampling	ADASYN	0.4188	0.0041	No
Random Forest	No Sampling	SMOTE	$7.66 \times 10^{-5}$	0.0223	Yes
Random Forest	No Sampling	BSMOTE	0.6889	0.0023	No
Random Forest	No Sampling	ADASYN	0.6101	0.0035	No
Bagging	No Sampling	SMOTE	$1.15 \times 10^{-6}$	0.0258	Yes
Bagging	No Sampling	BSMOTE	0.1337	0.0070	No
Bagging	No Sampling	ADASYN	0.1253	0.0094	No
Gradient Boosting	No Sampling	SMOTE	0.4973	0.0065	No
Gradient Boosting	No Sampling	BSMOTE	0.0501	0.0070	No
Gradient Boosting	No Sampling	ADASYN	0.2141	0.0117	No
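
McNemar's test compares two classifiers on the same samples using only the discordant pairs, that is, the samples that one method classifies correctly and the other does not. The sketch below uses the chi-square form with continuity correction. Whether the paper used this form or the exact binomial variant is not stated here, and the discordant counts in the example are hypothetical.

```python
import math

def mcnemar_chi2(b, c):
    """McNemar's test with continuity correction.

    b: samples classified correctly by method 1 only
    c: samples classified correctly by method 2 only
    Returns (statistic, p_value) under a chi-square distribution with 1 dof.
    """
    if b + c == 0:
        return 0.0, 1.0  # no discordant pairs, so no evidence of a difference
    statistic = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of chi-square with 1 dof: erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical discordant counts, not taken from the paper
stat, p = mcnemar_chi2(b=60, c=35)
print(f"chi2 = {stat:.3f}, p = {p:.4f}, significant at 0.05: {p < 0.05}")
```

Because the statistic depends only on `b` and `c`, two methods with similar overall accuracy can still differ significantly if their errors fall on different samples.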

Table 32. F1-Scores across models and oversampling methods. The best result for each model is listed under Best Method, and Improvement gives its gain over No_Sampling.

Model	No_Sampling	SMOTE	BorderlineSMOTE	ADASYN	Best Method	Improvement
XGBoost	0.8146	0.7788	0.8111	0.8264	ADASYN	+0.0118
RandomForest	0.7842	0.7713	0.7717	0.8028	ADASYN	+0.0186
Bagging	0.7982	0.7791	0.7991	0.8226	ADASYN	+0.0244
GradientBoosting	0.6573	0.6996	0.6748	0.6970	SMOTE	+0.0423
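
The last two columns of Table 32 can be recomputed from the F1-scores. A minimal sketch follows, with the helper name `best_oversampler` chosen for illustration:

```python
# F1-scores copied from Table 32
f1_scores = {
    "XGBoost": {"No_Sampling": 0.8146, "SMOTE": 0.7788,
                "BorderlineSMOTE": 0.8111, "ADASYN": 0.8264},
    "RandomForest": {"No_Sampling": 0.7842, "SMOTE": 0.7713,
                     "BorderlineSMOTE": 0.7717, "ADASYN": 0.8028},
    "Bagging": {"No_Sampling": 0.7982, "SMOTE": 0.7791,
                "BorderlineSMOTE": 0.7991, "ADASYN": 0.8226},
    "GradientBoosting": {"No_Sampling": 0.6573, "SMOTE": 0.6996,
                         "BorderlineSMOTE": 0.6748, "ADASYN": 0.6970},
}

def best_oversampler(scores, baseline="No_Sampling"):
    """Return (method, gain) for the highest F1 among the oversampled variants."""
    candidates = {m: v for m, v in scores.items() if m != baseline}
    best = max(candidates, key=candidates.get)
    return best, round(candidates[best] - scores[baseline], 4)

summary = {model: best_oversampler(scores) for model, scores in f1_scores.items()}
for model, (method, gain) in summary.items():
    print(f"{model:17s} best = {method:8s} improvement = {gain:+.4f}")
```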

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Sharma, S.S.; Mukherjee, J.; Dell’Acqua, F. Leveraging Sentinel-2 Data and Machine Learning for Drought Detection in India: The Process of Ground Truth Construction and a Case Study. Remote Sens. 2025, 17, 3159. https://doi.org/10.3390/rs17183159

