Remote Sensing and Machine Learning Approaches for Hydrological Drought Detection: A PRISMA Review

August, Odwa; Sibiya, Malusi; Ilunga, Masengo; Sumbwanyambe, Mbuyu

doi:10.3390/w18030369

Open Access Systematic Review

by Odwa August ¹,*, Malusi Sibiya ², Masengo Ilunga ¹ and Mbuyu Sumbwanyambe ²

¹ Civil & Environmental Engineering and Building Science Department, University of South Africa, Florida Campus, Roodepoort 1709, South Africa
² Centre for Augmented Intelligence and Data Science, University of South Africa, Florida Campus, Roodepoort 1709, South Africa
* Author to whom correspondence should be addressed.

Water 2026, 18(3), 369; https://doi.org/10.3390/w18030369

Submission received: 28 September 2025 / Revised: 3 November 2025 / Accepted: 3 November 2025 / Published: 31 January 2026

(This article belongs to the Section Hydrology)

Abstract

Hydrological drought poses a significant threat to water security and ecosystems globally. While remote sensing offers vast spatial data, advanced analytical methods are required to translate this data into actionable insights. This review addresses this need by systematically synthesizing the state-of-the-art in using convolutional neural networks (CNNs) and satellite-derived vegetation indices for hydrological drought detection. Following PRISMA guidelines, a systematic search of studies published between 1 January 2018 and August 2025 was conducted, resulting in 138 studies for inclusion. A narrative synthesis approach was adopted. Among the 138 studies included, 58% focused on hybrid CNN-LSTM models, with a marked increase in publications observed after 2020. The analysis reveals that hybrid spatiotemporal models are the most effective, demonstrating superior forecasting skill and in some cases achieving 10–20% higher accuracy than standalone CNNs. The most robust models employ multi-modal data fusion, integrating vegetation indices (VIs) with complementary data like Land Surface Temperature (LST). Future research should focus on enhancing model transferability and incorporating explainable AI (XAI) to strengthen the operational utility of drought early warning systems.

Keywords:

convolutional neural networks (CNN); drought monitoring; Google Earth Engine (GEE); hydrological drought; machine learning; remote sensing; satellite imagery; vegetation indices (NDVI, EVI); water resources

1. Introduction

Drought is an intricate, slow-onset, and recurring natural hazard instigated by a persistent shortage of precipitation over a specific region and period [1,2,3,4]. Unlike more dramatic natural disasters such as floods or hurricanes [1,5], droughts are often characterized as creeping disasters because of their slow and often unnoticed onset, with pervasive cascading impacts across multiple sectors of society and the environment [2,6,7,8,9,10]. The scale of these impacts is immense and growing: the economic consequences of an average drought today can be up to six times higher than in 2000, with costs projected to rise by at least 35% by 2035 [9]. A significant reduction in annual rainfall can halve a region’s GDP growth rate [9]. The human cost is equally staggering; between 2002 and 2021, droughts affected over 1.4 billion people [10]. While droughts accounted for only 6% of natural disasters between 1970 and 2019, they caused 34% of all disaster-related deaths, mostly due to famine in Africa [9]. As of 2022, roughly half the global population faced severe water scarcity for part of the year [10], and in 2023, extreme drought left 23 million people in severe hunger in the Horn of Africa [9]. Droughts also disrupt global trade and energy; the 2024 drought in Central America forced a 49% reduction in monthly traffic through the Panama Canal [9].

The effects of drought can span vast geographical areas and persist for months or even years [11,12,13], leading to devastating consequences for drinking water supplies, agricultural production, energy generation, waterborne transportation [7,12], and the health of both terrestrial and aquatic ecosystems [14]. According to [15], drought is not a uniform phenomenon: it is typically categorized into several interconnected types that manifest through a progressive process known as drought propagation. Figure 1 shows this sequence, which generally begins with a meteorological drought, defined by a prolonged period of below-average precipitation [1,13,15].

As this precipitation deficit persists, it depletes soil moisture, particularly in the root zone, giving rise to an agricultural drought, which directly impacts crop health and productivity [13,15,16]. Over time, the continued lack of precipitation and recharge from the soil results in diminished surface and subsurface water supplies, culminating in a hydrological drought [17]. Hydrological drought is characterized by abnormally low streamflow in rivers, reduced water levels in lakes and reservoirs, and depleted groundwater aquifers [18,19]. It is this final stage that most directly affects public water supply, industrial use, and the viability of aquatic habitats [20].

Monitoring hydrological drought presents significant challenges. Traditional methods rely on in situ gauging stations that measure variables like river discharge, reservoir levels, and groundwater depth [16,21,22]. While these point-based measurements provide highly accurate data, they are often spatially sparse, expensive to maintain, and labor-intensive to operate [13,23,24]. This sparse coverage makes it difficult to capture the continuous spatial dynamics of drought over large, heterogeneous, or remote regions [6,16,25,26] and leads to significant uncertainties when interpolating data between measurement points [27,28].

To overcome these limitations, researchers have increasingly turned to remote sensing technology [1,15], as shown in Figure 2. Earth observation satellites provide a powerful alternative, offering the capacity to monitor vast areas in a near-continuous fashion [1,17] and capturing data with consistent spatiotemporal resolution [16,17]. Platforms such as the Moderate-Resolution Imaging Spectroradiometer (MODIS), the Landsat series, and the European Space Agency’s Sentinel constellation have become indispensable tools [29,30,31], providing a wealth of data on land surface conditions. From this satellite imagery, scientists derive various spectral indices, particularly vegetation indices [15,25], which serve as effective proxies for monitoring the impacts of water stress on plant health [23,32] by measuring the reflectance of light from the Earth’s surface across different spectral bands [22,33]. These indices can provide quantitative information on vegetation density, vigor, and water content, offering a direct window into the health of ecosystems responding to drought conditions [34,35]. This capability has fundamentally transformed the field, moving it from sparse point data to comprehensive spatial assessments of drought’s impact.

The sheer volume and complexity of data generated by modern remote sensing platforms necessitate advanced analytical tools capable of discerning meaningful patterns from vast, multi-dimensional datasets [36,37,38,39]. In recent years, machine learning (ML), and more specifically its subfield of deep learning (DL), has emerged as a transformative force in environmental science [36,40,41]. Unlike traditional statistical models that may struggle with the non-linear and interdependent relationships inherent in environmental systems [40,42], DL models excel at learning complex, hierarchical features directly from raw data [40,43].

Among the various DL architectures, the convolutional neural network (CNN) (Figure 3a) has proven exceptionally well-suited for tasks involving spatial data, such as satellite imagery [44,45,46]. Originally designed for computer vision tasks like image classification [26], CNNs employ a series of specialized layers, most notably convolutional layers, to automatically and adaptively learn spatial hierarchies of features [47,48] (Figure 3b).
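To make the role of a convolutional layer concrete, the following is a minimal pure-Python sketch of a single filter sliding over a small image patch. It is an illustration only, not code from any reviewed study: the function name `conv2d_valid`, the 4x4 patch values, and the hand-set edge filter are all assumptions, and in a trained CNN the filter weights would be learned rather than fixed.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map.

    As in most deep learning frameworks, the kernel is not flipped,
    so this is strictly a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Weighted sum of the image patch currently under the kernel.
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        feature_map.append(row)
    return feature_map


# Hypothetical 4x4 single-band patch: low values on the left, high on the right.
patch = [
    [0.1, 0.1, 0.8, 0.8],
    [0.1, 0.1, 0.8, 0.8],
    [0.1, 0.1, 0.8, 0.8],
    [0.1, 0.1, 0.8, 0.8],
]

# Hand-set vertical-edge filter (an assumption; real CNN filters are learned).
vertical_edge = [
    [-1.0, 1.0],
    [-1.0, 1.0],
]

feature_map = conv2d_valid(patch, vertical_edge)
```

The resulting 3x3 feature map is close to zero in the uniform regions and responds strongly only in the middle column, where the patch values change, which is the sense in which a filter "highlights" a spatial pattern.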

A convolutional layer applies a set of learnable filters (or kernels) across an input image, creating feature maps that highlight specific patterns, such as edges, textures, or shapes [26,43,44,46,50]. As data passes through successive layers, the network learns to combine these simple patterns into more complex and abstract representations [26,43,44,46,48]. This ability to automatically extract relevant spatial features without manual engineering makes CNNs a powerful tool for analyzing environmental imagery [51], enabling applications that range from land cover classification and water body extraction to the identification of complex patterns indicative of drought stress [22,52,53]. The application of CNNs in drought studies marks a significant evolution from earlier ML methods like Support Vector Machines (SVMs) and Random Forests (RFs) [33,39,43,54,55]. While these traditional ML models have demonstrated considerable success [56,57], CNNs offer a stronger capability to learn the intricate spatial context embedded within satellite data, providing a more nuanced understanding of how drought manifests across a landscape [58,59,60,61]. This has unlocked new potential not only for monitoring current drought conditions but also for forecasting their future evolution based on learned spatiotemporal patterns [26,62].

The past decade has witnessed a remarkable convergence of technological advancements that are reshaping the landscape of hydrological science [1,25,26,63]. Four key pillars underpin this transformation: (1) the deployment of advanced remote sensing satellites providing a continuous stream of high-resolution, multi-spectral data [36,47,64]; (2) the refinement and diversification of vegetation indices that act as sensitive proxies for ecosystem health and water stress [25,65]; (3) the maturation of powerful CNN architectures capable of sophisticated spatial and spatiotemporal analysis [54,59]; and (4) the rise of planetary-scale cloud computing platforms.
A key technological transformation has been the arrival of cloud computing platforms, particularly Google Earth Engine (GEE) [6,13,16,66], a planetary-scale platform designed to transform access to earth observation data [67]. GEE provides the infrastructure to process and analyze petabytes of geospatial data by integrating a multi-petabyte archive of analysis-ready satellite imagery with parallelized processing and a library of machine learning algorithms [68,69,70,71]. This model fundamentally alters remote sensing workflows by eliminating the need to download and preprocess massive datasets locally, thereby enabling rapid analyses of environmental phenomena such as drought, land cover change, and agricultural productivity across vast areas [72,73]. This rapid, parallel evolution has spurred an explosion of research, with numerous studies demonstrating the potential of combining these technologies for drought monitoring and prediction [13,54,74,75]. However, this rapid growth has also led to a fragmented and heterogeneous body of literature [74,75]. Individual studies often focus on specific geographic regions [54,76,77], employ unique combinations of satellite data and vegetation indices [13,29,78], and utilize a wide array of custom or standard CNN architectures [29,54,74,77,79]. Figure 4 shows one of these custom CNNs [80]. Bidirectional Long Short-Term Memory (BiLSTM) is a deep learning architecture that learns sequential dependencies in both forward and backward directions, enhancing the model’s ability to capture temporal dynamics. When combined with CNN, BiLSTM complements spatial feature extraction by modeling temporal relationships in drought-related time series [80,81].

While these case studies are invaluable, a critical knowledge gap has emerged: there is no comprehensive, systematic synthesis that critically evaluates and compares these emerging approaches [13,74,75,77]. Specifically, there is a need to move beyond isolated examples and develop a comprehensive understanding of the state of the art [13,77,79,80]. Studies show that hybrid deep learning models, including CNN-LSTM (Figure 5), CNN-Random Forest (RF), and CNN-Support Vector Regression (SVR), have demonstrated strong performance in drought prediction and monitoring, often outperforming standalone or simpler models [4,11,26,72,79,80,81,82,83,84,85,86].

Some findings indicate that simpler models like CNN and LSTM can sometimes outperform more complex models in drought forecasting, mostly in data-scarce environments or regions with limited temporal data [4,49,87]. A hybrid CNN-RF model showed superiority over individual CNN or RF models in estimating SPEI-3 and forecasting drought categories [26] (Figure 5). In the hybrid CNN–RF model, the convolutional layers of the CNN are first used to extract spatial features from remote sensing data, which are then used as input variables for the Random Forest algorithm to perform classification or regression tasks.

NDVI is the most widely used remote sensing index for vegetation-related studies and drought monitoring [11,13,29,54,75,76,80,85,86]. Its effectiveness can vary by region and soil type, with some studies showing higher correlations between NDVI and soil moisture in areas with homogeneous vegetation and silt loams [1]. NDVI reflectance can also be negatively influenced by soil moisture and surface conditions [88]. Additionally, NDVI can saturate once the crop canopy has closed, limiting its performance in drought monitoring [89]. The Enhanced Vegetation Index (EVI) was developed to overcome some of NDVI’s limitations [13]. Overall, the suitability of various vegetation indices for drought monitoring varies across different ecosystems, land cover types, and climatic conditions [75,83,90]. Combining various drought indices and data sources, such as remote sensing products, ground station data, and biophysical information, is crucial for comprehensive drought monitoring [13,91,92].

This systematic review is designed to address this knowledge gap by rigorously identifying, appraising, and synthesizing the peer-reviewed literature published since 2018.
This review aims to provide a clear, evidence-based overview of the most effective methodologies at the intersection of remote sensing, vegetation indices, and CNNs for hydrological drought detection. Several studies have highlighted the need for more comprehensive systematic reviews focusing specifically on deep learning techniques for drought prediction and their application in hydrological contexts [74,87,93,94].
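As a brief illustration of the two vegetation indices discussed above, the sketch below computes NDVI and EVI from band reflectances. The formulas are the standard published definitions (EVI with the coefficients commonly used for MODIS); the function names and the two sets of reflectance values are hypothetical and chosen only to contrast a dense, healthy canopy with a water-stressed one.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    return (nir - red) / (nir + red)


def evi(nir, red, blue, gain=2.5, c1=6.0, c2=7.5, canopy_l=1.0):
    """Enhanced Vegetation Index with the coefficients commonly used for MODIS.

    The blue band and the canopy background term reduce atmospheric and
    soil effects that influence NDVI.
    """
    return gain * (nir - red) / (nir + c1 * red - c2 * blue + canopy_l)


# Hypothetical surface reflectances (unitless, 0-1); not taken from any study.
healthy = {"nir": 0.50, "red": 0.08, "blue": 0.04}
stressed = {"nir": 0.30, "red": 0.15, "blue": 0.07}

ndvi_healthy = ndvi(healthy["nir"], healthy["red"])
ndvi_stressed = ndvi(stressed["nir"], stressed["red"])
evi_healthy = evi(**healthy)
evi_stressed = evi(**stressed)
```

With these assumed inputs both indices drop for the stressed pixel, which is the signal that CNN-based models take as input when vegetation indices are stacked as image channels.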

RQ1: What CNN architectures are most effective for detecting hydrological drought using vegetation indices from satellite imagery?

RQ2: How do different vegetation indices contribute to CNN-based hydrological drought detection accuracy, and what are their specific roles?

RQ3: What are the emerging trends, operationalization challenges, and future directions for applying CNN-based models in hydrological drought monitoring?

This review will offer critical insights into the operational potential of these advanced technologies. By clarifying which models and data inputs are most effective, it can guide the development and implementation of next-generation drought early warning systems. The ability to move from purely reactive monitoring to proactive forecasting, a key capability enabled by these new methods, has profound implications for mitigating the severe socioeconomic impacts of hydrological drought.

2. Materials and Methods

2.1. Protocol and Registration

This systematic literature review was conducted and reported in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 statement [95]. The PRISMA framework ensures a transparent, rigorous, and replicable methodological approach, which is essential for producing a high-quality and unbiased synthesis of the existing literature [96].

2.2. Eligibility Criteria

The selection of studies for inclusion in this review was guided by the PICOS (Participants, Intervention, Comparator, Outcomes, Study Design) framework [97], which provides a structured approach to defining the scope of the research questions.

Participants/Population (P): The review included studies focused on hydrological systems at any geographical scale, from local watersheds to large river basins or national regions. The core subject of investigation had to be hydrological drought or a closely related phenomenon.

Intervention/Exposure (I): The central intervention of interest was the application of a convolutional neural network (CNN) or a hybrid deep learning model that incorporates a CNN component. The model’s primary purpose had to be the detection, monitoring, classification, or forecasting of drought conditions. A critical requirement was that the CNN model must utilize satellite-derived vegetation indices as a primary data input. Included architectures comprised standard 2D/3D CNNs; encoder–decoder models such as U-Net; and hybrid models including, but not limited to, CNN-LSTM, CNN-GRU, CNN-RF, and ConvLSTM.

Comparator (C): Studies were considered eligible if they involved a comparison, either explicit or implicit. This could include comparisons between different CNN architectures, between models using different vegetation indices, or between a CNN-based approach and other benchmark methods. Studies that provided a detailed performance evaluation of a single, novel CNN-VI approach without a direct comparator were also included.

Outcomes (O): The primary outcomes of interest were quantitative performance metrics that assess the effectiveness of the model. These metrics included accuracy, precision, recall, F1-score, coefficient of determination (R²), Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Nash–Sutcliffe Efficiency (NSE).

Study Design and Publication Characteristics (S): The review considered peer-reviewed journal articles, full-text conference papers from reputable proceedings, and publicly available pre-prints from recognized archives. To capture the most recent advancements in the field, the publication period was restricted to studies published from 1 January 2018 to the date of the final search execution (August 2025). Only studies published in the English language were included.
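For readers less familiar with the regression metrics listed under Outcomes, the following minimal sketch shows how RMSE, MAE, NSE, and R² are typically computed from paired observed and simulated values. It is illustrative only: the function names and the four-value example series are assumptions, and R² is implemented here as the squared Pearson correlation, which is one common convention; individual studies may define it differently.

```python
import math


def rmse(obs, sim):
    """Root Mean Square Error."""
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))


def mae(obs, sim):
    """Mean Absolute Error."""
    return sum(abs(s - o) for o, s in zip(obs, sim)) / len(obs)


def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 minus residual variance over observed variance."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    total = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / total


def r_squared(obs, sim):
    """Coefficient of determination as the squared Pearson correlation (assumed convention)."""
    mean_obs = sum(obs) / len(obs)
    mean_sim = sum(sim) / len(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov ** 2 / (var_obs * var_sim)


# Hypothetical drought-index values; not taken from any reviewed study.
obs = [1.0, 2.0, 3.0, 4.0]
sim = [1.5, 2.0, 2.5, 4.0]

scores = {
    "RMSE": rmse(obs, sim),
    "MAE": mae(obs, sim),
    "NSE": nse(obs, sim),
    "R2": r_squared(obs, sim),
}
```

An NSE of 1 indicates a perfect match, while values at or below 0 indicate that the model is no better than predicting the observed mean; RMSE and MAE are in the units of the target index.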

2.3. Information Sources and Search Strategy

2.3.1. Databases and Search Platforms

A comprehensive search was conducted across multiple electronic databases to ensure broad coverage of the relevant literature in environmental science, computer science, and engineering. The primary databases searched were Scopus, Web of Science, IEEE Xplore Digital Library, Google Scholar, and PubMed.

2.3.2. Development of Search Strings and Keywords

A detailed search string was developed to precisely target the intersection of the key concepts of this review. The search strategy combined keywords from four core conceptual blocks. These blocks were Drought Concept: (hydrological drought OR drought detection), Vegetation Index Concept: (vegetation index OR vegetation indices OR NDVI OR EVI), CNN Concept: (convolutional neural network OR CNN), and Data Source Concept: (satellite imagery OR remote sensing). These blocks were combined using the AND Boolean operator to form the primary search string used across the databases: (“hydrological drought” OR “drought detection”) AND (“vegetation index” OR “vegetation indices” OR “NDVI” OR “EVI”) AND (“convolutional neural network” OR “CNN”) AND (“satellite imagery” OR “remote sensing”).
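The way the four concept blocks combine into the primary search string can be sketched in a few lines of Python. This is only an illustration of the Boolean structure described above; the variable and function names are assumptions, and database-specific field tags or syntax variations are not modelled.

```python
# Keywords for each conceptual block, as listed in the text.
concept_blocks = {
    "drought": ["hydrological drought", "drought detection"],
    "vegetation_index": ["vegetation index", "vegetation indices", "NDVI", "EVI"],
    "cnn": ["convolutional neural network", "CNN"],
    "data_source": ["satellite imagery", "remote sensing"],
}


def or_block(terms):
    """Quote each term and join the terms of one concept with OR."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


# Blocks are combined with AND so that a record must match every concept.
search_string = " AND ".join(or_block(terms) for terms in concept_blocks.values())
```

Keeping the blocks in a dictionary makes it straightforward to add or remove synonyms within a concept without disturbing the overall AND structure.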

2.3.3. Date and Language Restrictions

The search covered articles published between 1 January 2018 and 14 October 2025. The start date of 1 January 2018 was selected to capture the contemporary era of deep learning applications in remote sensing. This period is marked by the widespread adoption of sophisticated CNN architectures beyond initial proof-of-concept studies and the maturation of planetary-scale cloud computing platforms like Google Earth Engine for large-scale analysis, representing a distinct phase in the field’s evolution, which has seen exponential growth in recent years. The search was restricted to articles published in English.

2.4. Study Selection Process

Initial screening was performed by reviewing the titles and abstracts of all identified records. This stage filtered out irrelevant studies based on their alignment with the research questions.

2.4.1. Screening of Titles and Abstracts

First, all records retrieved from the database searches were imported into citation management software, and duplicate entries were identified and removed. The titles and abstracts of the remaining unique records were then screened independently by two reviewers against the eligibility criteria. Records that did not meet the criteria were excluded.
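The duplicate-removal step is normally handled by the citation manager, but its logic can be sketched as follows. This is a hypothetical illustration, not the procedure used in this review: the matching rule (same DOI, or same title after normalisation), the function names, and the example records with placeholder DOIs are all assumptions.

```python
import re


def normalise_title(title):
    """Lower-case the title and collapse punctuation and whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(records):
    """Keep the first record seen for each DOI or normalised title."""
    seen = set()
    unique = []
    for rec in records:
        keys = {("title", normalise_title(rec["title"]))}
        doi = (rec.get("doi") or "").strip().lower()
        if doi:
            keys.add(("doi", doi))
        if keys & seen:
            continue  # Matches an earlier record on DOI or title.
        seen |= keys
        unique.append(rec)
    return unique


# Hypothetical records exported from different databases (placeholder DOIs).
records = [
    {"title": "CNN-LSTM drought forecasting", "doi": "10.1000/example.1"},
    {"title": "CNN–LSTM Drought Forecasting.", "doi": None},
    {"title": "U-Net water body mapping", "doi": "10.1000/example.2"},
    {"title": "Water body mapping with U-Net", "doi": "10.1000/EXAMPLE.2"},
]

unique_records = deduplicate(records)
```

Automated matching of this kind is usually followed by a manual check, since titles can differ between databases in ways that simple normalisation does not catch.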

2.4.2. Full-Text Review

The full texts of all the articles that passed the screening were retrieved. The same two independent reviewers then assessed each full-text article for final eligibility. Reasons for exclusion were documented at this stage.

2.5. Data Extraction Process

A standardized and piloted data extraction form was employed to systematically gather pertinent information from the included studies. This ensured methodological consistency and reduced the risk of data omission. The extracted data encompassed CNN architectural characteristics, key training parameters, and reported performance metrics. The process focused on identifying the specific VIs utilized and the source and resolution of the satellite data, so that findings on the comparative impact of different VIs on model accuracy could be reported. The inclusion of other fused input data, such as Land Surface Temperature or precipitation, was also recorded.
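One way to picture such an extraction form is as a structured record with one field per extracted item. The sketch below is hypothetical: the class name, field names, and example values are assumptions introduced for illustration and do not reproduce the form actually used in this review.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class ExtractionRecord:
    """One row of a hypothetical data extraction form."""
    study_id: str
    cnn_architecture: str
    vegetation_indices: list
    satellite_source: str
    spatial_resolution_m: float
    fused_inputs: list = field(default_factory=list)
    training_parameters: dict = field(default_factory=dict)
    metrics: dict = field(default_factory=dict)

    def missing_fields(self):
        """Names of fields left empty, to flag possible data omission."""
        return [
            name
            for name, value in asdict(self).items()
            if value in ("", [], {}, None)
        ]


# Made-up example entry; the values do not describe a real study.
record = ExtractionRecord(
    study_id="S001",
    cnn_architecture="CNN-LSTM",
    vegetation_indices=["NDVI", "EVI"],
    satellite_source="MODIS",
    spatial_resolution_m=250.0,
    fused_inputs=["Land Surface Temperature"],
    metrics={"RMSE": 0.12},
)

incomplete = record.missing_fields()
```

Here `incomplete` lists `training_parameters`, showing how a fixed schema makes gaps in an extracted record easy to spot.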

2.6. Risk of Bias Assessment in Individual Studies

The methodological quality and potential risk of bias of all included studies were assessed using the Joanna Briggs Institute (JBI) Critical Appraisal Tool for quantitative studies [98]. Each included study was independently assessed by two reviewers using the appropriate JBI Critical Appraisal Checklist; for quantitative studies, the checklist for analytical cross-sectional studies was employed. Any disagreements in the assessment were resolved through discussion and consensus with a third reviewer [98].

2.7. Data Synthesis

Due to the anticipated high degree of heterogeneity across the included studies in terms of geography, methodology, and datasets, a quantitative meta-analysis was considered infeasible [99], necessitating the adoption of a narrative synthesis approach. This method involved systematically structuring and integrating the findings thematically to construct a coherent analysis addressing the research questions. Studies were organized into logical groups based on the CNN architecture and the vegetation indices employed. Key characteristics and results were then tabulated to facilitate a comparative analysis aimed at identifying patterns, trends, and inconsistencies within and between these groups. The synthesis culminated in a comprehensive explanatory narrative that interprets the combined findings within the broader context of the field.
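The grouping step of the narrative synthesis can be illustrated with a short sketch that buckets studies by architecture and vegetation index. The study entries below are invented placeholders, and the variable names are assumptions; the sketch shows only the tabulation logic, not the review's actual data.

```python
from collections import defaultdict

# Invented placeholder entries; they do not correspond to included studies.
studies = [
    {"id": "S1", "architecture": "CNN-LSTM", "indices": ["NDVI", "EVI"]},
    {"id": "S2", "architecture": "2D CNN", "indices": ["NDVI"]},
    {"id": "S3", "architecture": "CNN-LSTM", "indices": ["NDVI"]},
    {"id": "S4", "architecture": "U-Net", "indices": ["EVI"]},
]

# Group study IDs by (architecture, vegetation index) pair.
groups = defaultdict(list)
for study in studies:
    for index in study["indices"]:
        groups[(study["architecture"], index)].append(study["id"])

# Tabulate group sizes for side-by-side comparison.
group_sizes = {key: len(ids) for key, ids in groups.items()}
```

A study that uses several indices appears in several groups, so the group sizes sum to more than the number of studies.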

3. Results

3.1. Study Selection and Flow (PRISMA Flow Diagram)

The systematic search of the Scopus, Web of Science, Google Scholar, and IEEE Xplore databases initially yielded a total of 878 records. After importing these records into a citation management tool, 213 duplicates were identified and removed, leaving 665 unique records for screening. The title and abstract screening process, conducted by two independent reviewers, resulted in the exclusion of 458 records that did not meet the PICOS eligibility criteria. Common reasons for exclusion at this stage included an incorrect study focus, the use of machine learning models other than CNNs, or the absence of vegetation indices as a primary input. There were 207 articles left for full-text review. During the full-text assessment, 5 articles were not retrieved, and a further 71 articles were excluded. An analysis of the exclusion reasons at the full-text screening stage provides insight into the current research landscape. Of the 71 articles excluded, the most common reason was a focus on non-hydrological drought (e.g., meteorological or agricultural) (n = 32, 45.1%), indicating a significant body of research utilizing advanced deep learning methods that have yet to be fully translated to hydrological applications. The second most frequent reason for exclusion was the use of machine learning models other than CNNs (n = 22, 31.0%), highlighting the continued prevalence of traditional algorithms like Random Forests and Support Vector Machines in the field. This underscores a potential gap in the adoption of deep learning’s spatial feature extraction capabilities for hydrological drought analysis. Following this rigorous, multi-stage screening process, a final set of 138 studies was deemed eligible and was included in the narrative synthesis. The complete flow of the study selection process is detailed in the PRISMA 2020 flow diagram in Figure 6.
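The record counts reported above for the identification and screening stages, and the percentage shares of the two main full-text exclusion reasons, can be re-derived with a few lines of arithmetic. The sketch uses only figures stated in the text; the variable names are assumptions.

```python
# Identification and title/abstract screening counts, as reported in the text.
records_identified = 878
duplicates_removed = 213
records_screened = records_identified - duplicates_removed

excluded_on_title_abstract = 458
full_text_candidates = records_screened - excluded_on_title_abstract

# Full-text exclusions and the two most common reasons, as reported.
full_text_excluded = 71
exclusion_reasons = {
    "non-hydrological drought focus": 32,
    "machine learning model other than CNN": 22,
}

# Share of each reason among the excluded full-text articles, in percent.
exclusion_shares = {
    reason: round(100 * count / full_text_excluded, 1)
    for reason, count in exclusion_reasons.items()
}
```

The derived values (665 records screened, 207 full-text candidates, and shares of 45.1% and 31.0%) match the figures given in the text for these stages.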

3.2. Characteristics of Included Studies

3.2.1. Overview of Study Designs and Methodologies

The 138 included studies represent a diverse and rapidly evolving body of work. Methodologically, the research is overwhelmingly data-driven, using historical remote sensing and climate data to train and validate predictive models. A common methodological framework observed across many studies (Figure 7) involves (1) acquiring multi-source data, including satellite imagery and meteorological records; (2) preprocessing this data to create a consistent spatiotemporal dataset; (3) using this dataset to train a CNN-based model to predict or classify a target drought index; and (4) evaluating the model’s performance against a hold-out test set [55]. While some studies focus on developing and testing a single novel architecture, a significant portion employs a comparative approach, benchmarking several models to identify the most effective method for their specific application and region [40]. The use of cloud computing platforms, particularly Google Earth Engine (GEE), is a prominent feature, frequently cited for its ability to handle the massive data processing requirements of these analyses [34].
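Step (4) of this framework, evaluation against a hold-out test set, can be sketched as a simple split of a time-ordered dataset. The sketch assumes a chronological split, in which the most recent samples are held out; this is one common choice for time series, but the reviewed studies may split their data differently. The function name, the 20% test fraction, and the monthly sample records are hypothetical.

```python
def chronological_split(samples, test_fraction=0.2):
    """Hold out the most recent samples as the test set (assumed strategy)."""
    ordered = sorted(samples, key=lambda sample: sample["date"])
    n_test = max(1, int(round(len(ordered) * test_fraction)))
    return ordered[:-n_test], ordered[-n_test:]


# Hypothetical monthly samples; ISO date strings sort chronologically.
samples = [
    {"date": f"2020-{month:02d}-01", "target": 0.1 * month}
    for month in range(1, 11)
]

train_set, test_set = chronological_split(samples)
```

Holding out the latest period avoids training on data that post-dates the test samples, which would otherwise give an optimistic picture of forecasting skill.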

3.2.2. Geographical Distribution and Temporal Scope of Research

Figure 8 shows a marked acceleration in publication frequency from 2021 onwards, with a particularly large volume of research published between 2023 and 2024. This research momentum is further highlighted by numerous articles already scheduled for publication in 2025.

Geographically, this body of work maintains a global view, reflecting the worldwide urgency of drought monitoring. This is demonstrated by case studies situated across multiple continents (Figure 9), including investigations in Africa, specifically South Africa and Namibia; various regions within China; Middle Eastern nations such as Iran; Australia; and the broader MENA region.

3.2.3. Data Sources and Satellite Imagery Utilized

The studies draw upon a wide range of satellite data products. The most frequently utilized sensor is MODIS, valued for its long-term data record (since 2000) and moderate spatial resolution, which is suitable for regional and continental-scale analysis [16]. The Landsat series (particularly Landsat 8) and the Sentinel constellation (especially Sentinel-2 for optical data and Sentinel-1 for radar data) are also commonly used, prized for their higher spatial resolution, which enables more localized and detailed assessments [100]. For additional data, precipitation products like CHIRPS (Climate Hazards Group InfraRed Precipitation with Station data) and GPM (Global Precipitation Measurement) are frequently integrated to provide meteorological context or to calculate ground-truth drought indices like SPI [101]. Similarly, reanalysis datasets like ERA5-Land are often used for variables such as soil moisture and temperature [102,103]. A summary of the key characteristics of a representative sample of the included studies is presented in Table 1.

3.3. Effectiveness of CNN Architectures for Hydrological Drought Detection

3.3.1. Identified CNN Architectures and Their Variants

The review identified a clear hierarchy of CNN architectures being applied to drought detection, as shown in Figure 10, evolving from simpler spatial models to highly complex spatiotemporal hybrids. Standard CNNs are the most foundational architectures, often custom-built for a specific study. They typically consist of a sequence of convolutional, activation, and pooling layers, followed by fully connected layers for classification or regression [40].
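The activation and pooling stages that follow a convolution in such a layer stack can be illustrated with a small pure-Python sketch. It assumes a ReLU activation and non-overlapping 2x2 max pooling, which are common choices but are not stated for any particular study; the function names and the 4x4 feature map values are hypothetical.

```python
def relu(feature_map):
    """Activation: replace negative responses with zero."""
    return [[max(0.0, value) for value in row] for row in feature_map]


def max_pool_2x2(feature_map):
    """Pooling: keep the largest value in each non-overlapping 2x2 block."""
    pooled = []
    for i in range(0, len(feature_map) - 1, 2):
        row = []
        for j in range(0, len(feature_map[0]) - 1, 2):
            row.append(max(
                feature_map[i][j], feature_map[i][j + 1],
                feature_map[i + 1][j], feature_map[i + 1][j + 1],
            ))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 feature map, as might come out of a convolutional layer.
conv_output = [
    [-1.0, 2.0, 0.5, -3.0],
    [0.0, 1.0, -0.5, 4.0],
    [3.0, -2.0, 1.5, 0.0],
    [-1.0, 0.5, -4.0, 2.5],
]

activated = relu(conv_output)
pooled = max_pool_2x2(activated)
```

Pooling halves each spatial dimension here (4x4 to 2x2), which is how successive layers come to summarise progressively larger areas of the input image.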

The distribution in Figure 10 provides insight into the field’s current methodological preferences. The dominance of standard 2D CNNs (62.1%) suggests that many studies still prioritize spatial analysis, such as classifying drought severity at a specific point in time. In contrast, the significant share of hybrid CNN–LSTM models (20.7%) reflects the growing focus on temporal drought dynamics and forecasting, which is essential for predictive early warning systems. The smaller, specialized categories like encoder–decoders and GANs represent emerging applications for pixel-level mapping and data generation, respectively.

Standard 2D CNNs treat the input data as a single image and are effective at extracting spatial features to classify the overall drought state of a region or predict a single drought index value [103]. Z. Chen et al. (2024) [40] employed a 2D CNN to predict the daily SAPEI drought index by feeding it spatially explicit meteorological data from surrounding regions, demonstrating the architecture’s ability to learn from spatial context.

Encoder–decoder models like U-Net, which were originally developed for biomedical image segmentation, have been adapted for environmental applications. These architectures feature a “contracting” path (encoder) to capture context and a symmetric “expanding” path (decoder) to enable precise localization [53,54,106]. In drought studies, they are particularly effective for tasks requiring pixel-level classification, such as mapping the spatial extent of drought severity or delineating surface water bodies to assess hydrological drought impacts [21]. A study by M. Wieland and S. Martinis (2020) [21] on hydrological drought in Germany successfully used a U-Net-based chain to extract water bodies from Sentinel-2 imagery with high accuracy (≥0.95), showcasing the power of this architecture for precise spatial mapping.
A growing number of studies leverage well-established, pre-trained CNN architectures such as VGGNet, AlexNet (Figure 11), and more recently, EfficientNet [105].

These models, originally trained on massive image datasets like ImageNet, have already learned to recognize a rich hierarchy of visual features. By using a transfer learning approach, researchers can fine-tune these models on smaller, domain-specific datasets of drought imagery. This approach can lead to faster convergence and improved performance, especially when labelled training data is scarce [108]. Chaudhari et al. (2021) [11] compared a custom CNN against AlexNet and VGGNet for drought classification in India, finding that their custom CNN performed best while also highlighting the varying performance characteristics of these standard architectures (Figure 12).

The hybrid spatiotemporal model represents the most advanced and increasingly prevalent approach, particularly for drought forecasting [4,74,79]. These models explicitly acknowledge that drought is a dynamic process that evolves in both space and time [63,106]. The most common hybrid is the CNN-LSTM, which combines a CNN for spatial feature extraction with a Long Short-Term Memory (LSTM) network, a type of Recurrent Neural Network (RNN), for learning temporal dependencies (Figure 13) [71,109].

In this setup, the CNN processes a sequence of satellite images, and the resulting feature vectors are fed into the LSTM, which learns to predict the next state in the sequence [32,110]. Variants like the ConvLSTM integrate convolutional operations directly into the LSTM cell (Figure 14), allowing the model to learn spatiotemporal patterns simultaneously [74,111]. These hybrid models are consistently shown to be superior for forecasting tasks, as they can model the physical process of drought propagation over time [82,112].
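
The data flow described above (a CNN summarizing each image, an LSTM carrying information across time steps) can be sketched in plain Python. This is a toy illustration under stated assumptions, not the implementation of any reviewed study: the function names, the two 2×2 kernels, the scalar hidden state, and the fixed (untrained) weights are all hypothetical.

```python
import math

def conv2d_valid(image, kernel):
    """Valid (no-padding) 2D cross-correlation on list-of-lists inputs."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

def cnn_features(image, kernels):
    """Spatial step: convolution -> ReLU -> global average pooling.

    Returns one scalar feature per kernel for a single image.
    """
    feats = []
    for kernel in kernels:
        fmap = conv2d_valid(image, kernel)
        relu = [max(0.0, v) for row in fmap for v in row]
        feats.append(sum(relu) / len(relu))
    return feats

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h, c, w):
    """One LSTM cell update with a scalar hidden state (hypothetical toy size).

    w maps a gate name to (input weights, recurrent weight, bias).
    """
    def pre(gate):
        wx, wh, b = w[gate]
        return sum(wi * xi for wi, xi in zip(wx, x)) + wh * h + b
    f = sigmoid(pre("forget"))   # how much old cell state to keep
    i = sigmoid(pre("input"))    # how much new candidate to admit
    o = sigmoid(pre("output"))   # how much cell state to expose
    g = math.tanh(pre("cand"))   # candidate cell state
    c_new = f * c + i * g
    h_new = o * math.tanh(c_new)
    return h_new, c_new

def cnn_lstm_forward(frames, kernels, w):
    """Temporal step: feed per-frame CNN features through the LSTM in order."""
    h, c = 0.0, 0.0
    for frame in frames:
        h, c = lstm_step(cnn_features(frame, kernels), h, c, w)
    return h  # final hidden state, used here as the sequence summary

# Three 3x3 grids standing in for a declining vegetation-index time series.
frames = [
    [[0.8, 0.7, 0.6], [0.7, 0.6, 0.5], [0.6, 0.5, 0.4]],
    [[0.6, 0.5, 0.4], [0.5, 0.4, 0.3], [0.4, 0.3, 0.2]],
    [[0.4, 0.3, 0.2], [0.3, 0.2, 0.1], [0.2, 0.1, 0.0]],
]
kernels = [[[0.25, 0.25], [0.25, 0.25]],   # local mean
           [[1.0, -1.0], [1.0, -1.0]]]     # left-to-right gradient
weights = {g: ([0.5, 0.5], 0.1, 0.0)
           for g in ("forget", "input", "output", "cand")}
prediction = cnn_lstm_forward(frames, kernels, weights)
```

In a real model the kernels and gate weights would be learned jointly by backpropagation, and the hidden state would be a vector feeding a regression or classification head; a ConvLSTM would instead replace the weighted sums inside `lstm_step` with convolutions.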

3.3.2. Performance Metrics and Comparative Analysis of Architectures

A narrative synthesis of the reported results reveals clear and consistent patterns regarding the relative effectiveness of different architectures. Table 2 provides a comparative summary of performance for key architectural types. The evidence strongly suggests that for forecasting tasks, hybrid spatiotemporal models (CNN-LSTM and its variants) consistently outperform other architectures [80,82,112]. Elbeltagi et al. (2024) [84] conducted a comprehensive comparison of four hybrid models (CNN-LSTM, CNN-RF, CNN-SVR, and CNN-XGB) for forecasting the Palmer Drought Severity Index (PDSI) in Egypt. Their results showed that the CNN-LSTM model achieved the highest performance during the training phase across multiple metrics, including an NSE of 0.885 and an R² of 0.885 [82], while the CNN-SVR model showed slightly better generalization in the testing phase. The study concluded that the CNN-LSTM was the most suitable for future investigation due to its inherent ability to capture temporal dynamics [82]. For tasks focused on spatial mapping and classification, encoder–decoder architectures like U-Net demonstrate excellent performance [53,113,114]. Their ability to produce high-resolution output maps makes them ideal for delineating the precise boundaries of drought-affected areas or changes in water bodies [3,53]. Standard 2D CNNs serve as a robust baseline and can achieve high accuracy in classification tasks, especially when fine-tuned or when the problem is spatially focused rather than temporally predictive [44,105]. The performance of pre-trained models is more variable and appears to be highly dependent on the similarity between the source domain and the target domain (satellite imagery), and the specific fine-tuning strategy employed [105].
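
For readers comparing the reported scores, the two metrics quoted above can be written out as follows. This is a minimal sketch: the sample values are invented for illustration, and R² is computed here as the squared Pearson correlation, which is an assumption about how individual studies define it.

```python
import math

def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance

def r_squared(obs, sim):
    """Squared Pearson correlation between observed and simulated series."""
    mean_obs = sum(obs) / len(obs)
    mean_sim = sum(sim) / len(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov ** 2 / (var_obs * var_sim)

# Hypothetical drought-index values, for illustration only.
observed = [-1.2, -0.5, 0.3, 1.1, 0.4]
simulated = [-1.0, -0.6, 0.2, 0.9, 0.6]
scores = {"NSE": nse(observed, simulated), "R2": r_squared(observed, simulated)}
```

Because NSE penalizes bias and amplitude errors while the squared correlation does not, the two values coincide only in special cases, which is worth keeping in mind when studies report them side by side.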

3.3.3. Factors Influencing Architectural Effectiveness

The effectiveness of a given CNN architecture is not absolute but is influenced by a combination of factors related to data, the model, and the problem itself. Deep learning models are data hungry [54,118,119]. The performance of complex architectures like CNN-LSTM is highly dependent on the availability of long and consistent time-series data for training [103,110,120]. Studies with larger and more diverse datasets tend to develop more robust and generalizable models [44]. As noted, using pre-trained models can be a powerful strategy, especially with limited data [47,48,79]. However, the effectiveness depends on the “distance” between the source and target data domains. Fine-tuning is critical to adapt the learned features to the specific spectral and spatial characteristics of satellite imagery [47]. There is a trade-off between model complexity and the risk of overfitting [45,54]. While a complex model like a deep CNN-LSTM has a higher capacity to learn intricate patterns, it also requires more data and is more prone to memorizing noise if not properly regularized [58,121]. For simpler classification tasks, a well-designed, smaller 2D CNN may outperform a more complex model that is not adequately trained [11,45,122,123]. The structure of the input data profoundly influences architectural choice. Models designed for forecasting benefit from inputs structured as a time series of images, which naturally suits a CNN-LSTM architecture [110,124]. In contrast, models for static classification may perform well with a single image input or a “data cube” that stacks multiple indices at a single time step [54]. The fusion of multiple data types often requires custom architectures designed to handle these multi-modal inputs effectively [67,103].

3.4. Contribution and Roles of Different Vegetation Indices

3.4.1. Commonly Employed Vegetation Indices in CNN-Based Detection

Vegetation indices (VIs) are mathematical combinations of different spectral bands, designed to enhance the signal of vegetation properties whilst minimizing noise from factors like soil background and atmospheric conditions [13,79,125]. They are the primary link between what the satellite “sees” and the health of the ecosystem on the ground [126]. The review found that a core set of VIs is repeatedly used as inputs for CNN-based drought detection models (Figure 15) [75,86].

The review found that models use a core set of VIs, each with specific strengths and weaknesses; the characteristics of these key indices are detailed in Table 3 and Table 4. The Normalized Difference Vegetation Index (NDVI) is by far the most widely used VI, serving as a robust indicator of vegetation greenness and density [33,79,127]. It is used in a vast number of the included studies as a primary input for assessing vegetation health [96]. Its primary weakness, however, is that it saturates in high-biomass areas. The Enhanced Vegetation Index (EVI) was developed to address this: it is designed to remain sensitive in high-biomass areas where NDVI tends to saturate [74,105]. It also incorporates corrections for atmospheric influences and soil background, making it a more stable index [79,105]. The Normalized Difference Water Index (NDWI) plays a unique role, as it is specifically designed to detect water content in vegetation canopies. This sensitivity makes it a powerful predictor of early-stage water stress, because it measures water status directly rather than inferring it solely from greenness [128,129]. For arid and semi-arid regions, the Soil Adjusted Vegetation Index (SAVI) is often superior. It is a modification of NDVI that includes a soil adjustment factor (‘L’) to minimize the effect of soil brightness [15,130], which can be a significant source of noise in areas with sparse vegetation cover [127]. Finally, Leaf Area Index (LAI) is a biophysical parameter quantifying the amount of leaf material in an ecosystem [98]. It is a key indicator of canopy structure and photosynthetic capacity and is often used as an input or validation variable in drought studies [131].
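
The band arithmetic behind these indices is compact enough to state directly. The sketch below uses the standard published formulas, with NDWI in its NIR/SWIR canopy-water form; the reflectance values are hypothetical, and the EVI coefficients and the SAVI soil factor of 0.5 are the commonly used defaults rather than values taken from any specific reviewed study.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return (nir - red) / (nir + red)

def evi(nir, red, blue, gain=2.5, c1=6.0, c2=7.5, canopy_l=1.0):
    """Enhanced Vegetation Index with the commonly used MODIS coefficients."""
    return gain * (nir - red) / (nir + c1 * red - c2 * blue + canopy_l)

def ndwi(nir, swir):
    """Normalized Difference Water Index (NIR/SWIR form for canopy water)."""
    return (nir - swir) / (nir + swir)

def savi(nir, red, soil_factor=0.5):
    """Soil Adjusted Vegetation Index; soil_factor is the 'L' term."""
    return (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)

# Hypothetical surface reflectances for one vegetated pixel.
pixel = {"blue": 0.05, "red": 0.10, "nir": 0.50, "swir": 0.20}
indices = {
    "NDVI": ndvi(pixel["nir"], pixel["red"]),
    "EVI": evi(pixel["nir"], pixel["red"], pixel["blue"]),
    "NDWI": ndwi(pixel["nir"], pixel["swir"]),
    "SAVI": savi(pixel["nir"], pixel["red"]),
}
```

Setting `soil_factor=0` reduces SAVI to NDVI, which makes explicit that SAVI is the same ratio with an added soil term in the denominator.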

3.4.2. Impact of Specific Vegetation Indices on Detection Accuracy

Studies consistently show that in arid and semi-arid environments with sparse vegetation, indices that account for the soil background, such as SAVI, can offer improved performance over the standard NDVI [15,127,130]. NDVI, while a mainstay of vegetation monitoring, can be confounded by changes in soil moisture and color, which SAVI is designed to mitigate [15,123,130]. A study in Iran by Rahnama et al. (2024) [15] evaluated drought across different climate zones and found that in the arid and semi-arid regions of Birjand and Shiraz, SAVI performed better than NDVI because the soil’s effect on reflectance was more significant. In contrast, in areas with high vegetation cover, the two indices performed equally well [13]. In regions with dense vegetation, such as forests or intensive agriculture, EVI is often superior to NDVI [125]. The primary reason is that NDVI values tend to become saturated in these high-biomass areas. This means the index value stops increasing even as the vegetation becomes denser or healthier [1]. EVI is specifically formulated to overcome this saturation issue, providing a more linear relationship with biophysical parameters like LAI and biomass [130], thus offering a more sensitive measure of drought stress in these environments [88]. NDWI plays a unique and complementary role. While NDVI and EVI measure vegetation vigor, which is an indirect symptom of water stress, NDWI is sensitive to the actual water content within the plant canopy [128,129]. This makes it a more direct indicator of drought-induced water stress. Studies that incorporate NDWI often find it to be a powerful predictor, especially for early-stage drought detection, as plants may reduce their water content before showing visible signs of browning or reduced greenness [129].

3.4.3. Synergistic Effects and Redundancy Among Indices

A dominant and powerful trend identified in this review is the shift from single-index models to multi-modal data fusion (Figure 16).

Researchers are increasingly finding that combining multiple indices and data types leads to more robust and accurate drought detection systems [34]. This approach is based on the convergence of evidence principle: if multiple, independent indicators all point towards drought, the confidence in that assessment is much higher [132]. A common and effective fusion strategy is to combine a vegetation index (like NDVI or EVI) with a thermal index, most often derived from Land Surface Temperature (LST). This combination forms the basis of widely used composite indices like the Vegetation Health Index (VHI) (Figure 17), which is calculated from the Vegetation Condition Index (VCI) and the Temperature Condition Index (TCI) [16].

The VCI (derived from NDVI or EVI) reflects the health of the vegetation, while the TCI (derived from LST) reflects temperature-related stress. A situation with low VCI and high LST (which corresponds to a low TCI in the standard formulation) is a strong indicator of drought-induced stress. Feeding both vegetation and temperature data into a CNN allows the model to learn these complex relationships automatically, leading to more accurate assessments than using either data type alone. Furthermore, integrating precipitation data provides information on the meteorological driver of drought [101]. Figure 18 shows the Scaled Drought Condition Index (SDCI) combining VCI, TCI, and a Precipitation Condition Index (PCI) [34]. By fusing information on vegetation response (VCI), thermal stress (TCI), and the causal precipitation deficit (PCI), models can build a far more comprehensive picture of the drought event, from its meteorological origins to its agricultural and hydrological impacts. Some indices may offer redundant information in certain contexts [128]; nevertheless, the fusion of complementary data types, such as greenness, water content, and temperature, has proven highly effective for enhancing the accuracy and reliability of CNN-based drought detection.
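
The condition indices named above share a simple min-max structure, sketched below. The formulas for VCI, TCI, and VHI follow their standard definitions; the equal VHI weighting (`alpha=0.5`) is the common default, while the SDCI weights and all sample values are assumptions for illustration and should be checked against the original index definition before use.

```python
def vci(ndvi, ndvi_min, ndvi_max):
    """Vegetation Condition Index (0-100): current NDVI relative to its historical range."""
    return 100.0 * (ndvi - ndvi_min) / (ndvi_max - ndvi_min)

def tci(lst, lst_min, lst_max):
    """Temperature Condition Index (0-100).

    The scale is inverted: a pixel near its historical maximum LST
    scores close to 0, i.e. strong thermal stress.
    """
    return 100.0 * (lst_max - lst) / (lst_max - lst_min)

def pci(precip, precip_min, precip_max):
    """Precipitation Condition Index (0-100)."""
    return 100.0 * (precip - precip_min) / (precip_max - precip_min)

def vhi(vci_value, tci_value, alpha=0.5):
    """Vegetation Health Index: weighted mean of VCI and TCI."""
    return alpha * vci_value + (1.0 - alpha) * tci_value

def sdci(vci_value, tci_value, pci_value, weights=(0.25, 0.25, 0.5)):
    """Weighted combination of the three condition indices.

    The default weights are an assumption, not taken from the reviewed text.
    """
    w_v, w_t, w_p = weights
    return w_v * vci_value + w_t * tci_value + w_p * pci_value

# Hypothetical pixel: low greenness, hot surface, very little rain.
v = vci(0.30, ndvi_min=0.20, ndvi_max=0.70)
t = tci(310.0, lst_min=290.0, lst_max=315.0)
p = pci(10.0, precip_min=5.0, precip_max=105.0)
health = vhi(v, t)
combined = sdci(v, t, p)
```

All three inputs are low for this pixel, so the composite scores are low as well, which is the numerical form of the convergence-of-evidence argument.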

3.4.4. The Role of Cloud Computing Platforms

Our analysis of the 138 included articles found that 17 studies (approximately 10.9%) explicitly utilized the Google Earth Engine (GEE) platform. The primary application of GEE within this cohort was for large-scale data acquisition and preprocessing. Hegazi et al. (2023) [24] used the platform to access analysis-ready Sentinel-2 imagery, while Saini and Nagpal (2024) [29] used it for calculating vegetation indices and extracting phenological parameters across a large geographical area before exporting the data for model training.

3.5. Illustrative Case Study: A Hybrid Deep Learning Model for Drought Monitoring in Southwest China

To illustrate the practical application of the advanced methods identified in this review, we examine the work of Xiao et al. (2024) [26] as a representative case study. This study was selected because it exemplifies two of the most significant trends identified in this review: (1) the use of powerful, multi-modal data fusion (integrating satellite-derived indices, precipitation reanalysis, soil moisture data, and static terrain factors) and (2) the application of a hybrid architecture to answer a research question aligned with RQ1 and RQ2. The study’s methodology (Figure 19) utilizes a hybrid CNN-Random Forest (CNN-RF) model. This architecture represents a common and effective hybrid strategy, chosen to combine the strengths of both algorithms: the CNN acts as an efficient feature extractor to capture spatial patterns from gridded data, while the Random Forest (RF) excels at regressing these complex features against the target variable (SPEI-3). While many studies in this review opt for hybrid CNN-LSTM models for their superior temporal forecasting strengths (as discussed in Section 3.3.2), the CNN-RF model represents another major branch of hybrid design. It is particularly well-suited for high-dimensional, non-linear regression tasks where the goal is to model complex interactions between a set of spatiotemporal inputs, rather than just a time-series sequence. This approach is often benchmarked against other tree-based ensembles like CNN-XGBoost, reflecting a broader trend of leveraging the predictive power of ensemble learners to enhance the spatial feature extraction of CNNs. The authors chose Yunnan Province, a region characterized by complex topography and an increased frequency of severe drought events, as their study area. The model was designed to reproduce the station-based 3-month Standardized Precipitation Evapotranspiration Index (SPEI-3), a widely accepted indicator of agricultural drought, from 2001 to 2020.
The model demonstrates a powerful data fusion strategy, integrating nine different drought factors from four distinct data sources. Satellite remote sensing data from MODIS were used to derive the Vegetation Condition Index (VCI), Temperature Condition Index (TCI), and Scaled Potential Evapotranspiration (SPET). Precipitation data from the Climate Hazards Group InfraRed Precipitation with Station (CHIRPS) dataset provided the Precipitation Condition Index (PCI) and the 3-month Standardized Precipitation Index (SPI-3). Reanalysis data from the Global Land Data Assimilation System (GLDAS) were the source for the Soil Moisture Condition Index (SMCI). Topographical data (SRTM) for elevation and slope were included as static terrain factors that influence drought patterns. The study rigorously evaluated the model’s performance against standalone CNN and RF models, as well as against real-world ground-truth data. The hybrid CNN-RF model significantly outperformed both the individual CNN and RF models in estimating SPEI-3. During validation, it achieved a high correlation coefficient (CC) of 0.9113, a low Root Mean Square Error (RMSE) of 0.3997, and a Kling–Gupta Efficiency (KGE) of 0.8635. The CNN-RF model also proved superior in forecasting drought categories, achieving the highest micro-average Area Under the Curve (AUC) value of 0.86. The model’s predictions showed strong consistency with actual drought conditions.
The simulated SPEI-3 had a significant correlation with the 3-month Standardized Soil Moisture Index (SSMI-3) derived from in situ soil moisture data (CC = 0.42, p < 0.01). Furthermore, the model-estimated drought area on cropland showed a significant negative correlation with summer harvest grain yields, validating the model’s effectiveness in tracking real-world agricultural impacts. The Xiao et al. (2024) study [26] validates two of this review’s core findings. First, hybrid architectures that combine the spatial feature extraction of CNNs with the non-linear modeling capabilities of algorithms like Random Forests can lead to superior performance compared to standalone models. Second, the fusion of complementary, multi-source data integrating vegetation, climate, soil, and terrain factors is a highly effective strategy that allows the model to build a more comprehensive and physically coherent picture of drought, leading to more accurate and reliable assessments.
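
The three validation scores quoted in this case study (CC, RMSE, and KGE) can be computed as shown below. This is a generic sketch using the standard definitions, with KGE in its original 2009 form; the sample series is invented and is not data from Xiao et al.

```python
import math
from statistics import mean, pstdev

def pearson_cc(obs, sim):
    """Pearson correlation coefficient (CC)."""
    mo, ms = mean(obs), mean(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    return cov / math.sqrt(var_o * var_s)

def rmse(obs, sim):
    """Root Mean Square Error."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

def kge(obs, sim):
    """Kling-Gupta Efficiency: combines correlation, variability ratio, and bias ratio."""
    r = pearson_cc(obs, sim)
    alpha = pstdev(sim) / pstdev(obs)   # variability ratio
    beta = mean(sim) / mean(obs)        # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

# Hypothetical observed and simulated values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [1.1, 1.9, 3.2, 3.8]
report = {
    "CC": pearson_cc(observed, simulated),
    "RMSE": rmse(observed, simulated),
    "KGE": kge(observed, simulated),
}
```

One caveat when applying this to standardized indices such as SPEI: the bias ratio divides by the observed mean, which is close to zero by construction, so the KGE value can become unstable unless the series is shifted or a modified formulation is used.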

4. Discussion

4.1. Summary of Main Findings

This systematic review has synthesized and analyzed the state-of-the-art in applying convolutional neural networks (CNNs) and satellite-derived vegetation indices for hydrological drought detection from 2018 to the present. The findings reveal a field characterized by rapid technological advancement and a clear evolution in methodological approaches. The analysis shows a distinct trend moving beyond simple 2D CNNs towards more sophisticated architectures. For drought forecasting, hybrid spatiotemporal models, particularly the CNN-LSTM, have emerged as the most effective class of architecture due to their ability to process spatial features with a CNN and learn temporal dependencies with an LSTM, which allows them to model the dynamic nature of drought propagation more accurately. For tasks involving precise spatial mapping, such as delineating drought extent or surface water changes, encoder–decoder architectures like U-Net have proven to be highly effective. The review confirms that there is no single best index; rather, their utility is context dependent. While NDVI remains the most widely used index, EVI offers superior sensitivity in high-biomass regions, and SAVI is better suited for sparsely vegetated, arid lands. A critical finding is the powerful trend towards multi-modal data fusion. Combining complementary indices, such as a greenness index (NDVI/EVI), a water index (NDWI), and a thermal index (derived from LST), provides a “convergence of evidence” that yields significantly more robust and accurate drought assessments than any single index alone. Finally, the emergence of platforms like Google Earth Engine (GEE) has been identified as a fundamental enabler. GEE provides transformative access to the vast data and computational resources necessary for this research, thereby accelerating innovation across the field.

4.2. Interpretation of Results

4.2.1. Optimal CNN Architectures for Hydrological Drought Detection

The observed dominance of hybrid spatiotemporal models like the CNN-LSTM is not merely an incremental improvement; it represents a fundamental alignment of the analytical method with the underlying physical process. Hydrological drought is the culmination of a process that unfolds over both space and time. A meteorological drought (a spatial precipitation deficit) propagates through the hydrological cycle, impacting soil moisture and eventually manifesting as reduced streamflow over a period of weeks or months. A standard 2D CNN is adept at answering the question, “What does drought look like spatially at this moment?” It can effectively learn the spatial signatures of stressed vegetation or dry soil from a single satellite image [58]. However, it lacks an intrinsic mechanism for understanding the temporal sequence of events. The LSTM component of a hybrid model directly addresses this shortcoming. By processing the sequence of spatial feature maps extracted by the CNN, the LSTM learns the temporal patterns of how drought develops, persists, and recovers [107]. This architectural synergy is why CNN-LSTM models are not just better at forecasting; they are conceptually better suited to the problem. They move beyond static pattern recognition to dynamic process modeling, which is essential for the predictive capabilities required by modern early warning systems.

4.2.2. The Critical Role of Vegetation Indices in Detection Accuracy

The findings regarding vegetation indices highlight a similar maturation in the field, moving from a search for a single “silver bullet” index to a more nuanced, multi-faceted approach. The VIs used in these models are not interchangeable data points; they are distinct lenses that each capture a different aspect of the ecosystem’s response to water stress. NDVI and EVI primarily measure vegetation greenness, a proxy for photosynthetic activity and overall plant health [131]. This is an excellent indicator of the impact of agricultural drought. However, a plant’s greenness may not decline until it has been under stress for some time. NDWI, by contrast, measures canopy water content, which can decrease much earlier in a drought event as a plant’s first response is to conserve water [128]. LST provides information on thermal stress, which is related to evapotranspiration and surface energy balance. The trend towards fusing these different data types within a single CNN model is a logical and powerful response to the complexity of the drought phenomenon. A model fed with only NDVI is learning to detect the consequences of drought. A model fed with NDVI, NDWI, and LST is learning to connect the meteorological drivers, the initial plant response, and the eventual impact on plant health. This multi-modal approach [32] allows the CNN to build a more complete, physically coherent model of the entire drought process, leading to more reliable and timely detection. It reflects a shift from single-indicator monitoring to a holistic, system-level assessment that better captures the convergence of evidence necessary for robust environmental analysis [132].

4.2.3. Bridging the Scale Gap: The Role of Unmanned Aerial Vehicles (UAVs) in Drought Monitoring

While satellite remote sensing provides invaluable large-scale coverage for drought monitoring, Unmanned Aerial Vehicles (UAVs, or drones) have emerged as a critical technology for addressing the inherent scale mismatch between sparse, point-based ground measurements and the coarser resolution of satellite imagery [33,133,134,135,136]. UAVs offer on-demand, flexible data acquisition capabilities at very high spatial resolutions, often at the centimeter level, providing a crucial intermediate scale for detailed drought analysis and complementing satellite observations [133]. UAVs deployed for drought and water stress monitoring are commonly equipped with various sensors, including thermal, multispectral, and hyperspectral cameras [134,135]. Thermal sensors are particularly valuable as they can detect increases in canopy temperature associated with water stress-induced stomatal closure [33,135]. Data from these thermal sensors is frequently used to calculate indices like the Crop Water Stress Index (CWSI), a widely adopted and effective indicator of plant water status derived directly from UAV-based thermal imagery [131,135]. Multispectral and hyperspectral sensors provide detailed spectral information used to derive various vegetation indices sensitive to plant health and water content [135]. Crucially, UAVs function not as replacements for satellites but as powerful complements, enabling sophisticated multi-scale data fusion approaches. High-resolution UAV data serves two primary synergistic roles in conjunction with satellite imagery [133,134,136]. Firstly, it provides high-quality ground truth for calibrating and validating products derived from coarser-resolution satellite data, such as soil moisture estimates or vegetation indices, thereby improving the accuracy of large-area satellite assessments [135]. 
This synergistic approach, often referred to as “model calibration,” is used in research areas like chlorophyll inversion, soil salinity inversion, and nitrogen content monitoring. Secondly, UAV data is increasingly used in downscaling or “upscaling” workflows [133,134]. Machine learning models are employed to fuse the high spatial detail from UAV imagery (covering smaller areas) with the broad spatial coverage of satellite data (at lower resolution) to generate high-resolution maps of drought-related variables (like soil moisture) over larger extents than a UAV could feasibly cover alone [134,135]. A study [133] has successfully demonstrated this integration, using machine learning algorithms such as XGBoost and Random Forest to combine UAV and satellite data, significantly improving the accuracy of satellite-based soil moisture inversion for agricultural monitoring. Despite their advantages, UAVs also have limitations, including restricted spatial coverage per flight compared to satellites, operational constraints due to regulations, and sensitivity to weather conditions during data acquisition [135]. Nonetheless, their ability to provide high-resolution, on-demand data makes them an indispensable tool in the multi-scale remote sensing toolkit, bridging the gap between field observations and satellite perspectives and enhancing the overall capability for precise drought monitoring and management [133].
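
The Crop Water Stress Index mentioned above has a simple empirical form that normalizes canopy temperature between a well-watered ("wet") and a non-transpiring ("dry") reference. The sketch below assumes those two reference temperatures are already known; how they are obtained (reference surfaces, energy-balance models, or image statistics) varies between studies, and the sample temperatures are hypothetical.

```python
def cwsi(canopy_temp, wet_ref_temp, dry_ref_temp):
    """Empirical Crop Water Stress Index, clipped to [0, 1].

    0 means the canopy is as cool as a fully transpiring reference;
    1 means it is as warm as a non-transpiring reference.
    """
    if dry_ref_temp <= wet_ref_temp:
        raise ValueError("dry reference must be warmer than wet reference")
    raw = (canopy_temp - wet_ref_temp) / (dry_ref_temp - wet_ref_temp)
    return min(1.0, max(0.0, raw))

# Hypothetical UAV thermal readings in degrees Celsius.
stress = cwsi(canopy_temp=30.0, wet_ref_temp=25.0, dry_ref_temp=35.0)
```

Applied per pixel to a thermal orthomosaic, the same function yields a stress map at the centimeter-scale resolution that UAV imagery provides.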

4.3. Foundational and Theoretical Frameworks

While this review concentrates on the recent advancements (post-2018) in convolutional neural networks (CNNs) and vegetation indices for hydrological drought detection, primarily documented in journal articles, it is essential to recognize the foundational literature, including monographs and book chapters, that underpins this contemporary research. Seminal works offer crucial historical context and theoretical frameworks. A key example is the monograph Remote Sensing of Drought: Innovative Monitoring Approaches, edited by Wardlow, Anderson, and Verdin (2012) [137]. As one of the first comprehensive volumes dedicated specifically to this topic, it consolidated knowledge previously dispersed across various disciplines. It covered critical components relevant to drought monitoring via remote sensing, such as vegetation health analysis using NDVI from AVHRR, evapotranspiration monitoring through thermal indicators, precipitation estimation using Artificial Neural Networks, and microwave remote sensing techniques for soil moisture. Referring to such foundational texts demonstrates a deeper understanding of the field’s historical trajectory. Moreover, the synthesis of knowledge continues in recent books, reflecting the modern integration of computational techniques. Publications like the second edition of the Remote Sensing Handbook [138] series (including Volume V covering Water Resources, Hydrology, Floods, Snow and Ice) highlight the maturation of the field and the growing synergy between remote sensing, AI, and machine learning. Acknowledging these broader syntheses provides a richer context for interpreting the specific, rapidly evolving journal literature analyzed in this review.

4.4. Limitations of the Review

This systematic review, despite its rigorous methodology, is subject to several inherent limitations that must be acknowledged to contextualize its findings. As with any review, there is a potential for publication bias, wherein studies reporting positive or statistically significant results are more likely to be published than those with null or negative findings. This could lead to an overestimation of the effectiveness of the reviewed technologies. The review was restricted to studies published in English for practical reasons [94]. This may have resulted in the exclusion of relevant research published in other languages, potentially omitting valuable insights from non-English speaking research communities. The narrative synthesis, while robust, relies on qualitative interpretation and cannot produce a single, pooled estimate of effect size. The finding that a particular architecture is more effective is based on a consistent pattern of reported superiority across multiple studies rather than a formal statistical test. The search strategy was designed to be highly specific to retrieve a manageable and highly relevant set of articles. However, this specificity means that studies using slightly different terminology or novel vegetation indices not included in the search string may have been missed. Finally, the rapid pace of innovation in this field means that new techniques are constantly emerging, so this review necessarily represents a snapshot of the literature up to the search date.

4.5. Implications for Future Research and Practical Applications

The findings of this review have profound implications for both the direction of future scientific inquiry and the development of practical, operational tools for water resource management. The demonstrated success of CNN-based methods, particularly spatiotemporal hybrids using fused, multi-modal data, provides a clear blueprint for building the next generation of drought early warning systems [34]. The ability to generate accurate, high-resolution drought forecasts with significant lead times can transform drought management from a reactive, crisis-driven exercise into a proactive, risk-based endeavor. However, bridging the gap from research to robust operational deployment requires addressing several key challenges. One of the most significant hurdles is model generalizability. Many of the reviewed studies are case studies focused on a single basin or region. A model trained on the unique climatic and ecological characteristics of one area may not perform well when transferred to another without significant recalibration. The relationships between vegetation response, climate variables, and hydrological drought are not universal. Therefore, a critical frontier for future research is the development of models that are more transferable and generalizable across diverse environments, perhaps by using techniques like domain adaptation or by training on more globally comprehensive datasets. A second major challenge is the “black box” problem of deep learning, a key limitation of the very technologies being reviewed and a major barrier to their operational adoption. The complex, multi-layered nature of CNNs makes it difficult to understand why a model made a particular prediction, pointing to a pressing need to integrate explainable AI (XAI) techniques into the development workflow. XAI is a set of methods that aim to make the decisions of complex models like CNNs understandable to humans.
For a water manager or policymaker to trust and act upon a model’s drought forecast, transparency and interpretability are paramount [118]. Specific, widely used XAI techniques include SHAP (SHapley Additive exPlanations), a game theory-based approach that assigns an importance value to each input feature for a given prediction. It can provide both global explanations (which features are most important overall) and local explanations (why a specific prediction was made). Another key method is LIME (Local Interpretable Model-agnostic Explanations), which explains individual predictions by creating a simpler, interpretable local model (e.g., a linear model) that approximates the behavior of the complex model in the vicinity of that prediction. These methods have demonstrated clear relevance in environmental science. For instance, one study used SHAP with machine learning models to identify that the Palmer Drought Severity Index (PDSI) and the 90-day Standardized Precipitation Index (SPI) were the most influential predictors of vegetation health (NDVI) in a river basin, providing clear, actionable insights into drought drivers. Positioning XAI as a solution to the “black box” problem is therefore essential for building the trust necessary for operational adoption (Figure 20) [80].
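
The game-theoretic idea behind SHAP can be made concrete with an exact Shapley computation on a very small model. The sketch below is not the SHAP library and does not reproduce any reviewed study: the toy model, its coefficients, the feature values, and the all-zero baseline are hypothetical. It averages each feature's marginal contribution over every ordering in which features can be switched from the baseline to their actual values.

```python
from itertools import permutations

def shapley_values(model, x, baseline):
    """Exact Shapley attribution by enumerating all feature orderings.

    Cost grows factorially with the number of features, so this is only
    feasible for a handful of inputs; practical tools rely on approximations.
    """
    n = len(x)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = model(current)
        for idx in order:
            current[idx] = x[idx]          # reveal this feature's true value
            updated = model(current)
            totals[idx] += updated - previous
            previous = updated
    return [total / len(orderings) for total in totals]

def toy_ndvi_model(features):
    """Hypothetical NDVI predictor with one interaction term."""
    pdsi, spi90, lst_anomaly = features
    return 0.4 + 0.05 * pdsi + 0.08 * spi90 + 0.02 * pdsi * spi90 - 0.03 * lst_anomaly

x = [-2.0, -1.5, 1.0]          # dry soil, rainfall deficit, warm surface
baseline = [0.0, 0.0, 0.0]     # climatologically normal conditions
phi = shapley_values(toy_ndvi_model, x, baseline)
```

The attributions sum to the difference between the prediction for `x` and the prediction at the baseline, which is the property that lets a forecast be decomposed into per-driver contributions.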

5. Conclusions

This systematic review synthesized the state-of-the-art in CNN-based hydrological drought detection, charting a paradigm shift from static, single-indicator monitoring to dynamic, multi-modal forecasting. In direct response to RQ1, the analysis confirms that hybrid spatiotemporal architectures (e.g., CNN-LSTM and ConvLSTM) are the most effective for forecasting, as they align with the physical, time-dependent nature of drought propagation. However, this effectiveness is critically dependent on large, high-quality time-series datasets, without which they are prone to overfitting. For static spatial classification tasks, simpler 2D CNNs often remain a more robust and computationally efficient baseline. In response to RQ2, the review finds no single superior vegetation index. The most robust models rely on multi-modal data fusion, typically combining a greenness index (like NDVI or EVI), a water content index (NDWI), and a thermal index (LST). This “convergence of evidence” approach yields a more comprehensive assessment. The primary source of uncertainty lies in this data selection: NDVI, for example, is known to saturate in high-biomass regions (where EVI is superior), while SAVI is required to mitigate soil brightness noise in arid lands. The integration of GEE was identified as a key enabler for processing these large, multi-modal datasets. This synthesis identifies several crucial future research gaps and a clear evolutionary path for the field. The progression has moved from (1) traditional machine learning (e.g., RF and SVM), reliant on handcrafted features, to (2) CNNs enabling automated spatial feature learning, and (3) hybrid CNN-LSTMs integrating temporal modeling. The next frontiers, detailed below, aim to address remaining challenges in capability, generalizability, and interpretability.

5.1. Beyond Convolutions: The Potential of Vision Transformers (ViTs) for Hydrological Modeling

While CNN-LSTMs are the current state of the art, the next wave of architectures is emerging. Transformer architectures, which revolutionized natural language processing (NLP), are now being applied successfully to computer vision tasks in the form of Vision Transformers (ViTs) and are gaining traction in the remote sensing community. Their key advantage over CNNs lies in how spatial context is modeled: CNNs use fixed-size kernels to learn local spatial features, which inherently limits their ability to model long-range dependencies. ViTs, by contrast, use a self-attention mechanism that allows the model to weigh the importance of all parts of an image relative to one another simultaneously. This enables ViTs to capture global context more effectively, a significant advantage for modeling large-scale, interconnected environmental phenomena like drought. Emerging research has shown ViTs achieving state-of-the-art performance in remote sensing tasks such as land cover classification, object detection, and even crop yield forecasting by fusing environmental and remote sensing data. ViTs also bring challenges, most notably their typically larger training data requirements compared with CNNs. Taken together, these strengths and limitations position ViTs as a key frontier for future research in hydrological drought modeling.
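The global-context property of self-attention can be seen in a minimal sketch of single-head scaled dot-product attention over image patches. To keep it short, the example uses identity query, key, and value projections and omits positional embeddings and multiple heads, all of which a real ViT would learn or include. The four patch embeddings are hypothetical values.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Simplification: queries, keys, and values are the tokens themselves
    (identity projections), so there are no learned weights.
    Returns the attended outputs and the attention weight matrix.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    weights = []
    for query in tokens:
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in tokens]
        weights.append(softmax(scores))
    outputs = [
        [sum(w * value[c] for w, value in zip(row, tokens)) for c in range(dim)]
        for row in weights
    ]
    return outputs, weights


# Hypothetical embeddings for four image patches (e.g., four sub-basins)
patches = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
    [0.1, 0.9, 0.3],
    [0.0, 0.2, 0.9],
]

outputs, weights = self_attention(patches)
for i, row in enumerate(weights):
    print(f"patch {i} attends to:", [round(w, 3) for w in row])
```

Every row of `weights` assigns a non-zero weight to every patch, however far apart the patches are in the image. A convolution with a fixed kernel, by contrast, only mixes information from neighbouring pixels within a single layer.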

5.2. Other Key Research Frontiers

Beyond new architectures, future work must prioritize model generalizability to move beyond single-region case studies, likely through domain adaptation. Critically, integrating explainable AI (XAI) techniques, as discussed in Section 4.5, is essential to overcome the “black box” problem and build the operational trust required by water managers. The field would also benefit greatly from standardized, publicly available benchmark datasets that enable more direct and meaningful model comparisons. Finally, a promising frontier is the development of physics-informed neural networks, which merge the pattern-recognition strengths of deep learning with the physical constraints embedded in hydrological models.
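One way to read the physics-informed idea is as a loss function that penalizes predictions violating a known physical relationship. The sketch below is an assumption-laden illustration rather than a method from the reviewed studies: it uses a simple catchment water balance (storage change = precipitation − evapotranspiration − runoff) as the constraint, treats runoff as the predicted quantity, and uses a hypothetical weight of 0.1 for the physics term. Physics-informed networks in the stricter sense usually penalize residuals of differential equations computed through automatic differentiation.

```python
def water_balance_residual(storage_change, precipitation, evapotranspiration, runoff):
    """Residual of dS = P - ET - Q; zero when the balance closes exactly."""
    return storage_change - (precipitation - evapotranspiration - runoff)


def physics_informed_loss(predicted_runoff, observed_runoff, storage_change,
                          precipitation, evapotranspiration, physics_weight=0.1):
    """Mean squared data error plus a weighted water-balance penalty."""
    n = len(predicted_runoff)
    data_term = sum(
        (p - o) ** 2 for p, o in zip(predicted_runoff, observed_runoff)
    ) / n
    physics_term = sum(
        water_balance_residual(ds, p, et, q) ** 2
        for ds, p, et, q in zip(storage_change, precipitation,
                                evapotranspiration, predicted_runoff)
    ) / n
    return data_term + physics_weight * physics_term


# Hypothetical values for two time steps (all in mm)
observed = [2.0, 1.0]
precipitation = [5.0, 3.0]
evapotranspiration = [2.0, 1.5]
storage_change = [1.0, 0.5]

consistent = [2.0, 1.0]      # closes the water balance at both steps
inconsistent = [3.0, 0.0]    # same mean runoff, but violates the balance

loss_consistent = physics_informed_loss(
    consistent, observed, storage_change, precipitation, evapotranspiration)
loss_inconsistent = physics_informed_loss(
    inconsistent, observed, storage_change, precipitation, evapotranspiration)
print(loss_consistent, loss_inconsistent)
```

The physics term acts as a soft constraint: it steers training toward hydrologically plausible outputs even where observations are sparse, and the weight controls how strongly plausibility is traded against fit to the data.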

Author Contributions

Conceptualization, O.A. and M.S. (Malusi Sibiya); methodology, O.A.; software, O.A.; validation, M.S. (Malusi Sibiya), M.I. and M.S. (Mbuyu Sumbwanyambe); formal analysis, O.A.; investigation, O.A.; resources, O.A.; data curation, O.A.; writing—original draft preparation, O.A.; writing—review and editing, O.A.; visualization, O.A.; supervision, M.S. (Malusi Sibiya), M.I. and M.S. (Mbuyu Sumbwanyambe). All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

No new data were created or analyzed in this study.

Acknowledgments

The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest.

References

West, H.; Quinn, N.; Horswell, M. Remote sensing for drought monitoring & impact assessment: Progress, past challenges and future opportunities. Remote Sens. Environ. 2019, 232, 111291. [Google Scholar] [CrossRef]
Van Loon, A.F.; Kchouk, S.; Matanó, A.; Tootoonchi, F.; Alvarez-Garreton, C.; Hassaballah, K.E.A.; Wu, M.; Wens, M.L.K.; Shyrokaya, A.; Ridolfi, E.; et al. Review article: Drought as a continuum—Memory effects in interlinked hydrological, ecological, and social systems. Nat. Hazards Earth Syst. Sci. 2024, 24, 3173–3205. [Google Scholar] [CrossRef]
Varghese, D.; Radulović, M.; Stojković, S.; Crnojević, V. Reviewing the Potential of Sentinel-2 in Assessing the Drought. Remote Sens. 2021, 13, 3355. [Google Scholar] [CrossRef]
Gyaneshwar, A.; Mishra, A.; Chadha, U.; Raj Vincent, P.M.D.; Rajinikanth, V.; Pattukandan Ganapathy, G.; Srinivasan, K. A Contemporary Review on Deep Learning Models for Drought Prediction. Sustainability 2023, 15, 6160. [Google Scholar] [CrossRef]
Noh, G.-H.; Ahn, K.-H. Enhancing Multiple Precipitation Data Integration Across a Large-Scale Area: A Deep Learning ResU-Net Framework Without Interpolation. IEEE Trans. Geosci. Remote Sens. 2025, 63, 1–16. [Google Scholar] [CrossRef]
Duan, X.; Aslam, R.W.; Naqvi, S.A.A.; Kucher, D.E.; Afzal, Z.; Raza, D. Multi-index Assessment and Machine Learning Integration for Drought Monitoring in Yunnan, China, Using Google Earth Engine. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2025, 18, 16479–16516. [Google Scholar] [CrossRef]
Qin, Q.; Wu, Z.; Zhang, T.; Sagan, V.; Zhang, Z.; Zhang, Y.; Zhang, C.; Ren, H.; Sun, Y.; Xu, W.; et al. Optical and Thermal Remote Sensing for Monitoring Agricultural Drought. Remote Sens. 2021, 13, 5092. [Google Scholar] [CrossRef]
Khan, R.; Gilani, H. Global drought monitoring with big geospatial datasets using Google Earth Engine. Environ. Sci. Pollut. Res. 2021, 28, 17244–17264. [Google Scholar] [CrossRef] [PubMed]
OECD. Global Drought Outlook: Trends, Impacts and Policies to Adapt to a Drier World. 2025. Available online: https://www.oecd.org/en/publications/global-drought-outlook_d492583a-en.html (accessed on 15 October 2025).
United Nations. The United Nations World Water Development Report 2024: Water for Prosperity and Peace. 2024. Available online: https://unesdoc.unesco.org/ark:/48223/pf0000388948 (accessed on 15 October 2025).
Chaudhari, S.; Sardar, V.; Rahul, D.S.; Chandan, M.; Shivakale, M.S.; Harini, K.R. Performance Analysis of CNN, AlexNet and VGGNet Models for Drought Prediction using Satellite Images. In Proceedings of the 2021 Asian Conference on Innovation in Technology (ASIANCON), Pune, India, 27–29 August 2021; pp. 1–6. [Google Scholar] [CrossRef]
Hao, R.; Yan, H.; Chiang, Y.-M. Forecasting the Propagation from Meteorological to Hydrological and Agricultural Drought in the Huaihe River Basin with Machine Learning Methods. Remote Sens. 2023, 15, 5524. [Google Scholar] [CrossRef]
Alahacoon, N.; Edirisinghe, M. A comprehensive assessment of remote sensing and traditional based drought monitoring indices at global and regional scale. Geomat. Nat. Hazards Risk 2022, 13, 762–799. [Google Scholar] [CrossRef]
Alemu, M.G.; Zimale, F.A. Integration of remote sensing and machine learning algorithm for agricultural drought early warning over Genale Dawa river basin, Ethiopia. Environ. Monit. Assess. 2025, 197, 243. [Google Scholar] [CrossRef]
Rahnama, S.; Shahidi, A.; Yaghoobzadeh, M.; Mehran, A.A. Comparison of different drought monitoring indices in different climatic conditions in Iran. Atmósfera 2024, 38, 507–529. [Google Scholar] [CrossRef]
Al Nadabi, M.S.; D’Antonio, P.; Fiorentino, C.; Scopa, A.; Shams, E.M.; Fadl, M.E. Utilizing the Google Earth Engine for Agricultural Drought Conditions and Hazard Assessment Using Drought Indices in the Najd Region, Sultanate of Oman. Remote Sens. 2024, 16, 2960. [Google Scholar] [CrossRef]
Afan, H.A.; Almawla, A.S.; Al-Hadeethi, B.; Khaleel, F.; AbdUlameer, A.H.; Khan, M.M.H.; Ma’arof, M.I.N.; Kamel, A.H. LSTM Model Integrated Remote Sensing Data for Drought Prediction: A Study on Climate Change Impacts on Water Availability in the Arid Region. Water 2024, 16, 2799. [Google Scholar] [CrossRef]
Sseguya, F.; Jun, K.-S. Drought Quantification in Africa Using Remote Sensing, Gaussian Kernel, and Machine Learning. Water 2024, 16, 2656. [Google Scholar] [CrossRef]
Wan, W.; Zhou, Y. Deep hybridnet for drought prediction based on large-scale climate indices and local meteorological conditions. Stoch. Environ. Res. Risk Assess. 2024, 39, 4725–4747. [Google Scholar] [CrossRef]
Çoban, Ö.; Eşit, M.; Yalçın, S. ML-DPIE: Comparative evaluation of machine learning methods for drought parameter index estimation: A case study of Türkiye. Nat. Hazards 2023, 120, 989–1021. [Google Scholar] [CrossRef]
Wieland, M.; Martinis, S. Large-scale surface water change observed by Sentinel-2 during the 2018 drought in Germany. Int. J. Remote Sens. 2020, 41, 4742–4756. [Google Scholar] [CrossRef]
Sardar, V.S.; Yindumathi, K.M.; Chaudhari, S.S.; Ghosh, P. Convolution Neural Network-based Agriculture Drought Prediction using Satellite Images. In Proceedings of the 2021 IEEE Mysore Sub Section International Conference (MysuruCon), Hassan, India, 24–25 October 2021; pp. 601–607. [Google Scholar] [CrossRef]
Zhang, Z.; Xu, W.; Shi, Z.; Qin, Q. Establishment of a Comprehensive Drought Monitoring Index Based on Multisource Remote Sensing Data and Agricultural Drought Monitoring. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 2113–2126. [Google Scholar] [CrossRef]
Hegazi, E.H.; Samak, A.A.; Yang, L.; Huang, R.; Huang, J. Prediction of Soil Moisture Content from Sentinel-2 Images Using Convolutional Neural Network (CNN). Agronomy 2023, 13, 656. [Google Scholar] [CrossRef]
Xu, Z.; Sun, H.; Zhang, T.; Xu, H.; Wu, D.; Gao, J. Evaluating established deep learning methods in constructing integrated remote sensing drought index: A case study in China. Agric. Water Manag. 2023, 286, 108405. [Google Scholar] [CrossRef]
Xiao, X.; Ming, W.; Luo, X.; Yang, L.; Li, M.; Yang, P.; Ji, X.; Li, Y. Leveraging multisource data for accurate agricultural drought monitoring: A hybrid deep learning model. Agric. Water Manag. 2024, 293, 108692. [Google Scholar] [CrossRef]
Seo, J.Y.; Lee, S.-I. Probabilistic Evaluation of Drought Propagation Using Satellite Data and Deep Learning Model: From Precipitation to Soil Moisture and Groundwater. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2023, 16, 6048–6061. [Google Scholar] [CrossRef]
Behfar, N.; Sharghi, E.; Nourani, V.; Booij, M.J. Drought index downscaling using AI-based ensemble technique and satellite data. Theor. Appl. Climatol. 2024, 155, 2379–2397. [Google Scholar] [CrossRef]
Saini, P.; Nagpal, B. Spatiotemporal Landsat-Sentinel-2 satellite imagery-based Hybrid Deep Neural network for paddy crop prediction using Google Earth engine. Adv. Space Res. 2024, 73, 4988–5004. [Google Scholar] [CrossRef]
Zhu, X.; Li, Q.; Guo, C. Evaluation of the monitoring capability of various vegetation indices and mainstream satellite band settings for grassland drought. Ecol. Inform. 2024, 82, 102717. [Google Scholar] [CrossRef]
Nandgude, N.; Singh, T.P.; Nandgude, S.; Tiwari, M. Drought Prediction: A Comprehensive Review of Different Drought Prediction Models and Adopted Technologies. Sustainability 2023, 15, 11684. [Google Scholar] [CrossRef]
Fathi, M.; Shah-Hosseini, R.; Moghimi, A. 3D-ResNet-BiLSTM Model: A Deep Learning Model for County-Level Soybean Yield Prediction with Time-Series Sentinel-1, Sentinel-2 Imagery, and Daymet Data. Remote Sens. 2023, 15, 5551. [Google Scholar] [CrossRef]
Kumar, V.; Sharma, K.V.; Pham, Q.B.; Srivastava, A.K.; Bogireddy, C.; Yadav, S.M. Advancements in drought using remote sensing: Assessing progress, overcoming challenges, and exploring future opportunities. Theor. Appl. Climatol. 2024, 155, 4251–4288. [Google Scholar] [CrossRef]
Qazvini, A.T.; Carrion, D. A Spatiotemporal Drought Analysis Application Implemented in the Google Earth Engine and Applied to Iran as a Case Study. Remote Sens. 2023, 15, 2218. [Google Scholar] [CrossRef]
Sun, Y.; Lao, D.; Ruan, Y.; Huang, C.; Xin, Q. A Deep Learning-Based Approach to Predict Large-Scale Dynamics of Normalized Difference Vegetation Index for the Monitoring of Vegetation Activities and Stresses Using Meteorological Data. Sustainability 2023, 15, 6632. [Google Scholar] [CrossRef]
Maniyar, C.B.; Yan, X.; Mai, G.; Srivastava, D.; Samiappan, S.; Oliazadeh, A.; Kumar, A.; Kumar, M.; Mishra, D.R. Artificial intelligence in environmental remote sensing: Progress, way forward and key considerations. Prog. Environ. Geogr. 2025, 4, 363–389. [Google Scholar] [CrossRef]
Vawda, M.I.; Lottering, R.; Mutanga, O.; Peerbhay, K.; Sibanda, M. Comparing the Utility of Artificial Neural Networks (ANN) and Convolutional Neural Networks (CNN) on Sentinel-2 MSI to Estimate Dry Season Aboveground Grass Biomass. Sustainability 2024, 16, 1051. [Google Scholar] [CrossRef]
Pokhariyal, S.; Patel, N.R.; Govind, A. Machine Learning-Driven Remote Sensing Applications for Agriculture in India—A Systematic Review. Agronomy 2023, 13, 2302. [Google Scholar] [CrossRef]
Barbosa, H.A.; Buriti, C.O.; Kumar, T.V.L. Deep Learning for Flash Drought Detection: A Case Study in Northeastern Brazil. Atmosphere 2024, 15, 761. [Google Scholar] [CrossRef]
Chen, Z.; Wang, G.; Wei, X.; Liu, Y.; Duan, Z.; Hu, Y.; Jiang, H. Basin-Scale Daily Drought Prediction Using Convolutional Neural Networks in Fenhe River Basin, China. Atmosphere 2024, 15, 155. [Google Scholar] [CrossRef]
Ruiz-Lendínez, J.J.; Ariza-López, F.J.; Reinoso-Gordo, J.F.; Ureña-Cámara, M.A.; Quesada-Real, F.J. Deep learning methods applied to digital elevation models: State of the art. Geocarto Int. 2023, 38, 2252389. [Google Scholar] [CrossRef]
Nigar, A.; Li, Y.; Muhammad, Y.J.B.; Alrefaei, A.F.; Almutairi, M.H. Comparison of machine and deep learning algorithms using Google Earth Engine and Python for land classifications. Front. Environ. Sci. 2024, 12, 1378443. [Google Scholar] [CrossRef]
Kan, J.; Ferreira, C.S.S.; Destouni, G.; Haozhi, P.; Passos, M.V.; Barquet, K.; Kalantari, Z. Predicting agricultural drought indicators: ML approaches across wide-ranging climate and land use conditions. Ecol. Indic. 2023, 154, 110524. [Google Scholar] [CrossRef]
Mainali, K.; Evans, M.; Saavedra, D.; Mills, E.; Madsen, B.; Minnemeyer, S. Convolutional neural network for high-resolution wetland mapping with open data: Variable selection and the challenges of a generalizable model. Sci. Total Environ. 2022, 861, 160622. [Google Scholar] [CrossRef]
Iilonga, S.N.; Ajayi, O.G. Implementation of deep learning algorithms to model agricultural drought towards sustainable land management in Namibia’s Omusati region. Land Use Policy 2025, 156, 107593. [Google Scholar] [CrossRef]
Zhang, Q.; Li, Y.P.; Huang, G.H.; Wang, H.; Li, Y.F.; Shen, Z.Y. Multivariate time series convolutional neural networks for long-term agricultural drought prediction under global warming. Agric. Water Manag. 2024, 292, 108683. [Google Scholar] [CrossRef]
Peng, M.; Liu, Y.; Khan, A.; Ahmed, B.; Sarker, S.K.; Ghadi, Y.Y.; Bhatti, U.A.; Al-Razgan, M.; Ali, Y.A. Crop Monitoring using remote sensing land use and land change data: Comparative analysis of deep learning methods using pre-trained CNN models. Big Data Res. 2024, 36, 100448. [Google Scholar] [CrossRef]
Kamilaris, A.; Prenafeta-Boldú, F.X. A review of the use of convolutional neural networks in agriculture. J. Agric. Sci. 2018, 156, 312–322. [Google Scholar] [CrossRef]
Li, Y.; Liu, C.; Zhao, W.; Huang, Y. Multi-spectral remote sensing images feature coverage classification based on improved convolutional neural network. Math. Biosci. Eng. 2020, 17, 4443–4456. [Google Scholar] [CrossRef]
Wu, J.; Li, X.; Shi, Z.; Li, S.; Hou, K.; Bai, T. Research on Walnut (Juglans regia L.) Classification Based on Convolutional Neural Networks and Landsat-8 Remote Sensing Imagery. Forests 2024, 15, 165. [Google Scholar] [CrossRef]
Long, J.; Xu, C.; Wang, Y.; Zhang, J. From meteorological to agricultural drought: Propagation time and influencing factors over diverse underlying surfaces based on CNN-LSTM model. Ecol. Inform. 2024, 82, 102681. [Google Scholar] [CrossRef]
Bodhale, A.; Verma, S.; Panthakkan, A. Comparative Analysis of Fine Tuned and Transfer Learning Model for Plant Disease Detection. In Proceedings of the 2022 5th International Conference on Signal Processing and Information Security (ICSPIS), Dubai, United Arab Emirates, 7–8 December 2022; pp. 65–69. [Google Scholar] [CrossRef]
Ghaznavi, A.; Saberioon, M.; Brom, J.; Itzerott, S. Comparative performance analysis of simple U-Net, residual attention U-Net, and VGG16-U-Net for inventory inland water bodies. Appl. Comput. Geosci. 2024, 21, 100150. [Google Scholar] [CrossRef]
Kattenborn, T.; Leitloff, J.; Schiefer, F.; Hinz, S. Review on Convolutional Neural Networks (CNN) in vegetation remote sensing. ISPRS J. Photogramm. Remote Sens. 2021, 173, 24–49. [Google Scholar] [CrossRef]
Mokhtar, A.; Jalali, M.; He, H.; Al-Ansari, N.; Elbeltagi, A.; Alsafad, K. Estimation of SPEI Meteorological Drought Using Machine Learning Algorithms. IEEE Access 2021, 9, 65503–65523. [Google Scholar] [CrossRef]
Zhang, B.; Salem, F.K.A.; Hayes, M.J.; Smith, K.H.; Tadesse, T.; Wardlow, B.D. Explainable machine learning for the prediction and assessment of complex drought impacts. Sci. Total Environ. 2023, 898, 165509. [Google Scholar] [CrossRef]
Wang, Y.; Cui, J.; Miao, B.; Li, Z.; Wang, Y.; Jia, C.; Liang, C. Evaluating Performance of Multiple Machine Learning Models for Drought Monitoring: A Case Study of Typical Grassland in Inner Mongolia. Land 2024, 13, 754. [Google Scholar] [CrossRef]
Sakka, M.E.; Ivanovici, M.; Chaari, L.; Mothe, J. A Review of CNN Applications in Smart Agriculture Using Multimodal Data. Sensors 2025, 25, 472. [Google Scholar] [CrossRef] [PubMed]
Varela, S.; Zheng, X.; Njuguna, J.N.; Sacks, E.J.; Allen, D.P.; Ruhter, J.; Leakey, A.D.B. Deep Convolutional Neural Networks Exploit High-Spatial- and -Temporal-Resolution Aerial Imagery to Phenotype Key Traits in Miscanthus. Remote Sens. 2022, 14, 5333. [Google Scholar] [CrossRef]
Deneu, B.; Servajean, M.; Bonnet, P.; Botella, C.; Munoz, F.; Joly, A. Convolutional neural networks improve species distribution modelling by capturing the spatial structure of the environment. PLoS Comput. Biol. 2021, 17, e1008856. [Google Scholar] [CrossRef]
Pang, S.; Sun, L.; Tian, Y.; Ma, Y.; Wei, J. Convolutional Neural Network-Driven Improvements in Global Cloud Detection for Landsat 8 and Transfer Learning on Sentinel-2 Imagery. Remote Sens. 2023, 15, 1706. [Google Scholar] [CrossRef]
Rezaie, F.; Panahi, M.; Jun, C.; Dayal, K.; Kim, D.; Darabi, H.; Kalantari, Z.; Seifollahi-Aghmiuni, S.; Deo, R.C.; Baten, S.M. Deep learning models for drought susceptibility mapping in Southeast Queensland, Australia. Stoch. Environ. Res. Risk Assess. 2025, 39, 4849–4865. [Google Scholar] [CrossRef]
Balti, H.; Abbes, A.B.; Sang, Y.; Mellouli, N.; Farah, I.R. Spatio-temporal heterogeneous graph using multivariate earth observation time series: Application for drought forecasting. Comput. Geosci. 2023, 179, 105435. [Google Scholar] [CrossRef]
Vijayakumar, S.; Saravanakumar, R.; Arulanandam, M.; Ilakkiya, S. Google Earth Engine: Empowering developing countries with large-scale geospatial data analysis—A comprehensive review. Arab. J. Geosci. 2024, 17, 139. [Google Scholar] [CrossRef]
Li, J.; Li, Y.; Yin, L.; Zhao, Q. A novel composite drought index combining precipitation, temperature and evapotranspiration used for drought monitoring in the Huang-Huai-Hai Plain. Agric. Water Manag. 2024, 291, 108626. [Google Scholar] [CrossRef]
Yang, L.; Driscol, J.; Sarigai, S.; Wu, Q.; Chen, H.; Lippitt, C.D. Google Earth Engine and Artificial Intelligence (AI): A Comprehensive Review. Remote Sens. 2022, 14, 3253. [Google Scholar] [CrossRef]
Cao, J.; Zhang, Z.; Luo, Y.; Zhang, L.; Zhang, J.; Li, Z.; Tao, F. Wheat yield predictions at a county and field scale with deep learning, machine learning, and google earth engine. Eur. J. Agron. 2021, 123, 126204. [Google Scholar] [CrossRef]
Zhou, S.; Xu, L.; Chen, N. Rice Yield Prediction in Hubei Province Based on Deep Learning and the Effect of Spatial Heterogeneity. Remote Sens. 2023, 15, 1361. [Google Scholar] [CrossRef]
Kazemi Garajeh, M.; Abdoli, N.; Seyedebrahimi, E.; Naboureh, A.; Kurdpour, I.; Bakhshi Lomer, A.R.; Sadeqi, A.; Mirzaei, S. Impact of Long-Term Drought on Surface Water and Water Balance Variations in Iran: Insights from Highland and Lowland Regions. Remote Sens. 2024, 16, 3636. [Google Scholar] [CrossRef]
Amani, M.; Ghorbanian, A.; Ahmadi, S.A.; Kakooei, M.; Moghimi, A.; Mirmazloumi, S.M. Google Earth Engine Cloud Computing Platform for Remote Sensing Big Data Applications: A Comprehensive Review. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 5326–5350. [Google Scholar] [CrossRef]
Tamiminia, H.; Salehi, B.; Mahdianpari, M.; Quackenbush, L.; Adeli, S.; Brisco, B. Google Earth Engine for geo-big data applications: A meta-analysis and systematic review. ISPRS J. Photogramm. Remote Sens. 2020, 164, 152–170. [Google Scholar] [CrossRef]
Kumar, L.; Mutanga, O. Google Earth Engine Applications Since Inception: Usage, Trends, and Potential. Remote Sens. 2018, 10, 1509. [Google Scholar] [CrossRef]
Sazib, N.; Mladenova, I.; Bolten, J. Leveraging the Google Earth Engine for Drought Assessment Using Global Soil Moisture Data. Remote Sens. 2018, 10, 1265. [Google Scholar] [CrossRef]
Prodhan, F.A.; Zhang, J.; Hasan, S.S.; Sharma, T.P.P.; Mohana, H.P. A review of machine learning methods for drought hazard monitoring and forecasting: Current research trends, challenges, and future research directions. Environ. Model. Softw. 2022, 149, 105327. [Google Scholar] [CrossRef]
Márquez-Grajales, A.; Villegas-Vega, R.; Salas-Martínez, F.; Acosta-Mesa, H.-G.; Mezura-Montes, E. Characterizing drought prediction with deep learning: A literature review. MethodsX 2024, 13, 102800. [Google Scholar] [CrossRef]
Nyamane, S.; Mohamed, A.M.A.E.; Obagbuwa, I.C. Harnessing Deep Learning for Meteorological Drought Forecasts in the Northern Cape, South Africa. Int. J. Intell. Syst. 2024, 2024, 7562587. [Google Scholar] [CrossRef]
Zhou, J.; Fan, Y.; Guan, Q.; Feng, G. Research on Drought Monitoring Based on Deep Learning: A Case Study of the Huang-Huai-Hai Region in China. Land 2024, 13, 615. [Google Scholar] [CrossRef]
Huang, S.; Tang, L.; Hupy, J.P.; Wang, Y.; Shao, G. A commentary review on the use of normalized difference vegetation index (NDVI) in the era of popular remote sensing. J. For. Res. 2020, 32, 1–6. [Google Scholar] [CrossRef]
Ferchichi, A.; Abbes, A.B.; Barra, V.; Farah, I.R. Forecasting vegetation indices from spatio-temporal remotely sensed data using deep learning-based approaches: A systematic literature review. Ecol. Inform. 2022, 68, 101552. [Google Scholar] [CrossRef]
Dikshit, A.; Pradhan, B. Interpretable and explainable AI (XAI) model for spatial drought prediction. Sci. Total Environ. 2021, 801, 149797. [Google Scholar] [CrossRef] [PubMed]
Zhang, L.; Li, C.; Wu, X.; Xiang, H.; Jiao, Y.; Chai, H. BO-CNN-BiLSTM deep learning model integrating multisource remote sensing data for improving winter wheat yield estimation. Front. Plant Sci. 2024, 15, 1500499. [Google Scholar] [CrossRef]
Ferchichi, A.; Chihaoui, M.; Ferchichi, A. Spatio-temporal modeling of climate change impacts on drought forecast using Generative Adversarial Network: A case study in Africa. Expert Syst. Appl. 2024, 238, 122211. [Google Scholar] [CrossRef]
Bojer, A.K.; Biru, B.H.; Al-Quraishi, A.M.F.; Debelee, T.G.; Negera, W.G.; Woldesillasie, F.F.; Esubalew, S.Z. Machine learning and remote sensing based time series analysis for drought risk prediction in Borena Zone, Southwest Ethiopia. J. Arid Environ. 2024, 222, 105160. [Google Scholar] [CrossRef]
Elbeltagi, A.; Srivastava, A.; Ehsan, M.; Sharma, G.; Yu, J.; Khadke, L.; Gautam, V.K.; Awad, A.; Jinsong, D. Advanced stacked integration method for forecasting long-term drought severity: CNN with machine learning models. J. Hydrol. Reg. Stud. 2024, 53, 101759. [Google Scholar] [CrossRef]
Houmma, I.H.; El Mansouri, L.; Gadal, S.; Garba, M.; Hadria, R. Modelling agricultural drought: A review of latest advances in big data technologies. Geomat. Nat. Hazards Risk 2022, 13, 2737–2776. [Google Scholar] [CrossRef]
Sardar, V.; Chaudhari, S.; Anchalia, A.; Kakati, A.; Paudel, A.; Bhavana, B.N. Ensemble Learning with CNN and BMO for Drought Prediction. In Proceedings of the 2022 IEEE 3rd Global Conference for Advancement in Technology (GCAT), Bangalore, India, 7–9 October 2022; pp. 1–6. [Google Scholar] [CrossRef]
Kartal, S.; Iban, M.C.; Sekertekin, A. Next-level vegetation health index forecasting: A ConvLSTM study using MODIS Time Series. Environ. Sci. Pollut. Res. 2024, 31, 18932–18948. [Google Scholar] [CrossRef]
Poptani, A.; Lokhande, S.; Pandya, R.J.; Iyer, S. Decoding Drought: Embracing Simplicity in Effective Predictive Models. In Proceedings of the 2023 IEEE Asia-Pacific Conference on Geoscience, Electronics and Remote Sensing Technology (AGERS), Surabaya, Indonesia, 19–20 December 2023; pp. 124–131. [Google Scholar] [CrossRef]
Devkota, K.P.; Bouasria, A.; Devkota, M.; Nangia, V. Predicting wheat yield gap and its determinants combining remote sensing, machine learning, and survey approaches in rainfed Mediterranean regions of Morocco. Eur. J. Agron. 2024, 158, 127195. [Google Scholar] [CrossRef]
Xuan, F.; Liu, H.; Xue, J.; Li, Y.; Liu, J.; Huang, X.; Tan, Z.; Abd-Elbasit, M.A.M.; Gu, X.; Su, W. The novel triangular spectral indices for characterizing winter wheat drought. Int. J. Appl. Earth Obs. Geoinf. 2024, 134, 104151. [Google Scholar] [CrossRef]
Liu, L.; Yang, X.; Zhou, H.; Liu, S.; Zhou, L.; Li, X.; Yang, J.; Han, X.; Wu, J. Evaluating the utility of solar-induced chlorophyll fluorescence for drought monitoring by comparison with NDVI derived from wheat canopy. Sci. Total Environ. 2018, 625, 1208–1217. [Google Scholar] [CrossRef] [PubMed]
Mukhawana, M.B.; Kanyerere, T.; Kahler, D. Review of In-Situ and Remote Sensing-Based Indices and Their Applicability for Integrated Drought Monitoring in South Africa. Water 2023, 15, 240. [Google Scholar] [CrossRef]
Wei, W.; Zhang, J.; Zhou, L.; Xie, B.; Zhou, J.; Li, C. Comparative evaluation of drought indices for monitoring drought based on remote sensing data. Environ. Sci. Pollut. Res. 2021, 28, 20408–20425. [Google Scholar] [CrossRef] [PubMed]
Jariwala, K.A.; Agnihotri, P.G. Comparative Analysis of Drought Modeling and Forecasting Using Soft Computing Techniques. Water Resour. Manag. 2023, 37, 6051–6070. [Google Scholar] [CrossRef]
Anshuka, A.; van Ogtrop, F.F.; Vervoort, R.W. Drought forecasting through statistical models using standardised precipitation index: A systematic review and meta-regression analysis. Nat. Hazards 2019, 97, 955–977. [Google Scholar] [CrossRef]
PRISMA. Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA). 2020. Available online: https://www.prisma-statement.org/ (accessed on 23 October 2025).
Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. Br. Med. J. 2021, 372, n71. [Google Scholar] [CrossRef]
Amir-Behghadami, M.; Janati, A. Population, Intervention, Comparison, Outcomes and Study (PICOS) Design as a Framework to Formulate Eligibility Criteria in Systematic Reviews. Emerg. Med. J. 2020, 37, 387. [Google Scholar] [CrossRef]
Joanna Briggs Institute. Checklist for Qualitative Research. 2017. Available online: https://jbi.global/sites/default/files/2019-05/JBI_Critical_Appraisal-Checklist_for_Qualitative_Research2017_0.pdf (accessed on 23 August 2025).
Menon, J.M.L.; Struijs, F.; Whaley, P. The methodological rigour of systematic reviews in environmental health. Crit. Rev. Toxicol. 2022, 52, 167–187. [Google Scholar] [CrossRef]
Crespo, N.; Pádua, L.; Santos, J.A.; Fraga, H. Satellite Remote Sensing Tools for Drought Assessment in Vineyards and Olive Orchards: A Systematic Review. Remote Sens. 2024, 16, 2040. [Google Scholar] [CrossRef]
Nabizada, M.J.; Köylü, Ü.; Rousta, I. Assessing spatiotemporal dynamics of meteorological and agricultural drought in the North Basin of Afghanistan using multiple remote sensing-based drought indices. Int. J. River Basin Manag. 2025, 1–27. [Google Scholar] [CrossRef]
Xu, Z.; Sun, H.; Zhang, T.; Xu, H.; Wu, D.; Gao, J. The high spatial resolution Drought Response Index (HiDRI): An integrated framework for monitoring vegetation drought with remote sensing, deep learning, and spatiotemporal fusion. Remote Sens. Environ. 2024, 312, 114324. [Google Scholar] [CrossRef]
Zhang, Y.; Xie, D.; Tian, W.; Zhao, H.; Geng, S.; Lu, H.; Ma, G.; Huang, J.; Choy Lim Kam Sian, K.T. Construction of an Integrated Drought Monitoring Model Based on Deep Learning Algorithms. Remote Sens. 2023, 15, 667. [Google Scholar] [CrossRef]
Chaudhari, S.; Anchalia, A.; Kakati, A.; Paudel, A.; Bhavana, B.N.; Sardar, V. A Bio-inspired and Deep Learning Based Hybrid Model for Agricultural Drought Assessment. J. Water Manag. Model. 2024, 32. [Google Scholar] [CrossRef]
Foroumandi, E.; Gavahi, K.; Moradkhani, H. Generative Adversarial Network for Real-Time Flash Drought Monitoring: A Deep Learning Study. Water Resour. Res. 2024, 60, e2023WR035600. [Google Scholar] [CrossRef]
Edris, S.; McGovern, A.; Basara, J.B.; Christian, J.I.; Furtado, J.C.; Olayiwola, H.; Xiao, X. Evaluation of Flash Drought Identification with Machine Learning Techniques, Part II: Neural Network Algorithms. Artif. Intell. Earth Syst. 2025, 4, 3. [Google Scholar] [CrossRef]
Chaudhari, S.; Sardar, V.; Ghosh, P. Drought classification and prediction with satellite image-based indices using variants of deep learning models. Int. J. Inf. Technol. 2023, 15, 3463–3472. [Google Scholar] [CrossRef]
Vij, P.; Tiwari, A. AI-Driven Drought Monitoring: Advanced Machine Learning Techniques for Early Prediction. SHS Web Conf. 2025, 216, 01024. [Google Scholar] [CrossRef]
Azimi, S.; Wadhawan, R.; Gandhi, T.K. Intelligent Monitoring of Stress Induced by Water Deficiency in Plants Using Deep Learning. IEEE Trans. Instrum. Meas. 2021, 70, 1–13. [Google Scholar] [CrossRef]
Marusov, A.; Grabar, V.; Maximov, Y.; Sotiriadi, N.; Bulkin, A.; Zaytsev, A. Long-term drought prediction using deep neural networks based on geospatial weather data. Environ. Model. Softw. 2024, 179, 106127. [Google Scholar] [CrossRef]
Mehr, A.D.; Ghiasi, A.R.; Yaseen, Z.M.; Sorman, A.U.; Abualigah, L. A novel intelligent deep learning predictive model for meteorological drought forecasting. J. Ambient. Intell. Humaniz. Comput. 2022, 14, 10441–10455. [Google Scholar] [CrossRef]
Boston, T.; Van Dijk, A.; Thackway, R. U-Net Convolutional Neural Network for Mapping Natural Vegetation and Forest Types from Landsat Imagery in Southeastern Australia. J. Imaging 2024, 10, 143.
Chen, S.; Lei, F.; Dong, S.; Zang, Z.; Zhang, M. Land use/land cover mapping using deep neural network and sentinel image dataset based on google earth engine in a heavily urbanized area, China. Geocarto Int. 2022, 37, 16951–16972.
Fernández-Beltrán, R.; Baidar, T.; Kang, J.; Pla, F. Rice-Yield Prediction with Multi-Temporal Sentinel-2 Data and 3D CNN: A Case Study in Nepal. Remote Sens. 2021, 13, 1391.
Lees, T.; Tseng, G.; Atzberger, C.; Reece, S.; Dadson, S. Deep Learning for Vegetation Health Forecasting: A Case Study in Kenya. Remote Sens. 2022, 14, 698.
Kladny, K.-R.; Milanta, M.; Mraz, O.; Hufkens, K.; Stocker, B.D. Enhanced prediction of vegetation responses to extreme drought using deep learning and Earth observation data. Ecol. Inform. 2024, 80, 102474.
Attri, I.; Awasthi, L.K.; Sharma, T.P.; Rathee, P. A review of deep learning techniques used in agriculture. Ecol. Inform. 2023, 77, 102217.
Saleh, A.; Zulkifley, M.A.; Harun, H.H.; Gaudreault, F.; Davison, I.; Spraggon, M. Forest fire surveillance systems: A review of deep learning methods. Heliyon 2024, 10, e23127.
Zhong, G. Convolutional Neural Network Model to Predict Outdoor Comfort UTCI Microclimate Map. Atmosphere 2022, 13, 1860.
Cortés-Andrés, J.; Fernández-Torres, M.-Á.; Camps-Valls, G. Deep Learning with Noisy Labels for Spatio-Temporal Drought Detection. IEEE Trans. Geosci. Remote Sens. 2024, 62, 1–13.
Tsalera, E.; Papadakis, A.; Voyiatzis, I.; Samarakou, M. CNN-based, contextualized, real-time fire detection in computational resource-constrained environments. Energy Rep. 2023, 9, 247–257.
Dehghani, A.; Moazam, H.M.Z.H.; Mortazavizadeh, F.; Ranjbar, V.; Mirzaei, M.; Mortezavi, S.; Ng, J.L.; Dehghani, A. Comparative evaluation of LSTM, CNN, and ConvLSTM for hourly short-term streamflow forecasting using deep learning approaches. Ecol. Inform. 2023, 75, 102119.
Yalçın, S.; Eşit, M.; Çoban, Ö. A new deep learning method for meteorological drought estimation based-on standard precipitation evapotranspiration index. Eng. Appl. Artif. Intell. 2023, 124, 106550.
Estrada, J.S.; Fuentes, A.; Reszka, P.; Cheein, F.A. Machine learning assisted remote forestry health assessment: A comprehensive state of the art review. Front. Plant Sci. 2023, 14, 1139232.
Rostami, A.; Akhoondzadeh, M.; Amani, M. A fuzzy-based flood warning system using 19-year remote sensing time series data in the Google Earth Engine cloud platform. Adv. Space Res. 2022, 70, 1406–1428.
Hu, P.; Sharifi, A.; Tahir, M.N.; Tariq, A.; Zhang, L.; Mumtaz, F.; Shah, S.H.I.A. Evaluation of Vegetation Indices and Phenological Metrics Using Time-Series MODIS Data for Monitoring Vegetation Change in Punjab, Pakistan. Water 2021, 13, 2550.
Lykhovyd, P.; Averchev, O.; Fedorchuk, M.; Fedorchuk, V. The Relationship Between Spatial Vegetation Indices: A Case Study for the South of Ukraine. Environ. Ecol. Res. 2023, 11, 740–746.
Xiong, X.; Zhong, R.; Tian, Q.; Huang, J.; Zhu, L.; Yang, Y.; Lin, T. Daily DeepCropNet: A hierarchical deep learning approach with daily time series of vegetation indices and climatic variables for corn yield estimation. ISPRS J. Photogramm. Remote Sens. 2024, 209, 249–264.
Wang, N.; Naz, I.; Aslam, R.W.; Quddoos, A.; Soufan, W.; Raza, D.; Ishaq, S.; Ahmed, B. Spatio-Temporal Dynamics of Rangeland Transformation using machine learning algorithms and Remote Sensing data. Rangel. Ecol. Manag. 2024, 94, 106–118.
Vélez, S.; Martínez-Peña, R.; Castrillo, D. Beyond Vegetation: A Review Unveiling Additional Insights into Agriculture and Forestry Through the Application of Vegetation Indices. J 2023, 6, 421–436.
Fragaszy, S.; Bergaoui, K.; Fraj, M.B.; Ghanim, A.; Al-Hamadin, O.; Al-Karablieh, E.; Al-Bakri, J.; Fakih, M.; Fayyad, A.; Comair, F.; et al. Development of a composite drought indicator for operational drought monitoring in the MENA region. Sci. Rep. 2024, 14, 5414.
Ndlovu, H.S.; Odindi, J.; Sibanda, M.; Mutanga, O. A systematic review on the application of UAV-based thermal remote sensing for assessing and monitoring crop water status in crop farming systems. Int. J. Remote Sens. 2024, 45, 4923–4960.
Yang, X.; Gao, F.; Yuan, H.; Cao, X. Integrated UAV and Satellite Multi-Spectral for Agricultural Drought Monitoring of Winter Wheat in the Seedling Stage. Sensors 2024, 24, 5715.
Sharma, H.; Sidhu, H.; Bhowmik, A. Remote Sensing Using Unmanned Aerial Vehicles for Water Stress Detection: A Review Focusing on Specialty Crops. Drones 2025, 9, 241.
Su, J.; Coombes, M.; Liu, C.; Zhu, Y.; Song, X.; Fang, S.; Guo, L.; Chen, W. Machine Learning-Based Crop Drought Mapping System by UAV Remote Sensing RGB Imagery. Unmanned Syst. 2020, 8, 71–83.
Wardlow, B.D.; Anderson, M.C.; Verdin, J.P. (Eds.) Remote Sensing of Drought: Innovative Monitoring Approaches, 1st ed.; CRC Press: Boca Raton, FL, USA, 2012; Available online: https://www.taylorfrancis.com/books/mono/10.1201/b11863/remote-sensing-drought-brian-wardlow-martha-anderson-james-verdin (accessed on 20 October 2025).
Thenkabail, P.S. (Ed.) Remote Sensing Handbook, Volume V: Water, Hydrology, Floods, Snow and Ice, Wetlands, and Water Productivity, 2nd ed.; CRC Press: Boca Raton, FL, USA, 2024; Available online: https://www.taylorfrancis.com/books/edit/10.1201/9781003541400/remote-sensing-handbook-volume-prasad-thenkabail (accessed on 20 October 2025).

Figure 1. Drought propagation, illustrating the linkages among meteorological, agricultural, hydrological, and socioeconomic droughts. Adapted from West (2019), with modifications [1]. The arrows indicate the typical progression from meteorological drought to socioeconomic impacts.

Figure 2. Timeline of drought indices. Adapted from West (2019), with modifications [1]. The arrows illustrate the chronological development of key remote sensing platforms and indices used for drought monitoring.

Figure 3. (a) CNN architecture. Adapted from Vawda, with modifications [37]. (b) CNN feature extraction process. Adapted from Li, with modifications [49].

Figure 4. CNN-BiLSTM hybrid architecture for drought prediction. Adapted from Dikshit (2021), with modifications [80]. The arrows show the data flow from the CNN spatial feature extractor to the BiLSTM temporal processor.

Figure 5. Hybrid CNN-RF architecture for drought prediction. Adapted from Xiao et al. (2024), with modifications [26]. The arrows illustrate the workflow, where the CNN acts as a feature extractor for the Random Forest (RF) model.

Figure 6. PRISMA 2020 flow diagram for study selection.

Figure 7. Data processing sequence diagram [86].

Figure 8. Publication trend bar chart.

Figure 9. Geographical distribution map.

Figure 10. CNN architecture breakdown chart.

Figure 11. Example of a pre-trained architecture: AlexNet. The diagram shows the sequence of convolutional (Conv), pooling (Max-Pool), and fully connected (FC) layers used for feature extraction and classification. Adapted from Chaudhari (2021), with modifications [11].

Figure 12. (a) GAN model [80]; (b) detailed GAN architecture [80]. The dashed lines represent the flow of data and latent vectors between the generator and discriminator components of the GAN.

Figure 13. (a) The LSTM cell [19]; (b) recurrent neural network (RNN) [79]. The colors and arrows distinguish the different gates (e.g., forget, input, output) and data flows within the LSTM memory cell.

Figure 14. Conv–LSTM workflow [79]. The arrows and layers illustrate the spatiotemporal data flow, where convolutional operations are integrated into the recurrent cell.

Figure 15. Workflow for drought index calculation and prediction using a CNN-Bioinspired Algorithm. Adapted from Sardar (2022), with modifications [86].

Figure 16. Multi-modal data fusion in an ANN [79].

Figure 17. Calculation of the Vegetation Health Index (VHI) from its components, the Vegetation Condition Index (VCI) and Temperature Condition Index (TCI). Example maps shown are derived from MODIS data (2012–2022). Adapted from Kartal (2024), with modifications [87]. The color scale ranges from Low (Drought, warm colors) to High (Healthy, cool colors).

Figure 18. GEE-based workflow for climate-specific drought index calculation. The GEE platform enables cloud-based processing of multi-temporal satellite datasets, significantly reducing computational costs. Adapted from Qazvini (2023), with modifications [34]. The arrows depict the workflow for data retrieval and index calculation within the GEE platform.

Figure 19. Methodological framework of the selected case study, illustrating the hybrid CNN-RF model and multi-modal data fusion workflow. Adapted from Xiao et al. (2024), with modifications [26]. The colors and lines differentiate the multi-modal data inputs, model components, and evaluation pathways.

Figure 20. Explainable AI (XAI) for drought prediction. Adapted from Dikshit et al. (2021) [80]. The arrows illustrate the workflow from model prediction to interpretation by XAI methods.

Table 1. Descriptive summary of representative included studies.

Author(s) and Year	Journal/Source	Geographical Study Area	“Ground Truth” Drought Index	Satellite Sensor(s)	Vegetation Indices (VIs) Used	CNN Architecture Employed
Xu et al. (2023) [25]	Agricultural Water Management	China	Station-based 1-month Standardized Precipitation Evapotranspiration Index (SPEI-1)	MODIS, FLDAS, SERVIR GLOBAL, Landsat	Standardized Vegetation Index (SVI, from NDVI)	Compares multiple models, including 1D-CNN and Entity Embedding Deep Neural Network (EEDNN)
Xiao et al. (2024) [26]	Agricultural Water Management	A mountainous region in Southwest China (Yunnan Province)	Station-based 3-month Standardized Precipitation Evapotranspiration Index (SPEI-3)	MODIS (MOD16A2, MOD13A1), CHIRPS, GLDAS, SRTM	Precipitation Condition Index (PCI), Soil Moisture Condition Index (SMCI), VCI, TCI, Scaled Potential Evapotranspiration (SPET)	Hybrid CNN-Random Forest (CNN-RF) model
Barbosa et al. (2024) [39]	Atmosphere	Northeastern Brazil	Standardized Soil Moisture Index (SSMI-3). Flash Drought Detection	SMOS, MODIS, BR-DWGD	NDVI	2D CNN (Encoder–Decoder)
Iilonga and Ajayi (2025) [45]	Land Use Policy	Omusati region, Namibia	A composite agricultural drought index derived from NDVI, LST, and soil moisture	MODIS (MOD13A1, MOD16A2, MOD11A2)	NDVI is the primary VI	Compares a standard CNN, Long Short-Term Memory (LSTM), and ConvLSTM
Sakka et al. (2025) [58]	Applied Sciences		Review of CNNs		NDVI, EVI, etc.	CNNs (General Review)
Elbeltagi et al. (2024) [84]	Journal of Hydrology	Nile River Basin, Egypt	PDSI	Multisource (meteorological data)	Not explicitly used as model input; model predicts future PDSI time series	CNN-LSTM, CNN-RF, CNN-SVR, CNN-XGB
Zhang et al. (2024) [103]	Remote Sensing	North China Plain	Winter Wheat Yield	MODIS, OCO-2	SIF, LAI, EVI	BCBL (CNN-LSTM variant)
Zhang, Y. et al. (2023) [104]	Journal of Water Management Modelling	Xinjiang Uygur Autonomous Region, China	Station-based SPEI	MODIS, CHIRPS	VCI, TCI, VHI, VSWI, LAI, Soil Moisture Condition Index (SMCI), Evapotranspiration (ET)	ConvLSTM, compared to standalone CNN, AlexNet, VGGNet
Chaudhari et al. (2024) [105]	Frontiers in Plant Science	Kolar, India	Drought Classification	Landsat	NDVI, ARVI, SAVI, EVI	A Generative Adversarial Network (GAN) that uses a modified U-Net as its generator
Foroumandi et al. (2024) [106]	Water Resources Research	Contiguous United States (CONUS)	Standardized Soil Moisture Index (SSI) maps	NLDAS-2-Noah model data	Evapotranspiration (ET), Soil Moisture (SM), and Temperature	ANN, CNN (U-Net), RNN (LSTM)
Edris et al. (2025) [107]	AI for the Earth Systems		Flash Drought ID	Multisource	Used hydro-climate data

Note: Abbreviations are defined as follows: ANN (Artificial Neural Network); ARVI (Atmospherically Resistant Vegetation Index); CNN (convolutional neural network); EVI (Enhanced Vegetation Index); LAI (Leaf Area Index); LST (Land Surface Temperature); NDVI (Normalized Difference Vegetation Index); PCI (Precipitation Condition Index); PDSI (Palmer Drought Severity Index); RNN (Recurrent Neural Network); SAVI (Soil Adjusted Vegetation Index); SIF (Solar-Induced Fluorescence); SMCI (Soil Moisture Condition Index); SPEI (Standardized Precipitation Evapotranspiration Index); SVI (Standardized Vegetation Index); TCI (Temperature Condition Index); VCI (Vegetation Condition Index); VHI (Vegetation Health Index); VSWI (Vegetation Supply Water Index).

Table 2. Comparative performance of key CNN architectures in drought detection.

CNN Architecture Type	Representative Study	Application Task	Key Performance Metrics	Noted Strengths/Weaknesses
Standard 2D CNN	Chen (2024) [40]	Daily Drought Forecasting	NSE: 0.71	Strong at learning from spatial context of surrounding areas. Less effective at capturing long-term temporal dependencies.
Standard 2D CNN	Chaudhari et al. (2024) [105]	Drought Classification	Accuracy: 91–97%	High accuracy for classification tasks. Computationally efficient compared to deeper models.
Spatiotemporal 3D-CNN	Varela et al. (2022) [59]	Biomass yield and culm length estimation in Miscanthus under drought stress using UAV time-series imagery	R²: 0.69 (Biomass); R²: 0.66 (Culm Length)	Explicitly models both spatial and temporal dynamics by operating on sequences of images, enabling it to capture crop growth trajectories. Performance improves significantly when analyzing longer time sequences.
Spatiotemporal 3D-CNN	Fernández-Beltrán et al. (2021) [115]	Rice yield prediction	Outperformed 2D-CNNs by up to 23% in R² and 17% in RMSE	Computationally intensive with many parameters. Performance can be sensitive to the timing and length of the image sequence used for analysis.
Encoder–Decoder (U-Net)	Varghese et al. (2021) [3]	Water Body Extraction	Accuracy: ≥0.95; Kappa: ≥0.89	Excellent for pixel-level segmentation and precise spatial mapping. Captures both local and global context.
Encoder–Decoder (U-Net)	Wieland & Martinis (2020) [21]	Surface Water Segmentation	Accuracy: ≥0.95; Kappa: ≥0.89	Can precisely delineate major features while also detecting small objects, such as ponds and reservoirs.
Encoder–Decoder (U-Net)	Lees et al. (2022) [116]	Flash Drought Identification	Moderate skill, over-emphasized spatial patterns	Better than ANNs at learning patterns, but can over-predict hotspots.
Encoder–Decoder (U-Net)	Kladny et al. (2024) [117]	Satellite Image Forecasting	SGConvLSTM achieved an ENS of 0.2740	When adapted for time-series forecasting by stacking temporal data along the channel dimension, it was less effective than recurrent architectures like ConvLSTM, which are specifically designed to process sequential information.
Hybrid CNN-LSTM	Elbeltagi et al. (2024) [84]	Long-term Drought Forecasting	R²: 0.885 (Train); NSE: 0.885 (Train)	Superior for forecasting by explicitly modeling both spatial features and temporal sequences.
Hybrid CNN-LSTM	Zhang et al. (2024) [103]	Crop Yield Estimation (Drought Impact)	R²: 0.81	Outperforms standalone CNN or LSTM by integrating multimodal spatiotemporal data.
Pre-trained (VGG, AlexNet)	Chaudhari et al. (2021) [11]	Drought Classification	Accuracy: 64–67%	Can leverage learned features but may underperform custom CNNs if not properly fine-tuned for satellite data.
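
The metrics reported in Table 2 are not all on the same scale, so a brief sketch of how two of them are commonly computed may help when comparing rows. The Python below uses only the standard library to compute the Nash–Sutcliffe efficiency (NSE) and R², here taken as the squared Pearson correlation. The function names and sample values are illustrative assumptions, and individual studies may define R² differently (for example, as the coefficient of determination).

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - (squared error / spread of observations)."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst


def r_squared(observed, simulated):
    """Squared Pearson correlation between observed and simulated series."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    return cov ** 2 / (var_o * var_s)


# Hypothetical drought-index values, for illustration only.
obs = [-1.2, -0.5, 0.1, 0.8, 1.4]
sim = [-1.0, -0.6, 0.3, 0.6, 1.1]
print(f"NSE = {nse(obs, sim):.3f}, R2 = {r_squared(obs, sim):.3f}")
```

An NSE of 1 indicates a perfect match, while 0 means the model is no better than predicting the observed mean. A biased but well-correlated model can therefore score a high R² and a low NSE, which is worth keeping in mind when the two metrics appear side by side.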

Table 3. Characteristics of core greenness indices (NDVI and EVI).

Vegetation Index	Formula	Primary Sensitivity	Common Satellite Sources	Noted Advantages	Noted Limitations
NDVI [125]	$\frac{\mathrm{NIR} - \mathrm{Red}}{\mathrm{NIR} + \mathrm{Red}}$	Vegetation greenness, density	MODIS, Landsat, Sentinel-2, AVHRR	Robust, widely used, long historical record.	Saturates in high biomass; sensitive to atmospheric and soil effects.
EVI [125]	$2.5 \times \frac{\mathrm{NIR} - \mathrm{Red}}{\mathrm{NIR} + 6 \times \mathrm{Red} - 7.5 \times \mathrm{Blue} + 1}$	Vegetation greenness, canopy structure	MODIS, Landsat, Sentinel-2	Reduced atmospheric influence; improved sensitivity in high biomass areas.	More complex calculation; requires the blue band, which can be noisy.
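
As a minimal sketch of the two formulas in Table 3, the Python below evaluates NDVI and EVI for a single pixel. It assumes the inputs are surface reflectances on a 0–1 scale; the function names and the sample reflectances are hypothetical, and no guard against a zero denominator is included.

```python
def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red), as given in Table 3."""
    return (nir - red) / (nir + red)


def evi(nir, red, blue):
    """EVI = 2.5 * (NIR - Red) / (NIR + 6*Red - 7.5*Blue + 1), as given in Table 3."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)


# Hypothetical reflectances (0-1 scale) for a vegetated pixel.
nir, red, blue = 0.45, 0.08, 0.04
print(f"NDVI = {ndvi(nir, red):.3f}")
print(f"EVI  = {evi(nir, red, blue):.3f}")
```

The blue-band term in the EVI denominator is what provides the atmospheric correction noted in the table, which is also why EVI cannot be computed from sensors that lack a blue band.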

Table 4. Characteristics of water, soil, and biophysical indices (NDWI, SAVI, and LAI).

Index	Formula	Primary Sensitivity	Common Satellite Sources	Noted Advantages	Noted Limitations
NDWI [128]	$\frac{\mathrm{Green} - \mathrm{NIR}}{\mathrm{Green} + \mathrm{NIR}}$	Leaf water content, open water bodies	MODIS, Landsat, Sentinel-2	Directly sensitive to water stress; excellent for mapping surface water.	Can be confused with built-up areas; less direct measure of plant vigor.
SAVI [15]	$\frac{\mathrm{NIR} - \mathrm{Red}}{\mathrm{NIR} + \mathrm{Red} + L} \times (1 + L)$	Vegetation greenness in sparsely vegetated areas	MODIS, Landsat, Sentinel-2	Minimizes soil brightness influence; good for arid/semi-arid lands.	Requires a soil adjustment factor (L) that may need calibration.
LAI [103]	(Varies, often model-derived)	Canopy leaf area, biomass	MODIS, VIIRS	Direct biophysical parameter; strongly related to photosynthesis.	Often an indirect product from a model, not a direct spectral index; can be difficult to validate.
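
The NDWI and SAVI formulas in Table 4 can be sketched in the same way. The Python below is illustrative only: the function and parameter names are hypothetical, inputs are assumed to be reflectances on a 0–1 scale, and the default L = 0.5 is the commonly used soil adjustment value rather than one calibrated for any particular study area.

```python
def ndwi(green, nir):
    """NDWI = (Green - NIR) / (Green + NIR), as given in Table 4."""
    return (green - nir) / (green + nir)


def savi(nir, red, soil_factor=0.5):
    """SAVI = (NIR - Red) / (NIR + Red + L) * (1 + L), as given in Table 4."""
    return (nir - red) / (nir + red + soil_factor) * (1.0 + soil_factor)


# Hypothetical reflectances (0-1 scale), for illustration only.
water_pixel = {"green": 0.10, "nir": 0.02}
sparse_veg_pixel = {"nir": 0.30, "red": 0.15}

print(f"NDWI (water)       = {ndwi(**water_pixel):.3f}")
print(f"SAVI (L = 0.5)     = {savi(**sparse_veg_pixel):.3f}")
print(f"SAVI (L = 0, NDVI) = {savi(**sparse_veg_pixel, soil_factor=0.0):.3f}")
```

Setting L to 0 reduces SAVI to NDVI, which makes the role of the adjustment factor explicit: larger values of L damp the index where soil background dominates the signal.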
