A Summary of Recent Advances in the Literature on Machine Learning Techniques for Remote Sensing of Groundwater Dependent Ecosystems (GDEs) from Space

Chiloane, Chantel Nthabiseng; Dube, Timothy; Sibanda, Mbulisi; Dalu, Tatenda; Shoko, Cletah

doi:10.3390/rs17081460

Open Access Review

by Chantel Nthabiseng Chiloane ¹,*, Timothy Dube ¹, Mbulisi Sibanda ², Tatenda Dalu ³ and Cletah Shoko ⁴

¹ Institute of Water Studies, Department of Earth Sciences, University of the Western Cape, Bellville, Cape Town 7535, South Africa

² Department of Geography, Environmental Studies and Tourism, University of the Western Cape, Bellville, Cape Town 7535, South Africa

³ Aquatic Systems Research Group, School of Biology and Environmental Sciences, University of Mpumalanga, Nelspruit 1201, South Africa

⁴ Division of Geography, School of Geography, Archaeology and Environmental Studies, University of Witwatersrand, Johannesburg 2050, South Africa

* Author to whom correspondence should be addressed.

Remote Sens. 2025, 17(8), 1460; https://doi.org/10.3390/rs17081460

Submission received: 8 February 2025 / Revised: 30 March 2025 / Accepted: 14 April 2025 / Published: 19 April 2025

(This article belongs to the Section Remote Sensing in Agriculture and Vegetation)

Abstract

While groundwater-dependent ecosystems (GDEs) occupy only a small portion of the Earth’s surface, they hold significant ecological value by providing essential ecosystem services such as habitat for flora and fauna, carbon sequestration, and erosion control. However, GDE functionality is increasingly threatened by human activities, rainfall variability, and climate change. To address these challenges, various methods have been developed to assess, monitor, and understand GDEs, aiding sustainable decision-making and conservation policy implementation. Among these, remote sensing and advanced machine learning (ML) techniques have emerged as key tools for improving the evaluation of dryland GDEs. This study provides a comprehensive overview of the progress made in applying advanced ML algorithms to assess and monitor GDEs. It begins with a systematic literature review following the PRISMA framework, followed by an analysis of temporal and geographic trends in ML applications for GDE research. Additionally, it explores different advanced ML algorithms and their applications across various GDE types. The paper also discusses challenges in mapping GDEs and proposes mitigation strategies. Despite the promise of ML in GDE studies, the field remains in its early stages, with most research concentrated in China, the USA, and Germany. While advanced ML techniques enable high-quality dryland GDE classification at local to global scales, model performance is highly dependent on data availability and quality. Overall, the findings underscore the growing importance and potential of geospatial approaches in generating spatially explicit information on dryland GDEs. Future research should focus on enhancing models through hybrid and transformative techniques, as well as fostering interdisciplinary collaboration between ecologists and computer scientists to improve model development and result interpretability. 
The insights presented in this study will help guide future research efforts and contribute to the improved management and conservation of GDEs.

Keywords:

anthropogenic pressure; data analytics; drylands; ecological integrity; hydrological uncertainty; spatial data; water resources

1. Introduction

Groundwater plays a crucial role in supporting human systems and preserving the ecological balance of Earth’s ecosystems, particularly amid growing uncertainties in surface water availability due to climate change, rainfall fluctuations, drought, and rapid population expansion [1]. While accounting for 96% of the planet’s unfrozen freshwater resources, groundwater serves as the primary drinking water source for approximately 1.5–3 billion people [2]. Globally, groundwater satisfies half of the total water demand, a figure expected to rise due to climate change-induced hydrological unpredictability [3]. Agricultural activities drive 60–70% of global groundwater usage, while industrial needs account for the remaining 30–40%. Unsustainable exploitation of groundwater resources leads to adverse effects, including water quality degradation and diminished recharge of water bodies such as lakes, wetlands, and water-dependent ecosystems [4]. Variations in temperature and rainfall patterns alter hydrodynamic conditions, affecting groundwater recharge rates and mineralization processes, leading to salinization [5,6]. Continued reliance on groundwater threatens both human and ecological systems, especially groundwater-dependent ecosystems (GDEs), which are susceptible to the unintended consequences of excessive groundwater extraction [7]. The structure, biodiversity, composition, and ecological functions of GDEs are intricately linked to natural groundwater availability [8]. These ecosystems, which include cave systems, terrestrial vegetation, and water-dependent surface features like rivers, lakes, wetlands, and estuaries, provide vital ecosystem services such as biodiversity maintenance, food production, recreation, flood control, carbon sequestration [9], erosion prevention, and habitat provision [10,11]. Given the significant benefits of dryland GDEs and the looming threat from global change, research into groundwater and ecosystem conservation has become a pressing issue. 
Ecological conservation endeavors primarily focus on broader ecosystems, with limited attention given to dryland GDEs, which are particularly sensitive to fluctuations in water availability. Significant research efforts are necessary to deepen our understanding of the intricate relationship between groundwater, dryland GDEs, and global environmental changes, ensuring that socioeconomic progress does not compromise ecosystem functionality. Consequently, it is imperative to devise practical methodologies for efficient monitoring aimed at conserving dryland GDEs under evolving climate conditions [12,13,14].

The spatial delineation of dryland GDE patterns and their extent poses significant challenges due to their heterogeneous nature. Conceptually, interactions between groundwater and surface water, as well as between groundwater and vegetation, entail diverse hydrological, physicochemical, and biogeochemical processes occurring across various spatial and temporal scales, which are difficult to assess [15,16]. The evaluation of these interactions demands assessment methodologies that are both financially viable and technically feasible, applicable across both developing and developed nations [16,17].

Previous research has utilized ground-based field techniques to evaluate and monitor vegetation health, diversity, composition, and above-ground biomass in GDEs [14]. For instance, ref. [14] conducted field assessments to verify the presence of presumed GDEs, employing various measurements such as soil coring to assess rooting depth, stable isotope analysis, soil moisture analysis, and matching leaf water potential. Additionally, ref. [18] investigated how groundwater depth influences leafing intensity through field surveys of dryland phreatophytes. Although these methods are inferential, they yield precise data for comprehensive assessments of dryland GDE status. However, field-based measurements, while accurate, often suffer from limited spatial coverage. Consequently, capturing the inherent distribution and composition of diverse vegetation species within dryland GDEs, particularly in remote or resource-constrained areas, presents challenges [19]. Moreover, this approach is labor-intensive, time-consuming, and expensive, without adequately addressing the complexities associated with delineating the spatial extent of dryland GDEs. As a result, the information derived from such methods is temporally and spatially constrained.

Automated, near-real-time remotely sensed data have emerged as a critical resource for acquiring spatially explicit information regarding the status and distribution of GDEs at both local and global scales. The distribution of GDEs is characterized by sporadic occurrences, with facultative GDEs exhibiting ephemeral characteristics, posing challenges for continuous monitoring. To address these challenges, multi-spectral and hyperspectral remotely sensed data obtained from satellite sensors have proven invaluable. Hyperspectral sensors like Project for Onboard Autonomy (PROBA-1), PRISMA, and Hyperion offer detailed information owing to their fine spectral resolution, although access to such data may be limited. Conversely, multispectral datasets from sensors such as Moderate Resolution Imaging Spectroradiometer (MODIS), Landsat, and Sentinel, while possessing lower spectral resolutions compared to hyperspectral datasets, are readily accessible to the public for various applications, including GDE assessments. The spectral characteristics of these datasets range from coarse to very high spectral resolution, making them suitable for GDE assessments across different scales. Furthermore, timely acquisition of remotely sensed data enhances the potential for distinguishing GDEs through spectral texture analysis. Recent advancements have demonstrated not only the potential of remotely sensed datasets for delineating GDEs but also for assessing their composition, vegetation species diversity, water use characteristics, and land use/land cover changes (LULCC) impacting GDE status. Moreover, advances in sensor technology have facilitated the acquisition of freely available, high-quality satellite datasets, providing reliable information necessary for detecting inherent temporal ecohydrological changes. Consequently, research integrating remote sensing data is gaining traction, as it effectively addresses the limitations associated with traditional ground-based measurements. 
Advances in sensor technology, data quality, and data availability have been accompanied by advances in classification and regression methods that make GDE assessment more efficient. Chief among these are machine learning (ML) algorithms, statistical methods that extract meaningful information from vast amounts of geospatial data, for example to map the predicted habitat suitability of GDEs [20].

Furthermore, while previous studies have extensively reviewed remote sensing advancements in GDE assessments [20,21,22,23,24], they often lack a focused examination of the nuanced application of machine learning (ML) to maximize the potential of freely accessible, open-source data for continuous GDE monitoring, especially in resource-limited regions. The integration of ML is crucial for efficiently processing the complex and heterogeneous nature of GDEs, enabling precise mapping of their distribution, accurate assessment of vegetation diversity and health, and effective analysis of temporal ecohydrological dynamics [24,25,26,27]. Given the rapid proliferation of remote sensing data, sophisticated analytical tools like ML are indispensable for extracting meaningful insights [26]. Specifically, ML algorithms offer enhanced capabilities in distinguishing subtle spectral differences, vital for differentiating various GDE types, including small dryland wetlands, patchy vegetation, and ephemeral water bodies with similar spectral characteristics [23,25]. This systematic literature analysis aims to evaluate the geographic and temporal adoption of ML in GDE studies. It will also provide a background on relevant ML algorithms, assess their application to diverse GDE types, and evaluate their performance in GDE assessments. Moreover, this study will identify challenges in integrating ML into GDE studies and provide recommendations for future research. This comprehensive approach is essential for accurate GDE assessment and conservation, facilitating a deeper understanding of their intricate relationships with groundwater and environmental changes.

2. Literature Search

A systematic literature search was conducted using the SCOPUS database to investigate the role of remote sensing and machine learning in GDE assessments. The search strategy employed the query: ALL (machine AND learning AND vegetation AND phreatophyte OR wetland) AND PUBYEAR > 1999 AND (LIMIT-TO (SUBJAREA, “EART”) OR LIMIT-TO (SUBJAREA, “ENVI”)). This yielded 480 articles, which were subsequently filtered using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (Figure 1). Inclusion criteria were as follows: (1) clear application of machine learning in GDE assessments, (2) publication in international peer-reviewed journals, (3) English language, (4) availability of free PDF access, and (5) provision of detailed methodologies and quantitative data for evaluating machine learning algorithms. Articles were automatically excluded if they did not meet these criteria, were duplicates, or were literature reviews. This resulted in 178 articles. For each retained article, data on title, country, and publication year were imported into an Excel database for further analysis. Linear regression was used to analyze the temporal evolution of publications, while a heatmap, generated using Matplotlib 3.10.0, was employed to visualize the geographical distribution of publications related to GDE assessment and machine learning applications. Following the initial screening, titles, abstracts, and full articles were assessed for relevance, focusing on studies that evaluated terrestrial GDE distribution and composition using machine learning techniques and provided sufficient quantitative information. This process resulted in 65 articles, which were analyzed in this systematic review. Additional relevant articles were identified through the reference lists of these remaining studies, following the methodology of ref. [28] indicated in Figure 1.
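The temporal trend analysis mentioned above amounts to an ordinary least-squares fit of yearly publication counts against publication year. The following is a minimal sketch of that calculation, not the authors' workflow: the function name fit_linear_trend is hypothetical, and the yearly counts are invented for illustration and are not the counts from this review.

```python
def fit_linear_trend(years, counts):
    """Ordinary least-squares fit of counts = slope * year + intercept."""
    n = len(years)
    if n < 2 or n != len(counts):
        raise ValueError("need at least two paired observations")
    mean_x = sum(years) / n
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    if sxx == 0:
        raise ValueError("all years are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, counts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Coefficient of determination (R^2) of the fitted line.
    ss_tot = sum((y - mean_y) ** 2 for y in counts)
    ss_res = sum((y - (slope * x + intercept)) ** 2
                 for x, y in zip(years, counts))
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else 1.0
    return slope, intercept, r_squared


# Hypothetical yearly counts, for illustration only (not data from this review).
years = [2018, 2019, 2020, 2021, 2022]
counts = [4, 6, 8, 10, 12]
slope, intercept, r_squared = fit_linear_trend(years, counts)
print(f"slope = {slope:.2f} publications/year, R^2 = {r_squared:.2f}")
```

A positive slope with a high R² would indicate a steady rise in publications; real publication counts are rarely this linear, so the fit quality should always be reported alongside the slope.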

3. Results and Analysis

3.1. Publication Trends in Machine Learning and GDE Assessments

The analysis reveals a significant increase in the application of machine learning techniques for assessing GDEs, specifically wetlands, vegetation, and surface water bodies maintained by groundwater. Publication records from Scopus show a steadily increasing cumulative trend in GDE studies that use machine learning (Figure 2), indicating the growing relevance of these techniques in the field. This growth can be attributed to advancements in technology, improved data availability, and heightened awareness of the ecological value of these ecosystems. These factors have collectively fostered innovations in monitoring and managing wetland environments, enabling a more detailed and efficient approach to ecological studies. A significant contribution to this trend is the integration of high-resolution remote sensing data with ML algorithms, which has transformed the monitoring of wetland ecosystems. Applications utilizing multispectral imagery and unmanned aerial vehicles (UAVs) have demonstrated that these technologies provide superior spatial resolution and spectral data, essential for accurate classification of wetland vegetation. Research indicates that machine learning models like Random Forest and Support Vector Machines are effective in identifying and quantifying vegetation health and stress indicators [29,30,31]. The capacity to process and analyze large datasets through ML allows for a more comprehensive assessment of wetland conditions compared to traditional methods [31]. Moreover, the rising availability of big data has enabled scientists to implement machine learning techniques across various ecological domains, fostering new insights into complex interactions within ecosystems. 
Machine learning models do not impose strict assumptions about data distribution, which enhances their applicability in ecological studies characterized by nonlinear relationships [32]. This flexibility empowers researchers to predict changes in GDEs in response to varying environmental conditions [33]. Furthermore, there is a renewed focus on the ecological importance of GDEs as vital habitats supporting biodiversity and ecosystem services, which has spurred research interest. The challenges posed by climate change and human impacts underscore the necessity for effective monitoring tools. Consequently, the application of ML in assessing ecological health and forecasting future changes has become increasingly pertinent [31,32,33,34,35]. The ability of ML techniques to synthesize vast amounts of information and provide actionable insights is critical in establishing effective conservation strategies for these delicate ecosystems. Finally, the cross-disciplinary nature of machine learning encourages collaboration among ecologists, computer scientists, and data analysts, enhancing the sophistication of ecological models and methodologies [23,36]. Collaborative networks that share data and ML models contribute to a broader appreciation and understanding of these approaches within ecological research [37]. Despite the growing interest, a total of 100 publications over 17 years presents opportunities for further research and the development of ML-based assessment techniques for GDEs.

3.2. Geographical Trends of Machine Learning-Based Research on GDEs

Figure 3 illustrates the global trend in GDE studies utilizing ML. Most of these studies originate from the United States of America (USA) (65), China (49), and Germany (34). In contrast, Africa demonstrates a significant gap, with only 42 publications across the entire region. The dominance of ML applications for assessing GDEs and wetlands in countries like the USA, China, and Germany, contrasted with their limited use in Africa, can be attributed to several interrelated factors linked to socio-economic context, technological infrastructure, research capacity, and data availability. First, the level of technological advancement and infrastructure supporting scientific research varies significantly between these regions. Countries like the USA, China, and Germany have extensive research funding, advanced technological capabilities, and established institutions dedicated to environmental studies, enabling the integration of ML techniques in GDE assessments [38,39]. In contrast, many African nations face challenges including limited research funding, a shortage of trained professionals in data science and ecology, and less access to advanced technologies necessary for deploying ML algorithms and analyzing large datasets [40,41]. Furthermore, the disparity in data availability plays a crucial role in the application of ML in ecosystem studies. High-resolution data from remote sensing technologies and comprehensive ecological datasets in developed countries facilitate the use of sophisticated ML techniques that require vast and diverse training datasets [30,42]. Conversely, data scarcity in Africa, often compounded by inadequate localized studies or insufficient infrastructure for data collection, hampers the effective application of ML in assessing GDEs [43,44]. For instance, research in the USA and Europe leverages data collected over decades, while in many African regions, similar datasets are often still being compiled or are inconsistent and fragmented [40,41,45]. 
Additionally, there is generally greater awareness and recognition of the ecological importance of GDEs in developed nations, which translates into proactive environmental policies and conservation efforts that incorporate ML methodologies [30,46]. Consequently, there is an opportunity to expand research on GDEs and to build capacity for leveraging ML and freely available datasets to support GDE conservation efforts, particularly in African countries.

4. Analytical Algorithms for Evaluating GDEs Using Remotely Sensed Data

Compared to traditional field monitoring, GIS and remote sensing offer rapid and efficient methods for GDE assessment. GIS approaches can be broadly categorized as either human-computer interactive visual interpretation or automated (supervised and unsupervised) classification methods [20,47,48]. While visual interpretation, conducted by experienced experts, yields high-quality classifications, it is time-consuming, prone to bias, and difficult to scale [20,40]. Conversely, automated, pixel-based classifiers are swift, cost-effective, and scalable across various temporal and spatial scales. Consequently, they are widely employed in GDE monitoring [14,49,50]. For example, ref. [50] successfully mapped water-dependent wetlands and their temporal changes using a combination of hierarchical object-based image analysis, user-defined parameters, and a classification regression tree algorithm, achieving high mapping accuracies (>90%). This highlights the efficacy of automated methods for rapid GDE classification. Unsupervised classification, an automated technique involving clustering without human input, offers unbiased and rapid image classification. However, it often suffers from lower accuracy, particularly with low-quality or sparsely distributed data [51]. Supervised classification, which requires training models with manually selected sampling points, generally yields more accurate results because the classifier is guided by analyst-provided training samples [51]. The accuracy of supervised classification nevertheless depends on the method used and on data quality.
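To make the distinction between unsupervised and supervised classification concrete, the sketch below clusters a handful of pixel values with plain k-means, with no training labels involved. It is an illustration under stated assumptions rather than a method from the reviewed studies: the function name kmeans_1d is hypothetical, the single-band vegetation-index values are invented, and real workflows cluster multi-band pixels.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Cluster scalar values into k groups with plain k-means (Lloyd's algorithm)."""
    ordered = sorted(values)
    # Deterministic initialisation: spread the starting centres over the sorted data.
    centres = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centres[c]))
            groups[nearest].append(v)
        # Recompute each centre as the mean of its group (keep it if the group is empty).
        new_centres = [sum(g) / len(g) if g else centres[i]
                       for i, g in enumerate(groups)]
        if new_centres == centres:
            break
        centres = new_centres
    labels = [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]
    return centres, labels


# Hypothetical vegetation-index values for eight pixels (illustrative only).
pixel_values = [0.10, 0.12, 0.15, 0.11, 0.62, 0.70, 0.66, 0.58]
centres, labels = kmeans_1d(pixel_values, k=2)
print(centres, labels)
```

The clusters carry no thematic meaning until an analyst inspects them and decides, for example, that the high-index cluster corresponds to green vegetation. A supervised classifier instead learns that mapping directly from labelled sampling points.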

With the advancement of satellite sensors providing high and moderate resolution imagery, object-based image (OBI) classification has gained prominence. OBI classification, which considers spectral signatures and pixel characteristics like texture, shape, and spatial context, improves upon pixel-based classifiers [52,53]. OBI is particularly suited for high-resolution datasets, as image segmentation reduces heterogeneity and noise [54]. This enhances spectral separability for classifying complex features like dryland GDEs, which have inconsistent shapes and spectral responses [55,56]. However, image segmentation in OBI can reduce accuracy if not properly calibrated, leading to mis-segmented images and classification errors [57,58]. Hybrid methods, combining pixel-based classification and object-based segmentation, have demonstrated improved vegetation monitoring accuracies (>90%) [59,60,61,62], indicating the potential of OBI classification for GDEs, especially when integrated with high spatial resolution data and advanced machine learning algorithms. While single indicator methods from unsupervised classification can be used for GDE classification, they often overlook crucial factors, resulting in lower accuracy [51]. Multi-scale object-based image analyses, which incorporate numerous variables and image segmentation, provide contextual information vital for classifying heterogeneous GDEs. However, managing these large datasets requires appropriate analytical algorithms.
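One simple way to combine pixel-based classification with object-based segmentation is to let every pixel inherit the majority class of the segment it falls in. The sketch below illustrates that idea only; it assumes the segment identifiers come from a separate segmentation step that is not shown, and the function name refine_by_segment, the class names, and the data are all hypothetical.

```python
from collections import Counter, defaultdict


def refine_by_segment(pixel_labels, segment_ids):
    """Assign every pixel the majority class of the segment (object) it belongs to."""
    if len(pixel_labels) != len(segment_ids):
        raise ValueError("pixel_labels and segment_ids must have the same length")
    votes = defaultdict(Counter)
    for label, segment in zip(pixel_labels, segment_ids):
        votes[segment][label] += 1
    # most_common(1) returns the top class; ties go to the class seen first.
    majority = {segment: counts.most_common(1)[0][0]
                for segment, counts in votes.items()}
    return [majority[segment] for segment in segment_ids]


# Hypothetical per-pixel predictions and segment identifiers (illustrative only).
pixel_labels = ["veg", "veg", "soil", "veg", "soil", "soil", "veg", "soil"]
segment_ids = [1, 1, 1, 1, 2, 2, 2, 2]
refined = refine_by_segment(pixel_labels, segment_ids)
print(refined)
```

The isolated "soil" pixel inside segment 1 and the isolated "veg" pixel inside segment 2 are overruled by their neighbours, which is how segmentation suppresses salt-and-pepper noise. The same mechanism also explains why a poorly calibrated segmentation propagates errors to every pixel in a mis-drawn object.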

Machine learning algorithms, such as Random Forest (RF), Classification and Regression Trees (CART), and Support Vector Machines (SVM), have proven effective in mapping and modelling GDE distribution, in part because they are less sensitive to the choice of input features [56,62,63]. These algorithms can handle diverse data types and correlated variables, which is particularly useful in GDE studies [62]. Table 1 provides a summary of commonly used ML algorithms for GDE classification, including decision trees, SVM, RF, Naïve Bayes (NB), Gradient Tree Boosting (GTB), and Artificial Neural Networks (ANN). Decision trees are valued for their interpretability, while SVMs excel with high-dimensional data [63,64]. RF, by aggregating multiple decision trees, improves predictive accuracy and reduces overfitting [64,65]. ANNs are adept at modelling non-linear relationships, which are common in ecological data. In refs. [56,64,66], the authors demonstrated the effectiveness of RF, SVM, and CART in mapping seasonal wetlands, while NB performed poorly. Ensemble learning methods and hybrid models, blending deep learning with traditional statistics, are also gaining traction, offering improved accuracy and interpretability [67].
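The idea of aggregating many decision trees can be illustrated with a deliberately simplified ensemble: each "tree" is a one-split decision stump trained on a bootstrap sample and a randomly chosen feature, and the ensemble predicts by majority vote. This is a teaching sketch, not a full Random Forest and not the implementation used in the reviewed studies. The function names, the two features, and the training values are hypothetical.

```python
import random
from collections import Counter


def fit_stump(rows, labels, feature):
    """Find the threshold on one feature that misclassifies the fewest samples."""
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
        right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
        left_class = Counter(left).most_common(1)[0][0]
        right_class = Counter(right).most_common(1)[0][0] if right else left_class
        errors = (sum(lab != left_class for lab in left)
                  + sum(lab != right_class for lab in right))
        if best is None or errors < best[0]:
            best = (errors, feature, threshold, left_class, right_class)
    return best[1:]  # (feature, threshold, left_class, right_class)


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and one random feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        feature = rng.randrange(len(rows[0]))           # random feature choice
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], feature))
    return forest


def predict(forest, row):
    """Majority vote over all stumps."""
    votes = Counter(left if row[f] <= t else right for f, t, left, right in forest)
    return votes.most_common(1)[0][0]


# Hypothetical training pixels: [dry-season greenness, wetness index] (illustrative).
rows = [[0.15, 0.20], [0.20, 0.10], [0.10, 0.25], [0.25, 0.15], [0.18, 0.22],
        [0.70, 0.80], [0.80, 0.75], [0.75, 0.90], [0.85, 0.70], [0.65, 0.85]]
labels = ["non-GDE"] * 5 + ["GDE"] * 5
forest = fit_forest(rows, labels)
print(predict(forest, [0.05, 0.08]), predict(forest, [0.90, 0.95]))
```

Because each stump sees a different resample of the data, individual errors tend not to coincide, and the vote is more stable than any single tree. Production work would normally rely on an established library implementation rather than a hand-written ensemble.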

These ML algorithms, however, come with limitations, such as slow training and biases in RF, underperformance with large data in SVM, longer training times in CART, and inability to model class dependencies in NB [56]. Studies like ref. [53] have shown that combining airborne imagery with RF and OBIA can effectively map vegetation in complex environments. Ancillary data, like topography and geology, can further enhance accuracy, especially when combined with supplementary variables and indices. Deep learning algorithms, such as Convolutional Neural Networks (CNNs) and Multilayer Perceptrons (MLPs), along with hyperparameter optimisation, are also increasingly used in ecosystem studies [65]. Furthermore, the adaptability of ML models to evolving datasets and environmental contexts is crucial, particularly in ecosystems affected by climate change [68,69]. Feature selection and data preprocessing are essential for improving model performance and reducing computational costs [70,71,72,73]. Finally, interpretability techniques, like SHAP, are vital for understanding the mechanisms driving predictions [67,71].

While a diverse array of ML algorithms can be applied to GDE assessments, several critical factors influence ML model selection. Firstly, the specific type of GDE assessment required, whether it involves mapping distribution, analysing vegetation dynamics, or modelling ecohydrological processes, dictates the suitability of different ML techniques. For instance, detailed vegetation analysis may benefit from algorithms capable of handling high-dimensional spectral data, like SVM or RF, while time-series analysis of GDE dynamics might necessitate Recurrent Neural Networks (RNNs) or Long Short-Term Memory (LSTM) networks. Secondly, data quality is paramount. This includes considering the spatial and spectral resolution of the remote sensing data, as well as the frequency of data acquisition. High spatial resolution data, such as that from airborne imagery or advanced satellite sensors, allow for detailed object-based image analysis, which is particularly effective with algorithms that can leverage textural and contextual information. Conversely, coarse resolution data might necessitate pixel-based approaches with algorithms robust to noise and variability. Spectral resolution influences the ability to distinguish subtle differences in vegetation and soil characteristics, which is crucial for accurately classifying diverse GDE components. The temporal frequency of data collection impacts the ability to monitor GDE changes over time, requiring algorithms that can handle time-series data effectively. Thirdly, the availability and quality of training data are fundamental. Supervised ML algorithms, such as SVM, RF, and CNNs, rely on labelled training data to learn patterns and make accurate predictions. Insufficient or biased training data can lead to significant errors in classification and modelling. Therefore, it is crucial to ensure that training datasets are representative of the variability within the study area and are of sufficient size to train robust models.

In summary, robust ML algorithms are indispensable for accurately mapping GDEs, which is crucial for understanding their function and monitoring their response to environmental changes. The multifaceted nature of ML, combining various algorithms and techniques, enhances prediction accuracy and improves our understanding of ecological processes, which is essential for effective conservation and management. However, the capabilities of ML algorithms can be realized only when they are applied to appropriate objectives with suitable data and sufficient computational capacity. Consequently, further research is needed to determine appropriate, region-specific ML algorithms for assessing the ecohydrological dynamics of dryland GDEs.

5. Analysis of Machine Learning Algorithms for Detecting Groundwater-Dependent Ecosystems (GDEs) and Composition

Based on a review of existing studies, the key models for detecting GDEs and assessing vegetation diversity within GDEs are RF, SVM, CNN, and object-based image analysis (OBIA) (Table 2). Random Forest is an ensemble learning algorithm that combines multiple decision trees to improve classification accuracy and reduce overfitting. It has demonstrated superior performance in GDE mapping because it handles high-dimensional remote sensing data (e.g., Landsat, Sentinel-2, UAV imagery), and it has been successfully applied to classify wetlands, springs, and surface water bodies. Studies report RF achieving 99.9% accuracy in surface water mapping [74] and 93.3% accuracy in wetland GDE classification [75]. For instance, ref. [76] demonstrated that RF models had moderate to high agreement with field and remote sensing training data. In their study, a comparative analysis of Analytic Hierarchy Process (AHP) and RF models was conducted to map groundwater-dependent vegetation (GDV) across Mediterranean biomes in the USA, South Africa, Italy, and Australia. The study highlighted variations in the type and quality of spatial datasets across these regions, which presented a significant challenge to a unified global analysis. However, the researchers were able to compare the performance of machine learning models based on the intensity of field training data versus remote sensing data. Notably, the RF model trained with extensive field data exhibited the highest performance (Kappa coefficient, K = 0.76). The study further revealed that the type of input data significantly influenced variable importance (weighting), which, in turn, affected model performance. Additionally, model overestimation was observed, potentially attributable to the misclassification of drought-resistant vegetation as GDV. 
Furthermore, unlike traditional statistical methods, RF does not require data to follow a specific distribution. It is resilient to noise and missing values, making it suitable for complex ecological data. RF provides insights into the most important predictors, aiding in the interpretation of ecological drivers affecting GDEs. Compared to Support Vector Machines (SVM) and decision trees (CART, C4.5), RF consistently shows higher accuracy and robustness in mapping GDEs. For example, ref. [52] evaluated the capabilities of RF, SVM, CART, and NB to characterize and map small, seasonal wetlands in an arid environment using field and Sentinel-2 data. RF performed the best, with an overall accuracy of 85%, NB performed the worst (25%), and the other models produced moderate accuracies. Additionally, they noted that model performance was low for wetlands with limited field training data. Moreover, each algorithm exhibited limitations: RF trained slowly and was biased when dealing with categorical data, SVM underperformed on large datasets, CART required more training time, and NB could not model dependencies among classes. RF also struggles with very high-resolution imagery, where deep learning approaches, such as CNN, outperform it, and it is less effective in capturing fine-scale vegetation details compared to object-based or deep learning models.
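The overall accuracy and Kappa coefficient quoted above are both derived from a confusion matrix of reference versus predicted classes. The sketch below shows the standard calculation for Cohen's kappa; the function name accuracy_and_kappa is hypothetical and the matrix values are invented for illustration, not taken from any of the cited studies.

```python
def accuracy_and_kappa(matrix):
    """Return (overall accuracy, Cohen's kappa).

    matrix[i][j] is the number of samples whose reference class is i
    and whose predicted class is j.
    """
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical two-class confusion matrix (rows: reference, columns: predicted).
confusion = [[42, 8],
             [12, 38]]
overall_accuracy, kappa = accuracy_and_kappa(confusion)
print(f"overall accuracy = {overall_accuracy:.2f}, kappa = {kappa:.2f}")
```

Kappa discounts the agreement that would occur by chance, so it is always lower than overall accuracy for an imperfect classifier and is the more conservative of the two when classes are imbalanced.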

CNNs are deep learning architectures designed to extract spatial and spectral features automatically from images. They are particularly effective in high-resolution vegetation mapping and classification. CNNs capture both spectral and textural differences within vegetation communities, making them ideal for fine-scale classification. UAV-based wetland and vegetation classification studies reported CNNs achieving 94.7% overall accuracy [83]. Unlike RF, which primarily relies on predefined features (e.g., NDVI, texture), CNNs learn deep, hierarchical patterns, enabling better differentiation of vegetation classes. CNNs outperform pixel-based methods in distinguishing multiple vegetation types and structural variations at fine scales. This is useful for segmenting heterogeneous landscapes found in GDEs. However, CNNs require large, labelled datasets and computational resources. Training can be slow, and performance depends on the quality of training data. Moreover, CNNs are less interpretable compared to RF, making ecological insights harder to extract. The study by [83] utilized UAVs and CNNs to estimate the cover of Phragmites australis reeds and other wetland features in a large inland floodplain wetland in Western New South Wales, Australia. Their validation process demonstrated an overall high precision and reproducibility in recognizing Phragmites australis reeds and other wetland features. The model demonstrated that it could correctly identify Phragmites australis with a TP rate of >98% and an FP rate of 0.002%. The very high accuracy and precision in recognizing Phragmites australis was attributed to the plant having a distinct structure and phenology compared with the other vegetation at each site. Machine learning using remote sensing has been shown to be most effective when the target species has unique spectral and textural differentiation [83,85].
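The true-positive (TP) and false-positive (FP) rates reported for a single target class follow directly from the four binary detection counts. A minimal sketch, assuming hypothetical counts and a hypothetical function name detection_rates (the numbers below are not those of ref. [83]):

```python
def detection_rates(tp, fp, tn, fn):
    """Return (TP rate, FP rate, precision) for one target class."""
    if tp + fn == 0 or fp + tn == 0 or tp + fp == 0:
        raise ValueError("a denominator is zero; rates are undefined")
    tp_rate = tp / (tp + fn)      # share of true target pixels that were detected
    fp_rate = fp / (fp + tn)      # share of non-target pixels flagged as target
    precision = tp / (tp + fp)    # share of detections that are correct
    return tp_rate, fp_rate, precision


# Hypothetical validation counts for one target species (illustrative only).
tp_rate, fp_rate, precision = detection_rates(tp=90, fp=5, tn=895, fn=10)
print(f"TP rate = {tp_rate:.3f}, FP rate = {fp_rate:.4f}, precision = {precision:.3f}")
```

When the target class covers a small fraction of the scene, the FP rate can look very small even if a noticeable share of detections is wrong, which is why precision is worth reporting alongside it.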

Integrating OBIA, which segments images into meaningful objects before classification, reduces the misclassification errors common in pixel-based ML approaches. Pixel-based ML classifies each pixel independently, which may lead to spectral confusion in heterogeneous landscapes, whereas OBIA aggregates similar pixels into objects, improving classification accuracy. OBIA is particularly effective in segmenting scattered vegetation patches, which are common in dryland GDEs. A study showed that fusing OBIA with CNN increased classification accuracy by 25% [59]. Furthermore, OBIA minimizes spectral confusion, making it suitable for mapping mixed ecosystems such as wetlands, riparian zones, and phreatophytes. However, integrating OBIA requires expert knowledge to fine-tune segmentation parameters. It is computationally intensive for large datasets, and its performance depends on segmentation quality and feature selection. Cellular automata (CA) can also enhance machine learning models by incorporating spatial dependencies, enabling dynamic simulations, and improving accuracy in environmental predictions. In the study by [86], an ANN-CA model simulates wetlands by capturing complex, non-linear interactions while ensuring spatial continuity. Unlike traditional models, CA allows fine-grained, pixel-level predictions and iterative forecasting, making it well suited to long-term wetland monitoring. By integrating ANN with CA, the model achieves better generalization, higher accuracy, and improved interpretability.
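The way object-level aggregation suppresses isolated pixel errors can be illustrated with a majority vote inside each segment. The sketch below assumes that a segmentation step has already assigned a segment ID to every pixel; the function name, the toy data, and the class labels are hypothetical, and real OBIA workflows typically classify objects from object-level features rather than from pixel votes alone.

```python
from collections import Counter, defaultdict


def majority_vote_by_segment(pixel_labels, segment_ids):
    """Replace each pixel label with the most common label in its segment."""
    votes = defaultdict(Counter)
    for label, seg in zip(pixel_labels, segment_ids):
        votes[seg][label] += 1
    # Pick the winning class for every segment.
    winner = {seg: counts.most_common(1)[0][0] for seg, counts in votes.items()}
    return [winner[seg] for seg in segment_ids]


# Toy flattened "image": per-pixel classifier output and segment membership.
pixel_labels = ["wetland", "wetland", "bare", "wetland",
                "bare", "bare", "wetland", "bare"]
segment_ids = [1, 1, 1, 1, 2, 2, 2, 2]

print(majority_vote_by_segment(pixel_labels, segment_ids))
```

In this toy case, the single "bare" pixel inside segment 1 and the single "wetland" pixel inside segment 2 are overruled by their neighbours, which is the salt-and-pepper effect that object-based approaches are meant to reduce. The result is only as good as the segmentation: a segment that straddles two real classes will be labelled wrongly as a whole.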

By adopting a hybrid approach, researchers can maximize classification accuracy while improving the ecological interpretability of results. For instance, RF is suitable for initial classification of GDEs using multi-spectral data, CNN for high-resolution vegetation segmentation, particularly from UAV and hyperspectral imagery, and OBIA to refine classification by incorporating object-based segmentation. This combination leverages the strengths of each method to improve GDE detection and vegetation diversity assessment, ensuring high accuracy and ecological interpretability. Further research should focus on optimizing these models for specific environmental conditions and integrating additional data sources, such as LiDAR and climate models, for enhanced GDE monitoring.

6. Challenges in Determining a Suitable Machine Learning Model for GDE Detection and Vegetation Diversity Analysis

The determination of the most effective machine learning model for detecting and assessing groundwater-dependent water bodies, wetlands, and vegetation is complicated by the diverse evaluation methods employed across different studies. Each model has unique strengths and weaknesses, depending on the characteristics of the datasets and the evaluation metrics utilized, thereby impacting interpretability and decision-making in model selection.

Ref. [87] evaluated the machine learning algorithms SVM and RF for extracting water bodies using the Modified Normalized Difference Water Index (MNDWI). They highlighted the importance of the evaluation framework, as results varied significantly depending on surface complexity; simple threshold methods were effective for uncomplicated environments, whereas supervised methods like SVM were recommended for more challenging scenarios. This suggests that model selection depends strongly on the chosen evaluation strategy, as different environments demand different approaches. Furthermore, ref. [88] emphasizes the need for varied performance metrics when comparing ML methods. They investigated multiple models using R², RMSE, and MAE, demonstrating how different evaluation metrics can lead to varying conclusions regarding model effectiveness. This reinforces the notion that evaluation methods can bias the perceived performance of a model, complicating the selection process, as stakeholders may prioritize different aspects based on specific application needs. Moreover, refs. [89,90] adopt different methods for detecting water bodies: ref. [89] uses unsupervised techniques, while ref. [90] employs object-based analysis through machine learning algorithms such as RF and SVM. This methodological variety contributes to uncertainty in the field, as the outcomes of different methods are not directly comparable without a unified evaluation rubric. The mix of model evaluation approaches therefore makes it challenging to determine which model is best suited for specific applications.
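How the choice of metric can change which model looks best can be shown with a small worked example. The sketch below computes RMSE, MAE, and R² for two hypothetical sets of predictions; the values are invented for illustration and do not come from [88].

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: penalizes large errors more heavily."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error: every unit of error counts equally."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical observations and predictions (illustration only).
y_true = [10, 12, 14, 16, 18]
pred_a = [10, 12, 14, 16, 23]  # model A: exact except for one large miss
pred_b = [12, 10, 16, 14, 20]  # model B: a moderate error on every sample

for name, pred in (("A", pred_a), ("B", pred_b)):
    print(name, f"RMSE={rmse(y_true, pred):.3f}",
          f"MAE={mae(y_true, pred):.3f}", f"R2={r2(y_true, pred):.3f}")
```

Here model A has the lower MAE, while model B has the lower RMSE and the higher R², so the "better" model depends on whether occasional large errors or typical errors matter more for the application.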

The evaluation challenge increases when considering the influence of data sources and feature selection strategies, as highlighted by [91]. They note that the performance of ML models for estimating vegetation diversity can vary greatly depending on the multidimensional spectral data used. For instance, models based on a single data source, such as Sentinel-2 or LiDAR alone, performed worse than the model that combined them. This finding suggests that differences in data characteristics, combined with evaluation methods, could complicate the decision-making process regarding optimal model selection. Additionally, the ensemble learning approach advocated by [92] adds another layer of complexity to model comparison. By integrating multiple machine learning algorithms, the ensemble model aims for higher accuracy. However, it raises questions regarding how the performance of individual models compares to ensembles when different evaluation metrics are selectively applied, complicating the task of identifying a single best model. Therefore, a cohesive evaluation framework is essential for future studies to ensure comparability and clarity in model selection. This framework should prioritize the use of standardized evaluation metrics and benchmark datasets, which would enable fairer comparisons across studies employing diverse models. Consequently, this standardization would facilitate the identification of the most appropriate machine learning approach for specific environments and available data.
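A simple form of ensembling is majority voting over the class predictions of several models. The sketch below is a generic illustration, not the ensemble method of [92]; the labels and the three sets of model outputs are hypothetical. It scores the individual models and the ensemble against the same reference labels with the same metric, which is the kind of like-for-like comparison a common evaluation framework would require.

```python
from collections import Counter


def majority_vote(*model_predictions):
    """Combine per-sample class predictions by taking the most common vote."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*model_predictions)]


def accuracy(y_true, y_pred):
    """Share of samples whose predicted class matches the reference label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical reference labels (1 = GDE, 0 = non-GDE) and model outputs.
truth = [1, 1, 0, 0, 1, 0, 1, 0]
model_1 = [1, 1, 0, 0, 1, 0, 0, 1]
model_2 = [1, 0, 0, 1, 1, 0, 1, 0]
model_3 = [0, 1, 0, 0, 0, 0, 1, 0]

ensemble = majority_vote(model_1, model_2, model_3)
for name, pred in (("model_1", model_1), ("model_2", model_2),
                   ("model_3", model_3), ("ensemble", ensemble)):
    print(name, accuracy(truth, pred))
```

The toy data are deliberately constructed so that the three models make their errors on different samples, which is the best case for voting. When the models' errors are correlated, the gain from ensembling is smaller, and it may vanish under a different metric.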

7. Application of Machine Learning Assessments of GDEs: Challenges and Recommendations

The application of machine learning (ML) algorithms to managing wetlands, vegetation, and surface water bodies influenced by groundwater presents significant challenges, although promising mitigation measures exist. These challenges primarily revolve around the complexity of ecological dynamics, data quality, interpretability issues, and regional variability.

One critical challenge lies in the complexity of the GDEs themselves. As pointed out by [53,93], the performance of ML algorithms varies when applied in different geographical contexts, complicating the formulation of generalized models for tasks such as ecological mapping. This ecological complexity adds layers of unpredictability, making it difficult to rely solely on historical data for accurate predictions. Furthermore, ref. [94] highlights that standard ML methods, such as deep neural networks, act as “black boxes”, resulting in poor interpretability of their outputs. This poses significant challenges in environmental management, where understanding underlying mechanisms is crucial.

Moreover, data quality and availability pose formidable obstacles to effective ML applications. In many ecosystems, data on water quality parameters are sparse and inconsistently gathered [95,96,97]. This lack of robust datasets makes it difficult to train ML models effectively, resulting in inaccurate or unreliable predictions. Ref. [98] emphasizes the necessity of ongoing monitoring of water quality parameters to ensure that any predictive model retains its reliability over time. Thus, even with sophisticated algorithms, the absence of comprehensive data undermines the potential of ML applications.

To mitigate these challenges, several strategies can be employed. First, data redundancy can be reduced. The study by [99] demonstrates the effectiveness of using dimensionality reduction techniques to extract meaningful variables from multiple earth observation sources, such as Digital Elevation Models (DEMs), optical imagery, and Synthetic Aperture Radar (SAR) data. This approach not only streamlines the model input but also enhances classification accuracy by focusing on uncorrelated variables that best represent the study area. Moreover, utilizing complementary datasets from missions such as Sentinel-1 and Sentinel-2 can provide superior insights into GDE dynamics [99,100]. Ref. [101] highlights the benefits of integrating multiple data types, boosting the accuracy and comprehensiveness of GDE classification efforts [101,102]. This synergistic application allows for a more holistic understanding of GDE features while addressing challenges posed by spectral confusion between classes. Additionally, enhancing data collection practices through IoT technology could improve the richness and accuracy of datasets used for training ML models, as indicated by [103]. This could enable more real-time monitoring and adjustments based on recent ecological changes [103,104].
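One simple way to reduce redundancy among predictors is to drop any variable that is highly correlated with one already kept. The sketch below is a generic greedy correlation filter, not the dimensionality-reduction procedure used in [99]; the variable names, the sample values, and the 0.9 threshold are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def drop_correlated(features, threshold=0.9):
    """Keep features in order, skipping any with |r| > threshold to a kept one."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical per-sample predictor values (illustration only).
features = {
    "ndvi": [0.21, 0.35, 0.50, 0.62, 0.80],
    "evi": [0.18, 0.30, 0.46, 0.55, 0.74],  # nearly duplicates ndvi
    "sar_vv": [-12.0, -9.5, -13.1, -8.7, -11.2],
    "elevation": [410.0, 395.0, 402.0, 388.0, 420.0],
}

print(drop_correlated(features))
```

Because the filter is greedy, the outcome depends on the order in which variables are presented: the first member of a correlated group is kept and later members are dropped. Techniques such as principal component analysis avoid this order dependence at the cost of less directly interpretable inputs.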

Furthermore, utilizing hybrid ML models that integrate domain knowledge, as discussed by [105], may improve predictive accuracy while adhering to ecological constraints. Such models could be particularly effective in ensuring that biological realities are considered during prediction, thus enhancing model reliability and interpretability. Another significant mitigation measure is employing advanced algorithms capable of generating interpretable results. RF and Gradient Boosting tend to provide better interpretability and can handle diverse datasets more efficiently [106]. For instance, Wen and Hughes emphasize the efficacy of ensemble techniques over traditional single-algorithm approaches, providing robust classification outcomes against changing environmental conditions [107]. Additionally, ensemble neural network architectures, as showcased in the work of [102], combine different classifiers to enhance prediction accuracy while minimizing errors associated with individual model limitations. This can be particularly useful in complex GDEs where distinct classes often overlap. Additionally, ref. [108] applied deep convolutional networks to achieve high accuracy in classifying GDEs based on intricate features derived from high-resolution datasets. Such models are capable of learning complex patterns that traditional methods may overlook.

Moreover, the innovative application of vision transformers has demonstrated potential in enhancing classification processes [109]. These approaches leverage vast amounts of data for training while accommodating the unique aspects of wetland environments, thus increasing the reliability of classifications. Another vital strategy involves specialized pre-processing and post-processing techniques tailored to remote sensing data. Properly handling atmospheric corrections and aligning images temporally can significantly improve model performance. The authors of [52] demonstrated that OBIA techniques can mitigate challenges arising from noise and overfitting inherent in ML models, allowing for more stable and accurate classifications [56]. In post-processing phases, applying thresholding and reclassification based on hydrologic conditions can refine preliminary results, allowing for iterative improvement and validation of the classification model [79,110]. This feedback mechanism can incorporate continuous learning and adaptation to shifting ecological conditions, enhancing long-term monitoring capabilities. Furthermore, adaptive and iterative modeling techniques should be employed, as they allow models to improve over time through feedback loops based on updated data, effectively adapting to ecological changes, as suggested by [95]. Continuous model evaluation can ensure that predictions remain relevant as ecological conditions evolve.
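The post-processing step of thresholding and reclassifying on hydrologic conditions can be sketched as a simple rule applied to a preliminary classification. In the example below, the class names, the depth-to-groundwater values, and the 10 m cut-off are hypothetical placeholders chosen for illustration; an appropriate threshold would depend on the vegetation type and the study area and is not specified by the cited studies.

```python
def reclassify_by_water_table(labels, depth_to_water_m, max_depth_m=10.0):
    """Demote 'potential_gde' pixels whose water table is deeper than max_depth_m."""
    refined = []
    for label, depth in zip(labels, depth_to_water_m):
        if label == "potential_gde" and depth > max_depth_m:
            # Groundwater assumed out of reach of roots under this rule.
            refined.append("non_gde_vegetation")
        else:
            refined.append(label)
    return refined


# Hypothetical preliminary classification and depth-to-groundwater per pixel.
labels = ["potential_gde", "potential_gde", "bare", "potential_gde"]
depth_to_water_m = [3.5, 25.0, 40.0, 9.9]

print(reclassify_by_water_table(labels, depth_to_water_m))
```

Only pixels that the classifier already flagged are affected, so the rule can remove false positives but cannot recover missed GDEs. Rerunning it as updated groundwater data become available is one way to implement the iterative refinement described above.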

Additionally, fostering interdisciplinary collaboration between ecologists and computer scientists would enhance model development and the interpretability of results within ecological contexts [111]. Continuous education and training of practitioners in these integrated approaches is crucial, as it can enhance effective communication of model outcomes to stakeholders involved in water management decisions.

8. Conclusions

The literature review highlights that the use of advanced machine learning for determining or assessing the spatial distribution of GDEs is still in its infancy. Furthermore, ML applications in GDE studies are concentrated in China, the USA, and Germany. Application in Africa remains limited, which presents opportunities for further work in that region. Machine learning is a multifaceted field that combines various algorithms and techniques to tackle complex ecological data challenges. The most widely used ML algorithms in GDE studies are RF, SVM, and XGBoost. The integration of these methods into ecological research is not only enhancing prediction accuracy but also improving our understanding of ecological processes, which is essential for effective conservation and management practices. Various machine learning techniques demonstrate differing performance levels in assessing groundwater-dependent ecosystems under varying conditions. Random Forest and Artificial Neural Networks show robust performance across multiple contexts, while Support Vector Machines provide accurate predictions under specific conditions. Deep learning models capture temporal complexities effectively, particularly in dynamic environments such as agricultural landscapes, while ensemble methods enhance overall predictive capabilities. The key challenges in the application of ML for GDE studies are ecological complexity and regional variability, which complicate model generalization, and sparse and inconsistent data, which limit training effectiveness. Many ML models also lack interpretability, making it difficult to understand underlying mechanisms, which is crucial for environmental management. These limitations can be mitigated by embracing hybrid models, improving data collection, ensuring model interpretability, and fostering interdisciplinary collaboration. With these measures, the potential of machine learning in this crucial area can be more fully realized. As research continues to evolve in this field, these methods offer robust solutions to advance the precision and reliability of GDE assessments.

Author Contributions

Conceptualization, C.N.C., T.D. (Timothy Dube) and T.D. (Tatenda Dalu); methodology, C.N.C.; software, C.N.C.; validation, C.N.C., T.D. (Timothy Dube), T.D. (Tatenda Dalu), and M.S.; formal analysis, C.N.C.; investigation, C.N.C.; data curation, C.N.C.; writing—original draft preparation, C.N.C.; writing—review and editing, C.N.C., T.D. (Timothy Dube), T.D. (Tatenda Dalu), and C.S.; visualization, C.N.C.; supervision, C.N.C., T.D. (Timothy Dube), and T.D. (Tatenda Dalu); project administration, T.D. (Timothy Dube) and T.D. (Tatenda Dalu); funding acquisition, T.D. (Timothy Dube) and T.D. (Tatenda Dalu). All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Water Research Commission, grant number C2022/2023-00902.

Acknowledgments

We are grateful to the Water Research Commission for funding this work under project number C2022/2023-00902.

Conflicts of Interest

The authors declare no conflict of interest.

References

Al-Fugara, A.; Pourghasemi, H.R.; Al-Shabeeb, A.R.; Habib, M.; Al-Adamat, R.; Al-Amoush, H.; Collins, A.L. A comparison of machine learning models for the mapping of groundwater spring potential. Environ. Earth Sci. 2020, 79, 206. [Google Scholar] [CrossRef]
Morsy, K.M.; Alenezi, A.; Alrukaibi, D.S. Groundwater and dependent ecosystems: Revealing the impacts of climate change. Int. J. Appl. Eng. Res. 2017, 12, 3919–3926. [Google Scholar]
Nevill, J.C.; Hancock, P.J.; Murray, B.R.; Ponder, W.F.; Humphreys, W.F.; Phillips, M.L.; Groom, P.K. Groundwater-dependent ecosystems and the dangers of groundwater overdraft: A review and an Australian perspective. Pac. Conserv. Biol. 2010, 16, 187–208. [Google Scholar] [CrossRef]
Dams, J.; Salvadore, E.; Van Daele, T.; Ntegeka, V.; Willems, P.; Batelaan, O. Spatio-temporal impact of climate change on the groundwater system. Hydrol. Earth Syst. Sci. 2012, 16, 1517–1531. [Google Scholar] [CrossRef]
Essam, D.; Ahmed, M.; Abouelmagd, A.; Soliman, F. Monitoring temporal variations in groundwater levels in urban areas using ground penetrating radar. Sci. Total. Environ. 2020, 703, 134986. [Google Scholar] [CrossRef]
Barbieri, M.; Barberio, M.D.; Banzato, F.; Billi, A.; Boschetti, T.; Franchini, S.; Gori, F.; Petitta, M. Climate change and its effect on groundwater quality. Environ. Geochem. Health 2021, 45, 1133–1144. [Google Scholar] [CrossRef]
Wada, Y.; Van Beek, L.P.H.; Van Kempen, C.M.; Reckman, J.W.T.M.; Vasak, S.; Bierkens, M.F.P. Global depletion of groundwater resources. Geophys. Res. Lett. 2010, 37, L20402. [Google Scholar] [CrossRef]
Eamus, D.; Froend, R.; Loomes, R.; Hose, G.; Murray, B. A functional methodology for determining the groundwater regime needed to maintain the health of groundwater-dependent vegetation. Aust. J. Bot. 2006, 54, 97–114. [Google Scholar] [CrossRef]
Norton, R.K. Planning for resilient and sustainable coastal shorelands and communities in the face of global climate change. Oxf. Res. Encycl. Environ. Sci. 2022. [Google Scholar] [CrossRef]
Baker, C.; Lawrence, R.; Montagne, C.; Patten, D. Mapping wetlands and riparian areas using Landsat ETM+ imagery and decision-tree-based models. Wetlands 2006, 26, 465–474. [Google Scholar] [CrossRef]
Berhanu, M.; Suryabhagavan, K.V.; Korme, T. Wetland mapping and evaluating the impacts on hydrology, using geospatial techniques: A case of Geba Watershed, Southwest Ethiopia. Geol. Ecol. Landscapes 2021, 7, 293–310. [Google Scholar] [CrossRef]
Kalbus, E.; Reinstorf, F.; Schirmer, M. Measuring methods for groundwater—Surface water interactions: A review. Hydrol. Earth Syst. Sci. 2006, 10, 873–887. [Google Scholar] [CrossRef]
Doody, T.M.; Barron, O.V.; Dowsley, K.; Emelyanova, I.; Fawcett, J.; Overton, I.C.; Pritchard, J.L.; Van Dijk, A.I.; Warren, G. Continental mapping of groundwater dependent ecosystems: A methodological framework to integrate diverse data and expert opinion. J. Hydrol. Reg. Stud. 2017, 10, 61–81. [Google Scholar] [CrossRef]
Jones, C.; Stanton, D.; Hamer, N.; Denner, S.; Singh, K.; Flook, S.; Dyring, M. Field investigation of potential terrestrial groundwater-dependent ecosystems within Australia’s Great Artesian Basin. Hydrogeol. J. 2020, 28, 237–261. [Google Scholar] [CrossRef]
Bertrand, G.; Goldscheider, N.; Gobat, J.-M.; Hunkeler, D. Review: From multi-scale conceptualization to a classification system for inland groundwater-dependent ecosystems. Hydrogeol. J. 2012, 20, 5–25. [Google Scholar] [CrossRef]
Bertrand, G.; Siergieiev, D.; Ala-Aho, P.; Rossi, P.M. Environmental tracers and indicators bringing together groundwater, surface water and groundwater-dependent ecosystems: Importance of scale in choosing relevant tools. Environ. Earth Sci. 2014, 72, 813–827. [Google Scholar] [CrossRef]
Kløve, B.; Ala-Aho, P.; Bertrand, G.; Gurdak, J.J.; Kupfersberger, H.; Kværner, J.; Muotka, T.; Mykrä, H.; Preda, E.; Rossi, P.; et al. Climate change impacts on groundwater and dependent ecosystems. J. Hydrol. 2014, 518, 250–266. [Google Scholar] [CrossRef]
Han, L.; He, D. Leafing intensity decreases with increasing water table depth and plant height in Populus euphratica, a desert riparian species. Acta Oecologica 2020, 109, 103672. [Google Scholar] [CrossRef]
Hoyos, I.C.P.; Krakauer, N.Y.; Khanbilvardi, R.; Armstrong, R.A. A review of advances in the identification and characterization of groundwater dependent ecosystems using geospatial technologies. Geosciences 2016, 6, 17. [Google Scholar] [CrossRef]
Orellana, F.; Verma, P.; Loheide II, S.P.; Daly, E. Monitoring and modeling water-vegetation interactions in groundwater-dependent ecosystems. Rev. Geophys. 2012, 50, 1–24. [Google Scholar] [CrossRef]
Zhao, X.; Zhou, D.; Fang, J. Satellite-based Studies on Large-Scale Vegetation Changes in China. J. Integr. Plant Biol. 2012, 54, 713–728. [Google Scholar] [CrossRef] [PubMed]
Eamus, D.; Zolfaghar, S.; Villalobos-Vega, R.; Cleverly, J.; Huete, A. Groundwater-dependent ecosystems: Recent insights from satellite and field-based studies. Hydrol. Earth Syst. Sci. 2015, 19, 4229–4256. [Google Scholar] [CrossRef]
Chiloane, C.; Dube, T.; Shoko, C. Impacts of groundwater and climate variability on terrestrial groundwater dependent ecosystems: A review of geospatial assessment approaches and challenges and possible future research directions. Geocarto Int. 2021, 37, 6755–6779. [Google Scholar] [CrossRef]
Chen, W. Major scientific issues on water demand studying for groundwater-dependent vegetation ecosystems in inland arid regions. Earth Sci. J. Earth Sci. China Univ. Geosci. 2014, 39, 1340–1348. [Google Scholar] [CrossRef]
Kumar, L.; Mutanga, O. Google Earth Engine Applications Since Inception: Usage, Trends, and Potential. Remote Sens. 2018, 10, 1509. [Google Scholar] [CrossRef]
Amani, M.; Ghorbanian, A.; Ahmadi, S.A.; Kakooei, M.; Moghimi, A.; Mirmazloumi, S.M.; Moghaddam, S.H.A.; Mahdavi, S.; Ghahremanloo, M.; Parsian, S.; et al. Google Earth Engine Cloud Computing Platform for Remote Sensing Big Data Applications: A Comprehensive Review. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 5326–5350. [Google Scholar] [CrossRef]
Zhou, B.; Okin, G.S.; Zhang, J. Leveraging Google Earth Engine (GEE) and machine learning algorithms to incorporate in situ measurement from different times for rangelands monitoring. Remote Sens. Environ. 2020, 236, 111521. [Google Scholar] [CrossRef]
Horsley, T.; Dingwall, O.; Sampson, M. Checking reference lists to find additional studies for systematic reviews. Cochrane Database Syst. Rev. 2011, MR000026. [Google Scholar] [CrossRef]
Moghaddam, D.D.; Rahmati, O.; Haghizadeh, A.; Kalantari, Z. A Modeling comparison of groundwater potential mapping in a mountain bedrock aquifer: QUEST, GARP, and RF models. Water 2020, 12, 679. [Google Scholar] [CrossRef]
Rohde, M.M.; Biswas, T.; Housman, I.W.; Campbell, L.S.; Klausmeyer, K.R.; Howard, J.K. A machine learning approach to predict groundwater levels in California reveals ecosystems at risk. Front. Earth Sci. 2021, 9, 784499. [Google Scholar] [CrossRef]
Balogun, A.-L.; Yekeen, S.T.; Pradhan, B.; Althuwaynee, O.F. Spatio-temporal analysis of oil spill impact and recovery pattern of coastal vegetation and wetland using multispectral satellite landsat 8-oli imagery and machine learning models. Remote Sens. 2020, 12, 1225. [Google Scholar] [CrossRef]
Zhou, R.; Yang, C.; Li, E.; Cai, X.; Yang, J.; Xia, Y. Object-based wetland vegetation classification using multi-feature selection of unoccupied aerial vehicle RGB imagery. Remote Sens. 2021, 13, 4910. [Google Scholar] [CrossRef]
Jafarzadeh, H.; Mahdianpari, M.; Gill, E.W.; Brisco, B.; Mohammadimanesh, F. Remote sensing and machine learning tools to support wetland monitoring: A meta-analysis of three decades of research. Remote Sens. 2022, 14, 6104. [Google Scholar] [CrossRef]
Thessen, A. Adoption of machine learning techniques in ecology and earth science. One Ecosyst. 2016, 1, e8621. [Google Scholar] [CrossRef]
Ghannam, R.B.; Techtmann, S.M. Machine learning applications in microbial ecology, human microbiome studies, and environmental monitoring. Comput. Struct. Biotechnol. J. 2021, 19, 1092–1107. [Google Scholar] [CrossRef]
Tuia, D.; Kellenberger, B.; Beery, S.; Costelloe, B.R.; Zuffi, S.; Risse, B.; Mathis, A.; Mathis, M.W.; van Langevelde, F.; Burghardt, T.; et al. Perspectives in machine learning for wildlife conservation. Nat. Commun. 2022, 13, 792. [Google Scholar] [CrossRef]
Dujon, A.; Schofield, G. Importance of machine learning for enhancing ecological studies using information-rich imagery. Endanger. Species Res. 2019, 39, 91–104. [Google Scholar] [CrossRef]
Otoo, N.G.; Sutanudjaja, E.H.; van Vliet, M.T.H.; Schipper, A.M.; Bierkens, M.F.P. Mapping groundwater dependent ecosystems using a high-resolution global groundwater model. Hydrol. Earth Syst. Sci. Discuss. 2024. in review. [Google Scholar] [CrossRef]
Howard, J.K.; Dooley, K.; Brauman, K.A.; Klausmeyer, K.R.; Rohde, M.M. Ecosystem services produced by groundwater dependent ecosystems: A framework and case study in California. Front. Water 2023, 5, 1115416. [Google Scholar] [CrossRef]
Touré, H.; Boateng, C.D.; Gidigasu, S.S.; Wemegah, D.D.; Mensah, V.; Aryee, J.N.; Osei, M.A.; Gilbert, J.; Afful, S.K. A review of geological and climatic variables in groundwater availability prediction in Africa: Machine learning approaches. EarthArXiv eprints 2024. [Google Scholar] [CrossRef]
Gaffoor, Z.; Pietersen, K.; Jovanovic, N.; Bagula, A.; Kanyerere, T. Big data analytics and its role to support groundwater management in the southern African development community. Water 2020, 12, 2796. [Google Scholar] [CrossRef]
Brkić, Ž.; Kuhta, M.; Larva, O.; Gottstein, S. Groundwater and connected ecosystems: An overview of groundwater body status assessment in Croatia. Environ. Sci. Eur. 2019, 31, 75. [Google Scholar] [CrossRef]
Pazola, A.; Shamsudduha, M.; French, J.; MacDonald, A.M.; Abiye, T.; Goni, I.B.; Taylor, R.G. High-resolution long-term average groundwater recharge in Africa estimated using random forest regression and residual interpolation. Hydrol. Earth Syst. Sci. 2023, 28, 2949–2967. [Google Scholar] [CrossRef]
Mensah, V.; Boateng, C.D.; Gidigasu, S.S.; Wemegah, D.D.; Aryee, J.N.; Osei, M.A.; Touré, H.; Gilbert, J.; Afful, S.K. Groundwater exploration methods in west africa: A review. EarthArXiv eprints 2024. [Google Scholar] [CrossRef]
Muhury, N.; Apan, A.A.; Marasani, T.N.; Ayele, G.T. Modelling floodplain vegetation response to groundwater variability using the ArcSWAT hydrological model, MODIS NDVI data, and machine learning. Land 2022, 11, 2154. [Google Scholar] [CrossRef]
Rohde, M.M.; Sweet, S.B.; Ulrich, C.; Howard, J. A Transdisciplinary approach to characterize hydrological controls on groundwater-dependent ecosystem health. Front. Environ. Sci. 2019, 7, 175. [Google Scholar] [CrossRef]
Maskooni, E.K.; Naghibi, S.A.; Hashemi, H.; Berndtsson, R. Application of advanced machine learning algorithms to assess groundwater potential using remote sensing-derived data. Remote Sens. 2020, 12, 2742. [Google Scholar] [CrossRef]
Pedzisai, E.; Mutanga, O.; Odindi, J.; Mushore, T.D. The use of remote sensing indices to understand flood-recharged soil moisture impacts on trees in semi-arid floodplains: A review. Ecohydrology 2022, 15, e2460. [Google Scholar] [CrossRef]
Werstak, C.E.; Housman, I.; Maus, P.; Fisk, H.; Gurrieri, J.; Carlson, C.P.; Johnston, B.C.; Stratton, B.; Hurja, J.C. Groundwater-Dependent Ecosystem Inventory Using Remote Sensing; RSAC-10011-RPT1; U.S. Department of Agriculture, Forest Service, Remote Sensing Applications Center: Washington, DC, USA, 2012. [Google Scholar]
Fitoka, E.; Tompoulidou, M.; Hatziiordanou, L.; Apostolakis, A.; Höfer, R.; Weise, K.; Ververis, C. Water-related ecosystems’ mapping and assessment based on remote sensing techniques and geospatial analysis: The SWOS national service case of the Greek Ramsar sites and their catchments. Remote Sens. Environ. 2020, 245, 111795. [Google Scholar] [CrossRef]
Ismail, M.H. Evaluating supervised and unsupervised techniques for land cover mapping using remote sensing data. Geogr. Malays. J. Soc. Space 2009, 5, 1–10. [Google Scholar]
Gxokwe, S.; Dube, T.; Mazvimavi, D. Leveraging Google Earth Engine platform to characterize and map small seasonal wetlands in the semi-arid environments of South Africa. Sci. Total. Environ. 2022, 803, 150139. [Google Scholar] [CrossRef] [PubMed]
Nguyen, U.; Glenn, E.P.; Dang, T.D.; Pham, L.T. Mapping vegetation types in semi-arid riparian regions using random forest and object-based image approach: A case study of the Colorado River Ecosystem, Grand Canyon, Arizona. Ecol. Informatics 2019, 50, 43–50. [Google Scholar] [CrossRef]
Dlikilili, S. Investigating the Groundwater Dependence and Response To Rainfall Variability of Vegetation in the Touws River and Catchment Using; University of the Western Cape: Cape Town, South Africa, 2019. [Google Scholar]
Dronova, I. Object-Based Image Analysis in Wetland Research: A Review. Remote Sens. 2015, 7, 6380–6413. [Google Scholar] [CrossRef]
Gxokwe, S.; Dube, T.; Mazvimavi, D.; Grenfell, M. Using cloud computing techniques to monitor long-term variations in ecohydrological dynamics of small seasonally-flooded wetlands in semi-arid South Africa. J. Hydrol. 2022, 612, 128080. [Google Scholar] [CrossRef]
Liu, H.-H. Impact of climate change on groundwater recharge in dry areas: An ecohydrology approach. J. Hydrol. 2011, 407, 175–183. [Google Scholar] [CrossRef]
Vizzari, M.; Lesti, G.; Acharki, S. Crop classification in Google Earth Engine: Leveraging Sentinel-1, Sentinel-2, European CAP data, and object-based machine-learning approaches. Geo-Spat. Inf. Sci. 2024, 1–16. [Google Scholar] [CrossRef]
Guirado, E.; Blanco-Sacristán, J.; Rigol-Sánchez, J.P.; Alcaraz-Segura, D.; Cabello, J. A Multi-Temporal Object-Based Image Analysis to Detect Long-Lived Shrub Cover Changes in Drylands. Remote Sens. 2019, 11, 2649. [Google Scholar] [CrossRef]
Zhu, X.; Chen, J.; Gao, F.; Chen, X.; Masek, J.G. An enhanced spatial and temporal adaptive reflectance fusion model for complex heterogeneous regions. Remote Sens. Environ. 2010, 114, 2610–2623. [Google Scholar] [CrossRef]
Massey, R.; Sankey, T.T.; Yadav, K.; Congalton, R.G.; Tilton, J.C. Integrating cloud-based workflows in continental-scale cropland extent classification. Remote Sens. Environ. 2018, 219, 162–179. [Google Scholar] [CrossRef]
Duro, D.C.; Franklin, S.E.; Dubé, M.G. Multi-scale object-based image analysis and feature selection of multi-sensor earth observation imagery using random forests. Int. J. Remote Sens. 2011, 33, 4502–4526.
Liu, D.; Xia, F. Assessing object-based classification: Advantages and limitations. Remote Sens. Lett. 2010, 1, 187–194.
Liu, Z.; Peng, C.; Work, T.; Candau, J.-N.; DesRochers, A.; Kneeshaw, D. Application of machine-learning methods in forest ecology: Recent progress and future challenges. Environ. Rev. 2018, 26, 339–350.
Huang, L.; Liu, Y.; Huang, W.; Dong, Y.; Ma, H.; Wu, K.; Guo, A. Combining Random Forest and XGBoost Methods in Detecting Early and Mid-Term Winter Wheat Stripe Rust Using Canopy Level Hyperspectral Measurements. Agriculture 2022, 12, 74.
Lucas, T.C.D. A translucent box: Interpretable machine learning in ecology. Ecol. Monogr. 2020, 90, e01422.
Perry, G.L.W.; Seidl, R.; Bellvé, A.M.; Rammer, W. An outlook for deep learning in ecosystem science. Ecosystems 2022, 25, 1700–1718.
Ahlawat, A.; Roy, A. Assessing the vulnerability of western Himalayan ecosystem to climate change using machine learning algorithms. In Proceedings of the Remote Sensing for Agriculture, Ecosystems, and Hydrology XXV, Amsterdam, The Netherlands, 3–7 September 2023; pp. 15–95.
Ji, P.; Su, R.; Wu, G.; Xue, L.; Zhang, Z.; Fang, H.; Zhang, D. Projecting Future Wetland Dynamics Under Climate Change and Land Use Pressure: A Machine Learning Approach Using Remote Sensing and Markov Chain Modeling. Remote Sens. 2025, 17, 1089.
Bayat, M.; Burkhart, H.; Namiranian, M.; Hamidi, S.K.; Heidari, S.; Hassani, M. Assessing biotic and abiotic effects on biodiversity index using machine learning. Forests 2021, 12, 461.
Simon, S.M.; Glaum, P.; Valdovinos, F.S. Interpreting random forest analysis of ecological models to move from prediction to explanation. Sci. Rep. 2023, 13, 3881.
Gurung, N.; Gazi, S.; Islam, Z. Strategic employee performance analysis in the USA: Deploying machine learning algorithms intelligently. J. Bus. Manag. Stud. 2024, 6, 1–14.
Jordan, M.I.; Mitchell, T.M. Machine learning: Trends, perspectives, and prospects. Science 2015, 349, 255–260.
Tulbure, M.G.; Broich, M.; Stehman, S.V.; Kommareddy, A. Surface water extent dynamics from three decades of seasonally continuous Landsat time series at subcontinental scale in a semi-arid region. Remote Sens. Environ. 2016, 178, 142–157.
Ahangarha, M.; Seydi, S.T.; Shahhoseini, R. Hyperspectral change detection in wetland and water-body areas based on machine learning. ISPRS Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2019, 42, 19–24.
El-Hokayem, L.; De Vita, P.; Usman, M.; Link, A.; Conrad, C. Mapping potentially groundwater-dependent vegetation in the Mediterranean biome using global geodata targeting site conditions and vegetation characteristics. Sci. Total Environ. 2023, 898, 166397.
Weise, K.; Höfer, R.; Franke, J.; Guelmami, A.; Simonson, W.; Muro, J.; O’Connor, B.; Strauch, A.; Flink, S.; Eberle, J.; et al. Wetland extent tools for SDG 6.6.1 reporting from the Satellite-based Wetland Observation Service (SWOS). Remote Sens. Environ. 2020, 247, 111892.
Araya-López, R.A.; Lopatin, J.; Fassnacht, F.E.; Hernández, H.J. Monitoring Andean high altitude wetlands in central Chile with seasonal optical data: A comparison between Worldview-2 and Sentinel-2 imagery. ISPRS J. Photogramm. Remote Sens. 2018, 145, 213–224.
Snyder, S.D.; Loftin, C.S.; Reeve, A.S. Predicting the Presence of Groundwater-Influenced Ecosystems in the Northeastern United States with Ensembled Models. Water 2023, 15, 4035.
Adam, E.; Mureriwa, N.; Newete, S. Mapping Prosopis glandulosa (mesquite) in the semi-arid environment of South Africa using high-resolution WorldView-2 imagery and machine learning classifiers. J. Arid Environ. 2017, 145, 43–51.
Mpakairi, K.S.; Dube, T.; Dondofema, F.; Dalu, T. Spatio–temporal variation of vegetation heterogeneity in groundwater dependent ecosystems within arid environments. Ecol. Inform. 2022, 69, 101667.
Arvor, D.; Daher, F.R.; Briand, D.; Dufour, S.; Rollet, A.-J.; Simões, M.; Ferraz, R.P. Monitoring thirty years of small water reservoirs proliferation in the southern Brazilian Amazon with Landsat time series. ISPRS J. Photogramm. Remote Sens. 2018, 145, 225–237.
Higgisson, W.; Cobb, A.; Tschierschke, A.; Dyer, F. Estimating the cover of Phragmites australis using unmanned aerial vehicles and neural networks in a semi-arid wetland. River Res. Appl. 2021, 37, 1312–1322.
Mpakairi, K.S.; Dube, T.; Dondofema, F.; Dalu, T. Spatial Characterisation of Vegetation Diversity in Groundwater-Dependent Ecosystems Using In-Situ and Sentinel-2 MSI Satellite Data. Remote Sens. 2022, 14, 2995.
Eckersley, J.; Moore, C.E.; Thompson, S.E.; Renton, M.; Grierson, P.F. Separating leaf area index from plant area index using semi-supervised classification of digital hemispheric canopy photographs: A case study of dryland vegetation. Agric. For. Meteorol. 2025, 363, 110395.
Saha, T.K.; Pal, S.; Sarkar, R. Prediction of wetland area and depth using linear regression model and artificial neural network based cellular automata. Ecol. Inform. 2021, 62, 101272.
Li, A.; Fan, M.; Qin, G.; Xu, Y.; Wang, H. Comparative analysis of machine learning algorithms in automatic identification and extraction of water boundaries. Appl. Sci. 2021, 11, 10062.
Lu, Q.; Si, W.; Wei, L.; Li, Z.; Xia, Z.; Ye, S.; Xia, Y. Retrieval of water quality from UAV-borne hyperspectral imagery: A comparative study of machine learning algorithms. Remote Sens. 2021, 13, 3928.
Kossieris, S.; Asgarimehr, M.; Wickert, J. Unsupervised machine learning for GNSS reflectometry inland water body detection. Remote Sens. 2023, 15, 3206.
Ulfa, A.; Kusuma, F.B.; Suardana, A.A.M.A.P.; Asriningrum, W.; Ibrahim, A.; Fadlillah, L.N. Detection of water-body boundaries from Sentinel-2 imagery for floodplain lakes. Int. J. Remote Sens. Earth Sci. 2023, 19, 119–132.
Ming, L.; Liu, J.; Quan, Y.; Li, M.; Wang, B.; Wei, G. Mapping tree species diversity in a typical natural secondary forest by combining multispectral and LiDAR data. Ecol. Indic. 2024, 159, 111711.
Peng, C.; Xie, Z.; Jin, X. Using ensemble learning for remote sensing inversion of water quality parameters in Poyang Lake. Sustainability 2024, 16, 3355.
Nguyen, P.T.; Ha, D.H.; Jaafari, A.; Nguyen, H.D.; Van Phong, T.; Al-Ansari, N.; Prakash, I.; Van Le, H.; Pham, B.T. Groundwater potential mapping combining artificial neural network and Real AdaBoost ensemble technique: The DakNong province case-study, Vietnam. Int. J. Environ. Res. Public Health 2020, 17, 2473.
Chen, C.; He, W.; Zhou, H.; Xue, Y.; Zhu, M. A comparative study among machine learning and numerical models for simulating groundwater dynamics in the Heihe River Basin, northwestern China. Sci. Rep. 2020, 10, 3904.
Rodríguez-López, L.; Usta, D.B.; Alvarez, L.B.; Duran-Llacer, I.; Lami, A.; Martínez-Retureta, R.; Urrutia, R. Machine learning algorithms for the estimation of water quality parameters in Lake Llanquihue in southern Chile. Water 2023, 15, 1994.
Rincon-Flores, E.G.; Castano, L.; Solis, S.L.G.; Lopez, O.O.; Hernández, C.F.R.; Lara, L.A.C.; Valdés, L.P.A. Improving the learning-teaching process through adaptive learning strategy. Smart Learn. Environ. 2024, 11, 27.
Al-Adhaileh, M.H.; Alsaade, F.W. Modelling and Prediction of Water Quality by Using Artificial Intelligence. Sustainability 2021, 13, 4259.
del Castillo, A.F.; Yebra-Montes, C.; Garibay, M.V.; de Anda, J.; Garcia-Gonzalez, A.; Gradilla-Hernández, M.S. Simple prediction of an ecosystem-specific water quality index and the water quality classification of a highly polluted river through supervised machine learning. Water 2022, 14, 1235.
DeLancey, E.R.; Kariyeva, J.; Bried, J.T.; Hird, J.N. Large-scale probabilistic identification of boreal peatlands using Google Earth Engine, open-access satellite data, and machine learning. PLoS ONE 2019, 14, e0218165.
DeLancey, E.R.; Brisco, B.; McLeod, L.J.T.; Hedley, R.; Bayne, E.M.; Murnaghan, K.; Gregory, F.; Kariyeva, J. Modelling, Characterizing, and Monitoring Boreal Forest Wetland Bird Habitat with RADARSAT-2 and Landsat-8 Data. Water 2021, 13, 2327.
Jamali, A.; Mahdianpari, M.; Mohammadimanesh, F.; Brisco, B.; Salehi, B. A synergic use of Sentinel-1 and Sentinel-2 imagery for complex wetland classification using generative adversarial network (GAN) scheme. Water 2021, 13, 3601.
Jamali, A.; Mahdianpari, M. Swin transformer for complex coastal wetland classification using the integration of Sentinel-1 and Sentinel-2 imagery. Water 2022, 14, 178.
Shanthini, S.; Devi, M.S.; Grace, R.S. A comprehensive review of learning rules and architecture of perceptron in artificial neural networks (ANNs). Deep Learn. Eng. Energy Financ. 2024, 115–163.
Shashank, Y.; Amith, S.; Sahana, S.; Sastry, A. Monitoring of water quality using machine learning—A review. Int. J. Res. Appl. Sci. Eng. Technol. 2024, 12, 495–498.
Zhao, W.L.; Gentine, P.; Reichstein, M.; Zhang, Y.; Zhou, S.; Wen, Y.; Lin, C.; Li, X.; Qiu, G.Y. Physics-constrained machine learning of evapotranspiration. Geophys. Res. Lett. 2019, 46, 14496–14507.
Raman, R.; Das, A.; Manna, R.; Sahu, S.; Das, B. Physicochemical habitat traits preferred by small indigenous fish (Chanda nama) in Indian river discerning through machine learning. Environ. Sci. Pollut. Res. 2021, 30, 16499–16509.
Wen, L.; Hughes, M. Coastal wetland mapping using ensemble learning algorithms: A comparative study of bagging, boosting and stacking techniques. Remote Sens. 2020, 12, 1683.
Du, L.; McCarty, G.W.; Zhang, X.; Lang, M.W.; Vanderhoof, M.K.; Li, X.; Huang, C.; Lee, S.; Zou, Z. Mapping forested wetland inundation in the Delmarva Peninsula, USA using deep convolutional neural networks. Remote Sens. 2020, 12, 644.
Marjani, M.; Mahdianpari, M.; Mohammadimanesh, F.; Gill, E.W. CVTNet: A fusion of convolutional neural networks and vision transformer for wetland mapping using Sentinel-1 and Sentinel-2 satellite data. Remote Sens. 2024, 16, 2427.
Soulard, C.E.; Walker, J.J.; Smith, B.W.; Kreitler, J. The feasibility of using national-scale datasets for classifying wetlands in Arizona with machine learning. Earth Surf. Process. Landforms 2024, 49, 4632–4649.
Nguyen, D.P.; Ha, H.D.; Trinh, N.T.; Nguyen, M.T. Application of artificial intelligence for forecasting surface quality index of irrigation systems in the Red River Delta, Vietnam. Environ. Syst. Res. 2023, 12, 24.

Figure 1. Systematic review process.

Figure 2. Comparison of total publications and semi-arid publications from 2010 to 2024.

Figure 3. Heatmap of the number of publications by country for the period 2008–2024. Regions in white represent countries with no recorded studies of groundwater-dependent ecosystems that use machine learning.

Table 1. Characteristics of commonly used machine learning classifiers.

Factor	Artificial Neural Networks	Decision Trees	Random Forest	Gradient Tree Boosting	Support Vector Machines	Naïve Bayes
Type	Connectionist, deep learning	Tree-based	Ensemble (bagging)	Ensemble (boosting)	Kernel-based, geometric	Probabilistic
Computational Cost	High (especially deep models)	Low to moderate	Moderate	High	Moderate to high (with kernels)	Very low
Interpretability	Low (black box)	High (easy to understand)	Moderate (harder than DTs)	Low (complex ensemble)	Moderate (difficult in high dimensions)	High (simple probability outputs)
Feature Engineering	Minimal (automatically learns features)	Requires manual feature selection	Minimal	Minimal	Requires feature scaling and selection	Requires independent features
Performance on Small Data	Poor (needs large datasets)	Good	Good	Moderate to good	Good (especially with kernels)	Excellent
Overfitting Tendency	High (needs regularization)	High (prone to deep trees overfitting)	Low (averages multiple trees)	Moderate (regularization needed)	Low to moderate	Low
Training Time	Long (especially deep models)	Fast	Moderate	Slow (especially with many trees)	Slow (with large datasets and complex kernels)	Very fast
Generalization Ability	High (with enough data)	Moderate (depends on pruning)	High	High	High	Moderate
Scalability	High (with GPUs)	High (for small data, poor for big data)	High (parallel processing)	Moderate to high	Poor for very large datasets	High
Best Use Cases	Species distribution modelling, remote sensing, habitat classification	Simple environmental impact assessments, rule-based ecological models	Biodiversity assessment, species–habitat relationships, ecological risk assessment	Climate change impact modelling, habitat suitability prediction, species population dynamics	Pollution impact studies, land cover classification, ecological niche modelling	Rapid classification of species, early-warning ecological assessments
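
The "Ensemble (bagging)" entry in Table 1, and the note that Random Forest "averages multiple trees", can be illustrated with a minimal sketch. This is not the Random Forest algorithm itself: it bags one-split trees (decision stumps) on a single feature and omits the random feature subsampling that Random Forest adds. The toy greenness values, the class labels (0 = non-GDE, 1 = GDE), and all function names are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter
from typing import List, Tuple

Sample = Tuple[float, int]          # (feature value, class label)
Stump = Tuple[float, int, int]      # (threshold, label if x <= threshold, label otherwise)


def fit_stump(samples: List[Sample]) -> Stump:
    """Fit a one-split tree by picking the threshold with the fewest training errors."""
    majority = Counter(y for _, y in samples).most_common(1)[0][0]
    # Fallback: a constant prediction, used when no split improves on the majority class.
    best: Stump = (float("inf"), majority, majority)
    best_errors = sum(1 for _, y in samples if y != majority)
    values = sorted({x for x, _ in samples})
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        label_below = Counter(y for x, y in samples if x <= threshold).most_common(1)[0][0]
        label_above = Counter(y for x, y in samples if x > threshold).most_common(1)[0][0]
        errors = sum(
            1 for x, y in samples
            if y != (label_below if x <= threshold else label_above)
        )
        if errors < best_errors:
            best, best_errors = (threshold, label_below, label_above), errors
    return best


def predict_stump(stump: Stump, x: float) -> int:
    threshold, label_below, label_above = stump
    return label_below if x <= threshold else label_above


def fit_bagged_stumps(samples: List[Sample], n_estimators: int = 25, seed: int = 0) -> List[Stump]:
    """Train each stump on a bootstrap sample (drawn with replacement, same size as the data)."""
    rng = random.Random(seed)
    return [
        fit_stump([rng.choice(samples) for _ in samples])
        for _ in range(n_estimators)
    ]


def predict_bagged(ensemble: List[Stump], x: float) -> int:
    """Aggregate the individual stump predictions by majority vote."""
    votes = Counter(predict_stump(stump, x) for stump in ensemble)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: low greenness = class 0, high greenness = class 1.
training = [(i / 100, 0) for i in range(10, 20)] + [(i / 100, 1) for i in range(60, 70)]
ensemble = fit_bagged_stumps(training)
print(predict_bagged(ensemble, 0.15), predict_bagged(ensemble, 0.65))
```

Each stump sees a slightly different bootstrap sample, so the individual thresholds vary; the majority vote smooths out that variation, which is the mechanism behind the lower overfitting tendency attributed to bagged ensembles in the table.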

Table 2. Previous studies using machine learning and remote sensing data for GDE assessment in drylands.

GDE Type	Purpose	Model Performance	Key Findings	Citation
Wetland	Monitoring small seasonal wetlands using Google Earth Engine & ML algorithms.	RF: 68.8–80.55%, SVM: 66.60–62.29%, CART: 62.30–75.00%, NB: 29.50–25.00%	RF performed best, while NB had the lowest accuracy.	[52]
Wetland	Automated wetland and water body monitoring using hyperspectral images.	RF: 93.33% (Kappa: 0.811)	High accuracy achieved with hyperspectral data.	[75]
Wetland	Developing a wetland observation service using Sentinel-2 data.	OA > 91% for different wetland classes	High accuracy for mapping lakes, rivers, estuaries, and vegetated wetlands.	[77]
Wetland	Comparing WorldView-2 & Sentinel-2 data for wetland monitoring.	BSVM: OA > 78%, Soil moisture (r = 0.58, NMSE = 19%)	Soil moisture estimation feasible with optical remote sensing.	[78]
Wetland	Mapping wetlands and riparian areas using Landsat ETM+ & decision trees.	Classification Tree: OA 73.1%, SGB: OA 86%	SGB improved classification performance over traditional decision trees.	[10]
Vegetation	Mapping potentially groundwater-dependent vegetation (pGDV) in the Mediterranean biome.	RF: 86.7% (Kappa: 0.76)	RF outperformed the analytic hierarchy process (AHP), which overestimated pGDV.	[76]
Vegetation	Predicting groundwater-influenced ecosystems using ensembled models (GLM, GAM, MaxEnt, Random Forest).	RF: AUC = 81%	Modelled wetland and surface water GDEs in two distinct ecoregions.	[79]
Vegetation	Mapping Prosopis glandulosa in a semi-arid environment.	RF: 86.59% (Kappa: 0.84), SVM: 85.98% (Kappa: 0.83)	Spectral confusion led to lower producer’s accuracies for some species.	[80]
Vegetation	Spatio-temporal variation of vegetation heterogeneity in arid GDEs.	RF (Shannon–Wiener Index): MAE = 30.37, RMSE = 33.25, %RMSE = 63.94	RF models explained environmental drivers affecting heterogeneity.	[81]
Surface Water	Surface water extent analysis over three decades using Landsat data.	RF: OA 99.9%, Producer’s Accuracy: 87%, User’s Accuracy: 96%	Accuracy increased with newer sensors but was lower in wet years.	[74]
Surface Water	30 years of monitoring small water bodies using Landsat datasets.	OA: 0.872, Kappa: 0.745	Spatial resolution limited detection of very small water bodies.	[82]
Vegetation & Surface Water	Estimating Phragmites australis cover using UAVs and neural networks.	CNN: OA 94.7%, F-score: 0.945	High accuracy (>98%) for reeds, leaf litter, and trash; lowest for bare ground (41%).	[83]
Springs	Mapping groundwater spring potential with ML models.	RF: 80.6%, SVM: 80.2%, MDA: 83.2%, BRT: 78.0%, MARS: 75.5%	MDA had the highest accuracy, while BRT handled missing data well.	[1]
Springs	Comparing robustness of models for groundwater spring potential.	Linear Model Tree: AUC-ROC 0.9, C4.5 Decision Tree: AUC-ROC 0.831, SVM: AUC-ROC 0.889	Linear Model Tree was sensitive to input data changes.	[29]
Phreatophytes	Accuracy assessment of OBIA & Mask R-CNN segmentation in drylands.	Fusion of OBIA & Mask R-CNN improved accuracy by 25%	Combined methods were more effective for vegetation segmentation.	[59]
Phreatophytes	Explored environmental drivers affecting vegetation heterogeneity.	RF models explained environmental drivers affecting heterogeneity	RF models provided insights into environmental drivers.	[84]
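
Most entries in the Model Performance column of Table 2 are confusion-matrix statistics: overall accuracy (OA), Cohen's kappa, and producer's and user's accuracy. The sketch below shows how these are conventionally derived from a confusion matrix. The matrix layout (rows are reference classes, columns are predicted classes), the function name, and the two-class example counts are assumptions made for illustration and are not taken from any of the studies listed.

```python
from typing import Dict, List


def accuracy_metrics(matrix: List[List[int]]) -> Dict[str, object]:
    """Derive OA, kappa, producer's and user's accuracy from a confusion matrix.

    Assumed layout: matrix[i][j] counts samples of reference class i
    that the classifier assigned to class j. Every class is assumed to
    occur at least once in both the reference and the predicted labels.
    """
    n_classes = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(n_classes))
    reference_totals = [sum(row) for row in matrix]
    predicted_totals = [
        sum(matrix[i][j] for i in range(n_classes)) for j in range(n_classes)
    ]

    overall_accuracy = correct / total
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(
        r * c for r, c in zip(reference_totals, predicted_totals)
    ) / total ** 2
    kappa = (overall_accuracy - expected) / (1 - expected)

    # Producer's accuracy: share of each reference class mapped correctly (1 - omission error).
    producers = [matrix[k][k] / reference_totals[k] for k in range(n_classes)]
    # User's accuracy: share of each mapped class that is correct (1 - commission error).
    users = [matrix[k][k] / predicted_totals[k] for k in range(n_classes)]

    return {
        "overall_accuracy": overall_accuracy,
        "kappa": kappa,
        "producers_accuracy": producers,
        "users_accuracy": users,
    }


# Hypothetical two-class example (e.g., wetland vs. non-wetland), 100 validation samples.
example = [
    [45, 5],   # reference class 0: 45 mapped correctly, 5 missed
    [10, 40],  # reference class 1: 10 missed, 40 mapped correctly
]
metrics = accuracy_metrics(example)
print(f"OA = {metrics['overall_accuracy']:.2%}, kappa = {metrics['kappa']:.2f}")
```

Because kappa discounts the agreement expected by chance, it is lower than OA for the same matrix. This also helps in reading rows such as the surface water study with a very high OA but a producer's accuracy of 87%: when one class dominates the validation sample, OA alone can overstate how well the rarer class is mapped.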

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Chiloane, C.N.; Dube, T.; Sibanda, M.; Dalu, T.; Shoko, C. A Summary of Recent Advances in the Literature on Machine Learning Techniques for Remote Sensing of Groundwater Dependent Ecosystems (GDEs) from Space. Remote Sens. 2025, 17, 1460. https://doi.org/10.3390/rs17081460
