Mapping Wetlands with High-Resolution Planet SuperDove Satellite Imagery: An Assessment of Machine Learning Models Across the Diverse Waterscapes of New Zealand

Md. Saiful Islam Khan; Maria C. Vega-Corredor; Matthew D. Wilson

doi:10.3390/rs17152626

¹ Geospatial Research Institute, University of Canterbury, Christchurch 8140, New Zealand

² Eco-Index Ltd., Hamilton 3216, New Zealand

³ School of Earth and Environment, University of Canterbury, Christchurch 8140, New Zealand

* Author to whom correspondence should be addressed.

Remote Sens. 2025, 17(15), 2626; https://doi.org/10.3390/rs17152626

This article belongs to the Special Issue Machine Learning and Automation in Remote Sensing Applied in Hydrological Processes

Abstract

(1) Background: Wetlands are ecologically significant ecosystems that support biodiversity and contribute to essential environmental functions such as water purification, carbon storage and flood regulation. However, these ecosystems face increasing pressures from land-use change and degradation, prompting the need for scalable and accurate classification methods to support conservation and policy efforts. In this research, our motivation was to test whether high-spatial-resolution PlanetScope imagery can be used with pixel-based machine learning to support the mapping and monitoring of wetlands at a national scale. (2) Methods: This study compared four machine learning classification models, Random Forest (RF), XGBoost (XGB), Histogram-Based Gradient Boosting (HGB) and a Multi-Layer Perceptron Classifier (MLPC), to detect and map wetland areas across New Zealand. All models were trained using eight-band SuperDove satellite imagery from PlanetScope, with a spatial resolution of ~3 m, and ancillary geospatial datasets representing topography and soil drainage characteristics, each of which is available globally. (3) Results: All four machine learning models performed well in detecting wetlands from SuperDove imagery and environmental covariates, with varying strengths. The highest accuracy was achieved using all eight image bands alongside features created from supporting geospatial data. For binary wetland classification, the highest F1 scores were recorded by XGB (0.73) and RF/HGB (both 0.72) when including all covariates. MLPC also showed competitive performance (wetland F1 score of 0.71), despite its relatively lower spatial consistency. However, each model over-predicted total wetland area at a national level, an issue that was reduced by increasing the classification probability threshold and applying spatial filtering. (4) Conclusions: The comparative analysis highlights the strengths and trade-offs of RF, XGB, HGB and MLPC models for wetland classification. While all four methods are viable, RF offers some key advantages, including ease of deployment and transferability, positioning it as a promising candidate for scalable, high-resolution wetland monitoring across diverse ecological settings. Further work is required for verification of small-scale wetlands (<~0.5 ha) and the addition of fine-spatial-scale covariates.

Keywords:

wetlands; remote sensing; Planet SuperDove; machine learning; Multi-Layer Perceptron Classifier (MLPC); Random Forest; XGBoost; Histogram-Based Gradient Boosting (HGB)

1. Introduction

Advances in remote sensing technology have significantly enhanced environmental monitoring by providing extensive spatial coverage and frequent temporal observations, particularly through the utilization of satellite data, which offers a synoptic view crucial for studying large and remote areas and offering cost-effective solutions for studying various phenomena across scales [1]. Satellite datasets generated from multiple sensors are invaluable but massive and complex, necessitating sophisticated analytical approaches to extract meaningful information [2].

Machine learning (ML) has emerged as a powerful instrument for leveraging satellite data in environmental applications by enabling efficient processing and analysis of large volumes of data to identify patterns and make predictions that would be impractical with traditional analytical methods [3]. In previous studies, ML techniques have been instrumental in various environmental monitoring applications, such as detecting volcanic impacts [4] using satellite data in conjunction with ML algorithms, which has enabled the monitoring of environmental changes such as vegetation cover changes [5,6]; land-use and land-cover classification [2] and even monitoring microalgae [7,8]. These applications highlight the versatility and effectiveness of ML in extracting valuable insights from satellite data for environmental studies, demonstrating the wide-ranging impact of the use of remote sensing combined with ML to address environmental challenges, e.g., monitoring of specific environmental parameters, such as vegetation water content [9], power-line vegetation [6] and coastal marine debris [10]. The improvement in the accuracy of ML algorithms has also contributed to more effective decision making and resource management strategies [11,12], as well as the development of innovative solutions for environmental monitoring in aquatic ecosystems, such as smartphone-based microalgae monitoring platforms [8] and cloudiness assessment in marine environments [13]. These applications highlight the diverse range of fields where ML techniques can be applied to extract valuable insights from remote sensing data for environmental and health-related studies.

Wetlands are dynamic environments that experience seasonal and long-term changes influenced by many factors, such as hydrological fluctuations, climate variability, and anthropogenic activities. They undergo periodic changes due to seasonal flooding, vegetation growth cycles, and long-term shifts in hydrology. The monitoring of wetlands over time has contributed to a better understanding of their expansion or loss due to factors such as climate change, land-use conversion, and conservation efforts [14]. Wetlands are characterized by a water table at or near the surface [15] and are highly productive, providing a wide array of critical ecosystem services that make them essential for supporting biodiversity [16], maintaining ecological balance, and regulating global cycles [17]. They support many native and rare species [18] and contribute to water purification and the regulation of the water cycle, nutrients, and climate by acting as carbon sinks [19], as well as acting as a buffer against natural hazards and disasters, such as floods. It is estimated that 4–6% of the world’s land surface is wetlands, corresponding to approximately 7–9 million km² [20]. However, the definition of wetland varies among sources, leading to variation in the modeling of wetland extents, and global estimates are reported as high as 27 million km² [21]. The vastness and diversity of the landscape and the need for near-continuous assessments make the use of traditional wetland assessment methods more complicated [22].

The wetlands of New Zealand, once thriving ecosystems, are now facing a concerning decline due to urbanization, agriculture, and changing land use, posing a threat to biodiversity [23]. Preserving wetlands is critical, yet globally, 71% of the wetlands have been converted to other land uses since 1900 [24]. In New Zealand, the extent of lost wetlands is above 90%. The remaining wetlands are diverse, with freshwater wetlands comprising primarily bogs (rainfall-fed), fens (rainfall- and groundwater-fed), swamps (groundwater- and surface water-fed), marshes (surface water-fed), and areas of shallow water [25], as well as saltmarshes and mangroves in coastal areas.

Enhanced monitoring systems are critical for wetland conservation. The National Policy Statement for Freshwater Management 2020 (NPS-FM) of New Zealand established that regional councils must identify and map their natural inland wetlands if they are over 0.05 ha in extent [26], a process that, until now, has been carried out mostly manually. The current manual methods for wetland mapping in NZ are resource-intensive and often time-consuming, depend on local expertise, and vary in methodological standards across regions. Consequently, wetland monitoring could benefit from automated and semi-automated approaches that combine machine learning with high-resolution satellite imagery and other data. These methods can significantly enhance wetland mapping and classification while facilitating monitoring and decision-making efforts. By leveraging these technologies, New Zealand could establish a nationally consistent wetland inventory that supports NPS-FM implementation.

Remote sensing techniques (including multispectral and hyperspectral imagery), together with geospatial analysis technologies, have been shown to be useful tools for the mapping and monitoring of wetlands dynamics and vegetation classification, allowing for a better understanding of their ecosystems and enabling more effective conservation and management strategies [14]. Traditional remote sensing methods rely on spectral indices such as the Normalized Difference Vegetation Index (NDVI) and Normalized Difference Water Index (NDWI) to distinguish wetlands from other land cover types [27]. Multispectral and hyperspectral imagery from satellites like Landsat, Sentinel-2, and MODIS are used to understand spectral variations associated with wetland vegetation, water presence, and soil moisture conditions [28]. Moreover, the use of remote sensing methods such as LiDAR (Light Detection and Ranging) has contributed to further assessment and monitoring of different types of wetlands and their water storage capacity [29].

Predictive models can include ancillary GIS contextual datasets such as Digital Elevation Models (DEMs), soil maps, and hydrological layers, which refine wetland classification by accounting for topographic and hydrological constraints [30,31]. However, few studies have assessed ecologically relevant predictors for wetland detection [32,33,34]. Given the natural complexity of wetland ecosystems and the difficulty of spectrally separating wetlands from other land cover types, combining multiple remote sensing datasets with other data sources and features, such as spectral indices, radar backscatter, topographic variables, and soil properties, can help to improve classification accuracy. Nevertheless, not all features contribute equally to classification performance, and redundant or non-informative features may introduce noise, reducing model efficiency [35].

Wetland classification accuracy has benefited from the use of machine learning algorithms such as Random Forest (RF), Support Vector Machines (SVMs), and deep learning models like Convolutional Neural Networks (CNNs) [22,36]. It has been demonstrated how RF can outperform traditional thresholding and unsupervised classification methods by leveraging a broader range of spectral, textural, and topographic inputs [37]. Rodriguez-Galiano et al. [35] found that while SVM performed well in wetland classification, it was more computationally expensive and sensitive to parameter tuning than RF. In addition, algorithms like Histogram-Based Gradient Boosting (HGB) could be efficient and scalable variants of gradient boosting algorithms, particularly well-suited for large-scale, structured datasets. By discretizing continuous features into histograms, HGB reduces memory usage and accelerates training times without compromising predictive accuracy [38]. This kind of approach could be especially beneficial in data-intensive domains such as environmental monitoring of wetlands, where rapid processing of vast remote sensing datasets is required. Moreover, XGBoost (Extreme Gradient Boosting) is a widely used machine learning algorithm known for its speed, accuracy, and scalability. It extends traditional gradient boosting methods by incorporating system optimizations such as parallelized tree construction and regularization, which help prevent overfitting [39]. These features make XGBoost particularly effective in handling structured data for classification and regression tasks. Its ability to handle missing data natively, along with customizable objective functions and tree pruning strategies, makes it a powerful tool for large-scale environmental applications and predictive modeling tasks where both performance and interpretability are critical.

Each ML algorithm has different strengths in handling remote sensing and geospatial datasets. Random Forest (RF) is widely used due to its robustness to noise, ability to handle high-dimensional data, and interpretability [36]. Support Vector Machines (SVMs) offer strong generalization capabilities but can be computationally expensive and require careful hyperparameter tuning [35]. Gradient boosting methods such as XGBoost and LightGBM excel in handling imbalanced datasets and feature interactions, making them effective for complex classification tasks [40]. A less commonly explored but promising approach is the Multilayer Perceptron Classifier (MLPC), a type of artificial neural network (ANN) that can capture non-linear relationships in data. Unlike traditional ML models, MLPC learns hierarchical feature representations through multiple layers, making it particularly useful when working with high-dimensional satellite imagery and multi-source geospatial datasets [22]. The advantage of MLPC lies in its ability to model intricate patterns in spectral, textural, and topographic features, though it requires more computational resources and careful tuning of hyperparameters such as the number of hidden layers and activation functions.

With the increasing availability of high-resolution satellite imagery and the growing complexity of environmental datasets, machine learning has become central to modern ecological monitoring. Classifiers such as Random Forest (RF) and Multi-Layer Perceptron Classifiers (MLPCs) have been widely used for remote sensing applications, including wetland classification. However, their pixel-based nature can be limited in environments like wetlands, where spatial context and structural coherence are critical for distinguishing diverse and fragmented ecosystem types.

Wetland characteristics vary across different spatial scales, from local site-specific hydrological features to broad regional wetland distributions. Comparing classification performance at multiple spatial scales, ranging from high-resolution UAV and aerial imagery to large-scale satellite-derived national and global wetland datasets, could be essential for understanding scale-dependent classification performance. Some features, such as vegetation indices, texture, and radar backscatter, could be more informative, depending on the resolution [22]. In addition, comparing wetland detection across different spatial and temporal scales, both over multiple years and from fine to large spatial scales, can provide critical insights about water dynamics, intra-annual variation in water availability, and vegetation phenology, which help to reduce classification uncertainty, as well as improving its accuracy, as both measurements are difficult to obtain in single-date imagery [22]. Additionally, temporal comparisons improve model generalization by training classification models on multi-year datasets, increasing their robustness to seasonal and inter-annual variations in wetland conditions [34].

Our motivation for the research presented in this paper was to test the suitability of four machine learning methods for the detection and mapping of wetlands at a national scale from fine-spatial-resolution optical remote sensing imagery. If possible, this would enable routine, automated monitoring of wetland systems. This would be especially useful for smaller wetlands (<~0.5 ha), which are currently excluded from available data [41] but which are required by government policy to be monitored [26]. Our aim was to develop a machine learning model that can rapidly detect wetlands from widely available eight-band Planet SuperDove imagery, alongside ancillary geospatial data such as topographic data, and use it to map wetland extent and likelihood across New Zealand’s diverse landscapes. By analyzing feature importance, this study sought to refine wetland classification models, leading to more precise mapping and monitoring, ultimately supporting effective decision making towards wetland conservation. We only used data that is available internationally (both imagery and ancillary data); therefore, the developed methods can be transferred globally.

To achieve our aim, this research was guided by the following two objectives, the first of which is to develop a pixel-based machine learning classification workflow that integrates satellite imagery and geospatial datasets to detect and delineate wetlands across varying landscapes. This workflow evaluates the effectiveness of different remote sensing-derived features, such as spectral indices and topographic variables, in improving classification accuracy through an ecological lens. The second objective is to compare the performance of machine learning algorithms to determine their suitability for wetland detection. This objective assesses the trade-offs between model accuracy, keeping in mind computational efficiency, and potential future improvements of the models.

2. Materials and Methods

The modeling for wetlands detection developed in this research was performed across New Zealand. Wetlands are characterized by the presence of water, hydric soil, and specialized vegetation adapted to a wet environment [42]. Alongside satellite imagery, important bio-physical wetland characterization features were identified [43] and represented by geospatial layers used as supporting layers in the machine learning models. Importantly, all the data used is available globally, ensuring transferability of the methods. The data types, sources, and processing are presented below and in Figure 1. A summary of the data used is provided in Table 1.

Figure 1. Workflow diagram illustrating the wetland classification pipeline developed in this study. Multisource satellite, topographic, and soil datasets—including PlanetScope 8-band imagery, FABDEM, NDVI, NDWI, TWI, and hydrological soil properties (SM, RZSM, HySOG, and Ksat)—are processed, resampled, and aligned into tiled data stacks (768 × 768 m). Feature scaling and encoding are applied before stratified random sampling is used to extract training data from merged wetland and land cover datasets. Machine learning models (Random Forest and MLPC) are trained, evaluated, and applied across the national tile stack, with predicted tiles mosaicked to generate the final wetland classification maps.

Satellite imagery: Eight-band SuperDove imagery from PlanetScope [44] was obtained using the Planet API (PSB.SD), including all bands (coastal blue, blue, green I, green, yellow, red, red edge, and near-infrared) and corrected for surface reflectance (analytic_8b_sr_udm2 bundle) [45]. A mosaic across New Zealand was produced, primarily using images acquired in spring, between mid-September and October 2024. Spring was selected due to the reduced likelihood of snow in mountain areas while being early in the growing season for vegetation, possibly reducing its impact on classification. For locations that had no data (for example, due to cloud), later images from November and December 2024 were included. Images used in the mosaics were prioritized based on the image clarity identified in the image metadata; mosaic pixels received data from only one source image, with no averaging between images. Importantly, this allowed the wetland predictions for each pixel to be based on one image only, with the image identifier included as part of metadata. Figure 2 shows the mosaic grid across New Zealand and two example grid boxes with an indication of the locations and dates of images used in the wetland classification.

Figure 2. (Left) The processing grid created across New Zealand, comprising 390 areas, each approximately 1100 km². (Right) example areas are shown in (i,ii): each comprises a mosaic of PlanetScope SuperDove imagery obtained on the dates shown. The yellow boxes in (i,ii) indicate the area shown in subsequent figures.

Geospatial data derived from satellite images: The Normalized Difference Vegetation Index (NDVI) and Normalized Difference Water Index (NDWI) were derived from SuperDove images as supporting layers. The NDVI was included to help distinguish between wetlands and agricultural areas with higher expected values; the NDWI was included to help distinguish between wetlands and areas of open water. To calculate the NDVI and NDWI from PlanetScope imagery, we normalized the spectral bands and computed the indices using standard spectral formulas:

$$\mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{Red}}{\mathrm{NIR} + \mathrm{Red}}, \tag{1}$$

and

$$\mathrm{NDWI} = \frac{\mathrm{Green} - \mathrm{NIR}}{\mathrm{Green} + \mathrm{NIR}}, \tag{2}$$

where NIR (near-infrared) is band 8, red is band 6, and green is band 4 in the underlying SuperDove imagery.
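As an illustration of Equations (1) and (2), the following minimal sketch computes both indices with NumPy from an eight-band array. It is not the authors' code: the array layout (bands first), the helper names, and the choice to return NaN where the denominator is zero are assumptions made for this example.

```python
import numpy as np

# 0-based positions of the SuperDove bands named in the text
# (band 4 = green, band 6 = red, band 8 = near-infrared).
GREEN, RED, NIR = 3, 5, 7


def normalized_difference(a, b):
    """Return (a - b) / (a + b), with NaN where the denominator is zero."""
    a = a.astype("float32")  # cast first so unsigned integers cannot wrap around
    b = b.astype("float32")
    denom = a + b
    out = np.full(a.shape, np.nan, dtype="float32")
    np.divide(a - b, denom, out=out, where=denom != 0)
    return out


def spectral_indices(stack):
    """stack: array shaped (8, rows, cols) of surface reflectance (assumed layout)."""
    ndvi = normalized_difference(stack[NIR], stack[RED])
    ndwi = normalized_difference(stack[GREEN], stack[NIR])
    return ndvi, ndwi


# Tiny synthetic tile (1 x 2 pixels): a vegetation-like pixel, then a water-like pixel.
tile = np.zeros((8, 1, 2), dtype="uint16")
tile[GREEN] = [[600, 900]]
tile[RED] = [[400, 500]]
tile[NIR] = [[3600, 300]]
ndvi, ndwi = spectral_indices(tile)
```

In this synthetic case the vegetation-like pixel has a high NDVI and negative NDWI, while the water-like pixel shows the opposite pattern, which is the contrast the two indices are intended to capture.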

Geospatial data from other sources: We included elevation (the Forest And Buildings removed Copernicus Digital Elevation Model, FABDEM [46]), with lower values expected to indicate a higher likelihood of wetlands; the Hydrological Soil Group (HySOG) [47], which helped by indicating areas with a high propensity for soil saturation; saturated soil hydraulic conductivity (Ksat) [48], which similarly identifies areas with lower hydrological drainage; and the Topographic Wetness Index (TWI), derived from the MERIT and MERIT Hydro datasets [49,50]. The TWI is a widely used hydrological index based on terrain characteristics (slope and upstream contributing flow area) that indicates areas that are more likely to have hydrological saturation (areas with both a low slope and high upslope area of contributing flow). We utilized the MERIT DEM elevation tiles to calculate slope in radians, then combined this with the upstream area (UPA), which was extracted from the MERIT Hydro dataset, using the following formula:

$$\mathrm{TWI} = \ln\left(\frac{A}{\tan\beta}\right), \tag{3}$$

where A is the UPA in meters squared and β is the local slope angle, so that tan β is the local slope [51]. The TWI was then calculated on a per-pixel basis. During the process, no-data values were masked, and invalid TWI values resulting from division by zero or negative slopes were handled by assigning a defined no-data value. The resulting TWI layers represented an important hydrological predictor that reflects water accumulation potential across the landscape.
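The per-pixel TWI calculation and the handling of invalid values can be sketched as follows. This is a hedged example rather than the authors' implementation: the no-data value of -9999 and the function name are hypothetical, and the inputs are assumed to be co-registered arrays of upstream area (m²) and slope (radians).

```python
import math

import numpy as np

NODATA = -9999.0  # hypothetical no-data value; the text does not state the one used


def topographic_wetness_index(upa_m2, slope_rad, nodata=NODATA):
    """Compute TWI = ln(A / tan(beta)) per pixel, masking invalid inputs."""
    upa = np.asarray(upa_m2, dtype="float64")
    tan_beta = np.tan(np.asarray(slope_rad, dtype="float64"))
    # Valid pixels need a positive upstream area and a positive slope;
    # zero slope would divide by zero, negative values have no logarithm.
    valid = np.isfinite(upa) & np.isfinite(tan_beta) & (upa > 0) & (tan_beta > 0)
    twi = np.full(upa.shape, nodata, dtype="float64")
    twi[valid] = np.log(upa[valid] / tan_beta[valid])
    return twi


upa = np.array([[1.0e4, 1.0e6], [5.0e2, 0.0]])   # upstream area, m^2
slope = np.array([[0.10, 0.01], [0.00, 0.20]])   # slope, radians
twi = topographic_wetness_index(upa, slope)
```

The flat pixel (slope of zero) and the pixel with no upstream area both receive the no-data value, while the gently sloping pixel with a large contributing area receives the highest TWI.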

Available wetland data: To train the detection model, GIS data layers of wetlands and wetland types were obtained. The Land Cover Database (LCDBv5) by Manaaki Whenua [52] provides national-scale land information at a relatively coarse spatial scale, as it was derived from ~20 m Sentinel-2 imagery. In this research, LCDBv5 was used to identify locations as wetland/non-wetland while ensuring that location sampling included a range of land cover types. The sampling scheme is outlined in Section 2.2.

Table 1. Data sources and preprocessing: input datasets used for wetland classification model development. The table details the data type, source, and preprocessing steps applied to each layer. Satellite imagery from PlanetScope SuperDove (8-band) formed the base stack, with additional spectral indices (NDVI and NDWI), topographic layers (FABDEM and TWI), and soil properties (HySOG and Ksat) incorporated as ancillary predictors. All datasets were resampled and aligned to a common 3 m spatial resolution. Reference land cover and wetland labels were derived from LCDB v5.0 with wetland updates, and stratified random sampling was used to generate training points for model input.

Data Type	Layer/Index	Source	Preprocessing Notes
Satellite Imagery	PlanetScope SuperDove 8 bands (Coastal Blue to NIR)	Planet (https://www.planet.com, accessed on 28 July 2025)	Mosaic created from same-day swath; resampled to ~3 m; used as base stack
Spectral Indices	NDVI and NDWI	Derived from PlanetScope	Normalized and aligned to Planet imagery; single-band COGs
Topography	FABDEM (DEM)	Copernicus FABDEM (Forest and Buildings removed DEM) [46]	Resampled to 3 m; aligned and stored as float32 COGs
Topography	TWI (Topographic Wetness Index)	Derived from MERIT DEM [49] and MERIT Hydro [50]	UPA and slope harmonized; invalid values masked; stored as COG
Soil Properties	HySOG (Hydrologic Soil Group)	HYSOGs250 m, Ross et al., 2018 [47]	Categorical input; encoded using OneHotEncoder; resampled and tiled
Soil Properties	Ksat (Saturated Hydraulic Conductivity)	Gupta et al., 2021 [48]	Continuous layer; resampled to 3 m; normalized
Reference Land Cover	LCDB v5.0 + Wetland Type Update	Manaaki Whenua—Landcare Research [52]	Used for training sample generation and label reference
Training Samples	Stratified random points within LCDB polygons	Derived using ArcGIS Pro 3.1 (Section 2.2)	Buffered internally (20 m); used for feature extraction

2.1. Data Preparation

To ensure spatial consistency, all input raster datasets were aligned to a common spatial grid based on the Planet ~3 m mosaics. Categorical layers were processed using nearest-neighbor resampling, while continuous layers used cubic interpolation to preserve gradient values. Rasters from multiple sources, including spectral bands, topographic models, soil characteristics, and derived indices, were stacked into unified multiband datasets. Stacked rasters were saved in compressed, tiled formats to optimize read/write efficiency during training and prediction. We used a combination of Python 3.13 libraries to create the data stacks. All data stacks were projected in the same coordinate reference system (CRS: EPSG:2193), and the core spatial resolution of Planet images (~3 m ground resolution) was maintained, with other data resampled as required to match that resolution. To help with data handling, we tiled the data stack into 768 m × 768 m data cubes across New Zealand, resulting in 582,900 files totaling 1.7 TB. Each file had 14 bands: the eight bands of SuperDove, two spectral indices (NDVI and NDWI), two topographic characteristics (elevation and TWI), and two soil layers (HySOG and Ksat). Data for example areas are shown in Figure 3.
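To make the tile geometry concrete, the short sketch below derives the pixel dimensions of a tile and enumerates tile origins over a bounding box. It is illustrative only: the band order within the stack, the exact 3 m pixel size, the function name, and the example extent are assumptions, since the text gives only the tile size, the nominal resolution, and the band groups.

```python
import math

TILE_SIZE_M = 768      # tile edge length stated in the text
PIXEL_SIZE_M = 3       # nominal ~3 m ground resolution (assumed exact here)
PIXELS_PER_SIDE = TILE_SIZE_M // PIXEL_SIZE_M

# Band order within each stack is an assumption; the text lists only the groups.
BAND_NAMES = [
    "coastal_blue", "blue", "green_i", "green", "yellow", "red", "red_edge", "nir",
    "ndvi", "ndwi", "elevation", "twi", "hysog", "ksat",
]


def tile_origins(xmin, ymin, xmax, ymax, tile_size=TILE_SIZE_M):
    """Return lower-left (x, y) corners of the tiles covering a bounding box."""
    n_cols = math.ceil((xmax - xmin) / tile_size)
    n_rows = math.ceil((ymax - ymin) / tile_size)
    return [
        (xmin + col * tile_size, ymin + row * tile_size)
        for row in range(n_rows)
        for col in range(n_cols)
    ]


# Hypothetical 2000 m x 1000 m extent in projected metres (e.g., EPSG:2193).
origins = tile_origins(1_570_000, 5_180_000, 1_572_000, 5_181_000)
```

Under these assumptions each tile is 256 × 256 pixels with 14 bands, and the example extent is covered by a 3 × 2 grid of tiles.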

Figure 3. Example input layers and derived indices used in the national-scale wetland detection workflow for the areas indicated in Figure 2: (a–j) area ID 5edc0e26-0299-4c3f-8631-b39a4e3b0029; (k–t) area ID 1677525c-6446-40bd-99bc-459596e9b67a. The overview maps in (a,k) are © OpenStreetMap contributors and available from https://www.openstreetmap.org (accessed on 28 July 2025). The first row of each group displays core inputs, including environmental and terrain predictors: (b,l) the binary LCDB wetland reference map, where dark blue indicates reference wetland areas; (c,m) true-color composites of SuperDove imagery using red (R), green (G), and blue (B) bands (6, 4, 2); (d,n) false-color infrared composites of SuperDove imagery using near-infrared (R), red-edge (G), and red (B) bands (8, 7, 6); (e,o) the Normalized Difference Vegetation Index (NDVI); (f,p) the Normalized Difference Water Index (NDWI); (g,q) elevation from FABDEM; (h,r) the Topographic Wetness Index (TWI); (i,s) the categorical hydrological soil group (HySOG); (j,t) the saturated hydraulic conductivity (Ksat).

2.2. Sample Point Generation and Feature Extraction

Stratified random sampling was employed within polygon features that defined known wetland classes to build a representative training dataset. A stratified random sample of 1000 points for each land cover class in LCDBv5 was generated, with a minimum distance constraint of 50 m applied to reduce spatial autocorrelation. This ensured both class balance and geographic diversity across the sample set. We used an internal buffer of 20 m on the LCDBv5 polygons to avoid sampling close to class spatial boundaries, since there is uncertainty in these resulting from their derivation from 20 m satellite imagery.
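The combination of an internal buffer and a minimum-distance constraint can be sketched with simple rejection sampling. This is an assumption-laden illustration, not the ArcGIS Pro procedure used in the study: a rectangle stands in for a land cover polygon, and the function name, seed, and retry limit are hypothetical.

```python
import math
import random


def sample_min_distance(bounds, n_points, buffer_m=20.0, min_dist=50.0,
                        seed=0, max_tries=100_000):
    """Draw random points inside an inset rectangle, at least min_dist apart."""
    xmin, ymin, xmax, ymax = bounds
    # Internal buffer: shrink the rectangle so no point falls near its boundary.
    xmin, ymin, xmax, ymax = (xmin + buffer_m, ymin + buffer_m,
                              xmax - buffer_m, ymax - buffer_m)
    rng = random.Random(seed)
    points = []
    tries = 0
    while len(points) < n_points and tries < max_tries:
        tries += 1
        candidate = (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
        # Reject candidates that fall too close to an already accepted point.
        if all(math.dist(candidate, p) >= min_dist for p in points):
            points.append(candidate)
    return points


# Hypothetical 1 km x 1 km land cover polygon, in projected metres.
points = sample_min_distance((0.0, 0.0, 1000.0, 1000.0), n_points=50)
```

Repeating this per land cover class with the same target count gives a class-balanced sample; the pairwise check is quadratic in the number of points, which is acceptable for samples of this size but would need a spatial index for much larger draws.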

For each sample point, values from all feature bands in the stacked raster tiles were extracted, resulting in a database of 68,406 points. Only points located within the central “core” region of each tile were considered, avoiding edge effects that may arise from clipping or resampling. Extracted data were stored along with spatial coordinates and class labels for use in model training.

2.3. Machine Learning Model Development

To identify the most effective classification strategy for wetland detection in New Zealand, we implemented and compared four supervised machine learning models: Random Forest (RF), eXtreme Gradient Boosting (XGB), Histogram-Based Gradient Boosting (HGB), and Multi-Layer Perceptron Classifier (MLPC). These models were selected based on their demonstrated performance in remote sensing applications and complementary strengths. RF has been widely used in ecological modeling due to its robustness, interpretability, and tolerance to overfitting and noisy labels [36]. XGB is a gradient boosting framework known for its scalability, accuracy, and ability to natively handle missing values and class imbalance through regularization and optimized tree construction [39]. HGB, a more recent variant of boosting algorithms, reduces training time and memory usage by discretizing continuous variables into histograms, making it highly efficient for large geospatial datasets [38]. MLPC, a neural network model, excels in capturing complex non-linear interactions between features, offering advantages when spectral variability and high-dimensional data are present [22].

Each model was trained on a harmonized feature stack comprising eight-band PlanetScope imagery, vegetation indices (NDVI and NDWI), topographic predictors (FABDEM and TWI), and soil hydrological variables (HySOG and Ksat), ensuring a comprehensive representation of wetland characteristics. Categorical variables (e.g., HySOG) were one-hot encoded, while continuous variables were standardized.

For modeling, the scikit-learn Python library (version 1.5) was used. Categorical features (e.g., HySOG) were encoded with OneHotEncoder, and a linear scaler was applied to all continuous variables. Each stage was optimized for accuracy, spatial consistency, and computational efficiency. All candidate machine learning classifiers were trained using the extracted point-level dataset, with data types standardized prior to training. Analysis of feature importance was conducted to identify the most informative predictors. Each model was validated using stratified hold-out samples, and performance was evaluated based on overall accuracy and class-specific metrics.
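A minimal scikit-learn sketch of this kind of workflow is shown below. It should be read as an assumption-based example, not the study's configuration: the data are synthetic, StandardScaler is assumed as the scaler for continuous variables, and the Random Forest hyperparameters, split fraction, and column order are illustrative choices.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

rng = np.random.default_rng(0)
n = 600
# Synthetic stand-in for the point dataset: 13 continuous predictors
# (8 bands, NDVI, NDWI, elevation, TWI, Ksat) plus one categorical soil code.
X_cont = rng.normal(size=(n, 13))
hysog = rng.integers(1, 5, size=(n, 1))  # hypothetical soil group codes 1-4
X = np.hstack([X_cont, hysog])
# Synthetic wetland label loosely driven by two of the continuous columns.
y = (X_cont[:, 11] + 0.5 * X_cont[:, 8]
     + rng.normal(scale=0.5, size=n) > 0.8).astype(int)

preprocess = ColumnTransformer([
    ("continuous", StandardScaler(), list(range(13))),        # assumed scaler
    ("soil", OneHotEncoder(handle_unknown="ignore"), [13]),   # categorical HySOG
])
model = Pipeline([
    ("preprocess", preprocess),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
])

# Stratified hold-out split keeps the wetland share equal in both subsets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)
model.fit(X_train, y_train)
wetland_f1 = f1_score(y_test, model.predict(X_test), pos_label=1)
proba = model.predict_proba(X_test)[:, 1]          # wetland probability
importances = model.named_steps["rf"].feature_importances_
```

Wrapping the encoder and scaler in a single pipeline means the same transformations fitted on the training points are reused unchanged at prediction time, and swapping the final step for another classifier leaves the preprocessing untouched.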

The trained classifier was applied across the full set of stacked raster tiles. Each tile was read, features were preprocessed as required, and class predictions were generated on a per-pixel basis, including both wetland/non-wetland classification and probability maps that reflected classification confidence. The outputs were written as single-band classification rasters with consistent metadata, resolution, and spatial extent.
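The per-pixel application of a classifier to one tile can be sketched as a reshape, predict, reshape sequence. The example below is a sketch under stated assumptions: the scoring function is a stand-in for a trained model, and the function name, the 0.5 threshold, and the no-data label of 255 are hypothetical.

```python
import numpy as np


def predict_tile(stack, predict_proba, threshold=0.5, nodata=255):
    """Classify a (bands, rows, cols) tile pixel by pixel.

    predict_proba is any callable mapping an (n, bands) array to n wetland
    probabilities, for example a wrapper around a fitted classifier.
    """
    bands, rows, cols = stack.shape
    pixels = stack.reshape(bands, -1).T           # one row per pixel
    valid = np.all(np.isfinite(pixels), axis=1)   # skip pixels with missing data
    prob = np.full(rows * cols, np.nan, dtype="float32")
    if valid.any():
        prob[valid] = predict_proba(pixels[valid])
    label = np.full(rows * cols, nodata, dtype="uint8")
    label[valid] = (prob[valid] >= threshold).astype("uint8")
    return label.reshape(rows, cols), prob.reshape(rows, cols)


def toy_scorer(x):
    """Stand-in for a trained model: logistic function of the first band."""
    return 1.0 / (1.0 + np.exp(-x[:, 0]))


stack = np.array([
    [[2.0, -2.0], [0.0, np.nan]],   # band 1
    [[0.0, 0.0], [0.0, 0.0]],       # band 2
])
label, prob = predict_tile(stack, toy_scorer)
```

Keeping the probability raster alongside the binary label allows the classification threshold to be raised later without re-running the model.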

Model performance was assessed using stratified hold-out validation, with evaluation metrics including overall accuracy, precision, recall, and F1 score at both the class and macro levels. This comparative framework allowed us to systematically evaluate trade-offs in classification accuracy, computational efficiency, spatial consistency, and generalizability across models, guiding the selection of scalable methods for national wetland monitoring.
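For reference, the class-level and macro-averaged metrics named above can be computed from paired labels as in the following sketch. The labels are made up for illustration and the function name is hypothetical.

```python
def class_metrics(y_true, y_pred, positive):
    """Return (precision, recall, F1) for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Made-up hold-out labels: 1 = wetland, 0 = non-wetland.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
wetland = class_metrics(y_true, y_pred, positive=1)
non_wetland = class_metrics(y_true, y_pred, positive=0)
macro_f1 = (wetland[2] + non_wetland[2]) / 2
```

In this toy case overall accuracy is 0.80 while the wetland F1 score is 0.75, showing how overall accuracy can overstate performance on the minority class.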

2.4. National-Scale Deployment of the Model Outputs

To support national-scale wetland classification, seamless classification maps spanning all of New Zealand (area: ~268,000 km²) were generated. Predicted tiles were grouped based on spatial proximity and mosaicked into larger continuous rasters. Overlapping tiles were merged using the first valid pixel value rule (i.e., the value of the first non-NA pixel in a stack of overlapping tiles), ensuring that tile boundaries were not visually prominent in the final output. Final mosaics were compressed and saved as cloud-optimized GeoTIFFs. This approach enabled consistent and scalable production of high-resolution wetland maps that align with New Zealand’s national monitoring and policy needs.
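The first-valid-pixel rule can be expressed compactly with NumPy. This sketch assumes the overlapping tiles have already been placed on a common grid as equally shaped arrays and that 255 marks missing pixels; both the function name and the no-data value are hypothetical.

```python
import numpy as np


def first_valid_mosaic(layers, nodata=255):
    """Merge equally shaped arrays, keeping the first non-nodata value per pixel."""
    layers = list(layers)
    mosaic = np.full(layers[0].shape, nodata, dtype=layers[0].dtype)
    for layer in layers:
        # Only fill pixels that are still empty and valid in this layer.
        fill = (mosaic == nodata) & (layer != nodata)
        mosaic[fill] = layer[fill]
    return mosaic


tile_a = np.array([[1, 255], [255, 255]], dtype="uint8")
tile_b = np.array([[0, 0], [255, 1]], dtype="uint8")
mosaic = first_valid_mosaic([tile_a, tile_b])
```

Where both tiles hold a value the earlier tile takes precedence, and pixels missing from every tile remain marked as no-data.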

3. Results

3.1. Model Evaluation and Comparison

To evaluate the performance of different machine learning models in classifying wetlands from high-resolution PlanetScope imagery and environmental covariates, we tested four classifiers (RF, HGB, XGB, and MLPC) across six different predictor combinations. Three of these included only SuperDove imagery with different band combinations: red, green, and blue (RGB; bands 6, 4, and 2); RGB and near-infrared (RGBI; bands 6, 4, 2, and 8); and the full eight-band PlanetScope stacks (PSS8B). One combination included the full eight-band images augmented with vegetation indices (NDVI and NDWI), and two further combinations added terrain-derived variables (DEM and TWI) and then soil properties (Ksat and HySOG).
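The two spectral indices can be derived from the band numbers given above, as in this minimal sketch. The text does not state which NDWI formulation was used, so the green/near-infrared (McFeeters) form is an assumption here, and the reflectance values are invented for illustration.

```python
import numpy as np

def normalized_difference(a, b):
    """Return (a - b) / (a + b), with NaN where the denominator is zero."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = a + b
    out = np.full(denom.shape, np.nan)
    np.divide(a - b, denom, out=out, where=denom != 0)
    return out

# Band numbers follow the text: green = 4, red = 6, near-infrared = 8.
# Reflectance values are illustrative, not taken from the study.
green = np.array([0.08, 0.10, 0.0])
red = np.array([0.05, 0.12, 0.0])
nir = np.array([0.40, 0.06, 0.0])

ndvi = normalized_difference(nir, red)    # vegetation greenness
ndwi = normalized_difference(green, nir)  # McFeeters form (assumed)
```

The first pixel behaves like dense vegetation (high NDVI, negative NDWI), the second like open water (negative NDVI, positive NDWI), and the third is a zero-reflectance pixel that yields NaN rather than a division error.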

As illustrated in Figure 4, overall classification accuracy peaked at 0.89 for the RF, XGB, and HGB models when using the full feature set, while RGB-based combinations plateaued at approximately 0.83. However, overall accuracy masked important differences in class-level performance. Weighted macro F1 scores (Figure 5) highlighted superior performance for wetland classes when enriched environmental inputs were used. Specifically, using the full combination of predictor variables (PSS8B_NDVI_NDWI_DEM_TWI_Ksat_HySOG) yielded the highest F1 score for wetland classification—up to 0.73 using XGB, compared to ≤0.60 for RGB-based models. This demonstrates that incorporating soil and hydrological layers notably enhances the model’s ability to correctly identify wetlands, which are typically under-represented and spectrally mixed classes in complex landscapes. Complete model performance metrics are provided in Appendix A.

Figure 4. Heatmap showing overall classification accuracy for various spectral- and auxiliary-band combinations across four machine learning models: HGB, MLPC, RF, and XGB. Models incorporating comprehensive feature sets (e.g., PSS8B_NDVI_NDWI_DEM_TWI_Ksat_HySOG) consistently achieve higher overall accuracy, particularly with gradient boosting methods. Simpler inputs, such as RGB and RGBI, result in comparatively lower performance, highlighting the benefit of extended feature information for wetland classification.

Figure 5. Wetland-class F1 scores for different spectral- and auxiliary-band combinations across four machine learning models: HGB, MLPC, RF, and XGB. The inclusion of hydrological and topographic features (e.g., FABDEM, TWI, Ksat, and HySOG) improves model performance, with the highest F1 score (0.73) achieved using the full feature set and XGB. Simpler input combinations, such as RGB and RGBI, yield lower wetland F1 scores, underscoring the importance of enriched feature sets for accurate wetland detection.

Across all feature combinations, model-level comparisons revealed subtle but consistent performance differences. HGB and RF generally outperformed MLPC and XGB in terms of both overall accuracy and F1-score consistency, particularly when richer environmental variables were included. HGB exhibited the highest accuracy (0.89) and competitive weighted F1 scores (up to 0.72), suggesting its robustness in capturing both majority- and minority-class patterns. RF showed similar strengths, particularly in preserving class balance across heterogeneous inputs. MLPC, while slightly trailing in overall accuracy, demonstrated stable performance across most band combinations, indicating it may be more resilient to reduced input dimensionality. In contrast, XGB showed the greatest sensitivity to input features—performing well with full stacks (F1 = 0.73) but less reliably with RGB-only inputs (F1 = 0.56). These results suggest ensemble-based tree models (HGB and RF) offer the most balanced trade-off between accuracy and class-level sensitivity, especially for detecting spectrally and structurally complex wetland features.

When using the full spectral and environmental stack (PSS8B_NDVI_NDWI_DEM_TWI_Ksat_HySOG), both HGB and RF achieved high per-class F1 scores for wetlands (0.74 and 0.73, respectively), indicating strong sensitivity and precision in distinguishing wetland areas from non-wetlands. MLPC also showed reasonable performance on wetlands (F1 ≈ 0.71), though was slightly more variable across simpler band combinations. XGB demonstrated comparable wetland F1 performance (0.73) under full-feature conditions but was more affected by reduced inputs, dropping to 0.60 with RGBI and as low as 0.56 with RGB. Notably, while overall accuracy for all models was high, only the HGB and RF models maintained wetland F1 scores above 0.70 consistently across the more complex feature sets, highlighting their robustness in classifying ecologically important but often under-represented wetland categories.

Among the models that included TWI and HySOG predictors, XGB consistently performed the best overall. When only the TWI was added to the feature set, all models showed modest improvements, but XGB achieved a slightly higher wetland F1 score (0.69) and recall (0.63) compared to others, suggesting a stronger sensitivity to terrain-related features. Once HySOG and other hydrological variables were introduced, performance gains became more substantial across the board. In this context, XGB again led, with the highest wetland F1 score of 0.73 and a recall of 0.67, indicating both precision and sensitivity improved markedly. RF and HGB also performed strongly, both reaching an F1 score of 0.72, while MLPC lagged slightly, at 0.71. These patterns suggest that XGB is particularly effective at leveraging hydrological and soil-related inputs for wetland detection, making it the most reliable model in the TWI and HySOG-enhanced scenarios.
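Because only F1 and recall are quoted in this paragraph, the corresponding precision can be back-calculated from the definition F1 = 2PR / (P + R). The sketch below does this for the two XGB cases; since the reported values are rounded to two decimals, the results are approximate and the full metrics in Appendix A take precedence.

```python
def implied_precision(f1, recall):
    """Solve F1 = 2PR / (P + R) for precision P."""
    denom = 2 * recall - f1
    if denom <= 0:
        raise ValueError("F1 must be less than twice the recall")
    return f1 * recall / denom

p_full = implied_precision(0.73, 0.67)  # XGB, full feature set
p_twi = implied_precision(0.69, 0.63)   # XGB, TWI added only
```

Under these rounded inputs, the implied wetland precision is about 0.80 for the full feature set and about 0.76 when only the TWI is added, which is consistent with the statement that both precision and recall improved.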

The progression of feature sets reveals clear trends in how band combinations influence model performance across all four classifiers. At the base level, models using only the core 8-band PlanetScope composite (Coastal blue to Near infrared) already achieved strong accuracy (~0.87–0.88) and wetland F1 scores around 0.67–0.72. MLPC and RF performed slightly better than HGB and XGB in terms of wetland recall and F1, indicating that even without additional predictors, these models were effective in identifying wetlands. The addition of NDVI and NDWI improved wetland-specific metrics modestly, particularly for RF and XGB, which both saw increases in recall and precision. This suggests that vegetation and moisture indices, derived from the existing spectral bands, offer marginal gains in discriminating wetland areas. Further incorporating FABDEM and the TWI provided incremental gains. Although the overall accuracy remained similar, wetland precision and recall slightly increased for most models. Interestingly, MLPC showed a small dip in recall compared to using only the NDVI/NDWI, highlighting some sensitivity to terrain variables.

The most notable jump came after adding HySOG and Ksat to the input features. All models improved in wetland-related metrics, with RF and XGB reaching F1 scores above 0.72 and wetland recalls around 0.65–0.67. This confirms that soil group and hydraulic conductivity are particularly informative for wetland classification, offering complementary information beyond spectral and vegetation indices. While the base PlanetScope bands provide a solid foundation, the inclusion of additional predictors such as the NDVI, TWI, and especially HySOG leads to consistent and meaningful improvements. Among these, XGB consistently ranks highest or ties for best in wetland F1 score, followed closely by RF, demonstrating their robustness across progressively enriched feature sets.

3.2. Feature Importance

The feature importance analysis reveals notable differences between RF and MLPC in how they prioritize features for wetland classification (Figure 6). In both models, Green I, NDWI, and NDVI rank among the most influential features, confirming that spectral bands and vegetation indices play a key role in distinguishing wetland types. However, the ranking and emphasis of these features differ.

Figure 6. Classification performance and feature contribution analysis for four machine learning models applied to wetland detection. The panels on the left show confusion matrices for (a) Histogram-Based Gradient Boosting (HGB), (b) Multi-Layer Perceptron Classifier (MLPC), (c) Random Forest (RF), and (d) XGBoost (XGB). These matrices reflect model performance on stratified validation samples, with stronger diagonal values indicating better agreement with ground-truth labels. Panels on the right display the corresponding feature importance rankings derived from each model. Ensemble models (HGB, RF, and XGB) prioritize spectral indices (NDVI and NDWI) and topographic predictors (TWI and FABDEM), while MLPC places greater emphasis on spectral bands such as Red Edge and Near-Infrared, capturing complex spectral relationships. Together, the figures highlight distinct learning patterns across model types and their effectiveness in discriminating wetland and non-wetland classes.

In RF, Green I is the most important feature, followed by the NDWI, TWI, and NDVI (Figure 6), suggesting that the model relies on both spectral and topographic features. RF also assigns higher importance to TWI and FABDEM, indicating that it integrates topographical information when making classification decisions. On the other hand, MLPC places greater emphasis on Red Edge, which emerges as the most important feature in that model. The ranking of features such as Near-Infrared, Red, and Coastal Blue is also more pronounced in MLPC, reinforcing its reliance on spectral patterns.

When comparing lower-ranked features, HySOG and Ksat consistently appear as the least influential in both models, suggesting that soil properties contribute less to the classification outcome compared to spectral and hydrological indicators. Additionally, the importance of Coastal Blue is relatively higher in MLPC than in RF, indicating that MLPC might be capturing finer spectral details that RF does not prioritize as strongly.

3.3. Wetland Prediction Across New Zealand

Each model was used to produce wetland maps and wetland class likelihoods at the national scale across New Zealand for September–October 2024. Example predictor variables and wetland predictions from each model are shown in Figure 7 for an area in northern New Zealand that was excluded from the training data and was therefore previously unseen by the models. Comparing the LCDBv5 (ground truth data, Figure 7b) with the basemap image indicates that LCDBv5 appears to underestimate wetland extent, as it omits a clear riverine wetland area in the northeast of the area. The predictions for each model shown in the third row of Figure 7 indicate distinct variability in predicted wetland extent, with RF (Figure 7i) clearly under-predicting extent and HGB (Figure 7j) and XGB (Figure 7k) both matching the LCDBv5 extent reasonably well while also predicting riverine wetlands absent from the LCDBv5 data. Although it has been indicated as an important feature, the coarse ~1 km spatial resolution of Ksat (Figure 7g) causes a loss of spatial fidelity in the predictions, and there is a need for higher resolution, more detailed soil information.

Figure 7. Binary wetland classification (a–d,i–l) and corresponding probability outputs (e–h,m–p) from four machine learning models (HGB, MLPC, RF, and XGB) for each area indicated in Figure 2. White and gray regions indicate areas with no data and those masked due to cloud cover or boundary clipping.

The prediction likelihood maps for each model (Figure 7m–p) indicate where the models are most confident in wetland prediction, with the highest values occurring in the location of the wetland extent within the LCDBv5 data. These maps indicate good potential for improvement in the binary wetland maps, for example, by softening the classification. In addition, they may be utilized in multi-temporal monitoring of wetland ecosystems by using the prediction likelihood as a proxy for wetland condition or status, although this would need further analysis.

Each model was used to predict wetland extent and likelihood across New Zealand, with these data made available under an open license. An example map for HGB is shown in Figure 8a, alongside the LCDBv5 wetland extent in Figure 8b. This indicates a generally good representation of current wetlands on the national scale, although the model predictions contain far more detail at the local level due to the ~3 m imagery used, including the detection of small wetlands. Figure 8c shows the potential area of wetlands in New Zealand [48,49], indicating the likely area of historical wetland systems that have been lost following drainage by European settlers since ~1800. Importantly, the model does not appear to over-predict wetlands within areas where they have been lost, indicating that it is able to discern between areas with a propensity for wetlands and those with current wetland systems. However, the total national area for each model (Figure 9) indicates that wetland extent is likely to be over-predicted, although it is not possible to confirm this due to the spatial scale of the reference data, which misses small wetlands <1 ha [41] and may misrepresent the edges of wetlands. By increasing the classification probability threshold from 0.5 to 0.66, the area of predicted wetlands decreases by as much as 64% for the random forest model. By smoothing the probability maps using a mean spatial filter with a 3 × 3 kernel, the total area is reduced further. This indicates that there are substantial areas of predicted wetland that are close to the classification threshold and numerous locations where predicted wetlands are only a few pixels in extent (<~0.01 ha). Consequently, it is likely that the models can be improved, particularly if additional training data are available with higher spatial precision.
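The effect of the two post-processing steps on predicted area can be sketched on a toy probability map. The assumptions here are a nominal 3 m pixel (0.0009 ha), replicated-edge padding for the 3 × 3 mean filter, and smoothing applied before thresholding; the values are illustrative and the helper names are hypothetical.

```python
import numpy as np

PIXEL_AREA_HA = 3.0 * 3.0 / 10_000  # ~3 m pixels -> 0.0009 ha each

def mean_filter_3x3(proba):
    """3 x 3 moving average; edges use replicated border values (assumed)."""
    padded = np.pad(proba, 1, mode="edge")
    rows, cols = proba.shape
    total = np.zeros_like(proba, dtype=float)
    for dr in range(3):
        for dc in range(3):
            total += padded[dr:dr + rows, dc:dc + cols]
    return total / 9.0

def wetland_area_ha(proba, threshold):
    """Area of pixels whose wetland probability meets the threshold."""
    return np.count_nonzero(proba >= threshold) * PIXEL_AREA_HA

# Illustrative map: a 3 x 3 wetland block with a marginal center pixel,
# plus one isolated pixel of moderate probability.
proba = np.full((7, 7), 0.1)
proba[1:4, 1:4] = 0.9
proba[2, 2] = 0.6
proba[5, 5] = 0.7

area_050 = wetland_area_ha(proba, 0.50)
area_066 = wetland_area_ha(proba, 0.66)
area_066_smooth = wetland_area_ha(mean_filter_3x3(proba), 0.66)
```

In this toy case the predicted area shrinks from 10 pixels at a 0.5 threshold to 9 pixels at 0.66, and to a single pixel once the map is smoothed, because small patches and patch edges are averaged with low-probability neighbors.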

Figure 8. (a) National prediction of wetlands using the HGB model with all predictor variables compared to (b) the wetland part of the LCDBv5 dataset and (c) the potential area of wetlands, indicating the substantial loss of wetland systems that has occurred within New Zealand.

Figure 9. National area of wetlands produced by each model, with comparison to the reference data, indicating that the models likely over-predict wetland extent; increasing the probability threshold for wetland classification to 0.66 and spatial filtering reduce the total predicted extent to closer to LCDB.

4. Discussion

The classification results demonstrate that all four machine learning models—RF, HGB, XGB, and MLPC—performed well in detecting wetlands from high-resolution PlanetScope imagery and environmental covariates, with varying strengths. When using the full feature stack (PSS8B_NDVI_NDWI_FABDEM_TWI_Ksat_HySOG), RF and HGB achieved the highest overall accuracy (both at 0.89), followed closely by XGB and MLPC (both at 0.88). For binary wetland classification, the highest F1 scores were recorded by XGB (0.73) and RF/HGB (both 0.72), indicating a strong ability to distinguish wetland from non-wetland areas when comprehensive spectral and terrain inputs were provided. MLPC also showed competitive performance (wetland F1 score of 0.71), despite its relatively lower spatial consistency.

While each model offers unique strengths, ensemble tree methods (especially RF and XGB) provided the most balanced trade-offs between accuracy, class sensitivity, and computational efficiency. MLPC, though more sensitive to input dimensionality and noise, remains a valuable tool for spectrally based generalization. These findings support the potential of automated, pixel-based classifiers for national-scale wetland detection while also emphasizing the need for continued refinement in model design, training data, and spatial post-processing to improve boundary delineation and class generalization. A comparison of these four models is presented in Table 2.

Table 2. Comparative summary of four machine learning classifiers—Random Forest (RF), XGBoost (XGB), Histogram-Based Gradient Boosting (HGB), and Multi-Layer Perceptron Classifier (MLPC)—evaluated for their suitability in wetland classification using high-resolution satellite and geospatial data. Criteria include interpretability, handling of non-linear relationships, performance on imbalanced data, computational efficiency, categorical feature handling, robustness to noisy labels, scalability, spatial consistency, and relevance to wetland detection. Each model exhibits distinct strengths and trade-offs, with ensemble-based methods (RF, XGB, and HGB) showing greater efficiency and robustness, while MLPC offers flexibility in modeling complex patterns but requires more tuning and computational resources.

The difference in performance in classifying wetlands can be attributed to several factors. One factor is that wetlands often exhibit high intra-class variability due to differences in vegetation, water levels, and soil types. This variability makes it more difficult for the model to generalize patterns for the overall wetland class, which comprises many different types, leading to lower precision and recall compared to the broader wetland category, where such distinctions are less necessary.

Classification performance is also influenced by the quality and resolution of the training data. In this study, training labels were derived from the Land Cover Database (LCDBv5), which focuses primarily on terrestrial vegetation classes and provides only a limited representation of wetlands. As a result, wetlands were often captured in the dataset as vegetated areas with wetland context rather than as distinct hydrological or ecological entities. This limited scope, combined with the absence of comprehensive ground-truth data for wetlands, constrained the model’s ability to fully capture the diversity and boundaries of wetland ecosystems. Enhancing the training dataset with field-verified wetland observations would be critical for improving classification accuracy, especially for under-represented or non-vegetated wetlands.

Wetlands are inherently dynamic ecosystems, and their boundaries are often diffuse and transitional. Seasonal fluctuations in water levels, vegetation growth, and hydrological connectivity, along with human-induced changes such as drainage or land conversion, can significantly alter wetland appearance over time. These factors create blurred distinctions between wetland and non-wetland areas, presenting a challenge for pixel-based classification models that rely on static imagery. As a result, even well-performing models may struggle to consistently and accurately delineate wetlands, particularly in regions where seasonal variability or anthropogenic disturbance is high. Testing of the models with images acquired at other times of year (e.g., summer) should be conducted, including assessments of their ability to be used to monitor wetland state. We suggest that changes to the wetland classification probability metric may be used as an indicator of wetland status, but further research is needed to test this.

The differences in feature importance among different models highlight fundamental distinctions in how each model processes information. RF, XGB, and HGB, as decision-tree-based models, tend to prioritize features that create clear hierarchical splits, which explains the high importance assigned to Green I, NDWI, and TWI. This suggests that the tree-based models leverage both spectral and terrain-based indicators, making them particularly effective when elevation and hydrological factors are relevant in distinguishing wetlands from non-wetlands.

In contrast, MLPC assigns greater importance to spectral bands like Red Edge and Near-Infrared, which suggests that neural networks are better at capturing fine-grained spectral variations that might not be explicitly modeled in RF. This difference implies that MLPC identifies non-linear interactions between features, whereas RF relies on feature separability through decision trees. The higher ranking of Coastal Blue in MLPC further suggests that it detects subtle spectral signatures that RF does not exploit as effectively.

The fact that HySOG and Ksat consistently show lower feature importance in wetland detection models suggests that soil properties alone do not strongly differentiate wetlands. However, this does not mean these features are not relevant; rather, they may be highly correlated with other variables or lack sufficient spatial variation to significantly influence model predictions. The use of data with better spatial resolution might help to improve model performance. The low ranking of the TWI in MLPC compared to RF also suggests that neural networks may not leverage topographic features as effectively as RF, possibly because these features contribute to clear separations in decision-tree models but are less influential in the learned representations of a neural network.

These findings suggest that RF is likely better suited for datasets where topographic and hydrological indicators are crucial, while MLPC may provide better classification accuracy when spectral variations are the dominant distinguishing factors. If the goal is to enhance classification performance, a hybrid approach combining both models could be beneficial, leveraging RF for structure-based classification and MLPC for spectrally based differentiation.
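One simple way to combine two models is soft voting, that is, averaging their per-pixel wetland probabilities before thresholding. The text does not specify how a hybrid would be built, so this is only one possible reading; the equal weighting and the probability values are assumptions for illustration.

```python
import numpy as np

def soft_vote(proba_a, proba_b, weight_a=0.5):
    """Weighted average of two models' wetland probabilities."""
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    return weight_a * np.asarray(proba_a) + (1.0 - weight_a) * np.asarray(proba_b)

# Illustrative per-pixel probabilities, not model outputs from the study.
rf_proba = np.array([0.80, 0.40, 0.55])
mlpc_proba = np.array([0.70, 0.70, 0.35])

combined = soft_vote(rf_proba, mlpc_proba)
labels = (combined >= 0.5).astype(int)
```

Where the two models disagree, the averaged probability decides the label, so a confident model can outweigh a marginal one; the weight could be tuned on hold-out data.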

Most earlier wetland classification studies using satellite imagery and machine learning used RF algorithms, with variable degrees of accuracy [40,50]. These studies differed in pixel resolution; for example, one national-scale classification for France produced a map at 5 m ground resolution by combining DEM data with 10 m Sentinel-2 satellite images [40]. Our models are based on the ~3 m resolution of SuperDove 8-band data, with F1 scores between 0.71 and 0.73, compared to the French model’s F1 score of 0.75. However, as noted, there is potential for improvement if better ground-truth data for wetland model training become available.

Evaluating model transferability is important to determine whether classification models trained on high-resolution imagery, such as ~3 m SuperDove data, can still perform well when applied to coarser-resolution satellite imagery, such as 10 m Sentinel-2 or 30 m Landsat data. This is particularly relevant for large-scale wetland mapping, where higher-resolution data may not always be necessary. Additionally, wetlands across New Zealand have varying hydrological and vegetation characteristics that appear differently depending on the spatial scale. Therefore, classification models need to be adaptable to varying resolutions to accurately capture these differences [51].

Comparing MLPC with traditional ML algorithms like RF, as well as boosting methods, provides insights into their respective trade-offs in terms of classification accuracy, computational efficiency, and generalization. Additionally, model comparison helps determine whether deep learning-based methods offer a substantial performance gain over ensemble-based techniques, given the constraints of available training data. By systematically evaluating different algorithms, this research aimed to establish a reliable and interpretable wetland classification model that balances predictive power with ecological relevance for improved wetland monitoring and conservation strategies.

As it is a neural network, MLPC excels over RF by automatically learning feature interactions from the dataset that may be missed by RF without explicit manual feature engineering [53]. Studies across various domains, from medical diagnosis to cybersecurity, underscore the strength of multi-layer perceptron (MLP) in handling complex non-linear relationships and offering a flexible architectural design [54,55,56]. These findings highlight MLP’s broad applicability and effectiveness in classification tasks, reinforcing its value in capturing intricate data patterns.

The model validation exercise indicated some underlying issues with the training data extracted from LCDBv5. These data were developed using Sentinel-2 satellite imagery, with a reported average location error of approximately 15 m. The higher-resolution (~3 m) data from PlanetScope are therefore inherently mismatched with these labels for pixel-level classification. More sample points are needed to rectify some of this, and to do so, the best alternative may be ground-truth field maps of wetland types from other sources, such as local government agencies, although these maps have limited coverage. Since wetlands have gone through extensive land use change, it would also be helpful to incorporate land use data, as these can explain or restrict false-positive detections where the bio-physical context suggests the presence of wetlands but the land has been converted to other uses.

In this context, a consistent, high-resolution national wetland map is urgently needed to support regional governments and conservation agencies in meeting their obligations under the NPS-FM and to address long-standing data gaps in wetland monitoring. While previous efforts, such as the LCDBv5 and the associated Manaaki Whenua wetland layer, provide broad national coverage, they are often limited in spatial resolution and typically classify vegetated wetlands as a single category, without capturing type-specific distinctions [41,52]. Internationally, large-scale efforts—such as the 5 m national wetland map of France [43], the U.S. National Wetlands Inventory (NWI), and global products like WAD2M [21]—demonstrate the value and feasibility of mapping wetlands at scale yet often rely on coarser imagery or static input layers. These legacy maps also struggle to reflect small, fragmented, or hydrologically dynamic wetland systems. The growing availability of high-resolution satellite imagery, such as SuperDove, and advances in machine learning present an opportunity to develop scalable, spatially explicit wetland maps with greater thematic detail and accuracy [22,44]. A national-scale classification framework, built from transferable and automated methods, can reduce manual mapping burdens, improve consistency, and enhance the repeatability of wetland assessments across diverse landscapes and time periods [14,57,58].

Feature importance assessment can be used for optimizing machine learning models by identifying the most relevant variables, thereby reducing computational costs and improving generalization [36]. Several techniques, such as permutation importance, SHAP (SHapley Additive exPlanations), and recursive feature elimination, have been employed in previous studies to determine which features contribute most to wetland classification [33,34]. The inclusion of ecologically relevant features ensures that the classification model aligns with real-world wetland dynamics, enhancing both interpretability and conservation applicability.
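Of the techniques listed, permutation importance is the simplest to illustrate: each feature is shuffled in turn and the resulting drop in accuracy is recorded. The sketch below uses toy data and a hypothetical stand-in for a fitted model; it is not the procedure used to produce the rankings in this study.

```python
import numpy as np

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when each feature column is shuffled."""
    rng = np.random.default_rng(seed)
    baseline = np.mean(predict(X) == y)
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            drops.append(baseline - np.mean(predict(X_perm) == y))
        importances[j] = np.mean(drops)
    return importances

# Toy data: the label depends only on feature 0; feature 1 is noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 2))
y = (X[:, 0] > 0).astype(int)

def toy_predict(X):
    """Hypothetical fitted model that thresholds feature 0."""
    return (X[:, 0] > 0).astype(int)

importances = permutation_importance(toy_predict, X, y)
```

Shuffling the informative feature roughly halves accuracy, while shuffling the noise feature changes nothing. With correlated predictors the drop is shared between them, which is one reason a feature can rank low without being irrelevant.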

The pixel-based modeling approach has some limitations in classifying objects that are not homogeneous across their extent. This was apparent in each of the model predictions, given the reduction in national area following an increase in the classification probability threshold and spatial filtering of classification probabilities. Vegetated wetlands are one such type of entity, with too much internal variation among pixels for them to be classified as a single object. Deep learning-based AI models may help delineate such land classes. The neural architecture of MLPC models can be used as an initial model for such deep-learning-based convolutional neural network model exercises.

While the pixel-based machine learning models used in this study demonstrate good performance in identifying general wetland presence, a key limitation lies in their inability to delineate precise wetland boundaries. The classification outputs indicate areas likely to contain wetlands based on spectral, hydrological, and topographic predictors but do not inherently represent ecologically meaningful or legally mappable wetland extents. This is particularly relevant in cases where wetlands occur as fragmented or seasonally variable features embedded within heterogeneous landscapes. As a result, while these models are highly effective for screening and broad-scale wetland detection, further refinement—such as object-based image analysis or integration with field-validated delineation datasets—may be necessary to support regulatory applications or restoration planning. Future work should also explore the use of region-based deep learning architectures to enhance the spatial coherence and boundary accuracy of predicted wetland areas.

It is important to note that the definition of wetlands varies across ecological, hydrological, and policy frameworks. In the context of this study, wetlands were defined narrowly as vegetated areas with persistent or seasonally wet conditions, consistent with the dominant wetland classes represented in the LCDB training data. This operational definition excludes non-vegetated wetland types such as open water bodies, ephemeral saturated flats, and mineral wetlands with sparse vegetation. Consequently, the model outputs reflect this focused scope and may not capture the full range of wetland forms present across New Zealand. Future work could broaden this definition to improve representation of hydrologically dynamic and non-vegetated wetland systems.

To address variations in wetland classification, several strategies can be adopted. First, multi-temporal classification can be utilized to incorporate seasonal variability, providing a dynamic view of wetland changes throughout the year. This approach recognizes that wetlands can exhibit significant seasonal changes in vegetation and water levels, which is critical for accurate classification. Second, the inclusion of Synthetic Aperture Radar (SAR) data would enhance the classification process. As reported in the review by Adeli et al. [59], SAR data have proven effective in wetland classification due to their ability to penetrate cloud cover and provide high-resolution images, regardless of lighting conditions. This makes them particularly valuable for the monitoring of wetlands in cloudy or rainy environments. For example, Varugu et al. [60] demonstrated the use of SAR for multi-temporal monitoring of a tidal wetland system. Furthermore, SAR imagery with longer wavelengths (L band) can detect wetlands beneath a vegetation canopy, as demonstrated in Amazonia by Hess et al. [61]. New L-band missions such as NISAR are likely to provide valuable data to aid in wetland monitoring [62].

Finally, knowledge transfer can be leveraged by feeding the MLPC model outputs to more sophisticated deep learning-based convolutional neural network (CNN) models. This knowledge-transferable modeling approach aims to refine predictive accuracy by utilizing the nuanced pattern recognition capabilities of CNNs, thereby enhancing the overall effectiveness of wetland classification. These steps represent a comprehensive strategy to improve the monitoring and management of wetlands through advanced technological means. The global coverage of Planet data and the use of global supporting datasets provide an opportunity to upscale the machine learning modeling approach used in this paper to the global level; at the same time, its high spatiotemporal resolution allows it to be adapted to the local context, as demonstrated in this research.

5. Conclusions

Evolving Geospatial Artificial Intelligence (GeoAI) technologies offer a comprehensive and efficient approach to wetland identification and monitoring by harnessing satellite imagery and advanced algorithms. Machine learning models enhance classification accuracy by analyzing intricate spectral and spatial patterns. This research addressed the challenges in identifying and monitoring wetland types in New Zealand. The objective of this research was to develop a wetland classification model powered by ML using high-resolution satellite images from PlanetScope, supported by ancillary topography and geospatial soil drainage data. Each of the four tested ML methods was able to predict wetlands with reasonable accuracy, with the best performing models being those that include all predictor variables. In comparing the total predicted wetland area nationally, models appear to over-predict, although we note that it is likely that the reference datasets underpredict due to their exclusion of small (<~1 ha) wetlands. Post-processing wetland classifications by adjusting the threshold probability to 0.66 and spatially smoothing the probability led to a closer overall match between predicted wetland extent and the reference dataset, especially for the RF model.

Pixel-based machine learning classification has shown great promise for detecting vegetated wetland types in New Zealand, including small wetlands, with reasonable accuracy. However, we identified some key limitations in the available training data, including uncertainty in identifying wetland types. Further work is required to validate these initial models robustly and to improve and refine them, especially using local data for model verification. Once trained with more certainty, the ML-based models will be able to reduce the cost and effort required for monitoring wetlands over large areas. While all four methods appear promising, RF offers some key advantages, including ease of deployment and transferability, positioning it as a promising candidate for scalable, high-resolution wetland monitoring across diverse ecological settings.

Author Contributions

Conceptualization: M.S.I.K. and M.D.W.; methodology: M.S.I.K. and M.D.W.; software: M.S.I.K. and M.D.W.; validation: M.S.I.K. and M.D.W.; formal analysis: M.S.I.K. and M.D.W.; resources: M.D.W.; data curation: M.S.I.K. and M.D.W.; writing—original draft: M.S.I.K.; writing—review and editing: M.C.V.-C. and M.D.W.; visualization: M.S.I.K. and M.D.W.; supervision: M.C.V.-C. and M.D.W.; project administration: M.C.V.-C.; funding acquisition: M.C.V.-C. and M.D.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Eco-index Programme, Biological Heritage National Science Challenge, Ministry of Business, Innovation and Employment (MBIE), New Zealand (C09X1901).

Data Availability Statement

The original data presented in the study (the machine learning predictions of wetlands) are openly available in FigShare at https://doi.org/10.26021/canterburynz.c.7848596 (accessed 28 July 2025), including metadata that identify which SuperDove images were used for each pixel in the wetland predictions. Restrictions apply to the availability of the SuperDove data. Data were obtained from Planet and are available at https://www.planet.com/ (accessed 28 July 2025) under license.

Acknowledgments

We gratefully acknowledge the assistance of Luke Parkinson (Geospatial Research Institute, University of Canterbury) with software engineering, Karen Denyer (Eco-index Limited/Papawera Geological Consulting Limited) with the review of predicted wetlands, and Kevan Cote (Eco-index Limited) with methodological development in machine learning.

Conflicts of Interest

Author Md. Saiful Islam Khan has been involved as a consultant and expert witness in Eco-index Ltd.

Abbreviations

The following abbreviations are used in this manuscript:

API	Application Programming Interface
CNN	Convolutional Neural Network
COG	Cloud-Optimized GeoTIFF
DEM	Digital Elevation Model
FABDEM	Forests and Buildings removed Digital Elevation Model
GIS	Geographical Information System
HGB	Histogram-Based Gradient Boosting
HySOG	Global Hydrologic Soil Group
Ksat	Saturated hydraulic conductivity
LCDBv5	Land Cover Database version 5
MERIT	Multi-Error-Removed Improved-Terrain DEM
ML	Machine Learning
MLPC	Multi-Layer Perceptron Classifier
NDVI	Normalized Difference Vegetation Index
NDWI	Normalized Difference Water Index
NZ	New Zealand
PSS8B	PlanetScope SuperDove 8-Band imagery
RF	Random Forest
RGB	Red, green, blue
RGBI	Red, green, blue, near-infrared
TWI	Topographic Wetness Index
XGB	eXtreme Gradient (XG) Boosting

Appendix A

Table A1. Model performance metrics across different input feature combinations for wetland classification. Accuracy, recall, precision, and wetland-specific F1 scores are shown for four machine learning models—Histogram-Based Gradient Boosting (HGB), Multi-Layer Perceptron Classifier (MLPC), Random Forest (RF), and XGBoost (XGB)—evaluated on various input stacks, ranging from simple RGB to full-featured combinations (e.g., PSS8B with NDVI, NDWI, FABDEM, TWI, Ksat, and HySOG). Overall performance improves with the inclusion of hydrological and topographic predictors, with the highest wetland F1 scores (≥0.72) achieved by RF, HGB, and XGB when using the full environmental stack. Simpler input configurations (e.g., RGB or RGBI) consistently result in reduced performance across all models, underscoring the importance of enriched feature sets for accurate wetland detection.

Input features / Model	Accuracy	Recall	Precision	Wetland F1
PSS8B
HGB	0.87	0.60	0.76	0.67
MLPC	0.88	0.69	0.75	0.72
RF	0.88	0.61	0.79	0.69
XGB	0.87	0.62	0.75	0.68
PSS8B_NDVI_NDWI
HGB	0.87	0.61	0.76	0.68
MLPC	0.88	0.67	0.74	0.71
RF	0.88	0.63	0.79	0.70
XGB	0.88	0.63	0.75	0.69
PSS8B_NDVI_NDWI_FABDEM_TWI
HGB	0.87	0.62	0.76	0.68
MLPC	0.88	0.61	0.78	0.68
RF	0.87	0.62	0.73	0.67
XGB	0.87	0.63	0.75	0.69
PSS8B_NDVI_NDWI_FABDEM_TWI_Ksat_HySOG
HGB	0.89	0.65	0.80	0.72
MLPC	0.88	0.66	0.77	0.71
RF	0.89	0.65	0.81	0.72
XGB	0.89	0.67	0.79	0.73
RGB
HGB	0.83	0.48	0.65	0.55
MLPC	0.83	0.57	0.62	0.60
RF	0.83	0.50	0.63	0.55
XGB	0.83	0.51	0.64	0.56
RGBI
HGB	0.85	0.56	0.68	0.61
MLPC	0.85	0.63	0.66	0.64
RF	0.84	0.55	0.68	0.60
XGB	0.84	0.56	0.66	0.61
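As a consistency check on Table A1, the wetland F1 score can be recomputed as the harmonic mean of precision and recall, F1 = 2PR / (P + R). The sketch below does this for the full-feature rows. The helper name and data layout are illustrative assumptions, and because the table reports precision and recall rounded to two decimals, a recomputed value may differ from the reported one by about 0.01.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (recall, precision, reported wetland F1) for the full-feature rows
# (PSS8B_NDVI_NDWI_FABDEM_TWI_Ksat_HySOG) of Table A1.
full_stack = {
    "HGB": (0.65, 0.80, 0.72),
    "MLPC": (0.66, 0.77, 0.71),
    "RF": (0.65, 0.81, 0.72),
    "XGB": (0.67, 0.79, 0.73),
}

recomputed = {
    model: round(f1_score(precision, recall), 2)
    for model, (recall, precision, _) in full_stack.items()
}

for model, (_, _, reported) in full_stack.items():
    print(f"{model}: recomputed {recomputed[model]:.2f}, reported {reported:.2f}")
```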

References

Marvin, D.C.; Koh, L.P.; Lynam, A.J.; Wich, S.; Davies, A.B.; Krishnamurthy, R.; Stokes, E.; Starkey, R.; Asner, G.P. Integrating technologies for scalable ecology and conservation. Glob. Ecol. Conserv. 2016, 7, 262–275. [Google Scholar] [CrossRef]
Talukdar, S.; Singha, P.; Mahato, S.; Shahfahad; Pal, S.; Liou, Y.-A.; Rahman, A. Land-Use Land-Cover Classification by Machine Learning Classifiers for Satellite Observations—A Review. Remote Sens. 2020, 12, 1135. [Google Scholar] [CrossRef]
Karpatne, A.; Ebert-Uphoff, I.; Ravela, S.; Babaie, H.A.; Kumar, V. Machine learning for the geosciences: Challenges and opportunities. IEEE Trans. Knowl. Data Eng. 2019, 31, 1544–1554. [Google Scholar] [CrossRef]
Anantrasirichai, N.; Biggs, J.; Albino, F.; Hill, P.; Bull, D. Application of machine learning to classification of volcanic deformation in routinely generated insar data. J. Geophys. Res. Solid. Earth 2018, 123, 6592–6606. [Google Scholar] [CrossRef]
Nadzri, I.F.M.; Khalid, N.; Wahab, W.A.; Hashim, N. Analyzing the effectiveness of support vector machine and random forest classifiers in delineating the green area. IOP Conf. Ser. Earth Environ. Sci. 2023, 1217, 012032. [Google Scholar] [CrossRef]
Gazzea, M.; Kristensen, L.M.; Pirotti, F.; Ozguven, E.E.; Arghandeh, R. Tree Species Classification Using High-Resolution Satellite Imagery and Weakly Supervised Learning. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–11. [Google Scholar] [CrossRef]
Ning, H.; Li, R.; Zhou, T. Machine learning for microalgae detection and utilization. Front. Mar. Sci. 2022, 9, 947394. [Google Scholar] [CrossRef]
Kim, S.; Sosnowski, K.; Hwang, D.S.; Yoon, J.-Y. Smartphone-Based Microalgae Monitoring Platform Using Machine Learning. ACS EST Eng. 2024, 4, 186–195. [Google Scholar] [CrossRef]
Li, S.; Jing, H.; Yuan, Q.; Yue, L.; Li, T. Investigating the spatio-temporal variation of vegetation water content in the western United States by blending GNSS-IR, AMSR-E, and AMSR2 observables using machine learning methods. Sci. Remote Sens. 2022, 6, 100061. [Google Scholar] [CrossRef]
Sasaki, K.; Sekine, T.; Emery, W. Enhancing the Detection of Coastal Marine Debris in Very High-Resolution Satellite Imagery via Unsupervised Domain Adaptation. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 6014–6028. [Google Scholar] [CrossRef]
Basheer, S.; Wang, X.; Farooque, A.A.; Nawaz, R.A.; Liu, K.; Adekanmbi, T.; Liu, S. Comparison of land use land cover classifiers using different satellite imagery and machine learning techniques. Remote Sens. 2022, 14, 4978. [Google Scholar] [CrossRef]
Cheong, S.; Sankaran, K.; Bastani, H. Artificial intelligence for climate change adaptation. WIREs Data Min. Knowl. 2022, 12, e1459. [Google Scholar] [CrossRef]
Paszkuta, M.; Krężel, A.; Ryłko, N. Application of shape moments for cloudiness assessment in marine environmental research. Remote Sens. 2022, 14, 883. [Google Scholar] [CrossRef]
Kaplan, G.; Avdan, U. Mapping and monitoring wetlands using sentinel-2 satellite imagery. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2017, 4, 271–277. [Google Scholar] [CrossRef]
Semeniuk, C.A.; Semeniuk, V. A geomorphic approach to global classification for inland wetlands. Vegetatio 1995, 118, 103–124. [Google Scholar] [CrossRef]
Mitsch, W.J.; Bernal, B.; Hernandez, M.E. Ecosystem services of wetlands. Int. J. Biodivers. Sci. Ecosyst. Serv. Manag. 2015, 11, 1–4. [Google Scholar] [CrossRef]
Gallant, A. The challenges of remote monitoring of wetlands. Remote Sens. 2015, 7, 10938–10950. [Google Scholar] [CrossRef]
Georgiou, S.; Turner, R.K. Valuing Ecosystem Services: The Case of Multi-Functional Wetlands, 1st ed.; Routledge: London, UK, 2012; p. 192. ISBN 978-1-136-54916-8. [Google Scholar]
Debanshi, S.; Pal, S. Assessing the role of deltaic flood plain wetlands on regulating methane and carbon balance. Sci. Total Environ. 2022, 808, 152133. [Google Scholar] [CrossRef] [PubMed]
Mitsch, W.J.; Gosselink, J.G.; Anderson, C.J.; Fennessy, M.S. Wetlands, 6th ed.; Wiley: Hoboken, NJ, USA, 2023; p. 672. ISBN 978-1119826934. [Google Scholar]
Melton, J.R.; Wania, R.; Hodson, E.L.; Poulter, B.; Ringeval, B.; Spahni, R.; Bohn, T.; Avis, C.A.; Beerling, D.J.; Chen, G.; et al. Present state of global wetland extent and wetland methane modelling: Conclusions from a model inter-comparison project (WETCHIMP). Biogeosciences 2013, 10, 753–788. [Google Scholar] [CrossRef]
Mahdavi, S.; Salehi, B.; Granger, J.; Amani, M.; Brisco, B.; Huang, W. Remote sensing for wetland classification: A comprehensive review. GIsci. Remote Sens. 2018, 55, 623–658. [Google Scholar] [CrossRef]
McCarthy, J.; Leathwick, J.; Roudier, P.; Barringer, J.; Etherington, T.; Morgan, F.; Odgers, N.; Price, R.; Wiser, S.; Richardson, S. New Zealand Environmental Data Stack (NZEnvDS): A standardised collection of spatial layers for environmental modelling and site characterisation. N. Z. J. Ecol. 2021, 45, 3440. [Google Scholar] [CrossRef]
Ludwig, C.; Walli, A.; Schleicher, C.; Weichselbaum, J.; Riffler, M. A highly automated algorithm for wetland detection using multi-temporal optical satellite data. Remote Sens. Environ. 2019, 224, 333–351. [Google Scholar] [CrossRef]
Clarkson, B.; Peters, M. Wetland types. In Wetland Restoration: A Handbook for New Zealand Freshwater Systems; Lincoln, N.Z., Ed.; Manaaki Whenua Press: Lincoln, New Zealand, 2010; pp. 26–37. ISBN 978-0-478-34706-7. Available online: https://www.landcareresearch.co.nz/publications/wetland-restoration (accessed on 28 July 2025).
Ministry for the Environment. National Policy Statement for Freshwater Management 2020—Amended October 2024; Ministry for the Environment: Wellington, New Zealand, 2024; p. 75. [Google Scholar]
Xu, H. Modification of normalised difference water index (NDWI) to enhance open water features in remotely sensed imagery. Int. J. Remote Sens. 2006, 27, 3025–3033. [Google Scholar] [CrossRef]
Hirano, A.; Madden, M.; Welch, R. Hyperspectral image data for mapping wetland vegetation. Wetlands 2003, 23, 436–448. [Google Scholar] [CrossRef] [PubMed]
Lane, C.R.; D’Amico, E. Calculating the Ecosystem Service of Water Storage in Isolated Wetlands using LiDAR in North Central Florida, USA. Wetlands 2010, 30, 967–977. [Google Scholar] [CrossRef]
Sørensen, R.; Zinko, U.; Seibert, J. On the calculation of the topographic wetness index: Evaluation of different methods based on field observations. Hydrol. Earth Syst. Sci. 2006, 10, 101–112. [Google Scholar] [CrossRef]
Pettorelli, N.; Schulte to Bühne, H.; Tulloch, A.; Dubois, G.; Macinnis-Ng, C.; Queirós, A.M.; Keith, D.A.; Wegmann, M.; Schrodt, F.; Stellmes, M.; et al. Satellite remote sensing of ecosystem functions: Opportunities, challenges and way forward. Remote Sens. Ecol. Conserv. 2017, 4, 71–93. [Google Scholar] [CrossRef]
Zhang, Y.; Xiong, F.; Xie, Y.; Fan, X.; Gu, H. The impact of artificial intelligence and blockchain on the accounting profession. IEEE Access 2020, 8, 110461–110477. [Google Scholar] [CrossRef]
Lundberg, S.; Lee, S.-I. A Unified Approach to Interpreting Model Predictions. arXiv 2017. [Google Scholar] [CrossRef]
Jafarzadeh, H.; Mahdianpari, M.; Gill, E.W.; Brisco, B.; Mohammadimanesh, F. Remote Sensing and Machine Learning Tools to Support Wetland Monitoring: A Meta-Analysis of Three Decades of Research. Remote Sens. 2022, 14, 6104. [Google Scholar] [CrossRef]
Rodriguez-Galiano, V.F.; Ghimire, B.; Rogan, J.; Chica-Olmo, M.; Rigol-Sanchez, J.P. An assessment of the effectiveness of a random forest classifier for land-cover classification. ISPRS J. Photogramm. Remote Sens. 2012, 67, 93–104. [Google Scholar] [CrossRef]
Belgiu, M.; Drăguţ, L. Random forest in remote sensing: A review of applications and future directions. ISPRS J. Photogramm. Remote Sens. 2016, 114, 24–31. [Google Scholar] [CrossRef]
Maxwell, A.E.; Warner, T.A.; Fang, F. Implementation of machine-learning classification in remote sensing: An applied review. Int. J. Remote Sens. 2018, 39, 2784–2817. [Google Scholar] [CrossRef]
Guryanov, A. Histogram-Based Algorithm for Building Gradient Boosting Ensembles of Piecewise Linear Decision Trees. In Analysis of Images, Social Networks and Texts: 8th International Conference, AIST 2019, Kazan, Russia, 17–19 July 2019, Revised Selected Papers; van der Aalst, W.M.P., Batagelj, V., Ignatov, D.I., Khachay, M., Kuskova, V., Kutuzov, A., Kuznetsov, S.O., Lomazova, I.A., Loukachevitch, N., Napoli, A., et al., Eds.; Lecture Notes in Computer Science; Springer International Publishing: Cham, Switzerland, 2019; Volume 11832, pp. 39–50. ISBN 978-3-030-37333-7. [Google Scholar]
Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining—KDD ’16, San Francisco, CA, USA, 13–17 August 2016; ACM Press: New York, NY, USA, 2016; pp. 785–794. [Google Scholar]
Jafarzadeh, H.; Mahdianpari, M.; Gill, E.; Mohammadimanesh, F.; Homayouni, S. Bagging and boosting ensemble classifiers for classification of multispectral, hyperspectral and polsar data: A comparative evaluation. Remote Sens. 2021, 13, 4405. [Google Scholar] [CrossRef]
Dymond, J.; Sabetizade, M.; Newsome, P.; Harmsworth, G.; Ausseil, A.-G. Revised extent of wetlands in New Zealand. N. Z. J. Ecol. 2021, 45, 3444. [Google Scholar] [CrossRef]
Trettin, C.C.; Kolka, R.K.; Marsh, A.S.; Bansal, S.; Lilleskov, E.A.; Megonigal, P.; Stelk, M.J.; Lockaby, G.; D’Amore, D.V.; MacKenzie, R.A.; et al. Wetland and hydric soils. In Forest and Rangeland Soils of the United States Under Changing Conditions: A Comprehensive Science Synthesis; Pouyat, R.V., Page-Dumroese, D.S., Patel-Weynand, T., Geiser, L.H., Eds.; Springer International Publishing: Cham, Switzerland, 2020; pp. 99–126. ISBN 978-3-030-45215-5. [Google Scholar]
Rapinel, S.; Panhelleux, L.; Gayet, G.; Vanacker, R.; Lemercier, B.; Laroche, B.; Chambaud, F.; Guelmami, A.; Hubert-Moy, L. National wetland mapping using remote-sensing-derived environmental variables, archive field data, and artificial intelligence. Heliyon 2023, 9, e13482. [Google Scholar] [CrossRef] [PubMed]
PlanetScope|Planet Documentation. Available online: https://docs.planet.com/data/imagery/planetscope/ (accessed on 11 July 2025).
Planet Product Bundles|Planet Documentation. Available online: https://docs.planet.com/develop/apis/orders/product_bundles/ (accessed on 11 July 2025).
Hawker, L.; Uhe, P.; Paulo, L.; Sosa, J.; Savage, J.; Sampson, C.; Neal, J. A 30 m global map of elevation with forests and buildings removed. Environ. Res. Lett. 2022, 17, 024016. [Google Scholar] [CrossRef]
Ross, C.W.; Prihodko, L.; Anchang, J.; Kumar, S.; Ji, W.; Hanan, N.P. HYSOGs250m, global gridded hydrologic soil groups for curve-number-based runoff modeling. Sci. Data 2018, 5, 180091. [Google Scholar] [CrossRef]
Gupta, S.; Lehmann, P.; Bonetti, S.; Papritz, A.; Or, D. Global prediction of soil saturated hydraulic conductivity using random forest in a covariate-based geotransfer function (cogtf) framework. J. Adv. Model. Earth Syst. 2021, 13, e2020MS002242. [Google Scholar] [CrossRef]
Yamazaki, D.; Ikeshima, D.; Tawatari, R.; Yamaguchi, T.; O’Loughlin, F.; Neal, J.C.; Sampson, C.C.; Kanae, S.; Bates, P.D. A high-accuracy map of global terrain elevations. Geophys. Res. Lett. 2017, 44, 5844–5853. [Google Scholar] [CrossRef]
Yamazaki, D.; Ikeshima, D.; Sosa, J.; Bates, P.D.; Allen, G.; Pavelsky, T. MERIT Hydro: A high-resolution global hydrography map based on latest topography datasets. Water Resour. Res. 2019, 55, 5053–5073. [Google Scholar] [CrossRef]
Beven, K.J.; Kirkby, M.J. A physically based, variable contributing area model of basin hydrology / Un modèle à base physique de zone d’appel variable de l’hydrologie du bassin versant. Hydrol. Sci. Bull. 1979, 24, 43–69. [Google Scholar] [CrossRef]
Barnes, M. LCDB v5.0—Land Cover Database version 5.0, Mainland New Zealand. Landcare Res. 2019. [Google Scholar] [CrossRef]
Elsaid, A.F.; Fahmi, R.M.; Shehta, N.; Ramadan, B.M. Machine learning approach for hemorrhagic transformation prediction: Capturing predictors’ interaction. Front. Neurol. 2022, 13, 951401. [Google Scholar] [CrossRef] [PubMed]
Souza, A.F.; Martins, F.R. Demography of the clonal palm Geonoma brevispatha in a Neotropical swamp forest. Austral Ecol. 2006, 31, 869–881. [Google Scholar] [CrossRef]
Khedkar, S.P.; Ramalingam, A.C. Classification and Analysis of Malicious Traffic with Multi-layer Perceptron Model. Ingénierie Systèmes Inf. 2021, 26, 303–310. [Google Scholar] [CrossRef]
Dos Santos, A.M.M.; Pinto, R.C.G.; Duarte, J.C.; Schulze, B.R. Application of profile prediction for proactive scheduling. Rev. Informática Teórica Apl. 2022, 29, 65–75. [Google Scholar] [CrossRef]
Mahdavi, S.; Salehi, B.; Amani, M.; Granger, J.; Brisco, B.; Huang, W. A dynamic classification scheme for mapping spectrally similar classes: Application to wetland classification. Int. J. Appl. Earth Obs. Geoinf. 2019, 83, 101914. [Google Scholar] [CrossRef]
Berhane, T.M.; Lane, C.R.; Wu, Q.; Autrey, B.C.; Anenkhonov, O.A.; Chepinoga, V.V.; Liu, H. Decision-Tree, Rule-Based, and Random Forest Classification of High-Resolution Multispectral Imagery for Wetland Mapping and Inventory. Remote Sens. 2018, 10, 580. [Google Scholar] [CrossRef]
Adeli, S.; Salehi, B.; Mahdianpari, M.; Quackenbush, L.J.; Brisco, B.; Tamiminia, H.; Shaw, S. Wetland Monitoring Using SAR Data: A Meta-Analysis and Comprehensive Review. Remote Sens. 2020, 12, 2190. [Google Scholar] [CrossRef]
Varugu, B.K.; Jones, C.E.; Oliver-Cabrera, T.; Simard, M.; Jensen, D.J. Study of Hydrologic Connectivity and Tidal Influence on Water Flow Within Louisiana Coastal Wetlands Using Rapid-Repeat Interferometric Synthetic Aperture Radar. Remote Sens. 2025, 17, 459. [Google Scholar] [CrossRef]
Hess, L.L.; Melack, J.M.; Affonso, A.G.; Barbosa, C.; Gastil-Buhl, M.; Novo, E.M.L.M. Wetlands of the Lowland Amazon Basin: Extent, Vegetative Cover, and Dual-season Inundated Area as Mapped with JERS-1 Synthetic Aperture Radar. Wetlands 2015, 35, 745–756. [Google Scholar] [CrossRef]
McDonald, K.; Podest, E.; Steiner, N.; Tesser, D.; Zimmermann, R.; Niessner, A.; Rios, M.; Urquiza, J.D.; Huneini, R.; Downs, B.; et al. NISAR: Seeing beyond the trees to understand wetlands, forests and biodiversity. In Proceedings of the IGARSS 2024—2024 IEEE International Geoscience and Remote Sensing Symposium, Athens, Greece, 7–12 July 2024; pp. 6775–6778. [Google Scholar]

Figure 1. Workflow diagram illustrating the wetland classification pipeline developed in this study. Multisource satellite, topographic, and soil datasets—including PlanetScope 8-band imagery, FABDEM, NDVI, NDWI, TWI, and hydrological soil properties (SM, RZSM, HySOG, and Ksat)—are processed, resampled, and aligned into tiled data stacks (768 × 768 m). Feature scaling and encoding are applied before stratified random sampling is used to extract training data from merged wetland and land cover datasets. Machine learning models (Random Forest and MLPC) are trained, evaluated, and applied across the national tile stack, with predicted tiles mosaicked to generate the final wetland classification maps.
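The stratified random sampling step in the workflow can be sketched as follows. This is a minimal illustration only: the equal allocation per class, the sample size, the record layout, and the function name are hypothetical, since the caption does not state how strata were defined or how many points were drawn.

```python
import random
from collections import defaultdict


def stratified_sample(pixels, n_per_class, seed=0):
    """Draw up to n_per_class pixels at random from each class label."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for pixel in pixels:
        by_class[pixel["label"]].append(pixel)
    sample = []
    for label in sorted(by_class):
        members = by_class[label]
        k = min(n_per_class, len(members))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical labelled pixels in which wetlands are the minority class.
pixels = [
    {"id": i, "label": "wetland" if i % 10 == 0 else "non-wetland"}
    for i in range(1000)
]
training = stratified_sample(pixels, n_per_class=50)
```

Sampling within each class separately keeps the minority wetland class from being swamped by non-wetland pixels, which is the usual motivation for stratification in imbalanced land cover problems.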

Figure 2. (Left) The processing grid created across New Zealand, comprising 390 areas, each approximately 1100 km². (Right) example areas are shown in (i,ii): each comprises a mosaic of PlanetScope SuperDove imagery obtained on the dates shown. The yellow boxes in (i,ii) indicate the area shown in subsequent figures.

Figure 3. Example input layers and derived indices used in the national-scale wetland detection workflow for the areas indicated in Figure 2: (a–j) area ID 5edc0e26-0299-4c3f-8631-b39a4e3b0029; (k–t) area ID 1677525c-6446-40bd-99bc-459596e9b67a. The overview maps in (a,k) are © OpenStreetMap contributors and available from https://www.openstreetmap.org (accessed on 28 July 2025). The first row of each group displays core inputs, including environmental and terrain predictors: (b,l) the binary LCDB wetland reference map, where dark blue indicates reference wetland areas; (c,m) true-color composites of SuperDove imagery using red (R), green (G), and blue (B) bands (6, 4, 2); (d,n) false-color infrared composites of SuperDove imagery using near-infrared (R), red-edge (G), and red (B) bands (8, 7, 6); (e,o) the Normalized Difference Vegetation Index (NDVI); (f,p) the Normalized Difference Water Index (NDWI); (g,q) elevation from FABDEM; (h,r) the Topographic Wetness Index (TWI); (i,s) the categorical hydrological soil group (HySOG); (j,t) the saturated hydraulic conductivity (Ksat).
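The derived indices shown in Figure 3 can be computed per pixel as in the sketch below. It uses the band numbering given in the caption (red = band 6, green = band 4, near-infrared = band 8) and the standard definitions NDVI = (NIR − Red) / (NIR + Red) and TWI = ln(a / tan β). The green/NIR form of NDWI, the reflectance values, the minimum-slope guard, and the function names are assumptions for illustration; this section does not state the exact formulation used in the study.

```python
import math


def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 where the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    return normalized_difference(nir, red)


def ndwi(green, nir):
    """Normalized Difference Water Index (green/NIR form, assumed here)."""
    return normalized_difference(green, nir)


def twi(specific_catchment_area, slope_radians, min_slope=1e-6):
    """Topographic Wetness Index, ln(a / tan(beta)).

    A small minimum slope (an assumption) avoids division by zero on flat cells.
    """
    slope = max(slope_radians, min_slope)
    return math.log(specific_catchment_area / math.tan(slope))


# Hypothetical surface reflectance for one vegetated pixel.
band4_green, band6_red, band8_nir = 0.08, 0.05, 0.40

pixel_ndvi = ndvi(band8_nir, band6_red)
pixel_ndwi = ndwi(band4_green, band8_nir)
# Hypothetical terrain: 100 m of upslope area per unit contour width, 1% slope.
pixel_twi = twi(100.0, math.atan(0.01))
```

High NDVI together with negative NDWI is typical of dense vegetation rather than open water, which is one reason the topographic and soil layers are useful alongside the spectral indices for vegetated wetlands.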

Figure 4. Heatmap showing overall classification accuracy for various spectral- and auxiliary-band combinations across four machine learning models: HGB, MLPC, RF, and XGB. Models incorporating comprehensive feature sets (e.g., PSS8B_NDVI_NDWI_DEM_TWI_Ksat_HySOG) consistently achieve higher overall accuracy, particularly with gradient boosting methods. Simpler inputs, such as RGB and RGBI, result in comparatively lower performance, highlighting the benefit of extended feature information for wetland classification.

Figure 5. Wetland-class F1 scores for different spectral- and auxiliary-band combinations across four machine learning models: HGB, MLPC, RF, and XGB. The inclusion of hydrological and topographic features (e.g., FABDEM, TWI, Ksat, and HySOG) improves model performance, with the highest F1 score (0.73) achieved using the full feature set and XGB. Simpler input combinations, such as RGB and RGBI, yield lower wetland F1 scores, underscoring the importance of enriched feature sets for accurate wetland detection.

Figure 6. Classification performance and feature contribution analysis for four machine learning models applied to wetland detection. The panels on the left show confusion matrices for (a) Histogram-Based Gradient Boosting (HGB), (b) Multi-Layer Perceptron Classifier (MLPC), (c) Random Forest (RF), and (d) XGBoost (XGB). These matrices reflect model performance on stratified validation samples, with stronger diagonal values indicating better agreement with ground-truth labels. Panels on the right display the corresponding feature importance rankings derived from each model. Ensemble models (HGB, RF, and XGB) prioritize spectral indices (NDVI and NDWI) and topographic predictors (TWI and FABDEM), while MLPC places greater emphasis on spectral bands such as Red Edge and Near-Infrared, capturing complex spectral relationships. Together, the figures highlight distinct learning patterns across model types and their effectiveness in discriminating wetland and non-wetland classes.

Figure 7. Binary wetland classification (a–d,i–l) and corresponding probability outputs (e–h,m–p) from four machine learning models (HGB, MLPC, RF, and XGB) for each area indicated in Figure 2. White and gray regions indicate areas with no data and those masked due to cloud cover or boundary clipping.

Figure 8. (a) National prediction of wetlands using the HGB model with all predictor variables compared to (b) the wetland part of the LCDBv5 dataset and (c) the potential area of wetlands, indicating the substantial loss of wetland systems that has occurred within New Zealand.

Figure 9. National area of wetlands produced by each model, with comparison to the reference data, indicating that the models likely over-predict wetland extent; increasing the probability threshold for wetland classification to 0.66 and applying spatial filtering bring the total predicted extent closer to that of LCDB.

Table 2. Comparative summary of four machine learning classifiers—Random Forest (RF), XGBoost (XGB), Histogram-Based Gradient Boosting (HGB), and Multi-Layer Perceptron Classifier (MLPC)—evaluated for their suitability in wetland classification using high-resolution satellite and geospatial data. Criteria include interpretability, handling of non-linear relationships, performance on imbalanced data, computational efficiency, categorical feature handling, robustness to noisy labels, scalability, spatial consistency, and relevance to wetland detection. Each model exhibits distinct strengths and trade-offs, with ensemble-based methods (RF, XGB, and HGB) showing greater efficiency and robustness, while MLPC offers flexibility in modeling complex patterns but requires more tuning and computational resources.

Criterion	Random Forest (RF)	XGBoost (XGB)	Histogram Gradient Boosting (HGB)	MLP Classifier (MLPC)
Interpretability	High (feature importance, decision paths clear)	Medium	Medium	Low (weights are abstract, less intuitive)
Handling of Non-linear Relationships	Good	Excellent	Excellent	Excellent
Performance on Imbalanced Data	Good with class weighting or sampling	Very good with built-in handling	Very good (supports sample weighting)	Sensitive; needs careful class balancing
Computational Efficiency	Moderate	Efficient with parallel trees	Highly efficient on large datasets	Slower training, can be memory-intensive
Handling Categorical Features	Requires preprocessing (label/one-hot)	Native support (with encoding), better with ordinal	Good support with categorical encoding	Requires one-hot encoding or manual embedding
Robustness to Noisy Labels	Moderate (averaging helps generalize)	Higher (regularization reduces overfitting)	Higher (leaf histograms smooth noise)	Low; overfits noisy labels unless regularized heavily
Scalability (Large Tile Sets)	Moderate–Good (parallelizable with joblib)	Excellent (multi-core, GPU support via Dask)	Excellent (fast histogram-based splits)	Moderate; slower with large sample sizes
Spatial Consistency	Moderate (pixel-based)	Moderate (no spatial smoothing)	Moderate	Lower (may create speckled predictions if not filtered)
Feature Importance Insights	Yes (Gini importance, permutation)	Yes (SHAP, gain, cover, permutation)	Limited (no built-in importance, uses permutation)	No direct importance; uses permutation post hoc
Use Case in Wetland Detection	Widely used, reliable for baseline models	Excellent for final tuned models	Strong contender for large datasets	Good for non-linear feature fusion; needs tuning
Suitability for Multi-class Wetland Types	Good	Excellent	Excellent	Good; dependent on architecture
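Table 2 lists permutation importance as an option for all four classifiers. The idea can be sketched without any particular library: shuffle one feature column at a time and record how much the accuracy drops. The toy threshold "model", the random data, and the function names below are hypothetical stand-ins for a trained classifier and real feature stacks, not the study's implementation.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model reproduces the label."""
    correct = sum(1 for row, y in zip(rows, labels) if model(row) == y)
    return correct / len(labels)


def permutation_importance(model, rows, labels, n_features, seed=0):
    """Drop in accuracy after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        rng.shuffle(column)
        permuted = [row[:j] + [v] + row[j + 1:] for row, v in zip(rows, column)]
        importances.append(baseline - accuracy(model, permuted, labels))
    return importances


# Toy stand-in for a trained classifier: it only looks at feature 0.
def toy_model(row):
    return 1 if row[0] > 0.5 else 0


data_rng = random.Random(42)
rows = [[data_rng.random(), data_rng.random()] for _ in range(200)]
labels = [toy_model(row) for row in rows]
importances = permutation_importance(toy_model, rows, labels, n_features=2)
```

Because the toy model ignores feature 1, shuffling that column leaves accuracy unchanged and its importance is zero, while shuffling feature 0 can only lower accuracy. Applied to a real classifier, the same procedure gives a model-agnostic ranking that can be compared across RF, XGB, HGB, and MLPC.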

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
