Abstract
Accurate delineation of stream networks in low-gradient wetlands remains challenging due to subtle topographic variation and dense vegetation cover. This study systematically evaluated 48 Unmanned Aerial Vehicle Light Detection and Ranging (UAV-LiDAR) processing workflows through 1128 pairwise comparisons to identify optimal approaches for mapping fine-scale channels in Japan’s Kushiro Wetland, a Ramsar-designated ecosystem. The workflows combined three ground filtering methods (Progressive Morphological Filter, Cloth Simulation Filter, Multiscale Curvature Classification), four interpolation techniques (Inverse Distance Weighting, Triangulated Irregular Network, Kriging, Multilevel B-spline Approximation), two sink-filling algorithms (Planchon & Darboux; Wang & Liu), and two flow direction models (D8, D-infinity). Performance was first assessed using pixel-based Intersection over Union (IoU) metrics to quantify inter-method consensus. Independent plausibility-based validation was then conducted using near-contemporaneous Sentinel-2 imagery. Although pairwise statistical analysis identified workflows that achieved high inter-method consensus (median IoU = 0.90), external validation demonstrated that the CSF-MBA-Planchon-D8 workflow provided the most realistic representation of optically observable channel corridors (validation IoU ≈ 0.85). These findings reveal that high inter-method agreement does not necessarily imply accurate landscape representation; multiple workflows may converge on systematically biased solutions. Ground filtering exerted the strongest influence on pairwise consensus, whereas plausibility-based validation highlighted the importance of selecting workflow combinations that preserve subtle channel morphology. Sink-filling and flow direction choices exerted comparatively minor effects in this low-gradient setting.
The proposed dual-validation framework provides methodological guidance for wetland restoration planning and highlights the necessity of external validation in LiDAR-derived hydrological feature extraction.
1. Introduction
1.1. Background
Wetlands provide essential ecosystem services, including water purification, flood attenuation, carbon sequestration, and biodiversity support [1]. Accurate delineation of stream networks within wetlands is fundamental for hydrological modeling, flood risk assessment, and ecological restoration planning [2]. However, mapping fine-scale drainage networks in low-gradient wetlands presents substantial technical challenges due to subtle topographic relief, dense vegetation cover, and complex hydrological connectivity patterns [3]. Traditional field surveys are labor-intensive and spatially constrained, while global hydrographic datasets such as HydroRIVERS [4,5] lack the spatial resolution necessary for site-specific restoration projects. Unmanned aerial vehicle-based Light Detection and Ranging (UAV-LiDAR) systems provide high-resolution topographic data and have emerged as powerful tools for terrain mapping in complex environments [6].
1.2. Research Gap
Previous studies have typically evaluated individual processing components in isolation, such as comparing ground filtering algorithms [7,8,9] or interpolation techniques [10]. However, hydrological feature extraction is inherently workflow-dependent, and the cumulative effects of combined processing steps remain insufficiently explored.
Moreover, most validation approaches rely on inter-method comparison without critically examining whether high methodological agreement ensures accurate representation of real-world features. The distinction between methodological reproducibility (agreement among workflows) and external validity (correspondence with observable landscape features) remains underexplored in wetland-scale hydrological mapping.
This gap is particularly significant given that river restoration projects increasingly rely on high-resolution topographic data to guide interventions such as meander restoration and floodplain reconnection, where accurate mapping of channel networks and hydrological connectivity is essential for achieving ecological restoration goals [11].
1.3. Objectives
This study does not introduce a new channel extraction algorithm. Instead, its scientific novelty lies in the development and demonstration of a systematic workflow-level evaluation framework for UAV-LiDAR-derived channel networks in low-gradient wetlands.
Specifically, this study aims to achieve five primary objectives: (1) systematically evaluate 48 complete UAV-LiDAR processing workflows combining alternative ground filtering, interpolation, sink-filling, and flow routing methods through comprehensive pairwise comparisons totaling 1128 combinations; (2) independently assess workflow plausibility using Sentinel-2 satellite imagery, emphasizing external validation rather than internal consensus; (3) identify workflow configurations best suited for fine-scale wetland channel extraction; (4) quantify the relative influence of individual processing components on extraction outcomes; (5) establish a reproducible dual-validation framework integrating statistical inter-method comparison with independent plausibility assessment. The findings provide actionable guidance for the ongoing Kushiro Wetland Restoration Project and contribute to a broader understanding of LiDAR-based hydrological feature extraction in complex lowland landscapes worldwide.
2. Materials and Methods
2.1. Study Area
The study area covers 0.91 km2 in the northern Kushiro Wetland, eastern Hokkaido, Japan (43°02′ N, 144°24′ E). Elevation ranges from approximately 2.5 to 5.0 m above sea level, with average slopes below 0.5°. The landscape consists of active and abandoned channels, low-relief floodplain, and slightly elevated areas colonized by alder forest (Figure 1).
Figure 1.
Study area location and UAV-LiDAR coverage. (a) Location of Hokkaido Island within Japan; (b) Kushiro Basin showing the wetland boundary (thick yellow line); (c) UAV-LiDAR flight tile coverage acquired in August 2023, showing individual flight paths; (d) The final analysis area (0.91 km2) visualized as a three-dimensional LiDAR point cloud with an elevation gradient (Z, 6.3–11.0 m a.s.l.). The color scale ranges from 6.3 m (dark blue) to 11.0 m (red), showing the main channel corridor and tributary networks incised into the surrounding low-gradient wetland terrain. Coordinate system: Japan Plane Rectangular Coordinate System XIII (JGD2011, EPSG:6681). Elevation values represent meters above sea level (m a.s.l.).
Vegetation structure is heterogeneous, including dense reed and sedge communities (1–2 m height), shrubby alder stands (3–6 m), and mixed herbaceous cover. This structural complexity complicates ground classification in LiDAR data.
The hydrological regime is characterized by seasonal flooding during spring snowmelt from April to May and autumn precipitation events from September to October, with baseflow conditions during summer and winter. Channel networks range from first- to third-order streams (according to the Strahler classification), with widths ranging from less than 1 m for low-order tributaries to 8–15 m for main channels.
2.2. UAV-LiDAR Data Acquisition
UAV-LiDAR data were acquired in August 2023. The survey employed a DJI Matrice 300 RTK equipped with a Zenmuse L1 sensor (SZ DJI Technology Co., Ltd., Shenzhen, China). Flights were conducted at 80 m altitude with 70% forward and 60% side overlap. The sensor operated at 240 kHz, producing up to 480,000 points per second.
2.3. Workflow Components
This study systematically evaluated combinations of processing methods to assess their cumulative effects on stream network extraction accuracy from UAV-LiDAR data. Figure 2 presents the complete methodological workflow, highlighting the novel dual-validation framework that combines statistical consensus analysis (1128 pairwise comparisons across 48 workflows) with independent external validation using Sentinel-2 satellite imagery. This approach enables identification of truly optimal workflow configurations while revealing potential discrepancies between inter-method agreement and actual accuracy. All processing steps were fully scripted using R v.4.5.2 and Python v.3.12.12 to ensure reproducibility, with comprehensive documentation of workflow execution, parameter settings, and intermediate outputs enabling exact replication for other sites or datasets.
Figure 2.
Methodological workflow for systematic evaluation of UAV-LiDAR-based stream network mapping. The workflow consists of four phases: (1) data acquisition through UAV-LiDAR survey (DJI Matrice 300 RTK with Zenmuse L1, 4.43 million points, 4.76 pts/m2) and Sentinel-2 reference imagery collection; (2) systematic generation of 48 unique workflows by combining ground filtering methods (PMF, CSF, MCC), DTM interpolation techniques (IDW, TIN, KRG, MBA), sink-filling algorithms (Wang & Liu, Planchon & Darboux), and flow direction methods (D8, D-infinity); (3) dual-validation framework employing both statistical consensus analysis (1128 pairwise comparisons) and external validation against satellite imagery; (4) comparative analysis to quantify component influence and identify optimal workflow configuration. The center panel shows the 48 output stream network maps, demonstrating systematic variation across workflow combinations.
All 48 workflow combinations were implemented in RStudio v.2025.09.2 with the lidR package, QGIS v.3.40, SAGA GIS v.9.8, and custom R and Python scripts for batch processing. The complete processing chain for each workflow (Figure 2) consisted of: (1) application of ground filtering to the raw point cloud (4.76 points/m2) to classify ground versus non-ground points; (2) DTM generation through interpolation of ground points to 5 m resolution; (3) sink filling to remove spurious depressions; (4) flow direction calculation to determine downslope flow paths; (5) flow accumulation computation of upstream contributing area for each cell; and (6) channel extraction using a 5000 m2 flow accumulation threshold to delineate the stream network.
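The workflow and comparison counts follow directly from the component lists. The sketch below is only an illustration in standard-library Python, not the study's batch scripts; the list names and the hyphenated naming scheme (modeled on labels such as CSF-MBA-Planchon-D8 used in the text) are assumptions.

```python
from itertools import combinations, product

# Component labels follow the abbreviations used in the text.
GROUND_FILTERS = ["PMF", "CSF", "MCC"]
INTERPOLATORS = ["IDW", "TIN", "KRG", "MBA"]
SINK_FILLERS = ["Planchon", "Wang"]
FLOW_ROUTERS = ["D8", "Dinf"]

# Each workflow is one choice per processing step, joined into a name
# such as "CSF-MBA-Planchon-D8" (naming scheme assumed for illustration).
workflows = [
    "-".join(parts)
    for parts in product(GROUND_FILTERS, INTERPOLATORS, SINK_FILLERS, FLOW_ROUTERS)
]

# Unordered pairs of distinct workflows, as used in the pairwise comparison.
pairs = list(combinations(workflows, 2))

print(len(workflows))  # 48  (3 x 4 x 2 x 2)
print(len(pairs))      # 1128  (48 x 47 / 2)
```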
2.3.1. Ground Filtering Methods
Three widely implemented ground filtering algorithms were evaluated to classify the UAV-LiDAR point clouds into ground and non-ground returns: the Progressive Morphological Filter (PMF), the Cloth Simulation Filter (CSF), and the Multiscale Curvature Classification (MCC). These methods were selected because they represent distinct algorithmic philosophies and are commonly applied in terrain modeling studies.
The PMF applies a sequence of morphological opening operations using progressively increasing window sizes to distinguish ground from elevated objects [7]. This approach performs well in environments where ground and vegetation exhibit clear elevation separation; however, in low-relief landscapes such as wetlands, the PMF may over-smooth subtle terrain features that define channel morphology. In this study, the PMF was implemented with window sizes ranging from 0.2 m to 1 m at 0.2 m increments, slope thresholds between 0.1 m and 1.5 m, and a maximum window size of 1 m.
The CSF [8] adopts a physically inspired approach by inverting the point cloud and simulating a virtual cloth draped over the inverted surface. The resulting cloth surface approximates the terrain envelope and is used to separate ground from non-ground returns. Unlike fixed-window morphological approaches, the CSF adapts to local terrain variability and is therefore better suited for processing subtle elevation transitions in low-gradient environments. In this study, the CSF was implemented using a rigidness parameter of 3 (appropriate for flat terrain), a cloth resolution of 1 m, and a classification threshold of 1 m.
The MCC [9] identifies ground points by analyzing surface curvature across multiple spatial scales. By evaluating curvature signatures rather than solely elevation differences, the MCC is capable of preserving terrain detail in heterogeneous landscapes. This method is particularly effective in environments where vegetation and ground elevations partially overlap. The MCC was implemented using a scale parameter of 1 m and a curvature threshold of 0.05.
2.3.2. DTM Resolution Justification
A grid resolution of 5 m was selected based on point density characteristics, target channel dimensions, and computational efficiency considerations. The UAV-LiDAR dataset exhibited an average point density of 4.76 points/m2, corresponding to an average post spacing of approximately 0.46 m. Post spacing, which represents the average horizontal distance between neighboring points, was calculated using Equation (1) for a point cloud with N points distributed over area A:
$$\mathrm{PS} = \sqrt{\frac{A}{N}} \tag{1}$$
At a 5 m grid resolution (cell area = 25 m2), each cell contains, on average, approximately 119 total points (25 m2 × 4.76 points/m2). Following ground filtering, which retained approximately 30% to 40% of points depending on vegetation density and algorithm choice, each cell retained an estimated 35–48 ground returns. This density provides sufficient support for robust interpolation across all methods while minimizing interpolation artifacts associated with sparse ground sampling.
The selected resolution is also appropriate relative to the geomorphic scale of the target features. Main channels within the study area range from 8–15 m in width and are therefore represented by approximately 2–3 pixels across-channel at 5 m resolution. Low-order tributaries (1–5 m width) are not directly resolved geometrically but are captured through flow accumulation processing, which integrates upstream contributing areas across multiple cells. The drainage threshold of 5000 m2 corresponds to 200 grid cells at 5 m resolution, ensuring stable channel initiation representation.
From a sampling theory perspective, the chosen resolution (5 m) is approximately 11 times the average post spacing (5/0.46 ≈ 10.9), substantially exceeding the Nyquist criterion (2 × post spacing ≈ 0.92 m) required to prevent spatial aliasing. Consequently, the DTM captures terrain variability while avoiding artificial amplification of measurement noise.
Alternative resolutions were considered conceptually. A finer resolution (1–2 m) would contain only 5–19 total points per cell, reducing to 1–8 ground points after filtering, thereby increasing susceptibility to interpolation noise and local artifacts. Conversely, a coarser resolution (10 m) would overly smooth low-order tributaries and reduce hydrological connectivity representation. The 5 m grid, therefore, represents a balanced compromise between terrain fidelity, channel detectability, computational tractability across 48 workflows, and consistency with the 10 m spatial resolution of the Sentinel-2 imagery used for external plausibility assessment.
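The arithmetic behind the resolution choice can be reproduced in a few lines. This is a minimal sketch using only the figures reported above (4.76 points/m2, 30–40% ground retention, 5000 m2 threshold); the function names are hypothetical, and the formula assumes points are spread evenly over the area.

```python
import math

POINT_DENSITY = 4.76  # points per m^2, as reported in the text

def post_spacing(density):
    """Average point spacing (m), i.e. sqrt(A / N) = 1 / sqrt(density)."""
    return 1.0 / math.sqrt(density)

def points_per_cell(cell_size, density=POINT_DENSITY):
    """Expected number of returns in a square cell of side cell_size (m)."""
    return cell_size ** 2 * density

def ground_points_per_cell(cell_size, retained=(0.30, 0.40)):
    """Expected ground returns per cell for a low and high retention fraction."""
    total = points_per_cell(cell_size)
    return tuple(total * fraction for fraction in retained)

spacing = post_spacing(POINT_DENSITY)
print(round(spacing, 2))            # 0.46
print(round(points_per_cell(5.0)))  # 119
print(ground_points_per_cell(5.0))  # roughly 35.7 and 47.6
print(5000 / 5.0 ** 2)              # 200.0 cells at the channel threshold
```

Calling `points_per_cell(1.0)` and `points_per_cell(2.0)` gives roughly 5 and 19 points, which is the range quoted for the finer alternative resolutions.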
2.3.3. Interpolation Methods
Following ground classification, four interpolation techniques were used to generate 5 m resolution DTMs: Inverse Distance Weighting (IDW), Triangulated Irregular Network (TIN), Ordinary Kriging (KRG), and Multilevel B-spline Approximation (MBA). These methods represent deterministic, geometric, geostatistical, and hierarchical surface-fitting approaches, respectively.
The IDW [12] estimates cell values as weighted averages of nearby points, with the weights being inversely proportional to distance. This method assumes spatial autocorrelation decreases with distance and is computationally efficient. In this study, the IDW was implemented using a power parameter of 2, a search radius of 5 m, and a minimum of eight points per cell.
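As an illustration of the weighting scheme, the following sketch estimates one cell value with the stated power, radius, and minimum point count. It is not the implementation used in the study: the function name is hypothetical, and treating the eight-point minimum as a count of neighbours within the search radius is an assumption.

```python
import math

def idw_estimate(x, y, points, power=2.0, radius=5.0, min_points=8):
    """Estimate elevation at (x, y) from (px, py, pz) ground points.

    Returns None when fewer than min_points lie within the search radius
    (an assumed reading of the minimum-point rule).
    """
    neighbours = [(math.hypot(px - x, py - y), pz) for px, py, pz in points]
    neighbours = [(d, z) for d, z in neighbours if d <= radius]
    if len(neighbours) < min_points:
        return None
    for d, z in neighbours:
        if d == 0.0:
            return z  # a coincident point determines the value directly
    # Weights fall off with distance raised to the chosen power.
    weights = [1.0 / d ** power for d, _ in neighbours]
    weighted = sum(w * z for w, (_, z) in zip(weights, neighbours))
    return weighted / sum(weights)

# Eight synthetic points sharing one elevation, so the estimate is about 3.0.
ring = [(2.0, 0.0, 3.0), (-2.0, 0.0, 3.0), (0.0, 2.0, 3.0), (0.0, -2.0, 3.0),
        (1.0, 1.0, 3.0), (-1.0, 1.0, 3.0), (1.0, -1.0, 3.0), (-1.0, -1.0, 3.0)]
print(idw_estimate(0.0, 0.0, ring))
print(idw_estimate(0.0, 0.0, ring[:5]))  # None: too few neighbours
```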
The TIN method [13] constructs a network of non-overlapping triangles connecting ground points. Elevation values are linearly interpolated within each triangle, and raster conversion was performed using natural neighbor interpolation. The TIN is well suited for irregularly distributed point clouds and preserves local surface variability.
The KRG [14] employs a geostatistical framework to model spatial autocorrelation using variograms and generates statistically optimal, unbiased predictions. A spherical variogram model was fitted to empirical semi-variograms derived from the ground points. The model parameters included a nugget of 1 m, a partial sill of 1 m, and a range of 8 m, based on exploratory variogram analysis.
The MBA [10] constructs a hierarchical spline surface through progressive refinement. This approach smooths measurement noise at coarser levels while preserving dominant terrain features at finer scales. The MBA was implemented with a maximum refinement level of 5 and a convergence tolerance of 0.01 m.
The use of multiple interpolation approaches allowed systematic evaluation of how deterministic smoothing, geometric triangulation, geostatistical modeling, and hierarchical spline fitting influence fine-scale channel extraction in low-gradient wetland terrain.
2.3.4. Sink Filling Algorithms
Two methods were evaluated for removing spurious depressions that are artifacts of interpolation or measurement error and that impede hydrological flow. The Planchon and Darboux [15] algorithm efficiently fills depressions via iterative flooding simulation. We used an epsilon parameter of 0.001 m for the depression tolerance. The Wang and Liu [16] method identifies and fills depressions using a priority-flood algorithm that processes cells in order of elevation. We applied default parameters with a priority queue implementation for large raster processing.
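The priority-flood idea can be sketched compactly with a heap. The code below is a simplified illustration under stated assumptions, not the SAGA GIS tool used in the study: it fills depressions to a flat spill level (no minimum-slope or epsilon gradient), treats every edge cell as an outlet, and uses a hypothetical function name.

```python
import heapq

def priority_flood_fill(dem):
    """Fill depressions in a 2-D list of elevations; returns a new grid."""
    rows, cols = len(dem), len(dem[0])
    filled = [row[:] for row in dem]
    visited = [[False] * cols for _ in range(rows)]
    heap = []

    # Seed the queue with every edge cell: water can always leave there.
    for r in range(rows):
        for c in range(cols):
            if r in (0, rows - 1) or c in (0, cols - 1):
                heapq.heappush(heap, (filled[r][c], r, c))
                visited[r][c] = True

    # Grow inward from the lowest known outlet, raising any lower neighbour
    # to the level of the cell it would have to spill over.
    while heap:
        z, r, c = heapq.heappop(heap)
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < rows and 0 <= nc < cols
                if (dr or dc) and inside and not visited[nr][nc]:
                    visited[nr][nc] = True
                    filled[nr][nc] = max(filled[nr][nc], z)
                    heapq.heappush(heap, (filled[nr][nc], nr, nc))
    return filled

dem = [[3, 3, 3],
       [3, 1, 3],
       [3, 3, 2]]
print(priority_flood_fill(dem))  # centre pit is raised to the spill level of 2
```

Because cells are always popped in order of elevation, each interior cell is reached first from its lowest possible spill path, which is why a single pass suffices.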
2.3.5. Flow Direction Methods
Two approaches were used to determine downslope flow routing. The D8 method developed by O’Callaghan and Mark [17] assigns flow from each cell to its steepest downslope neighbor among eight surrounding cells. The D-infinity method developed by Tarboton [18] calculates flow direction as a continuous angle from 0 to 360° based on the steepest descent across eight triangular facets formed by cell centers.
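The D8 rule is simple enough to show for a single cell. The sketch below is a minimal illustration, assuming a 5 m cell size and a DEM stored as a nested list; the function name and the choice to return a row/column offset (rather than a direction code) are assumptions, and D-infinity is not covered.

```python
import math

# Neighbour offsets (d_row, d_col) for the eight surrounding cells.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

def d8_direction(dem, r, c, cell_size=5.0):
    """Return the (d_row, d_col) offset of the steepest downslope neighbour.

    Returns None for a pit or flat cell with no lower neighbour.
    """
    best, best_slope = None, 0.0
    for dr, dc in OFFSETS:
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(dem) and 0 <= nc < len(dem[0]):
            # Diagonal neighbours are sqrt(2) cell widths away.
            distance = cell_size * math.hypot(dr, dc)
            slope = (dem[r][c] - dem[nr][nc]) / distance
            if slope > best_slope:
                best, best_slope = (dr, dc), slope
    return best

dem = [[3.0, 3.0, 3.0],
       [3.0, 2.5, 2.0],
       [3.0, 3.0, 1.5]]
print(d8_direction(dem, 1, 1))  # (1, 1): the diagonal drop is the steepest
```

In the example, the east neighbour drops 0.5 m over 5 m (slope 0.10), while the south-east neighbour drops 1.0 m over about 7.07 m (slope about 0.14), so flow is assigned diagonally.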
2.3.6. Channel Extraction
The flow accumulation threshold for channel initiation was set at 5000 m2. With a grid resolution of 5 m (cell area = 25 m2), this corresponds to 200 contributing cells. This threshold was held constant across all workflows to enable direct comparisons. This contributing area was considered appropriate for initiating first-order stream channels in this environment. Channel networks were converted to binary rasters at 5 m resolution, with 1 representing a channel and 0 representing a non-channel, and were then compared pairwise. Processing required approximately 20 min of computation time per workflow on a workstation with an Intel Core Ultra 9 185H processor running at 2.30 GHz, 16 cores, 22 logical processors, and 64 GB RAM, with individual workflows requiring 15–30 min depending on algorithm complexity.
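The thresholding step can be sketched as follows. This is an illustration only: it assumes flow accumulation is stored as a count of contributing cells and that a cell at exactly the threshold counts as channel, neither of which is stated in the text; the function name is hypothetical.

```python
def channel_mask(accumulation, threshold_m2=5000.0, cell_size=5.0):
    """Convert a flow accumulation grid (cell counts) to a binary channel raster."""
    threshold_cells = threshold_m2 / cell_size ** 2  # 5000 / 25 = 200 cells
    return [
        [1 if value >= threshold_cells else 0 for value in row]
        for row in accumulation
    ]

# Synthetic accumulation values, chosen only to straddle the threshold.
accumulation = [[10, 199, 200],
                [250, 0, 5]]
print(channel_mask(accumulation))  # [[0, 0, 1], [1, 0, 0]]
```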
To keep the comparison tractable and focused on workflow-level effects, we fixed algorithm parameters to the commonly used settings rather than performing full parameter optimization.
2.4. Performance Assessment Framework
2.4.1. Pairwise Statistical Comparison
To quantify inter-method agreement among workflows, all 48 channel network outputs were compared pairwise, yielding 1128 unique combinations (C(48, 2)). Comparisons were conducted at 5 m resolution across the entire 0.91 km2 study area (36,400 grid cells).
The primary similarity metric was the Intersection over Union (IoU), also known as the Jaccard Index. The IoU ranges from 0 (indicating no overlap) to 1 (indicating perfect agreement), defined as
$$\mathrm{IoU} = \frac{TP}{TP + FP + FN} \tag{2}$$
where $TP$ (True Positives) denotes pixels classified as channel by both workflows, $FP$ (False Positives) denotes pixels classified as channel by Workflow A but not Workflow B, and $FN$ (False Negatives) denotes pixels classified as channel by Workflow B but not Workflow A. The IoU was selected because it simultaneously penalizes over-extraction and under-extraction and is robust to class imbalance, which is inherent in channel mapping, where channel pixels represent a small fraction of the domain. To provide complementary insight, precision, recall, and $F_1$-score were also computed. Precision is defined as
$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{3}$$
and represents the proportion of extracted channel pixels that are confirmed by the comparison workflow. Recall is defined as
$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{4}$$
and quantifies the proportion of reference channel pixels successfully detected. The $F_1$-score, calculated as the harmonic mean of precision and recall, provides a balanced summary metric of extraction performance. The $F_1$-score was computed as
$$F_1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{5}$$
In addition, the channel length ratio was computed as the total channel length of Workflow A divided by that of Workflow B. This metric provides insight into systematic tendencies toward over- or under-extraction independent of spatial alignment.
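The pixel-based agreement metrics can be computed from two binary rasters as sketched below. This is a minimal illustration, not the study's scripts: the function name is hypothetical, rasters are plain nested lists, Workflow A is passed first, and returning NaN when a denominator is zero is an assumed convention. The channel length ratio is not included.

```python
def pairwise_metrics(a, b):
    """Compare two equal-shape binary rasters (2-D lists of 0/1)."""
    tp = fp = fn = 0
    for row_a, row_b in zip(a, b):
        for va, vb in zip(row_a, row_b):
            if va and vb:
                tp += 1  # channel in both workflows
            elif va:
                fp += 1  # channel in A only
            elif vb:
                fn += 1  # channel in B only

    def ratio(num, den):
        return num / den if den else float("nan")

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "iou": ratio(tp, tp + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
    }

workflow_a = [[1, 1, 0],
              [0, 1, 0]]
workflow_b = [[1, 0, 0],
              [0, 1, 1]]
print(pairwise_metrics(workflow_a, workflow_b)["iou"])  # 0.5
```

In this toy pair there are two shared channel pixels, one pixel unique to each raster, so IoU = 2/4 = 0.5 while precision, recall, and F1 all equal 2/3. Note that IoU is symmetric in A and B, whereas precision and recall swap when the argument order is reversed.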
To assess the influence of individual workflow components, the IoU values were aggregated by processing step (ground filtering, interpolation, sink-filling, and flow routing). Summary statistics, including median, mean, standard deviation, coefficient of variation (CV), and interquartile range (IQR), were calculated for each component category. Box plot visualizations were used to compare central tendency and variability across alternatives.
2.4.2. Independent Plausibility-Based Validation
To distinguish inter-method consensus from external validity, all workflow outputs were evaluated against independently acquired Sentinel-2 imagery. Sentinel-2 Level-2A data at 10 m spatial resolution were obtained within three days of UAV-LiDAR acquisition to minimize temporal mismatch. True-color composite images at 10 m resolution combining Band 4 (Red, 665 nm), Band 3 (Green, 560 nm), and Band 2 (Blue, 490 nm) enabled visual identification of channel features based on vegetation patterns, soil moisture signatures, and surface water status (present/absent).
All 48 workflow outputs were systematically compared against Sentinel-2 imagery via visual overlay inspection. Workflows were evaluated based on their spatial correspondence with optically observable channel corridors, considering: (1) main channel alignment with visible water surfaces and riparian vegetation corridors; (2) tributary connectivity patterns consistent with drainage flows visible through vegetation and moisture gradients; (3) absence of extracted features in areas showing no channel-indicative signatures; (4) maintenance of hydrological connectivity at tributary junctions that were observable through the vegetation patterns; and (5) geometric alignment with channel centerlines identifiable by the presence of surface water or vegetation contrasts.
In this systematic visual comparison across the entire study area, the CSF-MBA-Planchon-D8 workflow demonstrated consistently superior correspondence with observable channel features compared to workflows that achieved the highest pairwise statistical consensus. The best workflow preserved sinuous channel geometries visible in the imagery, correctly represented low-order tributaries evident through linear vegetation patterns in densely vegetated zones, and avoided false channel predictions in transitional areas where no channel signatures were observable in the satellite imagery.
Although Sentinel-2 imagery provides valuable independent validation, the 10 m spatial resolution cannot resolve channels narrower than approximately 10 m. Validation primarily assesses the main channel (8–15 m in width) alignment and major tributary connectivity. Fine-scale tributaries (1–5 m in width) are evaluated based on continuity and connectivity patterns rather than precise geometric accuracy. Such a validation approach is necessarily qualitative and comparative in nature, establishing relative performance rankings among workflows rather than absolute measures of accuracy. However, such an external plausibility check provides critical evidence that workflows achieving high inter-method statistical agreement may not optimally represent actual landscape features observable in independent remote sensing data.
2.4.3. Spatial Variability and Uncertainty Analysis
To evaluate the spatial pattern of methodological disagreement, pixel-wise variability was assessed across all 48 workflow outputs. For each pixel, the coefficient of variation (CV) was calculated based on the 48 binary channel classifications (0 = non-channel, 1 = channel). The CV for pixel $i$ is calculated as:
$$CV_i = \frac{\sigma_i}{\mu_i} \tag{6}$$
where $\sigma_i$ is the standard deviation and $\mu_i$ is the mean of the 48 binary values (0 or 1) for pixel $i$.
High CV values indicate spatially sensitive zones where workflow disagreement is greatest, typically along channel margins and low-relief transitional areas. Conversely, low CV values correspond to areas of strong consensus, including well-defined channels and upland surfaces.
This spatial uncertainty analysis provides insight into where methodological variability is concentrated and highlights geomorphic contexts in which workflow selection exerts the strongest influence.
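A per-pixel variability map of this kind can be sketched as below. The code is illustrative only: the function names are hypothetical, the population standard deviation is an assumed choice (the text does not say whether the population or sample form was used), and pixels that no workflow classifies as channel return None because the ratio is undefined when the mean is zero.

```python
import statistics

def pixel_cv(values):
    """Standard deviation divided by mean for one pixel's binary values."""
    mean = statistics.fmean(values)
    if mean == 0:
        return None  # no workflow marks this pixel as channel: ratio undefined
    return statistics.pstdev(values) / mean

def cv_raster(stack):
    """Apply pixel_cv across a list of equal-shape binary rasters."""
    rows, cols = len(stack[0]), len(stack[0][0])
    return [
        [pixel_cv([layer[r][c] for layer in stack]) for c in range(cols)]
        for r in range(rows)
    ]

# Four toy one-row rasters standing in for the 48 workflow outputs.
stack = [[[1, 0, 1]],
         [[1, 0, 0]],
         [[1, 0, 1]],
         [[1, 0, 0]]]
print(cv_raster(stack))  # full agreement, undefined, and an even split
```

In the toy stack, the first pixel is channel in every layer (no variability), the second is never channel (undefined), and the third is channel in half of the layers, which gives a mean of 0.5, a standard deviation of 0.5, and a ratio of 1.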
3. Results
3.1. Pairwise Comparison
Pairwise comparison of the 48 workflow outputs revealed substantial variability in flow accumulation structure and channel continuity across processing configurations. Visual inspection of the cluster raster stacks (Figure 3, Figure 4 and Figure 5) demonstrates clear differences in drainage concentration patterns depending on ground filtering method, interpolation technique, depression-filling algorithm, and flow routing scheme.
Figure 3.
PMF-based cluster raster stack of flow accumulation at 5 m resolution. Each panel shows flow accumulation derived from different interpolation, sink-filling, and routing configurations. Accumulation values represent upstream contributing pixel counts (0–100 pixels). Lower maximum values reflect differences in channel continuity.
Figure 4.
CSF-based cluster raster stack of flow accumulation at 5 m resolution. Each panel represents a workflow combination of interpolation method (IDW, Kriging, MBA, TIN), sink-filling algorithm (Planchon–Darboux or Wang–Liu), and flow routing scheme (D8 or Dinf). Flow accumulation values are calculated by pixel counting of upstream contributing cells (0–300 pixels). Darker blue indicates higher drainage concentration.
Figure 5.
MCC-based cluster raster stack of flow accumulation at 5 m resolution. Panels show workflow combinations of interpolation, depression-filling, and routing algorithms. Flow accumulation is computed as the upstream contributing pixel count (0–300 pixels). Higher values indicate stronger channel convergence.
Workflows based on CSF and MCC generally produced coherent, spatially continuous high-accumulation corridors aligned with the main channel network. Accumulation values (up to ~300 contributing pixels) were concentrated along well-defined drainage paths, particularly in configurations using Planchon and Darboux sink filling combined with D8 or Dinf routing. In contrast, configurations using Wang and Liu depression filling frequently exhibited structured grid-like artifacts and spatially fragmented accumulation patterns, particularly when coupled with D8 routing.
PMF-based workflows exhibited systematically lower maximum accumulation values (≤100 pixels) and reduced channel continuity. The resulting flow networks appeared more diffuse and spatially fragmented compared to CSF and MCC outputs, suggesting that aggressive ground smoothing during filtering influenced downstream accumulation structure.
Quantitatively, pairwise IoU scores reflected these structural differences. Higher IoU values were observed among workflows sharing the same ground filtering method, indicating that ground classification exerts strong control over final channel representation. In contrast, workflows differing in ground filtering but sharing interpolation or routing methods exhibited substantially lower spatial agreement.
These results indicate that ground filtering introduces the largest structural divergence among workflows, while interpolation and routing primarily modulate drainage density and spatial detail. Importantly, high inter-method agreement did not uniformly correspond to visually coherent accumulation patterns, reinforcing the distinction between methodological consensus and external plausibility.
3.2. Component-Level Performance Analysis
Figure 6 shows a comprehensive box plot comparison of IoU distributions across the workflow components, revealing distinct performance hierarchies and allowing quantitative assessment of how each processing step influences final channel extraction accuracy. The vertical axis range (0.80–1.00) was optimized to highlight performance differences while maintaining visual clarity across all component comparisons.
Figure 6.
Box plot comparison of Intersection over Union (IoU) distributions across workflow components. (a) Ground filtering methods: PMF, CSF, and MCC; (b) interpolation methods: TIN, IDW, Kriging, and MBA; (c) sink-filling algorithms: Wang & Liu and Planchon & Darboux; (d) flow direction methods: D8 and D-infinity. Boxes represent the interquartile range (IQR), with the median shown as a thick black line and the mean as a colored diamond. Individual data points shown as semi-transparent dots. The vertical axis range (0.80–1.00) highlights performance differences while maintaining visual clarity. Ground filtering exhibits the largest performance spread (5.1% difference between PMF and CSF), while flow direction methods show near-identical performance (<0.1% difference).
Ground filtering methods exhibited the most pronounced differences, with PMF achieving the highest median IoU of 0.900 with the narrowest distribution (IQR = 0.046), substantially outperforming CSF (median = 0.854, IQR = 0.046) and MCC (median = 0.857, IQR = 0.054) in terms of pairwise consensus (Figure 6a). The 5.1% difference in the median IoU between PMF and CSF represents the largest performance spread observed across any component category, indicating that ground filtering is the most influential processing step in terms of inter-method agreement. PMF’s superior consensus performance reflects its tendency to produce smoother, more generalized terrain representations that converge with the output of other smoothing-prone methods. However, this high consensus does not necessarily indicate superior accuracy in terms of representing actual channel features, as discussed below for the validation results.
The interpolation methods demonstrated remarkably similar performance, with medians ranging from 0.856 to 0.876 and overlapping distributions, suggesting that interpolation choice minimally affects inter-method agreement (Figure 6b). The narrow performance range of only 2.3% between methods, substantially smaller than the 5.1% range for ground filtering, reflects the relatively uniform spatial distribution of ground points (4.76 points/m2) achieved during UAV-LiDAR acquisition. IDW showed a slightly higher median IoU (0.876) than TIN (0.856), MBA (0.857), and Kriging (0.859), although the overlapping interquartile ranges indicate that these differences are not substantial.
Turning to the sink-filling algorithms, Wang & Liu (median = 0.890) achieved higher pairwise consensus than Planchon and Darboux (median = 0.846), with similar variability (IQR = 0.050 versus 0.044) (Figure 6c). The Wang and Liu priority-flood approach produces more connected, streamlined channel networks by aggressively filling depressions to ensure complete drainage and therefore agrees better with similarly aggressive methods than with more conservative ones.
For the flow direction methods, D8 (median = 0.860) and D-infinity (median = 0.859) performed nearly identically with overlapping distributions and minimal median difference (<0.1%), suggesting that the flow routing algorithm has a negligible impact on the extraction outcomes in a low-gradient environment (Figure 6d). The computational efficiency advantage of D8 (3–5 times faster than D-infinity in our processing tests) makes it preferable for operational applications in similar low-gradient settings.
Table 1 quantifies these patterns with summary statistics for all pairwise comparisons, providing detailed numerical support for the visual patterns observed in Figure 5. Ground filtering exhibits the largest performance spread, with a PMF median of 0.900 versus a CSF median of 0.854 (a 5.1% difference), while interpolation methods show minimal variation, with a maximum difference of 2.3% between IDW (0.876) and TIN (0.856). The coefficients of variation across all components range from 0.048 to 0.061, indicating stable, reproducible patterns in the pairwise comparisons despite the large number of workflow combinations evaluated.
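The percentages quoted above are consistent with a relative difference taken with respect to the higher of the two medians; this reading is inferred from the reported values rather than stated in the text. A minimal Python sketch, using hypothetical helper names (`summarize`, `relative_spread`) and assuming the population standard deviation for the coefficient of variation, shows how such summary statistics could be computed:

```python
from statistics import mean, median, pstdev, quantiles


def summarize(ious):
    """Median, IQR, and coefficient of variation of a list of IoU values."""
    q1, _, q3 = quantiles(ious, n=4, method="inclusive")
    return {
        "median": median(ious),
        "iqr": q3 - q1,
        "cv": pstdev(ious) / mean(ious),  # population SD is an assumption
    }


def relative_spread(higher, lower):
    """Percent difference between two medians, relative to the higher one."""
    return (higher - lower) / higher * 100.0


# Medians reported in the text: PMF vs. CSF and IDW vs. TIN.
filter_spread = relative_spread(0.900, 0.854)
interp_spread = relative_spread(0.876, 0.856)
print(f"ground filtering: {filter_spread:.1f}%")
print(f"interpolation:    {interp_spread:.1f}%")
```

With the reported medians, the two spreads round to 5.1% and 2.3%, matching the figures given in the text.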
Table 1.
Summary Statistics of Pairwise IoU by Workflow Component.
3.3. Top-Performing Workflows by Statistical Consensus
Table 2 identifies workflows that achieved the highest pairwise agreement (i.e., statistical consensus across methods). The rankings reflect inter-method agreement patterns rather than the validation plausibility against observable features.
Table 2.
Top 10 Workflows Ranked by Pairwise Statistical Consensus with Validation Comparison.
We identified certain cases where different workflow configurations produced identical channel rasters. These duplicate outputs were grouped and treated as a single unique solution when interpreting pairwise statistics. Perfect IoU values associated with very small numbers of comparisons, therefore, reflect technical duplication rather than independent methodological confirmation.
Workflows with perfect IoU scores (1.000) produced outputs identical to those of the workflows they were compared with, and their low N values must be read accordingly. PMF-TIN-Planchon-Dinf and PMF-TIN-Wang-D8 both achieved a median IoU of 1.000 with N = 1, indicating that the two workflows produced duplicate outputs because of equivalent processing chains rather than confirming each other independently.
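One way to detect such duplicate outputs is to hash each channel raster and group workflows that share a digest. The sketch below is illustrative only: the function names are hypothetical, the rasters are toy 2 × 3 grids with made-up cell values, and the study's actual grouping procedure is not described here.

```python
import hashlib
from collections import defaultdict


def raster_digest(grid):
    """Hash a binary raster (rows of 0/1 ints) together with its shape."""
    rows, cols = len(grid), len(grid[0])
    payload = f"{rows}x{cols}:".encode() + bytes(c for row in grid for c in row)
    return hashlib.sha256(payload).hexdigest()


def group_duplicates(rasters):
    """Return groups of workflow names whose rasters are byte-identical."""
    groups = defaultdict(list)
    for name, grid in rasters.items():
        groups[raster_digest(grid)].append(name)
    return sorted(sorted(names) for names in groups.values())


# Workflow labels follow the paper's naming; the cell values are invented.
rasters = {
    "PMF-TIN-Planchon-Dinf": [[0, 1, 0], [0, 1, 1]],
    "PMF-TIN-Wang-D8": [[0, 1, 0], [0, 1, 1]],
    "CSF-IDW-Wang-D8": [[0, 1, 0], [1, 1, 0]],
}
unique_solutions = group_duplicates(rasters)
print(len(unique_solutions), "unique solutions")
```

Each group can then be treated as a single solution when pairwise statistics are interpreted, so that an IoU of 1.000 between members of the same group is not counted as independent agreement.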
The CSF-IDW-Wang-D8 workflow demonstrates a robust performance with a median IoU of 0.905 across 45 diverse comparisons, representing genuine agreement across a wide range of methodologically distinct workflows rather than technical duplication. However, as discussed in subsequent sections, workflows achieving the highest pairwise statistical consensus do not necessarily offer the most plausible representation of channel features when compared with external observations.
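For orientation, the sketch below enumerates the 3 × 4 × 2 × 2 workflow combinations and shows how a per-workflow median pairwise IoU could be derived. It assumes the standard IoU definition (intersection size divided by union size of channel cells) and represents each channel mask as a set of (row, column) indices; the toy masks and the convention for two empty masks are assumptions made for illustration.

```python
from itertools import combinations, product
from statistics import median

FILTERS = ["PMF", "CSF", "MCC"]
INTERPOLATORS = ["IDW", "TIN", "Kriging", "MBA"]
SINK_FILLS = ["Planchon", "Wang"]
FLOW_DIRS = ["D8", "Dinf"]

workflows = ["-".join(p) for p in product(FILTERS, INTERPOLATORS, SINK_FILLS, FLOW_DIRS)]
pairs = list(combinations(workflows, 2))
print(len(workflows), "workflows,", len(pairs), "pairwise comparisons")


def iou(a, b):
    """Intersection over Union of two sets of channel-cell indices."""
    union = a | b
    if not union:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return len(a & b) / len(union)


def median_pairwise_iou(masks):
    """Median IoU of each workflow against every other workflow in `masks`."""
    scores = {name: [] for name in masks}
    for x, y in combinations(masks, 2):
        value = iou(masks[x], masks[y])
        scores[x].append(value)
        scores[y].append(value)
    return {name: median(values) for name, values in scores.items()}


# Toy masks with made-up cells, standing in for three extracted networks.
toy_masks = {
    "a": {(0, 0), (0, 1), (1, 1)},
    "b": {(0, 1), (1, 1)},
    "c": {(0, 1), (1, 1), (2, 1)},
}
toy_medians = median_pairwise_iou(toy_masks)
```

The enumeration yields 48 workflows and 1128 unordered pairs, in line with the counts reported in the abstract.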
3.4. Independent Validation Results
Figure 7 presents a side-by-side comparison of the validated optimal workflow against Sentinel-2 imagery, revealing critical differences in how workflows represent observable channel features despite similar pairwise statistical performance.
Figure 7.
Validation comparison of the optimal workflow (CSF-MBA-Planchon-D8) against Sentinel-2 satellite imagery. (a) Sentinel-2 false-color composite (NIR–Red–Green; Bands 8-4-3) acquired on 17 August 2022. This date was selected because the cloud cover was below 30%, providing clear surface visibility, whereas the August 2023 acquisition over the same tile was heavily cloud-covered and unsuitable for validation. Vegetation appears in shades of red/green, water and wet soil in dark green/black, and bare soil in white/gray. (b) Validated optimal workflow output showing extracted channel networks. Although Sentinel-2’s 10 m resolution limits validation of channels <10 m width, the systematic visual correspondence across the study area provides critical external evidence that the CSF-MBA-Planchon-D8 workflow achieves superior plausibility compared to consensus-leading alternatives.
Visual comparisons of all 48 workflow outputs against Sentinel-2 true-color imagery showed that the CSF-MBA-Planchon-D8 workflow provided consistently superior correspondence with observable channel features across the study area. This workflow achieved a median pairwise IoU of 0.857 when compared statistically against other methods, but demonstrated superior alignment with independently observable features at the Sentinel-2 scale. This workflow featured three key advantages over the consensus-leading alternatives:
Main channel alignment. The CSF-MBA-Planchon-D8 workflow accurately followed the sinuous geometry of meandering channels, clearly visible as water surfaces and riparian vegetation corridors in the satellite imagery. Channel centerlines extracted by this workflow showed excellent spatial agreement with observable channel courses, maintaining realistic curvature and avoiding artificial straightening artifacts present in some high-consensus workflows.
Tributary representation. Low-order tributaries visible as linear vegetation patterns, soil moisture signatures, or narrow water surfaces in densely vegetated zones were correctly represented in the CSF-MBA-Planchon-D8 output. These fine-scale features, critical for wetland hydrological connectivity, were systematically underrepresented or missed entirely by many workflows achieving higher pairwise consensus scores.
False positive control. The validated workflow avoided the extraction of channels in transition zones between well-defined channels and surrounding hillslopes where no channel features were observable in satellite imagery. In contrast, several high-consensus workflows systematically over-predicted channels in these ambiguous areas, likely due to terrain smoothing effects that create artificial drainage patterns.
Despite ranking only 18th in the pairwise statistical comparison (median IoU = 0.857), the CSF-MBA-Planchon-D8 workflow consistently outperformed the top-ranked consensus workflows, including PMF-TIN-Planchon-Dinf (median pairwise IoU > 0.900), in its correspondence with observable channel corridors in Sentinel-2 imagery. This finding reveals a fundamental disconnect between inter-method consensus and plausibility with respect to external observations.
The superior performance of CSF-MBA-Planchon-D8 is particularly evident in three kinds of complex zones: densely vegetated areas, where CSF’s adaptive ground filtering preserved subtle channel incisions; meandering channel segments, where MBA’s hierarchical spline interpolation maintained a smooth, realistic curvature; and hydrologically complex areas, where Planchon and Darboux sink-filling preserved natural depressions rather than artificially enforcing complete drainage connectivity.
Although this validation approach is necessarily limited by the 10 m resolution of Sentinel-2 imagery and the inherent challenges of establishing absolute ground truth for fine-scale features, the systematic visual comparison and scale-consistent IoU provide critical external evidence that high pairwise statistical consensus does not guarantee accurate representation of actual landscape features. Multiple workflows may converge on similar outputs that collectively deviate from observable reality—a phenomenon with profound implications for operational applications where accuracy rather than reproducibility determines project success.
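A scale-consistent comparison requires the 5 m channel raster to be brought onto the 10 m Sentinel-2 grid before IoU is computed. The sketch below shows one plausible way to do this; the aggregation rule (a coarse cell counts as channel if any of its 2 × 2 fine cells is channel), the function names, and the toy rasters are all assumptions, not the procedure documented for this study.

```python
def aggregate_any(grid, factor=2):
    """Coarsen a binary raster: a coarse cell is 1 if any fine cell in it is 1."""
    rows, cols = len(grid), len(grid[0])
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            block = [grid[i][j]
                     for i in range(r, min(r + factor, rows))
                     for j in range(c, min(c + factor, cols))]
            coarse_row.append(1 if any(block) else 0)
        coarse.append(coarse_row)
    return coarse


def raster_iou(a, b):
    """IoU of two equally shaped binary rasters (lists of 0/1 rows)."""
    inter = sum(x and y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    union = sum(x or y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return inter / union if union else 1.0


# Toy 4x4 channel raster at 5 m and a toy 2x2 reference mask at 10 m.
channels_5m = [
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 0],
]
reference_10m = [[1, 0], [1, 1]]

channels_10m = aggregate_any(channels_5m, factor=2)
validation_iou = raster_iou(channels_10m, reference_10m)
```

The choice of aggregation rule matters: an "any" rule favors thin channels that occupy only part of a 10 m cell, whereas a majority rule would suppress them, so the rule should be reported alongside any validation IoU.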
4. Discussion
4.1. Inter-Method Consensus Does Not Ensure Accuracy
The most significant finding of this study is the substantial disconnect between inter-method statistical consensus and validated correspondence with observable features. Workflows achieving near-perfect pairwise agreement (median IoU > 0.900) demonstrated systematic deviations from channel corridors observable in Sentinel-2 imagery at the 10 m scale, while the visually validated optimal workflow CSF-MBA-Planchon-D8, ranking only 18th in the pairwise statistical comparison, provided consistently superior representation of observable channel geometry, tributary connectivity, and spatial extent.
This reveals a fundamental limitation of consensus-based validation frameworks, in which multiple methods may systematically converge on similar but collectively inaccurate results. This phenomenon likely arises because many LiDAR processing algorithms were developed and optimized for high-relief forested or urban environments. In low-gradient wetlands, these algorithms may share common systematic biases such as over-smoothing subtle channel features, misclassifying low shrubs as ground, or failing to preserve gentle elevation transitions that define wetland drainage patterns.
The practical implications are profound for operational applications. Validation strategies relying solely on inter-method comparisons can produce results that are misleading because high consensus does not guarantee accuracy. External validation against observable features using satellite imagery, field surveys, or other independent data sources is essential for operational applications where accuracy rather than reproducibility determines success. This finding has particular relevance for wetland restoration, where incorrect channel mapping could lead to restoration plans that fail to achieve the hydrological connectivity objectives or that inadvertently damage existing ecological functions.
IoU-based pairwise comparisons emphasize internal consistency among workflows, effectively selecting solutions that appear reasonable within an ensemble of extraction results. However, in environments characterized by fine-scale channel networks such as Kushiro Wetland, these subtle features may be treated as noise by many workflows, leading to systematic collective under-representation. This distinction is particularly critical in low-gradient wetlands where fine-scale tributary networks provide essential hydrological connectivity in terms of species dispersal, nutrient transport, and seasonal flooding patterns. The consequences of failing to represent these features accurately extend beyond mapping precision to fundamental questions of ecosystem function and restoration effectiveness.
4.2. Component-Level Drivers
The superior validation performance of CSF-based workflows, despite PMF achieving higher pairwise consensus, can be attributed to fundamental algorithmic differences in how these methods handle low-relief terrain. PMF employs fixed morphological windows that iteratively remove non-ground points based on elevation thresholds. This approach tends to over-smooth terrain in low-relief areas where elevation differences between ground and low vegetation approach the discrimination threshold of the method, resulting in the loss of subtle topographic undulations that define wetland channels. CSF, by contrast, drapes a simulated cloth over the inverted point cloud, and the resulting adaptive terrain-following behavior better preserves the gentle elevation transitions characteristic of wetland channels. The pairwise comparison results that favored PMF likely reflect its tendency to produce smoother, more generalized terrain representations similar to the outputs of other smoothing-prone methods.
Although the pairwise comparisons suggested minimal differences among the interpolation methods (median IoU = 0.856–0.876), the validation results revealed MBA superiority in terms of channel delineation within optimal workflows. MBA’s multilevel B-spline approach progressively refines a surface representation via hierarchical decomposition, effectively smoothing measurement noise at coarse scales while preserving dominant features at fine scales. This hierarchical framework naturally accommodates the smooth, organically meandering channel geometries characteristic of alluvial wetlands.
Pairwise comparisons favored Wang and Liu [16] sink-filling (median IoU = 0.890 versus Planchon = 0.846), but the validated optimal workflow employed Planchon and Darboux [15]. This reversal reflects different algorithm behaviors in low-gradient terrain with real rather than spurious depressions. Wang and Liu’s priority-flood approach aggressively fills depressions to ensure complete drainage, potentially removing real hydrological features, including wetland pools, backwater areas, and abandoned channels that are topographically disconnected but ecologically significant. In low-gradient wetlands where real depressions are common, this approach may over-simplify hydrological connectivity. Planchon and Darboux [15] employ epsilon-based filling that preserves small depressions below the threshold while ensuring computational efficiency. This conservative approach better maintains realistic hydrological complexity in wetland environments, as evidenced by superior visual correspondence with natural channel patterns observable in satellite imagery.
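The effect of aggressive filling on a real depression can be illustrated with a simplified priority-flood fill. The sketch below is a generic 4-connected version written for illustration, not the implementation used in this study or the exact algorithm of either cited paper; the optional `epsilon` increment and the toy DEM are assumptions.

```python
import heapq


def priority_flood_fill(dem, epsilon=0.0):
    """Fill depressions by flooding inward from the DEM edge (4-connected)."""
    rows, cols = len(dem), len(dem[0])
    filled = [row[:] for row in dem]
    visited = [[False] * cols for _ in range(rows)]
    heap = []

    # Seed the queue with every edge cell; these can always drain off-grid.
    for r in range(rows):
        for c in range(cols):
            if r in (0, rows - 1) or c in (0, cols - 1):
                heapq.heappush(heap, (filled[r][c], r, c))
                visited[r][c] = True

    # Always expand from the lowest cell reached so far.
    while heap:
        level, r, c = heapq.heappop(heap)
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and not visited[nr][nc]:
                visited[nr][nc] = True
                # Raise the neighbour to at least the current spill level.
                filled[nr][nc] = max(filled[nr][nc], level + epsilon)
                heapq.heappush(heap, (filled[nr][nc], nr, nc))
    return filled


# Toy DEM: a pool (elevations 1-2) enclosed by a rim at 5 with an outlet at 4.
dem = [
    [5, 5, 5, 5, 5],
    [5, 2, 2, 2, 5],
    [5, 2, 1, 2, 4],
    [5, 2, 2, 2, 5],
    [5, 5, 5, 5, 5],
]
filled = priority_flood_fill(dem)
```

In this toy case the entire pool is raised to the outlet elevation of 4, so the depression disappears from the filled surface. This is the behavior that can erase wetland pools and backwater areas when every depression is treated as spurious.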
Both flow direction methods performed near-identically in the pairwise comparisons (median IoU difference < 0.001) and also in the plausibility-based validation, with the optimal workflow employing D8. This finding contrasts with the expectation that D-infinity’s continuous flow direction representation would improve accuracy in complex terrain. The similarity likely reflects the study area’s low topographic complexity and relatively well-defined channels. However, this finding should not be over-generalized. In environments with complex microtopography, divergent flow paths, or braided channels, D-infinity’s continuous flow representation would likely provide substantial advantages.
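For reference, D8 assigns all flow from a cell to the single neighbor with the steepest downslope gradient, whereas D-infinity represents flow direction as a continuous angle. A minimal D8 sketch follows; the function name, the toy elevations, and the treatment of flat cells (no direction returned) are illustrative assumptions, and the 5 m cell size follows the grid resolution reported for this study.

```python
import math

# Eight neighbour offsets (row, col): E, SE, S, SW, W, NW, N, NE.
NEIGHBOURS = [(0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1)]


def d8_direction(dem, r, c, cell_size=5.0):
    """Return the (dr, dc) offset of steepest descent from (r, c), or None."""
    best, best_slope = None, 0.0
    for dr, dc in NEIGHBOURS:
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(dem) and 0 <= nc < len(dem[0]):
            # Diagonal neighbours are farther away by a factor of sqrt(2).
            distance = cell_size * math.hypot(dr, dc)
            slope = (dem[r][c] - dem[nr][nc]) / distance
            if slope > best_slope:
                best, best_slope = (dr, dc), slope
    return best


# Toy 3x3 DEM (metres) sloping gently toward the lower-right corner.
dem = [
    [3.0, 3.0, 3.0],
    [3.0, 2.0, 1.9],
    [3.0, 1.8, 1.5],
]
centre_flow = d8_direction(dem, 1, 1)
```

In low-gradient terrain the slopes toward several neighbors are often nearly equal, which is one reason the single-direction and continuous-direction models can end up tracing the same well-defined channels.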
4.3. Implications for Wetland Restoration and Management
The validated workflow has several direct applications to ongoing Kushiro Wetland restoration efforts. First, historical channel reconstruction becomes possible by processing pre-channelization aerial LiDAR or DEM data with the validated workflow, enabling restoration planners to reconstruct historical drainage networks that can guide re-meandering design; the workflow’s superior preservation of natural channel geometry is particularly valuable in this context. Second, hydrological connectivity assessment is enhanced by accurate channel networks that enable modeling of surface water flow paths, residence times, and inundation dynamics under different restoration scenarios, supporting evidence-based evaluation of intervention effectiveness before implementation. Third, ecological habitat mapping benefits from channel networks derived with the validated workflow, which provide base layers for species distribution models, particularly those of aquatic taxa that depend on hydrological connectivity. The improved representation of low-order tributaries is critical when assessing habitat availability for species such as the endangered red-crowned crane and various wetland-dependent fish and amphibians. Finally, monitoring of restoration outcomes becomes more quantitative when repeated UAV-LiDAR surveys are processed through the same standardized workflow.
The methodology transfers to diverse wetland environments requiring high-resolution hydrological characterization. Although this study was grounded in Kushiro Wetland, the key methodological insight (that inter-method consensus can mask systematic bias) is broadly applicable to low-gradient, vegetated landscapes worldwide, including peatlands, floodplains, and coastal marshes.
4.4. Limitations and Future Research Directions
Several limitations warrant careful consideration when interpreting these results or extending findings to other contexts. This study focused on a single wetland type, specifically a temperate alluvial wetland, in a specific geographic region. Performance rankings may differ substantially in peatlands with organic soils and microtopographic hummock–hollow structure, tidal wetlands where bidirectional flow and intertidal exposure complicate surface definition, tropical wetlands with distinct vegetation phenology, and arid wetlands dominated by ephemeral drainage patterns. The validated workflow identified in this study should therefore not be assumed universally optimal. Site-specific validation is strongly recommended prior to transferring this workflow to substantially different geomorphic or ecological contexts.
The analysis employed single-date LiDAR acquisition during late-autumn baseflow conditions. Performance may vary with the seasonal vegetation phenology, hydrological state, or soil moisture conditions. Multi-temporal assessment would reveal whether validated workflow configurations remain optimal across seasonal variations.
Although a 5 m grid resolution was appropriate for the point density and channel dimensions in this study, finer-scale hydrological features (e.g., sub-meter rills, microtopographic flow paths) are not captured. Studies targeting ephemeral channels or detailed surface roughness would require higher point densities and finer grid resolution. However, the selected resolution is consistent with wetland restoration planning scales and regional hydrological modeling applications.
The validation approach employed in this study, although independent of LiDAR processing assumptions, was necessarily limited and subject to several inherent constraints. Sentinel-2’s 10 m spatial resolution cannot resolve channels narrower than approximately 10 m, limiting validation primarily to main channels and major tributaries while precluding precise geometric assessment of fine-scale features. The visual comparison approach, although systematic, lacks the quantitative rigor of field-validated reference data collected in differential Global Navigation Satellite System (GNSS) surveys or high-resolution aerial photography. Observer subjectivity in identifying channel features from optical imagery may introduce interpretation biases, particularly in densely vegetated areas where channel signatures are subtle. Future research should complement satellite-based plausibility checks with field-collected reference data to establish a more definitive accuracy measure, though we note that the relative performance rankings established through visual validation provide valuable operational guidance even in the absence of absolute accuracy measures.
Although the 48 workflows encompassed substantial methodological diversity, individual algorithm parameters were held constant rather than comprehensively optimized. Full parameter optimization would require orders of magnitude more computational resources, but might reveal additional performance improvements.
This study evaluated widely implemented, general-purpose algorithms rather than specialized, wetland-specific methods. Future research should evaluate advanced methods, including machine learning classification, multi-return LiDAR analysis, multi-sensor fusion, and hydrologically conditioned filtering, against the baseline established here.
5. Conclusions
This study demonstrates that processing methodology introduces substantial uncertainty in wetland channel extraction, even when high-quality UAV-LiDAR data are available. Although several workflows achieved strong inter-method statistical consensus, the externally validated optimal configuration ranked only 18th in pairwise agreement, clearly illustrating that reproducibility does not necessarily imply accuracy.
Ground filtering emerged as the most influential processing step in determining inter-method agreement, while validation results emphasized that preserving subtle channel morphology may require workflow configurations that do not maximize statistical agreement. Sink-filling and flow-routing methods exerted comparatively minor influence in this low-gradient environment.
The broader methodological implication is clear: evaluation frameworks for LiDAR-derived hydrological products must extend beyond internal agreement metrics to incorporate independent validation. Without such external checks, methodological convergence may conceal systematic bias.
The proposed dual-validation framework provides practical guidance for wetland restoration planning and establishes a transferable evaluation structure for other low-relief landscapes worldwide.
Future research should replicate this methodological framework across diverse landscape types and temporal conditions to develop comprehensive method selection guidelines based on measurable environmental attributes. Integration of multi-scale validation approaches combining field surveys and satellite imagery would strengthen ground truth quality, while systematic exploration of resolution effects and threshold sensitivity would reveal generalization boundaries for the current findings. Ultimately, advances in geospatial derivative accuracy require an understanding of how landscape characteristics, data properties, and processing methods interact to determine derivative quality.
Author Contributions
W.P.: Conceptualization, Methodology, Investigation, Writing – original draft, Writing – review and editing, Visualization, Supervision, Funding acquisition. T.I.: Methodology, Formal analysis, Investigation, Writing – review and editing. T.J.Y.: Investigation, Writing – review and editing, Funding acquisition. All authors have read and agreed to the published version of the manuscript.
Funding
This research was supported by the Strategic Innovation Promotion Program (SIP) “Development of a Resilient Smart Network System against Natural Disasters” by the Council for Science, Technology and Innovation, Cabinet Office, Government of Japan (Research Promotion Agency: National Research Institute for Earth Science and Disaster Prevention).
Data Availability Statement
Restrictions apply to the availability of these data. The raw UAV-LiDAR data were obtained from the Hokkaido Development Bureau of the Ministry of Land, Infrastructure, Transport and Tourism, Japan, through Hokkai Suiko Consultant Co., Ltd. (http://www.suiko.jp/), and are available from the authors with the permission of Hokkai Suiko Consultant Co., Ltd. The processed channel-network shapefiles and digital terrain models (DTMs) generated in this study are subject to the same restrictions and are available from the corresponding author upon reasonable request. Sentinel-2 satellite imagery used for validation is publicly accessible through the Copernicus Data Space Ecosystem (https://dataspace.copernicus.eu).
Acknowledgments
The first author acknowledges the JICA Knowledge Co-Creation Program (KCCP) for doctoral scholarship support under the Human Resource Development in Space Technology Utilization program. The authors thank the Hokkaido Development Bureau of the Ministry of Land, Infrastructure, Transport and Tourism and Hokkai Suiko Consultant Co., Ltd. for providing the LiDAR data. We are particularly grateful to Daigo Inagaki, Fuyuki Tazaki, and Keita Kawashima for their assistance. Technical support from the River and Watershed Engineering Laboratory, Hokkaido University, is appreciated.
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Gardner, R.C.; Finlayson, C. Global Wetland Outlook: State of the World’s Wetlands and Their Services to People; Stetson University College of Law Research Paper No. 2020-5, 5 October 2018; Ramsar Convention Secretariat: Gland, Switzerland, 2018; Available online: https://ssrn.com/abstract=3261606 (accessed on 25 September 2025).
- Kondolf, G.M.; Boulton, A.J.; O’Daniel, S.; Poole, G.C.; Rahel, F.J.; Stanley, E.H.; Wohl, E.; Bang, A.; Carlstrom, J.; Cristoni, C.; et al. Process-based ecological river restoration: Visualizing three-dimensional connectivity and dynamic vectors to recover lost linkages. Ecol. Soc. 2006, 11, 5. Available online: https://www.ecologyandsociety.org/vol11/iss2/art5/ (accessed on 30 September 2025).
- Hooshyar, M.; Wang, D.; Medeiros, S.C. Wet channel network extraction by integrating LiDAR intensity and elevation data. Water Resour. Res. 2015, 51, 10029–10046.
- Lehner, B.; Verdin, K.; Jarvis, A. New global hydrography derived from spaceborne elevation data. Eos Trans. Am. Geophys. Union 2011, 89, 93–94.
- Lehner, B. HydroRIVERS v1.0: Global River Network Delineation Derived from HydroSHEDS Data at 15 Arc-Second Resolution (Version 1.0); McGill University: Montreal, QC, Canada, 2019; Available online: https://data.hydrosheds.org/file/technical-documentation/HydroRIVERS_TechDoc_v10.pdf (accessed on 25 September 2025).
- Jones, J.W. Efficient wetland surface water detection and monitoring via Landsat: Comparison with in situ data from the Everglades Depth Estimation Network. Remote Sens. 2015, 7, 12503–12538.
- Zhang, K.; Chen, S.-C.; Whitman, D.; Shyu, M.-L.; Yan, J.; Zhang, C. A progressive morphological filter for removing nonground measurements from airborne LiDAR data. IEEE Trans. Geosci. Remote Sens. 2003, 41, 872–882.
- Zhang, W.; Qi, C.; Wan, P.; Wang, H.; Xie, D.; Wang, X.; Yan, G. An easy-to-use airborne LiDAR data filtering method based on cloth simulation. Remote Sens. 2016, 8, 501.
- Evans, J.S.; Hudak, A.T. A multiscale curvature algorithm for classifying discrete return LiDAR in forested environments. IEEE Trans. Geosci. Remote Sens. 2007, 45, 1029–1038.
- Lee, S.; Wolberg, G.; Shin, S.Y. Scattered data interpolation with multilevel B-splines. IEEE Trans. Vis. Comput. Graph. 1997, 3, 228–244.
- Nakamura, F.; Ishiyama, N.; Sueyoshi, M.; Negishi, J.N.; Akasaka, T. The significance of meander restoration for the hydrogeomorphology and recovery of wetland organisms in the Kushiro River, a lowland river in Japan. Restor. Ecol. 2014, 22, 544–554.
- Shepard, D. A two-dimensional interpolation function for irregularly-spaced data. In Proceedings of the 1968 23rd ACM National Conference, New York, NY, USA, 27–29 August 1968; pp. 517–524.
- De Floriani, L.; Magillo, P. Triangulated Irregular Networks. In Encyclopedia of Database Systems; Springer: Boston, MA, USA, 2009; pp. 3185–3189.
- Matheron, G. Principles of geostatistics. Econ. Geol. 1963, 58, 1246–1266.
- Planchon, O.; Darboux, F. A fast, simple and versatile algorithm to fill the depressions of digital elevation models. Catena 2001, 46, 159–176.
- Wang, L.; Liu, H. An efficient method for identifying and filling surface depressions in digital elevation models for hydrologic analysis and modelling. Int. J. Geogr. Inf. Sci. 2006, 20, 193–213.
- O’Callaghan, J.F.; Mark, D.M. The extraction of drainage networks from digital elevation data. Comput. Vis. Graph. Image Process. 1984, 28, 323–344.
- Tarboton, D.G. A new method for the determination of flow directions and upslope areas in grid digital elevation models. Water Resour. Res. 1997, 33, 309–319.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.






