Article

Extraction of River Water Bodies Based on ICESat-2 Photon Classification

College of Information Science and Engineering, Shandong Agricultural University, Tai’an 271018, China
*
Author to whom correspondence should be addressed.
Remote Sens. 2024, 16(16), 3034; https://doi.org/10.3390/rs16163034
Submission received: 3 July 2024 / Revised: 11 August 2024 / Accepted: 15 August 2024 / Published: 18 August 2024
(This article belongs to the Section Remote Sensing in Geology, Geomorphology and Hydrology)

Abstract

The accurate extraction of river water bodies is crucial for the utilization of water resources and understanding climate patterns. Compared with traditional methods of extracting rivers using remote sensing imagery, the launch of satellite-based photon-counting LiDAR (ICESat-2) provides a novel approach for river water body extraction. The use of ICESat-2 ATL03 photon data for inland river water body extraction is relatively underexplored and thus warrants investigation. To extract inland river water bodies accurately, this study proposes a method based on the spatial distribution of ATL03 photon data and the elevation variation characteristics of inland river water bodies. The proposed method first applies low-pass filtering to denoised photon data to mitigate the impact of high-frequency signals on data processing. Then, the elevation’s standard deviation of the low-pass-filtered data is calculated via a sliding window, and the photon data are classified on the basis of the standard deviation threshold obtained through Gaussian kernel density estimation. The results revealed that the average overall accuracy (OA) and Kappa coefficient (KC) for the extraction of inland river water bodies across the four study areas were 99.12% and 97.81%, respectively. Compared with the improved RANSAC algorithm and the combined RANSAC and DBSCAN algorithms, the average OA of the proposed method improved by 17.98% and 7.12%, respectively, and the average KC improved by 58.38% and 17.69%, respectively. This study provides a new method for extracting inland river water bodies.

1. Introduction

Inland water bodies, including rivers, lakes, reservoirs, and wetlands on Earth’s surface, are primary components of water resources and are crucial for human life, ecological conservation, and the sustainable development of the social economy [1,2]. Among these, rivers are among the most important components of inland water bodies, serving as the cradle of human society and civilization, and are intimately linked with human culture and history [3,4], as well as with ecosystems in ecological assessments [5,6,7]. River systems enhance horizontal and vertical ecological connectivity among various habitats [8]. Given climate change and human activities, spatial–temporal variations in river water bodies have become hotspots for scientists and governments worldwide [2]. The range of river water bodies, water level/depth, and flow rate are crucial elements in the dynamic monitoring of water resources and in the study of hydrological and ecological environment protection [9,10]. Rapid and accurate acquisition of river information from water bodies is essential for resource management and planning, disaster assessment, and understanding the climate [11,12,13].
Compared with traditional in situ water body monitoring methods, satellite remote sensing monitoring offers numerous advantages, including a low cost, wide coverage, multi-scale imaging, and long-term monitoring capabilities. Sensors that can serve the purpose of measuring surface water can be classified into two categories: optical and microwave [13]. Benefiting from the accumulated theories and methods of ocean-color remote sensing, optical remote sensing satellite imagery has made remarkable progress in monitoring inland water bodies, especially large areas of lakes and reservoirs. Currently, the extraction of water bodies via optical remote sensing data often involves threshold segmentation based on pixel threshold methods, employing suitable bands to construct indices, such as the normalized difference water index [14], modified normalized difference water index [15], and automated water extraction index [16], for water body separation. With the improvement in image resolution, mixed pixels are easily generated at the edges of water bodies, making the threshold method inaccurate for small water bodies and edge areas. Therefore, researchers have constructed machine learning classification methods, which include decision trees [17,18], artificial neural networks [19,20], support vector machines [21,22], and random forests [23]. However, for narrow rivers, the extraction results are generally unsatisfactory due to the limited spatial resolution of optical sensors and obstruction from clouds and vegetation [24]. Microwave remote sensing, which uses long-wavelength radiation, is unaffected by solar radiation, allowing it to penetrate cloud cover and some vegetation. Among them, synthetic aperture radar (SAR), which can operate under all weather conditions, is often used for inland water body extraction. Water body extraction methods using radar remote sensing data include edge detection [25] and active contour models [26]. Although SAR images can provide valuable information in shallow water and shadowed areas, radar sensors typically have low spatial resolution and high noise levels [27].
Light detection and ranging (LiDAR) is an active remote sensing technology that uses a transmitter to emit laser pulses and a receiver to collect the return photons, calculating the distance based on the time delay between the signal pulse emission and return. On the basis of measurement platforms, LiDAR systems can be classified into ground-based, airborne, and spaceborne LiDAR systems.
Compared with ground-based LiDAR, airborne LiDAR uses aircraft as the measurement platform and has the advantages of greater flexibility and efficiency. Airborne LiDAR can achieve integrated land and sea measurements and simultaneously obtain laser point cloud data on sea and land surfaces. Using the elevation difference between the sea surface and land points, sea and land classification (SLC) can be performed to extract water body information. Airborne LiDAR typically generates a digital elevation model from point cloud data and combines tidal models to obtain coastlines for SLC [28,29,30]. In addition, Jiang proposed a method for coastline extraction under multiple constraints of coarse and fine grids, achieving SLC [31]. Yu et al. introduced a coastline extraction method based on airborne LiDAR point cloud data in which the point cloud is gridded and the local elevation is calculated using precise tidal models and tide data to obtain the coastline, thereby achieving SLC [32]. Zhao et al. proposed an SLC method based on elevation threshold intervals derived from airborne LiDAR point cloud data [33]. They used the RANSAC algorithm to obtain the linear mean water surface, thereby determining the elevation threshold intervals for water points to complete the SLC. Airborne LiDAR can balance accuracy and efficiency, is easy to manage, and has high mobility, enabling the acquisition of high-precision shoreline data [33,34,35]. However, the cost of acquiring airborne data is high, making large-scale surveys challenging, and airborne data are often subject to airspace control and other restrictions.
Spaceborne LiDAR has overcome the regional limitations of airborne LiDAR [36], providing a novel means for altimetry and bathymetry. Spaceborne LiDAR enables large-scale, all-weather measurements with global coverage, low data acquisition costs, a high repeat frequency, and minimal influence from external factors, remaining at the forefront of technological development [37]. Since its data release, ice, cloud, and land elevation satellite 2 (ICESat-2) has been applied to various terrains, including but not limited to lakes [38], reservoirs [39], vegetation [40,41], sea ice [42,43], shallow water bathymetry [44,45,46,47], and land [48].
Currently, SLC algorithms for ICESat-2 ATL03 data are commonly used for shallow ocean bathymetry. Zhen et al. employed a Gaussian statistical model to remove sea surface photons [47], whereas Xi et al. utilized the RANSAC algorithm to obtain water surface elevation [49]. Studies using ICESat-2 for inland water bodies typically focus on large bodies of water, such as reservoirs [50,51] and lakes [52,53]. However, research on SLC of small bodies of water, such as river regions, is limited. In addition, most point cloud data classification methods are designed for SLC, with few studies on river/land classification (RLC).
As shown in Figure 1a, the proportion of flat-water surfaces is greater in oceanic regions than in inland river regions. Therefore, airborne and spaceborne point cloud data often use the RANSAC algorithm for SLC, where the fitted red dotted line covers sea surface point clouds and filters out land point clouds, achieving SLC. Consequently, the RANSAC algorithm yields excellent classification results in oceanic regions. However, as illustrated in Figure 1b, inland regions, particularly around rivers, have more uneven terrains with a small proportion of water bodies. When RANSAC is used for fitting, it is highly susceptible to the influence of land point cloud data, resulting in substantial deviations, such as the red dotted line in Figure 1b. Sometimes, it even appears as a fitting line with a steep slope, which is unfavorable for RLC. To address these issues, this study proposes a new photon dispersion method for extracting inland river water bodies.
The remainder of this paper is organized as follows: Section 2 describes the data and study areas. Section 3 provides the details of the photon dispersion method. Section 4 validates and analyzes the proposed method through experiments. Section 5 discusses the experimental results. Finally, Section 6 provides the conclusions and recommendations.

2. Data and Study Areas

2.1. Research Data

2.1.1. ICESat-2 ATL03

ICESat-2, launched in September 2018, carries the advanced topographic laser altimeter system (ATLAS) and was the first satellite mission to use micropulse, multibeam photon-counting technology [54]. ATLAS utilizes a green laser (532 nm) and photomultiplier tubes as photon-counting detectors [55]. ATLAS emits six laser beams arranged in three pairs parallel to the along-track direction, each pair containing one strong beam and one weak beam, with a strong-to-weak energy ratio of 4:1. The cross-track distance between pairs is approximately 3.3 km, and within each pair, the cross-track distance is approximately 90 m, providing broad spatial coverage [54]. The ATL03 dataset contains common noise found in single-photon-counting LiDAR, primarily solar noise and atmospheric scattering noise [56]. The ATL03 data are corrected for errors such as atmospheric delay, tides, and system pointing biases [55]. The ATL03 data used in this study were collected from February to March 2022, and only strong-beam data were utilized. The ATL03 data product can be freely downloaded from https://nsidc.org/data/atl03/versions/6 (accessed on 13 March 2023).
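For readers working with the same product, the sketch below (not the authors' code) shows one way to pull a single-beam photon profile out of an ATL03 HDF5 granule with h5py. The granule path and the beam label "gt1l" are placeholders, and whether a given beam is strong or weak must be checked against the spacecraft orientation flag stored in the granule.

```python
# Sketch only: reading one beam's photon profile from an ATL03 HDF5 granule.
# Field paths (heights/h_ph, heights/dist_ph_along, heights/lat_ph,
# heights/lon_ph) are standard ATL03 datasets; the beam label is a placeholder.
import h5py
import numpy as np

def read_atl03_beam(granule_path, beam="gt1l"):
    with h5py.File(granule_path, "r") as f:
        h_ph = f[f"{beam}/heights/h_ph"][:]           # photon height above ellipsoid (m)
        dist = f[f"{beam}/heights/dist_ph_along"][:]  # along-track distance within each segment (m)
        lat = f[f"{beam}/heights/lat_ph"][:]
        lon = f[f"{beam}/heights/lon_ph"][:]
    # Note: dist_ph_along is relative to its geolocation segment; a continuous
    # along-track coordinate additionally needs geolocation/segment_dist_x.
    return np.column_stack([dist, h_ph, lat, lon])
```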

2.1.2. World Imagery

This study utilized remote sensing imagery from Esri’s 2022 historical imagery service (World Imagery Wayback) to visually interpret and extract the distribution of inland river water bodies within the study area. The World Imagery dataset primarily sources data from the SPOT5, GEOI, WV02, and WV03 satellites, with spatial resolutions ranging from 0.31 m to 2.5 m [57].

2.2. Study Area

This paper identifies four study areas to validate the effectiveness of the proposed classification method under different land cover conditions. The overall distribution of the study areas is shown in Figure 2c,d. The first study area is located in Zhuhai, Guangdong Province, China (Figure 2a). This area has relatively flat terrain but includes villages, making the surface environmental conditions complex. The second study area is located in Foshan, Guangdong Province, China (Figure 2b). This area passes through extensive paddy fields with flat terrain, which can easily cause classification interference. The study area also features a mountainous region in the middle and an island in the center of the river over which the laser pulses pass. The surface conditions in this study area are highly variable. The third study area is located in northwestern Borneo, Malaysia (Figure 2e). It is situated at the estuary of three rivers, forming a bay. The water body area is larger than the land area, differing from the other study areas, where the land proportion is remarkably greater than that of the water bodies. Additionally, this area contains numerous small river sections. The fourth study area is in western Borneo, Indonesia (Figure 2f). Here, the river flows through a tropical forest, with the surface covered by tall trees all year round, and the proportion of the river is smaller than that in the other study areas.

3. Methods

After the ATL03 signal point cloud data are unfolded in a two-dimensional along-track distance–elevation space, a profile of the photon data can be obtained. Given the variations in terrain and surface features, the elevation of land photons can change considerably over short distances. In contrast, the gentle flow and stability of inland river water result in small or negligible elevation changes in water photons. On the basis of these characteristics, we propose a photon dispersion method specifically for extracting inland rivers. The detailed process is as follows: First, two levels of denoising are performed on the ATL03 point cloud photon data to obtain signal data. Second, low-pass filtering is applied to the signal data to obtain the low-frequency signal. Next, the data within the window are classified via the elevation standard deviation (STD) in the sliding window and the STD threshold of the elevation to distinguish between land and inland river photons. Finally, a confusion matrix is constructed on the basis of the validation data to calculate the overall accuracy (OA) and Kappa coefficient (KC) for the accuracy assessment.

3.1. Photon Dispersion Method

According to the characteristics of inland river water body data, the elevation change characteristics of photon point clouds in the study area must be captured to extract water photons from inland areas. High-frequency components must be suppressed to highlight the elevation features of interest in the photon point cloud, facilitating subsequent processing and classification. To achieve this, we employ a Butterworth low-pass filter on the photon point cloud data to suppress high-frequency noise while amplifying the elevation features of the photon point cloud. Additionally, the elevation’s STD is often used to describe terrain surface variability, which can exactly reflect fluctuations in the data elevation values and their dispersion relative to the overall average elevation.

3.1.1. Butterworth Low-Pass Filtering

The Butterworth low-pass filter is chosen for smoothing and enhancing the photon point clouds in this study for several reasons. First, it has a smooth frequency response with a flat passband that preserves the original signal values without distortion. Second, it achieves zero-phase shift filtering, meaning that no time or positional correction is needed during the filtering of photon point cloud elevations. Third, the filter’s order can be adjusted to increase the steepness of the results, offering strong anti-interference capabilities and broad applicability in processing photon point cloud data.
The cutoff frequency of the Butterworth low-pass filter remarkably impacts the final classification results. If the cutoff frequency is set too low, the elevation fluctuation trend of the study data can be clearly obtained; however, this excessively smooths important photon elevation change characteristics, making it difficult to obtain effective standard deviation values. Conversely, if the cutoff frequency is set too high, more detailed information of the photon data is retained, but the presence of noise can obscure the true data variation characteristics, resulting in a low signal-to-noise ratio. Therefore, selecting an appropriate cutoff frequency is crucial for retaining photon elevation change information while mitigating the adverse effects of noise.
The transfer function of an N-order Butterworth low-pass filter applied to the photon point cloud elevation data is shown in Equation (1). After processing, the photon point cloud with amplified elevation features is obtained, facilitating subsequent inland channel water body extraction.
$$H_{lowpass} = \frac{1}{1+\left(\frac{H_i}{\omega_C}\right)^{2N}} \tag{1}$$
Here, $H_{lowpass}$ is the elevation value of the photon after low-pass filtering, $H_i$ is the original elevation value of the $i$th photon, $\omega_C$ is the cutoff frequency, and $N$ is the order of the filter.
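As a concrete illustration of this step, the following sketch applies a zero-phase Butterworth low-pass filter to the photon elevation sequence with SciPy; the order and the normalized cutoff frequency shown here are assumed values (the cutoffs actually used are reported in Section 4.2).

```python
# Sketch of zero-phase Butterworth low-pass filtering of the elevation sequence.
# The order (4) and normalized cutoff (0.26, relative to the Nyquist frequency)
# are assumed values; filtfilt runs the filter forward and backward, which gives
# the zero-phase behaviour described in the text.
import numpy as np
from scipy.signal import butter, filtfilt

def lowpass_elevation(h_signal, cutoff=0.26, order=4):
    b, a = butter(order, cutoff, btype="low")
    return filtfilt(b, a, np.asarray(h_signal, dtype=float))
```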

3.1.2. Calculation of the Elevation’s STD Using a Sliding Window

By using a sliding window in this study, the low-pass-filtered data are processed along the along-track distance (m) sequence of the point cloud data. Traditional sliding windows use a fixed distance to slide forward, which is unsuitable for ATL03 data because of its uneven along-track photon density distribution. For regions with a high photon density, a small window is needed to describe the data characteristics accurately, whereas a large window is needed for regions with a low photon density to ensure adequate data points for feature description. Therefore, as shown in Equation (2), this study improves the fixed-step sliding-window method by using a fixed photon count for the window size and moving one photon step length at a time, calculating the elevation’s STD for the data within the window, as shown in Equation (5).
The filtered data produce a continuous elevation sequence, $H_{lowpass} = \{H_1, H_2, H_3, \dots, H_n\}$, where $n$ is the length of the elevation sequence of the point cloud data after low-pass filtering. First, the filtered data must be reordered in the along-track direction to prevent backtracking when the sliding window advances by a single photon length. Next, the elevation's STD for the filtered data within the window is calculated using Equations (2)–(5). Finally, the elevation's STD for the filtered data within the window is assigned to the filtered data point in the middle of the window, resulting in the filtered-data sliding-window STD set $\sigma = \{\sigma_1, \sigma_2, \sigma_3, \dots, \sigma_n\}$.
$$F_t = \{H_t, H_{t+1}, H_{t+2}, \dots, H_{t+w-1}\} \tag{2}$$
$$t = t_0 + s \tag{3}$$
$$\bar{H}_t = \frac{1}{w}\sum_{i=t}^{t+w-1} H_i \tag{4}$$
$$\sigma_W = \left[\frac{1}{w}\sum_{i=t}^{t+w-1}\left(H_i - \bar{H}_t\right)^2\right]^{\frac{1}{2}} \tag{5}$$
Here, $F_t$ represents the window function, which indicates the data points contained within the window at position $t$; $t$ is the starting position of the window; $w$ is the window size; $s$ is the window step size; $\bar{H}_t$ denotes the mean value of the data points within the window at position $t$; and $\sigma_W$ represents the elevation's STD of the central data point within the window at position $t$.
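The fixed-photon-count window can be implemented as in the sketch below (a minimal version, assuming the elevations are already sorted by along-track distance); the window size of 1000 photons matches the value used later in Section 4.2, and the edge handling at the start and end of the profile is our own simplification.

```python
# Sketch of the fixed-photon-count sliding window: each window holds w photons,
# advances one photon at a time, and its elevation STD is assigned to the
# central photon. Elevations are assumed sorted by along-track distance;
# the clamping at the profile edges is an assumption of this sketch.
import numpy as np

def sliding_window_std(h_lowpass, w=1000):
    h = np.asarray(h_lowpass, dtype=float)
    n = len(h)
    half = w // 2
    sigma = np.empty(n)
    for i in range(n):
        start = min(max(i - half, 0), max(n - w, 0))  # keep the window inside the profile
        sigma[i] = np.std(h[start:start + w])
    return sigma
```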

3.1.3. RLC Criterion

For the filtered data assigned sliding-window STD values, as shown in Equation (6), data points with an STD less than the threshold are classified as inland river points, whereas those with an STD greater than or equal to the threshold are classified as land points. On the basis of the classified inland river points, the original inland water and land photons are deduced, as shown in the following equation:
$$Label(i) = \begin{cases} \text{river water photon}, & \sigma_i < \sigma_{threshold} \\ \text{land photon}, & \sigma_i \geq \sigma_{threshold} \end{cases} \tag{6}$$
where $Label(i)$ represents the label of the $i$th photon point cloud; $\text{river water photon}$ and $\text{land photon}$ denote the inland water and land photons, respectively; $\sigma_i$ is the STD of the central data point in the $i$th window; and $\sigma_{threshold}$ is the threshold.

3.1.4. Threshold Calculation

Under clear-sky conditions with minimal clouds and aerosols, the signal return rate from the ICESat-2 laser emitter is roughly the same for water and land surfaces, with approximately zero to four signal photons expected to be received from each strong beam laser shot [49]. In consideration of the terrain’s surface undulations and ocean wave effects, an average of approximately one photon per meter is received vertically [58]. Given the considerable surface undulations on land, the elevation range of land photons is large, whereas the calm water surface of inland rivers results in a small elevation range for water photons. Consequently, within the same along-track distance, land photons exhibit a larger elevation range and lower photon density, whereas inland river photons, owing to their smaller elevation range, are more concentrated and have a high density.
These characteristics indicate that water photons have a smaller STD and higher STD density than land photons do. Therefore, when transitioning from water photon STD density to land photon STD density, a density peak is observed. Therefore, this study uses Gaussian kernel density estimation (GKDE) to analyze the elevation’s STD of the sliding-window-filtered data. The first STD density peak of the filtered data points is used as the elevation’s STD threshold for RLC. The density estimation formula is represented by Equation (10).
GKDE is a commonly used nonparametric density estimation method. It does not require assumptions about the specific distribution of the data when the probability density function (PDF) of the data is estimated, making it suitable for various types of data [59]. The Gaussian kernel function, which presents a bell-shaped curve, offers excellent smoothness and produces good estimation results for data distributions, especially for approximately unimodal distributions [60]. Therefore, the Gaussian kernel function is chosen as the kernel function for density estimation, as shown in Equation (9). Given its excellent smoothness, the Gaussian kernel function reduces estimation errors caused by data fluctuations when the PDF of the data is estimated. The elevation STD density data obtained from the low-pass-filtered sliding window exhibit an approximately skewed and bimodal distribution, allowing the Gaussian kernel function to fit the data density distribution effectively.
Silverman’s rule of thumb (SROT) method can avoid missing bimodal distributions and better handle various non-normal unimodal densities [61]. Silverman reported that the SROT performed well for densities with skewness or bimodality [62]. The SROT is used to select the bandwidth parameter in this study, as shown in Equation (8). To balance smoothness based on the data’s STD and quantity, the bandwidth parameter decreases as the sample size increases, thereby avoiding oversmoothing issues. Moreover, the bandwidth parameter is proportional to the sample STD or interquartile range (IQR), ensuring a small bandwidth parameter for data with small STDs or IQRs. This makes the bandwidth parameter sensitive to the distribution characteristics of the data.
For each STD data point $\sigma_i$, the bandwidth parameter is first calculated using Equation (8), and the standardized distance is then obtained using Equation (7). Then, the Gaussian kernel function value is computed using Equation (9). Finally, all the kernel function values are summed and normalized in accordance with Equation (10). The PDF is constructed through smoothing and convolution techniques.
$$u_i = \frac{\sigma - \sigma_i}{h} \tag{7}$$
$$h = 0.9 \times \min\left(std, \frac{IQR}{1.34}\right) n^{-\frac{1}{5}} \tag{8}$$
$$K(u_i) = \frac{1}{\sqrt{2\pi}} e^{-\frac{u_i^2}{2}} \tag{9}$$
$$\hat{f}(\sigma) = \frac{1}{nh}\sum_{i=1}^{n} K(u_i) \tag{10}$$
Here, $u_i$ is the standardized distance; $\sigma_i$ is the STD data point of the window-filtered data; $h$ is the bandwidth parameter; $std$ is the STD of the data sample $\sigma$; $IQR$ is the range between the 75th percentile (Q3) and the 25th percentile (Q1) of the data sample; $n$ is the sample size of the window-filtered STD; $K$ is the kernel function; and $\hat{f}(\sigma)$ is the estimated probability density at point $\sigma$. With GKDE used to fit the PDF of the elevation's STD, the first peak of the PDF is taken as the elevation's STD threshold $\sigma_{threshold}$ for RLC.
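A minimal sketch of this threshold selection is given below. It uses SciPy's gaussian_kde and find_peaks; note that SciPy's built-in Silverman rule does not include the IQR term of Equation (8), so the bandwidth is only an approximation of the SROT described above, and the grid resolution is an arbitrary choice.

```python
# Sketch of the first-peak threshold selection. SciPy's "silverman" bandwidth
# omits the IQR term of Equation (8), so it only approximates the SROT above;
# the 2000-point evaluation grid is an arbitrary choice.
import numpy as np
from scipy.stats import gaussian_kde
from scipy.signal import find_peaks

def first_peak_threshold(sigma):
    sigma = np.asarray(sigma, dtype=float)
    kde = gaussian_kde(sigma, bw_method="silverman")
    grid = np.linspace(sigma.min(), sigma.max(), 2000)
    density = kde(grid)
    peaks, _ = find_peaks(density)
    return grid[peaks[0]] if len(peaks) else grid[np.argmax(density)]
```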

3.1.5. DBSCAN Correction

Given the presence of a few photons uniformly distributed around the mean within a certain elevation and along-track distance range, these points are prone to being misclassified during preliminary classification. Moreover, using the elevation's STD threshold for classification inevitably results in some misclassified points. However, compared with correctly classified water points, these misclassified points are smaller and more fragmented along the track direction. On the basis of this characteristic, a one-dimensional DBSCAN algorithm can be utilized to filter and correct the preliminarily classified water points based on the density of the misclassified points. Considering that some ICESat-2 level 3 products, such as ATL08, acquire photon information at fixed segment sizes of 100 m along the ground track [63], this study sets the along-track width threshold to 100 m. By filtering out clusters with an along-track scale smaller than this width, as shown in Equation (11), the classification accuracy can be further improved.
$$C_{misclassification} = \{C_k \mid width(C_k) < 100\} \tag{11}$$
Here, $C_{misclassification}$ denotes the misclassified point clusters; $C_k$ represents the classified clusters; and $width(C_k)$ represents the along-track width of cluster $C_k$.
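The correction step can be sketched as follows (assumed eps and min_samples values, not the authors' settings): the preliminary water photons are clustered by along-track distance with a one-dimensional DBSCAN, and clusters narrower than 100 m are discarded as misclassifications.

```python
# Sketch of the along-track width correction: cluster the preliminary water
# photons by along-track distance (1-D DBSCAN) and keep only clusters at least
# 100 m wide. eps and min_samples are assumed values, not the authors' settings.
import numpy as np
from sklearn.cluster import DBSCAN

def correct_water_photons(along_track, water_mask, eps=10.0, min_samples=5, min_width=100.0):
    water_idx = np.flatnonzero(water_mask)
    x = np.asarray(along_track, dtype=float)[water_idx].reshape(-1, 1)
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(x)
    keep = np.zeros(len(water_idx), dtype=bool)
    for k in np.unique(labels):
        if k == -1:
            continue                       # DBSCAN noise points are treated as misclassified
        members = labels == k
        if x[members].max() - x[members].min() >= min_width:
            keep[members] = True
    return water_idx[keep]                 # indices of water photons that survive the filter
```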

3.2. Accuracy Assessment Method

To effectively validate the RLC method proposed in this study, a combination of qualitative and quantitative validation methods was employed by comparing with the visual interpretation of high-resolution maps of inland river areas. The qualitative validation is conducted by comparing the photon data classified using the method proposed in this study with the surface types interpreted at the corresponding locations. The qualitative validation involves observing whether the classified water photons and land photons correspond to the actual inland rivers and land and checking for any noticeable misclassification. The quantitative validation employs the calculation of a confusion matrix for the classified photons (as shown in Table 1) to obtain the OA and KC, which serve as metrics for assessing the classification accuracy. The confusion matrix, OA, and KC are represented as follows:
In the table, true positive (TP) represents the number of photons correctly classified as water photons; false negative (FN) represents the number of actual water photons misclassified as land photons; false positive (FP) represents the number of actual land photons misclassified as water photons; and true negative (TN) represents the number of photons correctly classified as land photons.
$$P_0 = \frac{TP + TN}{TP + FN + FP + TN} \tag{12}$$
$$P_e = \frac{(TP + FN)(TP + FP) + (TN + FP)(TN + FN)}{N^2} \tag{13}$$
$$K = \frac{P_0 - P_e}{1 - P_e} \tag{14}$$
Here, $N$ is the total number of photons; $P_0$ is the overall classification accuracy; $P_e$ is the chance agreement; and $K$ is the KC.
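For reference, the sketch below computes OA and KC directly from the four confusion-matrix counts, following Equations (12)–(14).

```python
# Sketch: OA and Kappa coefficient from the confusion-matrix counts,
# following Equations (12)-(14).
def oa_kappa(tp, fn, fp, tn):
    n = tp + fn + fp + tn
    p0 = (tp + tn) / n
    pe = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n ** 2
    return p0, (p0 - pe) / (1 - pe)
```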
A comparative analysis was conducted with the improved RANSAC algorithm and the RANSAC + DBSCAN algorithm to verify the performance of the proposed RLC algorithm.

4. Experimental Results and Analysis

4.1. Photon Point Cloud Denoising

Figure 3a shows the photon point cloud data for study area A, which are displayed as a two-dimensional distribution along the satellite track direction and photon elevation. In the data, the densely distributed points (red line) represent signal photons, whereas the numerous points distributed on both sides are noise photons. The ATL03 photon data have a low signal-to-noise ratio and strong background noise.
In this study, one-stage coarse denoising is performed via the elevation statistical histogram within a window, followed by two-stage fine denoising using an improved DBSCAN algorithm to obtain the signal data [64,65]. The elevation statistical histogram is calculated by counting the photon density in the elevation direction. The signal photons are obtained by taking the density peak as the center and using the standard deviation multiple as the buffer. The window size is chosen as 100 m (as shown in Figure 3c) along the track distance, which is a widely adopted window size, e.g., the window size used in the elevation statistics of ATL08 data [63]. On the basis of the elevation density distribution of photon points within each window, one-stage coarse denoising [64] is completed. A scatter plot of the raw photon points and an elevation statistical histogram for a random window are illustrated in Figure 3d. After one-stage coarse denoising, the majority of noise points are removed, reducing the elevation range to one-tenth of its original size, which decreases the subsequent computational load and improves the data processing efficiency. After two-stage fine denoising, any remaining stray noise points around the signal points are eliminated, reducing the interference of stray noise photons with the extraction of elevation variation characteristics of the signal photons. The elevation range of the signal photons in the study area is further reduced by nearly 12 m, resulting in the denoised point cloud photons shown in Figure 3b.
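A simplified sketch of the one-stage coarse denoising is given below; the 0.5 m elevation bin size and the three-STD buffer are assumed values chosen for illustration, not the parameters of the referenced method [64].

```python
# Simplified sketch of the one-stage coarse denoising: within each 100 m
# along-track window, build an elevation histogram, take the densest bin as the
# signal centre, and keep photons within k elevation STDs of that centre.
# The 0.5 m bin size and k = 3 are assumed values for illustration.
import numpy as np

def coarse_denoise(along_track, h, window=100.0, bin_size=0.5, k=3.0):
    along_track = np.asarray(along_track, dtype=float)
    h = np.asarray(h, dtype=float)
    keep = np.zeros(len(h), dtype=bool)
    edges = np.arange(along_track.min(), along_track.max() + window, window)
    for lo, hi in zip(edges[:-1], edges[1:]):
        idx = np.flatnonzero((along_track >= lo) & (along_track < hi))
        if idx.size == 0:
            continue
        bins = np.arange(h[idx].min(), h[idx].max() + bin_size, bin_size)
        if len(bins) < 3:                      # too few photons/bins to form a histogram
            keep[idx] = True
            continue
        counts, bin_edges = np.histogram(h[idx], bins=bins)
        peak = np.argmax(counts)
        centre = 0.5 * (bin_edges[peak] + bin_edges[peak + 1])
        keep[idx] = np.abs(h[idx] - centre) <= k * np.std(h[idx])
    return keep
```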

4.2. RLCs Based on the Photon Dispersion Method

After denoising the photons, the low-pass filter sliding-window classification method proposed in this study is applied for classification. First, the denoised signal photon data are input into the Butterworth low-pass filter to amplify and highlight the elevation variation characteristics. Second, a sliding window (initially set at 1000 data points for effective feature extraction) is used to calculate the elevation’s STD within the window. Then, using GKDE, the threshold for the elevation’s STD is determined, and the filtered photon data are classified in accordance with this threshold. Finally, the classified data are screened and corrected to reclassify small-scale misclassified data.
The denoised signal photons undergo Butterworth low-pass filtering to further extract the elevation variation characteristics. The selection of the Butterworth low-pass filter cutoff frequency often requires experimental adjustments to determine the optimal frequency. Through multiple analyses and experimental adjustments, the cutoff frequencies for study datasets A, B, C, and D were determined to be 0.26, 0.6, 0.28, and 0.5, respectively. In Figure 4a, the red polyline represents the Butterworth low-pass filter fitting line. The elevation variation trend of the data is well preserved, and the elevation information is undistorted, minimizing the impact of high-frequency stray points on the overall data trend.
Figure 4b shows the results of calculating the sliding window STD for the low-pass-filtered data. The blue line in Figure 4b represents the elevation STD obtained from the sliding window applied to the filtered data. The sliding window STD accurately describes surface elevation variations, facilitating classification.
On the basis of the GKDE algorithm, a density distribution histogram of the STD data and the probability density fitting curve of the sliding window STD of the low-pass-filtered data for each study area are obtained, as shown in Figure 5a–d (Figure 5b shows only part of the curve; the segment where the STD lies between 6 and 34 has a density close to 0). The STD at the first peak of the STD PDF curve is taken as the elevation's STD threshold for classifying the data. In the fitted PDF curves of datasets A and B, there is no long-tail phenomenon (where the majority of the data correspond to small STD values, whereas only a small portion corresponds to large STD values). Most of the STD values corresponding to the density histograms are within 1.5. In contrast, datasets C and D exhibit a long-tail phenomenon in their fitted PDF curves, with the STD values corresponding to the density histograms reaching 5 and nearly 16, respectively. The long tail in the STD PDF represents the land: the longer the tail, the more complex the land terrain.
As illustrated in Figure 6a, the STD data are divided into high-STD data and low-STD data via the sliding window STD threshold from the low-pass filter. The classification results of the STD are shown in Figure 6b, where the red portions represent high-STD data corresponding to land-filtered photons, and the blue portions represent low-STD data corresponding to water-filtered photons.
Based on the classified high- and low-STD data, backtracking to the filtered photon data ultimately yields the preliminary photon classification results shown in Figure 7a. A few misclassified points are found among the initially classified water points (blue points in Figure 7a). As shown in Figure 7b, the fragmented misclassified points are corrected, where the red portions are misclassified points, and the blue portions are water points. By recombining the filtered and corrected misclassified points with the land points, the final classification results of the inland river water bodies are obtained.

4.3. Results from the Three Classification Methods

Figure 8a–d compare the two-dimensional expansion results of photon classification for the four study areas via three different classification methods. Figure 8 reveals that within the four study areas, the algorithm proposed in this study, the improved RANSAC algorithm, and the combined RANSAC and DBSCAN algorithms can classify water and land points. The data classified using the improved RANSAC algorithm contain more misclassified points, particularly in segments where the data are uniformly distributed in the elevation direction. The delineation of water regions produced by the combined RANSAC + DBSCAN algorithms is unclear, with boundaries often encroaching on land. Compared with the two other traditional methods, the proposed method produces water bodies with clear boundaries and fewer misclassified points.

4.4. Accuracy Assessment and Comparison

4.4.1. Qualitative Evaluation

The overlay results of the classification outputs from the three methods with the visually interpreted water body areas reveal that the proposed RLC algorithm can effectively perform classification in flat and rugged terrains and in areas with simple or complex surface cover. The locations of the classified water photons match those in high-precision maps without large-scale misclassification.
Figure 9a–i show the overlay of the classification results from the three methods with the visually interpreted results for dataset A, along with enlarged local views. Figure 9a–c present the overall classification results from the three methods, whereas Figure 9d–f and Figure 9g–i present enlarged views of the upper and lower blue boxes in the overall images, respectively. The above figures reveal that for relatively flat data A, all three classification methods accurately locate the water body. The method proposed in this study provides more precise boundary determination for water areas (Figure 9d,g). The improved RANSAC algorithm results in more misclassified points, likely because the data pass through paddy fields (Figure 9e,h), where the elevation range is small and the distribution is uniform, making the improved RANSAC algorithm prone to misclassification. Although the RANSAC + DBSCAN algorithms can remove the misclassified points generated by the RANSAC algorithm, its water boundary delineation is inferior to that of the method proposed in this study (Figure 9f,i).
For dataset B, which contains rivers with central islands, the proposed method accurately classifies the river data, identifies the location of the island within the data, and correctly determines the land boundaries (Figure 10d). However, the other two classification methods are not sensitive to the central island in the river and fail to classify and identify it correctly (Figure 10f). Despite the presence of large elevation variations in the mountainous areas at the center of the data, none of the three methods are affected by the mountainous terrain in the data.
Study dataset C consists of river estuary data, where the water body area is larger than the land area. For this dataset, the improved RANSAC and RANSAC + DBSCAN algorithms perform better than in other datasets. For the small river areas in the center of dataset C (blue box in Figure 11a–c), the proposed method and the improved RANSAC algorithm can classify them, but the improved RANSAC algorithm is less accurate in determining river boundaries (Figure 11d,e). The RANSAC + DBSCAN algorithms, during correction, even remove small water areas (Figure 11f), resulting in an overcorrection phenomenon.
The along-track distance range of study dataset C is much larger than that of the other three study datasets and contains several narrow water bodies, which results in a narrow range of water bodies on the two-dimensional photon classification map. The classification details cannot be clearly observed in Figure 8c, so the part containing the narrow water body classification is magnified and displayed in Figure 12b. The classification results of dataset C are analyzed in detail, and a one-to-one illustration of the photon classification results and overlay results is given, as shown in Figure 12. The proposed algorithm extracts narrow rivers in such areas effectively.
In dataset D, the river is located within a forest with considerable terrain undulations (Figure 8d). This dataset highlights the advantages of the proposed algorithm, which accurately and completely classifies the water body data (Figure 13d). The improved RANSAC algorithm misclassifies flat blank areas within the forest (Figure 13e), whereas the RANSAC + DBSCAN algorithms, although capable of removing some misclassified points, retain many and inaccurately determine river water body boundaries (Figure 13f), similar to dataset A.
Qualitative analysis revealed that the proposed photon dispersion algorithm can effectively classify land and water photons in areas with flat or rugged terrain and simple or complex surface cover. Compared with the improved RANSAC algorithm, the proposed photon dispersion method can accurately classify photons in flat areas with minimal misclassification. Compared with the RANSAC + DBSCAN method, the proposed photon dispersion algorithm better extracts water and land photon boundary points and effectively classifies small river water photon points. Thus, the proposed photon dispersion algorithm outperforms the improved RANSAC and RANSAC + DBSCAN algorithms.

4.4.2. Quantitative Evaluation

The proposed photon dispersion algorithm was quantitatively evaluated using the interpreted water data, and a comparative analysis was conducted with the improved RANSAC algorithm and the RANSAC + DBSCAN algorithms. The confusion matrix results for water photons and land photons are shown in Table 2. The dataset with the most signal photons is dataset C, with a total of 100,188 signal photon points, whereas the dataset with the least signal photons is dataset D, with 35,014 signal photon points. OA and KC were calculated using Formulas (12)–(14).
As shown in Table 3, the proposed photon dispersion algorithm achieves average OA and KC values of 99.12% and 97.81%, respectively, for the four datasets. For datasets A and D, the proposed algorithm achieves the highest accuracy, indicating that the proposed method is most effective for data with considerable and uneven changes, which is consistent with the qualitative analysis results. The improved RANSAC algorithm produced the best classification results for dataset D, with an OA of 90.29%, which aligns with previous analyses suggesting that RANSAC is suitable for large water body classification. Compared with the improved RANSAC algorithm, the RANSAC + DBSCAN algorithms, which remove misclassified points from the improved RANSAC algorithm, achieve markedly better results. Compared with the proposed method, the best OA of the RANSAC + DBSCAN algorithms was less than 1% lower (dataset A), and the worst was less than 25% lower (dataset D). Similarly, the best KC result of the RANSAC + DBSCAN algorithms was less than 5% lower than that of the proposed method (dataset A).
To better compare the impact of different sizes of water bodies on the classification results in the along-track direction, this study introduces the percentage of water bodies (i.e., the percentage of the total length occupied by water bodies in the along-track direction) and the water/land (W/L) ratio (i.e., the ratio of the distance occupied by water bodies to that occupied by land in the along-track direction) as key parameters to measure the information of river water bodies and land within the study area, as shown in Table 4. Among the four study areas, the highest percentage of water bodies is found in study area C, reaching 53.7%, with a W/L ratio of 115.8%. Conversely, the smallest percentage of water bodies and the smallest W/L ratio are found in study area D, at 0.056% and 0.059%, respectively. Additionally, the number of water bodies traversed by each study dataset was counted: datasets A, B, C, and D have 2, 4, 8, and 1 water bodies, respectively.
The classification accuracy of the proposed photon dispersion algorithm is shown in Figure 14. The OAs and KCs of the proposed photon dispersion algorithm decrease as the W/L ratio increases, and the smaller the number of water bodies is, the higher the classification accuracy. Therefore, dataset D achieves the best OA and KC. According to Figure 8d and Figure 12, research dataset D passes through a forest area with remarkable terrain undulation and complex surface cover while having the lowest W/L ratio. The OA results for the proposed photon dispersion algorithm are slightly higher than the KC results for datasets A and B. The specific reason may be that these two datasets are located in relatively flat areas with little terrain undulation and moderate W/L ratios. Compared with that of study dataset D, the classification performance of dataset C is worse, with the OAs and KCs decreasing by 1.43% and 2.60%, respectively. The reason may be that study dataset C has the highest W/L ratio and the longest along-track length among the four study areas.
Overall, the proposed algorithm demonstrated higher classification accuracy for data with complex terrains, various surface covers, and low W/L ratios because water and land photon densities are similar along the track direction, but land photons show greater elevation changes, making elevation variation characteristics more apparent and thus achieving higher classification accuracy.

5. Discussion

In this study, we classified land and inland river photons from ICESat-2 photon point cloud data based on the spatial photon dispersion using our proposed photon dispersion algorithm. When selecting the STD threshold for the low-pass-filtered data, we used the STD at the first peak of the PDF curve fitted using the GKDE as the threshold. However, while plotting the STD probability density distribution histogram and fitting the PDF curve, we found that the fitted curves of datasets C and D, in contrast to datasets A and B, are more complex in the latter half of the PDF curve, exhibiting more minor fluctuations and a pronounced long-tail phenomenon. Therefore, we selected the STD values at the first peak, first trough, and second peak of the dataset’s PDF curve as thresholds to discuss the rationale of using the first peak as the threshold for classifying inland river water bodies. Additionally, we discuss the reasons for the more intricate peaks and long-tail phenomena in datasets C and D than in the other two datasets.

5.1. Photon Dispersion Algorithm Parameter Sensitivity

The choice of the cutoff frequency in Butterworth low-pass filtering and the window size in the sliding window affects the final classification results. Using dataset A as an example, this section analyzes the sensitivity of the different cutoff frequencies and window size values.
The OAs and KCs are calculated for different cutoff frequencies, and the results are shown in Figure 15. The OAs do not exhibit marked changes before 0.4 but decrease thereafter. The KCs first increase and then decrease, with the largest value occurring at a cutoff frequency of 0.26. Therefore, the choice of the cutoff frequency in Butterworth low-pass filtering is reasonable.
Similarly, the OAs and KCs are calculated under different sliding window sizes, as shown in Figure 16. The OAs do not show marked changes, remaining stable at over 99% after a window size of 700. The KCs exhibited an increasing trend followed by stabilization, reaching a maximum value at 1000. Therefore, the selection of the sliding window size is reasonable.
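The sensitivity sweep described in this subsection can be reproduced along the lines of the sketch below, which reuses the hypothetical helper functions sketched in Section 3 (lowpass_elevation, sliding_window_std, first_peak_threshold, and oa_kappa) and compares the predictions against the visually interpreted water labels; an analogous loop over window sizes would cover the second parameter.

```python
# Sketch of the cutoff-frequency sweep, reusing the hypothetical helpers from
# the Section 3 sketches (lowpass_elevation, sliding_window_std,
# first_peak_threshold, oa_kappa). true_water is the visually interpreted label.
import numpy as np

def sweep_cutoff(h_signal, true_water, cutoffs=(0.1, 0.2, 0.26, 0.3, 0.4, 0.5)):
    results = []
    true_water = np.asarray(true_water, dtype=bool)
    for fc in cutoffs:
        sigma = sliding_window_std(lowpass_elevation(h_signal, cutoff=fc))
        pred = sigma < first_peak_threshold(sigma)
        tp = np.sum(pred & true_water)
        fn = np.sum(~pred & true_water)
        fp = np.sum(pred & ~true_water)
        tn = np.sum(~pred & ~true_water)
        results.append((fc, *oa_kappa(tp, fn, fp, tn)))
    return results                      # list of (cutoff, OA, KC)
```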

5.2. Using the First Peak of the PDF Curve as the Threshold

We chose the first peak of the PDF curve fitted using GKDE as the STD threshold. To validate this rationale, we compared the classification results using the first trough and the second peak of the PDF curve as thresholds, using dataset A as an example. The classification results when the first trough and second peak are used as thresholds are shown in Figure 17.
As shown in research dataset A in Table 5, the OA and KC for the first trough of the PDF curve are 94.44% and 73.52%, respectively. Compared with the use of the first peak as a threshold, the use of the first trough as a threshold accurately classifies the locations of water photons but is less accurate in determining water body boundaries. The OA and KC for the second peak of the PDF curve are 17.10% and 1.71%, respectively.
In the two-dimensional display of photons, most photons are classified as water photons, with only those in regions of remarkable elevation change classified as land photons. Clearly, the second peak is not suitable as a threshold for classifying inland water bodies in dataset A.
Water photons have a smaller dispersion than land photons do. In addition to the few photons that penetrate the water, most photons are concentrated on the water surface. As a result, the STD values and their variations for water photons obtained through the photon dispersion method are smaller, as shown in Figure 6a. In the STD distribution plot, water photons exhibit smaller values and are nearly parallel to the horizontal axis. Given the terrain undulations and surface cover, land photons have greater dispersion, leading to greater STD values when our method is used.
As shown in Table 5 and Figure 18, not only for dataset A but also for the other datasets, classification using the first peak as the threshold yields better results than using the first trough or the second peak. However, the variations in datasets C and D are much smaller than those in datasets A and B. Figure 5 and Table 5 clearly show that the fitted PDF curves of datasets C and D exhibit a long-tail phenomenon, whereas those of datasets A and B do not. Therefore, when the fitted PDF curve does not exhibit a long-tail phenomenon, the classification results are more sensitive to these three thresholds, with marked differences in the outcomes. For such data without long tails, using the first peak as the threshold is the most appropriate choice. Using the first trough as the threshold is not accurate for determining the water body boundaries, and using the second peak as the threshold fails to distinguish between water and land. However, for data where the PDF curve exhibits a long tail, the sensitivity to thresholds is low, and classification using the first peak, the first trough, or the second peak can achieve high OA and KC values. In conclusion, regardless of the dataset used, the threshold selection proposed in this study can yield satisfactory classification results.

5.3. Relationship between PDF Curves and Land Complexity

Figure 5 shows that the STD distribution histograms and PDF curves fitted using GKDE for datasets A and B differ considerably from those for datasets C and D. For datasets A and B, the probability density of the curve section with an STD greater than 2 approaches zero, with minimal long-tail phenomena. However, the PDF curves for datasets C and D have more minor peaks within the long tail. The number of peaks in the curves is shown in Table 6. Figure 8 indicates that most photons in datasets A and B have an elevation change range within 10 m and are more concentrated, resulting in few minor peaks in the long tail. Dataset B has a few photons with elevation changes of approximately 80 m; these photons are few but have a large elevation change range, making the long tail the longest despite having only one peak.
As shown in Table 6, datasets C and D have more peaks, more than twice as many as datasets A and B do, and the two-dimensional scatter plot in Figure 8 shows that the photons in datasets C and D are more dispersed because of the complex terrain or surface cover in their locations. The use of thresholds to classify data points in the long tail of the PDF curve indicates that areas with high land complexity tend to produce PDF curves with long tails and multiple peaks, as shown in Figure 19. Figure 11 and Figure 13 reveal that the STD values within the long tail of the fitted curve correspond to regions with considerable terrain undulation or complex surface cover. Hence, if the GKDE-fitted PDF curve has a long tail with multiple peaks, then the long tail represents data from areas with considerable terrain variation or complex surface cover.

6. Conclusions

On the basis of the spatial distribution characteristics of the ICESat-2 ATL03 data, we propose a novel photon dispersion algorithm for river water body extraction. The photon signals undergo Butterworth low-pass filtering to highlight the elevation features of the photons. Using a sliding window, we calculate the elevation’s STD within the window for the low-pass-filtered data. A threshold parameter for elevation STD classification is determined with the first peak of the PDF curve fitted using GKDE. The reliability and accuracy of the proposed photon dispersion algorithm are validated through qualitative and quantitative evaluations using visually interpreted data.
(1) The average OA and KC of the proposed photon dispersion method reached 99.12% and 97.81%, respectively, for the extraction of river water bodies in the four study areas. Compared with the improved RANSAC algorithm and the combined RANSAC and DBSCAN algorithms, the average OA of the proposed photon dispersion method improved by 17.98% and 7.12%, respectively, and the average KC improved by 58.39% and 17.69%, respectively. This result demonstrates that our classification method offers markedly higher classification accuracy and precision than the comparative methods used in this study do. For datasets with small rivers, such as dataset D, the proposed method accurately identifies the locations of these small rivers. This result shows that the proposed photon dispersion algorithm has high sensitivity in determining water body boundaries and has a strong ability to classify small water bodies accurately.
(2) River photons have lower dispersion than land photons do. Therefore, the first peak in the PDF curve represents the photons of the river, and the subsequent peaks in the PDF curve represent the photons of the land. Furthermore, in areas with high land complexity, characterized by considerable terrain undulation or complex surface cover, the PDF curve fitted using GKDE has a long tail with numerous peaks. This finding suggests that the PDF curve can also reflect land complexity.
The proposed photon dispersion algorithm effectively extracts river photons along the ICESat-2 tracks. In the future, this method can be used to monitor river water level changes in specific regions and to explore global inland water bodies (lakes, reservoirs, and rivers), addressing the limitations of current water level monitoring methods, which rely on tide gauges and have low efficiency. This approach enhances our understanding of the water cycle and is crucial for the protection of biodiversity and fragile ecosystems.

Author Contributions

Conceptualization, X.Z. and W.M.; methodology, X.Z. and W.M.; software, W.M.; validation, X.Z. and W.M.; formal analysis, W.M. and X.L.; investigation, W.M. and X.L.; resources, W.M.; data curation, W.M.; writing—original draft preparation, W.M.; writing—review and editing, X.Z. and W.M.; visualization, W.M.; supervision, X.Z.; project administration, X.Z.; funding acquisition, X.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the project ZR2023MD016 supported by Shandong Provincial Natural Science Foundation and the National Natural Science Foundation of China under Grant 41906166.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Acknowledgments

The authors would like to thank the NASA NSIDC for distributing the ICESat-2 data (https://search.earthdata.nasa.gov, accessed on 13 March 2023) and the anonymous reviewers and members of the editorial team for their constructive comments.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Schröter, M.; Bonn, A.; Klotz, S.; Seppelt, R.; Baessler, C. Atlas of Ecosystem Services: Drivers, Risks, and Societal Responses; Springer: Berlin/Heidelberg, Germany, 2019. [Google Scholar]
  2. Zhang, B.; Li, J.; Shen, Q.; Wu, Y.; Zhang, F.; Wang, S.; Yao, Y.; Guo, L.; Yin, Z. Recent research progress on long time series and large scale optical remote sensing of inland water. Natl. Remote Sens. Bull. 2021, 25, 37–52. [Google Scholar] [CrossRef]
  3. Ni, J.; Liu, Y. Ecological rehabilitation of damaged river system. J. Hydraul. Eng. 2006, 37, 1029–1037. [Google Scholar]
  4. Ni, J.; Ma, A. River Dynamic Geomorphology; Peking University Press: Beijing, China, 1998; pp. 151–159. [Google Scholar]
  5. Norris, R.H.; Morris, K. The need for biological assessment of water quality: Australian perspective. Aust. J. Ecol. 1995, 20, 1–6. [Google Scholar] [CrossRef]
  6. Bonada, N.; Prat, N.; Resh, V.H.; Statzner, B. Developments in aquatic insect biomonitoring: A comparative analysis of recent approaches. Annu. Rev. Entomol. 2006, 51, 495–523. [Google Scholar] [CrossRef] [PubMed]
  7. Feio, M.J.; Hughes, R.M.; Callisto, M.; Nichols, S.J.; Odume, O.N.; Quintella, B.R.; Kuemmerlen, M.; Aguiar, F.C.; Almeida, S.F.P.; Alonso-EguíaLis, P.; et al. The Biological Assessment and Rehabilitation of the World’s Rivers: An Overview. Water 2021, 13, 371. [Google Scholar] [CrossRef]
  8. Canaz, S.; Karsli, F.; Guneroglu, A.; Dihkan, M. Automatic boundary extraction of inland water bodies using LiDAR data. Ocean Coast. Manag. 2015, 118, 158–166. [Google Scholar] [CrossRef]
  9. Shi, Z.; Huang, C. Recent advances in remote sensing of river characteristics. Prog. Geogr. 2020, 39, 670–684. [Google Scholar] [CrossRef]
  10. Yuan, X.; Chao, Q.; Mengzhen, X.; Xudong, F.; Dan, L.; Baosheng, W.; Guangqian, W. Progress and Prospects in River Cross Section Extraction Based on Multi-Source Remote Sensing. Natl. Remote Sens. Bull. 2024, xx, 1–21. [Google Scholar] [CrossRef]
  11. Langat, P.K.; Kumar, L.; Koech, R. Monitoring river channel dynamics using remote sensing and GIS techniques. Geomorphology 2019, 325, 92–102. [Google Scholar] [CrossRef]
  12. Jin, S.; Liu, Y.; Fagherazzi, S.; Mi, H.; Qiao, G.; Xu, W.; Sun, C.; Liu, Y.; Zhao, B.; Fichot, C.G. River body extraction from sentinel-2A/B MSI images based on an adaptive multi-scale region growth method. Remote Sens. Environ. 2021, 255, 112297. [Google Scholar] [CrossRef]
  13. Huang, C.; Chen, Y.; Zhang, S.; Wu, J. Detecting, extracting, and monitoring surface water from space using optical sensors: A review. Rev. Geophys. 2018, 56, 333–360. [Google Scholar] [CrossRef]
  14. McFeeters, S.K. The use of the Normalized Difference Water Index (NDWI) in the delineation of open water features. Int. J. Remote Sens. 1996, 17, 1425–1432. [Google Scholar] [CrossRef]
  15. Xu, H. Modification of normalised difference water index (NDWI) to enhance open water features in remotely sensed imagery. Int. J. Remote Sens. 2006, 27, 3025–3033. [Google Scholar] [CrossRef]
  16. Feyisa, G.L.; Meilby, H.; Fensholt, R.; Proud, S.R. Automated Water Extraction Index: A new technique for surface water mapping using Landsat imagery. Remote Sens. Environ. 2014, 140, 23–35. [Google Scholar] [CrossRef]
  17. Nashait, A.F.; Jasim, O.; Ismail, M.; Saad, F. Integrating various satellite images for identification of the water bodies through using machine learning: A case study of Salah Adin, Iraq. In Proceedings of the IOP Conference Series: Materials Science and Engineering; IOP Publishing: Bristol, UK, 2020; p. 012223. [Google Scholar]
  18. Tulbure, M.G.; Broich, M.; Stehman, S.V.; Kommareddy, A. Surface water extent dynamics from three decades of seasonally continuous Landsat time series at subcontinental scale in a semi-arid region. Remote Sens. Environ. 2016, 178, 142–157. [Google Scholar] [CrossRef]
  19. Paul, A.; Tripathi, D.; Dutta, D. Application and comparison of advanced supervised classifiers in extraction of water bodies from remote sensing images. Sustain. Water Resour. Manag. 2018, 4, 905–919. [Google Scholar] [CrossRef]
  20. Skakun, S. A neural network approach to flood mapping using satellite imagery. Comput. Inform. 2010, 29, 1013–1024. [Google Scholar]
  21. Aung, E.M.M.; Tint, T. Ayeyarwady river regions detection and extraction system from Google Earth imagery. In Proceedings of the 2018 IEEE International Conference on Information Communication and Signal Processing (ICICSP), Singapore, 28–30 September 2018; pp. 74–78. [Google Scholar]
  22. Qin, X.; Yang, J.; Li, P.; Sun, W. Research on water body extraction from Gaofen-3 imagery based on polarimetric decomposition and machine learning. In Proceedings of the IGARSS 2019–2019 IEEE International Geoscience and Remote Sensing Symposium, Yokohama, Japan, 28 July–2 August 2019; pp. 6903–6906. [Google Scholar]
  23. Rao, P.; Jiang, W.; Wang, X.; Chen, K. Flood disaster analysis based on MODIS data—Taking the flood in Dongting Lake area in 2017 as an example. J. Catastrophol. 2019, 34, 203–207. [Google Scholar]
  24. Schumann, G.J.-P.; Moller, D.K. Microwave remote sensing of flood inundation. Phys. Chem. Earth Parts A/B/C 2015, 83, 84–95. [Google Scholar] [CrossRef]
  25. Niedermeier, A.; Lehner, S.; van der Sanden, J. Monitoring big river estuaries using SAR images. In Proceedings of the IGARSS 2001. Scanning the Present and Resolving the Future. Proceedings. IEEE 2001 International Geoscience and Remote Sensing Symposium (Cat. No. 01CH37217), Sydney, NSW, Australia, 9–13 July 2001; pp. 1756–1758. [Google Scholar]
  26. Tan, Q.; Liu, Z.; Fu, Z.; Hu, J. Lake shoreline detection and tracing in SAR images using wavelet transform and ACM method. In Proceedings of the Proceedings. 2005 IEEE International Geoscience and Remote Sensing Symposium, 2005. IGARSS’05, Seoul, Republic of Korea, 29 July 2005; pp. 3703–3706. [Google Scholar]
  27. Su, L.; Li, Z.; Gao, F.; Yu, M. A review of remote sensing image water extraction. Remote Sens. Land Resour. 2021, 33, 9–19. [Google Scholar]
  28. Gens, R. Remote sensing of coastlines: Detection, extraction and monitoring. Int. J. Remote Sens. 2010, 31, 1819–1836. [Google Scholar] [CrossRef]
  29. Wang, J.; Wang, L.; Feng, S.; Peng, B.; Huang, L.; Fatholahi, S.N.; Tang, L.; Li, J. An overview of shoreline mapping by using airborne LiDAR. Remote Sens. 2023, 15, 253. [Google Scholar] [CrossRef]
  30. Stockdon, H.F.; Sallenger, A.H., Jr.; List, J.H.; Holman, R.A. Estimation of shoreline position and change using airborne topographic lidar data. J. Coast. Res. 2002, 18, 502–513. [Google Scholar]
  31. Jiang, H. Coastline Extraction and Property Identification Based on LiDAR; University of Information Engineering, Strategic Support Forces: Zhengzhou, China, 2020. [Google Scholar]
  32. Yu, C.; Wang, J.; Bao, J.; Xu, J.; Chen, H. A binary image optimization method of extracting coastline based on LiDAR data. J. Geomat. Sci. Technol. 2015, 32, 187–191. [Google Scholar]
  33. Zhao, X.; Wang, X.; Zhao, J.; Zhou, F. Water–land classification using three-dimensional point cloud data of airborne LiDAR bathymetry based on elevation threshold intervals. J. Appl. Remote Sens. 2019, 13, 034511. [Google Scholar] [CrossRef]
  34. Liang, G.; Zhao, X.; Zhao, J.; Zhou, F. MVCNN: A Deep Learning-Based Ocean–Land Waveform Classification Network for Single-Wavelength LiDAR Bathymetry. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2022, 16, 656–674. [Google Scholar] [CrossRef]
  35. Zhao, X.; Wang, X.; Zhao, J.; Zhou, F. An improved water-land discriminator using laser waveform amplitudes and point cloud elevations of airborne LIDAR. J. Coast. Res. 2021, 37, 1158–1172. [Google Scholar] [CrossRef]
  36. Jianhu, Z.; Yongzhong, O.; Aixue, W. Status and development tendency for seafloor terrain measurement technology. Acta Geod. Et Cartogr. Sin. 2017, 46, 1786. [Google Scholar]
  37. Li, R.; Wang, C.; Su, G.-Z.; Zhang, K.; Tang, L.; Li, C. Development and applications of spaceborne LiDAR. Sci. Technol. Rev. 2007, 25, 58–63. [Google Scholar]
  38. Armon, M.; Dente, E.; Shmilovitz, Y.; Mushkin, A.; Cohen, T.J.; Morin, E.; Enzel, Y. Determining bathymetry of shallow and ephemeral desert lakes using satellite imagery and altimetry. Geophys. Res. Lett. 2020, 47, e2020GL087367. [Google Scholar] [CrossRef]
  39. Xie, J.; Li, B.; Jiao, H.; Zhou, Q.; Mei, Y.; Xie, D.; Wu, Y.; Sun, X.; Fu, Y. Water level change monitoring based on a new denoising algorithm using data from Landsat and ICESat-2: A case study of Miyun Reservoir in Beijing. Remote Sens. 2022, 14, 4344. [Google Scholar] [CrossRef]
  40. Dong, J.; Ni, W.; Zhang, Z.; Sun, G. Evaluation of the effect of ICESat-2 vegetation canopy height and surface elevation data products for forest height extraction. J. Remote Sens. 2021, 25, 1294–1307. [Google Scholar]
  41. Huang, X.; Cheng, F.; Wang, J.; Duan, P.; Wang, J. Forest Canopy Height Extraction Method Based on ICESat-2/ATLAS Data. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5700814. [Google Scholar] [CrossRef]
  42. Liu, W.; Jin, T.; Li, J.; Jiang, W. Adaptive clustering-based method for ICESat-2 sea ice retrieval. IEEE Trans. Geosci. Remote Sens. 2023, 61, 4301814. [Google Scholar] [CrossRef]
  43. Brunt, K.; Smith, B.; Sutterley, T.; Kurtz, N.; Neumann, T. Comparisons of satellite and airborne altimetry with ground-based data from the interior of the Antarctic ice sheet. Geophys. Res. Lett. 2021, 48, e2020GL090572. [Google Scholar] [CrossRef]
  44. Liu, C.; Qi, J.; Li, J.; Tang, Q.; Xu, W.; Zhou, X.; Meng, W. Accurate refraction correction—Assisted bathymetric inversion using ICESat-2 and multispectral data. Remote Sens. 2021, 13, 4355. [Google Scholar] [CrossRef]
  45. Ma, Y.; Xu, N.; Liu, Z.; Yang, B.; Yang, F.; Wang, X.H.; Li, S. Satellite-derived bathymetry using the ICESat-2 lidar and Sentinel-2 imagery datasets. Remote Sens. Environ. 2020, 250, 112047. [Google Scholar] [CrossRef]
  46. Parrish, C.E.; Magruder, L.A.; Neuenschwander, A.L.; Forfinski-Sarkozi, N.; Alonzo, M.; Jasinski, M. Validation of ICESat-2 ATLAS bathymetry and analysis of ATLAS’s bathymetric mapping performance. Remote Sens. 2019, 11, 1634. [Google Scholar] [CrossRef]
  47. Wen, Z.; Tang, X.; Ai, B.; Yang, F.; Li, G.; Mo, F.; Zhang, X.; Yao, J. A new extraction and grading method for underwater topographic photons of photon-counting LiDAR with different observation conditions. Int. J. Digit. Earth 2024, 17, 1–30. [Google Scholar] [CrossRef]
  48. Ye, J.; Qiang, Y.; Zhang, R.; Liu, X.; Deng, Y.; Zhang, J. High-precision digital surface model extraction from satellite stereo images fused with ICESat-2 data. Remote Sens. 2021, 14, 142. [Google Scholar] [CrossRef]
  49. Klotz, B.W.; Neuenschwander, A.; Magruder, L.A. High-resolution ocean wave and wind characteristics determined by the ICESat-2 land surface algorithm. Geophys. Res. Lett. 2020, 47, e2019GL085907. [Google Scholar] [CrossRef]
  50. Ryan, J.C.; Smith, L.C.; Cooley, S.W.; Pitcher, L.H.; Pavelsky, T.M. Global characterization of inland water reservoirs using ICESat-2 altimetry and climate reanalysis. Geophys. Res. Lett. 2020, 47, e2020GL088543. [Google Scholar] [CrossRef]
  51. Li, Y.; Gao, H.; Jasinski, M.F.; Zhang, S.; Stoll, J.D. Deriving high-resolution reservoir bathymetry from ICESat-2 prototype photon-counting lidar and landsat imagery. IEEE Trans. Geosci. Remote Sens. 2019, 57, 7883–7893. [Google Scholar] [CrossRef]
  52. Xu, N.; Zheng, H.; Ma, Y.; Yang, J.; Liu, X.; Wang, X. Global estimation and assessment of monthly lake/reservoir water level changes using ICESat-2 ATL13 products. Remote Sens. 2021, 13, 2744. [Google Scholar] [CrossRef]
  53. Liu, C.; Hu, R.; Wang, Y.; Lin, H.; Zeng, H.; Wu, D.; Liu, Z.; Dai, Y.; Song, X.; Shao, C. Monitoring water level and volume changes of lakes and reservoirs in the Yellow River Basin using ICESat-2 laser altimetry and Google Earth Engine. J. Hydro-Environ. Res. 2022, 44, 53–64. [Google Scholar] [CrossRef]
  54. Markus, T.; Neumann, T.; Martino, A.; Abdalati, W.; Brunt, K.; Csatho, B.; Farrell, S.; Fricker, H.; Gardner, A.; Harding, D. The Ice, Cloud, and land Elevation Satellite-2 (ICESat-2): Science requirements, concept, and implementation. Remote Sens. Environ. 2017, 190, 260–273. [Google Scholar] [CrossRef]
  55. Neumann, T.A.; Martino, A.J.; Markus, T.; Bae, S.; Bock, M.R.; Brenner, A.C.; Brunt, K.M.; Cavanaugh, J.; Fernandes, S.T.; Hancock, D.W. The Ice, Cloud, and Land Elevation Satellite–2 Mission: A global geolocated photon product derived from the advanced topographic laser altimeter system. Remote Sens. Environ. 2019, 233, 111325. [Google Scholar] [CrossRef]
  56. Herzfeld, U.C.; McDonald, B.W.; Wallin, B.F.; Neumann, T.A.; Markus, T.; Brenner, A.; Field, C. Algorithm for detection of ground and canopy cover in micropulse photon-counting lidar altimeter data in preparation for the ICESat-2 mission. IEEE Trans. Geosci. Remote Sens. 2013, 52, 2109–2125. [Google Scholar] [CrossRef]
  57. Re, X.; Wang, G.; Wang, Y.; Luo, J.; Ma, Y. Geomorphological characteristics of the coexistence area of crescent dunes and parabolic dunes in the western Hunshandake Sandy Land. Arid Zone Res./Ganhanqu Yanjiu 2023, 40, 2016–2030. [Google Scholar]
  58. Yang, J.; Zheng, H.; Ma, Y.; Zhao, P.; Zhou, H.; Li, S.; Wang, X.H. Background noise model of spaceborne photon-counting lidars over oceans and aerosol optical depth retrieval from ICESat-2 noise data. Remote Sens. Environ. 2023, 299, 113858. [Google Scholar] [CrossRef]
  59. Tang, L.; Yang, H.; Zhang, H. The application of the Kernel Density Estimates in predicting VaR. Math. Pract. Underst 2005, 35, 31–37. [Google Scholar]
  60. Seeger, M. Gaussian processes for machine learning. Int. J. Neural Syst. 2004, 14, 69–106. [Google Scholar] [CrossRef] [PubMed]
  61. Harpole, J.K.; Woods, C.M.; Rodebaugh, T.L.; Levinson, C.A.; Lenze, E.J. How bandwidth selection algorithms impact exploratory data analysis using kernel density estimation. Psychol. Methods 2014, 19, 428. [Google Scholar] [CrossRef] [PubMed]
  62. Silverman, B.W. Density Estimation for Statistics and Data Analysis; Routledge: Oxfordshire, UK, 2018. [Google Scholar]
  63. Neuenschwander, A.L.; Pitts, K.; Jelley, B.; Robbins, J.; Klotz, B.; Popescu, S.C.; Nelson, R.F.; Harding, D.; Pederson, D.; Sheridan, R. ATLAS/ICESat-2 L3A Land and Vegetation Height, Version 3; NASA National Snow and Ice Data Center Distributed Active Archive Center: Boulder, CO, USA, 2021.
  64. Gwenzi, D.; Lefsky, M.A.; Suchdeo, V.P.; Harding, D.J. Prospects of the ICESat-2 laser altimetry mission for savanna ecosystem structural studies based on airborne simulation data. ISPRS J. Photogramm. Remote Sens. 2016, 118, 68–82. [Google Scholar] [CrossRef]
  65. Ma, Y.; Xu, N.; Sun, J.; Wang, X.H.; Yang, F.; Li, S. Estimating water levels and volumes of lakes dated back to the 1980s using Landsat imagery and photon-counting lidar datasets. Remote Sens. Environ. 2019, 232, 111287. [Google Scholar] [CrossRef]
Figure 1. Comparison of the RANSAC model for SLC (a) and RLC (b).
Figure 2. Distribution of study areas and photon data. (c,d) Overall maps of the study areas; (a,b,e,f) show study areas A, B, C, and D, respectively. The red lines represent the ICESat-2 ground tracks.
Figure 3. Original photon point cloud data and denoised results. (a) Original data of study area A; (b) denoised photon point cloud; (c) window division results of photon point cloud; (d) original photon distribution result and elevation density distribution histogram in the magenta window of (c).
Figure 4. Butterworth low-pass filtering results (a); elevation STD of the filtered photons computed with a sliding window (b).
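For readers who want to reproduce the two steps summarized in Figure 4, the sketch below illustrates a Butterworth low-pass filter applied to the photon elevations, followed by a sliding-window standard deviation. It is a minimal illustration only: the cutoff frequency, filter order, and window size are placeholder values, and `elevation` is assumed to hold the denoised signal-photon elevations ordered along the track; none of these are the parameters used in this study.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def lowpass_and_window_std(elevation, cutoff=0.05, order=4, window=50):
    """Low-pass filter photon elevations (Butterworth, zero phase), then compute
    the elevation standard deviation in a sliding window.
    cutoff is a normalized frequency (1 = Nyquist); all values are illustrative."""
    b, a = butter(order, cutoff, btype="low")
    smoothed = filtfilt(b, a, elevation)          # removes high-frequency fluctuations

    half = window // 2
    std = np.array([smoothed[max(0, i - half): i + half + 1].std()
                    for i in range(len(smoothed))])
    return smoothed, std
```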
Figure 5. Histogram of the elevation STD distribution, PDF curve, and threshold for each study area. (a) Results for study area A; (b) results for study area B (part of the curve); (c) results for study area C; (d) results for study area D.
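Figure 5 and Table 5 indicate that the classification threshold is taken at the first peak of the Gaussian-kernel probability density function (PDF) of the window STDs. A minimal sketch of that threshold selection is shown below; the grid resolution is an assumption, and the default (Scott) bandwidth of `scipy.stats.gaussian_kde` is used rather than the bandwidth chosen in the paper.

```python
import numpy as np
from scipy.stats import gaussian_kde
from scipy.signal import find_peaks

def first_peak_threshold(window_std, grid_points=1000):
    """Gaussian kernel density estimate of the window STDs; the STD value at the
    first peak of the PDF curve is returned as the classification threshold."""
    kde = gaussian_kde(window_std)                          # default Scott bandwidth
    grid = np.linspace(window_std.min(), window_std.max(), grid_points)
    pdf = kde(grid)
    peaks, _ = find_peaks(pdf)                              # indices of local maxima
    return grid[peaks[0]]

# In this method, low window STDs correspond to the flat water surface and
# high window STDs to the rougher land profile.
```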
Figure 6. RLC results based on the elevation STD. (a) Elevation STD and threshold; (b) high- and low-STD results.
Figure 7. Photon classification results. (a) Preliminary classification results; (b) DBSCAN-corrected results.
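Figure 7b shows the preliminary classification after a DBSCAN-based correction. The snippet below sketches one plausible form of such a correction, relabeling isolated "water" photons that DBSCAN marks as noise; the `eps` and `min_samples` values and the 0/1 label coding are assumptions for illustration, not the settings used by the authors.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def dbscan_correction(along_track, elevation, labels, eps=10.0, min_samples=5):
    """Re-examine photons preliminarily labeled as water (label 1): points that
    DBSCAN flags as noise are treated as misclassified and relabeled as land (0).
    eps/min_samples are illustrative placeholders."""
    corrected = labels.copy()
    water_idx = np.flatnonzero(labels == 1)
    if len(water_idx) >= min_samples:
        pts = np.column_stack([along_track[water_idx], elevation[water_idx]])
        cluster_ids = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(pts)
        corrected[water_idx[cluster_ids == -1]] = 0      # -1 = DBSCAN noise
    return corrected
```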
Figure 8. Comparison of the results of the three classification methods: the method of this study, the improved RANSAC algorithm, and the combined RANSAC and DBSCAN algorithms. (a) Final photon classification results of study area A; (b) final photon classification results of study area B; (c) final photon classification results of study area C; (d) final photon classification results of study area D.
Figure 9. Overlay of classified photons and local magnification results for study dataset A. (a) Classification results of this study; (b) classification results of the improved RANSAC algorithm; (c) classification results of the RANSAC + DBSCAN algorithms; (d) local magnification of the higher section using the proposed method; (e) local magnification of the higher section using the improved RANSAC algorithm; (f) local magnification of the higher section using the RANSAC + DBSCAN algorithms; (g) local magnification of the lower section using the proposed method; (h) local magnification of the lower section using the improved RANSAC algorithm; (i) local magnification of the lower section using the RANSAC + DBSCAN algorithms.
Figure 10. Overlay of classified photons and local magnification results for study dataset B. (a) Classification results of this study; (b) classification results of the improved RANSAC algorithm; (c) classification results of the RANSAC + DBSCAN algorithms; (d) local magnification of the higher section using the proposed method; (e) local magnification of the higher section using the improved RANSAC algorithm; (f) local magnification of the higher section using the RANSAC + DBSCAN algorithms.
Figure 11. Overlay of classified photons and local magnification results for study dataset C. (a) Classification results of this study; (b) classification results of the improved RANSAC algorithm; (c) classification results of the RANSAC + DBSCAN algorithms; (d) local magnification of the higher section using the proposed method; (e) local magnification of the higher section using the improved RANSAC algorithm; (f) local magnification of the higher section using the RANSAC + DBSCAN algorithms.
Figure 12. Correspondence between narrow rivers in research dataset D and water points in the photon distribution map. (a) Locally enlarged image corresponding to dataset D; (b) photon distribution along the ICESat-2 track.
Figure 13. Overlay of classified photons and local magnification results for study dataset D. (a) Classification results of this study; (b) classification results of the improved RANSAC algorithm; (c) classification results of the RANSAC + DBSCAN algorithms; (d) local magnification of the higher section using the proposed method; (e) local magnification of the higher section using the improved RANSAC algorithm; (f) local magnification of the higher section using the RANSAC + DBSCAN algorithms.
Figure 14. Variation in the classification accuracy of the proposed photon dispersion algorithm with the W/L ratio.
Figure 15. Sensitivity analysis of the classification accuracy with respect to the cutoff frequency.
Figure 16. Sensitivity analysis of the classification accuracy with respect to the window size.
Figure 17. First trough and second peak threshold selection and results. (a) First trough threshold of the PDF curve of dataset A; (b) photon classification results of the first trough threshold; (c) second peak threshold of the PDF curve of dataset A; (d) photon classification results of the second peak threshold.
Figure 18. The classification results of study areas A, B, C, and D when the first peak, first trough, and second peak are selected as the classification thresholds. (a) OA results; (b) KC results.
Figure 19. Selection of long-tail thresholds and classification results of PDF curves for datasets C and D. (a) Long-tail threshold of the PDF curve of dataset C; (b) photon classification results of the long-tail threshold of dataset C; (c) long-tail threshold of the PDF curve of dataset D; (d) photon classification results of the long-tail threshold of dataset D.
Table 1. Classification photon confusion matrix.

| Classification | Actual Water Photons | Actual Land Photons |
| --- | --- | --- |
| Classified water photons | TP | FP |
| Classified land photons | FN | TN |
Table 2. Statistical index results of the three classification methods.

| Research Dataset | Classification Method | TP | TN | FP | FN | All |
| --- | --- | --- | --- | --- | --- | --- |
| A | The proposed photon dispersion algorithm | 4733 | 46,652 | 22 | 163 | 51,570 |
| A | The improved RANSAC algorithm | 4701 | 38,775 | 7899 | 195 | 51,570 |
| A | The RANSAC + DBSCAN algorithms | 4701 | 46,275 | 399 | 195 | 51,570 |
| B | The proposed photon dispersion algorithm | 21,239 | 53,242 | 20 | 933 | 75,434 |
| B | The improved RANSAC algorithm | 20,606 | 36,268 | 16,994 | 1566 | 75,434 |
| B | The RANSAC + DBSCAN algorithms | 20,179 | 51,116 | 2146 | 1993 | 75,434 |
| C | The proposed photon dispersion algorithm | 66,927 | 31,900 | 1143 | 218 | 100,188 |
| C | The improved RANSAC algorithm | 63,550 | 26,913 | 6130 | 3595 | 100,188 |
| C | The RANSAC + DBSCAN algorithms | 57,426 | 32,179 | 864 | 9719 | 100,188 |
| D | The proposed photon dispersion algorithm | 12,923 | 22,010 | 45 | 36 | 35,014 |
| D | The improved RANSAC algorithm | 12,904 | 17,237 | 4818 | 55 | 35,014 |
| D | The RANSAC + DBSCAN algorithms | 12,904 | 17,670 | 4385 | 55 | 35,014 |
Table 3. Accuracy evaluation results of the three classification methods.

| Classification Method | OA (A) | Kappa (A) | OA (B) | Kappa (B) | OA (C) | Kappa (C) | OA (D) | Kappa (D) | OA (Mean) | Kappa (Mean) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| The proposed algorithm | 99.64% | 97.89% | 98.74% | 96.92% | 98.34% | 96.91% | 99.77% | 99.50% | 99.12% | 97.81% |
| RANSAC | 84.30% | 46.41% | 75.40% | 50.73% | 90.29% | 77.61% | 86.08% | 72.25% | 84.02% | 61.75% |
| RANSAC + DBSCAN | 98.85% | 93.42% | 94.51% | 86.81% | 89.44% | 77.63% | 87.32% | 74.56% | 92.53% | 83.11% |
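The OA and KC values in Table 3 follow from the confusion-matrix counts in Table 2 via the standard overall-accuracy and Cohen's Kappa formulas. The short check below is not taken from the paper, but applied to dataset A of the proposed algorithm (TP = 4733, TN = 46,652, FP = 22, FN = 163) it reproduces the reported 99.64% and 97.89%.

```python
def oa_and_kappa(tp, tn, fp, fn):
    """Overall accuracy and Cohen's Kappa from confusion-matrix counts."""
    total = tp + tn + fp + fn
    oa = (tp + tn) / total                                        # observed agreement
    # chance agreement from the row/column marginals
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    return oa, (oa - pe) / (1 - pe)

print(oa_and_kappa(4733, 46652, 22, 163))   # ≈ (0.9964, 0.9789): dataset A, proposed method
```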
Table 4. Relevant information on river water body sizes in the research dataset.

| Research Dataset | Total Length along Track (m) | Water Bodies Length (m) | Land Length (m) | Water Bodies Percentage (%) | W/L Ratio (%) | Number of Water Bodies |
| --- | --- | --- | --- | --- | --- | --- |
| A | 8575 | 1550 | 7025 | 0.181 | 0.221 | 2 |
| B | 9010 | 3190 | 5820 | 0.354 | 0.548 | 4 |
| C | 63,810 | 34,235 | 29,575 | 0.537 | 1.158 | 8 |
| D | 11,300 | 630 | 10,670 | 0.056 | 0.059 | 1 |
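The last three numeric columns of Table 4 are simple ratios of the listed along-track lengths (note that, despite the (%) in the headers, the values are reported as fractions). For example, for dataset C, 34,235 m of water over a 63,810 m track gives 0.537, and water over land gives 1.158, as the small check below confirms.

```python
water, land = 34_235, 29_575                 # dataset C along-track lengths in meters (Table 4)
total = water + land                         # 63,810 m
print(round(water / total, 3), round(water / land, 3))   # 0.537 1.158
```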
Table 5. Statistical parameters and accuracy evaluation results when the first peak, first trough, and second peak are used as classification thresholds.

| Research Dataset | Threshold | W/L Ratio (%) | Long Tail | OA (%) | KC (%) |
| --- | --- | --- | --- | --- | --- |
| A | First peak | 0.221 | No | 99.64 | 97.89 |
| A | First trough | 0.221 | No | 94.44 | 73.52 |
| A | Second peak | 0.221 | No | 17.10 | 1.71 |
| B | First peak | 0.548 | No | 98.74 | 96.92 |
| B | First trough | 0.548 | No | 79.18 | 57.75 |
| B | Second peak | 0.548 | No | 60.25 | 30.63 |
| C | First peak | 1.158 | Yes | 98.34 | 96.91 |
| C | First trough | 1.158 | Yes | 97.79 | 95.00 |
| C | Second peak | 1.158 | Yes | 96.75 | 92.56 |
| D | First peak | 0.059 | Yes | 99.77 | 99.50 |
| D | First trough | 0.059 | Yes | 99.26 | 98.49 |
| D | Second peak | 0.059 | Yes | 99.14 | 98.17 |
Table 6. Number of peaks and the maximum STD of the PDF curve of each dataset.

| Research Dataset | A | B | C | D |
| --- | --- | --- | --- | --- |
| Number of peaks | 5 | 5 | 15 | 13 |
| Maximum STD | 4.3 | 31.1 | 5.2 | 15.8 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
