Adaptive Determination of the Flow Accumulation Threshold for Extracting Drainage Networks from DEMs

Zhang, Wei; Li, Wenkai; Loaiciga, Hugo A.; Liu, Xiuguo; Liu, Shuya; Zheng, Shengjie; Zhang, Han

doi:10.3390/rs13112024

Open AccessArticle

Adaptive Determination of the Flow Accumulation Threshold for Extracting Drainage Networks from DEMs

¹

School of Geography and Information Engineering, China University of Geosciences (Wuhan), Wuhan 430074, China

²

Department of Geography, University of California, Santa Barbara, CA 93106, USA

³

China Petroleum Pipeline Engineering Co., Ltd., Langfang 065000, China

^*

Author to whom correspondence should be addressed.

Remote Sens. 2021, 13(11), 2024; https://doi.org/10.3390/rs13112024

Submission received: 1 April 2021 / Revised: 12 May 2021 / Accepted: 13 May 2021 / Published: 21 May 2021

(This article belongs to the Special Issue Agricultural Water Management Using Geospatial Technologies)

Download

Browse Figures

Versions Notes

Abstract

:

Selecting the flow accumulation threshold (FAT) plays a central role in extracting drainage networks from Digital Elevation Models (DEMs). This work presents the MR-AP (Multiple Regression and Adaptive Power) method for choosing suitable FAT when extracting drainage from DEMs. This work employs 36 sample sub-basins in Hubei (China) province. Firstly, topography, the normalized difference vegetation index (NDVI), and water storage change are used in building multiple regression models to calculate the drainage length. Power functions are fit to calculate the FAT of each sub-basin. Nine randomly chosen regions served as test sub-basins. The results show that: (1) water storage change and NDVI have high correlation with the drainage length, and the coefficient of determination (R²) ranges between 0.85 and 0.87; (2) the drainage length obtained from the Multiple Regression model using water storage change, NDVI, and topography as influence factors is similar to the actual drainage length, featuring a coefficient of determination (R²) equal to 0.714; (3) the MR-AP method calculates suitable FATs for each sub-basin in Hubei province, with a drainage length error equal to 5.13%. Moreover, drainage network extraction by the MR-AP method mainly depends on the water storage change and the NDVI, thus being consistent with the regional water-resources change.

Keywords:

flow accumulation threshold (FAT); multiple regression; power function; drainage networks

Graphical Abstract

1. Introduction

Water is a crucial resource for humans and the environment. Its distribution and flow characteristics have unique geographical traits. Extracting hydrologic information based on DEMs has received substantial attention in the hydrologic analysis of river networks. Classic problems in distributed hydrological modeling are the calculation of flow accumulation and the extraction of drainage networks [1,2]. A key variable in the extraction of drainage networks from DEMs is the flow accumulation threshold (FAT) [3]. The flow accumulation threshold is related to the upslope contributing area at a given point in a drainage area represented by a DEM. The FAT represents the minimum number of upslope cells flowing into a flow-receiving cell that is part of a drainage network (notice the FAT is a dimensionless variable). Applying an FAT to flow accumulation values delineates the drainage network from a DEM.

Several algorithms have been proposed for the extraction of drainage networks from DEMs, such as that of Callaghan and Mark, which applied the FAT as the basis for delineating drainage networks [4]. Several authors treat the FAT as a constant; Camara reported that the drainage networks obtained from DEM represent intermittent watercourses when the FAT equaled 20,000 m² (the areas of eight study sites equaled 1.64 km²) [5]. This finding is not supported in other settings, however. Gandolfi and Bischetti compared the drainage networks obtained with field observations in two small alpine catchments and determined that the FAT for channel initiation is not constant. The latter authors concluded that the FAT has a significant effect on many geomorphological indices, but there is a poor correlation between the FAT and local slope [6]. Helmlinger et al. argued that the FAT considerably affects the estimation of morphometric properties, and their results indicated that the channel networks extracted with constant FAT produces unrealistic fractal dimension estimates in basins with uneven distribution of slopes [7]. Lee and Kim presented graphical analysis of slope and the FAT, showing conditions in which the FAT becomes constant as a function of slope [8]. Linke and Lehner suggested the contributing drainage area associated with the FAT should exceed 10 km² or the streams may not be correctly extracted due to unreliable spatial representation [9].

There are qualitative and quantitative methods for the selection of the optimal FAT. On the one hand, the former methods evaluate various FATs and compare the extracted drainage network to the source map visually [10,11]. Experience guides the selection of the FATs in a subjective manner. Quantitative methods, on the other hand, propose formulas, such as composite functions [12], relating drainage length and density to flow accumulation. It has been proposed that the optimal FAT occurs where the second derivative of the cited formulas approaches zero [13]. However, the terrain may be such that the empirical formulas might not exhibit vanishing second derivatives, or the second derivatives may vanish at an incorrect value [14]. Vogt et al. classified drainage length in function of the climate, vegetation, slope, and other environmental parameters, and analyzed the relation between the local slope and the contributing area in determining the FAT of each type of drainage length [15]. Drainage characteristics vary with the topography, vegetation, precipitation, and other characteristics [16,17,18,19,20,21]. Bhowmik et al. employed the lateral displacements from surveyed drainage networks to select the FAT [22]. Several authors contend the FAT cannot be set uniformly and should be chosen according to regional characteristics [23]. Qin et al. proposed the ‘case-based’ approach for extracting drainage networks and estimating the FAT [24].

Drainage networks develop over geologic time through complex processes that are governed by a number of factors. However, previous studies mainly focus on the influence of geomorphologic characteristics on drainage networks, while hydrological factors are rarely considered. Moreover, current methods used to set the threshold relies more on subjective judgment and omit adaptive corrections that consider sub-basin characteristics [25,26]. Several authors have proposed multiple regression approaches for the identification of the FAT [3,27]. This work, considers topographic and vegetation characteristics to construct and evaluate multiple regressions relating the FAT to hydrological factors such as water storage change, and identifies the optimal FAT of sub-basin adaptively. China features many regions with diverse topography, Hubei province being a case in point. This paper evaluates DEM data from Hubei province to test methods for FAT identification. This paper develops the MR-AP (Multiple Regression and Adaptive Power) methods to identify the FAT. A multiple linear regression model between the drainage length (the dependent variable) and the maximum roughness, NDVI, and water storage change (the independent or explanatory variables) is developed and evaluated. The power function is employed to calculate the FAT from drainage length. The drainage length equals the total length of all the streams in a drainage area.

2. Materials

2.1. Study Region

Hubei province features diverse geomorphic characteristics. It is surrounded by mountains in its western region, features plains in the south-central region and hilly topography towards its eastern portion. Rivers in Hubei province form part of the Yangtze River network. The Yangtze River network was formed by geologic processes, influenced by climate, vegetation, and topography. Mountains generate convection and frontal heavy rain producing runoff that carves the river network through long-term erosion and deposition, while dense vegetation influences erosion [28,29]. This study’s data stem from 36 regions employed for model development and 9 test areas in Hubei province, which are shown in Figure 1. These regions are sub-basins each with an area of more than 400 km² and they encompass most of the western and northern portions of Hubei province.

2.2. Topographic Data

The digital elevation model (DEM) of the study area was collected from the Advanced Spaceborne Thermal Emission and Reflection Radiometer Global Digital Elevation Model (Aster GDEM), having a 30-m spatial resolution. The Version 3 data have employed cloud masking to avoid cloud-contaminated pixels [30], which were downloaded from the Earthdata website (https://doi.org/10.5067/ASTER/ASTGTM.003, accessed on 21 November 2019).

The surface roughness (r) of the study region is defined as the ratio of the area of the actual surface to that of a smooth surface having the same geometric shape and dimensions. Equations (1) and (2) yield the surface roughness [31]:

\cos \frac{SOD * Π}{180} = \frac{geometric surface}{actual surface},

(1)

r = \frac{actual surface}{geometric surface},

(2)

where SOD denotes the slope of the DEM’s grid expressed as an angle in radians, geometric surface denotes the smooth surface area of the gird, and actual surface means the actual surface area of the grid.

2.3. Vegetation Index Data

The Normalized Difference Vegetation Index (NDVI) from June through August 2018 was determined from the Moderate-resolution Imaging Spectroradiometer (MODIS) Vegetation Indices (MOD13A3) Version 6 data, which were downloaded from the Earthdata website (https://doi.org/10.5067/MODIS/MOD13A3.006, accessed on 21 November 2019). These data complements National Oceanic and Atmospheric Administration (NOAA)’s Advanced Very High Resolution Radiometer (AVHRR) NDVI products and provides continuity for a time series’ historical applications [32]. The NDVI were resampled to have the same spatial resolution as the DEM with the IDW (Inverse Distance Weighted) method.

2.4. Water Storage Change from Precipitation, Runoff, and Evapotranspiration

The GPM_3IMERGM 06 precipitation data with a spatial resolution of 0.1° during the period of June to August in 2018 were obtained from the Goddard Earth Sciences Data and Information Services Center (GES DISC) website (Huffman, G.J., E.F. Stocker, D.T. Bolvin, E.J. Nelkin, Jackson Tan. https://doi.org/10.5067/GPM/IMERG/3B-MONTH/06, accessed on 21 November 2019). The Global Precipitation Measurement (GPM) satellite launched on 28 February 2014, which is the successor of the Tropical Rainfall Measuring Mission (TRMM) and has made three critical improvements, that is, improving the spatial resolution of precipitation products from 0.25° into 0.1°, providing more coverage (covering 60°N−60°S), and adding sensitivity to liquid and solid precipitation [33,34].

NASA’s GlobFigureal Land Data Assimilation System (GLDAS) generates optimal fields of land surface states and fluxes by blending satellite- and ground-based observational data products using advanced land surface modeling and data assimilation techniques [35]. GLDAS-2.1 started on 1 January 2000 with a spatial resolution of 0.25°. GLDAS_NOAH025_M_2.1 products (including Evapotranspiration (ET) and Storm surface runoff (R)) covering this study area were used in this study. All of the data were downloaded from the GES DISC website (Beaudoing, H. and M. Rodell. https://doi.org/10.5067/SXAVCZFAQLNO, accessed on 21 November 2019) [36].

Firstly, the precipitation, surface runoff, and evapotranspiration were resampled to 30 m spatial resolution with the IDW method. Secondly, the change in water storage (ΔS) in any time interval can be calculated by the precipitation (P), surface runoff (R), and evapotranspiration (ET) occurring in an interval i according to the following function [37]:

Δ S_{i} = P_{i} - R_{i} - {ET}_{i},

(3)

2.5. Actual Drainage Networks Data

The OpenStreetMap (OSM) is a free map that can be created, edited, and updated by volunteers globally. In some urban areas, the quantity of networks already surpasses that of commercial or governmental datasets [38]. The actual drainage networks were obtained from the gis_osm_waterways_free_1 datasets corresponding to China. All of the data were downloaded from the OpenStreetMap website (https://download.geofabrik.de/asia/china-latest-free.shp.zip, accessed on 21 November 2019). They were employed in this work in the determination of the length of the actual drainage networks and the optimal determination of the FAT.

3. Methods

3.1. Drainage Networks Extraction

This paper’s method for drainage network extraction applies the Deterministic eight-node (D8) algorithm of the hydrology module of ArcGIS, which consists of the following four steps: pit-filling, calculation of flow direction, calculation of flow accumulation grids, and drainage network extraction [4]. The specification of the FAT is required for drainage network extraction.

The D8 algorithm specifies each cell has its own flow direction, and the extraction of the drainage network conforms to this specification. This paper employs catchments whose drainage networks are linear and applies the drainage length method for the determination of the optimal FAT. The drainage length equation is calculated as follows:

L = \sum_{i = 1}^{N} l_{i},

(4)

where L denotes the drainage length and l_i represents the length of the i-th channel, where i= 1, 2, …, N [34].

3.2. Multiple Regression and Adaptive Power (MR-AP) Method

The Multiple Regression and Adaptive Power (MR-AP) method herein proposed resorts to a sub-basin’s topography, the NDVI, and water storage change (ΔS) as explanatory variables to accurately identify the FAT. The MR-AP method features two parts: multiple regression fitting to estimate drainage network length, and fitting of a power function between the FAT and drainage length to each sub-basin. The drainage network extracted with the MR-AP method is called the MR-AP network.

A workflow diagram of the MR-AP method is displayed in Figure 2.

3.2.1. Multiple Stepwise Regression Fitting

The rationale for employing topography, NDVI, and water storage change as explanatory variables in drainage network extraction is founded on physical principles. Surface runoff derives from precipitation: the larger the precipitation, the larger the runoff, which has erosive power whether as overland flow or channelized [39]. Erosion is the beginning stage of drainage network formation. The topography is a key erosion factor: terrain with small surface roughness is less susceptible to erosion by runoff than terrain with large surface roughness, all other factors being equal. Vegetation, including forests and grasslands, on the other hand, increases the surficial shear capacity of soil or rock in which its roots are embedded and provides cover to the soil against a detachment of soil particles by falling rainfall. In other words, vegetation reduces the erosive power of runoff, besides increasing the water holding capacity of the soil and rock. These considerations motivate an exploratory analysis to ascertain the predictive skill of explanatory variables for the drainage length. This study applies the Pearson correlation analysis to test the correlation between drainage length and explanatory variables. Multiple regression employs topography, NDVI, and water storage change as the explanatory variables of drainage length of the 36 sample catchments. The fitted multiple regression is presented in the Results section.

3.2.2. Power Function Fitting for Each Sub-Basin

The FAT values selected as locally suitable decrease with increasing drainage length. Several authors have reported the FAT and drainage length are well related through power functions [14,40]. Likewise, this paper fits power functions in each sub-basin, as expressed by Equation (5):

L = a + {b * FAT}^{α},

(5)

where L means the drainage length, a, b, and α denote the coefficients of the power function, and FAT denotes the flow accumulation threshold.

A fraction equal to 1% of the basin area was applied as the initial FAT. The D8 algorithm was employed to calculate the drainage length corresponding to each initial FAT. Equation (3) of each sub-basin was fitted with the initial FAT and its corresponding drainage length. Next, the drainage length calculated by MR was applied to calculate the optimal FAT. This was followed by the extraction of the drainage network from the DEM in each sub-basin based on the optimal FAT. The drainage network extracted in this way herein is called the MR-AP drainage network.

3.3. Validation

The MR-AP method builds the multiple linear regression model based on the 36 sample sub-basins, then fits power functions to each sub-basin’s data. The MR-AP method was subsequently evaluated with nine test sub-basins. The drainage length error was calculated by comparing extracted lengths with the actual drainage river length.

Analysts evaluate the quality of the extracted networks by comparing them with networks drawn on maps, represented in remotely sensed images, or represented in aerial photographs [23,41]. This paper compares the MR-AP drainage network with the actual drainage networks (graphed in Section 4.3) and the remote sensing image (graphed in Section 4.3) in evaluating the quality of the extracted network. The drainage length error is herein employed to assess the accuracy of the MR-AP drainage networks according to the following function:

\hat{l} = = \frac{L_{i} - L}{L},

(6)

where

\hat{l}

denotes the drainage length error, L_i represents the length of all the extracted river network in a sub-basin, and L denotes the length of the actual drainage network in a sub-basin. A positive (negative) value of

\hat{l}

represents an overestimation (underestimation) of the drainage length.

4. Results

4.1. Results of Drainage Length Using Multiple Regression (MR) Analysis

Many previous works have evaluated the spatial dependence of environmental parameters on the drainage length [42,43]. This study investigated the potential of using the explanatory variables cited above in building the multiple linear regression model.

The calculated Pearson correlation between actual drainage length and explanatory variables is listed in Table 1. The Pearson correlation values of the water storage change (ΔS), NDVI, and maximum surface roughness (R_max) are equal to 0.85, 0.874, and 0.425, respectively, which indicates that all explanatory variables were significantly correlated with the actual drainage length at the 0.01 significance level (two-sided). Multiple linear regression model was applied next to account for the interaction between explanatory variables in the search for optimal regressions predicting the drainage network length.

Table 2 shows results for the chosen independent variables of the MR model, namely, the maximum surface roughness (R_max), the NDVI, and the water storage change (ΔS). The constant coefficient, the coefficients associated with the water storage change, NDVI, and the maximum surface roughness were equal to −4270.479, 9015.26, 107.426, and −888.753, respectively. The R² (coefficient of determination) and adjusted R² (see Equations (A1) and (A2) in Appendix A) were 0.714 and 0.688, which means a relatively high degree of statistical association. The multiple regression equation is as follows:

L = - 4270.679 + 9015.260 \times Δ S + 107.426 \times NDVI - 888.753 \times R_{\max},

(7)

where L denotes the fitted drainage length (m), ΔS represents water storage change (mm), NDVI denotes the average NDVI, and R_max is the maximum surface roughness of the sub-basins. The drainage length calculated in Equation (7) is called the MR drainage network length.

4.2. Drainage Network Results Using Adaptive Power (AP) Analysis

The drainage length and the FATs are well related by a power function law in which the power coefficient is negative [40]. For this reason, it has been argued the optimal threshold is defined by the point at which the second derivative of a power-law fitted to the drainage length as a function of the FAT approaches zero. This paper analyzed the 45 sub-basins (including 36 sample sub-basins and 9 test sub-basins) by fitting the FATs against the drainage with power functions. The drainage lengths of all the sub-basins are obtained with the MR method and the corresponding fitted FATs were calculated subsequently.

Figure 3 shows the R² of all the sub-basins including the power functions corresponding to the best (No. 1) and worst (No. 36) sub-basins. All the sub-basins exhibit power functions between the FAT and drainage length: the range of the R² varied from 0.9341 to 0.9967. Figure 4 shows that the sub-basin No. 1 had the best R², slightly lower than 0.997 (Figure 4a), and the sub-basin No. 36 has the lowest R², slightly higher than 0.93 (Figure 4b). The average value and the standard deviation of R² of the power functions (see Equations (A3) and (A4) in Appendix A) equal 0.9795 and 0.0152, respectively, which means that the fitting of the function is accurate. It is herein established, therefore, that the FAT may be accurately estimated in terms of the calculated drainage length.

4.3. Validation on the MR-AP Drainage Networks

To further verify the reliability of the MR-AP method in quantitative and qualitative ways, the actual drainage network data from OpenStreetMap were employed as the reference to quantitatively evaluate the quality of the MR-AP drainage network (see Figure 5 and Figure 6), and to identify the exact location of the drainage network, as shown in Figure 7. Nine test sub-basins within Hubei province were randomly chosen for this evaluation, and they are depicted in Figure 1.

The MR-AP drainage network is calculated with the adaptive method depicted in Figure 2. Figure 5 compares the drainage length extracted by the MR-AP and actual drainage length, where La represents the actual drainage length and Le represents the MR-AP drainage length. The regression function is well fitted with R² (see Eqaution (A1) in Appendix A) equal to 0.9901 and correlation coefficient (CC, see Eqaution (A5) in Appendix A) equal to 0.995, which are slightly lower than 1. The root mean square error (RMSE, see Eqaution (A6) in Appendix A) equals 4736 m and the maximum absolute deviation (MAD, see Eqaution (A7) in Appendix A) equals 4250 m, while the minimum length of the drainage network among the test basins equals 51,231 m. Figure 6 depicts the degree of normality of the regression standardized residuals. In terms of the histogram (Figure 6a), the mean of the regression standardized residual is −4.19 × 10⁻¹⁵, compared with 0 representing the best fitting, and the standard deviation equals 0.935, compared with 1 representing the best fitting. The P-P Plot (Figure 6b) shows that the standardized regression residuals are normally distributed.

The similarity between the MR-AP extracted and the actual drainage networks for the nine test sub-basins is quantified with the absolute drainage length error listed in Table 3. The range of the actual drainage length varied from 51,231.77 to 195,583.04 m, with the average being equal to 75,823.67 m, and the MR-AP drainage length varied from 50,423.53 to 200,455.6 m, with the average around being 89,464.49 m; the drainage length error ranged between −6.25% and 9.87%, with the average drainage length error equal to 0.051. A negative drainage length error means an overestimation of the actual drainage by the MR-AP model, and vice versa. The miss hits, false hits, and true hits were calculated to evaluate the MR-AP drainage against actual drainage, and are listed in Table 3. The test sub-basins 3 and 8 exhibit the best and worst performances, respectively, as can be seen in Figure 7 and Figure 8.

Figure 7 and Figure 8 shows a qualitative comparison of the MR-AP drainage networks with the actual drainage networks and the remote sensing image. The actual drainage networks may have disconnected networks in some sub-basins, most of which cannot be captured by the MR-AP drainage network. The disconnected networks are always caused by human-made action, such as by canals and ditches in test 6 and test 8 in Figure 7. The OpenStreetMap drainage networks, which are created and updated by the field investigation of volunteers, may not be accurate in rough terrain and inaccessible areas. On the other hand, the MR-AP drainage networks extracted from DEMs that are generated by remote sensing methods are not limited by such terrain traits, as shown by test 3, 4, and 7 displayed in Figure 7. Despite these differences, the actual and the MR-AP drainage networks exhibit good resemblance, such as demonstrated in test basin 3 displayed in Figure 7. The MR-AP drainage networks are compared with remote sensing imagery from Google Earth Pro in Figure 8 [44]. Gullies, which are generated by runoff erosion, promote the convergence of runoff. The distribution of gullies and surface runoff exhibited a good degree of consistency, and for this reason, gullies are considered incipient drainage networks. Gullies are well represented in the MR-AP drainage networks depicted in Figure 8. More detailed inspection of the extracted river networks in test sub-basins 3, 4, and 7 can be garnered from Figure 9.

These results establish the MR-AP method is a suitable estimator of the FAT and an accurate tool for extracting accurate drainage networks.

5. Discussion

5.1. The Statistical Association between the Drainage Length and the Explanatory Variables

The test results presented in Section 4.3 establish that the calculated results based on explanatory variables (water storage change, NDVI, and maximum surface roughness) performed well compared with the OSM drainage networks, which demonstrated that those explanatory variables can be effective environmental variables for extracting drainage networks with the MR-AP method. The statistical association between the drainage length and the explanatory variables in the 45 areas was further explored. Figure 10 shows the statistical association between the drainage length derived from the OSM drainage network and the explanatory variables. It can be seen in Figure 10 that there is a significant positive statistical association between the drainage length and the water storage change and the NDVI, and there is a high correlation in these regressions. However, it is also evident in Figure 10 that there is a weak statistical association between the maximum surface roughness and the drainage length; yet, several studies have reported that the surface roughness has an effect on the drainage length [15,21]. The weak statistical association between the maximum surface roughness and the drainage length displayed in Figure 10 may be due to the fact that some of the test sub-basin (such as test 6 and test 8) are located in plain areas, and the DEM-based drainage network extraction method does not perform well in flat terrain [45,46].

5.2. Comparison between MR-AP, Mean Change-Point Analysis, and the Fitness Index Method

Change-point analysis is applied to detect an abrupt or structural change in the distributional properties of data [47], and it has been used to calculate the optimal threshold in discrete models in many studies [48]. The fitness index method is also used to calculate the optimal threshold in drainage network extraction [49]. Figure 11 depicts the drainage length error, true hits, miss hits, and false hits (see Equations (A8)–(A10) in Appendix A) of the extracted drainage networks obtained with the MR-AP model, mean change-point analysis, and the fitness index method. It can be seen in Figure 11a that the drainage extracted by MR-AP consistently features the best performance with respect to the drainage length error, with the absolute error under 10%. The fitness index method tends to underestimate the drainage length, having negative errors. The three methods exhibit opposite quality of performance with respect to the miss hits and false hits according to Figure 11b,c. For example, the MR-AP model exhibits the best performance with respect to miss hits and the worst performance with respect to false hits in many sub-basins. Figure 11d displays the true hits obtained with the three methods; the MR-AP method features the best true hits in eight sub-basins and worse performance than the mean change-point analysis only in test sub-basin 6. These results establish that the proposed MR-AP model extracts drainage networks with higher accuracy than the other two methods.

6. Conclusions

This work chose 45 sub-basins in Hubei province and evaluated three potential explanatory variables for extracting drainage networks, namely, maximum surface roughness, NDVI, and water storage change. This study proposes the MR-AP method to calculate the FAT adaptively, which employs power functions relating the FAT and drainage length in each sub-basin. This paper’s results indicate that (1) it is reasonable and accurate to employ maximum surface roughness, normalized difference vegetation index, and water storage change as explanatory variables for the drainage network length; (2) multiple regression fitting provides superior fitting of drainage network length compared to individual regressions between the drainage length and each explanatory variable; (3) the drainage length and the FATs exhibited strong correlation expressed through power functions fitted with data for Hubei province, China; (4) the MR-AP method was shown to be accurate in estimating the FAT and for extracting drainage networks. However, drainage networks are created by complex processes that involve a number of factors. Future research should consider other factors such as temperature, humidity, soil type, and surface geologic features as possible explanatory variables in the extraction of drainage networks. Multiple regression models require a linear statistical association between the dependent and explanatory variables. Nonlinear fitting models could also be considered in future studies. The morphology of extracted drainage networks depends on the calculation method for flow direction. Multi-flow algorithms such as the D-infinity method could be considered instead of the single-flow D8 algorithm in future works.

Author Contributions

W.Z. proposed the theme and method of the article, conducted the initial analysis, wrote the original draft, and provided the financial support for the project; W.L. implemented the research methods, performed the experimental analysis, and participated in writing of the original draft; H.A.L. provided a critical review, data interpretation, improved the presentation of the results, and English edited the paper through several revision stages; X.L. took part in the data analysis and research; S.L. contributed to programming; S.Z. participated in the research process; H.Z. took part in the experimental design. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (Grant No. 41501584).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data available in a publicly accessible repository that does not issue DOIs. Publicly available datasets were analyzed in this study. The Topographic Data can be found in https://doi.org/10.5067/ASTER/ASTGTM.003, accessed on 21 November 2019. The Vegetation index data can be found in https://doi.org/10.5067/MODIS/MOD13A3.006, accessed on 21 November 2019. The Water Storage Change from Precipitation, Runoff, and Evapotranspiration can be found in https://doi.org/10.5067/GPM/IMERG/3B-MONTH/06, accessed on 21 November 2019 and https://doi.org/10.5067/SXAVCZFAQLNO, accessed on 21 November 2019. Actual Drainage Networks Data can be found in https://download.geofabrik.de/asia/china-latest-free.shp.zip, accessed on 21 November 2019.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

In Table 2, the R² and Modified R² represents the coefficient of determination and the adjusted coefficient of determination of the MR model, respectively, which are calculated with the following equations:

R^{2} = \frac{\sum {(\hat{L} - \bar{L})}^{2}}{\sum {(L - \bar{L})}^{2}},

(A1)

{Adjusted R}^{2} = 1 - \frac{(1 - R^{2}) (N - 1)}{(N - k - 1)},

(A2)

where

L

,

\bar{L}

, and

\hat{L}

denote the actual, average, and predicted drainage length, respectively, N denotes the number of the sub-basins, and k denotes the number of independent variables.

In Figure 3 the Mean and StdDev represents the mean and the standard deviation of the R², respectively, which are calculated with the following equations:

Mean = \frac{\sum_{i = 1}^{N} R^{2}_{i}}{N},

(A3)

Std . Dev . = \sqrt{\frac{\sum_{i = 1}^{N} {(R^{2}_{i} - Mean)}^{2}}{N}},

(A4)

where

R^{2}_{i}

denotes the R² of the i-th sub-basin and N denotes the total number of the sub-basins.

In Figure 5 La represents the actual drainage length (La, m) and Le represents the drainage length extracted by MR-AP (Le, m). CC, RMSE, and MAD represent the correlation coefficient, root mean squared error, and median absolute deviation between La and Le, respectively, which are calculated with the following equations:

CC = \frac{\sum_{i = 1}^{n} ({La}_{i} - \bar{La}) ({Le}_{i} - \bar{Le})}{\sqrt{\sum_{i = 1}^{n} {({La}_{i} - \bar{La})}^{2} \sum_{i = 1}^{n} {({Le}_{i} - \bar{Le})}^{2}}},

(A5)

RMSE = \sqrt{\frac{\sum_{i = 1}^{n} {({La}_{i} - {Le}_{i})}^{2}}{n}},

(A6)

MAD = \frac{\sum_{i = 1}^{n} | {La}_{i} - {Le}_{i} |}{n},

(A7)

where

{La}_{i}

and

{Le}_{i}

denote the actual drainage length and the drainage length extracted with the MR-AP method in the i-th test sub-basin, respectively,

\bar{La}

and

\bar{Le}

denote the average La and Le in all test sub-basins, and n denotes the total number of test sub-basins.

In Table 3 and Figure 11, the true hits, miss hits, and false hits represent the truly or correctly extracted, mis-extracted or failed to be extracted, and falsely or incorrectly extracted drainage networks when compared to the actual drainage networks, respectively, which are calculated with the following equations:

true hits = \frac{L_{true}}{L},

(A8)

miss hits = \frac{L - L_{true}}{L},

(A9)

false hits = \frac{\hat{L} - L_{true}}{L},

(A10)

where

L

,

L_{true}

, and

\hat{L}

denote the actual, true, and predicted drainage lengths respectively.

References

Arnold, N. A new approach for dealing with depressions in digital elevation models when calculating flow accumulation values. Prog. Phys. Geogr. 2010, 34, 781–809. [Google Scholar] [CrossRef]
Qin, C.Z.; Zhan, L.J.; Zhu, A.X.; Zhou, C.H. A strategy for raster-based geocomputation under different parallel computing platforms. Int. J. Geogr. Inf. Sci. 2014, 28, 2127–2144. [Google Scholar] [CrossRef]
Martinez-Casasnovas, J.A.; Stuiver, H.J. Automated delineation of drainage networks and elementary catchments from digital elevation models. Int. J. Appl. Earth Obs. Geoinf. 1998, 3, 198–208. [Google Scholar]
O’Callaghan, J.F.; Mark, D.M. The extraction of drainage networks from digital elevation data. Comput. Vis. Graph. Image Process. 1984, 28, 323–344. [Google Scholar] [CrossRef]
Camara, J.; Martin, M.A.; Gomez-Miguel, V. Quantifying the Relationship Between Drainage Networks at Hillslope Scale and Particle Size Distribution at Pedon Scale. Fractals-Complex Geom. Patterns Scaling Nat. Soc. 2015, 23, 1540007. [Google Scholar]
Gandolfi, C.; Bischetti, G.B. Influence of the drainage network identification method on geomorphological properties and hydrological response. Hydrol. Process. 1997, 11, 353–375. [Google Scholar] [CrossRef]
Helmlinger, K.R.; Kumar, P.; Foufoula-Georgiou, E. On the Use of Digital Elevation Model Data for Hortonian and Fractal Analyses of Channel Networks. Water Resour. Res. 1993, 29, 2599–2613. [Google Scholar] [CrossRef]
Lee, G.; Kim, J.C. Comparative Analysis of Geomorphologic Characteristics of DEM-Based Drainage Networks. J. Hydrol. Eng. 2011, 16, 137–147. [Google Scholar] [CrossRef]
Lehner, B.; Linke, S.; Ouellet-Dallaire, C.; Thieme, M. Global hydro-environmental catchment and river reach characteristics at high spatial resolution. Geophys. Res. Abstr. 2019, 21, 1. [Google Scholar]
Morris, D.G.; Heerdegen, R.G. Automatically derived catchment boundaries and channel networks and their hydrological applications. Geomorphology 1988, 1, 131–141. [Google Scholar] [CrossRef]
Band, L.E. A Terrain-Based Watershed Information-System. Hydrol. Process. 1989, 3, 151–162. [Google Scholar] [CrossRef]
Chen, J.; Lin, G.; Yang, Z.; Chen, H. The relationship between DEM resolution, accumulation area threshold and drainage network indices. In Proceedings of the 2010 18th International Conference on Geoinformatics, Beijing, China, 18–20 June 2010; pp. 1–5. [Google Scholar]
Li, Z.; Guo, L.; Liu, R.; Wang, Y. The relationship between the threshold of catchment area for extraction of digital river network from DEM and the river source density. J. Geo-Inf. Sci. 2018, 20, 1244–1251. (In Chinese) [Google Scholar]
Wu, M.; Shi, P.; Chen, A.; Shen, C.; Wang, P. Impacts of DEM resolution area threshold value uncertainty on the drainage network derived using, SWAT. Water SA 2017, 43, 450–462. [Google Scholar] [CrossRef] [Green Version]
Vogt, E.V.; Colombo, R.; Bertolo, F. Deriving drainage networks and catchment boundaries: A new methodology combining digital elevation data and environmental characteristics. Geomorphology 2003, 53, 281–298. [Google Scholar] [CrossRef]
Band, L.E. Topographic Partition of Watersheds with Digital Elevation Models. Water Resour. Res. 1986, 22, 15–24. [Google Scholar] [CrossRef]
Wood, E.F.; Sivapalan, M.; Beven, K.; Band, L. Effects of Spatial Variability and Scale with Implications to Hydrologic Modeling. J. Hydrol. 1988, 102, 29–47. [Google Scholar] [CrossRef]
Montgomery, D.R.; Dietrich, W.E. Source Areas, Drainage Length, and Channel Initiation. Water Resour. Res. 1989, 25, 1907–1918. [Google Scholar] [CrossRef] [Green Version]
Dietrich, W.E.; Wilson, C.J.; Montgomery, D.R.; McKean, J.; Bauer, R. Erosion thresholds and land surface morphology. Geology 1992, 20, 675–679. [Google Scholar] [CrossRef]
Martz, L.W.; Garbrecht, J. Automated recognition of valley lines and drainage networks from grid digital elevation models: A review and a new method—Comment. J. Hydrol. 1995, 167, 393–396. [Google Scholar] [CrossRef]
Beighley, R.; Eggert, K.; Wilson, C.J.; Rowland, J.C.; Lee, H. A hydrologic routing model suitable for climate-scale simulations of arctic rivers: Application to the Mackenzie River Basin. Hydrol. Process. 2015, 29, 2751–2768. [Google Scholar] [CrossRef]
Bhowmik, A.K.; Metz, M.; Schafer, R.B. An automated, objective and open source tool for stream threshold selection and upstream riparian corridor delineation. Environ. Model. Softw. 2015, 63, 240–250. [Google Scholar] [CrossRef]
Khan, A.; Richards, K.S.; Parker, G.T.; McRobie, A.; Mukhopadhyay, B. How large is the Upper Indus Basin? The pitfalls of auto-delineation using DEMs. J. Hydrol. 2014, 509, 442–453. [Google Scholar] [CrossRef]
Qin, C.Z.; Wu, X.W.; Jiang, J.C.; Zhu, A. Case-based knowledge formalization and reasoning method for digital terrain analysis—Application to extracting drainage networks. Hydrol. Earth Syst. Sci. 2016, 20, 3379–3392. [Google Scholar] [CrossRef] [Green Version]
Zhang, H.H.; Loáiciga, H.A.; Feng, L.W.; He, J.; Du, Q. Setting the Flow Accumulation Threshold Based on Environmental and Morphologic Features to Extract River Networks from Digital Elevation Models. ISPRS Int. J. Geo-Inf. 2021, 10, 186. [Google Scholar] [CrossRef]
Zhang, W.; Wu, X.; Lu, C.; Su, R.; Wang, Y. Determination of Flow Accumulation Threshold Based on Multiple Regression Model in Raster River Networks Extraction. Trans. Chin. Soc. Agric. Mach. 2016, 47, 131–138. (In Chinese) [Google Scholar]
Luo, M.; Tang, G.; Dong, Y. Uncertainty of flow accumulation threshold influence in hydrology modeling-a case study in Qinling Mountain SRTM3 DEM based. In Proceedings of the 2008 International Workshop on Education Technology and Training & 2008 International Workshop on Geoscience and Remote Sensing, Shanghai, China, 21–22 December 2008; Volume 2, pp. 219–222. [Google Scholar]
You, Y.Y.; Jin, W.B.; Xiong, Q.X.; Xue, L.; Ai, T.C.; Li, B.L. Simulation and Validation of Non-point Source Nitrogen and Phosphorus Loads under Different Land Uses in Sihu Basin, Hubei Province, China. Procedia Environ. Sci. 2012, 13, 1781–1797. [Google Scholar] [CrossRef] [Green Version]
Huang, R.; Zhu, L.P.; Xu, Y.X. Crustal structure of Hubei Province of China from teleseismic receiver functions: Evidence for lower crust delamination. Tectonophysics 2014, 636, 286–292. [Google Scholar] [CrossRef]
Abrams, M.; Crippen, R.; Fujisada, H. ASTER Global Digital Elevation Model (GDEM) and ASTER Global Water Body Dataset (ASTWBD). Remote Sens. 2020, 12, 1156. [Google Scholar] [CrossRef] [Green Version]
Wenzel, R.N. Surface roughness and contact angle. J. Phys. Chem. 1949, 53, 1466–1467. [Google Scholar] [CrossRef]
Didan, K. MODIS/Terra Vegetation Indices Monthly L3 Global 1km SIN Grid V061. NASA EOSDIS Land Processes DAAC. 2015. Available online: https://doi.org/10.5067/MODIS/MOD13A3.061 (accessed on 21 November 2019).
Huffman, G.J.; Stocker, E.F.; Bolvin, D.T.; Nelkin, E.J.; Tan, J. GPM IMERG Final Precipitation L3 1 Month 0.1 Degree x 0.1 Degree V06, Greenbelt, MD, Goddard Earth Sciences Data and Information Services Center (GES DISC). 2019. Available online: https://doi.org/10.5067/GPM/IMERG/3B-MONTH/06 (accessed on 21 November 2019).
Chen, F.; Gao, Y.; Wang, Y.; Qin, F.; Li, X. Downscaling satellite-derived daily precipitation products with an integrated framework. Int. J. Climatol. 2019, 39, 1287–1304. [Google Scholar] [CrossRef]
Rodell, M.; Houser, P.R.; Jambor UE, A.; Gottschalck, J.; Mitchell, K.; Meng, C.J.; Arsenault, K.; Cosgrove, B.; Radakovich, J.; Bosilovich, M.; et al. The global land data assimilation system. Bull. Am. Meteorol. Soc. 2004, 85, 381–394. [Google Scholar] [CrossRef] [Green Version]
Beaudoing, H.; Rodell, M. NASA/GSFC/HSL. GLDAS Noah Land Surface Model L4 Monthly 0.25 × 0.25 Degree V2.1, Greenbelt, Maryland, USA, Goddard Earth Sciences Data and Information Services Center (GES DISC). 2020. Available online: https://doi.org/10.5067/SXAVCZFAQLNO (accessed on 21 November 2019).
Yeh, P.J.; Irizarry, M.; Eltahir, E.A. Hydroclimatology of Illinois: A comparison of monthly evaporation estimates based on atmospheric water balance and soil water balance. J. Geophys. Res. Atmos. 1998, 103, 19823–19837. [Google Scholar] [CrossRef]
Over, M.; Schilling, A.; Neubauer, S.; Zipf, A. Generating web-based 3D city models from OpenStreetMap: The current situation in Germany. Comput. Environ. Urban Syst. 2010, 34, 496–507. [Google Scholar] [CrossRef]
Rosen, D. WARSSS-A Watershed Assessment for River Stability and Sediment Supply-An Overview. Hydrol. Sci. Technol. 2007, 23, 181. [Google Scholar]
Tarboton, D.G.; Bras, R.L.; Rodriguez-Iturbe, I. On the extraction of channel networks from digital elevation data. Hydrol. Process. 1991, 5, 81–100. [Google Scholar] [CrossRef]
Choi, Y. A new algorithm to calculate weighted flow-accumulation from a DEM by considering surface and underground stormwater infrastructure. Environ. Model. Softw. 2012, 30, 81–91. [Google Scholar] [CrossRef]
Vogt, J.; Soille, P.; De Jager, A.; Rimaviciute, E.; Mehl, W.; Foisneau, S.; Bodis, K.; Dusart, J.; Paracchini, M.L.; Haastrup, P.; et al. A pan-European river and catchment database. Rep. Eur 2007, 22920, Ispra. [Google Scholar]
Schneider, A.; Jost, A.; Coulon, C.; Silvestre, M.; Théry, S.; Ducharne, A. Global-scale river network extraction based on high-resolution topography and constrained by lithology, climate, slope, and observed drainage length. Geophys. Res. Lett. 2017, 44, 2773–2781. [Google Scholar] [CrossRef] [Green Version]
Xu, X.; Zhang, Y.; Chen, Q.; Li, N.; Shi, K.; Zhang, Y. Regime shifts in shallow lakes observed by remote sensing and the implications for management. Ecol. Indic. 2020, 113, 106285. [Google Scholar] [CrossRef]
Li, C.; Feng, X.; Zhao, R. The Methods and Application of Automatically Extracting Stream Network of Watershed. J. Lake Sci. 2003, 3, 205–212. (In Chinese) [Google Scholar]
Li, D.; Wu, B.; Chen, B.; Qin, C.; Wang, Y.; Zhang, Y.; Xue, Y. Open-Surface River Extraction Based on Sentinel-2 MSI Imagery and DEM Data: Case Study of the Upper Yellow River. Remote Sens. 2020, 12, 2737. [Google Scholar] [CrossRef]
Luo, Y.; Zhu, L.; Hu, J. A new method for determining threshold in using PGCEVD to calculate return values of typhoon wave height. China Ocean Eng. 2012, 26, 251–260. [Google Scholar] [CrossRef]
Militino, A.F.; Moradi, M.; Ugarte, M.D. On the Performances of Trend and Change-Point Detection Methods for Remote Sensing Data. Remote Sens. 2020, 12, 1008. [Google Scholar] [CrossRef] [Green Version]
Lin, W.T.; Chou, W.C.; Lin, C.Y.; Huang, P.H.; Tsai, J.S. Automated suitable drainage network extraction from digital elevation models in Taiwan’s upstream watersheds. Hydrol. Process. Int. J. 2006, 20, 289–306. [Google Scholar] [CrossRef]

Figure 1. Distribution of the test and sample sub-basins.

Figure 2. The workflow diagram of the MR-AP method.

Figure 3. The R² of the power functions in all sub-basins.

Figure 4. The power function of the best-fitting (a) and the worst-fitting (b) sub-basin; y = L, and x = FAT in the regression formulas.

Figure 5. Calculated regression between the actual drainage and the MR-AP drainage.

Figure 6. The normality test (the Histogram (a) and the P-P Plot (b)) of the fitting results.

Figure 7. Visual comparison of the MR-AP drainage network with the actual drainage network.

Figure 8. Visual comparison of the MR-AP drainage network with the remote sensing imagery.

Figure 9. Visual comparison of local areas in test sub-basins 3, 4, and 7.

Figure 10. Scattergrams of drainage length against water storage change (a), NDVI (b), and maximum surface roughness (c).

Figure 11. The comparison between MR-AP, Mean Change-point Analysis, and the Fitness Index Method using drainage length error (a), miss hits (b), false hits (c), and true hits (d).

Table 1. The Pearson correlation coefficient values between actual drainage length and explanatory variables.

Explanatory Variables	Pearson Correlation
water storage change	0.850 **
NDVI	0.874 **
Maximum Surface Roughness	0.425 **

** Significant correlation at the p < 0.01 level (two-sided).

Table 2. Fitting results for the MR model.

Explanatory Variables	Pearson Correlation
Constant	−4270.679
The water storage change	9015.260
NDVI	107.426
Maximum surface roughness	−888.753
R²	0.714
Adjusted R²	0.688

Table 3. Performance results of the drainage length (m) for the test sub-basins.

Area Index	Actual Drainage Length (m)	FATs by MR-AP	MR-AP Drainage Length (m)	Drainage Length Error by MR-AP Model	Miss Hits	False Hits	True Hits
1	105,216.89	37,069	111,854.2	6.31%	9.51%	15.82%	90.49%
2	195,583.04	31,023	200,455.6	2.49%	24.19%	26.68%	75.81%
3	51,231.77	61,953	50,423.53	−1.58%	1.98%	0.41%	98.02%
4	89,795.84	33,695	84,181.19	−6.25%	19.50%	13.25%	80.50%
5	58,822.43	61,372	63,952.42	8.72%	25.05%	33.77%	74.95%
6	66,871.99	47,921	63,990.78	−4.31%	33.88%	29.57%	66.12%
7	63,287.99	31,858	66,656.82	5.32%	21.75%	27.07%	78.25%
8	79,994.08	46,961	87,887.44	9.87%	57.69%	67.55%	42.31%
9	76,825.84	34,109	75,778.47	−1.36%	16.38%	15.01%	83.62%
Average	75,823.67	42,885	89,464.49	2.13%	23.33%	25.46%	76.67%

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Zhang, W.; Li, W.; Loaiciga, H.A.; Liu, X.; Liu, S.; Zheng, S.; Zhang, H. Adaptive Determination of the Flow Accumulation Threshold for Extracting Drainage Networks from DEMs. Remote Sens. 2021, 13, 2024. https://doi.org/10.3390/rs13112024

AMA Style

Zhang W, Li W, Loaiciga HA, Liu X, Liu S, Zheng S, Zhang H. Adaptive Determination of the Flow Accumulation Threshold for Extracting Drainage Networks from DEMs. Remote Sensing. 2021; 13(11):2024. https://doi.org/10.3390/rs13112024

Chicago/Turabian Style

Zhang, Wei, Wenkai Li, Hugo A. Loaiciga, Xiuguo Liu, Shuya Liu, Shengjie Zheng, and Han Zhang. 2021. "Adaptive Determination of the Flow Accumulation Threshold for Extracting Drainage Networks from DEMs" Remote Sensing 13, no. 11: 2024. https://doi.org/10.3390/rs13112024

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Adaptive Determination of the Flow Accumulation Threshold for Extracting Drainage Networks from DEMs

Abstract

1. Introduction

2. Materials

2.1. Study Region

2.2. Topographic Data

2.3. Vegetation Index Data

2.4. Water Storage Change from Precipitation, Runoff, and Evapotranspiration

2.5. Actual Drainage Networks Data

3. Methods

3.1. Drainage Networks Extraction

3.2. Multiple Regression and Adaptive Power (MR-AP) Method

3.2.1. Multiple Stepwise Regression Fitting

3.2.2. Power Function Fitting for Each Sub-Basin

3.3. Validation

4. Results

4.1. Results of Drainage Length Using Multiple Regression (MR) Analysis

4.2. Drainage Network Results Using Adaptive Power (AP) Analysis

4.3. Validation on the MR-AP Drainage Networks

5. Discussion

5.1. The Statistical Association between the Drainage Length and the Explanatory Variables

5.2. Comparison between MR-AP, Mean Change-Point Analysis, and the Fitness Index Method

6. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

Appendix A

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI