1. Introduction
Hydrodynamic simulations on prescribed river network centerlines [1] are valuable for modeling river flow at the scale of a country, a continent, or the globe [2]. These so-called vector-based simulations capture the longitudinal water flow along rivers while significantly simplifying the dynamic changes in river width. This simplification is justified by the fact that the predominant flow direction of water in rivers is longitudinal. Compared with raster-based simulations, which model hydrodynamic processes on two-dimensional grids, vector-based simulations are computationally efficient. This efficiency makes vector-based simulations ideal for large-domain operational flood forecasting, such as the National Water Model of the United States [3]. They strike a good balance between the timeliness and computational demand of operational forecasts, especially when parameter calibration and data assimilation are involved.
The accuracy of the river network centerlines is essential for these vector-based simulations. In a few regions, such as the United States, high-accuracy river network datasets are available for vector-based hydrodynamic simulations. These datasets are developed and maintained through extensive field surveys and manual delineation. However, in most regions of the world, such manually maintained datasets do not exist.
In the vast regions of the globe where manually maintained river network datasets are lacking, there are primarily two approaches to delineate river networks over large domains: digital elevation model (DEM)-based methods [4,5,6,7,8] and satellite or aerial imagery-based methods [9]. Imagery-based methods are well-suited for identifying the presence of rivers and often offer higher spatial resolution than DEM-based methods [9]. Imagery-derived river networks can serve as the basis for calculating drainage density. However, river networks extracted from imagery lack information on drainage directions and connectivity between river reaches, rendering them unsuitable for hydrodynamic simulations. On the other hand, DEM-based methods provide detailed information on flow direction and preserve the critical upstream–downstream relationships necessary for hydrodynamic simulations [4,5,6]. However, river networks delineated from DEMs are often not representative of the real network, which leads to inaccuracies in various characteristics of the river network.
Among the various characteristics of river networks, drainage density is particularly significant [10]. Drainage density is defined as the total length of river segments per unit area [11,12] and serves as a key indicator of a region’s hydrological characteristics [13]. It varies significantly under different hydroclimatic conditions, with higher values typically found in wetter regions and lower values in arid areas. In regions with high drainage density, water is drained more efficiently from the landscape, leading to more rapid hydrologic responses and potentially higher flood risks [14,15]. This characteristic also influences sediment transport, as more developed river networks can more effectively carry sediment [13].
The difficulty in accurately representing drainage density is a major limitation for large-domain simulations, as drainage density naturally varies substantially across different regions. Several studies have attempted to address this limitation. Existing DEM-based methods utilize a parameter known as the critical drainage area to delineate river networks [4]. This parameter is defined as the minimum area required to sustain a river segment. The smaller the critical drainage area, the more river segments are delineated and, consequently, the higher the drainage density. Schneider et al. [16] developed a statistical model to estimate the critical drainage area using slope, lithology, and climate. This model was calibrated to match drainage densities from reference river networks in France and Australia and was subsequently applied globally to delineate river networks from DEMs. Another approach was proposed by Lin et al. [5], who introduced a method to trim river segments from networks delineated using a small critical drainage area threshold (1 km²). The river with the smallest drainage area is trimmed first, and this process is repeated until the drainage density within a watershed matches a value derived from machine learning. However, all existing methods are performed in an ad hoc manner. Drainage density is not incorporated into the delineation process itself but is only used in the evaluation or post-processing step. As a result, none of these existing studies can effectively preserve the observed drainage density across the entire river network.
We take a fundamentally different approach to addressing the challenge of DEM-based delineation. A new river network delineation algorithm is proposed, which uses two-dimensional drainage density as a direct input. The algorithm is based on a novel concept of upstream accumulation length rather than the critical drainage area. Upstream accumulation length is determined by both flow direction and drainage density. Given a fixed flow direction, upstream accumulation length is mathematically equivalent to drainage density. Our proposed algorithm leverages the upstream accumulation length to preserve the observed drainage density across the entire river network.
This technical note is organized as follows. Section 2 details the proposed algorithm. Section 3 sets up two test cases for the algorithm: one is a synthetic case, and the other is a real-world application. Section 4 presents the results of the two cases. Section 5 discusses the limitations of the algorithm. Finally, Section 6 concludes the paper.
2. The Drainage Density-Preserving River Network Delineation Algorithm
As shown in Figure 1, the algorithm consists of three main steps: (1) calculation of upstream accumulation length using flow direction and drainage density, (2) calculation of drainage area using flow direction, and (3) river network delineation using flow direction, drainage area, and upstream accumulation length. While extensive research has been conducted on the second step (i.e., the calculation of drainage area) [17], we will focus on the first and third steps in this section.
2.1. Upstream Accumulation Length
In this section, we first introduce the definition of upstream accumulation length. Then, we show how to calculate the upstream accumulation length using drainage density. Third, we prove that, given the flow direction, the upstream accumulation length is mathematically equivalent to drainage density by establishing the formula for inverting drainage density from upstream accumulation length. Finally, we present a fast algorithm to calculate the upstream accumulation length with O(N) time complexity.
As illustrated in Figure 2, the upstream accumulation length for a grid cell (Point O) is defined as the total length of river segments (i.e., segments A, B, and C) that lie within the catchment of the grid cell. This can be expressed mathematically as follows:
$$L_u = \sum_{l \in A} l \qquad (1)$$
where $L_u$ represents the upstream accumulation length for the grid cell, $A$ denotes the catchment area of the grid cell, and $l$ is the length of the river segments within the catchment.
Equation (1) can be rewritten using the definition of drainage density ($D_d$) as follows:
$$L_u = \int_A \frac{\mathrm{d}l}{\mathrm{d}A}\,\mathrm{d}A = \int_A D_d\,\mathrm{d}A \qquad (2)$$
where $\mathrm{d}A$ denotes an infinitesimal area within the catchment and $\mathrm{d}l$ denotes the length of the river segments within the infinitesimal area. The equation states that the upstream accumulation length is equal to the integral of the drainage density over the area of the catchment. In other words, if we have two-dimensional drainage density data and know the catchment of every grid cell over the area (which is feasible given flow direction), we can calculate a two-dimensional upstream accumulation length at every grid cell using Equation (2).
Furthermore, given flow direction, upstream accumulation length and drainage density have a strictly one-to-one relationship. We can not only precisely estimate upstream accumulation length from drainage density (Equation (2)), but also vice versa. The formula for calculating drainage density from upstream accumulation length can be derived by differentiating Equation (2) as follows:
$$D_d = \frac{\mathrm{d}L_u}{\mathrm{d}A} \qquad (3)$$
This equation states that the drainage density of a grid cell is equal to the increase in upstream accumulation length divided by the area of the grid cell. The increase in upstream accumulation length can be estimated as the difference between the upstream accumulation length of a grid cell and the sum of the upstream accumulation length of the grid cell’s donor cells. This one-to-one relationship between upstream accumulation length and drainage density is valid at any point of a two-dimensional domain.
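On a grid, Equation (2) reduces to a sum over the cells of a catchment, and Equation (3) reduces to a difference between a cell and its donors. The following is a minimal sketch of that round trip, not the authors' code. It assumes a hypothetical flow-direction encoding in which `receiver[i]` is the index of the cell that cell `i` drains to (`-1` at the outlet), a uniform cell area, and made-up density values.

```python
def accumulate_length(receiver, density, cell_area):
    """Discrete Equation (2): sum density * cell_area over each cell's catchment.

    Naive version: every cell pushes its own river length to all cells
    downstream of it, so the cost grows with flow-path length.
    """
    length = [0.0] * len(receiver)
    for i in range(len(receiver)):
        contribution = density[i] * cell_area
        j = i
        while j != -1:  # walk downstream to the outlet
            length[j] += contribution
            j = receiver[j]
    return length


def invert_density(receiver, length, cell_area):
    """Discrete Equation (3): (length at cell - sum of donor lengths) / cell area."""
    donor_sum = [0.0] * len(receiver)
    for i, r in enumerate(receiver):
        if r != -1:
            donor_sum[r] += length[i]
    return [(length[i] - donor_sum[i]) / cell_area for i in range(len(receiver))]


# Hypothetical 5-cell network: cells 0 and 1 drain to 2, then 2 -> 3 -> 4 (outlet).
receiver = [2, 2, 3, 4, -1]
density = [0.5, 0.0, 1.0, 1.0, 0.25]  # assumed values, length per unit area
length = accumulate_length(receiver, density, cell_area=1.0)
recovered = invert_density(receiver, length, cell_area=1.0)
```

With these inputs, `recovered` reproduces `density` cell by cell, which is the one-to-one relationship described above.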
Given flow direction and drainage density, the calculation of upstream accumulation length can be performed efficiently. Inspired by the work of Zhou et al. [17], we propose an algorithm to calculate the upstream accumulation length with O(N) time complexity. Algorithm 1 depicts the pseudocode. In the algorithm, we define three types of grid cells: (1) source cells, (2) intersection cells, and (3) interior cells. The number of donor cells for source cells is 0, for intersection cells is greater than 1, and for interior cells is 1. The algorithm starts from the source cells and iteratively calculates the upstream accumulation length downstream until an intersection cell is encountered. The number of donor cells for the encountered intersection cell is then decreased by 1. The algorithm continues until all source cells are processed. Similar to Zhou et al. [17], every grid cell is processed at most n times, where n is the maximum number of donor cells for a grid cell. Consequently, the time complexity of the algorithm is O(N), where N is the number of grid cells in the study area.
Algorithm 1: Algorithm for calculating upstream accumulation length.
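The traversal described above (start at source cells, move downstream, pause at intersection cells until all their donors have arrived) can be sketched as follows. This is an illustrative reading of the prose description rather than a transcription of Algorithm 1. The `receiver` encoding (`-1` marks an outlet), the uniform `cell_area`, and the sample values are assumptions.

```python
def upstream_accumulation_length(receiver, density, cell_area):
    """Accumulate river length downstream in a single pass from source cells."""
    n = len(receiver)

    # Count donors: 0 -> source cell, 1 -> interior cell, >1 -> intersection cell.
    n_donors = [0] * n
    for r in receiver:
        if r != -1:
            n_donors[r] += 1

    # Each cell starts with the river length inside itself.
    length = [d * cell_area for d in density]
    pending = n_donors[:]  # donors whose length has not been added yet

    for source in range(n):
        if n_donors[source] != 0:
            continue
        i = source
        while True:
            j = receiver[i]
            if j == -1:  # reached an outlet
                break
            length[j] += length[i]
            pending[j] -= 1
            if pending[j] > 0:
                # Intersection cell still waiting for other donors: stop here.
                # The last donor to arrive carries the trace further downstream.
                break
            i = j
    return length


# Hypothetical network: cells 0 and 1 drain to 2, then 2 -> 3 -> 4 (outlet).
receiver = [2, 2, 3, 4, -1]
density = [0.5, 0.0, 1.0, 1.0, 0.25]
result = upstream_accumulation_length(receiver, density, cell_area=1.0)
```

In this sketch a cell's value is passed downstream only once it is complete, so each cell is forwarded exactly once and the work is linear in the number of cells.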
2.2. River Network Delineation with Upstream Accumulation Length
The key idea of our proposed algorithm is to transform drainage density into upstream accumulation length and to use the upstream accumulation length to delineate river networks. Since the upstream accumulation length and drainage density are mathematically equivalent, if the upstream accumulation length of every grid cell in the delineated river network matches the observed value, the drainage density of the delineated river network will also match the observed value at any point of a two-dimensional study area.
However, achieving this target exactly is computationally demanding. We can slightly relax the target and restate the objective of this study as follows: match the upstream accumulation length at every catchment outlet with the observed value. This relaxation significantly reduces the computational burden. Meanwhile, if the target is achieved, the average drainage density within every catchment of the delineated river network will also match the catchment-averaged observed value. This characteristic represents a substantial improvement over the existing methods discussed in the Introduction.
The delineation with the relaxed objective can be efficiently implemented using a divide-and-conquer strategy. Using Figure 3 as an illustration, the algorithm starts from the outlets of the whole domain (Outlet A in Figure 3) and iteratively grows the centerlines of the river network (blue lines) by adding grid cells. If a bifurcation is encountered, the algorithm splits the river network into two or more parts (three parts in Figure 3). A bifurcation point (the red point) is defined as a grid cell that lies on the existing centerline and has two or more donor cells that flow from existing centerlines into it. The donor cells of the bifurcation point (Outlets B and C) are then identified using flow direction. These donor cells serve as the outlets of the upstream catchments. The newly identified outlets split the river network into parts, each with a river segment (River A, B, and C) and a corresponding catchment (Catchment A, B, and C).
The algorithm calculates the expected length for each river segment based on the upstream accumulation length. If a reach has no upstream reaches (Rivers B and C), the expected length of the reach is the upstream accumulation length at its outlet (Outlets B and C). If a reach has upstream reaches (River A), the expected length is the upstream accumulation length at its outlet (Outlet A) minus the sum of the upstream accumulation lengths of all its upstream reaches at their outlets (Outlets B and C). The algorithm grows each river segment by adding grid cells and repeats the growing process until the length of every reach is greater than or equal to the expected length. When the iteration stops, the upstream accumulation length at every catchment outlet is equal to the observed value.
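The expected-length rule can be written in a few lines. The sketch below is illustrative only: the reach names follow Figure 3, but the numerical values and the dictionary layout are assumptions.

```python
def expected_lengths(outlet_length, upstream_reaches):
    """Expected length of each reach.

    outlet_length:    reach -> observed upstream accumulation length at its outlet
    upstream_reaches: reach -> list of reaches that join it directly from upstream
    """
    return {
        reach: outlet_length[reach]
        - sum(outlet_length[u] for u in upstream_reaches.get(reach, []))
        for reach in outlet_length
    }


# Hypothetical values: Rivers B and C are headwaters that join River A.
outlet_length = {"A": 12.0, "B": 4.0, "C": 3.0}
upstream_reaches = {"A": ["B", "C"]}
expected = expected_lengths(outlet_length, upstream_reaches)
```

Here River A is expected to be 12.0 - (4.0 + 3.0) = 5.0 long, while Rivers B and C keep the full accumulation length at their own outlets.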
In Figure 3, if the upstream accumulation length of the delineated network is the same as the observed value at Outlets B and C, the drainage density within Catchments B and C must match the observed drainage density. Since the upstream accumulation length at Outlet A is equal to the sum of the lengths of Rivers A, B, and C, if the expected length of River A is also achieved, the upstream accumulation length at Outlet A is equal to the observed value. Consequently, the drainage density within the combined catchments A, B, and C must also match the observed drainage density. Given that the drainage density within Catchments B and C is already equal to the observed value, the drainage density within Catchment A must also match the observed value. In this case, the drainage density within every catchment of the delineated river network is equal to the observed value.
The last piece of the algorithm is how to add grid cells to grow the river segments. There are certainly many ways to do this. We would like to keep our algorithm compatible with existing critical drainage area-based river network delineation algorithms. In critical drainage area-based algorithms, grid cells on the delineated rivers always have a larger drainage area value than those not delineated as rivers. Based on this observation, we grow the rivers by adding one grid cell at a time. The added grid cell is the one with the largest drainage area value among the neighbors of the delineated centerlines that flow into the centerlines. In this way, we ensure that the grid cells on the delineated centerlines always have a larger drainage area value than those not delineated as centerlines. This is the same idea as in critical drainage area-based algorithms.
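The growth rule (always add the inflowing neighbor with the largest drainage area) maps naturally onto a priority queue. The following sketch covers only that rule for a single catchment. It omits the bifurcation handling and the Open/Lines bookkeeping, and it assumes every cell adds the same length, which ignores the longer diagonal steps of a real D8 grid. The `donors` mapping and all values are hypothetical.

```python
import heapq


def grow_centerline(outlet, donors, drainage_area, cell_length, target_length):
    """Grow a centerline upstream from `outlet` until it reaches `target_length`.

    donors: cell -> list of cells that drain directly into it.
    """
    centerline = [outlet]
    length = cell_length
    # Max-heap (areas negated) of cells that flow into the current centerline.
    frontier = [(-drainage_area[d], d) for d in donors.get(outlet, [])]
    heapq.heapify(frontier)

    while length < target_length and frontier:
        _, cell = heapq.heappop(frontier)  # neighbor with the largest drainage area
        centerline.append(cell)
        length += cell_length
        for d in donors.get(cell, []):
            heapq.heappush(frontier, (-drainage_area[d], d))
    return centerline, length


# Hypothetical network: 5 -> 0 -> 2 <- 1, then 2 -> 3 -> 4 (outlet).
donors = {4: [3], 3: [2], 2: [0, 1], 0: [5]}
drainage_area = {0: 2, 1: 1, 2: 4, 3: 5, 4: 6, 5: 1}
centerline, length = grow_centerline(4, donors, drainage_area, 1.0, 4.0)
```

Because a cell only enters the frontier after its receiver is on the centerline, the result stays connected, and every selected cell has a drainage area at least as large as any cell left in the frontier.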
Algorithm 2 shows pseudocode for the density-preserving river network delineation algorithm. The algorithm utilizes two queues: the Open queue, which is a temporary store of partially delineated rivers, and the Lines queue, which stores the finalized rivers. The algorithm grows each river in the Open queue. If the length of a river is greater than or equal to the expected length, the river is finalized and moved to the Lines queue. If a bifurcating condition is encountered, the algorithm divides the river into parts, as illustrated in Figure 3, and puts the parts back into the Open queue.
The algorithm is efficient. The divide-and-conquer strategy ensures that the river-growing process focuses on a specific reach and its neighboring grid cells at a time without having to consider the entire study domain. This locality is especially important when dealing with large domains such as those at the scale of a country, a continent, or even the globe.
2.3. Avoidance of Short River Segments
The algorithm shown in Algorithm 2 strictly preserves the observed drainage density for every catchment of the delineated river network. However, the algorithm often produces a large number of short river segments. These short river segments are not only difficult to represent in hydrodynamic simulations but also have a negligible impact on the overall drainage density. Algorithm 3 shows an improved algorithm tailored for hydrodynamic simulations. Two thresholds are introduced in the algorithm: one for the minimum length of river segments (line 24 in Algorithm 3) and the other for the minimum drainage area (line 17 in Algorithm 3). The former is used to avoid overly short river segments on mainstreams, while the latter is used to avoid overly short river segments on headwaters.
Algorithm 2: Density-preserving river network delineation algorithm.
Algorithm 3: Density-preserving river network delineation algorithm with thresholds for river segment length and drainage area.
3. Experimental Setup
We test the proposed algorithm using two cases: a synthetic case and a real-world case. The synthetic case is designed to demonstrate the steps of the algorithm and to show its compatibility with the critical drainage area-based algorithm. The real-world case is designed to demonstrate the computational efficiency and effectiveness of the proposed algorithm in delineating river networks over large domains.
The synthetic case is a 4 × 4 grid with a grid size of 1 × 1. Figure 4a,b show the flow direction and drainage area, respectively. A river network is first delineated using the critical drainage area-based algorithm with a threshold of 4, as shown in Figure 4c. This river network is used as the synthetic truth. Drainage density is calculated from this network. We then delineate the river network using the proposed algorithm to check whether the proposed algorithm can produce the same river network shown in Figure 4c.
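A setup of this kind can be mimicked with a short sketch. The flow pattern below is hypothetical and is not the one in Figure 4: every column drains south into a bottom row that flows east to a single outlet. The sketch also assumes that a cell is a river cell when its drainage area is greater than or equal to the threshold, and that each river cell contributes a length of 1.

```python
ROWS = COLS = 4
THRESHOLD = 4  # critical drainage area, in cells of area 1

# receiver[i] is the cell that cell i drains to (-1 marks the outlet).
receiver = []
for r in range(ROWS):
    for c in range(COLS):
        if r < ROWS - 1:
            receiver.append((r + 1) * COLS + c)  # flow south
        elif c < COLS - 1:
            receiver.append(r * COLS + c + 1)    # bottom row flows east
        else:
            receiver.append(-1)                  # outlet in the corner


def drainage_area(receiver, cell_area=1.0):
    """Naive drainage area: each cell adds its area to every cell downstream."""
    area = [0.0] * len(receiver)
    for i in range(len(receiver)):
        j = i
        while j != -1:
            area[j] += cell_area
            j = receiver[j]
    return area


area = drainage_area(receiver)
river_cells = [i for i, a in enumerate(area) if a >= THRESHOLD]

# Synthetic "observed" drainage density: river length per cell area.
density = [1.0 if i in river_cells else 0.0 for i in range(len(receiver))]
```

For this layout only the bottom row exceeds the threshold, so the synthetic truth is a single east-flowing channel, and `density` is the field a density-preserving delineation would then be asked to reproduce.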
The real-world case involves delineating the river network over the Chinese Mainland. The MERIT-Hydro flow direction dataset [6] is used in this case. MERIT-Hydro is a global 3-arc-second (approximately 90 m) resolution dataset derived from the MERIT DEM [18]. This dataset has been widely used in hydrodynamic simulations [19]. Lin et al. [5] have shown that the river network delineated from the MERIT-Hydro flow direction dataset has high accuracy in terms of centerline position. The drainage density dataset used in this case is produced from the Third National Land Resources Survey of China. The dataset covers the entire Chinese Mainland at a spatial resolution of 1 km. We bilinearly interpolated the 1 km drainage density data to the 90 m MERIT-Hydro grid. To ensure the delineated river network is continuous, we slightly extended the boundary of the study area outside the Chinese Mainland. In these extended areas, there is no surveyed drainage density data. The drainage density is estimated using a river network delineated from the MERIT-Hydro dataset and the critical drainage area-based method with a threshold of 1 km².
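As an illustration of the resampling step, the sketch below applies bilinear interpolation to a tiny coarse grid. It is a simplification under stated assumptions: values are treated as node samples, the refinement factor is an integer, and both grids share the same coordinate system, whereas the actual 1 km to 90 m resampling involves a non-integer ratio and map reprojection. The function name and sample values are hypothetical.

```python
def bilinear_resample(coarse, factor):
    """Refine a 2-D grid (at least 2 x 2) by an integer factor, bilinearly."""
    rows, cols = len(coarse), len(coarse[0])
    out_rows = (rows - 1) * factor + 1
    out_cols = (cols - 1) * factor + 1
    fine = []
    for i in range(out_rows):
        y = i / factor
        r0 = min(int(y), rows - 2)  # upper row of the enclosing coarse cell
        ty = y - r0                 # fractional position between the two rows
        row = []
        for j in range(out_cols):
            x = j / factor
            c0 = min(int(x), cols - 2)
            tx = x - c0
            top = coarse[r0][c0] * (1 - tx) + coarse[r0][c0 + 1] * tx
            bottom = coarse[r0 + 1][c0] * (1 - tx) + coarse[r0 + 1][c0 + 1] * tx
            row.append(top * (1 - ty) + bottom * ty)
        fine.append(row)
    return fine


coarse_density = [[0.0, 2.0], [4.0, 6.0]]  # assumed values on a 2 x 2 grid
fine_density = bilinear_resample(coarse_density, 2)
```

The refined grid keeps the original node values and fills the cells in between with weighted averages of the four surrounding coarse values.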
5. Discussion
The proposed algorithm uses two-dimensional drainage density as a priori information to delineate river networks, and the delineated river network can preserve the observed drainage density at every delineated catchment. This represents a significant improvement over existing methods, which often fail to capture the variation of drainage density across large domains. The capability of capturing the variation of drainage density is not limited to the spatial dimension but could also be extended to the temporal dimension. Given a time-varying drainage density dataset, the proposed algorithm can be used to delineate river networks that preserve the observed drainage density at every time step. Although we do not expect the algorithm to capture fragmented rivers, it is still able to represent the extension and retreat of headwaters. Such studies would be useful for understanding the changes in non-perennial rivers and the impacts of climate change on river networks [20,21,22,23].
However, the proposed algorithm is still subject to several limitations that are common to all DEM-based methods. First, the resolution of the input DEM significantly affects the delineation results [24]. A higher-resolution DEM provides more accurate flow direction and drainage area information, and it would lead to a more accurate delineation of river networks. However, the DEM-based methods discussed in this study have difficulties when the DEM resolution is too fine. As shown by Bernard et al. [25] and Costabile et al. [26], these methods, including the one proposed in this study, are not able to capture the morphology of rivers wider than the grid resolution. Special treatment of the river surface [27] is necessary to adjust the DEM. Second, the delineated river network is affected by the accuracy of the input DEM and drainage density data. Dense vegetation in the tropics and steep terrain at the edge of mountains can introduce errors in the DEMs, which can propagate to the delineated river network. In this study, we used the MERIT-Hydro dataset, which is produced by correcting multiple errors in DEMs, including the impacts of vegetation. However, the impacts of errors in drainage density data are still unknown. We leave the investigation of these impacts for future research to keep this paper focused and concise. Third, the delineated river network represents mere line skeletons without width information. For use in large-domain hydrodynamic simulations, the cross-section of the rivers is parameterized using simple geometry and calibratable geometric parameters [28]. This approach often faces difficulties in representing floodplains accurately. Two-dimensional hydraulic simulations on DEMs show potential for addressing these limitations. However, the computational cost hinders the application of two-dimensional hydraulic simulations on large domains, like the Chinese Mainland presented in this study.
6. Conclusions
This study introduces a novel river network delineation method that preserves observed drainage density across the entire river network. The method is based on a novel concept named upstream accumulation length, which is calculated using flow direction and drainage density. The delineation algorithm is designed to be compatible with critical drainage area-based methods, allowing for the preservation of observed drainage density in every single catchment of the river network.
The proposed algorithm was applied to the Chinese Mainland, utilizing the MERIT-Hydro flow direction dataset and a kilometer-resolution drainage density dataset produced from the Third National Land Resources Survey of China. The resulting river network dataset is segmented into nearly uniform lengths of approximately one kilometer and provides a critical foundation for the development of national-scale flood forecasting systems in the Chinese Mainland.
The proposed algorithm was tested on a synthetic case and a real-world case. The results from the synthetic case demonstrated that the proposed algorithm can produce the same river network as critical drainage area-based methods when using the same drainage density. The real-world case of delineating the river network over the Chinese Mainland showed that the proposed algorithm is both computationally efficient and effective in capturing the variation of drainage density. Moreover, 2,314,986 river segments were delineated in less than 10 min on a single thread of an Apple M3 Max processor. The delineated river network was shown to be superior in capturing the variation of drainage density across the entire river network compared to existing publicly available datasets such as HydroSHEDS and MERIT-Vector.
The results are encouraging since the method opens the possibility of combining aerial or satellite imagery and DEMs to delineate river networks accurately on a global scale. This is particularly important, as only a few countries (e.g., the United States of America, the United Kingdom, France, and Australia) have manually maintained river network datasets. The vast majority of the world, including China, lacks such datasets. The proposed algorithm can be applied to delineate river networks in these regions, providing a valuable resource for hydrodynamic simulations and flood forecasting systems.