Pixel-Based Geometric Assessment of Channel Networks/Orders Derived from Global Spaceborne Digital Elevation Models

Digital Elevation Models (DEMs) contribute to geomorphological and hydrological applications. DEMs can be derived using different remote sensing-based datasets, such as Interferometric Synthetic Aperture Radar (InSAR) (e.g., Advanced Land Observing Satellite (ALOS) Phased Array type L-band SAR (PALSAR) and Shuttle Radar Topography Mission (SRTM) DEMs). In addition, there is also the Digital Surface Model (DSM) derived from optical tri-stereo ALOS Panchromatic Remote-sensing Instrument for Stereo Mapping (PRISM) imagery. In this study, we evaluated satellite-based DEMs, SRTM (Global) GL1 DEM V003 28.5 m, ALOS DSM 28.5 m, and PALSAR DEMs 12.5 m and 28.5 m, and their derived channel networks/orders. We carried out these assessments using Light Detection and Ranging (LiDAR) Digital Surface Models (DSMs) and Digital Terrain Models (DTMs) and their derived channel networks and Strahler orders as reference datasets at comparable spatial resolutions. We introduced a pixel-based method for the quantitative horizontal evaluation of the channel networks and Strahler orders derived from global DEMs utilizing confusion matrices at different flow accumulation area thresholds (ATs) and pixel buffer tolerance values (PBTVs) in both±X and±Y directions. A new Python toolbox for ArcGIS was developed to automate the introduced method. A set of evaluation metrics—(i) producer accuracy (PA), (ii) user accuracy (UA), (iii) F-score (F), and (iv) Cohen’s kappa index (KI)—were computed to evaluate the accuracy of the horizontal matching between channel networks/orders extracted from global DEMs and those derived from LiDAR DTMs and DSMs. PALSAR DEM 12.5 m ranked first among the other global DEMs with the lowest root mean square error (RMSE) and mean difference (MD) values of 4.57 m and 0.78 m, respectively, when compared to the LiDAR DTM 12.5 m. The ALOS DSM 28.5 m had the highest vertical accuracy with the lowest recorded RMSE and MD values of 4.01 m and −0.29 m, respectively, when compared to the LiDAR DSM 28.5 m. PALSAR DEM 12.5 m and ALOS DSM 28.5 m-derived channel networks/orders yielded the highest horizontal accuracy when compared to those delineated from LiDAR DTM 12.5 m and LiDAR DSM 28.5 m, respectively. The number of unmatched channels decreased when the PBTV increased from 0 to 3 pixels using different ATs.


Introduction
Current advances in remote sensing techniques are essential in producing high-quality Digital Elevation Models (DEMs). Because of the general availability of different optical and microwave satellite data-based DEMs, many authors have extensively used these elevation datasets for a

Study Area
We used an area covering 235.56 km 2 in San Luis Obispo County along the western coast of California, USA, to test our method. The study area is geographically located between 672,000 m E to 696,000 m E and 3,924,000 m N to 3,940,000 m N ( Figure 1). It is moderately rugged and has significant variations in the relief height, ranging from −1 m to +437 m above sea level. Geomorphologically, the area is characterized by a narrow coastal area of steep cliffs, in addition to a coastal range sculpted by hills and valleys [73]. It is also characterized by the presence of the Whale Rock Reservoir to the south. It has a watershed area of 53 km 2 , and the reservoir has a capacity of 50.156 m 3 and the maximum water height of 66 m [74]. Geologically, the study area is dominated mainly by sandstone, in addition to exposures of serpentinites, rhyolite, basalt and alluvium terraces [75]. The dominant forests are evergreen, deciduous, and mixed, and their density varies from low to scattered [76,77]. Furthermore, the area is covered by grassland, and scattered vegetation is present on both sides of the lake.
Remote Sens. 2019, 11,235 4 of 33 Considering the above-mentioned issues, our objectives were five-fold: (i) To evaluate the pixelbased vertical elevation accuracies of spaceborne-based global DEMs (i.e., SRTM DEM 28.5 m, Advanced Land Observing Satellite (ALOS) DSM 28.5 m, and Phased Array type L-band SAR (PALSAR) DEMs 28.5 m and 12.5 m) based on LiDAR-based DTM and DSM utilizing traditional statistical metrics, such as the root mean square error (RMSE) and mean difference (MD), (ii) to introduce a pixel-based technique to assess the horizontal spatial variability in the channel networks/orders extracted from the global elevation sources using those delineated from LiDAR DTM and DSM at similar spatial resolutions and at different pixel buffer tolerance values (PBTVs), (iii) to develop a new Python toolbox for ArcGIS to automate the previous objectives, (iv) to determine which global DEM dataset would be closer in performance to the airborne LiDAR DTM or LiDAR DSM in the study area, and (v) to compare the outcomes of the first and the second objectives to depict the degree of matching between the results achieved from both methods.

Study Area
We used an area covering 235.56 km 2 in San Luis Obispo County along the western coast of California, USA, to test our method. The study area is geographically located between 672,000 m E to 696,000 m E and 3,924,000 m N to 3,940,000 m N ( Figure 1). It is moderately rugged and has significant variations in the relief height, ranging from -1 m to +437 m above sea level. Geomorphologically, the area is characterized by a narrow coastal area of steep cliffs, in addition to a coastal range sculpted by hills and valleys [73]. It is also characterized by the presence of the Whale Rock Reservoir to the south. It has a watershed area of 53 km 2 , and the reservoir has a capacity of 50.156 m 3 and the maximum water height of 66 m [74]. Geologically, the study area is dominated mainly by sandstone, in addition to exposures of serpentinites, rhyolite, basalt and alluvium terraces [75]. The dominant forests are evergreen, deciduous, and mixed, and their density varies from low to scattered [76,77]. Furthermore, the area is covered by grassland, and scattered vegetation is present on both sides of the lake.

Data Requirements
In this study, we used two types of data. One was the global DEM datasets [13,[79][80][81]; their major characteristics are summarized in Table 1. The other was the LiDAR point cloud datasets that were acquired for the Diablo Canyon Power Plant (DCCP) San Simeon project for the Pacific Gas and Electric Company (PG&E) [82], California, USA. Although these data were over an area of 810 km 2 , this study deals with an area of 235.56 km 2 in order to demonstrate the effectiveness and applicability of our proposed approach. The airborne LiDAR point cloud datasets were acquired, calibrated, and verified by Quantum Spatial for the funder PG&E. These LiDAR data were made available to the public through OpenTopography (a public data domain: https://opentopography.org/) on 29 March 2013.  [79] [80] [13,81] The LiDAR survey was accomplished using a Leica ALS70 sensor mounted on a Cessna Grand Caravan. The ALS70 system was set to capture a scan angle of 15 • from nadir to yield high-resolution data of more than 15 pulses per square meter and a swath width of 191 m over terrestrial surfaces. It flew 1100 m above ground level and acquired at least 240,000 laser pulses per second. The LiDAR scanning was achieved without data voids and gaps, excluding non-reflective surfaces (e.g., open water, wet asphalt). The LiDAR data were acquired under optimum conditions with minimal to no cloud cover (i.e., less than 10% cloud shadow) and maximum solar zenith angles. In addition, consistent aircraft altitude over the terrain was obtained to eliminate the potential for data gaps related to acquisition and laser shadowing of targets. Furthermore, an accurate ground survey was achieved by Watershed Sciences Inc. in parallel with the airborne LiDAR scanning.
The absolute vertical accuracy of the LiDAR datasets was initially assessed using ground checkpoints collected from bare earth surfaces of constant slope. For this project, the reported RMSE and MD values of the absolute and average relative vertical accuracies of the LiDAR datasets were 2.6 cm and 5 cm, respectively (see Wilson and Steinberg 2013 [82] for additional details).  Figure 2 shows our proposed method in the form of a schematic diagram. It consisted of 4 distinct components: (i) Data preparation, (ii) evaluation of the vertical elevation accuracy of the global DEMs utilizing LiDAR DTM/DSM, (iii) extraction of the channel networks, and (iv) development of ArcGIS Python toolbox for the geometric assessment of channel networks/orders. They are described in the following sub-sections.

Data Preparation
We obtained the global SRTM DEM and ALOS DSM from the OpenTopography (High-Resolution Topography Data and Tools) website in GeoTIFF format with a horizontal resolution of approximately 1 arcsec/28.5 m ( Table 1). The PALSAR DEM was downloaded from the Alaska Satellite Facility Distributed Active Archive Data Center (ASF DAAC) in geographic information systems (GIS)-ready GeoTIFF format with a horizontal resolution of 12.5 m. For full coverage of the area under study, two dual-polarization (HH + HV) PALSAR scenes operating in fine beam mode (FBD) were obtained from an ascending path on 15 June 2007.
We downloaded the ground and unclassified LiDAR point cloud data, with a point density of 3.1 and 21 pts/m 2 , from the OpenTopography domain in a compressed LAS file format. LAS is a public file format for the interchange of 3-D LiDAR point cloud datasets. The LAS binary file format is an alternative to proprietary systems or a generic ASCII file interchange system and is compatible with many commercial and open source software packages. Each point within the LiDAR datasets was classified by whether it was returned from the ground, vegetation, or building/structure. The vertical and horizontal references of the LiDAR point cloud data are NADV 88 and NAD 83. The

Data Preparation
We obtained the global SRTM DEM and ALOS DSM from the OpenTopography (High-Resolution Topography Data and Tools) website in GeoTIFF format with a horizontal resolution of approximately 1 arcsec/28.5 m ( Table 1). The PALSAR DEM was downloaded from the Alaska Satellite Facility Distributed Active Archive Data Center (ASF DAAC) in geographic information systems (GIS)-ready GeoTIFF format with a horizontal resolution of 12.5 m. For full coverage of the area under study, two dual-polarization (HH + HV) PALSAR scenes operating in fine beam mode (FBD) were obtained from an ascending path on 15 June 2007.
We downloaded the ground and unclassified LiDAR point cloud data, with a point density of 3.1 and 21 pts/m 2 , from the OpenTopography domain in a compressed LAS file format. LAS is a public file format for the interchange of 3-D LiDAR point cloud datasets. The LAS binary file format is an alternative to proprietary systems or a generic ASCII file interchange system and is compatible with many commercial and open source software packages. Each point within the LiDAR datasets was classified by whether it was returned from the ground, vegetation, or building/structure. The vertical and horizontal references of the LiDAR point cloud data are NADV 88 and NAD 83. The LiDAR point cloud datasets were geocoded to the Universal Transverse Mercator (UTM) projection system, Zone 10N.
Both ground and unclassified LiDAR points were gridded, resampled, and averaged based on the spatial extent and resolution of the SRTM DEM 28.5 m, ALOS DSM 28.5 m, PALSAR DEM 12.5 m, and resampled PALSAR DEM 28.5 m.
Before the global elevation datasets (SRTM DEM V003 28.5 m, ALOS DSM 28.5 m, and PALSAR DEMs 12.5 m and 28.5 m) could be directly compared with LiDAR DTM/DSM of similar spatial resolution, it was imperative to have them in a common reference system. The global elevation products were transformed into the LiDAR reference system. Additionally, we made the projected coordinate systems consistent among the global elevation products and LiDAR datasets. Each pair of comparable DEMs had the same number of rows and columns and were well aligned.

Evaluating the Vertical Elevation Accuracy of Global DEMs Based on LiDAR DTM/DSM
We assessed elevation differences among LiDAR DTM and DSM (reference datasets) and the other global DEMs by computing the traditional statistical metrics (RMSE and MD) grids at the co-located pixels.
Validation accuracy measures the closeness of observation to a true value [7]. RMSE has become a standard statistical tool for analyzing DEM accuracy and has been used in many studies to quantify the vertical accuracy in DEMs [83,84]. RMSE is a single measure that characterizes the error surface, while MD indicates the bias of the error surface and their equations are as follows: where N is the number of pixels; LDEM ref is the reference LiDAR DEM (DTM or DSM); and GDEM is the global DEM (SRTM DEM V003, ALOS DSM, or PALSAR DEM).
For the purpose of this paper, the elevations values are represented by the geometric centers of all DEM cells included in the evaluation.

Extraction of the Channel Networks/Orders
The channel network is the most significant terrain parameter derived from DEMs; along its tributaries, fluvial processes act to transport water and sediments from an upstream high-elevated region by gravity downslope to a lower, flat landscape [32]. We used the hydrology geoprocessing tools assembled in the ArcGIS ModelBuilder [85] to extract channel networks/orders from the global DEMs, as well as from LiDAR DTM and DSM.
Delineating a channel network depends mainly on detecting the flow path of every cell in the DEM grid through a series of consecutive steps [38,86]. The first step was to create a depression-less DEM by filling the pits [37]. The presence of sinks within DEMs is a common problem that affects the proper detection of flow directions. Therefore, to have a hydrologically-corrected DEM, it is necessary to first fill these sinks [32,39]. The algorithm developed by Jenson and Dominque 1988 [39] has been widely used in many GIS software packages for sink filling [39], where every depression is converted to a flat area by raising each cell's elevation to the lowest elevation of its neighbors.
Based on the availability of high-resolution remote sensing-based DEMs, many authors developed accurate flow direction algorithms. They derived paths of surface flow using a nondispersive single (e.g., References [39,41,42]) and dispersive multiple (e.g., Reference [87]) flow direction methods. Orlandini and Moretti 2009 [42] stated that nondispersive algorithms should be used when the extraction of channel systems and surface flow directions is the main focus of the study. Furthermore, Zhu et al., 2013 [88] mentioned that most pit filling algorithms were based on a 1-D single flow direction (e.g., Reference [32,39]). Therefore, in the second step, we derived the flow direction grid from the Remote Sens. 2019, 11, 235 8 of 33 conditioned DEM by using the nondispersive eight-direction (D8) surface flow method [39]. The flow path was determined by comparing each cell's elevation with its eight adjacent or diagonal 3 × 3 cell neighbors, where the cell with the steepest downward direction is identified as the flow path based on the underlying topography [39,42]. The direction of flow determines the ultimate destination of the surface water flowing across the land toward downslope zones.
Third, using the predetermined flow direction spatial layer, it was possible to define cells with high flow concentration to detect how the flow would be accumulated and where small groups of cells could turn into streams [37,39]. In fact, cells with flow accumulation values greater than a certain threshold would constitute an effective part of the stream. The threshold is called the flow accumulation area threshold (AT), and it defines the minimum contributing area required to initiate the channels [32,86,89]. The AT is the main factor in extracting the channel networks, where it determines the channels' initiation and differentiates between stream and non-stream pixels. The AT is strongly affected by topography, geomorphology, geology, climate, vegetation, and human influence [40,90]. The determination of the AT is a matter of debate, but utilizing a constant value for delineating DEM-based channels network has been well-accepted among different researchers [32,91]. Most GIS software used 1% of the maximum flow accumulation value as a default to determine the AT [92]. Orlandini et al. 2011 [86] specified the AT by comparing the predicted and observed channel heads determined from LiDAR DEM and field measurements, respectively. Tarboton et al. 1991 [37] extracted channel networks of high density from DEMs that satisfy the scaling laws computed from the contour DEMs-derived networks (blue lines). Tribe 1992 [90] selected the optimum AT when there was a close match between the channel networks extracted from DEMs and manually drawn blue lines. Jones 2002 [93] visually determined the flow accumulation support AT by a trial and error approach.
The channel network extracted from LiDAR DEM had higher accuracy than that delineated from contour-based DEM [25]. The Google Earth Pro tool provides rich spatial details for determining individual objects [94]. Therefore, it is widely and efficiently used in different remote sensing applications, in particular for land use/cover mapping (e.g., Reference [94][95][96]). In this study, we used the approach of a trial and error [93] with a subsequent visual verification using Google Earth imagery to detect 4 ATs to test our method for evaluating the horizontal accuracy of channel networks. The ATs were equal to at least 0.004 km 2 , 0.008 km 2 , 0.012 km 2 , and 0.016 km 2 , and 0.020 km 2 , 0.041 km 2 , 0.061 km 2 , and 0.081 km 2 for spatial resolutions of 12.5 m and 28.5 m, respectively. The ATs corresponded to at least 25, 50, 75, and 100 pixels at spatial resolutions of 12.5 m and 28.5 m. Applying the predetermined threshold values to the flow accumulation grid, the real channels of the network began to be defined. We then converted the extracted channel grids to vector layers and then exported them to keyhole markup language (KML) format to visually check the quality of the extracted channels using Google Earth Pro. The delineated channels were well matched with the actual watercourses of Google Earth 3-D imagery. To obtain the equivalent AT values among multiple DEM grids with different spatial resolutions, we used a simple derived mathematical relationship as follows: where CAT is the comparable area threshold that needs to be estimated; LRD is the lower-resolution DEM (test DEM); HRD is the higher-resolution DEM (reference DEM); and OAT is the original area threshold based on which the channel network/orders were extracted. Finally, we generated the channel segment links with unique identifications by using the most common [34] stream order designation method (i.e., the Strahler method [97], modified from Horton 1945 [98]), which was applied to delineate the order of stream segments in the network. The channel order is in direct proportion to the channel size, watershed dimension, and discharge of water and sediments [97]. The Strahler ordering approach [97] assigned a numeric order for each channel segment based on a hierarchy of tributaries. In this method, the unbranched fingertip tributaries are designated as first order, and the order increases to the next higher one when branches of the same orders are joined. For instance, the joining of 2 first-order channels at a specific point will generate a second-order channel (Figure 3a,b), and so on. The stream ordering method can be simplified using the following relationships: where A and B denote the numbers of channel orders and "v" refers to the joining between 2 channels. The trunk stream through which water and sediments discharge downstream was assigned the highest order [97].

Developing ArcGIS Python Toolbox for Geometric Assessment of Channel Networks
We developed a new Python toolbox for ArcGIS for the purpose of pixel-based geometric evaluation of the channel networks/orders derived from open-source global DEMs based on those extracted from LiDAR DTMs/DSMs. The availability of numerous GIS software packages enabled the extraction of channel networks from remote sensing-based DEMs such as ArcGIS [85], Geographic Resources Analysis Support System (GRASS) GIS [99], and Quantum GIS [100]. In this study, we used the ArcGIS environment to introduce our toolbox, because ArcGIS and its powerful geoprocessing toolboxes have been widely used by different authors in different hydrological and geomorphological related research (e.g., References [57,66,101]).
Accuracy assessment is a mandatory step in evaluating the results of different remote sensing related studies [102,103]. Users with different applications should be able to assess whether the accuracy of their outcomes (e.g., map) fits their objectives [104]. In the remote sensing literature, the confusion matrix is the most commonly endorsed and utilized method (i.e., the core) of the accuracy assessment [102,103]. It consists of a simple cross-tabulation that introduces the foundation to define the classification accuracy and characterize errors (Tables 2 and 3). It has been widely used by different authors to evaluate the accuracy of different remote sensing-based models (e.g., fragmented agricultural landscapes [105], automatic classification of LiDAR datasets in an urban area [106], global climatic maps [107], object extraction [108], change detection [109], and land cover/use classifications [110][111][112]).
In this study, the calculated two-class ( Figure 3c and Table 2) and multiclass ( Figure 3d and Table 3) confusion matrices arranged the channel networks (Table 2), and channel orders (Table 3) of the reference data in the rows and the test datasets in the columns. The PBTVs ( Figure 4) around the LiDAR DTMs/DSMs 12.5 m and 28.5 m-derived networks/orders were set to 0, 1, 2, and 3 pixels, to detect the horizontal matching with those derived from global DEMs in both ±X and ±Y directions and at comparable spatial resolutions.
Remote Sens. 2019, 11, 235 9 of 33 orders are joined. For instance, the joining of 2 first-order channels at a specific point will generate a second-order channel (Figure 3a,b), and so on. The stream ordering method can be simplified using the following relationships: where A and B denote the numbers of channel orders and "v" refers to the joining between 2 channels. The trunk stream through which water and sediments discharge downstream was assigned the highest order [97].

Developing ArcGIS Python Toolbox for Geometric Assessment of Channel Networks
We developed a new Python toolbox for ArcGIS for the purpose of pixel-based geometric evaluation of the channel networks/orders derived from open-source global DEMs based on those extracted from LiDAR DTMs/DSMs. The availability of numerous GIS software packages enabled the extraction of channel networks from remote sensing-based DEMs such as ArcGIS [85], Geographic Resources Analysis Support System (GRASS) GIS [99], and Quantum GIS [100]. In this study, we used the ArcGIS environment to introduce our toolbox, because ArcGIS and its powerful geoprocessing toolboxes have been widely used by different authors in different hydrological and geomorphological related research (e.g., References [57,66,101]).
Accuracy assessment is a mandatory step in evaluating the results of different remote sensing related studies [102,103]. Users with different applications should be able to assess whether the accuracy of their outcomes (e.g., map) fits their objectives [104]. In the remote sensing literature, the confusion matrix is the most commonly endorsed and utilized method (i.e., the core) of the accuracy assessment [102,103]. It consists of a simple cross-tabulation that introduces the foundation to define the classification accuracy and characterize errors (Tables 2 and 3). It has been widely used by different authors to evaluate the accuracy of different remote sensing-based models (e.g., fragmented agricultural landscapes [105], automatic classification of LiDAR datasets in an urban area [106], global climatic maps [107], object extraction [108], change detection [109], and land cover/use classifications [110][111][112]).
In this study, the calculated two-class ( Figure 3c and Table 2) and multiclass ( Figure 3d and Table 3) confusion matrices arranged the channel networks (Table 2), and channel orders (Table 3) of the reference data in the rows and the test datasets in the columns. The PBTVs ( Figure 4) around the LiDAR DTMs/DSMs 12.5 m and 28.5 m-derived networks/orders were set to 0, 1, 2, and 3 pixels, to detect the horizontal matching with those derived from global DEMs in both ±X and ±Y directions and at comparable spatial resolutions.     In general, a confusion matrix is a statistical technique for summarizing the performance of a classification algorithm [113,114]. In this study, an N × N error matrix ( Figure 3 and Table 2) was used to geometrically evaluate the channel networks, where N is equal to the number of classes (whole channel networks) in the case of the simplest 2 × 2 array (Table 2). When separately evaluating the channels having the same order, N was equal to the number of classes (orders) ( Table 3). Each row and column of the matrix corresponds to the test (one of the global DEMs-derived networks/orders) and reference (one of the LiDAR DTMs/DSMs-based networks/orders) classes, respectively. The counts of correct and incorrect agreements (i.e., disagreements) were then filled into the confusion matrices (Tables 2 and 3).
The results of the geometric evaluation of the channel networks/orders derived from global DEMs (test data) and LiDAR DTMs/DSMs (reference data) were arranged in a matrix format (Tables 2 and 3) with the following 4 outcomes: (i) True positive (TP), where the matched pixels were correctly classified as the same channel segment of the networks/orders of both test and reference datasets, (ii) true negative (TN), where co-located pixels were correctly classified as non-channel networks/orders of both test and reference classes (iii) false positive (FP), where pixels of the test data were unmatched with the reference data (i.e., the test pixels corresponded to a channel of another order or to the background), and (iv) false negative (FN), where pixels of the reference data were incorrectly matched by the test data (i.e., the reference pixels corresponded to the background or to a channel belonging to another order).   The auto-extracted channel networks/orders from global DEM grids were geometrically evaluated using those derived from LiDAR DTMs and DSMs at similar spatial resolutions. The maximum PBTVs around each channel segment-based reference LiDAR DTMs/DSMs were set to 0, 1, 2, and 3 pixels ( Figure 4). These PBTVs were equal to horizontal distances of 0 m, 12.5 m, 25 m, and 37.5 m, and 0 m, 28.5 m, 57 m, and 86.5 m, at spatial resolutions of 12.5 m and 28.5 m, respectively. Our developed algorithm was first checked for the matched co-located non-classified or classified (i.e., ordered) channels' pixels (i.e., a PBTV of 0) from the test data with respect to the reference datasets. If there were no more matched pixels, the algorithm kept running to locate the closest horizontal matching between the remainder of the deviated pixels within the subsequent nearest 1-pixel, 2-pixels, and 3-pixels neighbors with respect to the reference datasets (i.e., PBTV of 1 to 3 pixels) (Figure 4).

Categorical Performance Measures for Assessing the Horizontal Accuracy of Channel Networks/Orders
In this study, we derived different evaluation measures, such as producer accuracy (PA), user accuracy (UA), F-score (F), and Cohen's kappa index (KI), from the error matrices at different ATs and PBTVs to quantify the reliability and accuracy of the matching between networks/orders. Many studies have used these measures to evaluate the accuracy of various remote sensing datasets and models (e.g., References [106,108,115,116]).
The PA and UA [114] were calculated using the marginal row or column of the matrix, respectively. PA (i.e., row values) was computed considering the agreement of a particular class with the summation of all classes in that row (Tables 2 and 3). The rows of the table represent the actual class (global DEM-derived network/orders), while the columns represent the test class (LiDAR DTM/DSM-based network/orders). TP and TN (Tables 2 and 3) denote the correctly classified pixels, while FP and FN represent the incorrectly classified cells.
UA (i.e., column values) was similarly calculated, but with respect to the summation of all classes in that column (Tables 2 and 3). PA and UA represent measures of completeness and correctness, respectively. The difference between PA and UA lies in the definitions of how well the channel networks/orders can be matched (PA) versus how reliable the matching accuracy is (UA). Therefore, both PA and UA are of interest and considered as important accuracy metrics. In particular, the accuracy of each channel order using PA and UA is useful in determining how different models perform (see Congalton 1991 [102] and Stehman 1997 [117]) for an in-depth discussion). In Table 3, for channels of order 1 (i.e., Ord_1 class), the TP, FN, FP, and TN outcomes were labeled in yellow, green, orange, and gray, respectively. In other words, the total number of FN outputs for an Ord_1 (i.e., channels that had the order of 1) (Table 3) equalled the summation of values in the corresponding row, excluding the TP. If a channel pixel of order 1 was located in the reference class (LiDAR DEM-based orders), and no corresponding channel pixel of the same order was reported in the test data (global DEM-based orders) (i.e., a channel of another order or a background pixel of 0 value was recorded), this cell was assigned the value of (1, the other order recorded in the test data) or (1, 0) in the error matrix, respectively. In the same way, the total number of the FP outcomes for an Ord_1 (Table 3) equalled the summation of values in the corresponding column, excluding the TP. If a channel pixel of order 1 was not located in the reference data, but was recorded in the test data, this cell was assigned the value of (0, the other order recorded in the test data) or (0, 1), in the matrix.
The F metric represents the harmonic mean (i.e., weighted average) of PA and UA [118]. It measures the accuracy of the compared whole networks, as well as the channels with the same order. The F value provides the balance between precision (UA) and recall (PA). Therefore, it takes both FP and FN into account, and it addresses how similar the PA and UA values are. The F-score can summarize UA and PA into a single value, which makes it simple to determine the level of matching between the networks/orders extracted from different DEMs at different ATs and PBTVs.
The higher the PA, UA, and F values, the better the performance of the matching between the channel networks/orders. A score of 1 means perfect matching. The lowest possible score of the PA, UA, and F is 0, which denotes no horizontal matching between the networks/orders. The KI is a measure of the overall agreement of a matrix, calculating the proportion of agreement beyond chance agreement and expected disagreement [119]. It was introduced to the remote sensing community in the early 1980s [113,120] and has become a widely accepted measure for classification accuracy [102]. The KI provides an overall assessment of the accuracy of the classification [121]. It has a negative value if the chance agreement increases, a positive value if the strength of the agreement increases, and a value of zero when the agreement between reference and test datasets equal the chance agreement (i.e., no agreement) [122]. The KI uses both the overall accuracy of the model and the accuracy within each class; therefore, it has the advantage of statistically comparing two classification outcomes. In contrast to the overall accuracy [113], the KI takes the non-diagonal elements into consideration as expressed by Equation (9) [119]. The equations for computing PA [114], UA [114], KI, and F [118] are as follows: where m is the numbers of rows; s ii is the numbers of channel network/order pixels in row i and column i (on the major diagonal); s i+ is the total number of the channel network/order pixels in row i; s +i is the total number of the observations in column i; and N is the total number of observations.

Traditional Statistical Indices for Evaluating the Vertical Height Accuracy of Global DEMs
We assessed the vertical accuracies of the global DEMs (SRTM DEM V003 28.5 m, ALOS DSM 28.5 m, and PALSAR DEMs 12.5 m and 28.5 m) by computing the per-pixel difference with the LiDAR DTMs/DSMs at similar spatial details. The continuous elevation differences were generated and pairwise RMSE and MD values were calculated for each error surface in meters ( Figure 5). In general, LiDAR DTMs and DSMs had higher elevation values than PALSAR DEMs with spatial resolutions of 12.5 m and 28.5 m. We observed significant positive height differences in the northwestern part of the study area, comparing the PALSAR DEM 12.5 m to LiDAR DTM and DSM (Figure 5c

Horizontal Evaluation of the Channel Networks
We evaluated the whole networks using the four categorical measures (PA, UA, F, and KI) derived from the two-class pixel-based confusion matrix outcomes (Table 2 and Figure 3c). The flow accumulation ATs were set to correspond to at least 25, 50, 75, and 100 pixels. Figure 6 shows some examples of the outcomes of the confusion matrices (TP, FP, TN, and FN) resulting from comparing the whole networks extracted from global DEMs based on those derived from LiDAR DTMs/DSMs. In general, the values of these metrics were improved with the increase the PBTVs from 0 to 3 pixels.

Horizontal Evaluation of the Channel Networks
We evaluated the whole networks using the four categorical measures (PA, UA, F, and KI) derived from the two-class pixel-based confusion matrix outcomes (Table 2 and Figure 3c). The flow accumulation ATs were set to correspond to at least 25, 50, 75, and 100 pixels. Figure 6 shows some examples of the outcomes of the confusion matrices (TP, FP, TN, and FN) resulting from comparing the whole networks extracted from global DEMs based on those derived from LiDAR DTMs/DSMs. In general, the values of these metrics were improved with the increase the PBTVs from 0 to 3 pixels. Slight differences were recorded among these measures in the comparison between networks extracted from global DEMs 28.5 m and LiDAR DTM/DSM 28.5 m at comparable ATs and PBTVs (Tables 5-8). In general, the values of these metrics were improved with the increase the PBTVs from 0 and up to 3 pixels. Slight differences were recorded among these measures in the comparison between networks extracted from global DEMs 28.5 m and LiDAR DTMs and DSMs at comparable ATs and PBTVs (Tables 5-8).  In general, the values of these metrics were improved with the increase the PBTVs from 0 and up to 3 pixels. Slight differences were recorded among these measures in the comparison between networks extracted from global DEMs 28.5 m and LiDAR DTMs and DSMs at comparable ATs and PBTVs (Tables 5-8). Table 5. Performance accuracy metrics computed based on a comparison between whole channel networks/orders extracted from the LiDAR DTM/DSM 28.5 m and the ALOS DSM 28.5 m at different ATs expressed in the corresponding number of pixels and PBTV in pixels.  (Table 7) were higher than those estimated from evaluating the ALOS DSM 28.5-derived network (Table 5) by 0.025, 0.064, 0.045, and 0.052 and 0.056, 0.081, 0.068, and 0.073, respectively, using a PBTV of 0 and ATs corresponding to 25 and 100 pixels. Additionally, using the previously mentioned conditions, but employing a PBTV of 3, the differences were −0.039, 0.029, −0.005, and −0.005 and −0.025, 0.019, −0.003, and −0.003, respectively.

Reference
The SRTM DEM-derived network ranked third in comparison with the LiDAR DTM 28.5 m-based network (Table 6). Employing the previously mentioned conditions, the measures were 0.910, 0.891, 0.900, and 0.887 and 0.891, 0.912, 0.902, and 0.896, respectively, in the comparison between networks extracted from SRTM DEM and LiDAR DTM ( Table 6). The performance metrics calculated from comparing the networks delineated from PALSAR DEM and LiDAR DTM at a spatial resolution of 28.5 m ( Table 7) were higher than those reported at spatial details of 12.5 m ( Table 8) by average values of 0.068 and 0.075 using a comparable PBTV of 3 and ATs corresponding to 25 and 100 pixels, respectively. Using different PBTVs and ATs, the metrics computed from comparing networks extracted from ALOS and LiDAR DSM 28.5 m (Table 5) were almost equal to those reported from evaluating networks derived from ALOS DSM and LiDAR DTM 28.5 m (Table 5), with a maximum absolute difference of 0.024. The channel network-based ALOS DSM 28.5 m (Table 5) reported the best evaluation metrics when compared to that extracted from LiDAR DSM. The PALSAR DEM 28.5 m and SRTM DEM 28.5 m (Table 6)-derived networks ranked second and third in accuracy performance when compared to the LiDAR DSM 28.5 m-based network.
The average of the differences between the performance measures reported from comparing the network delineated from ALOS DSM 28.5 m (Table 5) against networks extracted from PALSAR DEM 28.5 m (Table 7) and SRTM DEM 28.5 m (Table 6) was 0.006 and 0.016, and 0.074 and 0.014 using an AT corresponding to 25 pixels and PBTV of 0 and 3, respectively. Under the previously mentioned conditions, but with using an AT corresponding to 100 pixels, the average of differences was 0.012 and 0.018, and 0.066 and 0.015, respectively. The PALSAR DEM 12.5 m-derived channel network (Table 8) reported the lowest accuracy measures compared to that extracted from LiDAR DSM.

Performance Evaluation Metrics of the Geometric Assessment of Channel Orders
We computed three performance measures (PA, UA, and KI) using the outcomes of the multiclass error matrices (Figure 3d and Table 3) derived from the comparison between channels of similar orders (Tables 5-8).
In general, comparing the PALSAR DEM 28.5 m and LiDAR DTM 28.5 m-derived orders (Table 7) at similar ATs and PBTVs showed the best evaluation measures with few exceptions. Using a PBTV of 3 and an AT corresponding to 100 pixels, the estimated PA and UA values for channels of orders 1, 2, 3, 4, and 5 were equal to 0.790, 0.699, 0.691, 0.568, and 503 and 0.806, 0.809, 0.682, 0.548, 0.742, respectively (Table 7). Using the previously mentioned conditions, the measures computed from assessing the orders extracted from PALSAR DEM ( Table 7) were slightly higher than those estimated from evaluating orders derived from ALOS DSM ( Table 5) and SRTM DEM (Table 6), when compared to the LiDAR DTM 28.5 m-based orders with absolute differences ranging from 0.017 to 0.188 and from 0.025 to 0.177, respectively. The estimated KIs per orders using a PBTV of 3 confirmed the previous results, since they were equal to 0.766, 0.721, and 0.769 and 0.816, 0.793, and 0.828 in the evaluation of orders delineated from ALOS DSM (Table 5), SRTM DEM (Table 6), and PALSAR DEM (Table 7) using those extracted from LiDAR DTM 28.5 m at ATs corresponding to 25 and 100 pixels, respectively. The average of the differences between the PA and UA values for channels of orders 1 to 5 resulting from comparing orders delineated from PALSAR DEM 28.5 m (Table 7) and LiDAR DTM was 0.184 and 0.181 using an AT corresponding to 25 pixels and PBTV of 0 and 3, respectively. Using the previously mentioned conditions, but with an AT corresponding to 100 pixels, the average of differences was 0.266 and 0.300, respectively ( Table 7). The PA and UA values estimated from evaluating orders extracted from PALSAR DEM 28.5 m (Table 7) were higher than those computed from assessing PALSAR DEM 12.5 m-derived orders (Table 8), when compared to those delineated from LiDAR DTMs, at equivalent spatial resolutions, with average values of 0.117 and 0.123 at an AT corresponding to 100 pixels, respectively. In addition, the estimated KI from evaluating orders delineated from PALSAR DEM 28.5 m was higher than that reported from assessing orders derived from PALSAR DEM 12.5 m when compared to LiDAR DTMs by difference values of 0.127 and 0.112 utilizing a PBTV of 3 and ATs corresponding to 25 and 100 pixels, respectively.
The performance metrics of the orders derived from ALOS DSM 28.5 m (Table 5) had the highest accuracy when compared to those extracted from LiDAR DTM. Employing a PBTV of 3 and an AT corresponding to 25 pixels, the PA and UA values calculated from comparing ALOS and LiDAR DSMs-derived channels having orders of 1 to 6 were equal to 0.733, 0.581, 0.610, 0.563, 0.347, and 0.377 and 0.691, 0.583, 0.626, 0.521, 0.460, and 0.593, respectively (Table 5). Using a PBTV of 3 and an AT corresponding to 100 pixels, the differences between the PA values of channels having orders 1 to 5 reported from evaluating ALOS DSM 28.5 m (Table 5) and those derived from SRTM DEM (Table 6) and PALSAR DEM (Table 7) were 0.020, 0.087, 0.151, −0.052, and −0.040 and 0.015, 0.067, 0.037, 0.135, and 0.127, respectively, when compared to those derived from LiDAR DSM 28.5 m. Additionally, the UA differences for channels of orders 1 to 5 were 0.036, 0.025, 0.022, 0.140, and 0.039 and 0.009, −0.047, 0.043, 0.101, and 0.066, respectively. There were minor exceptions where the performance of orders delineated from SRTM DEM 28.5 m ( Table 6) exceeded that of PALSAR DEM 28.5 m-derived orders (Table 7), particularly for PA values of orders 4 and 5 and UA of order 5, using an AT corresponding to 100 pixels and PBTV of 0 and 3, respectively. The estimated KIs per orders assured the previous results, since they were equal to 0.766, 0.725, and 0.749 and 0.827, 0.797, and 0.806, respectively, when assessing orders delineated from ALOS DSM (Table 5), SRTM DEM (Table 6), and PALSAR DEM (Table 7) based on those extracted from LiDAR DSM using a PBTV of 3 and ATs corresponding to 25 and 100 pixels.

Effect of Global DEM Spatial Resolution on the Evaluation of Channel Networks/Orders
Although the previously mentioned results, the PALSAR DEM 12.5 m-derived channel network and Strahler orders (Table 8) was still the most accurate and had the best agreement with those extracted from LiDAR DTM, with taking into the account the fine spatial resolution it had. Table 8. Performance accuracy metrics computed based on the comparison between whole networks/orders extracted from the LiDAR DTM/DSM 12.5 m and the PALSAR DEM 12.5 m at different ATs expressed in the corresponding number of pixels and PBTVs in pixels. We used an AT corresponding to 519 pixels (equivalent to an AT corresponding to 100 pixels at a spatial resolution of 28.5 m) using Equation (3) to compare the channel network and orders extracted from PALSAR DEM versus those derived from LiDAR DTM 12.5 m (Table 9). It was found that the calculated performance measures were improved. These measures were found to be closer to those estimated from the comparison between networks extracted from ALOS DSM (Table 5) and PALSAR DEM (Table 7) with that delineated from LiDAR DTM 28.5 m. Furthermore, the differences between PA and UA values resulted from comparing channels of orders 1 to 5 delineated from PALSAR DEM 12.5 m, PALSAR DEM 28.5 m, and ALOS DSM 28.5 m were 0.020, −0.026, 0.151, −0.005, and 0.081 and 0.002, 0.011, 0.098, 0.011, and 0.014, respectively, when compared to those derived from LiDAR DTMs using a PBTV of 3 and ATs corresponding to 519 and 100 for spatial resolutions of 12.5 m and 28.5 m, respectively (Table 9). Additionally, the UA differences for channels of orders 1 to 5 were −0.064, −0.072, −0.022, −0.002, and 0.195 and −0.088, −0.151, −0.068, −0.026, and 0.007, respectively (Table 9). Moreover, the KI differences were 0.061 and −0.052 using the same previously mentioned conditions (Table 9). Table 9. Performance metrics computed based on the comparison between networks/orders extracted from LiDAR DTMs and global DEMs at different ATs expressed in the corresponding number of pixels, PBTV in pixels, and spatial resolution in m. In the comparison of channel network and orders extracted from PALSAR DEM 12.5 m with those derived from LiDAR DTM at an AT corresponding to 100 pixels, we found that the performance metrics started to noticeably improve after utilizing only one pixel as a buffer tolerance. This means that we would only need to approximately deviate one pixel (corresponding to a horizontal distance of 12.5 m) to directly gain an obvious improvement in the matching accuracy. A horizontal offset of three pixels (corresponding to a horizontal distance of 87.5 m) was required to achieve an apparent enhancement in matching during the assessment of the networks/orders extracted from ALOS DSM and PALASR DEM based on those delineated from LiDAR DTM at a spatial resolution of 28.5 m (Table 9). Figure 7 shows the histograms estimated based on the comparison between channel networks extracted from global DEMs based on those derived from the reference LiDAR DTMs/DSMs. They displayed each channel segment's displacement in both ±X and ±Y directions, at PBTVs ranging from 0 to 3 pixels and four ATs (corresponding to at least 25, 50, 75, and 100 pixels). The number of co-located channels' pixels using a PBTV of 0 was always higher than that reported using other PBTVs (1 to 3 pixels). Employing larger ATs and a PBTV of 3, the number of unmatched pixels decreased (Figure 7). Moreover, using different ATs, the number of displaced channels' pixels reduced with the increase of PBTV from 0 to 3 pixels (Figure 7). The number of channels' pixels that remained without displacement (i.e., had 0 PBTV) was always greater when comparing the networks extracted from global DEMs based on those derived from LiDAR DTMs rather than LiDAR DSMs. The highest, and a nearly equal number of co-located pixels (using a PBTV of 0) were reported from evaluating the networks derived from ALOS DSM (Figure 7c) and PALSAR DEM (Figure 7e) when compared to that extracted from LiDAR DTM with a similar spatial resolution of 28.5 m. Due to the fine spatial details of the PALSAR DEM 12.5 m, the number of matched pixels using a PBTV of 0 was higher than that with a spatial resolution of 28.5 m.

Characterizing the Horizontal Offset between the Extracted Channel Networks
Using a PBTV of 3 pixels and ATs corresponding to 25 and 100, the PALSAR DEM 28.5 m-derived channel segments shifted 1689 and 973 pixels in +X direction and 1151 and 541 pixels in +Y direction, and 159 pixels and 54 pixels in −X direction and 653 and 325 pixels in −Y direction, respectively; with respect to LiDAR DTM-based channels' pixels ( Figure 7e).
The number of unmatched pixels was higher in evaluating the network extracted from PALSAR DEM 12.5 m (Figure 7g), due to its finer details than at a spatial resolution of 28.5 m (Figure 7e). Using a PBTV of 3 and an AT corresponding to 100 pixels, we found shifting in the channel segments by 8342 and 5655 pixels toward the +X and +Y directions, and 754 and 2300 pixels toward the −X and −Y directions, respectively, in the assessment of channels' pixels derived from PALSAR DEM and LiDAR DTM 12.5 m (Figure 7g).
In the comparison between the channels extracted from ALOS DSM and LiDAR DSM 28.5 m, the ALOS DSM-derived network was displaced by 1163 and 667 pixels in the +X and +Y directions, and 82 and 366 pixels in the −X and −Y direction, respectively, with respect to LiDAR DSM-based network at a PBTV of 3 pixels and an AT corresponding to 100 pixels (Figure 7d).

Vertical Accuracy of Global DEMs
The use of LiDAR DEMs with fine spatial resolutions as benchmarks to assess global spaceborne DEM sources has been well documented by previous studies. Dewitt et al. 2015 [123]  To our knowledge, there have been no studies to evaluate the PALSAR DEM 12.5 m using LiDAR datasets. However, the higher accuracy of the PALSAR DEM 12.5 m could be interpreted by the use of high-quality DEMs with fine spatial resolutions (i.e., the National Elevation Dataset (NED)) in the detailed radiometric and geometric correction of PALSAR imagery (see References [80,129] for more details).
Some previous studies reported findings that were similar to our results, but with different RMSE values. Therefore, a number of factors would be worthwhile to consider when quantifying the vertical height accuracy of optical and radar satellite data-based DEMs" such as the presence of: (i) An extensive topographic change (e.g., due to surface mining excavations) (e.g., Reference [126]), (ii) rugged mountainous regions, particularly for interferometric SAR returns that may potentially be affected by foreshortening, layover, and shadow [130], (iii) vegetation canopy of varied roughness (e.g., References [126,131]), (iv) different dates for collecting the original data to generate various DEMs, possible land use changes, and growth of trees during extended time spans (e.g., Reference [128]), (v) slope change due to abrupt change in relief, where it was proved that DEM errors rapidly increased if the slope was greater than 20 • [69], (vi) significant differences in the elevation ranges (i.e., difference between minimum and maximum relief) within a particular study area, where a high elevation variance can reduce the DEM's vertical accuracy [69], and (vii) various versions of the same global elevation dataset with different levels of accuracy.

Horizontal Accuracy of Channel Networks
To our knowledge, there are two similar studies in the literature, in which Anderson et al. 2014 [46] and Mozas-Calvache et al. 2017 [47] introduced two methods for the quantitative comparison of vector-based stream networks. Anderson et al. 2014 [46] mentioned that it is a complex and challenging task to compare and evaluate the degree of matching between two networks of several sets of polylines. They proposed the relative sinuosity, and longitudinal root mean square error (LRMSE) techniques for the quantitative evaluation of the quality and variation in linear stream features. They found that matched sinuosity could indicate a similar level of meandering, but did not imply that both channel network polylines were well matched. Therefore, they recommended using the LRMSE technique to evaluate the horizontal similarity between channel lines rather than sinuosity deviation. However, they stated that both techniques must be carefully reviewed before being used to avoid the no-data anomalies, such as significantly unequal polyline lengths.
Mozas-Calvache et al. 2017 [47] proposed a method to determine the maximum and mean positional displacements of DEMs-based drainage networks. They used the adapted Hausdorff distance (i.e., a 2-D maximum distance between channels) and vertex influence (i.e., weighting each vector of the 3-D channel by the segments' lengths adjacent to each vertex) methods [132] to determine the horizontal displacement between the networks. Their findings demonstrated their method's applicability to determine the positional displacement of the selected channels [47]. However, the proposed methods by Mozas-Calvache et al. 2017 [47] and Anderson et al. 2014 [46] had similar limitations in the selection and preparation of channels for evaluation. They selected only a subset of channels; also, they checked that there were a one-to-one correspondence and proximity between the channels' polylines in both the reference and test datasets. Moreover, they edited the selected channels, and if a particular channel was missed in the test data, they either ignored or deleted the reference channel of interest. The last consideration was that they manually trimmed the more extended channels around the missing branches.
In this study, our introduced method overcame all the previously mentioned constraints, so it is a practical method that can be used without any prior selection, adjustment, trimming, and deletion of the comparable channel networks/orders. It directly considered all the channels in both the reference and test datasets, whether they were co-located or not. Furthermore, our method and the developed toolbox can automate the quantification and visualization of the horizontal spatial variations between channel networks/orders, as well as they have the advantage of evaluating unmatched pixels using different PBTVs (any number of pixels).

Similarity Between the Findings of the Vertical Assessment of Global DEMs and the Horizontal Evaluation of Their Derived Channel Networks/Orders
The achieved results in Section 4 demonstrated that the findings of both methods (pixel-based vertical accuracy of global DEMs and horizontal accuracy of their derived networks/orders) were similar in some cases, but not in others.
Using traditional statistical indices (RMSE and MD), it was found that PALSAR DEM 12.5 m had the best performance with respect to the PALSAR DEM 28.5 m, ALOS DSM 28.5 m, and SRTM DEM 28.5 m, when compared to the LiDAR DTMs at comparable spatial resolutions. The channel network/orders derived from PALSAR DEM 28.5 m had the highest accuracy, followed by those extracted from ALOS DSM 28.5 m and SRTM DEM 28.5 m in comparison with those derived from LiDAR DTM 28.5 m. The findings of both methods were similar, except for the performance of networks/orders delineated from PALSAR DTM 12.5 m using ATs corresponding to at least 25, 50, 75, and 100 pixels. However, employing an equivalent AT to that at a spatial resolution of 28.5 m, it was found that the performance of channel network/orders extracted from PALSAR DEM 12.5 m was obviously improved. We suggest selecting the channel network/orders extracted from the DEM with the finest spatial resolution for using in geomorphological and hydrological applications if the accuracy metrics evaluating both original DEMs and their derived drainage networks/orders were high and close to each other. Consequently, the channel network and Strahler orders extracted from PALSAR DEM 12.5 m were considered to have the best accuracy performance (see Section 4.4 for more details) when compared to those delineated from LiDAR DTM (i.e., the findings of both methods were considered similar in the cases mentioned above).
Employing the RMSE and MD statistical measures, the ALOS DSM showed the highest vertical accuracy, followed by SRTM DEM 28.5 m, PALSAR DEM 12.5 m, and PALSAR DEM 28.5 m, when compared to the LiDAR DSMs at comparable spatial resolutions. Channel networks/orders derived from ALOS DSM 28.5 m and PALSAR DEM 12.5 m also showed the highest and lowest performance, respectively, when compared to those extracted from LiDAR DSMs at similar spatial resolutions. Therefore, the reported results from both methods were similar in the latter case. However, there were two exceptions where the networks/orders extracted from PALSAR DEM 28.5 m and SRTM DEM 28.5 m ranked second and third in the horizontal accuracy, respectively, contrary to the performance of the original DEMs. Therefore, the findings of the two methods were dissimilar when comparing PALSAR DEM 28.5 m and SRTM DEM 28.5 and their derived networks/orders with LiDAR DSM 28.5 m and its extracted network/orders.

Potential Applications of the Introduced Method
In terms of other potential applications related to remote sensing research, the introduced method can also be used to: (i) Determine the optimum AT by comparing the extracted drainage network from any remote sensing technology-based DEM with a reference network derived from high-quality DEM source (e.g., high-quality satellite imagery and aerial photographs, with the help of topographic maps and field measurements), (ii) assess the effectiveness of different channel networks' extraction algorithms, and (iii) quantify the degree of horizontal variation between other linear geologic and geomorphological features (e.g., structural lineaments, surface geologic contacts, and shorelines) extracted from remote sensing-based geospatial datasets of simultaneous or different temporal series, after converting them to raster format. For extended applications, and even if the LiDAR point cloud datasets are not available elsewhere in the world, other accurate DEM sources and their derived channel networks/orders can be used as benchmarks to quantify the vertical height accuracy of the DEMs used, as well as the horizontal accuracy of their channel networks/orders.

Conclusions
This paper presents a pixel-based method to evaluate the horizontal accuracy of channel networks and Strahler orders delineated from three global DEMs with four spatial resolutions using reference LiDAR DTMs/DSMs and their derived networks/orders at comparable spatial resolutions and different ATs and PBTVs. We quantified the horizontal displacements between the extracted channels in both the ±X and ±Y directions. The pixel-based vertical elevation accuracies of SRTM DEM 28.5 m, ALOS DSM 28.5 m, and PALSAR DEMs 12.5 m and 28.5 m were also determined using traditional statistical metrics (RMSE, MD). In particular, the vertical accuracy of the newly released ALOS PALSAR DEM with two spatial resolutions, 12.5 and 28.5 m, as well as their derived channel networks/orders were thoroughly studied. We examined the similarity between the findings of the vertical assessment of the remote sensing-based DEMs and the horizontal variation of their delineated channel networks/orders. We also developed a new Python toolbox for ArcGIS to automate the introduced method. The presented method effectively determines the horizontal accuracy of the different networks/orders. It was able to detect the performance of the networks/orders beyond the co-located channels' pixels using different PBTVs. In general, the PALSAR DEM 12.5 m and ALOS DSM 28.5 m and their derived channel networks/orders were very close in performance to the LiDAR DTM 12.5 m and DSM 28.5 m and their extracted networks/orders, respectively, at comparable spatial resolutions.
The evaluations of the vertical accuracy of spaceborne DEMs and their derived channel networks and Strahler orders revealed the following: 1.
The ALOS DSM 28.5 m and PALSAR DEM 12.5 m had the best performance when compared to the LiDAR DSM 28.5 m and LiDAR DTMs 12.5 m, respectively.

2.
The categorical performance measures were improved with the increase of PBTVs from 0 to 3 pixels. When evaluating the horizontal accuracy using LiDAR DTM 28.5 m derived-channel networks/orders, it was found that networks/orders delineated from PALSAR DEM 28.5 had the highest performance, followed by those from ALOS DSM 28.5 and SRTM DEM 28.5 m. However, taking into consideration the high spatial details of the PALSAR DEM 12.5 m, there was an extended possibility for observing more unmatched pixels, particularly with the use of an AT corresponding to 100 pixels. However, using an AT corresponding to 519 pixels (equivalent to an AT corresponding to 100 pixels at a spatial resolution of 28.5 m), the evaluation performance of the network/orders derived from LiDAR DEM 12.5 m was noticeably improved with the use of only one pixel as a PBTV. Therefore, the channel network and Strahler orders derived from PALSAR DEM 12.5 m were considered to have high horizontal accuracy (see Sections 4.4 and 5.3 for additional details). 3.
Using a PBTV of 0, the number of co-located channels' pixels was higher than those resulting from the use of more PBTVs. The number of unmatched pixels decreased with the increase of PBTV from 0 to 3 at different ATs. The number of channels that remained without displacement (a PBTV of 0) was greater when evaluating the networks delineated from global DEMs using those derived from LiDAR DTMs rather than LiDAR DSMs at comparable spatial resolutions. Furthermore, the highest number of matched co-located pixels was recorded in the comparison of the PALSAR DEM 28.5 m-and ALOS DSM 28.5 m-derived networks with that derived from LiDAR DTM 28.5 m. 4.
The findings of the two methods (pixel-based vertical accuracy of global DEMs and horizontal accuracy of their derived channel networks/orders) were mostly similar, but there were exceptions, particularly in comparison with LiDAR DSM 28.5 m and its derived network/orders.
We recommend that other researchers evaluate DEMs and their channel networks/orders before involving them in their geomorphological and hydrological studies. Additionally, we suggest using our method over areas of different land covers, geomorphic units, lithology, and climatic zones elsewhere in the world.