Mapping Drainage Structures Using Airborne Laser Scanning by Incorporating Road Centerline Information

: Wide-area drainage structure (DS) mapping is of great concern, as many DSs are reaching the end of their design life and information on their location is usually absent. Recently, airborne laser scanning (ALS) has been proven useful for DS mapping through manual methods using ALS-derived digital elevation models (DEMs) and hillshade images. However, manual methods are slow and labor-intensive. To overcome these limitations, this paper proposes an automated DS mapping algorithm (DSMA) using classiﬁed ALS point clouds and road centerline information. The DSMA begins with removing ALS ground points within the buffer of the road centerlines; the size of the buffer varies according to different road classes. An ALS-modiﬁed DEM (ALS-mDEM) is then generated from the remaining ground points. A drainage network (DN) is derived from the ALS-mDEM. Candidate DSs are then obtained by intersecting the DN with the road centerlines. Finally, a reﬁnement buffer of 15 m is placed around each candidate DS to prevent duplicate DS from being generated in close proximity. A total area of 50 km 2 , including an urban site and a rural site, in Vermont, USA, was used to assess the DSMA. Based on the road functional classiﬁcation scheme of the Federal Highway Administration (FHWA), the centerline information regarding FHWA roads was obtained from a public data portal. The centerline information on non-FHWA roads, i.e., private roads and streets, was derived from the impervious surface data of a land cover dataset. A benchmark DS dataset was gathered from the transport agency of Vermont and was further augmented using Google Earth Street View images by the authors. The one-to-one correspondence between the benchmark DS and mapped DS for these two sites was then established. The positional accuracy was assessed by computing the Euclidian distance between the benchmark DS and mapped DS. The mean positional accuracy for the urban site and rural site were 13.5 m and 15.8 m, respectively. F1-scores were calculated to assess the prediction accuracy. For FHWA roads, the F1-scores were 0.87 and 0.94 for the urban site and rural site, respectively. For non-FHWA roads, the F1-scores were 0.72 and 0.74 for the urban site and rural site, respectively.


Introduction
A bridge is a structure constructed to carry a road or a pathway over a watercourse (i.e., a river or a stream). A culvert is a tunnel structure that carries a watercourse underneath a road, or through another type of obstruction, to a natural drainage point [1]. To clarify the terminology used in this paper, the term "drainage structures" (DS) is used when referring to both a bridge and a culvert collectively. The Global Roads Inventory Project (GRIP) has estimated over 21 million kilometers of global road infrastructure comprised of billions of DS [2,3]. In the United States, the state Department of Transportation (DOT) is responsible for the mapping and maintenance of DS in a state-wide capacity [4]. DS that are not properly maintained can cause disaster and serious economic loss following extreme weather conditions [5,6]. The American Society of Civil Engineers (ASCE) has reported that almost 1.6 trillion dollars are required through various bonds and public

Study Sites
The proposed DSMA was tested in Northern Vermont, USA (Figure 1). Land-use landcover (LULC) data obtained from Vermont Center for Geographic Information (VCGI) [33] shows that Site A (Figure 1a) has a more impervious surface area (ISA), e.g., residential and commercial, compared to Site B (Figure 1b), which predominantly contains agricultural areas. The two sites cover an accumulated area of 50 km 2 (Figure 1), as we intended to show the ability of the DSMA to process data over large areas.

Study Sites
The proposed DSMA was tested in Northern Vermont, USA (Figure 1). Land-u land-cover (LULC) data obtained from Vermont Center for Geographic Informatio (VCGI) [33] shows that Site A (Figure 1a) has a more impervious surface area (ISA), e. residential and commercial, compared to Site B (Figure 1b), which predominantly co tains agricultural areas. The two sites cover an accumulated area of 50 km 2 (Figure 1), we intended to show the ability of the DSMA to process data over large areas.
Site A is comprised of St. Alban city and some adjoining areas, with a latitude an longitude of 73°1′17.94″ W, and 44°58′5.41″ N, respectively. Site A spans an area of 14 km with an elevation range from 85 m to 258 m. The St. Alban city urban area is largely com prised of flat terrain (Figure 1c). Site B is located in a rural part of northern Vermont at latitude and longitude of 72°48′7.84″ W and 44°54′41.76″ N, respectively, and covers an ar of 36 km 2 . The topography is hilly with an elevation range from 107 m to 273 m, as show in Figure 1d. The terrain has slope variations in the range of 0°-20°. The site slope valu were obtained from ALS-DEM created at a ground sampling distance (GSD) of 0.5 m fro the acquired ALS point clouds (see Section 2.2.1). The lower slope values (i.e., 0°) are fr quently found along the Missisquoi River flood plains (see elevation map in Figure 1d).
According to their geographical characteristics, Sites A and B are referred to as "u ban" and "rural" sites in this article.   Figure 1d).
According to their geographical characteristics, Sites A and B are referred to as "urban" and "rural" sites in this article. The DSMA study sites ( Figure 1) were surveyed using the Leica ALS50II laser scanner during leaf-off conditions in November 2008 for Site A (Figure 1c) and April 2009 for Site B (Figure 1d), with no snow cover on the ground [34]. The ALS scanners were flown at an altitude of 1350 m in 2008 for Site A and at 1371 m altitude in 2009 for Site B, with four returns of each outgoing laser pulse at 150 kHz with a beam divergence of 0.22 mrad, and a nominal side overlap of 36%. This resulted in an average point density of 2 pts/m 2 . The field validation over different land cover types, e.g., bare earth, buildings, and grass showed vertical accuracy with a root mean square error of 0.08 m [34]. The laser point clouds were classified into ALS ground and non-ground points. We obtained classified ALS point clouds that were made freely available on OpenTopography by the USGS [34]. The acquired ALS point clouds were available in the Vermont State Plane NAD83 (2007) coordinate system.

Centerlines of FHWA Roads
The Vermont Transportation(VT) road centerlines were acquired from the Vermont Open Geodata portal [35]. The VT road centerlines follow the road functional classification scheme developed by the Federal Highway Administration (FHWA) [35] and are comprised of four major road classes of interstate highways, arterials, collectors, and locals. The interstate highways spread nationwide by providing the highest speed and uninterrupted mobility. Arterials are highways and freeways within a state and are directly connected with the interstate highways. Collectors link state arterials and state local roads. Locals are the road networks in residential and industrial areas [36]. It is noted that the VT road centerline does not account for residential streets and private roads because they do not fall under the jurisdiction of the FHWA. For the rest of the paper, the VT road centerlines are referred to as "FHWA roads" (denoted by the black lines in Figure 2).

Centerlines of Non-FHWA Roads
To account for the road centerlines of residential streets and private roads that were not available using the VT road centerlines, an impervious surface area (ISA) map was acquired from the Vermont Center for Geographic Information (VCGI) [37]. Impervious surfaces are anthropogenic surfaces through which water cannot infiltrate into the soil [38]. The ISA was mapped as a vector polygon for 2011 using object-based image analysis (OBIA). The ISAs of residential streets and private roads were mapped with an overall accuracy of 99% [33]. To clarify the terminology used in this paper, the centerlines of residential streets and private roads are referred to as "non-FHWA roads". The ISAs were comprised of polygon features, which were then processed to obtain the centerlines of non-FHWA roads using the skeleton method [39] (denoted by the yellow lines in Figure 2).

Benchmark DS Dataset
A benchmark DS dataset was acquired from the Vermont agency of transportation (VTrans), covering the interstates and state highways, arterials, and collector roads (see Section 2.2.2) [40]. The VTrans benchmark DS dataset was selected due to its high reliability because VTrans conducted GPS field surveys to map bridges and culverts in a state-wide capacity (see red dots in Figure 2a,b). The DS locations were marked at the center of the road (denoted by the red dots in Figure 2e). The VTrans DS dataset was unavailable for FHWA locals, non-FHWA residential streets and private roads (Section 2.2.3).
In order to acquire the DS dataset for FHWA local and non-FHWA roads, we augmented the DS along with residential streets, locals, and private roads using GE Street View (GE-SV) images (blue dots in Figure 2a,b). GE-SV images are very high-resolution panoramic images contributed by Google (Google LLC., Mountain View, CA, USA) [19,41]. To have consistent positional accuracy compared to the VTrans DS dataset, the locations of augmented DS were marked at the center of the road (blue dots in Figure 2c). Because Remote Sens. 2021, 13, 463 5 of 23 the commercial area of St. Alban City (shown as a green region in Figure 2a) carries the watercourses underground, the DS dataset for this area is not available. The DS datasets collected from VTrans and augmented from GE-SV were combined, ensuring that both datasets did not overlap, and the result is herein referred to as the "benchmark DS dataset" (Table 1)

Centerlines of non-FHWA Roads
To account for the road centerlines of residential streets and private roads that were not available using the VT road centerlines, an impervious surface area (ISA) map was acquired from the Vermont Center for Geographic Information (VCGI) [37]. Impervious surfaces are anthropogenic surfaces through which water cannot infiltrate into the soil [38]. The ISA was mapped as a vector polygon for 2011 using object-based image analysis (OBIA). The ISAs of residential streets and private roads were mapped with an overall accuracy of 99% [33]. To clarify the terminology used in this paper, the centerlines of residential streets and private roads are referred to as "non-FHWA roads". The ISAs were

Orthophotos
Orthophotos of the leaf-off seasons at a GSD of 0.5 m for 2009 were acquired from the Vermont imagery program [42] for two sites. The orthophotos were used for displaying data and results (Figure 2c-e). All datasets, e.g., FHWA, non-FHWA roads, the benchmark DS dataset, and orthophotos were projected onto the Vermont State Plane NAD83 (2007) coordinate system to match with the coordinate system of the ALS point clouds (Section 2.2.1).

Methods
The algorithm proceeds through four stages, as shown in Figure 3. Each stage is briefly explained in the following sub-sections.

Methods
The algorithm proceeds through four stages, as shown in Figure 3. Each stage is briefly explained in the following sub-sections.

Data Preparation
The ALS point clouds were obtained from OpenTopography (see Section 2.2.1) for two sites (Figure 1). In practice, ALS acquisitions are provided as hundreds or even thou- The ALS point clouds were obtained from OpenTopography (see Section 2.2.1) for two sites (Figure 1). In practice, ALS acquisitions are provided as hundreds or even thousands of files, causing memory allocation errors due to their large size and massive data volume. Moreover, we adopted the natural neighbor interpolation (NNI) [43], which is implemented in ArcGIS desktop version 10.8.1 and has a maximum limit of 15 million input points. The ALS point clouds were therefore processed into tiles (Figure 3a) of the same size using the lidR package [44], where tiles overlap their boundaries with the four immediately neighboring tiles (Figure 4a). The tile boundary overlap is used to eliminate tile boundary artifacts caused by NNI (see Section 3.2). The maximum tile size (x, y) of 4 × 4 km with a 1 m boundary overlap is used in this paper. input points. The ALS point clouds were therefore processed into tiles (Figure 3a) of the same size using the lidR package [44], where tiles overlap their boundaries with the four immediately neighboring tiles (Figure 4a). The tile boundary overlap is used to eliminate tile boundary artifacts caused by NNI (see Section 3.2). The maximum tile size (x, y) of 4 × 4 km with a 1 m boundary overlap is used in this paper. In the context of this paper, a road is comprised of a paved way with sloping sides, as shown by the ALS ground points (Figure 5b). The surface of a road is impervious and it is elevated above the surrounding terrain. Figure 5b shows an example of an ALS point cloud of a road surface with sloping sides (red dots in Figure 5b) from an interstate highway ( Figure 5a) in one of our sites. The 5 m wide cross-sections of different road profiles, including FHWA and non-FHWA, are shown in Figure 5c-h, where each profile contains more than 1000 points. It is found that non-FHWA roads (i.e., residential streets and private roads; see Figure 5g, h) were less elevated than FHWA roads (Figure 5c-f), showing that, for both FHWA and non-FHWA, the roads are frequently elevated above neighboring ALS ground points ( Figure 5).
The road width (including its sloping sides) of FHWA and non-FHWA roads was estimated from ALS point clouds. To account for the variation in urban and rural environments, the road widths at 15 random locations for each road class were estimated (see Table 2). Out of the 15 observations made, the maximum width value of each road class was considered so that the roads could be removed entirely from the ALS point clouds (see Section 3.2). The road buffer values were calculated from the maximum road widths (Figure 5c-h). A buffer value is simply half of the maximum width of each road class, as shown in Figure 5c-h and estimated buffer values for each functional road class are provided in Table 2.
The estimated buffer values were used to place buffers around both sides of FHWA and non-FHWA road centerlines (Figure 3a) according to their road class (Table 2) in order

Creating a Combined Mask of FHWA and Non-FHWA Roads
In the context of this paper, a road is comprised of a paved way with sloping sides, as shown by the ALS ground points (Figure 5b). The surface of a road is impervious and it is elevated above the surrounding terrain. Figure 5b shows an example of an ALS point cloud of a road surface with sloping sides (red dots in Figure 5b) from an interstate highway ( Figure 5a) in one of our sites. The 5 m wide cross-sections of different road profiles, including FHWA and non-FHWA, are shown in Figure 5c-h, where each profile contains more than 1000 points. It is found that non-FHWA roads (i.e., residential streets and private roads; see Figure 5g,h) were less elevated than FHWA roads (Figure 5c-f), showing that, for both FHWA and non-FHWA, the roads are frequently elevated above neighboring ALS ground points ( Figure 5).
The road width (including its sloping sides) of FHWA and non-FHWA roads was estimated from ALS point clouds. To account for the variation in urban and rural environments, the road widths at 15 random locations for each road class were estimated (see Table 2). Out of the 15 observations made, the maximum width value of each road class was considered so that the roads could be removed entirely from the ALS point clouds (see Section 3.2). The road buffer values were calculated from the maximum road widths  Table 2.

ALS-mDEM Production
As shown in Figure 5, the roads are elevated above their neighboring ALS ground points (green points in Figure 5), leading to the unreliable pathways of watercourses, i.e., DN [21]. Therefore, in the second stage, the ALS ground points of FHWA and non-FHWA roads were removed from each ALS tile (Figures 3b and 6b) using a combined road mask (Figures 3a and 6a). Then, ALS-mDEM tiles were produced from the remaining ALS ground points using NNI at a GSD of 0.5 m (Figure 6c). A GSD of 0.5 m was used to retain most of the topographic information offered by the ALS point clouds at a point density of 2 pt/m 2 . In the context of automation, NNI does not require any user-defined parameters to find the minimum number of points to be interpolated, e.g., the neighborhood search  The estimated buffer values were used to place buffers around both sides of FHWA and non-FHWA road centerlines ( Figure 3a) according to their road class (Table 2) in order to obtain a combined road mask, as shown in Figure 6a.
Remote Sens. 2021, 13, 463 9 of 23 radius, which makes it the simplest interpolation method [45]. To ensure the NNI interpolated elevation values are consistent between ALS-mDEM tiles, elevation values within a 1 m buffer of the tile boundaries were discarded (Figure 6d). To avoid the data voids at tile boundaries, a 1 m tile boundary overlap was maintained in the data processing stage (Figure 4a, b).

Candidate DS Mapping
In the third stage, ALS-mDEM tiles were then mosaiced into a single continuous ALS-mDEM using the "Mosaic to New Raster" tool in a ArcGIS Toolbox [46]. Both the real and artificial sink artifacts were removed from ALS-mDEM using the method adopted by Jenson and Dominque [47] (Figure 3c). A sink is defined as a cell or group of cells with all immediate neighboring cells at a higher elevation, which prevents automated DN extraction from ALS-mDEM due to sinks. The stream raster was then generated from the sinkfree ALS-mDEM ( Figure 7a) using the D8 method [48], followed by assigning Strahler stream orders [49,50] to the stream raster (Figures 3c and 7b).
In the context of automation, no information was gained by plotting the benchmark DS dataset over the stream raster (see red rectangles in Figure 7a). We found that the benchmark DS dataset aligned with streams of higher orders (see red rectangles in Figure  7b) by making the streams of lower orders, e.g., stream orders 1-5, less significant ( Figure  7b). Because of this, Strahler stream orders were implemented in DSMA so that different stream order thresholds could be used automatically. Examples of candidate DS obtained after intersecting DNs with stream order thresholds of 6, 7, and 8 are shown in Figure 8a,

ALS-mDEM Production
As shown in Figure 5, the roads are elevated above their neighboring ALS ground points (green points in Figure 5), leading to the unreliable pathways of watercourses, i.e., DN [21]. Therefore, in the second stage, the ALS ground points of FHWA and non-FHWA roads were removed from each ALS tile (Figures 3b and 6b) using a combined road mask (Figures 3a and 6a). Then, ALS-mDEM tiles were produced from the remaining ALS ground points using NNI at a GSD of 0.5 m (Figure 6c). A GSD of 0.5 m was used to retain most of the topographic information offered by the ALS point clouds at a point density of 2 pt/m 2 . In the context of automation, NNI does not require any user-defined parameters to find the minimum number of points to be interpolated, e.g., the neighborhood search radius, which makes it the simplest interpolation method [45]. To ensure the NNI interpolated elevation values are consistent between ALS-mDEM tiles, elevation values within a 1 m buffer of the tile boundaries were discarded (Figure 6d). To avoid the data voids at tile boundaries, a 1 m tile boundary overlap was maintained in the data processing stage (Figure 4a,b).

Candidate DS Mapping
In the third stage, ALS-mDEM tiles were then mosaiced into a single continuous ALS-mDEM using the "Mosaic to New Raster" tool in a ArcGIS Toolbox [46]. Both the real and artificial sink artifacts were removed from ALS-mDEM using the method adopted by Jenson and Dominque [47] (Figure 3c). A sink is defined as a cell or group of cells with all immediate neighboring cells at a higher elevation, which prevents automated DN extraction from ALS-mDEM due to sinks. The stream raster was then generated from the sink-free ALS-mDEM (Figure 7a) using the D8 method [48], followed by assigning Strahler stream orders [49,50] to the stream raster (Figures 3c and 7b).
8b, and 8c, respectively. The stream order thresholds help to discard those streams that are less likely to contain DS (see Figure 8a, b).
Candidate DS (green dots), obtained with a stream order threshold of 7, meaning all streams with stream orders of 7 or higher were intersected with centerlines of FHWA and non-FHWA roads, showed close agreement with the benchmark DS dataset (Figure 8b). A large number of false candidate DS were obtained with the stream order threshold of 6 (green dots within red rectangles in Figure 8a). Several valid candidate DS were omitted at a stream order threshold of 8 (red rectangles in Figure 8c). Therefore, a stream order threshold of 7 was used for mapping DS in Site A ( Figure 1a) and Site B (Figure 1b).  The candidate DS enclosed within the area marked with a red rectangle in Figure 8b indicates that a single benchmark DS has several competing candidate DS mapped at a In the context of automation, no information was gained by plotting the benchmark DS dataset over the stream raster (see red rectangles in Figure 7a). We found that the benchmark DS dataset aligned with streams of higher orders (see red rectangles in Figure 7b) by making the streams of lower orders, e.g., stream orders 1-5, less significant ( Figure 7b). Because of this, Strahler stream orders were implemented in DSMA so that different stream order thresholds could be used automatically. Examples of candidate DS obtained after intersecting DNs with stream order thresholds of 6, 7, and 8 are shown in Figure 8a-c, respectively. The stream order thresholds help to discard those streams that are less likely to contain DS (see Figure 8a,b). 8b, and 8c, respectively. The stream order thresholds help to discard those streams that are less likely to contain DS (see Figure 8a, b). Candidate DS (green dots), obtained with a stream order threshold of 7, meaning all streams with stream orders of 7 or higher were intersected with centerlines of FHWA and non-FHWA roads, showed close agreement with the benchmark DS dataset (Figure 8b). A large number of false candidate DS were obtained with the stream order threshold of 6 (green dots within red rectangles in Figure 8a). Several valid candidate DS were omitted at a stream order threshold of 8 (red rectangles in Figure 8c). Therefore, a stream order threshold of 7 was used for mapping DS in Site A ( Figure 1a) and Site B (Figure 1b).  The candidate DS enclosed within the area marked with a red rectangle in Figure 8b indicates that a single benchmark DS has several competing candidate DS mapped at a Candidate DS (green dots), obtained with a stream order threshold of 7, meaning all streams with stream orders of 7 or higher were intersected with centerlines of FHWA and non-FHWA roads, showed close agreement with the benchmark DS dataset (Figure 8b). A large number of false candidate DS were obtained with the stream order threshold of 6 (green dots within red rectangles in Figure 8a). Several valid candidate DS were omitted at a stream order threshold of 8 (red rectangles in Figure 8c). Therefore, a stream order threshold of 7 was used for mapping DS in Site A (Figure 1a) and Site B (Figure 1b).
The candidate DS enclosed within the area marked with a red rectangle in Figure 8b indicates that a single benchmark DS has several competing candidate DS mapped at a stream order threshold of 7, all of which are considered duplicate DS. To fix duplicate DS, the candidate DS refinement stage was implemented.

Candidate DS Refinement
The candidate DS refinement process begins by placing a buffer around each candidate DS (Figure 9a-c). Buffers that overlap with neighboring buffers are clusters of the duplicate DS; therefore, these buffers are merged into a single buffer (Figure 9d-f). All candidates within the merged buffer are later removed (Figure 9d-f). Finally, the centroid of each merged buffer is computed (blue dots in Figure 9d-f), which represents the output of DSMA (see the DSMA workflow in Figure 3d). This is referred to as "mapped DS" in the rest of the article.

Candidate DS Refinement
The candidate DS refinement process begins by placing a buffer around each candidate DS (Figure 9a-c). Buffers that overlap with neighboring buffers are clusters of the duplicate DS; therefore, these buffers are merged into a single buffer (Figure 9d-f). All candidates within the merged buffer are later removed (Figure 9d-f). Finally, the centroid of each merged buffer is computed (blue dots in Figure 9d-f), which represents the output of DSMA (see the DSMA workflow in Figure 3d). This is referred to as "mapped DS" in the rest of the article.  proved to be less effective because values higher than 15 m eliminate several valid DS, resulting in omissions ( Figure 10). Figure 10. Refining process to remove duplicate candidate DS using different refinement buffer values.

Accuracy Assessment
To analyze and quantify the performance of the proposed algorithm along with FHWA and non-FHWA roads of an urban site and a rural site, we undertook positional and prediction accuracy assessments using a benchmark DS dataset (see Section 2.2.4).

Positional Accuracy
A one-to-one correspondence was established between the mapped DS and the benchmark DS, e.g., each mapped DS locates its closest neighboring benchmark DS. A unique ID was generated for each benchmark DS so that one-to-many correspondences could be avoided. The Euclidian distance of the offset ( ) was calculated between each mapped DS ( 1 , 1 ) and its corresponding benchmark DS ( 2 , 2 ) using Equation (1).

Prediction Accuracy
The F1-score, which is the harmonic mean of precision (P) and recall (R), was used to assess prediction accuracy. An F1-score near 1 indicates good performance, and values closer to 0 indicate a bad performance. The precision and recall were calculated using Equations (2) and (3), respectively. The F1-score was calculated using Equation (4).
where is the number of true positive (TP) cases, i.e., the number of correctly mapped DS; is the number of false positive (FP) cases, i.e., the number of false DS mappings;

Accuracy Assessment
To analyze and quantify the performance of the proposed algorithm along with FHWA and non-FHWA roads of an urban site and a rural site, we undertook positional and prediction accuracy assessments using a benchmark DS dataset (see Section 2.2.4).

Positional Accuracy
A one-to-one correspondence was established between the mapped DS and the benchmark DS, e.g., each mapped DS locates its closest neighboring benchmark DS. A unique ID was generated for each benchmark DS so that one-to-many correspondences could be avoided. The Euclidian distance of the offset (D p ) was calculated between each mapped DS (x 1 , y 1 ) and its corresponding benchmark DS (x 2 , y 2 ) using Equation (1). (1)

Prediction Accuracy
The F1-score, which is the harmonic mean of precision (P) and recall (R), was used to assess prediction accuracy. An F1-score near 1 indicates good performance, and values closer to 0 indicate a bad performance. The precision and recall were calculated using Equations (2) and (3), respectively. The F1-score was calculated using Equation (4).
where N TP is the number of true positive (TP) cases, i.e., the number of correctly mapped DS; N FP is the number of false positive (FP) cases, i.e., the number of false DS mappings; N FN is the number of false negative (FN) cases, i.e., the number of DS that were not mapped by DSMA.

Results
The positional accuracies of DSMA are generally similar for the two sites. The median values of D p were 9.03 m and 10.17 m for the urban site and rural site, respectively; the mean values of D p were 13.53 m and 15.82 m for the rural site and urban site, respectively. Detailed statistics regarding the positional accuracies are shown in Table 3. Notable differences were found for the maximum values and the third quartiles. For the urban site, maximum offsets of 63.45 m were found in residential areas; for the rural site, maximum offsets of 72.53 m were found in the Missisquoi River flood plain. This means the maximum D p values of the mapped DS were found in very low topographical relief areas where the slope was approximately 0 • (Figure 1c,d). The third quartiles were 17.12 m and 10.17 m for the urban sites and rural sites, respectively.
To further investigate the effect of the topographic slope on the positional accuracy of mapped DS, the maximum slope within a 30 m buffer around each DS was calculated using ALS-mDEM. Scatterplots of the offsets (D p ) of mapped DS between the maximum slopes of benchmark DS are shown in Figure 11. The relation between the offset and slope can be better visualized by the moving average line (Figure 11). According to the moving average line, when the slope is less than 5 degrees, the offsets are larger than 15 m and 20 m for the urban site ( Figure 11a) and rural (Figure 11b)

Results
The  Table 3.
Notable differences were found for the maximum values and the third quartiles. For the urban site, maximum offsets of 63.45 m were found in residential areas; for the rural site, maximum offsets of 72.53 m were found in the Missisquoi River flood plain. This means the maximum values of the mapped DS were found in very low topographical relief areas where the slope was approximately 0° (Figure 1c, d). The third quartiles were 17.12 m and 10.17 m for the urban sites and rural sites, respectively. To further investigate the effect of the topographic slope on the positional accuracy of mapped DS, the maximum slope within a 30 m buffer around each DS was calculated using ALS-mDEM. Scatterplots of the offsets ( ) of mapped DS between the maximum slopes of benchmark DS are shown in Figure 11. The relation between the offset and slope can be better visualized by the moving average line (Figure 11). According to the moving average line, when the slope is less than 5 degrees, the offsets are larger than 15m and 20 m for the urban site ( Figure 11a) and rural (Figure 11b) site, respectively. The prediction accuracies (F1-scores) of DSMA for FHWA roads were 0.87 and 0.94 for the urban site and rural site, respectively ( Table 4). The F1-scores for non-FHWA roads were 0.72 and 0.74 for the urban site and rural site, respectively (Table 4). Figure 12 shows The prediction accuracies (F1-scores) of DSMA for FHWA roads were 0.87 and 0.94 for the urban site and rural site, respectively ( Table 4). The F1-scores for non-FHWA roads were 0.72 and 0.74 for the urban site and rural site, respectively (Table 4).  Figure 12 shows the distribution of TP (blue dots), FP (purple dots), and FN (red dots) of mapped DS in the urban site and rural site. A few examples of TPs in cases A, B, C, and D, along with FHWA and non-FHWA roads for the urban and rural site, are marked with red rectangles in Figure 12 and are shown in GE-SV (see Figure 13). The grey dots in Figure 10 denote mapped DS without a corresponding benchmark DS and are herein referred to as "not confirmed" (NC) cases. For the FHWA roads, the number of NC cases (N NC ) were 12 and 66 for in the urban site and rural site, respectively; for the non-FHWA roads, N NC were 86 and 177 for the urban site and rural site, respectively.

False Negative Cases
For the urban site, a total of 17 FN cases were found for FHWA roads, and 15 of them were found along the interstate highway. Because the proposed DSMA was devised to extract DN belonging to a stream or a river, it failed to extract DN belonging to the underground highway drainage system. Two FN examples from the urban site (A) interstate highway are shown in Figure 14c, d using GE-SV images. For the rural site, FHWA roads do not contain an interstate highway; therefore, there were less FN ( Table 4). The lower recall value observed for FHWA roads of the urban site (A) was due to the underground highway drainage system (Table 4).
For both the urban site and the rural site, several FN cases were found for the non-FHWA roads (more specifically, residential streets). There were 28 and 14 FN cases for the urban site (A) and rural site (B), respectively. DS of one side of a residential street were mapped, while the other side was missed entirely, as shown in Figure 15a, b.
In the residential area, most of the streams were of a stream order lower than 6. Therefore, at the stream order threshold of 7, lower-order streams were dropped from DN, and DS were not mapped by the DSMA (Figure 15b). By using a lower stream order threshold of 4, these FN cases could be reduced (Figure 15c). This suggests that an adaptive stream order threshold to account for residential areas (vs. agricultural and forest areas) would be useful for processing a large dataset.
The FN of residential streets (Figure 15) caused the lower recall and F1-score values of non-FHWA roads in urban and rural sites, respectively ( Table 4).
The rest of the FN cases were randomly found along FHWA and non-FHWA roads of the two sites. Examples of these cases are shown in Figure 16. Figure 16a shows the FN cases where both the DN and centerline of the non-FHWA roads were absent for the extraction of DS (denoted by red dots). Figure 16b shows the FN cases where both DN and the centerline of a non-FHWA road were present, but where they did not intersect to extract a DS (denoted by red dots). Figure 16c shows the FN cases where the extracted DN did not cross the FHWA road for DS extraction (denoted by red dots). Most of these cases are associated with data completeness (e.g., road centerlines of FHWA and non-FHWA roads).

False Negative Cases
For the urban site, a total of 17 FN cases were found for FHWA roads, and 15 of them were found along the interstate highway. Because the proposed DSMA was devised to extract DN belonging to a stream or a river, it failed to extract DN belonging to the underground highway drainage system. Two FN examples from the urban site (A) interstate highway are shown in Figure 14c,d using GE-SV images. For the rural site, FHWA roads do not contain an interstate highway; therefore, there were less FN ( Table 4). The lower recall value observed for FHWA roads of the urban site (A) was due to the underground highway drainage system (Table 4).
For both the urban site and the rural site, several FN cases were found for the non-FHWA roads (more specifically, residential streets). There were 28 and 14 FN cases for the urban site (A) and rural site (B), respectively. DS of one side of a residential street were mapped, while the other side was missed entirely, as shown in Figure 15a,b.
In the residential area, most of the streams were of a stream order lower than 6. Therefore, at the stream order threshold of 7, lower-order streams were dropped from DN, and DS were not mapped by the DSMA (Figure 15b). By using a lower stream order threshold of 4, these FN cases could be reduced (Figure 15c). This suggests that an adaptive stream order threshold to account for residential areas (vs. agricultural and forest areas) would be useful for processing a large dataset.
The FN of residential streets (Figure 15) caused the lower recall and F1-score values of non-FHWA roads in urban and rural sites, respectively ( Table 4).
The rest of the FN cases were randomly found along FHWA and non-FHWA roads of the two sites. Examples of these cases are shown in Figure 16. Figure 16a shows the FN cases where both the DN and centerline of the non-FHWA roads were absent for the extraction of DS (denoted by red dots). Figure 16b shows the FN cases where both DN and the centerline of a non-FHWA road were present, but where they did not intersect to extract a DS (denoted by red dots). Figure 16c shows the FN cases where the extracted DN did not cross the FHWA road for DS extraction (denoted by red dots). Most of these cases are associated with data completeness (e.g., road centerlines of FHWA and non-FHWA roads).

False Positive Cases
The regions marked with purple rectangles in Figure 12a, b contain most of the FP cases for FHWA roads in the two sites.
A total of 19 FP cases were found in the urban site and 10 cases of FP cases were found along the FHWA interstate highway (I-89) in the southeast part of the urban site (denoted by the purple rectangle in Figure 12a). These FP cases (purple dots in Figure 17) were due to the presence of some of the DSMA-extracted artificial DN, which intersected with the FHWA road. For fewer locations, road masking caused the removal of some roadside ditches, causing some artificial DN channels in DSMA.

False Positive Cases
The regions marked with purple rectangles in Figure 12a,b contain most of the FP cases for FHWA roads in the two sites.
A total of 19 FP cases were found in the urban site and 10 cases of FP cases were found along the FHWA interstate highway (I-89) in the southeast part of the urban site (denoted by the purple rectangle in Figure 12a). These FP cases (purple dots in Figure 17) were due to the presence of some of the DSMA-extracted artificial DN, which intersected with the FHWA road. For fewer locations, road masking caused the removal of some roadside ditches, causing some artificial DN channels in DSMA.

False Positive Cases
The regions marked with purple rectangles in Figure 12a, b contain most of the FP cases for FHWA roads in the two sites.
A total of 19 FP cases were found in the urban site and 10 cases of FP cases were found along the FHWA interstate highway (I-89) in the southeast part of the urban site (denoted by the purple rectangle in Figure 12a). These FP cases (purple dots in Figure 17) were due to the presence of some of the DSMA-extracted artificial DN, which intersected with the FHWA road. For fewer locations, road masking caused the removal of some roadside ditches, causing some artificial DN channels in DSMA.  The FP cases in the rural site were frequently found along the FHWA road (the purple rectangle in Figure 12b, with a zoomed-in image shown in Figure 18). Although the FHWA road had drainage ditches along its sides (Figure 18c,d) to carry the watercourse to a nearby stream or river (see Figure 18b-d), they were removed from the ALS-mDEM (Figure 18a) due to the road mask that was implemented in the second stage of DSMA. Thus, an artificial DN was formed, which crossed the FHMA road and led to FP cases due to the absence of drainage ditches in ALS-mDEM. The FP cases in the rural site were frequently found along the FHWA road (the purple rectangle in Figure 12b, with a zoomed-in image shown in Figure 18). Although the FHWA road had drainage ditches along its sides (Figure 18c,d) to carry the watercourse to a nearby stream or river (see Figure 18b-d), they were removed from the ALS-mDEM (Figure 18a) due to the road mask that was implemented in the second stage of DSMA. Thus, an artificial DN was formed, which crossed the FHMA road and led to FP cases due to the absence of drainage ditches in ALS-mDEM.
Similar artificial DN, which crossed over the FHWA and non-FHWA roads in the residential areas (Figure 19a,b), were also found in the two sites because the topography of both sides of the roads was removed by the road mask implemented in the second stage of DSMA. There were 8 and 3 FP cases for FHWA and non-FHWA roads, respectively.  Similar artificial DN, which crossed over the FHWA and non-FHWA roads in the residential areas (Figure 19a,b), were also found in the two sites because the topography of both sides of the roads was removed by the road mask implemented in the second stage of DSMA. There were 8 and 3 FP cases for FHWA and non-FHWA roads, respectively. The FP cases in the rural site were frequently found along the FHWA road (the purple rectangle in Figure 12b, with a zoomed-in image shown in Figure 18). Although the FHWA road had drainage ditches along its sides (Figure 18c,d) to carry the watercourse to a nearby stream or river (see Figure 18b-d), they were removed from the ALS-mDEM (Figure 18a) due to the road mask that was implemented in the second stage of DSMA. Thus, an artificial DN was formed, which crossed the FHMA road and led to FP cases due to the absence of drainage ditches in ALS-mDEM.
Similar artificial DN, which crossed over the FHWA and non-FHWA roads in the residential areas (Figure 19a,b), were also found in the two sites because the topography of both sides of the roads was removed by the road mask implemented in the second stage of DSMA. There were 8 and 3 FP cases for FHWA and non-FHWA roads, respectively.

Discussion
The DSMA is implemented using the model builder in ArcGIS 10.8.1 (https://github. com/Nadeem1geo/Drainage-Structures-Mapping-Algorithm-DSMA-) and tested over an area of 50 km 2 (Figure 1) using an Intel Core™ i7-3770K @ 3.5GHz processor with an installed random access memory of 16 gigabytes. The ALS-mDEM was processed at GSD of 0.5 m and a stream order threshold of 7 was set to map DS. The processing times for each module under a different tiling schemes are presented in Table 5. Note: processing time is given in minutes. Table 5 emphasizes the fact that the processing time can be reduced by increasing the tile size. This is because a larger tile size requires reduced iterations to process point clouds, e.g., creating ALS-mDEM tiles and afterwards removing tile boundary artifacts from each ALS-mDEM tile. In the ArcGIS 10.8.1 environment, DSMA can process point cloud tiles with a maximum size of 4 × 4 km because NNI reaches an interpolation limit of a maximum of 15 million points at a point density of 2 pt/m 2 . For that reason, the tiling scheme was implemented to process massive point clouds by splitting them into manageable tiles ( Table 5). Note that the different tiling schemes only affect the overall processing time. The total number of mapped DS remains the same at different tiling schemes because the DSMA mosaics ALS-mDEM tiles into a single continuous ALS-mDEM before mapping DS. The end product for different tiling schemes has always been a single, continuous ALS-mDEM.
The ISA was included to map the DS of non-FHWA roads, e.g., NC cases in both sites (Figure 12a,b). The FHWA roads obtained from government sources do not account for private roads and residential streets as they do not fall under the jurisdiction of the government (i.e., non-FHWA roads); therefore, DS of non-FHWA roads (Table 4) cannot be mapped using FHWA roads only. This feature makes the DSMA robust, as it includes the ISA to map DS placed under non-FHWA roads ( Figure 12). The DSMA successfully predicts where a DS is located (TP) or where a DS may possibly be found (NC; black dots in Figure 12).
Certain limitations were found during the DSMA evaluation. First, ground points must be present in the given ALS data as DSMA utilizes ground points only, and intensity information is of no use for DSMA. Automated or semi-automated ground point filtration must be used for un-classified point clouds. This can be achieved using ground point classification algorithms [51]. Despite those limitations, reasonable positional and prediction accuracies were achieved (Tables 3 and 4) with an F1-score of 0.87 for FHWA urban and 0.92 for FHWA rural, whereas 0.72 was achieved for non-FHWA urban and 0.74 was achieved for non-FHWA rural roads, respectively. In recent times, ALS technology has significantly advanced, acquiring point clouds with higher point-densities and vertical accuracies [29,52]. Therefore, it is believed that the DSMA may show a better performance using more recent ALS data, and this hypothesis will be evaluated in future work. Second, a road mask was an effective approach to eliminate FHWA and non-FHWA roads from ALS ground points. To create an effective road mask from different road classes, buffer values were found manually using ALS ground points ( Figure 5 and Table 2). The buffer values given in Table 2 are effective for Vermont, USA. For other regions of the world, buffer values can vary. For that reason, buffer values must be found according to road classes (as designated by DOTs of other states and countries).
In terms of FN, the DSMA is unable to map DS placed within an underground drainage system of an interstate highway (see Figure 14). This is one of the limitations of the DSMA because it is based on the intersection of the road centerlines with a DN from ALS-mDEM. ALS-mDEM-based DN can only be found for surface watercourses (i.e., rivers and streams); therefore, the underground drainage systems of interstates and state highways constitute an exception. At the data end, low quality and incomplete road centerline information can also compromise DS mapping (see Figure 16a,b). DS mapping can be improved by ensuring data quality and consistency issues are solved, as shown in Figure 16. In terms of FP, the algorithm seems to map DS wrongly for streams that are diverted or artificially channeled along roads through drainage ditches in urban and rural areas (Figures 17 and 18). This is because the DSMA attempts to map DS based on the local topography upstream and downstream of a road (see TP of Figure 14a,b, Figures 17a and 18a). For streams that are diverted along an interstate highway or along roads on agricultural land through drainage ditches, the DSMA aims to find the possible location where the stream should cross a road segment based on the local topography of that particular area. In situ, the stream follows the diverted path through drainage ditches that were removed by the second stage of DSMA (see Figure 18a-d), thus causing the false DS mapping for some locations (see FP of Figures 17 and 18).
The DSMA performance was evaluated and quantified for whole-site areas (Figure 1a,b). The evaluation suggests that the overall prediction accuracies have shown improved results for urban areas with a stream order threshold of 4 ( Figure 15c). The urban areas contain residential streets where small culverts (Figure 13a,c) are placed over streams with stream order thresholds of 4 or higher (Figure 15a). We already demonstrated that the DSMA prediction accuracy could improve for residential areas by setting a lower stream order threshold of 4 (see Section 4.1, Figure 15c). DSMA is a newly developed algorithm with no counterpart in the present market and provides substantial new information on this topic; therefore, a comparison with a state-of-the-art method was not possible. Instead, we assessed the DSMA performance compared to a benchmark DS dataset (see Section 2.2.4) gathered and augmented from reliable sources.
The use of ALS topographic data in geomorphology, hydrology, and flood inundation modeling has increased substantially in recent years [11,16]. For several other studies, detailed DS datasets were either absent or manually incorporated into the research work carried out [15,[17][18][19]53]. With wide-area ALS data availability in many countries, e.g., 3D elevation programs (3DEPs) [54], the DSMA is a useful tool to provide extensive DS datasets over wide areas, as evaluated in this study.

Conclusions
This study proposed a DSMA using ALS point clouds and road centerline information and investigated its performance over an urban site and a rural site of 50 km 2 in Vermont, USA. Due to the presence of elevated roads in ALS-DEM, extracting DN from ALS-DEM becomes problematic. For that reason, ALS-mDEM is introduced in this study. Prior to the creation of ALS-mDEM, ground points within a combined road mask, which was created from FHWA and non-FHWA road centerlines, were removed. The DN, derived from the ALS-mDEM, was then intersected with FHWA and non-FHWA road centerlines to obtain candidate DS. In order to remove duplicate candidate DS, different buffers were experimented with to refine the candidate DS using proximity analysis, which resulted in a 15 m refinement buffer for the obtained mapped DS.
The DSMA evaluation is based on a benchmark DS dataset acquired from VTrans and further augmented using GE-SV by the authors. First, we calculated the one-toone Euclidean distances of mapped DS compared to the benchmark DS to quantify their positional accuracies. The mean positional accuracies were 13.5 m and 15. 8 m for the urban and rural sites, respectively, whereas the median values were 9.03 m and 10.07 m for the urban and the rural sites, respectively. The maximum Euclidean distances of 72.53 and 63.45 m were observed for the urban and the rural sites, respectively, when the topographic slope was approximately 0 • . We concluded that the slope is an important factor in determining the positional accuracies of the DSMA.
The proposed DSMA was designed to map DS placed over rivers and streams; therefore, DS placed within highway drainage systems, which connect to nearby streams, were missed by DSMA.
We used modest hardware and software resources that are available to most researchers. The DSMA processed ALS point clouds covering 50 km 2 in 122 min with an Intel Core™ i7-3770K @ 3.5GHz processor with 16GB of RAM (Random Access Memory). Our evaluation concludes that the proposed DSMA is also capable of processing ALS point clouds over larger areas to uniformly map DS on FHWA and non-FHWA roads.