Flight Path Setting and Data Quality Assessments for Unmanned-Aerial-Vehicle-Based Photogrammetric Bridge Deck Documentation

Imagery from Unmanned Aerial Vehicles (UAVs) can be used to generate three-dimensional (3D) point cloud models. However, final data quality is affected by flight altitude, camera angle, overlap rate, and data processing strategy. Typically, both overview images and redundant close-range images are collected, which significantly increases data collection and processing time. To investigate the relationship between input resources and output quality, a suite of seven metrics is proposed: total points, average point density, uniformity, yield rate, coverage, geometric accuracy, and time efficiency. When applied in the field to a full-scale structure, UAV altitude and camera angle most strongly affected data density and uniformity. An overlap rate of at least 66% was needed for successful 3D reconstruction. Conducting multiple flight paths improved local geometric accuracy more than increasing the overlap rate did. The highest coverage achieved was 77%, due to the formation of semi-irregular gridded gaps between point groups as an artefact of the Structure from Motion process. No single set of flight parameters was optimal for every data collection goal; hence, understanding the impact of flight path parameters is crucial to optimal UAV data collection.


Introduction
To ensure ongoing serviceability and safety, bridges must be inspected periodically as per local regulations (e.g., AASHTO, 1970 [1]; RAIU, 2010 [2]). Although many methods have been developed to support bridge inspection, visual inspection by on-site inspectors dominates. However, visual inspection has many shortcomings: (1) subjective results; (2) access only via heavy and/or specialty equipment; (3) traffic closures; (4) the need for highly skilled, trained inspectors; (5) safety risks for inspectors; and (6) time-consuming and expensive processes. These aspects are particularly challenging in the absence of as-built drawings or an existing 3D model.
Bridge documentation and inspection have been conducted using cameras (Xie et al., 2018) [3] and/or laser scanners (Truong-Hong and Laefer, 2015 [4]; Gyetvai et al., 2018 [5]), and even microwave radar interferometry (Zhang et al., 2018) [6] and synthetic aperture radar. Three-dimensional point clouds can be produced either directly through laser scanning or indirectly by assembling two-dimensional images. However, the quality of these point clouds depends strongly on view angles and offset distances. For example, when the camera or scanner is set on the bridge deck or a river bank, incomplete coverage of the structure may occur due to the fixed field of view and positioning logistics. Low-cost UAVs equipped with cameras provide workarounds and offer many benefits, such as non-contact measurement, avoidance of traffic closures, and use of non-specialized equipment (Atole et al., 2017) [7], while providing better data coverage in hard-to-reach areas like beneath the deck or the upper portions of a bridge's pylons (Chen et al., 2019) [8].
As an alternative to laser scanners, low-cost UAVs equipped with a single digital camera can generate dense and accurate point clouds when coupled with state-of-the-art computer-vision-based methods. Such capabilities have accelerated the adoption of UAV-based data capture for a wide range of infrastructure needs, including building modelling (Byrne and Laefer, 2016) [9], dam inspection (Zhao et al., 2021) [10], construction site monitoring (Hoegner et al., 2016) [11], and road surface evaluation (Chen and Dou, 2018) [12]. However, many factors directly influence the final model's accuracy, resolution, and completeness. While these are known to include camera positioning, the number of images collected, overlap extent, and image quality, the interaction between these factors with respect to their impact on the final 3D model quality has yet to be quantified. Additionally, to explore their capability for comprehensive documentation and to devise optimization strategies, a series of reliable and systematic evaluation metrics is required; to date, these have yet to be established. Therefore, this paper introduces a set of data quality evaluation metrics for bridge deck or roadway point clouds and investigates the interaction between flight path parameters and the quality of the reconstructed point cloud using those metrics and with respect to a terrestrial laser scanner.

Background
In recent years, with improvements in design, control, and navigation technologies, UAVs have become cheaper and more easily accessible (Chen et al., 2016) [13]. In addition to conventional fixed-wing UAVs, newer designs developed for low-altitude, close-range inspection are increasingly available. For example, multirotor UAVs with outstanding hovering capabilities and better safety tolerances for rotor failure are already being used to a limited extent for civil infrastructure inspection [14]. The incorporation of navigation sensors, such as Inertial Measurement Units (IMUs) (Li et al., 2015) [15], and obstacle detection sensors, such as optical flow cameras (Honegger et al., 2013) [16] and ultrasonic sensors (Papa and Del, 2015) [17], is further improving UAV reliability. For data collection purposes, laser scanners (Chisholm et al., 2013) [18] and digital cameras (Ferrick et al., 2012) [19] are commonly used, separately and together, both with and without UAVs. Examples are shown in Table 1. Laser scanning provides high-quality 3D point clouds, but the equipment is comparatively expensive. Imagery is arguably more cost-effective but not without difficulties, especially as many applications, including full documentation and crack detection, require 3D data (Chen et al., 2011) [36]. To obtain depth information from 2D images, the images must be stitched together to form either stereoscopic images or a point cloud. For UAV inspection, the latter is commonly achieved using the Structure from Motion (SfM) method. The approach relies on overlapping images taken from multiple viewpoints to enable the formation of a 3D point cloud. The process starts by detecting key points in each image through which images can be linked. This procedure can be accomplished by applying feature detectors such as the scale invariant feature transform (SIFT) (Lowe, 1999) [37] or the speeded up robust features (SURF) method (Bay et al., 2008) [38].
Then, the 3D structure and camera motion can be estimated from the extracted features to improve triangulation. Subsequently, a sparse bundle adjustment (Lourakis and Argyros, 2009) [39] can be used to optimize the camera positions and generate a sparse point cloud representing the object. Finally, the point density can be intensified by applying multi-view stereo (MVS) techniques (Yasutaka and Hernández, 2015) [40]. Many of these procedures and related algorithms appear singly or in combination in off-the-shelf software products, including Agisoft Photoscan, Pix4D, OpenMVG, and VisualSfM. For the sake of simplicity, in this paper the term SfM will be used to denote the entire reconstruction procedure.
While SfM is well established, the presence of cars, shadows, and specific terrains can complicate the subsequent data processing. The resulting 3D point clouds are also impacted by camera settings, lens distortion, flying height, quality and quantity of images, distribution of perspectives in those images, and capture angles (Smith and Vericat, 2015) [41]. Recent efforts have investigated the impact of these factors on the quality and quantity of SfM-generated point clouds. For example, Byrne et al. (2017) [42] studied the effects of camera mode and lens settings on point density. That study showed that the lens distortion under a wide view mode generated a point cloud only half as dense as the one derived from images with no distortion. Similarly, poor data density and distortions were observed by Chen et al. (2017) [43] under laboratory conditions when the angle of incidence was high. That study recommended combining images from different oblique angles (e.g., 45° with 60°) to minimize the density and distortion issues that appear when they are processed separately. A similar recommendation was made previously by James et al. (2014) [44], where the addition of oblique or parallel images reduced the error in digital elevation models by as much as two orders of magnitude. However, all images may not be equally valuable. For example, Dandois et al. (2015) [45] found that denser point clouds were more easily produced on cloudy days due to the absence of the unwanted shadows produced on sunny days. However, Chen et al. (2017) [43] demonstrated that under laboratory conditions direct light increased the contrast in the images, which improved model accuracy, thereby implying that sunny days will lead to more accurate point clouds even though they may be less dense than those collected on cloudy days. Han et al. (2023) [35] conducted a study on the influence of UAV flight paths on the geometric accuracy of the final model.
However, it is important to note that the geometric accuracy of the point cloud does not solely represent the point cloud quality. In real-world engineering scenarios, the point cloud quality typically requires evaluation from various perspectives, including volume density, completeness, geometric accuracy, and time taken, among others.
Although the aforementioned studies have recognized the effect of some variables related to camera calibration or data post-processing, a systematic understanding of how flight path parameters affect final point cloud quality has yet to be established, especially at lower altitudes (below 50 m) and in the presence of buildings and other infrastructure. Furthermore, a standard evaluation process for SfM point clouds has yet to emerge. While some studies, such as those by Byrne et al. (2017) [46] and Slocum and Parrish (2017) [47], used the final number of points in a point cloud or the point density as a proxy for quality, this is not widely performed, and while a few researchers (e.g., Dandois et al., 2015 [45]) have considered the geometric accuracy of the reconstructed point cloud with respect to GPS and ground control points (GCPs), these properties do not address data completeness or uniformity. To address these knowledge gaps, a systematic evaluation method is proposed in this paper to quantitatively study image-based point clouds. The usefulness of these metrics is then demonstrated as a means to determine the impact of flight parameters on 3D point cloud reconstructions.

Quality-Based Evaluation Metrics
To determine the quality of reconstructed point clouds for bridge deck and road surface documentation, a quintet of new quality-based point cloud evaluation metrics is herein proposed that covers the following aspects: (i) point average density, (ii) uniformity, (iii) completeness, (iv) overall point yield, and (v) geometric accuracy. Having these objective metrics will then enable more informed decisions about UAV flight path planning with respect to the required outputs. Each of these metrics is described in this section and then implemented in the next section as part of an actual field study.

Point Density and Uniformity
The first two proposed evaluation metrics are overall point density and point uniformity. Point cloud density is an indicator of data resolution. When the overall density is too low, small details will not appear in the dataset, which may preclude damage identification because of poor data availability. Conversely, overly dense point clouds will contain redundant data, thereby unnecessarily consuming storage space and slowing analyses. Non-uniform point clouds will include both high-density and low-density areas.
These defects influence the quality of subsequent data processing and the affiliated outputs as well. This may include the performance of neighbor search algorithms and feature estimation processes, further data simplification (Moenning and Dodgson, 2003) [48], surface reconstruction (Huang et al., 2009 [49]; Holz and Behnke, 2014 [50]), and multi-dataset registration (Holz and Behnke, 2014; Huang et al., 2009). In addition, employing algorithms that specify a minimum-density threshold (Zolanvari and Laefer, 2016 [51]; Truong-Hong et al., 2013 [52]) may be especially challenging, as even quantification of the minimum density would require significant resources to establish. Unfortunately, in real surveys, both TLS (terrestrial laser scanning) point clouds and imagery-derived point clouds (referred to here as SfM point clouds) are non-uniform.
To overcome these aforementioned problems, identifying the parameters that most affect the density and uniformity of a point cloud is necessary. Due to different data capturing mechanisms, parameters impacting the TLS and SfM point clouds differ. The non-uniformity of TLS data is directly linked to offset distance and the angle of incidence, as well as data capture speed. Specifically, smaller offsets and incidence angles tend to produce higher densities and lower differences in data distributions and can be represented as largely linear relationships (Laefer et al., 2009 [53]; Quagliarini et al., 2017 [54]). However, the main factors contributing to non-uniformity in SfM point clouds are less understood, and the explicit relationship between image resolution and overlapping rate has yet to be studied systematically.
To identify the critical data capture parameters that affect data densities and uniformities in SfM point clouds collected from UAVs, a volume density calculation is proposed; volume density is more representative, as the surfaces are not entirely flat. The approach considers the point distribution across a sphere. As shown in Figure 1, for each point Pi, the number of neighbour points Ni inside a specified spherical neighbourhood with a radius R is calculated using a k-Nearest Neighbour (kNN) algorithm (Fukunaga and Hostetler, 1973) [55]. The volume density of Pi is equal to Ni divided by the neighbourhood volume. As such, the general density can be represented by the statistical characteristics of each point. As shown in Equation (1), the average density (AD) represents the overall density of the point cloud, while the standard deviation (SD) (Equation (2)) is used to evaluate its uniformity level. As density may vary greatly between datasets, a direct comparison of the SD is not meaningful. Thus, the relative standard deviation (RSD) is introduced in Equation (3) as an indication of the uniformity. Lower RSD values represent more uniform datasets.

AD = (1/n) Σ_{i=1..n} Di, where Di = Ni / ((4/3)πR³)  (1)

SD = √[(1/n) Σ_{i=1..n} (Di − AD)²]  (2)

RSD = (SD / AD) × 100%  (3)
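The density and uniformity statistics above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the brute-force neighbour count stands in for the kNN radius search a k-d tree library would normally provide, and the function name is hypothetical.

```python
import math

def volume_density_stats(points, R):
    """Per-point volume density within a sphere of radius R, then the
    average density (AD), standard deviation (SD), and relative
    standard deviation (RSD) of the whole cloud (Equations (1)-(3))."""
    vol = (4.0 / 3.0) * math.pi * R ** 3  # spherical neighbourhood volume
    densities = []
    for i, p in enumerate(points):
        # Count neighbours of p (excluding p itself) inside the sphere.
        n = sum(1 for j, q in enumerate(points)
                if i != j and math.dist(p, q) <= R)
        densities.append(n / vol)
    ad = sum(densities) / len(densities)
    sd = math.sqrt(sum((d - ad) ** 2 for d in densities) / len(densities))
    rsd = sd / ad * 100.0  # percent; lower RSD means a more uniform cloud
    return ad, sd, rsd
```

For a symmetric layout such as the corners of a square, every point has the same neighbour count, so the RSD evaluates to zero (a perfectly uniform dataset).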

Completeness
The third metric relates to completeness. Incompleteness in SfM point clouds commonly relates to insufficient coverage, insufficient overlap, inability to discern textures in the images, and overall poor image quality. As shown in Figure 2, the missing data appear as either missing patches or randomly distributed empty spots. In contrast, incompleteness in TLS datasets is usually caused by high angles of incidence or line-of-sight interference, both common artefacts of site access issues.

To quantify the completeness of a point cloud, a mesh-based area calculation method is introduced. Since the bridge deck upper surface is nearly a flat plane, a 2D mesh is used to make the calculation more efficient. The process involves first projecting the data points onto a normal plane. Then, a triangulation mesh is built from the projected data points based on their x and y coordinates across the entire plane. Next, a threshold radius α is applied to control the searching radius for the mesh generation. For any point C within the radius α, if a neighbour point exists, a triangular mesh will be generated, as shown in Figure 3. The mesh is then used to calculate the area.
Thus, by controlling the threshold α, the areas with and without coverage can be calculated.
To choose an appropriate α, the average distance of any point to its nearest neighbours must be measured. In this algorithm, points are randomly taken from the original data as querying points and used in a kNN search to find the closest point to each query point. Then, the average Euclidean distance (βave) over all pairs of query points and their closest neighbours is calculated. If the α value is close to or equal to βave, the mesh will overlook the incomplete areas and only represent the real data coverage. Instead, if α is set much larger than βave, the mesh will connect all points and measure the entirety of the pavement. By comparing these two meshes, the degree of coverage for each dataset can be measured and compared, as shown in Figure 4. The completeness index (CI) is equal to the percentage of the area covered by the points compared to the entirety of the area enveloped inside the boundary, as shown in Equation (4).

CI = (covered area / total enveloped area) × 100%  (4)

In an ideal world, the smallest known feature or damage could be used to set α, but that would require a priori knowledge or extensive pre-processing and localized surface generation prior to implementation of this check.
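The completeness check can be sketched in Python under two assumptions not fixed by the text: the 2D triangulation is supplied externally (e.g., the simplices of scipy.spatial.Delaunay over the projected points), and a triangle is retained in the α-mesh only if its longest edge is within α. All function names here are illustrative.

```python
import math
import random

def average_nn_distance(points, n_samples=100, seed=0):
    """beta_ave: mean distance from sampled query points to their
    nearest neighbour, used to guide the choice of alpha."""
    rng = random.Random(seed)
    sample = rng.sample(points, min(n_samples, len(points)))
    return sum(min(math.dist(p, q) for q in points if q is not p)
               for p in sample) / len(sample)

def tri_area(a, b, c):
    # Shoelace formula for the area of a 2D triangle.
    return abs((b[0] - a[0]) * (c[1] - a[1])
               - (c[0] - a[0]) * (b[1] - a[1])) / 2.0

def completeness_index(points, triangles, alpha):
    """Equation (4): CI = covered area / total enveloped area * 100%.
    `triangles` holds index triples from any 2D triangulation of the
    projected points; a triangle counts toward the covered area only
    if its longest edge is within alpha (assumed filtering rule)."""
    covered = total = 0.0
    for i, j, k in triangles:
        a, b, c = points[i], points[j], points[k]
        area = tri_area(a, b, c)
        total += area  # large-alpha mesh: every triangle is kept
        if max(math.dist(a, b), math.dist(b, c), math.dist(c, a)) <= alpha:
            covered += area  # alpha-filtered mesh: real data coverage
    return covered / total * 100.0 if total else 0.0
```

With α comfortably above βave the two meshes coincide and CI is 100%; shrinking α toward βave drops the long, gap-spanning triangles and lowers CI accordingly.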

Geometric Accuracy
The fourth evaluation metric is geometric accuracy, which is important for engineering inspection, especially for applications such as deformation monitoring and quantifiable damage assessment. In a surveying context, Lucieer [57] used LiDAR-derived digital terrain models as the ground truth along with 103 control points for topographic mapping. Such methods rely on GCPs for large-scale global accuracy assessment and have demonstrated SfM point cloud accuracies ranging from 0.05 m to 0.97 m for the applications and equipment considered in those studies. This type of approach works well for topographic surveying, as the goal is to compare the positioning of data points to known positions in the real world.
For documentation, inspection, and modelling, however, the accuracy must be tied to the geometric object under evaluation. For small-scale surveys, Palmer et al. (2015) [58] used TLS data as the ground truth. In that process, fixed features of the structure (e.g., beam length) were used for comparison. However, picking the same points from different datasets for measurement is hard to achieve reliably given the discrete nature of the data capture and is arguably fraught with hard-to-quantify errors. To overcome these problems, Byrne et al. (2017) [42] proposed an evaluation based on the average point-to-point distance. However, the problem remains that the geometry itself is not being checked in the absence of measured drawings, which are rarely available. Moreover, because the point-to-point distance calculation is based on closest-neighbour searching, a non-uniform data distribution will introduce errors into the result as well.
To resolve the problems mentioned above, a cross-section evaluation method for the accuracy assessment is proposed herein. First, each SfM dataset and the TLS ground truth point cloud were aligned using the ICP algorithm (Besl and McKay, 1992) [59]. Then, a cross-section (with a thickness of 5 cm in the x-direction) of the bridge deck from each dataset was manually extracted, as shown in Figure 5. After that, those points were projected onto the Y-Z plane and separated into multiple intervals in the y-direction (Figure 6a). In each interval, the average Z value was calculated. By linking those points, the local surface was assembled (Figure 6b). Lastly, by measuring the difference between each SfM dataset and the TLS dataset, the Pearson correlation coefficient could be calculated through Equation (5). In that equation, cov(A, B) is the covariance between the two sets of mean values along the cross-section from the different datasets, and σA and σB are the standard deviations of each set of mean values.

ρ(A, B) = cov(A, B) / (σA σB)  (5)
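The interval-averaging step and Equation (5) can be sketched as follows, assuming the cross-section slice has already been extracted and projected to (y, z) pairs; the function names are illustrative only.

```python
import math

def section_profile(points, y_min, y_max, n_bins):
    """Mean z per y-interval for a cross-section slice, where `points`
    are (y, z) pairs already projected onto the Y-Z plane."""
    width = (y_max - y_min) / n_bins
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for y, z in points:
        b = min(int((y - y_min) / width), n_bins - 1)  # interval index
        sums[b] += z
        counts[b] += 1
    # Empty intervals yield None so gaps are visible, not silently zeroed.
    return [s / c if c else None for s, c in zip(sums, counts)]

def pearson(a, b):
    """Equation (5): rho(A, B) = cov(A, B) / (sigma_A * sigma_B)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b)) / n
    sa = math.sqrt(sum((x - ma) ** 2 for x in a) / n)
    sb = math.sqrt(sum((y - mb) ** 2 for y in b) / n)
    return cov / (sa * sb)
```

In use, `section_profile` is run once on the SfM slice and once on the TLS slice with identical binning, and `pearson` is applied to the two resulting lists of interval means.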

Data Density Yield
In some studies, the total points appearing in a reconstructed point cloud are used as a proxy to compare the quality of different reconstruction methods. For infrastructure documentation applications, such a broad approach may not encapsulate the true quality of the output, as points appearing in the background or in non-essential areas may contribute little. To determine the extent to which captured data appear in the relevant portion of the point cloud, a density conversion rate (DCR) metric is proposed as a direct indicator of the yield. As shown in Equation (6), ADAOI is the average volume density of the area of interest (AOI), in this case the bridge deck, and PN is the total number of points included in the dataset. The DCR indicates the relative value of the overall point cloud with respect to an area of interest (e.g., the bridge deck). Lower DCR values indicate a lower yield percentage with respect to all data collected.
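Reading Equation (6) as the AOI's average volume density divided by the total point count (an assumption based on the symbol definitions in the text), a brute-force sketch might look like this; the axis-aligned AOI box is a simplifying stand-in for the bridge deck footprint, and the function name is hypothetical.

```python
import math

def density_conversion_rate(points, aoi, R):
    """Assumed form of Equation (6): DCR = AD_AOI / PN, the average
    volume density inside the area of interest divided by the total
    number of points in the whole cloud."""
    xmin, xmax, ymin, ymax = aoi  # illustrative 2D footprint of the deck
    inside = [p for p in points
              if xmin <= p[0] <= xmax and ymin <= p[1] <= ymax]
    vol = (4.0 / 3.0) * math.pi * R ** 3
    # Brute-force neighbour count; a kNN radius search in practice.
    dens = [sum(1 for q in points if q is not p and math.dist(p, q) <= R) / vol
            for p in inside]
    ad_aoi = sum(dens) / len(dens)
    return ad_aoi / len(points)  # lower DCR = lower yield within the AOI
```

Points landing outside the AOI (background, river bank, vegetation) inflate the denominator without raising the AOI density, which is exactly the wasted effort the metric is meant to expose.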

Field Study
To demonstrate the applicability and usefulness of the aforementioned metrics, a field study was undertaken. Such an approach provides insight into how flight path parameter selection affects bridge documentation. In this case, only the bridge deck was considered as the target object.

Scope
The field study considered three common UAV flight path parameters: altitude, oblique angle, and overlapping rate. The Blessington bridge in Co. Wicklow, Ireland, a concrete bridge, was selected as the case study because it lies outside the Dublin airport flight control area and has clear surroundings and light vehicular traffic, which facilitated both the UAV flights and the TLS data collection. More information about the site is presented in Section 4.3.


Methodology
The overall methodology is shown in Figure 7, in which the workflow for obtaining and processing the experimental data from the UAV is shown in parallel to the acquisition and processing of the ground truth data.

Methodology
The overall methodology is shown in Figure 7, in which the workflow for obtaining and processing the experimental data from the UAV is shown in parallel to the acquisition and processing of the ground truth data. The procedure includes data acquisition, processing, and evaluation of the reconstructed point cloud (Figure 7). In regard to the UAV data acquisition, multiple flight paths were designed to help determine the influence of specific parameters on the final 3D model reconstruction. As shown in Figure 8, flight paths 1-5 were situated directly over the bridge deck. These were flown at vertical offsets of 10 m, 15 m, 20 m, 30 m, and 40 m. In each of these configurations, the camera was positioned directly above the bridge deck, and the oblique angle (the angle between the camera centre line and the bridge deck's normal direction) was 0°. These flights considered the impact of elevation. To determine the effect of the oblique angle, two flight paths were undertaken (flight path 9, Figure 8). These were conducted along each side of the bridge, with one oriented at 45° and one at 30° from the bridge deck. In both cases, the offset distance was approximately 15 m from the deck centre, and thus they captured different vantage points. For flight paths 1-9, the image overlapping rate was above 80%.
In addition to those images, path 10 was flown along the previous path, path 1, with an overlapping rate higher than 90%. Tables 2-5 demonstrate how each of these flight paths was used singly and in combination to create 15 distinct datasets. Dataset Groups A and B used images from single flight paths 1-9 to analyse the effect of altitude and angle. Dataset Group C used multiple flight paths to evaluate the effect of different combination strategies. Dataset Group D used images from flight path 10. Dataset D-I used all images as the input, while D-II and D-III were generated using only every second (D-II) or every third (D-III) image in the acquisition sequence. Thus, three different datasets in Group D were created to check the effect of the overlapping rate with respect to image acquisition speed.
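For readers planning similar flights, the relationship between the forward overlapping rate and the along-track image spacing follows from the standard pinhole ground-footprint model. The focal length and sensor dimension below are illustrative placeholders, not the parameters of the survey camera.

```python
def flight_spacing(altitude_m, focal_mm, sensor_mm, overlap):
    """Along-track distance between consecutive images needed for a
    target forward overlap, from the pinhole ground-footprint model."""
    footprint = altitude_m * sensor_mm / focal_mm  # ground footprint, m
    return (1.0 - overlap) * footprint             # shot-to-shot spacing, m

# Illustrative small-UAV camera (4.7 mm lens, 6.3 mm sensor) at 10 m altitude:
for ovl in (0.66, 0.80, 0.90):
    print(f"{ovl:.0%} overlap -> {flight_spacing(10, 4.7, 6.3, ovl):.2f} m between images")
```

Raising the overlap from 66% to 90% roughly triples the number of images per unit length of deck, which is consistent with the time-cost trade-offs discussed for Group D later in the paper.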

Dataset Name    Input Data Source    Path Combination
C-I             Path 2 + 7           Top + one side
C-II            Path 7 + 9           Two sides
C-III           Path 2 + 7 + 9       Top + two sides

After the image acquisition, a standard SfM 3D reconstruction process and a noise reduction process were applied using methods previously introduced by the authors (Chen et al., 2017; Chen et al., 2018 [60]). Then, to ensure that the same section of the bridge was compared, each 3D point cloud was aligned with its accompanying TLS dataset through the ICP algorithm (Besl and McKay, 1992). After data alignment, the bridge deck was extracted from each dataset to evaluate the quality and accuracy of each model on a local (per-point) basis. For this, five metrics were employed: point density (Equation (1)), point uniformity (Equation (3)), completeness of the reconstruction (Equation (4)), geometric accuracy (Equation (5)), and data yield (Equation (6)).
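The alignment step can be illustrated with a minimal point-to-point ICP in the spirit of Besl and McKay (1992). This toy sketch (brute-force nearest-neighbour correspondences and an SVD-based rigid fit) stands in for the production implementation actually used; all data below are synthetic.

```python
import numpy as np

def icp(source, target, iters=50):
    """Minimal point-to-point ICP: iteratively match nearest neighbours,
    then solve the best-fit rigid transform via the Kabsch/SVD procedure."""
    src = source.copy()
    T = np.eye(4)
    for _ in range(iters):
        # Brute-force nearest neighbours (fine for toy-sized clouds).
        d = np.linalg.norm(src[:, None, :] - target[None, :, :], axis=2)
        nn = target[d.argmin(axis=1)]
        mu_s, mu_t = src.mean(0), nn.mean(0)
        H = (src - mu_s).T @ (nn - mu_t)
        U, _, Vt = np.linalg.svd(H)
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:   # guard against reflections
            Vt[-1] *= -1
            R = Vt.T @ U.T
        t = mu_t - R @ mu_s
        src = src @ R.T + t
        step = np.eye(4); step[:3, :3] = R; step[:3, 3] = t
        T = step @ T
    return T, src

# Recover a small, known pose error (3 degree yaw about the centroid
# plus a 5 cm offset) on a synthetic cloud.
rng = np.random.default_rng(1)
target = rng.uniform(size=(200, 3))
theta = np.deg2rad(3)
R = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta),  np.cos(theta), 0],
              [0.0, 0.0, 1.0]])
centre = target.mean(0)
source = (target - centre) @ R.T + centre + 0.05
_, aligned = icp(source, target)
print(np.abs(aligned - target).max())  # small residual once ICP converges
```

Real pipelines replace the brute-force matching with a k-d tree and add distance thresholds and coarse pre-alignment; ICP only converges locally, which is why GPS tagging is useful for the initial pose.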


Experimental Set Up
To investigate flight path optimization, an experiment was conducted using the Blessington bridge in County Wicklow, Ireland. The bridge is constructed of reinforced concrete, is approximately 130 m long and 8 m wide, and is typically situated 10 m above the water level (Figure 9). A DJI Phantom 4 quadrotor was used for the experiment. The UAV was equipped with a 4K camera (3000 × 4000 pixels) and a 3-axis gimbal, as shown in Figures 10 and 11. The total cost for the system was about EUR 1500. The take-off, image capture, and landing operations were manually controlled by a remote pilot through a first-person-view camera, with a mandated second operator to help ensure obstacle avoidance.


Figure 11. Image acquisition (UAV is shown above front right support).

Data Processing
The 3D reconstruction process was performed in the commercial software PhotoScan (Agisoft, 2017) with GPS tagging. In the software, both the image alignment accuracy and the dense point reconstruction quality were set to high. The reconstructions were processed on a Dell XPS 15 laptop (i7 GPU, 16 Gb RAM); the results are reported in Section 6. Point cloud registration, manual bridge deck extraction, and density calculations were achieved through the open-source software CloudCompare 2.11.3 (CloudCompare, 2017) [61], and Equations (3)-(7) were implemented in MATLAB.

TLS Data Collection
The TLS data to be used for benchmarking were collected with a Leica Scan Station P20 terrestrial laser scanner (Figures 12 and 13). The bridge deck was captured from a total of 10 scan stations (see Figure 14) along the side path of the bridge. The resolution was set as 6.1 mm at 10 m, resulting in a sampling step of 5 mm. That data collection took approximately 3 h by one surveyor, including logistics and scanner set up. The scanning only required about 7 min per scan station, including data and target capture. Scan co-registration was performed by using Leica's proprietary software Cyclone (V9.1). The final dataset contained approximately 270 million points. The local geometric accuracy was measured using TLS as the ground truth. TLS data have high resolution and accuracy at close distances via a single scan; the multiple long-distance surveys that would be required for global accuracy would suffer cumulative errors introduced by the registration process.


Collected Data
The image acquisition process was conducted in the early morning to minimize vehicular-based occlusions. During the UAV imagery acquisition process, 526 images were captured across the 10 flight paths. Of the 55 min required for imagery data collection, 14 min were for site checks, take-offs, reversals, and landings (see Table 6 for more details). The highest ground resolution (GR) achieved was 3.71 mm/pixel. The individual flights ranged from 2 to 10 min, yielding as few as 21 and as many as 143 images at data capture rates of 8.7 to 15.7 images per minute but at a constant overlapping rate. More details are shown in Table 6.
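The altitude dependence of the ground resolution follows the pinhole ground-sampling-distance model. The focal length and pixel pitch below are illustrative assumptions, not measured values from this survey, so the outputs only show the trend, not the paper's exact figures.

```python
def ground_resolution_mm(altitude_m, focal_mm, pixel_pitch_um):
    """Nominal ground sampling distance (mm/pixel) from the pinhole model:
    GSD = altitude * pixel_pitch / focal_length (all lengths in mm)."""
    return altitude_m * 1000 * (pixel_pitch_um / 1000) / focal_mm

# Illustrative small-sensor camera: 3.61 mm lens, 1.56 um pixel pitch.
for h in (10, 15, 20, 30, 40):
    print(f"{h} m -> {ground_resolution_mm(h, 3.61, 1.56):.2f} mm/pixel")
```

The linear growth of the GSD with altitude is why the A Test series shows the strongest density and accuracy changes across the 10 m to 40 m flights.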

Error Sources
A key aim of this paper is to provide a better understanding of how different UAV flight paths impact the quality of imagery-based point clouds for the inspection of bridge decks and similar infrastructure. To that end, flight paths were designed with pre-specified altitudes and offset distances from the bridge. However, the equipment's on-board GPS system has an advertised hover accuracy of ±0.5 m in the vertical direction and ±1.5 m in the horizontal direction (DJI, 2019). Furthermore, the field conditions included wind effects. By checking the camera pose estimation results, the presence of drift was confirmed. For example, flight path 1 and flight path 10 were intended to have identical altitudes of 10 m. In reality, the average capture distance for path 1 was 9.5 m, while that for path 10 was 10.5 m. While such differences affected the ground resolution of the captured images, the general trends being reported herein were not impacted.
Characteristics of the SfM point clouds derived from those images are shown in Table 7. Generally, the total processing time correlated with the quantity of input images (Figure 15). However, dataset C-III used less processing time than the less-populated dataset D-I. A possible reason is that the multiple flight paths were parallel to each other. Thus, overlapping between images occurred both in the horizontal and vertical directions, which appears to have decreased the feature matching time, an effect also observed by Byrne et al. (2017) in a different experimental arrangement.


Density and Uniformity Comparison
As expected, the TLS dataset had a point density with a radial distribution, with the scanner at its centre producing higher-density point areas closer to the scanner. Lower-density strips are an artefact of cars or pedestrians passing in front of the scanner. Within allowable time constraints, these were minimized by re-performing the scans. In contrast, the SfM point cloud exhibits a largely uniform point distribution across the study area, interspersed with waves of slightly lower density strips, as shown in Figure 16. This comparative homogeneity of the data offers a constant data resolution across the entire structure and reduces post-processing difficulties, as previously mentioned. To further understand how the flight path setting interacted with general point density and uniformity, the volume-based density calculation method introduced in Section 3.1 was applied to all 15 datasets (Table 7). The results are shown in Table 8.

Figure 16. Data density maps of TLS and SfM point clouds.

As expected, the results shown in Table 8 demonstrate a significant correlation between the flight altitude and data density, with lower flights generating more uniform datasets (approximately 1% improvement in RSD per metre). The linear overlapping rate also affected the density. In the D Test series, as the overlap rate increased from 66% to 90%, the data density increased by about 10%, while in the A Test series the density increased by more than 227% when the altitude decreased from 40 m to 10 m.
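The volume-based density and uniformity (RSD) computation can be sketched by voxelizing the cloud and examining the per-voxel point counts. The voxel size and the two toy clouds below are arbitrary choices for illustration, not the paper's Section 3.1 parameters.

```python
import numpy as np

def voxel_density_stats(points, voxel=0.1):
    """Average density (points per occupied voxel) and its relative
    standard deviation (RSD, %): a sketch of a volume-based uniformity check."""
    idx = np.floor(points / voxel).astype(int)
    _, counts = np.unique(idx, axis=0, return_counts=True)
    ad = counts.mean()
    rsd = 100.0 * counts.std() / ad
    return ad, rsd

rng = np.random.default_rng(2)
uniform_cloud = rng.uniform(0, 1, size=(5000, 3))
clumped_cloud = np.vstack([uniform_cloud,
                           rng.normal(0.5, 0.02, size=(5000, 3))])

ad_u, rsd_u = voxel_density_stats(uniform_cloud)
ad_c, rsd_c = voxel_density_stats(clumped_cloud)
print(f"uniform: AD={ad_u:.1f} pts/voxel, RSD={rsd_u:.0f}%")
print(f"clumped: AD={ad_c:.1f} pts/voxel, RSD={rsd_c:.0f}%")
```

A radially concentrated TLS scan behaves like the clumped cloud (high RSD), while a well-flown SfM survey behaves like the uniform one, which is the contrast Table 8 quantifies.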
As shown in Table 9, when comparing B Test series outputs to A Test series outputs, datasets obtained with narrower oblique angles at the same altitude led to denser point clouds than those collected with wider ones. Also shown in Table 9, datasets with similar ground resolutions (B-I, B-III, and A-II) exhibited similar average densities. Importantly, the C Test series showed that, instead of increasing the final point density, adding more flight paths from various angles decreased the final point cloud density. Based on work by Byrne et al., the extra images may be providing rich geometric information, which would then allow for the better detection of invalid points or noise and their subsequent removal as part of the reconstruction process. This concept of quality over quantity is further explored in Section 6.1.
To better understand the RSD changes in the SfM point cloud, a density map was generated (Figure 17) for the A Test series. At each flight altitude, different data density patterns appear. Close-ups (at 10 and 20 times magnification) illustrate that those patterns segmented the point cloud into numerous irregular grids, and at the boundary of each grid, points are missing (Figure 17). The grid size and the gap width are highly related to flight altitude (Figure 17). A probable reason is that with increasing altitude, the ground resolution of each pixel increases correspondingly. As the pixel is the smallest unit for feature detection, ground resolution directly affects the feature matching process (Verhoeven et al., 2015; Apollonio et al., 2014) [62,63]. After feature matching, the dense reconstruction process occurs through MVS algorithms to generate denser patches around matched seed features (Shao et al., 2016 [64]). In the detailed inspection of the data, around each patch is a gap where no data exist, which reduces the overall average density. The size of the gaps increases with altitude (Figure 17).


Completeness Comparison
The aforementioned gaps are treated as incomplete areas and quantified by Equation (4). The average point-to-point distance β_ave of each dataset was selected as the threshold for mesh generation. Areas where this threshold was exceeded were considered as incomplete. Table 10 demonstrates that for the particular equipment and the specific bridge in this field study, the single path datasets ranged in completeness from just over 66% to nearly 77%, generally producing better results at lower altitudes. The fixed angle and different altitudes in Group A showed a U-shaped distribution in completeness. With increasing altitude, the completeness level dropped quickly in the beginning. Then, above 20 m, it increased again but more slowly. The highest completeness was achieved by the lowest flight path centred over the pavement's centre. When the 80% overlapping rate in A-I was increased to 90% at the same altitude in D-I, the completeness rate nudged slightly higher but certainly nothing close to proportional for the additional quantity of data being collected and processed. Dataset Group B showed that, depending upon the oblique angle, much greater completeness can be achieved with significantly less data. In this case, the completeness was nearly 10% more, even in the absence of nearly a third of the data. Interestingly, when flight patterns were mixed, other complexities arose, as shown in Group C, where the completeness levels were less than those of the other groups, as measured herein. The multiple flight paths caused a mixing of the grid layouts, thereby resulting in a range of gap sizes (Figure 17). When processed according to the procedure described in Section 3.2, a small threshold was selected, which was then used to calculate the completeness rate. Consequently, the non-uniform gaps introduce an artefact into the dataset that influences the calculation. Therefore, this must be considered as a limitation of this newly proposed metric.
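A simplified two-dimensional stand-in for the completeness check of Equation (4) is to grid the deck footprint and count occupied cells; the cell size here plays the role of the β_ave threshold, and the deck extent and data gap below are hypothetical.

```python
import numpy as np

def completeness(points_xy, extent, cell):
    """Share of grid cells over the deck footprint containing at least one
    point: a 2D sketch of a mesh-threshold completeness test."""
    (x0, x1), (y0, y1) = extent
    nx = int(np.ceil((x1 - x0) / cell))
    ny = int(np.ceil((y1 - y0) / cell))
    ix = np.clip(((points_xy[:, 0] - x0) / cell).astype(int), 0, nx - 1)
    iy = np.clip(((points_xy[:, 1] - y0) / cell).astype(int), 0, ny - 1)
    occupied = len(set(zip(ix.tolist(), iy.tolist())))
    return occupied / (nx * ny)

# A 130 m x 8 m deck where one 2 m strip along an edge was never captured:
rng = np.random.default_rng(3)
pts = rng.uniform([0, 0], [130, 6], size=(20000, 2))  # y > 6 m is a gap
print(f"{completeness(pts, ((0, 130), (0, 8)), cell=0.5):.2f}")
```

Note the sensitivity to the cell size: shrinking it makes sparsely sampled but present regions register as incomplete, which is exactly the artefact described above when mixed flight paths produce non-uniform gaps.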

Geometry Accuracy Comparison
A geometric accuracy assessment was conducted by comparing a cross-section of each SfM point cloud to the equivalent portion of the TLS point cloud. In this test, the cross-section was divided into 200 intervals. The mean altitude of each interval was calculated and compared through the method introduced in Section 4.2.
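The 200-interval cross-section comparison can be sketched as follows. The toy data plant a known 3 mm vertical bias between the two sections so the metric's behaviour is visible; all values are illustrative.

```python
import numpy as np

def section_accuracy(sfm_xz, tls_xz, n_bins=200):
    """Mean absolute elevation difference between two cross-sections,
    each averaged over n_bins intervals along the section."""
    lo = min(sfm_xz[:, 0].min(), tls_xz[:, 0].min())
    hi = max(sfm_xz[:, 0].max(), tls_xz[:, 0].max())
    edges = np.linspace(lo, hi, n_bins + 1)

    def binned_mean(xz):
        which = np.clip(np.digitize(xz[:, 0], edges) - 1, 0, n_bins - 1)
        sums = np.bincount(which, weights=xz[:, 1], minlength=n_bins)
        cnts = np.bincount(which, minlength=n_bins)
        return np.where(cnts > 0, sums / np.maximum(cnts, 1), np.nan)

    a, b = binned_mean(sfm_xz), binned_mean(tls_xz)
    ok = ~np.isnan(a) & ~np.isnan(b)  # compare populated intervals only
    return np.mean(np.abs(a[ok] - b[ok]))

# Toy 130 m section: flat TLS reference vs. SfM with a 3 mm bias + noise.
rng = np.random.default_rng(4)
x = rng.uniform(0, 130, size=(50000,))
tls = np.column_stack([x, np.zeros_like(x)])
sfm = np.column_stack([x, np.full_like(x, 0.003) + rng.normal(0, 0.001, x.size)])
print(f"{section_accuracy(sfm, tls) * 1000:.2f} mm")
```

Averaging within intervals suppresses per-point noise, so a systematic offset on the order of millimetres is recoverable even from noisy clouds.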
As visible in Figure 18 and Table 11, the geometric accuracy was affected by all parameters. As expected, the Group A test series showed a linear improvement with lower altitudes. Similarly, the overlapping rate and oblique angle had direct effects on the accuracy. Using multiple view angles increased the geometric accuracy compared to processing each angle separately. In summary, as expected, the best overall results were achieved at lower altitudes, smaller angles, and higher overlapping rates.


Data Yield
The DCR results are shown in Table 12. According to the Group A tests, lower altitudes have higher DCRs, as would be expected, since peripheral information such as the river or its banks is not being captured. Test Series B shows that the oblique angle will also decrease the DCR, because the oblique angle captures more of the bridge's side view. In Test Series C, even though the total numbers of points (PN) were similar across the data series, the average point density (AD) and the final data yield (DCR) differed significantly. Capturing the bridge's side data negatively impacted these two metrics. Test Series D shows that the higher overlapping rate improved the PN significantly. However, the AD did not change much, and the DCR decreased when the overlapping rate increased, which means the higher overlapping rate decreased the efficiency of point utilization. To provide guidance for flight path planning, each category of data analysed in Tables 7-12 was normalized by the highest value achieved across the 15 datasets and compiled in Table 13. Those datasets with the best performance in at least one metric were further analysed in a seven-pronged radar map to show a more holistic performance across the various metrics (Figure 19). Unlike previous research using the total point numbers or the average density as a unique standard to evaluate the reconstruction performance, herein seven different metrics are proposed and compared. Both in Table 13 and Figure 19, highly distinctive patterns can be observed. As such, proper flight path selection must be informed by the survey's purpose.
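The normalization behind Table 13 can be sketched as dividing each metric by the best value achieved across the datasets so different units become comparable on a radar plot. For cost-type metrics (e.g., processing time), such a comparison would first invert the values so that higher is better, a step omitted here; the metric names and scores below are hypothetical.

```python
def normalize_metrics(datasets):
    """Scale every metric by the best value achieved across all datasets,
    so each dataset's scores lie in (0, 1] for radar-map comparison."""
    names = datasets[next(iter(datasets))].keys()
    best = {m: max(d[m] for d in datasets.values()) for m in names}
    return {k: {m: v[m] / best[m] for m in names} for k, v in datasets.items()}

# Hypothetical raw scores for two datasets:
raw = {
    "A-I": {"density": 120.0, "completeness": 0.77, "accuracy": 0.9},
    "D-I": {"density": 100.0, "completeness": 0.70, "accuracy": 1.0},
}
scaled = normalize_metrics(raw)
print(scaled["A-I"])  # every metric now lies in (0, 1]
```

Plotting the scaled rows on a shared radar axis immediately exposes the trade-offs: no dataset dominates on every spoke, which is the paper's central observation.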
For example, Dataset A-I, which had the closest survey distance (9.5 m) and no oblique angle offset, produced the highest average density, yield rate, and uniformity, demonstrating that this flight path can generate a well-distributed point cloud. Additionally, it also had a good balance in completeness, geometric accuracy, and time efficiency. Dataset D-I illustrated that by adding more images to increase the overlapping rate, the completeness and accuracy levels could be increased. However, this improvement is costly. To improve the completeness by 1%, when compared to D-II, D-I tripled the time cost in image acquisition and post-processing. In some surveys, rapid assessment through shorter flight times and limited processing periods is important, such as after natural disasters. In those cases, if the bridge deck area is the focus of concern, then flying at a higher altitude directly over the bridge (e.g., the path A group) may be the most appropriate choice at the cost of accuracy and density.
The evaluation concepts of accuracy and completeness, as well as point yield, introduced in this paper, provide a more holistic and, arguably, more rigorous approach to UAV-based imagery acquisition for bridge documentation. In fact, the experimental work demonstrates that maximizing point density may actually be counterproductive to obtaining cost-effective and comprehensive point clouds depending upon the position of the UAV with respect to the areas targeted for documentation.

UAV Photogrammetry vs. TLS
TLS is often proposed as an alternative solution for bridge documentation. For quality evaluation purposes, this section compares the best SfM point clouds achieved in the experiments with those achieved by TLS. As mentioned in some other studies (Hallermann et al., 2015; Chen et al., 2018 [65]), the advantages of UAV imagery data collection include high efficiency and low costs. In this experiment, even with multiple flight paths (10 paths), the entire flight time was less than one hour, only a third of the TLS data collection time. In this instance, the post-processing times were almost the same for the UAV images' SfM reconstruction and the TLS data's co-registration (Table 14). However, as illustrated in Figure 19, the SfM post-processing time is highly dependent upon image quantity. If the datasets have large amounts of sky and water, image matching becomes harder and more time consuming. However, if the imagery is collected via video, a limited number of frames can be automatically selected to restrict the image matching process during reconstruction, as explained by Byrne et al. (2017). In contrast, the TLS data co-registration time is largely linear, more predictable, and can be minimized by reducing the number of scan station locations. Additionally, the UAV-SfM system used in this study cost only 10% of the TLS system budget and generated a competitive result (Table 14). However, this figure does not include UAV training, permitting, or insurance costs. This paper's experimental results for documenting a bridge deck demonstrated that a well-designed flight path can achieve two-thirds of the average density of the TLS result, with a geometric difference as little as 3 mm. While this figure is important, what is arguably of greater concern for further post-processing is the uniformity of the point cloud. The UAV-SfM point cloud is much more uniform than the TLS result (RSD of 5.56-25.6% vs. 73.12%) and with almost no low-density pockets.
Moreover, with the designed metric and strict threshold, the completeness level of TLS is only 7.49%. That means only 7.49% of the entire survey area was covered by well-distributed, high-density points. In contrast, the UAV-SfM method easily achieved more than 50%. The higher completeness and better uniformity of the SfM point cloud have many benefits for inspections, such as (1) fewer unknowns and (2) a greater ability to obtain consistent post-processed objects, as the input is more uniform. However, the UAV method is highly vulnerable to the weather. Wind especially affects flight path quality by causing the camera to shake and the UAV to drift, both impacting the final quality. Sunlight was also shown to have some impact. Thus, when designing a proper flight path for a specified quality, those issues should be considered ahead of time.

Conclusions
To optimize UAV-SfM bridge deck inspection or similar applications, flight path design and data capture considerations in terms of altitude, angle of capture, overlapping rate, and combined flight paths were explored. To evaluate the various outcomes, this paper proposed a suite of seven evaluation metrics to check the variance of point cloud quality and overall efficiency in the form of the total number of points, average density, uniformity, yield rate, completeness, geometry accuracy, and time efficiency. In the presented case study of the Blessington Bridge, bridge deck geometry was acquired from 10 different flight paths, from which 15 groups of point cloud datasets were assembled as generated through an SfM method. Evaluation of these 15 point clouds established that both altitude and oblique angle significantly affected the point density and uniformity.
Several major conclusions can be drawn from this study. First, irrespective of the individual and combined parameters, the SfM process resulted in point groupings in semi-irregular grids with clearly identifiable gaps between point groups. The size of both the grids and the gaps increased at higher flight altitudes. Multiple flight paths resulted in a combination of the individual grid patterns from the specific flight paths, which decreased the general completeness rate but improved the overall geometric accuracy. The best completeness (77%) was achieved by a single flight path with the lowest altitude (9.5 m) and an 80% overlapping rate. Next, while the overlapping rate strongly affects the total number of points, it only weakly impacts the average density of the portion of the point cloud representing the deck surface, thereby negatively impacting the time efficiency without strongly improving the data yield rate. However, in this study, a minimum overlapping rate of 66% was found to be needed to successfully achieve the SfM reconstruction process.
Additionally, this research suggests that there is not a unique solution for UAV bridge deck surveys due to the complex relationship between the flight path settings and the specific survey objectives (e.g., accuracy, completeness, and economy), which strongly influence the optimal data capture strategy (Figure 19). For example, if high accuracy is the goal, using a lower altitude, a smaller angle, and a higher overlapping rate can achieve better results than other flight path combinations. Finally, in the case study presented herein, the UAV-SfM method demonstrated some critical advantages over TLS documentation, including time efficiency, general cost, and data uniformity, but at the expense of point density and some accuracy.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Data Availability Statement:
All data and codes that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest:
We declare that none of the work contained in this manuscript is published in any language or currently under consideration at any other journal, and there are no conflicts of interest to declare.