Accuracy of 3D Landscape Reconstruction without Ground Control Points Using Different UAS Platforms

The rapid increase of low-cost consumer-grade to enterprise-level unmanned aerial systems (UASs) has resulted in the exponential use of these systems in many applications. Structure from motion with multiview stereo (SfM-MVS) photogrammetry is now the baseline for the development of orthoimages and 3D surfaces (e.g., digital elevation models). The horizontal and vertical positional accuracies (x, y and z) of these products generally rely heavily on the use of ground control points (GCPs). However, for many applications, the use of GCPs is not possible. Here we tested 14 UASs, ranging from consumer to enterprise-grade vertical takeoff and landing (VTOL) platforms, to assess the positional and within-model accuracy of SfM-MVS reconstructions of low-relief landscapes without GCPs. We found that high positional accuracy is not necessarily related to the platform cost or grade; rather, the most important aspect is the use of post-processing kinematic (PPK) or real-time kinematic (RTK) solutions for geotagging the photographs. SfM-MVS products generated from UASs with RTK- or PPK-based geotagging, regardless of grade, result in greater positional accuracies and lower within-model errors. We conclude that where repeatability and adherence to a high level of accuracy are needed, only RTK and PPK systems should be used without GCPs.


Introduction
The recent rapid development of relatively low-cost (<US$25,000) small unmanned aerial systems (UASs) (<25 kg) has resulted in their use in a myriad of disciplines. Precision agriculture [1,2], archeological reconstruction [3,4], forestry [5,6], geomorphology [7][8][9], freshwater and marine systems [10][11][12], environmental monitoring [13,14], animal population studies [15][16][17][18], and recently, traffic accident reconstruction [19,20] are just a few of the fields where UASs have been exploited, and their use continues to grow with enhanced platform capabilities (e.g., real-time analysis) [21], advanced sensors [22][23][24] and diverse software implementations [25,26]. The high accuracy and precision (i.e., cm error) of results obtained for 2D (e.g., orthomosaics) and 3D (e.g., digital surface model, DSM) mapping, based on structure from motion with multiview stereo photogrammetry (SfM-MVS), is transforming most disciplines where studies need to characterize areas up to ~100 hectares [1,27,28]. In addition, more operationally demanding systems such as hyperspectral pushbroom sensors [22,29], thermal imagers [24] and LiDAR [30] are being implemented on UASs, providing new insights for data fusion and advanced data analysis [31]. Centimeter-level accuracies are particularly important in these multi-sensor applications where spatial alignment of the different datasets is needed. At the moment, the vast majority of applications within the public domain are limited to small UASs with limited coverage due to the near-universal regulatory requirement that these systems be flown within visual line of sight (VLOS) [32]. In addition, limited battery performance [33] and a restricted envelope of weather conditions for operations compound the current limitations of small UAS operations. However, as UAS technology matures and more robust operations such as beyond visual line of sight (BVLOS) become common [34], novel applications will continue to be tested [35]. It is important to recognize, however, that small UAS BVLOS is still in its early stages of development [36] and it will be some time before unfettered BVLOS operations are allowed by airspace regulators. In these future BVLOS applications, using GCPs collected on-site will not always be a viable option because of the large extents that will be covered by the imagery and the potential remoteness or inaccessibility of the study area (e.g., [35,37]).
While the early development of SfM photogrammetry emerged approximately 40 years ago [38], its popularity in geomatics has increased with access to higher performance computers/workstations and the availability of UAS-based photography [39].A generic SfM pipeline reconstructs the landscape as a sparse 3D point cloud from overlapping 2D photographs.Additional products such as a dense 3D point cloud (through the application of an MVS algorithm), a DSM from interpolation of the point cloud or a textured mesh can also be generated.The SfM algorithms locate common points in the multiple 2D photographs taken from different viewing positions (and angles) from which the landscape is reconstructed in 3D [40,41].The SfM pipeline does not actually require any geopositioned information for the photographs.In the absence of coordinates, it recovers the camera parameters, and position and orientation estimates from the photographs resulting in more flexibility than conventional photogrammetry from stereo-pairs [42].A review of different algorithms and implementations of SfM can be found in [43].
Many factors that affect the quality of the UAS-based SfM-MVS photogrammetry products (e.g., orthomosaic and DSM) are described in detail by [44,45].The first consideration is the image size as defined by the number of pixels on the imaging sensor, namely the pixel size and pixel pitch (i.e., the linear distance from the center of a pixel to the center of the adjacent pixel on the detector array).The size of the detector array and the size of the pixels affect the resolving power of the sensor.Given a specific pixel size, larger arrays accommodate more pixels that capture more incoming electrons, but given a specific sensor size, larger pixels are a tradeoff in a corresponding loss of spatial resolution.The second factor is the sensor's radiometric resolution (e.g., 8-14 bits), where a higher radiometric resolution corresponds to the system being able to accommodate a greater range of intensities of incoming radiation.A final important factor to consider is the accurate location of imaged features with respect to the real-world coordinates (absolute position) and accurate geometries (dimensions, distances and volumes) as they are represented in the products (relative accuracy).In addition, minimization of image blur and distortion for a given camera and lens implementation during mission planning requires the consideration of specific camera and lens parameters (e.g., f-stop, shutter speed and ISO) [44].In addition to hardware specifics, several studies have addressed equally important aspects of data collection from UASs such as flight altitude [46,47], image overlap [48] and environmental conditions (i.e., wind speed) for optimizing data collection [22], which are closely related to the implementation's objective(s).
However, positional accuracy is still a very challenging attribute to conceive and quantify for many UAS practitioners.The general lack of knowledge of true positional errors is in part, due to the relatively high degree of ease of collecting ultra-high-resolution (few cm) images.It has been recognized, previously, that UASs and their system implementation requires improvement [49,50].Applications where high accuracy and/or precision (e.g., repeatability) are required, have relied on the use of ground control points (GCPs) for improving product geolocation with respect to real-world coordinates and for ensuring accurate measurements of geometries within the end-products (e.g., [51]).For instance, [52][53][54] have shown the utility of GCPs for improving horizontal and vertical positional accuracy for different data acquisition scenarios (e.g., GCP density and distribution).Nevertheless, as stated by [55], implementing a GCP network can be impractical in certain situations, for example, glacial terrains and fluvial systems [11] or with BVLOS flights.Similar situations where GCPs are logistically unfeasible or dangerous are volcanic crater mapping, post-fire landscape (before fully cooled), fragile ecosystems (e.g., peatlands), marine studies, glacier/snowpack and animal counts/population studies.
Given a stable platform and the implementation of a gimbal and inertial measurement unit (e.g., attitude) to record the position and orientation of photographs, the positional accuracy of UAS-derived photogrammetry products that do not use GCPs is largely determined by the type, frequency(ies) and number of global navigation satellite system (GNSS) constellations used for navigation purposes and geotagging. Moreover, the accuracy obtained is influenced by the type of data processing option, which ranges from basic onboard position calculation to more advanced post-processing kinematic (PPK) or real-time kinematic (RTK) solutions (Figure 1). Here, we provide a brief description of these options, from low-cost onboard position acquisition to PPK and RTK solutions. Low-cost onboard GNSS systems (Figure 1A), for example, generally use single-frequency receivers (L1 and/or F1), which are sufficient for navigation, but the positioning accuracy remains at the meter level (1-3 m) [56,57]. Although less common, dual-frequency GNSS receivers (L1/L2 and/or F1/F2) provide a faster position lock with higher accuracy and precision [58], but not necessarily enough for applications where cm-level accuracy is required [22]. Post-processing kinematic (Figure 1B) solutions rely on a high-precision, accurate GNSS base station that can be a local receiver or a commercial service provider. The fixed location of the base station is used to compute the geotags after data acquisition. The PPK solution can also use precise clock and ephemeris data in post-processing, providing consistent and repeatable positions with an accuracy of several centimeters [59]. In addition to a precisely located base station, RTK solutions (Figure 1C) also rely on a stable radio, cellular or Wi-Fi link between the base station and the GNSS receiver to geotag the photographs with an accurate location in real time. For example, using a single-frequency RTK module, [60] obtained a horizontal accuracy of 1.5 cm and a vertical accuracy of 2.3 cm.
The purpose of this study is to evaluate the positional and within-model relative accuracies of SfM and SfM-MVS photogrammetry reconstructions of low relief landscapes without the inclusion of GCPs for a range of multirotor VTOL (vertical takeoff and landing) UAS with camera systems from various price points (US$1000 to >US$50,000).We follow the American Society for Photogrammetry and Remote Sensing 2015 [61] definitions of absolute accuracy as "a measure that accounts for all systematic and random errors in a data set"; positional accuracy as "the accuracy of the position of features, including horizontal and vertical positions, with respect to horizontal and vertical datums"; and relative accuracy as "a measure of variation in point-to-point accuracy in a data set".We further refer to positional error as "the difference between data set coordinate values and coordinate values from an independent source of higher accuracy for identical points".
Overall, this study used 14 distinct UASs: one high-cost enterprise UAS manufactured by Aeryon Labs and 13 manufactured by DJI (Dà-Jiāng Innovations). Our decision to focus mainly on these systems stems from four factors: (1) airframe market share, (2) range of performances, (3) flight controller supply market, and (4) access. First, as of 2019, DJI accounted for approximately 76.8% of the market in the USA (based on FAA registrations) [62]. In Canada, in the first quarter of 2020, 30 of 241 UAS models with manufacturer assurance declarations submitted to Transport Canada for advanced operations were DJI systems. An additional 23 UASs on the Transport Canada list were from manufacturers that modified various DJI UASs [63]. Second, the UASs in our study cover a range of grades from consumer (mass market) to professional and enterprise systems, which are accessible to most users. Third, the flight controllers of many custom-built or third-party manufactured systems are purchased from DJI. On the Transport Canada list, a minimum of 15 additional UASs were listed with safety declarations in configurations using DJI flight controllers (e.g., A3, A3 Pro and N3). By evaluating these systems, our study provides the large number of entities (e.g., service providers and individuals) who use commercial off-the-shelf UASs with an assessment of the accuracy of the systems for the scenario we cover in this study. Lastly, the UASs tested here are the systems we had access to over the course of the study.

Study Sites
This study was carried out over a three-year period at three mid-latitude sites in Eastern Canada (shown in Figure 2): (1) a 2.8 ha field of herbaceous vegetation next to the Mer Bleue peatland (MB), near Ottawa, Ontario (Figure 2A); (2) an abandoned 3.7 ha agricultural field on île Grosbois (IGB), near Montreal, Quebec (Figure 2B); (3) a 1.5 ha agricultural field in Rigaud, Quebec (Figure 2C).As a means to introduce checkpoints to validate the horizontal and vertical accuracies of the SfM and SfM-MVS products, 70 cm tall wooden posts were placed in the field at MB.Each post had a 10 cm wide metal plate affixed to the top.The plates were painted matte grey and marked with an "X" in the center using contrasting (black and white) tape.These posts were installed for multi-year use at the site (Figure 2D).At the two other sites, checkpoints consisted of circular plastic orange bucket lids (30.5 cm diameter and 23.5 cm diameter) marked with an "X" in contrasting tape that were placed flat on the ground randomly before each flight (Figure 2E).All three study sites have relatively low topographic variability comprised of the variable herbaceous vegetation height at MB and IGB and the soil separating the furrows in the plowed field (Rigaud).For all flights, the weather was sunny with few clouds and little wind.

UASs and Camera Systems Tested
We tested fourteen UASs ranging in weight from 430 g to 14 kg (Tables 1 and 2). Flights were conducted nominally at 30-45 m above ground level (AGL), with orthogonal flight lines. The Phantom 4 RTK (P4RTK), the Matrice 600 Pro RTK (M600P), and the Matrice 210-RTK (M210-RTK) required an external base station to function in RTK flight mode. Flight line spacing and camera triggering were optimized for each system by the flight controller software while maintaining 80% frontlap and sidelap coverage. Photographs were collected at nadir with the exception of the Mavic Air, for which the maximum angle allowed by the flight controller software was −80°. All onboard cameras were triggered directly by the UAS. Shutter speed, ISO, aperture and exposure compensation were automatically set by the onboard cameras without user intervention. The onboard cameras were also set to autofocus mode.
For the M600P, the digital single-lens reflex (DSLR) camera was mounted on a Ronin MX gimbal (DJI, Shenzhen, China) for stabilization and orientation. Two configurations were evaluated: (1) in PPK mode, geotagging was achieved via an M+ RTK GNSS module (Emlid, St Petersburg, Russia) to record the position and altitude, and a PocketWizard MultiMax II intervalometer (LPA Design, South Burlington, VT, USA) triggered the camera at 2-second intervals; (2) in stand-alone mode, the DSLR was also triggered by the intervalometer, but geotagging was automated with a Canon GP-E2 GPS receiver connected to the DSLR's hot shoe. In all cases, the DSLR was operated in "programmed auto" mode, in which the aperture and shutter speed are automatically set by the camera but the user has control of the ISO and exposure compensation. The ISO was set to 800 with no exposure compensation. The lens was set to autofocus, using all of the available points of the camera's autofocus sensor.
The P4RTK received RTK corrections from the Can-Net CORS network via a virtual reference station (VRS) mountpoint utilizing both GPS and GLONASS constellations.All P4RTK photographs were captured in the fixed status for maximum geolocation accuracy.Three types of geotagging were implemented for the M600P DSLR photographs collected in PPK mode.First, with the "local base" configuration (PPK LB ) a base station dedicated to collecting GNSS data for the photographs was set up.For this, we used an RS+ single-band receiver (Emlid, St Petersburg, Russia).Second, for the local base configuration with the added NTRIP correction (Networked Transport of Radio Technical Commission for Maritime Services (RTCM) via Internet Protocol), PPK LB-NTRIP , the RS+ base station received the incoming corrections from the SmartNet North America NTRIP casting service on an RTCM3-iMAX (individualized master-auxiliary) mount point utilizing both GPS and GLONASS constellations.These corrections were transmitted to the M+ receiver onboard over LoRA (long-range) 915 MHz radio.Lastly, for the PPK configuration with a commercial base station, PPK CB , the SmartNet North America station in Vaudreuil-Dorion (Station QCVD-16 km baseline) was used in post-processing.
All photographs (from all UASs) were acquired as JPEGs except for those from the DSLR, which were acquired in Canon RAW (.CR2) format. These were subsequently converted to large JPEGs in Adobe Lightroom® with minimal compression for analysis.
Table 2. Camera, lens and flight controller software specifications ordered by sensor size (Figure 3). The Canon 5D Mark III was used with a Canon EF 24-70 mm f/2.8L II USM lens set to 24 mm. The X5 and X5S cameras were used with a DJI MFT 15 mm f/1.7 ASPH lens. The SkyRanger R60's Sony DSC-QX30U camera has an HD Zoom 30 lens that was set to 24 mm. FF is a full-frame sensor. The Exmor R sensor differs from the others in that it is a back-illuminated CMOS image sensor (vs. conventional front-side illumination), which increases the amount of light captured. The pixel size is the value reported in the Pix4D camera database. * Based on the sensor size stated by Sony, the calculated pixel size is 1.2 µm, but for this camera Pix4D considers the usable area on the sensor rather than the physical dimension (Pix4D, pers. comm.). Rolling shutter distortion for all CMOS sensors was estimated and mitigated through Pix4D [64].

Photograph Geotagging
The non-RTK/PPK systems automatically geotag the acquired photographs with the horizontal and vertical position of the UAS time-synchronized to the onboard GNSS receiver (and camera attitude) with the exception of the Mavic Air that writes the horizontal position to the Exif data, but for altitude only records the barometer measurement (height above the take off point).Therefore, for the Mavic Air, elevation (HAE) was manually calculated based on the flight altitude and the elevation of the take-off point and added to the Exif.The P4RTK automatically geotagged photographs with the incoming RTK corrections applied.For all UASs, in addition to the geotagged position, the camera roll, pitch and yaw at the time of frame acquisition was also written into the Exif.
For the DSLR photographs with positions determined via PPK CB , PPK LB and PPK LB-NTRIP , the horizontal coordinates and altitude needed to be calculated in post-processing.The open-source RTKLib software [65] was used to calculate the geotag of each DSLR photograph.For the PPK LB-NTRIP configuration, the geotags were calculated with and without precise clock and ephemeris data downloaded from the Natural Resources Canada's Canadian Geodetic Survey.For each of the DSLR PPK configurations, a lever arm correction was also applied.The precise DSLR attitude on the Ronin MX at the time of frame acquisition is not recorded.
For the photographs geotagged with the onboard GNSS receivers (no RTK or PPK correction), the altitude tags were discarded due to the large errors recorded in the Exif data by each system (>10 m in some cases).The altitude tags were manually recalculated from the barometer measurements of altitude above the take off point (m ASL or HAE) added to the known ground elevation.The accuracy of the barometer measurements had previously been assessed to be ±10 cm [22].Because the GP-E2 does not have a barometer the correct altitude was determined from the flight logs and a lever arm correction.For the P4RTK the lever arm correction is automatically taken into account in the positions recorded in the Exif.
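The altitude recalculation described above amounts to simple addition, which can be sketched as follows. This is a minimal illustration, not the authors' script: the function name, the signed `lever_arm_m` term and the numeric values are hypothetical, and the ground elevation and barometric height are assumed to be expressed in the same vertical reference.

```python
def altitude_tag(takeoff_elevation_m, height_above_takeoff_m, lever_arm_m=0.0):
    """Return an absolute altitude for one photograph.

    takeoff_elevation_m: known ground elevation at the take-off point.
    height_above_takeoff_m: barometric height logged by the UAS.
    lever_arm_m: optional signed vertical offset (assumed convention:
        positive raises the camera position, negative lowers it).
    """
    return takeoff_elevation_m + height_above_takeoff_m + lever_arm_m


# Hypothetical values, for illustration only.
tag = altitude_tag(takeoff_elevation_m=70.0, height_above_takeoff_m=30.0)
print(f"altitude written to Exif: {tag:.2f} m")
```

With a stated barometer accuracy of ±10 cm, the recalculated tag inherits at least that uncertainty in addition to the uncertainty of the ground elevation.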

Checkpoint Position Measurement
The targets used as checkpoints were measured after every flight.On August 15, 2019, ten checkpoints were measured using a Trimble Catalyst GNSS/RTK receiver with corrections obtained using the Can-Net VRS network.Only points with a fixed status were considered in the analyses.At MB and Rigaud, the RS+ with incoming corrections from the SmartNet North America NTRIP service was used after all flights (fifteen checkpoints).The accuracy of the RS+ with the incoming NTRIP correction was previously verified in comparison to the location of the Natural Resources Canada High Precision 3D Geodetic Passive Control Network station 95K0003 in Dorval, Quebec.The error in the computed position of the station by the RS+ was determined to be 0.6 cm (X), 2.7 cm (Y) and 5.1 cm (Z).For the Trimble Catalyst, the accuracy was assessed to be <2.5 cm (X and Y) and <3 cm (Z) as reported by the system.The dual-frequency Trimble Catalyst was considerably faster at achieving a fixed position than the RS+ (<10 sec vs up to 15 min).

Structure from Motion-Multiview Stereo (SfM-MVS) Processing
An SfM-MVS workflow (Figure 4) was carried out in Pix4D Mapper (Pix4D S.A, Prilly, Switzerland) to reconstruct the study areas after each flight.The two main products of interest in our study were the sparse 3D point cloud because during its generation the positional accuracy of the model is computed in relation to the checkpoints, and the orthomosaic from which the within-model horizontal distances were computed.For photographs with camera orientation information in the Exif (i.e., all except the DSLR), Pix4D converts these to Omega, Phi and Kappa angles (rotation between the image coordinate system and a projected coordinate system) [66,67].Key components of Pix4D's workflow are the calibration and optimization during which an automatic aerial triangulation, bundle block adjustment, and camera self-calibration steps are carried out (see [68] for details).Pix4D generates the sparse 3D point cloud through a modified scale-invariant feature transform (SIFT) algorithm [69,70].Following the generation of this initial 3D point cloud, an MVS photogrammetry algorithm densifies the point cloud [71] (Figure 4).Subsequently, a raster DSM is created through an inverse distance weighting (IDW) interpolation of the dense 3D point cloud.The DSM includes objects such as trees and buildings as part of the model (as opposed to a digital terrain model (DTM), which represents the bare earth elevation).The DSM and the input photographs are used to create an orthomosaic without perspective distortion.
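To illustrate the interpolation step mentioned above, the following is a minimal sketch of generic inverse distance weighting for a single raster cell. It is not Pix4D's implementation, whose search radius, weighting power and filtering are not described here; the function name, the `power` default and the sample points are assumptions.

```python
import math


def idw(points, x, y, power=2.0):
    """Interpolate an elevation at (x, y) from (px, py, pz) points.

    Each point is weighted by 1 / distance**power; a point that coincides
    with the query location is returned directly to avoid division by zero.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for px, py, pz in points:
        d = math.hypot(px - x, py - y)
        if d == 0.0:
            return pz
        w = 1.0 / d ** power
        weighted_sum += w * pz
        weight_total += w
    return weighted_sum / weight_total


# Hypothetical dense-cloud points (x, y, z) in metres.
cloud = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0), (0.0, 2.0, 12.0)]
print(idw(cloud, 1.0, 0.0))
```

A production DSM would evaluate this for every cell of a grid and restrict the sum to nearby points; the loop over all points here is only for clarity.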

Model Accuracy Assessment
Horizontal (x and y) and vertical (z) positional accuracies were determined for the SfM models from the coordinates of the checkpoints within Pix4D (Figure 4).From the reported values of RMSE x and RMSE y , the horizontal linear RMSE in the radial direction (includes both x-and y-coordinate errors, RMSE r ) and National Standard for Spatial Data Accuracy (NSSDA) horizontal accuracy at a 95% confidence level were computed according to Equations 1 and 2 following [61].Because the vertical error in vegetated terrain (z-component) typically does not follow a normal distribution, the vegetated vertical accuracy (VVA) at the 95th percentile is calculated following [61] (and discussed rather than RMSE z ).
$$\mathrm{RMSE}_r = \sqrt{\mathrm{RMSE}_x^2 + \mathrm{RMSE}_y^2} \quad (1)$$
$$\text{Horizontal accuracy at 95\% confidence level} = 1.7308 \times \mathrm{RMSE}_r \quad (2)$$
where RMSE_x is the horizontal linear RMSE in the easting and RMSE_y is the horizontal linear RMSE in the northing. In the computation of RMSE_x and RMSE_y, the NSSDA assumes that the errors are random errors that follow a normal distribution. We computed the D'Agostino-Pearson omnibus K² test for normality on ∆x and ∆y. This tests against the alternatives of skewed and/or kurtic distributions. Due to the smaller than recommended sample size for checkpoints [61], the significance of RMSE_r and the VVA at the 95th percentile should be treated with caution.
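The horizontal and vertical accuracy measures can be computed from per-checkpoint errors as in the sketch below. The function names and sample errors are hypothetical. The factor 1.7308 is the one stated in the text, and the VVA is assumed here to be the 95th percentile of the absolute vertical errors, with linear interpolation between ranks.

```python
import math


def rmse(errors):
    """Root-mean-square of a list of signed errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def percentile(values, q):
    """q-th percentile (0-100) using linear interpolation between ranks."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def horizontal_accuracy(dx, dy):
    """Return (RMSE_r, horizontal accuracy at the 95% confidence level)."""
    rmse_r = math.sqrt(rmse(dx) ** 2 + rmse(dy) ** 2)
    return rmse_r, 1.7308 * rmse_r


def vva95(dz):
    """95th percentile of the absolute vertical errors."""
    return percentile([abs(e) for e in dz], 95)


# Hypothetical checkpoint errors in metres (model minus field measurement).
dx = [0.03, -0.02, 0.04, -0.01]
dy = [0.02, 0.05, -0.03, 0.01]
dz = [0.10, -0.04, 0.07, -0.12]
rmse_r, acc95 = horizontal_accuracy(dx, dy)
print(f"RMSE_r = {rmse_r:.3f} m, 95% horizontal = {acc95:.3f} m, VVA = {vva95(dz):.3f} m")
```

With only a handful of checkpoints the 95th percentile is dominated by the largest error, which is one reason the text advises caution with small samples.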
The within-model horizontal accuracy was determined by comparing the distances between all pairs of checkpoints from the orthomosaic and from the field measured coordinates.The locations of the targets in the orthomosaics were manually located, and the coordinates extracted with ArcMap 10.7.For the distance calculations between pairs of checkpoints, we took into consideration standard error propagation of the uncertainty in the coordinates of the checkpoints as determined on the ground, as well as user uncertainty locating the exact center of checkpoints in the orthomosaics.User uncertainty in digitization was estimated at a maximum of 2 pixels in x and y.For the uncertainties in the ground GPS measurements, the values described in Section 2.3 were used.The error propagation to determine the uncertainty in the distance measurements was done by calculating the partial derivatives of the distance between two points with respect to both the x and y coordinates, multiplication with the uncertainty of those two variables, and addition of those terms in quadrature (Equations 3 and 4).
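$$D = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2} \quad (3)$$
$$\delta D = \sqrt{\left(\frac{\partial D}{\partial x_i}\,\delta x_i\right)^2 + \left(\frac{\partial D}{\partial x_j}\,\delta x_j\right)^2 + \left(\frac{\partial D}{\partial y_i}\,\delta y_i\right)^2 + \left(\frac{\partial D}{\partial y_j}\,\delta y_j\right)^2} \quad (4)$$
where D is the distance between the two checkpoints, located at (x_i, y_i) and (x_j, y_j), δx and δy are the uncertainties in the corresponding coordinates, and δD is the uncertainty in the distance calculation.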
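The within-model comparison and its error propagation can be sketched as follows, assuming the distance is the planar Euclidean distance and that each coordinate carries an independent uncertainty. The function names, checkpoint labels and values are hypothetical, and combining ground-survey and digitization uncertainty into a single per-coordinate value is left to the caller.

```python
import math
from itertools import combinations


def distance_with_uncertainty(p, q, sp, sq):
    """Distance between points p and q with propagated uncertainty.

    p, q: (x, y) coordinates; sp, sq: (sigma_x, sigma_y) for each point.
    The partial derivatives of D are +/-(dx / D) and +/-(dy / D), so the
    squared terms are summed in quadrature.
    """
    dx, dy = p[0] - q[0], p[1] - q[1]
    d = math.hypot(dx, dy)
    if d == 0.0:
        raise ValueError("points coincide; uncertainty is undefined")
    var = (dx / d) ** 2 * (sp[0] ** 2 + sq[0] ** 2) \
        + (dy / d) ** 2 * (sp[1] ** 2 + sq[1] ** 2)
    return d, math.sqrt(var)


def pairwise_distance_errors(model, ground):
    """Model-minus-ground distance for every pair of named checkpoints."""
    errors = {}
    for a, b in combinations(sorted(model), 2):
        d_model = math.hypot(model[a][0] - model[b][0], model[a][1] - model[b][1])
        d_ground = math.hypot(ground[a][0] - ground[b][0], ground[a][1] - ground[b][1])
        errors[(a, b)] = d_model - d_ground
    return errors


# Hypothetical checkpoints (metres) and a 2 cm uncertainty per coordinate.
ground = {"A": (0.0, 0.0), "B": (30.0, 40.0)}
model = {"A": (0.02, -0.01), "B": (30.05, 40.03)}
d, u = distance_with_uncertainty(ground["A"], ground["B"], (0.02, 0.02), (0.02, 0.02))
print(f"D = {d:.2f} m, uncertainty = {u:.3f} m")
print(pairwise_distance_errors(model, ground))
```

A distance error smaller than its propagated uncertainty cannot be distinguished from measurement noise, which is the comparison reported for each system in the results.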

Camera Focal Length Considerations for the Phantom 4 RTK
For the P4RTK, we initially followed Pix4D's recommended workflow [72], which included applying the built-in optimized camera parameters.For greatest accuracy, all photograph and checkpoint coordinates were first converted to UTM (zone 18N) coordinates and ellipsoidal heights with NAD83(CSRS) 2010 epoch as the reference frame, using Natural Resources Canada's TRX software [73].The same reference frame was used for the outputs.The TRX software also allows the user to set the GPS epoch correctly (i.e., provinces adopted different epochs, see [74]).
Initial results for the P4RTK using this recommended workflow yielded low RMSE x, RMSE y , RMSE r (2-4 cm), but an 18 cm RMSE z (Table 3, Figure 5).Setting the initial camera parameters from "All" to "All prior", as recommended by Pix4D, actually made results worse (Table 3, second row).The "All prior" setting forces the optimal internal parameters (focal length, coordinates of principal point, radial distortion parameters and tangential distortion parameters), to be close to the initial values from the camera database.In contrast, "All" optimizes these parameters starting from the initial values in the database with subsequent recalculation.The values are recalculated based on the calibrated photographs in the dataset [68].We manually decreased the focal length parameter in the camera description by 0.01 mm increments, keeping the "All prior" setting to fix the new focal length.The lowest RMSE z (1.4 cm; Table 3, Figure 5) was obtained by reducing the default focal length from 8.58 mm to 8.53 mm, after which the vertical accuracy gradually worsened (Table 3, Figure 5).Horizontal accuracy (RMSE x, RMSE y and RMSE r ) was largely unaffected by the focal length.We used the optimized camera parameters from the best model for all other analyses, using the "All prior" initial camera parameter calibration setting.

Results
Summary statistics comprised of the RMSE (x,y,z,r) and mean absolute error (MAE x,y,z ) are shown in Figures 6-8 to illustrate ∆x, ∆y and ∆z between the SfM model derived checkpoint locations and those measured in situ.Both MAE and RMSE measure the average magnitude of the positional errors, however, the RMSE is more sensitive to large errors (outliers).For this reason, and because RMSE z is not expected to follow a normal distribution in vegetated terrain, the VVA at the 95th percentile (Figure 8) is examined to quantify the positional accuracy in elevation.As expected, the UAS with RTK or PPK geotagged photographs produced models with the lowest positional errors (RMSE, MAE and VVA).These were also the most accurate systems for the within-model measurements of distances between checkpoints (Figure 9).The P4RTK, a system specifically designed for enterprise SfM-MVS photogrammetry, includes an incoming NTRIP correction to its base station and directly applies RTK corrections to the geotags of the photographs.Both the positional error (RMSE r : 4 cm, MAE z : 1 cm, VVA: 3 cm) and within-model error (µ = 0.8 ± 2 cm) of this system are analogous to the low errors achieved by various UASs as reported in the literature with GCPs included in the processing (see summary by [59]) (Figure 10A-C).It is also the system with the highest percentage (84%) of within-model distance calculation errors that were less than the uncertainty of the measurements (Figure 9).In order to achieve the high vertical accuracy, an important consideration for the P4RTK is the calculation of the specific focal length of the lens unique to the system being used (Table 3, Figure 5).The generalized camera model focal length within Pix4D (8.58 mm) resulted in unreasonably high vertical errors (18-25 cm) for our particular camera.It is possible that other users with camera focal lengths closer to the default Pix4D values could get high vertical accuracies out of the box.
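The different sensitivity of MAE and RMSE to large errors can be seen with a small numerical sketch. The error values below are hypothetical and are not taken from the study.

```python
import math


def mae(errors):
    """Mean absolute error."""
    return sum(abs(e) for e in errors) / len(errors)


def rmse(errors):
    """Root-mean-square error."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Ten checkpoints with identical 5 cm errors, then one replaced by a 50 cm outlier.
uniform = [0.05] * 10
with_outlier = [0.05] * 9 + [0.50]

for name, errs in (("uniform", uniform), ("with outlier", with_outlier)):
    print(f"{name}: MAE = {mae(errs):.3f} m, RMSE = {rmse(errs):.3f} m")
```

When all errors have the same magnitude the two statistics coincide; a single large error raises the RMSE more than the MAE because errors are squared before averaging.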
The UAS with the second-lowest positional and within-model errors (M600P + PPKLB-NTRIP) overcomes the lack of onboard generated geotags through third-party hardware and software resulting in a 7 cm RMSEr, 3 cm MAEz, 10 cm VVA and low within model error (µ = 3 +/− 4 cm) (Figures 6-9).While this system had a low percentage of within model distance errors less than the uncertainty of the measurements (9%), the errors were close to the uncertainty estimates (Figure 9).The same system performed slightly worse (higher error) with PPKCB and PPKLB (Figures 6-8).Negligible differences were seen by using the fast versus precise clock and ephemeris data (PPKLB-NTRIP).
The remaining systems relying on onboard GNSS geotagging resulted in positional errors ranging from an average RMSE_r of 0.60 m (M600P + X5S) to >3 m (SkyRanger, Mavic 2 Pro and Phantom 4 Pro) (Figures 6-8). It is important to note that for the three UASs with RMSE_r > 3 m, there is a substantial difference in RMSE_x vs RMSE_y (and MAE_x versus MAE_y) (Figures 6 and 7), resulting in the larger values of RMSE_r. For operational use of these UASs without GCPs, the results suggest that further investigation would be warranted to determine, if possible, the reason for the larger error in one direction. Overall, the non-RTK/PPK systems were consistent in the within-model horizontal distance measurement errors (µ = 0.21-0.26 m) except for the SkyRanger (µ = 0.39 +/− 0.28 m) and Mavic Air (µ = 1.2 +/− 0.48 m) (Figure 9). All non-RTK/PPK systems have a lower within-model distance error compared to the positional error (RMSE_x,y,r or MAE_x,y) (Figures 6-9).
The distributions of the within-model horizontal measurement errors presented as violin plots (Figure 9) are important to consider because they provide an indication of the homogeneity of the spatial errors throughout the SfM-MVS orthomosaics. For non-RTK/PPK systems, the broad range of within-model error values indicates that errors are spatially inconsistent. The greatest range can be seen for the GP-E2 (0.45-3.9 m, µ = 1.67 m). The original geotags of this GNSS module recorded the largest vertical errors and erratic horizontal positioning (Figure 11A,C). Following the replacement of the original altitude tags in the Exif by the altitude (AGL) recorded in the flight logs with a lever arm correction applied (Figure 11B), the SfM model still resulted in within-model errors with an inconsistent and variable (up to ~90°) orientation (Figure 11D). On the western side of the model, the displacements between known checkpoints and those in the orthomosaic are mainly E-W oriented, while on the eastern side they are predominantly N-S. The large discrepancies in the original altitude tags (Figure 11C), as well as in horizontal position, indicate that the GP-E2 is inaccurate for SfM or SfM-MVS reconstructions. It has been previously shown that the M600P computes a GNSS altitude of +/− 1 m during flight. Furthermore, the altitude computed from each of the three A3 Pro modules can vary up to ~2 m between modules [22]. However, with RTK enabled in the flight controller (as was done here), the altitude difference recorded between the three modules is reduced to <1 cm and overall the altitude varies by 5-10 cm during flight [22]. This can also be seen in Figure 11A,B in the position of the optimized photograph locations in comparison to the original geotags.
The other UAS for which processing needed to be modified was the Mavic 2 Pro with the integrated Hasselblad L1D-2C camera. The standard processing pipeline, which allows Pix4D to optimize the camera internal parameters and recalculate/optimize the position and orientation of the photographs, resulted in a domed SfM point cloud (radial distortion) (Figure 12A). By setting the initial camera parameters from "All" to "All prior" and selecting the "Accurate Geolocation and Orientation" option, the distortion was removed (Figure 12B). The "All prior" option alone did not remove the deformation. While this alternate pipeline is generally recommended for RTK/PPK solutions that also have accurate IMU information (≤3°), it can improve SfM products from other systems as well, as seen here.
The lower positional and within-model accuracy for the Mavic Air (Figures 6-9, 13) was expected because it is a consumer-grade system that was not developed for photogrammetry purposes or for the precise flight controls required of professional or enterprise systems. The high positional error (RMSE and MAE) and low within-model accuracy of the SkyRanger were unexpected given that it is an enterprise UAS (Figures 6-9, 10d-f, 13). In general, we found that the cost of the system is only weakly related to the accuracy of the models generated by the different systems (Figure 13). The two most accurate systems (P4RTK and M600P + PPK_LB-NTRIP) fall into the second-highest cost category (US$5000-$15,000), but the most expensive system tested (SkyRanger, >US$100,000) has both low positional accuracy (high RMSE_r and MAE) and high horizontal within-model measurement error (µ = 39 +/− 28 cm). The majority of the non-RTK/PPK systems, which ranged in price at the time of purchase from US$2000-$15,000, perform similarly in terms of both positional accuracy and within-model accuracy.
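The positional error metrics reported in this section can be illustrated with a minimal sketch. It assumes the standard NSSDA definition of the radial error, RMSE_r = sqrt(RMSE_x² + RMSE_y²), and checkpoint coordinates given as (x, y) pairs in metres; the function name `positional_errors` and the sample coordinates are hypothetical and are not taken from the study data.

```python
import math

def positional_errors(surveyed, measured):
    """Return RMSE_x, RMSE_y, RMSE_r, MAE_x and MAE_y in the units of the inputs.

    surveyed: (x, y) checkpoint coordinates measured in situ.
    measured: (x, y) coordinates of the same checkpoints in the orthomosaic.
    """
    if len(surveyed) != len(measured) or not surveyed:
        raise ValueError("need two equal-length, non-empty coordinate lists")
    dx = [m[0] - s[0] for s, m in zip(surveyed, measured)]
    dy = [m[1] - s[1] for s, m in zip(surveyed, measured)]
    n = len(dx)
    rmse_x = math.sqrt(sum(d * d for d in dx) / n)
    rmse_y = math.sqrt(sum(d * d for d in dy) / n)
    return {
        "rmse_x": rmse_x,
        "rmse_y": rmse_y,
        # Radial error (assumed NSSDA definition).
        "rmse_r": math.sqrt(rmse_x ** 2 + rmse_y ** 2),
        "mae_x": sum(abs(d) for d in dx) / n,
        "mae_y": sum(abs(d) for d in dy) / n,
    }

# Hypothetical checkpoints with a 3 m offset in x and only 0.4 m in y.
errors = positional_errors([(0.0, 0.0), (10.0, 0.0)], [(3.0, 0.4), (13.0, -0.4)])
```

In this made-up case the x component alone pushes RMSE_r above 3 m, which mirrors the situation described for the three UASs whose RMSE_x and RMSE_y differ substantially.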
Based on the 2015 American Society for Photogrammetry and Remote Sensing (ASPRS) positional accuracy standards for digital geospatial data [61], only the P4RTK and the M600P + PPK_LB-NTRIP SfM-MVS products could be used without GCPs for projects requiring high spatial accuracy (Figure 14). The accuracy requirement of SfM or SfM-MVS products (RMSE_AT) to be used for elevation data and/or planimetric data (orthomosaic) production or analysis is calculated as (Equation (5)) [61]:
$$\mathrm{RMSE}_{AT} = \frac{1}{2}\,\mathrm{RMSE}_{Map,DEM} \qquad (5)$$
where RMSE_AT is the RMSE_x,y,z the SfM-MVS product must meet, and RMSE_Map,DEM is the project accuracy requirement. For example, a forestry inventory requirement of RMSE_Map of 2 m would require an SfM-MVS orthoimage with an RMSE_AT of no larger than 1 m. In Figure 14, RMSE_AT is expressed as the finest RMSE_Map,DEM for which the UAS products generated here could be used. Figure 14 also illustrates that without GCPs, six UASs could be used to support manned aircraft data products (e.g., airborne hyperspectral imagery) or high spatial resolution satellite data products (e.g., Planet Dove, Pleiades). The remaining six UASs with the largest RMSE_AT could be used to support projects with moderate resolution satellite data products (e.g., Sentinel-2, Landsat).
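The forestry example above (a 2 m project requirement calls for an RMSE_AT of no more than 1 m) implies a factor of one half between the two quantities. The sketch below applies that factor in both directions. The function names are hypothetical, and the use of the larger of RMSE_x and RMSE_y follows the caption of Figure 14; the sample values are illustrative only.

```python
def required_rmse_at(project_rmse_m):
    """RMSE_AT (m) a product must meet for a given project requirement (m)."""
    return 0.5 * project_rmse_m

def finest_project_rmse(rmse_x_m, rmse_y_m):
    """Finest project requirement (m) a product can support.

    Uses the larger of RMSE_x and RMSE_y as RMSE_AT and inverts the
    one-half relationship.
    """
    return 2.0 * max(rmse_x_m, rmse_y_m)

# The forestry inventory example from the text: a 2 m requirement.
forestry_rmse_at = required_rmse_at(2.0)

# Hypothetical product with RMSE_x = 0.03 m and RMSE_y = 0.05 m.
finest = finest_project_rmse(0.03, 0.05)
```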

Discussion
We found that eight of the fourteen UASs tested can achieve relatively high positional (RMSE_r < 2 m) and within-model accuracies (<0.5 m) for SfM and SfM-MVS models without GCPs. The clearest determinant of horizontal and vertical accuracy was whether the UAS photographs were tagged with a PPK/RTK solution or not, regardless of the flight controller's use of RTK for navigation. Similar to other studies (e.g., [75,76]), a PPK/RTK GNSS solution resulted in low positional errors without GCPs. Depending on the purpose of the data collection (e.g., animal counts), users may not need high positional accuracy relative to real-world coordinates, and therefore, the within-model measurement error may be more important. Twelve of the UASs had average within-model errors of <0.4 m, four systems (the M600P PPK configurations and the P4RTK) each had an average error <3 cm, and one (P4RTK) had an average linear within-model error of 0.8 cm with a range of 0-6 cm. In this case, 0 refers to errors less than the uncertainty of the measurements.
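The distinction between positional and within-model error can be made concrete with a small sketch. It compares the distances between all pairs of checkpoints in the model with the same distances measured in situ, and sets deviations smaller than the measurement uncertainty to 0. As a simplifying assumption, a single uncertainty value is used for every pair, whereas the study propagates the uncertainty of each measurement; the function name and coordinates are hypothetical.

```python
import math
from itertools import combinations

def within_model_distance_errors(surveyed, measured, uncertainty_m):
    """Absolute pairwise distance deviations between model and in situ data.

    surveyed, measured: equal-length lists of (x, y) checkpoints in metres.
    uncertainty_m: deviations below this value are reported as 0.0
                   (assumed constant for all pairs in this sketch).
    """
    errors = []
    for i, j in combinations(range(len(surveyed)), 2):
        d_true = math.dist(surveyed[i], surveyed[j])
        d_model = math.dist(measured[i], measured[j])
        deviation = abs(d_model - d_true)
        errors.append(0.0 if deviation < uncertainty_m else deviation)
    return errors

surveyed = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]

# A model shifted uniformly by (2, -1) m: large positional error,
# but every within-model distance is preserved.
shifted = [(x + 2.0, y - 1.0) for x, y in surveyed]
shift_errors = within_model_distance_errors(surveyed, shifted, 0.01)

# A model in which only the third checkpoint is displaced.
distorted = [(0.0, 0.0), (3.0, 4.0), (6.6, 8.8)]
distortion_errors = within_model_distance_errors(surveyed, distorted, 0.05)
```

The uniformly shifted model returns only zeros, which is why a product with poor positional accuracy may still be adequate when only relative measurements are needed.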
As this work shows, in order to achieve the low vertical positional error with the P4RTK, the user must determine the camera-specific focal length (one of the leading internal camera parameters), a calculation that is relatively easy to do. This is likely due to minute differences in lens element distortions and other internal camera parameters between individual units. As such, it is likely that the out-of-the-box generalized focal length will not result in the advertised survey-grade accuracy for all units. The P4RTK is also the only UAS we tested where the integrated camera tagged the photographs with a dual-frequency GNSS position calculation using both GPS and GLONASS. Additional frequencies provide for better signal reception in close proximity to obstacles such as trees and buildings and reduce ionospheric error in the position calculation. Generally, dual-frequency systems also achieve a "fixed" GNSS solution considerably faster than single-band systems and are more accurate. While not tested here, it is also likely that the D-RTK2 base station of the P4RTK (which can utilize GPS L1, L2, L5, GLONASS F1 and F2, Galileo E1, E5A and E5B, and BEIDOU B1, B2 and B3) can achieve a more precise position calculation, even without an incoming correction, than the earlier generation D-RTK base station of the M600P and M210-RTK (GPS L1 and L2, GLONASS F1 and F2). However, an incoming correction is important to achieve the highest positional accuracy. Contrary to expectations, the model from the Inspire 1, which uses a single-frequency GNSS receiver that only receives GPS L1 with no GLONASS support, had lower positional and within-model errors than six other UASs that also support GLONASS F1 (Table 1, Figure 13), indicating that under the right conditions (i.e., good GNSS geometry, no obstacles and a low planetary K index) it is possible to achieve acceptable results using a single GNSS constellation.
Not all "RTK" systems geotag the photographs with the RTK corrected coordinates (e.g., M210-RTK, M600P + X5). In these systems, RTK is only used for accurate navigation. Newer systems such as the M210-RTK v2 (not tested here) and the P4RTK include the RTK corrections in the geotags. As such, users need to be aware of the characteristics of the systems they purchase. In order to accurately geotag the photographs from the M600P with the DSLR camera, third-party hardware and software needed to be incorporated into the setup with a PPK workflow. While this did result in high accuracy, these configurations (PPK_LB, PPK_CB, PPK_LB-NTRIP) are considerably more complicated to operate, with more potential points of failure (i.e., hardware from multiple manufacturers and human error in setup/operation), than integrated systems such as the P4RTK. These DSLR configurations also require precise lever arm measurements to correct for the distance from the GNSS antenna to the film plane (detector array) of the camera when the geotags are calculated in post-processing. For example, for the DSLR mounted on the Ronin MX gimbal on the M600P, the vertical distance between the GNSS antenna of the M+ and the DSLR's film plane was −50.2 cm. These measurements should be taken every time the DSLR is installed and balanced on the gimbal. Novel SfM-MVS object reconstructions of the airframe, as shown by [77], allow for digital preservation of the system and precise measurements post-flight.
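A lever arm correction of this kind can be sketched as follows. This is an illustration under stated assumptions, not the procedure implemented in the PPK software used here: the airframe is treated as level (roll and pitch are ignored), the heading is measured clockwise from north, the offset is expressed as (forward, right, up) in metres, and a negative vertical value is taken to mean that the film plane sits below the antenna. The function name `apply_lever_arm` and the antenna coordinates are hypothetical.

```python
import math

def apply_lever_arm(antenna_enu, heading_deg, offset_fru):
    """Shift a GNSS antenna position to the camera film plane.

    antenna_enu: (east, north, up) of the antenna in metres.
    heading_deg: airframe heading in degrees, clockwise from north.
    offset_fru:  (forward, right, up) offset from the antenna to the
                 film plane in metres, measured on the airframe.
    Assumes a level airframe, so only the heading rotates the offset.
    """
    east, north, up = antenna_enu
    forward, right, d_up = offset_fru
    h = math.radians(heading_deg)
    cam_east = east + forward * math.sin(h) + right * math.cos(h)
    cam_north = north + forward * math.cos(h) - right * math.sin(h)
    return (cam_east, cam_north, up + d_up)

# Vertical offset of -50.2 cm as reported for the DSLR on the Ronin MX gimbal;
# the antenna coordinates are made up for illustration.
camera_position = apply_lever_arm((100.0, 200.0, 150.0), 0.0, (0.0, 0.0, -0.502))
```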
A common aspect of all non-RTK/PPK systems was that the altitude recorded in the Exif is of very low accuracy and should not be used for SfM-MVS if GCPs are not included in the processing pipeline. We recommend that users replace these values with ones they calculate themselves from the barometer value added to the ground elevation (m AGL or m HAE). Until more accurate GNSS altitudes are possible from small non-RTK/PPK UASs, the original values recorded in the Exif (errors in altitude of >10 m) are unreliable (e.g., Figure 11A). Furthermore, manufacturer documentation related to the coordinate systems (horizontal and vertical) generally lacks critical details, especially for the vertical component, which would allow for more precise transformations between datasets. In the case of RTK/PPK systems, the coordinate systems are readily determined because they correspond to those of the base station, and therefore, precise transformations can be carried out.
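The recommended altitude replacement amounts to a simple sum, sketched below. The record layout (the keys `baro_height_m` and `altitude_m`), the function name and all values are hypothetical; reading the barometric height from the flight logs and writing the result back to the Exif are left to whatever tools the user already relies on.

```python
def corrected_altitudes(photos, ground_elevation_m, vertical_lever_arm_m=0.0):
    """Return copies of the photo records with a recomputed altitude.

    photos: list of dicts holding a barometric height above the takeoff
            point under the (hypothetical) key "baro_height_m".
    ground_elevation_m: elevation of the takeoff point in the vertical
            reference chosen by the user.
    vertical_lever_arm_m: optional vertical offset to the camera.
    """
    corrected = []
    for photo in photos:
        altitude = ground_elevation_m + photo["baro_height_m"] + vertical_lever_arm_m
        # Copy the record so the original Exif-derived values are kept.
        corrected.append({**photo, "altitude_m": altitude})
    return corrected

photos = [{"name": "IMG_0001.jpg", "exif_altitude_m": 75.3, "baro_height_m": 45.0}]
updated = corrected_altitudes(photos, ground_elevation_m=68.5)
```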
The relatively high errors of the Mavic 2 Pro (compared to the Mavic Pro) and the SkyRanger were unexpected. Despite having a superior camera in comparison to its predecessors, the Mavic 2 Pro is the system with the third-largest RMSE_r (Figure 8), and also the one whose SfM sparse point cloud was deformed unless the processing steps were modified (Figure 12). In contrast to its predecessors (Mavic Pro and Mavic Air), this system integrates a 1 inch L1D-2C camera. As this relatively new UAS was not designed specifically for mapping, it may take additional firmware upgrades to improve the positional information in the Exif of the photographs and the characterization of the camera internal and external parameters. The domed output is indicative of incorrect camera model parameters [78]. Changing the calibration to "All prior" with "Accurate Geolocation and Orientation" removes the deformation error by not attempting to recalculate camera model characteristics from the photographs. In landscapes such as our study areas that are topographically flat relative to the flight altitude, optimization can introduce errors where the distance to the surface is wrongly calculated. Ideally, there should be a low correlation between the internal camera parameters. However, some correlation is unavoidable in flat terrain [79]. Correlation among the leading parameters (i.e., focal length (F) and the x, y coordinates of the principal point) results in errors in the SfM reconstruction. In the case of the Mavic 2 Pro, high correlations were seen in the reconstruction with the domed output (Figure 15).
A potential source of error not addressed here but requiring further study is the impact on SfM and SfM-MVS products of the radiometric degradation from lossy compressed files with low bit depth (i.e., jpg) rather than lossless TIFs generated from RAW files captured by the sensor. Of the cameras tested, only the DSLR was capable of collecting photographs in RAW while mapping; the others all save to jpg. The onboard processors of the UASs lack the write speed to save RAW images at the rate they are taken for photogrammetry. It is well known that manufacturers (hardware and/or software) implement proprietary jpg engines, which apply varying degrees of processing and compression; therefore, each set of photographs from the UASs underwent a different jpg generation pipeline within the cameras, or within Adobe Lightroom® for the DSLR. A jpg with 8 bits can represent a maximum of 256 digital numbers (DN) per color channel. A 14-bit sensor such as that used by the DSLR can represent 16,384 DN per channel when the file is saved in RAW (or exported as a lossless tiff). The total theoretical color depth of an 8-bit photograph is 16,777,216 colors in comparison to 4.398 × 10^12 for a 14-bit RAW DSLR photograph. In one example, exporting the M600P + PPK_LB-NTRIP photographs with twice the jpg compression (50% jpg quality versus 100%) resulted in lower positional accuracy of the SfM model: an RMSE_x increase of 1 cm, an RMSE_y increase of 2 cm and an RMSE_z increase of 2.9 cm. A similar decrease in accuracy was found by [80] in a comparison between SfM reconstructions from RAW photographs versus jpg. A comparison of the number of pixels whose DN differed between the 100% and 50% jpg quality indicated that only 55% of the pixels retained their DN with the greater compression. The remainder changed by up to 52 DN. The effects of compression on a photograph are scene dependent, and therefore these values are simply provided as an example that the well-known degradation from lossy compression (e.g., jpg) does matter for SfM and should be minimized when possible.
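The bit depth figures quoted above follow directly from powers of two, and the pixel comparison reduces to counting unchanged digital numbers. The sketch below reproduces the arithmetic and shows one way such a comparison could be set up on flat lists of DN values; the helper names and the four sample values are hypothetical and do not come from the photographs analyzed here.

```python
def levels_per_channel(bits):
    """Number of digital numbers (DN) one channel can represent."""
    return 2 ** bits

def total_colors(bits, channels=3):
    """Theoretical number of colors for the given bit depth per channel."""
    return levels_per_channel(bits) ** channels

def fraction_unchanged(dn_a, dn_b):
    """Fraction of pixels whose DN is identical in two versions of an image."""
    if len(dn_a) != len(dn_b) or not dn_a:
        raise ValueError("need two equal-length, non-empty DN sequences")
    same = sum(1 for a, b in zip(dn_a, dn_b) if a == b)
    return same / len(dn_a)

def max_dn_change(dn_a, dn_b):
    """Largest absolute DN difference between two versions of an image."""
    return max(abs(a - b) for a, b in zip(dn_a, dn_b))

jpg_colors = total_colors(8)    # 8-bit jpg
raw_colors = total_colors(14)   # 14-bit RAW

# Made-up DN values standing in for one channel at 100% and 50% quality.
quality_100 = [10, 20, 30, 40]
quality_50 = [10, 25, 30, 92]
unchanged = fraction_unchanged(quality_100, quality_50)
largest_change = max_dn_change(quality_100, quality_50)
```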
Acquiring photographs in RAW format requires computational speed to ensure the files are written to the media at a rate faster than they are taken by the camera. The write speed of the files is determined by both the onboard processor of the camera and the type of media used. Because the Inspire 2 was designed for cinematography (internal CineCore processor capable of 6K RAW video recording) and has the option to write directly to a high-speed SSD instead of a micro SD card, it is plausible that it could in the future, with changes to its firmware, be used to collect photographs in RAW for mapping purposes. RAW files also allow more flexible adjustments to produce the uniform (in color, saturation and exposure) photographs needed for improving the overall quality and visual appeal of SfM-MVS models. Potentially, the new lossy compressed format, HEIF, which supports a higher bit depth than jpg while retaining a small file size, may be a suitable compromise between image quality and write speed limitations.
Additional aspects requiring further study include an investigation into the impact of the pixel size, sensor modulation transfer function (MTF) and signal-to-noise ratio on the accuracy of SfM-MVS products. While we found a correlation of r = −0.55 between pixel size (Table 2) and RMSE_r (Figure 8), it was not significant (α = 0.05). However, we believe that for non-RTK/PPK systems the sensor size (and, independently, pixel size) is more strongly related to the grade of the system (e.g., consumer vs professional or enterprise). In addition to a smaller sensor with smaller pixels, a consumer-grade UAS such as the Mavic Air or Mavic Pro also has lower-accuracy GNSS modules and/or algorithms for the computation of its position. In terms of the sensor and pixel sizes, the SkyRanger is an outlier because it uses a back-illuminated sensor. In contrast to conventional front-illuminated sensors, the wiring, which reduces the number of photons being recorded, is placed under the photodiode substrate, allowing for greater sensitivity and higher resolution (more and smaller pixels) on smaller sensors. Ref. [81] found that for small sensors, the light sensitivity of pixels with less than 3.2 µm pitch decreases with further pixel size reduction. In an examination of the tradeoffs between spatial resolution (i.e., more and smaller pixels) and noise, they determined, based on the signal-to-noise ratio and MTF of individual pixels, that the theoretical maximum image information capacity is reached at a pixel size of 1.45 µm. Real-world testing of sensors from different manufacturers revealed, however, that variations in quality between manufacturers were greater than the effect of differences in pixel pitch. With the exception of the SkyRanger's camera (pixel size of 0.99 µm), we found that all UASs tested here with a small sensor (1/2.3") have a pixel size close to the theoretical maximum image information capacity: Mavic Air = 1.50 µm, Mavic Pro = 1.57 µm, X3 = 1.58 µm (Table 2). All UAS cameras tested here with a 1" sensor have a pixel size of 2.35 µm. Only the M4/3 and full-frame sensors surpass 3.2 µm in pixel size. It is further important to remember that despite generic characterizations of sensors in terms of megapixels or image size, due to the Bayer color filter pattern used by the majority of sensors in photographic cameras (including all cameras tested here), the capture ratio and native resolution of the green channel are twice those of the red and blue channels [80].
Also, rather than focusing on smaller/more compact sensors, the latest models of mirrorless cameras and larger flange diameter lenses may improve individual photo quality due to their higher light sensitivity, increased sharpness and larger image size. It is, however, uncertain how much the increase in overall sharpness and dynamic range may improve the accuracy and overall details of the SfM-MVS products. Early issues with short battery life seem to have been fixed in the most recent models. Medium format and high megapixel 35 mm format cameras may not substantially improve the SfM-MVS model accuracy, due to oversampling at low altitude compared to the accuracy of onboard GNSS or even RTK/PPK solutions. These systems would likely be of greater benefit for higher altitude flight (e.g., >150 m), but this also increases atmospheric effects (e.g., haze).
Importantly, this study was conducted at vegetated sites with low topographic relief. Further analysis is warranted over sites with highly variable terrain and a range of materials, natural and manmade (e.g., monuments and buildings), as well as aquatic systems, to fully characterize the systems. Comparison with georeferenced terrestrial laser scanning (TLS) products of these more complex landscapes would further allow for quantifying the accuracy of geometries. We anticipate differences in RMSE_r and non-vegetated accuracy (NVA for impermeable surfaces) in comparison to our results.
All our SfM reconstructions were carried out with the same software. As has been shown by [25,82,83], results can vary based on the software due to differences in the processing algorithms. Nevertheless, we expect the general pattern of accuracy ranges for the various UASs to be consistent across software implementations even if the absolute values for the accuracies may differ.
Lastly, for the use of RTK/PPK UASs in remote areas, the impact of calculating the base station position from a precise point positioning (PPP) solution should be investigated. Given the generally stated PPP accuracies of 10-30 cm, the SfM-MVS products would achieve slightly lower accuracies under a best-case scenario.

Conclusions
Because the use of GCPs is not feasible for many UAS 3D landscape reconstruction applications, our study assessed the horizontal and vertical accuracies (positional and within-model) of SfM-MVS reconstructions based on a series of VTOL UASs ranging from low- to high-cost (e.g., consumer to enterprise), without the use of GCPs. When selecting a UAS for a specific project and objective(s), it is important to recognize that price is not necessarily related to better data quality (i.e., higher accuracy), as shown by our results. Overall, our results indicate that, based on the accuracy obtained from the 14 UASs tested, four main groups can be defined. Very high accuracy (<5 cm) is obtained with systems using RTK or PPK_LB-NTRIP solutions, which are suitable for projects requiring very low MAE/RMSE and repeatability (e.g., 4D Earth surface monitoring, traffic accident reconstruction). High accuracies (greater than 5 cm but less than 15 cm) were obtained with PPK_CB (11 cm) and PPK_LB (local base L1 system, 10 cm) for enterprise systems, which can be implemented for herbaceous vegetation mapping, for example. Our third group encompasses mainly professional and enterprise systems with errors of 0.15-1 m, suitable for comparisons with manned aircraft products. Our last category contains all consumer UASs as well as two enterprise systems, producing moderate errors >1 m, which might be suitable for the validation of medium to high-resolution satellite products (e.g., Landsat, Sentinel-2) or projects where positional accuracy is less important (e.g., animal counts). As expected, our results indicate that camera sensor type is only a secondarily important consideration. Overall, we conclude that with the diversification of UAS systems and services, careful attention should be given when selecting a UAS or using a UAS service provider in order to ensure that users receive and work with data whose characteristics and limitations they understand and that are best suited to their application.
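The four accuracy groups can be written as a simple lookup, sketched below. The thresholds are those stated above; the label "intermediate" for the third group, the handling of values that fall exactly on a boundary, and the choice of which error metric to pass in are assumptions made for illustration, since the text does not fix them.

```python
def accuracy_group(error_m):
    """Assign an error value (in metres) to one of the four groups."""
    if error_m < 0:
        raise ValueError("error must be non-negative")
    if error_m < 0.05:
        return "very high"      # RTK or PPK with an NTRIP-corrected local base
    if error_m < 0.15:
        return "high"           # PPK with a commercial or local L1 base
    if error_m <= 1.0:
        return "intermediate"   # hypothetical label for the 0.15-1 m group
    return "moderate"           # errors greater than 1 m

# The 11 cm (PPK_CB) and 10 cm (PPK_LB) values quoted in the text.
groups = [accuracy_group(0.11), accuracy_group(0.10)]
```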

Figure 1. Three common configurations for geotagging photographs for a SfM or SfM-MVS workflow. (A) Onboard position calculation: positions of the photographs are based on the location of the UAS and recorded in the Exif; (B) post-processing kinematic (PPK): positions of the photographs are computed after the flight from the rover and base station logs. A commercial or local base station can be used; (C) real-time kinematic (RTK): positions of the photographs are computed in real-time with corrections sent to the rover directly from the base station. The base station can be local or, in specialized scenarios, a commercial base station correction can be sent via NTRIP to the remote controller. The accuracy of the photograph positions for both the PPK and RTK solutions greatly depends on the accuracy of the base station location.

Figure 2. Aerial view of the three study sites. (A) Herbaceous field next to the Mer Bleue (MB) peatland, Ontario; (B) abandoned agricultural field on île Grosbois (IGB), Quebec; (C) agricultural field in fallow in Rigaud, Quebec. The white boxes indicate the location of the fields within the landscape; (D) posts in the MB field with metal targets affixed to the top; (E) temporary target used in IGB and Rigaud being measured with a Trimble Catalyst GNSS receiver.

Figure 3. Illustration of the relative differences in sensor size of the cameras used in this study (Table 2).

Figure 4. General workflow to determine the positional errors of the checkpoints and the within-model horizontal distance error.

Figure 5. Relationship between focal length (FL) (mm) and RMSE_r,z for the P4RTK. The effects on RMSE_z of using the generalized Pix4D focal length (8.57976 mm) for the P4RTK and of using the generalized focal length with "All Prior" initial camera parameters are shown by the circle and triangle, respectively.

Figure 6. RMSE_x,y,z (positional accuracy) for the SfM sparse point clouds. The number above each group of bars is the GSD in cm.

Figure 7. Positional error (as MAE_x,y,z) for the SfM sparse point clouds. The standard deviation of MAE is also shown (in m).

Figure 9. Violin plots of the within-model horizontal measurement errors calculated as distances between all pairs of checkpoints. Red lines represent the median and dotted lines indicate the quartiles. Distance calculations take into consideration the error propagation of the uncertainty in the position of the checkpoints as well as user error locating the center of the checkpoints in the orthomosaics. The percentages of pairwise distance deviations (from the orthomosaic vs distances measured in situ) less than the measurement uncertainty (and therefore set to 0) are also indicated. Due to the similarity in results between M600P + PPK_LB-NTRIP, M600P + PPK_LB and M600P + PPK_CB, only one is shown.

Figure 11. (A) Position of the original geotags from the GP-E2 (blue) in comparison to the optimized positions as calculated by Pix4D (green); (B) positions of the GP-E2 geotags with the altitude tag replaced by the altitude from the flight logs with a lever arm correction applied (blue) in comparison to the optimized position as calculated by Pix4D (green); (C) original GP-E2 altitude transect of the flight; (D) polar histogram of the directional offsets between the checkpoints measured in situ and located in the orthomosaic for the GP-E2.

Figure 12. Profile view comparison of the SfM sparse point cloud from the Mavic 2 Pro with integrated Hasselblad L1D-2C camera. (A) Domed deformation (radial distortion) as the product of standard calibration settings; (B) deformation removed following processing with initial camera parameters set to "All prior" and "Accurate Geolocation and Orientation". The remaining slope on the left side (entrance to the field) is real.

Figure 13. Relationship between the NSSDA horizontal positional accuracy at 95% confidence level (m) and the mean within-model horizontal distance measurement error (m). The legend and size of the circles indicate the price category of each UAS from Table 1 at the time of purchase (2016-2019). The letters C, P and E refer to consumer, professional and enterprise grades as set by the manufacturer. * Indicates cases where RMSE_x and RMSE_y were found to not be normally distributed (D'Agostino-Pearson omnibus K² test, α = 0.05).

Figure 14. RMSE_Map,DEM project accuracy requirements ordered by RMSE_Map(AT). The largest value of RMSE_x or RMSE_y was used to calculate RMSE_AT for each UAS. The three project categories (high-resolution; manned aircraft or high-resolution satellite data products; and moderate resolution satellite data products) are based on RMSE_Map(AT).

Figure 15. Correlation matrix of internal camera parameters, focal length (FL), coordinates of the principal point (C0x and C0y), radial distortion parameters (R1, R2 and R3) and tangential distortion parameters (T1 and T2), for the Mavic 2 Pro's L1D-2C camera. The matrix on the left illustrates the correlations in the SfM reconstruction with the domed deformation generated by optimizing all parameters. The matrix on the right illustrates the correlations in the SfM reconstruction without the deformation, generated by using internal parameters close to the initial values and minimal recalibration of the location and orientation of the photographs. Of importance, the correlation between the FL, C0x and C0y has decreased in the matrix on the right.

Table 1. List of UASs tested ordered by takeoff weight. * These systems only utilize RTK for the flight controller and the geotagging only uses GPS L1 and GLONASS F1 frequencies. The DSLR camera used was a Canon 5D Mark III.

Table 3. Effects of changing the camera focal length parameter on location accuracy for the Phantom 4 RTK. The best model is highlighted in bold. FL: focal length. * Generalized FL of the P4RTK camera in Pix4D.