Accuracy Assessment of a UAV Direct Georeferencing Method and Impact of the Configuration of Ground Control Points

Unmanned aerial vehicles (UAVs) can obtain high-resolution topography data flexibly and efficiently at low cost. However, the georeferencing process involves the use of ground control points (GCPs), which limits time and cost effectiveness. Direct georeferencing, using onboard positioning sensors, can significantly improve work efficiency. The purpose of this study was to evaluate the accuracy of the Global Navigation Satellite System (GNSS)-assisted UAV direct georeferencing method and the influence of the number and distribution of GCPs. A FEIMA D2000 UAV was used to collect data, and several photogrammetric projects were established. Among them, the number and distribution of GCPs used in the bundle adjustment (BA) process were varied. Two parameters were considered when evaluating the different projects: the ground-measured checkpoints (CPs) root mean square error (RMSE) and the Multiscale Model to Model Cloud Comparison (M3C2) distance. The results show that the vertical and horizontal RMSE of the direct georeferencing were 0.087 and 0.041 m, respectively. As the number of GCPs increased, the RMSE gradually decreased until a specific GCP density was reached. GCPs should be uniformly distributed in the study area and contain at least one GCP near the center of the domain. Additionally, as the distance to the nearest GCP increased, the local accuracy of the DSM decreased. In general, UAV direct georeferencing has an acceptable positional accuracy level.


Introduction
As a new aerial survey platform, unmanned aerial vehicles (UAVs) have attracted increasing attention worldwide. Compared with methods based on satellite or airborne sensors, UAVs provide data at user-defined spatial and temporal resolutions at a relatively low cost, as well as flexible options for sensor use and data collection [1]. Processing UAV images with structure-from-motion (SfM) photogrammetry combines mature photogrammetric principles with modern computer vision technology [2] (hereafter UAV SfM). UAV SfM is widely used in Earth and Environmental Sciences to generate high-resolution topography (HRT) data [3,4], with applications including precision agriculture [5,6], landslide monitoring [7,8], coastal change [9,10], and glacier dynamics [11,12].
Georeferencing is the process of referencing the results of bundle adjustment (BA) and photogrammetric processing to a specific coordinate system [13]. During BA, either ground control point (GCP) coordinates are surveyed on the ground and measured in the images (indirect georeferencing), or known external orientation elements of the images are used directly (direct georeferencing) [14]. Direct georeferencing requires centimeter-level positioning accuracy of the UAV, which is obtained from differential correction data. GCP field deployment, surveying, and identification in images can require a significant amount of time and cost, whereas direct georeferencing based on an inertial measurement unit (IMU) and GNSS allows data to be collected quickly and significantly improves work efficiency. However, these benefits are only valuable if the accuracy of the directly georeferenced topographic product meets the requirements of the application. In this approach, the GNSS rover receiver on the UAV achieves centimeter-level positioning accuracy by receiving differential data from a virtual reference station (VRS), and the position data solved by fusing real-time kinematic (RTK) and post-processed kinematic (PPK) solutions are then added to the BA process. Direct georeferencing requires the coordinates of the GNSS receiver at the exact moment each image is acquired. Although RTK can provide high-accuracy positioning, signals from satellite constellations may be distorted and the RTK connection may be interrupted for a fast-flying UAV. Fusion PPK technology can solve the epoch data in the loss-of-lock period through a reverse Kalman filter to improve the fixed rate and positioning accuracy [15,16].
Considering the rapid development of UAV technology, it is necessary to evaluate the accuracy of UAV direct georeferencing methods with GNSS RTK/PPK technology. Many studies have evaluated the technology based on measured checkpoints on the ground. Padró et al. evaluated the data of a farm and showed that the horizontal and vertical RMSE of direct georeferencing were no more than 0.256 and 0.238 m, respectively [17]. Nolan et al. obtained data with a GSD of 10-20 cm in an area over tens of square kilometers and verified that the accuracy and precision (repeatability) of direct georeferencing were better than ±30 and ±8 cm, respectively, at 95% RMSE [18]. In the other two studies, the vertical RMSE did not exceed 10 and 20 cm, respectively [19,20]. The accuracy of the vertical direction is not always able to meet the requirements, and the accuracy can only reach the meter level in some studies [21,22]. Mian obtained a vertical RMSE of 40 cm using an image of 0.7 cm GSD [23]. Hugenholtz compared direct and indirect georeferencing methods, and recommended using GCPs where high accuracy is required [24].
The final product quality mainly depends on camera specifications, GCP configuration (accuracy, density, and distribution), flight parameters (image overlap, GSD, etc.), land cover and terrain complexity, processing software, and flight platform (fixed wing or rotor wing). Measuring GCPs is a time-consuming task, so a balance between appropriate accuracy and efficiency is needed. The literature to date consistently shows that accuracy increases with the number of GCPs and rapidly approaches an asymptote [25-27]. Studies differ, however, regarding the number of GCPs required to produce a favorable outcome. In [28], the optimal GCP density was 0.5-1 GCP per ha, and GCPs were placed inside the area with a stratified distribution to obtain the minimum total error. In [29], the optimal density was 1.8 GCPs per ha, uniformly distributed across the whole surface. Scott et al. reported that setting control points in the center and at the edge of the study area is of great significance for reducing spatially concentrated vertical error [30].
The purpose of this study was to evaluate the accuracy of UAV RTK/PPK direct georeferencing and to determine whether the method could be used as a solution for rapid mapping applications with data generated from images of environments that include buildings, low vegetation, and similar features. The aim was to understand the difference in survey effectiveness between direct georeferencing and the use of GCPs. In addition, the effect of GCP quantity and distribution on the quality of the results was studied. The evaluation was performed by calculating the vertical and horizontal RMSE of checkpoints on a digital surface model (DSM) and digital orthomosaic (DOM), respectively, and the Multiscale Model to Model Cloud Comparison (M3C2) distance based on point clouds [31]. The M3C2 method computes signed and robust distances directly between two point clouds.

Materials and Methods
The workflow of this study is shown in Figure 1. The study comprises four main steps: route planning (field survey, pre-flight, and setting flight parameters); data acquisition (ground GCP and CP layout and survey, UAV image acquisition); data processing considering different quantities of GCPs (BA and dense image matching); and horizontal and vertical quality assessment (data and error analysis).

The Study Area
The study area is located near Xishan, Taiyuan, Shanxi Province, People's Republic of China (Figure 2). The approximate coordinates in the geodetic reference system WGS84 are 112°27′5.64″ E and 37°52′0.98″ N. The region covers an area of approximately 0.5 km², with the highest point at 896 m and the lowest point at 840 m. The area features railways, factories, and low vegetation.

Accurate GCP coordinates are required for georeferencing UAV images. Sixteen ground survey markers were deployed; depending on the photogrammetric project, some of them were used as input to the BA process, and the rest were used as horizontal checkpoints for cross-verification of the horizontal accuracy. All control points were as evenly distributed as possible in the study area. In addition, coordinates of 120 points were obtained for vertical accuracy analysis of the generated DSM. Compared to CPs for horizontal accuracy assessment, vertical CPs do not require accurate identification in the image. Aerial markers consisted of 70 × 70 cm highly reflective red and yellow material fixed with nails to the ground, far away from high vegetation, buildings, and slopes. These markers were large enough to be identified in images and were placed and measured prior to flight. A GNSS RTK receiver was used for field measurements. The receiver was connected to a virtual reference station and received differential signals through the network. Ten fixed solutions were recorded at each point, and the average was taken as the final result.
Taking the average of the results of multiple measurements as the final result can exclude the influence of accidental factors and make the measurement results closer to the true value. At a control point marker, it takes about 3-5 min from deployment to the end of measurement. All coordinates were recorded in the WGS84 reference system. In the experiment, the means of the RTK coordinate residuals in the X, Y, and Z directions were 0.007, 0.006, and 0.012 m, respectively. The expected coordinate accuracy was higher than the spatial resolution of the UAV image (GSD about 1.7 cm). Figure 2 shows the position of GCPs and CPs relative to the study area.
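The averaging of repeated fixed solutions can be illustrated with a minimal sketch. The function name and the ten coordinate triples are hypothetical and serve only as an illustration; they are not measurements from this survey.

```python
from statistics import mean, stdev

def average_fixed_solutions(solutions):
    """Average repeated RTK fixed solutions (x, y, z) recorded at one marker.

    Returns the mean coordinate and the per-axis sample standard deviation,
    which indicates how closely the repeated fixes agree.
    """
    xs, ys, zs = zip(*solutions)
    mean_xyz = (mean(xs), mean(ys), mean(zs))
    spread_xyz = (stdev(xs), stdev(ys), stdev(zs))
    return mean_xyz, spread_xyz

# Illustrative values only (metres, arbitrary local frame), ten fixes per point.
fixes = [
    (100.000 + 0.001 * i, 200.000 - 0.001 * i, 850.000 + 0.002 * (i % 3))
    for i in range(10)
]
mean_xyz, spread_xyz = average_fixed_solutions(fixes)
print("mean:", mean_xyz)
print("spread:", spread_xyz)
```

The per-axis spread is a simple indicator of measurement repeatability; a marker with an unusually large spread could be re-measured before it is used as a GCP or CP.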

Data Collection
In this study, a FEIMA D2000 multi-rotor UAV (Figure 3a) was used to collect data. This UAV included a fuselage, power motor, quick removal wing, differential antenna, magnetometer, and data transmission antenna. The 24.3-megapixel camera installed on the D2000 UAV provided a ground sampling distance (GSD) of 1.7 cm/pixel at an altitude of 110 m relative to the ground. The camera captured images at fixed intervals and stored them in JPG format. Detailed information about the UAV and camera is shown in Table 1.
By pre-setting flight parameters in the UAV Manager software, the UAV flew autonomously from takeoff to landing. The ground control station consisted of computers, a ground-based data transmission radio, and antennas that communicated with the UAV to continuously monitor its flight status and allowed users to interrupt the flight if the UAV was in danger. The aerial survey was conducted on 10 August 2021 with clear weather and light wind.
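The relation between flight height and GSD can be sketched with the standard pinhole relation GSD = H · p / f, where H is the height above ground, p the pixel pitch, and f the focal length. The pixel pitch and focal length below are assumptions chosen for illustration (a 23.5 mm wide sensor with 6000 pixels across and a 25 mm lens); they are not taken from Table 1.

```python
def ground_sampling_distance(height_m, pixel_pitch_m, focal_length_m):
    """Ground footprint of one pixel for a nadir-looking pinhole camera."""
    return height_m * pixel_pitch_m / focal_length_m

# Assumed camera parameters (hypothetical, not from Table 1).
PIXEL_PITCH_M = 23.5e-3 / 6000   # sensor width / pixels across
FOCAL_LENGTH_M = 25e-3

# Flight height of 110 m as reported for this survey.
gsd = ground_sampling_distance(110.0, PIXEL_PITCH_M, FOCAL_LENGTH_M)
print(f"GSD = {gsd * 100:.2f} cm/pixel")
```

Under these assumed parameters the result is about 1.7 cm/pixel, in line with the reported GSD; because GSD scales linearly with height, halving the flight height would halve the GSD.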

Data Processing
In this study, Agisoft Photoscan Professional Version 1.7.0 [32] software was used to generate dense point clouds, a DSM, and a DOM based on the SfM algorithm. Before processing, position data were solved in the network RTK/PPK fusion differential mode. This mode gives priority to the PPK fixed solution; for epochs where PPK yields no fixed solution, the RTK fixed solution is used instead, so that the two modes complement each other and ensure high-precision position data. The RTK trajectory file was input during the fusion differential solution. The positions were then written into the image EXIF data in sequence; that is, the GPS positioning data were stored in the header of each image so that the software could read them directly from the image. The workflow in Photoscan is as follows:
1. Image feature extraction and matching. The software automatically identifies many conspicuous points in each image, regardless of image scale or perspective, and similar feature points are then recognized in multiple images. The quality of feature matching depends on the texture and overlap of the images [33,34];
2. Iterative bundle adjustment. The purpose of BA is to determine the internal and external orientation elements of the images by minimizing the reprojection errors between predicted and observed points, which can be formulated as a nonlinear least-squares problem [35]. By applying BA, the three-dimensional structure of the scene and the internal and external orientation elements of the camera are estimated at the same time;
3. Model optimization based on control points. GCPs provide additional external information about the reconstructed scene geometry. The optimization process in Photoscan refines the camera positions and reduces non-linear project deformations by incorporating GCPs [36];
4. Dense point cloud matching. The multi-view stereo (MVS) image matching algorithm operates at the single-pixel scale of the image to build dense clouds and increases the point density by several orders of magnitude;
5. DSM and DOM generation. Using the dense point cloud as input, other results, such as the DSM and DOM, can be produced. Outliers in the dense point cloud are removed before the cloud is interpolated to generate the DSM, and the DOM is then generated by digital differential rectification based on the DSM.
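The reprojection-error cost that bundle adjustment minimizes (step 2 above) can be sketched as follows. This is a simplified illustration, not the Photoscan implementation: it assumes an ideal pinhole camera without lens distortion and points already expressed in camera coordinates, and all names and numbers are hypothetical.

```python
def project(point_cam, focal_px, cx, cy):
    """Pinhole projection of a 3D point given in camera coordinates (Z forward)."""
    x, y, z = point_cam
    return (focal_px * x / z + cx, focal_px * y / z + cy)

def reprojection_cost(points_cam, observations, focal_px, cx, cy):
    """Sum of squared pixel residuals between observed and predicted image points."""
    cost = 0.0
    for point, (u_obs, v_obs) in zip(points_cam, observations):
        u_pred, v_pred = project(point, focal_px, cx, cy)
        cost += (u_obs - u_pred) ** 2 + (v_obs - v_pred) ** 2
    return cost

# Illustrative values: two scene points and their measured image positions.
points = [(1.0, 2.0, 10.0), (0.0, 0.0, 5.0)]
observed = [(601.0, 599.0), (500.0, 400.0)]
cost = reprojection_cost(points, observed, focal_px=1000.0, cx=500.0, cy=400.0)
print("reprojection cost (px^2):", cost)
```

In a full bundle adjustment, a nonlinear least-squares solver would adjust the camera orientation elements and the 3D point coordinates together so that this cost, summed over all images, becomes as small as possible.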
The images in all projects underwent the same photogrammetric processing, with differences in the number and distribution of GCPs used in the BA process. The number of GCPs in the experiment for evaluating the vertical RMSE ranged from 0 to 16, while the number for the horizontal RMSE ranged from 0 to 10. Seven other experiments evaluated the impact of the GCP distribution; the distribution schemes are shown in Figure 4.

Quality Assessment
Two methods were used to evaluate the quality of the UAV SfM results. The first method evaluates the accuracy of the DSM and DOM generated by the different projects. The root mean square error (RMSE) was calculated by comparing the CP coordinates extracted from the results with the reference CP coordinates measured with the GNSS RTK receiver. Specifically, horizontal accuracy was verified on the DOM, while vertical accuracy was verified by extracting the corresponding DSM elevations at a larger set of 120 vertical checkpoints. The RMSE in each coordinate direction is calculated as follows:
$$\mathrm{RMSE}_X = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \Delta x_i^2} \tag{1}$$
$$\mathrm{RMSE}_Y = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \Delta y_i^2} \tag{2}$$
$$\mathrm{RMSE}_Z = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \Delta z_i^2} \tag{3}$$
where ∆x_i, ∆y_i, and ∆z_i are the differences between the RTK checkpoint coordinates and the model-extracted coordinates, and n is the number of checkpoints. Calculation of the horizontal RMSE (RMSE_XY) is as follows:
$$\mathrm{RMSE}_{XY} = \sqrt{\mathrm{RMSE}_X^2 + \mathrm{RMSE}_Y^2} \tag{4}$$
The second method uses the Multiscale Model to Model Cloud Comparison (M3C2) tool of the CloudCompare software version 2.10 [37], which calculates the distance between the reference point cloud and the compared point cloud along the local surface normal direction, controlled by two user-defined parameters (normal scale and projection scale). The process runs directly on the point clouds and does not require gridding, avoiding the uncertainties introduced by interpolation.
To compare point clouds, the M3C2 distance between the reference point cloud, generated with all 16 GCPs participating in BA, and each of the other project point clouds was calculated. The mean and standard deviation of the M3C2 distances were then used to evaluate the accuracy and precision, respectively, of each point cloud. By plotting the error and its distribution curve, we determined the influence of the GCP distribution on the spatial distribution of the M3C2 distance and identified possible patterns in the spatial distribution of error.
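The checkpoint RMSE used as the first quality measure can be sketched as below. The sketch assumes the conventional per-axis RMSE and a horizontal RMSE formed as the root sum of squares of the X and Y components; the function names and the two sample checkpoints are hypothetical.

```python
from math import sqrt

def rmse(diffs):
    """Root mean square of a list of coordinate differences."""
    return sqrt(sum(d * d for d in diffs) / len(diffs))

def checkpoint_rmse(measured, modelled):
    """Per-axis, horizontal, and vertical RMSE between RTK-measured and
    model-extracted checkpoint coordinates, each given as (x, y, z) tuples."""
    dx = [m[0] - p[0] for m, p in zip(measured, modelled)]
    dy = [m[1] - p[1] for m, p in zip(measured, modelled)]
    dz = [m[2] - p[2] for m, p in zip(measured, modelled)]
    rmse_x, rmse_y, rmse_z = rmse(dx), rmse(dy), rmse(dz)
    return {
        "x": rmse_x,
        "y": rmse_y,
        "xy": sqrt(rmse_x ** 2 + rmse_y ** 2),  # horizontal RMSE
        "z": rmse_z,                            # vertical RMSE
    }

# Illustrative values only (metres).
measured = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
modelled = [(0.03, 0.04, 0.01), (0.03, 0.04, -0.01)]
result = checkpoint_rmse(measured, modelled)
print(result)
```

Note that the vertical differences in this example have opposite signs and a mean of zero, yet the vertical RMSE is 0.01 m; the RMSE therefore captures both bias and scatter, which is why the mean difference is examined separately.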

Model Evaluation Based on RMSE
Based on the model-extracted coordinates and the measured checkpoint coordinates, the vertical and horizontal RMSE for each project were calculated using Equations (1)-(4). The GCP density was calculated as the number of GCPs used divided by the area investigated, and the resulting relationship between GCP density and RMSE is shown in Figure 5. The results show that when the GCP density was highest, the vertical and horizontal RMSE were 0.032 and 0.015 m, respectively, representing approximately 1.88 and 0.88 GSD. When GCPs were not used, the vertical and horizontal RMSE increased to 0.087 and 0.041 m, respectively, representing approximately 5.12 and 2.41 GSD. RMSE_XY was less than RMSE_Z in all projects, and the mean ratio RMSE_Z:RMSE_XY was approximately 2.3. As the GCP density increased, the vertical and horizontal RMSE gradually decreased; this trend is fitted by the nonlinear curve in Figure 5. As shown in Figure 5a, when d > 12 GCP/km², the vertical RMSE did not decrease significantly (the change from the maximum was less than 20%). In Figure 5b, the same is true of the horizontal RMSE for d > 10 GCP/km² (the error does not show significant change). In some cases the RMSE increased as the GCP density increased, which may be due to error introduced by the ground measurement of the newly added GCPs. Our results for the horizontal error evaluation are influenced by manually setting the coordinates at the centers of the GCPs. A similar uncertainty is expected when determining the coordinates of the GCP centers on the DOM.
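The two conversions used in this section, GCP density and RMSE expressed in GSD units, are simple ratios and can be reproduced from the reported values. The sketch assumes the nominal area of 0.5 km² and GSD of 1.7 cm given earlier; the function names are hypothetical.

```python
AREA_KM2 = 0.5    # approximate study area
GSD_M = 0.017     # ground sampling distance in metres

def gcp_density(n_gcps, area_km2=AREA_KM2):
    """Number of GCPs per square kilometre."""
    return n_gcps / area_km2

def in_gsd_units(rmse_m, gsd_m=GSD_M):
    """Express an RMSE given in metres as a multiple of the GSD."""
    return rmse_m / gsd_m

print("density with 16 GCPs:", gcp_density(16), "GCP/km^2")
print("direct georeferencing, vertical:", round(in_gsd_units(0.087), 2), "GSD")
print("direct georeferencing, horizontal:", round(in_gsd_units(0.041), 2), "GSD")
```

With the nominal area, the densities of 12 and 10 GCP/km² beyond which the RMSE stopped improving correspond to about six and five GCPs, respectively.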
Drones 2022, 6, 30
Statistical analysis of the vertical differences of three projects is shown in Figure 6. In Figure 6a-c, the histograms of the differences between the RTK checkpoint elevations and the DSM-extracted elevations correspond to 0, 1, and 2 GCPs, respectively; all of the distributions are approximately Gaussian. The mean value in Figure 6a is −0.08 m, indicating a systematically biased distribution. The mean value in Figure 6b is −0.025 m, indicating that adding a single GCP helps to reduce the vertical error. When two GCPs are used (Figure 6c), the mean value is 0.003 m, and the distribution is significantly improved. A one-sample t-test was conducted, as shown in Table 2. The null hypothesis is that there is no significant difference between the mean and the test value of 0. For one GCP, the significance is less than 0.05, so the null hypothesis is rejected; that is, the mean differs significantly from zero at the 95% confidence level. For two GCPs, the significance is 0.449, greater than 0.05, so the null hypothesis cannot be rejected; that is, there is no significant difference between the mean and zero at the 95% confidence level. Therefore, the DSM has a systematic dome error in the vertical direction when zero or one GCP is used, and it is important to use at least two GCPs to improve DSM elevation accuracy. The standard deviation did not differ significantly among the projects, with the one-GCP project having the lowest standard deviation, at 0.033 m. Figure 6d shows the linear fit between the checkpoint elevations and the elevations extracted from the DSM generated with zero GCPs; the coefficient of determination (R²) is 0.99.
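The one-sample t-test on the mean vertical difference can be sketched from summary statistics. The sketch assumes that all 120 vertical checkpoints entered the test and uses an approximate two-sided 5% critical value of 1.98 for 119 degrees of freedom; the function name is hypothetical, and the values in Table 2 may have been obtained differently.

```python
from math import sqrt

# Approximate two-sided 5% critical value of Student's t for 119 degrees
# of freedom (an assumption made for this illustration).
T_CRIT_TWO_SIDED_5PCT = 1.98

def t_from_summary(mean_diff, sd_diff, n, mu0=0.0):
    """One-sample t statistic computed from the sample mean, sample
    standard deviation, and sample size."""
    return (mean_diff - mu0) / (sd_diff / sqrt(n))

# One-GCP project: mean -0.025 m and standard deviation 0.033 m as reported.
t_one_gcp = t_from_summary(-0.025, 0.033, 120)
reject_null = abs(t_one_gcp) > T_CRIT_TWO_SIDED_5PCT
print(f"t = {t_one_gcp:.2f}, reject null hypothesis: {reject_null}")
```

Under these assumptions the magnitude of the statistic is far above the critical value, which is consistent with the reported rejection of the null hypothesis for the one-GCP project.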

Point Cloud Evaluation Based on M3C2 Distance
The M3C2 distance between the reference point cloud (generated with all 16 GCPs in the BA process) and the point cloud obtained from each of the other projects was calculated. The mean and standard deviation were used as indicators to evaluate the accuracy and precision, respectively, of the point clouds of the different projects (Table 3 and Figure 7). The mean distance between the point cloud obtained with zero GCPs and the reference point cloud was 0.062 m. Adding a GCP at any position reduced the mean distance, and the improvement was most obvious after adding GCP K8 in the lower right corner. Regardless of the distribution, using two GCPs reduced the mean error by about 50% compared with zero GCPs. As the number of GCPs increased, the mean distance decreased, and the point cloud became more accurate. The mean distance decreased to a minimum of 0.01 m at eight and nine GCPs. The standard deviations of the different projects ranged between 0.02 and 0.03 m, showing no obvious difference. Several projects achieved a minimum standard deviation of 0.021 m.
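The two point cloud indicators can be computed from an exported list of M3C2 distances as in the sketch below. It assumes that points for which no distance could be computed are stored as NaN and should be skipped; the function names and the short distance list are hypothetical.

```python
from math import isnan
from statistics import mean, stdev

def summarize_m3c2(distances):
    """Mean (accuracy) and standard deviation (precision) of M3C2 distances,
    ignoring NaN entries."""
    valid = [d for d in distances if not isnan(d)]
    return mean(valid), stdev(valid)

def relative_reduction(baseline_mean, project_mean):
    """Fractional reduction of the mean distance relative to a baseline."""
    return (baseline_mean - project_mean) / baseline_mean

# Illustrative distances only (metres).
accuracy, precision = summarize_m3c2([0.01, 0.03, float("nan")])
print(accuracy, precision)

# Reported means: 0.062 m with zero GCPs, 0.01 m with eight or nine GCPs.
reduction = relative_reduction(0.062, 0.01)
print(f"mean distance reduced by {reduction:.0%}")
```

Applied to the reported means, the reduction from zero GCPs to eight or nine GCPs is roughly 84%, compared with about 50% for two GCPs.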

The spatial error distributions of the M3C2 distance between the reference point cloud and the point clouds of the 0-, 3-, 6-, and 9-GCP projects were calculated, and the results are displayed with the same legend together with the histogram of the distances (Figure 8). A significant dome effect (red area) can be observed with zero GCPs in Figure 8a; the upper left corner of Figure 8b shows a clear error relative to the reference point cloud due to a lack of control there; from Figure 8c to Figure 8d the error is not visibly reduced, and the improvement is reflected only in the mean distance (from 0.013 to 0.01 m).
Overall, the accuracy near the boundary of the study area is not as high as that in the center, which is related to the low overlap of images near the boundary (Figure 3e). There are obvious errors along the north-south road and at the boundaries of buildings. The reasons for the large M3C2 distance along the road are as follows: (1) tall plants on both sides of the road block the ground; (2) the point cloud density calculated from sample data extracted from the road area is approximately 12 points per m², which is less than the average density of approximately 35 points per m² in the overall study area. The uniform surface of the road (asphalt pavement) lacks adequate feature points, resulting in a sparse area within the point cloud. The M3C2 distance increased at the boundaries of buildings due to the sudden change in topographic characteristics.

Influence of GCPs Distribution
The experimental results of evaluating the influence of the GCP distribution are shown in Figure 9. The box plots represent the differences between the checkpoint elevations and the elevations extracted from the DSM. In the single-GCP layouts, four tests (all except K15) showed low variability; among them, K10, located near the center of the region, gave the smallest error. In the experiments with different distributions of four GCPs, the error obtained with K2, K5, K9, and K12 was significantly lower than that obtained with K1, K6, K8, and K15. The difference between the two layouts is that K2, K5, K9, and K12 are closer to one another.

In addition, ArcGIS was used for differential calculations. The difference was assessed as the difference between the DSM produced using a uniform distribution of five GCPs and the DSM produced using the optimal distribution of 16 GCPs; the resolution of the differential DSM was 0.2 m (Figure 10a). The results show that as the distance from the nearest GCP increases, the local vertical difference of the DSM tends to increase. There are many scattered points in Figure 10b, indicating that the local DSM accuracy is not solely determined by the distance to the nearest GCP; therefore, a nonlinear curve is used for fitting.
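The relationship between local DSM difference and distance to the nearest GCP can be explored with a simple binning sketch. This is not the ArcGIS workflow used in the study; the function names, the single GCP location, and the three cells are hypothetical, and each cell is given as (x, y, dz) with dz the DSM difference in metres.

```python
from math import dist

def nearest_gcp_distance(cell_xy, gcps_xy):
    """Planimetric distance from a DSM cell to the closest GCP."""
    return min(dist(cell_xy, gcp) for gcp in gcps_xy)

def mean_abs_diff_by_distance(cells, gcps_xy, bin_width_m):
    """Mean absolute DSM difference grouped by distance to the nearest GCP.

    Returns a dict mapping the lower edge of each distance bin (metres)
    to the mean absolute difference of the cells falling in that bin.
    """
    sums = {}
    for x, y, dz in cells:
        bin_index = int(nearest_gcp_distance((x, y), gcps_xy) // bin_width_m)
        total, count = sums.get(bin_index, (0.0, 0))
        sums[bin_index] = (total + abs(dz), count + 1)
    return {
        index * bin_width_m: total / count
        for index, (total, count) in sorted(sums.items())
    }

# Illustrative values only.
gcps = [(0.0, 0.0)]
cells = [(3.0, 4.0, 0.02), (30.0, 40.0, -0.06), (0.0, 1.0, 0.04)]
binned = mean_abs_diff_by_distance(cells, gcps, bin_width_m=10)
print(binned)
```

Binning averages out part of the scatter visible in a point-wise plot, which makes a trend with distance easier to see, although it cannot separate distance effects from other factors such as land cover.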

Discussion
Using direct georeferencing can significantly improve the efficiency of UAV measurement. This study evaluated the geospatial accuracy of photogrammetric products obtained by direct georeferencing, and the impact of the configuration of GCPs on the quality of the results. The RMSE, the M3C2 distance of the point cloud, the GCP distribution, and measures to improve the accuracy of the results when using direct georeferencing are discussed below.
1. Error analysis based on the RMSE shows that adding one GCP can help to reduce the deviation, but there may still be dome error, as shown in the study of Rosnell and Javernick [27,38]. When two GCPs were used, the mean vertical difference was reduced to 0.003 m, and the horizontal RMSE was 0.0198 m, approximately 1.1 GSD, when 3 GCPs were used. From the work in some natural environments to the investigation of infrastructure, such as modelling water runoff during rain, different projects have different requirements for the fineness of ground features, so the required accuracy depends on the purpose of generating DSM. Therefore, using only one GCP may not meet the high accuracy standards; two to three GCPs are recommended for a trade-off between accuracy and work efficiency. The influence of each error on RMSE is directly proportional to the size of the square error, and therefore RMSE is sensitive to large differences and does not reflect terrain changes. Because of scale differences, errors that do not occur in flat areas may also occur in sloped areas [15]. The natural environment presents a series of complexities, including changing vegetation cover, strong topographic relief, and changes in texture. Future studies will need to assess the impact of these complexities on the accuracy of the results. Calculating RMSE is a common error assessment method when the actual dataset of the ground surface is a set of distribution points rather than a continuous, real surface. Error evaluation benefits from a larger number and more evenly distributed checkpoints. Gomes et al. arranged 270 vertical checkpoints in an area of about 0.22 km 2 , with a density of 1227 checkpoints per km 2 [39]. In the study of Tomaštík, the density of checkpoints at the three sites was approximately 11363, 3674 and 2749 per km 2 [8].
Thus, even if the number of GCP measurements on the ground is minimized, the role of checkpoints in the error assessment is critical. In the future, the plan is to deploy

Discussion
Using direct georeferencing can significantly improve the efficiency of UAV measurements. This study evaluated the geospatial accuracy of photogrammetric products obtained by direct georeferencing and the impact of the GCP configuration on the quality of the results. The RMSE, the M3C2 distance of the point cloud, the GCP distribution, and measures to improve the accuracy of results obtained with direct georeferencing are discussed below.

1. Error analysis based on the RMSE shows that adding one GCP helps to reduce the deviation, but a dome error may remain, as shown in the studies of Rosnell and Javernick [27,38]. When two GCPs were used, the mean vertical difference was reduced to 0.003 m; when three GCPs were used, the horizontal RMSE was 0.0198 m, approximately 1.1 GSD. From work in natural environments to the investigation of infrastructure, such as modelling water runoff during rain, different projects have different requirements for the level of detail of ground features, so the required accuracy depends on the purpose for which the DSM is generated. Therefore, using only one GCP may not meet high accuracy standards; two to three GCPs are recommended as a trade-off between accuracy and work efficiency. Each error contributes to the RMSE in proportion to its square; the RMSE is therefore sensitive to large differences and does not reflect terrain changes. Because of scale differences, errors that do not occur in flat areas may occur in sloped areas [15]. The natural environment presents a series of complexities, including changing vegetation cover, strong topographic relief, and changes in texture. Future studies will need to assess the impact of these complexities on the accuracy of the results. Calculating the RMSE is a common error assessment method when the reference dataset of the ground surface is a set of discrete points rather than a continuous, real surface. Error evaluation benefits from a larger number of more evenly distributed checkpoints. Gomes et al. arranged 270 vertical checkpoints in an area of about 0.22 km², with a density of 1227 checkpoints per km² [39]. In the study of Tomaštík, the density of checkpoints at the three sites was approximately 11363, 3674 and 2749 per km² [8]. Thus, even if the number of GCP measurements on the ground is minimized, the role of checkpoints in the error assessment is critical. In the future, the plan is to deploy as many checkpoints as possible in the study area, build an accurate error surface, analyze the spatial distribution characteristics of errors, and better verify the results; these tasks will help to understand and reduce the potential error sources in the UAV SfM workflow [36];
2. The M3C2 algorithm eliminates the error introduced by the interpolation process. Lower error measurements with M3C2 are comparable to point-to-point or point-to-mesh comparisons; the method has been widely used in research based on point cloud change detection [40][41][42][43]. Due to the high point density of UAV matching point clouds, an intercomparison can effectively be regarded as continuous [44]. With zero GCPs, the M3C2 distance error is randomly distributed over the study area. The standard deviation of the comparison between the reference point cloud and the point clouds of the different projects ranges from 0.021 to 0.028 m. The change is relatively small, and the point cloud deviates only in the vertical direction, which is similar to the findings of Tomaštík and Štroner et al. [6,44]. The standard deviation is an indicator of precision. In some applications of UAV SfM, the accuracy of geolocation is not as important as the repeatability (precision) of the data. For example, when multi-temporal measurement data are compared to study the change in terrain over time, more attention is paid to the relative change between datasets, and the quality of multi-temporal data can be improved through co-registration;
3. GCP distribution experiments show that a uniform distribution of GCPs is crucial when using more than one GCP. A nonlinear curve was fitted in Figure 10b, whereas Gindraux used a linear fit and found that, on average, the vertical accuracy decreased by 0.09 m when the distance from the nearest GCP increased by 100 m [45]. In the experiment with four GCPs, the accuracy improved after the GCPs were moved slightly towards the center of the study area compared with placing them at the edge of the study area, which is similar to the findings of Martínez. That study concluded that the best horizontal accuracies are achieved by placing GCPs around the edges of the study area, but that it is also essential to place GCPs inside the area with a stratified distribution to optimize vertical accuracy [28];
4. The DSM vertical RMSE and DOM horizontal RMSE obtained by direct georeferencing with the GSD set to 1.7 cm/pixel were 0.087 and 0.041 m, respectively. Without GCPs, the accuracy of the results was highly dependent on the accuracy of the image position data. The following measures can be taken to improve the accuracy of results obtained without GCPs. UAV cross flights and imagery with large overlap can provide redundant data and improve the reliability of image matching, although this requires high computing power [1,39]. The addition of oblique images helps to accurately estimate the interior and exterior orientation elements in the bundle adjustment, extract vertical features such as building sidewalls, and obtain the best vertical accuracy [13]. A more accurate GCP measurement method can be used, rather than simply increasing the number of GCPs. Another recommendation is to use a tripod-mounted prism instead of a pole-mounted prism, as well as the RTK static measurement method when time permits. If inaccurate coordinates are introduced when measuring GCPs, a more complex error surface will be introduced, rather than the initial deformation being reduced [14].
Direct georeferencing technology that integrates UAVs with GNSS RTK has advantages in monitoring locations that are large or difficult to access. However, further development is needed. High-quality optical lenses and multi-frequency GPS receivers can provide higher-quality images and better positioning accuracy.
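The checkpoint statistics discussed above can be sketched as follows. The function name and the input layout (one row of X, Y, Z per checkpoint) are assumptions for illustration, and the horizontal RMSE is computed from the planar distance, which is a common convention but may differ from the exact definition used in this study. The sketch also separates the mean error (bias, linked to accuracy) from the standard deviation (linked to precision); with the population standard deviation, RMSE² equals mean² plus standard deviation².

```python
import numpy as np

def checkpoint_errors(measured_xyz, model_xyz):
    """Summarise checkpoint residuals (model minus ground measurement).

    measured_xyz : ground-surveyed checkpoint coordinates, shape (n, 3).
    model_xyz    : coordinates of the same points read from the DOM/DSM, shape (n, 3).
    Both inputs are hypothetical; units are metres.
    """
    measured = np.asarray(measured_xyz, dtype=float)
    model = np.asarray(model_xyz, dtype=float)
    d = model - measured
    dz = d[:, 2]                          # vertical residuals
    dxy = np.hypot(d[:, 0], d[:, 1])      # planar (horizontal) residuals
    return {
        "rmse_vertical": float(np.sqrt(np.mean(dz ** 2))),
        "rmse_horizontal": float(np.sqrt(np.mean(dxy ** 2))),
        "mean_vertical": float(dz.mean()),   # systematic offset (bias)
        "std_vertical": float(dz.std()),     # population std (ddof=0), i.e. precision
    }
```

Because each residual enters the RMSE squared, a single checkpoint with a large error can dominate the result, which is why a dense and evenly distributed set of checkpoints gives a more reliable assessment.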

Conclusions
In this study, the geospatial accuracy of photogrammetric products obtained by the FEIMA D2000 direct georeferencing method was evaluated using ground-measured data. In addition, we evaluated the effect of the quantity and spatial distribution of GCPs on the quality of the results. The research results are summarized as follows:
1. UAV SfM is a flexible and efficient method to obtain high-resolution topographic data. The direct georeferencing method, which obtains high-accuracy image positions from the RTK/PPK fusion difference, has potential for improving the accuracy of the products, especially when GPS measurements are difficult, as well as for reducing the dependence on GCPs in the bundle adjustment and decreasing field work time and cost.
2. The vertical RMSE of the DSM obtained by direct georeferencing was 0.087 m, approximately equal to 5.12 GSD. The horizontal RMSE of the DOM was 0.041 m, approximately equal to 2.41 GSD. Both values reach centimeter-level positioning accuracy and are sufficient for applications that tolerate decimeter-scale errors. The accuracy of UAV direct georeferencing can be ensured through careful flight planning, an appropriate survey, and accurate data post-processing. In the study of terrain change detection, we suggest evenly deploying two to three GCPs to achieve a good compromise between accuracy, repeatability and efficiency.
3. GCPs should be uniformly distributed in the study area and include at least one GCP near the center of the domain to reduce the dome effect. As the number of GCPs in the bundle adjustment increased, both the horizontal error and the vertical error decreased, and the horizontal error was always lower than the vertical error. When the density of the GCPs was greater than 12 GCP/km² and 10 GCP/km², respectively, the decrease in the vertical and horizontal errors was not obvious. The minimum vertical and horizontal RMSE were 0.032 m (~1.88 GSD) and 0.015 m (~0.88 GSD), respectively.
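The GSD multiples quoted above follow from dividing each RMSE by the GSD of 1.7 cm/pixel (0.017 m), and point densities follow from dividing a point count by the area. A minimal sketch of this arithmetic is given below; the function names are hypothetical and are not part of any software used in the study.

```python
def error_in_gsd(error_m, gsd_m=0.017):
    """Express an error in metres as a multiple of the GSD (default 1.7 cm/pixel)."""
    return error_m / gsd_m

def points_per_km2(n_points, area_km2):
    """Density of GCPs or checkpoints per square kilometre."""
    return n_points / area_km2

# RMSE values reported in the text, converted to GSD multiples.
for rmse in (0.087, 0.041, 0.032, 0.015):
    print(f"{rmse:.3f} m = {error_in_gsd(rmse):.2f} GSD")
```

The same density helper reproduces the checkpoint density cited for Gomes et al.: 270 checkpoints over about 0.22 km² gives roughly 1227 checkpoints per km².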