Photogrammetry Using UAV-Mounted GNSS RTK: Georeferencing Strategies without GCPs

Georeferencing using ground control points (GCPs) is the most common strategy in photogrammetry modeling using unmanned aerial vehicle (UAV)-acquired imagery. With the increased availability of UAVs with onboard global navigation satellite system–real-time kinematic (GNSS RTK), georeferencing without GCPs is becoming a promising alternative. However, systematic elevation error remains a problem with this technique. We aimed to analyze the reasons for this systematic error and propose strategies for its elimination. Multiple flights differing in the flight altitude and image acquisition axis were performed at two real-world sites. A flight height of 100 m with a vertical (nadiral) image acquisition axis was considered primary, supplemented with flight altitudes of 75 m and 125 m with a vertical image acquisition axis and two flights at 100 m with oblique image acquisition axes (30◦ and 15◦). Each of these flights was performed twice to produce a full double grid. Models were reconstructed from individual flights and their combinations. The elevation error from individual flights or even combinations yielded systematic elevation errors of up to several decimeters. This error was linearly dependent on the deviation of the focal length from the reference value. A combination of two flights at the same altitude (with nadiral and oblique image acquisition) was capable of reducing the systematic elevation error to less than 0.03 m. This study is the first to demonstrate the linear dependence between the systematic elevation error of the models based only on the onboard GNSS RTK data and the deviation in the determined internal orientation parameters (focal length). In addition, we have shown that a combination of two flights with different image acquisition axes can eliminate this systematic error even in real-world conditions and that georeferencing without GCPs is, therefore, a feasible alternative to the use of GCPs.


Introduction
UAV photogrammetry combined with the structure from motion (SfM) technique is a well-established method for the mapping and creation of digital terrain models (DTMs), digital elevation models (DEMs), and/or other spatial models. This technique is often used in mining [1][2][3], for monitoring of various natural phenomena and geohazards [4,5], for the detection of agricultural crops/trees [6][7][8], dam and riverbed erosion [9], modeling topographic features [10], updating cadastral data [11], solar irradiation estimates [12], etc. SfM has become so popular that it is currently also used besides ground and UAV photogrammetry in mobile measurement systems [13]; even attempts at creating DTMs using a mobile phone have been reported [14][15][16]. UAV photogrammetry simplifies the work, makes it faster, and improves the quality, although the resulting model accuracy depends on many circumstances, such as the configuration and number of ground control points (GCPs) [17,18], camera pitch [19,20], image overlap [21], flight trajectory [22], camera calibration method [23,24], software used for reconstruction [25], or the quality of global navigation satellite system (GNSS) signal processing [26][27][28]. As well as the point cloud, other outputs, such as orthomosaics, are also widely used [29]. Photogrammetry outcomes are usually evaluated using independent laser scanning or control points (CPs), the position of which is measured by ground surveys, which facilitate the assessment of model deformations [30][31][32]. Correct determination of the elements of internal and external orientation, traditionally performed using GCPs, is crucial for creating an accurate photogrammetric model. At present, the use of UAVs equipped with an onboard global navigation satellite system-real-time kinematic (GNSS RTK) receiver is on the rise. This equipment used to be prohibitively expensive but, lately, it has become more affordable. DJI Phantom 4 RTK multicopter is an example of such a low-end UAV. The knowledge of the camera position during image acquisition (with centimeter accuracy) has been suggested to be able to fully substitute the presence of GCPs for georeferencing of the photogrammetric model [33]. Elimination of the need for GCPs would be a great advantage, potentially making the measurement simpler and/or cheaper; in inaccessible areas, it could be crucial for even making the measurement possible. The resulting model accuracy should correspond to that of the GNSS RTK; the standard deviation should, therefore, not exceed several centimeters. Many studies (e.g., [33]) used such an approach, with satisfactory results. Other studies, however, reported that this approach, combined with the calculation of internal orientation elements when solving bundle adjustment, i.e., without a known camera calibration, yields results systematically shifted in the elevation axis that may be, in some instances, significantly higher than the expected accuracy [34][35][36]. The expected accuracy of the resulting point cloud is 1-2x the ground sampling distance (GSD) [37]; nevertheless, in our previous research, the systematic error was as high as tens of centimeters despite a ground sampling distance (GSD) of 0.03 m [34]. Similar results were presented, for example, by Forlani et al. [38]. The association of such high inaccuracy with the incorrect determination of the focal length (f) was suggested in those studies. Additionally, the systematic error was shown to be more or less random as it differed even between repeated flights on the same site with the same conditions (the same UAV and processed in the same way) [34]. A small number of GCPs [39], or even a single one [34], was, however, shown to significantly reduce this problem. Several studies also suggested methods that should suppress or even remove this phenomenon. Besides the aforementioned use of a minimum number of GCPs, the use of oblique images is another promising alternative [20,36,40]. The presented study aims to propose and test strategies enabling the acquisition of a quality photogrammetric model without the high systematic elevation error while avoiding the use of GCPs. Proof that the source of the error indeed lies in the incorrect focal length determination would show the path to resolving this problem-choosing a flight configuration ensuring a sufficiently accurate calculation of internal orientation elements. We performed image acquisition at two study sites with various flight parameters to independently evaluate the results associated with different external conditions. By processing this acquired imagery in various combinations, we aimed to find a working strategy (configuration) for safe and accurate use of the GNSS RTK-based approach without the systematic elevation error.

Data Acquisition
To facilitate the generalization of the results of this study, the experiment was performed in two study sites. The first site-Brownfield-had only a minimum of dense undergrowth, with low buildings, and concrete and natural surfaces without monochromatic areas (Figure 1). The other site-Rural-was characterized by continuous rapeseed fields alternating with more or less dense forest and shrubs ( Figure 2). These two sites differ with respect to the present surfaces and their properties. Figures 1 and 2 show the point clouds in colors indicating the reliability of the individual points acquired through SfM in Agisoft Metashape (confidence), with lower numbers indicating lower reliability (confidence; possible range 1-255). The Brownfield site can be, therefore, considered highly suitable for SfM modeling, while the Rural site can be considered problematic, which is particularly true for the part of the site that is covered by dense tree vegetation.  The DJI Phantom 4 RTK UAV mounted with a camera equipped with an FC6310R lens (f = 8.8 mm), resolution of 4864 × 3648 pixels, and with a pixel size of 2.61 × 2.61 µm (total price approx. EUR 6000), was used for image acquisition. The GNSS RTK receiver was connected to the CZEPOS permanent reference station network.
The primary flight altitude was set to 100 m above ground with a vertical image acquisition axis and ground sampling distance (GSD) of 0.03 m; at the same altitude, flights with the image acquisition axis angled by 15 • and 30 • from the vertical direction were performed. In addition, flights at altitudes of 75 m and 125 m above ground with a nadiral imagery acquisition axis were performed ( Figure 3).
Each flight was performed with a gridded flight plan; two perpendicular flights were performed for each flight setup (forming a double grid, see Figure 4).
Altogether, 10 flights with 75% front and side overlaps were performed. Table 1 shows the parameters of individual flights with the hereinafter used designations and numbers of usable images.  Points that subsequently served, depending on the calculation method, as either control points or ground control points were stabilized at each of the sites. Where possible, these were marked out as a cross painted with a contrast matte spray ( Figure 5 left). Where this was not possible, especially in the undergrowth, wooden boards with black and white targets were used ( Figure 5, right) and stabilized using a 10 cm long nail. The dimensions of the crosses/targets were approx. 0.40 m × 0.40 m.
(Ground) control points were distributed as evenly as possible throughout both areas. In all, 25 points were stabilized at the Brownfield site and 30 at the Rural site ( Figure 6). The (ground) control points were surveyed using a GNSS RTK Trimble Geo XR receiver with a Zephyr 2 antenna connected into the CZEPOS permanent reference station network (czepos.cuzk.cz). Measurement of each control point was taken three times (before, between, and after UAV flights) for detecting potential errors or variations caused, e.g., by the change of the configuration or satellite availability. The expected nominal accuracy of each coordinate was 0.03 m.

Data Processing
Both GNSS RTK measurements (UAV onboard and ground receiver) were processed using the same methods. First, the terrestrial GNSS RTK measurements were exported from the receiver in the WGS 84 coordinate system (latitude, longitude, ellipsoidal height). Similarly, the spatial position (in the same coordinate system) was extracted from the GNSS RTK data-containing images using Exiftool. The offset between the GNSS receiver antenna reference point (ARP) and the camera center (CC) was automatically considered by the software. Subsequently, all data were converted into the Czech national coordinate positioning system (S-JTSK, System of Unified Trigonometric Cadastral Network) and the Bpv vertical datum (Balt after adjustment) using the EasyTransform software (http: //adjustsolutions.cz/easytransform/, accessed on 1 March 2021) to ensure that the same algorithm was used on all data and thereby to eliminate potential systematic errors that could occur as a result of different transformation algorithms. The three measurements taken for each point by the terrestrial GNSS RTK receiver were used for the calculation of the standard deviation. Then, images were processed in Agisoft Metashape 1.6.1 using the structure from motion calculation method (SfM) with the custom settings listed in Table 2. The UAV was not equipped with a professional metric camera. Although precalibration is generally recommended, the stability of the parameters in non-metric UAVmounted cameras cannot be fully ascertained and other studies have shown that laboratory calibration does not provide better results than the method used in this study [41,42]. The interior orientation parameters were, therefore, calculated in the usual way.
In all, five duplicate flights were performed at each site. The geometries of the imagery from individual flights differed (the bundles intersected at different points due to different flight altitudes and/or camera angles). Each flight (i.e., 10 flights at each site) was processed separately. In addition, joint processing of the two duplicate flights was also performed (i.e., 5 for each site), which allowed investigation of whether a simple increase in the number of images can improve the accuracy.
Each of the variants described above was calculated with onboard GNSS RTK data only (All_RTK) and with a combination of the onboard GNSS RTK receiver data with GCP coordinates (All_Combined).
The processing sequence was the same in all flights-alignment, sparse cloud computation, system optimization, dense cloud calculation. Dense cloud was subsequently manually cropped along the convex envelope of GCPs and filtered in CloudCompare 2.11.0 using the CSF filter to remove trees and shrubs that could potentially cause uncertainty in comparisons of the dense clouds.
Subsequently, parameters of the regression and coefficient of determination describing the relationship between the difference in the focal length f (acquired from the respective flight and from all available data from the respective site, i.e., the All_Combined calculation variant) and the (mean) systematic error in the entire model elevation were calculated for each site. Similarly, the correlation between the systematic error in model elevation and the principal point offset difference (Cx, Cy) was also calculated. The aim was to show the association between the elevation error and the calculation of internal orientation elements.

The Accuracy of the GNSS RTK Ground Geodetic Survey
The accuracy of the GCP/checkpoints survey was evaluated through a calculation of standard deviations in individual coordinates S X , S Y , S H from the replicates detailed in Table 3. The results confirm the expected accuracy of 0.03 m in each coordinate.

Elevation Errors and Related Parameters
As shown, e.g., in our previous study [34], the elevation component of the resulting model is the most problematic one when using direct georeferencing based only on onboard GNSS RTK measurements. Therefore, the mean difference in elevation of the control points between the onboard GNSS RTK and the geodetic survey was calculated to obtain the systematic error (Tables 4 and 5). The residual error was then characterized by the standard deviation, the total error is described by the root mean square error (RMS). Similarly, residual systematic error between the recorded and adjusted camera elevations was calculated for the camera coordinates; the mean difference for individual flights was always equal to zero, and the standard deviation did not exceed 0.03 m (approx. 1x GSD), which confirms that the coordinates recorded during flights were correct. Tables 4 and 5 also detail other parameters, including the focal length f and the coordinates of the principal point offset Cx and Cy. The first two rows of each table represent the reference values calculated from all images using all available images together with the measured GCP coordinates (All_Combined) and using GNSS RTK coordinates only (All_RTK).
These results indicate that systematic (average) elevation error in individual flights is highly variable, with values as high as 0.85 m in some flights (Brownfield 75 • (100 m) −1 ). It is also apparent that even results derived from two flights at the same height and with the same image acquisition angle differ (both between sites and within the same site). The standard deviation of elevation is approximately 0.03 m, which is in accordance with the expected GNSS RTK measurement accuracy.
The All_Combined variants demonstrate a very good agreement of all measured images and coordinates, i.e., that no outliers or erroneous measurements are present. The agreement of internal orientation elements is also very good (a bit worse between sites).
It should be also noted that a higher systematic elevation error was observed in flights where a higher difference between the focal length f calculated for the particular flight and the most likely value determined from the All_Combined calculation occurred. Similar conclusions can be made with respect to the coordinates of the principal point offset.  Tables 6 and 7 show the results of calculations performed using both corresponding flights. Obviously, the increase in the number of images from the two mutually perpendicular flights led to a reduction of the systematic (mean) elevation error, which is particularly true for the Brownfield site. However, it still exceeds the expected measurement accuracy.  Tables 8 and 9 detail the values of combined calculations. It is obvious that in the Brownfield site, basically any non-homogeneous combination of flight parameters improved the results to such a degree that the maximum mean elevation error did not exceed 0.05 m and the total error (RMS) of 0.053 m, which is still less than two GSDs. On the other hand, no such improvement was observed when data from flights performed at two different heights were combined at the Rural site; the systematic error still remained up to 0.4 m. However, the combination of the flight at 100 m and flights with oblique image acquisition improved the systematic errors; the improvement increased when increasing the image acquisition angle from the nadiral direction. The standard deviations are similar in all these cases, approximately 0.03 m.

Analysis of the Association between the Systematic Error in Elevation and the Deviation of the Focal Length f
The data detailed in Tables 4-9 were used for the calculation of the differences between the determined focal lengths and the reference value determined from the All-Combined calculation variant. The relationship is shown in Figures 7 and 8, along with the regression coefficients and determination coefficient (calculated in MS Excel).  Both the graphical record and the determination coefficient R 2 , which is in both cases close to 1 (0.97), prove a practically linear association of the systematic error and the focal length error, which confirms the hypothesis stated, for example, in our previous paper [34].
The linear regression parameters can be further interpreted. The constant element represents the independent constant error (mutual difference) between coordinates determined by the onboard UAV GNSS RTK receiver and the terrestrial GNSS RTK survey.
The detected size of the difference (max 0.017 m) corresponds to the declared accuracy of 0.03 m. The line direction can also be interpreted; using triangle similarity, the following simple equation can be derived: describing the geometric configuration of the regression line direction as h/f, where h is the flight altitude above the terrain and f is the focal length. The primary flight altitude in the performed experiments was always h = 100 m and the focal length f = 3685 pix, the ratio h/f was therefore 0.027 m/pix, which is very close to the experimentally determined h/f values (0.0271 for the Brownfield site and −0.0285 for the Rural site). The practically linear relationship between the constant chamber deviation and the systematic error in elevation proves that the inaccuracy of the internal orientation element calculation is indeed the source of that error. The focal length f determined during model calculation can, therefore, be used as an indicator of this error.
A similar approach to the calculation of the correlation coefficient was also applied to the coordinates of the principal point offset Cx and Cy. However, in this case, no reasonable correlation was found; there is, therefore, no direct relationship between the erroneous computation of the principal point offset and the systematic elevation error.

Comparison of Dense Clouds
Thus far, all comparisons and evaluations in this paper were performed using control points (CPs) in CloudCompare software. It is, therefore, necessary to show that this acquired information also has general validity, i.e., that it is also valid for the generated dense cloud point. For this purpose, the data were cropped, filtered to remove trees and shrubs, and the resulting cloud points were compared. The data from All_Combined calculations (utilizing all available data) were, again, used as the reference. Below, results of the comparisons of the All_Combined calculations to calculations from the primary flight providing the worse result from each location are shown. The comparison of the point clouds from the 100 m −1 flight and All_Combined dataset for Brownfield returned a systematic elevation error of 0.268 m with a standard deviation of elevation of 0.038 m, as illustrated by Figure 9. This systematic shift corresponds very well to the value calculated from control points (see Table 4). The comparison of the All_Combined data and 100m_2 flight for the Rural site illustrated in Figure 10 revealed a systematic error of 0.305 m with a standard deviation of 0.029 m. Again, this systematic shift corresponds very well to the value obtained from the control points (0.315 m; see Table 5). The last presented comparison focused on the difference between All_Combined and All_RTK point clouds, which demonstrates the practical agreement of the resulting point clouds; it is obvious that no deformation has occurred and the elevation difference is practically constant (the systematic shift is 0.002 m and the standard deviation is 0.006 m).
The comparisons of the point cloud elevations to the All_Combined variant detailed in Figures 9-11 clearly show that the average difference value corresponds to that calculated using control points, and the same can be said about the standard deviation. The results of analyses of the remaining point clouds were in agreement with this statement, which proves that the difference is indeed caused by a systematic shift of the point cloud in the nadiral direction without deformation in the horizontal plane.
It is also obvious that the All_RTK and All_Combined calculation variants produced practically identical results, which implies that if the internal orientation elements are determined correctly, the result is correct even if only onboard RTK data are used.

Discussion
Many studies have tested the use of UAV photogrammetry with onboard GNSS RTK for preparing a point cloud representing a DTM and subsequent calculations. Although, in principle, it is possible to perform the whole calculation without the use of GCPs, practical testing revealed that simultaneous calculation of the internal and external orientation elements may, in some cases, lead to a systematic elevation error, although all measurements are correct. This elevation error is random in a way, as it differed even in duplicate flights with the same configuration and the same UAV. This is in accordance with the findings by other authors (e.g., [34,36,38,39]), regardless of whether a fixed-wing or rotary-wing UAV was used.
James and Robson described in 2014 a so-called doming effect when creating a photogrammetric model solely from photographs without the use of reference points. This was not observed in our paper, which can be explained by the fact that the current algorithms used in Agisoft Metashape already consider the camera coordinates during construction of the model [43].
A detailed analysis of the results revealed that incorrect calculation of the internal orientation elements, particularly of the focal length f, is the most likely reason for this problem (e.g., [34,38,40]). Various strategies to deal with this problem have been proposed, such as the use of oblique images (e.g., [19,36,40]), including a small number of GCPs (e.g., [36,39,40,44]) or camera pre-calibration (e.g., [34,42]). Disregarding the use of GCPs (a reliable solution which, however, negates the principal goal of this study, i.e., the simplification of the measurement by avoiding the use of GCPs), it is necessary to try to propose a measurement strategy (flight configuration, processing method, other techniques) to prevent the systematic error. In this study, we aimed to prove the association between the systematic elevation error and erroneous determination of the internal orientation elements and to test various strategies for eliminating the systematic elevation error. We have shown experimentally in two sites (33 models calculated at each site) that the systematic error in elevation and deviation of the focal length are practically linearly dependent, with a coefficient of determination of more than 0.96. The regression coefficients also correspond very well to the relationship between the flight altitude and focal length (Equation (1)). This implies that it is necessary to adopt a strategy ensuring the best possible determination of the internal orientation elements. Here, we must also note that the correct camera calibration is indeed a principal problem of photogrammetry. There are methods that can be used for this purpose, but they usually rely on the use of GCPs and a very specific image acquisition configuration. Additionally, it is also well known that pre/post-calibration, i.e., calibration independent of the particular flight, brings (when using UAVs with non-metric cameras) poorer results than the calibration performed within the scope of the particular flight [45]. It is, therefore, necessary to choose approaches that are actually feasible from the perspective of the camera mounted on the UAV and that can be simply implemented in the image acquisition flight. Here, we combined the primary flight (100 m altitude, nadiral image acquisition) with imagery acquired from other flights. The results of the evaluation of such combined calculations revealed that: 1.
Performing duplicate flights, even if the second flight is perpendicular to the first one (double-grid), brings only a minor improvement; however, the accuracy still remains above the expected limit of 1-2 GSD. In some cases, it may still fail with a systematic shift of up to 0.18 m.

2.
Geometrically different combinations (i.e., the primary flight combined with flights at other altitudes or with different camera angles) led to a significant improvement. This was especially apparent at the Brownfield site where any of these combinations led to the expected accuracy (elevation difference below 0.05 m). Still, the error at the Rural site remained in some instances as high as 0.4 m. It is, therefore, obvious that the quality and selection of the key points for image matching affect the quality of the calibration.

3.
The best results were obtained from the combinations of the primary flight and flights with oblique image acquisition. This was the only strategy that worked well at both sites in all tested combinations. The variant with the higher angle (30 • from the vertical direction) provided the best results, with even the worst systematic error not exceeding 0.03 m (1 GSD).
In our experiment, only relatively small camera angles (15 • and 30 • from the vertical direction) were used to prevent disruption of image alignment, which could pose problems in rugged terrain (i.e., terrain with sloped surfaces). Obviously, the higher the difference in the camera angle, the greater are the differences between the appearance of the same area in the images and, therefore, the more difficult the image matching. Some studies [36,40] used a greater angle (45 • ) but these were performed on an "ideal" flat terrain. The possible angles and their effect on the resulting accuracy should be subject to further research.

Conclusions
UAVs equipped with onboard GNSS RTK receivers are becoming financially more accessible. Their usage for SfM modeling without the need for GCPs seems to be a natural path to take; however, the usual flight configuration with nadiral image acquisition and selfcalibration often produces results with significant systematic elevation error. In this study, we have proved that the principal cause for this is the incorrect determination of the internal orientation parameters as the resulting elevation error is practically directly proportional to the error in the determination of the focal length (coefficient of determination of 0.96). This was also confirmed by the numerical agreement with the geometrical relationship.
To be able to use an onboard GNSS RTK receiver for direct georeferencing, it is, therefore, necessary to ensure correct calibration of the internal orientation elements; where the use of GCPs and/or accurate camera calibration is not possible or feasible, the results can be improved by adjusting the flight geometry and calculation method. We have tested strategies of joint calculations of two flights that were (i) geometrically identical and (ii) geometrically different. A major difference was revealed between sites; at the site with surfaces suitable for SfM, all tested non-homogeneous flight combinations yielded satisfactory results and the systematic error was reduced to approx. 1 GSD; on the contrary, at the Rural site, covered predominantly with rapeseed, shrubs, trees, and other vegetation, combining different flight altitudes did not result in sufficient improvement. However, combinations of two flights at the same altitude with different camera acquisition axes (nadiral and oblique) performed very well. The combination with the higher difference in acquisition angle (i.e., of nadiral and 30 deg from nadiral image acquisition axes) performed best, capable of reducing the systematic elevation error to approx. 1 GSD even in the Rural area with a surface highly unsuitable for photogrammetry.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.