Indirect UAV Strip Georeferencing by On-Board GNSS Data under Poor Satellite Coverage

Abstract: The so-called Real Time Kinematic (RTK) option, which allows one to determine with cm-level accuracy the Unmanned Aerial Vehicle (UAV) camera position at shooting time, is also being made available on medium- or low-cost drones. It can be foreseen that a sizeable number of UAV surveys will soon be performed (almost) without Ground Control Points (GCP). However, obstacles to Global Navigation Satellite Systems (GNSS) signal at the optimal flight altitude might prevent accurate retrieval of camera station positions, e.g., in narrow gorges. In such cases, the master block can be georeferenced by tying it to an (auxiliary) block flown at higher altitude, where the GNSS signal is not impeded. To prove the point in a worst case scenario, but under controlled conditions, an experiment was devised. A single strip about 700 m long, surveyed by a multi-copter at 30 m relative flight height, was referenced with cm-level accuracy by joint adjustment with a block flown at 100 m relative flight height, acquired by a fixed-wing UAV provided with RTK option. The joint block orientation was repeated with or without GCP and with pre-calibrated or self-calibrated camera parameters. Accuracy on ground was assessed on a fair number of Check Points (CP). The results show that, even without GCP, the precision is effectively transferred from the auxiliary block projection centres to the object point horizontal coordinates and, with a pre-calibrated camera, also to the elevations.


UAV-Photogrammetry
Photogrammetric applications of UAV have been expanding relentlessly in the last decade. Drones available on the market span a large range of capabilities and characteristics, so users can find the best compromise between price and performance, from specialized applications to general purpose surveys; see [1,2] for two comprehensive reviews. In the following, we restrict ourselves to the so-called micro-UAV, either fixed-wing or rotary-wings, with Maximum Take Off Weight below 2 kg.
Though still too restrictive in the opinion of many users and professionals, regulations on the commercial use of drones [3] are being revised frequently by the flight authorities to reflect the dramatic evolution of drone features and capabilities and the push towards allowing operation also in the so-called critical areas. On the one hand, this led to the development of new on-board devices (for instance, safety equipment such as anti-collision systems) with an overall improvement of product standards [3]. On the other hand, some limits of normal visual line of sight (VLOS) operations can be overcome by applying extended VLOS (EVLOS) and beyond VLOS (BVLOS) operations.

Flight Planning
Over decades of practice and theoretical studies, aerial photogrammetry with large format film cameras developed a full set of rules (on strip forward and side overlaps, baselength-to-height ratios and GCP density and distribution) to minimize the number of images necessary to fulfil map production tolerances [15]. Many years after the advent of digital aerial cameras with non-standard sensor formats [16], on the one hand, and of digital photogrammetry and SfM, on the other hand, no comparable set of established rules has yet emerged. This is even truer in planning UAV photogrammetric flights, where the variety of camera formats and focal lengths, as well as of survey scenarios and goals, is larger. Rather than minimizing the number of images, the goal of flight planning is today ensuring a sufficient degree of multi-image coverage, to increase image matching reliability and possibly avoid gaps and occlusions. The GSD, rather than the image scale, is the main parameter to consider in planning [16]; large overlaps (up to 70-90% forward and 60-80% sidelap) are normally adopted.
Increasingly, UAV are employed in surveys of complex landscapes with large height differences, as in geomorphological studies [17] or rock face stability analysis [18] and open pit surveys [19]. Flight planning becomes more complex than in standard nadiral blocks, as not only the 2D shape of the survey area is to be considered, to guarantee uniformity of precision. Indeed, increasingly oblique images are being included in the block [17,20-22] not only to improve camera calibration (see Section 1.4), but also to improve ground point precision and landscape reconstruction completeness [17].
Though not yet formally employing design techniques normally used in close range surveys [23] or in next-best-view [24] and exploration [25] problems, full exploitation of multi-copter flying capabilities in complex landscapes is pushing flight design in that direction.

Georeferencing and Control of UAV Photogrammetric Blocks
In most cases, UAV photogrammetric block georeferencing and control is today obtained by the so-called indirect determination of the exterior orientation (EO) parameters, i.e., by extracting and matching tie points with SfM, measuring GCP coordinates and running a Bundle Block Adjustment (BBA) [15].
GCP measurement is currently the bottleneck of UAV Photogrammetry. Placing and measuring targets by GNSS is in many cases fast and efficient, but not always so in rough or unsafe terrain or with dense low vegetation or finally in forested areas. Using the camera positions geotagged by the UAV navigation system might be the only solution in difficult environments, though such data accuracy, typically 1-3 m in horizontal coordinates and 2-3 times worse in elevation, might not be sufficient for the survey requirements, especially for repeat (periodic) surveys. In [26] it has been shown that, under condition of a geometrically strong block (high overlap, nadir and oblique images, cross strips) and of a carefully calibrated camera, a fair reconstruction of the topography (of the object shape) can be achieved, with rough georeferencing by the (inaccurate) on-board navigation system. However, proper Remote Sens. 2019, 11,1765 3 of 20 georeferencing (especially whenever elevations and terrain slope are necessary) is to be provided using GCP or other task-specific auxiliary information.
A large number of fixed-wing UAV on the market offer today the so-called RTK option, where an on-board multi-frequency GNSS receiver can determine the camera position at shooting time with cm-level accuracy thanks to differential corrections sent by a master station, or, through the ground control station, by a Continuously Operating Reference Station (CORS) network [11]. Recently, low- or medium-cost multi-copters with the same capabilities have also been announced or introduced [27].
Blocks flown with RTK-enabled drones can be georeferenced with an extended BBA where, besides the tie points coordinates, the camera station positions as well as the coordinates of GCPs (if available) are included as observations with pre-defined precision [28]. This process, sometimes improperly referred to as Direct Georeferencing, has been formerly employed in aerial photogrammetry, variously named as GPS-supported or GPS-assisted Aerial Triangulation [28-30]. In the following, it will be referred to as GNSS-AT for short. Direct Georeferencing proper is achieved when the EO parameters are computed by Direct Sensor Orientation (DSO), processing GNSS and inertial measurements collected by on-board sensors [31]. In DSO there is no need for tie points nor GCP: in principle the photogrammetric workflow can begin straight from the dense point cloud generation, skipping SfM and BBA altogether. DSO is however not so common even in manned aerial photogrammetry, where Inertial Measurement Units (IMU) attached to the cameras are much more accurate than those available in UAVs. Indeed, as SfM significantly improves the attitude data through tie point matching, the EO parameters are still obtained with a BBA that includes camera attitude and position from IMU/GNSS observations as well as image observations; GCP might also be added for camera and system calibration purposes. This technique is called Integrated Sensor Orientation (ISO); GNSS-AT can be seen as a special case of ISO, without IMU data.
Due to payload constraints, UAV IMUs are typically based on MEMS (MicroElectroMechanical Systems) technology, whose attitude data are currently still of insufficient quality, especially in yaw [32,33]. As GNSS-AT is simpler to implement on the hardware side and accurate enough, the real benefits brought by attitude data compared to GNSS-AT are limited to corridor mapping. In this kind of survey, because the camera projection centers are roughly aligned, single strips cannot be reliably oriented by GNSS-AT, as poor determinability of the roll angle leads to a nearly rank-deficient normal equation system in the BBA.
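The roll indeterminacy of a single strip can be illustrated with a small numerical sketch. It is not taken from the paper: the station spacing, the strip separation and the 1 degree roll are hypothetical values chosen only for illustration. The sketch rotates the camera stations about the strip axis: when the stations are collinear they do not move, so GNSS positions alone carry no information on that rotation, whereas a second parallel strip makes the rotation observable.

```python
import math

def rotate_about_axis(p, origin, axis, angle):
    """Rotate point p about the line through `origin` with unit direction `axis`
    (Rodrigues rotation formula)."""
    v = [p[i] - origin[i] for i in range(3)]
    c, s = math.cos(angle), math.sin(angle)
    dot = sum(axis[i] * v[i] for i in range(3))
    cross = [axis[1] * v[2] - axis[2] * v[1],
             axis[2] * v[0] - axis[0] * v[2],
             axis[0] * v[1] - axis[1] * v[0]]
    return [origin[i] + v[i] * c + cross[i] * s + axis[i] * dot * (1 - c)
            for i in range(3)]

def max_shift(stations, origin, axis, angle):
    """Largest displacement of any camera station caused by the rotation."""
    return max(math.dist(p, rotate_about_axis(p, origin, axis, angle))
               for p in stations)

# Hypothetical geometry: strip along x at 30 m height, one station every 10 m.
single_strip = [(10.0 * i, 0.0, 30.0) for i in range(8)]
# Same strip plus a second, parallel one 20 m to the side (hypothetical).
two_strips = single_strip + [(10.0 * i, 20.0, 30.0) for i in range(8)]

origin = (0.0, 0.0, 30.0)   # a point on the strip axis
axis = (1.0, 0.0, 0.0)      # strip (flight) direction
roll = math.radians(1.0)    # hypothetical roll error of the whole strip

shift_single = max_shift(single_strip, origin, axis, roll)
shift_double = max_shift(two_strips, origin, axis, roll)
print(f"single strip: {shift_single:.6f} m, two strips: {shift_double:.6f} m")
```

In a real strip the projection centers are only roughly aligned, so the rotation is weakly rather than strictly unobservable, which is why the text speaks of a nearly rank-deficient system.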

Camera Calibration
The camera interior orientation (IO) parameters must be accurately determined, otherwise residual errors in such parameters, though partly absorbed by the EO parameters, may translate to systematic errors on ground coordinates [20]. To counter this, convergent (oblique) imaging has been shown to be effective [26,34]. To avoid projective coupling with the EO parameters, especially for nadiral imaging, a pre-calibration is advisable. However, as the consumer cameras employed in UAVs exhibit not-so-stable IO parameters, self-calibration is routinely used in practice. Conditions for successful self-calibration in the BBA of UAV blocks (GCP distribution, flight pattern, nadiral and oblique imaging, etc.) are discussed in [7,35-38]. Camera calibration is even more critical in blocks oriented with GNSS-AT as opposed to GCP, as residual errors from calibration are more likely to affect primarily ground point coordinates rather than the EO parameters [37-39]. In practice, therefore, the alternative is between a pre-calibration on-site (on-the-job calibration) [26] and a self-calibrating BBA with GCP [40,41].

Paper Objectives and Previous Work on the Topic
High resolution UAV photogrammetric surveys are sometimes needed in places where only a multi-copter, often under careful manual control, can be used. Think for instance of narrow canyons, gorges or of streams or creeks lined by high trees or close to high rock faces. Such situations are not uncommon in surveys for road safety checks from rock fall [18] or in maintenance works, geomorphological studies, erosion studies of gullies [42,43], as well as in surveys in mountain areas subject to debris flow [44]. Placing and measuring GCP in such circumstances is normally difficult, often dangerous or outright impossible. To add to the difficulties, even for an RTK-equipped multi-copter, successfully receiving and processing the GNSS signal would be hard, as not enough satellites might be in sight. In such cases, frequent cycle slips, poor Position Dilution of Precision (PDOP) or strong multi-path might affect the positioning accuracy of the camera stations, and so block georeferencing accuracy.
Even if the GNSS signal is available, the narrowness of a safe flight corridor might prevent flying several parallel strips. As the single strip case cannot be handled by GNSS-AT, due to the above-mentioned poor determination of the strip roll angle, an alternative must be found. This scenario has similarities with corridor mapping, where a (sequence of) single strip(s) is sufficient to map the area of interest. As using GCP in corridor mapping dramatically decreases the efficiency of the survey pipeline, several alternatives have been proposed, all based on inertial and satellite data collection as well as SfM: in [32], ISO is applied to single UAV strips; recognizing the limited accuracy and the sensitivity to systematic errors of MEMS inertial sensors, in [45], the so-called relative aerial control is proposed to add constraints to BBA in corridor mapping, a technique also tested in [46]. More recently, a general approach based on dynamic networks to tightly integrate image, inertial and GNSS observations has been proposed and implemented in [33]. Finally, in [41], GNSS-AT is applied in corridor mapping, adding oblique- and nadir-looking strips to increase imaging geometry strength. Moreover, to exploit the multi-copter flexibility in camera pointing, lever arm estimation is included in the BBA. A common feature of these approaches is that they assume continuous GNSS signal coverage or signal outages lasting for a short time, so that inertial data, possibly combined with SfM, can still supply a solution for the EO parameters.
An alternative to the above-mentioned approaches, pursued in this paper, is combining two blocks flown at different elevations by an RTK-equipped multi-rotor. While in the lower block (master) the GNSS signal is unlikely to be received, the higher (auxiliary) block should have better chances. Flying higher, the sky visibility improves and so will the likelihood that more GNSS satellites can be traced and that the GNSS signal can be received continuously. Except with really vertical walls, normally flying higher in a valley allows several parallel strips to be acquired, avoiding running into the single strip near deficiency condition.
For this strategy to succeed, the lower and higher blocks have to be effectively connected by tie points, and the uncertainty propagation from the camera stations of the higher flight to the object points of the lower flight should not exceed the accuracy requirements of the survey.
Satisfying the former condition depends on the extent to which image descriptors used in feature extraction by SfM algorithms are actually invariant to image scale and ground resolution and so can reliably and densely be found across both image sets.
As far as the latter condition is concerned, the answer is not straightforward, as many parameters are involved, the main one being reasonably the ratio between the GSD of the two flights.
To gain a better understanding of the issues at stake, an experiment was carried out under a "worst case scenario" for the proposed method, i.e., trying to georeference a single strip with a minimal configuration of the auxiliary block. To ensure a fair amount of CP for the accuracy checks, the test was set in an environment where ground control could be easily provided, rather than in a mountain site: the single-strip master block was flown over a road bridge crossing a dry riverbed while the higher-altitude auxiliary block, made of several parallel and cross strips, was flown over a larger area across the bridge. The uncertainty propagation from the higher camera projection centers to the ground was assessed on a number of CP distributed over the bridge and on the riverbed. The results show that the indirect orientation of the master block is indeed feasible but its accuracy depends, as can be expected, on the auxiliary block strength. The accuracy loss with respect to a traditional GCP-based block orientation can be estimated in the 30% to 50% range both in horizontal coordinates and elevation.
The paper is structured as follows: Section 2 (Materials and Methods) describes the test site, the UAV features, the reference network survey and finally the experiment organization, in particular camera calibration and the auxiliary-master joint adjustment configurations; Section 3 (Results) reports the RMSE on CP found in the various tests; Section 4 (Discussion) discusses the significance of the results and foresees additional investigations on some still open questions. Finally, Section 5 draws the conclusions and perspectives.

Test Site Description
The surveyed bridge belongs to a national road that crosses River Taro at (44°29′22″ N; 10°13′28″ E), a few kilometers west of the city of Parma (Italy). Including the ramps, the bridge is about 850 m long and about 8 m wide. The bridge is about 10.5 m higher than the riverbed, made of an alluvial plain with deposits of silt, sands and gravel, mostly dry at the survey period (September 2018). Both riversides and about one third of the northern bridge side are lined by high trees (see Figure 1).

Reference Network and GCP
To provide an independent network to evaluate the georeferencing accuracy of GNSS-AT, a total of 60 signalized targets were deployed and surveyed: 31 over the bridge (road surface and parapet) and 29 distributed on the riverbed and on a factory service area on the river west bank.
The GCP and the CP coordinates were determined by TS measurements from a network of 12 stations. Moreover, 26 of the targets were surveyed twice in Network RTK mode with a Leica GS14 and a Geomax Zenith 35 Pro receiver, in order to provide double points for the connection of the GNSS network to the TS network.
The GNSS positions of the targets as well as of the higher flight camera stations (see Section 2.3), determined in the national reference frame ETRF2000(2008), were converted to a local Cartesian reference system centered at mid bridge, with origin on the reference ellipsoid, z axis along the ellipsoid normal and y axis parallel to the north axis of UTM zone 32N of the ETRS89 datum.
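A conversion of this kind can be sketched as follows. This is a minimal illustration, not the procedure actually used in the survey: it assumes the GRS80 ellipsoid, takes the bridge coordinates quoted in the test site description as a nominal origin, and exposes the rotation that aligns the y axis with the UTM grid north as a user-supplied parameter (`grid_rot_deg`, a hypothetical name) rather than computing it.

```python
import math

# GRS80 ellipsoid constants (assumed here as the ETRS89 reference ellipsoid).
A = 6378137.0
F = 1.0 / 298.257222101
E2 = F * (2.0 - F)

def geodetic_to_ecef(lat_deg, lon_deg, h):
    """Geodetic latitude, longitude (degrees) and ellipsoidal height (m) to ECEF (m)."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    n = A / math.sqrt(1.0 - E2 * math.sin(lat) ** 2)  # prime vertical radius
    x = (n + h) * math.cos(lat) * math.cos(lon)
    y = (n + h) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - E2) + h) * math.sin(lat)
    return x, y, z

def to_local(lat_deg, lon_deg, h, lat0_deg, lon0_deg, grid_rot_deg=0.0):
    """Local Cartesian coordinates with origin on the ellipsoid at (lat0, lon0),
    z along the ellipsoid normal. `grid_rot_deg` is the clockwise angle from
    true north to the desired y axis (0 keeps y on true north)."""
    x0, y0, z0 = geodetic_to_ecef(lat0_deg, lon0_deg, 0.0)
    x, y, z = geodetic_to_ecef(lat_deg, lon_deg, h)
    dx, dy, dz = x - x0, y - y0, z - z0
    lat0, lon0 = math.radians(lat0_deg), math.radians(lon0_deg)
    # East, north, up components of the ECEF difference vector.
    e = -math.sin(lon0) * dx + math.cos(lon0) * dy
    n = (-math.sin(lat0) * math.cos(lon0) * dx
         - math.sin(lat0) * math.sin(lon0) * dy
         + math.cos(lat0) * dz)
    u = (math.cos(lat0) * math.cos(lon0) * dx
         + math.cos(lat0) * math.sin(lon0) * dy
         + math.sin(lat0) * dz)
    # Rotation about z so that y points along the chosen north direction.
    g = math.radians(grid_rot_deg)
    return (e * math.cos(g) - n * math.sin(g),
            e * math.sin(g) + n * math.cos(g),
            u)

# Nominal origin: bridge coordinates from the test site description.
LAT0 = 44.0 + 29.0 / 60.0 + 22.0 / 3600.0
LON0 = 10.0 + 13.0 / 60.0 + 28.0 / 3600.0

origin_local = to_local(LAT0, LON0, 0.0, LAT0, LON0)
above_origin = to_local(LAT0, LON0, 10.0, LAT0, LON0)
one_arcsec_north = to_local(LAT0 + 1.0 / 3600.0, LON0, 0.0, LAT0, LON0)
print(origin_local, above_origin, one_arcsec_north)
```

Over an area a few hundred metres wide such a system is practically free from map projection distortions, which is convenient for a joint adjustment of total station and GNSS observations.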
The TS network was adjusted in this reference system, using as known points with accuracy of 1 cm in horizontal coordinates and 1.5 cm in elevation the GNSS positions expressed in the local reference system. The Root Mean Square (RMS) of the residuals on the GNSS coordinates turned out to be 1 cm for each horizontal coordinate and 1.2 cm for the elevation. From the network adjustment, the RMS of the estimated precisions of all network points in the local system turned out to be 8 mm in X,Y and 7 mm in Z.

Survey Flights
Even though, as mentioned in Section 1.5, a single RTK-equipped multi-rotor could have acquired both the master and the auxiliary block, in our case we had to use two different UAVs. Indeed, we had available only an RTK-equipped fixed-wing senseFly eBee and a multi-rotor DJI Phantom4 Pro, not provided with the RTK option. The former was therefore used for the auxiliary flight while the latter was necessary for the high-resolution survey of the bridge (the master block), as only a multi-rotor could ensure adequate forward overlap for the low-elevation single strip over the bridge.
The eBee RTK is equipped with a 20 Mpx compact S.O.D.A. camera with 29 mm nominal focal length (35 mm equivalent). The RGB camera (resolution 5472 × 3648 pixels, pixel size 2.4 µm) acquires nadiral images with exposure parameters set automatically. The on-board receiver can process L1 and L2 GPS and GLONASS data and receive the differential corrections from the master station or from the control center of a CORS network via the flight control software and a ground radio modem. Camera positions are stored in the image metadata as geo-tags as well as in the flight log. A Geomax Zenith 35 Pro, set on a benchmark at the eastern bridge end, was used as master station.
The DJI Phantom4 Pro is equipped with a 20 Mpx CMOS sensor with 24 mm nominal focal length (35 mm equivalent). The FC6310 RGB camera (resolution 5472 × 3078 pixels, pixel size 2.5 µm) is mounted on a stabilized gimbal with controllable pitch range from −90° to +30°. The single frequency on-board GNSS receiver is fit only for navigation purposes, not for accurate georeferencing.
Four flights were executed, two with the eBee and two with the Phantom4; the main parameters of each flight are shown in Table 1. The high-resolution bridge survey flight is made of a single strip, flown manually with the Phantom4, with an average 80% forward overlap at ground level (see Figure 2); while still under manual control and without landing, a second single strip was executed, at a higher flying height but with a slightly larger forward overlap (as a matter of fact, however, the overlap is far from constant in both strips). Finally, two flights were executed in automatic mode with the eBee, about half an hour apart from each other. The former is made of 11 strips, flown along the bridge direction (roughly in east-west direction), with forward and side overlap of about 50% and 70% respectively (covered area: about 740 m × 370 m). The latter consists of 7 strips, flown across the bridge, just east of bridge center, almost at the same height as the previous flight and with roughly the same overlaps (covered area: about 370 m × 220 m). The two eBee flights were combined in a single block (see Figure 3) and will be referred to as the eBee block in the following. Due to proximity to Parma Airport, the activity was authorized by air traffic control and all flights were made under coordination of the control tower.
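As a rough cross-check of the flight geometry, the ground sampling distance (GSD) of the two cameras can be estimated from the specifications quoted above. The sketch below rests on several assumptions that are not stated in the text: the physical focal length is derived from the 35 mm equivalent one through the ratio of sensor diagonals, the nominal relative heights of 100 m (eBee) and 30 m (Phantom4) apply, the terrain is flat, and the shorter image side lies along the flight direction. The flight parameters in Table 1 should take precedence over these estimates.

```python
import math

# Diagonal of the 36 mm x 24 mm format behind "35 mm equivalent" focal lengths.
FULL_FRAME_DIAG_MM = math.hypot(36.0, 24.0)

def real_focal_mm(f_equiv_mm, width_px, height_px, pixel_um):
    """Physical focal length from the 35 mm equivalent value
    (diagonal-ratio convention, an assumption)."""
    sensor_diag_mm = math.hypot(width_px, height_px) * pixel_um / 1000.0
    return f_equiv_mm * sensor_diag_mm / FULL_FRAME_DIAG_MM

def gsd_m(pixel_um, focal_mm, height_m):
    """GSD in metres of a nadiral image over flat terrain."""
    return (pixel_um * 1e-6) * height_m / (focal_mm * 1e-3)

def base_m(forward_overlap, along_track_px, gsd):
    """Distance between consecutive exposures for a given forward overlap."""
    return (1.0 - forward_overlap) * along_track_px * gsd

# Camera data as quoted in the text; heights are the nominal relative ones.
f_ebee = real_focal_mm(29.0, 5472, 3648, 2.4)
f_p4 = real_focal_mm(24.0, 5472, 3078, 2.5)
gsd_ebee = gsd_m(2.4, f_ebee, 100.0)
gsd_p4 = gsd_m(2.5, f_p4, 30.0)
gsd_ratio = gsd_ebee / gsd_p4
# Assumes the 3078-pixel side is along track and 80% forward overlap.
base_p4 = base_m(0.80, 3078, gsd_p4)

print(f"eBee: f = {f_ebee:.1f} mm, GSD = {100 * gsd_ebee:.2f} cm")
print(f"Phantom4: f = {f_p4:.1f} mm, GSD = {100 * gsd_p4:.2f} cm")
print(f"GSD ratio = {gsd_ratio:.2f}, Phantom4 base = {base_p4:.1f} m")
```

Under these assumptions the auxiliary block has a GSD of roughly 2.3 cm and the master strip of roughly 0.9 cm, i.e., a ratio of about 2.6, which is the quantity indicated in Section 1.5 as the main factor in the uncertainty propagation between the two flights.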

Photogrammetric Data Processing
Photogrammetric data processing was executed with the commercial package PhotoScan (PS) version 1.2.6, build 2834, by Agisoft LLC, St. Petersburg, Russia. Block orientation is performed with SfM algorithms in an arbitrary reference frame. A Helmert transformation is computed from the arbitrary reference to the object reference system, based on the GCP coordinates, or, in the case of GNSS-AT, on the camera center positions, loaded directly from the image geotags or from a file.
A global optimization stage is then executed that minimizes the sum of the reprojection error and of the GCP and/or the camera station coordinate residuals; camera parameters can be estimated by self-calibration or just applied if the camera has been pre-calibrated. Each GCP coordinate and each camera position can be assigned a specific a-priori precision; otherwise, default values can be assigned. In the tests, based also on previous experience [40], the following default standard deviations were assigned: 1 pixel to tie point image coordinates (automatically or manually extracted and matched), 5 mm to the GCP coordinates and 3 cm to the eBee camera station coordinates.
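The role of the a-priori standard deviations can be made explicit by writing the optimization as a weighted least squares problem. The expression below is a generic textbook form of the GNSS-AT objective, given here only as an illustration: the internal formulation used by PhotoScan is not documented in this paper, all symbols are introduced for this sketch, and the offset between GNSS antenna and camera projection centre is neglected.

```latex
\Phi =
\sum_{i,j} \frac{\lVert \mathbf{x}_{ij} - \pi(\mathbf{c}, \mathbf{R}_j, \mathbf{X}^{0}_{j}, \mathbf{X}_i) \rVert^{2}}{\sigma_{\mathrm{img}}^{2}}
+ \sum_{k} \frac{\lVert \mathbf{X}_k - \mathbf{X}^{\mathrm{GCP}}_{k} \rVert^{2}}{\sigma_{\mathrm{GCP}}^{2}}
+ \sum_{j \in \mathrm{eBee}} \frac{\lVert \mathbf{X}^{0}_{j} - \mathbf{X}^{\mathrm{GNSS}}_{j} \rVert^{2}}{\sigma_{\mathrm{GNSS}}^{2}}
\;\rightarrow\; \min
```

Here $\mathbf{x}_{ij}$ is the image observation of object point $\mathbf{X}_i$ in image $j$, $\pi$ is the camera projection with interior parameters $\mathbf{c}$, rotation $\mathbf{R}_j$ and projection centre $\mathbf{X}^{0}_{j}$, and the values adopted in the tests correspond to $\sigma_{\mathrm{img}} = 1$ pixel, $\sigma_{\mathrm{GCP}} = 5$ mm and $\sigma_{\mathrm{GNSS}} = 3$ cm. In the joint adjustments without GCP the second term vanishes, and the third term involves only the eBee images, since the Phantom4 camera positions are not accurate enough to be used as observations.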

Test Description
The goal of the experiments is to find out whether GNSS-AT can be applied indirectly, i.e., also in cases where the UAV collecting the master block images is not RTK-equipped, or, due to site characteristics, the GNSS signal cannot be received reliably, or, finally, the master block is just made of a single strip. In such cases an auxiliary block, flown at higher elevation by a UAV with an onboard RTK GNSS receiver, could be employed to georeference the master block.
To this aim, the experiment foresaw several stages, comparing different eBee camera calibration parameter sets and flight combinations in terms of accuracy on the CP. Each stage is briefly described in the following, stating first its objectives.

O1. To evaluate the GNSS-AT orientation accuracy of the standalone eBee block with camera parameters obtained from on-site pre-calibration or from self-calibration.
Though the expected accuracy of RTK-equipped UAV blocks has already been reported by several studies [40,47] and can therefore be reasonably foreseen, it is still worth checking the eBee block accuracy prior to its combination with the Phantom4 strip, investigating the alternatives for calibration discussed in Section 1.4. An on-site camera pre-calibration was executed by imaging a small test-field, located on the riverbed just north of the bridge (see Figure 1). A subset of 18 eBee images, 9 from the east-west flight and 9 from the north-south flight (see Figure 4), was selected for the calibration, in order to keep the effort commensurate to that of the survey. By keeping fixed all the test-field GCP or just one GCP at a time, four calibration parameter sets were estimated. Finally, four self-calibrating BBA were also performed fixing each time a single GCP, located at mid bridge, or at one of the bridge ends, or on the riverbed, to find out whether GNSS-AT orientation accuracy depends on the single GCP location.
These calibration parameter sets were then used in the GNSS-AT adjustments of the O4 and O5 stages.

Figure 4. The 18-image pre-calibration block extracted from the two eBee flights (9 from each) and the location of the 12 GCP used in the pre-calibration adjustment (see also light blue dots in Figure 1).

O2. To assess the accuracy of the Phantom4 single strip at 30 m oriented with GCP.
A traditional GCP-based BBA of the Phantom4 single strip was performed fixing the coordinates of 17 GCP: 13 placed on both parapets along the bridge and 4 located on the riverbed in pairs, upstream and downstream with respect to the bridge. As a single strip is far from the ideal geometry for a reliable camera calibration, first self-calibration was applied in the joint adjustment of both the

O3. To find out whether SfM effectively connects master and auxiliary blocks.
To this aim, an analysis of the number of tie points connections and their distribution across the master and the auxiliary blocks was performed.

O4. To assess the accuracy of the master-auxiliary joint adjustment using different camera calibration techniques, and compare it to the accuracy of the GCP-controlled master block.
To this aim, different joint adjustments of the eBee and of the Phantom4 30 m blocks were performed with GNSS-AT, with camera calibration parameters either pre-calibrated or self-calibrated. Finally, the RMSE on CP were computed and compared to those obtained in O2.

O5. To find the best (minimum) auxiliary block configuration still ensuring the stability of the master block.
As flying the auxiliary block adds time and cost to the survey, it is worth searching for its minimal effective configuration. To this aim, the eBee and Phantom4 blocks were jointly processed, each time progressively removing from the former the longitudinal and/or the cross strips, and measuring the accuracy decrease on the CP. Strip removal was applied both to the whole eBee block and to the east-west eBee flight only (i.e., without cross strips). The longitudinal strips were removed in pairs, one from each side with respect to the bridge axis, from 10 to 2, i.e., progressively moving towards the quasi-singularity condition as the distance between the two flight lines farthest apart decreases.
In the full block case, all cross strips were maintained until only two longitudinal strips remained. Then the cross strips were first shortened from 10 to 6 images each and then progressively removed (from 7 to 4 to 2), always maintaining the first and last ones (i.e., the cross strips farthest apart from each other).
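The reduction sequence described above can be sketched as a short enumeration. This is only one reading of the text: the function names, the tuple layout and the label for blocks without cross strips are hypothetical, and the set of configurations actually tested may differ in detail.

```python
def auxiliary_configurations(max_long=10, n_cross=7, imgs_full=10, imgs_short=6):
    """Return (longitudinal strips, cross strips, images per cross strip) tuples.

    Assumed reading of the text: longitudinal strips are removed in pairs
    (10 -> 2) with all cross strips kept; then the cross strips are shortened
    (10 -> 6 images) and finally thinned out (7 -> 4 -> 2).
    """
    configs = []
    # Remove longitudinal strips in pairs while keeping every cross strip.
    for n_long in range(max_long, 1, -2):
        configs.append((n_long, n_cross, imgs_full))
    # With two longitudinal strips left, first shorten the cross strips ...
    configs.append((2, n_cross, imgs_short))
    # ... then remove cross strips, always keeping the first and last ones.
    for remaining in (4, 2):
        configs.append((2, remaining, imgs_short))
    return configs


def label(n_long, n_cross, imgs):
    """Build an id such as '10 s + 7 × 10 cr' (the plain 'N s' form is hypothetical)."""
    if n_cross == 0:
        return f"{n_long} s"
    return f"{n_long} s + {n_cross} × {imgs} cr"


for cfg in auxiliary_configurations():
    print(label(*cfg))
```

The case without cross strips would correspond to the first loop only, with the number of cross strips set to zero.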
The accuracy of block orientation was evaluated by computing the differences between the coordinates of the CP estimated in every BBA and those estimated from the topographic network.
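As a minimal sketch of this accuracy check, the per-coordinate RMSE can be computed from the differences between BBA-estimated and network-derived CP coordinates. The function names and the sample coordinates below are hypothetical and serve only to illustrate the computation.

```python
import math


def rmse_per_axis(estimated, reference):
    """Return (rmse_x, rmse_y, rmse_z) of estimated minus reference coordinates."""
    if len(estimated) != len(reference) or not estimated:
        raise ValueError("need two non-empty coordinate lists of equal length")
    n = len(estimated)
    sums = [0.0, 0.0, 0.0]
    for est, ref in zip(estimated, reference):
        for axis in range(3):
            diff = est[axis] - ref[axis]
            sums[axis] += diff * diff
    return tuple(math.sqrt(s / n) for s in sums)


def horizontal_rmse(rmse_x, rmse_y):
    """Combine the two horizontal components into a single horizontal RMSE."""
    return math.hypot(rmse_x, rmse_y)


# Hypothetical CP coordinates in metres: from a BBA and from the topographic network.
bba = [(10.02, 5.00, 100.01), (20.00, 4.98, 100.03)]
network = [(10.00, 5.00, 100.00), (20.00, 5.00, 100.00)]

rx, ry, rz = rmse_per_axis(bba, network)
print(rx, ry, rz, horizontal_rmse(rx, ry))
```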
In the accuracy checks on single blocks (eBee block, Phantom4 strip) the targets were measured in all images. By contrast, in the accuracy checks for the eBee and Phantom4 joint blocks, targets were measured only in the Phantom4 images, as the goal is to assess the accuracy of ground points from the indirectly-oriented master block. Measuring the CP in the eBee images as well would have bypassed the uncertainty propagation through the master-auxiliary connection established by the across-block tie points.
The above remarks imply that the number of CP varies according to the examined block configuration: 59 or 60 in the eBee block alone, 43 or 44 in the joint eBee and Phantom4 blocks, 27 in the GCP-controlled Phantom4 strip.
In Table 2

Results
The experiment results are presented in the following, according to the list of goals O1-O5 illustrated in the previous section.

Comparison between Pre- and Self-Calibration of the eBee Camera (O1)
Using only the images of the small block shown in Figure 4, the pre-calibration of the eBee camera was repeated four times: fixing all 12 GCP available in the area or just one at a time (GCP number 706, 707 and 713). Table 3 reports the estimated calibration parameters and, for the single GCP cases, the RMSE on the nine remaining GCP of the block, used as CP. The estimated precision of the IO parameters (not shown in Table 3) is best in the 12 GCP case, at about 0.1 pixels for all three IO elements, while it ranges from 0.1 to 0.4 pixels in the single GCP cases for the principal distance f and the principal point coordinates cx, cy. The precision of the lens distortion parameters k1-k3 and p1-p2 remains almost the same in all cases. Differences between parameter estimates in different adjustments are in most cases significant according to the t-test. Correlations between parameters exceed 0.9 only among the k1-k3 parameters and reach 0.82 between f and cy in the single GCP case.
The accuracy on the 9 CP coordinates changes as the single GCP fixed changes; overall, however, the total error remains the same, at about 4 cm.
To evaluate the alternative between pre- and self-calibration, the eBee block orientation accuracy was estimated on 57 CP, using first the pre-calibration parameter sets of Table 3 (without any GCP fixed) and later applying single-GCP self-calibration in the BBA. In the latter case, five different GCP were used: the same three used in pre-calibration (GCP 706, 707 and 713), all located at mid bridge, and GCP A721 and NS484, located at the two opposite bridge ends. Finally, an additional self-calibrating BBA was executed without fixing any GCP. The RMSE on CP of the different BBA are shown in Table 4. As the eBee flight GSD is about 2.3 cm, the horizontal error is fairly constant at around 0.86 GSD, while the elevation error ranges from 0.95 to 1.65 GSD, a better result compared to the P4 30 flight with GCP (see next Section 3.2). The larger GSD, which makes target collimation more uncertain, is more than compensated by a stronger camera network and by the even distribution of the control on all camera stations. The influence of the employed calibration set seems negligible as far as the horizontal coordinates are concerned, while it varies in elevation, with a worst-to-best ratio of 1.7. This is likely due to the actual accuracy of the GCP: the best result is indeed obtained with the 12 GCP parameter set, where some averaging of survey errors can be expected. The last row of Table 4 shows that, with self-calibration, large (systematic) errors in elevation should be expected if no GCP are fixed, due to correlations between camera parameters and ground coordinates, as already noticed in [11,40,41]. Which method is the best between single GCP pre-calibration and single GCP self-calibration is not apparent from the results.
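For readers who prefer metric units, the GSD-relative figures quoted above can be converted as follows. This is only a worked conversion using the approximate 2.3 cm GSD stated in the text, so the results are rounded.

```latex
\begin{align*}
\text{horizontal:}\quad & 0.86 \times 2.3\ \text{cm} \approx 2.0\ \text{cm} \\
\text{elevation (best):}\quad & 0.95 \times 2.3\ \text{cm} \approx 2.2\ \text{cm} \\
\text{elevation (worst):}\quad & 1.65 \times 2.3\ \text{cm} \approx 3.8\ \text{cm} \\
\text{worst-to-best ratio:}\quad & 1.65 / 0.95 \approx 1.7
\end{align*}
```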

Accuracy of the Phantom4 30 m Strip Adjusted with GCP Only (O2)
The orientation accuracy of the P4 30 strip, adjusted fixing 17 GCP along the bridge, was measured on 27 CP. The adjustment was executed twice: with self-calibration and with the fixed calibration parameter set previously estimated by a GCP-controlled joint adjustment of the P4 30 and P4 70 strips. The mean reprojection error of the adjustment is in both cases about 0.55 pixels. The RMSE on the CP for the two adjustments are reported in Table 5. As the GSD of the P4 30 flight at bridge level is 0.9 cm, the overall horizontal accuracy and the elevation accuracy are both around 2 GSD. No influence on CP accuracy of the calibration set employed or estimated can be noticed. Fixing more GCP does not significantly improve the result.

Table 5 (surviving fragment; only the self-calibration row is recoverable, column headers are missing): Self-c 1.2 1.5 2.1

Tie Points Across Master and Auxiliary Blocks (O3)
Automatically extracted tie points shared among images of the two blocks are the key to running a successful joint orientation. An exhaustive analysis of the relative proportion of valid matches between same-block images and across-block images is beyond the scope of this paper, as it would be concerned primarily with the performance of image descriptors under image scale differences. However, Photoscan allows for a visualization and check of the matches of each image with those overlapping it. Figure 5 shows on the left-hand side the image Id and, in decreasing order, the total number of matches and the number of the validated ones for the image DJI_0145 of the P4 30 strip. Most of the matches obviously occur with the P4 30 images preceding and following the current one, and the number of matches decreases as the overlap decreases. The number of valid matches ranges from a few thousand (the limit is fixed to at most 4000 per image pair) to about one hundred. The number of matches with the eBee images is much smaller, and subject to strong variations even for consecutive images: up to four hundred at most and on average 50-80 only, shared with 4 to 12 images (image pairs with fewer than 10 valid matches are not considered in this analysis). On the right-hand side of Figure 5, valid matches are represented in blue while invalid ones, discarded by SfM in the orientation stage, are in red. Figure 5 shows an almost ideal case, where the (actually few) matches are fairly evenly distributed across the whole image format of the master (large scale) image, as the scene offers a number of details and texture.
As can be seen from Figure 6, referring to Phantom4 image 0031 at the eastern end of the bridge, the number of images sharing points and the number of ties are much smaller; additionally, tie points are concentrated on a small part of the image format. This inhomogeneity depends primarily, apart from the image content, on the lower degree of overlap of the auxiliary block along the block perimeter (see Figure 3).
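The per-image tally of cross-block matches discussed in this section can be sketched as below. The match counts and image names are hypothetical; only the threshold of 10 valid matches is taken from the text, and the function is an illustration, not part of Photoscan.

```python
from statistics import mean

MIN_VALID = 10  # image pairs with fewer valid matches are ignored, as in the text


def summarize_cross_block(valid_matches, min_valid=MIN_VALID):
    """Summarize valid matches between one master image and auxiliary images.

    valid_matches maps an auxiliary image id to its number of valid matches
    with the master image under examination.
    """
    kept = {img: n for img, n in valid_matches.items() if n >= min_valid}
    if not kept:
        return {"images": 0, "total": 0, "mean": 0.0, "max": 0}
    counts = list(kept.values())
    return {
        "images": len(counts),
        "total": sum(counts),
        "mean": mean(counts),
        "max": max(counts),
    }


# Hypothetical counts for a single master image.
example = {"eBee_A": 120, "eBee_B": 45, "eBee_C": 7, "eBee_D": 63}
summary = summarize_cross_block(example)
print(summary)
```

Comparing such summaries along the strip would highlight images, like the one in Figure 6, that are tied to the auxiliary block by few and poorly distributed points.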

Orientation Accuracy of the Master-Auxiliary Joint Block (O4)
The accuracy of the joint adjustment of the eBee block and of the P4 30 strip (O4), evaluated on 44 CP, is reported in Table 6 for different camera calibration settings in the BBA. More specifically, the P4 camera parameters were always estimated by self-calibration; for the eBee camera, by contrast, the same sequence of ten different settings as in Table 4 was applied. In the first four cases no GCP was fixed in the adjustment, as the parameter set was obtained by pre-calibration. In the next five cases, self-calibration was applied with a single GCP fixed. In the final one, self-calibration was applied without fixing any GCP.
Table 6. Accuracy of the joint adjustment of the eBee and P4 30 blocks: RMSE on 44 CP coordinates. No GCP was fixed in the four BBA with pre-calibrated parameters, while self-calibration was executed with a single GCP fixed in all but the last case.
The average accuracy of horizontal coordinates is 2.8 cm with pre-calibration and 2.4 cm with self-calibration. This is less accurate by about 50% and 30%, respectively, compared to the 1.9 cm of the GCP-adjusted P4 30 single strip. In elevation, self-calibration with the single GCP located at mid bridge performs the same as pre-calibration; the accuracy with respect to the GCP-adjusted P4 30 single strip is about 40% lower. However, self-calibration is clearly worse (70% less accurate) if the GCP is located at one of the bridge ends.

Search for the Best Auxiliary Block Configuration (O5)
Figure 7 summarizes the effects on the accuracy of the 44 CP of progressively reducing the number of auxiliary block strips, and of including (left hand side of the graph) or not including (right hand side of the graph) the cross strips. The different colour bars refer to the RMSE in cm for the coordinates in the along- and across-bridge directions and to the elevation. Each auxiliary block configuration tested is identified by an id composed of the number of active longitudinal (s) and cross (cr) strips and, for cross strips only, of the number of images per strip. For instance, "10 s + 7 × 10 cr" stands for an auxiliary block with 10 longitudinal strips and with 7 cross strips with 10 images each. Out of the four longitudinal strips closest to the bridge axis, the id "2 s_ int" and "2 s_ ext" refer respectively to the internal strip pair and the external one.
The BBA were executed with the pre-calibration parameter set computed with GCP 706 for the eBee camera and with self-calibration for the P4 camera; no GCP was fixed. If the cross strips are included (left hand side of Figure 7) the accuracy along and across bridge is similar and about constant as long as at least four strips and six cross strips are included; then the across error grows quickly. By contrast, without cross strips the accuracy across flight direction is always about 60-70% worse (and much more in the two strips "internal" case) than along flight direction. The elevation accuracy remains below 3 cm in almost all cases, with or without cross strips, even with just two longitudinal strips; with fewer cross strips, it degrades to about 4 cm.

Discussion
As far as the accuracy of GNSS-AT block orientation is concerned, the RMSE on 59 CP of Table 4 shows that the horizontal coordinate accuracy of the tie points of the eBee block, even if determined without a pre-calibrated camera and without any GCP (last row of Table 4), is quite good. However, a significant bias may affect the elevations, unless a pre-calibrated camera is used or at least one GCP is used to strengthen the self-calibrating BBA. Both these findings agree with the results presented in previous studies on GNSS-AT by the authors [11,40] and by Zhou [41] and Hugen [47]; in Jozkow [48] a large bias in elevation is also found in GNSS-AT self-calibration without GCP. On the other hand, Mian [49] found that fixing a single GCP for system calibration purposes in the BBA of a UAV block with IMU and GNSS data did not prevent a bias of about 11 cm from remaining in CP elevations. Finally, in [50], increasing the number of GCP from 0 to 4 improved the RMSE in elevation by just over 1 cm, from 6.7 to 5.4 cm, and adding 14 more GCP improved the RMSE only to 5.1 cm. Though no specific bias estimate is provided in the paper, it can therefore reasonably be inferred that just a small amount of elevation bias was present in this case; moreover, the lack of improvement from 4 to 18 GCP shows the strength that precise GNSS-determined camera stations convey to the block, as also found in the simulations of [51].
As far as on-site pre-calibration is concerned, the tests performed on a small block using a single GCP or a sizeable number of GCP (12) show variations in the calibration parameters of less than half a pixel for the principal point and less than one pixel for the principal distance (Table 3). Applying these different sets of parameters to the eBee block in the GNSS-AT BBA, the RMSE on the CP horizontal coordinates are fairly small, with differences from set to set smaller than 4 mm. In elevation the average error is 2.8 cm, while the differences are larger, up to 1.6 cm. They look related to the specific GCP rather than to the number of GCP used in pre-calibration and are therefore likely to originate from the particular measurement error in the image or object coordinates of the GCP fixed.
A larger pre-calibration test-field and a stronger imaging geometry would likely deliver more stable IO parameters; however, two points must be stressed: with GNSS-AT a few (even a single) GCP seem sufficient for effective self-calibration, as control from the camera positions is spread all over the block. Though this point deserves deeper investigation by simulations, a less demanding imaging geometry might therefore be enough for GNSS-AT camera calibration, unlike GCP-based calibration where oblique imaging is recommended [20,34]. In any case, as up-to-date camera parameters are necessary, only on-site (pre-)calibration represents a working alternative to self-calibration. However, there should be a balance between the time and effort required by the pre-calibration and that of the whole survey, when GNSS-AT is employed.
Zhou [41] also applied GNSS-AT to corridor mapping, using a rotary-wing RTK-equipped UAV capable of oblique imaging. Therefore, in addition to camera calibration, lever arm calibration is also considered, to exploit the flexibility of image acquisition. In the experiment, made of three nadir strips 600 m long, they used a pre-calibration flight with strong geometry (a combination of nadir strips at different altitudes and of oblique imaging). Then they applied to the survey flight the same IO parameter configurations as we have done: with self-calibration (with or without a single GCP fixed) and with camera parameters fixed (again, with and without a single GCP fixed). While in horizontal coordinates the accuracy on CP stays the same, at about 3.3 cm with a GSD of 1 cm, in the self-calibration without GCP case a 15 cm bias is found on elevations, as in our case. Fixing the GCP, the RMS is down to 1 cm. In the pre-calibrated case, using the single GCP or not does not change the RMS in elevation, which is slightly larger than that in the self-calibration case (2 cm). Our results are therefore in very good agreement with those in [41], both with respect to the alternative between pre-calibration and self-calibration and with respect to the need for a single GCP in case of self-calibration. The agreement applies also to accuracy on CP, as from Table 6 and Figure 7 it can be seen that our results are well comparable (slightly better in horizontal coordinates and slightly worse in elevation).
As this paper refers specifically to cases where GCP are difficult or impossible to place in the master block area, so preventing self-calibration with one or more GCP, the main finding of our experiment in this respect is that a small test-field, arranged in a convenient area near the survey site, seems sufficient for a camera pre-calibration that removes most, if not all, the bias in elevation from the GNSS-assisted survey flight.
Overall, the experiment shows that an auxiliary block, flown with an RTK-equipped UAV, can be successfully adjusted with the GNSS-AT technique to georeference and control a master block, flown at a lower elevation, without using GCP. The presented technique is likely to remain a special one, useful whenever the master block cannot be reliably georeferenced by GNSS-AT and surveying GCP is difficult or inconvenient. However, it should be noted that, as the RMSE on CP in Table 6 shows, the error propagation from the auxiliary to the master block is not too unfavourable, as an accuracy loss compared to the eBee block between 30% and 50% is registered in horizontal coordinates and none in elevation. In a similarly difficult environment, [33] proposed a new formulation of ISO, which foresees the tightly coupled integration of inertial, GNSS, and image observations. In their experiment they also consider the corridor case (actually made of three long strips, divided into two sections, one with good GNSS coverage and the other with no GNSS coverage). They show that, with the current poor IMU data quality in UAV, inertial navigation can hardly bridge GNSS gaps with the IMU data only (the elevation error grows very fast), while their method performs better, combining position and attitude data from IMU with SfM. However, for a strip of length comparable to that of our experiment, the RMS on CP is larger than 10 cm in horizontal coordinates and 6 cm in elevation. In comparison, the proposed approach is of course more demanding in operational terms but still more accurate even with a two-strip auxiliary block. Moreover, it works also for a single strip of (in principle) any length, as the stability is tied to the auxiliary block, while their method starts drifting anyway if the GNSS-denied section of the flight is too long. Finally, no additional sensors and software are required, as GNSS-AT is today implemented in all major SfM packages for UAV image processing.
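The 30% to 50% loss in horizontal accuracy mentioned in this paragraph can be retraced from the figures given in the O4 results, where the average horizontal RMSE of the joint block (2.8 cm with pre-calibration, 2.4 cm with self-calibration) is compared with the 1.9 cm of the GCP-adjusted P4 30 strip. The worked ratios below use that reference value and show that the quoted percentages are rounded.

```latex
\begin{align*}
\text{pre-calibration:}\quad & \frac{2.8}{1.9} \approx 1.47 \;\Rightarrow\; \text{about } 47\% \text{ larger RMSE (quoted as } 50\%\text{)} \\
\text{self-calibration:}\quad & \frac{2.4}{1.9} \approx 1.26 \;\Rightarrow\; \text{about } 26\% \text{ larger RMSE (quoted as } 30\%\text{)}
\end{align*}
```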
The number and spatial distribution of tie points common to master and auxiliary blocks is certainly a critical element for the successful transfer of georeference information and block deformation control. In this respect, the test setup presented in this paper is on the one hand a demanding case, as a long single strip (i.e., a weak configuration) was successfully oriented. On the other hand, as far as perspective differences between the two sets of images are concerned, the test area is not particularly demanding, as both the bridge and the riverbed are basically flat, while the viewing direction is nadiral for both cameras. Despite a relevant height difference of about 70 m between the two flights, the different camera focal lengths keep the ratio between the two GSD, which is the driving factor for image matching, at about 1:2.5. Further investigations on the limits of this ratio for master and auxiliary blocks might help to clarify how far this technique can be extended.
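The GSD ratio quoted above can be checked against the values reported in the paper (0.9 cm for the P4 30 strip at bridge level, about 2.3 cm for the eBee block). The sketch below is only an illustration with a hypothetical helper name; it also shows the ratio of the nominal relative flight heights (100 m and 30 m) for comparison.

```python
def gsd_ratio(gsd_master_cm, gsd_auxiliary_cm):
    """Return how many times coarser the auxiliary GSD is than the master GSD."""
    if gsd_master_cm <= 0 or gsd_auxiliary_cm <= 0:
        raise ValueError("GSD values must be positive")
    return gsd_auxiliary_cm / gsd_master_cm


# GSD values quoted in the Results (cm).
ratio = gsd_ratio(0.9, 2.3)

# Ratio of the nominal relative flight heights (m), for comparison only.
height_ratio = 100 / 30

print(f"GSD ratio 1:{ratio:.2f}, flight height ratio 1:{height_ratio:.2f}")
```

The GSD ratio is smaller than the height ratio, which is consistent with the remark that the different camera characteristics partly compensate for the height difference.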
As in previous tests [40], the position within the block of the single GCP used for self-calibration in GNSS-AT was found to be not relevant. However, this does not seem to be the case in the master-auxiliary joint block, where the GCP position affects the amount of elevation bias removed, with the most effective position being at mid-block. This difference is likely due to the weak single-strip geometry and therefore suggests using a pre-calibrated camera in the master-auxiliary case. No other study, to the best of our knowledge, reports on this point, however.
As far as the optimal configuration of the auxiliary block is concerned, our findings are obviously related to the specific case of the single strip, where both longitudinal and cross strips are necessary to ensure an accuracy not too far from that of a traditional adjustment.
More investigations are still necessary to assess the true applicability of the technique in a more demanding environment: think for instance of a narrow gorge, where the walls might be the main area of interest. In such cases, to ensure good connections between two blocks, both might need nadiral as well as oblique images. However, perspective and scale differences between the oblique images of the two flights might be too demanding for SfM to handle.

Conclusions
The results presented in the previous sections show that accurate georeferencing and control of a master block, even in the unfavourable case of a single strip, can be achieved by means of the joint adjustment with an auxiliary block, flown at a higher elevation with an RTK-equipped UAV. For a single strip about 700 m long, with 70 m height difference between master and auxiliary flight lines, a GSD of 0.9 cm and a strong auxiliary block, the accuracy verified on targets can be as good as 1.5 cm in each horizontal coordinate and less than 2.5 cm in elevation. With a weaker geometry of the auxiliary block the accuracy decreases, but could still be maintained below 4 cm for both coordinates.
The auxiliary block represents an obvious project overhead, which might be acceptable when cheaper or more efficient alternatives cannot be found. More investigations are necessary as far as its optimal configuration is concerned with different master block characteristics.
Another limit that the actual application of the technique in demanding environments may clarify is the maximum height difference between the master and auxiliary block, which in turn might depend on the ratio between the GSD of the two blocks.
However, on the one hand, the test results were obtained by a BBA with self-calibration without any GCP, as far as the horizontal coordinates accuracy is concerned. On the other hand, no significant bias could be found in the elevations using up-to-date camera calibration parameters. On-site pre-calibration with at least one GCP has proven to be adequate and on par or better than self-calibration, where one or more GCP are anyway necessary.
From the above remarks, it is clear that GCP are still necessary even with an RTK-equipped UAV. However, as more experimental evidence is gathered on the accuracy of the GNSS-supported GCP-free UAV photogrammetry, consistently confirming the high accuracy of horizontal coordinates and the limited amount of bias in elevation, a clearer picture of the future of this technique is emerging, particularly as far as medium-accuracy applications are concerned.
The strength conveyed to UAV blocks by the GNSS-determined camera projection centers is probably still underestimated. More investigations should be performed: from a theoretical standpoint, on residual calibration errors propagation; from a practical standpoint, on the requisites for an effective but affordable on-site calibration.
As we can expect a dramatic diffusion of RTK-equipped UAVs also to multi-copters, the proposed master-auxiliary technique is likely to remain a useful approach, to resort to in special cases. However, it shows the high degree of flexibility that RTK-endowed UAV and GNSS-AT offer today to the surveyor.