Benchmarking Different SfM-MVS Photogrammetric and iOS LiDAR Acquisition Methods for the Digital Preservation of a Short-Lived Excavation: A Case Study from an Area of Sinkhole-Related Subsidence

We are witnessing a digital revolution in geoscientific field data collection and data sharing, driven by the availability of low-cost sensor platforms capable of generating accurate surface reconstructions, as well as the proliferation of apps and repositories which can leverage their data products. Whilst the wider proliferation of 3D close-range remote sensing applications is welcome, improved accessibility often comes at the expense of model accuracy. To test the accuracy of consumer-grade close-range 3D model acquisition platforms commonly employed for geo-documentation, we have mapped a 20-m-wide trench using aerial and terrestrial photogrammetry, as well as iOS LiDAR. The latter was used to map the trench using both the 3D Scanner App and PIX4Dcatch applications. Comparative analysis suggests that only in optimal scenarios can geotagged field-based photographs alone result in models with acceptable scaling errors, though even in these cases, the orientation of the transformed model is not sufficiently accurate for most geoscientific applications requiring structural metric data. The apps tested for iOS LiDAR acquisition were able to produce accurately scaled models, though surface deformations caused by simultaneous localization and mapping (SLAM) errors are present. Finally, of the tested apps, PIX4Dcatch is the only iOS LiDAR acquisition tool able to produce correctly oriented models.


Introduction
Digital photogrammetry and LiDAR-based geospatial field data acquisition using smartphones and tablets is revolutionizing the use of close-range 3D remote sensing within the geosciences [1][2][3][4][5][6][7]. Commensurately, the rapid uptake of low-cost, readily deployable multi-sensor drones has extended the reach of such techniques, enabling nadir-view photogrammetric surveys of horizontal outcrops, as well as occlusion-free reconstructions of large vertical sections [8][9][10]. Despite the relative simplicity with which 3D surface reconstructions of geological exposures (i.e., virtual or digital outcrop models: [11][12][13][14][15][16][17][18][19]) can be acquired using such platforms, the reliability of the geospatial information extracted from their data products is typically unclear, particularly when survey-grade measurements are unavailable to calibrate and benchmark the resultant outcrop models. The deployed sensor platform's accuracy and precision in terms of position and orientation is often a key consideration for geoscientific applications of 3D surface reconstructions, with deviations in the resultant scale, geolocation and attitude of the generated models being deleterious to the quality of metric data extracted therefrom. Recently, Uradziński and Bakuła [20] have shown that under optimal conditions, dual-frequency receivers on smartphone devices allow geolocation with accuracies of a few tens of centimeters after post-processing carrier-phase correction, providing accurate ground control points (GCPs) for georeferencing and scaling. Analogously, other authors have demonstrated that 3D models can be satisfactorily oriented and scaled by utilizing smartphone camera pose information (i.e., the camera's extrinsic parameters) to register photogrammetry-derived models (e.g., [21][22][23]). In similitude, Corradetti et al. 
[4] obtained oriented and scaled 3D models using inertial measurement unit (IMU)-derived smartphone orientation data, with constraints on the device's (and thus the image's) major axis provided by a handheld gimbal. Whilst relatively streamlined in comparison to conventional GCP geolocation using survey-grade tools (i.e., differential GNSS or total station surveys), these methods still require a degree of setup in the field and significant post-processing. However, many casual users who routinely acquire photographs for close-range photogrammetry, and more recently scans with iOS LiDAR devices, do so without any predefined strategy for the georectification of the resultant 3D model, severely limiting its utility as a medium for quantitative geological analysis. Conversely, it is well established amongst geospatial specialists that the absence of a sound registration strategy can negatively impact results.
Using a case study of a trench that was recently excavated to probe historical subsidence in the proximity of an infilled sinkhole in the municipality of Enemonzo (Italy), we have investigated the utility of various acquisition strategies for the generation of correctly oriented and scaled 3D models. The geological interpretation of the study area has been presented elsewhere [56,57] and the reader is invited to consult the aforementioned works for further details. In summary, the area is characterized by the presence of several types of sinkholes [58,59], with the studied trench located just west of a sinkhole that manifested at the surface during the 1970s ([56] and the references cited therein) and reactivated in the 1980s and 2010s. This sinkhole is related to a Triassic evaporitic bedrock mantled by variably consolidated and loose Quaternary deposits, whose thickness varies across the area, from a few meters in the north to more than 60 m in the south [56,60]. Bedrock dissolution associated with groundwater flux is thought to have caused the collapse of the sinkholes [61,62] within the present study area.
A detailed description of the acquisition procedure and setup is presented in the Methods section below. In summary, we used three camera-equipped devices, namely a DJI Air 2S drone, a Nikon D5300 camera, and an iPhone 13 Pro, to generate structure-from-motion multi-view stereo (SfM-MVS) photogrammetric reconstructions of the trench. The model made from the DJI Air 2S dataset (hereafter named Air 2S) included a larger acquisition area, which, coupled with its superior nominal accuracy, provided a benchmark against which the ground-based surveys could be compared. The Air 2S model was also compared against a LiDAR-derived Digital Terrain Model (DTM), available at 1 m resolution, to check its vertical accuracy in relation to the world frame. Moreover, additional field-based surveys were performed using the embedded LiDAR sensor of the iPhone 13 Pro, via the 3D Scanner App and PIX4Dcatch apps, in order to evaluate their utility for geospatial site documentation.
The comparative analysis presented here indicates that only in optimal scenarios (i.e., when the accuracy of the geospatial positioning is intrinsically high) can geotagged field-based photographs alone result in models with acceptable scaling errors, though even in these cases the orientation of the transformed models is not sufficiently accurate for many geoscientific applications. Moreover, misalignment of the reconstructed scene is exacerbated when the acquisition is performed in a collinear fashion. The apps tested for iOS LiDAR acquisition were able to produce accurately scaled models. However, their resultant scene reconstructions exhibited surface deformations caused by simultaneous localization and mapping (SLAM) errors, which in turn may prove detrimental to the accuracy of structural measurements extracted from the model surfaces. Finally, of the tested apps, PIX4Dcatch is the only iOS LiDAR acquisition tool able to produce correctly oriented models.

Methods
The studied trench is ~20 m long, with an approximately east-west strike direction. The southern wall of the trench was cut vertically and prepared for the study, whereby gridlines demarcating 1 m² subregions were installed, providing a reference frame for the comparative analysis. Following the preparation of this framework, field surveys using the aforementioned remote sensing platforms were performed on 6 April 2022. Weather conditions on the day of the surveys were overcast, providing uniform diffuse light. This favored acquisition of the north-facing trench wall by minimizing the impact of shadowing upon the captured images and their resultant reconstructed scenes, thus limiting the impact of shifts in solar azimuth and zenith upon model quality.
The drone used for the acquisition was a DJI Air 2S (Table 1), which is equipped with an embedded GNSS positioning system (GPS, GLONASS, and Galileo constellations), a compass, and an inertial measurement unit (IMU). According to the manufacturer, the hovering accuracy of vertical and horizontal positioning with GNSS is ~0.5 m and ~1.5 m, respectively. The DJI Air 2S carries a 20-megapixel camera with a 1" CMOS sensor, mounted on a three-axis gimbal. A total of 226 photos were taken in JPG format (5472 × 3648 pixels) at 72 dpi, at distances of ~0.4 to 80 m from the scene in manual flight mode. Each photo's camera position (latitude, longitude, and altitude) and pose information (yaw, pitch, and roll angles) are automatically recorded. All photos taken with this platform have a ~0° roll angle, attributable to gimbal stabilization (e.g., [4]). This dataset includes aerial views of the trench, as well as the surrounding roads and buildings. A second set of 382 photographs was captured using a Nikon D5300 DSLR (Table 1), which records GPS geotags but no camera orientation metadata. The third set of photographs was taken using an iPhone 13 Pro (Table 1). This device is also equipped with a GNSS receiver (GPS, GLONASS, Galileo, QZSS, and BeiDou constellations), with the final location provided by the Apple Core Location framework, which combines GNSS geolocation measurements with data provided by Bluetooth and Wi-Fi networks when available. The iPhone 13 Pro embeds an IMU and a magnetometer able to provide orientation data; however, only the azimuthal orientation of the camera direction is preserved with each photo, in addition to its geographic coordinates and altitude. A total of 329 photographs at 12.2-megapixel resolution were acquired, with the majority captured at close range from inside the trench.
These three photographic datasets were processed independently in Agisoft Metashape Professional (version 1.8.1), a commercially available SfM-MVS photogrammetric reconstruction software package. Geographic coordinates obtained from each platform were converted to UTM zone 33N-WGS84 (EPSG: 32633) within Metashape. Exported point clouds were subjected to comparative analysis within CloudCompare, an open-source software package for point cloud processing and analysis [63]. The computation speed of this analysis was enhanced by leveraging a virtual machine on a Dell PowerEdge R7525 rack server hosted at the Department of Mathematics and Geosciences of the University of Trieste (Italy), equipped with an AMD EPYC™ 7F72 (beanTech, Udine, Italy) chipset and an NVIDIA GRID RTX8000P GPU architecture. Moreover, to further improve computation speed, all point clouds were decimated to 12 M points by random sampling within CloudCompare. This value was chosen arbitrarily and considered adequate to represent the reconstructed geometry. In CloudCompare, the comparative analysis was performed after manual point cloud alignment using a minimum of four non-collinear points. The results were visually inspected, and the alignment procedure was repeated when deemed unsatisfactory.
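The decimation step can be sketched in a few lines of Python; `decimate_random` is a hypothetical helper mimicking CloudCompare's random subsampling, not part of its API:

```python
import random

def decimate_random(points, target_count, seed=0):
    """Randomly subsample a point cloud to a fixed point count,
    mimicking CloudCompare's 'random' subsampling: every point has
    an equal chance of being kept, so large-scale geometry survives
    while memory and processing cost drop."""
    if target_count >= len(points):
        return list(points)
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    return rng.sample(points, target_count)

# Toy cloud of 100 (x, y, z) points decimated to 12
cloud = [(float(i), 0.0, 0.0) for i in range(100)]
subset = decimate_random(cloud, 12)
print(len(subset))  # 12
```

Random subsampling preserves the relative point density of the scene, unlike spatial subsampling, which enforces a minimum point spacing.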
In addition to its 12.2-megapixel digital camera, the iPhone 13 Pro is equipped with a built-in LiDAR scanner. This sensor acquires scene geometry from the simultaneous emission of 576 rebounded laser pulses and, under optimal conditions, has a range of up to 5 m [5,7]. Since the introduction of this sensor in the iPad Pro and iPhone 12 Pro in 2020, several apps have been developed to retrieve geospatial information in the form of point clouds and textured meshes [5,64]. In this work, we tested the 3D Scanner App (v. 1.9.8) and PIX4Dcatch (v. 1.12.0) apps. During each acquisition, the iPhone was mounted on a DJI OSMO 3 gimbal to limit the deleterious impact of abrupt movements upon the iPhone's IMU measurements, and hence reduce possible errors in the simultaneous localization and mapping (SLAM) required to generate the LiDAR models.
The 3D Scanner App is a free app for iOS LiDAR acquisition, which produces a textured mesh of the scanned scene as output. Several settings related to the resolution of the acquisition can be adjusted; in this work, we set the acquisition to (i) low confidence, (ii) 2.0 m range, (iii) no masking, and (iv) 8 mm resolution. The entire 3D Scanner App LiDAR survey required ~6 min, the relatively long scanning duration accommodating coverage of the entire trench (including the trench floor). The textured model output by the 3D Scanner App was processed in less than two minutes on the acquisition device at the field site. Moreover, the output models are natively scaled, thus providing metric information directly in the field, with the size of model features ascertained by selecting two points on the model surface within the 3D Scanner App. The textured meshes can be exported in the Wavefront *.obj format, while colored point clouds of the scene can be exported in the *.xyz format. During the acquisition, the app also captures low-resolution (2.7 MP) images of the scene at a frequency of ~2 Hz. These images are primarily used for generating texture maps and assigning RGB attributes to the point cloud, but they can also constitute a backup dataset from which a stand-alone SfM-MVS model can be generated.
The PIX4Dcatch app is part of a larger software suite that includes PIX4Dmapper, a popular SfM photogrammetric reconstruction package. The PIX4Dcatch app is free to use but requires a subscription to export data. In this work, the default acquisition settings of PIX4Dcatch were used, including an image overlap of 90%. With the device mounted on the DJI OSMO 3, we acquired the north-facing wall of the trench. The PIX4Dcatch app does not process the data on the smartphone but requires the LiDAR project to be uploaded to the PIX4Dcloud for processing (an upload of 1.51 GB was required for the test case in this study). In addition to the textured mesh and point cloud, this app stores 2.7 MP images of the scene, including their position (latitude, longitude, and altitude) and pose information (omega, phi, and kappa), which are directly readable when imported into third-party software (e.g., Metashape). Despite this flexibility, the LiDAR-generated depth maps can only be read if processed using PIX4Dcatch.

Processing Outline and Results
In total, 226 photos from the drone (56 nadir-view and 170 oblique images of the trench wall) were processed in Metashape, using high-accuracy alignment and high-quality densification settings. The resulting dense point cloud has ~111 million points and comprises the trench and its surrounding area (Figure 1a). Ostensibly due to the robust GNSS positioning data provided by the drone, coupled with the wider acquisition area, the resulting model appears reasonably georeferenced with respect to its horizontal components (Figure 1a). By contrast, comparison with a freely available 1 m resolution LiDAR-derived Digital Terrain Model (DTM) of the Friuli Venezia Giulia region (available at http://irdat.regione.fvg.it/CTRN/ricerca-cartografia/ (accessed on 15 July 2022)) revealed a significant vertical translation (Figure 1b). It should be noted that both datasets are framed in terms of altitude above sea level (ASL). The cloud-to-cloud distance between the DTM and the Air 2S model is relatively uniform throughout the area, at about 8 m (Figure 1c), indicating minimal angular deviation between the two reconstructions. It is worth noting that accurately positioned ground control points (GCPs) would be needed to obtain sub-centimeter accuracy estimates in both the horizontal and vertical directions and to check the consistency of the model throughout the investigated area (e.g., [65]). Nevertheless, in spite of this vertical shift, for the scope of this work we consider this dataset, without any vertical translation, as the benchmark against which the other models are compared. To enhance the reconstruction quality of the trench, we selected a smaller region and repeated the densification procedure in Metashape after disabling all aerial photographs, whilst keeping the pre-established alignment. 
This procedure resulted in a benchmark point cloud composed of ~124 million points (~0.47 points/mm² at the center of the scene), which was later decimated in CloudCompare, as described in the Methods section.
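A roughly uniform vertical offset of this kind can be estimated as a robust statistic of per-point elevation differences. A minimal sketch, assuming the two surfaces are sampled at matching horizontal positions (`vertical_offset` is an illustrative helper, not a CloudCompare function):

```python
import statistics

def vertical_offset(dtm_z, model_z):
    """Estimate a uniform vertical shift between two surfaces sampled
    at matching horizontal positions; the median resists local
    outliers such as vegetation or reconstruction noise."""
    return statistics.median(m - d for d, m in zip(dtm_z, model_z))

# Synthetic elevations (m ASL): the model sits 8 m below the DTM
dtm = [100.0, 101.2, 99.5, 100.8]
model = [z - 8.0 for z in dtm]
print(vertical_offset(dtm, model))  # -8.0
```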
Metashape is able to read the EXIF metadata tagged onto photographs, which in the case of the Air 2S dataset includes the gimbal yaw, pitch and roll angles. Note that in this study, the roll angle is effectively fixed at 0°, owing to gimbal stabilization. The iPhone dataset only records the image's direction with respect to true north, which Metashape automatically identifies as the yaw angle. The Nikon D5300 does record GNSS geolocation data but does not record camera orientation parameters.
In Metashape, photo-alignment based on the extrinsic parameters only takes into account location information, though orientation parameters can be utilized in post-processing. In the case of the Air 2S dataset, the use of the orientation parameters results in an anticlockwise rotation of the model around the world frame's vertical axis of ~3.4°. This value is close to the angular deviation between magnetic and true north at the site (i.e., a declination of 3.2°). Hence, this additional registration may prove useful in cases where the model needs to be aligned to magnetic instead of geographic north (e.g., the collection of orientation data from a model matched to equivalent field measurements collected using a compass or compass-clinometer).
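Such a north realignment amounts to rotating the registered coordinates about the world-frame vertical axis. A minimal sketch, with `rotate_z` as a hypothetical helper (the 3.2° value is the site declination quoted above):

```python
import math

def rotate_z(points, angle_deg):
    """Rotate points anticlockwise (seen from above) about the
    world-frame vertical axis, e.g., to shift a model from geographic
    to magnetic north by the local declination."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]

# A point 1 m due north of the origin, rotated by the 3.2 deg declination
p = rotate_z([(0.0, 1.0, 0.0)], 3.2)[0]
# p is (-sin(3.2 deg), cos(3.2 deg), 0.0): slightly west of north
```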
The iPhone alignment was produced using medium-accuracy alignment and high-quality densification settings. The resulting dense point cloud has ~123 million points (~0.64 points/mm² at the center of the scene). The use of the azimuthal orientation parameter associated with iPhone photographs for the model's alignment to magnetic north is troublesome. The camera azimuth with respect to north cannot be used alone, since the software assigns null values to the pitch and roll fields. Moreover, the north bearing may not match the yaw angle, for example in the case of portrait photos. Whilst we have not robustly tested the use of the iPhone orientation parameters, the model appears to match the orientation and scale of the Air 2S model when visually compared (Figure 2a), although a major vertical translation of ~5.5 m is observed (Figure 2b). This translation reduces to ~2.5 m ASL in real-world coordinates when the vertical shift between the Air 2S model and the LiDAR-derived DTM is taken into account. Closer inspection reveals that whilst the scaling between the two models is comparable, with a scaling factor of 0.995, a rotation of ~13° around the x-axis (which corresponds to the strike of the trench) is observed (Figure 2c).
Both models acquired directly using the iPhone's LiDAR sensor, through the 3D Scanner App and PIX4Dcatch apps, lack georeferencing and are instead registered within a local coordinate system. Consequently, a translation had to be applied to view both LiDAR models together with the benchmark model (Figure 3). Notably, the PIX4Dcatch model was correctly oriented. To test whether the PIX4Dcatch model's orientation coincided with the benchmark model by chance, we undertook a second survey at a different locality (not shown) and again found the model to be correctly oriented. In contrast, the 3D Scanner App model was misoriented, with the z-axis lying almost at 90° to its expected orientation (Figure 3). The x-axis was approximately parallel to the world-frame x-axis but inverted.
To test whether the erroneous registration related to an incorrect reading order of the registered x, y and z coordinates, the model was rotated 180° around the world-frame z-axis and 90° around the x-axis. The resulting model still deviated by ~30° from the world frame around the z-axis. Finally, scaling factors of 1.00123 and 1.00646 had to be applied to the 3D Scanner App and PIX4Dcatch models, respectively, to match the photogrammetric benchmark model.
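A re-registration test of this kind can be expressed as a composition of elementary rotation matrices applied to the model coordinates; the helpers below are illustrative, not part of any of the software packages used:

```python
import math

def rot_x(deg):
    """Rotation matrix about the x-axis by deg degrees."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_z(deg):
    """Rotation matrix about the z-axis by deg degrees."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def matmul(A, B):
    """3x3 matrix product A @ B (B applied first when rotating points)."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(R, p):
    """Rotate a single (x, y, z) point by matrix R."""
    return tuple(sum(R[i][k] * p[k] for k in range(3)) for i in range(3))

# The correction tried above: 180 deg about z, then 90 deg about x
R = matmul(rot_x(90.0), rot_z(180.0))
up = apply(R, (0.0, 0.0, 1.0))
# up maps onto (0, -1, 0), to floating-point rounding
```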
The photo-alignment of the Nikon DSLR generated model was also problematic. Despite using medium-accuracy settings in Metashape (in source preselection mode), the initial reconstruction failed after more than eight hours of processing time on the available computing hardware (the virtual machine described above). Thus, a second attempt to reconstruct the scene captured by the Nikon D5300 was made at low accuracy in source preselection mode, which was later reset to medium accuracy in estimated preselection mode for a secondary alignment. This resulted in a sparse point cloud of ~2.7 million points, which expanded to 334 million points (~0.18 points/mm² at the center of the scene) after dense reconstruction. The resulting model (hereafter termed the Nikon model) is poorly scaled and oriented. The scaling factor relative to the reference model was ~10, meaning the Nikon model was about ten times larger. 
The orientation of the model (not shown) was also arbitrary, with its z-axis lying almost perpendicular to the world-frame vertical axis.

Comparative Analysis in CloudCompare
After registering all test point clouds with the benchmark in CloudCompare, we meshed the benchmark model and then performed a point-cloud-to-mesh distance calculation for each test point cloud to investigate the occurrence of surface deformations (Figure 4). As shown in the histogram of Figure 4b, ~82% of the points belonging to the iPhone photogrammetric point cloud are within ±1.5 cm of the benchmark model's surface (and ~68% within ±1 cm), with the majority of outliers located on the floor of the trench. The Nikon point cloud only represented the north-facing wall of the trench and, in similitude to the iPhone point cloud, exhibited 85.6% of points within ±1.5 cm of the benchmark model (and ~72% of points within ±1 cm). For the Nikon model, some outliers were located at the eastern wall of the trench and on the western side of the north-facing wall. The iPhone LiDAR model captured using the 3D Scanner App covered the entirety of the trench. Data acquisition using the 3D Scanner App started from the area indicated by the green arrow in Figure 4e and followed an approximately anticlockwise transect, which tracked the sidewalls of the trench and the trench floor. After completing this circuit, the acquisition returned to the center of the scene, terminating at the location indicated by the dark-red arrow in Figure 4e. About 77% of the points of the 3D Scanner App point cloud are located within ±6 cm of the benchmark model's surface. In this case, most of the outliers are located on the western sector of the north-facing wall. Notably, this area coincides with the start and end of the survey transect used for the 3D Scanner App LiDAR acquisition. Finally, ~94% of the PIX4Dcatch point cloud lies within ±2.5 cm of the benchmark model, with outliers mainly present in the western sector of the acquisition area.
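The percentages quoted above summarize signed cloud-to-mesh distances against a tolerance; the computation can be sketched as follows (`fraction_within` is a hypothetical helper operating on distances exported from CloudCompare):

```python
def fraction_within(distances, tol):
    """Share of signed cloud-to-mesh distances falling within +/- tol,
    the statistic used to summarize each model's agreement with the
    benchmark surface."""
    return sum(1 for d in distances if abs(d) <= tol) / len(distances)

# Synthetic signed distances (m) from a cloud-to-mesh comparison
d = [0.004, -0.012, 0.020, -0.003, 0.070, 0.001, -0.009, 0.014]
print(fraction_within(d, 0.015))  # 0.75
```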

Orthomosaics
In Metashape, the 3D models of the photographic datasets (Air 2S, Nikon and iPhone) were also used to produce three orthomosaics from their associated textured meshes (Figure 5). It should be noted that the Air 2S orthomosaic in Figure 5 was produced after removing from the aerial survey, prior to texturing, all the nadir-view images oriented at an acute angle to the trench wall, which would otherwise introduce blurring into the resultant texture map. The resultant orthomosaics reached pixel sizes of 0.99, 1, and 0.635 mm/pixel for the Air 2S, Nikon and iPhone models, respectively. Interestingly, a former Air 2S orthomosaic produced prior to the removal of the nadir-view images (not shown) reached a pixel size of 2.96 mm/pixel. The orthomosaics in Figure 5 were exported using a fixed resolution.

Figure 4. The iPhone (photogrammetric) RGB-colored point cloud (a) and its distance to the benchmark model (b). The Nikon RGB-colored point cloud (c) and its distance to the benchmark model (d). The 3D Scanner App colored point cloud (e) and its distance to the benchmark model (f). The PIX4Dcatch app RGB-colored point cloud (g) and its distance to the benchmark model (h). Green and red arrows in (e) and (g) represent the first and last views of the acquisition. Gray arrows in (f) indicate artifact points related to simultaneous localization and mapping (SLAM) errors. The red, green, and blue Cartesian axes represent the east, north, and up directions, respectively.


The two textured meshes obtained from the 3D Scanner App and the PIX4Dcatch app were also used to generate two orthopanels using the LIME software suite [66] (Figure 5).

Discussion
Three SfM-MVS photogrammetry-based surveys using consumer-grade camera platforms and two iOS LiDAR-based apps were tested in this work to evaluate their ability to reproduce the geometry and optical signature of a typical geoscience and geoarchaeological field site (a trench transecting a sinkhole).

Scaling and Orientation Accuracy of SfM-MVS Models
The first of the three SfM-MVS photogrammetry-derived models (hereafter termed SfM models for parsimony) was generated from an aerial photographic survey using a DJI Air 2S drone (226 images), including photographs captured within the trench. The Air 2S dataset covered a much larger area (>1200 m²) than the compared surveys. This acquisition strategy, coupled with the excellent GNSS, IMU and stabilization capability of this device, resulted in the most accurately scaled and (horizontally) georeferenced of the SfM models, as evidenced by the aerial orthophoto derived from the model added as an overlay in Google Earth (Figure 1). Consequently, the Air 2S model was used as the benchmark against which to test the reconstruction quality of the remaining survey methods. It should be noted that the Air 2S model is vertically translated by ~8 m (Figure 1). The assumed fidelity of the Air 2S model is therefore an approximation, though its internal scaling and registration with respect to the horizontal axes of the world frame is likely reliable for the sake of comparison.
The use of photographs taken with the Nikon D5300, which carry low-accuracy (GPS) geotags and lack camera orientation metadata, resulted in a model with a scaling factor of about 10 and arbitrary orientation. Despite the larger quantity of photographs in the Nikon dataset (382 images) and the larger sensor resolution, the GPS positional error exceeded the size of the investigated area (>100 images had a position error >100 m), and it was only after 150 images had been taken that the signal error stabilized below 20 m. It is likely that the low accuracy of the dataset's geolocation information, in combination with the relatively high pixel count of the survey, contributed to the failed alignment encountered in Metashape during the first attempted reconstruction. The third SfM model was built from 329 photographs captured using an iPhone 13 Pro. The resultant 3D model was natively scaled, with a scale factor to the benchmark of ~0.995. This reveals that the multi-satellite GNSS receiver of the iPhone, used by the Apple Core Location framework to retrieve the camera position, provides reliable location data when large enough datasets are used. Nevertheless, recent single-point accuracy tests performed with the previous iPhone Pro model (the iPhone 12 Pro, [7]) evidenced a location accuracy within a few meters that stabilized within seconds. Further to this, the Apple Core Location framework can provide more accurate positioning than standard consumer-grade GNSS receivers when used within residential areas, by combining data provided by the GNSS receiver, Wi-Fi networks, and nearby Bluetooth devices. The device also supports the iBeacon micro-location system, enabling indoor navigation when available. 
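Given corresponding points on two registered models, a least-squares scale factor of this kind can be computed as the ratio of the RMS distances of each point set from its centroid (the scale step of a Horn/Umeyama similarity alignment); a minimal sketch with hypothetical data:

```python
import math

def scale_factor(src, dst):
    """Least-squares scale mapping src onto dst for corresponding
    points: the ratio of RMS distances from the respective centroids
    (the scale step of a Horn/Umeyama similarity alignment)."""
    n = len(src)
    cs = [sum(p[i] for p in src) / n for i in range(3)]
    cd = [sum(p[i] for p in dst) / n for i in range(3)]
    ss = sum((p[i] - cs[i]) ** 2 for p in src for i in range(3))
    sd = sum((p[i] - cd[i]) ** 2 for p in dst for i in range(3))
    return math.sqrt(sd / ss)

# A benchmark and a copy shrunk by the 0.995 factor reported above
bench = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 20.0, 0.0), (0.0, 0.0, 5.0)]
model = [(0.995 * x, 0.995 * y, 0.995 * z) for x, y, z in bench]
print(round(scale_factor(bench, model), 3))  # 0.995
```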
A possible way to discern whether the final precision is due to the averaging of a large dataset or to the intrinsic location accuracy of the Apple Core Location framework is to test the precision and accuracy of each photo's position against the values estimated through the photo-alignment process in Metashape. During the photo-alignment workflow, each photo is re-positioned in a new location that better fits the overall geometry of the scene, linking each image to the reconstructed scene and minimizing the reconstruction error (e.g., [67,68]). If the model is properly scaled, the difference between the estimated and measured camera locations provides a proxy for the precision of the Apple Core Location framework at the site. It can be seen in Figure 6a that the estimated precision of the iPhone 13 Pro positioning is generally <1 m in each axis. The highest error can be observed in the first three photographs, suggesting that after a minimal time, the location signal stabilizes (as also previously observed by Tavani et al. [7]). To estimate the accuracy of the camera positions (which differs from precision in being compared to a benchmark), we aligned the iPhone model to the Air 2S model (considered in this work as the benchmark model), such that the difference between the newly estimated camera positions and the measured positions represents the accuracy of the device (Figure 6b). The alignment in Metashape was achieved by providing the coordinates of three non-collinear markers as obtained from the Air 2S model (in lieu of robust GCPs). It has to be noted that this is an ambitious assumption: even though the lat-long positioning of the benchmark model is approximately correct (see Figure 1), the actual altitude of the model suffers from a vertical translation of ~−8 m in the absence of a survey-grade registration. This comparison shows that there is significant error in the vertical axis of the world frame (~5.9 m).
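The precision proxy described above — the per-axis difference between the geotagged camera positions and those estimated during photo alignment — can be computed as follows (a sketch; the array names are illustrative):

```python
import numpy as np

def per_axis_precision(measured, estimated):
    """Per-axis RMS difference between measured camera geotags and the
    camera positions estimated during photo alignment; a proxy for the
    positioning precision if the model is correctly scaled.
    Arrays are (N, 3): east, north, up, in metres."""
    d = np.asarray(measured) - np.asarray(estimated)
    return np.sqrt((d ** 2).mean(axis=0))
```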
Considering that the Air 2S model itself suffers a ~−8 m translation from the LiDAR-derived DTM (Figure 1b,c), the mean vertical error of the iPhone is ~2.1 m. The accuracy estimate in the east direction is within 1 m (average easting error is 0.6 m). The north direction accuracy estimate is mostly between 0 and 3 m (average northing error is 1.5 m) (Figure 6b). The total accuracy error is 6.14 m and mostly reflects the vertical shift. Regarding the anticlockwise rotation of about 13° around the x-axis, this is likely related to the photo-survey acquisition strategy. In fact, having performed the acquisition along the east-west direction (i.e., parallel to the strike of the trench), most of the photographs lie along the x-axis of the world frame. The high collinearity of this dataset is conducive to the introduction of rotational errors along this axis. The availability of camera pose information in the EXIF files (i.e., in similitude to the Air 2S dataset) would have provided the means to mitigate such errors.
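The collinearity of a camera survey can be quantified before processing, e.g. from the singular values of the centred camera positions; a hedged sketch (not part of the original workflow):

```python
import numpy as np

def collinearity_ratio(cam_positions):
    """Ratio of the second-largest to largest singular value of the
    centred camera positions; values near 0 flag a near-collinear
    survey, prone to rotational ambiguity about its long axis."""
    c = np.asarray(cam_positions, dtype=float)
    c = c - c.mean(axis=0)
    s = np.linalg.svd(c, compute_uv=False)
    return float(s[1] / s[0])
```

A survey acquired along a single east-west line, as in this case study, returns a value near zero, warning that rotational errors about that axis are poorly constrained.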

Scaling and Orientation Accuracy of the iPhone LiDAR Reconstructions
The scaling of the point clouds derived from the LiDAR sensor of the iPhone 13 Pro closely matches that of the Air 2S model. The 3D Scanner App model had a scaling factor of 1.0012, while the PIX4Dcatch model had a scaling factor of 1.0065. Whilst the latter app is also able to return correctly oriented models, the 3D Scanner App-generated model was arbitrarily oriented. Prior to its latest updates, the PIX4Dcatch app was also unable to produce correctly oriented models [7]. The capacity to build accurately scaled and oriented models directly in the field (assuming mobile network connectivity) is a potential game-changer for applications that require the rapid collection of attitude data at a given study site (e.g., fault and fracture analysis). It should also be noted that handheld GNSS real-time kinematic (RTK) rovers have recently become available for selected iPhone and Android devices (e.g., the viDoc RTK rover). These highly accurate add-on receivers offer the potential to turn smartphones into survey-grade GNSS tools, though at present, the cost of these devices rivals that of standalone RTK-GNSS platforms. We acknowledge that a relatively modest upgrade has been announced with the release of dual-frequency receivers in the iPhone 14 Pro. Nevertheless, the typically limited access to the raw data of the Apple Core Location framework may limit or impede the post-processing carrier phase fix (e.g., [20]).
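Whether a registered model is "correctly oriented" can be checked by measuring the residual tilt of its vertical axis after alignment; a minimal sketch, assuming the registration is expressed as a 3×3 rotation matrix R:

```python
import numpy as np

def tilt_deg(R):
    """Residual tilt of a registered model: the angle (degrees) between
    the model's rotated vertical axis R @ [0, 0, 1] and world up. A
    correctly oriented model should return a value near zero."""
    z = np.asarray(R) @ np.array([0.0, 0.0, 1.0])
    return float(np.degrees(np.arccos(np.clip(z[2] / np.linalg.norm(z), -1.0, 1.0))))
```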

Internal Accuracy of Reconstructions
All point clouds produced by the survey methods deployed in this study were compared to test for deformations in the reconstructed scene after manual point cloud alignment (similarity transform). After translating, rotating, and scaling each cloud to fit the Air 2S model, their distance to the benchmark model was computed (Figure 4). When using such a comparison, it must be taken into consideration that the point-to-mesh distance will encounter some disparity due to the tessellation of the surface (mesh). Most of the points composing the iPhone model (~82%) are <1.5 cm from the benchmark model surface, with most of the outlier points at the floor of the trench. These deformations observable at the base of the model are probably caused by the obliquity between the photo view direction and the ground, which is typical of ground-based surveys targeting vertical edifices. In this work, the main objective was to reconstruct a single trench wall. In cases where the acquisition of the floor of the trench is required, it is recommended, even during ground-based surveys, to include photos that are normal to sub-normal to the base of the excavation. As with the iPhone model, the Nikon model has 85.6% of its values within a 1.5 cm distance from the benchmark model, with outlier points more randomly distributed. The fact that both the iPhone and the Nikon models have a similar distribution of errors in relation to the distance from the benchmark model (e.g., Figure 4c,d) suggests that, for these two models, reconstruction errors are largely dictated by the limitations of SfM estimation. Reconstruction errors related to SfM techniques are strongly coupled to the resolution of the input image dataset, as well as additional factors, such as camera sensor noise and motion blur (e.g., [69]).
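The percentage-within-threshold figures reported above can be reproduced with a simple nearest-neighbour comparison; a brute-force point-to-point sketch (the text uses point-to-mesh distances, which differ slightly near tessellation edges):

```python
import numpy as np

def fraction_within(cloud, benchmark, tol):
    """Fraction of cloud points whose nearest benchmark point lies
    within tol metres. A brute-force point-to-point stand-in for the
    point-to-mesh distances used in the text; adequate only for small
    clouds and a densely sampled benchmark."""
    d = np.linalg.norm(cloud[:, None, :] - benchmark[None, :, :], axis=2)
    return float((d.min(axis=1) < tol).mean())
```

For the clouds in this study one would use a spatial index (e.g. a k-d tree) rather than the dense distance matrix, but the computed statistic is the same.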
The internal deformation error associated with the LiDAR acquisition is mostly within 6 cm (77% of the point population) for the 3D Scanner App model and within 2.5 cm (94%) for the PIX4Dcatch model. It should be noted that the size of these two models and their acquisition strategies were distinct (Figure 4). As a result of the strategy followed during the 3D Scanner App acquisition, the area between the green and red arrows in Figure 4e was scanned at least twice over a relatively long and convoluted path. This path exacerbated errors associated with simultaneous localization and mapping (SLAM) [70]. When the user moves the smartphone while scanning, its position and orientation (i.e., the pose information) must be continuously determined in order to append consecutive scanned portions of the scene. This is generally achieved by merging the pose information provided by visual and inertial sensors [70]. As a result, small errors in the phone's pose estimation can accumulate and propagate, giving rise to mispositioned points within the model. These deleterious effects are evident in the survey performed with the 3D Scanner App, where the last portions of the scan produced a 'ghost' planar feature at a distance of about 25 cm from the feature's true position. The acquisition carried out through the PIX4Dcatch app was much smoother, having acquired only the north-facing wall of the trench (not the floor nor any of the other walls). The acquisition proceeded from east to west, as indicated by the green and red arrows in Figure 4g. It can be observed (Figure 4h) that for smaller and smoother acquisitions, such as the survey conducted with the PIX4Dcatch app, the majority of points (94%) lie <2.5 cm from the benchmark model's surface. Qualitatively, most of the outlier points are located at the periphery of the acquisition. Again, this is likely the result of SLAM errors.
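The way small pose errors accumulate along a scan path can be illustrated with a toy drift model (purely illustrative; real SLAM pipelines suppress much of this through loop closure and visual-inertial fusion):

```python
import numpy as np

def drift_after(n_steps, step_len, heading_bias_deg):
    """Toy model of SLAM drift: the endpoint position error produced by
    a constant per-step heading bias accumulating along a straight scan
    path of n_steps segments of step_len metres each."""
    theta, true_pos, est_pos = 0.0, np.zeros(2), np.zeros(2)
    for _ in range(n_steps):
        true_pos = true_pos + np.array([step_len, 0.0])
        theta += np.radians(heading_bias_deg)
        est_pos = est_pos + step_len * np.array([np.cos(theta), np.sin(theta)])
    return float(np.linalg.norm(est_pos - true_pos))
```

Even a 0.1-degree-per-step bias over a 20 m path produces an endpoint error on the order of metres, which is why long, convoluted acquisition paths exacerbate mispositioning.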
In effect, SLAM errors are roughly comparable to the so-called 'doming effect', which may impact SfM-MVS photogrammetric reconstructions in terms of their deleterious impact upon model analysis (e.g., [22,71]).

Orthopanels
Any of the acquisition methods tested herein can be used to generate textured models of the scene and to orthographically project the resulting model over a panel, thus generating an orthomosaic (or orthopanel) of the scene [30,31,[72][73][74]. This procedure has considerable application in structural geology, stratigraphy, geoarchaeology and geomorphology, as well as other earth science disciplines, where orthomosaics are generated by orthogonally projecting the model towards a direction that minimizes the geometric distortion of the targeted features observable in the model (e.g., geologic structures, bedding planes, clasts, etc. [75][76][77]). In the case of the three SfM models, orthomosaic construction is a trivial additional step in the SfM-MVS reconstruction workflow, available within photogrammetric reconstruction software tools (e.g., Metashape [73]). All three SfM-derived orthomosaics faithfully reproduced the scene with variable resolutions (<3 mm/pixel) that primarily relate to the average spatial resolution of the photo survey. A 2 × 1 m subregion of these orthomosaics is shown for comparison (Figure 5) after export at a fixed resolution of 3 mm/pixel. In this section, a small (<40 cm throw) structure cutting through the Quaternary strata can be observed (Figure 5). At a glance, it could be confused with a small normal fault, but it corresponds to the eastern side wall of the active sinkhole. How this structure relates to the subsidence evolution of the area is beyond the scope of this work. The orthomosaic produced from the textured model processed in the field by the 3D Scanner App is less sharp than those of the SfM models, although the sidewall and stratification are still discernible (Figure 5). The PIX4Dcatch app allows for the uploading of data to the PIX4Dcloud for remote processing, although in this work we have only used the textured model available in the saved folder of the app.
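The orthographic projection behind an orthopanel can be sketched for a coloured point cloud as follows (u and v are orthonormal in-plane axes chosen to minimize distortion of the target features; a simplified stand-in for the export tools named above, in which overlapping points simply overwrite one another rather than being depth-sorted and blended):

```python
import numpy as np

def orthopanel(points, colors, u, v, px=0.003):
    """Orthographically project coloured points onto the panel spanned
    by orthonormal in-plane axes u and v, rasterized at px metres per
    pixel. points is (N, 3), colors is (N, 3); returns an RGB image."""
    a = points @ u                    # in-plane horizontal coordinate
    b = points @ v                    # in-plane vertical coordinate
    cols = ((a - a.min()) / px).astype(int)
    rows = ((b - b.min()) / px).astype(int)
    img = np.zeros((rows.max() + 1, cols.max() + 1, 3), dtype=colors.dtype)
    img[rows, cols] = colors          # later points overwrite earlier ones
    return img
```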
The latter model is not sharp enough to recognize most of the features observed in the other models, as can be seen from the derived orthomosaic ( Figure 5).

Final Remarks
In this work, we have tested readily available surface reconstruction methods, leveraging consumer-grade sensor platforms to produce 3D models of a typical field location encountered within geoscience and geoarchaeology applications. Our results have highlighted that only the SfM-MVS-derived Air 2S model and the iPhone LiDAR-derived PIX4Dcatch model satisfactorily recovered the orientation of the scene, with the Air 2S model also being georeferenced. Nevertheless, the Air 2S model suffered a vertical translation of ~8 m with respect to the real-world ASL coordinates. The comparison of the Air 2S model with the LiDAR-derived DTM and aerial orthoimage of Google Earth, together with the geometric consistency of the Air 2S model with the tested iPhone LiDAR models (which should nominally excel in distance accuracy, SLAM errors notwithstanding), demonstrates that aerial photogrammetry deployed from consumer-grade drones reached levels of accuracy in an uncontrolled field setting sufficient for many geoscience field surveying applications where survey-grade measurements are not mission critical (e.g., gauging approximate bed thicknesses, orientations, etc.). In general, there are several geoscience applications where the internal scale and geometric consistency of the scene supersede the need for accurate georeferencing, such as models intended for the quantitative extraction of oriented data (e.g., [29,78]), the production of oriented orthopanels (e.g., [12,73]), and the qualitative observation, preservation, and sharing of models (e.g., [79,80]). For all those cases, our results suggest that modern drones, such as the DJI Air 2S, can be used to produce stand-alone surveys with orientation accuracies sufficient for the vast majority of use cases. Nevertheless, we suggest using such an approach with caution, since the GNSS signal can be subject to occlusions, particularly within mountainous or urban areas.
Direct georeferencing alone is not sufficient to establish survey-grade registrations, even when RTK drones with centimetric position accuracy are used (e.g., [81]), and is therefore not recommended for applications where absolute orientation accuracy is required. For cases where an approximate alignment to real-world coordinates is sufficient, we suggest extending the coverage of the acquisition over a much larger area than the region of interest whilst avoiding the acquisition of collinear images. It is also recommended to always implement routine quality checking of 3D surface reconstructions produced within the field. Where no GCPs are available, objects of known scale and orientation can be inserted into the mapped scene to meet this objective (e.g., [4,9]).
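A minimal field QC check against an object of known dimensions might look like the following (illustrative only; the 1% tolerance is an assumed threshold, not a recommendation from the text):

```python
def scale_check(measured_len, true_len, tol=0.01):
    """Field QC against an object of known length placed in the scene
    (e.g. a surveying rod): returns the relative scale error and
    whether it falls within the accepted tolerance."""
    err = measured_len / true_len - 1.0
    return err, abs(err) <= tol
```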
A noteworthy aspect of this work is the recognition of the recent improvement obtained by the PIX4Dcatch app in aligning iOS LiDAR-derived models to the world frame. To confirm this observation, we performed an additional test over an object characterized by a simple geometry. The test consisted of five independent iOS LiDAR surveys of the base of an obelisk made of limestone blocks in the city of Trieste (Italy), acquired through the PIX4Dcatch app. The survey strategy involved circumnavigating the obelisk around either half or the entirety of its perimeter, whilst continuously acquiring LiDAR data using the iPhone 13 Pro (Figure 7a). In similitude to the workflow presented within Section 2, we were able to estimate the positional accuracy of the GNSS during the LiDAR acquisition by generating an SfM-MVS photogrammetry model of the obelisk using all frames recorded by the app during the five distinct acquisitions (Figure 7b). Note that, unlike the LiDAR-derived models themselves, each image used for texturing is georeferenced, with its position recorded in its EXIF data. In comparison to what was observed in the case study presented in Section 2, the accuracy estimates in the east and north directions are ~1 m, while the vertical accuracy estimate is ~2 m. Note that the positioning error is almost stable during each discrete acquisition (Figure 7b). This suggests that, following the acquisition of the first frame, subsequent image locations are established based upon inertial measurements. The five resultant LiDAR models, although shifted, are consistently aligned to the world frame. This observation was corroborated by deriving the obelisk orientation data from each of the generated LiDAR point clouds using the Compass plugin of CloudCompare [82] (Figure 7c). All measurements (from 10 to 20 for each model) are plotted as black great circles in Figure 7d.
Orientation measurements of the obelisk were also made at the site using the Clino app for iOS on the iPhone 13 Pro and the FieldMove app for iOS on an iPad Pro (https://www.petex.com/products/move-suite/digital-field-mapping/ (accessed on 21 July 2022)). Particular attention was paid during this procedure to avoid magnetic interactions with the target object that might perturb measurement accuracy. In the field, orientation measurements were taken as planar and linear features (Figure 7d). Field measurements in Figure 7d are referred to geographic north (the magnetic declination at the site was +4°). Incidentally, an average rotational error around the vertical axis of the world frame of <5 degrees is observed. Unfortunately, due to the limited magnetic declination at the site and the proprietary nature of the PIX4Dcatch app's registration procedure, we cannot discern whether the LiDAR models were intended to be aligned with geographic or magnetic north. If the LiDAR model is natively aligned to magnetic north, then the rotation error is potentially reduced to <1 degree. In any case, knowing this rotation, one can rotate the models, or any derived structural data, accordingly, to maintain a degree of accepted accuracy where possible. Overall, the results obtained by this study highlight that commercial products, such as the DJI Air 2S and the iPhone 13 Pro, can prove useful as standalone field data acquisition platforms for diverse applications in the geosciences. Nevertheless, these instruments are subject to several sources of error (e.g., SLAM, geolocation, etc.) that can compromise entire studies. It is recommended that users remain cautious about the quality of models derived from such sensor platforms if no accuracy estimates exist, particularly for applications where the fidelity of the resultant metric data is critical.
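The conversion from a fitted plane normal to geological orientation data, plus the declination correction discussed above, can be sketched as follows (assuming east-north-up coordinates; a sketch of the standard geometry, not the Compass plugin's internal code):

```python
import numpy as np

def normal_to_dipdir_dip(n):
    """Dip direction / dip (degrees) of a plane from its normal vector
    in east-north-up coordinates, as when planes are fitted to LiDAR
    point-cloud patches."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    if n[2] < 0:                      # use the upward-pointing normal
        n = -n
    dip = np.degrees(np.arccos(np.clip(n[2], -1.0, 1.0)))
    dipdir = np.degrees(np.arctan2(n[0], n[1])) % 360.0
    return dipdir, dip

def magnetic_to_geographic(azimuth_deg, declination_deg=4.0):
    """Shift an azimuth from magnetic to geographic north; +4 deg is
    the declination reported for the study site."""
    return (azimuth_deg + declination_deg) % 360.0
```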

Conclusions
Progressively more geoscientists are relying upon the claimed accuracy of commercial-grade tools for the 3D modeling of outcrops and landforms, commonly without site-specific validation. Nevertheless, this work has shown that even the most up-to-date consumer-grade tools (e.g., the DJI Air 2S and the iPhone 13 Pro) are subject to numerous errors, which are potentially deleterious to the intended application. Indeed, the magnitude of these errors may be sufficiently profound to nullify the results of entire studies or may lie within a range that is acceptable for many use cases. In order to provide baseline reliability, accuracy must always be checked and evaluated for a given study site and/or application.
Herein, we tested the geolocation capabilities and the native LiDAR sensor of the iPhone 13 Pro. The obtained results are to be considered positive, particularly with respect to the PIX4Dcatch app, which is able to provide well-scaled and oriented point clouds, with image geotags stored within the associated EXIF metadata.

Funding: This research was partially funded by the Geological Survey of the Friuli Venezia Giulia Region within the framework of the following project: Accordo attuativo di collaborazione per l'aggiornamento censimento e pericolosità dei sinkhole del territorio regionale (prot.no. 0035220 of 27 July 2020).

Data Availability Statement:
All data in support of this publication, including Agisoft Metashape reports, are available upon request to the corresponding author. A 3D surface reconstruction of the trench is available at https://skfb.ly/ouI7o (accessed on 8 September 2022).