UAV-Based Remote Sensing for Detection and Visualization of Partially-Exposed Underground Structures in Complex Archaeological Sites

: The utilization of remote sensing technologies for archaeology was motivated by their ability to map large areas within a short time at a reasonable cost. With recent advances in platform and sensing technologies, uncrewed aerial vehicles (UAV) equipped with imaging and Light Detection and Ranging (LiDAR) systems have emerged as a promising tool due to their low cost, ease of deployment/operation, and ability to provide high-resolution geospatial data. In some cases, archaeological sites might be covered with vegetation, which makes the identiﬁcation of below-canopy structures quite challenging. The ability of LiDAR energy to travel through gaps within vegetation allows for the derivation of returns from hidden structures below the canopy. This study deals with the development and deployment of a UAV system equipped with imaging and LiDAR sensing technologies assisted by an integrated Global Navigation Satellite System/Inertial Navigation System (GNSS/INS) for the archaeological mapping of Dana Island, Turkey. Data processing strategies are also introduced for the detection and visualization of underground structures. More speciﬁcally, a strategy has been developed for the robust identiﬁcation of ground/terrain surface in a site characterized by steep slopes and dense vegetation, as well as the presence of numerous underground structures. The derived terrain surface is then used for the automated detection/localization of underground structures, which are then visualized through a web portal. The proposed strategy has shown a promising detection ability with an F1-score of approximately 92%.


Introduction
Archaeology is the study of previous human activities and cultures through the recovery, documentation, and analysis of material remains and the environment when used, modified, and perceived by people [1][2][3]. The accurate, detailed documentation of cultural heritage sites is a crucial aspect of archaeology. The majority of documentation strategies are based on labor and involve time-intensive, and sometimes invasive and hazardous, site surveys [4][5][6]. Therefore, remote sensing technologies have emerged as a more practical approach to obtaining a detailed understanding of archaeological sites. More specifically, the emergence of passive and active remote sensing modalities operating in different portions of the electromagnetic spectrum allows for the derivation of a rich set of information, which is useful for the detection and mapping of archaeological remains. Improvements in direct georeferencing technologies, and lower-cost technologies, i.e., integrated Global Navigation Satellite Systems/Inertial Navigation Systems (GNSS/INS), which allow for the controlfree mapping of such sites. This makes the remote sensing of archaeological sites a more attractive option for their documentation. Remote sensing data have traditionally been acquired by spaceborne and airborne platforms [7][8][9][10][11][12]. In spite of their ability to improve performance, these remote sensing systems do not provide reasonable resolution at an affordable cost [10]. Over the last decade, uncrewed aerial vehicles (UAVs) have emerged as promising remote sensing platforms. The use of UAVs is motivated by their low cost, ease of deployment, autonomous operation, ability to fly under cloud cover, and ability to fill an important gap between airborne and terrestrial sensing modalities [13][14][15][16][17]. The recent availability of miniaturized sensing modalities and direct georeferencing technologies, together with the improved payload capacity of modern UAVs, are other factors that promote the use of such platforms in a wide range of applications, including archaeology [4].
In terms of remote sensing modalities, imagery has been used for the derivation of high-resolution orthophotos, which are quite useful for the 2D mapping of archaeological sites. With the recent developments in Structure from Motion (SfM) algorithms, we are able to generate dense point clouds that can cover a site, allowing for its 3D mapping [18][19][20][21]. However, for sites with heavy vegetation cover, image-based documentation fails to provide useful information below canopy (e.g., hidden walls, and underground structures). In this regard, Light Detection and Ranging (LiDAR) provides a viable alternative for undercanopy mapping due to its ability to travel through tiny gaps while delivering results pertaining to hidden structures. Thus, Terrestrial Laser Scanning (TLS) has been used as a surveying tool in archaeology [22]. However, mapping an extensive archaeological site with TLS is a time-intensive and data-heavy operation. Furthermore, vegetation cover is a significant impediment to an accurate recording. This leads to the need for expensive post-processing for registration and information extraction from the derived point clouds [23][24][25][26]. Therefore, TLS is often preferred for smaller, unwooded areas as well as architectural remains. Considering this, the use of LiDAR units onboard UAVs is becoming an interesting concept for the mapping of archaeological sites [27][28][29].
In spite of its advantages, LiDAR-based remote sensing lacks the important information necessary for understanding acquired scenes (e.g., lack of spectral information). With the improved payload capabilities of modern UAVs, both camera and LiDAR systems can be mounted on the platform, allowing for the simultaneous acquisition of image and point cloud data. Incorporating a digital camera onboard UAVs provides more capabilities, which enhances the processes of feature extraction, scene understanding, and the visualization of derived products. The synergistic characteristics of image and LiDAR mapping technologies allow for a more complete mapping/documentation. Table 1 provides a brief summary of these complementary characteristics, which are illustrated in Figure 1. • Only provides information about the canopy envelope. • Affected by illumination conditions. • Complex 3D reconstruction (e.g., establishing reliable matchings in overlapping imagery). [4,18,21,30,31] LiDAR-based  In spite of the above characteristics of LiDAR data, the detection of underground structures (including cisterns) from point clouds remains a challenging task, especially when dealing with complex archaeological sites exhibiting steep slopes and dense vegetation [32][33][34]. The detection of cisterns is quite important for some archaeological sites, especially those in areas with limited access to fresh water. Inhabitation of such sites is only sustainable if a provision of fresh water is created. Having information about such structures is important for: (1) identifying the maximum population that can be sustained, (2) differentiating between private and public spaces, (3) the identification of the functions of damaged buildings, and (4) providing insight into the settlement layout and phases of occupation. For instance, on Dana Island, where a large maritime village developed from the first through the eighth century CE, cisterns were critical not only for resident population but also for supplying the maritime vessels that used the island as a way station [35]. Using UAV LiDAR, researchers are able to answer fundamental archaeological questions such as the volume of quarrying, the spatial relationship between quarries and inhabitation, and the transformation of the natural terrain into a highly modified industrial settlement. Moreover, it can address environmental concerns about denuding the island of its natural flora while drastically reducing the labor and finances required to map such a complex site. Therefore, this manuscript focuses on establishing a UAV system equipped with imaging and LiDAR remote sensing technologies aided by an integrated GNSS/INS unit to provide useful geospatial data for the detection of underground structures in general, and cisterns in particular.
Much of the current UAV LiDAR-based archeological work focuses on the descriptive interpretation of results [36]. Some of the most famous recent studies typically present archaeological layers covered by vegetation, partial human-based reconstruction of missing features, or broad framework outlines for a study area [37]. Our research pushes the boundaries of research by quantitatively interpreting LiDAR signals to recognize archaeological features that are hard to survey and analyze at scale by human observation alone. Our work also opens the way for more sophisticated analytic projects, which could infer the degree and type of human occupation, leading to important socio-cultural conclusions about the historical processes around the area of interest.
The detection of underground structures from UAV-based point clouds requires a crucial ground-filtering step. Such UAV datasets could prove challenging due to the presence of above ground objects, land features of different sizes, and varying point densities [38]. Ground filtering algorithms for LiDAR point clouds can be categorized into three main groups: (1) morphology-based, (2) slope-based, and (3) surface-based approaches. Morphology-based filtering separates above-ground and bare-earth points using an opening operation [39][40][41], and it is robust in steep areas while removing smaller non-ground features. Slope-based strategies [42][43][44] aim to distinguish bare-earth points from above-ground points by detecting inconsistent slope changes. Hence, they are more effective in flat areas but less so in areas with drastic terrain changes. The goal of surfacebased filtering [45][46][47] is to approximate bare-earth points using a mathematical description of a Digital Terrain Model (DTM); however, it tends to ignore terrain details. Given the characteristics of UAV LiDAR data in archaeological sites (e.g., steep areas with land features of different sizes), surface-based filtering will likely be most suitable for this study.
Specifically, the cloth simulation filtering algorithm has frequently been compared with other filtering approaches in previous studies. Serifoglu Yilmaz et al. [48] evaluated the performance of seven commonly used ground-filtering algorithms [45,[49][50][51][52][53][54] for UAVbased point clouds. Their results showed that the cloth simulation filtering algorithm [45] produces the best results since it has the advantage of requiring only a few, easily adjustable parameters. Bolkas et al. [55] compared the use of UAV photogrammetry and TLS for examining the performance of two ground-filtering algorithms (i.e., Agisoft Metashape classification and cloth simulation filtering algorithms). They found that vegetation density has a major impact on surface change estimation due to the varying levels of penetration, while both ground-filtering algorithms provide acceptable results in areas with low vegetation density. In summary, among the existing ground-filtering algorithms, cloth simulation shows great promise in handling data obtained from UAVs. However, modifications are necessary to address challenges posed by noise and outliers in point clouds, as well as the dense vegetation, sudden elevation changes, and/or underground structures that might be frequent in archaeological sites.
In addition to using remote sensing technologies to develop algorithms capable of archaeological site documentation, there is a lack of easy-to-use visualization tools considering the massive amount of captured data, especially LiDAR point clouds. There are some commercial and opensource tools for remote sensing data visualization-e.g., Cloud-Compare (http://www.cloudcompare.org/, accessed on 10 February 2023), VisionLidar (https://geo-plus.com/point-cloud-software/, accessed on 10 February 2023), and Veesus (https://veesus.com/, accessed on 10 February 2023). However, such software programs rely on the availability of large amounts of computer memory storage and are limited by the computational performance of the used hardware. With recent improvements in internet speed, several web portals have been developed, allowing end-users to access point clouds without prerequisite installation and data downloading [56][57][58]. However, a visualization tool capable of integrating both image and point cloud data while providing end-users with interactive means for the manipulation of such data (e.g., forward and backward projection between 2D images and 3D point clouds) is still lacking. Therefore, another objective of the proposed research is to develop a web portal for managing/visualizing the collected UAV imagery and LiDAR data as well as the derived products. The main objectives/contributions of this study can be summarized as follows: • Develop a UAV-based remote sensing platform for the acquisition of image and LiDAR data for the documentation of isolated, complex archaeological sites rich in underground structures, such as cisterns and the basements of buildings.

•
Develop a robust terrain model generation strategy that can handle rugged terrains with sudden elevation changes, dense vegetation cover, and/or the presence of underground structures.

•
Develop a detection strategy for identifying underground structures in LiDAR point clouds.

•
Develop a web-based visualization portal for illustrating image and LiDAR data together with derived products while providing the end-users with easy-to-use switching between imaging and LiDAR data.
• Illustrate the performance of the developed strategies using real datasets captured over a complex archaeological site.
The remainder of this paper is organized as follows: Section 2 introduces the utilized UAV system, study site, and acquired datasets; Section 3 presents the mathematical models for LiDAR/image-based 3D point positioning (which are necessary for subsequent data processing), proposed terrain model generation and underground structure detection approaches, together with the developed web-visualization portal and its use in establishing a reference dataset; experimental results are then reported and discussed in Section 4; finally, the conclusions and recommendations for future work are summarized in Section 5.

Data Acquisition System, Study Site, and Dataset Description
One of the objectives of the proposed research is to develop and deploy a UAV-based remote sensing system equipped with a digital camera and LiDAR remote sensing modalities assisted by an integrated GNSS/INS unit. The system will be used for data acquisition over a complex archaeological site. The following subsections outline the specifications of the used UAV and provide details about the study site and acquired datasets.

UAV-Based Mobile Mapping System
An in-house-developed UAV mobile mapping system (MMS) was used in this study. The UAV, as shown in Figure 2, was equipped with a LiDAR scanner-Velodyne VLP-32C, a digital camera-Sony Alpha ILCE-7R, and an Applanix APX-15 UAV V2 GNSS/INS unit. The LiDAR scanner has 32 laser beams that are radially aligned in a vertical plane with a 40 • Field of View (FOV) at an angular resolution of 0.33 • . The laser beam assembly is rotated around the unit's vertical axis to provide a 360 • horizontal FOV with an angular resolution between 0.1 • and 0.4 • . The VLP-32C emits 600,000 pulses per second for a maximum measurement range of 200 m and ±3 cm range accuracy [59]. The LiDAR unit was mounted on the UAV with its vertical axis parallel to the flight direction. The Sony Alpha ILCE-7R is a 36.4-megapixel off-the-shelf camera with a frame size of 7360 × 4912 pixels and 4.86 µm pixel size [60]. The nominal focal length of the used lens is 35 mm. The camera, which was set up on the UAV with its optical axis pointing in the nadir direction, is triggered by an Arduino Micro microcontroller at a frame period of 1.5 s. The LiDAR and camera units were directly georeferenced by the APX-15 UAV V2 GNSS/INS system [61], whose data were post-processed using POSPac [62]

Study Site and Dataset Description
The study site was Dana Island (36 • 11 ′ 91"N, 33 • 46 ′ 27"E), which was part of the ancient Rough Cilicia (roughly rectangular in shape with dimensions of 2.7 km × 0.8 km) in southern Turkey. Located 2 km off the Mediterranean coast, this rich archaeological landscape includes a maritime village of the early Byzantine period and a hilltop occupation that goes back to the Iron Age [35]. The coastal village along the western shoreline includes houses, shops, hostels, baths, six churches, and other buildings related to seaborne travel. This was also the site of a major quarrying operation, as evidenced by the extensive limestone quarries across the settlement, and the infrastructure for exporting stone blocks via maritime vessels. These materials provide a wealth of information about the social, economic, political, and religious structures of the communities that lived on the island. Dana Island is known for its rugged terrain, hot and arid environment, and limited access to fresh water. Therefore, inhabitation was only possible by establishing a dense set of cisterns, which were not only used by the resident population but also to supply the maritime vessels that used Dana Island as a way station. The steep slope, rugged terrain, dense vegetation cover, and availability of numerous underground structures in the island make it an excellent site for testing the ability to use UAV LiDAR to detect and visualize such structures.
The west coast of Dana Island, where most archaeological sites are located, was covered by ten flight missions between 26 and 29 July 2019. Due to the isolated location of the island, a local Trimble base station (SPS585) was established for differential GNSS post-processing. The conducted missions, together with the local base station, are shown in Figure 3. As an example, Figure 4 illustrates a color-coded point cloud over the area covered by mission #2 (collected on 27 July). Since the LiDAR system allows for a wider swath coverage across the flight line than the onboard camera, some of the LiDAR points are not color-coded. The collection date, flying height, average speed, flight time, and collected data of each mission are listed in Table 2.

Methodology
The methodology developed in this work starts by using LiDAR data from different missions for the detection of underground structures. Original point clouds, acquired imagery, and detected objects are then incorporated in a web-visualization portal allowing for interactive switching between 3D LiDAR data and 2D imagery. This section starts with the mathematical models used for deriving 3D coordinates using LiDAR and imaging systems, as well as establishing the 2D-to-3D transformation between image and point cloud data. The methodology for underground structure detection is then explained, followed by coverage of the web-visualization portal. Finally, we will introduce the utilization of the developed portal for establishing a reference dataset, which will be utilized to assess the performance of the underground structure detection approach.

Point Positioning Equations for GNSS/INS-Assisted LiDAR and Imaging Systems
The success of any multi-modal geospatial data-processing/integration activity is contingent on ensuring the positional quality of such data (e.g., proper georeferencing of the used sensors, together with a comprehensive modeling of the point-positioning equations, relating their measurements to the respective ground coordinates). In general, establishing the point-positioning equations for either LiDAR or imaging systems requires two steps. First, we need to define the laser beam or imaging ray relative to the sensor coordinate system. This definition is based on the sensor measurements (i.e., laser range and pointing direction for a LiDAR and image coordinate measurements for a camera), together with the Interior Orientation Parameters (IOP) of the used sensor (i.e., parameters describing the encoder mechanism for a LiDAR or principal point coordinates, principal distance, and distortion parameters for a camera). Second, the position and orientation of the laser beam or imaging ray relative to the mapping frame are established through the Exterior Orientation Parameters (EOP) that describe the position and orientation of the sensor relative to the mapping frame.
The point positioning models for LiDAR and camera units are described in Equations (1) and (2), respectively. In Equation (1), r lu(t) I denotes the position of the footprint of a laser beam, emitted at time t, relative to the laser unit frame, while, r m lu(t) and R m lu(t) are the position and orientation information of the laser unit frame relative to the mapping frame at time t-i.e., the EOP of the laser unit. The derivation of r lu(t) I is based on the range and pointing direction measurements of the LiDAR unit as well as its IOP. For a GNSS/INS-assisted system, the estimation of r m lu(t) and R m lu(t) can be derived according to Equation (3), which is graphically explained in Figure 5, where r m b(t) and R m b(t) are derived from the GNSS/INS-data processing to define the position and orientation of the IMU body frame relative to the mapping frame at time t; r b lu and R b lu represent the lever arm and boresight rotation matrix relating the laser unit and IMU body frame coordinate systems. Thus, for a LiDAR system, the coordinates of an object point I in the mapping frame (r m I ) can be derived through Equations (1) and (3).
The point positioning for an imaging system, on the other hand, is shown in Equation (2), where r represents the imaging ray for point i relative to the camera coordinate system at time t. This term is derived from the image coordinates of point i (x i and y i ) and the camera IOP, including the principal point coordinates of the used camera (x p and y p ), principal distance ( f ), as well as distortions in the x and y coordinates for image point i (dist x i and dist y i ). Similarly, the position and orientation information of the camera frame relative to the mapping frame (r m c(t) and R m c(t) ) are estimated using the GNSS/INS trajectory information (r m b(t) /R m b(t) ) and mounting parameters relating to the camera frame and IMU body frame (r b c and R b c ), as shown in Equation (4) and Figure 5. Different from LiDAR, image-based 3D reconstruction involves an unknown scale factor (λ(i, c, t) for image point i captured by camera c at time t), which needs to be estimated. From the LiDAR/image-based point positioning equations (i.e., Equations (1)-(4)), it is evident that accurate trajectory information and system calibration parameters (including sensor IOP and mounting parameters) are critical for producing properly georeferenced data from LiDAR and imaging systems. Considering that the utilized UAV-based MMS was flown above the canopy under an open-sky condition without GNSS signal outages, the post-processed trajectory is expected to be accurate. As for the system calibration parameters, the IOP of a LiDAR unit is usually provided by the manufacturer and is relatively accurate/stable. Camera IOP and mounting parameters relating the LiDAR/camera sensor frames to the IMU body frame were estimated through a system calibration procedure [63].
Based on the point positioning equation (Equations (1) and (3)), a LiDAR point cloud was directly reconstructed. In this study, a LiDAR point was reconstructed only when the direction of the corresponding laser beam was less than ±70 • from the nadir direction. In this study, imagery acquired by the onboard camera was used for visualization through interactive backward and forward projection between 2D images and 3D LiDAR point clouds. Such image-based visualization involves two main processes: (i) for a 3D object identified in the point cloud, one should derive the corresponding point in an image where it is visible (i.e., denoted as backward projection; (ii) for a selected feature point in an image, we need to identify the corresponding location in a LiDAR point cloud (i.e., the location denoted as forward projection). In other words, backward/forward projection processes establish the link between the 2D imagery and 3D LiDAR point cloud, as shown in Figure 6.  (6) and (7).
For the forward projection of an image point (x i , y i ), its corresponding 3D coordinates are estimated by finding the intersection between the imaging ray and the 3D surface defined by the LiDAR data. In particular, the unknown scale factor λ(i, c, t) in Equation (2) is solved for in this process. To solve this scale factor in this work, an octree-based ray tracing algorithm similar to the one proposed by Revelles et al. [64] was adopted. More specifically, an octree of LiDAR points was first built. Then, a set of points was generated at equal distances along the imaging ray. For each point along the light ray I, its closest point in the LiDAR octree tree L was identified and the distance between these two points d was computed. Next, starting from the point I 1 that is closest to the perspective center, the first point I i that meets the following criteria was identified: (i) the distance d i was smaller than a threshold (e.g., 0.2 m) and (ii) the distance d i was smaller than the distance from the next point d i+1 . These conditions guarantee that the intersection of the light ray with the closest LiDAR surface (i.e., the visible surface) is identified. Finally, the forward projection solution was derived by projecting the closest LiDAR point to I i onto the imaging ray. This process is schematically illustrated in Figure 7.

Underground Structure Detection
The second objective of the proposed work in this study is to develop a robust, automated strategy for the detection of underground structures similar to those in Figure 8. The top two rows in this figure show situations where objects of interest can be detected from imagery. However, the last row shows a situation where the underground structure cannot be detected in the image due to canopy cover, although it is visible in the LiDAR point cloud. Therefore, the proposed methodology is based on the utilization of LiDAR data to detect these objects. As can be seen in Figure 8, for underground structures, LiDAR returns below the ground. Therefore, a terrain model comprising the bare-earth point cloud can be used for the identification of below-earth points, which could be grouped into clusters that are hypothesized as underground structures. The flowchart of the proposed methodology is shown in Figure 9.
The main challenges in deriving a reliable terrain model from LiDAR data over a complex site such as Dana Island include (as can be seen in Figure 10): (1) the presence of some noise/outliers in the point cloud; (2) dense vegetation that would lead to sparse points below canopy; (3) rugged terrain with sudden elevation changes; and (4) the presence of underground structures. Among the existing terrain model generation strategies, the cloth simulation algorithm has been repeatedly used in the prior literature [65][66][67]. A schematic diagram of the cloth simulation strategy is shown in Figure 11, which proceeds according to four steps: (i) turn the point cloud upside down, (ii) define a cloth (consisting of particles and their interconnections) with some rigidness, and place it above the point cloud, (iii) let the cloth drop under the influence of gravity and designate the final shape of the cloth as the DTM, and (iv) use the DTM to filter ground (i.e., bare earth points) from above-ground points. In spite of this simple yet sound procedure, the cloth simulation would result in less-desirable results for the derived DTM and bare-earth points when dealing with a complex environment. To illustrate the resulting artifacts, Figure 12 shows the derived terrain model and bare-earth points for the scenarios depicted in Figure 10 (one can see that the derived terrain model does not represent the actual ground surface). To produce more reliable results, the cloth simulation strategy has been modified as discussed below.

Mitigation of noise/outlier points:
In the original cloth simulation, the resting place of the cloth particles is defined by what is known as the intersection height value (IHV). For a given cloth particle, the IHV is established as the nearest LiDAR point to that particle in 2D (i.e., the highest point in the inverted point cloud). Such a definition makes the derived DTM sensitive to the noise level and presence of outliers in the point cloud (Figure 12a). To reduce the sensitivity to the presence of noisy/outlier points, we redefined the IHV as the 90th percentile of the elevations at the 2D vicinity for the particle in question (Figure 13-Case B). The impact of this change is shown in Figure 14a.   Mitigation of sparse points along the ground: In situations where gaps can be found along the ground due to the presence of above-ground structures and/or vegetation, the resting place of the cloth will show sags (illustrated in Figure 13-Case C), which will manifest as artificial peaks in the derived DTM (refer to Figure 12b). To reduce these artificial peaks, we conducted an iterative cloth simulation procedure, which has been presented in our previous work [46], where the rigidity of the inter-particle connections is modified according to the defined bare earth in the previous iteration (i.e., the rigidity is increased for particles that fall in areas with sparse bare-earth points as defined by the first iteration). The impact of adopting such a mitigation strategy can be seen in Figure 14b. ff ff Figure 13. Derived IHV using maximum height and 90th percentile height for three cases: Case Aground with clean definition; Case B-ground with noise points; Case C-area with dense vegetation.

Mitigation of erroneous terrain model at locations with sudden elevation changes:
In such situations, the cloth will smoothly change its elevation on both sides of the cliff, leading to erroneous DTM and missing bare-earth points on one side of the cliff (Figure 12c). This problem was handled through a post-processing strategy. After the generation of the bare-earth points from the adaptive cloth simulation, we generated a raster grid with the same DTM resolution. Then, we identified whether each cell included bare-earth points. Cells with no associated bare-earth points were denoted as empty cells. For each of the empty cells, we identified the neighboring non-empty cells and evaluated the minimum/maximum elevations of the bare-earth points in these cells. For the empty cell in question, we searchd for above-ground points whose elevations were within the bare-earth range of neighboring non-empty cells. If such points existed, they were added as bare-earth points to that cell, and the corresponding DTM elevation was adjusted to the average elevation of such points. The improvement in the DTM generation and bare-earth classification after considering this modification is shown in Figure 14c.
Mitigation of erroneous terrain model at locations of underground structures: As can be seen in Figure 12d, the presence of underground structures with a considerable amount of LiDAR returns at the base of these objects will lead to an erroneous DTM that dips at those locations. In addition to the dips, we will miss a set of bare-earth points at both sides of the structure. These missing points were handled through the previous mitigation strategy, leaving the erroneous DTM elevation and bare-earth points at the base of the underground structure. To mitigate such artifacts, we started by defining a raster grid of the same resolution as the cloth simulation DTM. Then, we identified the bare-earth points associated with each cell. For underground structures, the elevation of bare-earth points will be significantly less than those outside the structure. Therefore, we started by identifying cells that exhibited a lower elevation relative to their neighbors. These cells were then clustered through a region-growing strategy. Clustered regions with a size that does not exceed a predefined threshold, which depends on the expected size of underground structures, were hypothesized to belong to non-ground objects. The bare-earth classification of points in these cells was nullified, and the DTM elevation at their location was redefined as the average elevation of neighboring DTM cells. The improvement in the ground-filtering results obtained following this mitigation strategy can be seen in Figure 14d. Once the DTM was generated from the modified cloth simulation, together with the proposed post-processing mitigation strategies, the elevations of the LiDAR point cloud were normalized by subtracting the DTM elevation ( Figure 15). Following the normalization, below-surface points were identified and clustered into groups using the density-based spatial clustering of applications with noise (DBSCAN) algorithm [68]. Each of these clusters was hypothesized to correspond to an underground structure (Figure 16).

Web-Visualization Portal
The last objective of this study is to develop a web-visualization portal that can be used by archaeologists to navigate through LiDAR point clouds, UAV images, and detected underground structures. The design criteria for the visualization portal include: (1) the ability to show a massive number of points and images; (2) not requiring local data storage (i.e., it could use data stored in a cloud server); (3) not requiring software installation; (4) providing some annotation tools; and (5) being amenable to developing other tools, such as forward and backward projection between 2D images and 3D point cloud data. In this work, Potree [69] was chosen for the development of the prototype web-visualization portal since it meets the above design requirements. Potree is an opensource code (http://www.potree.org), which can efficiently render a huge point cloud (>10 9 points) through a web browser without the need for software installation or data downloading [70]. It is also capable of displaying georeferenced meshes (either in ASCII or binary format), shapefiles, and images. However, there is no built-in function for backward/forward projection between imagery and point cloud data. Thus, as part of this work, we developed backward/forward projection functions within the Potree webvisualization portal. The portal architecture and developed backward/forward projection functions are briefly discussed in the next paragraphs.
Potree Web-visualization Portal: The established structure of the portal is illustrated in Figure 17. The front-end is the graphical user interface used to visualize the geo-referenced data, such as point clouds and images. The back-end consists of various visualization and/or computational functions that enable end-users to manipulate the georeferenced data. Figure 18 shows a Cesium base map (https://cesium.com/, accessed on 10 February 2023) and UAV LiDAR point cloud covering the study site (Dana Island), together with the captured images. Since the georeferencing parameters of the imagery are available from the UAV GNSS/INS trajectory and system calibration parameters (Equation (4)), the displayed imagery is shown in the proper position and orientation relative to the point cloud data. Finally, the georeferenced data (point cloud, imagery, and metadata) of the web portal are stored in a database. As shown in Figure 17, the back-end receives client requests from the front-end and processes them by coordinating with the database using the visualization and/or computational functions. For example, in this study, the backward/forward projection functions are realized through the back-end, which interacts with the front-end and database.  Backward/Forward Projection Functions: As shown in Figure 18, multiple georeferenced data acquired by different sensors can be rendered in the Potree web-visualization portal. To provide the user with the ability to visualize corresponding features in both image and point cloud data, backward/forward projection functions are established, as described in Section 3.1. Figure 19 shows the point cloud and image viewers with a chosen point (red dot) in the former and corresponding point (blue placemark) in the latter. For backward projection, an object point selected in the LiDAR data is projected onto the closest image where the point is visible. The backward projection function also allows for visualization of the same object point in multiple images where the former is visible. This could be useful in scenarios where an object point is not visible in one image while being visible in others (this could happen in sites with elevation variations, which are responsible for relief displacement, as can be seen in Figure 20). Figure 21 illustrates the forward projection, where a selected point in an image is projected onto the LiDAR point cloud, and the blue placemark in the former is projected as a red dot in the latter. Through these projection functions, end-users can visualize and navigate between properly georeferenced, multi-modal remote sensing data captured at the same time/different times by the same platform/different platforms.

Establishing a Reference Dataset and Accuracy Assessment
In order to evaluate the potential of UAV LiDAR as well as the proposed detection approach, a reference dataset with all existing underground structures is needed. Having these reference data, together with the detection results, precision, recall, and F1-score metrics, can be evaluated using Equations (8)- (10), where TP, FP, and FN are the true positives, false positives, and false negatives, respectively. The precision metric represents the proportion of truly detected underground structures among all identified ones using the proposed strategy. The recall signifies the ability of the proposed methodology to detect all existing underground structures in the site. The F1-score is the harmonic mean of precision and recall (i.e., it could be used as the overall evaluation metric).
Curating a reference dataset for this study was quite challenging. Therefore, we had to use a variety of sources to generate a reference dataset that is as complete as possible. A portion of the reference data was available through multi-year field survey missions by archaeologists on the site. Another set was derived through manual inspection of the LiDAR point cloud and imagery data, which were manipulated through the developed web-visualization portal. More specifically, the LiDAR data were carefully inspected to determine potential underground structure locations. These potential locations were then back-projected onto the imagery for verification. Alternatively, the operator could navigate through the images and whenever an underground structure is identified, it could be forward-projected onto the point cloud to derive its 3D location. Through this image-aided LiDAR identification strategy, underground structures are classified into three categories: (i) easy to see in the images, (ii) hard to see in the images, and (iii) no image available (hereafter denoted as "easy_img", "hard_img", and "no_img"). Figure 22 shows sample cistern point clouds and their corresponding images for the easy_img, hard_img, and no_img categories. Figure 23 illustrates the reference dataset generated from manual inspection and field survey.

Experimental Results and Discussion
This section illustrates the feasibility of the proposed strategy for the detection/ visualization of underground structures using UAV LiDAR data. The assessment process will be conducted through both qualitative and quantitative analyses, which are covered in the following subsections.

Qualitative Evaluation
As shown in Table 1, a total of 5965 UAV images and roughly 931 million LiDAR points were collected through ten missions over four days. The proposed underground structure detection strategy was applied to these missions, and a total of 169 underground structures were detected. For the qualitative evaluation of the detection results, we mainly relied on the developed Potree web-visualization portal. The image and point cloud datasets were rendered by the portal in less than ten seconds. Figure 24 demonstrates the Cesium base maps overlaid with the LiDAR point clouds and UAV imagery from the ten missions. In addition to the imagery and LiDAR data, detected underground structures can also be visualized and annotated through the portal, as shown in Figure 25. The portal allows for end-users to interact with the rendered data using its built-in functions (i.e., rotate, zoom-in/out, move) without experiencing any lag, as shown in Figure 25b. The detected underground structures can be simultaneously visualized in the imagery and LiDAR data through the developed forward-and backward-projection functions. Figure 26 illustrates a perspective view of annotated underground structure locations on the UAV LiDAR data. The backward projection function enables the user to visualize a specific underground structure location on all UAV images where it is visible (i.e., where its back-projection lies within the image frame), as shown in Figure 27. Conversely, for a given UAV image capturing an underground structure, its image coordinates can be forwardprojected onto the LiDAR data, as shown in Figure 28. These backward and forward projections can be used to qualitatively evaluate the validity of detected underground structures. Figure 29 shows examples of detected cisterns in the LiDAR point cloud and corresponding images in which they are supposed to be visible while highlighting situations where a cistern is clearly visible (Figure 29a

Quantitative Evaluation
A total of 169 underground structures (the majority of which is cisterns) were detected in the LiDAR data by the proposed methodology. As already mentioned, these detected underground structures were verified through backward projection and categorized according to whether they are clearly visible in the images. Of the detected underground structures, a total of 70 were difficult to identify in the imagery; however, these structures are clearly visible in the LiDAR data as below-terrain surface objects. On the other hand, 93 underground structures were clearly visible in the image and LiDAR point cloud data. Using manual inspection and field survey to curate the reference data, a total of 188 reference underground structures were generated. It should be noted that the number and locations of the underground structures in the reference data are based on our best efforts to obtain as complete a dataset as possible. Figure 30 illustrates the detected underground structures and those existing in the reference dataset. The TP, FP, and FN values are reported in Table 3. Figure 31 shows a sample situation with a false-positive detection. This case shows a sudden terrain elevation change coupled with above-ground canopy. Although the modified cloth simulation and proposed post-processing steps can handle each of these scenarios individually, they fail to simultaneous addressing both scenarios. A false-negative situation is shown in Figure 32. This false negative is caused by a cistern that is filled with debris (i.e., points inside the cistern were not deep enough below the local terrain surface to be detected as a below-ground object). Based on the reported values in Table 3, the precision, recall, and F1-score are 0.97, 0.87, and 0.92, respectively.

Conclusions
This paper investigated the potential of deploying a UAV system equipped with GNSS/INS-assisted imaging and LiDAR units for the documentation of archaeological sites. To ensure practical access to acquired data and potential products, a web-visualization portal was developed without the need for high-end computational resources and the installation of a dedicated software. In addition, a methodology was developed for the detection of underground structures in complex archaeological sites. An example of such a site, Dana Island, Turkey, was selected for this study due to its rich archaeological landscape and presence of steep terrain with sudden elevation changes, which are sometimes covered by vegetation. Image and LiDAR data from a total of ten UAV missions captured over four days were used in this study. The acquired data showed a high level of detail and synergistic characteristics in imagery and LiDAR point clouds. A Potree webvisualization portal was successful in rendering large imagery and LiDAR data. Moreover, the implemented forward/backward projection capabilities of the portal confirmed the georeferencing quality of the acquired data. The proposed underground structure detection strategy focused on the derivation of a reliable terrain model in a complex environment containing: (1) noisy/outlier points, (2) sparse ground points due to canopy cover, (3) the presence of rugged terrain with sudden elevation change, and (4) the presence of numerous below ground objects. The detection strategy showed a good performance with an F1-score of 92%. However, we still obtained a few false positives and false negatives. The false negatives were mainly attributed to cisterns filled with debris.
Furthermore, the quantitative evaluation of results in this paper highlights the importance of incorporating field observations in future research. Although LiDAR sensing technology can successfully detect the vast majority of underground structures, it is unable to capture fully covered cisterns. Therefore, future research will investigate the use of Ground-Penetrating Radar (GPR).
For LiDAR-based algorithms, current and future work will focus on refining the terrain model generation strategies to handle situations where we have a combination of challenging factors. One such combination was the cause of the detected false positives. Regarding the false negatives obtained for those cisterns filled with debris, a hybrid strategy that utilizes both imagery and LiDAR data will be proposed. Moreover, the expansion of the data analytics to automatically detect other archaeological objects of interest (walls, quarry cuts, building layout, etc.) will be also addressed. The Potree web-visualization portal will be augmented by 2D-and 3D-plotting tools for the generation of precise sketches of archaeological artifacts. Finally, the developed terrain extraction methodology will be also investigated for the derivation of a reliable terrain model from UAV LiDAR data in natural forests with a complex terrain (i.e., steep ravines, debris, and canopy undergrowth). A reliable terrain model will be valuable for the precise determination of tree heights, which is important for the management of forest ecosystems.  Data Availability Statement: Data sharing is not applicable to this paper.