Close-Range Sensing and Data Fusion for Built Heritage Inspection and Monitoring—A Review

Built cultural heritage is under constant threat due to environmental pressures, anthropogenic damages, and interventions. Understanding the preservation state of monuments and historical structures, and the factors that alter their architectural and structural characteristics through time, is crucial for ensuring their protection. Therefore, inspection and monitoring techniques are essential for heritage preservation, as they enable knowledge about the altering factors that put built cultural heritage at risk, by recording their immediate effects on monuments and historic structures. Nondestructive evaluations with close-range sensing techniques play a crucial role in monitoring. However, data recorded by different sensors are frequently processed separately, which hinders integrated use, visualization, and interpretation. This article’s aim is twofold: i) to present an overview of close-range sensing techniques frequently applied to evaluate built heritage conditions, and ii) to review the progress made regarding the fusion of multi-sensor data recorded by them. Particular emphasis is given to the integration of data from metric surveying and from recording techniques that are traditionally non-metric. The article attempts to shed light on the problems of the individual and integrated use of image-based modeling, laser scanning, thermography, multispectral imaging, ground penetrating radar, and ultrasonic testing, giving heritage practitioners a point of reference for the successful implementation of multidisciplinary approaches for built cultural heritage scientific investigations.


Introduction
The maintenance and conservation of historic structures are elaborate tasks filled with challenges. Geometrical complexity, the multiplicity and degradation of materials, varying historical construction techniques, and a plethora of other intrinsic and extrinsic factors (including environmental pressures and past anthropogenic interventions) complicate the protection of the built environment. Therefore, extensive knowledge of these parameters is required to ensure the effectiveness of implemented interventions. Comprehensive condition inspections of built cultural heritage are thus necessary to address the state of preservation holistically, facilitating the diagnostic process, and to understand the prevailing problems that place historical structures at risk. Furthermore, monitoring the state of preservation through time is fundamental for effectively interpreting the occurring degradation phenomena and is, therefore, a powerful tool for decision-making regarding built heritage protection.
Systematic nondestructive acquisition and integrated handling of multisource scientific data play essential roles in documenting the state of preservation of historic structures [1]. The need for multidisciplinary inspection methodologies is frequently noted in the literature, mainly in application cases of built cultural heritage of outstanding value, personnel experience, budget, accuracy specifications, and the integrability of recording methods [37][38][39][40].
Alongside the advancements in reality capturing, significant technological developments have taken place in the field of nondestructive testing (NDT) and evaluation of (historic) materials. Nondestructive inspection techniques operating at visible, infrared, microwave, and radio-wave frequencies have become more versatile and cost-effective and, therefore, have been increasingly used in many fields, with innovation and development primarily being driven by industry. NDT sensors have different advantages and limitations depending on their operating principles and spectral ranges; nevertheless, the continuous development of portable and compact devices is expected to play a major role in future NDT instrumentation, as such devices can facilitate decision-making through agile on-site inspections [41,42].

Laser Scanning
Laser scanning methods are based on active recording techniques; they emit radiation through their own sources and record the backscatter, instead of sensing the reflected radiation originating from other sources. Terrestrial laser scanning (TLS) instruments utilize light detection and ranging (LiDAR) for range measurements and an optical beam deflection mechanism to record angle measurements. Depending on their operating principles, which vary significantly, the use of TLS techniques for built heritage recording poses different advantages and limitations [43,44]. Laser scanning mechanisms generally enable dense, accurate, and fast measurements, and are (relatively) easy to operate. In a conventional laser-scanning instrument, the scanner measures the surrounding scene stepwise, with a fast vertical mirror rotation and a slower horizontal instrument rotation. More specifics on the scanning mechanisms and measuring techniques of TLS can be found in Beraldin et al. [45], and Petrie and Toth [46].
TLS describes a variety of measuring instrumentation, sometimes integrated with a digital camera that provides color information to the measured point cloud. TLS has experienced a rapid decrease in the size, weight, and price of sensors, and a constant increase in measurement speed and spatial resolution. These rapid improvements allow measuring up to 1 million points per second at ranges of 100-300 m with ranging precision at the millimeter level, at a relatively low price. However, TLS sensors are line-of-sight and, therefore, multiple scans are required to cover an entire structure's surface (Figure 1). Under ideal calibration conditions, point clouds captured with TLS do not need to be scaled, unlike photogrammetric models. There are two typologies of TLS instrumentation widely used for cultural heritage documentation, operating on different recording principles:
• Time-of-Flight (ToF) scanners measure distances by measuring the time difference between the emitted laser pulse and the received backscatter. These devices are characterized by lower acquisition speeds and accuracies (5-6 mm), but are mainly suited for long-range acquisition.
• Phase Shift (PS) scanners record the phase difference between the emitted and backscattered signal (sinusoidal wave patterns) of a continuously emitted laser beam. These devices are characterized by shorter ranges (up to 300 m) and provide better accuracy than ToF scanners (2-3 mm); thus, they are suited for documentation at large scales.
Recording with TLS presupposes planning the data acquisition campaign to identify the elements or surfaces to be covered, to determine the optimal number and location of scanning positions and targets, and to define the management process of the point clouds [38,43,47,48].
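The two ranging principles listed above can be illustrated with a minimal sketch. It assumes the textbook relations d = c·t/2 for pulsed ToF ranging and d = c·Δφ/(4πf) for phase-based ranging with a single modulation frequency f; the function names and the numerical values are hypothetical, and real instruments typically combine several modulation frequencies and apply atmospheric corrections.

```python
import math

C = 299_792_458.0  # speed of light in vacuum (m/s); propagation in air is marginally slower

def tof_range(round_trip_time_s):
    """Range from the round-trip time of a laser pulse: d = c * t / 2."""
    return C * round_trip_time_s / 2.0

def phase_shift_range(phase_diff_rad, modulation_freq_hz):
    """Range from the phase difference of a modulated signal: d = c * dphi / (4 * pi * f)."""
    return C * phase_diff_rad / (4.0 * math.pi * modulation_freq_hz)

def unambiguous_range(modulation_freq_hz):
    """Largest range measurable before the phase wraps around (dphi = 2 * pi)."""
    return C / (2.0 * modulation_freq_hz)

# Hypothetical values, for illustration only.
d_tof = tof_range(1.0e-6)                  # 1 microsecond round trip
d_ps = phase_shift_range(math.pi, 10.0e6)  # half a cycle at 10 MHz modulation
d_max = unambiguous_range(10.0e6)
print(f"ToF: {d_tof:.2f} m, PS: {d_ps:.2f} m, PS unambiguous range: {d_max:.2f} m")
```

Under these assumptions, the phase measurement is only unique within half a modulation wavelength, which is consistent with the shorter operating ranges reported for PS scanners.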
Optimally placed scanning positions are selected to maximize coverage and obtain favorable incidence angles, achieving the required resolution specifications while decreasing occlusions and, if possible, the number of scans and the scanning time [49]. Targets are positioned in overlapping areas to facilitate registration between scans. Maintaining a good spatial distribution of scan targets on the x-y plane and in the z-direction is essential to avoid multiple solutions when solving the orientation between scans. Depending on the registration method between point clouds from different scanning positions, at least four well-distributed targets in x, y, and z should be positioned [50]. Registration between measured point clouds is usually performed through a coarse transformation based on common, often artificial, targets, followed by a fine registration method, most commonly the iterative closest point (ICP) algorithm [51][52][53]. Other fine registration methods for TLS point clouds include random sample consensus (RANSAC), the normal distribution transform, and methods using auxiliary data, such as target imagery and GNSS coordinates of the measurement-device location [54,55].
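The coarse, target-based step of the registration can be sketched as a least-squares rigid transformation between corresponding target centers. The example below uses the SVD-based (Kabsch) solution, which is one common choice and is not necessarily the method used in the cited works; the function name and target coordinates are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def rigid_transform_from_targets(src, dst):
    """Least-squares rotation R and translation t such that dst ~ R @ src + t.

    src, dst: (N, 3) arrays of corresponding target centers, N >= 3.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    # Cross-covariance of the centered target coordinates.
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection (determinant of -1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

# Hypothetical example: four non-coplanar targets seen from two scan positions.
theta = np.deg2rad(30.0)
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0,            0.0,           1.0]])
t_true = np.array([2.0, -1.0, 0.5])
targets_scan_a = np.array([[0.0, 0.0, 0.0],
                           [4.0, 0.0, 0.5],
                           [0.0, 3.0, 1.0],
                           [4.0, 3.0, 2.5]])
targets_scan_b = targets_scan_a @ R_true.T + t_true

R_est, t_est = rigid_transform_from_targets(targets_scan_a, targets_scan_b)
residuals = np.linalg.norm(targets_scan_a @ R_est.T + t_est - targets_scan_b, axis=1)
print("max target residual (m):", residuals.max())
```

In practice, the target residuals give a first check on the quality of the coarse alignment before a fine registration such as ICP is run on the full point clouds.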
Regarding the documentation of historic structures, TLS devices have been successfully employed for high-fidelity reality-based modeling of numerous large and geometrically complex monuments [56][57][58][59]. The use of ToF scanning devices has become less frequent, although they remain preferred for long-range applications (e.g., monitoring the erosion of historical mine remains [60]) and are used, in combination with PS scanners, in applications that require acquisition from variable ranges [61][62][63]. Nevertheless, the possibility of directly geo-referencing point clouds through the integration of ToF scanners and GNSS measurement systems provides a powerful 3D recording solution [64].
TLS is, moreover, a source of important radiometric data that can be exploited to further facilitate nondestructive condition documentation. For example, reflectivity values recorded by TLS, which express the intensity of the backscattered laser energy, have recently been explored for mapping the alterations of historical surfaces [83][84][85][86] as well as for surface moisture detection [87][88][89]. However, to ensure the usefulness of intensity data collected by TLS, rigorous radiometric calibrations are required to eliminate the effects of data acquisition geometry, instrumental errors, environmental effects, and reflectivity characteristics of the target [90][91][92].

Photogrammetric Techniques
Digital close-range photogrammetry involves techniques for retrieving 3D information from two-dimensional digital images recorded under controlled illumination conditions. The advancements in dense image matching [93] and the improvements in camera sensor manufacturing [94] have drastically improved image-based modeling (IBM), allowing the generation of dense point clouds, textured models, and high-resolution ortho-mosaics from large datasets. Up-to-date IBM approaches are based on photogrammetric computer vision algorithms. They are affordable, generally robust, and agile in terms of implementation, with the ground-sampling distance (GSD) and other parameters adjustable according to the specified requirements [95,96]. In addition, these approaches allow the use of non-metric and lower-end cameras, and have widened the application scope of photogrammetry for built cultural heritage because of their increased automation and the level of detail they can record.
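As an illustration of how the GSD can be adjusted to a requirement, the following sketch uses the standard pinhole relation GSD = pixel pitch × distance / focal length. It assumes an image plane roughly parallel to the recorded surface; the function names and camera values are hypothetical.

```python
def ground_sampling_distance(pixel_pitch_m, focal_length_m, distance_m):
    """Size on the object of one pixel (m), for a fronto-parallel pinhole camera."""
    return pixel_pitch_m * distance_m / focal_length_m

def max_distance_for_gsd(target_gsd_m, pixel_pitch_m, focal_length_m):
    """Largest camera-to-object distance (m) that still meets a required GSD."""
    return target_gsd_m * focal_length_m / pixel_pitch_m

# Hypothetical camera: 4 micrometer pixel pitch, 24 mm lens.
gsd = ground_sampling_distance(4.0e-6, 0.024, 6.0)
max_dist = max_distance_for_gsd(0.0005, 4.0e-6, 0.024)
print(f"GSD at 6 m: {gsd * 1000:.2f} mm; distance for a 0.5 mm GSD: {max_dist:.2f} m")
```

Halving the required GSD halves the admissible recording distance for a given camera, which directly affects the number of images needed to cover a façade.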
Multi-view IBM refers to digitization approaches for the production of point clouds and 3D models from datasets of overlapping images, using automated algorithmic methods [97]. Standard multi-view 3D reconstruction pipelines start with detecting and describing salient features on every image of a dataset. The features are matched across different image pairs, and false matches are filtered out. Next, Structure-from-Motion (SfM) implementations are used to estimate the interior and exterior orientations of the cameras, combining all relative orientations of the image pairs in a local coordinate system without an absolute scale. Then, each image's relative position and orientation in every pair is calculated using triangulation, and the combined image block is optimized through bundle adjustment. The resulting sparse point cloud is further densified by employing dense image matching algorithms, and most points of the scene are reconstructed in a procedure typically called multi-view stereo (MVS). The dense point cloud is meshed into a 3D model, usually utilizing Delaunay triangulation algorithms, and textured by interpolating color information from the image dataset. Multi-view image-based recording approaches do not, in principle, require control points with known coordinates to function. Nonetheless, the use of control points with known coordinates during the orientation improves the accuracy of the results, and is mandatory for acquiring measurements or for geo-referencing the 3D models.
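The triangulation step of such a pipeline can be sketched for the simplest case of one feature observed in two oriented images. The example uses linear (DLT) triangulation, a standard textbook formulation rather than the specific algorithm of any cited software; the camera matrices, point coordinates, and function names are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def project(P, X):
    """Project a 3D point X with a 3x4 camera matrix P to pixel coordinates."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def triangulate_point(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point from two views."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The homogeneous solution is the right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

# Hypothetical cameras: same intrinsics, second camera shifted 0.5 m along x.
K = np.array([[1000.0, 0.0, 500.0],
              [0.0, 1000.0, 500.0],
              [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.5], [0.0], [0.0]])])

X_true = np.array([0.2, -0.1, 5.0])
x1, x2 = project(P1, X_true), project(P2, X_true)
X_est = triangulate_point(P1, P2, x1, x2)
print("triangulated point:", np.round(X_est, 6))
```

With noise-free observations the point is recovered up to numerical precision; with real matches, the result would serve as an initial value that bundle adjustment subsequently refines.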
The IBM pipelines can effectively involve oblique imagery and require low levels of supervision and user expertise, making them extremely popular for digitizing the historic built environment. Many studies report the application of IBM workflows for documenting monumental architecture [98][99][100][101][102][103] (Figure 2) and other historical constructions [104][105][106][107], often supporting the implementation of failure analysis through numerical modeling. In any case, IBM is seldom considered as a stand-alone solution for nondestructive evaluation of historical structures [108][109][110], in all likelihood due to the higher cost-effectiveness of TLS in producing large-volume models suitable for deformation monitoring. Nevertheless, IBM provides an information-rich background for the thematic mapping of historic structures' deterioration [111,112], necessary for calculating damage/risk indexes [113].

Infrared Thermography
Infrared thermography (IRT) is a close-range sensing technique well established for inspection and monitoring of structures. IRT is a noncontact and noninvasive technique that allows repeatability, prolonged use, comparison between areas of the target, and multitemporal application, thus presenting many advantages over other nondestructive evaluation technologies [114,115]. Through thermal detectors, it measures levels of emitted infrared radiation in the long-wavelength infrared (LWIR) portion (7-14 µm) of the electromagnetic spectrum [116]. Infrared radiation is emitted from all materials at temperatures above absolute zero (i.e., T > −273.15 °C), due to their molecules' mobility. This molecular motion increases at higher material temperatures and reduces at lower temperatures. Therefore, the intensity, frequency, and wavelength of infrared radiation depend on the temperature and magnitude of the source and the material's emissivity [117].
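The dependence of emitted radiation on temperature and emissivity can be made concrete with two standard radiation laws that the text does not name explicitly: Wien's displacement law for the wavelength of peak emission and the Stefan-Boltzmann law for the total radiant exitance. The sketch below assumes a grey body with constant emissivity; the function names, temperatures, and emissivity value are illustrative assumptions.

```python
WIEN_B_UM_K = 2897.771955       # Wien displacement constant (micrometer * kelvin)
STEFAN_BOLTZMANN = 5.670374419e-8  # W m^-2 K^-4

def peak_wavelength_um(temp_k):
    """Wavelength of maximum black-body emission (micrometers)."""
    return WIEN_B_UM_K / temp_k

def radiant_exitance(temp_k, emissivity):
    """Total power emitted per unit area by a grey body (W/m^2)."""
    return emissivity * STEFAN_BOLTZMANN * temp_k ** 4

# Hypothetical wall surface at 20 degrees Celsius with an assumed emissivity of 0.93.
wall_temp_k = 293.15
peak_um = peak_wavelength_um(wall_temp_k)
exitance = radiant_exitance(wall_temp_k, 0.93)
print(f"peak emission near {peak_um:.2f} micrometers, exitance about {exitance:.0f} W/m^2")
```

For surfaces near ambient temperature the emission peak falls within the 7-14 µm LWIR band, which is consistent with the spectral range in which thermal cameras for building inspection operate.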
A thermal camera is a device employing a thermal-infrared detector that records the radiant energy in the LWIR range that falls onto the camera lens and converts it to a measurable form (Figure 3). Using a radiation detector, the thermal camera displays a target's temperature as a two-dimensional thermal image, a visual representation created from the detected average of incoming radiative energy intensities [118].
A few fundamental parameters affect the performance of the thermal camera's sensor and, subsequently, the image quality: the sensor spectral range, or spectral response; the spatial resolution, or pixel pitch; the thermal sensitivity, or equivalent random noise level; the intensity resolution, or number of intensity levels; and the scan speed, or update rate of the scanning mechanism [119]. The spectral range refers to the portion of the infrared spectrum in which the camera is operationally active. Sensitivity is measured in degrees Celsius and reflects the minimum detectable temperature difference. Inspection-purposed temperature sensors with good sensitivity recognize temperature differences as small as 0.040 °C (uncooled cameras). The intensity resolution is proportional to the number of hues or shades on the thermal camera screen. The higher the resolution, the more smoothly temperature changes are rendered, so that any abrupt temperature change in the image is due to the target itself and not the camera. The spatial resolution of thermal cameras is significantly lower than that of optical cameras [120]. Recently, affordable thermal camera models have come onto the market, including low-resolution instruments that attach to smartphones [121,122]. However, these inexpensive cameras provide lower accuracy, which makes them unusable for some applications. Table 2 presents some standard thermal cameras purposed for infrastructure inspection available on the market. Thermal images are typically displayed on a device or computer either as a black-and-white image or as a colored image in which each color correlates with a temperature range (Figure 4). Thermal images are essentially a mapping of the distribution of infrared radiation originating from the different parts of the object. It is also possible to depict isothermal curves, which are lines at the boundary between two colors that connect points with the same temperature.
Thermal image processing software can provide heat profiles, temperature frequency histograms for each area, temperature differences between images, points with maximum and minimum temperatures, magnifications, and filtering. Nevertheless, thermal infrared images can be difficult to interpret. To obtain high-quality and useful thermographic data, it is usually necessary to take into account the prevailing conditions (ambient temperature, relative humidity, recording distance, material emissivity) to adjust the camera and eliminate the errors they cause in measuring the temperature changes of a target's surface [123,124]. For this reason, IRT should be used in controlled environments. Furthermore, thermal infrared images are generally noisy and suffer from a low signal-to-noise ratio. Consequently, several digital image processing (DIP) procedures are employed to enhance acquired thermal images (Figure 5). For image enhancement, a variety of point operation algorithms, such as contrast stretching and histogram equalization, can be applied [125]. The objective of these algorithms is to widen the histogram of an image, which increases the dynamic range, thereby enhancing contrast. Using advanced signal analysis techniques such as thermographic signal reconstruction (TSR) and principal component analysis (PCA), defects at greater depths can be detected with higher thermal contrast. Furthermore, feature extraction from thermal images, based on the temperature values, has been observed to be advantageous in the classification of moisture, decay, and thermal leaks. For the detection of hot spots, image classification and thresholding are performed. Overall, several image segmentation techniques are used on thermal images, and the selection between them depends on the application and the nature of the thermographic acquisition [126].
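The two point operations mentioned above, contrast stretching and histogram equalization, can be sketched on a synthetic temperature map. This is a minimal NumPy illustration under assumed parameters (percentile limits, number of bins, synthetic data), not the processing chain of any particular thermography software.

```python
import numpy as np

def contrast_stretch(img, low_pct=2.0, high_pct=98.0):
    """Linearly rescale values between two percentiles to the range [0, 1]."""
    lo, hi = np.percentile(img, [low_pct, high_pct])
    if hi <= lo:
        return np.zeros_like(img, dtype=float)
    return np.clip((img - lo) / (hi - lo), 0.0, 1.0)

def equalize_histogram(img, bins=256):
    """Map values through the normalized cumulative histogram (output in [0, 1])."""
    hist, edges = np.histogram(img, bins=bins)
    cdf = hist.cumsum().astype(float)
    cdf /= cdf[-1]
    centers = 0.5 * (edges[:-1] + edges[1:])
    return np.interp(img.ravel(), centers, cdf).reshape(img.shape)

# Synthetic thermal image (degrees Celsius): noisy wall with a slightly warmer patch.
rng = np.random.default_rng(0)
temps = 18.0 + 0.5 * rng.standard_normal((64, 64))
temps[20:30, 20:30] += 1.5

stretched = contrast_stretch(temps)
equalized = equalize_histogram(temps)
print("stretched range:", stretched.min(), stretched.max())
```

Both operations preserve the ordering of temperature values and only change how they are distributed over the display range, so they aid visual interpretation without replacing the calibrated temperature data.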
IRT records the thermal radiation emitted from surfaces, enabling the analysis of surface temperature patterns, to reveal existing anomalies. In other terms, thermography aims to identify surface areas of interest by observing local temperature differences using thermal sensors [127]. In IRT, two different approaches are employed: active and passive [128]. In active IRT, the target is subjected to thermal stimulation by an external radiation source. The heat propagation depends on the materials' thermal properties and subsurface irregularities resulting in temperature differences on the target's surface. In this scenario, measured thermal radiation comes from the thermal response of the target to the external excitation. This technique is applied in cases where the target is in thermal equilibrium and does not show surface temperature differences or if they are so small that they cannot be detected with passive testing [129]. Given the ability to control the intensity of the external energy source, the artificial thermal excitation can reach deeper into the object, and hence information can be obtained from more internal layers.
Successful application of active thermography requires that the targeted surface is more or less homogeneous (has a defined high emissivity and, thus, low reflectivity), and that good knowledge exists about the radiation coming from additional sources, direct or indirect (reflected), and about other environmental factors that may affect the measurements [130]. This suggests inherent difficulties in applying active IRT to historical structures, especially in cases of highly deteriorated architectural elements, which explains its less frequent use.
Passive IRT measures the thermal radiation emitted from the target's surface without external heat stimulation. Passive thermography is a technique often employed for building inspections when measuring temperature differences is a factor for evaluating an existing structure's preservation state (or energy performance) [131]. The documentation of irregular temperature distributions on a building's façade or structural elements may assist in detecting potential problems or damages by estimating surface temperature changes compared with assigned reference values [132][133][134]. New critical developments in thermal measuring device technology, combined with other advantages stemming from its nondestructive nature, have led to widespread application in structural surveys of monumental and historic architecture [135][136][137][138][139]. Moreover, the applications of passive IRT concerning the investigation of historic buildings include the localization of original and replacement materials [140][141][142], evaluation of plaster conditions [143][144][145][146], assessment of cracks [137,147,148], characterization of material loss-induced features and other alterations on architectural surfaces [139][149][150][151], detection of moisture [152][153][154][155][156], localization of concealed defects and subsurface construction [157][158][159], and evaluation of restoration and consolidation interventions [2,160]. Increased demand is also reported regarding the inspection of masonry arch bridges, since their periodic inspection can be difficult due to access restrictions, which necessitates the application of remote sensing techniques [161][162][163].

Multispectral Imaging
Multispectral image acquisition is primarily associated with capturing data with a single imaging sensor capable of recording at multiple electromagnetic spectrum bands. Materials have specific spectral signatures at different regions of the electromagnetic spectrum, which can be obtained under controlled conditions [164]. Surface defects caused by weathering, concentration of moisture, and other alterations change their normal and homogeneous spectral behavior. Therefore, capturing spectral anomalies with imaging sensors facilitates identifying these characteristics. However, acquiring useful data on damage to historical structure surfaces poses considerable challenges, such as selecting the proper sensors, radiometrically and geometrically calibrating them, and identifying the environmental factors that may affect the captured reflectance data, which makes the use of these technologies infrequent [165]. Notably, Del Pozo et al. [166] report using a Tetracam Mini-MCA6 for obtaining multispectral ortho-mosaics to map altered and unaltered materials and moisture of a historical church. Table 3 presents the characteristics of some miniaturized multispectral camera options, which have been implemented for terrestrial applications.
The complementarity of visible-spectrum and infrared imaging is often considered essential for identifying damage on historic structures. Particularly, very near-infrared reflectance imaging is an effective tool for identifying biological colonization and the development of crusts on stone and concrete [167][168][169][170][171][172]. However, the cost and usability of hyperspectral/multispectral camera systems are often considered prohibitive factors for their implementation in the heritage sector, and thus the collection of multispectral data is performed via modified commercial cameras, and frequently with multi-sensor approaches. By removing the internal near-infrared cut-off filter, a charge-coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS)-based camera can be used as an affordable and agile alternative for multispectral acquisition. Narrow or wideband external filters allow capturing reflectance data at the very near-infrared range, while the camera retains user-friendly features and interfaces to a wide variety of photographic accessories and software [173][174][175]. Adamopoulos and Rinaudo [167], Lerma et al. [168], Meroño et al. [169], and Sánchez and Quirós [176] have used this method to obtain visible and very near-infrared images aiming to identify weathering on historical stone buildings or masonry. Thermographic reflectance imaging can be performed with an additional sensor (Figure 6).
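One simple way to combine a visible and a very near-infrared band is a normalized difference index of the NDVI type, sketched below. This index is not prescribed by the text and is used here only as an assumed example of band combination; the reflectance values and the threshold are hypothetical, and the two bands are assumed to be co-registered and radiometrically calibrated.

```python
import numpy as np

def normalized_difference(nir, red, eps=1e-9):
    """Per-pixel (NIR - red) / (NIR + red); eps avoids division by zero."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    return (nir - red) / (nir + red + eps)

# Hypothetical reflectance values: clean stone reflects similarly in both bands,
# whereas photosynthetic biofilm absorbs red light and reflects near-infrared.
red = np.array([[0.30, 0.31, 0.08],
                [0.29, 0.07, 0.09]])
nir = np.array([[0.35, 0.36, 0.40],
                [0.34, 0.42, 0.41]])

index = normalized_difference(nir, red)
candidate_mask = index > 0.3  # illustrative threshold, to be tuned per dataset
print(np.round(index, 2))
print(candidate_mask)
```

Pixels flagged in this way would only be candidates for biological colonization; confirming them requires visual inspection or reference samples.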

Ground-Penetrating Radar
Ground-penetrating radar (GPR) is a geophysical prospection technique widely used for NDT applications. GPR is a noninvasive measurement method that utilizes high frequency (10-10,000 MHz) low-power electromagnetic pulse sequences to locate subsurface targets and interfaces between materials with different electrical and magnetic properties. The possibility of distinguishing between materials and mapping interfaces within visually opaque substances or earth material depends mainly on the propagation speed of electromagnetic waves and the difference in electrical conductivity and permeability between different materials [177,178]. GPR's operating principle is based on the generation of short-duration radio wave pulses by a transmitter, transmitted as wide beams at a speed that depends on the electromagnetic properties of the medium. The electromagnetic signal propagates in a medium (such as a structural element or subsoil). When it encounters an interface between materials with different electrical properties, then some of its energy is reflected or diffused back to the surface, some is refracted, and the residual energy of the pulse passes through the interface to deeper horizons, where this process can be repeated. The part of the wave reflected from an interface returns to the surface, where it is detected and recorded by the receiver [179]. The selection of the appropriate operating frequencies of the antenna depends on the purpose of the investigation and the requirements of the respective application of geo-radar inspection.
One primary challenge of GPR is the interpretation of the collected data, which highly depends on the quality of performed measurements, knowledge of the prospected medium dielectric properties, layering of the materials, and suitability of signal processing techniques [180,181]. An equally important issue for retrieving useful information from GPR measurements is the dimensionality of presenting the results, with 2D and 3D representations being the most frequent visualization scenarios for the historical structures' state of preservation assessment.
By performing a horizontal GPR scan along a linear profile, a 2D recording is obtained, which results from the successive individual one-dimensional traces retrieved along the path of the antenna. The retrieved data can be displayed as an image using a predefined color scale or palette (usually grayscale), matching the strength (range) of the recorded signal with a specific hue (brightness) of the selected palette (Figure 7). This image, also referred to as a 2D scan profile or radargram, represents a vertical section in the structure where the horizontal axis corresponds to the position of the antenna along the scan, and the vertical axis to the two-way travel time of the electromagnetic wave, which corresponds to depth. Retrieving this type of result requires mechanical equipment with a built-in position encoder, which records the distance the antenna traverses along the scan line and the retrieval location of each individual trace [182]. Reflections from small or point scatterers below the surface of the ground, building element, or other medium appear on the radargram as diffraction hyperbolas. This is because the electromagnetic waves are transmitted by the monostatic antenna in the form of a wide conical beam, so that the receiver records the reflected signals from a subsurface target not only when it passes just above the position where the target is located, but also in multiple scans before and after this position. The shape of the retrieved hyperbola depends on the antenna layout, the depth at which the point scatterer is located, the speed at which the electromagnetic waves propagate, and the scan spacing selected by the operator. At greater depths, the hyperbolae are larger because they consist of more scans. In addition, higher electromagnetic wave velocities (lower relative dielectric constant) produce wider hyperbolae and vice versa.
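The relation between two-way travel time, depth, and the shape of a diffraction hyperbola can be sketched as follows. The example assumes a low-loss, non-magnetic medium in which the velocity is v = c/√εr, and a monostatic antenna with zero transmitter-receiver offset; the function names and permittivity values are hypothetical.

```python
import math

C_M_PER_NS = 0.299792458  # speed of light in vacuum (m/ns)

def wave_velocity(rel_permittivity):
    """Propagation velocity (m/ns) in a low-loss, non-magnetic medium."""
    return C_M_PER_NS / math.sqrt(rel_permittivity)

def depth_from_twt(twt_ns, rel_permittivity):
    """Reflector depth (m) from two-way travel time: d = v * t / 2."""
    return wave_velocity(rel_permittivity) * twt_ns / 2.0

def hyperbola_twt(x_m, x0_m, depth_m, rel_permittivity):
    """Two-way travel time (ns) to a point scatterer at (x0, depth), antenna at x."""
    slant_range = math.hypot(x_m - x0_m, depth_m)
    return 2.0 * slant_range / wave_velocity(rel_permittivity)

# Hypothetical masonry with an assumed relative permittivity of 9.
depth = depth_from_twt(10.0, 9.0)
apex_time = hyperbola_twt(1.0, 1.0, 0.5, 9.0)
print(f"10 ns two-way time corresponds to about {depth:.2f} m")

# Extra travel time 0.5 m away from the apex, for a fast and a slow medium.
excess_fast = hyperbola_twt(1.5, 1.0, 0.5, 4.0) - hyperbola_twt(1.0, 1.0, 0.5, 4.0)
excess_slow = hyperbola_twt(1.5, 1.0, 0.5, 16.0) - hyperbola_twt(1.0, 1.0, 0.5, 16.0)
print(f"time excess: fast medium {excess_fast:.2f} ns, slow medium {excess_slow:.2f} ns")
```

The smaller time excess in the faster medium means the hyperbola flattens out, which corresponds to the wider hyperbolae described for lower relative dielectric constants.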
Finally, the shorter the selected interval between scans (equivalent to a larger number of scans per unit of horizontal distance), the wider the hyperbolae recorded by point scatterers. The reflection always comes from the top of the point target, and the maximum (peak) of the recorded hyperbola curve corresponds precisely to the position where the target is. Usually, the larger the size (diameter) of a point scatterer, the stronger (wider) the hyperbolic reflection produced. The brightness or power of a hyperbolic reflection depends on the difference in electrical conductivity (and therefore relative dielectric constant) between the medium and the target. As a general rule, the brightness of a reflection produced by an interface between two materials with different dielectric properties is proportional to the dielectric contrast between the two materials, which means that the higher the contrast, the stronger is the reflection produced [183,184].
When scanning with GPR over a continuous boundary layer, the antenna receives consecutive reflections from the parts of said boundary, which in the retrieved 2D radargram appear in the form of a continuous reflecting layer that resembles the boundary layer. When the antenna crosses over a subsurface linear target of tubular shape transversely, i.e., perpendicular to the longitudinal axis of the target, the recorded reflection will be hyperbolic, similar to the case of diffraction by point scatterers described above. If the antenna moves in parallel, i.e., along the target, the reflection will appear as a continuous straight line, as long as the distance of the antenna from the subsurface target remains constant. Various subsurface inhomogeneities such as gaps (with air or water) produce strong reflections without a specific shape. Reflection polarity can also provide important information when interpreting GPR results. The presence of various subsurface discontinuities, such as large air-filled voids or cracks, is detected in the form of strong inverted phase reflections with a black-and-white sequence of colors and an indeterminate shape. In the case of disintegrated areas with high levels of moisture or water-filled voids, the generated reflections will be strong but will show the normal polarity sequence (white-black-white), which is very important for the identification and differentiation between specific types of deterioration when interpreting radargrams. In addition, these reflections are usually stronger and more visible than those mentioned above. This phenomenon occurs because, if we consider a stone or concrete structure, for example, the dielectric contrast between the material and water is much higher than the dielectric contrast between the material and air [185,186].
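The comparison of dielectric contrasts can be quantified with the normal-incidence amplitude reflection coefficient for low-loss, non-magnetic media, R = (√ε1 − √ε2)/(√ε1 + √ε2). The sketch below uses assumed, approximate relative permittivities (concrete 6, air 1, water 81); how the sign of R maps to a displayed color sequence depends on the polarity convention of the system and is deliberately left out.

```python
import math

def reflection_coefficient(eps_upper, eps_lower):
    """Normal-incidence amplitude reflection coefficient between two media."""
    n_upper, n_lower = math.sqrt(eps_upper), math.sqrt(eps_lower)
    return (n_upper - n_lower) / (n_upper + n_lower)

# Assumed, approximate relative permittivities.
EPS_CONCRETE, EPS_AIR, EPS_WATER = 6.0, 1.0, 81.0

r_air_void = reflection_coefficient(EPS_CONCRETE, EPS_AIR)
r_water_void = reflection_coefficient(EPS_CONCRETE, EPS_WATER)
print(f"concrete to air:   R = {r_air_void:+.2f}")
print(f"concrete to water: R = {r_water_void:+.2f}")
```

Under these assumptions, the water-filled interface yields the larger reflection magnitude and the two coefficients have opposite signs, which is the physical basis for distinguishing air-filled from water-filled discontinuities by reflection strength and polarity.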
By collecting multiple parallel 2D sections (time-slice method) or, in other words, by performing multiple horizontal scans on an x-y plane of grid coordinates, a 3D data set can be recorded that can be used to construct subsurface models, thus improving the efficiency and quality of the signal interpretation [187]. Three-dimensional data retrieval requires the use of a properly designed measurement grid; the dimensions and distance between successive scan lines on each axis are user-defined. Scans on the grid are usually performed in one direction, starting from the same straight line ("normal" way of scanning), although zigzag measurements, in which the direction of the scan profiles alternates, may also be possible. Essentially, with this type of GPR scanning, the mapping of a subsurface area of interest is achieved, providing information about the location, depth, and orientation of the internal reflectors. Today, most of the processing software with which geo-radar systems are equipped provides the possibility of displaying the 3D data in various ways, such as horizontal sections at defined time ranges, which correspond to depths parallel to the recording level, or isosurfaces, i.e., interpolated surfaces that represent subsurface points with a constant reflection coefficient or amplitude [188][189][190][191].
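The construction of a 3D data set from parallel profiles and the extraction of a horizontal time slice can be sketched with NumPy arrays. The example assumes equally spaced, equally long profiles stored as (traces × samples) arrays; the function names, grid size, and sampling interval are hypothetical, and real software additionally applies gain, filtering, and interpolation between lines.

```python
import numpy as np

def build_cube(profiles, zigzag=False):
    """Stack parallel radargrams into a cube of shape (lines, traces, samples).

    With zigzag=True, every second profile is reversed along the trace axis so
    that all lines share the same spatial direction.
    """
    aligned = [p[::-1] if (zigzag and i % 2 == 1) else p
               for i, p in enumerate(profiles)]
    return np.stack(aligned, axis=0)

def time_slice(cube, sample_interval_ns, t_start_ns, t_end_ns):
    """Mean absolute amplitude per grid cell within a two-way time window."""
    i0 = int(round(t_start_ns / sample_interval_ns))
    i1 = int(round(t_end_ns / sample_interval_ns))
    if i1 <= i0:
        raise ValueError("time window must span at least one sample")
    return np.abs(cube[:, :, i0:i1]).mean(axis=2)

# Synthetic survey: 5 lines, 8 traces per line, 100 samples at 0.2 ns.
profiles = [np.zeros((8, 100)) for _ in range(5)]
profiles[2][3, 40:45] = 1.0  # hypothetical reflector between 8 and 9 ns

cube = build_cube(profiles)
slice_8_9 = time_slice(cube, 0.2, 8.0, 9.0)
peak = tuple(int(i) for i in np.unravel_index(np.argmax(slice_8_9), slice_8_9.shape))
print("strongest response at (line, trace):", peak)
```

The resulting 2D array is a plan-view map at one depth range; repeating the extraction for successive time windows gives the stack of horizontal sections referred to above.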
Since the inspection of historic buildings, which is crucial for maintenance and damage repair, must in many cases be minimally invasive, some common and valuable techniques cannot be applied, and GPR has acquired great importance as a technique for revealing both historical and structural information [4,[192][193][194][195][196][197][198][199][200][201][202][203][204]. In particular, some issues of structural interest are the probable presence of fractures [205][206][207], voids [208], infiltrations of humidity [209,210], or metallic bars [211] due to previous restoration works, which are often not sufficiently documented. GPR evaluation is well advised, especially if new restoration interventions are planned [212,213]. Furthermore, nondestructive surveys by GPR can provide evidence for planning the restorations appropriately and make it possible to verify the success of the restoration works through post-intervention monitoring. Some topics of historical interest that can be addressed through GPR surveys are the presence of hidden rooms, floors, mosaics, and frescoes [214]. The changes that a structure has undergone through the centuries are often not adequately documented, or the documents have been lost. Nevertheless, a retrieved buried target can be of both historical and structural value, as, for example, in the case of a hidden crypt under a church.

Active Elastic Wave Techniques
Sensing methods based on the propagation of elastic waves have been increasingly employed for the non-destructive inspection of historic structures and materials. These techniques rely on the principle that wave propagation velocity (WPV) is associated with measurable parameters of the material through which the wave travels (density, elastic modulus, Poisson's ratio) [226]. The presence of damage (voids, cracks, defects) changes the material's physical properties and affects propagation; therefore, WPV measurements can be used for defect detection and quality control of materials [227]. The WPV is estimated by measuring the time that an elastic wave needs to travel from an emission point to another point located at the boundary surface of the material under examination. A receiver probe records the arrival of a pulse wave originating from the excitation source at the emission point. The precise and accurate measurement of the propagation time is a key factor when implementing active elastic wave techniques. Depending on the frequency content of the excitation pulse, WPV can be determined in the sonic or ultrasonic range [228]. Sonic and ultrasonic sensing techniques have many advantages compared with traditional invasive methods; however, a great number of parameters influence the correct calculation of WPV. Roughness and defects of the historical surface under examination, as well as small subsurface cavities, affect the results because the short wavelength of the pulse prevents it from passing even through very small voids between the surface and the receiver probe. Lack of knowledge about the distribution and heterogeneous physical properties of materials, especially for structures such as historic masonry, further complicates the interpretation of results, as does water content [180,229,230].
Other limitations include operational costs due to the high number of required measurements, calibration needs for different materials, and the complexity of data elaboration caused by structural inhomogeneities [231,232].
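The relation between WPV and material parameters can be made concrete with a short sketch. The first function is the direct-transmission estimate (path length divided by transit time); the second uses the textbook expression for the longitudinal wave velocity in a homogeneous, isotropic, linear-elastic solid. Historic masonry is neither homogeneous nor isotropic, so the second function should be read only as an order-of-magnitude guide, and the numerical values below are assumed for illustration.

```python
import math

def wave_velocity_m_s(path_length_m: float, transit_time_us: float) -> float:
    """WPV from a direct measurement: path length over transit time."""
    return path_length_m / (transit_time_us * 1e-6)

def p_wave_velocity_m_s(elastic_modulus_pa: float, density_kg_m3: float,
                        poisson_ratio: float) -> float:
    """Longitudinal wave velocity in a homogeneous isotropic elastic solid."""
    numerator = elastic_modulus_pa * (1.0 - poisson_ratio)
    denominator = (density_kg_m3 * (1.0 + poisson_ratio)
                   * (1.0 - 2.0 * poisson_ratio))
    return math.sqrt(numerator / denominator)

# Assumed example: a 0.5 m thick wall crossed in 200 microseconds.
measured = wave_velocity_m_s(0.5, 200.0)

# Assumed material parameters: E = 20 GPa, density = 2000 kg/m^3, nu = 0.2.
expected_sound = p_wave_velocity_m_s(20e9, 2000.0, 0.2)

# A measured value well below the value expected for sound material
# may point to voids, cracks, or moisture-related alteration.
velocity_ratio = measured / expected_sound
```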
Sonic and ultrasonic techniques applied for inspection purposes use three acquisition methodologies, which differ in the arrangement of the excitation source and the detector: (1) direct, in which the emission source (often a hammer) and the receiver are placed in line on opposite sides of the surveyed element; (2) semi-direct, in which the emission source and the receiver are placed at an angle to each other; and (3) indirect, in which the emission source and the receiver are both located on the same face of the element in a vertical or horizontal line [233]. Commonly, surveys of building and infrastructure elements are carried out with a 2D array, for example, along longitudinal or transverse sections of columns and walls. Sometimes several 2D tomographic profiles are combined to construct a 3D model. Full 3D sonic or ultrasonic tomographies are used especially for the internal study of pillars and columns [234]. Many examples of built heritage inspection through sonic and ultrasonic sensing can be found in the recent literature [207,229,[235][236][237][238][239][240][241][242][243]].

Data Fusion
As a general multidisciplinary approach, the term data fusion implies the integration of data from different sources to enhance their potential value and interpretability and to allow the generation of high-quality visual representations. Sensor fusion, data integration, and information fusion are similar terms often referring to the same concept. In the framework of this review, however, sensor fusion refers only to approaches that employ simultaneous data acquisition with multi-sensor configurations, to distinguish them from data fusion approaches performed at a post-acquisition processing stage.
Multisource data fusion techniques are beneficial for built heritage condition monitoring. They significantly improve holistic documentation, enhance the properties of recorded data, enable integrated analysis, support long-term inspection, and minimize misinterpretations by allowing the multidisciplinary information to be cross-examined. Data fusion approaches are most often categorized depending on the stage of data processing at which fusion occurs [244]. Ramos and Remondino [245] proposed an expanded classification of data fusion procedures regarding different aspects. As concerns cultural heritage applications, there are generally no incremental approaches for fusing heterogeneous data for inspection and monitoring. However, the registration of metric data with other metric or qualitative information requires a common reference system with known parameters, in which their spatial integration can take place. Additionally, the integration of data from different sources at the pixel level requires images (or orthoimage-mosaics) that represent the same plane and have the same sampling distance.
The task of integrating multi-sensor data depends on aspects such as the spatial and radiometric resolution, positional accuracy, and dimensionality of the fusion [246]. Integrative 3D modeling approaches, especially those involving TLS and IBM, are a widely discussed topic of data fusion, and follow different pipelines [247]. However, the fusion between (geo)metric data and non-metric data is less often debated. Besides, close-range sensing data purposed for nondestructive monitoring are seldom recorded in a metric manner. Their inherent heterogeneity, a result of sensing at multiple wavelengths with vastly different instrumentation, hinders the integration processes.
The near-infrared spectral images' similarities with color images often allow for the high-resolution texturing of historical structures' 3D representations or the direct implementation of IBM-driven processing, thus facilitating integration with other metric data sources. The difficulties of integrating thermograms with metric data stem from their vast dissimilarities compared with visible-spectrum images and concern both their spatial (low resolution) and radiometric (different observable features) characteristics. Methodologies for the fusion of thermographic and geometric data depend on sensor registration (optical camera with thermal camera/laser scanner with thermal camera), product registration (thermogram with point cloud/thermogram with 3D model), or hybrid photogrammetric techniques. The choice among the above data fusion techniques largely depends on the scale of the survey and the available equipment; each produces thermal-textured 3D point clouds or models. Furthermore, data collected with subsurface inspection methods using radar, ultrasonic, and sonic techniques can also be integrated when the position of the utilized antennae is estimated or tracked, thus allowing referencing into a given coordinate system; however, this type of fusion refers mainly to information visualization and not integrated analysis.

Integration between Photogrammetric and Ranging Techniques
The primary goal of heritage geometric recording is the generation of complete, accurate, and photorealistic 3D representations and 2D metric derivatives, such as orthoimage-mosaics and vector drawings. As discussed in Sections 2.2 and 2.3, there is a wide range of advanced active and passive sensors and sensing techniques for geometric recording, producing different data. Integrating IBM and TLS is the standard approach for modeling ancient and historical structures and ensures that predefined specifications for density, accuracy, and texture resolution are met [56, [248][249][250]. Fusion approaches introduced for multisource point cloud (3D-to-3D) registration (and subsequent coloring) are: manual annotation of common features [251], iterative closest point (ICP) [252,253], feature-based [254], and georeferencing-based [35]. The rapid increase in the implementation of unmanned aerial systems (UAS) for cultural heritage IBM has recently introduced fascinating integrative approaches at the convergence of TLS and low-altitude aerial photogrammetry [255][256][257][258][259] (see Section 4.1).
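For 3D-to-3D registration from annotated common features, the rigid transformation between two point clouds can be estimated in closed form once corresponding points are known. The sketch below uses the singular value decomposition solution (often called the Kabsch method), which is also the alignment step repeated inside each ICP iteration. It is not a full ICP implementation, it assumes at least three non-collinear correspondences without gross errors, and the function name is hypothetical.

```python
import numpy as np

def rigid_transform_from_pairs(src, dst):
    """Least-squares rotation R and translation t such that
    dst is approximately src @ R.T + t, for (N, 3) arrays of
    corresponding points."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    src_centroid = src.mean(axis=0)
    dst_centroid = dst.mean(axis=0)
    # Cross-covariance of the centred point sets.
    H = (src - src_centroid).T @ (dst - dst_centroid)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection (determinant of -1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_centroid - R @ src_centroid
    return R, t

def apply_transform(points, R, t):
    """Transform an (N, 3) point array into the target coordinate system."""
    return np.asarray(points, dtype=float) @ R.T + t
```

In practice the residuals at the common features, after applying the estimated transformation, give a first indication of the registration accuracy.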

Multispectral Data
Regarding multi-sensor recording approaches and some designs of integrated devices, the images from different spectral channels typically need to be shifted or spatially re-scaled to be registered (2D-to-2D fusion) and form aligned multispectral image cubes. There are several algorithms associated with image registration [260]. Image registration involving linear shifts is relatively easy to compute and can be performed with a simple cross-correlation algorithm. Spatial image scaling that involves re-sampling can result in some loss of information; it is therefore best to design the system's optics to avoid scaling of the images. High-resolution imaging of large objects inevitably involves mosaicking of the images, and adjacent images need to be taken with sufficient overlap to allow automatic image registration. Regarding cultural heritage applications, registering images collected in different spectra has often been addressed through the manual identification of common features [261].
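Registration by cross-correlation for purely linear shifts can be sketched as follows. The example computes a circular cross-correlation through the FFT and reads the integer shift from the position of the correlation peak. It assumes two single-band images of identical size that differ only by a translation; sub-pixel refinement, scaling, and rotation are not handled, and for bands with very different contrast the peak may be weak. The function name is hypothetical.

```python
import numpy as np

def estimate_shift(reference, moving):
    """Integer (row, col) shift that, applied to `moving` with np.roll,
    aligns it with `reference` (circular cross-correlation via FFT)."""
    f_ref = np.fft.fft2(reference - reference.mean())
    f_mov = np.fft.fft2(moving - moving.mean())
    correlation = np.fft.ifft2(f_ref * np.conj(f_mov)).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)
    # Peaks beyond half the image size correspond to negative shifts.
    shift = [p if p <= n // 2 else p - n
             for p, n in zip(peak, correlation.shape)]
    return tuple(int(s) for s in shift)

def align(moving, shift):
    """Apply the estimated shift to the moving image."""
    return np.roll(moving, shift, axis=(0, 1))
```

Because the correlation is circular, the overlap between adjacent images must be large relative to the expected shift for the peak to be reliable.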
One prevalent form of multispectral data fusion for built heritage monitoring involves multi-sensor acquisition. IBM- and TLS-produced ortho-mosaics referenced in the same coordinate system can be treated as multispectral images. Sánchez-Aparicio et al. [262], Del Pozo et al. [263], and Conde et al. [264] experimented with fusing data from laser scanners operating at different wavelengths, multispectral cameras, and commercial digital cameras to produce multispectral ortho-mosaics for detecting pathologies in constructions.

Thermographic Data
Registration between orthorectified thermal-infrared and color images/image-mosaics is usually performed for cultural heritage applications by manually identifying common points distinguishable in both spectra (characteristic points, or, more commonly, special targets with different reflectance characteristics) to define the necessary transformation [265,266]. However, the most frequently applied approach for the fusion of thermal and metric data of historic architectures has been the integration of thermograms and 3D measurements collected with individual proximal sensing techniques. This process frequently implies registering thermographic images with dense point clouds (or derivative 3D models) captured by TLS, which contain the metric spatial information. This approach is considered the most cost-effective, especially when complete thermographic mapping of a historic structure or structural element is required. The geometric relation between a point cloud/3D model and a thermogram (the relative position and the orientation matrix) is estimated by defining common features, which allows the thermal intensities to be projected to create a thermal texture.
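Once the relative position and orientation matrix are known, projecting thermal intensities onto a point cloud reduces to a camera projection followed by a pixel lookup. The sketch below assumes an ideal pinhole model with a known intrinsic matrix K, a rotation R and translation t that map object coordinates into the thermal camera frame, nearest-pixel sampling, and no lens distortion or occlusion testing. All names are hypothetical, and a real workflow would add a visibility check so that hidden points do not receive a temperature.

```python
import numpy as np

def sample_thermal(points_xyz, thermogram, R, t, K):
    """Assign each 3D point the thermogram value of the pixel it projects to.
    Points behind the camera or outside the image receive NaN."""
    points_xyz = np.asarray(points_xyz, dtype=float)
    height, width = thermogram.shape
    values = np.full(len(points_xyz), np.nan)

    # Object coordinates -> thermal camera coordinates.
    cam = points_xyz @ R.T + t
    in_front = cam[:, 2] > 0

    # Pinhole projection of the points in front of the camera.
    uvw = cam[in_front] @ K.T
    cols = np.rint(uvw[:, 0] / uvw[:, 2]).astype(int)
    rows = np.rint(uvw[:, 1] / uvw[:, 2]).astype(int)

    inside = (rows >= 0) & (rows < height) & (cols >= 0) & (cols < width)
    target = np.flatnonzero(in_front)[inside]
    values[target] = thermogram[rows[inside], cols[inside]]
    return values
```

The returned array can be stored as an additional scalar field of the point cloud, which is one simple way to obtain a thermal-textured result.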
Due to the limitations imposed by the low spatial resolution of thermal infrared images [267], research on thermographic modeling for built cultural heritage has largely concentrated on two kinds of pipelines: those that reconstruct the 3D shape from color image datasets and apply the texture from registered thermal infrared image datasets, and hybrid workflows that apply photogrammetric principles to both color and thermal infrared image datasets but use only the latter for texturing [268]. González-Aguilera et al. [269], Dlesk et al. [270], and Patrucco et al. [271] performed image-based modeling using thermal infrared images captured with NEC TH9260, FLIR E95, and FLIR SC660 thermal cameras, respectively, to digitally reconstruct and inspect built heritage. Recent approaches take advantage of the thermal and color image sensors integrated into the thermographic camera [272][273][274][275].
The first approaches for thermal texturing via 2D-to-3D registration were developed on a manual basis. This method has been implemented by Spanò et al. [276], Zalama et al. [277], Costanzo et al. [278], and Mileto et al. [279] to investigate the pathology of historical buildings. However, since the correspondences between thermal images and 3D metric products may not always be visible, approaches have been recently developed to perform this 2D-to-3D registration automatically using features extracted from point clouds and thermal images [280,281].
Methodologies for simultaneous recording of high-density geometric and thermographic data have also been established, facilitating massive and more agile thermographic 3D modeling. Custom-made multi-sensor equipment has often been employed with this aim, necessitating the registration between the different sensors used during acquisition. The sensor registration parameters consist of a vector of the differences in the sensors' relative position and the rotation angles between them, and they are necessary to transform and integrate data into the same coordinate system [282][283][284][285][286][287][288]. Overall, sensor registration that includes thermographic cameras is uncommon because of the requirements of thermal measurements regarding the angle and distance of acquisition [289].
The combined application of IRT and GPR can be helpful for the detection and characterization of moisture [290,291], but the produced results (thermograms and 2D slices) are not integrated, because they represent different planes of the investigated structures. Although some research has been produced towards the pixel-level fusion of IRT and GPR for concrete bridge inspection [292], the application for built heritage poses many challenges due to the heterogeneity of historic materials, and their constant degradation.

Radar, Ultrasonic, and Sonic Data
Compared with thermographic data, GPR data are more challenging to interpret; they also have lower spatial resolution than TLS and close-range photogrammetry and, thus, are usually acquired and used independently [293,294]. The expected level of fusion between geometric and geophysical data for architectural heritage nondestructive investigations is frequently the registration of GPR slices, or surfaces interpolated from 3D grid-organized GPR measurements, with metric products computed with reality capture methods [295,296]. When surfaces of historical structures with relatively flat geometries are investigated, the integration in 3D space is, according to the literature, achieved by measuring the 3D position of control points (usually the start and end points) of the scan lines [297][298][299][300]. Apart from registration, the availability of a dense geometric 3D model or point cloud can also assist the spatial correction of GPR data collected for structures with more complex geometries, the most common example being historic bridges [301][302][303][304][305][306][307][308][309]. Penetrating radar exploration of the columns of historic structures may require only a simplified knowledge of the geometrical shapes [310,311]. The integration of positioning systems, laser scanning, and GPR presents exciting potential for integrated surface and subsurface mapping but is subject to limitations stemming from the multi-sensor approach [312].
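For relatively flat surfaces, referencing a scan line through its start and end control points amounts to linear interpolation of the trace positions, after which a reflector picked at a given depth can be placed along the inward surface normal. The sketch below assumes evenly spaced traces, a straight scan line, and a planar surface with a known inward normal; the function names are hypothetical, and curved or irregular surfaces would instead require the dense 3D model mentioned above.

```python
import numpy as np

def trace_positions(start_xyz, end_xyz, n_traces):
    """3D position of each trace, interpolated linearly between the
    surveyed start and end points of a straight scan line."""
    start = np.asarray(start_xyz, dtype=float)
    end = np.asarray(end_xyz, dtype=float)
    fractions = np.linspace(0.0, 1.0, n_traces)[:, None]
    return start + fractions * (end - start)

def reflector_positions(trace_xyz, inward_normal, depth_m):
    """Move trace positions into the structure by the picked depth(s)
    along the inward normal of a planar surface."""
    normal = np.asarray(inward_normal, dtype=float)
    normal = normal / np.linalg.norm(normal)
    depth = np.atleast_1d(np.asarray(depth_m, dtype=float))[:, None]
    return np.asarray(trace_xyz, dtype=float) + depth * normal
```

The resulting coordinates share the reference system of the control points, so they can be displayed together with a TLS or photogrammetric model of the same surface.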
Ultrasonic and sonic sensing, likewise, have lower spatial resolution than TLS and close-range photogrammetry. However, the registration of ultrasonic/sonic measurements with color-textured 3D models and backscattered laser intensity-textured 3D models, from photogrammetry and TLS, respectively, can benefit the interpretation of shallow and inner anomalies of columns and pillars [313,314]. At the same time, referencing longitudinal wave velocity maps or tomographic slices supports a better understanding of a historical asset's structural condition in three dimensions, especially when they are integrated with deformation maps [236,237].

Conclusions and Outlooks
To facilitate cohesive monitoring of built heritage, it is necessary to identify the advantages and limitations of each close-range sensing evaluation technique and of their integrated use. Table 4 presents a brief comparison of close-range sensing techniques for built heritage inspection and monitoring based on the reviewed recent literature.
Laser scanning and IBM satisfy the 3D modeling needs for inspection, multitemporal deformation monitoring, numerical modeling, and building information modeling (BIM) [315,316], and enable surface feature extraction regarding deterioration and physical defects. The integration between them emphasizes the complementarity of geometric and color information. These techniques employ mobile instrumentation, are easily adapted for complicated acquisition scenarios, and can reach millimetric accuracy for the extracted features; however, they cannot provide any subsurface information. Conditionally, TLS can be applied for surface moisture detection, subject to calibration and knowledge of the material's emissivity at the laser instrument's operating band. IRT evaluation is appropriate for surface detection and feature extraction of defects and moisture, but is less mobile than IBM/TLS and requires knowledge of the ambient and material influence on LWIR radiance measurements. The integration with metric surveying allows the extracted features to be quantified and correlated in 3D space, in order to address potential sources of moisture and subsurface radiant sources and to calculate building envelopes for sustainable conservation. In addition, the resolution of thermographic results can be significantly improved through pan sharpening, super-resolution enhancement, or hybrid color-thermal IBM.
Near-infrared and multispectral imaging offer solutions for surface pattern extraction concerning weathering, moisture, and biological colonization, providing higher thematic accuracy of the extracted features compared with color imaging and IBM. In particular, their combination with learning-based digital image segmentation results in rapid mapping of the surface condition. However, the implementation of multi-sensor instrumentation poses challenges due to increased cost, reduced mobility, and calibration needs.
GPR is one of the most promising monitoring technologies because it can identify material depth and locate discontinuities between materials on the basis of their different dielectric properties. The fusion of GPR measurements with geometric data enables spatial correction for structures of complex geometry and, at the same time, facilitates better 3D visualization of the prospection results and increases the accuracy of locating material discontinuities and defects in 3D. Furthermore, the integration of 3D modeling, GPR, and ultrasonic/sonic techniques supports truthful numerical modeling and parametrization for structural health analysis.
In the sense of pixel-level fusion, data integration for built heritage is scarcely applied; where it is, multitemporal or multispectral images are quantized to improve interpretation, using clustering classification or principal component analysis. On the other hand, integrated management of nondestructively recorded data through a geographic information system (GIS) is a more common approach that allows geo-processing analysis for thematic pathology representation. The integration of nondestructive evaluation and BIM is a novel concept that aims to facilitate management, support the restoration and rehabilitation process, enhance historical research, and promote building sustainability in an integrated way [317][318][319][320].
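Pixel-level fusion through principal component analysis can be sketched for a stack of co-registered bands or epochs. The example below treats every pixel as a vector of band values, decorrelates the bands through the eigenvectors of their covariance matrix, and returns the component images ordered by explained variance. It assumes at least two bands that are already registered to the same plane and sampling distance, and it is a generic illustration rather than the procedure of any cited study.

```python
import numpy as np

def pca_fuse(stack):
    """Principal components of a co-registered image stack.

    stack: array of shape (bands, rows, cols), with bands >= 2.
    Returns (components, explained_variance_ratio), where components
    has the same shape as the input and is ordered by variance."""
    stack = np.asarray(stack, dtype=float)
    bands, rows, cols = stack.shape
    # One row per pixel, one column per band, centred per band.
    pixels = stack.reshape(bands, -1).T
    pixels = pixels - pixels.mean(axis=0)
    covariance = np.cov(pixels, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
    scores = pixels @ eigenvectors
    components = scores.T.reshape(bands, rows, cols)
    return components, eigenvalues / eigenvalues.sum()
```

The first few component images usually concentrate most of the variance shared by the bands and can be passed to a clustering step, while the last components tend to contain noise.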

The Aerial Perspective
The recent substantial progress of uncrewed platform manufacturing, the increasing miniaturization of sensing payloads, and the decreasing cost of integrated microelectronics have gradually fostered the adoption of UAS for low-altitude aerial inspection of historic structures [111,259,295]. State-of-the-art UAS-borne sensors include orientation systems, color cameras, multispectral cameras, hyperspectral cameras, infrared thermography cameras, action cameras, and LiDAR instruments [321]. Non-destructive recording that utilizes UAS platforms is considerably more cost-effective than traditional data acquisition methods for covering large historic building complexes (and their surroundings) and inaccessible areas of historical significance [322].
UAS-based close-range photogrammetry is a proven stand-alone approach to 3D modeling of built heritage [323][324][325][326][327][328][329]. Nevertheless, it can also support ground-based means such as terrestrial LiDAR [330][331][332][333] and simultaneous localization and mapping (SLAM)-based recording techniques [334,335]. UAS-borne LiDAR is a 3D recording approach less explored for historical structures [336] than for landscape features induced by buried archaeological remains. Planning of the aerial surveys [258,337] is not the only important parameter for structural inspection, as the optimization of recorded large-volume data [338,339] can be considered equally essential. Furthermore, the automatic segmentation and classification of derived point clouds and 3D models [340][341][342] is particularly important for the semantic description and virtual reassembly of historic structures, especially for BIM integration [343], and for web and augmented/mixed reality applications [344]. UAS-based photogrammetry has also been implemented for assessing damage in post-disaster scenarios through multitemporal 3D modeling [345][346][347], while the classification of images and point clouds collected with UAS sensors can also support structural health monitoring [348,349]. To conclude, the use of UAS presents an exciting prospect for monitoring damaged built heritage by supporting terrestrial non-destructive surveys via high-resolution modeling and multispectral mapping [162], and by providing the necessary input for rapid inspections and numerical modeling [350].