1. Introduction
Picture a historical building as an enigmatic puzzle, each crack, deformation, and hidden corrosion forming the cryptic pieces waiting to be deciphered. As we stand at the intersection of technology and architectural history, our pursuit is to decode this intricate puzzle. Multisensor data fusion emerges as our toolkit, seamlessly blending the precision of laser scanners with the spectral insight of photographic cameras. Here, we introduce a revolutionary approach, not just to study buildings, but to unravel the mysteries concealed within their weathered walls.
The state of the art in multisensor data fusion for the study of building pathologies has advanced significantly in recent decades. The convergence of various technologies and sensors has enabled researchers to assess structural conditions and pathologies in buildings more completely and accurately [1,2].
Studies of potential issues in historical buildings employing geomatic sensor technology have attracted considerable attention in recent years [3,4]. Early approaches were devoted solely to generating three-dimensional models with different levels of detail. These three-dimensional models were complemented by spectral information so that visual representations could be made to highlight areas that could present some pathology [5,6].
Various studies have focused on specific architectural elements and their pathologies, for example, façades [7,8,9] or concrete structural elements [10,11,12]. The data obtained were sometimes integrated into Building Information Models (BIM) for further study [13,14].
In turn, different research works have concentrated on the use of individual sensors, including infrared thermal cameras [15,16,17,18,19], multispectral photographic cameras [20,21], and laser scanners [22,23,24].
The combined use of active and passive sensors, such as terrestrial laser scanners and photographic cameras with different spectral sensitivities, has provided a wealth of information. This allows the generation of detailed three-dimensional models, point clouds with spectral information, and visual representations highlighting the affected areas. This combination of different sensor types provides a complete view of the structure and its potential pathologies. Terrestrial laser scanners capture highly detailed three-dimensional point clouds, thereby facilitating the detection of structural deformations and cracks [25,26]. Photographic cameras provide valuable spectral information that can indicate signs of corrosion, moisture, or other problems that may not be visible to the naked eye.
The use of 3D point clouds involves managing an immense volume of data, which requires substantial data-processing infrastructure for handling and visualization. Additionally, 3D point clouds are inherently unstructured, making the task of locating specific points within these clouds nontrivial [27]. One consequence of this lack of structure is that point clouds do not contain topological information, thereby precluding an understanding of neighborhood relationships among points [28].
Point clouds also exhibit non-uniform densities and distributions. The point density achieved is not always optimal for the intended applications. Some areas will display redundant information, particularly in the overlapping zones between scans. Conversely, other areas will present lower point densities. This results in heterogeneity in both the density and accuracy of points within the cloud [29].
In terms of analysis, multisensor data fusion has enabled the early detection and monitoring of structural pathologies, such as cracks, deformations, and corrosion. In addition, the integration of information from airborne and ground sensors has enabled more complete coverage of structures, both in terms of physical access and data diversity. Traditionally, the assessment of structural problems has involved visual inspections, manual measurements, and, at best, the use of a single sensor capturing limited data [30]. Modern data fusion systems allow the creation of accurate three-dimensional digital models that can be compared with the original designs to identify discrepancies. This is especially valuable for assessing the performance of historic structures and monuments where preservation is critical [31].
The integration of various sensors and techniques also leads to the creation of diverse point cloud datasets, which in turn contributes to the formation of large data repositories [29]. This not only affects processing speed but also requires the conversion of substantial volumes of point data into dependable and actionable insights. Traditional methods for customizing point clouds for specific applications are becoming increasingly time-consuming and require manual intervention. The growing complexity and volume of data, often spread across multiple stakeholders or platforms, pose a challenge to human expertise in effective data management [32]. To facilitate more effective decision making, it is crucial to efficiently convert voluminous point cloud data into streamlined processes, thereby heralding a new age of decision-support services. Strategies must be developed for extensive automation and structuring to eliminate the need for task-specific manual processing and to promote sustainable collaboration.
Voxels have been used in a variety of distinct fields, including geology [33], forest inventory [34], and medical research [35]. In the context of their application to point cloud management, multiple studies [36,37,38,39,40] have substantiated the efficacy of voxels as an appropriate tool for handling point cloud data.
Voxelization is particularly beneficial for managing raw 3D point clouds. With the points assigned to a regular grid pattern, the point cloud can be structured in a tree format that significantly reduces computing time. Logical inference is also possible from a voxel structure, because the known relationships with neighboring elements allow for semantic reasoning [41].
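To make this concrete, the following minimal Python sketch (our own illustration, not code from the study) shows how points map to integer voxel indices on a regular grid; these indices make neighborhood queries trivial, since adjacent voxels differ by one along a single axis. The point coordinates and the 5 cm voxel size are hypothetical.

```python
import numpy as np

def voxel_indices(points, origin, voxel_size):
    """Map each 3D point to the integer (i, j, k) index of its enclosing voxel."""
    return np.floor((points - origin) / voxel_size).astype(int)

# Two points falling into horizontally adjacent 5 cm voxels
pts = np.array([[0.02, 0.11, 0.04],
                [0.07, 0.11, 0.04]])
print(voxel_indices(pts, origin=np.zeros(3), voxel_size=0.05))
# [[0 2 0]
#  [1 2 0]]  -> neighbors along the x axis
```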
In voxelization of point clouds, one defining parameter of the voxel structure is the elemental voxel size. This parameter governs the resolution of the phenomena under study through the data structure, impacting various applications, such as finite elements [42], structural analyses [43,44], and dynamic phenomena [45]. For instance, if a study or simulation requires a resolution of 5 cm, this would dictate the voxel elemental size established during the voxelization process. In addition, voxel size determines the extent of element reduction relative to the number of points present in the original clouds [29].
The application of deep learning algorithms to voxel structures is an active research area across various fields, such as computer vision, robotics, geography, and medicine. Within the spectrum of deep learning approaches applied to voxels, notable methodologies such as 3D Convolutional Neural Networks (3D CNNs) [46,47], voxel-based autoencoders [48,49], generative networks for voxels [50,51,52], and semantic segmentation [28,53] have been used.
However, despite these advances, challenges still exist in terms of accurate calibration, data correction, and effective integration of the collected information. Additionally, the interpretation of merged data requires expertise in remote sensing and structural analysis to avoid false diagnoses. Accurate integration of data from different sources requires careful calibration and correction [54]. In addition, proper interpretation of results remains crucial, as the presence of detailed data does not always guarantee a complete understanding of the underlying causes of pathologies.
In this paper, we present a novel approach to multisensor data fusion using multispectral voxels. This data fusion allows for optimal and efficient management of the sensor information, enabling the later application of deep learning algorithms. With this, a workflow is established that allows the study of buildings as well as the possible pathologies present in them.
This study is organized as follows. First, the sensors used in a data acquisition campaign on a prominent Spanish Cultural Heritage building are described. Next, the methodology designed for handling, processing, and fusing the data using our concept of multispectral voxels is outlined, and the Self-Organizing Map deep learning algorithm is applied to our multispectral voxel structure. Finally, we discuss the results.
2. Materials and Methods
2.1. Sensors
Within the context of multisensor fusion, a data-capture campaign was designed around a range of sensors of distinct types and spectral sensitivities. To this end, both active and passive sensors were used. The set of active sensors comprises terrestrial laser scanners, whereas the passive sensor group encompasses a variety of photographic cameras deployed in both terrestrial and aerial arrangements using Unmanned Aerial Vehicles (UAVs). Each camera had distinct spectral sensitivities.
This meticulous selection of sensors encompasses various conventional modalities of geospatial data capture, which are commonly employed in the analysis of buildings and architectural structures. The acquired data were subjected to a series of processing techniques, with photogrammetry as one of the principal methodologies. Consequently, the final output consisted of multiple point cloud datasets, each characterized by unique and significant spectral properties.
While developing the data capture strategy, it was acknowledged that not all sensors were capable of capturing data encompassing the entire built structure. Sensors located on the ground can only provide information about the lower sections of the building, while sensors attached to aerial vehicles can capture data covering the entire architectural unit. This distinction is essential for accurate interpretation and cohesive integration of the collected data during the analytical process.
The employed sensors, including photographic cameras, UAV-mounted sensors, and laser scanners, are described below:
The terrestrial digital imaging device deployed was a Sony NEX-7 photographic camera (Sony Corporation, Tokyo, Japan), a model equipped with a 19 mm fixed-length optical lens. This particular sensor captures spectral components within the red (590 nm), green (520 nm), and blue (460 nm) bands of the visible spectrum. The spectral response of this camera is illustrated in Figure 1a, and its specific parameters are outlined in Table 1.
Table 1. Sony NEX-7 camera sensor parameters.

| Parameter | Value |
|---|---|
| Sensor | APS-C type CMOS sensor |
| Focal length (mm) | 19 |
| Sensor width (mm) | 23.5 |
| Sensor length (mm) | 15.6 |
| Effective pixels (megapixels) | 24.3 |
| Pixel size (micrometers) | 3.92 |
| ISO sensitivity range | 100–1600 |
| Image format | RAW (Sony ARW 2.3 format) |
| Weight (g) | 350 |
A modified version of the digital image camera, the Sony NEX-5N model, was also employed in terrestrial positions. It was outfitted with a 16 mm fixed-length lens, accompanied by a near-infrared (NIR) filter and an ultraviolet (UV) filter in different shot sessions. Detailed information on this sensor is provided in Table 2.
Table 2. Sony NEX-5N camera sensor parameters.

| Parameter | Value |
|---|---|
| Sensor | APS-C type CMOS sensor |
| Focal length (mm) | 16 |
| Sensor width (mm) | 23.5 |
| Sensor length (mm) | 15.6 |
| Effective pixels (megapixels) | 16.7 |
| Pixel size (micrometers) | 4.82 |
| ISO sensitivity range | 100–3200 |
| Image format | RAW (Sony ARW 2.2 format) |
| Weight (g) | 269 |
By removing the internal infrared filter from the Sony NEX-5N camera, the sensitivity of the sensor was enhanced to encompass specific regions of the electromagnetic spectrum, including the near-infrared (820 nm) and ultraviolet (390 nm) wavelengths. The spectral responsiveness of this modified camera is shown in Figure 1b. Notably, the modified sensor exhibits sensitivity to distinct segments of the electromagnetic spectrum in contrast to its unmodified counterpart, the Sony NEX-7, which retains the internal infrared filter (as illustrated in Figure 1a) [55]. This modified camera was equipped with a collection of filters to obtain data corresponding to the ultraviolet (UV) and near-infrared (NIR) spectral bands. The transmission curves of the filters employed during the data acquisition process are shown in Figure 2 and Figure 3.
The terrestrial laser scanner employed in this study was the Faro Focus S350, developed by Faro Technologies (Lake Mary, FL, USA). It has proven to be an invaluable tool for archaeological and building studies. Its phase-based laser scanning technology enables it to accurately measure distances and efficiently gather millions of data points in a short span, which is particularly relevant in archaeological studies for documenting historical sites and structures with high precision. The Faro Focus S350 laser beam wavelength is 1550 nm (Table 3). This allowed us to obtain building spectral information in the short-wavelength infrared (SWIR) band [56]. Concerning the data derived from the laser scanner, the registered return-signal intensity value was used, as its application has been demonstrated to be effective for buildings made of stone [22].
Figure 1. Spectral responses for Sony NEX cameras, from [55]: (a) Unmodified camera, normalized to the peak of the green channel. (b) Modified camera, normalized to the peak of the red channel.
Figure 2. Midopt DB 660/850 Dual Bandpass filter light transmission curve (Midwest Optical Systems, Inc., Palatine, IL, USA).
Figure 3. ZB2 filter light transmission curve, from Shijiazhuang Tangsinuo Optoelectronic Technology Co., Ltd., Shijiazhuang, Hebei, China.
For data collection in the upper regions of the building, where ground-based sensors lack access, an Unmanned Aerial Vehicle (UAV) was deployed. The chosen instrument was the Parrot Anafi Thermal (Table 4), which mounts a dual-camera system comprising an RGB sensor (for the visible spectrum) and an infrared thermal sensor. This configuration facilitates the simultaneous capture of conventional RGB and infrared thermal images within a single flight mission. Referring to the thermal infrared data, the Parrot Anafi Thermal incorporates a factory-calibrated uncooled microbolometer-type FLIR Lepton 3.5 thermal module [57], enabling the establishment of an absolute temperature for each image pixel. Consequently, precise surface temperatures of the building points were determined. This aids in discerning the discontinuities and potential pathologies resulting from the heterogeneity of construction materials [19].
2.2. Data Capture Campaign
For data acquisition, an emblematic building of Spanish historical heritage was selected: the Visigothic church of Santa Maria de Melque. Data from all the described sensors were collected directly on this 7th-century A.D. building [58], located in the province of Toledo, Spain. The archaeological complex of Santa Maria de Melque (N 39.750878°, W 4.372965°) lies approximately 30 km southwest of the city of Toledo, in close proximity to the Tagus River [59] (Figure 4).
The structure of the Visigothic church of Santa Maria de Melque was constructed using masonry of immense granite blocks assembled without mortar, distinguished by its barrel vault covering the central nave. The layout of the aisles in the form of a Greek cross, the straight apse, and the arrangement of architectural elements reveal both Roman and Byzantine influences, reflecting the rich cultural diversity of the period [60].
The data collection campaign was carried out only on the exterior of this building in February 2022. The campaign covered the entire architectural structure and its ornamental elements. This location was chosen for its historical and architectural relevance, which allowed us to obtain precise and detailed information on the current state of the building and its possible pathologies.
Figure 5 shows the studied building.
Prior to capturing the dataset, a meticulous preparatory phase was undertaken, involving the careful placement of a series of precision targets across the wall surfaces. These markers collectively formed a robust set of control points that served as pivotal anchors for the subsequent data collection. By ensuring uniformity and accuracy at these measurement points, a reliable common geometric reference system was established, effectively standardizing data acquisition across all collection sessions. This rigorous methodology not only underpinned the accuracy of the collected data but also facilitated seamless comparability and analysis of the collected visual information (Figure 6).
The Terrain Target Signals (TTS) were measured using GNSS techniques. These marks were materialized by a metal nail in the ground so that they can be revisited on future occasions. A flat black-and-white square sign measuring 30 cm on each side was then used to signal each mark (Figure 6).
For the observation, dual-frequency Topcon Hiper GNSS receivers were used, with calibrated antennas (GPS, GLONASS), a tripod, and a centering system over the point mark; a total of 12 TTS were measured, forming a network of 30 vectors. The observation time at each point was 15–20 min, with at least two observation sessions at each point using different receivers, thus configured as a relative static survey with high repeatability.
In the subsequent processing of GNSS data, to obtain greater precision in the results, the precise geodetic correction models of the ionosphere from the CODE (Center for Orbit Determination in Europe) and precise ephemeris from the IGS (International GNSS Service) for both constellations were downloaded. Along with these Topcon field GNSS data, data from continuous CORS (Continuously Operating Reference Stations) of the Spanish National Positioning System ERGNSS-IGN (Instituto Geográfico Nacional) were also processed. This was done in order to link the local GNSS measurements to a geodetic reference frame that guaranteed high precision stability and temporal permanence for future actions in the environment.
To compute the GNSS vectors, the Leica Infinity version 3.0.1 software was used, applying the absolute antenna calibration models and the VMF (Vienna Mapping Functions) [61]. Both for the observation and for the adjustment and compensation of the coordinates of all the TTS points that make up the network, the same methodology used in the implementation of geodetic precision control networks in engineering was followed [62]. The final adjustment of the network, the calculation of the coordinates, and the estimation of their accuracy (Table 5) were performed with Geolab version px5 software, with a complete constraint in the ETRS89 (ETRF2000) frame.
Wall Surface Signals (WSS) were placed around the entire building and attached to the walls. They are intended as tie points supporting the georeferencing process of all point clouds. The WSS were 12-bit coded precision targets (Figure 7); thus, each WSS can be clearly identified by photogrammetric software, almost automatically, as well as by operators.
In Figure 8, we show the distribution of targets on the terrain (TTS) and on the walls (WSS). The coordinates of the precision targets attached to the walls (Wall Surface Signals) are listed in Table A1.
2.3. Point Clouds
In the course of this research, a range of sensors were used for the collection of geomatic data across multiple spectral bands. Each dataset was processed with great care, resulting in the generation of several georeferenced three-dimensional point clouds. These point clouds offer a comprehensive and diverse view of the building under examination from various angles (as presented in Table 6). Each point cloud, except the laser scanner point cloud, comes from a photogrammetric process. The photogrammetric software used was Agisoft Metashape version 1.8.5. All derived photogrammetric point clouds are the result of processing a single sensor in a single process; only for the RGB point cloud were the unmodified terrestrial RGB camera and UAV RGB image datasets combined in the same process.
Figure 9 illustrates a subset of the point cloud views resulting from the use of the aforementioned sensors. First, Figure 9a presents a visualization of the point cloud obtained via RGB sensors using both conventional terrestrial and aerial (UAV) cameras. This representation unveils architectural details as well as features perceptible to the naked eye, furnishing a meticulous view of the surface of the structure. This point cloud was selected as the starting point for the development of the voxelized data structure of the building, which is subjected to analysis in subsequent stages. Figure 9b shows the point cloud corresponding to the UV spectral band. This view allowed us to identify elements that are typically imperceptible within the visible range. Figure 9b also shows that the point cloud does not cover the entire building in this band. Finally, Figure 9c,d depict the point cloud generated by our infrared thermal sensor. These visual representations highlight the heat distribution on the building surface, endowing us with an exceptional perspective on potential issues related to heat retention and the variability of construction materials.
Beyond the previously outlined difficulties concerning data structure, heterogeneity, and inconsistent density, point clouds generated by multiple sensors emanate from disparate data origins and convey unique informational facets. Consequently, it becomes essential to formulate strategies for the fusion of both geometric and spectral elements with the aim of optimizing data processing and extracting valuable conclusions from the varied information contained within.
2.4. Voxelization
To facilitate an optimal analysis, it is necessary to fuse the numerous point clouds that have been generated. To accomplish this, we have employed voxels as the data structure, specifically the multispectral voxel variant developed in our previous research [29].
Voxels, abbreviated from “volumetric elements”, serve as the fundamental abstract units in three-dimensional space, each having predetermined volume, positional coordinates, and attributes [63]. They offer significant utility in providing topologically explicit representations of point clouds, thereby augmenting the informational content.
In this study, we examined how the selected elemental voxel size affects the point distribution within the point clouds. Voxel structures with various elemental sizes (50, 25, 10, 5, and 3 cm) were created. To manage the point clouds and their voxelization, we used the Open3D open-source library [64]. We used the RGB point cloud as the initial dataset to create the voxel structures, given its comprehensive coverage of the entire building. After establishing voxel structures based on these sizes, we located the points from each respective cloud enclosed within individual voxels. In instances where one specific voxel enclosed multiple points from a particular spectral band, the voxel's spectral property was set to the mean value of those enclosed points. Additional statistical metrics, such as the maximum, minimum, and variance values, were also calculated. Given that the voxel adopts the spectral attributes of the enveloped points, it effectively becomes a multispectral voxel [29]. The concept of a multispectral voxel used in this work is illustrated in Figure 10.
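The aggregation step just described can be summarized in a short sketch. The following Python fragment is our own illustration under stated assumptions (hypothetical array names; NumPy used for the per-band statistics, whereas the study itself managed voxel grids with Open3D): each band's points are bucketed by voxel index, and the voxel stores the mean, minimum, maximum, and variance of the enclosed values.

```python
import numpy as np
from collections import defaultdict

def multispectral_voxels(band_clouds, origin, voxel_size):
    """Fuse per-band point clouds into multispectral voxels.

    band_clouds maps a band name to (xyz, values), where xyz has shape (N, 3)
    and values has shape (N,). Returns {voxel_index: {band: statistics}}.
    """
    voxels = defaultdict(dict)
    for band, (xyz, values) in band_clouds.items():
        idx = np.floor((xyz - origin) / voxel_size).astype(int)
        buckets = defaultdict(list)
        for key, value in zip(map(tuple, idx), values):
            buckets[key].append(value)
        for key, vals in buckets.items():
            a = np.asarray(vals)
            voxels[key][band] = {"mean": a.mean(), "min": a.min(),
                                 "max": a.max(), "var": a.var()}
    return voxels
```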
Figure 11 illustrates how point clouds are distributed along the 5 cm elemental voxel structure. For other chosen voxel elemental sizes, please refer to Appendix A, where we provide the respective distributions. The histograms in Figure 11 show the number of points per voxel in each spectral band.
We also analyzed the influence of the elemental voxel size on the number of points contained. The distributions are shown as histograms in Figure 12. We note that the distribution of the spectral bands contained in the multispectral voxels does not depend strongly on the elemental voxel size, as the distribution outlines are similar.
Once the structure of multispectral voxels is established, it is crucial to process these data to gain insights and draw conclusions about the building under study. In this research work, we opted for the implementation of Self-Organizing Map (SOM) algorithms because of their multiple advantages in handling voxel data. SOMs provide an efficient summarization of data, facilitating understanding, especially when dealing with large datasets, and thus excel in dimensionality reduction tasks. They do not require a separate training phase and are adept at exploiting spatial attributes while maintaining local neighborhood relationships. With a design that is more intuitive than other neural network types, SOMs ease the interpretation of generated outcomes, making them particularly suitable for tackling categorization problems.
2.5. Self-Organizing Maps
The Self-Organizing Map (SOM) executes a transformation from an input space of higher dimensionality to a map space of lower dimensionality, using a two-layered, fully interconnected neural architecture. The input layer comprises a linear array of neurons, the elementary units of an Artificial Neural Network (ANN), and the number of these neurons corresponds to the dimensionality of the input data vector (n). The output layer, also known as the Kohonen layer, is composed of neurons, each possessing a weight vector that matches the dimensionality of the input data (n). These neurons are organized within a rectangular grid of arbitrary size (k). The weight vectors are collectively represented in a weight matrix configured as a $k \times n$ array [65]. Figure 13 illustrates the typical Kohonen map architecture.
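As a minimal sketch of this architecture in code (our own illustration; the paper does not name its SOM implementation), the open-source MiniSom library can build a square Kohonen layer over n-dimensional input vectors. The dimensions and training parameters below are hypothetical placeholders, not values from the study:

```python
import numpy as np
from minisom import MiniSom  # one possible open-source SOM implementation

n = 7      # input dimensionality, e.g., R, G, B, NIR, UV, thermal, laser bands
side = 34  # side of the square Kohonen layer (illustrative)
X = np.random.rand(1000, n)  # stand-in for normalized multispectral voxel vectors

som = MiniSom(side, side, n, sigma=1.0, learning_rate=0.5)
som.random_weights_init(X)
som.train_random(X, 10000)  # number of training iterations (illustrative)

bmu = som.winner(X[0])  # grid coordinates of the Best Matching Unit for one voxel
```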
Among the various features provided by the use of SOMs are the following:
- Group together similar items and separate dissimilar items.
- Classify new data items using the known classes and groups.
- Find unusual co-occurring associations of attribute values among items.
- Predict a numeric attribute value.
- Identify linkages between data items based on features shared in common.
- Organize information based on relationships among key data descriptors.
It has been proven that SOMs need complete data records [66]; they are highly sensitive to NaN values and empty fields in the input layer. For this reason, only full voxels, that is, those with at least one point contained in each of the point clouds to be fused, were used in our training.
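A one-line mask suffices to keep only full voxels. This small NumPy sketch (our own illustration with hypothetical values) drops any voxel row with a missing band before training:

```python
import numpy as np

# One row per voxel, one column per spectral band; NaN marks a band with no points
X = np.array([[0.2, 0.5, np.nan],
              [0.4, 0.1, 0.7]])

full = ~np.isnan(X).any(axis=1)  # True only for voxels with every band present
X_train = X[full]                # here, only the second voxel survives
```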
SOM Quality Indices
The various parameters that define a SOM can lead to different neural spaces. SOM quality measures are required to optimize training, such that meaningful conclusions can be drawn from them. Among these, the most significant quality indices are:
- Quantization error: the average error made by projecting data onto the SOM, as measured by Euclidean distance, i.e., the mean Euclidean distance between a data sample and its best-matching unit [67]. Its best value is zero.
- Topographic product (TP) [68]: measures the preservation of neighborhood relations between the input space and the map. It depends only on the prototype vectors and map topology, and can indicate whether the dimension of the map is appropriate for fitting the dataset, or whether it introduces neighborhood violations induced by foldings of the map [67]. The topographic product is <0 if the map is too small and >0 if it is too large; the best value is the one with the lowest absolute value.
In this work, SOM quality indices have been calculated using an open-source library, SOMperf [67].
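To make the quantization error definition above concrete without reproducing the SOMperf API (whose exact signatures we do not restate here), a direct NumPy computation looks as follows; this is a minimal sketch with hypothetical array shapes:

```python
import numpy as np

def quantization_error(X, weights):
    """Mean Euclidean distance between each sample and its best-matching unit.

    X: (n_samples, n_features); weights: (rows, cols, n_features) SOM codebook.
    """
    codebook = weights.reshape(-1, weights.shape[-1])
    dists = np.linalg.norm(X[:, None, :] - codebook[None, :, :], axis=2)
    return dists.min(axis=1).mean()  # zero would indicate a perfect fit
```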
To determine the optimal neuron map size (k), the Topographic Product (TP) [68] has been used, starting from an approximation corresponding to the usual rule of thumb, according to Equation (1):

$k = 5\sqrt{N}$ (1)

where N is the number of full voxels in the voxelized structure.
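Under the assumption that a square map is used, Equation (1) translates directly into code; the voxel count below is a hypothetical placeholder, not a figure from Table 7:

```python
import math

N = 53000                   # hypothetical number of full voxels
k = 5 * math.sqrt(N)        # rule-of-thumb neuron count, Equation (1)
side = round(math.sqrt(k))  # side of a square Kohonen map
print(round(k), side)       # ~1151 neurons -> a 34 x 34 map
```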
In Figure 14, we summarize in a flowchart the handling of the processed data during our multisensor data fusion strategy.
3. Results
SOM training was performed for each voxel size. For this purpose, the optimal map size was determined such that the topographic product was minimal in absolute value. Table 7 lists the different parameters according to the voxel size, as well as the obtained topographic product and quantization error.
One of the first steps prior to training is the normalization of the data. Normalization helps the features to have a similar scale, facilitating the training process and faster convergence. In addition, normalization helps to avoid problems associated with large gradients during training, which negatively affect model convergence. In our work, we applied min-max rescaling.
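For reference, min-max rescaling maps each spectral band independently to the [0, 1] range. A minimal NumPy sketch (our own illustration) is:

```python
import numpy as np

def min_max_rescale(X):
    """Rescale each feature (spectral band) to the [0, 1] range."""
    mn, mx = X.min(axis=0), X.max(axis=0)
    return (X - mn) / (mx - mn)  # assumes no constant columns (mx > mn)
```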
To illustrate, we will concentrate on the presentation of the SOM outcomes for a voxel size of 5 cm (0.05 m). The varied results of the maps according to the voxel size are organized in Appendix A.
As shown in Figure 13, the results of the training and classification of a self-organizing map are provided by a matrix of neurons, which are assigned a series of weights given by the input layer. One way to show the relationships between neurons is to express the distances between them (Figure 15). The lighter parts of that matrix represent the regions of the map where the nodes are far from each other, and the dark parts represent smaller distances between nodes.
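Such an inter-neuron distance matrix (a U-matrix) can be rendered in a few lines; with MiniSom, for instance, the following sketch (our own illustration, assuming the trained `som` from the earlier snippet) plots it so that lighter cells mark larger distances:

```python
import matplotlib.pyplot as plt

plt.figure(figsize=(6, 6))
plt.pcolor(som.distance_map().T, cmap="bone")  # mean distance to neighboring nodes
plt.colorbar(label="normalized inter-neuron distance")
plt.show()
```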
Each activated neuron represents a set of voxels. There are also neurons that have not been activated by any of the vectors given by the multispectral voxels within the input layer. Figure 16 shows how many patterns (x-axis) have been recognized by how many neurons (y-axis).
The distribution of activations can be seen in the activation map depicted in Figure 17.
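An activation map of this kind counts, for each neuron, how many input vectors select it as their BMU. With MiniSom this is one call (again a sketch reusing the hypothetical `som` and `X_train` from the earlier snippets):

```python
import matplotlib.pyplot as plt

activations = som.activation_response(X_train)  # BMU counts per neuron
plt.imshow(activations.T, origin="lower", cmap="viridis")
plt.colorbar(label="number of voxels per BMU")
plt.show()
```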
In the same way, we show how the activations of the neurons are distributed within each codebook vector (Figure 18).
4. Discussion
The use of Self-Organizing Maps (SOMs) for the training and classification of multispectral voxel structures results in the generation of a neural map. In this map, each neuron points towards a group of voxels that exhibit similar characteristics, as indicated by a vector of weights assigned to each property (spectral band) of the input layer. Every voxel in the input layer has a Best Matching Unit (BMU): the neuron whose weight vector is closest in feature space to the voxel's pattern.
In the SOM heat map for the 5 cm voxels (Figure 17), we can identify the neurons that determine the clearest patterns, because they have been activated the greatest number of times. We highlight the neurons in the corners of the heat map:
- Neuron (0, 0), activated by 274 voxels. The red band is predominant, together with the laser band; the UV and thermal bands have less weight for these voxels (Figure 19a). These voxels are mostly located on the southern façade of the building.
- Neuron (0, 33), activated by 425 voxels. Its most important band is the laser band. The remaining bands (RGB and UV) show medium weights, whereas the thermal information has little influence (Figure 19b). Its voxels are located on the western planes, which were not illuminated by the Sun during data acquisition.
- Neuron (33, 0), activated by 621 voxels. Its characteristic band is the thermal band, with a minimum in the UV band. The red and green bands, along with the NIR, have medium weights, and the laser band also has considerable weight (Figure 19c). The voxels targeted by this neuron are located on the northern façade.
- Neuron (33, 33), activated by 136 voxels. The laser band has the largest weight, while the remaining bands (RGB, NIR, UV, and thermal) have lower weights (Figure 19d).
However, each neuron does not act in isolation but interconnects with nearby neurons (Figure 15), and nearby neurons interrelate similar multispectral voxels. As shown in Figure 18, neurons can be zoned using cross-band interrelated weights. In our analysis, we observed a striking similarity in the weight distributions of the neurons in the lower left corner of the heatmaps corresponding to the red, green, and blue bands. This finding can be attributed to the fact that the building, as depicted in Figure 5, primarily exhibits a red-grayish tone derived from its granite construction.
This distribution of weights presents an opportunity for further analysis. Employing a clustering algorithm, such as the K-means method, we successfully segmented the BMUs. Using, for example, a 12-zone clustering, we can identify similar BMUs in a manner that considerably streamlines analysis and enhances decision-making (Figure 20).
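Clustering of the codebook is straightforward once the SOM is trained. The sketch below (our own illustration, reusing the hypothetical `som` from Section 2.5 together with scikit-learn's K-means) groups the neurons into 12 zones:

```python
from sklearn.cluster import KMeans

weights = som.get_weights()                    # (rows, cols, n_features) codebook
flat = weights.reshape(-1, weights.shape[-1])  # one row per neuron
labels = KMeans(n_clusters=12, n_init=10).fit_predict(flat)
zones = labels.reshape(weights.shape[:2])      # 12-zone segmentation of the map
```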
The primary focus of our methodology is to identify and study building pathologies through the fusion of geomatic sensor data. We identified a damp area in the examined building, located in the lower part of one of the northern façades (Figure 21). Dampness in buildings can lead to various issues, such as the formation of salt efflorescence: white deposits on surfaces resulting from the migration of water-soluble salts through the pores of the construction materials. This phenomenon often occurs when water carries salts from the ground into the walls. In addition to efflorescence, moisture can trigger mold growth, material deterioration, and structural problems [69].
In order to evaluate our hypothesis regarding the potential application of multispectral voxels in multisensor data fusion for studying building pathologies through self-organizing maps (SOMs), we specifically focused on the voxels representing the areas of the building where this pathology is present. We then identified the best matching units (BMUs) for these voxels and found that they were located in close proximity to neuron (33, 20). To examine their weight vector, we obtained the corresponding characteristic graph, shown in Figure 22. The graph indicates a strong correlation between the visible spectrum bands and the laser scanner band.
Earlier remote sensing research on building pathologies found that the infrared spectral range, particularly the 778, 905, and 1550 nm wavelength bands, is ideal for detecting dampness. The study in [1] found that the best results were obtained using short-wave infrared data (1550 nm). Visible wave bands can detect efflorescence, as surfaces with this phenomenon have higher reflectance in these spectral ranges; when chlorides or sulphates are the cause, the short-wave infrared range is more appropriate. In accordance with the research conducted in [1], our methodology demonstrates a higher level of effectiveness, as it does not rely on deriving information from point clouds to images to localize and study pathologies. Instead, we work directly with the original 3D point clouds, integrating 3D geospatial geometry with multispectral data.