A Perspective on AI-Based Image Analysis and Utilization Technologies in Building Engineering: Recent Developments and New Directions

Abstract: Artificial Intelligence (AI) is a trending topic in many research areas. In recent years, building, civil, and structural engineering have also started to engage with several new techniques and technologies belonging to this field, such as smart algorithms, big data analysis, and deep learning practices. This perspective paper collects the latest developments on the use of AI in building engineering, highlighting what the authors consider the most stimulating scientific advancements of recent years, with a specific interest in the acquisition and processing of photographic surveys. Specifically, the authors focus both on the applications of artificial intelligence in the field of building engineering and on the evolution of recently widespread technological equipment and tools, emphasizing their mutual integration. Therefore, seven macro-categories have been identified where these issues are addressed: photomodeling; thermal imaging; object recognition; inspections assisted by UAVs; FEM and BIM implementation; structural monitoring; and damage identification. For each category, the main innovations and the leading research perspectives are highlighted. The article closes with a brief discussion of the primary results and a viewpoint for future lines of research.


Introduction
Building engineering is a world where the digital revolution has struggled to become popular, compared with other fields of engineering; however, in recent years, there has been a rapid comeback through increasingly cheaper technologies and widespread computer knowledge. The greatest progress was certainly attributable to the widespread diffusion of Building Information Modeling (BIM) environments [1], to the monitoring of the state of structural health [2], and to the use of technologies such as the Internet of Things (IoT), Virtual Reality (VR), data modeling, and Artificial Intelligence (AI), which are revolutionizing this industry [3]. This perspective article showcases the latest studies in terms of image processing, examining technological developments of surveying tools, new techniques, proposed applications, their integration with artificial intelligence, and the tangible improvements that such approaches can bring to each aspect of building practice. Indeed, thanks to far more accurate and updated computer models and more effective inspection techniques during construction and service life, it is possible to achieve a design and structural monitoring quality that was unattainable in the past.
In this perspective paper, the authors, on the basis of their experience, investigate the most recent state of the art, identifying and examining in depth seven different categories and research areas viewed as having great significance and broad prospects, related to the topics of image processing and artificial intelligence: photomodeling; thermal imaging; object recognition; inspections assisted by Unmanned Aerial Vehicles (UAVs); Finite Element Method (FEM) and BIM implementation; structural monitoring; and damage identification. These seven categories were designed with the aim of covering every aspect of the topic, taking into account, on the one hand, the current and future possibilities of physical tools useful for image processing, and on the other hand, the progress in implementation, new methodologies, and possible deficiencies in the use of the most popular software of engineering practice, with a special eye on the impact of artificial intelligence. AI is treated as a cross-category topic, precisely because of its adaptability and broad application possibilities. In order to clearly outline the future perspectives and to respect the typical characteristics and structure of a perspective paper, the authors have intentionally limited the number of citations, selecting the articles which, in their opinion, were the most interesting in the given category. Readers interested in a broader review of these topics can refer to the review papers that are already available in the technical literature, such as [4] or [5].
The article is organized as follows. First, there is an overview of the most common strengths, weaknesses, and mistakes of the necessary physical tools, namely photocameras and thermal cameras (Sections 2 and 3, respectively). Subsequently, in Section 4, the subject of object recognition is addressed, and new, very promising algorithms are reported that are capable of implementing the important transfer of the data acquired during scanning into the BIM environment. Section 5 is dedicated to the recent developments in UAV technologies and their wide range of uses, since these devices can integrate the possibilities given by photographic surveys with the prospects of flight. Because the main purpose of image acquisition activities is the creation of an updated and detailed model in BIM environments, Section 6 examines the possibility of extracting structural information from these models and their interoperability with FEM-based algorithms. Then, Section 7 analyzes the new findings in the field of monitoring structural health, whereas Section 8 is more specifically dedicated to damage identification. The article closes with a brief discussion on the main results and with an outline of the future research directions identified by the authors in Section 9.

Photomodeling
In the field of surveying and modelling techniques, photography was certainly a revolution. In monitoring activities, especially in the field of architectural restoration, photography has also made it possible to solve many problems of logistics and of detail recognition. The integration of deep learning technologies with photography can further improve photomodeling for building engineering applications. For example, ref. [6] described an automated method for modeling the facades of existing and historic skyscrapers, called Scan4Façade, where a neural network called U-net is used to generate high-resolution facade orthoimages and to segment pixels; this method can provide clear and accurate information to assist the creation of BIM functionality. This kind of approach can also contribute to the maintenance of existing buildings; for instance, ref. [7] proposed an advanced system for crack identification in large structures.
Other applications have been found in the world of virtual tours and Computer-Generated Holography (CGH) [8], where models based on photographic surveys are often accurate for the intended purposes. Today's research aims at the integration of laser and photographic surveys into product versatility, seeking theoretical lines that make technologies as compatible as possible [9]. A crucial issue is that these models must always have an appropriate computational burden. For instance, many areas (such as underground cavities or similar) do not need a level of detail comparable to that of a laser scanner.
Although it was developed to detect objects in all directions, another research branch in photomodeling is the one moving towards the integration of open spaces and closed spaces. This new field, based on the different light conditions during the acquisition operations, has already been successfully applied in the framework of 3D content creation [10].
In the field of monitoring, especially of the conservative type, where construction materials are very important, the conditions would be particularly hard to reproduce only by means of traditional surveying, and the combination of two-dimensional (2D) photos and three-dimensional (3D) scans has proven to be a very efficient procedure, especially when the 2D representation is predominant compared to the 3D one, or when dealing with parts that are difficult to inspect by large machines [11]. Figure 1 is a 2D/3D enhanced combination, showing the integration of different information for the same portion of a given case study, the Castle of Torrelobatón in Spain.
Another particularly interesting aspect is the working procedures (for instance, pipelines and workflows) proposed by individual researchers and technicians. A noteworthy project is that of semantic segmentation, which aims to enhance the photogrammetric pipeline by integrating semantic information within the processing phases. In [12], an approach to introduce AI-based semantic segmentation in the photogrammetric workflow was presented. The proposed workflow uses 2D image label data and robust AI-based methods to create separate point clouds for each class, demonstrating that relying on the far more widely available labeled 2D training data is beneficial.
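The idea of deriving one point cloud per class from 2D labels can be illustrated with a minimal sketch. It is not the workflow of [12]: it simply assumes a single labeled image, an ideal pinhole camera with hypothetical intrinsics (fx, fy, cx, cy), and 3D points already expressed in the camera frame. All function names are illustrative.

```python
from collections import defaultdict

def project(point, fx, fy, cx, cy):
    """Pinhole projection of a camera-frame 3D point to an integer pixel (col, row)."""
    x, y, z = point
    return int(round(fx * x / z + cx)), int(round(fy * y / z + cy))

def split_by_class(points, label_image, fx, fy, cx, cy):
    """Group 3D points by the 2D label of the pixel they project onto."""
    height, width = len(label_image), len(label_image[0])
    clouds = defaultdict(list)
    for p in points:
        if p[2] <= 0:  # behind the camera: no label can be assigned
            continue
        col, row = project(p, fx, fy, cx, cy)
        if 0 <= row < height and 0 <= col < width:
            clouds[label_image[row][col]].append(p)
    return dict(clouds)

# Toy 4x4 label image (assumed data): left half "wall", right half "window".
labels = [["wall", "wall", "window", "window"] for _ in range(4)]
pts = [(-0.5, 0.0, 1.0), (0.5, 0.0, 1.0), (0.0, 0.0, -1.0)]
clouds = split_by_class(pts, labels, fx=2.0, fy=2.0, cx=2.0, cy=2.0)
```

In a real photogrammetric pipeline, each point is seen in several images, so some voting rule across views would be needed; that step is omitted here.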
In this rapidly developing environment, another current problem is linked to the reconstruction of missing, unreachable, or undetectable parts (an important problem in photomodeling) or the automatic decomposition of what has been detected into layers. In these cases, AI seems to be able to provide valuable answers through generative processes [13].

Thermal Camera
The use of thermal cameras as detection devices is now more than twenty years old. Originally developed for military purposes, to allow for night combat and to identify solid elements, the technology spread very rapidly. Despite this, its use still has much room for improvement, especially given the extensive use of drones in surveying large structures and the growing sensitivity towards the thermal dispersion of existing buildings, which has a huge impact on global energy demand and consequently on the energy transition. Some authors studied a new framework that uses an instance segmentation technique (Mask R-CNN) on thermal camera observations to compute the transmittance values, U, of various building objects, including doors, walls, windows, and facades [14]. With the same purpose of limiting heat losses in buildings, ref. [15] exploited the potential offered by AI applied to infrared thermography images, proposing a novel segmentation approach to isolate areas of thermal anomalies on wall surfaces. The problem of overlapping data, once unsolvable, is now at the center of numerous studies, especially in regard to the identification of the damaged parts of artifacts, as continuous monitoring inserted within a BIM scheme or in a broader Data Fusion project.
Similarly, in the integration process of the photography with laser scanning, a merging procedure is being carried out for the integration of thermal camera surveys [16]. This equipment, compared to the others, is decidedly more delicate and more influenced by environmental conditions [17], and more experiments are needed in the field of temperature corrections.
Rapid monitoring techniques occupy a large part of the scientific research sector [18]. Thermal imaging is a non-invasive technology, and therefore it is useful in restoration works; moreover, it allows the rapid detection of alterations in structures (such as plant-induced damage and rust) that are normally hidden from view [19].
At a larger scale, in [20], 3D city models were generated, encompassing both photographic surveys and thermal images. Each model could visualize construction elements like beams and columns, with the relevant surface temperature at a selected point. This is even possible if the structural elements are hidden by other (non-structural) elements, as long as the equipment has the calibration settings needed to detect the different thermal behaviors among elements. An example is shown in Figure 2. The main findings of this paper concerned the evaluation and improvement of the thermal environment in pedestrian spaces.
At the same scale, in [21], a code was developed to combine the attributes captured by both thermal images and visual ones, providing a quantitative detection of the number of surface cracks, and estimating their relevant severity. Field test results were gathered and statistically analyzed to correlate temperature gradients to the surface crack profiles of asphalt pavements. The extension to the case of buildings seems very promising, especially in the case of masonry structures. As in the case of photomodeling, the use of AI technologies can lead to greater precision of analysis in non-ideal conditions, such as variable lighting conditions [22]. Furthermore, if one thinks of the very large scale, a whole series of very interesting studies already exists in the field of biology [23], whose principles can easily be applied to the surveillance and creation of databases at an urban and structural level [24].

Object Recognition
Object recognition is a central topic in the reasoning of scan-to-BIM. In fact, simplification and replacement of elements is meaningful, since it can solve many problems in the information conflict between mesh-based and NURBS-based software (such as Nastran, Inventor, Midas). Basically, hybrid elements between the two sides are desirable, like McNee T-Spline or Autodesk T-Spline, which create surfaces with fewer control points than NURBS and are therefore closer to the computational entities of mesh-based algorithms. The sector that certainly boasts the best results is that linked to Mechanical, Electrical, and Plumbing (MEP), which works with very simple elements, consisting of primitive solids (such as pipes) [25]. Advanced procedures, such as the Trimmed Iterative Closest Point (TrICP) scheme, are also available for irregular components (see the example in Figure 3).
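To make the TrICP idea more concrete, the following is a minimal 2D sketch, not the implementation used in the cited works. At each iteration, every source point is matched to its nearest target point, only the fraction of matches with the smallest distances is kept (the "trimming", which discards outliers), and a closed-form rigid transform is fitted to the kept pairs. The trimming ratio `keep`, the iteration count, and the toy data are assumptions chosen for illustration.

```python
import math

def transform(points, theta, tx, ty):
    """Rotate 2D points by theta (radians) about the origin, then translate."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

def nearest(p, cloud):
    """Return (squared distance, closest point) from p to the target cloud."""
    return min(((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2, q) for q in cloud)

def rigid_fit(src, dst):
    """Closed-form least-squares rotation and translation mapping src onto dst."""
    n = len(src)
    sx, sy = sum(p[0] for p in src) / n, sum(p[1] for p in src) / n
    dx, dy = sum(q[0] for q in dst) / n, sum(q[1] for q in dst) / n
    dot = sum((p[0] - sx) * (q[0] - dx) + (p[1] - sy) * (q[1] - dy)
              for p, q in zip(src, dst))
    cross = sum((p[0] - sx) * (q[1] - dy) - (p[1] - sy) * (q[0] - dx)
                for p, q in zip(src, dst))
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    return theta, dx - (c * sx - s * sy), dy - (s * sx + c * sy)

def trimmed_icp(src, dst, keep=0.8, iterations=10):
    """Align src to dst, fitting only on the best `keep` fraction of matches."""
    current = list(src)
    n_keep = max(2, int(keep * len(current)))
    for _ in range(iterations):
        # (squared distance, target point, source point), smallest distances first
        matches = sorted(nearest(p, dst) + (p,) for p in current)[:n_keep]
        theta, tx, ty = rigid_fit([m[2] for m in matches],
                                  [m[1] for m in matches])
        current = transform(current, theta, tx, ty)
    return current

# Toy data (assumed): an L-shaped target, a slightly moved copy, one outlier.
target = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (0.0, 1.0), (0.0, 2.0)]
scan = transform(target, 0.05, 0.1, -0.05) + [(5.0, 5.0)]
aligned = trimmed_icp(scan, target)
```

The brute-force nearest-neighbour search is quadratic in the number of points; for real point clouds a spatial index such as a k-d tree would normally be used instead.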
It is now widely recognized that object recognition is a purely Information Technology (IT) issue, rather than a logical one; therefore, a part of the research is concentrated on the development of plug-ins or on the search for methodological pipelines and software stacks that can be used for a wide range of applications [26]. A comprehensive comparison was provided by [27]. Elevation recognition is sometimes a difficult task, but specific algorithms can help on this point [28].
The recognition process does not exclude that elements could already be catalogued a priori when they are put in place or relying on visual inspections [29]. For instance, the shape, the material, and other information can be first assigned, which can then facilitate the correspondence with the given database. An open topic regards the connections among elements: it could be almost automated once the recognition of the converging elements is completed.
Buildings 2023, 13, x FOR PEER REVIEW 6 of 15

Inspections Assisted by UAVs
The technological developments of the last decades have led to the use of new devices in the field of construction engineering, and one of the most innovative is certainly the case of UAVs (Unmanned Aerial Vehicles), that is, aerial vehicles without a pilot on board. Thanks to the advantages related to the prospects obtainable from flight and to the increasingly affordable prices, these technologies are spreading more and more in image acquisition, even in the building engineering field.
Ref. [30] showed how UAVs were able to speed up the construction process of the Wuhan Leishenshan Hospital (China), an emergency hospital necessary for treating patients affected by COVID-19 (the recent contagious disease caused by the coronavirus SARS-CoV-2), reproducing the whole construction process at a high altitude, gathering a wider perspective, and thus providing efficient and accurate earthwork measurements and offering a full-cycle safety management model (Figure 4).

Even the planning of safety on construction sites can be improved using drones: a real-life study has been performed for a high-rise residential construction in Chile. The outcomes of the study indicated that UAVs might have a considerable impact on safety planning and monitoring practices [31]. Another important research field concerns dynamic applications: a video-based methodology for tracking the displacement response of buildings undergoing dynamic loadings using camera-equipped UAV platforms has been proposed in [32].
However, image processing based on UAVs is not without problems. Indeed, the small movements of the object and the visual obstacles that this could encounter hinder the production of solid data. Ref. [33] investigated the problem of occlusions, presenting an optimization method for the collection of BFTIs (Building Façade Texture Images) from the image flows acquired by five oblique cameras onboard the UAV.
In this field, AI becomes crucial, both for the correction phases of the UAV maneuvers mentioned above and for the automatic detection of structural elements. Ref. [34] proposed a new framework, based on neural networks, with the aim of addressing camera movement problems and facilitating the extraction of structural displacements from videos. The use of AI to automatically detect structural features, such as damage or cracks, was demonstrated in [35,36]. More details on these topics will be given in Sections 7 and 8.

Mesh, FEM, and BIM Implementation
In the world of building engineering, BIM (Building Information Modeling) will become a fundamental key to structural design. Therefore, artificial intelligence becomes essential in the recognition of structural elements downstream of image acquisition processes and in improving the design and maintenance planning of buildings. The authors in [37] focused on the creation of an AI platform for the building and construction industry, capable of improving the efficiency, safety, and sustainability of construction operations.
The diffusion of new data capturing technologies and the advancements of modeling systems have allowed for the greater usability of digital twinning; ref. [38] examined the implementation of this technology with an empirical approach, identifying the next challenge to overcome in the organization and integration of large and heterogeneous amounts of data. Indeed, for existing buildings, the modeling process of geometric digital twins still lacks a streamlined, systematic, and complete framework.
Artificial intelligence based on image recognition is making an important contribution in overcoming these gaps: ref. [39] proposed a semi-automatic procedure to generate a systematic, accurate, and convenient digital twinning system pivoting on image surveys and CAD drawings. In [40], the authors presented a framework and a proof-of-concept prototype for on-demand automated replication of construction projects, combining some cutting-edge IT solutions (see Figure 5), specifically, image processing, machine learning, BIM, and virtual reality. Furthermore, a drone-based AI and 3D reconstruction for augmenting the digital twin was presented in [41], demonstrating an Information Fusion framework that goes beyond the capabilities of BIM to enable the integration of heterogeneous data sources. Additionally, ref. [42] analyzed the impact of the image-based digital twin in post-earthquake building inspections, highlighting the potential and applications of deep convolutional neural networks (CNNs).
These cited technologies even hold great promise for instilling vigor in safety training and assessment programs. Studies conducted on students and professionals about what kind of platform is preferable in monitoring and safety planning on construction sites concluded that both methods can improve this practical aspect of building engineering [43].
In the engineering world, making digital copies of geometries through 3D reconstruction models is of great interest, to compare and analyze structural evolutionary data. Through advanced numerical analysis and non-invasive data acquisition, it is possible to improve static and dynamic models, and to collect the structural modifications caused by aging and/or other factors [44].
Today, the bottleneck of the interaction between BIM models and the ones used by FEM analysis software (Nastran, Inventor, and Midas, for example) lies in the difficulty of transforming mesh entities, deriving from point clouds, into NURBS or other border curves, that is, the entities used by all the commercial FEM software. This transformation is done through "commutator" software. However, only models with a high-quality mesh (i.e., suitable polygon density, skewness, etc.) and an adequate topology can be exchanged between the various BIM sectors in a more coherent way.
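As an illustration of one of the mesh quality indicators mentioned above, the sketch below computes the equiangular skewness of a triangular element, using one common definition (0 for an equilateral triangle, approaching 1 for a degenerate one). The acceptance threshold that a given FEM or BIM tool applies is not stated in the source and is not assumed here; the function names are illustrative.

```python
import math

def triangle_angles(a, b, c):
    """Interior angles, in degrees, of the triangle with vertices a, b, c."""
    def angle_at(p, q, r):
        u = [qi - pi for pi, qi in zip(p, q)]
        v = [ri - pi for pi, ri in zip(p, r)]
        dot = sum(ui * vi for ui, vi in zip(u, v))
        norm = math.sqrt(sum(ui * ui for ui in u)) * math.sqrt(sum(vi * vi for vi in v))
        # Clamp to [-1, 1] to guard against floating-point round-off.
        return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))
    return angle_at(a, b, c), angle_at(b, c, a), angle_at(c, a, b)

def equiangular_skewness(a, b, c):
    """Equiangular skewness of a triangle (ideal angle: 60 degrees)."""
    angles = triangle_angles(a, b, c)
    return max((max(angles) - 60.0) / (180.0 - 60.0),
               (60.0 - min(angles)) / 60.0)

# Example elements: an equilateral triangle and a right isosceles one.
skew_equilateral = equiangular_skewness((0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2))
skew_right = equiangular_skewness((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
```

Evaluating such an indicator over every element of a mesh derived from a point cloud gives a quick picture of which regions would need remeshing before an exchange is attempted.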
Indeed, despite the advanced developments, BIM still does not enjoy an optimal pipeline for the transition to FEM models, which are much more useful for structural purposes. Refs. [45,46] proposed and simulated two different scan-to-BIM-to-FEM pipelines, highlighting in both cases how the BIM to FEM transition is still cumbersome and susceptible to macroscopic errors. The need for an automated procedure for the transition from BIM to FEM, possibly fully governed by artificial intelligence, becomes increasingly widespread as its usefulness is more and more requested in the monitoring and maintenance phases of existing structures, especially the ones belonging to the historical architectural heritage (as in the cases shown in [47,48]).

Structural Monitoring
In this Section, non-contact vision-based displacement monitoring is considered. It is a new trending topic for civil structures, based (unlike Section 4, where only object recognition was considered) on the use of image processing to measure the structural responses. Among the methods, a novel field structural displacement measurement method using deep learning was proposed in [49], which tried to address some significant drawbacks, such as the non-uniform sampling problems and the accumulation of errors in calculating the variations among successive images. Relying on the sampling Moiré method, ref. [50] proposed a technique for measuring the deflection and vibration frequency from captured video data, showing how to gather compressed images with an appropriate compression ratio, in order to reduce the image sizes without deteriorating the required accuracy.
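The basic principle of vision-based displacement measurement can be sketched with plain template matching. This is neither the deep learning method of [49] nor the sampling Moiré technique of [50]: a small patch around a target is cut from a reference frame and searched for in a later frame by minimizing the sum of squared differences, which yields an integer-pixel displacement. The frame size and target pattern are assumed toy data.

```python
def ssd(frame, template, top, left):
    """Sum of squared differences between the template and a frame window."""
    return sum((frame[top + i][left + j] - template[i][j]) ** 2
               for i in range(len(template))
               for j in range(len(template[0])))

def locate(frame, template):
    """Exhaustive search for the (row, col) of the best-matching window."""
    h, w = len(template), len(template[0])
    candidates = ((ssd(frame, template, r, c), r, c)
                  for r in range(len(frame) - h + 1)
                  for c in range(len(frame[0]) - w + 1))
    _, row, col = min(candidates)
    return row, col

def displacement(ref_frame, new_frame, top, left, size):
    """Pixel displacement (d_row, d_col) of the patch at (top, left)."""
    template = [row[left:left + size] for row in ref_frame[top:top + size]]
    row, col = locate(new_frame, template)
    return row - top, col - left

def make_frame(top, left, size=8):
    """Toy greyscale frame: dark background with a 2x2 textured target."""
    frame = [[0] * size for _ in range(size)]
    target = [[9, 5], [3, 7]]
    for i in range(2):
        for j in range(2):
            frame[top + i][left + j] = target[i][j]
    return frame

reference = make_frame(2, 3)
later = make_frame(4, 4)
shift = displacement(reference, later, top=2, left=3, size=2)
```

Converting the pixel shift into a physical displacement requires a scale factor (for example, millimetres per pixel) obtained from camera calibration, and practical systems refine the estimate to sub-pixel accuracy; both steps are left out of this sketch.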
The studies in [51] described the design and the implementation of a monitoring system, where images were used both to control the evolution of structural phenomena (such as the opening of cracks) and to implement a 3D model of a real-life ancient structure in a virtual reality framework (see Figure 6).
Generally speaking, any vision-based technique requires a Foreground-Background Segmentation (FBS), that is, a pixel-level separation where each pixel of the given image (or input video frame) is assigned to the foreground or to the background. A novel FBS approach, which proved superior to the available methods for FBS, was proposed in [52].
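To show what a pixel-level foreground/background assignment looks like in its simplest form, the sketch below uses a classical baseline, median background subtraction with a fixed threshold. It is not the approach of [52]; the threshold value and the toy frames are assumptions.

```python
from statistics import median

def background_model(frames):
    """Per-pixel median over a list of equally sized greyscale frames."""
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[median(f[r][c] for f in frames) for c in range(cols)]
            for r in range(rows)]

def foreground_mask(frame, background, threshold=20):
    """Mark a pixel as foreground (1) if it differs from the background
    by more than the threshold, otherwise as background (0)."""
    return [[1 if abs(p - b) > threshold else 0
             for p, b in zip(frame_row, back_row)]
            for frame_row, back_row in zip(frame, background)]

# Toy sequence (assumed data): a static scene at grey level 100, with a
# bright object appearing at pixel (0, 1) in the second frame only.
static = [[100, 100, 100], [100, 100, 100]]
moving = [[100, 200, 100], [100, 100, 100]]
frames = [static, moving, static]

background = background_model(frames)
mask = foreground_mask(moving, background)
```

Such a baseline is sensitive to illumination changes and camera motion, which is one reason why more robust FBS methods are an active research topic.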
3D Digital Image Correlation (3D-DIC), 3D Point Tracking (3D-PT), and similar methods have been effectively used for structural monitoring, but they are not well suited to quantitative measurements on large-scale structures because of the demanding calibration processes of the cameras. In [53], a new sensor board was proposed to measure the degrees of freedom necessary for evaluating the extrinsic parameters of a set of stationary paired cameras to be employed for 3D-DIC applications.
The work in [54] investigated a structural monitoring method based on a pure organic Mechano-Responsive Luminogen (MRL), namely 1,1,2,2-tetrakis(4-nitrophenyl)ethane (TPE-4N), for the evaluation of strain distributions (transformed into visible fluorescence). The results obtained for the structural monitoring of the strain concentration in weld joints opened new opportunities in the field of structural monitoring of nodes and other D-regions.
Vision-based structural health monitoring is becoming increasingly popular because the images can be directly used to collect data and detect the onset of structural damage. The case study analyzed in [55] used image analysis and convolutional neural networks to automatically inspect bolt loosening. This topic is more comprehensively discussed in the next section.

Damage Identification
Unlike the previous section, which was limited to monitoring structural responses, this section considers the next step: the detection of potential damage. Damage identification is indeed a crucial task, and there is a wide range of literature available on this topic. Ref. [56] and many other similar works are dedicated to the classification and quantification of cracks in concrete structures. In detail, the cited work was based on a convolutional neural network able to classify, locate, and quantify crack-like damage, with an accuracy higher than 96%.
When one deals with microstructural cracks (for instance, to analyze the onset of corrosion phenomena), image processing becomes more expensive, since voids and cracks appear at similar greyscale levels. Deep convolutional neural networks were used in [57] to approach this issue, adopting X-ray computed tomography and then reconstructing the three-dimensional distribution of mortar, aggregates, voids, and cracks (see Figure 7). This detailed approach is an important tool to better understand how a damage mechanism evolves at a small scale in concrete elements.

Some works are also available on the integration of different technologies with image processing: in [58], three technologies, namely Acoustic Emission (AE), Digital Image Correlation (DIC), and Dynamic Identification (DI), were combined to analyze crack formation and propagation in beam specimens. The latter were pre-notched and subjected to three-point bending. From this point of view, a technology that could be interesting for new buildings is the one reported in [59], based on distributed optical fiber sensors. The experiments also consisted of concrete beams undergoing three-point bending tests, where a polyamide-coated optical fiber sensor (protected by a thin silicone film) was directly bonded onto the surface of the (unaltered) reinforcement bars. This approach is interesting because, while image processing gives information on the surface, such technologies can provide information on the inner state without resorting to the expensive solutions cited above [57] (which, however, are still useful at the small scale).
Previous works have focused on concrete elements. A deep neural network, called Material-and-Damage-network (MaDnet), was introduced in [60] to simultaneously identify materials (concrete, steel, or asphalt) and structural damage (both fine, such as cracks and exposed rebar, and coarse, such as spalling and corrosion). However, since regular supervised learning methods usually rely on (relatively) few training examples, some damage types can remain unclassified; ref. [61] proposed a method to combine a few image data points to describe a large class of structural damage.
Moving from the element scale to the scale of the structure, structural damage detection and localization from the combined use of high-speed DIC and local modal filtration is another promising research branch. The efficacy of this procedure has been recently demonstrated in [62] for a small-scale frame structure.
Moving to an even larger scale, the territorial one, techniques based on large-scale image processing have been proposed to analyze the effects of events of great importance, such as earthquakes and hurricanes. A rapid damage assessment of buildings in post-disaster conditions can ensure a fast emergency response, increasing the chances of saving lives. In [63], using a UAV and a convolutional neural network, an automated method was proposed for assessing seismic-induced damage. Ref. [64] used the same techniques for damage assessment after hurricanes or similar events.

Conclusions and Future Directions
In this perspective article, the authors have collected the latest developments on the use of AI for image acquisition and processing. A collection of the most promising articles published in the last three years has been gathered. In detail, focusing on building engineering and photographic surveys, seven categories have been recognized: photomodeling; thermal imaging; object recognition; inspections assisted by UAVs; mesh, FEM, and BIM implementation; structural monitoring; and damage identification.
Generally speaking, it is certain that in the immediate future there will be a rapid increase in the use of images (due to the wide availability of UAVs and photo/video surveillance/maintenance stations); consequently, for obvious reasons related to the management of big data, there will also be a rapid development of systems based on AI. However, many of today's procedures are still in an experimental stage, and some of them are based on heuristic schemes; accordingly, the theoretical bases needed to support the obtained results are often missing. Therefore, future research lines will largely focus on this aspect.
To provide a clear picture of the recent state of the art, the following paragraphs illustrate the main findings and the leading research perspectives for each of the identified categories.
Photomodeling: The scientific community now universally recognizes that acquiring as much data as possible can be detrimental, from both a logical and an IT point of view. In fact, more information to manage means many more errors and simplifications in the calculation models. It is thus possible that photomodeling will find application in fields that were once exclusive to laser scanners, especially for certain Levels Of Detail (LOD) or for the so-called manipulable objects (which can be photographed 'in the round'). A research effort on software architecture is needed, since application tools are too distant from each other to ensure a scientifically correct interoperability (which is only possible by creating software stacks inspired by the same principles for modeling and calculations).
Thermal camera: Whereas a few years ago thermal cameras were imaging devices used in a small number of structural applications, today they are a technology attracting great interest in both laboratory and in situ surveys. With the centralization of BIM environments and monitoring applications, thermal analysis and the possibility of creating models from these tools are proving to be very useful. Three-dimensional thermal models will likely become more and more common, replacing the concept of a 'simple' image with a set of integrated information. Software will also move in this direction to better harmonize all data, according to the dictates of BIM interoperability. In the same way, a standard integration of thermal cameras on board drones is foreseeable; this will allow us to capture a single data point with common coordinates in all surveys, even in detail. It should be noted, however, that most of the experimentation until now has been carried out on large plants, where the thermal difference between the components is considerable. Nevertheless, some early available studies show that thermal imaging is also useful for detecting structural alterations.
Object recognition: Object recognition is certainly a new and fascinating issue in this research field. However, the replacement of the objects detected through a photographic survey with their digital twins is not straightforward in its current state. First of all, there is the need to further investigate scientifically correct replacement approaches, able to calibrate both the purely visual architectural and the structural model. At the level of structures, the question is even more interesting, since many simplified models obtained through manual operations have the same performance (for some given purposes) as that of the original one. Therefore, the automatic recognition and transformation into simplified elements, as already happens in the field of the MEP applications, is certainly one of the areas that will be developed further.
Inspections assisted by UAVs: Thanks to their extraordinary ease of use and multiple application methods, UAVs are rapidly becoming an increasingly widespread technology in all areas of building, seismic, and civil engineering. More and more articles show how their adaptive capacity allows them to be widely used in all construction phases, from the design to the maintenance of existing structures. At the design stage, a more intelligent management of the construction site and of the relevant safety conditions is guaranteed, whereas the facilitation of photogrammetry (for the reconstruction of point clouds or for monitoring purposes) is ensured for existing structures. UAVs are certainly a rapidly developing technology with an exponential diffusion and a very wide field of application. The authors believe that this data acquisition technology, coupled with the computing capabilities of recent AI-based algorithms, will become a tool that all professional engineers and all engineering companies will be using within a few years.
Mesh, FEM and BIM implementation: With the advancement of image processing techniques, the BIM environment becomes a central technology in building engineering, as evidenced by the large number of articles in this field. Artificial intelligence is an essential branch in this research line, for both the recognition of objects and the exclusion of obstacles and occlusions. Despite this, the authors have noted that the research is still rather lacking in proposing automated communication procedures between BIM and FEM programs. Even today, this step must be performed semi-manually, with a large loss of time and information. This gap in the field of building engineering can represent a great study opportunity.
Structural monitoring: Non-contact vision-based displacement monitoring for civil structures is a new trending topic. Recent studies have been focused on some significant drawbacks: non-uniform sampling problems, error accumulation in calculating the differences among successive images, reduction of the image size (without deteriorating the required accuracy), Foreground-Background Segmentation (FBS), and the pairing process of cameras in 3D Digital Image Correlation and 3D Point Tracking. Other studies have shown how images can be used both to control the evolution of structural phenomena (such as the opening of cracks or loosening of bolts) and to implement a 3D model of the structures in a virtual reality framework. These promising results pave the way for large-scale applications of AI-based image monitoring.
Damage identification: Many articles are devoted to the detection of cracks; the largest part focused on damaged concrete elements, but new interesting developments have come from X-ray computed tomography applied to small-scale samples (where microstructural cracks need to be monitored to better understand the initiation of damage mechanisms). Many other efforts concern AI-based procedures able to distinguish the type of material and to classify damage starting from an archive of a few images. One of the keys to success could be the implementation of polyamide-coated optical fiber sensors or similar technologies inside the elements, in order to combine the analysis of surface images with the information coming from the interior. With high-speed digital image correlation, structural dynamic monitoring is also possible (even if there are still few studies on this topic). Lastly, some recent techniques, allowing the processing of images on a territorial scale, have proven extremely fruitful in analyzing the effects of events of great importance, such as earthquakes and hurricanes. Approaches like these, in conjunction with the diffusion of UAVs, can significantly increase the chances of saving lives in post-disaster conditions; thus, their improvement must become a crucial task for the scientific community.

Figure 2. Spatial distribution of Mean Radiant Temperature (MRT) values in pedestrian spaces: (a) MRT values at a height of 0.5 m; (b) MRT values at 1 m; (c) MRT values at 1.5 m; (d) MRT differences between points at heights of 0.5 and 1.5 m (from [20]).

Figure 3. Detection and verification of irregularly shaped components: (a) clustering results for potential positions of valves; (b) retrieved point cloud of a valve; (c) as-designed point cloud of the valve; (d) registration of as-built and as-designed point clouds with TrICP (adapted from [25]).

Figure 5. A photo taken from the construction site (left); photo's depth map (center); output of semantic segmentation for identifying structural building parts (right) (adapted from [40]).

Figure 6. The virtual reality environment: users can explore the virtual reconstruction of a monitored structure while having direct access to the values measured by the sensors network (from [51]).
