Machine-Learning-Based Methods for Acoustic Emission Testing: A Review

Abstract: Acoustic emission is a nondestructive testing technique, as it does not involve any input of energy into the material. It is based on the acquisition of ultrasonic signals spontaneously emitted by a material under stress due to irreversible phenomena such as damage, microcracking, degradation, and corrosion. It is a dynamic, passive technique that analyzes the ultrasonic pulses emitted by a crack as it forms. This technique allows for an early diagnosis of incipient structural damage by capturing the precursor signals of the fracture. Recently, the scientific community has been making extensive use of methodologies based on machine learning, in which an algorithm receives a series of data and adapts itself as it receives information about what it is processing. In this way, the machine can learn without being explicitly programmed, which requires large amounts of data and an efficient algorithm able to adapt. This review describes the methodologies for implementing the acoustic emission (AE) technique in the condition assessment and monitoring of materials and structures. The latest research on new machine-learning-based methodologies for the detection and localization of damage, the characterization of the fracture, and the prediction of the failure mode is also analyzed. The work highlights the widespread use of these methods, which confirms their usefulness in identifying structural damage in scenarios heavily contaminated by residual noise.


Introduction
Acoustic emission (AE) has long been recognized as a valid technique for real-time monitoring of materials and structures, providing useful information not only on the presence of defects but also on their criticality [1]. Acoustic emission is a nondestructive testing technique, as it does not involve any input of energy into the material. However, it should be noted that a fracture can be identified only if it is already present in the material [2]. Nondestructive tests (NDTs) are a set of examinations, tests, and surveys conducted using methods that do not alter the material and do not require the destruction or removal of samples. The objective is to guarantee safety, verified in terms of compliance with the requirements of reliability and conformity to the design according to which a specific product has been conceived and manufactured [3]. Many nondestructive methods have very narrow fields of application and large uncertainties of interpretation related to the conditions in which they are performed: the problem is often very complex, and clear information can only be deduced by comparing multiple results. Nondestructive methods can be useful for the complete and continuous monitoring of significant parameters over shorter or longer periods [4].
The identification of structural anomalies through sound emissions is a methodology that has been successfully applied for a long time. Potters, as far back as several millennia BC, listened to the audible sounds produced during the cooling phase to identify possible structural defects. Similarly, a few millennia later, this procedure was applied in metalworking [5]. The artifacts were struck, and the sound they emitted was used to identify possible fractures. In some artisan productions, these procedures are still applied today: think of the producers of bronze bells of the pontifical foundries, who listen to the sound produced by the bells to identify defects, or the expert producers of Parmigiano Reggiano, a famous Italian aged cheese, who identify the degree of ripeness of a wheel weighing about 40 kg by listening to the sound it emits when tapped with a small hammer.
The foundations of this technique, however, date back to the early 1950s, when J. Kaiser used electronic instrumentation to detect the sounds emitted by metals during deformation [4]. Kaiser found that all the metals investigated exhibited acoustic activity and that this activity was irreversible, in the sense that it disappeared when the material was reloaded and reappeared only once the stress level exceeded its previous maximum value. This phenomenon is now known as the Kaiser effect and has proved to be of considerable use in AE studies [6]. Schofield [7] and Tatro [8] began their research in the mid-1950s, making a significant contribution to the improvement of the instrumentation and to the clarification of the origin of AE. They were the first to observe that the emission in metals was mainly due to the motion of dislocations that accompanies plastic deformation, rather than being entirely due to the reciprocal sliding of the grains, as initially proposed by Kaiser. During the 1960s, many scholars applied AE to materials studies, characterization and quality assessment, nondestructive testing, and structural checks [9].
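The Kaiser effect can be illustrated with a minimal sketch. It assumes an idealized material in which AE activity occurs only at load steps that exceed every previously applied load; the function name and the load values are hypothetical and serve only as an illustration.

```python
def kaiser_active_steps(loads):
    """Return the indices of load steps at which AE activity is expected,
    assuming an idealized Kaiser effect (emission only above the previous maximum)."""
    active = []
    previous_max = 0.0  # assumption: the specimen starts unloaded
    for index, load in enumerate(loads):
        if load > previous_max:
            active.append(index)
            previous_max = load
    return active


# Hypothetical load history (arbitrary units) with two unload/reload cycles.
history = [10, 20, 15, 18, 25, 22, 30]
active_steps = kaiser_active_steps(history)
print(active_steps)
```

In this idealized history, the reloading steps at 15, 18, and 22 stay below the previous maximum and are therefore silent.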
The remarkable advances in instrumentation achieved in the early 1960s made possible unexpected developments in AE technology. Researchers encountered significant difficulties in processing the AE signal due to background and ambient noise: many of these problems could be reduced, if not eliminated, by working in a frequency range well above the audible one. This innovation eliminated the need for acoustically insulated laboratories and allowed for a high degree of refinement and applicability. Subsequently, in the early 1970s, the concepts of spectroscopic analysis were extended to acoustic phenomena originating from the dynamics of materials. The development of ultrasonic techniques suggested that the excitability of materials, whereby a specific frequency response corresponds to an ultrasonic pulse, could explain the production of acoustic signals at ultrasonic frequencies because of the release of energy at a microscopic level [10].
Modern acoustic-emission-based techniques have, therefore, set aside the audible spectrum to search the ultrasonic spectrum for spectral signatures that identify structural defects. The term ultrasound defines elastic waves whose frequency is above the upper limit of human hearing and whose frequency band, consequently, ranges from 20 kHz to 1 GHz. Generally, in materials in which dynamic processes take place, such as deformations, fractures, or phase transitions, there is a release of elastic energy in the form of impulsive elastic waves, whose frequency spectrum lies between 1 kHz and 10 GHz [11]. The dynamic processes at the origin of acoustic emissions are well illustrated, for example, by the macroscopic acoustic phenomena related to the breaking or deformation of a solid material. The physical interest in these phenomena, however, concerns signals far from the limits of the audible spectrum, both in intensity and in frequency, produced in solid materials by the motion of dislocations, the growth of microfractures, and the motion of grain boundaries, processes that we will call acoustic emission sources [12].
A solid material subjected to a stress that sets it in vibration will undergo fractures at the molecular or atomic level, and a certain amount of energy will be released. Furthermore, a stress distribution will develop, which will depend on the phenomenon as a whole and will involve the whole system considered. When this stress distribution crosses an area in which breaks have already occurred, new ones are likely to occur there, since that is where the structure is weakest [13]. This implies a real chain reaction that will support the progress of each existing fracture, rather than the formation of new ones. Acoustic emissions are generated at a microscopic level by the breakage of chemical bonds and propagate because of this chain reaction. We cannot detect the effect of the breaking of a single chemical bond, but natural phenomena retain a memory of their origins and past events [14].
Machine learning (ML) is the technology for the development of computer algorithms capable of emulating human intelligence [15]. It draws ideas from various disciplines such as artificial intelligence, probability and statistics, computer science, information theory, psychology, control theory, and philosophy [16]. This technology has been applied in different fields such as pattern recognition [17], prediction of material characteristics [18], automatic recognition of acoustic sources [19], computer vision [20], and prediction of personal perception of different stimuli [21]. The most important property of these algorithms is their ability to learn about the surrounding environment from input data, which is the key to developing an efficient automatic identification application. In this context, learning is defined as estimating dependencies from data [22]. To do this, ML algorithms are used to query large databases and discover previously unknown properties in the data. Many ML algorithms use unsupervised learning methods as a preprocessing step to increase learning accuracy before addressing the task of interest. These characteristics have suggested the use of such methodologies for the automatic identification of structural defects through acoustic emission techniques [23].
In this work, the most popular ML-based methods for AE testing are analyzed and described. The paper is structured as follows: Section 2 describes in detail the acoustic emission testing methods, analyzing the characteristics and peculiarities that make their mathematical modeling extremely complex. Section 3 analyzes the most popular acoustic emission testing methods based on ML. Finally, Section 4 summarizes the results obtained in applying these methods to real cases, highlighting their potential and listing their limits.

Acoustic Emission Testing Methods
Solids are elastic: when stressed by external loads, they deform and return to their initial configuration once the stress is removed. The maximum tolerable stress and the consequent elastic deformation depend on the solid's ability to store elastic energy. Once the elastic limit is exceeded, fracture immediately occurs in brittle materials. In materials with high plasticity, on the other hand, fracture occurs only after plastic deformation. If the material subjected to stress has defects, damage will most likely initiate around them, since the stress field is amplified near defects.

Acoustic Emission Sources
In heterogeneous materials, fracture occurs at the culmination of progressive damage due to the applied loads or the severity of the environmental conditions. The microfracturing process is accompanied by fast dislocation motion, which is associated with a rapid spontaneous release of energy in the form of transient elastic waves, or acoustic emissions (AEs). The AE event manifests itself as an elastic wave that propagates through the material toward the surface of the element, where it can be detected by appropriate sensors that transform it into an electrical signal. The transient stress wave ends when a new equilibrium configuration is reached, in which the resultant forces acting on each volume element vanish. The AE signal carries information that characterizes it and identifies its origin. It is generated only when the crack grows or when its edges touch each other. AE can therefore provide information on the origin of a discontinuity in a component subjected to loads and on its subsequent development when the component is subjected to continuous and repetitive stresses [24].
As already mentioned, when a solid is subjected to a mechanical stress of a certain intensity, it releases energy, which travels in the form of high-frequency elastic waves. These waves are captured by a sensor that converts their energy into an electrical signal. This signal is then electronically amplified and, using dedicated circuits, processed as an AE signal. The data analysis includes the characterization of the signals according to the location of the source, the intensity of the voltage, and the frequency content. This phenomenon arises in situations of various kinds, such as mechanical deformations and fractures, phase transformations, corrosion, friction, and magnetic processes [25].
In addition to the signals deriving from the AE of interest, there may be signals deriving from other causes, such as noise or AE sources not relevant to the objective of the test. The main causes of noise are friction and impacts. Friction sources are stimulated by structural loads that cause movements in the joints. Impact sources, on the other hand, can include rain, windblown dust, or flying objects. A relevant part of AE testing is, therefore, the ability to eliminate all these noise sources and focus on the one relevant to the test. This is achieved by selecting an appropriate setup, taking practical precautions to prevent noise sources as much as possible, and recognizing and removing noise from the recorded data. All this depends on the experience of those who carry out the measurements, and a wide range of data is needed to support a correct interpretation of the results [26].
AE, appearing in the form of an elastic wave, is always accompanied by vibrations. Vibrations can be studied from the point of view of wave motion. Each continuous system has continuously distributed masses and elastic forces. Such systems consist of an infinite number of particles and, therefore, require an infinite number of coordinates to describe their motion. The system must be modeled so that the motion of each point can be specified as a function of time. The resulting differential equations that describe the motion of the particles are the wave equations, which describe the propagation of waves in a solid. Wave propagation in solids is very complex, and it is, therefore, necessary to consider only the waves directly relevant to the study of vibrations.
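As a point of reference, the simplest instance of such a wave equation is the one-dimensional form for longitudinal waves in a thin homogeneous rod. The symbols below (displacement u, position x, time t, wave speed c, Young's modulus E, and density ρ) are notation introduced here for illustration and do not appear in the original text.

```latex
\frac{\partial^2 u}{\partial t^2} = c^2 \, \frac{\partial^2 u}{\partial x^2},
\qquad c = \sqrt{\frac{E}{\rho}}
```

Real structures support several coexisting wave types with different speeds, which is part of what makes the full problem complex.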
To remove parasitic oscillatory components due to structural vibrations and noise, AE detection instruments consider only part of the detected frequency spectrum: the other components are removed by band-pass filters. AE has a frequency between 20 kHz and 1 MHz, but the vibrations at lower frequencies (20-100 kHz) can be masked by external noise, and those at higher frequencies are damped very quickly, so the AE survey range is reduced to 100-700 kHz [27].
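The effect of restricting the analysis to the 100-700 kHz band can be sketched as follows. This is a minimal illustration that assumes an ideal (brick-wall) filter applied in the frequency domain with NumPy; actual AE instruments use analog or digital filters with finite roll-off. The sampling rate, the test frequencies, and the function name are assumptions chosen for the example.

```python
import numpy as np


def bandpass_fft(signal, fs, f_low=100e3, f_high=700e3):
    """Zero every spectral component outside [f_low, f_high] (ideal band-pass)."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    keep = (freqs >= f_low) & (freqs <= f_high)
    return np.fft.irfft(spectrum * keep, n=len(signal))


fs = 5e6                       # assumed sampling rate: 5 MHz
t = np.arange(5000) / fs       # 1 ms record
low_noise = np.sin(2 * np.pi * 30e3 * t)         # 30 kHz structural vibration
ae_like = 0.2 * np.sin(2 * np.pi * 300e3 * t)    # 300 kHz AE-like component
filtered = bandpass_fft(low_noise + ae_like, fs)

# The largest residual difference from the in-band component is negligible.
print(float(np.max(np.abs(filtered - ae_like))))
```

In this synthetic case both tones fall exactly on FFT bins, so the out-of-band component is removed almost perfectly; with real data, spectral leakage and filter roll-off would leave a residual.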
Recent studies have highlighted the variety of information that can be drawn from the use of these techniques. AE tests can provide the following information:

• Damage phenomena origin: Once the arrival times of the acoustic waves at each sensor and the propagation speed of the waves in the material are known, it is possible to locate an AE event within the structure of the material with great accuracy.
• Fractured area: The set of points relating to all events located in each time interval gives the density of the damage in the material. When the geometric distribution of the located points identifies a well-defined path, the progressive formation of a crack is observed. Furthermore, the speed of seismic waves in damaged areas is significantly lower than that in intact areas.
• Damage mechanism: The shape of the waves recorded by each sensor is a function of the source mechanism and depends on the path taken by the acoustic wave as it travels from the source to the receiver. The analysis of the shape of the waves makes it possible to distinguish between the different mechanisms of crack propagation.
• Stress state: There is a relationship between areas with high-velocity anomalies and regions subject to a high state of stress and, therefore, potentially damaged.
• Material properties: The frequency and amplitude of a wave traveling in a material directly depend on its properties. Measurements of seismic velocity, anisotropy, and attenuation are, therefore, sensitive to any variation in the properties of the material.
• Time-dependent behavior: Through continuous monitoring, information is obtained on the evolution of the mechanical response of a material and, therefore, on the phenomena of degradation and progressive formation of cracks.
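The localization principle in the first item of the list can be sketched for the simplest geometry: two sensors on a line with the source between them and a known, constant wave speed. Under these assumptions, the source position follows from the difference in arrival times as x = (x1 + x2 + v (t1 - t2)) / 2. The function name and the numerical values are hypothetical.

```python
def locate_linear(x1, x2, t1, t2, velocity):
    """Locate a source on the segment between two sensors (x1 < x2)
    from the arrival times t1 and t2 and a constant wave speed."""
    if not x1 < x2:
        raise ValueError("sensor positions must satisfy x1 < x2")
    return 0.5 * (x1 + x2 + velocity * (t1 - t2))


# Hypothetical setup: sensors 1 m apart, wave speed 5000 m/s, source at 0.3 m.
speed = 5000.0
true_position = 0.3
arrival_1 = (true_position - 0.0) / speed   # travel time to sensor at 0.0 m
arrival_2 = (1.0 - true_position) / speed   # travel time to sensor at 1.0 m
estimate = locate_linear(0.0, 1.0, arrival_1, arrival_2, speed)
print(estimate)
```

Only the time difference enters the formula, so the unknown emission instant cancels out. Locating a source in a plane or a volume requires more sensors and a nonlinear solution, which this sketch does not cover.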

Acoustic Emission Detection
The data recorded in the form of electrical impulses provide different information depending on the type and number of sensors and on the acquisition system. With the standard transducers available on the market, the parameters that can be evaluated (counts, amplitude, duration, and energy of the events) only allow a qualitative analysis of the phenomena being monitored. To obtain quantitative data, high-performance transducers and an extremely fast data acquisition system are required, so that information on the shape of the acoustic waves is available [28].
A system for measuring AE consists of the elements shown in Figure 1. The AE sensors generate an electrical signal when they are hit by an acoustic wave. These sensors can operate on different principles: signals can be generated by electromagnetic, magnetostrictive, and piezoelectric devices, or by laser interferometers. The most widely used sensors are piezoelectric ones, which generate an electrical voltage and a corresponding charge separation when deformed. The deformation is produced by the motion of the wave and is the elastic response of the piezoelectric crystal to the incoming stress wave. The electrical signal is generated by the material itself, which does not need external power: when hit by an impulse, it vibrates at its resonant frequency (Figure 2). Several resonant frequencies can be excited simultaneously. If excited by a vibratory motion, the piezoelectric element will produce a corresponding oscillating voltage at the same frequency as the motion. The element also has a linear response: if the input motion is doubled, the output voltage will also double [29].
The ratio between the amplitude of the output voltage and that of the input motion is a measure of the sensor sensitivity: it strongly depends on the frequency of the motion (the number of oscillations per second) and is highest at the resonant frequency of the element. The sensitivity of the sensor depends not only on the frequency but also on the direction of the motion. Unlike accelerometers, which are carefully designed to measure only the motion component parallel to their axis, acoustic emission sensors respond to motion in any direction. In the choice of sensors, the priorities are high sensitivity, a well-defined and consistent frequency response, high performance in the working environment, and immunity to unwanted noise. Noise immunity has been greatly improved by the development of sensors with integrated preamplifiers. These sensors have a preamplifier built into the housing together with the piezoelectric element and, for field tests, offer great advantages over earlier sensors, which required separate preamplifiers mounted a few meters away [30]. The sensor must be placed in proper contact with the surface of the material to be monitored in order to pick up the motion of the AE wave and provide a strong signal. The coupling and mounting techniques are, therefore, very important. For example, an acoustic couplant in the form of an adhesive, viscous liquid, or grease can be applied to the surface of the sensor, which is then pressed against the structure, whose surface must be smooth and clean. The sensor must then be firmly held in position using adhesives, magnetic bases, or other means. Finally, after mounting, the performance of the system is verified by simulating an AE signal and checking the response of the system.
The signal produced by the sensor is amplified, filtered, detected, and measured. The amplifiers increase the signal voltage to bring it to an optimal level for the measurement circuit. Along with the various amplification stages, filters are incorporated into the instrument. These define the frequency range to be used and attenuate low-frequency background noise. These amplification and filtering processes are called signal conditioning: they clean the signal and prepare it for the detection and measurement process. After conditioning, the signal is sent to the detection circuit, an electronic comparator that compares the amplified signal with a threshold voltage defined by the operator. Whenever the voltage exceeds the threshold, the comparator generates a digital pulse. The first pulse produced marks the beginning of the phenomenon (hit) and is used to activate the signal measurement process. As the signal continues to oscillate above and below the threshold level, the comparator generates additional pulses, and the electronic circuits actively measure the key characteristics of the signal. Over time, the amplitude of the signal decreases to a level where the threshold is no longer crossed. If no further pulse arrives from the comparator within a predetermined time, called the hit definition time (HDT), the event is considered to have ended. The control circuit then ends the measurement process and passes the results to a microprocessor. Finally, the measurement circuit is reset and prepared for the next event [13].
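The detection logic can be sketched in software on a sampled signal. This is a simplified illustration, not the circuit of any specific instrument: it assumes that a hit starts at the first sample above the threshold and is closed once no sample has exceeded the threshold for a number of samples equal to the HDT. The function name, the sample values, and the HDT expressed in samples are assumptions.

```python
def detect_hits(samples, threshold, hdt_samples):
    """Return (start, end) index pairs, one per hit.

    start: first sample above the threshold;
    end:   last sample above the threshold before a quiet interval
           of at least hdt_samples samples.
    """
    hits = []
    start = None
    last_above = None
    for index, value in enumerate(samples):
        if value > threshold:
            if start is None:
                start = index          # first pulse: the hit begins
            last_above = index         # every pulse restarts the HDT timer
        elif start is not None and index - last_above >= hdt_samples:
            hits.append((start, last_above))   # HDT expired: the hit ends
            start = None
    if start is not None:              # record ended while a hit was open
        hits.append((start, last_above))
    return hits


# Hypothetical sampled signal (arbitrary units) containing two bursts.
signal = [0, 0, 5, -4, 3, 0, 0, 0, 0, 0, 6, 1, 0, 0, 0, 0]
hits = detect_hits(signal, threshold=2, hdt_samples=3)
print(hits)
```

Choosing the HDT is a trade-off: a value that is too short splits one burst into several hits, whereas a value that is too long merges separate bursts.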
In many cases, the discrimination of the signal is very difficult because of the temporal overlap of different processes, including secondary effects such as reflections. In the case of fracture growth in heterogeneous materials, for example, the effects of friction between the faces formed by the fracture itself or by preexisting fractures are superimposed on the primary source, represented by the motion of the fracture tip. Preexisting defects can act as resonators and scattering centers, modifying the signal. On the other hand, regions with lower fracture strength or different composition may behave in a more brittle manner, with a consequent increase in the amplitude of the signals. The nature of the phenomena is therefore so complex that a univocal definition of acoustic emission appears difficult [12].
In general, the signals appear as isolated pulses over time (bursts), formed by a very rapid rising edge followed by an exponentially decaying trend, partly due to the response of the transducer (Figure 3). Furthermore, it is possible to identify stationary or quasi-stationary components, free from decay, in which the single AE events are so close together that they cannot be distinguished. These contributions are of smaller amplitude than the burst events, due both to dispersion in the medium and to other causes such as viscoelasticity and internal friction. The impulsive components (burst signals) are caused by the initiation and growth of fractures, whereas the continuous part of the signal is due to plastic deformation as well as external noise: in plastic deformation, the energy involved is converted into the work needed to overcome the resistance to the motion of dislocations. This movement produces within the material a stress wave that propagates to the surface and manifests itself as an AE wave [14]. An acoustic emission has the appearance of a damped sinusoid and is described by a set of characteristic quantities. Conventionally, referring to the waveform, we speak of a hit when the AE transient is identified and processed by a given channel, and of an event when the source of this same acoustic emission is located. With reference to the ASTM E1316 [31] standard, the following definitions are given:

• Hit: term indicating that a given channel has identified and processed an acoustic emission transient.
• Event: an acoustic emission wave can be identified in the form of a hit on one or more channels. An event is a group of hits received by two or more channels from a single source, which is located.
The ASTM E610-89 [32] standard defines the standard terminology relating to the quantities and phenomena involved in the study of acoustic emissions. An AE event is characterized by the following:

• Amplitude: the largest value present in the waveform of the signal, linked to the type of source that produced it, the material, and its state of stress. It is generally measured in decibels, on a scale ranging from 0 to 100.
• Duration: the time elapsed between the first and the last threshold crossing. The relationship between duration and amplitude gives information on the shape of the signal.
• Counts: number of times the amplitude of the signal exceeds a predetermined threshold. A single hit can provide only a few counts or hundreds of counts, depending on the intensity and shape of the signal.
• Count rate: number of counts per unit of time.
• Counts to peak: number of counts up to and including the maximum amplitude.
• Signal energy: total elastic energy released in each event.
• Rise time: time interval between the first threshold crossing and the maximum amplitude.
• Damping time: time between the peak and the last threshold crossing.
• Dead time: time after which, if the threshold is not exceeded, the event is considered to have ended.
In addition to measuring the characteristics of individual signals, the instruments generally measure the time in which they are identified and the environmental variables that can cause the activity.
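The quantities listed above can be computed from a digitized hit as in the following sketch. Several choices are assumptions made for illustration: the threshold is positive, the amplitude in decibels is referred to 1 µV at the sensor output, counts are taken as upward threshold crossings, and the energy is approximated by the integral of the squared voltage, which is a proxy for, not a direct measure of, the elastic energy released. The function name and the waveform are hypothetical.

```python
import math


def hit_features(waveform, threshold, fs, ref_volts=1e-6):
    """Extract classical AE features from one hit.

    waveform: voltages in volts, threshold: positive voltage in volts,
    fs: sampling rate in Hz, ref_volts: dB reference (assumed 1 microvolt).
    """
    above = [i for i, v in enumerate(waveform) if v > threshold]
    if not above:
        return None                      # the threshold is never exceeded
    first, last = above[0], above[-1]
    peak_index = max(range(len(waveform)), key=lambda i: waveform[i])
    # Upward crossings: samples above the threshold whose predecessor is not.
    crossings = [i for i in above if i == 0 or waveform[i - 1] <= threshold]
    segment = waveform[first:last + 1]
    return {
        "amplitude_db": 20.0 * math.log10(waveform[peak_index] / ref_volts),
        "duration": (last - first) / fs,
        "counts": len(crossings),
        "counts_to_peak": sum(1 for i in crossings if i <= peak_index),
        "rise_time": (peak_index - first) / fs,
        "damping_time": (last - peak_index) / fs,
        "energy": sum(v * v for v in segment) / fs,   # V^2 * s, a proxy only
    }


# Hypothetical damped oscillation sampled at 1 MHz, values in volts.
wave = [0.0, 0.5e-3, -0.5e-3, 1.0e-3, -1.0e-3, 0.6e-3, -0.6e-3, 0.2e-3, -0.2e-3, 0.0]
features = hit_features(wave, threshold=0.3e-3, fs=1e6)
print(features)
```

Feature vectors of this kind are the typical input to the clustering and classification methods reviewed in the following sections.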
In AE techniques, the response of the material under test can be observed throughout the entire stress history, without any damage to the specimen. In addition, this monitoring can be carried out with fewer sensors than other nondestructive techniques. The sensors are fixed in a single position on the surface of the specimen for the duration of the test, since point-by-point scanning is not necessary, nor is it necessary to access both sides of the object. This methodology can only detect the formation of new cracks, the progression of existing cracks, or friction processes, phenomena resulting from the application of external loads or from internal mechanical or thermal loads. AE-based methods can be applied under normal operating conditions or during a slight increase in the load: this makes them particularly suitable for testing structures under real load conditions to identify a possible failure.

Machine-Learning-Based Methodology for AE Testing
AE testing has obvious limitations regarding its reproducibility: as noted above, this type of test involves the formation or progression of cracks in the material. Even specimens of the same material, with the same dimensions, and subjected to the same load cycle do not necessarily produce the same results. This is especially true for anisotropic and heterogeneous materials. Moreover, since the precursor signals are weak, particularly sensitive sensors are needed to detect the energy released in the material. Further problems arise from the attenuation of the acoustic stress wave, which is dispersed in the material as it propagates, and from noise generated by sources unrelated to the structural defect, which can disturb the detection.
To overcome these limitations, researchers have adopted alternative methodologies to improve the results of structural damage identification procedures. The pattern-detection capabilities of ML-based technologies were quickly noticed by AE researchers. To make this nondestructive testing method even more effective, ML-based methodologies for recognizing the stress wave can be applied during the detection of the acoustic emission generated by the source. In this way, it is possible to carry out a test that is robust to noise and effective in detecting weak waves. The most common ML-based methodologies applied in the field of AE are presented below.
ML is a branch of artificial intelligence whose goal is to allow machines to learn automatically from experience, without being explicitly programmed for the task. Experience is a collection of data, which can be fixed and immutable or can expand over time. Learning can be carried out through two main approaches: supervised and unsupervised.
Supervised algorithms can be used for what is commonly called classification. The peculiarity of these algorithms is that the data on which they are trained are labeled; that is, each element of the set is matched to the class it belongs to. Their purpose is, therefore, to assign new data, not part of the training set, to the correct class.
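A minimal sketch of supervised classification is the nearest-neighbor rule, in which a new element receives the label of the closest labeled training element. The technique is chosen here only for its simplicity and is not attributed to any of the works reviewed; the two features (duration in microseconds, amplitude in decibels), their values, and the class labels are invented for illustration.

```python
import math


def nearest_neighbor_predict(train_points, train_labels, query):
    """Return the label of the training point closest to the query (1-NN)."""
    closest = min(range(len(train_points)),
                  key=lambda i: math.dist(train_points[i], query))
    return train_labels[closest]


# Hypothetical labeled training set: (duration in microseconds, amplitude in dB).
train_points = [(20.0, 45.0), (25.0, 50.0), (150.0, 80.0), (170.0, 85.0)]
train_labels = ["noise", "noise", "crack", "crack"]

print(nearest_neighbor_predict(train_points, train_labels, (160.0, 82.0)))
print(nearest_neighbor_predict(train_points, train_labels, (22.0, 47.0)))
```

In practice the features would be rescaled before computing distances, so that no single feature dominates because of its units.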
Unsupervised learning uses a more independent approach and is very useful in situations where labeled data, such as historical data with known results, are not available. In this approach, the information supplied to the machine is not labeled, so the algorithm must extract structure from the data without any examples or knowledge of the expected results.

Clustering Techniques
Clustering techniques are based on unsupervised algorithms that work on data without knowing the class they belong to. All clustering techniques are based on distance measurements between the elements of the dataset to be analyzed [33]. The elements that are more similar to each other will end up in the same cluster, that is, in the same group, whereas the less similar elements will end up in different clusters. Clustering techniques are mainly used for two purposes. The first is understanding: by observing the clusters that form in a dataset, it is possible to identify relationships and patterns in the data. The second is the reduction of a dataset with too many elements to a smaller one [34]. This may be necessary, for example, to speed up the training of a classification algorithm whose training set is too large. By applying a clustering algorithm, it is possible to find the main clusters, that is, the subsets of data that are similar to each other, and to retain only the most representative elements of each cluster [35].
Clustering algorithms can be grouped into two large families:
• Agglomerative or bottom-up methods: Initially, all the elements of the dataset are treated as separate clusters. The clusters closest to each other are then merged, thus aggregating smaller clusters into larger ones. This continues until a predetermined condition is reached, which could be the number of clusters, a minimum distance between clusters, or another criterion, depending on the algorithm used.
• Divisive or top-down methods: The procedure starts from a single large cluster that contains all the elements. The cluster is then divided into smaller and smaller clusters until a stop condition is reached, which is usually the desired number of clusters.
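The bottom-up procedure can be sketched in a few lines. The example assumes single linkage (the distance between two clusters is the smallest distance between any pair of their elements) and uses the desired number of clusters as the stop condition; both are illustrative choices among several possible ones, and the data points are invented.

```python
import math


def agglomerative(points, n_clusters):
    """Merge the two closest clusters (single linkage) until n_clusters remain.
    Returns clusters as sorted lists of point indices."""
    clusters = [[i] for i in range(len(points))]   # every element starts alone

    def linkage(a, b):
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage(clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]    # merge the closest pair
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical two-dimensional feature vectors forming two separated groups.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (5.0, 5.0), (5.1, 5.0)]
groups = agglomerative(data, n_clusters=2)
print(groups)
```

This direct implementation recomputes all pairwise linkages at every step and is suitable only for small datasets.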
There are also other subdivisions of clustering algorithms. One of these contrasts exclusive and nonexclusive clustering. In exclusive clustering, an element can belong to one and only one cluster. In nonexclusive clustering, also known as fuzzy clustering, an element can belong to several clusters at the same time, to each with a probability p. The probabilities of membership of an element in the various clusters must sum to one [36].
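In symbols (notation introduced here for illustration, not taken from the cited works), if u_ij denotes the degree of membership of element i in cluster j among C clusters, the normalization condition reads:

```latex
u_{ij} \in [0, 1], \qquad \sum_{j=1}^{C} u_{ij} = 1 \quad \text{for every element } i .
```

For example, with C = 3, an element with memberships (0.7, 0.2, 0.1) satisfies the condition, whereas exclusive clustering corresponds to the special case in which one membership equals 1 and all the others equal 0.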
Fuzzy c-means clustering iteratively searches for a set of fuzzy clusters and associated cluster centers that characterize the data structure in the best possible way [37]. The user specifies the number of clusters to locate in the dataset to be grouped. Omkar et al. [38] applied fuzzy c-means (FCM) techniques to classify the acoustic emission (AE) signal into different signal sources. The authors performed the test using a pulse, pencil, and spark signal source on the surface of a solid block of steel. Using the AET 5000 system, they measured four parameters: event duration, peak amplitude, rise time, and ring-down count. Marec et al. [39] Oskouei et al. [40] adopted fuzzy c-means clustering associated with a principal component analysis to detect damage in glass-polyester composites with the AE technique. Behnia et al. [41] evaluated the damage of concrete structures subjected to pure torsional load by proposing a method based on AE and kernel fuzzy c-means. Time and frequency domain signals were used to classify the damage. Mohammadi et al. [42] studied the damage mechanisms in standard open-hole tensile (OHT) laminated composites through AE. The authors used wavelet transforms as a descriptor and fuzzy c-means clustering to distinguish sample damage mechanisms. Three damage types were detected: matrix cracking, fiber-matrix debonding, and fiber breakage. Saeedifar et al. [43] detected interlaminar and intralaminar damage induced by indentation in laminated carbon-epoxy composites with AE and six different clustering methods, including fuzzy c-means. Zhu et al. [44] estimated the leakage rate of a valve in a pipeline from AE signals using various clustering techniques, including fuzzy c-means. Shateri et al. [45] detected damage in fiber-reinforced polymer (FRP) bars by applying a fuzzy c-means clustering algorithm to AE signals. Fotouhi et al. [46] identified the damage in the mixed-mode delamination of laminated composites using fuzzy clustering and the acoustic emission technique. Sayar et al. [47] investigated damage mechanisms in an open-hole carbon-epoxy laminate composite using wavelet packet transform and fuzzy c-means methods. Zhao et al. [48] detected failure of carbon-glass epoxy hybrid braided composites under tensile load based on acoustic emission signals and a fuzzy c-means algorithm. Mi et al. [49] adopted fuzzy c-means to detect damage in fiber-resin composite structures, which is closely related to the fiber weaving methods (FWMs). Pei et al. [50] identified the progressive tensile damage of carbon fiber composites reinforced with multiwalled carbon nanotubes using AE and fuzzy c-means algorithms. Pomponi et al. [51] adopted an unsupervised approach for detecting plastic deformation, crack initiation, and corrosion cracking. The authors proposed a simple but effective non-iterative clustering algorithm (adaptive sequential k-means) oriented to acoustic emissions. The number of clusters is not specified a priori but deduced from the data, whereas the properties of the background noise control the creation of new clusters. The approach proved effective in grouping the AE signals associated with different emission sources (performance indexes = 0.776, 0.751). Table 1 summarizes the essential characteristics of the clustering-based methodologies.
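The alternating update of cluster centers and memberships that underlies fuzzy c-means can be illustrated with a short Python sketch. It is not the implementation used in any of the cited works: the function name `fuzzy_c_means`, the fuzzifier m = 2, the fixed number of iterations, and the random initialization of the memberships are all assumptions made for the example.

```python
import random

def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, seed=0):
    """Return (centers, memberships) for a list of equal-length numeric tuples.

    m is the fuzzifier (m > 1); m = 2 is an assumed, commonly used default.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    # Random initial memberships, normalized so that each row sums to one.
    u = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(n_iter):
        # Step 1: each center is the mean of all points weighted by u^m.
        centers = []
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(len(points))]
            w_sum = sum(w)
            centers.append([
                sum(w[i] * points[i][d] for i in range(len(points))) / w_sum
                for d in range(dim)
            ])
        # Step 2: memberships follow from the relative distances to the centers.
        for i, p in enumerate(points):
            dists = [
                max(1e-12, sum((p[d] - c[d]) ** 2 for d in range(dim)) ** 0.5)
                for c in centers
            ]
            for k in range(n_clusters):
                u[i][k] = 1.0 / sum(
                    (dists[k] / dj) ** (2.0 / (m - 1.0)) for dj in dists
                )
    return centers, u
```

By construction, the memberships of each element sum to one, as required for fuzzy clustering; a hard label can be obtained, if needed, by taking the cluster with the largest membership.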

Artificial Neural Network (ANN) Techniques
ANNs are composed of artificial neurons organized in an interconnected structure that links the inputs and outputs of the various neurons [52]. This type of structure allows neurons to receive both initial data and data processed by other neurons, depending on the level of the neuron. The ANN architecture arranges neurons on different levels; the number of levels and the number of neurons in each level characterize the structure of the network [53].
ANNs are composed of layers containing a certain number of nodes: each node connects to others and has an associated weight and threshold. If the output of any single node is above the specified threshold, that node is activated and sends the data to the next network layer. Otherwise, no data are passed to the next network layer. From a mathematical point of view, an ANN expresses a function f as a composition of other functions g, which, in turn, can be expressed as compositions of simpler functions. An ANN is an interconnected set of elementary functions in which the outputs are the inputs of the subsequent functions. In general, ANNs rely on training data to learn how to improve their accuracy. Once optimized, these learning algorithms are powerful tools in computer science and artificial intelligence. At the base of the ANN, there is the perceptron, in complete analogy with the neuron in a biological neural network. In Figure 4, each dot is a node representing a perceptron. A perceptron is a function that takes n input elements and returns a single output, which is sent as input to subsequent perceptrons [54]. An ANN is an adaptive system capable of modifying its structure based on both external data and internal information that passes through the network during the learning phase. A biological neural network receives external data and signals; these are processed into information through an impressive number of interconnected neurons arranged in a nonlinear structure that varies in response to those data and external stimuli. Similarly, ANNs are nonlinear statistical data-modeling tools: they receive external signals on a layer of nodes. Each of these input nodes is connected to various internal nodes of the network, which are typically organized in multiple levels so that each node can process the received signals and transmit the result of its processing to subsequent levels [55].
Generally, ANNs consist of three layers (Figure 4):

• Input layer: It has the task of receiving and processing the input signals, adapting them to the demands of the neurons in the network.
• Hidden layer: In this layer, the data processing takes place.
• Output layer: The results of the processing of the hidden layer are collected here and adapted to the requests of the next-level block of the neural network.
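The thresholded node and the three-layer structure described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names (`step_neuron`, `dense_layer`, `forward`), the sigmoid activation, and the use of a bias term in place of an explicit threshold are choices made for the example and are not taken from the cited works.

```python
import math

def step_neuron(inputs, weights, threshold):
    """Perceptron-style node: it fires (returns 1) only if the weighted
    sum of its inputs exceeds the threshold; otherwise it passes nothing on."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total > threshold else 0

def sigmoid(z):
    """Smooth activation, used here as an assumed replacement for the hard step."""
    return 1.0 / (1.0 + math.exp(-z))

def dense_layer(inputs, weight_rows, biases):
    """One layer: every node applies the activation to its own weighted sum."""
    return [
        sigmoid(sum(x * w for x, w in zip(inputs, row)) + b)
        for row, b in zip(weight_rows, biases)
    ]

def forward(x, hidden_w, hidden_b, out_w, out_b):
    """Input layer -> hidden layer -> output layer, i.e., a composition of functions."""
    hidden = dense_layer(x, hidden_w, hidden_b)
    return dense_layer(hidden, out_w, out_b)
```

The output of `forward` is literally a function of a function of the input, which is the compositional view of an ANN; training would consist of adjusting the weights and biases, which this sketch leaves fixed.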
ANNs have been widely used to classify the acoustic emission (AE) signal. De Oliveira et al. [56] developed a procedure for the identification of damage in composite materials based on the grouping of acoustic emission signals using artificial neural networks. The authors adopted an unsupervised methodology based on the Kohonen self-organizing map. The methodology was tested on a cross-ply fiberglass-polyester laminate subjected to a tensile test. Kalafat et al. [57] developed an acoustic emission localization system based on the use of ANNs. Acoustic sources were applied to the test object to obtain data to be sent as input to an ANN. The method was tested on a type III carbon-fiber-reinforced polymer pressure vessel with a metal liner. Boczar et al. [58] investigated the recognition of single-source partial discharge forms that occur in the insulation systems of power transformers. The system developed by the authors uses unidirectional (feed-forward) artificial neural networks for the recognition of the acoustic emissions generated by paper-oil insulation altered by aging processes. Ativitavas et al. [59] identified the type of discontinuity and failure mechanisms within fiber-reinforced plastic (FRP) structures using acoustic emission (AE) data as input to an ANN-based system. The authors developed two types of networks, based on back-propagation and on a probabilistic method, with two levels to improve the accuracy of the forecast. Moia et al. [60] monitored the dressing operation of an aluminum oxide wheel: the dressing operation is necessary to restore the normal operation of a worn wheel. The statistics obtained from the measured acoustic emission (AE) signal were used as input to a classification algorithm based on neural networks. Two classes were identified: sharp and dull wheel. Jierula et al. [61] used ANNs to identify damage locations in deep piles using AEs. The authors performed an impact test on a circular-section concrete column of a building. Łazarska et al. [62] monitored the steel hardening process using AEs and neural networks. Three types of events were detected in this study: high, medium, and low energy. These events allow the monitoring of the decay process of metastable austenite into bainite and martensite. The method made it possible to identify the alterations that occur on a microscopic scale. Schabowicz et al. [63] studied the degree of degradation of fiber cement panels exposed to fire with the AE method and with the use of ANNs. The fiber cement panels were exposed to fire and subsequently subjected to three-point bending while the acoustic emission was recorded: the collected signals were used as input for the ANNs. The degradation of the fibers contained in the boards increases with increasing exposure to fire, with a decrease in the number of AE events recognized by the ANNs as identifiers of fiber breakage. Nasir et al. [64] used AEs of thermally modified western hemlock wood as input to an ANN-based model for the classification of the heat treatment level. The authors used a high-sensitivity broadband differential AE sensor to detect the stress wave generated in the wood: this signal was subsequently processed to extract time, frequency, and wavelet domain features. These features were sent as input to three types of networks: multilayer perceptron, group method of data handling, and linear vector quantization. Elforjani et al. [65] identified deviations from normal bearing operating conditions by detecting AE signals and using ANNs. The AE measurements were performed with piezoelectric sensors mounted on bearings placed on a test bench at a speed of 72 rpm with an axial load of 50 kN. Three models were implemented: ANN, support vector machine, and Gaussian process regression. Table 2 summarizes the key aspects of the ANN-based methodologies.

Deep-Learning (DL) Methods for AE Testing
DL is a branch of ML that uses models consisting of multiple levels of information representation, built from a simplification of biological neural systems [66]. These models are based on ANNs organized in layers of nodes; each layer is connected to the next by connections whose weights indicate the strength of the connection between two nodes. The weights are optimized during training through a back-propagation process, which minimizes the network error by slightly adjusting the weight values [67].

Convolutional Neural Network (CNN) Solutions
The architecture of CNNs differs from the common model of neural networks in that the intermediate layers are not fully connected (Figure 5). The input layer receives the data supplied to the network and is sized according to the specific characteristics of the input data [68]. The convolutional layer follows, from which the network takes its name; it carries out a convolution operation to recognize specific characteristics in the data. There may be several convolutional layers, depending on the complexity of the characteristics to be recognized. Subsequently, the pooling layer can reduce the dimensionality by eliminating what is superfluous. Finally, the output stage necessary for the classification consists of a fully connected layer that connects all neurons to classify the characteristics identified by the previous layers [69]. The main operation of the convolutional layer is convolution, which consists of summing (or, for continuous signals, integrating) the product of two functions, one shifted with respect to the other. This operation can be considered as the application of a filter consisting of a matrix (kernel) of smaller dimensions than the input data to which it is applied. Each application of the matrix is a scalar product between the kernel weights and the so-called receptive field, a subset of the data having the same dimensions as the kernel. To perform the convolution on all data, the filter is shifted by an amount equal to the stride until it reaches the edge [70]. At the end of the scan, another matrix, called a feature map, is obtained that highlights a particular characteristic of the data. To carry out the recognition, multiple filters are used at the same time; this produces a tensor at the output whose depth is equal to the number of filters used. Each filter involves a number of synaptic values (weights) equal to the size of the kernel; the number of parameters therefore does not depend on the size of the image but only on the kernel size and the number of filters [71].
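The sliding of a kernel over the input with a given stride can be sketched in plain Python. This is an illustrative sketch only: the names `conv2d` and `conv_param_count` are hypothetical, no padding is applied, and the kernel is not flipped (strictly a cross-correlation, which is what CNN layers commonly compute). The bias term counted in `conv_param_count` is also an assumption.

```python
def conv2d(image, kernel, stride=1):
    """Slide the kernel over a 2D list and return the feature map.

    Each output value is the scalar product between the kernel weights
    and the receptive field currently under the kernel (no padding).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = (len(image) - kh) // stride + 1
    out_w = (len(image[0]) - kw) // stride + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            r0, c0 = i * stride, j * stride  # top-left corner of the receptive field
            row.append(sum(
                image[r0 + a][c0 + b] * kernel[a][b]
                for a in range(kh) for b in range(kw)
            ))
        feature_map.append(row)
    return feature_map

def conv_param_count(kernel_h, kernel_w, in_channels, n_filters):
    """Trainable parameters of a convolutional layer, assuming one bias per filter.

    The image size does not appear: only kernel size, input depth,
    and number of filters matter.
    """
    return n_filters * (kernel_h * kernel_w * in_channels + 1)
```

For instance, under these assumptions a layer with eight 3 × 3 filters on a single-channel input has 8 × (9 + 1) = 80 parameters, whatever the size of the input image.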
The output generated by the convolutional layers provides more detailed and consistent information than the starting image. However, in most applications, a high image resolution is not necessary, so it is possible to select only the useful information by reducing the size of the feature maps for subsequent processing. The purpose of the pooling layer is precisely to resize the feature maps while leaving the features of interest unchanged [72]. There are different pooling mechanisms; the most used is max pooling, which consists of applying a filter, usually of 2 × 2 size, that moves over the feature map with a step of the same length. The pooling filter identifies the receptive fields and finds the maximum value for each [73]. Finally, the fully connected layers carry out the classification and generate the output of the neural network. These layers receive as input the matrix manipulated by the previous layers and produce a vector of dimension N, where N corresponds to the number of classes to be predicted. By analyzing the correlations present in the matrices, the relative probability of belonging to each class is calculated [74,75].
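The two final stages described above, max pooling and the conversion of class scores into probabilities, can be sketched as follows. The function names are hypothetical, and the use of a softmax to obtain the class probabilities is an assumption made for the example, since the text does not name the function used.

```python
import math

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling: the window moves with a step equal to
    its own side, and only the maximum of each receptive field is kept."""
    return [
        [
            max(feature_map[r + a][c + b] for a in range(size) for b in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]

def softmax(scores):
    """Turn N raw class scores into N probabilities that sum to one."""
    top = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

A 4 × 4 feature map pooled with a 2 × 2 window becomes a 2 × 2 map, so the amount of data passed on is reduced by a factor of four while the strongest responses are preserved.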
Shevchik et al. [76] applied neural-network-based AE testing to the quality monitoring of additive manufacturing. The acoustic emission signals were collected by a fiber Bragg grating sensor during the powder bed additive manufacturing process in a selective laser-melting machine. The relative energies of the narrow frequency bands of the wavelet packet transform were extracted and sent to a classifier based on a spectral convolutional neural network. Han et al. [77] simulated the effects of seismic events on a physical scale model of a pile-dwelling foundation using a vibrating table. The AE signals were collected with accelerometers and sent to a CNN for the classification of the damage suffered by the structure. The authors demonstrated that the robustness of the method depends on the quality of the data and highlighted four points: preparing the data with accurate labeling is crucial; a variety of damage-induced AE signal types occurring in reinforced concrete structures must be considered; the model must be trained with various types of ambient noise; and, finally, the AE signals attenuate quickly with distance.
Hesser et al. [78] used piezoelectric sensors to collect AE signals from an aluminum plate excited with pencil-lead breaks and with two steel balls of different diameters. The signals in the time and time-frequency domains were extracted and sent to an ANN and a CNN-1D. Subsequently, the RGB images of the wavelet transforms were extracted and sent to a CNN-2D. The best classification results were obtained from the Conv2D architecture, which uses the deep transfer learning method, and the VGG16 architecture. Furthermore, deep transfer learning significantly reduces the number of parameters required for training the model. König et al. [79] monitored and classified the multivariant wear behavior of plain bearings. The AE signals were collected using a Nano30 sensor with a good frequency response (150-750 kHz), and subsequently, the continuous wavelet transform (CWT) was evaluated. The extracted features were sent to a 22-layer deep convolutional neural network (GoogLeNet) for classification. Plain bearings are subject to multiple wear mechanisms, and AE can detect critical operating conditions of wear: the spectra of the corresponding AE signals were in the frequency range from 40 kHz up to 700 kHz. Ebrahimkhanlou et al. [80] used CNNs to identify the area where an AE source is generated. The authors validated the methodology on a metal plate excited with Hsu-Nielsen sources [81]. Li et al. [82] extracted synchrosqueezed wavelet transforms from AE signals collected in railway crack monitoring. These features were used for the classification of cracks with a CNN. Three types of acoustic emission waves were classified using synchrosqueezed wavelet transform plots at various time-frequency scales. To speed up the training procedure, the authors applied transfer learning, whereas Bayesian optimization was applied to optimize the hyperparameters. Guo et al. [83] identified damage in carbon-fiber-reinforced composites using AEs and CNNs. The authors collected AE signals from tensile tests artificially causing fiber breakage, matrix breakage, and delamination. The data obtained in the form of time series were sent directly to an InceptionTime CNN, obtaining a very high classification accuracy. Appana et al. [84] applied CNNs to effectively classify AE signals for bearing failure diagnosis. The authors extracted envelope spectra (ES) from raw AE signals; since these features demodulate the signals, returning information on the frequency of failures and on variations at unstable speeds, the CNNs learned to extract distinctive features to effectively diagnose the defects of the bearings. Hasan et al. [85] studied the incipient failures of a bearing through the analysis of AE signals and the application of CNNs. The authors used acoustic spectral imaging (ASI) of AE signals to train a CNN with transfer learning. Xia et al. [86] identified failures of rotating machinery with CNNs and AE signals, considering the temporal and spatial information of raw data from multiple AE sensors. Table 3 summarizes the described methods.

Recurrent Neural Network (RNN) Based Applications
RNNs are DL models introduced for the processing of sequential data, or data in which the order of the observations is important [87]. These algorithms find application above all in the analysis of time series [88]. The idea behind the RNN is the sharing of parameters, or the use of the same parameters along positions, instants of time, or, more generally, steps of the sequence, for two main reasons:

• Possibility of applying the model, in the test phase, to sequences of different lengths from those seen by the algorithm in the training phase;
• Reduction in the number of parameters and the ability to recognize information in different positions along the sequence.
An important role in these networks is played by hidden units. When the RNN is trained to carry out an activity that requires forecasting the future from the past, the network typically learns to use the hidden units as a lossy summary of the task-relevant aspects of the input sequence up to time t [89]. This summary is, in general, necessarily lossy, since the hidden layer maps a sequence of arbitrary length into a vector of fixed length [90].
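Parameter sharing and the fixed-size hidden state can be illustrated with a scalar recurrence. This is a minimal sketch, assuming a single hidden unit and a tanh activation; the function name `rnn_forward` and the parameter names `w_in`, `w_rec`, and `bias` are hypothetical.

```python
import math

def rnn_forward(sequence, w_in, w_rec, bias, h0=0.0):
    """Elman-style recurrence with one hidden unit.

    The same three parameters (w_in, w_rec, bias) are reused at every
    step, so the function accepts sequences of any length.
    """
    h = h0
    states = []
    for x in sequence:
        # The new state summarizes the whole past in a single number.
        h = math.tanh(w_in * x + w_rec * h + bias)
        states.append(h)
    return states
```

The same call works unchanged on a sequence of three samples or of three thousand, and the state after step t depends only on the inputs up to t, which is why it acts as a lossy summary of the past.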
A problem with this type of model is that, for particularly long sequences, the gradients propagated back through time tend either to grow toward infinity (exploding gradient) or, more commonly, to shrink toward zero; the latter case is known as the vanishing gradient problem. This means that an RNN thus constructed struggles to encode temporally distant dependencies. To counteract the vanishing gradient, cells with a specific structure are used, which contain mechanisms to build a memory that can also be propagated [91].
One of these is the long short-term memory (LSTM), a particular RNN that solves the problem of the vanishing and exploding gradient that compromises the effectiveness of the RNN [92]. The principle behind the LSTM is the memory cell, which maintains the state outside the normal flow of the recurrent network. The state has a direct connection with itself. Since the activation function for updating the state is an identity function, the derivative is unitary; therefore, the gradient in the back-propagation will not vanish or explode but will remain constant through all the time instants of the unfolded network (Figure 6) [93].
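The contrast between a squashing recurrence and an identity self-connection can be checked numerically. The sketch below is a deliberate simplification: it uses a single scalar unit, it ignores the LSTM gates entirely, and the function names are hypothetical.

```python
import math

def tanh_rnn_gradient(inputs, w_rec, w_in=1.0):
    """d h_T / d h_0 for h_t = tanh(w_rec * h_{t-1} + w_in * x_t).

    By the chain rule, the gradient is the product over all steps of
    w_rec * (1 - h_t^2), and each factor is below |w_rec| in magnitude.
    """
    h, grad = 0.0, 1.0
    for x in inputs:
        h = math.tanh(w_rec * h + w_in * x)
        grad *= w_rec * (1.0 - h * h)
    return grad

def identity_state_gradient(inputs):
    """d c_T / d c_0 for c_t = c_{t-1} + x_t (identity self-connection).

    The local derivative of every step is exactly 1, so the product
    stays 1 however long the sequence is.
    """
    grad = 1.0
    for _ in inputs:
        grad *= 1.0
    return grad
```

With `w_rec = 0.9` and 50 steps, the first gradient is bounded by 0.9^50 (about 0.005) and is in practice far smaller, whereas the second remains equal to one.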
In many applications where the goal is not the prediction of one or more future values starting from the past (known) values, bidirectional recurrent neural networks can be used. This type of algorithm uses two recurrent layers in the same network, without direct connections between them, thus exploiting both the information in the order passed as input to the model and the information in the opposite order; this removes a limitation of classic recurrent networks at the cost of introducing additional parameters [94].
The signals deriving from acoustic emission are an example of a temporal sequence that can be effectively modeled using RNNs. Zheng et al. [95] effectively identified the arrival of a microseismic emission by exploiting the RNN. The AEs deriving from seismic events show a fair variability of the stress waveforms and considerable differences in the trigger source phases of the rupture sources, which make them difficult to identify with traditional techniques. The authors showed that RNNs can identify microseismic waves with good anti-interference capabilities. However, a significant number of labeled data sequences is needed to achieve these results. König et al. [96] detected plain bearing wear using AE signals and RNNs. The AE signals measured on a planetary gearbox plain bearing test stand were sent to an LSTM: the network was able to reconstruct the bearing wear history. Kolář et al. [97] used RNNs to identify three events in the AE generated by the uniaxial loading of a sample of Westerly granite. The method also allows calculating the position of the AE source and its moment tensor. Li et al. [98] addressed the gear pitting fault diagnosis problem with RNNs and CNNs. The authors adopted the gated recurrent unit (GRU) network, an RNN with only three gates and no internal cell state. The information that an LSTM stores in the internal cell state is embedded in the hidden state of the gated recurrent unit. This collective information is passed on to the next gated recurrent unit. The authors sent the raw AE signals to a CNN and the vibration signals to a GRU network; finally, the concatenated outputs are sent to a softmax layer to diagnose gear pitting failures. Hsu et al. [99] used the AE generated with multiple pencil-lead breaks on a U-shaped aluminum plate for structural health monitoring. Two broadband AE sensors collect the time series of the direct and reflected waves of the AE signals, which are sent to an LSTM.
Nguyen et al. [100] predicted the failure of a concrete structure using AE signals and RNNs. The raw signals are first preprocessed by an SVM (support vector machine) to extract only the signals relevant to the construction of the health indicator of the concrete structure. These indicators are then sent to an LSTM for the forecast of the remaining useful life. Bi et al. [101] used the AE signals of a diamond wheel grinding brittle materials to construct a wheel condition regression prediction model. The AE components are well separated in the frequency domain, highlighting wheel deterioration events. The frequency spectrum of the AE signal was arranged into a time sequence during grinding. These data were sent to an LSTM-based regression prediction model. Zhang et al. [102] used RNNs to detect railway cracks from AE signals. In this work, a NARX (nonlinear autoregressive with exogenous input) network was used to eliminate noise from the useful signal. First, the AE signals were collected under real operating conditions, and subsequently, the crack signals were added with an artificial generation system. AE noise was effectively eliminated, bringing the SNR to values above 20 dB. In this way, it was easy to distinguish the AE signal deriving from the railway cracks from the background noise. Xia et al. [103] used the raw AE signal to accurately estimate the remaining useful life of machines. Sequential data from the sensor network are merged and sent to the model, bypassing the feature extraction procedure, which requires prior knowledge. The prediction model is based on a hybrid approach that combines LSTM layers, which extract the temporal information of the sequential data, with classical neural networks. Haile et al. [104] used acoustic emission signals for the detection and localization of structural damage. Recurrent neural networks were adopted to process the raw time series data detected by the AE sensors. The model can extract the characteristics of direct and reflected acoustic emission waves: the reflected waves are filtered, whereas the direct waves are used to identify the position of the source. The filtering of the reflected waves is necessary to eliminate the noise due to geometrically complex structures. In such structures, the direct waves can reach obstacles and be reflected, generating noise and modulation. Table 4 summarizes the described methods.

Comparison between ML-Based Methods for AE
In the previous sections, we have collected various contributions from the scientific community that have adopted ML-based methodologies to address the problem of structural health monitoring using AE. To compare the different technologies available, it is appropriate to apply the different algorithms to the same dataset. For this purpose, we used a dataset containing measurements performed with AE sensors on concrete specimens [105]. The characteristics of the instrumentation used for data collection are shown in Table 5. The tests were carried out on concrete specimens with dimensions of 15 × 15 × 15 cm, left for 28 days at a temperature of about 20 °C and a humidity of 95%. Each specimen was subjected to a nondestructive compression test to induce the formation of cracks. The data were subsequently labeled by grouping them into three classes: tensile, shear, and mixed. Several ML-based algorithms were applied to address the data classification problem. The dataset consists of 1650 records and 1000 features.
For the validation of the results, k-fold cross-validation was applied. It is one of the most widespread validation techniques for ML models, is used to quantify the accuracy of the prediction, and is a good preventive measure against overfitting. If the sample dataset is limited, due to experimental problems or the impossibility of repeating the experiment to obtain a greater number of examples with which to train the algorithm, a method that splits the available data into different groups is often used. In the K-fold method, the dataset available at the beginning of the experiment is divided into K groups, of which K-1 are used for training and the remaining group for the generalization test. This procedure is repeated for all K groups, changing each time the group chosen for generalization. This has the advantage that all examples are used, at least once, for both training and testing. On a practical level, this method is very similar to the bootstrap and has the advantage of allowing the characteristic parameters of the pattern recognition task to be estimated from the distribution of the results obtained. A drawback of the method is its computational cost, since the model must be trained and evaluated once for each of the K groups. In our case, a cross-validation with five folds was applied.
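The K-fold procedure can be sketched in a few lines of Python. This is an illustration of the general technique and not the code used to obtain the results of this section; the function names, the shuffling with a fixed seed, and the `fit_and_score` callback (assumed to train a model on the training indices and return its accuracy on the test indices) are assumptions made for the example.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Split the indices 0..n_samples-1 into k folds and return one
    (train_indices, test_indices) pair per fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # assumed: shuffle before splitting
    folds = [indices[i::k] for i in range(k)]   # k groups of nearly equal size
    splits = []
    for i in range(k):
        test = folds[i]                          # one group held out for testing
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits

def cross_validate(n_samples, k, fit_and_score, seed=0):
    """Average the score returned by fit_and_score over the k folds."""
    scores = [
        fit_and_score(train, test)
        for train, test in k_fold_indices(n_samples, k, seed)
    ]
    return sum(scores) / len(scores)
```

With 1650 records and five folds, each round trains on 1320 records and tests on the remaining 330, and every record is used for testing exactly once.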
Table 6 summarizes the characteristics of the classification algorithms adopted and the results in terms of accuracy.The following algorithms were applied.

• Decision tree: The decision tree is a graph characterized by nodes and edges. The highest node is defined as the root node; each intermediate node represents a test carried out on a certain attribute, and the arcs represent the result of the test. The attributes against which the test is performed in each intermediate node are called splitting attributes, as they divide the data into subgroups. The goal of the algorithm is to carry out various tests on the values of the attributes starting from the root node, so that each record, based on the result obtained in each test, travels a more or less long path between the various intermediate nodes until arriving at a leaf node that defines the class to which it belongs. The decision tree has a structure like a flowchart: it consists of internal nodes (the first at the top is the root node, and the following ones are called child nodes), which represent the subsets of the initial dataset divided according to the attributes; branches, which represent the decision-making rules according to which the division takes place; and leaf nodes, which represent the final subsets into which the initial dataset was split.

• Linear discriminant: The purpose of the analysis is to find one or more linear combinations of parameters that allow optimal discrimination between the various groups. In this way, an observation can be attributed to a given group based on the measurements. The methods of linear discriminant analysis can be justified either by assuming that the distribution within each group is normal with a common variance-covariance matrix or on the basis of semiempirical criteria of separation between groups, without distributional hypotheses on the variables. The goal is to project the n-dimensional space of the features of the input data onto a smaller subspace, removing the redundant and dependent features. The classification is based on three steps: calculate the separability between different classes (for example, the distance between the means of different classes), calculate the separability between the elements of the same class, and build a space smaller than the starting one in which the separation between the different classes is maximized and the separation between elements of the same class is minimized.

• Gaussian naïve Bayes: Using Bayes' theorem, the algorithm assigns a class label to each observation. A popular application of the naïve Bayes algorithm is spam filtering, in which text messages are labeled to facilitate their classification. It is among the most popular learning methods. The algorithm calculates the probability of each class using Bayes' theorem and selects the result with the highest probability (the maximum a posteriori probability). The estimation of the a priori probabilities is quite simple: if no other information is available, the classes can be assumed to be equiprobable. The conditional densities, on the other hand, are known only in theory. In practice, assumptions are made about the shape of the distributions (Gaussian, in this variant), and their fundamental parameters are learned from the training set.

• Support vector machine: It is used in various fields, including facial recognition and text or image classification. The algorithm works by dividing the data into different classes by finding a dividing boundary between them (usually called a hyperplane). This boundary is not chosen arbitrarily: when more than one separating hyperplane exists, it is the one that maximizes the distance between the classes. The greater this distance, the greater the expected accuracy of the model. A hyperplane is a subspace of dimension n-1 with respect to the space in which it is contained: therefore, for points in a 2D space, the hyperplane is a straight line; in a 3D space, it is a plane; and so on. In the optimal case, there is a hyperplane that completely separates the points of the two classes. In practice, this often does not happen, so adjustments are applied in the form of soft margins or kernel tricks. In the first case, the model is granted a margin of error, that is, one or more points of a class can be found on the side of the other class. In the second case, a nonlinear decision boundary is found by applying suitable transformations to the initial features. The support vector machine is thus also used for more complex models (nonlinear SVMs): when the training data cannot be separated by a hyperplane in the original space, a kernel function makes it possible to build nonlinear models in higher-dimensional spaces.

• K-nearest neighbors (KNN): This is a supervised learning algorithm that, unlike the other classifiers, does not involve the creation of a model: the training phase simply consists of memorizing the values assumed by the features and the labels. Its operation is based on the calculation of the distance between the record whose label is to be predicted and the K elements of the dataset closest to it. The record is labeled based on the labels of the selected K neighbors. Since the algorithm is based on the concept of distance, it is important to normalize the data in the preprocessing phase so that the distance is not dominated by one of the attributes present in the dataset. The object to be classified is represented as a point in a multidimensional space defined by its attributes and is then classified according to its surroundings: it is assigned the class to which the majority of the K closest samples belong.

• Ensemble methods: By ensemble, we mean a set of base learners whose predictions are combined to improve the overall performance. The variety of terms used for these learners in the literature reflects the absence of a unified theory of ensemble methods and the fact that this research field is yet to be explored in many respects. Ensemble algorithms are made up of several base classifiers combined with each other, according to the philosophy that a combination of classifiers provides better results than a single one. There are different types of ensembles. Bagging uses a combination of weak models, each of which learns from a subset of the initial data. The final prediction is the average (or the majority vote) of the outputs of the individual models; an example of an ensemble of this type is random forest, composed of n decision trees. Voting is a simplified version of bagging that, however, allows the results of different categories of classifiers to be combined. Boosting combines the results of individually weak models. However, it does not do so at the end, but sequentially: each model is trained on the results of the previous model, giving more weight each time to the erroneous predictions. Ensembles of this type are AdaBoost and GradBoost. Stacking is an ensemble method structured on several levels: the output provided by the classifiers of one level is fed to other classifiers, called "meta-classifiers". Finally, cascading is composed of a cascade of classifiers that are interrogated sequentially when the previous one does not provide sufficient certainty about the results obtained.
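The bagging idea described above (weak models trained on resampled subsets, combined by majority vote) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names, the one-dimensional threshold classifier used as weak model, the two-class restriction, and the toy amplitude values are not taken from the reviewed works.

```python
import random
from collections import Counter

def majority_vote(labels):
    """Return the most frequent label (ties go to the label seen first)."""
    return Counter(labels).most_common(1)[0][0]

def fit_stump(samples):
    """Weak model for two classes: threshold halfway between the class means."""
    by_class = {}
    for x, y in samples:
        by_class.setdefault(y, []).append(x)
    if len(by_class) == 1:
        # The bootstrap sample happened to contain a single class.
        only = next(iter(by_class))
        return lambda x: only
    means = {y: sum(xs) / len(xs) for y, xs in by_class.items()}
    (lo_label, lo_mean), (hi_label, hi_mean) = sorted(means.items(), key=lambda kv: kv[1])
    threshold = (lo_mean + hi_mean) / 2
    return lambda x: lo_label if x < threshold else hi_label

def bagging_fit(data, n_models=15, seed=0):
    """Train each weak model on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    return [fit_stump([rng.choice(data) for _ in data]) for _ in range(n_models)]

def bagging_predict(models, x):
    """Combine the individual predictions by majority vote."""
    return majority_vote(m(x) for m in models)

# Hypothetical toy data: (peak amplitude in dB, label).
data = [(40, "noise"), (42, "noise"), (45, "noise"), (47, "noise"), (50, "noise"),
        (70, "crack"), (72, "crack"), (75, "crack"), (78, "crack"), (80, "crack")]
models = bagging_fit(data)
```

Calling `bagging_predict(models, 79)` returns the label chosen by most of the fifteen threshold classifiers. Boosting and stacking would need a different training loop and are not covered by this sketch.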
By analyzing Table 6, we can see that the methods that returned the best accuracy both involve KNN. KNN is used for regression and classification; in classification, the analyzed object is assigned to the most common class among its k nearest neighbors (with k a positive integer). The strength of this algorithm is that it stores all available instances and classifies new ones by evaluating their distance from their neighbors. The distance between two data points is usually the Euclidean distance, although the Manhattan, Minkowski, or Hamming distance is sometimes used. The first three functions are used for continuous variables and the fourth (Hamming) for categorical variables. We can note that the adoption of the Euclidean distance gives much better results than the cosine distance. The ensemble method that returned results comparable with the KNN methodology is itself based on the same technology. Slightly lower results were obtained with both quadratic and cubic SVM, demonstrating that these technologies are also suitable for this type of classification. Finally, AdaBoost and Bag ensembles returned slightly lower results.
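The KNN procedure and the distance functions mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the works of Table 6: the function names, the min-max normalization, and the toy feature values (amplitude in dB, duration in µs) are hypothetical.

```python
import math
from collections import Counter

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def minkowski(a, b, p=3):
    # p=1 gives the Manhattan distance, p=2 the Euclidean distance.
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)

def hamming(a, b):
    # Number of positions that differ; suited to categorical attributes.
    return sum(x != y for x, y in zip(a, b))

def min_max_scaler(rows):
    """Return a function that rescales each attribute using the training ranges."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    spans = [(max(c) - min(c)) or 1.0 for c in columns]  # avoid division by zero
    def scale(row):
        return tuple((v - lo) / s for v, lo, s in zip(row, lows, spans))
    return scale

def knn_predict(train_x, train_y, query, k=3, distance=euclidean):
    """Label the query with the majority class among its k nearest neighbors."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: distance(pair[0], query))
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]

# Hypothetical training set: (amplitude in dB, duration in microseconds).
train_x = [(45, 200), (48, 250), (50, 300), (75, 1500), (80, 1800), (78, 1600)]
train_y = ["noise", "noise", "noise", "crack", "crack", "crack"]
scale = min_max_scaler(train_x)
train_scaled = [scale(row) for row in train_x]
prediction = knn_predict(train_scaled, train_y, scale((77, 1700)), k=3)
```

Without the scaling step, the duration attribute, whose numerical range is much larger, would dominate the distance, which is the reason for the normalization recommended in the preprocessing phase.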

Summary and Future Trends
Damage diagnostics are performed by detecting the origin of the damage, identifying the type of damage, and locating the damage. The damage identification process is characterized by nondeterministic features such as noise and missing data. Many AE characteristics are strongly influenced by the detection system: for example, amplitude, energy, rise time, duration, and counts return values that depend significantly on the distance between the sensor and the source, on the type of sensor, on the geometry of the material, and so on. These parameters must, therefore, be adequately considered to obtain AE-based monitoring that is reliable and robust. Methodologies based on the exploitation of the AE waveform significantly reduce the effect of the threshold level. Analytical tools, combined with the high performance of modern computers, allow the processing of large amounts of data (Big Data), return detailed spectral analyses, and provide groupings of AE parameters that were previously unattainable. This evolution, driven by technological development, opens new scenarios in the search for correlations between acoustic events and damage mechanisms.
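To make the dependence of the classical AE parameters on the threshold level concrete, the following sketch extracts amplitude, rise time, duration, counts, and energy from a sampled waveform. It is a simplified illustration: the function name, the toy waveform, and the energy definition (sum of squared samples multiplied by the sampling interval) are assumptions, and acquisition systems may define these quantities differently.

```python
def ae_hit_features(signal, dt, threshold):
    """Simplified AE hit features.

    signal: sampled voltages; dt: sampling interval in seconds;
    threshold: detection threshold in the same unit as the signal.
    """
    above = [i for i, v in enumerate(signal) if abs(v) >= threshold]
    if not above:
        return None  # the waveform never reaches the threshold: no hit
    first, last = above[0], above[-1]
    peak = max(range(first, last + 1), key=lambda i: abs(signal[i]))
    # Counts: number of positive-going threshold crossings.
    counts = sum(1 for i in range(1, len(signal))
                 if signal[i - 1] < threshold <= signal[i])
    return {
        "amplitude": abs(signal[peak]),
        "rise_time": (peak - first) * dt,   # first crossing to peak
        "duration": (last - first) * dt,    # first to last crossing
        "counts": counts,
        "energy": sum(v * v for v in signal[first:last + 1]) * dt,
    }

# Hypothetical decaying burst sampled every microsecond.
waveform = [0.0, 0.2, 0.6, -0.5, 1.0, -0.8, 0.7, -0.4, 0.3, -0.1, 0.0]
low = ae_hit_features(waveform, dt=1e-6, threshold=0.5)
high = ae_hit_features(waveform, dt=1e-6, threshold=0.75)
```

On the same waveform, raising the threshold from 0.5 to 0.75 shortens the measured duration and reduces the counts, while the peak amplitude is unchanged. This is the threshold sensitivity that waveform-based methodologies aim to reduce.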
The monitoring of the health of materials and structures through the analysis of AE strongly depends on the quality of the sensors: ongoing technological evolution also affects this type of device, whose performance keeps improving. Traditionally, AE technology has exploited piezoelectric sensors, characterized by significant dimensions and by limits of use in harsh environments (high pressures and high temperatures). Furthermore, this type of sensor is subject to corrosion and has been shown to be sensitive to electromagnetic interference. Piezoelectric wafer active sensors (PWASs) represent a low-cost solution that can be inserted within composite materials or between layers of overlapping joints. The resistance of these sensors to high temperatures depends on the PWAS material and on the frequency; in fact, the antiresonance and resonance frequencies have a linear relationship with temperature [106]. Fiber optic sensors are inexpensive and have low sensitivity to electromagnetic interference. Furthermore, they show high reliability even in harsh environments with low operating costs, can be easily interrogated at long distances, and offer high spatial coverage [107]. Microelectromechanical systems (MEMS) sensors have proven effective in sensing and control at the microscale while generating effects at the macroscale. These are miniaturized sensors used as resonators to improve the signal-to-noise ratio, with a resonant silicon microstructure and a thin piezoelectric layer, mounted on a ceramic package. Such sensors are characterized by significantly smaller dimensions and weight and by a sensitivity comparable with that of conventional AE sensors [108].
ML-based techniques represent an alternative approach to traditional methodologies for identifying damage through AE. Although it requires significant resources for algorithm training, automatic monitoring yields optimization processes that are shorter than those of conventional techniques based on manual procedures. These technologies show significant sensitivity to the changes in high-frequency transmitted acoustic waves caused by the emergence of a defect [109]. The efforts of researchers have concentrated on characterizing the AE signal features to obtain models capable of discriminating between the types of defects: the influence of process parameters and operating conditions on the characteristics of the associated AE signals is evident.
The performance of a machine-learning-based model strongly depends on the quality of the data used as input in the training phase. Given the complexity of the phenomenon, fault estimation is often carried out starting from incomplete monitoring data due, for example, to anomalies in the acquisition devices or to an interruption in data transmission, in addition to the background noise that is always present in the workplace. Monitoring affected by these problems will return unreliable diagnoses. Missing data, therefore, become a crucial element for the success of the procedure and must be adequately treated through the different approaches available. Possible preventive actions to deal with this problem include deleting incomplete data or estimating missing data [110]. Some ML-based algorithms can automatically manage incomplete data by computing probabilities to estimate the degree of uncertainty or by using the expectation-maximization (EM) algorithm for parameter estimation.
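The two preventive actions mentioned above, deleting incomplete records or estimating the missing values, can be sketched as follows. This is only an illustration under simple assumptions: missing values are encoded as `None`, the estimate is the column mean, every column has at least one observed value, and the function names and sample values are hypothetical. The EM-based treatment is not covered here.

```python
def drop_incomplete(rows):
    """Deletion: keep only the records with no missing attribute."""
    return [row for row in rows if None not in row]

def impute_column_means(rows):
    """Estimation: replace each missing value with the mean of its column."""
    means = []
    for column in zip(*rows):
        present = [v for v in column if v is not None]
        means.append(sum(present) / len(present))
    return [tuple(m if v is None else v for v, m in zip(row, means)) for row in rows]

# Hypothetical AE records: (amplitude in dB, duration in microseconds).
records = [(60.0, 100.0), (None, 300.0), (80.0, None)]
complete_only = drop_incomplete(records)
imputed = impute_column_means(records)
```

Deletion keeps one record out of three in this toy case, whereas imputation preserves all records at the cost of introducing estimated values. Which of the two is preferable depends on how much data is missing and why.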
Additionally, most ML-based methods take a supervised approach, leveraging labels provided by an expert to make the diagnosis. In this way, however, the model only learns the labeled types of defect, whereas a new defect is rejected due to the distance between the labeled and unlabeled data, showing an inability of the model to adapt to the evolution of the system. Another important factor limiting the performance of these models is the unbalanced distribution of the data: the collected data that correspond to healthy operating conditions of the structure are usually more represented than those that identify damage. Working with unbalanced data leads to a model with reduced diagnostic accuracy, strongly shifted toward the most observed class. The adoption of ensemble methods and resampling techniques can reduce the effect of this imbalance on the performance of the model.
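One of the simplest resampling techniques is random oversampling, in which samples of the less represented classes are duplicated until all classes have the same size. The sketch below illustrates the idea; the function name, the fixed seed, and the toy labels are assumptions, and other resampling strategies exist.

```python
import random
from collections import Counter

def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority samples until the classes are balanced."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        out_x.extend(xs + extra)
        out_y.extend([y] * target)
    return out_x, out_y

# Hypothetical dataset: 8 healthy records and only 2 damage records.
samples = list(range(10))
labels = ["healthy"] * 8 + ["damage"] * 2
balanced_x, balanced_y = random_oversample(samples, labels)
class_sizes = Counter(balanced_y)
```

Oversampling should be applied to the training set only, so that duplicated records do not leak into the data used to evaluate the model.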
In the previous sections, we have analyzed different applications of artificial intelligence techniques for the processing of AE signals: ML-based algorithms can be used to predict and locate structural defects in the absence of explicit analytic functions. It has been shown that the application of ML to the localization of the AE source compensates for the effects of acoustic anisotropy, boundary reflections, and obstacles in the propagation path thanks to its ability to manage complex problems. The training of these algorithms is linked to the configuration of the monitoring system and to the geometric and physical characteristics of the target structures. An extensive experimental AE test database is required to correctly interpret the signals and establish a well-trained algorithm for the localization of the AE source. Obtaining such a database through experiments requires a lot of work, time, and money. To minimize the required experiments, an alternative is to use simulation to study the mechanism underlying the detection of AE. Most existing studies using wave propagation simulation models focus on flat plates and simple geometries. For realistic structures, the analysis with simulated data is mainly used to identify the regions of possible damage sites, which can, therefore, be considered areas of primary interest for structural monitoring. A very important aspect of the AE technique is the ability to identify the cracking mode starting from the recorded parameters. By determining the mechanism underlying the cracking event, the classification of the cracking mode plays an important role in understanding and predicting the likely failure modes of the complete structure. Furthermore, it can help detect the progress of the damage and provide guidelines for the correct maintenance process to improve safety and structural durability.
Unsupervised ML methods have been successfully used for the grouping of AE signals. The results obtained depend on the structure of the dataset collected with AE sensors. The techniques based on k-means have proved to be simple, fast, and above all effective when applied to structures with small datasets. The hierarchical model, on the other hand, proved to be suitable for the treatment of complex structures with a particularly extensive dataset, at the expense of a higher computational cost. A weakness of these methodologies lies in the difficulty of obtaining repeatable results. Building a well-labeled training dataset is resource-intensive. On the other hand, its availability makes it possible to use supervised classification techniques, which are preferable to unsupervised techniques, at least in the case of complex structures. The unsupervised algorithms return the data of the training set that are in the vicinity of the data of the test set, thus providing an estimate of the membership of such data in a specific class. These algorithms are faster during the training process; however, they are slow during the classification process [111]. Algorithms based on clustering can also be very useful in identifying anomalous data, thus improving the quality of monitoring through AE sensors.
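The k-means grouping mentioned above can be sketched with the classical iterative scheme (assign each point to the nearest centroid, then recompute the centroids). This is a minimal illustration, not the procedure of any specific reviewed work: the function name, the explicit initial centroids, and the two-feature toy points are assumptions.

```python
import math

def kmeans(points, initial_centroids, max_iter=100):
    """Alternate assignment and centroid update until the centroids stop moving."""
    centroids = [tuple(c) for c in initial_centroids]

    def nearest(p):
        return min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))

    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p)].append(p)
        # An empty cluster keeps its previous centroid.
        updated = [
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else centroids[j]
            for j, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids, [nearest(p) for p in points]

# Hypothetical AE hits described by two normalized features.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, cluster_ids = kmeans(points, initial_centroids=[points[0], points[3]])
```

The result depends on the initial centroids, which is one reason why repeatability can be difficult: different initializations may converge to different groupings on less clearly separated data.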
Structural damage detection requires in-depth knowledge of the system, which is often not available. Often, monitoring the status of a system with state-of-the-art sensors does not provide an exhaustive picture of its state of health. A robust damage identification system must fill these gaps through innovative methodologies. The data-driven approach can offer a solution to the problem if it provides a model capable of generalizing. In fact, the purpose of machine learning is to build an algorithm that can classify new inputs never seen during the learning phase. The generalization capacity of the system is expressed precisely in the ability to make correct predictions on inputs not observed during training. The effectiveness of a learning algorithm is measured by a low training error and a small difference between the training and test errors. These two factors identify two of the problems related to machine learning: underfitting and overfitting. Underfitting occurs when the model is unable to achieve a sufficiently small training error. Overfitting, on the other hand, occurs when the gap between training and test errors is too wide.
There are many algorithms based on ML, and each family of algorithms has specific characteristics that govern its use in each context. Table 7 compares the performances returned by the models most adopted by the scientific community for AE testing. To facilitate comparison, the ranges of values declared by the authors in the respective articles have been reported. The accuracy metric has been adopted for performance evaluation. Accuracy measures the fraction of predictions that agree with the actual values; it is usually given as a percentage. ANNs have been used in the literature together with AE to address the problem of localization of fracture sources. For the analysis of AE data, advanced algorithms based on pattern recognition have been developed with the use of ANNs. These methods require signal characteristics extracted in the time or frequency domain. These techniques do not require knowledge of the wave velocity or of the structural geometry of the material and are able to estimate the intensity of the source and its position. An ANN is efficient in the classification of faults; moreover, by adapting the structure of the network to the type of data, it is possible to significantly increase its classification capacity. However, the ANN training process is complex and computationally demanding.
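For classification, the accuracy metric adopted for the comparison can be computed as the percentage of predictions that match the true labels. The short sketch below uses hypothetical labels and a hypothetical function name.

```python
def accuracy(y_true, y_pred):
    """Percentage of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)

# Hypothetical unbalanced test set and a model that always answers "healthy".
y_true = ["healthy"] * 8 + ["damage"] * 2
y_pred = ["healthy"] * 10
score = accuracy(y_true, y_pred)
```

In this toy case the score is 80% even though no damage record is detected, which illustrates why accuracy values reported on unbalanced datasets should be read with caution.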
Analyzing Table 7, we can see that the accuracy returned by the models for fault identification has comparable values. This suggests that the evaluation metric, as reported in the works available in the literature, is not a suitable tool to guide the researcher in the choice of the most appropriate algorithm for the identification of a specific type of structural damage. This choice can be made only after verifying how the different algorithms adapt to the available data, providing the system with adequate generalization capacity. However, a comparison of the results in Table 7 gives some indications: the algorithms based on CNNs and RNNs seem to return results with greater accuracy, which can be explained by their ability to automatically extract the features. This increased ability to extract knowledge comes at the price of higher computational costs.
The application of advanced deep-learning algorithms such as CNNs brings considerable benefit in the localization of acoustic emission sources. Compared with other methods, the CNN-based approach has a greater potential for in situ detection of acoustic emission sources. No preliminary information is required on the distribution of the acoustic wave velocity in the structure. Deep learning is a data-driven approach that does not require prior feature extraction. The deep-learning architecture is applied directly to the data collected by the sensors because it automatically learns and extracts the representative features. In this way, deep learning achieves better performance than traditional algorithms based on manual feature extraction. Feature extraction through DL is similar to a filtering process and is particularly useful for analyzing the physical meaning of such models: visualization technologies can then visually express the knowledge extracted from the models.
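The analogy between feature extraction in a CNN and a filtering process can be illustrated with a one-dimensional sketch of the three basic operations: convolution, ReLU, and max pooling. The function names, the hand-picked kernel, and the toy signal are assumptions; in a real CNN the kernel weights are learned during training rather than fixed by hand.

```python
def conv1d(signal, kernel):
    """Sliding dot product (the 'convolution' used in CNNs, without padding)."""
    n = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(n))
            for i in range(len(signal) - n + 1)]

def relu(values):
    """Keep positive responses and set negative ones to zero."""
    return [max(0.0, v) for v in values]

def max_pool1d(values, size=2):
    """Keep the strongest response in each non-overlapping window."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]

# Hypothetical burst-like signal and a kernel that responds to rising edges.
signal = [0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0]
filtered = conv1d(signal, kernel=[-1.0, 1.0])
features = max_pool1d(relu(filtered), size=2)
```

With this kernel, the filtered output is largest where the signal rises most steeply, so the pooled features summarize where the burst begins. Stacking several such layers with learned kernels is what allows a CNN to work directly on raw sensor data.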
Table 8 summarizes the strengths and weaknesses of each family of algorithms. Their different capabilities show how the choice of the algorithm depends on the characteristics of the system to be modeled. Table 8 shows that the complexity of the model is linked to the characteristics of the input to be processed. Systems with significant input dimensions require more complex modeling tools with increased computational cost. However, this does not necessarily mean that the most complex choice is the one best suited to the solution of the problem; in fact, the performances of the algorithms often differ according to the inputs.
A further help in the case of incomplete or unbalanced data is the use of transfer learning (TL): TL is an ML-based technique that focuses on storing the relationships between inputs and outputs, patterns, and models acquired during the training phase on a problem different from the one under consideration, and then relating them to the latter. The goal is to reuse or transfer information from previously learned tasks to learn new ones. All this has led to a significant increase in the number and type of algorithms used in the field of ML. This technique also seeks to improve how efficiently each sample of the dataset is used.
Data can then be collected from multiple sources and associated with a common classification, but not all the information contained in these input data is related to the output class: it may provide diagnostic knowledge useful for discriminating between the states of the structure, but it may also lead the system to an incorrect classification. To avoid these problems, it is necessary to adequately study both systems through experimental tests to identify parameters that guarantee transferability between domains and that are suitable for the selection of relevant source data.
To detect the onset of damage using AE, descriptive and qualitative analyses can lead to different interpretations. In the future, it would be appropriate to define quantitative criteria specifically designed for the structure under consideration. Thanks to the high computational power available today, it is possible to develop a fail-safe DL-based fracture identification system with significantly improved diagnostic performance. Such a system can bypass the feature extraction required by other ML methods, reducing the risk of error associated with empirical procedures.

Conclusions
In this review, we have described the machine-learning-based methods most used by the scientific community for the detection of structural damage. For each type of technology, we introduced the methodology and subsequently examined the most influential contributions of the scientific community that has exploited these methodologies to identify structural damage.
Cracks and defects of various kinds can degrade the performance of components and structures to such an extent that their identification is an essential part of quality control in all fields of engineering. AE is a structural monitoring technique that falls within nondestructive testing: it is a passive monitoring technique in which it is not necessary to supply energy to the monitored structure from the outside, because the energy released by the damage source itself is used. Among the main features of the AE technique, we find the ability to localize damage. The identification of the position of origin can allow an accurate global investigation of a structure and a preliminary understanding of the possible damaged area.
With analytical techniques, the identification and localization of acoustic emission sources are complex procedures; this is particularly true for structurally complex systems. To solve these problems, optimization techniques have been adopted to separate the AE signals from noise. These techniques require adequate experience in signal processing to discriminate between the different sources of the AE signals. A lack of experience can lead to a misinterpretation of the signals and to an incorrect localization of the damage. ML techniques such as ANNs have shown enormous potential for the localization of the acoustic emission source. However, the significant amount of data required for ANN training makes them inconvenient, at least for large structures. Furthermore, we have seen that the localization of the AE source is strongly influenced by the nature of the defect. The performance of localization methods can be improved by analyzing the damage diagnostics in detail. Identification systems based on ML algorithms offer greater reliability and adaptability.
ML-based diagnostic models can automatically recognize the presence of damage in a structure. However, the extraction of the features is still based on the intervention of experts, with the associated empirical risk. The advent of DL is revolutionizing the sector, bringing significant advantages through end-to-end diagnostic models that automatically learn the features from the collected data and can subsequently identify possible damage. However, these models require a sufficient number of labeled samples, which entails a significant burden in terms of the resources to be used. TL can be used to reduce the cost of data collection by transferring previously acquired diagnostic knowledge to other domains.

Figure 1. Acoustic emission detection scheme. The AE sensor captures the signal and passes it to a preamplifier, which then sends it to a band-pass filter. Subsequently, the selected spectrum is amplified and passed to the signal processor.

Figure 2. Piezoelectric sensor scheme: the sensor consists of thin layers of single crystal, which produce an electrical charge when subjected to a compressive force.

Figure 3. Typical AE signal: we can identify a very rapid rising edge followed by an exponentially decreasing trend, also due to the response of the transducer, and by stationary or almost stationary signals, without decays.

Figure 4. Typical ANN architecture: we can identify an input layer that presents the data to the structure, a hidden layer that takes care of processing the data, and, finally, an output layer that returns the results.

Figure 5. Typical CNN architecture: the structure is divided into two sections. In the first, features are extracted by convolutional layers, ReLU, and max pooling. The data classification section is instead made up of a fully connected network.

Figure 6. RNN architecture unfolded: feedforward version of the network, of arbitrary length depending on the sequence of inputs. The number of blocks of the unfolded version essentially depends on the length of the sequence to be analyzed. Within the network, it represents a pattern that temporally links the elements of the series that the RNN analyzes.
investigated local damage in composite materials based on the analysis of acoustic emission (AE) signals.The authors applied fuzzy C-means clustering techniques associated with principal component analysis to analyze AE data clusters and subsequently correlate them to material damage mechanisms.Continuous and discrete wavelet transforms are applied to typical AE signal damage mechanisms.

Table 1. Clustering methods for AE testing.

Table 2. ANN-based methods for AET.

Table 3. CNN-based methods for AE testing.

Table 4. RNN-based methods for AE testing.

Table 5. Characteristics of instrumentation used for data collection.

Table 6. Summary of different ML-based models applied.

Table 7. Performance of different machine-learning-based models for acoustic emission testing.

Table 8. Features of ML algorithms for acoustic emission testing.