Lamb Wave Damage Quantification Using GA-Based LS-SVM

Lamb waves have been reported to be an efficient tool for non-destructive evaluation (NDE) in various application scenarios. However, accurate and reliable damage quantification using the Lamb wave method remains a practical challenge because of the complex underlying mechanisms of Lamb wave propagation and damage detection. This paper presents a Lamb wave damage quantification method using a least squares support vector machine (LS-SVM) and a genetic algorithm (GA). Three damage sensitive features, namely, normalized amplitude, phase change, and correlation coefficient, are proposed to describe the changes in Lamb wave characteristics caused by damage. In view of commonly used data-driven methods, a GA-based LS-SVM model using the three proposed damage sensitive features is implemented to evaluate the crack size, with the GA adopted to optimize the model parameters. The results of the GA-based LS-SVM are validated using coupon test data and lap joint component test data with naturally developed fatigue cracks. Cases with different loading and manufacturers are also included to further verify the robustness of the proposed method for crack quantification.


Introduction
Guided ultrasonic waves are widely used and have shown great potential in non-destructive evaluation (NDE) and structural health monitoring (SHM) systems. Compared with other guided waves, Lamb waves offer strong penetration, can be used in both isotropic and anisotropic materials, and can interrogate a large area with relatively low energy loss [1]. A large number of studies have reported damage identification using Lamb waves [1][2][3]. Generally, research on Lamb wave-based damage evaluation can be classified into two groups: damage location identification and damage size quantification. In the former category, identifying and locating existing damage is the key issue. A considerable number of researchers using signal processing methods and imaging methods, such as tomography [4][5][6][7] and probability-based diagnostic imaging [8][9][10], have focused on this topic. For better diagnosis of the condition of structural health, there is also an increasing interest in more precise damage quantification using Lamb wave methods. The A0 mode of the Lamb wave was employed to identify and locate damage in metallic structures in reference [11]. Ben et al. [12] proposed a Lamb wave propagation method to measure damage location in composite materials. Damage identification was achieved by comparing changes of dispersion characteristics and attenuation between damaged and undamaged carbon fiber reinforced plastic bars. Leong et al. [13] experimentally verified that Lamb wave sensing utilizing scanning laser vibrometry has a potential for

Methodology Development
The overall framework for Lamb wave damage quantification using GA-based LS-SVM is illustrated in Figure 1. First, the Lamb wave test is carried out for data acquisition, and the damage sensitive features are identified and extracted from the signals as the training data. Next, the model parameters are optimized by the genetic algorithm and the model is trained based on the learning algorithm of LS-SVM. The resulting GA-based LS-SVM model is validated using both coupon test data and lap joint component test data with naturally developed fatigue cracks. In order to further verify the robustness of the proposed method for crack quantification, specimens from different manufacturers under different loading spectra are used for fatigue testing. The mean relative error (MRE), which is calculated from the actual and predicted crack sizes, is used to measure the accuracy of the proposed method.

Lamb Waves Theory
As one of the most important guided ultrasonic waves, the Lamb wave is widely used for damage identification. Material discontinuities in the wave path can alter characteristics of the Lamb wave such as energy and wave shape. Thus, monitoring and evaluating changes in the Lamb wave signal provides a means to analyze damage location and severity. On the other hand, Lamb wave phase velocities are highly dispersive and depend on the product of the frequency and plate thickness. Theoretically, multiple Lamb wave modes exist simultaneously in plate-like structures, and the number of modes increases with the frequency. As can be seen from Figure 2, fewer Lamb wave modes are excited at lower frequencies. Therefore, the response signal is more distinguishable in the low frequency range. It is also known that the lowest-order symmetric mode (S0) and the lowest-order anti-symmetric mode (A0) carry more energy and attenuate less during propagation than higher-order modes. In addition, the wavelength of the S0 mode (also known as the extensional mode) is significantly larger than the thickness of the plate, and the S0 mode has been proven to be more sensitive to smaller damages than the A0 mode [9,24,25]. The S0 mode is therefore used to perform damage quantification in this study. As shown in the shaded area of Figure 2 [25] (the product of the frequency and plate thickness is between 0 and 400 kHz·mm), the S0 mode is largely non-dispersive, with a relatively constant group velocity. Given a specimen with a fixed thickness, the actuation frequency is chosen to satisfy this criterion. For example, if the thickness of the specimen is 2 mm, the frequency of the Lamb wave can be set to 0.16 MHz, which gives a frequency-thickness product of 320 kHz·mm. A symmetrical Hamming-windowed sinusoidal tone burst with 3.5 cycles is used as the excitation in this study, as shown in Figure 3.
Materials 2017, 10, 648
The presence of material loss, cracks, or other discontinuities in the path of the Lamb waves can alter the wave characteristics. The energy is weakened due to back-scattering reflection, and the transmitted waves are modified due to forward scattering. The detection and quantification of the crack are particularly difficult when the echoes from the specimen boundary and the cracks are superposed. In order to reduce the signal complexity, only the first wave package received by the sensor is used to extract damage sensitive features. To determine the time window for the first wave package, it is essential to calculate the group velocity of the S0 mode. The group velocity can be estimated from the time-of-flight (ToF) between A and B, as shown in Figure 4, with a known wave propagation distance. The Hilbert transform is used to calculate the envelope of the signal. The time window of the first wave package received by the sensor is illustrated in Figure 4 and is calculated by Equations (1) and (2), where Tstart and Tend represent the start and end time points of the time window, respectively, T1 is the time point when the Lamb wave is transmitted, and T0 is the period of the excitation wave.
Lamb waves are inherently dispersive. Therefore, the received wave package may differ from the wave package originally generated by the actuator. The impact of this characteristic is not significant in this study due to the short transmission distance between actuators and sensors. Moreover, the damage sensitive features from the healthy specimen are used as baseline data to establish the crack length model. The purpose of using a time window is to avoid complex reflection waves from the boundary. Because the comparison is made between the baseline and the actual cases, the time window ensures that approximately the same amount of wave package data is retained in each case; therefore, the time window is not required to include the entire first wave package received by the sensor.
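The group velocity estimate described above can be illustrated with a minimal sketch. It builds a 3.5-cycle Hamming-windowed tone burst at 0.16 MHz, computes signal envelopes as the magnitude of the analytic signal (a Hilbert transform implemented here with a naive DFT so that only the standard library is needed), and divides a known propagation distance by the time-of-flight between the two envelope peaks. The sampling rate, propagation distance, synthetic received signal, and all function names are assumptions made for illustration only; none of them come from the experiments, and the time window of Equations (1) and (2) is not reproduced here.

```python
import cmath
import math


def tone_burst(freq_hz, n_cycles, fs_hz):
    """Hamming-windowed sinusoidal tone burst sampled at fs_hz."""
    n = int(round(n_cycles / freq_hz * fs_hz)) + 1
    return [
        (0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)))
        * math.sin(2 * math.pi * freq_hz * k / fs_hz)
        for k in range(n)
    ]


def envelope(signal):
    """Envelope as the magnitude of the analytic signal (Hilbert transform via a naive DFT)."""
    n = len(signal)
    twiddle = [cmath.exp(-2j * math.pi * m / n) for m in range(n)]
    spectrum = [
        sum(signal[t] * twiddle[(k * t) % n] for t in range(n)) for k in range(n)
    ]
    # Analytic-signal weights: keep DC (and Nyquist), double positive bins, zero negative bins.
    half = n // 2
    weights = [0.0] * n
    weights[0] = 1.0
    if n % 2 == 0:
        weights[half] = 1.0
    for k in range(1, half if n % 2 == 0 else half + 1):
        weights[k] = 2.0
    analytic = [
        sum(weights[k] * spectrum[k] * twiddle[(-k * t) % n] for k in range(n)) / n
        for t in range(n)
    ]
    return [abs(z) for z in analytic]


def peak_time(signal, fs_hz):
    """Time (in seconds) of the envelope peak."""
    env = envelope(signal)
    return env.index(max(env)) / fs_hz


FS = 6.4e6           # sampling rate in Hz (assumed; kept low so the naive DFT stays fast)
FREQ = 160e3         # excitation frequency from the text (0.16 MHz)
THICKNESS_MM = 2.0   # plate thickness from the text
DISTANCE_M = 0.10    # hypothetical actuator-to-sensor distance
V_TRUE = 5000.0      # hypothetical group velocity used only to synthesize the received signal
N = 320              # record length in samples

# Frequency-thickness product, which should stay inside the 0 to 400 kHz*mm range.
fd_product = FREQ / 1e3 * THICKNESS_MM

burst = tone_burst(FREQ, 3.5, FS)
delay = int(round(DISTANCE_M / V_TRUE * FS))
excitation = burst + [0.0] * (N - len(burst))
received = [0.0] * delay + [0.4 * s for s in burst]   # delayed, attenuated copy
received += [0.0] * (N - len(received))

tof = peak_time(received, FS) - peak_time(excitation, FS)
v_group = DISTANCE_M / tof
print(f"fd = {fd_product:.0f} kHz*mm, ToF = {tof * 1e6:.2f} us, v_g = {v_group:.0f} m/s")
```

Because the synthetic received signal is just a delayed copy of the excitation, the estimate should land close to the hypothetical velocity used to generate it; real measurements would also contain dispersion, noise, and boundary reflections.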
By comparing the response signals in the damaged and healthy states, it can be seen that the features of the received signal differ. Figure 5 shows the mechanism of Lamb wave propagation through a damaged region. Crack-like damage causes backscatter echoes, and the energy loss increases with crack size. In addition, the presence of a crack can alter the wave path. For a closed or partially closed fatigue crack, the received signal contains two parts: wave signals transmitted directly across the crack and echoes from the crack tip. For a notch or a fully opened crack (e.g., a through-thickness crack), the received signals are the waves that detour around the crack tip [26]. Based on these considerations, three damage sensitive features, namely, normalized amplitude, phase change, and correlation coefficient, are extracted from the received signal. The normalized amplitude is used in order to eliminate uncertainties from different piezoelectric (PZT) wafers. First, the Hilbert transform is applied to both the first received Lamb wave package and the excitation wave package. The normalized amplitude is then obtained as the ratio of the peak values of the two processed wave packages. The phase change is calculated by subtracting the time of the peak of the damaged signal from that of the healthy signal. The correlation coefficient is calculated by comparing the signal of the intact specimen (with a 0 mm crack) with that of the defective specimen in the time window given by Equations (1) and (2).
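The three features can be sketched in a few lines once the windowed signals and their envelopes are available. The sketch below assumes that the normalized amplitude is the received envelope peak divided by the excitation envelope peak, that the phase change is reported as the damaged peak time minus the healthy peak time (so that a delayed package gives a positive value; the opposite convention only flips the sign), and that the correlation coefficient is the Pearson coefficient. The function names and the synthetic Gaussian wave packages are hypothetical and are not measurements from the tests.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def damage_features(env_excitation, env_healthy, env_damaged, healthy, damaged, dt):
    """Return (normalized amplitude, phase change in seconds, correlation coefficient)."""
    normalized_amplitude = max(env_damaged) / max(env_excitation)
    peak_healthy = env_healthy.index(max(env_healthy))
    peak_damaged = env_damaged.index(max(env_damaged))
    phase_change = (peak_damaged - peak_healthy) * dt  # assumed sign convention
    correlation = pearson(healthy, damaged)
    return normalized_amplitude, phase_change, correlation


def synthetic_package(amplitude, center, n=200, width=20.0, period=25.0):
    """Hypothetical wave package: Gaussian envelope times a sinusoid (illustration only)."""
    env = [amplitude * math.exp(-((k - center) / width) ** 2) for k in range(n)]
    sig = [e * math.sin(2 * math.pi * (k - center) / period) for k, e in enumerate(env)]
    return sig, env


DT = 1e-7  # sample spacing in seconds (assumed)
_, env_exc = synthetic_package(1.0, 40)
sig_healthy, env_healthy = synthetic_package(0.5, 100)
sig_damaged, env_damaged = synthetic_package(0.3, 104)  # weaker and slightly delayed

amp, phase, corr = damage_features(
    env_exc, env_healthy, env_damaged, sig_healthy, sig_damaged, DT
)
print(f"normalized amplitude = {amp:.3f}, phase change = {phase:.2e} s, correlation = {corr:.3f}")
```

In this toy case the damaged package is weaker and later than the baseline, so the amplitude ratio and the correlation coefficient drop while the phase change grows, which is the qualitative trend the text attributes to an increasing crack size.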

Least Squares Support Vector Machine
The support vector machine (SVM), a data-driven approach based on statistical learning theory (SLT), was first proposed by Vapnik et al. [27]. Figure 6 shows the fundamental principle of SVM for linear classification, which is an optimization problem of a hyperplane decision boundary. There exist several separating hyperplanes that separate the data of the two classes (data depicted by yellow rectangles and green circles) [28]. Suppose a given training set $D = \{(x_i, y_i) \mid i = 1, 2, \ldots, n\}$ with input data $x_i \in R^n$ and class labels $y_i \in \{+1, -1\}$ can be separated without error by a hyperplane $H: w \cdot x + b = 0$, and the lines $|w \cdot x + b| = 1$ are the boundaries for classification. To obtain better classification accuracy and generalization ability, the margin between the two boundary lines, i.e., $2/\|w\|$, is maximized. In addition to classification, the SVM method can be used for regression problems using the so-called ε-insensitive loss [27].

For nonlinear classification and regression problems, the input data are mapped by a nonlinear function φ into another space in which they are linearly separable, and the standard linear SVM is then applied [29]. The concept is illustrated in Figure 7, where the axes are used to define the spatial dimension.
The least squares support vector machine (LS-SVM) is an improved variant of SVM that can increase the convergence rate for complex problems [28]. The LS-SVM nonlinear regression theory is briefly reviewed here for completeness.
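Consider a given training data set $\{(x_i, y_i)\}_{i=1}^{N}$, where $x_i$ represents the ith input vector, $y_i$ is the regression target value corresponding to $x_i$, and N is the sample size. The LS-SVM regression model in the primal weight space can be expressed as
$$y(x) = w^T \varphi(x) + b \tag{3}$$
The term φ(·) is a nonlinear mapping function, and $w \in R^n$ and $b \in R$ are model parameters. The associated optimization problem can be formulated as
$$\min_{w,b,e} J(w, e) = \frac{1}{2} w^T w + \frac{\gamma}{2} \sum_{i=1}^{N} e_i^2 \quad \text{s.t.} \quad y_i = w^T \varphi(x_i) + b + e_i, \; i = 1, \ldots, N \tag{4}$$
where γ is the regularization parameter (also called the penalty factor), and $e_i$ represents the prediction error term. The method of Lagrange multipliers is used to solve the constrained optimization problem, and the Lagrange function is constructed as
$$L(w, b, e, \alpha) = J(w, e) - \sum_{i=1}^{N} \alpha_i \left[ w^T \varphi(x_i) + b + e_i - y_i \right] \tag{5}$$
where $\alpha_i$, i = 1, ..., N, are the introduced Lagrange multipliers. The set of partial derivatives reads
$$\frac{\partial L}{\partial w} = 0 \Rightarrow w = \sum_{i=1}^{N} \alpha_i \varphi(x_i), \quad \frac{\partial L}{\partial b} = 0 \Rightarrow \sum_{i=1}^{N} \alpha_i = 0, \quad \frac{\partial L}{\partial e_i} = 0 \Rightarrow \alpha_i = \gamma e_i, \quad \frac{\partial L}{\partial \alpha_i} = 0 \Rightarrow w^T \varphi(x_i) + b + e_i - y_i = 0 \tag{6}$$
After eliminating w and e, the following linear Karush-Kuhn-Tucker system in α and b is obtained
$$\begin{bmatrix} 0 & 1_N^T \\ 1_N & \Omega + \gamma^{-1} I \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ y \end{bmatrix} \tag{7}$$
where
$$y = [y_1, \ldots, y_N]^T, \quad 1_N = [1, \ldots, 1]^T, \quad \alpha = [\alpha_1, \ldots, \alpha_N]^T, \quad \Omega_{ij} = \varphi(x_i)^T \varphi(x_j) = K(x_i, x_j) \tag{8}$$
and $K(x_i, x_j)$ is called the kernel function. The regression problem (3) can be solved in the dual space of the Lagrange multipliers after applying this kernel trick. Equation (3) can be represented as
$$y(x) = \sum_{i=1}^{N} \alpha_i K(x, x_i) + b \tag{9}$$
Due to its good generalization ability and fast convergence speed, the following radial basis function (RBF) is used as the kernel function in this paper
$$K(x_i, x_j) = \exp\left( -\frac{\|x_i - x_j\|^2}{2\sigma^2} \right) \tag{10}$$
where $\sigma^2$ is the kernel bandwidth parameter, which controls the radial range of the function.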
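A minimal sketch of LS-SVM regression follows from the linear Karush-Kuhn-Tucker system in α and b: assemble the kernel matrix, add the regularization term on its diagonal, solve one linear system, and predict with the kernel expansion. The sketch uses only the standard library, assumes the common RBF form exp(-||x - z||^2 / (2σ^2)), and uses made-up feature vectors and crack sizes; the function names and numbers are hypothetical and are not the authors' implementation or data.

```python
import math


def rbf(a, b, sigma2):
    """RBF kernel, assuming the form exp(-||a - b||^2 / (2 * sigma2))."""
    dist2 = sum((p - q) ** 2 for p, q in zip(a, b))
    return math.exp(-dist2 / (2.0 * sigma2))


def solve(matrix, rhs):
    """Solve a dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def lssvm_fit(xs, ys, gamma, sigma2):
    """Solve [[0, 1^T], [1, Omega + I/gamma]] [b; alpha] = [0; y] for b and alpha."""
    n = len(xs)
    system = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf(xs[i], xs[j], sigma2) for j in range(n)]
        row[i + 1] += 1.0 / gamma
        system.append(row)
    solution = solve(system, [0.0] + list(ys))
    return solution[0], solution[1:]


def lssvm_predict(x, xs, alpha, b, sigma2):
    """Kernel expansion: sum_i alpha_i * K(x, x_i) + b."""
    return sum(a * rbf(x, xi, sigma2) for a, xi in zip(alpha, xs)) + b


# Made-up training data: [normalized amplitude, phase change, correlation] -> crack size (mm).
features = [
    [1.00, 0.00, 1.00],
    [0.90, 0.05, 0.97],
    [0.78, 0.12, 0.90],
    [0.65, 0.20, 0.80],
    [0.50, 0.30, 0.65],
]
crack_sizes = [0.0, 5.0, 10.0, 15.0, 20.0]
GAMMA, SIGMA2 = 100.0, 0.05  # hypothetical tuning parameters

b, alpha = lssvm_fit(features, crack_sizes, GAMMA, SIGMA2)
fitted = [lssvm_predict(x, features, alpha, b, SIGMA2) for x in features]
print([round(v, 2) for v in fitted])
```

A dense solver is sufficient here because the system has only N + 1 unknowns; the two tuning parameters are fixed by hand in this sketch, whereas the paper selects them with the genetic algorithm.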
To construct the LS-SVM regression model with an RBF kernel, it is necessary to select two appropriate tuning parameters: the regularization parameter γ and the RBF parameter σ² [28]. Here, γ determines the trade-off between the training error minimization and the smoothness. In practical applications, cross validation and enumeration methods are often adopted to determine the tuning parameters, but both are computationally demanding. To alleviate the computational demand, the genetic algorithm is employed to optimize the tuning parameters for LS-SVM regression in the current study.
The genetic algorithm is a method designed for optimization of search problems. It repeatedly modifies the population of individual solutions with an iterative process. The population in each iteration is called a generation. The GA follows three main rules to produce the next generation from the current population [30]. (1) Selection: select the individuals, called parents, which contribute to the population at the next generation; (2) Crossover: combine two parents to form children for the next generation; (3) Mutation: apply random changes to individual parents to form children. In each generation, the fitness function of every individual, which is usually the objective of the optimization problems, is evaluated. Generally, the GA terminates when either a maximum number of generations has been produced or a satisfactory fitness level has been reached.
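The selection, crossover, and mutation steps, together with the two termination criteria, can be sketched as a small real-coded GA. This is a minimal illustration under stated assumptions: binary tournament selection, arithmetic crossover, Gaussian mutation applied to the children, and one elite individual carried over are common choices, not details given in the text, and the quadratic fitness is a hypothetical stand-in for the MRE of an LS-SVM model evaluated at (log10 γ, log10 σ²).

```python
import random


def genetic_minimize(fitness, bounds, pop_size=30, generations=60,
                     mutation_rate=0.2, target=1e-6, seed=0):
    """Minimize fitness over a box; returns (best individual, best fitness, history)."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        scored = sorted(population, key=fitness)
        history.append(fitness(scored[0]))
        if history[-1] <= target:  # stop early once a satisfactory fitness is reached
            break
        children = [scored[0][:]]  # elitism: the best individual survives unchanged
        while len(children) < pop_size:
            # Selection: a binary tournament picks each parent.
            parent_a = min(rng.sample(scored, 2), key=fitness)
            parent_b = min(rng.sample(scored, 2), key=fitness)
            # Crossover: arithmetic blend of the two parents.
            w = rng.random()
            child = [w * a + (1.0 - w) * b for a, b in zip(parent_a, parent_b)]
            # Mutation: random Gaussian change, then clip to the search bounds.
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    child[i] += rng.gauss(0.0, 0.1 * (hi - lo))
                child[i] = min(max(child[i], lo), hi)
            children.append(child)
        population = children
    best = min(population, key=fitness)
    return best, fitness(best), history


def toy_fitness(params):
    """Hypothetical stand-in for the MRE as a function of (log10 gamma, log10 sigma^2)."""
    return (params[0] - 2.0) ** 2 + (params[1] + 1.0) ** 2


BOUNDS = [(-2.0, 4.0), (-3.0, 2.0)]  # assumed search ranges in log10 units
best, best_fit, history = genetic_minimize(toy_fitness, BOUNDS)
print(best, best_fit)
```

Searching in log10 units is a design choice of this sketch that lets one population cover several orders of magnitude of γ and σ²; replacing `toy_fitness` with a function that trains the model and returns its error would connect the loop to the actual tuning problem.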
To optimize the tuning parameters of LS-SVM, the mean relative error (MRE) of the LS-SVM predictions is defined as the fitness function of the GA. The MRE can be expressed as
$$MRE = \frac{1}{N} \sum_{i=1}^{N} \frac{|y_i - \hat{y}_i|}{y_i} \times 100\% \tag{11}$$
where N is the sample size, $y_i$ represents the actual value, and $\hat{y}_i$ is the predicted value. The flowchart of the GA-based LS-SVM is shown in Figure 8.
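As a small worked example of the MRE used as the fitness function, the sketch below averages the relative errors and reports them as a percentage. The function name and the sample values are hypothetical; the guard against zero actual values is an addition of this sketch, since the relative error is undefined for a 0 mm crack.

```python
def mean_relative_error(actual, predicted):
    """Mean relative error in percent between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(y == 0 for y in actual):
        raise ValueError("relative error is undefined for a zero actual value")
    total = sum(abs(y - y_hat) / abs(y) for y, y_hat in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical crack sizes in mm: each prediction is off by 10 % of the actual value.
example_mre = mean_relative_error([10.0, 20.0], [9.0, 22.0])
print(example_mre)
```

Both relative errors are 0.1 in this example, so the mean is 10 %.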

Methodology Validation I: Coupon Test
A simple coupon test is performed in order to verify the efficiency and effectiveness of the proposed method. A pitch-catch sensor configuration is used to perform the damage detection. The normalized amplitude, phase change, and correlation coefficient are extracted from the received signal by signal processing techniques. After that, both the crack size and damage sensitive feature data are used to train the LS-SVM model. The model parameters are optimized by GA, with the optimization objective of minimizing the MRE between actual crack size and prediction data. Additionally, the trained GA based LS-SVM model is employed to predict the crack size using a different data set for validation purposes.

Experiment
The specimen of coupon test is made of 2024-T3 aluminum. In the center of each specimen, electric discharge machining (EDM) is used to produce a crack with a width of 0.3 mm. The geometry and mechanical properties of the test specimens are shown in Figure 9 and Table 1 respectively. As shown in Figure 9, two piezoelectric (PZT) sensors are placed at each side of the crack as a pitch-catch configuration. The red dot represents the actuator which is used to excite the Lamb wave and the green dot represents the sensor that is used to acquire the Lamb wave. Detailed information of the PZT is shown in Table 2.

Methodology Validation I: Coupon Test
A simple coupon test is performed in order to verify the efficiency and effectiveness of the proposed method. A pitch-catch sensor configuration is used to perform the damage detection. The normalized amplitude, phase change, and correlation coefficient are extracted from the received signal by signal processing techniques. After that, both the crack size and damage sensitive feature data are used to train the LS-SVM model. The model parameters are optimized by GA, with the optimization objective of minimizing the MRE between actual crack size and prediction data. Additionally, the trained GA based LS-SVM model is employed to predict the crack size using a different data set for validation purposes.

Experiment
The coupon test specimens are made of 2024-T3 aluminum. In the center of each specimen, electric discharge machining (EDM) is used to produce a crack with a width of 0.3 mm. The geometry and mechanical properties of the test specimens are shown in Figure 9 and Table 1, respectively. As shown in Figure 9, two piezoelectric (PZT) sensors are placed on either side of the crack in a pitch-catch configuration. The red dot represents the actuator, which excites the Lamb wave, and the green dot represents the sensor, which acquires it. Detailed information on the PZT is given in Table 2. A multi-channel digital oscilloscope with a sampling frequency of 1 GHz and a resolution of 12 bits is used for Lamb wave acquisition. A total of six specimens are employed for the coupon test. For each specimen, the crack size varies from 2 mm to 20 mm in increments of 3 mm and from 20 mm to 30 mm in increments of 5 mm. Both the healthy and damaged states are tested. The overall experimental setup is shown in Figures 10 and 11.
The frequency of the Lamb wave is set to 0.16 MHz for the 2 mm thick specimen to obtain the desired S0 mode. A symmetrical Hamming-windowed sinusoidal tone burst with 3.5 cycles is used as the excitation. The testing data are collected and the desired time window is obtained based on the procedure described in Section 2.1.
The extracted damage sensitive features of specimen T1 are shown in Table 3. With increasing crack size, the normalized amplitude and correlation coefficient decrease, and the phase change increases. The results are consistent with the previous discussion, indicating that the selected damage sensitive features are appropriate for crack size quantification. The crack length versus the normalized amplitude, phase change, and correlation coefficient is shown in Figure 12.
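The excitation described above can be sketched as follows. The 10 MHz generation rate and the function name are assumptions chosen for illustration; only the 0.16 MHz center frequency, the 3.5 cycles, and the Hamming window come from the text.

```python
import numpy as np

def hamming_tone_burst(freq_hz, n_cycles, fs_hz):
    """Hamming-windowed sinusoidal tone burst with a given number of cycles."""
    duration = n_cycles / freq_hz                 # burst length in seconds
    n_samples = int(round(duration * fs_hz)) + 1  # samples covering the burst
    t = np.arange(n_samples) / fs_hz
    # The window tapers the sinusoid to zero-like values at both ends.
    return t, np.hamming(n_samples) * np.sin(2.0 * np.pi * freq_hz * t)

# 3.5 cycles at 0.16 MHz; the 10 MHz sampling rate is an assumed value.
t, burst = hamming_tone_burst(0.16e6, 3.5, 10e6)
```

The window limits the bandwidth of the burst, which helps keep the excited energy concentrated near the chosen frequency.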

Crack Evaluation Using GA Based LS-SVM
In order to establish the relationship between crack size and damage sensitive features, the GA-based LS-SVM is established and trained using the acquired data. The datasets of T4, T5, and T6 are used for training and the datasets of T1, T2, and T3 are used for validation to verify the effectiveness of the trained model. For the input and output of the LS-SVM, write
$$S_i = \begin{bmatrix} a_{i,1} & p_{i,1} & c_{i,1} \\ \vdots & \vdots & \vdots \\ a_{i,10} & p_{i,10} & c_{i,10} \end{bmatrix}, \qquad L_i = \begin{bmatrix} l_{i,1} \\ \vdots \\ l_{i,10} \end{bmatrix}$$
where $S_i$ and $L_i$ are the damage sensitive features and crack size, respectively, for the ith specimen. Terms $a_{i,j}$, $p_{i,j}$, $c_{i,j}$, and $l_{i,j}$ are the normalized amplitude, phase change, correlation coefficient, and crack size, respectively, for the ith specimen in the jth crack size configuration, i = 1, 2, ..., 6, j = 1, 2, ..., 10.
Denoting $x_{train}$ as the independent variables of the training data and $y_{train}$ as the regression target value, the training uses the following set of data:
$$x_{train} = \begin{bmatrix} S_4 \\ S_5 \\ S_6 \end{bmatrix}, \qquad y_{train} = \begin{bmatrix} L_4 \\ L_5 \\ L_6 \end{bmatrix}$$
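A minimal sketch of how the training and validation sets could be assembled is given below. The random arrays are placeholders standing in for the measured features and crack sizes, and the dictionary layout and the helper name `stack` are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
names = [f"T{i}" for i in range(1, 7)]

# Placeholder data: 10 crack configurations per specimen, 3 features each
# (normalized amplitude, phase change, correlation coefficient).
features = {name: rng.random((10, 3)) for name in names}
crack_sizes = {name: 30.0 * rng.random(10) for name in names}

def stack(specimens):
    """Stack per-specimen feature rows and crack sizes into one dataset."""
    x = np.vstack([features[name] for name in specimens])
    y = np.concatenate([crack_sizes[name] for name in specimens])
    return x, y

x_train, y_train = stack(["T4", "T5", "T6"])  # training specimens
x_val, y_val = stack(["T1", "T2", "T3"])      # validation specimens
```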

The regularization parameter γ and the RBF kernel parameter σ are optimized using GA with the objective of minimizing the MRE of prediction. The optimal target value, i.e., the MRE of prediction, converges to 0.024%, as shown in Figure 13. In each generation of GA, 50 pairs of parameters are used to train the LS-SVM and calculate the MRE in parallel. The mean fitness is the mean MRE of these 50 results. The optimal results are γ = 788.2670 and σ² = 0.1050. Using the optimal parameters obtained with GA and the training data, the LS-SVM model is trained for crack size prediction. The prediction results for T1, T2, and T3 are shown in Figure 14 in the form of scatter plots. The performance of the proposed GA-based LS-SVM Lamb wave damage quantification model is evaluated by the coefficient of determination R². The slope of the fitted line y = a0x + a1 in the scatter plots is very close to 1, indicating an accurate prediction.
In [14], a second-order multivariate model is proposed to predict the crack size based on the three damage sensitive features mentioned above; the model is given as
$$a = A + \alpha_1 x + \alpha_2 y + \alpha_3 z + \alpha_4 xy + \alpha_5 xz + \alpha_6 yz + \alpha_7 x^2 + \alpha_8 y^2 + \alpha_9 z^2 \qquad (15)$$
where a is the crack length, x is the correlation coefficient, y is the phase change, and z is the amplitude change. The model parameters are estimated by regression analysis of the experimental data. The regression parameters are shown in Table 4. The prediction results obtained from Equation (15) are compared with those of the GA-based LS-SVM model, as illustrated in Figure 15. The MRE results are shown in Table 5. In this case, the GA-based LS-SVM performs better than the alternative method.
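For reference, a minimal LS-SVM regression sketch is given below. It uses the standard formulation in which training reduces to solving one linear system, and it assumes the RBF kernel form exp(-||x - z||² / (2σ²)); the kernel scaling used in the paper may differ. The data are synthetic placeholders and the function names are hypothetical; only the values of γ and σ² are taken from the text.

```python
import numpy as np

def rbf_kernel(a, b, sigma2):
    """RBF kernel matrix, assuming K = exp(-||x - z||^2 / (2 * sigma2))."""
    d2 = (np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma2))

def lssvm_fit(x, y, gamma, sigma2):
    """Solve the LS-SVM linear system for the bias and the multipliers."""
    n = len(y)
    system = np.zeros((n + 1, n + 1))
    system[0, 1:] = 1.0   # constraint: the multipliers sum to zero
    system[1:, 0] = 1.0   # bias column
    system[1:, 1:] = rbf_kernel(x, x, sigma2) + np.eye(n) / gamma
    solution = np.linalg.solve(system, np.concatenate(([0.0], y)))
    return solution[0], solution[1:]

def lssvm_predict(x_new, x_train, bias, alpha, sigma2):
    """Evaluate the trained regressor at new feature rows."""
    return rbf_kernel(x_new, x_train, sigma2) @ alpha + bias

# Synthetic placeholder data: 30 samples with 3 features each.
rng = np.random.default_rng(0)
x_train = rng.random((30, 3))
y_train = 30.0 * (1.0 - x_train[:, 0]) + 2.0 * x_train[:, 1]

gamma, sigma2 = 788.2670, 0.1050  # optimal values reported in the text
bias, alpha = lssvm_fit(x_train, y_train, gamma, sigma2)
fitted = lssvm_predict(x_train, x_train, bias, alpha, sigma2)
```

A larger γ penalizes training residuals more strongly, while σ² controls how quickly the influence of a training sample decays with distance in feature space.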

Cross Validation
The stability and robustness of the proposed GA-based LS-SVM model are further validated using exhaustive cross validation. The dataset is partitioned into two subsets, one used for training and the other for validation, and the previous training and validation process is applied. Using the data of T1-T6, three of the six specimens are chosen for training, which gives 20 different partitions in total, i.e., C(6,3) = 20. Apart from the one combination shown previously to illustrate the methodology, the results of the remaining 19 combinations are shown in Table 6. An overall satisfactory performance in terms of MRE is observed, indicating that the proposed GA-based LS-SVM model can reliably and accurately predict the crack size.
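The partitions can be enumerated with a short sketch; the variable names are hypothetical.

```python
from itertools import combinations

specimens = ["T1", "T2", "T3", "T4", "T5", "T6"]

# Each partition pairs three training specimens with the remaining three.
partitions = [
    (train, tuple(s for s in specimens if s not in train))
    for train in combinations(specimens, 3)
]
n_partitions = len(partitions)
```

Each entry would then be passed through the same train-and-validate procedure used for the T4-T6 / T1-T3 split.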

Methodology Validation II: Lap Joint Components Fatigue Test
In Section 3, the coupon test was performed to prove the effectiveness of the proposed method. In this section, data from the realistic lap joint components fatigue test with a naturally developed fatigue crack is used to further validate the GA-based LS-SVM damage quantification model.

Experiment
The lap joint fatigue test specimens are made of 2024-T3 aluminum with a thickness of 1.6 mm. The mechanical properties of the test specimens are shown in Table 1. The geometry of the test specimen and the PZT sensor layout are shown in Figure 16. Each specimen is a lap joint of two aluminum panels with three rows of five rivets. The experimental setup and specimens are shown in Figure 17.
For the lap joint components fatigue test, the excitation frequency of the Lamb wave is 0.2 MHz. Similar to the coupon test, a symmetrical Hamming-windowed sinusoidal tone burst with 3.5 cycles is used as the excitation. Detailed information regarding the lap joint components fatigue test can be found in reference [14]. A total of seven specimens, numbered S1-S7, are used for the fatigue test. In order to verify the robustness of the proposed method for crack quantification, specimens from different manufacturers with different loading spectra are used for fatigue testing. The loading spectrum for specimens S1-S6 is constant, and a block loading is used for S7. Figure 18 shows the constant and variable amplitude loading spectra. Specimen S6 is made by a different manufacturer with the same material and geometry.
Similar to the coupon test, normalized amplitude, phase change, and correlation coefficient are extracted as the damage sensitive features for all seven tested specimens, as shown in Figure 19. The data of the lap joint specimens exhibit a larger variation compared with the coupon test due to the complexity in crack orientations, boundary conditions, and manufacturing uncertainty.
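The feature extraction step can be illustrated with a hedged sketch. The definitions below are assumptions standing in for those of Section 2.1: the normalized amplitude is taken as the ratio of peak magnitudes, the phase change as the time shift at the cross-correlation peak, and the correlation coefficient as the Pearson coefficient between the baseline and damaged windows. The signals are synthetic and the function name is hypothetical.

```python
import numpy as np

def damage_features(baseline, damaged, fs_hz):
    """Return (normalized amplitude, time shift in s, correlation coefficient)."""
    baseline = np.asarray(baseline, dtype=float)
    damaged = np.asarray(damaged, dtype=float)
    norm_amp = np.max(np.abs(damaged)) / np.max(np.abs(baseline))
    # Lag (in samples) at which the damaged signal best matches the baseline.
    xcorr = np.correlate(damaged, baseline, mode="full")
    lag = int(np.argmax(xcorr)) - (len(baseline) - 1)
    corr = float(np.corrcoef(baseline, damaged)[0, 1])
    return float(norm_amp), lag / fs_hz, corr

# Synthetic example: the "damaged" signal is attenuated and delayed by 5 samples.
fs = 10e6                                   # assumed sampling rate
n_burst = 220
t = np.arange(n_burst) / fs
burst = np.hamming(n_burst) * np.sin(2.0 * np.pi * 0.16e6 * t)
baseline = np.zeros(400)
baseline[50:50 + n_burst] = burst
damaged = np.zeros(400)
damaged[55:55 + n_burst] = 0.8 * burst

amp, shift, corr = damage_features(baseline, damaged, fs)
```

In this synthetic case the attenuation lowers the normalized amplitude and the delay lowers the correlation coefficient, which mirrors the trends reported for growing cracks.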

Crack Quantification Using GA Based LS-SVM
The damage sensitive feature data and corresponding crack sizes of specimens S2, S3, and S4 are used to train the GA-based LS-SVM, and the data of specimens S1 and S5 are used to validate the trained model. The regularization parameter γ and the RBF kernel parameter σ are optimized using GA with the objective of minimizing the MRE. The optimal target value of the MRE converges to 0.34%, with optimal parameters γ = 17.0316 and σ² = 0.0195, as shown in Figure 20.
The GA-based LS-SVM model is also compared with the second-order multivariate model proposed in reference [14]. The regression parameters of Equation (15) are shown in Table 7. The prediction results of the two methods are shown in Figure 22. The GA-based LS-SVM generates more accurate results, as its prediction points are closer to the actual points than those of the second-order multivariate model. The comparison of the MRE results for the GA-based LS-SVM and the second-order multivariate model is shown in Table 8. The GA-based LS-SVM yields smaller MREs than the second-order multivariate model, indicating that it can produce more accurate prediction results.
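The parameter search can be sketched with a small real-coded genetic algorithm. Everything below is an illustrative assumption rather than the configuration used in the paper: the operators (tournament selection, arithmetic crossover, Gaussian mutation, elitism), the search in log10 space, the bounds, and the toy fitness function. Only the population size of 50 follows the text. In practice the fitness would train an LS-SVM with the candidate (γ, σ²) and return the validation MRE.

```python
import numpy as np

def genetic_minimize(fitness, bounds, pop_size=50, generations=60, seed=0):
    """Minimize fitness over a box using a simple real-coded GA."""
    rng = np.random.default_rng(seed)
    low, high = np.array(bounds, dtype=float).T
    pop = rng.uniform(low, high, size=(pop_size, len(low)))
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        elite = pop[np.argmin(scores)].copy()
        # Tournament selection: the better of two random individuals survives.
        i, j = rng.integers(pop_size, size=(2, pop_size))
        parents = np.where((scores[i] < scores[j])[:, None], pop[i], pop[j])
        # Arithmetic crossover with a randomly shuffled partner.
        partners = parents[rng.permutation(pop_size)]
        w = rng.random((pop_size, 1))
        children = w * parents + (1.0 - w) * partners
        # Gaussian mutation scaled to 5% of each parameter range.
        children += rng.normal(0.0, 0.05 * (high - low), size=children.shape)
        pop = np.clip(children, low, high)
        pop[0] = elite  # elitism: the best individual is never lost
    scores = np.array([fitness(ind) for ind in pop])
    return pop[np.argmin(scores)], float(scores.min())

def toy_fitness(params):
    """Stand-in for the MRE; minimum at log10(gamma) = 1, log10(sigma2) = -2."""
    return (params[0] - 1.0) ** 2 + (params[1] + 2.0) ** 2

# Search box in log10 space for (gamma, sigma2); the bounds are assumed.
best, best_score = genetic_minimize(toy_fitness, [(-2.0, 4.0), (-4.0, 1.0)])
gamma_opt, sigma2_opt = 10.0 ** best
```

Searching in log10 space is a common choice when the parameters span several orders of magnitude, as γ and σ² do here.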

Model Validation Using Different Loading and Manufacture
To further verify the robustness of the proposed GA-based LS-SVM model for crack quantification, specimens with a different loading spectrum (S7) and from a different manufacturer (S6) were investigated. The GA-based LS-SVM model obtained in Section 4.2 is used to predict the crack size for S6 and S7. The prediction results are shown in Figure 23. Table 9 presents the MRE results for S6 and S7.
The prediction results demonstrate that the GA-based LS-SVM model trained using specimens from one manufacturer under constant loading can be used reliably and accurately to predict the crack size for specimens from another manufacturer and for specimens under a different loading. This indicates that the proposed method is robust against uncertainties associated with different loading cases and manufacturers.

Cross Validation
Cross validation is also used to validate the stability and robustness of the proposed GA-based LS-SVM model for the lap joint components fatigue test. The data of S1-S5 are partitioned into two subsets. One, comprising the test data of three specimens, is used for training, and the other is used for validation. Choosing three from five gives 10 different partitions in total, i.e., C(5,3) = 10. Apart from the one combination presented above, the results of the remaining nine combinations are shown in Table 10. The MRE for each result is small and stable. In addition, the specimens with a different loading spectrum (S7) and from a different manufacturer (S6) are investigated in the cross validation. The GA-based LS-SVM models trained with the different training sets are used to predict the crack size for S6 and S7. The MREs for S6 and S7 based on the predictions of the different models are presented in Table 11.

Conclusions
This paper presented a method for crack size prediction using in-situ Lamb wave testing and GA-based LS-SVM. Three damage sensitive features are built based on physical interpretations. The three features, namely, normalized amplitude, phase change, and correlation coefficient, are extracted from the Lamb wave signal and used to develop LS-SVM. To enhance the robustness and training efficiency, a genetic algorithm is employed to obtain the optimal model parameters.
A simple coupon test with six specimens was conducted to verify the effectiveness of the proposed method. The overall dataset is partitioned into two subsets, one for training and one for validation. The proposed GA-based LS-SVM is investigated thoroughly using the exhaustive cross validation technique. Furthermore, realistic lap joint components fatigue test data with naturally developed fatigue cracks are used to validate the robustness and accuracy of the proposed GA-based LS-SVM model. Specimens from different manufacturers and with different loading spectra are included to introduce inter-specimen uncertainty. The validation results indicate that the proposed method can predict the crack size reliably and accurately. Based on the current results, the following conclusions can be drawn.
(1) The proposed GA-based LS-SVM can accurately predict the crack size using Lamb wave data. An exhaustive cross validation is performed to investigate the performance of data-driven methods. Cross validation results indicate the proposed method is robust and is independent of the training dataset selection.
(2) The proposed GA-based LS-SVM can perform reliably under manufacturing and loading uncertainty. It provides a viable solution for structural health monitoring applications using the Lamb wave.
It is also worth mentioning that, in both cases, the crack lengths of all specimens (in the training and validation sets alike) are within the same size range. For the coupon test, all crack sizes are in the range of 0-30 mm. For the lap joint case, the range of crack size is 0-8 mm. Since the damage features are physics-based and the model incorporating the features is data-driven, the proposed method is inherently limited when extrapolating to crack sizes outside the range covered by the training data. The extrapolation limit will be investigated in a future study.