In this paper, the acceleration measured on the steel frame model was used as the CNN input to detect structural damage. The CNN could automatically extract damage features from these signals without the hand-crafted indicators required by traditional detection methods. The proposed method proceeded as follows: (1) obtain the structural response data through FEA and vibration experiments; (2) train the CNN using the samples obtained from the FEA; (3) test the trained CNN on the single-damage, multiple-damage, and combined datasets.
This paper used a steel frame beam as the research object (Figure 1a). The steel frame had a length, width, and height of 9.912 m, 0.354 m, and 0.354 m, respectively. The steel frame consisted of 355 rods; each rod had a hollow circular cross section with an external radius of 0.005 m and a thickness of 0.002 m. The two ends of the steel frame were fixed. Damage was introduced in 9 rods of the steel frame (numbered 1 to 9 in Figure 1b). The response signals of the 13 measurement points (labelled A1 to A13) on the bottom chords were used as the inputs of the CNN samples. The 4 excitation points (F1–F4) were on the top chords (Figure 1b). The acquisition frequency of the response signals in both the numerical simulations and the vibration experiments was 100 Hz, and the collection time was 8 s for each excitation.
2.3. CNN Samples
The CNN samples came from two sources, numerical simulations and vibration experiments, which are described below.
Numerical simulations: four datasets (A, B, C, and D) were generated, containing, respectively, a single damage in one rod, simultaneous damage in 2 rods, simultaneous damage in 3 rods, and a mixed subset of the scenarios from the first three datasets.
For the single-damage dataset, there were 10 scenarios (9 damage locations + 1 intact structure). The acceleration time histories of the 13 measurement points for these 10 scenarios were used as the CNN inputs; for the corresponding CNN outputs, the intact structure was labelled 0, damage in Rod 1 was labelled 1, damage in Rod 2 was labelled 2, and so on.
For the dataset with simultaneous damage in 2 rods, every pair of the 9 rods was considered; hence, there were 36 ($C_9^2 = 36$) scenarios. The acceleration signals of the 13 measurement points for the 36 scenarios were used as the CNN inputs, and the labels 1, 2, …, 36 were set as the corresponding CNN outputs.
For the dataset with simultaneous damage in 3 rods, every combination of three of the 9 rods was considered. There were 84 scenarios ($C_9^3 = 84$), and the acceleration signals of the 13 measurement points for the 84 scenarios were used as the CNN inputs, with the labels 1, 2, …, 84 set as the corresponding CNN outputs.
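The scenario counts above follow directly from enumerating rod combinations. The short sketch below reproduces them and builds one possible label map; the lexicographic ordering of the labels and the helper name `build_label_map` are assumptions made for illustration, since the text only states that the scenarios were labelled 1, 2, …, N.

```python
from itertools import combinations
from math import comb

RODS = range(1, 10)  # the 9 candidate damage locations, Rod 1 to Rod 9


def build_label_map(n_damaged):
    """Map each combination of damaged rods to a class label starting at 1.

    The lexicographic ordering is an assumption; the source only states
    that the scenarios are labelled 1, 2, ..., N.
    """
    scenarios = combinations(RODS, n_damaged)
    return {rods: label for label, rods in enumerate(scenarios, start=1)}


labels_double = build_label_map(2)  # simultaneous damage in 2 rods
labels_triple = build_label_map(3)  # simultaneous damage in 3 rods

print(len(labels_double), comb(9, 2))  # 36 36
print(len(labels_triple), comb(9, 3))  # 84 84
```

With this ordering, damage in Rods 1 and 2 receives label 1 and damage in Rods 8 and 9 receives label 36.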
Dataset D included the intact structure plus 7 damage scenarios: (1) damage in Rod 1, (2) damage in Rod 5, (3) damage in Rod 9, (4) simultaneous damage in Rod 1 and Rod 5, (5) simultaneous damage in Rod 1 and Rod 9, (6) simultaneous damage in Rod 5 and Rod 9, and (7) simultaneous damage in Rod 1, Rod 5, and Rod 9. The acceleration signals of the 13 measurement points for the 8 scenarios were used as the CNN inputs, and 0, 1, 2, …, 7 were set as the corresponding CNN outputs.
Vibration experiments: dataset E used the same damage scenarios as dataset D in the numerical simulations; the vibration signals of the 8 structural states were used as the CNN inputs, and 0, 1, 2, …, 7 were set as the CNN outputs.
In the numerical simulations, the excitations were applied at the 4 locations (F1, F2, F3, and F4) in turn, 5 times each; for each excitation, the response was collected for 8 s (at an acquisition frequency of 100 Hz) at the 13 measurement points, so a data matrix of $16{,}000 \times 13$ was collected ($4 \times 5 \times 800 = 16{,}000$ rows). A sliding window (with a size of $10 \times 13$) was moved down the data matrix one step at a time, so that 15,991 samples were produced for each damage scenario (Figure 4).
Table 1, Table 2, and Table 3 list the sample numbers of each dataset.
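The sample count can be checked with a small sliding-window sketch. The window height of 10 rows and the step of one row are assumptions inferred from the reported count (16,000 − 10 + 1 = 15,991), and the random array is only a placeholder for the simulated accelerations.

```python
import numpy as np

FS_HZ = 100      # acquisition frequency
DURATION_S = 8   # record length per excitation
N_POINTS = 4     # excitation points F1 to F4
N_REPEATS = 5    # excitations applied at each point
N_SENSORS = 13   # measurement points A1 to A13
WINDOW = 10      # assumed window height in rows (inferred from 15,991)
STEP = 1         # assumed step of one row

n_rows = N_POINTS * N_REPEATS * DURATION_S * FS_HZ  # 16,000 time steps

# Placeholder data standing in for the simulated acceleration records.
rng = np.random.default_rng(0)
data = rng.standard_normal((n_rows, N_SENSORS))


def sliding_samples(matrix, window, step=1):
    """Stack every window of `window` consecutive rows into one array."""
    starts = range(0, matrix.shape[0] - window + 1, step)
    return np.stack([matrix[s:s + window] for s in starts])


samples = sliding_samples(data, WINDOW, STEP)
print(samples.shape)  # (15991, 10, 13)
```

Each sample overlaps its neighbour by 9 rows, which is why a single 16,000-row record yields almost 16,000 training samples.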
2.4. Convolutional Neural Network
The CNN architecture used for classification is shown in Figure 5. The architecture was designed around the steel frame and the locations of the acceleration measurement points. The CNN included two convolution layers (the first layer had 30 convolution kernels with the size and stride being
and 1; the second layer had 60 convolution kernels with the size and stride being
and 1), a pooling layer (the size and stride being
and 3), a fully connected layer, and an output layer.
In the convolution process, each element of the convolution kernel is multiplied by the corresponding element of a sub-matrix of the input matrix, and the products are summed to give one element of the feature matrix, as shown in Equation (1):

$$y(k) = \sum_{i=1}^{n} x(k + i)\, w(i) \qquad (1)$$

where the function $x$ is the input, the function $w$ is the convolution kernel, $n$ is the number of elements in the convolution region, and $k$
is the number of moves of the convolution kernel. The kernel then slides with a fixed step size, and the process is repeated until all elements of the input matrix have been covered, which yields the feature matrix (Figure 6). The pooling layer reduces the dimension of its input. There are generally two types of pooling: maximum pooling and average pooling (Figure 7). In this paper, maximum pooling was adopted, as it performs better than average pooling [30].
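The multiply-and-sum operation and maximum pooling described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the paper's MATLAB implementation: the helper names, the 4 × 4 input, the 2 × 2 kernel, and the pooling size are all hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import numpy as np


def conv2d_valid(x, kernel, stride=1):
    """Slide `kernel` over `x` (no padding, no kernel flip) and sum the products."""
    kh, kw = kernel.shape
    out_h = (x.shape[0] - kh) // stride + 1
    out_w = (x.shape[1] - kw) // stride + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            out[i, j] = np.sum(x[r:r + kh, c:c + kw] * kernel)
    return out


def max_pool2d(x, size, stride):
    """Keep the largest value in each pooling region."""
    ph, pw = size
    out_h = (x.shape[0] - ph) // stride + 1
    out_w = (x.shape[1] - pw) // stride + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            out[i, j] = x[r:r + ph, c:c + pw].max()
    return out


x = np.arange(16, dtype=float).reshape(4, 4)  # toy input matrix
kernel = np.ones((2, 2))                      # toy kernel: sums each 2x2 patch

features = conv2d_valid(x, kernel)            # 3x3 feature matrix
pooled = max_pool2d(features, (2, 2), 1)      # 2x2 pooled matrix

print(features)
print(pooled)
```

The explicit loops are slow but mirror the description step by step; a deep learning framework would vectorize the same operation.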
Activation functions introduce nonlinearity and thereby enhance the learning ability of the CNN. Commonly used activation functions include Sigmoid, tanh, and ReLU (Rectified Linear Unit) (Figure 8). In this paper, ReLU was used because it is computationally more efficient than Sigmoid and tanh [31].
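For reference, the three activation functions named above can be written in NumPy as follows. These are the standard textbook definitions; the sample input values are arbitrary.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, output in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def tanh(x):
    """Hyperbolic tangent, output in (-1, 1)."""
    return np.tanh(x)


def relu(x):
    """Rectified Linear Unit: passes positive values, zeroes the rest."""
    return np.maximum(0.0, x)


x = np.array([-2.0, 0.0, 2.0])  # arbitrary sample inputs
print(sigmoid(x))
print(tanh(x))
print(relu(x))
```

ReLU needs only a comparison per element, whereas Sigmoid and tanh each require an exponential.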
Softmax was used as the output layer of the CNN to perform multi-class classification and output the final predictions of the CNN. The softmax calculation is shown in Formula (2), where $z$ is a vector composed of $K$ elements and $\sigma(z)$ is the probability distribution over the elements of the vector:

$$\sigma(z)_j = \frac{e^{z_j}}{\sum_{k=1}^{K} e^{z_k}} \qquad (2)$$

where $j = 1, \ldots, K$, $z \in \mathbb{R}^{K}$; $\sigma(z)_j \in (0, 1)$, $\sum_{j=1}^{K} \sigma(z)_j = 1$. The above formula was applied to solve the multi-class classification problem in the CNN. The prediction probability of class $j$ for a given sample vector $x$ and weight vector $w$ was

$$P(y = j \mid x) = \frac{e^{x^{T} w_j}}{\sum_{k=1}^{K} e^{x^{T} w_k}} \qquad (3)$$

where $j = 1, \ldots, K$, $x^{T} w_j$ was the output of the fully connected layer for class $j$, and $P(y = j \mid x)$ was the predicted probability of class $j$.
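A minimal sketch of the softmax output step is given below, using the standard definition exp(z_j) / Σ_k exp(z_k). The score vector is hypothetical, and subtracting the maximum before exponentiating is a common numerical-stability measure rather than something stated in the text.

```python
import numpy as np


def softmax(z):
    """Convert a vector of class scores into a probability distribution."""
    z = np.asarray(z, dtype=float)
    shifted = z - z.max()  # leaves the result unchanged, avoids overflow
    exp_z = np.exp(shifted)
    return exp_z / exp_z.sum()


# Hypothetical fully connected layer outputs for three classes.
scores = np.array([2.0, 1.0, 0.1])
probs = softmax(scores)
predicted_class = int(np.argmax(probs))

print(probs, probs.sum())
print(predicted_class)
```

The predicted damage scenario is the class with the largest probability, here class 0.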
The classical stochastic gradient descent method is generally used to update the network weights. This paper adopted a more effective optimization method, adaptive moment estimation (Adam), to achieve better recognition performance. Adam combines AdaGrad (adaptive gradient algorithm) and RMSProp (root mean square propagation) [32] and inherits the main advantage of both algorithms: an adaptive learning rate maintained separately for each parameter.
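The per-parameter adaptive step can be illustrated with a single Adam update written in NumPy. This is a sketch of the standard published update rule with its commonly used default hyperparameters, not the settings used in this paper; the quadratic toy objective is hypothetical.

```python
import numpy as np


def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update and return the new weights and moment estimates."""
    m = beta1 * m + (1 - beta1) * grad       # first moment: mean of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2  # second moment: mean of squared gradients
    m_hat = m / (1 - beta1 ** t)             # bias correction for early steps
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v


# Toy problem (hypothetical): minimize f(w) = sum(w**2), whose gradient is 2w.
w = np.array([1.0, -2.0])
m = np.zeros_like(w)
v = np.zeros_like(w)
for t in range(1, 2001):
    w, m, v = adam_step(w, 2 * w, m, v, t, lr=0.01)

print(w)
```

Because each weight is divided by its own running gradient magnitude, both components initially move by about `lr` per step even though their gradients differ by a factor of two.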
2.5. Structural Damage Detection
In this paper, the CNN was implemented in MATLAB (MathWorks Inc., Natick, MA, USA), and its internal parameters were tuned to achieve the desired detection results. The CNN parameters are listed in Table 4.
In this paper, five datasets were studied. The CNN architecture was the same for all damage scenarios. Datasets A, B, C, and D were studied with the samples obtained from the numerical simulations. The vibration signals (accelerations) obtained from the numerical simulations were used for network training and testing (15,200 training samples and 791 testing samples for each damage scenario). Dataset E provided experimental testing samples to validate the applicability of the CNN trained on the numerical simulations.
Normalization is widely used in data processing because it keeps data within a fixed range and makes data from different sources comparable. In this paper, the acceleration at the measurement points was normalized using Formula (4):

$$x^{*} = 2\,\frac{x - x_{\min}}{x_{\max} - x_{\min}} - 1 \qquad (4)$$

where $x$ and $x^{*}$ are the values before and after normalization, respectively, and $x_{\max}$ and $x_{\min}$ are the maximum and minimum values of the sample data, respectively. In this paper, the acceleration was normalized into the range [−1, 1] along the time direction. The normalization method is shown in Figure 9. After normalization, the data were input into the CNN for damage detection, as shown in Figure 10.
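A minimal sketch of min-max normalization into [−1, 1] along the time direction is shown below. It assumes that rows are time steps and columns are measurement points, so each column is scaled by its own minimum and maximum; that layout, the function name `normalize_columns`, and the handling of constant columns are assumptions for illustration.

```python
import numpy as np


def normalize_columns(x):
    """Scale each column (one measurement point) to [-1, 1] along the time axis."""
    x = np.asarray(x, dtype=float)
    x_min = x.min(axis=0, keepdims=True)
    x_max = x.max(axis=0, keepdims=True)
    span = x_max - x_min
    span[span == 0] = 1.0  # assumption: a constant column is mapped to -1
    return 2.0 * (x - x_min) / span - 1.0


# Hypothetical record: 3 time steps (rows) at 2 measurement points (columns).
record = np.array([[0.0, 10.0],
                   [5.0, 20.0],
                   [10.0, 40.0]])
normalized = normalize_columns(record)

print(normalized)
```

After scaling, every column has a minimum of −1 and a maximum of 1, so signals from measurement points with different amplitudes become directly comparable.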