Measurement of Bridge Vibration by UAVs Combined with CNN and KLT Optical-Flow Method

Abstract: A method for measuring bridge vibration with unmanned aerial vehicles (UAVs) combined with a convolutional neural network (CNN) and the Kanade–Lucas–Tomasi (KLT) optical-flow method is proposed. The method requires stationary reference points in the structural background. A UAV records a video of the structure, and the KLT optical-flow method tracks the target points on the structure and the background reference points to obtain their coordinates in each frame. A CNN then learns the characteristic relationship between the reference points and the target points from these coordinates, and this relationship is used to correct the displacement time-history curves of the target points, which contain false displacement caused by the UAV's egomotion. Finally, operational modal analysis (OMA) is used to extract the natural frequency of the structure from the displacement signal. The reliability of UAV measurement combined with the CNN is assessed by comparing its results with those of a fixed camera, and the reliability of the KLT optical-flow method is assessed by comparing its tracking results with those of digital image correlation (DIC).


Introduction
The long-term use of bridges may lead to structural damages; hence, it is necessary to detect damages regularly. Vibration measurement is an important step in structural damage detections. In recent years, some noncontact-measurement methods have been proposed, such as Global Positioning System (GPS) [1] and laser Doppler vibrometer (LDV) [2], to replace the traditional contact-measurement methods (such as acceleration sensors [3] and strain gauges [4]). However, the GPS is of low accuracy [5] and LDV is costly and time-consuming [6]. With the development of computer-vision technology, digital image correlation (DIC) is more and more widely used in bridge vibration measurement [7,8]. Compared with GPS and LDV, DIC technology has the advantages of low cost, high precision, and high efficiency. DIC is also used for deformation and displacement measurement of other engineering structures [9]. However, the measurement accuracy of the DIC method is limited due to the errors caused by pixel interpolation [10]. Also as a computer-vision method, the optical-flow method is widely used in bridge vibration measurement, and its accuracy has been confirmed [11,12].
The Kanade-Lucas-Tomasi (KLT) optical-flow method [13] builds on the Lucas-Kanade optical-flow method [14]. The concept of optical flow was first proposed by Gibson [15] and represents the velocity of a moving object in a time-varying image. Following this idea, the KLT optical-flow method matches and tracks feature points across two adjacent frames to obtain their motion. Its reliability has been validated by tracking feature points in different scenes (buildings, grasslands, etc.) [16], and it has been used to measure the vibration of bridge models [12].
To adapt to various measurement environments (such as cross-river bridges), UAVs have in recent years been used to replace fixed cameras for bridge measurement [17,18]. UAVs are also gradually being applied to structural crack detection, displacement measurement, and damage inspection of bridges [19][20][21]. However, because of the egomotion of a UAV during flight, the measured displacement includes not only the displacement of the structure but also the false displacement caused by that egomotion. The common method for eliminating the false displacement is homography transformation [22,23], but this requires four or more static reference points in the same plane as the target points, which is difficult to achieve in actual measurement. The random components in the signals collected by UAVs may be suppressed by a differential filtering method [24] to obtain the structural modal parameters [25]. Although this method extracts the bridge modal parameters without any reference point, it does not yield the real displacement time-history curves of the target points. A correction method based on three-dimensional reconstruction obtains the intrinsic and extrinsic parameters of the UAV camera using Zhang's method [26] and recovers the 3D world coordinates of the structure to obtain its true displacement [27]. However, this method requires the plane of the reference points to be parallel to that of the target points and the distance between the two planes to be known, which is difficult to achieve in practical measurement. The correction of UAV images by neural networks has attracted much attention in recent years. One method corrects UAV images with a radial basis function (RBF) neural network [28]: selected control points serve as the training samples of the network, and corrected images are obtained after the UAV images are fed into the network.
The two-layer feedforward neural network (FNN) was used to learn the characteristic relationship between the reference background and target points from the video with no structural motion [29]. The target-point coordinates can be estimated through the reference background when the structure is vibrating, which can be used to determine the homography transformation matrix of each frame image, so as to obtain the real coordinates of the target points.
A measurement method of bridge vibration by UAVs based on the KLT optical-flow method and a CNN is proposed in this paper. In this method, the KLT optical-flow method is used to accurately track the target points, and the multilayer CNN is used to effectively learn the characteristic relationship between the reference points and the target points to eliminate the false displacement of the UAV, so as to obtain the real displacement time-history curves of the structure, and the natural frequency of the model is extracted by OMA [30,31]. Finally, the measurement results of DIC and fixed camera are used as references to validate the reliability of the method proposed by this paper.

Methods
Firstly, the video of a static structure taken by a UAV is converted into a continuous sequence of digital images stored as frames. The coordinates of the reference points and target points are obtained by DIC and the KLT optical-flow method and used to train the CNN. Then, the structure is excited and image sequences are captured by a fixed camera and the UAV. The target points are tracked with DIC and the KLT optical-flow method, respectively, to obtain the displacement time-history curves (the one obtained by the UAV needs to be corrected by the CNN). The natural frequency of the structure is then extracted by OMA. The technical flowchart is shown in Figure 1.

Assumptions of Optical-Flow Method
The optical-flow method is based on two assumptions: brightness constancy and small motion; that is, the pixel values of the same points are unchanged between frames, and the motion of the points is small. Figure 2 shows three target points in two adjacent images. The position of each target point in the second image can be determined by finding the point whose pixel value is consistent with that of the target point in the first image.

KLT Optical-Flow Method
In addition to the two assumptions, the KLT optical-flow method [13] also assumes the spatial consistency; that is, the adjacent pixels in the previous frame are also adjacent in the next frame.
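Suppose $I_1(x, y)$ and $I_2(x, y)$ are two adjacent images; a fixed-size window, $W$, centered on the position of a target point is established in the first image. All pixels in $W$ move by $(d_x, d_y)$ between the two images, and $(x, y)$ are the coordinates of the pixels. Let $d = [d_x \; d_y]^T$ and $X = [x \; y]^T$. The $d$ closest to the actual value can be obtained by minimizing the following expression:

$$\varepsilon = \iint_W \left[ I_2(X + d) - I_1(X) \right]^2 \omega(X)\, dX \tag{1}$$

where $\omega(X)$ is a weighting function. In the simplest case, $\omega(X) = 1$. Alternatively, $\omega$ could be a Gaussian-like function to emphasize the central area of the window. Move the centers of $I_1$ and $I_2$ by $d/2$ to obtain

$$\varepsilon = \iint_W \left[ I_2\!\left(X + \tfrac{d}{2}\right) - I_1\!\left(X - \tfrac{d}{2}\right) \right]^2 \omega(X)\, dX \tag{2}$$

Set the partial derivative of $\varepsilon$ with respect to $d$ as 0:

$$\frac{\partial \varepsilon}{\partial d} = 2 \iint_W \left[ I_2\!\left(X + \tfrac{d}{2}\right) - I_1\!\left(X - \tfrac{d}{2}\right) \right] \left[ \frac{\partial I_2\!\left(X + \tfrac{d}{2}\right)}{\partial d} - \frac{\partial I_1\!\left(X - \tfrac{d}{2}\right)}{\partial d} \right] \omega(X)\, dX = 0 \tag{3}$$

The following formulas can be obtained from Taylor's expansion:

$$I_2\!\left(X + \tfrac{d}{2}\right) \approx I_2(X) + \frac{d_x}{2} \frac{\partial I_2}{\partial x}(X) + \frac{d_y}{2} \frac{\partial I_2}{\partial y}(X) \tag{4}$$

$$I_1\!\left(X - \tfrac{d}{2}\right) \approx I_1(X) - \frac{d_x}{2} \frac{\partial I_1}{\partial x}(X) - \frac{d_y}{2} \frac{\partial I_1}{\partial y}(X) \tag{5}$$

Substitution of Equations (4) and (5) into Equation (3) leads to

$$\frac{\partial \varepsilon}{\partial d} \approx 2 \iint_W \left[ I_2(X) - I_1(X) + p^T(X)\, d \right] p(X)\, \omega(X)\, dX = 0 \tag{6}$$

where

$$p = \left[ \frac{\partial}{\partial x}\!\left( \frac{I_1 + I_2}{2} \right) \;\; \frac{\partial}{\partial y}\!\left( \frac{I_1 + I_2}{2} \right) \right]^T \tag{7}$$

The following equation can be obtained from Equation (6):

$$Z d = e \tag{8}$$

where $Z = \iint_W p(X)\, p^T(X)\, \omega(X)\, dX$ and $e = \iint_W \left[ I_1(X) - I_2(X) \right] p(X)\, \omega(X)\, dX$. Equation (8) can be solved by an iterative method to obtain the value of $d$. When the value of $e$ is less than the set threshold, the approximate solution of $d$ is obtained.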
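The linear solve behind Equation (8) can be illustrated with a minimal Python sketch. It assumes $\omega(X) = 1$, central-difference gradients, and a synthetic quadratic image (for which the symmetric linearization happens to be exact, so a single solve suffices). The names `klt_step` and `quadratic_image` and the test image are illustrative assumptions, not the MATLAB implementation used in this paper.

```python
def klt_step(img1, img2, cx, cy, half=2):
    """One solve of Z d = e over a (2*half+1)^2 window with weight w(X) = 1."""
    zxx = zxy = zyy = ex = ey = 0.0
    for y in range(cy - half, cy + half + 1):
        for x in range(cx - half, cx + half + 1):
            # p = gradient of the averaged image (I1 + I2) / 2 (central differences)
            px = 0.25 * (img1[y][x + 1] - img1[y][x - 1]
                         + img2[y][x + 1] - img2[y][x - 1])
            py = 0.25 * (img1[y + 1][x] - img1[y - 1][x]
                         + img2[y + 1][x] - img2[y - 1][x])
            diff = img1[y][x] - img2[y][x]  # I1(X) - I2(X)
            zxx += px * px
            zxy += px * py
            zyy += py * py
            ex += diff * px
            ey += diff * py
    det = zxx * zyy - zxy * zxy
    if abs(det) < 1e-12:
        raise ValueError("Z is singular: the window has too little texture")
    # Solve the 2x2 system Z d = e by Cramer's rule
    return (zyy * ex - zxy * ey) / det, (zxx * ey - zxy * ex) / det


def quadratic_image(size, dx=0.0, dy=0.0):
    """Hypothetical test image I(x - dx, y - dy), with I quadratic in x and y."""
    def intensity(x, y):
        return 0.5 * x * x + 0.3 * x * y + 0.8 * y * y + 2.0 * x + y
    return [[intensity(x - dx, y - dy) for x in range(size)] for y in range(size)]


img1 = quadratic_image(20)
img2 = quadratic_image(20, dx=0.4, dy=-0.25)  # content moved by d = (0.4, -0.25)
d = klt_step(img1, img2, cx=10, cy=10)
print(d)
```

On real images the linearization is only approximate, so the solve would be repeated on a re-sampled window until $e$ falls below the threshold, as described above.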
In this paper, the process of tracking the target point on the bridge model by the KLT optical-flow method is as follows:

1. Select the target point in the initial image;
2. Based on the local template of each target point, find the displacement vector of the point between adjacent frames [13];
3. Judge the tracking effect in each image to optimize the result for each target point.

KLT Optical-Flow Method under Pyramid
It can be seen from Section 2.1.2 that if the assumption of small motion is not met, the Taylor expansions of Equations (4) and (5) are no longer valid. The image pyramid [32] adopted by the KLT optical-flow method is used in this paper to solve this problem. Suppose the displacement range of the target point in an 800 × 800 image is 32 × 32 pixels. If the image is reduced to 400 × 400, the displacement range of the target point is reduced to 16 × 16 pixels; repeating the reduction eventually restores the assumption of small motion.
The KLT optical-flow method under the pyramid works as follows. For each image, the original image is taken as layer 0, and the image reduced by a factor of $2^L$ in length and width is taken as layer $L$. The images are stacked from bottom to top to generate the Gaussian pyramid shown in Figure 3. The displacement of the target point on the highest layer is calculated as in the previous section and is taken as the initial value of the optical-flow calculation on the next layer down, where it is refined into the accurate displacement of that layer. The refined value is passed down again, and so on to the lowest layer (layer 0), where the real displacement value is obtained.
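The coarse-to-fine propagation can be sketched in a few lines of Python. The function `track_level` below is a hypothetical stand-in for one KLT solve that can only recover residual motion up to `max_step` pixels (an assumed small-motion limit); the doubling rule for passing the estimate to the next finer layer is the usual pyramidal convention and is assumed here rather than taken from the paper.

```python
def track_level(true_d, guess, max_step=4.0):
    """Stand-in for one KLT solve: recovers the residual motion, clipped to max_step."""
    residual = true_d - guess
    return max(-max_step, min(max_step, residual))


def pyramid_track(true_d0, levels, max_step=4.0):
    """Track a displacement of true_d0 pixels (at layer 0) through `levels` coarser layers."""
    guess = 0.0
    for layer in range(levels, -1, -1):
        true_d = true_d0 / 2 ** layer            # displacement shrinks by 2^L on layer L
        d = track_level(true_d, guess, max_step)
        if layer == 0:
            return guess + d                     # displacement at full resolution
        guess = 2.0 * (guess + d)                # initial value for the next finer layer


print(pyramid_track(32.0, levels=3))  # coarse-to-fine: 32 px becomes 4 px on layer 3
print(pyramid_track(32.0, levels=0))  # single layer: limited by the small-motion bound
```

With three extra layers the 32-pixel motion of the example above fits within the assumed 4-pixel limit on the top layer, whereas a single-layer solve saturates at that limit.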

DIC Technology
The principle of the DIC method is shown in Figure 4: $I_2(x, y)$ is the deformed image of the image $I_1(x, y)$; $S_1$ centered on $(x, y)$ and $S_2$ centered on $(x', y')$ are two windows with the same size, $S$, which are established in $I_1(x, y)$ and $I_2(x, y)$, respectively, where $x' = x + \Delta x$ and $y' = y + \Delta y$. The correlation between the two windows, $C(\Delta x, \Delta y)$, is expressed by Equation (9) [33]. The real displacement of $(x, y)$ between the two images, $(\Delta x, \Delta y)$, is obtained when the value of $C(\Delta x, \Delta y)$ is the maximum.
Appl. Sci. 2022, 12, x FOR PEER REVIEW
A fixed-size window is used by both DIC and KLT optical-flow methods to search target points in the whole image range; however, the two methods are different in tracking target points. As shown in Equation (1), the target tracking of KLT optical-flow method is based on the residual error of a fixed-size window between two frames, while that of DIC is based on the correlation of a fixed-size window between two frames, as shown in Equation (9). Moreover, the implementation process of KLT optical-flow method is accompanied by the image-pyramid technology, which tracks the points at multiple levels of resolution of an image to optimize tracking effect to improve the tracking accuracy.
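To make the correlation search concrete, the following Python sketch matches a window by maximizing a correlation score over integer shifts. The exact form of Equation (9) is not reproduced in this text, so the sketch assumes a plain normalized cross-correlation; the helper names (`ncc`, `window`, `dic_match`) and the synthetic `pattern` texture are illustrative assumptions only.

```python
import math


def ncc(a, b):
    """Normalized cross-correlation of two equal-size windows (lists of rows)."""
    fa = [v for row in a for v in row]
    fb = [v for row in b for v in row]
    num = sum(x * y for x, y in zip(fa, fb))
    den = math.sqrt(sum(x * x for x in fa) * sum(y * y for y in fb))
    return num / den if den else 0.0


def window(img, cx, cy, half):
    """Square window of side 2*half+1 centered on (cx, cy)."""
    return [row[cx - half:cx + half + 1] for row in img[cy - half:cy + half + 1]]


def dic_match(img1, img2, cx, cy, half=2, search=3):
    """Integer shift (dx, dy) that maximizes the correlation with the reference window."""
    ref = window(img1, cx, cy, half)
    best = (-1.0, 0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            c = ncc(ref, window(img2, cx + dx, cy + dy, half))
            if c > best[0]:
                best = (c, dx, dy)
    return best[1], best[2]


def pattern(x, y):
    # Hypothetical speckle-like texture; any non-repeating pattern would do.
    return (3 * x * x + 5 * y * y + 7 * x * y) % 23 + 1


SIZE = 12
img1 = [[pattern(x, y) for x in range(SIZE)] for y in range(SIZE)]
img2 = [[pattern(x - 1, y - 2) for x in range(SIZE)] for y in range(SIZE)]  # moved by (1, 2)
shift = dic_match(img1, img2, cx=5, cy=5)
print(shift)
```

Practical DIC refines this integer search to sub-pixel resolution by interpolation, which is the step the introduction identifies as a source of error.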

Convolutional Neural Networks
The false displacement caused by UAV measurement can be removed by a CNN. As a feedforward neural network, each neuron of the CNN only extracts the local information of the input data, and the information are collected at a higher level of the network to obtain the global information [34]. The complexity of the model is reduced by weight sharing [35], which accelerates the computing speed and improves the calculation accuracy. The following is an introduction to the function layer involved in this paper.
The convolution layer is the core of the whole CNN, and the convolution process is shown in Figure 5. Suppose that there is a 2 × 2 convolution kernel; a sub-area with the same size as the kernel is selected in the input data. One element of the new matrix is obtained by multiplying each element of the sub-area by the corresponding element of the kernel and summing the products, and the full new matrix is generated by stepping the sub-area across all of the input data in the same way.

Figure 5. Convolution process: input data, convolution kernel, and convoluted data.

The activation-function layer enhances the feature-extraction ability of the network and its ability to approximate complex functions. The activation function used in this paper is Leaky ReLU (Equation (10)). The regression layer generally deals with regression problems; its loss function $J$ (Equation (11)) is defined in terms of $R$, the number of samples, $t_i$, the target output, and $y_i$, the prediction output. The value of $J$ decreases as the network is trained until it converges to the target value.
The network used in this paper is a deep convolutional neural network including an input layer, convolution layer, activation-function layer, full-connection layer, normalization layer, and output layer. The architecture of the CNN is shown in Figure 6.

Figure 6. Architecture of the CNN.
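The three building blocks described above can be sketched in plain Python. Equations (10) and (11) are not reproduced in this text, so the sketch assumes the standard Leaky ReLU with a slope of 0.01 for negative inputs and a mean-squared-error loss over the $R$ samples; both choices, and all function names, are assumptions for illustration rather than the paper's exact definitions.

```python
def conv2d_valid(data, kernel):
    """Slide the kernel over the input; each output element is a sum of products."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(len(data) - kh + 1):
        row = []
        for c in range(len(data[0]) - kw + 1):
            row.append(sum(data[r + i][c + j] * kernel[i][j]
                           for i in range(kh) for j in range(kw)))
        out.append(row)
    return out


def leaky_relu(x, scale=0.01):
    """Leaky ReLU: identity for x >= 0, small slope (assumed 0.01) for x < 0."""
    return x if x >= 0 else scale * x


def mse_loss(targets, predictions):
    """Assumed regression loss: mean of squared errors over the R samples."""
    return sum((t - y) ** 2 for t, y in zip(targets, predictions)) / len(targets)


feature_map = conv2d_valid([[1, 2, 3], [4, 5, 6], [7, 8, 9]], [[1, 0], [0, -1]])
activated = [[leaky_relu(v) for v in row] for row in feature_map]
print(feature_map, activated, mse_loss([1.0, 2.0], [1.0, 4.0]))
```

Because the same 2 × 2 kernel is reused at every position, the layer has only four weights regardless of input size, which is the weight sharing mentioned above.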

Operational Modal Analysis
In a multiple-degree-of-freedom system, the expression of the response transmissibility $T_{io}(\omega)$ [36] is

$$T_{io}(\omega) = \frac{X_i(\omega)}{X_o(\omega)} \tag{12}$$

where $X_i(\omega)$ and $X_o(\omega)$ are the Fourier transforms of $x_i(t)$ and $x_o(t)$, the time-history response signals at degrees of freedom $i$ and $o$. The expression of the power spectral density (PSD) transmissibility [37] is

$$\hat{T}_{io}(\omega) = \frac{S_{i,o}(\omega)}{S_{o,o}(\omega)} = \frac{X_i(\omega)\, X_o^*(\omega)}{X_o(\omega)\, X_o^*(\omega)} \tag{13}$$

where $S_{o,o}(\omega)$ is the self-power spectral density of $x_o(t)$, $S_{i,o}(\omega)$ is the cross-power spectral density of $x_i(t)$ and $x_o(t)$, and $X_o^*(\omega)$ is the complex conjugate of $X_o(\omega)$. In this paper, the displacement time-history curves obtained by the KLT optical-flow method and DIC are used to obtain the PSD curves. The natural frequencies of the structure are obtained from the peaks of the self-PSD curve, while the mode shape can be obtained from the ratios of the PSD transmissibility at the corresponding degrees of freedom.
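Peak picking on the self-PSD and the ratio in Equation (13) can be sketched as follows. The sketch uses an unnormalized periodogram as a simple PSD estimate and a synthetic damped sinusoid; the 3.3 Hz frequency, damping, record length, and the factor of 2 between the two signals are hypothetical test values, and only the 30 frames/s rate echoes the UAV acquisition frequency given in the text.

```python
import cmath
import math


def half_spectrum(x):
    """Direct DFT for the non-negative frequency bins (O(n^2), fine for short records)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n // 2 + 1)]


fs, n, f0 = 30.0, 600, 3.3  # sampling rate (frames/s), samples, test frequency (Hz)
x_o = [math.exp(-0.2 * t / fs) * math.sin(2 * math.pi * f0 * t / fs) for t in range(n)]
x_i = [2.0 * v for v in x_o]  # second degree of freedom: same motion, twice the amplitude

X_o, X_i = half_spectrum(x_o), half_spectrum(x_i)
S_oo = [(v * v.conjugate()).real for v in X_o]             # self-PSD estimate of x_o
k_peak = max(range(1, len(S_oo)), key=lambda k: S_oo[k])   # skip the DC bin
f_peak = k_peak * fs / n                                   # natural-frequency estimate

# PSD transmissibility at the peak: S_io / S_oo, as in Equation (13)
T_hat = (X_i[k_peak] * X_o[k_peak].conjugate()) / (X_o[k_peak] * X_o[k_peak].conjugate())
print(f_peak, T_hat)
```

The frequency resolution of such an estimate is `fs / n`, so longer records give finer frequency estimates; practical OMA would typically also average or window the spectra.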

Experimental Equipment
The bridge model used in the experiment is a spatial steel-frame structure with a total length of 9.8 m and 29 nodes (Figure 7a,b). Figure 7c … The acquisition frequency of the UAV is 30 frames/s and the image resolution is 3840 × 2160. In order to correct the measurement of the UAV, 5 reference points are selected on a cardboard placed at 45° to the plane of the model, and another 5 reference points are selected on the back wall. Hence, the reference points do not all lie in the same plane. The experimental layout is shown in Figure 9b.

Experimental Scheme
To validate the reliability of vibration measurement of the bridge model by UAV combined with CNN and KLT optical-flow method, the measurement results of the fixed camera are taken as a reference. Moreover, the measurement results of DIC are taken as a reference to validate whether KLT optical-flow method is feasible to replace DIC in bridge vibration measurement.
Firstly, the fixed camera and UAV are placed about 2 m away from the bridge model, and the UAV is used to shoot the static model (the shooting time is 8 min). The reference-point coordinates tracked by DIC and the KLT optical-flow method are used as the input of the CNN, and the target-point coordinates are used as the network output to train the network. Secondly, the bridge model is excited to vibrate, the fixed camera and UAV are used to shoot (the shooting time is 80 s for 5 excitations), and the points are tracked by DIC and the KLT optical-flow method again. The reference-point coordinates measured by the UAV are again taken as the input of the CNN, and the trained CNN outputs the false displacement of the target points caused by the UAV according to the characteristic relationship between the reference points and the target points. The real displacements of the target points measured by the UAV are obtained by subtracting this output from the uncorrected displacement of the target points. The natural frequency of the model is obtained by OMA from the time-history curves. The specific process is shown in Figure 1, and the process of correcting the displacement measured by a UAV through the CNN is shown in Figure 10. Finally, the tracking effects of DIC and the KLT optical-flow method are compared under the fixed-camera and UAV measurements to validate the feasibility of replacing DIC with the KLT optical-flow method, and the results of the fixed camera and UAV are compared under the KLT optical-flow method to validate the reliability of UAV data combined with the CNN and KLT optical-flow method. The DIC, KLT optical-flow method, and CNN are implemented in MATLAB (MathWorks Inc., Natick, MA, USA).
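The correction step itself is a per-frame subtraction, which can be sketched in Python as follows. The function name and the sample values are hypothetical; in the paper the false displacement would come from the trained CNN, which is not reproduced here.

```python
def correct_displacement(uncorrected, predicted_false):
    """Real displacement = uncorrected UAV displacement - false displacement, per frame."""
    if len(uncorrected) != len(predicted_false):
        raise ValueError("both series must cover the same frames")
    return [u - f for u, f in zip(uncorrected, predicted_false)]


# Hypothetical values: a drifting raw signal and the drift predicted from reference points
raw = [0.5, 2.0, 1.25]
false_displacement = [0.5, 1.0, 1.5]
real = correct_displacement(raw, false_displacement)
print(real)
```

The quality of the result depends entirely on how well the predicted false displacement matches the UAV's actual egomotion in each frame.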

Experimental Results and Analysis
The bridge model used in this paper has the largest displacement at node 15 in the middle of the span. Therefore, in order to better display the results, node 15 is selected as the target point in this paper (See Appendix A for the measurement results of node 5 and node 10).

Comparisons for DIC and KLT Method
The displacement results of the bridge model measured by fixed camera are showed in Figure 11, which demonstrates that the displacement time-history curves obtained by the DIC and KLT optical-flow method have very similar characteristics of free-vibration attenuation, and the maximum amplitude is about 5.7 mm. The ideal displacement curve x (t) can be obtained by smoothing the displacement time-history curve x(t) through locally weighted regression (LOWESS [38]). The difference between x(t) and x (t) is the error curve, and the average of the error curve reflects the smoothness of the displacement time-history curve measured by the DIC or KLT optical-flow method, which is defined as SDTH in this paper. The SDTHs of the DIC and KLT optical-flow method are 0.048 and 0.031, respectively. In addition, the displacement time-history curve obtained by DIC has obvious burrs, which illustrates its low measurement accuracy. In contrast, the displacement diagram obtained by the KLT optical-flow method is smoother, indicating that DIC is not as accurate as the KLT optical-flow method. As shown in Figure 12, it is the displacements of the static point on the model measured by the fixed camera and obtained by the DIC and KLT optical-flow method. Theoretically, the curve should be a straight line with displacement of 0, so that the standard deviation (STD) of the curve can reflect the accuracy of the two methods. The STDs of the DIC and KLT optical-flow method are 0.075 and 0.061, respectively, which indicate that the KLT optical-flow method is more accurate than DIC. The displacement signals in Figure 11 are processed into PSD curves. The peak of the PSD curve corresponds to the natural first frequency of the model, as shown in Figure 13. It can be seen from the figure that the natural frequencies extracted by the two methods are 3.281 Hz and 3.282 Hz, respectively, and the relative error is 0.03%. 
It shows that the As shown in Figure 12, it is the displacements of the static point on the model measured by the fixed camera and obtained by the DIC and KLT optical-flow method. Theoretically, the curve should be a straight line with displacement of 0, so that the standard deviation (STD) of the curve can reflect the accuracy of the two methods. The STDs of the DIC and KLT optical-flow method are 0.075 and 0.061, respectively, which indicate that the KLT optical-flow method is more accurate than DIC. As shown in Figure 12, it is the displacements of the static point on the model mea ured by the fixed camera and obtained by the DIC and KLT optical-flow method. The retically, the curve should be a straight line with displacement of 0, so that the standa deviation (STD) of the curve can reflect the accuracy of the two methods. The STDs of t DIC and KLT optical-flow method are 0.075 and 0.061, respectively, which indicate th the KLT optical-flow method is more accurate than DIC. The displacement signals in Figure 11 are processed into PSD curves. The peak of t PSD curve corresponds to the natural first frequency of the model, as shown in Figure 1 It can be seen from the figure that the natural frequencies extracted by the two metho are 3.281 Hz and 3.282 Hz, respectively, and the relative error is 0.03%. It shows that t The displacement signals in Figure 11 are processed into PSD curves. The peak of the PSD curve corresponds to the natural first frequency of the model, as shown in Figure 13. It can be seen from the figure that the natural frequencies extracted by the two methods are 3.281 Hz and 3.282 Hz, respectively, and the relative error is 0.03%. It shows that the KLT optical-flow method is feasible to replace DIC to extract the natural frequency of the structure under fixed camera.

Comparisons of UAV Measurement
The uncorrected displacement measured by the UAV and the false displacement predicted by the CNN are shown in Figure 14. The figure shows that the uncorrected displacement curves measured by the UAV drift severely.
The true displacements of the structure, shown in Figure 15, are obtained by subtracting the false displacement of the target points predicted by the CNN from the uncorrected displacement of the target points. The corrected displacement curve obtained by DIC drifts more than that obtained by the KLT optical-flow method. In addition, Figure 15 shows that each wave of the displacement time-history curves fluctuates about Displacement = 0; the greater the drift of the curve, the farther the mean line of each wave is from Displacement = 0. The degree of drift of the displacement time-history curves can be represented by the root mean square (RMS [39]). The RMSs of DIC and the KLT optical-flow method are 0.063 and 0.025, respectively, which illustrates the lower tracking accuracy of DIC and indicates the stability of the KLT optical-flow method.
The displacement signals in Figure 15 are processed into PSD curves, as shown in Figure 16. The natural frequencies extracted from the two displacement signals are 3.285 Hz and 3.286 Hz, respectively, and the relative error is 0.04%. This shows that the KLT optical-flow method can feasibly replace DIC for extracting the natural frequency of the structure under UAV measurement.
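The correction-by-subtraction step and the frequency extraction can be sketched as follows. This is an illustration under stated assumptions rather than the processing chain used in the paper: the PSD is estimated with a plain periodogram computed by a direct DFT, the CNN output is replaced by a known low-frequency drift, and all names and signal parameters (`fs`, `f0`, the damping and drift terms) are hypothetical.

```python
import cmath
import math


def psd_peak_frequency(x, fs):
    """Return the frequency (Hz) of the largest periodogram peak, excluding DC."""
    n = len(x)
    mean = sum(x) / n
    xc = [v - mean for v in x]  # remove the mean so the DC bin does not dominate
    best_k, best_power = 1, -1.0
    for k in range(1, n // 2 + 1):
        # direct DFT of bin k (O(n) per bin, acceptable for short records)
        coeff = sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(xc))
        power = abs(coeff) ** 2 / (fs * n)  # periodogram estimate of the PSD
        if power > best_power:
            best_k, best_power = k, power
    return best_k * fs / n


def relative_error(f_a, f_b):
    """Relative difference of f_b with respect to f_a, in percent."""
    return abs(f_b - f_a) / f_a * 100.0


if __name__ == "__main__":
    fs, n, f0 = 50.0, 500, 3.28  # hypothetical frame rate, length, frequency
    t = [i / fs for i in range(n)]
    true_disp = [5.7 * math.exp(-0.3 * ti) * math.sin(2 * math.pi * f0 * ti) for ti in t]
    # Stand-in for the false displacement that the CNN would predict
    false_disp = [2.0 * math.sin(2 * math.pi * 0.2 * ti) for ti in t]
    raw = [a + b for a, b in zip(true_disp, false_disp)]   # uncorrected UAV track
    corrected = [r - f for r, f in zip(raw, false_disp)]   # subtraction step
    print("peak frequency (Hz):", psd_peak_frequency(corrected, fs))
```

The frequency resolution of this estimate is `fs / n`, so a record of 500 frames at 50 Hz resolves the peak only to within 0.1 Hz; longer records or peak interpolation would be needed to distinguish frequencies that differ in the third decimal place.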

Comparisons between Fixed Camera and UAV
The measurement results of the fixed camera and the UAV, both processed by the KLT optical-flow method, are compared in Figure 17, which demonstrates that the corrected displacement curve measured by the UAV and the displacement curve measured by the fixed camera have very similar free-vibration characteristics. The correction effect of the CNN can be judged by calculating the time response assurance criterion (TRAC) between the corrected displacement measured by the UAV and the displacement measured by the fixed camera [10]. The higher the TRAC, the higher the degree of consistency and the better the correction effect. The TRAC between the displacement measured by the UAV and that measured by the fixed camera is 0.996, which shows their high consistency. Combined with the results of Section 4.2, this indicates that the displacement of the bridge model measured by the UAV combined with the CNN and KLT optical-flow method is reliable.
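A minimal sketch of the TRAC computation follows. It assumes the commonly used definition, TRAC = (a·b)² / ((a·a)(b·b)) for two displacement histories a and b sampled at the same instants; the paper defers the definition to [10], so the exact form used there is an assumption here, and the function name and synthetic signals are hypothetical.

```python
import math


def trac(x_ref, x_test):
    """Time response assurance criterion between two equally sampled signals.

    Returns a value in [0, 1]; 1 means the signals are proportional.
    """
    if len(x_ref) != len(x_test):
        raise ValueError("signals must have the same length")
    cross = sum(a * b for a, b in zip(x_ref, x_test))
    energy_ref = sum(a * a for a in x_ref)
    energy_test = sum(b * b for b in x_test)
    return cross * cross / (energy_ref * energy_test)


if __name__ == "__main__":
    fs = 50.0  # hypothetical frame rate in Hz
    t = [i / fs for i in range(500)]
    fixed = [5.7 * math.exp(-0.3 * ti) * math.sin(2 * math.pi * 3.28 * ti) for ti in t]
    # Corrected UAV track modelled as the fixed-camera track plus a small residual drift
    uav = [f + 0.1 * math.sin(2 * math.pi * 0.2 * ti) for f, ti in zip(fixed, t)]
    print("TRAC:", trac(fixed, uav))
```

Because TRAC is invariant to a constant scale factor, it measures agreement in shape rather than in amplitude; the two signals must also be time-aligned and resampled to a common rate before the comparison.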

Discussion of the Proposed Method
The experimental results show that the displacement of the bridge model measured by a UAV combined with the CNN and KLT optical-flow method is reliable. The displacement signals obtained by the KLT optical-flow method under both the fixed camera and the UAV are more stable than those obtained by DIC, and the structural natural frequency extracted by the KLT optical-flow method is basically consistent with that extracted by DIC, which shows the feasibility of replacing DIC with the KLT optical-flow method in bridge vibration measurement. The difference in tracking performance between the two methods is mainly caused by the image pyramid of the KLT optical-flow method, which tracks target points at different resolution levels of an image to improve tracking accuracy. With the KLT optical-flow method, the displacement curve measured by the UAV combined with the CNN is very close to that measured by the fixed camera, and the natural frequencies obtained in the two ways are basically the same, which shows the feasibility of the CNN-based method proposed in this paper for correcting displacements measured by a UAV. In the experiment, the 10 reference points lie on two different planes (a cardboard and a wall), which shows that, unlike other correction methods, the correction method of this paper does not require the reference points to lie on the same plane. That is one of the highlights of the proposed method. Moreover, to satisfy the constant-brightness assumption of the KLT optical-flow method, the measurement can be carried out on cloudy days or when the lighting conditions are stable. Since the measurement duration of the proposed method is about 10 min, the measurement time needs to be determined according to the actual situation.
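The image-pyramid idea mentioned above can be illustrated with a short sketch. It is a simplification and not the KLT implementation used in the paper: practical pyramids apply Gaussian smoothing before subsampling, whereas this example averages 2x2 blocks, and the function names are hypothetical. The point is that a motion of d pixels at full resolution becomes d / 2^L pixels at level L, so a large motion is first estimated at a coarse level, where it is small, and then refined at finer levels.

```python
def downsample(image):
    """Halve the resolution by averaging non-overlapping 2x2 blocks."""
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def build_pyramid(image, levels):
    """Return [level 0 (full resolution), level 1, ...], coarsest level last."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


def point_at_level(x, y, level):
    """Coordinates of a full-resolution point at a coarser pyramid level."""
    scale = 2 ** level
    return x / scale, y / scale


if __name__ == "__main__":
    image = [[float(4 * r + c) for c in range(4)] for r in range(4)]
    for level, img in enumerate(build_pyramid(image, 3)):
        print("level", level, "size", len(img), "x", len(img[0]))
    print("point (8, 4) at level 2:", point_at_level(8, 4, 2))
```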

Follow-Up Study
Although impressive results were obtained in this paper, some problems still need follow-up research:
1. The reference point is an important part of the correction method proposed in this paper, so its influence on the correction effect will be studied in follow-up work.
2. Many factors may affect the accuracy of the proposed method in actual bridge measurement, such as the wind conditions and the actual measurement distance; therefore, the method in this paper will be verified against actual bridge-measurement results in follow-up work.

Conclusions
In this paper, a method for measuring bridge vibration by UAVs based on a CNN and the KLT optical-flow method is proposed. The effectiveness of the UAV measurement results corrected by the CNN is validated by comparing them with the measurement results of a fixed camera, and the measurement results of DIC are taken as a reference to show that the KLT optical-flow method can feasibly replace DIC in bridge vibration measurement.
Because most actual bridges cross rivers or valleys, which makes measurement inconvenient, the method introduced in this paper has great prospects for measuring bridge vibration in terms of feasibility and accuracy.

Data Availability Statement:
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
In order to verify that the conclusions of this paper are valid in the case of large, medium and small displacement, the measurement results of node 5 and node 10 are shown in the figures below.