An Integrated Machine Learning Algorithm for Separating the Long-Term Deflection Data of Prestressed Concrete Bridges

Deflection is one of the key indexes for the safety evaluation of bridge structures. In reality, due to the changing operational and environmental conditions, the deflection signals measured by structural health monitoring systems are greatly affected. These ambient changes in the system often cover subtle changes in the vibration signals caused by damage to the system. The deflection signals of prestressed concrete (PC) bridges are regarded as the superposition of different effects, including concrete shrinkage, creep, prestress loss, material deterioration, temperature effects, and live load effects. According to multiscale analysis theory of the long-term deflection signal, in this paper, an integrated machine learning algorithm that combines a Butterworth filter, ensemble empirical mode decomposition (EEMD), principle component analysis (PCA), and fast independent component analysis (FastICA) is proposed for separating the individual deflection components from a measured single channel deflection signal. The proposed algorithm consists of four stages: (1) the live load effect, which is a high-frequency signal, is separated from the raw signal by a Butterworth filter; (2) the EEMD algorithm is used to extract the intrinsic mode function (IMF) components; (3) these IMFs are utilized as input in the PCA model and some uncorrelated and dominant basis components are extracted; and (4) FastICA is applied to derive the independent deflection component. The simulated results show that each individual deflection component can be successfully separated when the noise level is under 10%. Verified by a practical application, the algorithm is feasible for extracting the structural deflection (including concrete shrinkage, creep, and prestress loss) only caused by structural damage or material deterioration.


Introduction
Due to their advantages of simple calculations, convenient construction and low maintenance costs, long-span prestressed concrete (PC) bridges with 100-300 m spans have been widely designed and built around the world [1]. In recent years, many long-span PC bridges, especially those whose spans are more than 200 m, have suffered from excessive midspan deflection, which can affect their safety during operation and restrict the development of PC bridges with larger spans [2][3][4]. The Koror-Babeldaob Bridge, with a record span of 241 m (791 ft), experienced one of the most famous bridge collapse disasters due to excessive midspan deflection. Built in 1977, this bridge developed a midspan deflection of 1.61 m (5.3 ft) after 18 years' operation and eventually collapsed in 1996 [5,6]. Deflection, which accurately reflects the structural performance under various loadings Also, the key parameters in some of these algorithms are sensitive to the results, which need to be optimized to get better results.
To characterize the influences of the individual load effect on the long-term deflection data, this paper presents a single-channel blind separation algorithm for separating the deflection component of live load effects, temperature effects, and structural deflection (including concrete shrinkage, creep, and prestress loss). The proposed algorithm is an integrated machine learning algorithm that combines a Butterworth filter, ensemble empirical mode decomposition (EEMD), PCA, and fast independent component analysis (FastICA). It consists of four stages: (1) The live load effect, which is a high-frequency signal, is separated from the raw signal by a Butterworth filter. (2) The EEMD algorithm is used to extract the intrinsic mode function (IMF) components. (3) These IMFs are then utilized as the inputs of the PCA model, and some uncorrelated and dominant basis components are extracted. (4) Finally, FastICA is applied to derive the independent component, that is, the deflection caused by the individual load effect. For this proposed algorithm, we only need to define the standard deviation and the ensemble number for EEMD and component scores for PCA. The standard deviation and the ensemble number are usually defined as 0.1-0.4 and 50-200, respectively. The general requirements of the component score are set as 95%. These three parameters are not sensitive to the results.

Multiscale Characteristics of Long-Term Deflection Data
Deflection caused by prestress loss, shrinkage, and creep of concrete is an irreversible trend and will change only after a few months. However, the deflection caused by live loads will change several times in one second. Therefore, the recurrence periods of deflection caused by different effects vary. Due to the multiscale characteristics of different effects, the deflection component of an individual load effect can be determined. As mentioned before, the long-term deflection data (D) obtained from the bridge health monitoring system can be regarded as the superposition of concrete shrinkage, creep, prestress loss, material deterioration, temperature effects, and live load effects: where D L , D T , D C , D s , D P , D D , and D n are the deflections caused by live loads, temperature, creep, shrinkage, prestress loss, dead loads, and noise, respectively. Furthermore, D C , D s , D P , and D D compose the structural deflection D v, which is an irreversible index that can be used to evaluate the condition of a bridge. D v can be expressed as: align symbols or replace with text versions As shown in Table 1, for live loads with deflection (D L ), there are hundreds of cycles in one second, which is a high frequency. The temperature deflection (D T ) consists of a daily temperature effect (D T1 ) and an annual temperature effect (D T2 ). Their recurrence periods are 1 day (24 h) and 1 year (8760 h), respectively. They have a medium and low frequency, respectively. For structural deflection (D v ), which will increase each year, the frequency can be considered 0. The amplitude of the live load deflection and temperature deflection is significant in comparison with the yearly increment caused by irreversible structural deflection D v .
Consequently, the long-term deflection D is written as: Since D represents composite data, in order to accurately evaluate the condition of a bridge, an evaluation system for the individual deflection component must be established. Therefore, the separation of the long-term deflection is the prerequisite.  Figure 1 shows the flowchart and algorithms used in the proposed method. Filtering technology has been widely used in the separation of signals with large time-scale differences, demonstrating a good effect. For the proposed algorithm, in the first step, a Butterworth filter [32] is used to separate the deflection of the live load effect from the long-term deflection. As the daily temperature deflection, annual temperature deflection, and structural deflection belong to a low-frequency domain, there will be a mode mixing problem, so it is difficult to separate the three components by filtering technology.

An Integrated Machine Learning Algorithm
In the second step, EEMD is used to decompose an observed mixed signal into a collection of IMFs. Although the IMF is a complete, adaptive, and basically orthogonal representation of a signal, EEMD has the characteristics of a binary filter bank structure, similar to dyadic discrete wavelet decomposition. Therefore, the bandwidths of the first few decomposed IMFs are too large, and the IMFs are not completely suitable for using the Hilbert transform to directly obtain an accurate instantaneous frequency spectrum [33,34]. In the third step, these IMFs are utilized as inputs in the PCA model, and some uncorrelated and dominant basis components (U i ) are extracted. The U i extracted by the PCA algorithm are uncorrelated, but they are not statistically independent. In the last step, a further procedure called FastICA needs to be performed to derive the independent deflection components.  Figure 1 shows the flowchart and algorithms used in the proposed method. Filtering technology has been widely used in the separation of signals with large time-scale differences, demonstrating a good effect. For the proposed algorithm, in the first step, a Butterworth filter [32] is used to separate the deflection of the live load effect from the long-term deflection. As the daily temperature deflection, annual temperature deflection, and structural deflection belong to a low-frequency domain, there will be a mode mixing problem, so it is difficult to separate the three components by filtering technology.

An Integrated Machine Learning Algorithm
In the second step, EEMD is used to decompose an observed mixed signal into a collection of IMFs. Although the IMF is a complete, adaptive, and basically orthogonal representation of a signal, EEMD has the characteristics of a binary filter bank structure, similar to dyadic discrete wavelet decomposition. Therefore, the bandwidths of the first few decomposed IMFs are too large, and the IMFs are not completely suitable for using the Hilbert transform to directly obtain an accurate instantaneous frequency spectrum [33,34]. In the third step, these IMFs are utilized as inputs in the PCA model, and some uncorrelated and dominant basis components (Ui) are extracted. The Ui extracted by the PCA algorithm are uncorrelated, but they are not statistically independent. In the last step, a further procedure called FastICA needs to be performed to derive the independent deflection components.

EEMD
As an adaptive signal processing algorithm for nonlinear and nonstationary signals, EMD expresses the signal as the sum of a series of signal components with amplitude-modulated and frequency-modulated parameters, which can reveal the overlap of time and frequency components. To solve the mode mixing problem, a noise-aided data analysis method called EEMD [35] is proposed, which is an extension of the empirical mode decomposition (EMD) [36]. The decomposition principle of EEMD is that, when the signal is added with normally distributed white noise, the signals of different scales are automatically mapped to appropriate scales related to white noise. Different series of normal distributions of Gaussian white noise are added to the signal in several trials. In each trial, added white noise that is different from the decomposed IMFs will show

EEMD
As an adaptive signal processing algorithm for nonlinear and nonstationary signals, EMD expresses the signal as the sum of a series of signal components with amplitude-modulated and frequency-modulated parameters, which can reveal the overlap of time and frequency components. To solve the mode mixing problem, a noise-aided data analysis method called EEMD [35] is proposed, which is an extension of the empirical mode decomposition (EMD) [36]. The decomposition principle of EEMD is that, when the signal is added with normally distributed white noise, the signals of different scales are automatically mapped to appropriate scales related to white noise. Different series of normal distributions of Gaussian white noise are added to the signal in several trials. In each trial, added white noise that is different from the decomposed IMFs will show no correlation with the corresponding IMFs from one test to another. If the number of trials (ensemble number) is sufficient, the added noise can be eliminated by averaging all the IMFs produced in each trial [37]. The decomposition process of EEMD is as follows.
(1) Add a white-noise series to the targeted signal.
x m (t) = x(t) + n m (t) (4) where x(t) is the mixed signal to be analyzed. n m (t) is the added white noise, which has a mean of zero and a standard deviation of σ, which is in the range of 0.1 to 0.4. The index m is the ensemble number with a preset maximum number (m e ), which starts from 1. (2) Decompose the signal x m (t) into a series of IMFs, c i,m , (i = 1, 2 · · · N − 1), and a residual, r m , using EMD. We can obtain: where N − 1 is the total number of IMFs produced in each EMD decomposition.
x m e (t) = (4) The final IMFs are obtained by overall averaging the IMFs produced in each trial.
Therefore, a signal x(t) with multiple components is represented as the sum of some IMFs and a residue:

PCA
After the signal is decomposed with the EEMD, R IMF components can be obtained. An observation matrix Y R×L is formed as: where R is number of IMFs, L is length of data, and T is the transpose operator. The observation matrix Y R×L is then used as the input of the PCA model to find uncorrelated dominant basis components. PCA was developed by Pearson in 1901 [38] as an analogue of the principal axis theorem in mechanics. It was later independently developed and named by Hotelling in the 1930s [39]. In statistics, PCA is a linear transformation method for feature extraction and dimensionality reduction. The principle of PCA is to transform the original correlated random vectors into new noncorrelated random vectors by means of an orthogonal transformation. In geometry, the transform procedure can be expressed as the original coordinate system being transformed into a new orthogonal coordinate system, which points in P orthogonal directions where the sample points scatter most openly. Then, a small number of eigenvectors are extracted from it so that these eigenvectors can generalize the most original data information.
In algebra, the transform procedure can be expressed as the covariance matrix of the original random vector being transformed into a diagonal matrix. PCA can be performed by a SVD, usually after a normalization step of the initial data. The SVD of Y L×R is a factorization of the form Y L×R = U L×L ∑ L×R V T R×R , where U and V are orthogonal matrices (with orthogonal columns) and ∑ is a matrix of r singular values λ r = λ r×r , where λ 1 ≥ λ 2 ≥ · · · ≥ λ r ≥ 0. The contribution of each vector is ranked based on the magnitude of its corresponding singular value. The result of the PCA is usually presented by the component score β i for each IMF: The component score β i refers to the proportion of the variance that each eigenvector represents and can be calculated by dividing the singular value corresponding to that eigenvector by the sum of all singular values. Usually, a component score is required to reach 85% so that the vectors can reflect most of the key information of the original data. More details of PCA can be found in References [40,41].

FastICA
A set of n (n ≤ R) uncorrelated basis IMFs are obtained after applying PCA. The components U i extracted by the PCA algorithm are uncorrelated, but they are not statistically independent. A further procedure, i.e., ICA, is needed to derive the independent deflection components. ICA [42] is a blind source separation technique that extracts statistically independent components from a set of recorded signals.
Suppose that n independent source signals are received by m sensors and that the source signals are randomly mixed through an unknown mixing system to form observation signals. The ICA model can be expressed as: where X = (x 1 , x 2 , · · · , x m ) T represents the observed mixed signal vectors, S = (s 1 , s 2 , · · · , s n ) T is the unknown source signal vectors, and A = a ij m×n is an m× n dimensional mixed matrix (m ≤ n).
As the source signal S(t) and mixed matrix A are unknown, the goal of ICA is to find a separation matrix W so that the output signal Y(t) can be obtained. Y(t) is the best estimation of source signal S(t), and the components of the output signal Y(t) are independent of one another. The ICA algorithm is shown in Figure 2.
scatter most openly. Then, a small number of eigenvectors are extracted from it so that these eigenvectors can generalize the most original data information.
In algebra, the transform procedure can be expressed as the covariance matrix of the original random vector being transformed into a diagonal matrix. PCA can be performed by a SVD, usually after a normalization step of the initial data. The SVD of YL×R is a factorization of the form where U and V are orthogonal matrices (with orthogonal columns) and  is a matrix of r singular values r r r The contribution of each vector is ranked based on the magnitude of its corresponding singular value. The result of the PCA is usually presented by the component score i β for each IMF: The component score refers to the proportion of the variance that each eigenvector represents and can be calculated by dividing the singular value corresponding to that eigenvector by the sum of all singular values. Usually, a component score is required to reach 85% so that the vectors can reflect most of the key information of the original data. More details of PCA can be found in References [40,41].

FastICA
A set of n (n ≤ R) uncorrelated basis IMFs are obtained after applying PCA. The components Ui extracted by the PCA algorithm are uncorrelated, but they are not statistically independent. A further procedure, i.e., ICA, is needed to derive the independent deflection components. ICA [42] is a blind source separation technique that extracts statistically independent components from a set of recorded signals.
Suppose that n independent source signals are received by m sensors and that the source signals are randomly mixed through an unknown mixing system to form observation signals. The ICA model can be expressed as: where ( ) 1 2 , , , As the source signal ( ) S t and mixed matrix A are unknown, the goal of ICA is to find a separation matrix W so that the output signal ( ) Y t can be obtained. ( ) Y t is the best estimation of source signal ( ) S t , and the components of the output signal ( ) Y t are independent of one another.
The ICA algorithm is shown in Figure 2. The ICA algorithm used in this study is FastICA. The FastICA algorithm (also known as fixedpoint ICA) is widely used in signal processing because of its fast convergence speed and good The ICA algorithm used in this study is FastICA. The FastICA algorithm (also known as fixed-point ICA) is widely used in signal processing because of its fast convergence speed and good separation effect. The algorithm can estimate the source signals that are independent of one another and mixed by unknown factors from the observed signals. Unlike an ordinary neural network algorithm, FastICA uses batch processing. This means that a large number of sample data participate in each iteration and can be used by applying the fourth-order cumulants, which are the maximum likelihood-based and the maximum entropy-based forms. In addition, this algorithm adopts a fixed-point iterative optimization algorithm, which makes convergence faster and more robust [43,44].

Numerical Simulation
The Hanxi Bridge is a 100 m× 160 m× 100 m prestressed concrete box-girder bridge. The bridge was opened to traffic in 2012 and has been in operation for more than five years. To monitor the safety of the bridge structure, a structural health monitoring system-namely, a liquid level sensing system-was deployed in this bridge to monitor the variations of the girder deflection and ambient temperature in real time. More information on the Hanxi Bridge can be found in Reference [45]. Based on the finite element model developed in Midas/Civil, as shown in Figure 3 separation effect. The algorithm can estimate the source signals that are independent of one another and mixed by unknown factors from the observed signals. Unlike an ordinary neural network algorithm, FastICA uses batch processing. This means that a large number of sample data participate in each iteration and can be used by applying the fourth-order cumulants, which are the maximum likelihood-based and the maximum entropy-based forms. In addition, this algorithm adopts a fixedpoint iterative optimization algorithm, which makes convergence faster and more robust [43,44].

Numerical Simulation
The Hanxi Bridge is a 100 m× 160 m× 100 m prestressed concrete box-girder bridge. The bridge was opened to traffic in 2012 and has been in operation for more than five years. To monitor the safety of the bridge structure, a structural health monitoring system-namely, a liquid level sensing system-was deployed in this bridge to monitor the variations of the girder deflection and ambient temperature in real time. More information on the Hanxi Bridge can be found in Reference [45]. Based on the finite element model developed in Midas/Civil, as shown in Figure 3

Live Load Effect
The grade-I lane load of a highway is the designed vehicle load of this bridge. As shown in  Based on the survey of the traffic conditions in Guangzhou City, the fatigue-loaded vehicle model was set up based on the simplified load frequency value spectrum by Wang et al. [47]. The simplified fatigue-load vehicle model, which might cause structural damage, consists of two types of vehicles, as shown in Figure 5. Model M1 is a two-axle vehicle model with a distance of 4.4 m and a total weight of 110 kN, accounting for 16.34% of the total traffic volume. Model M2 is a three-axle vehicle model with a wheelbase of 3.4 and 4.92 m and a total vehicle weight of 167 kN, accounting for 0.03% of the total traffic volume. Therefore, Model M1 is used to calculate the deflection of the live loads in Midas/Civil software.

Live Load Effect
The grade-I lane load of a highway is the designed vehicle load of this bridge. As shown in Figure 4, according to General Specifications for Design of Highway Bridges and Culverts (GSDHBC) [46], the grade-I lane load of a highway (which is used for the overall calculation), consists of a concentrated force P k and a distributed force q k . However, in order to ensure the safety of the bridge, the grade-I lane load is designed and simplified as the standard vehicle load according to the static strength design (i.e., the dynamic effect is simplified to the static effect) by the influence line loading method. The grade-I lane load is not the actual vehicle load model of the bridge in operation. The simulated signal of live loads deflection, which is a multi-period signal, is collected at a sampling frequency of 4000 Hz. Figure 6a shows the time history of the vehicle load deflection in the midspan of span 2# when Model M1 passes through the bridge at a speed of 60 km/h. The maximum deflection is −5.3 mm. The result of fast Fourier transformation (FFT) is shown in Figure 6b. It is evident that the main frequencies of live loads deflection are greater than 200 Hz, which belongs to the higher-frequency domain. Therefore, the live load deflection effect can be separated by a Butterworth filter.   The simulated signal of live loads deflection, which is a multi-period signal, is collected at a sampling frequency of 4000 Hz. Figure 6a shows the time history of the vehicle load deflection in the midspan of span 2# when Model M 1 passes through the bridge at a speed of 60 km/h. The maximum deflection is −5.3 mm. The result of fast Fourier transformation (FFT) is shown in Figure 6b. It is evident that the main frequencies of live loads deflection are greater than 200 Hz, which belongs to the higher-frequency domain. Therefore, the live load deflection effect can be separated by a Butterworth filter. The simulated signal of live loads deflection, which is a multi-period signal, is collected at a sampling frequency of 4000 Hz. Figure 6a shows the time history of the vehicle load deflection in the midspan of span 2# when Model M1 passes through the bridge at a speed of 60 km/h. The maximum deflection is −5.3 mm. The result of fast Fourier transformation (FFT) is shown in Figure 6b. It is evident that the main frequencies of live loads deflection are greater than 200 Hz, which belongs to the higher-frequency domain. Therefore, the live load deflection effect can be separated by a Butterworth filter.

Temperature Effect
According to GSDHBC [46], the deflection of the temperature effect (D T ) consists of a daily temperature effect deflection (D T1 ) and annual temperature effect deflection (D T2 ). D T1 consists of a daily global temperature effect deflection (D T1−1 ) and daily gradient temperature effect deflection (D T1−2 ). In Guangdong Province, China, the temperature does not vary greatly between day and night. In this paper, the temperature difference between day and night is set as ∆ d1 = 10 • C. For the calculation of deflections related to the gradient temperature effect, an equivalent linear temperature profile can be assumed [48]. The temperature difference between the top and bottom surfaces of the bridge girders is set as 4 • C. D T2 is a uniform temperature effect. When calculating the deflection of a bridge structure caused by an annual temperature effect, the temperature of the bridge closure should be set as the reference point to obtain the effective temperature effect at the highest and lowest temperature. The temperature difference of the annual temperature effect is set at ∆ d2 = 40 • C( −20 • C ∼ +20 • C).Therefore, the temperature effect (D T ) can be expressed using Equation (13): Midas software is used to calculate the deflection of the temperature effect. The results show that the midspan deflection of the main span develops d 1 (+1.48 mm) with a global temperature increase of 1 • C but develops d 2 (−2.92 mm) with an increase in the gradient temperature of 1 • C. It is assumed that there is a linear relationship between the temperature and the deflection of the bridge structure considering the sinusoidal variation. According to Equation (13), the deflection time history of temperature effect deflection can be obtained at a sampling frequency of 1 time/hour, as shown in Figure 7a. The sampling frequency is regarded as 1 Hz. For FFT, the frequency of the daily temperature and annual temperature effects are 0.04 and 0 Hz, respectively. In this paper, the temperature difference between day and night is set as 1 10 C d Δ =°. For the calculation of deflections related to the gradient temperature effect, an equivalent linear temperature profile can be assumed [48]. The temperature difference between the top and bottom surfaces of the bridge girders is set as 4 °C.
2 T D is a uniform temperature effect. When calculating the deflection of a bridge structure caused by an annual temperature effect, the temperature of the bridge closure should be set as the reference point to obtain the effective temperature effect at the highest and lowest temperature. The temperature difference of the annual temperature effect is set at Therefore, the temperature effect ( T D ) can be expressed using Equation (13):  Midas software is used to calculate the deflection of the temperature effect. The results show that the midspan deflection of the main span develops d1 (+1.48 mm) with a global temperature increase of 1 °C but develops d2 (−2.92 mm) with an increase in the gradient temperature of 1 °C. It is assumed that there is a linear relationship between the temperature and the deflection of the bridge structure considering the sinusoidal variation. According to Equation (13), the deflection time history of temperature effect deflection can be obtained at a sampling frequency of 1 time/hour, as shown in Figure 7a. The sampling frequency is regarded as 1 Hz. For FFT, the frequency of the daily temperature and annual temperature effects are 0.04 and 0 Hz, respectively.

Effect of Concrete Shrinkage and Creep
The effect of concrete shrinkage and creep, which will be increased by a year, can be considered to have a frequency of 0. In the Code for the Design of Highway Reinforced Concrete and Prestressed Concrete Bridges and Culverts (CDHRCPCBC) [49], the model of CEB-FIP (90) is applied to calculate the effect of concrete shrinkage and creep. The creep coefficient can be expressed as: where φ 0 is the nominal creep coefficient, which is related to the loading age of concrete, the relative humidity, and the theoretical thickness of components. β H is a coefficient related to the relative humidity and theoretical thickness of components. t and t 0 denote the calculating age and loading age of the concrete, respectively. t 1 equals 1 day. The shrinkage strain can be expressed as: where ε cs0 is the nominal shrinkage coefficient, which is related to the concrete strength and annual mean relative humidity; h and h 0 are the theoretical thickness of components and 100 mm, respectively; and t s is the age of concrete at the beginning of shrinkage. As shown in Figure 8, the bridge develops a deflection of −20.14 mm caused by shrinkage and creep over three years. According to the deflection curve, the expression of fitting curve is obtained by an exponential function, shown as Equation (16):

Effect of Prestress Loss
The results of [50] show that, from the beginning of the operation, a prestress loss of a long-span PC box girder bridge can reach up to 16% in eight years. In this paper, it is assumed that the longitudinal prestress loss of the Hanxi Bridge is 9% in three years, which causes a midspan deflection of −27.8 mm. The deflection caused by prestress loss at any time is obtained by linear interpolation, which is shown in Figure 9.

Total Deflection
As shown in Figure 6, the live load deflection effect signal belongs to the higher-frequency

Total Deflection
As shown in Figure 6, the live load deflection effect signal belongs to the higher-frequency domain and can be separated by a fourth-order Butterworth filter with a 0-2 Hz cutoff frequency. For a signal with a low frequency domain, in order to verify the validity of the proposed algorithm under different noise levels, test noise D n is added to the noise-free signal to obtain the source signal. The sum of temperature effect, shrinkage and creep effect, and prestress loss effect, which is a signal with a low frequency domain, is regarded as the reference noise-free signal to verify the proposed algorithm. In order to verify the validity of the proposed algorithm under different noise levels, test noise D n is added to the noise-free signal to obtain the source signal. The signal vectorsŝ(i) and x(i) are established as: The signal-to-noise ratio (SNR) can also be estimated as: where P S and P N represent the effective power of the noise-free signal and noise, respectively. x(i) andŝ(i) represent the signal vectors of the source signal data and the reference noise-free signal data, respectively.

Procedure of Separation
Due to space limitations, only the signal separation process with a 10% noise level is given here, as shown in Figure 10. At the first stage, we apply the EEMD algorithm on the source single channel signal. Two critical parameters need to be prescribed before the EEMD is applied to a signal. In this case, the standard deviation (σ) and the ensemble number (m e ) are defined as 0.4 and 100, respectively. As a result, we obtained 10 IMFs and a residue (as shown in Figure 11).

Total Deflection
As shown in Figure 6, the live load deflection effect signal belongs to the higher-frequency domain and can be separated by a fourth-order Butterworth filter with a 0-2 Hz cutoff frequency. For a signal with a low frequency domain, in order to verify the validity of the proposed algorithm under different noise levels, test noise n D is added to the noise-free signal to obtain the source signal. The sum of temperature effect, shrinkage and creep effect, and prestress loss effect, which is a signal with a low frequency domain, is regarded as the reference noise-free signal to verify the proposed algorithm. In order to verify the validity of the proposed algorithm under different noise levels, test noise Dn is added to the noise-free signal to obtain the source signal. The signal vectors ( ) The signal-to-noise ratio (SNR) can also be estimated as: where Ps and PN represent the effective power of the noise-free signal and noise, respectively. ( ) x i and ˆ( ) s i represent the signal vectors of the source signal data and the reference noise-free signal data, respectively.

Procedure of Separation
Deflection/mm B Figure 10. Total deflection of different effects with 10% noise level (not including live load effect).
Due to space limitations, only the signal separation process with a 10% noise level is given here, as shown in Figure 10. At the first stage, we apply the EEMD algorithm on the source single channel signal. Two critical parameters need to be prescribed before the EEMD is applied to a signal. In this case, the standard deviation ( σ ) and the ensemble number (me) are defined as 0.4 and 100, respectively. As a result, we obtained 10 IMFs and a residue (as shown in Figure 11).  After data normalization, the PCA is applied to the IMF components to obtain uncorrelated basis components. In determining the r principal component, the general requirements of the component After data normalization, the PCA is applied to the IMF components to obtain uncorrelated basis components. In determining the r principal component, the general requirements of the component score are set as 95% in this simulated example. The first three basis vectors, whose component scores are more than 95%, are shown in Figure 12 and are selected as the input of the FastICA model. The estimated output independent deflection components are shown in Figure 13. Due to space limitations, only the signal separation process with a 10% noise level is given here, as shown in Figure 10. At the first stage, we apply the EEMD algorithm on the source single channel signal. Two critical parameters need to be prescribed before the EEMD is applied to a signal. In this case, the standard deviation ( σ ) and the ensemble number (me) are defined as 0.4 and 100, respectively. As a result, we obtained 10 IMFs and a residue (as shown in Figure 11).  After data normalization, the PCA is applied to the IMF components to obtain uncorrelated basis components. In determining the r principal component, the general requirements of the component score are set as 95% in this simulated example. The first three basis vectors, whose component scores are more than 95%, are shown in Figure 12 and are selected as the input of the FastICA model. The estimated output independent deflection components are shown in Figure 13. To quantitatively analyze the validity of the proposed algorithm, the correlation coefficient ψ is used as an evaluation index to measure the similarity between source signals and the corresponding estimated output components: where si is the i-th source signal component, and zi is the i-th estimated output signal component, To quantitatively analyze the validity of the proposed algorithm, the correlation coefficient ψ is used as an evaluation index to measure the similarity between source signals and the corresponding estimated output components: where s i is the i-th source signal component, and z i is the i-th estimated output signal component, shown in Figure 13. The range of the correlation coefficient is [−1, 1]. The closer |ψ| is to 1, the closer the correlation is. Usually, when |ψ| is greater than 0.8, the two vectors are considered to have a high linear correlation.
The results are shown in Table 2. The signal-to-noise ratio (SNR) is critical to separating the results of the proposed algorithm. For the simulated mixed signals of low frequency domain, the correlation coefficients are greater than 0.84 when the noise level is under 10%. However, when the noise level is greater than 15%, the correlation coefficients are smaller than 0.75. The results show that the proposed algorithm is feasible when the noise level is under 10%.

Description of the Structural Health Monitoring (SHM) System
According to the regular inspection report, the Hanxi bridge developed a midspan deflection of −28 mm after four years of operation. As shown in Figure 14, a liquid level sensing system was deployed in 2015. Two reference points were set at pier 1#, and a total of four deflection measurement points were set at span 1# and span 2#. The first three vertical frequencies of the bridge are identified as 0.65, 0.78, and 1.07 Hz from field measurements. The sampling frequency of the liquid level sensing system is set at 4 Hz, which can meet the requirements.

Live Loads Effect
The real-time monitoring data of point 3 was used for further analysis. The length of the selected data is one year (from October 2016 to October 2017, lasting for 8760 h, with 126,144,000 total data points). A small data set of the raw signal, which lasts for half an hour, is shown in Figure 15. The maximum/minimum value of the deflection caused by the live load effect was 2 and −4 mm, respectively. Most of the time, the deflection is smaller than −3 mm, which is the deflection caused by light two-axle vehicles traveling on a bridge. When heavier vehicles are driving on the bridge, a greater deflection from −3 to −4 mm is developed. The real-time monitored deflection data can be used as an evaluation index for the overload assessment of vehicles.

Live Loads Effect
The real-time monitoring data of point 3 was used for further analysis. The length of the selected data is one year (from October 2016 to October 2017, lasting for 8760 h, with 126,144,000 total data points). A small data set of the raw signal, which lasts for half an hour, is shown in Figure 15. The maximum/minimum value of the deflection caused by the live load effect was 2 and −4 mm, respectively. Most of the time, the deflection is smaller than −3 mm, which is the deflection caused by light two-axle vehicles traveling on a bridge. When heavier vehicles are driving on the bridge, a greater deflection from −3 to −4 mm is developed. The real-time monitored deflection data can be used as an evaluation index for the overload assessment of vehicles.

Live Loads Effect
The real-time monitoring data of point 3 was used for further analysis. The length of the selected data is one year (from October 2016 to October 2017, lasting for 8760 h, with 126,144,000 total data points). A small data set of the raw signal, which lasts for half an hour, is shown in Figure 15. The maximum/minimum value of the deflection caused by the live load effect was 2 and −4 mm, respectively. Most of the time, the deflection is smaller than −3 mm, which is the deflection caused by light two-axle vehicles traveling on a bridge. When heavier vehicles are driving on the bridge, a greater deflection from −3 to −4 mm is developed. The real-time monitored deflection data can be used as an evaluation index for the overload assessment of vehicles.  To reduce the storage space on the storage server and save the network bandwidth for data transmission, the real time deflection data of one year is averaged every two hours after separating the live load effect, as shown in Figure 16. The maximum/minimum values of deflection were 25.0 and −23.3 mm, respectively.

Live Loads Effect
The real-time monitoring data of point 3 was used for further analysis. The length of the selected data is one year (from October 2016 to October 2017, lasting for 8760 h, with 126,144,000 total data points). A small data set of the raw signal, which lasts for half an hour, is shown in Figure 15. The maximum/minimum value of the deflection caused by the live load effect was 2 and −4 mm, respectively. Most of the time, the deflection is smaller than −3 mm, which is the deflection caused by light two-axle vehicles traveling on a bridge. When heavier vehicles are driving on the bridge, a greater deflection from −3 to −4 mm is developed. The real-time monitored deflection data can be used as an evaluation index for the overload assessment of vehicles.

Temperature Effect
According to the proposed algorithm, the long-term monitored deflection data are decomposed into a collection of IMFs by EEMD. These IMFs are utilized as inputs of the PCA model, then some uncorrelated and dominant basis components are extracted. A collection of IMFs, whose component score is more than 95%, is used as input of the ICA model. The separated deflections of the daily temperature effect and annual temperature effect are shown in Figures 17 and 18, respectively. The frequencies of the daily temperature deflection and annual temperature deflection are 0.25 and 0.04 Hz, respectively. With a correlation coefficient of 0.803, the range of the separated daily temperature deflection is −5 to 5 mm. The separated annual temperature deflection (with a correlation coefficient of 0.847) changes from −14 to 14 mm, showing a good correlation with the long-term monitored deflection data ( Figure 16). temperature effect and annual temperature effect are shown in Figure 17 and Figure 18, respectively. The frequencies of the daily temperature deflection and annual temperature deflection are 0.25 and 0.04 Hz, respectively. With a correlation coefficient of 0.803, the range of the separated daily temperature deflection is −5 to 5 mm. The separated annual temperature deflection (with a correlation coefficient of 0.847) changes from −14 to 14 mm, showing a good correlation with the long-term monitored deflection data (Figure 16).

Structural Deflection
For the separated structural deflection, which consists of the effect of creep, shrinkage, prestress loss, and dead load, −14.8 mm developed in one year (from December 2015 to December 2016), as shown in Figure 19. This result indicates that the structural deflection value increases over time and that the bridge might be in a "not healthy" condition. The proposed algorithm can extract the deflection caused only by structural damage and material deterioration.

Structural Deflection
For the separated structural deflection, which consists of the effect of creep, shrinkage, prestress loss, and dead load, −14.8 mm developed in one year (from December 2015 to December 2016), as shown in Figure 19. This result indicates that the structural deflection value increases over time and that the bridge might be in a "not healthy" condition. The proposed algorithm can extract the deflection caused only by structural damage and material deterioration.
Thorough regular inspection, several cracks were found in the top slabs, the bottom slabs, and the webs of the box girder, which will directly affect the stiffness of the structure. The interaction between the deflection and cracks makes the deflection much larger than the theoretical value and, with the increase in operating time, the deflection has an increasing trend. Therefore, in order to accurately predict the long-term deflection trend of long-span PC bridges, this interaction should be considered. Furthermore, SHM technology may allow for early warnings but more careful inspections must be conducted to assess the condition of the structure.

Structural Deflection
For the separated structural deflection, which consists of the effect of creep, shrinkage, prestress loss, and dead load, −14.8 mm developed in one year (from December 2015 to December 2016), as shown in Figure 19. This result indicates that the structural deflection value increases over time and that the bridge might be in a "not healthy" condition. The proposed algorithm can extract the deflection caused only by structural damage and material deterioration.
Thorough regular inspection, several cracks were found in the top slabs, the bottom slabs, and the webs of the box girder, which will directly affect the stiffness of the structure. The interaction between the deflection and cracks makes the deflection much larger than the theoretical value and, with the increase in operating time, the deflection has an increasing trend. Therefore, in order to accurately predict the long-term deflection trend of long-span PC bridges, this interaction should be considered. Furthermore, SHM technology may allow for early warnings but more careful inspections must be conducted to assess the condition of the structure.

Conclusions
Based on the advantages of EEMD and blind source separation (BSS), an integrated machine learning algorithm consisting of EEMD, PCA, and ICA is presented for estimating the number of sources and separation of the single channel deflection data. The problem of an underdetermined blind source separation is transformed into a well-posed problem, and the deep information contained in the single channel deflection signal is analyzed. By applying the proposed algorithm, different individual deflection components (i.e., live load effect, temperature effect, and structural deflection) can be obtained from the real-time monitoring deflection signal, which can reveal the changing pattern of the structural condition.
The algorithm proposed in this paper is based on the fact that the components of the bridge deflection signal are independent of each other. The separated deflection signal is an ideal mixture of different individual deflection components. The simulated results show that the algorithm is feasible when the noise level is under 10%. However, the algorithm is practicable for field applications under certain conditions. There are some difficulties in processing the real world measured data. For example, in the long-term monitoring data, the sensor output includes not only the deflection signal

Conclusions
Based on the advantages of EEMD and blind source separation (BSS), an integrated machine learning algorithm consisting of EEMD, PCA, and ICA is presented for estimating the number of sources and separation of the single channel deflection data. The problem of an underdetermined blind source separation is transformed into a well-posed problem, and the deep information contained in the single channel deflection signal is analyzed. By applying the proposed algorithm, different individual deflection components (i.e., live load effect, temperature effect, and structural deflection) can be obtained from the real-time monitoring deflection signal, which can reveal the changing pattern of the structural condition.
The algorithm proposed in this paper is based on the fact that the components of the bridge deflection signal are independent of each other. The separated deflection signal is an ideal mixture of different individual deflection components. The simulated results show that the algorithm is feasible when the noise level is under 10%. However, the algorithm is practicable for field applications under certain conditions. There are some difficulties in processing the real world measured data. For example, in the long-term monitoring data, the sensor output includes not only the deflection signal of the structure, but also the sensor fault caused by the system and environmental conditions (such as bias, drifting, precision degradation, and gain), which will lead to a poor result. Therefore, the detection and correction of the sensor fault is a very important pre-process work for deflection separation.
It is anticipated that, in the future, more automatic detection and correction algorithms for sensor faults will be proposed and applied to the SHM system. This fact makes the proposed algorithm practical and reliable for the bridge condition assessment of large-span PC bridges in field applications.