Artificial Neural Network for Vertical Displacement Prediction of a Bridge from Strains (Part 1): Girder Bridge under Moving Vehicles

Abstract: A real-time prediction method using a multilayer feedforward neural network is proposed for estimating the vertical dynamic displacements of a bridge from its longitudinal strains while vehicles pass across it. A numerical model of an existing five-girder bridge spanning 36 m, validated against experimental measurements, was used to verify the proposed method. To obtain a realistic vehicle distribution for the bridge, vehicle types and actual headways of moving vehicles were recorded, and the measured vehicle distribution was generalized using Pearson Type III theory. Twenty-five load scenarios were created for each of the assumed vehicle speeds of 40 km/h, 60 km/h, and 80 km/h. The results indicate that the model can reasonably predict the overall displacements of the bridge (which are difficult to measure) from the strains (which are relatively easy to measure) in the field in real time.

H.S.M. and Y.M.L.; methodology, P.-j.C. and S.O.; validation, H.S.M.; formal analysis, P.-j.C. and S.O.; writing—original draft preparation, H.S.M. and S.O.; writing—review and editing, H.S.M. and Y.M.L.; supervision, Y.M.L.; project administration, H.S.M. and Y.M.L.


Introduction
Bridges are critical infrastructure for the movement of goods and people, so evaluating their safety is very important. The real-time evaluation of the safety of such structures is called structural health monitoring (SHM). Through SHM, urgent situations such as the threat of collapse can be addressed quickly. In addition to extending the life of a structure, SHM can reduce its maintenance costs [1]. SHM can be divided into two categories: safety evaluation, which assesses the load-bearing capacity of a structure, and serviceability evaluation, which assesses the degree of deformation present [2]. Various physical data are collected in SHM. Of these, displacement is one of the most basic data types and is useful in evaluating both safety and serviceability [2,3]. From the displacements of a structure, its behavior and strength can be intuitively confirmed. In addition, by observing its long-term displacement history, the degree of deterioration and damage suffered by a structure can be clearly ascertained. For these reasons, many studies have been conducted on methods of measuring the displacements of a structure [1–7].
Methods for measuring the displacements of a structure can be generally classified as contact or non-contact measurements [1,8,9]. Devices for contact-type measurements include linear variable differential transformers (LVDTs) and cable-type displacement transducers. These contact-type devices generally have a very high accuracy [1]. However, these devices are affected by the installation site because they require contact with the structure. For example, it is very difficult to install equipment for bridges that have very high clearance, such as bridges in valleys or bridges over rivers or oceans.

Methodology
In this study, a method is proposed for predicting displacements with an artificial neural network (ANN) from strains, which are relatively easy to measure. Training data (strains and displacements) are obtained from the verified FEM. To test the trained ANN, strains measured by sensors installed on the actual structure should be input to the trained ANN, and the predicted displacements should be compared with the measured displacements. However, since the purpose of this paper is to propose a displacement prediction method, the test data were also obtained from the FEM. The proposed displacement prediction method is demonstrated in the next section.

Numerical Model
In this study, a finite element model (FEM) was used to obtain training and test data for the ANN. If measured data were available, a more realistic prediction would be possible, but as mentioned in Section 1, measuring displacement is not easy. The motivation for this study is to develop an alternative method for predicting displacement. Therefore, an FEM verified with experimental values was used instead. It is assumed that the results obtained from the verified FEM correspond to the actual data. To carry out the finite element analysis (FEA), a simply supported slab-and-girder bridge was selected. The span of the slab was set to 36 m and the width of the slab to 15 m (Figure 1a). It is assumed that the bridge carries four lanes of traffic, two in each direction. I-shaped steel girders were used (Figure 1b). The unit weight of the concrete slab is 2300 kgf/m³ and its modulus of elasticity is 30 GPa. The unit weight of the steel girders is 8000 kgf/m³ and their modulus of elasticity is 200 GPa. Only the vehicle load was considered as a load on the bridge. Dynamic analysis was performed assuming that all materials remain within the elastic range and are homogeneous and isotropic. The commercial FE program ABAQUS was used. Three-dimensional solid elements were used for the bottom slab of the bridge, and three-dimensional shell elements were used for the girders. To analyze the dynamic behavior caused by vehicle loads passing over the bridge, the bridge was modeled as shown in Figure 2. At both ends of the bridge, approach portions were modeled to minimize any numerical problems that might occur between the bridge and the entrance as a vehicle moves onto the bridge. The numerical model was verified by comparing it to experimental values under static and dynamic loads [41–43].
The behavior of a bridge is affected by factors such as vehicle weight, length, suspension, and natural period. Because these factors interact in a complex manner, it is difficult to construct a model that considers all variables. Therefore, in this study, it is assumed that the vehicles are limited to three types (passenger cars, buses, and trucks) and that they move at constant speeds. The dynamic performance of structures can be affected by various factors such as loading type and geometry [44]. In this paper, the dynamic interaction between the bridge and the moving vehicles can have relevant effects. To express these interactions, an FE model based on the vehicle model proposed by Zuo and Nayfeh [44] was used. The vehicle model includes a mass, spring, and damper, whose dynamic behavior differs from that of the bridge. The specifications of the car, bus, and truck models were those proposed by Zuo and Nayfeh [45], Ahmed et al. [46], and Li [47].
Appl. Sci. 2019, 9, 2881

Pearson Type III Distribution
To describe the distribution of vehicles as realistically as possible when loading the vehicle data into the FEM, a camera was installed at a fixed position in front of the starting point of an actual bridge (Figure 3), and the actual vehicle types and headways (i.e., time intervals between vehicles) were recorded. Although the bridge modeled in the FEM and the bridge used for the measurements are not identical, this is assumed not to have a significant effect on the prediction results because the measured data were fitted to the Pearson Type III distribution, a probabilistic model of general traffic flow. The vehicle types are assumed to be passenger cars, buses, or trucks. Table 1 shows the measured results by lane.

Statistics were also introduced to generalize the measured headway values of the vehicles. The traffic situation was generalized using Pearson Type III theory, assuming the most general intermediate flow of traffic [48]. The Pearson Type III distribution is a generalized form of the gamma distribution and is often used to describe traffic conditions [49]. The Pearson Type III probability density function at arbitrary headway t is given by Equation (1).
$$f(t) = \frac{\lambda}{\Gamma(K)} \left[\lambda (t - \alpha)\right]^{K-1} e^{-\lambda (t - \alpha)} \qquad (1)$$
where λ is the flow rate and can be calculated using the mean value (μ) of the measured headway, α, and K; α is the minimum expected headway; and K is the shape factor that can be calculated using μ, α, and the standard deviation (σ) of the measured headway. Γ is the gamma function and is defined as
$$\Gamma(K) = (K - 1)! \qquad (2)$$
If K is not an integer, the gamma function can be calculated numerically as
$$\Gamma(K) = \int_{0}^{\infty} x^{K-1} e^{-x}\,dx \qquad (3)$$
Using the probability density function (Equation (1)), the probability (P) that any headway (t) is between h and h + δh can be estimated numerically as
$$P(h \le t < h + \delta h) = \int_{h}^{h+\delta h} f(t)\,dt \qquad (4)$$
To apply Pearson Type III theory to the measured headway, seven steps are required, as given in Figure 4.
Figure 5 shows the observed headway and the headway generalized using Pearson Type III theory. As seen in the figure, the observed headway differs somewhat from the generalized headway because it reflects characteristics specific to a particular area. Nevertheless, the headway distribution generalized by applying Pearson Type III theory represents the measured headway distribution well overall.
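The density of Equation (1) and the interval probability can be evaluated with a short Python sketch. This is an illustration rather than the authors' implementation: the helper names are hypothetical, the parameters are estimated by moment matching (an assumption that follows from the mean α + K/λ and variance K/λ² of this density and may differ from the estimator used in the study), and the example values of μ, σ, and α are invented.

```python
import math


def pearson3_params(mu, sigma, alpha):
    """Moment-matched shape K and flow rate lam (assumed estimator).

    Uses mean = alpha + K / lam and variance = K / lam**2 of Equation (1).
    """
    if mu <= alpha or sigma <= 0:
        raise ValueError("need mu > alpha and sigma > 0")
    K = ((mu - alpha) / sigma) ** 2
    lam = K / (mu - alpha)
    return K, lam


def pearson3_pdf(t, K, lam, alpha):
    """Density of Equation (1) at headway t; treated as zero at or below alpha."""
    if t <= alpha:
        return 0.0
    x = lam * (t - alpha)
    return lam / math.gamma(K) * x ** (K - 1) * math.exp(-x)


def headway_probability(h, dh, K, lam, alpha, steps=200):
    """P(h <= t < h + dh) by composite trapezoidal integration of the density."""
    width = dh / steps
    total = 0.5 * (pearson3_pdf(h, K, lam, alpha)
                   + pearson3_pdf(h + dh, K, lam, alpha))
    for i in range(1, steps):
        total += pearson3_pdf(h + i * width, K, lam, alpha)
    return total * width


# Illustrative values only: mean 3.0 s, standard deviation 1.0 s, minimum 1.0 s.
K, lam = pearson3_params(mu=3.0, sigma=1.0, alpha=1.0)
p_2_to_3 = headway_probability(2.0, 1.0, K, lam, alpha=1.0)
```

`math.gamma` accepts non-integer arguments, so it covers the case in which K is not an integer.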

Load Scenarios
To create the load scenarios for the FEA, the proportions of the vehicle types in Table 1 and the headways in Figure 5 were used. An algorithm was created that selects a vehicle type for each lane according to the proportions in Table 1 and, at the same time, assigns the headway of the vehicle according to the probabilities shown in Figure 5. The headway is multiplied by the speed to calculate the distance between vehicles. For each of the three speeds (40 km/h, 60 km/h, and 80 km/h), 25 load scenarios were created using the algorithm. Figure 6 shows one example of a vehicle distribution scenario for 60 km/h.
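A scenario generator of this kind might look like the following Python sketch. It is an interpretation, not the authors' algorithm: headways are drawn from a shifted gamma (Pearson Type III) distribution, the vehicle shares and distribution parameters are placeholders rather than the values of Table 1 and Figure 5, and all function names are hypothetical.

```python
import random

# Hypothetical vehicle-type shares; the measured proportions are those of Table 1.
VEHICLE_SHARES = {"car": 0.80, "bus": 0.08, "truck": 0.12}


def make_lane_scenario(rng, speed_kmh, duration_s, K, lam, alpha):
    """Generate the vehicles entering one lane during duration_s seconds."""
    speed_ms = speed_kmh / 3.6
    types = list(VEHICLE_SHARES)
    shares = list(VEHICLE_SHARES.values())
    vehicles = []
    arrival_s = 0.0
    while True:
        # Shifted gamma draw: shape K, scale 1 / lam, minimum headway alpha.
        headway_s = alpha + rng.gammavariate(K, 1.0 / lam)
        arrival_s += headway_s
        if arrival_s > duration_s:
            return vehicles
        vehicles.append({
            "type": rng.choices(types, weights=shares)[0],
            "headway_s": headway_s,
            "gap_m": headway_s * speed_ms,  # headway multiplied by speed
            "arrival_s": arrival_s,
        })


def make_scenario(seed, speed_kmh, lanes=4, duration_s=8.0,
                  K=4.0, lam=2.0, alpha=1.0):
    """One load scenario: a vehicle list for each lane (parameters are placeholders)."""
    rng = random.Random(seed)
    return {lane: make_lane_scenario(rng, speed_kmh, duration_s, K, lam, alpha)
            for lane in range(1, lanes + 1)}


# Twenty-five scenarios for one of the three speeds.
scenarios_60 = [make_scenario(seed, 60) for seed in range(25)]
```

Seeding each scenario separately keeps the scenarios reproducible, which is convenient when the same loads have to be re-applied to the FEM.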

Data Collection
The load scenarios created as described in Section 2.2.2 were loaded into the FEM, and FEA was performed. Strains and displacements were obtained at 11 locations at uniform intervals of 3.6 m below the flange of each girder, as shown in Figure 7. The center point is included as a measurement point because the largest displacement generally occurs at the center of the girder. Both ends of the girder were included to confirm that the strain and displacement at the supports of the simply supported beam are zero. Thus, a total of 55 strain (11 points × 5 girders, input) and 55 displacement (11 points × 5 girders, output) measurement points were set. That is, one data set contains 55 strains and 55 displacements. The FEA was performed for the 25 scenarios at each speed. The value of Δt was 0.01 s, and the vehicles were loaded for a total of 8 s so that many vehicles could completely pass across the bridge. Thus, 800 data sets were obtained per scenario. Table 2 shows the amounts of training and test data obtained.
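The data-set sizes quoted above follow from simple arithmetic, reproduced here as a short Python check. The constant names are illustrative; the numbers are those stated in the text.

```python
SPAN_M = 36.0
POINTS_PER_GIRDER = 11
GIRDERS = 5
DT_S = 0.01
DURATION_S = 8.0
SPEEDS_KMH = (40, 60, 80)
SCENARIOS_PER_SPEED = 25

# Eleven points, including both supports, give ten equal intervals.
spacing_m = SPAN_M / (POINTS_PER_GIRDER - 1)

# One data set holds one strain and one displacement per measurement point.
values_per_data_set = POINTS_PER_GIRDER * GIRDERS

# One data set is recorded at every time step of a scenario.
data_sets_per_scenario = round(DURATION_S / DT_S)

total_data_sets = len(SPEEDS_KMH) * SCENARIOS_PER_SPEED * data_sets_per_scenario
```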

The predictive ability of an ANN drops significantly for data beyond the range of the data used for training. In other words, if the range of the data to be predicted is outside the range of the data used for training, the prediction accuracy of the ANN is lowered. Therefore, the range of the test data should be configured to be within the range of the training data [27]. In this study, one scenario was selected for each speed such that the strain and displacement values used in testing were within the range of the values used in training, as shown in Table 3. A total of 57,600 data sets were used as training data, and a total of 2400 data sets were used as test data, with 800 data sets per speed. In general, the quantity of data used for training should be greater than the quantity used for testing [34]. The difference between the quantities of training data and test data in this study is considered to be appropriate.
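One way to select such a test scenario is sketched below in Python. The per-channel comparison and the function names are assumptions made for illustration; the study reports the ranges in Table 3 but does not spell out the selection procedure.

```python
def value_range(rows):
    """Per-column (min, max) over a list of equally long rows."""
    return [(min(col), max(col)) for col in zip(*rows)]


def within_training_range(train_rows, test_rows):
    """True if every test value lies inside the training range of its column."""
    bounds = value_range(train_rows)
    return all(lo <= value <= hi
               for row in test_rows
               for value, (lo, hi) in zip(row, bounds))


def pick_test_scenario(scenarios):
    """Index of the first scenario covered by the range of all the others, else None.

    Each scenario is a list of rows (one row per time step).
    """
    for k, candidate in enumerate(scenarios):
        rest = [row for j, s in enumerate(scenarios) if j != k for row in s]
        if within_training_range(rest, candidate):
            return k
    return None


# Split sizes stated in the text: one of 25 scenarios per speed is held out.
test_sets = 3 * 1 * 800
train_sets = 3 * 24 * 800
```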

Multilayer Feedforward Neural Network (MFNN)
ANNs are computer algorithms that simulate the information processing of the human central nervous system. When a human brain receives information, it is processed in the central nervous system and a final decision is made. The nervous system is made up of cells called neurons, and ANNs have neuron-like nodes. ANNs have a special ability to capture correlations in data with nonlinear relationships [33]. The neural network, first introduced by McCulloch and Pitts [50], has since been developed into various types of network models. Among these, the most efficient and widely used model is the multilayer feedforward neural network (MFNN) [51]. The structure of a general MFNN with J hidden nodes is shown in Figure 8. As shown in the figure, the MFNN consists of an input layer for inputting data, a hidden layer for computing with the input data, and an output layer for predicting the target value. Each layer consists of many nodes. The nodes of each layer are mathematically connected to the nodes of adjacent layers (input layer to hidden layer, hidden layer to output layer). Weights are assigned to the connections between layers. In addition, a bias is assigned to each node of the hidden layer and the output layer. For the ANN to accurately predict the target value, the optimal weights and biases must be determined, which is done through training.
The process of training an artificial neural network is as follows. First, training data must be prepared for the learning process. These training data consist of input (I) and target output (T) values.
In this study, 55 strain values were used as input values and 55 displacement values as output values. The prepared input values are used in a calculation along with the initial weights and biases, which are generated randomly, and the calculated values are input to the hidden layer:
$$\mathrm{net\_hidden}_j = \sum_{i} w_{ij} I_i + \theta_j \qquad (5)$$
where $I_i$ is the variable input to the ith node of the input layer, $w_{ij}$ is the weight assigned between the ith node of the input layer and the jth node of the hidden layer, and $\theta_j$ is the bias assigned to the jth node of the hidden layer. The result, $\mathrm{net\_hidden}_j$, is the weighted sum of all input variables plus the bias and is input to the jth node of the hidden layer. The value $\mathrm{net\_hidden}_j$ input to the hidden layer is passed through the transfer function:
$$Z_j = f(\mathrm{net\_hidden}_j) \qquad (6)$$
where f is the transfer function of the hidden layer, and $Z_j$ is the output value of the jth node of the hidden layer, which is also the variable input to the output layer. The same process is applied between the hidden layer and the output layer. The error is calculated by comparing the predicted output (K) with the target output (T) of the training data (feedforward). If the computed error does not satisfy the user-defined criteria, the ANN goes back through the hidden layer and the input layer and corrects the weights and biases using the training algorithm (backward) [52]. Through this iterative process, the error between the predicted output (K) and the target output (T) is reduced. When the criteria set by the user are met, the training stops and the optimal weights and biases have been determined.
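The feedforward step can be illustrated with a minimal pure-Python sketch for a network with one hidden layer. It assumes a log-sigmoid transfer function in the hidden layer and a linear output layer; the uniform initialization range and all function names are illustrative assumptions, and the backward (weight-correction) step is not shown.

```python
import math
import random


def logsig(x):
    """Numerically stable log-sigmoid transfer function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def init_layer(rng, n_in, n_out):
    """Random initial weights w[i][j] and biases theta[j] (range is an assumption)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_out)] for _ in range(n_in)]
    biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
    return weights, biases


def layer_net(inputs, weights, biases):
    """Weighted sum plus bias for every node j of the receiving layer."""
    return [sum(w_row[j] * x for w_row, x in zip(weights, inputs)) + biases[j]
            for j in range(len(biases))]


def forward(strains, hidden, output):
    """Predict displacements K from strains I with one hidden layer."""
    net_hidden = layer_net(strains, *hidden)   # weighted sums entering the hidden layer
    z = [logsig(v) for v in net_hidden]        # hidden-layer outputs Z_j
    return layer_net(z, *output)               # linear output layer


rng = random.Random(0)
hidden = init_layer(rng, 55, 28)   # 55 inputs, 28 hidden nodes
output = init_layer(rng, 28, 55)   # 28 hidden nodes, 55 outputs
predicted = forward([0.0] * 55, hidden, output)
```

With untrained random weights the output is meaningless; the sketch only shows how the values flow from the input layer to the output layer.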

Modeling of MFNN
The predictive power of an ANN depends heavily on its architecture. In general, the architecture of an MFNN is defined by four elements [53]: number of layers, number of nodes in each layer, type of transfer function in each layer, and type of training function.
Recently, many studies have been undertaken to optimize ANN architecture [53]. However, there is still no scientific method or general rule for finding the optimal architecture of an ANN [34]. Therefore, in this study, the architecture of the ANN was determined through a case study, which is the technique generally followed [34]. The four elements as applied in the case study method are as follows.

1. The numbers of input layers and output layers are both fixed at one. Therefore, regarding the total number of layers, the number of hidden layers is the only variable. In general, as the number of hidden layers increases, the computational speed decreases proportionately, and the learning efficiency decreases because of the large amount of computer memory required. With these points in mind, the case study was conducted with the number of hidden layers limited to at most two.

2. The numbers of nodes in the input layer and the output layer are already determined because they are equal to the numbers of input and output variables, respectively, of the problem to be studied [54]. Therefore, the number of nodes in the hidden layer is the only variable. There are no general rules for setting the number of hidden nodes [53]. The numbers of nodes proposed by Zhang et al. were used [55]. They proposed n/2, n, 2n, and 2n + 1 hidden nodes, where n is the number of input nodes. In this study, as the number of nodes in the input layer is 55, 28 (n/2, rounded up), 55 (n), 110 (2n), and 111 (2n + 1) were used as the numbers of nodes in the hidden layer.

3. The transfer function transforms the weighted sum of the input values at a node into its output and determines the strength of the output value [56]. Nalbant et al. [57] asserted that the transfer function is determined by the nature of the problem to be solved. In this study, a log-sigmoid function (logsig) was used for the hidden layer and a linear function (purelin) for the output layer, both of which are commonly used transfer functions [33].

4. The training function determines how the error of the predicted output value is calculated and how the weights and biases are adjusted. The training function is important because it can considerably affect the learning speed and accuracy [53]. The training function is affected by many factors, such as the nature of the problem, the amount of data, and the number of weights. That is, even for a given function, the accuracy and the learning time may vary depending on the nature of the problem. In this study, a relatively large quantity of data (57,600 data sets) was used for learning. Therefore, trainscg was used as the training function because it has relatively low memory and time consumption, as well as high accuracy, when there are many data [58].
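The case-study grid defined by the four items above can be enumerated with a short Python sketch, which also counts the weights and biases of each architecture. Two points are assumptions made for illustration: both hidden layers of a two-hidden-layer network are given the same number of nodes, and the cases are numbered with the one-hidden-layer networks first.

```python
import math

N_IN = N_OUT = 55
# n/2 (rounded up), n, 2n, and 2n + 1 hidden nodes.
HIDDEN_NODES = [math.ceil(N_IN / 2), N_IN, 2 * N_IN, 2 * N_IN + 1]


def layer_sizes(hidden_nodes, hidden_layers):
    """Node counts per layer, assuming equal-sized hidden layers."""
    return [N_IN] + [hidden_nodes] * hidden_layers + [N_OUT]


def parameter_count(sizes):
    """Number of weights plus biases of a fully connected feedforward network."""
    return sum(a * b + b for a, b in zip(sizes, sizes[1:]))


cases = {}
case_no = 1
for hidden_layers in (1, 2):
    for nodes in HIDDEN_NODES:
        cases[case_no] = layer_sizes(nodes, hidden_layers)
        case_no += 1

for number, sizes in cases.items():
    print(number, "-".join(map(str, sizes)), parameter_count(sizes))
```

The parameter count grows roughly linearly with the number of hidden nodes for one hidden layer and faster for two, which gives a rough sense of how the computational load scales across the cases.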
The above parameters are shown in Table 4. With two cases for the number of hidden layers and four cases for the number of nodes in the hidden layers, a total of eight cases were created for the ANN architecture (no. of input nodes-no. of first hidden nodes (-no. of second hidden nodes)-no.

Results and Discussion
In this section, the training accuracy of the ANN for various numbers of hidden layers and nodes is compared using the training data obtained as described in Section 2.2.3 and the parameters presented in Section 2.4. In addition, the test results are examined according to the architecture of the ANN, and the displacements at a certain point over time and the displacement distribution at a certain time are compared.

Training Results by ANN Structure
A MATLAB R2018a toolbox was used to train the artificial neural network. At the beginning of training, the initial weights and biases are generated randomly, so different training results can be obtained for the same structure [59]. In this study, a total of 20 training sessions were performed with each structure, and the average training results are shown in Table 5 and Figure 9. The mean squared error (MSE) was used as the training performance function; it is the one most commonly used for MFNNs [60]. The MSE is given by
$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} (T_i - K_i)^2 \qquad (7)$$
In Equation (7), T is the target output, K is the predicted output, and N is the number of output values compared. The closer the MSE is to zero, the smaller the error. As the MSE is affected by the absolute magnitude of the output (displacements), there is a limit to judging training performance by MSE alone. Therefore, the correlation coefficient (R) between the predicted values and the actual values is also presented. The closer R is to 1, the higher the prediction accuracy.
The training times for each ANN structure are also presented in Table 5. From this table, two features can be found. First, the prediction accuracy varies according to the number of hidden layers and the number of nodes in the hidden layer(s). When there was one hidden layer, the minimum and maximum MSE values were 0.001103 (case 4) and 0.001308 (case 1), respectively. The minimum and maximum R values were 0.998728 (case 1) and 0.998928 (case 4), respectively. When there were two hidden layers, the minimum and maximum MSE values were 0.001282 (case 7) and 0.001825 (case 5), respectively. The minimum and maximum R values were 0.998225 (case 5) and 0.998753 (case 7), respectively. From Figure 9, it can be seen that the effect of the ANN architecture on the MSE is similar to that on the R value. In general, the greater the number of hidden layers and nodes, the higher the prediction accuracy. However, as the number continues to increase, the accuracy may decrease because of overtraining. In this study, the learning accuracy was higher in the cases of one hidden layer than in the cases of two hidden layers. It is also seen that for a given number of hidden layers, the greater the number of nodes, the higher the prediction accuracy.
The second feature found in Table 5 is that the training time varies according to the number of hidden layers and the number of nodes. For cases 1, 2, 3, and 4, the training times were 840.30 s, 1566.92 s, 3403.87 s, and 3440.74 s, respectively. Case 2 had 55 hidden nodes, about twice the 28 of case 1, and its training time was also about twice that of case 1. Similarly, case 3 had twice as many hidden nodes as case 2, and its training time was also about twice that of case 2. Cases 3 and 4 had similar numbers of hidden nodes, so there was no notable difference in training time. For cases 5, 6, 7, and 8, the training times were 997.67 s, 3544.24 s, 16,852.07 s, and 15,442.72 s, respectively. Cases 5 and 6 both had two hidden layers, with 28 and 55 hidden-layer nodes, respectively, and their training times differed by a factor of about 3.5. Cases 6 and 7 also had two hidden layers, with numbers of hidden-layer nodes differing by a factor of 2, and their training times differed by a factor of about 4.7. From these results, it can be seen that the ANN training time increases roughly in proportion to the number of nodes when there is one hidden layer and increases more steeply when there are two hidden layers.
In general, the number of hidden layers and nodes is the most influential factor in ANN training [34]. However, in this study, the learning ability of the ANN was not substantially affected by the number of hidden layers and nodes. In other words, despite the differences among the network structures, fairly high accuracy was obtained for all of the ANN structures.
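The two performance measures used in this section, the MSE and the correlation coefficient R, can be computed with the following minimal Python sketch. It is not the MATLAB toolbox routine used in the study; the function names and the sample values are illustrative. The example also shows why R is reported alongside the MSE: rescaling the outputs changes the MSE but leaves R unchanged.

```python
import math


def mse(targets, predictions):
    """Mean squared error between target outputs T and predicted outputs K."""
    n = len(targets)
    return sum((t - k) ** 2 for t, k in zip(targets, predictions)) / n


def correlation(targets, predictions):
    """Pearson correlation coefficient R between targets and predictions."""
    n = len(targets)
    mean_t = sum(targets) / n
    mean_k = sum(predictions) / n
    cov = sum((t - mean_t) * (k - mean_k) for t, k in zip(targets, predictions))
    var_t = sum((t - mean_t) ** 2 for t in targets)
    var_k = sum((k - mean_k) ** 2 for k in predictions)
    return cov / math.sqrt(var_t * var_k)


# Illustrative values only.
targets = [1.0, 2.0, 3.0]
predictions = [1.1, 1.9, 3.2]

# The same data expressed in a unit ten times smaller.
scaled_targets = [10 * t for t in targets]
scaled_predictions = [10 * k for k in predictions]

mse_ratio = mse(scaled_targets, scaled_predictions) / mse(targets, predictions)
r_change = (correlation(scaled_targets, scaled_predictions)
            - correlation(targets, predictions))
```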

Test Results by ANN Structure
For the ANNs trained with eight different structures having various numbers of hidden layers and nodes, testing was performed using data that were not used for training. The purpose of the test is to determine whether overfitting occurred during the training process. Overfitting refers to situations in which the ANN fits the training data well but the trained ANN does not predict new data well. This means that the ANN simply memorized the relationship between the input and output values of the training data rather than generalizing it [60]. Testing is important because, to be practically applicable, an ANN must be able to produce accurate predictions for data on which it was not trained.
The test procedure was as follows. The strain obtained from the analysis of a load scenario that was not used for training was input to the ANN trained as described in Section 3.1. The trained ANN predicts the displacement (output) immediately, which is one of the advantages of an ANN. Then, the displacement predicted by the ANN was compared with the displacement obtained from the analysis. The data used in this test were those for the one load scenario at each of the three velocities (40 km/h, 60 km/h, and 80 km/h) that was not used for training, as mentioned in Section 2.2.3. All ANN architectures trained as described in Section 3.1 were tested, and the test results were reviewed. Because the trained network produces the output immediately after the input is entered, no computation time is reported for testing. The test results are shown in Table 6, and the error for the load scenario at 40 km/h is shown in Figure 10. For 40 km/h with one hidden layer, the minimum and maximum MSE values were 0.67 × 10⁻³ (case 4) and 0.70 × 10⁻³ (case 1), respectively, and the minimum and maximum R values were 0.995533 (case 1) and 0.995705 (case 4), respectively. With two hidden layers, the minimum and maximum MSE values were 0.70 × 10⁻³ (case 8) and 0.77 × 10⁻³ (case 5), respectively, and the minimum and maximum R values were 0.995058 (case 5) and 0.995519 (case 8), respectively. Compared to the training results, the accuracy rankings for the structures with 110 and 111 hidden nodes differed. However, as the difference is very small, the test results are considered similar to the training results. Similar results were obtained for 60 km/h and 80 km/h. The test MSE values for 40 km/h and 60 km/h are smaller than the training MSE values, and the test MSE values for 80 km/h are larger than the training MSE values. This is because the MSE depends on the absolute magnitude of the displacement values.
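One way to read the remark about the MSE and the magnitude of the values is that the MSE scales with the square of the signal amplitude, whereas R does not change under scaling. The sketch below illustrates this with made-up displacement values and a hypothetical amplification `factor`; neither is taken from the study.

```python
from math import sqrt


def mse(reference, predicted):
    """Mean squared error between two equal-length sequences."""
    return sum((r - p) ** 2 for r, p in zip(reference, predicted)) / len(reference)


def pearson_r(reference, predicted):
    """Pearson correlation coefficient R between two equal-length sequences."""
    n = len(reference)
    mean_r = sum(reference) / n
    mean_p = sum(predicted) / n
    cov = sum((r - mean_r) * (p - mean_p) for r, p in zip(reference, predicted))
    var_r = sum((r - mean_r) ** 2 for r in reference)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / sqrt(var_r * var_p)


def scaled(values, factor):
    """Multiply every value by the same factor."""
    return [factor * v for v in values]


# Made-up reference and predicted displacements (arbitrary units).
reference = [0.0, -1.0, -2.0, -1.0, 0.0]
predicted = [0.1, -0.9, -2.1, -1.1, 0.0]

factor = 3.0  # hypothetical amplification of both signals

mse_small = mse(reference, predicted)
mse_large = mse(scaled(reference, factor), scaled(predicted, factor))
r_small = pearson_r(reference, predicted)
r_large = pearson_r(scaled(reference, factor), scaled(predicted, factor))

print(f"MSE grows by a factor of {mse_large / mse_small:.1f}")
print(f"R before scaling: {r_small:.6f}, after scaling: {r_large:.6f}")
```

For this reason, MSE values from load scenarios with different displacement amplitudes are not directly comparable, and R is the more suitable metric for comparing across velocities.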
In addition, a comparison of the R values shows that the R value in testing was smaller than that in training for all velocities. Therefore, the test results are considered slightly less accurate than the training results. However, the test results for MSE and R are fairly good overall, and the cases with one hidden layer in particular show a higher accuracy. Therefore, it can be confirmed that the ANN did not overfit, indicating that training and testing progressed very well.
Figures 11-13 show the dynamic displacement over time for the midpoint, where the largest displacement is expected to occur. Figure 14 shows the overall behavior of the bridge at the time of maximum displacement for each speed. The following observations are made from the graphs.

1. The ANN trained using three different velocities predicts the displacement for different speed-load scenarios with high accuracy. This suggests that it is possible to predict the displacement of a bridge due to vehicles moving at various velocities in the field.
2. The ANN predicts the overall behavior of all the girders at a given point in time, as well as the change in dynamic displacement of a given point over time.
3. The results show that the proposed method of predicting the vertical dynamic displacements of a bridge from longitudinal strains is feasible.

Conclusions and Future Work
In this study, a method was proposed that uses an ANN to predict the vertical displacement (output) of a bridge from the longitudinal strain (input) generated when vehicles pass across it. Pearson Type III distribution theory was used to represent the actual vehicle distribution, and, because actual data are difficult to obtain, the training and test data were acquired from a verified FEM. The proposed method showed that the dynamic displacement of a bridge, which is relatively difficult to measure, can be predicted from strain, which is relatively easy to measure, by combining an ANN, which has a strong ability to define correlations between data, with a verified FEM for generating training data. Displacement is one of the most important types of physical data in the SHM of bridges, and an ANN can predict it continuously in real time. By using the proposed method, it is expected that the prediction of displacement of bridges and other structures, whether in use or newly constructed, can become easier and more economical.
The prediction accuracy was also examined by ANN structure, which generally has the greatest influence on the prediction accuracy of an ANN. A comparison across the numbers of hidden layers and nodes suggested by Zhang et al. [55] showed fairly high accuracy for all structures. Although the structure of the ANN did not have a major influence on the prediction accuracy in this study, and although finding a highly accurate ANN structure is costly and time consuming, this search is considered a step that researchers must perform to achieve better predictions.
In the present study, the feasibility of displacement prediction by ANN was confirmed using an FE bridge model rather than a real bridge. In future work, an experiment applying the proposed method to an actual bridge will be conducted. With actual bridges in particular, natural conditions such as temperature and wind load may introduce noise into the measured strains and displacements. Future work is intended to quantify the relation between prediction accuracy and noise by predicting displacements from noisy measured data. In addition, the minimum number of strain-measurement points required for predicting displacements will be studied using optimization techniques such as genetic algorithms.