Infrastructure Safety Oriented Traffic Load Monitoring Using Multi-Sensor and Single Camera for Short and Medium Span Bridges

Abstract: Reliable and accurate monitoring of traffic load is of significance for the operational management and safety assessment of bridges. Traditional weigh-in-motion techniques are capable of identifying moving vehicles with satisfactory accuracy and stability, but their cost and construction-induced issues are inevitable. A recently proposed traffic sensing methodology, combining computer vision techniques and traditional strain-based instrumentation, achieves an obvious overall improvement for simple traffic scenarios with fewer passing vehicles, but faces obstacles in complicated traffic scenarios. Therefore, a traffic monitoring methodology is proposed in this paper with extra focus on complicated traffic scenarios. Rather than a single sensor, a network of strain sensors of a pre-installed bridge structural health monitoring system is used to collect redundant information and hence improve the accuracy of the identification results. Field tests were performed on a concrete box-girder bridge to investigate the reliability and accuracy of the method in practice. According to the test results, key parameters such as vehicle weight, velocity, quantity, type and trajectory are effectively identified, even in the presence of one-by-one and side-by-side vehicles. The proposed methodology is infrastructure safety oriented and preferable for traffic load monitoring of short and medium span bridges with respect to accuracy and cost-effectiveness.


Introduction
Over the last two decades, bridge structural health monitoring (BSHM) has become a pervasive technique that monitors the static and dynamic bridge responses induced by environmental effects or vehicle loads [1]. As the engineering practice of BSHM develops, an increasing number of structures are being equipped with data acquisition equipment and sensor networks consisting of strain sensors and accelerometers, as well as cameras. The initial purpose of these sensors is normally to observe the behavior of the bridge structure over time, and thereby to conduct damage detection and assess the structural condition [2]. Later, with the evolution of related technologies, researchers realized that the collected data may also be analyzed to achieve operational monitoring of the bridges, such as capturing the bridge response under extreme loads [3][4][5][6] or monitoring the traffic on the bridge [7][8][9].
For bridge structures, traffic load is the central operational factor. On the one hand, bridges are constructed to carry traffic. On the other hand, the traffic load might deviate from the original bridge design with the rapid development of the transportation industry. Therefore, monitoring traffic load, including vehicle weight, velocity, quantity, type and trajectory, is crucial for bridge design refinement and safety assessment, as well as operational management. To monitor the traffic load of bridges, the bridge weigh-in-motion (BWIM) technique has been highlighted [10]. The initial concepts behind BWIM were proposed by Moses [11], who used an instrumented bridge as the weighing scale to estimate vehicle weights in his engineering practice. Owing to its advantages in terms of cost efficiency, durability and unbiased accuracy, the BWIM technique has become a preferable tool to weigh vehicles and has been extended by many subsequent studies and engineering applications [12].
Recognizing the vehicle weight is the motivation of the BWIM technique. The identification approach is generally based on the static influence line/surface theory [13]. By now, there have been many engineering practices aimed at recognizing the vehicle weight, yet it remains difficult to obtain accurate results in complicated traffic cases [14]. The original Moses algorithm has difficulty separating the contributions of individual vehicles from the bridge response alone when more than one vehicle travels side by side in adjacent lanes on the bridge span. In addition, this method is unable to identify extra traffic information, including the type, size, axle number and velocity of vehicles, without the help of additional traffic sensors such as radar, road tubes and embedded axle detectors [15]. However, the use of any of those sensors would diminish the advantage of BWIM systems over pavement-based WIM systems.
Fortunately, the BSHM technique might provide solutions. An increasing number of bridges are now being instrumented with sensor networks and data acquisition equipment. By mining the data collected by the BSHM sensor network, extra traffic information might be discovered so as to mitigate the aforementioned problems faced by traditional BWIM techniques. For example, Yu et al. [16] proposed a BWIM algorithm that was able to identify the lateral position of a single vehicle on a bridge by using seven strain gauges installed transversely at the bottom of the beams. Other valuable attempts use the traffic webcam of a BSHM system to automatically detect the vehicles on the bridge and identify further vehicle information [17][18][19][20][21][22][23].
Focusing on the aforementioned key issues of current BWIM techniques, this study combines the strain sensor network and an additional traffic video webcam belonging to a bridge structural health monitoring system to monitor and identify the traffic load on the bridge. The paper is organized as follows: i) the theoretical background and the application procedure of camera visual sensing and strain sensing are introduced; ii) the influence line theory oriented towards gross vehicle weight (GVW) recognition is elaborated on, with an emphasis on the multiple-vehicle problem; iii) the overall framework of the data integration methodology for traffic monitoring is summarized; iv) field tests on a concrete box-girder bridge are conducted to demonstrate the proposed methodology, especially for complicated traffic cases. The advantages and the potential engineering applications of the methodology are summed up as a conclusion.

Computer Vision Technique
Over the last decades, the exponential growth in both hardware facilities and software algorithms has made traffic video surveillance widespread. In essence, the goal of traffic video surveillance is the detection of moving objects, which aims to decide whether a vehicle exists in the monitored area and where it is. To attain this goal automatically, computer vision techniques were developed, of which the main methods are either motion-based or feature-based. Compared to the motion-based method, the feature-based method is much more efficient and robust owing to the upsurge of deep learning, and is considered to be the mainstream of computer vision techniques [24,25].
The convolutional neural network (CNN) is a category of neural network with a deep structure and convolution calculation. It is one of the representative algorithms of deep learning approaches employed for object detection, classification and segmentation tasks [26]. Its learning ability enables the CNN to automatically learn features from the training data set, rather than using hand-engineered features, to detect the target object. This process imitates the visual perception mechanism of humans, which allows the CNN to surpass traditional manual feature extraction methods and greatly reduces the operational workload. The robustness and efficiency of the CNN have been proven in numerous object detection practices.
With the unremitting efforts of computer scientists, superior CNN-based computer vision algorithms have been continually proposed. In view of this, an advanced algorithm named Mask Region-based CNN (Mask R-CNN) is applied in this research to detect vehicles in the traffic surveillance video. The R-CNN is one of the bounding-box object detection approaches. Bounding-box object detection uses a sliding box to search for candidate positions where the object possibly occurs and evaluates convolutional networks on the image in the box to determine the existence of the object [27]. 'Mask' indicates that the Mask R-CNN extends the R-CNN by outputting the mask of the detected object. Put another way, Mask R-CNN not only detects the existence and position of the objects, but also recognizes their shape [28].
As with any deep-learning-based computer vision algorithm, the implementation procedure of the Mask R-CNN has the following three steps: i) prepare training data sets, ii) train the convolutional neural network of the Mask R-CNN algorithm and iii) apply Mask R-CNN to detect vehicles in the traffic video. The real-time detection results in this research are shown in Figure 1.
Remote Sens. 2019, 11, x FOR PEER REVIEW 3 of 21
As seen in Figure 1, the detection tasks of the Mask R-CNN follow two trigger strategies: before and after vehicles enter the bridge deck zone. In the first mode, both the back and the side of a vehicle are visible in the video image, so that the Mask R-CNN is capable of distinguishing different segments of a vehicle. Notably, the back and side areas, as well as the closely spaced wheels of a vehicle, are successfully recognized as shown in Figure 1a, which demonstrates the segmentation capability of the Mask R-CNN algorithm. When vehicles drive further onto the bridge deck, the side of the vehicles becomes invisible due to the fixed angle of the camera. Since computer vision techniques cannot detect what is invisible in the image, only the backs of vehicles are detected in this scenario, even when multiple vehicles overlap, as shown in Figure 1b.
The image pixel coordinates of the detection box are collected for further size measuring and vehicle positioning tasks.

Coordinate Transformation
The raw vehicle coordinates output from the traffic video are in image coordinates, which cannot be directly used to recognize the velocity, size or influence value of the vehicles unless they are transformed into space coordinates. For this purpose, three coordinate systems, namely the image pixel coordinate system in the video image, the space coordinate system in real space and the planar coordinate system on the bridge deck, are established for obtaining the vehicle position in situ, as illustrated in Figure 2. A coordinate transformation method is utilized in this paper, based on the earlier work of Xu and Zhang [29].
Supposing a vehicle denoted as V is driving on the bridge, what the computer vision technique directly outputs is the pixel coordinate, $V_1(x_1, y_1)$, of the recognized vehicle in the image pixel plane shown in the video image in Figure 2a. In order to get the real space coordinate $V_2(x_2, y_2, z_2)$ of the vehicle, the geometrical relations between the two systems are employed as shown in Figure 2b,c, in which Figure 2c is the planar projection of Figure 2b. According to the geometrical relations, the pixel coordinate of a point $V_1(x_1, y_1)$ can be transformed into the spatial coordinate $V_2(x_2, y_2, z_2)$ as follows:
$$[x_1,\ y_1,\ f] \cdot t = [x_2,\ y_2,\ z_2], \tag{1}$$
where $f$ is the focal length of the camera and $t$ is the similarity coefficient between the two similar triangles in Figure 2c. Since the coordinate $V_1(x_1, y_1)$ and the focal length $f$ are known, applying Equation (1) first requires finding the similarity coefficient $t$.
Moreover, to get the relative position of vehicles, that is $V_3(x_3, y_3)$, on the bridge deck, it is necessary to first consider the bridge deck as a spatial plane in the space coordinate system shown in Figure 2. The plane is described by the following equation:
$$A x_2 + B y_2 + C z_2 + D = 0, \tag{2}$$
where $A$, $B$, $C$, $D$ are unknown parameters determining the bridge deck plane equation in the spatial coordinate system. Substituting Equation (1) into Equation (2), the similarity coefficient $t$ can thus be written as:
$$t = -\frac{D}{A x_1 + B y_1 + C f}. \tag{3}$$
The next step is to project the focal point of the camera, i.e., $O_2(0,0,0)$ in Figure 2a, onto the bridge deck plane. As shown in Figure 2a, the projection point is denoted as $O_3(x_{o3}, y_{o3}, z_{o3})$, of which the coordinates can be easily obtained according to basic space geometry:
$$(x_{o3},\ y_{o3},\ z_{o3}) = -\frac{D}{A^2 + B^2 + C^2}\,(A,\ B,\ C). \tag{4}$$
Now the coordinate $x_3$ of the vehicle in the bridge deck coordinate system can be obtained by calculating the distance between the vehicle point $V_2(x_2, y_2, z_2)$ and the plane $z_2O_2O_3$ in the real space coordinate system shown in Figure 2a. Since the plane $z_2O_2O_3$ is determined by two coplanar vectors and passes through $O_2(0,0,0)$, this distance is:
$$x_3 = \frac{\left|A_x x_2 + B_x y_2 + C_x z_2\right|}{\sqrt{A_x^2 + B_x^2 + C_x^2}}, \tag{5}$$
where $A_x$, $B_x$ and $C_x$ are the coordinates of the normal vector of the spatial plane $z_2O_2O_3$. Similarly, the coordinate $y_3$ of the vehicle in the bridge deck coordinate system can be obtained by calculating the distance between the vehicle point $V_2(x_2, y_2, z_2)$ and a second reference plane, with an expression of the same form as Equation (5).
The coordinates $V_3(x_3, y_3)$ describe the vehicle position on the bridge deck and are used directly for BWIM purposes. Note that the camera orientation is not always identical with the bridge longitudinal direction owing to the limited installation positions available for the camera; a simple coordinate shift and rotation are needed in such cases.
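The chain from pixel to deck coordinates can be illustrated with a minimal Python sketch. It assumes the pinhole relation $(x_2, y_2, z_2) = t\,(x_1, y_1, f)$ and a deck plane $Ax_2 + By_2 + Cz_2 + D = 0$ whose parameters are already known. The function names are hypothetical, and the choice of deck axes (the $x_3$ axis normal to the plane containing the optical axis and $O_2O_3$, the $y_3$ axis as the optical axis projected onto the deck, with origin $O_3$) is an assumption made for illustration, not necessarily the construction used by the authors.

```python
import math

def _dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _unit(a):
    length = math.sqrt(_dot(a, a))
    if length < 1e-12:
        raise ValueError("degenerate geometry: zero-length direction vector")
    return tuple(p / length for p in a)

def pixel_to_space(x1, y1, f, plane):
    """Intersect the viewing ray t * (x1, y1, f) with the deck plane."""
    A, B, C, D = plane
    denom = A * x1 + B * y1 + C * f
    if abs(denom) < 1e-12:
        raise ValueError("viewing ray is parallel to the deck plane")
    t = -D / denom  # similarity coefficient
    return (t * x1, t * y1, t * f)

def camera_foot_point(plane):
    """Project the focal point O2 = (0, 0, 0) orthogonally onto the deck plane."""
    A, B, C, D = plane
    s = -D / (A * A + B * B + C * C)
    return (s * A, s * B, s * C)

def pixel_to_deck(x1, y1, f, plane):
    """Return (x3, y3) on the deck, measured from O3 (assumed axis convention)."""
    v2 = pixel_to_space(x1, y1, f, plane)
    o3 = camera_foot_point(plane)
    z_axis = (0.0, 0.0, 1.0)
    # ASSUMPTION: x3 axis = normal of the plane spanned by the z2 axis and O2O3.
    e_x = _unit(_cross(o3, z_axis))
    # ASSUMPTION: y3 axis = optical (z2) axis projected onto the deck plane.
    n = _unit(tuple(plane[:3]))
    k_dot_n = _dot(z_axis, n)
    e_y = _unit(tuple(k - k_dot_n * c for k, c in zip(z_axis, n)))
    rel = tuple(p - q for p, q in zip(v2, o3))
    # Signed coordinates; their absolute values are the point-to-plane distances.
    return (_dot(rel, e_x), _dot(rel, e_y))

# Illustrative numbers only: a deck plane tilted with respect to the optical axis.
example_plane = (0.0, 0.6, 0.8, -10.0)
x3, y3 = pixel_to_deck(0.5, 0.0, 1.0, example_plane)
```

Signed coordinates are returned so that vehicles on either side of the camera axis can be told apart; a shift and rotation would still be needed when the camera is not aligned with the bridge longitudinal direction.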
Finally, the key issue of the coordinate transformation turns out to be the determination of parameters A, B, C and D. Intuitively, both the location and orientation of the webcam are needed to determine these parameters. However, these data are generally unavailable because of field conditions. For this reason, a new method is proposed in the companion paper [30], which obtains the essential parameters directly from the video image using only two lines of equal spatial length in the image, regardless of the camera location and/or orientation. For the conciseness of this paper, that method is not elaborated herein.

Bridge Strain Sensing
BWIM techniques generally take advantage of the bridge strains to recognize the gross vehicle weight (GVW) based on the static influence line theory. Unfortunately, the raw strain data collected by strain sensors contain the strain induced not only by vehicle weight, but also by vehicle-bridge coupled vibration and other environmental factors. Therefore, analyzing the bridge structural strain and conducting strain signal processing are imperative. Typically, the components of bridge strain can be expressed by the following equations:
$$\varepsilon_{bridge} = \varepsilon_{environment} + \varepsilon_{vehicle}, \tag{7}$$
$$\varepsilon_{vehicle} = \varepsilon_{dynamic} + \varepsilon_{static}, \tag{8}$$
where $\varepsilon_{bridge}$ is the bridge strain measured by the sensors; $\varepsilon_{environment}$ is the bridge strain caused by environmental factors; $\varepsilon_{vehicle}$ is the bridge strain induced by vehicles, which consists of dynamic ($\varepsilon_{dynamic}$) and static ($\varepsilon_{static}$) components.
Among the different components of strain, the static component $\varepsilon_{static}$ is the one needed for the GVW estimation according to the influence line theory, and it can be extracted from the measured $\varepsilon_{bridge}$ by filtering out $\varepsilon_{environment}$ and $\varepsilon_{dynamic}$. Technically, a local regression algorithm named locally weighted scatterplot smoothing (LOWESS) is used to perform the extraction in the time domain. This algorithm is chosen for its accuracy and convenience, according to Cleveland and Devlin [31]. The whole procedure is shown in Figure 3 for illustration. Details of implementation are available in the literature [30].
It is noteworthy that the above discussion is applicable only to short and medium span bridges, which are commonly chosen as the targets of BWIM implementation [32]. In contrast to long-span bridges, short and medium span bridges serve as ideal weighing scales to estimate the GVW owing to their structural simplicity, better linear elasticity and more observable responses under traffic loads. Furthermore, environmental load effects on those bridges, such as wind load, are relatively simple or even negligible.
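The LOWESS-based extraction of the static strain component described above can be sketched in plain Python as follows. This is a minimal, non-robust variant (tricube weights, local linear fit, no robustness iterations) written for illustration only; the function name `lowess_smooth`, the parameter `frac` and its default are assumptions, and the sketch is not the implementation used in the companion paper [30].

```python
import math

def lowess_smooth(t, y, frac=0.3):
    """Locally weighted linear regression of y on t with tricube weights.

    frac is the fraction of samples used in each local fit (assumed parameter).
    Returns the smoothed values at the sample times t.
    """
    n = len(t)
    if n != len(y) or n == 0:
        raise ValueError("t and y must be non-empty and of equal length")
    k = min(n, max(2, int(math.ceil(frac * n))))
    smoothed = []
    for i in range(n):
        dist = [abs(tj - t[i]) for tj in t]
        h = sorted(dist)[k - 1]          # bandwidth: k-th nearest neighbour
        if h <= 0.0:
            smoothed.append(y[i])        # all neighbours coincide in time
            continue
        w = [(1.0 - min(d / h, 1.0) ** 3) ** 3 for d in dist]
        sw = sum(w)
        swx = sum(wi * ti for wi, ti in zip(w, t))
        swy = sum(wi * yi for wi, yi in zip(w, y))
        swxx = sum(wi * ti * ti for wi, ti in zip(w, t))
        swxy = sum(wi * ti * yi for wi, ti, yi in zip(w, t, y))
        denom = sw * swxx - swx * swx
        if abs(denom) < 1e-12:
            smoothed.append(swy / sw)    # fall back to a weighted mean
        else:
            slope = (sw * swxy - swx * swy) / denom
            intercept = (swy - slope * swx) / sw
            smoothed.append(intercept + slope * t[i])
    return smoothed

# Synthetic illustration: a slow "static" bump plus a fast "dynamic" oscillation.
time = [0.02 * i for i in range(200)]
static = [50.0 * math.exp(-((ti - 2.0) / 0.6) ** 2) for ti in time]
raw = [s + 5.0 * math.sin(40.0 * ti) for s, ti in zip(static, time)]
static_estimate = lowess_smooth(time, raw, frac=0.15)
```

The window fraction trades off suppression of the dynamic component against flattening of the static peak, so in practice it would need to be tuned against calibration runs.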

Traffic Load and Bridge Reaction
As bridges are basically beam-like structures, the influence lines of bridges reflect the relationship between structural responses and traffic load. Available BWIM studies suggest two approaches to obtain the influence line of a bridge, namely theoretical derivation [11,33,34] and calibration based on field tests [35,36].
This paper employs a method that fits the strain influence line to measured strain data from field calibration tests. The method includes two steps [30]: first, the shape of the strain influence line of the target bridge is theoretically obtained with the kinematic method according to Timoshenko and Young [37]; second, a truck with known weight is arranged to cross the instrumented bridge several times as calibration tests. The strain data measured in the tests are used to determine the exact values of the influence line obtained in the first step. The truck load is simplified as a concentrated load P = Wg for the sake of calibration convenience, where W is the vehicle weight and g is the gravitational acceleration, taken as 9.8 m/s². The simplification is mechanically reasonable, since the superposition principle holds for linear elastic structures. The accuracy of the GVW recognition results in this paper also supports the simplification. Figure 4 demonstrates the procedure of obtaining the influence line of a four-span continuous bridge for BWIM purposes.
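One simple way to read the second calibration step is as a one-parameter least-squares fit: the theoretical shape of the influence line is scaled so that, multiplied by the known truck weight, it best matches the measured static strains. The Python sketch below illustrates that reading only. The function names, the assumption that the influence line is expressed per unit vehicle weight, and the sample numbers are all hypothetical; the actual fitting procedure is the one described in [30].

```python
def calibrate_influence_scale(shape_values, measured_strains, truck_weight):
    """Least-squares scale a such that strain ~ truck_weight * a * shape.

    shape_values: theoretical influence-line shape at the truck positions.
    measured_strains: static strains measured at the same positions
                      (samples from several runs can simply be concatenated).
    """
    if len(shape_values) != len(measured_strains):
        raise ValueError("shape and strain samples must have equal length")
    numerator = sum(s * e for s, e in zip(shape_values, measured_strains))
    denominator = sum(s * s for s in shape_values)
    if denominator == 0.0 or truck_weight == 0.0:
        raise ValueError("shape values and truck weight must be non-zero")
    return numerator / (truck_weight * denominator)

def calibrated_influence_line(shape_values, scale):
    """Influence values per unit vehicle weight after calibration."""
    return [scale * s for s in shape_values]

# Hypothetical calibration run: unit-peak shape, 30 t truck, strains in microstrain.
shape = [0.0, 0.5, 1.0, 0.5, 0.0]
strains = [0.0, 30.0, 60.0, 30.0, 0.0]
scale = calibrate_influence_scale(shape, strains, truck_weight=30.0)
influence = calibrated_influence_line(shape, scale)
```

Fitting a single scale factor keeps the theoretical shape intact; using every sample of every run, rather than the peak alone, makes the fit less sensitive to noise in individual readings.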

Identification with Irredundant Measurement
Now that the static component of the vehicle-induced strain and the calibrated strain influence line are obtained, the inverse influence line theory can be used to calculate the GVW. According to Timoshenko and Young [37], the influence line theory is expressed as:
$$\varepsilon = \sum_{i=1}^{N} W_i\, I_{W_i}(x_i), \tag{9}$$
where $\varepsilon$ is the value of the extracted static bridge strain, $N$ is the total number of vehicles on the bridge and $W_i$, $I_{W_i}(x_i)$ and $x_i$ are the GVW, the strain influence value and the position of the $i$th vehicle when the extracted strain signal reaches the local peak, respectively. Equation (9) can also be written in matrix form as follows:
$$\varepsilon = \mathbf{W}\mathbf{I}, \qquad \mathbf{W} = [W_1, \ldots, W_N], \qquad \mathbf{I} = [I_{W_1}(x_1), \ldots, I_{W_N}(x_N)]^{T}. \tag{10}$$
Since the motivation for BWIM research is to identify the vehicle weight, Equation (10) is to be used inversely to calculate $\mathbf{W}$. In the case of only one vehicle driving on the bridge, Equation (10) can be expressed as:
$$\varepsilon = W\, I_{W}(x). \tag{11}$$
Then, the strain data collected by a single strain sensor are enough to determine the GVW of that vehicle. The sole GVW can be easily calculated by:
$$W = \frac{\varepsilon_{peak}}{I_{peak}}, \tag{12}$$
where $\varepsilon_{peak}$ is the peak value of the vehicle-induced static strain, $I_{peak}$ is the peak value of the calibrated strain influence line and $W$ is the GVW of the vehicle. For a more intuitive illustration, $\varepsilon_{peak}$ and $I_{peak}$ correspond to $\varepsilon_{S1}$ and $I_{W1}$ in Figure 4 above, respectively.
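As a small numerical illustration of the single-sensor case, the sketch below evaluates the superposition of vehicle contributions (strain as the sum of weight times influence value) and its inversion for one vehicle (weight as peak strain divided by peak influence value). The function names and the numbers are hypothetical, with strains assumed in microstrain and influence values in microstrain per tonne.

```python
def strain_from_vehicles(weights, influence_values):
    """Forward model: static strain as the sum of W_i * I_i over all vehicles."""
    if len(weights) != len(influence_values):
        raise ValueError("one influence value is needed per vehicle")
    return sum(w * i for w, i in zip(weights, influence_values))

def gvw_single_vehicle(peak_strain, peak_influence):
    """Inverse model for one vehicle and one sensor: W = eps_peak / I_peak."""
    if peak_influence == 0.0:
        raise ValueError("peak influence value must be non-zero")
    return peak_strain / peak_influence

# Hypothetical values: a 60 microstrain peak with I_peak = 2 microstrain per tonne.
single_weight = gvw_single_vehicle(60.0, 2.0)

# With two vehicles, one sensor yields a single equation in two unknowns,
# so the individual weights cannot be separated from this strain alone.
combined_strain = strain_from_vehicles([30.0, 20.0], [2.0, 1.5])
```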

Least Square Based Identification with Redundant Measurements
More generally, there are multiple vehicles driving on the bridge at the same time. In this multiple-vehicle scenario, no determinate solution for $\mathbf{W}$ in Equation (10) can be found unless redundant measurements from multiple strain sensors are available. If strain sensors outnumber the vehicles, which is usually the case, Equation (10) takes the form below:
$$[\varepsilon_1, \ldots, \varepsilon_M] = [W_1, \ldots, W_N] \begin{bmatrix} I^{\varepsilon_1}_{W_1}(x_1) & \cdots & I^{\varepsilon_M}_{W_1}(x_1) \\ \vdots & \ddots & \vdots \\ I^{\varepsilon_1}_{W_N}(x_N) & \cdots & I^{\varepsilon_M}_{W_N}(x_N) \end{bmatrix}, \tag{13}$$
where $\varepsilon_M$ is the maximum strain collected by the $M$th strain sensor, $I^{\varepsilon_M}_{W_N}(x_N)$ is the $N$th vehicle's influence value belonging to the $M$th strain sensor ($M > N$) and $x_N$ is the position of the $N$th vehicle when the bridge strain reaches the maximum. Based on Equation (13), the inverse influence line equation aimed at determining the GVW of multiple vehicles is written as:
$$\mathbf{W} = \boldsymbol{\varepsilon}\,\mathbf{I}^{g}. \tag{14}$$
It is noteworthy that the influence value matrix $\mathbf{I}$ is not a square matrix, which means it only has a pseudo-inverse instead of a regular inverse. The pseudo-inverse of $\mathbf{I}$ is denoted as $\mathbf{I}^{g}$, satisfying $\mathbf{I}\mathbf{I}^{g}\mathbf{I} = \mathbf{I}$.
The influence value I(x) of the vehicles is unknown without the position information, x, of the vehicles. Vehicles do not always simultaneously pass the bridge cross-section where I reaches its maximum; hence, Equation (12) is ineffective in the multiple-vehicle situation. This is why handling the presence of multiple vehicles is still one of the main challenges faced by BWIM technology, as Yu et al. [16] stated. Fortunately, in this paper, the position of vehicles can be quantitatively identified by the deep-learning-based computer vision technique, which means that the influence values of every vehicle driving on the bridge at every moment are available. The multiple-vehicle problem is thus solved.
In addition, Equation (14) is overdetermined, as the equations outnumber the unknowns (M > N). The redundant information in the overdetermined equation helps to reduce the GVW recognition error caused by inaccurate vehicle positions or influence line calibration.
An overdetermined system, however, generally has no exact solution. According to Lawson and Hanson [38], the method of ordinary least squares can be used to find an approximate solution to overdetermined systems. For the equation $\boldsymbol{\varepsilon} = \mathbf{W}\mathbf{I}$, the least squares formula is obtained from the problem:
$$\min_{\mathbf{W}} \left\| \boldsymbol{\varepsilon} - \mathbf{W}\mathbf{I} \right\|^{2}, \tag{15}$$
the solution of which can be written as the normal equation:
$$\mathbf{W} = \boldsymbol{\varepsilon}\,\mathbf{I}^{T}\left(\mathbf{I}\mathbf{I}^{T}\right)^{-1}. \tag{16}$$
Then, the GVW results, $\mathbf{W}$, are calculated with better accuracy. It is noteworthy that the approach of solving the overdetermined equation is effective for both one-vehicle and multiple-vehicle scenarios, which helps to reduce the complexity of calculating the GVW in engineering practice. Furthermore, considering that the GVW recognition results of different strain sensors might have different accuracy, the weighted least squares method is adopted to reduce the error caused by outlying values. The method is expressed as:
$$\mathbf{W} = \boldsymbol{\varepsilon}\,\mathbf{w}\,\mathbf{I}^{T}\left(\mathbf{I}\,\mathbf{w}\,\mathbf{I}^{T}\right)^{-1}, \tag{17}$$
where $\mathbf{w}$ is the diagonal weight matrix, which can be calculated using $w_{ii} = 1/\sigma_i^{2}$, in which $\sigma_i^{2}$ is the variance of the GVW recognition results of the $i$th sensor.
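The ordinary and weighted least-squares solutions can be sketched in pure Python by forming and solving the normal equations $(\mathbf{I}\mathbf{w}\mathbf{I}^{T})\mathbf{W}^{T} = \mathbf{I}\mathbf{w}\boldsymbol{\varepsilon}^{T}$, with $\mathbf{w}$ equal to the identity in the unweighted case. The function names, the data layout (`influence[n][m]` is the influence value of vehicle n at sensor m) and the numbers are assumptions for illustration; a production implementation would more likely rely on a numerical library.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: vehicles cannot be separated")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        rest = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - rest) / m[r][r]
    return x

def estimate_gvw(strains, influence, sensor_variances=None):
    """(Weighted) least-squares GVW estimate for N vehicles from M sensors.

    strains: M peak static strains, one per sensor.
    influence: N rows of M influence values (vehicle n at sensor m).
    sensor_variances: optional M variances; weights are 1 / variance.
    """
    n_sensors = len(strains)
    if any(len(row) != n_sensors for row in influence):
        raise ValueError("each vehicle needs one influence value per sensor")
    if sensor_variances is None:
        weights = [1.0] * n_sensors
    else:
        weights = [1.0 / v for v in sensor_variances]
    n_vehicles = len(influence)
    gram = [[sum(weights[m] * influence[i][m] * influence[j][m]
                 for m in range(n_sensors))
             for j in range(n_vehicles)]
            for i in range(n_vehicles)]
    rhs = [sum(weights[m] * influence[i][m] * strains[m]
               for m in range(n_sensors))
           for i in range(n_vehicles)]
    return solve_linear(gram, rhs)

# Hypothetical case: two side-by-side vehicles observed by three sensors.
influence_matrix = [[2.0, 1.0, 0.5],
                    [0.5, 1.5, 2.0]]
measured = [70.0, 60.0, 55.0]
gvw_ols = estimate_gvw(measured, influence_matrix)
gvw_wls = estimate_gvw(measured, influence_matrix, sensor_variances=[1.0, 4.0, 9.0])
```

Because the same routine handles one vehicle (a 1 by 1 system) and several, a single code path can serve both traffic scenarios, in line with the remark above on reduced complexity.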

Traffic Load Monitoring Framework
Combining the two sensing techniques presented above, the overall data integration framework proposed in this paper can be described as follows.
In the part of strain sensing, since the influence theory is a static mechanics concept, a local regression algorithm named LOWESS is used to extract the static component from the dynamic bridge strain response induced by vehicles.Then, calibration field tests are conducted to obtain the traffic lane influence line of the target bridge with the static strain induced by vehicles.
In the part of visual sensing, after its training, the Mask-RCNN algorithm is used to recognize vehicles in every video frame and pick up vehicle information such as position, type and size etc.
Finally, by combining the calibrated influence line, the obtained static bridge strain and the vehicle position, gross vehicle weight (GVW) can be calculated regardless of the presence of multiple vehicles.The whole procedure is summarized in Figure 5. where w is the diagonal weight matrix and can be calculated using wii = 1/σi 2 , in which σi is the variance of the GVW recognition results of the i th sensor.

Traffic Load Monitoring Framework
Combining the two sensing techniques presented above, the overall data integration framework proposed in this paper can be described as follows.
In the strain sensing part, since influence line theory is a static mechanics concept, a local regression algorithm named LOWESS is used to extract the static component from the dynamic bridge strain response induced by vehicles. Then, calibration field tests are conducted to obtain the traffic lane influence lines of the target bridge from the static strain induced by vehicles.
In the visual sensing part, the trained Mask R-CNN algorithm is used to recognize vehicles in every video frame and to extract vehicle information such as position, type and size.
Finally, by combining the calibrated influence line, the obtained static bridge strain and the vehicle position, the gross vehicle weight (GVW) can be calculated regardless of the presence of multiple vehicles. The whole procedure is summarized in Figure 5.
It is worth mentioning that vehicle type and size are not used in the vehicle weight recognition task in this paper, but they are of close interest to traffic and bridge management departments. For example, oversized trucks are often prohibited from driving on some bridges; therefore, recognizing the type and size of such vehicles and sounding an alarm automatically are of significance in this case. As for how to achieve this recognition, the vehicle type can be output directly by Mask R-CNN, owing to its feature-based classification. The vehicle size, namely the length, width and height of the vehicle, can be recognized through coordinate transformation after the back and the side of the vehicle are segmented. Since this paper mainly focuses on estimating the vehicle weight in multiple-vehicle scenarios, further details of the size recognition are omitted for conciseness.
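The static strain extraction step of the framework can be sketched as below. This is a simplified, NumPy-only LOWESS (local linear regression with tricube weights, no robustness iterations), written for illustration under assumed parameters; the paper does not state its LOWESS settings, and the synthetic signal, the `frac` value and the function name `lowess` are hypothetical.

```python
import numpy as np

def lowess(t, y, frac=0.1):
    """Simplified LOWESS: tricube-weighted local linear fit at every sample.

    frac is the fraction of samples used in each local fit (assumed value).
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(t)
    k = max(5, int(np.ceil(frac * n)))  # neighbours per local fit
    smoothed = np.empty(n)
    for i in range(n):
        d = np.abs(t - t[i])
        idx = np.argsort(d)[:k]               # k nearest samples
        h = d[idx].max() + 1e-12              # local bandwidth
        w = (1.0 - (d[idx] / h) ** 3) ** 3    # tricube weights
        # np.polyfit applies w to unsquared residuals, hence the square root.
        slope, intercept = np.polyfit(t[idx], y[idx], 1, w=np.sqrt(w))
        smoothed[i] = slope * t[i] + intercept
    return smoothed

# Synthetic example: a slow "static" pulse plus a fast "dynamic" oscillation.
t = np.linspace(0.0, 4.0, 400)
static = 30.0 * np.exp(-((t - 2.0) / 0.5) ** 2)
dynamic = 3.0 * np.sin(2.0 * np.pi * 5.0 * t)
strain = static + dynamic

static_est = lowess(t, strain, frac=0.1)
rms_raw = np.sqrt(np.mean((strain - static) ** 2))
rms_smooth = np.sqrt(np.mean((static_est - static) ** 2))
```

The window fraction trades off vibration removal against flattening of the static peak, so in practice it would need to be tuned to the bridge's dominant vibration period and the vehicle passage time.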

Instrumentation and Test Setup
Field tests were conducted on an existing bridge to verify the proposed traffic monitoring methodology. The tested bridge, referred to as Fuchang Overpass (Figure 6), is located on the Baoding-Fuping Highway in Hebei province, China. It is a typical prestressed continuous girder bridge that has been in operation for many years, with a total length of 133 m (32 m + 37 m + 32 m + 32 m). The bridge has three traffic lanes, each 3.75 m wide. Lane 3, the emergency lane, was ignored in this research, since vehicles are prohibited from driving on it under normal conditions. The first span of the bridge is instrumented with a structural health monitoring (SHM) system comprising a pavement-based WIM system, 14 resistance-type strain sensors (named 'S1-1' to 'S3-4') and a webcam. The SHM system includes many kinds of sensors (strain gauges, thermometers, accelerometers, etc.) for general monitoring purposes, but only the strain sensors were employed in this research. The normal strain data in the field tests were recorded by the strain sensor network placed on three cross-sections, at the 1/4, 1/2 and 3/4 points of the span, numbered as Sections 1, 2 and 3, respectively. All of this information is shown in Figure 6.
The acquired strain data and video are stored on an online server for long-term, online monitoring of the bridge structure. For comparison, the vehicle weight and velocity measured by the pavement-based WIM system are used as the reference to evaluate the accuracy of the proposed methodology. In addition, the influence lines of traffic lanes 1 and 2 of the bridge structure were obtained in the field calibration tests according to the procedure in Section 3.1. Figure 7 illustrates the calibrated influence lines of the two traffic lanes for all 14 strain sensors on instrumented cross-sections 1, 2 and 3. The vertical axes of the influence line plots show the influence value (IV, unit: µε/ton). The calibrated influence lines directly reveal the quantitative relationship between the GVW and the strain data collected by the different strain sensors. As the influence lines significantly outnumber the vehicles driving on the bridge and each line is different, they also provide the foundation for the GVW recognition algorithm using redundant measurements. Without this purpose, calibrating the influence lines of multiple strain sensors would be unnecessary.

Vehicle Trajectory Recognition
The recognition of vehicle trajectory plays a vital role in solving the multiple-vehicle problem. The influence values of multiple vehicles, which are essential for forming the inverse influence line equation used to estimate the gross vehicle weight, cannot be obtained without knowing the vehicles' real-time positions. As mentioned above, the computer vision technique makes it feasible to locate vehicles in every video frame so that the vehicle trajectory can be recognized. Figure 8 depicts vehicle trajectories tracked by this method.
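How a tracked trajectory feeds the weight calculation can be sketched as follows: the per-frame positions give the vehicle velocity, and each position is mapped to an influence value by interpolating the calibrated influence line. The trajectory samples, the triangular influence line and the function names are hypothetical placeholders, not data or code from the paper.

```python
import numpy as np

def estimate_speed(times_s, positions_m):
    """Vehicle speed (m/s) as the slope of a straight-line fit to the trajectory."""
    slope, _intercept = np.polyfit(times_s, positions_m, 1)
    return slope

def influence_at(position_m, il_positions_m, il_values):
    """Influence value at a vehicle position by linear interpolation.

    Positions outside the calibrated range take the end values of the line.
    """
    return np.interp(position_m, il_positions_m, il_values)

# Hypothetical trajectory: longitudinal position on the deck per video timestamp.
times_s = np.array([0.0, 0.5, 1.0, 1.5])
positions_m = np.array([0.0, 10.0, 20.0, 30.0])

# Hypothetical influence line of one sensor on a 32 m span (unit: microstrain/ton).
il_positions_m = np.array([0.0, 16.0, 32.0])
il_values = np.array([0.0, 2.0, 0.0])

speed = estimate_speed(times_s, positions_m)
iv = influence_at(positions_m, il_positions_m, il_values)
```

Repeating the lookup for every vehicle and every sensor at a given instant fills one column per sensor of the influence matrix I used in the inverse equation.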

Identification Results for Complex Scenarios
For simple traffic scenarios, such as a single vehicle passing the bridge, the recognition of vehicle type, velocity and axle number has already been achieved with reasonable accuracy [30]. This paper focuses on a more challenging problem that remains to be solved: applying the GVW recognition method to the multiple-vehicle problem. In general, there are three elementary scenarios of vehicle distribution: (i) a single vehicle, as seen in Figure 9a; (ii) one-by-one vehicles in the same lane, as seen in Figure 9c; and (iii) side-by-side vehicles in different lanes, as shown in Figure 9e.


Scenario: One-By-One Vehicles
The highway bridge chosen for the field tests has span lengths of 32 + 37 + 32 + 32 = 133 m. One-by-one vehicles often drive on the bridge simultaneously. However, a sizeable safety distance of no less than 50 m between the front and rear vehicles is required when driving on highways in China. That explains why two peaks can be observed clearly in the bridge strain signal caused by moving vehicles, as Figure 9d illustrates. Moreover, this requirement has the advantage that the front vehicle adds little to the bridge strain caused by the rear one. A 50 m distance means that when the rear vehicle enters the first span of the bridge, the front vehicle has already reached the third or fourth span. According to the influence lines in Figure 7, the influence values of the third and fourth spans are far smaller than those of the first span. In conclusion, the one-by-one vehicles scenario in this research can be simplified to the single vehicle scenario, and the GVW of each vehicle can be calculated using the aforementioned Equation (12) with the corresponding peak of the strain signal.
To verify the accuracy of this simplification, the GVWs of the two trucks in Figure 9c are calculated, and the process is listed in Table 1. The errors of the recognition results, also listed in Table 1, are acceptable.
Table 1. GVW recognition process in a one-by-one vehicles scenario. (Columns: Vehicle Name, Truck1, Truck2.)

Scenario: Side-By-Side Vehicles
A challenge arises when two vehicles are driving side by side. In this scenario, the single strain signal peak in Figure 9f comprises two indistinguishable vehicles, making the above GVW recognition methods ineffective. According to Yu et al. [10], the identification of multiple-vehicle presence is still one of the main challenges faced by the BWIM technique.
In order to solve this problem, it is necessary to integrate the strain data of multiple strain sensors so that the previous Equation (14) can be used. Theoretically, two sensors are enough to determine two unknown GVWs according to linear algebra. However, the GVW results might vary considerably due to the inevitable measurement errors in field tests. One practical way to mitigate this variation is to add strain sensors so that the constraints outnumber the unknowns, and to use least squares to solve the overdetermined problem, as stated in Section 3.3.
Taking the two trucks in Figure 9e as an example, a comprehensive explanation is given as follows. At the moment shown in the figure, the distances between the two trucks and the start line of the bridge are 16.1 m and 17.8 m, respectively, as given by the visual sensing technique. According to these distances and the calibrated influence lines, the corresponding influence values of the trucks can be found, as listed in Table 2. The strain values at that moment are listed as well. Based on the information of sensors 'S2-2' and 'S2-3' in Table 2, the formula for calculating the GVW of the side-by-side vehicles is written as follows:

$$\begin{bmatrix} \varepsilon_{\text{S2-2}} & \varepsilon_{\text{S2-3}} \end{bmatrix} = \begin{bmatrix} W_1 & W_2 \end{bmatrix} \begin{bmatrix} I_{1,\text{S2-2}} & I_{1,\text{S2-3}} \\ I_{2,\text{S2-2}} & I_{2,\text{S2-3}} \end{bmatrix}$$

where $W_1$ and $W_2$ are the GVWs of truck1 and truck2 in Figure 9e, and the strains and influence values take the values listed in Table 2. The recognition results are $W_1$ = −222.62 t and $W_2$ = 425.52 t, which are clearly wrong. Then, the recognition equation is rewritten using the information of four sensors, which gives four equations in the same two unknowns. The least squares method is used to solve this overdetermined equation, and the results are $W_1$ = 57.06 t and $W_2$ = 52.63 t. The GVWs measured by the pavement-based WIM system are 51.12 t and 49.12 t. The error between the BWIM and the pavement-based WIM is acceptable (relative errors of about 11.6% and 7.1%), which means that the problem of two vehicles driving side by side is successfully solved.
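The behaviour described above, where two sensors give an unusable answer and four sensors give a reasonable one, can be reproduced with a small numerical sketch. All numbers below are hypothetical and chosen only to make the two-sensor system nearly singular; they are not the Table 2 values.

```python
import numpy as np

# Hypothetical influence values (rows: truck1, truck2; columns: sensors).
# The first two sensors respond almost identically to both trucks.
I4 = np.array([[1.00, 1.05, 1.60, 0.40],
               [0.95, 1.00, 0.50, 1.50]])
W_true = np.array([50.0, 50.0])                 # assumed true GVWs (t)
noise = np.array([1.0, -1.0, 1.0, -1.0])        # assumed measurement error
eps4 = W_true @ I4 + noise                      # "measured" strains

# Two sensors: exactly determined, but the 2x2 system is nearly singular.
I2, eps2 = I4[:, :2], eps4[:2]
W_two = np.linalg.solve(I2.T, eps2)

# Four sensors: overdetermined, solved in the least squares sense.
W_four, *_ = np.linalg.lstsq(I4.T, eps4, rcond=None)

cond_two = np.linalg.cond(I2)
cond_four = np.linalg.cond(I4)
print("two sensors :", W_two, "condition number", cond_two)
print("four sensors:", W_four, "condition number", cond_four)
```

In this sketch the same small strain error is amplified enormously by the ill-conditioned two-sensor system, while the additional, more distinct sensors bring the estimate back close to the assumed weights. This suggests that the benefit comes as much from the diversity of the influence values as from the number of sensors.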
Table 3 gives detailed identification results for a total of 38 vehicles in various traffic scenarios. The GVW of each vehicle was identified using some or all of the 14 available strain sensors in order to compare and quantitatively optimize the number and location of the sensors. Overall, the following conclusions can be drawn from Table 3:
i. The recognition results of the GVW are of acceptable accuracy when using data from fewer than eight strain sensors.
ii. Errors in the one-by-one and side-by-side vehicle scenarios are slightly larger than in the single vehicle scenario. The difference is reasonable because the vehicle positions are essential for obtaining the influence values when recognizing the GVW in complicated traffic scenarios, and positioning error is inevitable in the process of coordinate transformation.
iii. An interesting phenomenon is the obviously larger error when using the data from all 14 strain sensors. The reason is discussed in detail below.
Plots of the GVW results for different numbers of strain sensors are shown in Figure 10, in which each point corresponds to a vehicle. In these figures, the further a point is from the baseline, the larger the error. Statistics of the relative errors with respect to the results of the pavement-based WIM system are listed in Table 4. The statistics show that, owing to the introduction of more redundant information, the more strain sensors of the sensor network are used, the smaller the error is.
Finally, it is necessary to highlight the reason behind the large error caused by using all 14 strain sensors. Compared with the scenario using eight sensors, the extra six sensors are mounted close to the neutral axis of the bridge cross-section. According to Euler-Bernoulli beam theory [39], the closer a strain sensor is to the neutral axis, the smaller its strain value, which makes the relative error larger. Figure 11 compares the time-history curves and the GVW recognition results of two strain sensors, S2-1 and S2-6, whose distances to the neutral axis are 100 mm and 800 mm, respectively. The errors of sensor 'S2-1' are obviously more significant than those of sensor 'S2-6'. To avoid this problem, strain sensors for BWIM purposes should be installed far from the neutral axis of the section for higher accuracy.
Previous studies also point out that road roughness and vehicle velocity affect the GVW recognition accuracy because of vehicle-bridge coupling vibration: the faster the vehicle drives, the larger the GVW recognition errors are [40]. However, according to the obtained results, this issue is not significant in this research. On the one hand, the vehicle-bridge coupling vibration effects are almost eliminated by the preceding LOWESS algorithm; on the other hand, the highway road surface is quite smooth, so severe vehicle-bridge coupling vibration is not excited even when vehicles drive at high velocity. Figure 12 indicates that vehicle velocity does not induce GVW recognition errors for sensors S2-3 and S2-6, as no obvious pattern can be found in the scatter plot.
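Returning to the neutral-axis observation above, the argument can be made explicit with a short derivation. This is an idealized sketch: the symbols M (bending moment at the section), E (elastic modulus), I_z (second moment of area), y (distance from the neutral axis) and δ (absolute measurement error, assumed equal for both sensors) are introduced here for illustration and do not appear in the paper.

```latex
% Euler-Bernoulli bending strain at distance y from the neutral axis
\varepsilon(y) = \frac{M\,y}{E\,I_z}
\qquad\Longrightarrow\qquad
\left|\frac{\delta}{\varepsilon(y)}\right|
  = \frac{|\delta|\,E\,I_z}{|M|\,|y|} \;\propto\; \frac{1}{|y|}

% Same section, same moment, same absolute error: sensors at 100 mm and 800 mm
\frac{\left|\delta/\varepsilon(100\ \mathrm{mm})\right|}
     {\left|\delta/\varepsilon(800\ \mathrm{mm})\right|}
  = \frac{800}{100} = 8
```

Under these assumptions, a sensor 100 mm from the neutral axis would carry roughly eight times the relative strain error of one at 800 mm, which is consistent with the qualitative difference between S2-1 and S2-6 reported for Figure 11.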

Conclusions
With special focus on complicated traffic scenarios, this paper presents a traffic load identification methodology using multiple strain sensors and a single camera for short and medium span bridges. Systematic field tests were performed on a concrete box-girder bridge to investigate the reliability and accuracy of the proposed method in practice. Based on the results, the following conclusions are drawn:
1. The deep learning based computer vision technique is a practical tool for extracting key parameters from traffic video in real time, such as the position, size, axle number and type of vehicles passing over the bridge. Moreover, it is equally important to identify the traffic mode of a multi-vehicle scenario as one-by-one, side-by-side or mixed.
2. By utilizing the redundant strain measurements, the proposed least squares based identification method is capable of (i) distinguishing complicated traffic modes such as side-by-side vehicles, which are theoretically unidentifiable with a single measurement, and (ii) solving the overdetermined inverse influence equations effectively, and hence reducing the GVW recognition errors.
3. Provided that the vehicle parameters (especially positions) are identified and available, the proposed framework successfully recognizes the vehicle weight in spite of the presence of one-by-one and side-by-side vehicles, with an average weighing error of less than 8%. Thus, the elementary scenarios of the multiple-vehicle problem in BWIM research are solved with an overall improvement in cost and accuracy.
4. Using strain sensors installed at locations with larger response results in smaller recognition errors of vehicle weight. It is suggested that strain sensors for BWIM purposes be installed far from the neutral axis of the cross-section for the sake of higher accuracy.

Figure 1. Vehicle recognition results. (a) Before entering the bridge deck; (b) after entering the bridge deck.

Figure 3. Procedure of the vehicle induced static strain extraction.

Figure 4. Diagram of the influence line calibration method.

Figure 8. Vehicle trajectory recognition. (a) Truck trajectories in image, (b) truck trajectories on bridge deck, (c) car trajectory in image and (d) car trajectory on bridge deck.

Figure 9. Scenarios of vehicle distribution on bridge. (a) Single vehicle, (b) strain signal of single vehicle, (c) one-by-one vehicles, (d) strain signal of one-by-one vehicles, (e) side-by-side vehicles and (f) strain signal of side-by-side vehicles.

Figure 10. GVW recognition results for different numbers of strain sensors. (a) GVW results using four sensors, (b) GVW results using six sensors, (c) GVW results using eight sensors and (d) GVW results using 14 sensors.

Figure 11. Comparison between two sensors. (a) Bridge strain time-history curves; (b) GVW recognition results.

Figure 12. Correlation of velocity and relative GVW errors.

Table 2. GVW recognition information in side-by-side vehicles scenario.

Table 3. Recognition results of the GVW.

Table 4. Statistics of the relative errors compared with pavement-based WIM.