Automatic Bridge Design Parameter Extraction for Scan-to-BIM

Abstract: Building information modeling (BIM), which can efficiently manage the life cycle of structures, has been increasingly applied in the construction industry. However, it is difficult to implement BIM for existing structures, due to the differences between the design and as-built conditions. Point cloud data (PCD) can be obtained through the scan-to-BIM process, which builds a model based on the current state of the structure. The scan-to-BIM process is complicated for bridge structures and consumes significant time and resources. Therefore, this study developed a system to extract bridge design parameters automatically to reduce the time and resources for the scan-to-BIM process. The proposed automatic bridge design parameter extraction is performed in three steps: (1) noise reduction, (2) 3D transformation, and (3) parameter extraction. The validation test was conducted on the Osong test track fifth bridge in Nojang-ri, Jeondong-myeon, Yeongi-gun, Chungcheongnam-do, Korea. The system developed in this study successfully extracted the design parameters of the bridge from the PCD automatically, resulting in an error rate of about 0.8%.
Author Contributions: [...] and J.J.P.; methodology, J.H.L.; software, J.H.L.; validation, J.H.L., J.J.P.; formal analysis, J.H.L.; investigation, H.Y. and J.H.L.; resources, J.J.P.; data curation, J.J.P.; writing (original preparation), J.H.L.; writing (review and editing), H.Y.; supervision, [...]


Introduction
The construction industry has been rapidly developing through its synergy with information technology (IT). An article entitled "The Vision for Civil Engineers in 2025," published by the American Society of Civil Engineers (ASCE) in the US in 2007, predicted that civil engineers would be using sensors and IT at construction sites in 2025 [1]. As of 2020, most of these predictions have been realized through emerging technologies that improve existing processes, such as predicting the health of structures using drones and imaging equipment, identifying cracks using deep learning techniques, automatically recognizing structural components, and simulating construction projects using extended reality (XR) [2][3][4][5][6][7].
Building information modeling (BIM) is also one of the core technologies in construction IT. It entails the management of data, such as design, construction, and maintenance information, throughout the life cycle of a structure by integrating the information into digital 3D models. The most well-known book on BIM, BIM Handbook: A Guide to Building Information Modeling for Owners, Managers, Designers, Engineers, and Contractors, stressed that BIM was a requirement for the future of the construction industry [8]. In addition, governments and companies have realized the importance of BIM and are making substantial efforts to promote it. In the US, integrated project delivery (IPD) has become mandatory for projects with construction cost estimates greater than US$500 million, and several BIM guidelines and standards have been published. In the EU, the BIM Handbook for the EU Public Sector was published through the EU BIM Task Group to encourage widespread BIM adoption [9]. In Korea, BIM guidelines were provided in the BIM Application Guide for Architecture.
Figure 1 shows the flowchart of the automatic bridge design parameter extraction system proposed in this study. First, the point cloud data (PCD) of a bridge is collected using LiDAR, and error-causing noise is removed. Next, a 3D rotational transformation is applied to the denoised PCD to facilitate design parameter extraction. The x-, y-, and z-axes are set as the width, length, and height directions, respectively, and the PCD is rotated so that the bridge is parallel to each axis. Finally, the design parameters are extracted using models constructed for each structure type, which accounts for the differences in bridge shape between structure types. The extracted design parameters can then be used for BIM through the application of the parametric BIM library.

Noise Reduction
LiDAR emits laser pulses onto an object and analyzes the reflected pulses to obtain the 3D coordinates and RGB data of the object. However, during this process, noise arises due to factors such as equipment errors, multiple reflections, birds, and insects. If the PCD containing noise is utilized, significant errors will be generated during the design parameter extraction process. Therefore, the noise must be removed to ensure the accuracy of the results. The proposed noise-removal process classifies the PCD required for the design parameter extraction as inliers and the noise as outliers.
Various filters are available for noise removal. This study adopted a statistical outlier removal filter that is suitable for removing noise from PCD through the application of the k-nearest neighbors (kNN) algorithm [27]. The kNN algorithm calculates the statistics of the Euclidean distance for each point and its neighboring k points. Next, the data are classified by setting a threshold based on the standard deviation. As seen in Figure 2, the points not exceeding the threshold are classified as inliers, while those exceeding the threshold are classified as outliers and, accordingly, removed. In this study, the number of neighboring k points was set as four, and the threshold value was adjusted according to the PCD density, which depends on the LiDAR equipment.
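The filtering step described above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the study: the function name statistical_outlier_removal, the std_ratio parameter, and the "global mean plus std_ratio standard deviations" threshold are assumptions, and the brute-force distance matrix is only practical for small demonstration clouds.

```python
import numpy as np

def statistical_outlier_removal(points, k=4, std_ratio=1.0):
    """Return a boolean inlier mask for an (N, 3) array of points.

    k=4 follows the text; std_ratio is an assumed tuning knob that would
    be adjusted to the PCD density.
    """
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances (O(N^2) memory, demo-sized clouds only).
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # After sorting each row, column 0 is the point itself (distance 0).
    knn_dist = np.sort(dist, axis=1)[:, 1:k + 1]
    mean_knn = knn_dist.mean(axis=1)
    # Threshold based on the standard deviation of the mean kNN distances.
    threshold = mean_knn.mean() + std_ratio * mean_knn.std()
    return mean_knn <= threshold
```

Applying the returned mask, as in points[mask], keeps the inliers. For clouds with millions of points, a spatial index such as a k-d tree would replace the dense distance matrix.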

3D Transformation
When PCD is collected using LiDAR, the z-axis is typically consistent because it is always perpendicular to the surface. However, as seen in Figure 3, this is not the case for the x- and y-axes. When the coordinate axes are not consistent, additional data processing is required; in this study, this was done by applying a 3D rotational transformation to the bridge PCD.

Plane Prediction Using the m-Estimator Sample Consensus (MSAC)
In this study, the planar geometric shapes of the bridge were used to set the reference vector before applying the 3D rotational transformation. Bridge components, such as the slab and girders, contain flat elements. Therefore, by locating a planar feature and setting the reference vector on this plane, a consistent reference can be obtained.
The m-estimator sample consensus (MSAC) algorithm was applied to locate the planar element in the PCD through the steps shown in Figure 4 [28]. MSAC is an improved variant of random sample consensus (RANSAC), in which sample data are randomly extracted and a consensus is determined [29]. In this study, the consensus was based on a user-determined model, and the plane generated by sampling three random points of the PCD was taken as this model. The points within the selected threshold distance of the plane were recognized as inliers and counted. These steps were repeated, and the plane with the largest number of inliers was selected; Figure 5 shows the results of this process.
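A compact sketch of MSAC-style plane fitting is given here for illustration. It assumes the usual MSAC scoring, in which each point contributes its squared distance to the candidate plane, capped at the squared threshold, and the candidate with the lowest total cost wins. The function name fit_plane_msac and the default threshold, iteration count, and seed are hypothetical and are not taken from the study.

```python
import numpy as np

def fit_plane_msac(points, threshold=0.05, iterations=200, seed=0):
    """Fit a plane n.p + d = 0 to an (N, 3) array with an MSAC-style loop.

    Returns the unit normal, the offset d, and a boolean inlier mask.
    """
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    best_cost, best_normal, best_d = np.inf, None, None
    for _ in range(iterations):
        # Sample three distinct points and build the plane through them.
        p1, p2, p3 = points[rng.choice(len(points), size=3, replace=False)]
        normal = np.cross(p2 - p1, p3 - p1)
        norm = np.linalg.norm(normal)
        if norm < 1e-12:
            continue  # collinear sample, no unique plane
        normal = normal / norm
        d = -normal @ p1
        residual = np.abs(points @ normal + d)
        # MSAC cost: squared residuals truncated at the threshold.
        cost = (np.minimum(residual, threshold) ** 2).sum()
        if cost < best_cost:
            best_cost, best_normal, best_d = cost, normal, d
    if best_normal is None:
        raise ValueError("no non-degenerate sample was found")
    inliers = np.abs(points @ best_normal + best_d) < threshold
    return best_normal, best_d, inliers
```

Because the samples are random, repeated calls with different seeds return slightly different planes, which is the run-to-run variation that the validation sections average out.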

Plane-Forming-Vector Extraction
Once a plane in the PCD is identified, a reference vector is set on that plane. Figure 6 illustrates the steps of this process. First, the points with the largest x, smallest x, and largest y values among the elements of the PCD plane are indexed and designated $P_1$, $P_2$, and $P_3$, respectively.
Figure 6. Reference vector extraction.
The vectors $\overrightarrow{P_1P_3}$ and $\overrightarrow{P_2P_3}$ are obtained by connecting $P_3$ to $P_1$ and $P_2$, respectively.
The two vectors are then compared, and the longer one is set as the longitudinal direction vector $\vec{y}$ of the bridge:
$\vec{y} = (P_x, P_y, P_z)$
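The selection of the three extreme points and the longer connecting vector can be sketched as follows. The function name longitudinal_vector is hypothetical, and the sign convention (vectors pointing toward $P_3$) is an assumption, since only the direction of the line matters for the subsequent alignment.

```python
import numpy as np

def longitudinal_vector(plane_points):
    """Return the longer of the vectors P1->P3 and P2->P3 for an (N, 3) array."""
    plane_points = np.asarray(plane_points, dtype=float)
    p1 = plane_points[np.argmax(plane_points[:, 0])]  # largest x
    p2 = plane_points[np.argmin(plane_points[:, 0])]  # smallest x
    p3 = plane_points[np.argmax(plane_points[:, 1])]  # largest y
    v1 = p3 - p1
    v2 = p3 - p2
    # The longer edge is taken as the longitudinal direction of the bridge.
    return v1 if np.linalg.norm(v1) >= np.linalg.norm(v2) else v2
```

For a rectangular deck that is rotated in plan, the three extreme points are corners of the rectangle, so the longer vector runs along the long edge. A deck already aligned with the axes produces ties in the extreme values, which this sketch does not handle.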

3D Rotation Matrix
A vector representing the axis of rotation and a rotation angle are required to align the bridge PCD with the coordinate axes, using the longitudinal direction vector $\vec{y}$ as the reference. The axis of rotation is the normal vector of the first predicted plane; therefore, it can be easily obtained.
The rotational axis vector is denoted $\vec{V}$, and the components of the direction vector $\vec{y}$ are $P_x$, $P_y$, and $P_z$.
The rotation angle θ is obtained from the dot product of the longitudinal direction vector $\vec{y}$ and the y-axis unit vector (0, 1, 0). $\vec{V}$ and θ are shown in Figure 7 [30]. A 3 × 3 matrix is required to apply the rotational transformation to the 3D points in the PCD. The rotation matrix R is the axis-angle rotation matrix [31], which is obtained by substituting the rotation axis $\vec{V}$ and the rotation angle θ obtained above. The rotational transformation is then applied by multiplying the PCD by the rotation matrix R.
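For reference, the angle and the axis-angle rotation matrix can be written out explicitly. The following is a sketch of the standard Rodrigues form, assuming that $\vec{V} = (V_x, V_y, V_z)$ has been normalized to unit length and using the shorthand $c = \cos\theta$ and $s = \sin\theta$. The component names $V_x$, $V_y$, and $V_z$ are introduced here for illustration, and the arrangement printed in [31] may differ.

```latex
\cos\theta = \frac{\vec{y}\cdot(0,1,0)}{\lVert\vec{y}\rVert}
           = \frac{P_y}{\sqrt{P_x^2 + P_y^2 + P_z^2}}

R = \begin{bmatrix}
c + V_x^2(1-c)       & V_x V_y(1-c) - V_z s & V_x V_z(1-c) + V_y s \\
V_y V_x(1-c) + V_z s & c + V_y^2(1-c)       & V_y V_z(1-c) - V_x s \\
V_z V_x(1-c) - V_y s & V_z V_y(1-c) + V_x s & c + V_z^2(1-c)
\end{bmatrix}
```

With points stored as column vectors, the rotated point is $R\,p$. The dot product only gives the magnitude of θ, so the direction of rotation still has to be chosen so that $\vec{y}$ moves toward the y-axis.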
When the z-axis of the PCD obtained from the LiDAR coincides with that of the model, the PCD can be aligned with the model axes by applying the transformation once. However, for PCD in which none of the x-, y-, and z-axes is aligned with the model axes, the aligned PCD can be obtained by applying the rotational transformation twice. In the rotated bridge PCD, the width, length, and height of the bridge are parallel to the x-, y-, and z-axes, respectively, so the design parameters can be extracted easily. Figure 8 shows the rotational transformation of the bridge PCD.
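A single alignment pass of this kind can be sketched in Python as follows. The function names rotation_about_axis and align_to_y_axis are hypothetical, and the sign correction based on the cross product is an addition made here so that the sketch rotates in the correct direction regardless of which way the plane normal points; the source does not describe how the sign is chosen.

```python
import numpy as np

def rotation_about_axis(axis, theta):
    """Rodrigues rotation matrix for a rotation by theta about the given axis."""
    x, y, z = np.asarray(axis, dtype=float) / np.linalg.norm(axis)
    K = np.array([[0.0, -z, y],
                  [z, 0.0, -x],
                  [-y, x, 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def align_to_y_axis(points, y_vec, axis):
    """Rotate (N, 3) points about axis so that y_vec maps onto the y-axis.

    Assumes axis is perpendicular to y_vec and to the y-axis, as is the case
    when y_vec lies in a horizontal plane and axis is that plane's normal.
    """
    e_y = np.array([0.0, 1.0, 0.0])
    y_hat = np.asarray(y_vec, dtype=float) / np.linalg.norm(y_vec)
    unit_axis = np.asarray(axis, dtype=float) / np.linalg.norm(axis)
    # Unsigned angle from the dot product with (0, 1, 0).
    theta = np.arccos(np.clip(y_hat @ e_y, -1.0, 1.0))
    # Pick the rotation direction that moves y_vec toward the y-axis.
    if np.cross(y_hat, e_y) @ unit_axis < 0:
        theta = -theta
    R = rotation_about_axis(unit_axis, theta)
    return np.asarray(points, dtype=float) @ R.T  # row vectors, hence R.T
```

When two passes are needed, the same pattern would presumably be applied once to level the plane and once more to turn the bridge about the vertical axis.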

Design Parameter Extraction
When the planar components of the bridge PCD are repeatedly extracted by applying the MSAC algorithm used in the 3D transformation process, the PCD is decomposed into planes, as shown in Figure 9, in a pattern that runs from the largest plane to the smallest. Because each decomposed plane has coordinates, the design parameters can be extracted from the distances between the planes. However, bridge shapes and the design parameters requiring extraction differ by bridge type. Therefore, this study developed automatic design parameter extraction models for several common bridge types: beam girder, T-type girder, plate-type girder, steel box girder, and prestressed concrete (PSC) box girder.
The bridge type most widely used for railway bridges is the beam girder bridge, which has a more complex shape than other girder bridges. Further, extracting the design parameters using only a constant decomposition pattern would not produce satisfactory results because the beam members all have the same shape. Therefore, this study added an algorithm that extracts the design parameters from the beam members within a region of interest (ROI) that is manually specified in the area of the design parameter $H_2$. The design parameters that can be extracted from the beam girder bridge model are $H_1$, $H_2$, $H_3$, $W_1$, $W_2$, $W_3$, $W_4$, and $W_5$, as shown in Figure 10, and the length L.
T-type girder bridges have relatively simple shapes and are not currently widely used for railway bridges. Therefore, the model for this bridge type was developed to extract only the basic design parameters. Completing the T-type girder model based on LiDAR PCD is left for future work. The design parameters that can be extracted from the T-type girder bridge model are $H_1$, $H_2$, $H_3$, $W_1$, $W_2$, $W_3$, and $W_4$, as shown in Figure 11, and the length L.
Although plate girder bridges were a popular choice in past decades, they are now rarely designed because of the large amount of steel that they require. Therefore, the model for this bridge type was developed to extract only the basic design parameters. Completing the plate girder model based on LiDAR PCD is left for future work. The design parameters that can be extracted from the plate girder bridge model are $H_1$, $H_2$, $H_3$, $W_1$, $W_2$, $W_3$, and $W_4$, as shown in Figure 12, and the length L.
While steel box girder bridges have been commonly applied as roadway structures, they are currently rarely used for railway bridges. Therefore, the model for this bridge type was developed to extract only the basic design parameters. Completing the steel box girder model based on LiDAR PCD is left for future work. The design parameters that can be extracted from the steel box girder bridge model are $H_1$, $H_2$, $H_3$, $W_1$, $W_2$, and $W_3$, as shown in Figure 13, and the length L.
Given the cross-sectional shape of PSC box girder bridges, as shown in Figure 14, the shape inside the PSC box cannot be obtained from, or checked against, an in-service bridge. Therefore, the model for this bridge type was developed to extract only the basic design parameters. Completing the PSC box girder model based on LiDAR PCD is left for future work. The design parameters that can be extracted from the PSC box girder bridge model are $H_1$, $H_2$, $H_3$, $W_1$, $W_2$, and $W_3$, as shown in Figure 14, and the length L.
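The idea of reading parameters from the distances between decomposed planes can be illustrated with a simplified sketch. Instead of repeated MSAC fitting, it assumes the PCD is already axis-aligned and peels off horizontal layers from the z-coordinates alone, starting with the most populated layer, using a histogram. The function name peel_levels and the tolerance tol are hypothetical, and the mapping from layer spacings to specific parameters such as $H_1$ to $H_3$ depends on the bridge-type model and is not reproduced here.

```python
import numpy as np

def peel_levels(z, n_levels, tol):
    """Return up to n_levels plane heights, most populated layer first.

    z is a 1D array of z-coordinates from an axis-aligned PCD; tol is the
    assumed half-thickness of a layer.
    """
    z = np.asarray(z, dtype=float)
    levels = []
    for _ in range(n_levels):
        if z.size == 0:
            break
        # Bin width is at most tol, so a thin layer falls into few bins.
        n_bins = max(1, int(np.ceil((z.max() - z.min()) / tol)))
        counts, edges = np.histogram(z, bins=n_bins)
        i = int(np.argmax(counts))
        center = 0.5 * (edges[i] + edges[i + 1])
        member = np.abs(z - center) <= tol
        levels.append(float(np.median(z[member])))
        z = z[~member]  # remove this layer and look for the next one
    return levels
```

Sorting the returned heights and taking differences between consecutive values gives the vertical spacings between layers; width-type parameters would follow analogously from planes perpendicular to the x-axis.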

BIM Model-Based Validation Test Setup
A BIM model-based test was performed to validate the proposed method prior to the on-site field validation test. BIM data for each bridge type were generated with predetermined design parameters. Since the 3D model of the produced BIM is 3D mesh data, it was converted into point cloud data (PCD) by assigning points to the mesh surface, as shown in Figure 15. In addition, noise and data loss were added to simulate the process of obtaining PCD by LiDAR. Gaussian noise was applied within five times the width, length, and height of the bridge, as shown in Figure 16, and a certain percentage of the PCD was removed, as shown in Figure 17. System validation was conducted with two cases. Case 1 simulates an environment similar to LiDAR measurement for a beam girder bridge: the design parameters were extracted assuming 5% noise and data loss, which usually occur in on-site field LiDAR measurement. Case 2 extracts the design parameters while increasing the noise level and the data loss level in 1% increments (from 0% to 50%) for all bridge types. The predicted design parameters were extracted using the proposed method. To validate the system, the design parameters used to generate the BIM were compared with the predicted design parameters. To minimize the uncertainty in the MSAC algorithm, the design parameter extraction was performed 500 times and the predicted parameters were averaged.
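The degradation of a clean, BIM-derived cloud can be sketched as follows. The source does not specify how the noise points are distributed, so this sketch makes several assumptions: noise points are drawn from a Gaussian centered on the bridge, with a per-axis standard deviation equal to the bridge extent, and are clipped to a box five times the bridge size; the noise and loss levels are interpreted as fractions of the original point count. The function name simulate_scan_defects is hypothetical.

```python
import numpy as np

def simulate_scan_defects(points, noise_ratio, loss_ratio, rng):
    """Drop a fraction of an (N, 3) cloud and append Gaussian noise points."""
    points = np.asarray(points, dtype=float)
    n = len(points)
    center = points.mean(axis=0)
    extent = points.max(axis=0) - points.min(axis=0)
    # Data loss: keep a random subset of the original points.
    n_keep = n - int(round(loss_ratio * n))
    kept = points[rng.choice(n, size=n_keep, replace=False)]
    # Noise: Gaussian points around the bridge (assumed spread = extent),
    # clipped to a box five times the width, length, and height.
    n_noise = int(round(noise_ratio * n))
    noise = rng.normal(loc=center, scale=extent, size=(n_noise, 3))
    noise = np.clip(noise, center - 2.5 * extent, center + 2.5 * extent)
    return np.vstack([kept, noise])
```

Sweeping noise_ratio and loss_ratio from 0.00 to 0.50 in steps of 0.01 would mimic the Case 2 schedule described above.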

BIM Model-Based Validation Result and Discussion
Case 1 was conducted for the beam girder bridge, assuming a noise and data loss level (5%) that simulates the environment of LiDAR measurement. The predicted design parameters for Case 1 are shown in Figure 18 and Table 1. When 5% noise and data loss were applied, the average error rate was about 1.95%. $H_1$, which represents the height of the bridge, showed an error rate of approximately 1.86%; $H_2$, the height of the beam, about 1.82%; and $H_3$, the thickness of the slab, 4.2%. $L_1$, the span length, showed an error rate of about 0.09%. $W_1$, the width of the bridge, showed an error rate of about 0.19%; $W_2$, the width of all beam girders, about 0.65%; $W_3$, the width of the flange, about 1.23%; $W_4$, the width between the beams, about 2.95%; and $W_5$, the web of the beam, about 4.57%. Comparing the parameters, the error rate of $L_1$ is the lowest and that of $W_5$ is the highest.
Figure 19 shows the results of design parameter extraction with noise and data loss for the different bridge types. The types of parameters differ for each bridge type, and thus the results cannot be directly compared. However, the results show how the noise level and data loss level influence the accuracy of the parameter extraction. When the noise ratio was above a certain value (e.g., about 30% for beam girder bridges), the error rate stopped increasing and tended to converge. This is due to the kNN algorithm used for the noise reduction. The kNN algorithm can reduce the noise by removing the outliers when the noise level is low enough. However, when the noise exceeds a certain level, the noise points are no longer classified as outliers and are instead treated as inliers.
In the case of data loss, there was no significant change in the error rate. Because more than 300,000 data points were used, the result was reasonably accurate even with 50% data loss. However, when noise and data loss are applied together, the error rate tends to converge faster because the density of the PCD affects whether noise is classified as an outlier. There are two main sources of error. The first is the noise and data loss. When there was no noise, the average error rate was about 0.06%, as shown in Table 1, but the error rate increased to 1.95% when the noise and data loss increased to 5%. The second source is the plane prediction procedure using the MSAC algorithm, which is applied in this study to find the most probable plane. The MSAC algorithm is an effective parameter estimation method, but it does not use all the data; it estimates the parameters from randomly selected sample data. Therefore, the most probable plane varies depending on the selected samples. The accuracy tends to increase (and the deviation tends to decrease) as the number of data points in the PCD increases. For example, $L_1$, which is supported by a relatively large number of data points, showed high accuracy with low deviation, whereas $W_5$, which is supported by relatively few data points, showed lower accuracy with higher deviation. In other words, the error caused by the MSAC algorithm depends strongly on how dense the PCD is, and it is expected that the error can be reduced by obtaining denser PCD.
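As a quick arithmetic check, the nine Case 1 error rates reported above for the beam girder bridge can be averaged directly. The percent_error helper shows one common definition of the error rate (absolute deviation relative to the reference value); the source does not state its formula, so that definition is an assumption.

```python
def percent_error(predicted, reference):
    """Relative error in percent (assumed definition of the error rate)."""
    return abs(predicted - reference) / abs(reference) * 100.0

# Case 1 error rates (%) for the beam girder bridge, as reported in the text.
case1_error_rates = {
    "H1": 1.86, "H2": 1.82, "H3": 4.2,
    "L1": 0.09,
    "W1": 0.19, "W2": 0.65, "W3": 1.23, "W4": 2.95, "W5": 4.57,
}

mean_error = sum(case1_error_rates.values()) / len(case1_error_rates)
lowest = min(case1_error_rates, key=case1_error_rates.get)
highest = max(case1_error_rates, key=case1_error_rates.get)
print(f"mean = {mean_error:.2f}%, lowest = {lowest}, highest = {highest}")
```

The mean of the nine values is 17.56 / 9, which rounds to 1.95% and agrees with the reported average; the lowest and highest entries are $L_1$ and $W_5$, as stated.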

On-Site Field Validation Test Setup
The on-site field experiment was conducted at the Osong test track fifth bridge (Figure 20) in Nojang-ri, Jeondong-myeon, Yeongi-gun, Chungcheongnam-do, Korea, to determine whether the developed system can be applied in the field. The Osong test track fifth bridge is a beam girder bridge with a total length of 435 m, consisting of 14 spans. LiDAR measurement was conducted on the bridge, and a total of 62,005,533 PCD points were obtained, as shown in Figure 21.
The developed system was applied to each span, and the parameters for each span were predicted. The predicted design parameters were compared with the reference parameters, which are the values measured directly from the LiDAR data. To reduce the error due to the randomness of the MSAC algorithm during the plane prediction, the proposed method was applied 100 times and the average error was obtained.

On-Site Field Validation Result and Discussion
The PCD of the bridges used in Section 3 (extracted directly from the BIM model) was very clean; it did not contain significant viewpoint-dependent data loss. However, most of the data points in the inner part of the bridge were missing when the PCD was obtained by LiDAR, as shown in Figure 22. Even though the LiDAR was installed under the bridge, the LiDAR pulse could not fully reach the top of the girder. The proposed system was applied sequentially, as discussed in Section 2. First, since the data obtained through LiDAR contain noise, the noise was removed using the statistical outlier removal filter (Section 2.2). In the next step, the measurement coordinate system was transformed to the bridge coordinate system. By applying the MSAC algorithm, the most probable plane of the bridge was predicted, as shown in Figure 23. The normal vector and the longitudinal direction vector of the plane were extracted, and the corresponding 3D rotation matrix and translation vector were calculated. Using the 3D rotation matrix and the translation vector, the PCD was transformed from the measurement coordinate system to the bridge coordinate system, as shown in Figure 24. Finally, the design parameters of the beam girder bridge were extracted by decomposing the PCD into multiple layers (Figure 25) and calculating the plane distances, as described in Section 2.3.
Design parameters for the Osong test track fifth bridge were predicted for each span, as shown in Table 2. The parameter $H_3$, which represents the thickness of the slab, could not be measured because of the railroad gravel and was therefore excluded from the experiment. The parameters $W_4$ and $W_5$ for span 14 were not predicted because the PCD was lost. Since the reference values for these parameters also could not be measured, they were excluded when calculating the average. The overall average error rate across the parameters was about 0.8%.
As discussed in the BIM model-based validation test, the parameters supported by more points showed higher accuracy. For example, $L_1$ showed the lowest error rate (about 0.06%) and $W_5$ showed the highest error rate (about 2.17%). Comparing the results by span, span 2 showed the lowest error rate of about 0.5%, while span 6 had the highest error rate of about 1.35%. This is due to the difference in the quality (noise and data loss) of the PCD obtained by LiDAR.
There are four major sources of error in the on-site field test. The first is the LiDAR measurement itself, which includes device errors (the inherent error of the mechanical device) and random errors (the error that remains even after the device error is removed). However, since the reference values for the design parameters were obtained manually from the LiDAR data rather than from the design drawings, the first source of error is assumed to be negligible. The second is the data loss due to the blind spots of the LiDAR measurement. Due to the characteristics of LiDAR, it is not possible to obtain data in areas that the laser pulse cannot reach. For example, the PCD in the inner part of span 14 was missing, as discussed above, and thus the beam parameters (i.e., $W_4$ and $W_5$) could not be predicted. This source of error is expected to decrease with advances in LiDAR technology. The third is the error caused by point cloud registration (or scan matching). For the on-site validation test, LiDAR measurements were performed from various directions, and point cloud registration was conducted to combine the PCD into a single model. Despite advances in point cloud registration algorithms, the result of the registration contains some outliers. The result can be improved by using conjugate points, but the outliers cannot be completely removed. Lastly, error can be produced during the plane prediction using the MSAC algorithm. As discussed in the BIM model-based validation results, error occurs when the MSAC algorithm randomly selects samples to estimate the most probable plane. This source of error is expected to decrease if a denser point cloud can be obtained.

Conclusions and Future Work
In this study, we developed a system that automatically extracts design parameters from point cloud data (PCD) of bridges. The overall system consists of three components: (1) noise reduction, (2) 3D transformation, and (3) design parameter extraction. To validate the performance of the developed system, two different validation tests were conducted. The first validation test predicted the design parameters from PCD created from BIM models. Since the reference design parameters are predetermined, the performance of the proposed system was validated by comparing the predictions with the predetermined parameters. As a result, an error rate of 0.06% was obtained without noise, and an error rate of 1.95% was obtained with 5% noise and data loss. The proposed method was able to predict the design parameters with reasonable accuracy even when excessive noise and data loss were applied. The second validation test was conducted on the Osong test track fifth bridge in Nojang-ri, Jeondong-myeon, Yeongi-gun, Chungcheongnam-do, Korea. In the on-site field validation, the error rate of the design parameter extraction system was 0.8%. Four main sources of error were identified and discussed. The extracted design parameters are used for BIM construction through a parametric library. The proposed method is expected to reduce the time and cost required to manually obtain design parameters from the PCD, and thus to improve the current scan-to-BIM process. Currently, the proposed method is limited to girder bridges, including beam girder, T-girder, plate girder, steel box girder, and PSC box girder bridges. In the future, we plan to extend the current work to other types of bridges, such as truss bridges, suspension bridges, and cable-stayed bridges. In addition, the components (e.g., deck, pier) and the spans of the bridge were manually separated from the PCD at the current stage. If the process of dividing the spans can be automated using computer vision and deep learning, the scan-to-BIM process is expected to become fully automated in the future.