Communication

Confirmation of Final Bolt Tightening via Deep Learning-Based Image Processing

Tomotaka Fukuoka, Takahiro Minami and Makoto Fujiu

1 Institute of Transdisciplinary Sciences for Innovation, Kanazawa University, Kanazawa 920-1192, Japan
2 Institute of Technology, Shimizu Corporation, Tokyo 135-8530, Japan
* Author to whom correspondence should be addressed.
Appl. Sci. 2023, 13(13), 7573; https://doi.org/10.3390/app13137573
Submission received: 20 April 2023 / Revised: 21 June 2023 / Accepted: 23 June 2023 / Published: 27 June 2023
(This article belongs to the Special Issue Applications of Video, Digital Image Processing and Deep Learning)

Abstract

In Japan, the correct final tightening of bolts is verified by visually checking how the markings drawn by a technician during temporary tightening have shifted. However, an engineer must check a large number of bolts, and both the time this confirmation work requires and the lack of an objective record of the confirmation results are problems. To solve them, we developed a system that automates the confirmation of final bolt tightening using deep learning-based image processing. The proposed system takes videos of bolt fastening points as input, extracts individual bolts, extracts the markings on the extracted bolts, and makes a tightening judgment based on those markings. Because the judgment stage processes information for each bolt on which a marking is detected, that information can be retained as objective data. In this paper, we evaluated the accuracy of each automated step using video of an actual bridge and compared the confirmation time with human confirmation. The proposed method reduced the confirmation time by about 33% compared to human confirmation.

1. Introduction

In Japan, one method of joining the steel members of modern structures is the high-strength bolted joint: multiple steel members are fastened with high-strength bolts and joined by friction. In this paper, we focus on the method that uses torque-shear-type high-tension bolts. With this method, prior to final tightening, a practicing engineer marks each tightening point by drawing a straight line across the bolt, nut, washer, and base material. The engineer then performs the final tightening and confirms that it was completed correctly by visually checking the misalignment of the marking at the tightening point [1]. When final tightening is performed properly, only the nut rotates, so only the nut marking moves relative to the markings on the bolt, washer, and base metal. The pin tail on the bolt receives the reaction force while the nut is tightened and breaks off when the prescribed bolt axial force is reached; a torque check with a torque wrench is therefore unnecessary. If final tightening is not performed properly, the markings on the bolt and washer also shift. By checking the shift of the markings, a worker can determine whether tightening was performed correctly. Workers perform this check on every bolt to ensure that no tightening errors remain in the structure.
Two problems with this process of checking high-strength bolt tightening are the time it takes and the lack of an objective record of the total number of bolts checked. The confirmation work is time-consuming because workers have to move to every tightening point and visually check the marking of each bolt to determine whether it is tightened properly. Tanihira's study [2] suggests that keeping records of individual bolts is important; however, because of the large number of target locations, workers' records of the confirmation results have been only a partial sample, making it difficult to keep a record of the entire confirmation process.
To solve these problems, we propose a bolt tightening judgment system using image-processing technology. The proposed system takes video footage of the bolt tightening locations as input, extracts each bolt from the footage, and judges from the marking whether each bolt is correctly tightened. The input videos can be recorded instead of having practicing engineers visually confirm the bolt tightening, which reduces the work time and provides an objective record of the tightening confirmation results.
In this paper, we present an overview of the proposed judgment system, as well as an evaluation of its results on videos of actual bolts.

2. Related Research

Previously developed image-based bolt tightening judgment methods either compute a judgment directly from features of the bolt region in the image or calculate the rotation angle of the bolt from extracted features [3,4]. These methods require the environment and shooting conditions at the bolt tightening point to be controlled and are vulnerable to noise. In this study, we propose a deep learning-based method to determine bolt tightness.
Realizing the proposed method involves two problems: classifying images of bolt marking results according to whether the bolts were tightened correctly, and extracting each bolt from an image of the bolt tightening location via object detection.
Deep learning methods have achieved high accuracy in both image classification and object detection in recent years. For image classification, convolutional neural networks (CNNs) [5] are widely used. In object detection, the R-CNN [6], which uses a CNN to classify candidate regions extracted from the input image, achieved an accuracy roughly 30% higher than that of existing methods, demonstrating the usefulness of deep learning for object detection. Building on it, Fast R-CNN [7] and Faster R-CNN [8] were proposed as faster object detection methods, and YOLO [9] and SSD [10] pushed speed further: these methods can take video as input and detect objects almost in real time. In addition, because they use the entire image for training, the models learn both the object and its surrounding context, suppressing false detections of the background.

3. Overview of Proposed Method

Our proposed bolt tightening judgment system consists of four modules: a module that extracts bolts from the input images, a module that extracts markers from the extracted images, a module that makes a tightening judgment from the extracted markers, and a module that generates the output of the judgment results (Figure 1). Because some bolts are difficult to detect in still photographs depending on the shooting angle, this study uses video as input so that a single bolt is captured from multiple angles.
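As a structural sketch, the four modules chain together as below. The three callables are hypothetical stand-ins for the modules of Sections 3.1, 3.2 and 3.3, not the authors' actual implementation; Section 3.4 would then render the returned results onto the video.

```python
# A structural sketch of how the modules chain together; the callables
# are hypothetical placeholders, not the authors' actual code.
from typing import Callable, Dict, List

def judge_all_bolts(
    frames: List,               # still images taken from the input video
    extract_bolts: Callable,    # Section 3.1: frames -> {bolt_id: [cropped bolt images]}
    extract_marker: Callable,   # Section 3.2: bolt image -> monochrome marker image
    judge_tightening: Callable, # Section 3.3: marker image -> score in [0, 1]
) -> Dict[int, bool]:
    results = {}
    for bolt_id, bolt_images in extract_bolts(frames).items():
        scores = [judge_tightening(extract_marker(img)) for img in bolt_images]
        # A bolt counts as tightened if ANY frame passes, since markings
        # can be completely hidden at some shooting angles (Section 3.3).
        results[bolt_id] = any(s > 0.5 for s in scores)
    return results
```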

3.1. Bolt Extraction Module

In the proposed system, an image containing multiple bolt tightening points is used as input to increase the efficiency of the confirmation work. Because the tightening confirmation result for each bolt must be output, each bolt in the input image must be detected and processed individually.
We use YOLOv3 [11], a deep learning-based object detection method, to detect bolts in the input images. The model is trained with the coordinates of the detection targets as labels, and for each input image it outputs the detected targets enclosed in rectangles. Still images are extracted from the input video frame by frame and fed to YOLOv3 to detect the bolts. A score from 0 to 1 is calculated for each detection; the higher the score, the higher the probability that a bolt was detected. The model was created by fine-tuning a publicly available pretrained YOLOv3 model. The training data for fine-tuning are a set of images containing bolts together with the coordinates of rectangles surrounding those bolts; 303 such sets were used in this paper.
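A hedged sketch of this detection step is shown below, using OpenCV's DNN module and assuming the fine-tuned YOLOv3 model is available in Darknet format; the file names "bolts.cfg", "bolts.weights", and "tightening_points.mp4" are hypothetical placeholders.

```python
# Sketch of the bolt-extraction step, assuming a fine-tuned YOLOv3 model
# exported in Darknet format; all file names are hypothetical.
import cv2

net = cv2.dnn.readNetFromDarknet("bolts.cfg", "bolts.weights")
out_layers = net.getUnconnectedOutLayersNames()

def detect_bolts(frame, score_threshold=0.5):
    """Return (x, y, w, h, score) boxes for bolts detected in one frame."""
    H, W = frame.shape[:2]
    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416), swapRB=True, crop=False)
    net.setInput(blob)
    boxes = []
    for output in net.forward(out_layers):
        for det in output:                  # [cx, cy, bw, bh, objectness, class scores...]
            score = float(det[4] * det[5:].max())
            if score > score_threshold:     # keep only detections scoring above 0.5
                cx, cy, bw, bh = det[0] * W, det[1] * H, det[2] * W, det[3] * H
                x, y = max(0, int(cx - bw / 2)), max(0, int(cy - bh / 2))
                boxes.append((x, y, int(bw), int(bh), score))
    return boxes

# Still images are taken from the input video frame by frame, and each
# detected bolt is cropped and saved with its frame number.
cap = cv2.VideoCapture("tightening_points.mp4")
frame_id = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    for i, (x, y, bw, bh, score) in enumerate(detect_bolts(frame)):
        cv2.imwrite(f"bolt_f{frame_id:04d}_{i}.png", frame[y:y + bh, x:x + bw])
    frame_id += 1
```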
YOLOv3 does not guarantee that a detection in one frame corresponds to the same object as a detection in the next frame. Therefore, each detected bolt must be matched against the bolts detected in the previous and subsequent frames.
We use the coordinate information of the detected bolts to identify each bolt across frames. Given the coordinates $(x_{b_{t,1}}, y_{b_{t,1}})$ of bolt $b_{t,1}$ detected in frame $f_t$ and the bolt coordinates detected in the next frame $f_{t+1}$, the bolt whose coordinates have the shortest Euclidean distance to $(x_{b_{t,1}}, y_{b_{t,1}})$ is set as $b_{t+1,1}$ and judged to be the same bolt as $b_{t,1}$ detected in the previous frame $f_t$.
Depending on the angle, there may be frames in the input video in which a bolt is not detected. In such cases, to avoid wrongly identifying another bolt that is clearly distant from the bolt's coordinates in the previous frame as the same bolt, a threshold is set on the Euclidean distance between the previous-frame coordinates and the nearest detection. If even the closest bolt lies farther away than the threshold, the bolt from the previous frame is assumed to no longer be present.
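A minimal sketch of this nearest-neighbour matching is given below, assuming bolt positions are box centres in pixels; the threshold value of 50 px is a hypothetical placeholder, not a value reported by the authors.

```python
import math

def match_bolts(prev_bolts, detections, dist_threshold=50.0):
    """Propagate bolt IDs from frame f_t to frame f_{t+1}.

    prev_bolts: {bolt_id: (x, y)} centres from the previous frame.
    detections: [(x, y), ...] centres detected in the current frame.
    Returns ({bolt_id: (x, y)}, [indices of unmatched detections]).
    """
    matched, used = {}, set()
    for bolt_id, (px, py) in prev_bolts.items():
        best, best_d = None, dist_threshold  # beyond threshold -> bolt assumed gone
        for j, (cx, cy) in enumerate(detections):
            d = math.hypot(cx - px, cy - py)
            if j not in used and d < best_d:
                best, best_d = j, d
        if best is not None:
            matched[bolt_id] = detections[best]
            used.add(best)
    new = [j for j in range(len(detections)) if j not in used]  # start new IDs here
    return matched, new
```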
In the bolt extraction module, each detected bolt is cropped from the original image, assigned an ID and the number of the frame in which it was detected, and saved as an image file (Figure 2).

3.2. Marker Extraction Module

The marker extraction module detects the markings on a bolt in an image containing only that bolt, obtained from the bolt extraction module.
In this module, the markers in the image are first extracted. The information used to judge bolt tightening is the marking result; the features of the bolt itself are unnecessary, so they are removed to increase classification accuracy. We use a deep learning-based semantic segmentation method to identify the markers, reasoning that it responds more robustly to changes in marker color caused by the direction and strength of the light source in the actual shooting environment than pattern-matching methods that set thresholds on color information. The module uses DeepCrack [12], a semantic segmentation method, to detect marker-colored locations. A classifier is trained with binary label images that are colored only where the marker is drawn, and it outputs a monochrome image of the input in which only the marker locations are colored (Figure 3). To train the model, 8995 bolt images and their label images, extracted from images taken at bolt tightening sites, were used.
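The following is a generic sketch of the inference side of this step, assuming any encoder-decoder segmentation network trained on (bolt image, binary marker mask) pairs; it does not reproduce the DeepCrack architecture itself, and "marker_segmenter.pt" is a hypothetical model file.

```python
import torch
import torchvision.transforms.functional as TF
from PIL import Image

model = torch.load("marker_segmenter.pt")  # hypothetical trained segmentation model
model.eval()

def extract_marker_mask(bolt_image: Image.Image) -> Image.Image:
    """Return a monochrome image coloured only at marker pixels."""
    x = TF.to_tensor(bolt_image).unsqueeze(0)    # (1, 3, H, W), values in [0, 1]
    with torch.no_grad():
        prob = torch.sigmoid(model(x))[0, 0]     # (H, W) per-pixel marker probability
    mask = (prob > 0.5).to(torch.uint8) * 255    # binarize: marker vs. background
    return Image.fromarray(mask.cpu().numpy(), mode="L")
```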

3.3. Tightening Judgment Module

The tightening judgment module judges whether the target bolt is correctly tightened from the pattern of the monochrome marker image. It treats this as a classification problem, using a CNN-based classification model to judge whether a monochrome image represents a correct tightening pattern. The model is trained on two classes: successful tightening images and failed tightening images. Because there are few real cases of failed tightening, this paper uses pseudo failure marking images generated by building bolt fastening points with a 3D modeling tool and coloring only the marked locations. For each extracted marker image, a score from 0 to 1 is output to indicate whether the bolt was successfully tightened; the higher the score, the higher the probability of success. In this experiment, 7196 monochrome images of successful tightening and 3600 images of pseudo failure were used to train the model (Figure 4). As described in Section 3.1, a single bolt may appear in multiple frames. The proposed method applies the marker extraction (Section 3.2) and the tightening judgment described here to all extracted images of a bolt and stores the results. Since the final output is the success or failure of tightening for each bolt, a bolt is considered successfully tightened if the judgment in any of its frames is successful, rather than relying on individual frames. This is because a bolt is shot from various angles in a single video, and at some angles the markings are completely hidden.
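A minimal sketch of such a classifier is given below, as a small PyTorch CNN over single-channel marker images; the authors' actual architecture and hyperparameters are not reported, so every layer choice here is an assumption.

```python
# Illustrative binary classifier for monochrome marker images; the
# architecture is assumed, not the authors' reported model.
import torch
import torch.nn as nn

class TighteningClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(8),
        )
        self.head = nn.Sequential(
            nn.Flatten(), nn.Linear(32 * 8 * 8, 1), nn.Sigmoid(),
        )

    def forward(self, x):                    # x: (N, 1, H, W) marker images
        return self.head(self.features(x))   # score in [0, 1]; higher = tightened
```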

3.4. Output Generation Module

The proposed method retains the results generated by each module and, at the same time, generates a video that overlays the tightening judgment results on the input video. This makes it easier for the operator to see which bolt is where when reviewing the judgment results (Figure 5).
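A short sketch of the overlay step follows, assuming each frame carries the boxes and per-bolt judgments produced by the earlier modules; the green/red colour convention is an assumption, not one stated by the authors.

```python
import cv2

def annotate_frame(frame, boxes):
    """Draw judgment results onto one frame.

    boxes: list of (bolt_id, x, y, w, h, tightened) tuples for this frame.
    """
    for bolt_id, x, y, w, h, tightened in boxes:
        color = (0, 255, 0) if tightened else (0, 0, 255)  # BGR: green ok, red failed
        cv2.rectangle(frame, (x, y), (x + w, y + h), color, 2)
        cv2.putText(frame, f"bolt {bolt_id}", (x, y - 5),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 1)
    return frame
```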

4. Examination

In this paper, we evaluated the processing speed and judgment accuracy of the tightening confirmation system based on the proposed method using data taken at actual work sites.

4.1. Data Collection

To create bolt tightening image data for evaluation, the inside of a box girder at an actual bridge construction site was photographed. A video camera was used to capture the bolt tightening points on the sides and bottom of the box girder in a single continuous video. The video was taken at the closest distance at which an entire horizontal row of bolts could be captured; because the number of bolts in a row varies by location, this distance ranged from approximately 30 cm to 50 cm. Only one type of bolt was photographed. No lighting equipment was installed on the bridge girder, so handheld lighting was used to illuminate the bolt tightening points; the light was not fixed but held next to the video camera and moved with it. The video was shot with a JVC GZ-RY980 camcorder in 4K quality. The video used in this paper is 23 s long and contains 362 bolts, all of which were visually confirmed by the operator to be tightened properly.

4.2. Evaluation of Processing Speed

According to a preliminary survey of the construction company, the inspection time for each bolt is approximately 3 s, so manually checking the bolt tightening points in this experiment would take approximately 1089 s.
A personal computer with a GPU was used for this evaluation: an Intel Core i9-7920X CPU, an RTX 2080 Ti GPU, and 64 GB of memory. Capturing the video took approximately 23 s, and processing the captured video took approximately 708 s, for a total of 731 s, or roughly 2 s per bolt. The proposed method therefore reduced the time required for tightening confirmation by about 33% ((1089 − 731)/1089 ≈ 0.33); the more bolts at a site, the larger the time saving.

4.3. Evaluation of Image Processing Models

We evaluate the bolt extraction model and the bolt tightening judgment model by calculating precision and recall. Precision is the ratio of bolts correctly extracted, or correctly judged to be tightened, among the model's outputs; recall is the ratio of correct extractions or correct tightening judgments among the ground-truth data. High precision means the model makes few false-positive errors; high recall means it overlooks few targets. Figure 6 shows an example of bolt and marker extraction results for a correctly tightened bolt.
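In standard notation, with $TP$, $FP$, and $FN$ denoting true positives, false positives, and false negatives, these two measures are:

$$\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{Recall} = \frac{TP}{TP + FN}$$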
A threshold of 0.5 was set on the score of the bolt extraction results, meaning that a detection is kept only if it is judged to be a bolt with a probability above 50%; results scoring 0.5 or less are treated as extraction failures and are not saved. The model extracted 396 bolts from the input video, including all of the target bolts. The bolt extraction precision was approximately 91.4% (362 true bolts among 396 extractions) and the recall was 100.0%. It is thought that the extraction errors could be handled by adjusting the score threshold.
A threshold of 0.5 was likewise set on the score of the bolt tightening judgment; results scoring 0.5 or less are treated as tightening failures. The judgment model received the 396 extracted images as input and judged all 396 to be successfully tightened. The tightening judgment precision was approximately 91.4% and its recall was 100.0%: the model also judged the 34 extraction errors, such as the example in Figure 7, to be successful tightenings. To improve the judgment accuracy, it is necessary to add training data, retrain the model, and revalidate the judgment results.
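As a worked check of the reported figures (the counts are from the experiment above; the snippet itself is only illustrative):

```python
# 396 extractions, of which 362 are real bolts and 34 are extraction
# errors judged as successful tightening; no bolts were missed.
tp, fp, fn = 362, 34, 0
precision = tp / (tp + fp)   # 362/396 ≈ 0.914
recall = tp / (tp + fn)      # 362/362 = 1.0
print(f"precision={precision:.3f}, recall={recall:.3f}")
```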

4.4. Issues during Shooting

This experiment also showed that the video camera must be moved horizontally with respect to the surface being shot, which places a burden on the photographer's legs and back when shooting the lower part of the wall surface, and that the camera must be moved slowly to capture both the wall and the bottom surface while keeping the bolt lines level in the frame. Given the objective of reducing the burden on the operator, the shooting method needs to be improved.

5. Conclusions

We proposed an automatic bolt tightening confirmation system based on deep learning image processing to make the confirmation work efficient and to provide objective confirmation records. Evaluation experiments were conducted in a box girder, with the following findings.
Bolt tightening confirmation with the proposed method was about 33% faster than confirmation performed by human inspectors. All bolts (100%) were extracted from the captured video without omission, even with a handheld light source in a dark environment such as a box girder. However, because it is difficult for a worker with a handheld video camera to shoot footage inside a narrow box girder that enables accurate bolt extraction and tightening judgment, the shooting method needs to be improved.
On the other hand, the bolt tightening judgment model needs to be retrained, because it judged objects other than bolts to be successful tightenings.
In future research, we will evaluate solutions to the aforementioned problems in the actual shooting environment and evaluate the accuracy of bolt tightening confirmation using the extracted bolt images. We will also investigate and evaluate the construction conditions under which this method is appropriate.

Author Contributions

Conceptualization, T.F. and T.M.; methodology, T.F.; writing—original draft, T.F.; investigation, T.F., T.M. and M.F.; project administration, M.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data are not publicly available.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Japan Road Association. Steel Bridge Construction Handbook; Japan Road Association: Tokyo, Japan, 2015.
  2. Tsutomu, T.; Masahiro, K.; Yasuhiro, I.; Yoneyoshi, T. Carrying capacity test for friction join of high-strength bolt from a removed foot-way bridge used under 17 years. J. Struct. Eng. A 1990, 36, 1087–1096.
  3. Park, J.; Kim, T.; Kim, J. Image-based bolt-loosening detection technique of bolt joint in steel bridges. In Proceedings of the 6th International Conference on Advances in Experimental Structural Engineering/11th International Workshop on Advanced Smart Materials and Smart Structures Technology, University of Illinois Urbana-Champaign, Champaign, IL, USA, 1–2 August 2015.
  4. Cha, Y.J.; You, K.; Choi, W. Vision-based detection of loosened bolts using the Hough transform and support vector machines. Autom. Constr. 2016, 71, 181–188.
  5. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. In Proceedings of the Advances in Neural Information Processing Systems, Lake Tahoe, NV, USA, 3–6 December 2012; Volume 25, pp. 1097–1105.
  6. Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 24–27 June 2014.
  7. Girshick, R. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015.
  8. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 7–12 December 2015; Volume 1, pp. 91–99.
  9. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016.
  10. Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.-Y.; Berg, A.C. SSD: Single Shot MultiBox Detector. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 8–16 October 2016.
  11. Redmon, J.; Farhadi, A. YOLOv3: An incremental improvement. arXiv 2018, arXiv:1804.02767.
  12. Zou, Q.; Zhang, Z.; Li, Q.; Qi, X.; Wang, Q.; Wang, S. DeepCrack: Learning hierarchical convolutional features for crack detection. IEEE Trans. Image Process. 2019, 28, 1498–1512.
Figure 1. The process flow of the proposed method.
Figure 2. Example bolt detection results obtained using YOLOv3: (a) an input image; (b) a detection result image; (c) an example of an extracted bolt image.
Figure 3. Example input image and output result of marker extraction: (a) an input bolt image; (b) an output image.
Figure 4. Example of a successful tightening marking image (a) and a failed tightening marking image (b).
Figure 5. Example of successful tightening judgment results for the input image: (top) input image; (bottom) output image.
Figure 6. Example of bolt and marker extraction results for a correctly tightened bolt: (a) bolt extraction image; (b) marker extraction image.
Figure 7. Example of a failed detection image.