Article

YOLOv5s-D: A Railway Catenary Dropper State Identification and Small Defect Detection Model

1 Urban Rail Transit and Logistics College, Beijing Union University, Beijing 100101, China
2 China Academy of Railway Sciences Corporation Limited, Institute of Computing Technology, Beijing 100081, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2023, 13(13), 7881; https://doi.org/10.3390/app13137881
Submission received: 4 May 2023 / Revised: 28 May 2023 / Accepted: 2 June 2023 / Published: 5 July 2023
(This article belongs to the Section Transportation and Future Mobility)

Abstract

High-speed railway catenaries are vital components of railway traction power supply systems. To ensure stable contact between the pantograph and the catenary, droppers are positioned between the messenger wire and the contact line. The failure of one or more droppers affects the power supply of the catenary and the operation of the railway. In this paper, we modify the You Only Look Once version five (YOLOv5) model in several ways and propose a method for improving the identification of dropper status and the detection of small defects. Firstly, to focus on small-target features, the selective kernel attention module is added to the backbone. Secondly, the feature maps of different scales extracted from the backbone network are fed into the bidirectional feature pyramid network for multiscale feature fusion. Thirdly, the YOLO head is replaced by a decoupled head to improve the convergence speed and detection accuracy of the model. The experimental results show that the proposed model achieves a mean average precision of 92.9% on the dropper dataset, an increase of 3.8% over the results using YOLOv5s. The detection accuracy of small dropper defects reaches 79.2%, representing an increase of 10.7% compared with YOLOv5s and demonstrating that our model is better at detecting small defects.

1. Introduction

As one of the important transportation modes, railway transportation needs a stable energy supply to guarantee train operations. Electric traction power supply is currently the most effective means of supplying energy to high-speed railway transportation. The catenary is an important component in any high-speed railway traction power supply system. In the catenary system, droppers play the role of connecting the contact line with the messenger wire. The contact line is suspended from the messenger wire by the droppers, and the height of the contact line is controlled by adjusting the dropper lengths. This makes it possible to improve the operation quality of the contact line and the pantograph. Catenary droppers are exposed to the natural environment for long periods of time. Due to the external loads from both the vehicle [1] and the environment [2], it is common for catenaries to experience deviations in their geometry, including dropper defects. They may be damaged by bad weather, by the impact of various forces, and by erosion from the electric currents that pass through them during the operation of electric locomotives. Under the influence of these factors, droppers are prone to numerous failure modes, such as loosening, deformation, fracturing, and falling off. One of the consequences of dropper deformation is the irregularity of the contact wires, which directly affects the quality of current collection [3]. These failures directly affect the quality of the electric traction power supply, and may lead to accidents, seriously affecting the safety of train operations.
To ensure the safety of train operations, railway workers must periodically check the catenary droppers. Any railway line contains a large number of catenary droppers, making manual inspection inefficient, time-consuming, and prone to mistakes. Performing inspection operations at night makes it even more difficult for inspectors to check the droppers, placing higher requirements on the inspection techniques. To identify faults in catenary droppers, railway companies have developed a catenary monitoring device. This device takes pictures of the catenaries at close range and transmits them to the railway workers, who check the pictures one by one and judge whether there are defects in the catenary droppers. This method improves the detection efficiency, but three problems persist: firstly, there is a delay between image acquisition and fault identification; secondly, a large number of images are obtained, and manually checking all of the pictures is highly inefficient; and, thirdly, small defects are not easy to identify in the pictures and may therefore be missed or misidentified. Thus, there is an urgent need to use digital image processing and deep learning to develop an intelligent detection technology that will effectively improve the detection efficiency and accuracy of catenary dropper defects.
In pursuit of the intelligent detection of catenary defects, many scholars have studied the detection of various components of high-speed rail catenaries, such as insulator recognition and defect detection [4,5], loose nut detection in U-shaped hoops [6], bird’s nest detection [7], loose defect detection for bracing wire components [8], and arc detection and recognition in pantograph–catenary systems [9]. At present, research on catenary droppers tends to focus on engineering structures and geometric parameters, as well as the impact of the dropper design on train operation safety at the physical level. For example, the inrush current characteristics [10], stress characteristics [11,12], dynamic performance [13,14], geometric characteristics [15], wind resistance [16], and fatigue behavior [17] of droppers have recently been studied. To date, there has been relatively little research on the intelligent detection of dropper faults. Existing studies have mainly addressed dropper detection by improving the faster region-based convolutional neural network (Faster R-CNN) and the You Only Look Once version 3 (YOLOv3) algorithm. Yu et al. [18] replaced the original feature extraction module of Faster R-CNN with DenseNet to extract deep features from dropper images; modifying the feature extraction module in this way makes it possible to effectively combine deep and shallow feature maps [19,20]. Guo et al. [21] proposed an improved Faster R-CNN for OCS dropper detection, including a balanced attention feature pyramid network (BA-FPN) and center-point rectangle loss (CR loss), which can accurately recognize and locate droppers, while Zhang et al. [22] combined YOLOv3 and ODIN to achieve dropper positioning and status detection. In [23], a feature pyramid consisting of five convolution layers with different scales was constructed to improve YOLOv3, but this was only applied to the positioning of droppers rather than the detection of defects. Li et al. [24] proposed a machine learning detection method based on time–frequency analysis, combining a support vector machine (SVM) and independent component analysis (ICA) to identify and locate broken droppers. These studies have undoubtedly made progress in the intelligent detection of catenary droppers, but they mainly focus on dropper positioning and status detection.
Images of high-speed railway catenary droppers are mostly captured at night, so the background of the images is mainly black. Against this kind of background, small defects in the catenary are difficult to identify and detect; generally, they may cover only a dozen pixels, as shown in Figure 1. The overall characteristics of the image are relatively weak, so the feature information extracted during detection contains little useful information and considerable noise, which seriously affects the detection results. YOLOv5 is one of the better-performing versions in the YOLO series of algorithms. By improving the architecture of YOLOv5 [25], changing the feature fusion method of the network model [26,27], and adding attention mechanisms to the network model [28], the performance of small-target detection can be improved. In response to these issues, this paper proposes an improved model based on YOLOv5s by introducing the selective kernel (SK) attention module [29], the bidirectional feature pyramid network (BiFPN) [30], and the decoupled head [31]. The model enhances the attention to small targets in catenary dropper images, enabling the recognition of the state of the catenary dropper and the detection of small defects against dark backgrounds. Experimental results on a self-built catenary dropper dataset show that the detection accuracy of the proposed approach is better than that of the original model, improving the detection of dropper defects in high-speed rail networks.
The contributions of the proposed approach are as follows:
(1)
The SK attention module was optimized, resulting in reduced module complexity and improved performance, and further increasing the focus on small-target features.
(2)
The feature fusion network structure of the YOLOv5 model was optimized by utilizing BiFPN for multi-scale feature fusion, enabling a better balance of information across different scales.
(3)
The detection head of YOLOv5 was improved by replacing the original YOLO head with an enhanced decoupled head, leading to an improved detection accuracy and the faster convergence of the model.
The rest of this paper is organized as follows. Section 2 introduces the YOLOv5 model and the improved model proposed in this article, and explains the structures of the improved SK attention module and the decoupled head. Section 3 first describes the experimental preparation, including the catenary dropper dataset and the experimental parameter settings, and then analyzes the results of the ablation and comparative experiments. Section 4 concludes this study.

2. Methods

2.1. YOLOv5 Network

The YOLO family of algorithms has evolved through YOLOv1 [32], YOLOv2 [33], YOLOv3 [34], YOLOv4 [35], YOLOv5, YOLOv7 [36], YOLOX, and other variants. Among them, YOLOv5 is an excellent target-detection network. The structure of the YOLOv5 network model is shown in Figure 2. YOLOv5 comes in five versions: n, s, m, l, and x. The network structure is the same across these five versions, but the model size, width, and depth vary. For the task of catenary dropper detection, YOLOv5s, with its smaller module depth and width, is the most suitable basic model for this study.
The YOLOv5 network model is divided into four parts: input, backbone, neck, and head. The input stage uses mosaic data augmentation, adaptive anchor box calculation, and image size processing. The backbone includes the focus module, the spatial pyramid pooling-fast (SPPF) module, and the BottleneckCSP network. In the neck, the feature pyramid network (FPN) and the path aggregation network (PAN) are combined: the conventional top-down FPN is combined with a bottom-up feature pyramid, the extracted semantic and location features are fused, and the backbone and detection layers are fused, enabling the model to obtain richer feature information. The head is composed of three detection layers. Feature maps of different sizes are used to detect target objects of different sizes. Each detection layer outputs the corresponding vectors, and the predicted bounding boxes and categories of the targets are finally generated and marked on the original image.
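For illustration, the following minimal PyTorch sketch mirrors this backbone–neck–head layout for a four-class detector. The module names, channel widths, and placeholder blocks are our own simplifications for readability, not the actual Ultralytics YOLOv5 implementation.

```python
import torch
import torch.nn as nn

def conv_block(c_in, c_out, stride=2):
    # Convolution + BN + SiLU, the basic unit used throughout YOLOv5-style networks
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, stride, 1),
                         nn.BatchNorm2d(c_out), nn.SiLU())

class YOLOv5Schematic(nn.Module):
    """Schematic backbone -> neck -> three-scale head with placeholder blocks."""
    def __init__(self, num_classes=4, num_anchors=3):
        super().__init__()
        # Backbone: progressively downsample to strides 8, 16, and 32
        self.stem = nn.Sequential(conv_block(3, 32), conv_block(32, 64), conv_block(64, 128))  # stride 8
        self.down16 = conv_block(128, 256)   # stride 16
        self.down32 = conv_block(256, 512)   # stride 32
        # Neck: 1x1 convolutions standing in for the FPN/PAN fusion stage
        self.neck = nn.ModuleList([nn.Conv2d(c, 128, 1) for c in (128, 256, 512)])
        # Head: one detection layer per scale predicting (x, y, w, h, obj, classes) per anchor
        out_ch = num_anchors * (5 + num_classes)
        self.head = nn.ModuleList([nn.Conv2d(128, out_ch, 1) for _ in range(3)])

    def forward(self, x):
        p3 = self.stem(x)
        p4 = self.down16(p3)
        p5 = self.down32(p4)
        feats = [n(p) for n, p in zip(self.neck, (p3, p4, p5))]
        return [h(f) for h, f in zip(self.head, feats)]

preds = YOLOv5Schematic()(torch.randn(1, 3, 640, 640))
print([p.shape for p in preds])  # three scales: 80x80, 40x40, 20x20
```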

2.2. SK Attention Module

The background of dropper images collected at night is mostly black, and the quality of the images varies with the illumination intensity of the lighting equipment. Images that are too bright or too dark make feature recognition and identification more challenging. When the detection target is a small defect, the network model suffers serious information loss while acquiring low-resolution feature layers through subsampling. To solve this problem, an attention mechanism can be introduced in the shallow feature extraction stage, as this enables the model to effectively select target information, assign more weight to small targets, improve the feature attention of small targets, and improve the detection accuracy of the network model. To focus on the characteristics of fine defects in the droppers, the modified SK attention module is added to the YOLOv5 model. In the fuse phase of the SK network, the two fully connected layers, which first reduce and then restore the dimensionality, are not conducive to the recalibration of feature channel information and weaken the correlation between feature map channels. Therefore, a one-dimensional convolution is used instead, avoiding the accuracy loss caused by dimensionality reduction. The structure of this mechanism is shown in Figure 3.
The modified SK attention module is divided into three parts: split, fuse, and select. Firstly, the input feature vector is convolved with 3 × 3 and 5 × 5 convolution kernels, and the respective outputs, $\tilde{U}$ and $\hat{U}$, are added to obtain $U \in \mathbb{R}^{H \times W \times C}$, where $H$ and $W$ are the height and width of the input feature and $C$ is the number of input feature channels. The global average pooling operation ($F_{gp}$) is then used to compress $U$ to $1 \times 1 \times C$, and two separate one-dimensional convolutions ($C_k$) are applied to the resulting feature vector, $s$, producing two vectors. After the softmax operation, cross-channel soft attention is used to adaptively select information at different spatial scales. The calculation process is given in Formulas (1)–(5).
$$s_c = F_{gp}(U_c) = \frac{1}{H \times W}\sum_{i=1}^{H}\sum_{j=1}^{W} U_c(i, j) \quad (1)$$

$$p = C_k(s), \quad q = C_k(s) \quad (2)$$

$$a_c = \frac{e^{A_c p}}{e^{A_c p} + e^{B_c q}} \quad (3)$$

$$b_c = \frac{e^{B_c q}}{e^{A_c p} + e^{B_c q}} \quad (4)$$

Finally, the output feature, $V$, obtained according to the attention weights of the different kernels, is given by Formula (5):

$$V_c = a_c \tilde{U}_c + b_c \hat{U}_c, \quad a_c + b_c = 1 \quad (5)$$

In Formula (5), $V = [V_1, V_2, \ldots, V_C]$ and $V_c \in \mathbb{R}^{H \times W}$.
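A minimal PyTorch sketch of this split–fuse–select computation is given below. The depthwise branch convolutions and the 1D-convolution kernel size are illustrative assumptions, not the exact configuration used in the paper; only the overall structure (two kernels, global pooling, 1D convolutions in place of the dimension-reducing fully connected layers, branch-wise softmax) follows the description above.

```python
import torch
import torch.nn as nn

class ModifiedSKAttention(nn.Module):
    """Sketch of the modified SK attention following Formulas (1)-(5)."""
    def __init__(self, channels, conv1d_kernel=3):
        super().__init__()
        # Split: two branches with different receptive fields (3x3 and 5x5)
        self.branch3 = nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1, groups=channels),
                                     nn.BatchNorm2d(channels), nn.ReLU())
        self.branch5 = nn.Sequential(nn.Conv2d(channels, channels, 5, padding=2, groups=channels),
                                     nn.BatchNorm2d(channels), nn.ReLU())
        # Fuse: global average pooling, then 1D convolutions instead of the
        # dimension-reducing FC layers, preserving per-channel information
        self.gap = nn.AdaptiveAvgPool2d(1)
        self.conv_p = nn.Conv1d(1, 1, conv1d_kernel, padding=conv1d_kernel // 2, bias=False)
        self.conv_q = nn.Conv1d(1, 1, conv1d_kernel, padding=conv1d_kernel // 2, bias=False)

    def forward(self, x):
        u3, u5 = self.branch3(x), self.branch5(x)           # U~ and U^
        s = self.gap(u3 + u5).squeeze(-1).transpose(1, 2)   # (B, 1, C), Formula (1)
        p = self.conv_p(s)                                  # Formula (2)
        q = self.conv_q(s)
        # Select: softmax across the two branches gives a_c and b_c (Formulas (3)-(4))
        w = torch.softmax(torch.stack([p, q], dim=0), dim=0)
        a = w[0].transpose(1, 2).unsqueeze(-1)              # (B, C, 1, 1)
        b = w[1].transpose(1, 2).unsqueeze(-1)
        return a * u3 + b * u5                              # Formula (5)

out = ModifiedSKAttention(64)(torch.randn(1, 64, 80, 80))
```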

2.3. Bidirectional Feature Pyramid Network

When features have different resolutions, existing feature fusion methods first adjust them to the same resolution and then add them, thereby treating features of different scales equally; that is, feature fusion is carried out with equal weights. However, in real-world situations, input features with different resolutions contribute unequally to the output features. BiFPN effectively realizes bidirectional cross-scale connections and weighted feature fusion. The original YOLOv5 adopts the PANet structure, in which a top-down FPN structure is followed by a bottom-up PANet structure, as shown in Figure 4a,b. To improve the detection of small- and medium-sized targets and make full use of high-resolution low-level features to improve fusion efficiency, the BiFPN is added to YOLOv5 to assign a weight to each input feature according to its importance, as shown in Figure 4c.
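The weighted fusion at each BiFPN node can be sketched as follows (the fast normalized fusion of EfficientDet [30]); this shows only the per-node weighting, not the full bidirectional cross-scale topology.

```python
import torch
import torch.nn as nn

class WeightedFusion(nn.Module):
    """Fast normalized fusion: learnable, non-negative weights express how much
    each resolution-aligned input feature contributes to the fused output."""
    def __init__(self, num_inputs, eps=1e-4):
        super().__init__()
        self.w = nn.Parameter(torch.ones(num_inputs))  # one weight per input feature
        self.eps = eps

    def forward(self, feats):
        # feats: list of tensors already resized to a common resolution and channel count
        w = torch.relu(self.w)               # keep weights non-negative
        w = w / (w.sum() + self.eps)         # normalize so the weights sum to ~1
        return sum(wi * fi for wi, fi in zip(w, feats))

# Example: fuse two same-shaped feature maps with learned importance
fuse = WeightedFusion(num_inputs=2)
out = fuse([torch.randn(1, 128, 40, 40), torch.randn(1, 128, 40, 40)])
```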

2.4. Decoupled Head

In object detection, the classification and regression tasks conflict with each other. In the YOLOv5 model, classification and regression are performed by the same convolution. In the YOLOX model, the coupled head used for classification and localization is replaced with a decoupled head. As shown in Figure 5, considering the different requirements of classification and localization, the decoupled head structure uses separate branches for these calculations, which improves the target detection performance. Moreover, to avoid a large increase in the computational load, the decoupled head first performs dimension reduction and then carries out the classification and regression calculations on the two branches.
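A minimal sketch of a decoupled head for one detection scale is shown below: a 1 × 1 convolution first reduces the channel dimension, and separate branches then handle classification and box regression/objectness. The hidden channel width and branch depth are illustrative assumptions rather than the exact head used in YOLOv5s-D.

```python
import torch
import torch.nn as nn

class DecoupledHead(nn.Module):
    """Sketch of a decoupled detection head: shared 1x1 reduction, then
    independent classification and regression branches."""
    def __init__(self, in_channels, num_classes, num_anchors=1, hidden=256):
        super().__init__()
        self.stem = nn.Conv2d(in_channels, hidden, 1)          # dimension reduction
        self.cls_branch = nn.Sequential(nn.Conv2d(hidden, hidden, 3, padding=1), nn.SiLU(),
                                        nn.Conv2d(hidden, num_anchors * num_classes, 1))
        self.reg_branch = nn.Sequential(nn.Conv2d(hidden, hidden, 3, padding=1), nn.SiLU())
        self.box_pred = nn.Conv2d(hidden, num_anchors * 4, 1)  # x, y, w, h
        self.obj_pred = nn.Conv2d(hidden, num_anchors * 1, 1)  # objectness

    def forward(self, x):
        x = self.stem(x)
        cls = self.cls_branch(x)        # classification branch
        reg = self.reg_branch(x)        # regression/objectness branch
        return self.box_pred(reg), self.obj_pred(reg), cls

boxes, obj, cls = DecoupledHead(512, num_classes=4)(torch.randn(1, 512, 20, 20))
```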

2.5. YOLOv5s-D Network Structure

The structure of the YOLOv5s-D network model is shown in Figure 6. There are small defects in the catenary dropper images. To improve the detection accuracy of these small targets, the following improvements are made to YOLOv5s: (1) an SK attention module is added to the model to enhance the extracted features; (2) the BiFPN network replaces the characteristic pyramid network for multiscale feature fusion, enhancing the fusion of shallow and deep feature information from the image; and (3) a decoupled head replaces the original YOLO head to speed up the model convergence and improve the model detection effect.
Dropper detection mainly includes image screening, image enhancement, and image annotation, as well as model training and target detection based on YOLOv5. The specific process is shown in Figure 7.

3. Experiments

Experiments were conducted on a PC running Windows 10 with an Intel Core i7-11800H CPU, an NVIDIA RTX 3060 GPU, and 16 GB of memory. The experiments were implemented in Python using the PyTorch deep learning framework.

3.1. Dataset Preparation

Images of the droppers along the catenary of a high-speed railway were acquired using the catenary monitoring device. A total of 1532 pictures of droppers were collected. After manual screening and the elimination of images without droppers, 1056 valid sample pictures containing droppers remained. Among the valid samples, the majority were normal, with relatively few abnormal samples. Too few abnormal samples would lead to overfitting, so it was necessary to expand the data to increase sample diversity and improve the accuracy and generalization performance of the dropper defect detection network. The number of defect images was expanded using CycleGAN, and the total sample size was expanded by data augmentation. A total of 4224 images were obtained, including 1612 abnormal samples and 2612 normal samples. The targets in the dropper images were classified into broken droppers, bent droppers, normal droppers, and droppers with small defects. The LabelImg software was used to label the images and generate the detection labels “dxdx”, “wqdx”, “zcdx”, and “xxqx” for the respective categories. Some examples and their corresponding category labels are shown in Figure 8. After labeling, the dataset was randomly divided into training, validation, and test sets at a ratio of 7:2:1. The experimental parameter settings are given in Table 1.
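As an illustration of this split, the following Python snippet divides a list of labeled image filenames at the 7:2:1 ratio described above; the filenames and random seed are placeholders, not details of the actual dataset pipeline.

```python
import random

# Shuffle the 4224 labeled image filenames and split them 7:2:1
random.seed(0)
images = [f"dropper_{i:04d}.jpg" for i in range(4224)]  # placeholder names
random.shuffle(images)

n_train = int(0.7 * len(images))
n_val = int(0.2 * len(images))
train_set = images[:n_train]
val_set = images[n_train:n_train + n_val]
test_set = images[n_train + n_val:]

print(len(train_set), len(val_set), len(test_set))  # roughly 2956 / 844 / 424
```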

3.2. Experimental Comparison of Different Attention Modules in YOLOv5s

To verify whether the SK attention module improves the detection performance of YOLOv5s, various attention modules were added to the original YOLOv5s model. In the same experimental environment, training was carried out using the dropper images obtained from a high-speed rail catenary. The experimental results are listed in Table 2. They indicate that the SK attention, GAM attention, CBAM, and CA modules produce a mean average precision (mAP@0.5) of 90.8%, 88.5%, 89.6%, and 89.9%, respectively. Adding the SK attention module to YOLOv5s therefore gives the greatest improvement in detection accuracy. The detection accuracy for small dropper defects is shown in Figure 9. The SK attention module also yields the highest detection accuracy for fine defects, achieving an average precision (AP) of 74.8%, compared with 68.9%, 71.2%, and 72.2% for the other three attention modules. This shows that the introduction of the SK attention module enhances the feature extraction ability for small targets.

3.3. Ablation Experiments

To directly verify the effectiveness of the different modifications, ablation experiments were conducted to assess the influence of each strategy. The three improvement strategies were added to the original YOLOv5s algorithm, and for each combination of these strategies, 150 rounds of training were performed under the same experimental conditions. The experimental results are presented in Table 3.
It can be seen that replacing the feature fusion network of YOLOv5 with BiFPN gives a slight improvement in the detection accuracy, with the AP of fine defect detection increasing by 3.4%. Adding the SK attention module to the backbone of YOLOv5s ensures higher precision in detecting fine defects: the AP of fine defect detection increases by 6.3% and the mAP@0.5 increases by 1.7%. Replacing the original detection head of YOLOv5 with a decoupled head improves the detection accuracy significantly, especially for small defects, where the AP increases by 8.3%. Overall, the improvement measures proposed in this paper greatly improve the detection accuracy of fine defects and slightly improve the detection accuracy of the dropper state. The AP of the dxdx, wqdx, zcdx, and xxqx categories increased by 1.2%, 2.6%, 0.8%, and 10.7%, respectively, and the mAP@0.5 increased by 3.8%. The visualization results of the ablation experiment are shown in Figure 10.
The loss during training reflects the discrepancy between the predicted values and the true values; as the loss decreases, the model’s performance improves and its predictions move closer to the ground truth. Figure 11 shows the loss curves of the improved network and the original network, indicating that the bounding box loss, class loss, and object loss in both the training and validation sets decline gradually and eventually stabilize.

3.4. Comparative Experiments

To further verify the superiority of the YOLOv5s-D algorithm, the proposed algorithm and other mainstream target detection algorithms (YOLOv3, YOLOv4, YOLOv5, and the latest YOLOv7) were tested under the same experimental environment. The same dropper data were used for 150 rounds of training under the same strategy. Figure 12 compares the mAP@0.5 performance of dropper detection under the different models.
As shown in Figure 12, the detection accuracy of YOLOv4 is better than that of YOLOv3, because YOLOv4 improves on the feature extraction network and data augmentation techniques of YOLOv3. The detection accuracy of YOLOv7 is higher than that of both YOLOv3 and YOLOv5, but its convergence rate is slower. The proposed YOLOv5s-D not only produces a higher detection accuracy than YOLOv3, YOLOv4, and YOLOv7, but also improves in accuracy faster than the original YOLOv5s network model. These results indicate that the improvement measures implemented in this study are effective and feasible.
To further examine the ability of the proposed algorithm to detect small dropper defects, different network models were used to test the dropper images in the test set; the detection results are shown in Figure 13, Figure 14, Figure 15 and Figure 16. The first to fifth columns in each figure are the results of YOLOv5s-D, YOLOv5, YOLOv7, YOLOv4, and YOLOv3, respectively. In Figure 13 and Figure 14, all five network models can detect the dropper state against both the daytime and night backgrounds. In Figure 15, YOLOv4 and YOLOv3 are unable to detect the small defect. As can be seen from the detection results for the dropper image containing multiple minor defects in Figure 16, the YOLOv5, YOLOv7, YOLOv4, and YOLOv3 network models each missed one or two minor defects, while YOLOv5s-D successfully detected all of them. It is evident that the YOLOv5s-D model has better detection performance.

4. Conclusions

To improve the inspection efficiency of catenaries and intelligently identify the state of catenary droppers, this paper has described an improved YOLOv5s network model. The SK attention module was added to obtain information from different receptive fields and improve the generalization ability of the network model. The original feature fusion network was replaced by BiFPN, allowing feature information of different scales to be fully fused, which improves the accuracy of dropper defect detection in complex environments. By replacing the original detection head with a decoupled head, the convergence speed of the model was enhanced and the detection accuracy of small defects was improved. The mAP@0.5 of the proposed model was found to be 92.9%, an increase of 3.8% compared with the original model. The detection accuracy of fine defects was increased by 10.7%. The detection accuracy of the dropper state was also improved, and the AP of the broken, bent, and normal dropper categories increased by 1.2%, 2.6%, and 0.8%, respectively. The proposed method accurately recognizes the status of the dropper and can identify small defects, satisfying the requirements for intelligent inspection. However, the scarcity of small defect data means that the defect types cannot be further classified. In future work, we will attempt to enhance the small defect data and further improve the dropper detection.

Author Contributions

Writing—original draft preparation, Z.L.; writing—review and editing, Z.R.; software, L.D.; validation, B.D.; visualization, J.F.; data curation, X.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Beijing Natural Science Foundation, grant number L221015.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Song, Y.; Wang, Z.; Liu, Z.; Wang, R. A spatial coupling model to study dynamic performance of pantograph-catenary with vehicle-track excitation. Mech. Syst. Signal Process. 2021, 151, 107336. [Google Scholar] [CrossRef]
  2. Duan, F.; Song, Y.; Gao, S.; Liu, Y.; Chu, W.; Lu, X.; Liu, Z. Study on aerodynamic instability and galloping response of rail overhead contact line based on wind tunnel tests. IEEE Trans. Veh. Technol. 2023, 1–11. [Google Scholar] [CrossRef]
  3. Song, Y.; Liu, Z.; Ronnquist, A.; Navik, P.; Liu, Z. Contact wire irregularity stochastics and effect on high-speed railway pantograph–catenary interactions. IEEE Trans. Instrum. Meas. 2020, 69, 8196–8206. [Google Scholar] [CrossRef]
  4. Tan, P.; Li, X.; Ding, J.; Cui, Z.; Ma, J.; Sun, Y.; Huang, B.; Fang, Y. Mask R-CNN and multifeature clustering model for catenary insulator recognition and defect detection. J. Zhejiang Univ. Sci. A 2022, 23, 745–756. [Google Scholar] [CrossRef]
  5. Li, T.; Hao, T. Damage Detection of Insulators in Catenary Based on Deep Learning and Zernike Moment Algorithms. Appl. Sci. 2022, 12, 5004. [Google Scholar] [CrossRef]
  6. Han, Y.; Liu, Z.; Lyu, Y.; Liu, K.; Li, C.; Zhang, W. Deep learning-based visual ensemble method for high-speed railway catenary clevis fracture detection. Neurocomputing 2020, 396, 556–568. [Google Scholar] [CrossRef]
  7. Wu, X.; Yuan, P.; Peng, Q.; Ngo, C.W.; He, J.Y. Detection of bird nests in overhead catenary system images for high-speed rail. Pattern Recognit. 2016, 51, 242–254. [Google Scholar] [CrossRef]
  8. Liu, W.; Liu, Z.; Li, Y.; Wang, H.; Yang, C.; Wang, D.; Zhai, D. An automatic loose defect detection method for catenary bracing wire components using deep convolutional neural networks and image processing. IEEE Trans. Instrum. Meas. 2021, 70, 5016814. [Google Scholar] [CrossRef]
  9. Huang, S.; Zhai, Y.; Zhang, M.; Hou, X. Arc detection and recognition in pantograph–catenary system based on convolutional neural network. Inf. Sci. 2019, 501, 363–376. [Google Scholar] [CrossRef]
  10. Sun, J.; Hu, K.; Fan, Y.; Liu, J.; Yan, S.; Zhang, Y. Modeling and Experimental Analysis of Overvoltage and Inrush Current Characteristics of the Electric Rail Traction Power Supply System. Energies 2022, 15, 9308. [Google Scholar] [CrossRef]
  11. Chen, L.; Guo, D.; Pan, L.; He, F. The influence of wind load on the stress characteristics of dropper for a high-speed railway. Adv. Mech. Eng. 2022, 14, 16878132221097833. [Google Scholar] [CrossRef]
  12. Chen, L.; Sun, J.; Pan, L.; He, F. Analysis of Dropper Stress in a Catenary System for a High-Speed Railway. Math. Probl. Eng. 2022, 2022, 9663767. [Google Scholar] [CrossRef]
  13. Xu, Z.; Liu, Z.; Song, Y. Study on the Dynamic Performance of High-Speed Railway Catenary System With Small Encumbrance. IEEE Trans. Instrum. Meas. 2022, 71, 3518810. [Google Scholar] [CrossRef]
  14. Bryja, D.; Hyliński, A. Droppers’ stiffness influence on dynamic interaction between the pantograph and railway catenary. Probl. Kolejnictwa 2019, 183, 89–98. [Google Scholar]
  15. Gregori, S.; Tur, M.; Nadal, E.; Fuenmayor, F.J. An approach to geometric optimisation of railway catenaries. Veh. Syst. Dyn. 2018, 56, 1162–1186. [Google Scholar] [CrossRef]
  16. Song, Y.; Zhang, M.; Wang, H. A response spectrum analysis of wind deflection in railway overhead contact lines using pseudo-excitation method. IEEE Trans. Veh. Technol. 2021, 70, 1169–1178. [Google Scholar] [CrossRef]
  17. Liu, X.; Peng, J.; Tan, D.; Xu, Z.; Liu, J.; Mo, J.; Zhu, M. Failure analysis and optimization of integral droppers used in high speed railway catenary system. Eng. Fail. Anal. 2018, 91, 496–506. [Google Scholar] [CrossRef]
  18. Yu, X.; Gu, G.; Wang, Y.; Zhang, C. Catenary Dropper Fault Detection Method Based on Faster R-CNN. J. Lanzhou Jiaotong Univ. 2021, 40, 58–65. [Google Scholar]
  19. Zhang, X.; Jing, W. Fault detection of overhead contact systems based on multi-view Faster R-CNN. J. Intell. Fuzzy Syst. 2022, 43, 397–407. [Google Scholar] [CrossRef]
  20. Zhang, X.; Gong, Y.; Qiao, C.; Jing, W. Multiview deep learning based on tensor decomposition and its application in fault detection of overhead contact systems. Vis. Comput. 2022, 38, 1457–1467. [Google Scholar] [CrossRef]
  21. Guo, Q.; Liu, L.; Xu, W.; Gong, Y.; Zhang, X.; Jing, W. An improved faster R-CNN for high-speed railway dropper detection. IEEE Access 2020, 8, 105622–105633. [Google Scholar] [CrossRef]
  22. Zhang, M.; Jin, W.; Tang, P.; Li, L. A YOLOv3 and ODIN Based State Detection Method for High-speed Railway Catenary Dropper. In Proceedings of the 2021 IEEE International Conference on Progress in Informatics and Computing (PIC), Shanghai, China, 17–19 December 2021; pp. 72–76. [Google Scholar]
  23. Liu, S.; Tang, P.; Jin, W. Study on Catenary Dropper and Support Detection Based on Intelligent Data Augmentation and Improved YOLOv3. Comput. Sci. 2020, 47, 178–182. [Google Scholar]
  24. Li, J.; Zhang, X.; Zhang, C.; Tian, T. Simulation Research on High-Speed Railway Dropper Fault Detection and Location Based on Time-Frequency Analysis. J. Phys. Conf. Ser. 2020, 1631, 012100. [Google Scholar] [CrossRef]
  25. Li, X.; Wang, W.; Hu, X.; Yang, J. Selective kernel networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 510–519. [Google Scholar]
  26. Mahaur, B.; Mishra, K. Small-object detection based on YOLOv5 in autonomous driving systems. Pattern Recognit. Lett. 2023, 168, 115–122. [Google Scholar] [CrossRef]
  27. Liu, Z.; Gao, X.; Wan, Y.; Wang, J.; Lyu, H. An Improved YOLOv5 Method for Small Object Detection in UAV Capture Scenes. IEEE Access 2023, 11, 14365–14374. [Google Scholar] [CrossRef]
  28. Wang, M.; Yang, W.; Wang, L.; Chen, D.; Wei, F.; KeZiErBieKe, H.; Liao, Y. FE-YOLOv5: Feature enhancement network based on YOLOv5 for small object detection. J. Vis. Commun. Image Represent. 2023, 90, 103752. [Google Scholar] [CrossRef]
  29. Deng, T.; Liu, X.; Mao, G. Improved YOLOv5 Based on Hybrid Domain Attention for Small Object Detection in Optical Remote Sensing Images. Electronics 2022, 11, 2657. [Google Scholar] [CrossRef]
  30. Tan, M.; Pang, R.; Le, Q.V. Efficientdet: Scalable and efficient object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 10781–10790. [Google Scholar]
  31. Ge, Z.; Liu, S.; Wang, F.; Li, Z.; Sun, J. Yolox: Exceeding yolo series in 2021. arXiv 2021, arXiv:2107.08430. [Google Scholar]
  32. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788. [Google Scholar]
  33. Redmon, J.; Farhadi, A. YOLO9000: Better, faster, stronger. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 7263–7271. [Google Scholar]
  34. Redmon, J.; Farhadi, A. Yolov3: An Incremental Improvement. arXiv 2018, arXiv:1804.02767. [Google Scholar]
  35. Bochkovskiy, A.; Wang, C.; Liao, H. YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv 2020, arXiv:2004.10934. [Google Scholar]
  36. Wang, C.Y.; Bochkovskiy, A.; Liao, H.Y.M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. arXiv 2022, arXiv:2207.02696. [Google Scholar]
Figure 1. Small defects of the dropper in images. In the red box is the small defect of the dropper.
Figure 2. The structure of YOLOv5 network model.
Figure 3. The structure of modified SK attention module.
Figure 4. Feature fusion network.
Figure 5. The structure of the decoupled head.
Figure 6. The structure of YOLOv5s-D network model.
Figure 7. Dropper detection flow chart.
Figure 8. Pictures of the dropper. In the red box is the small defect of the dropper.
Figure 9. The AP of the small defect (xxqx).
Figure 10. Visualization of ablation experiment.
Figure 11. Loss curve.
Figure 12. mAP@0.5 curves.
Figure 13. Experimental results of single dropper.
Figure 14. Experimental results of multiple droppers.
Figure 15. Experimental results of single fault.
Figure 16. Experimental results of multiple faults.
Table 1. Parameter settings.

Name    Learning Rate   Momentum   Weight Decay   Batch Size   Epoch
Value   0.01            0.937      0.0005         32           150

Table 2. Experimental results of adding different attention modules in YOLOv5s.

Module            dxdx (AP%)   wqdx (AP%)   zcdx (AP%)   xxqx (AP%)   mAP@0.5 (%)
YOLOv5s           98.3         93.1         96.3         68.5         89.1
+SK attention     98.8         94.4         95.2         74.8         90.8
+GAM attention    98.3         92.1         94.6         68.9         88.5
+CBAM             98.4         93.1         95.9         71.4         89.6
+CA               98.9         92.0         96.7         72.2         89.9

Table 3. Ablation experiment results.

Module                   dxdx (AP%)   wqdx (AP%)   zcdx (AP%)   xxqx (AP%)   mAP@0.5 (%)
YOLOv5s                  98.3         93.1         96.3         68.5         89.1
+BiFPN                   98.7         94.5         96.1         71.9         89.9
+SK                      98.8         94.4         95.2         74.8         90.8
+Decoupled head          99.3         94.1         96.4         76.8         91.7
+BiFPN+SK                99.2         93.1         96.7         75.6         91.2
+BiFPN+Decoupled head    99.1         94.3         95.1         77.9         91.6
+SK+Decoupled head       98.7         92.6         97.3         76.5         91.2
YOLOv5s-D                99.5         95.7         97.1         79.2         92.9
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
