1. Introduction
Mine hoisting equipment undertakes transportation and load-bearing tasks in mines and plays an indispensable role in the coal industry [1]. Mining wire rope (MWR) is a core component of mine hoisting equipment, and its operating status is directly related to the production safety and economic benefits of the coal industry. During operation, MWR is often affected by bending fatigue, mechanical shock, and harsh working conditions, and is prone to damage such as wire breakage. If the health status of MWR cannot be detected in a timely and accurate manner, mine production will not only suffer economic losses, but the lives of workers will also be endangered. Therefore, the development of an accurate and reliable MWR damage identification method is of great significance to ensure production safety and improve operational reliability in the coal industry.
Non-destructive testing (NDT) methods for wire rope damage include ultrasonic testing [2], acoustic emission testing [3], ray testing [4], and electromagnetic testing [5]. Among these, electromagnetic testing is a commonly employed technique for wire rope damage detection. Electromagnetic testing methods encompass magnetic flux leakage (MFL) testing [6], eddy current testing [7], magnetic memory testing [8], and Barkhausen noise testing [9]. Owing to its low cost and convenient operation, MFL testing has become the most widely adopted method for detecting damage in wire rope. The key to quantitative MWR damage identification based on MFL testing lies in the extraction of representative damage features from the signal. High-quality feature extraction not only provides a sufficient discriminative basis for subsequent classification but also improves computational efficiency, thereby laying the foundation for accurate identification of MWR damage severity. Traditional wire rope damage identification typically involves two stages: manual feature extraction from the raw signal, followed by classification with a model such as a BP neural network or a support vector machine [10,11]. While these methods have achieved a certain degree of effectiveness, they also have limitations: the feature extraction process is time-consuming, relies on experience, and is highly subjective, which can affect the final classification accuracy.
In recent years, the accelerated advancement of artificial intelligence (AI) has driven the extensive application of deep learning (DL) in mechanical fault diagnosis. Leveraging its powerful feature learning and pattern recognition capabilities, DL can automatically extract effective information directly from signals in the time, frequency, or time-frequency domains, eliminating tedious manual extraction and significantly improving diagnostic accuracy and efficiency. Inspired by this, Zhang et al. [12] used the continuous wavelet transform (CWT) to process the MFL signal of damaged wire rope and employed a convolutional neural network (CNN) to quantitatively identify internal and external broken wires. They then combined a generative adversarial network (GAN) with a CNN to quantitatively identify MFL signals after CWT, further improving damage identification accuracy [13]. Liu et al. [14] used a modified Hilbert transform (HT) and a long short-term memory (LSTM) network to quantitatively identify wire rope damage. Tian et al. [15] proposed a method based on CWT and an improved CNN for the efficient identification of both minor and major damage in MWR, demonstrating exceptional performance in early damage detection. Zhao et al. [16] proposed an end-to-end DL model for the quantitative identification of MWR damage. This model automatically learns features directly from raw signals, eliminating the need for manual extraction and effectively ensuring safety in mining operations. However, these methods rely on large amounts of labeled data to train highly accurate damage diagnosis models. In real-world industrial environments, obtaining sufficient labeled data is challenging due to the complex and harsh working conditions of coal mines. Furthermore, manual labeling is time-consuming and requires specialized knowledge, making it difficult to meet large-scale data requirements. These factors limit the effectiveness of traditional DL methods in MWR damage diagnosis.
To deal with the issue of limited labeled samples, researchers have proposed various methods, including transfer learning (TL) and semi-supervised learning. TL alleviates overfitting by extracting common features between the source and target domains, but its over-reliance on domain relevance can easily lead to negative knowledge transfer. Semi-supervised learning uses unlabeled data to assist in training to improve model performance, but in practical applications it often fails to fully exploit fault information in unlabeled data. Furthermore, TL requires careful selection of transferable prior knowledge and auxiliary tasks in practice, while suitable source domain samples are sometimes difficult to obtain, limiting the effectiveness of the method [17].
To solve the aforementioned issues, self-supervised learning offers a new approach that can use large amounts of unlabeled signals to generate supervisory signals automatically and learn effective representations. Self-supervised methods are primarily categorized into generative and contrastive approaches: generative methods map inputs to latent spaces via encoders and reconstruct the original signals using decoders; contrastive methods employ data augmentation to construct different views of each sample, extracting discriminative features by maximizing similarity between views of the same sample while minimizing similarity between different samples [18]. Unlike generative methods, contrastive learning does not require reconstructing input details but directly captures differences within high-dimensional semantic spaces. Consequently, its optimization is more efficient, and the resulting models exhibit stronger generalization capabilities.
Contrastive learning was initially applied to computer vision tasks such as image classification and object recognition [19]. In recent years, it has been gradually introduced into the field of intelligent fault diagnosis to alleviate the scarcity of labeled samples and enhance the effectiveness of feature extraction and the generalization ability of models. Ding et al. [20] proposed a contrastive learning-based self-supervised pre-training method to learn discriminative representations from unlabeled bearing vibration signals, enabling early fault detection in bearings when combined with a small amount of labeled data. Zhang et al. [21] applied contrastive learning to imbalanced fault diagnosis, with experimental results demonstrating excellent performance in imbalanced scenarios. Zhu et al. [22] achieved accurate identification of wind turbine gearbox faults under limited labeling and complex operating conditions by integrating temporal prediction with similarity contrastive learning through an embedded self-attention mechanism. Sha et al. [23] combined attribute supervision with contrastive learning-based feature generation for zero-shot fault diagnosis, optimizing feature space representations to enhance recognition performance for unknown faults. However, the application of contrastive learning in the mining industry remains limited, especially in the field of MWR damage identification. Due to the complex environment and noise interference in coal mines, existing methods struggle to accurately identify MWR damage with limited labeled data.
To address these challenges, this paper proposes a self-supervised contrastive learning method based on multi-scale convolution and informer (MS-Informer) for quantitative damage identification in MWR. This method automatically learns features from a massive volume of unlabeled damage signals, effectively improving the model’s identification capabilities, even with limited labeled samples. The main contributions and novelties of this paper are as follows:
- (1) A self-supervised contrastive learning framework for damage identification of MWR is proposed. This framework can fully utilize unlabeled monitoring data from industrial sites and can still achieve accurate identification under limited labeling conditions, providing a novel solution for condition monitoring and fault diagnosis of mine hoisting systems.
- (2) A combined data augmentation strategy for MFL signals of MWR is proposed. This strategy generates more representative positive and negative samples through combined data augmentation operations, effectively improving the feature discrimination ability in the contrastive learning process and enhancing the robustness of the model in complex environments.
- (3) A contrastive learning model, MS-Informer, is constructed. This model focuses on both local details and overall regularities in the signal, making damage detection more comprehensive and reliable.
This paper is organized as follows:
Section 2 introduces the theoretical basis and overall framework of the proposed method.
Section 3 details the experimental setup and process.
Section 4 analyzes the experimental results.
Section 5 concludes the study.
3. Experiments
3.1. Experimental Platform
The experimental platform for non-destructive testing of MWR was built at the Key Laboratory of Coal Mine Intelligence and Robot Innovative Application of the Ministry of Emergency Management at China University of Mining and Technology (Beijing). This platform comprises an experimental bench, electric motor, power supply unit, transmission device, signal acquisition device, flaw detector, and MWR samples. The permanent magnet was of the NdFeB type. The flaw detector’s housing and armature were constructed from industrial pure iron DT4 with high magnetic susceptibility to fully excite the MWR. Furthermore, the magnetic sensor was a THS119 Hall effect device, and the collected signal was amplified by an amplifier circuit containing an AD620 chip. Data acquisition was performed using an Altech USB5633 acquisition card. The overall structure is shown in Figure 5. The common types of damage in MWR are shown in Figure 6. During MFL signal collection, the sampling frequency was set to 2000 Hz.
During actual operation, MWR damage can be caused by multiple factors and is commonly classified into two categories: Localized Flaw (LF) and Loss of Metallic Area (LMA) [16]. LF-type damage corresponds to localized defects occurring over a short rope length, such as broken wires, internal wire rupture and local wear, which lead to abrupt reductions in the effective cross-sectional area. In contrast, LMA-type damage is characterized by a gradual loss of metallic area along the axial direction of the rope, typically associated with uniform wear or corrosion over extended lengths. In engineering practice, the number of broken wires within a specified rope length is commonly used as a quantitative indicator for damage assessment and scrapping criteria. Accordingly, this study focuses on LF-type damage associated with broken wire defects. The constructed datasets include damage cases with different numbers of broken wires and varying broken wire lengths, which represent typical LF damage scenarios suitable for MFL inspection. The proposed MS-Informer framework is therefore developed and validated for the identification of localized broken wire damage in MWR. Other damage types, including LMA-related defects caused by uniform corrosion, plastic deformation or fatigue, are not explicitly considered in this study and remain outside the current scope.
3.2. Experimental Datasets and Implementation Details
The MWR model selected in this paper is the 6 × 19S + FC type with a diameter of 26 mm. All experiments were conducted under controlled laboratory conditions in order to ensure measurement consistency and repeatability. To comprehensively evaluate the performance of the proposed method in identifying MWR damage, three distinct datasets were constructed: Dataset A, Dataset B, and Dataset C.
In Dataset A, the length of broken wires remains constant, while the number of broken wires starts at 3 and increases by 2 each time, to validate the model’s identification capability under varying numbers of broken wires.
In Dataset B, the number of broken wires is fixed, and the length of broken wires starts at 3 mm, increasing by 2 mm each time, thereby further validating the effectiveness and generalization capability of the proposed method under different damage lengths.
Considering the complex forms of MWR damage in real-world mine environments, this paper also constructed a mixed dataset, Dataset C. This dataset contains not only normal samples but also two typical damage scenarios: one with a fixed broken wire length and the number of broken wires set to 3, 7, and 11, respectively; and another with a fixed number of broken wires and the lengths set to 5 mm, 9 mm, and 13 mm, respectively. Details of the three datasets are shown in Table 2.
Due to the limitations of actual industrial conditions, the collected MFL damage data generally cannot meet the needs of model training. This paper therefore uses the window slicing method [28] to expand the MWR damage data. Each type of MWR damage signal consists of 280,000 data points. After window slicing, each type of MWR damage signal is divided into samples of 1024 data points. The three datasets in the experiment used a total of 6400 samples from 7 different damage types, which were split into training, validation, and test sets at 60%, 20%, and 20%, respectively. All measurements were performed under identical experimental settings, including sensor configuration, excitation conditions and acquisition speed, to ensure data consistency. The same acquisition procedure was repeated for different damage cases, and the window slicing process was applied uniformly across all datasets to improve repeatability. For each of the three MWR damage datasets, the encoder and classifier were fine-tuned with label ratios of 5%, 15%, and 30%. The corresponding parameter settings for data augmentation are shown in Table 3.
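The window slicing and the 60%/20%/20% split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the window length of 1024 comes from the text, whereas the stride of 512, the random seed, and the function names `window_slice` and `split_indices` are assumptions introduced here, since the overlap between consecutive windows is not stated.

```python
import random


def window_slice(signal, window=1024, stride=512):
    """Cut a 1-D signal into fixed-length, possibly overlapping windows.

    window=1024 follows the text; stride=512 is a hypothetical value.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    last_start = len(signal) - window
    # Windows that would run past the end of the signal are dropped.
    return [signal[s:s + window] for s in range(0, last_start + 1, stride)]


def split_indices(n_samples, fractions=(0.6, 0.2, 0.2), seed=0):
    """Shuffle sample indices and split them into train/validation/test."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # assumed seed, for repeatability
    n_train = round(n_samples * fractions[0])
    n_val = round(n_samples * fractions[1])
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test
```

A smaller stride yields more, and more strongly overlapping, samples from the same recording, so the stride would in practice be chosen to reach the desired number of samples per damage type.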
This noise coefficient controls the amplitude of additive Gaussian noise introduced to simulate measurement noise and electromagnetic interference commonly observed in MFL signals. The adopted value represents a moderate noise level relative to the signal amplitude, aiming to enhance robustness while avoiding excessive distortion of damage-related features. Translation corresponds to randomly shifting the signal along the time axis to account for temporal misalignment caused by speed fluctuation or sensor position variation. The translation position is selected randomly within the signal length, and no physical unit is involved. Jitter introduces small stochastic perturbations to simulate local signal fluctuations. The jitter follows a Gaussian distribution with zero mean and a standard deviation of 0.07, which is selected to introduce slight variability without altering the overall waveform structure. The scaling factor controls amplitude scaling of the signal to emulate variations in magnetic field strength caused by changes in lift-off distance or excitation conditions. The adopted value ensures mild amplitude variation while preserving damage-sensitive patterns. These parameters were selected based on prior experience with MFL signal processing and preliminary empirical tuning to balance invariance learning and feature preservation. While no exhaustive parameter optimization was conducted, the chosen values were found to provide stable contrastive pre-training and consistent downstream performance.
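The four augmentation operations described above can be sketched in plain Python. This is an illustrative sketch under stated assumptions: only the jitter standard deviation of 0.07 is taken from the text; the noise coefficient and scaling factor defaults, the use of a circular shift for translation, the pairing of operations in `make_views`, and all function names are hypothetical.

```python
import random


def add_gaussian_noise(x, coeff, rng):
    """Additive Gaussian noise whose amplitude is set by a noise coefficient."""
    return [v + coeff * rng.gauss(0.0, 1.0) for v in x]


def translate(x, rng):
    """Shift the signal along the time axis by a random number of points.

    A circular shift is assumed here so that the length is preserved.
    """
    k = rng.randrange(len(x))
    return x[k:] + x[:k]


def jitter(x, rng, sigma=0.07):
    """Zero-mean Gaussian perturbation with standard deviation 0.07."""
    return [v + rng.gauss(0.0, sigma) for v in x]


def scale(x, factor):
    """Multiply the amplitude by a scaling factor."""
    return [factor * v for v in x]


def make_views(x, rng, noise_coeff=0.05, scale_factor=1.1):
    """Build two augmented views of one sample (hypothetical combination)."""
    view_a = add_gaussian_noise(translate(x, rng), noise_coeff, rng)
    view_b = scale(jitter(x, rng), scale_factor)
    return view_a, view_b
```

The two views returned by `make_views` would form a positive pair for contrastive pre-training, while views of other samples in the same mini-batch act as negatives.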
The program was written in Python 3.9 using the PyTorch framework (version 2.0). The hardware configuration comprised an Intel Core i5-12600H CPU and an NVIDIA RTX4060 GPU. The batch size was 32, the learning rate was 0.0005, and the Adam optimizer provided by PyTorch was adopted. Additionally, to minimize randomness, all methods were tested under the same conditions.
3.3. Model Performance Metrics
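When analyzing results and comparing models, evaluation metrics serve as crucial standards for measuring model performance [29]. To comprehensively validate the effectiveness of the proposed method and deliver accurate, reliable results, evaluation metrics such as accuracy, precision, recall, and F1 score have been introduced.

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

$$\mathrm{Precision} = \frac{TP}{TP + FP}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN}$$

$$\mathrm{F1} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$

where $TP$ and $TN$ represent the number of correctly classified positive and negative samples, respectively, while $FP$ and $FN$ denote the number of misclassified positive and negative samples, respectively.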
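For a multi-class problem such as damage-state identification, these metrics are commonly computed per class in a one-vs-rest fashion, with the class of interest treated as positive, and then averaged. The sketch below assumes that convention; the function names `class_metrics` and `macro_f1` are introduced here for illustration only.

```python
def class_metrics(y_true, y_pred, label):
    """One-vs-rest accuracy, precision, recall and F1 for a single class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted or absent.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = [class_metrics(y_true, y_pred, c)["f1"] for c in labels]
    return sum(scores) / len(scores)
```

Because the macro average weights every class equally, a poorly recognized damage state lowers the score even when it contributes few samples.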
4. Results and Discussion
4.1. Analysis of Self-Supervised Pre-Training and Fine-Tuning
The damage identification model proposed in this paper consists of two stages: self-supervised pre-training and fine-tuning. Based on the aforementioned configuration, the training processes for these two stages are shown in Figure 7. During the pre-training phase, self-supervised learning is performed using contrastive loss, and the model achieves optimal performance within approximately 300 epochs. In the fine-tuning phase, the model converges rapidly on a small amount of labeled data and stabilizes within 100 epochs, demonstrating that the representational capabilities acquired during pre-training can be effectively transferred to downstream classification tasks.
The temperature parameter in the contrastive loss was empirically set to a fixed value across all experiments, and negative samples were implicitly constructed from other instances within the same mini-batch. The pre-training and fine-tuning processes were conducted for a predefined number of epochs under identical experimental conditions. To avoid overfitting, early stopping was applied based on the validation loss, and the model parameters corresponding to the best validation performance were retained.
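A temperature-scaled contrastive loss with in-batch negatives can be sketched as below. The exact loss used in this work is not restated in this section, so the sketch assumes the widely used NT-Xent form with cosine similarity; the temperature of 0.5 and the function names are hypothetical, and plain Python lists stand in for the encoder outputs.

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def nt_xent(z_a, z_b, temperature=0.5):
    """NT-Xent loss for N positive pairs (z_a[i], z_b[i]).

    Every other embedding in the mini-batch serves as a negative.
    temperature=0.5 is an assumed value.
    """
    if len(z_a) != len(z_b):
        raise ValueError("both views must contain the same number of samples")
    n = len(z_a)
    z = list(z_a) + list(z_b)
    total = 0.0
    for i in range(2 * n):
        j = (i + n) % (2 * n)  # index of the positive partner of sample i
        logits = [cosine(z[i], z[k]) / temperature
                  for k in range(2 * n) if k != i]
        peak = max(logits)  # subtract the maximum for a stable log-sum-exp
        log_sum = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum - cosine(z[i], z[j]) / temperature
    return total / (2 * n)
```

A lower temperature sharpens the softmax over similarities, so hard negatives contribute more strongly to the gradient.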
4.2. Identification Results and Comparison Analysis
To comprehensively evaluate the performance of the proposed method, several representative baseline models were selected for comparison, including supervised, semi-supervised, and self-supervised approaches. All baseline models take the same one-dimensional time-series signals as input, with an identical input dimension to that of the proposed model, and are trained using the same data preprocessing and augmentation strategy to ensure a fair comparison. These include commonly used supervised methods such as CNN and LSTM, semi-supervised methods like HCAE [30] and RelaNet [31], and the self-supervised method AFFE [25]. The CNN baseline consists of several stacked one-dimensional convolutional layers followed by pooling layers and a fully connected classifier. The LSTM model employs stacked long short-term memory layers to capture temporal dependencies in sequential data, followed by a dense classification layer. For semi-supervised learning, HCAE adopts an encoder–decoder structure to learn latent representations from both labeled and unlabeled data, with the encoder output connected to a classifier for damage identification. RelaNet is a relation-based semi-supervised model that enhances feature discrimination by modeling the relationships among samples in the latent space. The self-supervised baseline AFFE leverages an auxiliary pretext task to learn robust representations from unlabeled data, which are subsequently fine-tuned using limited labeled samples for classification. To reduce randomness, results were averaged across 10 experiments, as shown in Table 4.
The experimental results show that traditional supervised learning methods have limited performance in damage identification on the three datasets, especially when there are few labeled samples. While their accuracy improves as the number of labeled samples increases, it remains highly dependent on labeled samples. In contrast, the semi-supervised methods HCAE and RelaNet achieved improved accuracy on all three datasets, demonstrating their ability to extract more effective information from limited labeled data through similarity calculation and category expansion. The self-supervised method AFFE, leveraging a two-stage training strategy combining abundant unlabeled data with limited labeled data, demonstrated significant advantages even with low labeled data ratios. At a 5% labeled ratio, AFFE achieved identification accuracy on the three datasets that surpassed LSTM and RelaNet by over 18% and 2%, respectively. At a 30% labeled ratio, AFFE maintained its advantage, with accuracy exceeding LSTM and RelaNet by over 16% and 1.7%, respectively. This demonstrates that as the labeled ratio decreases, the performance gains from self-supervised methods become more pronounced, reflecting their ability to capture the feature distribution of unlabeled data.
Notably, the proposed method maintains optimal performance across varying labeled ratios, fully demonstrating its advantages in the quantitative identification of MWR damage. By efficiently mining latent features in unlabeled data, this approach achieves accurate identification even under limited labeled conditions, exhibiting excellent adaptability and stability. Overall, the proposed method not only performs exceptionally well in experimental validation but also possesses practical value in meeting the engineering demands of MWR damage identification.
To visually demonstrate the performance of MS-Informer under different labeling ratios, the macro-averaged F1 (MF1) scores on the three datasets are shown in Figure 8. It is evident that while all methods exhibit performance degradation as the labeling ratio decreases, the proposed method demonstrates superior stability. This indicates that the proposed model exhibits greater robustness and discriminative capability when dealing with unlabeled data, suggesting a more effective learned feature space. Overall, the proposed model achieves the best results in both accuracy and MF1 score across tasks with varying labeling ratios, further validating its superior performance.
Figure 9 shows the F1 scores for various damage states in the mixed dataset C under the 15% labeled ratio task. It can be concluded that MS-Informer achieves the highest F1 scores in most types, indicating its ability to effectively extract potential prior information from unlabeled signals, thereby enabling more accurate damage identification. Compared to the advanced self-supervised contrastive learning method AFFE, MS-Informer demonstrates superior feature representation and classification performance, further highlighting its potential for application in MWR damage identification.
The confusion matrices for the three datasets under the 15% labeled ratio are shown in Figure 10. The confusion matrices clearly reflect the identification accuracy of each category, and MS-Informer achieves excellent identification results for various types and degrees of damage. These results also demonstrate that the designed data augmentation strategy for generating positive and negative sample pairs effectively improves the feature extraction capabilities of MS-Informer, enabling it to maintain high identification accuracy while exhibiting robust performance, thereby meeting the application requirements of practical industrial scenarios.
To illustrate the performance of the proposed method more intuitively, t-distributed stochastic neighbor embedding (t-SNE) [32] was employed to visualize samples from different categories in the three datasets under a 15% labeled ratio.
Figure 11 demonstrates that the proposed method effectively embeds features into the corresponding feature space and clearly distinguishes samples across different categories, exhibiting excellent discriminative ability. This indicates that the method reduces intra-class distances while expanding inter-class distances without label assistance, thereby extracting discriminative features. During the pre-training phase, multiple augmented views of the same sample are effectively learned to capture the invariant information between similar samples. After fine-tuning, samples from different categories are further separated, while similar samples are more closely clustered with minimal label assistance. Therefore, the proposed method can achieve self-supervised feature learning on large-scale unlabeled data and provide strong support for downstream tasks with minimal prior information.
4.3. Ablation Experiment
To further illustrate the effectiveness of the key modules in MS-Informer, ablation experiments were conducted under a 15% labeling ratio condition. The results are shown in Figure 12.
It can be observed that the complete MS-Informer baseline model achieved the best performance, demonstrating that the combination of multi-scale convolutions and long-term sequence modeling can effectively enhance the accuracy and robustness of quantitative damage identification in MWR.
When the Informer encoder is removed, the model relies solely on multi-scale convolution for local modeling, resulting in a significant decline in performance. This indicates that global dependency features are crucial for identifying MFL time-series signals. Conversely, removing the MS module causes performance degradation as the model relies solely on attention mechanisms to capture global features, highlighting the critical role of local multi-scale feature extraction in capturing transient damage patterns. Furthermore, replacing MS with single-scale convolutions yields better performance than removing either MS or Informer, yet still falls short of the full model, demonstrating that multi-scale design offers superior feature representation capabilities compared to single convolutional kernels.
In summary, ablation experiments validate the complementary roles of multi-scale convolution and Informer encoder in quantifying MWR damage with limited labeled data: the former fully exploits local transient features in MFL signals, while the latter excels at capturing long-term dependencies and global patterns. Their integration significantly enhances damage identification performance under constrained labeling conditions.
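The idea of extracting local features at several scales in parallel can be illustrated with a small sketch. It is not the MS module itself: the kernel sizes (3, 7, 15) are hypothetical, and fixed moving-average kernels stand in for the learned convolution weights purely to show how branches with different receptive fields respond to the same transient.

```python
def conv1d_same(x, kernel):
    """Zero-padded 1-D cross-correlation that keeps the input length.

    Assumes an odd kernel length so the output stays centred.
    """
    half = len(kernel) // 2
    padded = [0.0] * half + list(x) + [0.0] * half
    return [sum(w * padded[i + j] for j, w in enumerate(kernel))
            for i in range(len(x))]


def multi_scale_features(x, kernel_sizes=(3, 7, 15)):
    """Run parallel branches with different kernel sizes and stack the outputs.

    Each branch uses a moving-average kernel (a stand-in for learned weights).
    The result has one channel per kernel size.
    """
    return [conv1d_same(x, [1.0 / k] * k) for k in kernel_sizes]
```

In this toy setting a narrow kernel preserves a short spike almost unchanged while a wide kernel spreads it out, which is why stacking several scales keeps both transient detail and broader context available to the subsequent global encoder.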
4.4. Analysis of Combined Data Augmentation Strategy
In this section, we evaluated the impact of six different data augmentation combination strategies on the feature extraction capability of the proposed method. The experimental results are shown in Table 5, which lists the identification accuracy of the six data augmentation combination strategies on the three datasets. Since the contrastive learning method used in this paper uses a combined data augmentation strategy to generate positive sample pairs, we set C1 to C3 as different configurations under the same data augmentation conditions for comparison, while C4 to C6 are used to analyze the combined effects of the four different augmentation techniques.
It can be observed that the average accuracy exceeds 94.5% under different data augmentation combinations, demonstrating the robustness of MS-Informer when dealing with limited labeled data. The results show that the identification performance of C1 to C3 under the same data augmentation conditions is generally inferior to that of C4 to C6, indicating that using different augmentation strategies can help improve the model’s feature learning ability and generalization. Configuring these varied data augmentation methods generates positive and negative sample pairs that are harder to distinguish, which pushes the model to capture the differences between samples more deeply. The resulting model possesses stronger representational capabilities, which helps improve performance on downstream classification tasks.
5. Conclusions
This paper proposes a self-supervised contrastive learning method for MWR damage identification based on MS-Informer. The method was validated on three datasets of MWR damage with varying degrees and states of damage. The experimental results demonstrate that MS-Informer exhibits outstanding performance in identifying MWR damage with limited labeled data and across varying damage states. This method not only effectively enhances damage identification accuracy to ensure safe mining operations but also provides a reference for subsequent MWR maintenance, thereby improving overall mining production efficiency. The conclusions of this paper are as follows.
- (1) A self-supervised contrastive learning framework for MWR is constructed, which can achieve accurate damage identification under limited labeled data and different damage degrees and states.
- (2) By integrating multi-scale convolution and Informer within a contrastive learning framework, the proposed model learns robust local and global feature representations, which leads to improved damage state identification performance in terms of classification accuracy and related statistical metrics.
- (3) The acquired representations are conveniently transferable to diverse downstream tasks, indicating that the method has good adaptability in various monitoring and damage identification scenarios.
Nevertheless, the present study still has several limitations. The proposed framework requires relatively high computational cost during the pre-training stage, and the fine-tuning process still depends on a limited amount of labeled data. Moreover, the model is validated under fixed sensor configurations, and significant changes in sensing conditions, such as sensor spacing or lift-off distance, may require additional calibration or fine-tuning. Future work will focus on reducing training cost, improving cross-condition generalization, and extending the framework to more downstream tasks.