A Self-Supervised Contrastive Learning Framework Based on Multi-Scale Convolution and Informer for Quantitative Identification of Mining Wire Rope Damage

Zhao, Chun; Tian, Jie; Wang, Hongyao

doi:10.3390/machines14010054


by Chun Zhao ^1,2, Jie Tian ^1,2,* and Hongyao Wang ^1,2

¹ School of Mechanical and Electrical Engineering, China University of Mining and Technology, Beijing 100083, China

² Key Laboratory of Coal Mine Intelligence and Robot Innovative Application, Ministry of Emergency Management, China University of Mining and Technology, Beijing 100083, China

* Author to whom correspondence should be addressed.

Machines 2026, 14(1), 54; https://doi.org/10.3390/machines14010054

Submission received: 5 December 2025 / Revised: 26 December 2025 / Accepted: 30 December 2025 / Published: 31 December 2025

(This article belongs to the Section Machines Testing and Maintenance)


Abstract

Mining wire rope (MWR) is a critical part of mine hoisting equipment, and damage to it directly threatens production safety and economic efficiency in the coal industry. Therefore, developing efficient and reliable methods for MWR damage identification is of great significance. However, acquiring substantial labeled damage data in real industrial scenarios is often impractical, and accurate damage identification with limited labeled data remains an urgent challenge. To address this, this paper proposes a self-supervised contrastive learning method based on multi-scale convolution and Informer (MS-Informer) for the quantitative damage identification of MWR. This method first generates positive and negative sample pairs through a combined data augmentation strategy to increase data diversity and enhance feature generalization. Subsequently, multi-scale convolution is combined with Informer to simultaneously capture local details and global features. Finally, limited labeled data are incorporated during the fine-tuning phase to accurately identify MWR damage. The experimental results on multiple MWR damage datasets demonstrate that the proposed method achieves superior identification performance under conditions of limited labeled data. This framework not only enhances the accuracy and reliability of MWR damage detection but also provides crucial guidance for the safe operation and subsequent maintenance of mine hoisting equipment.

Keywords: mining wire rope; damage identification; self-supervised; contrastive learning; MS-Informer

1. Introduction

Mine hoisting equipment undertakes transportation and load-bearing tasks in mines and plays an indispensable role in the coal industry [1]. As a core component of mine hoisting equipment, the operating status of mining wire rope (MWR) is directly related to the production safety and economic benefits of the coal industry. During operation, MWR is often affected by bending fatigue, mechanical shock, and harsh working conditions, and is prone to damage such as wire breakage. If the health status of MWR cannot be detected in a timely and accurate manner, mine production will not only suffer economic losses, but workers' lives will also be endangered. Therefore, the development of an accurate and reliable MWR damage identification method is of great significance to ensure production safety and improve operational reliability in the coal industry.

Non-destructive testing (NDT) methods for wire rope damage include ultrasonic testing [2], acoustic emission testing [3], ray testing [4], and electromagnetic testing [5]. Among these, electromagnetic testing is a commonly employed technique for wire rope damage detection. Electromagnetic testing methods encompass magnetic flux leakage (MFL) testing [6], eddy current testing [7], magnetic memory testing [8], and Barkhausen noise testing [9]. Owing to its low price and convenient operation, MFL testing has become the most widely adopted method for detecting damage in wire rope. The key to quantitative MWR damage identification based on MFL testing lies in the extraction of representative damage features from the signal. High-quality feature extraction not only provides sufficient discriminative basis for subsequent classification but also improves computational efficiency, thereby laying the foundation for accurate identification of MWR damage severity. Traditional wire rope damage identification typically involves two stages: first, manual feature extraction from the raw signal, followed by the integration of various classification models (such as BP neural networks or support vector machines) to complete damage identification [10,11]. While these methods have achieved certain effectiveness, they also present limitations: the feature extraction process is time-consuming, relies on experience, and is highly subjective, which can affect the final classification accuracy.

In recent years, the accelerated advancement of artificial intelligence (AI) has driven the extensive application of deep learning (DL) in mechanical fault diagnosis. Leveraging its powerful feature learning and pattern recognition capabilities, DL can automatically extract effective information directly from signals in the time, frequency, or time-frequency domains, eliminating tedious manual extraction and significantly improving diagnostic accuracy and efficiency. Inspired by this, Zhang et al. [12] used continuous wavelet transform (CWT) to process the MFL signal of damaged wire rope and employed a convolutional neural network (CNN) to quantitatively identify internal and external broken wires. They then combined a generative adversarial network (GAN) with a CNN to quantitatively identify MFL signals processed by CWT, further improving damage identification accuracy [13]. Liu et al. [14] used a modified Hilbert transform (HT) and a long short-term memory (LSTM) network to quantitatively identify wire rope damage. Tian et al. [15] proposed a method based on CWT and an improved CNN for the efficient identification of both minor and major damage in MWR, demonstrating exceptional performance in early damage detection. Zhao et al. [16] proposed an end-to-end DL model for the quantitative identification of MWR damage. This model automatically learns features directly from raw signals, eliminating the need for manual extraction and effectively ensuring safety in mining operations. However, these methods rely on large amounts of labeled data to train highly accurate damage diagnosis models. In real-world industrial environments, obtaining sufficient labeled data is challenging due to the complex and harsh working conditions of coal mines. Furthermore, manual labeling is time-consuming and requires specialized knowledge, making it difficult to meet the large-scale data requirements. These factors limit the effectiveness of traditional DL methods in MWR damage diagnosis.

To deal with the issue of limited labeled samples, researchers have proposed various methods, including transfer learning (TL) and semi-supervised learning. TL alleviates overfitting by extracting common features between the source and target domains, but its over-reliance on domain relevance can easily lead to negative knowledge transfer. Semi-supervised learning uses unlabeled data to assist in training to improve model performance, but in practical applications it often fails to fully exploit fault information in unlabeled data. Furthermore, TL requires careful selection of transferable prior knowledge and auxiliary tasks in practice, while suitable source domain samples are sometimes difficult to obtain, limiting the effectiveness of the method [17].

To solve the aforementioned issues, self-supervised learning offers a new approach that can utilize large amounts of unlabeled signals for automatic labeling and learning effective representations. Self-supervised methods are primarily categorized into generative and contrastive approaches: generative methods map inputs to latent spaces via encoders and reconstruct the original signals using decoders, whereas contrastive methods employ data augmentation to construct diverse views, extracting discriminative features by maximizing similarity between views of the same sample while minimizing similarity between different samples [18]. Unlike generative methods, contrastive learning does not require reconstructing input details but directly captures differences within high-dimensional semantic spaces. Consequently, its optimization is more efficient, and the resulting models exhibit stronger generalization capabilities.

Contrastive learning was initially applied to computer vision tasks such as image classification and object recognition [19]. In recent years, it has been gradually introduced into the field of intelligent fault diagnosis to alleviate the scarcity of labeled samples and enhance the effectiveness of feature extraction and the generalization ability of models. Ding et al. [20] proposed a contrastive learning-based self-supervised pre-training method to learn discriminative representations from unlabeled bearing vibration signals, enabling early fault detection in bearings when combined with a small amount of labeled data. Zhang et al. [21] applied contrastive learning to imbalance fault diagnosis, with experimental results demonstrating excellent performance in imbalance scenarios. Zhu et al. [22] achieved accurate identification of wind turbine gearbox faults under limited labeling and complex operating conditions by integrating temporal prediction with similarity contrastive learning through an embedded self-attention mechanism. Sha et al. [23] combined attribute supervision with contrastive learning-based feature generation for zero-shot fault diagnosis, optimizing feature space representations to enhance recognition performance for unknown faults. However, the application of contrastive learning in the mining industry remains limited, especially in the field of MWR damage identification. Due to the complex environment and noise interference in coal mines, existing methods struggle to accurately identify MWR damage with limited labeled data.

To address these challenges, this paper proposes a self-supervised contrastive learning method based on multi-scale convolution and Informer (MS-Informer) for quantitative damage identification in MWR. This method automatically learns features from a massive volume of unlabeled damage signals, effectively improving the model’s identification capabilities, even with limited labeled samples. The main contributions and novelties of this paper are as follows:

(1): A self-supervised contrastive learning framework for damage identification of MWR is proposed. This framework can fully utilize unlabeled monitoring data from industrial sites and can still achieve accurate identification under limited labeling conditions, providing a novel solution for condition monitoring and fault diagnosis of mine hoisting systems.
(2): A combined data augmentation strategy for MFL signals of MWR is proposed. This strategy generates more representative positive and negative samples through combined data augmentation operations, effectively improving the feature discrimination ability in the contrastive learning process and enhancing the robustness of the model in complex environments.
(3): A contrastive learning model MS-Informer is constructed. This model focuses on both local details and overall regularities in the signal, making damage detection more comprehensive and reliable.

This paper is organized as follows: Section 2 introduces the theoretical basis and overall framework of the proposed method. Section 3 details the experimental setup and process. Section 4 analyzes the experimental results. Section 5 concludes the study.

2. Methods

2.1. Self-Supervised Learning

The core of self-supervised learning lies in extracting potential information from unlabeled data. By designing pre-training tasks, the model is guided to generate pseudo labels from the data itself, thereby learning discriminative representations. The self-supervised model is then fine-tuned for downstream tasks. Pre-training tasks are specifically designed for self-supervised models, while supervised pseudo labels are constructed around the inherent properties of the data [24]. During training, through the designed objective function, the model progressively learns deep representations of unlabeled data within the sequence. Self-supervised learning is particularly well suited for industrial scenarios where labeled data are difficult to obtain. By learning generalizable and discriminative representations during pre-training, it provides an effective foundation for subsequent fine-tuning on downstream tasks.

2.2. Contrastive Learning

Contrastive learning is a typical self-supervised learning strategy whose core idea is to learn features from unlabeled data by constructing positive and negative sample pairs. Data augmentation is generally used to generate these pairs, and by comparing the feature similarity between them, the model is guided to extract latent representational information. An encoder f(\cdot) extracts features from the samples and is trained to maximize the similarity between positive pairs while minimizing the similarity between negative pairs. The desired relationship between the two similarities is as follows:

\mathrm{sim}(f(x), f(x^{+})) \gg \mathrm{sim}(f(x), f(x^{-}))

(1)

where x^{+} represents the positive sample, x^{-} represents the negative sample, and \mathrm{sim}(\cdot, \cdot) is a similarity metric between two samples. For two augmented samples x_{m} and x_{n}, it is computed as their normalized dot product (cosine similarity):

\mathrm{sim}(x_{m}, x_{n}) = \frac{x_{m}^{T} \cdot x_{n}}{\|x_{m}\| \cdot \|x_{n}\|}

(2)

The loss function is calculated as follows:

L_{cl} = - \frac{1}{N} \sum_{i = 1}^{N} \log \frac{\exp(\mathrm{sim}(z_{i}, z_{i}^{+}) / τ)}{\exp(\mathrm{sim}(z_{i}, z_{i}^{+}) / τ) + \sum_{j = 1}^{K} \exp(\mathrm{sim}(z_{i}, z_{i,j}^{-}) / τ)}

(3)

where N represents the number of samples in a mini-batch, z_{i} represents the i-th sample, z_{i}^{+} and z_{i,j}^{-} represent its positive sample and its j-th negative sample, respectively, K is the number of negative samples, and τ is the temperature coefficient, which is used to adjust the similarity distribution.

In this study, we employ self-supervised contrastive learning to extract deeper feature representations from MFL signals in MWR, thereby enhancing damage identification capabilities.

2.3. Combined Data Augmentation Strategy

In contrastive learning, data augmentation plays a key role. By constructing positive and negative sample pairs, contrastive learning maximizes feature similarity between positive pairs while minimizing similarity between negative pairs. Data augmentation not only enhances the generalization capability of feature representations but also mitigates overfitting issues arising from limited sample sizes [25,26]. Therefore, designing a reasonable and effective data augmentation strategy is particularly crucial in contrastive learning. Typically, contrastive learning uses the same data augmentation method to generate two enhanced samples for each sample signal. To address the non-stationary and noisy characteristics of MFL signals from MWR, this paper designs a combined data augmentation strategy. It utilizes four signal transformation methods to generate two distinct augmented samples as positive pairs, as shown in Figure 1. Given an input signal x, an augmented sample x_{c} is generated by adding noise and translation, and an augmented sample x_{d} is generated by jitter and scaling. The augmented samples x_{c} and x_{d} from the same input constitute a positive sample pair, whereas augmented samples x_{c} and x_{d} from different inputs constitute negative sample pairs. For the sample signal x = [x_{1}, x_{2}, \dots, x_{n}], the data augmentation operations are as follows:

(1) Adding noise: Random Gaussian noise G is added to the original signal x.

\tilde{x} = x + G, \quad G \sim N(0, σ_{n})

(4)

where σ_{n} denotes the noise coefficient.

(2) Translation: The original signal x is randomly shifted forward or backward. To keep the signal length unchanged, the elements that overflow at one end fill the vacated positions at the other end (a circular shift), with a randomly selected position i as the new starting point of the signal.

\tilde{x} = \mathrm{concat}(x[i : end], x[0 : i - 1])

(5)

(3) Jitter: A disturbance term ε is added to the original signal x; it follows a normal distribution with mean μ and standard deviation σ.

\tilde{x} = x + ε, \quad ε \sim N(μ, σ^{2})

(6)

(4) Scaling: The original signal x is scaled by a factor s.

\tilde{x} = s \cdot x, \quad s \sim N(1, σ_{s})

(7)

where σ_{s} denotes the scale factor.

The combination of multiple data augmentation strategies not only effectively promotes the contrastive learning of features and enhances the robustness of the model in noisy environments, but also strengthens the ability of the feature extractor to uncover potentially useful features from unlabeled signals.
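The four operations in Eqs. (4)-(7) and their pairing into the two views x_c and x_d can be illustrated with a minimal sketch in plain Python. The function names, the `rng` argument, and the default values `sigma_n=0.05` and `sigma_s=0.1` are assumptions made for illustration only; the zero-mean jitter with standard deviation 0.07 follows the setting reported later in the paper, and σ_n and σ_s are treated here as standard deviations.

```python
import random


def add_noise(x, sigma_n, rng):
    """Eq. (4): add zero-mean Gaussian noise (sigma_n treated as the std, an assumption)."""
    return [v + rng.gauss(0.0, sigma_n) for v in x]


def translate(x, rng):
    """Eq. (5): circular shift that restarts the signal at a random index i."""
    i = rng.randrange(len(x))
    return x[i:] + x[:i]


def jitter(x, mu, sigma, rng):
    """Eq. (6): add a perturbation drawn from N(mu, sigma^2) to every point."""
    return [v + rng.gauss(mu, sigma) for v in x]


def scale(x, sigma_s, rng):
    """Eq. (7): multiply the whole signal by one factor s drawn around 1."""
    s = rng.gauss(1.0, sigma_s)
    return [s * v for v in x]


def make_positive_pair(x, rng, sigma_n=0.05, sigma=0.07, sigma_s=0.1):
    """Build the two views of one signal: x_c (noise + translation), x_d (jitter + scaling).

    sigma_n and sigma_s are hypothetical defaults, not the paper's Table 3 values.
    """
    x_c = translate(add_noise(x, sigma_n, rng), rng)
    x_d = scale(jitter(x, 0.0, sigma, rng), sigma_s, rng)
    return x_c, x_d
```

Calling `make_positive_pair(signal, random.Random(0))` on each signal in a mini-batch yields one positive pair per signal; views that originate from different signals then act as negatives.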

2.4. Informer Encoder

Informer [27] is a time-series prediction model based on the Transformer architecture, as shown in Figure 2. Compared to the Transformer, it is more effective in capturing long-term dependencies and short-term changes in time-series data. The principal advantage of the model lies in its structural innovations, which substantially mitigate the computational burden and memory requirements that conventional Transformers typically encounter when handling long-sequence modeling tasks. The Informer encoder is the central module of the model, tasked with extracting features and learning representations from the input sequence. It mainly integrates a ProbSparse self-attention mechanism together with self-attention distillation.

2.4.1. ProbSparse Self-Attention

Traditional self-attention mechanisms compute attention weights using the query, key, and value matrices Q, K, V as follows:

A(Q, K, V) = \mathrm{softmax}(\frac{Q K^{T}}{\sqrt{d}}) V

(8)

where d represents the input dimension. Let q_{i}, k_{i}, v_{i} represent the i-th rows of Q, K, V, respectively; then, the attention of the i-th query vector can be written in probabilistic form as follows:

A(q_{i}, K, V) = \sum_{j} \frac{k(q_{i}, k_{j})}{\sum_{l} k(q_{i}, k_{l})} v_{j} = E_{p(k_{j} | q_{i})}[v_{j}]

(9)

where k(q_{i}, k_{j}) is the similarity function and p(k_{j} | q_{i}) represents the attention probability distribution.

Informer uses the KL divergence to assess the sparsity of a query vector:

KL(q \| p) = \ln \sum_{j = 1}^{L_{K}} e^{\frac{q_{i} k_{j}^{T}}{\sqrt{d}}} - \frac{1}{L_{K}} \sum_{j = 1}^{L_{K}} \frac{q_{i} k_{j}^{T}}{\sqrt{d}} - \ln L_{K}

(10)

where L_{K} is the number of key vectors. After removing the constant term, the sparsity measurement of the i-th query vector is as follows:

M(q_{i}, K) = \ln \sum_{j = 1}^{L_{K}} e^{\frac{q_{i} k_{j}^{T}}{\sqrt{d}}} - \frac{1}{L_{K}} \sum_{j = 1}^{L_{K}} \frac{q_{i} k_{j}^{T}}{\sqrt{d}}

(11)

To accelerate the computation, M(q_{i}, K) can be approximated as follows:

\bar{M}(q_{i}, K) = \max_{j} \{\frac{q_{i} k_{j}^{T}}{\sqrt{d}}\} - \frac{1}{L_{K}} \sum_{j = 1}^{L_{K}} \frac{q_{i} k_{j}^{T}}{\sqrt{d}}

(12)

This approach only requires finding the maximum dot-product value for each q_{i}, drastically reducing the computational load. Finally, self-attention is computed using only the selected sparse query vectors:

A(Q, K, V) = \mathrm{softmax}(\frac{\bar{Q} K^{T}}{\sqrt{d}}) V

(13)

where \bar{Q} is the sparse query matrix that contains only the query vectors with the highest sparsity measurement.
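As an illustration of Eqs. (11) and (12), the following sketch in plain Python computes the exact and approximate sparsity measurements and keeps the indices of the u highest-scoring queries. The function names and the parameter `u` are hypothetical, and the sketch scores every query against every key, so it omits the key sampling that the original Informer uses for further speed-up.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def scaled_scores(q, keys):
    """Scaled dot products q k_j^T / sqrt(d) for all keys."""
    d = len(q)
    return [dot(q, k) / math.sqrt(d) for k in keys]


def sparsity_exact(q, keys):
    """Eq. (11): log-sum-exp of the scaled scores minus their arithmetic mean."""
    scores = scaled_scores(q, keys)
    m = max(scores)  # subtract the maximum for numerical stability
    lse = m + math.log(sum(math.exp(s - m) for s in scores))
    return lse - sum(scores) / len(scores)


def sparsity_approx(q, keys):
    """Eq. (12): the log-sum-exp term is replaced by the maximum score."""
    scores = scaled_scores(q, keys)
    return max(scores) - sum(scores) / len(scores)


def top_u_queries(queries, keys, u):
    """Indices of the u queries with the largest approximate sparsity measurement."""
    ranked = sorted(range(len(queries)),
                    key=lambda i: sparsity_approx(queries[i], keys),
                    reverse=True)
    return sorted(ranked[:u])
```

A query whose scores are nearly uniform over the keys receives an approximate measurement close to zero, whereas a query dominated by a few keys receives a large one and is retained in the sparse query matrix.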

2.4.2. Self-Attention Distilling

Although ProbSparse self-attention significantly reduces computational complexity, the output features may still contain a large amount of redundant information, which can affect the model’s efficiency and generalization capabilities. To further extract features and reduce computational complexity, Informer employs a self-attention distillation mechanism. By refining and compressing features layer by layer, it effectively extracts dominant information and enhances the sparsity and representativeness of the feature representation. The distillation operation from layer j to layer j + 1 is as follows:

X_{j + 1}^{t} = \mathrm{MaxPool}(\mathrm{ELU}(\mathrm{Conv1d}([X_{j}^{t}]_{AB})))

(14)

where \mathrm{Conv1d} represents a one-dimensional convolution along the time dimension, \mathrm{ELU} is the activation function applied to the convolution output, \mathrm{MaxPool} is the max-pooling operation, and [\cdot]_{AB} represents the attention block.
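The effect of Eq. (14) on the sequence length can be seen in a single-channel sketch in plain Python. The kernel size of 3 with zero padding and the pooling window of 3 with stride 2 and padding 1 are assumptions chosen for illustration, as are the function names; the actual model operates on multi-channel feature maps.

```python
import math


def conv1d_same(x, kernel):
    """1-D convolution with zero padding (odd kernel length); output length equals input length."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(w * padded[i + j] for j, w in enumerate(kernel))
            for i in range(len(x))]


def elu(x, alpha=1.0):
    """ELU activation: identity for positive inputs, alpha * (exp(v) - 1) otherwise."""
    return [v if v > 0 else alpha * (math.exp(v) - 1.0) for v in x]


def max_pool(x, size=3, stride=2, pad=1):
    """Max pooling; with stride 2 the sequence length is roughly halved."""
    padded = [-math.inf] * pad + list(x) + [-math.inf] * pad
    return [max(padded[i:i + size])
            for i in range(0, len(padded) - size + 1, stride)]


def distill(x, kernel):
    """Eq. (14) for one channel: MaxPool(ELU(Conv1d(x)))."""
    return max_pool(elu(conv1d_same(x, kernel)))
```

Under these assumptions, a 1024-point feature sequence is reduced to 512 points after one distilling step, which is what lowers the cost of the following attention layer.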

2.5. The MS-Informer Model Structure

Based on the characteristics of MFL signals in MWR, this paper designs a feature extraction model based on multi-scale convolution and the Informer encoder, as shown in Figure 3. This structure can fully exploit potential features with limited labeled samples, thereby improving the accuracy and robustness of quantitative damage identification for MWR. The overall processing flow is as follows:

First, the MFL signals are input into a multi-scale convolutional module to extract local and medium-to-long-range features across different receptive fields. This module comprises four parallel branches, which contain a 1 × 1 convolution, a 3 × 1 convolution, a 3 × 1 depthwise separable (DS) convolution, and a 5 × 1 DS convolution, respectively, to capture feature patterns at different scales, as shown in Figure 4. The outputs from the four branches are concatenated and then fused through a 1 × 1 convolution, expressed as follows:

X_{MSC} = \mathrm{Conv}_{1 \times 1}(\mathrm{Concat}[F_{1 \times 1}, F_{3 \times 1}, F_{d = 3}, F_{d = 5}])

(15)

where F_{1 \times 1}, F_{3 \times 1}, F_{d = 3}, and F_{d = 5} denote the outputs of the four branches, \mathrm{Concat}(\cdot) represents feature concatenation, and \mathrm{Conv}_{1 \times 1}(\cdot) represents 1 \times 1 convolution fusion.

The fused feature sequences are subsequently fed into a two-layer stacked Informer encoder. This module maintains global attention modeling capabilities while effectively reducing computational complexity through the ProbSparse self-attention mechanism. It captures long-term dependencies and global structural features within wire rope signals, thereby enhancing its ability to identify subtle damage and complex interference.

During pre-training, a feature extractor consisting of the multi-scale convolution module and the Informer encoder is connected to a projection head to map features into a contrastive learning space. By applying the combined data augmentation operations to the original signal, different views of the same signal are generated as positive sample pairs, while negative samples are drawn from the other samples. The model is trained using the NT-Xent loss, defined as follows:

L_{CL} = - \sum_{i = 1}^{N} \log \frac{\exp(\mathrm{sim}(r_{i}, r_{i}^{+}) / τ)}{\sum_{j = 1}^{2N} 1_{[j \neq i]} \exp(\mathrm{sim}(r_{i}, r_{j}) / τ)}

(16)

where r_{i} and r_{i}^{+} represent the feature representations of a positive sample pair output by the encoder and projection head, 1_{[j \neq i]} is an indicator function that equals 1 when j \neq i and 0 otherwise, \mathrm{sim}(\cdot) represents the cosine similarity, τ is the temperature coefficient, and N is the batch size.
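A minimal sketch of the NT-Xent loss in plain Python is given below to make the roles of the 2N views and the indicator term concrete. It averages over all 2N views, as in the common SimCLR formulation, which is an assumption: Eq. (16) as written sums over i = 1, ..., N without averaging. The temperature value `tau=0.5` and the function names are hypothetical, since the paper does not state the temperature used.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    num = sum(x * y for x, y in zip(a, b))
    return num / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def nt_xent(views_c, views_d, tau=0.5):
    """NT-Xent loss over N positive pairs (views_c[i], views_d[i]).

    tau is a hypothetical temperature; the result is averaged over all 2N views.
    """
    n = len(views_c)
    reps = list(views_c) + list(views_d)  # 2N projected representations
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the other view of the same signal
        # Indicator 1[j != i]: the anchor itself is excluded from the denominator.
        logits = [cosine(reps[i], reps[j]) / tau for j in range(2 * n) if j != i]
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(v - m) for v in logits))
        total += log_denominator - cosine(reps[i], reps[pos]) / tau
    return total / (2 * n)
```

The loss decreases when the two views of the same signal are closer to each other than to the views of the other signals in the mini-batch, which is the behaviour the pre-training stage relies on.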

During the fine-tuning phase, the pre-trained encoder parameters are transferred and retained for the downstream MWR damage identification task. The encoder output is connected to a linear classifier to produce specific damage classes. The cross-entropy loss function is used to quantify the classification error:

L_{CE} = - \sum_{c = 1}^{N_{c}} y_{c} \log p_{c}

(17)

where N_{c} denotes the number of classes, y_{c} is the true (one-hot) label indicating whether the sample belongs to class c, and p_{c} denotes the probability of class c predicted by the model.

Through the aforementioned process, the model achieves effective fusion of local convolutional features with global attention features. Combined with contrastive learning, this enhances the discriminative power of the feature space, enabling it to maintain superior identification performance and robustness even under conditions of limited labeled samples. The detailed layer configuration of the MS-Informer structure is shown in Table 1.

3. Experiments

3.1. Experimental Platform

The experimental platform for non-destructive testing of MWR was built at the Key Laboratory of Coal Mine Intelligence and Robot Innovative Application of the Ministry of Emergency Management at China University of Mining and Technology (Beijing). This platform comprises an experimental bench, electric motor, power supply unit, transmission device, signal acquisition device, flaw detector, and MWR samples. An NdFeB permanent magnet of type N48 was selected. The flaw detector’s housing and armature were constructed from industrial pure iron DT4 with high magnetic susceptibility to fully excite the MWR. Furthermore, the magnetic sensor was a THS119 Hall effect device, and the collected signal was amplified by an amplifier circuit containing an AD620 chip. Data acquisition was performed using an Altech USB5633 acquisition card. The overall structure is shown in Figure 5, and the common types of damage in MWR are shown in Figure 6. During MFL signal collection, the sampling frequency was set to 2000 Hz.

During actual operation, MWR damage can be caused by multiple factors and is commonly classified into two categories: Localized Flaw (LF) and Loss of Metallic Area (LMA) [16]. LF-type damage corresponds to localized defects occurring over a short rope length, such as broken wires, internal wire rupture and local wear, which lead to abrupt reductions in the effective cross-sectional area. In contrast, LMA-type damage is characterized by a gradual loss of metallic area along the axial direction of the rope, typically associated with uniform wear or corrosion over extended lengths. In engineering practice, the number of broken wires within a specified rope length is commonly used as a quantitative indicator for damage assessment and scrapping criteria. Accordingly, this study focuses on LF-type damage associated with broken wire defects. The constructed datasets include damage cases with different numbers of broken wires and varying broken wire lengths, which represent typical LF damage scenarios suitable for MFL inspection. The proposed MS-Informer framework is therefore developed and validated for the identification of localized broken wire damage in MWR. Other damage types, including LMA-related defects caused by uniform corrosion, plastic deformation or fatigue, are not explicitly considered in this study and remain outside the current scope.

3.2. Experimental Datasets and Implementation Details

The MWR model selected in this paper is the 6 × 19S + FC type with a diameter of 26 mm. All experiments were conducted under controlled laboratory conditions in order to ensure measurement consistency and repeatability. To comprehensively evaluate the performance of the proposed method in identifying MWR damage, three distinct datasets were constructed: Dataset A, Dataset B, and Dataset C.

In Dataset A, the length of broken wires remains constant, while the number of broken wires starts at 3 and increases by 2 each time, to validate the model’s identification capability under varying numbers of broken wires.

In Dataset B, the number of broken wires is fixed, and the length of broken wires starts at 3 mm, increasing by 2 mm each time, thereby further validating the effectiveness and generalization capability of the proposed method under different damage lengths.

Considering the complex forms of MWR damage in real-world mine environments, this paper also constructed a mixed dataset C. This dataset contains not only normal samples but also two typical damage scenarios: one with a fixed broken wire length and the number of broken wires set to 3, 7, and 11, respectively; and another with a fixed number of broken wires and the lengths set to 5 mm, 9 mm, and 13 mm, respectively. Details of the three different datasets are shown in Table 2.

Due to the limitations of actual industrial conditions, the collected MFL damage data generally cannot meet the needs of experimental training. This paper therefore uses the window slicing method [28] to expand the MWR damage data. Each type of MWR damage signal consists of 280,000 data points. After window slicing, each type of MWR damage signal is divided into samples of 1024 data points. The three datasets in the experiment used a total of 6400 samples from 7 different damage types, which were split into training, validation and test sets at a ratio of 60%, 20% and 20%, respectively. All measurements were performed under identical experimental settings, including sensor configuration, excitation conditions and acquisition speed, to ensure data consistency. The same acquisition procedure was repeated for different damage cases, and the window slicing process was applied uniformly across all datasets to improve repeatability. In the three MWR damage datasets, the encoder and classifier were fine-tuned with label ratios of 5%, 15% and 30%, respectively. The corresponding parameter settings for data augmentation are shown in Table 3.
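The window slicing and the 60%/20%/20% split described above can be sketched in plain Python as follows. The stride of 512 points, the random seed, and the function names are assumptions for illustration; the paper specifies only the window length of 1024 points and the split ratio.

```python
import random


def window_slice(signal, window=1024, stride=512):
    """Cut a long signal into fixed-length windows (stride is a hypothetical value)."""
    return [signal[s:s + window]
            for s in range(0, len(signal) - window + 1, stride)]


def split_60_20_20(samples, seed=0):
    """Shuffle the samples and split them into training, validation and test sets."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)
    n_train = len(idx) * 60 // 100
    n_val = len(idx) * 20 // 100
    train = [samples[i] for i in idx[:n_train]]
    val = [samples[i] for i in idx[n_train:n_train + n_val]]
    test = [samples[i] for i in idx[n_train + n_val:]]
    return train, val, test
```

A smaller stride produces more, increasingly overlapping windows from the same 280,000-point recording, so the stride directly controls how many samples each damage type contributes.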

The noise coefficient controls the amplitude of additive Gaussian noise introduced to simulate measurement noise and electromagnetic interference commonly observed in MFL signals. The adopted value represents a moderate noise level relative to the signal amplitude, aiming to enhance robustness while avoiding excessive distortion of damage-related features. Translation corresponds to randomly shifting the signal along the time axis to account for temporal misalignment caused by speed fluctuation or sensor position variation. The translation position is selected randomly within the signal length, and no physical unit is involved. Jitter introduces small stochastic perturbations to simulate local signal fluctuations. The jitter follows a Gaussian distribution with zero mean and a standard deviation of 0.07, which is selected to introduce slight variability without altering the overall waveform structure. The scaling factor controls amplitude scaling of the signal to emulate variations in magnetic field strength caused by changes in lift-off distance or excitation conditions. The adopted value ensures mild amplitude variation while preserving damage-sensitive patterns. These parameters were selected based on prior experience with MFL signal processing and preliminary empirical tuning to balance invariance learning and feature preservation. While no exhaustive parameter optimization was conducted, the chosen values were found to provide stable contrastive pre-training and consistent downstream performance.

The program is written in Python 3.9 and is based on the PyTorch framework (version 2.0). The hardware configuration is an Intel Core i5-12600H and an NVIDIA RTX4060 GPU. The batch size is 32, the learning rate is 0.0005, and the Adam optimizer provided by PyTorch is adopted. Additionally, to minimize randomness, all methods were tested under the same conditions.

3.3. Model Performance Metrics

When analyzing results and comparing models, evaluation metrics serve as crucial standards for measuring model performance [29]. To comprehensively validate the effectiveness of the proposed method and deliver accurate, reliable results, evaluation metrics such as accuracy, precision, recall, and F1 score have been introduced.

Accuracy = \frac{TP + TN}{TP + FP + TN + FN}

(18)

Precision = \frac{TP}{TP + FP}

(19)

Recall = \frac{TP}{TP + FN}

(20)

F1 = 2 \times \frac{Precision \times Recall}{Precision + Recall}

(21)

where TP and TN represent the number of correctly classified positive and negative samples, respectively, while FP and FN denote the number of samples incorrectly predicted as positive and as negative, respectively.
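Eqs. (18)-(21) translate directly into a short helper, sketched here in plain Python. The function name and the zero-division guards are assumptions, and the counts in the usage note are invented for illustration, not results from this study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall and F1 score from confusion counts, Eqs. (18)-(21)."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # guard against empty predictions
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # guard against no positive samples
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

For example, `classification_metrics(tp=8, fp=2, tn=9, fn=1)` gives an accuracy of 0.85 and a precision of 0.8. For a multi-class task such as damage-level identification, the counts are usually computed per class and the resulting metrics averaged.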

4. Results and Discussion

4.1. Analysis of Self-Supervised Pre-Training and Fine-Tuning

The damage identification model proposed in this paper consists of two stages: self-supervised pre-training and fine-tuning. Based on the aforementioned configuration, the training processes for these two stages are shown in Figure 7. During the pre-training phase, self-supervised learning is performed using contrastive loss, and the model achieves optimal performance within approximately 300 epochs. In the fine-tuning phase, the model converges rapidly on a small amount of labeled data and stabilizes within 100 epochs, demonstrating that the representational capabilities acquired during pre-training can be effectively transferred to downstream classification tasks.

The temperature parameter in the contrastive loss was empirically set to a fixed value across all experiments, and negative samples were implicitly constructed from other instances within the same mini-batch. The pre-training and fine-tuning processes were conducted for a predefined number of epochs under identical experimental conditions. To avoid overfitting, early stopping was applied based on the validation loss, and the model parameters corresponding to the best validation performance were retained.
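The role of the temperature and of in-batch negatives can be illustrated with an NT-Xent-style loss of the kind used in SimCLR [19]. The sketch below assumes that form; it is not taken from the authors' code, and the temperature value of 0.5 is a placeholder, since the value used in the experiments is not stated here.

```python
import math


def nt_xent(z1, z2, temperature=0.5):
    """NT-Xent-style loss; z1[k] and z2[k] embed two views of sample k.

    The temperature default is a hypothetical placeholder.
    """
    def normalize(v):
        norm = math.sqrt(sum(a * a for a in v)) or 1.0
        return [a / norm for a in v]

    z = [normalize(v) for v in z1 + z2]
    n = len(z1)
    total = 0.0
    for i in range(2 * n):
        j = (i + n) % (2 * n)  # index of the positive partner of sample i
        # Temperature-scaled cosine similarity to every sample in the batch.
        sims = [sum(a * b for a, b in zip(z[i], z[k])) / temperature
                for k in range(2 * n)]
        # Every other sample in the mini-batch enters the denominator,
        # so the non-partner samples act as implicit negatives.
        others = [s for k, s in enumerate(sims) if k != i]
        m = max(others)
        log_denominator = m + math.log(sum(math.exp(s - m) for s in others))
        total += log_denominator - sims[j]
    return total / (2 * n)


# Toy 2-D embeddings: matched views give a lower loss than mismatched ones.
aligned = nt_xent([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
mismatched = nt_xent([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

A smaller temperature sharpens the softmax over similarities and therefore weights hard negatives more heavily, which is why the value is usually tuned empirically.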

4.2. Identification Results and Comparison Analysis

To comprehensively evaluate the performance of the proposed method, several representative baseline models were selected for comparison, including supervised, semi-supervised, and self-supervised approaches. All baseline models take the same one-dimensional time-series signals as input, with an identical input dimension to that of the proposed model, and are trained using the same data preprocessing and augmentation strategy to ensure a fair comparison. These include commonly used supervised methods such as CNN and LSTM, semi-supervised methods like HCAE [30] and RelaNet [31], and the self-supervised method AFFE [25]. The CNN baseline consists of several stacked one-dimensional convolutional layers followed by pooling layers and a fully connected classifier. The LSTM model employs stacked long short-term memory layers to capture temporal dependencies in sequential data, followed by a dense classification layer. For semi-supervised learning, HCAE adopts an encoder–decoder structure to learn latent representations from both labeled and unlabeled data, with the encoder output connected to a classifier for damage identification. RelaNet is a relation-based semi-supervised model that enhances feature discrimination by modeling the relationships among samples in the latent space. The self-supervised baseline AFFE leverages an auxiliary pretext task to learn robust representations from unlabeled data, which are subsequently fine-tuned using limited labeled samples for classification. To reduce randomness, results were averaged across 10 experiments, as shown in Table 4.

The experimental results show that the traditional supervised learning methods perform poorly on all three datasets, especially when few labeled samples are available. Although their accuracy improves as the number of labeled samples increases, it remains highly dependent on the amount of labeled data. In contrast, the semi-supervised methods HCAE and RelaNet achieved higher accuracy on all three datasets, demonstrating their ability to extract more effective information from limited labeled data through similarity calculation and category expansion. The self-supervised method AFFE, which uses a two-stage training strategy combining abundant unlabeled data with limited labeled data, showed clear advantages even at low labeled ratios. At a 5% labeled ratio, AFFE surpassed LSTM by more than 19 percentage points and RelaNet by about 1.8 to 1.9 percentage points on the three datasets. At a 30% labeled ratio, AFFE maintained its advantage, exceeding LSTM by about 16 percentage points and RelaNet by at least 1.7 percentage points. This indicates that as the labeled ratio decreases, the gain of the self-supervised method over the supervised baselines becomes more pronounced, reflecting its ability to capture the feature distribution of unlabeled data.

Notably, the proposed method achieves the highest accuracy at every labeled ratio on all three datasets (Table 4), demonstrating its advantages in the quantitative identification of MWR damage. By efficiently mining latent features in unlabeled data, this approach achieves accurate identification even when labeled data are limited, and its accuracy varies little across datasets. Overall, the proposed method performs well in experimental validation and has practical value for the engineering demands of MWR damage identification.

To visualize the performance of MS-Informer under different labeled ratios, the macro-averaged F1 (MF1) scores on the three datasets are shown in Figure 8. Although all methods degrade as the labeled ratio decreases, the proposed method is the most stable. This indicates that the proposed model is more robust and discriminative when labeled data are scarce, suggesting a more effective learned feature space. Overall, the proposed model achieves the best accuracy and MF1 score at every labeled ratio.

Figure 9 shows the F1 scores for various damage states in the mixed dataset C under the 15% labeled ratio task. It can be concluded that MS-Informer achieves the highest F1 scores in most types, indicating its ability to effectively extract potential prior information from unlabeled signals, thereby enabling more accurate damage identification. Compared to the advanced self-supervised contrastive learning method AFFE, MS-Informer demonstrates superior feature representation and classification performance, further highlighting its potential for application in MWR damage identification.

The confusion matrices for the three datasets under the 15% labeled ratio are shown in Figure 10. The confusion matrices clearly reflect the identification accuracy of each category, and MS-Informer achieves excellent identification results for various types and degrees of damage. These results also demonstrate that the designed data augmentation strategy for generating positive and negative sample pairs effectively improves the feature extraction capabilities of MS-Informer, enabling it to maintain high identification accuracy while exhibiting robust performance, thereby meeting the application requirements of practical industrial scenarios.

To further illustrate the performance of the proposed method more intuitively, t-distributed stochastic neighbor embedding (t-SNE) [32] was employed to visualize samples from different categories in three distinct datasets under a 15% labeled ratio. Figure 11 demonstrates that the proposed method effectively embeds features into the corresponding feature space and clearly distinguishes samples across different categories, exhibiting excellent discriminative ability. This indicates that the method reduces intra-class distances while expanding inter-class distances without label assistance, thereby extracting discriminative features. During the pre-training phase, multiple augmented views of the same sample are effectively learned to capture the invariant information between similar samples. After fine-tuning, samples from different categories are further separated, while similar samples are more closely clustered with minimal label assistance. Therefore, the proposed method can achieve self-supervised feature learning on large-scale unlabeled data and provide strong support for downstream tasks with minimal prior information.

4.3. Ablation Experiment

To further illustrate the effectiveness of the key modules in MS-Informer, ablation experiments were conducted under a 15% labeled ratio. The results are shown in Figure 12.

It can be observed that the complete MS-Informer baseline model achieved the best performance, demonstrating that the combination of multi-scale convolutions and long-term sequence modeling can effectively enhance the accuracy and robustness of quantitative damage identification in MWR.

When the Informer encoder is removed, the model relies solely on multi-scale convolution for local modeling, resulting in a significant decline in performance. This indicates that global dependency features are crucial for identifying MFL time-series signals. Conversely, removing the MS module causes performance degradation as the model relies solely on attention mechanisms to capture global features, highlighting the critical role of local multi-scale feature extraction in capturing transient damage patterns. Furthermore, replacing MS with single-scale convolutions yields better performance than removing either MS or Informer, yet still falls short of the full model, demonstrating that multi-scale design offers superior feature representation capabilities compared to single convolutional kernels.

In summary, ablation experiments validate the complementary roles of multi-scale convolution and Informer encoder in quantifying MWR damage with limited labeled data: the former fully exploits local transient features in MFL signals, while the latter excels at capturing long-term dependencies and global patterns. Their integration significantly enhances damage identification performance under constrained labeling conditions.

4.4. Analysis of Combined Data Augmentation Strategy

In this section, we evaluate the impact of six combined data augmentation strategies on the feature extraction capability of the proposed method. Table 5 lists the identification accuracy of the six strategies on the three datasets. Since the contrastive learning method used in this paper relies on a combined data augmentation strategy to generate positive sample pairs, C1 to C3 apply the same augmentation combination to both views, whereas C4 to C6 apply different combinations to the two views so that all four augmentation techniques are used.

It can be observed that the average accuracy exceeds 94.5% under all data augmentation combinations, demonstrating the robustness of MS-Informer when dealing with limited labeled data. The identification performance of C1 to C3, which apply the same augmentations to both views, is consistently lower than that of C4 to C6, indicating that applying different augmentations to the two views improves the model's feature learning ability and generalization. Combining varied augmentation methods generates sample pairs that are harder to distinguish, which forces the model to capture the differences between samples more deeply. The resulting model has stronger representational capability, which helps improve performance on downstream classification tasks.
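The six strategies in Table 5 can be expressed as pairs of augmentation pipelines, one per view. The sketch below is only illustrative: the augmentation functions are simplified stand-ins whose exact forms are assumed, and `STRATEGIES` and `make_views` are hypothetical names introduced for this example.

```python
import random


# Simplified stand-ins for the four augmentations (exact forms are assumptions).
def add_noise(x, rng):
    peak = max(abs(v) for v in x) or 1.0
    return [v + rng.gauss(0.0, 0.05 * peak) for v in x]


def translate(x, rng):
    i = rng.randrange(len(x))
    return x[i:] + x[:i]


def jitter(x, rng):
    return [v + rng.gauss(0.0, 0.07) for v in x]


def scale(x, rng):
    factor = rng.gauss(1.0, 0.06)
    return [v * factor for v in x]


# (pipeline for view x_c, pipeline for view x_d), following Table 5.
STRATEGIES = {
    "C1": ((add_noise, translate), (add_noise, translate)),
    "C2": ((add_noise, jitter), (add_noise, jitter)),
    "C3": ((add_noise, scale), (add_noise, scale)),
    "C4": ((add_noise, translate), (jitter, scale)),
    "C5": ((add_noise, jitter), (translate, scale)),
    "C6": ((add_noise, scale), (translate, jitter)),
}


def make_views(x, strategy, rng):
    """Return the two augmented views (x_c, x_d) of one signal."""
    ops_c, ops_d = STRATEGIES[strategy]
    x_c, x_d = list(x), list(x)
    for op in ops_c:
        x_c = op(x_c, rng)
    for op in ops_d:
        x_d = op(x_d, rng)
    return x_c, x_d


# Hypothetical toy signal; a real input would be a 1024-point MFL segment.
rng = random.Random(0)
x_c, x_d = make_views([float(k % 7) for k in range(64)], "C4", rng)
```

In C1 to C3 both views pass through the same pipeline and differ only in their random draws, whereas in C4 to C6 the two views are produced by disjoint pairs of operations.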

5. Conclusions

This paper proposes a self-supervised contrastive learning method for MWR damage identification based on MS-Informer. The method was validated on three datasets of MWR damage with varying degrees and states of damage. The experimental results demonstrate that MS-Informer identifies MWR damage accurately with limited labeled data and across varying damage states. This method not only enhances damage identification accuracy to ensure safe mining operations but also provides a reference for subsequent MWR maintenance, thereby improving overall mining production efficiency. The conclusions of this paper are as follows.

(1): A self-supervised contrastive learning framework for MWR is constructed, which can achieve accurate damage identification under limited labeled data and different damage degrees and states.
(2): By integrating multi-scale convolution and Informer within a contrastive learning framework, the proposed model learns robust local and global feature representations, which leads to improved damage state identification performance in terms of classification accuracy and related statistical metrics.
(3): The acquired representations are conveniently transferable to diverse downstream tasks, indicating that the method has good adaptability in various monitoring and damage identification scenarios.

Nevertheless, the present study still has several limitations. The proposed framework requires relatively high computational cost during the pre-training stage, and the fine-tuning process still depends on a limited amount of labeled data. Moreover, the model is validated under fixed sensor configurations, and significant changes in sensing conditions, such as sensor spacing or lift-off distance, may require additional calibration or fine-tuning. Future work will focus on reducing training cost, improving cross-condition generalization, and extending the framework to more downstream tasks.

Author Contributions

Conceptualization, methodology, software, validation, formal analysis, data curation, writing—original draft preparation, writing—review and editing, visualization, C.Z.; project administration, funding acquisition, J.T.; supervision, resources, J.T. and H.W.; investigation, H.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Key Research and Development Program of China (Grant No. 2024YFB4711004).

Data Availability Statement

All data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Wang, H.; Li, Q.; Han, S.; Li, P.; Tian, J.; Zhang, S. Wire Rope Damage Detection Signal Processing Using K-Singular Value Decomposition and Optimized Double-Tree Complex Wavelet Transform. IEEE Trans. Instrum. Meas. 2022, 71, 1–12.
2. Guo, J.; Yang, C. Highly Stabilized Phase-Shifted Fiber Bragg Grating Sensing System for Ultrasonic Detection. IEEE Photon. Technol. Lett. 2015, 27, 848–851.
3. Garcés, G.; Máthis, K.; Medina, J.; Horváth, K.; Drozdenko, D.; Oñorbe, E.; Dobroň, P.; Pérez, P.; Klaus, M.; Adeva, P. Combination of In-Situ Diffraction Experiments and Acoustic Emission Testing to Understand the Compression Behavior of Mg-Y-Zn Alloys Containing LPSO Phase under Different Loading Conditions. Int. J. Plast. 2018, 106, 107–128.
4. Liu, F.; Liu, S.; Zhang, Q.; Li, Z.; Qiu, H. Quantitative Non-Destructive Evaluation of Drilling Defects in SiCf/SiC Composites Using Low-Energy X-Ray Imaging Technique. NDT E Int. 2020, 116, 102364.
5. Shibata, T.; Hashizume, H.; Kitajima, S.; Ogura, K. Experimental Study on NDT Method Using Electromagnetic Waves. J. Mater. Process. Technol. 2005, 161, 348–352.
6. Shi, P.; Zhang, P.; Hao, S.; Wang, W.; Gou, X. Classification and Evaluation for Nearside/Backside Defect via Magnetic Flux Leakage: A Dual Probe Design with SVM and PSO Intelligence Algorithms. NDT E Int. 2024, 144, 103100.
7. Huang, R.; Lu, M.; He, X.; Peyton, A.; Yin, W. Measuring Coaxial Hole Size of Finite-Size Metallic Disk Based on a Dual-Constraint Integration Feature Using Multifrequency Eddy Current Testing. IEEE Trans. Instrum. Meas. 2021, 70, 1–7.
8. Xu, K.; Yang, K.; Liu, J.; Wang, Y. Study on Metal Magnetic Memory Signal of Buried Defect in Fracture Process. J. Magn. Magn. Mater. 2020, 498, 166139.
9. Neslušan, M.; Bahleda, F.; Minárik, P.; Zgútová, K.; Jambor, M. Non-Destructive Monitoring of Corrosion Extent in Steel Rope Wires via Barkhausen Noise Emission. J. Magn. Magn. Mater. 2019, 484, 179–187.
10. Zhang, D.; Zhao, M.; Zhou, Z.; Pan, S. Characterization of Wire Rope Defects with Gray Level Co-Occurrence Matrix of Magnetic Flux Leakage Images. J. Nondestruct. Eval. 2013, 32, 37–43.
11. Kim, J.-W.; Tola, K.D.; Tran, D.Q.; Park, S. MFL-Based Local Damage Diagnosis and SVM-Based Damage Type Classification for Wire Rope NDE. Materials 2019, 12, 2894.
12. Zhang, Y.; Feng, Z.; Shi, S.; Dong, Z.; Zhao, L.; Jing, L.; Tan, J. A Quantitative Identification Method Based on CWT and CNN for External and Inner Broken Wires of Steel Wire Ropes. Heliyon 2022, 8, e11623.
13. Zhang, Y.; Han, J.; Jing, L.; Wang, C.; Zhao, L. Intelligent Fault Diagnosis of Broken Wires for Steel Wire Ropes Based on Generative Adversarial Nets. Appl. Sci. 2022, 12, 11552.
14. Liu, S.; Hua, X.; Liu, Y.; Shan, L.; Wang, D.; Wang, Q.; Sun, Y. Accurate Wire Rope Defect MFL Detection Using Improved Hilbert Transform and LSTM Neural Network. Nondestruct. Test. Eval. 2024, 40, 1379–1408.
15. Tian, J.; Zhao, C.; Wang, H. Damage Identification for Mining Wire Rope Based on Continuous Wavelet Transform and Convolutional Neural Network. Nondestruct. Test. Eval. 2025, 40, 2598–2620.
16. Zhao, C.; Tian, J.; Wang, H.; Shi, Z.; Wang, X.; Huang, J.; Tang, L. An End-to-End Quantitative Identification Method for Mining Wire Rope Damage Based on Time Series Classification and Deep Learning. J. Nondestruct. Eval. 2025, 44, 25.
17. Luo, S.; Zhang, D.; Wu, J.; Wang, Y.; Zhou, Q.; Hu, J. A Limited Annotated Sample Fault Diagnosis Algorithm Based on Nonlinear Coupling Self-Attention Mechanism. Eng. Fail. Anal. 2025, 174, 109474.
18. Ozyurt, Y.; Feuerriegel, S.; Zhang, C. Contrastive Learning for Unsupervised Domain Adaptation of Time Series. arXiv 2023, arXiv:2206.06243.
19. Chen, T.; Kornblith, S.; Norouzi, M.; Hinton, G. A Simple Framework for Contrastive Learning of Visual Representations. In Proceedings of the 37th International Conference on Machine Learning, Virtual, 13–18 July 2020.
20. Ding, Y.; Zhuang, J.; Ding, P.; Jia, M. Self-Supervised Pretraining via Contrast Learning for Intelligent Incipient Fault Detection of Bearings. Reliab. Eng. Syst. Saf. 2022, 218, 108126.
21. Zhang, J.; Zou, J.; Su, Z.; Tang, J.; Kang, Y.; Xu, H.; Liu, Z.; Fan, S. A Class-Aware Supervised Contrastive Learning Framework for Imbalanced Fault Diagnosis. Knowl.-Based Syst. 2022, 252, 109437.
22. Zhu, Y.; Xie, B.; Wang, A.; Qian, Z. Fault Diagnosis of Wind Turbine Gearbox under Limited Labeled Data through Temporal Predictive and Similarity Contrast Learning Embedded with Self-Attention Mechanism. Expert Syst. Appl. 2024, 245, 123080.
23. Sha, L.; Li, J.; Wang, M.; Yu, S.; Qiao, S. A Contrastive Generative Network with Feature-Attribute Consistency for Zero-Shot Fault Diagnosis in Process Industries. J. Process Control 2025, 154, 103529.
24. Wang, A.; Qian, Z.; Pei, Y.; Jing, B. A De-Ambiguous Condition Monitoring Scheme for Wind Turbines Using Least Squares Generative Adversarial Networks. Renew. Energy 2022, 185, 267–279.
25. Peng, T.; Shen, C.; Sun, S.; Wang, D. Fault Feature Extractor Based on Bootstrap Your Own Latent and Data Augmentation Algorithm for Unlabeled Vibration Signals. IEEE Trans. Ind. Electron. 2022, 69, 9547–9555.
26. Wang, H.; Liu, Z.; Ge, Y.; Peng, D. Self-Supervised Signal Representation Learning for Machinery Fault Diagnosis under Limited Annotation Data. Knowl.-Based Syst. 2022, 239, 107978.
27. Zhou, H.; Zhang, S.; Peng, J.; Zhang, S.; Li, J.; Xiong, H.; Zhang, W. Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtual, 19–21 May 2021.
28. Guennec, A.L.; Malinowski, S.; Tavenard, R. Data Augmentation for Time Series Classification Using Convolutional Neural Networks. In Proceedings of the ECML/PKDD Workshop on Advanced Analytics and Learning on Temporal Data, Riva del Garda, Italy, 19–23 September 2016.
29. Xia, Y.; Zhang, D.; Liu, C.; Cao, Z.; Su, Y.; Chen, Y. Fault Diagnosis of Air Handling Units Based on an MCNN-Transformer Ensemble Learning. J. Process Control 2025, 154, 103526.
30. Wu, X.; Zhang, Y.; Cheng, C.; Peng, Z. A Hybrid Classification Autoencoder for Semi-Supervised Fault Diagnosis in Rotating Machinery. Mech. Syst. Signal Process. 2021, 149, 107327.
31. Ruan, H.; Wang, Y.; Li, X.; Qin, Y.; Tang, B.; Wanga, P. A Relation-Based Semi-Supervised Method for Gearbox Fault Diagnosis with Limited Labeled Samples. IEEE Trans. Instrum. Meas. 2021, 70, 3510013.
32. van der Maaten, L.; Hinton, G. Visualizing Data Using T-SNE. J. Mach. Learn. Res. 2008, 9, 2579–2605.

Figure 1. Data augmentation for MFL signals.

Figure 2. Basic Structure of the Informer.

Figure 3. The structure of the proposed MS-Informer model.

Figure 4. The Structure of Multi-Scale Convolution.

Figure 5. Damage detection equipment for MWR.

Figure 6. MWR with different damage lengths.

Figure 7. Training process of proposed model: (a) self-supervised pre-training process; (b) fine-tuning process.

Figure 8. Comparative results of MF1 score under different labeled ratios in three datasets. (a) Dataset A; (b) Dataset B; and (c) Dataset C.

Figure 9. Comparative results of MF1 score for different damage types.

Figure 10. Confusion matrix of proposed model in three datasets. (a) Dataset A; (b) Dataset B; and (c) Dataset C.

Figure 11. t-SNE visualization of classification results in three datasets. (a) Dataset A; (b) Dataset B; and (c) Dataset C.

Figure 12. Ablation experiment results for key modules.

Table 1. Detailed layer configuration of MS-Informer structure.

Module	Operation	Output
Input	-	1 $\times$ 1024
Multi-scale Convolution	1 $\times$ 1 Conv	16 $\times$ 1024
	3 $\times$ 1 Conv	16 $\times$ 1024
	3 $\times$ 1 DSConv	16 $\times$ 1024
	5 $\times$ 1 DSConv	16 $\times$ 1024
	Concat	64 $\times$ 1024
Encoder	Informer Encoder 1	64 $\times$ 1024
	Informer Encoder 2	64 $\times$ 1024
Downsampling	AvgPool	64 $\times$ 512
Classifier	GAP	64 $\times$ 1
	Linear	32
	Linear	n
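The output sizes in Table 1 can be checked with a short shape trace. The sketch below assumes stride 1 with "same" padding in every convolution branch and an average pooling layer with kernel size 2 and stride 2; these settings are not stated in the table and are chosen only because they reproduce the listed shapes. The helper names are hypothetical.

```python
def conv1d_out_len(length, kernel, stride=1, padding=0):
    """Output length of a 1-D convolution or pooling layer."""
    return (length + 2 * padding - kernel) // stride + 1


def trace_shapes(length=1024, branch_channels=16):
    """Propagate (channels, length) through the layers listed in Table 1."""
    branches = (("1x1 Conv", 1), ("3x1 Conv", 3), ("3x1 DSConv", 3), ("5x1 DSConv", 5))
    shapes = [("Input", (1, length))]
    lengths = []
    for name, k in branches:
        # 'Same' padding (assumed) keeps the sequence length unchanged.
        out = conv1d_out_len(length, k, padding=(k - 1) // 2)
        lengths.append(out)
        shapes.append((name, (branch_channels, out)))
    assert len(set(lengths)) == 1, "branches must agree in length to be concatenated"
    channels = branch_channels * len(branches)  # channel-wise concatenation
    shapes.append(("Concat", (channels, lengths[0])))
    shapes.append(("Informer Encoder 1", (channels, lengths[0])))
    shapes.append(("Informer Encoder 2", (channels, lengths[0])))
    pooled = conv1d_out_len(lengths[0], kernel=2, stride=2)  # AvgPool settings assumed
    shapes.append(("AvgPool", (channels, pooled)))
    shapes.append(("GAP", (channels, 1)))
    return shapes


for layer, shape in trace_shapes():
    print(f"{layer:<20}{shape[0]} x {shape[1]}")
```

Under these assumptions, the four 16-channel branches concatenate to 64 channels at length 1024, and the pooling layer halves the length to 512, matching the table.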

Table 2. Details of three different datasets.

Dataset A		Dataset B		Dataset C
Length of Broken Wires	Number of Broken Wires	Number of Broken Wires	Length of Broken Wires	Length of Broken Wires	Number of Broken Wires
10 mm	3	19	3 mm	Normal	Normal
10 mm	5	19	5 mm	10 mm	3
10 mm	7	19	7 mm	10 mm	7
10 mm	9	19	9 mm	10 mm	11
10 mm	11	19	11 mm	5 mm	19
10 mm	13	19	13 mm	9 mm	19
10 mm	15	19	15 mm	13 mm	19

Table 3. Data augmentation parameter settings.

Data Augmentation	Parameter	Value
Adding noise	Noise coefficient $σ_{n}$	0.05
Translation	Translation position $i$	Random
Jitter	Mean $μ$, standard deviation $σ$	0, 0.07
Scaling	Scale factor $σ_{s}$	0.06

Table 4. Comparison of identification accuracy (%) under different labeled ratios in three datasets.

Dataset	Method	5%	15%	30%	Average
Dataset A	CNN	71.15 ± 1.18	80.73 ± 1.72	83.35 ± 0.63	78.41
	LSTM	68.33 ± 0.48	75.52 ± 0.65	80.31 ± 0.94	74.72
	HCAE	80.73 ± 1.72	90.35 ± 0.72	93.33 ± 1.19	88.14
	RelaNet	85.62 ± 1.14	92.40 ± 0.65	94.82 ± 0.52	90.95
	AFFE	87.40 ± 0.47	94.27 ± 1.00	96.56 ± 0.77	92.74
	MS-Informer	92.50 ± 0.99	97.22 ± 0.86	98.36 ± 0.44	96.03
Dataset B	CNN	71.04 ± 1.54	80.52 ± 0.18	83.75 ± 1.43	78.44
	LSTM	68.13 ± 0.32	75.21 ± 0.72	80.52 ± 1.26	74.62
	HCAE	80.83 ± 1.26	90.47 ± 0.82	93.44 ± 1.36	88.25
	RelaNet	85.42 ± 0.79	92.29 ± 0.66	94.69 ± 0.63	90.80
	AFFE	87.23 ± 0.38	94.16 ± 0.18	96.41 ± 0.55	92.60
	MS-Informer	92.66 ± 0.40	97.42 ± 0.42	98.59 ± 0.34	96.22
Dataset C	CNN	71.25 ± 0.83	80.31 ± 0.54	83.62 ± 0.61	78.39
	LSTM	68.23 ± 1.26	75.62 ± 0.94	80.21 ± 0.96	74.69
	HCAE	80.62 ± 1.74	90.31 ± 1.08	93.23 ± 0.95	88.05
	RelaNet	85.52 ± 0.65	92.18 ± 0.63	94.37 ± 0.32	90.69
	AFFE	87.44 ± 0.53	94.38 ± 1.08	96.38 ± 0.87	92.63
	MS-Informer	92.40 ± 0.65	97.33 ± 0.51	98.32 ± 0.59	96.02

Table 5. Identification accuracy (%) of different combined data augmentation strategies.

No.	$x_{c}$ (View 1)	$x_{d}$ (View 2)	Dataset A	Dataset B	Dataset C	Average
C1	Adding noise + Translation	Adding noise + Translation	95.11 ± 0.46	95.07 ± 0.22	95.20 ± 0.95	95.13
C2	Adding noise + Jitter	Adding noise + Jitter	94.79 ± 0.68	94.84 ± 0.68	95.07 ± 0.28	94.90
C3	Adding noise + Scaling	Adding noise + Scaling	94.91 ± 0.39	95.13 ± 0.61	95.17 ± 0.44	95.07
C4	Adding noise + Translation	Jitter + Scaling	96.45 ± 0.48	96.88 ± 0.46	96.41 ± 0.22	96.58
C5	Adding noise + Jitter	Translation + Scaling	96.36 ± 0.23	96.75 ± 0.48	96.67 ± 0.20	96.59
C6	Adding noise + Scaling	Translation + Jitter	96.36 ± 0.26	96.41 ± 0.48	96.46 ± 0.49	96.41

