1. Introduction
Rolling bearings are critical supporting components in rotating machinery, and their performance directly affects the operating efficiency, precision, and service life of equipment. A bearing fault can cause production disruption, equipment stoppage, and even safety incidents, and it directly affects production costs and the economic benefits of enterprises [1].
Rolling bearing fault diagnosis has advanced significantly in recent years owing to the rapid development of deep learning networks and signal processing techniques. Neural network models such as convolutional neural networks, recurrent neural networks, and generative adversarial networks, as well as signal processing techniques such as empirical mode decomposition, the Fourier transform, the continuous wavelet transform, and the Gramian angular field transformation, have been used extensively [2].
Currently, bearing faults are mainly diagnosed from one-dimensional vibration signals. Deng et al. [3] presented an early fault identification technique that combines variational mode decomposition (VMD) with a one-dimensional convolutional neural network based on the Pearson correlation coefficient. Wang et al. [4] built a rolling bearing fault diagnosis model that uses an improved one-dimensional convolutional neural network with a parametric rectified linear unit and an improved variational mode decomposition based on grey wolf optimization. Sun et al. [5] presented a hybrid fault diagnosis technique called GGRU-1DCNN-AdaBN. Kumar et al. [6] proposed an intelligent fault diagnosis framework that integrates deep features with spatio-temporal modeling to capture fault characteristics; by employing a dual-condition cost-sensitive strategy that combines full-data and few-shot learning, the framework enhances adaptability and generalization. Sharma et al. [7] integrated a convolutional neural network (CNN) with a support vector machine (SVM) and implemented an adaptive cutoff strategy that lets the system determine automatically, without human intervention, when to switch from CNN-based feature extraction to SVM-based classification.
Although these techniques have made significant progress in rolling bearing fault identification, the working conditions of rolling bearings in real-world applications are frequently more complicated and changeable. Under such conditions, feature extraction from one-dimensional vibration signals remains difficult and can be strongly affected by different kinds of noise, which makes it hard to fully represent fault-related information such as the fault type, location, and severity. With advances in computer technology and image processing techniques, converting one-dimensional vibration signals into two-dimensional images and feeding them into network models for fault classification has become a research hotspot and an emerging trend in this field [8,9].
Qu et al. [10] presented a cross-condition bearing fault identification technique that combines a residual deep subdomain adaptation network (RDSAN) with the Gramian angular difference field (GADF). Xiao et al. [11] built a fault diagnosis model based on the Markov transition field and an enhanced capsule network, in which two-stage feature extraction using an enhanced selective kernel network and a capsule network is used to classify faults. Gilbert et al. [12] designed a two-dimensional convolutional neural network (2D-CNN) architecture that utilizes images generated from “Morlet1” continuous wavelets and employs group normalization (GN) to enhance the stability and generalization of the bearing diagnostic model. Khan et al. [13] proposed a VMD-CWT-ViT framework that integrates variational mode decomposition (VMD), the continuous wavelet transform (CWT), and a Vision Transformer (ViT), leveraging the complementary strengths of the three techniques to enhance feature representation and achieve accurate fault classification.
However, the aforementioned single-branch network-based fault diagnostic techniques still have certain drawbacks. For instance, they frequently fail to adequately capture the essential features of vibration signals; the extracted information may be excessively redundant, leading to insufficient information utilization; and the generalization capability of the network models remains limited.
To overcome these problems, this work proposes a dual-channel rolling bearing fault diagnosis model based on VMD-BiGRU and GADF-ResNet-CBAM. In the first channel, the VMD parameters are first optimized iteratively to obtain the best decomposition. The decomposed intrinsic mode components are then evaluated using the kurtosis criterion, and the components with the highest kurtosis values are selected for signal reconstruction, which reduces noise interference and redundant features; the reconstructed signal is fed into the BiGRU network for temporal feature extraction. In the second channel, the original vibration signals are encoded into two-dimensional images using GADF, and a ResNet combined with the convolutional block attention module (CBAM) extracts spatial features from these images. The CBAM strengthens the network’s focus on important fault information and further improves feature representation. A feature fusion layer and a fully connected layer then combine the two channels, and a Softmax classifier identifies the fault category. By combining the temporal features of fault signals with the spatial features of two-dimensional images, the proposed approach analyzes the data from complementary angles, which yields more thorough fault feature extraction and higher diagnostic accuracy and reliability. Experiments indicate that the proposed method is effective and practical.
2. Basic Theory
2.1. VMD
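Variational mode decomposition (VMD) is an adaptive signal processing technique. By iteratively searching for the optimal solution of a variational model, it determines the center frequency and bandwidth of each component, and it constitutes a fully non-recursive model [14]. The VMD algorithm consists of two parts: the construction of the constrained variational objective and the iterative computation process. The constrained variational objective constructed in VMD is given as follows:

$$\min_{\{u_k\},\{\omega_k\}} \left\{ \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t), \tag{1}$$

where $\{u_k\}$ and $\{\omega_k\}$ denote the sets of intrinsic mode function (IMF) components and their center frequencies, respectively; $f(t)$ is the input signal; $\partial_t$ denotes the gradient operator; $\delta(t)$ represents the Dirac distribution; and $*$ denotes the convolution operation.

To obtain the optimal solution to the above variational problem, the Lagrange multiplier $\lambda$ and the quadratic penalty factor $\alpha$ were introduced, and the original objective function was transformed into:

$$L(\{u_k\},\{\omega_k\},\lambda) = \alpha \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 + \left\| f(t) - \sum_{k=1}^{K} u_k(t) \right\|_2^2 + \left\langle \lambda(t),\; f(t) - \sum_{k=1}^{K} u_k(t) \right\rangle. \tag{2}$$

The variables $u_k$, $\omega_k$, and $\lambda$ were iteratively updated using the alternating direction method of multipliers (ADMM) until the convergence criterion for the optimal solution was satisfied, whereupon the iteration process was terminated.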
To achieve the most informative decomposition, the mode number K and the quadratic penalty factor were determined by a grid search that minimizes the average envelope entropy of the resulting IMFs. Envelope entropy reflects the sparsity and periodicity of a signal; a low value indicates prominent fault-related transients. The search range for K was set to (step 1) and for to (step 500). The noise tolerance was fixed at 0, and the convergence tolerance of the ADMM solver was set to . The combination , yielded the lowest average envelope entropy and was therefore adopted. These optimized values are used throughout the remainder of the study.
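The envelope-entropy criterion used for this grid search can be sketched as follows. This is a minimal illustration, not the authors' implementation: the envelope is taken as the magnitude of the analytic signal (computed here with a naive DFT so the block has no external dependencies), the entropy is the Shannon entropy of the normalized envelope, and `decompose` is a hypothetical placeholder for whatever VMD routine is available, assumed to return a list of IMFs for a given `K` and penalty factor.

```python
import cmath
import math


def hilbert_envelope(x):
    """Envelope |x + j*H{x}| via a naive O(n^2) DFT (adequate for short demos)."""
    n = len(x)
    spectrum = [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    # One-sided weights: keep DC (and Nyquist when n is even), double the
    # positive frequencies, and zero the negative ones.
    weights = [0.0] * n
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
    for k in range(1, (n + 1) // 2):
        weights[k] = 2.0
    analytic = [
        sum(weights[k] * spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
            for k in range(n)) / n
        for t in range(n)
    ]
    return [abs(z) for z in analytic]


def envelope_entropy(x):
    """Shannon entropy of the envelope normalized to a probability distribution."""
    env = hilbert_envelope(x)
    total = sum(env)
    probs = [e / total for e in env]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def grid_search(signal, decompose, k_values, alpha_values):
    """Return (K, alpha, score) with the lowest mean envelope entropy over the IMFs.

    `decompose(signal, k, alpha)` is a hypothetical VMD wrapper supplied by the caller.
    """
    best = None
    for k in k_values:
        for alpha in alpha_values:
            imfs = decompose(signal, k, alpha)
            score = sum(envelope_entropy(imf) for imf in imfs) / len(imfs)
            if best is None or score < best[2]:
                best = (k, alpha, score)
    return best
```

A pure sinusoid has a flat envelope and therefore the maximum entropy, whereas an impulsive signal concentrates its envelope in a few points and scores lower, which is why a low value is read as a sign of prominent fault transients.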
2.2. GADF
The Gramian angular field (GAF) is an image encoding method that transforms one-dimensional time-series data into two-dimensional images [15]. Through polar coordinate mapping and Gramian matrix computation, the dynamic characteristics and nonlinear relationships of the time-series data can be effectively preserved and converted into two-dimensional image texture features.
The Gramian angular field transformation includes the Gramian angular summation field (GASF) and the Gramian angular difference field (GADF). Compared with GASF, GADF exhibits superior performance in terms of image color representation, detail characterization, and cross-boundary expression [16]. Therefore, the Gramian angular difference field was adopted for encoding in this study.
Given a time series $X = \{x_1, x_2, \ldots, x_n\}$ of $n$ points, $X$ is first rescaled to the interval [−1, 1], as shown in Equation (3):

$$\tilde{x}_i = \frac{(x_i - \max(X)) + (x_i - \min(X))}{\max(X) - \min(X)}, \tag{3}$$

where $x_i$ is the $i$th value in the original time series $X$ and $\tilde{x}_i$ is the $i$th value in the rescaled time series $\tilde{X}$, $-1 \le \tilde{x}_i \le 1$.
Each value $\tilde{x}_i$ in $\tilde{X}$ is then encoded as an angular cosine, and the timestamp serves as the radius; the calculation is shown in Equation (4):

$$\phi_i = \arccos(\tilde{x}_i), \quad -1 \le \tilde{x}_i \le 1; \qquad r_i = \frac{t_i}{N}, \tag{4}$$

where $\phi_i$ denotes the polar angle obtained from the angular cosine; $t_i$ denotes the timestamp; and $N$ denotes the constant factor that regularizes the span of the polar coordinate system.
After the time series is transformed into the polar coordinate system, the temporal correlations over different time intervals can be identified by considering the angular differences between individual points. The Gramian angular difference field (GADF) transformation is implemented based on the sine function in the Gramian angular field (GAF), as expressed in Equation (5):

$$\mathrm{GADF} = \left[ \sin(\phi_i - \phi_j) \right]_{n \times n} = \sqrt{I - \tilde{X}^2}^{\,T} \cdot \tilde{X} - \tilde{X}^{T} \cdot \sqrt{I - \tilde{X}^2}, \tag{5}$$

where $I$ is the unit row vector [1, 1, …, 1].
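The encoding in Equations (3)–(5) can be sketched in a few lines. This is an illustrative sketch using only the standard library, not the authors' code: the function names are hypothetical, and any resizing of the resulting $n \times n$ matrix to the network input resolution is left out because it is not specified here.

```python
import math


def rescale(series):
    """Min-max rescale a series to [-1, 1], as in Equation (3)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("a constant series cannot be rescaled to [-1, 1]")
    return [((x - hi) + (x - lo)) / (hi - lo) for x in series]


def gadf(series):
    """Return the n x n Gramian angular difference field of a 1-D series."""
    scaled = rescale(series)
    # Clamping guards against rounding slightly outside the arccos domain.
    phi = [math.acos(max(-1.0, min(1.0, v))) for v in scaled]
    # Entry (i, j) is sin(phi_i - phi_j), as in Equation (5).
    return [[math.sin(a - b) for b in phi] for a in phi]


image = gadf([0.0, 1.0, 2.0])
```

For the three-point series above the rescaled values are −1, 0, and 1, so the polar angles are π, π/2, and 0. The resulting matrix has a zero diagonal and is antisymmetric, which follows directly from the sine of the angle difference.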
2.3. BiGRU
The gated recurrent unit (GRU) introduces an update gate and a reset gate, thereby overcoming the problems of gradient vanishing and gradient explosion encountered by traditional recurrent neural networks (RNNs) in long-sequence processing. As a result, long-term dependencies can be captured efficiently. Its specific structure is shown in Figure 1. However, when processing extremely long sequences, GRU may still suffer from memory degradation and sensitivity to noise.
Based on the GRU architecture, the bidirectional gated recurrent unit (BiGRU) further incorporates both forward and backward state information, thereby enhancing the model’s comprehensive understanding of the sequence and improving its robustness and adaptability in temporal feature extraction. The corresponding calculation process is given in Equations (6)–(8).
In the formula: and represent the forward output and reverse output generated by the unit at time t, respectively; and represent the weight values corresponding to the forward state and backward state, respectively. represents the output of the BiGRU network model at time step t; denotes the context vector associated with time step t; is the weight matrix associated with the hidden state; is the parameter related to the reset gate at time step t.
2.4. ResNet
The residual neural network (ResNet), proposed by He et al. [17], addresses the degradation problem in deep networks. Its core idea is the residual block structure, shown in Figure 2. This paper uses ResNet-18 as the base model. ResNet is composed of multiple stacked residual blocks, each of which contains two paths: a skip-connection path and a conventional path with nonlinear mapping. Each residual block has two convolutional layers, which are followed by a batch normalization layer and a ReLU activation function.
In Figure 2, $x$ is the input to the residual block, $H(x) = F(x) + x$ is the residual output, $F(x)$ is the residual mapping function, and $x$ on the skip-connection path is the identity mapping.
2.5. CBAM
Woo et al. [18] presented the convolutional block attention module (CBAM) in 2018. It is a lightweight attention mechanism that can be easily added to CNN architectures. CBAM allows the network to adaptively focus on critical feature channels and spatial areas, improving feature representation capabilities. The implementation process of CBAM is described in Equation (9), and its structure is illustrated in Figure 3.

$$F' = M_c(F) \otimes F, \qquad F'' = M_s(F') \otimes F', \tag{9}$$

where $F$ is the input feature map; $M_c$ is the channel attention weight; $M_s$ is the spatial attention weight; $F'$ is the feature map weighted by channel attention; $F''$ is the feature map weighted by spatial attention; and $\otimes$ denotes element-wise multiplication.
In the channel attention module, the input feature map is passed through max pooling and average pooling layers, and the two pooled descriptors are fed to a shared multilayer perceptron (MLP), in which the number of channels is compressed to $C/r$ and then expanded back to $C$, where $C$ is the number of channels and $r$ is the reduction ratio. The two MLP outputs are then summed and activated by $\sigma$ to give the channel attention $M_c$, as in Equation (10); the module structure is shown in Figure 4.

$$M_c(F) = \sigma\big(\mathrm{MLP}(\mathrm{AvgPool}(F)) + \mathrm{MLP}(\mathrm{MaxPool}(F))\big), \tag{10}$$

where $\sigma$ stands for the sigmoid activation function and $F$ denotes the input feature.
The spatial attention module is used to enhance important spatial regions in the feature map and serves as a complement to channel attention. To build the spatial attention map, the input is pooled along the channel axis using max pooling and average pooling, and the two pooled maps are concatenated and convolved. The spatial attention $M_s$ is defined in Equation (11), and the module structure is shown in Figure 5.

$$M_s(F) = \sigma\big(f^{7 \times 7}([\mathrm{AvgPool}(F); \mathrm{MaxPool}(F)])\big), \tag{11}$$

where $f^{7 \times 7}$ indicates that the convolution kernel size is $7 \times 7$.
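The channel-attention step of Equation (10) can be illustrated on a tiny feature map. This is a dependency-free sketch, not the model used in the paper: the feature map is a nested list of shape C × H × W, the shared MLP has no bias terms, and the weight matrices `w1` and `w2` are hand-picked, hypothetical values rather than learned parameters.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def shared_mlp(vec, w1, w2):
    """Two-layer MLP: C -> C/r with ReLU, then C/r -> C (no biases)."""
    hidden = [max(0.0, sum(w * v for w, v in zip(row, vec))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]


def channel_attention(feature, w1, w2):
    """Return one attention weight in (0, 1) per channel of a C x H x W map."""
    flat = [[v for row in channel for v in row] for channel in feature]
    avg_pooled = [sum(ch) / len(ch) for ch in flat]
    max_pooled = [max(ch) for ch in flat]
    from_avg = shared_mlp(avg_pooled, w1, w2)
    from_max = shared_mlp(max_pooled, w1, w2)
    return [sigmoid(a + m) for a, m in zip(from_avg, from_max)]


def apply_channel_attention(feature, weights):
    """Scale every element of each channel by that channel's weight."""
    return [[[v * w for v in row] for row in ch] for ch, w in zip(feature, weights)]


# Two channels (C = 2) with reduction ratio r = 2, so the hidden layer has one unit.
feature = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 0.0]]]
w1 = [[1.0, -1.0]]        # shape (C/r) x C
w2 = [[1.0], [-1.0]]      # shape C x (C/r)
weights = channel_attention(feature, w1, w2)
refined = apply_channel_attention(feature, weights)
```

With these illustrative weights the active first channel receives a weight close to 1 and the empty second channel a weight close to 0, which is the reweighting behaviour the module is meant to provide.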
4. Validation and Analysis of Experiments
4.1. Case Western Reserve University Bearing Dataset
The bearing dataset from Case Western Reserve University (CWRU), USA, was used for experimental validation. The dataset includes fan-end bearing data sampled at 12 kHz and drive-end bearing data sampled at 12 kHz and 48 kHz. Figure 8 shows the CWRU bearing test bench, whose main components include a motor, a coupling, a torque sensor, and a dynamometer.
The drive-end bearing data recorded at 12 kHz were used in this study. The test bearing was an SKF 6205 deep-groove ball bearing. Data were collected under four load conditions (0, 0.735, 1.47, and 2.25 kW), corresponding to rotating speeds of 1797, 1772, 1750, and 1730 r/min. Defects of specific diameters were introduced artificially using electro-discharge machining (EDM). For each operating condition, the dataset contains one healthy state and three fault states: rolling-element, outer-race, and inner-race faults. Each fault state was further divided into three severity levels, with fault diameters of 0.1778, 0.3556, and 0.5334 mm. Based on the fault type and fault diameter, the dataset was therefore divided into ten categories: nine fault categories and one healthy category. To generate the experimental data, the sensor signals were split into samples using a sliding window with a window length of 1024 and a step size of 1024, so that consecutive samples do not overlap. For each category, 200 samples were selected, giving 2000 samples in total; 70% of them were chosen at random as the training set, and the remaining 30% served as the test set. The results of the dataset splitting are shown in Table 3.
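The segmentation and split described above can be sketched as follows. The function names are hypothetical, and the shuffle seed and the use of rounding at the 70% cut are assumptions made only so that the example is reproducible.

```python
import random


def segment(signal, window=1024, step=1024):
    """Cut a 1-D signal into fixed-length windows; step == window means no overlap."""
    return [signal[s:s + window] for s in range(0, len(signal) - window + 1, step)]


def split_train_test(samples, train_ratio=0.7, seed=0):
    """Shuffle the samples and split them into training and test sets."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


# A stand-in signal long enough to yield 200 windows of 1024 points.
signal = list(range(200 * 1024))
samples = segment(signal)[:200]
train, test = split_train_test(samples)
```

With 200 samples per category this gives 140 training and 60 test samples; applied to all 2000 samples it gives 1400 and 600.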
Two-dimensional feature images were generated using the GADF encoding method, and the image resolution used as the input to the network model was
pixels. The GADF-encoded images corresponding to each bearing fault category are shown in
Figure 9.
4.2. Experimental Results and Analysis
A hardware configuration featuring an NVIDIA GeForce RTX 4060 Ti graphics card and an Intel(R) Core(TM) i7-12700H CPU was used for the research. The PyTorch 1.9.0 deep learning framework (Facebook AI Research, Menlo Park, CA, USA), based on the Python 3.8 programming language (Python Software Foundation, Wilmington, DE, USA), was used to create the model. The Adam optimizer was used to update the network parameters, the batch size was 64, the initial learning rate was 0.001, and there were 100 training epochs.
Figure 10 displays the accuracy and loss curves of the proposed dual-channel rolling bearing fault diagnosis model based on VMD-BiGRU and GADF-ResNet-CBAM at a rotational speed of 1750 r/min. The accuracy and loss stabilized after about 20 training epochs, indicating that the model had converged; training nevertheless continued until all 100 epochs were completed. The training accuracy was 99.71%, the test accuracy was 99.50%, and the training loss was 0.0015. These results show that the proposed model performs very well in fault identification on the experimental dataset.
The confusion matrix in Figure 11 was produced to examine the detailed classification results of the proposed model for each fault category. It shows which fault categories were misclassified and how many misclassifications occurred. The vertical axis represents the true labels and the horizontal axis the predicted labels; the values on the diagonal give the number of samples that were correctly classified in each category.
As can be seen from the confusion matrix, no misclassification occurred for labels 0, 2, 3, 4, 5, 6, 7, and 9, and the diagnostic accuracy for these categories reached 100%. Two samples of label 8 (outer-race fault with a fault diameter of 0.3556 mm) were misclassified as label 1 (rolling-element fault with a fault diameter of 0.1778 mm), and one sample of label 1 was misclassified as label 8. These errors arise because the vibration characteristics of the two fault types are very similar in the early stages of failure and because weak fault impulses are partially masked by noise. Overall, the diagnostic accuracy of the proposed model was high.
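The link between these misclassification counts and the reported test accuracy can be checked with a short calculation. The sketch below assumes that the 600 test samples are spread evenly, 60 per category; that even split is an assumption for illustration, since the exact per-class counts are given in Table 3 rather than in the text.

```python
def build_confusion(n_classes, per_class, errors):
    """Build a confusion matrix; errors maps (true_label, predicted_label) to a count."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for c in range(n_classes):
        cm[c][c] = per_class
    for (true_label, predicted_label), count in errors.items():
        cm[true_label][true_label] -= count
        cm[true_label][predicted_label] += count
    return cm


def overall_accuracy(cm):
    """Fraction of samples on the diagonal."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def per_class_recall(cm):
    """Fraction of each true class (row) that was predicted correctly."""
    return [row[i] / sum(row) for i, row in enumerate(cm)]


# Rows are true labels, columns are predicted labels.
cm = build_confusion(10, 60, {(8, 1): 2, (1, 8): 1})
accuracy = overall_accuracy(cm)
recall = per_class_recall(cm)
```

Under this assumption three errors in 600 samples give an overall accuracy of 597/600 = 99.50%, matching the test accuracy reported for Figure 10, with per-class recall of 59/60 for label 1 and 58/60 for label 8.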
The learned fault features were visualized using t-distributed stochastic neighbor embedding (t-SNE) to make the classification performance of the proposed model easier to interpret. As shown in Figure 12, samples from different fault categories form distinct clusters with almost no overlap, and only a very small fraction of samples fall into the wrong cluster.
4.3. Comparative Analysis of Different Algorithmic Models
To demonstrate the performance of the proposed model in rolling bearing fault diagnosis, comparative tests were carried out against several benchmark models: a BP neural network, 1D-CNN, BiGRU, MTF-CNN, GADF-ResNet, VMD-BiGRU, and GADF-ResNet-CBAM. To guarantee a fair comparison, all models were trained on the same dataset under the same experimental conditions for 100 epochs. Each model was tested in ten repeated experiments to account for random error. As shown in Figure 13, the accuracy of the proposed model stayed at about 99% in all ten trials, higher than that of the other models.
The average value from the ten experiments was used to determine the test accuracy.
Table 4 shows that Model 1 (1D-CNN) and Model 2 (BiGRU), which are conventional fault classification models employing original vibration signals as input, achieved accuracy rates of 95.54 ± 0.42% and 95.89 ± 0.38%, respectively. Model 3 and Model 4, which use MTF and GADF encoding to convert one-dimensional vibration signals into two-dimensional feature images, obtained accuracies of 96.78 ± 0.35% and 97.37 ± 0.30%, respectively. This transformation allows the network models to extract more detailed fault features and greatly enhances fault classification capability. Model 5 (VMD-BiGRU) and Model 6 (GADF-ResNet-CBAM), which use single signal processing techniques, still show comparatively good diagnostic performance with accuracies of 97.92 ± 0.25% and 98.63 ± 0.20%. On the other hand, the proposed dual-channel model concurrently combines the signal processing techniques mentioned above, enabling data analysis via both spatial and temporal feature extraction. With an average accuracy of 99.39 ± 0.15%, the two channels work in tandem to further enhance the model’s fault identification accuracy while minimizing variance. The improved bearing defect diagnostic ability of the proposed model is confirmed by comparative studies with the six models mentioned above.
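Results of the form "mean ± standard deviation" over repeated runs can be computed as in the sketch below. The accuracy values are hypothetical, and the use of the sample standard deviation (dividing by n − 1) is an assumption, since the text does not state which estimator was used.

```python
import statistics


def summarize(accuracies):
    """Return (mean, sample standard deviation) of repeated-run accuracies in percent."""
    return statistics.mean(accuracies), statistics.stdev(accuracies)


def format_result(accuracies):
    """Format repeated-run accuracies in the 'mean ± std%' style used in the tables."""
    mean, std = summarize(accuracies)
    return f"{mean:.2f} ± {std:.2f}%"


# Hypothetical accuracies from four repeated runs of one model.
runs = [99.0, 99.5, 99.25, 99.25]
report = format_result(runs)
```

A smaller standard deviation across runs indicates that the result depends less on random initialization and on the random train/test split.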
To further validate the fault diagnosis capability of the proposed model, two advanced bearing fault diagnosis models, MSCN-LSTM [19] and CWT-IDenseNet [20], were included for comparison. Using the same dataset and 100 training epochs, these models achieved accuracies of 99.18% and 99.26%, respectively, both lower than the 99.39% of the proposed model. This indicates that the proposed model offers better fault diagnosis performance.
4.4. Ablation Study
To rigorously validate the contribution of each component and the rationality of the proposed design, a detailed ablation study was conducted, with the results summarized in Table 5.
Effectiveness of VMD Optimization: A comparison between the Baseline and Variant A reveals that replacing fixed VMD parameters with the optimized strategy yields a significant improvement, boosting accuracy from 95.80 ± 0.45% to 97.10 ± 0.32%. This confirms that adaptive parameter selection allows for more effective feature extraction from raw signals.
Impact of Fusion Strategies: Variant B achieves an accuracy of 97.95 ± 0.28%, outperforming Variant A and demonstrating that weighted fusion is superior to simple addition. Furthermore, when comparing Variant E, which utilizes Late Decision Fusion (97.90 ± 0.30%), with Variant C, the proposed feature-level approach reaches 98.65 ± 0.21%. This indicates that feature-level fusion retains more discriminative information than decision-level fusion.
Contribution of CBAM and Reconstruction: The performance gain observed in Variant C highlights the effectiveness of the CBAM module in focusing on critical features. Additionally, the Proposed model attains the highest accuracy of 99.39 ± 0.15%, surpassing Variant D (without reconstruction) at 98.10 ± 0.25%. This verifies the necessity of the VMD reconstruction step in preserving signal integrity.
Overall Performance: Ultimately, the Proposed model achieves the highest accuracy of 99.39 ± 0.15%, validating that the synergistic integration of optimized VMD, CBAM, and feature-level weighted fusion provides the most robust solution for fault diagnosis.
To evaluate the robustness of the method described in this paper, we conducted five independent runs for each experimental setup. The target model achieved an average accuracy of 99.39 ± 0.15%. The small standard deviation indicates its stability.
4.5. Cross-Load Generalization Study
To verify the model’s generalization capability under different operating conditions, we conducted rigorous cross-load validation. Unlike random partitioning, this approach ensures that the test data originates from load conditions that did not appear during training.
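One way to build such a split is a leave-one-load-out scheme, sketched below. This is an assumption about the protocol made for illustration (the text only states that the test loads are unseen during training), and the function name and toy data are hypothetical.

```python
def leave_one_load_out(samples_by_load):
    """Yield (held_out_load, train, test); the test set holds one unseen load condition."""
    for held_out in sorted(samples_by_load):
        train = [
            sample
            for load, items in samples_by_load.items()
            if load != held_out
            for sample in items
        ]
        test = list(samples_by_load[held_out])
        yield held_out, train, test


# Toy data keyed by load in HP; each sample is a (load, index) pair for easy checking.
data = {load: [(load, i) for i in range(5)] for load in (0, 1, 2, 3)}
folds = list(leave_one_load_out(data))
```

Each fold trains on three load conditions and tests on the fourth, so no test sample shares a load condition with any training sample.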
The results are shown in Table 6. The diagnostic accuracy decreased compared with the 99.39% achieved by the model under random partitioning. This decline was expected because of the domain shift between load conditions. However, the proposed method still maintained a high average accuracy of 95.93% ± 0.51%, demonstrating strong adaptability to changes in operating conditions.
It is worth noting that in tests under 1 HP and 2 HP loads, the model demonstrated excellent adaptability, with accuracy rates exceeding 96% in both cases. In contrast, performance declined slightly under 3 HP and 0 HP loads. This may be attributed, respectively, to increased signal complexity under heavy-load conditions and the attenuation of fault characteristic signals under no-load conditions.
As shown in Table 6, the proposed method maintains a high diagnostic accuracy even when the load changes, which validates the effectiveness of the learned features.
Although our model achieves high accuracy in cross-domain tasks, the interpretability of the transferred features remains a challenge. Future work will explore explainable domain adaptation.
4.6. Performance Analysis Under Different Sample Sizes
To evaluate the fault diagnosis performance of the proposed model under different numbers of training samples, comparative experiments were conducted using BiGRU, MTF-CNN, GADF-ResNet-CBAM, and the proposed model. The average diagnosis accuracy was used as the final assessment parameter after each model was evaluated ten times.
Under all sample-size settings, the proposed model had the highest average recognition accuracy, as shown in Figure 14. With only 30 training samples per fault category, its average accuracy was 95.07% ± 0.80%, which was 6.55% higher than BiGRU, 3.82% higher than MTF-CNN, and 1.91% higher than GADF-ResNet-CBAM. The accuracy of all models improved to varying degrees as the number of training samples increased. The accuracy of the proposed model reached 96.44% ± 0.60% with 60 samples, 97.62% ± 0.40% with 90 samples, and 98.75% ± 0.25% with 120 samples. With 140 samples, all models attained comparatively high diagnostic accuracy, and the proposed model reached 99.39% ± 0.15%. These results show that the proposed model continues to outperform the other models across training sample sizes, with high accuracy and stable performance even with limited data.
4.7. Noise Immunity Analysis
Because vibration signals are frequently contaminated by noise in real-world operation, Gaussian white noise with different signal-to-noise ratios (SNRs) was added to the original signals to simulate noise interference. This allowed a thorough evaluation of the adaptability and robustness of the proposed model under noise. The SNR is computed by Equation (12):

$$\mathrm{SNR} = 10 \log_{10}\left(\frac{P_{\mathrm{signal}}}{P_{\mathrm{noise}}}\right), \tag{12}$$

where $P_{\mathrm{signal}}$ denotes the power of the original signal, and $P_{\mathrm{noise}}$ denotes the noise power.
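Adding Gaussian white noise at a prescribed SNR follows directly from Equation (12): the required noise power is the signal power divided by $10^{\mathrm{SNR}/10}$. The sketch below uses only the standard library; the function names, the test sinusoid, and the fixed seed are illustrative assumptions, not details taken from the experiments.

```python
import math
import random


def power(x):
    """Mean squared value of a sequence."""
    return sum(v * v for v in x) / len(x)


def add_awgn(signal, snr_db, seed=0):
    """Add zero-mean Gaussian white noise so that the expected SNR equals snr_db."""
    rng = random.Random(seed)
    noise_power = power(signal) / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [v + rng.gauss(0.0, sigma) for v in signal]


def measured_snr_db(clean, noisy):
    """Empirical SNR in dB, computed from the realized noise."""
    noise = [n - c for c, n in zip(clean, noisy)]
    return 10.0 * math.log10(power(clean) / power(noise))


clean = [math.sin(2.0 * math.pi * 5.0 * t / 1024.0) for t in range(4096)]
noisy = add_awgn(clean, snr_db=-2.0)
```

Because the noise is random, the measured SNR of a finite sample differs slightly from the target; at −2 dB the noise power is about 1.58 times the signal power.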
Gaussian white noise with SNRs of −2 dB, 0 dB, 2 dB, and 4 dB was added to the signals at the 1750 r/min operating condition, with 60 training samples for each fault category. To further confirm the noise immunity of the proposed model, it was compared with three other models; the results are shown in Table 7.
As the signal-to-noise ratio (SNR) increases, noise interference gradually decreases, and the performance of all models improves accordingly. Under all noise conditions, the proposed model outperforms the comparison models and is more stable. Even at a low SNR of −2 dB, the proposed model achieves an accuracy of 90.64 ± 0.80%, which is 10.61 percentage points higher than that of the BiGRU model (80.03 ± 1.25%). At SNRs of 0 dB and 2 dB, the proposed model achieved accuracies of 93.81 ± 0.60% and 96.02 ± 0.40%, respectively, which are 2.08 and 1.86 percentage points higher than the second-best model (GADF-ResNet-CBAM). At an SNR of 4 dB, the proposed model achieved an accuracy of 98.79 ± 0.25%, close to the noise-free test accuracy of 99.39 ± 0.15% reported in Table 4, indicating that Gaussian white noise at this level has only a minor effect on the proposed model. These results confirm that the proposed method has good robustness and noise resistance.
4.8. Validation on an Independent Laboratory Dataset
To further validate the effectiveness and applicability of the proposed method on an independent dataset, experiments were carried out on a dataset collected from the rolling bearing test rig at Xinjiang University. Figure 15 depicts the test setup. The test bearing was an ER-16K rolling bearing, and the sampling frequency was 20.48 kHz. The experimental validation was carried out at a rotating speed of 1500 r/min and a load of 1.2 A.
The specific operational parameters of the dataset used in this experiment are listed in Table 8. The dataset contains four bearing conditions: rolling-element fault, inner-race fault, outer-race fault, and normal condition. To remain consistent with the per-category sample size used for the public CWRU bearing dataset, 200 samples were selected for each condition, for a total of 800 samples. Seventy percent of the samples were chosen at random as the training set, and the remaining thirty percent were used as the test set.
The proposed model performed very well in fault diagnosis on the Xinjiang University rolling bearing dataset, with an accuracy of 99.58 ± 0.10% on the test set after 100 training epochs. These results indicate that the proposed model can reliably diagnose faults in rolling bearings of different types, supporting its effectiveness in practical applications and its adaptability to data from distinct hardware sources. The corresponding accuracy curves of the proposed model are shown in Figure 16.
4.9. Limitations and Future Work
While the proposed method achieves high diagnostic accuracy, there are several limitations that warrant discussion regarding the validity and scope of our results.
First, the model is dependent on specific signal encoding. The use of GADF transforms one-dimensional signals into two-dimensional images, which is highly sensitive to the length and scaling range of the input data. If the length of the vibration signal changes in practical applications, the topological structure of the GADF image may be distorted, potentially leading to a drop in recognition accuracy. This limits the model’s robustness to variable-length signals.
Second, the computational cost restricts real-time application. The dual-channel architecture combining ResNet-18 and BiGRU results in a large number of parameters. While this complexity aids in feature extraction, it increases the computational burden, making it difficult to deploy the model directly on resource-constrained edge devices for real-time monitoring.
Third, the scope of generalization requires clarification. Although the model achieved high accuracy on both the CWRU dataset and our self-collected dataset (Section 4.8), demonstrating cross-device applicability, we acknowledge that the current validation is primarily under fixed operating conditions. While Section 4.5 validates cross-load performance, the model’s robustness against complex environmental and operational variability (e.g., rapid speed changes or temperature fluctuations) has not been fully established. Therefore, we moderate our conclusion: the model exhibits excellent performance under controlled and fixed-condition scenarios, but further research into domain adaptation methods is needed to achieve true strong generalization across diverse industrial environments.
Future work will focus on developing adaptive encoding mechanisms to handle variable-length signals, exploring model compression techniques for edge deployment, and integrating domain adaptation algorithms to mitigate the impact of environmental variability.