Abstract
Accurate remaining useful life (RUL) prediction of tantalum capacitors is essential for enhancing the reliability and maintainability of power electronic systems. However, online RUL prediction remains challenging because internal degradation states are difficult to access and operating conditions are non-stationary. This paper presents a novel CNN-LSTM-Attention-based deep learning framework for accurate online RUL prediction of tantalum capacitors, leveraging infrared surface temperature measurements and ambient thermal compensation. The proposed framework begins with the collection of degradation temperature data under controlled accelerated aging experiments, where true degradation indicators are extracted by eliminating ambient temperature interference through dual-sensor compensation. The resulting preprocessed data are used to train a hybrid deep neural network model that integrates convolutional layers for local feature extraction, long short-term memory (LSTM) units for sequential dependency modeling, and a soft attention mechanism to selectively focus on the critical degradation patterns. A channel attention module is further embedded to adaptively optimize the importance of different feature channels. Experimental validation using three groups of aging data demonstrates the effectiveness and superiority of the proposed method over conventional LSTM and CNN-LSTM baselines. The CNN-LSTM-Attention model achieves a substantial improvement in prediction accuracy, with mean absolute percentage error (MAPE) reductions of up to 60.97%, root mean squared error (RMSE) reductions of up to 65.63%, and coefficient of determination (R²) increases of up to 68.67%. The results confirm the ability of the proposed method to deliver precise and robust online RUL predictions for tantalum capacitors under complex operational conditions.
1. Introduction
With the continuous advancement of power electronics and miniaturized embedded systems, the prognostics and health management (PHM) of passive components, especially tantalum capacitors, has become increasingly critical [1,2,3,4,5,6]. Tantalum capacitors are widely used in high-reliability applications such as the oil and gas industry, aerospace, communication base stations, and industrial control due to their high capacitance density and stable performance [7,8,9,10,11,12]. However, their degradation behavior under thermal and electrical stress directly threatens system reliability [13,14,15,16,17,18]. To address this, condition monitoring (CM) and remaining useful life (RUL) prediction techniques have been increasingly adopted. Among various RUL prediction paradigms, data-driven approaches based on sensor measurements have shown strong potential thanks to their ability to model complex degradation patterns without relying on precise physical failure models [19,20,21,22]. In this context, deep learning methods, especially those capable of capturing temporal and nonlinear characteristics of degradation signals, offer powerful tools for modeling the aging process of tantalum capacitors [23,24,25,26].
Unlike mechanical systems that often exhibit evident vibration-based degradation patterns, the internal deterioration process of tantalum capacitors is typically latent and difficult to observe directly during their operational lifespan. Among the various indicators, temperature, particularly the internal core temperature, has proven to be one of the most sensitive signals for tracking capacitor degradation [27], as it reflects the combined effects of increased leakage current, elevated equivalent series resistance (ESR), and thermal stress accumulation. However, due to structural limitations, direct access to the core temperature is impractical in real-world applications. In contrast, the surface temperature of the capacitor casing can be conveniently monitored using noncontact infrared temperature sensors. These surface measurements, when corrected for environmental effects, serve as reliable proxies for internal degradation trends. Notably, such temperature data inherently possess rich temporal dynamics and exhibit strong correlations with the degradation states of the component. This requires the adoption of advanced modeling approaches that can effectively capture both sequential dependencies and complex feature interactions within the temperature measurement data, providing a solid foundation for accurate and online remaining useful life (RUL) prediction.
In recent years, deep learning has emerged as a promising method for the RUL prediction of capacitors. Delanyo D.K.B. et al. [28] successfully used bidirectional long short-term memory (BiLSTM) to predict the RUL of aluminum electrolytic capacitors (AECs). Compared with a traditional LSTM, the network better captures the degradation trend of AECs and improves the prediction accuracy. Z. Wang et al. [29] used an ARIMA-Bi-LSTM hybrid model, which combines ARIMA's ability to capture linear features with Bi-LSTM's ability to capture nonlinear ones, to achieve AEC prediction from an early stage. F. Wang et al. [30] proposed an ensemble learning method combining Chained-SVR and 1D-CNN for RUL prognostics of AECs. The experimental results show that the proposed method not only improves the robustness of the individual models but also achieves the best performance among all the compared methods. Q. Sun et al. [9] proposed a residual lifetime prediction method for electrolytic capacitors based on GRU and PSO-SVR. They achieved better prediction accuracy than traditional methods on both NASA and experimental datasets. G. Lou et al. [31] proposed a two-stage online RUL prediction framework based on the BiLSTM network and the H∞ observer. The results indicate that the BiLSTM network can explain more than 99% of the variation of capacitance, achieving competitive prediction accuracy when compared with offline methods. Z. Yi et al. [32] introduced a novel method for SOH estimation and RUL prediction, based on a hybrid neural network optimized by an improved honey badger algorithm (HBA). The method combines the advantages of a convolutional neural network (CNN) and a BiLSTM neural network. The results show that the proposed hybrid model effectively extracts features, enriches local details, and enhances global perception capabilities. A.F. Shahraki et al. [33] proposed an RUL prediction method based on the long short-term memory (LSTM) model. This method can reduce computational time and complexity while ensuring high prediction performance. X. Zhao et al. [34] applied fuzzy theory to the fault diagnosis of a shunt capacitor and proposed a map-based fault diagnosis system. This method not only improves the accuracy of power capacitor fault diagnosis and identification but also provides a new approach for power capacitor fault research and development.
Despite the growing application of deep learning techniques in RUL prediction, several critical challenges remain unresolved in the context of tantalum capacitors. First, the degradation behavior of capacitors is often modulated by complex and dynamic environmental conditions, such as fluctuating ambient temperature and electrical loading, which introduce significant noise and obscure the true degradation signals. Second, most existing approaches are trained on offline datasets and lack the capacity for real-time inference, limiting their applicability in scenarios requiring continuous health monitoring and timely maintenance decisions. Third, conventional neural network architectures often fall short in modeling the multiscale temporal dependencies and intricate feature correlations embedded within long-term degradation data. These challenges collectively underscore the need for a unified and adaptive prediction framework that can (i) effectively suppress environmental interference; (ii) support online, real-time prognostics; and (iii) robustly extract and learn subtle yet informative degradation patterns. Overcoming these limitations is essential for achieving accurate and dependable RUL estimation of passive electronic components in high-reliability and safety-critical applications.
To address the aforementioned challenges, this paper presents a novel CNN-LSTM-Attention-based framework for online RUL prediction of tantalum capacitors operating under complex and variable conditions. The key contributions of this work are summarized as follows:
(1) A temperature-based data preprocessing method is proposed, in which a dual-sensor configuration (comprising an infrared temperature sensor and an ambient temperature sensor) is utilized to compensate for environmental interference. This strategy enables the extraction of degradation-relevant thermal signals, thereby improving the quality and reliability of the input features.
(2) A unified hybrid deep learning architecture is developed by integrating convolutional neural networks (CNNs), long short-term memory (LSTM) units, and attention mechanisms. This architecture effectively captures local degradation features, models long-term temporal dependencies, and dynamically emphasizes the most informative patterns within the data. Additionally, a channel attention mechanism is incorporated to adaptively reweight feature channels, enhancing the model’s capacity to extract subtle yet critical degradation cues across spatial dimensions.
(3) Extensive experimental evaluations based on the accelerated aging data of real-world tantalum capacitors demonstrate the proposed method’s superior performance over conventional LSTM and CNN-LSTM baselines. The model achieves significant improvements in prediction accuracy, robustness, and generalization capability. Notably, the proposed approach is particularly well suited for online prognostic scenarios where only non-intrusive surface temperature measurements are available and holds strong potential for extension to other types of capacitors and passive electronic components in broader PHM applications.
The rest of this paper is organized as follows. Section 2 presents the problem formulation and related preliminaries, including the temperature degradation modeling of tantalum capacitors. Section 3 details the proposed online RUL prediction method, including the data preprocessing strategy and the hybrid CNN-LSTM-Attention architecture. Section 4 validates the proposed framework using accelerated degradation experiments on real-world tantalum capacitors and discusses the prediction results. Finally, Section 5 concludes the paper and outlines potential directions for future research.
2. Preliminaries
During the degradation process of tantalum capacitors, the core temperature plays a critical role in determining their performance and operational life. However, under complex working conditions, obtaining the core temperature of tantalum capacitors is challenging. In contrast, the operating temperature of the capacitor, which corresponds to the surface casing temperature, can be conveniently measured using noncontact detection methods. The operating temperature of a capacitor is influenced by the ambient temperature, heat dissipation conditions, and electrical operating parameters. Heat dissipation is affected by variables such as the structure of packaging, spatial configuration, and the accumulation of dust. In high-reliability application scenarios, these factors are typically stable under long-term operating conditions. Therefore, this work neglects the impact of heat dissipation conditions. The relationship between the core temperature and the operating temperature is described as follows:
where $T_h(t)$ represents the core temperature of the tantalum capacitor at moment $t$, $\alpha$ is a constant that relates only to the ambient thermal resistance, $T(t)$ represents the operating temperature of the tantalum capacitor at moment $t$, $T_e(t)$ is the ambient temperature, $T_{dc}(t)$ denotes the temperature induced by losses due to leakage current, and $T_{ac}(t)$ denotes the temperature induced by losses due to ripple current. From Equation (1), the operating temperature can be derived:
Equation (2) elucidates the relationship between the operating temperature and the core temperature of tantalum capacitors. Specifically, the operating temperature is determined by the heat dissipation arising from two components: leakage current and ripple current. Additionally, the ambient temperature plays a significant role as an indispensable factor influencing the operating temperature. The temperature rise of the operating temperature in tantalum capacitors caused by leakage current follows the relationship described below.
$$T_{dc}(t) = \frac{U_{dc}\, i_c(t)}{H S} = \beta\, U_{dc}\, i_c(t) \tag{3}$$
where $U_{dc}$ represents the DC voltage, which remains unchanged during capacitor degradation; $i_c(t)$ is the leakage current; $H$ is the thermal conductivity; $S$ is the surface area of the capacitor; and $1/(HS)$ is simplified as a constant, $\beta$. The temperature rise of the operating temperature in tantalum capacitors caused by ripple current can be described as follows:
Combining Equation (2), the total temperature variation in the operating temperature of tantalum capacitors caused by performance degradation can be expressed as follows:
The core temperature representing the degradation state of the tantalum capacitor can be expressed as follows:
During the aging process of tantalum capacitors, the degradation of performance significantly accelerates the rise of the operating temperature. Specifically, the degradation of performance parameters and the equivalent series resistance (ESR) lead to a rapid increase in the core temperature of the tantalum capacitor. Therefore, long-term continuous monitoring of the core temperature provides a convenient means to predict the remaining useful life (RUL) of the tantalum capacitor.
3. The Proposed Method
3.1. Overview of Proposed Method
Figure 1 illustrates the framework of the proposed CNN-LSTM-Attention-based method for predicting the RUL of tantalum capacitors, which consists of three main steps.
Figure 1. Framework of the proposed life prediction method.
(1) The first step involves the acquisition and preprocessing of temperature data for the tantalum capacitors. The degradation data used in this study are sourced from accelerated degradation tests. During the data preprocessing phase, the degradation data are processed to remove the interference caused by the ambient temperature. Because changes in ambient temperature can affect the degradation process, multiple factors, including the operating temperature and the temperature rise slope, are integrated to construct the input dataset used to train the neural network model. In addition, because the rapid temperature rise during the initial phase of accelerated testing does not represent the gradual aging process of the capacitor, the lifetime is calculated only after the temperature stabilizes. The acquired lifetime data are then preprocessed to generate the dataset for the tantalum capacitor lifetime prediction algorithm.
(2) The second step involves the offline training of the lifetime prediction model. The purpose is to systematically train the deep neural network model using the training dataset. To be specific, the input data and corresponding degradation labels are simultaneously fed into the constructed neural network. The input data is propagated through the multi-layer structure of the model, ultimately yielding the predicted lifetime of the capacitor. The error between the predicted values and the actual lifetime is calculated, and the backpropagation algorithm is employed to optimize the internal parameters of the model. This process continues until the predefined number of training iterations is reached.
(3) The third step involves online prediction and performance evaluation. The target of online prediction is to estimate the remaining lifetime of the tantalum capacitor in real time. At this stage, predictions are made based on current and historical data. Sample data from the test set is input into the trained neural network model, and the model outputs the predicted remaining lifetime of the capacitor. Various evaluation metrics, such as MAPE and RMSE, are then used to quantitatively assess the accuracy of the model’s predictions, thereby evaluating the overall performance of the model. Additionally, the predicted lifetime values are compared with the actual lifetime values recorded from accelerated tests to validate the effectiveness and accuracy of the model.
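The dataset construction in step (1) can be sketched as follows. This is only an illustration under stated assumptions: the function name `build_samples`, the sampling interval `dt`, the warm-up length `warmup`, and the window length `window` are hypothetical, and the end-of-life rule (the highest temperature in the stable phase) is the one defined later in Section 4.3.

```python
def build_samples(temps, dt=1.0, warmup=0, window=4):
    """Turn a compensated temperature series into (features, RUL) pairs."""
    # Temperature rise slope by backward difference (first sample gets 0.0).
    slopes = [0.0] + [(temps[k] - temps[k - 1]) / dt
                      for k in range(1, len(temps))]
    # Ignore the initial warm-up, where fast heating does not reflect aging.
    stable = range(warmup, len(temps))
    # End of life: index of the highest temperature in the stable phase.
    eol = max(stable, key=lambda k: temps[k])
    samples = []
    for end in range(warmup + window - 1, eol + 1):
        idx = range(end - window + 1, end + 1)
        features = [(temps[k], slopes[k]) for k in idx]  # two input channels
        rul = (eol - end) * dt                           # time left until end of life
        samples.append((features, rul))
    return samples, eol


temps = [5.0, 9.0, 10.0, 10.2, 10.5, 11.0, 12.0, 11.5]
samples, eol = build_samples(temps, warmup=2, window=3)
print(eol, [rul for _, rul in samples])
```

Each sample pairs a short history of temperature and slope values with the time remaining until the end-of-life index, which is the supervised target used in steps (2) and (3).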
3.2. Data Preprocessing Method
During the accelerated test, the ambient temperature can interfere with the measured degradation temperature of the tantalum capacitor, so the data are preprocessed to minimize this interference. Specifically, two temperature sensors are adopted in the experimental setup. Here, $y_1(t)$ represents the tantalum capacitor degradation temperature data measured by the infrared temperature sensor, while $y_2(t)$ indicates the real-time ambient temperature surrounding the tantalum capacitor, monitored by the ambient temperature sensor. $y_3(t)$ represents the true temperature data of the tantalum capacitor, which is calculated using the following formula:
$$y_3(t) = y_1(t) - y_2(t) \tag{7}$$
By subtracting the ambient temperature data from the measured tantalum capacitor temperature data, the true temperature data of the tantalum capacitor can be obtained, thereby minimizing the potential interference of ambient temperature on the test data and ensuring a high degree of reliability and accuracy in the dataset used for training the CNN-LSTM-Attention model.
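The subtraction described above amounts to a sample-by-sample difference of the two sensor streams. A minimal sketch, assuming both sensors are sampled at the same instants (the function name `compensate` and the example readings are illustrative, not taken from the experiment):

```python
def compensate(infrared, ambient):
    """Return y3(t) = y1(t) - y2(t) for two equally sampled temperature series."""
    if len(infrared) != len(ambient):
        raise ValueError("both sensors must provide the same number of samples")
    return [y1 - y2 for y1, y2 in zip(infrared, ambient)]


y1 = [45.0, 46.5, 48.0]   # infrared surface temperature, illustrative values
y2 = [25.0, 25.5, 26.0]   # ambient temperature, illustrative values
print(compensate(y1, y2))
```

If the two sensors were sampled at different rates, the series would first need to be resampled onto a common time base, which this sketch does not cover.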
3.3. The Proposed CNN-LSTM-Attention-Based Remaining Useful Life Prediction Method
The workflow for predicting the RUL of tantalum capacitors integrates deep learning techniques with the physical degradation mechanisms of the capacitors. It estimates the remaining lifetime by analyzing multi-dimensional degradation data, including operating temperature, electrical parameters, and environmental conditions. Given the strong feature extraction capability of convolutional neural networks (CNNs) and considering the long time series of degradation data for tantalum capacitors, a CNN-LSTM-Attention-based model has been developed for RUL prediction. The structure of the prediction model is illustrated in Figure 2.
Figure 2. The structure of the proposed CNN-LSTM-Attention-based model.
To be specific, the CNN is composed of convolutional layers, pooling layers, and fully connected layers. It can automatically learn meaningful features from the input data without the need for manual feature preprocessing. The convolutional layers serve as the foundation and core of the CNN, carrying out the critical task of feature extraction. The convolution operation is expressed as follows:
$$C_{i,j} = \sum_{m=0}^{p-1} \sum_{n=0}^{q-1} I_{i+m,\,j+n}\, K_{m,n} + b \tag{8}$$
where $C_{i,j}$ represents the value of the output feature map at position $(i, j)$, $I$ is the input data, $K$ is a convolution kernel of size $p \times q$, and $b$ is the bias term. As the convolutional kernel slides over the input layer, the equation can be expressed as
By comparing Equations (8) and (9), it can be observed that i′ = i + m, j′ = j + m, meaning the output values undergo the same spatial displacement as the input. This property gives the CNN translation equivariance (often loosely described as translation invariance), enabling it to preserve the spatial information of the input data. To ensure consistency between the spatial dimensions of the input data and the output feature maps, zero-padding is applied when the convolution kernel reaches the edges of the input data. The number of convolutional kernels directly determines the depth (number of channels) of the output feature maps. Additionally, the size of the convolutional kernels is a key factor in defining the receptive field, as kernels of different sizes can extract spatial features at various scales.
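The sliding-window sum and the zero-padding rule can be illustrated with a small pure-Python sketch. It assumes a single input channel, a stride of one, and padding placed so that the output has the same size as the input; the function name `conv2d_same` is hypothetical.

```python
def conv2d_same(image, kernel, bias=0.0):
    """C[i][j] = sum_m sum_n I[i+m][j+n] * K[m][n] + b, with zero padding."""
    rows, cols = len(image), len(image[0])
    p, q = len(kernel), len(kernel[0])
    top, left = (p - 1) // 2, (q - 1) // 2   # padding before the first row/column
    out = [[bias] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = bias
            for m in range(p):
                for n in range(q):
                    r, c = i + m - top, j + n - left
                    if 0 <= r < rows and 0 <= c < cols:  # outside counts as zero
                        acc += image[r][c] * kernel[m][n]
            out[i][j] = acc
    return out


kernel = [[1.0, 2.0], [3.0, 4.0]]
img_a = [[0.0] * 4 for _ in range(4)]
img_a[1][1] = 1.0
img_b = [[0.0] * 4 for _ in range(4)]
img_b[2][2] = 1.0                      # same pattern shifted by one row and column
out_a = conv2d_same(img_a, kernel)
out_b = conv2d_same(img_b, kernel)
print(out_a[0][0], out_b[1][1])        # the response moves with the input
```

Shifting the input by one row and one column shifts the response by the same amount, which is the displacement property discussed in the paragraph above.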
The pooling layer is a crucial structure in CNN for reducing the spatial dimensions of data and is typically placed after convolutional layers. By reducing the dimensions of feature maps while maintaining the number of channels, the pooling layer decreases the number of parameters in subsequent layers, enhancing computational efficiency and helping to mitigate overfitting. The most common pooling methods are Average Pooling and Max Pooling. Considering the superior performance of Max Pooling in capturing and extracting prominent features, this study adopts the Max Pooling strategy. The mathematical expression is as follows:
After being processed by the convolutional and pooling layers, the data needs to be flattened before being input into the fully connected (FC) layer. The function of the FC layer is to further integrate and reduce the dimensionality of the features extracted and fused by the preceding network layers, ultimately producing the prediction results through the forward propagation algorithm. The mathematical representation of the fully connected layer typically involves a linear combination of weight matrices and bias vectors. The expression for the fully connected layer is as follows:
$$z = w\,a + b \tag{11}$$
where $z$ represents the output of the current layer, $w$ represents the weight matrix of the current layer, $a$ represents the output of the previous layer, and $b$ represents the bias vector.
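The pooling, flattening, and fully connected stages can be sketched together in a few lines. The sketch assumes non-overlapping pooling windows and uses hypothetical helper names (`max_pool2d`, `flatten`, `dense`); the weights are arbitrary illustrative numbers, not trained values.

```python
def max_pool2d(fmap, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    rows, cols = len(fmap), len(fmap[0])
    return [[max(fmap[r + m][c + n] for m in range(size) for n in range(size))
             for c in range(0, cols - size + 1, size)]
            for r in range(0, rows - size + 1, size)]


def flatten(fmap):
    """Concatenate the rows of a feature map into one vector."""
    return [v for row in fmap for v in row]


def dense(weights, inputs, biases):
    """Fully connected layer: weighted sum of the inputs plus a bias per output."""
    return [sum(w * a for w, a in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


fmap = [[1, 3, 2, 0],
        [4, 2, 1, 5],
        [0, 1, 9, 8],
        [2, 2, 7, 6]]
pooled = max_pool2d(fmap)            # 4 x 4 feature map reduced to 2 x 2
vector = flatten(pooled)
print(pooled, dense([[1, 0, 0, 1], [0.5, 0.5, 0.5, 0.5]], vector, [0, 1]))
```

Pooling reduces the spatial size while keeping the channel count, so the dense layer that follows needs only four weights per output here instead of sixteen.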
LSTM (long short-term memory) networks are designed to address the vanishing and exploding gradient problems commonly encountered when processing long sequence data. The core architecture consists of a forget gate, an input gate, an output gate, and a memory cell. The forget gate regulates the extent to which previous memory cell content is retained. The input gate determines how much new information is stored in the memory cell. The output gate controls how much of the memory unit content will be used to calculate the current hidden state and output. These gates perform computations based on the current input and the previous hidden state, using a Sigmoid activation function that produces outputs between 0 and 1, representing the proportion of information that passes through. The update mechanisms for the three gates and the memory cell are as follows:
In the equation, $\tilde{c}_t$ represents the intermediate (candidate) memory unit at the current time step, derived by applying the tanh activation function to the current input and the previous hidden state. The outputs of the forget gate and input gate regulate the extent of memory unit updates, enabling the storage and propagation of temporal information within the LSTM network. Finally, the output of the output gate, combined with the tanh-activated value of the current memory unit, determines the output at the current time step. This output is then used to update the hidden state for the next time step. This design allows the LSTM to incorporate the historical information of the entire input sequence into the state update at each time step, enhancing its performance in processing long sequence data.
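The gate updates described above can be written out as a single time step of the standard LSTM formulation. This is a sketch, not the authors' implementation: the parameter names (`Wf`, `Wi`, `Wo`, `Wc` and the matching biases) are assumptions, and plain Python lists stand in for tensors.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(weights, vector, bias):
    """Return weights @ vector + bias for plain Python lists."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; params uses hypothetical keys Wf, Wi, Wo, Wc, bf, bi, bo, bc."""
    v = h_prev + x_t  # concatenation [h_{t-1}, x_t]
    f = [sigmoid(a) for a in affine(params["Wf"], v, params["bf"])]    # forget gate
    i = [sigmoid(a) for a in affine(params["Wi"], v, params["bi"])]    # input gate
    o = [sigmoid(a) for a in affine(params["Wo"], v, params["bo"])]    # output gate
    g = [math.tanh(a) for a in affine(params["Wc"], v, params["bc"])]  # candidate memory
    # New memory: keep part of the old cell and add part of the candidate.
    c_t = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    # New hidden state: output gate applied to the squashed memory.
    h_t = [ok * math.tanh(ck) for ok, ck in zip(o, c_t)]
    return h_t, c_t


params = {k: [[0.0, 0.0]] for k in ("Wf", "Wi", "Wo", "Wc")}
params.update({k: [0.0] for k in ("bf", "bi", "bo", "bc")})
h, c = lstm_step([3.0], [0.2], [2.0], params)
print(h, c)  # with all-zero parameters every gate is 0.5, so half of the old memory is kept
```

With all parameters set to zero, each gate outputs 0.5 and the candidate is 0, so the example makes the role of the forget gate easy to check by hand.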
The application of attention mechanisms in deep learning is inspired by the human visual attention system, which allows for the concentration and shifting of attention during information processing. Attention mechanisms are generally classified into two types: hard attention and soft attention. Considering the advantages of dynamic weight allocation in soft attention, this study adopts the soft attention mechanism. The mathematical expression of the soft attention mechanism can be represented as follows:
$$c_t = \sum_{i} \alpha_{t,i}\, x_i$$
where $c_t$ represents the weighted output of the model at time step $t$, $\alpha_{t,i}$ represents the attention weight for the $i$-th element of the input data, and $x_i$ denotes the corresponding input feature.
Through this weighting process, the model dynamically adjusts its focus at each time step, prioritizing the processing of input features assigned higher weights. This mechanism not only enhances the flexibility of the model in processing information but also significantly improves its ability to capture and utilize critical data features.
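A minimal sketch of this weighting process is given below. It assumes the attention weights are obtained by a softmax over per-step relevance scores; how those scores are computed is not specified in this section, so they are passed in as an input, and the names `softmax` and `soft_attention` are illustrative.

```python
import math


def softmax(scores):
    """Normalize scores into positive weights that sum to one."""
    peak = max(scores)                      # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention(features, scores):
    """Weighted sum of feature vectors, one weight per time step."""
    weights = softmax(scores)
    dim = len(features[0])
    context = [sum(w * f[d] for w, f in zip(weights, features))
               for d in range(dim)]
    return context, weights


features = [[1.0, 2.0], [3.0, 4.0]]          # two time steps, two features each
context, weights = soft_attention(features, [0.0, 0.0])
print(context, weights)                      # equal scores give a plain average
```

Because every weight is strictly positive and the weights sum to one, the whole operation stays differentiable, which is what distinguishes soft attention from hard attention.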
The channel attention mechanism is specifically designed to optimize the importance of channels in the convolutional layers of a model, serving as a key implementation of soft attention. The mechanism is implemented by introducing both a squeeze module and an excitation module. The squeeze module is adopted to reduce the feature map of each channel into a single scalar value, effectively capturing and representing the global information of that channel. The mathematical expression is as follows:
$$z_c = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} u_c(i, j)$$
where $H$ and $W$ represent the height and width of the feature map produced by the convolution layer, and $u_c$ represents the feature map of the $c$-th channel.
The excitation module evaluates and determines the relative importance of each channel by applying an activation function. The specific implementation process is as follows:
$$s = \sigma\left(W_2\, \delta\left(W_1 z\right)\right)$$
where $z$ is the vector of squeezed channel descriptors, $W_1$ and $W_2$ are the learnable weights of two fully connected layers, $\delta$ represents a nonlinear activation function, such as ReLU, and $\sigma$ represents the gating function that processes the global information of each channel, such as a Sigmoid. Through this mechanism, the excitation module assigns a weight to each channel, reflecting its contribution and importance within the entire network. This dynamically learned channel reweighting allows the model to process complex information in a more refined and flexible manner, thereby enhancing its accuracy and efficiency.
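The squeeze and excitation modules can be sketched as follows, following the widely used squeeze-and-excitation pattern (global average pooling, a ReLU bottleneck, and a Sigmoid gate). The weight matrices `w1` and `w2` and all function names are hypothetical, and the numbers are chosen only to make the result easy to check.

```python
import math


def squeeze(fmaps):
    """Global average pooling: one scalar per channel (fmaps[c][i][j])."""
    return [sum(sum(row) for row in u) / (len(u) * len(u[0])) for u in fmaps]


def excite(z, w1, w2):
    """Two small dense layers: ReLU bottleneck, then Sigmoid channel weights."""
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    return [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
            for row in w2]


def channel_attention(fmaps, w1, w2):
    """Rescale every channel by its learned importance weight."""
    s = excite(squeeze(fmaps), w1, w2)
    scaled = [[[s_c * v for v in row] for row in u] for s_c, u in zip(s, fmaps)]
    return scaled, s


fmaps = [[[1.0, 2.0], [3.0, 4.0]],      # channel 0, mean 2.5
         [[0.0, 0.0], [0.0, 8.0]]]      # channel 1, mean 2.0
w1 = [[1.0, 0.0]]                       # bottleneck with one hidden unit
w2 = [[0.0], [1.0]]                     # one output weight per channel
scaled, s = channel_attention(fmaps, w1, w2)
print(squeeze(fmaps), s)
```

In a trained network `w1` and `w2` are learned jointly with the convolution kernels, so the channel weights adapt to the degradation patterns in the data.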
4. Experimental Results and Discussion
4.1. Experimental Description
To closely emulate the actual operating conditions of tantalum capacitors, the test specimens in this study are designed based on the DC/DC modules from the original control board. This approach ensures not only consistency in the constituent components but also an identical structural layout of the DC/DC module. The schematic diagram and the actual photograph of the experimental circuit are presented in Figure 3a,b, respectively. The key parameters of the tantalum capacitors are summarized in Table 1. The schematic diagram and the actual photograph of the experimental platform are depicted in Figure 4a,b, respectively. The electrical stress is generated by combining DC and AC components. The DC voltage is supplied by a constant-voltage power source, while the AC ripple signal is produced by a signal generator, amplified by a power amplifier, and coupled with the DC voltage via a transformer, thereby achieving hybrid electrical stress loading. For safety, the entire DC/DC module under test is placed inside an explosion-proof test chamber. Given the sufficient volume of the chamber and the limitations of the power amplifier, multiple aging test boards are arranged in the chamber simultaneously, allowing repeated aging tests to be conducted at the same time. Each test board shares the same voltage input and load, ensuring uniform operating conditions for all capacitors. Temperature-monitoring devices are mounted above the test boards to collect degradation data during the aging process. Each monitoring device consists of an infrared temperature sensor and an ambient temperature sensor. The collected temperature data are processed by an integrated microcontroller unit (MCU), which transmits the data via an RS485 bus to an RS485-to-Ethernet module. Finally, the data are transmitted to a PC for storage and further analysis. The collected data are subsequently used to train and test a deep learning model designed to analyze and predict capacitor degradation.
Figure 3. (a) The schematic diagram. (b) The actual experimental circuit.
Table 1. Electrical parameters of the tantalum capacitor.
Figure 4. (a) Schematic of the experimental platform. (b) Actual experimental setup.
4.2. Data Preprocessing
Figure 5 illustrates the temperature data of the tantalum capacitor obtained by the temperature sensors and the corresponding preprocessing results. To be specific, Figure 5a depicts the degradation temperature curve of the tantalum capacitor during operation, as measured by the infrared sensor. Figure 5b presents the data simultaneously recorded by the ambient temperature sensor, which reflects the environmental temperature variations surrounding the capacitor and serves as critical reference information for subsequent data preprocessing. Figure 5c shows the results of the preprocessing procedure, where the environmental temperature component is subtracted from the total temperature measured by the infrared sensor to isolate the actual degradation temperature of the tantalum capacitor. This preprocessing step not only enhances the accuracy and reliability of the data but also ensures that subsequent analyses focus specifically on the thermal degradation characteristics of the capacitor itself. The preprocessed degradation temperature data of the tantalum capacitor are then adopted as the training dataset for the CNN-LSTM-Attention model. This facilitates detailed learning and training aimed at developing a prediction model capable of capturing fine-grained variations with high accuracy. This preprocessing step is critical for improving the overall performance of the model and ensuring its reliability in predicting the degradation temperature of tantalum capacitors.
Figure 5. The obtained temperatures. (a) Infrared temperature. (b) Environmental temperature. (c) The temperature difference obtained from the infrared temperature minus the environmental temperature.
4.3. Experimental Results Analysis and Discussion
Based on the aforementioned experimental system, this section aims to simulate the aging process of tantalum capacitors under actual application conditions to collect and record temperature data during the degradation process. These temperature data are utilized to train the constructed lifetime prediction model, which is subsequently validated using an independent test dataset. The target is to ensure the accuracy and reliability of the model in various application scenarios. To eliminate the influence of random factors on the experimental results, three sets of repeated tests are conducted in this research. The experimental results are shown in Figure 6, corresponding to Groups 1, 2, and 3. In these figures, the green curves represent the actual degradation trends, while the red curves depict the predicted degradation trends generated by the CNN-LSTM-Attention-based model.
Figure 6. Predicted results for Group 1. (a) LSTM model. (b) CNN-LSTM model. (c) CNN-LSTM-Attention model.
To comprehensively evaluate the accuracy and reliability of the developed lifetime prediction model, three evaluation metrics are adopted: root mean squared error (RMSE), mean absolute percentage error (MAPE), and the coefficient of determination (R²). Specifically, RMSE measures the absolute magnitude of the differences between the predicted and actual values and, because the errors are squared, is highly sensitive to large errors. A smaller RMSE indicates better predictive performance. The mathematical expression for RMSE is as follows:
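$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(\hat{y}_i - y_i\right)^2}$$
where $N$ represents the total number of samples, $\hat{y}_i$ is the predicted value of the tantalum capacitor at the $i$-th sample point, $y_i$ is the actual value of the tantalum capacitor at the $i$-th sample point, and $\bar{y}$ is the average of the actual values.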
MAPE focuses on the relative magnitude of errors, representing the absolute difference between the predicted and actual values as a percentage of the actual value. A lower MAPE indicates higher accuracy for the model. The mathematical expression is as follows:
$$\mathrm{MAPE} = \frac{100\%}{N} \sum_{i=1}^{N} \left| \frac{\hat{y}_i - y_i}{y_i} \right|$$
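The R² metric is adopted to evaluate the degree of fitting between the predicted values and the actual values of the model. A value closer to 1 indicates a better fit to the data and stronger predictive performance. In its standard form, it is computed as
$$R^2 = 1 - \frac{\sum_{i=1}^{N} \left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{N} \left(y_i - \bar{y}\right)^2}$$
with $y_i$, $\hat{y}_i$, and $\bar{y}$ denoting the actual values, the predicted values, and the mean of the actual values, respectively.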
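For reference, the three metrics can be computed in a few lines of plain Python using their standard definitions. The function names and the sample numbers are illustrative and are not taken from the experiments.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    n = len(y_true)
    return math.sqrt(sum((p - y) ** 2 for y, p in zip(y_true, y_pred)) / n)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    n = len(y_true)
    return 100.0 / n * sum(abs((p - y) / y) for y, p in zip(y_true, y_pred))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


actual = [10.0, 20.0, 30.0]
predicted = [11.0, 18.0, 33.0]
print(rmse(actual, predicted), mape(actual, predicted), r2(actual, predicted))
```

Note that MAPE is undefined when an actual value is zero, so it is best suited to quantities such as compensated temperatures that stay away from zero.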
Based on the dataset constructed from the experimental data of Group 1, predictions are made with the LSTM, CNN-LSTM, and CNN-LSTM-Attention models, and the results are shown in Figure 6. To assess the prediction accuracy of the models on the test set, RMSE, MAPE, and R2 are calculated for the three network models, and the values are listed in Table 2. As shown in Table 2, the MAPE of the CNN-LSTM-Attention model decreases by 36.75% and 60.97% compared to the CNN-LSTM and LSTM models, respectively. The RMSE of the CNN-LSTM-Attention model decreases by 39.61% and 64.29% compared to the CNN-LSTM and LSTM models, respectively. The R2 of the CNN-LSTM-Attention model increases by 4.15% and 18.40% compared to the CNN-LSTM and LSTM models, respectively, indicating the higher accuracy and reliability of the CNN-LSTM-Attention model in lifetime prediction.
Table 2.
Evaluation metrics for the predictions of the three network models in Group 1.
The lifetime of a tantalum capacitor is considered to reach its endpoint when its performance degrades to a predefined failure threshold. In this work, the moment corresponding to the highest temperature during the stable degradation phase is defined as the end-of-life of the capacitor. Once the capacitor reaches this temperature, it is considered to have failed, which completes the prediction of its RUL. The RUL prediction results are compared with the actual lifetime values obtained from the accelerated degradation experiments, and the error analysis results are presented in Table 3. The relative prediction error of the RUL prediction method based on the CNN-LSTM-Attention network is 10.65%, while the relative prediction errors of the CNN-LSTM and LSTM networks are 12.54% and 16.82%, respectively. Relative to these baselines, the prediction error of the CNN-LSTM-Attention network is reduced by 15.07% (from 12.54% to 10.65%) and 36.68% (from 16.82% to 10.65%), respectively, demonstrating the higher accuracy and reliability of the proposed model.
Table 3.
RUL prediction error results for Group 1.
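The end-of-life criterion and the relative RUL error described above can be sketched as follows. This is an illustrative example only: the boundaries of the stable degradation phase (`stable_start`, `stable_end`), the sampling period, and the temperature sequences are hypothetical, since the text does not specify how the phase is delimited.

```python
def end_of_life_index(temps, stable_start, stable_end):
    """Index of the highest temperature within the stable degradation phase.

    The phase is the half-open index range [stable_start, stable_end);
    how that range is chosen is an assumption of this sketch.
    """
    window = temps[stable_start:stable_end]
    if not window:
        raise ValueError("stable degradation phase is empty")
    offset = max(range(len(window)), key=lambda k: window[k])
    return stable_start + offset


def rul_relative_error(predicted_life, actual_life):
    """Relative error of the predicted lifetime, in percent."""
    return 100.0 * abs(predicted_life - actual_life) / actual_life


# Hypothetical values for illustration only (not experimental data).
SAMPLE_PERIOD_H = 1.0  # assumed time between samples, in hours
actual_temps = [30.0, 31.0, 33.0, 36.0, 40.0, 43.0, 41.0, 38.0]
predicted_temps = [30.0, 31.0, 34.0, 37.0, 41.0, 42.0, 44.0, 39.0]

actual_life = end_of_life_index(actual_temps, 2, 8) * SAMPLE_PERIOD_H
predicted_life = end_of_life_index(predicted_temps, 2, 8) * SAMPLE_PERIOD_H
error = rul_relative_error(predicted_life, actual_life)

print(f"actual life = {actual_life} h, predicted life = {predicted_life} h")
print(f"relative error = {error:.2f}%")
```

In this toy case the predicted peak occurs one sample later than the actual peak, so the predicted lifetime is overestimated by one sampling period.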
Using the dataset constructed from the experimental data of Group 2, predictions are made with the three models, and the results are shown in Figure 7. The evaluation metrics RMSE, MAPE, and R2 are calculated and listed in Table 4. The MAPE of the CNN-LSTM-Attention model decreases by 18.24% and 47.39% compared to the CNN-LSTM and LSTM models, respectively. Similarly, the RMSE of the CNN-LSTM-Attention model decreases by 2.86% and 49.81% compared to the CNN-LSTM and LSTM models, respectively. Furthermore, the R2 of the CNN-LSTM-Attention model increases by 0.20% and 6.64% compared to the CNN-LSTM and LSTM models, respectively. The gain over CNN-LSTM is therefore modest for RMSE and R2 in this group, whereas the gain over LSTM remains substantial across all three metrics.
Figure 7.
Predicted results for Group 2. (a) LSTM model. (b) CNN-LSTM model. (c) CNN-LSTM-Attention model.
Table 4.
Evaluation metrics for the predictions of the three network models in Group 2.
As shown in Figure 7, the capacitor is considered to have failed at the moment of highest temperature during the stable degradation phase, which completes the prediction of the remaining useful life (RUL) of the tantalum capacitor. The RUL prediction results are then compared with the actual lifetime of the capacitor obtained from the accelerated degradation test, and the prediction errors of the three networks are listed in Table 5. The relative prediction error of the RUL prediction method based on the CNN-LSTM-Attention network is 5.20%, while the relative prediction errors of the CNN-LSTM and LSTM networks are 9.81% and 14.28%, respectively. Relative to these baselines, the prediction error of the CNN-LSTM-Attention network is reduced by 46.99% and 63.59%, respectively, which indicates the higher accuracy and reliability of the proposed model.
Table 5.
RUL prediction error results for Group 2.
Using the dataset constructed from the experimental data of Group 3, predictions are made with the three models, and the results are shown in Figure 8. It can be observed that the degradation trend of this group decreases during the failure phase, which is attributed to ambient temperature variation masking the real performance degradation trend. The evaluation metrics RMSE, MAPE, and R2 are listed in Table 6. As shown in Table 6, the MAPE of the CNN-LSTM-Attention model decreases by 19.20% and 47.12% compared to the CNN-LSTM and LSTM models, respectively. The RMSE of the CNN-LSTM-Attention model decreases by 24.71% and 65.63% compared to the CNN-LSTM and LSTM models, respectively. The R2 of the CNN-LSTM-Attention model increases by 28.79% and 68.67% compared to the CNN-LSTM and LSTM models, respectively. These metrics again indicate the higher accuracy and reliability of the CNN-LSTM-Attention model in lifetime prediction.
Figure 8.
Predicted results for Group 3. (a) LSTM model. (b) CNN-LSTM model. (c) CNN-LSTM-Attention model.
Table 6.
Evaluation metrics for the predictions of the three network models in Group 3.
The relative RUL prediction errors of the three networks for Group 3 are shown in Table 7. The relative prediction error of the RUL prediction method based on the CNN-LSTM-Attention network is 3.06%, while the relative prediction errors of the CNN-LSTM and LSTM networks are 3.54% and 12.25%, respectively. Relative to these baselines, the prediction error of the CNN-LSTM-Attention network is reduced by 13.56% and 75.02%, respectively, indicating the higher accuracy and reliability of the predictive model based on CNN-LSTM-Attention.
Table 7.
RUL prediction error results for Group 3.
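The error reductions quoted for the three groups are relative reductions, that is, (baseline error − proposed error) / baseline error. The short check below recomputes them from the relative RUL prediction errors reported in the text for each group; the dictionary layout and function name are choices made for this sketch.

```python
# Relative RUL prediction errors (%) as reported in the text for each group.
reported_errors = {
    "Group 1": {"CNN-LSTM-Attention": 10.65, "CNN-LSTM": 12.54, "LSTM": 16.82},
    "Group 2": {"CNN-LSTM-Attention": 5.20, "CNN-LSTM": 9.81, "LSTM": 14.28},
    "Group 3": {"CNN-LSTM-Attention": 3.06, "CNN-LSTM": 3.54, "LSTM": 12.25},
}


def relative_reduction(baseline, proposed):
    """Percentage by which `proposed` is lower than `baseline`."""
    return 100.0 * (baseline - proposed) / baseline


for group, errors in reported_errors.items():
    proposed = errors["CNN-LSTM-Attention"]
    vs_cnn_lstm = relative_reduction(errors["CNN-LSTM"], proposed)
    vs_lstm = relative_reduction(errors["LSTM"], proposed)
    print(f"{group}: {vs_cnn_lstm:.2f}% vs CNN-LSTM, {vs_lstm:.2f}% vs LSTM")
```

The recomputed values agree with the reductions stated for Groups 1, 2, and 3 to two decimal places.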
5. Conclusions
This article has proposed a novel deep learning-based framework for online remaining useful life (RUL) prediction for tantalum capacitors, leveraging CNN-LSTM-Attention mechanisms to enable accurate and reliable prognostics based on temperature measurements. The key innovations of the proposed method are summarized as follows: (1) a hybrid degradation-aware modeling approach was developed by integrating convolutional feature extraction, long-term temporal memory, and soft attention mechanisms to dynamically enhance degradation pattern recognition, and (2) a channel attention mechanism was embedded to adaptively reweight critical feature channels, improving the model’s sensitivity to subtle degradation trends under varying operating conditions. Extensive experiments using real-world accelerated aging data demonstrated the superiority of the proposed approach in terms of prediction accuracy and generalization. Compared with baseline models, the CNN-LSTM-Attention framework achieved significant reductions in MAPE and RMSE, along with notable improvements in R2, consistently across multiple test groups. These results validate the robustness and effectiveness of the proposed method for the health prognostics of tantalum capacitors under complex thermal and electrical stresses. Future work will focus on extending this framework to address broader classes of electronic components with different degradation modalities, integrating multi-source sensory information beyond temperature, and exploring lightweight model architectures for deployment in embedded and resource-constrained prognostics scenarios. Moreover, potential applications in the condition-based maintenance of power electronic systems and real-time health monitoring in industrial IoT environments will be investigated to further demonstrate the scalability and adaptability of the proposed approach.
Author Contributions
All authors have made valuable contributions to this paper. Conceptualization, Z.H. and Q.Z.; methodology, G.L. and Y.C.; validation, G.L. and Q.Z.; formal analysis, G.L. and Q.Z.; investigation, Z.H. and Y.C.; data curation, G.L. and Y.C.; writing—original draft preparation, Z.H.; writing—review and editing, Q.Z.; visualization, Z.H.; supervision, Q.Z.; project administration, Z.H.; funding acquisition, Q.Z. All authors have read and agreed to the published version of the manuscript.
Funding
This work was supported by the National Natural Science Foundation of China under Grant 62473080.
Institutional Review Board Statement
Not applicable.
Data Availability Statement
The original contributions presented in this study are included in the article material; further inquiries can be directed to the corresponding author.
Conflicts of Interest
Author Zhongsheng Huang is employed by the company PipeChina West Pipeline Company. Author Guoming Li is employed by the company PipeChina Gansu Pipeline Company. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
References
- Li, M.; Zhang, M.; Shen, X.; Yan, J.; Wang, A.; Chao, Y.; Qi, B. Research on Capacitor Aging Behavior and Prediction of Remaining Service Life. In Proceedings of the 2023 6th International Conference on Renewable Energy and Power Engineering (REPE), Beijing, China, 15–17 September 2023; pp. 146–150.
- Zeng, D.; Lu, J.; Zheng, Y. Combined fuzzy time series prediction method for fault prediction of EML pulse capacitors. IEEE Trans. Plasma Sci. 2021, 49, 905–913.
- Tai, Y.; Chen, P.; Jian, Y.; Fang, Q.; Xu, D.; Cheng, J. Failure mechanism and life estimate of metallized film capacitor under high temperature and humidity. Microelectron. Reliab. 2022, 137, 114755.
- Ghadrdan, M.; Peyghami, S.; Mokhtari, H.; Wang, H.; Blaabjerg, F. Dissipation factor as a degradation indicator for electrolytic capacitors. IEEE J. Emerg. Sel. Top. Power Electron. 2022, 11, 1035–1044.
- Xu, A.; Fang, G.; Zhuang, L.; Gu, C. A multivariate student-t process model for dependent tail-weighted degradation data. IISE Trans. 2024, 57, 1071–1087.
- Zhuang, L.; Xu, A.; Wang, Y.; Tang, Y. Remaining useful life prediction for two-phase degradation model based on reparameterized inverse Gaussian process. Eur. J. Oper. Res. 2024, 319, 877–890.
- Freeman, Y.; Lessner, P. Evolution of Polymer Tantalum Capacitors. Appl. Sci. 2021, 11, 5514.
- Butnicu, D. A Derating-Sensitive Tantalum Polymer Capacitor’s Failure Rate within a DC-DC eGaN-FET-Based PoL Converter Workbench Study. Micromachines 2023, 14, 221.
- Sun, Q.; Yang, L.; Li, H.; Sun, G. RUL prediction for AECs of power electronic systems based on machine learning and error compensation. J. Intell. Fuzzy Syst. 2023, 44, 7407–7417.
- Freeman, Y.; Lessner, P.; Luzinov, I. Reliability and Failure Mode in Solid Tantalum Capacitors. ECS J. Solid State Sci. Technol. 2021, 10, 045007.
- Xu, A.; Wang, J.; Tang, Y.; Chen, P. Efficient online estimation and remaining useful life prediction based on the inverse Gaussian process. Nav. Res. Logist. 2024, 72, 319–336.
- Zhu, J.; Huang, C.; Shen, C.; Shen, Y. Cross-domain open set machinery fault diagnosis based on adversarial network with multiple auxiliary classifiers. IEEE Trans. Ind. Inform. 2022, 18, 8077–8086.
- Zhao, J.; Liu, J. Ordered clustering method and degradation trend analysis for performance degradation of tantalum capacitor. IEEJ Trans. Electr. Electron. Eng. 2020, 15, 179–186.
- Zhao, J.; Zhou, Y.; Zhu, Q.; Song, Y.; Liu, Y.; Luo, H. A remaining useful life prediction method of aluminum electrolytic capacitor based on wiener process and similarity measurement. Microelectron. Reliab. 2023, 142, 114928.
- Gasperi, M.L. Life Prediction Modeling of Bus Capacitors in AC Variable-Frequency Drives. IEEE Trans. Ind. Appl. 2005, 41, 1430–1435.
- Wang, H.; Blaabjerg, F. Reliability of Capacitors for DC-Link Applications in Power Electronic Converters—An Overview. IEEE Trans. Ind. Appl. 2014, 50, 3569–3578.
- Zhu, J.; Wang, Y.; Xia, M.; Williams, D.; Silva, C. A new multisensor partial domain adaptation method for machinery fault diagnosis under different working conditions. IEEE Trans. Instrum. Meas. 2023, 72, 1–10.
- Huang, C.; Li, H.; Peng, W.; Tang, L.; Ye, Z. Personalized Federated Transfer Learning for Cycle-Life Prediction of Lithium-Ion Batteries in Heterogeneous Clients with Data Privacy Protection. IEEE Internet Things J. 2024, 11, 36895–36906.
- Laflamme, S.; Kollosche, M.; Connor, J.J.; Kofod, G. Robust flexible capacitive surface sensor for structural health monitoring applications. J. Eng. Mech. 2013, 139, 879–885.
- Khandebharad, A.R.; Dhumale, R.B.; Lokhande, S.S.; Lokhande, S.D. Real time remaining useful life prediction of the electrolytic capacitor. In Proceedings of the 2015 International Conference on Information Processing (ICIP), Pune, India, 16–19 December 2015; pp. 631–636.
- Luo, C.; Chen, P.; Jaillet, P. Portfolio Optimization Based on Almost Second-degree Stochastic Dominance. Manag. Sci. 2025, 63, 3381–3392.
- Zhao, X.; Chen, P.; Tang, L. Condition-based maintenance via Markov decision processes: A review. Front. Eng. Manag. 2025, 12, 330–342.
- Ye, S.; Jiang, J.; Li, J.; Liu, Y.; Zhou, Z.; Liu, C. Fault diagnosis and tolerance control of five-level nested NPP converter using wavelet packet and LSTM. IEEE Trans. Power Electron. 2019, 35, 1907–1921.
- Jeong, S.H.; Park, J.W.; Kim, H.S. Deep neural network-based lifetime diagnosis algorithm with electrical capacitor accelerated life test. J. Power Sources 2024, 599, 234182.
- Kang, W.; Wang, D.; Jongbloed, G.; Hu, J.; Chen, P. Robust Transfer Learning for Battery Lifetime Prediction Using Early Cycle Data. IEEE Trans. Ind. Inform. 2025, 21, 4639–4648.
- Hu, J.; Chen, P. Mission Abort Policy Considering Imperfect Alarms and Two Abort Options. Nav. Res. Logist. 2025, 72, 858–867.
- Sakamoto, J.; Hirata, R.; Shibutani, T. Potential failure mode identification of operational amplifier circuit board by using high accelerated limit test. Microelectron. Reliab. 2018, 85, 19–24.
- Kulevome, D.K.B.; Wang, H.; Wang, X. A bidirectional LSTM-based prognostication of electrolytic capacitor. Prog. Electromagn. Res. C 2021, 109, 139–152.
- Wang, Z.; Qu, J.; Fang, X.; Li, H.; Zhong, T.; Ren, H. Prediction of early stabilization time of electrolytic capacitor based on ARIMA-Bi_LSTM hybrid model. Neurocomputing 2020, 403, 63–79.
- Wang, F.; Cai, Y.; Tang, H.; Lin, Z.; Pei, Y.; Wu, Y. Prognostics of Aluminum Electrolytic Capacitors Based on Chained-SVR and 1D-CNN Ensemble Learning. Arab. J. Sci. Eng. 2022, 47, 13995–14012.
- Lou, G.; Lin, W.; Huang, G.; Xiang, W. A two-stage online remaining useful life prediction framework for supercapacitors based on the fusion of deep learning network and state estimation algorithm. Eng. Appl. Artif. Intell. 2023, 123, 106399.
- Yi, Z.; Wang, S.; Li, Z.; Wang, L.; Wang, K. A Novel Approach for State of Health Estimation and Remaining Useful Life Prediction of Supercapacitors Using an Improved Honey Badger Algorithm Assisted Hybrid Neural Network. Prot. Control. Mod. Power Syst. 2024, 9, 1–18.
- Shahraki, A.F.; Al-Dahidi, S.; Taleqani, A.R.; Yadav, O.P. Using LSTM neural network to predict remaining useful life of electrolytic capacitors in dynamic operating conditions. Proc. Inst. Mech. Eng. Part O J. Risk Reliab. 2023, 237, 16–28.
- Zhao, X.; Zhang, X.; Ren, P. Fault diagnosis and identification of power capacitor based on edge cloud computing and deep learning. Math. Probl. Eng. 2020, 2020, 3120805.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).