Article

A Method of Diagnosing Analog Circuit Soft Faults Using Boruta Features and LightGBM

1 National Key Laboratory of Science and Technology on Advanced Light-Duty Gas-Turbine, Beijing 100190, China
2 Institute of Engineering Thermophysics, Chinese Academy of Sciences, Beijing 100190, China
3 University of Chinese Academy of Sciences, Beijing 100049, China
* Author to whom correspondence should be addressed.
Electronics 2024, 13(6), 1123; https://doi.org/10.3390/electronics13061123
Submission received: 25 January 2024 / Revised: 14 March 2024 / Accepted: 16 March 2024 / Published: 19 March 2024

Abstract

Modern electronic power systems rely heavily on analog circuits. The accurate detection of analog circuit faults, especially soft faults, is of great significance for the maintenance and inspection of electronic systems. This paper proposes applying the Boruta feature selection method to soft fault diagnosis in analog circuits, screening out low-dimensional, efficient feature components from the high-dimensional time-domain statistical features and frequency-domain features of circuit responses. These feature components are then used as the input to train a LightGBM classification model, and the Bayesian optimization method is introduced to optimize the model’s hyperparameters. Finally, the trained fault diagnosis model is verified on two typical experimental circuits, and satisfactory accuracy is obtained.

1. Introduction

The electronic circuit industry’s rapid evolution has propelled sophisticated devices into nearly every aspect of our lives, from industrial production to everyday conveniences. As these circuits become increasingly integrated, their functionalities and modularity are also growing, demanding heightened operational reliability. This imperative is particularly critical in demanding fields like aerospace, military, and medicine, where systems often operate under extreme conditions [1]. Achieving stable, error-free performance is paramount in these applications. Failure to diagnose faults promptly can lead to significant consequences.
Although electronic circuits fall into two classes, digital and analog, troubleshooting faults in the latter is significantly more demanding. This stems from several inherent characteristics of analog circuits: component complexities, measurement uncertainties, limited observability, etc. Despite these challenges, the significance of analog circuits cannot be overstated. Even in mixed-signal devices where digital circuits dominate in number, 80% of faults occur in the analog portions [2]. This highlights the critical need for advancements in analog fault diagnosis techniques.
Fault diagnosis methods for analog circuits fall into two categories: model-based and data-driven. Model-based methods aim to derive the transfer function of the circuit, typically by analyzing the circuit design principles or by applying parameter identification techniques. For simple circuits, such methods can be effective. However, as circuit complexity escalates, obtaining a usable transfer function, or even estimating a few of its parameters, becomes far more difficult. Data-driven methods are therefore more popular and more practical to apply.
There exist two distinct categories of faults in analog circuits. The first type is called a hard fault, which refers to an open circuit or short circuit caused by physical damage to or faulty connection of electronic components. In serious cases, it can leave the circuit completely unable to work. Considerable research has addressed the open-circuit and short-circuit behavior of particular circuits. Wang et al. [3] combined an improved Dempster–Shafer (DS) evidence theory with a backpropagation (BP) neural network to enhance the accuracy of diagnosing open-circuit faults in inverter transistors. Mingyun Chen et al. [4] developed a fault injection strategy to aid in differentiating between internal and external switch faults within 3L-NPC rectifiers. Their method facilitates the quick and accurate detection of both single and multiple switch faults, improving overall system reliability. Tiancheng Shi [5] introduced an enhanced diagnostic approach utilizing a deep belief network (DBN) in conjunction with the least squares support vector machine (LSSVM). This method proves highly effective in diagnosing diverse switch faults within pulse-width modulation voltage source rectifier systems, showcasing a robust resistance to interference and swift fault identification.
Another type is called a soft fault. Soft faults in analog circuits refer to instances where component parameters deviate from expectations due to external factors such as temperature variations, electromagnetic interference, prolonged usage, etc. These deviations can render the circuit inoperable under certain conditions. In contrast to most hard faults, soft faults exhibit a more covert nature and pose greater challenges for detection [6]. Scholars have extensively explored data-driven approaches to address soft fault issues in electronic circuits, recognizing the need for advancements in this field. Mehran Aminian and Farzan Aminian [7] have successively attempted to employ the PCA [8] method to reduce the dimensionality of features obtained through wavelet transformation. Yingqun Xiao et al. [9] have introduced a distinctive preprocessing technique, referred to as kernel principal components analysis with a focus on maximal class separability, for analyzing the time response of the analog circuit. Lipeng Ji et al. [10] leveraged the formidable feature extraction and learning capability of ResNet networks to identify crucial parameters defining analog circuit performance and pinpoint the failing component. Ping Song et al. [11] introduced a novel approach for fault feature extraction utilizing fractional Fourier transform (FRFT), and an SVM was employed to train on the extracted features to diagnose and categorize the faults. Peng Sun et al. [12] introduced a fault diagnosis method for modular analog circuits, utilizing support vector data description (SVDD) and integrating Dempster–Shafer (abbreviated as DS) evidence theory. They performed simulation and hardware experiments on a double-bandpass filter circuit, achieving favorable results. Chaolong Zhang [13] introduced deep belief networks into the fault diagnosis of analog circuits and used a quantum-behaved particle swarm optimization (QPSO) algorithm to optimize the learning rate of the DBN. Huahui Yang et al. [14] applied one-dimensional convolutional neural networks to diagnosing faults in analog circuits, aiming to complete the tasks of extracting relevant features and classifying faults within the input signal simultaneously through the neural network. Zhijie Yuan [15] used two popular manifold learning methods, local linear embedding (LLE) and diffusion mapping (DM), to improve commonly used dimensionality reduction techniques, so as to better extract the fault features in analog circuits.
From the existing research, it can be seen that most studies follow the practice of first extracting features from circuit signals, then reducing the dimensionality of the features, and finally classifying faults. In the feature extraction step, many time-domain statistical features are often ignored, whereas these features should be combined with frequency-domain features to represent the fault characteristics of the circuit more completely. However, this combination increases the dimension of the feature vector, and it is necessary to exclude weakly correlated and redundant components from these high-dimensional features to reduce the burden on subsequent fault classification. To this end, this paper introduces a novel approach to diagnosing faults in analog circuits that relies on Boruta feature selection and the LightGBM model. The technical contributions are listed below:
(i)
We use several time-domain statistical feature methods to extract the statistical features of the time-domain signal and use wavelet packet transform to extract the frequency features of the time-domain signal. By combining the two, the composite feature vector of the circuit signal is obtained.
(ii)
The Boruta feature selection method is proposed to extract low-dimensional effective features from high-dimensional feature vectors.
(iii)
The LightGBM model is proposed as a means for diagnosing analog circuit faults. We also introduce the Bayesian optimization approach to effectively fine-tune the hyperparameters of the model for enhanced performance.
The subsequent sections are structured as follows: Section 2 presents the pertinent theoretical framework utilized in this research. Section 3 elaborates on the procedural steps of the proposed methodology. In Section 4, the application of the proposed method to two experimental circuits is discussed, along with an analysis of the experimental findings. Additionally, a comparative experiment is outlined to showcase the efficacy of the proposed approach. Finally, Section 5 provides a summary of the research conducted in this paper.

2. Related Theories

2.1. Boruta Feature Selection

The Boruta feature selection technique is categorized as a wrapper-based approach for feature selection. Unlike filter-based methodologies, wrapper-based techniques evaluate the significance of features by analyzing their performance within predictive models. They aim to identify the most suitable subset of features while ensuring robustness against irrelevant or noisy features [16]. Boruta is an all-relevant feature selection method, whereas most other methods are minimal-optimal: it aims to find all the features that carry information useful for prediction, rather than a compact subset of features on which some classifier attains minimal error. The detailed algorithm steps are as follows [17]:
Step 1: Initialization. First, randomly generate a shadow dataset, where the values of each feature are the shuffled values of that feature in the original dataset. Then, combine the original dataset and the shadow dataset to obtain an extended dataset.
Step 2: Train the model. Train a classification model on the extended dataset; in this paper, the LightGBM model is selected. Calculate the importance of each feature, represented by the $Z_{score}$ in the following formula:
$Z_{score} = \frac{E_f}{\sigma_f}$
where $E_f$ is the mean of the accuracy loss and $\sigma_f$ is the standard deviation of the accuracy loss.
Step 3: Confirm the features. Check whether each feature's importance exceeds the highest importance score among the shadow features. If so, confirm the feature as a relevant feature.
Step 4: Iterate. Repeat Steps 2 and 3 until all features are confirmed or the set maximum number of iterations is reached.
The above Boruta feature selection procedure is illustrated in Figure 1.
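To make the shadow-feature test concrete, the following Python sketch implements one simplified round of Steps 1–3 and a hit-count loop standing in for Step 4. It is a minimal illustration, not the authors' implementation: it uses raw LightGBM feature importances in place of the $Z_{score}$ statistic, and the names `X`, `y`, and the round/threshold values are placeholders.

```python
import numpy as np
import lightgbm as lgb

def boruta_iteration(X, y, rng):
    """One round of the shadow-feature test (Steps 1-3): returns a boolean
    mask of features whose importance beats the best shadow feature."""
    n_features = X.shape[1]
    # Step 1: build shadow features by shuffling each column independently,
    # then extend the dataset with them.
    X_shadow = np.apply_along_axis(rng.permutation, 0, X)
    X_ext = np.hstack([X, X_shadow])
    # Step 2: train a LightGBM classifier on the extended dataset and read
    # per-feature importances (a stand-in for the Z-score in the paper).
    model = lgb.LGBMClassifier(n_estimators=100, random_state=0)
    model.fit(X_ext, y)
    importances = model.feature_importances_
    # Step 3: a real feature counts as a "hit" only if it is more important
    # than the most important shadow feature.
    shadow_max = importances[n_features:].max()
    return importances[:n_features] > shadow_max

def boruta_select(X, y, n_rounds=20, min_hits=15, seed=0):
    """Step 4: repeat the test and keep features that win consistently."""
    rng = np.random.default_rng(seed)
    hits = np.zeros(X.shape[1], dtype=int)
    for _ in range(n_rounds):
        hits += boruta_iteration(X, y, rng)
    return hits >= min_hits   # boolean mask of confirmed features
```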

2.2. Light Gradient-Boosting Machine (LightGBM)

The light gradient-boosting machine is a gradient-boosting tree model introduced by Microsoft Research Asia in 2016. The origins of the LightGBM can be traced back to the 1990s, a period when gradient-boosting tree models started to garner interest among researchers. Gradient-boosting tree models are an iterative learning approach, progressively enhancing the model’s performance by incorporating a tree during each iteration.
The LightGBM represents a refinement of the gradient-boosting decision tree (GBDT) algorithm, employing weak classifiers such as decision trees to progressively refine the model. Notably, the LightGBM offers advantages in terms of effective training outcomes and a reduced risk of overfitting. In contrast to the GBDT, which necessitates multiple passes through the entire training dataset during each iteration, the LightGBM mitigates the need for loading the complete dataset into memory. This circumvents the limitations on training data size imposed by memory constraints. Moreover, the LightGBM’s approach addresses the time-consuming nature of repeatedly reading and writing training data by implementing specific strategies to optimize performance. The specific implementation methods include the following points [18,19]:
  • Histogram-based algorithm. This algorithm addresses eigenvalue segmentation with both memory and computational efficiency. By discretizing continuous eigenvalues into k integers and constructing a corresponding histogram, it avoids extensive data processing. Traversing the data once populates the histogram with relevant statistics, enabling an efficient search for the optimal segmentation point within the discrete representation. This significantly reduces both the memory footprint and the computational complexity compared to alternative methods.
  • Leaf-wise growth. Unlike traditional level-wise tree growth, the LightGBM grows trees leaf-wise. Instead of expanding all nodes at a given level, it continuously seeks the leaf with the biggest potential improvement (split gain), leading to a potentially lower error and higher accuracy at the same number of splits.
  • Gradient-based one-side sampling (GOSS). GOSS is a smart technique that speeds up learning in decision trees by focusing on informative samples. The key idea of GOSS is that samples with larger gradients (essentially, larger prediction errors) contribute more to information gain. GOSS ranks all samples based on their absolute gradient values, prioritizing those with large errors. To maintain data diversity, the algorithm randomly samples a smaller number of remaining (lower gradient) samples. Then, it adjusts the weights of these randomly selected samples slightly to emphasize their importance without significantly altering the dataset’s overall distribution.
  • Exclusive feature bundling (EFB). High-dimensional data tend to be sparse, and this sparsity makes it possible to reduce the number of features almost losslessly. Features that are bundled are usually mutually exclusive (i.e., they are never simultaneously nonzero, as with one-hot encoded features), so two such features can be merged without losing information. If two features are not strictly mutually exclusive (in some samples they are both non-zero), a metric called the collision ratio measures the extent to which they overlap; when this value is small, the two features can still be combined without affecting the final accuracy. By bundling such features, EFB reduces the effective number of features, which lowers the time complexity of building the histograms. A minimal training sketch showing how these strategies surface as LightGBM parameters follows this list.
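The sketch below shows how the four strategies above appear as ordinary LightGBM training parameters. It is a hedged illustration rather than the configuration used in the paper: the data are synthetic stand-ins and all parameter values are examples only.

```python
import numpy as np
import lightgbm as lgb

# Synthetic stand-in data: 900 samples, 3 selected features, 9 fault classes.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(900, 3))
y_train = rng.integers(0, 9, size=900)

train_set = lgb.Dataset(X_train, label=y_train)
params = {
    "objective": "multiclass",
    "num_class": 9,            # one class per fault mode (e.g. F0-F8)
    "max_bin": 255,            # histogram-based algorithm: number of bins
    "num_leaves": 31,          # leaf-wise growth: cap on leaves per tree
    "boosting_type": "goss",   # gradient-based one-side sampling
    "top_rate": 0.2,           # keep the 20% of samples with largest gradients
    "other_rate": 0.1,         # randomly sample 10% of the remaining samples
    "enable_bundle": True,     # exclusive feature bundling (on by default)
    "learning_rate": 0.1,
}
booster = lgb.train(params, train_set, num_boost_round=200)
```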

2.3. Bayesian Optimization

Bayesian optimization is a global optimization technique that leverages Bayesian statistical theory to efficiently search for global optima. In contrast to traditional optimization methods, Bayesian optimization for parameter tuning uses Gaussian processes, taking into account previous parameter information, continually updating prior information, effectively reducing the number of iterations in the tuning process, and demonstrating robust performance when dealing with non-convex problems.
The Bayesian optimization framework primarily comprises two fundamental components: a probabilistic surrogate model and an acquisition function [20]. The probabilistic surrogate model contains a prior probability model and an observation model. In a narrow sense, Bayesian optimization refers to sequential model-based optimization (SMBO) in which the surrogate model is a Gaussian process regression model. Gaussian process regression uses a Gaussian process model $F(x)$ to fit the target function $f(x)$. Initially, the predefined mean function $m(x)$ and covariance function $K(x, x')$ are established as the prior distribution of the Gaussian process. Next, the sampling points $x_1, x_2, \ldots, x_t$ are selected, yielding observed values of the target function $f(x_1), \ldots, f(x_t)$, which correspond to the random variables $F(x_1), \ldots, F(x_t)$ of the Gaussian process. The parameters of the mean and covariance functions are adjusted based on the observed values, thereby determining the final form of the Gaussian process and completing the fitting of the function $f(x)$.
Another important part of Bayesian optimization is the acquisition function. Since the surrogate model outputs the posterior distribution of the function $f$, we can use this posterior distribution $F(x) \mid F(x_{1:t}) = f(x_{1:t})$ to decide where the next sampling point should be located. The acquisition function takes the form $A(x, F(x) \mid F(x_{1:t}) = f(x_{1:t}))$; it scores each candidate sampling point $x$, with higher scores indicating points more deserving of being sampled.
Generally, the acquisition function needs to satisfy several criteria. Firstly, it should have smaller values at existing sampling points, as these points have already been explored. Secondly, it should have larger values at points with wider confidence intervals (higher variance), because these points carry greater uncertainty and are more worthy of exploration. For maximization (or minimization) problems, the acquisition function should have larger values at points with higher (or lower) posterior means, as the mean provides an estimate of the function value at that point, making these points more likely to be near an extremum. There are various choices of acquisition function; those commonly used are the probability of improvement (PI), the expected improvement (EI), and the upper confidence bound (UCB).
In summary, Bayesian optimization is an iterative process that primarily involves three steps:
Step 1: Select the next most promising evaluation point $x_t$ by maximizing the acquisition function.
Step 2: Evaluate the objective function $y_t = f(x_t) + \varepsilon_t$ at the chosen evaluation point $x_t$.
Step 3: Add the newly obtained input-observation pair $(x_t, y_t)$ to the historical observation set $D_{1:t-1}$, and update the probabilistic surrogate model in preparation for the next iteration.
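As a concrete illustration of this loop, the sketch below implements the three steps with a Gaussian-process surrogate from scikit-learn and an expected-improvement acquisition, maximized here by simple random candidate search. It is a minimal, assumed implementation for a minimization problem, not the specific optimizer used in the paper; the objective `f`, bounds, and candidate count are placeholders.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def expected_improvement(X_cand, gp, y_best):
    """EI acquisition: large where the surrogate predicts improvement or
    where its uncertainty (std) is still high (minimization convention)."""
    mu, sigma = gp.predict(X_cand, return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    z = (y_best - mu) / sigma
    return (y_best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

def bayes_opt(f, bounds, n_init=5, n_iter=25, seed=0):
    rng = np.random.default_rng(seed)
    dim = bounds.shape[0]
    # Initial random design and its (possibly noisy) observations.
    X = rng.uniform(bounds[:, 0], bounds[:, 1], size=(n_init, dim))
    y = np.array([f(x) for x in X])
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    for _ in range(n_iter):
        gp.fit(X, y)                                  # update the surrogate
        # Step 1: pick the candidate maximizing the acquisition function.
        cand = rng.uniform(bounds[:, 0], bounds[:, 1], size=(2048, dim))
        x_next = cand[np.argmax(expected_improvement(cand, gp, y.min()))]
        # Step 2: evaluate the objective at the chosen point.
        y_next = f(x_next)
        # Step 3: append the new observation pair and repeat.
        X = np.vstack([X, x_next])
        y = np.append(y, y_next)
    return X[np.argmin(y)], y.min()

# Example usage: minimize a 2-D quadratic over [-3, 3] x [-3, 3].
best_x, best_y = bayes_opt(lambda x: float((x ** 2).sum()),
                           bounds=np.array([[-3.0, 3.0], [-3.0, 3.0]]))
```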

3. Proposed Method

3.1. Obtain the Feature Vector

The time-domain statistical characteristics of a circuit signal provide essential insights into the behavior and properties of the signal. Understanding these features is crucial for signal processing, system identification, and fault diagnosis in circuits.
This article selects six typical time-domain statistical features for a time-domain signal vector $X = \{x_1, x_2, \ldots, x_N\}$: the standard deviation, kurtosis, skewness, entropy, waveform factor, and impulse indicator.
Standard deviation is used to measure the dispersion of the data, with a larger standard deviation indicating greater data spread, and is defined as Equation (2):
$\sigma = \sqrt{\frac{1}{N}\sum_{n=1}^{N}(x_n - \mu)^2}$
Kurtosis describes the steepness of the data distribution, where a higher kurtosis indicates relatively concentrated data and a lower kurtosis indicates a flatter distribution. It is defined as follows.
$KU = \frac{1}{N}\sum_{n=1}^{N}\frac{(x_n - \mu)^4}{\sigma^4} - 3$
Skewness measures the asymmetry of the data distribution, with positive skewness indicating right-skewed data and negative skewness indicating left-skewed data, and is defined as follows.
$SK = \frac{1}{N}\sum_{n=1}^{N}\frac{(x_n - \mu)^3}{\sigma^3}$
Entropy measures the complexity or uncertainty of a signal, where higher entropy values indicate greater signal complexity or uncertainty. Entropy is defined as follows.
$H = -\sum_{i} P(x = a_i)\log P(x = a_i)$
The waveform factor describes the degree of distortion of a signal waveform and is defined as follows.
$SH(X) = \frac{RMS(X)}{\frac{1}{N}\sum_{n=1}^{N}|x_n|}$
where RMS denotes the root mean square.
The impulse indicator describes the impulsive characteristics of a signal and is defined as follows.
$IM(X) = \frac{\max|X|}{\frac{1}{N}\sum_{n=1}^{N}|x_n|}$
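The six statistics above can be computed directly with NumPy; the sketch below is a minimal illustration. Since the paper does not state how $P(x = a_i)$ is estimated for the entropy, a histogram-based estimate is assumed here, and the bin count is an arbitrary choice.

```python
import numpy as np

def time_domain_features(x, n_bins=16):
    """Return the six time-domain statistics used as features:
    [std, kurtosis, skewness, entropy, waveform factor, impulse indicator]."""
    x = np.asarray(x, dtype=float)
    mu = x.mean()
    sigma = x.std()                                  # standard deviation
    ku = np.mean((x - mu) ** 4) / sigma ** 4 - 3     # kurtosis
    sk = np.mean((x - mu) ** 3) / sigma ** 3         # skewness
    # Entropy: estimate P(x = a_i) from a histogram (an assumption; the
    # paper does not specify the estimator).
    counts, _ = np.histogram(x, bins=n_bins)
    p = counts[counts > 0] / len(x)
    entropy = -np.sum(p * np.log(p))
    mean_abs = np.mean(np.abs(x))
    rms = np.sqrt(np.mean(x ** 2))
    sh = rms / mean_abs                              # waveform factor
    im = np.max(np.abs(x)) / mean_abs                # impulse indicator
    return np.array([sigma, ku, sk, entropy, sh, im])
```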
In addition to time-domain statistical features, we also selected the wavelet packet transform (WPT), a representative frequency-domain feature extraction method [21,22], to extract the frequency-domain features of analog circuit responses. WPT further decomposes the high-frequency components that the ordinary wavelet transform leaves unresolved. The general process is shown in Figure 2. After $n$ levels of decomposition, the original signal is split into $2^n$ sub-bands.
Assume that the initial signal is given by $d_0^{1} = f(n)$; after WPT it can be represented as follows:
$d_l^{j,2n} = \sum_k h_{k-2l} \, d_k^{j+1,n}$
$d_l^{j,2n+1} = \sum_k g_{k-2l} \, d_k^{j+1,n}$
where $d_l^{j,2n}$ and $d_l^{j,2n+1}$ are the node coefficients of node $(j, n)$ in layer $l$ of the wavelet packet decomposition under the high-pass and low-pass filters, respectively; $h_{k-2l}$ and $g_{k-2l}$ are the high-pass and low-pass filter coefficients; and $d_k^{j+1,n}$ is the node coefficient of node $(j+1, n)$ at layer $k$ of the wavelet packet.
The energy of the kth layer and jth band is defined as follows:
$E_k^j = \sum_{n=1}^{M} |d_k^{j,n}|^2, \quad j = 0, 1, 2, \ldots, 2^k - 1$
where M denotes the length of the jth band.
And the corresponding band spectrum coefficient is:
$r_j = \frac{E_k^j}{E}, \quad j = 0, 1, 2, \ldots, 2^k - 1$
where $E$ denotes the total energy of the $k$th layer. Thus, the sequence $r_j,\ j = 0, 1, 2, \ldots, 2^k - 1$, can be used as the frequency-domain feature of the signal.
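For a three-level decomposition, the band spectrum coefficients $r_j$ can be obtained with the PyWavelets package as sketched below. The wavelet basis ('db4') and boundary mode are assumptions, as the paper does not specify them.

```python
import numpy as np
import pywt

def wpt_band_ratios(x, wavelet="db4", level=3):
    """Three-level wavelet packet decomposition -> 2**level band energies,
    normalized by the total energy of that level (the r_j coefficients)."""
    wp = pywt.WaveletPacket(data=x, wavelet=wavelet, mode="symmetric",
                            maxlevel=level)
    # Leaf nodes of the chosen level, ordered by frequency band.
    nodes = wp.get_level(level, order="freq")
    energies = np.array([np.sum(np.square(n.data)) for n in nodes])
    return energies / energies.sum()   # band spectrum coefficients r_j
```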
In this paper, we combine the 6-dimensional statistical features of circuit responses with the 8-dimensional frequency features extracted by the wavelet packet transform to form a 14-dimensional feature vector. Then, we use the Boruta method to perform a feature screening on the 14-dimensional features to achieve a feature dimension reduction. After that, we use the reduced features as the input to train the LightGBM model, and optimize the model parameters with the Bayesian optimization method. Finally, we use the model to realize the fault diagnosis of the simulated circuits.

3.2. Steps of the Proposed Method

In this section, we outline the fundamental steps of the proposed method, as depicted in Figure 3; an end-to-end code sketch follows the steps.
Step 1: Obtain circuit response signals under various fault modes from the experimental circuit. The features of the signals are extracted to obtain high-dimensional feature vectors that simultaneously contain the time-domain statistical features and frequency-domain features of the signals.
Step 2: The Boruta feature selection algorithm is utilized on feature vectors with a high dimensionality to eliminate features that are weakly correlated or redundant.
Step 3: The refined characteristics are utilized as the input for the training of a LightGBM model. Bayesian optimization is utilized to optimize the hyperparameters of the LightGBM model with the aim of improving its classification performance.
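Putting the three steps together, the following sketch wires feature selection and classifier tuning into one pipeline. It relies on the third-party BorutaPy package and scikit-optimize's BayesSearchCV as stand-ins for the Boruta and Bayesian-optimization stages; these tool choices, the synthetic data, and the search ranges are assumptions rather than the authors' actual setup.

```python
import numpy as np
from lightgbm import LGBMClassifier
from boruta import BorutaPy            # assumed Boruta implementation
from skopt import BayesSearchCV        # assumed Bayesian-tuning tool
from skopt.space import Integer, Real

# Synthetic stand-ins for the 14-dimensional feature matrix and fault labels;
# labels depend on the first three columns so Boruta has signal to find.
rng = np.random.default_rng(0)
features = rng.normal(size=(900, 14))
labels = (features[:, 0] + features[:, 1] - features[:, 2] > 0).astype(int)

# Step 2: Boruta feature selection with a LightGBM base learner.
selector = BorutaPy(LGBMClassifier(n_estimators=100),
                    n_estimators="auto", max_iter=50, random_state=0)
selector.fit(features, labels)
X_sel = selector.transform(features)   # low-dimensional feature subset

# Step 3: train LightGBM, tuning hyperparameters by Bayesian optimization.
search = BayesSearchCV(
    LGBMClassifier(),
    {
        "n_estimators": Integer(50, 500),
        "learning_rate": Real(0.01, 0.3, prior="log-uniform"),
        "num_leaves": Integer(15, 63),
        "min_child_samples": Integer(5, 50),
    },
    n_iter=30, cv=5, random_state=0,
)
search.fit(X_sel, labels)
print(search.best_params_, search.best_score_)
```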

4. Analog Circuit Experiment

In this section, we apply the proposed method to two standard circuits to assess its efficacy and reliability. The circuits under consideration are the Sallen–Key band-pass filter circuit and the four opamp biquad high-pass filter circuit. Based on practical experience, circuits rarely have two or more components failing simultaneously; therefore, this paper only considers single-fault scenarios.

4.1. Sallen–Key Band-Pass Filter

The first experimental circuit is the Sallen–Key band-pass filter, shown in Figure 4. This filter configuration comprises five resistors, two capacitors, and a fundamental operational amplifier. Through a sensitivity analysis of the circuit, it is observed that the component values of C1, C2, R2, and R3 play a crucial role in determining the output characteristics of the circuit. Therefore, these components are chosen as potential faulty components. In this paper, component faults are defined as follows: if the actual parameter of a component deviates by 50% or more from its nominal value, the component is considered faulty. The nominal values of the potential faulty components and their fault modes for the Sallen–Key band-pass filter are shown in Table 1. The symbols “↑” and “↓” in the table indicate that the actual value of the component in that fault type exceeds or falls below its nominal value.
In the circuit simulation setting, according to practical experience, all resistors are set to a 5% tolerance and all capacitors are set to a 10% tolerance. The circuit’s excitation source is an alternating voltage signal with an amplitude of 5 V and a frequency of 1 kHz. The circuit’s sampling time is 1.2 ms. A Monte Carlo analysis is performed on the circuit 100 times under each fault mode, and the circuit’s time-domain output response is extracted.
Following the procedure above, the Boruta method was applied to the extracted features for feature selection; the selected features are the kurtosis, the skewness, and the eighth component of the wavelet packet decomposition. The visualization of the fault features is shown in Figure 5. It can be seen from the figure that the fault features are well separated.
Then, the filtered features were used as input to train a LightGBM model. Bayesian optimization was employed to optimize the model’s hyperparameters. The optimized hyperparameters are shown in Table 2. The classification results of the model on the data are represented by a confusion matrix, as shown in Figure 6. The accuracy of the diagnosis of each type of fault is above 94%, and the overall accuracy reaches 97.8%.
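For reference, a LightGBM classifier configured with the Table 2 hyperparameters would be instantiated roughly as below; this is a sketch, with all unlisted parameters left at their library defaults and the training data placeholders.

```python
from lightgbm import LGBMClassifier

# Hyperparameters from Table 2 (Sallen-Key band-pass filter); everything
# else is left at the LightGBM defaults.
model = LGBMClassifier(
    n_estimators=296,
    learning_rate=0.1794,
    min_child_samples=26,
    min_child_weight=0.009,
    min_split_gain=0.89,
    num_leaves=43,
)
# model.fit(X_selected, y) would then be trained on the Boruta-selected
# features (placeholder names here).
```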

4.2. Four Opamp Biquad High-Pass Filter

The subsequent experimental setup is the four opamp biquad high-pass filter, depicted in Figure 7. This filter comprises ten resistors, two capacitors, and four fundamental operational amplifiers. Through sensitivity analysis, it is observed that the values of R1, R2, R3, R4, C1, and C2 play a crucial role in determining the output characteristics of the circuit. Therefore, these components are chosen as potential faulty components. Similarly, if the actual parameters of an electronic component exceed or fall below its nominal value by 50% or more, then the component is considered to be a faulty component. The nominal values of the potential faulty components and their fault modes for the four opamp biquad high-pass filter are shown in Table 3.
Similarly, in the circuit simulation setting, all resistors are set to a 5% tolerance and all capacitors are set to a 10% tolerance. The circuit’s excitation source is an alternating voltage source signal with an amplitude of 5 V and a frequency of 12 kHz. The circuit’s sampling time is 0.5 ms. A Monte Carlo analysis is performed on the circuit 100 times under each fault mode, and the circuit’s time-domain output response is extracted.
As before, the Boruta method was applied to the extracted features for feature selection; the selected features are the kurtosis and the seventh and eighth components of the wavelet packet decomposition. The visualization of the fault features is shown in Figure 8. As can be seen from the figure, except for the F5 fault, which is more dispersed in the feature space, the remaining faults are relatively clearly separated in the feature space. To correctly classify the fault modes, the classifier still needs to be trained further.
Then, the filtered features were used as input to train a LightGBM model. Bayesian optimization was used to optimize the hyperparameters of the model. The optimized hyperparameters are shown in Table 4. The classification results of the model on the data are represented by a confusion matrix, as shown in Figure 9. As shown in the figure, the accuracy for most fault classes is high. Only F7 and F11 have lower diagnosis rates, but they still reach accuracies of 93% and 91%, respectively. The overall accuracy reaches 97.31%. Overall, the results are satisfactory.

4.3. Comparison Experiments

To assess the advantages of the proposed scheme, comparative experiments were set up. On the one hand, we ran the diagnosis using only the time-domain statistical features and using only the frequency-domain features. On the other hand, we replaced the LightGBM with other common classifiers. The results are shown in Table 5, Table 6 and Table 7.
The results presented in Table 5 and Table 6 indicate a moderate enhancement in the classification accuracy of fault diagnosis when incorporating time-domain and frequency-domain characteristics. Specifically, there is an enhancement in diagnostic precision ranging from 0.5% to 0.6% for the Sallen–Key band-pass filter and from 2.3% to 2.9% for the four opamp biquad high-pass filter. This comparative analysis suggests that Boruta effectively identifies optimal features across both time and frequency domains. Furthermore, the outcomes detailed in Table 7 demonstrate that the LightGBM surpasses the SVM and Random Forest as a classifier for the aforementioned circuits.

5. Conclusions

This study introduces a novel approach for diagnosing soft faults in analog circuits by utilizing Boruta feature selection and a LightGBM model. The methodology integrates the time-domain statistical characteristics and frequency-domain features of circuit signals, employing the Boruta technique to identify the most effective low-dimensional features. Subsequently, a LightGBM model with hyperparameters optimized through Bayesian methods is employed to construct a soft fault diagnosis model for analog circuits using training data. The evaluation of fault diagnosis effectiveness is performed on a test dataset, yielding diagnostic accuracies of 97.8% and 97.3% for the Sallen–Key band-pass filter and four opamp biquad high-pass filter experimental circuits, respectively. A comparative analysis with feature datasets focusing solely on time-domain statistical features or frequency-domain features demonstrates the superior feature selection capabilities of the Boruta algorithm. Furthermore, comparative experiments reveal that the LightGBM model optimized through Bayesian techniques exhibits superior classification performance compared to SVM and random forest algorithms. The findings of the study suggest the potential utility of the fault diagnosis technique in the realm of soft fault diagnosis within analog circuits or within a wider scope of fault-tolerant systems. This offers avenues for future research endeavors. Subsequent studies could expand upon this groundwork to assess the viability of implementing this approach in various circuit arrangements or practical situations.

Author Contributions

Conceptualization, H.C. and C.H.; methodology, H.C.; software, H.C.; validation, H.C.; investigation, K.M.; project administration, C.H.; funding acquisition, B.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
LightGBM: Light gradient-boosting machine
GBDT: Gradient-boosting decision tree
DS: Dempster–Shafer
DBN: Deep belief network
PCA: Principal component analysis
FRFT: Fractional Fourier transform
SVM: Support vector machine
SVDD: Support vector data description
GOSS: Gradient-based one-side sampling
EFB: Exclusive feature bundling
SMBO: Sequential model-based optimization
PI: Probability of improvement
EI: Expected improvement
UCB: Upper confidence bound
WPT: Wavelet packet transform
Symbols
The following Symbols are used in this manuscript:
$Z_{score}$: Score that evaluates the importance of each feature
$E_f$: Mean of accuracy loss
$\sigma_f$: Standard deviation of accuracy loss
$\sigma$: Standard deviation
$\mu$: Mean of the data
KU: Kurtosis
SK: Skewness
H: Entropy
SH: Waveform factor
RMS: Root mean square
IM: Impulse indicator

References

  1. Zhang, C.L.; Ye, L.L.; Wu, J.; Zhang, B.; Yao, N.G.; Wang, Y.Z. A Novel Analog Circuit Fault Diagnosis Approach. Recent Adv. Electr. Electron. Eng. 2021, 14, 535–546. [Google Scholar] [CrossRef]
  2. Liu, X.D.; Yang, H.C.; Gao, T.Y.; Yang, J.L. A Novel Incipient Fault Diagnosis Method for Analogue Circuits Based on an MLDLCN. Circuits Syst. Signal Process. 2024, 43, 684–710. [Google Scholar] [CrossRef]
  3. Wang, M.; Zhao, J.; Wu, Z.F.; Yang, H.W. Transistor Open-Circuit Fault Diagnosis of Three Phase Voltage-Source Inverter Fed Induction Motor Based on Information Fusion. In Proceedings of the 12th IEEE Conference on Industrial Electronics and Applications (ICIEA), Siem Reap, Cambodia, 18–20 June 2017; pp. 1591–1594. [Google Scholar]
  4. Chen, M.Y.; He, Y.G. Multiple Open-Circuit Fault Diagnosis Method in NPC Rectifiers Using Fault Injection Strategy. IEEE Trans. Power Electron. 2022, 37, 8554–8571. [Google Scholar] [CrossRef]
  5. Shi, T.C.; He, Y.G.; Wang, T.; Li, B. Open Switch Fault Diagnosis Method for PWM Voltage Source Rectifier Based on Deep Learning Approach. IEEE Access 2019, 7, 66595–66608. [Google Scholar] [CrossRef]
  6. Catelani, M.; Fort, A. Soft fault detection and isolation in analog circuits: Some results and a comparison between a fuzzy approach and radial basis function networks. IEEE Trans. Instrum. Meas. 2002, 51, 196–202. [Google Scholar] [CrossRef]
  7. Aminian, F.; Aminian, M.; Collins, H.W. Analog fault diagnosis of actual circuits using neural networks. IEEE Trans. Instrum. Meas. 2002, 51, 544–550. [Google Scholar] [CrossRef]
  8. Siddique, M.F.; Ahmad, Z.; Ullah, N.; Kim, J. A Hybrid Deep Learning Approach: Integrating Short-Time Fourier Transform and Continuous Wavelet Transform for Improved Pipeline Leak Detection. Sensors 2023, 23, 8079. [Google Scholar] [CrossRef]
  9. Xiao, Y.Q.; He, Y.G. A novel approach for analog fault diagnosis based on neural networks and improved kernel PCA. Neurocomputing 2011, 74, 1102–1115. [Google Scholar] [CrossRef]
  10. Ji, L.P.; Fu, C.Q.; Sun, W.Q. Soft Fault Diagnosis of Analog Circuits Based on a ResNet with Circuit Spectrum Map. IEEE Trans. Circuits Syst. I-Regul. Pap. 2021, 68, 2841–2849. [Google Scholar] [CrossRef]
  11. Song, P.; He, Y.Z.; Cui, W.J. Statistical property feature extraction based on FRFT for fault diagnosis of analog circuits. Analog Integr. Circuits Signal Process. 2016, 87, 427–436. [Google Scholar] [CrossRef]
  12. Sun, P.; Yang, Z.M.; Jiang, Y.M.; Jia, S.H.; Peng, X.Y. A Fault Diagnosis Method of Modular Analog Circuit Based on SVDD and D-S Evidence Theory. Sensors 2021, 21, 6889. [Google Scholar] [CrossRef]
  13. Zhang, C.L.; He, Y.G.; Yuan, L.F.; Xiang, S. Analog Circuit Incipient Fault Diagnosis Method Using DBN Based Features Extraction. IEEE Access 2018, 6, 23053–23064. [Google Scholar] [CrossRef]
  14. Yang, H.H.; Meng, C.; Wang, C. Data-Driven Feature Extraction for Analog Circuit Fault Diagnosis Using 1-D Convolutional Neural Network. IEEE Access 2020, 8, 18305–18315. [Google Scholar] [CrossRef]
  15. Yuan, Z.J.; He, Y.G.; Yuan, L.F.; Chen, P.; Cheng, Z. An efficient feature extraction approach based on manifold learning for analogue circuits fault diagnosis. Analog Integr. Circuits Signal Process. 2020, 102, 237–252. [Google Scholar] [CrossRef]
  16. Farhana, N.; Firdaus, A.; Darmawan, M.F.; Razak, M.F.A. Evaluation of Boruta algorithm in DDoS detection. Egypt. Inform. J. 2023, 24, 27–42. [Google Scholar] [CrossRef]
  17. Muzoglu, N.; Adigüzel, E.; Akbacak, E.; Karaslan, M.K. Detection of Damaged Structures From Satellite Imagery Processed by Autoencoder With Boruta Feature Selection Method. Electrica 2023, 23, 397–405. [Google Scholar] [CrossRef]
  18. Bentéjac, C.; Csörgo, A.; Martínez-Muñoz, G. A comparative analysis of gradient boosting algorithms. Artif. Intell. Rev. 2021, 54, 1937–1967. [Google Scholar] [CrossRef]
  19. Tang, M.Z.; Meng, C.H.; Wu, H.W.; Zhu, H.Q.; Yi, J.B.; Tang, J.; Wang, Y.F. Fault Detection for Wind Turbine Blade Bolts Based on GSG Combined with CS-LightGBM. Sensors 2022, 22, 6763. [Google Scholar] [CrossRef] [PubMed]
  20. Wang, X.L.; Jin, Y.C.; Schmitt, S.; Olhofer, M. Recent Advances in Bayesian Optimization. ACM Comput. Surv. 2023, 55, 287. [Google Scholar] [CrossRef]
  21. Zhang, C.L.; He, Y.G.; Yang, T.; Zhang, B.; Wu, J. An Analog Circuit Fault Diagnosis Approach Based on Improved Wavelet Transform and MKELM. Circuits Syst. Signal Process. 2022, 41, 1255–1286. [Google Scholar] [CrossRef]
  22. Yang, Y.Y.; Wang, L.D.; Nie, X.B.; Wang, Y. Incipient fault diagnosis of analog circuits based on wavelet transform and improved deep convolutional neural network. IEICE Electron. Express 2021, 18, 20210174. [Google Scholar] [CrossRef]
Figure 1. Boruta algorithm procedure.
Figure 2. Three-level wavelet packet decomposition.
Figure 3. Flowchart of the proposed method.
Figure 4. Sallen–Key band-pass filter.
Figure 5. Fault feature visualization of the Sallen–Key band-pass filter.
Figure 6. Confusion matrix of the diagnostic results of the proposed method in a Sallen–Key band-pass filter.
Figure 7. Four opamp biquad high-pass filter.
Figure 8. Fault feature visualization of the four opamp biquad high-pass filter.
Figure 9. Confusion matrix of the diagnostic results of the proposed method in a four opamp biquad high-pass filter.
Table 1. Nominal and fault values for a Sallen–Key band-pass filter.
Fault Tag | Fault Class | Nominal Value | Fault Value
F0 | \ | \ | \
F1 | C1 ↓ | 5 nF | 2.5 nF
F2 | C1 ↑ | 5 nF | 7.5 nF
F3 | R3 ↑ | 2 kΩ | 3 kΩ
F4 | R3 ↓ | 2 kΩ | 1 kΩ
F5 | C2 ↓ | 5 nF | 2.5 nF
F6 | C2 ↑ | 5 nF | 7.5 nF
F7 | R2 ↑ | 3 kΩ | 4.5 kΩ
F8 | R2 ↓ | 3 kΩ | 1.5 kΩ
Table 2. Bayesian optimization results of Sallen–Key band-pass filter.
Parameter | Value
n_estimators | 296
learning_rate | 0.1794
min_child_samples | 26
min_child_weight | 0.009
min_split_gain | 0.89
num_leaves | 43
Table 3. Nominal and fault values for the four opamp biquad high-pass filter.
Fault Tag | Fault Class | Nominal Value | Fault Value
F0 | \ | \ | \
F1 | R1 ↑ | 6.2 kΩ | 9.3 kΩ
F2 | R1 ↓ | 6.2 kΩ | 3.1 kΩ
F3 | R2 ↑ | 6.2 kΩ | 9.3 kΩ
F4 | R2 ↓ | 6.2 kΩ | 3.1 kΩ
F5 | R3 ↑ | 6.2 kΩ | 9.3 kΩ
F6 | R3 ↓ | 6.2 kΩ | 3.1 kΩ
F7 | R4 ↑ | 1.6 kΩ | 2.4 kΩ
F8 | R4 ↓ | 1.6 kΩ | 0.8 kΩ
F9 | C1 ↑ | 5 nF | 7.5 nF
F10 | C1 ↓ | 5 nF | 2.5 nF
F11 | C2 ↑ | 5 nF | 7.5 nF
F12 | C2 ↓ | 5 nF | 2.5 nF
Table 4. Bayesian optimization results of four opamp biquad high-pass filter.
Parameter | Value
n_estimators | 387
learning_rate | 0.2708
min_child_samples | 28
min_child_weight | 0.007
min_split_gain | 0.6355
num_leaves | 59
Table 5. Performance comparison of different domains in the Sallen–Key band-pass filter.
Fault ID | Only the Time Domain | Only the Frequency Domain | Both
F0 | 1 | 1 | 1
F1 | 1 | 1 | 1
F2 | 1 | 1 | 1
F3 | 0.97 | 0.97 | 0.97
F4 | 0.96 | 0.97 | 0.97
F5 | 0.94 | 0.94 | 0.94
F6 | 0.94 | 0.94 | 0.94
F7 | 0.94 | 0.94 | 0.98
F8 | 1 | 1 | 1
Average accuracy | 97.2% | 97.3% | 97.8%
Table 6. Performance comparison of different domains in the four opamp biquad high-pass filter.
Fault ID | Only the Time Domain | Only the Frequency Domain | Both
F0 | 1 | 1 | 1
F1 | 0.95 | 1 | 1
F2 | 0.96 | 0.97 | 0.97
F3 | 1 | 1 | 1
F4 | 0.94 | 0.94 | 1
F5 | 0.90 | 0.90 | 0.95
F6 | 0.95 | 0.96 | 0.98
F7 | 0.94 | 0.93 | 0.93
F8 | 0.95 | 0.95 | 1
F9 | 0.96 | 0.91 | 0.94
F10 | 0.88 | 0.90 | 0.96
F11 | 0.90 | 0.88 | 0.91
F12 | 0.84 | 1 | 1
Average accuracy | 94.4% | 95% | 97.3%
Table 7. Performance comparison of the classification methods.
Experiment Circuit | LightGBM | SVM | Random Forest
Sallen–Key band-pass filter | 97.8% | 96.2% | 96.6%
Four opamp biquad high-pass filter | 97.3% | 92.2% | 94.0%
