A Novel Analog Circuit Soft Fault Diagnosis Method Based on Convolutional Neural Network and Backward Difference

This paper develops a novel soft fault diagnosis approach for analog circuits. The proposed method employs the backward difference strategy to process the data, and a novel variant of the convolutional neural network, i.e., the convolutional neural network with global average pooling (CNN-GAP), is used for feature extraction and fault classification. Specifically, the measured raw domain response signals are first processed by the backward difference strategy to generate the first-order and second-order backward difference sequences, which capture the signal variation and the rate of variation. Then, based on the one-dimensional convolutional neural network, the CNN-GAP is developed by introducing the global average pooling technique. Since global average pooling calculates the mean value of each input vector, the designed CNN-GAP can handle input signals of different lengths and can be applied to diagnose different circuits. Additionally, the first-order and second-order backward difference sequences, along with the raw domain response signals, are directly fed into the CNN-GAP, in which the convolutional layers automatically extract and fuse multi-scale features. Finally, fault classification is performed by the fully connected layer of the CNN-GAP. The effectiveness of our proposal is verified on two benchmark circuits under symmetric and asymmetric fault conditions. Experimental results show that the proposed method outperforms the existing methods in terms of diagnosis accuracy and reliability.


Introduction
With the wide application of electronic systems in aerospace, aircraft, robotics, and other fields, improving the stability, security, and maintainability of electronic systems has become a fundamental issue in the circuit field [1]. Analog circuits are an essential part of electronic systems: although they make up only 20% of electronic systems, they cause 80% of electronic system failures [2,3]. Therefore, developing an effective analog circuit fault diagnosis method is of great significance for maintaining the reliable operation of electronic systems.
Generally, analog circuit faults can be roughly divided into two categories, i.e., hard faults and soft faults [4]. Hard faults mainly manifest as a change in the circuit topology or a component value that greatly exceeds its nominal value, resulting in complete circuit failure, while soft faults are caused by the deviation of a component value from its nominal value [5]. Although soft faults do not cause the circuit to fail completely, they ultimately degrade the circuit performance. Moreover, due to the tolerance effect of the components and the limited number of testable nodes, the diagnosis of soft faults in analog circuits is very difficult and more challenging than that of hard faults [6,7]. Thus, this paper mainly focuses on the soft fault diagnosis of analog circuits.
In the past six decades, a large number of analog circuit fault diagnosis studies have been carried out. These approaches can be roughly classified into model-based methods and [...] the second-order backward difference sequences) are directly fed into the CNN-GAP, where the convolution layers automatically extract features from the multi-scale inputs and adaptively fuse the mined multi-scale features. Finally, the fully connected layer in the CNN-GAP outputs the fault diagnosis results. The performance of the proposed method is evaluated on two benchmark circuits, i.e., the Sallen-Key band-pass filter (SKBPF) circuit and the four-opamp biquad high-pass filter (FOBHPF) circuit. Experimental results indicate that our proposal achieves higher diagnosis accuracy than several existing methods.
The main contributions of this paper are summarized as follows: (1) A feature extraction method, i.e., backward difference, is introduced for preprocessing the raw signals of analog circuits. The purpose of introducing this strategy is to extract the signal variation and the rate of variation, which may be more discriminative than the features contained in the original signal sequences. (2) By integrating the GAP technique, the designed CNN-GAP can handle input sequences of different lengths, which makes it more practical and promising for fault diagnosis problems across different circuits. (3) Multi-scale signals are directly fed into the CNN-GAP network, which automatically extracts the circuit fault features and adaptively fuses the learned features without any manual operations. In addition, experimental results demonstrate that the presented solution is more effective for analog circuit fault diagnosis.
The rest of this paper is organized as follows: Section 2 introduces the basic components of the 1DCNN networks. Section 3 details our motivations and the architecture of the proposed method. In Section 4, fault diagnosis experiments are conducted based on two typical benchmark circuits to demonstrate the effectiveness of our proposal. Finally, conclusions and future work are presented in Section 5.

The Basic Architecture of 1DCNN
This section provides a brief introduction of the main components of the 1DCNN network.

Convolutional Layers
The convolutional layers are the key components of the 1DCNN and are responsible for the main feature extraction task of the network [29]. A convolutional layer performs the convolution operation on the input feature maps through a group of convolution kernels [30], whose weights do not change during a convolution pass, i.e., weight sharing. The convolution operation of the r-th convolutional layer can be expressed as follows:
$$x_j^{r} = \sum_{i=1}^{N(r)} x_i^{r-1} \otimes k_{ij}^{r} + b_j^{r}, \quad i = 1, 2, \ldots, N(r), \quad j = 1, 2, \ldots, M(r)$$
where $x_i^{r-1}$ and $x_j^{r}$ are the i-th input and the j-th output feature map, respectively, and $b_j^{r}$ represents the bias of the j-th convolution kernel. $k_{ij}^{r}$ denotes the j-th convolution kernel applied to the i-th input feature map. $N(r)$ and $M(r)$ are the numbers of channels of the input and output feature maps in the r-th convolutional layer, and $\otimes$ denotes the convolution operation.
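To make the weight-sharing idea concrete, the following minimal sketch evaluates the convolution of one layer in plain Python. It is an illustration only: the function name `conv1d_layer`, the scalar bias per output channel, the "valid" (no padding) output length, and the toy numbers are assumptions, and the sliding product is implemented as cross-correlation, as is common in DL libraries.

```python
def conv1d_layer(inputs, kernels, biases):
    """Valid-mode 1D convolution layer (implemented as cross-correlation).

    inputs:  N input channels, each a list of equal length
    kernels: kernels[i][j] links input channel i to output channel j
    biases:  one scalar bias per output channel (assumption)
    """
    n_in, n_out = len(inputs), len(biases)
    k = len(kernels[0][0])
    out_len = len(inputs[0]) - k + 1
    outputs = []
    for j in range(n_out):
        channel = []
        for t in range(out_len):
            # The same kernel weights are reused at every position t (weight sharing).
            acc = biases[j]
            for i in range(n_in):
                for m in range(k):
                    acc += inputs[i][t + m] * kernels[i][j][m]
            channel.append(acc)
        outputs.append(channel)
    return outputs


# Toy example: N = 2 input channels, M = 1 output channel, kernel size 2.
x = [[1.0, 2.0, 4.0, 7.0], [0.0, 1.0, 0.0, 1.0]]
k = [[[1.0, -1.0]], [[0.5, 0.5]]]
b = [0.1]
y = conv1d_layer(x, k, b)
print(y)
```

Each output channel sums the contributions of all input channels, which matches the summation over i in the layer equation.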

Batch Normalization and Activation Function
In the network training process, the distribution of features continuously changes, which increases the difficulty of training. Therefore, batch normalization (BN), a feature normalization technique, is often used to reduce the internal covariate shift [31]. The normalization process of the r-th batch normalization layer can be expressed as follows:
$$y^{r} = \gamma \, \frac{x^{r} - \mu_{Batch}}{\sqrt{\sigma_{Batch}^{2} + \varepsilon}} + \beta$$
where $x^{r}$ and $y^{r}$ are the input feature map and the normalized feature map, $\mu_{Batch}$ and $\sigma_{Batch}^{2}$ represent the expectation and the variance of a batch, respectively, $\varepsilon$ is a constant whose value is close to 0, and $\gamma$ and $\beta$ are the learnable scale and shift parameters.
Activation functions are often introduced into DL networks because of their ability to perform nonlinear transformations [32]. The rectified linear unit (ReLU) is a typical activation function. Many classic networks adopt it because it is easy to calculate and can prevent gradient vanishing. The ReLU function can be expressed as follows:
$$y = \max(0, x)$$
where x and y are the input and output data.
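As a small numerical illustration of these two operations, the sketch below normalizes one feature across a batch and then applies ReLU. The helper names, the default values of `gamma`, `beta`, and `eps`, and the toy batch are assumptions for illustration; in a trained network, the scale and shift would be learned.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature over a batch, then scale and shift it."""
    mu = sum(batch) / len(batch)
    var = sum((v - mu) ** 2 for v in batch) / len(batch)  # biased batch variance
    return [gamma * (v - mu) / math.sqrt(var + eps) + beta for v in batch]


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


features = [2.0, 4.0, 6.0, 8.0]      # one feature observed in a batch of four samples
normalized = batch_norm(features)    # approximately zero mean and unit variance
activated = relu(normalized)         # negative entries are clipped to zero
print(normalized, activated)
```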

Pooling Layer
The pooling layer downsamples the input feature map, extracting essential features while reducing the dimensionality. Average pooling and maximum pooling are the most frequently used pooling functions. This paper uses the maximum pooling function, and the calculation of the k-th feature map by the r-th pooling layer is as follows:
$$x_k^{r} = \max_{(m, n) \in R_{p \times t}} x_k^{r-1}(m, n)$$
where p and t represent the length and width of the pooling kernel, respectively, and $R_{p \times t}$ denotes the $p \times t$ region covered by the pooling kernel. After the maximum pooling operation, the maximum value in each $p \times t$ region is retained in the output feature map and is regarded as the new feature corresponding to that region.
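For a one-dimensional feature map, the pooling region reduces to a window along the sequence. The following sketch illustrates maximum pooling in that setting; the function name `max_pool1d` and the default of non-overlapping windows (stride equal to the window size) are assumptions, since the stride is not stated here.

```python
def max_pool1d(feature_map, size, stride=None):
    """Keep the maximum of each window of length `size` along the sequence."""
    stride = size if stride is None else stride  # non-overlapping windows by default
    return [max(feature_map[s:s + size])
            for s in range(0, len(feature_map) - size + 1, stride)]


pooled = max_pool1d([1.0, 3.0, 2.0, 5.0, 4.0, 0.0], size=2)
print(pooled)  # [3.0, 5.0, 4.0]
```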

Fully Connected Layer
The fully connected (FC) layer is usually used as the output layer of the 1DCNN. After the multi-layer convolution and pooling operations, the obtained feature maps are generally flattened into a one-dimensional vector and then fed into the FC layer. The r-th FC layer is calculated as:
$$x^{r} = f(W^{r} x^{r-1} + b^{r}) \qquad (6)$$
where $x^{r-1}$ and $x^{r}$ are the input and the output feature maps, $W^{r}$ denotes the weight of the FC layer, $b^{r}$ represents the bias, and $f(\cdot)$ is the activation function.

The Proposed Method
This section details the motivations for developing the proposed method. In addition, the architecture of the CNN-GAP and the procedure of the proposed method for analog circuit fault diagnosis are elaborated.

Backward Difference Preprocessing Method
Backward difference is an operation in which the previous element is subtracted from each element of the sequence (except the first element). By applying the backward difference strategy to the circuit signal, the generated first-order backward difference (FOBD) and second-order backward difference (SOBD) sequences capture the signal variation and the rate of signal variation, respectively. Note that the SOBD sequence is calculated by applying another backward difference operation to the FOBD sequence. Figure 1 shows the original voltage response signal sequence of the FOBHPF circuit under a single pulse input, together with its FOBD and SOBD sequences, where Original, Diff1, and Diff2 denote the original, FOBD, and SOBD sequences, respectively.
As shown in Figure 1, these three kinds of sequences of the FOBHPF circuit show significant differences in the first 20 µs, and the peakedness of the FOBD sequence is larger than that of the original voltage sequence, while the SOBD sequence has the largest peakedness. The position of the peak and the peakedness of the FOBD and SOBD sequences contain circuit fault information, which may be more discriminative than the fault information implied in the original signal. As mentioned above, in complex circuit systems or incipient circuit fault diagnosis cases, it is hard to accurately classify different fault modes because their signal responses are similar to each other. Since the generated FOBD and SOBD sequences have the potential to highlight the fault information better than the original signal, this paper employs them for fault diagnosis along with the original signal. Assume that $X = \{x_1, \ldots, x_t, \ldots, x_T\}$ is the raw domain response signal sequence of the circuit, where T denotes the length of the sequence and $x_t$ indicates the t-th element of the output sequence. The FOBD and SOBD sequences can be calculated as follows:
$$\Delta^{1} x_t = x_t - x_{t-1}$$
$$\Delta^{2} x_t = \Delta^{1} x_t - \Delta^{1} x_{t-1} = x_t - 2 x_{t-1} + x_{t-2}$$
where $\Delta^{1} X = \{\Delta^{1} x_t\}$ and $\Delta^{2} X = \{\Delta^{2} x_t\}$ represent the FOBD and SOBD sequences, respectively.
Then, these three sequences (the original, FOBD, and SOBD sequences) are assembled into a matrix $\mathbf{X} \in \mathbb{R}^{3 \times T}$, which is used as the input of the DL network. $\mathbf{X}$ can be expressed as follows:
$$\mathbf{X} = \begin{bmatrix} X \\ \Delta^{1} X \\ \Delta^{2} X \end{bmatrix}$$
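The preprocessing described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, and the zero left-padding that brings the FOBD and SOBD rows to length T is an assumption, since the text does not state how the shorter difference sequences are aligned with the original one.

```python
def backward_difference(seq):
    """Return x_t - x_(t-1) for every element except the first (one element shorter)."""
    return [seq[t] - seq[t - 1] for t in range(1, len(seq))]


def left_pad(row, length):
    """Zero-pad a row on the left up to `length` (padding scheme is an assumption)."""
    return [0.0] * (length - len(row)) + list(row)


def build_multiscale_input(signal):
    """Stack the original, FOBD, and SOBD sequences into a 3 x T list of rows."""
    fobd = backward_difference(signal)   # signal variation, length T - 1
    sobd = backward_difference(fobd)     # rate of variation, length T - 2
    t_len = len(signal)
    return [list(signal), left_pad(fobd, t_len), left_pad(sobd, t_len)]


x = [0.0, 1.0, 4.0, 9.0, 16.0]           # toy response signal, T = 5
X = build_multiscale_input(x)
for row in X:
    print(row)
```

For this toy signal the FOBD row grows linearly and the SOBD row is constant, which shows how the two differences expose the variation and the rate of variation of the original sequence.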

Convolutional Neural Network with Global Average Pooling (CNN-GAP)
As mentioned above, the length of the output signal sequence often differs from circuit to circuit. This poses a great challenge to the application of conventional DL-based methods, since their inputs usually have a fixed size. Yang et al. [4] developed three different 1DCNN models for fault diagnosis of the SKBPF, FOBHPF, and leapfrog filter circuits. The generalization ability of their method is therefore limited, as a new 1DCNN model must be designed whenever the approach is applied to a new circuit. In order to make the sequence lengths obtained from different circuits the same, Zhao et al. [25] adopted a smaller sampling interval in the SKBPF circuit and a larger sampling interval in the FOBHPF circuit. However, a larger sampling interval may cause critical circuit fault information to be overlooked, and a smaller sampling interval results in higher computational cost. In order to better solve this problem, this study introduces global average pooling (GAP) into the 1DCNN network. As shown in Figure 2, the GAP strategy calculates the average value of each input feature map; thus, sequences of different lengths entering the GAP layer are each reduced to a single numerical value [33]. In addition, the GAP layer has no trainable parameters, which helps to avoid network over-fitting [34]. Assume that the input of the GAP layer is $X \in \mathbb{R}^{C \times 1 \times N}$, where C and N indicate the number of input channels and the length of the feature map, respectively. In the GAP layer, the number of output nodes is equal to the number of input feature maps, and the generated feature map X is a 1D vector (with a size of C × 1 × 1).
Symmetry 2021, 13, x FOR PEER REVIEW 6 of 16
In addition, the designed CNN-GAP uses the multi-scale sequences obtained by the backward difference strategy as its inputs. Multi-scale features are automatically extracted from the inputs and adaptively fused during the convolution operations. Through multiple layers of convolution, the CNN-GAP can learn the important circuit fault representations and encode high-level fault feature maps. These feature maps are then taken as the input of the GAP layer, which maps them into a more discriminative feature vector. Finally, fault classification is performed by the fully connected layer based on the obtained feature vector.
In fact, most existing DL-based fault diagnosis methods directly use the raw sequences as inputs without applying any feature extraction method, and the main reasons are summarized as follows: (1) Traditional feature extraction methods, such as the wavelet transform and the Fourier transform, inevitably discard some fault information when extracting features. The fault diagnosis performance of a DL network that takes features extracted by traditional methods as input can even be inferior to that of a network that takes the raw signals as input. (2) Most traditional feature extraction methods require manual extraction and selection of discriminative features, which is a time-consuming and laborious task that also requires explicit prior and specialized knowledge. In contrast, the proposed method is an attempt to combine the DL method with a feature extraction method. Since the original signal sequence and the backward difference sequences are both fed into the CNN-GAP network, no fault information is discarded, and the network can extract more features. In addition, feature extraction and fusion, as well as fault classification, are conducted by the CNN-GAP without any manual operations. Thus, the proposed method effectively overcomes the above two issues, and its effectiveness is verified in Section 4.
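The length-independence of the GAP layer can be checked with a short sketch. It is illustrative only: the feature maps, the weights of the output layer, and the number of fault modes are hypothetical values, and the output layer is written as a plain linear map without an activation.

```python
def global_average_pooling(feature_maps):
    """Map C feature maps of any length N to a vector of C channel means."""
    return [sum(fm) / len(fm) for fm in feature_maps]


def fully_connected(vector, weights, biases):
    """Linear output layer: one score per fault mode (no activation applied here)."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, biases)]


# Two sets of feature maps with C = 2 channels but different lengths N.
short_maps = [[1.0, 3.0], [2.0, 2.0]]                     # N = 2
long_maps = [[1.0, 2.0, 3.0, 2.0], [0.0, 4.0, 2.0, 2.0]]  # N = 4

# Hypothetical output layer for three fault modes; its size depends only on C.
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]
biases = [0.0, 0.0, 0.5]

v_short = global_average_pooling(short_maps)
v_long = global_average_pooling(long_maps)
scores_short = fully_connected(v_short, weights, biases)
scores_long = fully_connected(v_long, weights, biases)
print(v_short, v_long, scores_short, scores_long)
```

Because the GAP output always has C entries, the same fully connected layer can be reused for circuits whose response sequences have different lengths.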

General Procedure of the Proposed Method for Fault Diagnosis
This paper proposes to integrate the backward difference strategy with a variant of CNN, i.e., CNN-GAP, for fault diagnosis of analog circuits. The procedure of the proposed method is shown in Figure 3. First, the backward difference strategy is applied to the measured domain response signals to obtain multi-scale signals. Then, the CNN-GAP network is constructed by introducing the GAP technique into the 1DCNN, so that it is able to process circuit signals of different lengths. Additionally, the multi-scale signals are directly input to the CNN-GAP network, and the convolution layers automatically extract features from the multi-channel inputs. Furthermore, the feature maps generated by the last convolution layer are fed into the GAP layer, which calculates a mean value for each channel of the feature map. Finally, the fully connected layer outputs the final fault diagnosis results.
It is worth mentioning that a fully connected layer follows the GAP layer in the designed CNN-GAP. In some existing methods in other fields [35,36], the GAP layer is used to replace the fully connected layers entirely. In such a case, the number of convolution kernels in the last convolutional layer must equal the number of classes. In circuit fault diagnosis, however, the number of fault modes varies from circuit to circuit: some circuits have only a few fault modes, while others have dozens. Considering that the feature extraction capability of the network may be insufficient if the number of convolution kernels is too small, the designed network uses a fixed number of convolution kernels, which gives the network stable performance across different circuit fault diagnosis tasks. Therefore, it is necessary to add a fully connected layer as the output layer.

Case Studies Using the Proposed Method
In this section, two representative circuits, i.e., the SKBPF and FOBHPF circuits, are employed to verify the validity of the proposed method.

Validation Setup and the Structure of the CNN-GAP
In this study, both benchmark circuits are constructed in OrCAD 16.6 (Cadence, San Jose, CA, USA), and the designed DL networks are implemented with the PyTorch library. Each experiment is conducted on a computer with the Windows 10 operating system, an Intel(R) Core(TM) i5-10400F CPU, and a GTX 2060 GPU. Moreover, 10 trials are performed for each model to reduce the effect of randomness. The Glorot initialization strategy [37] is adopted for the initialization of the DL networks. The learning rate and the batch size are set to 0.001 and 64, respectively. All the DL networks are trained for 40 epochs. In addition, two CNN-GAP networks are constructed to verify the effectiveness of the backward difference strategy. One CNN-GAP network uses the raw domain response sequence as input (denoted by CNN-GAP-R), while the other takes the multi-scale sequences as input (denoted by CNN-GAP-MS). The detailed information of their structure is shown in Table 1, where the first and the second number in Conv(·) represent the number and the size of the convolution kernels, respectively. The only difference between these two CNN-GAP networks is the number of input channels. Moreover, L in Input(·) represents the length of the input sequence, and NFM in FC(·) indicates the number of fault modes. Note that there is no agreement on how to set these hyperparameters in CNN-GAP, and this paper determines them based on trial and error and some popular recommendations.

Simulation Settings and Data Collection
The Sallen-Key band-pass filter (SKBPF) circuit is presented in Figure 4; it includes several resistors and capacitors and an amplifier. The tolerances of the resistors and capacitors are set to 5% and 10%, respectively. A single pulse signal (amplitude 5 V, pulse width 10 µs) is the input, and the voltage signals at the output of the amplifier are collected. In the experiment, nine symmetric and asymmetric fault modes are defined in the circuit, including the non-fault state and eight fault states (C1↑, C1↓, C2↑, C2↓, R2↑, R2↓, R3↑, R3↓), where ↑ indicates that the value of the component is 50% higher than its nominal value, and ↓ indicates that it is 50% lower than its nominal value. In addition, the sampling range of the time domain response signal is 0-75 µs, and the sampling interval is set to 0.5 µs, so the size of each collected sample is 1 × 150. For each fault mode, 2000 Monte Carlo analyses are performed, and Figure 5 illustrates one time domain response signal for each fault mode. In the Monte Carlo analyses of a fault mode, the value of the faulty component deviates by exactly 50%, while the values of the remaining components vary within their tolerances (5% for resistors and 10% for capacitors). With the nine fault modes defined above, each with 2000 samples, the total number of samples is 18,000. Meanwhile, Max-Min normalization is used to standardize the data. Finally, 70% of the instances of each fault mode are used as the training set, and the rest are taken as the testing set.
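The data preparation described above (sample size, Max-Min normalization, and a 70/30 split per fault mode) can be sketched as follows. This is a hedged illustration: the fault mode labels, the placeholder samples, the per-sample normalization range, the shuffling, and the random seed are all assumptions, and no simulated circuit data are used.

```python
import random


def min_max_normalize(sample):
    """Scale one sample to [0, 1]; normalizing per sample is an assumption."""
    lo, hi = min(sample), max(sample)
    if hi == lo:
        return [0.0] * len(sample)
    return [(v - lo) / (hi - lo) for v in sample]


def split_per_mode(samples_by_mode, train_fraction=0.7, seed=0):
    """Shuffle each fault mode separately and split it into train/test parts."""
    rng = random.Random(seed)
    train, test = [], []
    for mode, samples in samples_by_mode.items():
        shuffled = list(samples)
        rng.shuffle(shuffled)
        cut = round(train_fraction * len(shuffled))
        train.extend((s, mode) for s in shuffled[:cut])
        test.extend((s, mode) for s in shuffled[cut:])
    return train, test


# Points per response: 75 us sampling range with a 0.5 us interval.
n_points = int(75 / 0.5)

# Hypothetical labels for the nine fault modes; the samples are placeholders,
# not simulated circuit responses.
modes = ["NF", "C1_up", "C1_down", "C2_up", "C2_down",
         "R2_up", "R2_down", "R3_up", "R3_down"]
data = {m: [min_max_normalize([0.0, float(i), 2000.0]) for i in range(2000)]
        for m in modes}
train_set, test_set = split_per_mode(data)
print(n_points, len(train_set), len(test_set))
```

With nine fault modes and 2000 samples each, this split yields 12,600 training and 5400 testing instances, which together make up the 18,000 samples stated in the text.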

Experimental Results and Analyses
In this section, the effectiveness of the proposed method is verified on the fault diagnosis dataset described in Section 4.2.1. Two existing methods [15,25] are compared in the experiments. In order to make a fair comparison, this paper adopts the same circuit settings as [25], including fault mode, component parameter, tolerance, etc. In addition, this paper uses the diagnosis accuracy to evaluate the fault diagnosis performance, which is the most commonly used indicator of classification performance. The diagnosis accuracy is defined as:
$$\text{Diagnosis accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \times 100\%$$
where TP, TN, FP, and FN denote the number of true positive samples, true negative samples, false positive samples, and false negative samples, respectively. The larger the diagnosis accuracy, the better the diagnosis performance. Table 2 lists the comparison results, where the diagnosis results achieved by the CNN-GAP networks are the average diagnosis accuracy of 10 experiments. Meanwhile, the standard deviations of CNN-GAP-R and CNN-GAP-MS are also recorded. Note that no standard deviations are given for the methods in [15,25] because this indicator is not reported in the corresponding literature.
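As a worked illustration of the accuracy definition, the sketch below counts TP, TN, FP, and FN for one fault mode in a one-vs-rest manner and evaluates the formula. The labels and predictions are made-up toy values, not experimental results, and the one-vs-rest counting is an assumption about how the four quantities are obtained in the multi-class setting.

```python
def confusion_counts(y_true, y_pred, mode):
    """Count TP, TN, FP, FN for one fault mode treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == mode and p == mode)
    tn = sum(1 for t, p in pairs if t != mode and p != mode)
    fp = sum(1 for t, p in pairs if t != mode and p == mode)
    fn = sum(1 for t, p in pairs if t == mode and p != mode)
    return tp, tn, fp, fn


def diagnosis_accuracy(tp, tn, fp, fn):
    """Diagnosis accuracy in percent, following the definition above."""
    return 100.0 * (tp + tn) / (tp + tn + fp + fn)


# Toy labels only (hypothetical fault mode names).
y_true = ["NF", "NF", "C1_up", "C1_up", "R2_down"]
y_pred = ["NF", "C1_up", "C1_up", "C1_up", "R2_down"]

counts = confusion_counts(y_true, y_pred, "C1_up")
accuracy = diagnosis_accuracy(*counts)
print(counts, accuracy)  # (2, 2, 1, 0) 80.0
```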
As shown in Table 2, CNN-GAP-R, CNN-GAP-MS, and the method in [25] achieve the same results: they all obtain 100% diagnosis accuracy for each fault mode. This is because the structure of the SKBPF circuit is relatively simple and consists of only a few components. When a component value is offset by 50% from its nominal value, the output of the circuit is greatly affected, making that fault mode easy to separate from the others. In addition, the method in [15] achieves 98.41% average accuracy, which is lower than that of the other three tested approaches. This may be attributed to the fact that traditional feature extraction methods have difficulty mining high-level fault characteristics, which results in poorer diagnosis results.

To verify the feature extraction ability of the proposed CNN-GAP-MS, the t-distributed stochastic neighbor embedding (t-SNE) technique is used to convert the raw data and the features extracted by CNN-GAP-MS into three-dimensional maps, which are shown as scatterplots in Figure 6. As shown in Figure 6a, the raw data of several fault modes slightly overlap with those of other fault modes, while Figure 6b shows that the features extracted by CNN-GAP-MS clearly separate the nine fault modes, which is beneficial to fault classification. These results indicate that the proposed method effectively extracts features from the raw data of the SKBPF circuit.

Fault Diagnosis of Incipient Faults
In Section 4.2.1, a fault value offset by 50% from the nominal value is selected for a fair comparison with the experimental results in [25]. In this section, the proposed method is applied to incipient fault diagnosis to further verify its effectiveness. As in Section 4.2.1, nine fault models are set in the circuit, that is, the non-fault state and eight fault states (C1↑20, C1↓20, C2↑20, C2↓20, R2↑20, R2↓20, R3↑20, R3↓20). Here, ↑20 and ↓20 denote that the actual value of the component is 20% higher and 20% lower than its nominal value, respectively. All other circuit settings and experiment settings remain the same as in Section 4.2.1. The method in [25] is also compared with the proposed approaches. The average results of CNN-GAP-R and CNN-GAP-MS over ten experiments, as well as the results in [25], are listed in Table 3.

From Table 3, it can be seen that the method in [25] and CNN-GAP-R obtain almost the same average results. The average diagnosis accuracy of CNN-GAP-MS is close to 100%, which is about 1.8% higher than that of CNN-GAP-R and the method in [25]. This is because the original time domain response signal and the backward difference sequences are both fed into the CNN-GAP-MS network, where the convolutional layer automatically extracts and fuses the original signal features, the signal variation features, and the rate of variation features, so the generated feature maps are more discriminative. Therefore, the proposed CNN-GAP-MS performs better in the diagnosis of incipient faults.

Table 3. Comparison of incipient diagnosis accuracy for the SKBPF circuit.

Figure 7 shows the four-opamp biquad high-pass filter (FOBHPF) circuit, a more complex analog circuit that is commonly used to evaluate the performance of fault diagnosis methods. As in Section 4.2.1, the tolerances of the resistors and capacitors are set to 5% and 10%, respectively. The input of the circuit is a single pulse signal (amplitude 5 V, pulse width 10 µs). To compare the proposed approach with the existing methods [15] and [25], the same fault modes as in those works are set in the circuit, comprising the non-fault state and 12 fault states (C1↑, C1↓, C2↑, C2↓, R1↑, R1↓, R2↑, R2↓, R3↑, R3↓, R4↑, R4↓). The notation ↑ denotes that the actual value of the component is 50% higher than its nominal value, while ↓ denotes that it is 50% lower than its nominal value. In this circuit, the sampling range of the time domain response signal is 0-300 µs and the sampling interval is 1 µs, so each generated sample has a size of 1 × 300. Similarly, each state was simulated 2000 times under Monte Carlo analysis, and one time domain response signal for each fault model is shown in Figure 8. As 13 symmetric and asymmetric fault modes are defined, the entire fault diagnosis dataset contains 26,000 samples in total. Furthermore, all the collected samples are normalized by Max-Min normalization and then randomly split into a training set and a testing set: 70% of the samples are taken as the training set and the rest are used as the testing set.
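The normalization and the 70/30 random split described above can be sketched as follows. This is a minimal illustration on a small synthetic stand-in dataset, not the simulated circuit data. The function names are hypothetical, and normalizing each signal separately is an assumption, since the text does not say whether Max-Min normalization is applied per sample or over the whole dataset.

```python
import random


def min_max_normalize(sample):
    """Scale one response signal to [0, 1]; a constant signal maps to zeros.

    Assumption: normalization is applied per signal.
    """
    lo, hi = min(sample), max(sample)
    if hi == lo:
        return [0.0 for _ in sample]
    return [(x - lo) / (hi - lo) for x in sample]


def split_dataset(samples, labels, train_ratio=0.7, seed=0):
    """Shuffle the sample indices and split them into training and testing sets."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)
    cut = int(round(train_ratio * len(idx)))
    train, test = idx[:cut], idx[cut:]
    return (
        [samples[i] for i in train],
        [labels[i] for i in train],
        [samples[i] for i in test],
        [labels[i] for i in test],
    )


# Synthetic stand-in: 13 classes x 20 random signals of length 300
# (the paper uses 2000 Monte Carlo runs per class).
rng = random.Random(42)
samples = [[rng.random() for _ in range(300)] for _ in range(13 * 20)]
labels = [i // 20 for i in range(13 * 20)]

normalized = [min_max_normalize(s) for s in samples]
x_train, y_train, x_test, y_test = split_dataset(normalized, labels)
print(len(x_train), len(x_test))
```

With the 26,000 samples of the FOBHPF dataset, the same 70% ratio gives 18,200 training samples and 7,800 testing samples.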

Experimental Results and Analyses
This section compares the proposed methods with the existing methods [15,25] using the fault diagnosis dataset described in Section 4.3.1. Note that the CNN-GAP networks used in this section differ from those used in Section 4.2.2 only in the number of output units. Table 4 shows the comparison results of each tested method. The method in [15] performs poorly on the fault models C1↑, C2↑, and R2↑ and obtains an average diagnosis accuracy of 95.12%. The method in [25] achieves an average accuracy of 99.43%, which is about 0.38% higher than that of CNN-GAP-R. However, CNN-GAP-MS shows the best performance, with an average diagnosis accuracy of 99.95%. This case study indicates that the proposed method outperforms the compared existing methods for complex analog circuit fault diagnosis.
In addition, the t-SNE technique is used to examine the feature extraction capacity of the proposed method. As visualized in Figure 9, the raw data in case 2 and the features extracted by CNN-GAP-MS are mapped into three-dimensional maps. From Figure 9a, it is easy to see that the raw data of most fault modes are highly mixed and overlapped, especially for the fault modes C2↑, C2↓, R4↑, and R4↓. In Figure 9b, almost all the fault modes are separated from one another, and only a few overlapping points remain. These results indicate that the proposed method has a strong feature extraction capability for the complex circuit.

Fault Diagnosis of Incipient Faults
In this section, to further evaluate the diagnosis performance of the proposed method on the more complex circuit, the incipient fault diagnosis experiment is conducted on the FOBHPF circuit. Thirteen fault modes are studied, including the non-fault state and 12 fault states (C1↑20, C1↓20, C2↑20, C2↓20, R1↑20, R1↓20, R2↑20, R2↓20, R3↑20, R3↓20, R4↑20, R4↓20), where ↑20 denotes that the actual value of the component is 20% higher than its nominal value and ↓20 denotes that it is 20% lower than its nominal value. The other settings are the same as in Section 4.3.1. Table 5 shows the average fault diagnosis results of the 10 experiments. It can be observed that CNN-GAP-MS performs significantly better, as its average accuracy is about 6.79% higher than that of CNN-GAP-R. This is because the fault characteristics contained in the original time domain signal of a circuit with an incipient fault are weak, but they can be highlighted to some extent by the backward difference preprocessing method, which generates the FOBD and SOBD sequences containing the signal variation and the rate of signal variation characteristics. With multi-scale inputs (the original signal and the FOBD and SOBD sequences), CNN-GAP-MS achieves better fault classification results. However, the proposed CNN-GAP-MS still does not achieve high accuracy on three fault models (non-fault, C2↑20, and R4↑20). Incipient fault diagnosis of a complex circuit is a considerably difficult task: as the fault features of different fault models overlap heavily, it is hard to separate them accurately. Future work should be performed to improve the incipient fault diagnosis accuracy of complex circuits.
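The FOBD and SOBD sequences used as multi-scale inputs can be sketched as follows. This is a minimal illustration, assuming the standard backward difference d[n] = x[n] - x[n-1] and its repetition for the second order. The function names are hypothetical, and zero-padding the front of the difference sequences so that all three channels have equal length is an assumption, not something stated in this section.

```python
def backward_difference(x):
    """First-order backward difference: d[n] = x[n] - x[n-1] for n >= 1."""
    return [x[n] - x[n - 1] for n in range(1, len(x))]


def multi_scale_input(x):
    """Return the raw signal, FOBD, and SOBD as three equal-length channels.

    Assumption: the shorter difference sequences are zero-padded at the front.
    """
    fobd = backward_difference(x)
    sobd = backward_difference(fobd)  # x[n] - 2*x[n-1] + x[n-2]

    def pad(seq):
        return [0.0] * (len(x) - len(seq)) + seq

    return [list(x), pad(fobd), pad(sobd)]


# Toy signal x[n] = n^2: FOBD grows linearly and SOBD is constant.
signal = [0.0, 1.0, 4.0, 9.0, 16.0]
raw, fobd, sobd = multi_scale_input(signal)
print(raw)
print(fobd)
print(sobd)
```

The toy signal shows why the differences help: a change that is small in the raw amplitude can still produce a clear pattern in the variation (FOBD) or in the rate of variation (SOBD).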

Conclusions and Future Work
Traditional DL-based fault diagnosis methods for analog circuits require fixed-size inputs, which heavily limits their applicability to different circuits. Moreover, as the conventional DL-based methods directly use the raw time domain signals as inputs, their performance is still inadequate for incipient fault diagnosis or for the diagnosis of a complex circuit. To address these issues, this paper proposes to employ the backward difference to preprocess the measured signals, and a novel variant of CNN, i.e., CNN-GAP, is used to extract features and diagnose faults. Firstly, based on the backward difference strategy, the first-order and the second-order backward difference sequences of the raw time domain response signals are obtained, which contain the signal variation and the rate of variation characteristics. Then, the GAP technique is introduced into the 1DCNN network to obtain the CNN-GAP. The CNN-GAP is able to deal with input signals of different lengths because its GAP layer calculates the mean value of each input vector. Furthermore, the original signal sequence and the backward difference sequences are directly fed into the CNN-GAP, whose convolutional layers automatically mine fault features and fuse the learned multi-scale features. Finally, the fully connected layer outputs the fault diagnosis results. Based on two benchmark circuits, experiments on two cases (i.e., common soft fault diagnosis and incipient fault diagnosis) are conducted to verify the validity of the proposed approach. Comparison results indicate that the proposed method achieves higher diagnosis accuracy and is more reliable than the two existing methods. However, it should be mentioned that the computation cost of the proposed fault diagnosis method is larger than that of the traditional DL-based methods, as it needs to calculate the backward difference sequences and extract features from them.
In addition, there is still room for improving the accuracy of incipient fault diagnosis for the complex circuit. Moreover, the situation in which two circuit elements fail at the same time is more complicated, but it may occur in real-world applications. Our future work will focus on incipient fault diagnosis of the complex circuit and on the diagnosis of two or more faulty elements.
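The length independence of the GAP layer mentioned above can be illustrated with a minimal sketch. The function name and the toy feature maps are hypothetical. The point is only that averaging each channel over the time axis yields one value per channel, so the size of the vector passed to the fully connected layer depends on the number of channels and not on the signal length.

```python
def global_average_pooling(feature_maps):
    """Average each channel over the time axis, giving one value per channel."""
    return [sum(channel) / len(channel) for channel in feature_maps]


# Two hypothetical sets of feature maps with 3 channels each but different lengths.
short_maps = [[1.0, 3.0], [2.0, 4.0], [0.0, 0.0]]      # length 2
long_maps = [[1.0] * 300, [2.0] * 300, [3.0] * 300]    # length 300

pooled_short = global_average_pooling(short_maps)
pooled_long = global_average_pooling(long_maps)

# Both pooled vectors have one entry per channel, regardless of input length.
print(len(pooled_short), len(pooled_long))
```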