Prediction of Static Characteristic Parameters of an Insulated Gate Bipolar Transistor Using Artificial Neural Network

Breakdown voltage (BV), on-state voltage (Von), static latch-up voltage (Vlu), static latch-up current density (Jlu), and threshold voltage (Vth) are critical static characteristic parameters of an IGBT. Von and Vth characterize the conduction capability of the device, while BV, Vlu, and Jlu help designers analyze the safe operating area (SOA) of the device and its reliability. In this paper, we propose a multi-layer artificial neural network (ANN) framework to predict these characteristic parameters. The proposed scheme accurately fits the relationship between structural parameters and static characteristic parameters, so that, given the structural parameters of the device, the characteristic parameters can be generated accurately and efficiently. Compared with technology computer-aided design (TCAD) simulation, the average error of our scheme for each characteristic parameter is within 8%, and for BV and Vth in particular it is within 1%, while the evaluation speed is improved by more than 10^7 times. In addition, since the prediction process is mathematically a matrix operation, it does not suffer from the convergence problems that can occur in TCAD simulation.


Introduction
The insulated gate bipolar transistor (IGBT) is widely used in power electronics due to its superior performance [1,2]. Analysis of its static characteristic parameters has always been a key part of the design process, given that parameters such as the on-state voltage (Von) and threshold voltage (Vth) characterize the conduction capability of the device [3,4]. In addition, the breakdown voltage (BV), static latch-up voltage (Vlu), and static latch-up current density (Jlu) are helpful when evaluating the safe operating area of the device and its reliability [5][6][7]. Conventionally, technology computer-aided design (TCAD) simulation tools are used to obtain these characteristic parameters before experimental testing because of their low prediction errors [1][2][3][4]. However, this method may suffer from non-convergence when solving the semiconductor physical equations, which decreases the efficiency of obtaining the characteristic parameters.
Recently, machine learning techniques for predicting the electrical characteristic parameters of the semiconductor device have been booming due to their ability to learn the relationship between structural parameters and characteristics efficiently [8][9][10][11][12][13]. However, most work is limited to providing only one characteristic parameter, such as the threshold voltage of a junctionless nanowire transistor [11] or the breakdown voltage of a lateral power device [12]. As for those that can output multiple characteristics, the characteristic parameters are extracted from the current-voltage (I-V) curve and capacitance-voltage (C-V) curve, which still relies on human-intensive operation [13]. In addition, machine learning is applied to predict the current value of IGBTs in circuits and suppress the current imbalance of parallel-connected IGBTs through artificial neural networks, or to predict the remaining lifetime of the IGBTs operating in circuits [14][15][16]. However, the focus of these works is on monitoring and optimizing IGBT operations in the circuit, rather than on exploring the relationship between the device's structural parameters and characteristic parameters.
In this paper, we propose a simple multi-layer artificial neural network (ANN) framework to predict the vital static characteristic parameters of the IGBT, including BV, Von, Vth, Vlu, and Jlu. The proposed method enables designers to predict the static characteristic parameters efficiently, speeding up the device design process. In our tests on the same testing samples, prediction with the ANN is more than 10^7 times faster than TCAD simulation. The method avoids the non-convergence problem of TCAD simulations and, if trained on experimental data, its predictions could be closer to experimental test values. Figure 1 shows the structural schematic diagram of the IGBT, and Figure 2 gives its static characteristic curves, including the breakdown characteristic curve, forward I-V characteristic curve, and transfer characteristic curve.

Dataset Generation and Division
As shown in Figure 2, the prediction targets, including BV, Vth, Von, Vlu, and Jlu, are marked on the corresponding characteristic curves. To obtain a dataset of static characteristic parameters for IGBTs with different structures, the vital structural parameters were generated randomly within fixed ranges. These structural parameters and their ranges are listed in Table 1. During data collection, the static latch-up point was extracted from the forward I-V characteristic curve [7]. In addition, Von was defined as the anode voltage at which the anode current density reaches a set value (Jset), so it could also be extracted from the forward I-V curve at a gate voltage VG of 15 V, with Jset defined as 100 A/cm^2 [1].
We simulated the device characteristics of multiple IGBTs with a carrier lifetime of 1 µs using the TCAD tool to form the total dataset [17]. Once the total dataset had been collected from the simulations, it was divided into a training set and a testing set for the training and testing of the ANN framework.
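The Von definition above amounts to finding where the forward I-V curve crosses Jset. The following is a minimal sketch of that extraction, assuming the simulated curve is available as two arrays and that linear interpolation between neighbouring sweep points is acceptable. The function name `extract_von` and the sweep values are hypothetical and are not taken from the paper.

```python
import numpy as np

J_SET = 100.0  # A/cm^2, the set current density stated in the text


def extract_von(v_anode, j_anode, j_set=J_SET):
    """Return the anode voltage at which the current density first reaches j_set.

    Linearly interpolates between the two sweep points that bracket j_set.
    Returns None if the curve never reaches j_set.
    """
    v = np.asarray(v_anode, dtype=float)
    j = np.asarray(j_anode, dtype=float)
    above = np.nonzero(j >= j_set)[0]
    if above.size == 0:
        return None
    k = above[0]
    if k == 0:
        return float(v[0])
    # j[k - 1] < j_set <= j[k], so the denominator is strictly positive.
    frac = (j_set - j[k - 1]) / (j[k] - j[k - 1])
    return float(v[k - 1] + frac * (v[k] - v[k - 1]))


# Hypothetical forward I-V sweep at VG = 15 V (illustrative values only).
v_sweep = [0.0, 0.5, 1.0, 1.5, 2.0]
j_sweep = [0.0, 0.0, 20.0, 80.0, 160.0]
von = extract_von(v_sweep, j_sweep)
```

In this illustrative sweep the crossing lies between the 1.5 V and 2.0 V points. A real extraction would use the much denser sweep produced by the TCAD tool.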
Table 1. Structural parameters and ranges of the IGBT.

Structural Parameters    Range
Channel length, L (µm)    1-5

Methodology
Figure 3 shows the overall flow of the proposed approach and the designed ANN framework structure. In general, the proposed scheme can be divided into three steps: dataset processing, the training process, and the testing process. Because the input parameters have different scales, the updates of the ANN weights and biases during training would otherwise be dominated by the large-scale input parameters, and the effect of small-scale input parameters on the output might be ignored. We therefore use a normalization method to compress the structural parameters to (0, 1). This eliminates the effect of unit and scale differences between input parameters so that each class of input parameter is treated equally, thereby increasing the prediction accuracy and efficiency of the ANN. Similar approaches to data pre-processing have been reported in papers related to machine learning [12,13].
The ANN is a nonlinear dynamic learning system, implemented mathematically, which can handle complex nonlinear prediction problems by simulating biological neural networks. It consists of an input layer, hidden layers, and an output layer. The ANN structure for the proposed method is built using the TensorFlow library [18]. To obtain a robust framework for IGBT static characteristic parameter prediction, several trials and adjustments were conducted. In the final structure, there are four hidden layers with 36, 20, 16, and 8 neurons, from first to last. Such an ANN structure has sufficient fitting ability for the static performance prediction task, while avoiding the overfitting that an excessive learning capacity would cause.
The ANN algorithm continuously optimizes the weights and biases between layers to minimize the error between the predicted value and the actual value during the training process. We defined the error function using the mean absolute error (MAE) loss function [11], and we used the Adam optimizer to update the network parameters to reduce the error [19]. Mathematically, the MAE loss function is expressed as follows:
$$\mathrm{MAE} = \frac{1}{m}\sum_{i=1}^{m}\left|y_i - \hat{y}_i\right| \tag{1}$$
where m is the number of training samples, and y_i and ŷ_i are the actual value and predicted value of the output feature of the ith training sample, respectively. Moreover, to capture the nonlinear relationship between the device structure and the static characteristic parameters, each neuron is equipped with a ReLU activation function [20].
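The input normalization and the MAE loss described in this section can be sketched in a few lines of NumPy. The paper does not state which normalization formula was used, so the min-max scaling below is an assumption, and the helper names and the sample values (a channel length in µm and a doping-like quantity) are hypothetical.

```python
import numpy as np


def fit_min_max(x_train):
    """Per-feature minimum and range, computed from the training set only."""
    x_train = np.asarray(x_train, dtype=float)
    x_min = x_train.min(axis=0)
    x_range = x_train.max(axis=0) - x_min
    x_range[x_range == 0.0] = 1.0  # guard against constant columns
    return x_min, x_range


def normalize(x, x_min, x_range):
    """Scale each feature with the training-set statistics (assumed min-max)."""
    return (np.asarray(x, dtype=float) - x_min) / x_range


def mae(y_true, y_pred):
    """Mean absolute error over all samples."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


# Two hypothetical features with very different scales.
x_train = np.array([[1.0, 1e16], [3.0, 5e16], [5.0, 1e17]])
x_min, x_range = fit_min_max(x_train)
x_scaled = normalize(x_train, x_min, x_range)

# MAE of three illustrative predictions: (0.5 + 0.0 + 1.0) / 3
loss = mae([1.0, 2.0, 3.0], [1.5, 2.0, 2.0])
```

After scaling, both columns span the same interval, so neither feature dominates the weight updates purely because of its units. The testing set would be scaled with the same `x_min` and `x_range` obtained from the training set.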

Results Analysis
The following numerical results validate the proposed scheme. The prediction results of the ANN for the different static characteristic parameters are shown first. The effects of the regression algorithm and of the number of training samples on the prediction errors are then investigated separately, and the reasons for the higher and lower average errors of different characteristic parameters are explained through physical theory. In addition, the ANN method is shown to fit the trends of the static characteristic parameters well as the device's structural parameters change. Finally, we show the advantage in prediction speed by comparing the time taken by the ANN with that of TCAD simulation.

Prediction Results for Characteristic Parameters
Figure 4a-e are scatter plots showing the correlations between the target BV, Von, Vlu, Jlu, and Vth and the values predicted by the ANN method; the number of training samples used for the prediction of each characteristic parameter is given in the caption. The number of training samples is influenced by the convergence of the simulation and the complexity of the predicted characteristic parameter, which is discussed in more detail later in this paper. Compared to the TCAD simulation results, the average errors for BV prediction and Vth prediction are within 1%. For the IGBT, BV falls between the avalanche breakdown and the reach-through breakdown limits, and is governed by the open-base transistor breakdown phenomenon. The open-base transistor breakdown condition for the IGBT is given by Equation (2):
$$\alpha_{pnp} = \gamma_E \, \alpha_T \, M = 1 \tag{2}$$
where α_pnp is the common-base current gain, which is related to the injection efficiency γ_E, the base transport factor α_T, and the multiplication coefficient M. By derivation, α_T and M are in turn related to Nd and Td [21]. Vth is primarily associated with the doping concentration of the channel region, Nwell. The number of structural parameters affecting these two characteristic parameters is small, so the ANN can predict the BV and Vth of the IGBT very accurately. However, Von, Vlu, and Jlu are closely related to the on-resistance, and the parameters of the drift region, buffer region, and channel region all affect the on-resistance of the device. The more structural parameters affect a static characteristic parameter, the more complex its prediction becomes. Nevertheless, the average errors of our proposed scheme are all within 8%, which indicates that the number of layers and neurons of the framework is well chosen: it has sufficient fitting ability while avoiding the overfitting that an excessively high fitting ability would cause.

Comparison of Different Algorithms
To further assess the effectiveness of the ANN, we compare its results with those of predictors constructed using conventional regression algorithms, namely Gaussian process regression (GPR), support vector regression (SVR), and linear regression (LR). Table 2 shows the average errors of the predictors constructed using the different machine learning algorithms when predicting IGBT static characteristic parameters. The ANN achieves the most accurate predictions for each static characteristic parameter. It is also worth noting that, except for the ANN, the average errors of the other three schemes for Vlu are large. This is because, for some structures with a very small Vlu, these schemes fail to capture the nonlinear behavior arising from the high complexity of the data distribution, so it is difficult for them to give accurate predictions. In contrast, owing to the ReLU activation function, the multi-layer ANN we trained has a strong nonlinear fitting ability, so its predictions are more accurate than those of the other three schemes.
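A comparison of this kind needs a common error metric and at least one simple baseline. The sketch below assumes that the "average error" is the mean absolute percentage error relative to the TCAD value, which the paper does not define explicitly, and it uses an ordinary least-squares fit as a stand-in for the LR predictor. All function names and numbers are hypothetical.

```python
import numpy as np


def average_percentage_error(y_true, y_pred):
    """Mean of |prediction - target| / |target|, in percent (assumed metric)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(100.0 * np.mean(np.abs(y_pred - y_true) / np.abs(y_true)))


def fit_linear_baseline(x_train, y_train):
    """Ordinary least squares with an intercept term."""
    a = np.column_stack([x_train, np.ones(len(x_train))])
    coef, *_ = np.linalg.lstsq(a, y_train, rcond=None)
    return coef


def predict_linear(coef, x):
    """Apply the fitted coefficients, appending the intercept column."""
    return np.column_stack([x, np.ones(len(x))]) @ coef


# Illustrative check of the metric: errors of 10% and 5% average to 7.5%.
example_error = average_percentage_error([100.0, 200.0], [110.0, 190.0])

# A linear baseline reproduces linear data but not a nonlinear target.
x = np.linspace(1.0, 5.0, 9).reshape(-1, 1)
y_linear = 2.0 * x[:, 0] + 1.0
y_nonlinear = 1.0 / x[:, 0]
err_linear = average_percentage_error(
    y_linear, predict_linear(fit_linear_baseline(x, y_linear), x))
err_nonlinear = average_percentage_error(
    y_nonlinear, predict_linear(fit_linear_baseline(x, y_nonlinear), x))
```

The same `average_percentage_error` call can be applied to the outputs of any predictor, which keeps the comparison between algorithms on an equal footing.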

Effect of Sample Size on Results
We can also analyze the influence of the number of training samples on the prediction results. The testing set consisted of 400 IGBT samples with different structural parameters, and the number of samples in the training set was increased from 400 to 4000. Figure 5 shows the variation curves of the ANN's prediction average error for five static characteristic parameters as the number of samples in the training set increased.

It can be observed that predicting Jlu and Vlu requires more training samples than the other static characteristic parameters, and that their average errors are larger. This is because, when the latch-up effect occurs, the internal parasitic NPN transistor of the IGBT turns on and the gate loses the ability to control the current. Under such conditions, the relationship between the device's characteristics and the structural parameters is hard to capture, which increases the difficulty of the ANN's prediction task.

Prediction Results with Changing Structural Parameters
Taking the Von prediction as an example, we can further analyze how well the trained ANN fits the relationship between the structural parameters and characteristic parameters of the IGBT. Figure 6a-f compare the prediction results of the ANN method and the TCAD simulation for different structural parameters, namely Nb, Np+, Nwell, Td, Tbuffer, and L, respectively. Starting from the initial structures, the structural parameters were changed one at a time. The initial structural parameters are given in the caption. Three different doping concentrations of the drift region (Nd) were considered.
As the figures show, the ANN method predicts accurately as the structural parameters change, following the same trend as the TCAD simulation results. By comparing the points of the three different colors in each figure, we can see that, when the other structural parameters remain unchanged, the larger Nd is, the smaller Von will be. This is because the on-state resistance of the drift region decreases as Nd increases, so the anode voltage Von required to reach the set anode current density Jset decreases as well. In addition, larger values of the buffer region parameters Nb and Tbuffer reduce the number of holes injected from the anode P+ region into the drift region, thereby weakening the conductance modulation effect, increasing the on-state resistance, and increasing Von. An increase in the channel region parameters Nwell and L raises the threshold voltage, which increases the on-state resistance and Von of the device. Overall, the proposed scheme fits the effects of the various structural parameters on the static characteristic parameters of the device well.

Prediction Time and Efficiency
For prediction using the ANN method, since the testing process is mathematically a matrix calculation between the input features and the trained parameters of the ANN, the prediction time of the ANN is within 0.1 s. In contrast, for the same dataset, the simulation runtime of the TCAD tool is 328,653.6 s, which does not include the time spent manually setting up the device structure. The prediction speed is therefore increased by more than 10^7 times compared with the TCAD simulation, and the convergence problem caused by unreasonable grid settings in the TCAD simulation is avoided entirely, which also improves efficiency when predicting the static characteristic parameters of IGBTs with different structures.
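To illustrate why prediction reduces to a handful of matrix operations, the following sketch runs a forward pass through a network with the four hidden layers described in the paper (36, 20, 16, and 8 neurons, ReLU activations). The weights here are random placeholders rather than trained values, and the choice of seven inputs, a single output, and a linear output layer are assumptions made for illustration.

```python
import time

import numpy as np

# Hidden sizes follow the text; 7 inputs and 1 output are assumptions.
LAYER_SIZES = [7, 36, 20, 16, 8, 1]


def init_params(layer_sizes, seed=0):
    """Random placeholder weights and zero biases (not trained values)."""
    rng = np.random.default_rng(seed)
    return [
        (rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]


def forward(params, x):
    """One matrix product, bias add, and ReLU per hidden layer."""
    h = x
    for w, b in params[:-1]:
        h = np.maximum(h @ w + b, 0.0)
    w, b = params[-1]
    return h @ w + b  # linear output layer (an assumption)


params = init_params(LAYER_SIZES)
# 400 normalized samples, matching the size of the testing set in the text.
x_test = np.random.default_rng(1).random((400, LAYER_SIZES[0]))

start = time.perf_counter()
y_pred = forward(params, x_test)
elapsed = time.perf_counter() - start
```

No iterative solver is involved, so there is nothing that can fail to converge. The cost is five small matrix products regardless of the device structure being evaluated.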

Conclusions
In this paper, we propose a multi-layer ANN predictive framework to predict multiple static characteristic parameters of an IGBT, namely BV, Von, Vlu, Jlu, and Vth. Compared with the TCAD simulation tool, the trained ANN achieves a speed increase of more than 10^7 times while keeping the average errors below 8% when predicting the static characteristic parameters. Since the prediction process is a numerical matrix operation, it avoids the convergence problem that often occurs with TCAD simulations. The scheme confirms that an ANN can capture the effects of changes in a device's structural parameters on its characteristic parameters, helping designers to predict the static characteristic parameters quickly and speed up the design process. The method is extensible: when trained with more informative datasets, the ANN will be able to predict the characteristics of devices with more complex structures. In addition, when trained on experimental data, the method can predict characteristic parameters that are closer to the experimental test values.