RLC Circuit Forecast in Analog IC Packaging and Testing by Machine Learning Techniques

In electronic products, printed circuit boards hold integrated circuits (ICs) in place and connect ICs with other electronic components, allowing electronic signals to pass smoothly among them. Machine learning (ML) techniques are popular and employed in various fields. To capture the nonlinear data patterns and input–output electrical relationships of analog circuits, this study employs ML techniques to improve operations from modeling to testing in the analog IC packaging and testing industry. Computing by simulation the resistance, inductance, and capacitance for the pin count that corresponds to the target electrical specification is a complex process. Tasks include converting a two-dimensional circuit into a three-dimensional one in simulation and modeling buried structures. In this study, circuit datasets are used to train an ML model that predicts resistance (R), inductance (L), and capacitance (C). Least squares support vector regression (LSSVR) with genetic algorithms (GA), denoted LSSVR-GA, serves as the ML model for forecasting RLC values, with genetic algorithms used to select the parameters of the LSSVR models. To demonstrate the performance of LSSVR models in forecasting RLC values, three other ML models with genetic algorithms, namely backpropagation neural networks (BPNN-GA), random forest (RF-GA), and eXtreme gradient boosting (XGBoost-GA), were applied to the same data. Numerical results showed that LSSVR-GA outperformed the three other forecasting models by about 14.84% on average in terms of mean absolute percentage error (MAPE), weighted absolute percent error measure (WAPE), and normalized mean absolute error (NMAE). The data were collected from an IC packaging and testing firm in Taiwan. The innovation and advantage of the proposed method is that it forecasts RLC values with a machine learning approach rather than by simulation and generates accurate results. Numerical results revealed that the developed ML model is effective and efficient in RLC circuit forecasting for the analog IC packaging and testing industry.


Introduction
Circuit simulation work in the integrated circuit (IC) packaging process depends on model complexity in geometry and electromagnetic materials. Properly simulating IC packaging plays a vital role in catching potential EMC, power, and signal integrity issues early in the design process and in avoiding major pitfalls. Eliminating manual processing is required to reduce the time and effort spent between customer requirement specifications and IC packaging design, testing, and manufacturing, and to determine the tools and process options for application design. Implementation requires prediction, evaluation, and decision making supported by machine learning-centric databases, tools, and design models. Learning-based tools and process models must continuously improve through additional design experience [1,2]. Machine learning (ML) has solved many data science problems that were originally difficult to solve. Many studies have shown that machine learning and optimization algorithms are suitable for solving different problems in the IC packaging and design processes, reducing design errors and design cycle time [3]. Table 1 shows the latest research on the IC packaging process using machine learning.
The signal passes through the substrate of the printed circuit board. Simulation provides the designer with a pre-optimized design concept. Ren et al. [4] introduced a graph neural network (GNN) to predict network parasitics and device parameters by converting circuit schematics into graphs and utilizing key GNN-based modeling techniques. The results showed that the average simulation error was reduced from over 100%, when parasitics were estimated by designers, to less than 10%. Shook et al. [5] proposed a new machine learning-based parasitic estimation method for pre-layout custom circuit designs. For various analog circuits, the results show a reduction in the average error between pre-layout and post-layout circuit simulations from 37% to 8%.
For optimization and evaluation of package structural characteristics, Wu and Chu [6] proposed and verified an analog-driven design method for chip package integration structure design optimization. The study's results suggest that the random forest algorithm can predict stress for chip package-integrated design. Hsiao and Chiang [7] proposed applying the RF model to predict the reliability of wafer-level packaging (WLP). The designers can easily optimize the WLP structure and shorten the design cycle. Lee et al. [8] developed a chip-to-package interactive risk assessment platform using finite element analysis, meta-modeling, and genetic algorithm optimization methods.
Heat transfer analysis of package structures is important in package functional testing. Acharya et al. [9] used three ML algorithms (random forest, support vector regression, and a neural network) to model thermal behavior by evaluating hotspot temperature simulation data. They proposed an ML-based thermal design method and provided a reference frame for future packaging materials. Durgam et al. [10] used several machine learning methods to predict the temperature of the heat source on the substrate. The results showed that the predicted and simulated temperatures agreed to within 10%. Jing et al. [11] proposed using the genetic algorithm to optimize the temperature curve prediction model in the reflow soldering process. The results show that the predicted values meet the error accuracy requirements and that the established mathematical model can effectively predict temperature curves.
The power delivery network (PDN) must reliably supply power to functional blocks in an integrated circuit (IC). A robust PDN design has always been a critical challenge. Cecchetti et al. [12] developed a Genetic Algorithm (GA) and Artificial Neural Network (ANN) model for iterative optimization of the placement of decoupling capacitors in a PDN. They concluded that the GA-ANN model is consistent with the results of commercial simulator optimization. Sourav et al. [13] presented an ML architecture that combined neural networks and regression trees to predict printed circuit board (PCB) inductance and resistance. They employed an LSTM model to predict voltage drop as a function of time. The average prediction accuracy of the proposed method is 94%.
Many models have used machine learning and optimization to solve issues in IC packaging. Mao et al. [14] proposed a machine learning (ML) model based on the backpropagation (BP) method for predicting three-dimensional board-level drop responses for ball grid array (BGA) encapsulation structures. Jin et al. [15] constructed several machine learning methods to accurately predict the radiated electric field of wire-bonded ball grid array packages. They optimized model parameters to minimize the prediction error of each model. Their conclusion shows that a deep neural network (DNN) is an effective and feasible prediction model. Wang et al. [16] proposed a reverse design method based on convolutional neural networks for the fast optimization and design of encapsulation structures. Schierholz et al. [2] provided a database that allows for the study of machine learning tools and techniques in signal integrity, power integrity, and electromagnetic compatibility. It contains printed circuit board (PCB)-based interconnects and physics (PB). The corresponding frequency domain data of the tool can be used for different types of structural simulations.
This study attempted to use least squares support vector regression with genetic algorithms to predict the RLC (resistance, inductance, and capacitance) values currently generated by simulation methods. The genetic algorithms were employed to determine LSSVR parameters to improve forecasting accuracy. The designed method forecasts RLC values with a machine learning approach rather than by simulation and generates more accurate results than the three other machine learning models. The rest of this study is organized as follows. Section 2 describes the substrate and interface electrical transfer properties of IC packages. Section 3 briefly introduces the LSSVR model and genetic algorithms and presents the flowchart of the LSSVR-GA model for predicting RLC values. Numerical results are illustrated in Section 4. Conclusions are given in Section 5.

The Substrate and Interface of the IC Package Transmit Electrical Properties
In IC packaging design, the substrate is used as a carrier. The functions of the substrate are to protect and carry the IC chip and serve as a medium for circuit signal transmission. Integrated circuit packaging is the final stage of semiconductor component manufacturing. As a method for connecting the die to the external circuit, the chip's packaging considers the pin configuration, electrical performance, heat dissipation, and the chip's physical size. There are many typical packaging forms in the semiconductor industry [17,18]. Currently, the most common internal packaging methods of integrated circuits are wire bonding (WB) and flip chip (FC) packages. Flip chip packaging connects the chip to the bump and then turns the IC chip over to directly connect the bump and substrate. The wire bonding package places the chip on the substrate (chip pad) and then uses the wire bonding technology to connect the chip to the connection point on the substrate. The IC substrate acts as a buffer interface for electrical connection and transmission between the IC die and the PCB through the conductive routing and vias (VIA) network, as shown in Figure 1.
The RLC circuit is essential to evaluate the overall interface transmission capability in the IC packaging design process. It is a circuit structure composed of resistors, capacitors, and inductors. Parasitic effects associated with ICs and printed circuit board (PCB) conductors and their paths are essential parameters of the electrical transport model. The parasitic effects of RLC lines in the IC package process can cause signal integrity problems due to signal attenuation and delay [20,21].
In the process of IC substrate generation, the substrate design is first performed according to the target circuit specification. The netlist of the corresponding circuit is then used to run a post-layout simulation that verifies the corresponding layout performance. If the post-layout simulation results violate the specification, the designer adjusts the layout and re-simulates. Figure 2 shows that this process is repeated until the simulated substrate design conforms to the RLC electrical specifications for interface transmission. The current process requires multiple simulation runs to meet the desired target circuit specification. Therefore, any inaccuracies in the design or components can produce misleading post-layout simulation results. Such misleading results can reduce yield and waste circuit design time. Signal transmission relies on the interconnected line group. According to transmission line theory, the electrical characteristics of the signal can be replaced by an equivalent transmission line calibration model. When the system simulation is based on the transmission line calibration model, the substrate RLC model, the IC input/output buffer information specification (IBIS) model, and the PCB electrical properties model are applied to the system simulations for system verification. The process is shown in Figure 3.

LSSVR Models with Genetic Algorithms
The LSSVR method can be traced back to the support vector machine (SVM) proposed by Cortes and Vapnik [22]. The SVM can handle classification and regression problems and performs well on small samples. Suykens and Vandewalle [23] proposed the least squares SVM (LSSVM), which relieves the high computational burden of the SVM. When the LSSVM is applied to regression problems, it is called LSSVR [24].
Consider a given data set $\{(x_i, y_i) \mid i = 1, 2, 3, \ldots, n\}$, where $x_i \in \mathbb{R}^d$ is the $i$-th input with $d$ features and $y_i \in \mathbb{R}$ is the $i$-th output. The LSSVR model is established as in Equation (1):

$$y = \omega^{T}\phi(x) + b \tag{1}$$

where $\omega^{T}$ is the transposed weight vector, $\phi(x)$ is a nonlinear function that maps the original feature space to a higher dimensional feature space, and $b$ is a bias value. The optimization problem to be solved by the model is presented as Equation (2):

$$\min_{\omega,\, b,\, e} F(\omega, e) = \frac{1}{2}\omega^{T}\omega + \frac{\gamma}{2}\sum_{i=1}^{n} e_i^{2} \quad \text{s.t.} \quad y_i = \omega^{T}\phi(x_i) + b + e_i, \quad i = 1, \ldots, n \tag{2}$$

where $F(\omega, e)$ is the loss function, $\gamma$ is the regularization parameter, and $e_i$ is the random error. Because of the constraints, solving this problem directly is very complicated. The Lagrange function in Equation (3) is therefore optimized instead:

$$L(\omega, b, e, l) = \frac{1}{2}\omega^{T}\omega + \frac{\gamma}{2}\sum_{i=1}^{n} e_i^{2} - \sum_{i=1}^{n} l_i \left( \omega^{T}\phi(x_i) + b + e_i - y_i \right) \tag{3}$$

where $L(\omega, b, e, l)$ is the Lagrange function and $l_i$ is the Lagrange multiplier. Applying the Karush-Kuhn-Tucker (KKT) conditions gives Equation (4):

$$\begin{aligned}
\frac{\partial L}{\partial \omega} = 0 &\;\Rightarrow\; \omega = \sum_{i=1}^{n} l_i\,\phi(x_i), \\
\frac{\partial L}{\partial b} = 0 &\;\Rightarrow\; \sum_{i=1}^{n} l_i = 0, \\
\frac{\partial L}{\partial e_i} = 0 &\;\Rightarrow\; l_i = \gamma e_i, \\
\frac{\partial L}{\partial l_i} = 0 &\;\Rightarrow\; \omega^{T}\phi(x_i) + b + e_i - y_i = 0
\end{aligned} \tag{4}$$

The kernel function $K(x, x_i)$ is defined as in Equation (5):

$$K(x, x_i) = \phi(x)^{T}\phi(x_i) \tag{5}$$

Finally, the LSSVR estimation formula is obtained as Equation (6):

$$\hat{y}(x) = \sum_{i=1}^{n} l_i\,K(x, x_i) + b \tag{6}$$

Common kernel functions include the string kernel [25], the radial basis function (RBF) kernel [26], and the polynomial kernel [27]. This study used the RBF kernel in Equation (7). The RBF kernel uses a high-dimensional nonlinear mapping to resolve the nonlinear relationship between dependent and independent variables and can learn more complex decision boundaries:

$$K(x, x_i) = \exp\!\left( -\frac{\lVert x - x_i \rVert^{2}}{2\sigma^{2}} \right) \tag{7}$$

where $\sigma$ is the parameter of the RBF kernel function. The choice of the two parameters $\gamma$ and $\sigma$ affects the accuracy of the LSSVR model, so GA was used to optimize them. The complete concept of GA was advocated by John Holland [28,29].
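As a minimal sketch of how an LSSVR of this kind can be fitted, the KKT conditions reduce training to one linear system in the bias b and the Lagrange multipliers: the multipliers sum to zero, and for every training point the kernel expansion plus b plus l_i/γ equals y_i. The code below assumes the common Gaussian RBF form exp(-‖x − x_i‖² / (2σ²)); the function names, the toy data, and the plain Gaussian elimination solver are illustrative choices and are not taken from the study.

```python
import math

def rbf_kernel(a, b, sigma):
    """Gaussian RBF kernel, assuming the form exp(-||a - b||^2 / (2 * sigma^2))."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def solve_linear(A, rhs):
    """Solve A z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - s) / M[r][r]
    return z

def lssvr_fit(X, y, gamma, sigma):
    """Return (b, multipliers) from the LSSVR dual system.

    Row 0 encodes sum(l_i) = 0; row i+1 encodes
    b + sum_j l_j K(x_i, x_j) + l_i / gamma = y_i.
    """
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0]
        for j in range(n):
            k = rbf_kernel(X[i], X[j], sigma)
            row.append(k + (1.0 / gamma if i == j else 0.0))
        A.append(row)
    z = solve_linear(A, [0.0] + list(y))
    return z[0], z[1:]

def lssvr_predict(x, X, b, multipliers, sigma):
    """Kernel expansion: y_hat(x) = sum_i l_i K(x, x_i) + b."""
    return b + sum(l * rbf_kernel(x, xi, sigma) for l, xi in zip(multipliers, X))

# Toy data (hypothetical, one feature) just to exercise the functions.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0.0, 1.0, 4.0, 9.0]
b, multipliers = lssvr_fit(X, y, gamma=100.0, sigma=1.0)
print(lssvr_predict([1.5], X, b, multipliers, sigma=1.0))
```

Because the training step is a single linear solve rather than a quadratic program, every candidate pair (γ, σ) proposed by a search procedure can be evaluated cheaply, which is what makes a population-based parameter search practical.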
GA imitates natural evolution: following the principle of survival of the fittest, it eliminates inferior individuals in a population and converges toward a balanced state over repeated iterations. GA is a search method used to solve optimization problems. Chromosomes undergo selection, crossover, and mutation. Better genes are passed to the new generation, and inferior genes are gradually eliminated. GA has been widely applied in solving optimization problems, data searches, artificial intelligence, and machine learning. Figure 4 illustrates the framework of the LSSVR-GA model in RLC (resistance, inductance, and capacitance) forecasting. It consists of three modules: data preprocessing, GA for parameter selection, and the LSSVR model for RLC forecasting. To avoid the time-consuming calculation of parameters in complex simulation software when verifying substrate designs, this study proposed a machine learning method to predict the RLC values for different product types.

LSSVR-GA Architecture for RLC Prediction
The experimental data were semi-structured historical data provided by SPIL (Siliconware Precision Industries Co., Ltd.), covering products from two different IC package processes, FC and WB. For these two processes, historical data of different products with two-layer, four-layer, and six-layer PCBs were selected. The three dependent variables for each product are resistance (R), inductance (L), and capacitance (C). The independent variables (X1~Xn) include ball, bump, base, L1, L2, L3, L4, L5, via, and wire. In the data preprocessing stage, this study integrated the scattered semi-structured data into structured data with a one-to-one correspondence between dependent and independent variables according to product type. Missing values were filled with 0. Table 2 describes the features and samples used in predicting RLC for the different data sets of substrate products.
The preprocessed data sets were divided into 80% training data and 20% testing data. The training data were used to build the LSSVR model with the parameters optimized by GA. Before applying GA, the parameters to be optimized must be encoded into a group of chromosomes. Common GA variants include binary-coded, real-coded, multi-objective, parallel, chaotic, and hybrid GAs [30]. Considering the simplicity of implementation for factory operators, this study used the binary-coded GA to optimize the parameters of LSSVR, with each binary digit representing a gene. The length of the chromosome was defined according to the spatial range of the actual problem to be solved. The real number represented by the binary encoding was calculated by Equation (8).
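The preprocessing described above can be sketched as follows. The feature names come from the text, but the record layout, the helper names, and the random shuffle before the 80/20 split are assumptions made for illustration; the study does not state how the split was drawn. In practice the R, L, and C targets would travel with each row; they are omitted here to keep the sketch short.

```python
import random

# Feature names taken from the text; the record layout itself is hypothetical.
FEATURES = ["ball", "bump", "base", "L1", "L2", "L3", "L4", "L5", "via", "wire"]

def to_row(record):
    """Turn one semi-structured record into a fixed-length row, filling gaps with 0."""
    return [float(record.get(name) or 0.0) for name in FEATURES]

def split_80_20(rows, seed=0):
    """Shuffle (an assumption) and split rows into 80% training and 20% testing."""
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    cut = round(0.8 * len(rows))
    return [rows[i] for i in order[:cut]], [rows[i] for i in order[cut:]]

# Hypothetical records: each one lists only the features that were present.
records = [
    {"ball": 0.1 * k, "wire": 1.0 + k} if k % 2 else {"bump": 0.2 * k, "via": None}
    for k in range(10)
]
rows = [to_row(r) for r in records]
train_rows, test_rows = split_80_20(rows)
print(len(train_rows), len(test_rows))  # 8 2
```

Filling with 0 keeps every product type on the same ten-column layout, so a two-layer product simply carries zeros for the layers it does not have.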
$$RV = LB + \frac{UB - LB}{2^{l} - 1} \sum_{i=1}^{l} d_i\, 2^{\,i-1} \tag{8}$$

where RV is the real number represented by the binary encoding, LB is the lower bound of the spatial range, UB is the upper bound of the spatial range, l is the encoded bit length, and d_i is the value of the i-th bit. Figure 5 shows the encoding of the LSSVR parameters γ and σ and the calculation of their real values. Each parameter consists of 10 genes, and the LSSVR model has two parameters, so each chromosome has 20 genes. For both parameters, the lower bound is 1 and the upper bound is 500, and the real numbers represented are calculated accordingly. The optimization settings of LSSVR-GA must also be defined. The population size, number of iterations, crossover rate, and mutation rate were set to 40, 20, 0.8, and 0.1, respectively. When GA starts, the parameters are first initialized and passed to the LSSVR model as inputs. The training result of the LSSVR model is scored by the fitness function, which is used to evaluate the stopping conditions for GA. If the conditions are not met, the GA selection, crossover, and mutation steps shown in Figures 6-8 produce a new generation. Good chromosomes have more opportunities to be selected, while unfit and less fit chromosomes are gradually eliminated. The updated parameters are then used as the inputs of the LSSVR model. The fitness evaluation is repeated until the GA stopping condition is reached, which yields the best-fit parameters of LSSVR. Finally, these best-fit parameters are set in the final LSSVR model, which performs the RLC predictions.
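The binary decoding and the GA loop can be sketched as below, using the settings stated in the text (10 bits per parameter, bounds 1 to 500, population 40, 20 iterations, crossover rate 0.8, mutation rate 0.1). Several details are assumptions, since the text does not specify them: the first list element is treated as the least significant bit, selection is a two-way tournament, crossover is single-point, and the mutation rate is applied per bit. The `toy_fitness` function is a hypothetical stand-in for the LSSVR training error, used only so the sketch runs on its own.

```python
import random

BITS_PER_PARAM = 10
LB, UB = 1.0, 500.0

def decode(bits, lb=LB, ub=UB):
    """Map a bit list to a real number in [lb, ub]; bits[0] is the lowest bit (assumed)."""
    value = sum(d << i for i, d in enumerate(bits))
    return lb + (ub - lb) * value / (2 ** len(bits) - 1)

def decode_chromosome(chrom):
    """First 10 genes encode gamma, last 10 encode sigma."""
    return decode(chrom[:BITS_PER_PARAM]), decode(chrom[BITS_PER_PARAM:])

def run_ga(fitness, pop_size=40, generations=20,
           crossover_rate=0.8, mutation_rate=0.1, seed=0):
    """Minimize fitness(gamma, sigma) with a binary-coded GA; returns best params and score."""
    rng = random.Random(seed)
    n_bits = 2 * BITS_PER_PARAM
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best, best_score = None, float("inf")
    for _ in range(generations):
        scores = [fitness(*decode_chromosome(c)) for c in pop]
        for c, s in zip(pop, scores):
            if s < best_score:  # remember the best chromosome seen so far
                best, best_score = c[:], s

        def select():
            # Two-way tournament: the fitter of two random chromosomes wins.
            i, j = rng.randrange(pop_size), rng.randrange(pop_size)
            return pop[i] if scores[i] <= scores[j] else pop[j]

        children = []
        while len(children) < pop_size:
            p1, p2 = select()[:], select()[:]
            if rng.random() < crossover_rate:  # single-point crossover
                cut = rng.randrange(1, n_bits)
                p1, p2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            for child in (p1, p2):  # per-bit flip mutation
                children.append([1 - g if rng.random() < mutation_rate else g
                                 for g in child])
        pop = children[:pop_size]
    return decode_chromosome(best), best_score

def toy_fitness(gamma, sigma):
    """Hypothetical stand-in for the LSSVR training error (lower is better)."""
    return (gamma - 100.0) ** 2 + (sigma - 50.0) ** 2

(best_gamma, best_sigma), best_score = run_ga(toy_fitness)
print(best_gamma, best_sigma, best_score)
```

With 10 bits over the range 1 to 500, each parameter is searched on a grid of 1024 values spaced about 0.49 apart. In a real run, `toy_fitness` would be replaced by a function that trains the LSSVR with the candidate γ and σ and returns its error.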

Numerical Results
Predicted results are evaluated and analyzed with the testing data to examine the effectiveness and interpretability of the proposed method. The evaluation is measured by mean absolute percentage error (MAPE (%)), weighted absolute percent error measure (WAPE (%)), and normalized mean absolute error (NMAE), as shown in Equations (9)-(11).
where Ŷ_i is the i-th predicted value, Y_i is the i-th actual value, and i = 1, ..., n. Three other forecasting models with genetic algorithms, namely backpropagation neural networks (BPNN-GA), random forest (RF-GA), and eXtreme gradient boosting (XGBoost-GA), were applied to the same data. Table 3 lists the parameters determined by genetic algorithms for the different forecasting models in predicting RLC values. Lewis [31] classified forecasting performance according to MAPE values, as summarized in Table 4. Table 5 lists the MAPE, WAPE, and NMAE values of the four forecasting models.
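The three error measures can be sketched in code as follows. MAPE and WAPE follow their usual definitions. For NMAE, several normalizations are in use; this sketch assumes the mean absolute error is divided by the range of the actual values, which may differ from the definition used in the study. The sample values are hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 / len(actual) * sum(abs((a - p) / a) for a, p in zip(actual, predicted))

def wape(actual, predicted):
    """Weighted absolute percent error: total absolute error over total actual, in percent."""
    total_error = sum(abs(a - p) for a, p in zip(actual, predicted))
    return 100.0 * total_error / sum(abs(a) for a in actual)

def nmae(actual, predicted):
    """Mean absolute error divided by the range of the actual values (assumed normalization)."""
    mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
    return mae / (max(actual) - min(actual))

# Hypothetical actual and predicted values.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
print(mape(actual, predicted), wape(actual, predicted), nmae(actual, predicted))
```

MAPE weights every sample equally, so small actual values can dominate it, whereas WAPE weights errors by magnitude; reporting both gives a more balanced view when R, L, and C span different scales.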
On average, the LSSVR-GA models predict RLC values for the six products at the "good" or "highly accurate" level. Furthermore, the LSSVR-GA models generate more accurate results on average than the other three forecasting models in terms of MAPE, WAPE, and NMAE.