# Drive System Inverter Modeling Using Symbolic Regression


## Abstract


## 1. Introduction

- Is it possible to use GPSR to design symbolic expressions for drive inverter modeling?
- Is it possible to model the inverter based on targeted variables of the black-box inverter model and black-box compensation scheme?
- Which GP hyperparameters must be used to achieve the highest estimation and generalization performance, as determined by 5-fold cross-validation?

## 2. Materials and Methods

#### 2.1. Description of the Control System

#### 2.2. Dataset Description

#### 2.3. Genetic Programming–Symbolic Regression

#### 2.4. Research Methodology

- The process begins with a random selection of GPSR hyperparameters;
- These hyperparameters are then used to run GPSR with 5-fold cross-validation, which yields 5 different symbolic expressions (one for each fold);
- Each symbolic expression is evaluated on the training and validation datasets to determine the mean and standard deviation of ${R}^{2}$, $MAE$, and $RMSE$;
- If the mean value of ${R}^{2}$ is higher than $0.99$, the process continues to the final training with the same hyperparameters used in the 5-fold cross-validation. If not, the process starts from the beginning with a new random selection of hyperparameters;
- In the final training, the GPSR is trained with these hyperparameters on the training part of the dataset ($70\%$ of the original dataset);
- The resulting symbolic expression is evaluated on the training and testing parts of the original dataset to calculate the mean and standard deviation of ${R}^{2}$, $MAE$, and $RMSE$. If the value of ${R}^{2}$ is larger than $0.99$, the values of $MAE$ and $RMSE$ are lower than 5, and the standard deviations of these metrics are lower than ${10}^{-1}$, the process terminates successfully.
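The acceptance logic in the steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the `sample`, `run_cv`, and `run_final` callables, and the `max_trials` cap are hypothetical. It also assumes that the thresholds apply to the mean over the training and testing scores, that the population standard deviation is used, and that a failed final check also restarts the search, none of which the text states explicitly.

```python
import math
from statistics import mean, pstdev


def r2_score(y_true, y_pred):
    """Coefficient of determination R^2."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mae(y_true, y_pred):
    """Mean absolute error."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def passes_cv(r2_per_fold, threshold=0.99):
    """Cross-validation gate: mean R^2 over the folds must exceed 0.99."""
    return mean(r2_per_fold) > threshold


def passes_final(train, test, r2_min=0.99, err_max=5.0, std_max=0.1):
    """Final gate on train/test scores (dicts with keys r2, mae, rmse).

    Assumption: thresholds are checked on the mean of the train and test
    values, and the spread is the population standard deviation.
    """
    for name in ("r2", "mae", "rmse"):
        values = [train[name], test[name]]
        if pstdev(values) >= std_max:
            return False
        if name == "r2" and mean(values) <= r2_min:
            return False
        if name != "r2" and mean(values) >= err_max:
            return False
    return True


def random_search(sample, run_cv, run_final, max_trials=100):
    """Repeat: sample hyperparameters, run 5-fold CV, then final training.

    sample()          -> hyperparameter dict (hypothetical callable)
    run_cv(params)    -> list of validation R^2 values, one per fold
    run_final(params) -> (train_scores, test_scores) dicts
    max_trials is an added safety cap; the text describes an open loop.
    """
    for _ in range(max_trials):
        params = sample()
        if not passes_cv(run_cv(params)):
            continue  # restart with new random hyperparameters
        train, test = run_final(params)
        if passes_final(train, test):
            return params
    return None
```

Keeping the two gates as separate functions makes it easy to change a threshold without touching the search loop.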

#### 2.5. Computational Resources

## 3. Results and Discussion

#### 3.1. Estimation of Phase Voltages

#### 3.2. Estimation of Duty Cycles

#### 3.3. Comparison between Symbolic Regression and Other Modeling Methods

- Model complexity: Refers to the structure of the model and the number of parameters it contains. For example, models such as the multilayer perceptron (MLP) or the convolutional neural network (CNN) are black-box models: the user cannot see what happens inside the model during training and cannot influence it before the results are calculated. Only the input and output are known.
- Model performance: Indicates the quality of the results obtained with a particular algorithm.
- Model execution time: Refers to the time elapsed from the start of the model estimation process until the results are obtained.
- Modeling procedure complexity: The complexity of creating an algorithm to perform a certain task.
- Modeling computational complexity: The hardware requirements, i.e., how many resources each model uses to perform the task.

## 4. Conclusions

- It is possible to utilize GP to design symbolic expressions for drive inverter modeling.
- The obtained expressions achieve high estimation performance for both the black-box inverter model and the black-box compensation scheme targets.
- With hyperparameters chosen by the random selection process, high estimation and generalization performance is achieved.

- The obtained symbolic expressions are simpler and easier to use than complex, trained AI/ML models.
- The symbolic expressions do not require all input variables to calculate the desired output, so further investigation using this approach could result in symbolic expressions with fewer input variables.
- Training the GPSR, even with 5-fold cross-validation, takes 60 min on average. This execution time is acceptable for obtaining robust symbolic expressions with high estimation accuracy.

- Initial tuning and definition of the GPSR hyperparameter ranges is a painstaking process that has to be carefully planned and executed. If this stage is done properly, the GPSR with random hyperparameter search and 5-fold cross-validation should run smoothly. However, defining the ranges is time-consuming, since for each hyperparameter the GPSR must be executed to see that hyperparameter’s influence on the performance of the algorithm.
- The extremely high correlation between some dataset variables presented a problem during the investigation, since these highly correlated variables prevented the evolution process of GPSR.
- Tuning the parsimony coefficient is the most sensitive step, since even a small variation of its value can negatively affect GPSR execution (longer execution times and lower accuracy of the obtained symbolic expressions).

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Conflicts of Interest

## Appendix A

**Figure A3.** Pearson’s correlation heatmap of the dataset variables used in the black-box compensation scheme model.
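As a rough illustration of how such a correlation analysis can flag near-duplicate inputs, the sketch below computes Pearson's correlation coefficient for every pair of variables and lists the pairs above a cutoff. The toy values, the variable keys, and the 0.99 cutoff are hypothetical and are not taken from the dataset.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson's correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def highly_correlated(columns, threshold=0.99):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    flagged = []
    for a, b in combinations(columns, 2):
        r = pearson(columns[a], columns[b])
        if abs(r) >= threshold:
            flagged.append((a, b, r))
    return flagged


# Hypothetical toy data: X9 is an exact multiple of X6, X12 is unrelated.
toy = {
    "X6": [0.1, 0.5, -0.3, 0.8],
    "X9": [0.2, 1.0, -0.6, 1.6],
    "X12": [567.0, 566.0, 568.0, 566.5],
}
for name_a, name_b, r in highly_correlated(toy):
    print(name_a, name_b, round(r, 3))
```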


**Figure 4.** An example of the symbolic expression $add(sqrt\left({X}_{1}\right),max({X}_{0},{X}_{2}))$ in tree form.
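To make the tree form concrete, the expression in Figure 4 can be written as nested tuples and evaluated recursively. This is only a sketch: the tuple encoding and the function table are illustrative choices, and the protected square root (square root of the absolute value) is an assumption about how GP handles negative arguments.

```python
import math

# Function set for this example only. The protected sqrt is an assumption.
FUNCTIONS = {
    "add": lambda a, b: a + b,
    "max": max,
    "sqrt": lambda a: math.sqrt(abs(a)),
}

# add(sqrt(X1), max(X0, X2)) as (function name, child, child, ...)
TREE = ("add", ("sqrt", "X1"), ("max", "X0", "X2"))


def evaluate(node, x):
    """Evaluate an expression tree for one input sample x (a sequence)."""
    if isinstance(node, str):            # leaf: variable "X<i>"
        return x[int(node[1:])]
    if isinstance(node, (int, float)):   # leaf: numeric constant
        return float(node)
    name, *children = node               # inner node: apply the function
    return FUNCTIONS[name](*(evaluate(child, x) for child in children))


def depth(node):
    """Depth of the tree, counting a single leaf as depth 0."""
    if not isinstance(node, tuple):
        return 0
    return 1 + max(depth(child) for child in node[1:])


print(evaluate(TREE, [1.0, 4.0, 3.0]))  # sqrt(4) + max(1, 3) = 5.0
print(depth(TREE))                      # 2
```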

**Figure 5.** The dataflow of GP with random parameter search for the case of ${\overline{U}}_{a}$ estimation.

**Figure 8.** Mean ${R}^{2}$ scores and their standard deviations achieved on the prediction of phase voltages.

**Figure 9.** Mean $MAE$ scores and their standard deviations achieved on the prediction of phase voltages.

**Figure 10.** Mean $RMSE$ scores and their standard deviations achieved on the prediction of phase voltages.

**Figure 11.** Mean ${R}^{2}$ scores and their standard deviations achieved on the prediction of duty cycles.

**Figure 12.** Mean $MAE$ scores and their standard deviations achieved on the prediction of duty cycles.

**Figure 13.** Mean $RMSE$ scores and their standard deviations achieved on the prediction of duty cycles.

**Table 1.** Input and output parameters for both research models (${d}_{a}$: duty cycle phase A; ${d}_{b}$: duty cycle phase B; ${d}_{c}$: duty cycle phase C; ${i}_{a}$: current phase A; ${i}_{b}$: current phase B; ${i}_{c}$: current phase C; ${u}_{a}$: voltage phase A; ${u}_{b}$: voltage phase B; ${u}_{c}$: voltage phase C; ${u}_{dc}$: DC-link voltage).

Black-Box Inverter Model: Inputs | Black-Box Inverter Model: Outputs | Black-Box Inverter Compensation Scheme: Inputs | Black-Box Inverter Compensation Scheme: Outputs |
---|---|---|---|
${d}_{\mathrm{a},k-3},{d}_{\mathrm{b},k-3},{d}_{\mathrm{c},k-3},{d}_{\mathrm{a},k-2},{d}_{\mathrm{b},k-2},{d}_{\mathrm{c},k-2},{i}_{\mathrm{a},k-1},{i}_{\mathrm{b},k-1},{i}_{\mathrm{c},k-1},{i}_{\mathrm{a},k},{i}_{\mathrm{b},k},{i}_{\mathrm{c},k},{u}_{\mathrm{dc},k-1},{u}_{\mathrm{dc},k}$ | ${\overline{u}}_{\mathrm{a},k-1},{\overline{u}}_{\mathrm{b},k-1},{\overline{u}}_{\mathrm{c},k-1}$ | ${\overline{u}}_{\mathrm{a},k-1},{\overline{u}}_{\mathrm{b},k-1},{\overline{u}}_{\mathrm{c},k-1},{d}_{\mathrm{a},k-3},{d}_{\mathrm{b},k-3},{d}_{\mathrm{c},k-3},{i}_{\mathrm{a},k-3},{i}_{\mathrm{b},k-3},{i}_{\mathrm{c},k-3},{i}_{\mathrm{a},k-2},{i}_{\mathrm{b},k-2},{i}_{\mathrm{c},k-2},{u}_{\mathrm{dc},k-3},{u}_{\mathrm{dc},k-2}$ | ${d}_{\mathrm{a},k-2},{d}_{\mathrm{b},k-2},{d}_{\mathrm{c},k-2}$ |

**Table 2.** The list of input and output variables used in GP for the black-box inverter model and the black-box compensation scheme. The GP symbol of each variable is given together with the results of the statistical analysis (minimum, maximum, mean, and standard deviation values).

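Model | Variable Group | Symbol | GP Variable | Min | Max | Mean | Std |
---|---|---|---|---|---|---|---|
Black-box inverter model | Duty cycles at $k-3$ | ${D}_{a,k-3}$ | ${X}_{0}$ | 0 | 1 | 0.5 | 0.21 |
Black-box inverter model | Duty cycles at $k-3$ | ${D}_{b,k-3}$ | ${X}_{1}$ | 0 | 1 | 0.50026 | 0.21 |
Black-box inverter model | Duty cycles at $k-3$ | ${D}_{c,k-3}$ | ${X}_{2}$ | 0 | 1 | 0.5 | 0.21 |
Black-box inverter model | Duty cycles at $k-2$ | ${D}_{a,k-2}$ | ${X}_{3}$ | 0 | 1 | 0.5 | 0.21 |
Black-box inverter model | Duty cycles at $k-2$ | ${D}_{b,k-2}$ | ${X}_{4}$ | 0 | 1 | 0.5 | 0.21 |
Black-box inverter model | Duty cycles at $k-2$ | ${D}_{c,k-2}$ | ${X}_{5}$ | 0 | 1 | 0.5 | 0.21 |
Black-box inverter model | Phase currents at $k-1$ | ${I}_{a,k-1}$ | ${X}_{6}$ | −7.3 | 7.47 | 0.0005 | 2.19 |
Black-box inverter model | Phase currents at $k-1$ | ${I}_{b,k-1}$ | ${X}_{7}$ | −6.32 | 6.66 | −0.007 | 2.15 |
Black-box inverter model | Phase currents at $k-1$ | ${I}_{c,k-1}$ | ${X}_{8}$ | −7.113 | 7.437 | −0.008 | 2.21 |
Black-box inverter model | Phase currents at $k$ | ${I}_{a,k}$ | ${X}_{9}$ | −7.47 | 7.47 | 0.0005 | 2.19 |
Black-box inverter model | Phase currents at $k$ | ${I}_{b,k}$ | ${X}_{10}$ | −6.32 | 6.668 | −0.007 | 2.15 |
Black-box inverter model | Phase currents at $k$ | ${I}_{c,k}$ | ${X}_{11}$ | −7.1123 | 7.437 | −0.008 | 2.21 |
Black-box inverter model | DC-link voltage at $k-1$ | ${U}_{dc,k-1}$ | ${X}_{12}$ | 548.013 | 575.55 | 567.13 | 4.99 |
Black-box inverter model | DC-link voltage at $k$ | ${U}_{dc,k}$ | ${X}_{13}$ | 548.013 | 575.55 | 567.13 | 4.99 |
Black-box inverter model | Mean phase voltages at $k-1$ | ${\overline{U}}_{a,k-1}$ | ${y}_{ua}$ | −2.28 | 573.33 | 283.41 | 114.64 |
Black-box inverter model | Mean phase voltages at $k-1$ | ${\overline{U}}_{b,k-1}$ | ${y}_{ub}$ | −2.087 | 573.2 | 283.46 | 114.29 |
Black-box inverter model | Mean phase voltages at $k-1$ | ${\overline{U}}_{c,k-1}$ | ${y}_{uc}$ | −2.31 | 573.17 | 283.74 | 114.6 |
Black-box inverter compensation scheme | Mean phase voltages at $k-1$ | ${\overline{U}}_{a,k-1}$ | ${X}_{0}$ | −2.288 | 573.33 | 283.41 | 114.6 |
Black-box inverter compensation scheme | Mean phase voltages at $k-1$ | ${\overline{U}}_{b,k-1}$ | ${X}_{1}$ | −2.088 | 573.2 | 283.46 | 114.2 |
Black-box inverter compensation scheme | Mean phase voltages at $k-1$ | ${\overline{U}}_{c,k-1}$ | ${X}_{2}$ | −2.31 | 573.17 | 283.74 | 114.6 |
Black-box inverter compensation scheme | Phase currents at $k-3$ | ${I}_{a,k-3}$ | ${X}_{3}$ | −7.3 | 7.47 | 0.0005 | 2.19 |
Black-box inverter compensation scheme | Phase currents at $k-3$ | ${I}_{b,k-3}$ | ${X}_{4}$ | −6.32 | 6.668 | −0.007 | 2.15 |
Black-box inverter compensation scheme | Phase currents at $k-3$ | ${I}_{c,k-3}$ | ${X}_{5}$ | −7.11 | 7.437 | −0.008 | 2.21 |
Black-box inverter compensation scheme | Phase currents at $k-2$ | ${I}_{a,k-2}$ | ${X}_{6}$ | −7.3 | 7.47 | 0.0005 | 2.19 |
Black-box inverter compensation scheme | Phase currents at $k-2$ | ${I}_{b,k-2}$ | ${X}_{7}$ | −6.32 | 6.668 | −0.007 | 2.15 |
Black-box inverter compensation scheme | Phase currents at $k-2$ | ${I}_{c,k-2}$ | ${X}_{8}$ | −7.113 | 7.437 | −0.008 | 2.21 |
Black-box inverter compensation scheme | DC-link voltage at $k-3$ | ${U}_{dc,k-3}$ | ${X}_{9}$ | 548.013 | 575.55 | 567.13 | 4.99 |
Black-box inverter compensation scheme | DC-link voltage at $k-2$ | ${U}_{dc,k-2}$ | ${X}_{10}$ | 548.013 | 575.55 | 567.13 | 4.99 |
Black-box inverter compensation scheme | Duty cycles at $k-2$ | ${D}_{a,k-2}$ | ${y}_{da}$ | 0 | 1 | 0.5 | 0.21 |
Black-box inverter compensation scheme | Duty cycles at $k-2$ | ${D}_{b,k-2}$ | ${y}_{db}$ | 0 | 1 | 0.5 | 0.21 |
Black-box inverter compensation scheme | Duty cycles at $k-2$ | ${D}_{c,k-2}$ | ${y}_{dc}$ | 0 | 1 | 0.5 | 0.21 |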

Hyperparameter Name | Lower Bound | Upper Bound |
---|---|---|
Population size | 100 | 500 |
Number of generations | 100 | 200 |
Tournament size | 10 | 50 |
Init depth | (3–7) | (8–15) |
Crossover coefficient | 0.001 | 1 |
Subtree mutation | 0.001 | 1 |
Hoist mutation | 0.001 | 1 |
Point mutation | 0.001 | 1 |
Stopping criteria | 0 | $1\times {10}^{-6}$ |
Maximum samples | 0.99 | 1 |
Constant range | −10,000 | 10,000 |
Parsimony coefficient (mean phase voltages) | $1\times {10}^{-3}$ | $1\times {10}^{-1}$ |
Parsimony coefficient (duty cycles) | $1\times {10}^{-10}$ | $1\times {10}^{-4}$ |
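A random draw from the ranges tabulated above might look like the following sketch. The dictionary keys and the function name are hypothetical. Several details are assumptions rather than statements from the text: uniform sampling is used throughout, the four genetic-operator coefficients are resampled until their sum does not exceed 1 (a common constraint in GP implementations), and the constant range is drawn as one negative and one positive bound.

```python
import random

OPERATORS = ("crossover", "subtree_mutation", "hoist_mutation", "point_mutation")


def sample_hyperparameters(rng, parsimony_range=(1e-3, 1e-1)):
    """Draw one random GPSR configuration from the tabulated ranges.

    parsimony_range defaults to the mean-phase-voltage range; pass
    (1e-10, 1e-4) for the duty-cycle targets.
    """
    # Assumption: operator coefficients must sum to at most 1, so
    # resample until the draw is feasible (rejection sampling).
    while True:
        ops = {name: rng.uniform(0.001, 1.0) for name in OPERATORS}
        if sum(ops.values()) <= 1.0:
            break

    params = {
        "population_size": rng.randint(100, 500),
        "generations": rng.randint(100, 200),
        "tournament_size": rng.randint(10, 50),
        # Init depth is a (min, max) pair: min from 3-7, max from 8-15.
        "init_depth": (rng.randint(3, 7), rng.randint(8, 15)),
        "stopping_criteria": rng.uniform(0.0, 1e-6),
        "max_samples": rng.uniform(0.99, 1.0),
        # Assumption: one negative and one positive bound within +/-10,000.
        "const_range": (rng.uniform(-10_000.0, 0.0), rng.uniform(0.0, 10_000.0)),
        "parsimony_coefficient": rng.uniform(*parsimony_range),
    }
    params.update(ops)
    return params


rng = random.Random(42)  # fixed seed so the draw is reproducible
for key, value in sample_hyperparameters(rng).items():
    print(key, value)
```

Because the duty-cycle parsimony range spans six orders of magnitude, a log-uniform draw may be a more even alternative to the uniform draw used here.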

**Table 4.** The GPSR hyperparameters used for the definition of the symbolic expressions for phase voltage estimation.

Hyperparameter | Phase A | Phase B | Phase C |
---|---|---|---|
Population size | 343 | 313 | 254 |
Number of generations | 143 | 166 | 188 |
Tournament selection size | 19 | 10 | 23 |
Initial depth | (3, 9) | (3, 8) | (3, 10) |
Crossover coefficient | 0.12793209 | 0.046102316 | 0.069413606 |
Subtree mutation coefficient | 0.282640797 | 0.637264456 | 0.746384432 |
Hoist mutation coefficient | 0.396828111 | 0.105422777 | 0.02853356 |
Point mutation coefficient | 0.029499366 | 0.205518506 | 0.155416003 |
Stopping criteria | 0.000318147 | 0.000915282 | 0.000625943 |
Maximal number of samples | 0.958912563 | 0.902661416 | 0.906783421 |
Constant range | (−5221.42, 6286.92) | (−5539.63, 2462.81) | (−5206.02, 6947.92) |
Parsimony coefficient | 0.000715945 | 0.000944609 | 0.000823575 |

**Table 5.** The genetic programming parameters used for the definition of the symbolic expressions for duty cycle estimation.

Hyperparameter | Phase A | Phase B | Phase C |
---|---|---|---|
Population size | 498 | 258 | 299 |
Number of generations | 167 | 171 | 121 |
Tournament selection size | 28 | 47 | 44 |
Initial depth | (6, 15) | (7, 9) | (3, 9) |
Crossover coefficient | 0.216988623 | 0.07 | 0.05 |
Subtree mutation coefficient | 0.29 | 0.22879136 | 0.72 |
Hoist mutation coefficient | 0.143122346 | 0.334293786 | 0.048 |
Point mutation coefficient | 0.270236321 | 0.047146706 | 0.085 |
Stopping criteria | $9.49\times {10}^{-7}$ | $3.24\times {10}^{-7}$ | $9.28\times {10}^{-7}$ |
Maximal number of samples | 0.990992282 | 0.994145103 | 0.999404119 |
Constant range | (−9955.02, 2292.48) | (−8145.73, 1432.6) | (−7631.24, 5066.85) |
Parsimony coefficient | $9.92\times {10}^{-8}$ | $8.37\times {10}^{-8}$ | $3.56\times {10}^{-5}$ |

Criterion | GPSR Modeling | ML Modeling | Deterministic Modeling |
---|---|---|---|
Model complexity | Low | High | Medium |
Model performance | High | High | High |
Model execution time | Low | High | Medium |
Modeling procedure complexity | Low | Low | Medium |
Modeling computational complexity | High | High | Low |


© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Glučina, M.; Anđelić, N.; Lorencin, I.; Baressi Šegota, S.
Drive System Inverter Modeling Using Symbolic Regression. *Electronics* **2023**, *12*, 638.
https://doi.org/10.3390/electronics12030638
