Multi-Stage Optimization of Induction Machines Using Methods for Model and Parameter Selection

Abstract: Optimization methods are increasingly used in the design process of electrical machines. The quality of the optimization result and the necessary simulation effort depend on the optimization methods, machine models and optimization parameters used. This paper presents a multi-stage optimization environment for the design optimization of induction machines. It uses the strategies of simulated annealing, evolution strategy and pattern search. Artificial neural networks are used to reduce the solution effort of the optimization. The electromagnetic machine model for each optimization stage is chosen using a methodical model selection approach, and the optimization parameters are chosen using a methodical parameter selection approach. The optimization environment is applied to the design of an electric traction machine, using an induction machine as an example, and its suitability for machine design is verified by a comparison with a reference machine.


Introduction
Optimization methods are increasingly used in the electromagnetic design and redesign process of electrical machines. The selection of the electromagnetic machine model and the optimization methods has a great influence on the computational effort and the convergence of the optimization.
Optimization methods can be divided into deterministic and stochastic methods. A well-known deterministic method is Pattern Search (PS) [1,2]. Among the most commonly used stochastic methods are evolutionary algorithms such as Genetic Algorithms (GA) [3,4] or Evolution Strategy (ES) [5], Particle Swarm Optimizations [6], and Simulated Annealing (SA) [4]. Both deterministic and stochastic optimization methods are applied in the field of electric machine optimization. In addition, couplings of deterministic and stochastic methods, such as GA coupled with PS [7], are applied.
In the field of electrical machines, multi-objective optimization is often considered, which allows the machine to be optimized with respect to several objective functions. In these optimizations, stochastic methods such as GA or ES and the Design of Experiments (DoE) are used. Other methods are the sequential optimization method [8] and multi-level or multi-stage optimization [3,9–11]. In some of these multi-stage optimizations, a successive two-stage optimization using one optimization method is conducted. The optimization parameters are divided into significant and less significant parameters, and the parameter groups thus defined are varied or kept constant depending on the current stage. Examples of such methods can be found in [3,9] using a GA and in [10,11] using DoE as the optimization method. An overview and further literature on optimization methods in the field of electrical machines can be found in [12,13].
The machine models can be divided into direct and indirect machine models. Direct models include numerical ones such as the Finite Element Method (FEM) and analytical ones such as the Equivalent Circuit Diagram (ECD) model. They differ in their range of values of the modeled effects, their level of detail and their computational effort. In the field of optimizing synchronous machines [14,15] and synchronous reluctance machines [16], machine modeling is often performed using the FEM. This is possible due to the negligible transient effects and the resulting lower computational effort compared to the consideration of the FEM in the optimization of an Induction Machine (IM) [17,18]. For the IM, transient effects can no longer be neglected without accepting a significant reduction of the level of detail, resulting in a high computational effort for the FEM. Therefore, lower order models like analytical ECD models are applied [19][20][21][22].
In machine optimization environments, a very high number of machine simulations can be required. In this case, the FEM and other analytical methods can lead to very high computational effort. To reduce this high computational effort indirect machine models like the Response Surface Model (RSM) [8], Kriging Model [8,10] or Artificial Neural Network (ANN) are used. These surrogate models replace the machine model and estimate the output parameters of the machine based on the input parameters.
This paper presents a multi-stage optimization environment for IM design optimization that combines the advantages of several of the described optimization methods. The methods used are SA by [23], ES by [24], and PS by [1]. While the stochastic SA method has good global convergence with low local convergence speed, the ES method is known for stable convergence in the local group [25]. PS, as a deterministic method, provides a tool for fast local convergence [26]. Both the application of the successive ES-PS optimization and the previously executed SA stage improve the convergence behavior in this case. In all these stages, direct machine models are used for electromagnetic modeling. The successive ES-PS method also reduces the computational effort compared to the single-stage ES-PS method. The increased computational effort due to the use of SA is compensated for by the application of an indirect machine model in the form of an ANN. This leads to a multi-stage optimization environment that combines the advantages of deterministic and stochastic optimization methods with those of direct and indirect model building. The selection of electromagnetic machine models and optimization parameters in each stage is performed methodically using the model selection and parameter selection approaches presented in [27]. Thus, in each stage, the model can be adjusted according to the desired range of values and level of detail.
The presented optimization environment is used, by way of example, to design an IM as a traction drive for a small vehicle. The aim of the optimization is to minimize the losses occurring over a given driving cycle while at the same time minimizing the required installation space. Starting with a rough design of the machine, it is optimized using the optimization methods presented, considering geometric and thermal constraints. The resulting machine is compared with a reference machine, which is used as a benchmark. The simulation results of this example show a good robustness of the multi-stage optimization environment, including SA, a successive ES-PS method and the use of an ANN. Compared to classical multi-stage optimizations using one optimization method and one type of machine model, as in [3,9–11], it shows improved convergence behavior. It can therefore be used for the design process and the design optimization of IMs.

Optimization Environment and Optimization Methods
The classical optimization methods can be categorized into stochastic and deterministic methods. While deterministic methods realize a fast convergence to a local optimum, stochastic methods enable the search for a global optimum. Furthermore, stochastic methods offer other advantages, including easy consideration of constraints and numerical stability avoiding the use of derivatives [25]. A detailed review of optimization methods in the context of electrical machines is presented in [12,28].
In this paper, with the SA, the ES and the PS, three different optimization methods are combined in one optimization environment, thus exploiting synergies. While the stochastic SA method exhibits good global convergence with low local convergence speed, the ES method is known for stable convergence in the local group [25]. PS, as a deterministic method, provides a tool for fast local convergence [26]. In the following, the multi-stage optimization environment used in this work is presented. First, the overall structure of the optimization environment is explained and then the individual parts of it are described in more detail.

Structure of the Optimization Environment
The optimization environment used in this work is shown in Figure 1 and consists of several optimization steps and further functionalities to reduce the simulation effort.
The input of the optimization environment consists of the problem description, the electromagnetic IM models used in the respective optimization stages and the optimization parameters that are varied during the optimization. The problem description includes the physical problem including its constraints and decision parameters used to define the objective function and the fitness function respectively. The electromagnetic machine models used in the respective optimization stages are defined using a methodical model selection approach presented in [27]. Based on the problem description, this determines the most suitable problem-specific IM model for the individual stage of the optimization process. This results from the fact that, in a global search, if necessary, a lower level of detail of the model is sufficient for a rough estimate of the fitness value, whereas in a local search a high precision is required to converge to the actual minimum. The optimization parameters are also defined methodically. Building on the model selection methodology, the methodological parameter selection approach presented in [27] is used to determine the variable optimization parameters based on their influence on the optimization problem. For these parameters, possible lower and upper bounds are described.
The optimization itself consists of five optimization stages. The result of each stage represents the initial solution of the further stages. The first stage is performed using the SA method. This features good global convergence, but has a low local convergence speed. The SA optimization is used to identify a local group of solutions. The following stages involve a successive hybrid optimization method with a faster local convergence. A hybrid optimization method is a combination of a stochastic and a deterministic search method. In this paper, the combination of the ES method with the PS method forms such a hybrid method. In the ES method, both the required population size and the required number of generations increase for stable convergence with an increasing number of optimization parameters and thus the dimensions of the solution space. This relationship can be described as a function of the number of optimization variables n by O(n²) [24]. To reduce the solution effort, the hybrid optimization is performed successively in two consecutive steps. In the first step, significant optimization parameters are varied and less significant optimization parameters are set to constant values. In the second step, the parameters optimized in the first step are assumed to be constant and the less significant parameters are varied. The classification of the optimization parameters into significant and less significant parameters is done by the user.
The hybrid successive optimization approach results in a reduction of the solution space and thus also of the computational effort within the individual stages. Since there is no holistic global search for the optimum, but the stages act independently of each other, this is a heuristic approach to solve the optimization problem.
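The stage chaining described above, in which each stage's result seeds the next, can be sketched as follows; the function and stage names are illustrative assumptions, not the paper's implementation.

```python
def multi_stage_optimize(x0, stages):
    """Chain optimization stages: the result of each stage is the
    initial solution of the next. Each `stage` is any callable that
    maps a start solution to an improved one (e.g. SA, then the
    successive ES-PS steps)."""
    x = x0
    for stage in stages:
        x = stage(x)
    return x
```

Because the stages act independently rather than performing one holistic global search, the chaining itself encodes the heuristic character of the approach.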
To further reduce the simulation effort, an ANN is introduced. This is used to determine the objective function without the need for electromagnetic simulation. The database for the training, selection and testing instances of the ANN is set up during the first stage of the optimization environment. Once the minimum size of the database has been reached, the ANN is constructed and applied in the further course of the optimization.
In the following, the individual parts of this multi-stage optimization environment are explained in more detail. First, the model and parameter selection approach are introduced. Second, the used optimization methods and the fitness function used by all optimization methods are explained. Finally, the ANN used is described.

Model Selection Approach in the Optimization
Whereas in the global search a lower level of detail of the model may be sufficient for a rough estimation of the fitness value, in the local search, a high precision is required to converge to the true minimum. For this reason, the model selection approach presented in [27] is used to methodically assign different machine models of the IM to each step of the optimization environment. In the context of the optimization environment, the output variables and effects to be considered describe those variables, which influence the decision parameters of the optimization problem. The use of the model selection approach will allow different levels of detail to be considered and defined in the individual optimization steps, allowing both precision and solution effort to be flexibly adjusted.
The model selection approach assigns a constant model to the SA optimization. Since SA implements a global search to identify a suitable local group, a lower level of detail may be sufficient in this step, which in turn reduces the solution cost of the resulting model.
For the ES method, a model vector of arbitrary length and increasing level of detail is used. The desired level of detail is specified, and the most suitable model which fulfils it is determined by the model selection approach and added to the model vector. In general, the model selection does not distinguish between the first and second stage in the case of the successive optimization. The ES optimization starts in both stages with the first model in the model vector. If the minimum fitness value f(x_min,k) of generation k has not changed by more than a tolerance ε over a given number of generations n, i.e., if

| f(x_min,k) − f(x_min,k−n) | ≤ ε (1)

is satisfied, the next most accurate machine model in the model vector is selected. This detects convergence to a local minimum and supports it with a more precise model. Provided that the criterion from (1) is met again, the next model in the model vector is selected, and so on.

Since the PS method is a deterministic and thus local optimization method, a high level of detail of the machine model of the IM is required. Otherwise, the geometry changes required for convergence to the actual minimum may not be adequately represented. This may even lead to a degradation of the solution. The model selection approach assigns a constant model to the PS optimization, which is also independent of the stage of the successive optimization. This model is already used in the last iteration of the ES method in order to support the local convergence there with a more precise model and thus to prepare the use of the PS optimization.
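The model-switch criterion can be sketched as a simple stagnation test on the history of best fitness values per generation; the function name and list-based interface are assumptions for illustration.

```python
def should_switch_model(best_fitness_history, n, eps):
    """Switch to the next, more detailed model in the model vector when
    the best fitness f(x_min,k) has changed by no more than the
    tolerance eps over the last n generations."""
    if len(best_fitness_history) < n + 1:
        return False  # not enough generations observed yet
    recent = best_fitness_history[-(n + 1):]
    return max(recent) - min(recent) <= eps
```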

Parameter Selection Approach in the Optimization
The optimization parameters of the machine geometry optimization represent those geometry parameters which are varied during the optimization process in order to find a better solution. These parameters are intended to cover as many of the possible degrees of freedom of the geometry as possible. With a higher number of optimization parameters, the number of dimensions of the search space increases, and so does the required solution effort. It is therefore desirable to choose those geometry parameters that have the greatest influence on the searched output variables and the lowest degree of redundancy. Therefore, the choice of optimization parameters in this work is made using the parameter selection method presented in [27]. The approach for the parameter selection is problem-specific and is also influenced by the selected system model. This is a result of the approach for the model selection, which must be applied accordingly in advance.
In the approach, the sensitivities and the elasticities with respect to the output variables relevant for the optimization are examined for each geometry parameter. The optimization parameters are sorted according to their elasticity and the parameters with the greatest influence on the optimization problem are identified. The sensitivity and elasticity analysis used is briefly described in the following.

Sensitivity Analysis
The sensitivity analysis is based on the approach of the local sensitivity analysis in [9,10] and is carried out on the basis of the given initial design of the IM. For this purpose, the individual geometric parameters of the machine are varied by a user-defined factor r_sens. A change that is too small leads to numerical instabilities, while changes that are too large may lead to inconsistent machine geometries. If the sensitivity to a geometry parameter is to be examined, the influence of this parameter on other geometry variables must also be taken into account. These influences are described using a correlation matrix, which specifies for each variable geometry parameter which other geometry parameters also have to be changed. Both machines, the initial design and the slightly changed machine geometry, are simulated, and the influence on the objective function is assessed by comparing the output variables.
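A minimal sketch of the local sensitivity step, assuming a scalar-valued `simulate` function and a dict-based geometry; the correlation-matrix handling of dependent parameters is omitted for brevity.

```python
def local_sensitivity(simulate, base_geometry, param, r_sens=0.01):
    """Vary one geometry parameter by the user-defined factor r_sens,
    simulate both designs and return the relative change of the
    output, which assesses the influence on the objective function."""
    varied = dict(base_geometry)
    varied[param] = base_geometry[param] * (1.0 + r_sens)
    y0 = simulate(base_geometry)
    y1 = simulate(varied)
    return (y1 - y0) / y0
```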

Elasticity Analysis
The elasticity describes the ratio of the change of an output quantity to the change of the input quantity, and therefore, in the case of the sensitivity analysis for a physical quantity, results in the ratio of the relative change of the output variable to r_phys, where r_phys describes the relative change of the physical quantity between the two simulated machine models. This captures, for each model, how much a given output quantity changes when the parameter under investigation is modified.
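The elasticity described above reduces to a ratio of relative changes; a one-line sketch with illustrative names:

```python
def elasticity(rel_output_change, r_phys):
    """Elasticity: relative change of the output quantity divided by
    the relative change r_phys of the varied physical quantity."""
    return rel_output_change / r_phys
```

For example, if a 10% parameter change (r_phys = 0.1) produces a 21% output change, the elasticity is 2.1.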

Simulated Annealing
The basic principle of the metaheuristic optimization method of SA according to [23] describes the quality of a possible solution or an individual x by means of a fitness function f. The goal of the optimization is to minimize the fitness f(x). Starting from an initial solution x_0, a random selection of new individuals x_k+1 within the solution space is performed. A new individual is accepted and not discarded if one of the constraints

f(x_k+1) ≤ f(x_k) (3)

or

P = exp( −( f(x_k+1) − f(x_k) ) / T_k ) > r (4)

with a uniformly distributed random number r ∈ [0, 1) is satisfied. Thereby, (4) describes, based on Boltzmann statistics, the probability P with which an intermediate result of the optimization may deteriorate. This realizes the possibility to leave a local optimum in the search for the global optimum. For this purpose, a temperature T_k is introduced, which is successively reduced over the iterations. The lower the temperature, the lower the probability of accepting a worse solution. Accordingly, the algorithm converges with the "cooling" of the solution space.
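The acceptance rule corresponds to the classical Metropolis criterion and can be sketched as follows; the function interface is an illustrative assumption.

```python
import math
import random

def sa_accept(f_new, f_old, T, rng=random.random):
    """Accept a new individual if it improves the fitness, or, for a
    deterioration, with the Boltzmann probability
    P = exp(-(f_new - f_old) / T)."""
    if f_new <= f_old:
        return True
    return rng() < math.exp(-(f_new - f_old) / T)
```

As T decreases, the probability of accepting a worse solution vanishes, which is exactly the "cooling" behavior described above.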

Application of SA for the IM Optimization
The initial solution x 0 of the SA process in the presented IM optimization process is a roughly designed machine geometry. The rough design is based on problem-specific design parameters, such as the rated power, and is computed according to [29]. An individual of the SA optimization is defined by a chromosome. This contains all geometry parameters described in the optimization variables. The optimization variables are determined by the parameter selection approach in [27].
In the SA, as well as in the ES and PS methods, the parameters of the chromosome are changed, resulting in new designs of the IM. The new machine designs are calculated, like the initial solution, according to the chromosome and [29]. The evaluation of an individual based on its chromosome is done by a fitness function, which will be discussed in Section 2.6.
In this work, new individuals are determined using a normally distributed random vector X, whose normalization results in the random vector y with an expected value of zero and a standard deviation of one. The number of dimensions is equal to the number of optimization parameters n. From the chromosome x_k of the current individual and the updated temperature T_k, the new individual is determined by (6), where the operator (∗) describes an element-wise vector multiplication. The temperature adaptation is performed according to the Boltzmann annealing

T_k = T_0 / ln(1 + k), (7)

since this method guarantees global convergence for sufficiently large starting temperatures T_0 [30]. However, since only very high starting temperatures ensure global convergence, while these result in low convergence rates, the technique of Very Fast Re-Annealing (VFR) is used. This increases the temperature after a given number of iterations to avoid the method converging to local minima [31].
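A sketch of the Boltzmann annealing schedule; the exact form used in the paper (and the VFR re-heating points) may differ in detail.

```python
import math

def boltzmann_temperature(T0, k):
    """Boltzmann annealing: T_k = T0 / ln(1 + k) for iteration k >= 1.
    The slow logarithmic decay is what guarantees global convergence
    for sufficiently large starting temperatures T0."""
    return T0 / math.log(1.0 + k)
```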
If one or more dimensions of the resulting chromosome lie outside the solution domain, they are set to the upper or lower bound, depending on which bound was violated. This results in the chromosome x*_k+1, which, in a convex combination with the current chromosome, realizes a valid solution (8) via a random variable α uniformly distributed between zero and one. The chromosome resulting from (6) or (8) can subsequently be accepted or rejected analogously to the criteria described in Section 2.4. By considering the step-wise cooling temperature in the description of the new chromosome, local convergence is realized. The termination criterion of the SA is a given number of iterations in which the change in fitness of the best individual is less than a defined tolerance.
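The boundary handling and convex-combination repair can be sketched as follows; the list-based chromosome representation is an illustrative assumption.

```python
import random

def repair_chromosome(x_new, x_cur, lower, upper, rng=random.random):
    """Clip genes that violate the bounds (giving x*_{k+1}), then form
    a convex combination with the current chromosome,
    x_{k+1} = a * x_cur + (1 - a) * x_clipped with a ~ U(0, 1),
    which yields a valid solution inside the domain."""
    x_clipped = [min(max(g, lo), hi)
                 for g, lo, hi in zip(x_new, lower, upper)]
    a = rng()
    return [a * c + (1.0 - a) * s for c, s in zip(x_cur, x_clipped)]
```

Since both x_cur and the clipped chromosome lie inside the bounds, any convex combination of them does too.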

Hybrid Optimization: Evolution Strategy and Pattern Search
The stochastic method used in the hybrid optimization is the ES according to [24]. The method is based on populations of individuals in the solution space, in which a generation with µ parents creates λ children via crossover and mutation. A distinction is made between the plus strategies (µ + λ), in which a new generation can be composed of parents and children, and the comma strategies (µ, λ), which only consider the best descendants for the next generation of parents [24]. The method used in this work is based on the comma strategy, as this introduces a maximum "lifetime" of the individuals and thus counteracts premature local convergence. The method is structured in an initialization, a selection, a mutation, a crossover and an inheritance process. The method, especially in the case of mutation and crossover, is problem-specific. Gaussian distributions can be used as a statistical basis. Their variance is flexibly adapted as a function of various parameters in order to achieve good local convergence while at the same time enabling a global search. This achieves the already mentioned stable convergence in the local group. Due to the generation principle, there is the inherent possibility of parallelization, which results in a reduced computing time compared to other stochastic optimization methods [32].

The deterministic method of the hybrid optimization in this work is the PS method. In contrast to other deterministic optimization methods, PS as a direct optimization method does not require a gradient of the fitness function. In addition to the associated numerical stability avoiding the use of derivatives, a local search for problems that are neither continuous nor differentiable is enabled [1]. In this way, a fast local convergence can be realized especially for complex optimization tasks with problem-specific boundary conditions. In this work, the method according to [1] is used.
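The selection step of the comma strategy, in which only the children compete for the next parent generation, can be sketched as:

```python
def comma_selection(children, mu, fitness):
    """(mu, lambda) selection: choose the mu best children (by fitness,
    lower is better) as the next parent generation. Parents are always
    discarded, limiting each individual's "lifetime" to one generation
    and counteracting premature local convergence."""
    return sorted(children, key=fitness)[:mu]
```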

Application of the Hybrid Optimization Method for the IM Optimization
The implementation of the two optimization methods is as previously described. In the following, the implemented crossover and mutation of individuals based on random distributions in the ES optimization will be discussed. For this purpose, normal distributions are used in the presented optimization environment, whose standard deviations and expected values are adjusted via various parameters depending on the situation.

Crossover
Crossover of individuals is performed using the (2,1)-strategy: from two parents, separated by their chromosomes x_p1 and x_p2, a descendant with the chromosome x_c is generated by means of a convex function with a normally distributed random vector. The expected value µ and the standard deviation σ are thereby influenced by different parent- and population-specific factors. The aim of the crossover is to project the properties of the parent with the lower fitness value f(x_p) more strongly onto the descendant. For this purpose, a fitness factor is introduced, which is derived from the fitness values of the parents and shifts the expected value toward the parent with the lower fitness value. Since a larger distance between the two parents increases the uncertainty of how the fitness behaves in the solution space between the parent chromosomes, a distance factor is introduced. This describes the deviation d_p12 of the two parents, normalized to the maximum value d_max occurring among all parent pairs in the current generation. This factor shifts the expected value toward the parent with the lower fitness when the parents are far apart and reduces the standard deviation, counteracting the uncertainty in the space between them. The closer an individual's fitness is to the minimum fitness f(x_min) of a parent in the current generation, the lower the variation of the descendant should be in order to allow local convergence. This is achieved by an overall fitness factor, which affects both the expected value and the standard deviation. The expected value of the random vector X shifts the descendant's chromosome toward the parent with the lower fitness value as a function of the factors introduced. The resulting standard deviation considers an adjustable maximum standard deviation σ_crossover in addition to the factors described.
This should be set as a function of the elasticities of the optimization parameters in order to respond to large elasticities with a lower variance, thereby counteracting larger jumps through the solution space and thus improving the convergence behavior in the local group.
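A simplified sketch of the convex crossover: the concrete factor definitions in the paper are not reproduced here, so the fitness factor below is a hypothetical stand-in that merely shifts the expected value toward the fitter (lower-fitness) parent, and the distance and overall fitness factors are omitted.

```python
import random

def fitness_factor(f_p1, f_p2):
    """Hypothetical weighting in [0, 1]: the lower a parent's fitness
    value, the larger its weight on the descendant."""
    total = f_p1 + f_p2
    return f_p2 / total if total > 0 else 0.5

def crossover(x_p1, x_p2, f_p1, f_p2, sigma_crossover, gauss=random.gauss):
    """Generate one descendant between two parents: the expected value
    lies on the line between the parent chromosomes, shifted toward
    the fitter parent; sigma_crossover bounds the spread."""
    w = fitness_factor(f_p1, f_p2)
    return [gauss(w * g1 + (1.0 - w) * g2, sigma_crossover)
            for g1, g2 in zip(x_p1, x_p2)]
```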

Mutation
Mutation of a selected parent with the chromosome x_p is performed using a normally distributed random vector analogous to (10). The descendant's chromosome x_c is calculated by an update rule in which the operator (∗) describes an element-wise vector multiplication. Here, the random vector has an expected value of µ = 0, whereas the standard deviation, analogous to the crossover of individuals, is adapted depending on various parameters. Through these factors, local convergence in particular is to be realized. For this purpose, the overall fitness factor analogous to (13) is introduced. This reduces the variance as a function of the fitness value.
In addition, a generation factor is defined for the current generation g_k relative to the maximum number of generations g_max. This factor reduces the standard deviation across generations, which also supports the local convergence of mutated descendants. The resulting standard deviation of the random vector describing the mutation follows accordingly. The adjustable maximum standard deviation of the mutation σ_mutation should be adapted, as in the case of the crossover, depending on the elasticity of the optimization parameters.
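A sketch of the mutation with a generation factor; the linear form 1 − g_k/g_max is a hypothetical choice illustrating only the shrinking standard deviation, and the overall fitness factor is omitted.

```python
import random

def mutate(x_p, sigma_mutation, g_k, g_max, gauss=random.gauss):
    """Element-wise Gaussian mutation of a parent chromosome whose
    standard deviation shrinks across generations via a generation
    factor, supporting local convergence late in the run."""
    gen_factor = 1.0 - g_k / g_max
    sigma = sigma_mutation * gen_factor
    return [g + gauss(0.0, 1.0) * sigma for g in x_p]
```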

Fitness Function
The fitness function is identical for all stages of the optimization environment. It assigns an individual x a fitness value f(x) based on its chromosome, considering geometric and thermal constraints as well as a given driving cycle. This fitness value thereby also categorizes inadmissible machine geometries, depending on the number of fulfilled boundary conditions and the achievable points of the driving cycle. Through this categorization, inadmissible solutions can have different fitness values, which improves the convergence behavior of the optimization methods and, in particular, realizes a faster search for admissible solutions. The maximum fitness of an individual is given as a function of the number of constraints to be considered, n_BC, and the number of points in the driving cycle, n_DC. This allows a clear differentiation and thus categorization of the fitness values. The procedure to evaluate the fitness of an individual is based on a sequential process:
1. Creation of the machine geometry based on the chromosome and the necessary design parameters.
2. Verification of the geometric boundary conditions. In case of violations, the fitness value of the individual results as a function of the number of satisfied geometry boundary conditions n_BC,Geo(x).
3. Operating map calculation of the geometrically permissible individual using the selected machine model.
4. Verification of the thermal boundary conditions using a thermal simulation. If the given maximum temperatures are exceeded, the fitness results as a function of the number of all geometry boundary conditions n_BC,Geo.
5. Determination of the individual fitness of an admissible individual by means of the weighted problem-specific decision parameters p_j,indiv, the decision parameters of a reference machine p_j,ref, and the sum of the weighting factors w_i.
In case of an invalid solution due to the non-achievement of several operating points in the driving cycle, an additional penalty term is considered. Here, v_DC(x) stands for the share of unreachable operating points in the total number of operating points of the driving cycle, and ∆_offset for an offset separating the invalid solutions from the admissible ones.
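The banded fitness evaluation can be sketched as follows; the band values and the offset handling are hypothetical, illustrating only the categorization idea (inadmissible solutions always rank worse than admissible ones, and rank better the more constraints they satisfy).

```python
def categorized_fitness(n_bc_sat, n_bc, cycle_share_missed, quality,
                        delta_offset):
    """Categorized fitness (lower is better): constraint violations map
    above the offset, scaled by the number of unsatisfied constraints;
    driving-cycle violations add the share v_DC of missed points;
    admissible designs return their weighted quality value directly."""
    if n_bc_sat < n_bc:
        return delta_offset + (n_bc - n_bc_sat)
    if cycle_share_missed > 0.0:
        return delta_offset + cycle_share_missed
    return quality
```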

Reduction of the Solution Effort
In order to realize a stable convergence of the optimization environment with a solution space with sufficiently many degrees of freedom for the variation of the machine geometry, many iterations and large populations are required due to the stochastic optimization methods used. This yields a long computation time, particularly for machine models with a high number of degrees of freedom. Depending on the required level of detail of the optimization problem, which influences the model resulting from the model selection methodology, the problem may thus not be solvable within a few days. A reduction of the solution effort is therefore desirable. The solution effort can be reduced by a preselection of the individuals to be simulated. In this work, the preselection is realized by estimating the fitness values of the individuals using an ANN. This estimates, for each individual, those decision parameters of the optimization environment that require a time-intensive operating map simulation, such as the mean losses over a given driving cycle. Geometry-dependent decision parameters, such as the volume or mass of the machine, on the other hand, are computed directly, so that a fast estimate of an individual's fitness can be obtained from the estimate of the ANN together with the individual geometry parameters. Based on these estimates, solutions are discarded and only those individuals promising a better fitness than the current minimum are simulated. In the following, the ANN and its application will be discussed.
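The preselection described above reduces to a filter over the candidate individuals; `estimate_fitness` stands in for the trained surrogate model.

```python
def preselect(candidates, estimate_fitness, f_min):
    """Simulate only those individuals whose estimated fitness promises
    an improvement over the current minimum f_min; all others are
    discarded without a time-intensive operating map simulation."""
    return [x for x in candidates if estimate_fitness(x) < f_min]
```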

Artificial Neural Networks
Various transfer functions exist whose suitability depends on the problem. An overview of numerous transfer functions and their respective characteristics is given in [34].
Based on the representation of a single artificial neuron, an ANN can be realized, and various constructions are possible. In this work, the feed-forward architecture, which is shown in Figure 2a, is considered. This is described by several layers, each of which is fully connected to the following layer. These are the input and output layer, which describe the input and output variables of the network, and one or more hidden layers. The latter are composed of an arbitrary number of artificial neurons [35].
The weighting factors w i and offsets b of the individual neurons are determined when training the network using a training method. An overview of various training methods is given in [36].

Application of the ANN for the IM Optimization
The ANN implemented in this work is based on a fully connected feed-forward architecture of McCulloch-Pitts neurons and is shown in Figure 2a. As a transfer function, a sigmoidal function is chosen, which is characterized by a simple and fast calculation of its derivative [34]. Moreover, its range of values includes only positive values, which meaningfully describes the estimation of physical quantities such as the electromagnetic losses. The inputs of the ANN are the optimization parameters, i.e., the geometry parameters of the actual IM design. Within the input selection, which is described in the following, these are supplemented by further geometry parameters.

Two common problems in the construction of an ANN are under-fitting and over-fitting, which, due to insufficiently complex or over-complex architectures, can cause imprecise estimates of the network. This can be counteracted by a multi-stage method in which different network architectures are analyzed by varying the number of neurons and layers. From these possible architectures, the most appropriate one is selected. For this purpose, the following sequential five-step flow is introduced.

1.
Database: For the training and the use of the ANN, a database of individuals is created that covers the solution space representatively. The database is built in parallel to the optimization from the individuals already simulated during the SA and ES methods. It is composed exclusively of geometrically and thermally admissible solutions that reach all points of the driving cycle and stores the chromosomes and the fitness values of the individuals.

2.
Construction: As soon as the database reaches an adjustable minimum number of individuals, the construction of the ANN starts. The database is first divided into training, selection, and testing instances. Based on the operating-map-dependent decision parameters of the training instances, which are derived from the respective fitness values, increasingly complex neural networks are generated. Starting with one hidden layer and an adjustable number of neurons, this number is successively increased up to a maximum value. A predefined set of further layers with an increasing number of neurons is added step by step. Here, the input variables of the networks are composed of the optimization parameters. The choice of the input variables of the ANN is essential, since they significantly determine its precision. Therefore, two phenomena must be considered in their selection:
• Input Correlation: The input variables should be minimally correlated with each other and accordingly have as little redundancy as possible.
• Input-Target Correlation: The correlation between output and input variables should be maximized, so that the inputs have the greatest possible influence on the output.
Both aspects can be ensured by the introduced sensitivity analyses. As training procedure, the Levenberg-Marquardt method is used in this paper.

3.
Order Selection: In the context of the order selection algorithm, the architecture with the lowest selection error is selected from all created network constructions on the basis of the selection instances. For this purpose, the constructed neural networks are used to estimate the operating map-dependent decision parameters of the individual selection instances, and the estimates of the individual fitness values are thus derived from the respective geometry parameters, which are used to determine the selection error.

4.
Input Selection: Based on the optimization parameters, successively additional geometry parameters, starting with the highest elasticity to realize the highest possible input-target correlation, are added as input variables to the selected network architecture. For each new geometry parameter to be added, the correlation with the already existing input variables must be checked, since in terms of input correlation the input variables should have as few redundancies as possible. Therefore, if a correlation exists, the geometry parameter to be added is discarded in this work. To realize the growing inputs algorithm, new geometry parameters are added as additional input variables until the selection error starts to increase.

5.
Characterization: The problem-specifically constructed ANN follows from the order selection and the subsequent input selection. Analogous to the selection instances, this neural network is used to estimate the individual fitness values of the testing instances. Based on this estimation, the error distribution of the testing error can be determined, which is essential for the further application of the ANN in the optimization environment.
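The order selection of step 3 and the growing-inputs algorithm of step 4 can be sketched as follows (a simplified illustration; the helper callables are hypothetical stand-ins for training a network and evaluating its error on the selection instances):

```python
def order_selection(architectures, selection_error):
    """Pick the architecture with the lowest error on the selection instances.
    `architectures` lists candidate layer layouts, e.g. [(8,), (12,), (8, 8)];
    `selection_error` trains a network of that layout and returns its selection error."""
    return min(architectures, key=selection_error)

def growing_inputs(base_inputs, candidates, correlated, selection_error):
    """Add geometry parameters (assumed pre-sorted by descending elasticity) as
    inputs until the selection error starts to increase. Candidates correlated
    with an already chosen input are discarded (input-correlation criterion)."""
    inputs = list(base_inputs)
    best_err = selection_error(inputs)
    for cand in candidates:
        if any(correlated(cand, used) for used in inputs):
            continue  # redundant input, skip
        err = selection_error(inputs + [cand])
        if err >= best_err:
            break  # selection error no longer improves: stop growing
        inputs.append(cand)
        best_err = err
    return inputs
```
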
The steps for network construction, order and input selection, and characterization are repeated at predetermined intervals once the required number of individuals in the database has been reached. Thus, new individuals added to the database are considered in the ANN, further increasing the estimation accuracy.
The output of the network is the mean loss P_L over the driving cycle of the considered individual. To ensure that the preselection of individuals by the ANN does not discard solutions that would otherwise represent a new optimum, a threshold f_th is introduced. Individuals whose estimated fitness is above the threshold are discarded; otherwise, they are simulated. To determine the threshold, a delta ∆ ≥ 0 is introduced, which results from the given estimation reliability γ of the ANN as well as the error distribution of the testing error. The relationship is visualized in Figure 3a. The threshold, as shown in Figure 3b, is obtained as a function of the current optimum f(x_min). Estimates above the resulting threshold consequently lie within the γ confidence interval with respect to the null hypothesis that these individuals do not represent a new optimum. In addition to the threshold, the relative deviation or distance of the chromosome of the solution to be estimated from the individuals in the database is also considered. If the distance to the nearest individual exceeds a predefined value, the solution is simulated rather than discarded, because in that case the estimation of the ANN is based on an extrapolation whose validity is not guaranteed.
Figure 3. (a) Delta ∆ based on the distribution of the testing error. (b) Threshold f_th for discarding solutions.
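The determination of the delta and the threshold described above can be sketched as follows. The paper's exact formulas are not reproduced in the text, so the quantile-based delta and the additive threshold used here are illustrative assumptions:

```python
def delta_from_testing_error(testing_errors, gamma):
    """Delta ∆ ≥ 0 from the testing-error distribution: here taken as the
    γ-quantile of the absolute testing errors. Illustrative assumption, since
    the paper's exact formula is not reproduced in the text."""
    errs = sorted(abs(e) for e in testing_errors)
    idx = min(int(gamma * len(errs)), len(errs) - 1)
    return errs[idx]

def threshold(f_current_optimum, delta):
    """Discard threshold f_th as a function of the current optimum f(x_min):
    estimated fitness values above f_th are discarded, values below are simulated."""
    return f_current_optimum + delta
```
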

Exemplary Design Optimization of an IM
The presented optimization environment is used to design an IM as a traction drive for a small vehicle. The aim of the optimization is to minimize the losses occurring over a driving cycle while at the same time minimizing the required installation space. Starting with a rough design of the machine, it is optimized using the presented multi-stage optimization environment, considering geometric and thermal constraints. The resulting machine is compared with a reference machine, which is used as a benchmark. To model the IM, different electromagnetic machine models with different value ranges and level of detail are used. The thermal behavior of the machine necessary for the evaluation of the thermal constraints is modeled with an equivalent thermal network with four nodes, as presented in [37].

Description of the Multiphysics Problem
Boundary conditions and design parameters of the IM are defined on the basis of a reference geometry, whose cross-sectional area is shown in Figure 4a. This is a four-pole IM with a rated power of 41.5 kW. An unchorded three-phase copper winding, connected in delta, is inserted into the stator. The rotor of the machine is a squirrel-cage rotor with bars and rings made of die-cast aluminum with an electrical conductivity of 28 · 10^6 S/m. The stator and rotor laminations are made of electrical steel sheets of type M400-50A, the designation of which is given by DIN EN 10027-1. The main rated and geometrical parameters are summarized in Table 1. The geometry of the reference machine was designed electromagnetically in an experience-based, iterative process. By solving the problem using the presented optimization environment, a machine geometry with identical boundary conditions and design parameters is to be realized in an automated process, starting from the roughly designed IM. The geometry properties and operating behavior of the resulting optimized machine should be similar to those of the reference machine. The cross-sectional area of the roughly designed machine is shown in Figure 4b.
To evaluate the machine behavior, two differently weighted decision parameters p j,indiv are introduced, which are used to determine the fitness value of a geometry relative to the reference machine according to (23).
The first decision parameter is the average losses over a given driving cycle. For this purpose, the Worldwide Harmonized Light Vehicles Test Cycle (WLTC) class 3 driving cycle is considered, which is a part of the Worldwide Harmonized Light Vehicles Test Procedure (WLTP). Based on this driving cycle, the associated speed and torque combinations of the WLTC 3 can be derived for an example small car defined by the vehicle parameters from Table 2 using the vehicle model according to [38]. To ensure that the driving cycle is fully represented by the reference machine, the resulting speeds and torques are additionally scaled by the factor 0.7. This is due to the fact that the vehicle data of the reference machine are not known. However, this has no influence on the methodical optimization of the machine. Using these speed and torque combinations, the average losses over the driving cycle can be determined via the operating map of the machine.
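The determination of the cycle-average losses from the operating map can be sketched as follows (a minimal illustration; `loss_map` is a hypothetical stand-in for an interpolation in the simulated operating map):

```python
def cycle_average_losses(loss_map, cycle_points, scale=0.7):
    """Mean losses over a driving cycle: evaluate the loss map at every
    (speed, torque) combination of the (scaled) cycle and average.
    `loss_map` is a callable P_L(n, T); the default scale of 0.7 mirrors the
    down-scaling of the WLTC 3 operating points described above."""
    losses = [loss_map(scale * n, scale * t) for n, t in cycle_points]
    return sum(losses) / len(losses)
```
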
The second decision parameter is the installation space of the machine. The installation space is not only used as a decision parameter but is further restricted by boundary conditions. It is weighted three times less heavily than the losses over the driving cycle.
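Under these assumptions, the fitness evaluation relative to the reference machine could look as follows; since the exact form of Equation (23) is not reproduced in this text, the normalized 3 : 1 weighted sum is an illustrative assumption:

```python
def fitness(p_losses, p_volume, p_losses_ref, p_volume_ref):
    """Weighted fitness relative to the reference machine: cycle losses are
    weighted three times more heavily than the installation space (weights 3 : 1,
    normalized). Illustrative stand-in for Equation (23)."""
    w_loss, w_vol = 3.0, 1.0
    return (w_loss * p_losses / p_losses_ref
            + w_vol * p_volume / p_volume_ref) / (w_loss + w_vol)
```
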
The reference machine is also used to define the other design-relevant parameters of the multiphysics problem, on the basis of which the rough design of the IM is carried out.

Methodological Optimization
Based on the problem description, the most suitable machine model is derived for the individual optimization stages using the model selection methodology. The optimization parameters are then derived depending on this model using the parameter selection methodology. The electromagnetic IM models used, the procedure, and the results of the model and parameter selection for the given problem are briefly explained. Furthermore, the adjustable convergence parameters of the optimization environment are discussed.

Electromagnetic Machine Models
For the electromagnetic simulation of the IM, the machine models described in [27] are used. The Fundamental Wave Model (FWM) is based on a single-phase ECD, also presented in [39]. For the calculation of the elements of the ECD, which can be derived exclusively from the machine geometry and constant parameters, reference is made to the literature [40].
The HWM and the E-HWM are based on the harmonic wave theory of the IM presented in [41][42][43]. The HWM is the implementation under the assumption of an infinite permeability of the stator and rotor iron. The E-HWM is an extension of the HWM in which the influence of saturation is modeled by multiplying the flux densities by an air gap conductance function. The circumferential-location-dependent air gap conductance function [40] is a description of the effective air gap of the machine. The air gap is increased on average by a saturation factor k_h > 1 as a result of the main field saturation. In the region of large iron saturation, i.e., at the maximum of the air gap flux density B, the air gap is further enlarged by an additional saturation factor k_h1, and reduced at the zero crossings. The air gap conductance function defined in this way moves synchronously with the fundamental wave field. The air gap flux density flattened by the saturation follows from multiplying the air gap flux density calculated with the HWM by the air gap conductance function.
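The qualitative effect of this saturation model can be illustrated as follows; the exact conductance function of [40] is not reproduced here, so the cosine-shaped modulation below is a plausible stand-in with the same qualitative behavior (largest effective air gap at the flux-density maximum, smallest at the zero crossings):

```python
import math

def flattened_airgap_flux(B_hwm, k_h, k_h1, n_points=360):
    """Illustrative sketch: the fundamental-wave air gap flux density of the
    HWM (a cosine here) is multiplied by a conductance-like function that is
    smallest where |B| peaks (air gap enlarged by k_h and k_h1) and largest
    at the zero crossings, flattening the resulting wave."""
    xs = [2 * math.pi * i / n_points for i in range(n_points)]
    b = [B_hwm * math.cos(x) for x in xs]
    # effective air gap: mean enlargement k_h plus a modulation k_h1 whose
    # maximum coincides with the flux-density maxima (cos(2x) = 1 at x = 0, pi)
    lam = [1.0 / (k_h * (1.0 + k_h1 * math.cos(2 * x))) for x in xs]
    return [bi * li for bi, li in zip(b, lam)]
```
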
The TH-FEM is based on the state-of-the-art FEM for time-harmonic simulation of electromagnetic components including a slip transformation [44][45][46]. To account for the nonlinear material behavior of the stator and rotor laminations, an iterative procedure is used in the field solution. For this purpose, the successive substitution approach or the Newton method can be used.
For the computation of the T-FEM, an in-house state-of-the-art Finite Element (FE) solver called iMOOSE/pyMOOSE [47] is used. To reduce the computational effort of the T-FEM, a hybrid simulation approach presented in [48] is applied.
The FWM, HWM, E-HWM, and TH-FEM are implemented in Matlab®, whereas the T-FEM is implemented in Python™ and C++. An operating map simulation takes 15 min using the TH-FEM on a PC with an i7 3.6 GHz processor and 16 GB RAM, and 12 h using the T-FEM on the compute cluster of RWTH Aachen University.

Model Selection Approach
Based on the decision parameters presented, the output variables or effects to be investigated are defined on the basis of those variables of the machine model which influence the decision parameters. This is the case for the diverse loss types of the IM, since they determine the average losses over the driving cycle. For this reason, all loss types are included as output variables to be considered. However, other output variables and effects are not defined in this use case, since none of the other variables has an influence on either of the decision parameters.
In Table 3, the output variables to be investigated are shown with the required levels of detail. Here, identical precision is required for all stages of the optimization environment, since the resulting higher computation times are acceptable in the context of this work. The illustrated levels of detail must be achieved for all operating points of the operating point matrix, so that the precision of the map calculation is ensured. The required level of detail of the ohmic losses is obtained assuming a measurement deviation of 5%, a deviation of the electromagnetic reference model of 5%, and a scatter of the optimum identified by the optimization environment of 5%. A deviation of 25% is allowed for the iron losses since they are not dominant compared to the ohmic losses.
From the model selection approach, the TH-FEM follows as the model that can represent the required levels of detail at all operating points while minimizing the computational effort. Consequently, in the context of the optimization problem considered in this work, the TH-FEM simulation is used in all stages of the optimization environment to describe the behavior of the IM.

Table 3. Output variables of the machine model to be investigated (columns: Output Variables and Effects; Level of Detail).

Parameter Selection Approach
Based on the defined problem-specific output variables and the machine model of the IM resulting from the model selection methodology, the parameter selection approach is carried out. With this methodology, seven optimization parameters, which have no geometric correlation among each other, are determined. Possible optimization parameters include all geometry parameters of the IM.
Using the approach described in [27], the optimization parameters presented in Table 4 with descending elasticity are derived. Consideration of the specified lower and upper bounds on the variables reduces the size of the solution space and thus the computational effort required. A large part of the bounds results from experience. However, the upper bounds on the rotor diameter and active length can be estimated by considering the installation space limitations in combination with the diameter or length increase of the housing and, in the case of the active length, the winding head length. The lower bound on the outer diameter of the rotor follows from the shaft diameter. In the context of the successive optimization, the top four optimization parameters shown are chosen as significant parameters due to their higher elasticity, and the remaining three variables are chosen as less significant parameters.

For the different stages of the optimization environment, numerous adjustable convergence parameters result, such as variances, tolerances, population sizes, or the maximum number of iterations. These parameters can be used to adapt the convergence behavior of the individual stages to the problem-specific specifications of precision and solution effort. In addition, the behavior of the ANN can be influenced by settings for the network construction and the database. The definition of these convergence parameters is experience-based.
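The elasticity-based split into significant and less significant parameters can be sketched as follows (the parameter names below are hypothetical placeholders, not the actual entries of Table 4):

```python
def split_parameters(elasticities, n_significant=4):
    """Sort geometry parameters by descending elasticity and split them into
    significant parameters (varied in the first hybrid stage) and less
    significant ones (varied in the second stage)."""
    ranked = sorted(elasticities, key=elasticities.get, reverse=True)
    return ranked[:n_significant], ranked[n_significant:]
```
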

Simulation Results
The simulation results of the presented exemplary IM design optimization are divided into the optimization method results, describing the convergence, robustness, and stability of the presented multi-stage optimization environment, and the IM optimization results, analyzing the optimized machine design.

Verification of the Multi-Stage Optimization
The analysis of the convergence behavior of the multi-stage optimization environment is performed in two steps. In the first step, the simulation results of the successively applied hybrid ES-PS method are compared with the standard single-stage ES-PS method. In the second step, the use of SA as a preliminary stage to the ES-PS method is compared with the standard single-stage ES-PS method. This approach allows for evaluating separately the advantages of the second hybrid optimization stage and those of the pre-connected SA stage. For this comparison, the described exemplary design optimization is simulated 60 times with each of the different methods and the medians, means, dispersions, and variances of the resulting fitness functions are analyzed.
In Figure 5a, the distribution of the fitness values of the single-stage ES-PS method is plotted in comparison with the two-stage ES-PS method. In the single-stage method, all optimization parameters from Table 4 are varied simultaneously; the number of generations is g = 50 and the population size is p = 200. In the two-stage method, the number of optimization parameters in each stage is smaller than in the single-stage method. Due to the resulting reduction of the solution space in both stages, the number of generations and the population size in the ES method, and thus the computation effort, can be reduced compared to the single-stage method. For the two-stage method, g = 20 and p = 50 are used in both stages. The reduction of the solution space also causes a reduction of the dispersion and variance, as can be seen in Figure 5a, with a smaller mean value and median. This shows that using the two-stage ES-PS method results in a more stable convergence behavior while increasing the speed of convergence.
In Figure 5b, the results of a second single-stage ES-PS method are plotted in comparison to the combination of SA and the single-stage ES-PS method. The single-stage ES-PS method is performed with the same settings as in the previous comparison. This allows an assessment of the robustness of the convergence behavior of this method. As shown in Figure 5a,b, the median, the mean value, the dispersion, and the variance are in a very similar range in both cases, suggesting a robust convergence behavior of the method. Combining this method with the SA results in a significant reduction of the median and mean value, as shown in Figure 5b. The dispersion and variance also decrease when SA is used. This shows that the previously conducted global search using SA also improves the convergence behavior of the optimization compared with the standard single-stage ES-PS method. The computational effort increases with the use of SA. However, this can be counteracted by the ANN, as explained in the following section. Since the multi-stage optimization environment runs sequentially, an improvement in the convergence behavior by the SA and the two-stage ES-PS method compared to the single-stage one also means that the convergence behavior of the entire optimization environment is improved.

Verification of the ANN
The verification of the ANN is done by comparing the calculated and the estimated fitness of individuals that arise in the context of the SA optimization following the construction of the network and that are to be discarded due to an estimated value above the threshold. An example comparison is shown in Figure 6a. The mean deviation of the estimated from the calculated fitness of the discarded individuals is 2.7% in this case, although most estimates are considerably more precise. However, some estimated fitness values deviate relatively strongly from the calculated fitness, which is a consequence of a large distance to the closest individual in the database. Accordingly, the average precision can be increased if the allowed maximum distance is further reduced. It can also be seen that no individual was discarded whose fitness is better than the current optimum f(x_min).

Reduction of the Solution Effort
By using the ANN, the determination of the fitness can be divided into three variants:
• Invalid: Neither estimation nor simulation is required; the fitness can be described directly using (21).
• Estimation: Individuals for which the ANN estimates a fitness value above the threshold and which are discarded accordingly.
• Simulation: Individuals that are neither invalid nor discarded by the ANN. These require a simulation of the operating map.
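This three-way classification can be sketched as follows (a minimal illustration; the callables are hypothetical stand-ins for the validity check, the ANN estimate, and the database distance):

```python
def fitness_determination(individual, is_valid, ann_estimate, f_threshold,
                          distance_to_database, max_distance):
    """Classify how the fitness of an individual is determined,
    following the three variants above."""
    if not is_valid(individual):
        return "invalid"      # fitness follows directly from (21), no simulation
    if distance_to_database(individual) > max_distance:
        return "simulation"   # ANN would extrapolate, so simulate instead
    if ann_estimate(individual) > f_threshold:
        return "estimation"   # discarded based on the ANN estimate
    return "simulation"       # promising candidate: simulate the operating map
```
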
In Figure 6b, the percentage breakdown of the fitness determination of individuals into these three types is shown for the SA method for the optimization of the example machine. During the optimization, 150 simulations are performed using the TH-FEM to build the database of the ANN. In the SA stage, an additional 1130 evaluations of individuals are performed. In each of the hybrid stages, 35 generations were evaluated using the ES method and 20 evaluations were calculated using the PS method. While a large proportion of individuals in the SA stage have invalid geometries, approximately 20.4% of the solutions can be discarded by the ANN and only 1.4% require a time-consuming simulation in this use case. This results in a total of 166 simulations performed for the SA stage. Without the use of the ANN, all solutions that would otherwise be discarded based on the estimated fitness also require a simulation; the required simulations would thus increase to 397. In contrast to the SA optimization, the ANN has only a small impact on the solution effort of the ES-PS method. This is a consequence of the local convergence of this hybrid method, since it searches in a vicinity of the optimum where the estimated fitness values often do not exceed the required threshold. The simulation of each population in the 35 generations of the ES method is fully parallelized for all individuals. With the assumption that all valid individuals are electromagnetically simulated in the hybrid stages, the number of simulations increases by 2 × 35 = 70 for the ES method and by 2 × 20 = 40 for the PS method. In total, 276 electromagnetic simulations are performed using the ANN, and 507 without the ANN. This means a reduction of the simulation effort by 45.5%. This reduction is achieved assuming the same models in each stage. With the use of different models in the individual stages, this reduction factor will vary.
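The simulation-count bookkeeping above can be reproduced directly:

```python
# Simulation counts reported for the example optimization
sa_with_ann, sa_without_ann = 166, 397   # SA stage, incl. the 150 database-build runs
es_sims = 2 * 35                         # ES: 35 generations in each of the 2 hybrid stages
ps_sims = 2 * 20                         # PS: 20 evaluations in each of the 2 hybrid stages

with_ann = sa_with_ann + es_sims + ps_sims        # total simulations with the ANN
without_ann = sa_without_ann + es_sims + ps_sims  # total simulations without the ANN
reduction = 1 - with_ann / without_ann            # relative reduction of simulation effort
```
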

IM Optimization Results
Starting from the cross section of the roughly designed machine shown in Figure 4b, the geometry optimization of the IM is performed. The cross section of the resulting machine geometry of the optimization is shown in Figure 4c. The basic design is similar to that of the reference machine. The optimized geometry differs primarily by two additional rotor bars, a rotor diameter larger by approximately 10%, and a shortened active length of l_Fe = 154 mm.
The evaluation of the quality of a solution is done with the fitness described in Section 2.6. It is calculated using the decision parameters defined problem-specifically in Section 4.1 by means of the TH-FEM simulation and related to the reference machine. The resulting fitness values of the roughly designed IM, the optimized machine, and the reference machine are presented in Table 5 and divided into the fractions of the volume as well as the mean losses over the WLTC 3. It can be seen that the optimization environment improves the fitness of the roughly designed IM by approximately 20%. Both the mean losses over the test cycle and the volume are lower in the optimized machine. Compared to the reference machine, the optimized geometry has a lower volume due to the shortened active length, but higher mean losses over the drive cycle. This results in a fitness that is 2.6% worse. This is a consequence of the insufficient coverage of all possible degrees of freedom of the machine geometry by the seven optimization parameters, which means that the reference machine cannot be completely reproduced. In addition, it is possible that the optimization method has not converged to the global minimum. As shown in Figure 5a, the optimum identified by the optimization environment has a dispersion of approximately 6%.

For further verification of the optimized machine geometry, it is modeled by means of T-FEM simulations and compared with the reference machine. The operating maps of the total losses of the reference machine and the optimized machine resulting from the T-FEM are shown in Figure 7a,b. Here, the optimized geometry exhibits higher total losses, especially at the borders of the operating map, but the losses in the WLTC 3 driving cycle are of a similar order of magnitude to those of the reference machine.
The resulting fitness values related to the transiently simulated reference machine as well as the resulting proportions of the decision parameters are shown in Table 6. The increase in the fitness of the optimized geometry from 1.026 for the TH-FEM to 1.059 for the T-FEM is within the level of detail required by the model selection methodology. Since the deviation of the fitness values between the optimized geometry and the reference machine, at approximately 6%, is near the range of the assumed accuracy of the transient FEM of approximately 5%, the optimized geometry derived automatically from the rough design and the reference machine can be assumed to be similarly suitable solutions of the multiphysics problem.

Discussion and Conclusions
The focus of this work is the development of a multi-stage optimization environment for the design of an IM. In the individual stages, the advantages of two stochastic and one deterministic optimization method are combined by successively applying SA, ES, and PS. The search for the optimum starts in the SA stage in a global solution space and continues locally in the successive hybrid use of the ES and PS methods. In the first successive stage, significant optimization parameters are varied and less significant parameters are kept constant. In the second stage, significant parameters are then assumed to be constant and less significant parameters are varied. This successive implementation of the hybrid ES and PS method improves the convergence behavior in terms of a lower mean value, dispersion, and variance. In addition, the reduction of the optimization parameters in the individual stages compared to a single-stage hybrid ES-PS method results in a reduction of the computational effort. Using the SA method as a global search performed before the successive hybrid optimization method also improves the convergence behavior in terms of a lower mean and median, and a lower dispersion and variance of the optimization results. A disadvantage, however, is an increased computational effort due to the introduction of an additional optimization stage. This disadvantage is compensated by the application of an indirect machine model in the form of an ANN. The ANN estimates individual parts of the objective function that would otherwise require a computationally expensive simulation. As a result, the computational cost of the multi-stage optimization environment in the presented application can be reduced by 45% by using the ANN. This value is achieved when the same electromagnetic machine model is used in each stage. The use of different models leads to smaller reductions.
If a precise estimate via the ANN is not possible, direct machine models are used for the electromagnetic computation of the IM in the optimization stages. Using a model selection approach, different levels of detail can be considered and defined in each optimization stage. Thus, a model of lower level of detail can be used in the global search and models of increasing level of detail can be used in the local search. This procedure results in a high degree of flexibility with respect to the accuracy and the solution effort of the optimization environment.
A methodical approach to parameter selection is used to determine the optimization parameters. For each geometry parameter, the sensitivities and elasticities are studied with respect to the output variables relevant to the optimization. The optimization parameters are sorted by their elasticities and the parameters with the greatest impact on the optimization problem are identified. Sorting by elasticities also allows a systematic division of the parameters into variable and constant values depending on the optimization level.
The optimization environment using the model and parameter selection procedure is applied to the design of a traction machine. The objective of the optimization is to minimize the design space while minimizing the mean electromagnetic losses over the WLTC. The optimization is performed using the TH-FEM of the IM determined by the model selection approach. The quality of the result is determined based on the fitness of the optimized machine and a reference machine. For comparison, both machines are then simulated again using the model with the highest level of detail, the T-FEM. The fitness value of the optimized machine is about 6% higher than that of the reference machine. Since this deviation is within the range of the assumed accuracy of the T-FEM of about 5%, it can be assumed that the optimized geometry automatically derived from the rough design and the reference machine are similarly suitable solutions for the multiphysics problem. Thus, the use of the presented optimization environment as a tool for the design of the machine is verified.
A further verification of the optimization environment with additional machine designs for different applications remains to be performed. In addition, the individual machine models can be further improved with respect to their level of detail. The simulation of external components, such as the inverter or battery, is not integrated into the optimization. Furthermore, active cooling of the machine is not considered in the current status. For further improvement, a more detailed thermal model of the machine can be considered.
Regarding the uncertainty of the parameters, the proposed method may be extended using neutrosophic statistics as future research, but this issue is not the content of the study presented here.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: