Assessment of Methods for the Real-Time Simulation of Electronic and Thermal Circuits



Introduction
Computer simulation of dynamical systems is currently a necessary and often critical step in a large number of engineering applications. Accurate and efficient simulations enable considerable reductions in product development cycles, together with savings in prototyping and testing costs. The development of computing technologies in recent decades has made it possible to extend the simulation of nontrivial engineering systems to real-time (RT) environments [1], including Hardware-in-the-Loop (HiL) and System-in-the-Loop (SITL) setups, in which computer simulations are interfaced to physical components. Such simulation processes often describe multiphysics systems of great complexity, composed of subsystems with very different behavior and physical properties. This task can be performed in an efficient way by means of co-simulation schemes, integrating the dynamics of each subsystem with its own dedicated solver, and coupling these through the exchange of a limited set of input and output variables [2][3][4][5]. This is the case for test benches in the automotive industry, in which physical components of a vehicle are interfaced to an RT simulation of the overall system, which may include mechanical, thermal, electronic, or hydraulic effects, as well as interactions with the driving environment. Test benches for e-powertrains, for instance, illustrate the value of developing effective simulation methods for circuits that represent engineering systems whose purpose is the management and control of energy flows in industrial applications [6].
Co-simulation in RT platforms requires that every subsystem in the assembly is able to complete the integration of its dynamics equations in real time. This is particularly critical in HiL and SITL environments, where the correct and safe performance of physical components depends on the inputs generated by the virtual ones being delivered in time. The time spent in computations by the virtual subsystems must, therefore, be predictable [7]. Such environments frequently require that the communication between subsystems takes place at fixed time intervals; moreover, the use of iterative coupling schemes to improve convergence is restricted by the presence of physical subsystems [8]. As a consequence, explicit coupling schemes with fixed communication step-sizes are the option of choice in RT co-simulation setups.
Electric and electronic subsystems are present in a wide range of multidisciplinary engineering applications, like electric vehicles. The numerical methods used for time-domain simulation of these circuits must meet the efficiency and accuracy requirements of the main application; in HiL and SITL setups, RT performance with fixed integration step-sizes is required. Time-domain computer-aided simulation of electric and electronic circuits has been the subject of research since the early 1970s [9,10]. Circuit dynamics is usually formulated in terms of a set of differential algebraic equations (DAE), resulting from the application of Kirchhoff laws and the constitutive equations of the circuit components, and integrated using numerical methods like the Newmark family of integrators [11] or backward differentiation formulas (BDF) [12]. The accuracy and stability of these methods have been assessed in a number of research papers and compared to alternative integration formulas such as implicit Runge-Kutta methods [13,14]. Some of these pay special attention to the solution of discontinuous systems that result from switching components in electronic circuits [15,16]. Moreover, strategies like index reduction [17] and model order reduction [18,19] have been proposed to streamline the solution of the system equations. In recent decades, the need for RT performance has motivated the development of simulation tools and platforms able to meet this requirement [20][21][22]. The capability of the proposed solution approaches to deliver RT execution, however, needs further evaluation. Proper benchmarking of existing techniques requires the definition of clear comparison criteria and standardized problems, representative and easy to replicate [23,24].
This paper puts forward a framework for the evaluation of the efficiency and accuracy of simulation methods for electric and electronic circuits, consisting of a series of benchmark examples and a set of criteria to assess simulation results. This framework was used to determine the suitability of general-purpose simulation formalisms for execution in RT environments. Simulation methods for circuits that employed the trapezoidal rule (TR) and BDFs as integration formulas were used to solve four benchmark problems. These were defined in a simple way and designed to capture at least one challenging aspect of circuit simulation. The examples included linear electric circuits, nonlinear electronics, and a thermal equivalent model of an electric motor; this last example extends the scope of the proposed testing framework to systems characterized by slow dynamics and time-varying physical properties [25]. The performance of the simulation methods was verified using conventional, off-the-shelf PCs; additionally, they were also tested on ARM processors, with limited computing power and memory, which are currently gaining importance in distributed and non-homogeneous co-simulation applications [1].
Besides computational performance aspects, the benchmark problems also enabled the identification of numerical issues in circuit simulation. Some methods can be ruled out for general purpose applications, based on aspects like their inability to deal with redundant algebraic constraints or their tendency to introduce numerical drift in the solution. Problems with the evaluation of the derivatives of the variables were observed with some integrators and a correction strategy consisting in the projection of the derivatives onto the constraint manifold was developed to address this issue.
Results showed that the selection of a certain algorithm has a significant impact on simulation efficiency for a required level of accuracy, and that the most suitable simulation method in each case depends on the nature and properties of the problem under study. Iterative algorithms based on BDF2, BDF3, and TR displayed comparable performance features, compatible with real-time execution in most of the examples, also on ARM single-board computers. The proposed benchmarking framework made it possible to compare several integration approaches in terms of accuracy and computational efficiency, evaluating their performance with a standard procedure.

Numerical Methods
An electric or electronic circuit can be described as a series of connected elements that obey Kirchhoff laws. Each element has one or more terminals, connected to nodes, i.e., points with a given voltage. The flow of electrons between nodes gives rise to currents through the components. In general, electronic circuits are nonlinear systems.
In this work, the variables used to describe the system are the node voltages V and the currents I that flow through the components. The node voltages of a circuit are grouped in the n_v × 1 term x_v, while the currents that flow through each component are contained in the n_c × 1 term x_c. Together, these constitute the system variables

x = [x_vᵀ  x_cᵀ]ᵀ    (1)

which, in general, are not independent, but are subjected to constraints. The total number of variables in the system is n = n_v + n_c. Algebraic constraints can be imposed by the satisfaction of Kirchhoff's current law at the circuit nodes, by the specification of the voltage of a node, e.g., by connecting it to ground, and by the constitutive equations of some components, such as resistors. These algebraic constraints are grouped in an m × 1 term Φ(x, t), where m is the total number of algebraic constraint equations imposed on the system. The system variables x must satisfy

Φ(x, t) = 0    (2)

Other components, like capacitors and coils, introduce differential constraint equations that involve the variables x and their derivatives ẋ with respect to time. In most cases, these equations can be expressed with the following set of p linear ordinary differential equations (ODE)

A ẋ = b(x, t)    (3)

The time-domain simulation of the circuit dynamics requires the solution of the system of DAEs formed by Equations (2) and (3). Several methods exist in the literature to address this problem [12].
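To make the notation concrete, the sketch below assembles Φ(x, t) and the pair (A, b) for a hypothetical series RC circuit: a source E(t) at node 1, a resistor R between nodes 1 and 2, and a capacitor C from node 2 to ground, with x = [V1, V2, I_R, I_C], so that m = 3 and p = 1. This is a minimal Python illustration with made-up component values; the solvers described later in the paper were implemented in C++.

```python
import math

R, C = 10.0, 0.4            # resistance (ohm) and capacitance (F), illustrative values

def E(t):                   # source voltage
    return 100.0 * math.sin(10.0 * math.pi * t)

def phi(x, t):
    """Algebraic constraints, Phi(x, t) = 0."""
    V1, V2, IR, IC = x
    return [V1 - E(t),          # source fixes the voltage of node 1
            IR - IC,            # Kirchhoff's current law at node 2
            V1 - V2 - R * IR]   # resistor constitutive equation

# Differential constraints: A * xdot = b(x, t)
A = [[0.0, C, 0.0, 0.0]]        # C * dV2/dt ...

def b(x, t):
    return [x[3]]               # ... equals the capacitor current I_C
```

With n = 4 and m + p = 4, the resulting DAE system is square, which is the usual situation for these circuit models.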

Formulation of the Problem as a System of ODEs
It is possible to transform the DAE system defined by Equations (2) and (3) into a system of ODEs by differentiating Equation (2) with respect to time to obtain

Φ̇ = Φ_x ẋ + Φ_t = 0    (4)

where Φ_x = ∂Φ/∂x is the m × n Jacobian matrix of the algebraic constraints, and Φ_t = ∂Φ/∂t is an m × 1 term. Grouping Equations (3) and (4), the following system of ODEs is obtained:

Ā ẋ = b̄,  with  Ā = [Aᵀ  Φ_xᵀ]ᵀ  and  b̄ = [bᵀ  −Φ_tᵀ]ᵀ    (5)

The leading matrix Ā in Equation (5) is (m + p) × n and, for most circuits, it is a square, non-symmetric matrix. In principle, it is possible to evaluate ẋ from Equation (5) at time step k and integrate its value by means of a numerical integration formula to arrive at the variables x at time step k + 1. However, this approach to obtain ẋ suffers from two shortcomings. First, only the derivative-level expression of the algebraic constraints is explicitly enforced by Equation (5). Accordingly, the accumulation of integration errors causes the solution to drift away from the exact satisfaction of Φ = 0. Second, the leading matrix in Equation (5) will not have full rank if some constraint equations are linearly dependent. This causes the failure of most commonly used linear equation solvers, making it necessary to use special algorithms to arrive at the solution of this system of equations.
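For a hypothetical RC example (source E(t) = 100 sin(10πt) V at node 1, resistor R between nodes 1 and 2, capacitor C to ground, x = [V1, V2, I_R, I_C]), Equation (5) can be assembled and solved for ẋ at a given state. The dense Gaussian elimination and the component values below are purely illustrative stand-ins for the sparse solvers used in the paper.

```python
import math

R, C = 10.0, 0.4                       # illustrative component values

def Edot(t):                           # dE/dt for E(t) = 100 sin(10*pi*t) V
    return 1000.0 * math.pi * math.cos(10.0 * math.pi * t)

def solve(M, rhs):
    """Dense Gaussian elimination with partial pivoting (fine for tiny systems)."""
    n = len(M)
    M = [row[:] + [r] for row, r in zip(M, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def xdot(x, t):
    """Equation (5): stack the constraint Jacobian on the differential rows
    and solve for the derivatives."""
    V1, V2, IR, IC = x
    M = [[1.0, 0.0, 0.0, 0.0],     # source row of Phi_x
         [0.0, 0.0, 1.0, -1.0],    # KCL row of Phi_x
         [1.0, -1.0, -R, 0.0],     # resistor row of Phi_x
         [0.0, C, 0.0, 0.0]]       # differential row: C * dV2/dt = I_C
    rhs = [Edot(t), 0.0, 0.0, IC]  # [-Phi_t; b]
    return solve(M, rhs)

d = xdot([0.0, 0.0, 0.0, 0.0], 0.0)    # derivatives at the initial state
```

Integrating d with an explicit formula enforces the constraints only at derivative level, so the drift discussed above accumulates unless it is corrected.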

Solvers for Systems of DAE
An alternative approach to solve the system of DAEs under study consists of introducing a numerical integration formula in Equations (2) and (3) to obtain the expression of their satisfaction at the next integration step, k + 1, as a function of the system variables x_{k+1}, but not their derivatives:

f(x_{k+1}) = 0    (6)

The resulting system of nonlinear equations can be solved by means of Newton-Raphson iteration

[∂f/∂x]⁽ⁱ⁾ Δx = −f⁽ⁱ⁾    (7)

x⁽ⁱ⁺¹⁾_{k+1} = x⁽ⁱ⁾_{k+1} + Δx    (8)

where i stands for the iteration number. The procedure in Equations (7) and (8) is repeated until convergence is achieved; several criteria can be used as the stopping condition to determine when this takes place. In this work, the iteration was terminated when the norm of the residual f in Equation (6) descended below a predetermined admissible value, ε. Different numerical integration formulas can be used to arrive at Equation (6). BDFs are a popular choice in circuit simulation. A BDF is a multistep method that uses values of the system variables computed at previous steps to evaluate the variables and their derivatives at time t_{k+1}. The order k̂ of the BDF is the number of already known steps used by the formula. In this research, BDF1, BDF2, and BDF3 methods were tested. It must be noted that the derivatives ẋ are calculated by the method, but they are not used to evaluate x or ẋ in future time steps. The equilibrium in Equation (6) is expressed as a function of the variables x alone, replacing their derivatives with

ẋ_{k+1} = (1/h) Σ_{j=0..k̂} α_j x_{k+1−j}    (9)

where h is the integration step-size, and the α_j coefficients depend on the order of the BDF [12]; their values are shown in Table 1 for BDF orders 1 to 3.
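As a concrete, deliberately tiny illustration of Equations (6)-(8) with a BDF, the sketch below applies BDF1 to a hypothetical series RC circuit (constant 1 V source, R between nodes 1 and 2, C from node 2 to ground), whose exact response V2(t) = 1 − e^(−t/RC) is known. The circuit equations are linear, so a single Newton iteration per step solves the system exactly; all names and values are illustrative, and the paper's actual implementation is C++.

```python
import math

R, C, h = 10.0, 0.4, 1.0e-3    # illustrative values; E = 1 V constant source

# Standard BDF derivative coefficients (Table-1 style):
#   BDF1 -> (1, -1); BDF2 -> (3/2, -2, 1/2); BDF3 -> (11/6, -3, 3/2, -1/3)

def solve(M, rhs):
    """Dense Gaussian elimination with partial pivoting."""
    n = len(M)
    M = [row[:] + [r] for row, r in zip(M, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def bdf1_step(x_old):
    """One step: BDF1 replaces xdot by (x_new - x_old)/h; the resulting f is
    linear in x = [V1, V2, I_R, I_C], so a single solve yields x_{k+1}."""
    V2_old = x_old[1]
    J = [[1.0, 0.0, 0.0, 0.0],     # source: V1 = 1
         [0.0, 0.0, 1.0, -1.0],    # KCL at node 2: I_R = I_C
         [1.0, -1.0, -R, 0.0],     # resistor: V1 - V2 = R * I_R
         [0.0, C / h, 0.0, -1.0]]  # capacitor: C*(V2 - V2_old)/h = I_C
    rhs = [1.0, 0.0, 0.0, C * V2_old / h]
    return solve(J, rhs)

x = [1.0, 0.0, 0.1, 0.1]           # consistent initial state
for _ in range(1000):              # integrate to t = 1 s
    x = bdf1_step(x)

exact = 1.0 - math.exp(-1.0 / (R * C))   # analytic V2 at t = 1 s
```

For this linear circuit the tangent matrix J is constant, which is the situation exploited later in the paper by factorizing it once during preprocessing.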
With these integration formulas, the system of nonlinear equations to be solved (6) becomes

f(x_{k+1}) = [Φ(x_{k+1}, t_{k+1}); A ẋ_{k+1} − b(x_{k+1}, t_{k+1})] = 0    (10)

and the tangent matrix in Equation (7) is

∂f/∂x = [Φ_x; (α_0/h) A − ∂b/∂x]    (11)

which is an (m + p) × n matrix. This tangent matrix is less likely to be singular than the leading term in Equation (5), due to the presence of the term ∂b/∂x. The trapezoidal rule, a particular case of the Newmark family of integrators [11], can also be used to obtain Equation (6). The derivatives ẋ at time step k + 1 are evaluated as

ẋ_{k+1} = (2/h) x_{k+1} − x̂_k,  with  x̂_k = (2/h) x_k + ẋ_k    (12)

Equation (6) then becomes

f(x_{k+1}) = [Φ(x_{k+1}, t_{k+1}); A ((2/h) x_{k+1} − x̂_k) − b(x_{k+1}, t_{k+1})] = 0    (13)

and its corresponding tangent matrix is

∂f/∂x = [Φ_x; (2/h) A − ∂b/∂x]    (14)

which has a similar structure to that of the tangent matrix in Equation (11). The trapezoidal rule may suffer from numerical problems in the evaluation of the derivatives ẋ. This stems from the fact that Equation (6) imposes the satisfaction of the algebraic constraints Φ = 0, but not their derivative-level expression, Φ̇ = 0 in Equation (4); the only conditions that the derivatives ẋ comply with are the p differential equations in (3). As a result, the term ẋ_{k+1} obtained during the solution of Equation (6) is not unique. The BDF integration evaluates the derivatives ẋ_{k+1} only from the values of the variables x, which are uniquely determined by the satisfaction of Φ = 0. The trapezoidal rule, on the other hand, uses the derivatives in the previous step, ẋ_k, to evaluate ẋ_{k+1}, as shown by Equation (12); they also affect the evaluation of x_{k+1} through the term x̂_k. This may give rise to the accumulation of errors and the steady growth of some elements in the term ẋ, which may eventually cause numerical inaccuracies or even the failure of the simulation. This problem can be dealt with in a number of ways. In some problems, the errors in the derivative-level constraint equations do not noticeably affect the accuracy of the solution, and can be left untreated.
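The same hypothetical RC example (constant 1 V source, R between nodes 1 and 2, C to ground; exact response V2(t) = 1 − e^(−t/RC)) can be integrated with the trapezoidal rule; note how the previous-step derivatives enter each update, which is the mechanism behind the error accumulation discussed above. All names and values are illustrative.

```python
import math

R, C, h = 10.0, 0.4, 1.0e-3    # illustrative values; E = 1 V constant source

def solve(M, rhs):
    """Dense Gaussian elimination with partial pivoting."""
    n = len(M)
    M = [row[:] + [r] for row, r in zip(M, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def tr_step(x_old, xd_old):
    """One trapezoidal-rule step: xdot_{k+1} = (2/h)*x_{k+1} - xhat_k with
    xhat_k = (2/h)*x_k + xdot_k; the linear circuit needs a single solve."""
    xhat_V2 = (2.0 / h) * x_old[1] + xd_old[1]
    J = [[1.0, 0.0, 0.0, 0.0],            # source: V1 = 1
         [0.0, 0.0, 1.0, -1.0],           # KCL at node 2
         [1.0, -1.0, -R, 0.0],            # resistor equation
         [0.0, 2.0 * C / h, 0.0, -1.0]]   # C*((2/h)*V2 - xhat_V2) = I_C
    rhs = [1.0, 0.0, 0.0, C * xhat_V2]
    x_new = solve(J, rhs)
    xd_new = [(2.0 / h) * (xn - xo) - xd
              for xn, xo, xd in zip(x_new, x_old, xd_old)]
    return x_new, xd_new

x = [1.0, 0.0, 0.1, 0.1]               # consistent state at t = 0
xd = [0.0, 0.25, -0.025, -0.025]       # consistent derivatives at t = 0
for _ in range(1000):                  # integrate to t = 1 s
    x, xd = tr_step(x, xd)

exact = 1.0 - math.exp(-1.0 / (R * C))
```

By construction, the differential-equation row ties the derivative of V2 to I_C, but the derivatives of the purely algebraic variables are recovered only through the recursion in Equation (12), which illustrates their non-uniqueness.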
When this is not the case, a possible way to avoid incurring these constraint violations is to explicitly include the derivative-level algebraic constraints in Equation (4) in the system of equations to be solved, Equation (6). This presents the disadvantage of making the tangent matrices in Equations (11) and (14) no longer square. An alternative solution consists of the projection of the system derivatives ẋ onto the plane tangent to the manifold defined by the algebraic constraints and the ordinary differential equations. The method is similar to the one described in [26,27]. The system derivatives ẋ* obtained upon convergence of the DAE solver in Equations (7) and (8) do not necessarily satisfy Equation (4). The projection step aims to find the closest set of derivatives ẋ compatible with these constraints, which can be grouped as Ā ẋ = b̄ with Ā = [Aᵀ  Φ_xᵀ]ᵀ and b̄ = [bᵀ  −Φ_tᵀ]ᵀ, via the following minimization problem:

min over ẋ of (1/2) (ẋ − ẋ*)ᵀ (ẋ − ẋ*)  subject to  Ā ẋ = b̄    (15)

The solution of this problem using an augmented Lagrangian approach leads to the following projection formula

(I + α Āᵀ Ā) ẋ = ẋ* + Āᵀ (α b̄ − σ)    (16)

where α is a scalar penalty factor and σ is a set of Lagrange multipliers, whose value can initially be set to zero. The leading matrix in Equation (16) is symmetric and positive semi-definite. The projection step can be performed iteratively via the update of the Lagrange multipliers, σ ← σ + α (Ā ẋ − b̄), and the subsequent re-evaluation of Equation (16).
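The projection in Equation (16) can be sketched with a deliberately small example: a single derivative-level constraint ẋ1 + ẋ2 = 0 and a solver output ẋ* = [1, 0], whose closest constraint-compatible point is [0.5, −0.5]. The penalty factor and iteration count below are illustrative.

```python
alpha = 1.0e3                      # scalar penalty factor (illustrative)
B = [[1.0, 1.0]]                   # toy derivative-level constraint: xd1 + xd2 = 0
c = [0.0]
xd_star = [1.0, 0.0]               # derivatives delivered by the DAE solver

def solve2(M, rhs):                # closed-form 2x2 solve (Cramer's rule)
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(rhs[0] * M[1][1] - M[0][1] * rhs[1]) / det,
            (M[0][0] * rhs[1] - rhs[0] * M[1][0]) / det]

# Leading matrix I + alpha * B^T B is constant: it could be factorized once
M = [[1.0 + alpha, alpha],
     [alpha, 1.0 + alpha]]

sigma = [0.0]                      # Lagrange multipliers, initially zero
xd = xd_star
for _ in range(3):                 # a few augmented-Lagrangian passes
    rhs = [xd_star[0] + B[0][0] * (alpha * c[0] - sigma[0]),
           xd_star[1] + B[0][1] * (alpha * c[0] - sigma[0])]
    xd = solve2(M, rhs)
    sigma[0] += alpha * (B[0][0] * xd[0] + B[0][1] * xd[1] - c[0])

# xd is now (numerically) the closest point to xd_star on the constraint plane
```

Each multiplier update shrinks the constraint residual by roughly a factor 1/(1 + 2α) in this toy case, so a handful of passes suffices.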

Implementation
The formulations in this section were implemented using C++ to build an in-house simulation software tool. The Eigen template library (http://eigen.tuxfamily.org) was used to provide the fundamental data structures and algebraic routines; sparse matrices with compressed-column storage (CCS) were used as containers. KLU [28], a direct solver for square non-symmetric matrices, especially conceived for circuit simulation problems, was used for the sparse linear systems in Equations (5), (7), and (16). This solver has been shown to be very efficient in the solution of small- and moderate-size problems, whose leading matrix is very sparse, i.e., less than 20% of its elements are structural non-zeros [29]. All the C++ classes and methods to manage information about circuit topology and properties, the numerical solution methods described in this section, and the control of code execution were developed and implemented by the authors.
Most of the matrices required by the methods in this section have a constant sparsity pattern. This makes it possible to preprocess and reuse the factorizations of the leading matrices required to solve Equations (5), (7), and (16). This is also the case for the symmetric rank-k update used to evaluate the leading matrix of Equation (16) [29]. Moreover, some circuit components such as constant-value capacitors and inductors lead to Φ_x and A matrices that do not need to be re-evaluated at runtime. Unless otherwise specified, this fact was taken into consideration to speed up code execution.
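The factorization-reuse strategy can be illustrated with a dense stand-in for KLU: factor the constant matrix once during preprocessing, then perform only forward and back-substitution inside the loop. The matrix and right-hand sides are arbitrary illustrative values; a real implementation would keep the sparse symbolic analysis and numeric factors instead.

```python
def lu_factor(M):
    """Doolittle LU decomposition, stored in place (no pivoting; adequate for
    the diagonally dominant toy matrix below)."""
    n = len(M)
    LU = [row[:] for row in M]
    for k in range(n):
        for i in range(k + 1, n):
            LU[i][k] /= LU[k][k]
            for j in range(k + 1, n):
                LU[i][j] -= LU[i][k] * LU[k][j]
    return LU

def lu_solve(LU, rhs):
    """Forward substitution with unit-lower L, then back substitution with U."""
    n = len(LU)
    y = rhs[:]
    for i in range(n):
        y[i] -= sum(LU[i][j] * y[j] for j in range(i))
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - sum(LU[i][j] * x[j] for j in range(i + 1, n))) / LU[i][i]
    return x

# Constant tangent matrix: factor once during preprocessing ...
LU = lu_factor([[4.0, 1.0], [1.0, 3.0]])
# ... then only substitutions run inside the (real-time) integration loop
xa = lu_solve(LU, [1.0, 2.0])
xb = lu_solve(LU, [0.0, 1.0])
```

The split mirrors the preprocessing/runtime separation in the text: the expensive factorization stays out of the real-time loop, leaving only back-substitution with a predictable cost per step.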

Benchmark Problems
The numerical methods described in Section 2 were evaluated by means of four test problems, which are put forward as benchmarks to evaluate the efficiency and accuracy of time-domain circuit simulations.

RLC Circuit
The first example is an RLC circuit with constant coefficients and relatively slow dynamics, shown in Figure 1. As is the case with the rest of the benchmark examples in this paper, this problem is intended to be easy to replicate and implement, while providing relevant information about the performance of the methods under study. Source E 1 is defined by the sinusoidal function of time E 1 (t) = 100 sin (10π t) V. The passive elements in the circuit have constant values R 1 = 10 Ω, L 1 = 100 mH, C 1 = 400 mF, and C 2 = 200 mF. The benchmark simulation consists of a 10-s integration of the system dynamics, for which the initial values of the current through the inductor, I L1 0 , and the voltages of the capacitors, V C1 0 and V C2 0 , are set to zero.
In this circuit, capacitors C 1 and C 2 connect nodes 1 and 3 in parallel; their voltages are related by the two following differential equations:

C_1 (V̇_1 − V̇_3) = I_C1    (17)

C_2 (V̇_1 − V̇_3) = I_C2    (18)

which are linearly dependent in terms of the system derivatives ẋ. This means that matrix A in Equation (3) is row-rank deficient, which may cause some solvers to fail during the solution of the system dynamics. For instance, the leading matrix in Equation (5) becomes singular, thus preventing most conventional linear solvers from finding a solution for the system derivatives.
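Although the redundant differential equations rule out the direct ODE approach, the underlying dynamics of this benchmark are simple. Assuming the source, resistor, inductor, and the parallel capacitor pair form a single loop, the pair can be lumped into one equivalent capacitance C1 + C2 = 600 mF, and a classic fourth-order Runge-Kutta sketch reproduces the expected behavior; this reading of Figure 1 and the integrator choice are illustrative, not the paper's reference procedure.

```python
import math

R, L = 10.0, 0.1                  # ohm, henry (benchmark values)
C_EQ = 0.4 + 0.2                  # parallel pair C1 + C2 lumped into one capacitance

def E(t):                         # source voltage
    return 100.0 * math.sin(10.0 * math.pi * t)

def f(t, s):
    """State s = [I, Vc]: loop current and voltage across the capacitor pair."""
    I, Vc = s
    return [(E(t) - R * I - Vc) / L,   # L * dI/dt = E - R*I - Vc
            I / C_EQ]                  # C_eq * dVc/dt = I

def rk4_step(t, s, h):
    k1 = f(t, s)
    k2 = f(t + h / 2, [s[i] + h / 2 * k1[i] for i in range(2)])
    k3 = f(t + h / 2, [s[i] + h / 2 * k2[i] for i in range(2)])
    k4 = f(t + h, [s[i] + h * k3[i] for i in range(2)])
    return [s[i] + h / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i])
            for i in range(2)]

s, t, h = [0.0, 0.0], 0.0, 1.0e-3      # zero initial conditions, as in the benchmark
peak = 0.0
for _ in range(10000):                 # 10-s benchmark horizon
    s = rk4_step(t, s, h)
    t += h
    if t > 9.0:                        # record the steady-state current amplitude
        peak = max(peak, abs(s[0]))
```

After the transient decays, peak approaches |E|/|Z| with Z = R + j(ωL − 1/(ωC_eq)) and ω = 10π rad/s, roughly 9.6 A.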

Scalable RLC Circuit
The second example is a scalable RLC circuit with constant coefficients. The number of system variables can be increased by adding new loops to the circuit, as shown in Figure 2. Source E 1 is the sinusoidal function of time E 1 (t) = 100 sin (10π t) V. The passive elements in the circuit have the same values in every loop, R i = 10 Ω, L i = 100 mH, and C i = 100 mF, where i = 1 . . . N is the loop number. The benchmark simulation consists of a 1-s integration of the system dynamics, starting from an initial state in which the currents through the inductors, I Li 0 , and the voltages of the capacitors, V Ci 0 , are set to zero.
This circuit provides a way to test the computational performance of a method as a function of the problem size, which can be adjusted by selecting a number of loops N. In this work, problems with up to N = 5000 loops were selected for simulation.

Full Wave Rectifier
The third example is a model of a full wave rectifier, shown in Figure 3. Source E 1 is defined by the sinusoidal function E 1 (t) = 100 sin (10π t) V. The passive elements in the circuit have constant values R 1 = 10 Ω and L 1 = 100 mH. The diodes in this circuit are described using Shockley's model, shown in Figure 4, and defined by the constitutive equation [30]

I = I_o [exp(ΔV/(n V_t)) − 1]    (19)

where ΔV = V_a − V_c is the voltage difference between the anode and the cathode of the diode, I_o = 1 fA is the reverse bias saturation current, n = 1.5 is the ideality factor, and V_t is the thermal voltage defined by

V_t = kT/q    (20)

where q is the electron charge, T is the temperature, and k is the Boltzmann constant. The series resistor in Figure 4 has a value of R s = 1 mΩ. The benchmark test for this problem consists of a 1-s simulation of the circuit dynamics in which the initial value of the current that flows through the inductor is

I_L1,0 = 0 A    (21)

This problem is an example of a nonlinear electronic circuit with fast changes in the system state during the transitions between the forward and reverse regions of the diodes.
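The diode model itself is straightforward to evaluate; the sketch below implements the Shockley current with the stated I_o and n, assuming an operating temperature of 300 K (the text does not fix T here). With the series resistor R_s, the branch current additionally satisfies ΔV = V_applied − R_s I, which makes the component equation implicit in I.

```python
import math

Q_E = 1.602176634e-19        # electron charge (C)
K_B = 1.380649e-23           # Boltzmann constant (J/K)

def diode_current(dV, T=300.0, Io=1.0e-15, n=1.5):
    """Shockley model: I = Io * (exp(dV / (n * Vt)) - 1), with Vt = k*T/q."""
    Vt = K_B * T / Q_E       # thermal voltage, ~25.9 mV at 300 K
    return Io * (math.exp(dV / (n * Vt)) - 1.0)

# Reverse bias: the current saturates at -Io; forward bias: exponential growth,
# which is what produces the fast state changes at the diode switching points.
```

The steep exponential is the reason the rectifier benchmark stresses the integrators: the tangent matrix entries for the diode rows change by many orders of magnitude across a switching transition.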

Synchronous Motor Thermal Equivalent Circuit
The fourth benchmark example is a thermal equivalent circuit to evaluate temperatures and heat flows in a permanent-magnet synchronous motor (PMSM), shown in Figure 5. Thermal equivalent circuits describe the heat transfer, thermal losses, and thermal inertia properties of a physical system by means of lumped components comparable to those of an electric circuit. Similar models can be found in the literature, e.g., [31,32]. The one presented here aims at balancing the complexity of accurately describing the thermal phenomena that occur in a PMSM against the compactness desirable in a benchmark problem. Each node in the circuit corresponds to a representative point in the physical system and has a certain temperature T associated with it. Heat flows Q between nodes play a role similar to that of currents in an electric circuit. Thermal conduction and convection are represented with thermal resistors. Thermal inertia is modelled with capacitors, and heat generation, such as Joule effect losses, is represented with current sources. The values of thermal resistances, capacitances, and heat sources may be variable if these properties change as a result of variations in the system temperatures or operating conditions. This is often the case with convective resistors and Joule effect sources.
In the thermal model of the PMSM in Figure 5, all resistances, capacitances, and heat sources are constant, with the exception of the resistance of the air gap between the rotor and the stator (R airgap ). These parameters are shown in Table 2. The air gap resistance is calculated as a function of the Nusselt number, as detailed in the Appendix A. The benchmark simulation consists of a 5000-s long integration of the system dynamics. The motor is assumed to operate at a constant angular speed ω = 1570 rad/s. The temperatures of the cooling fluid and the environment are constant, T water = 320 K and T env = 300 K. Initially, the temperature difference between the two nodes connected by a capacitor is zero, which means that the system is at ambient temperature.
This system showcases the characteristics of most thermal equivalent circuits, namely very slow dynamics and physical properties of some components that may vary as the simulation progresses.
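The analogy can be reduced to a single-node sketch: a lumped thermal capacitance C_th coupled to a constant-temperature coolant through a thermal resistance R_th, with a constant heat input Q. All values are illustrative and unrelated to Table 2.

```python
R_TH = 0.5       # K/W, node-to-coolant thermal resistance (illustrative)
C_TH = 2000.0    # J/K, lumped thermal capacitance (illustrative)
Q_IN = 40.0      # W, constant heat input, e.g., Joule losses (illustrative)
T_ENV = 300.0    # K, coolant/ambient temperature

T, h = T_ENV, 1.0                  # start at ambient; 1-s step
for _ in range(10000):             # ten time constants (R_TH * C_TH = 1000 s)
    dT = (Q_IN - (T - T_ENV) / R_TH) / C_TH   # lumped heat balance
    T += h * dT                    # explicit Euler is enough for a sketch

# T approaches the steady state T_ENV + Q_IN * R_TH = 320 K
```

The large time constant (here 1000 s) is what makes thermal benchmarks representative of slow dynamics: step-sizes that would be generous for the electric circuits are extremely conservative here.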

Reference Solutions
Reference solutions are a key component in benchmark problems [24], as they are required to verify the results provided by a certain solver tool, and to compare two or more solutions in terms of accuracy or performance. Analytical closed-form expressions can be obtained for some problems and used as reference solutions. Often, however, these are not available; in such cases, solutions at convergence can be used for the same purpose. These can be found by decreasing the integration tolerances and step-size with which the benchmark problem is solved, until the difference between the obtained results remains below a defined threshold. Solutions at convergence should be obtained with a minimum of two different methods.
In this work, reference solutions were obtained by convergence of two of the DAE solvers described in Section 2.2, namely the ones that used as integrators the trapezoidal rule and the BDF3 formula, decreasing the integration step-size down to 1 µs. Both methods delivered almost identical solutions in all the problems, with a maximum difference in the obtained values of the system variables x below 2·10⁻⁸ A or V in the electric and electronic circuits. For the thermal circuit in Section 3.4, the differences were kept below 10⁻⁸ K for the temperatures T, and below 4·10⁻³ W for the heat flows Q.
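The refinement loop that produces a solution at convergence can be sketched generically. Here a hypothetical stand-in solver (implicit Euler on x′ = −x, whose exact value at t = 1 is e⁻¹) plays the role of the circuit solvers, and the agreement threshold is illustrative.

```python
def run(h, t_end=1.0):
    """Hypothetical stand-in solver: implicit Euler on x' = -x, x(0) = 1."""
    x = 1.0
    for _ in range(int(round(t_end / h))):
        x /= 1.0 + h                 # implicit Euler update
    return x

def reference_by_convergence(tol=1.0e-6):
    """Halve the step-size until two successive solutions agree within tol."""
    h = 0.1
    prev = run(h)
    while True:
        h /= 2.0
        cur = run(h)
        if abs(cur - prev) < tol:
            return cur, h
        prev = cur

ref, h_ref = reference_by_convergence()   # ref approximates exp(-1)
```

In practice, as noted above, the procedure should be repeated with at least two different integration methods, accepting the reference only when both converge to the same values.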
For each benchmark problem, a few representative variables (monitored variables) among the n contained in term x were selected to enable the fast and simple comparison of the simulation results. This set of chosen variables should describe an important behavior aspect of the circuit under study, or critical operational values that need to be kept under control. The total number of variables, algebraic, and differential constraint equations of each benchmark problem is summarized in Table 3.

RLC Circuit
The monitored variables in the RLC circuit in Section 3.1 are the voltage at node 3, V 3 , and the current through the inductor, I L1 . The reference solutions for these values are shown in Figure 6.

Scalable RLC Circuit
For the scalable RLC circuit in Section 3.2, the monitored variables are the voltage at node 2N + 2 and the current through the last branch in the system, I LCN . The time-history of these two variables is shown in Figure 7 for N = 2 loops.

Full Wave Rectifier
For the full wave rectifier in Section 3.3, the monitored variables are the voltage difference between the anode and the cathode of diode 1, V 2 − V 3 , and the current through the resistor and the inductance, I RL , shown in Figure 8.

Synchronous Motor
For the thermal circuit of the PMSM, the monitored variables are the temperature of the permanent magnets, T 2 , and the heat flow from the housing to the refrigerant, Q water , which are shown in Figure 9.

Error Measurement and Performance Evaluation
A main goal of benchmark testing is to determine the performance of a given computational method in the solution of representative problems. For this performance evaluation to be meaningful, comparisons between different methods have to be carried out for the same level of accuracy in the obtained results. This accuracy is measured with respect to the reference solutions discussed in Section 3.5; doing this requires a standard way to evaluate the error of a solution, which is here adapted from [23].
If, for a given point in time t_p, y_ξ(t_p) is the solution delivered by a method under study for a variable ξ, and y_ξ^ref(t_p) is the reference solution for the same variable, then the error in ξ at time t_p can be defined as

e_ξ(t_p) = |y_ξ(t_p) − y_ξ^ref(t_p)|    (22)

A number n_s of sampling points equally spaced in time is defined for each benchmark problem. The total error in the simulation for variable ξ is the aggregate

ê_ξ = sqrt( (1/n_s) Σ_{p=1..n_s} e_ξ(t_p)² )    (23)

The number of sampling points n_s for each problem is shown in Table 4. Two accuracy levels, "low" and "high", were defined for the simulation of the benchmark examples in this section. These are at least two orders of magnitude larger than the error margins of the reference solutions, pointed out in Section 3.5. Using two accuracy levels helps to determine whether the efficiency properties of a given solution method hold when the precision requirements are made more stringent. A simulation is considered acceptable if the error obtained after the numerical integration process, evaluated with Equation (23), is below the thresholds shown in Table 5 for every monitored variable. The comparison of the computational efficiency of two or more simulation methods must be carried out for the same level of accuracy. This criterion was observed in the numerical experiments reported in Section 4.
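The error measurement can be sketched as follows; the pointwise error is the absolute difference at each sampling point, and the aggregate is assumed here to be a root-mean-square over the n_s points (the exact aggregate adapted from [23] may differ).

```python
import math

def pointwise_error(y, y_ref):
    """Absolute error at one sampling point."""
    return abs(y - y_ref)

def total_error(ys, ys_ref):
    """Aggregate over the n_s sampling points (RMS form assumed here)."""
    ns = len(ys)
    return math.sqrt(sum(pointwise_error(a, b) ** 2
                         for a, b in zip(ys, ys_ref)) / ns)
```

A simulation would then be accepted when total_error stays below the problem-specific threshold for every monitored variable.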

Numerical Experiments
The benchmark problems presented in Section 3 were used to compare the performance and features of the formulations in Section 2. This comparison was carried out taking into consideration the particular requirements of RT simulation. The time invested by the numerical methods in the integration of a time-step must be not only as short as possible, but also predictable. This means that the number of iterations required for the convergence of the Newton iteration to solve Equation (6) must be limited. In principle, the number of iterations could be decreased for those steps that are computationally less challenging, e.g., the intervals between switching points in the rectifier circuit in Section 3.3. This would result in overall shorter computation times. However, most RT applications require that every integration step is solved while meeting RT deadlines. Accordingly, in this work, the number of solver iterations was set to a fixed value, equal for all the integration steps in every method and problem.
The methods in Section 2 were compared in terms of the computation time that was needed to complete the simulation of the benchmark examples within the precision requirements specified in Table 5. Three parameters were adjusted for each method in order to determine its optimal performance in each case: the integration step-size h, the number of solver iterations per time step, γ, and the integrator tolerance, i.e., its maximum admissible error during iteration, ε. Two cases were distinguished when the trapezoidal rule was used as integrator: one in which one projection step was performed upon convergence of the Newton iteration, reported as TR in the following, and another one in which this projection was omitted, labelled TRnP. This makes it possible to determine the impact of the projection step on computation times.

Simulation Environments
In this research, the benchmark problems and solver implementations were tested under three different computing platforms: a conventional PC, a BeagleBone Black, and a Raspberry Pi 4. The last two environments are ARM-based single-board computers. Testing the benchmark framework and the solver implementations under different software and hardware platforms shows that their application is not restricted to a particular development environment. Moreover, it also serves as a test of the ability of ARM platforms to take on the simulation of particular components in RT computation environments. The features of each simulation environment are summarized in Table 6.

Numerical Results
The set of solver parameters, ε, h, and γ, was adjusted for each solver and test problem to find the configuration that delivered the best performance. It was found that decreasing the maximum admissible norm of the residual ε below 10⁻³ did not result in significant accuracy improvements in any of the examples, and so it was fixed to this value in all the tests. It must be noted that k̂-order BDF methods need a special initialization during their first k̂ steps. These methods use the previous k̂ values of the variables x to take an integration step; during initialization, these are not available, and so a modified version of the algorithm must be used. In this work, the trapezoidal rule was used as a replacement method for these initial steps. On the other hand, the accuracy requirements for low- and high-precision solutions displayed in Table 5 were enforced in all cases.

RLC Circuit
The direct ODE integration approach in Section 2.1 is unable to perform the simulation of the RLC circuit in Figure 1. Capacitors C 1 and C 2 connect nodes 1 and 3 in parallel, making the leading matrix A in Equation (3) rank deficient. Conversely, all the DAE solvers in Section 2.2 were able to complete the integration while attaining the required precision. Tables 7 and 8 show the selected DAE integrator configurations (step-size h and number of iterations γ) that resulted in the most efficient solution of the RLC circuit simulation while meeting the low and high precision requirements, respectively. As expected, the errors in the monitored variables scale linearly with the integration step-size h for each method. As shown in Table 8, when BDF integrators are used, decreasing the step-size to achieve high precision makes it possible to reach the desired error levels with a single iteration of the solver. The trapezoidal rule, on the other hand, still required two iterations with the same step-size h = 0.25 ms.
Tables 7 and 8 must be interpreted in light of the results contained in Table 9, which shows the times elapsed in the simulation of the RLC system with each DAE integrator, with the configurations defined in Tables 7 and 8, i.e., the efficiency delivered by each method for the two accuracy levels in the solution of the test problem. With the exception of BDF1, all methods would be able to deliver RT performance in the three computing platforms described in Table 6, both for low and high precision. Computations on the BeagleBone Black and the Raspberry Pi 4 were about 30 and 5 times slower than on the PC, respectively. Moreover, results show that the projection stage after convergence of the trapezoidal rule required as much computation time as the convergence of the Newton-Raphson iteration of the main solver. BDF2 and BDF3 delivered the best efficiency in this test problem. It must be mentioned that the tangent matrix in Equation (7) is constant for this example. This makes it possible to evaluate and factorize it during preprocessing; accordingly, only the back-substitution required for the solution of Equation (7) needs to be performed at runtime.

Scalable RLC Circuit
The scalable RLC circuit in Figure 2 was solved using the DAE integration methods in Section 2.2. The integration step-sizes h and number of iterations required by each solver are shown in Tables 10 and 11 for low and high precision, respectively. Tables 10 and 11 were obtained for a circuit with N = 2 loops. However, it was verified that these configurations also kept the simulation errors with respect to the reference solution below the admissible threshold for circuits with up to N = 5000 loops. Accordingly, these configurations were used in all the numerical experiments reported in this subsection. Table 12 shows the time required by each method in the solution of the RLC circuit with N = 2 loops. As expected, the results follow trends similar to those obtained in Section 4.2.1 with the simple RLC circuit. The tangent matrix in Equation (7) is constant in this problem as well, and so the times in Table 12 do not include its evaluation and factorization.

The simulation of this RLC circuit was repeated varying the number of loops (N = 2, 100, 250, 500, 1000, 2000, 5000) to gain insight into the effect of system size on computational efficiency. These numerical experiments were conducted for both low and high precision. Moreover, an additional set of tests was performed forcing the simulation software to ignore the fact that the tangent matrix is constant and to perform its evaluation and factorization in every solver iteration. Figures 10 and 11 display the elapsed times delivered by each solution approach on the PC simulation platform, for low and high precision, respectively. Elapsed times for the trapezoidal rule (TR) include carrying out the velocity projections. Results for BDF1 were omitted in Figure 11 because its computation times considerably exceeded those delivered by the other methods. Figures 10 and 11 show the computational overhead due to the evaluation and factorization of the tangent matrix.
They also show that the increase of elapsed time with the number of loops N (and, as a consequence, with the number of variables n) is approximately linear. This is the result of using sparse matrix implementations and solvers, and agrees with previous results in different simulation domains [29].
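The linear growth of cost with system size stems directly from the sparsity pattern: for a banded system like the scalable circuit, the number of stored nonzeros, and hence the factorization and back-substitution work, grows linearly with the number of variables. A minimal sketch, using a hypothetical tridiagonal stand-in for the circuit matrices:

```python
# Sketch: stored nonzeros of a banded (tridiagonal) stand-in matrix grow
# linearly with system size n, unlike the n**2 storage of a dense matrix.
import scipy.sparse as sp

def banded_system(n):
    # Illustrative stand-in for the circuit's sparse tangent matrix.
    return sp.diags([-1.0, 4.0, -1.0], [-1, 0, 1], shape=(n, n), format="csc")

nnz = {n: banded_system(n).nnz for n in (100, 1000, 10000)}
# A tridiagonal matrix stores 3n - 2 entries: linear in n.
```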
The maximum system sizes that could theoretically be simulated while fulfilling RT execution requirements are shown in Table 13 for the PC simulation environment. BDF1 experienced convergence issues for large systems in the high-precision simulation, and so results for this method are omitted in that case. The results in Table 13 include the evaluation and factorization of the tangent matrix. They provide an indication of the maximum system size that the tested methods would be able to handle in an RT simulation environment with the computing power of the PC in Table 6.

Full Wave Rectifier
The full wave rectifier was solved using the DAE integration methods. The integration step-sizes h and number of iterations required by each integrator are shown in Tables 14 and 15 for low and high precision, respectively. Table 16 contains the elapsed times for each method and error criterion. Unlike the RLC examples, the rectifier is a nonlinear problem with relatively fast changes in its dynamics. Each diode in this system can be in a reverse or forward state; iterative solvers may run into numerical issues during the transitions between them. Moreover, the tangent matrix of the system is not constant and has to be evaluated and factorized at every solver iteration. This explains the comparatively longer computation times and higher number of iterations required for convergence in this example. The abruptly changing dynamics of the rectifier make it necessary to use integration steps smaller than h = 0.1 ms with the BDF3 solver, even for low precision requirements. Increasing the step-size beyond this limit causes the errors with respect to the reference solution to rise quickly and, eventually, leads to the failure of the simulation. This highlights the difficulty that high-order multistep integrators have in keeping the simulation of systems with discontinuous dynamics stable. When low precision results are acceptable, TR and BDF2 are preferable in terms of efficiency, as confirmed by the results in Table 16.
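The higher iteration count caused by the diode nonlinearity can be illustrated with a minimal Newton-Raphson solve of a single diode in series with a resistor. This is not the paper's rectifier model; the Shockley diode law and all component values below are illustrative:

```python
# Sketch: Newton-Raphson for a diode + series resistor, showing why nonlinear
# circuit elements need several solver iterations per step. Values illustrative.
import math

V_s, R = 5.0, 1.0e3          # source voltage [V], series resistance [ohm]
I_s, V_t = 1.0e-12, 0.025    # saturation current [A], thermal voltage [V]

def residual(v):
    # Kirchhoff current balance: diode current minus resistor current.
    return I_s * (math.exp(v / V_t) - 1.0) - (V_s - v) / R

def tangent(v):
    return (I_s / V_t) * math.exp(v / V_t) + 1.0 / R

# Start near the forward knee; a naive start (v = 0) overshoots and makes the
# exponential overflow, mirroring the issues seen at diode state transitions.
v, iterations = 0.6, 0
while abs(residual(v)) > 1e-12 and iterations < 50:
    v -= residual(v) / tangent(v)
    iterations += 1
```

Even this single nonlinear equation requires several iterations to converge, in contrast with the one or two iterations that sufficed for the linear RLC circuits.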
The full wave rectifier example can also be used to highlight the importance of keeping the derivative-level violation of algebraic constraints under control when using certain integrators. Figure 12 shows the derivative with respect to time of the voltage at node 5, V̇5, obtained with three different integration approaches, namely BDF3 and the trapezoidal rule (TR) with and without the projection step in Equation (16). As expected, BDF3 and the projected TR deliver the same results. The uncorrected TR, however, delivers a magnitude of V̇5 that consistently increases with time. It must be stressed that the three simulations used to obtain Figure 12 satisfied the algebraic constraints Φ = 0 and the differential equations Aẋ + b = 0, explicitly imposed by Equation (8). Accordingly, they obtain very similar values of the system variables x, within the acceptable values in Table 5. This is not the case with the derivatives with respect to time, Φ̇ = Φ_x ẋ + Φ_t = 0. The TR integration formula is compatible with the accumulation of derivative-level errors, as long as the obtained values of ẋ satisfy the differential equations of the circuit. The BDF3 formula is subject to the same limitation but, because it does not reuse the derivatives to evaluate future values of ẋ, the errors do not accumulate.
As shown in Figures 12 and 13, the projection step removes the accumulation of derivative-level constraint violations and prevents ẋ from reaching excessive values that could eventually lead to numerical errors in the computations, or even to simulation failure by overflow.

Figure 14 shows the distribution of the computational workload between the different tasks during a time-step of the full wave rectifier, solved with low precision requirements and the trapezoidal rule as integrator. Results are compared to those delivered by the scalable RLC circuit with N = 4 loops, which is similar to the rectifier in terms of system size. The solver was forced to repeat the evaluation and factorization of the tangent matrix in each solver iteration in both examples. The integration step can be divided into the following tasks:
• the update of the dynamic terms Φ, A, b, and their derivatives;
• the evaluation of the residual f and the tangent matrix df/dx in Equation (7);
• the factorization and back-substitution of the linear solver in Equation (7);
• the projections in Equation (16); and
• other operations, such as the update of variables and regulation of code execution.
The profiling results confirmed that the computational effort of the projection step amounts to two or three iterations of the main system solver for the studied systems. For circuits that require a considerable number of solver iterations, such as the rectifier, the overhead of the projection step is less relevant in relative terms than it would be in RLC systems, where convergence is achieved with one or two iterations.
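A derivative-level projection of the kind discussed above can be sketched as a minimum-norm correction of ẋ that restores Φ_x ẋ + Φ_t = 0. This is one standard way of performing such a correction, not necessarily the exact form of Equation (16), and the constraint Jacobian below is illustrative:

```python
# Sketch: orthogonal projection that removes the derivative-level constraint
# violation, enforcing Phi_x @ xdot + Phi_t = 0. Jacobian values illustrative.
import numpy as np

Phi_x = np.array([[1.0, -1.0, 0.0],
                  [0.0,  1.0, 1.0]])   # illustrative constraint Jacobian
Phi_t = np.array([0.1, -0.2])          # illustrative partial time derivative

xdot = np.array([2.0, 0.5, 1.0])       # derivatives after the integrator step

# Minimum-norm update of xdot that satisfies the derivative-level constraints.
violation = Phi_x @ xdot + Phi_t
correction = Phi_x.T @ np.linalg.solve(Phi_x @ Phi_x.T, violation)
xdot_projected = xdot - correction
```

The cost of this correction is dominated by one extra linear solve per step, which is consistent with the profiled overhead of two to three solver iterations.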

Synchronous Motor
The thermal equivalent circuit that corresponds to the synchronous motor was also solved using the DAE integration methods. The step-sizes h and number of iterations per step are shown in Tables 17 and 18 for low and high precision, respectively. Tables 17 and 18 show that the error in the heat flows is much larger in order of magnitude than the error in the temperatures; the accuracy in the calculation of the heat flows thus becomes more critical in determining whether a simulation is correct or not. Table 19 contains the elapsed times for every configuration and integration method. All methods remain well below the real-time limit in all cases, regardless of the selected computation environment.
Besides having slower dynamics and time-varying properties, thermal equivalent circuits differ from their conventional RC counterparts in that they are often subjected to excitations that are not periodic. The system behavior during transients is a relevant part of the dynamics and needs to be determined accurately in most applications. The initialization method plays an important role in the evaluation of the first transient in the simulation; unsuitable initialization approaches result in incorrect predictions during the first integration steps that negatively affect the simulation accuracy long after the initialization phase is complete. Figure 15 shows that the temperature T9 delivered by the BDF3 integrator using the TR as initialization routine (BDF3) matches the reference solution, even with an integration step-size h = 1 s. On the other hand, if the initialization is not properly performed, for instance by assuming a constant value of the temperature before t = 0 (BDF3*), the effect of initialization errors can be appreciated much later in the simulation, rendering it visibly inaccurate. This behavior was not observed in the electric and electronic circuits in Sections 3.1-3.3, in which errors derived from poor initialization approaches were quickly corrected upon completion of the initial transient.
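The role of the startup procedure can be reproduced with a minimal sketch: a BDF3 integration of a first-order thermal decay, dT/dt = -(T - T_amb)/τ, in which the two history values that the multistep formula requires are generated with trapezoidal-rule steps. The model and all values below are illustrative, not the motor circuit of the paper:

```python
# Sketch: BDF3 integration of dT/dt = -(T - T_amb)/tau, started with two
# trapezoidal-rule steps that supply the history the multistep formula needs.
# Illustrative linear model; both formulas are solved in closed form.
import math

tau, T_amb, T0 = 10.0, 25.0, 100.0   # time constant [s], temperatures [degC]
h, n_steps = 1.0, 50                 # step-size [s], number of steps
a = h / (2.0 * tau)

T = [T0]
# Startup: two implicit trapezoidal-rule steps (closed form for linear f).
for _ in range(2):
    T.append((T[-1] * (1.0 - a) + 2.0 * a * T_amb) / (1.0 + a))

# BDF3: 11 T_k = 18 T_{k-1} - 9 T_{k-2} + 2 T_{k-3} + 6 h f(T_k),
# solved in closed form for f(T) = -(T - T_amb)/tau.
c = 6.0 * h / tau
while len(T) <= n_steps:
    T.append((18.0 * T[-1] - 9.0 * T[-2] + 2.0 * T[-3] + c * T_amb) / (11.0 + c))

exact = T_amb + (T0 - T_amb) * math.exp(-n_steps * h / tau)
```

With a consistent startup the multistep solution tracks the exact decay closely; replacing the two startup values with the constant T0 introduces an error that persists well into the simulation, as observed for BDF3* in Figure 15.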

Conclusions
This paper puts forward a methodology to evaluate the performance of time-domain simulation methods for electric, electronic, and thermal equivalent circuits, oriented towards the determination of their suitability to be used in real-time environments. Four benchmark problems with their corresponding reference solutions were defined, together with error metrics and accuracy criteria to enable the comparison of different methods. These examples are easy to reproduce and representative of significant problem types in circuit simulation.
The proposed methodology was used to assess the capability of ODE and DAE solver methods to deliver RT performance in three different hardware and software environments, including ARM boards with limited computing capabilities. The tested methods included transforming the circuit equations into a system of ODEs prior to their numerical integration, and Newton-Raphson iterative solvers for nonlinear systems of DAEs. Backward difference methods and the trapezoidal rule were used as integration formulas.
The numerical experiments performed in this study demonstrated that the benchmarking methodology and the examples can be used to obtain relevant information regarding the performance and features of a given solution method. Results made it possible to discard the direct ODE integration as a generally valid approach, at least in the form presented in the methods section in this paper. Among the DAE solver methods, the first-order BDF integrator, BDF1, consistently showed low performance, requiring very small integration step-sizes, down to three orders of magnitude smaller than those required by the other formulas, to attain the accuracy requirements defined in the benchmark. This rendered it unsuitable except for the simulation of the simplest test cases.
Implementations based on second and third order BDF formulas and the trapezoidal rule delivered comparable performances. For systems with relatively slow dynamics, BDF3 always delivered the best efficiency. TR and BDF2, on the other hand, could be used to obtain faster computations in circuits with nonsmooth behavior when the level of accuracy can be relaxed. A proper initialization of BDF methods is required to obtain accurate simulation results, especially in circuits in which the excitation is not periodic.
Regarding the RT capability of the tested methods, the benchmark framework can be used to provide guidance on the scope of applicability of a method on a certain computing platform. With the PC used in this study, BDF2, BDF3, and TR were able to solve all the problems in the benchmark below the RT mark. For RLC circuits, systems of up to a few thousand variables, depending on the requested precision, could be integrated below this limit. Results also show that ARM single-board computers can be used to simulate most problems in the benchmark set. However, factors beyond computational efficiency, such as system latency and communication delays, determine the usability of a method in RT computing. The benchmarking results can be used to rule out inefficient or inadequate approaches, and to choose the most effective options when dealing with the solution of a given problem.
Besides conclusions about efficiency, the simulation of the benchmark problems led to the identification of issues with the evaluation of the derivatives with respect to time of the system variables. Integration methods that determine the derivatives in future steps using previously computed values of the same derivatives need to enforce the satisfaction of the derivatives with respect to time of the algebraic equations of the circuit. Failure to do so may result in the accumulation of errors in the system derivatives, which can eventually lead to numerical issues in problem solutions. The trapezoidal rule is one of these methods; a projection step was introduced in this paper to address this need of the integrator. It was observed that the computational requirements of the projection step are comparable to those of two or three iterations of the DAE solver.
Author Contributions: The conceptualization and methodology employed in the paper were conducted jointly by its four authors. B.R. and F.G. wrote the simulation software, defined the benchmark examples, and ran the simulation tests. The paper manuscript was prepared by B.R. and F.G. and revised by M.Á.N. and J.C. Research funding was obtained and managed by F.G., M.Á.N., and J.C. All authors have read and agreed to the published version of the manuscript.
Funding: The work reported here was supported by the HiPERFORM project, which has received funding from the ECSEL Joint Undertaking under Grant No. 783174, by AVL through the University Partnership Program, by the Ministry of Economy of Spain through project TRA2017-86488-R "Técnicas de co-simulación en tiempo real para bancos de ensayo en automoción" and the Ramón y Cajal program, contract No. RYC-2016-20222, and by the Galician Government under grant ED431B2016/031.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript:

Appendix A. Airgap Resistor
The value of the air gap resistance in the PMSM in Section 3.4 is obtained from the relation

R = 1 / (ĥ S)

where ĥ is the convection coefficient at the airgap and S is the area of the common surface between stator and rotor. The convection coefficient is a function of the Nusselt number, Nu,

ĥ = Nu k_a / e

where k_a is the air conductivity and e = 0.82 mm is the air gap thickness. The Nusselt number can be evaluated using the correlation in [ ]. The term T_a in this correlation is defined by

T_a = ω² R_m e³ / ν² (A5)

where ω is the angular speed of the motor, R_m = 72.61 mm is the mean radius of the rotor and the stator at the air gap, and ν is the kinematic viscosity of the air. The physical parameters of the air, such as viscosity or conductivity, can be found in [33].
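As a numerical illustration, the term in Equation (A5) can be evaluated for a given motor speed. The kinematic viscosity used below is an assumed value for air at moderate temperature, not a figure taken from the paper:

```python
# Sketch: evaluating the Taylor-number term of Equation (A5),
# T_a = omega^2 * R_m * e^3 / nu^2, with the geometry given in the appendix.
import math

R_m = 72.61e-3   # mean airgap radius [m] (from the appendix)
e = 0.82e-3      # airgap thickness [m] (from the appendix)
nu = 1.7e-5      # kinematic viscosity of air [m^2/s] (assumed value)

def taylor_term(omega):
    # omega: motor angular speed [rad/s]
    return omega**2 * R_m * e**3 / nu**2

Ta = taylor_term(2.0 * math.pi * 50.0)   # e.g. 50 revolutions per second
```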