Reduced-Order Modelling and Homogenisation in Magneto-Mechanics : A Numerical Comparison of Established Hyper-Reduction Methods

The simulation of complex engineering structures built from magneto-rheological elastomers is a computationally challenging task. Using the FE2 method, which is based on computational homogenisation, leads to the repetitive solution of micro-scale FE problems, causing excessive computational effort. In this paper, the micro-scale FE problems are replaced by POD reduced models of comparable accuracy. As these models do not deliver the required reductions in computational effort, they are combined with hyper-reduction methods like the Discrete Empirical Interpolation Method (DEIM), Gappy POD, Gauss–Newton Approximated Tensors (GNAT), Empirical Cubature (EC) and Reduced Integration Domain (RID). The goal of this work is the comparison of the aforementioned hyper-reduction techniques focusing on accuracy and robustness. For the application in the FE2 framework, EC and RID are favourable due to their robustness, whereas Gappy POD rendered both the most accurate and efficient reduced models. The well-known DEIM is discarded for this application as it suffers from serious robustness deficiencies.


Introduction
The ongoing development of so-called smart materials over the last decades has given rise to the quest for numerical models which enable predictive, fast and accurate simulations of engineering structures.For smart materials, the desired constitutive behaviour is frequently architectured by tailoring the microstructure of said material, e.g., fibre-reinforced composites, auxetic materials, metal foams and many more.An established approach to model such structures is denoted multiscale modelling for which commonly two scales, the micro-and macro-scale, are introduced.The geometric complexities and advanced boundary conditions of engineering structures are modelled on the macro-scale, whereas the microstructure is represented on the micro-scale.One way to consistently couple micro-and macro-scale is the so-called FE 2 method [1].FE 2 is a multi-level finite element method that derives the constitutive response in every quadrature point of the macro-scale from an FE simulation incorporating the microstructure using the framework of computational homogenisation [2].
Even though the everlasting increase in computational resources following Moore's law has enabled scientists to solve FE problems with 10 13 DoFs [3], the numerical cost of FE 2 simulations is still prohibitive for most realistic problems.The idea of replacing the micro-scale FE simulation by a less expensive model has brought together the fields of multiscale and reduced-order modelling.
In the last several years, several viable models targeted at reducing the multiscale FE simulation have been developed.In this contribution, we focus on projection-based models using a reduced basis, but there are also alternatives like the Nonuniform Transformation Field Analysis [4,5] and Proper Generalized Decomposition [6,7].In projection-based reduced models, the reduced basis is a set of few functions with global support that is constructed to approximate the solution manifold of the problem in question.Projecting the governing equations onto the reduced basis yields a considerable reduction in the number of unknowns compared to using the locally supported FE basis functions.The most commonly used methods to construct the reduced basis are Proper Orthogonal Decomposition (POD) [8][9][10][11] and the Reduced Basis Method [12][13][14][15].Both rely on solutions of the parametrised partial differential equation (pPDE), for POD, the pPDE is solved for a set of given parameters, whereas the Reduced Basis Method employs a greedy algorithm equipped with an a posteriori error estimator to determine the parameters adaptively.As there are hardly any efficient and reliable error estimators for coupled nonlinear multi-physic problems, POD is the method of choice in this work.In [16], a POD reduced basis was used for the first time in multiscale analysis of nonlinear elasticity at finite strains, namely by reducing the micro-scale model.This was extended in [17] by introducing the computation of a consistent tangent operator based on the reduced model.
For problems with nonlinearities or non-affine parameter dependence, the sole application of a reduced basis does not render the desired computational savings as the nonlinearity or non-affine parameter dependence has to be evaluated for the original model and subsequently projected onto the reduced basis.A widely used method accelerating the computation of the nonlinearity is the (Discrete) Empirical Interpolation Method (D)EIM [18,19].DEIM approximates the nonlinearity by a linear combination of collateral basis functions.The coefficients are computed using interpolation based on values of the nonlinearity sampled at a relatively small number of points.In order to improve the approximation, interpolation is replaced by linear regression for Gappy POD [20].In [21], Petrov-Galerkin projection is used to increase the stability of reduced models.Together with Gappy POD and possibly differing approximations of the reduced system matrix, this is referred to as GNAT (Gauss-Newton Approximated Tensors).As DEIM, Gappy POD and GNAT use collateral basis functions to approximate the nonlinearity, they are classified as collateral basis methods.Both POD and DEIM have been applied previously to various mechanical problems: a simplified beam model for multiscale modelling at small strains including damage [22], strain-softening viscoplasticity at small strains [23] and structural mechanics using a variant of DEIM based on the unassembled nonlinearity [24].A collateral basis for the stresses instead of the nonlinearity was used in [25] for homogenisation of elasto-plastic materials at small strains, together with Gappy POD and a tailored method to determine the locations at which the stresses are evaluated in the reduced model.A detailed survey of DEIM, Gappy POD and GNAT for homogenisation of hyper-elastic materials at finite strains that focus on accuracy and robustness was performed in [26].
Cubature methods are another possibility of reducing the cost of computing the nonlinearity.In this sense, a problem specific quadrature rule replaces the quadrature used to integrate the weak form, e.g., Gaussian quadrature.This empirically determined quadrature uses only a subset of the support points or elements of the original FE model and computes the weights accordingly.This idea was put forward in [27] and later introduced to the field of computational homogenisation as Energy-Conserving Sampling and Weighting (ECSW) [28].A possibility to reduce the cost of constructing the cubature was introduced in [29] together with the term Empirical Cubature (EC).The accuracy and efficiency of EC was compared to a variant of DEIM/Gappy POD used in [25] for homogenisation of elasto-viscoplastic materials in small strains [30].
The Hyper-Reduction method [31] makes use of a reduced integration domain (RID) to speed up the computation of the nonlinearity.It defines test functions with support confined to the RID, which in combination with trial functions obtained by POD results in a Petrov-Galerkin projection.The expression hyper-reduction was coined in [31] but is now used as a term encompassing all methods aiming at accelerating the computation of nonlinearities in the field of model reduction.Therefore, we will refer to the Hyper-Reduction method [31] as RID to avoid any notational confusion.The RID was used for the simulation of elasto-plasticity [32], the simulation of nonlinear thermal and mechanical problems involving internal variables [33] and the lifetime assessment of elasto-plastic structures [34].An algorithmic comparison with DEIM for the nonlinear heat equation was carried out in [35].Similarly, the Missing Point Estimation (MPE) method [36] computes the Galerkin projection in a small subset of the computational domain to accelerate the assembly of the reduced problem.An investigation of the MPE method is beyond the scope of this article and accordingly we refer the interested reader to [37], where a detailed comparison of MPE, DEIM and Gappy POD was performed for a predator-prey model.
In this contribution, we will show the first application of reduced-order modelling for computational homogenisation in magneto-mechanics at finite strains.We will focus on reducing the problem at the micro-scale, using POD to compute the reduced basis and applying following hyper-reduction methods: DEIM, Gappy POD, GNAT, EC and RID.Through various numerical studies, a thorough comparison between the techniques with emphasis on accuracy and robustness will be drawn.

Homogenisation in Magneto-Mechanics
The simulation of engineering structures requires evaluations of a material law at the engineering/macro-scale (≈mm-m).For magneto-rheological elastomers (MREs), the constitutive behaviour on the macro-scale is determined by the underlying microstructure (≈nm-µm).Usually, MREs are composite materials consisting of an elastomeric matrix with embedded magneto-active particles [38] which induce changes in stiffness or deformations due to applied magnetic fields.Due to the scale separation, a resolution of the microstructure in the discretisation of the macrostructure is computationally not feasible.The tools of computational homogenisation offer an expedient to the issue as the constitutive behaviour for any point on the macro-scale is computed from the solution of a boundary value problem (BVP) representative for the microstructure.The material composition of the microstructure is described by an RVE (Representative Volume Element) for which the constitutive behaviour of the constituents is prescribed.In the context of magneto-mechanics, the macroscopic deformation gradient F and the magnetic field H are the input variables for the microstructural BVP [39,40].In the remainder, we use an over-bar to denote macro variables (•).
As is common in homogenisation, the micro displacement and scalar magnetic potential are additively split into two parts, the macroscopic fields and the fluctuations: The macroscopic fields depend linearly on the macroscopic deformation gradient F, the macroscopic magnetic field H and the position vector X.
We use linear boundary conditions to fulfill the Hill-Mandel condition.Using the fluctuations ũ and ỹ as primary variables allows us to transform the linear into homogeneous boundary conditions.The RVE occupies the domain B 0 ⊂ R d with its boundary ∂B 0 , where d denotes the space dimension.
The energy density Ψ (F, H) is expressed in terms of the deformation gradient F and the magnetic field H and used to define the constitutive relations for the Piola stress P and the magnetic induction B.
The balance of linear momentum and Gauss's law for magnetism (also known as conservation of magnetic flux) complete the strong form of magneto-mechanics on the micro-scale [41]: For the sake of an FE solution, the weak form is derived using the test functions δ ũ and δ ỹ.
For the spatial discretisation, the standard Bubnov-Galerkin FEM is used.The continuum body is approximated by a mesh where M denotes the number of elements.The displacement and potential fields in any finite element Ω e are approximated by the piecewise continuous vector-valued polynomials N u i (X) and scalar polynomials N y i (X), respectively: ũ The scalars d u e and d y e are the numbers of mechanical and magnetic DoFs in the element Ω e .Using these approximations results in the discrete weak form The sub-/superscripts (•) u /(•) u and (•) y /(•) y encode whether a variable is associated with the mechanical or magnetic component and are used throughout the remainder of the paper.The notation ( •) is consistently used to differentiate between a continuous field and its discrete FE counterpart, e.g., ũ is the displacement fluctuation and û is the vector containing the nodal coefficients for the FE discretisation.To refer to single elements of any vector/first-order tensor X and of any matrix/second-order tensor Y, the notation X[i] and Y[i, j] are used.The operator M A e=1 represents the assembly of the element contributions and the scalars N u , N y and N = N u + N y are the numbers of DoFs employed in the FE discretisation.
The numerical solution of the system of nonlinear Equations (5) using the iterative Newton-Raphson scheme requires its linearisation introducing the iteration count k.
For the sake of notational clarity, the dependences of Rk ŷk ; F, H and K k ŷk ; F, H are dropped.
The tangent stiffness matrix K is given as = for Ω e ∈ T .
Once the solution of ( 5) is obtained, the output quantities P and B are computed using where the volume of the RVE is denoted by V.

Reduced Basis
Instead of using a large number N of compact trial functions, projection-based ROMs are built upon a small number n of global functions spanning the space in which the solution manifold of the pPDE resides.Consequently, the unknown fluctuation fields ũ and ỹ are expressed as linear combinations of the global trial functions with the reduced coefficients ũr i and ỹr i : The reducibility of the problem, namely the conditions n u N u and n y N y , is accepted implicitly but has to be confirmed by numerical studies.To avoid scaling issues due to differently chosen units, separate reduced bases are used for the mechanical and the magnetic fluctuation fields.The numbers n u , n y and n = n u + n y are the numbers of reduced basis functions to be taken into account.
A common method to compute the reduced basis for a pPDE is POD [9,42].In order to do so, we define the parameter domain of the microscopic problem with reasonably chosen limits for the components of the macroscopic loading parameters.Each element p i = F i , H i ∈ P comprises an instance of the macroscopic deformation gradient and magnetic field.
The parameter domain P is sampled using n s parameters gathered in the set S = p 1 , . . ., p n s ⊂ P (11) and the full-order model (FOM) ( 5) is solved for all elements in S. The solutions are collected in the snapshot matrices and the subsequent application of gives the discrete reduced bases contained in the matrices B u and B y .For details on POD, we refer to [9,43].

Galerkin ROM
In the Galerkin reduced model, the same ansatz (9) as for the solution is used for the test functions Inserting ( 9) and ( 14) into ( 5) results in the weak form of the Galerkin reduced model where the dependences P ũ, ỹ; F, H and B ũ, ỹ; F, H are dropped for notational brevity.
Analogously, the discrete weak ( 16) form and its linearisation ( 17) are derived: Even though the size of the system of linear Equations ( 17) is significantly smaller than in Equation ( 6) and hence the cost of the linear solver reduces from O(N 2 ) to O(n 3 ), the speed-up is only marginal as the assembly of (17) depends on the original problem size.The cost for evaluating the constitutive law for every quadrature point is roughly O(Nn + n el qp M), where n el qp is the number of quadrature points per element.The computational complexities of assembling and projecting B Rk and B K k B are proportional to O(nN) and O(n 2 N + nN), respectively.Therefore, the application of hyper-reduction methods is imperative.

Discrete Empirical Interpolation Method
The Discrete Empirical Interpolation Method [19] is the standard hyper-reduction method for non-affine or nonlinear pPDEs, for some problems even equipped with a posteriori and a priori error estimators [44,45].The first step is to approximate the discrete residuum by a linear combination of collateral basis vectors contained in H R with r being the vector of coefficients.Due to a different number range, it is advisable to approximate the mechanical and magnetic residua by two separate collateral bases H R u and H R y.The collateral basis is computed based on snapshots of the residuum.For that purpose, ( 16) is solved for every parameter p i ∈ S and the residua are collected in the course of the Newton-Raphson process to build the matrices As Ru j p i and Ry j p i converge to the null vector during the iterative solution of ( 16), only residua fulfilling Ru The coefficients in (18) are determined using interpolation meaning the approximation has to be equal to the residuum at the interpolation indices.The matrices P R u and P R y are sampling matrices, where e ρ u i for i = 1, . . ., r u and e ρ y  Introducing ( 18) and ( 21) into (16) gives the hyper-reduced weak form and its linearisation becomes The cost for evaluating the constitutive law for every quadrature point in elements containing interpolation indices is reduced to approximately O(N eval n + n el qp m), where m is the number of elements containing DEIM indices and N eval the number of DoFs associated with this elements.The computational complexities of computing the residuum and tangent stiffness matrix are proportional to O(rN eval ) and O(nrN eval + rN eval ), respectively.It is to be noted that an efficient computation of the stiffness matrix utilises the sparsity of the FE matrix.Consequently, the assembly and solution of (24) do not depend on the size of the FOM and should therefore result in the desired speed-ups.

Gappy POD
Instead of interpolation, Gappy POD uses linear regression to determine the collateral basis coefficients, meaning the residual is evaluated at more indices than coefficients.This is particularly beneficial for hyper-reduced models originating from FE models, as for the calculation of Ru or Ry at each evaluation index ρ u i or ρ y i the solution ( ũ, ỹ) and the respective constitutive components have to be computed for every finite element containing the index.Hence, it is more economical to use all DoFs attached to a node instead of possibly only one as for DEIM.The collateral basis coefficients are computed solving r u = arg min The integers p u and p y are the numbers of FE nodes at which the residua Ru and Ry are computed and N u and N y are the dimensions of the underlying FE model.For (25) to have unique solutions, p u d ≥ r u and p y ≥ r y have to hold.The solutions of ( 25) are obtained by computing the pseudo-inverses P R u H R u + and P R y H R y + rendering the explicit expressions for the collateral basis coefficients.
Inserting ( 26) into (16) gives the Gappy POD hyper-reduced weak form and the linearisation becomes introducing p = p u d + p y .In (28), the only difference to the DEIM hyper-reduced system (24) is the appearance of the pseudo-inverse (•) + instead of the inverse (•) −1 .The cost for evaluating the constitutive law is the same as for DEIM O(N eval n + n el qp m).The computational complexities of computing the residuum and tangent stiffness matrix are proportional to O(pN eval ) and O(npN eval + pN eval ).
To determine the FE nodes/indices, Algorithm A2, given in Appendix B, which is an advancement of the algorithm proposed in [21] for multi-physic problems, is applied.Algorithm A2 uses normalised maxima to cope with distinct domains and different units in multi-physic problems.Algorithm A2 is applied to φR u 1 , . . ., φR u r u and φR y 1 , . . ., φR y r y either separately or combined.In the latter case, the same FE nodes are used for the gappy reconstruction of the residua Ru and Ry , resulting in more efficient reduced models.

GNAT
To improve the accuracy and stability [21] of reduced models, the GNAT hyper-reduced model is not based on Galerkin but on Petrov-Galerkin projection and therefore adopts different spaces for the test and trial functions.
For accuracy reasons, the state-dependent test functions K By r B are chosen and the discrete weak form and its linearisation are obtained.Equation ( 30) is the normal equation for the associated least-squares problem and therefore the solution of ( 29) is equivalent to solving the minimisation problem minimise with the Gauss-Newton method.
As the computation of (31) still depends on the original problem size, GNAT similarly to Gappy POD uses collateral bases to approximate the nonlinearities and linear regression to determine the coefficients: R = H R r and KB = H K k, Numerical experiments have shown that the choice H R = H K and consequently P R = P K renders reduced models of superior accuracy compared to models employing a separate basis for H K .Putting (33) into (31) and multiplying from the left with H R renders the least-squares problem to be solved in every Gauss-Newton iteration ∆ ỹr k = arg min and recalling p = p u d + p y .
The complexity of assembling and solving (34) is similar to (28).As the Gauss-Newton method does not converge quadratically like the classical Newton-Raphson scheme, more iterations are necessary to minimise (32).
The computation of the collateral basis is similar to DEIM except that the residua are gathered during the solution of (31).The matrix P R is determined using Algorithm A2 with the collateral bases as input.

Empirical Cubature
Cubature methods aim at reducing the cost of computing the nonlinearity in (15) by defining an empirical quadrature, which evaluates the integrand only in a limited number of quadrature points or elements.Instead of summing up all element contributions, the nonlinearities are computed only in the elements of the so-called reduced meshes E u and E y and multiplied by positive weights: In (35), each element e in E u or E y is equipped with a positive weight ω u e or ω y e , whereas all the other elements are assigned weights ω u e = ω y e = 0.The approximation in (35)  with m u and m y being the number of elements in the reduced meshes.Different algorithms for the minimisation of ( 36) are discussed in [46].Since this minimisation is numerically expensive, collateral bases By introducing (37) to (36) and recalling that the coefficients in (37) do not depend on the position of the elements in the reduced meshes, we obtain alternative errors For the details on minimisation of (39) in order to obtain the reduced meshes and the weights, the interested reader is referred to Appendix C or [29].In contrast to the method put forward in [29], the EC introduced here uses elements instead of single Gauss points, resembling the ECSW method [28].By doing so, the effective number of quadrature points employed in the reduced model increases, but the implementation is less code invasive.
The linearisation of the weak form of the EC hyper-reduced model ( 35 where e B u and e B y are the restrictions of B u and B u to the finite element Ω e .For EC, the reduced system matrix has the same properties, e.g., symmetry and positive definiteness, as the system matrix of the FE model, as the weights are strictly positive.This property is not shared by the collateral basis methods. The cost for evaluating the constitutive law in the elements of E u ∪ E y is roughly O(N eval n + n el qp m), where m is the number of elements in the union of the reduced meshes and N eval the number of associated DoFs.The assembly of the residuum and the tangent matrix is proportional to O(nN eval ) and O(n 2 N eval + nN eval ), respectively.

Reduced Integration Domain
For RID, two reduced integration domains Ω u RID ⊂ T and Ω y RID ⊂ T are introduced, which are used to define test functions δ ũ and δ ỹ with support only in Ω u RID and Ω y RID .Hence, the nonlinearities will be computed solely in Ω u RID and Ω y RID , which provides the desired reduction of computational cost.In the discrete setting, the test functions in B RID with confined support are expressed in terms of the reduced bases B as The cost for evaluating the constitutive law is roughly O(N eval n + n el qp m), where m is the number of elements of the union Ω u RID ∪ Ω y RID and N eval the number of DoFs associated with those elements.The approximate costs of assembling the residuum and tangent stiffness matrix are O(lN eval ) and O(nlN eval + lN eval ), respectively.
For (43) to be a well-posed problem, B RID is required to have full column rank.If that is not fulfilled or the accuracy of the model is poor, the l element layers surrounding Ω u RID and Ω y RID are included (cf.Algorithm A4 in Appendix D).

Test Problem
The magneto-mechanical material model chosen for the numerical studies is of Neo-Hookean type combining isotropic elastic with linear isotropic magnetic material properties.For further details including expressions of the Piola stress P and the magnetic induction B, see [41].
The mesh used for all numerical tests is displayed in Figure 1, where 1840 quadratic finite elements are used to discretise the continuum body B 0 , resulting in N = N u + N y = 14,882 + 7441 = 22,323 DoFs.
For numerical integration, a 4 × 4 Gaussian Quadrature is employed.The implementation of the tests is based on the open-source FE library deal.II [47].In Table 1, the dimensionless Lamé parameters λ 1 and λ 2 and magnetic permeability µ are given.The inclusion/particle has ten times stronger material parameters than the matrix.Additionally, the parameter domain for the two-dimensional problem to be investigated is prescribed by P = (0.9, 1.2) The output of interest from the reduced-models are the homogenised Piola stress P and magnetic induction B, for which the relative error measures are defined for any set of validation parameters V.
To validate the accuracy and robustness of the reduced models, two sets of randomly chosen parameters V I = p 1 , . . ., p 200 ⊂ P and V II = p 1 , . . ., p 2000 ⊂ P (48) are defined and used in combination with (47).
The computation of P ROM and B ROM (8) requires ũ and ỹ to be computed in all cells of the FE mesh using ( 9).This can be done more efficiently using an auxiliary basis for P and B together with gappy reconstruction, but this renders an additional error.As our focus is on the numerical study of the performance of the hyper-reduction methods, ũ and ỹ are computed for the whole mesh and the constitutive law is subsequently employed to obtain P ROM and B ROM .
It is established in the field of computational homogenisation that the application of linear boundary conditions overestimates the energy compared to e.g., periodic boundary conditions, in particular for small RVEs like the Unit Cell.This has been investigated extensively in [40] for magneto-mechanics.As the choice of boundary conditions does not affect the hyper-reduction methods, the findings from the numerical studies are expected to be valid for different types of boundary conditions.Therefore, due to their simplicity, linear boundary conditions have been chosen to carry out the numerical studies.

Validation of Galerkin ROM
In order to construct the reduced basis, the parameter space P has to be sampled.As the number of sampling points increases exponentially with the dimension of the parameter domain for full tensor grids, sparse grids [48,49] are employed.Sparse grids are based on one-dimensional quadrature rules and a sparse tensor product, which alleviates the curse of dimensionality.For that reason, sparse grids are frequently used in sampling, interpolation and integration of high dimensional functions.
In Figure 2, the sampling of the unit square using a full tensor grid and sparse grids is displayed.Sparse grids built from the one-dimensional Gauss-Legendre quadrature are used to sample the six-dimensional parameter domain P (46).
As the hyper-reduced models are built on top of an existing reduced basis, the accuracy of the Galerkin reduced model ( 16) for varying cardinalities n u and n y of the reduced bases for the fluctuation fields is displayed in Figure 3.For increasing n u and n y , the errors Err P (V I ) and Err B (V I ) decrease monotonously, though the impact of n y on Err P (V I ) becomes negligible for n y ≥ 10.The reduced basis was constructed from n s = 4541 training simulations.As a compromise between accuracy and efficiency, a reduced basis with n u = 20 and n y = 10 is chosen for all following numerical studies.The error of the Galerkin ROM is not only caused by the truncation of the POD basis but also by an insufficient selection of training parameters S. In this case, insufficient refers to a too sparse sampling of the parameter domain P. In Table 2, the errors Err P (V I ) and Err B (V I ) for n u = 20 and n y = 10 for three different training sets are given.The training set S with n s = 4541 is considered sufficiently large as the errors for the two sets with greater cardinality are not significantly smaller.Err B (V I ) 1.55 × 10 −6 1.51 × 10 −6 1.62 × 10 −6 The results of a ROM for one element in V I are displayed in Figure 4.The errors in the homogenised quantities are small O(10 −6 ) and differences between the ROM and FOM in the primary fields ũ and ỹ can not be seen with the unaided eye.Due to the inclusions ten times larger mechanical and magnetic material parameters, the Piola stress and magnetic induction inside the inclusion are significantly larger.

DEIM
Based on a reduced basis with n u = 20 and n y = 10, the results of the DEIM hyper-reduced model ( 23) are summarised in Figure 5, where the collateral basis H R is computed based on residua collected during the solution of ( 16) for n s = 4541 training parameters.For the POD computations, only residua fulfilling Ru ] and the iteration index j were taken into account.
It is well-established that DEIM models lack robustness for nonlinear pPDEs [26], which is exposed in Figure 5a.There is just a small region of combinations of numbers of collateral reduced basis functions r u , r y , for which the reduced model converged for all parameters in the validation set V I .There are two possible causes that prevent the convergence of a model.The first is the occurrence of unphysical deformations expressed by det F ≤ 0 during the solution process and the second is an insufficient reduction of the residuum R r max /R r 0 > 10 −6 , where max = 10 is the highest admissible number of nonlinear solver iterations.From the 135,200 solutions of (23) for 676 different combinations of r u , r y computed in this study, 11,992 failed to converge.It is remarkable that larger r u and r y do not result in more robust models.Figure 5b,c show the output errors, where we have to note that, for the calculation of Err P (V I ) and Err P (V II ), only the converged runs are taken into account.
The errors decrease with increasing r u and r y but certainly not monotonously.For a ROM with r u = 37 and r y = 25, small errors Err P (V I ) = 1.2 × 10 −3 and Err B (V I ) = 5.5 × 10 −4 are obtained with one simulation failing to converge.It is not reasonable to use greater r u and r y as the achievable reduction of the errors is disproportionate to the increasing numerical cost of the ROM.

Gappy POD
For the Gappy POD study, the same collateral basis H R as in the DEIM study is used.In order to have a fair comparison with DEIM, two sets of gappy points, one for the approximation of Ru and the other for Ry in (28), are determined by applying Algorithm A2 separately to the collateral bases.
In Figure 6, the robustness and accuracy of Gappy POD is studied for different combinations of r u , r y .To facilitate an adequately accurate approximation of the nonlinearities, large enough numbers of gappy points p u , p y are employed.Figure 6a shows that linear regression improves the robustness as more combinations of r u , r y exhibit no convergence failures compared to DEIM (c.f. Figure 5).Nonetheless, 11,339 out of 135,200 simulation runs failed, mostly for smaller values of r u and r y .For the failed runs, either the condition R r 10 /R r 0 < 10 −6 is not fulfilled after ten iterations or unphysical deformations det F ≤ 0 are predicted.,c) output error analysis for varying numbers of residuum modes r u and r y for sufficiently large numbers of gappy points p u = r u /2 + 10 and p y = r y + 20 for a reduced basis of size n u = 20 and n y = 10.
For r u = 36 and r y = 24, a combination yielding supposedly robust and accurate models is chosen to investigate the errors for varying numbers of gappy points p u , p y (see Figure 7).Except for p u ≤ 20, no robustness deficiencies occur.The error Err P (V I ) decreases with increasing p u and is hardly affected by changes of p y .Similarly, Err B (V I ) reduces for larger p y and is only minorly affected by p u .c) accuracy for a reduced basis of size n u = 20 and n y = 10 using r u = 36 and r y = 24 residuum modes.

GNAT
For the study of GNAT, the collateral bases contained in H R u and H R y are constructed by gathering the residua from the solution of ( 32) and a subsequent application of POD.Only residua fulfilling Ru ] and the iteration index j were taken into account.The gappy points are obtained by applying Algorithm A2 separately to the collateral mechanical and magnetic basis.
Similarly to the previous studies, the robustness and accuracy of GNAT is tested for different combinations of r u , r y using a large enough number of gappy points.The results are depicted in Figure 8.We never observed unphysical deformations det F ≤ 0 for GNAT hyper-reduced models, the unsuccessful runs are due to reduced models failing to sufficiently reduce the residuum R r 15 /R r 0 < 10 −3 .As the Gauss-Newton scheme does not exhibit quadratic convergence, we allow for up to 15 iterations to minimise the residuum.Furthermore, the solution of the nonlinear least squares problem (32) does not render R r ≡ 0 in general and consequently the convergence criterion is set to 10 −3 .While the error Err P (V I ) in Figure 8b reduces with increasing r u and values in the order of 10 −3 can be achieved, the error Err B (V I ) in Figure 8c increases for greater r u up to a certain point and Err B (V I ) ≈ 5 × 10 −2 seems feasible.,c) output error analysis for varying numbers of residuum modes r u and r y for sufficiently large numbers of gappy points p u = r u /2 + 10 and p y = r y + 20 for a reduced basis of size n u = 20 and n y =

Empirical Cubature
In Figure 9, the accuracy of EC hyper-reduced models for different numbers of elements in the reduced mesh is depicted.To construct the EC model 4541 snapshots of P and B are taken from the solution of ( 15) and processed into n P = 120 and n B = 100 POD modes.The singular value distribution of the snapshot matrices is depicted in Figure 10, indicating that n P = 120 and n B = 100 POD modes are sufficient to represent the stress and induction state.Thereafter, the weights and elements of the reduced meshes are determined by approximately solving (A3) using Algorithm A3, given in Appendix C. Figure 9a shows that Err P (V I ) decreases for greater m u and is barely affected by m y , whereas Err B (V I ) decreases for greater values of both m u and m y , but the dependence on m y is more pronounced.As EC does not employ collateral bases combined with linear regression or interpolation, no convergence issues occur for the EC hyper-reduced models.In Figure 11a, the weights used in a reduced mesh are plotted.Only a small number of elements accumulate more than half of the total weight sum and that is the reason why the errors depicted in Figure 9 decrease slowly with increasing m u and m y .The distribution of elements in the reduced meshes with a focus on elements equipped with relatively large weights is shown in Figure 11b,c.It is remarkable that the elements with large weights are all located inside the inclusion, whereas all the other hyper-reduced models (DEIM, GappyPOD and RID) evaluate nonlinearities in elements at the boundary or in the vicinity of the interface between matrix and inclusion (c.f. Figure 12).4.

Reduced Integration Domain
To construct the reduced integration domain, the gradients of ϕ u n u =20  As for EC, no convergence problems have been observed for RID hyper-reduced models.

Comparison of the Hyper-Reduction Methods
In Table 4, the performance statistics of reduced models for each hyper-reduction method except GNAT are provided.The parameters for the models, which are listed in Table 5, were chosen based on the results from the previous sections to achieve high accuracy at moderate computational cost.We except GNAT from the comparison as these reduced models are inferior to Gappy POD models with respect to robustness and accuracy.Most importantly, the error Err B (V ) produced by GNAT models is too large.
The number of elements in which either the mechanical or magnetic nonlinearity have to be computed are denoted by m u and m y , respectively.The aforementioned elements are highlighted in Figure 12.It possible to use the same elements for the evaluation of the nonlinearities for all hyper-reduction methods except DEIM, rendering supposedly slightly less accurate but more efficient reduced models.However, for the sake of comparison of all introduced Hyper-Reduction methods, the nonlinearities are treated separately, resulting in two distinct sets of elements.In all these elements, the solution ( ũ, ỹ) has to be computed based on the reduced solution ( ũr , ỹr ) with N u eval and N y eval denoting the number of DoFs involved in these operation performed in every iteration of the nonlinear solver.
The errors obtained by DEIM, Gappy POD and RID are in the same range, whilst the errors for EC are at least of one order of magnitude larger.The accuracy of Gappy POD is superior to DEIM as linear regression performs better than interpolation and additionally increases the robustness.For DEIM, 18 out of 2000 runs failed to converge, whereas no such deficiencies are observed for the other methods.
The computation times were measured on a single core (AMD ™ Ryzen ™ 1950X CPU @4 GHz) without using any kind of parallelisation.The speed-up is defined as the ratio of time needed to solve the FOM and the ROM for the 2000 parameters in V II , which does not include the time to calculate the homogenised quantities.The speed-up for DEIM is the greatest as the least number of solution and nonlinearity evaluations had to be performed, with Gappy POD being second due to more evaluations.The EC and RID reduced models are considerably slower as both methods need to evaluate the nonlinearities for a larger number of elements to gain comparable accuracy.Table 5. Parameters of the hyper-reduced models used in Table 4.
DEIM r u = 37 and r y = 25 Gappy POD r u = 36 and r y = 24, p u = 50 and p y = 58 EC m u = 120 and m y = 110 RID l = 1

Conclusions
In this work, we applied the tools of reduced-order modelling to the problems arising in computational homogenisation in magneto-mechanics.The main focus was the investigation and comparison of different hyper-reduction techniques with respect to accuracy and robustness.Collateral basis methods like DEIM and Gappy POD are the fastest and most accurate, but are susceptible to robustness deficiencies.This is particularly true for DEIM, for which we could not build sufficiently robust models in the course of this work.Gappy POD diminishes that issue by using linear regression, providing adequately robust models.Additionally, unlike Gappy POD, DEIM does not offer the option to evaluate the mechanical and magnetic nonlinearity in the same elements, resulting in more expensive reduced models.For those reasons, DEIM will not be considered in future studies.A disadvantage shared among the collateral basis methods is that instances of the FE residuum have to be collected from non-equilibrated states, which results in more expensive POD computations.
EC and RID solve the weak form in a subdomain, to which the first refers to reduced mesh and the latter to reduced integration domain.To obtain similar accuracy as collateral basis methods, EC and RID have to perform more function evaluations and are therefore more expensive.However, their robustness is superior to the collateral basis methods and hence they are particularly suitable for multi-query frameworks the FE 2 method.For problems with stronger nonlinearities, e.g., due to complex material models, rate-dependence, plasticity and many more, the robustness superiority will be even more pronounced.
The next step is to equip the reduced models with an auxiliary basis to efficiently compute the homogenised Piola stress and magnetic induction, which can be utilised in a perturbation-type method [50] to obtain the macroscopic tangent moduli.Similarly, the macroscopic tangent moduli could be computed adapting the method described in [51] for the reduced model.with Ju ∈ R n u n P ×M , Jy ∈ R n y n B ×M , bu ∈ R n u n P and by ∈ R n y n B , . . . . . . . . . .
Note that the problems in (A2) allow for the trivial solutions ω u = 0 and ω u = 0. Any POD mode φP j or φB j is a linear combination of the snapshots P1 , . . ., Pn s or B1 , . . ., Bn s and, as the snapshots are taken from states of equilibrium, the right-hand sides become bu = 0 and by = 0. Therefore, problems (A2) are regularised by adding the constraints ∑ By subtracting the volume averaged row-sums, the regularised minimisation problems (A3) are obtained and approximately solved using Algorithm A3:

j 1 P 1 P
for j = 1, . . ., r y are unit vectors with only one non-zero component in the ρ u i -th and ρ y j -th entry.Application of the DEIM Algorithm A1, provided in Appendix A, to the collateral bases φR u indices and guarantees the matrix products P R u H R u and P R u H R u to be non-singular.Consequently, the interpolation coefficients are calculated by r u = P R u H R u −R u Ru and r y = P R y H R y −R y Ry .

2 2with 2 with P R y = e ρ y 1 ,
P R u = e ρ u 1 , . . ., e ρ u pud ∈ N N u ×p u d and r y = arg min a∈R ry P R y Ry − P R y H R y a 2 . . ., e ρ y py ∈ N N y ×p y .

.
The reduced meshes E u and E y equipped with the weights {ω u e } m u e=1 and ω y e m y e=1 are constructed by minimising the errors (36), and induction fields are introduced, where n P n s and n B n s should hold.For that reason, (15) is solved for the parameters in S(11) and the snapshots of the stress and induction fields are gathered in the matrices S P and S B .A successive application of POD gives the collateral bases H P and H B : POD (S P ) → H P = φP 1 , . . ., φP n P ∈ R n qp d 2 ×n P with S P = P(p 1 ), . . ., P(p n s ) ∈ R n qp d 2 ×n s , POD (S B ) → H B = φB 1 , . . ., φB n B ∈ R n qp d×n B with S B = B(p 1 ), . . . ,B(p n s ) ∈ R n qp d×n s . of P and B at the n qp quadrature points of the FE model.

=RID = e ρ y 1 ,
P RID P RID B with P u RID = e ρ u 1 , . . ., e ρ u lu ∈ N N u ×l u , P y . . ., e ρ y ly ∈ N N y ×l y and P u i ∈ R N u for i = 1, . . ., l u and e ρ y j ∈ R N u for j = 1, . . ., l y are unit vectors with only one non-zero component in the ρ u i -th and ρ y j -th entry.The indices ρ u reduced domains are generated from the gradients of the reduced bases H ∇ X u := ∇ X φu 1 , . . ., ∇ X φu n u ∈ R n qp d 2 ×n u and H ∇ X y := ∇ X φy 1 , . . ., ∇ X φy n y ∈ R n qp d×n y .qp quadrature points of the FE model.Applying the DEIM Algorithm A1 to H ∇ X u and H ∇ X y returns the indices used to construct the reduced domains Ω u RID and Ω y RID .Strictly speaking, the reduced domains are the unions of elements containing these indices.Using (41) in (16) renders the discrete weak from of the RID hyper-reduced model R r k := B P RID P RID Rk = 0 (43) and its linearisation becomes B P RID precomputed: R n×l P RID K k B∆ ỹr k = − B P RID precomputed: R n×l P RID Rk .

Figure 1 .
Figure 1.FE mesh of a Unit Cell discretised using M = 1840 elements with quadratic Ansatz functions resulting in N u = 14, 882 and N y = 7441 DoFs.

Figure 2 .
Figure2.Sampling of the two-dimensional parameter domain [0, 1] × [0, 1] using a full tensor grid and two sparse grids based on a one-dimensional Gauss-Legendre quadrature with different sampling densities.

Figure 3 .
Figure 3. Errors in homogenised quantities of interest for varying sizes of reduced bases n u and n y computed from n s = 4541 training simulations.

Figure 5 .
Figure 5. (a) robustness and (b,c) output error analysis for varying numbers of DEIM indices r u and r y for a reduced basis of size n u = 20 and n y = 10.

Figure 6 .
Figure 6.(a) robustness and (b,c) output error analysis for varying numbers of residuum modes r u and

Figure 7 .
Figure 7. Influence of gappy point numbers p u and p y on (a) robustness and (b,c) accuracy for a reduced basis of size n u = 20 and n y = 10 using r u = 36 and r y = 24 residuum modes.

Figure 8 .
Figure 8.(a) robustness and (b,c) output error analysis for varying numbers of residuum modes r u and

Figure 9 .Figure 10 .
Figure 9.Output error analysis for varying numbers of elements m u and m y constituting the reduced meshes E u and E y built using n P = 120 stress and n B = 100 induction modes for a reduced basis of size n u = 20 and n y = 10.

Figure 11 .Figure 12 .
Figure 11.(a) values and distribution of weights (b) ω u and (c) ω y in T for reduced meshes E u and E y with m u = 120 and m y = 110 elements.The yellow circles mark the boundary between matrix and inclusion.For better quality, we point to the online version of the article.

i=1
and ϕ y n y =10 i=1 are computed and the application of Algorithm A4 yields the reduced domains Ω u RID and Ω y RID .The errors for three different choices of the number of surrounding layers l are listed in Table 3.As expected, the errors decrease with increasing l but solving (43) becomes computationally more expensive.If no surrounding layers are included, the reduced bases ϕ u n u =20 i=1 and ϕ y n y =10 i=1 in Ω u RID and Ω y RID are linearly dependent and consequently (43) is not well-posed.

(
ω u , E u ) = arg min solved and read in matrix format as(ω u , E u ) = arg min R d u

Table 2 .
Output error for different numbers of training parameters n s for a reduced basis of fixed size n u = 20 and n y = 10.Err P (V I ) 2.75 × 10 −6 2.59 × 10 −6 2.53 × 10 −6