emgr - The Empirical Gramian Framework

Gramian matrices are a well-known encoding for properties of input-output systems such as controllability, observability or minimality. These so called system Gramian matrices were developed in linear system theory for applications such as model order reduction of control systems. Empirical Gramian matrices are an extension to the system Gramians for parametric and nonlinear systems as well as a data-driven method of computation. The empirical Gramian framework \emgr implements the empirical Gramians in a uniform and configurable manner, with applications such as Gramian-based (nonlinear) model reduction, decentralized control, sensitivity analysis, parameter identification and combined state and parameter reduction.

P r e p r i n t

Introduction
Attributes of input-output systems, such as controllability, observability or minimality can be assessed by special matrices. These so-called system Gramian matrices, or short system Gramians, have manifold applications in system theory, control theory and mathematical engineering. Originally, Gramian-based methods were developed for linear systems [46]. The empirical Gramian matrices [49] are an extension of Gramian-based methods to nonlinear and parametric systems. This work summarizes the empirical Gramian framework (emgr) [35], a compact software toolbox, which implements various types of empirical Gramians as well as the related empirical covariance matrices. An important use of empirical Gramian matrices is model order reduction (MOR), utilizing the capability of system Gramians to quantify the input-output coherence of the underlying system's states based on controllability and observability. Several variants of Gramian-based model reduction are available, for example: • Linear Model Order Reduction [57], • Robust Model Reduction [73], • Parametric Model Order Reduction (pMOR) [39], • Nonlinear Model Order Reduction (nMOR) [49,27,15,79], • Second-Order System Model Reduction [80], • Combined State and Parameter Reduction [38].

P r e p r i n t
This wide range of applications and the compatibility to nonlinear systems make empirical Gramians a versatile tool in many system-theoretic computations. Furthermore, the empirical Gramians provide a data-driven method of computation with close relations to proper orthogonal decomposition (POD) and balanced POD (bPOD) [76,62]. Various (Matlab) implementations are available for the computation of linear system Gramians by the solution of a matrix equation, such as the basic sylvester command, the gram and lyap commands from the Matlab Control Toolbox 1 and Octave Control Package 2 or the M.E.S.S library [63]. For empirical Gramians, the only other generic implementation, to the author's best knowledge, is [75], which provides the empirical controllability Gramian and the empirical observability Gramian, but not an empirical cross Gramian (see Sections 3.1.3 and 3.1.4). This makes emgr the unique (open-source) implementation of all three: the empirical controllability Gramian W C , the empirical observability Gramian W O and the empirical cross Gramian W X (sometimes also symbolized by W CO and X CG ).

Aim
After its initial version 1.0 release, accompanied by [37], the empirical Gramian framework 3 has been significantly enhanced. Apart from extended functionality and accelerated performance, various new concepts and features were implemented. Now, with the release of version 5.0 [35], this is a follow up work illustrating the current state of emgr and its applicability, as well as documenting the flexibility of this toolbox. In short, the major changes involve: • Non-symmetric cross Gramian implementation, • Linear cross Gramian implementation, • Time-integrator interface, • Distributed cross Gramian implementation and interface, • Inner product kernel interface, • Time-varying system compatibility, • Tensor-based observability trajectory snapshot organization.

Outline
This work is structured as follows: In Section 2 the empirical Gramian's main application, projection-based model order reduction, is briefly described; followed by Section 3 presenting the mathematical definitions of the computable empirical Gramians. Section 4 summarizes the design decision for emgr, while Section 5 documents usage and configuration. Numerical examples are demonstrated in Section 6 and lastly, in Section 7, a short concluding remark is given. P r e p r i n t

Mathematical Preliminaries
The mathematical objects of interest are nonlinear parametric input-output systems, which frequently occur in physical, chemical, biological and technical models. These control system models consist of a dynamical system (typically on R) and an output function, which maps the input u : R >0 → R M via the state x : R >0 → R N to the output y : R >0 → R Q : The potentially nonlinear vector-field f : evaluations of the state x(t) and input or control u(t), as well as the parameters θ ∈ R P . Together with an initial condition x(0) = x 0 ∈ R N , this setup constitutes an initial value problem.

Model Reduction
The aim of model reduction is the algorithmic computation of surrogate reduced order models with lower computational complexity than the original full order model. For the sake of brevity only combined state and parameter reduction is summarized here, which encloses state-space reduction, parametric state-space reduction and parameter-space reduction as special cases; for an elaborate layout see [3,7,34]. Given the general, possibly nonlinear input-output system (1), a combined state and parameter reduced order model: with a reduced state x r : R >0 → R n N and a reduced parameter θ r ∈ R p P is sought. Accordingly, a reduced vector-field f r : R >0 × R n × R M × R p → R n and a reduced output functional g r : R >0 × R n × R M × R p → R Q describe the reduced system, for which the reduced system's outputs y r : R >0 → R Q should exhibit a small error compared to the full order model, yet preserving the parameter dependency: A class of methods to obtain such a reduced order model and fulfill associated requirements is described next.

Projection-Based Model Reduction
Projection-based model reduction is based on bi-orthogonal truncated projections for the state-and parameter-space respectively. The state-space trajectory x(t), near a steady statex ∈ R N , is approximated using truncated reducing and reconstructing projections V ∈ R n×N and U ∈ R N ×n , with V U = 1 n : P r e p r i n t The relevant parameter-space volume is also approximated by truncated reducing and reconstructing projections Λ ∈ R p×P and Π ∈ R P ×p , with ΛΠ = 1 p : Given these truncated projections, a projection-based reduced order model is then obtained by:ẋ Thus, to obtain a projection-based reduced order model with respect to the state-and parameter-space, the overall task is determining the truncated projections U , V , Λ and Π. This reduced model still requires an evaluation of the original model's nonlinearities for high-dimensional argumentsx + U x r (t). To thwart this so-called lifting bottleneck, a hyper-reduction method, such as the discrete empirical interpolation method (DEIM) [10] could be utilized, and is briefly described for empirical-Gramian-based reduction in [54]. Lastly, it should be noted, that this approach produces globally reduced order models, meaning the projections U , V , Λ, Π are valid over the whole operating region, which is an application-specific subspace of the Cartesian product of the full order state-and parameter-space R N × R P .

Gramian-Based Model Reduction
Gramian-based model reduction approximates the input-output behavior of a system by removing the least controllable and observable state components. To this end the system is transformed to a representation in which controllability and observability are balanced. Given a controllability Gramian W C (Section 3.1.1) and observability Gramian W O (Section 3.1.2) to an input-output system, a balancing transformation [57] is computable; here in the variant from [25] utilizing the singular value decomposition (SVD): Partitioning rows and columns of U and V based on the Hankel singular values in D, which indicate the balanced state's relevance to the system's input-output coherence, and discarding the partitions associated to small singular values, corresponds to the balanced truncation method [57,3]. A cross Gramian W X (Section 3.1.4) encodes both, controllability and observability, in a single linear operator. For a symmetric system, a balancing transformation can then be obtained from an eigenvalue decomposition (EVD) of the cross Gramian [1,6]: P r e p r i n t Alternatively, an approximate balancing transformation is obtained from an SVD of the cross Gramian [70,38]: Similarly, the parameter projection can be based on associated covariance matrices. A transformation aligning the parameters along their principal axis, resulting from an SVD of a such parameter covariance matrix ω [74,38,34]: yields truncatable projections given by the singular vectors. Using only the left or only the right singular vectors of W X or ω for the (truncated) projections of the state-or parameter-space respectively, and their transpose as reverse transformation, results in orthogonal (Galerkin) projections [29]. This approach is called direct truncation method [22,38], i.e. V := U or Λ := Π .

Empirical Gramians
Classically the controllability, observability and cross Gramians are computed for linear systems by solving matrix equations. Computing system Gramians empirically by trajectory simulations, was already motivated in [57] and systematically introduced in [49]. The empirical Gramians are a data-driven extension to the classic system Gramians, and do not depend on the linear system structure. The central idea behind the empirical Gramians is the averaging over local Gramians for any varying quantity, such as inputs, initial states, parameters or time-dependent components around an operating point [47]. In the following, first the empirical Gramians for state-space input-output coherence are summarized, then the empirical Gramians for parameter-space identifiability and combined state and parameter evaluation are described.

State-Space Empirical Gramians
Gramian-based controllability and observability analysis is based on linear system theory [33], which deals with linear (time-invariant) systems, An obvious approach for nonlinear systems is a linearization near a steady state [53], but this may obfuscate the original transient dynamics [69,16]. Alternatively, the nonlinear balancing approach from [64], based on controllability and observability energy functions, could be used. Yet practically, the associated nonlinear system Gramians require solutions to a Hamilton-Jacobi partial differential equation and a nonlinear Lyapunov equation or a nonlinear Sylvester equation, which is currently not feasible for large-scale systems. A compromise between linearization and nonlinear Gramians are empirical Gramians [49,27]. Empirical Gramians are computed by systematically averaging system Gramians obtained from numerical simulations over locations in an operating region P r e p r i n t near a steady-state. An operating region is defined in this context by sets of perturbations for inputs / controls and the steady-state. Originally in [49], these perturbation sets are constructed by the Cartesian product of sets of directions (standard unit vectors), rotations (orthogonal matrices) and scales (positive scalars) for the input and steady-state respectively. In the empirical Gramian framework, the rotations are limited to the set of the unit matrix and negative unit matrix, as suggested in [49]. This constraint on the rotation entails many numerical simplifications and reduces the perturbation sets to directions (standard unit vectors) and scales (non-zero scalars): Yet, only single input and state components can be perturbed at a time in this manner. Empirical Gramians use a centering of the trajectories around the temporal average and solely use impulse input type controls u(t) = δ(t) [49]. The related empirical covariance matrices center the trajectories around a steady-state and allow arbitrary step functions [30,31]. The empirical Gramian framework allows to compute either. In the following, empirical Gramians and empirical covariance matrices will be jointly referred to by the term "empirical Gramian".

Empirical Controllability Gramian
The (linear) controllability 4 Gramian quantifies how well the state is driven by the input of an underlying linear system and is defined as: The empirical variant is given by the following definition based on [49,30].

Definition 1 (Empirical Controllability Gramian)
Given non-empty sets E u and S u , the empirical controllability Gramian W C ∈ R N ×N is defined as: with the state trajectories x km (t) for the input configurationŝ u km (t) = c k e m • u(t) +ū, and the offsetsū,x km .
For an asymptotically stable linear system, delta impulse input u i (t) = δ(t) and an arithmetic average over time as offsetx = 1 T T 0 x(t) dt, the empirical controllability Gramian is equal to the controllability Gramian W C = W C [49]. P r e p r i n t

Empirical Observability Gramian
The (linear) observability Gramian quantifies how well the state is visible in the outputs of an underlying linear system and is defined as: The empirical variant is given by the following definition based on [49,30].

Definition 2 (Empirical Observability Gramian)
Given non-empty sets E x and S x , the empirical observability Gramian W O ∈ R N ×N is defined as: For an asymptotically stable linear system, no input and an arithmetic average over time as offsetȳ = 1

Empirical Linear Cross Gramian
The (linear) cross Gramian [18,20] quantifies the controllability and observability, and thus minimality, of an underlying square dim(u(t)) = dim(y(t)), linear system and is defined as: Augmenting the linear system's dynamical system component with its transposed system 5 , induces an associated controllability Gramian of which the upper right block corresponds to the cross Gramian [21,65]: The empirical variant for just the upper right block of this augmented controllability Gramian (3) is given by the following definition based on [8].

Definition 3 (Empirical Linear Cross Gramian)
Given non-empty sets E u and S u , the empirical linear cross Gramian W Y ∈ R N ×N is defined as: with the state trajectories x km (t) and adjoint state trajectories z km (t) for the input configurationsû km (t) = c k e m • u(t) +ū, and the offsetsū,x km ,z km .

P r e p r i n t
For an asymptotically stable linear system, delta impulse input u i (t) = δ(t) and an arithmetic average over time as offsetx = 1 T T 0 x(t) dt, the empirical linear cross Gramian is equal to the cross Gramian due to the result of the empirical controllability Gramian. This approach is related to balanced POD [4].

Empirical Cross Gramian
Analytically, a cross Gramian for (control-affine) nonlinear gradient systems was developed in [43,24], yet the computation of this nonlinear cross Gramian 6 is infeasible for large systems. For (nonlinear) SISO systems, the empirical variant of the cross Gramian is developed in [71], for (nonlinear) MIMO systems in [38].

Definition 4 (Empirical Cross Gramian)
Given non-empty sets E u , E u , S x and S u , the empirical cross Gramian W X ∈ R N ×N is defined as: For an asymptotically stable linear system, delta impulse input u i (t) = δ(t) and an arithmetic averages over time as offsetsx = 1 dt, the empirical cross Gramian is equal to the cross Gramian W X = W X [71,38].

Empirical Non-Symmetric Cross Gramians
The (empirical) cross Gramian is only computable for square systems, and verifiably useful for symmetric or gradient systems [18,70,38]. In [40] an extension to the classic cross Gramian is proposed. Based on results from decentralized control [56], this non-symmetric cross Gramian is computable for non-square systems and thus non-symmetric systems. Given a partitioning of (2), the (linear) non-symmetric cross Gramian is defined as: For this cross Gramian to the associated "average" SISO system, an empirical variant is then given by: P r e p r i n t

Definition 5 (Empirical Non-Symmetric Cross Gramian)
Given non-empty sets E u , E u , S x and S u , the empirical non-symmetric cross Gramian W Z ∈ R N ×N is defined as: with the state trajectories x km (t) for the input configurationŝ u km (t) = c k e m • u(t) +ū, the output trajectories y lj (t) for the initial state configurations x lj 0 = d l j +x, and the offsetsū,x km ,x,ȳ lj .

Parameter-Space Empirical Gramians
To transfer the idea of Gramian-based state-space reduction to parameter-space reduction, the concepts of controllability and observability are extended to the parameter-space. This leads to controllability-based parameter identification and observability-based parameter identification.

Empirical Sensitivity Gramian
Controllability-based parameter identification can be realized using an approach from [74], which treats the parameters as constant inputs. The controllability Gramian for a linear system with linear parametrization can be decomposed additively based on linear superposition: Similar to [71] the trace of the parameter-controllability Gramians W C,i embodies a measure of sensitivity, and holds approximately for systems with nonlinear parametrization [37].

Definition 6 (Empirical Sensitivity Gramian)
The empirical sensitivity Gramian is given by a diagonal matrix with entries corresponding to the traces of the parameter-controllability Gramians, The sum over all controllability Gramians can also be used for robust model reduction [73].

Empirical Identifiability Gramian
For an observability-based parameter identification, the parameters are interpreted as additional states of the system [68,26]. This approach leads to the P r e p r i n t augmented system, in which the system's state x is appended by the parameter θ and, since the parameters are (assumed) constant over time, the components of the vector-field associated to the parameter-states are zero: The observability Gramian to this augmented system, the augmented observability Gramian, has the block structure: with the state-space observability Gramian W O , the parameter-space observability Gramian W I and the mixed state and parameter block W M = W M . To isolate the parameter identifiability information, the state-space block is eliminated.

Definition 7 (Empirical Identifiability Gramian)
The empirical identifiability Gramian is given by the Schur-complement of the empirical augmented observability Gramian for the lower right block: Often, it is sufficient to approximate the empirical identifiability Gramian by the lower right block of the augmented observability Gramian W P : Apart from the relation of identifiability Gramian to the Fischer information matrix [68], also the connections of the (parameter) observability Gramian to the Hessian matrix [51] is noted here.

Empirical Cross-Identifiability Gramian
If a system is square, the augmented system (4) remains square and for linear systems also symmetry is preserved. Hence, a cross Gramian of the augmented system is computable.

Definition 8 (Empirical Joint Gramian) The empirical joint Gramian is given by the empirical cross Gramian of the augmented system.
The joint Gramian is an augmented cross Gramian and has a similar block structure as the augmented observability Gramian, but due to the uncontrollable parameter-states, the lower (parameter-related) blocks are identically zero. Nonetheless, the observability-based parameter identifiability information can be extracted.
P r e p r i n t

Definition 9 (Empirical Cross-Identifiability Gramian)
The empirical cross-identifiability Gramian is given by the Schur-complement of the symmetric part of the empirical joint Gramian for the lower right block: Thus, the empirical joint Gramian enables the combined state and parameter analysis by the empirical cross Gramian W X and empirical cross-identifiability Gramian WÏ from a single N × N + P matrix. Additionally, the empirical joint Gramian may also be computed based on the non-symmetric cross Gramian.

Implementation Details
This section states concisely the theoretical, practical and technical design decisions in the implementation of the empirical Gramian frameworkemgr [35].

Design Principles
emgr is realized using the high-level, interpreted Matlab programming language, which is chosen due to its widespread use, long-term compatibility and mathematical expressiveness. This enables first, a wide circulation due to compatibility with Mathworks Matlab R [42], and second, the usage of the open-source variant 7 Gnu Octave [13]. Generally, the implementation of emgr follows the procedural programming paradigm, includes some functional programming techniques and avoids object-oriented programming.
Since empirical Gramians are computable by mere basic linear algebra operations, Matlab code can be evaluated efficiently by vectorization, which transfers computationally intensive tasks as bulk operations to the BLAS (Basic Linear Algebra Subroutines) back-end. Overall, emgr is a reusable open-source toolbox, and encompasses less than 500 LoC (Lines of Code) in a single file. Apart from a Matlab interpreter, emgr has no further dependencies, such as on other toolboxes. The source code is engineered with regard to the best practice guides [45] (coding style) and [2] (performance). Furthermore, two variants of emgr are maintained: First, emgr_oct 8 , uses Octave-specific language extension: default arguments and assignment operations, second, emgr_legacy 9 , enables compatibility to Matlab versions before 2016b not supporting implicit expansion, also known as automatic broadcasting.

Parallelization
Apart from vectorization allowing the implicit use of SIMD (single-instructionmultiple-data) functionality for vectorized block operations, also multi-core parallelization is used to maximize use of available compute resources. 7 The FreeMat Matlab interpreter (Version: 4.0) [5] is not compatible. 8 See http://gramian.de/emgr_oct.m 9 See http://gramian.de/emgr_legacy.m P r e p r i n t

Shared Memory Parallelization
For shared memory systems with uniform memory access (UMA), two types of parallelization are utilized. First, an implicit parallelization may be applied by the interpreter for an additional acceleration of block operations. Second, explicit parallelization is available for the computation of different state and output trajectories, using parallel for-loops parfor 10 , but deactivated by default to guarantee replicable results, as the use of parfor does not guarantee a unique order of execution.

Heterogeneous Parallelization
The actual empirical Gramians result from N 2 inner products. In case of the default Euclidean inner product, this amounts to a dense matrix-matrixmultiplication 11 , which can be efficiently computed by GPGPUs (General Purpose Graphics Processing Units). In case an integrated GPU with zero-copy shared memory architecture, such as UMM (uniform memory model) or hUMA (heterogeneous unified memory access) [61], is used, the assembly of the Gramian matrices can be performed with practically no overhead, since the trajectories, which are usually computed and stored in CPU memory space, do not need to be copied between CPU and GPU memory spaces. The GPU can directly operate on the shared memory.

Distributed Memory Parallelization
A disadvantage of empirical Gramian matrices is the quadratically growing memory requirements with respect to the state-space (and parameter-space) dimension. To combat this shortcoming, a specific property of the empirical cross Gramian can be exploited: The columns of the empirical cross Gramian, and thus the empirical joint Gramian, may be computed separately, hence this distributed empirical cross Gramian [36,Sec. 4.2] is computable in parallel and communication-free on a distributed memory computer system, or sequentially in a memory-economical manner on a unified memory computer system. This column-wise computability translates also to the empirical joint Gramian and the non-symmetric variants of the empirical cross and joint Gramian. Based on this partitioning, a low-rank representation can be obtained in a memory-bound or compute-bound setting together with the HAPOD (hierarchical approximate proper orthogonal decomposition) [36]. This POD variant allows to directly compute a Galerkin projection from an arbitrary column-wise partitioning of the empirical cross Gramian. furthermore, a single argument variant may also be used,

emgr('version')
which returns the current version number.

Mandatory Arguments
For the minimal usage, the following five arguments are required: f handle to a function with the signature xdot = f(x,u,p,t) representing the system's vector-field and expecting the arguments: current state x, current input u, (current) parameter p and current time t.
g handle to a function with the signature y = g(x,u,p,t) representing the system's output functional and expecting the arguments: current state x, current input u, (current) parameter p and current time t.
If g = 1, the identity output functional g = x(t) is assumed. w character selecting the empirical Gramian type; for details see Section 5.2.

Features
The admissible characters to select the empirical Gramian type are as follows: 'c' Empirical controllability Gramian (see Section 3.1.1), emgr returns: N × N empirical cross Gramian matrix W X , P × P empirical cross-identifiability Gramian matrix WÏ .

Non-Symmetric Cross Gramian
The non-symmetric cross Gramian [39] (see Section 3.1.5) is a special variant of the cross Gramian for non-square and non-symmetric MIMO systems, which reduces to the regular cross Gramian for SISO systems. Since the computation is similar to the empirical cross Gramian and a non-symmetric variant of the empirical joint Gramian shall be computable too, instead of a Gramian type selected through the argument w, it is selectable via an option flag: Non-symmetric variants may be computed for the empirical cross Gramian w = 'x', empirical linear cross Gramian w = 'y' or empirical joint Gramian w = 'j' by activating the flag nf(7) = 1.

Parametric Systems
Parametric model order reduction is accomplished by averaging an empirical Gramian over a discretized parameter-space [39]. To this end the parameter sampling points, arranged as columns of a matrix, can be supplied via the argument pr.

Time-Varying Systems
Since empirical Gramians are purely based on trajectory data, they are also computable for time varying systems as described in [14]. The empirical Gramian framework can compute averaged Gramians for time varying systems [58], which are time independent matrices. P r e p r i n t

Optional Arguments
The eight optional arguments allow a detailed definition of the operating region and configuration of the computation. reserved for internal use.
P r e p r i n t

Option Flags
The vector nf contains ten components, each representing an option with the default value zero and the following functionality: nf ( = 1 Approximate Schur-complement,

Schur-Complement
The observability-based parameter Gramians, the empirical identifiability Gramian W I and the empirical cross-identifiability Gramian WÏ , utilize an inversion as part of a (approximated) Schur-complement. Instead of using a Schurcomplement solver or the pseudo-inverse, an approximate inverse with computational complexity O(N 2 ) based on [78] is utilized, with the diagonal matrix D, D ii = A ii and the matrix of off-diagonal elements E = A − D. This approximate inverse is used by default for W I and WÏ .

Configuration
Further preferences for the integrator, the inner product and the distributed cross Gramian partitioning scheme, may be configured by global variables.

Solver Interface
To provide a problem-specific integrator to generate the state and output trajectories, a global variable named ODE is available, and expects a handle to a function with the signature: y = ODE(f,g,t,x0,u,p) P r e p r i n t comprising the arguments: f handle to a function with the signature xdot = f(x,u,p,t) representing the system's vector-field and expecting the arguments: current state x, current input u, (current) parameter p and current time t.
g handle to a function with the signature y = g(x,u,p,t) representing the system's output functional and expecting the arguments: the current state x, current input u, (current) parameter p and current time t. x0 column vector of dimension N setting the initial condition.
u handle to a function with the signature u_t = u(t).
p column vector of dimension P setting the parameter.
The solver is expected to return a discrete trajectory matrix of dimension dim(g(x(t), u(t), θ, t)) × T h . As a default solver for (nonlinear) initial value problems, the optimal secondorder strong stability preserving (SSP) explicit Runge-Kutta method [48] is included in emgr. This single-step integrator is implemented in a low-storage variant, and the stability of this method can be increased by additional stages, which is configurable by a global variable named STAGES. The default number of stages is STAGES = 3, yielding the SSP32 method.

Inner Product Interface
The empirical Gramian matrices are computed by inner products of trajectory data. To enable the computation of empirical Gramians with custom inner products such as in reproducing kernel Hilbert spaces (RKHS) [9], an interface for the inner product used during the assembly of an empirical Gramian matrix, is provided by a global variable named DOT, which expects a handle to a function with the signature: w = DOT(x,y) and the arguments: y matrix of dimension T h × n for n ≤ N . This product has to return an N × n matrix. By default, the Euclidean inner product, thus the standard matrix multiplication, with the identity kernel is used: Other choices are universal kernels such as the polynomial, Gaussian or Wendland kernels [77]. Furthermore, the inner product interface may be used to directly compute the trace of an empirical Gramian by using a pseudo kernel: DOT = @(x,y) sum(sum(x.*y')) P r e p r i n t which exploits a property for computing the trace of a matrix product tr(AB) = i j A ij B ji . This interface may also be used to offload the matrix multiplication to an accelerator such as a GPGPU, as motivated in Section 4.2.2.

Distributed Interface
The distributed empirical cross Gramian (Section 4.2.3) can be configured by the global variable DWX, which is expected to be a two-component vector, with a ∈ N defining the maximum number of columns per partition, and b ∈ N setting the current running index of partitions. Together with a distributed singular value decomposition, such as the HAPOD, an empirical-cross-Gramianbased Galerkin projection is computable with minimal communication parallely on a distributed memory system, or sequentially on a shared memory system.

Numerical Examples
In this section empirical-Gramian-based model reduction techniques are demonstrated for two test systems using [13]; first, for a linear state-space symmetric MIMO system, second, for a nonlinear SIMO system.

Linear Verification
A linear test system of the form (2) is generated using the inverse Lyapunov procedure [12], in a variant that enforces state-space symmetric systems [41]. For state-space symmetric systems, A = A , B = C , all system Gramians are equal [52], hence it is sufficient to compute the SVD of an empirical linear cross Gramian (see Section 3.1.3) to obtain a balancing transformation. The system is configured to have N = dim(x(t)) = 256 states and dim(u(t)) = dim(y(t)) = 4 inputs and outputs. For the computation of the empirical linear cross Gramian of the reduced order model, an impulse input u i (t) = δ(t) and a zero initial state x 0,i = 0 is used, while for the trajectory simulation the default SSP32 (Section 5.5.1) integrator is utilized.
To quantify the quality of the resulting reduced order models, the error between the original full order model output and the reduced order model's output is compared in the time-domain L 2 -norm, Also, the compliance with the lower bound, given by error system's Hankel norm, and with the balanced truncation upper bound is assessed [3]: for a reduced model of order n. Instead of impulse input, zero-centered, unitvariance Gaussian noise is used as input time series for the evaluation.    1 shows the reduced order model's relative L 2 -norm model reduction error, as well as the lower and upper bounds for increasing reduced orders. The error evolves inside the envelope spanned by the bounds, until the numerical accuracy 14 is reached.
The structure of this system is similar to the linear system model (2), yet the vector-field includes a hyperbolic tangent nonlinearity, in which the parametrized activation is described by a diagonal gain matrix K(θ), K ii = θ i . The system is generated by a sparse uniformly distributed system matrix A ∈ R 256×256 with enforced stability by sufficiently negative diagonal entries, a sparse uniformly distributed input matrix B ∈ R 256×1 , a dense uniformly distributed output matrix C 16×256 , and parameters θ ∈ R 256 constrained to the interval θ i ∈ [ 1 10 , 1]. For this system a combined state and parameter reduction is demonstrated. To this end an empirical non-symmetric joint Gramian is computed, using again an impulse input u(t) = δ(t), a zero initial state x 0,i = 0 and the default integrator. The reduced order model quality is evaluated for the same input and initial state by the joint state and parameter norm [34]:

Concluding Remark
Empirical Gramians are a universal tool for nonlinear system and control theoretic applications with a simple, data-driven construction. The empirical Gramian frameworkemgr implements empirical Gramian computation for state input-output coherence and parameter identifiability evaluation. Future extensions of emgr include capabilities for the reduction of nonlinear differential algebraic equations (DAE) as well as hyperbolic systems. Finally, it is noted, that further examples and applications can be found at the emgr project website: http://gramian.de.

Code Availability
The source code of the presented numerical examples can be obtained from: http://runmycode.org/companion/view/2077 and is authored by: Christian Himpe.

Acknowldgements
This work was made possible via financial support from the German Federal Ministry for Economic Affairs and Energy (BMWi), in the joint project "MathEnergy -Mathematical Key Technologies for Evolving Energy Grids", sub-project: Model Order Reduction (Grant number: 0324019B).