A Hybrid Approach for Model Order Reduction of Barotropic Quasi-Geostrophic Turbulence

We put forth a robust reduced-order modeling approach for near real-time prediction of mesoscale flows. In our hybrid-modeling framework, we combine physics-based projection methods with neural network closures to account for truncated modes. We introduce a weighting parameter between the Galerkin projection and extreme learning machine models and explore its effectiveness, accuracy and generalizability. To illustrate the success of the proposed modeling paradigm, we predict both the mean flow pattern and the time series response of a single-layer quasi-geostrophic ocean model, which is a simplified prototype for wind-driven general circulation models. We demonstrate that our approach yields significant improvements over both the standard Galerkin projection and fully non-intrusive neural network methods with a negligible computational overhead.


Introduction
The growing advancements in computational power, technological breakthrough, algorithmic innovation, and the availability of data resources have started shaping the way we model physical problems now and for years to come.Many of these physical phenomena, whether it be in natural sciences and engineering disciplines or social sciences, are described by a set of ordinary differential equations (ODEs) or partial differential equations (PDEs) which is referred as the mathematical model of a physical system.Thus far, many computational approaches are based on solving these sets of equations of a mathematical model after applying suitable discretization schemes and numerical tools which may be referred to as a physics-based solution approach.Despite the advances in software engineering and processor technologies, the computational burden of high-fidelity simulation is still a limiting factor for many practical problems in different areas of research, specifically for the large-scale physical problems with high spatiotemporal variabilities such as turbulence modeling of geophysical flows, weather forecasting, and climate modeling.Indeed, low-fidelity models such as large eddy simulation (LES) [1] and Reynolds-averaged Navier-Stokes (RANS) [2] are introduced with additional approximations to neglect some of the physical aspects for closure modeling, and, as a result, the computational cost is reduced.However, these techniques require parameter calibration to approximate the true solution to any degree of confidence and may thus increase costs related to model validation and benchmark data generation.
As a first attempt to perform efficient physics-based surrogate modeling, projection based implementations of model order reduction strategies have proven to be successful in reducing computational cost significantly with little compromise in physical accuracy, and have been extensively Table 1.A classification of reduced-order modeling approaches.

Approaches Comments
Fully non-intrusive models Data-based modeling; data could be generated by experimental measurements (observational data) or high-fidelity numerical simulations (synthetic data); no need to know the underlying physical system or model (no need to have access to a full order model generating data).
Semi non-intrusive models It is a mixed approach with offline intrusive and online non-intrusive models; in addition to snapshot data, we should have access to the high-fidelity model to generate our surrogate model (ROM); after built, ROM stays on a reduced subspace, and we do not need to have access to a high-fidelity model during ROM computations; we often use a projection approach (with truncation) to obtain a dense low-order system.

Intrusive models
We need to have access to some parts (or whole) of a high-fidelity model during online ROM computations; sparse sampling or interpolation approaches might be incorporated; multilevel, multigrid, adaptive mesh refinement and dynamic time stepping approaches to accelerate high-fidelity simulations can be considered in this category.
Coarse-grained models It is a special case of intrusive modeling; reduction approach might utilize a similar high-fidelity model with a reduced computational complexity (i.e., fewer grid points in LES or RANS approaches); they often need a closure model to compensate the effects of truncated scales; closure effects can be embedded into numerics as well.
In the present study, we utilize an artificial neural network (ANN) to develop our ROM framework.Neural networks, one of many machine learning (ML) strategies, have become an area of interest for researchers from diverse fields of study because of their self-adaptive and flexible nature, while approximating complex functions of any physical process with a high degree of accuracy.Even though ML tools such as ANN have been utilized in flow control community for the last couple of decades [29,30], the data-driven idea has recently been popular in computational fluid dynamics community for modeling and solving complex fluid flow problems [31][32][33][34][35][36].These data-driven models can be developed either in a black box fashion or in combination with existing models.We utilize a simple single hidden layer feedforward network (SLFN), known as extreme learning machine (ELM), in this paper for both fully data-driven model and hybrid model.The resolved reduced-order model coefficients and full order simulation data are used to train the ELM architecture.The descriptive idea on ELM and its implementations can be found in Refs.[37][38][39].
The purpose of this paper is to develop a robust and efficient ROM framework using a hybrid approach, and test the performance of the proposed model with respect to the component models, i.e., physics-based and data-driven models.Recently, several promising works have been conducted to develop more efficient and improved reduced-order model frameworks using the hybridization of traditional projection based ROMs and data-driven ROMs [40][41][42].The proposed hybrid framework is developed by combining two component models, the standard projection based ROM and an ANN framework.Our main motivation of this study comes from addressing the strengths and weaknesses of the component models to find a bridge between both models to benefit from the strength of both approaches.Combining an imperfect physics-based model with a data-driven technique to get a hybrid scheme is not an entirely new idea in scientific research [43,44].Even though the hybridization approach varies with respect to the physical problem, selection of ML architecture, approach to find the solution, and the associated physics-based models, the main idea remains similar for most of the works.We then choose a challenging test problem for model order reduction for validation and comparison purpose, and perform a number of statistical analyses on the hybrid model as well as on the component models.The results of our tests suggest that the proposed model is robust and can be extremely useful for complex flow problems.
The rest of this manuscript is outlined as follows: in Section 2, we review the underlying governing equation of the physical problem considered in this study to generate data bank in full order space for POD basis construction and training the ANN architecture as well as the numerical schemes used for full order simulation.We briefly discuss the standard Galerkin projection based ROM approach and fully non-intrusive data-driven ROM approach in Sections 3 and 4, respectively.In Section 5, we introduce our proposed hybrid ROM framework.In Section 6, we perform a variety of analyses to assess the performance and viability of the proposed model with respect to the performance of its physics-based component or data-driven component alone.We also include the full order simulation results as reference.Finally, we summarize our findings in Section 7.

Full Order Model (FOM)
In FOM, we solve the underlying governing equations using a suitable numerical model and generate the data bank of snapshots for ROM.In this section, we introduce the quasi-geostrophic ocean model as a test case and provide a brief description of the numerical schemes employed in this study.

Quasi-Geostrophic (QG) Ocean Model
Since it is well established that much of the world's ocean circulation is wind-driven in large-scale, and oceanic mesoscale processes are also extremely prevalent in the large-scale oceanic circulations, studies of wind-driven circulation using an idealized double gyre wind forcing have become increasingly important in understanding various aspects of ocean and climate dynamics, such as the role of mesoscale eddies and their effect on mean circulation [15,[45][46][47][48].However, the numerical simulation of oceanic and atmospheric flows, up to this point, still requires approximations and simplifications of the full model to resolve some of the enormous range of spatial and temporal scales of the full form of the general circulation models.Even though remarkable advances in developing simplified models have been made during the last few decades in producing increasingly accurate results [8,49], an additional computational challenge is in being able to implement them in long time integration such as those required by the climate modeling [14,50].Therefore, with this in mind, we consider the simple single-layer QG model, also known as barotropic vorticity equation (BVE) that can capture the random inter-decade and inter-annual transitions in large-scale ocean basins [51].This mathematical model shares many analogies with the two-dimensional Euler equation and Navier-Stokes equation [52] and has been extensively used over the years to describe various aspects of the largest scales of turbulent geophysical fluid dynamics [53][54][55].The dimensionless BVE for QG single-layer ocean model can be written as [34,56] where the left hand side terms account for local acceleration and nonlinear advection, respectively, and the right hand side terms represent the rotational, dissipative, and forcing effects, respectively, in the governing equation.Here, ω is the kinematic vorticity, expressed as where u is the velocity vector.For two-dimensional flow in the x-y plane, the kinematic vorticity becomes Here, the flow velocity components can be found from the stream function, ψ, using the following definition: In Cartesian co-ordinates, The kinematic equation connecting the vorticity and stream function can be found by substituting the velocity components in terms of stream function in Equation ( 3), which forms the following divergence-free constraint satisfying Poisson equation: where ∇ 2 is the standard two-dimensional Laplacian operator.The dimensionless formulation of BVE in Equation ( 1) contains Reynolds number (Re) and Rossby number (Ro), two dimensionless numbers which are related to the physical parameters and non-dimensional variables in the following way: where ν is the horizontal eddy viscosity coefficient for ocean basin and β is the Rossby parameter.
It should be noted that the BVE model presented here uses the beta-plane approximation, valid for most mid-latitude simplified ocean basins, i.e., the Coriolis parameter to account for Earth's rotational effect is approximated by f = f 0 + βy, where f 0 is the constant mean Coriolis parameter and β is the gradient of the Coriolis parameter at the basin center.The characteristic horizontal length scale, L (for the non-dimensional purpose) is the basin dimension in x-direction in this study and the reference Sverdrup velocity scale, V, is given by where τ 0 is the maximum amplitude of the sinusoidal double-gyre wind stress, and ρ and H are the reference fluid density and depth of the ocean basin, respectively.In our study, we consider the benchmark four-gyre circulation problem where the time average of the statistically steady equilibrium state exhibits a four-gyre circulation pattern for relatively high Re instead of the symmetric double-gyre standard structure [10,56,57].Based on reference literature [10,[58][59][60][61], we use slip boundary condition for the velocity, which translates into homogeneous Dirichlet boundary condition for the vorticity ω| Γ = 0 and corresponding ψ| Γ = 0, where Γ symbolizes all the Cartesian boundaries.We initiate our computations from a quiescent state (i.e., ω = 0 and ψ = 0) and integrate Equation (1) until a statistically steady state is obtained.

Numerical Schemes
Since for most of the physically relevant ocean circulation models, such as the QG models, the solutions remain in a quasi-stationary state as time goes to infinity [62], schemes designed for the numerical integration of such phenomena should be suited for quasi-stationary behavior of the solutions as well as for the long time integration.In our study, we use the second-order central finite difference schemes in linear terms.For the modeling of nonlinear term, we use second-order skew-symmetric, energy-and enstrophy-conserving Arakawa scheme [63] to avoid computational instabilities arising from nonlinear interactions for the Jacobian term, J(ω i,j , ψ i,j ) in Equation (1), defined as where Here, ∆x and ∆y are the step size in xand y-directions, respectively.The detailed derivation of higher-order Arakawa schemes can be found in Ref. [23].To implement the Runge-Kutta scheme, we use the method of lines to cast our system in the following form: Here, subscripts i and j represent the discrete spatial indices in x and y directions, respectively.£(.) denotes the discrete spatial derivative operators.For our time advancement scheme, we use the optimal third order accurate total variation diminishing Runge-Kutta (TVDRK3) scheme given as [64]: i,j + 2£(ω i,j )∆t 3 , where ∆t is the adaptive time step, which can be computed at the end of each time step satisfying the stability criteria through the Courant-Freidrichs-Lewy (CFL) number.Finally, taking advantage of the simple Cartesian domain, an efficient fast Fourier transform (FFT) method is utilized for the solution of the elliptic Poisson equation which is the most computationally expensive part of the QG models [12].Details of the aforementioned numerical methods used for this study can be found in Refs.[10,58].

Galerkin Projection Based Reduced-Order Model (ROM-GP)
Similar to the area of interest in the present study, many areas of physical science and engineering often require iterative computations, data assimilation, and finer scale simulation for large dynamical systems to resolve the underlying physical phenomena [65][66][67][68].The efficiency of these large-scale simulations decreases with increasing computational complexity of the full order models due to the accumulation of the less significant intermediate results.Hence, the reduced-order models (ROMs) have come into wide use in large-scale simulations to replace large dynamical systems with lower dimensional systems having a similar range of validity and input/output characteristics.Proper orthogonal decomposition (POD), also referred as Karhunen-Lo ève transform (KLT), principal component analysis (PCA) or singular value decomposition (SVD), is a widely applied snapshot-based ROM technique in science and engineering applications with large dimensional systems of nonlinear ODEs or PDEs [69][70][71].In this section, we develop a standard projection methodology for Equation ( 1) where a POD-GP ROM framework is developed from the field variables, ω and ψ on the flow domain, and Ω based on the method of snapshots [72].The snapshots are obtained from the stored sample solution of FOM.The main procedure of ROM in the current study consists of generating low-dimensional orthogonal basis functions from snapshots and next, performing Galerkin projection to obtain the ROM.For POD basis construction, we first obtain N number of snapshots and denote the vorticity field, ω(x, y) as ω(x, y, t n ) for n = 1, 2, ..., N at pseudo-time t = t n .Then, we decompose the solution field by following way into a time-invariant mean, ω(x, y) and a fluctuating part, ω (x, y, t) to represent the full field with only fluctuating part [73]: for x, y ∈ Ω where Ω stands for the two-dimensional domain and the mean of the snapshot is The following algorithm describes the POD-GP procedure: 1. POD basis construction: * Construct a data correlation matrix of the fluctuating part, A = [α ij ] from the snapshots where α ij = Ω ω (x, y, t i )ω (x, y, t j )dxdy.Here, i and j refer to the snapshot indices.* Compute the optimal POD basis functions by solving AΓ = ΓΛ, where is the diagonal eigenvalue matrix and Γ = [γ 1 , ...., γ N ] refers to right eigenvector matrix whose columns are eigenvectors of A. In general, most of the subroutines for solving above eigenvalue equation give Γ with all of the eigenvectors normalized to unity.* Using the eigenvalues λ 1 ≥ λ 2 ≥ ... ≥ λ N stored in a descending order in the diagonal matrix, Λ, define the orthogonal POD basis functions for the vorticity field as where λ k is the kth eigenvalue, γ k n is the nth component of the kth eigenvector, and φ k (x, y) is the kth POD mode.We must mention that the eigenvectors must be normalized in such a way that the basis functions satisfy the orthogonality condition.* Obtain the kth mode for the stream function, ϕ k (x, y), utilizing the linear dependence between stream function and vorticity given by Equation ( 6): ∇ 2 ϕ k = −φ k .* Span the fluctuating component of the field variables into the POD modes by doing the separation of variable as where a k (t) is the time-dependent modal coefficients and φ k (x, y) and ϕ k (x, y) refer to the POD modes.It should be noted that the same a k (t) accounts for both stream function and vorticity based on the kinematic relation given by Equation (6).
* Retain R modes where R << N, such that these R largest energy containing modes correspond to the largest eigenvalues (λ 1 , λ 2 , ..., λ R ).The resulting full expression for the field variables can be written as 2. Galerkin projection to obtain ROM: * Perform an orthogonal Galerkin projection by multiplying the governing equation with the POD basis functions and integrating over the domain Ω [74], which yields the following dynamical system for a k : where We can define the right hand side of Equation ( 22) as r GP k to express the equation in the following form: The system given by above algorithm consists of R coupled ODEs for modal coefficients, a k which can be solved numerically by any suitable time integration scheme.Additionally, the POD basis functions and ROM coefficients can be precomputed from the data, which makes the system more efficient.Defining the vorticity field, ω(x, y, t 0 ) at time t 0 , a complete specification of the dynamical system given by Equation ( 22) can be calculated by the following projection of the initial condition: It is worth mentioning that there are occasions in our POD-GP ROM framework where we compute the inner product of two functions f = f (x, y) and g = g(x, y) as In the present study, we compute the inner products by taking the integral of the product over the domain, Ω using the dual integration method for Simpson's 1/3rd rule.We refer to Ref. [75] for the details of the integration technique.

Artificial Neural Network Based Non-Intrusive Reduced-Order Model (ROM-ANN)
The study of artificial neural networks (ANN) was introduced in the 1990s as a multilayered perceptron based on the idea of mimicking the functions of neurons in human brain [76].ANNs consist of layers possessing a predefined number of unit cells called neurons and connections between the neurons called weights to establish a map between an input layer and an output layer.Other than the input and output layers, an ANN architecture also contains at least one hidden layer which is independent of the training data.The learning can be supervised or unsupervised.In unsupervised learning, the network extracts data features without knowing the target output and input label [77].On the other hand, we have used a supervised learning network named single-hidden layer feedforward neural networks (SLFNs) based on the precomputed inputs from ROM-GP model which discovers the mapping between inputs and outputs from provided labeled sample.Mathematically, the input layer distributes input signals x p : p = 1, 2, ..., P ∈ R P to the Q dimensional hidden layer feature space (i.e., Q is the number of hidden neurons).Let y j : j = 1, 2, ..., J ∈ R J represent the outputs from J number of output neurons.c p q ∈ R Q×P are the weights connecting the neurons in input and hidden layers, β q j ∈ R J×Q are the output weights between the hidden and output layers, and q ∈ R Q and ξ j ∈ R J are the random biases operating as thresholds for the hidden and output layers.Then, the feed-forward operation can be written as: where differentiable functions F and G are the transfer or activation functions [78,79] which can be log-sigmoid, hyperbolic tangent sigmoid, or linear transfer function.The training process of ANN is adjusting the weights to reproduce the desired outputs when the given inputs are fed forward.
In the present study, we use Extreme Learning Machine (ELM) architecture which randomly chooses weights for the hidden nodes and bias terms and analytically determines the output weights of SLFNs [80].Huang et al. [37] first introduced ELM with a goal to achieve a learning speed thousands time faster than the traditional feedforward network learning algorithm while producing a better generalization performance.Later, several extensions of general ELM method are developed in different works [38,81,82].In our study, the ANN architecture is trained by utilizing an ELM approach which requires no biases in the output layer (ξ = 0) as well as the weights (c p q ) and biases ( q ) are initialized randomly from a uniform distribution between −1 and +1 and no longer modified.Therefore, β q j weights are only unknown to be determined in this case.We note that we utilized the tan-sigmoid activation function for the hidden layer neurons given by and a linear activation function for the output layer neurons which is G(x) = x.Finally, from Equation ( 26), the output of ELM can be written as for N sample training data examples, where x n p ∈ R P×N and y n j ∈ R J×N refer to training input-output data pairs.To illustrate further, we can show our problem in following matrix form: where the superscript "T" refers to "Transpose".In our study, the value of J is always 1 since there is only one output neuron.H T ∈ R Q×N on the right hand side of the above equation is given by . (30) By taking the transpose of Equation ( 29), the solution for the weights can be computed by Here, H † ∈ R Q×N is the pseudo-inverse of H ∈ R N×Q and " †" refers to "pseudo-inverse".Since N ≥ Q for H matrix, we can apply following single value decomposition (SVD) to H to compute the pseudo-inverse, , (33) where U and V are column-orthogonal and orthogonal matrices, and Σ is a diagonal matrix whose elements (i.e., Σ q q where q = 1, 2, . . ., Q) are non-negative and called singular values.To minimize the numerical instability introduced while calculating the inverse of Σ by taking the reciprocal of each non-zero element, a well-known Tikhonov-type regularization [83] is used as We choose τ = 10 −12 for the present study which indicates the trade-off between the least-squares error and the penalty term for regularization [84].The unknown weights can be calculated by using Equation (31).
In the present study, our ELM layers and architecture are devised with five inputs (P = 5) and a single output (J = 1) mechanism, as shown in Figure 1.Here, "LT" refers to "linear term", "NT" refers to "nonlinear term", and "TP" refers to "true projection" in the figure.The inputs are resolved ROM-GP variables in each time integration step which can be denoted from Equation (22) as and the single true solution output for training set can be expressed as which is the Galerkin projection to the FOM given by Equation (1).We denote the ANN predicted solution output as r ANN k for the rest of this paper.Using the ANN prediction, r ANN k , we can compute the modal coefficients, a k , for fully non-intrusive ROM using the following equation similar to Equation ( 23): where r ANN k is computed using Equation (28) in our neural network model deployment.We use the trained weights from the model to get the outputs (denoted as Y, i.e., y 1 = r ANN k in our study) for corresponding inputs of each time integration steps.We note that we normalize the computed ROM variables before using them as inputs for ELM training set to stabilize the system from any kind of numerical errors or biases, and to ensure that all the input variables are at a similar range.The normalized input sets in our study can be expressed as where , where g = a, r LT , r NT , r GP .( Finally, we normalize our training set input and output in (max, min) ∈ [+1, −1] limit as The physical output, Y is then recovered from the ANN predicted output, Ỹ, by following unnormalization method for the normalized data: This normalization ensures that the input variables lie in the data range of the tan-sigmoid activation function, i.e., −1 to +1 [85,86].
Output layer (J dimensional)

Random weights
Trained weights

Hybrid Modeling (ROM-GP + ROM-ANN) Based Reduced-Order Model (ROM-H)
Hybrid modeling, as a combination of physics-based and data-driven modeling, is a fast emerging modeling approach in fluids community since both of these distinct modeling approaches are well-established in the community at this point.As illustrated in the left part of Figure 2, the traditional physics-based modeling approach can be described as understanding a part of the full physics (full physics is represented by the black colored region whereas the orange colored region displays the understood physics), developing a mathematical understanding through equations (green colored region in the figure), and finally resolving some part of the true physics (represented by the blue colored region) by using further assumptions and computational constraints.On the other hand, data-driven modeling works as a black box where we make an estimation of the true solution by using available observed data snapshots without having any knowledge of the governing physics.For model order reduction, the physics-based approach is widely applied as mentioned in previous sections, and nowadays, with the rapid growth of the availability of data, data-driven approach has become a popular route for reduced-order modeling.A comparative representation of these two approaches has been outlined in Table 2. Considering the pros and cons of both sides, our hybrid reduced-order modeling (ROM-H) comes up with the idea to address the negative aspects of both approaches, whereas the positive aspects can be leveraged simultaneously.Similar to any kind of hybrid approach, the primary goal of this paper is to establish a robust hybrid ROM framework which aims at making the model more efficient as compared to either the physics-based or data-driven ROM models.We build our ROM-H framework by introducing a free parameter to dynamically account for the contribution of the physics-based ROM and data-driven ROM.The ROM-H methodology can be summarized by following key steps: Step 1 (offline): Generate a set of basis functions φ k for k = 1, 2, 3, ...., R from the snapshot data obtained from FOM.
Step 2 (offline): Apply Galerkin projection to compute the coefficients required for ROM-GP.
Step 3 (offline): Train the ELM network using the resolved ROM-GP coefficients and true projections datasets.
Step 4 (online): Compute a k by solving the following ordinary differential equations in reduced-order space: where we define η as a free weighting parameter in a range of 0 ≤ η ≤ 1 to establish a relationship between the standard ROM-GP and non-intrusive ROM-ANN models.If η = 0, we recover Equation ( 23), whereas, for η = 1, Equation ( 40) can be obtained.
Step 5 (offline): Obtain the full order solution by transferring data from reduced-order space by using Equation ( 20) as a post-processing task if needed.
The hybridization between the physics-based and data-driven based ROM happens in Step 4, where, in each time step, r GP k is calculated using Equation ( 23), and r ANN k is obtained from Equation (40).The parameter η gives us a freedom to benefit from both modeling techniques mentioned above rather than fully depending on a single modeling technique, and, as we show in Section 6, the hybrid model with optimal contribution from both modeling approaches gives a better prediction of the true solution than the individual modeling approaches.The limiting values of η will result in the standard ROM-GP and ROM-ANN model solutions, respectively, and any value between the limits of 0 to 1 will give a solution with contribution from both ROM-GP and ROM-ANN models.Based on the working physical problem, we assume that either more physics-based model contribution or more data-driven model contribution can produce a better estimation of the truth.Keeping this in mind, instead of forcing an arbitrary value to specify η for hybridization, we apply the following dynamic optimization strategy based on the values of r GP k and r ANN k in each time integration steps: Here, we utilize the absolute value of hyperbolic tangent function, which always produces a value for η between 0 and 1.If the values of r GP k and r ANN k are closely related (||r GP || ∼ = ||r ANN ||), the value of η parameter will be 0, i.e., it indicates the use of ANN prediction for that particular case will be redundant.Similarly, if the values are far away, the ANN prediction contribution will be more dominant.However, we emphasize that the estimation technique we utilized here for η calculation is on an ad-hoc basis for the present study.On this hybrid framework, the process of finding the optimal η can be a potential topic for another research direction where optimization techniques such as trust-region method, conjugate gradient methods, or quasi-Newton methods can be used [87,88].

Numerical Results
For validation purpose, our reduced-order model resulting from the hybridization of the standard ROM-GP approach and fully non-intrusive ROM-ANN approach has been used to model a four-gyre barotropic ocean circulation problem.In this section, we present the numerical results and analyses demonstrating the computation and prediction performance of our proposed hybrid framework as well as the component models (ROM-GP and ROM-ANN) and FOM simulation for comparisons.We select this particular test problem (as many other studies do [10,56,61,89,90]) because of its complexity induced by the symmetric double-gyre wind forcing developing a four-gyre circulation pattern which makes this problem challenging enough to assess the viability of proposed models.Based on the following results, it is apparent that the ROM-H model provides a better estimation of the true solution as compared to the individual component model solutions.Indeed, it was shown before that adding a stabilization to the standard ROM-GP improves the model prediction and resolve more mean dynamics [23].However, our proposed hybrid model idea is data-driven without taking any additional phenomenological arguments (e.g., eddy viscosity) to stabilize the model, and can be applied to most of the flow phenomena.On the other hand, we would like to note that the ROM-GP model can be cured with nonlinear modal eddy viscosities, i.e., with an energy-flow correction [91] or other closure approaches [92,93].

Case Setup Specifications for FOM Simulations
In the present work, we consider the dimensionless form of the BVE describing the QG problem in the Cartesian domain (x, y) ∈ [0, 1] × [−1, +1] as a reference.The FOM simulation is conducted starting from t = 0 to t = 60 using a fixed time step of ∆t = 1 × 10 −4 on a Munk layer resolving 128 × 256 grid resolution.We collect 900 data snapshot from time t = 15 to t = 60 to avoid collecting data from initial transition period for POD basis generation and ELM training, i.e., the sampling time ∆s is 5 × 10 −2 .We use dimensionless parameters Re = 25, 100, 400 and constant Ro = 3.6 × 10 −3 for all the simulations throughout the paper.

Analysis of the Standard ROM-GP Method
We first construct the POD data correlation matrix using the snapshots data from FOM simulation to generate the POD basis function for all the ROM approach in this paper.Hence, it is important to observe the characteristics and distribution of the eigenvalues with respect to modal index as presented in Figure 3.To estimate the total kinetic energy accumulation of the system, we compute the percentage of each eigenvalue by where N s = 900 is the total number of snapshots.Since higher Re incorporates more stratification or, to be specific, as more energy is added to the system, more modes are required to capture majority percentage of the energies in the system.Figure 3 shows that, while only around 10 modes is enough to capture most of the system's kinetic energy for Re = 25, a large percentage of energies (close to 85% and 75%) are accumulated in the first 30 and 40 modes for Re = 100 and Re = 400 cases, respectively.However, the percentage of captured energy is lower in higher Re cases which indicates the challenges associated with model order reduction of this test case in higher Re.These findings are supported by Figure 4 where we present the first few POD basis functions, the most important ones regarding the captured information, for Re = 100.We can identify the increase in the equal number of smaller structures (maxima and minima) which correspond to higher POD indices.Using more POD bases will result in better approximation until the corresponding eigenvalue stagnation point reaches.Based on the POD analysis, we develop the rest of our studies using POD modes up to 40 to incorporate most of the physics within the system.
(a) (b)  Figures 5-7 show the mean stream function field evolution with respect to increasing modes, R using standard ROM-GP approach for Re = 25, 100, 400, respectively.It is apparent in Figure 5 that, for Re = 25 case, even R = 10 shows a good prediction of the FOM solution.We can also notice in the figure that the mean stream function plot shows a consistent result for R = 20 with respect to R = 10 result, but we get a worse result for R = 30.This can be explained as an over-fitting, as we have seen in Figure 3 that, for Re = 25, R = 20 modes contain more than 99% of the system's total kinetic energy.It has also been noted in [94] that adding more modes might yield worse solutions due to the noise in the system.The inaccurate result for R = 30 is understandable in a sense that the previous modes contain most of the energies (more than 99%) of the system which is considered a sufficient total kinetic energy for the optimal number of modes in most of the POD studies [83].Even though R = 40 modes shows a very good estimation of FOM result, it is recommended not to use more than the optimal number of POD modes to avoid instabilities in the solution field.This occurrence can also be seen in Figures 8 and 9 for Re = 25.On the other hand, for Re = 100 case, R = 10 shows a nonphysical two-gyre circulation pattern in Figure 6 similar to the plots of coarse truncated modes (R = 10 and R = 20) for Re = 400 case in Figure 7.We further illustrate the effect of R and Re for ROM-GP approach in the prediction of true solution in Figures 8 and 9.For the purpose of comparison, we include the FOM data projection to reduced-order space using corresponding POD basis functions for Re = 25, 100, 400.In Figure 8, we show the non-dimensional time evolution of the first modal coefficient, a 1 (t) in reduced-order space for different Re and R. The plot supports the conclusions drawn from the POD analysis and the mean stream function plots that for Re = 100, 10 POD modes provide an unphysical result.For even higher Re = 400, 20 POD modes show the numerical instabilities as we can see a sudden burst of random oscillations in much higher scales.To put extra emphasis on our findings, we present the results for tenth modal coefficient in Figure 9 where we can observe the same scenario as Figure 8 on a different scale.We can notice that the scales for tenth modal coefficient are much smaller than the first modal coefficient time series which is expected for higher modes, but the performance of the ROM-GP model is consistent.

Assessments of the Prediction Performance of ROM-GP, ROM-ANN, ROM-H
We observed the inability of the standard ROM-GP model to estimate the correct physics using coarse truncated modes (R = 10 or 20) for Re = 400 in Section 6.2.For this reason, we consider R = 10 modes only for the rest of the analyses to assess the predictive performances of the models.As we can see in Figure 10 for Re = 25, all models capture the two-gyre pattern, which is consistent with the true solution as well as previous observations.Similar to Section 6.2, the ROM-GP model fails to capture four-gyre pattern for Re = 100 and R = 10 combination, as shown in Figure 11, but the ROM-ANN and ROM-H models exhibit a successful prediction.Nevertheless, Figure 12 demonstrates the most promising findings that the ROM-H model successfully predicts the four-gyre pattern while the other two basic ROM approaches fail for Re = 400 and R = 10 combination.To support this finding, we again plot the time series evolution for both a 1 and a 10 , as presented in Figures 13 and 14, to find that the ROM-H model shows an impressive prediction of the FOM projected time series for even higher Re.The prediction of ROM-ANN model is better than the ROM-GP model using same values for the parameters, but not as accurate as the ROM-H model for Re = 400.As we can see, the ROM-H model is capable of capturing the physics utilizing only 10 modes, which leads to an excellent development of a more efficient ROM than the standard ROM-GP, which requires higher modes to capture the physics for Re = 400.A summary of computational overhead is presented in Table 3.Based on the results in Section 6.3, we can state that the prediction performance of our proposed ROM-H model is better than the ROM-ANN model for this test case.However, sensitivity analysis of the ANN architecture should be performed before reaching any conclusion to verify whether the network output alters with the change in the total number of neurons in hidden layer, Q [95].Because of the black box nature of ANN, we perform a sensitivity analysis with respect to Q = 20, 40, 80 neurons for both ROM-ANN and ROM-H models.It can be clearly seen in Figure 15 of mean stream function results that the ROM-ANN model is showing similar performance improvements for all different number of neurons for R = 10 and Re = 100 cases.Furthermore, in Figure 16, the ROM-H model is showing an excellent agreement with FOM solution for all different number of neurons.On the other hand, the time series of modal coefficient plots for ROM-ANN model in Figures 17 and  18 exhibit some noticeably unusual traits in a number of cases, for example, a 1 (Q = 40, Re = 25), a 1 (Q = 20, Re = 100), a 10 (Q = 40, Re = 25).In addition, the ROM-H model displays satisfactory consistency in its predictive behavior with respect to the variation of Q, as shown in Figures 19 and 20.This suggests that the ROM-H model is robust with respect to the change in total number of neurons included in the hidden layer.

Time Series Evolution and Out-of-Sample Forecasting
To further illustrate the robustness and enhanced potential of the proposed model, here we present a couple analyses on the ROM-GP, ROM-ANN, and ROM-H approaches.In Figure 21, we present the temporal evolution of a 1 obtained by ROM-GP, ROM-ANN and ROM-H model along with FOM projection for Re = 25.The figure displays the forecasting capability of ROM models for a long period of time up to t = 180 even though the data snapshots were taken between t = 15 and t = 60.For Re = 100 and 400 in Figures 22 and 23, respectively, the ROM-ANN and ROM-H models still show a stable time series results, whereas the ROM-GP prediction becomes unphysical.The long time forecasting can be a very useful property of the ROM-H model to reduce the computational cost, and can be applied for practical applications related to forecasting.An out-of-sample a posteriori analysis is also performed where the snapshot data and training dataset are prepared at a particular physical condition to predict the performance of the models at different physical condition.In this study, we conduct two out-of-sample analyses: in one case, we generate the snapshots and training data at Re = 100 (lower value than the target prediction state) to predict the performance at Re = 200 and, in the other case, we generate the snapshots and training data at Re = 400 (higher value than the target prediction state) to predict the performance at Re = 200.As displayed in Figure 24 for the time series of a 1 , the ROM-GP again fails to capture the underlying physics for R = 10.The ROM-ANN prediction seems not so random but still does not represent the true statistics.However, the ROM-H prediction series matches the scales of the time series of FOM data projection.We can get a clear idea of this quantitative analysis by observing the visual representation of the mean stream function field in Figure 25.It is apparent that only the ROM-H prediction matches the FOM result since the underlying four-gyre dynamics is captured.Finally, we analyze the prediction performance at Re = 200 by using the data collected at Re = 400.The time series plot in Figure 26 reveals that neither ROM-GP nor ROM-ANN can capture the true scales while once again, the ROM-H model shows an impressive prediction of the true physics at Re = 200 using Re = 400 data.Figure 27 shows the mean field representation which indicates the ROM-H model is capable of out-of-sample forecasting using data from higher Re.These analyses of longer time forecasting and out-of-sample validation puts more strength on the conclusion drawn above that the proposed ROM-H method is more robust in predicting the underlying full order physics than the fully non-intrusive ROM-ANN and standard ROM-GP methods.Forecasting of temporal mode evolution of a 1 (t) for Re = 100.Note that all ROMs are trained by the data snapshots obtained between t = 15 and t = 60.Although the ROM-GP approach provides reasonable (physical) results at the beginning (i.e., t < 18), the error then is amplified exponentially.

Figure 23.
Forecasting of temporal mode evolution of a 1 (t) for Re = 400.Note that all ROMs are trained by the data snapshots obtained between t = 15 and t = 60.Although the ROM-GP approach provides reasonable (physical) results at the beginning (i.e., t < 18), the error is then amplified exponentially.

Summary
In this work, we propose a reliable and robust reduced-order modeling technique using a hybrid framework.The hybridization is done by combining the standard Galerkin projection based ROM and ELM neural network architecture.We consider an ocean circulation problem governed by a single-layer quasi-geostrophic model as a test platform to analyze the prediction performance and robustness of the model compared to the physics-based and data-driven models.The training and snapshots data for POD basis construction are generated using a FOM simulation.The resolved ROM coefficients are utilized as inputs for the simple feedforward ELM network.We then introduce a weighting parameter to dynamically account for contributions from the physics-based and data-driven models.In other words, we propose a hybrid reduced-order model (ROM) comprising a first principle Galerkin model and a data driven ANN model for the dynamics based on POD mode amplitudes.Therefore, the key novelty of this hybrid ROM is a simple weighting between first-principle and data-driven model.Furthermore, we show an ad-hoc dynamical approach to estimate this weighting parameter which provides a freedom in the hybridization procedure to come up with a good possible combination of component model contributions.
To assess the performance of our proposed model, we perform a number of analyses based on the mean stream function field variable and time series evolution statistics.Our study reveals that the proposed hybrid ROM model can successfully capture the underlying physics using a small number of modes which means the hybrid model is capable of performing a robust model order reduction process than the standard projection based ROM and data-driven ROM approaches.To validate the fidelity of the results obtained by the hybrid framework, we conduct a sensitivity analysis which reveals that the hybrid model is consistent with respect to the total number of hidden layer neurons of the ELM network.To show the robustness of the model, we also perform a long time forecasting and out-of-sample validation analysis.The hybrid model successfully captures the flow dynamics for these cases, while the other two component models fail to predict the true physics for higher Re and lower number of modes.Overall, our primary goal of leveraging the advantages of two basic reduced-order modeling techniques is fulfilled by the proposed hybrid ROM model in a sense that the model is data-driven and can be generalized for complex fluid flow problems, the model is robust and capable of reducing the total computational cost significantly, and the model neither requires any extra stabilization modification to account for numerical instabilities nor is it fully dependent on black box predictions.States Government nor any agency thereof, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights.Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof.The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.

Figure 1 .
Figure 1.Schematic representation of the extreme learning machine (ELM) neural network architecture utilized for the data-driven reduced-order modeling framework in this study.

Figure 2 .
Figure 2. A schematic representation of the two distinct modeling approaches: (a) physics-based modeling approach; and (b) data-driven modeling approach.

Figure 3 .
Figure 3. Proper orthogonal decomposition (POD) analysis by using 900 equally distributed snapshots for Re = 25, 100, 400: (a) Eigenvalue spectrum of the correlation matrix, A; and (b) Eigenvalue percentage energy accumulation with respect to modal index.

Figure 5 .Figure 6 .
Figure 5. Mean stream function contour plots between t = 15 and t = 60 obtained by reference FOM and standard ROM-GP simulations for Re = 25: (a) ψ FOM at a resolution of 128 × 256; (b) ψ ROM-GP with R = 10 modes; (c) ψ ROM-GP with R = 20 modes; (d) ψ ROM-GP with R = 30 modes; and (e) ψ ROM-GP with R = 40 modes.Note that a stable and well-estimated solution can be found using R = 10 modes and adding more modes might yield worse solutions due to a possible over-fitting as occurred in R = 30 case.

Figure 7 .
Figure 7. Mean stream function contour plots between t = 15 and t = 60 obtained by reference FOM and standard ROM-GP simulations for Re = 400: (a) ψ FOM at a resolution of 128 × 256; (b) ψ ROM-GP with R = 10 modes; (c) ψ ROM-GP with R = 20 modes; (d) ψ ROM-GP with R = 30 modes; and (e) ψ ROM-GP with R = 40 modes.Note that a stable and well-estimated solution can be found within R = 40 modes for this case.

Figure 10 .Figure 11 .Figure 12 .
Figure 10.Mean stream function contour plots between t = 15 and t = 60 obtained by reference FOM and different ROM simulations for Re = 25 and R = 10 modes: (a) ψ FOM at a resolution of 128 × 256; (b) ψ ROM-GP ; (c) ψ ROM-ANN ; and (d) ψ ROM-H .This figure clearly illustrates that all models have captured the two-gyre circulation pattern successfully.

Figure 17 .
Figure 17.Sensitivity analysis with respect to the number of ELM neurons of fully non-intrusive ROM-ANN approach using time series of the first modal coefficient, a 1 (t) for Re = 25, 100, 400 and Q = 20, 40, 80.

Figure 18 .
Figure 18.Sensitivity analysis with respect to the number of ELM neurons of fully non-intrusive ROM-ANN approach using time series of the tenth modal coefficient, a 10 (t) for Re = 25, 100, 400 and Q = 20, 40, 80.

Figure 19 .
Figure 19.Sensitivity analysis with respect to the number of ELM neurons of ROM-H approach using time series of the first modal coefficient, a 1 (t) for Re = 25, 100, 400 and Q = 20, 40, 80.

Figure 20 .
Figure 20.Sensitivity analysis with respect to the number of ELM neurons of ROM-H approach using time series of the tenth modal coefficient, a 10 (t) for Re = 25, 100, 400 and Q = 20, 40, 80.

Figure 21 .
Figure 21.Forecasting of temporal mode evolution of a 1 (t) for Re = 25.Note that ROMs are trained by the data snapshots obtained between t = 15 and t = 60.

Figure 22 .
Figure22.Forecasting of temporal mode evolution of a 1 (t) for Re = 100.Note that all ROMs are trained by the data snapshots obtained between t = 15 and t = 60.Although the ROM-GP approach provides reasonable (physical) results at the beginning (i.e., t < 18), the error then is amplified exponentially.

Figure 24 .Figure 25 .
Figure 24.Time series evolution of the first temporal coefficient a 1 (t) for the out-of-sample forecast by FOM and different ROM approaches (ROM-GP, ROM-ANN, and ROM-H) using R = 10 POD modes.Predictive performance is shown for Re = 200 while the training has been performed using the data generated at Re = 100.

Figure 26 .Figure 27 .
Figure 26.Time series evolution of the first temporal coefficient a 1 (t) for the out-of-sample forecast by FOM and different ROM approaches (ROM-GP, ROM-ANN, and ROM-H) using R = 10 POD modes.Predictive performance is shown for Re = 200 while the training has been performed using the data generated at Re = 400.

Author Contributions:
Conceptualization, O.S. and A.R.; Data curation, S.M.R. and O.S.; Formal analysis, S.M.R. and O.S.; Methodology, O.S.; Supervision, O.S.; Writing-original draft, S.M.R.; and Writing-review and editing, S.M.R., O.S. and A.R. Funding: This material is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research under Award Number DE-SC0019290.Disclaimer: This report was prepared as an account of work sponsored by an agency of the United States Government.Neither the United

Table 3 .
The computational CPU time in seconds required for ROM simulations between t = 15 and t = 60.Note that the FOM simulation is 4.82 h and offline POD basis generation takes about 23.38 min.The offline training time for ANN is less than one second due to the extremely fast ELM approach.