Modiﬁed Polynomial Chaos Expansion for E ﬃ cient Uncertainty Quantiﬁcation in Biological Systems

: Uncertainty quantiﬁcation (UQ) is an important part of mathematical modeling and simulations, which quantiﬁes the impact of parametric uncertainty on model predictions. This paper presents an e ﬃ cient approach for polynomial chaos expansion (PCE) based UQ method in biological systems. For PCE, the key step is the stochastic Galerkin (SG) projection, which yields a family of deterministic models of PCE coe ﬃ cients to describe the original stochastic system. When dealing with systems that involve nonpolynomial terms and many uncertainties, the SG-based PCE is computationally prohibitive because it often involves high-dimensional integrals. To address this, a generalized dimension reduction method (gDRM) is coupled with quadrature rules to convert a high-dimensional integral in the SG into a few lower dimensional ones that can be rapidly solved. The performance of the algorithm is validated with two examples describing the dynamic behavior of cells. Compared to other UQ techniques (e.g., nonintrusive PCE), the results show the potential of the algorithm to tackle UQ in more complicated biological systems.


Introduction
Models are often used to describe biological systems, for which parameters are often considered as fixed constants [1,2]. However, model parameters cannot be known with certainty due to model calibration with noisy data or the time-varying nature of a dynamic system. The main objective of uncertainty quantification (UQ) is to study the impact of uncertainty (e.g., the variation in model parameters) on model predictions [1,[3][4][5][6], which is vital to assess the trustworthiness of model predictions for process design and performance evaluation [2,4,7,8].
There are several ways for UQ and the most popular tool is the sampling-based Monte Carlo (MC) simulations [9]. Typically, MC predicts uncertainty in model predictions by generating samples from the distribution of parametric uncertainty and by running the model with each sample. Although MC is methodically simple, the UQ accuracy is affected by the total number of samples, especially when the model has complex forms such as nonpolynomial terms [9]. In this case, many samples are required to ensure UQ accuracy, which may greatly increase the computational cost.
Compared to MC, polynomial chaos expansion (PCE) [10,11] has gained a growing interest in engineering and scientific problems [2,12,13]. For example, it was used to predict dynamic behaviors of the electrochemical flow in a microfluidic system [13] and to quantify uncertainty in biochemical systems [12] and epidemic models [2]. The key of PCE is to determine the PCE coefficients for both parametric uncertainty and model predictions. The PCE coefficients of parametric uncertainty can Appl. Mech. 2020, 1 155

Polynomial Chaos Expansion (PCE) for UQ
Suppose a dynamic system is described with nonlinear ordinary differential equations (ODE) over a given time period t 0 , t p as in: where x is the model prediction with the initial conditions x 0 at t = t 0 and u is a vector of fixed model parameters. Further, p is a vector to define parametric uncertainties, for which p = (p i=1 , . . . , p i=N ) ∈ R N and N indicates the total number of parametric uncertainties. In this work, we assume each parameter p i in p to be independent and identically distributed, which will be described with a prior given PDF using a random variable θ i as in: where θ i represents the ith random variable to describe the PDF of p i from the Wiener-Askey scheme [33]. Following this, each parametric uncertainty in Equation (2) is rewritten as in Reference [10]: where p i,k denote the PCE coefficients of the ith parametric uncertainty p i [10,34] and P is the polynomial order that determines the number of terms required to approximate the uncertainty. In addition, ψ k (θ i ) are the orthogonal polynomial basis functions, which should be selected appropriately depending on the PDF of p i . For instance, Hermite polynomial basis can be used to describe a normally distributed uncertainty [11]. For other distributions such as uniform distribution, their corresponding polynomial basis functions can be found in the literature [10].
Since the model response is affected by uncertainty, the resulting output can be approximated with a spectral expansion of random variables, that is, θ = (θ 1 , · · · , θ i , · · · , θ N ) as in [10,34]: where x m are PCE coefficients of predictions and Ψ m (θ) are the orthogonal multivariate polynomial functions computed from polynomial basis function ψ k (θ i ) with respect to each uncertainty p i [25]. In Equations (3) and (4), a finite number of terms, P + 1 and M + 1, including zeroth terms, are used. If the total number of uncertainties is N and the polynomial order of each uncertainty is P, the number of terms of model prediction (M + 1) is calculated as in [17]: The PCE coefficients of uncertain parameters can be calibrated with the prior information of the PDF, whereas the PCE coefficients of model responses in Equation (4), x m , have to be calculated with a stochastic Galerkin (SG) projection by substituting Equations (3) and (4) into Equation (1) and by projecting Equation (1) onto each of the polynomial basis functions in PCE theory, {Ψ m }, as defined in [10]: .
x(t, θ), Ψ m (θ) = g(t, u, p(θ), x(t, θ)), Ψ m (θ) , ∀m ∈ {0, · · · , M}, where · is the inner product operator. The SG projection in Equation (6) calculates the PCE coefficients of model prediction x and converts Equation (1) into a few coupled deterministic equations that are functions of the PCE coefficients of x. Specifically, the inner product between two functions in Equation (6) can be mathematically defined as in [10]: ϕ(θ), ϕ (θ) = R N ϕ(θ), ϕ (θ)W(θ)dθ, (7) where the multivariate integration in Equation (7) is solved over the random domain R N determined by θ; and W(θ) is a predefined weight function described by the joint PDFs of θ, which can be chosen from the Wiener-Askey framework [10]. When the PCE coefficients of x are available, uncertainty in x, originating from uncertainty in p, can be approximated analytically. For instance, the mean value of output x can be estimated with the first PCE coefficient, x m=0 in Equation (4) and other higher-order PCE coefficients (x m 0 ) can be used to estimate other statistical moments such as the variance around the mean value of x [10].
The calculation of PCE coefficients x m in Equation (4), can be further defined as in [10]: where γ m = E Ψ 2 m and E[·] here represents expectation. Although the SG projection-based UQ has been shown to be computationally effective over a wide range of applications [14,15,18,19], the calculation of PCE coefficient of model output x with Equation (8) is not trivial, especially when the model involves nonpolynomial functions and a large number of uncertainties [17,28]. For example, the calculation of Equation (8) can be computationally demanding as discussed in Section 3.
To address this issue, the UQ algorithm to combine the generalized dimension reduction method (gDRM or S-variate DRM) with PCE will be used to calculate high-dimensional integrals in the inner product of SG projection. This algorithm approximates a high-dimensional integral with several lower-dimensional ones that involve at most S-variables. For instance, the bivariate DRM (BiDRM and S = 2) estimates an N-dimensional integral with several one-and two-dimensional integrals [22]. For clarity, the approximation procedures are described in Section 2.2.

Dimension Reduction Based Stochastic Galerkin Projection
To solve PCE coefficients x m , the key is to calculate the multivariate integral in Equation (8), resulting from the inner product of SG. For simplicity, let assume y(θ) can be defined as y(θ) = x(t, θ)Ψ m (θ) at a specific time point t in Equation (8) and suppose y(θ) is a continuous and differentiable function. In this way, the integral in Equation (8) can be redefined as in [22]: where E is an expectation operator. Following the definition of gDRM [22], each coefficient in Equation (8) can be computed as in [21]: where each term in Equation (10), that is, y S−r can be further defined as in [22]: y S−r = ω 1 <ω 2 <···<ω S−r y 0, · · · , 0, θ ω 1 , 0, · · · , 0, θ ω 2 , 0, · · · , 0, θ ω S−r , 0 , where ω 1 , ω 2 ,· · · , ω S−r = 1, 2, · · · , N. Note that for the last term in the summation of Equation (11), S = r and y S−r can be defined as y 0 = y(0) by setting the mean values of all random variables θ to 0. Based on Equation (10), the PCE coefficients of x can be rapidly calculated, even if the model of a system involves nonpolynomial terms. For brevity, details are not given here, which can be found in our previous works [20,21]. An example of the trivariate DRM (TriDRM), that is, S = 3 in gDRM, is given for clarity. For TriDRM, each term E[y S−r ] in Equation (10) is determined as follows.
In summary, as in Equations (13) to (15), N one-dimensional integrals are required to estimate  (8) can be approximated with a few lower-dimensional ones. As discussed in Section 3 below, this approach is computationally efficient when the number of uncertainties is small. However, we found that the computational cost increases greatly, when the number of uncertainties increases. For example, when the number of uncertainties is 20 (i.e., 20 random variables are required to define uncertainties) and TriDRM (i.e., S = 3 in Equation (10)) is used to calculate PCE coefficients of model outputs, in total 1140 three-, 190 two-and 20 one-dimensional integrals should be computed. For such a case, it can be computationally prohibitive to calculate PCE coefficients. To attenuate the computational cost, we will integrate sampling-based quadrature rules with the gDRM-based PCE to numerically estimate these low-dimensional integrals. The modified gDRM-based PCE method is discussed below.

Modified gDRM-Based PCE Sing the Dimension Reduction and Quadrature Rules
The key of the modified gDRM-based PCE method is to approximate the integrals involved in Equation (10) (i.e., E[y S−r ]) with quadrature rules. For algorithm clarification, Gaussian quadrature rules [23,35] will be used in this work; but other multidimensional integral approximations can be also applied. The calculation of PCE coefficients with gDRM in Equation (10) can be redefined as in: where E[ y S−r ] represents that each lower-dimensional integral in Equation (10) is estimated with the Gaussian quadrature rules. For each random variable, for example, θ ω 1 in Equation (13), a one-dimensional quadrature rule can be constructed as in: where θ represent a set of integral evaluation points (i.e., θ q 1 ω 1 ) and their related weights (i.e., α q 1 ω 1 ) that will be used to estimate the one-dimensional integral (i.e., ω 1 = 1, 2, · · · , N) in Equation (13). Note that evaluation points and their weights have to be carefully selected by considering the PDFs of random variables (e.g., θ ω 1 ) to ensure the exponential convergence. For example, Gauss-Hermite quadrature is the best choice for normal distribution [23,35]. It is also worth mentioning that Gauss quadrature rule provides an exact result when the polynomial degree of the integrand is less than 2Q 1 − 1, where Q 1 is the number of evaluation points [16,23].
For multidimensional integrals (i.e., S − r ≥ 2 in Equations (14) and (15)), we will use full tensor evaluation points, constructed by tensor product with one-dimensional quadrature rules of each individual random variable. The full tensor evaluation is used in this work, since these integrals resulting from the gDRM step only involve a few random variables. Thus, the approximation of a multidimensional integral ( N) can be evaluated in real-time. However, it is important to point out that when high-dimensional integrals are yielded from the gDRM step, such as when S − r ≥ 6, sparse grids with a smaller number of evaluation points can be used to reduce the computational cost for approximating integrals in Equation (10) [10].
As an example, Equations (18) to (20) show the calculations of these lower-dimensional integrals in Equation (16), when TriDRM-PCE is used to calculate the PCE coefficients of outputs. In this case, the approximation of these integrals E[ y S−r ] in Equation (16), that is, N one-dimensional, N 2 two-dimensional and N 3 three-dimensional integrals in Equations (13) to (15), can be calculated with the quadrature rules as in: y 0, · · · , 0, θ q 1 ω 1 , 0, · · · , 0 ·α y 0, · · · , 0, θ y 0, · · · , 0, θ where Q i is the total number of quadrature evaluation points θ q i ω i , α q i ω i is the corresponding weights of each evaluation point and i is the index of random variables in the gDRM-based PCE. To estimate a multidimensional integral with quadrature rules, the same number of evaluation points and weights are used for each random variable. Thus, based on the tensor product rules, the number of evaluation points required to estimate a multidimensional integral is defined as [23,35]. We found that the estimation of lower-dimensional integrals (e.g., one-, two-and three-dimensional integrals) with Gaussian-quadrature points can significantly reduce the computational time to compute PCE coefficients, which will be discussed with three examples in Section 3.

Modified gDRM-Based PCE Using the Dimension Reduction and Quadrature Rules
A schematic of the modified gDRM-based PCE-based UQ is shown in Figure 1, which integrates the gDRM and quadrature rules to quickly solve the high-dimensional integral involved in the SG step. To calculate each PCE coefficient of model output in Equation (4), the inner product of the SG projection will yield a high-dimensional integral as in Equation (8). To alleviate the computational burden, gDRM is used to convert a high-dimensional integral in the SG projection into several low-dimensional ones. Since we found a large number of lower-dimensional integrals have to be calculated when the number of uncertainties is large, sampling-based quadrature rule (i.e., Gaussian Quadrature) is used to calculate the lower-dimensional integrals. This will calculate the PCE coefficients of model outputs in real-time, which can be used to quickly assess the impact of uncertainty on model predictions.
weights are used for each random variable. Thus, based on the tensor product rules, the number of evaluation points required to estimate a multidimensional integral is defined as = ∏ = [23,35]. We found that the estimation of lower-dimensional integrals (e.g., one-, two-and threedimensional integrals) with Gaussian-quadrature points can significantly reduce the computational time to compute PCE coefficients, which will be discussed with three examples in Section 3.

Modified gDRM-based PCE using the Dimension Reduction and Quadrature Rules.
A schematic of the modified gDRM-based PCE-based UQ is shown in Figure 1, which integrates the gDRM and quadrature rules to quickly solve the high-dimensional integral involved in the SG step. To calculate each PCE coefficient of model output in Equation (4), the inner product of the SG projection will yield a high-dimensional integral as in Equation (8). To alleviate the computational burden, gDRM is used to convert a high-dimensional integral in the SG projection into several lowdimensional ones. Since we found a large number of lower-dimensional integrals have to be calculated when the number of uncertainties is large, sampling-based quadrature rule (i.e., Gaussian Quadrature) is used to calculate the lower-dimensional integrals. This will calculate the PCE coefficients of model outputs in real-time, which can be used to quickly assess the impact of uncertainty on model predictions.

Sampling-Based Nonintrusive Discrete Projection (NIDP)
The SC-based nonintrusive approach is another way to deal with nonpolynomial functions and a large number of uncertainties. As a discrete projection [10], the unknown PCE coefficients of x, x m , are approximated with repetitive simulations. The model response x in Equation (4) can be described as in [10]: where ϑ is a vector of random variables, ϑ= (ϑ 1 , · · · , ϑ i , · · · , ϑ N ), as described with θ in Equation (4) and the PCE coefficients z m at each time point t in Equation (21) are calculated as in [10]: where Q is the number of collocation points ϑ j = ϑ j 1 , · · · , ϑ j N to estimate z m , λ j are the corresponding weights of collocation points and j is an index to define the jth point and its weight generated in a random space characterized by N random variables.
To calculate the PCE coefficients in Equation (22), collocation points must be appropriately chosen so that the difference between Equation (22) and the spectral projection in Equation (8) is minimized [10]. The most popular way to select collocation points in Equation (22) is to use the sampling points generated by the Gaussian quadrature rules [23,35], which are related to the PDFs of random variables. For example, Gauss-Hermite quadrature rules are one way to select collocation points for normally distributed uncertainty [23,35].
Suppose each term in the summation in Equation (22) can be defined as f ϑ j = x t, ϑ j Ψ m ϑ j for simplicity. For each random variable, that is, ϑ i , a one-dimensional integration rule is constructed as in [28]: Appl. Mech. 2020, 1

160
where the set of collocation points used in Equation (23) is represented as [28]: Once a one-dimensional quadrature rule for each random variable is determined as in Equations (23) and (24), the next step is to formulate collocation points from a random space defined with N independent random variables. The simplest way to generate collocation points is to use a full tensor product grid, which is given as in [28]: As seen, the full tensor grid to calculate a high-dimensional integral only requires tensor products of one-dimensional quadrature rules. Thus, the collocation points are mathematically given as: [10]. However, the full tensor product can be computationally intensive when many uncertainties are considered. In this case, it may require many collocation points to obtain accurate UQ results. For example, when N uncertainties and n i collocation points are used for each uncertainty, the total collocation points Q of the full tensor grid can be defined as Q = N i=1 n i [23,35]. As seen, the total number of collocation points for UQ grows greatly, when the number of uncertainties increases.
Alternatively, a sparse grid using the Smolyak algorithm [26,27] can significantly decrease the number of collocation points. The Smolyak algorithm is based on linear combinations of tensor product construction, which only needs a subset of the full tensor product grids in Equation (25). Details of the sparse grid constructed by the Smolyak algorithm can be found in References [26][27][28]. For brevity, the construction of the sparse grid with the Smolyak algorithm is given as in [27]: where |i| = i 1 + · · · + i N and w is a level of approximation, which is the parameter to decide the number of tensor product elements required to construct the Smolyak grid. The collocation points with sparse grids to evaluate Equation (26) are given as in [28]: As seen in Equation (27), the total number of collocation points O N can be greatly decreased, when the nested condition is satisfied for collocation points in each dimension [28]. More details about the nested property of sparse grids can be found in the literature [26,27]. Since our objective is to develop an efficient UQ method to deal with a large number of uncertainties (N ≥ 6), the Smolyak algorithm based sparse grid method will be used to show the efficiency of our algorithm in terms of UQ accuracy and computational efficiency.

Root-Mean-Square Error (RMSE) to Evaluate UQ Accuracy
For algorithm comparison, the root-mean-square error (RMSE) is used to compare UQ errors of the proposed modified gDRM-based PCE method and other algorithms, which is defined as in [36]: where l denotes either mean or standard deviation of model predictions obtained with different UQ methods and l re f is a reference of the corresponding statistical moments calculated with MC.
In Equation (28), h is an index defining UQ results at a specific time point h of simulations and the total number of time instants used to calculate the RMSE is defined as H.

Numerical Examples
Three case studies were used to compare the performance of different UQ methods in terms of computational efficiency and UQ accuracy. A nonlinear algebraic problem was used as a benchmark example to compare the accuracy of the proposed modified gDRM-based PCE method to other existing algorithms. Two additional examples, a biochemical process in the conjoint tumor-normal cell populations [31] and the G2 to Mitosis transition phase for the fission yeast [32], were further used to show the accuracy and computational efficiency, when dealing with complex and nonlinear systems that involve many uncertainties. For each example, RMSE in Equation (28) was calculated and the computational time was measured in terms of central processing unit (CPU) time with an office desktop (Core i5-8400 CPU at 2.80 GHz).

Example 1: Nonlinear Algebraic Problems
The UQ accuracy of the proposed algorithm in this work is first compared to other approaches using a nonlinear algebraic example, for which two model responses are defined as in: where each uncertainty is defined as X i (i = 1, · · · , N), which is normally distributed with a mean value µ and a standard deviation σ. For comparison, N was set to 4, 10 and 20 to show the efficiency of the UQ method in this work. In this example, we used Hermite polynomials as the polynomial basis functions for the PCE-based UQ since uncertainty is normally distributed. Also, the mean µ of {X i } was set to 1 and several values of σ were chosen to show the effect of the magnitude of uncertainty on UQ accuracy. Using model responses defined in Equations (29) and (30), the first case study was conducted by assuming the number of uncertainties in total is four (N = 4), which results in a four-dimensional random space. In addition, since the polynomial order of each uncertainty P is set to 2, the total number of terms for each response is calculated as M + 1 = 15, using Equation (5). The simulation results with different UQ methods are shown in Figure 2.
For the gDRM-based PCE, the four-dimensional integrals in the SG step to calculate the PCE coefficients of Z 1 and Z 2 were estimated with several low-dimensional integrals. For example, the BiDRM-PCE converted a four-dimensional integral into 6 two-dimensional and 4 one-dimensional ones. Whereas, for the TriDRM-PCE, 3 three-, 6 two-and 4 one-dimensional integrals were used. Further, for the modified gDRM-PCE method in this work, the Gauss-Hermite quadrature rules were used to generate 5 evaluation points in each random variable for the low-dimensional integrals from the gDRM. For clarity, as an example, the expressions of the PCE coefficients for the mBiDRM-PCE are provided in Appendix A. In addition, since the trapezoidal rule is one of the most popular tools to estimate an integral, it was adopted to solve the high-dimensional integrals resulting from the SG step for comparison purposes. The total number of trapezoids for each random variable in the integral was set to 10.
For nonintrusive UQ, full tensor grid (NIDP-FT) and sparse grid (NIDP-SP) were used. For NIDP-FT, 5 collocation points for each uncertainty were used to construct grids to evaluate the statistical moments of model responses, resulting in 625 collocation points in total. For the NIDP-SP, the Smolyak algorithm was used to generate sparse grids and the approximation level w was chosen as 5 here. For MC, 10 6 samples were used to ensure UQ accuracy because the results from MC were chosen as a reference to compare different methods. For each subplot, bivariate DRM (BiDRM-) and trivariate DRM (TriDRM)-PCE refer to the bivariate and trivariate dimension reduction based PCE, respectively. The modified algorithms, coupling the dimension reduction with quadrature rules, are termed as modified BiDRM (mBiDRM-) and modified TriDRM (mTriDRM)-PCE, respectively. The nonintrusive discrete projection (NIDP) represents the nonintrusive UQ methods in PCE theory and FT and SP refer to full-tensor grids and sparse grids of collocation points, respectively. The Trapezoid-PCE represents that the trapezoidal rules were used to solve the high-dimensional integrals in the stochastic Galerkin (SG) projection.
For the gDRM-based PCE, the four-dimensional integrals in the SG step to calculate the PCE coefficients of and were estimated with several low-dimensional integrals. For example, the BiDRM-PCE converted a four-dimensional integral into 6 two-dimensional and 4 one-dimensional ones. Whereas, for the TriDRM-PCE, 3 three-, 6 two-and 4 one-dimensional integrals were used. Further, for the modified gDRM-PCE method in this work, the Gauss-Hermite quadrature rules were used to generate 5 evaluation points in each random variable for the low-dimensional integrals from the gDRM. For clarity, as an example, the expressions of the PCE coefficients for the mBiDRM-PCE are provided in Appendix A. In addition, since the trapezoidal rule is one of the most popular tools to estimate an integral, it was adopted to solve the high-dimensional integrals resulting from the SG step for comparison purposes. The total number of trapezoids for each random variable in the integral was set to 10.
For nonintrusive UQ, full tensor grid (NIDP-FT) and sparse grid (NIDP-SP) were used. For NIDP-FT, 5 collocation points for each uncertainty were used to construct grids to evaluate the statistical moments of model responses, resulting in 625 collocation points in total. For the NIDP-SP, the Smolyak algorithm was used to generate sparse grids and the approximation level was chosen as 5 here. For MC, 10 6 samples were used to ensure UQ accuracy because the results from MC were chosen as a reference to compare different methods. represents the nonintrusive UQ methods in PCE theory and FT and SP refer to full-tensor grids and sparse grids of collocation points, respectively. The Trapezoid-PCE represents that the trapezoidal rules were used to solve the high-dimensional integrals in the stochastic Galerkin (SG) projection.
As in Figure 2a, when the BiDRM-PCE is used and when the uncertainty (σ) is increased, the RMSE of the mean value of Z 1 is found to be larger, as compared to other methods. However, RMSE is reduced when the TriDRM-PCE is used, since it involves several more integrals (e.g., three-dimensional integrals) to approximate high-dimensional integrals in the SG, which increases the UQ accuracy. As seen, the TriDRM-PCE provides accurate results. In addition, as in Figure 2, the modified UQ methods (i.e., mBiDRM-PCE and mTriDRM-PCE) that couple the gDRM and quadrature rules with the PCE provide identical results as compared to other techniques, including the traditional BiDRM-and TriDRM-PCE methods. This clearly shows the accuracy of the modified UQ methods. Note that the classical PCE approaches in Reference [16] were also considered to further demonstrate the accuracy of the algorithm and the comparison results are given in Appendix B.
To provide a comprehensive comparison between the modified UQ methods (mBiDRM-and mTriDRM-PCE) and other UQ methods, additional simulations were conducted to evaluate the UQ performance in terms of UQ accuracy and computational time. In this case study, different numbers of uncertainties, that is, N = 4, 10 and 20, were considered and the simulation results are shown in Figures 3 and 4. For the standard deviation of uncertainty σ, its value was set to 0.5, which was the largest value as seen in Figure 2. For brevity, the traditional TriDRM-PCE, NIDP-FT and the trapezoidal rule-based PCE were not considered in this case study, since it was found that the computational cost of these UQ approaches was intensive. For example, it was found that the BiDRM-PCE took about 2.95 s (see Figure 4a) on average to calculate the PCE coefficients when N was 4; however, the TriDRM-PCE required approximately 15.34 s. In addition, as the literature reported [10,37], the total number of collocation points for the NIDP-FT and the trapezoids of the trapezoidal rules grow greatly when the number of uncertainties increases. For the NIDP-FT based method, when N is 10, 5 10 collocation points in total will be required [10], which is computationally prohibitive. For the trapezoidal rule-based PCE, when 10 trapezoids are used for each variable in the high-dimensional integral in the SG step, (10 + 1) N calculations should be executed [37]. This will significantly increase the computational cost of UQ.
UQ methods. Note that the classical PCE approaches in Reference [16] were also considered to further demonstrate the accuracy of the algorithm and the comparison results are given in Appendix B.
To provide a comprehensive comparison between the modified UQ methods (mBiDRM-and mTriDRM-PCE) and other UQ methods, additional simulations were conducted to evaluate the UQ performance in terms of UQ accuracy and computational time. In this case study, different numbers of uncertainties, that is, = 4, 10 and 20, were considered and the simulation results are shown in Figures 3 and 4. For the standard deviation of uncertainty , its value was set to 0.5, which was the largest value as seen in Figure 2. For brevity, the traditional TriDRM-PCE, NIDP-FT and the trapezoidal rule-based PCE were not considered in this case study, since it was found that the computational cost of these UQ approaches was intensive. For example, it was found that the BiDRM-PCE took about 2.95 s (see Figure 4a) on average to calculate the PCE coefficients when was 4; however, the TriDRM-PCE required approximately 15.34 s. In addition, as the literature reported [10,37], the total number of collocation points for the NIDP-FT and the trapezoids of the trapezoidal rules grow greatly when the number of uncertainties increases. For the NIDP-FT based method, when is 10, 5 10 collocation points in total will be required [10], which is computationally prohibitive. For the trapezoidal rule-based PCE, when 10 trapezoids are used for each variable in the highdimensional integral in the SG step, (10 + 1) calculations should be executed [37]. This will significantly increase the computational cost of UQ.   For the gDRM-based UQ method, the BiDRM-PCE converts an -dimensional integral in the SG step into two-and one-dimensional integrals, while the TriDRM-PCE estimates thedimensional integral with three-, two-and one-dimensional integrals. The mBiDRMand mTriDRM-PCE methods in this work estimate each lower-dimensional integral with quadrature rules. For the nonintrusive (NIDP-SP) method, the approximation level was set to 5 and 4994,  For the gDRM-based UQ method, the BiDRM-PCE converts an N-dimensional integral in the SG step into N 2 two-and N 1 one-dimensional integrals, while the TriDRM-PCE estimates the N-dimensional integral with N 3 three-, N 2 two-and N 1 one-dimensional integrals.
The mBiDRM-and mTriDRM-PCE methods in this work estimate each lower-dimensional integral with quadrature rules. For the nonintrusive (NIDP-SP) method, the approximation level w was set to 5 and 4994, 126925 and 2112649 collocation points were constructed with the Smolyak algorithm, when N was set to 4, 10 and 20, respectively. As seen in Figure 3, the mBiDRM-PCE provides identical results as compared to the traditional BiDRM method in the presence of a large number of uncertainties. In addition, it is found that the mTriDRM-PCE has smaller RMSE compared to the mBiDRM-PCE, since more low-dimensional integrals are used in the dimensional reduction procedure of the SG. It is also found that the RMSE of the mTriDRM-PCE is very similar to the most popular nonintrusive method, NIDP-SP, which further confirms the accuracy of the proposed UQ methods.
To show the efficiency, we also compared the computational time among different UQ methods. As seen in Figure 4a, it is found that the difference in computational time for each method is smaller, when the number of uncertainties is small (N = 4). For example, the BiDRM method requires about 2.95 s, while the NIDP-SP needs approximately 2.39 s. But when the number of uncertainties is large as in Figure 4c, there is a significant difference in computational time. When N is set to 20, the traditional BiDRM-PCE in our previous work [20] takes about 162 s, which is clearly smaller than the nonintrusive NIDP-SP method (i.e., 162 vs. 411 s in Figure 4c). This shows the advantage to couple the gDRM with the PCE-based UQ method.
In addition, as compared to the BiDRM-PCE-based UQ method, mBiDRM-and mTriDRM-PCE can further reduce the computational time as shown in Figure 4c, when the number of uncertainties is large. For example, the computational time for the mTriDRM-PCE is about 83 s, when N is set to 20, which is significantly lower than the NIDP-SP method. This shows the efficiency to couple the PCE method with quadrature rules for dealing with high-dimensional UQ problem. Notably, when dealing with different numbers of uncertainties, it is also found that the increment in computational time of the proposed UQ method is much smaller, as compared to these existing algorithms. For instance, as can be seen in Figure 4a,c, for the nonintrusive NIDP-SP method, approximately 411 s are required when N is set to 20, whereas on average 2.39 s are needed when N is 4. But the increase in the computational time of the gDRM-based methods is much smaller. Since the proposed UQ method provides accurate UQ results and is computationally efficient, the mBiDRM-and mTriDRM-PCE methods will be further tested in the following two biochemical engineering examples.

Example 2: Conjoint Tumor-Normal Cell Models
In this example, mathematical models [31], describing the mutual interactions between tumor and normal cells, were used to show the efficiency of the proposed UQ methods. This model was chosen since it is sufficiently complicated to show the capability of the UQ methods for dealing with many uncertainties. In addition, to obtain accurate model predictions, it is useful to consider intrinsic variability in tumor-environmental interactions. For example, a probabilistic description of model predictions, estimated with UQ techniques, can provide information to design multi-therapeutic strategies and to advance understanding of the growth of tumor cells. The model to describe cellular interactions can be described as follows [31]: where T c and N c denote the total number of tumor and normal cells, respectively. The description of these model parameters and their values used in this work are given in Table 1 [31]. All parameters in Equations (31) and (32) were assumed to be uncertain, that is, the number of uncertainties N is 9, generating a nine-dimensional random space. Also, all parametric uncertainties were assumed to be identically and uniformly distributed. Thus, Legendre polynomials were used here as the PCE basis functions. For clarity, the mathematical expression of each parametric uncertainty can be given as δ i = δ i (1 + σξ i ), where δ i is the mean value of each parameter listed in Table 1 and ξ i is a random variable-uniformly distributed in the range of (−1, 1). In this case study, we introduced 10% uncertainty into each parameter and subsequently, σ was set to 0.1. As done in Section 3.1, different UQ methods, including the mBiDRM-, mTriDRM-PCE methods and nonintrusive NIDP-SP, were used in this case study. The results with different UQ methods were shown in Figure 5. The approximation level w of NIDP-SP was set to 5. Note that the polynomial order of each uncertainty P is set to 2, which results in 55 PCE coefficients for each response, that is, 55 coupled ODEs are used to approximate each model prediction with the PCE theory (see Equation (5)). Since there are two model responses as described in Equations (31) and (32), a total of 110 equations are generated to describe the original stochastic system in this example. These coupled ODEs were numerically solved using the explicit second-order Runge-Kutta method [38]. For simulations, the initial value of the first coefficient of each model response (i.e., T c and N c ) was set to 1, while the rest of the coefficients were set to 0. Note that the equation corresponding to the first coefficient of the model response describes the mean value of its response in the presence of uncertainty. Since 9 parametric uncertainties were considered, the BiDRM-PCE converted each ninedimensional integral in the SG step into 36 two-dimensional and 9 one-dimensional integrals, while the TriDRM-PCE required to calculate 84 three-dimensional, 36 two-dimensional and 9 one-dimensional integrals. For mBiDRM-and mTriDRM-PCE methods, the Gauss-Legendre quadrature rules were used to generate 5 evaluation points in each dimension and the full tensor product was used to decide the total required evaluation points to estimate these low-dimensional integrals mentioned above. For the nonintrusive NIDP-SP, the Clenshaw-Curtis quadrature rules were used to generate sparse grids following the Smolyak algorithm, which contains 26,017 grid points, since N was set to 9. As seen in Figure 5, the mean values of T c and N c calculated with different UQ methods are shown with solid and dash-dot lines, respectively, while the error bars in each subplot show the uncertainty in model predictions at 21 discrete time points of the entire simulation (i.e., Time from 0 to 100 in Figure 5). the accuracy of the methods in this work. We also compared the computational time as shown in Figure 7 to show the computational efficiency. As seen in Figure 7, compared to the NIDP-SP, the mBiDRM-PCE can reduce the computational time by about 43 percent points and the mTriDRM-PCE can save the computational time by approximately 33 percent points. This clearly shows that the superior performance of mBiDRM-and mTriDRM-PCE methods, when dealing with many uncertainties (up to 9 in this case study).  To compare the UQ accuracy, 10 5 samples for each parametric uncertainty were used for MC; and the simulation results were used as a reference in Equation (28). The RMSE of each method is given in Figure 6. Unlike the nonlinear algebraic problems in Section 3.1, simulation results of 21 discrete points over the entire simulation as shown in Figure 5 was used to calculate RMSE, that is, H = 21 in Equation (28). As seen in Figure 6, all methods show the identical RMSE results, indicating the accuracy of the methods in this work. We also compared the computational time as shown in Figure 7 to show the computational efficiency. As seen in Figure 7, compared to the NIDP-SP, the mBiDRM-PCE can reduce the computational time by about 43 percent points and the mTriDRM-PCE can save the computational time by approximately 33 percent points. This clearly shows that the superior performance of mBiDRMand mTriDRM-PCE methods, when dealing with many uncertainties (up to 9 in this case study).

Example 3: G2 to Mitosis Transition Model for Fission Yeast
This example describes dynamic behaviors of living cells (fission yeast cells) in the transition phase from G2 to Mitosis (G2-M) in the cell cycle [32]. For simplicity, it is assumed that cells in the G2 phase, which have completed DNA replication, grow and prepare for mitosis-cells are divided into daughter cells. It is important to understand the transition mechanism, since it determines the time for mitotic entry and the cell size at the point of mitosis, at which cells divide into daughter cells. In addition, it provides useful information to investigate the impact of missense mutations in proteins on the genotype-phenotype relationship [32]. But model parameters cannot be well calibrated with data from biological experiments, due to the strong nonlinearity in model structure. Therefore, it can be useful to consider uncertainties in model parameters to improve the accuracy of model predictions. For fission yeast, the proteins, including cyclin-dependent kinase 1 (Cdk1), cyclin B (CycB), kinase Wee1 and phosphatase Cdc25, are involved in the G2 phase [32]. In this phase, a complex, that is, mitosis promoting factor (MPF) is formed by binding of Cdk1 to CycB, which leads to the G2-M transition. Details about the mechanism of this biological process can be found in the literature [32]. The model to describe the G2-M transition consists of the concentrations of CycB, MPF, Wee1 and Cdc25 (i.e., , , and ), which are given as follows [32]:

Example 3: G2 to Mitosis Transition Model for Fission Yeast
This example describes dynamic behaviors of living cells (fission yeast cells) in the transition phase from G2 to Mitosis (G2-M) in the cell cycle [32]. For simplicity, it is assumed that cells in the G2 phase, which have completed DNA replication, grow and prepare for mitosis-cells are divided into daughter cells. It is important to understand the transition mechanism, since it determines the time for mitotic entry and the cell size at the point of mitosis, at which cells divide into daughter cells. In addition, it provides useful information to investigate the impact of missense mutations in proteins on the genotype-phenotype relationship [32]. But model parameters cannot be well calibrated with data from biological experiments, due to the strong nonlinearity in model structure. Therefore, it can be useful to consider uncertainties in model parameters to improve the accuracy of model predictions. For fission yeast, the proteins, including cyclin-dependent kinase 1 (Cdk1), cyclin B (CycB), kinase Wee1 and phosphatase Cdc25, are involved in the G2 phase [32]. In this phase, a complex, that is, mitosis promoting factor (MPF) is formed by binding of Cdk1 to CycB, which leads to the G2-M transition. Details about the mechanism of this biological process can be found in the literature [32]. The model to describe the G2-M transition consists of the concentrations of CycB, MPF, Wee1 and Cdc25 (i.e., C CycB , C MPF , C Wee1 and C Cdc25 ), which are given as follows [32]: where V 25 and V wee are coefficients to determine the activation and inactivation rates of MPF. Model parameters and their corresponding descriptions are given in Table 2 [32]. Note that uncertainties in the concentration of CycB and MPF (C CycB and C MPF ), resulting from parametric uncertainties, are studied, since the MPF activity is controlled by Wee1 and Cdc25. It is also assumed in this work that all rate constants, excluding the Michalis constants, involve uncertainty in order to validate the performance of the presented UQ method. Thus, the resulting number of parametric uncertainties is 10 in this case study (N = 10), which results in a ten-dimensional random space. In addition, as done in Section 3.2, the polynomial order of each uncertainty P is set to 2. Since there are ten uncertainties and P is 2, each of the model responses in Equations (33) to (36) consists of 66 ODEs, which results in 264 coupled ODEs to represent the original stochastic system. Similar to the previous example, 10% uncertainty was introduced into uncertain parameters; and it was assumed that each of them was randomly chosen from a uniform distribution in (−1, 1). Thus, the rate constants, δ = k s , k d , k awee , k iwee , k a25, , k i25 , k 25 , k 25 , k wee , k wee in Equations (33) to (38) can be rewritten as δ i = δ i (1 + σξ i ), where δ i is the mean value of each parameter, which can be found in Table 2.
As done in the previous example, Legendre polynomials were used as PCE basis functions. Michaelis constant of phosphatase for Wee1 0.93 -For brevity, in this case study, we focused on the comparison between the mTriDRM-PCE and the nonintrusive NIDP-SP methods. To quantify the UQ accuracy, MC simulations were also used to calculate the reference in Equation (28). For NIDP-SP, the approximation level w was set to 5. Based on the Smolyak algorithm, 41,265 collocation points were required, since the number of uncertainties N is 10. For MC, 10 5 samples were used for each parametric uncertainty. For the mTriDRM method, each 10-dimensional integral in the SG step was approximated with 120 three-dimensional, 45 two-dimensional and 10 one-dimensional integrals; and the full tensor product was used to construct evaluation points with 5 quadrature points of each uncertainty. Figure 8 shows the simulation results, where the explicit second-order Runge-Kutta method [38] was used to solve the differential equations formulated by the PCE theory. The initial conditions for the first coefficients of four model responses C CycB , C MPF , C Wee1 and C Cdc25 in Equations (33) to (36) were set to 0.01, 0.01, 1.0, 0.01, respectively, while the rest of the coefficients of each response were set to 0.
As seen in Figure 8, the solid and dash-dot lines show the mean values of C CycB and C MPF , respectively, while error bars show the uncertainty of model predictions at 21 simulation time points. It is found that different UQ methods, mTriDRM-PCE, NIDP-SP and MC, provide similar UQ results. To quantify the UQ accuracy, the RMSEs of the mTriDRM-PCE and NIDP-SP methods were calculated and given in Table 3. Once again, the RMSE was computed with 21 discrete points in the 20 min simulation as shown in Figure 8 and H is 21 in Equation (28). We also compared the computational time of mTriDRM-PCE and NIDP-SP methods. As seen in Table 3, the simulation time of the NIDP-SP is almost doubled, as compared to the mTriDRM PCE. This clearly shows the advantage of the UQ method proposed in this work. 20 min simulation as shown in Figure 8 and is 21 in Equation (28). We also compared the computational time of mTriDRM-PCE and NIDP-SP methods. As seen in Table 3, the simulation time of the NIDP-SP is almost doubled, as compared to the mTriDRM PCE. This clearly shows the advantage of the UQ method proposed in this work.

Conclusions
An efficient uncertainty quantification (UQ) method is developed by coupling the quadrature rules and the generalized dimension reduction method (gDRM) with the polynomial chaos expansion (PCE). To reduce the computational cost, especially when dealing with high-dimensional UQ problems, Gauss quadrature rules are used to numerically estimate low-dimensional integrals from the gDRM-based PCE. To show the efficiency of the method, the modified bivariate and trivariate dimension reduction based PCE (mBiDRM-and mTriDRM-PCE) are compared with different UQ methods, including the sparse grids-based nonintrusive discrete projection method (NIDP-SP), in terms of UQ accuracy and computational time. Our results show that the proposed UQ algorithm provides accurate UQ results and has superior performance (e.g., less computational time), as compared to other existing techniques (e.g., NIDP-SP). This lays the foundation to investigate more complicated problems in the future, which include the regulation and optimization of biological processes.

Conclusions
An efficient uncertainty quantification (UQ) method is developed by coupling the quadrature rules and the generalized dimension reduction method (gDRM) with the polynomial chaos expansion (PCE). To reduce the computational cost, especially when dealing with high-dimensional UQ problems, Gauss quadrature rules are used to numerically estimate low-dimensional integrals from the gDRM-based PCE. To show the efficiency of the method, the modified bivariate and trivariate dimension reduction based PCE (mBiDRM-and mTriDRM-PCE) are compared with different UQ methods, including the sparse grids-based nonintrusive discrete projection method (NIDP-SP), in terms of UQ accuracy and computational time. Our results show that the proposed UQ algorithm provides accurate UQ results and has superior performance (e.g., less computational time), as compared to other existing techniques (e.g., NIDP-SP). This lays the foundation to investigate more complicated problems in the future, which include the regulation and optimization of biological processes.

Conflicts of Interest:
The authors declare no conflict of interest.
where γ m = E Ψ 2 m and W(θ) is a weight function determined by the joint PDF of θ. The calculation of four-dimensional integral in Equation (A3) can be done in a computationally efficient manner by using gDRM technique. Here we present an example when BiDRM is used to evaluate the integral in Equation (A3) but other gDRM such as TriDRM can be used, depending on the dimension of the integral.

Appendix B
To compare the proposed algorithm with the classical PCE methods that rely on the intrusive spectral projection [16], the simulation results of Example 1 in Section 3.1 are shown in Figure A1. Of note, the PCE coefficients of Equation (29), involving nonlinear forms such as division and square root, were solved following the procedure in Reference [16] (see Sections 4.5.2 and 4.5.3 in Reference [16]). To derive the PCE coefficients of the square root, a nonlinear system solver 'fsolve' in MATLAB R2019a (MathWorks, Natick, MA, USA) is used. For Equation (30), the stochastic expansion of the logarithm term is determined based on the Taylor series approach [16,17] (see Section 2.3 in Reference [17]) and the number of terms in the Taylor series is set to 3 in this work. In addition, for the classical PCE method, the number of polynomial terms in the PCE expansion for the model responses in Equations (29) and (30)  Compared to the classical PCE approaches, the UQ results obtained with the algorithm in this work have smaller RMSE errors in Figure A1, when the variability of uncertain parameters (i.e., ) increases. This shows the accuracy of the algorithm in this work, indicating its applicability to deal with more complex and nonlinear problems. Compared to the classical PCE approaches, the UQ results obtained with the algorithm in this work have smaller RMSE errors in Figure A1, when the variability of uncertain parameters (i.e., σ) increases. This shows the accuracy of the algorithm in this work, indicating its applicability to deal with more complex and nonlinear problems.