Geometric Analysis of Conditional Bias-Informed Kalman Filters

This paper presents a comparative geometric analysis of the conditional bias (CB)-informed Kalman filter (KF) and the conventional KF in the Euclidean space. The CB-informed KFs considered include the CB-penalized KF (CBPKF) and its ensemble extension, the CB-penalized Ensemble KF (CBEnKF). The geometric illustration of the CBPKF is given for a bi-state model composed of an observable state and an unobservable state. The CBPKF co-minimizes the error variance and the variance of the Type-II error. As such, CBPKF-updated state error vectors are larger than the KF-updated ones, the latter being based on minimizing the error variance only. Different error vectors in the Euclidean space imply different eigenvectors and covariance ellipses in the state space. To characterize the differences in geometric attributes between the two filters, numerical experiments were carried out using the Lorenz 63 model. The results show that the CBEnKF yields more accurate confidence regions for encompassing the truth, smaller errors in the ensemble mean, and larger norms for the Kalman gain and error covariance matrices than the EnKF, particularly when assimilating highly uncertain observations.


Introduction
With highly uncertain observations and model dynamics, KF [1] and EnKF [2] estimates tend to be conditionally biased. The conditional bias (CB)-informed Kalman filter (KF) is designed to improve the estimation of extreme states in geophysical data assimilation, particularly at the onset of extreme phenomena. The CB-informed KFs developed to date include the CB-penalized Kalman filter (CBPKF; [3,4]) and its ensemble extension, the CB-penalized Ensemble Kalman filter (CBEnKF; [5]), both of which address CB [3][4][5] to improve the estimation of extreme states. With extreme precipitation, droughts, and floods occurring more frequently in many parts of the globe, the estimation and prediction of extremes is an increasingly important topic in hydrology. With the increasing availability of diverse sources of observations, hydrologic data assimilation has a large role to play in improving state estimation. The CB-informed KF is an effort to help address both.
There are two types of CB, Type I and Type II. The Type-I CB is defined as E[X|X* = x*] − x* where X, X*, and x* represent the unknown truth, the estimate, and the realization of X*, respectively. The Type-II CB is defined as E[X*|X = x] − x where x represents the realization of X. The Type-I CB is associated with false alarms which can be reduced by calibration. The Type-II CB is associated with failure to detect an event and cannot be reduced by calibration. The CBPKF minimizes a linearly weighted sum of the error covariance and the expectation of the Type-II error squared. With skillful specification of the weight for the latter, the CBPKF improves estimation of extremes over the KF while slightly increasing the unconditional MSE [3,4].
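The two CB types above can be illustrated with a small Monte Carlo sketch. The example below is hypothetical, not from the paper: a regression-to-the-mean estimator of a Gaussian truth, for which the Type-II CB grows toward the extremes. The function `conditional_bias` and all parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic truth and a smoothing-type estimator (hypothetical example):
# the estimate regresses toward the mean, a common source of Type-II CB.
x_true = rng.normal(0.0, 10.0, size=200_000)            # unknown truth X
x_est = 0.6 * x_true + rng.normal(0.0, 4.0, 200_000)    # estimate X*

def conditional_bias(cond_var, other_var, bins):
    """Mean of other_var conditioned on cond_var falling in each bin,
    minus the bin-center value of cond_var."""
    idx = np.digitize(cond_var, bins)
    centers = 0.5 * (bins[:-1] + bins[1:])
    cb = np.full(len(centers), np.nan)
    for b in range(1, len(bins)):
        sel = idx == b
        if sel.any():
            cb[b - 1] = other_var[sel].mean() - centers[b - 1]
    return centers, cb

bins = np.linspace(-30, 30, 13)
# Type-I CB: E[X | X* = x*] - x*   (associated with false alarms)
_, cb1 = conditional_bias(x_est, x_true, bins)
# Type-II CB: E[X* | X = x] - x    (associated with failure to detect)
_, cb2 = conditional_bias(x_true, x_est, bins)

# For a regression-to-the-mean estimator, the Type-II CB is strongly
# negative for large positive truths, i.e., extremes are underestimated.
print(cb2)
```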
Though extensively studied in statistics, econometrics, meteorology, and hydrology [6][7][8][9][10][11][12][13][14][15][16], CB has gained interest only recently in data assimilation. Relevant studies to date include the CBPKF [3,4] and the CBEnKF [5], the CBEnKF application to flood forecasting [5], adaptive filtering for the CBPKF [17] and the CBEnKF [18], and the variance-inflated KF-based approximation of the CBPKF for reduced computation [17,18]. Improving the understanding of how the CB-informed KF compares with conventional data assimilation techniques, such as the KF, is crucial for interpreting and analyzing its application results, as well as for identifying areas for further development. To that end, this study provides a geometric analysis of the CB-informed KF and compares its performance with the EnKF via numerical experiments.
Stochastic variables may be expressed as vectors in the Euclidean domain, which can be used to geometrically illustrate vectors of innovation, observation errors, and forecast and analysis errors [19]. The geometric interpretation of state estimation thus helps in understanding how the Kalman gain shapes analysis error vectors and assimilation solutions for observable as well as unobservable states [19]. Error covariance matrices may also be analyzed geometrically in the state space and characterized with eigenvalues, eigenvectors, and associated confidence regions.
The purpose of this work is to gain intuitive insight into how the CBPKF solution differs from the KF solution by casting both in error and state spaces using a bi-state model, and to advance understanding of the comparative performance of the CBEnKF and the EnKF by identifying and characterizing representative geometric attributes in the state space via eigenvalue analysis. The new and significant contributions of this work are: a comparative geometric analysis of the CBPKF solution with the KF, the development of a set of geometry-based relationships for improved understanding of the CBPKF solution, and the geometric characterization of ensemble analysis for improved understanding of the comparative performance of the CBEnKF with the EnKF. The comparative performance assessment is based on numerical experiments with the Lorenz 63 model, chosen in this study for familiarity and simplicity. For real-world flood forecasting applications of the CBEnKF, the interested reader is referred to Lee et al. [5] and Shen et al. [18]. This paper is organized as follows. Section 2 describes the state updating problem and the two solution approaches, the KF and the CBPKF. Section 3 presents the geometric analysis of the CB-informed KF in relation to the KF and the EnKF. First, Section 3.1 describes how to use vectors to represent stochastic variables in the Euclidean domain, and Section 3.2 expresses the filter equations in terms of state and observation error terms. Section 3.3 geometrically illustrates the filter equations for a low-order model based on the visualization of error vectors and their relations to the Kalman gain and covariance matrices in the Euclidean space [19]. Section 3.4 describes geometric analyses in the state space with eigenvalues and eigenvectors of error covariance matrices and associated confidence regions.
In Section 3.5, an example is given of geometric analysis in the state space, in which the CBEnKF, the EnKF, and the Open Loop (OL) are compared for the Lorenz 63 model via numerical experiments. Finally, Section 4 presents the conclusions.

State Updating Problem
The nonlinear dynamical model is written as Equation (1):

X_k = M(X_{k-1}) + W_{k-1}    (1)

In the above, X_k denotes the (n_c × 1) model state, or control, vector, where n_c denotes the number of variables in the control vector, M( ) denotes the dynamical model for the state variables, and W_{k-1} denotes the dynamical model error at time step k - 1 with a mean of zero and a covariance of Q_k. The nonlinear observational model is written as:

Z_k = H_k(X_k) + V_k    (2)

In the above, Z_k denotes the (n × 1) observation vector, where n denotes the total number of observations, V_k denotes the (n × 1) observation error vector at time step k with a mean of zero and a covariance of R_k, and H_k( ) denotes the nonlinear observation operator that maps X_k to Z_k. This study solves the linear observation model, which renders H_k(X_k) in Equation (2) linear, i.e., H_k X_k:

Z_k = H_k X_k + V_k    (3)

In the case of a nonlinear observation operator, e.g., a soil moisture-streamflow transformation, one can still render Equation (2) linear via state augmentation [5,20,21], which is beyond the scope of this study. Interested readers are referred to Lee et al. [5], where state augmentation is used to solve a highly nonlinear flood forecasting problem.
Equation (4) shows the state updating equation, where X_{k|k}, X_{k|k-1}, and K_k represent the updated state at time step k, the state forecast from k - 1 to k, and the Kalman gain, respectively:

X_{k|k} = X_{k|k-1} + K_k (Z_k - H_k X_{k|k-1})    (4)
To find K_k, the KF and the EnKF minimize the error variance (Σ_EV) in Equation (5). The CBPKF and the CBEnKF minimize the weighted sum of Σ_EV and the expectation of the Type-II CB squared (Σ_CB) in Equation (6), or Σ_EV + αΣ_CB, where α is the weight given to the CB penalty term. The weight α can be estimated using an iterative method that yields the error covariance within theoretically expected bounds (see [5] for details). Since the iterative method is computationally expensive, an adaptive filtering method has been developed; the results are reported in Shen et al. [17].
The following sections present expressions for the Kalman gain (K_k) and covariance (P_{k|k}) matrices for the KF and the CBPKF.

Kalman Filter, KF
Equation (4) is rewritten as Equation (7) with the superscript K to denote the KF:

X^K_{k|k} = X_{k|k-1} + K^K_k (Z_k - H_k X_{k|k-1})    (7)

Minimizing Σ_EV in Equation (5) results in the Kalman gain K^K_k in Equation (8) and the analysis covariance matrix P_{k|k} in Equation (9) [1]:

K^K_k = P_{k|k-1} H_k^T (H_k P_{k|k-1} H_k^T + R_k)^{-1}    (8)

P_{k|k} = (I - K^K_k H_k) P_{k|k-1}    (9)
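A minimal numerical sketch of the KF analysis step may help fix the notation; the function below implements the standard gain, state, and covariance updates referenced in the text as Equations (4), (8), and (9). The numerical values are illustrative assumptions, not from the paper.

```python
import numpy as np

def kf_update(x_fcst, P_fcst, z, H, R):
    """One KF analysis step: gain, state update, and covariance update."""
    S = H @ P_fcst @ H.T + R                         # innovation covariance
    K = P_fcst @ H.T @ np.linalg.inv(S)              # Kalman gain (Equation (8))
    y = z - H @ x_fcst                               # innovation
    x_anal = x_fcst + K @ y                          # state update (Equation (4))
    P_anal = (np.eye(len(x_fcst)) - K @ H) @ P_fcst  # covariance (Equation (9))
    return x_anal, P_anal, K

# Bi-state example with only the first state observed (H = [1 0]):
x_f = np.array([1.0, 2.0])
P_f = np.array([[4.0, 1.5], [1.5, 2.0]])
H = np.array([[1.0, 0.0]])
R = np.array([[1.0]])
x_a, P_a, K = kf_update(x_f, P_f, np.array([2.0]), H, R)
# The analysis variance of the observed state is reduced: 4 - 4^2/5 = 0.8.
```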

Error Representation of Filter Equations
Since the error covariance of stochastic variables plays a key role in geometric analyses in the Euclidean space, this section defines the errors in the states and the residual in the measurement in order to rewrite the state updating and observation equations in terms of error quantities.
In the above, ε^C_k = X_k - X^C_{k|k}, ε^K_k = X_k - X^K_{k|k}, and ε_k = X_k - X_{k|k-1} denote the CBPKF analysis error, the KF analysis error, and the forecast error, respectively; y_k = Z_k - H_k X_{k|k-1} represents the innovation; X_k, X^C_{k|k}, X^K_{k|k}, and X_{k|k-1} represent the truth, the CBPKF-updated state, the KF-updated state, and the state forecast, respectively.
State update equations for the KF and the CBPKF are rewritten into Equations (27) and (28), assuming that the same a priori states are used in both filters:

ε^K_k = ε_k - K^K_k y_k    (27)

ε^C_k = ε_k - K^C_k y_k    (28)

where K^K_k and K^C_k are the Kalman gains from the KF and the CBPKF, respectively. The updated states from the two filters then have the following relationship: X^C_{k|k} - X^K_{k|k} = (K^C_k - K^K_k) y_k.
The observation model in Equation (3) can be rewritten in terms of the forecast error as Equation (30):

y_k = H_k ε_k + V_k    (30)

From Equation (30), the covariance matrix of y_k, or D_k, can be written as:

D_k = H_k P^-_k H_k^T + R_k    (31)

where P^-_k is the covariance matrix of the state forecast.

KF and CBPKF Solutions for a Bi-State Model
A bi-state model (X_k = [x_{1,k} x_{2,k}]^T) is used to illustrate the geometric relations of the errors in the states and observations from the CBPKF and the KF. An observation z_{1,k} exists for the state x_{1,k} but not for x_{2,k}, in order to investigate how state updating can be illustrated in the two-dimensional (2-D) Euclidean space for observable as well as unobservable states, i.e., Z_k = [z_{1,k}]; H = [1 0] is used for simplicity. Kronhamn [19] used the same X_k, Z_k, and H matrices as those described in this Section for the geometric illustration of the KF, whereas we focus on the CBPKF in reference to the KF. For the bi-state model with a forecast error covariance P_{f,k} = [σ^2_{1,k} σ_{12,k}; σ_{12,k} σ^2_{2,k}] and observation error variance R_k = σ^2_{z,k}, the matrices of the Kalman gain (Equations (32) and (33)), the analysis error covariance (Equations (34) and (35)), and the correlation (Equations (36)-(38)) for the CBPKF and the KF are evaluated below, where σ_{12,k} = ρ_{12,k} σ_{1,k} σ_{2,k} with ρ_{12,k} representing the correlation between x_{1,k} and x_{2,k}. Equation (32) indicates that for the observable state x_{1,k}, K^C_k is always larger than K^K_k, but for the unobservable state x_{2,k}, the sign of K^C_k - K^K_k depends on the sign of σ_{12,k}. The CBPKF covariance matrix P^C_{a,k} and its relation to the KF equivalent P^K_{a,k} are given in Equation (34).
In Equation (34), the variances of both updated states x_{1,k} and x_{2,k} are larger than the KF equivalents, and the signs of the covariance terms of P^C_{a,k} - P^K_{a,k} depend on the sign of σ_{12,k}. Equations (34) and (35) imply that the CBPKF-updated state ensembles have larger spreads in the state space than the KF-updated ones, and that the matrix norm of P^C_{a,k} is larger than that of P^K_{a,k}. From the covariance matrices, the Pearson product-moment correlation matrix C^C_{a,k} can be computed by Equation (36):

C^C_{a,k} = diag(P^C_{a,k})^{-1/2} P^C_{a,k} diag(P^C_{a,k})^{-1/2}    (36)

where diag( ) denotes the diagonal matrix formed from the diagonal entries. Equation (37) shows that the correlation coefficient of the CBPKF-updated states, C^C_{a,12,k}, can be either larger or smaller than the KF equivalent C^K_{a,12,k}. From Equations (32)-(38), if α = 0, then K^C_k = K^K_k, P^C_{a,k} = P^K_{a,k}, and C^C_{a,k} = C^K_{a,k}.
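The KF side of the bi-state quantities above can be computed directly. The sketch below uses illustrative numbers (not from the paper) and shows only the KF counterparts of the gain, analysis covariance, and correlation matrices, since the CBPKF expressions are not reproduced here; it also illustrates that the gain for the unobservable state carries the sign of σ_{12,k}.

```python
import numpy as np

# Bi-state forecast covariance and scalar observation error (illustrative values).
s1, s2, rho12, sz = 2.0, 1.0, 0.6, 3.0        # sigma_1, sigma_2, rho_12, sigma_z
s12 = rho12 * s1 * s2
P_f = np.array([[s1**2, s12], [s12, s2**2]])
H = np.array([[1.0, 0.0]])                    # only x_1 is observed
R = np.array([[sz**2]])

# KF gain and analysis covariance for the observable/unobservable pair.
S = H @ P_f @ H.T + R                         # innovation variance
K = P_f @ H.T @ np.linalg.inv(S)
P_a = (np.eye(2) - K @ H) @ P_f

# Pearson correlation matrix obtained from the covariance matrix
# by diagonal normalization.
d = np.sqrt(np.diag(P_a))
C_a = P_a / np.outer(d, d)

# With H = [1 0], the gain component for the unobservable state is
# s12 / (s1^2 + sz^2), so it changes sign with s12.
print(K.ravel(), C_a[0, 1])
```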

Geometric Representation of KF and CBPKF Solutions
This Section begins by describing the relation between stochastic variables and vectors in the Euclidean domain in Equations (39)-(41), and then describes the geometric representation of the KF and CBPKF equations. The covariance of stochastic variables x_s and y_s corresponds to the scalar product of the two vectors x⃗_E and y⃗_E in the Euclidean domain (Equation (39)), Cov(x_s, y_s) = x⃗_E · y⃗_E. The vector norm ||x⃗_E|| corresponds to the standard deviation of x_s (Equation (40)). The angle between x⃗_E and y⃗_E can be computed from the correlation of x_s and y_s (Equation (41)), cos θ = ρ_{xy} [19,22]. Figure 1 shows the vector representation of stochastic variables in the Euclidean space.

In Figure 2a, the state forecast error vector ε⃗_{1,k} is orthogonal to the observation error vector v⃗_{1,k} owing to the independence assumption. Being a minimum variance solution, the KF analysis error vector ε⃗^K_{1,k} is orthogonal to the innovation vector y⃗_{1,k} [19,23]. Figure 2a also shows that the forecast error is the vector sum of the gain-weighted innovation and the analysis error, as in Equation (27). The KF analysis error variance may hence be obtained in Figure 2a via the Pythagorean theorem. In Figure 2, the inequality ||ε⃗^K_{1,k}|| < ||ε⃗^C_{1,k}||, which is expected from Equation (34), arises from the fact that the CBPKF minimizes not the error variance but a weighted sum of the error variance and the variance of the Type-II CB.

Figure 3 is the same as Figure 2 but for the unobservable state x_{2,k}. In Figure 3a, the proportionality and orthogonality relations hold as expected from Equation (33). In Figure 3b, ||ε⃗^C_{2,k}|| may be written via the Pythagorean theorem, where the last equality follows from Equation (30), in agreement with Equation (35). Figures 2 and 3 show that the KF- and the CBPKF-updated state error vectors point in different directions in the state space. Figure 4 shows the updated state error vectors in Figures 2 and 3 to visually compare the differences in angle, magnitude, and direction. The angles of the two state error vectors for the CBPKF (θ^C in Equation (44)) and the KF (θ^K in Equation (45)) can be computed from the correlations C^C_{a,12,k} and C^K_{a,12,k} in Equations (37) and (38), respectively.

Hydrology 2022, 9, 84

Below, we develop a set of geometric expressions in the 2-D state space for the analysis error covariance via eigenvalue decomposition (EVD, [24]).
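The correspondence in Equations (39)-(41) can be checked numerically by representing centered samples, scaled by 1/√n, as Euclidean vectors; the finite-sample scaling convention used here is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two correlated stochastic variables as finite samples.
n = 100_000
xs = rng.normal(size=n)
ys = 0.8 * xs + 0.6 * rng.normal(size=n)   # corr(xs, ys) = 0.8 by construction

# Euclidean-vector representation: centered samples scaled by 1/sqrt(n).
xE = (xs - xs.mean()) / np.sqrt(n)
yE = (ys - ys.mean()) / np.sqrt(n)

dot = xE @ yE                    # Equation (39): scalar product = covariance
norm_x = np.linalg.norm(xE)      # Equation (40): vector norm = standard deviation
cos_angle = dot / (np.linalg.norm(xE) * np.linalg.norm(yE))   # Equation (41)

# The cosine of the angle matches the Pearson correlation of the samples.
assert abs(cos_angle - np.corrcoef(xs, ys)[0, 1]) < 1e-12
```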


Geometric Analysis in the State Space
Geometric characteristics of state ensembles in the 2-D state space can be quantified by confidence regions (CRs), eigenvectors, eigenvalues, and the angle between the eigenvector and the basis vector of the x-axis. Assuming normal distributions for state ensembles in a 2-D state space, a CR, or so-called covariance ellipse, can be constructed based on the EVD of a covariance matrix together with the Chi-Square probability table. The presence of the CB results in different variances (eigenvalues) and directions (eigenvectors) of updated state ensembles in a 2-D state space. The major and minor axis lengths of the covariance ellipse are 2√(sλ_1) and 2√(sλ_2), where λ_1 > λ_2 and the value of s is from the Chi-Square probability table for a given confidence level, e.g., s = 4.605 for a 90% confidence region given the Chi-Square probability P(s < 4.605) = 0.9 with 2 degrees of freedom. In a 2-D state space, the error in the orientation of the covariance ellipse with respect to the truth can be estimated by the angle, θ, between the largest eigenvector u⃗_1 and the vector a⃗ connecting the truth and the ensemble mean, i.e., cos(θ(u⃗_1, a⃗)) = (u⃗_1 · a⃗)/(||u⃗_1|| ||a⃗||). The EVD of the CBPKF analysis covariance P^C_{a,k} may be written as P^C_{a,k} = U E U^T, where U is the eigenvector matrix which rotates the white data (W), or uncorrelated standard normal variates, by θ, and the eigenvalue matrix E explains the variance along the principal error directions, or the directions of the eigenvectors. In a 2-D state space, √E is a scale factor applied to W. The dataset D resulting from scaling W by √E and rotating by U, i.e., D = U√E W, has the covariance matrix P^C_{a,k} = U E U^T. Below we apply the EVD to the CBPKF analysis error covariance, P^C_{a,k}, from the bi-state model in Section 3.2.
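The EVD construction D = U√E W described above can be verified numerically; the covariance matrix below is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(2)

P = np.array([[3.0, 1.2], [1.2, 1.0]])      # a 2-D analysis error covariance

# EVD: P = U E U^T, with eigenvalues sorted in descending order.
evals, U = np.linalg.eigh(P)
order = np.argsort(evals)[::-1]
evals, U = evals[order], U[:, order]
E_sqrt = np.diag(np.sqrt(evals))

# White data W (uncorrelated standard normal variates), scaled by sqrt(E)
# and rotated by U: D = U sqrt(E) W has covariance U E U^T = P.
W = rng.normal(size=(2, 200_000))
D = U @ E_sqrt @ W
P_sample = np.cov(D)

# 90% confidence-region (covariance ellipse) axis lengths: 2*sqrt(s*lambda),
# with s = 4.605 from the Chi-Square table for 2 degrees of freedom.
s = 4.605
major, minor = 2 * np.sqrt(s * evals[0]), 2 * np.sqrt(s * evals[1])
```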
With λ^C_1 > λ^C_2 > 0 and P^K_{a,k} = [p^2_{1,k} p_{12,k}; p_{12,k} p^2_{2,k}], P^C_{a,k} in Equation (34) may be rewritten in terms of its EVD. Appendix A describes how the eigenvalues and eigenvectors may be evaluated for a 2 × 2 covariance matrix. Using Equation (A11) in Appendix A, λ^C_1 may be obtained, and the difference in the largest eigenvalue between the KF and the CBPKF analysis error covariances is given by λ^C_1 - λ^K_1, where λ^K_1 denotes the largest eigenvalue of P^K_{a,k}. Using Equation (A12), we may similarly write λ^C_2 and λ^C_2 - λ^K_2. The orientations of the largest eigenvectors may additionally be measured against the basis vector i⃗ of the x-axis.
With the EVD of P^C_{a,k} and the ensemble mean (x̄_1, x̄_2), the minimum percentage confidence CR_MIN needed to contain the verifying truth (x_{1,T}, x_{2,T}) within the confidence region can be computed by CR_MIN = 100 × P(s < s̃), where P(s < s̃) is the Chi-Square probability with 2 degrees of freedom and s̃ satisfies Equation (55). With the geometric attributes established above, we now carry out the comparative geometric analysis of the KF and the CBPKF analysis results using the Lorenz 63 model [25].
The red and green dots denote the ensemble mean pair and the truth, respectively; a⃗ is the vector connecting the ensemble mean and the truth; u⃗_1 is the eigenvector of the covariance matrix P that corresponds to the largest eigenvalue λ_1 of P; √λ_1 u⃗_1 represents the vector along the major axis of the covariance ellipse, where √λ_1 represents a scale factor applied to the white data; θ is the angle between a⃗ and u⃗_1. Assuming normal distributions of state ensembles, the major axis length of the covariance ellipse, or the confidence region (CR), is 2√(sλ_1), where s = 1.39, 4.605, or 9.21 for 50, 90, or 99% CRs, respectively.
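Since Equation (55) is not reproduced here, the sketch below assumes that s̃ is the squared Mahalanobis distance of the truth from the ensemble mean, i.e., the ellipse parameter of the confidence region that passes exactly through the truth; for 2 degrees of freedom the Chi-Square CDF has the closed form 1 - exp(-s/2). All numerical values are illustrative.

```python
import numpy as np

def cr_min(mean, P, truth):
    """Minimum percentage confidence region, centered on the ensemble mean
    with covariance P, that contains the verifying truth (2-D case).
    Assumes s_tilde is the squared Mahalanobis distance of the truth."""
    a = np.asarray(truth) - np.asarray(mean)   # vector a: ensemble mean -> truth
    s_tilde = a @ np.linalg.inv(P) @ a         # ellipse passing through the truth
    # Chi-Square CDF with 2 degrees of freedom: P(s < s_tilde) = 1 - exp(-s/2).
    return 100.0 * (1.0 - np.exp(-s_tilde / 2.0))

P = np.array([[2.0, 0.5], [0.5, 1.0]])
print(cr_min([0.0, 0.0], P, [1.0, 1.0]))   # moderate value: truth near the mean

# Consistency check with the table values quoted in the text:
# s = 4.605 corresponds to the 90% confidence region.
assert abs(100.0 * (1.0 - np.exp(-4.605 / 2.0)) - 90.0) < 0.1
```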


Numerical Experiment with the Lorenz 63 Model
In the sections above, a linear two-state model was used for theoretical simplicity. In this section, we use the three-state Lorenz 63 model to illustrate the differences between the EnKF and the CBEnKF solutions in terms of the geometric attributes introduced above. In this experiment, synthetically generated observations of all three states in the Lorenz 63 model were assimilated at every time step using the EnKF and the CBEnKF [26]. Preliminary experiments suggested that observation error variances (σ^2_z) of 10 or 400 can be used for the cases of assimilating less uncertain or largely uncertain observations, respectively, based on the ensemble spread. To render the assimilation problem more challenging, σ^2_z = 400 is used to compare the performance of the CBEnKF with that of the EnKF in Figures 6-11, where the ensemble size (n_S) used is 2000 to minimize filter performance degradation owing to a small ensemble size. Figures 12 and 13 present the sensitivity of the results to the ensemble size and the observation uncertainty.
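The experimental setup can be sketched as a minimal twin experiment. The sketch below is an illustrative assumption, not the authors' configuration: a coarse forward-Euler Lorenz 63 integrator, a perturbed-observation EnKF with all three states observed (H = I), σ^2_z = 10, and n_S = 200 for speed; the CB penalty of the CBEnKF is not implemented.

```python
import numpy as np

rng = np.random.default_rng(3)

def lorenz63(x, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One forward-Euler step of the Lorenz 63 model (coarse but simple);
    works on a single state (3,) or an ensemble (3, n_s)."""
    dx = np.array([sigma * (x[1] - x[0]),
                   x[0] * (rho - x[2]) - x[1],
                   x[0] * x[1] - beta * x[2]])
    return x + dt * dx

def enkf_update(X, z, var_z, rng):
    """Perturbed-observation EnKF analysis; all three states observed (H = I)."""
    P = np.cov(X)                                  # ensemble forecast covariance
    K = P @ np.linalg.inv(P + var_z * np.eye(3))   # gain with R = var_z * I
    Z = z[:, None] + rng.normal(0.0, np.sqrt(var_z), size=X.shape)
    return X + K @ (Z - X)

# Twin experiment: truth run, synthetic observations, ensemble assimilation.
var_z, n_s, n_steps = 10.0, 200, 500
x_true = np.array([1.0, 1.0, 1.0])
X = x_true[:, None] + rng.normal(0.0, 2.0, size=(3, n_s))  # initial ensemble

err = []
for _ in range(n_steps):
    x_true = lorenz63(x_true)
    X = lorenz63(X)
    z = x_true + rng.normal(0.0, np.sqrt(var_z), size=3)
    X = enkf_update(X, z, var_z, rng)
    err.append(np.abs(X.mean(axis=1) - x_true))

print(np.mean(err))   # mean absolute error of the ensemble mean
```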
Figure 11. Frobenius norms of K_k and P_{a,k}, or ||K_k||_F and ||P_{a,k}||_F, respectively, as a function of exceedance probabilities, where n_S = 2000 and σ^2_Z = 400 are used for both the CBEnKF (red line) and the EnKF (blue line).

Conclusions
Error covariance and gain matrices of two CB-informed KFs, i.e., the CBPKF and the CBEnKF, are geometrically illustrated and compared with the KF equivalents [19] for a bi-state model using error vectors in the Euclidean space. Geometric illustration and analysis offer an intuitive understanding of the relationship between the two filters. Unlike  a v | reflect errors in the ensemble mean measured along horizontal or vertical directions, respectively; large dots overlaid in the scatter plots represent mean values of the samples in each of the ten bins equally dividing the entire state space. Figures 7 and 8 are the same as Figure 6 but for the state spaces of (x 1 , x 3 ) and (x 2 , x 3 ), respectively. The following summarizes general observations from Figures 6-8. The spread of | → a h | and | → a v | in the scatter plots shows that the CBEnKF reduces CBs more effectively than the EnKF or the OL, particularly in the extremes. In some cases, however, their mean values appear similar, e.g., | → a h | in the state spaces of (x 1 , x 3 ) or (x 2 , x 3 ). The OL results for | → a h | and | → a v | show that their patterns appearing in the state space are similar to those of the state space plot in the top left of Figures 6-8; this indicates the notable dependency of the amount of ensemble mean errors on model dynamics, e.g., the larger ensemble mean errors at the extreme-this is also seen in EnKF solutions but less so in the CBEnKF. This may be explained by the CBEnKF with a larger weight to observations than the EnKF in the case of largely uncertain observations (σ 2 z = 400), which reduces the reliance of CBEnKF solutions on the model dynamics. Based on √ λ 1 , CBEnKF covariances are generally larger than the EnKF at all state spaces. Larger √ λ 1 and smaller | → a h | and | → a v | of the CBEnKF than the EnKF yield consistently smaller CR MI N than the EnKF at both extremes and the median of all three variables. 
This signifies the benefit of using the CB-informed KF for the estimation of extremes given that the EnKF's CR MI N quickly increases towards extremes, i.e., the EnKF is less confident in estimating extremes than the CBEnKF. For example, 3% confidence regions for selected extreme values presented in the bottom right plots show the truth (green dots) contained within the CBEnKF's confidence regions (red ellipses) but not within the EnKF's (blue ellipses). At the plots, arrows represent √ λ 1 → u 1 . θ and ρ do not clearly indicate differences between the two filters. Figure 9 shows CR MI N , | → a |, √ λ 1 , θ, and ρ of Figures 6-8 but as a function of exceedance probabilities to highlight CB-informed KF performances at extremes. At extremes with low exceedance probabilities, differences between the CBEnKF and the EnKF are vivid in the case of | → a | and CR MI N . On the other hand, √ λ 1 of the CBEnKF is consistently larger than those of the EnKF and the OL across exceedance probabilities. As exceedance probabilities increase, the EnKF's | → a | becomes similar to the CBEnKF's, implying unconditionally less biased. The CBEnKF keeps consistently low CR MI N at all exceedance probabilities owing to small | → a | and large √ λ 1 , compared to the EnKF or the OL. Since the EnKF seeks orthogonal solutions to minimize analysis covariances, its √ λ 1 is always smaller than the OL's as well as the CBEnKF's. On the other hand, the CBEnKF increases √ λ 1 to address CBs which helps keep CR MI N low to contain the truth. Both θ and ρ show no consistent patterns across different state spaces as well as exceedance probabilities.
In Figure 10, the ensemble mean error time series indicates that among the three variables, CBEnKF's improvement is the largest for x 1 . On the other hand, for the state x 3 , the CBEnKF mainly remedies the underestimation of x 3 compared to the EnKF. In the case of x 2 , the CBEnKF slightly outperforms the EnKF. These observations may imply the different amounts of CBs present in different states, hence the need of applying a separate weight α to the CB penalty for the individual state, which warrants a future effort. To compare P a,k and K k from the two filters, the time series of Frobenius norm of P a,k and K k is computed by Equations (56) and (57), respectively. Compared to the EnKF, the CBEnKF yields ||K k || F and ||P a,k || F consistently larger at all assimilation cycles, and the mean values of ||K k || F and ||P a,k || F are five and three times larger, respectively. Figure 11 shows mean ||K k || F , and ||P a,k || F as a function of exceedance probabilities. At extremes, both the CBEnKF and the EnKF show that mean ||K k || F and ||P a,k || F are larger than those at high exceedance probabilities, and that large differences in mean ||K k || F and ||P a,k || F between the CBEnKF and the EnKF are consistent across exceedance probabilities. Figures 6-11 are based on the case of uncertain observations (σ 2 z = 400) where the CBEnKF may supposedly outperform the EnKF. To explore the CBEnKF performance with less uncertain observations (σ 2 z = 10) and also to see the sensitivity to the ensemble size (n S ), Figure 12 presents results from the combination of n S = 10, 20, 30, 50, 70, 100, 200, 300, 500, 700, 1000, and 2000, and σ 2 z = 10 and 400. 
In Figure 12, the |a⃗| plots indicate that, with σ²_z = 10, the accuracy of the ensemble mean continuously improves as n_S increases, both for extremes (an exceedance probability of 0.1; red and blue dots for the CBEnKF and the EnKF, respectively) and for all data (red and blue lines for the CBEnKF and the EnKF, respectively). When σ²_z = 10, the EnKF's |a⃗| is slightly smaller than the CBEnKF's, but the CBEnKF's √λ₁ is slightly larger than the EnKF's, and the resulting CR_MIN values from the two filters are very similar. This implies that, when observations are less uncertain, the EnKF solutions are as accurate and as confident as the CBEnKF solutions at extremes as well as over the whole range. When n_S ≥ 200 and σ²_z = 10, mean CR_MIN remains at about 1%; when n_S < 200, CR_MIN quickly increases as n_S decreases because the error covariance estimates become inaccurate with an insufficient ensemble size. When observations are highly uncertain (σ²_z = 2000), the CBEnKF clearly shows more accurate ensemble means (smaller |a⃗|) and higher confidence in the covariance estimates (smaller CR_MIN) than the EnKF, particularly at extremes. Compared to σ²_z = 10, assimilating highly uncertain observations (σ²_z = 2000) reduces the accuracy of the covariance estimates, resulting in larger √λ₁ for both filters, although the CBEnKF's √λ₁, which additionally addresses CB, is larger than the EnKF's. When σ²_z = 2000, |a⃗| and CR_MIN tend to be less sensitive to n_S than when σ²_z = 10. Neither θ nor ρ shows consistent patterns or sensitivities to n_S, but both are included in Figure 12 for completeness.
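Stratifying a per-cycle metric by the exceedance probability of the truth, as done in Figures 9 and 12, can be sketched as follows. The Weibull plotting position rank/(n + 1) and the function names here are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np

def exceedance_probability(truth):
    """Empirical exceedance probability of each truth value via the
    Weibull plotting position rank/(n + 1), with rank 1 = largest value."""
    truth = np.asarray(truth)
    order = truth.argsort()[::-1]            # indices sorted descending
    ranks = np.empty_like(order)
    ranks[order] = np.arange(1, truth.size + 1)
    return ranks / (truth.size + 1.0)

def mean_metric_at_extremes(metric, truth, p_max=0.1):
    """Average a per-cycle metric (e.g., |a| or CR_MIN) over cycles whose
    truth has exceedance probability <= p_max (i.e., the extremes)."""
    p = exceedance_probability(truth)
    return np.asarray(metric)[p <= p_max].mean()
```

Binning the metric by ranges of exceedance probability, instead of averaging a single tail, yields the curves plotted against exceedance probability.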
Finally, Figure 13 presents mean ||K_k||_F and ||P_a,k||_F as a function of n_S. Compared to the results for σ²_z = 2000, σ²_z = 10 results in larger ||K_k||_F for both filters because larger weights are given to the observations. When σ²_z = 2000, the CBEnKF maintains a relatively large ||K_k||_F to account for CB, whereas the EnKF's ||K_k||_F is conspicuously small. Both ||K_k||_F and ||P_a,k||_F tend to be only weakly sensitive to the ensemble size n_S, except for the all-data case of the CBEnKF with σ²_z = 2000 (pink line). With uncertain observations (σ²_z = 2000), the CBEnKF's ||P_a,k||_F becomes large at extremes (pink dots) as well as for all data (pink line) at all n_S values used, reflecting CB in all states.

Conclusions
The error covariance and gain matrices of two CB-informed KFs, the CBPKF and the CBEnKF, are geometrically illustrated and compared with their KF equivalents [19] for a bi-state model using error vectors in the Euclidean space. The geometric illustration and analysis offer an intuitive understanding of the relationship between the two filters. Unlike the KF solution, the CBPKF solution is not orthogonal to its error, which renders its error covariances and gains larger than the KF's. These differences result in different confidence regions and principal error directions in the state space. Synthetic sensitivity experiments with the Lorenz 63 model showed that the CBEnKF solutions generally have smaller errors in the ensemble mean, larger eigenvalues of the error covariance matrix, more accurate confidence regions for encompassing the truth, and larger Frobenius norms of the error covariance and gain matrices than the EnKF. These differences are particularly pronounced when the observations are highly uncertain.
Future research recommendations include applying the CBPKF and the CBEnKF to diverse geophysical problems of estimating and predicting extremes, e.g., extreme precipitation or floods. The bi-state model was used in this work for a comparative geometric analysis of the CBPKF and the KF. Possible extension to an arbitrary number of states poses an interesting research topic.