Mean Field Bias-Aware State Updating via Variational Assimilation of Streamﬂow into Distributed Hydrologic Models

: When there exist catchment-wide biases in the distributed hydrologic model states, state updating based on streamflow assimilation at the catchment outlet tends to over- and under-adjust model states close to and away from the outlet, respectively. This is due to the greater sensitivity of the simulated outlet flow to the model states that are located more closely to the outlet in the hydraulic sense, and the subsequent overcompensation of the states in the more influential grid boxes to make up for the larger scale bias. In this work, we describe Mean Field Bias (MFB)-aware variational (VAR) assimilation, or MVAR, to address the above. MVAR performs bi-scale state updating of the distributed hydrologic model using streamflow observations in which MFB in the model states are first corrected at the catchment scale before the resulting states are adjusted at the grid box scale. We comparatively evaluate MVAR with conventional VAR based on streamflow assimilation into the distributed Sacramento Soil Moisture Accounting model for a headwater catchment. Compared to VAR, MVAR adjusts model states at remote cells by larger margins and reduces the Mean Squared Error of streamflow analysis by 2–8% at the outlet Tiff City, and by 1–10% at the interior location Lanagan.


Introduction
Streamflow observations are used extensively to update hydrologic model states via various forms of data assimilation (DA) [1][2][3][4][5][6][7][8][9][10][11][12]. Assimilating streamflow data into distributed models compared to that into lumped models presents additional challenges due to the greatly increased dimensionality of the inverse problem as elaborated below. In distributed modeling, changes in model states at a cell distant from a high-order stream exert a much smaller influence on the model-simulated flow at the catchment outlet compared to those at a cell near a high-order stream. This is because the runoff generated near a low-order stream, that is, in a headwater area, has to travel much longer distances to reach a high-order stream where channel flow takes over. Most of the above travel occurs on hillslopes where the flow is greatly attenuated. Unlike channel flow, hillslope flow occurs through numerous flow paths on the land surface and pore spaces in the subsurface. These flow paths vary greatly in size, length and shape, and have much larger friction factors than channels. Compared to channel flow, hillslope flow is hence subject to mechanical dispersion and is characterized by smaller ratios of advective to diffusive transport [13]. Recall in the advection-diffusion solution [14,15] that the width of a pulse inflow varies approximately with √ Dt where D and t denote the diffusivity and time elapsed, respectively. Larger D and t due to dispersion and longer travel time mean that hillslope flow is greatly attenuated and that sensitivity of changes in runoff generation to lateral inflow into channels Tension water contents (UZTWC and LZTWC) are related to soil moisture bounded to soil particles and free water contents (UZFWC, LZPFC, LZSFC) are related to soil gravitational water. Primary (LZPFC) and supplemental (LZSFC) free water contents in the LZ produce slow-or fast-responding baseflow, respectively. Tension and free soil moisture contents and the additional impervious area water content (ADIMC) interact with each other and generate baseflow, interflow, surface runoff, direct runoff, and impervious area runoff. The kinematic-wave routing model is then used to route the runoff through the hillslopes and channels based on cell-to-cell connectivity information created from the cell outlet tracing with an area threshold (COTAT) algorithm [31,37]. A priori parameters of the distributed SAC-SMA were estimated from the STATSGO2 soil data [38] and the routing parameters were from the digital elevation model (DEM) and channel hydraulic data [31]. This study used manually-optimized model parameters obtained from Phase 1 of the Distributed Model Intercomparison Project (DMIP, [39]). The soil moisture translation routine of the SAC-SMA with heat transfer dynamics (SAC-HT; [40]) is used to calculate depth-specific soil moisture contents from the original SAC-SMA model states.

MFB-Aware Variational Data Assimilation, MVAR
The following states the data assimilation problem solved in this study. Given the a priori model states, streamflow observations at either outlet or both outlet and interior locations, and observed precipitation (P) and monthly climatology of potential evapotranspiration (PE), update the distributed model soil moisture states at the beginning of the assimilation window and multiplicative biases in P and PE. The model may be applied as a weak constraint to this problem by considering model inadequacies [41][42][43]. MVAR first adjusts model states via multiplying MFB estimates and then individually updates at the HRAP scale. With enough length of the assimilation window, the MFB estimate effectively affects interior points through propagating outlet flow information to upstream pixels via the routing process, not just affected by high sensitivity area around the basin outlet pixel. The two-step approach in estimating MFB and HRAP cell-scale errors is to reduce the dimensionality or ill-posedness of the assimilation problem. Similar examples include dual state-parameter estimation [44] or data assimilation by field alignment to solve separately displacements and amplitudes [45]. The MFB estimation step can be formulated as the nonlinear constrained least-squares minimization problem in Equations (1) and (2). Minimize In Equation (1), is the vector composed of MFB for individual model states where j = 1, . . . , n S ; J K denotes the objective function value at hour K; X S denotes the state vector of the SAC-SMA model, or UZTWC, UZFWC, LZTWC, LZSFC, LZPFC, and ADIMC (Table 1); X S,K−L represents X S at the beginning (k = K − L) of the assimilation window, that is, background model states as driven by the base model simulation; X P and X E denote the multiplicative adjustment factors for the precipitation and Potential Evaporation (PE) data at k = K − L + 1, . . . , K, respectively, within the assimilation window; the subscript k denotes the time index; X min S,j,i and X max S,j,i denote the lower and upper bounds of the j-th state variable at the i-th grid, X S,j,i ; n S denotes the number of SAC states (n S = 6 in this study for both distributed and lumped SAC-SMA); n C denotes the number of HRAP cells within the basin; L denotes the length of the assimilation window-the duration of the unit hydrograph is used for L which represents the time scale of the fast response of the basin; M() denotes the SAC-SMA model; H B , H P , H E , and H Q () denote the observation operators that relate the control vector (X S,K−L , X P , X E , X W ) to observations (Z B , Z P , Z E , Z Q ) where, in this study, H B = I, H P = Z P , H E = Z E , and H Q () is the distributed SAC-SMA and kinematic-wave routing models; R B , R P , R E , R Q , and R W denote the measurement error covariance matrices for background model states, precipitation, PE, streamflow, and the model error, respectively; Z P , Z E , and Z Q denote the observations of precipitation, PE, and streamflow at k = K − L + 1, . . . , K, respectively; Equation (1) is based on the following observation equations: In the above, V B , V P , V E , V Q , and V W denote measurement error vectors for background model states, precipitation, PE, streamflow, and the rainfall-runoff model error. The Z W denotes (unknown) observations of an error in the rainfall-runoff model. Similar to Equation (1) in Beven [46], the rainfall-runoff model error X W may relate to runoff observations (Z R ) by Equation (8).
In Equation (8), V R denotes the measurement error vector for runoff and M R represents the rainfall-runoff model. Note in Equation (8) that, in order to model the error in a rainfall-runoff model, X W is added to the model-generated runoff M R -to reflect this into the model source code, the code can be modified by adding X W to the model generated runoff, set aside including the X W term in the objective function, Equation (1). Since control variables are the only variables adjusted or updated via DA, M R is presented as a function of X S,K−L , X P , and X E in Equation (8). Theoretically, however, M R should also be a function of model parameters, space-time resolutions, etc. In this respect, X W can be interpreted as an agglomerated error of all unresolved error sources propagated by fluxes through model dynamics to the point of generating runoff. If a systematic model bias exists in reproducing streamflow, this may be reflected in the long-term mean X W of a large positive or negative value. Equation (8) may be separately written for slow and fast responding runoff components, or collectively for the total channel inflow (TCI). Other important considerations include how to specify the space-time structure of X W and its error statistics. When solving the objective function J in Equation (1), X W is initially assumed zero, that is, no error in the rainfall-runoff transformation processes; during the minimization, X W is changed based on the gradient of J with respect to X W in a way to minimize J. Adding X W to the runoff is equivalent to adding uncertainties to the initial conditions of the routing model.
To model separately errors in fast and slow runoff components, X SURF W,k,i and X GRND W,k,i are added to the surface and ground runoff, respectively where X SURF W,k,i , and X GRND W,k,i represent the model error in surface and ground runoff at the k-th time step at the i-th HRAP grid, respectively. Since the statistical properties of observation errors in surface and ground runoff are unknown in reality, these observation errors are assumed non-informative which drops 1 2 [34]. Assuming that observational errors are independent of one another and time-invariant, observation error covariance matrices R in Equation (1) become diagonal and static (See [7,47] for justification). This renders Equation (1) rewritten as Equation (9). Minimize From minimizing Equation (9), X B = X B,j T is estimated where X B,j denotes the MFB estimate of the j-th model state. Using the estimated X B vector, the model state at the individual HRAP grid is updated by minimizing J K (X S,K−L , X P , X E , X W ), which is the same as Equation (9) except replacing In updating model states at the individual HRAP grid, the observation equation used for model states is the following: In Equation (11), Z B = X S,K−L = X T S,K−L X B and H B = I. The above assimilation problem was numerically solved with the Fletcher-Reeves-Polak-Ribiere minimization (FRPRMN) algorithm. MVAR as well as VAR described in Section 2.3 can handle nonlinear observation equations typically associated with assimilation of streamflow observations and provide full-rank solutions, as opposed to ensemble Kalman filter [17] which is optimal for linear observation equations only and provides reduced-rank solutions only. As a fix-lag smoother, both MVAR and VAR can readily account for the basin response time of fast runoff which is important to effectively capture the time-lagged effect of soil moisture to streamflow. Figure 1 presents a schematic of MVAR.

Conventional Variational Data Assimilation, VAR
Compared to MVAR, the conventional VAR lacks the MFB correction step and solves for model states at individual HRAP cells by minimizing Equation (12). Minimize

Conventional Variational Data Assimilation, VAR
Compared to MVAR, the conventional VAR lacks the MFB correction step and solves for model states at individual HRAP cells by minimizing Equation (12). Minimize subject to In Equation (12), observation equations for X P , X E , X W and streamflow are identical to Equations (4) to (7). The observation equation for model states X S can be written as Equation (14) in which H S = I and Z S denotes observations of model states at k = K − L, that is, the beginning of the assimilation window. Due to a lack of model state observations, Z S = X S,K−L is used.
With an assumption of observational errors being independent of one another and time-invariant, Equation (12) can be rewritten as Equation (15). Minimize subject to Same as MVAR, Equations (15) and (16) were solved by the FRPRMN algorithm. Table 2 compares the objective functions used in MVAR and VAR.

MFB-aware variational assimilation (MVAR)
Step 1. Adjustment of mean field bias in model states Objective function

Study Area and Evaluation Metrics
Sections 3.1 and 3.2 describe the study area and the evaluation metrics used, respectively.

Study Area
The study area used is the 2258 km 2 headwater basin that drains into the Elk River near Tiff City, Missouri. The runoff coefficient of this basin is 0. were selected for the assimilation experiments described in Section 4. Figure 2 shows the study basin where Tiff City and Lanagan drain 135 and 35 HRAP cells, respectively. Histograms in Figure 2 show the number of HRAP cells with the same distance to stream gauges where the distance is computed based on the cell-to-cell connectivity information. As a study area of the DMIP project, further details on this basin are found at Smith et al. [39].
Forecasting 2020, 2 FOR PEER REVIEW 9 Mean Squared Error (MSE) of streamflow and its decomposition into bias, variance, and covariance terms can be calculated with Equation (18).
where is the linear correlation coefficient between , and , ; , and , are means of , and , , respectively; , and , are standard deviations of , and , , respectively. The three components of MSE in Equation (18) help identify the source of MSE change.
To evaluate the performance of MVAR and VAR at stratified flow ranges, Type-I and Type-II Conditional Biases (CB) can be computed by Equations (19) and (20).
Type-I CB is conditioned on streamflow simulation , which tells falsely detecting nonexistent events or the quality of the model calibration. On the other hand, Type-II CB is conditioned on streamflow observation , which quantifies a failure of detecting existing events. Both CBs can be computed by slicing the scatter plots of , on an x-axis and , on a y-axis horizontally for

Evaluation Metrics
To quantify changes to model states posterior to the assimilation, the Mean Absolute Difference (MAD) between a priori and updated model states at time k is computed by Equation (17).
In Equation (17), X − S,j,i and X + S,j,i represent the a priori or updated, respectively, j-th state variable at the i-th grid. In the presence of MFB in model states, the MAD values from MVAR are expected to be larger than those from VAR at HRAP cells distant to stream gauges.
Mean Squared Error (MSE) of streamflow and its decomposition into bias, variance, and co-variance terms can be calculated with Equation (18).

535
where r is the linear correlation coefficient between Z Q,k and H Q,k ; µ Z Q,k and µ H Q,k are means of Z Q,k and H Q,k , respectively; σ Z Q,k and σ H Q,k are standard deviations of Z Q,k and H Q,k , respectively. The three components of MSE in Equation (18) help identify the source of MSE change.
To evaluate the performance of MVAR and VAR at stratified flow ranges, Type-I and Type-II Conditional Biases (CB) can be computed by Equations (19) and (20).
Type-I CB is conditioned on streamflow simulation H Q,k which tells falsely detecting non-existent events or the quality of the model calibration. On the other hand, Type-II CB is conditioned on streamflow observation Z Q,k which quantifies a failure of detecting existing events. Both CBs can be computed by slicing the scatter plots of Z Q,k on an x-axis and H Q,k on a y-axis horizontally for Type-I CB or vertically for Type-II CB into a number of intervals. Type-I and II CBs visualize performance changes from normal to extremes. Differences between Type-II CB and Type-I CB suggest the area of further enhancement.

Results
In this Section, an illustrative example of MVAR and VAR results from a single assimilation cycle is presented in Section 4.1, followed by the MAD of the two assimilation techniques in terms of model states or a soil moisture profile in Section 4.2. Section 4.3 describes the effect of weakly-constrained assimilation approaches in MVAR. Section 4.4 compares the performance of MVAR and VAR on streamflow.

Illustrative Example
The sensitivity of the objective function (J K ) to background model states (X S,j,i,K−L ) at the beginning of the assimilation window, or ∂J K ∂X S,j,i,K−L , tend to become lower with an increase of the distance to the stream gauges as shown in Figure 3 with the Normalized Mean Absolute Gradient (NMAG; Equation (21)) from all assimilation cycles.
In Equation (21), X S,j,i,K−L denotes the j-th model state at the i-th HRAP grid at k = K − L. In Figure 3, TWCs show higher sensitivity than FWCs. Despite the differences in the magnitude of gradients, all five soil moisture contents in UZ and LZ show a similar spatial pattern of NMAG, that is, NMAG decrease with an increase of the distance to stream gauges. Compared to UZ states, LZ states show slightly larger NMAG around the basin outlet, indicating a weak dynamic causality between outlet flow and deep soil moisture in distant cells. Compared to the case of outlet flow assimilation, additionally assimilating interior flow increases NMAG of UZ states at the scale of both a whole basin and the sub-basin Lanagan. However, the sensitivity of LZ states has reduced in terms of a basin mean, while the opposite is true in the case of Lanagan. ADIMC is overall little sensitive to the objective function at all HRAP cells.    Figure 4, MVAR adjusts states spatially uniformly to address the flow volume error at the early rising limb of Lanagan as well as Tiff City as shown in the middle panel. On the contrary, the conventional distributed state updating suffers from improving streamflow (middle panel) by adjusting individual cell states (top panel), and the change in states is overall negligible at distant HRAP cells. The changed amount in SAC states from all assimilation cycles will be described in the following section. The bottom panel of Figure 4 presents the states at the time of forecast, which show how differences in TWCs at the beginning of the assimilation window (k = K − L) yields differences in FWCs at the end of the assimilation window (k = K) as initial conditions for prediction.

Model States
Through the MFB correction, MVAR is expected to update states at distant cells more largely than VAR for all model states except ADIMC with its small gradient values everywhere (Figure 3). This is shown in Figure 5 as an increasing pattern of MAD(M)-MAD(V) with an increase of the distance to the basin outlet where MAD (Equation (17)) quantifies the mean absolute difference of a priori and updated states; MAD(M) and MAD(V) represent MAD from MVAR or VAR, respectively; LZSFC at k = K − L shows less of an increasing pattern than other soil moisture states possibly due to the objective function is less sensitive to LZSFC than the other states, as shown in Figure 3. In contrast, LZTWC with the largest sensitivity shows vividly an increasing pattern at both k = K − L and K, compared to the rest states. In Figure 5, MAD differences tend to decrease with an increase of NMAG particularly in the case of LZTWC, which reflects the overall decreasing pattern of NMAG with an increase of the distance (Figure 3). When assimilating interior flows in addition to outlet flows, NMAG mean values of UZ and LZ tension and free water contents for Lanagan are always larger than those for Tiff City (Figure 3). This is shown in Figure 5 as MAD differences are larger for HRAP cells within Lanagan (solid red dots) than the rest (open red circles) in the case of assimilating interior flow in addition to outlet flow. This suggests a potential need of addressing biases at a sub-basin scale, that is, adjusting Local Bias (LB) instead of MFB-left for future work. In Figure 6, soil moisture at 5, 25, 60, 75, and 100 cm depths translated from SAC states shows consistently an increasing pattern of the MAD difference with an increase of the distance to the stream gauge at both k = K − L and K-this supports the effectiveness of MVAR in updating soil moisture states at distant cells despite their little sensitivities to streamflow. Compared to a single scale DA used in Lee et al. [8], MVAR is able to noticeably update model states at distant cells (Figures 5 and 6) while allowing updating model states at individual cells separately.    The bottom panel of Figure 8 shows box plots of and at the scale of the entire basin Tiff City and the sub-basin Lanagan. At both Tiff City and Lanagan, MVAR produced notably smaller mean and value range of ( ○ ○ • • in Figure 8) than VAR ( ○ ○ • •) at both flow assimilation cases. In contrast, differences in are marginal except outliers at Tiff City ( ○ ○ ○ ○ at the top right box plot in Figure 8). Figures 7 and 8 suggest that, although MVAR generally reduces the effect of on model states by possibly reducing the degree of compensating for unresolved MFB by , improving model fidelity to physical realism may precede an assimilation procedure in order to pose the assimilation problem well-posed [34]. When assimilating interior flows in addition to outlet flow, and values within Lanagan (• and • at the

Model Structural Error
Differences between weakly-and strongly-constrained model results may reveal the reduced amount of model structural inadequacies owing to the MFB correction in MVAR when compared to corresponding VAR results.  Figure 8, which is the value averaged over the entire assimilation window (L) from all cycles (N − L + 1).    This may indicate the changing amount of model structural errors as well as field biases resolvable via additionally assimilating interior flows which may benefit from adjusting LB at the scale of Lanagan, as discussed in Section 4.2. Table 3 shows the time-averaged spatial correlation of background and updated model states from both MVAR and VAR. In Table 3, MVAR generally retains higher spatial correlations than VAR at most states at all assimilation cases at both k = K − L and K. When the model is applied as a weak-constraint, MVAR-generated LZTWC shows consistently lower spatial correlations than the VAR-generated. As discussed above, this is also an indication of a need to improve model physics prior to the assimilation in order to preserve the spatial correlation structure of background model states as driven by base model simulation [34]. Table 3. Time-averaged spatial correlation of background and updated model states. Values in the parenthesis represent correlation from VAR, or r V , otherwise from MVAR, or r M . Bold font highlights a larger value between r V and r M . Underscored is the pair showing the minimum or maximum value of r M − r V at either k = K − L or k = K. WC and SC represent the model applied as a weak-or strong-constraint to the assimilation problem, respectively.

Streamflow
MVAR improved outlet as well as interior flow over the conventional VAR at all assimilation cases in terms of MSE ( Figure 9). Compared to VAR, MVAR reduced further MSE of streamflow by 2-8% at Tiff City, and by 1-10% at Lanagan, depending on the data assimilated and the model applied as a weak-or strong-constraint to the assimilation problem ( Figure 9). Assimilating interior flow in addition to outlet flow is necessary to achieve considerable improvement in interior flows by VAR [8], which is further improved by MVAR. Compared to strongly-constrained assimilation results, weakly-constrained assimilation reduced MSE of streamflow by 2-4% at Tiff City and by 0.4-3% at Lanagan. MSE decomposition into bias, variance, and co-variance indicates that MVAR outperforms VAR by better modeling co-variabilities of streamflow observation and simulation at both Tiff City and Lanagan at the cost of increasing bias and variance terms at some assimilation cases. In the case of assimilating outlet flow only, all three MSE components for outlet flow are consistently reduced by MVAR more than VAR, implying the positive effect of addressing MFB in model states on reducing bias and modeling variance as well as co-variance in simulated and observed flow. Magnitude-dependent performance is examined based on Type-I and -II CBs. In Figure 10, Type-I CB is generally smaller than Type-II CB particularly at extremes which signifies the importance of addressing Type-II CB in estimation and prediction of extremes [23]. In the case of assimilating interior flows in addition to outlet flow, median-range flows of less than 150 m 3 /s at Lanagan are mostly improved, whereas heavy-to-extreme flows at Lanagan are degraded ( Figure 10). Visual examination of hydrographs at Lanagan indicated the notable amount of magnitude-dependent flow bias at some events which warrants future efforts with CB-penalized assimilation techniques such as CB-penalized Kalman Filter (CBPKF, [23]) or its ensemble extension, CB-penalized Ensemble Kalman Filter (CBEnKF, [9]).    (19)) and Type-II (Equation (20)) conditional biases in streamflow are shown as bar plots on the right-hand side or at the bottom, respectively, of the scatter plot.  (19)) and Type-II (Equation (20)) conditional biases in streamflow are shown as bar plots on the right-hand side or at the bottom, respectively, of the scatter plot.

Conclusions
The MFB-aware variational assimilation (MVAR) for the distributed SAC-SMA model was developed and comparatively evaluated with the conventional VAR based on its application to the headwater basin at Tiff City, Missouri. MVAR corrects MFB in model states and then update states at an individual HRAP grid. Compared to VAR [7,8,11,34], MVAR adjusts model states at remote cells by larger margins albeit their little sensitivities to streamflow. MVAR generally outperformed VAR in improving streamflow in both cases of outlet flow assimilation and interior as well as outlet flow assimilation. When the interior flow is assimilated in addition to outlet flow assimilation, improvement in interior flow is conspicuous. When the model is applied as a weak-constraint to the assimilation problem, MVAR generally less adjusted model states than VAR, implying the model error estimate possibly compensated for the MFB in model states in the case of VAR. The future work includes developing VAR with a capability to address spatially non-uniform biases in model states as well as magnitude-dependent biases in streamflow.

Funding:
The second and third authors were supported in part by the National Oceanic and Atmospheric Administration's Joint Technology Transfer Initiative Program (Grants NA16OAR4590232, NA17OAR4590174, and NA17OAR4590184). These supports are gratefully acknowledged.

Conflicts of Interest:
The authors declare no conflict of interest.