Integrating Remote Sensing Information Into A Distributed Hydrological Model for Improving Water Budget Predictions in Large-scale Basins through Data Assimilation

This paper investigates whether remote sensing evapotranspiration estimates can be integrated by means of data assimilation into a distributed hydrological model for improving the predictions of spatial water distribution over a large river basin with an area of 317,800 km2. A series of available MODIS satellite images over the Haihe River basin in China are used for the year 2005. Evapotranspiration is retrieved from these 1×1 km resolution images using the SEBS (Surface Energy Balance System) algorithm. The physically-based distributed model WEP-L (Water and Energy transfer Process in Large river basins) is used to compute the water balance of the Haihe River basin in the same year. Comparison between model-derived and remote sensing retrieval basin-averaged evapotranspiration estimates shows a good piecewise linear relationship, but their spatial distribution within the Haihe basin is different. The remote sensing derived evapotranspiration shows variability at finer scales. An extended Kalman filter (EKF) data assimilation algorithm, suitable for non-linear problems, is used. Assimilation results indicate that remote sensing observations have a potentially important role in providing spatial information to the assimilation system for the spatially optical hydrological parameterization of the model. This is especially important for large basins, such as the Haihe River basin in this study. Combining and integrating the capabilities of and information from model simulation and remote sensing techniques may provide the best spatial and temporal characteristics for hydrological states/fluxes, and would be both appealing and necessary for improving our knowledge of fundamental hydrological processes and for addressing important water resource management problems.

approaches at the catchment/river basin scale, and more case studies should be conducted to realize the operational potential of DA for a broad range of water resources management problems. Therefore, the work presented in this paper is motivated by the need to develop an improved data assimilation system that use remote sensed ET to improve the predictive performance of a distributed hydrological model in a large river basin.
Many studies in hydrological modeling of DA have begun to appear in recent years [e.g. [6][7][8][9][10][11][12][13][14][15][16][17][18][19][20][21][22], spurred by the success of DA in other fields. Most of these studies focus on the assimilation of soil moisture data in land surface models, and rely on synthetic datasets to assess the performance of the DA algorithms. Much less work has been published on assimilating remote sensed evapotranspiration (ET)/latent flux (LE) into hydrological models at the regional/basin scale. Schuurmans et al. [23] address the question of whether remotely sensed latent heat flux estimates from Surface Energy Balance Algorithm for Land (SEBAL) over a catchment can be used to improve distributed hydrological model computations using a constant gain Kalman filter data assimilation algorithm in the Drentse Aa catchment with an drainage area of 300 km 2 . Pan et al. [3] proposed and tested a data assimilation system that consisted of a combination of two filters -a particle filter (PF) [24][25] and an ensemble Kalman filter (EnKF) [26][27] to estimate the water budget using a MODIS based estimate of surface evapotranspiration (ET) over the spatial domain of Red-Arkansas river basin.

Overview
The scheme of the data assimilation system is sketched in Figure 1. In this paper a series of MODIS satellite images available for the Haihe basin for the year 2005 are used. Evapotraspiration is retrieved from these 1×1 km resolution images using the SEBS (Surface Energy Balance System) algorithm [28]. The physically-based distributed model WEP-L (Water and Energy transfer Process in Large river basins) [29] is used to compute the water balance of the Haihe River basin in the same year. An extended Kalman filter (EKF) data assimilation algorithm, suitable for non-linear problems, is used.
In order to attain the water budgets for water resources assessment and planning, the specific setup of the WEP-L model is made to provide an accurate estimate of the magnitude of the different components of the hydrologic cycle in large basins like Haihe River basin. ET is an important component for water budget analysis, because ET can be regard as the net consumption of water. Remote sensing algorithm (e.g. SEBS) can provide spatially distributed estimates of evapotranspiration, although continuous time series with time steps are difficult. It is expected that the WEP-L model can benefit by assimilating the spatial distributed ET estimates provided by the SEBS, and give a better understanding about how the availability of actual evapotranspiration varies both spatially and temporally. The physical models, remote sensing retrieval tool, data assimilation techniques and data sources are further discussed below.

Description of the WEP-L model
With the computational resources available today to most modelers, it has become feasible to build and apply highly complex distributed hydrological models that represent many different processes and consist of many model elements. The distributed hydrological model WEP-L was developed in a national key basic research project of China [29][30][31]. The WEP-L model is based on the WEP model [32][33][34] which has been successfully applied in several watersheds in Japan, Korean and China with different climate and geographic conditions [32,[34][35][36][37][38][39]. The WEP-L model adopts the contour bands as the calculation units to fit for large river basins and has been applied in the Yellow river basin in China. For details one is referred to Jia et al. [29][30][31].
The vertical structure of WEP-L within a contour band is shown in Figure 2(a), and the horizontal structure of WEP-L within a sub-watershed is shown in Figure 2(b). Land use is divided into five groups within a contour band, namely Soil Vegetation (SV) group, Non-irrigated farmland (NF) group, Irrigated Farmland (IF) group, Water Body (WB) group and Impervious Area (IA) group. The SV group is further classified into bare soil land, tall vegetation (forest or urban trees) and short vegetation (grassland). The IA group consists of impervious urban cover, urban canopy and rocky mountain. The areal average of water and heat fluxes from all land uses in a contour band produces the averaged fluxes in the contour band. For pervious groups of SV, NF and IF, nine vertical layers, namely an interception layer, a depression layer, three upper soil layers, a transition layer, an unconfined aquifer, an aquitard and a confined aquifer, are included in the model structure.  The simulated hydrological processes include snow melting, evapotranspiration, infiltration, surface runoff, subsurface runoff, groundwater flow, overland flow, river flow, and water use. The simulated energy transfer processes include short-wave radiation, long-wave radiation, latent heat flux, sensible heat flux, and soil heat flux. Adopted modeling approaches for hydrological and energy processes are referred to in Jia et al. [34](2001b) except snow melting and water use processes.
WEP-L integrates the merits of distributed hydrological models with those of SVATS (Soil-Vegetation-Atmosphere Transfer Schemes) model, couples the simulation of water cycle and energy processes, and calculates evapotranspiration from each land use separately. Evapotranspiration consists of interception of vegetation canopies (evaporation from the wet part of leaves), evaporation from water bodies, soil, urban cover and urban canopy and transpiration from the dry fraction of leaves with the source from the three upper soil layers. The evaporation from water bodies or ponded water in the depression storage is calculated with the Penman equation. The evaporation from the impervious area is taken as the smaller one of current depression storage and the potential evaporation. The computation of interception is referred to the Noilhan and Planton [40] model that is an interception reservoir method. The evaporation from soil is assumed to come only from the topsoil layer. The Penman equation is adopted to compute potential evaporation from which actual evaporation of soil is computed using a wetness function suggested by Lee and Pielke [41]. The actual transpiration is calculated using the Penman-Monteith equation [42] and the canopy resistance [40] which is related to the soil moisture condition. The average evapotranspiration in a contour band is obtained by areally averaging those from each land use.

Description of the SEBS algorithm
With the rapid development and increased application of remote sensing technology, evapotranspiration calculation methods using remote sensing techniques have become a major trend for hydrological research in recent years. The data obtained from visible, near-infrared and thermal band can reflect the spatial and temporal distribution of surface features, which have a great importance in simulating the energy balance. The SEBAL (Surface Energy Balance Algorithm for Land) proposed by Bastiaanssen et al. [43], which is based on land surface parameters acquired with remote sensing techniques. It is a semi-empirical model applied in calculating the evapotranspiration and the main difficulty is to determine the "hot" pixel and the "cold" pixel which has a great effect on the final results. Norman and Kustas [44] proposed the TSEB (Two-Source Energy Balance) algorithm to calculate the evaporation from bare soil and transpiration from vegetation separately based on remote sensing data.
The Surface Energy Balance System (SEBS) was developed by Su [28] to estimate the atmospheric turbulent fluxes and evaporative fraction using satellite earth observation data, in combination with meteorological information at proper scales. The system retrieves evapotranspiration (ET) using measurements of incoming surface radiation, surface skin temperature, surface meteorology, and surface and vegetation properties [45]. One of the advantages in this algorithm is applying both Bulk Atmospheric Similarity (BAS) and the Monin-Obukov atmospheric surface layer (ASL) similarity in the model, which can be used for regional and local scales respectively to determine the turbulent fluxes. Another important merit of SEBS is the inclusion of a physical model which takes surface heterogeneity into account in the estimation of the roughness height for heat transfer. The SEBS algorithm has been successfully applied for many applications of evaporation estimations in many different places with different scale [46][47][48][49][50][51].
For the purpose of this study, only basic equations for ET retrieval will be briefly described, full details are given by Su [28]. The surface energy balance is commonly written as Where n R is the net radiation, 0 G is the soil heat flux, H is the turbulent sensible heat flux, and E λ is the turbulent latent heat flux ( λ is the latent heat of vaporization and E is the actual evapotranspiration).
To determine the evaporative fraction (to be defined below), use is made of energy balance considerations at limiting cases. Under the dry-limit, the latent heat (or the evaporation) becomes zero due to the limitation of soil moisture and the sensible heat flux is at its maximum value.
The relative evaporation then can be evaluated as The evaporative fraction is finally given by: By inverting Eqn. (5), the actual latent heat flux E λ can be obtained.
The actual sensible heat flux H is determined with the bulk atmospheric similarity approach and is constrained in the range set by the sensible heat flux at the wet limit wet H , and that at the dry limit dry H . dry H is given by Eqn. (2) and wet H can be derived by a combination equation [52] similar to the Penman-Monteith combination equation with the assumption of a completely wet situation. Other details are given in Su [28].

The extended Kalman filter
Data assimilation techniques can be used to improve model performance either by optimizing model parameters or by correcting the state produced by the model, both through a variational or a sequential approach. For operational purposes the combination of parameters optimization and state estimation is most promising. Nevertheless, in this paper we restrain ourselves to state estimation via a sequential approach, which provides a general framework to explicitly incorporate input, model and remote sensing observation errors. Among sequential data assimilation techniques, the Kalman filter (KF) introduced by Kalman [53] and originally devised for linear models, is the most widely used method. In order to perform DA on non-linear models, several different implementations of the KF have been devised: the Extended version (EKF) proposed by Jaswinski [54], the Ensemble version (EnKF) proposed by Evensen [27] and the Singular Evolutive Extended version (SEEK) proposed by Pham et al. [55].
A significant benefit of the Kalman filter is that unmeasured process states may be estimated based on limited, noisy process measurements. In this paper, the EKF is used for nonlinear systems that are linearized around the current process state, which take into account the estimated errors on the model and on the observations to determine the correction to be applied to the states and calculates a new estimation of these errors every time step. Thus, the more accurate the observations the closer the a posteriori state of the model will be to them.
In order to use EKF to perform parameter estimation, parameters are treated as process variables with a rate of change=0. We can define the state vector    and observation vector z: The evolution of the process state, x & , is a function of the state, x , time, t , and system parameters, θ . w and v are subject to zero mean Gaussian white noise functions. Their error covariance matrices represent the errors: k Q for the uncertainties on the model and k R for the uncertainties on the observations. k Q represents the errors generated by the model during one time step, and does not account for the total uncertainties of the model, which also include the propagation of uncertainties from 1 − k to time k . Therefore, a matrix k P is defined to include the propagation of the former errors.
The heart of the Kalman filter is the gain matrix, K , which is calculated in two phases: the adjustment phase and the propagation phase.
During the first phase, the Kalman gain matrix k K is derived by the following formula: Here H is the Jacobian matrix of h with respect to the observation vector. Eqn. (8) indicates that, for each state, the correction factor will be applied due to each observation and its calculation takes into account the error matrices k R and k P .
Then the state vector is updated: the a posteriori state vector k x ) is equal to the a priori state vector − k x ) plus the difference between the observation and state vectors multiplied by the Kalman gain: The a posteriori error covariance matrix k P is also updated as follows: During the second phase, the new error covariance matrix 1 + k P is calculated by adding the error at time k k P propagated via the tangent linear and the error k Q generated during this time step, where F is the Jacobian matrix of f with respect to the state vector.
Calculation of the correction factor (Kalman gain) is an important step of the method: if this factor is too small, the assimilation procedure will have no effect, if it is too large, the model would forget its original evolution. The magnitude of the correction is dominated by the ratio of the errors on the observations and the model. If the uncertainties on the observations are very small (i.e.
the Kalman gain will be almost equal to 1 and the states will set to the observations. Alternatively, if the observations have a large uncertainty, the Kalman gain will be almost equal to 0 and the correction will be almost null. The above expressions show that, if k Q is assumed to be zero, the magnitude of the error covariance will decrease and close to a value of zero. This would be true, only if the model is a perfect representation of reality, and the observations are perfect. However, we know that these assumptions cannot be made. So a certain level of uncertainties must be retained in the data assimilation process. A methodology to determine the optimal value for k R and k Q should be the focus of a further study, but falls outside the scope of this paper.

Description of study area
The Haihe River basin is located between 35°~43°N and 112°~120E°. It neighbors the Inner Mongolian Plateau in the north, and the Yellow River is the borderline in the south. It faces the Bohai Sea to the east and borders Shanxi Plateau in the west. The Haihe basin belongs to the warm temperate zone with a semi-humid and semi-arid climate. The winters are dry and cold, with low rainfall in the spring and heavy rainfall in the summer. The average annual precipitation is 548 mm, about 80 percent of which falls during June to September. Its area is 317,800 square kilometers, of which 189,000 km 2 is mountainous and the remainder is plain, and divided into 15 level-3 water resources areas (WRA3) (

WEP-L application to the Haihe basin
The whole drainage basin of the Haihe basin in WEP-L model is divided into 3067 sub-watersheds ( Figure 5(a)), each of which is assigned with a Pfafstetter code [56]. Each sub-watershed in hilly and tableland areas is further divided into 1-10 contour bands, but no further division is performed for subwatersheds in plain areas because of little topographic effects, i.e., one sub-watershed is taken as one contour band. According to the contour band, the whole Haihe basin is further discretized into 11,752 hydrologic response units (HRU) (Figure 5(b)). The details of the basin subdivision and coding are described in Luo et al. [57].  Table 1 shows a list of the main collected basic data on which the WEP-L input data are based. The data include the following categories: (1) hydro-meteorology; (2) land cover information including land use, vegetation, soil and water conservation, crop patterns, etc.; (3) topography, soil and hydrogeology; and (4) river network; etc. For data preparation and processing, it is referred to Jia et al. [29].
Model parameterization and sensitivity analysis: There are four categories of parameters in the WEP-L model: (1) parameters of land surface and river channel system; (2) parameters of vegetation; (3) parameters of soil and aquifer; and (4) parameters of snow melting. All of parameters are initially estimated according to land cover information, observation data, and remote sensing data. An explanation of estimation about some main parameters of these four categories referred to Jia et al. [29]. The disturbance analysis, a simple common method of sensitivity analysis, is used to analyze the sensitivities of main parameters and input data of the WEP-L model on the annual averages of model outputs. This method is that, when the model computes, one of the system parameter has an exiguity of change, while the other parameters are kept unchanged. Then the ratio of output change rate to the parameter change rate can be got, called as the sensitivity. The sensitivity analysis results indicates that the high sensitive include maximum depression storage of land surface, maximum soil moisture content (soil porosity), conductivity of river bed materials as well as thickness of soil layers. The middle sensitive include the conductivity and storage coefficient of aquifers, root depth and saturated conductivity of soils, etc. The low sensitive include vegetation parameters, manning roughness of river and lateral section shapes of river, etc.

Model calibration and validation:
Sensitivity of parameters is analyzed, and then parameters with high sensitivity are calibrated on a basis of "try and error". 11 years (1990-2000) is selected as calibration period. The calibration parameters include maximum depression storage depth of land surface, soil saturated hydraulic conductivity, hydraulic conductivity of unconfined aquifer, permeability of riverbed material, Manning roughness, snow melting coefficient, and critical air temperature for snow melting. After the model calibration, all parameters are kept unchanged, continuous simulations from 1980 to 2000 are performed to verify the model by using observed monthly discharges at 23 main gage stations in the basin. Simulation results of model indicate that average errors of annually runoff are less than 10%, Nash-Sutcliffe efficiency of monthly runoff at main gage stations is over 60%, and correlation coefficients between simulated and observed monthly runoff exceed 80%. A validation example at four stations is shown in Figure 6. Nash-Sutcliffe efficiency and multi-yearly errors of simulated monthly runoff at Guangtai, Huangbizhuang, Chengde and Daiying station are 0.4 and 6.5%, 0.68 and 5.3%, 0.72 and -0.6%, 0.66 and -3.0%, respectively.

SEBS application to the Haihe basin
In order to use SEBS algorithms, primary inputs required in this study are basically two folds: (1) Meteorological variables, such as, wind speed, air temperature, surface pressure, humidity and solar radiation; and (2) Remote sensing physical parameters derived from MODIS (Moderate-resolution Imaging Spectroradiometer) data onboard the Terra platform. The relevant parameters for this study are given in Table 2. In this study the radiation, surface temperature, and surface vegetation properties are estimated from MODIS-Terra products [45,51] for a detailed description of the input sources.. The MODIS sensors have a temporal resolution of a spatial resolution of 1×1 km. It is chosen because of its short revisit period and therefore higher chances to obtain cloud free images in the Haihe basin. The frequency of the MODIS-Terra data availability is once a day if it is cloud free, and the time of the satellite passing over the study area is around 11:00 a.m. local time. During the year 2005, totally 20 observations were available for the study area. Data preparation and processing are referred to in Shan [58].
The meteorological datasets were collected from the Chinese National Meteorological Center (NMC), which includes the standard meteorological observations over 50 meteorological stations in the Haihe River Basin on daily basis. Among these meteorological observations, relative humidity, wind speed, air temperature are measured on a hourly basis and are recorded every 6 hours at 2:00, 8:00, 14:00 and 20:00, while precipitation and sunshine hours are stored as daily values. A linear interpolation model is used to obtain the meteorological items at the satellite over passing time. The data is interpolated by ILWIS software to obtain the spatial meteorological items for the whole Haihe river basin.
SEBS first computes the net radiation flux using MODIS-L1B products and then estimates the ground heat flux following the empirical approach of Monteith [42] and Kustas and Daughtry [59]. Remote sensed estimates of surface thermal infrared emissivity and the reflectance of red and near infrared bands are used to compute spatial variations in reflected short-wave and emitted long-wave radiation away from the land surface. A combination of the short and long-wave radiation yields the possibility to compute the net radiation absorbed at the surface for every pixel. It is a key step to compute the sensible heat flux using turbulent heat equations by considering both the effective roughness height and aerodynamic resistance of the surface layer as well as the atmospheric stability of the boundary layer. For the latter, SEBS solves the similarity stability equations on momentum and potential temperature [45]. The latent heat flux, or equivalently the evapotranspiration, is then taken as the residual of the surface energy budget found by substracting the ground heat flux and sensible heat flux from the net radiation. Figure 7

Results and discussion of DA
The boundaries of the HRUs in WEP-L often differ from the boundaries of the satellite grids. In this study, pixel-to-pixel SEBS estimates are 1×1 km grid mode, whereas WEP-L predictions are given based on the 11752 HRUs. SEBS estimates are aggregated into 11752 HRUs (see Figure 7(b)) in order to make the data series matching for data assimilation. Although some spatial variability information could be missed in the process of data format conversion, SEBS ET distribution remains more spatially variable than model derived ET distribution. To update the WEP-L model simulated evapotranspiration, the model predicted actual evapotranspiration for the day with satellite coverage is used. Some obvious spatial differences between the simulated ET and remote sensing ET are found in Figs. 7(b) and 10 (a). This confirms the gap between the WEP-L model and SEBS retrievals. Although there is no definitive evidence to verify that the SEBS algorithm can provide better ET estimation than the WEP-L model, some studies on WEP-L show that its performance can be impacted due to the interferences of water diversions and reservoir regulations. This impact would be more obvious in the Haihe River basin, because over than 90% of water resources have been exploited in this region. ET simulation would be highly relative to water use information. So it would be difficult to attain the precise daily ET value with good spatial characteristics due to the difficulty to acquire the detailed water use process data. This would suggest that the assimilation of SEBS ET has the potential to improve the ET estimation.
Different assimilation techniques existing correct the inputs, the internal states or the outputs of the models. The assimilation method used in our approach is a sequential method: the output states (ET) of the model are corrected using a weight-adaptive optimal interpolation algorithm based on the extended Kalman filter mentioned above. This methodology consists of locally correcting the value of the actual evapotranspiration of the WEP-L model when an observation is available. Figure 8 shows a time series plot of simulated ET with and without DA in a point HRU No. 610 (Sub-basin No.327, Contour band No.05). At the time step k , an observation is available. The difference between the observed (SEBS) and a priori simulated (WEP-L) values is partially corrected and an a posteriori value of the state, closer to the observed value is obtained. Then the WEP-L model continues and evolves freely until new observations are available. In order to avoid the water balance errors when correcting the states produced by the model, the assimilated ET would be constrained equal to the available water in the model (force mode) if the attained ET is still higher than the available water in the model after data assimilation.

SEBS ET WEP-L ET Assimilated ET
A comparison between the area averaged daily actual evapotranspiration in the case of available MODIS spectral data by the WEP-L model and the SEBS algorithm is given in Figure 9. As can be seen from the figure, there is a good correlation (R2=0.60) between the daily ET estimated by SEBS and WEP-L. Although the basin averaged daily evapotranspiration show a good piecewise linear relationship between SEBS retrieval and WEP-L estimates in the basin level, their spatial distribution within the Haihe basin is different. Analyzing the SEBS and WEP-L results for 2005 (e.g., Figure 7(a) and Figure 10(a)), it is found that the remote sensing results contain more information about the spatial variability of the evapotranspiration estimates. Therefore, it is expected that above data assimilation system could lead to an improvement of the spatial water balance analysis using the WEP-L model.  Table 3 summarizes the results of the SEBS, WEP-L and assimilation respectively when averaged at the basin scale for the available day. Their spatial man, standard deviation and RMSE are given in this table. This RMSE between the observations and simulations is calculated as: ( 1 θ θ (12) t n is the number of data points, o θ the observed (SEBS) actual evapotranspiration, and s θ the simulated actual evapotranspiration. The RMSE can offer a measure of the "potential" from data assimilation. If the RMSE simulation with DA is lower than that of simulation without DA, this would means that the assimilation impact is positive. By the comparison of the RMSE of WEP-L derived actual evapotranspiration with and without DA, the conclusion is that data assimilation in this study leads to a positive impact on the WEP-L model. However, one must keep in mind that this is not sufficient for a definitive conclusion. This is because the SEBS derived actual evapotranspiration also contains errors.  show an example about comparison of ETa distribution in September 17, 2005 as calculated by the WEP-L model without and with data assimilation. As can be seen in these figures, the assimilation procedure results contain more spatial variability than the WEP-L derived results without data assimilation. Figure 11(a) and (b) show the spatially distributed difference in annual evapotranspiration as calculated by the WEP-L model without and with data assimilation at the HRU and WRA3 level, respectively. Negative values in Fig11a and b mean that the actual evapotranspiration calculated by the model with data assimilation is lower than the actual evapotranspiration calculated without data assimilation. Figure 11(a) implies that SEBS estimates with limited daily remote sensing images yet clear spatial patterns appear in the assimilated ET distribution at the HRU level. The spatial differences of 11752 HRUs range from -48 to 32mm, and the largest relative difference can reach 16%. Figure 11(b) indicates that at the WRA3 level, the DA technique does not bring an obvious change to the yearly ET. Besides the WRA3-4 and WRA3-5, the spatial differences of most WRA3s range from -10 to 10mm, and the largest relative difference is less than ±1.5%. The spatial differences at the WRA3-4 and WRA3-5 are 16.6 and 13.6mm, and the relative differences are 4.2% and 3.7%, respectively. Above analysis show that the assimilated ET does not differ much from the WEP-L ET at the WRA3 level, whereas the assimilated system makes more obvious spatial difference at the HRU-level. It is suggested that data assimilation system could at least remain its original accuracies of the WEP-L model at the higher scale, whereas more detailed description for spatial water balance can be given at the lower scale. Can we interpret these results? One possible reason is that the lack of detail in the database used to parameterize the hydrological states/fluxes is very likely to contribute to the incompatibility of the spatially distributed hydrological parameterization of the WEP-L model. Because although the whole basin is divided into 11752 hydrological response units in the model formulation, some parameters are given at a higher scale due to limited spatial information. This means that remote sensing observations have a potentially important role in providing spatial information to the assimilation system for the spatially optical hydrological parameterization of the model. Figure 11. Spatially distributed difference in yearly evapotranspiration [mm] as calculated by the WEP-L model without and with data assimilation at the HRU level and WRA3 level, respectively: (a) HRU-level; and (b) WRA3 level. Negative values mean that the actual evapotranspiration calculated by the model with data assimilation is lower than that without data assimilation.

Conclusion and remarks
The paper has demonstrated the potential of remote sensing observations for updating the spatial water balance of distributed hydrological model. We have used an extended Kalman filter (EKF) as our data assimilation algorithm to handle this type of spatial information in the WEP-L model. By means of this data assimilation technique, estimates of daily evapotranspiration derived from MODIS-L1B products using the SEBS algorithm are combined with a distributed hydrological model (WEP-L) to understand the spatial distribution of water balance in a large basin with an area of 317,800 km 2 in China. Assimilation results indicate that remote sensing observations have a potentially important role in providing spatial information to the assimilation system for the spatially optical hydrological parameterization of the model. This is especially important for large basins, such as the Haihe River basin in this study.
Both WEP-L and SEBS used in this study have robust physical mechanisms and their respective features. The WEP-L model is based on both water and energy balance and can provide temporally continuous hydrological simulation, whereas the SEBS algorithm is based on surface energy balance and can give more spatially variable information on surface energy fluxes. Assimilation of a combination of distributed hydrological model with available remote sensing data may provide the best spatial and temporal characteristics for hydrological states/fluxes. Therefore, combining and integrating the capabilities of and information from model simulation and remote sensing techniques would be both appealing and necessary for improving our knowledge of fundamental hydrological processes and for addressing important water resource management problems.
In the data assimilation experiment reported in this study, the benefits on the prediction precision of the model by assimilating MODIS-based ET need be further investigated, although RMSE results show data assimilation have a positive impact on the model. Unlike "synthetic" data assimilation experiments, there are no assumed "true" hydrologic states or fluxes for the DA system in this study. So it is difficult to evaluate to what extent the WEP-L prediction accuracy for the water balance can be improved by the integration of MODIS-based ET. So a key question, should be further investigated, is whether an energy balance model (such as SEBS) can provide better ET estimates than a distributed hydrological model (e.g. WEP-L).
A second key issue is whether more reliable spatial information can be achieved from remote sensing observations. For example, in this study only 20 cloud free MODIS images in the year 2005 can be attained for evapotranspiration retrieval. It remains challenging to retrieve energy fluxes under cloud cover. Another challenge is that the performance of the assimilation system for the combination of parameters optimization and state estimation should be further tested for the purpose of operational application.