This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

More and more terrestrial observational networks are being established to monitor climatic, hydrological and land-use changes in different regions of the world. In these networks, time series of states and fluxes are recorded in an automated manner, often with a high temporal resolution. These data are important for the understanding of water, energy, and/or matter fluxes, as well as their biological and physical drivers and interactions with and within the terrestrial system. Similarly, the number and accuracy of variables that can be observed by spaceborne sensors are increasing. Data assimilation (DA) methods utilize these observations in terrestrial models in order to increase process knowledge as well as to improve forecasts for the system being studied. The widely implemented automation in observing environmental states and fluxes makes an operational computation more and more feasible, and it opens the perspective of short-term forecasts of the state of terrestrial systems. In this paper, we review the state of the art with respect to DA focusing on the joint assimilation of observational data from different spatial scales and different data types. An introduction is given to different DA methods, such as the Ensemble Kalman Filter (EnKF), Particle Filter (PF) and variational methods (3/4D-VAR). In this review, we distinguish between four major DA approaches: (1) univariate single-scale DA (UVSS), which is the approach used in the majority of published DA applications, (2) univariate multiscale DA (UVMS) referring to a methodology which acknowledges that at least some of the assimilated data are measured at a different scale than the computational grid scale, (3) multivariate single-scale DA (MVSS) dealing with the assimilation of at least two different data types, and (4) combined multivariate multiscale DA (MVMS). Finally, we conclude with a discussion on the advantages and disadvantages of the assimilation of multiple data types in a simulation model. Existing approaches can be used to simultaneously update several model states and model parameters if applicable. In other words, the basic principles for multivariate data assimilation are already available. We argue that a better understanding of the measurement errors for different observation types, improved estimates of observation bias and improved multiscale assimilation methods for data which scale nonlinearly are important for weighting the observations properly in multiscale multivariate data assimilation. In this context, improved cross-validation of different data types, and increased ground truth verification of remote sensing products are required.

The basic idea behind data assimilation (DA) is to combine complementary information from measurements and models of the Earth system and thus optimally estimate geophysical fields of interest [

In the context of climate change and land-use change, more and more terrestrial observational networks are being established to monitor states and fluxes in an effort to understand water, energy, or matter fluxes, as well as their biological and physical drivers and interactions with and within the terrestrial system. Examples of these networks include the global FLUXNET [

The potential of these multiple data sets as well as their combination is often not fully exploited. DA, which is defined as the updating of modeled state variables (and possibly also other model components like parameters and forcings) using externally obtained data sets, has been applied in the Earth sciences for decades. DA techniques, such as the Ensemble Kalman Filter [

The objective of this paper is to review the state of the art of multivariate and multiscale DA techniques in terrestrial systems, to detect current limitations for the use of multivariate and multiscale DA and to provide guidance for further methodological developments and potential areas of application.

In this section, we present a brief introduction to three prominent DA techniques, namely the Ensemble Kalman Filter (EnKF), the Particle Filter (PF) and variational methods (VAR). EnKF and PF are Bayesian-based approaches, whereas VAR uses the minimization of a cost function [in principle, EnKF can also be derived from a cost function minimization under the hypotheses of a linear model and Gaussian probability density functions (pdfs)]. The algorithms, or their derivatives, are widely used in environmental modeling. The general principle of operation will be clarified, which is important for the understanding of multivariate and multiscale DA techniques. The methods discussed here are well suited for parallel computation, since they make use of ensemble members. This is also true for VAR methods when applied in an ensemble approach [

DA algorithms based on recursive Bayesian estimation techniques first emerged with the Kalman filter [

Variational DA (VAR) is a very successful technique for operational numerical weather prediction because it can be efficiently used in realistic, complex systems. It was introduced in a three-dimensional form (3D-VAR) by Parrish and Derber [

In the application of the EnKF, the system state at time step k − 1 (x_{k−1}) is propagated to time step k as follows:

x_{k} = f_{k,k−1}(x_{k−1}) + w_{k−1}

f_{k,k−1}(.) is a nonlinear operator representing the model in state space, including the model parameters and the meteorological forcings. w_{k−1} is the process noise. This is a white-noise term with zero mean and covariance matrix Q_{k−1}, and it summarizes all the uncertainties caused by the model formulation, the forcing data, and the model parameters. The system is observed as follows:

y_{k} = h_{k}(x_{k}) + v_{k}

h_{k}(.) is a nonlinear function, relating the state variables to the observations. v_{k} is the observation noise, which is a white-noise term with zero mean and covariance matrix R_{k}. It should be noted that for all time steps w_{k} and v_{k} are independent.

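Instead of propagating one single model realization, the EnKF propagates an ensemble of model realizations. The spread in the ensemble at each time step is an estimate of the uncertainty in the model results. The a priori (before the update) state variables of a single ensemble member i are stored in the vector x_{k}^{i−}, where the superscript ^{−} indicates an a priori estimate. This vector is obtained by propagating each ensemble member i:

x_{k}^{i−} = f_{k,k−1}(x_{k−1}^{i+}) + w_{k−1}^{i}

The superscript ^{+} indicates an a posteriori estimate (after the update). The a priori error covariance P_{k}^{−} is estimated from the ensemble:

P_{k}^{−} = 1/(N − 1) Σ_{i=1}^{N} (x_{k}^{i−} − x̄_{k}^{−})(x_{k}^{i−} − x̄_{k}^{−})^{T}

x̄_{k}^{−} is the ensemble mean, N is the number of ensemble members and the superscript ^{T} indicates the transpose operator. The Kalman gain K_{k} is then calculated as:

K_{k} = P_{k}^{−} H_{k}^{T} (H_{k} P_{k}^{−} H_{k}^{T} + R_{k})^{−1}

H_{k} is the Jacobian of the observation operator h_{k}. In order to bypass the need to linearize the observation system, the products P_{k}^{−}H_{k}^{T} and H_{k}P_{k}^{−}H_{k}^{T} can be estimated directly from the ensemble of simulated observations h_{k}(x_{k}^{i−}) instead of using H_{k} for the calculation of the Kalman gain. If N is larger than the number of observations, the rank of the matrix that needs to be inverted is always the same as the number of observations that are used to update the system. Using the Kalman gain, the states of the individual ensemble members are then updated:

x_{k}^{i+} = x_{k}^{i−} + K_{k}(y_{k} + v_{k}^{i} − h_{k}(x_{k}^{i−}))

v_{k}^{i} is a random realization of the observation noise drawn for ensemble member i.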
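The forecast and update steps described above can be illustrated with a minimal sketch for a scalar state. It assumes a perturbed-observation EnKF in which the covariance terms of the Kalman gain are estimated from the ensemble of simulated observations, so that the observation operator never has to be linearized. The function name `enkf_update`, the soil moisture numbers and the identity observation operator are illustrative assumptions and are not taken from any of the reviewed studies.

```python
import random
import statistics


def enkf_update(prior, simulate_obs, y_obs, obs_var, rng):
    """Update a scalar-state ensemble with one observation (perturbed-observation EnKF).

    prior        : list of a priori states, one per ensemble member
    simulate_obs : observation operator h(x), may be nonlinear
    y_obs        : the measured value
    obs_var      : observation error variance R
    """
    n = len(prior)
    hx = [simulate_obs(x) for x in prior]
    x_mean = statistics.mean(prior)
    hx_mean = statistics.mean(hx)
    # Ensemble estimates of P H^T and H P H^T (no Jacobian of h is needed).
    cov_x_hx = sum((x - x_mean) * (h - hx_mean) for x, h in zip(prior, hx)) / (n - 1)
    var_hx = sum((h - hx_mean) ** 2 for h in hx) / (n - 1)
    gain = cov_x_hx / (var_hx + obs_var)
    # Each member is compared with its own perturbed copy of the observation.
    return [x + gain * (y_obs + rng.gauss(0.0, obs_var ** 0.5) - h)
            for x, h in zip(prior, hx)]


rng = random.Random(42)
prior = [rng.gauss(0.20, 0.05) for _ in range(200)]  # hypothetical soil moisture ensemble
posterior = enkf_update(prior, lambda x: x, y_obs=0.30, obs_var=0.02 ** 2, rng=rng)
print(statistics.mean(prior), statistics.mean(posterior))
print(statistics.stdev(prior), statistics.stdev(posterior))
```

In this toy setting the posterior ensemble mean moves towards the observation and the ensemble spread shrinks, which is the qualitative behavior expected from the update equation.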

The EnKF is a sophisticated sequential DA method [

A variant of the Ensemble Kalman Filter is the Ensemble Kalman Smoother (EnKS) [

A further variant related to EnKF and EnKS is the Ensemble Smoother (ES) [

EnKF became popular in several fields of research for updating model states with the aid of the sequential assimilation of measurements. Examples of remote sensing DA include the updating of soil moisture contents [

Particle filters [

The PF uses the same system description as the EnKF (

In the prediction step, the prior pdf p(x_{k}|y_{1:k−1}) is obtained based on the fact that the transition pdf p(x_{k}|x_{k−1}) and the prior pdf at time step k − 1 are known, whereas in the correction step, the observation at time step k is taken into account through the likelihood p(y_{k}|x_{k}), and the posterior pdf p(x_{k}|y_{1:k}) is derived. The analytical solution to

The selection of the proposal pdf

This pdf is optimal in the sense that it minimizes the variance of the importance weights. However, the application of

A drawback of this approach is the lack of information regarding the model errors in the computation of the importance weights. This limitation can affect the performance of the particle filter. The choice of the transition prior as the proposal simplifies

The denominator in

As shown in

The resampling step is an essential part of the PF methodology and is necessary to improve the efficiency of PF. Often the Sampling Importance Resampling PF is used (SIR-PF) [
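The weighting and resampling steps can be sketched as follows. The sketch assumes that the transition prior is used as the proposal, so that each weight is simply multiplied by a Gaussian likelihood, and it uses systematic resampling as one common scheme; the reviewed studies may use other resampling variants. All function names and numbers are illustrative assumptions.

```python
import math
import random


def gaussian_likelihood(y_obs, y_sim, obs_var):
    """p(y | x) for a scalar observation with Gaussian error."""
    return math.exp(-0.5 * (y_obs - y_sim) ** 2 / obs_var) / math.sqrt(2.0 * math.pi * obs_var)


def update_weights(weights, particles, y_obs, obs_var, simulate_obs):
    """Multiply the previous weights by the likelihood and normalize them."""
    raw = [w * gaussian_likelihood(y_obs, simulate_obs(x), obs_var)
           for w, x in zip(weights, particles)]
    total = sum(raw)
    return [r / total for r in raw]


def effective_sample_size(weights):
    """Small values indicate that few particles carry most of the weight."""
    return 1.0 / sum(w * w for w in weights)


def systematic_resample(particles, weights, rng):
    """Duplicate heavy particles and drop light ones; the weights become uniform."""
    n = len(particles)
    start = rng.random() / n
    resampled, cumulative, j = [], weights[0], 0
    for i in range(n):
        u = start + i / n
        while u > cumulative and j < n - 1:
            j += 1
            cumulative += weights[j]
        resampled.append(particles[j])
    return resampled, [1.0 / n] * n


rng = random.Random(0)
particles = [rng.uniform(0.0, 1.0) for _ in range(500)]
weights = [1.0 / 500] * 500
weights = update_weights(weights, particles, y_obs=0.7, obs_var=0.05 ** 2,
                         simulate_obs=lambda x: x)
ess_before = effective_sample_size(weights)
particles, weights = systematic_resample(particles, weights, rng)
print(ess_before, sum(particles) / len(particles))
```

A frequent design choice is to resample only when the effective sample size drops below a threshold, which limits the loss of particle diversity.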

PF has been applied for parameter estimation in rainfall-runoff modeling [

In the examples mentioned above, the models used had a limited number of states and only a few parameters were estimated. This is the main limitation of the PF. A very large number of particles is needed for adequate sampling of a high-dimensional state space. Resampling only partially alleviates this fundamental problem, and it is unclear whether the introduction of MCMC in the context of the PF could be CPU-efficient in the future. Neither improved estimates of the proposal distribution with Gaussian approximations nor the use of future measurement data yielded a breakthrough in this respect. Therefore, although the PF is one of the important alternatives to the EnKF, it currently needs an excessive amount of CPU time and its implementation in combination with large simulation models is not feasible. Application in high-performance computing and on parallelized architectures may help to overcome this problem. Whereas drawing samples in the state space and computing proper importance weights of each sample can be performed in parallel using a separate node for each particle, standard resampling techniques require strong interaction between nodes. However, resampling techniques specifically designed for parallel computation have recently been proposed [

In variational DA, the state vector x_{k−n} is calculated, which minimizes a cost function. This cost function is calculated from time step k − n through k. n is the number of time steps in the assimilation window and is chosen by the user. In the cost function, all observations between time steps k − n and k are taken into account. Contrary to the Kalman filter, variational assimilation does not directly provide an estimate of the error in the state estimate. We consider the following nonlinear system:

x_{k+1} = g_{k+1,k}(x_{k})

g_{k+1,k} is a nonlinear function relating the state at time step k to the state at time step k + 1. The system is observed as follows:

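y_{k} = h_{k}(x_{k})

h_{k} is a nonlinear function relating the state at time step k to the observations at time step k. Note that variational DA considers not only measurements at time step k but also those at former time steps (from k − n to k − 1). The cost function J(x_{k−n}) over the interval of n time steps is the following:

J(x_{k−n}) = (x_{k−n} − x_{k−n}^{b})^{T} P_{k−n}^{−1} (x_{k−n} − x_{k−n}^{b}) + Σ_{i=k−n}^{k} (y_{i} − h_{i}(x_{i}))^{T} R_{i}^{−1} (y_{i} − h_{i}(x_{i}))

x_{k−n}^{b} is the background (first-guess) estimate of the state at the start of the window and P_{k−n} is its error covariance, which acts as a weight factor. R_{i} is the uncertainty in the observations. If this is the identity matrix, the second term is equal to the sum of squared differences between the observations and the model simulations. If R_{i} is not equal to the identity matrix, it can again be considered a weight factor. The first term in the cost function is called the background error J_{b}, and the second term is the observation error J_{o}.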

The objective of variational assimilation is the retrieval of the state x_{k−n} which minimizes this cost function. This can be achieved through optimization methods such as the Newton-Raphson method or the adjoint method. In the latter, the difference between y_{i} and h_{i}(x_{i}) for all time steps i between k and k − n is back-propagated in order to find the gradient of the cost function, which is then used to find the value for x_{k−n} for which the cost function is minimal.
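The back-propagation of the model-observation differences can be illustrated with a small sketch. It assumes a scalar linear model x_{i+1} = a x_{i}, an identity observation operator, and a cost function made up of a background term weighted by the inverse background variance plus observation terms weighted by the inverse observation variance. The constants, the function names and the plain gradient-descent minimizer are illustrative assumptions; operational systems use far more elaborate minimization schemes.

```python
def forward(x0, a, n):
    """Propagate x_{i+1} = a * x_i over the window; returns [x_0, ..., x_n]."""
    states = [x0]
    for _ in range(n):
        states.append(a * states[-1])
    return states


def cost(x0, xb, p_var, obs, r_var, a):
    """Background term J_b plus observation term J_o."""
    states = forward(x0, a, len(obs) - 1)
    j_b = (x0 - xb) ** 2 / p_var
    j_o = sum((y - x) ** 2 / r_var for y, x in zip(obs, states))
    return j_b + j_o


def gradient_adjoint(x0, xb, p_var, obs, r_var, a):
    """dJ/dx0 from one forward run and one backward (adjoint) sweep."""
    states = forward(x0, a, len(obs) - 1)
    lam = 0.0
    # Walk from the end of the window back to its start, accumulating the
    # weighted misfits through the transpose of the (linear) model.
    for y, x in zip(reversed(obs), reversed(states)):
        lam = a * lam - 2.0 * (y - x) / r_var
    return 2.0 * (x0 - xb) / p_var + lam


a, xb, p_var, r_var = 0.9, 1.0, 1.0, 0.1
obs = forward(2.0, a, 5)  # noise-free synthetic observations from a "true" x0 = 2.0
x0 = xb
for _ in range(200):
    x0 -= 0.01 * gradient_adjoint(x0, xb, p_var, obs, r_var, a)
print(x0, cost(xb, xb, p_var, obs, r_var, a), cost(x0, xb, p_var, obs, r_var, a))
```

The estimate ends up between the background value and the value that generated the observations, with the balance set by the ratio of the two variances.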

Both model predictions and observations provide actual and important information on environmental state variables. Similar to EnKF and PF, VAR methods combine both information sources. They do not explicitly evaluate the large error covariance matrices which are propagated by Kalman Filters, but they simultaneously process the data within a given time period and implicitly take dynamic error information into account by propagating an adjoint variable [

An analysis of the literature on DA showed that four major approaches exist based on the number of states that are being assimilated and their corresponding scales: (1) univariate single-scale DA (UVSS), (2) univariate multiscale DA (UVMS), (3) multivariate single-scale DA (MVSS), and (4) multivariate multiscale DA (MVMS). In the subsequent section, we will briefly present and define these approaches giving specific examples for each of them. In addition, the special case of multisource DA will be defined.

Most publications about DA applications deal with the assimilation of a single data type (“univariate”), for which it is assumed that the scale at which it is measured coincides with the computational grid scale (“single-scale”). We define these approaches as univariate single-scale DA (UVSS). It is important to realize that although the measurement scale generally does not coincide with the computational grid scale, the scale mismatch is often not very large and is therefore neglected in DA studies. Typically, support scales of observed environmental states are relatively small (e.g., a few cm^{3} to dm^{3}) and are often several orders of magnitude smaller than the model grid scale. If these observations are assimilated into a model with a grid size of tens of meters, the difference in the spatial scale is significant. In typical UVSS DA schemes, the observations are assumed to represent the average value of the state within the model pixel without using an appropriate data scaling technique. For example, in a small catchment, a soil moisture sensor network has been installed with several sensors in vertical and horizontal directions. Usually, the soil volume measured by a sensor is just a few cm^{3}. These observations are then assimilated into a 3D hydrological model with a spatial grid of 1 m^{3} in order to update the modeled soil moisture. Here, the spatial heterogeneity within one model grid element is neglected and it is assumed that the observation at the level of a few cm^{3} is valid for the whole grid element of 1 m^{3}. As the scale discrepancy was neglected and no scaling technique was applied, this example refers to UVSS DA. UVSS DA is not further discussed in this paper. However, reviews can be found, e.g., in Evensen [

MVSS DA refers to the simultaneous assimilation of observation data for multiple model state variables into a simulation model. In these studies, the measurement data are either of the same scale as the computational grid, or, more commonly, scale disparities are neglected. The availability of simultaneous multiobservation pairs is an important characteristic. For example, leaf area index (LAI) and surface temperature can be obtained on the same spatial scale at the same moment by the MODIS satellite. Both data types can be assimilated in a multivariate and single-scale manner. The assimilation of remotely-sensed soil moisture and soil temperature is another example of MVSS DA. However, although the same assimilation moment is not mandatory for MVSS DA, it is important that the assimilation takes place in a certain time window. Assimilating soil moisture data from a microwave satellite, which overpasses an area under investigation at 06:00, and soil temperature data from a multispectral/thermal sensor, which overpasses at 10:00, into an hourly hydrological model would still require multivariate DA. This problem is usually solved by an augmented state vector, which is an important characteristic of MVSS DA. This updates only that part of the augmented state vector for which a corresponding observation is available. In contrast, calibrating a model by soil moisture DA in the first year and updating soil temperature by DA in the second year would not necessarily be characterized as multivariate DA, as we have defined it if the basic state vector is used.

If the DA framework can update both states and parameters, time series measurements of model parameters can also be assimilated. Therefore, our definition of MVSS DA is: (i) the assimilation of measurements for at least two model state variables, or (ii) at least one state variable in combination with at least one model parameter, or (iii) at least two different model parameters, at least one of which has the form of time series. For example, in several hydrologic models, LAI is a model parameter and soil moisture is a state variable. Both are time series products made available by satellite remote sensing. In a state-parameter estimation framework, where LAI is a parameter to be estimated, multivariate DA can be performed by updating both the soil moisture state and the LAI parameter.

UVMS DA refers to the assimilation of external data obtained at a significantly different resolution than the model resolution and the application of a scaling technique. In multiscale DA, a technique is required to consider statistical parameters on all scales, such as observation and model error noise variances. Examples are the assimilation of coarse-scale soil moisture contents or snow water equivalents (which are disaggregated to the fine scale) into a fine spatial scale hydrologic model. The application of a scaling technique is mandatory to distinguish between the multiscale and single-scale DA applications described above.

Multiscale definitions given in other publications, which are not in line with the UVMS DA characteristics presented in this review, should however also be taken into account. We define UVMS DA as (i) the assimilation of a certain data type measured at a scale that is different to the computational grid scale, where DA explicitly takes into account this scale mismatch, or (ii) the assimilation of a certain data type measured at two or more different spatial scales, where DA explicitly takes into account that measurements were made at different scales. In the literature, other definitions can be found. For example, Lu

MVMS DA refers to the complex combination of multivariate and multiscale DA techniques as defined above.

In principle, in MVSS and MVMS DA, the state variables are updated by data sets from different sensors. Moreover, in most cases the multiscale issue in UVMS and MVMS is addressed by different sensors. However, a special case is imaginable, where one state variable is updated at a single scale by two different data sets obtained from different means of observation. In such a case, we recommend introducing an explicitly multisource UVSS DA. Multisource UVSS DA involves the assimilation of equal-scale soil moisture products from specific radar on two different satellites, or two different radar types on one single satellite. The advantage of such a multisource UVSS DA application is that different observation errors can be overcome,

An example would be the assimilation of soil moisture products from the ERS 1 and ERS 2 satellites during their tandem phase. Two soil moisture products recorded at roughly the same time and at the same spatial resolution are provided. Such a simultaneous assimilation would not involve the DA procedures defined previously,

The multiscale problem has been addressed by several different approaches because a wide range of natural processes have multiscale properties in space and/or time [

Both the PF and the EnKF are excellent algorithms that assimilate data obtained at a certain spatial resolution into models that operate at a different resolution. This can be performed in two ways. The first approach is to use the observation operator [

_{k}(.) [

If the EnKF is used, the impact of the different weights can be assessed by examining the update equations. Let us assume that the hydrologic model is column-based, which means that the model results of all modeled pixels are independent of each other. This is a common feature of many hydrologic models, such as the widely used Community Land Model [

Under these conditions, h_{k}(.) is a linear function (a linearly weighted average is obtained in order to simulate the large-scale observation), and can be written as the H_{k} vector (the vector containing all the weights). The denominator in the Kalman gain equation is a single variable. Since this denominator is a scalar, the Kalman gain is proportional to the product of the a priori error covariance and H_{k}^{T}, so that the update of each pixel scales with its entry in H_{k}, the vector of weights. In other words, pixels with a higher weight will receive a larger update than pixels with a lower weight. A limiting case may occur when the weight of a certain pixel is zero, which implies that its value does not contribute to the large-scale signal. In this case, the requirement of observability of the system is violated and the pixel should be left out of the analysis.
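The dependence of the update on the averaging weights can be checked numerically. The following sketch assumes four independent fine-scale pixels inside one coarse observation footprint, identical prior spread in every pixel, and hypothetical weights; the function `pixel_gains` is an illustrative helper and is not taken from any of the reviewed studies.

```python
import random


def mean(values):
    return sum(values) / len(values)


def pixel_gains(ensemble, weights, obs_var):
    """Kalman gain per fine-scale pixel for one coarse-scale observation.

    ensemble : list of members, each a list of pixel states
    weights  : averaging weights (the row vector H_k), one per pixel
    """
    n = len(ensemble)
    hx = [sum(w * x for w, x in zip(weights, member)) for member in ensemble]
    hx_mean = mean(hx)
    var_hx = sum((h - hx_mean) ** 2 for h in hx) / (n - 1)
    denominator = var_hx + obs_var  # a scalar: H P H^T + R
    gains = []
    for j in range(len(weights)):
        xj = [member[j] for member in ensemble]
        xj_mean = mean(xj)
        cov = sum((x - xj_mean) * (h - hx_mean) for x, h in zip(xj, hx)) / (n - 1)
        gains.append(cov / denominator)
    return gains


rng = random.Random(1)
weights = [0.5, 0.3, 0.2, 0.0]  # hypothetical footprint weights
ensemble = [[rng.gauss(0.25, 0.05) for _ in weights] for _ in range(5000)]
gains = pixel_gains(ensemble, weights, obs_var=0.02 ** 2)
print([round(g, 3) for g in gains])
```

With independent pixels, the gain of the zero-weight pixel differs from zero only through sampling noise in the ensemble covariance, which illustrates why such a pixel is effectively unobservable.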

The impact of the averaging weight is fundamentally different when a PF is applied. The large-scale observation is simulated in exactly the same way as for the EnKF. However, in this case, one single particle, which is one member of the ensemble of model realizations, contains the states (and possibly parameters) at all the modeled pixels. Each particle possesses its own weight. The adaptation of the weight depends on the deviation of the simulated large-scale observation from the actual observation. The particles could then be resampled. However, in this case, the modeled state variables for a certain particle are simply duplicated. In other words, there is no differential update in contrast to the EnKF.

VAR deals with this problem in another way. An initial state vector is eventually retrieved that minimizes the cost function. Since the pixels located near the center of the large-scale grid have the highest weight, their state estimate will match the truth better than the pixels further away from the center. The difference compared to the EnKF is that these results are not obtained through an update, but through a minimization of a cost function. The same reasoning can be applied when an entire profile needs to be updated instead of one single layer. The difference is that the matrix H_{k} will contain zeros for all state variables that are not in the uppermost layer of the profile.

A different approach involves downscaling the observations to the spatial resolution of the model before the observations are assimilated. In this case, the dimension of y_{k} is the same as the number of pixels inside the large-scale grid, and H_{k} is the identity matrix. The downscaled observations can then directly be assimilated into the fine-scale model. In some applications, only one layer might be observed (e.g., from remote sensing), but multilayer systems need to be updated. In such cases, H_{k} again contains zero values for all model variables that are not located in the uppermost layer.

A drawback of this methodology is the need for a downscaling algorithm and the quantification of the measurement uncertainty on the fine scale. On the other hand, the advantage is the straightforward application of the DA algorithms, especially when multiple data sets at different spatial resolutions need to be assimilated.

This is also true for the assimilation of one state variable obtained on two or more spatial scales into terrestrial models. An example would be the assimilation of fused top-soil-moisture products obtained by active (relatively higher spatial resolution) and passive (relatively lower spatial resolution) microwave methods. In addition to the prior individual downscaling of both data sets to the model resolution, prior data fusion could also be feasible. A huge range of methods have been published in relation to satellite image fusion [

Several studies have been published that assimilate one state variable observed at a specific spatial resolution into a model with another spatial resolution. As a simple method of transferring the spatial differences in soil moisture observations from ASCAT (∼25 km) and AMSR-E (∼38 km) to the model grid of 25 km, Draper

Parada and Liang [

Frakt and Willsky [

Another synthetic experiment was performed by Hill

In order to meet the problem of assimilating two or more observation data sets at dissimilar spatial scales, these observations are often fused prior to the assimilation. For example, in a synthetic study in the Arkansas-Red River Basin, Dunne

Wang

The EnKF, and variants of it, were used more often than PF and variational methods for multivariate DA in hydrology. For EnKF, the different observation types are all grouped together in the vector y_{k}. Therefore, compared to the univariate case, the vector y_{k} is “extended” to include different types of measurements. The state vector x_{k} also includes different types of state variables, and possibly parameters as well. These different variables and parameters are included in the state vector in the form of blocks: a first block for the first state variable, then additional blocks for additional state variables, and finally blocks for the different parameters. The covariance matrix is therefore also extended and includes cross-covariances between different state variables, or cross-covariances between a state variable and a parameter. The update equation for EnKF, which is valid both for univariate and multivariate DA, is:

We will now look at the example of two state variables, which were modeled and observed. Moreover, two distributed parameter fields were calibrated, and observations were available for one of the parameters. The augmented state vector, observation vector and covariance matrix are now composed of the following blocks:
x_{1} corresponds to the first state variable, x_{2} to the second state variable, p_{1} to the first parameter and p_{2} to the second parameter. It should be noted that the dimensions of the blocks normally differ between the augmented state vector and the observation vector, as not each modeled state is observed. The Kalman gain [see also
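The role of the cross-covariances in an augmented state vector can be illustrated with a minimal sketch in which each ensemble member holds one state and one parameter, and only the state is observed. The toy model (the state responds linearly to the parameter), the numbers and the function names are assumptions made for illustration only.

```python
import random


def covariance(a, b):
    n = len(a)
    a_mean, b_mean = sum(a) / n, sum(b) / n
    return sum((x - a_mean) * (y - b_mean) for x, y in zip(a, b)) / (n - 1)


def update_augmented(members, y_obs, obs_var, rng):
    """members: list of [state, parameter]; only the state (index 0) is observed."""
    hx = [m[0] for m in members]  # the observation operator picks the state block
    denominator = covariance(hx, hx) + obs_var
    # One gain per block: state variance and state-parameter cross-covariance.
    gains = [covariance([m[j] for m in members], hx) / denominator for j in range(2)]
    updated = []
    for m, h in zip(members, hx):
        innovation = y_obs + rng.gauss(0.0, obs_var ** 0.5) - h
        updated.append([m[j] + gains[j] * innovation for j in range(2)])
    return updated


rng = random.Random(7)
forcing = 10.0
true_parameter = 0.6
params = [rng.gauss(0.4, 0.1) for _ in range(500)]  # biased prior guess of the parameter
# Toy model (an assumption): state = parameter * forcing + small model noise.
members = [[p * forcing + rng.gauss(0.0, 0.1), p] for p in params]
updated = update_augmented(members, y_obs=true_parameter * forcing,
                           obs_var=0.2 ** 2, rng=rng)
print(sum(m[1] for m in members) / 500, sum(m[1] for m in updated) / 500)
```

Although the parameter is never observed in this sketch, its ensemble mean moves towards the value used to generate the observation, because the update is carried by the cross-covariance between parameter and state.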

Variational methods have also been used in several cases for multivariate DA in hydrology. Multivariate DA with variational methods is performed by evaluating the second term on the right-hand side of the cost function, which involves the matrix R_{i}. This matrix contains the estimated measurement error variances for each of the observations. The different types of observations are associated with different uncertainties, which appear as the diagonal elements of R_{i}. R_{i} weights the influence of the different observations when the simulated model values are updated with the observations. The correcting influence of the observations also depends on the values in R_{k} compared to the covariance matrix for the background errors, P_{k}. The inclusion of additional observations, and the comparison of the measured values with the simulated ones, according to

The PF has not yet been used frequently for multivariate DA in hydrology. Different data sources can be relatively easily included as conditioning information in the particle filter, which can be understood by inspecting the likelihood function. The likelihood for the multivariate case is obtained by comparing the different measurement data with their simulated equivalents, and weighting each of the residuals with the measurement variance. For the univariate case, the probability of the observations in the modeled state was given by Moradkhani _{p}

For the definition of R_{k} and H_{k}, see Section 2.1 on EnKF. In these expressions, it was assumed that all observations have the same measurement error variance. For the multivariate case, the expression for the likelihood of the observations becomes:

We see in this expression that the measurement error variances are in matrix notation, which acknowledges that different measurement types will be associated with different uncertainties. Measurement errors for different observations can also be correlated in space, as could be the case for remote sensing data. The uncertainty of the different (types of) observations affects the weighting of the particles.
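How different measurement error variances enter the particle weights can be sketched as follows. The sketch assumes a Gaussian likelihood with a diagonal error covariance matrix (uncorrelated errors), which is a simplification of the general matrix form; the two observation types, their values and the function names are hypothetical.

```python
import math


def log_likelihood(observed, simulated, variances):
    """Gaussian log-likelihood with a diagonal R (independent errors, an assumption)."""
    return sum(-0.5 * (y - s) ** 2 / v - 0.5 * math.log(2.0 * math.pi * v)
               for y, s, v in zip(observed, simulated, variances))


def normalized_weights(observed, simulated_per_particle, variances):
    logs = [log_likelihood(observed, sim, variances) for sim in simulated_per_particle]
    top = max(logs)  # subtract the maximum to avoid numerical underflow
    raw = [math.exp(value - top) for value in logs]
    total = sum(raw)
    return [r / total for r in raw]


# Two observation types: soil moisture (m3/m3) and surface temperature (K).
observed = [0.30, 290.0]
variances = [0.02 ** 2, 1.5 ** 2]
particles = [[0.29, 293.0],   # good moisture, poor temperature
             [0.36, 290.2],   # poor moisture, good temperature
             [0.31, 290.5]]   # reasonable for both
weights = normalized_weights(observed, particles, variances)
print([round(w, 3) for w in weights])
```

Because each residual is scaled by its own variance, a mismatch of a few kelvin and a mismatch of a few hundredths in soil moisture become comparable, and the particle that fits both data types reasonably well receives the largest weight.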

Although multivariate DA seems like a relatively straightforward extension of univariate DA, most studies in terrestrial systems assimilate only one data type. The complication of MVSS DA is not so much of an algorithmic nature, but is related to the specification of the measurement uncertainty for all data types involved. If different data types are assimilated, the correct weighting of the different pieces of information becomes very important for the efficiency of the procedure. The following discussion of the papers that deal with MVSS DA is organized according to the application area, focusing on developments during the last decade and on EnKF, PF and VAR.

In groundwater hydrology, sequential DA focused from the outset on jointly updating states and parameters by assimilating piezometric head data using an augmented state vector approach. The work of Chen and Zhang was among the first in this area [

There has been a recent increase in papers concerned with DA for partial differential equation-based coupled surface-subsurface models. Crow and van Loon [

Some authors used more conceptual hydrological models for assimilation focusing on the reproduction of river discharge. In an early publication, Seo

The applications of DA in vadose zone hydrology are traditionally concerned with the assimilation of remote sensing data. In these applications, normally only states are updated, while for soil hydraulic parameter calibration, inverse methods are used. DA for vadose zone hydrology has a strong link with land surface hydrology and we distinguish here between assimilation experiments for single soil columns (vadose zone hydrology) and studies for larger areas with distributed land surface models (land surface hydrology). Although Walker

One option for improving the prediction quality of larger-scale land surface models is the joint assimilation of soil moisture and surface temperature data. Barrett and Renzullo [

DA might violate the mass balance because at each assimilation time step, mass might be removed or injected into the system. Pan and Wood [

In the area of snow hydrology, multivariate DA has also been explored as a method of improving the characterization of snow pack and runoff estimates during snow melt. Durand and Margulis [

Studies presenting a complex combination of multiscale (Section 4) and multivariate (Section 5) DA techniques are discussed in the following. Most of the MVMS DA studies deal with the assimilation of snow data and the assimilation of soil moisture and surface temperature. Durand

Balsamo _{2} fluxes were also reduced by about 5% with this joint scheme.

Compared to UVSS DA techniques, multivariate and multiscale assimilation (MVSS, UVMS, MVMS) allow additional data containing information about the states to be updated and quantities of interest (e.g., fluxes) to be modeled. Advantages have been outlined in several studies [

UVMS DA allows observations with different “support” scales to be integrated into mathematical models. However, this necessitates the availability of upscaling and downscaling approaches. One problem associated with UVMS DA is the assimilation of data measured at a certain scale into a model with a different grid resolution. It has been shown that assimilation is straightforward. For methods like EnKF, it can be handled by the observation operator, while PF can assimilate directly. Simulated values can be compared with measured values in a relatively straightforward manner: if for example the measurement comprises multiple computation grid cells, the observation operator can handle an equal weighting of all the measurement grid cells, while also taking account of an unequal weighting of the grid cells (as was the case in the SMOS example in Section 4.1). However, the problem is more complicated if a property scales nonlinearly. This is the case for example for brightness temperature measured by microwave sensors, which is nonlinearly related to soil moisture. When the relation between brightness temperature and soil moisture is applied to the larger-scale grid, we expect the soil moisture value to be different to the value we would receive if the brightness temperature was available for all smaller computational grid cells and if the conversion from brightness temperature to soil moisture was calculated for each of these grid cells. For properties that scale linearly, we expect the direct application of the observation operator (in the case of EnKF) to give the best results. For properties that scale nonlinearly, alternative strategies such as prior downscaling are promising. A systematic comparison of methods solving assimilation problems for properties that scale nonlinearly is still lacking. 
Here, more insight is needed, which could be obtained from synthetic studies that mimic real-world conditions as closely as possible or from real-world studies with sound verification. A second multiscale DA problem arises when measurements are available at multiple scales, and more experience is required with the assimilation of such measurements. In theory, this is easy for problems that scale linearly, but in practice the data may conflict. Studies should not focus solely on the optimal fusion of data, as the bias correction of the observation data is also important [
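
The two cases distinguished above (linear and nonlinear scaling) can be illustrated with a minimal sketch. It is not taken from any of the reviewed studies: the grid size, the weight pattern and in particular the function `toy_forward` are hypothetical stand-ins (not a radiative transfer model), chosen only to show that averaging and a nonlinear transformation do not commute.

```python
import numpy as np


def coarse_observation(theta, weights):
    """Map a fine-scale field to one coarse-scale value (weighted mean)."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize so that the weights sum to one
    return float(np.sum(w * theta))


def toy_forward(theta):
    """Hypothetical nonlinear relation between soil moisture and a signal.

    Illustrative assumption only; this is not a radiative transfer model.
    """
    return 280.0 - 150.0 * theta ** 2


rng = np.random.default_rng(0)
theta = rng.uniform(0.05, 0.45, size=(4, 4))  # fine-scale soil moisture

# Equal weighting of all fine grid cells inside the coarse footprint
equal_w = np.ones_like(theta)

# Unequal weighting that decays away from the footprint centre (assumed shape)
ii, jj = np.indices(theta.shape)
antenna_w = np.exp(-((ii - 1.5) ** 2 + (jj - 1.5) ** 2) / 2.0)

mean_equal = coarse_observation(theta, equal_w)
mean_antenna = coarse_observation(theta, antenna_w)

# Nonlinear scaling: transforming each cell and then averaging differs from
# averaging first and then transforming the coarse-scale mean.
signal_of_mean = toy_forward(mean_equal)
mean_of_signal = coarse_observation(toy_forward(theta), equal_w)
scaling_gap = mean_of_signal - signal_of_mean

print(f"equal-weight mean:    {mean_equal:.4f}")
print(f"antenna-weight mean:  {mean_antenna:.4f}")
print(f"nonlinear scaling gap: {scaling_gap:.4f}")
```

For this quadratic toy relation the gap equals −150 times the sub-grid variance of the fine-scale field, so it vanishes only for a homogeneous footprint. Other nonlinear relations would give a different gap, but the same qualitative effect.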

The relative weights are not relevant if an observation operator is used. Its scaling performance depends on the complexity of the observation operator. Take the example outlined in Section 4.1.1, where the antenna weighting of SMOS was used to calculate the magnitude of the update for different pixels. This approach takes additional information about the measurement system into account. It is therefore more sophisticated than a statistical scaling, such as averaging the fine-scale states to the coarse scale. An even more complex scaling observation operator than the one in this example is the method published by Merlin

If the additional data also represent a state variable of the main model, they can be assimilated into the model (

Although many different data types are available to constrain hydrological model parameterization and prediction, relatively few DA papers deal with MVSS DA. In groundwater hydrology, vadose zone hydrology and rainfall-runoff modeling, sequential DA is a relatively novel approach, and the number of papers has increased only in the last five years. Very few papers in these fields are therefore concerned with MVSS DA. The situation is different for land surface hydrology, where sequential DA was first used more than a decade ago. This is also the area with the larger number of papers on multivariate DA. However, it also lacks papers dealing with the assimilation of many different data types. In land surface hydrology, the main topic has been the assimilation of soil moisture data from satellites. This involves serious complications: biased data (e.g., due to vegetation or interference), limited vertical penetration depth, nonlinear observation operators and scale mismatch. In addition, land surface hydrological models often use strongly simplified concepts of vegetation and the hydrological cycle. In such models, the assimilation of LAI or FPAR data is not possible (they are parameters rather than states), and the assimilation of river discharge data or groundwater data is difficult because of the strong model simplifications.

On the other hand, synthetic experiments have shown that multivariate DA is generally superior to univariate DA. Considering the developments of recent years, we expect that an increasing number of applications will further investigate the benefits of multivariate DA. We also expect that it will be more successful in real-world applications in land surface hydrology if it is combined with better models (e.g., models that allow lateral flows in the subsurface) and coupled models (e.g., in combination with a crop growth model to assimilate vegetation data). Improving our understanding of remotely sensed data, such as indirect estimates of soil moisture, is also essential for increasing the use of MVSS DA (especially in land surface models). In Section 5.1, we showed that a key element in MVSS DA is the relative weighting of the different data types. Therefore, the uncertainty of the measured data must be better understood, and increased ground truth verification is required under different conditions. The equations in Section 5.1 also showed that the impact of the data is greater if measurement errors are smaller. Again, especially for remotely sensed data, a better understanding of the relation between what is measured (e.g., brightness temperature) and what we want to know (

Multivariate multiscale DA is the most complicated form of DA. The aspects mentioned for multiscale DA (UVMS) and for multivariate DA (MVSS) both apply to this type of assimilation. Uncertainty assessment and bias correction of measurement data, as well as appropriate multiscale methods to handle nonlinearly scaling states and/or parameters, are all important. Although the combination of complications makes MVMS DA more problematic, we do not believe that it involves additional theoretical complications compared to MVSS or UVMS DA alone.

To date, only simple aggregation or disaggregation methods have been used to match the spatial resolution between observations and model states. More advanced downscaling methods, such as the downscaling of coarse passive soil moisture products with high-resolution remote sensing data [

A complication of multivariate and/or multiscale DA where parameters are also updated is that the parameter estimates cannot be directly verified in the field. Therefore, only indirect verification of calibrated parameter values is possible. This can be done by comparing uncalibrated and calibrated parameter values in independent model prediction experiments. If errors are significantly smaller for runs with calibrated parameter values (compared to uncalibrated values), then this would indicate that parameter estimation helped to improve model parameterization. Nevertheless, it is also possible that improved predictions with calibrated parameters are related to the fact that the updated parameter values compensate for another model structural error.
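
The indirect verification described above amounts to comparing prediction errors over a period that was not used for DA. A minimal sketch of such a comparison is given below; the helper `rmse` and all numbers are hypothetical and were invented purely for illustration.

```python
import numpy as np


def rmse(predicted, observed):
    """Root mean square error between predictions and observations."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean((predicted - observed) ** 2)))


# Synthetic verification data for an independent period (invented values)
observed = np.array([0.21, 0.25, 0.30, 0.28, 0.24])
pred_uncalibrated = np.array([0.15, 0.20, 0.22, 0.21, 0.18])
pred_calibrated = np.array([0.20, 0.26, 0.28, 0.29, 0.23])

rmse_uncal = rmse(pred_uncalibrated, observed)
rmse_cal = rmse(pred_calibrated, observed)

# Relative error reduction achieved with the calibrated parameter values
improvement = 1.0 - rmse_cal / rmse_uncal

print(f"RMSE uncalibrated: {rmse_uncal:.4f}")
print(f"RMSE calibrated:   {rmse_cal:.4f}")
print(f"relative improvement: {improvement:.1%}")
```

A smaller error for the calibrated run is consistent with an improved parameterization, but, as noted above, it cannot rule out that the updated parameters merely compensate for a structural model error.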

Running high-resolution models with many unknown states (and parameters) in ensemble mode is very CPU-intensive, and large amounts of stored data need to be managed. Small ensemble sizes give suboptimal results, so a certain minimum ensemble size is needed. High-performance computing is therefore an essential part of the DA methodology, even more so for MVMS DA. For some very CPU-intensive applications, such as land surface modeling, this may, to some extent, have prevented a more widespread use of these techniques until now.
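
One reason why small ensembles give suboptimal results is the sampling error in the ensemble covariances on which ensemble-based updates rely. The sketch below illustrates this for a hypothetical two-variable system with a known covariance of 0.5; the function name, the ensemble sizes and the number of trials are assumptions made for illustration, and the minimum ensemble size of a real application depends on the problem.

```python
import numpy as np


def mean_covariance_error(n_members, n_trials=200, seed=0):
    """Average absolute error of the ensemble estimate of a covariance.

    Ensembles are drawn from a bivariate normal distribution with unit
    variances and a true covariance of 0.5 (assumed toy setting).
    """
    rng = np.random.default_rng(seed)
    true_cov = np.array([[1.0, 0.5], [0.5, 1.0]])
    errors = []
    for _ in range(n_trials):
        ens = rng.multivariate_normal(np.zeros(2), true_cov, size=n_members)
        sample_cov = np.cov(ens, rowvar=False)  # rows are ensemble members
        errors.append(abs(sample_cov[0, 1] - true_cov[0, 1]))
    return float(np.mean(errors))


errors_by_size = {n: mean_covariance_error(n) for n in (10, 50, 250)}

for n, err in errors_by_size.items():
    print(f"ensemble size {n:4d}: mean covariance error {err:.3f}")
```

The error decreases only slowly with ensemble size, which is one way to see why ensemble DA with high-resolution models quickly becomes computationally demanding.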

In the context of climate change, several activities have been established for the long-term monitoring of environmental conditions. In general,

In this paper, we reviewed the state of the art of DA utilizing observational data from different spatial scales and different sources. We summarized three prominent DA methods: the ensemble Kalman filter (EnKF), the particle filter (PF) and variational methods (VAR). We identified four major classes of assimilation studies:

Univariate single-scale DA (UVSS, not discussed here as review papers exist in the various fields of research).

Univariate multiscale DA (UVMS). This refers to the assimilation of external data obtained at a different resolution than the model resolution. Examples are the assimilation of coarse-scale soil moisture contents or snow water equivalents into a hydrologic model, which is applied at a fine spatial scale.

Multivariate single-scale DA (MVSS). This refers to the assimilation of data for multiple variables (for example, surface temperature and soil moisture contents) into a simulation model.

Multivariate multiscale DA (MVMS). This refers to a complex combination of UVMS and MVSS.

We discussed several studies aiming to assimilate observations into models at dissimilar spatial scales. Some applications used the observation operator to align the spatial reference, while others applied downscaling prior to assimilation. If several observations of the same variable with different spatial resolutions were used, the majority of studies fused these data first and then assimilated the single combined data set in a second step. This approach may underrepresent both the variability and the accuracy of the state variable.

MVSS DA can be handled by EnKF, PF, VAR and other DA techniques. The appropriate determination of the measurement error variances for the different data types plays a crucial role. MVSS DA has been applied in several studies, but there are still few applications using real-world data or several (more than two) types of data. Most studies concluded that the results obtained when different types of data are assimilated are better than those obtained with UVSS DA. Examples include the joint assimilation of soil moisture and LAI, the joint assimilation of river discharge and soil moisture, and the joint assimilation of surface temperature and brightness temperature. We argued that model deficiencies and an incomplete understanding of the relation between the measurement and the variable of interest have hampered a more extensive use of multivariate DA in the past. To ensure the successful application of multivariate DA in the future, these points and an improved understanding of the magnitude of measurement errors under different environmental conditions are important.
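
The role of the measurement error variances in weighting different data types can be made concrete with a small linear Kalman update. The sketch below uses standard Kalman filter notation (`x_f`, `P_f`, `H`, `R`), which may differ from the notation in Section 5.1; the two generic data types and all numerical values are assumptions chosen for illustration.

```python
import numpy as np


def kalman_update(x_f, P_f, y, H, R):
    """Linear Kalman analysis step for forecast x_f with covariance P_f."""
    S = H @ P_f @ H.T + R                 # innovation covariance
    # K = P_f H^T S^-1, computed via a linear solve (S and P_f are symmetric)
    K = np.linalg.solve(S, H @ P_f).T
    x_a = x_f + K @ (y - H @ x_f)         # analysis state
    P_a = (np.eye(len(x_f)) - K @ H) @ P_f  # analysis covariance
    return x_a, P_a, K


# Two correlated state variables, each observed directly by one data type
x_f = np.array([0.0, 0.0])
P_f = np.array([[1.0, 0.5],
                [0.5, 1.0]])
H = np.eye(2)
y = np.array([1.0, 1.0])

# Case A: both data types have the same error variance
R_a = np.diag([1.0, 1.0])
# Case B: data type 1 is ten times more precise than in case A
R_b = np.diag([0.1, 1.0])

x_a_A, P_a_A, K_a = kalman_update(x_f, P_f, y, H, R_a)
x_a_B, P_a_B, K_b = kalman_update(x_f, P_f, y, H, R_b)

print("gain, equal errors:\n", K_a)
print("gain, data type 1 more precise:\n", K_b)
```

With the smaller error variance, the gain element linking data type 1 to the first state variable increases, and the cross-covariance in `P_f` lets this observation also update the second state variable. Misspecified error variances therefore shift the relative weight between the data types.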

In conclusion, methods already exist for the simultaneous assimilation of various data types on different spatial scales. In atmospheric science, multivariate and multiscale DA is well established. In terrestrial systems, it is not yet generally established, and published studies are often synthetic. Further activities are needed to fully exploit the availability of environmental data, which could improve our knowledge of terrestrial processes as well as their interdependencies and teleconnections with the climate system.

This study was supported by the German Research Foundation DFG (Transregional Collaborative Research Centre 32—Patterns in Soil-Vegetation-Atmosphere Systems: Monitoring, modeling and data assimilation). It was also supported by the Helmholtz Alliance on “Remote Sensing and Earth System Dynamics”.

Ensemble-based DA system. Measurements are integrated into a DA framework by an observation operator for comparison with ensemble states for state (and parameter) updates. The scheme is presented for one time step only; the sequential character of DA is generated by new model forcings and new measurements initiating a new ensemble of forward models for the next time step.
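
One time step of such an ensemble-based scheme can be sketched as follows. The code implements a stochastic EnKF analysis step with perturbed observations, which is one common variant and not necessarily the one used in the studies reviewed here; the function `enkf_analysis`, the four-cell state, the averaging observation operator and all numbers are assumptions made for illustration.

```python
import numpy as np


def enkf_analysis(ensemble, y_obs, H, R, rng):
    """Stochastic EnKF update with perturbed observations.

    ensemble: (n_members, n_state); H: (n_obs, n_state); R: (n_obs, n_obs).
    """
    n_members = ensemble.shape[0]
    anomalies = ensemble - ensemble.mean(axis=0)
    P = anomalies.T @ anomalies / (n_members - 1)  # ensemble covariance
    S = H @ P @ H.T + R                            # innovation covariance
    K = np.linalg.solve(S, H @ P).T                # Kalman gain
    # One perturbed observation vector per ensemble member
    noise = rng.multivariate_normal(np.zeros(len(y_obs)), R, size=n_members)
    innovations = (y_obs + noise) - ensemble @ H.T
    return ensemble + innovations @ K.T


rng = np.random.default_rng(42)

# Forecast ensemble: 100 members, 4 fine-scale cells (assumed prior)
forecast = rng.normal(loc=0.20, scale=0.05, size=(100, 4))

# Observation operator: one coarse measurement equal to the mean of the cells
H = np.full((1, 4), 0.25)
y_obs = np.array([0.30])
R = np.array([[0.01 ** 2]])

analysis = enkf_analysis(forecast, y_obs, H, R, rng)

prior_coarse = forecast @ H.T    # coarse-scale value of each forecast member
post_coarse = analysis @ H.T     # coarse-scale value of each updated member

print(f"coarse mean before update: {prior_coarse.mean():.3f}")
print(f"coarse mean after update:  {post_coarse.mean():.3f}")
```

In a sequential setting, the updated members would be propagated by the forward model with new forcings to form the forecast ensemble of the next time step.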

The importance resampling particle filter with 12 particles (modified according to van Leeuwen [

Schematic of the use of the observation operator for the assimilation of coarse-scale data into a fine-resolution 2D model. w_{i,j} stands for the weight of the model result in row i and column j in the calculation of the grid-averaged model result; darker colors represent higher weights. θ stands for the model results; darker colors represent higher values. For simplicity, the superscript ^{−} is omitted from the y and θ variables, and the time index k is omitted from all variables.

Schematic of the use of prior downscaling for the assimilation of coarse-scale data into a fine-resolution model for a model with only one model layer. The symbols are identical to those in the schematic of the observation operator. For simplicity, the superscript ^{−} is omitted from the y and θ variables, and the time index k is omitted from all variables. DA refers to data assimilation.