Hydrological predictability is linked, amongst other factors, with the initial hydrological conditions (IHCs) within a catchment [1]. For example, ensemble streamflow prediction (ESP) methods used in seasonal streamflow forecasts depend on accurate estimates of the IHCs [2]. Of these IHCs, soil moisture is highly important, as the gradual release of water from the soil column is often a large component of streamflow. It has been shown that an accurate estimate of initial soil moisture enhances streamflow predictability at both short [3] and seasonal time scales [4]. This is because the hydrological prediction chain starts with the IHCs, which are used to initialise a hydrological model; forcings from numerical weather prediction (NWP) forecasts are then used to produce a streamflow forecast. Accurate measurements of initial soil moisture conditions are therefore beneficial to the operational global streamflow forecasts that have become available in recent years [5].
Operational streamflow forecasts can benefit from accurate initial soil moisture measurements by assimilating them into their IHCs. Measurements can be obtained from in-situ observations using frequency domain reflectometry (FDR [6]) or cosmic-ray methods such as the Cosmic-ray Soil Moisture Observing System (COSMOS [7]). However, these measurements have low spatial representativeness [9], as they rely on point measurements, and they do not have global coverage. Alternatively, measurements from satellite-based remote sensing platforms can provide global coverage. These can come from either active or passive microwave sensors. The former can have a high spatial resolution (~1 km for Synthetic Aperture Radar (SAR) sensors) but low repeat pass coverage (>10 days) [9]. Passive microwave sensors conversely have coarse spatial resolution but high repeat pass coverage; this second attribute makes them highly suitable for use within operational streamflow forecast systems.
Passive microwave sensors onboard satellite platforms such as Soil Moisture Active Passive (SMAP [10]), Soil Moisture and Ocean Salinity (SMOS [11]) and the Advanced Microwave Scanning Radiometer-2 (AMSR-2 [12]) can provide global soil moisture estimates [13]. The sensors detect brightness temperature from the top 1–5 cm of the soil column, which can then be transformed into an estimate of the soil moisture through methods including radiative transfer [14] and neural networks [15]. The broad swath width (~1000 km) combined with the short repeat pass times (~1–2 days) allows for frequent updating of the soil moisture status, which is beneficial for operational forecasting [17].
Remotely sensed soil moisture observations are typically incorporated into the IHCs of a streamflow prediction system through data assimilation [9], for example using the ensemble Kalman filter [18]. Some previous studies have shown that this results in improved streamflow prediction [20], whilst others have seen a deterioration [21]. These mixed results may relate to how challenges such as uncertainties and biases within the satellite data are dealt with [9].
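To make the idea of an ensemble Kalman filter update concrete, the following is a minimal Python sketch for a single, directly observed soil moisture state. It is a toy illustration only and does not reproduce the scheme of any particular forecasting system; the function name `enkf_update`, the scalar state and the perturbed-observation (stochastic) variant are all assumptions made for this example.

```python
import numpy as np


def enkf_update(ensemble, obs, obs_var, rng):
    """Stochastic ensemble Kalman filter update for one directly observed state.

    ensemble : prior soil moisture values (m3/m3), one per ensemble member
    obs      : observed soil moisture (m3/m3), e.g. a satellite retrieval
    obs_var  : observation error variance (assumed known)
    rng      : numpy random Generator used to perturb the observation
    """
    ensemble = np.asarray(ensemble, dtype=float)
    # Prior (background) error variance estimated from the ensemble spread.
    prior_var = ensemble.var(ddof=1)
    # Kalman gain: weight given to the observation relative to the model.
    gain = prior_var / (prior_var + obs_var)
    # Each member assimilates its own perturbed copy of the observation so
    # that the analysis ensemble keeps a realistic spread.
    perturbed_obs = obs + rng.normal(0.0, np.sqrt(obs_var), size=ensemble.size)
    return ensemble + gain * (perturbed_obs - ensemble)
```

When the observation error variance is large relative to the ensemble spread, the gain approaches zero and the analysis stays close to the model state; when it is small, the analysis is pulled towards the observation. This weighting is one place where the treatment of satellite uncertainties can change the outcome of the assimilation.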
However, many of these previous studies evaluated soil moisture data assimilation using level 2 or level 3 data products. The additional post-processing of the original level 1 data, whilst improving the quality, also increases the latency, meaning that these products cannot be integrated within a real-time operational forecast system. Instead, either the original level 1 data or level 2 data emulated via a neural network, for example [15], can be assimilated within an operational forecast system. One example is the Integrated Forecasting System (IFS) of the European Centre for Medium-Range Weather Forecasts (ECMWF), whose land data assimilation system (LDAS) assimilates soil moisture information from ASCAT (Advanced Scatterometer) and SMOS [23] into the soil moisture analysis.
The ECMWF operational soil moisture analysis, amongst other land surface analysis variables, is then used within the configuration of the Global Flood Awareness System (GloFAS [27]) to produce streamflow forecasts. In this configuration, the land surface analysis variables are used within the Hydrology Tiled ECMWF Scheme for Surface Exchanges over Land (H-TESSEL) land surface model component of the IFS [29] to produce forecasts of hydrological variables including surface and subsurface runoff. These are then coupled offline with the kinematic channel routing of the LISFLOOD hydrological model [30] to produce streamflow forecasts.
Since the GloFAS configuration is initialised from the IFS land surface analysis, it is likely that the assimilation of data including soil moisture has an impact upon streamflow prediction. Previous work has demonstrated a discernible impact of data assimilation upon GloFAS streamflow prediction, especially in areas dominated by snowmelt [31]. However, the specific impact of the soil moisture data assimilation has not been assessed. The inclusion of SMOS within the IFS LDAS, as part of model cycle 46r1 in June 2019 [23], provides an opportunity to assess its impact upon GloFAS streamflow predictions.
The aim of this manuscript is therefore to describe our assessment of the impact of soil moisture data assimilation, from SMOS, upon streamflow prediction within GloFAS. This was achieved by performing a data denial experiment using SMOS data and the GloFAS forecast configuration. Results from the experiment were analysed against in-situ streamflow observations to assess the impact upon GloFAS streamflow prediction skill. Then, an assessment against proxy streamflow observations from the GloFAS ERA-5 dataset [28] was performed to assess the global impact upon skill. Finally, the impact upon high and low flow prediction was assessed through direct comparison between the two GloFAS data denial experiments.
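As an illustration of what a direct comparison of high and low flows between two experiments can look like, the following minimal Python sketch computes the mean relative difference in simulated discharge within the low-flow and high-flow parts of the record. The function name `flow_regime_differences`, the 10th and 90th percentile thresholds and the choice to define the thresholds on the run without SMOS are assumptions made for this example, not the procedure used in this study.

```python
import numpy as np


def flow_regime_differences(q_with, q_without, low_pct=10.0, high_pct=90.0):
    """Mean relative discharge difference in the low- and high-flow regimes.

    q_with    : simulated discharge with SMOS assimilation (m3/s)
    q_without : simulated discharge without SMOS assimilation (m3/s),
                assumed strictly positive and used as the reference
    """
    q_with = np.asarray(q_with, dtype=float)
    q_without = np.asarray(q_without, dtype=float)
    if q_with.shape != q_without.shape:
        raise ValueError("both discharge series must have the same length")
    # Regime thresholds are taken from the reference (denial) experiment.
    low_thr, high_thr = np.percentile(q_without, [low_pct, high_pct])
    rel_diff = (q_with - q_without) / q_without
    return {
        "low": float(rel_diff[q_without <= low_thr].mean()),
        "high": float(rel_diff[q_without >= high_thr].mean()),
    }
```

A positive value indicates that the run with SMOS assimilation simulates more discharge in that regime than the run without it.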
This study investigated the impact of SMOS soil moisture data assimilation upon GloFAS streamflow prediction within an operational forecast configuration. In general, only a minor impact on streamflow prediction skill was found. Globally, the greatest impact was found around Hudson Bay, the central United States, the Sahel and Australia. The greatest impact of SMOS was upon the simulation of flood peaks; lower flows showed lower sensitivity to the inclusion of SMOS data assimilation.
The areas of the world which showed the greatest impact upon high flows in this study appeared to coincide with areas which have open land cover (Figure 8). Comparing the results of this study against land cover data from the ESA Climate Change Initiative (CCI) dataset for 2018 [56] confirms that the greatest changes occurred in sparsely vegetated, herbaceous, grassland, cropland and shrubland classes. Forested and urban areas showed little impact of SMOS soil moisture data assimilation upon GloFAS streamflow predictions. This is likely because SMOS measurements in these areas are subject to interference, which increases the measurement error, meaning that they were filtered out and not assimilated into the model. It is also possible that certain land cover types are associated with either an improvement or a degradation of GloFAS streamflow skill with the assimilation of SMOS soil moisture data. To investigate this further, the land cover classification from the ESA CCI data for 2018 [56] was extracted at each observation station location in the United States and Australia. Then, all stations where the modified Kling–Gupta Efficiency skill score (KGEmodSS) was ≤−0.05 (indicating a degradation with SMOS data assimilation) and all stations where KGEmodSS was ≥0.05 (indicating an improvement with SMOS data assimilation) were identified. Within each of the degradation and improvement categories, the stations were further broken down into the land cover classes from ESA CCI. Results showed that, for both degradation and improvement, most stations belonged to the grass, tree, water and shrub land cover classes (Table 5). Therefore, it appears that land cover does not explain the spatial pattern of degradations or improvements in the GloFAS prediction skill.
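The station classification described above can be sketched in Python as follows. The modified Kling–Gupta Efficiency is written here in its commonly used form (correlation, bias ratio and variability ratio based on the coefficient of variation), and the skill score is assumed to take the usual form relative to a reference run, here the experiment without SMOS; neither formula is quoted from this section, and all function names are hypothetical.

```python
from collections import Counter

import numpy as np


def kge_mod(sim, obs):
    """Modified Kling-Gupta Efficiency (assumed standard formulation)."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    r = np.corrcoef(sim, obs)[0, 1]           # linear correlation
    beta = sim.mean() / obs.mean()            # bias ratio
    gamma = (sim.std() / sim.mean()) / (obs.std() / obs.mean())  # CV ratio
    return 1.0 - np.sqrt((r - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)


def kge_skill_score(kge_exp, kge_ref):
    """Skill of the experiment relative to a reference (assumed form).

    Positive values mean the experiment (with SMOS) outperforms the
    reference (without SMOS); 1 would be a perfect experiment.
    """
    return (kge_exp - kge_ref) / (1.0 - kge_ref)


def classify(score, threshold=0.05):
    """Label a station using the +/-0.05 thresholds given in the text."""
    if score >= threshold:
        return "improvement"
    if score <= -threshold:
        return "degradation"
    return "neutral"


def tally_by_landcover(stations):
    """Count land cover classes within each category.

    stations : iterable of (skill_score, landcover_class) pairs
    """
    counts = {"improvement": Counter(), "degradation": Counter()}
    for score, landcover in stations:
        category = classify(score)
        if category in counts:  # neutral stations are not tallied
            counts[category][landcover] += 1
    return counts
```

If the same land cover classes dominate both counters, as reported in Table 5, land cover alone cannot separate improved stations from degraded ones.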
SMOS data assimilation also appeared to have a minimal impact upon GloFAS results within Europe (Figure 7 and Figure 8). This could be because many of the rivers within Europe are smaller than the 0.1° spatial resolution of GloFAS. Another reason could be the effect of radio frequency interference (RFI) upon the SMOS measurements in this region. This would mean that SMOS data are filtered out in this region and are not assimilated into the model.
The results suggest that the assimilation of SMOS soil moisture mostly affected high flows (Figure 8). Analysis of hydrographs in the United States and Australia confirmed that the main, and sometimes only, differences occurred in the peak flows during the experiment period (Figure 2 and Figure 5). It would be expected that altering the soil moisture may also affect the amount of water released to the river during low flows; however, this was not observed in this study. A likely explanation is that SMOS data assimilation mainly affects the top soil layer, which is only a very shallow portion of the entire soil column. A greater impact on low flows could result if the assimilated soil moisture is also used to analyse root zone soil moisture [20]. The ECMWF IFS LDAS already performs this root zone analysis with SMOS data, but perhaps greater weight should be given to the SMOS data; future work could investigate this. However, since the assimilation mostly affects the top soil layer, it could have a large impact on the ability of the soil column to generate surface runoff, as surface runoff is mostly produced in the top soil layer. For example, if SMOS data assimilation increases the soil moisture in the top soil layer, this reduces the infiltration capacity of this layer, meaning more surface runoff production during the next rainfall event and greater flow in the river.
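The mechanism in the last example can be illustrated with a deliberately simple bucket calculation in Python. This is a toy saturation-excess sketch, not the H-TESSEL runoff scheme: the function name `surface_runoff_mm` and the saturated water content of 0.45 m3/m3 are assumptions, while the 70 mm layer depth corresponds to the 7 cm top soil layer.

```python
def surface_runoff_mm(rain_mm, theta, theta_sat=0.45, depth_mm=70.0):
    """Toy saturation-excess surface runoff from a single top soil layer.

    rain_mm   : rainfall depth of the event (mm)
    theta     : initial volumetric soil moisture of the top layer (m3/m3)
    theta_sat : saturated water content (m3/m3), an assumed value
    depth_mm  : thickness of the top layer (mm)
    """
    if not 0.0 <= theta <= theta_sat:
        raise ValueError("theta must lie between 0 and theta_sat")
    # Water the layer can still absorb before it saturates.
    deficit_mm = (theta_sat - theta) * depth_mm
    infiltration_mm = min(rain_mm, deficit_mm)
    # Whatever cannot infiltrate becomes surface runoff.
    return rain_mm - infiltration_mm
```

For a 20 mm rainfall event, raising the initial soil moisture from 0.25 to 0.35 m3/m3 halves the storage deficit from about 14 mm to about 7 mm, so the runoff in this toy model rises from about 6 mm to about 13 mm. A modest change in the top-layer analysis can therefore produce a visible change in a flood peak while leaving baseflow largely untouched.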
The finding that SMOS data assimilation has a minor impact on GloFAS streamflow prediction corresponds with previous findings [22]. Previous studies have posited that the reasons for this include, amongst others, the representativeness of the soil layers, biases between the model and satellite data, the use of a calibrated hydrological model and uncertainties within the hydrological model [22]. Regarding the first of these, the SMOS soil moisture data were assimilated into the top soil layer of the H-TESSEL soil column, which was 7 cm deep. This is comparable with the depth penetrated by the SMOS soil moisture measurements, which is of the order of a few centimetres [34].
Biases between the SMOS and H-TESSEL model soil moisture data were addressed by using the SMOS soil moisture neural network product trained on the ECMWF IFS (i.e., H-TESSEL) soil moisture analysis. This implicitly removes any biases between the SMOS observations and the ECMWF model. However, it also restricts the data assimilation to correcting only random model errors rather than the bias as well, preventing it from changing the behaviour of the soil moisture [55]. Assimilating the SMOS neural network product trained on the original SMOS level 2 soil moisture data could offer a solution, as this product is not bias corrected to the ECMWF model. However, it would not currently work within the ECMWF IFS LDAS, as it breaks the assumption of zero observation-model bias. A possible solution for future work would be to perform a parameter analysis of H-TESSEL, which may involve tuning the parameters which control the vertical soil water budget.
The use of a calibrated hydrological model to perform the streamflow predictions may explain the resulting minor impact of SMOS data assimilation. As mentioned above, GloFAS was calibrated in a previous study by optimising the streamflow parameters using forcings from a 20-year ECMWF IFS reforecast [46]. The calibration of a given hydrological model can sometimes mean that it is difficult for any subsequent simulation to outperform it [58]. However, the GloFAS calibration study only tuned the LISFLOOD streamflow parameters and left the vertical hydrological component, i.e., the H-TESSEL soil water balance, unchanged. Hence, GloFAS is not as fully calibrated as other hydrological models, meaning there could be more scope for improving its streamflow prediction skill through data assimilation into its initial conditions. This is evidenced by the improvements observed at some locations in the United States, where peak discharges better matched the observations after the assimilation of SMOS soil moisture (Figure 2). Increased soil moisture values from the assimilation of SMOS soil moisture could cause increased surface runoff production and hence greater streamflows [52].
Uncertainties within the GloFAS model configuration and parameterisation may also explain the minor impact of soil moisture data assimilation. They could represent biases and/or errors which could not be overcome by data assimilation of soil moisture alone. In the United States, for example, GloFAS exhibits a widespread under-estimation bias (Figure 3a), whilst in Australia there was an over-estimation bias at most locations (Figure 6a). A possible solution for future work could be to revise the parameterisation of the H-TESSEL soil water budget using SMOS soil moisture data in a calibration procedure [22]. Biases within GloFAS could also be caused by the precipitation forcings; therefore, a dual updating procedure of both the precipitation and initial soil moisture conditions could be carried out in future work [60].
Overall, this study has analysed the impact of SMOS soil moisture data assimilation upon GloFAS streamflow predictions within an operational configuration. Two GloFAS experiments were conducted using hydro-meteorological forcings from ECMWF IFS experiments which respectively included and excluded the assimilation of SMOS soil moisture data. Streamflow predictions from both GloFAS experiments were evaluated against in-situ observations using the KGEmod metric. The results showed some impact upon hydrological prediction skill, but it was difficult to discern a clear signal due to biases and uncertainties within GloFAS.
Further investigation was performed to determine how low and high flow GloFAS predictions were affected by SMOS data assimilation. Results showed that high flows were more affected than low flows. A global assessment of the impact upon low and high flows found the greatest impact around Hudson Bay, the central United States, the Sahel and Australia. However, there was no clear spatial trend to these results, as differences of opposing sign were found in close proximity to each other. Investigating the hydrographs at specific station locations found that differences in KGEmod could often be attributed to differences in a single flood peak, whilst the remainder of the simulated hydrographs were very similar. In some instances the flood peak was greater in the simulation with SMOS data assimilation, whilst the opposite was true in other instances. This could be because SMOS data assimilation only affects the top soil layer, which can greatly alter the generation of surface runoff during a flood peak, but has little effect upon baseflow production during lower flows. There was no clear spatial trend to the changes in high and low flows either. To better understand these changes, future work should focus on finding out how the SMOS data are affecting the GloFAS simulations. This could be done by analysing changes in individual hydrological components such as surface and subsurface runoff, the former being significant for high flows and the latter more important during low flows. This study highlights that assimilating SMOS soil moisture does impact the hydrological predictions of GloFAS, but more work is needed to understand the causes of the observed results.