Exploring the Utility of Machine Learning-Based Passive Microwave Brightness Temperature Data Assimilation over Terrestrial Snow in High Mountain Asia

Kwon, Yonghwan; Forman, Barton A.; Ahmad, Jawairia A.; Kumar, Sujay V.; Yoon, Yeosang

doi:10.3390/rs11192265

Open AccessArticle

Exploring the Utility of Machine Learning-Based Passive Microwave Brightness Temperature Data Assimilation over Terrestrial Snow in High Mountain Asia

by

Yonghwan Kwon

^1,2,3,*

,

Barton A. Forman

¹,

Jawairia A. Ahmad

¹

,

Sujay V. Kumar

²

and

Yeosang Yoon

^2,4

¹

Department of Civil and Environmental Engineering, A. James Clark School of Engineering, University of Maryland, College Park, MD 20742, USA

²

Hydrological Sciences Laboratory, NASA Goddard Space Flight Center, Greenbelt, MD 20771, USA

³

Earth System Science Interdisciplinary Center, University of Maryland, College Park, MD 20740, USA

⁴

Science Applications International Corporation, McLean, VA 22102, USA

^*

Author to whom correspondence should be addressed.

Remote Sens. 2019, 11(19), 2265; https://doi.org/10.3390/rs11192265

Submission received: 16 August 2019 / Revised: 26 September 2019 / Accepted: 26 September 2019 / Published: 28 September 2019

(This article belongs to the Special Issue Data Assimilation of Satellite-Based Observations into Land Surface Models)

Download

Browse Figures

Versions Notes

Abstract

:

This study explores the use of a support vector machine (SVM) as the observation operator within a passive microwave brightness temperature data assimilation framework (herein SVM-DA) to enhance the characterization of snow water equivalent (SWE) over High Mountain Asia (HMA). A series of synthetic twin experiments were conducted with the NASA Land Information System (LIS) at a number of locations across HMA. Overall, the SVM-DA framework is effective at improving SWE estimates (~70% reduction in RMSE relative to the Open Loop) for SWE depths less than 200 mm during dry snowpack conditions. The SVM-DA framework also improves SWE estimates in deep, wet snow (~45% reduction in RMSE) when snow liquid water is well estimated by the land surface model, but can lead to model degradation when snow liquid water estimates diverge from values used during SVM training. In particular, two key challenges of using the SVM-DA framework were observed over deep, wet snowpacks. First, variations in snow liquid water content dominate the brightness temperature spectral difference (ΔT_B) signal associated with emission from a wet snowpack, which can lead to abrupt changes in SWE during the analysis update. Second, the ensemble of SVM-based predictions can collapse (i.e., yield a near-zero standard deviation across the ensemble) when prior estimates of snow are outside the range of snow inputs used during the SVM training procedure. Such a scenario can lead to the presence of spurious error correlations between SWE and ΔT_B, and as a consequence, can result in degraded SWE estimates from the analysis update. These degraded analysis updates can be largely mitigated by applying rule-based approaches. For example, restricting the SWE update when the standard deviation of the predicted ΔT_B is greater than 0.05 K helps prevent the occurrence of filter divergence. Similarly, adding a thin layer (i.e., 5 mm) of SWE when the synthetic ΔT_B is larger than 5 K can improve SVM-DA performance in the presence of a precipitation dry bias. The study demonstrates that a carefully constructed SVM-DA framework cognizant of the inherent limitations of passive microwave-based SWE estimation holds promise for snow mass data assimilation.

Keywords:

support vector machine; passive microwave brightness temperature; data assimilation; snow water equivalent; High Mountain Asia; synthetic twin experiment; NASA Land Information System

1. Introduction

High Mountain Asia (HMA) is a vast, high elevation mid-latitude region comprising the world’s highest mountains. HMA snow and glaciers (that result from the accumulation of snow) are the earth’s largest freshwater reservoirs outside the polar regions, and their meltwater is the main water source for more than a billion people that live downstream, especially in the Indus river basin [1,2]. While an increase (thickening) in glacier mass has been observed in the West Kunlun and Karakoram regions (the so-called ‘Karakoram anomaly’; [3]), in recent decades most of HMA has been progressively losing glaciers [4,5,6,7], which could negatively impact regional water supply and native ecosystems. In this regard, accurate characterization of the spatiotemporal variability of snow and ice across the HMA region is an urgent issue for water resource management under global warming.

Ground-based monitoring of land surface snow cover and glaciers is quite limited in this region because of the complex topography and extreme climatic conditions. Satellite remote sensing and land surface modeling are alternative approaches, but they also have their own limitations. For example, remote sensing observations are limited by errors stemming from instrument noise and retrieval algorithms, whereas land surface models (LSMs) have large uncertainties attributed to model parameterization, initial conditions, and meteorological forcings (i.e., boundary conditions). Fortunately, data assimilation (DA) is an attractive alternative for generating high-quality estimates as it merges satellite-derived observations with model predictions, while explicitly accounting for their relative uncertainties.

Given the fact that passive microwave (PMW) brightness temperature (T_B) is sensitive to snow volume scattering, and thus, provides useful information on snow mass variations, T_B observations can be directly assimilated into an LSM to improve snow estimates. T_B assimilation in the context of snow was first applied by Durand et al. [8] as part of a set of synthetic experiments. Durand et al. [9] demonstrated from a point-scale study that T_B assimilation outperforms traditional T_B-based retrieval algorithms (e.g., [10]) in estimating snow depth. Many follow-on studies (e.g., [11,12,13,14,15]) showed similarly encouraging results. More recently, Kwon et al. [16,17] employed T_B assimilation for continental-scale snow storage estimates and Larue et al. [18] and Kim et al. [19] further improved the method. The focus of this current study was to characterize snow water equivalent (SWE) in the HMA region using T_B assimilation methods.

To assimilate T_B observations, the DA framework requires a forward observation operator that maps the relevant LSM state variables (e.g., SWE) into the corresponding observation space (i.e., T_B). Whereas the above-mentioned previous T_B assimilation studies used a microwave radiative transfer model (RTM) as the observation operator, we use a machine learning algorithm. Forman et al. [20] first employed an artificial neural network (ANN) to map geophysical (model) states into PMW T_B space to explore the feasibility of using machine learning algorithms as an observation operator within a DA framework. They showed that an ANN could be used to estimate T_B at multiple frequencies and polarizations over snow-covered land using LSM snow outputs as inputs to the ANN. Forman and Reichle [21] additionally explored the use of another machine learning technique, i.e., support vector machine (SVM) regression, and suggested that the SVM is a more effective alternative to ANNs for snow T_B assimilation purposes due to the increased sensitivity of SVMs to snow mass relative to ANNs. Through sensitivity experiments, Xue and Forman [22] supported the findings of Forman and Reichle [21] with respect to the use of SVMs within a DA framework. More recently, Xue et al. [23] conducted T_B spectral difference assimilation over snow-covered areas in North America using a well-trained SVM and presented promising initial results. Building on these prior efforts, we adopted the use of SVM regression algorithm in this study, but applying it to a different land surface model using a different set of boundary conditions in a different part of the globe where the accurate estimation of snow is very challenging.

Compared to RTMs, the use of SVMs in the data assimilation environment is computationally more efficient. As SVMs are pretrained prior to the assimilation process, they essentially act as an emulator to the computationally more expensive RTMs [16]. RTM-based DA approaches typically demand high-fidelity snow models to meet all of the input requirements (e.g., [24]). The SVM training process can also help mitigate the negative effects of LSM’s inaccurate representations of snow layers, snow grain size, and ice lens (or depth hoar) within the snowpack by providing a more bulk, mixed-pixel response [21]. SVMs generally provide a more flexible environment for characterizing snow variability over a large region of space, such as HMA, by assimilating T_B at multiple frequencies and multiple polarizations. On the other hand, the “black-box” nature of SVMs may have limitations for snow T_B data assimilation, which are discussed in this paper.

The T_B data assimilation framework used in this study includes the National Aeronautics and Space Administration (NASA) Land Information System (LIS; [25,26,27]) in conjunction with a well-trained SVM (herein SVM-DA framework for simplicity). The Noah Land Surface Model with multi-parameterization options (Noah-MP version 3.6; [28,29]) was used as the prior model for simulating snow. In view of the lack of available measurements for validation in the study area, a synthetic twin experiment (e.g., [30,31,32]) was performed to assess the potential of our SVM-DA framework for enhancing snow estimates over the HMA region. Note that the focus of this article is on terrestrial snow only. We consider the estimation of snow on glaciers to be beyond the scope of this paper.

Research questions addressed in this study include: (1) Is the SVM-DA framework effective in improving snow mass estimates across the HMA region? (2) What limitations does the SVM-DA framework possess? (3) Can these limitations be avoided through the recognition and appropriate handling of instances where the SVM forward operator is suboptimal?

2. Methodology

2.1. SVM-DA Framework and Synthetic Twin Experiment

The SVM-DA framework leveraged NASA LIS as the software “wrapper” around both the forward (prognostic) land surface model and the observation operator (Figure 1), which in this case consisted of well-trained SVMs. LIS is a comprehensive system for land surface modeling and data assimilation, and features a flexible and extensible design using object-oriented programming. In LIS, various advanced LSMs (LIS-LSM), ground measurements, satellite products, sequential and nonsequential DA algorithms (LIS-DA), and tools for high-performance computing and data management are integrated to generate estimates of land surface states and fluxes (see Kumar et al. [33] for more details).

Among the LSMs in LIS, we used Noah-MP v3.6 as the forward (prognostic) LSM to simulate snow-related land surface processes, including the accumulation and ablation of terrestrial snow. In the SVM-DA framework, a well-trained SVM was used as the observation operator to predict T_B from prior model estimates of snow. The well-trained SVM was derived from statistical learning theory and uses support vector regression [34,35,36] via the integration of the LIBSVM library [36] into the LIS framework. As a data-driven model, the SVM can reproduce nonlinear processes [35], such as relationships between snow’s physical properties and electromagnetic characteristics of snow (e.g., [21]). Once trained, the approximating function used in SVM regression is given as:

f (y) = \sum_{i = 1}^{m} (α_{i}^{*} - α_{i}) k (x_{i}, y) + δ

(1)

where y is the [1 × n] input vector from Noah-MP simulations at a given location (i.e., model pixel); n is the number of snow-related variables; m is the total number of training targets (i.e., number of T_B measurements for a given location at different times); α and α^* are the dual Lagrangian multipliers; k(x_i,y) is a Gaussian radial basis kernel function; x is the [m × n] training matrix; and δ is a bias coefficient.

The SVM-DA framework is composed of two phases: (1) SVM training, and (2) SVM prediction during the assimilation phase. As we conducted a synthetic twin experiment, an additional procedure of generating synthetic observations was required as part of the experiment (see Figure 1). During phase 1, Noah-MP was forced by the NASA Modern-Era Retrospective analysis for Research and Applications, version 2 (MERRA-2; [37]) atmospheric reanalysis that yielded estimates of snow-related model states (e.g., SWE and snow depth), which served as the synthetic snow truth for assessing DA performance. The SVM was trained using nine years (1 September 2002 to 1 September 2011) of model simulations in conjunction with PMW T_B observations from the Advanced Microwave Scanning Radiometer for Earth Observing System (AMSR-E) (see Section 2.3.2). The trained SVM was then used to produce the synthetic ΔT_B truth as a function of the snow states generated by the “true” Noah-MP simulation. Finally, synthetic ΔT_B observations were generated by adding random errors, i.e., white Gaussian noise with zero-mean and standard deviation of 3 K [23], to the “true” synthetic ΔT_B. The errors were assumed to be temporally and horizontally uncorrelated.

During phase 2, a prior ensemble (with 20 different replicates) of snow-related model states were estimated by Noah-MP driven by stochastic MERRA-2 forcing. However, the precipitation in MERRA-2 was replaced with precipitation estimates from the Tropical Rainfall Measuring Mission (TRMM) multisatellite precipitation analysis (TMPA; [38]) so that the differences between the two different sets of precipitation estimates (i.e., MERRA-2 precipitation and TRMM precipitation) can be representative of the true errors that could be encountered by an operational assimilation system. The well-trained SVM, generated during phase 1, was used to predict an ensemble of ΔT_B from the ensemble of prior estimates. Using the ensemble of predicted ΔT_B along with the synthetic ΔT_B observations generated during phase 1, LIS-DA updated the ensemble of prior (geophysical) estimates with an ensemble Kalman filter (EnKF; [39,40]) algorithm. The updated (posterior) ensemble of snow states was then used as the initial conditions for subsequent model forecasts.

2.2. Data Sets

The AMSR-E microwave T_B data used during the SVM training were obtained from the MEaSUREs (NASA Making Earth System Data Records for Use in Research Environments) Calibrated Enhanced-Resolution Passive Microwave Daily EASE-Grid 2.0 Brightness Temperature Earth System Data Record (https://nsidc.org/data/nsidc-0630/versions/1; [41]). The highest resolution of T_B data at 10.65, 18.7, and 36.5 GHz were upscaled as an arithmetic average to the LSM resolution (i.e., 0.25° × 0.25° equidistant cylindrical grid) for computational simplicity and then spectral differences (ΔT_B) were calculated from the upscaled T_B data.

Meteorological boundary conditions used in this study include the MERRA-2 atmospheric reanalysis [37] and TRMM precipitation data [38]. MERRA-2 has been updated from its predecessor, MERRA, and is the latest atmospheric reanalysis product produced by the NASA Global Modeling and Assimilation Office (https://gmao.gsfc.nasa.gov/reanalysis/MERRA-2). MERRA-2 has been available from 1980 to the present, and provides global coverage at a spatial resolution of 0.5° × 0.625°. The TRMM precipitation (available from https://pmm.nasa.gov/data-access/downloads/trmm) is provided at a spatial resolution of 0.25° × 0.25°. It spatially covers 50°S–50°N/180°W–180°E, and temporally spans from 2001 to 2016. The daily product of the TRMM precipitation (3B42, version 7) was used during the data assimilation experiments.

Yoon et al. [42] conducted an inter-comparison of 10 different precipitation datasets, including MERRA-2 and TRMM precipitation data, and confirmed that the existing precipitation datasets have large uncertainties and variability across HMA. All precipitation datasets agreed with respect to spatial patterns, but their magnitudes and temporal trends were significantly different. Yoon et al. [42] suggested the use of the Climate Hazards Group InfraRed Precipitation with Station data-version 2 (CHIRPS-2; [43]) precipitation in conjunction with other forcing fields from the European Centre for Medium-Range Weather Forecasts (ECMWF; [44]) to drive land surface models over HMA. Meanwhile, Ahmad et al. [45] showed that signs of the sensitivity of SVM-based ΔT_B predictions to both the forcings (i.e., MERRA-2 and ECMWF + CHIRPS-2) were comparable, and higher sensitivity values were obtained for MERRA-2. Furthermore, since the current study aimed to demonstrate the feasibility of the SVM-DA framework to improve SWE estimates for the highly uncertain environment of HMA, we concluded that both the MERRA-2 (for the synthetic truth) and TRMM precipitation data (for open-loop (OL) and data assimilation (DA)) could be used in this synthetic study as the differences in precipitation between the two different sets are representative of the real-world errors that could be encountered by an operational modeling and assimilation system.

Important surface input data sets for Noah-MP include elevation, land cover types, and soil texture. Elevation estimates were obtained from the Shuttle Radar Topography Mission (SRTM) elevation data [46]. For land cover types, we used the Terra Moderate Resolution Imaging Spectroradiometer (MODIS) sensor-based International Geosphere-Biosphere Programme (IGBP) land classification map [47]. Soil texture data from the International Soil Reference and Information Centre [48] were used as soil inputs to Noah-MP.

2.3. Experimental Setup

2.3.1. Land Surface Model

As noted above, the Noah-MP LSM was used to simulate snow conditions in this study. Noah-MP was improved from its baseline model (i.e., the Noah LSM) by augmenting representations of hydrological processes and land–atmosphere interactions, and by adding multiple parameterization options [28,29]. In Noah-MP, the snowpack is represented by up to three snow layers, depending on the total snow depth. The model simulates snowmelt–refreeze cycles and snow compaction processes such as destructive metamorphism, melt metamorphism, and compaction by snow load or overburden.

Noah-MP initial conditions were obtained from the following two-step process. First, the model was run from 1 January 1980 to 1 May 2002 using a single replicate to consider multiyear snow accumulation, soil moisture dynamics, and shallow groundwater dynamics in high elevation regions while minimizing the computational demand. Second, 20 ensemble members of model initial conditions were generated by running the model from 1 May 2002 to 1 September 2002 with stochastic perturbations of the MERRA-2 meteorological forcing fields and TRMM precipitation data (see Section 2.3.4 for perturbations) used as a boundary condition to Noah-MP. Using the 20-member ensemble to specify the initial conditions, Noah-MP was then run at a spatial resolution of 0.25° (equidistant cylindrical projection) from 1 September 2002 to 1 September 2011 with (DA cases) and without assimilation (open-loop (OL) case) of synthetic ΔT_B observations.

The Noah-MP model integration time step was set to 15 min and daily averaged snow (geophysical) outputs were obtained. The Noah-MP options of physical parameterization schemes used in our experiment are listed in Table 1. More options and detailed descriptions can be found in Niu et al. [28].

2.3.2. SVM Training

We adopted the SVM training procedure as presented in Forman et al. [20] and Forman and Reichle [21]. The training was conducted over a nine-year period from 1 September 2002 to 1 September 2011. A split-sample training procedure was adopted, such that observations for a given training period were excluded from the analysis. A fortnightly (two-week) training period was selected to capture seasonal variability in snow processes while ensuring an adequate training sample size (on the order of a thousand) at most locations.

The selected training targets (i.e., SVM outputs) included spectral differences between the 10.65 and 36.5 GHz channels and between the 18.7 and 36.5 GHz channels at both horizontal (H) and vertical (V) polarization. These three particular frequencies were chosen based on their reported sensitivity to snow associated with preferential scattering (volume) at higher frequency [10,58,59,60,61]. Kwon et al. [16] suggested that the 23.8 GHz channel can add value to the DA performance when it is additionally assimilated. However, due to the high sensitivity of the 23.8 GHz channel to atmospheric water vapor [62,63], it requires explicit estimation of atmospheric effects on T_B [17], which were not modeled in our SVM-DA framework, and hence, the 23.8 GHz channel was not used here. In addition, in order to minimize error resulting from wet snow effects [64], only nighttime AMSR-E measurements were used for training.

Note that atmospheric and vegetation corrections to AMSR-E T_B data were not conducted in this study. Xue and Forman [65] showed that the overall atmospheric contribution to the spaceborne T_B data was approximately 1 to 3 K. Due to the fact that the snow-covered grid cells in HMA are mostly located in high elevation regions (Figure 2a), the atmospheric contribution over HMA may be much smaller due to the relatively dry and thin atmosphere found at higher elevations [66,67,68], and thus was assumed to be negligible. In terms of overlying vegetation, the HMA study domain largely consists of non-forested land-cover types such as barren (or sparsely vegetated), shrublands, croplands, and grasslands (Figure 2b). The forest cover fraction in most of the study domain is small, while most of the snow-covered areas are outside densely forested areas (Figure 2c). In this regard, both atmospheric and vegetation corrections were considered of secondary importance to the performance of the SVM-DA framework in the HMA region.

The Noah-MP (geophysical) state variables selected as input (i.e., control) variables to the SVM included SWE, snow density, snow liquid water, and snow (top layer) temperature. The particular selection of these four input (control) variables to SVM-derived PMW T_B was based, in part, on the sensitivity analysis conducted by Ahmad et al. [45]. Using the same land surface model (Noah-MP) and boundary conditions (MERRA-2) as in this study, Ahmad et al. [45] analyzed the sensitivity of SVM-based ΔT_B predictions over the HMA region to four different Noah-MP state variables. Although the magnitude and sign of the estimated normalized sensitivity coefficient (NSC [unitless]) varied depending on location in time and space, the signs of the NSCs, in general, adhered to the known first-order physics. That is, the SVM-predicted ΔT_B was negatively sensitive to top-layer snow temperature and snow liquid water (i.e., ΔT_B decreased as snow temperature or snow liquid water content increased). Conversely, ΔT_B generally increased with increased SWE or snow density. The largest absolute values of the NSC for ΔT_B between the 18.7 and 36.5 GHz channels estimated for a specific location and averaged over snow accumulation or ablation season were about 1.0 (for SWE), 1.0 (for snow liquid water), 3.9 (for top layer snow temperature), and 1.5 (for snow density). The relatively low NSC of SWE is consistent with the results of Kwon et al. [16], who used radiative transfer models. This implies that the assimilation of PMW T_B or ΔT_B does not ensure improvement in SWE estimates for all surface conditions, and thus requires a nuanced approach.

The daily snow cover product from the National Oceanic and Atmospheric Administration’s National Environmental Satellite Data and Information Service (NOAA/NESDIS) Interactive Multisensor Snow and Ice Mapping System (IMS; [70,71]) was used as a reference for the presence of snow. That is, AMSR-E T_B observations and Noah-MP outputs were excluded from the training set to the SVM if the estimated snow cover did not coincide with the IMS snow cover information. This procedure was expected to reduce erroneous inputs [20] and thus increase the accuracy of the SVM. The SVM was trained on a 0.25° equidistant cylindrical grid, the same spatial resolution and map projection used for Noah-MP, and the optimal nonlinear SVM parameters were determined for each training target, for each fortnight, and for each 0.25° grid cell in the study area.

The performance of this training approach in reproducing the observed ΔT_B over snow-covered areas in the HMA region has been demonstrated in Ahmad et al. [45]. Overall, the trained SVM well reproduced AMSR-E ΔT_B observations not used during the SVM training. Results showed a bias between −0.5 K and 0.5 K and a root mean squared error (RMSE) less than 10 K (see Figure 3 in [45]). The seasonal variability in AMSR-E ΔT_B observations was also appropriately captured by the trained SVM (see Figure 4 in [45]). Please refer to Ahmad et al. [45] for more details about the performance of the SVM over HMA.

2.3.3. Precipitation Bias Correction

As previously mentioned, instead of using MERRA-2 precipitation, as was the case for the nominal (truth) simulation, TRMM precipitation was used for both the OL and DA runs within the synthetic twin experiments. To reduce the significant climatological difference in precipitation between MERRA-2 and TRMM, a simple bias correction on the TRMM precipitation was first conducted using a bias correction factor computed as:

α = \frac{\bar{M E R R A 2}}{\bar{T R M M}}

(2)

where α is the bias correction factor for TRMM precipitation,

\bar{M E R R A 2}

and

\bar{T R M M}

are the time-averaged (from 1 September 2002 to 1 September 2011) and domain-averaged (across the entire HMA region) precipitation from MERRA-2 and TRMM, respectively. The bias correction ensures that, in aggregate, the total amount of TRMM precipitation over the study period and domain is identical to that of MERRA-2. However, considerable differences still exist at individual locations in space and time, such that these differences are representative of the “real-world” errors in the boundary conditions encountered by the prognostic Noah-MP model. In other words, even though the total amount of precipitation is the same between the true simulation and the OL/DA ensembles, the amount of SWE could be significantly different at any individual location due to the nonlinear hydrologic response of precipitation forcing on snowpack estimation. The differences in SWE between the synthetic truth and OL runs were expected to be mitigated by the assimilation of the synthetic ΔT_B observations.

2.3.4. Data Assimilation

The ensemble Kalman filter (EnKF; [39,40]) was used to assimilate the synthetic ΔT_B observations. In the EnKF, model errors are dynamically represented by an ensemble of model integrations, which, in this study, was derived by perturbing meteorological forcings. As a compromise between the performance and computational efficiency of the SVM-DA framework, an ensemble size of 20 was used for both the OL and DA cases. The 20 ensemble members were constructed by perturbing meteorological forcing inputs, which represent the uncertainty in the boundary conditions (see Table 2). Note that the precipitation error was considered the key source of uncertainty in this paper. Due to the fact that the same Noah-MP model (with the same configurations) was used for the synthetic truth, OL, and DA cases during the synthetic twin experiment, errors stemming from the model structure (physics and parameters) were neglected, and, hence, no perturbation was applied to the model prognostic state variables. It is also important to note that the analysis update was disabled when one or more of the ensemble members in a grid cell simulated snow-free conditions.

The perturbed meteorological forcing fields (i.e., boundary conditions) include precipitation, incident shortwave and longwave radiation, and near-surface air temperature, which were selected based on their first-order control on snow processes. The perturbation parameters summarized in Table 2 were determined based on settings used in previous DA studies (e.g., [33,72,73,74,75,76]). The precipitation (P) and downward shortwave (SW) radiation forcing were perturbed using normally distributed, additive random fields (with zero-mean). The downward longwave (LW) radiation and near-surface air temperature (T_air) forcing were perturbed by log-normally distributed, multiplicative random fields (with mean 1). For all perturbation fields, temporal correlations were considered using a first-order autoregressive model (AR(1)). Cross correlations between the perturbation fields were also imposed, as indicated in Table 2, for realistic and balanced (physically consistent) perturbations.

The synthetic observations of ΔT_B between the 10.65 and 36.5 GHz (V and H) channels and between the 18.7 and 36.5 GHz (V and H) channels were simultaneously assimilated in the SVM-DA framework. The standard deviation of the ΔT_B error was assumed to be 3 K based on Kwon et al. [16], Xue et al. [23], and Durand and Margulis [77]. Note that only SWE was directly updated during the assimilation and that other snow states such as snow ice content, snow liquid water content, and snow depth were adjusted based on the amount of SWE update to maintain physical consistency between snow properties in the prior and posterior ensembles from Noah-MP.

2.4. Study Area

HMA presents an interesting study domain for this study. As previously mentioned, HMA is an important area with respect to climate change and freshwater resources. HMA is also characterized by its high biodiversity, highly variable precipitation, and high vulnerability to climate change and natural disasters such as floods and avalanches [78]. Therefore, accurate characterization of the spatiotemporal variability of snow across the HMA region is valuable to regional communities and ecosystems. Second, HMA is also a data-sparse area where ground-based measurements are quite limited, owing to the complex topography, resource availability, and extreme climatic conditions. In this regard, land surface modeling, satellite remote sensing, and data assimilation are the most viable alternatives for estimating SWE in this region. There may be limitations in the use of remote sensing observations to help constrain LSM predictions, and this paper discusses some of the advantages and disadvantages with respect to the use of PMW T_B.

The experiments were first conducted at four specific locations in HMA (Figure 2a) in order to explore a small suite of different elevation and land cover characteristic combinations. Geographical characteristics of the experimental grid cells are summarized in Table 3. Based on analysis of results for these experimental grid cells, we later expanded the study area to the entirety of the HMA domain spanning from 20.5°N to 41.0°N and from 66.5°E to 101.0°E (Figure 2).

Maps of elevation, dominant land-cover type, forest cover fraction, snow class, multiyear mean winter (December, January, and February) precipitation rate, and the percentage of wet snow days in the study domain are presented in Figure 2. Figure 2b derived from the Terra Moderate Resolution Imaging Spectroradiometer (MODIS) sensor-based International Geosphere-Biosphere Program (IGBP) land classification map [47] shows that about 86.8% of the HMA domain is dominated by non-forested land-cover types such as barren or sparsely vegetated (26.5%), croplands (26.0%), shrublands (21.3%), and grasslands (13.0%). Forest land-cover accounts for only 11.8% of the study area (Figure 2b), and dense forests are mostly located in the southeastern part of the region (Figure 2c). The snow classification map (Figure 2d), based on Sturm et al. [69], shows that six snow classes, i.e., ephemeral (60.29%), prairie (36.16%), alpine (1.96%), maritime (0.87%), tundra (0.70%), and taiga (0.02%) are found within the study area. Higher elevation regions (i.e., 4000 m or above) are characterized by the prairie snow class, barren (or sparsely vegetated), shrublands, and grasslands, while lower elevation regions are characterized by the ephemeral snow class, croplands, and forests. High winter (December, January, and February) precipitation rates were mostly seen in the northwestern part of the study domain (Figure 2e), where the significant analysis update of SWE by assimilating the synthetic ΔT_B data were observed (see Section 3.3). In the synthetic truth simulation, snow-covered grid cells experienced wet snow conditions for, on average, about 4.1% of the simulation period (Figure 2f), which may lead to reduced sensitivity of SVM-based ΔT_B estimates to SWE, and thus, degradations of the performance of the SVM-DA framework in estimating SWE (see Section 3.2).

3. Results and Discussion

3.1. Performance of the SVM-DA Framework

Figure 3 shows the performance of the SVM-DA framework at four selected grid cells (see Table 3). Assimilation of synthetic ΔT_B observations into Noah-MP considerably improved SWE estimates for Grids #1 and #2 (approximately 70% and 45% reduction in root mean square error (RMSE), respectively, relative to the OL), whereas degradations were observed for Grids #3 and #4 (approximately 39% and 10% increase in RMSE, respectively, relative to the OL).

There are, in general, three potential sources for DA degradation: (1) high frequency T_B signal saturation in the presence of deep snow, (2) mixed signal response in the presence of glaciers, and (3) mixed signal response in the presence of forest cover. T_B signal saturation in deep snow is a well-known issue that hinders SWE estimation based on PMW T_B observations. When ΔT_B = 10.65 − 36.5 GHz or ΔT_B = 18.7 − 36.5 GHz are used to estimate terrestrial SWE, it is assumed that ΔT_B is positively correlated with SWE because T_B at 36.5 GHz, in general, decreases as SWE increases (i.e., snowpack deepens) due to volume scattering while T_B at 10.65 GHz (or 18.7 GHz) is less sensitive to increasing snow depth as compared to 36.5 GHz. However, T_B at 36.5 GHz can begin to increase after SWE reaches a threshold value; this is due to the transition from snow volume scattering to emission at 36.5 GHz [60,80]. SWE threshold values for the slope reversal in the correlations between SWE and ΔT_B at 36.5 GHz vary depending on local snow conditions such as snow wetness, snow grain size, snow grain shape, and the presence of internal ice layers [81]. The documented range of the SWE threshold value is from about 100 to 200 mm (e.g., [58,60,81,82,83]). Correlations between the Noah-MP simulated SWE and SVM-based ΔT_B,18V–36V estimates were plotted in Figure 4, in which the SWE threshold value was assumed to be 200 mm, which is the upper limit of the range of reported values in the peer-reviewed literature. As shown in Figure 4a,c the slope reversal phenomenon was also observed in our modeling results.

A process limitation of Noah-MP v3.6 used in this study is the lack of simulated glaciers. As a result, no glacier information as simulated by the model was available for use by the SVM during training. Consequently, physically inconsistent correlations between simulated SWE and SVM-derived ΔT_B estimates were found in glaciated regions (Figure 4e,f). Namely, ΔT_B decreased as SWE increased, up until SWE reached a threshold of approximately 200 mm, after which ΔT_B transitioned to a positive correlation with SWE. This behavior is opposite to the relationship between SWE and ΔT_B for a dry snowpack on land (see Figure 4a,c for comparison) and was also seen in the relationship between the simulated SWE and AMSR-E ΔT_B data (not shown here). This implies that the SVM can yield the right prediction of ΔT_B from glaciated regions (i.e., snow on ice) for the wrong reasons (i.e., snow on land).

On the other hand, one physical mechanism that can partly explain the unusual relationships between SWE and ΔT_B over snow on glaciers is microwave volume scattering within underlying glaciers. The effective dielectric properties of solid ice are vastly different from those of water or snow (i.e., ice + air) [84]. In addition, Grenfell [85] and Eppler et al. [86] showed that the microwave emissivity of dry first-year ice at 37 GHz is comparable to that at 19 GHz, whereas the microwave emissivity of dry multiyear ice at 19 GHz is much higher than that at 37 GHz due to increased volume scattering, especially at higher frequencies, which is associated with the highly porous nature of multiyear ice. During the early period of snow season, ΔT_B between a lower frequency (e.g., 18.7 GHz) and a higher frequency (e.g., 36.5 GHz) is mainly determined by the underlying glaciers, i.e., multiyear ice. As snow depth increases, ΔT_B is gradually affected by snow cover. However, without accurate glacier information from observations or model simulations, it may be hard to fully understand the underlying physical mechanisms that determine the relationships between SWE and ΔT_B over snow on glaciers.

The PMW T_B signal at the top of the atmosphere is significantly affected by the presence of vegetation canopy [87,88,89]. More specifically, a low sensitivity of PMW T_B from terrestrial snow is regularly observed in the presence of dense overlying forest [90,91]. Therefore, the performance of PMW T_B observation assimilation in estimating SWE can be impeded by the presence of forest canopy [16]. Many studies (e.g., [17,65,91,92,93]) have applied correction techniques to try and ameliorate the impact of forest cover on SWE estimation, but no such correction techniques have been applied here as most of the snow-covered regions in the study domain are outside densely forested areas (Figure 2c).

However, these three issues (i.e., forest cover, T_B signal saturation, and the electromagnetic response of snow-on-ice) were not always the dominant factors yielding degraded results in this study. The SVM-DA framework was effective in improving the SWE estimation for Grid #2 even though the grid cell was characterized by high forest fraction (54.4%) and deep snow conditions. Conversely, SWE estimates were degraded by DA for Grid #4 where neither forest cover nor glaciers exist. A detailed analysis as to why such degradation occurs is provided in the next section.

As noted in our analysis below, careful attention to the three above mentioned issues must be included in DA frameworks. In Section 3.3, it turned out that the T_B signal saturation issue was one of the major factors causing the degradation of SWE estimates by the SVM-DA framework for the entire HMA domain. Due to the issue of the electromagnetic response of snow-on-ice, glaciated regions were excluded from the analysis throughout this paper. Comparatively, the influence of forest cover on the DA performance was small (also shown in Section 3.3), likely due to the fact that (1) the same vegetation scheme was used in the synthetic truth and DA, and (2) no vegetation-related state variables were used as inputs to the SVM. However, these issues are relevant for a successful application of the SVM-DA framework for real T_B or ΔT_B data assimilation.

3.2. Limitations of ΔT_B DA for Snow Estimation

As is seen in Figure 3c,d, the degradation of the SVM-DA framework in estimating SWE for Grids #3 and #4 was mainly due to the anomalously large updates in SWE to the grid cells via assimilation of synthetic ΔT_B observations during 2005–2006 and 2009–2010, respectively. To better highlight the underlying mechanisms that caused the degradation, we further analyzed the results of Grids #3 and #4, focusing on these particular time periods. Time series of the SVM inputs, i.e., SWE, snow density, snow liquid water, and snow (top layer) temperature for Grids #3 and #4 are presented in Figure 5 (time series for the entire 9-year experimental period are provided in the supplemental materials, i.e., Figure S1 for Grid #3 and Figure S2 for Grid #4). For comparison purposes, the same plots were made for Grids #1 and #2 for periods when the SVM-DA framework performed well (Figure 6).

It can be noted from Figure 5 that Grids #3 and #4, where DA degraded the SWE estimate, conditions were characterized by deep, wet snow. Snow density using DA was improved relative to the OL case, and the snow (top layer) temperature estimation error was marginal in both the OL and DA cases. Although some ensemble members exhibited abnormally large snow densities (i.e., snow densities close to that of ice) during the snow ablation period as a result of DA (Figure 5b,f), they were a direct consequence of the anomalously large updates to SWE during the snow accumulation period. It is important to note here that SWE is a prognostic variable (in conjunction with snow depth) and that snow density is a diagnostic variable computed as the ratio of the two. We can infer from Figure 5c,g that the snow liquid water error is one of the dominant factors that lead to degraded SWE estimates when using DA.

When wet snow is present in a pixel, microwave emission from the liquid water increases because the emissivity of water approaches unity, hence, liquid water tends to dominate the snow “signal” in the ΔT_B observations [64,68]. Therefore, the sensitivity of ΔT_B to SWE decreases in wet snow as sensitivity to liquid water correspondingly increases, as shown in Figure 4b. Although positive correlations between Noah-MP simulated SWE and SVM-based ΔT_B,18V–36V estimates were observed during deep (SWE ≥ 200 mm threshold), wet snow conditions in Grid #4 (Figure 4d), we would not expect that the ΔT_B signal is attributable to changes in SWE because of the occurrence of signal saturation in deep snow (see Section 3.1).

During the snow ablation period, large snow grains could be another determinant of the T_B signal emitted from the snowpack. The magnitude of the snow grain size error significantly affects the correlation between SWE and T_B [94], and hence, accurate estimation of snow grain size is a prerequisite for enhancing the DA performance [16]. Other important snow microstructural parameters for accurate modeling of snow microwave T_B include snow anisotropy and stickiness (e.g., [94,95,96]). However, because these microstructural parameters (i.e., snow grain size, anisotropy, and stickiness) are not explicitly represented in Noah-MP, they could not be incorporated into the SVM training procedure. Hence, the direct impacts of snow grain size, anisotropy, and stickiness on SVM-based ΔT_B estimates cannot be discussed here.

The large SWE update occurred (after the vertical dotted line in the figures) when the normalized innovations (NI) were large (Figure 7e,j), which was calculated as:

N I = \frac{y - \bar{H (x^{-})}}{\sqrt{H P^{-} H^{T} + R}}

(3)

where the numerator is the difference (innovation) between the assimilated (synthetic) ΔT_B observations (y) and the ensemble mean of predicted (prior) ΔT_B observations (

\bar{H (x^{-})}

), and the denominator is the square root of the sum of the measurement (R) and background error covariances (

\sqrt{H P^{-} H^{T}}

). Thus, these large normalized innovations (compared to other time periods) may be attributed to (1) the large innovations and/or (2) the small standard deviations of the prior ΔT_B across the ensemble.

During the ablation period in Grids #3 and #4, the temporal trend of the synthetic ΔT_B retrievals (Figure 7) was mainly attributed to snow liquid water content (Figure 5c,g respectively). As can be seen in Figure 5c,g the SVM-DA framework could not reproduce the “true” liquid water content within the snowpack, which led to the large differences (innovations) between the synthetic ΔT_B observations and the predicted ΔT_B. Conversely, large innovations did not occur when the snowpack was dry (Grid #1; Figure 6c) or when the snow liquid water content was well-estimated relative to the truth by the SVM-DA framework (Grid #2; Figure 6g).

As clearly shown in Figure 7a–d, the SVM predicts climatological ΔT_B when the prior estimates of snow lie outside the range of the “true” snow values used during SVM training. Consequently, the ensemble of SVM-based ΔT_B predictions collapse, yielding a near-zero ensemble standard deviation. The collapse of the predicted observation ensemble could result in the occurrence of spurious error correlations between SWE and SVM-based ΔT_B estimates that ultimately manifests itself as a degraded estimate when DA is enabled.

3.3. Possible Improvements in DA Performance

Based on the analysis in Section 3.2, we incorporated two simple rule-based updates into the SVM-DA framework to minimize DA degradation (Table 4). In the first rule (rule 1), prior SWE was updated by the assimilation of the synthetic ΔT_B observations only when the standard deviation of the prior ΔT_B was larger than 0.05 K. It was hypothesized that rule 1 can help avoid the negative effects associated with the collapse of the SVM-predicted ΔT_B ensembles, and the resultant spurious error correlations between SWE and ΔT_B on the DA performance.

As previously mentioned, the DA framework did not originally conduct an analysis update of the prior SWE if one or more of the ensemble members in a given grid cell was snow-free. This could lead to systematic underestimation of SWE during DA, and hence, underestimation of snow liquid water in subsequent time periods. To help mitigate this concern, a second rule (rule 2) was introduced where it was assumed that snow cover exists if the observed ΔT_B is greater than 5 K, and hence, a thin snow layer (i.e., 5 mm SWE) was added when the simulated SWE was zero but ΔT_B > 5 K. Rule 2 is expected to minimize abrupt changes in SWE during the analysis update caused by the snow liquid water error. The combination of rule 1 and rule 2 (i.e., rule 3) was also explored.

Figure 8 shows that the performance of the SVM-DA framework in estimating SWE can be improved slightly by applying the rule-based approaches. Rule 1, rule 2, and rule 3 all reduced the SWE RMSE compared to the OL case by 4.2%, 7.5%, and 11.3%, respectively, for Grid #3, and by 2.9%, 12.8%, and 42.0%, respectively, for Grid #4. Estimates of snow density and snow liquid water were also improved via the rule-based DA approach (see supplemental materials, i.e., Figure S1 for grid #3 and Figure S2 for grid #4). The large increase in SWE, shown in Figure 8a, was not observed when using the rule-based DA cases in Grid #3 (Figure 8b–d). However, in Grid #4, a large amount of SWE was added by DA during 2009 and 2010 winter in the rule 3 cases, which rapidly melted during the ablation period.

The same rule-based approaches were also employed in the application of the SVM-DA framework to the entire HMA domain (Figure 9). As shown in the figure, DA “baseline” slightly reduced the spatially averaged RMSE of SWE estimates (27.5 mm; Figure 9b) compared to that of the OL case (28.1 mm; Figure 9a). SWE estimates in 56.8% (43.2%) of snow-covered grid cells were enhanced (degraded) by DA, as shown in Figure 9c. Rule 1 further (but marginally) reduced the spatial mean of the SWE RMSE (27.3 mm; Figure 9d,e), while rule 2 was effective in increasing the number of improved grid cells (61.8%; Figure 9f,g). The best performance in terms of improvement to mean RMSE and number of improved grid cells was achieved by applying rule 3 (combination of rule 1 and rule 2) (Figure 9h,i).

Even with the rule-based approaches, the SVM-DA framework still exhibited mixed performance in estimating SWE, that is, 61.6% (38.4%) of snow-covered grid cells were improved (degraded) (Figure 9i). Therefore, the spatial mean RMSE was only marginally reduced by about 3.2% and 1.1% relative to the OL and DA (baseline) cases, respectively. The dominant land-cover types of grid cells, where DA degraded SWE estimates, include open shrublands, grasslands, and barren or sparsely vegetated areas (Figure 9j). Degradations of SWE estimates were mostly observed in deep snow conditions (Figure 9l) rather than densely forested areas (Figure 9k). A simple test (i.e., rule 4; see Table 4 and Figure 9j,k) showed that reduction of the spatial mean RMSE (i.e., 26.3 mm) by about 6.4% and 4.4% compared to the OL and DA (baseline) cases, respectively, can be achieved by suppressing the analysis update when one or more of the ensemble members in a grid cell have SWE greater than 500 mm. This implies that the T_B signal saturation (especially at 36.5 GHz) in deep snow still needs investigation in order to enhance the performance of the SVM-DA framework across the entire HMA domain, even though it was not the primary factor for the degradation in the experimental grid cells (i.e., Grids #3 and #4).

The percentage of deep snow days when SWE was greater than the reported SWE threshold (i.e., 200 mm) during the winter season (December, January, and February) is shown in Figure 9k,l. Due to the large uncertainties in precipitation data over the HMA region [42], both the results using MERRA-2 (synthetic truth case) and TRMM (OL case) precipitation were analyzed. Within the study domain, 6.3% (synthetic truth; Figure 9k) and 4.4% (OL; Figure 9l) of model grid cells experienced deep snow conditions, and, on average, 21.7% (synthetic truth; Figure 9k) and 5.4% (OL; Figure 9l) of the winter season were subject to the T_B signal saturation issue. It should be noted that many locations would experience T_B signal saturation more frequently than the values shown in the figure because of lower SWE thresholds.

Derksen [58] suggested that ΔT_B between the 10.65 and 18.7 GHz channels can detect SWE variations for relatively deep snow conditions, rather than using 36.5 GHz that most commonly encounters signal saturation. The 10.65 GHz channels are less sensitive to snowpack volume scattering compared to the 18.7 GHz channels. Therefore, for deep snow conditions (i.e., SWE > 100 to 200 mm thresholds), the 10.65 GHz channels serve as a non-scattering background while the 18.7 GHz channels are more significantly affected by volume scattering [58]. However, additionally assimilating ΔT_B between the 10.65 and 18.7 GHz (V and H) channels did not significantly improve DA performance (mean RMSE = 27.1 mm). This is due, in part, to the fact that simultaneous assimilation of all channels still involves errors stemming from higher frequency channels. Furthermore, ΔT_B between the 10.65 and 18.7 GHz channels may introduce errors (noise) during shallow snow conditions because the signal is more representative of soil moisture conditions rather than snow mass (e.g., [97]). A possible solution will be assimilating different frequency channels for shallow and deep snow conditions. That is, for example, ΔT_B between the 18.7 and 36.5 GHz channels could be assimilated for shallow snow whereas ΔT_B between the 10.65 and 18.7 GHz channels is assimilated for deep snow. This approach will be explored in a future study.

3.4. Controllability of Observation Operator

A final issue related to the efficacy of SVM-based assimilation of PMW ΔT_B is that of “controllability”, which is an important factor in optimal control theory. Controllability highlights the skill (or lack thereof) of a linear or nonlinear model to guide the output of that model from any physically plausible initial state (by changing the inputs, i.e., control variables) toward any physically plausible final state within a finite period of time [98]. In other words, a system is controllable if and only if the system state(s) can be changed by changing the system input [99]. In the context of data assimilation, controllability of the observation operator is crucial during the analysis update because any errors in the geophysical (land surface) model should correlate back to errors in the SVM-derived ΔT_B estimates: this is one of the fundamental assumptions behind the SVM-DA framework. In the absence of controllability, an error in the geophysical model would no longer correlate back to the corresponding error in the observation operator, which would negate the efficacy of any data assimilation scheme, and perhaps, even lead to spurious error correlations that ultimately degrade the geophysical model.

In the context of SVM-based estimation of PMW ΔT_B over snow-covered terrain, Figure 10 shows two different locations in HMA, whereby one location in space and time is controllable and hence shows improved SWE estimates by DA (RMSE_DA–RMSE_OL = −31 mm), and the other is not (RMSE_DA–RMSE_OL = 5.8 mm). Namely, Figure 10a shows a controllable SVM-based estimate of ΔT_B whereby perturbations to the nominal inputs (i.e., control variables) to the SVM yields changes in ΔT_B. Further, the ΔT_B estimates adhere to the first-order physics, such that an increase (decrease) to snow density or SWE results in an increase (decrease) of volume scattering that leads to an increase (decrease) in ΔT_B. Alternatively, the uncontrollable SVM in Figure 10b is insensitive to changes in the inputs, and as such, reverts to a climatological ΔT_B estimate gleaned during the training phase. It is important to note that most of the SVM-based predictions in HMA are, in fact, controllable. However, at some locations in space and/or time (particularly when the snow is deep and/or ripe), individual SVMs may encounter controllability issues that ultimately manifest themselves as an erroneous update. There is likely no single panacea that will solve this issue of controllability. Rather, improvements in machine learning efficacy (and controllability by association) will be achieved through a collection of actions including: (1) improvements in quality control and quality assurance of the observations used during SVM training; this may include the use of the original enhanced resolution (i.e., without upscaling) T_B data or those corrected by considering the effect of complex topography (e.g., slope, aspect, ridge, and valley) on the local incidence and polarization rotation [100] in order to better represent microwave emission from the snowpack in mountainous areas; (2) improvements in geophysical modeling (e.g., inclusion of snow grain size, inclusion of sub-grid scale glacier physics, inclusion of snow grain shape) that will better constrain the PMW-derived snow mass estimation problem; (3) reconciliation of differences between different meteorological boundary conditions that will help better constrain the snow mass estimation problem; (4) continued exploration of different machine learning algorithms, whereby one algorithm may possess greater skill in time series regression; and (5) exploration of various training approaches, e.g., train machine learning algorithms using T_B or ΔT_B to retrieve SWE, which is then assimilated into LSMs, as explored for soil moisture estimates in Rodríguez-Fernández et al. [101].

4. Summary and Conclusions

This study explored the potential of assimilating synthetic PMW ΔT_B observations within a SVM-DA framework in order to enhance the characterization of spatio-temporal variability of SWE over HMA. The SVM-DA framework employed a machine learning algorithm, i.e., SVM, as the observation operator in the prediction of ΔT_B. Noah-MP was employed as the prognostic LSM simulating snow. Three different research questions were examined using a set of synthetic twin experiments in which synthetic ΔT_B observations at four different frequency and polarization combinations (i.e., ΔT_B 10–36 H, ΔT_B, 10–36 V, ΔT_B, 18–36 H, and ΔT_B, 18–36 V) were simultaneously assimilated into Noah-MP using an EnKF.

Research question 1: Is the SVM-DA framework effective in improving snow mass estimates across the HMA region? The framework significantly improved SWE estimates for two of the experimental grid cells (Grids #1 and #2) but resulted in degraded output for the other two experimental grid cells (Grids #3 and #4) that were characterized by deep and wet snow conditions. For the entire HMA domain, the SVM-DA framework also showed mixed performance, that is, the framework improved (degraded) SWE estimates for 56.8% (43.2%) of snow-covered grid cells. Compared to the OL case, only ~2% improvement in the spatial mean RMSE of SWE was achieved with DA. These results suggest that the use of machine learning algorithms to invert geophysical (snow) states into observational (ΔT_B) space offers potential, but that issues related to sensitivity and controllability must be addressed to ensure the machine learning algorithm produces the right answer for the right reason.

Research question 2: What limitations does the SVM-DA framework possess? From the analysis of the results for the degraded experimental grid cells, two potential limitations of the T_B assimilation exist during deep snow and wet snow conditions. First, abrupt changes in SWE can happen during the analysis update due to the large innovations when the ΔT_B signal is dominated by variations in snow liquid water content. Second, the collapse of the ensemble of SVM-based ΔT_B predictions are observed when the prior snow estimates are outside the range of snow inputs used for the SVM training procedure. The resultant near-zero standard deviation across the ensemble of the prior ΔT_B leads to degradation of the DA performance, presumably due to spurious error correlations between SWE and ΔT_B. The former one is a general issue in PMW T_B-based snow data assimilation, while the latter one is the SVM-DA-specific issue. The issues highlighted here are, in part, related to problems of controllability mentioned above, but also stem from the ill-posed (underdetermined) nature of PMW T_B-based estimation of snow mass. One potential means of alleviating the ill-posed nature of this problem is to provide additional constraints (e.g., snow grain size and snow grain shape) via additional process physics in the forward (prognostic) land surface model.

Research question 3: Can these limitations be avoided through the recognition and appropriate handling of instances where the SVM forward operator is suboptimal? To minimize the negative effects of the potential pitfalls on the DA performance, three rules were tested. In rule 1, SWE assimilation was enabled only when the standard deviation of the predicted ΔT_B was greater than 0.05 K. In rule 2, a thin snow layer (5 mm SWE) was added in a grid cell when the simulated SWE was zero but the observed ΔT_B was larger than 5 K. DA performance was improved for the experimental grid cells by applying these rules. In particular, the combination of the two rules (rule 3) exhibited a 11.3% and 42.0% reduction of the SWE RMSE for Grids #3 and #4, respectively, as compared to the OL case. However, the SVM-DA framework still showed mixed performance for the entire HMA domain. DA (rule 3) degraded SWE estimates for about 38.4% of snow-covered grid cells, which were largely impacted by deep snow (signal saturation) conditions in the synthetic truth and less so by signal attenuation in densely forested areas. Further improvements of the DA performance in the rule 4 DA case imply the need for assimilating lower frequency channels (e.g., ΔT_B between the 10.65 and 18.7 GHz channels) for deep snow conditions (SWE > 100 to 200 mm thresholds) in order to avoid T_B signal saturation issue.

In addition to the T_B signal saturation, there are still a number of issues that need to be addressed to enhance the applicability of the SVM-DA framework to the HMA domain. First, the effects of glaciers and snow grain size, which are two important modulators of PMW T_B, are not included in the current framework. Second, a sophisticated forcing bias correction method should be explored for future use. The simple bias correction method used in this study, i.e., applying a single bias correction factor for all grid cells, may affect the performance of the data assimilation system, especially for areas where the difference in precipitation between MERRA-2 and TRMM was large. Third, along with the second issue, multiple meteorological forcing datasets can be simultaneously used for the SVM training to, perhaps, better account for the large uncertainties in forcing data, especially precipitation, over HMA, as shown in Yoon et al. [42]. This approach may help minimize the occurrence of the SVM predicting climatological ΔT_B when the prior snow estimates fall outside the range of the training data. Fourth, assimilation needs to be conducted at a finer spatial resolution (e.g., 1 km) to better characterize the spatial distribution of SWE in mountainous regions, which may require additional strategies to consider the difference in spatial resolution between LSMs and T_B observations (and SVM). Additionally, the effects of vegetation and atmosphere on T_B, which were neglected for simplicity in this synthetic study, should be taken into account in future studies to improve the DA performance. Finally, machine learning controllability (and sensitivity more generally) must be investigated prior to assimilation to ensure the SVM produces the right answer for the right reason.

Overall, the results obtained in this study provide useful information for future practical applications of the SVM-DA framework in HMA.

Supplementary Materials

The following are available online at https://www.mdpi.com/2072-4292/11/19/2265/s1, Figure S1: Time series of SWE (A and E), snow density (B and F), snow liquid water (C and G), and snow (top layer) temperature (D and H) including the synthetic truth (black dots), OL ensemble mean (blue line), and DA ensemble mean (red line) for the baseline (upper) and rule 3 cases (lower) in Grid #3; Figure S2: Time series of SWE (A and E), snow density (B and F), snow liquid water (C and G), and snow (top layer) temperature (D and H) including the synthetic truth (black dots), OL ensemble mean (blue line), and DA ensemble mean (red line) for the baseline (upper) and rule 3 cases (lower) in Grid #4.

Author Contributions

Y.K. designed and conducted the experiments and analyzed the results with the help of B.A.F. and S.V.K.; Y.K. and B.A.F. wrote the paper, and all other authors contributed to improving the paper; S.V.K. and Y.Y. prepared the initial setup of LIS for snow assimilation; B.A.F. trained the SVM; Y.K. implemented the SVM into LIS; J.A.A. and B.A.F. analyzed the results associated with the SVM controllability.

Funding

This research was funded by the NASA High Mountain Asia Science Team (NNX17AC15G).

Acknowledgments

Computational resources were provided by the NASA Center for Climate Simulation and the University of Maryland High-Performance Computing Clusters (HPCC). The authors would like to expressly thank Yuan Xue for many constructive comments.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

AMSR-E	Advanced Microwave Scanning Radiometer for Earth Observing System
ANN	Artificial neural network
AR(1)	First-order autoregressive model
CHIRPS-2	Climate Hazards Group InfraRed Precipitation with Station data, version 2
DA	Data assimilation
ECMWF	European Centre for Medium-Range Weather Forecasts
EnKF	Ensemble Kalman filter
GLIMS	Global Land Ice Measurements from Space
HMA	High Mountain Asia
IGBP	International Geosphere-Biosphere Program
IMS	Interactive Multisensor Snow and Ice Mapping System
LIS	Land Information System
LSM	Land surface model
MEaSUREs	Making Earth System Data Records for Use in Research Environments
MERRA-2	Modern-Era Retrospective analysis for Research and Applications, version 2
MODIS	Terra Moderate Resolution Imaging Spectroradiometer
NASA	National Aeronautics and Space Administration
NOAA/NESDIS	National Oceanic and Atmospheric Administration’s National Environmental Satellite Data and Information Service
Noah-MP	Noah Land Surface Model with multiparameterization options
NSC	Normalized sensitivity coefficient
OL	Open loop
PMW	Passive microwave
RTM	Radiative transfer model
SRTM	Shuttle Radar Topography Mission
SVM	Support vector machine
SWE	Snow water equivalent
T_B	Brightness temperature
ΔT_B	Brightness temperature spectral difference
TMPA	Multisatellite precipitation analysis
TRMM	Tropical Rainfall Measuring Mission

References

Immerzeel, W.W.; Van Beek, L.P.H.; Bierkens, M.F.P. Climate change will affect the Asian water towers. Science 2010, 328, 1382–1385. [Google Scholar] [CrossRef] [PubMed]
Zhang, L.; Su, F.; Yang, D.; Hao, Z.; Tong, K. Discharge regime and simulation for the upstream of major rivers over Tibetan Plateau. J. Geophys. Res. Atmos. 2013, 118, 8500–8518. [Google Scholar] [CrossRef]
Hewitt, K. The Karakoram anomaly? Glacier expansion and the “elevation effect” Karakoram Himalaya. Mt. Res. Dev. 2005, 25, 332–340. [Google Scholar] [CrossRef]
Bolch, T.; Kulkarni, A.; Kaäb, A.; Huggel, C.; Paul, F.; Cogley, J.G.; Frey, H.; Kargel, J.S.; Fujita, K.; Scheel, M.; et al. The state and fate of Himalayan glaciers. Science 2012, 336, 310–314. [Google Scholar] [CrossRef] [PubMed]
Farinotti, D.; Longuevergne, L.; Moholdt, G.; Duethmann, D.; Mölg, T.; Bolch, T.; Vorogushyn, S.; Güntner, A. Substantial glacier mass loss in the Tien Shan over the past 50 years. Nat. Geosci. 2015, 8, 716–722. [Google Scholar] [CrossRef]
Buri, P.; Pellicciotti, F. Aspect controls the survival of ice cliffs on debris-covered glaciers. Proc. Natl. Acad. Sci. USA 2018, 115, 4369–4374. [Google Scholar] [CrossRef] [Green Version]
Dehecq, A.; Gourmelen, N.; Gardner, A.S.; Brun, F.; Goldberg, D.; Nienow, P.W.; Berthier, E.; Vincent, C.; Wagnon, P.; Trouvé, E. Twenty-first century glacier slowdown driven by mass loss in High Mountain Asia. Nat. Geosci. 2019, 12, 22–27. [Google Scholar] [CrossRef]
Durand, M.; Margulis, S.A. Feasibility test of multifrequency radiometric data assimilation to estimate snow water equivalent. J. Hydrometeor. 2006, 7, 443–457. [Google Scholar] [CrossRef]
Durand, M.; Kim, E.J.; Margulis, S.A. Radiance assimilation shows promise for snowpack characterization. Geophys. Res. Lett. 2009, 36. [Google Scholar] [CrossRef]
Chang, A.T.C.; Foster, J.L.; Hall, D.K. Nimbus-7 SMMR derived global snow cover parameters. Ann. Glaciol. 1987, 9, 39–44. [Google Scholar] [CrossRef]
Toure, A.M.; Goïta, K.; Royer, A.; Kim, E.J.; Durand, M.; Margulis, S.A.; Lu, H. A case study of using a multilayered thermodynamical snow model for radiance assimilation. IEEE Trans. Geosci. Remote Sens. 2011, 49, 2828–2837. [Google Scholar] [CrossRef]
Dechant, C.; Moradkhani, M. Radiance data assimilation for operational snow and streamflow forecasting. Adv. Water Resour. 2011, 34, 351–364. [Google Scholar] [CrossRef]
Andreadis, K.M.; Lettenmaier, D.P. Implications of representing snowpack stratigraphy for the assimilation of passive microwave satellite observations. J. Hydrometeor. 2012, 13, 1493–1506. [Google Scholar] [CrossRef]
Langlois, A.; Royer, A.; Derksen, C.; Montpetit, B.; Dupont, F.; Goïta, K. Coupling the snow thermodynamic model SNOWPACK with the microwave emission model of layered snowpacks for subarctic and arctic snow water equivalent retrievals. Water Resour. Res. 2012, 48. [Google Scholar] [CrossRef] [Green Version]
Che, T.; Li, X.; Jin, R.; Huang, C. Assimilating passive microwave remote sensing data into a land surface model to improve the estimation of snow depth. Remote Sens. Environ. 2014, 143, 54–63. [Google Scholar] [CrossRef]
Kwon, Y.; Yang, Z.-L.; Zhao, L.; Hoar, T.J.; Toure, A.M.; Rodell, M. Estimating snow water storage in North America using CLM4, DART, and snow radiance data assimilation. J. Hydrometeor. 2016, 17, 2853–2874. [Google Scholar] [CrossRef]
Kwon, Y.; Yang, Z.-L.; Hoar, T.J.; Toure, A.M. Improving the radiance assimilation performance in estimating snow water storage across snow and land-cover types in North America. J. Hydrometeor. 2017, 18, 651–668. [Google Scholar] [CrossRef]
Larue, F.; Royer, A.; De Sève, D.; Roy, A.; Cosme, E. Assimilation of passive microwave AMSR-2 satellite observations in a snowpack evolution model over northeastern Canada. Hydrol. Earth Syst. Sci. 2018, 22, 5711–5734. [Google Scholar] [CrossRef] [Green Version]
Kim, R.S.; Durand, M.; Li, D.; Baldo, E.; Margulis, S.A.; Dumont, M.; Morin, S. Estimating alpine snow depth by combining multifrequency passive radiance observations with ensemble snowpack modeling. Remote Sens. Environ. 2019, 226, 1–15. [Google Scholar] [CrossRef]
Forman, B.A.; Reichle, R.H.; Derksen, C. Estimating passive microwave brightness temperature over snow-covered land in North America using a land surface model and an artificial neural network. IEEE Trans. Geosci. Remote Sens. 2014, 52, 235–248. [Google Scholar] [CrossRef]
Forman, B.A.; Reichle, R.H. Using a support vector machine and a land surface model to estimate large-scale passive microwave brightness temperatures over snow-covered land in North America. IEEE J. Selec. Top. Appl. Earth Obs. Remote Sens. 2015, 8, 4431–4441. [Google Scholar] [CrossRef]
Xue, Y.; Forman, B.A. Comparison of passive microwave brightness temperature prediction sensitivities over snow-covered land in North America using machine learning algorithms and the Advanced Microwave Scanning Radiometer. Remote Sens. Environ. 2015, 170, 153–165. [Google Scholar] [CrossRef]
Xue, Y.; Forman, B.A.; Reichle, R.H. Estimating snow mass in North America through assimilation of Advanced Microwave Scanning Radiometer brightness temperature observations using the Catchment land surface model and support vector machines. Water Resour. Res. 2018, 54, 6488–6509. [Google Scholar] [CrossRef]
Durand, M.; Kim, E.J.; Margulis, S.A. Quantifying uncertainty in modeling snow microwave radiance for a mountain snowpack at the point-scale, including stratigraphic effects. IEEE Trans. Geosci. Remote Sens. 2008, 46, 1753–1767. [Google Scholar] [CrossRef]
Kumar, S.V.; Peters-Lidard, C.D.; Tian, Y.; Houser, P.R.; Geiger, J.; Olden, S.; Lighty, L.; Eastman, J.L.; Doty, B.; Dirmeyer, P.; et al. Land information system: An interoperable framework for high resolution land surface modeling. Environ. Modell. Softw. 2006, 21, 1402–1415. [Google Scholar] [CrossRef]
Peters-Lidard, C.D.; Houser, P.R.; Tian, Y.; Kumar, S.V.; Geiger, J.; Olden, S.; Lighty, L.; Doty, B.; Dirmeyer, P.; Adams, J.; et al. High-performance Earth system modeling with NASA/GSFC’s Land Information System. Innov. Syst. Softw. Eng. 2007, 3, 157–165. [Google Scholar] [CrossRef]
Kumar, S.; Peters-Lidard, C.; Tian, Y.; Reichle, R.H.; Geiger, J.; Alonge, C.; Eylander, J.; Houser, P. An integrated hydrologic modeling and data assimilation framework. Computer 2008, 41, 52–59. [Google Scholar] [CrossRef]
Niu, G.-Y.; Yang, Z.-L.; Mitchell, K.E.; Chen, F.; Ek, M.B.; Barlage, M.; Kumar, A.; Manning, K.; Niyogi, D.; Rosero, E.; et al. The community Noah land surface model with multiparameterization options (Noah-MP): 1. Model description and evaluation with local-scale measurements. J. Geophys. Res. 2011, 116. [Google Scholar] [CrossRef] [Green Version]
Yang, Z.-L.; Niu, G.-Y.; Mitchell, K.E.; Chen, F.; Ek, M.B.; Barlage, M.; Longuevergne, L.; Manning, K.; Niyogi, D.; Tewari, M.; et al. The community Noah land surface model with multiparameterization options (Noah-MP): 2. Evaluation over global river basins. J. Geophys. Res. 2011, 116. [Google Scholar] [CrossRef]
Reichle, R.H.; Koster, R.D. Assessing the impact of horizontal error correlations in background fields on soil moisture estimation. J. Hydrometeor. 2003, 4, 1229–1242. [Google Scholar] [CrossRef]
Clark, M.P.; Slater, A.G.; Barrett, A.P.; Hay, L.E.; McCabe, G.J.; Rajagopalan, B.; Leavesley, G.H. Assimilation of snow covered area information into hydrologic and land-surface models. Adv. Water Resour. 2006, 29, 1209–1221. [Google Scholar] [CrossRef]
Forman, B.A.; Reichle, R.H. The spatial scale of model errors and assimilated retrievals in a terrestrial water storage assimilation system. Water Resour. Res. 2013, 49. [Google Scholar] [CrossRef]
Kumar, S.V.; Reichle, R.H.; Peters-Lidard, C.D.; Koster, R.D.; Zhan, X.; Crow, W.T.; Eylander, J.B.; Houser, P.R. A land surface data assimilation framework using the land information system: Description and applications. Adv. Water Resour. 2008, 31, 1419–1432. [Google Scholar] [CrossRef]
Vapnik, V.N. An overview of statistical learning theory. IEEE Trans. Neural Netw. 1999, 10, 988–999. [Google Scholar] [CrossRef] [PubMed]
Cherkassky, V.; Ma, Y. Practical selection of SVM parameters and noise estimation for SVM regression. Neural Netw. 2004, 17, 113–126. [Google Scholar] [CrossRef] [Green Version]
Chang, C.C.; Lin, C.J. LIBSVM: A library for support vector machines. ACM Trans. Intell. Syst. Technol. 2011, 2, 1–27. [Google Scholar] [CrossRef]
Bosilovich, M.G.; Lucchesi, R.; Suarez, M. MERRA-2: Initial Evaluation of the Climate; Technical Report No. 104606; NASA: Washington, DC, USA, 2015; p. 73.
Huffman, G.J.; Bolvin, D.T.; Nelkin, E.J.; Wolff, D.B.; Adler, R.F.; Gu, G.; Hong, Y.; Bowman, K.P.; Stocker, E.F. The TRMM multisatellite precipitation analysis (TMPA): Quasi-global, multiyear, combined-sensor precipitation estimates at fine scales. J. Hydrometeor. 2007, 8, 38–55. [Google Scholar] [CrossRef]
Evensen, G. Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res. 1994, 99, 10143–10162. [Google Scholar] [CrossRef]
Reichle, R.H.; Walker, J.P.; Koster, R.D.; Houser, P.R. Extended versus ensemble Kalman filtering for land data assimilation. J. Hydrometeor. 2002, 3, 728–740. [Google Scholar] [CrossRef]
Brodzik, M.J.; Long, D.G.; Hardman, M.A.; Paget, A.; Armstrong, R. MEaSUREs Calibrated Enhanced-Resolution Passive Microwave Daily EASE-Grid 2.0 Brightness Temperature ESDR, Version 1; NASA National Snow and Ice Data Center Distributed Active Archive Center: Boulder, CO, USA, 2016. [Google Scholar]
Yoon, Y.; Kumar, S.V.; Forman, B.A.; Zaitchik, B.F.; Kwon, Y.; Qian, Y.; Rupper, S.; Maggioni, V.; Houser, P.; Kirschbaum, D.; et al. Evaluating the uncertainty of terrestrial water budget components over High Mountain Asia. Front. Earth Sci. 2019, 7, 120. [Google Scholar] [CrossRef]
Funk, C.; Peterson, P.; Landsfeld, M.; Pedreros, D.; Verdin, J.; Shukla, S.; Husak, G.; Rowland, J.; Harrison, L.; Hoell, A.; et al. The climate hazards infrared precipitation with stations–A new environmental record for monitoring extremes. Sci. Data 2015, 2, 150066. [Google Scholar] [CrossRef] [PubMed]
Molteni, F.; Buizza, R.; Palmer, T.N.; Petroliagis, T. The ECMWF ensemble prediction system: Methodology and validation. Q. J. R. Meteorol. Soc. 1996, 122, 73–119. [Google Scholar] [CrossRef]
Ahmad, J.A.; Forman, B.A.; Kwon, Y. Analyzing machine learning predictions of passive microwave brightness temperature spectral difference over snow-covered terrain in High Mountain Asia. Front. Earth Sci. 2019, 7, 212. [Google Scholar] [CrossRef]
Farr, T.G.; Rosen, P.A.; Caro, E.; Crippen, R.; Duren, R.; Hensley, S.; Kobrick, M.; Paller, M.; Rodriguez, E.; Roth, L.; et al. The shuttle radar topography mission. Rev. Geophys. 2007, 45. [Google Scholar] [CrossRef]
Friedl, M.A.; McIver, D.K.; Hodges, J.C.F.; Zhang, X.Y.; Muchoney, D.; Strahler, A.H.; Woodcock, C.E.; Gopal, S.; Schneider, A.; Cooper, A.; et al. Global land cover mapping from MODIS: Algorithms and early results. Remote Sens. Environ. 2002, 83, 287–302. [Google Scholar] [CrossRef]
Hengl, T.; de Jesus, J.M.; MacMillan, R.A.; Batjes, N.H.; Heuvelink, G.B.M.; Ribeiro, E.; Samuel-Rosa, A.; Kempen, B.; Leenaars, J.G.B.; Walsh, M.G.; et al. SoilGrids1km–Global soil information based on automated mapping. PLoS ONE 2014, 9, e105992. [Google Scholar] [CrossRef]
Ball, J.T.; Woodrow, I.E.; Berry, J.A. A model predicting stomatal conductance and its contribution to the control of photosynthesis under different environmental conditions. In Process in Photosynthesis Research; Biggins, J., Ed.; Martinus Nijhoff: Dordrecht, The Netherlands, 1987; pp. 221–234. ISBN 978-94-017-0521-9. [Google Scholar]
Chen, F.; Dudhia, J. Coupling an advanced land surface-hydrology model with the Penn State-NCAR MM5 modeling system. Part I: Model implementation and sensitivity. Mon. Weather Rev. 2001, 129, 569–585. [Google Scholar] [CrossRef]
Niu, G.-Y.; Yang, Z.-L.; Dickinson, R.E.; Gulden, L.E.; Su, H. Development of a simple groundwater model for use in climate models and evaluation with Gravity Recovery and Climate Experiment data. J. Geophys. Res. 2007, 112. [Google Scholar] [CrossRef]
Brutsaert, W.A. Evaporation into the Atmosphere; D. Reidel: Dordrecht, The Netherlands, 1982. [Google Scholar]
Niu, G.-Y.; Yang, Z.-L. Effects of frozen soil on snowmelt runoff and soil water storage at a continental scale. J. Hydrometeorol. 2006, 7, 937–952. [Google Scholar] [CrossRef]
Yang, R.; Friedl, M.A. Modeling the effects of three-dimensional vegetation structure on surface radiation and energy balance in boreal forests. J. Geophys. Res. 2003, 108, 8615. [Google Scholar] [CrossRef]
Niu, G.-Y.; Yang, Z.-L. The effects of canopy processes on snow surface energy and mass balances. J. Geophys. Res. 2004, 109. [Google Scholar] [CrossRef]
Yang, Z.-L.; Dickinson, R.E. Description of the Biosphere-Atmosphere Transfer Scheme (BATS) for the soil moisture workshop and evaluation of its performance. Global Planet. Chang. 1996, 13, 117–134. [Google Scholar] [CrossRef]
Jordan, R. A One-Dimensional Temperature Model for a Snow Cover. Technical Documentation for SNTHERM.89; Technical Report No. 91-16; U.S. Army Corps of Engineers: Washington, DC, USA, 1991.
Derksen, C. The contribution of AMSR-E 18.7 and 10.7 GHz measurements to improved boreal forest snow water equivalent retrievals. Remote Sens. Environ. 2008, 112, 2701–2710. [Google Scholar] [CrossRef]
Kelly, R. The AMSR-E snow depth algorithm: Description and initial results. J. Remote Sens. Soc. Jpn. 2009, 29, 307–317. [Google Scholar] [CrossRef]
Derksen, C.; Toose, P.; Rees, A.; Wang, L.; English, M.; Walker, A.; Sturm, M. Development of a tundra-specific snow water equivalent retrieval algorithm for satellite passive microwave data. Remote Sens. Environ. 2010, 114, 1699–1709. [Google Scholar] [CrossRef]
Foster, J.L.; Hall, D.K.; Eylander, J.B.; Riggs, G.A.; Nghiem, S.V.; Tedesco, M.; Kim, E.; Montesano, P.M.; Kelly, R.E.J.; Casey, K.A.; et al. A blended global snow product using visible, passive microwave, and scatterometer satellite data. Int. J. Remote Sens. 2011, 32, 1371–1395. [Google Scholar] [CrossRef]
Pulliainen, J.; Kärnä, J.-P.; Hallikainen, M. Development of geophysical retrieval algorithms for the MIMR. IEEE Trans. Geosci. Remote Sens. 1993, 31, 268–277. [Google Scholar] [CrossRef]
Turner, D.D.; Clough, S.A.; Liljegren, J.C.; Clothiaux, E.E.; Cady-Pereira, K.E.; Gaustad, K.L. Retrieving liquid water path and precipitable water vapor from the Atmospheric Radiation Measurement (ARM) microwave radiometers. IEEE Trans. Geosci. Remote Sens. 2007, 45, 3680–3690. [Google Scholar] [CrossRef]
Walker, A.; Goodison, B. Discrimination of a wet snow cover using passive microwave satellite data. Ann. Glaciol. 1993, 17, 307–311. [Google Scholar] [CrossRef] [Green Version]
Xue, Y.; Forman, B.A. Atmospheric and forest decoupling of passive microwave brightness temperature observations over snow-covered terrain in North America. IEEE J. Selec. Top. Appl. Earth Obs. Remote Sens. 2017, 10, 3172–3189. [Google Scholar] [CrossRef]
Abdalati, W.; Steffen, K. Passive microwave-derived snow melt regions on the Greenland ice sheet. Geophys. Res. Lett. 1995, 22, 787–790. [Google Scholar] [CrossRef]
Savoie, M.H.; Armstrong, R.L.; Brodzik, M.J.; Wang, J.R. Atmospheric corrections for improved satellite passive microwave snow cover retrievals over the Tibet Plateau. Remote Sens. Environ. 2009, 113, 2661–2669. [Google Scholar] [CrossRef]
Clifford, D. Global estimates of snow water equivalent from passive microwave instruments: History, challenges and future developments. Int. J. Remote Sens. 2010, 31, 3707–3726. [Google Scholar] [CrossRef]
Sturm, M.; Holmgren, J.; Liston, G.E. A seasonal snow cover classification system for local to regional applications. J. Clim. 1995, 8, 1261–1283. [Google Scholar] [CrossRef]
Ramsay, B. The interactive multisensory snow and ice mapping system. Hydrol. Proc. 1998, 12, 1537–1546. [Google Scholar] [CrossRef]
Helfrich, S.R.; McNamara, D.; Ramsay, B.H.; Baldwin, T.; Kasheta, T. Enhancements to, and forthcoming developments in the Interactive Multisensor Snow and Ice Mapping System (IMS). Hydrol. Proc. 2007, 21, 1576–1586. [Google Scholar] [CrossRef]
Reichle, R.H.; Koster, R.D.; Liu, P.; Mahanama, P.P.; Njoku, E.G.; Owe, M. Comparison and assimilation of global soil moisture retrievals from the Advanced Microwave Scanning Radiometer for the Earth Observing System (AMSR-E) and the Scanning Multichannel Microwave Radiometer (SMMR). J. Geophys. Res. 2007, 112. [Google Scholar] [CrossRef]
Reichle, R.H.; Crow, W.T.; Keppenne, C.L. An adaptive ensemble Kalman filter for soil moisture data assimilation. Water Resour. Res. 2008, 44. [Google Scholar] [CrossRef]
Kumar, S.V.; Reichle, R.H.; Koster, R.D.; Crow, W.T.; Peters-Lidard, C.D. Role of subsurface physics in the assimilation of surface soil moisture observations. J. Hydrometeor. 2009, 10, 1534–1547. [Google Scholar] [CrossRef]
Kumar, S.V.; Zaitchik, B.F.; Peters-Lidard, C.D.; Rodell, M.; Reichle, R.; Li, B.; Jasinski, M.; Mocko, D.; Getirana, A.; De Lannoy, G.; et al. Assimilation of gridded GRACE terrestrial water storage estimates in the North American Land Data Assimilation System. J. Hydrometeor. 2016, 17, 1951–1972. [Google Scholar] [CrossRef]
Forman, B.A.; Reichle, R.H.; Rodell, M. Assimilation of terrestrial water storage from GRACE in a snow-dominated basin. Water Resour. Res. 2012, 48. [Google Scholar] [CrossRef]
Durand, M.; Margulis, S.A. Correcting first-order errors in snow water equivalent estimates using a multifrequency, multiscale radiometric data assimilation scheme. J. Geophys. Res. 2007, 112. [Google Scholar] [CrossRef]
Wester, P.; Mishra, A.; Mukherji, A.; Shrestha, A.B. The Hindu Kush Himalaya Assessment—Mountains, Climate Change, Sustainability and People; Springer Nature Switzerland AG: Cham, Switzerland, 2018. [Google Scholar]
GLIMS; National Snow and Ice Data Center. Global Land Ice Measurements from Space Glacier Database; National Snow and Ice Data Center: Boulder, CO, USA, 2005.
Kim, E.J.; England, A.W. A yearlong comparison of plot-scale and satellite footprint-scale 19 and 37 GHz brightness temperature of the Alaskan North Slope. J. Geophys. Res. 2003, 108, 4388–4403. [Google Scholar] [CrossRef]
Schanda, E.; Mätzler, C.; Kunzi, K. Microwave remote sensing of snow cover. Int. J. Remote Sens. 1983, 4, 149–158. [Google Scholar] [CrossRef]
Lemmetyinen, J.; Derksen, C.; Toose, P.; Proksch, M.; Pulliainen, J.; Kontu, A.; Rautiainen, K.; Seppänen, J.; Hallikainen, M. Simulating seasonally and spatially varying snow cover brightness temperature using HUT snow emission model and retrieval of a microwave effective grain size. Remote Sens. Environ. 2015, 156, 71–95. [Google Scholar] [CrossRef]
Fuller, M.C.; Derksen, C.; Yackel, J. Plot scale passive microwave measurements and modeling of layered snow using the multi-layered HUT model. Can. J. Remote Sens. 2015, 41, 219–231. [Google Scholar] [CrossRef]
Cumming, W.A. The dielectric properties of ice and snow at 3.2 centimeters. J. Appl. Phys. 1952, 23, 768–773. [Google Scholar] [CrossRef]
Grenfell, T.C. Surface-based passive microwave studies of multiyear sea ice. J. Geophys. Res. 1992, 97, 3485–3501. [Google Scholar] [CrossRef]
Eppler, D.T.; Farmer, D.; Lohanick, A.W.; Anderson, M.R.; Cavalieri, D.J.; Comiso, J.; Gloersen, P.; Garrity, C.; Grenfell, T.C.; Hallikainen, M.; et al. Passive microwave signatures of sea ice. In Microwave Remote Sensing of Sea Ice; Carsey, F.D., Ed.; American Geophysical Union: Washington, DC, USA, 1992; pp. 47–71. [Google Scholar]
Foster, J.L.; Chang, A.T.C.; Hall, D.K.; Rango, A. Derivation of snow water equivalent in boreal forests using microwave radiometry. Arctic 1991, 44, 147–152. [Google Scholar] [CrossRef]
Chang, A.T.C.; Foster, J.L.; Hall, D.K. Effects of forest on the snow parameters derived from microwave measurements during the BOREAS winter field experiment. Hydrol. Proc. 1996, 10, 1565–1574. [Google Scholar] [CrossRef]
Pampaloni, P. Microwave radiometry of forests. Waves Random Media 2004, 14, S275–S298. [Google Scholar] [CrossRef]
Hallikainen, M.T.; Jolma, P.A. Comparison of algorithms for retrieval of snow water equivalent from Nimbus-7 SMMR data in Finland. IEEE Trans. Geosci. Remote Sens. 1992, 30, 124–131. [Google Scholar] [CrossRef]
Roy, A.; Royer, A.; Wigneron, J.-P.; Langlois, A.; Bergeron, J.; Cliche, P. A simple parameterization for a boreal forest radiative transfer model at microwave frequencies. Remote Sens. Environ. 2012, 124, 371–383. [Google Scholar] [CrossRef]
Pellarin, T.; Kerr, Y.H.; Wigneron, J.-P. Global simulations of brightness temperature at 6.6 and 10.7 GHz over land based on SMMR data set analysis. IEEE Trans. Geosci. Remote Sens. 2006, 44, 2492–2505. [Google Scholar] [CrossRef]
Grant, J.P.; Saleh, K.; Wigneron, J.-P.; Guglielmetti, M.; Kerr, Y.H.; Schwank, M.; Skou, N.; Van deGriend, A. Calibration of the L-MEB model over a coniferous and a deciduous forest. IEEE Trans. Geosci. Remote Sens. 2008, 46, 808–818. [Google Scholar] [CrossRef]
Kwon, Y.; Toure, A.M.; Yang, Z.-L.; Rodell, M.; Picard, G. Error characterization of coupled land surface–Radiative transfer models for snow microwave radiance assimilation. IEEE Trans. Geosci. Remote Sens. 2015, 53, 5247–5268. [Google Scholar] [CrossRef]
Sandells, M.; Essery, R.; Rutter, N.; Wake, L.; Leppänen, L.; Lemmetyinen, J. Microstructure representation of snow in coupled snowpack and microwave emission models. The Cryosphere 2017, 11, 229–246. [Google Scholar] [CrossRef] [Green Version]
Picard, G.; Sandells, M.; Löwe, H. SMRT: An active-passive microwave radiative transfer model for snow with multiple microstructure and scattering formulations (v1.0). Geosci. Model Dev. 2018, 11, 2763–2788. [Google Scholar] [CrossRef]
Roy, A.; Picard, G.; Royer, A.; Montpetit, B.; Dupont, F.; Langlois, A.; Derksen, C.; Champollion, N. Brightness temperature simulations of the Canadian seasonal snowpack driven by measurements of the snow specific surface area. IEEE Trans. Geosci. Remote Sens. 2013, 51, 4692–4704. [Google Scholar] [CrossRef]
Ogata, K. Modern Control Engineering; Prentice Hall International Inc.: Upper Saddle River, NJ, USA, 1997. [Google Scholar]
Gelb, A. Applied Optimal Estimation; M.I.T. Press: Cambridge, MA, USA, 1974. [Google Scholar]
Pulvirenti, L.; Pierdicca, N.; Marzano, F.S. Topographic effects on the surface emissivity of a mountainous area observed by a spaceborne microwave radiometer. Sensors 2008, 8, 1459–1474. [Google Scholar] [CrossRef]
Rodríguez-Fernández, N.; de Rosnay, P.; Albergel, C.; Richaume, P.; Aires, F.; Prigent, C.; Kerr, Y. SMOS Neural Network Soil Moisture Data Assimilation in a Land Surface Model and Atmospheric Impact. Remote Sens. 2019, 11, 1334. [Google Scholar] [CrossRef]

Figure 1. Schematic diagram for the synthetic twin experiment using the Land Information System (LIS) and a well-trained support vector machine (SVM) as the observation operator.

Figure 2. High Mountain Asia (HMA) study domain with experimental grid points (i.e., Grids #1 to #4): (a) elevation, (b) dominant land-cover types, (c) forest cover fraction, (d) snow class, (e) multiyear mean winter (December, January, and February) precipitation rate, and (f) wet snow days. Grids #5 and #6 are glaciated regions presented in Figure 4e,f respectively. Digital elevation estimates derived from the Shuttle Radar Topography Mission (SRTM) elevation data [46]. The Terra Moderate Resolution Imaging Spectroradiometer (MODIS) sensor-based International Geosphere-Biosphere Program (IGBP) land classification map [47] was used for the dominant land-cover types and forest cover fraction. The snow classification was based on Sturm et al. [69]. The multiyear mean winter precipitation rate was estimated from MERRA-2 precipitation, and the percentage of wet snow days were obtained from the synthetic truth simulation.

Figure 3. Time series of snow water equivalent (SWE) from the synthetic truth (black dots), open loop (OL; blue lines), and data assimilation (DA; red lines) for the four different experimental locations: (a) Grid #1, (b) Grid #2, (c) Grid #3, and (d) Grid #4 (see Table 3 for geographical characteristics of the grid cells). The solid and dotted lines denote the ensemble mean and range of the simulated SWE, respectively.

Figure 4. Correlations between the Noah-MP simulated SWE and SVM-based Brightness temperature spectral difference (ΔT_B) 18 V–36 V estimates for Grid #3 (a: dry snow, b: wet snow), Grid #4 (c: dry snow, d: wet snow), and glaciated regions (e,f) marked in Figure 2. A SWE threshold of 200 mm was used for the slope reversal associated with signal saturation.

Figure 5. Time series of SWE (a,e), snow density (b,f), snow liquid water (c,g), and snow (top layer) temperature (d,h) including the synthetic truth (black dots), OL ensemble mean (blue line), and DA ensemble mean (red line) for Grid #3 and Grid #4. The dotted lines denote the ensemble range of the simulated snow states.

Figure 6. Time series of SWE (a,e), snow density (b,f), snow liquid water (c,g), and snow (top layer) temperature (d,h) including the synthetic truth (black dots), OL ensemble mean (blue line), and DA ensemble mean (red line) for Grid #1 and Grid #2. The dotted lines denote the ensemble range of the simulated snow states.

Figure 7. Time series of synthetic ΔT_B observations (black dots) and the predicted prior ΔT_B (red line) for Grid #3 and Grid #4: 10.65 V–36.5 V (a,f), 10.65 H–36.5 H (b,g), 18.7 V–36.5 V (c,h), and 18.7 H–36.5 H (d,i). The solid and dotted lines denote the ensemble mean and range of the predicted prior ΔT_B, respectively. The normalized innovation sequence is plotted in (e,j).

Figure 8. Time series of SWE from the synthetic truth (black dots), OL (blue line), and DA (red line) cases for Grid #3 and Grid #4 (see Table 4 for descriptions of the rules applied). The solid and dotted lines denote the ensemble mean and range of the simulated SWE, respectively. No rule was applied in (a,e).

Figure 9. Maps of SWE RMSE for OL and DA cases (left column) and their differences (RMSE_DA−RMSE_OL; middle column): (a) OL, (b,c) DA (baseline), (d,e) DA (rule 1), (f,g) DA (rule 2), (h,i) DA (rule 3), and (j,k) DA (rule 4) (see Table 4 for descriptions of the rules applied). Negative (blue color) and positive (red color) RMSE differences represent improvements and degradations in the SWE estimates, respectively, as a result of invoking assimilation. Glaciated regions (gray color) were excluded from the analysis. Maps of the (l) maximum SWE in the synthetic truth, (m) percentage of deep snow days (SWE ≥ 200 mm) in the synthetic truth, and (n) percentage of deep snow days (SWE ≥ 200 mm) in the OL case are presented for reference.

Figure 10. Examples of support vector machine (SVM) sensitivity associated with different perturbations about the nominal geophysical (snow) states (a: controllable case; b: uncontrollable case).

Table 1. Noah Land Surface Model with multiparameterization (Noah-MP) v3.6 options of physical parameterization schemes used in this study. (See Niu et al. [28] for additional details).

Process	Selected Option	References
Vegetation	Dynamic	-
Canopy stomatal resistance	Ball–Berry	[49]
Soil moisture factor for stomatal resistance	Noah type using soil moisture	[50]
Runoff and groundwater	SIMGM ^a	[51]
Surface layer drag coefficient	M-O ^b	[52]
Supercooled liquid water (or ice fraction) in frozen soil	NY06	[53]
Frozen soil permeability	NY06	[53]
Radiation transfer	Modified two-stream	[54,55]
Ground snow surface albedo	BATS ^c	[56]
Partitioning precipitation into rainfall and snowfall	Jordan91	[57]
Lower boundary condition of soil temperature	Original Noah scheme	-
Snow and soil temperature time scheme (only the first layer)	Semi-implicit	-

^a TOPMODEL-based runoff scheme with a simple groundwater scheme; ^b general Monin–Obukhov similarity theory; ^c biosphere–atmosphere transfer scheme.

Table 2. Perturbation parameters applied to meteorological forcing fields for both the OL and DA cases (M: multiplicative; A: additive; AR(1): first-order autoregressive temporal correlation).

Perturbed Meteorological Forcing	Type	Std Dev	AR(1)	Cross-Correlations with Perturbations in
				P	SW	LW	T_air
Precipitation (P)	M	0.5	1 day	−	−0.8	0.5	−0.1
Shortwave radiation (SW)	M	0.3	1 day	−0.8	–	−0.5	0.3
Longwave radiation (LW)	A	50 W m⁻²	1 day	0.5	−0.5	–	0.6
Near surface air temperature (T_air)	A	1 K	1 day	−0.1	0.3	0.6	–

Table 3. Geographical characteristics of the four experimental grid cells.

Grid ID	Latitude	Longitude	Elevation ^a (m)	Dominant Land-Cover Type ^b	Forest Fraction ^b (%)	Glacier Fraction ^c (%)
1	31.125°N	80.375°E	4659	Open shrublands	0	0
2	31.875°N	77.375°E	2986	Mixed forests	54.4	0.96
3	35.875°N	71.125°E	4528	Barren or sparsely vegetated	0.2	10.6
4	37.125°N	71.875°E	4053	Open shrublands	0	0

^a From Shuttle Radar Topography Mission (SRTM) elevation data [46]; ^b from Terra Moderate Resolution Imaging Spectroradiometer (MODIS) sensor-based International Geosphere-Biosphere Program (IGBP) land classification map [47]; ^c from Global Land Ice Measurements from Space (GLIMS) glacier database [79].

Table 4. Descriptions of rule-based updates incorporated into the SVM-DA framework.

Applied Rule	Description
Rule 1	Update SWE only when the standard deviation of the predicted ΔT_B is larger than 0.05 K
Rule 2	Add a thin layer (i.e., 5 mm SWE) when the observed ΔT_B is greater than 5 K but the simulated SWE is 0 mm.
Rule 3	Rule 1 + Rule 2
Rule 4	Rule 3 + Suppress the analysis update when SWE of one or more of the ensemble members is greater than 500 mm.

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Kwon, Y.; Forman, B.A.; Ahmad, J.A.; Kumar, S.V.; Yoon, Y. Exploring the Utility of Machine Learning-Based Passive Microwave Brightness Temperature Data Assimilation over Terrestrial Snow in High Mountain Asia. Remote Sens. 2019, 11, 2265. https://doi.org/10.3390/rs11192265

AMA Style

Kwon Y, Forman BA, Ahmad JA, Kumar SV, Yoon Y. Exploring the Utility of Machine Learning-Based Passive Microwave Brightness Temperature Data Assimilation over Terrestrial Snow in High Mountain Asia. Remote Sensing. 2019; 11(19):2265. https://doi.org/10.3390/rs11192265

Chicago/Turabian Style

Kwon, Yonghwan, Barton A. Forman, Jawairia A. Ahmad, Sujay V. Kumar, and Yeosang Yoon. 2019. "Exploring the Utility of Machine Learning-Based Passive Microwave Brightness Temperature Data Assimilation over Terrestrial Snow in High Mountain Asia" Remote Sensing 11, no. 19: 2265. https://doi.org/10.3390/rs11192265

APA Style

Kwon, Y., Forman, B. A., Ahmad, J. A., Kumar, S. V., & Yoon, Y. (2019). Exploring the Utility of Machine Learning-Based Passive Microwave Brightness Temperature Data Assimilation over Terrestrial Snow in High Mountain Asia. Remote Sensing, 11(19), 2265. https://doi.org/10.3390/rs11192265

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Exploring the Utility of Machine Learning-Based Passive Microwave Brightness Temperature Data Assimilation over Terrestrial Snow in High Mountain Asia

Abstract

1. Introduction