Exploring the Utility of Machine Learning-Based Passive Microwave Brightness Temperature Data Assimilation over Terrestrial Snow in High Mountain Asia

This study explores the use of a support vector machine (SVM) as the observation operator within a passive microwave brightness temperature data assimilation framework (herein SVM-DA) to enhance the characterization of snow water equivalent (SWE) over High Mountain Asia (HMA). A series of synthetic twin experiments were conducted with the NASA Land Information System (LIS) at a number of locations across HMA. Overall, the SVM-DA framework is effective at improving SWE estimates (~70% reduction in RMSE relative to the Open Loop) for SWE depths less than 200 mm during dry snowpack conditions. The SVM-DA framework also improves SWE estimates in deep, wet snow (~45% reduction in RMSE) when snow liquid water is well estimated by the land surface model, but can lead to model degradation when snow liquid water estimates diverge from values used during SVM training. In particular, two key challenges of using the SVM-DA framework were observed over deep, wet snowpacks. First, variations in snow liquid water content dominate the brightness temperature spectral difference (ΔTB) signal associated with emission from a wet snowpack, which can lead to abrupt changes in SWE during the analysis update. Second, the ensemble of SVM-based predictions can collapse (i.e., yield a near-zero standard deviation across the ensemble) when prior estimates of snow are outside the range of snow inputs used during the SVM training procedure. Such a scenario can lead to the presence of spurious error correlations between SWE and ΔTB, and as a consequence, can result in degraded SWE estimates from the analysis update. These degraded analysis updates can be largely mitigated by applying rule-based approaches. For example, restricting the SWE update when the standard deviation of the predicted ΔTB is greater than 0.05 K helps prevent the occurrence of filter divergence. Similarly, adding a thin layer (i.e., 5 mm) of SWE when the synthetic ΔTB is larger than 5 K can improve SVM-DA performance in the presence of a precipitation dry bias. The study demonstrates that a carefully constructed SVM-DA framework cognizant of the inherent limitations of passive microwave-based SWE estimation holds promise for snow mass data assimilation.


Introduction
High Mountain Asia (HMA) is a vast, high elevation mid-latitude region comprising the world's highest mountains. HMA snow and glaciers (that result from the accumulation of snow) are the earth's largest freshwater reservoirs outside the polar regions, and their meltwater is the main water source for more than a billion people that live downstream, especially in the Indus river basin [1,2]. While an increase (thickening) in glacier mass has been observed in the West Kunlun and Karakoram regions (the so-called 'Karakoram anomaly'; [3]), in recent decades most of HMA has been progressively losing glaciers [4][5][6][7], which could negatively impact regional water supply and native ecosystems. In this regard, accurate characterization of the spatiotemporal variability of snow and ice across the HMA region is an urgent issue for water resource management under global warming.
Ground-based monitoring of land surface snow cover and glaciers is quite limited in this region because of the complex topography and extreme climatic conditions. Satellite remote sensing and land surface modeling are alternative approaches, but they also have their own limitations. For example, remote sensing observations are limited by errors stemming from instrument noise and retrieval algorithms, whereas land surface models (LSMs) have large uncertainties attributed to model parameterization, initial conditions, and meteorological forcings (i.e., boundary conditions). Fortunately, data assimilation (DA) is an attractive alternative for generating high-quality estimates as it merges satellite-derived observations with model predictions, while explicitly accounting for their relative uncertainties.
Given the fact that passive microwave (PMW) brightness temperature (T B ) is sensitive to snow volume scattering, and thus, provides useful information on snow mass variations, T B observations can be directly assimilated into an LSM to improve snow estimates. T B assimilation in the context of snow was first applied by Durand et al. [8] as part of a set of synthetic experiments. Durand et al. [9] demonstrated from a point-scale study that T B assimilation outperforms traditional T B -based retrieval algorithms (e.g., [10]) in estimating snow depth. Many follow-on studies (e.g., [11][12][13][14][15]) showed similarly encouraging results. More recently, Kwon et al. [16,17] employed T B assimilation for continental-scale snow storage estimates and Larue et al. [18] and Kim et al. [19] further improved the method. The focus of this current study was to characterize snow water equivalent (SWE) in the HMA region using T B assimilation methods.
To assimilate T B observations, the DA framework requires a forward observation operator that maps the relevant LSM state variables (e.g., SWE) into the corresponding observation space (i.e., T B ). Whereas the above-mentioned previous T B assimilation studies used a microwave radiative transfer model (RTM) as the observation operator, we use a machine learning algorithm. Forman et al. [20] first employed an artificial neural network (ANN) to map geophysical (model) states into PMW T B space to explore the feasibility of using machine learning algorithms as an observation operator within a DA framework. They showed that an ANN could be used to estimate T B at multiple frequencies and polarizations over snow-covered land using LSM snow outputs as inputs to the ANN. Forman and Reichle [21] additionally explored the use of another machine learning technique, i.e., support vector machine (SVM) regression, and suggested that the SVM is a more effective alternative to ANNs for snow T B assimilation purposes due to the increased sensitivity of SVMs to snow mass relative to ANNs. Through sensitivity experiments, Xue and Forman [22] supported the findings of Forman and Reichle [21] with respect to the use of SVMs within a DA framework. More recently, Xue et al. [23] conducted T B spectral difference assimilation over snow-covered areas in North America using a well-trained SVM and presented promising initial results. Building on these prior efforts, we adopted the use of SVM regression algorithm in this study, but applying it to a different land surface model using a different set of boundary conditions in a different part of the globe where the accurate estimation of snow is very challenging.
Compared to RTMs, the use of SVMs in the data assimilation environment is computationally more efficient. As SVMs are pretrained prior to the assimilation process, they essentially act as an emulator to the computationally more expensive RTMs [16]. RTM-based DA approaches typically Remote Sens. 2019, 11, 2265 3 of 29 demand high-fidelity snow models to meet all of the input requirements (e.g., [24]). The SVM training process can also help mitigate the negative effects of LSM's inaccurate representations of snow layers, snow grain size, and ice lens (or depth hoar) within the snowpack by providing a more bulk, mixed-pixel response [21]. SVMs generally provide a more flexible environment for characterizing snow variability over a large region of space, such as HMA, by assimilating T B at multiple frequencies and multiple polarizations. On the other hand, the "black-box" nature of SVMs may have limitations for snow T B data assimilation, which are discussed in this paper.
The T B data assimilation framework used in this study includes the National Aeronautics and Space Administration (NASA) Land Information System (LIS; [25][26][27]) in conjunction with a well-trained SVM (herein SVM-DA framework for simplicity). The Noah Land Surface Model with multi-parameterization options (Noah-MP version 3.6; [28,29]) was used as the prior model for simulating snow. In view of the lack of available measurements for validation in the study area, a synthetic twin experiment (e.g., [30][31][32]) was performed to assess the potential of our SVM-DA framework for enhancing snow estimates over the HMA region. Note that the focus of this article is on terrestrial snow only. We consider the estimation of snow on glaciers to be beyond the scope of this paper.
Research questions addressed in this study include: (1) Is the SVM-DA framework effective in improving snow mass estimates across the HMA region? (2) What limitations does the SVM-DA framework possess? (3) Can these limitations be avoided through the recognition and appropriate handling of instances where the SVM forward operator is suboptimal?

SVM-DA Framework and Synthetic Twin Experiment
The SVM-DA framework leveraged NASA LIS as the software "wrapper" around both the forward (prognostic) land surface model and the observation operator (Figure 1), which in this case consisted of well-trained SVMs. LIS is a comprehensive system for land surface modeling and data assimilation, and features a flexible and extensible design using object-oriented programming. In LIS, various advanced LSMs (LIS-LSM), ground measurements, satellite products, sequential and nonsequential DA algorithms (LIS-DA), and tools for high-performance computing and data management are integrated to generate estimates of land surface states and fluxes (see Kumar et al. [33] for more details).
Remote Sens. 2019, 11, x FOR PEER REVIEW 3 of 29 snow grain size, and ice lens (or depth hoar) within the snowpack by providing a more bulk, mixedpixel response [21]. SVMs generally provide a more flexible environment for characterizing snow variability over a large region of space, such as HMA, by assimilating TB at multiple frequencies and multiple polarizations. On the other hand, the "black-box" nature of SVMs may have limitations for snow TB data assimilation, which are discussed in this paper.
The TB data assimilation framework used in this study includes the National Aeronautics and Space Administration (NASA) Land Information System (LIS; [25][26][27]) in conjunction with a welltrained SVM (herein SVM-DA framework for simplicity). The Noah Land Surface Model with multiparameterization options (Noah-MP version 3.6; [28,29]) was used as the prior model for simulating snow. In view of the lack of available measurements for validation in the study area, a synthetic twin experiment (e.g., [30][31][32]) was performed to assess the potential of our SVM-DA framework for enhancing snow estimates over the HMA region. Note that the focus of this article is on terrestrial snow only. We consider the estimation of snow on glaciers to be beyond the scope of this paper.
Research questions addressed in this study include: (1) Is the SVM-DA framework effective in improving snow mass estimates across the HMA region? (2) What limitations does the SVM-DA framework possess? (3) Can these limitations be avoided through the recognition and appropriate handling of instances where the SVM forward operator is suboptimal?

SVM-DA Framework and Synthetic Twin Experiment
The SVM-DA framework leveraged NASA LIS as the software "wrapper" around both the forward (prognostic) land surface model and the observation operator (Figure 1), which in this case consisted of well-trained SVMs. LIS is a comprehensive system for land surface modeling and data assimilation, and features a flexible and extensible design using object-oriented programming. In LIS, various advanced LSMs (LIS-LSM), ground measurements, satellite products, sequential and nonsequential DA algorithms (LIS-DA), and tools for high-performance computing and data management are integrated to generate estimates of land surface states and fluxes (see Kumar et al. [33] for more details).  Among the LSMs in LIS, we used Noah-MP v3.6 as the forward (prognostic) LSM to simulate snow-related land surface processes, including the accumulation and ablation of terrestrial snow. In the SVM-DA framework, a well-trained SVM was used as the observation operator to predict T B from prior model estimates of snow. The well-trained SVM was derived from statistical learning theory and uses support vector regression [34][35][36] via the integration of the LIBSVM library [36] into the LIS framework. As a data-driven model, the SVM can reproduce nonlinear processes [35], such as relationships between snow's physical properties and electromagnetic characteristics of snow (e.g., [21]). Once trained, the approximating function used in SVM regression is given as: where y is the [1 × n] input vector from Noah-MP simulations at a given location (i.e., model pixel); n is the number of snow-related variables; m is the total number of training targets (i.e., number of T B measurements for a given location at different times); α and α * are the dual Lagrangian multipliers; k(x i ,y) is a Gaussian radial basis kernel function; x is the [m × n] training matrix; and δ is a bias coefficient. The SVM-DA framework is composed of two phases: (1) SVM training, and (2) SVM prediction during the assimilation phase. As we conducted a synthetic twin experiment, an additional procedure of generating synthetic observations was required as part of the experiment (see Figure 1). During phase 1, Noah-MP was forced by the NASA Modern-Era Retrospective analysis for Research and Applications, version 2 (MERRA-2; [37]) atmospheric reanalysis that yielded estimates of snow-related model states (e.g., SWE and snow depth), which served as the synthetic snow truth for assessing DA performance. The SVM was trained using nine years (1 September 2002 to 1 September 2011) of model simulations in conjunction with PMW T B observations from the Advanced Microwave Scanning Radiometer for Earth Observing System (AMSR-E) (see Section 2.3.2). The trained SVM was then used to produce the synthetic ∆T B truth as a function of the snow states generated by the "true" Noah-MP simulation. Finally, synthetic ∆T B observations were generated by adding random errors, i.e., white Gaussian noise with zero-mean and standard deviation of 3 K [23], to the "true" synthetic ∆T B . The errors were assumed to be temporally and horizontally uncorrelated.
During phase 2, a prior ensemble (with 20 different replicates) of snow-related model states were estimated by Noah-MP driven by stochastic MERRA-2 forcing. However, the precipitation in MERRA-2 was replaced with precipitation estimates from the Tropical Rainfall Measuring Mission (TRMM) multisatellite precipitation analysis (TMPA; [38]) so that the differences between the two different sets of precipitation estimates (i.e., MERRA-2 precipitation and TRMM precipitation) can be representative of the true errors that could be encountered by an operational assimilation system. The well-trained SVM, generated during phase 1, was used to predict an ensemble of ∆T B from the ensemble of prior estimates. Using the ensemble of predicted ∆T B along with the synthetic ∆T B observations generated during phase 1, LIS-DA updated the ensemble of prior (geophysical) estimates with an ensemble Kalman filter (EnKF; [39,40]) algorithm. The updated (posterior) ensemble of snow states was then used as the initial conditions for subsequent model forecasts.

Data Sets
The AMSR-E microwave T B data used during the SVM training were obtained from the MEaSUREs (NASA Making Earth System Data Records for Use in Research Environments) Calibrated Enhanced-Resolution Passive Microwave Daily EASE-Grid 2.0 Brightness Temperature Earth System Data Record (https://nsidc.org/data/nsidc-0630/versions/1; [41]). The highest resolution of T B data at 10.65, 18.7, and 36.5 GHz were upscaled as an arithmetic average to the LSM resolution (i.e., 0.25 • × 0.25 • equidistant cylindrical grid) for computational simplicity and then spectral differences (∆T B ) were calculated from the upscaled T B data. Meteorological boundary conditions used in this study include the MERRA-2 atmospheric reanalysis [37] and TRMM precipitation data [38]. MERRA-2 has been updated from its predecessor, MERRA, and is the latest atmospheric reanalysis product produced by the NASA Global Modeling and Assimilation Office (https://gmao.gsfc.nasa.gov/reanalysis/MERRA-2). MERRA-2 has been available from 1980 to the present, and provides global coverage at a spatial resolution of 0.5 • × 0.625 • . The TRMM precipitation (available from https://pmm.nasa.gov/data-access/downloads/trmm) is provided at a spatial resolution of 0.25 • × 0.25 • . It spatially covers 50 • S-50 • N/180 • W-180 • E, and temporally spans from 2001 to 2016. The daily product of the TRMM precipitation (3B42, version 7) was used during the data assimilation experiments.
Yoon et al. [42] conducted an inter-comparison of 10 different precipitation datasets, including MERRA-2 and TRMM precipitation data, and confirmed that the existing precipitation datasets have large uncertainties and variability across HMA. All precipitation datasets agreed with respect to spatial patterns, but their magnitudes and temporal trends were significantly different. Yoon et al. [42] suggested the use of the Climate Hazards Group InfraRed Precipitation with Station data-version 2 (CHIRPS-2; [43]) precipitation in conjunction with other forcing fields from the European Centre for Medium-Range Weather Forecasts (ECMWF; [44]) to drive land surface models over HMA. Meanwhile, Ahmad et al. [45] showed that signs of the sensitivity of SVM-based ∆T B predictions to both the forcings (i.e., MERRA-2 and ECMWF + CHIRPS-2) were comparable, and higher sensitivity values were obtained for MERRA-2. Furthermore, since the current study aimed to demonstrate the feasibility of the SVM-DA framework to improve SWE estimates for the highly uncertain environment of HMA, we concluded that both the MERRA-2 (for the synthetic truth) and TRMM precipitation data (for open-loop (OL) and data assimilation (DA)) could be used in this synthetic study as the differences in precipitation between the two different sets are representative of the real-world errors that could be encountered by an operational modeling and assimilation system. Important surface input data sets for Noah-MP include elevation, land cover types, and soil texture. Elevation estimates were obtained from the Shuttle Radar Topography Mission (SRTM) elevation data [46]. For land cover types, we used the Terra Moderate Resolution Imaging Spectroradiometer (MODIS) sensor-based International Geosphere-Biosphere Programme (IGBP) land classification map [47]. Soil texture data from the International Soil Reference and Information Centre [48] were used as soil inputs to Noah-MP.

Land Surface Model
As noted above, the Noah-MP LSM was used to simulate snow conditions in this study. Noah-MP was improved from its baseline model (i.e., the Noah LSM) by augmenting representations of hydrological processes and land-atmosphere interactions, and by adding multiple parameterization options [28,29]. In Noah-MP, the snowpack is represented by up to three snow layers, depending on the total snow depth. The model simulates snowmelt-refreeze cycles and snow compaction processes such as destructive metamorphism, melt metamorphism, and compaction by snow load or overburden.
Noah-MP initial conditions were obtained from the following two-step process. First, the model was run from 1 January 1980 to 1 May 2002 using a single replicate to consider multiyear snow accumulation, soil moisture dynamics, and shallow groundwater dynamics in high elevation regions while minimizing the computational demand. Second, 20 ensemble members of model initial conditions were generated by running the model from The Noah-MP model integration time step was set to 15 min and daily averaged snow (geophysical) outputs were obtained. The Noah-MP options of physical parameterization schemes used in our experiment are listed in Table 1. More options and detailed descriptions can be found in Niu et al. [28].

SVM Training
We adopted the SVM training procedure as presented in Forman et al. [20] and Forman and Reichle [21]. The training was conducted over a nine-year period from 1 September 2002 to 1 September 2011. A split-sample training procedure was adopted, such that observations for a given training period were excluded from the analysis. A fortnightly (two-week) training period was selected to capture seasonal variability in snow processes while ensuring an adequate training sample size (on the order of a thousand) at most locations.
The selected training targets (i.e., SVM outputs) included spectral differences between the 10.65 and 36.5 GHz channels and between the 18.7 and 36.5 GHz channels at both horizontal (H) and vertical (V) polarization. These three particular frequencies were chosen based on their reported sensitivity to snow associated with preferential scattering (volume) at higher frequency [10,[58][59][60][61]. Kwon et al. [16] suggested that the 23.8 GHz channel can add value to the DA performance when it is additionally assimilated. However, due to the high sensitivity of the 23.8 GHz channel to atmospheric water vapor [62,63], it requires explicit estimation of atmospheric effects on T B [17], which were not modeled in our SVM-DA framework, and hence, the 23.8 GHz channel was not used here. In addition, in order to minimize error resulting from wet snow effects [64], only nighttime AMSR-E measurements were used for training.
Note that atmospheric and vegetation corrections to AMSR-E T B data were not conducted in this study. Xue and Forman [65] showed that the overall atmospheric contribution to the spaceborne T B data was approximately 1 to 3 K. Due to the fact that the snow-covered grid cells in HMA are mostly located in high elevation regions (Figure 2a), the atmospheric contribution over HMA may be much smaller due to the relatively dry and thin atmosphere found at higher elevations [66][67][68], and thus was assumed to be negligible. In terms of overlying vegetation, the HMA study domain largely consists of non-forested land-cover types such as barren (or sparsely vegetated), shrublands, croplands, and grasslands ( Figure 2b). The forest cover fraction in most of the study domain is small, while most of the snow-covered areas are outside densely forested areas ( Figure 2c). In this regard, both atmospheric and vegetation corrections were considered of secondary importance to the performance of the SVM-DA framework in the HMA region. largest absolute values of the NSC for ΔTB between the 18.7 and 36.5 GHz channels estimated for a specific location and averaged over snow accumulation or ablation season were about 1.0 (for SWE), 1.0 (for snow liquid water), 3.9 (for top layer snow temperature), and 1.5 (for snow density). The relatively low NSC of SWE is consistent with the results of Kwon et al. [16], who used radiative transfer models. This implies that the assimilation of PMW TB or ΔTB does not ensure improvement in SWE estimates for all surface conditions, and thus requires a nuanced approach.   Figure 4e,f respectively. Digital elevation estimates derived from the Shuttle Radar Topography Mission (SRTM) elevation data [46]. The Terra Moderate Resolution Imaging Spectroradiometer (MODIS) sensor-based International Geosphere-Biosphere Program (IGBP) land classification map [47] was used for the dominant land-cover types and forest cover fraction. The snow classification was based on Sturm et al. [69]. The multiyear mean winter precipitation rate was estimated from MERRA-2 precipitation, and the percentage of wet snow days were obtained from the synthetic truth simulation.
The Noah-MP (geophysical) state variables selected as input (i.e., control) variables to the SVM included SWE, snow density, snow liquid water, and snow (top layer) temperature. The particular Remote Sens. 2019, 11, 2265 8 of 29 selection of these four input (control) variables to SVM-derived PMW T B was based, in part, on the sensitivity analysis conducted by Ahmad et al. [45]. Using the same land surface model (Noah-MP) and boundary conditions (MERRA-2) as in this study, Ahmad et al. [45] analyzed the sensitivity of SVM-based ∆T B predictions over the HMA region to four different Noah-MP state variables. Although the magnitude and sign of the estimated normalized sensitivity coefficient (NSC [unitless]) varied depending on location in time and space, the signs of the NSCs, in general, adhered to the known first-order physics. That is, the SVM-predicted ∆T B was negatively sensitive to top-layer snow temperature and snow liquid water (i.e., ∆T B decreased as snow temperature or snow liquid water content increased). Conversely, ∆T B generally increased with increased SWE or snow density. The largest absolute values of the NSC for ∆T B between the 18.7 and 36.5 GHz channels estimated for a specific location and averaged over snow accumulation or ablation season were about 1.0 (for SWE), 1.0 (for snow liquid water), 3.9 (for top layer snow temperature), and 1.5 (for snow density). The relatively low NSC of SWE is consistent with the results of Kwon et al. [16], who used radiative transfer models. This implies that the assimilation of PMW T B or ∆T B does not ensure improvement in SWE estimates for all surface conditions, and thus requires a nuanced approach.
The daily snow cover product from the National Oceanic and Atmospheric Administration's National Environmental Satellite Data and Information Service (NOAA/NESDIS) Interactive Multisensor Snow and Ice Mapping System (IMS; [70,71]) was used as a reference for the presence of snow. That is, AMSR-E T B observations and Noah-MP outputs were excluded from the training set to the SVM if the estimated snow cover did not coincide with the IMS snow cover information. This procedure was expected to reduce erroneous inputs [20] and thus increase the accuracy of the SVM. The SVM was trained on a 0.25 • equidistant cylindrical grid, the same spatial resolution and map projection used for Noah-MP, and the optimal nonlinear SVM parameters were determined for each training target, for each fortnight, and for each 0.25 • grid cell in the study area.
The performance of this training approach in reproducing the observed ∆T B over snow-covered areas in the HMA region has been demonstrated in Ahmad et al. [45]. Overall, the trained SVM well reproduced AMSR-E ∆T B observations not used during the SVM training. Results showed a bias between −0.5 K and 0.5 K and a root mean squared error (RMSE) less than 10 K (see Figure 3 in [45]). The seasonal variability in AMSR-E ∆T B observations was also appropriately captured by the trained SVM (see Figure 4 in [45]). Please refer to Ahmad et al. [45] for more details about the performance of the SVM over HMA.

Precipitation Bias Correction
As previously mentioned, instead of using MERRA-2 precipitation, as was the case for the nominal (truth) simulation, TRMM precipitation was used for both the OL and DA runs within the synthetic twin experiments. To reduce the significant climatological difference in precipitation between MERRA-2 and TRMM, a simple bias correction on the TRMM precipitation was first conducted using a bias correction factor computed as: where α is the bias correction factor for TRMM precipitation, MERRA2 and TRMM are the time-averaged (from 1 September 2002 to 1 September 2011) and domain-averaged (across the entire HMA region) precipitation from MERRA-2 and TRMM, respectively. The bias correction ensures that, in aggregate, the total amount of TRMM precipitation over the study period and domain is identical to that of MERRA-2. However, considerable differences still exist at individual locations in space and time, such that these differences are representative of the "real-world" errors in the boundary conditions encountered by the prognostic Noah-MP model. In other words, even though the total amount of precipitation is the same between the true simulation and the OL/DA ensembles, the amount of SWE could be significantly different at any individual location due to the nonlinear hydrologic response of precipitation forcing on snowpack estimation. The differences in SWE between the synthetic truth and OL runs were expected to be mitigated by the assimilation of the synthetic ∆T B observations.

Data Assimilation
The ensemble Kalman filter (EnKF; [39,40]) was used to assimilate the synthetic ∆T B observations. In the EnKF, model errors are dynamically represented by an ensemble of model integrations, which, in this study, was derived by perturbing meteorological forcings. As a compromise between the performance and computational efficiency of the SVM-DA framework, an ensemble size of 20 was used for both the OL and DA cases. The 20 ensemble members were constructed by perturbing meteorological forcing inputs, which represent the uncertainty in the boundary conditions (see Table 2). Note that the precipitation error was considered the key source of uncertainty in this paper. Due to the fact that the same Noah-MP model (with the same configurations) was used for the synthetic truth, OL, and DA cases during the synthetic twin experiment, errors stemming from the model structure (physics and parameters) were neglected, and, hence, no perturbation was applied to the model prognostic state variables. It is also important to note that the analysis update was disabled when one or more of the ensemble members in a grid cell simulated snow-free conditions.  (1): first-order autoregressive temporal correlation).

Perturbed Meteorological Forcing Type Std Dev AR(1) Cross-Correlations with Perturbations in
The perturbed meteorological forcing fields (i.e., boundary conditions) include precipitation, incident shortwave and longwave radiation, and near-surface air temperature, which were selected based on their first-order control on snow processes. The perturbation parameters summarized in Table 2 were determined based on settings used in previous DA studies (e.g., [33,[72][73][74][75][76]). The precipitation (P) and downward shortwave (SW) radiation forcing were perturbed using normally distributed, additive random fields (with zero-mean). The downward longwave (LW) radiation and near-surface air temperature (T air ) forcing were perturbed by log-normally distributed, multiplicative random fields (with mean 1). For all perturbation fields, temporal correlations were considered using a first-order autoregressive model (AR (1)). Cross correlations between the perturbation fields were also imposed, as indicated in Table 2, for realistic and balanced (physically consistent) perturbations.
The synthetic observations of ∆T B between the 10.65 and 36.5 GHz (V and H) channels and between the 18.7 and 36.5 GHz (V and H) channels were simultaneously assimilated in the SVM-DA framework. The standard deviation of the ∆T B error was assumed to be 3 K based on Kwon et al. [16], Xue et al. [23], and Durand and Margulis [77]. Note that only SWE was directly updated during the assimilation and that other snow states such as snow ice content, snow liquid water content, and snow depth were adjusted based on the amount of SWE update to maintain physical consistency between snow properties in the prior and posterior ensembles from Noah-MP.

Study Area
HMA presents an interesting study domain for this study. As previously mentioned, HMA is an important area with respect to climate change and freshwater resources. HMA is also characterized by its high biodiversity, highly variable precipitation, and high vulnerability to climate change and natural disasters such as floods and avalanches [78]. Therefore, accurate characterization of the spatiotemporal variability of snow across the HMA region is valuable to regional communities and ecosystems. Second, HMA is also a data-sparse area where ground-based measurements are quite limited, owing to the complex topography, resource availability, and extreme climatic conditions. In this regard, land surface modeling, satellite remote sensing, and data assimilation are the most viable alternatives for estimating SWE in this region. There may be limitations in the use of remote sensing observations to help constrain LSM predictions, and this paper discusses some of the advantages and disadvantages with respect to the use of PMW T B .
The experiments were first conducted at four specific locations in HMA (Figure 2a) in order to explore a small suite of different elevation and land cover characteristic combinations. Geographical characteristics of the experimental grid cells are summarized in Table 3. Based on analysis of results for these experimental grid cells, we later expanded the study area to the entirety of the HMA domain spanning from 20.5 • N to 41.0 • N and from 66.5 • E to 101.0 • E ( Figure 2).
Maps of elevation, dominant land-cover type, forest cover fraction, snow class, multiyear mean winter (December, January, and February) precipitation rate, and the percentage of wet snow days in the study domain are presented in Figure 2. Figure 2b derived from the Terra Moderate Resolution Imaging Spectroradiometer (MODIS) sensor-based International Geosphere-Biosphere Program (IGBP) land classification map [47] shows that about 86.8% of the HMA domain is dominated by non-forested land-cover types such as barren or sparsely vegetated (26.5%), croplands (26.0%), shrublands (21.3%), and grasslands (13.0%). Forest land-cover accounts for only 11.8% of the study area (Figure 2b), and dense forests are mostly located in the southeastern part of the region (Figure 2c). The snow classification map (Figure 2d), based on Sturm et al. [69], shows that six snow classes, i.e., ephemeral (60.29%), prairie (36.16%), alpine (1.96%), maritime (0.87%), tundra (0.70%), and taiga (0.02%) are found within the study area. Higher elevation regions (i.e., 4000 m or above) are characterized by the prairie snow class, barren (or sparsely vegetated), shrublands, and grasslands, while lower elevation regions are characterized by the ephemeral snow class, croplands, and forests. High winter (December, January, and February) precipitation rates were mostly seen in the northwestern part of the study domain (Figure 2e), where the significant analysis update of SWE by assimilating the synthetic ∆T B data were observed (see Section 3.3). In the synthetic truth simulation, snow-covered grid cells experienced wet snow conditions for, on average, about 4.1% of the simulation period (Figure 2f), which may lead to reduced sensitivity of SVM-based ∆T B estimates to SWE, and thus, degradations of the performance of the SVM-DA framework in estimating SWE (see Section 3.2).  Figure 3 shows the performance of the SVM-DA framework at four selected grid cells (see Table 3). Assimilation of synthetic ∆T B observations into Noah-MP considerably improved SWE estimates for Grids #1 and #2 (approximately 70% and 45% reduction in root mean square error (RMSE), respectively, relative to the OL), whereas degradations were observed for Grids #3 and #4 (approximately 39% and 10% increase in RMSE, respectively, relative to the OL).

Performance of the SVM-DA Framework
There are, in general, three potential sources for DA degradation: (1) high frequency T B signal saturation in the presence of deep snow, (2) mixed signal response in the presence of glaciers, and (3) mixed signal response in the presence of forest cover. T B signal saturation in deep snow is a well-known issue that hinders SWE estimation based on PMW T B observations. When ∆T B = 10.65 − 36.5 GHz or ∆T B = 18.7 − 36.5 GHz are used to estimate terrestrial SWE, it is assumed that ∆T B is positively correlated with SWE because T B at 36.5 GHz, in general, decreases as SWE increases (i.e., snowpack deepens) due to volume scattering while T B at 10.65 GHz (or 18.7 GHz) is less sensitive to increasing snow depth as compared to 36.5 GHz. However, T B at 36.5 GHz can begin to increase after SWE reaches a threshold value; this is due to the transition from snow volume scattering to emission at 36.5 GHz [60,80]. SWE threshold values for the slope reversal in the correlations between SWE and ∆T B at 36.5 GHz vary depending on local snow conditions such as snow wetness, snow grain size, snow grain shape, and the presence of internal ice layers [81]. The documented range of the SWE threshold value is from about 100 to 200 mm (e.g., [58,60,[81][82][83]). Correlations between the Noah-MP simulated SWE and SVM-based ∆T B,18V-36V estimates were plotted in Figure 4, in which the SWE threshold value was assumed to be 200 mm, which is the upper limit of the range of reported values in the peer-reviewed literature. As shown in Figure 4a,c the slope reversal phenomenon was also observed in our modeling results. ΔTB at 36.5 GHz vary depending on local snow conditions such as snow wetness, snow grain size, snow grain shape, and the presence of internal ice layers [81]. The documented range of the SWE threshold value is from about 100 to 200 mm (e.g., [58,60,[81][82][83]). Correlations between the Noah-MP simulated SWE and SVM-based ΔTB,18V-36V estimates were plotted in Figure 4, in which the SWE threshold value was assumed to be 200 mm, which is the upper limit of the range of reported values in the peer-reviewed literature. As shown in Figure 4a,c the slope reversal phenomenon was also observed in our modeling results.  Table 3 for geographical characteristics of the grid cells). The solid and dotted lines denote the ensemble mean and range of the simulated SWE, respectively.
A process limitation of Noah-MP v3.6 used in this study is the lack of simulated glaciers. As a result, no glacier information as simulated by the model was available for use by the SVM during training. Consequently, physically inconsistent correlations between simulated SWE and SVMderived ΔTB estimates were found in glaciated regions (Figure 4e,f). Namely, ΔTB decreased as SWE increased, up until SWE reached a threshold of approximately 200 mm, after which ΔTB transitioned to a positive correlation with SWE. This behavior is opposite to the relationship between SWE and ΔTB for a dry snowpack on land (see Figure 4a,c for comparison) and was also seen in the relationship between the simulated SWE and AMSR-E ΔTB data (not shown here). This implies that the SVM can yield the right prediction of ΔTB from glaciated regions (i.e., snow on ice) for the wrong reasons (i.e., snow on land).
On the other hand, one physical mechanism that can partly explain the unusual relationships between SWE and ΔTB over snow on glaciers is microwave volume scattering within underlying glaciers. The effective dielectric properties of solid ice are vastly different from those of water or snow (i.e., ice + air) [84]. In addition, Grenfell [85] and Eppler et al. [86] showed that the microwave emissivity of dry first-year ice at 37 GHz is comparable to that at 19 GHz, whereas the microwave emissivity of dry multiyear ice at 19 GHz is much higher than that at 37 GHz due to increased volume  Table 3 for geographical characteristics of the grid cells). The solid and dotted lines denote the ensemble mean and range of the simulated SWE, respectively. understand the underlying physical mechanisms that determine the relationships between SWE and ΔTB over snow on glaciers. The PMW TB signal at the top of the atmosphere is significantly affected by the presence of vegetation canopy [87][88][89]. More specifically, a low sensitivity of PMW TB from terrestrial snow is regularly observed in the presence of dense overlying forest [90,91]. Therefore, the performance of PMW TB observation assimilation in estimating SWE can be impeded by the presence of forest canopy [16]. Many studies (e.g., [17,65,[91][92][93]) have applied correction techniques to try and ameliorate the impact of forest cover on SWE estimation, but no such correction techniques have been applied here as most of the snow-covered regions in the study domain are outside densely forested areas ( Figure  2c). A process limitation of Noah-MP v3.6 used in this study is the lack of simulated glaciers. As a result, no glacier information as simulated by the model was available for use by the SVM during training. Consequently, physically inconsistent correlations between simulated SWE and SVM-derived ∆T B estimates were found in glaciated regions (Figure 4e,f). Namely, ∆T B decreased as SWE increased, up until SWE reached a threshold of approximately 200 mm, after which ∆T B transitioned to a positive correlation with SWE. This behavior is opposite to the relationship between SWE and ∆T B for a dry snowpack on land (see Figure 4a,c for comparison) and was also seen in the relationship between the simulated SWE and AMSR-E ∆T B data (not shown here). This implies that the SVM can yield the right prediction of ∆T B from glaciated regions (i.e., snow on ice) for the wrong reasons (i.e., snow on land).
On the other hand, one physical mechanism that can partly explain the unusual relationships between SWE and ∆T B over snow on glaciers is microwave volume scattering within underlying glaciers. The effective dielectric properties of solid ice are vastly different from those of water or snow (i.e., ice + air) [84]. In addition, Grenfell [85] and Eppler et al. [86] showed that the microwave emissivity of dry first-year ice at 37 GHz is comparable to that at 19 GHz, whereas the microwave emissivity of dry multiyear ice at 19 GHz is much higher than that at 37 GHz due to increased volume scattering, especially at higher frequencies, which is associated with the highly porous nature of multiyear ice. During the early period of snow season, ∆T B between a lower frequency (e.g., 18.7 GHz) and a higher frequency (e.g., 36.5 GHz) is mainly determined by the underlying glaciers, i.e., multiyear ice. As snow depth increases, ∆T B is gradually affected by snow cover. However, without accurate glacier information from observations or model simulations, it may be hard to fully understand the underlying physical mechanisms that determine the relationships between SWE and ∆T B over snow on glaciers.
The PMW T B signal at the top of the atmosphere is significantly affected by the presence of vegetation canopy [87][88][89]. More specifically, a low sensitivity of PMW T B from terrestrial snow is regularly observed in the presence of dense overlying forest [90,91]. Therefore, the performance of PMW T B observation assimilation in estimating SWE can be impeded by the presence of forest canopy [16]. Many studies (e.g., [17,65,[91][92][93]) have applied correction techniques to try and ameliorate the impact of forest cover on SWE estimation, but no such correction techniques have been applied here as most of the snow-covered regions in the study domain are outside densely forested areas (Figure 2c).
However, these three issues (i.e., forest cover, T B signal saturation, and the electromagnetic response of snow-on-ice) were not always the dominant factors yielding degraded results in this study. The SVM-DA framework was effective in improving the SWE estimation for Grid #2 even though the grid cell was characterized by high forest fraction (54.4%) and deep snow conditions. Conversely, SWE estimates were degraded by DA for Grid #4 where neither forest cover nor glaciers exist. A detailed analysis as to why such degradation occurs is provided in the next section.
As noted in our analysis below, careful attention to the three above mentioned issues must be included in DA frameworks. In Section 3.3, it turned out that the T B signal saturation issue was one of the major factors causing the degradation of SWE estimates by the SVM-DA framework for the entire HMA domain. Due to the issue of the electromagnetic response of snow-on-ice, glaciated regions were excluded from the analysis throughout this paper. Comparatively, the influence of forest cover on the DA performance was small (also shown in Section 3.3), likely due to the fact that (1) the same vegetation scheme was used in the synthetic truth and DA, and (2) no vegetation-related state variables were used as inputs to the SVM. However, these issues are relevant for a successful application of the SVM-DA framework for real T B or ∆T B data assimilation.

Limitations of ∆T B DA for Snow Estimation
As is seen in Figure 3c,d, the degradation of the SVM-DA framework in estimating SWE for Grids #3 and #4 was mainly due to the anomalously large updates in SWE to the grid cells via assimilation of synthetic ∆T B observations during 2005-2006 and 2009-2010, respectively. To better highlight the underlying mechanisms that caused the degradation, we further analyzed the results of Grids #3 and #4, focusing on these particular time periods. Time series of the SVM inputs, i.e., SWE, snow density, snow liquid water, and snow (top layer) temperature for Grids #3 and #4 are presented in Figure 5 (time series for the entire 9-year experimental period are provided in the supplemental materials, i.e., Figure S1 for Grid #3 and Figure S2 for Grid #4). For comparison purposes, the same plots were made for Grids #1 and #2 for periods when the SVM-DA framework performed well ( Figure 6).
It can be noted from Figure 5 that Grids #3 and #4, where DA degraded the SWE estimate, conditions were characterized by deep, wet snow. Snow density using DA was improved relative to the OL case, and the snow (top layer) temperature estimation error was marginal in both the OL and DA cases. Although some ensemble members exhibited abnormally large snow densities (i.e., snow densities close to that of ice) during the snow ablation period as a result of DA (Figure 5b,f), they were a direct consequence of the anomalously large updates to SWE during the snow accumulation period. It is important to note here that SWE is a prognostic variable (in conjunction with snow depth) and that snow density is a diagnostic variable computed as the ratio of the two. We can infer from Figure 5c,g that the snow liquid water error is one of the dominant factors that lead to degraded SWE estimates when using DA.
When wet snow is present in a pixel, microwave emission from the liquid water increases because the emissivity of water approaches unity, hence, liquid water tends to dominate the snow "signal" in the ∆T B observations [64,68]. Therefore, the sensitivity of ∆T B to SWE decreases in wet snow as sensitivity to liquid water correspondingly increases, as shown in Figure 4b   During the snow ablation period, large snow grains could be another determinant of the TB signal emitted from the snowpack. The magnitude of the snow grain size error significantly affects the correlation between SWE and TB [94], and hence, accurate estimation of snow grain size is a prerequisite for enhancing the DA performance [16]. Other important snow microstructural parameters for accurate modeling of snow microwave TB include snow anisotropy and stickiness (e.g., [94][95][96]). However, because these microstructural parameters (i.e., snow grain size, anisotropy, and stickiness) are not explicitly represented in Noah-MP, they could not be incorporated into the SVM training procedure. Hence, the direct impacts of snow grain size, anisotropy, and stickiness on SVMbased ΔTB estimates cannot be discussed here.
The large SWE update occurred (after the vertical dotted line in the figures) when the normalized innovations (NI) were large (Figure 7e,j), which was calculated as: where the numerator is the difference (innovation) between the assimilated (synthetic) ΔTB observations (y) and the ensemble mean of predicted (prior) ΔTB observations (ℋ ), and the During the snow ablation period, large snow grains could be another determinant of the T B signal emitted from the snowpack. The magnitude of the snow grain size error significantly affects the correlation between SWE and T B [94], and hence, accurate estimation of snow grain size is a prerequisite for enhancing the DA performance [16]. Other important snow microstructural parameters for accurate modeling of snow microwave T B include snow anisotropy and stickiness (e.g., [94][95][96]). However, because these microstructural parameters (i.e., snow grain size, anisotropy, and stickiness) are not explicitly represented in Noah-MP, they could not be incorporated into the SVM training procedure. Hence, the direct impacts of snow grain size, anisotropy, and stickiness on SVM-based ∆T B estimates cannot be discussed here.
The large SWE update occurred (after the vertical dotted line in the figures) when the normalized innovations (NI) were large (Figure 7e,j), which was calculated as: where the numerator is the difference (innovation) between the assimilated (synthetic) ∆T B observations (y) and the ensemble mean of predicted (prior) ∆T B observations (H (x − )), and the denominator is the square root of the sum of the measurement (R) and background error covariances ( √ HP − H T ). Thus, these large normalized innovations (compared to other time periods) may be attributed to (1) the large innovations and/or (2) the small standard deviations of the prior ∆T B across the ensemble.
Remote Sens. 2019, 11, x FOR PEER REVIEW 16 of 29 denominator is the square root of the sum of the measurement (R) and background error covariances (√ ). Thus, these large normalized innovations (compared to other time periods) may be attributed to (1) the large innovations and/or (2) the small standard deviations of the prior ΔTB across the ensemble. During the ablation period in Grids #3 and #4, the temporal trend of the synthetic ΔTB retrievals (Figure 7) was mainly attributed to snow liquid water content (Figure 5c,g respectively). As can be During the ablation period in Grids #3 and #4, the temporal trend of the synthetic ∆T B retrievals (Figure 7) was mainly attributed to snow liquid water content (Figure 5c,g respectively). As can be seen in Figure 5c,g the SVM-DA framework could not reproduce the "true" liquid water content within the snowpack, which led to the large differences (innovations) between the synthetic ∆T B observations and the predicted ∆T B . Conversely, large innovations did not occur when the snowpack was dry (Grid #1; Figure 6c) or when the snow liquid water content was well-estimated relative to the truth by the SVM-DA framework (Grid #2; Figure 6g).
As clearly shown in Figure 7a-d, the SVM predicts climatological ∆T B when the prior estimates of snow lie outside the range of the "true" snow values used during SVM training. Consequently, the ensemble of SVM-based ∆T B predictions collapse, yielding a near-zero ensemble standard deviation. The collapse of the predicted observation ensemble could result in the occurrence of spurious error correlations between SWE and SVM-based ∆T B estimates that ultimately manifests itself as a degraded estimate when DA is enabled.

Possible Improvements in DA Performance
Based on the analysis in Section 3.2, we incorporated two simple rule-based updates into the SVM-DA framework to minimize DA degradation (Table 4). In the first rule (rule 1), prior SWE was updated by the assimilation of the synthetic ∆T B observations only when the standard deviation of the prior ∆T B was larger than 0.05 K. It was hypothesized that rule 1 can help avoid the negative effects associated with the collapse of the SVM-predicted ∆T B ensembles, and the resultant spurious error correlations between SWE and ∆T B on the DA performance. Table 4. Descriptions of rule-based updates incorporated into the SVM-DA framework.

Rule 1
Update SWE only when the standard deviation of the predicted ∆T B is larger than 0.05 K Rule 2 Add a thin layer (i.e., 5 mm SWE) when the observed ∆T B is greater than 5 K but the simulated SWE is 0 mm. Rule 3 Rule 1 + Rule 2 Rule 4 Rule 3 + Suppress the analysis update when SWE of one or more of the ensemble members is greater than 500 mm.
As previously mentioned, the DA framework did not originally conduct an analysis update of the prior SWE if one or more of the ensemble members in a given grid cell was snow-free. This could lead to systematic underestimation of SWE during DA, and hence, underestimation of snow liquid water in subsequent time periods. To help mitigate this concern, a second rule (rule 2) was introduced where it was assumed that snow cover exists if the observed ∆T B is greater than 5 K, and hence, a thin snow layer (i.e., 5 mm SWE) was added when the simulated SWE was zero but ∆T B > 5 K. Rule 2 is expected to minimize abrupt changes in SWE during the analysis update caused by the snow liquid water error. The combination of rule 1 and rule 2 (i.e., rule 3) was also explored. Figure 8 shows that the performance of the SVM-DA framework in estimating SWE can be improved slightly by applying the rule-based approaches. Rule 1, rule 2, and rule 3 all reduced the SWE RMSE compared to the OL case by 4.2%, 7.5%, and 11.3%, respectively, for Grid #3, and by 2.9%, 12.8%, and 42.0%, respectively, for Grid #4. Estimates of snow density and snow liquid water were also improved via the rule-based DA approach (see supplemental materials, i.e., Figure S1 for grid #3 and Figure S2 for grid #4). The large increase in SWE, shown in Figure 8a, was not observed when using the rule-based DA cases in Grid #3 (Figure 8b-d). However, in Grid #4, a large amount of SWE was added by DA during 2009 and 2010 winter in the rule 3 cases, which rapidly melted during the ablation period.
The same rule-based approaches were also employed in the application of the SVM-DA framework to the entire HMA domain ( Figure 9). As shown in the figure, DA "baseline" slightly reduced the spatially averaged RMSE of SWE estimates (27.5 mm; Figure 9b) compared to that of the OL case (28.1 mm; Figure 9a). SWE estimates in 56.8% (43.2%) of snow-covered grid cells were enhanced (degraded) by DA, as shown in Figure 9c. Rule 1 further (but marginally) reduced the spatial mean of the SWE RMSE (27.3 mm; Figure 9d,e), while rule 2 was effective in increasing the number of improved grid cells (61.8%; Figure 9f,g). The best performance in terms of improvement to mean RMSE and number of improved grid cells was achieved by applying rule 3 (combination of rule 1 and rule 2) (Figure 9h,i).
Even with the rule-based approaches, the SVM-DA framework still exhibited mixed performance in estimating SWE, that is, 61.6% (38.4%) of snow-covered grid cells were improved (degraded) (Figure 9i). Therefore, the spatial mean RMSE was only marginally reduced by about 3.2% and 1.1% relative to the OL and DA (baseline) cases, respectively. The dominant land-cover types of grid cells, where DA degraded SWE estimates, include open shrublands, grasslands, and barren or sparsely vegetated areas (Figure 9j). Degradations of SWE estimates were mostly observed in deep snow conditions (Figure 9l) rather than densely forested areas (Figure 9k). A simple test (i.e., rule 4; see Table 4 and Figure 9j,k) showed that reduction of the spatial mean RMSE (i.e., 26.3 mm) by about 6.4% and 4.4% compared to the OL and DA (baseline) cases, respectively, can be achieved by suppressing the analysis update when one or more of the ensemble members in a grid cell have SWE greater than 500 mm. This implies that the T B signal saturation (especially at 36.5 GHz) in deep snow still needs investigation in order to enhance the performance of the SVM-DA framework across the entire HMA domain, even though it was not the primary factor for the degradation in the experimental grid cells (i.e., Grids #3 and #4).
The percentage of deep snow days when SWE was greater than the reported SWE threshold (i.e., 200 mm) during the winter season (December, January, and February) is shown in Figure 9k,l. Due to the large uncertainties in precipitation data over the HMA region [42], both the results using MERRA-2 (synthetic truth case) and TRMM (OL case) precipitation were analyzed. Within the study domain, 6.3% (synthetic truth; Figure 9k) and 4.4% (OL; Figure 9l) of model grid cells experienced deep snow conditions, and, on average, 21.7% (synthetic truth; Figure 9k) and 5.4% (OL; Figure 9l) of the winter season were subject to the T B signal saturation issue. It should be noted that many locations would experience T B signal saturation more frequently than the values shown in the figure because of lower SWE thresholds.
Derksen [58] suggested that ∆T B between the 10.65 and 18.7 GHz channels can detect SWE variations for relatively deep snow conditions, rather than using 36.5 GHz that most commonly encounters signal saturation. The 10.65 GHz channels are less sensitive to snowpack volume scattering compared to the 18.7 GHz channels. Therefore, for deep snow conditions (i.e., SWE > 100 to 200 mm thresholds), the 10.65 GHz channels serve as a non-scattering background while the 18.7 GHz channels are more significantly affected by volume scattering [58]. However, additionally assimilating ∆T B between the 10.65 and 18.7 GHz (V and H) channels did not significantly improve DA performance (mean RMSE = 27.1 mm). This is due, in part, to the fact that simultaneous assimilation of all channels still involves errors stemming from higher frequency channels. Furthermore, ∆T B between the 10.65 and 18.7 GHz channels may introduce errors (noise) during shallow snow conditions because the signal is more representative of soil moisture conditions rather than snow mass (e.g., [97]). A possible solution will be assimilating different frequency channels for shallow and deep snow conditions. That is, for example, ∆T B between the 18.7 and 36.5 GHz channels could be assimilated for shallow snow whereas ∆T B between the 10.65 and 18.7 GHz channels is assimilated for deep snow. This approach will be explored in a future study. Remote Sens. 2019, 11, x FOR PEER REVIEW 19 of 29  Table 4 for descriptions of the rules applied). The solid and dotted lines denote the ensemble mean and range of the simulated SWE, respectively. No rule was applied in (a,e).  Table 4 for descriptions of the rules applied). The solid and dotted lines denote the ensemble mean and range of the simulated SWE, respectively. No rule was applied in (a,e).  Table 4 for descriptions of the rules applied). Negative (blue color) and positive (red color) RMSE differences represent improvements and degradations in the SWE estimates, respectively, as a result of invoking assimilation. Glaciated regions (gray color) were excluded from the analysis. Maps of the (l) maximum SWE in the synthetic truth, (m) percentage of deep snow days (SWE ≥ 200 mm) in the synthetic truth, and (n) percentage of deep snow days (SWE ≥ 200 mm) in the OL case are presented for reference.

Controllability of Observation Operator
A final issue related to the efficacy of SVM-based assimilation of PMW ΔTB is that of "controllability", which is an important factor in optimal control theory. Controllability highlights the skill (or lack thereof) of a linear or nonlinear model to guide the output of that model from any physically plausible initial state (by changing the inputs, i.e., control variables) toward any physically plausible final state within a finite period of time [98]. In other words, a system is controllable if and only if the system state(s) can be changed by changing the system input [99]. In the context of data  Table 4 for descriptions of the rules applied). Negative (blue color) and positive (red color) RMSE differences represent improvements and degradations in the SWE estimates, respectively, as a result of invoking assimilation. Glaciated regions (gray color) were excluded from the analysis. Maps of the (l) maximum SWE in the synthetic truth, (m) percentage of deep snow days (SWE ≥ 200 mm) in the synthetic truth, and (n) percentage of deep snow days (SWE ≥ 200 mm) in the OL case are presented for reference.

Controllability of Observation Operator
A final issue related to the efficacy of SVM-based assimilation of PMW ∆T B is that of "controllability", which is an important factor in optimal control theory. Controllability highlights the skill (or lack thereof) of a linear or nonlinear model to guide the output of that model from any physically plausible initial state (by changing the inputs, i.e., control variables) toward any physically plausible final state within a finite period of time [98]. In other words, a system is controllable if and only if the system state(s) can be changed by changing the system input [99]. In the context of data assimilation, controllability of the observation operator is crucial during the analysis update because any errors in the geophysical (land surface) model should correlate back to errors in the SVM-derived ∆T B estimates: this is one of the fundamental assumptions behind the SVM-DA framework. In the absence of controllability, an error in the geophysical model would no longer correlate back to the corresponding error in the observation operator, which would negate the efficacy of any data assimilation scheme, and perhaps, even lead to spurious error correlations that ultimately degrade the geophysical model.
In the context of SVM-based estimation of PMW ∆T B over snow-covered terrain, Figure 10 shows two different locations in HMA, whereby one location in space and time is controllable and hence shows improved SWE estimates by DA (RMSE DA -RMSE OL = −31 mm), and the other is not (RMSE DA -RMSE OL = 5.8 mm). Namely, Figure 10a shows a controllable SVM-based estimate of ∆T B whereby perturbations to the nominal inputs (i.e., control variables) to the SVM yields changes in ∆T B . Further, the ∆T B estimates adhere to the first-order physics, such that an increase (decrease) to snow density or SWE results in an increase (decrease) of volume scattering that leads to an increase (decrease) in ∆T B . Alternatively, the uncontrollable SVM in Figure 10b is insensitive to changes in the inputs, and as such, reverts to a climatological ∆T B estimate gleaned during the training phase. It is important to note that most of the SVM-based predictions in HMA are, in fact, controllable. However, at some locations in space and/or time (particularly when the snow is deep and/or ripe), individual SVMs may encounter controllability issues that ultimately manifest themselves as an erroneous update. There is likely no single panacea that will solve this issue of controllability. Rather, improvements in machine learning efficacy (and controllability by association) will be achieved through a collection of actions including: (1) improvements in quality control and quality assurance of the observations used during SVM training; this may include the use of the original enhanced resolution (i.e., without upscaling) T B data or those corrected by considering the effect of complex topography (e.g., slope, aspect, ridge, and valley) on the local incidence and polarization rotation [100] in order to better represent microwave emission from the snowpack in mountainous areas; (2) improvements in geophysical modeling (e.g., inclusion of snow grain size, inclusion of sub-grid scale glacier physics, inclusion of snow grain shape) that will better constrain the PMW-derived snow mass estimation problem; (3) reconciliation of differences between different meteorological boundary conditions that will help better constrain the snow mass estimation problem; (4) continued exploration of different machine learning algorithms, whereby one algorithm may possess greater skill in time series regression; and (5) exploration of various training approaches, e.g., train machine learning algorithms using T B or ∆T B to retrieve SWE, which is then assimilated into LSMs, as explored for soil moisture estimates in Rodríguez-Fernández et al. [101]. This study explored the potential of assimilating synthetic PMW ΔTB observations within a SVM-DA framework in order to enhance the characterization of spatio-temporal variability of SWE over HMA. The SVM-DA framework employed a machine learning algorithm, i.e., SVM, as the observation operator in the prediction of ΔTB. Noah-MP was employed as the prognostic LSM simulating snow. Three different research questions were examined using a set of synthetic twin experiments in which synthetic ΔTB observations at four different frequency and polarization combinations (i.e., ΔTB 10-36 H, ΔTB, 10-36 V, ΔTB, 18-36 H, and ΔTB, 18-36 V) were simultaneously assimilated into Noah-MP using an EnKF.
Research question 1: Is the SVM-DA framework effective in improving snow mass estimates across the HMA region? The framework significantly improved SWE estimates for two of the experimental grid cells (Grids #1 and #2) but resulted in degraded output for the other two experimental grid cells (Grids #3 and #4) that were characterized by deep and wet snow conditions. For the entire HMA domain, the SVM-DA framework also showed mixed performance, that is, the framework improved (degraded) SWE estimates for 56.8% (43.2%) of snow-covered grid cells. Compared to the OL case, only ~2% improvement in the spatial mean RMSE of SWE was achieved with DA. These results suggest that the use of machine learning algorithms to invert geophysical (snow) states into observational (ΔTB) space offers potential, but that issues related to sensitivity and controllability must be addressed to ensure the machine learning algorithm produces the right answer for the right reason.
Research question 2: What limitations does the SVM-DA framework possess? From the analysis of the results for the degraded experimental grid cells, two potential limitations of the TB assimilation exist during deep snow and wet snow conditions. First, abrupt changes in SWE can happen during the analysis update due to the large innovations when the ΔTB signal is dominated by variations in snow liquid water content. Second, the collapse of the ensemble of SVM-based ΔTB predictions are observed when the prior snow estimates are outside the range of snow inputs used for the SVM training procedure. The resultant near-zero standard deviation across the ensemble of the prior ΔTB leads to degradation of the DA performance, presumably due to spurious error correlations between SWE and ΔTB. The former one is a general issue in PMW TB-based snow data assimilation, while the latter one is the SVM-DA-specific issue. The issues highlighted here are, in part, related to problems of controllability mentioned above, but also stem from the ill-posed (underdetermined) nature of

Summary and Conclusions
This study explored the potential of assimilating synthetic PMW ∆T B observations within a SVM-DA framework in order to enhance the characterization of spatio-temporal variability of SWE over HMA. The SVM-DA framework employed a machine learning algorithm, i.e., SVM, as the observation operator in the prediction of ∆T B . Noah-MP was employed as the prognostic LSM simulating snow. Three different research questions were examined using a set of synthetic twin experiments in which synthetic ∆T B observations at four different frequency and polarization combinations (i.e., ∆T B 10-36 H, ∆T B , 10-36 V, ∆T B , 18-36 H, and ∆T B , 18-36 V) were simultaneously assimilated into Noah-MP using an EnKF.
Research question 1: Is the SVM-DA framework effective in improving snow mass estimates across the HMA region? The framework significantly improved SWE estimates for two of the experimental grid cells (Grids #1 and #2) but resulted in degraded output for the other two experimental grid cells (Grids #3 and #4) that were characterized by deep and wet snow conditions. For the entire HMA domain, the SVM-DA framework also showed mixed performance, that is, the framework improved (degraded) SWE estimates for 56.8% (43.2%) of snow-covered grid cells. Compared to the OL case, only~2% improvement in the spatial mean RMSE of SWE was achieved with DA. These results suggest that the use of machine learning algorithms to invert geophysical (snow) states into observational (∆T B ) space offers potential, but that issues related to sensitivity and controllability must be addressed to ensure the machine learning algorithm produces the right answer for the right reason.
Research question 2: What limitations does the SVM-DA framework possess? From the analysis of the results for the degraded experimental grid cells, two potential limitations of the T B assimilation exist during deep snow and wet snow conditions. First, abrupt changes in SWE can happen during the analysis update due to the large innovations when the ∆T B signal is dominated by variations in snow liquid water content. Second, the collapse of the ensemble of SVM-based ∆T B predictions are observed when the prior snow estimates are outside the range of snow inputs used for the SVM training procedure. The resultant near-zero standard deviation across the ensemble of the prior ∆T B leads to degradation of the DA performance, presumably due to spurious error correlations between SWE and ∆T B . The former one is a general issue in PMW T B -based snow data assimilation, while the latter one is the SVM-DA-specific issue. The issues highlighted here are, in part, related to problems of controllability mentioned above, but also stem from the ill-posed (underdetermined) nature of PMW T B -based estimation of snow mass. One potential means of alleviating the ill-posed nature of this problem is to provide additional constraints (e.g., snow grain size and snow grain shape) via additional process physics in the forward (prognostic) land surface model.
Research question 3: Can these limitations be avoided through the recognition and appropriate handling of instances where the SVM forward operator is suboptimal? To minimize the negative effects of the potential pitfalls on the DA performance, three rules were tested. In rule 1, SWE assimilation was enabled only when the standard deviation of the predicted ∆T B was greater than 0.05 K. In rule 2, a thin snow layer (5 mm SWE) was added in a grid cell when the simulated SWE was zero but the observed ∆T B was larger than 5 K. DA performance was improved for the experimental grid cells by applying these rules. In particular, the combination of the two rules (rule 3) exhibited a 11.3% and 42.0% reduction of the SWE RMSE for Grids #3 and #4, respectively, as compared to the OL case. However, the SVM-DA framework still showed mixed performance for the entire HMA domain. DA (rule 3) degraded SWE estimates for about 38.4% of snow-covered grid cells, which were largely impacted by deep snow (signal saturation) conditions in the synthetic truth and less so by signal attenuation in densely forested areas. Further improvements of the DA performance in the rule 4 DA case imply the need for assimilating lower frequency channels (e.g., ∆T B between the 10.65 and 18.7 GHz channels) for deep snow conditions (SWE > 100 to 200 mm thresholds) in order to avoid T B signal saturation issue.
In addition to the T B signal saturation, there are still a number of issues that need to be addressed to enhance the applicability of the SVM-DA framework to the HMA domain. First, the effects of glaciers and snow grain size, which are two important modulators of PMW T B , are not included in the current framework. Second, a sophisticated forcing bias correction method should be explored for future use. The simple bias correction method used in this study, i.e., applying a single bias correction factor for all grid cells, may affect the performance of the data assimilation system, especially for areas where the difference in precipitation between MERRA-2 and TRMM was large. Third, along with the second issue, multiple meteorological forcing datasets can be simultaneously used for the SVM training to, perhaps, better account for the large uncertainties in forcing data, especially precipitation, over HMA, as shown in Yoon et al. [42]. This approach may help minimize the occurrence of the SVM predicting climatological ∆T B when the prior snow estimates fall outside the range of the training data. Fourth, assimilation needs to be conducted at a finer spatial resolution (e.g., 1 km) to better characterize the spatial distribution of SWE in mountainous regions, which may require additional strategies to consider the difference in spatial resolution between LSMs and T B observations (and SVM). Additionally, the effects of vegetation and atmosphere on T B , which were neglected for simplicity in this synthetic study, should be taken into account in future studies to improve the DA performance. Finally, machine learning controllability (and sensitivity more generally) must be investigated prior to assimilation to ensure the SVM produces the right answer for the right reason.
Overall, the results obtained in this study provide useful information for future practical applications of the SVM-DA framework in HMA.  Acknowledgments: Computational resources were provided by the NASA Center for Climate Simulation and the University of Maryland High-Performance Computing Clusters (HPCC). The authors would like to expressly thank Yuan Xue for many constructive comments.

Conflicts of Interest:
The authors declare no conflict of interest.