Downscaling Snow Depth Mapping by Fusion of Microwave and Optical Remote-Sensing Data Based on Deep Learning

: Accurate high spatial resolution snow depth mapping in arid and semi-arid regions is of great importance for snow disaster assessment and hydrological modeling. However, due to the complex topography and low spatial-resolution microwave remote-sensing data, the existing snow depth datasets have large errors and uncertainty, and actual spatiotemporal heterogeneity of snow depth cannot be effectively detected. This paper proposed a deep learning approach based on downscaling snow depth retrieval by fusion of satellite remote-sensing data with multiple spatial scales and diverse characteristics. The (Fengyun-3 Microwave Radiation Imager) FY-3 MWRI data were downscaled to 500 m resolution to match Moderate-resolution Imaging Spectroradiometer (MODIS) snow cover, meteorological and geographic data. A deep neural network was constructed to capture detailed spectral and radiation signals and trained to retrieve the higher spatial resolution snow depth from the aforementioned input data and ground observation. Veriﬁed by in situ measure-ments, downscaled snow depth has the lowest root mean square error (RMSE) and mean absolute error (MAE) (8.16 cm, 4.73 cm respectively) among Environmental and Ecological Science Data Center for West China Snow Depth (WESTDC_SD, 9.38 cm and 5.36 cm), the Microwave Radiation Imager (MWRI) Ascend Snow Depth (MWRI_A_SD, 9.45 cm and 5.49 cm) and MWRI Descend Snow Depth (MWRI_D_SD, 10.55 cm and 6.13 cm) in the study area. Meanwhile, downscaled snow depth could provide more detailed information in spatial distribution, which has been used to analyze the decrease of retrieval accuracy by various topography factors.


Introduction
Snow is an important indicator for global climate change, which has significant positive and noteworthy negative feedbacks on climate system, water resources and ecological environment [1][2][3][4]. On the positive side, due to the high surface albedo and large-scale cooling effect, snow cover (SC) participates in energy exchange between the atmosphere and Earth's land, thus regulating regional and global climate. Snowmelt can provide water resources for ecosystem and participate in regional water cycles, which is beneficial for Machine-learning and deep-learning technology power many aspects of remote sensing: from target recognition [25][26][27][28][29] to semantic segmentation [30,31] to spatial-temporal prediction [32,33], and they are increasingly present in snow parameter estimation and retrieval [34][35][36]. Tedesco [37] constructed a neural network to retrieve the snow depth and SWE, which showed the highest accuracy compared with the other four retrieval algorithms. Tabari [38] achieved the similar conclusion by using a neural network to estimate snow depth and SWE in the Samsami Basin, Iran. Theoretically speaking, microwave BTD enhances with the increases of snow depth, but when the snow depth exceeds a certain threshold (50 cm) there will be a large error in the estimation results. The neural network could overcome various complex problems existing in large-scale retrievals, such as non-linear modeling, classification and association by learning and summarizing a large number of data and does not rely on the understanding of physical process when modeling. The support vector machine (SVM) is also widely used to solve the nonlinear problems in geosciences. Xue [34] reveals that SVM is more sensitive than the neural network in estimating snow parameters. It has been found that the results based on the SVM algorithm have better accuracy no matter in deep or shallow snow cover, forest coverage area, snow accumulation and melting period. Xiao [39] used the support vector regression (SVR) algorithm to establish a snow depth retrieval model based on different vegetation types and different snow periods, showing better accuracy and reducing "snow saturation effect". Machine learning and deep learning technology can describe the non-linear relationship between BTD and snow parameters, and overcome the limitations of a linear algorithm in different areas [40,41]. Although the machine learning technology has a wide range of applications and high accuracy, there is no detailed snow physical model involved in the retrieval process [42], thus the interpretability of the results is poor.
The model of retrieving snow depth from microwave data has developed from simple linear regression to non-linear models considering complex meteorological and geographical factors. However, the snow depth with spatial resolution from 10 km to 25 km is not suitable for arid and semi-arid alpine region where the vertical drop is extremely significant. In practical application, snow depth mapping with higher spatial resolution is more useful for regional snow disaster evaluation and hydrological model establishment. Recent studies have demonstrated that higher snow depth mapping can be developed based on high-resolution geographic and meteorological data, together with downscaling microwave data. Wang developed a multifactor power snow depth downscaling model and greatly improved accuracy of snow depth retrieval [43]. Walters [44] describes and tests a linear model to downscale MOD10A1 fractional snow cover area (fSCA) (500 m) data to higher-resolution (30 m) spatially explicit binary snow cover area (SCA) estimates. Mhawej [45] downscaled Advanced Microwave Scanning Radiometer -Earth Observing System snow water equivalent (AMSR_E SWE) datasets based on the cooperative observation of MODIS Terra-Aqua data. For the arid and semi-arid alpine regions such as Northern Xinjiang (NX) with complex topography and violent vertical variation, high-resolution snow depth mapping can more accurately describe the accumulation and melting of snow in the region, and provide precise input parameters for hydrological process and snow disaster early warning. Thus, in this manuscript, we propose a deep learning approach to downscale and evaluate the snow depth in NX.
This manuscript is organized as follows. After introducing the study area and data preprocessing in Section 2, the methodology for snow identification and downscaling snow depth retrieval is expounded in Section 3. Results of accuracy verification and spatial comparison will be presented in Section 4, and more details will be discussed in Section 5. Ultimately, a summary of the research will be described in Section 6.

Study Area
The study area for this manuscript was Northern Xinjiang (NX), located in the northwest of China, at 79.90 • -96.18 • E and 40.88 • -48.10 • N, as shown in Figure 1. The total land area is 0.97 million km 2 , accounting for nearly 10% of China's land area. Junggar Basin is lying between Tianshan Mountains and Altai Mountains, forming a mountain-basin system, which is the most significant landform in NX [46][47][48]. Xinjiang is far from the ocean and has a continental arid and semi-arid climate with an annual mean temperature varying from 4 to 8 • C in NX. Due to the unique geographical location and topographic factors, the variation of temperature difference in this area is very significant. Simultaneously, an inversion layer often appears due to the effect of the cold lake in winter in Junggar Basin. Data analysis reveals that the average temperature in January in this area is mostly lower than −10 • C. Another significant feature of temperature variation in this area is that the closer to the Junggar Basin, the higher the temperature is. Below the altitude of 2000 m, the temperature increases with the elevation. Vegetation is sparse in the plains and the altitudinal belts of mountains are obvious [49]. The precipitation in alpine regions can reach 400-600 mm, and exceed 800 mm in specific areas. Snowfall accounts for more than 80% of the annual precipitation. In the piedmont plain, precipitation ranges from 100 to 200 mm and less than 50 mm in a specific area, and snowfall accounts for about 1/3 of the annual precipitation. Solid precipitation is the main form of precipitation across arid alpine areas in winter. The northwest arid region is the area with extremely poor surface water resources and the most abundant snow resources in China [50]. It is one of the three stable snow regions in China, namely, the Qinghai Tibet Plateau, Tianshan-Northern Xinjiang and Northeast-Inner Mongolia. Among them, the snow resources in NX are more advantaged, accounting for 1/3 of the national snow resources. In most areas of the NX plain, the period of stable snow cover lasts about five months and the snow thickness is generally 20-50 cm. However, the stable snow period in alpine regions lengthen out to 7 months every year, and the snow thickness is greater than 80 cm, and the stable snow cover in high mountain areas accumulates all year round. Except for Irtysh River, other rivers in NX are inland rivers, many of them originate from the northern foot of Tianshan Mountains. Thus, mountain glacier and snow meltwater resources are the major supplies of surface runoff in NX, where freshwater resources are seriously scarce [51]. Snow plays a very important role in regional ecological environment change across NX.

Microwave Radiation Imager (MWRI) Brightness Temperature (BT)
MWRI was launched onboard the Fengyun 3B (FY3B) satellite, which is mainly used to detect precipitation rate, water content of cloud, water quantity, total water vapor, soil moisture, sea ice, temperature and snow cover for the research of water cycle and energy transfer at global and hemispherical scales (http://satellite.nsmc.org.cn). The main parameters of the sensor are listed in Table 1. The resolution of different frequencies ranges from 9 km to 85 km, and each frequency has two modes of vertical polarization and horizontal polarizations. The mean accuracy of sensor is 2 K, which is lower than the

Microwave Radiation Imager (MWRI) Brightness Temperature (BT)
MWRI was launched onboard the Fengyun 3B (FY3B) satellite, which is mainly used to detect precipitation rate, water content of cloud, water quantity, total water vapor, soil moisture, sea ice, temperature and snow cover for the research of water cycle and energy transfer at global and hemispherical scales (http://satellite.nsmc.org.cn (accessed on 1 November 2020)). The main parameters of the sensor are listed in Table 1. The resolution of different frequencies ranges from 9 km to 85 km, and each frequency has two modes of vertical polarization and horizontal polarizations. The mean accuracy of sensor is 2 K, which is lower than the accuracy of 1 K of AMSR-2. The MWRI Level 1 BT data were downloaded for the snow seasons from 2011 to 2016 and were used to construct the downscaling snow depth model, data from four snow seasons (1 November 2011-31 March 2014, 1 November 2015-31 March 2016) were used to train the model, and data from 1 November 2014 to 31 March 2015 with few extreme abnormal weather were chosen to validate and evaluate model accuracy. In this manuscript, the official ascending and descending orbit snow depth datasets were denoted as MWRI_A and MWRI_D, respectively and were used to compare with the spatial distribution of downscaled snow depth. The high-frequency and low-frequency polarimetric data of ascending and descending orbit are the main inputs to construct the downscaling snow depth model. Original BT data were resampled to 500 m resolution with nearest-neighbor interpolation to match snow cover data, meteorological and geographic data. Based on the longitude and latitude information of national meteorological stations, the pixel BT data of specific coordinates are extracted for modeling and analysis [43].

Moderate-Resolution Imaging Spectroradiometer (MODIS) Datasets (MOD10A1 and MOD35)
Downscaling snow depth retrieval not only utilizes the sensitivity of microwave data at different frequencies, but also depends on high spatial resolution snow cover pixels. In arid and semi-arid regions, clouds and snow have similar reflectance in the visible spectrum, which makes it difficult to distinguish them. Thus, the contrast in reflectance of snow between the visible spectrum and shortwave infrared spectrum was used to discriminate clouds from snow. The normalized difference snow index (NDSI) applies this contrast to discriminate snow from clouds and other surface covers in mountainous terrain [3]. The MOD10A1/MYD10A1 datasets based on this method, with 500 m resolution, have reached an accuracy of 80%. In this manuscript, MOD10A1 and MYD10A1 were used to synthesize the snow cover distribution in the study area.
The MOD10A1 V006 product, which is an improved version of V005, includes four basic data layers: the NDSI_Snow_Cover, NDSI, albedo and data quality assessment layers [52]. The subpixel of fractional snow cover with 500 m resolution is not considered in this study. Therefore, the pixel with fractional snow cover from 1% to 100% is determined as a snow pixel.
However, optical remote-sensing data can not effectively detect the ground snow information under all weather conditions and at all times, thus MOD35 and MYD35 were used to distinguish between cloudy and cloudless weather conditions. In cloudless weather, Remote Sens. 2021, 13, 584 6 of 25 the snow cover detected by optical remote sensing is accepted, while in cloudy weather, passive microwave remote sensing data has the greatest potential to derive snow coverage.

Air Temperature
Air temperature has a significant influence on snow freezing and melting. The change of temperature will influence the content of liquid water in snow, then changes the grain size and density of snow, which will affect the detection of snow depth by satellite microwave data [53]. The air temperature used in the downscaling snow depth retrieval model was from the Global Land Data Assimilation System (GLDAS) data [54,55], GLDAS-Noah is the Noah model driven by the Global Land Data Assimilation System, version 2.1 (GLDAS-2.1) (https://disc.gsfc.nasa.gov/datasets/ (accessed on 1 November 2020)), which was developed jointly by scientists at the National Aeronautics and Space Administration (NASA) Goddard Space Flight Center (GSFC) and the National Oceanic and Atmospheric Administration (NOAA) NCEP [56]. GLDAS provides temperature data with a maximum spatial resolution of 0.25 • × 0.25 • and a temporal resolution of 3 h. These data are resampled and calculated into daily mean temperature of 500 m resolution to match other datasets and were fed to the model.

Land Cover
The scattering and attenuation degree of microwave emission energy of various land cover are different, which is a significant factor for retrieving snow depth [57]. The MODIS Land Cover Type (MCD12Q1) Version 6 data product provides global land cover types at yearly intervals and was selected to characterize the underlying surface coverage of the study area. MCD represents the fusion of two MODIS sensors mounted on Aqua and Terra, and its spatial resolution is 500 m. The International Geosphere-Biosphere Program (IGBP) classification system was adopted to classify land cover data into 17 categories, including Water, Permanent Wetlands, Various Forest, Shrublands, Grassland, Cropland, Urban and Built-up Lands and Barren, etc. Based on the approach of MCD12Q1 reclassification from Huang [43], the land cover in NX was reclassified into five types: Water, Forest, Grassland, Farmland and Bare land (showed in Figure 2 MODIS Land Cover Type (MCD12Q1) Version 6 data product provides global land cover types at yearly intervals and was selected to characterize the underlying surface coverage of the study area. MCD represents the fusion of two MODIS sensors mounted on Aqua and Terra, and its spatial resolution is 500 m. The International Geosphere-Biosphere Program (IGBP) classification system was adopted to classify land cover data into 17 categories, including Water, Permanent Wetlands, Various Forest, Shrublands, Grassland, Cropland, Urban and Built-up Lands and Barren, etc. Based on the approach of MCD12Q1 reclassification from Huang [43], the land cover in NX was reclassified into five types: Water, Forest, Grassland, Farmland and Bare land (showed in Figure 2). The aforementioned four land types account for 0.47%, 0.42%, 44.18%, 7.03% and 47.9% of the total area of NX respectively. Bare land dominates the eastern regions of NX, grassland dominates Altai Mountain and Tianshan mountain area, while cultivated land and pasture are important land-cover types in Ili River Valley.

Topographic
Topography also affects the rate of snow accumulation and melting, and has a significant effect on snow redistribution. The Shuttle Rader Topography Mission (STRM) version 004 digital elevation model (DEM) data with 90 m resolution used in this study was mapped by the U.S. Geological Survey (USGS) using interferometric radar and can be download from http://srtm.csi.cgiar.org/index.asp [58,59]. The original DEM data

Topographic
Topography also affects the rate of snow accumulation and melting, and has a significant effect on snow redistribution. The Shuttle Rader Topography Mission (STRM) version 004 digital elevation model (DEM) data with 90 m resolution used in this study was mapped by the U.S. Geological Survey (USGS) using interferometric radar and can be download from http://srtm.csi.cgiar.org/index.asp (accessed on 1 November 2020) [58,59]. The original DEM data were also transformed, mosaiced and extracted and resampled to 500 m spatial resolution by ArcGIS 10.1 Desktop software to match the other dataset according to the vector file of NX. The elevation data of NX is shown in Figure 1. Tianshan and Altai mountains are the two regions with the highest altitude and the most complex terrain. This study will focus on snow depth observation in the aforementioned regions.
A variety of terrain factor data derived from DEM are calculated and used as prior knowledge to represent geomorphic features. Slope is the degree of steepness and slowness of surface units, which can also significantly affect the scale and intensity of the flow and energy conversion of land cover. Furthermore, slope has an important influence on the accumulation and melting speed of snow. Aspect is defined as the projection direction of the slope normal on the horizontal plane. The slope and aspect determine the total solar radiation received by snow and then affect the snow melting speed [60]. The definition of fluctuation refers to the difference between the highest and lowest altitude in a specific area. Compared with DEM, fluctuation reflects the change of elevation difference in a pixel more accurately, which has a significant impact on snow redistribution caused by drifting snow. The surface roughness affects the surface momentum transport efficiency, and thus redistributes the snow cover [61]. The aforementioned topographic factors, with 500 m resolution, are all calculated by ArcGIS 10.1 Desktop software.

Ground Observation
To construct a retrieval model and compare the accuracy of downscaled snow depth and other reported snow depth datasets, daily ground snow depth observation data of national meteorological station obtained from National Meteorological Information Center (http://data.cma.cn/site (accessed on 1 November 2020)) were used as the objective truth value for verification. Ground observations from 1 November 2014 to 31 March 2015 were used to validate the downscaled results and observations from the other four snow seasons were used to develop the model.

Long-Term Series of Daily Snow Depth Dataset in China
A long-term series of the daily snow depth dataset developed by Che [2] was used to verify whether the spatial distribution of downscaled snow depth in this research is consistent with the widely accepted and reported snow depth datasets. Snow depth datasets were denoted as Environmental and Ecological Science Data Center for West China Snow Depth (WESTDC_SD), with 25 km spatial resolution, and the data are sorted in ASCII format. The original data used to retrieve the snow depth dataset were SMMR (1979)(1980)(1981)(1982)(1983)(1984)(1985)(1986)(1987), SSM/I (1987-2009) and SSMI/S (2008-2019) daily passive microwave BT data [15,62]. The data obtained are inconsistent since three sensors are mounted on different satellite platforms. The consistency of BT data is improved by simultaneous interpreting BT of different sensors. Ultimately, snow depth was retrieved by modified Chang's algorithm [16]. In this manuscript, the daily snow depth dataset in China from 1 November 2014 to 31 March 2015 was transformed into Geo-tiff format to evaluate and verify the downscaled snow depth.

Methodology
Due to the different characteristics of an optical sensor and passive microwave sensor, MODIS data of the optical sensor and MWRI data of the passive microwave radiometer have advantages and limitations in snow parameter retrieval. Generally speaking, MODIS data has a high spatial resolution. The visible and shortwave infrared bands can be used to identify snow cover effectively. However, large-scale snowfall was often accompanied by cloudy and bad weather. The strong reflection of visible and near-infrared at the cloud top will shield the ground information, resulting in the lack of snow parameters under the cloud. Also, cirrus clouds have similar spectral characteristics to snow cover in the visible band, which often leads to the misjudgment of snow cover. MWRI data can penetrate through cloud and fog, not affected by cloud layer and other factors. MWRI data has the ability to observe the ground snow information all day long and penetrate the snow layer to detect the snow depth and SWE. However, there is a large deviation in regional scale snow recognition due to the coarse spatial resolution of MWRI data.
To realize the complementary advantages of the two types of data, improve the accuracy of snow identification, maxing the snow information, and obtain daily continuous snow monitoring data under all-weather conditions with and without clouds, it is imperative to effectively combine the two types of data to retrieve snow parameters.
A collaborative retrieval of snow parameters based on MODIS and MWRI data is designed in this manuscript, and the technical flow chart is shown in Figure 3. The cloud will move rapidly with the wind, and its spatial position and coverage in the morning and afternoon will often change rapidly, while the snow coverage and the spatial position of the ground features are relatively fixed. Therefore, the complementary information and synthesis of the snow extent in the morning and afternoon can effectively remove the influence of cloud and restore the snow under the cloud layer. Meanwhile, the high-resolution information of the optical sensor is retained to the maximum extent. The maximum daily snow cover with 500 m resolution under all weather conditions is obtained by using optical and microwave data, high-resolution auxiliary data (DEM, Geodata, etc.) is added to the microwave data to map the snow depth. The concrete steps are as follows: Remote Sens. 2021, 13, x FOR PEER REVIEW 9 of 26 will move rapidly with the wind, and its spatial position and coverage in the morning and afternoon will often change rapidly, while the snow coverage and the spatial position of the ground features are relatively fixed. Therefore, the complementary information and synthesis of the snow extent in the morning and afternoon can effectively remove the influence of cloud and restore the snow under the cloud layer. Meanwhile, the high-resolution information of the optical sensor is retained to the maximum extent. The maximum daily snow cover with 500 m resolution under all weather conditions is obtained by using optical and microwave data, high-resolution auxiliary data (DEM, Geodata, etc.) is added to the microwave data to map the snow depth. The concrete steps are as follows: Step1: Using the remote-sensing data MOD10A1 and MYD10A1, the daily snow cover MXD10A1 is extracted cooperatively, and the cloud interference is removed based on the dynamic of clouds; Step2: Extraction of snow cover under cloud-covered using MWRI data based on decision tree algorithm; Step3: The snow cover retrieved from MODIS and MWRI single sensor is combined to further remove the influence of cloud, MXD10A1 was adopted in cloud free conditions, while MWRI derived snow cover was adopted in cloud covered conditions, the complete snow cover information without cloud interference is thus combined and obtained; Step4：Preparation of input variables: Resampling and spatiotemporal matching of snow cover and auxiliary data; Step5: Ground observations are used as target labels to train the model and minimize errors. Step 1: Using the remote-sensing data MOD10A1 and MYD10A1, the daily snow cover MXD10A1 is extracted cooperatively, and the cloud interference is removed based on the dynamic of clouds; Step 2: Extraction of snow cover under cloud-covered using MWRI data based on decision tree algorithm; Step 3: The snow cover retrieved from MODIS and MWRI single sensor is combined to further remove the influence of cloud, MXD10A1 was adopted in cloud free conditions, while MWRI derived snow cover was adopted in cloud covered conditions, the complete snow cover information without cloud interference is thus combined and obtained; Step 4: Preparation of input variables: Resampling and spatiotemporal matching of snow cover and auxiliary data; Step 5: Ground observations are used as target labels to train the model and minimize errors.

Snow Cover Identification from MWRI
Nowadays, snow cover datasets widely used in investigations of climate, hydrology, and glaciology are likely derived from MODIS data, which cover the whole Earth at a near-half daily frequency through the cooperative observation of Terra and Aqua [3,4]. The combination of MOD10A1 and MYD10A1 can effectively reduce cloud interference and obtain more accurate snow cover distribution. However, the cloud layer limits the optical sensor to detect the ground snow information. In winter, the cloud layer will cover more than 50% of NX [6]. Microwave data were used to retrieve snow cover distribution under clouds.
There is a large difference of polarization between snow cover and the bare surface, and the horizontal polarization is much smaller than the radiation BT of vertical polarization. With the increase of snow depth, the polarization effect of microwave radiation enhanced, the dual polarization BT of different frequencies tends to be consistent and there is free snow cover. In arid and semi-arid regions, water, frozen soil and cold desert will be misjudged as snow cover due to the similar microwave radiation characteristics. Snow cover should be accurately distinguished from other land covers before retrieving snow depth. The decision tree method has been widely used to distinguish snow from other scattering signatures [7][8][9] and is applied in this study. The detailed algorithm flow is shown in Figure 4.

Downscaling Snow Depth Retrieval Model
As stated in Section 1, previous studies have focused on the linear relationship between snow depth and microwave BT to retrieve snow depth. However, a nonparametric statistical approach has greater potential to explore the non-linear relationship between snow depth and BTD in a complex ground environment. A neural network, as a classic non-parametric method, has been proved to be effective in the retrieval and estimation of geological parameters. In this study, a three hidden layer backpropagation neural network, with 20, 20 and 10 neurons in the 1st, 2nd and 3rd hidden layer was constructed by the Pytorch framework.
As illustrated in Figure 5, the BTD, topographic data and meteorological data were resampled to the same spatial resolution of 500 m, and then the pixel values of specified longitude and latitude are extracted as input data to match the target task, that is, ground snow depth observation. The computation between neurons in each layer is illustrated in Formula (1): where is the amount of layers, l represents that the variables belong to the l-th layer, w [l] is the weight matrix and b [l] is the bias of the layer; is the sigmoid function, a non-linear activation function, is used by the neurons in the third hidden layer, which can capture non-linear relationships between various inputs and snow depth. The backpropagation is a method that repeatedly adjusts weights to minimize the disparity between actual and expected outputs. Mean squared error (MSE) cost function and stochastic gradient descent (SGD) optimization function were used in the proposed deep neural network to minimize a sum of the mean squared error, and SGD is described in Formula (2): where F is the cost function, ∂ is the learning rate, set to 0.001, w [l] and b [l] are the weight matrix and bias vector in the layer l.
and glaciology are likely derived from MODIS data, which cover the whole Earth at a near-half daily frequency through the cooperative observation of Terra and Aqua [3,4]. The combination of MOD10A1 and MYD10A1 can effectively reduce cloud interference and obtain more accurate snow cover distribution. However, the cloud layer limits the optical sensor to detect the ground snow information. In winter, the cloud layer will cover more than 50% of NX [6]. Microwave data were used to retrieve snow cover distribution under clouds. There is a large difference of polarization between snow cover and the bare surface, and the horizontal polarization is much smaller than the radiation BT of vertical polarization. With the increase of snow depth, the polarization effect of microwave radiation enhanced, the dual polarization BT of different frequencies tends to be consistent and there is free snow cover. In arid and semi-arid regions, water, frozen soil and cold desert will be misjudged as snow cover due to the similar microwave radiation characteristics. Snow cover should be accurately distinguished from other land covers before retrieving snow depth. The decision tree method has been widely used to distinguish snow from other scattering signatures [7][8][9] and is applied in this study. The detailed algorithm flow is shown in Figure 4.

Downscaling Snow Depth Retrieval Model
where F is the cost function, ∂ is the learning rate, set to 0.

Model Evaluation and Accuracy Metrics
The measured snow depth from ground stations was used as the truth value to evaluate the performance of the downscaling model and the accuracy of retrieval results. The specific evaluation criteria included the mean coefficient of determination (R 2 ), root mean squared error (RMSE), positive mean error (PME), negative mean error (NME), average absolute error (MAE) and bias (BIAS). The calculation Formulas (3)-(8) of the five indexes are as follows: In the aforementioned formulas, y i is the i-th measured snow depth from ground station; E(y i ) is the i-th retrieved snow depth based on microwave data; p and r are the number of samples whose measured snow depth are greater and less than the retrieved snow depth, respectively; and n is the total number of tested samples.

Corrections of Factors
The correlation between snow depth and BTD is described quantitatively in this section. Four  Altay station is located in the covered area of Altai Mountain vein. The mean temperature in cold winter is −23 • C, the mean snow depth in winter is 48 cm, and the annual snow cover days is more than 6 months with a 98.8% coverage rate. In winter, dry snow is dominated and well-distributed in Altay area. The underlying ground is farmland and alpine grassland. Tuscaloosa station is located in Tuscaloosa basin, with cold climate and mean snow depth of 32 cm in winter. As in Altay region, well-distributed dry snow dominates Tuscaloosa basin and the underlying surface is farmland or sparse vegetation, however, the terrain is flatter. The Ili River valley where Yining is located is a bell mouth terrain ringed on three sides by mountains. The water vapor from the Caspian Sea and the Black Sea accumulates here, making the region rich in precipitation and humid climate. The snow with wet feature dominates the hills and mountains, covering the underlying surface of farmland and vegetation heterogeneously. Ulan Wusu station is located in the Economic Zone on the north slope of Tianshan Mountain. The mean snow depth in this area is about 23 cm, and the underlying surface of snow is more complex, including farmland, shrub vegetation and shelterbelt. The above contents are the climate, topography and underlying surface characteristics of four typical stations in NX.
Four representative regions were selected to assess the correlation between BTD and snow depth. The BT polarization difference in Altay, Tuscaloosa, Yining and Ulan Wusu from 1 November 2014 to 31 March 2015, as well as the snow depth and ground surface temperature recorded by meteorological stations on the corresponding date. It can be seen that there are obvious changes in the polarization difference of MWRI BT corresponding to the snow cover, especially the brightness temperature difference between the lowfrequency channel and the high-frequency channel ( Figure 6). The specific features are as follows: (1) BTD18h36h and BTD18v36v increased with the accumulation of new snow and the increase of snow depth; (2) BTD decreases with the increase of the ground surface temperature, even the result is negative. Therefore, the air temperature factor should be considered in the passive microwave snow depth retrieval model.
Detailed coefficient of determination between snow depth and BTD at four typical stations were described in Table 2. Generally speaking, except for the BTD18h36h and BTD18v36v data of Yining station, the other BTD data have a high coefficient of determination with snow depth. Altay, where dry snow is evenly distributed, is the best place for satellite synchronous observation and ground verification in snow remote-sensing research. The coefficient of determination between BTD and snow depth shows the best result and reaches 0.4819. Moreover, the coefficient of determination between BTD18h36h, BTD18v36v and snow depth, which is widely used in the classical Chang algorithm, is as high as 0.6037 and 0.6243. The coefficient of determination at Yining station is relatively poor, which is related to the humid climate in the valley and the characteristic of wet snow. Only from the perspective of BTD, the mean R 2 of the six BTDs and snow depth at each station are relatively close, except that the poor coefficient of determination of BTD18v36v in Yining station reduces the mean value. Therefore, the six BTDs can be used to retrieve the snow depth in NX based on the contents described in the above figures and table.
can be seen that there are obvious changes in the polarization difference of MWRI BT corresponding to the snow cover, especially the brightness temperature difference between the low-frequency channel and the high-frequency channel ( Figure 6). The specific features are as follows: (1) BTD18h36h and BTD18v36v increased with the accumulation of new snow and the increase of snow depth; (2) BTD decreases with the increase of the ground surface temperature, even the result is negative. Therefore, the air temperature factor should be considered in the passive microwave snow depth retrieval model. Detailed coefficient of determination between snow depth and BTD at four typical stations were described in Table 2. Generally speaking, except for the BTD18h36h and BTD18v36v data of Yining station, the other BTD data have a high coefficient of determination with snow depth. Altay, where dry snow is evenly distributed, is the best place for satellite synchronous observation and ground verification in snow remote-sensing research. The coefficient of determination between BTD and snow depth shows the best result and reaches 0.4819. Moreover, the coefficient of determination between BTD18h36h, BTD18v36v and snow depth, which is widely used in the classical Chang  The radiation BT of snow cover in the high-frequency channel decrease obviously. Thus, the BTD between high and low frequency channels is used as scattering index, its change and threshold value are used as a characteristic index in retrieval algorithm. In the actual snow depth microwave radiation modeling, the influence of surface roughness (high mountain, hilly, flat, etc.) and land cover parameters (glacier, vegetation, forest, desert, desert, etc.) should be considered. Moreover, microwave radiation can penetrate snow and detect radiation from the underlying surface, which indicates that the detection results are affected by the volume scattering caused by the difference of soil physical properties in vertical direction. Therefore, the natural environmental differences of snowfields must be considered comprehensively in snow depth retrieval.

Case Study of Spatial Distribution
The comparison of snow depth mapping was used to show the consistency of the spatial distribution among the downscaled snow depth and existing snow depth datasets, and try to present more detailed information, especially in alpine regions and river valleys. The snow depth mapping for 1 December 2014, as a typical example, is shown in Figure 7.

Accuracy Validation
Snow depth observations from 37 national meteorological stations in NX (from 2014.11.1 to 2015.3.31) were selected to validate and evaluate the accuracy of downscaled snow depth and to compare the accuracy of MWRI_A, MWRI_D, WESTDC_SD. The detailed RMSE, PME, NME, MAE and BIAS are described in Table 3. MWRI_D_SD shows the highest RMSE of 10.55 cm, highest NME of -11.16 cm, highest MAE of 6.13 cm, and highest BIAS of -6.28 cm, followed by MWRI_A_SD with an RMSE of 9.45 cm, PME of 4.42 cm, NME of -10.11 cm, MAE of 5.49 cm and BIAS of -3.73 cm. WESTDC_SD, a widely reported and approved snow depth dataset [2,15,62], is more accurate than MWRI_A_SD and MWRI_D_SD in NX, with the lowest PME of 2.95 cm. PME values are The downscaling snow depth map (Figure 7g), the corresponding WESTDC snow depth datasets (Figure 7e), and the official ascending ( Figure 7a) and descending (Figure 7c) snow depth datasets via Jiang's [63] algorithm for the same day are shown. Due to orbital scanning, both official ascending and descending snow depth maps have large strip gaps across NX. Data only based on a single orbit will lead to missing snow depth information. WESTDC snow depth map reflects the snow distribution covering the entire NX (Figure 7e). However, the proposed downscaling approach can derive snowpack information and provide details on snow depth variations at 500 m spatial resolution. As shown in Figure 7, on 1 December 2014, the snow cover was primarily distributed in Altai Mountains, Tianshan Mountains and Western Tianshan Mountains. There is also a few shallow snow distributed over Hami area in the eastern of NX and Tuscaloosa in the western of NX. Tianshan Mountains and the Altay area are the two high-value regions of snow depth and Tianshan Mountains were selected for detailed display. In Figure 7h enlarged views, the snow depth distribution varies with the altitude of the mountains, showing obvious texture characteristics, that are consistent with the actual situation. WESTDC snow depth datasets, official ascending and descending snow depth datasets also described the regional snow depth information with coarse resolution, especially WESTDC snow depth datasets has good consistency with the snow depth map generated by the downscaling approach in spatial distribution. However, extensive heterogeneity of snow depth is only shown in the downscaled snow depth mapping.

Accuracy Validation
Snow depth observations from 37 national meteorological stations in NX (from 1 November 2014 to 31 March 2015) were selected to validate and evaluate the accuracy of downscaled snow depth and to compare the accuracy of MWRI_A, MWRI_D, WESTDC_SD. The detailed RMSE, PME, NME, MAE and BIAS are described in Table 3. MWRI_D_SD shows the highest RMSE of 10.55 cm, highest NME of −11.16 cm, highest MAE of 6.13 cm, and highest BIAS of −6.28 cm, followed by MWRI_A_SD with an RMSE of 9.45 cm, PME of 4.42 cm, NME of −10.11 cm, MAE of 5.49 cm and BIAS of −3.73 cm. WESTDC_SD, a widely reported and approved snow depth dataset [2,15,62], is more accurate than MWRI_A_SD and MWRI_D_SD in NX, with the lowest PME of 2.95 cm. PME values are lower than NME of all SD datasets. There are significant errors of the snow depth retrieved by microwave data compared with ground observations in NX, but the difference between PME and NME for downscaled SD is minimal. Most importantly, the downscaled snow depth has the best accuracy compared to the other three snow depth datasets, with the lowest RMSE of 8.16 cm, MAE of 4.73 cm and BIAS of −2.71 cm. Table 3. The root mean square error (RMSE), positive mean error (PME), negative mean error (NME), average absolute error (MAE) and bias (BIAS) for four types of snow depth (SD) dataset for NX base ground stations. In the overall accuracy evaluation, downscaled snow depth is dominant compared to snow depth datasets, while detailed results based on snow comparison at different depths are described in Table 4. Similar to the results in Table 3, downscaled snow depth performs best at different snow depth levels, followed by WESTDC_SD and MWRI_A_SD, and MWRI_D_SD has the poorest accuracy performance.

Improvement of Snow Cover
The daily snow cover used for snow depth retrieval in this study is combined from the collaborative retrieval of MODIS optical data and MWRI microwave data in different time phases in a single day. The variation of snow cover proportion of various snow datasets from 1 November 2014 to 31 March 2015 is described in Figure 8. In particular, MXD10A1 is the result of cloud removal synthesized by MOD10A1 and MYD10A1. MWRI_snowcover is the snow cover information retrieved from MWRI data by the decision tree algorithm. Composite snowcover is the final snow cover result synthesized by MXD10A1 and MWRI_snowcover. In a snow season, the mean snow cover percentage of MOD10A1 is 19.01%, with the highest value 50.02% and the lowest value 0.16%. In contrast, MYD10A1 is a snow dataset of satellite transit in the afternoon. The enhancement of solar radiation promotes the water vapor to rise and eventually leads to the increase of clouds. Therefore, MYD10A1 has more cloud proportion and less snow proportion, with a mean value of 18.00%, the highest value of 49.85% and lowest value of 0.09%. Due to the small variation of snow cover and the great change of cloud movement, the maximum snow cover MXD10A1 based on optical data can be obtained by the intersection of cloud pixels of MOD10A1 and MYD10A1, with a mean ratio of 25.61%.

A Rough Comparison of Snow Storage at the Study Area
The snow storage in pixel scale is calculated by units of the pixels, which is the total volume of snow in a specific area. The general Equation (9) is given as follows: where A is the snow area and SD is snow depth at pixel scale, s n o w v is the snow storage. The quantitative calculation of snow storage in the study area is of great significance to the monitoring and development of snow solid water resources. More intuitively, it can provide a way for the comparison and verification of snow depth datasets with different However, it is still impossible to detect the snow pixels under the clouds. Microwave radiation data was able to penetrate the clouds and detect the land cover. Thus, the results retrieved by optical data were accepted for cloud-free area, while the results retrieved by microwave data were accepted for cloud-covered area. From 1 November 2014 to 31 March 2015, the mean snow cover ratio of the region increased to 54.59% after the collaborative observation of the two types of data.

A Rough Comparison of Snow Storage at the Study Area
The snow storage in pixel scale is calculated by units of the pixels, which is the total volume of snow in a specific area. The general Equation (9) is given as follows: where A is the snow area and SD is snow depth at pixel scale, v snow is the snow storage. The quantitative calculation of snow storage in the study area is of great significance to the monitoring and development of snow solid water resources. More intuitively, it can provide a way for the comparison and verification of snow depth datasets with different scales. Figure 9 illustrates the same change trend of snow storage generated from three snow depth datasets in NX from 1 November 2014 to 31 March 2015. The snow storage increased from the beginning of November to the peak in February of the next year and then decreased from the latter part of February. From 1 November to Mid-December, MWRI_SD has the largest snow storage, WESTDC has the least snow storage, downscaled snow depth ranked second. The snow storage and rising trends of downscaled snow depth and WESTDC have good consistency from mid-December to latter January of the next year. Starting from the snowmelt phase in February, the downscaled snow depth has the fastest decreasing trend. It is worth noting that during the snowmelt stage, both downscaled snow depth and MWRI_SD fluctuate and oscillate, while the downward trend of WESTDC is relatively flat. First of all, this difference may be due to the lower spatial resolution and sensitivity of MWRI than that of SSMI/S, and the relatively poor accuracy of BT. Secondly, downscaled snow depth and MWRI_SD have the same trend of fluctuation in the same period, but the amplitude of downscaled snow depth fluctuation is smaller, which indicates the results of downscaled snow depth show better stability than MWRI_SD. Finally, with the increase of temperature in March, accelerated snow melting leads to an increase in liquid water and a greater proportion of wet snow. In the above study, wet snow pixels are eliminated in the snow recognition based on decision tree algorithm. Therefore, since the snowmelt period in March, the snow storage calculated based on the downscaling algorithm is the smallest compared with the other two snow data.
Remote Sens. 2021, 13, x FOR PEER REVIEW 18 of 26 Figure 9. Snow storage generated from three snow depth datasets.

Error Analysis Based on Multiple Factors
In Section 4.3, the accuracy of downscaled snow depth has been validated and evaluated, which shows good performance compared with the snow depth datasets with 25 km resolution. In this section, we will analyze the error distribution under various terrain conditions, and discuss the influence of terrain factors on the error of snow depth retrieval. Figure 10 reveals that average RMSE of each station (from 2014.11.1 to Figure 9. Snow storage generated from three snow depth datasets.

Error Analysis Based on Multiple Factors
In Section 4.3, the accuracy of downscaled snow depth has been validated and evaluated, which shows good performance compared with the snow depth datasets with 25 km resolution. In this section, we will analyze the error distribution under various terrain conditions, and discuss the influence of terrain factors on the error of snow depth retrieval. Figure 10 reveals that average RMSE of each station (from 1 November 2014 to 31 March 2015) has a large difference. However, the spatially consistent trends in error are consistent across the four snow depth datasets. The blue points represent the average RMSE in the downscaled snow depth datasets, which is the lowest of the four SD datasets. This is consistent with the findings in Section 4.  N). With the exception of Tianchi station, which is located in Tianchi Scenic Area near Urumqi, the other two stations are located in Ili River Valley. As mentioned in Section 4.1, the unstable snow characteristics in the Ili River valley, which is dominated by wet snow, create significant uncertainty in snow depth retrieval and hence high errors. Another interesting finding is that the mean RMSE is positively correlated with elevation, which indicates that the accuracy of snow depth retrieval decreases with elevation. By comparing the blue line (Downscaled_SD), orange line (Westdc_SD), gray line (MWRI_A_SD) and yellow line (MWRI_D_SD), we can find that the slope of the blue line is the smallest, which also shows that RMSE of downscaled snow depth increases the slowest with the increase of elevation, showing the best performance.  We also counted the distribution of RMSE across different aspects. As shown in Figure 12, the values of RMSE are larger in the west and south and east directions and smaller in the remaining directions. This may be related to the insolation conditions, where the west, south, and east sides receive more solar radiation during the day, which changes the moisture, density, snow particle size, and other minor parameters of the snowpack, resulting in a decrease in the accuracy of the snow depth retrieval. We also counted the distribution of RMSE across different aspects. As shown in Figure 12, the values of RMSE are larger in the west and south and east directions and smaller in the remaining directions. This may be related to the insolation conditions, where the west, south, and east sides receive more solar radiation during the day, which changes the moisture, density, snow particle size, and other minor parameters of the snowpack, resulting in a decrease in the accuracy of the snow depth retrieval.

Advantages and Limitations
Downscaling a snow depth model based on deep learning has the two advantages. Firstly, based on the validation of snow depth data measured by ground observation stations, the proposed model shows the best accuracy and minimum error compared with the widely reported public datasets. Secondly, all the input variables can be obtained quickly, which is helpful for the future business promotion of the model.

Advantages and Limitations
Downscaling a snow depth model based on deep learning has the two advantages. Firstly, based on the validation of snow depth data measured by ground observation stations, the proposed model shows the best accuracy and minimum error compared with the widely reported public datasets. Secondly, all the input variables can be obtained quickly, which is helpful for the future business promotion of the model.
Although the downscaling snow depth model performs well in accuracy and spatial resolution, there are still some limitations that warrant further improvement. We compared the altitude distribution of observation stations with that of the whole NX. There is a large elevation difference in NX, from basin to hill to mountain. As shown in Table 5, the proportion elevation ranges from 500 m to 1000 m, from 1000 m to 1500 m in NX it is 31.95% and 22.79%, respectively. The location of 36 observation stations in NX was more determined by traffic, terrain, population and other factors. Therefore, the proportion of observation stations below 500 m, from 500 m to 1000 m reaches 29.73% and 32.43%. Nevertheless, as shown in Figure 13, Stations distribution across elevation is similar to NX across elevation. Therefore, the upscaling performance of the algorithm is credible.  Although the downscaling snow depth model performs well in accuracy and spatial resolution, there are still some limitations that warrant further improvement. We compared the altitude distribution of observation stations with that of the whole NX. There is a large elevation difference in NX, from basin to hill to mountain. As shown in Table 5, the proportion elevation ranges from 500 m to 1000 m, from 1000 m to 1500 m in NX it is 31.95% and 22.79%, respectively. The location of 36 observation stations in NX was more determined by traffic, terrain, population and other factors. Therefore, the proportion of observation stations below 500 m, from 500 m to 1000 m reaches 29.73% and 32.43%. Nevertheless, as shown in Figure 13, Stations distribution across elevation is similar to NX across elevation. Therefore, the upscaling performance of the algorithm is credible. Similar to the above study, stations distribution and NX across aspects were compared in Table 6 and Figure 14. The distribution of aspects in the whole NX is relatively balanced, with the most pixels in the north aspect, accounting for 17.32% of all pixels, and the least pixels in the east aspect, accounting for 8.02%. Stations' distribution across aspect is quite different, with the highest proportion of 27.03% in north aspect and lowest proportion of 5.41% in northeast aspect, southwest aspect and west aspect. As shown in Figure 14, the aspect distribution of the stations is different from that of the whole NX, and the algorithm has limitations in the scalability of the aspect. Similar to the above study, stations distribution and NX across aspects were compared in Table 6 and Figure 14. The distribution of aspects in the whole NX is relatively balanced, with the most pixels in the north aspect, accounting for 17.32% of all pixels, and the least pixels in the east aspect, accounting for 8.02%. Stations' distribution across aspect is quite different, with the highest proportion of 27.03% in north aspect and lowest proportion of 5.41% in northeast aspect, southwest aspect and west aspect. As shown in Figure 14, the aspect distribution of the stations is different from that of the whole NX, and the algorithm has limitations in the scalability of the aspect.    Due to the safety of the staff and the convenience of material transportation, the ground meteorological observation stations are usually located in the flat and small slope area. Therefore, it is difficult to discuss the scalability of the algorithm on the slope. Based on the above analysis, there are many limitations on the scalability of the downscaling snow depth model in different altitudes, aspect, slope and other environ- Due to the safety of the staff and the convenience of material transportation, the ground meteorological observation stations are usually located in the flat and small slope area. Therefore, it is difficult to discuss the scalability of the algorithm on the slope. Based on the above analysis, there are many limitations on the scalability of the downscaling snow depth model in different altitudes, aspect, slope and other environments.

Conclusions
Accurate high spatial resolution snow depth mapping is of great importance for regional snow disaster assessment and hydrological simulation, especially for NX with a complex topography and abundant snow. Microwave data with coarse spatial resolution are not enough to detect the spatiotemporal heterogeneity of snow depth. This manuscript proposes a deep learning approach for spatial downscaling of snow depth over the alpine region. Compared to the traditional remote-sensing snow depth retrieval method based on linear regression, this method can better adapt to arid and semi-arid regions with complex topographical and specific geo-meteorological features. Accuracy of downscaled snow depth estimations has been improved compared to existing snow datasets, with the lowest RMSE and MAE (8.16 cm and 4.73 cm, respectively) compared with WESTDC_SD (9.38 cm and 5.36 cm), MWRI_A_SD (9.45 cm and 5.49 cm) and MWRI_D_SD (10.55 cm and 6.13 cm). Furthermore, the downscaled snow depth depicts the distribution characteristics of snow depth along with the altitude in the alpine region. The above results are enough to prove the effeteness and superiority of this method. Although the method has multi-element requirements for the input data and has a limitation on retrieval scene, it has achieved fruitful results in the downscaling snow depth retrieval, which also provides a reference for subsequent snow depth retrieval.
Author Contributions: All authors contributed significantly to this manuscript. L.Z. designed and performed the experiment and finished the manuscript writing. Y.Z., J.W., W.T. and Q.L. offered guidance to complete the work. L.Z., G.M., X.K. and Y.C. carried out data processing and data analysis. All authors have read and agreed to the published version of the manuscript.