Retrieval of Snow Depth over Arctic Sea Ice Using a Deep Neural Network

: The accurate knowledge of spatial and temporal variations of snow depth over sea ice in the Arctic basin is important for understanding the Arctic energy budget and retrieving sea ice thickness from satellite altimetry. In this study, we develop and validate a new method for retrieving snow depth over Arctic sea ice from brightness temperatures at di ﬀ erent frequencies measured by passive microwave radiometers. We construct an ensemble-based deep neural network and use snow depth measured by sea ice mass balance buoys to train the network. First, the accuracy of the retrieved snow depth is validated with observations. The results show the derived snow depth is in good agreement with the observations, in terms of correlation, bias, root mean square error, and probability distribution. Our ensemble-based deep neural network can be used to extend the snow depth retrieval from ﬁrst-year sea ice (FYI) to multi-year sea ice (MYI), as well as during the melting period. Second, the consistency and discrepancy of snow depth in the Arctic basin between our retrieval using the ensemble-based deep neural network and two other available retrievals using the empirical regression are examined. The results suggest that our snow depth retrieval outperforms these data sets.


Introduction
Snow over sea ice is an integral part of the Arctic climate system, having variability associated with a broad range of processes.Snow has a higher reflectivity and much lower thermal conductivity compared to sea ice [1][2][3].As snow develops and accumulates, it increases surface albedo, leading to reduced solar radiation absorbed by sea ice (except during the polar night).It reduces the heat transferred from the ocean, through sea ice, to the atmosphere, slowing down sea ice growth [4,5].It modifies surface roughness of sea ice, affecting atmosphere-ice drag coefficient and interactions [6].It can increase sea ice thickness through snow-to-ice conversion [7].As snow melts, it forms melt ponds, enhancing solar radiation absorbed by sea ice and the melt of the ice pack [8,9].It also increases freshwater input into the ocean, affecting the buoyancy of water in the ocean.Hence variations of sea ice and snow over it are intricately linked.Additionally, snow over sea ice influences primary production in and underneath sea ice [10,11].
Snow depth over sea ice is a required parameter for the freeboard-based retrieval of sea ice thickness from satellite altimetry (radar and laser altimetry).For radar altimetry, it assumes that the radar signal can penetrate snow and the main reflectance is from the snow-ice interface, and snow depth is required once for the retrieval of sea ice thickness [12].For laser altimetry, the main reflectance is from the atmosphere-snow interface, and snow depth is required twice for the freeboard retrieval and the conversion of snow freeboard to sea ice thickness [13].
Thus, accurate knowledge of spatial and temporal variations of snow depth over sea ice in the Arctic basin is needed for diagnosing changes in surface heat and freshwater budget of the coupled atmosphere-sea ice-ocean system [14]; understanding rapid changes of Arctic environments, particularly rapid decline of Arctic sea ice [15]; reducing the uncertainty of the retrieval of sea ice thickness from satellite altimetry [16]; evaluating snow depth simulated by numerical weather prediction models and climate system models to improve the forecast of sea ice and climate.
However, snow depth over Arctic sea ice, particularly on the basin scale, has received less research attention than sea ice concentration and thickness.As a result, spatial and temporal variations of snow depth over Arctic sea ice are poorly known.This contributes to large uncertainty in estimates of Arctic climate variability and limits our ability to validate climate models used to project 21st century climate [17].
In-situ measurements of snow depth over sea ice are sparse in the Arctic, having very limited spatial and temporal coverage.A climatology of snow depth in the Arctic Ocean was developed based on direct measurements of snow depth from a number of drifting Soviet stations deployed on MYI for the period of 1954-1991 (hereafter referred to as W99, [15]).This climatology has been widely used for retrieving sea ice thickness from satellite altimetry and evaluating basin-wide snow depth simulations by climate models.Recently, a new climatology of spring snow depth has been developed for 1959-1989 using the Soviet airborne expeditions measurements [18], which is an important extension of W99.However, the Arctic has undergone rapid environmental changes for the past three decades, i.e., amplified warming in the Arctic, rapid decrease of sea ice extent and thickness, particularly MYI [19][20][21].Thus, this climatology cannot represent the current snow state [22,23].This also contributes to a serious source of uncertainty for the retrieval of sea ice thickness and volume or their trend analysis using satellite altimetry [24].Bunzel et al. [16] found that the snow depth climatology from different sources could lead to significant differences in the retrieved Arctic sea ice thickness and volume.Moreover, it is not clear to what extent variability (i.e., trend) in the retrieved Arctic sea ice thickness and volume are influenced by the inability of the snow depth climatology to properly represent its variability.A number of field campaigns have been conducted in the Arctic to gain insights into the evolution of snow depth over sea ice and associated controlling factors.For instance, the Surface Heat Budget of the Arctic Ocean (SHEBA) during 1997-1998 made extensive snow depth measurements in the Beaufort Sea [25,26].A number of sea ice mass balance buoys (IMBs) have been deployed on MYI or FYI floes in the Arctic Ocean to monitor local ice mass balance change, including snow depth for the past 2-3 decades [27].The NASA Operation IceBridge (OIB) started a long-term airborne campaign in the Beaufort Sea, the Canadian Archipelago, and north of Greenland in 2009 [23].Airborne snow depth measurements have been conducted in March and April every year using a frequency-modulated snow radar [28].A Norwegian Young Sea ICE (N-ICE2015) expedition [29] was conducted over four separate ice drifts from January to June 2015 in the Atlantic sector of the Arctic Ocean (north of Svalbard).Observations of snow depth were made using snow probe measurements, snow and IMB buoys, and electromagnetic induction survey.An ongoing expedition, the Multidisciplinary drifting Observatory for the Study of Arctic Climate (MOSAiC), will conduct a more comprehensive measurement of snow depth over Arctic sea ice for a better understanding of the changing Arctic.
Satellite remote sensing offers the spatial and temporal resolution to retrieve snow depth over sea ice that can be used for basin-scale snow depth variability analysis and sea ice thickness derivation.For passive microwave remote sensing, like the Special Sensor Microwave/Imager (SSM/I) and the Special Sensor Microwave Imager/Sounder (SSMIS), the radiation emitted from snow depends on its properties (i.e., physical temperature, snow depth, layer structure, grain size, density, salinity and wetness), as well as the underlying ice conditions (i.e., FYI or MYI) [30][31][32][33].The satellite measured brightness temperature at different frequencies, and their gradient ratio can be used to retrieve snow depth.However, MYI tends to have a similar emission signal as snow in microwave frequencies, which can cause an ambiguity between snow and ice [34,35], i.e., for dry snow, the 37-GHz channel becomes saturated for snow depth around 50 cm, the 19-GHz channel becomes saturated for snow depths around 100 cm [30,36], and the saturated depth decreases with increasing snow wetness.For snow depth retrieved with lower frequency passive microwave remote sensing, like the soil moisture and ocean salinity (SMOS), it operates at the L-band and can penetrate snowpack.Using an L-band radiation model, snow depth can be estimated by considering the influence of a snow layer on the brightness temperature above the ice.However, the ice thickness is usually assumed to constant or based on the buoyancy relationship, and the model parameters also have a significant effect on the retrieval result [37][38][39].For active microwave remote sensing, like ICESat, the laser signal reflects in the air-snow interface, the satellite measured snow-freeboard that can be converted to snow depth under an assumption of isostatic equilibrium [13], whereas like CryoSat2, the radar signals can penetrate snowpack and are mainly reflected at the interface of snow and sea ice, the satellite measured ice-freeboard that can be converted to snow depth under same concept [12], or the snow depth can be estimated from the difference between the overlapped snow-freeboard and ice-freeboard [40].However, surface features, like slope and roughness, affect the accuracy of surface elevation [41], and saline snow over sea ice may result in a vertical shift in the location of the main reflectance interface, leading to the misrepresentation of sea ice freeboard for CryoSat2 [42].Thus, retrieving snow depth from space is challenging.The first polar region retrieval of snow depth over Antarctic sea ice was conducted by Markus and Cavalieri [43] using passive microwaves, later extended to Arctic sea ice.They found that the spectral gradient ratio of 19 and 37 GHz vertical polarization brightness temperature has a good correlation with the observed snow depth.The gradient ratio is expressed as  Based on this empirical regression, they produced daily maps of snow depth over Arctic sea ice.This method with updated regression coefficients has also been applied to brightness temperature from AMSR-E and AMSR2 sensors to derive snow depth in the Arctic basin [28,33,[44][45][46].However, the snow depth retrieval algorithm of Markus and Cavalieri works well for FYI, but not for MYI [45,46].This is related to emissivity and volume scattering from MYI, i.e., MYI has a similar effect on brightness temperatures as snow in microwave frequencies [34,35].Thus, there is no value of snow depth over MYI in their retrieved data set.In addition to the problem of retrieving snow depth over MYI, there is also an upper limit in the maximum retrievable snow depth.Maaß et al. [38] developed an emission model for retrieving snow depth over thick sea ice by taking advantage of low frequency brightness temperature in the L-band (1.4 GHz, has a large penetration depth) from the Microwave Imaging Radiometer with Aperture Synthesis (MIRAS) on the SMOS.Their analysis showed that the sensitivity of brightness temperature to snow depth is about an order of magnitude higher than that to ice thickness (especially for the ice thicker than 1.5 m).Rostosky et al. [35] extended the empirical method by considering lower frequency brightness temperature (7 GHz).They showed that snow depth over MYI could be derived using lower frequency passive microwave measurements.Kilic et al. [47] combined 6.9, 18.7, and 36.5 GHz in vertical polarization from AMSR2 with a multi-linear regression model to retrieve snow depth over both FYI and MYI.Guerreiro et al. [48] tested an algorithm to retrieve snow depth over Arctic sea ice based on CryoSat-2 radar altimeter and SARAL/AltiKa Ka-band altimeter.Instead of the regression and emission model methods mentioned above, in this study, we investigate an alternative approach to retrieve snow depth over Arctic sea ice, which is a deep neural network.Tsang et al. [49] and Davis et al. [50] attempted to use a simple artificial neural network (ANN) to derive some snow parameters over sea ice from passive microwave remote sensing, but no ground truth was used to validate the derivation in their studies.Tedesco et al. [51,52] also use ANN to obtain snow water equivalent and snow depth over land surface using brightness temperature from SSM/I or AMER-E as the input, instead of over sea ice.These studies suggested that ANN has the potential to learn the complex relationships between snow parameters and input data, although they used ANN with only one hidden layer and selected frequencies of brightness temperature as the input.The application of the neural network is still developing.
A deep neural network, stemmed from the ANN theory, can lead to better data representation learning, holding great promise to address the challenges of the retrieval of snow depth over Arctic sea ice.Compared to the traditional ANN, the deep neural network has a more sophisticated structure, i.e., more than two hidden layers.Each hidden layer contains a number of neurons that allow the network to learn more complicated, non-linear relationships between multi-frequencies (channels) microwave emissions from within a snow layer and snow depth, i.e., the neurons in the first hidden layer make simple decisions, and then the neurons in the second hidden layer make more complicated decisions than those in the first hidden layer.More recently, Braakmann-Folgmann and Donlon [53] attempted to retrieve snow depth over sea ice a deep neural network using a gradient ratio of selected frequencies of brightness temperatures as the input and showed promising results.However, a specific network shows certain sensitive dependence on initially assigned weight and bias to the neurons.
The purpose of this paper is to develop a new algorithm-an ensemble-based deep neural network-to generate a robust network to retrieve snow depth over sea ice in the Arctic basin based on brightness temperatures measured by a passive microwave radiometer at all frequencies.This paper is organized as follows.In Section 2, we outline data sets used and the deep neural network constructed.In Section 3, we examine the robustness of our network, the accuracy of our snow depth retrieval against direct measurements obtained from sea ice mass balance buoys, and the consistency and discrepancy in snow depth over Arctic sea ice between our retrieval and two other available retrievals.The conclusion section summarizes the contributions of this study and outlines possible directions for future research.

Satellite Data
In this study, brightness temperatures at different microwave frequencies measured by a Special Sensor Microwave Imager/Sounder (SSMIS) are utilized.Specifically, we use brightness temperatures from 7 channels as the input data to retrieve snow depth over Arctic sea ice, including 19 GHz V/H, 22 GHz V, 37 GHz V/H, and 91 GHz V/H (H and V denote horizontal and vertical polarization, respectively).As suggested above, different channels have different penetration depth into the snow as well as sea ice.Thus, here we used brightness temperatures from all channels as the input, given that the deep neural network can learn more complex and non-linear relationships between multi-frequency (channel) microwave emissions from the inside snow layer and snow depth.Here, the brightness temperature is mapped on a polar stereographic grid with a spatial resolution of 25 km (see [54] for more detailed information).The daily gridded brightness temperature from 1992 to 2018 is used here.
Additionally, the daily sea ice concentration processed by the NASA Team algorithm [55] is employed as the sea ice mask, with a spatial resolution of 25 km.Here we focus on grid cells with sea ice concentration greater than 30%, which is often used as the threshold to define sea ice extent and represent the uncertainty of sea ice concentration.

Sea Ice Mass Balance Buoy Data
The sea ice mass balance buoy (IMB) is designed to monitor the evolution of local ice mass balance due to thermodynamic and dynamic changes as the IMB drifts with the ice [27].The IMB includes above and under ice acoustic sounders to measure snow depth and sea ice thickness.Since 2000, a number of IMBs were initially deployed either during the melting period or at the beginning of the Remote Sens. 2019, 11, 2864 5 of 21 freeze-up period in the Arctic Ocean.In this study, the IMB data archived at the CRREL-Dartmouth mass balance buoy program are utilized.These IMB data have been cleaned and processed to remove outliers (http://imb-crrel-dartmouth.org,[56]).Specifically, we use 38 IMBs with quality-controlled snow depth measurements as the target (training) data for the snow depth retrieval.The data spans from August 2008 to August 2016.These IMBs were initially deployed either during the melting period or at the beginning of the freeze-up period.Figure 1 shows the trajectories of the 38 IMBs.They cover large areas of the Beaufort and Chukchi Seas, the central Arctic Ocean and the Greenland Sea.The snow depth is typically measured about every 2-4 h by the acoustic sounder, which records the position of the snow or sea ice surface with an accuracy of ~1 cm [56,57].Here we calculate the daily average of the IMB snow depth and resample it on the aforementioned 25 km polar stereographic grid.On average, each pixel contains 7 IMB measurements (not shown).In total, there are 6260 daily averaged snow depths after the resampling (hereafter referred to as SD-IMB).
are 6260 daily averaged snow depths after the resampling (hereafter referred to as SD-IMB).
Figure 2 shows the probability distribution of the daily averaged IMB snow depth.The snow depth varies from 0 to 85.7 cm, with a mean of 21.0 cm (a mode of 18.8 cm).About 90% of the observations have snow depth less than 40 cm.We also plot the histogram of the daily averaged IMB snow depth as a function of month, IMB measured sea ice thickness, and satellite-based sea ice concentration.As shown in Figure 3a, it appears that the IMB snow depth measurements are well distributed throughout the year, having observations for all months (although the number of observations in July and August are relatively less than others months, Figure 3a).Figure 3b is the histogram of the daily averaged IMB measured sea ice thickness.Typically, the FYI can grow to thickness up to 2 m thermodynamically, and MYI ranges from ~1.5 to 2 m to the upper end.As shown in Figure 3b, most snow depth measurements are over the ice thickness ranging from ~1.3 and ~2.3 m, which includes 63% of the measurements.The number of observations increases as sea ice concentration increases, particularly 64% of the observations falling within the ice concentration greater than 95% (Figure 3c).It should be noted that the majority of the IMB measurements were from the MYI, since the FYI on which IMBs initially developed became the MYI several weeks later.
In this study, we applied random sampling to the 6260 daily averaged IMB snow depths.Of them, 5260 were selected and used as the target (training) data in our snow depth retrieval.The remaining 1000 IMB snow depths were used to validate the performance of our retrieved snow depth (hereafter referred to as the validation data).Figure 2 shows the probability distribution of the daily averaged IMB snow depth.The snow depth varies from 0 to 85.7 cm, with a mean of 21.0 cm (a mode of 18.8 cm).About 90% of the observations have snow depth less than 40 cm.We also plot the histogram of the daily averaged IMB snow depth as a function of month, IMB measured sea ice thickness, and satellite-based sea ice concentration.As shown in Figure 3a, it appears that the IMB snow depth measurements are well distributed throughout the year, having observations for all months (although the number of observations in July and August are relatively less than others months, Figure 3a).Figure 3b is the histogram of the daily averaged IMB measured sea ice thickness.Typically, the FYI can grow to thickness up to 2 m thermodynamically, and MYI ranges from ~1.5 to 2 m to the upper end.As shown in Figure 3b, most snow depth measurements are over the ice thickness ranging from ~1.3 and ~2.3 m, which includes 63% of the measurements.The number of observations increases as sea ice concentration increases, particularly 64% of the observations falling within the ice concentration greater than 95% (Figure 3c).It should be noted that the majority of the IMB measurements were from the MYI, since the FYI on which IMBs initially developed became the MYI several weeks later.
In this study, we applied random sampling to the 6260 daily averaged IMB snow depths.Of them, 5260 were selected and used as the target (training) data in our snow depth retrieval.The remaining 1000 IMB snow depths were used to validate the performance of our retrieved snow depth (hereafter referred to as the validation data).

Deep Neural Network
As mentioned previously, most of the previous studies used regression type techniques to find linear relationships between snow depth and input data, which is then employed to retrieve snow depth.However, the linear regression type techniques cannot capture the non-linear relationships between snow depth and input data.As discussed previously, the simple ANN has the potential to learn the relationships between the input and target data, which has been applied to retrieve snow parameters over land surface with promising results, even with one hidden layer [51].Compared to the traditional ANN, the deep neural network has more than two hidden layers that allow it to learn more complicated, non-linear relationships between multi-frequency (channel) microwave emissions from the inside snow layer and snow depth.
In this study, we constructed a three hidden layer backpropagation neural network, with 20, 20, and 10 neurons in the 1st, 2nd, and 3rd hidden layer, respectively, using the MATLAB deep learning toolbox (Figure 4).To determine an optimal layer structure, we tested many network structures with a different number of layers and neurons and found that [20,20,10] has relatively better performance.The sigmoid function, a non-linear activation function, is used by the neurons in the three hidden layer, which can capture non-linear relationships between inputs and outputs.As illustrated in Figure 4, the value of the ith neuron in the 1st hidden layer (a 1 i ) can be expressed as , where I is the vector of the input data, ω 1 i is the weight of the ith neuron in the 1st hidden layer, and b 1 i is the bias of the ith neuron in the 1st hidden layer.The backpropagation is a method that repeatedly adjusts weights to minimize the departure between actual and expected outputs.The cost function used in our deep neural network is the mean squared error.The Levenberg-Marquardt algorithm is used for minimizing a sum of the mean squared error, addressing some of the limitations of the backpropagation (i.e., overfitting, [58]).

Deep Neural Network
As mentioned previously, most of the previous studies used regression type techniques to find linear relationships between snow depth and input data, which is then employed to retrieve snow depth.However, the linear regression type techniques cannot capture the non-linear relationships between snow depth and input data.As discussed previously, the simple ANN has the potential to learn the relationships between the input and target data, which has been applied to retrieve snow parameters over land surface with promising results, even with one hidden layer [51].Compared to the traditional ANN, the deep neural network has more than two hidden layers that allow it to learn more complicated, non-linear relationships between multi-frequency (channel) microwave emissions from the inside snow layer and snow depth.
In this study, we constructed a three hidden layer backpropagation neural network, with 20, 20, and 10 neurons in the 1st, 2nd, and 3rd hidden layer, respectively, using the MATLAB deep learning toolbox (Figure 4).To determine an optimal layer structure, we tested many network structures with a different number of layers and neurons and found that [20,20,10] has relatively better performance.The sigmoid function, a non-linear activation function, is used by the neurons in the three hidden layer, which can capture non-linear relationships between inputs and outputs.As illustrated in Figure 4, the value of the ith neuron in the 1st hidden layer (  1 ) can be expressed as where  is the vector of the input data,   1 is the weight of the ith neuron in the 1st hidden layer, and   1 is the bias of the ith neuron in the 1st hidden layer.The backpropagation is a method that repeatedly adjusts weights to minimize the departure between actual and expected outputs.The cost function used in our deep neural network is the mean squared error.The Levenberg-Marquardt algorithm is used for minimizing a sum of the mean squared error, addressing some of the limitations of the backpropagation (i.e., overfitting, [58]).

Two Available Long-Term Basin-Scale Snow Depth Retrievals for Intercomparison
To date, there are two long-term retrievals of snow depth over sea ice in the Arctic basin (see Table 1).The first available data set is archived at the NASA Cryosphere Science Research Portal (https://neptune.gsfc.nasa.gov/csb/index.php?section=53, [59]), which is also developed using brightness temperatures from SSM/I and SSMIS.Specifically, snow depth is calculated based on the spectral gradient ratio of the 18.7 and 37 GHz vertical polarization channels (hereafter referred to as SD-NASA).However, there is no snow depth information over MYI, as well as over the melting area from 1 April to 1 October.This data set provides a daily map of snow depth from November 1978 to June 2017 with a spatial resolution of 25 km.The second available data set is archived at the sea ice remote sensing group of the University of Bremen (https://seaice.uni-bremen.de/data/amsre/snow_daygrid[60]and https://seaice.uni-bremen.de/data/amsr2/snow_daygrid,[61]), which is developed by using the

Two Available Long-Term Basin-Scale Snow Depth Retrievals for Intercomparison
To date, there are two long-term retrievals of snow depth over sea ice in the Arctic basin (see Table 1).The first available data set is archived at the NASA Cryosphere Science Research Portal (https://neptune.gsfc.nasa.gov/csb/index.php?section=53, [59]), which is also developed using brightness temperatures from SSM/I and SSMIS.Specifically, snow depth is calculated based on the spectral gradient ratio of the 18.7 and 37 GHz vertical polarization channels (hereafter referred to as SD-NASA).However, there is no snow depth information over MYI, as well as over the melting area from 1 April to 1 October.This data set provides a daily map of snow depth from November 1978 to June 2017 with a spatial resolution of 25 km.The second available data set is archived at the sea ice remote sensing group of the University of Bremen (https://seaice.uni-bremen.de/data/amsre/snow_daygrid[60] and https://seaice.uni-bremen.de/data/amsr2/snow_daygrid,[61]), which is developed by using the similar method of Markus and Cavalieri with similar frequencies of brightness temperatures from the AMSR-E and AMSR-2 (hereafter referred to as SD-UB).This data does not filter out snow depth over MYI or wet snow.This data set provides daily composites of snow depth from June 2002 to October 2011 (when the AMSR-E stopped producing data) and from July 2012 to present (the AMSR2 was launched in May 2012) to present, with a spatial resolution of 6.25 km.In Section 3.3, we carry out a comparison between our retrieval using the ensemble-based deep neural network and the two available retrievals using the empirical regression.

Ensemble-Based Deep Neural Network
We ran the deep neural network described in Section 2.3 using seven channels of brightness temperatures as the input data and 5260 selected IMB snow depths as the target (training) data.First, we ran the deep neural network 100 times with random weights and biases assigned initially for the neurons in the network.In this way, 100 networks are generated.The performance of the 100 networks is shown in Figure 5a,c.The correlation coefficients between the snow depth obtained from these networks and IMB observed snow depth vary from 0.26 and 0.80.The root mean square errors (RMSE) vary from 8.8 cm to 14.1 cm.This suggests that the relationships identified by the networks show sensitive dependence on initially assigned weight and bias for the neurons in the network, which is also indicated by previous studies [62].In order to find a robust network with high correlation between the retrieved and observed snow depth, we followed an approach widely used in the climate modeling community, ensemble averaging.That is, the mean of the ensemble members is expected to outperform individual ensemble members, i.e., closer to the observation and higher prediction skill.Through the ensemble averaging, we combined multiple networks to generate the desired output.Specifically, here we selected the networks within the 10-90 percentile of the 100 networks.The average snow depth derived from the 80 networks was then used as the final output.As shown in Figure 5a, the retrieved snow depth from the ensemble averaging has a very good relationship with the IMB snow depth.The correlation coefficient is 0.79, which is higher than 99 of 100 networks.The retrieved snow depth from the ensemble averaging has an RMSE of 9.0 cm, which is lower than 98 of 100 networks.Note that we found that the correlation increases with increasing number of networks used for the ensemble averaging, but the correlation appears to reach the highest and level off when the number of networks is greater than about 30-40.The results change little if more than any 80 networks are used.This is also the case for the RMSE (not shown).

Comparison with the Validation Data
As mentioned in Section 2.2, 1000 IMB snow depth data (SD-IMB) were not employed for the training of our constructed network.They were used to evaluate the accuracy of the snow depth retrieved from our ensemble-based deep neural network (hereafter referred to as SD-EDNN) using different channels of brightness temperatures.As shown in Figure 5b,d, the derived snow depth using the ensemble averaging is still highly correlated with the validation data.The correlation coefficient is 0.74, which is higher than all the 100 networks.The RMSE is 9.8 cm, which is also smaller than all the networks.This is consistent with the result in Figure 5a,c, and confirms that our ensemble-based deep neural network is robust and capable of deriving snow depth over Arctic sea ice.

Comparison with the Validation Data
As mentioned in Section 2.2, 1000 IMB snow depth data (SD-IMB) were not employed for the training of our constructed network.They were used to evaluate the accuracy of the snow depth retrieved from our ensemble-based deep neural network (hereafter referred to as SD-EDNN) using different channels of brightness temperatures.As shown in Figure 5b,d, the derived snow depth using the ensemble averaging is still highly correlated with the validation data.The correlation coefficient is 0.74, which is higher than all the 100 networks.The RMSE is 9.8 cm, which is also smaller than all the networks.This is consistent with the result in Figure 5a,c, and confirms that our ensemblebased deep neural network is robust and capable of deriving snow depth over Arctic sea ice.
Figure 6a shows a scatter plot between the retrieved and observed snow depth.The retrieval is in good agreement with the observations.The correlation coefficient between SD-EDNN and SD-IMB is 0.73 (black line in Figure 6a).The relationship during the freeze-up period (November to March) is higher than that during the melting period (May to September, blue vs. red line in Figure 6a).Overall, SD-EDNN has a very small bias (0.1 cm), which is a balance of small negative bias during the freezeup period and small positive bias during the melting period (see Table 2).The root mean square error (RMSE) between SD-DNN and SD-IMB is 9.8 cm, which results from relatively smaller (larger) RMSE during the freeze-up (melting) period (see Table 2).Figure 6b shows the histogram of the difference between SD-EDNN and SD-IMB.Apparently, the bias is normally distributed, and about 78% of the bias is within its one standard deviation.The bias deviates slightly from normality, slightly towards the negative side during the freeze-up period (blue bar in Figure 6b) and the positive side during the melting period (red bar in Figure 6b).Figure 6c shows the spatial distribution of the difference between SD-EDNN and SD-IMB.We see a mixture of positive and negative biases, indicating that our retrieval does not have a systematic bias.This is also true for the freeze-up and melting periods (not shown).The red line in Figure 6d is the number of IMB observations used for the calculation of SD-EDNN minus SD-IMB for each bin.It appears that SD-EDNN has the largest bias for very thick snow (exceeding 60 cm), which is in part due to the limited number of IMB measurements.Figure 6a shows a scatter plot between the retrieved and observed snow depth.The retrieval is in good agreement with the observations.The correlation coefficient between SD-EDNN and SD-IMB is 0.73 (black line in Figure 6a).The relationship during the freeze-up period (November to March) is higher than that during the melting period (May to September, blue vs. red line in Figure 6a).Overall, SD-EDNN has a very small bias (0.1 cm), which is a balance of small negative bias during the freeze-up period and small positive bias during the melting period (see Table 2).The root mean square error (RMSE) between SD-DNN and SD-IMB is 9.8 cm, which results from relatively smaller (larger) RMSE during the freeze-up (melting) period (see Table 2).Figure 6b shows the histogram of the difference between SD-EDNN and SD-IMB.Apparently, the bias is normally distributed, and about 78% of the bias is within its one standard deviation.The bias deviates slightly from normality, slightly towards the negative side during the freeze-up period (blue bar in Figure 6b) and the positive side during the melting period (red bar in Figure 6b).Figure 6c shows the spatial distribution of the difference between SD-EDNN and SD-IMB.We see a mixture of positive and negative biases, indicating that our retrieval does not have a systematic bias.This is also true for the freeze-up and melting periods (not shown).The red line in Figure 6d is the number of IMB observations used for the calculation of SD-EDNN minus SD-IMB for each bin.It appears that SD-EDNN has the largest bias for very thick snow (exceeding 60 cm), which is in part due to the limited number of IMB measurements.
Table 2.The bias and root mean square error (RMSE) between SD-EDNN/SD-UB and SD-IMB for all validation data, as well as the data during freeze-up and melting period.Next, we compared the retrieved snow depth with the IMB measurements for individual buoys.We know that sea ice circulation in the Arctic Ocean has two primary components, one is the Beaufort Gyre, and the other is the Transpolar Drift Stream.Here we show a time series of SD-IMB and SD-EDNN for five buoys.Three of them drifted in the Beaufort Gyre and central Arctic (2009F, 2012G and 2013F buoys); 2009F and 2013F were deployed on MYI, whereas 2012G was deployed on FYI (documented at http://imb-crrel-dartmouth.org).Two of them drifted in the Transpolar Drift Stream (2010A and 2012J buoys); one was installed on MYI (2010A) and the other was installed on FYI (2012J).As shown in Figure 7, encouragingly, there is a reasonable degree of agreement between the retrieved and IMB measured snow depth for these buoys.However, some discrepancies are also noticed, i.e., our retrieval underestimates the observed thick snow (~80 cm) in April and May at the location of the 2013F buoy, based by ~10 cm (Figure 7f) and in June at the location of the 2012J buoy, based by ~20 cm (Figure 7h).We also calculated the time averaged snow depth per calendar day across all buoys and corresponding retrievals (not shown).It appears that there is a good match between them, i.e., high correlation (R = 0.93) and small RMSE (2.5 cm).Next, we compared the retrieved snow depth with the IMB measurements for individual buoys.We know that sea ice circulation in the Arctic Ocean has two primary components, one is the Beaufort Gyre, and the other is the Transpolar Drift Stream.Here we show a time series of SD-IMB and SD-EDNN for five buoys.Three of them drifted in the Beaufort Gyre and central Arctic (2009F, 2012G and 2013F buoys); 2009F and 2013F were deployed on MYI, whereas 2012G was deployed on FYI (documented at http://imb-crrel-dartmouth.org).Two of them drifted in the Transpolar Drift Stream (2010A and 2012J buoys); one was installed on MYI (2010A) and the other was installed on FYI (2012J).As shown in Figure 7, encouragingly, there is a reasonable degree of agreement between the retrieved and IMB measured snow depth for these buoys.However, some discrepancies are also noticed, i.e., our retrieval underestimates the observed thick snow (~80 cm) in April and May at the location of the 2013F buoy, based by ~10 cm (Figure 7c) and in June at the location of the 2012J buoy, based by ~20 cm (Figure 7d).We also calculated the time averaged snow depth per calendar day across all buoys and corresponding retrievals (not shown).It appears that there is a good match between them, i.e., high correlation (R = 0.93) and small RMSE (2.5 cm).

Comparison with Other Retrieved Snow Depth Data
In this section, we conduct preliminary intercomparisons of the consistency and discrepancy in snow depth over sea ice in the Arctic basin between our retrieval using the ensemble-based deep neural network and two other available retrievals using the empirical regression.Figures 8-10 show the spatial distribution of the monthly snow depth over the ice-covered Arctic Ocean averaged from June 2002 to June 2018.The SD-NASA is only valid 1) in areas covered by FYI (no value over MYI) and 2) from October to April (no value during the melting season) (Figure 8).By contrast, the SD-UB and SD-EDNN (our retrieval) have better spatiotemporal coverage (Figures 9 and 10).In general, snow depth over Arctic sea ice increases quickly in the Canadian Arctic and the north Greenland Sea from August to November.Then the increase of snow depth extends to the Siberian and Alaskan sectors of the Arctic from November to January.This is more clearly shown in our retrieval (Figure 10).This tends to reduce the regional differences in snow depth across the Arctic basin in winter and early spring.Snow depth is reduced from April to May, primarily confined to the periphery of the Canadian Arctic Archipelago and the northern Greenland Sea, which might be mostly due to sea ice dynamic processes there [63].The snow depth reaches its seasonal minimum in July associated with the melting.This seasonal evolution is broadly consistent with the widely used snow depth climatology (W99).However, some discrepancies are noted.First, W99 shows that snow depth is quite uniform across the Arctic in February, but SD-UB and SD-EDNN still have a large gradient of snow depth from the Canadian Arctic and the north Greenland Sea to the Siberian sector of the Arctic.Second, W99 showed that snow depth in the Canadian Arctic and the north Greenland Sea reaches seasonal maximum in June, and the largest regional difference in snow depth also occurs in June.Our retrieval (SD-EDNN) and the retrieval from the University of Bremen (SD-UB) do not support this and have a reduced regional difference in snow depth in June due to the decrease of snow depth from Spring to June.Such discrepancy might be in part due to that W99 is based on data back to the 1950s when MYI coverage and snow depth was larger than recent decades with rapid Arctic environmental changes, i.e., amplified warming, lengthened melting, and decreased sea ice [64].
Next, we repeated the comparison for the FYI for all three data sets, including SD-EDNN, SD-NASA, and SD-UB.Here we used the ice age data from https://nsidc.org/data/nsidc-0611 to mask out FYI.As shown in Figure 11, the correlation coefficient between the SD-EDNN and the observation is 0.63, which is much higher that of SD-NASA and SD-UB.Note that most of the IMB observations used here were from MYI.There is no IMB data available in the East Siberian, Laptev, Kara, and Barents Seas, which are dominated by FYI.This may raise uncertainty on the retrieved snow depth there.Thus, more observations over FYI are needed for further validation.
Then, we compared with SD-UB with the aforementioned 1000 IMB snow depth data (validation data), given that both data cover the central Arctic for all months.Figure 12a shows the scatter plot between the SD-UB and observed snow depth.The correlation coefficient between the SD-UB and the observation is 0.32 (black line in Figure 12a), which is much smaller than that of our retrieval (0.73, black line in Figure 6a).This is also true for the freeze-up and the melting periods (Figure 12a vs. Figure 6a).Overall, SD-UB has quite a large positive bias (10.5 cm), which is due to large positive biases during the freeze-up and melting period (see Table 2).The root mean square error (RMSE) between SD-UB and SD-IMB is 17.7 cm, which results from large RMSE during the freeze-up and melting periods (see Table 2).Figure 12b shows the histogram of the difference between SD-UB and SD-IMB.It appears that the bias distribution deviates relatively from the normal distribution, with the mean towards the positive side during the freeze-up and melting periods.Figure 12c shows the spatial distribution of the difference between SD-UB and SD-IMB.Again, the biases are mostly positive, indicating a systematic bias.This is also true for the freeze-up and melting periods (not shown).Apparently, our retrieval has a much smaller bias and RMSE than that of SD-UB.Next, we repeated the comparison for the FYI for all three data sets, including SD-EDNN, SD-NASA, and SD-UB.Here we used the ice age data from https://nsidc.org/data/nsidc-0611 to mask out FYI.As shown in Figure 11, the correlation coefficient between the SD-EDNN and the observation is 0.63, which is much higher that of SD-NASA and SD-UB.Note that most of the IMB observations melting periods (see Table 2).Figure 12b shows the histogram of the difference between SD-UB and SD-IMB.It appears that the bias distribution deviates relatively from the normal distribution, with the mean towards the positive side during the freeze-up and melting periods.Figure 12c shows the spatial distribution of the difference between SD-UB and SD-IMB.Again, the biases are mostly positive, indicating a systematic bias.This is also true for the freeze-up and melting periods (not shown).Apparently, our retrieval has a much smaller bias and RMSE than that of SD-UB.

Discussion and Conclusions
In this study, we developed a new algorithm for retrieving snow depth over sea ice in the Arctic basin.We constructed an ensemble-based deep neural network and used brightness temperatures from seven frequency channels of SSMIS as the input data and IMB measured snow depth as the target (training) data.Assessment with the target and independent snow depth measurements suggests that our ensemble-based deep neural network is robust, which is proved by high correlation between the retrieved and observed snow depth.
For the accuracy validation, our retrieved snow depth is validated against the independent snow depth measurements.Our analysis shows that the retrieved snow depth is in good agreement with the observations, in terms of correlation, bias, RMSE, and probability distribution, even during the melting period.Thus, our ensemble-based deep neural network can be used for snow depth retrieval for both FYI to MYI and the melting period.It should be noted that the majority of the IMB data used in this study were measured on MYI, and no IMB data were available in the Siberian and European sectors of the Arctic.Thus, more in-situ measurements over FYI are needed as the target and validation data.Moreover, given changes in penetration depth and snow properties during the melting period and their influence on brightness temperatures, more in-situ measurements in summer are needed as the target and validation data to further confirm the application of our ensembled-based deep neural network for the snow depth retrieval during the melting period.The ongoing Multidisciplinary drifting Observatory for the Study of Arctic Climate (MOSAiC) will conduct a more comprehensive measurement of snow depth over Arctic sea ice.
For central Arctic snow depth data sets comparison, we examine the consistency and discrepancy of snow depth over Arctic sea ice between our data and two available data (SD-NASA and SD-UB) that are based on traditional regressive models.The results suggest that our retrieval outperforms these two data sets.Specifically, our retrieval has much better spatial and temporal coverage than that provided by the NASA Cryosphere Science Research and has much smaller bias and RMSE and better probability distribution than that processed by the sea ice remote sensing group of the University of Bremen.Although our retrieval shows better performance compared to SD-NASA and SD-UB, more investigations are needed to further improve the results.For example, recent research shows the advantage of the snow depth retrieval using lower frequencies of passive microwave measurements [35].Apart from the channels similar to SSM/I and SSMIS, the AMSR-E and AMSR2 include brightness temperatures at 6.9 and 10.6 GHz.We plan to apply our ensemble-based deep neural network to the AMSR-E and AMSR2 to examine to what extent the snow depth retrieval can be improved by considering lower frequency brightness temperatures.More recently, Braakmann-Folgmann and Donlon [53] attempted to retrieve snow depth over sea ice using a multi-source of satellite microwave radiometer measurements with a neural network.They showed that the inclusion of the L-band of SMOS as the additional input could lead to better estimation of snow depth.We also plan to apply our ensemble-based deep neural network to multi-source of passive microwave remote sensing to examine to what extent the snow depth retrieval can be improved.
More recently, two different approaches have been developed to obtain snow depth over Arctic sea ice during the accumulation season.Blanchard-Wrigglesworth et al. [57] reconstructed snow depth using snowfall from atmospheric reanalysis, sea ice motion from satellite, and a Lagrangian ice tracking algorithm.Petty et al. [65] forced the NASA Eulerian snow on sea ice model with snowfall and winds from atmospheric reanalysis and sea ice concentration and motion from during to simulate snow depth the accumulation season.They show the reconstructed and simulated snow depth have a good relationship with the Operation IceBridge measurements.A comparison of our retrieval and these two data sets will be conducted in future research.

Figure 1 .
Figure 1.Trajectories of 38 sea ice mass balance buoys used in this study.

Figure 1 .
Figure 1.Trajectories of 38 sea ice mass balance buoys used in this study.

Figure 3 .
Figure 3. Histogram of the daily averaged ice mass balance (IMB) as a function of (a) month, (b) IMB measured sea ice thickness (bin width = 10 cm) and (c) sea ice concentration (%) derived from the NASA Team algorithm (bin width = 5%).

Figure 3 .
Figure 3. Histogram of the daily averaged ice mass balance (IMB) as a function of (a) month, (b) IMB measured sea ice thickness (bin width = 10 cm) and (c) sea ice concentration (%) derived from the NASA Team algorithm (bin width = 5%).

Figure 3 .
Figure 3. Histogram of the daily averaged ice mass balance (IMB) as a function of (a) month, (b) IMB measured sea ice thickness (bin width = 10 cm) and (c) sea ice concentration (%) derived from the NASA Team algorithm (bin width = 5%).

Figure 4 .
Figure 4.The topological configuration of the three hidden layer backpropagation neural network.

Figure 4 .
Figure 4.The topological configuration of the three hidden layer backpropagation neural network.

Figure 5 .
Figure 5.The correlation coefficient and RMSE of the 100 three hidden layer networks with random weights and biases assigned initially, (a,c) training data and (b,d) validation data.The horizontal black line represents the correlation coefficient between the snow depth obtained from the ensemblebased deep neural network snow depth and IMB measurements.

Figure 5 .
Figure 5.The correlation coefficient and RMSE of the 100 three hidden layer networks with random weights and biases assigned initially, (a,c) training data and (b,d) validation data.The horizontal black line represents the correlation coefficient between the snow depth obtained from the ensemble-based deep neural network snow depth and IMB measurements.

Figure 6 .
Figure 6.(a) Scatter plot between the snow depth of our ensemble-based deep neural network (SD-EDNN) and the snow depth data of ice mass balance buoys (SD-IMB), and (b) histogram of the difference between SD-EDNN and SD-IMB for all the validation data (black color, bin width = 5 cm), the data during freeze-up period (November to March, blue color), and the data during melting period (May to September, red color), the root mean square error (RMSE) between SD-EDNN and SD-IMB for all validation data, as well as the data during freeze-up and melting period, are marked with horizontal bars.(c) Spatial distribution of the difference between SD-EDNN and SD-IMB.(d) Boxplot of the difference between SD-EDNN as a function of SD-IMB with 5 cm bins, the y-axis on the right side shows the number of SD-IMB in each bin.The red line is the number of IMB observations used for the calculation of SD-EDNN minus SD-IMB for each bin.

Figure 6 .
Figure 6.(a) Scatter plot between the snow depth of our ensemble-based deep neural network (SD-EDNN) and the snow depth data of ice mass balance buoys (SD-IMB), and (b) histogram of the difference between SD-EDNN and SD-IMB for all the validation data (black color, bin width = 5 cm), the data during freeze-up period (November to March, blue color), and the data during melting period (May to September, red color), the root mean square error (RMSE) between SD-EDNN and SD-IMB for all validation data, as well as the data during freeze-up and melting period, are marked with horizontal bars.(c) Spatial distribution of the difference between SD-EDNN and SD-IMB.(d) Boxplot of the difference between SD-EDNN as a function of SD-IMB with 5 cm bins, the y-axis on the right side shows the number of SD-IMB in each bin.The red line is the number of IMB observations used for the calculation of SD-EDNN minus SD-IMB for each bin.

22 Figure 7 .
Figure 7. Time series of snow depth from IMB and SD-EDNN for individual buoys (a-e) and their corresponding trajectories.Blue dots are IMB data, red dots are the retrieval, and red dots with black circle are the validation data.

Figure 7 .
Figure 7. Time series of snow depth from IMB and SD-EDNN for individual buoys (a-e) and their corresponding trajectories.Blue dots are IMB data, red dots are the retrieval, and red dots with black circle are the validation data.

Figure 8 .
Figure 8. Spatial distribution of the monthly mean SD-NASA snow depth averaged from June 2002 to June 2018.

Figure 8 .
Figure 8. Spatial distribution of the monthly mean SD-NASA snow depth averaged from June 2002 to June 2018.

Figure 9 .
Figure 9. Spatial distribution of the monthly mean SD-UB snow depth averaged from June 2002 to June 2018.

Figure 9 .
Figure 9. Spatial distribution of the monthly mean SD-UB snow depth averaged from June 2002 to June 2018.

Figure 10 .
Figure 10.Spatial distribution of the monthly mean SD-EDNN snow depth averaged from June 2002 to June 2018.

Figure 10 .
Figure 10.Spatial distribution of the monthly mean SD-EDNN snow depth averaged from June 2002 to June 2018.

22 Figure 11 .
Figure 11.Scatter plot between the overlapped (a) SD-EDNN, (c) SD-UB and (e) SD-NASA and SD-IMB for the first year sea ice (FYI), and the histogram of the difference between the overlapped (b) SD-EDNN, (d) SD-UB and (f) SD-NASA and SD-IMB (bin width = 5 cm) for the FYI.

Figure 11 .
Figure 11.Scatter plot between the overlapped (a) SD-EDNN, (c) SD-UB and (e) SD-NASA and SD-IMB for the first year sea ice (FYI), and the histogram of the difference between the overlapped (b) SD-EDNN, (d) SD-UB and (f) SD-NASA and SD-IMB (bin width = 5 cm) for the FYI.

Figure 11 .
Figure 11.Scatter plot between the overlapped (a) SD-EDNN, (c) SD-UB and (e) SD-NASA and SD-IMB for the first year sea ice (FYI), and the histogram of the difference between the overlapped (b) SD-EDNN, (d) SD-UB and (f) SD-NASA and SD-IMB (bin width = 5 cm) for the FYI.

Figure 12 .Figure 12 .
Figure 12.(a) Scatter plot between SD-UB and SD-IMB, and (b) histogram of the difference between SD-UB and SD-IMB for all the data (black color, bin width = 5 cm), the data during freeze-up period (November to March, blue color), and the data during melting period (May to September, red color), Figure 12.(a) Scatter plot between SD-UB and SD-IMB, and (b) histogram of the difference between SD-UB and SD-IMB for all the data (black color, bin width = 5 cm), the data during freeze-up period (November to March, blue color), and the data during melting period (May to September, red color), the root mean square error (RMSE) between SD-EDNN and SD-IMB as well as the data during freeze-up and melting period are marked with horizontal bars.(c) Spatial distribution of the difference between SD-UB and SD-IMB.

Table 1 .
Information of SD-NASA and SD-UB used for the intercomparison.