Estimation of Soil Freeze Depth in Typical Snowy Regions Using Reanalysis Dataset: A Case Study in Heilongjiang Province, China

: Soil freeze depth variations greatly affect energy exchange, carbon exchange, ecosystem diversity, and the water cycle. Given the importance of these processes, obtaining freeze depth data over large scales is an important focus of research. This paper presents a simple empirical algorithm to estimate the maximum seasonally frozen depth (MSFD) of seasonally frozen ground (SFG) in snowy regions. First, the potential inﬂuences of driving factors on the MSFD variations were quantiﬁed in the baseline period (1981–2010) based on the 26 meteorological stations within and around the SFG region of Heilongjiang province. The three variables that contributed more than 10% to MSFD variations (i.e., air freezing index, annual mean snow depth, and snow cover days) were considered in the analysis. A simple multiple linear regression to estimate soil freeze depth was ﬁtted (1981–2010) and veriﬁed (1975–1980 and 2011–2014) using ground station observations. Compared with the commonly used simpliﬁed Stefan solution, this multiple linear regression produced superior freeze depth estimations, with the mean absolute error and root mean square error of the station average reduced by over 20%. By utilizing this empirical algorithm and the ERA5-Land reanalysis dataset, the multi-year average MSFD (1981–2010) was 132 cm, ranging from 52 cm to 186 cm, and MSFD anomaly exhibited a signiﬁcant decreasing trend, at a rate of − 0.38 cm/decade or a net change of − 28.14 cm from 1950–2021. This study provided a practical approach to model the soil freeze depth of SFG over a large scale in snowy regions and emphasized the importance of considering snow cover variables in analyzing and estimating soil freeze depth.

Soil freeze depth is generally obtained by manual observation through standard frost tubes when the ground surface temperature is below 0 • C [17].Using ground observations, substantial efforts have been made to estimate the spatial distribution and change in the maximum seasonally frozen depth (MSFD) of seasonally frozen ground (SFG), such as in the Three Rivers Source Region [5] and Qinghai-Tibetan Plateau (QTP) of China [6,14], all of China [9,10], and Eurasian high latitudes [1].However, the results may have been partially affected by the number and distribution of stations, particularly at high altitudes and latitudes with sparse site distribution due to harsh natural environments and expensive observation costs [18,19].In addition, analysis based on station observations cannot reflect the spatiotemporal variations of soil freeze depth in data-scarce areas.Under such conditions, the development of soil freeze depth estimation models can be an effective way to obtain regional-scale freeze depth data.Over a regional scale, soil freeze/thaw processes have often been simulated by incorporating numerical solution schemes (e.g., finite element and finite volume) into land surface or hydrological models [4,[20][21][22], but the high computational cost and large data requirement for model calibration have limited their application in large scale surveys [19].By utilizing climatic variables such as air temperature, precipitation, snow depth, or freeze-thaw indexes, some empirical or semi-empirical solutions have performed adequately in soil freeze depth simulations [1,4,6,9,[23][24][25].Among them, the Stefan solution has been most widely used to estimate the soil freeze depth of SFG and active layer thickness above permafrost by utilizing air temperature and soil parameters (e.g., soil thermal conductivity, soil bulk density, and soil water content) as main inputs [9,26,27].
Due to the significant correlation between the soil freeze depth and air temperature, the Stefan solution has been simplified and widely used in MSFD estimation at both point [28,29] and regional scales [8,9,[30][31][32].However, considering air temperature as the only input variable and generalizing other variables as constants [31,33] may lead to large uncertainties in MSFD estimations.For example, in areas with deep snow cover, the effect of snow cover on soil temperature [34][35][36], the soil freeze depth of SFG [37], and the active layer thickness above permafrost [38][39][40][41] has been significant or even exceeded that of air temperature.However, the potential impact of snow cover parameters on soil freeze depth was not fully considered in the simplified Stefan solution, which may limit its wide application in areas with substantial snow cover.Other variables, such as soil properties and vegetation characteristics, may also influence the variation in soil freeze depth [1,42].For example, soil properties mainly affect the thermal and hydraulic conductivity of soil and thus influence the freeze/thaw process [43].Vegetation can affect the energy balance of the ground via changes in the surface albedo (e.g., bare ground versus vegetation cover), vegetation transpiration, and shading effects [44,45], which influence the freezing conditions of the soil [46].
Northeast China is a stable snow cover distribution area, where average snow depth, snow density, and snow water equivalent are larger than in other regions of China [47][48][49].Heilongjiang province is located in the northern part of Northeast China, where the annual average air temperature has risen at a rate of 0.35 • C/decade from 1960 to 2015 [50], leading to clear changes in the frozen ground [51,52].With Heilongjiang province as the study area, we aimed to develop a simple empirical algorithm for estimating the MSFD of SFG in snowy regions.To this end, we first quantified the potential influences of driving factors (including air temperature, snow cover, vegetation, and soil properties) on the MSFD variation based on the data from the 26 meteorological stations within and around the SFG region of Heilongjiang province.By selecting and fitting the principal driving elements, the empirical algorithm for MSFD estimation was developed, and its performance was evaluated by comparing it with the observed freeze depths and the estimations of the simplified Stefan solution.This empirical algorithm was also combined with the ERA5-Land reanalysis dataset to investigate the spatiotemporal variations in the soil freeze depth of SFG in Heilongjiang province.

Study Area
With a geographical boundary of approximately 121 • 11 -135 • 05 E and 43 • 26 -53 • 33 N, Heilongjiang province extends 930 km from east to west and 1120 km from north to south (Figure 1).The elevation ranges from 28 to 1637 m, with only 3.24% of the area exceeding 800 m.
As a typical cold temperate continental monsoon climate region, Heilongjiang province experiences hot and rainy summers and cold and dry winters [53].According to the distribution map of frozen ground across China [54], discontinuous permafrost, island permafrost, and seasonally frozen ground are distributed predominantly from north to south in Heilongjiang province (Figure 1).Among them, seasonally frozen ground comprises up to 68% of the total area and is the focus of this paper.

Data and Methods
Historical climate data of daily observations within Heilongjiang province, including soil freeze depth, air temperature, and snow depth, were obtained from the China Meteorological Administration.Daily soil freeze depth was manually measured at 08:00 Beijing time using a standard frost tube by trained professional technicians when the ground surface temperature was below 0 °C.The mean daily air temperature was calculated from the arithmetic means of four daily observations (02:00, 08:00, 14:00, and 20:00 Beijing time).Snow depth was measured at 08:00 Beijing time using a centimeter-scale wooden ruler when snow covered the ground and was recorded as an integer [17].The annual MSFD was determined as the maximum daily frozen ground depth during a freezing year from September to August of the following year [6,10].Twenty-six meteorological stations, with records available for more than three-quarters of the study period from 1975 to 2014 within and around the SFG region of Heilongjiang province, were selected to quantify the effect of main driving factors on MSFD variations and construct the MSFD estimation algorithm.Among the 26 stations, 4 were located in island permafrost regions but were not themselves above permafrost (Figure 1).The annual air freezing index (FI), which is the sum of the absolute values of all daily air temperatures below 0 °C, was used to evaluate the influence of air temperature on soil freeze depth As a typical cold temperate continental monsoon climate region, Heilongjiang province experiences hot and rainy summers and cold and dry winters [53].According to the distribution map of frozen ground across China [54], discontinuous permafrost, island permafrost, and seasonally frozen ground are distributed predominantly from north to south in Heilongjiang province (Figure 1).Among them, seasonally frozen ground comprises up to 68% of the total area and is the focus of this paper.

Data and Methods
Historical climate data of daily observations within Heilongjiang province, including soil freeze depth, air temperature, and snow depth, were obtained from the China Meteorological Administration.Daily soil freeze depth was manually measured at 08:00 Beijing time using a standard frost tube by trained professional technicians when the ground surface temperature was below 0 • C. The mean daily air temperature was calculated from the arithmetic means of four daily observations (02:00, 08:00, 14:00, and 20:00 Beijing time).Snow depth was measured at 08:00 Beijing time using a centimeter-scale wooden ruler when snow covered the ground and was recorded as an integer [17].The annual MSFD was determined as the maximum daily frozen ground depth during a freezing year from September to August of the following year [6,10].Twenty-six meteorological stations, with records available for more than three-quarters of the study period from 1975 to 2014 within and around the SFG region of Heilongjiang province, were selected to quantify the effect of main driving factors on MSFD variations and construct the MSFD estimation algorithm.Among the 26 stations, 4 were located in island permafrost regions but were not themselves above permafrost (Figure 1).The annual air freezing index (FI), which is the sum of the absolute values of all daily air temperatures below 0 • C, was used to evaluate the influence of air temperature on soil freeze depth [33,43].Annual snow cover days (SCD) and average snow depth (ASD) were obtained from daily snow depth observations from each station.SCD is the number of days in a year with snow on the ground at a depth of ≥1 cm [55], and ASD is the ratio of the total of all daily snow depths to SCD [56].
Due to the shortage of observational data, the time series data of soil moisture content (SM) and leaf area index (LAI) of the 26 stations were extracted from the ERA5-Land hourly reanalysis dataset (https://cds.climate.copernicus.eu(accessed on 10 May 2022)), depending on the longitude and latitude of the stations using ArcGIS 10.6.The ERA5-Land reanalysis dataset provides a consistent view of the evolution of land variables over several decades at high temporal (1 h) and spatial (0.1 • × 0.1 • ) resolutions from 1950 to the present, which grants it utility for many land surface applications, such as water resource, land, and environmental management [57,58].To estimate the MSFD at regional scales, gridded air temperature (at 2 m above the land surface), snow depth, SM, and LAI data were also extracted from the ERA5-Land reanalysis dataset.Specifically, SM refers to the mean values of soil moisture at depths of 0-7, 7-28, and 28-100 cm.
To evaluate hourly air temperature and snow depth data from ERA5-Land, the values of FI, ASD, and SCD calculated from the observations of the 26 stations were compared with those calculated from the ERA5-Land dataset using two evaluation indexes: root mean square error (RMSE) and mean absolute error (MAE).For the evaluation of hourly SM and LAI data from ERA5-Land, we compared the possible impact of using different data products to quantify the effects of the main driving factors on MSFD variations and to construct the MSFD estimation algorithm.Among these data products, gridded normalized difference vegetation index (NDVI) data from 1982 to 2010 with 5 km spatial resolution were collected from the National Earth System Science Data Center of the National Science & Technology Infrastructure of China (http://www.geodata.cn(accessed on 15 May 2022)); the data are produced by the National Oceanic and Atmospheric Administration NDVI Climate Data Record.The SM data were provided by the Global Land Data Assimilation System (GLDAS V2.0), which produces operational spatiotemporally continuous global soil moisture data sets (https://disc.gsfc.nasa.gov(accessed on 15 May 2022)), and has been widely used in previous research [59][60][61].The mode of the GLDAS used in this study was the Noah Land Surface Model with a spatial resolution of 0.25 degrees and the monthly time series data from 1948 to 2015 [62,63].Soil moisture was calculated as the mean soil moisture values at depths of 0-10, 10-40, and 40-100 cm.All these data (i.e., FI, ASD, SCD, LAI, NDVI, and SM) of the 26 stations from different data products were extracted depending on the longitude and latitude of the stations using ArcGIS 10.6.
During the baseline period (i.e., 1981-2010), the relative contributions of the environment variables (i.e., FI, SCD, ASD, SM, and LAI) to MSFD variations were quantified with hierarchical partitioning analysis implemented using the R package "rdacca.hp" in R programming language [64].In practical applications, this package has demonstrated advantages in managing multicollinearity [65][66][67] and allows us to directly compare the relative contributions of these explanatory variables by standardizing each variable.This R package "rdacca.hp" is available at the open-access repository Zenodo https://zenodo.org/record/5796018(accessed on 10 March 2022).To reduce the complexity of the model, only variables contributing more than 10% to MSFD variations were retained to fit the MSFD estimation algorithm.A simple multiple linear regression was performed with these variables to explore the possible empirical relationships between the MSFD and these environmental variables in Heilongjiang province.MSFD measurement data during the baseline period (i.e., 1981-2010) were used to construct the MSFD estimation algorithm, and 10 years of observational data (i.e., 1975-1980 and 2011-2014) were used to verify the accuracy of the algorithm.The MAE and RMSE were used to compare the observed and estimated MSFD values.We also compared the results from our empirical algorithm with those obtained from commonly used estimation methods (i.e., the simplified Stefan solution; see Equation (S2) in Supporting Materials).Based on the ERA5-Land reanalysis dataset, this empirical equation was applied to investigate the detailed spatiotemporal variations in the soil freeze depth of SFG in Heilongjiang province.By utilizing the modified Mann-Kendall test and Sen's slope estimator method [68,69], the trends of MSFD change were evaluated at both regional and grid scales.Here, changes in MSFD at regional scales were based on a time series of anomalies (with respect to the mean of the 30-year baseline period, i.e., 1981-2010) from 1950 to 2021.

Driving Factors of MSFD Variation
Based on data from 26 stations during the 30-year baseline period (1981−2010), the relative contributions of five environment variables to MSFD variations estimated from different data products (i.e., vegetation indices and soil moisture) were highly consistent (Figure 2), with the contribution of ASD > FI > SCD.The other two variables contributed less than 2% each.This suggests that different data products may have only minor effects on the quantification of the main driving factors of MSFD variations.
different data products (i.e., vegetation indices and soil moisture) were highly consistent (Figure 2), with the contribution of ASD > FI > SCD.The other two variables contributed less than 2% each.This suggests that different data products may have only minor effects on the quantification of the main driving factors of MSFD variations.
The results from ERA5-Land data (i.e., SM and LAI data were obtained from ERA5-Land) showed that (Figure 2) approximately 42.04% of the total variability in MSFD was explained by ASD during the baseline period (i.e., 1981-2010), while ~27.20% was explained by FI.Further study indicated that the station average contributions of ASD to MSFD were over 30%, with nearly 85% of sites contributing more than 10%, whereas the station average contributions of FI to MSFD were ~12.71%, with 13 stations contributing more than 10%.These findings indicated that the influence of air temperature (characterized by FI) on soil freeze depth was clearly lower than that of average snow depth in the study area.We also found that SCD explained approximately 11.07% of the total variability in MSFD, while the effect of SM and LAI on MSFD was negligible, with contributions of less than 2% each.
Overall, these findings indicated that ASD had the greatest effect on the MSFD dynamics, followed by FI and SCD, while annual SM and LAI changes had negligible effects.2).The negative contribution may suggest that the effect of this variable on MSFD variations can be considered truly unimportant [64].
The results from ERA5-Land data (i.e., SM and LAI data were obtained from ERA5-Land) showed that (Figure 2) approximately 42.04% of the total variability in MSFD was explained by ASD during the baseline period (i.e., 1981-2010), while ~27.20% was explained by FI.Further study indicated that the station average contributions of ASD to MSFD were over 30%, with nearly 85% of sites contributing more than 10%, whereas the station average contributions of FI to MSFD were ~12.71%, with 13 stations contributing more than 10%.These findings indicated that the influence of air temperature (characterized by FI) on soil freeze depth was clearly lower than that of average snow depth in the study area.We also found that SCD explained approximately 11.07% of the total variability in MSFD, while the effect of SM and LAI on MSFD was negligible, with contributions of less than 2% each.
Overall, these findings indicated that ASD had the greatest effect on the MSFD dynamics, followed by FI and SCD, while annual SM and LAI changes had negligible effects.

Development of the MSFD Estimation Algorithm
Considering that three variables (i.e., FI, ASD, and SCD) each contributed more than 10% to MSFD variation, and the sum of the relative contribution of these three variables was 81.4% (Figure 2), they were included in a multiple linear regression to construct the MSFD estimation algorithm.Their fitting equation was expressed as: where MSFD is the annual maximum freeze depth (cm), FI is the air freezing index ( • C), ASD is the annual mean snow depth (cm), and SCD is the snow cover days (days).The multiple regression equation was significant despite a relatively low R 2 (p < 0.01, R 2 = 0.37) and could thus be used to estimate the MSFD for the study region.
Compared with the simplified Stefan formula, the RMSE and MAE of the station average during the 10-year validation period (i.e., 1975-1980 and 2011-2014) obtained from the multiple regression equation were reduced by approximately 23.1% and 20.1%, respectively (Figure 3).Additionally, the obtained station average RMSEs and MAEs from the multiple regression equation were ~10% lower than those obtained using the Stefan formula for 1975-1980 and more than 30% lower for 2011-2014.The above comparative results suggest that this simple regression equation has wide applicability for estimating soil freeze depth in the study area and also emphasize the need to consider snow cover variables when estimating soil freeze depth in snowy areas.

Figure 2.
Relative contributions of five variables to MSFD variation for the entire region based on data from 26 stations during the 30-year baseline period .The red bars represent the data of SM and LAI were obtained from ERA5-Land reanalysis dataset.The blue bars represent the data of SM and NDVI were obtained from other data products (see Section 2.2).The negative contribution may suggest that the effect of this variable on MSFD variations can be considered truly unimportant [64].

Development of the MSFD Estimation Algorithm
Considering that three variables (i.e., FI, ASD, and SCD) each contributed more than 10% to MSFD variation, and the sum of the relative contribution of these three variables was 81.4% (Figure 2), they were included in a multiple linear regression to construct the MSFD estimation algorithm.Their fitting equation was expressed as: where MSFD is the annual maximum freeze depth (cm), FI is the air freezing index (°C), ASD is the annual mean snow depth (cm), and SCD is the snow cover days (days).The multiple regression equation was significant despite a relatively low R 2 (p < 0.01, R 2 = 0.37) and could thus be used to estimate the MSFD for the study region.
Compared with the simplified Stefan formula, the RMSE and MAE of the station average during the 10-year validation period (i.e., 1975-1980 and 2011-2014) obtained from the multiple regression equation were reduced by approximately 23.1% and 20.1%, respectively (Figure 3).Additionally, the obtained station average RMSEs and MAEs from the multiple regression equation were ~10% lower than those obtained using the Stefan formula for 1975-1980 and more than 30% lower for 2011-2014.The above comparative results suggest that this simple regression equation has wide applicability for estimating soil freeze depth in the study area and also emphasize the need to consider snow cover variables when estimating soil freeze depth in snowy areas.

Performance of ERA5-Land
Figure 4 shows the statistical criteria between the calculated using the ERA5-Land reanalysis dataset and observed FI, ASD, and SCD at 26 meteorological stations in Heilongjiang province from 1975−2014.Due to relatively large differences in the FI and SCD

Performance of ERA5-Land
Figure 4 shows the statistical criteria between the calculated using the ERA5-Land reanalysis dataset and observed FI, ASD, and SCD at 26 meteorological stations in Heilongjiang province from 1975−2014.Due to relatively large differences in the FI and SCD values for each station, the MAE and RMSE varied greatly between stations.The ratios of MAE and RMSE to annual mean FI, ASD, and SCD (RMSE/Mean and MAE/Mean, respectively) were selected to analyze the ERA5-Land performance [48].
having an MAE/Mean < 0.6.For ASD, the RMSE/Mean and MAE/Mean of the station average were 1.01 and 0.80, respectively, with 14 stations (53.8 %) having an RMSE/Mean < 1.0 and 20 (76.9 %) having an MAE/Mean < 1.0.Compared with the statistical criteria of the calculated and observed snow depth in China from 1951 to 2009 [48], the RMSE/Mean and MAE/Mean of the corresponding sites in this study were relatively low, which suggests that the snow depth estimation result in Heilongjiang province using the ERA5-Land reanalysis dataset was reasonable and reliable.

Spatial Distributions of Soil Freeze Depth
Using air temperature and snow depth data obtained from the ERA5-Land hourly reanalysis dataset, we calculated the values of FI, ASD, and SCD from 1950−2021 in Heilongjiang province and then applied Equation (S1) to obtain the spatial distribution status of the soil freeze depth of SFG.
Based on the 30-year baseline period (1981−2010), the spatial distributions of the MSFD of SFG varied greatly (Figure 5).The average MSFD for the entire region was 132 cm, ranging from 52 to 186 cm.The areas at depths of 52-120 cm, 120-130 cm, 130-140 cm, 140-150 cm, and 150-186 cm accounted for approximately 20.66%, 28.68%, 14.19%, 19.86%, and 16.62% of the total area of SFG, respectively.As shown in Figure 5, the higher MSFD was mainly concentrated in the north (e.g., > 140 cm), whereas the lower For FI, the RMSE/Mean and MAE/Mean of the station average were 0.16 and 0.15, respectively, with 18 stations (69.2 %) having an RMSE/Mean < 0.2 and 19 (73.1 %) having an MAE/Mean < 0.2.These relatively low values suggest relatively high estimation accuracy for air temperature using the ERA5-Land reanalysis dataset in Heilongjiang province.For SCD, the RMSE/Mean and MAE/Mean of the station average were 0.49 and 0.42, respectively, with 19 stations (73.1 %) having an RMSE/Mean < 0.6 and 22 (84.6 %) having an MAE/Mean < 0.6.For ASD, the RMSE/Mean and MAE/Mean of the station average were 1.01 and 0.80, respectively, with 14 stations (53.8 %) having an RMSE/Mean < 1.0 and 20 (76.9 %) having an MAE/Mean < 1.0.Compared with the statistical criteria of the calculated and observed snow depth in China from 1951 to 2009 [48], the RMSE/Mean and MAE/Mean of the corresponding sites in this study were relatively low, which suggests that the snow depth estimation result in Heilongjiang province using the ERA5-Land reanalysis dataset was reasonable and reliable.

Spatial Distributions of Soil Freeze Depth
Using air temperature and snow depth data obtained from the ERA5-Land hourly reanalysis dataset, we calculated the values of FI, ASD, and SCD from 1950−2021 in Heilongjiang province and then applied Equation (S1) to obtain the spatial distribution status of the soil freeze depth of SFG.
Based on the 30-year baseline period (1981−2010), the spatial distributions of the MSFD of SFG varied greatly (Figure 5).The average MSFD for the entire region was 132 cm, ranging from 52 to 186 cm.The areas at depths of 52-120 cm, 120-130 cm, 130-140 cm, 140-150 cm, and 150-186 cm accounted for approximately 20.66%, 28.68%, 14.19%, 19.86%, and 16.62% of the total area of SFG, respectively.As shown in Figure 5, the higher MSFD was mainly concentrated in the north (e.g., > 140 cm), whereas the lower MSFD was mainly concentrated in the south (e.g., < 120 cm).Overall, the MSFD displayed an increasing trend from southeast to northwest.MSFD was mainly concentrated in the south (e.g., < 120 cm).Overall, the MSFD displayed an increasing trend from southeast to northwest.

Changes in Soil Freeze Depth
Over the past seven decades, the average MSFD for the entire region reached its maximum (153 cm) in the 1950s, after which it has tended to gradually decrease.From the 1950s to the 2010s, the MSFD greatly decreased by −25 cm, and the area of MSFD that thinned more than −20 cm represented ~72% of the total area of SFG (Figure 6).In addition, increased MSFD was detected, and the area ratio of MSFD thickening increased to 79.4% from the 1980s to the 1990s.Similar findings also applied to MSFD variation for other periods.For example, the area ratio of soil freeze depth thickening exceeded 25% for the periods of the 1970s-1980s, 1990s-2000s, and 2000s-2010s but was only ~12% for the 1960s-1970s (Figure 6).
Figure 7 shows that the MSFD anomaly exhibited a significantly decreasing trend at a rate of −0.39 cm/decade or a net change of −28.14 cm from 1950-2021 (i.e., 72 years).At grid scales, approximately 93.48% of the area displayed a significant decreasing trend in MSFD at the 95% confidence level (Figure 8).The geographic distributions of the decreasing rates in MSFD have varied greatly over the past 72 years, with relatively high decreasing rates (e.g., <−4.5 cm/decade) primarily occurring in the center of the region and relatively low decreasing rates (e.g., −3.0-0 cm/decade) sporadically distributed around the study area.

Changes in Soil Freeze Depth
Over the past seven decades, the average MSFD for the entire region reached its maximum (153 cm) in the 1950s, after which it has tended to gradually decrease.From the 1950s to the 2010s, the MSFD greatly decreased by −25 cm, and the area of MSFD that thinned more than −20 cm represented ~72% of the total area of SFG (Figure 6).In addition, increased MSFD was detected, and the area ratio of MSFD thickening increased to 79.4% from the 1980s to the 1990s.Similar findings also applied to MSFD variation for other periods.For example, the area ratio of soil freeze depth thickening exceeded 25% for the periods of the 1970s-1980s, 1990s-2000s, and 2000s-2010s but was only ~12% for the 1960s-1970s (Figure 6).
Figure 7 shows that the MSFD anomaly exhibited a significantly decreasing trend at a rate of −0.39 cm/decade or a net change of −28.14 cm from 1950-2021 (i.e., 72 years).At grid scales, approximately 93.48% of the area displayed a significant decreasing trend in MSFD at the 95% confidence level (Figure 8).The geographic distributions of the decreasing rates in MSFD have varied greatly over the past 72 years, with relatively high decreasing rates (e.g., <−4.5 cm/decade) primarily occurring in the center of the region and relatively low decreasing rates (e.g., −3.0-0 cm/decade) sporadically distributed around the study area.

Comparison with the Stefan Solution
The simplified Stefan solution requires relatively few inputs (see Equation (S2) in Supporting Materials) and has high simulation accuracy, and thus has been broadly applied for soil freeze depth estimations at regional, national, and hemispheric scales [8,9,[28][29][30][31][32].However, the simplified Stefan solution does not fully consider the potential impact of changing snow cover conditions on soil freeze depth, which may limit its wide application in areas with substantial snow cover.We therefore emphasize the importance of considering snow cover variables in analyzing and estimating the soil freeze depth of SFG in such areas.
Snow only covers the ground during the cold season, but its changes can significantly affect the soil's thermal state due to its high albedo, low thermal conductivity, and latent heat of phase changes [34,35,70,71], which then influence soil freeze/thaw processes [11,36,71].In pan-Arctic regions, snow depth and snow duration can impact permafrost thermal regimes [72], and increasing snow depth greatly promotes the increase in active layer thickness [41].Conversely, reductions in snow depth decrease soil temperature and active layer thickness [39,40].Snow cover can cause permafrost degradation in the low Arctic due to a strong insulation effect in winter but can protect permafrost in the high Arctic when the SCD exceeds 330 days [38].Across the circumpolar north, the influence of SCD on soil temperature was larger than that of snow depth [34].According to the summary of Zhang (2001) [71], a relatively thin snow cover may cool the soil surface due to high albedo snow causing the soil to absorb less solar radiation, but this cooling effect is very brief if the snow lasts for a short time.Alternatively, relatively thick snow cover can warm the ground by insulating it from coldness, whereas snow cover that lasts until next spring or even summer can cool the ground due to the albedo and latent heat of fusion.
By quantifying the potential influences of driving factors (including air temperature,

Comparison with the Stefan Solution
The simplified Stefan solution requires relatively few inputs (see Equation (S2) in Supporting Materials) and has high simulation accuracy, and thus has been broadly applied for soil freeze depth estimations at regional, national, and hemispheric scales [8,9,[28][29][30][31][32].However, the simplified Stefan solution does not fully consider the potential impact of changing snow cover conditions on soil freeze depth, which may limit its wide application in areas with substantial snow cover.We therefore emphasize the importance of considering snow cover variables in analyzing and estimating the soil freeze depth of SFG in such areas.
Snow only covers the ground during the cold season, but its changes can significantly affect the soil's thermal state due to its high albedo, low thermal conductivity, and latent heat of phase changes [34,35,70,71], which then influence soil freeze/thaw processes [11,36,71].In pan-Arctic regions, snow depth and snow duration can impact permafrost thermal regimes [72], and increasing snow depth greatly promotes the increase in active layer thickness [41].Conversely, reductions in snow depth decrease soil temperature and active layer thickness [39,40].Snow cover can cause permafrost degradation in the low Arctic due to a strong insulation effect in winter but can protect permafrost in the high Arctic when the SCD exceeds 330 days [38].Across the circumpolar north, the influence of SCD on soil temperature was larger than that of snow depth [34].According to the summary of Zhang (2001) [71], a relatively thin snow cover may cool the soil surface due to high albedo snow causing the soil to absorb less solar radiation, but this cooling effect is very brief if the snow lasts for a short time.Alternatively, relatively thick snow cover can warm the ground by insulating it from coldness, whereas snow cover that lasts until next spring or even summer can cool the ground due to the albedo and latent heat of fusion.
By quantifying the potential influences of driving factors (including air temperature, snow cover, vegetation, and soil properties) on the MSFD variation based on data from the 26 stations, the effect of snow cover on the MSFD was significantly greater than that of air temperature in Heilongjiang, with the relative contributions of ASD and SCD (~56.11%)more than double that of FI (27.20%).This may have been due to relatively thick snow cover combined with relatively long snow cover days.For example, the multi-year mean ASD (1981-2010) was ~7.37 cm (from 2.61 cm to 11.38 cm) based on the 26 stations, which was smaller than that of Russia (>10 cm) but nearly twice the average snow depth of China [47,56].The multi-year mean SCD was ~103.23 days (from 60.23 days to 134.57days), and ~73% of stations had a mean of more than 90 days.The relatively strong impact of snow cover on MSFD suggested substantial uncertainty in the MSFD estimations in snowy regions using the simplified Stefan solution, which only considers air temperature as the main input variable.By fitting the three selected variables (i.e., FI, ASD, and SCD), this study provided an empirical algorithm to estimate soil freeze depth.Compared with the commonly used estimation methods (i.e., the simplified Stefan solution), this empirical algorithm achieved superior performance by considering snow cover variables, with the RMSE and MAE of the station average reduced by over 20%.It was thus more suitable to simulate the soil freeze depth in snowy regions.

Limitations of This Study
In this study, the fitting of the empirical algorithm to estimate soil freeze depth was based on data from 26 stations within and around the SFG region of Heilongjiang province.The uniformity and representativeness of the site distribution undoubtedly affected the accuracy of the fit of the algorithm.More station observations and field survey data should thus be added in the future, and the applicability of this estimation method should be evaluated in other snowy regions [73][74][75][76][77].In addition, two common snow cover variables, ASD and SCD, were selected and fitted with the air temperature variable (characterized by FI) to build the algorithm and estimate soil freeze depth.In the real world, the effects of snow cover on soil freeze/thaw processes are complex and largely depend on snow cover thickness, timing, duration, density, accumulation characteristics, and melting processes, as well as local environmental and meteorological conditions [71].Future research should therefore focus on the effects of multiple snow cover parameters on soil freeze depth with greater consideration of local climatic and environmental contexts.
The ERA5-Land reanalysis dataset provides a good data source to study the characteristics of soil freeze depth changes at different time scales (e.g., monthly, annual, and decadal) with high temporal (1 h) and spatial (0.1 • × 0.1 • ) resolutions from 1950 to present [57,58].Previous studies have suggested that the ERA5-Land reanalysis dataset (for daily air temperature) performed better in East China (including Heilongjiang province) than in West China and the QTP [78].Following Li et al. (2022), the mean coefficient of determination (R 2 ) of all 26 stations was 0.88, ranging from 0.81 to 0.95.Combined with relatively low RMSE/Mean and MAE/Mean values for the calculated against observed FI (Figure 4a,b), our results further confirm the high performance of air temperature data obtained from ERA5-Land in Heilongjiang province.Compared with the statistical criteria calculated against observed snow depths in China from 1951 to 2009 [48], the RMSE/Mean and MAE/Mean of the corresponding sites in this study were relatively low, which suggests a superior snow depth simulation result in Heilongjiang province using the ERA5-Land reanalysis dataset.Due to the shortage of observational data, the evaluation of LAI and SM data extracted from the ERA5-Land hourly reanalysis dataset was performed by comparing the possible influence of different data products on quantifying the main driving factors of MSFD variations.However, to obtain better estimation and analysis results at spatial scales in the future, the utilization of higher resolution downscaled and remote sensing datasets that combine ground-based observations and field surveys should be considered.

Conclusions
Statistics on the relative contributions of potential driving factors to MSFD variations showed that ASD had the greatest effect (42.04%), followed by FI (27.20%) and SCD (11.07%), while SM and LAI had negligible effects (<2%).This suggests that the influence of snow cover on MSFD variations was clearly larger than that of air temperature, possibly caused by relatively thick snow cover combined with relatively long snow cover days in the study area.By considering the three variables (i.e., FI, ASD, and SCD) that contributed more than 10% to MSFD variation, a simple empirical algorithm to estimate soil freeze depth was fitted (1981-2010) and verified (1975-1980 and 2011-2014) using ground station observations.This simple empirical algorithm (considering the snow cover variable) performed better in snowy regions compared to the commonly used Stefan solution, with the MAE and RMSE of the station average reduced by over 20%.Using the ERA5-Land reanalysis dataset (1950−2021) and the simple empirical algorithm, detailed spatiotemporal variations of soil freeze depth were also assessed in the SFG region of Heilongjiang province.
Soil freeze depth may also be affected by other unconsidered regional variables, such as soil organic materials, land use change, soil type, air pollution, and ecological protection activities.Nevertheless, this paper provides a practical empirical algorithm with particular utility in data-scarce regions with relatively substantial snow cover, as it requires fewer inputs and provides useful information from frozen ground depth simulation over a regional scale.The results of this study also underscored the importance of including snow cover variables when analyzing and estimating historical and future soil freeze depth variations in snowy regions.

Figure 1 .
Figure 1.Location of Heilongjiang province with the spatial distributions of frozen ground type and 26 meteorological stations.The distribution map of frozen ground was provided by the National Cryosphere Desert Data Center (http://www.ncdc.ac.cn (accessed on 15 May 2022)).The background reflects the altitude.DEM: digital elevation model.

Figure 1 .
Figure 1.Location of Heilongjiang province with the spatial distributions of frozen ground type and 26 meteorological stations.The distribution map of frozen ground was provided by the National Cryosphere Desert Data Center (http://www.ncdc.ac.cn (accessed on 15 May 2022)).The background reflects the altitude.DEM: digital elevation model.

Figure 2 .
Figure 2. Relative contributions of five variables to MSFD variation for the entire region based on data from 26 stations during the 30-year baseline period (1981-2010).The red bars represent the data of SM and LAI were obtained from ERA5-Land reanalysis dataset.The blue bars represent the data of SM and NDVI were obtained from other data products (see Section 2.2).The negative contribution may suggest that the effect of this variable on MSFD variations can be considered truly unimportant[64].

Figure 3 .
Figure 3.Comparison of the observed and simulated MSFD values using the (a) multiple linear regression equation (see Equation (S1)) and (b) simplified Stefan solution (see Equation (S2) in Supporting Materials) for all 26 stations during the 10-year validation period (i.e., 1975-1980 and 2011-2014).The black solid line is the 1:1 line.

Figure 3 .
Figure 3.Comparison of the observed and simulated MSFD values using the (a) multiple linear regression equation (see Equation (S1)) and (b) simplified Stefan solution (see Equation (S2) in Supporting Materials) for all 26 stations during the 10-year validation period (i.e., 1975-1980 and 2011-2014).The black solid line is the 1:1 line.

Figure 4 .
Figure 4. Statistical criteria of the calculated FI (a,b), SCD (c,d), and ASD (e,f) using the ERA5-Land reanalysis dataset against observed data from 26 meteorological stations in Heilongjiang province.RMSE/Mean and MAE/Mean refer to the ratios of MAE and RMSE to annual mean FI, ASD, and SCD, respectively.

Figure 4 .
Figure 4. Statistical criteria of the calculated FI (a,b), SCD (c,d), and ASD (e,f) using the ERA5-Land reanalysis dataset against observed data from 26 meteorological stations in Heilongjiang province.RMSE/Mean and MAE/Mean refer to the ratios of MAE and RMSE to annual mean FI, ASD, and SCD, respectively.

Figure 5 .
Figure 5. Geographic distribution of the MSFD of SFG in Heilongjiang province during the baseline period (1981-2010).

Figure 5 .
Figure 5. Geographic distribution of the MSFD of SFG in Heilongjiang province during the baseline period (1981-2010).

Figure 6 .
Figure 6.Changes in the MSFD of SFG over different periods in Heilongjiang province.In the figure, 1950s-1960s refers to the differences in the average MSFD between the 1950s and the 1960s, and the meanings of other labels follow in the same way.

Figure 7 .
Figure 7. Changes in the MSFD of SFG in Heilongjiang province based on a time series of anomalies (with respect to the mean of the 30-year baseline period, i.e., 1981-2010) from 1950 to 2021.

Figure 6 .
Figure 6.Changes in the MSFD of SFG over different periods in Heilongjiang province.In the figure, 1950s-1960s refers to the differences in the average MSFD between the 1950s and the 1960s, and the meanings of other labels follow in the same way.

Figure 6 .
Figure 6.Changes in the MSFD of SFG over different periods in Heilongjiang province.In the figure, 1950s-1960s refers to the differences in the average MSFD between the 1950s and the 1960s, and the meanings of other labels follow in the same way.

Figure 7 .
Figure 7. Changes in the MSFD of SFG in Heilongjiang province based on a time series of anomalies (with respect to the mean of the 30-year baseline period, i.e., 1981-2010) from 1950 to 2021.

Figure 7 . 15 Figure 8 .
Figure 7. Changes in the MSFD of SFG in Heilongjiang province based on a time series of anomalies (with respect to the mean of the 30-year baseline period, i.e., 1981-2010) from 1950 to 2021.

Figure 8 .
Figure 8. Geographic distribution of the change rates in the MSFD of SFG in Heilongjiang province from 1950 to 2021.The oblique line indicates where the trend significantly changes at the 95% confidence level.