Spatial-Temporal Distribution of the Freeze–Thaw Cycle of the Largest Lake (Qinghai Lake) in China Based on Machine Learning and MODIS from 2000 to 2020

: The lake ice phenology variations are vital for the land–surface–water cycle. Qinghai Lake is experiencing ampliﬁed warming under climate change. Based on the MODIS imagery, the spatiotemporal dynamics of the ice phenology of Qinghai Lake were analyzed using machine learning during the 2000/2001 to 2019/2020 ice season, and cloud gap-ﬁlling procedures were applied to reconstruct the result. The results showed that the overall accuracy of the water–ice classiﬁcation by random forest and cloud gap-ﬁlling procedures was 98.36% and 92.56%, respectively. The annual spatial distribution of the freeze-up and break-up dates ranged primarily from DOY 330 to 397 and from DOY 70 to 116. Meanwhile, the decrease rates of freeze-up duration (DFU), full ice cover duration (DFI), and ice cover duration (DI) were 0.37, 0.34, and 0.13 days/yr., respectively, and the duration was shortened by 7.4, 6.8, and 2.6 days over the past 20 years. The increased rate of break-up duration (DBU) was 0.58 days/yr. and the duration was lengthened by 11.6 days. Furthermore, the increase in temperature resulted in an increase in precipitation after two years; the increase in precipitation resulted in the increase in DBU and decrease in DFU in corresponding years, and decreased DI and DFI after one year.


Introduction
Spatially explicit knowledge of lake ice is crucial for understanding a wide variety of earth system processes and interactions with the environment, including hydrological budgets, sediment trapping; heat fluxes and coupled weather and climate effects; lake productivity; species richness; food chain dynamics; and inland fishery yields [1,2]. The lake surface water-ice state is controlled by the local geolocation and climate conditions, especially the freeze-up dates and the break-up dates. The lake surface water's physical properties changed due to the transition of the lake surface water-ice state processes [3].
Known as the "Third Pole", the Tibetan Plateau (TP) is the highest and most extensive highland in the world. As the Asian water tower, there exist around 1200 lakes (>1 km 2 ) with an overall surface water area of 47,000 km 2 , which makes up beyond 50% of the overall surface water area of Chinese lakes, including the largest lake: Qinghai Lake (4254.90 km 2 ) [4]. The Qinghai Lake locates in the Yellow River sub-basin, which is an area of approximately 2.5 × 10 5 km 2 ; the distinctive climatic conditions and geophysical environment make it a highly attractive research region [5].
Ice cover dominates the annual cycle of the lake and essentially isolates the water body from the exchange of moisture and gas fluxes [6]. Thus, lake ice cover plays a key role in regional water and energy balance [7]. Lake phenology events, such as critical  [24]. (Note: This figure is created using Sentinel-2, and the source of the Yellow River Basin: https://data.tpdc.ac.cn/zh-hans/data/dff6b437-90a1-4729-8140-faafc544860f/).

MODIS Surface Reflectance
In order to extract the daily pixel wise spatial-temporal distribution of the wate ice, the daily MODIS surface reflectance was used. It includes seven-bands, whic bands 1-7 (1-2: 250 m, 3-7: 500 m resolution). It is provided by the MODIS Land Su Reflectance Science Computing Facility [30]. The products are the Level 2 data de from the Level 1B data, the radiometric calibration and the atmospheric correction b processed in Level 1 and Level 2, respectively. Therefore, the atmospheric scattering absorption are eliminated. The MOD09GQ V6 and MYD09GQ V6 provided band surface reflectance, as well as the MOD09GA V6 and MYD09GA V6 provided band surface reflectance in the GEE platform [30][31][32]. The date range of MODIS/Terra data from 24 February 2000 to 31 July 2020 (https://developers.google.com/earth-engin tasets/catalog/MODIS_006_MOD09GQ; https://developers.google.com/earth-engin tasets/catalog/MODIS_006_MOD09GA). The date range of MODIS/Aqua data was fr  [24]. (Note: This figure is created using Sentinel-2, and the source of the Yellow River Basin: https://data.tpdc.ac.cn/zh-hans/data/dff6b4 37-90a1-4729-8140-faafc544860f/ (accessed on 31 July 2020)).

Sentinel-2 Surface Reflectance
Sentinel-2 (S2) is a wide-swath, high-resolution, multispectral imaging mission with a global 5-day revisit frequency [33]. The S2 Multispectral Instrument (MSI) samples Remote Sens. 2021, 13, 1695 4 of 21 13 spectral bands: visible and NIR (10 m spatial resolution), red edge, and SWIR (20 m spatial resolution), and atmospheric bands (60 m spatial resolution). It provides data suitable for assessing the state and change of vegetation, soil, and water cover. Observation of inland waterways and coastal areas [34][35][36]. The dataset has been available since 14 December 2018. The aim is to validate the water-ice classification accuracy of RF derived from MODIS, and the reconstruction results of cloud gap-filling (https://developers.google. com/earth-engine/datasets/catalog/COPERNICUS_S2 (accessed on 31 July 2020)).

ERA5 Daily Aggregates
ERA5 is the fifth generation European Centre for Medium-Range Weather Forecasts atmospheric reanalysis of the global climate. Reanalysis combines model data with observations from across the world into a globally complete and consistent dataset. The two critical variables-2 m air temperature and total precipitation-are provided by ERA5 Daily Aggregates in GEE. Additionally, daily (monthly/yearly) total precipitation values are given as daily (monthly/yearly) sums; daily (monthly/yearly) temperature values are provided as daily (monthly/yearly) averages [39,40]. The objective is to analyze the relationship with the pixel wise spatial-temporal distribution of the freeze-thaw cycle of Qinghai Lake (https://developers.google.com/earth-engine/datasets/catalog/ECMWF_ ERA5_DAILY (accessed on 31 July 2020)).

RF Classification
The previous research compared the water index, ice threshold value method and RF algorithm. The RF is very robust and can handle complex data, and the advantage is that bagging can avoid overfitting [22]. The RF algorithm is a classification and regression algorithm originally designed for machine learning (ML) [20,41]. ML is a discipline focused on two interrelated questions: "How can one construct computer systems that automatically improve through experience?" and "What are the fundamental statistical-computationalinformation-theoretic laws that govern all learning systems, including computers, humans, and organizations?" [42]. Figure 2 shows the flowchart of MODIS data using RF based on the GEE platform. Individual trees are derived using a bootstrapped sample of the original data. Approximately two-thirds of the samples (referred to as "in-bag" samples) in the dataset are used for training and the remaining one-third (referred to as "out-of-bag" samples) are used in an internal cross-validation technique for estimating how well the resulting RF model performs [20]. In the GEE platform, the RF algorithm included six critical parameters, which were the number of trees, variablesPerSplit, min-LeafPoplation, bagFraction, outofBagMode, and seed variables. It excluded the criterion and maximum tree depth. The number of trees was set to 85 depending on the training data, the variable PerSplit is 0.66, other variables could be set as default, the detailed information was given in the previous research [22].

Cloud Gap-Filling
The classification of four classes was verified further; some outliers may be misclassified. There are some water and ice errors of commission. Meanwhile, there are some water and ice errors of omission. Cloud cover and cloud shadow have the greatest impact on classifying lake water and ice. Of paramount importance, the ice and water can only be found inside the lake. Therefore, to solve this problem, annual water mask products (JRC GSW) were used to filter out the outliers and obtain the lake water and ice information inside the lake from 2000 to 2020.
Suppose that in an ice season year, there are only water-ice-water transition processes at each pixel. The water pixels freeze up in fall or winter, the ice pixels break up in spring at the corresponding pixels, there are no multiple water-ice state transitions during an ice season year. The critical hypothesis conducts the annual temporal-spatial distribution of surface water freeze-up dates and ice break-up dates; some misclassifications of four classes could be reduced and eliminated.
An effective gap-filling scheme via Non-local Spatio-Temporal Filter (NSTF) is applied to reconstruct the daily classification results under the cloud cover gaps. Due to the irregular gaps under the corresponding cloud regions, the classification results derived from Terra and Aqua are blended first [48]. The adjacent temporal filtering (ATF) is executed to fill some cloud-gaps [49][50][51][52][53][54], and finally, the NSTF is applied to fill most of the remaining cloud-gap [55][56][57].

Methods Evaluation Approaches
It is critical to evaluate the water-ice classification results. The confusion matrix and the derived indexes are applied to evaluate the results ( Table 1). The kappa coefficient is frequently used to summarize the accuracy assessment of land cover classifications obtained from remote sensing. The reference classification is obtained from the ground visit

Training and Testing Dataset
The training data and testing data were derived from MODIS products, which are "MOD09GQ V6" and "MOD09GA V6". As the freeze-up events of lake surface water generally occur in November, and break-up events of lake ice occur in April, the ice duration is five months [29]. Four images with MODIS Surface Reflectance in the middle of October (15 October), January (15 January), April (15 April), and July (15 July) are selected in each year, and 100 sample pixels (4 classes: water, ice, land, and cloud) are simply randomly selected in each image. As a result, 84 images were selected over 24 February 2000 to 31 December 2020. Additionally, if the MODIS images showed bad sample pixels in the seven-band features with low quality, the nearest former or following images were used instead of the current images. For example, a MODIS image showed some bad pixels on the 15 April 2001, but the pixels of the next image were fine on the 16 April 2001, which were used in place of the image on the 15 April 2001. The total number of MODIS sample pixels is 7776 after quality control, including water (2343), ice (1421), land (2850), and cloud (1162) pixels. Finally, the 2/3 (5184) sample pixels were randomly assigned to the training sample. Meanwhile, the remanent 1/3 (2592) of sample pixels were assigned to the testing sample.

Model Training and Validation
The number of RF trees is different. The results of the classifier accuracy and validation accuracy for RF are different, especially the validation accuracy. In general, the number of RF trees was empirically set to 100 [43][44][45]. Oil spills were classified and it was found that an ensemble of 70 trees was sufficient [46]. The number of RF trees varying from 1 to 75 based on a step of 5 was set to 50 [47].
The suitable and performable number of RF trees in GEE for this study was selected, the grid search was performed. Many trials were conducted from 10 to 500 by steps of 10 for numberoftrees variable. The classifier accuracy was greater than 0.950. When the number of trees was greater than 100, the classifier accuracy was close to 1.000, the validation accuracy was between 0.747 and 0.807. Specifically, the focus was on the number of trees that ranged from 70 to 100, with the greatest validation accuracy value being 0.807 in 85 and 90 trees, and the classifier accuracy was 0.997. Herein, the key-dependent factors were validation accuracy and high computing performance; therefore, 85 was selected as the number of RF trees.

Cloud Gap-Filling
The classification of four classes was verified further; some outliers may be misclassified. There are some water and ice errors of commission. Meanwhile, there are some water and ice errors of omission. Cloud cover and cloud shadow have the greatest impact on classifying lake water and ice. Of paramount importance, the ice and water can only be found inside the lake. Therefore, to solve this problem, annual water mask products (JRC GSW) were used to filter out the outliers and obtain the lake water and ice information inside the lake from 2000 to 2020.
Suppose that in an ice season year, there are only water-ice-water transition processes at each pixel. The water pixels freeze up in fall or winter, the ice pixels break up in spring at the corresponding pixels, there are no multiple water-ice state transitions during an ice season year. The critical hypothesis conducts the annual temporal-spatial distribution of surface water freeze-up dates and ice break-up dates; some misclassifications of four classes could be reduced and eliminated.
An effective gap-filling scheme via Non-local Spatio-Temporal Filter (NSTF) is applied to reconstruct the daily classification results under the cloud cover gaps. Due to the irregular gaps under the corresponding cloud regions, the classification results derived from Terra and Aqua are blended first [48]. The adjacent temporal filtering (ATF) is executed to fill some cloud-gaps [49][50][51][52][53][54], and finally, the NSTF is applied to fill most of the remaining cloud-gap [55][56][57].

Methods Evaluation Approaches
It is critical to evaluate the water-ice classification results. The confusion matrix and the derived indexes are applied to evaluate the results ( Table 1). The kappa coefficient is frequently used to summarize the accuracy assessment of land cover classifications obtained from remote sensing. The reference classification is obtained from the ground visit or other spatial high-resolution remote sensing sample data. The estimated sample date is often summarized in a confusion matrix subjected to various statistical analyses [58]. Table 1. Accuracy estimation and confusion matrix for all classes (a i stands for the total number of actual class i pixels, b j stands for the total number of predicted class j pixels, and N ij stands for the total number of actual class i and predicts class j pixels, N stands for the total number of pixels, q stands for the number of classes).

Predicted
Row Total Suppose N pixels of remote sensing images are classified into q categories. Given a census of N pixels and the true classification of each pixel, a i stands for the total number of actual class i pixels, b j stands for the total number of predicted class j pixels, and N ij stands Remote Sens. 2021, 13, 1695 7 of 21 for the total number of actual class i and predicts class j pixels. The population confusion matrix is shown in Table 1.
The producer accuracy (PA) is the following Equation (1): The overall accuracy (OA) is the following Equation (2): Kappa coefficient (K) is the following Equation (3): where the P e is the following Equation (4):

Definition of Lake Phenology
For convenience, the ice season year is defined as running from 1 September to 31 August of the following year. Each pixel has its water-ice state phase of the lake. Firstly, the water froze into the solid ice state, then the ice melted into the water during the annual fall-winter-spring successive ice season for pixels. We used the general calculation method for Julian calendar dates referred to in the literature [10,29,59], i.e., when this day occurs within the year (day of the year, DOY, with 1st January as the reference), the consecutive date of the following year is added. Due to the difference of Julian calendar dates in each pixel, the spatial-temporal distribution of the freeze-thaw cycle is very critical. The Julian calendar date of the first pixel (the water froze into the solid ice state) was set as the date of freeze-up start (FUS) of the lake. That of the last pixel (the water froze into the solid ice state) was set as the date of the freeze-up end (FUE) [60,61]. Symmetrically, that of the first pixel (the ice melt into the water state) was set as the date of break-up start (BUS). The last pixel (the ice melt into the water state) was set as the date of break-up end (BUE).
Therefore, the attendant variables of lake phenology are the freeze-up duration (DFU), full ice cover duration (DFI), break-up duration (DBU), ice duration (DI), and full water duration (DFW). The DFU is the duration that ranges from FUS to FUE; the DFI is the duration that ranges from FUE to BUS; the DBU is the duration that ranges from BUS to BUE; the DI is the duration that ranges from FUS to BUE; the DFW is the duration that ranges from BUE to following FUS.

Cloud Removal from MODIS Images
The first challenge in information extraction from optical satellite images of the Earth's surface is the collection of cloud-free images. Clouds dramatically affect signal transmission in complex ways due to their different shapes, heights, and distributions, thus contaminating remote sensing data from land and water. Cloudy image restoration is a vital step in the remote sensing image processing chain. The correction of cloudy data can substantially increase the number of useable images and pixels available for later applications. Cloud removal strives to remove cloud effects, including cloud shadows. The daily Terra/Aqua cloud was calculated using cloud state and cloud shadow bitmask of state 1km. The daily percent and daily accumulative percent with Terra/Aqua cloud/cloud shadow percent group are shown in Figure 3. thus contaminating remote sensing data from land and water. Cloudy image restoration is a vital step in the remote sensing image processing chain. The correction of cloudy data can substantially increase the number of useable images and pixels available for later applications. Cloud removal strives to remove cloud effects, including cloud shadows. The daily Terra/Aqua cloud was calculated using cloud state and cloud shadow bitmask of state 1km. The daily percent and daily accumulative percent with Terra/Aqua cloud/cloud shadow percent group are shown in Figure 3. The cloud percent covered from 0% to 100%, the range of cloud percent was divided into 10 groups by 10% step, Figure 3(a), (b) show the days' percent and days accumulative percent with Terra/Aqua cloud percent group. The trend of days' percent with Terra/ Aqua cloud percent was similar, the daily percent decreased from the group (0, 10), reached the lowest point in the group (70,80), and lasted the longest in the group (90, 100). However, the daily percent in Terra was greater than that in Aqua in two groups (0, 10) and (10,20). In contrast, the daily percent in Terra was less than that in Aqua in four groups (50,60), (60,70), (70,80), and (80, 90). The daily accumulative percent increased fast, then slowly in Terra; however, it increased with a relatively uniform rate in Aqua, which was about 40% when the cloud percent was less than 30% in Terra, but that reached the same level when the cloud percent was less than 40% in Aqua; that was greater than 50% in Terra, nevertheless, that was less than 50% in Aqua, when the cloud percent was less than 50%; that reached about 78% both in Terra and Aqua when the cloud percent was less than 90%. The cloud percent covered from 0% to 100%, the range of cloud percent was divided into 10 groups by 10% step, Figure 3a,b show the days' percent and days accumulative percent with Terra/Aqua cloud percent group. The trend of days' percent with Terra/ Aqua cloud percent was similar, the daily percent decreased from the group (0, 10), reached the lowest point in the group (70,80), and lasted the longest in the group (90, 100). However, the daily percent in Terra was greater than that in Aqua in two groups (0, 10) and (10,20). In contrast, the daily percent in Terra was less than that in Aqua in four groups (50,60), (60,70), (70,80), and (80, 90). The daily accumulative percent increased fast, then slowly in Terra; however, it increased with a relatively uniform rate in Aqua, which was about 40% when the cloud percent was less than 30% in Terra, but that reached the same level when the cloud percent was less than 40% in Aqua; that was greater than 50% in Terra, nevertheless, that was less than 50% in Aqua, when the cloud percent was less than 50%; that reached about 78% both in Terra and Aqua when the cloud percent was less than 90%.
For cloud shadow that was less influential but cannot be ignored, the cloud shadow percent covered from 0% to 20%, the range of cloud shadow percent was divided into 10 groups by 2% step. Figure 3c,d show the days' percent and days' accumulative percent with Terra/Aqua cloud shadow percent group. The trend of days' percent/days' accumulative percent with Terra/Aqua cloud percent was similar and showed exponential decrease/increase. The daily percent decreased from about 40% in the group (0, 2) in Terra, reaching the lowest point 0 in the group (18,20). Comparably, the daily percent decreased from about 36% in the group (0, 2) in Aqua, reached the lowest point 0 in the group (18,20). The daily accumulative percent increased fast, then slowly, both in Terra and Aqua, reached about 100% in the group (8)(9)(10). It was less influential when the cloud shadow percent was larger than 10%.
The monthly cloud and cloud shadow percent in Terra and Aqua were shown in Figure 4. The trend change of cloud percent and cloud shadow percent in Terra was similar to that in Aqua. The mean monthly cloud percent in Terra was 49.46%, less than about 4% in Aqua (53.63%). The maximum monthly cloud percent in Terra was 71.92%, which was the same as in Aqua (72.02%). The minimum monthly cloud percent in Terra was 16.63%, less than about 6% in Aqua (22.50%). It was illustrated that there were the more cloudy conditions in Afternoon (Aqua) than Morning (Terra). Simultaneously, the mean monthly Remote Sens. 2021, 13, 1695 9 of 21 cloud shadow percent in Terra was 3.58%, less than about 0.16% in Aqua (3.74%). The maximum monthly cloud shadow percent in Terra was 7.33%, less than about 0.30% in Aqua (7.63%). The minimum monthly cloud percent in Terra was 0.70%, less than about 0.97% in Aqua (1.67%). It was the same result as the cloud, and there was more cloudy shadow in the Afternoon (Aqua) than Morning (Terra). The cloud percent was 13.82 times than cloud shadow percent in Terra, cloud shadow percent was 14.34 times than cloud shadow percent in Aqua-this was an interesting result. The cloud percent was about 14 times than cloud shadow percent. To some degree, the cloud shadow cannot make much difference to cloud removal. percent was larger than 10%.
The monthly cloud and cloud shadow percent in Terra and Aqua were shown in Figure 4. The trend change of cloud percent and cloud shadow percent in Terra was similar to that in Aqua. The mean monthly cloud percent in Terra was 49.46%, less than about 4% in Aqua (53.63%). The maximum monthly cloud percent in Terra was 71.92%, which was the same as in Aqua (72.02%). The minimum monthly cloud percent in Terra was 16.63%, less than about 6% in Aqua (22.50%). It was illustrated that there were the more cloudy conditions in Afternoon (Aqua) than Morning (Terra). Simultaneously, the mean monthly cloud shadow percent in Terra was 3.58%, less than about 0.16% in Aqua (3.74%). The maximum monthly cloud shadow percent in Terra was 7.33%, less than about 0.30% in Aqua (7.63%). The minimum monthly cloud percent in Terra was 0.70%, less than about 0.97% in Aqua (1.67%). It was the same result as the cloud, and there was more cloudy shadow in the Afternoon (Aqua) than Morning (Terra). The cloud percent was 13.82 times than cloud shadow percent in Terra, cloud shadow percent was 14.34 times than cloud shadow percent in Aqua-this was an interesting result. The cloud percent was about 14 times than cloud shadow percent. To some degree, the cloud shadow cannot make much difference to cloud removal.   Table 2 shows the training classification results of four classes based on the confusion matrix of RF. Four classes' results were expressed as the pixel count, the bold count stands for the predicted accurate class results, while the accuracy of RF was expressed as the percentage. The training kappa coefficient was 99.59%, the overall training accuracy of RF was 99.71%, so the agreement is almost perfect. The obtained accuracies were high for all classes. Table 3 shows the testing classification results of four classes based on the confusion matrix of RF. The overall testing accuracy was 97.45%, testing the kappa coefficient was 96.45%, the agreement is almost perfect. The accuracies were high for all four classes.

Extraction of Water and Ice by RF Algorithm
Daily water and ice classes were extracted by the RF algorithm according to trained RF classifiers, which is based on four classes of training sample pixels and testing sample pixels. Four daily scenes with water-ice-cloud mixed conditions were selected, which were from around April. Figure 5 visually listed the RF water-ice classification in Terra and Aqua under the cloud cover condition. Figure 5(1), (3) show the four daily raw scenes in Terra and Aqua. Figure 5(2), (4) show the water-ice classification using RF. The classification results were in good agreement with the raw surface reflectance image. These scenes were influenced by the cloud cover and cloud shadow, which were distributed randomly over the lake. Scenes were slightly influenced by the cloud cover and cloud shadow on 6 April 2018 (Figure 5d) and the water-ice extraction was slightly influenced. Scenes were slightly influenced by the cloud cover and cloud shadow outside the lake on 24 March 2014 ( Figure 5c) but water-ice extraction was relatively complete. The more cloudy conditions are distributed near the lake on 11 April 2004 (Figure 5a), and the water-ice classifications were partly influenced. Dense cloud covered almost the whole lake on 15 April 2008 (Figure 5b), and the water-ice classifications were severely affected. The images from Aqua were more likely affected than the images from Terra; these results were in good agreement with the previous daily distribution of cloud and cloud shadow. Although the images from Aqua and Terra were affected by the cloud and cloud shadow, the result could be completely reconstructed by gap-filling procedures after the RF classification of images. To validate the water-ice classification accuracy of RF derived from MODIS, 2979 validation samples from Sentinel-2 images were selected manually from 14 December 2018 to 29 December 2019. We selected 150 composite images and an average of 20 sample points were sampled for each composite image. We obtained 2377 matched samples through the validation procedure, including the time matching and space matching for RF water-ice results and validation samples. Table 4 shows the validation accuracies and confusion matrix. The obtained validation accuracies were 98.50% and 97.62% for the water and ice classes, respectively. The overall validation accuracy was 98.36%, the validation kappa coefficient was 94.00%-the agreement is almost perfect.  To validate the water-ice classification accuracy of RF derived from MODIS, 2979 validation samples from Sentinel-2 images were selected manually from 14 December 2018 to 29 December 2019. We selected 150 composite images and an average of 20 sample points were sampled for each composite image. We obtained 2377 matched samples through the validation procedure, including the time matching and space matching for RF water-ice results and validation samples. Table 4 shows the validation accuracies and confusion matrix. The obtained validation accuracies were 98.50% and 97.62% for the water and ice classes, respectively. The overall validation accuracy was 98.36%, the validation kappa coefficient was 94.00%-the agreement is almost perfect.

Reconstruction of Water and Ice by Cloud gap-filling
thm An effective gap-filling scheme was applied via a combination of Terra and Aqua, ATF, and NSTF methods to reconstruct the daily classification results of Terra and Aqua under the cloud cover gaps. The results of three cloud gap-filling procedures are visually

Reconstruction of Water and Ice by Cloud Gap-Filling
An effective gap-filling scheme was applied via a combination of Terra and Aqua, ATF, and NSTF methods to reconstruct the daily classification results of Terra and Aqua under the cloud cover gaps. The results of three cloud gap-filling procedures are visually illustrated in Figure 6. Figure 6 mainly illustrates the two critical liquid water-to-solid-ice state or solid-ice-to-liquid-water state transfer processes in the co-existence of Terra and Aqua: liquid water freeze-up process ( Figure 6) and solid ice break-up process (Figure 7). Figure 6 (1), (2) show the RF ice-water classification results of Terra and Aqua during the 25 December 2002 to 30 December 2002. The ice was freezing up in these data ranges, but the cloud gaps seriously influenced the extraction of ice and water, so complete ice and water pixels were severely missing. Figure 6 (3) combined the Terra and Aqua to compensate for the shortcomings of the single satellite. The ATF method procedure (Figure 6 (4)) could restore the most missing cloud gap data. The NSTF method procedure could completely recover the data. Furthermore, Figure 7 shows the ice break-up process on 27 March 2003 to 1 April 2003. Figure 7e shows the full cloud cover, but the cloud gap was completely reconstruct through the cloud gap-filling procedures. Figure 7a,f were severely covered by cloud. The cloud gap could be restored. The visualized results showed the feasibility and superiority of cloud gap-filling procedures.
To validate the performance of the cloud gap-filling procedures, the validation samples from Sentinel-2 images were used to quantitatively analyze the overall accuracy of the cloud gap-filling procedures. Table 5 shows the validation accuracies and confusion matrix. The obtained validation produced accuracies of 90.65% and 97.94% for the water and ice classes, respectively. The overall validation accuracy was 92.56%, the validation kappa coefficient was 82.17%, the agreement is almost perfect.
Remote Sens. 2021, 13, 1695 12 of 23 illustrated in Figure 6. Figure 6 mainly illustrates the two critical liquid water-to-solid-ice state or solid-ice-to-liquid-water state transfer processes in the co-existence of Terra and Aqua: liquid water freeze-up process ( Figure 6) and solid ice break-up process (Figure 7).  illustrated in Figure 6. Figure 6 mainly illustrates the two critical liquid water-to-solid-ice state or solid-ice-to-liquid-water state transfer processes in the co-existence of Terra and Aqua: liquid water freeze-up process ( Figure 6) and solid ice break-up process (Figure 7).

Annual Spatial Dynamics of Freeze-Up and Break-Up Dates
The previous work mainly studied the lake ice as a whole, while this work focused on each pixel of the lake for detailed research. The lake images are composed of water-ice pixels. Each pixel has its own water-ice state phase, with either solid ice or liquid water, and each pixel transitions between the above two states: firstly, the liquid water was transited into the solid ice state. The ice melted into the water during the annual fallwinter-spring successive ice season. Therefore, the date on which each pixel transitions from water-to-ice during the surface water freeze-up processes or from ice-to-water during the ice break-up processes is crucial in each ice season year. Figure 8a visualized the annual spatial distribution of the ice freeze-up dates in each pixel on the Julian calendar day during the 2000/2001-2019/2020 ice season years. Generally, the freeze-up date ranged primarily from Julian calendar day 330 to day 397. The earliest positions were located on the northern and western parts along the lake shoreline. The surface water was frozen from along the lake boundary to the middle, and the last freezed positions were located on the southern and central parts of the lake. The lake water freeze-up rule was also very consistent with the latitude range of Qinghai Lake. The ice melted into the water when the spring was coming. Figure 9a depicts the annual spatial distribution of the ice break-up date for each pixel on Julian calendar days during the 2000/2001-2019/2020 ice season. The break-up dates ranged primarily from Julian calendar day DOY 70 to 116, and lasted average 46 days. As a whole, the ice began to break up from the southwestern parts or center of the lake gradually diffused to the northern and eastern lake boundary, the coastal area and several bays ultimately melted.  The ice melted into the water when the spring was coming. Figure 9a depicts the annual spatial distribution of the ice break-up date for each pixel on Julian calendar days during the 2000/2001-2019/2020 ice season. The break-up dates ranged primarily from Julian calendar day DOY 70 to 116, and lasted average 46 days. As a whole, the ice began to break up from the southwestern parts or center of the lake gradually diffused to the northern and eastern lake boundary, the coastal area and several bays ultimately melted. Figure 9b, c depict the temporal variation and temperature of the annual ice breakup at each pixel. The earliest warm spring year was in 2013/2014, where the 1/3 ice-to- ice season year, a far-reaching 2008 spring snow disaster took place in China, and the ice break-up of all lake pixels was delayed-the Julian calendar day ranged from DOY 101 to 105. The coldest spring was the 2010/2011 ice season year, the ice break-up of all lake pixels ranged from DOY 97 to 107. The fastest ice break-up year was the 2012/2013 ice season; it lasted about 4 days and the majority of pixels ranged from DOY 92 to 95. The slowest ice freeze-up year was the 2011/2012 ice season; it lasted about 33 days and the majority of pixels ranged from DOY 71 to 103.  Distinctively, the 2019/2020 ice season year also saw the coldest spring; the ice breakup of all lake pixels ranged from DOY 83 to 109, which lasted a longer time than other ice season years. This result may have a close correlation with precipitation and temperature. Figure 10 shows the daily air temperature at 2 m and precipitation derived from the ERA5 during the solid ice break-up process in the 2019/2020 ice season. It snowed four times during this period: 15 March; 25 March to 7 April; 16 April to 17 April; and 20 April to 24 April at the same time, the snowfall was accompanied by four distinctly advanced temperature drops, respectively. The difference was that the temperature was greater than 0°C after the third snowfall events (14 April). Therefore, snowfall and adjoint cooling would result in the continuous existence of lake ice, which would not melt at the earlier time as in previous years, presenting a unique spatial distribution phenomenon. Therefore, the lake phenology is closely related to precipitation and temperature.
season years. This result may have a close correlation with precipitation and temperature. Figure 10 shows the daily air temperature at 2 m and precipitation derived from the ERA5 during the solid ice break-up process in the 2019/2020 ice season. It snowed four times during this period: 15 March; 25 March to 7 April; 16 April to 17 April; and 20 April to 24 April at the same time, the snowfall was accompanied by four distinctly advanced temperature drops, respectively. The difference was that the temperature was greater than 0 ℃ after the third snowfall events (14,April). Therefore, snowfall and adjoint cooling would result in the continuous existence of lake ice, which would not melt at the earlier time as in previous years, presenting a unique spatial distribution phenomenon. Therefore, the lake phenology is closely related to precipitation and temperature.

Discussion
The variation of the liquid water-to-solid-ice state and the solid-ice-to-liquid-water state of the lake is continuous in temporal range. Figure 11 illustrates the temporal variation of the freeze-thaw-cycle processes, precipitation, and the air temperature at 2 m of Qinghai Lake, during the 2000/2001-2019/2020 ice season years. Consequently, the duration and variation trends of the DFU, DFI, DBU, and DI are very important for freezethaw cycle lakes; the relationships between those durations, precipitation, and temperature are very critical. Figure 11a shows the annual certain date ranges of liquid surface water freeze-up, full solid ice cover, solid ice break-up and ice period. The date of surface water freeze-up mainly ranged from 1 December to 10 January The ice break-up mainly ranged from 10 March to 20 April, the full ice cover was centrally distributed within the embedded FUE and BUS, as well as which the ice period extended the whole lifetime of ice cover. The FUS and FUE showed a tendency to delay, while the BUS and BUE tended to advance. Most of the former studies of lake ice had also demonstrated a global warming tendency, and the latter was more intense change than the former [59,[62][63][64][65]. Figure 11b illustrates the duration and change trend of four critical ice duration processes. The mean durations of DFU, DFI, DBU, and DI were 29.05, 65.95, 26.30, and 121.30 days, respectively. The annual durations fluctuated around the average, and the maximum amplitude durations were 18.95, 30.95, 24.70, and 19.3 days, respectively. Furthermore, three variations-DFU, DFI, and DI-presented a decreasing tendency, the decrease rates were -0.37, -0.34, and -0.13 days/year, respectively, and the durations were shortened by 7.4, 6.8, and 2.6 days in the past 20 years. Only the DBU presented an increasing tendency; the increasing rate was 0.58 days/yr. and the duration was lengthened by 11.6 days in the past 20 years. It is consistent with the results that the ice duration shortened 17.5 days/decade for Canada's lakes during 1985-2004 [3]. Figure 11c, d illustrate the yearly and monthly precipitation and temperature during the 2000/2001 to 2019/2020 ice season years. The mean monthly cumulative precipitation is 0.056 m, the monthly increase rate is 4.06 × 10 −5 m/mo., the maximum monthly precipitation is gradually increasing. The maximum yearly precipitation is gradually increasing, the mean yearly cumulative precipitation is 0.681 m, the yearly increase rate is 7.700 × 10 −3 m/year, and the precipitation has increased by 154 mm in the past 20 years. It is very similar to the above temperature result. The mean monthly temperature is 1.568 ℃, the

Discussion
The variation of the liquid water-to-solid-ice state and the solid-ice-to-liquid-water state of the lake is continuous in temporal range. Figure 11 illustrates the temporal variation of the freeze-thaw-cycle processes, precipitation, and the air temperature at 2 m of Qinghai Lake, during the 2000/2001-2019/2020 ice season years. Consequently, the duration and variation trends of the DFU, DFI, DBU, and DI are very important for freeze-thaw cycle lakes; the relationships between those durations, precipitation, and temperature are very critical. Figure 11a shows the annual certain date ranges of liquid surface water freeze-up, full solid ice cover, solid ice break-up and ice period. The date of surface water freeze-up mainly ranged from 1 December to 10 January. The ice break-up mainly ranged from 10 March to 20 April, the full ice cover was centrally distributed within the embedded FUE and BUS, as well as which the ice period extended the whole lifetime of ice cover. The FUS and FUE showed a tendency to delay, while the BUS and BUE tended to advance. Most of the former studies of lake ice had also demonstrated a global warming tendency, and the latter was more intense change than the former [59,[62][63][64][65]. Figure 11b illustrates the duration and change trend of four critical ice duration processes. The mean durations of DFU, DFI, DBU, and DI were 29.05, 65.95, 26.30, and 121.30 days, respectively. The annual durations fluctuated around the average, and the maximum amplitude durations were 18.95, 30.95, 24.70, and 19.3 days, respectively. Furthermore, three variations-DFU, DFI, and DI-presented a decreasing tendency, the decrease rates were −0.37, −0.34, and −0.13 days/year, respectively, and the durations were shortened by 7.4, 6.8, and 2.6 days in the past 20 years. Only the DBU presented an increasing tendency; the increasing rate was 0.58 days/yr. and the duration was lengthened by 11.6 days in the past 20 years. It is consistent with the results that the ice duration shortened 17.5 days/decade for Canada's lakes during 1985-2004 [3]. Figure 11c,d illustrate the yearly and monthly precipitation and temperature during the 2000/2001 to 2019/2020 ice season years. The mean monthly cumulative precipitation is 0.056 m, the monthly increase rate is 4.06 × 10 −5 m/mo., the maximum monthly precipitation is gradually increasing. The maximum yearly precipitation is gradually increasing, the mean yearly cumulative precipitation is 0.681 m, the yearly increase rate is 7.700 × 10 −3 m/year, and the precipitation has increased by 154 mm in the past 20 years. It is very similar to the above temperature result. The mean monthly temperature is 1.568°C, the monthly increase rate is 2.200 × 10 −3 • C/mo.; thus, the monthly temperature is gradually increasing. Meanwhile, the mean yearly temperature is 1.691°C, and the yearly increase rate is 0.118 • C/yr., meaning that the temperature has increased by 2. The relationship between the Polish lowland lakes' ice and winter air temperature was analyzed, the ice cover duration has been becoming shorter under the climatic changes condition [62].

Conclusions
This research focuses on the spatial-temporal distribution of the freeze-thaw cycle pixel-by-pixel in Qinghai Lake based on the ML and MODIS during the 2000/2001 to 2019/2020 ice seasons. After the daily cloud and cloud shadow were extracted and removed, the water-ice pixel state results were classified using the RF algorithm and were validated using Sentinel-2 images. The combination of Terra and Aqua, ATF, and NSTF cloud gap-filling procedures were applied to fill the cloud gap, and the Sentinel-2 images were used to validate the results of the cloud gap-filling. The results showed that the accuracy of the water-ice classification by RF was 98.36%, the kappa coefficient was 94.00%. The overall accuracy of the reconstruction of water and ice by cloud gap-filling procedures Furthermore, the main factors affecting the duration are precipitation and temperature [66]. The relationships between the duration and variation trend of annual DFU, DFI, DBU, DI, precipitation, and 2 m air temperature during the 2000/2001-2019/2020 ice season years are critical, and it was found that the trends of these variables were in relative agreement. Compared with the temperature, the precipitation showed a delayed increasing pattern of about two years. The changing trend of yearly precipitation was similar to that of DBU and DFU. Compared with the precipitation, the durations of DI showed a delayed trend pattern of about one year, although DFU and DI were negative. Compared with the duration of DFI, the DFU showed an opposite trend pattern. The duration of DI showed a similar trend pattern. In addition, the lake size, snowfall (area and thickness) and ice thickness displayed the significant role in regulating the ice phenology [11]. Humidity is a very sensitive variable that responds to changes in lake phenology and its surrounding environment [67,68]. The theoretical consistency in geophysical data analysis from the fractals to the stochastics method was also critical [69][70][71][72].
In summary, the increase in the temperature resulted in an increase in precipitation after two years. The increase in precipitation resulted in the increase in the duration of DBU and the decrease in DFU in corresponding years, and a decrease in the duration of DI and DFI after one year. In addition, the border shape, salinity, depth and wind speed are also the impacting factors of lake ice phenology.

Conclusions
This research focuses on the spatial-temporal distribution of the freeze-thaw cycle pixel-by-pixel in Qinghai Lake based on the ML and MODIS during the 2000/2001 to 2019/2020 ice seasons. After the daily cloud and cloud shadow were extracted and removed, the water-ice pixel state results were classified using the RF algorithm and were validated using Sentinel-2 images. The combination of Terra and Aqua, ATF, and NSTF cloud gap-filling procedures were applied to fill the cloud gap, and the Sentinel-2 images were used to validate the results of the cloud gap-filling. The results showed that the accuracy of the water-ice classification by RF was 98.36%, the kappa coefficient was 94.00%. The overall accuracy of the reconstruction of water and ice by cloud gap-filling procedures was 92.56%, the kappa coefficient was 82.17%. The annual spatial dynamics of liquid water freeze-up date showed that it ranged primarily from DOY 330 (25 November) to 397 (1 February) and the solid ice break-up date ranged primarily from DOY 70 (11 March) to 116 (26 April). Meanwhile, the decrease rate of DFU, DFI, and DI were 0.37, 0.34, and 0.13 days/yr. and the duration days were shortened by 7.4, 6.8, and 2.6 days in the past 20 years. Only the rate of DBU increased, by 0.58 days/yr., and the duration days were lengthened by 11.6 days in the past 20 years. Compared with the temperature, an increase in the temperature resulted in an increase in precipitation after two years, the increase in precipitation resulted in the increase in duration days of DBU and decrease in duration days of DFU in current years, and a decrease in duration days of DI and DFI after one year.
In conclusion, the performance of the image classification of ML is efficient, and the cloud gap-filling procedures could reconstruct the spatio-temporal dynamics of the water-ice state of the lake under the complex cloud cover conditions. However, there are some limitations, such as that the spatial resolution is 250 m, and the size of the lake is relatively large. Further research will focus on the fusion of MODIS (250 m), Landsat (30 m), Sentinel-2 (10 m), and Synthetic Aperture Radar remote sensing data; it has great potential to improve spatial resolution to 10 m.