Automated Delineation of Supraglacial Debris Cover Using Deep Learning and Multisource Remote Sensing Data

: High-mountain glaciers can be covered with varying degrees of debris. Debris over glaciers (supraglacial debris) signiﬁcantly alter glacier melt, velocity, ice geometry, and, thus, the overall response of glaciers towards climate change. The accumulated supraglacial debris impedes the automated delineation of glacier extent owing to its similar reﬂectance properties with surrounding periglacial debris (debris aside the glaciated area). Here, we propose an automated scheme for supraglacial debris mapping using a synergistic approach of deep learning and multisource remote sensing data. A combination of multisource remote sensing data (visible, near-infrared, shortwave infrared, thermal infrared, microwave, elevation, and surface slope) is used as input to a fully connected feed-forward deep neural network (i.e., deep artiﬁcial neural network). The presented deep neural network is designed by choosing the optimum number and size of hidden layers using the hit and trial method. The deep neural network is trained over eight sites spread across the Himalayas and tested over three sites in the Karakoram region. Our results show 96.3% accuracy of the model over test data. The robustness of the proposed scheme is tested over 900 km 2 and 1710 km 2 of glacierized regions, representing a high degree of landscape heterogeneity. The study provides proof of the concept that deep neural networks can potentially automate the debris-covered glacier mapping using multisource remote sensing data.


Introduction
Glaciers are considered as proxies of climate change, as the Earth's glacierized region has witnessed significant ice mass loss due to atmospheric warming [1,2]. Mountain glaciers are likely to play an imperative role in sea-level rise in the coming decades, as High Mountain Asia is expected to contribute 16 ± 5 mm sea-level equivalent as per RCP 4.5 scenario [3], thus significantly influencing the hydrological regime of glacierized catchments [4,5]. The changes occurring in the glacierized mountain regions have a close link to their hydrology [6], ecology [7,8], hydrochemistry [9,10], climate [11,12], and economy [2]. Moreover, recent studies have shown that the glacierized regions are quite dynamic [13][14][15] and are closely associated with hazards (e.g., avalanches and glacial lake outburst floods) [16,17]. Therefore, timely monitoring of glaciers is of the utmost importance to comprehend climate change at regional and global scales, and so the study of the cryosphere has received global scientific attention in the last few decades [1,[18][19][20].
Glacier extent is the primary parameter to aid hydrological modeling [21], ice thickness estimation [1,22], and volume estimation [23]. Thus, glacier mapping is of utmost importance to provide a complete picture of freshwater availability at a regional and global scale. The spatiotemporal mapping of the glacier extent using traditional survey methods is impracticable at large scales, as it requires significant time as well as manual work and economic resources [24]. However, remotely-sensed data provide a huge potential for glacier mapping [25].
Debris over glaciers (supraglacial debris) significantly alter glacier melt [26,27], velocity [28][29][30], lake geometry [31], and, thus, the overall response of glaciers towards climate change. However, the presence of supraglacial debris (SGD) alongside periglacial debris (PGD; debris outside the glacier) poses a serious challenge to map the extent of glaciers, owing to its similar spectral response in the visible, near-infrared, and shortwave infrared wavelength range [32]. Thereby, various methods have been employed to classify SGD and PGD. Initially, glacier mapping was initiated via manual tracing on false color composites (FCC) of Landsat MSS data [33], combining with topographic parameters extracted from DEMs [34,35]. Semi-automated methods used optical, thermal, and topographical data [24,32,36]. Manual delineation of glacier boundaries using visual interpretation keys relies on the individual skills of the image analyst, which could be highly subjective in the method and results [37]. Moreover, it is labor-intensive and time-consuming, especially for large areas [38,39]. The presence of supraglacial debris also hampers semi-automated delineation methods [24,32,40,41], as these require extensive post-processing to finalize the glacier boundaries and have applicability over small geographic regions. There are very limited studies using fully automated methods for delineating debris cover glacier boundaries over large geographical regions (>1000 km 2 ), which might bring some insightful solutions to the existing methods. Robson et al. [36] used object-based image analysis to automate debris-covered glacier mapping while utilizing a combination of optical, SAR, and topographic data. The method was found 83% accurate and effective when compared with manual delineation methods. Recently, deep learning (DL) based debris cover mapping has been initiated by Xie et al. [42]. This study proposed convolutional neural network (CNN) based architecture (GlacierNet) by choosing the optimum number and size of layers, filters, and encoder depth. This scheme's robustness is demonstrated over two different High Mountain Asia regions (i.e., Nepal Himalaya and Central Karakoram). Similarly, Alifu et al. [38] demonstrated the utility of machine-learning algorithms in combination with remote-sensing data to map the extent of debris cover on glaciers. The highest classification accuracy is obtained using a random forest (RF) classifier on multisource data (optical, thermal, SAR, and topographical). Recently, Lu et al. [43] reported automated debris cover mapping, using a combination of random forest and CNN, in parts of the Eastern Pamir and Nyainqentanglha region. The RF exhibits an overall higher accuracy primarily due to lesser available training data for CNN+RF. The current state of the art explicitly shows the absence of a fully-automated robust scheme to automate the ongoing efforts of glacier mapping, reducing subjectivity and reducing extensive labor. Moreover, DL in glaciology is still in its infancy; however, considering the improved quality and quantity of satellite data, it could be of the utmost importance in the automation of feature extraction in rugged terrains.

Brief Background-Deep Learning in Glaciology
Deep learning (DL) has revolutionized numerous research fields, and remote sensing is not an exception, especially because of its exceptional capabilities to recognize image texture and pattern. DL attempts to replicate human expertise in image interpretation, as it relies on image pattern, texture, and the association between the features. Thereby, DL bears great significance in extracting information from complex states where standard methods of classification are not adequate [44]. DL is generally referred to as deep artificial neural networks (i.e., ANN) inspired by a biological neural network. The deep neural network (DNN) is described by (1) architecture (number of hidden layer and nodes), (2) activation function (nonlinearity), (3) optimizer, which is used to compute the weights of the connection between units, and (4) loss function, which adjusts the weights and reduces the loss of the model by backpropagation. The application of DL in glaciology is not exploited to its full extent; however, previous studies used ANN for debris cover classification [45] and mass balance estimation [46]. More recently, advancements in computational power and optimizations of algorithms made it possible to use DNN for rock glacier identification [47], supraglacial lake detection [48], debris cover extraction [42,43,49], ice cliff mapping [50], marine-terminating outlet glaciers [51], and mass balance reconstruction [52]. A review on glacier mapping by Kaushik et al. [25] explicitly highlights its potential use in the automated mapping of debris-covered glaciers. Specific studies related to debris cover mapping show severe limitations that cause gaps in our understanding. The applicability of the hybrid DL model (CNN+RF) is tested over a very small geographical region [49]. A study reported by [42] exhibits overall high accuracy; however, it is notable that the CNN model is trained and tested over the same site. This study used 10, 15, and 20% of the total study area and predicted terminus boundaries of the glacier in the study site, which raises questions on the spatial transferability of the proposed model. Likewise, a study reported by [38] adopted a similar approach to map debris-covered glaciers. In this study, various machine-learning classifiers (i.e., gradient boosting, random forest, K-nearest neighbors, and support vector machine) are trained with~2000 points and tested over the same scene. To overcome these issues, our DNN is trained over eight sites in Himalaya and tested over three different Himalaya and Karakoram sites.

Objectives
The objective of the present study is to develop a novel method for automated debris cover delineation utilizing the capabilities of a deep neural network (DNN) and a combination of multisource remote-sensing data. We proposed a DNN architecture by choosing an optimum number and size of layers, activation function, optimizer, and loss function. We aimed to facilitate fast and reliable glacier mapping in the high-rugged mountain terrain to improve the ongoing efforts of generating consistent global/regional glacier inventories.

Study Site
The training and testing sites were well distributed across Himalaya and Karakoram ( Figure 1). In total, two training sites were selected over the Eastern Himalayas, four sites covered the region of Central Himalaya, two sites were located in Western Himalaya ( Figure 1). The Western, Central, and Eastern Himalayan glaciers are characterized by distinct latitudinal range, source of moisture, mass loss rate, and lake expansion rate. The glaciers of Central and Eastern Himalaya have a higher rate of terminus retreat, mass loss, and lake expansion [53,54]. Particularly, Eastern Himalaya is characterized as a hotspot of glacial lake outburst floods [55]. Contrarily, Western Himalayan glaciers have shown comparatively lower rates of mass loss and lake expansion. The glaciers of the Karakoram region have shown anomalous behavior, as observations show glacier advancements [56], surge events [57], mass gains [18], and an increasing trend of velocity [58]. Therefore, to test the spatiotemporal transferability of the developed algorithm, three testing sites were selected in the Karakoram region. These testing sites cover all different types of environment and surface features occurring over the mountain glaciers (e.g., wet snow, supraglacial lakes, supraglacial debris of various extent, shape and thickness, slushy snow, dry snow). The proposed scheme was trained over eight different sites in Himalaya and tested over three sites in Karakoram; one site was selected as common for training and testing ( Figure 1).

Data Used
The application of DL for remote-sensing image classification usually requires a large dataset of high resolution (i.e., spatial and temporal) [60]. However, acquisition of the high spatial resolution imagery for such extensive spatial coverage can be expensive and challenging. Thereby, the proliferation of medium-resolution satellite data (10-30 m) provides ideal training and testing data for the DL workflow. Here, we used a combination of multisource remote-sensing data to train DNN (Table S1). Sentinel 2 MSI Level 1C bands (visible, near-infrared, and SWIR) were integrated with Sentinel 1 (single look complex, Interferometric Wide Swath data) downloaded from https://search.asf.alaska.edu/#/ derived coherence images, Landsat 8 Thermal Infrared (L1TP, TIRS1) (https://earthexplorer. usgs.gov/), ALOS DEM (https://search.asf.alaska.edu/#/), and slope data. The details of the data used are provided in Table S1.
The existing literature [25] on glacier mapping provides sufficient information on the selection of input parameters to DNN. Numerous studies [24,32,38] have highlighted the capabilities of integrated multisource remote-sensing data (optical, thermal, morphometric, and SAR) for SGD mapping. Each layer has some unique characteristics that aid the automated delineation of supraglacial debris. Optical satellite imagery is fed to discriminate between supraglacial debris and periglacial debris based on reflectance properties. Image texture (entropy, homogeneity, variance, and dissimilarity) plays a key role in discriminating SGD and PGD [49,61]. TIR data represents the radiometric temperature difference between SGD and PGD. A previous study [62] explicitly highlighted that a longer temporal baseline of InSAR images might result in lower co-registration accuracy, primarily due to wind drift or solid precipitation. In contrast, a shorter temporal baseline may provide significant results. Here, we used SAR data (Table S1) to utilize loss of coherence due to glacier motion, taking advantage of temporal decorrelation on the glacier surface, as it varies from surrounding coherent surface [62,63]. The presence of seasonal snow and cloud cover makes debris-covered glacier mapping difficult; therefore, satellite imagery of minimal seasonal snow (i.e., ablation season) with the least cloud coverage was used [25]. It is noteworthy that ablation seasons can vary significantly within Himalaya, as Eastern Himalaya experiences ablation season during winters, aptly called summer accumulation type glaciers [64][65][66]. This phenomenon is particularly attributed to the dominance of the Indian Summer Monsson in the Eastern Himalayan region. Contrarily, Western Himalaya and Karakoram glaciers receive minimum precipitation during September and October, which makes this season suitable for debris-covered glacier mapping [27].

Pre-Processing and Data Preparation
The presented scheme follows three phases for the automated mapping of SGD, which include (1) pre-processing and data preparation, (2) proposed DNN architecture (Figure 2), and (3) post-processing and accuracy assessment ( Figure 3). The pre-processing in the present study primarily includes generating SAR coherence images and resampling all the layers at the same resolution. The second phase is of the utmost importance in the presented framework, in which we proposed DNN architecture with optimum hyperparameter tuning for classifying SGD. All the background misclassified and isolated pixels were removed in the post-processing step using size thresholding and median filter. A combination of QGIS 3.16, Sentinel Application Platform (SNAP), deep learning libraries (e.g., TensorFlow 2.2 and Keras 2.4), and geospatial data libraries (e.g., Rasterio, GDAL) is used. The following steps were followed to pre-process all the remotely-sensed data and to generate the training data.  (1) Resampling: Sentinel-2 SWIR bands, Landsat 8 thermal bands, InSAR coherence images, and ALOS DEMs were resampled at 10 m resolution using the nearest neighbor method to match the Sentine-2 VNIR resolution.
(2) InSAR coherence processing: InSAR coherence estimated the phase's statistical noise in an interferogram between two synthetic aperture radar (SAR) acquisitions [67]. InSAR coherence indicated constant scattering within one resolution cell, and its numerical value varied between 0 and 1. The coherence value of 1 indicated the signal's stability and high coherence between two SAR images, whereas a coherence value of 0 signifies complete decorrelation between two SAR image acquisitions. Here, we took advantage of the fact that glacier motion and surface melting cause decorrelation in the glacierized region, making them separable from the surrounding surface [62]. VV polarization image pair was used to generate coherence images. The SAR images were processed in an automated fashion using a sentinel toolbox (SNAP) graph builder. This automated workflow included TOPSAR-Split, applying orbit file, Back-Geocoding, Enhanced-Spectral-Diversity, Interferogram generation, TOPSAR-Deburst, Topo Phase Removal, Multilook, Goldstein Phase Filtering, and Range Doppler terrain correction. In order to minimize the motion of the phase present in the coherence window, the topographic phase was removed from the complex interferogram [63].
(3) Slope: The surface slope was computed from ALOS DEM using the raster terrain analysis module in QGIS 3.2.
(4) Extent: To keep all the training data of the same extent, images were subset to 5400 × 5400. Finally, a raster layer consisting of nine bands were stacked to map the debris cover. Similarly, test data were generated to a different extent (i.e., 3000 × 3000 and 4500 × 3800).
(5) Generation of label data: The label data were generated using the scene-adjusted RGI 6 dataset [68]. Since the focus was to map supraglacial debris, all the snow and ice-covered pixels were removed manually from the glacier boundary using the QGIS 3.2 digitizing tool. Vector polygons labeled by class number were rasterized to generate a class raster with the same category as input image using the GDAL library in the python environment. All supraglacial debris-covered pixels were labeled as 1, and 0 represents non-supraglacial debris, including peripheral debris and other land covers. The generation of input data did not require precise delineation of the debris cover boundary. Deep learning models have shown significant results by feeding a huge amount of data with low-quality labels rather than less training data with precise-quality check labels [69]. In our case, 200 images of size 500 × 500 × 9 were provided as input to the network.

Deep Neural Network Architecture (SGDNet)
The present study used a fully connected feed-forward DNN for pixel-based image segmentation of supraglacial debris and non-debris ( Figure 2). In such a DNN architecture, the neurons are assembled into layers, where all the neurons of one layer are fully connected to all neurons of the next layer. In order to find out the optimum combination of hyperparameters (i.e., number and size of hidden layers, nonlinearity, learning rate, loss function, and regularization method), we ran~50 different training sessions with some (re) tuning in each session. Considering the complexity of the supraglacial mapping, we observed the best performance with quite a dense DNN, having a total of eight layers (9, 1024, 512, 256, 128, 64, 32, and 2) (Figure 2). In each hidden layer unit, the weighted values were summed before going through an activation function, responsible for inserting the nonlinearities in the model. We used "tanh" as an activation function and sparse categorical entropy as a loss function to reduce the model loss while comparing the ground truth (y). The range of tanh varied from −1 to 1, and it was successfully used for classification between two classes.
In order to optimize the weights of the gradient descent, we adopted the "Adam" optimizer [70], for which we fine-tuned the learning rate and obtained the best results at 0.0001. To prevent overfitting the model during training, we used the classical early stop method, which stops training when there is no significant improvement (patience 20) in a monitored metric (validation loss). Our model converged after 500 epochs ( Figure 4). All experiments were performed using an NVIDIA Quadro P4000 GPU. The DL approach, including multisource remote-sensing data, is comparable to human expertise of image interpretation using pattern, tone, and texture for feature identification. However, DL reduces human subjectivity as well as inter-operator bias [42]. The DL models often outperform shallow classifiers due to their ability to extract abstract and unchanging data features using complex architecture [71].

Post Processing
The output of the proposed DNN is a probability distribution that has supraglacial debris and background pixels as a classification result. We used post-processing in an automated fashion to finalize the results of the supraglacial debris mapping. The pixels in the output layer were exported in the float format; thereby, the pixel values represent the probability of it being supraglacial debris. As a first post-processing step, we used a logical operator to remove misclassified background noise. We retained all the DNN output pixels with values greater than 0.2 and above 3300 m (minimum elevation of the glaciers may change per scene). In order to smooth out the results, a 5 × 5 median filter was applied to the entire image. We applied size thresholding as the final post-processing step, where all the miss-classified background pixels were removed, which usually represent small regions such as dirty and mixed late-season snow (Figure 3).

Accuracy Assessment
Accuracy assessment of the remotely sensed product is not a straightforward task, owing to the absence of field data and the complexity of the task. In our case, several glacier inventories and supraglacial debris datasets were available, which could be used as a baseline for comparison with the present results. We used the updated Randolph Glacier Inventory (RGI V6) glacier boundary and supraglacial debris boundaries generated by [72] to compare the proposed automated algorithm results. However, we still made minor corrections in the glacier boundary using Sentinel visible and ALOS DEM data. These corrections are particularly attributed to changes in the spatial resolution of data and acquisition date. These data are referred to as reference data throughout Sections 3 and 4. It should be noted that these reference datasets were present only for glacier >1 km 2 , so our analysis and accuracy assessment focused on glaciers with an area greater than 1 km 2 . We computed several deep-learning standard statistical matrices (accuracy, recall, precision, and F-score) to assess DNN model performance [38,42,43]. The precision rate, recall rate, and F-score are commonly-used parameters to evaluate a segmentation model's performance for remotely-sensed data [73,74]. The recall denotes the fraction of glacier pixels retrieved out of actual glacier pixels. A higher recall rate demonstrates improved glacier classification; however, this rate cannot distinguish whether the background pixels are more likely to be erroneously classified as a glacier. The precision rate estimates the fraction of correctly segmented glacier pixels and all pixels that are segmented into glaciers; this matric does not consider the background of remote-sensing images [43]. The F-score is a comprehensive performance metric that considers a balance between the recall and precision rate. These indices are calculated as: where TP = true positive, FN = false negative, FP = false positive, and TN = true negative.

Proposed Deep Artificial Neural Network (SGDNet)
Here, we first present the results of the proposed DNN model (SGDNet) for the classification of SGD. The proposed DNN architecture outperformed the existing automated scheme for supraglacial debris mapping, as the results exhibited an overall classification accuracy of 96.3% over test data (Figure 4). The model performance assessed with other parameters (recall; 0.80, precision; 0.82, and F-Score; 0.81) also exhibited overall good results. The model accuracy and loss over test data are shown in Figure 4. The results obtained via SGDNet have a varying degree of isolated pixels due to similar spectral, thermal, and motion characteristics of the SGD and surrounding (PGD). However, post-processing steps proved to be significant in improving classification results, as background noise, isolated pixels, and misclassified water pixels (near the glacier terminus) were removed correctly ( Figure S1). The SGD maps obtained by the proposed algorithm were assessed based on visual comparison and a quantitative accuracy assessment. After elimination of isolated pixels and misclassified water pixels, the final results demonstrated that the proposed algorithm ( Figure 5) can successfully classify supraglacial debris over large geographical regions (~3000 km 2 ; Figures 5-8). The algorithm robustly distinguished supraglacial debris from surrounding land covers (e.g., periglacial debris, water pixels, ice, and vegetation components, etc.) ( Figure 5). The comparison of automatically delineated supraglacial debris against reference data demonstrated high classification accuracy (accuracy of 0.99) ( Figure 5). Table 1 summarizes the classification results of the algorithm on the chosen test sites. However, with such high accuracy, we still observed some areas classified as false positive (0.2 km 2 ) and false negative (11.2 km 2 ) areas ( Figure 5).

Test Site 1, 2, and 3
The classification results obtained over test site 1 (1710 km 2 ) show an excellent accuracy of 0.99 ( Figure 6; Table 1). The comparison with the reference data revealed slight variations, especially at the terminus where several pixels were identified as false positives and false negatives ( Figure 6). However, the algorithm successfully mapped all 45 debris-covered glaciers presented in the test site. The detailed investigation shows that 3.2 km 2 and 3.3 km 2 area were classified as false positives and false negatives against reference data. We observed a comparatively lower classification accuracy and F-score (0.98 and 0.84) over test site 2 (900 km 2 ). This site had the presence of 28 debris-covered glaciers, and the algorithm was found to be capable of identifying all the debris-covered glaciers. However, the automated scheme had some limitations, especially in mapping glacier terminus precisely; Figure 7 and Table 1 explicitly show false positive and false negative pixels classified by the algorithm. The comparison with the reference data exhibits 8.2 km 2 and 11.1 km 2 as false positive and false negative areas, as predicted by the algorithm. The proposed algorithm proved to be robust over test site 3 (900 km 2 ). The comparison with reference data resulted in only 3.5 km 2 and 1.3 km 2 areas as being false positive and false negative (Figure 8; Table 1). The results of the automated scheme exhibited correct identification of all the debris-covered glaciers (34). The overall accuracy achieved for this test site was 0.99, with an F-Score of 0.94. These results are significant in view of several studies that highlight the algorithm's need to map the debris cover map over large geographical regions [25,36,38]. Moreover, the spatiotemporal transferability of the proposed algorithm makes it novel in the domain. Our results show that the proposed DL model (SGDNet) can successfully learn the intensity of various remote-sensing data and classify SGD in unknown sites. These results also highlight capabilities of combined remote-sensing data in classification of SGD. However, in order to prove the spatiotemporal transferability at a global scale, training data from other mountainous regions (e.g., Inner Tibet Plateau, Tien Shan, Andes, and Alpes) need to be incorporated in the model.

Proposed DNN (SGDNet) for SGD Mapping
Here, we discuss the performance of the proposed SGDNet using different band combinations. As expected, we have observed a significant decrease in accuracy and the model's overall performance by reducing the number of neurons (i.e., layers of remote sensing data) used to feed the SGDNet. In the first experiment, the elevation layer was removed, resulting in a significant decrease in the model's overall performance (F-Score; 0.64) in distinguishing SGD and PGD (Table 2; Figure 9A). In the second experiment, a combination of optical, thermal, elevation, and slope layer were fed to the proposed DNN, and we observed comparatively improved results (F-score; 0.79) (Table 2; Figure 9B). Thus, the elevation layer proved to be more significant than the coherence layer in distinguishing SGD and PGD. To evaluate the significance of topographic parameters in the classification of SGD, slope and elevation layers were removed from the proposed DNN. The exclusion of topographic parameters (slope and elevation) resulted in a drastic decline of the model's overall performance (F-Score 0.47) (Table 2; Figure 9C). These results demonstrate the significance of slope and elevation layer in automated mapping of debris-covered glaciers. These layers are also proven to be significant in semi-automated [24,32,40], manual [34], and automated methods [38] of debris-covered glacier mapping. The visual comparison ( Figure 10) and accuracy scores ( Table 2) exhibited higher accuracy of the proposed algorithm based on the combination of Optical-SAR-TIR-E-S compared to results from another band combination. This is likely because the SGDNet was able to learn useful classification features from all the multisource data (i.e., the optical, SAR, thermal, elevation, and slope). Each layer provides a piece of useful information in distinguishing supraglacial debris from its surroundings. Thus, it resulted in the lowest misclassification rate for both the SGD and PGD near the glacier terminus ( Figure 10). For example, various extents of underestimation and overestimation of SGD could be observed in the results derived from Optical-SAR-TIR-S, Optical-TIR-S-E, and Optical-SAR-TIR ( Figure 10). These observations laid emphasis on the advantage of multisource data fusion for automated SGD mapping.

Classification Results and Comparison with Existing Methods
The presence of debris over glaciers plays an imperative role in determining glacier ice mass and volume by altering albedo and surface motion. Thus, inclusion of the debris cover extent is of the utmost importance for modeling future sea-level rise scenarios at regional or global scales. The current practice of debris cover mapping is still limited to manual delineation, owing to similarity in thermal, spectral, and motion characteristics of SGD and PGD. The classification results obtained via the proposed algorithm show excellent agreement with the reference data ( Figures 5-8). Our results show improvement in the model's performance as dense layers were added to the network (Table S2), which signifies the complexity of the task. The DNN proved to be significant in identifying the supraglacial debris over a large geographical region (Figures 5-8). To prove the proposed algorithm's spatiotemporal transferability, we applied it over three test sites in the Karakoram, considering the entirely unique morphology of the glaciers in the region. However, our trained DNN model and subsequent post-processing steps successfully classified the supraglacial debris at each site, despite the complexity of the highly rugged terrain. These test sites were chosen to represent all the expected land features in the mountain terrain (lakes, shadow, and flowing water from glacier terminus). Figure 11 provides a detailed view of the supraglacial debris mapped via a proposed algorithm, the updated RGI boundary, and the supraglacial debris dataset generated by Herried and Pellicciotti [72]. The algorithm was robustly distinguished between water pixel, surrounding rocks, and ice pixels. Analyzing the complete classification maps of the test sites, we observed that most of the false-positive classified supraglacial debris was particularly attributed to the shadow present in the scene. The false-negative supraglacial debris pixels were found to have a close association with long and elongated glaciers. Overall, these results provide strong proof of the concept that the integration of multisource remote sensing and deep learning has huge potential in automating glacier mapping in highly rugged terrain mountains (e.g., Himalaya and Karakoram). The existing methods of automated SGD mapping using a combination of multisource remote-sensing data and deep learning/machine learning have not demonstrated spatiotemporal transferability of the model [38,42,43], as all of these studies train and test their model over the same study region. Unlike previous studies, we trained our deep artificial neural network over different sites in the Himalayas. Thereby, the proposed automated scheme of debris cover mapping has demonstrated spatiotemporal transferability over three different test sites in the Himalayas and Karakoram. Our proposed scheme could be used to minimize the uncertainties presented in the debris-covered glacier inventories caused by inter-user inconsistencies and subjectivity in the image interpretation. In quest of the fast and reliable method for debris cover mapping, the proposed scheme offers a new avenue to generate the automated debris cover extent at a regional or global scale.   [72]. The RGI updated shows our minor corrections to Herreid and Pellicciotti [72] glacier boundary as per scene characteristics. Supraglacial debris data exhibit the supraglacial debris extent by Herreid and Pellicciotti [72], and DNN-derived shows the results of the proposed algorithm. The contextual information of Figure

Limitations and Future Work
Our DNN method show promising results, yet we have encountered some limitations of the proposed automated scheme. The primary limitation of the presented workflow is associated with the presence of shadows in the scene. At times, shadow is classified as supraglacial debris ( Figure S2), which is added as a false positive in the accuracy assessment. At present, our method is susceptible to misclassifying dirty ice or snow present in the scene as supraglacial debris. Moreover, at times, it becomes notoriously difficult to map heavily debris-covered glaciers, even with the manual digitization method. To overcome this challenge, a pragmatic solution would be to include surface velocity data estimated from feature tracking of optical imagery or SAR interferometry, which may provide valuable input to the DNN model. This could also play an imperative role in classifying rock glaciers based on their different surface velocity. Moreover, the inclusion of spectral indices, which differentiate between SGD and PGD, may improve the results. It is worth mentioning that, in the presented study, the method adopted focused only on glaciers >1 km 2 ; however, mapping of small glaciers (<1 km 2 ) bears great significance in terms of climate change studies. Our algorithm has successfully identified 10 debris covered glaciers <1 km 2 ; however, we have not performed an extensive accuracy assessment of smaller glaciers owing to the absence of reference data. Thereby, we need to train DNN using several small glaciers to generate consistent glacier inventories. As our training data have an underrepresentation of the small debris-covered glaciers. The most important measure to further improve the proposed method would be to increase the training dataset with more training data representing supraglacial debris of different sizes and shapes and surface features that are often misclassified as SGD. Future research has huge potential to apply deep learning for various applications in glaciological research without prior experience in coding and deep learning. Particularly, our primary focus should be centered around the automated temporal inventory of glaciers and glacial lakes. Moreover, we see the huge potential of DNN in glacier facies mapping, snow cover mapping, mass balance estimation, and glacial lakes outburst flood modeling.

Conclusions
The primary focus of this study was to provide a fast and accurate method for glacier mapping to generate temporal glacier inventories at regional levels. For the first time, this study proposed an automated workflow with transferable spatiotemporal capabilities for debris cover mapping over large geographical regions. We proposed a novel DNN architecture, namely SGDNet, to exploit multisource remote-sensing data to automate debris-covered glacier mapping. The aim was to leverage the complementary information provided by multisource remote-sensing data to obtain a set of descriptors that helps to better distinguish SGD and PGD. The mapping results reveal the robustness of the proposed algorithm with reliable supraglacial debris classifications for all test data, where supraglacial debris were found to vary strongly with respect to size, shape, and appearance. The quantitative and qualitative evaluation of three test sites, characterized by different topography and surface feature characteristics, has exhibited the significance of our approach, showing how our proposed scheme outperforms state-of-the-art solutions considering operational settings. The presented study is a step forward towards the automated monitoring of glaciological processes at high temporal and spatial resolutions. Our proposed method is not reliant on commercial remote-sensing data. Instead, we incorporate freely available data, enabling the entire glaciological community to monitor glacier change at an unprecedented scale. The infancy exploitation of DNN in glacial settings using remotelysensed data provides huge potential to reduce labor-intensive manual methods and to offer an automated approach in an era of burgeoning availability of satellite imagery. We strongly recommend the continued use of deep learning to generate consistent temporal global glacier inventory.  Table S1. Data used in the present study, Table S2. Results of the various used Deep Artificial Neural Network models on the test data.
Author Contributions: T.S., P.K.J., A.B., A.J.D. and S.K. conceptualized the work. S.K. wrote the DNN (SGDNet) code in python, performed data analysis, and wrote the original draft of the manuscript. T.S. and P.K.J. supervised the work, including preparation of the manuscript. A.B. and A.J.D. reviewed the draft. All authors have read and agreed to the published version of the manuscript. Data Availability Statement: Data used and analyzed in the present study will be made available upon author request. Acknowledgments: S.K. and T.P.S. wish to acknowledge the Director CSIR-CSIO, Chandigarh, India, to provide the necessary infrastructure and motivation for the work. S.K. and T.P.S. extend their most profound acknowledgment to Head of Division Amitava Das (Senior Principal Scientist, CSIR-CSIO, AcSIR) for providing the required computational facility and workspace. P.K.J. acknowledges DST-PURSE of JNU, New Delhi, for providing research support. S.K. thanks Tarandeep Singh (AcSIR-CSIO), Nitin Mohan Sharma (AcSIR-CSIO), Vikash Shaw (AcSIR-CSIO), and Salay Jain (IIIT-Hyderabad) for useful discussions on deep learning. All authors acknowledge US Geological Survey, Copernicus programme, and JAXA for providing satellite data (Landsat, Sentinel 1, Sentinel 2, and ALOS DEM) free of cost.

Conflicts of Interest:
The authors declare no conflict of interest.