Multi-Temporal Land Cover Change Mapping Using Google Earth Engine and Ensemble Learning Methods

: The study deals with the application of Google Earth Engine (GEE), Landsat data and ensemble-learning methods (ELMs) to map land cover (LC) change over a decade in the Kaski district of Nepal. As Nepal has experienced extensive changes due to natural and anthropogenic activities, monitoring such changes are crucial for understanding relationships and interactions between social and natural phenomena and to promote better decision-making. The main novelty lies in applying the XGBoost classiﬁer for LC mapping over Nepal and monitoring the decadal changes of LC using ELMs. To map the LC change, a yearly cloud-free composite Landsat image was selected for the year 2010 and 2020. Combining the annual normalized di ﬀ erence vegetation index, normalized di ﬀ erence built-up index and modiﬁed normalized di ﬀ erence water index, with elevation and slope data from shuttle radar topography mission, supervised classiﬁcation was performed using a random forest and extreme gradient boosting ELMs. Post classiﬁcation change detection, validation and accuracy assessment were executed after the preparation of the LC maps. Three evaluation indices, namely overall accuracy (OA), Kappa coe ﬃ cient, and F1 score from confusion matrix reports, were calculated for all the points used for validation purposes. We have obtained an OA of 0.8792 and 0.875 for RF and 0.8926 and 0.8603 for XGBoost at the 95% conﬁdence level for 2010 and 2020 LC maps, which are better for mountainous terrain. The applied methodology could be signiﬁcant in utilizing the big earth observation data and overcoming the traditional computational challenges using GEE. In addition, the quantiﬁcation of changes over time would be helpful for decision-makers to understand current environmental dynamics in the study area. methods. This study presents an e ﬀ ective workﬂow of LC mapping and change detection using openly available tools. In this case study, we obtained good accuracy for all LC maps, resulting in a satisfactory outcome. For the RF classiﬁer, the OA of 0.8792 +/ − 0.0472 at the 95% conﬁdence level was obtained for the 2010 LC map, whereas for 2020, the accuracy was 0.875 +/ − 0.0575 at the 95% conﬁdence level. For XGBoost, we obtained an accuracy of 0.8926 +/ − 0.0448 at the 95% conﬁdence level for the 2010 LC map, whereas for 2020, the accuracy was 0.8603 +/ − 0.053 at the 95% conﬁdence level. From this study, we can conclude that the Landsat imagery from EO data and GEE with ELMs are best for mapping the LC for a larger area. As RF performed better during the 2020 mapping and XGBoost model during the 2010 LC mapping, a clear distinction was not made on ELM’s performance.


Introduction
Nepal, a seismically active [1] and economically developing country [2], lies between India and China. In 2015, Nepal experienced a massive earthquake and many aftershocks destroying many infrastructures and lives [3][4][5]. In the past decade, Nepal has seen rapid population growth and unplanned urbanization leading to many infrastructure developments from rural excavated roads to national highway projects [4,6]. Many forest areas were converted to cultivation lands, and existing cultivation lands were converted to built-up areas. These alterations are causing undulations in river courses and patterns, thus further affecting the water quantity and quality of rivers [7][8][9][10][11]. All these natural and anthropogenic activities have altered the land cover (LC) rapidly in Nepal. Regular LC mapping and change detection is an utmost important task to maintain pace with these changes. Regular monitoring of LC change through an updated database helps us access global climatic pattern change and mitigate its issues [12]. LC changes are crucial for understanding relationships and interactions between human and natural phenomena and to promote better decision-making for sustainable development [13].
Earth observation (EO) data are one of the best resources for measuring LC dynamics [14][15][16]. It started with the Landsat satellite mission launch and has been continuously monitoring the earth's surface for more than four decades [17][18][19]. While many satellites with advanced capabilities have been launched for the same purpose, Landsat has continuously evolved, with each mission maintaining its legacy. This vast collection of data is also freely available for researchers, which is a great asset to the scientific community. Many case studies have been conducted for LC mapping and change detection [20][21][22][23][24][25][26][27]. Many indices have been developed to extract a single feature, whereas recent advancements in machine learning (ML) algorithms have significantly improved classification accuracy [28][29][30][31][32]. However, working on multiple satellite imageries/scenes in edge devices is tedious, time-consuming and often requires enormous storage space. One possible approach to handle such massive EO data for large spatial and temporal extents is to utilize multiple computers, which is also an expensive investment for a few times of use. An optimum solution that could solve all the above problems would be a cloud-based public computing platform. This conserves the infrastructure, cost, computing time and maximum utilization issues at one step. Google Earth Engine (GEE) is such a cloud-based computing platform that has a vast amount of satellite data. GEE makes satellite imagery processing very convenient [33][34][35]. This method provides free and easy data access and analyzes the massive amount of data in a matter of seconds. Some previous works in Asia, China, Africa, and the USA have shown a promising use of GEE [34][35][36][37][38][39][40][41].
Though there are many conventional methods for mapping LC and its dynamics, these processes are time-consuming and require much computational power. To overcome these limitations, one possible solution is the implementation of ML algorithms that not only provide efficient computing performance in terms of power and time but also furnish more accurate results than conventional methods. With the advancement in computing technology, many studies have been conducted by combining statistical methods with machine learning models such as artificial neural networks (ANN), logistic regression (LR), decision trees (DT) and support vector machines (SVM) and has been reported to perform better than conventional methods [42][43][44]. Accuracy can be furthermore enhanced using the ensemble-learning methods (ELMs) that combine several models to one predictive model that decreases variance (bagging), bias (boosting) and improves predictions (stacking) that produces better accuracy than a single ML model [45,46]. Many studies have proved the significance of the ensemble model over a single and conventional model [42,[47][48][49]. Random forest (RF) and extreme gradient boosting (XGBoost) are some of the models based on an algorithm of ensemble methods.
From the aforementioned literature, it can be observed that EO data can act as an alternative to traditional data acquisition and machine learning algorithms are more advanced than traditional supervised and unsupervised classifiers. To test their applicability for mountainous terrain, we have implemented GEE and ELMs for LC mapping of mountainous terrain in Nepal, i.e., the Kaski district. The main novelty lies in applying the XGBoost classifier for LC mapping over Nepal and monitoring

Study Area
The Kaski district lies between 28°4′35″-28°36′44″ N latitude and 83°42′10.53″-84°16′49″ E longitude in Province 5 of Nepal. It has the second largest metropolitan city of Nepal, Pokhara, which experienced tremendous urbanization in the past decade [50]. Pokhara is a primary tourist attraction site and one of the significant economic hubs of Nepal. This district represents Nepal geographically as it has the snow-covered Annapurna mountain range, three major lakes, one of the biggest valleys and many rivers. Most of the area of the LC of the Kaski district is occupied by forest and cultivation areas. The snow-covered area is also prominent in the Kaski district. The study area is shown in Figure 2.

Study Area
The Kaski district lies between 28 • 4 35"-28 • 36 44" N latitude and 83 • 42 10.53"-84 • 16 49" E longitude in Province 5 of Nepal. It has the second largest metropolitan city of Nepal, Pokhara, which experienced tremendous urbanization in the past decade [50]. Pokhara is a primary tourist attraction site and one of the significant economic hubs of Nepal. This district represents Nepal geographically as it has the snow-covered Annapurna mountain range, three major lakes, one of the biggest valleys and many rivers. Most of the area of the LC of the Kaski district is occupied by forest and cultivation areas. The snow-covered area is also prominent in the Kaski district. The study area is shown in Figure 2.

Satellite Data
In this study, all the datasets used are available from the GEE repository. GEE provides atmospherically corrected, tier 1 surface reflectance imagery along with many other datasets. Landsat 5 and 8 were used for the years 2010 and 2020 for LC change mapping. Table 1 shows the band combination of Landsat 5 and 8 data. Both satellites have sun-synchronous orbit and have a temporal resolution of 16 days. The data are available as approximately 185 km × 180 km scenes defined in a worldwide reference system (WRS) of the path and row coordinates [51,52]. Landsat 5 data were atmospherically corrected using Landsat ecosystem disturbance adaptive processing system (LEDAPS) [53]. Landsat 8 data were atmospherically corrected using land surface reflectance code (LaSRC) [54] that includes a cloud, shadow, water and snow mask produced using the C function of Mask (CFMASK) [55], as well as a per-pixel saturation mask. Since Nepal's Kaski district is a cloudy area, most of the Kaski district areas lie in the cloud covered portion of Landsat images. Hence, obtaining a full scene without a cloud of the same day or month is challenging. Hence, we performed cloud masking in the scene and applied median filtering to the scene to get smooth and continuous coverage of the scene without clouds resulting in the annual median composite.
In addition to surface reflectance, the slope was also used to aid the classification. The slope was derived from the shuttle radar topography mission (SRTM), an international research effort that obtained a digital elevation model (DEM) on a near-global scale [56]. This SRTM V3 product (SRTM Plus) is provided by National Aeronautics Space Administration Jet Propulsion Laboratory at a resolution of 1 arc-second (approximately 30 m) [57]. This dataset has undergone a gap-filling process using open-source data (ASTER GDEM2, GMTED2010 and NED) instead of other versions that contain voids or were gap-filled with commercial sources. Dem helps to eliminate the confusion created by the shadow of terrain as it has similar spectral characteristics to water in the optical image. Figure 3 shows the map representing elevation and slope from SRTM dem. The elevation of the Kaski district ranges from 450 m to 8091 m.
Appl. Sci. 2020, 10, x FOR PEER REVIEW 5 of 21 resolution of 1 arc-second (approximately 30 m) [57]. This dataset has undergone a gap-filling process using open-source data (ASTER GDEM2, GMTED2010 and NED) instead of other versions that contain voids or were gap-filled with commercial sources. Dem helps to eliminate the confusion created by the shadow of terrain as it has similar spectral characteristics to water in the optical image. Figure 3 shows the map representing elevation and slope from SRTM dem. The elevation of the Kaski district ranges from 450 m to 8091 m.

Spectral Indices
Various spectral indices were derived from the primary bands of Landsat imagery. These bands in amalgamation with other primary bands were used as training data to produce the LC maps. These multiple indices help us classify images accurately, thereby reducing hillshade and building shadows. The adopted indices are:

Spectral Indices
Various spectral indices were derived from the primary bands of Landsat imagery. These bands in amalgamation with other primary bands were used as training data to produce the LC maps. These multiple indices help us classify images accurately, thereby reducing hillshade and building shadows. The adopted indices are: 1.
NDVI is the normalized difference between the NIR band and the red band. NIR bands are more sensitive to vegetation [58] and are used to perceive the vegetation's richness. It is calculated as shown in Equation (1).
NDBI is the normalized difference between SWIR1 and NIR bands [59,60]. It separates the built-up area from the background. It is calculated as shown in Equation (2). 2.
MNDWI is the normalized difference between SWIR1 and green bands. It accurately segregates the water body mitigating the built-up noise [61]. It is calculated as shown in Equation (3).

LC Label Scheme
The cloud masking scheme in the preprocessing of Landsat data masked some of the permanent ice covering areas in the study area. This created voids and difficulty in the classification of the snow LC. To mitigate this problem, we masked out the snow area from the image and only the remaining portion of the image was used for the classification, thus resulting in only five labels out of six LC classes. The classes were determined according to the major classes from the Nepal government's LC map. These classes overall represent the LC of the Kaski district. Table 2 shows the LC types classified in this study.

Reference Data
For this study, reference data were prepared in the area, leaving snow cover. First, the unsupervised classification was performed to understand the clustering of similar features. Based on the area coverage of unsupervised pixels, reference points were selected using stratified sampling. For each point, labels from Table 2 were added according to the same year high-resolution imagery available in Google Earth Pro. For a better representation of the area, the expert's judgment was used to add and remove points. A total of 593 and 542 reference points for all LC classes (excluding snow) were collected for the images of 2010 and 2020, respectively, omitting the NA values. The reference data for the Landsat image from 2010 and 2020 are shown in Figure 4. Training and validation points will be prepared by randomly splitting the data at 75% for 2020 and 80% for 2010.

Reference Data
For this study, reference data were prepared in the area, leaving snow cover. First, the unsupervised classification was performed to understand the clustering of similar features. Based on the area coverage of unsupervised pixels, reference points were selected using stratified sampling. For each point, labels from Table 2 were added according to the same year high-resolution imagery available in Google Earth Pro. For a better representation of the area, the expert's judgment was used to add and remove points. A total of 593 and 542 reference points for all LC classes (excluding snow) were collected for the images of 2010 and 2020, respectively, omitting the NA values. The reference data for the Landsat image from 2010 and 2020 are shown in Figure 4. Training and validation points will be prepared by randomly splitting the data at 75% for 2020 and 80% for 2010.

Methodology
First, reference data were prepared for the LC classification of the Kaski district during the years 2020 and 2010. For this purpose, unsupervised K-means classification was performed on the Landsat 8 and Landsat 5 imagery. A total of 50 clusters was made for the classification. The spectral similarity between the classes was observed. With the help of unsupervised clusters and the respective year's google imagery, the reference data were prepared. Later the number of reference data was adjusted

Methodology
First, reference data were prepared for the LC classification of the Kaski district during the years 2020 and 2010. For this purpose, unsupervised K-means classification was performed on the Landsat 8 and Landsat 5 imagery. A total of 50 clusters was made for the classification. The spectral similarity between the classes was observed. With the help of unsupervised clusters and the respective year's google imagery, the reference data were prepared. Later the number of reference data was adjusted in proportional to the area of the ten unsupervised training classes. In GEE, a cloud-free composite of the Landsat 5 and Landsat 8 surface reflectance (SR) of the whole year was taken using the median filter for the composite. Cloud-free composites were made by masking out the clouds from the image. Snow masking was also performed to remove the void pixels from the image. The true-and false-color composites of the median Landsat SR are shown in Figure 5.
Appl. Sci. 2020, 10, x FOR PEER REVIEW 7 of 21 in proportional to the area of the ten unsupervised training classes. In GEE, a cloud-free composite of the Landsat 5 and Landsat 8 surface reflectance (SR) of the whole year was taken using the median filter for the composite. Cloud-free composites were made by masking out the clouds from the image. Snow masking was also performed to remove the void pixels from the image. The true-and falsecolor composites of the median Landsat SR are shown in Figure 5. Further, several indices like NDVI, NDBI and MNDWI were derived from the primary bands of satellite imagery. These bands were used to accurately classify the Landsat imagery eliminating shadows of hills and buildings. Since Kaski has a snow-covered Himalayan range and hills, topographical factors like slope and elevation obtained from SRTM DEM were also used for classification. The total reference data were randomly divided into a test set and a validation set. The yearly composite data were downloaded and further processed using the R programming language. Further, several indices like NDVI, NDBI and MNDWI were derived from the primary bands of satellite imagery. These bands were used to accurately classify the Landsat imagery eliminating shadows of hills and buildings. Since Kaski has a snow-covered Himalayan range and hills, topographical factors like slope and elevation obtained from SRTM DEM were also used for classification. The total reference data were randomly divided into a test set and a validation set. The yearly composite data were downloaded and further processed using the R programming language. The hyperparameter tuning was performed for both algorithms for both years, and the best hyperparameter was used for the LC mapping. Later, RF and XGBoost based classification was executed for the supervised classification. RF uses various carts as a single unit to give the best result. LC maps were generated for each test area using both compositing methods, and the classification accuracy of each map was computed using the corresponding validation data sets.
After preparing the LC classification in the R, postprocessing was done in ArcGIS Pro. The majority filter of eight neighborhood pixels was applied to remove the salt-and-pepper effect on the obtained maps. The changes were computed between RF-derived categorical classes for two years. Finally, post-classification change was detected and mapped. The change map was analyzed using Sankey's diagram.

Random Forest Algorithm
Random forest (RF) is the ensemble of many relatively uncorrelated decision trees that operate as a single unit [62]. A decision tree is a simple (Figure 6), deterministic data structure for modeling decision rules for a specific classification problem. At each node, one feature is selected to make class determining the decision. Each tree in the ensemble is built from a sample drawn with replacement (i.e., a bootstrap sample) from the training set. It produces stable results without even the requirement of hyperparameter tuning. The RF algorithm was selected because of its robustness and accuracy. RF can process features with many dimensions, and it can use continuous and categorical data sets. RF is easy to parametrize, is not sensitive to over-fitting, and is even good at dealing with outliers in training data. RF even calculates ancillary information such as classification error and variable importance [62].
Appl. Sci. 2020, 10, x FOR PEER REVIEW 8 of 21 is easy to parametrize, is not sensitive to over-fitting, and is even good at dealing with outliers in training data. RF even calculates ancillary information such as classification error and variable importance [62].

Extreme Gradient Boosting Algorithm
Gradient Boosting Algorithm is the ELM, which builds the model in a stage-wise fashion as other boosting methods do (Figure 6), and it generalizes them by allowing optimization of an arbitrary differentiable loss function [63]. XGBoost is one of the implementations of the Gradient Boosting concept, but what makes XGBoost unique is that it uses more regularized model formalization for controlling overfitting and better model performance. Its execution speed is fast than other gradient boosting algorithms. The goal of this algorithm is to push the extreme of the computation limits of

Extreme Gradient Boosting Algorithm
Gradient Boosting Algorithm is the ELM, which builds the model in a stage-wise fashion as other boosting methods do (Figure 6), and it generalizes them by allowing optimization of an arbitrary differentiable loss function [63]. XGBoost is one of the implementations of the Gradient Boosting concept, but what makes XGBoost unique is that it uses more regularized model formalization for controlling overfitting and better model performance. Its execution speed is fast than other gradient boosting algorithms. The goal of this algorithm is to push the extreme of the computation limits of machines to provide a scalable, portable and accurate library. XGBoost not only eliminates biases but also removes variance in the result. It provides efficient handling of missing data and promotes tree pruning using a depth-free approach.

Accuracy Assessment
Training of our model was controlled using a 10-fold cross-validation (CV) method. An accuracy assessment was done using the validation data set. Three evaluation indices, namely the overall accuracy (OA), Kappa coefficient (Kappa) and F1 score from confusion matrix reports, were calculated for all the points used for validation purposes. OA determines the algorithm's overall efficiency and can be calculated by dividing the total number of correctly labeled samples by the total number of the testing samples [64]. The Kappa indicates the degree of agreement between the ground truth data and the predicted values [65]. The F1 score was calculated from the precision and recall of the test, where the precision is the number of correctly identified positive results divided by the number of all positive results, including those not identified correctly. The recall is the number of correctly identified positive results divided by the number of all samples that should have been identified as positive [66,67]. The no-information rate (NIR) is the largest proportion of the observed classes. We also calculated producer accuracy for all the classes. The producer's accuracy represents the probability that a reference sample is correctly identified in the classification map. Table 3 shows a representation of a confusion matrix for two classes: F1 score = (1 + β 2 )(precision * recall)/(β2 * precision + recall) Suppose that there is response y i and covariates x i for i = 1 . . . . n, and some loss function L. The no-information error rate of a model f is the average loss of f over all combinations of y i and x i :

Results
In Kaski district, Pokhara is open and has mountains in the southern part, making the Landsat scene mostly covered with clouds. Hence, getting a cloud-free scene from a single day is challenging. Hence, we masked out the cloud and its shadow from images using pixel_qa band and made the median composite of the images, as shown in Figure 5. In the figure, the false-color composite image, we can observe that most of the Kaski district is covered by forest.
The reference data were created by dividing the classes based on the area of the unsupervised classification classes. The number of reference data on each class was proportional to the area of the unsupervised classes. The OA and Kappa, which was derived based on validation samples, was used to evaluate the performance of RF and XGBoost classifier in Landsat 8 and Landsat 5 images for the year 2020 and 2010, respectively. Before performing the LC classification, hyperparameter tuning was executed. For both the algorithms, the classifier manual grid search hyperparameter tuning was performed. In the RF function of the caret package, the accuracy mainly depends on mtry and ntree, where mtry is the number of variables to randomly sample as candidates at each split, and ntree is the number of decision tree ensembled. In our study, we kept the mtry as default for classification, i.e., square root of the number of predictors. In our case, there was a total of 11 predictors in the data set, so we used 3 as mtry approximately. In contrast, the manual hyperparameter tuning was conducted for the ntree. We tested the ntree from 500 to 2500, increasing 500 at a time. The repeated CV was used as a train control method. The hyperparameter tuning was done by considering a confidence interval of 95%. The dot plot of accuracy obtained by using different ntree to model is shown in Figure A1 from Appendix A. For XGBoost parameter tuning, we did manual grid tuning for the max tree depth, nrounds and eta, where max tree depth is the maximum depth of a binary tree to the number of nodes from the root down to the furthest leaf node, nround is boosting iteration that specifies the number of times the tree-making process is continued, and eta is the learning rate, as shown in Figure A2. We kept the other parameters like gamma, colsample_byte tree and subsample as constant. The best hyperparameters obtained for XGBoost are provided in Table A1 from Appendix A.
After the data training procedure, the RF model and XGBoost model produced classified images with five classes. The snow area that was masked earlier was joined after training to the classified map. The map obtained is shown in Figure 7. The accuracy of the obtained map was measured by using different statistical accuracy measurement units. The LC map was then assessed for classification accuracy as the map validation goal was to determine the OA of the LC map compared to available reference data. The validation was performed from the randomly divided data from the reference data. All the maps from 2010 and 2020 were validated separately as both have different reference data points. The OA and Kappa, along with no-information data and p-value for the hypothesis test, were measured for the accuracy assessment. For the LC map of the year 2010, XGBoost performed better than the RF classifier, whereas, for the year 2020, the RF classifier performed better. The accuracy of the RF model was 0.8792 +/− 0.0472 at the 95% confidence level for the 2010 LC map, whereas for 2020, the accuracy was 0.875 +/− 0.0575 at the 95% confidence level. Similarly, the accuracy of the XGBoost model was 0.8926 +/− 0.0448 at the 95% confidence level for the 2010 LC map, whereas for 2020, the accuracy was 0.8603 +/− 0.053 at the 95% confidence level. Since the Kappa of the classifier performance for both the years is greater than 0.80, we can conclude that there is a high agreement between the ground truth and agreement value. The overall accuracy and Kappa obtained from all methods for both years are shown in Table 4.
Similarly, the accuracy of the XGBoost model was 0.8926 +/− 0.0448 at the 95% confidence level for the 2010 LC map, whereas for 2020, the accuracy was 0.8603 +/− 0.053 at the 95% confidence level. Since the Kappa of the classifier performance for both the years is greater than 0.80, we can conclude that there is a high agreement between the ground truth and agreement value. The overall accuracy and Kappa obtained from all methods for both years are shown in Table 4.    The validation data were divided proportionally according to the unsupervised classes, and hence the number of validation points was not the same for all the classes. Looking at the confusion matrix from Table A2 in Appendix A and producer accuracy in Table 5, we can understand that the PA of the built-up area in 2010 LC maps was 0. This is mainly because of having very few numbers of validation points for the built-up class, out of which one point of the validation set of the built-up class was misclassified as rocky/barren, and two points of the built-up class were misclassified as cultivation. The forest class was the most accurately classified class in both the years. The producer's accuracy of forest in 2010 and 2020 were 0.93 and 0.98 for the RF classifier and 0.97 and 0.96 for the XGBoost classifier. We can see a significant difference in the performance of the classifier in water class. In 2010, the F1 score of water for the RF classifier was 0.67, whereas for XGBoost was 0.84. A similar trend can also be viewed for PA for 2010 and the F1 score and PA for 2020 for both classifiers. The main reason for the misclassification can be attributed to the same spectral signatures for two different classes in the Landsat image that led to the mismatch of pixels. Another constraint we experienced was with the shadow of hills and buildings that were misclassified as water. To mitigate this error, we took more distributed training data in those shadows and used multiple indices for classification. Other misclassification problems have occurred in the river's sandy area as they were misclassified as the built-up area. These errors were mitigated by adopting a similar procedure, as discussed above. The error of misclassified pixels can be reduced by increasing the number of training dataset in that area. Though the increased number of training points minimizes the effect of the misclassified pixels, it may cause overfitting or high variance, i.e., high training accuracy but low testing accuracy during classification. After the implementation of accuracy assessment, the percentage change was measured, and the categorical compute change function was used from ArcGIS Pro to map the changes according to the category represented in Figure 8. From the LC change map analysis, we can observe a major change in the main commercial area of the Kaski district, i.e., Pokhara. Pokhara's LC has had tremendous change due to the development of recreational activities and improvements in the sightseeing area. The new international airport construction has converted about 1% of the total agricultural area to barren land. The built-up area has even shown an increasing trend. From the change map, we can also notice that the forest area has From the LC change map analysis, we can observe a major change in the main commercial area of the Kaski district, i.e., Pokhara. Pokhara's LC has had tremendous change due to the development of recreational activities and improvements in the sightseeing area. The new international airport construction has converted about 1% of the total agricultural area to barren land. The built-up area has even shown an increasing trend. From the change map, we can also notice that the forest area has increased in the past decade. This may be due to the result of the increasing community forest plantation program. One other major change is from rocky to snow and vice versa. A major cause for this transition is the weather and climatic factors in this area. Rocky, barren land and cultivation areas have decreased from the previous year. The water area has not shown much change in terms of area, but the course of the river has changed in the past decade. The changes in area are shown in Figure 9. The Sankey diagram from Figure 9 is the best graphical tool to represent the LC class change [68]. We can visualize how much area has changed from one category to every other category. We can see most of the land classes have not changed from 2010 to 2020. There was a huge change in rocky/ barren to snow in interclass change, i.e., approximately 104 sq km area was converted to snow. Another considerable transformation was the cultivation area. About 67.13 sq km of cultivation land was converted to forest in 2020. This implies a positive sign due to the increment of the community forest plantation program in the district in the last decade.

Discussion
We used a cloud-based computing platform (GEE) to acquire data (median composite, derivative indices and terrain) for Kaski District, Nepal and performed LC mapping using the ELMs in the CARET package of R. This has replaced the tedious and expensive process of map-making using traditional methods. This study presents an effective workflow of LC mapping and change detection using openly available tools. In this case study, we obtained good accuracy for all LC maps, resulting in a satisfactory outcome. For the RF classifier, the OA of 0.8792 +/− 0.0472 at the 95% confidence level was obtained for the 2010 LC map, whereas for 2020, the accuracy was 0.875 +/− 0.0575 at the 95% confidence level. For XGBoost, we obtained an accuracy of 0.8926 +/− 0.0448 at the The Sankey diagram from Figure 9 is the best graphical tool to represent the LC class change [68]. We can visualize how much area has changed from one category to every other category. We can see most of the land classes have not changed from 2010 to 2020. There was a huge change in rocky/ barren to snow in interclass change, i.e., approximately 104 sq km area was converted to snow. Another considerable transformation was the cultivation area. About 67.13 sq km of cultivation land was converted to forest in 2020. This implies a positive sign due to the increment of the community forest plantation program in the district in the last decade.

Discussion
We used a cloud-based computing platform (GEE) to acquire data (median composite, derivative indices and terrain) for Kaski District, Nepal and performed LC mapping using the ELMs in the CARET package of R. This has replaced the tedious and expensive process of map-making using traditional methods. This study presents an effective workflow of LC mapping and change detection using openly available tools. In this case study, we obtained good accuracy for all LC maps, resulting in a satisfactory outcome. For the RF classifier, the OA of 0.8792 +/− 0.0472 at the 95% confidence level was obtained for the 2010 LC map, whereas for 2020, the accuracy was 0.875 +/− 0.0575 at the 95% confidence level. For XGBoost, we obtained an accuracy of 0.8926 +/− 0.0448 at the 95% confidence level for the 2010 LC map, whereas for 2020, the accuracy was 0.8603 +/− 0.053 at the 95% confidence level. From this study, we can conclude that the Landsat imagery from EO data and GEE with ELMs are best for mapping the LC for a larger area. As RF performed better during the 2020 mapping and XGBoost model during the 2010 LC mapping, a clear distinction was not made on ELM's performance.
LC mapping has been very limited in Nepal. The only national mapping organization of Nepal, the survey department (SD), still uses the traditional digitization technique for LC mapping for creating the national topographic database [10]. Although it is spatially more accurate, this process is labor-intensive and time-consuming. The process presented in this paper can be used for LC mapping with higher resolution satellite imagery to obtain a better result. The results of the LC mapping procedure were visually like that of the labor-intensive digitization mapping procedure of the SD. Both the model-simulated and observed LC map from SD are represented in Figure 10. creating the national topographic database [10]. Although it is spatially more accurate, this process is labor-intensive and time-consuming. The process presented in this paper can be used for LC mapping with higher resolution satellite imagery to obtain a better result. The results of the LC mapping procedure were visually like that of the labor-intensive digitization mapping procedure of the SD. Both the model-simulated and observed LC map from SD are represented in Figure 10. We also compared our LC map of the year 2010 with the LC map prepared by the International Centre for Integrated Mountain Development (ICIMOD). ICIMOD has produced a national LC map database of 2010 [69]. Although our map is not divided into broad categories as ICIMOD's LC map, we can get a relatively similar result as ICIMOD's LC map. Figure 11 shows the comparison of our LC map with ICIMOD's LC map. By comparing both the maps, we can see the difference in the builtup area. This is because ICIMOD's map is generalized, as it is performed for a larger area. With the exception of built-up, all classes showed similar characteristics when compared. Few works have been performed utilizing the ELMs. Abdi [42], in his study, used the RF and XGBoost classifiers to perform LC mapping using Sentinel 2 data. The author achieved an overall accuracy of 0.739 ± 0.018 for RF classifier and accuracy of 0.751 ± 0.017 for XGBoost. On the contrary, we obtained better accuracy by utilizing DEM data from SRTM using the GEE repository for mitigating the effects of hills and shadows and its misclassification. Another study by Talukdar et al. [40] implemented the We also compared our LC map of the year 2010 with the LC map prepared by the International Centre for Integrated Mountain Development (ICIMOD). ICIMOD has produced a national LC map database of 2010 [69]. Although our map is not divided into broad categories as ICIMOD's LC map, we can get a relatively similar result as ICIMOD's LC map. Figure 11 shows the comparison of our LC map with ICIMOD's LC map. By comparing both the maps, we can see the difference in the built-up area. This is because ICIMOD's map is generalized, as it is performed for a larger area. With the exception of built-up, all classes showed similar characteristics when compared. Few works have been performed utilizing the ELMs. Abdi [42], in his study, used the RF and XGBoost classifiers to perform LC mapping using Sentinel 2 data. The author achieved an overall accuracy of 0.739 ± 0.018 for RF classifier and accuracy of 0.751 ± 0.017 for XGBoost. On the contrary, we obtained better accuracy by utilizing DEM data from SRTM using the GEE repository for mitigating the effects of hills and shadows and its misclassification. Another study by Talukdar et al. [40] implemented the RF classifier in the riparian landscape of the Ganges river and obtained an accuracy of 0.91. However, this study was conducted on a flat surface of India that has an elevation ranging from 15 m to 90 m, whereas our study was conducted in an area having elevation ranging from 300 m to 8091 m. From the beginning, there were various limitations and challenges to this work. The first challenge was related to the study area. As the Kaski district is heavily covered with clouds in the Landsat scene, it was challenging for us to get the perfect scene for mapping. The only possible solution was to consider the median composite of the annual Landsat imagery after cloud and snow masking. As we considered the median of the summer snow-covered and winter snow-covered images, the snow and rocky/barren areas were highly misclassified. Another limitation can be attributed to the presence of hills in the Kaski district. The shadows due to these hills caused the misclassification of forest and rocky areas as water bodies. Taking the reference data from those shadow areas and using the multiple indices helped us overcome this problem. The approaches presented in this study may be extended to generate a reliable, hierarchical, national-scale Nepal's LC mapping. However, more challenges are expected when the study area is extended to the national scale (i.e., Nepal) with more cloud cover, fragmented landscapes and heterogeneous geography. The advances in technologies like cloud computing and AI has made it easier for automatic LC mapping.
Although sentinel imagery is available in the repository of the GEE, we have used Landsat imagery in our study because of the greater number of voids in the median composite of sentinel imagery even after cloud masking. Hence, we could not utilize and perform a proper study using sentinel imagery. While determining classes, we tried to incorporate all the LC classes as in Nepal Government's topographic LC map, but due to higher mismatches in surface reflectivity, we were limited to only six classes. We tried to include grassland class on our map but failed due to a higher mismatch of pixels as the classifier trained in the Landsat imagery could not distinguish forest and From the beginning, there were various limitations and challenges to this work. The first challenge was related to the study area. As the Kaski district is heavily covered with clouds in the Landsat scene, it was challenging for us to get the perfect scene for mapping. The only possible solution was to consider the median composite of the annual Landsat imagery after cloud and snow masking. As we considered the median of the summer snow-covered and winter snow-covered images, the snow and rocky/barren areas were highly misclassified. Another limitation can be attributed to the presence of hills in the Kaski district. The shadows due to these hills caused the misclassification of forest and rocky areas as water bodies. Taking the reference data from those shadow areas and using the multiple indices helped us overcome this problem. The approaches presented in this study may be extended to generate a reliable, hierarchical, national-scale Nepal's LC mapping. However, more challenges are expected when the study area is extended to the national scale (i.e., Nepal) with more cloud cover, fragmented landscapes and heterogeneous geography. The advances in technologies like cloud computing and AI has made it easier for automatic LC mapping.
Although sentinel imagery is available in the repository of the GEE, we have used Landsat imagery in our study because of the greater number of voids in the median composite of sentinel imagery even after cloud masking. Hence, we could not utilize and perform a proper study using sentinel imagery. While determining classes, we tried to incorporate all the LC classes as in Nepal Government's topographic LC map, but due to higher mismatches in surface reflectivity, we were limited to only six classes. We tried to include grassland class on our map but failed due to a higher mismatch of pixels as the classifier trained in the Landsat imagery could not distinguish forest and grassland pixels.
In Nepal, huge snow and cloud covers make it impossible to map a large extent with the single-day image or even with the monthly composite image. Hence, the annual median composite adopted in this study can be adopted to make the LC map of entire Nepal which is openly available in the GEE platform. Such availability of data without extensive preprocessing and allocating substantial storage makes the process very fast and efficient. Future work can focus on implementing data fusion techniques for integrating Landsat and Sentinel imagery that may furnish greater accuracies with finer resolutions. Various other machine learning models with multitemporal resolutions (summer and winter composites) can be used for better understanding the models' performance in LC mapping.      Table A2. Confusion matrix generated in Google Earth Engine (GEE) for the validation dataset. Here 1 is built-up class, 2 is water class, 3 is forest class, 4 is rocky/barren class, and 5 is cultivation class.   Table A2. Confusion matrix generated in Google Earth Engine (GEE) for the validation dataset. Here 1 is built-up class, 2 is water class, 3 is forest class, 4 is rocky/barren class, and 5 is cultivation class.

RF Classifier Reference
XGBoost Classifier Reference