Analysis of Land Use and Land Cover Using Machine Learning Algorithms on Google Earth Engine for Munneru River Basin, India

The growing human population accelerates alterations in land use and land cover (LULC) over time, putting tremendous strain on natural resources. Monitoring and assessing LULC change over large areas is critical in a variety of fields, including natural resource management and climate change research. LULC change has emerged as a critical concern for policymakers and environmentalists. As the need for the reliable estimation of LULC maps from remote sensing data grows, it is critical to comprehend how different machine learning classifiers perform. The primary goal of the present study was to classify LULC on the Google Earth Engine platform using three different machine learning algorithms—namely, support vector machine (SVM), random forest (RF), and classification and regression trees (CART)—and to compare their performance using accuracy assessments. The LULC of the study area was classified via supervised classification. For improved classification accuracy, NDVI (normalized difference vegetation index) and NDWI (normalized difference water index) indices were also derived and included. For the years 2016, 2018, and 2020, multitemporal Sentinel-2 and Landsat-8 data with spatial resolutions of 10 m and 30 m were used for the LULC classification. ‘Water bodies’, ‘forest’, ‘barren land’, ‘vegetation’, and ‘built-up’ were the major land use classes. The average overall accuracy of SVM, RF, and CART classifiers for Landsat-8 images was 90.88%, 94.85%, and 82.88%, respectively, and 93.8%, 95.8%, and 86.4% for Sentinel-2 images. These results indicate that RF classifiers outperform both SVM and CART classifiers in terms of accuracy.


Introduction
Understanding land use and land cover at various scales will aid future studies into a variety of global phenomena, such as droughts, floods, erosion, migration, and climate change. The continuous and accurate analysis of LULC is an integral part of the sustainable development activities undertaken in any given area. Detailed LC maps are an important input for a variety of scientific studies involving climate change effects on streamflow and water budgets [1,2], geomorphology [3], groundwater management [4][5][6][7], social knowledge management of natural resources [8], and agricultural land monitoring [9][10][11]. LULC maps can help determine which types of lands are suitable for agriculture and which can be useful in watershed management in general [12,13]. Remotely sensed imagery is the most commonly used method for mapping land cover and tracking changes over time [14][15][16][17]. Due to population increases and the need to develop new regions to meet the demand for food production, energy generation, and water security, the hydrologic and water resources modeling community is keen to integrate and evaluate changing land use and its impact on the water budget [18][19][20][21]. Generating low-resolution land cover maps across large regions involves massive amounts of data. As such, huge storage capacities, high processing power, and the flexibility to apply diverse approaches are all required [22]. These requirements were addressed and such technology was made freely available to anyone with the launch of the Google Earth Engine (GEE). GEE is a cloud-based platform which combines vast amounts of remote sensing data from multiple sources with a high-performance computer service, allowing for quick and easy satellite imagery computing [23][24][25][26].
GEE contains freely available satellite imagery from Landsat, Sentinel, and MODIS, among others. Client libraries are available in JavaScript and Python, and code is written in a web-based code editor [27][28][29][30]. GEE employs the MapReduce architecture for parallel processing, a technique that breaks large amounts of data into smaller pieces and processes them across multiple machines. The separately processed parts are then recombined to produce the final result. LULC classification results obtained from remote sensing imagery using non-parametric machine learning methods, such as classification and regression trees (CART), support vector machine (SVM), and random forest (RF), were found to be exceptionally accurate [31][32][33][34]. GEE is applied in a variety of LULC-based research fields due to its extensive capabilities [23]. Gong et al. (2013) [35] created a global land cover map at 30 m resolution using cloud computing techniques. Midekisa et al. (2017) [36] produced maps of locations all over the African continent for 15 years using the GEE platform. Kolli et al. (2020) [26] mapped land use changes around Kolleru Lake, India, using an RF classifier and obtained an overall accuracy of 95.9% and a kappa coefficient of 0.94. Rahman et al. (2020) [10] compared the accuracy of RF and SVM for the classification of urban and rural areas in Bangladesh. They achieved a maximum SVM accuracy of 96.9% for Bhola and 98.3% for Dhaka. GEE has also been used in agricultural areas for mapping of crops [27], as well as for comparative analysis with several machine learning methods and multitemporal datasets over larger regions [37,38]. Most studies using GEE have focused on the role of temperature in climate change, the analysis of LULC changes, and the monitoring of water resources using time series analysis [39][40][41]. GEE has some computational constraints in terms of time, storage, and memory. Tamiminia et al. (2020) [25] discussed some of these limitations: for large computations, given the time constraints involved, batch processing is preferable. Furthermore, GEE encounters memory issues in some cases when processing large numbers of datasets.
With the growing demand for reliable LULC data from satellite images over large areas, it is more important than ever to understand machine learning methods and their performance in widely used cloud-based platforms, such as GEE. The vast majority of the available studies focus on comparisons between LULC classifiers. There are few studies on large-scale LULC mapping using machine learning methods which compare LULC maps created from different multispectral satellite images. The main aim of the present study was to use multispectral satellite images from Landsat-8 and Sentinel-2 for LULC classification, compare existing machine learning methods on the GEE platform, and thereby determine the satellite image source and machine learning algorithm which result in classification with the highest accuracy.

Study Area
The Munneru sub-basin is one of the most important agriculturally dominated sub-basins in the Lower Krishna basin, India (Figure 1). This sub-basin drains areas in the districts of Khammam, Warangal (Telangana State), and Krishna (Andhra Pradesh State). The Munneru basin lies between latitudes of 16.6° N and 18.1° N and longitudes of 79.2° E and 80.8° E. The Munneru basin encompasses a total area of 9854 km². Paddy, cotton, and maize are the dominant crops in the Munneru basin, which also has deciduous and degraded/scrub forests and a large spread of plantations in the lower part. This sub-basin contains the tributaries Munneru, Akeru, Wyra, and Kattaleru. This basin's major water bodies include Pakhal Lake, Wyra Reservoir, Bhayyaram Cheruvu, and Lanka Sagar Reservoir. The dominant soils in this basin are red soils followed by black soils. The river plays a major role in providing water for irrigation and for domestic purposes. Around 77% of the total area of the basin is cultivable, with the main crops being rice, corn, cotton, sorghum, millet, sugar cane, and a variety of horticulture crops. Population increase has resulted in heightened demand for and consumption of water for both domestic and industrial purposes, putting a strain on water resources. Land use of the watershed is mainly dominated by cropland and irrigated land. The major changes are the conversion of barren land to built-up land, cropland to dryland, and urbanisation in key areas such as Khammam and Nandigama. As part of the ongoing 2017 BRICS-DST project, this basin continues to be studied for the development of an integrated water resources management model under climate change scenarios.
Sustainability 2021, 13, 13758

Data
A massive amount of Earth observation data (EOD) from the previous four decades, encompassing satellite images from popular platforms such as Sentinel, Landsat, and MODIS, as well as other geographic data including climate and demographic data, is stored in the cloud-based GEE platform. Landsat and Sentinel data can be accessed via USGS (the United States Geological Survey) in GEE. In the current study, Landsat-8 surface reflectance Tier 1 data, atmospherically corrected using LaSRC (the Landsat-8 Surface Reflectance Code), and Sentinel-2 Level-1C data were used. For each year, images with less than 10 percent cloud cover were selected and combined into a single image. For classification of the images, six bands from Landsat-8 and nine bands from Sentinel-2 were used. For Landsat-8, the total number of images used was ten. The study area was classified into five major classes: water bodies, forest, barren land, vegetation, and built-up areas. Agriculture area and plantations were considered vegetation, while rivers and ponds were considered water bodies. The study made use of spectral bands 1-7 of Landsat-8 images, as well as bands 2-8 and 11-12 of Sentinel-2 images (Table 1).
Figure 2 presents the methodology flowchart used in this study. Orthorectified images with the least amount of cloud cover served as the primary input for classification. The first step after importing the satellite data into GEE was to remove cloud shadow and cloud cover. Pixels contaminated by cloud or no-data conditions were removed from all available images using a cloud mask [29], a technique suggested by Simonetti et al. (2015) [42] and Zurqani et al. (2018) [43] and achievable on GEE. In the second phase, the yearly means of the normalized difference vegetation index (NDVI) and the normalized difference water index (NDWI) were calculated.

Methods
To create a composite image, Landsat and Sentinel data from each year were combined into a single image using the median filter. A median value is assigned to each pixel for the entire stack of images, resulting in a single image for the entire image collection. To perform LULC classification, high-resolution Google Earth images were used to generate 575 training polygons for five land use classes. The generated polygons were evenly distributed throughout the study area. Next, the training data were loaded into the GEE as a feature collection table. For maximum classification accuracy, indices such as NDVI and NDWI were used.
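The median compositing step described above can be illustrated outside GEE with a minimal sketch. It assumes each image is a small 2-D list of reflectance values and that cloud-masked pixels are represented by `None`; the function name `median_composite` and the sample values are hypothetical and are not taken from the study.

```python
from statistics import median


def median_composite(stack):
    """Per-pixel median over a list of equally sized 2-D images.

    Masked (cloudy or no-data) pixels are given as None and ignored.
    A pixel that is masked in every image stays None in the composite.
    """
    rows, cols = len(stack[0]), len(stack[0][0])
    composite = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            # Collect the valid observations of this pixel across the stack.
            valid = [img[r][c] for img in stack if img[r][c] is not None]
            out_row.append(median(valid) if valid else None)
        composite.append(out_row)
    return composite


# Three hypothetical single-band 2x2 acquisitions; None marks a masked pixel.
images = [
    [[0.25, 0.50], [None, 0.50]],
    [[0.125, None], [None, 0.75]],
    [[0.875, 0.25], [None, 0.625]],
]
composite = median_composite(images)
```

Because the median ignores masked observations and is insensitive to a single extreme value (such as a residual bright cloud pixel), it is a common choice for gap filling, which is consistent with its use in this study.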
The NDVI [44] is the normalized difference between the NIR and red bands and the NDWI [45] is the normalized difference between the NIR and SWIR bands, as shown in Equations (1) and (2):
$$\mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{Red}}{\mathrm{NIR} + \mathrm{Red}} \qquad (1)$$
$$\mathrm{NDWI} = \frac{\mathrm{NIR} - \mathrm{SWIR}}{\mathrm{NIR} + \mathrm{SWIR}} \qquad (2)$$
Machine learning algorithms available in GEE, such as RF, CART, and SVM, were used to train the classifiers for both Landsat-8 and Sentinel-2 images.
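As a small worked illustration of the two indices as defined in the text (NDVI from the NIR and red bands, NDWI from the NIR and SWIR bands), the following sketch computes both for single pixels. The function names and the reflectance values are hypothetical; returning `None` for a zero denominator is an assumption made here, not a rule stated in the study.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or None when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else None


def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    return normalized_difference(nir, red)


def ndwi(nir, swir):
    """Normalized difference water index from NIR and SWIR reflectance."""
    return normalized_difference(nir, swir)


# Hypothetical surface reflectance values for two pixels.
example_ndvi = ndvi(nir=0.5, red=0.125)      # (0.5 - 0.125) / (0.5 + 0.125)
example_ndwi = ndwi(nir=0.125, swir=0.375)   # (0.125 - 0.375) / (0.125 + 0.375)
```

Both indices are bounded between −1 and +1, which makes them convenient extra input bands alongside the raw reflectance bands.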

Classification and Regression Tree (CART)
CART is a binary decision tree classifier developed by Breiman et al. (1984) [46] that allows for simple decision making in logical if-then scenarios. CART operates recursively, splitting nodes based on a predefined threshold until it reaches the terminal nodes. In this approach, the input data are split into groups and the trees are constructed using all but one of those groups. The tree is validated using the left-out group, and the reduced tree with the lowest deviation is selected. CART is highly dependent on the sample size used in each class. Its effectiveness is hampered in particular by high-dimensional data, which result in complex tree architectures. The "ee.Classifier.smileCart" method, which is included in the GEE library, was used in the current study to perform CART classification.
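The recursive splitting described above rests on repeatedly choosing one threshold per node. The sketch below shows a single such step for one feature. It assumes Gini impurity as the split criterion, which the text does not specify, and the function names and the NDVI samples are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((k / n) ** 2 for k in counts.values())


def best_split(values, labels):
    """Find the threshold on one feature that minimizes weighted Gini impurity.

    Candidate thresholds are midpoints between consecutive distinct sorted
    values. Returns (threshold, weighted_impurity); the threshold is None
    when no candidate improves on the unsplit node.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_threshold, best_score = None, gini(labels)
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # identical values cannot be separated by a threshold
        threshold = (lo + hi) / 2
        left = [lab for val, lab in pairs if val <= threshold]
        right = [lab for val, lab in pairs if val > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Hypothetical NDVI samples for two classes.
ndvi_values = [-0.25, -0.125, 0.0, 0.5, 0.625, 0.75]
classes = ["water", "water", "water", "vegetation", "vegetation", "vegetation"]
threshold, impurity = best_split(ndvi_values, classes)
```

A full CART implementation would apply this search over every input band at every node and then prune the tree, for example with the cross-validation procedure outlined in the paragraph above.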

Random Forest Classifier (RF)
RF is a widely used ensemble classifier [47] that combines many CART trees. RF generates multiple decision trees using random subsets of the training data and variables. Internally, the samples left out of the training of each tree are used to evaluate the classifier's performance and provide an unbiased assessment of the generalization error. To establish the appropriate split when building a tree, RF selects a random subset of variables at each node. The two most important inputs for RF are the number of variables considered at each split and the number of trees, both of which are user-defined parameters. According to the literature, the optimal number of trees ranges from 100 to 500, and the optimal number of variables per split is the square root of the total number of variables [48].
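Three ingredients of RF mentioned above (random sampling of the training data, held-out samples for internal error estimation, and a limited number of variables per split) can be sketched as follows. This is an illustrative sketch rather than the GEE implementation: the function names are hypothetical, the floor of the square root is one common reading of the square-root rule, and majority voting is assumed as the way tree predictions are combined.

```python
import math
import random


def bootstrap_indices(n_samples, rng):
    """Draw n_samples indices with replacement; return (in_bag, out_of_bag)."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    out_of_bag = sorted(set(range(n_samples)) - set(in_bag))
    return in_bag, out_of_bag


def variables_per_split(n_variables):
    """Square-root rule (floored) for candidate variables at each node."""
    return max(1, math.isqrt(n_variables))


def majority_vote(predictions):
    """Combine per-tree class predictions by picking the most frequent label."""
    counts = {}
    for label in predictions:
        counts[label] = counts.get(label, 0) + 1
    return max(counts, key=counts.get)


rng = random.Random(42)  # fixed seed so the sketch is reproducible
in_bag, out_of_bag = bootstrap_indices(402, rng)  # 402 = training polygons in this study
```

Each tree would be trained on its own `in_bag` sample and evaluated on the matching `out_of_bag` samples, which is what gives RF its internal estimate of generalization error.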

Support Vector Machine (SVM)
The support vector machine (SVM) is a supervised learning algorithm that is used to solve regression and classification problems. In the training stage, SVM classifiers create an optimal hyperplane that separates multiple classes with the fewest misclassified pixels. SVM selects the extreme points/vectors that help create the hyperplane. These extreme points are referred to as support vectors. The main parameters for selecting support vectors are the cost parameter C, Gamma, and the kernel function [49]. The grid search technique is used to define the C and Gamma parameters, resulting in reliable prediction results. C, the cost parameter, has a significant impact on support vector selection and SVM performance. The linear kernel is preferred for training on large datasets.
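To make the role of the cost parameter C concrete, the sketch below evaluates the standard soft-margin objective of a linear SVM for a fixed hyperplane. It is a two-class simplification with labels −1 and +1, whereas the study classifies five classes using GEE's library implementation. The weights, samples, and function names are hypothetical.

```python
def decision_value(weights, bias, x):
    """Signed score w . x + b; its sign gives the predicted class."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def hinge_loss(weights, bias, x, y):
    """Penalty for one sample with label y in {-1, +1}; zero outside the margin."""
    return max(0.0, 1.0 - y * decision_value(weights, bias, x))


def svm_objective(weights, bias, samples, labels, cost):
    """Soft-margin objective: 0.5 * ||w||^2 + C * sum of hinge losses."""
    regularizer = 0.5 * sum(w * w for w in weights)
    penalty = sum(hinge_loss(weights, bias, x, y) for x, y in zip(samples, labels))
    return regularizer + cost * penalty


# Hypothetical hyperplane and three samples; the third is on the wrong side.
weights, bias = [1.0, 0.0], 0.0
samples = [[2.0, 0.0], [-2.0, 0.0], [0.5, 0.0]]
labels = [1, -1, -1]
objective_c1 = svm_objective(weights, bias, samples, labels, cost=1.0)
objective_c10 = svm_objective(weights, bias, samples, labels, cost=10.0)
```

With a larger C the single misclassified sample dominates the objective, so the optimizer is pushed towards hyperplanes that misclassify fewer training samples, at the price of a narrower margin.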

Accuracy Assessment
Once the classification was completed using machine learning algorithms, an accuracy assessment was performed on the classified images. The reference datasets were divided into training and validation sets. Of the 575 polygons, 70 percent (402 polygons) were used for training and 30 percent (173 polygons) were used for testing. GEE provides a built-in confusion matrix function that is used to validate and evaluate the classification accuracy of the images. The overall accuracy (OA) and kappa coefficient (k) are calculated from the following equations:
$$\mathrm{OA} = \frac{P_c}{P_n} \times 100 \qquad (3)$$
where $P_c$ is the number of pixels classified correctly and $P_n$ is the total number of pixels.
$$k = \frac{N \sum_{i=1}^{r} x_{ii} - \sum_{i=1}^{r} \left( x_{i+} \times x_{+i} \right)}{N^{2} - \sum_{i=1}^{r} \left( x_{i+} \times x_{+i} \right)} \qquad (4)$$
where $r$ = the number of rows and columns in the error matrix, $x_{ii}$ = the number of observations in row $i$ and column $i$, $x_{i+}$ = the marginal total of row $i$, $x_{+i}$ = the marginal total of column $i$, and $N$ = the total number of observations. The user accuracy for each class is the ratio of correctly classified pixels in the class to the total number of pixels classified as that class. Similarly, the producer accuracy is the ratio of correctly classified pixels to the total number of pixels of that class in the reference data. The kappa coefficient expresses the proportionate reduction in error achieved by a classification compared with the error of a completely random classification. Its value typically ranges from −1 to +1, and a value larger than +0.5 indicates good agreement [50]. The best performing classifier will be selected for further classification of images and will be used to examine spatiotemporal change in the future.
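The overall accuracy, kappa coefficient, and producer and user accuracies described above can all be derived from one error matrix. The following sketch assumes that rows hold the reference classes and columns hold the classified classes, and that every class has at least one reference and one classified sample. The function name and the 2x2 example matrix are hypothetical and are not results from this study.

```python
def accuracy_metrics(matrix):
    """Compute OA (%), kappa, producer and user accuracy from an error matrix.

    matrix[i][j] is the number of samples whose reference class is i (row)
    and whose classified class is j (column).
    """
    r = len(matrix)
    n = sum(sum(row) for row in matrix)                  # N, total observations
    diagonal = sum(matrix[i][i] for i in range(r))       # correctly classified
    row_totals = [sum(matrix[i]) for i in range(r)]      # x_{i+}
    col_totals = [sum(matrix[i][j] for i in range(r)) for j in range(r)]  # x_{+i}
    chance = sum(row_totals[i] * col_totals[i] for i in range(r))

    overall = 100.0 * diagonal / n
    kappa = (n * diagonal - chance) / (n * n - chance)
    producer = [matrix[i][i] / row_totals[i] for i in range(r)]
    user = [matrix[i][i] / col_totals[i] for i in range(r)]
    return overall, kappa, producer, user


# Hypothetical two-class error matrix (rows: reference, columns: classified).
example = [
    [40, 10],
    [5, 45],
]
overall, kappa, producer, user = accuracy_metrics(example)
```

In this example the overall accuracy is 85% while kappa is 0.7, which illustrates why the two metrics are reported together: kappa discounts the agreement that would be expected by chance.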

LULC Classification Using GEE
This study examines the performance of various machine learning techniques on LULC classification using Landsat-8 surface reflectance Tier 1 and Sentinel-2 Level-1C data with 30 m and 10 m resolutions, respectively. Figures 3 and 4 show the LULC maps for 2016, 2018, and 2020 classified with the RF, CART, and SVM algorithms from Landsat-8 and Sentinel-2 images on the GEE platform. Orthorectified images with minimal cloud cover were used as the primary input, and pixels contaminated by cloud were removed from all available images using the cloud mask algorithm available on the GEE platform. To fill the gaps in cloudy images, temporal aggregation methods such as the median, mean, and minimum/maximum can be used. In this study, the median was used to compose Landsat-8 and Sentinel-2 images for the entire year. Two widely used indices, the normalized difference water index (NDWI) and the normalized difference vegetation index (NDVI), which are representative of water bodies and vegetation characteristics, respectively, were derived and used as additional inputs for the classification of LULC. Training and validation datasets were generated via image observations. A total of 575 training sites were used for classification. As a rule of thumb [50], each class should have at least 50 training samples for classification. Each class received 80-95 samples for training and 65-80 samples for validation. The SVM, RF, and CART algorithms were applied to the same training and validation data. LULC was divided into five major classes: vegetation, forest, water bodies, built-up, and barren land. Following Kohavi (1995) [51], the best cross-validation factor was taken to be 5 or 10 and was used as an input value for the CART classifier. A number of trees in the 50-100 range exhibited higher accuracy and performed better for RF classification [48]. In the present study, a total of 100 trees yielded good results. Kernel type, gamma value, and cost are all important parameters in SVM. For large datasets, the linear kernel type is preferable [49]. For linear kernels, the gamma parameter is not required. The cost parameter determines the severity of the penalty for incorrectly classified data. A higher C value penalizes misclassification more heavily, which results in fewer misclassified training samples. The C-SVC method was used for SVM classification, with a cost parameter of 10 and a linear kernel type.
... around water bodies, as observed in Figure 4. Water bodies, forest, barren, and built-up areas were correctly classified using the RF algorithm. Vegetation was slightly misclassified as forest using the RF algorithm in 2016 and 2020. Using CART, the classification of water bodies, built-up areas, and forest was superior to that of the vegetation and barren classes.
Figure 5 depicts the changes in the LULC classification as determined from Sentinel-2 and Landsat-8 images using RF for the years 2016, 2018, and 2020. Figure 5 indicates that, for the period of 2016-2020, built-up, barren land, and vegetation increased by 0.6%, 0.78%, and 0.015%, respectively. Forest cover and water bodies decreased by 0.088% and 0.22%, respectively, over the same period. In 2018, vegetation cover decreased before increasing in 2020. Similarly, water bodies were reduced in 2018 and increased in 2020 as a result of heavy rains in some parts of the basin. Barren lands were more prevalent in 2018 than in 2016.

Comparison of Classification Performances
Accuracy assessment was used to assess the efficacy of the classifiers. The most commonly used metric for evaluating the accuracy and effectiveness of classifiers is overall accuracy (OA), which represents the amount of test data correctly classified by the classifier as a percentage. Furthermore, the confusion matrix and user and producer accuracy were used to assess the class-wise performance of each classifier. GEE has methods for determining the correctness of several classifiers, and accuracy assessments were used to determine the accuracy of each classified image for the years 2016, 2018, and 2020. To obtain the ground truth data, user interpretation was employed to select testing sites. The performance of the RF, SVM, and CART classifiers is compared in Table 2 in terms of the overall accuracy and kappa coefficient.
As shown in Table 2, the RF classifier outperformed the SVM and CART classifiers. It can also be observed that Sentinel-2 images were more accurate than Landsat images. The average overall accuracy of RF, SVM, and CART classifiers for Landsat-8 was 94.85%, 90.88%, and 82.88%, respectively. The average overall accuracy of RF, SVM, and CART classifiers for Sentinel-2 was 95.84%, 93.65%, and 86.48%, respectively. For Landsat-8 data with RF, SVM, and CART classifiers, the average kappa coefficients were 0.90, 0.84, and 0.74, respectively. The average kappa coefficients for RF, SVM, and CART classifiers on Sentinel-2 data were 0.92, 0.88, and 0.77, respectively. When compared with SVM and CART, the RF classifier achieved the highest producer and user accuracy for both Landsat-8 and Sentinel-2.
The producer and user accuracy of Landsat-8 and Sentinel-2 for each land class are presented in Figures 6 and 7. Compared with the other classes, forest and water bodies were classified well, with more than 90% user and producer accuracy for both Sentinel and Landsat data. For both satellites, RF outperformed the other classifiers in terms of producer and user accuracy.

Discussion
In the current study, different machine learning methods were applied to determine the accuracy of LULC classifications using multispectral Sentinel-2 and Landsat imagery. As seen in Figures 3 and 4, because the reflectance of plantations coincides with that of forest, the majority of the vegetation was misclassified as forest. Because the river in the study area carries no flow during the non-monsoon season, some parts of the riverbed were slightly misclassified as built-up and barren land due to similar reflectance. Overall accuracy (OA) is the most extensively used metric for estimating accuracy. It represents the percentage of the testing set that was correctly classified by the classifier. Additionally, the confusion matrix, user accuracy, and producer accuracy are utilized to further evaluate the class-level performance of a given classifier [52]. The best performing model is chosen based on accuracy and kappa coefficient. RF classifies well in all classes, as evidenced by other studies [53,54]. The accuracy of barren land was lower than that of other land use classes, as observed in Figures 6 and 7. Because some parts of plantations were classified as forest rather than vegetation, the accuracy of vegetation was reduced. Built-up areas were also mistaken for water bodies because their reflectance values match during the non-monsoon season. Barren land had very few pixels, which were insufficient to train the classifier efficiently, resulting in poor performance when compared with the other land use classes. In terms of producer and user accuracy, RF outperformed the other classifiers for both satellites; however, the SVM and CART classifiers performed better for water bodies and forest than for the other classes. Forest, water bodies, and barren land were misclassified as vegetation and built-up areas by the SVM and CART classifiers. RF outperformed the other two classifiers in classifying all five classes for both the Sentinel-2 and Landsat-8 datasets.
It is difficult to distinguish between built-up, vegetation, and barren land classes in 30 m resolution Landsat-8 images due to mixed pixels. Sentinel-2 imagery, on the other hand, allows for superior classification of small regions and diverse land use systems in which multiple land use classes occur together. A resolution of 10 m is preferable where classes are scattered and the landscape is highly fragmented. In this scenario, the Sentinel-2 image performed better. When the results obtained from Landsat-8 and Sentinel-2 imagery are compared, the Sentinel-2 dataset yielded the highest accuracy due to its higher spatial resolution and the greater number of band combinations used for classification. Compared with Landsat data, the Sentinel red-edge band combination is better suited for accurately classifying vegetation.
Random forests, in general, combine numerous soft linear boundaries at the decision surface. In SVM and CART, misclassification occurs between some classes. SVM performs well when the input training data are sparse, making it a better choice when less data are available [54]. Each algorithm has its own set of benefits and drawbacks. RF is more resilient and less affected by parameter choices, whereas SVM is sensitive to hyperparameters [55]. RF outperformed all of the other classifiers, regardless of training data size, followed by SVM and then CART. Some existing literature claims that SVM outperforms CART [56], which was also observed in the present study. However, a few studies claim that CART outperforms SVM [57]. It is best to use the least sensitive, most complex, and fastest method for classification [58]. Through the use of multispectral satellite images, GEE simplifies the process of classifying large study areas. With the available methods and algorithms applied here, the performance of image pre-processing tasks is made more flexible.

Conclusions
The performance of the RF, CART, and SVM machine learning methods for the classification of LULC on the GEE platform, using Landsat-8 and Sentinel-2 datasets for three years (2016, 2018, and 2020), was analyzed. The type of classifier used influenced the accuracy of LULC classification from satellite images. Accuracy assessment of each individual class can be used to evaluate the performance of each classifier with respect to that class. The accuracy of the classifications was assessed using an error matrix. For both Sentinel-2 and Landsat-8 images, RF outperformed CART and SVM. The combination of band data was important and affected classification accuracy. Sentinel data have red-edge bands, which allow for better vegetation classification than Landsat data. Because of its higher spatial resolution, the Sentinel-2 dataset outperformed the Landsat-8 dataset in terms of accuracy. The most suitable classifier for any given scenario may also be affected by the study region, the required thematic accuracy, the quality of the training samples, and the mapping requirements.