PS-InSAR-Based Validated Landslide Susceptibility Mapping along Karakorum Highway, Pakistan

: Landslide classiﬁcation and identiﬁcation along Karakorum Highway (KKH) is still challenging due to constraints of proposed approaches, harsh environment, detail analysis, complicated natural landslide process due to tectonic activities, and data availability problems. A comprehensive landslide inventory and a landslide susceptibility mapping (LSM) along the Karakorum Highway were created in recent research. The extreme gradient boosting (XGBoost) and random forest (RF) models were used to compare and forecast the association between causative parameters and landslides. These advanced machine learning (ML) models can measure environmental issues and risks for any area on a regional scale. Initially, 74 landslide locations were determined along the KKH to prepare the landslide inventory map using different data. The landslides were randomly divided into two sets for training and validation at a proportion of 7/3. Fifteen landslide conditioning variables were produced for susceptibility mapping. The interferometric synthetic aperture radar persistent scatterer interferometry (PS-InSAR) technique investigated the deformation movement of extracted models in the susceptible zones. It revealed a high line of sight (LOS) deformation velocity in both models’ sensitive zones. For accuracy comparison, the area under the curve (AUC) of the receiver operating characteristic (ROC) curve approach was used, which showed 93.44% and 92.22% accuracy for XGBoost and RF, respectively. The XGBoost method produced superior results, combined with PS-InSAR results to create a new LSM for the area. This improved susceptibility model will aid in mitigating the landslide disaster, and the results may assist in the safe operation of the highway in the research area.


Introduction
The Karakoram Highway (KKH) connects the Middle East and South Asia with China through Pakistan [1,2]. Gilgit Baltistan (GB) has a rough mountainous landscape with mountains covering 90% of the whole area and is susceptible to landslides [3]. Rock falls, avalanches, rockslides, debris flow, rotating slips, slumps, and creep have all been reported in the Karakoram Mountains [4]. The most prevalent kind of mass movement observed along the Karakoram highway are rockslides and debris slides. Debris flow is a sudden mass movement caused by heavy rain on steep unconsolidated terrain [5]. One of the world's highest paved highways, KKH connects Gilgit Baltistan province with China's Xinjiang Region across the Karakoram range. The KKH is economically significant because export and import between Pakistan and China depend on this highway. Because of the China-Pakistan Economic Corridor (CPEC) project (project value in 2017 was USD 62 billion [6]), the economic relevance of this route has increased in recent years. Landslides that have previously blocked KKH provide a significant difficulty in this regard. Several geohazards have undermined the reputation of KKH since its completion in 1979 [7]. In coherence pixel technique [62], small baseline subset [63], and stable point network [64,65] have developed some practical case studies. These methods are concerned with mapping and detecting landslide occurrences, as mentioned in [61,[66][67][68].
Previous research in the region [7,10,[69][70][71] has focused on statistical and deterministic connections and regression analysis of landslides with variables. The PS-InSAR technique evaluated surface deformation in the research region with XGBoost and RF models for the first time, making it a unique way of detecting landslide movements. Along the KKH, persistent scatterer interferometry (PSI), interferometric synthetic aperture radar (InSAR), and area under the curve (AUC) of receiver operating characteristic (ROC) techniques were used to assess the deformation in particularly vulnerable locations as well as the accuracy of the models utilized. PS-InSAR can identify and define individual landslides in unstable regions. Using a spatial statistical method can also detect landslides on the basis of a multi-temporal assessment of SAR pictures that compute slow landslide motions [2].
The present study aimed to create a comprehensive visually interpreted landslide inventory and generate a susceptibility model using the recently developed ML techniques of random forest and extreme gradient boosting. These advanced machine learning (ML) models can measure environmental issues and risks for any area on a regional scale. The second aim involved using PS-InSAR, estimating the deformation velocities of slowmoving landslides to identify high-risk zones for future landslide disaster management. Monitoring and detecting landslides in the region with steep terrain is challenging for InSAR assessment to achieve this goal. The third aim was to determine the best susceptible model on the basis of accuracy and AUC value, combined with PS-InSAR results to create a new LSM for the area. These forecasting techniques will also guide future development initiatives and land management in the area. These susceptibility maps will be useful in avoiding and minimizing human and economic losses along this vital corridor.

Research Area
The area is situated along the left and right banks of the Gilgit and Indus Rivers. This research area was 170 km long and covered a 5 km radius buffer of territory along KKH ( Figure 1). Natural disasters such as landslides and snow avalanches are common in the area. In the area, the most common forms of landslides are rockfalls and debris flow caused by rain and tectonic activity.
The region has moderate summers and harsh winters. The annual rainfall in the study region is 154 mm. In contrast, the maximum yearly mean temperature extends from 16 to 25 • C, and the minimum mean annual temperature rises from −3 to −21 • C (Meteorological Department of Pakistan). Avalanches and landslides are prevalent in the area as a result of the harsh weather situations.

Geological Setting of the Area
The majority of the rocks in the area are Paleozoic, Proterozoic, and Mesozoic in age ( Figure 2). According to the geological map compiled by Searle et al. [72], the area is composed of the following lithology.
The region has moderate summers and harsh winters. The annual rainfall in the study region is 154 mm. In contrast, the maximum yearly mean temperature extends from 16 to 25 °C, and the minimum mean annual temperature rises from −3 to −21 °C (Meteorological Department of Pakistan). Avalanches and landslides are prevalent in the area as a result of the harsh weather situations.

Geological Setting of the Area
The majority of the rocks in the area are Paleozoic, Proterozoic, and Mesozoic in age ( Figure 2). According to the geological map compiled by Searle et al. [72], the area is composed of the following lithology. The Yasin group comprised Conglomerate with boulders and cobbles of limestone and sandstone, andesite, red shale, and shelly limestone, and Kohistan Batholiths comprised granodiorite, granite, and diorite. The Chalt Volcanic Group contained minor green schist, quartzite, gneiss, amphibolite, andesite, agglomerate, grey marble, and metabasalt. Komila amphibolite complex is composed of meta-basaltic and meta-plutonic rocks. Chilas complex contains mafic and ultramafic igneous rocks associated with the Kohistan Arc sequence (volcanic clastic sedimentary rocks). The greenstone complex comprises metasediments, marble, and volcanic sediments (predominantly andesite, basalt, tuff, agglomerate, and rhyolite). The Southern Karakorum metamorphic complex was composed of paragneiss and included interbedded marble and shengus gneiss containing calic-silicates and garnet amphibolite intruded by basaltic dykes. The Darkot The Yasin group comprised Conglomerate with boulders and cobbles of limestone and sandstone, andesite, red shale, and shelly limestone, and Kohistan Batholiths comprised granodiorite, granite, and diorite. The Chalt Volcanic Group contained minor green schist, quartzite, gneiss, amphibolite, andesite, agglomerate, grey marble, and meta-basalt. Komila amphibolite complex is composed of meta-basaltic and meta-plutonic rocks. Chilas complex contains mafic and ultramafic igneous rocks associated with the Kohistan Arc sequence (volcanic clastic sedimentary rocks). The greenstone complex comprises metasediments, marble, and volcanic sediments (predominantly andesite, basalt, tuff, agglomerate, and rhyolite). The Southern Karakorum metamorphic complex was composed of paragneiss and included interbedded marble and shengus gneiss containing calic-silicates and garnet amphibolite intruded by basaltic dykes. The Darkot Group in the research area comprised minor phyllite, slate and quartzite, agglomerate, dacite, tuff, quartzite, dolomite limestone, and andalusite, and the quaternary deposits included eolian sand, alluvium, moraines, and scree.
The area's orogenic features began to form 50 million years ago when the Indo-Eurasian collision began. The region's subduction, active faulting, and cluster shortening are caused by elevated rates of 7 mm/year [73] and merging rates of 4 cm/year [74]. The region is situated between the Kohistan arc and the Karakorum block, with the Main Karakorum Thrust (MKT) fault connecting them ( Figure 2). Within the Indus suture zone (MKT), the oceanic island arc (cretaceous) of Kohistan arc terrain differentiates the Indian plate and the Karakorum (Asian) plate [72]. The Kohistan arc formed during the Cretaceous period, whenever an intra-oceanic island arc inside the NeoTethyan Ocean lay between continental India and the Asian plate margin.
In Figure 2, in the cross-section A to A', all faults are thrust in nature. MKT is a major tectonic feature in the research area. MKT passes between the Southern Karakorum metamorphic complex and Kohistan Tarrane and Shyok Suture Zone in the area ( Figure 2). The MKT is a substantial earthshattering feature along the highway that is responsible for brittle deformations [75,76]. Brittle deformation causes massive articulation and cracking of the rock mass. Geology along with the cross-section A to A' is Southern Karakorum metamorphic complex (marble, ampobolites), Yasin group (limestone, sandstone), glaciers and Hunza plutonic unit (plutonic rocks). Along cross-sections B to B', all faults are local level and normal faults. Geology along the cross-section B to B' is Gilgit metasedimentary complex (Mesozoic sedimentary rocks), Quaternary deposits (alluvium, moraines), and Kohistan Batholiths (gabbro, diorite). The region's geological formations and soils are weak, also playing an essential role. The soil in the investigated area is mainly loam sand, clay, silt, and sandy clay [77]. Previous research found that the study area is one of the steepest places in the world, with a dramatic fall in elevation from 7788 to 2000 m within a 5 km radius [75]. Furthermore, the mountains feature steep slopes that are prone to landslides [78].

Landslide Susceptibility Mapping
Remote sensing data, geological maps, and meteorological data had been extracted from different sources for the study. Advanced Land Observing Satellite, Phased Array Type L-band Synthetic Aperture Radar Digital Elevation Model (ALOS PALSAR DEM) with a spatial resolution of 12.5 m was downloaded from the NASA dataset. Sentinel-2 images with 10 m of the spatial resolution were taken from the USGS dataset to extract a land cover map for the area. The geological map for the area was digitized in the ArcGIS environment to understand surficial geological features. Sentinel-1 images (30 in ascending track and 28 in descending track) were taken for PS-InSAR processing to calculate the deformation velocity. The methodology followed in the study is shown in Figure 3.

Landslide Inventory
Landslide inventory has information about all kinds of previous landslides in the research area, and it is the first step to assess susceptibility. Inventory maps have

Landslide Inventory
Landslide inventory has information about all kinds of previous landslides in the research area, and it is the first step to assess susceptibility. Inventory maps have information on all historical and active landslide distribution by using the data of previous reports, interpretation of aerial images, and carrying out the field survey of the area [79].
However, the landslide inventory was developed by the visual interpretation of Sentinel-2 images with 10 m resolution (2020) and from the Google Earth engine, verified using previous reports and the field survey of the study area. There were 74 landslides mapped, divided into modeling (70%) and validation (30%) datasets, as shown in Figure 4.

Landslide Causative Factors
GIS is widely used to obtain critical susceptibility assessment variables, such as aspect, curvature, Topographic Wetness Index (TWI), elevation, and slope, from digital elevation models (DEM). Factors such as geology, precipitation, land cover, distance to road, slope, distance to stream, Normalized Difference Vegetation Index (NDVI), distance to faults, roughness, curvature, Topographic Wetness Index (TWI), profile curvature, aspect, elevation, and plan curvature were used to assess the related risk landslide catastrophes along the KKH (Table 1). Figures 5 and 6 show the 15 landslide factors.

Landslide Causative Factors
GIS is widely used to obtain critical susceptibility assessment variables, such as aspect, curvature, Topographic Wetness Index (TWI), elevation, and slope, from digital elevation models (DEM). Factors such as geology, precipitation, land cover, distance to road, slope, distance to stream, Normalized Difference Vegetation Index (NDVI), distance to faults, roughness, curvature, Topographic Wetness Index (TWI), profile curvature, aspect, elevation, and plan curvature were used to assess the related risk landslide catastrophes along the KKH (Table 1). Figures 5 and 6 show the 15 landslide factors. Machine learning model identification, model fitting, and model development were all part of the modeling process. The steps are as follows: In this study, the grid unit was used as the model unit. II.
Remote sensing data and DEM data spatial resolution corresponded to 12.5 m, and evaluation factors were all re-calculated at 12.5 m. III.
A condition attribute corresponding created a two-dimensional table to 15 evaluation factors and a landslide decision attribute (1 represents landslides, 0 represents non-landslides), with each line defining an object. IV.
Each column is the object's attribute and changed into training (70%) and testing the two-dimensional table (30%). The model was built using training data, and forecasts were made using test data. V.
The above two models were used to compute model units in the research area. Each model unit's prediction values per group were the outcome to generate the Landslide Prediction Index (LPI) maps. VI.
The results acquired using the two algorithms were imported into the GIS. The Janks natural breakpoint [80] was applied to landslide susceptibility into five grades: very low, low, moderate, high, and very high. VII.
The two models were extensively tested using the ROC curve and the area under the ROC curve.   Machine learning model identification, model fitting, and model development were all part of the modeling process. The steps are as follows: In this study, the grid unit was used as the model unit. II.
Remote sensing data and DEM data spatial resolution corresponded to 12.5 m, and evaluation factors were all re-calculated at 12.5 m. III.
A condition attribute corresponding created a two-dimensional table to 15 evaluation factors and a landslide decision attribute (1 represents landslides, 0 represents nonlandslides), with each line defining an object. IV.
Each column is the object's attribute and changed into training (70%) and testing the two-dimensional table (30%). The model was built using training data, and forecasts were made using test data. V.
The above two models were used to compute model units in the research area. Each model unit's prediction values per group were the outcome to generate the Landslide Prediction Index (LPI) maps. VI.
The results acquired using the two algorithms were imported into the GIS. The Janks natural breakpoint [80] was applied to landslide susceptibility into five grades: very low, low, moderate, high, and very high.

Random Forest
The RF method is an advanced, tree-based technique incorporating many regression and classification trees [81]. It is also a probabilistic technique for modeling decision tree methods' discrete and continuous data. The main issue with this method is the variation in the outcomes of each tree [82]. A random forest approach is proposed to decrease these variations and the approximation of alteration [83]. It integrates numerous decision trees that use several base classifiers from the data, and several parameters are used at random to build each tree. The three main parameters used are listed in Table 2: the number of features that are appropriate for dividing (mtry), and the minimum number of a sample can also be taken arbitrarily in each bootstrap sample to balance any tree with recursive portioning (ntree) [84].

Extreme Gradient Boosting
In 2001, Professor Fridman of the Stanford Statistics Department proposed the gradient boosting algorithm, an estimation technique based on the gradient descent techniques [85]. XGBoost is a type of lifting tree model and a boosting algorithm. Its key innovations are summarized below [86].

I.
It optimizes its loss function. II.
The parallel approximate histogram algorithm efficiently generates the candidate split value. III.
It provides an efficient cache-aware block structure for out-of-core tree learning as well as a unique sparsity-aware linear tree learning method.
Model XGBoost requires several model preview parameters to be chosen. Three main user-friendly parameters are required: nrounds (maximum number of iterations boosting), colsample_bytree (columns ratios sub-sample when each tree is constructed), and subsamples (the training instance subsample ratio) ( Table 3).

The Significance of Landslide Variables
We used R studio in this study to compute the significance of landslides variables. The factor's importance can be seen in Figure 7. intermediate elevation typically categorize high elevation zones, and slopes are usually covered by thin colluvium, making them more susceptible to landslides [87]. Because active fault and shear zones exert substantial control over landslide activity in the area [7], the buffer class closest to the fault line is the most vulnerable ( Figure 7). Barren land is in direct contact with climatological variables such as sun and rainfall, which causes rock weathering and increases the risk of landslides [88]. Monsoon rainfall is a significant causative factor in the area [89], causing the majority of the debris flow, rockfalls, and other slides. In this study, annual average precipitation data were used, which did not show precipitation as a significant causative factor but did find more landslides in areas with high precipitation (Figure 7). The distance to the road, landcover, aspect, and geology had a minor influence on the landslide in the investigated area. The lithological units have little influence in LSM, but the Yasin group and Quaternary alluvium (Figure 7) are the most susceptible formations [6,69,90] in the research area.  Figure 8 depicts the outcomes of using the two models for LSM that were identified using the LPI. The greater the LPI, the more likely it is that a landslide will occur [91]. Using the natural break method [80], we divided the probability value of landslide According to Figure 7, the slope and elevation had the most significant influence, and precipitation, distance to fault, roughness, and TWI were nearly identical on landslides in the research area. The elevation is a significant factor for landslides in the area ( Figure 7); it promotes the landslides and makes an area prone to landslides. Weathered rocks and intermediate elevation typically categorize high elevation zones, and slopes are usually covered by thin colluvium, making them more susceptible to landslides [87]. Because active fault and shear zones exert substantial control over landslide activity in the area [7], the buffer class closest to the fault line is the most vulnerable (Figure 7). Barren land is in direct contact with climatological variables such as sun and rainfall, which causes rock weathering and increases the risk of landslides [88]. Monsoon rainfall is a significant causative factor in the area [89], causing the majority of the debris flow, rockfalls, and other slides. In this study, annual average precipitation data were used, which did not show precipitation as a significant causative factor but did find more landslides in areas with high precipitation (Figure 7). The distance to the road, landcover, aspect, and geology had a minor influence on the landslide in the investigated area. The lithological units have little influence in LSM, but the Yasin group and Quaternary alluvium (Figure 7) are the most susceptible formations [6,69,90] in the research area. Figure 8 depicts the outcomes of using the two models for LSM that were identified using the LPI. The greater the LPI, the more likely it is that a landslide will occur [91]. Using the natural break method [80], we divided the probability value of landslide susceptibility: very low, low, moderate, high, and very high. The XGBoost model classified 6.81% as very high susceptibility, while values for very low, low, moderate, and high susceptibility were 19.46%, 27.59%, 25.37%, and 20.77%, respectively ( Figure 9). The RF model classified 5.41% as very high susceptibility, while values for very low, low, moderate, and high susceptibility were 16.50%, 25.53%, 30.12%, and 22.44%, respectively ( Figure 9).    In the training stage, a confusion matrix depicts the ability of the XGBoost and RF models. Table 4 shows the consequences of the confusion matrix for the XGBoost and RF models. XGBoost model displays a high accuracy (0.934) with AUC (0.728) value in the research area. The receiver operating characteristic (ROC) technique has been used for validation, which is valid [31]. In this technique, a ROC curve creates by plotting between 'sensitivity' versus 'specificity' on cut-off values, but it does not explain the model's accuracy thoroughly; therefore, the AUC of the ROC curve is applied to assess the quantitative functioning of the model [92]. The results show the AUC of the prediction rate curve were found at 72.88% and 69.28% for XGBoost and RF models, respectively ( Figure 10). In the training stage, a confusion matrix depicts the ability of the XGBoost and RF models. Table 4 shows the consequences of the confusion matrix for the XGBoost and RF models. XGBoost model displays a high accuracy (0.934) with AUC (0.728) value in the research area. The receiver operating characteristic (ROC) technique has been used for validation, which is valid [31]. In this technique, a ROC curve creates by plotting between 'sensitivity' versus 'specificity' on cut-off values, but it does not explain the model's accuracy thoroughly; therefore, the AUC of the ROC curve is applied to assess the quantitative functioning of the model [92]. The results show the AUC of the prediction rate curve were found at 72.88% and 69.28% for XGBoost and RF models, respectively ( Figure 10).

PS-InSAR-Based Validation
The PS-InSAR techniques were used to check the displacement in the area to validate and verify the models. Because of its extensive high spatial-temporal resolution, spatial coverage, and operational capability in all-weather circumstances, the interferometric

PS-InSAR-Based Validation
The PS-InSAR techniques were used to check the displacement in the area to validate and verify the models. Because of its extensive high spatial-temporal resolution, spatial coverage, and operational capability in all-weather circumstances, the interferometric synthetic aperture radar (InSAR) technique has been widely documented for detecting and monitoring mass movements over the last decade [93]. Many types of research associated with PS-InSAR have been carried out to identify the temporal or spatial landslide distortion patterns or the kinematic resolution of slow-moving landslides to assess the magnitude of slow-moving landslides [94]. This research collected and processed images from the Sentinel-1A IW sensor (one-year images with 12 days of temporal resolution) in a SARPROZ software environment. The calculated result was compared to the susceptibility model derived by the XGBoost and RF methods.
The line of sight (LOS) deformation velocity (VLos) was calculated in PS-InSAR processing using 0.7 as the coherent threshold, as shown in Table 5. V LOS depicts the deformation of only one direction based on the satellite's LOS, calculated to evaluate slope orientation velocity (Vslope). In the case of landslide assertion, most landslide or ground surface displacements occur along the way of steep terrain, and therefore Vslope is the assess constituent used to determine landslide progression. The determined Vslope for both ascending and descending paths was combined. An ultimate deformation map ( Figure 11) was extracted, showing the only displacements in the highly susceptible zone-produced susceptible model by XGBoost (Figure 12). The PSI results found most of the mapped landslides in deforming areas, but slow-moving landslides were predicted more accurately due to the Sentinel-1A sensor's long revisiting period. The Vslope mode was converted into 12.5 × 12.5 grid cells to generate a more refined LSM, corresponding to the model developed by the XGBoost method. The contingency matrix was applied to a Vslope and XGBoost-based susceptibility model to improve an LSM for the area (Figure 13). In other words, the degree of difference for each cell was compared using the newly produced LSM, and the LSM was developed using the XGBoost model.   The Vslope mode was converted into 12.5 × 12.5 grid cells to generate a more refined LSM, corresponding to the model developed by the XGBoost method. The contingency matrix was applied to a Vslope and XGBoost-based susceptibility model to improve an LSM for the area (Figure 13). In other words, the degree of difference for each cell was compared using the newly produced LSM, and the LSM was developed using the XGBoost model.
The recently generated map showed 11.52% of the research area as very high susceptibility. In contrast, very low, low, moderate, and high susceptibility class values were 16.64%, 29.23%, 22.30%, and 20.31%, respectively ( Figure 13). The recently generated map showed 11.52% of the research area as very high susceptibility. In contrast, very low, low, moderate, and high susceptibility class values were 16.64%, 29.23%, 22.30%, and 20.31%, respectively ( Figure 13). Figure 13 shows some areas with a significant increase in landslide susceptibility. We now present a thorough examination of two locations. Place 1,2,3 is a hilly area on the north side of Gilgit, mostly made up of Quaternary deposits, which are loose. The rain and wind have a significant impact on it. The slope is unstable all around, and the majority of it is steeper less than 30 degrees. After long-term immersive experience loosening of the bank slope, the mechanical and physical specifications of the soil are reduced. The soil and rocks have become steeper and less stable as the water level variations reciprocally, and the waves are stripped away. A specific level of local slip, failure, and destabilization events will occur at some point. PS-InSAR deformation exposes a higher deformation rate, and evaluation of the landslide hazard after enhancement by the PS-InSAR confirms an enhancement ( Figure 14). Remote Sens. 2021, 13, x FOR PEER REVIEW 18 of 26 Figure 13. The correction matrix was used to refine the landslide susceptibility model via Vslope. Figure 13 shows some areas with a significant increase in landslide susceptibility. We now present a thorough examination of two locations. Place 1,2,3 is a hilly area on the of it is steeper less than 30 degrees. After long-term immersive experience loosening of the bank slope, the mechanical and physical specifications of the soil are reduced. The soil and rocks have become steeper and less stable as the water level variations reciprocally, and the waves are stripped away. A specific level of local slip, failure, and destabilization events will occur at some point. PS-InSAR deformation exposes a higher deformation rate, and evaluation of the landslide hazard after enhancement by the PS-InSAR confirms an enhancement ( Figure 14). Place 2 is near the Chalt area, with high slope and elevation and high tectonic activity due to the MKT Fault; earthquakes are common and are susceptible to causing landslides. Previous research suggested that an earthquake triggered the landslide in this case. The substance in the landslide area started to move as an outcome of extra seismic forces. The crack appears, resulting in rapid mobility and demolishing of the road. The most recent PS-InSAR analysis showed numerous displacement places, indicating a higher probability Place 2 is near the Chalt area, with high slope and elevation and high tectonic activity due to the MKT Fault; earthquakes are common and are susceptible to causing landslides. Previous research suggested that an earthquake triggered the landslide in this case. The substance in the landslide area started to move as an outcome of extra seismic forces. The crack appears, resulting in rapid mobility and demolishing of the road. The most recent PS-InSAR analysis showed numerous displacement places, indicating a higher probability of landslides in areas with low landslide susceptibility. This difference is highlighted by the improved LSM, which rectifies the very low landslide sensitivity area to the very high landslide sensitivity area ( Figure 15). of landslides in areas with low landslide susceptibility. This difference is highlighted by the improved LSM, which rectifies the very low landslide sensitivity area to the very high landslide sensitivity area ( Figure 15).

Discussion
The findings demonstrate that the XGBoost and RF data mining techniques have comparable precision for LSM along the KKH, with the XGBoost outperforming the RF in terms of accuracy and AUC value. Our results conform to the results outline in work [6,7,10,44,69,70].
The overall precision values found in this research (93.4%) were compared to the consequence of Pradhan and Kim [42]; it was discovered that the precision values found in this study were higher than the XGBoost (76.73%) and deep neural network precision consequences (83.71%) of the revealed research. Cao et al. [95] evaluated the XGBoost algorithm for landslide, debris, unstable slope, and rockfall hazards in Jiuzhaigou, China, and compared it to the results of the RF and SVM. For LS mapping, the XGBoost algorithm outperformed the RF but performed in a slightly inferior way in comparison with the SVM. On the contrary, the method produced the best accuracy consequences for the other three hazards tested. Sahin et al. [43] found XGBoost (83.36%) consequences are lower than the results obtained here. Arabameri et al. [96] compared the RF, SVM, and LR

Discussion
The findings demonstrate that the XGBoost and RF data mining techniques have comparable precision for LSM along the KKH, with the XGBoost outperforming the RF in terms of accuracy and AUC value. Our results conform to the results outline in work [6,7,10,44,69,70].
The overall precision values found in this research (93.4%) were compared to the consequence of Pradhan and Kim [42]; it was discovered that the precision values found in this study were higher than the XGBoost (76.73%) and deep neural network precision consequences (83.71%) of the revealed research. Cao et al. [95] evaluated the XGBoost algorithm for landslide, debris, unstable slope, and rockfall hazards in Jiuzhaigou, China, and compared it to the results of the RF and SVM. For LS mapping, the XGBoost algorithm outperformed the RF but performed in a slightly inferior way in comparison with the SVM. On the contrary, the method produced the best accuracy consequences for the other three hazards tested. Sahin et al. [43] found XGBoost (83.36%) consequences are lower than the results obtained here. Arabameri et al. [96] compared the RF, SVM, and LR methods to a combination of genetic algorithm XGBoost (GE-XGBoost) methods for gully erosion susceptibility mapping in Iran. Compared to the RF, SVM, and LR methods, the GE-XGBoost process improved classification accuracy (89.56%). Zhang et al. [97], when comparing neural networks, decision trees, and the RF, obtained the best results for debris flow susceptibility with the XGBoost method in Shigatse Area, China. In areas with debris flow and rockfalls, we discovered that the XGBoost model is the best predictive technique for LSM.
InSAR methods for generating and updating landslide inventories have recently been developed [98]. The results of the InSAR methods are considered more accurate [99], producing high production accuracy susceptibility maps. The PS-InSAR method for calculating deformation velocity was used for testing the distribution of landslide movement in the area in the range of one year. The LOS velocity results show only the velocities along the slopes in both model's high susceptible zones. This study also demonstrated that PS-InSAR is an effective method for monitoring slow landslide movement and non-vegetation areas. On the basis of susceptibility classes, we classified the LSM produced by the XGBoost, RF, and PS-InSAR methods into five zones (Figures 9 and 13). The very high susceptibility class of all methods had 6.81% (XGBoost), 5.41% (RF), and 11.52% (PS-InSAR) of landslide mapped for this study, illustrating the model's accuracy. The other technique used to assess prediction capability was the ROC curve AUC, which predicted 72.8% and 69.2% for XG-Boost and the RF models, respectively, demonstrating the model's high accuracy. Using the natural fracture method [80], we categorized the extracted susceptibility map into five categories: very low, low, moderate, high, and very high. In comparison, the XGboost model was more effective in calculating the importance of each factor in triggering landslides.
The RF model and XGBoost were used in this study to conduct LSM. Still, the outcomes had many constraints that produced misclassification, such as (1) the efficiency of the landslide inventory, and (2) the efficiency of data related to each landslide variable. Concerning the former, only 74 landslides have been mapped for this study area due to the harsh environment along KKH. It caused significant misclassification errors for the landslide susceptibility mapping, highlighting the importance of improving the landslide susceptibility map using PS-InSAR outcomes. Remarkably, the newly LSM, when combined with the PS-InSAR results, decreased misclassification in which landscape influenced by slope deformity was categorized as very low and low vulnerability. A further concern is that the landslide susceptibility mapping only depicts forecast landslide dispersion in areas and not interactive displacement mechanisms through time. On the other hand, variations in landslide behavior through time are a significant issue for decision makers [100]. In combination with the PS-InSAR outcomes, a new landslide susceptibility map can reflect the actual condition of landslides, and it can be intended for quantitative hazard assessment and preparatory landslide mapping at the provincial level [101].
The LSM develops a landslide susceptibility map, identifies the significant variables that cause landslides, and assesses their contribution and effect [102]. Geology, precipitation, land cover, distance to road, slope, distance to stream, NDVI, distance to faults, roughness, curvature, TWI, profile curvature, aspect, plan curvature, and elevation were used to assess susceptibility in this study. According to Figure 7, the main contributors of landslides in the area are slope and elevation, precipitation, distance to fault, roughness, and TWI. The distance to the road, aspect, and geology of the landslide in the research area had little influence.
The elevation is a major triggering factor for landslides in the area (Figure 7). Weathered rocks and intermediate elevation typically categorize high elevation zones, and slopes are usually covered by thin colluvium, making them more susceptible to landslides [87]. Because active fault and shear zones exert substantial control over landslide activity in the area [7], the buffer class closest to the fault line is the most vulnerable (Figure 7). The lithological units have little influence in LSM, but the Yasin group and Quaternary alluvium (Figure 7) are the most susceptible formations [6,69,90] in the research area. Barren land is in direct contact with climatological variables such as sun and rainfall, which causes rock weathering and increases the risk of landslides [88]. Monsoon rainfall is a significant causative factor in the area [89], causing the majority of the debris flow, rockfalls, and other slides. In this study, annual average precipitation data were used, which did not show precipitation as a significant causative factor but did find more landslides in areas with high precipitation (Figure 7). Previous research in the area, such as [7], relied solely on statistical models. Consequently, the forwarded susceptibility model is not very efficient according to the model.

Conclusions
Landslides have been one of Pakistan's most severe natural adversities, causing significant deaths and socio-economic losses annually. Thus far, the process of landslide mapping has been quite complicated and volatile to predict the vulnerability of landslides accurately and timely in the most susceptible areas. For this purpose, various attempts have been made to enhance predictability on the basis of several forecast models for mapping the landslides vulnerability targeting different areas. It is essential for decision-makers to create more appropriate landslide vulnerability maps in order to enhance the efficiency of the prediction model. This study focused on comprehensive mapping of landslides to identify the primary reasons for landslides and delineate high-risk zones, which will be helpful in the future to medicate landslide hazards in the area. The novel aspect of this study is that it generates more accurate LSM using ML models validated by PS InSAR processing.
This research used XGBoost and RF ML methods for LSM of the Karakoram Highway, which was improved using the PS-InSAR methodology. Geology, precipitation, land cover, distance to road, slope, distance to stream, NDVI, distance to faults, roughness, curvature, TWI, profile curvature, aspect, plan curvature, and elevation were used to assess susceptibility in this study. The main contributors of landslides in the area are slope and elevation, precipitation, distance to fault, roughness, and TWI. Excellent forecasting results were obtained. The developed susceptibility model will serve as a guide to identify low-risk zones for infrastructure development and better management planning along the KKH. Geotechnical and other slope stabilization techniques are required in a high-risk area to reduce landslide disasters in the future. We assume that our method can provide valued inputs to strategies endorsing highway safety.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Data Availability Statement:
The data presented in the study are available on request from the first and corresponding author. The data are not publicly available due to the thesis that is being prepared from these data.

Conflicts of Interest:
The authors declare no conflict of interest.