Monitoring Oil Exploitation Infrastructure and Dirt Roads with Object-Based Image Analysis and Random Forest in the Eastern Mongolian Steppe

Abstract: Information on the spatial distribution of human disturbance is important for assessing and monitoring land degradation. In the Eastern Mongolian Steppe Ecosystem, one of the major driving factors of human-induced land degradation is the expansion of road networks, mainly due to the intensification of oil exploration and exploitation. So far, neither the extent of road networks nor the extent of surrounding grasslands affected by the oil industry has been monitored, because such monitoring is generally labor-intensive. As a result, no information is available on changes in the area affected by these disturbance drivers. Consequently, the aim of this study is to provide a cost-effective methodology to classify infrastructure and oil exploitation areas from remotely sensed images using object-based classifications with Random Forest. By combining satellite data with different spatial and spectral resolutions (PlanetScope, RapidEye, and Landsat ETM+), the product delivers data since 2005. Segment properties, spectral characteristics, and indices were extracted from all of the above-mentioned imagery and used as predictor variables for the classification. Results show that overall accuracies of land use maps ranged from 73% to 93%, mainly depending on the satellites' spatial resolution. Since 2005, the area of grassland disturbed by dirt roads and oil exploitation infrastructure increased by 88%, with its highest expansion of 47% in the period 2005–2010. Settlements and croplands remained relatively constant throughout the 13 years. Comparison of the multiscale classifications suggests that, although high spatial resolutions are clearly beneficial, all datasets were useful to delineate linear features such as roads. Consequently, the results of this study provide an effective evaluation of the potential of Random Forest for extracting relatively narrow linear features such as roads from multiscale satellite images, and map products that can be used for detailed land degradation assessments.


Introduction
Land degradation is defined by the Food and Agriculture Organization of the United Nations [1] as "a reduction in the capacity of the land to provide ecosystem goods and services over a period of time for its beneficiaries". Degradation is commonly caused by the mismanagement or over-exploitation of natural resources, such as vegetation clearance, nutrient depletion, overgrazing, inappropriate irrigation, excessive use of agrochemicals, urban sprawl, pollution, or other direct impacts, such as mining, quarrying, trampling, or vehicle off-roading [2]. Consequently, the drivers of land degradation can be separated into those caused by nature (e.g., landslides, drought, floods) or by anthropogenic [...] knowledge, there is no study on land cover classification of linear features which was conducted combining PlanetScope and RapidEye as high spatial resolution data with Landsat data providing a longer time series.
The major aim of the study is to provide a satellite-based land-use classification and monitoring product that is useful for detecting land-use types, such as infrastructure, which are associated with grassland degradation in the Eastern Mongolian Steppe Ecosystem. In this respect, we only use data that are free of charge in order to minimize future monitoring costs. Specifically, this study aimed to (i) evaluate the suitability of machine learning based supervised object-oriented classification techniques and multiscale and multispectral remote sensing data to detect grasslands disturbed by dirt roads, road construction, oil extraction fields and other infrastructure, (ii) examine the predictive power of multispectral bands, spectral indices and segment properties provided by PlanetScope, RapidEye, and Landsat for linear object classification, and (iii) assess the drivers of land degradation by analysing recent land use changes in the Eastern Mongolian Steppe. Special care is taken to quantify the uncertainty arising from the combination of data from sensors that differ in their spectral and spatial configurations.

Remote Sens. 2020, 12, 144

Study Area
The study was conducted in the Menen Steppe (Menengyn Tal) and the Khalkh river area (Khalkh Gol), which are part of the Eastern Mongolian Steppe in the Dornod province of Mongolia (Figure 1). The study area comprises approximately 20,000 km² between 46°33′ N and 48°02′ N in latitude and 115°46′ E and 118°48′ E in longitude. The steppe mainly consists of broad plains and rolling hills where the vegetation is dominated by Artemisia sp. and bunch grasses like Stipa sp. The area is characterized by an extremely continental climate. The average monthly minimum temperature ranges from −20 to −24 °C in January, and the average monthly maximum temperature of 18–22 °C occurs in July. The average annual precipitation amounts to 200–300 mm, with monthly maxima mainly occurring in boreal summer. The major land use types in this area are rangelands, croplands, mines and a few permanent settlements [25,26]. The whole study area is covered by oil exploration licenses, divided into several blocks that encompass different areas and were granted in different years. For instance, blocks XIX, XXI, and XXII were issued under the production sharing contract in 1993 and 1995 (Figure 1). In addition, two oil extraction sites were constructed in Toson Uul XIX and Matad XXI in 2003 and 2005 [27,28].

Data Collection
The satellite data used for this study are listed in Table 1. In order to estimate the recent change in transportation infrastructure, a set of 310 multispectral PlanetScope and RapidEye scenes for the years 2010–2018 was downloaded free of charge from the Planet Labs, Inc. website (www.planet.com) as part of the Education and Research program [29]. The imagery was selected according to its acquisition date, spatial resolution, and cloud coverage. To avoid major differences in phenology, all selected satellite scenes were acquired during summer and autumn. The satellite images were already georeferenced to the Universal Transverse Mercator (UTM) projection, zone 50 north, on the WGS84 datum. The information about disturbed grassland change in the study area was obtained by examining a composite of cloud-free multispectral images acquired in 2005 and 2007 by the Landsat Enhanced Thematic Mapper Plus (ETM+; bands 1–8). The data were downloaded free of charge from the United States Geological Survey's (USGS) website (https://earthexplorer.usgs.gov/) [30]. For each year, two Landsat scenes (path/row 124/27 and 125/27) covered the study area; they were selected such that cloud cover was below 10% and were merged to obtain one complete dataset.

Data Preparation and Pre-Processing
Since the area of investigation was covered by multiple scenes irrespective of the sensor, a top-of-atmosphere reflectance calibration was applied to all data in each year using the sun elevation at the acquisition time and the sensor gain and bias for each band and scene. Subsequently, single scenes of the same satellite were merged into separate mosaics to obtain datasets covering the entire area of investigation. For Landsat ETM+ data, two additional steps were performed. (1) To fill missing data due to the scan line failure of Landsat ETM+, a global linear histogram matching technique (called SLC Gap-Filled Products Phase One Methodology) was applied [31]. (2) Afterwards, the spatial resolution of the Landsat images was enhanced to 15 m by applying the Gram-Schmidt spectral-sharpening algorithm, a pansharpening technique that increases the spatial resolution of multispectral images [32].
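
The calibration formula itself is not given in the text. As a minimal sketch, assuming the commonly used two-step conversion (digital number to at-sensor radiance via gain and bias, then radiance to reflectance using the solar zenith angle derived from the sun elevation), the calculation could look as follows. The function name, the parameter names, and all numeric values are illustrative assumptions, not values from this study.

```python
import math

def toa_reflectance(dn, gain, bias, esun, sun_elevation_deg, earth_sun_dist_au=1.0):
    """Convert a raw digital number (DN) to top-of-atmosphere reflectance.

    esun is the mean exo-atmospheric solar irradiance of the band and
    earth_sun_dist_au the Earth-Sun distance in astronomical units
    (both must come from the sensor documentation and acquisition date).
    """
    # Step 1: DN -> at-sensor spectral radiance using the band's gain and bias.
    radiance = gain * dn + bias
    # Step 2: the solar zenith angle is the complement of the sun elevation.
    zenith = math.radians(90.0 - sun_elevation_deg)
    # Step 3: normalise by the solar irradiance reaching a horizontal surface.
    return math.pi * radiance * earth_sun_dist_au ** 2 / (esun * math.cos(zenith))

# Hypothetical values for one pixel of one band (not taken from the study).
reflectance = toa_reflectance(dn=100, gain=0.5, bias=1.0, esun=1500.0,
                              sun_elevation_deg=55.0)
```

Because the sun elevation enters through the cosine of the zenith angle, scenes acquired at a lower sun elevation receive a larger correction, which is what makes scenes from different dates comparable before mosaicking.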

Supervised Classification Approach
To investigate the change in road networks in Eastern Mongolia during the last decade, supervised classification approaches were performed on all acquired imagery. In contrast to widely used pixel-based approaches, we performed an object-oriented classification to avoid the well-known "salt and pepper effect", which arises when the targets to classify are larger than the spatial resolution of the imagery [18]. A recent comprehensive review of supervised object-based land-cover image classification found that the use of Random Forest (RF) in the supervised object-based framework is advancing rapidly and shows the best performance in land cover classification [19]. The following four land cover classes were distinguished because they represent the dominant land use categories of the study area: (1) dirt roads and petroleum extraction infrastructure sites, (2) croplands, (3) natural grasslands, and (4) settlement areas. The supervised classification approach involved five main steps (Figure 2): (i) segmentation of the whole study area into objects, (ii) feature extraction on a segment basis to build a training samples database, (iii) training of Random Forest classifier models with the training samples database, (iv) classification of each segment as dirt road or another land use, and (v) accuracy assessment of the classified results using equalized stratified random sampling and Cohen's Kappa scores.
The satellite data in this study feature different spatial resolutions, which might affect the accuracy of the classification results and could therefore distort the analysis of land-use changes over time. To quantify this uncertainty, the spatial resolutions of the PlanetScope (3 m) and RapidEye (5 m) images were reduced to 15 m to match the resolution of Landsat. These images are termed "spatially binned" datasets in the following. Afterward, the same analysis was performed on the images with artificially reduced spatial resolution, and the results were compared to the classifications obtained from the data at their original, higher spatial resolutions. To save computational time, this was only conducted within a test area instead of the full Eastern Mongolian Steppe ecosystem.
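
The resampling method used to create the spatially binned datasets is not stated. The sketch below assumes simple block averaging, in which each 15 m pixel is the mean of the non-overlapping block of native pixels it covers (5 × 5 for PlanetScope, 3 × 3 for RapidEye). The function name and the toy image are hypothetical.

```python
def bin_image(pixels, factor):
    """Aggregate a 2-D grid by averaging non-overlapping factor x factor blocks."""
    rows, cols = len(pixels), len(pixels[0])
    if rows % factor or cols % factor:
        raise ValueError("image size must be a multiple of the binning factor")
    binned = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            # Collect all native pixels that fall into one coarse pixel.
            block = [pixels[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            out_row.append(sum(block) / len(block))
        binned.append(out_row)
    return binned

# Binning factors implied by the resolutions given in the text.
FACTOR_PLANETSCOPE = 15 // 3   # 3 m -> 15 m
FACTOR_RAPIDEYE = 15 // 5      # 5 m -> 15 m

# Toy 6 x 6 "RapidEye" band: a one-pixel-wide bright road in column 1.
toy = [[1.0 if c == 1 else 0.0 for c in range(6)] for _ in range(6)]
coarse = bin_image(toy, FACTOR_RAPIDEYE)
```

In the toy example the road signal is diluted to one third of its native contrast in the coarse pixels it crosses, which illustrates why narrow linear features become harder to separate from grassland after binning.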

Segmentation
Segmentation is a key step in any object-based classification workflow. This process groups neighboring pixels into objects if the pixels are similar in their spectral and spatial characteristics. In our study, the images of PlanetScope, RapidEye, and Landsat ETM+ were segmented using the mean shift algorithm, which was developed by Fukunaga and Hostetler and generalized by Cheng [33,34], as implemented in version 2.3 of the ArcGIS Pro software. After testing several band combinations for their suitability to segment roads in the images, a true color RGB composite was chosen as the best option for segmentation based on visual interpretation of the results and expert knowledge. Mean shift is an iterative procedure that shifts each data point to the average of the data points in its neighborhood. In this study, the segmentation process used a moving window within which average pixel values were calculated to decide which pixels are grouped into objects. As the window moves over the image, it iteratively re-computes this value to ensure that each segment is consistent. The characteristics of the image segments are determined by adjusting three parameters: (i) spectral detail, (ii) spatial detail, and (iii) minimum segment size. In order to capture the spectral and spatial differences of features in each image, the values of the segmentation parameters were adjusted on the basis of previous experience, visual analysis of different results, and literature reviews. For instance, it is suggested that a higher value of spatial and spectral detail is appropriate if the object to be classified is smaller than the spatial resolution of the satellite image [35,36]. Therefore, the values of spectral and spatial detail were set to 18 for all imagery (possible range: 1–20).
The size of objects is defined by the minimum segment size parameter, which merges segments smaller than this criterion with their best fitting neighboring segment. This parameter was set to 30 pixels for PlanetScope, 20 pixels for RapidEye, and 10 pixels for Landsat ETM+; these values were derived from the minimum mapping unit, taken as the average width of linear features. As a result of the segmentation process, a set of attributes was generated for each segment within the input image. This set includes mean, standard deviation, segment size, active chromaticity color, rectangularity, and compactness. "Segment size" indirectly defines the scale of the segments, while compactness and rectangularity describe the degree to which a segment is compact or circular and the degree to which it is rectangular, respectively. Active chromaticity color is the average RGB color value derived from the input image. The segmentation quality was evaluated by means of visual inspection of the results and expert knowledge.
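
To illustrate the mean shift idea described above, the following minimal sketch applies it to one-dimensional brightness values with a flat kernel. It is not the ArcGIS Pro implementation, which works on spectral and spatial information jointly; the function names, the bandwidth, and the sample values are assumptions chosen for illustration.

```python
def mean_shift_1d(values, bandwidth, max_iter=100, tol=1e-6):
    """Shift every point to the mean of the samples near it until nothing moves."""
    points = list(values)
    for _ in range(max_iter):
        shifted = []
        for p in points:
            # Flat kernel: neighbours are the original samples within the bandwidth.
            neighbours = [v for v in values if abs(v - p) <= bandwidth]
            shifted.append(sum(neighbours) / len(neighbours) if neighbours else p)
        moved = max(abs(a - b) for a, b in zip(points, shifted))
        points = shifted
        if moved < tol:
            break
    return points

def label_modes(modes, bandwidth):
    """Give points that converged to (almost) the same mode the same label."""
    centres, labels = [], []
    for m in modes:
        for idx, centre in enumerate(centres):
            if abs(m - centre) <= bandwidth / 2:
                labels.append(idx)
                break
        else:
            # No existing centre is close enough: start a new group.
            centres.append(m)
            labels.append(len(centres) - 1)
    return labels

# Hypothetical reflectance values: three dark grassland pixels, three bright road pixels.
brightness = [0.10, 0.12, 0.11, 0.80, 0.82, 0.81]
modes = mean_shift_1d(brightness, bandwidth=0.1)
labels = label_modes(modes, bandwidth=0.1)
```

The bandwidth plays a role loosely comparable to the detail parameters discussed above: a narrower neighborhood yields more, smaller groups.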

Training Samples
The feature samples were collected from segmented PlanetScope images to build the training datasets. For this purpose, segments derived from PlanetScope data in 2018 were randomly and manually chosen, and their boundary, homogeneity, and heterogeneity values were included in the training dataset. The training sample data comprised 500 objects, which were assigned to dirt roads and oil extraction infrastructure (200), natural grassland (100), cropland (100), and settlement area (100). The merging process was performed using the training tools available in ArcGIS Pro 2.3, which combined the training polygons of each of the four land use types with the selection of optimal object features and the segmented layer. The training datasets for the classifications of previous years' RapidEye and Landsat images were generated from the classified data of PlanetScope images in 2018 (for further details on the classification see previous subsection, Figure 2). The selection of the training samples was repeated for each year using the classified results of the next year as reference. Fifty thousand randomly selected seed points were used to generate training data from an existing classified raster of PlanetScope imagery (Figure 2). When generating the training samples from seed points, the parameters maximum sample radius and minimum sample area had to be adjusted. After testing several combinations of parameter values, the maximum sample radius, which defines the longest distance from any point within the training sample to its center seed point, was set to 50 meters. The minimum sample area was set to 30 square meters. Furthermore, the spectral characteristics and indices were captured by a set of attributes extracted separately from the time series of multispectral images. In particular, several predictor variables were calculated from the four, five, and eight multispectral bands of PlanetScope, RapidEye, and Landsat ETM+ images, respectively. The multispectral bands were associated with the polygons of the corresponding segmented image.
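
Associating band values with segments amounts to computing per-segment statistics. A minimal sketch of that step is given below, assuming a label grid that assigns each pixel a segment identifier and using the population standard deviation; the function name, the grids, and the choice of standard deviation are illustrative assumptions.

```python
from collections import defaultdict
from statistics import mean, pstdev

def segment_features(labels, band):
    """Return {segment_id: (mean, std, pixel_count)} for one spectral band."""
    groups = defaultdict(list)
    for label_row, band_row in zip(labels, band):
        for seg_id, value in zip(label_row, band_row):
            # Collect every pixel value under the segment it belongs to.
            groups[seg_id].append(value)
    return {seg: (mean(vals), pstdev(vals), len(vals))
            for seg, vals in groups.items()}

# Hypothetical 2 x 3 label grid and one band of reflectance values.
segment_ids = [[1, 1, 2],
               [1, 2, 2]]
red_band = [[0.2, 0.4, 0.8],
            [0.3, 0.6, 0.7]]
features = segment_features(segment_ids, red_band)
```

Repeating this for every band gives one mean and one standard deviation per band and segment, which is why the total number of predictors depends on the number of bands of the respective sensor.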

Random Forest Classifier
The random forest classification method [37] was applied to all satellite scenes of each year. It is an extension of the classification trees algorithm [38] and belongs to the ensemble learning methods. The algorithm creates individual decision trees automatically based on tree-wise randomly chosen samples and subsets of the training data. For a random forest model, many classification trees are grown and the classification result is derived from a vote of all trees. Each classification tree is constructed separately by an individual learning algorithm from a random sample of the training data set. At each node of a classification tree, the best split is selected from a random subset of the predictor variables. All trees are grown to the largest possible extent, which is controlled by the node size set by the user. To run the random forest classification, it is necessary to define several important adjustable parameters. The primary parameters are the number of classification trees in the forest (ntree), the number of randomly selected variables considered at each split (mtry), and the depth of each tree in the forest. The predictor variables are listed in Table 2. Please note that the total number depended on the number of bands of the respective satellite.
Table 2 (segment properties).
Compactness: the degree to which a segment is compact or circular.
Rectangularity: the degree to which a segment is rectangular.
Count: the number of pixels comprising the segment.
1 Variables were calculated for each of the four multispectral bands of PlanetScope, the five multispectral bands of RapidEye and the eight multispectral bands of Landsat.
2 The calculation of indices was performed on the multispectral bands of Landsat and RapidEye only.
3 The calculations are cited from References [39–42].
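
The index formulas themselves are given in the cited references rather than in the text. As a hedged illustration, the sketch below uses the commonly published definitions of NDVI and SAVI and assumes that the red-edge NDVI simply replaces the red band with the red-edge band. PVI is left out because it requires soil-line coefficients that are not stated here. All input values are hypothetical reflectances.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return (nir - red) / (nir + red)

def ndvi_red_edge(nir, red_edge):
    """NDVI variant with the red-edge band in place of red (assumed definition)."""
    return (nir - red_edge) / (nir + red_edge)

def savi(nir, red, soil_factor=0.5):
    """Soil Adjusted Vegetation Index; soil_factor = 0.5 is the usual default."""
    return (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)

# Hypothetical mean reflectances of a grassland segment and a dirt road segment.
grass = {"nir": 0.50, "red": 0.10}
road = {"nir": 0.30, "red": 0.25}
ndvi_grass = ndvi(grass["nir"], grass["red"])
ndvi_road = ndvi(road["nir"], road["red"])
```

With these toy values the vegetated segment has a much higher NDVI than the bare road surface, which is the contrast such indices are meant to capture.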
The importance of a given variable to the overall model was calculated as the percentage of tree votes for the correct class across all trees. In this study, the random forest algorithm [43] implemented in version 2.3 of the ArcGIS Pro software was used. The number of trees (ntree) was set to 2000 for each classification. Default values were used for the number of samples used to define each class and for the maximum depth of each tree in the forest. The satellite imagery was classified separately using the spectral and spatial variables and the training sample data, which were delineated from the single-date and multi-temporal imagery, respectively.
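
The three random ingredients of a random forest described above (a bootstrap sample per tree, a random subset of predictors per split, and a final vote over all trees) can be sketched in a few lines. This is only an illustration of those steps, not the ArcGIS Pro implementation and not a complete classifier; the predictor names and the tree outputs are hypothetical.

```python
import random
from collections import Counter

def bootstrap_sample(samples, rng):
    """Draw len(samples) items with replacement, as done once per tree."""
    return [rng.choice(samples) for _ in samples]

def random_feature_subset(feature_names, mtry, rng):
    """Pick the mtry candidate predictors considered at one split."""
    return rng.sample(feature_names, mtry)

def majority_vote(tree_predictions):
    """Return the class predicted by most trees and its share of the votes."""
    counts = Counter(tree_predictions)
    winner, votes = counts.most_common(1)[0]
    return winner, votes / len(tree_predictions)

rng = random.Random(42)
predictors = ["mean_blue", "mean_green", "mean_red", "mean_nir"]  # hypothetical names
candidates = random_feature_subset(predictors, mtry=2, rng=rng)
training_ids = bootstrap_sample(list(range(10)), rng)

# Hypothetical class labels returned by five trees for one segment.
label, share = majority_vote(["road", "road", "grassland", "road", "cropland"])
```

With ntree set to 2000, as in this study, the vote is taken over 2000 such predictions per segment instead of five.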

Accuracy Assessment and Validation
To assess the accuracy of the classification results, stratified random sampling was applied to create a set of 500 random points independent from the training samples. Each point was manually assigned to its class based on high-resolution satellite imagery and expert knowledge. The performance of the classifications was assessed by constructing confusion matrices and calculating overall accuracies, Cohen's Kappa scores, and quantity and allocation disagreement [44–48].
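
The accuracy measures named above can all be derived from one confusion matrix. The sketch below uses the standard definitions of overall accuracy and Cohen's Kappa, and computes quantity and allocation disagreement from the row and column proportions. It assumes that raw sample counts are used directly, without weighting by stratum area, and the example matrix is hypothetical.

```python
def accuracy_metrics(matrix):
    """matrix[i][j]: number of points mapped as class i whose reference class is j."""
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    p = [[cell / n for cell in row] for row in matrix]
    mapped = [sum(p[i]) for i in range(k)]                          # row totals
    reference = [sum(p[i][j] for i in range(k)) for j in range(k)]  # column totals
    overall = sum(p[i][i] for i in range(k))
    # Agreement expected by chance from the marginal proportions.
    expected = sum(mapped[i] * reference[i] for i in range(k))
    kappa = (overall - expected) / (1.0 - expected)
    # Quantity disagreement: mismatch in the class proportions.
    quantity = 0.5 * sum(abs(mapped[i] - reference[i]) for i in range(k))
    # Allocation disagreement: mismatch in the spatial assignment.
    allocation = sum(min(mapped[i] - p[i][i], reference[i] - p[i][i])
                     for i in range(k))
    return {"overall": overall, "kappa": kappa,
            "quantity": quantity, "allocation": allocation}

# Hypothetical two-class matrix (road vs. other) with 100 validation points.
example = accuracy_metrics([[40, 10],
                            [5, 45]])
```

A useful check is that quantity and allocation disagreement together equal one minus the overall accuracy, so the two measures split the total error into its two sources.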
Two different sources were used for validation: (1) the base map images of ESRI (Environmental Systems Research Institute) and (2) Google Earth Pro images [49,50]. The two datasets differ in their temporal information and in the level of detail regarding the land use classes. The ground truth data for validation were generated by manual digitization. The objects were carefully selected on the basis of qualitative land use data for the Dornod province in Mongolia [51]. The values of the points were manually extracted and collected from high-resolution imagery of Google Earth Pro and ESRI (Figure 2).

Classification Results and Accuracy
Supervised classification was performed using the random forest classification algorithm on each image to calculate the area of grassland disturbed by dirt roads and petroleum extraction infrastructure, as well as the other land uses, for the years 2005, 2007, 2010, 2014, and 2018. As a prerequisite to supervised classification, training samples were generated for all the land use classes mentioned above for each image. The three-meter spatial resolution images of PlanetScope enabled the generation of training samples and validation data for the classification. The training samples for RapidEye and Landsat ETM+ were created from an existing result of the PlanetScope classification using seed points. The accuracy reports for each classified image included a confusion matrix and estimates of overall accuracy, kappa coefficient, user accuracy, producer accuracy, and quantity and allocation disagreement for each land use class. A summary of these metrics is shown in Table 1 (Figure 3a,b). The Cohen's Kappa scores were rated 0.65 and 0.75. Some narrow dirt roads or linear features were detected that were not connected to the main network. Those features are most probably falsely classified as roads.

Comparison of Multiscale Classification by Satellite Images and Up-Scaling Images
The overall accuracies of the native satellite imagery were 92.6% for PlanetScope, 87.2% for RapidEye, and 83.2% for Landsat ETM+. The overall accuracies of the spatially binned satellite imagery were 81.8% for PlanetScope (15 m) and 84.6% for RapidEye (15 m). The comparison of the original imagery shows that PlanetScope (3 m) has a higher accuracy than RapidEye (5 m) and Landsat (15 m). Comparing the original and spatially binned imagery of PlanetScope, the overall accuracy decreased by about 11 percentage points (from 92.6% to 81.8%) as a consequence of the reduced spatial resolution. The kappa value declined by 20% from the original to the spatially binned imagery. For the RapidEye classifications, both overall accuracy and kappa values decreased by 3–5% when the spatial resolution was reduced from 5 m to 15 m (confusion matrices of the classification accuracy are shown in Table A2).
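
Because a drop in accuracy can be expressed either in percentage points or relative to the native value, the short calculation below makes both explicit for the overall accuracies reported above. The function name is a hypothetical helper; the input numbers are the ones given in the text.

```python
def accuracy_change(native_pct, binned_pct):
    """Return the drop in percentage points and relative to the native accuracy."""
    points = native_pct - binned_pct
    relative = 100.0 * points / native_pct
    return points, relative

# Overall accuracies (native, spatially binned to 15 m) as reported in the text.
OVERALL_ACCURACY = {
    "PlanetScope": (92.6, 81.8),
    "RapidEye": (87.2, 84.6),
}

changes = {sensor: accuracy_change(native, binned)
           for sensor, (native, binned) in OVERALL_ACCURACY.items()}
for sensor, (points, relative) in changes.items():
    print(f"{sensor}: -{points:.1f} percentage points ({relative:.1f}% relative)")
```

The loss is roughly four times larger for PlanetScope than for RapidEye, in line with the larger change in pixel size (3 m to 15 m versus 5 m to 15 m).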
To analyze how the detection of linear features depends on spatial resolution, maps were compiled that allow a direct comparison. Figure 4 shows the areas classified as dirt roads and oil extraction sites and as grassland in the comparison plot of the Menen Steppe in 2018. From the difference maps, the minimum feature widths required for reliable detection were calculated in relation to the spatial resolution. Areas with high road densities, such as those where multiple parallel dirt road lanes exist, are commonly classified as one linear feature in the classifications of RapidEye (5 m), Landsat ETM+ (15 m), and the spatially binned imagery. This merging leads to an increase in the size of the classified area because pixels between the different lanes are not counted as grasslands. Table 3

Analysis of Variable Importance
The variable importance was identified separately for all three satellite data sources (Figure 5). For PlanetScope data, the active chromaticity colors of blue, green, and red were ranked as particularly important for classifying linear features. For RapidEye data, the variables derived from the red edge and near-infrared channels stand out as the most important ones, with contributions 2–3 times higher than those of the other bands for predicting linear features. Furthermore, the other spectral variables of the above-mentioned satellite data were generally ranked as relatively important for classification and prediction. Comparing variable importance in the models based on RapidEye and PlanetScope, the importance of the mean and standard deviation of the near-infrared band was twice as high in the former as in the latter. The largest difference was observed in the active chromaticity colors of blue, green, and red, which were 3–4 times more important in models based on PlanetScope than in those based on RapidEye.
For the Landsat imagery, the predictor variables derived from the visible light and short-wave infrared-2 spectral bands of Landsat ETM+ were among the most important variables for linear feature extraction (Figure 6). In particular, the variables of the blue and short-wave infrared-2 (SWIR 2) bands had high importance compared to all other predictors. In general, the variables derived from the NDVI, NDVI red-edge, PVI, and SAVI spectral indices featured low importance compared to the multispectral bands in both RapidEye and Landsat ETM+. In the classification of PlanetScope, the active chromaticity color variables from the segmentation were more important than the other spectral band variables. In contrast, the active chromaticity color variables from the segmentation, the spectral bands, and the indices in Landsat ETM+ were equally involved in the classification model.

Land Use Changes
As a result of this research, the first time series of land use data on the dirt road network, oil extraction infrastructure, cropland, and settlement area for Eastern Mongolia was derived from the satellite images. According to the data, the total grassland disturbed by dirt roads and oil extraction infrastructure increased sharply from 7840 ha in 2005 to 14 [...] (Table 4).

Relevance of the Approach
The spatial distribution of land use change in the period 2005-2018 was extracted by developing an object-based Random Forest classification method and applying it to long-term, multiscale, and multispectral remote sensing data. To detect changes in infrastructure in Eastern Mongolia since 2005, several images acquired by different satellite sensors were classified into "dirt road and oil extraction infrastructure site", "settlement area", "grassland", and "cropland". Since the sensors differ in spatial and spectral resolution, it is instructive to compare classification results among the different data sources. The highest overall accuracies, between 87% and 93%, were achieved using PlanetScope imagery, followed by RapidEye imagery with 85-91%. Classifications of Landsat ETM+ data yielded the lowest accuracies, with 73-85%. Regardless of the spatial resolution of the satellite sensor, the "cropland", "natural grassland", and "settlement area" classes were classified more accurately than the "dirt road and oil extraction infrastructure site" class, because these three classes are relatively homogeneous compared to the narrow, disconnected, linear "dirt road and infrastructure" class.
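The overall and class-wise accuracies discussed above are conventionally computed from a confusion matrix. The sketch below shows that calculation under stated assumptions: the function name, the class labels, and the sample counts are hypothetical and do not reproduce the study's validation data.

```python
def accuracy_metrics(matrix, labels):
    """Compute overall, producer's, and user's accuracy.

    matrix[i][j] is the number of reference samples of class i
    that the map assigned to class j (rows: reference, columns: map).
    """
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(labels)))
    overall = correct / total
    producers, users = {}, {}
    for i, label in enumerate(labels):
        row_total = sum(matrix[i])                  # all reference samples of class i
        col_total = sum(row[i] for row in matrix)   # all samples mapped as class i
        producers[label] = matrix[i][i] / row_total if row_total else float("nan")
        users[label] = matrix[i][i] / col_total if col_total else float("nan")
    return overall, producers, users


# Hypothetical confusion matrix with 50 reference samples per class.
labels = ["road_infrastructure", "settlement", "grassland", "cropland"]
matrix = [
    [40, 1, 8, 1],
    [1, 47, 1, 1],
    [4, 1, 44, 1],
    [1, 0, 2, 47],
]

overall, producers, users = accuracy_metrics(matrix, labels)
print(f"overall accuracy: {overall:.2f}")
for label in labels:
    print(label, round(producers[label], 2), round(users[label], 2))
```

In this made-up example the linear road class has the lowest producer's accuracy, mirroring the qualitative pattern described in the text.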
For the extraction of linear features, the method was well suited to identifying dirt roads and infrastructure from PlanetScope and RapidEye imagery; it produced accurate results and opens up the possibility of applying the methodology to other areas. The classification model thus achieved cost-effective results with high accuracy on the basis of different satellite data, which is consistent with the findings of other studies [22][23][24][52]. However, as the performance of the different classifications shows, the accuracies for dirt roads and linear infrastructure were strongly related to the spatial resolution of the images, through the influence of boundary pixels and because finer spatial resolution increases the spectral-radiometric variation of land cover types [53]. In contrast to previous studies applying object-based image analysis to detect roads, geometrical features contributed only marginally to the overall model [54]. One explanation could be that the homogeneous, undisturbed steppe ecosystems in Eastern Mongolia differ much more strongly from the infrastructure in their spectral signatures than is the case in areas with more complex land use, e.g., cities. The segmentations were performed on images with different spatial resolutions and the same band composition. Previous studies suggested that setting the spatial and spectral parameters to a high level can improve the segmentation results [35,55]. In the experimental stage, a suitable adjustment was determined from the influence of the spatial and spectral parameters on the segmentation of linear features. In this study, setting those parameters to a high level resulted in a decrease in the homogeneity of objects. In other words, increasing the spatial and spectral parameters meant that a large number of small objects were obtained after segmentation. Conversely, lower levels of those parameters increased the homogeneity of objects, which then failed to delimit the borders of linear features. Therefore, much attention was paid to selecting suitable parameters for each spatial resolution.
The classification results were validated using a ground truth dataset that was created manually by visual inspection of Google Earth Pro and ESRI basemap images. Several studies have shown that Google Earth imagery is a possible source of very-high-resolution imagery suitable for manually deriving a reference dataset to assess the accuracy of remotely sensed image classifications [56,57]. In this study, the Google Earth and ESRI basemaps provided cost- and time-effective data sources, especially given that the study covered an extremely large and remote area in Eastern Mongolia which is generally difficult to access. On the other hand, these sources offer few archived long-term very-high-resolution images, which makes them difficult to use in long-term image analysis. However, by selecting as reference data only objects that were not subject to change during the study period, both datasets were suitable for quantifying the accuracy of the new product. The results obtained for Landsat ETM+ data were compared to the original and spatially binned (15 m) imagery of PlanetScope and RapidEye to clarify the influence of resolution on detecting linear features. This comparison suggested that relatively wide dirt roads and linear infrastructure, such as features more than 20 m wide, were easily classified from all types of images. Detection of narrow linear features 6-20 m wide from 3-meter resolution PlanetScope and 5-meter resolution RapidEye images yielded relatively high overall accuracies between 86% and 92%. In contrast, detecting such narrow linear features, e.g., roads 6-20 m wide, was hardly possible from 15-meter resolution data such as Landsat ETM+ and the spatially binned images.
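The spatial binning to 15 m mentioned above can be sketched as block aggregation. The text does not state which aggregation was used, so averaging is an assumption here, and the function name and toy image are hypothetical. A factor of 5 corresponds to going from 3 m to 15 m pixels, a factor of 3 to going from 5 m to 15 m.

```python
def bin_image(band, factor):
    """Aggregate a 2-D band (list of rows) by averaging factor x factor blocks.

    Averaging is an assumed aggregation rule, chosen for illustration.
    """
    rows, cols = len(band), len(band[0])
    if rows % factor or cols % factor:
        raise ValueError("image dimensions must be multiples of the binning factor")
    binned = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            block = [
                band[rr][cc]
                for rr in range(r, r + factor)
                for cc in range(c, c + factor)
            ]
            out_row.append(sum(block) / len(block))
        binned.append(out_row)
    return binned


# Toy 5 x 5 image at 3 m resolution: a one-pixel-wide bright "road" (1.0)
# running through a dark background (0.0).
toy = [[1.0 if col == 2 else 0.0 for col in range(5)] for _ in range(5)]

coarse = bin_image(toy, 5)  # one 15 m pixel
print(coarse)
```

In the toy case the road signal is diluted to one fifth of its original contrast in the single coarse pixel, which illustrates why narrow features become hard to separate after binning.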
The variable importance calculated for each Random Forest model shows how the contribution of spectral information differs among the satellite images. Generally, all variables derived from the geometry of segments, the multispectral channels, and the spectral indices were ranked as relatively important for classification and prediction. Across all satellite imagery, however, the analysis of variable importance indicated that the variables from spectral channels and spectral indices are more important than geometrical variables. Comparing variable importance, classification accuracy, and spatial resolution, the results suggested that the finer spatial resolution and lower spectral resolution of PlanetScope imagery yielded the highest overall accuracy (93.2%), and its most important variables were derived from the active chromaticity color in segments and the visible bands. For the RapidEye data, with 5-meter spatial resolution, 5 spectral bands, and 4 spectral indices, the red, red-edge, and near-infrared channels and the active chromaticity color in blue were important for distinguishing dirt roads and infrastructure from all other surfaces, with a relatively high overall accuracy (87.3%). In the classification of RapidEye, the red-edge and near-infrared channels had 2-3 times higher contributions than the other bands. Typical bare soil and unvegetated surfaces such as dirt roads have a high reflectance in the near-infrared and short-wave infrared bands [58]. For the coarser spatial resolution and higher spectral resolution imagery of Landsat ETM+, all variables derived from the geometry of segments, spectral bands, and spectral indices were generally equally involved. Considering individual variables, the results suggested that the visible and SWIR 2 bands were the most important variables for predicting and classifying linear features, with 73.1% overall accuracy.
Conceptually, the results of this analysis suggest that finer spatial resolution data are preferable to coarser spatial resolution data when a land use/land cover object is small and linear. Furthermore, higher spectral resolution could not compensate for coarse spatial resolution when detecting linear features narrower than the pixel size.
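One simplified way to see why features narrower than a pixel are hard to detect is to estimate how much of a pixel a road can occupy. The sketch below assumes a straight feature running parallel to the pixel grid and fully crossing the pixel; this geometric simplification and the function name are illustrative assumptions, not part of the study's method.

```python
def aligned_pixel_fraction(feature_width_m, pixel_size_m):
    """Largest share of a square pixel covered by a straight strip of the
    given width that runs parallel to the pixel grid (simplified geometry)."""
    if feature_width_m < 0 or pixel_size_m <= 0:
        raise ValueError("width must be non-negative and pixel size positive")
    return min(1.0, feature_width_m / pixel_size_m)


# Road widths and pixel sizes mentioned in the text (6-20 m roads; 3, 5, 15 m pixels).
for width in (6, 20):
    for pixel in (3, 5, 15):
        share = aligned_pixel_fraction(width, pixel)
        print(f"{width} m road, {pixel} m pixel: up to {share:.0%} of the pixel")
```

Under this simplification, a 6 m road can fill entire 3 m or 5 m pixels but at most 40% of a 15 m pixel, so at 15 m its spectral signal is always mixed with the surrounding grassland.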

Land Use Analysis
Remote sensing technologies play an important role in continually delivering the quantitative information needed to analyze the nature, dynamics, and spatial distribution of land degradation processes. Diverse assessment approaches are used to monitor the scope and consequences of land degradation. According to several studies [59][60][61], infrastructure development, especially the construction and extension of oil extraction infrastructure, can foster land degradation processes. The main interest of this study is to estimate changes in land use, such as dirt roads and oil extraction infrastructure sites, that can be determined from multiscale remote sensing imagery and RF classification. The land use change in the study area was determined by combining supervised classification methods with long-term multiscale Earth observation data. In the research area, several man-made factors of different kinds contribute to land degradation.
The land use change analysis of the Menen Steppe and Khalkh River area indicates significant shifts in dirt roads and infrastructure over the last 13 years. The most pronounced change detected in this study was the expansion of dirt roads and infrastructure due to the growth of oil exploration, exploitation, and transportation. Over the observed 13 years, the total area occupied by dirt roads and oil exploitation infrastructure expanded by 75% or 14,730 ha. Spatially, dirt roads and infrastructure expanded over the whole territory of the study area. Cropland increased slightly, by 3%, to approximately 50,134 ha. The cropland area is concentrated at a few locations; one of these was established in 1972 with approximately 40,000 ha. The slight growth was related to an extension of the main cropland in 2015-2018, and a further extension to 72,000 ha is planned for 2020 [62]. The settlement area did not change during the last 13 years.
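The relative changes reported above follow from simple percent-change arithmetic. The sketch below uses only the cropland figures stated in the text (a 3% increase to approximately 50,134 ha); the implied starting area is a back-calculation for illustration, not a value reported by the study, and it is only approximate because the 3% is rounded. The function names are hypothetical.

```python
def percent_change(initial, final):
    """Relative change from initial to final, in percent."""
    if initial == 0:
        raise ValueError("initial value must be non-zero")
    return (final - initial) / initial * 100.0


def implied_initial(final, percent_increase):
    """Starting value implied by a final value and a percent increase."""
    return final / (1.0 + percent_increase / 100.0)


cropland_final_ha = 50134       # stated in the text (approximate)
cropland_increase_pct = 3.0     # stated in the text (rounded)

# Back-calculated, illustrative only: not a figure reported in the study.
cropland_start_ha = implied_initial(cropland_final_ha, cropland_increase_pct)

print(f"implied starting cropland area: about {cropland_start_ha:.0f} ha")
print(f"check: {percent_change(cropland_start_ha, cropland_final_ha):.1f}% increase")
```

The same two functions can be applied to any pair of class areas from the land use time series to express change either as a percentage or as an absolute difference.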
The analysis also recorded some declines or apparent changes. In particular, several temporary expansions of dirt roads were detected in the period 2007-2010, and these roads became less detectable or disappeared from the satellite images in 2014-2018. In these cases, the dirt roads were visibly recolonized by pioneer plants shortly after they were abandoned. However, it has to be questioned whether the successional stages of the vegetation are structurally and functionally equivalent to the native vegetation cover that existed before the dirt roads were created [5,63].
Several studies (e.g., [4,64]) have indicated that dirt roads are a major anthropogenic driver of land degradation in Mongolia. In contrast, far fewer studies have analyzed anthropogenic drivers of grassland degradation in Mongolia [63]. So far, no detailed study of the impact of dirt roads on land degradation has been done. The intensive growth of dirt roads across the broad territory of Mongolia makes it challenging to detect short-term dirt roads and register them in official databases. In addition, temporary dirt roads created for oil exploitation in the study area have never been taken into account in official databases, because such roads are abandoned shortly after they have been created. These difficulties preclude a detailed analysis of the impact of dirt roads on land degradation at regional scales. In this study, the permanent and temporary dirt roads due to oil exploitation were detected from an analysis of PlanetScope, RapidEye, and Landsat ETM+ imagery for 2005-2018.

Conclusions
Extensive land use for oil exploitation and dirt roads are major drivers of land degradation in the steppe ecosystem of Eastern Mongolia. This research highlighted the potential of multi-resolution and multispectral remote sensing images and Random Forest classification for extracting linear features and their change in extent over time. The segmentation results show that setting a high level of spatial and spectral detail produced the largest number of objects, which were characterized by boundary, homogeneity, and heterogeneity and thus contained useful information for the detection of linear features.
Understanding driving factors such as the extent of land use change is particularly important for policies that develop sustainable land management practices, counteract land degradation processes, and foster environmental conservation in the country. We tested and applied the method to data from 13 years to analyze spatio-temporal changes in land use for an area of more than 20,000 km² in the Menen Steppe and Khalkh River area. The classification results from 13 years of satellite data indicate that land use for dirt roads and oil extraction infrastructure in the Eastern Mongolian Steppe is increasing due to active oil exploration and exploitation. Within the study area over the observed 13 years, the total area occupied by dirt roads and oil exploitation infrastructure expanded by 75% to occupy an additional area of approximately 14,730 ha. Spatially, dirt roads and infrastructure expanded over the whole territory of the study area. In the future, the results of this study will serve as a data source to quantify the impact of human- and climate-induced disturbances on steppe ecosystems in Eastern Mongolia. This knowledge is key to preserving the unique open steppe ecosystems under global change.