Harnessing the Power of Remote Sensing and Unmanned Aerial Vehicles: A Comparative Analysis for Soil Loss Estimation on the Loess Plateau

This study explored the innovative use of multiple remote sensing satellites and unmanned aerial vehicles (UAVs) to calculate soil losses in the Loess Plateau of Iran. The work emphasizes the importance of using advanced technologies to develop accurate and efficient soil erosion assessment techniques. Accordingly, this study developed an approach to detect and compare sinkholes and gully heads in hilly regions on the Loess Plateau of northeast Iran using a convolutional neural network (CNN or ConvNet). The method coupled UAV data with Sentinel-2 and SPOT-6 satellite data. The soil erosion computed using UAV data showed AUC values of 0.9247 and 0.9189 for the gully head and the sinkhole, respectively. The use of SPOT-6 data in gully head and sinkhole computations showed AUC values of 0.9105 and 0.9123, respectively. The AUC values were 0.8978 and 0.9001 for the gully head and the sinkhole using Sentinel-2, respectively. Comparison of the results from the UAV, SPOT-6, and Sentinel-2 data showed that the UAV had the highest accuracy for calculating sinkhole and gully head soil features, although Sentinel-2 and SPOT-6 also showed good results. Overall, the combination of multiple remote sensing satellites and UAVs offers improved accuracy, timeliness, cost effectiveness, accessibility, and long-term monitoring capabilities, making it a powerful approach for calculating soil loss in the Loess Plateau of Iran.


Introduction
Spatial and temporal data drawn from global quantitative research have shown that erosion rates are much higher than soil production rates [1]. Soil nutrients are lost faster during soil erosion (SE) than they form, which threatens the sustainability of agroecosystems [2]. SE has been happening for hundreds of years, allowing the soil to regenerate and regain its nutritional value. Additionally, it increases sediment transport (estimated to be 2.3 ± 0.6 BMT of sediment every year) beyond agricultural fields [3]. Due to climate change and land use changes, many areas worldwide are at risk of SE, including arid and semi-arid regions as well as humid ones [4,5]. Therefore, it is essential to detect and monitor soil loss in susceptible regions in order to ensure human health.
Spaceborne remote sensing images are most frequently used to obtain features of erosion on large scales with coarse spatial resolution [6][7][8][9]. However, some soil landforms (erosion features) are not as large as others; therefore, they require high-resolution remote sensing images [10,11]. During the last few decades, high-resolution, fine-grained unmanned aerial vehicle (UAV) images acquired on demand from low-altitude airspace have become available, but they are rarely distributed in developing countries [12]. Additionally, data on these soil landforms (e.g., sinkholes) are not publicly available in developing countries [13]. Therefore, different methods must be tested to derive spatial information regarding erosional features, including sinkholes and gully heads.
As a result of increased data availability and thus a deeper knowledge of SE mechanisms, prediction equations have been developed based on indicators such as soil properties, climate, vegetation cover, and topography [14,15]. Several mathematical and geospatial models (e.g., the Revised Universal Soil Loss Equation, RUSLE) have also been proposed to forecast SE distribution at various temporal and spatial resolutions [16][17][18]. However, some uncertainties in outputs have resulted from nonlinear relationships between driving factors and related erosion processes, as well as from the difficulty of upscaling model results from a local scale to a larger scale [19]. Recently, data-driven machine learning methods have been increasingly applied to analyze the spatial distribution of SE [20,21]. In areas without observed field data, numerical models based on computational intelligence can provide a probability-based distribution of erosion. These models are based on developing mathematical patterns between erodible areas and other properties [22]. Deep learning is among the most effective machine learning approaches and has attracted significant research attention [23][24][25][26]. Recently, the CNN method has been widely used to obtain more accurate earth feature mapping and modelling [27]. It can handle complex modeling well and make use of a large number of resources for training [28,29]. In deep learning data-driven networks, a regression relationship between input and output variables is established through neurons [29]. The weights for the inputs passed from one layer to the next are generated based on the connections between neurons. The weighted inputs are then combined with a bias term to produce an output [30]. To generate a desirable output, an activation function is applied to the neurons [31].
Several studies have obtained fast and accurate outputs when using UAVs for erosional feature modeling and mapping in remote and complex regions [32][33][34][35][36][37][38][39]. Although UAV data have recently been used to calculate erosional landforms [8,40], no studies have calculated soil losses by comparing three different image sources. Therefore, this study proposed a novel deep-learning approach for calculating soil loss in the Loess Plateau of Iran, in which a convolutional neural network was employed for the task of interest. In the proposed method, data from UAV, Sentinel-2, and SPOT-6 were employed for model development and validation. Furthermore, this study intended to expand on previous research [24] and address the following issues: (1) a UAV image was applied to prepare high-resolution data in the loess region of the Plateau of Iran; (2) multiple sources of remote sensing data (SPOT-6 and Sentinel-2) were used in the same region to map susceptible soil landforms; (3) the maps prepared from the UAV and the two remote sensing datasets were studied and compared; (4) finally, we examined the efficiency of CNN for detecting and mapping soil landforms.

Study Region
The study area is located in northeast Iran (Golestan Province) (37°36′40″ N to 37°38′40″ N latitudes, and 55°39′40″ E to 55°41′40″ E longitudes) and covers approximately 500 hectares of a dominantly semi-arid climatic region. The study was conducted in loess-derived soils with a mean precipitation of 265 mm per year. The minimum and maximum altitudes are 210 m and 550 m above sea level, respectively (Figure 1). The dominant texture of the soil surface is "silt loam", and the whole region is covered by loess soils.

Flowchart and Framework of the Present Research
Data from the UAV and two satellite remote sensing datasets (SPOT-6 and Sentinel-2) were prepared and applied in the present study. We collected the datasets on two dates: August 18th, 2019 for both the UAV and SPOT-6, and September 14th, 2019 for Sentinel-2. The soil landforms (erosion features) in the loess region were computed by applying a deep learning model, the CNN (Figure 2). The first step was to gather and prepare the data sources and digitize the locations of two erosion features, the sinkhole and the gully head. To this end, 48 ground control points (GCPs) were gathered from the study area: 70% of the data were allocated for training and 30% for testing the model. Elevation data and topographic information were collected using SPOT-6, Sentinel-2, and UAV images (Table 1). The next step involved image processing of the remote sensing and UAV data that served as inputs to the SE susceptibility maps. The UAV images, with a pixel size of 0.2 m, were processed using Pix4D software (version 3.3). The land surface/subsurface maps were processed in ArcGIS 10.8, and the main indices/factors were extracted from the UAV-DEM. Next, susceptibility maps were computed and validated by applying the CNN method. In the fourth step, erosional landform maps were prepared and validated using the two remote sensing datasets, SPOT-6 and Sentinel-2. In the final step, we analyzed and compared the results calculated from the UAV and the two remote sensing satellites (Figure 2).
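The 70%/30% allocation of the 48 GCPs described above can be illustrated with a short sketch. It is not the authors' procedure: the random shuffling, the seed, and the point identifiers (`GCP_01` to `GCP_48`) are assumptions made only for illustration.

```python
import random


def split_points(points, train_fraction=0.7, seed=42):
    """Shuffle labelled points and split them into training and testing subsets."""
    shuffled = list(points)
    # A fixed seed (an assumption) keeps the split reproducible.
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical identifiers standing in for the 48 ground control points.
gcp_ids = [f"GCP_{i:02d}" for i in range(1, 49)]
train_ids, test_ids = split_points(gcp_ids)
print(len(train_ids), len(test_ids))
```

With 48 points, a 70% share rounds to 34 training points, leaving 14 for testing.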

The Preparation of the UAVs and Two Satellite Images
Soil erosional features, including sinkholes and gully heads, were detected using UAV data and remote sensing satellite images. The Phantom 4 RTK was equipped with a C4K camera with a 1-inch CMOS sensor that can shoot 20 MP photos and an 8-element lens with an 84-degree FOV. Video was recorded in H.264 or H.265 at a C4K resolution of 4096 × 2160 at up to 60 fps. The Phantom 4's remote control and live feed are based on DJI Lightbridge technology, providing an effective control range of up to 3.1 mi (5 km) in unobstructed areas that are free from interference. At this stage, the digital elevation model (DEM) of the area was prepared from the images and point clouds acquired by the UAV at a pixel size of 0.2 m. The images were taken over an area of 2700 hectares at a flight height of 200 m above the ground and a flight speed of 10 m per second, with a 75% overlap between flight paths. The flight path was specified in the Pix4D software, and the UAV flew automatically along the defined paths. After the flight operation, processing included automatic interior, relative, and absolute orientation, which led to the preparation of point clouds of the area and a DEM.
Before conducting aerial photography, benchmark (BM) stations and non-permanent signs for ground control points were designed, and their approximate locations were determined. Owing to the non-flatness of the area, BM points were positioned at intervals of at least 1 km using the Differential Global Positioning System (DGPS) method with a set of three Global Navigation Satellite System (GNSS) receivers. The DGPS network was tied to at least one reference point (mapping organization or the Shamim system). The non-permanent ground control points were marked before the flight, and their quality was such that they could be seen and measured in all aerial images. The spacing of these signs can vary between 150 and 400 m depending on the presence or absence of a GNSS/Inertial Measurement Unit (IMU) sensor, the flight height, and the quality of the images. Landmarks were located by the Real Time Kinematic (RTK)-GNSS method from the nearest BM. Flight and aerial imaging are essential parts of a drone mapping project. Therefore, the proper functioning of the UAV was verified to ensure a correct flight and good-quality images, as well as to prevent financial losses or even loss of life. In addition, before the flight operation, the parameters related to photography, including shutter speed, ISO (the sensitivity of the camera's sensor), focus, and aperture, were adjusted to obtain high-quality images and proper lighting. Quality products were obtained during the processing stages.

Definition of CNN Method for Detecting Soil Erosional Features
To reduce land degradation and mitigate the adverse effects of erosion on ecosystem services, it is necessary to accurately predict SE susceptibility. Several theoretical and empirical models, including RUSLE (Revised Universal Soil Loss Equation), USLE (Universal Soil Loss Equation), WEPP (Water Erosion Prediction Project), SWAT (Soil and Water Assessment Tool), and WaTEM/SEDEM (Water and Tillage Erosion Model and Sediment Delivery Model), can be employed to study soil erosion [41,42]. In contrast to the traditional soil erosion modelling approach, which requires a physically or empirically based model for prediction and the collection of field data to verify the accuracy of the model, a machine learning-based approach does not require prior modelling. Instead, field measurements are used directly to formulate rules and draw generalizations from the data, leading to predictions using semi-automated and automated approaches, such as learning-based techniques [44,45]. Additionally, a new approach is essential to address the diverse factors influencing land degradation and soil erodibility [43]. Deep learning, a subset of machine learning, has found practical applications in modeling and mapping various earth features. Among the prevalent deep learning techniques utilized for this purpose are the Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Denoising Autoencoder (DAE), Deep Belief Network (DBN), and Long Short-Term Memory (LSTM) network. Notably, the CNN is a feedforward neural network noted for its ability to effectively reduce the number of parameters while preserving model quality compared to other deep learning methods [46]. This attribute is particularly advantageous when working with high-dimensional data, such as images, where each pixel serves as a feature [47]. Utilizing a CNN for soil erosion modeling can also lead to cost and time savings in preparation and classification processes.
In recent years, several researchers have used the CNN model to accurately map the dynamic features of the Earth [40,48,49]. A CNN is a feed-forward type of neural network that utilizes convolutional calculation and a deep structure, and it is a representative algorithm of deep learning. A CNN consists of an input layer, hidden layers (which can include convolutional, pooling, and fully connected layers), and finally an output layer. The input layer comprises an $m \times n$ matrix with the respective feature value at each node. Our input layers for detecting soil erosional features were the DEM, satellite images, and training datasets. A convolutional layer immediately adjacent to the input layer is sometimes referred to as the feature extractor, because it is applied to extract the features of an image. Several convolutional kernels were used in the convolutional layer and were optimized by a back-propagation algorithm. The output of the convolutional layer is the input for the next layer [50][51][52][53]. In general, six convolutional layers were used to detect soil erosional features (Table 2). We constructed CNN models with a two-layer depth for soil erosional features. This means that for each separate dataset (UAV, SPOT-6, and Sentinel-2), a two-layer depth consisting of the DEM and satellite images was constructed for detecting the gully head and the sinkhole. Each two-layer depth CNN model was fed the predisposing variables (DEM and satellite images). In this study, we applied multiple convolutions with different 2 × 2 filters, resulting in feature maps. All the feature maps were then gathered to produce the results of the convolutional layers.
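How a 2 × 2 filter turns a two-layer input (a DEM band and an image band) into a feature map can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the 3 × 3 toy rasters and the kernel values are hypothetical, and in a trained CNN the kernel weights are learned by back-propagation rather than fixed by hand.

```python
def conv2d_valid(channels, kernels, bias=0.0):
    """Cross-correlate a multi-band input with one kernel per band.

    channels: list of 2-D lists (one per input band), all the same shape.
    kernels:  list of 2-D lists (one kernel per band), all the same shape.
    Returns one 2-D feature map (no padding, stride 1).
    """
    rows, cols = len(channels[0]), len(channels[0][0])
    k_rows, k_cols = len(kernels[0]), len(kernels[0][0])
    feature_map = []
    for r in range(rows - k_rows + 1):
        out_row = []
        for c in range(cols - k_cols + 1):
            total = bias
            # Sum the weighted window over every band (DEM and image).
            for band, kernel in zip(channels, kernels):
                for i in range(k_rows):
                    for j in range(k_cols):
                        total += band[r + i][c + j] * kernel[i][j]
            out_row.append(total)
        feature_map.append(out_row)
    return feature_map


# Hypothetical 3 x 3 toy bands: elevation values and image reflectance.
dem = [[1.0, 2.0, 3.0],
       [4.0, 5.0, 6.0],
       [7.0, 8.0, 9.0]]
image = [[0.0, 1.0, 0.0],
         [1.0, 0.0, 1.0],
         [0.0, 1.0, 0.0]]

# Hypothetical 2 x 2 kernels, one per band.
kernels = [[[1.0, 0.0], [0.0, -1.0]],   # applied to the DEM band
           [[0.5, 0.5], [0.5, 0.5]]]    # applied to the image band

feature_map = conv2d_valid([dem, image], kernels)
print(feature_map)
```

A 3 × 3 input convolved with a 2 × 2 filter yields a 2 × 2 feature map; applying several different filters would yield one such map per filter.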
Each convolutional layer consists of a pooling layer, an activation function, and multiple weights [54]. A down-sampling algorithm is used in the pooling layer to reduce overfitting and minimize dimensionality [55]. Max pooling is employed as a maximum operator to down-sample the feature maps in the encoder: it splits each feature map into several rectangular regions and outputs the maximum value of each region. A fully connected layer is used to reduce the loss function and subsequently output the classification results [56,57].
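The max pooling operation described above can be sketched as follows. The 2 × 2 window with a stride of 2 and the toy feature map are assumptions chosen for illustration; the pooling size actually used is the one reported in Table 2.

```python
def max_pool2d(feature_map, size=2):
    """Split a 2-D feature map into size x size regions and keep each maximum."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, size):
        pooled_row = []
        for c in range(0, cols - size + 1, size):
            # Collect the values of one non-overlapping rectangular region.
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


# Hypothetical 4 x 4 feature map.
fm = [[1, 3, 2, 0],
      [4, 2, 1, 5],
      [0, 1, 7, 2],
      [3, 6, 4, 8]]
pooled = max_pool2d(fm)
print(pooled)  # [[4, 5], [6, 8]]
```

Each 2 × 2 region is reduced to a single value, so the 4 × 4 map shrinks to 2 × 2 while the strongest responses are retained.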
The outputs of the previous convolutional layers are weighted to construct a weighted sum, which then passes through an activation function [58]. The ReLU (Rectified Linear Unit), which is very popular in deep learning, was applied in the current study and is defined according to Equation (1). The function maps negative values to zero and returns zero and positive values unchanged. Because it is piecewise linear, it is computationally more efficient, and therefore faster, than the sigmoid and tanh functions.
$$f(x) = \begin{cases} 0, & x < 0 \\ x, & x \geq 0 \end{cases} \quad (1)$$
A back-propagation algorithm is then applied to optimize all parameters of the CNN model by decreasing the value of the loss function. The cross-entropy loss was used when adjusting the model weights during training. The aim was to minimize the loss, because a better model has a smaller loss; in general, a cross-entropy loss of zero indicates a perfect model [30]. The loss was calculated as follows:
$$L = -\frac{1}{N}\sum_{i=1}^{N}\left[y_i \log p_i + (1 - y_i)\log(1 - p_i)\right] \quad (2)$$
where $N$ is the total number of quantitative samples obtained from the study area, $y_i$ is the true label of sample $i$, $p_i$ is the predicted likelihood that the output of sample $i$ is 1, and $y$ and $p$ denote the true output vectors and predicted probabilities, respectively.
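The ReLU activation and the cross-entropy loss discussed above can be checked numerically with a short sketch. The loss is written in the standard binary cross-entropy form, which is an assumption about the exact expression used in the study; the labels and probabilities are hypothetical.

```python
import math


def relu(x):
    """Rectified Linear Unit: zero for negative inputs, identity otherwise."""
    return x if x >= 0 else 0.0


def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        # Clip probabilities so that log(0) is never evaluated.
        p = min(max(p, eps), 1.0 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)


print([relu(v) for v in (-2.0, 0.0, 3.5)])

# Hypothetical labels: 1 = erosion feature present, 0 = absent.
labels = [1, 0, 1, 0]
good = binary_cross_entropy(labels, [0.9, 0.1, 0.8, 0.2])
poor = binary_cross_entropy(labels, [0.4, 0.6, 0.5, 0.7])
print(good < poor)
```

Predictions that sit closer to the true labels give the smaller loss, which is why training aims to drive the loss toward zero.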

Output Validation
Validation of the classification results is a critical phase in image analysis, providing a preliminary assessment of the structural model and conceptual framework [59][60][61]. This study utilized receiver operating characteristic (ROC) curves to assess the accuracy of the erosional feature models. The ROC curve has two axes: the vertical axis corresponds to the true positive (TP) rate, indicating correctly labelled erosion-feature pixels, whereas the horizontal axis represents the false positive (FP) rate, indicating incorrectly labelled erosion-feature pixels. The Area Under the Curve (AUC) serves as a metric to quantify the accuracy of the prediction model results [61]. The outputs of the accuracy assessment for detecting gully heads and sinkholes are presented in Table 3. According to Table 3, the CNN model performed well, with an AUC of > 0.89 for both gully head and sinkhole detection. We also computed loss, validation loss, accuracy, and validation accuracy in Python-based Spyder software (version 3.7) to estimate classification accuracy. Table 4 shows the results of this accuracy assessment.
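To make the AUC metric concrete, the sketch below computes it through the rank formulation, that is, the probability that a randomly chosen positive pixel receives a higher score than a randomly chosen negative pixel, which equals the area under the ROC curve. The labels and scores are hypothetical and are not taken from Table 3.

```python
def roc_auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical validation pixels: 1 = erosion feature, 0 = no feature.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 4))
```

Here eight of the nine positive/negative pairs are ordered correctly, so the AUC is 8/9; a value of 0.5 would correspond to random ranking and 1.0 to perfect separation.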

The Detected Maps of UAV, SPOT-6 and Sentinel-2 Using CNN Method
This study employed an automated, data-driven CNN approach to compare the results of UAV, SPOT-6, and Sentinel-2 images and their derived DEMs for detecting and mapping gully heads and sinkholes. In other words, the main contribution of this work is the development of an approach using deep learning and convolutional neural networks (CNN) to detect and compare soil landforms, specifically sinkholes and gully heads, in hilly regions. For detecting erosional features, UAV, SPOT-6, and Sentinel-2 images with pixel sizes of 0.2, 6, and 10 m, respectively, were employed. ReLU, cross-entropy, and Adam were applied as the activation, loss, and optimization functions, respectively, for detecting the gully head and the sinkhole in the CNN models. The CNN results for detecting soil erosion features are presented in Figure 3. Our results showed that the CNN performed best on the UAV datasets for detecting and mapping soil erosional features (Table 3). The gully head was detected using CNN with AUC values of 0.9247, 0.9105, and 0.8978 for UAV, SPOT-6, and Sentinel-2, respectively (Table 3). The CNN also performed well for detecting sinkholes, with AUC values of 0.9189, 0.9123, and 0.9001 for UAV, SPOT-6, and Sentinel-2, respectively (Table 3). To better understand the performance of the CNN, we computed four measures, namely loss, validation loss, accuracy, and validation accuracy, in Python-based Spyder software, as shown in Table 4. According to Table 4, the CNN performed well in gully head detection, with accuracies of 0.9452, 0.9214, and 0.9012 for UAV, SPOT-6, and Sentinel-2, respectively. It also achieved accuracies of 0.9324, 0.9201, and 0.9135 in sinkhole detection for UAV, SPOT-6, and Sentinel-2, respectively. The results of this study emphasize the dependency of erosional feature detection accuracy on the resolution and quality of the DEM data. Overall, by combining the detected maps obtained from the UAV, SPOT-6, and Sentinel-2 imagery, we can create a comprehensive and multi-scale analysis of the study area. This integration offers a holistic view, capturing fine details from the UAV data, broader coverage from SPOT-6, and spectral richness from Sentinel-2. Such combined maps can provide valuable insights for diverse applications, enabling informed decision making and accurate assessment of the study area.
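The effect of pixel size on the level of detail can be illustrated with simple arithmetic. The sketch below counts how many pixels of each source cover the approximately 500-hectare study area; it assumes square pixels and ignores image footprints and no-data areas.

```python
def pixels_per_area(area_hectares, pixel_size_m):
    """Number of square pixels of a given size needed to cover an area."""
    area_m2 = area_hectares * 10_000  # 1 hectare = 10,000 square metres
    return area_m2 / (pixel_size_m ** 2)


# Pixel sizes reported in the text for each data source.
for name, pixel_size in [("UAV", 0.2), ("SPOT-6", 6.0), ("Sentinel-2", 10.0)]:
    print(name, round(pixels_per_area(500, pixel_size)))
```

Under these assumptions, the UAV image provides roughly 125 million pixels over the study area, compared with about 139 thousand for SPOT-6 and 50 thousand for Sentinel-2, which helps explain why small features such as sinkholes are resolved best in the UAV data.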

Disadvantages and Advantages of UAVs in Using the CNN Model
In this research, recently published findings [24] were extended to detect sinkhole collapses and gully heads using UAV, SPOT-6, and Sentinel-2 data. Because the pixel size of the satellite images was coarser than that of the UAV images, it was difficult to obtain soil loss information continuously. Given the importance of satellite data, data at various spatial or temporal resolutions can be used in environmental studies. However, these data should be compared and checked against the fine spatial resolution of UAV images, which is the issue assessed in the present study. The results showed that the UAV and the two satellite images all had acceptable accuracies. Nevertheless, it is challenging to use satellite images in studies of the size of soil loss features (i.e., volume, width, and height) [30]. Therefore, to ensure good results, especially when information on the size of erosional features is needed, UAVs are much better suited than the satellite sources.
Elevation data obtained from UAV remote sensing have various benefits, including high spatial resolution and flexibility [21]. Very high resolution is an essential advantage of UAVs, and because of the low altitude of UAV flights, imaging problems caused by the atmosphere are reduced [32,62]. Therefore, highly accurate information regarding erosional landforms and elevation values can be easily generated from UAVs. In addition, UAV deployment has become critical in terms of accessibility. In developing countries, several data barriers and limitations, such as the inaccessibility of high-resolution, up-to-date satellite imagery in areas prone to land degradation, have limited the wider dissemination of existing data sources, and there is a need to use other means, such as drone imagery. This has motivated the search for new techniques to obtain spatial information about erosional characteristics. It should be noted that land surface/subsurface information and much more quantitative data from larger regions (e.g., studied at national scales) can be obtained from satellite remote sensing images, including SPOT-6 and Sentinel-2. In addition, UAV-derived data are much more expensive; for instance, the cost of obtaining soil loss data from satellite imagery for 500 hectares is about one twentieth of that of gathering UAV data over a region of the same size. UAVs also have further limitations. For instance, it is not feasible to obtain different hydrogeomorphologic variables (e.g., flow velocity and water depth) from UAVs, although they can be easily obtained from satellite data [63,64]. In addition, UAV remote sensing is restricted in its operating conditions; for example, flights are only possible in clear-sky conditions. Moreover, as UAVs become more popular and demand for them increases, they become increasingly vulnerable to a number of security attacks [65]. Although the capabilities of drones will expand in the coming decades, social organizations and governments must be aware of the security aspects of drone communications [66]. Overall, although UAVs offer significant advantages when utilizing the CNN model for various applications, it is essential to consider the limitations of flight endurance, weather dependency, regulatory compliance, and data management. By addressing these challenges, UAVs can effectively enhance the performance of the CNN model and enable the accurate and timely analysis of high-resolution aerial imagery.

The Positive and Negative Points of Multiple Remote Sensing Sources in Using the CNN Method
The availability of multiple remote sensing sources has increased the use of the CNN method, and data from other satellite images can also be applied to detect soil losses [67]. In the current research, the two satellite datasets were gathered close to the time of the UAV field campaign, which means that more data were available to support the assessment of soil losses in the Loess Plateau of Golestan Province. The SPOT-6 and Sentinel-2 data were similarly efficient for computing soil losses in the studied region, and both offered good results for spatially detecting erosional landforms. Therefore, we achieved well-calculated soil loss maps (Figure 3) from multiple satellite sources, both because of their spatial resolution and because the erosional landforms in the region were large enough to be detected with different satellite images. In this way, we expanded the application of the CNN model by applying data from multiple satellite images.
We are aware that data from multiple satellite sources have excellent temporal resolution in contrast to UAVs. In future research, deeper learning methods should be combined with multiple satellite platforms at large scales. After the detection of soil loss, changes in the dimensions of soil landforms can be monitored in the short term. Long-term detection of other soil landforms, including gullies and mass movements, can be managed using UAVs or data from multiple platforms and sensors, such as LiDAR [68], thermal infrared remote sensing [69], and optical remote sensing (e.g., Landsat [70]). Overall, the use of multiple remote sensing sources in conjunction with the CNN method offers great potential for improving performance and accuracy. However, it is important to be mindful of the challenges associated with data fusion and the computational requirements. With careful consideration and proper techniques, the benefits can outweigh the drawbacks, leading to a more accurate and detailed remote sensing analysis.

Conclusions
This research highlights the importance of using drones and satellite images to identify sinkholes and gully heads so that soil losses in the form of erosion can be calculated. In the present study, we applied UAV and multiplatform satellite-acquired data, namely SPOT-6 and Sentinel-2, to detect erosional landforms in the Loess Plateau of Iran. The RC values obtained from the UAV were 0.89 (sinkhole) and 0.88 (gully head). The RC values calculated with the SPOT-6 data were 0.87 (gully head) and 0.86 (sinkhole), and those calculated with Sentinel-2 were 0.86 (gully head) and 0.85 (sinkhole). The results showed the excellent performance of the proposed method, which can be considered a potential solution for practical use. However, to effectively manage SE, especially in erosion-susceptible soils, more hydro-geomorphological information is required over time. The UAV provided accurate information that was subsequently applied to compute soil losses. The UAV offered very good results in detecting soil losses, although Sentinel-2 and SPOT-6 provided good results, too. The use of UAV images in SE mapping offers some benefits in comparison to orbital remote sensing acquisition methods. Characteristics such as flying at lower altitudes, less atmospheric interference, and, importantly, lower expense are the benefits of this acquisition system in both scientific and commercial explorations. With the CNN method, the UAV and the satellite remote sensing data yielded accurate values, and we believe that this model will be beneficial for SE research groups and managers worldwide. It is recognized that one of the significant motivations behind the current popularity of CNNs is the large amount of accessible data from which to acquire knowledge. A CNN uses images as inputs and provides a feature map that describes the image with semantic features. Future research on geospatial-temporal hazard analysis should consider using other deep learning models for satellite imagery.

Figure 1. Location of study area: (a) Iran and (b,c) in Golestan Province.

Figure 2. A summary (a-f) of the methodology used for gully head and sinkhole detection.

Figure 3. Susceptibility of erosional features from very low to very high; (a,b) detected gully head and sinkhole using UAV, respectively, (c,d) detected gully head and sinkhole using SPOT-6 image, respectively, and (e,f) detected gully head and sinkhole using Sentinel-2, respectively.

Table 1. Characteristics of predisposing variables for gully head and sinkhole detection.

Table 2. Characteristics of employed CNN models for gully head and sinkhole detection.

Table 3. Accuracy assessment for gully head and sinkhole detection.

Table 4. Estimated loss, validation loss, accuracy, and validation accuracy in Python-based Spyder software for detecting the gully head and the sinkhole.