Extracting the Tailings Ponds from High Spatial Resolution Remote Sensing Images by Integrating a Deep Learning-Based Model

Abstract: Due to a lack of data and practical models, few studies have extracted tailings pond margins in large areas. In addition, there is no public dataset of tailings ponds available for relevant research. This study proposed a new deep learning-based framework for extracting tailings pond margins from high spatial resolution (HSR) remote sensing images by combining You Only Look Once (YOLO) v4 and the random forest algorithm. At the same time, we created an open source tailings pond dataset based on HSR remote sensing images. Taking Tongling city as the study area, the proposed model can detect tailings pond locations with high accuracy and efficiency from a large HSR remote sensing image (precision = 99.6%, recall = 89.9%, mean average precision = 89.7%). An optimal random forest model and morphological processing were utilized to further extract accurate tailings pond margins from the target areas. The final map of the entire study area was obtained with high accuracy. Compared with the random forest algorithm, the total extraction time was reduced by nearly 99%. This study can be beneficial to mine monitoring and ecological environmental governance.


Introduction
A tailings pond is a mine production facility used to store the tailings generated during the processing of metal resources [1]. The liquid contained in tailings ponds is poisonous, harmful or radioactive [2]. A tailings pond usually requires a large area to contain all the produced tailings [3]. Restricted by mineral resources, terrain and other factors, tailings ponds are mostly located in remote mountainous areas with relatively weak supervision [4]. According to statistics, the number of known tailings ponds in China in 2019 was 5189. Most of them were distributed in isolation and nearly 80% of them were less than 0.363 km² in area [5]. Most tailings ponds use the upstream raising method because of its low operating costs. However, this method lacks stability. If a tailings dam breaks, it can cause severe casualties, economic losses and irreparable environmental pollution [6]. In recent years, tailings ponds have frequently caused environmental emergencies, which have had extremely adverse effects on economic development and social stability [7][8][9]. Therefore, the rapid acquisition of comprehensive tailings pond inventories is of great significance to effective regional resource utilization and mine safety monitoring.
Traditional ground surveys of tailings ponds are costly and inefficient [4,10,11]. Remote sensing technology has the advantages of a wide monitoring range and low cost [12,13], which provides more opportunities to monitor tailings ponds. However, the shapes, dimensions, backgrounds and tones of different tailings ponds vary greatly in remote images covering huge geospatial areas because of the influence of mine types, geography and climate conditions on tailings ponds in different regions [14]. These characteristics also make the extraction of tailings pond margins a challenge.
Because tailings ponds have abundant spectral and texture features in remote sensing images, these features are typically utilized to design remote sensing indices for extracting tailings [15][16][17]. Ma et al. (2018) developed an ultra-low-grade iron-related objects index from Landsat 8 OLI (Operational Land Imager) images and then extracted the tailings ponds based on the entropy difference between tailings ponds and stopes [16]. Using Landsat 8 data, another study developed an all-band tailing index, a modified normalized difference tailing index and a normalized difference tailings index to build a tailing extraction model [17]. However, the smaller tailings ponds were difficult to extract because of the low spatial resolution of the remote sensing images.
High spatial-resolution (HSR) remote sensing images can reflect the land surface comprehensively [18]. Some scholars used visual interpretation to extract tailings ponds from HSR remote sensing images [19][20][21]. However, visual interpretation is time-consuming and laborious when extracting tailings ponds in large areas. Using Landsat 7 and SPOT (Système Probatoire d'Observation de la Terre) 5 fusion images, Mezned et al. (2016) mapped mine tailings by a linear spectral unmixing method. Based on the spatial combination of objects [22], Liu et al. (2019) extracted four main tailings pond structures, including starter dams, embankments, deposited beaches and water bodies, using GaoFen-2 images [14]. However, the studies mentioned above were limited to the mine area or a few tailings ponds within the mine area. Due to excessive variation in tone, shape and dimension between tailings ponds, these methods are difficult to apply to a large area. In addition, tailings ponds are more sparsely distributed than other objects [23], making extraction of tailings ponds in a large area more difficult and time-consuming. Therefore, the quick and accurate extraction of tailings pond margins from remote sensing images over large spatial areas remains challenging.
Deep learning can obtain higher dimensional and abstract features of the images than traditional methods [24] and it has been proven to be a powerful technology for remote sensing image processing [24][25][26]. In recent years, deep learning has been widely used in land cover classification [25], scene classification [27] and ground target extraction [28,29]. The current target detection methods based on deep learning can be divided into two categories: region-based methods and regression-based methods. Region-based methods, such as Faster R-CNN (region-based convolutional neural networks) [30], achieve high precision for target detection. Nevertheless, the detection speed is slow, while the regression-based methods (e.g., SSD (Single Shot MultiBox Detector) [31] and YOLO (You Only Look Once) [32]) achieve a relatively fast detection speed. The deep learning techniques provide a new idea for extracting tailings pond margins from HSR remote sensing images in large spatial areas, which is one of the main issues in this study. In addition, the lack of public tailings pond datasets and lack of clear explanation of the abstract features extracted limit the development of tailings pond extraction by using deep learning.
To address the aforementioned limitations on extracting tailings pond margins, we produced a tailings pond dataset containing multiple tailings ponds and proposed a deep learning-based framework to quickly and accurately extract tailings pond margins from HSR remote sensing images in large spatial areas. First, the tailings ponds were located by the YOLOv4 model. Then, a random forest model was applied to the detected regions to extract tailings pond margins. Finally, a morphological processing method was used to obtain the final tailings pond margins. Taking Tongling city as the study area, we extracted tailings pond margins from the HSR remote sensing image using the proposed method.

Methodology
The flowchart of the proposed methodology is illustrated in Figure 1. The methodology extracts tailings pond margins using the proposed framework, which can be summarized by the following steps: (1) creating a tailings pond dataset based on the characteristics of the tailings ponds in HSR remote sensing images; (2) training the YOLOv4 model with the tailings pond dataset to obtain the tailings pond target areas; (3) building a random forest model combining the spectral and texture features to extract the initial tailings ponds from the acquired target areas; and (4) using a morphological processing method to extract the final tailings ponds.

Detection of Tailings Pond Regions
To quickly extract the tailings pond margins in a large area, this study first detected the tailings pond regions. We applied YOLOv4 to detect tailings pond regions in the proposed framework. YOLOv4 achieves the best balance of accuracy and speed [33]. This characteristic enables YOLOv4 to detect tailings ponds quickly in a large area.
The YOLOv4 algorithm was proposed by Bochkovskiy et al. in 2020 [33]. The architecture of YOLOv4 consists of CSPDarknet53, spatial pyramid pooling (SPP), a path aggregation network (PANet) and the YOLOv3 head [33]. CSPDarknet53 is used as the backbone network to extract image features. The SPP block is used to increase the receptive field [34] and the PANet is adopted to merge the extracted features [35]. When the input image is scaled to 416×416, feature maps at three scales are obtained after five downsampling steps in CSPDarknet53. The last feature map is processed with 5×5, 9×9 and 13×13 maximum pooling operations and the results are then merged to obtain a 13×13 feature map. This 13×13 feature map and the two preceding 26×26 and 52×52 feature maps are processed by the modified PANet structure to obtain predictive feature maps at three scales. Finally, by performing regression and classification operations on the predicted feature maps, the boundary coordinates and confidence of each tailings pond are obtained. The main network structure is shown in Figure 2.
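The feature map sizes quoted above follow from the downsampling factors of the network. The short sketch below reproduces that arithmetic; it assumes the usual YOLO strides of 8, 16 and 32 for the three detection scales, and the function name `grid_sizes` is purely illustrative.

```python
# Assumed strides of the three detection scales (the deepest one is 2**5 = 32,
# matching the five downsampling steps described in the text).
STRIDES = (8, 16, 32)


def grid_sizes(input_size):
    """Return the side length of each predictive feature map for a square input."""
    if input_size % 32 != 0:
        raise ValueError("input size must be a multiple of 32")
    return [input_size // stride for stride in STRIDES]


print(grid_sizes(416))  # [52, 26, 13]
print(grid_sizes(448))  # [56, 28, 14]
```

For a 416×416 input this gives the 52×52, 26×26 and 13×13 maps named in the text.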

Extraction of Tailings Ponds
After tailings pond detection, the correctly detected regions were further processed with random forest to extract the precise margins of the tailings ponds.

Feature Extraction of Tailings Ponds
The commonly used methods for feature extraction are divided into two categories: traditional feature extraction and deep learning extraction [25,36]. Although deep learning can provide a large number of robust features, it has the disadvantage of being difficult to interpret the extracted features [37]. To extract tailings pond features and further obtain the importance of different features, traditional feature extraction method was performed in this study. The spectral and texture features of HSR remote sensing images are able to reflect the inner components and tonal variation of ground components [38]. When two different objects have the same spectral features, they could be distinguished by their texture features [39]. Therefore, the spectral and texture features were utilized to construct a feature set for tailings pond extraction in our framework.
The RGB bands of HSR remote sensing images were selected as the spectral feature variables. Principal component analysis (PCA) was performed on the images before extracting texture features. Based on the first principal component images, the commonly used gray-level co-occurrence matrix (GLCM) and Gabor filter were utilized to obtain the texture features. We used the GLCM to extract the statistical features of the images and the Gabor filter to extract the structural features. GLCM is one of the best known texture feature extraction methods and provides a good description of the image texture [40]. However, GLCM features are less accurate in the region of class boundaries [41]. Gabor filters have been proved to be successful in the field of remote sensing images [42,43]. The features extracted by Gabor filters are more accurate in the boundary regions [41]. It has been demonstrated that the combination of these two texture features can improve the accuracy of image classification [39,41,44]. The GLCM was used to calculate texture measurements in four directions (0°, 45°, 90°, 135°) and the average of the corresponding measurement values in all directions was used as the final texture feature. Eight gray texture features (i.e., mean, variance, homogeneity, contrast, dissimilarity, entropy, second moment and correlation) were calculated with a 7 × 7 window size. Additionally, the Gabor filter was used to extract Gabor texture features in four directions (0°, 45°, 90°, 135°) and the wavelength was set to 2. The selected feature types and variables are shown in Table 1.
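To make the direction-averaged GLCM step concrete, the following is a minimal pure-Python sketch for a single small window. It is not the implementation used in the study: the toy 4 × 4 window, the four gray levels, the distance of one pixel, the symmetric normalization and the function names are all assumptions, and only three of the eight measures are computed.

```python
import math

# Pixel offsets (row, col) at distance 1 for the four directions.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def glcm(img, levels, offset):
    """Symmetric, normalized gray-level co-occurrence matrix for one offset."""
    dr, dc = offset
    rows, cols = len(img), len(img[0])
    m = [[0.0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = img[r][c], img[r2][c2]
                m[a][b] += 1  # count the pair ...
                m[b][a] += 1  # ... and its mirror (symmetric GLCM)
    total = sum(map(sum, m))
    return [[v / total for v in row] for row in m]


def texture_measures(p):
    """Contrast, homogeneity and entropy of a normalized GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "entropy": -sum(v * math.log(v) for _, _, v in cells if v > 0),
    }


def mean_glcm_features(img, levels):
    """Average each measure over the 0, 45, 90 and 135 degree directions."""
    per_dir = [texture_measures(glcm(img, levels, o)) for o in OFFSETS.values()]
    return {k: sum(f[k] for f in per_dir) / len(per_dir) for k in per_dir[0]}


# Toy window with gray levels already quantized to 0..3 (illustrative only).
window = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 2, 2, 2], [2, 2, 3, 3]]
print(mean_glcm_features(window, levels=4))
```

In a full pipeline this computation would slide over the first principal component image with the 7 × 7 window mentioned above, yielding one value per pixel and measure.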

Random Forest Classification
In addition to the multiple features, a robust classification algorithm is also important for extracting the tailings pond margins. Our framework utilized the random forest algorithm, which has achieved good results in the field of automatic extraction of remote sensing information [45], to further extract the tailings pond margins.
Random forest is an ensemble algorithm based on decision tree classifiers that offers fast computation and high classification accuracy and can deal with high-dimensional features [46]. The random forest algorithm builds a decision tree model for each training set drawn from the original training dataset through the bootstrap method. The pixels are classified by taking the class that receives the most votes from all the decision trees in the forest [47]. For each tree, approximately two-thirds of the training data are sampled to build the classifier. The remaining third is used to calculate the out-of-bag (OOB) error, which can evaluate the performance of random forests [48,49].
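The two-thirds and one-third split is a property of bootstrap sampling rather than a tuned parameter. The sketch below checks it by simulation, assuming one draw with replacement per training sample; the sample count of 28,704 is the training pixel count reported later in this paper, and `oob_fraction` is an illustrative name.

```python
import random


def oob_fraction(n_samples, seed=0):
    """Fraction of samples never drawn in one bootstrap resample of size n_samples."""
    rng = random.Random(seed)
    drawn = {rng.randrange(n_samples) for _ in range(n_samples)}
    return 1 - len(drawn) / n_samples


n = 28704
print(round(oob_fraction(n), 3))    # simulated out-of-bag share for one tree
print(round((1 - 1 / n) ** n, 3))   # analytic expectation, close to 1/e
```

The analytic value is about 0.368, so slightly more than one third of the samples are out-of-bag for any given tree.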
This study created a classification dataset containing tailings ponds and other objects and used feature selection to obtain an optimal random forest model. Based on the importance ranking of the 15 features, feature sets containing the top 1 to 15 features were used in turn to train the random forest model. We selected the feature set with the highest OOB accuracy as the optimal feature set. Finally, the chosen optimal features were used to train the random forest model for extracting tailings ponds.
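The selection procedure described above amounts to a search over prefixes of the ranked feature list. A minimal sketch follows; the scoring callable stands in for training a random forest and reading its OOB accuracy (for example via an `oob_score`-style option in a library of choice), and the toy scorer and feature names are hypothetical, not results from this study.

```python
def select_top_k(ranked, oob_accuracy):
    """Score the top-1 ... top-n prefixes of a ranked feature list; return the best."""
    best_k, best_score = 0, float("-inf")
    for k in range(1, len(ranked) + 1):
        score = oob_accuracy(ranked[:k])
        if score > best_score:  # strict comparison: ties keep the smaller set
            best_k, best_score = k, score
    return list(ranked[:best_k]), best_score


def toy_scorer(features):
    """Hypothetical stand-in for 'train a forest, return its OOB accuracy'."""
    k = len(features)
    return 0.92 - 0.001 * (k - 12) ** 2


ranked_features = [f"f{i}" for i in range(1, 16)]  # placeholder names, most important first
chosen, score = select_top_k(ranked_features, toy_scorer)
print(len(chosen), round(score, 4))  # 12 0.92
```

Keeping the smaller set on ties is a design choice that favors a more compact model when two feature counts perform equally.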

Morphological Processing
The initially extracted image may contain holes and small spurious areas, which should be excluded before the final results are obtained. We therefore performed morphological processing to obtain the final images.
First, we performed a hole filling operation on the initially extracted tailings pond images. Starting from a pixel in the extracted tailings pond image, we dilated it with a structural element and then constrained the result with the complement of the tailings pond image. We repeated the dilation and constraint operations until the result no longer changed and then combined it with the original image to obtain the filled tailings pond image [50]. Next, we removed small interfering areas. The connected regions were extracted from the image processed in the previous step and the number of pixels in each connected region was counted. If the number was less than a given constant, the region was removed [51]. In this way, the final extracted image of the tailings pond was obtained.
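The two clean-up steps can be sketched in pure Python on a binary mask. Note the swapped-in technique: instead of the iterative dilation described above, this sketch flood-fills the background from the image border and treats every background pixel that is not reached as a hole, which is intended to give the same filled mask. The 4-connectivity, the function names and the `min_pixels` threshold are assumptions.

```python
from collections import deque

NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # 4-connectivity (an assumption)


def _flood(mask, seeds, target):
    """Return all cells equal to `target` that are connected to any seed."""
    rows, cols = len(mask), len(mask[0])
    seen, queue = set(), deque()
    for r, c in seeds:
        if mask[r][c] == target and (r, c) not in seen:
            seen.add((r, c))
            queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for dr, dc in NEIGHBORS:
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in seen and mask[nr][nc] == target):
                seen.add((nr, nc))
                queue.append((nr, nc))
    return seen


def fill_holes(mask):
    """Set to 1 every background pixel that is not connected to the image border."""
    rows, cols = len(mask), len(mask[0])
    border = [(r, c) for r in range(rows) for c in range(cols)
              if r in (0, rows - 1) or c in (0, cols - 1)]
    outside = _flood(mask, border, 0)
    return [[0 if (r, c) in outside else 1 for c in range(cols)]
            for r in range(rows)]


def remove_small_regions(mask, min_pixels):
    """Delete connected foreground regions with fewer than min_pixels pixels."""
    out = [row[:] for row in mask]
    visited = set()
    for r in range(len(mask)):
        for c in range(len(mask[0])):
            if mask[r][c] == 1 and (r, c) not in visited:
                region = _flood(mask, [(r, c)], 1)
                visited |= region
                if len(region) < min_pixels:
                    for rr, cc in region:
                        out[rr][cc] = 0
    return out


mask = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 0, 1, 0, 0, 1],  # one interior hole and one isolated pixel
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
cleaned = remove_small_regions(fill_holes(mask), min_pixels=3)
for row in cleaned:
    print(row)
```

In the toy mask the interior hole is filled and the isolated pixel is dropped, leaving a solid 3 × 3 block.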

Accuracy Evaluation
To evaluate the performance of our framework, we evaluated the detection task and final mapped results of the framework, respectively.

Tailings Ponds Detection Accuracy Evaluation
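Precision, recall and mean average precision (mAP) were used in our framework to evaluate the performance of the YOLOv4 model. The formulas are as follows:

$$\mathrm{Precision} = \frac{TP}{TP + FP}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN}$$

$$\mathrm{mAP} = \frac{1}{N}\sum_{i=1}^{N} AP_i, \qquad AP_i = \int_0^1 P_i(R)\,dR$$

where TP represents the number of tailings ponds that were correctly identified, FP represents the number of objects that were mistakenly identified as tailings ponds and FN represents the number of tailings ponds that were not identified. N is the number of categories and AP_i is the area under the precision-recall curve of category i, so the mAP value is the mean area under the precision-recall curve for all categories [52].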
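As a worked illustration, precision TP / (TP + FP) and recall TP / (TP + FN) can be evaluated for the counts this paper reports for the Tongling application: 31 detections, of which 26 were correct, with no visually interpreted pond missed. The function names are illustrative.

```python
def precision(tp, fp):
    """Share of detections that are real tailings ponds."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Share of real tailings ponds that were detected."""
    return tp / (tp + fn)


# Counts taken from the Tongling results section of this paper.
tp, fp, fn = 26, 5, 0
print(f"precision = {precision(tp, fp):.3f}")  # precision = 0.839
print(f"recall    = {recall(tp, fn):.3f}")     # recall    = 1.000
```

These application-level values differ from the test-set values because the study area contains look-alike objects that are rare in the test split.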

Final Results Accuracy Evaluation
After random forest classification and morphological processing, the obtained map of the entire study area was evaluated using the stratified sampling strategy proposed by Olofsson et al. [53,54]. We stratified the map into two classes: tailings ponds and other objects. The total number of sample units and the sample size of each stratum were calculated following [53], and stratified random sampling was conducted with the calculated sample sizes. We used higher-resolution Google imagery and field information to determine the final reference labels. Finally, the omission and commission errors of the tailings ponds and the overall accuracy of the map were calculated based on the error matrix.
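The error-matrix calculation can be sketched as follows, assuming the standard area-weighted estimators for stratified sampling in which each cell is p_ij = W_i · n_ij / n_i. The sample counts and area weights below are hypothetical and are not the values from Table 3.

```python
def area_weighted_accuracy(counts, weights):
    """counts[i][j]: samples mapped as class i whose reference class is j.
    weights[i]: proportion of the mapped area occupied by class i."""
    q = len(counts)
    # Estimated area proportions of each error-matrix cell.
    p = [[weights[i] * counts[i][j] / sum(counts[i]) for j in range(q)]
         for i in range(q)]
    overall = sum(p[i][i] for i in range(q))
    commission = [1 - p[i][i] / sum(p[i]) for i in range(q)]
    omission = [1 - p[j][j] / sum(p[i][j] for i in range(q)) for j in range(q)]
    return overall, commission, omission


# Hypothetical example: class 0 = tailings ponds, class 1 = other objects.
counts = [[90, 10], [5, 895]]
weights = [0.01, 0.99]
overall, commission, omission = area_weighted_accuracy(counts, weights)
print(f"overall accuracy     = {overall:.4f}")
print(f"pond commission err. = {commission[0]:.4f}")
print(f"pond omission err.   = {omission[0]:.4f}")
```

Because the pond stratum covers a tiny share of the area, overall accuracy can stay very high even when the omission error of the pond class is substantial.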

Study Area and Data
Tongling is located in south-central Anhui Province, China. It consists of three municipal districts and one county, covering approximately 3008 square kilometers (Figure 3). Located in the copper-iron metallogenic belt of the Yangtze River, Tongling is rich in mineral resources. There are more than 30 kinds of rare metal minerals in the territory. Copper, gold, pyrite and limestone for cement are the four dominant minerals [55]. However, many tailings generated after mining and smelting are stored in tailings ponds, which have become the primary source of heavy metal pollution in the environmental system of Tongling [56].
To detect tailings ponds based on deep learning, an HSR remote sensing image dataset of tailings ponds was produced. The data used in the tailings pond dataset were 2.05 m resolution Google images of Tongling and other cities that contain tailings ponds (e.g., Ma'anshan, Chizhou, Lu'an, Huangshi). Considering the spatial heterogeneity of tailings ponds, samples were made by selecting as many types of tailings ponds as possible to avoid serious omissions in the tailings pond results. As preparation, we combined the color and morphological characteristics of the tailings ponds in HSR remote sensing images for visual interpretation. The remote sensing images were sliced into 500×500 pixel tiles and we then annotated the tailings ponds in the images with the location information of the target category. Due to the complexity of HSR remote sensing image scenes, negative samples (e.g., reservoirs, mining areas) were added to reduce interference.
The initial dataset contained 352 positive samples and 430 negative samples. A total of 516 annotations were included in the initial dataset. However, deep learning requires a large number of samples to prevent overfitting during training, so data augmentation is usually adopted [57]. Widely used data augmentation methods, including rotation, Gaussian noise, flipping and brightness variation, were adopted to increase the diversity of the samples. For image rotation and flipping, we performed 90, 180 and 270 degree rotations and horizontal and vertical flip operations on the images. For brightness variation, the parameters for adjusting the brightness were set to 0.5 and 1.2 and the bias was set to 10. Noise was added to the original images as well: we added Gaussian noise with a variance of 0.1.
Each positive sample was expanded into 9 new images, and a small number of poor quality samples were removed. To keep the numbers of positive and negative samples similar, we rotated, flipped and added noise to generate 7 new images for each negative sample. After these operations, the final tailings pond dataset contained 6150 images, including 3140 positive samples and 3010 negative samples. A total of 4610 annotations were included in the final dataset. Figure 4 shows samples from the tailings pond dataset.
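The geometric and brightness augmentations can be sketched on a tiny single-band image stored as nested lists. This is an illustration only: it assumes that brightness is adjusted as alpha × pixel + bias with clipping to 0..255, it omits the Gaussian noise step, and the function names are hypothetical. It does not attempt to reproduce the exact number of images generated per sample in the study.

```python
def rot90(img):
    """Rotate a 2-D list by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def flip_h(img):
    """Mirror left to right."""
    return [row[::-1] for row in img]


def flip_v(img):
    """Mirror top to bottom."""
    return [row[:] for row in img[::-1]]


def adjust_brightness(img, alpha, bias=10):
    """Assumed form: alpha * pixel + bias, clipped to the 8-bit range."""
    return [[min(255, max(0, round(alpha * v + bias))) for v in row] for row in img]


def augment(img):
    """Return the rotated, flipped and brightness-adjusted variants of one image."""
    r90 = rot90(img)
    r180 = rot90(r90)
    r270 = rot90(r180)
    return {
        "rot90": r90, "rot180": r180, "rot270": r270,
        "flip_h": flip_h(img), "flip_v": flip_v(img),
        "dark": adjust_brightness(img, 0.5),
        "bright": adjust_brightness(img, 1.2),
    }


tile = [[10, 20], [30, 40]]
for name, variant in augment(tile).items():
    print(name, variant)
```

For annotated positive samples, the bounding boxes would need the same rotation or flip applied so that they stay aligned with the transformed image.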

Detection of Tailings Ponds Based on YOLOv4
The tailings pond dataset was randomly divided into a training set and a testing set at a ratio of 8:2. The YOLOv4 algorithm was trained for 300 iterations with a batch size of 4. The default hyperparameters, which have been shown to perform well in YOLO [58], were applied: a learning rate of 0.01, a momentum of 0.937 and a weight decay of 0.0005.
A multiscale training strategy was adopted to improve robustness to images of different sizes. During the training process, the image size was randomly changed to a multiple of 32 in the range from 320 × 320 to 608 × 608 [59]. After approximately 250 iterations, the training loss tended to be stable. With a confidence threshold of 0.5, the model achieved the best mAP on the testing set when the image size was 448×448, and this model was selected as the final tailings pond detector. As shown in Table 2, the precision is 99.6%, the recall is 89.9% and the mAP is 89.7%.
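The set of admissible training sizes follows directly from the stated range and step. A minimal sketch, assuming one size is drawn uniformly at random per batch (the sampling schedule is not specified in the text):

```python
import random

# All square input sizes that are multiples of 32 between 320 and 608.
SIZES = list(range(320, 608 + 1, 32))


def pick_size(rng):
    """Draw one training resolution; uniform sampling is an assumption."""
    return rng.choice(SIZES)


rng = random.Random(0)
print(SIZES)  # [320, 352, 384, 416, 448, 480, 512, 544, 576, 608]
print(pick_size(rng))
```

The 448 × 448 size selected for the final detector is one of these ten candidates.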
The results showed that the YOLOv4 algorithm can detect tailings ponds accurately and provide a reasonable basis for further extraction of tailings ponds. Taking Tongling as the study area, the final tailings pond detector was used to detect the tailings ponds in the HSR remote sensing image. The images of Tongling were sliced into 500 × 500 pixel tiles and a total of 2745 images were obtained. We obtained 26 tailings ponds as the ground truth by visual interpretation. With a confidence threshold of 0.5, 31 features were detected as suspected tailings ponds. The tailings ponds obtained by visual interpretation were all included in the detected regions. The distribution of targets obtained by geospatial mapping is illustrated in Figure 5. The detected targets were verified by visual interpretation and the information provided by the tailings pond managers. Five false positive targets were detected by YOLOv4, mainly including reservoirs, waste rock piles and bare ground. Examples of incorrectly identified targets are shown in (d) and (e). Target (d) was bare ground with a shape similar to that of a tailings pond. Target (e) was a reservoir with a structure similar to that of a tailings pond, including water bodies and beaches. The color of the water body was similar to the color of tailings pond wastewater and showed a similar brightness. Inactive tailings ponds contained very little water. The bare ground and the exposed land in small reservoirs were similar in shape and texture to the tailings in inactive tailings ponds. These similar properties made the YOLOv4 model prone to prediction errors. Objects that were incorrectly detected as tailings ponds were excluded from the final results. A total of 26 tailings ponds were finally recognized in Tongling. The correctly detected tailings ponds served as the basis for further extraction of tailings ponds.
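Slicing a large image into tiles and mapping tile-level detections back to map coordinates can be sketched as below. The non-overlapping tiling, the handling of edge tiles, the north-up georeferencing and all function names are assumptions; the text only states the 500 × 500 tile size and the 2.05 m resolution.

```python
def tile_origins(width, height, tile=500):
    """Top-left pixel offsets of non-overlapping tiles covering the image."""
    return [(x, y) for y in range(0, height, tile) for x in range(0, width, tile)]


def to_global(box, origin):
    """Shift a tile-local box (xmin, ymin, xmax, ymax) into mosaic pixel coordinates."""
    ox, oy = origin
    xmin, ymin, xmax, ymax = box
    return (xmin + ox, ymin + oy, xmax + ox, ymax + oy)


def pixel_to_map(px, py, x0, y0, res=2.05):
    """North-up image: (x0, y0) is the map position of the top-left corner."""
    return (x0 + px * res, y0 - py * res)


origins = tile_origins(1200, 900)                     # hypothetical mosaic size
box = to_global((10, 20, 110, 220), origin=(500, 500))
print(len(origins), box)                              # 6 (510, 520, 610, 720)
print(pixel_to_map(box[0], box[1], x0=0.0, y0=0.0))   # map position of the box corner
```

Detections that straddle a tile boundary would need overlapping tiles or a merge step, which this sketch does not cover.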

The Extraction of Tailings Ponds in Tongling
Considering that the characteristics of tailings ponds vary greatly in different regions, we performed the selection of tailings classification samples in the correctly detected regions. This study created a training dataset containing 28,704 pixels for two classes. The number of decision trees was set to 500, which is a frequently used value when using the random forest classifier on remote sensing data [60].

Selection of Optimal Features Based on Random Forest
The importance ranking of the 15 features is shown in Figure 6. The variation in OOB accuracy (%) concerning the number of features used during random forest based extraction of tailings ponds is demonstrated in Figure 7. The OOB accuracy started to increase with the increase in the input features until reaching an optimum, after which it began to decrease. The OOB accuracy achieved 91.38% by using all the features. In contrast, the OOB accuracy achieved 92.03% by using 12 features. Therefore, the top 12 features were selected as the feature variables to obtain an optimal random forest model.

Extraction Results of Tailings Ponds
This study selected the optimal random forest model to extract tailings ponds. Due to the existence of internal holes and broken areas in the extracted tailings ponds (Figure 8b), we used hole filling and small area removal methods in morphological processing to extract final tailings ponds. Examples of tailings pond extraction results are shown in Figure 8.

Final Map Accuracy Assessment
The obtained map of the entire study area was evaluated using the stratified sampling strategy. A total of 90,000 sample units (pixels) were obtained. The two strata, tailings ponds and other objects, were allocated 3200 and 86,800 sample units, respectively. The error matrix based on the estimated proportions of areas is shown in Table 3. The overall accuracy was 99.98% and the omission and commission errors of the tailings ponds were 18.75% and 7.15%, respectively. The results indicated that the map of the entire study area was obtained with high accuracy.

Model Comparison Results
To prove the proposed method's efficiency in extracting tailings ponds in a large area, a conventional random forest pixel classification method for tailings ponds extraction was also used in the experiments for comparison. In the whole study area, substantial disk resources are required for storage and calculation after fusing multiple texture features. Therefore, we chose Tongshan town in the study area (approximately 0.017 times the size of Tongling) to compare the extraction time of different methods.
As shown in Table 4, the extraction time of the proposed method in Tongshan town was 17.8 s and the extraction time in the whole study area was 378.5 s. In contrast, the extraction time of random forest classification alone in Tongshan town was 844.7 s. Combined with the morphology of tailings ponds in HSR remote sensing images, the margins of the tailings ponds detected by YOLOv4 were outlined to verify the extraction results. Although the tailings ponds were well extracted by plain random forest (Figure 9a), the misclassification of other regions was obvious. In the results shown in Figure 9b,c, the misclassified regions were greatly reduced and better extraction results were obtained. Overall, compared with using random forest classification alone, the proposed method was more efficient and effective in extracting tailings ponds.

| Comparative Experiment | Extraction Time in Tongshan Town | Extraction Time in the Study Area |
| --- | --- | --- |
| YOLOv4 + random forest | 17.8 s | 378.5 s |
| Only random forest | 844.7 s | Approximately 13 h |
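The relative time savings implied by these figures can be checked with a few lines of arithmetic. The last line is a rough plausibility check of the 13 h figure that assumes extraction time scales linearly with area, using the 0.017 area ratio of Tongshan town to Tongling given above; that linear scaling is an assumption, not a statement from the paper.

```python
def reduction(t_new, t_old):
    """Relative time saving of the new method over the old one."""
    return 1 - t_new / t_old


town = reduction(17.8, 844.7)        # Tongshan town, both times in seconds
city = reduction(378.5, 13 * 3600)   # whole study area, 13 h converted to seconds

print(f"Tongshan town: {town:.1%}")  # Tongshan town: 97.9%
print(f"Study area:    {city:.1%}")  # Study area:    99.2%

# Linear extrapolation of the random-forest-only time from the town to the city.
print(f"{844.7 / 0.017 / 3600:.1f} h")  # 13.8 h
```

The study-area value is consistent with the reduction of nearly 99% stated in the abstract.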

Discussion
In previous studies, few researchers used HSR remote sensing images to extract tailings pond margins in large spatial areas. This study proposed a framework that combined YOLOv4 and the random forest algorithm to extract tailings pond margins. First, the YOLOv4 model was trained with a tailings pond dataset to obtain the tailings pond locations. On this basis, the optimal random forest model combined with spectral and texture features was utilized to extract initial tailings ponds on the correctly detected regions. Finally, hole filling and small area removal methods were performed on the initial results to obtain the final extraction results of tailings ponds.
This study first combined deep learning with random forest to extract tailings pond margins. Our proposed framework can be applied to the areas where satellite images are available. The proposed framework can quickly locate tailings ponds and extract them with great extraction performance, significantly reducing the time for visual interpretation of remote sensing images. A map that contains the tailings ponds and the areas classified as another type could also be further obtained. Besides, we have produced a tailings pond dataset using visual interpretation, which contains various types of tailings ponds and has proved to be valid. The dataset could provide data support for relevant research on tailings ponds.
Tailings ponds occupy only a small part of the total area. Considering the actual demands of mineral resource inventories, the image area to be processed is usually large (e.g., a city or province). When the margins of tailings ponds need to be obtained in a large area, both the extraction performance and the saving of computing resources have to be taken into account. Therefore, the YOLOv4 model was used to detect the tailings ponds, which is the key step of the proposed framework. The YOLOv4 model addresses the time-consuming nature of tailings pond extraction from HSR remote sensing images and accurately locates the tailings pond regions. Based on the tailings pond dataset, with a confidence threshold of 0.5, we obtained high accuracy (precision = 99.6%, recall = 89.9%, mAP = 89.7%). The experimental results showed that the YOLOv4 model in our framework could effectively detect tailings pond locations in a large HSR remote sensing image and provide a reasonable basis for the further extraction of tailings ponds.
Compared with the areas covered in current studies, the area of Tongling (approximately 3008 km²) is relatively large. Applying the proposed method to extract tailings ponds in Tongling, we obtained 26 correctly detected regions and 5 wrongly detected regions. In actual applications, there are inevitably some objects, such as reservoirs and bare ground, that have characteristics similar to those of tailings ponds and cause YOLOv4 detection errors. This is the reason for the discrepancy between the model accuracy and the results in the actual application. Based on the correctly detected regions, the optimal random forest model and morphological processing were used to further extract tailings ponds. Finally, we obtained a map of the entire study area containing tailings ponds and other objects. The obtained map was evaluated based on a stratified sampling strategy. The omission and commission errors of the tailings ponds were 18.75% and 7.15%, respectively, with an overall accuracy of 99.98%. The results showed that our framework performed well in extracting the tailings ponds. Since the YOLOv4 model excluded a large number of other objects, the areas outside the target regions and the apparently misidentified target regions were classified as other objects when we mapped the entire study area. The target regions were small relative to the whole area and other objects misclassified as tailings ponds only existed in the target regions, resulting in a small probability of misclassification of other objects and a high overall accuracy.
In addition, the total extraction time of the proposed framework was 378.5 s. The computational time of the proposed framework was mainly affected by the number of image pixels and the most time-consuming step was the detection calculation of the YOLOv4 model. The extraction efficiency and effectiveness of the proposed method were much better than those of random forest classification alone. The reason is that we first used the YOLOv4 model to accurately locate the tailings ponds, which significantly reduced the time the random forest model spent classifying irrelevant features and also improved the extraction of tailings ponds. In previous studies, the Mask R-CNN model has been used for boundary extraction [61]. However, Mask R-CNN mainly focuses on the accuracy of the model and gives less consideration to speed [62]. In practice, large-scale and high-frequency monitoring is needed, and the algorithm would be very time-consuming when applied to large HSR remote sensing images (e.g., of a city or province) [63]. In contrast, our framework exploits the speed of YOLOv4, which makes it more suitable for practical applications. The combination of YOLOv4 and random forest enabled our entire framework to take speed into account while considering accuracy.
Furthermore, by conducting random forest feature optimization, we found that the mean texture feature and the R-band feature were dominant in the extraction. The selected optimal features included spectral bands, GLCM texture and Gabor texture. This result indicates that an appropriate combination of spectral and texture features played an important role in improving tailings pond extraction accuracy. In addition, the highest OOB accuracy of 92.03% was obtained when the top 12 feature variables were selected; as further features were added, the OOB accuracy tended to decline (Figure 7). A possible reason for this phenomenon is that the features adopted in this study are correlated with one another, so using all features in classification would introduce information redundancy and thus reduce classification accuracy [64].
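A feature optimization of this kind can be sketched as follows, assuming scikit-learn is available. The data are synthetic (generated with `make_classification`) and stand in for the per-pixel spectral and texture features. The impurity-based `feature_importances_` ranking is used here as an assumption and may differ from the importance measure applied in this study, and the helper `oob_accuracy` is hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for per-pixel features; redundant columns mimic correlated features.
X, y = make_classification(
    n_samples=400, n_features=16, n_informative=5, n_redundant=6, random_state=0
)


def oob_accuracy(features, labels):
    """Fit a random forest and return its OOB accuracy and feature importances."""
    rf = RandomForestClassifier(n_estimators=100, oob_score=True, random_state=0)
    rf.fit(features, labels)
    return rf.oob_score_, rf.feature_importances_


# Rank all features once, from most to least important.
_, importances = oob_accuracy(X, y)
ranking = np.argsort(importances)[::-1]

# Refit with the top-k features for every k and record the OOB accuracy.
scores = {}
for k in range(1, X.shape[1] + 1):
    scores[k], _ = oob_accuracy(X[:, ranking[:k]], y)

best_k = max(scores, key=scores.get)
print(f"best number of features: {best_k}, OOB accuracy: {scores[best_k]:.4f}")
```

Plotting `scores` against `k` gives a curve of OOB accuracy versus the number of selected features. Its shape on real data depends on how strongly the features are correlated.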
Although our framework performed well, some of its details could be improved in the future. First, although Tongling is a relatively large region, the applicability of our framework remains to be verified in city clusters in our future studies. Second, because tailings ponds are complex and diverse and are influenced by the natural conditions and policies of the regions in which they are located, the original tailings pond dataset in this study was small, and we used data augmentation to expand it. The detection performance of the model may therefore be affected when the study area changes. In the future, it is necessary to improve the completeness of the dataset by collecting more samples from other cities and to explore the impact of different data augmentation methods in order to further enhance the robustness of the detection model. Third, despite the high overall accuracy of the obtained map, omission and commission errors of tailings ponds remain. The pixel-based classification we used inevitably limited the extraction quality, and an object-oriented classification method could be considered in the future. Finally, tailings ponds have strong spatial heterogeneity, and their characteristics vary greatly among regions. Therefore, the random forest extraction model needs to be adjusted to the characteristics of local tailings ponds to further enhance its robustness. Moreover, beyond extracting tailings ponds, future studies may consider adding hyperspectral remote sensing data and DEM data to study the scope of infiltration pollution of tailings ponds and to conduct environmental risk assessments.

Conclusions
This study proposed a framework that combines YOLOv4 with the random forest algorithm to extract tailings pond margins from large-area HSR remote sensing images and created an open-source tailings pond dataset. Taking Tongling as the study area and based on our dataset, the proposed framework achieved high accuracy in detecting tailings pond locations (precision = 99.6%, recall = 89.9%, mAP = 89.7%). Based on an optimal random forest model and morphological processing, tailings pond margins were further extracted, and the map of the entire study area was obtained with an overall accuracy of 99.98%. Our framework is more efficient than the random forest algorithm alone. It can extract various tailings pond locations and margins with high accuracy and speed from a large HSR remote sensing image. In addition, we found that an appropriate combination of spectral and texture features played an essential role in improving tailings pond extraction accuracy. This study can provide government departments with an effective inventory method for tailings ponds and a useful reference for mine safety and environmental monitoring.