The Use of C-Band and X-Band SAR with Machine Learning for Detecting Small-Scale Mining

Illicit small-scale mining occurs in many tropical regions and is both environmentally and socially hazardous. The aim of this study was to determine whether the classification of Synthetic Aperture Radar (SAR) imagery could detect and map small-scale mining in Ghana by analyzing multi-temporal filtering applied to three SAR datasets and testing five machine-learning classifiers. Using an object-based image analysis approach, we were successful in classifying water bodies associated with small-scale mining. The multi-temporally filtered Sentinel-1 dataset was the most reliable, with kappa coefficients of 0.65 and 0.82 for the multi-class classification scheme and binary-water classification scheme, respectively. The single-date Sentinel-1 dataset had the highest overall accuracy, at 90.93% for the binary-water classification scheme. The KompSAT-5 dataset achieved the lowest accuracy, with an overall accuracy of 80.61% and a kappa coefficient of 0.61 for the binary-water classification scheme. The experimental results demonstrated that it is possible to classify water as a proxy to identify illegal mining activities and that SAR is a potentially accurate and reliable solution for the detection of small-scale mining in tropical regions such as Ghana. Therefore, using SAR can assist local governments in regulating small-scale mining activities by providing specific spatial information on the whereabouts of small-scale mining locations.


Introduction
Small-scale mining (SSM) has devastating impacts on the natural environment when not regulated properly, and many small-scale mining operations are operated illegally. SSM is a low-cost, labor-intensive method of mining [1] in areas where gold is easily accessible, such as on the river banks where alluvial gold deposits can be found. Land degradation is a consequence of such mining activities and remains an important global issue, as the global demand for precious minerals will continue to increase [2]. Around two-thirds of the total supply of these minerals comes from countries in South America, South Asia, and Sub-Saharan Africa [3].
In Ghana, "galamsey" is the term commonly used to describe illegal mining activity. Mantey et al. [4] and Owusu-Nimo et al. [5] suggest that galamsey operations are an illegal or unregulated form of SSM and processing of gold that lies at or below soil and water surfaces. Galamsey operations have historically been associated only with simple tools and manual labor [6,7], but mechanized equipment such as excavators has recently come into use [4,8], most probably due to the influx of foreign nationals who changed the operational dynamics of galamsey operations [4]. The precious minerals are gathered discreetly and sold in contravention of state laws [4,9]. Galamseyers also do not pay tax, many mines are in delicate or prohibited areas, and human safety is often put at risk [5,10-12].
Galamsey operations tend to leave behind many wastelands in the form of pits flooded with water, deforested lands, and polluted water bodies [5,13] that are hazardous to the health assessments that involve manual sampling methods, such as the studies by [53][54][55][56][57][58]. Only a few examples exist of where remote sensing is used to map or monitor SSM and these include [14,15,36,59], who used optical imagery.
Almeida-Filho and Shimabukuro [60] have published the only study so far using the backscatter values of SAR to map the degeneration caused by SSM. The aim of their study was to investigate the possibility of using SAR to detect degradation areas caused by independent gold miners, "garimpeiros", in the Amazon because of the regular cloud cover in the region. Almeida-Filho and Shimabukuro [60] used three 18-m resolution, L-band, HH polarization Japanese Earth Resource Satellite-1 (JERS-1) images from the years 1993 (dry season), 1994 (rainy season), and 1996 (rainy season). Speckle filtering was applied with a 7 × 7 window kernel and the images were resampled to 30 m resolution to match the resolution of Landsat TM, acquired in 1994, which was used as the reference image. Their study area was the Tepequém plateau, situated in northern Brazil, which consisted of savannah grass that was surrounded by tropical rain forest. They found that the low grassland vegetation produced subtle tonal contrasts in the SAR imagery compared to the high contrast produced by deforestation in studies performed in the forested areas, e.g., [61][62][63]. The similar backscatter responses between the eroded areas and the savannah grass made identification of the degraded areas from gold-mining activities unidentifiable when using a single-date JERS-1 SAR image. Their change detection normalized difference index (NDI) technique showed that the land cover change was detectable with the low tonal contrast of the grasslands. Due to the lack of a reliable classifier for SAR imagery, the authors of [60] were unable to produce a thematic map of the degradation areas found in their 1993-1994 NDI image.
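As a rough illustration of the change-detection idea described above, the sketch below computes a per-pixel normalized difference between two co-registered dates. The form (after - before) / (after + before) on linear-scale backscatter is a common convention and is an assumption here; the exact formulation used in [60] is not given in this text. The function name and the 2 × 2 patches are hypothetical.

```python
def normalized_difference(before, after):
    """Per-pixel (after - before) / (after + before) for linear-scale backscatter.

    Assumed form of a normalized difference index; inputs are equally sized
    2-D lists of positive linear intensities (not decibels).
    """
    result = []
    for row_before, row_after in zip(before, after):
        out_row = []
        for b, a in zip(row_before, row_after):
            total = a + b
            # Guard against division by zero for empty (zero-valued) pixels.
            out_row.append((a - b) / total if total != 0 else 0.0)
        result.append(out_row)
    return result


# Hypothetical patches: only the top-right pixel changes between the two dates.
before = [[0.20, 0.20], [0.10, 0.30]]
after = [[0.20, 0.05], [0.10, 0.30]]
ndi = normalized_difference(before, after)
```

Unchanged pixels give values near zero, while a drop in backscatter gives a negative value, which is what makes subtle tonal changes easier to separate than in a single-date image.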
From the study by Almeida-Filho and Shimabukuro [60], it is clear that SAR has the potential to detect SSM. Their challenge of detecting the degradation areas in single-date imagery can be tested over forested areas where gold mining takes place, such as in Ghana. The recent publication of research by [19] proved that SSM in Ghana can be detected with SAR. They used Sentinel-1 time series data and compared the mean, minimum, and maximum backscatter difference images of both polarizations. They found that the minimum backscatter images were the most sensitive to detecting changes caused by mining-induced land cover changes in Ghana and that a threshold value of +1.65 dB was suitable to classify this change. It is, however, uncertain what the impact of multi-temporal filtering of a time series of Sentinel-1 imagery would have on SSM detection accuracies. The use of a wider variety of machine learning algorithms, especially applied to different SAR wavelengths and features, has not been sufficiently investigated.
In this study, we evaluated how different SAR sensors compare regarding the mapping of SSM when applying classification. We tested how single-product speckle-filtered SAR performed, compared with multi-temporal filtered SAR, when applying classification, and which classification algorithm is best suited for the mapping of SSM with SAR imagery.

Study Area
The study area is located in Southern Ghana, which includes the Ofin River near the mining town of Obuasi (Figure 1). The study area is rural and dominated by forests. Southern Ghana has a tropical climate with daytime temperatures ranging between 25 and 35 degrees Celsius [64]; the most rainfall occurs between July and September and the average annual rainfall is 736.6 mm [65]. The absence of large buildings in the scattered rural settlements is beneficial for applying remote sensing with SAR since the buildings will not interfere with the radar signal. Buildings produce a double-bounce effect that reflects the microwave energy back to the sensor very strongly.
Figure 1. The footprints of the satellite imagery used in this study are located in Southern Ghana. All the imagery was cropped to the KompSAT-5 footprint that delineates the study area.

Data Collection
The data used in this study consist of multiple Sentinel-1 images and one KompSAT-5 image. Sentinel-1 imagery is freely available through the Copernicus Open Access Hub, and the KompSAT-5 image was sponsored by SI-Imaging. For ground truth validation, open-access Sentinel-2 imagery was used. The Sentinel-1 imagery is C-band (5.405 GHz) SAR backscatter with incidence angles ranging from 32.9° to 43.1° and a spatial resolution of 20 m. The Sentinel-1 imagery was accessed as interferometric wide-swath (IW) single-look complex (SLC) products from the Copernicus Open Access Hub. IW offers dual-polarization capability, with vertical transmit-horizontal receive (VH) and vertical transmit-vertical receive (VV). The Sentinel-1 imagery spans August 2017 to August 2018, with one image selected for each month. While all 13 images were used for the multi-temporal filtering process, classification was only performed on the 12 August 2018 image.
The KompSAT-5 (Korea multi-purpose satellite) X-band (9.66 GHz) image used in this experiment was provided by SI-Imaging Services. KompSAT-5's enhanced standard (ES Standard) mode is similar to Sentinel-1's IW mode, with single-polarization VH. The incidence angle ranges from 28.8° to 55°, the spatial resolution is 3 m, and it covers a swath of 30 km (SI-Imaging Services 2019). The date of acquisition was 12 July 2018.
Sentinel-2 optical imagery was used as the ground truth dataset for training and validation purposes. The aim was to retrieve an image with the least amount of cloud cover, closest to the date of the single-date SAR imagery (July/August 2018). Sentinel-2 imagery is also freely available through the Copernicus Open Access Hub. Level-1C imagery was downloaded and only the true color band combination (RGB 4-3-2) at 10 m spatial resolution was used. Figure 2 displays an example of each of the Sentinel-1 and the KompSAT-5 imagery over an area where abundant SSM takes place, surrounding the Ofin River. The ground truth Sentinel-2 image is also shown over the same area segment.
The Sentinel-1 toolbox, available in SNAP, was used for the pre-processing of both the Sentinel-1 and KompSAT-5 datasets. Pre-processing involved the radiometric calibration, geocoding, and terrain correction of all images. The single-date images were speckle-filtered using the Lee Sigma (7 × 7) filter, and the multi-temporal Sentinel-1 dataset underwent image co-registration and multi-temporal filtering, also with the Lee Sigma (7 × 7) filter. These filtered datasets will be referred to as S1 and S1-MT, respectively, in this article. All filtered images were converted to decibel format. Additional textural features were derived from each dataset using PCI Geomatica's TEX algorithm to calculate the grey-level co-occurrence matrix (GLCM) for each polarization band. The following eight texture measures were calculated: homogeneity, contrast, dissimilarity, mean, variance, entropy, angular second moment (ASM), and correlation.
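To make the texture step more concrete, the following is a minimal sketch of a grey-level co-occurrence matrix and five of the eight measures listed above, written with textbook GLCM definitions. It is not PCI Geomatica's TEX implementation, whose window size, quantization, and offsets are not given here; the function names, the single pixel offset, and the tiny 2 × 2 patches are illustrative assumptions.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized co-occurrence matrix for one pixel offset.

    `image` is a 2-D list of integer grey levels in range(levels).
    """
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count both directions (symmetric GLCM)
    total = sum(map(sum, counts))
    return [[value / total for value in row] for row in counts]


def texture_measures(p):
    """Textbook texture measures from a normalized GLCM `p`."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "dissimilarity": sum(v * abs(i - j) for i, j, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "asm": sum(v * v for _, _, v in cells),
        "entropy": -sum(v * math.log(v) for _, _, v in cells if v > 0),
    }


# Hypothetical patches quantized to two grey levels.
uniform = texture_measures(glcm([[0, 0], [0, 0]], levels=2))
checker = texture_measures(glcm([[0, 1], [1, 0]], levels=2))
```

A uniform patch, as expected for calm water, gives zero contrast and an ASM of 1, whereas the alternating patch gives higher contrast and lower ASM. In practice the backscatter would need to be quantized to integer grey levels before this step.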

Methodology
The research design is illustrated in Figure 3. First, the SAR features were extracted per object after image segmentation was applied. Then, samples were collected for input to the training and testing datasets for the machine learning classifiers. Next, the classifier was run on each dataset. Lastly, validation statistics were calculated to analyze the accuracy and reliability of each classifier or dataset.

Sample Collection
Samples were selected to use as input for the training and testing of the machine learning classifiers. Samples were collected on the Sentinel-1 multi-temporally filtered image (S1-MT) by selecting objects from the segmentation results and visually confirming on the Sentinel-2 image to which class the object belongs. Two classification scheme designs were tested. The first classification scheme was a binary-water classification scheme consisting of two classes: water and non-water. Binary classification schemes are simple and can be highly accurate when the scheme consists merely of, for example, water and non-water. After several studies attempted to take advantage of the very low backscatter received from water bodies [66-69], it was established that, when mapping large water bodies, the simple single-threshold method did not produce good results [70] because of variability in the environment, such as wind roughening, and in satellite parameters [71,72]. Spatial and temporal variability in backscatter also occurred in permanent water bodies in the study by [73].
The second classification scheme was a multi-class classification scheme that consisted of the following classes: water, bare ground, vegetation and built-up area. This was performed to compare binary classification results to a more generic classification scheme approach that is popularly used for land cover classification applications (e.g., [74,75]). Both classification schemes are of interest for this study, to compare the accuracy and robustness of the classification algorithms. The descriptions of each class are listed in Table 1. The number of samples collected was 197 per class. Thus, the multi-class classification database consisted of 788 samples, and the binary-water classification database comprised 394 samples.

Table 1. Descriptions of each class.
Multi-Class Classes | Binary-Water Classes | Description
Water | Water | Uniform texture. Blue, green pixels.
Built-up area | Non-water | Built structures such as roads, dam walls and buildings. Can be grey in color or brown.
Vegetation | Non-water | Forest: Densely packed trees, usually of a dark green color. Have a "speckled" texture. Low vegetation: Usually located in patches of lighter green areas between darker green (forest/tree) areas. Vegetation refers to grass, bushes, and low trees.
Bare ground | Non-water | Open areas that contain no structures or vegetation. Typically, a dark brown/brown/orange color or grey color.

Image | Backscatter Bands | Texture Features | Geometric Attributes | Total Number of Features
S1 | VH, VV (n = 2) | (n = 16) | (n = 5) | 23
K5 | VH (n = 1) | (n = 8) | (n = 5) | 14
S1-MT | VH, VV (n = 2) | (n = 16) | (n = 5) | 23

The SAR datasets were classified using the random forest machine-learning algorithm. Random forest (RF) is a classifier that fits a number of decision tree classifiers on various subsamples of the dataset and then uses averaging to improve the predictive accuracy and to control overfitting. Breiman (2001) defines RF as a "combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest" [76]. RF selects the best solution from all the predictive sets by means of voting, and it stores the significance of the input features [26]. The classifier was run 1000 times using the Python-based scikit-learn library; for each run, 40% of the samples were held out as the test set, and the samples were selected according to a stratified random approach.
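The stratified 60/40 split described above can be sketched with the standard library alone. This is an illustrative stand-in rather than the code used in the study: the function name, the per-class rounding rule, and the fixed random seed are assumptions, and only the class names, the 197 samples per class, and the 40% test size come from the text.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction, rng):
    """Return (train_idx, test_idx), drawing the same test fraction from every class."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        members = by_class[label][:]
        rng.shuffle(members)  # random selection within the class
        n_test = round(len(members) * test_fraction)  # assumed rounding rule
        test_idx.extend(members[:n_test])
        train_idx.extend(members[n_test:])
    return train_idx, test_idx


# 197 samples per class, as reported for the multi-class scheme (788 in total).
classes = ("water", "bare ground", "vegetation", "built-up area")
labels = [name for name in classes for _ in range(197)]
rng = random.Random(0)  # fixed seed only to make the sketch repeatable
train_idx, test_idx = stratified_split(labels, 0.40, rng)
test_counts = {name: sum(labels[i] == name for i in test_idx) for name in classes}
```

Repeating the study's procedure would mean calling such a split once per run, fitting the classifier on the training indices, and scoring it on the test indices. In scikit-learn, the analogous call is `train_test_split` with `test_size=0.4` and the `stratify` argument set to the label array, although its per-class counts may differ from this sketch by rounding.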

Image Classification Part 2: Machine Learning Classification Comparison
The five most commonly used machine learning classifiers employed in the classification of remote sensing datasets are: k-nearest neighbor (KNN), decision trees (DT), random forest (RF), the support vector machine (SVM), and kernel support vector machine (C-SVM) [45]. The popularity of these algorithms for remote sensing applications is because of their ease of use, computational efficiency and adaptability [77]. Each classification algorithm uses a different mathematical model as an approach to pattern recognition and the prediction of classes.
Decision trees represent a supervised machine learning method that uses a tree-like model based on conditional control statements. This non-parametric method is used for classification and regression analysis. The DT algorithm creates a model that predicts a target value through learned decision rules that are inferred from the data input features [78].
Neighbors-based classification, such as k-nearest neighbor, is a type of instance-based learning where new problem instances are compared with instances seen in training. The algorithm does not attempt to construct an internal model but only stores instances of the training data in the memory [79]. KNN classifies each point by a majority vote of its nearest neighbors, measured using, e.g., the Euclidean distance between points [78].
Support vector machines are used for classification and regression analysis by building an internal model that works well for non-linear datasets [80,81]. The algorithm takes input vectors and maps them non-linearly onto a high-dimension feature space, to maximize the width of the gaps between categories using a hyperplane [80]. When a hyperplane cannot separate the two categories while maximizing the margin, C-SVM is used, where C is the regularization parameter that has an inverse relationship with the margin [82]. SVM utilizes kernel functions that map the input data to higher dimensions to achieve the optimal hyperplane for maximum separation between classes [80]. The types of kernels used by SVM are linear, polynomial, sigmoid and radial basis function (RBF), where the RBF kernel is commonly used in many applications [83]. SVM is a highly accurate classifier compared to classifiers such as decision trees [84].
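The neighbor-voting idea behind KNN can be illustrated with a short sketch that stores the training instances and classifies a query by the majority label among its k nearest neighbors under Euclidean distance. The function name, the value of k, and the two-feature (VH, VV) backscatter values in decibels are hypothetical and are not taken from the study's data.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training instances."""
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    # On a tie, Counter returns the label encountered first among the neighbors.
    return votes.most_common(1)[0][0]


# Hypothetical (VH, VV) values in dB: water objects are darker than non-water objects.
train_features = [
    (-24.0, -18.0), (-23.0, -17.0), (-25.0, -19.0),
    (-14.0, -8.0), (-15.0, -9.0), (-13.0, -7.0),
]
train_labels = ["water"] * 3 + ["non-water"] * 3

dark_object = knn_predict(train_features, train_labels, (-22.0, -17.0))
bright_object = knn_predict(train_features, train_labels, (-14.0, -8.5))
```

Because no model is built, all the work happens at prediction time, which is why the method depends directly on the stored training samples and the chosen distance measure.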
Image classification was conducted with the same methodology as was used for the SAR dataset comparison, but it was only performed on the S1-MT dataset. Each classifier was run separately for both classification schemes and the test set size remained at 40%. The dataset was re-split for each run, and each classifier was run 1000 times. All classifiers were trained using the same set of S1-MT input features as in the SAR dataset comparison, which included intensity bands, texture layers, and geometric features. The classifiers tested were RF, DT, SVM (with linear kernel function), C-SVM (with RBF kernel function), and KNN.

Accuracy Assessment
Confusion matrices were created for each classifier and classification scheme, to measure the relative performance of each classifier. The average of each position in the confusion matrix was calculated from the 1000 iterations of the classifier, to create a derived confusion matrix for each classification scheme. Error metrics like producer's and user's accuracies, overall accuracy and kappa coefficient were derived from the matrix. The kappa coefficient is an important metric often used in the accuracy assessment of classification results in remote sensing applications [85,86]. The kappa coefficient is not an index of accuracy, but is instead an indication of agreement beyond chance and has been criticized as unsuited for typical remote sensing applications [86]. Despite this criticism, the kappa coefficient was reported in this study for ease of comparison, since it has been used by many remote sensing studies for an accuracy assessment of the production of thematic maps (e.g., [45,85,87]).
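The metrics named above follow from a confusion matrix by standard formulas, which the sketch below applies. The orientation (rows as reference classes, columns as predicted classes) and the 2 × 2 counts are assumptions chosen for illustration; they are not values from the study's appendix.

```python
def accuracy_metrics(matrix):
    """Overall accuracy, kappa, producer's and user's accuracies.

    Assumes matrix[i][j] counts objects of reference class i predicted as class j,
    and that every class has at least one reference and one predicted object.
    """
    n = len(matrix)
    total = sum(map(sum, matrix))
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    diagonal = [matrix[i][i] for i in range(n)]
    overall = sum(diagonal) / total
    # Chance agreement expected from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (overall - expected) / (1 - expected)
    producers = [d / r for d, r in zip(diagonal, row_totals)]  # 1 - omission error
    users = [d / c for d, c in zip(diagonal, col_totals)]      # 1 - commission error
    return overall, kappa, producers, users


# Hypothetical binary matrix: classes ordered as [water, non-water].
matrix = [[70, 9],
          [7, 72]]
overall, kappa, producers, users = accuracy_metrics(matrix)
```

For a two-class matrix with equal reference totals, the expected chance agreement is 0.5, so kappa reduces to 2 × OA − 1. This appears consistent with the reported pairs of 90.93% with 0.82 and 80.61% with 0.61 for the binary-water scheme.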
The statistical significance was measured with the area under the curve (AUC) method that is based on probability assessment. The standard normal value (z), together with the confidence level, which was set at 95%, is then used to get the probability from the probability tables that can be found in the study by [88]. The higher the probability, the larger the area under the curve and, therefore, the more significant the result. The z-scores were calculated using the means and standard deviations of the overall accuracy results of each classifier on the multi-class classification scheme.
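The table lookup described above can be reproduced numerically: the probability for a standard normal value z is the area under the standard normal curve to the left of z. The sketch below uses Python's `statistics.NormalDist` for that lookup. The helper names are hypothetical, and how the study chose the mean and standard deviation for each classifier's z-score is not fully specified here, so the `z_score` helper simply takes them as explicit arguments.

```python
from statistics import NormalDist


def z_score(value, mean, std_dev):
    """Standard normal value of `value` for a given mean and standard deviation."""
    return (value - mean) / std_dev


def probability_from_z(z):
    """Area under the standard normal curve to the left of z (the table lookup)."""
    return NormalDist().cdf(z)


# Illustrative inputs only, not the study's accuracies.
example_z = z_score(75.0, 70.0, 2.5)
p_at_mean = probability_from_z(0.0)
p_at_196 = probability_from_z(1.96)
```

A z of 0 gives a probability of 0.5, and larger z values give larger areas under the curve, matching the statement that a higher probability indicates a more significant result.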

SAR Datasets Comparison
The S1-MT dataset had the highest overall accuracy at 73.60% for the multi-class classification scheme and S1 had the highest overall accuracy at 90.93% for the binary-water classification scheme (Figure 4). However, the S1-MT dataset had the highest kappa coefficients at 0.65 and 0.82 for the multi-class classification scheme and binary-water classification scheme, respectively. The K5 dataset yielded the lowest overall accuracy and kappa coefficient for both classification schemes, at 80.61% and 0.61, respectively, for the binary-water classification scheme.
Remote Sens. 2021, 13, x FOR PEER REVIEW 10 of 23
Figure 4. Graphs displaying the overall accuracy (a) and kappa coefficient (b) results for the SAR datasets comparison (image classification, part 1).

Table 3 summarizes the producer's and user's accuracies for the "water" class with RF between the datasets. The S1-MT dataset PA is 2.28% and 20.76% higher, and the UA is 2.61% and 21.57% higher than S1 and K5, respectively, in terms of the "water" class of the multi-class classification scheme. The S1-MT dataset also outperformed the S1 and K5 datasets with the binary-water classification scheme at 1.58% and 11.05% higher PA, and 0.33% and 9.49% higher UA, for S1 and K5, respectively.
Table 3. Producer's (PA) and user's accuracies (UA) of the "water" class for RF classification on the S1, K5, and S1-MT datasets.
The average error of omission and commission values summarized in Table 4 confirm why the multi-class classification scheme performed poorly compared to the binary-water classification scheme. For the S1-MT dataset, 26.40% of reference sites were left out (omitted) and 26.26% of the objects were incorrectly classified (committed) for the multi-class classification scheme, compared to a 9.07% omission error and 9.00% commission error for the binary-water classification scheme. The individual confusion matrices for all three datasets and both classification schemes are listed in Appendix A.

Machine Learning Classifier Comparison
For the machine learning classifier comparison, RF outperformed all the classifiers, with an average overall accuracy of 73.60% and 90.04% for the multi-class classification scheme and binary-water classification scheme, respectively (Figure 5). The second-best performing classifier was C-SVM with an average overall accuracy of 70.15% and 89.17% for the multi-class classification scheme and binary-water classification scheme, respectively. SVM and C-SVM performed very similarly, and DT and KNN yielded the lowest overall accuracy and kappa coefficients. RF was significantly better than the other classifiers, according to the area under the curve method. Random forest showed the highest probability score at 0.87, whereas decision trees had the lowest probability score of 0.12. The probability results are displayed in Table 5.
Table 6 summarizes the producer's and user's accuracies of the "water" class classification on S1-MT of DT, RF, KNN, SVM, and C-SVM. RF had the highest PA and UA results, except for the UA of the multi-class classification scheme, where SVM was highest. The RF classifier is 3.9% higher in PA than the second-best result of C-SVM for the multi-class classification scheme, and 10.85% higher in PA than DT, which gave the lowest result. For the binary-water classification scheme, RF outperformed C-SVM by 1.67% in PA, and DT again gave the lowest result, 6.44% lower than the PA of RF. The UA in the multi-class classification scheme was the highest at 81.87% for SVM, 0.65% better than the UA of RF; the UA of SVM is also 10.95% higher than the UA of KNN, which gave the lowest result. The UA in the binary-water classification scheme was the highest for RF at 89.29%, with a 0.25% difference from SVM and 3.35% difference from KNN, again the lowest result.
Table 6. Producer's (PA) and user's accuracies (UA) of the "water" class classification on S1-MT of DT, RF, KNN, SVM, and C-SVM.
Figure 6 visually shows the RF classification results on the multi-temporally filtered S1 dataset (S1-MT). Figure 6a shows the multi-class land cover classification map, where blue is water, green is vegetation, orange is bare ground and grey is built-up area. The image indicates that built-up areas and bare ground are incorrectly classified in the vegetation and river areas. The confusion matrix confirms high errors of commission for bare ground and built-up area at 38.88% and 18.50%, respectively. The binary-water land cover classification map (Figure 6c) clearly shows the extent of water pools along the river, stretching into the vegetated areas. The reduction of classes from four to two led to more accurate results, albeit at the cost of class granularity.

For the SAR dataset comparison (image classification part 1), the multi-class classification scheme yielded unreliable results, with an average kappa coefficient of 0.54 across all datasets and classification algorithms, as shown in Table 7. The binary-water classification scheme gave reliable results with an average kappa coefficient of 0.75. With the multi-class classification scheme, confusion between classes took place, as can be seen in Table 7, whereas the binary-water classification could clearly distinguish between the water objects and the non-water objects. The confusion matrices of all results are given in Appendix A, Tables A1-A14.
Figure 6. Example of the visual RF results of S1-MT where: (a) is the multi-class classification, (b) the intensity of S1-MT, (c) the binary water classification, and (d) the map showing the extent of the images in (a-c). Blue is water, green is vegetation, orange is bare ground and grey is built-up area.

Discussion
Comparing the classification results of the S1 sensor with those from K5 makes the effect of the difference in wavelength evident. The X-band K5 imagery yielded significantly lower accuracies than the C-band S1 imagery. This may be due to the shorter wavelength of K5, which is less sensitive to differences in roughness between the land cover classes. To test this, an additional classification was performed with just the VH bands of S1 and S1-MT. The results showed that the overall accuracy and kappa coefficient results for Sentinel-1 VH are lower than when using both polarizations, but they are still higher than the K5 results. There is also a bigger difference in overall accuracy and kappa coefficient for the multi-class classification scheme than for the binary-water classification scheme. This further supports the notion that the C-band is better suited for this application than the X-band, even at a lower resolution.
The radiometric calibration of K5 may also influence the weaker results. Comparing multi-temporal filtering with single-date filtering showed that S1-MT performed more accurately, with margins of less than 3% in producer's accuracy for the multi-class classification and margins of less than 2% in producer's accuracy for the binary-water classification. Multi-temporal filtering reduces the amount of speckle, but the difference in UA and PA between S1 and S1-MT is negligible. This suggests that the time, data, and computation cost of performing multi-temporal filtering might not be worthwhile for this application.
Confusion exists within the multi-class classification scheme, especially between water and bare ground. This is evident from the high omission and commission errors produced by this classification scheme. With the S1 imagery, the water class and the bare ground class are mixed at the ASM sites around the river because the S1 imagery has a lower resolution than the K5 imagery. The binary-water classification scheme therefore obtained significantly higher accuracies, because this source of confusion is eliminated. This is confirmed by the low omission and commission errors of the binary-water classification scheme. The implication is that while small-scale mining operations cannot be directly detected using this approach, the scattered water pools associated with these operations can be detected to a fair degree of accuracy.
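The relationship between producer's accuracy (PA), user's accuracy (UA), and the omission and commission errors discussed above can be sketched as follows. This is an illustrative helper only: the function name, class labels, and counts are hypothetical, and the matrix is assumed to hold reference classes in rows and predicted classes in columns.

```python
def per_class_accuracy(cm, labels):
    """Return PA, UA, omission error and commission error for each class.

    Assumes rows hold reference (ground-truth) counts and columns hold
    predicted counts; swap the roles if the matrix is laid out the other way.
    """
    n = len(cm)
    report = {}
    for k in range(n):
        reference_total = sum(cm[k])                        # row sum
        predicted_total = sum(cm[i][k] for i in range(n))   # column sum
        pa = cm[k][k] / reference_total if reference_total else float("nan")
        ua = cm[k][k] / predicted_total if predicted_total else float("nan")
        report[labels[k]] = {
            "producers_accuracy": pa,
            "users_accuracy": ua,
            "omission_error": 1 - pa,      # reference samples that were missed
            "commission_error": 1 - ua,    # predictions that were wrong
        }
    return report


# Hypothetical counts, not taken from the appendix tables.
labels = ["water", "bare ground", "vegetation"]
cm = [[30, 15, 5],
      [12, 28, 10],
      [3, 7, 40]]
report = per_class_accuracy(cm, labels)
for name, scores in report.items():
    print(name, {key: round(value, 2) for key, value in scores.items()})
```

In this made-up example, many water samples are labeled as bare ground and vice versa, so both classes show high omission and commission errors, which is the pattern described for the multi-class scheme.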
The same trend as in the SAR dataset comparison occurs in the machine learning comparison, where the binary-water classification scheme outperforms the multi-class classification scheme. The difference between the highest-performing dataset and lowest-performing dataset for the multi-class classification scheme is around 21%, whereas the difference for the binary-water classification scheme is around 10%. The difference between the highest-performing and lowest-performing classifier for the multi-class classification scheme is less than 11%, and the difference for the binary-water classification scheme is less than 7%. The high variance in the multi-class classification scheme results is due to the misclassification of classes that took place during this classification process.
For the machine learning classifier comparison, the random forest classifier was significantly more accurate than the other classifiers, with a probability of 0.87 at a 95% confidence level. Random forest is the most robust of the machine learning algorithms and deals very well with high dimensionality and complex data. C-SVM and SVM also achieved high overall accuracies, with C-SVM outperforming SVM. This is likely because the data structure was of a higher dimensionality, where the C-SVM kernel could find a better fit for the hyperplane to group the data. The decision trees classifier gave the least accurate as well as the least reliable results. This is because of how the trees are split in the algorithm and not iterated through, unlike in the random forest method. KNN is the simplest of the algorithms and performed slightly better than decision trees. KNN is not robust to high-dimensional and complex data.
In Ghana, the mapping of illegal mining using remote sensing has only been attempted in one study [15], where they used multi-temporal optical imagery. They were able to detect the galamsey sites with change detection in Ghana but recommended the use of SAR imagery because of the interference of cloud cover. Bangira et al. [45] compared different machine learning classifiers where they mapped different types of water bodies; the SVM classifier outperformed the RF classifier, with an average overall accuracy of 91.7% compared to 79.5%. Their study incorporated SAR imagery and optical imagery, as well as indices, into the training of the classifiers. The results from the classification conducted in this study showed that RF outperformed SVM. A likely reason that SVM performed better than RF in Bangira et al.'s [45] study is that the indices and optical imagery gave more information on the water body types, meaning that it was simpler for the SVM classifier to classify the data. RF is prone to overclassifying, and this may have been the case in their study. Bangira et al. [45] also assessed the Otsu threshold methods and made the point that for water body mapping, using thresholds is simpler and yields accurate enough results for classification. From studies where floods were mapped with SAR imagery (e.g., [41,89-91]), threshold methods were used, therefore suggesting that mapping SSM with thresholds is a simpler and more effective option.
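To illustrate the kind of threshold approach referred to above, the sketch below implements Otsu's method on a list of backscatter values and labels everything below the threshold as water. It was not part of the workflow tested in this study; the function name, the bin count, and the example values in dB are assumptions chosen purely for illustration.

```python
def otsu_threshold(values, bins=64):
    """Return the threshold that maximizes between-class variance (Otsu's method)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)  # clamp the maximum into the last bin
        hist[idx] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]

    total = len(values)
    total_sum = sum(h * c for h, c in zip(hist, centers))
    best_threshold, best_variance = lo, -1.0
    weight_low, sum_low = 0, 0.0
    for i in range(bins - 1):
        weight_low += hist[i]
        sum_low += hist[i] * centers[i]
        weight_high = total - weight_low
        if weight_low == 0 or weight_high == 0:
            continue
        mean_low = sum_low / weight_low
        mean_high = (total_sum - sum_low) / weight_high
        # Between-class variance, up to a constant factor of 1 / total**2.
        variance = weight_low * weight_high * (mean_low - mean_high) ** 2
        if variance > best_variance:
            best_variance = variance
            best_threshold = lo + (i + 1) * width  # upper edge of bin i
    return best_threshold


# Hypothetical backscatter values in dB: dark (water-like) and brighter (land-like) objects.
backscatter_db = [-24.0, -23.1, -22.5, -21.8, -23.6, -12.2, -11.0, -10.4, -13.1, -9.8]
threshold = otsu_threshold(backscatter_db)
water_mask = [v < threshold for v in backscatter_db]
print(threshold, water_mask)
```

A threshold like this needs no training samples, which is the simplicity argument made above, but it presumes a roughly bimodal histogram; smooth non-water surfaces with low backscatter would fall on the water side of the threshold.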
The lower accuracies obtained from S1 using only the VH polarization, when compared to using both VH and VV, indicate that the combination of these polarizations increases the discriminatory power of the machine learning algorithms. Since the VH-only S1 classification outperformed the VH-only K5 classification, it can also be argued that the difference in accuracy obtained between these two sensors is largely due to the difference in wavelength and not due to the increased dimensionality offered by S1's dual polarization. In a forested environment such as this, the C-band is therefore preferable to the X-band for mapping SSM.

Conclusions
The C-band Sentinel-1 image outperformed the X-band KompSAT-5 image in accuracy and reliability. The KompSAT-5 image performance was substandard for both classification schemes, with an OA of less than 60% and a kappa coefficient of less than 0.6. Both SAR sensors could only classify the water bodies associated with SSM. Both single-product speckle-filtered SAR and multi-temporally filtered SAR produced very accurate and reliable results. However, the multi-temporally filtered dataset had a smaller range in the difference between the results of the multi-class classification scheme and the binary-water classification scheme. For this reason, multi-temporal filtering improves the overall reliability of the classification results. The comparison of the machine learning classification algorithms showed that RF is the best-suited algorithm to map SSM with SAR imagery, with SVM a close second. The influence of the custom kernel for SVM was insignificant and can therefore be deemed unnecessary. The DT and KNN algorithms had the poorest performance and are not suitable for SSM mapping with SAR.
The classification methods showed that a multi-class classification approach for using SAR to map illegal mining in forested areas does not perform well. The binary-water classification scheme gave highly accurate and reliable results, especially with the multi-temporally filtered Sentinel-1 imagery and the random forest classifier. The resulting classifications only mapped the water bodies in the image; the actual SSM activities were not mapped. This is, however, very useful because of the association of illegal alluvial mining with water bodies. This research has shown that C-band SAR imagery can be successfully used to detect illegal mining operations in tropical regions by means of the associated water bodies. SSM sites in these areas are remote, and an uninterrupted supply of remote sensing imagery is required to provide useful insights on the spatial distribution patterns of SSM activities. The methods tested in this study are a step toward achieving automated near-real-time monitoring of SSM in cloud-cover-prone areas. Freely available SAR imagery, such as from Sentinel-1, could serve as the cornerstone of an operational monitoring system designed to detect and monitor illegal mining operations.

Data Availability Statement: Publicly available datasets were analyzed in this study. These data can be found here: https://scihub.copernicus.eu/dhus/#/home (accessed on: 6 June 2018) and http://www.si-imaging.com/products/#1478507064219-34e51d03-67d9 (accessed on: 31 July 2018).
Acknowledgments: Thank you to Janine Cole and Patrick Cole at the CGS for their guidance and support, and SI-Imaging for providing the KompSAT-5 data. The authors also wish to thank the anonymous reviewers for their valuable inputs to the paper.

Conflicts of Interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Table A1. Random forest multi-class classification on an S1 confusion matrix.
Table A3. Random forest multi-class classification on an S1-MT confusion matrix.
Table A4. Decision trees multi-class classification on an S1-MT confusion matrix.
Table A6. SVM multi-class classification on an S1-MT confusion matrix.
Table A7. C-SVM multi-class classification on an S1-MT confusion matrix.
Table A8. Random forest binary-water classification on an S1 confusion matrix.