Article

A Comparative Assessment of Machine-Learning Techniques for Forest Degradation Caused by Selective Logging in an Amazon Region Using Multitemporal X-Band SAR Images

by Tahisa Neitzel Kuck 1,2,*, Edson Eyji Sano 2,3, Polyanna da Conceição Bispo 4, Elcio Hideiti Shiguemori 1, Paulo Fernando Ferreira Silva Filho 1 and Eraldo Aparecido Trondoli Matricardi 5

1 Command, Control, Communications, Computers, Intelligence, Surveillance and Reconnaissance Division, Institute for Advanced Studies (IEAv), São José dos Campos 12228-001, Brazil
2 Geoscience Institute, Universidade de Brasília (UnB), Brasília 70910-900, Brazil
3 Embrapa Cerrados, Planaltina 73310-970, Brazil
4 Department of Geography, School of Environment, Education and Development, University of Manchester, Manchester M13 9PL, UK
5 Forestry Department, Universidade de Brasília (UnB), Brasília 70919-970, Brazil
* Author to whom correspondence should be addressed.
Remote Sens. 2021, 13(17), 3341; https://doi.org/10.3390/rs13173341
Submission received: 12 July 2021 / Revised: 6 August 2021 / Accepted: 19 August 2021 / Published: 24 August 2021

Abstract

The near-real-time detection of selective logging in tropical forests is essential to support actions for reducing CO2 emissions and for monitoring timber extraction from forest concessions in tropical regions. Current operating systems rely on optical data that are constrained by the persistent cloud cover of tropical regions. Synthetic aperture radar data represent an alternative to this technical constraint. This study aimed to evaluate the performance of three machine learning algorithms applied to multitemporal pairs of COSMO-SkyMed images to detect timber exploitation in a forest concession located in the Jamari National Forest, Rondônia State, Brazilian Amazon. The studied algorithms included random forest (RF), AdaBoost (AB), and the multilayer perceptron artificial neural network (MLP-ANN). The geographical coordinates (latitude and longitude) of logged trees and the LiDAR point clouds acquired before and after selective logging were used as ground truth. The best results were obtained when the MLP-ANN was applied with 50 neurons in the hidden layer, using the ReLU activation function and the SGD weight optimizer, yielding 88% accuracy both for the pair of images used for training the network (images acquired in June and October) and in the generalization test applied to a second dataset (images acquired in January and June). This study showed that X-band SAR images processed with machine learning techniques can be used to accurately detect selective logging activities in the Brazilian Amazon.

Graphical Abstract

1. Introduction

Anthropogenic activities are responsible for the current global temperature increase of about 1.0 °C and for an expected increase of 1.5 °C sometime between 2030 and 2052. Potential impacts and risks associated with increasing temperature include elevation of the sea level, higher frequency and intensity of extreme temperatures, storms, and droughts, loss of biodiversity, reduction in oxygen concentration in the oceans, and shortage of food production [1]. A reduction in anthropogenic CO2 emissions is mandatory to control the global rise in temperature. Deforestation and forest degradation are the second largest anthropogenic sources of CO2 emissions into the atmosphere, since they are related to the combustion of forest biomass, as well as to the decomposition of remaining plant materials. Approximately 65% of Brazilian CO2 emissions in 2019 came from deforestation and forest degradation [2]. According to Qin et al. [3], forest degradation contributes three times more to the aboveground gross biomass loss than deforestation in the Brazilian Amazon. This is because the areal extent of degradation exceeds that of deforestation, indicating that forest degradation is the most important process driving carbon loss in this region [4].
Selective logging is an important economic activity in the Brazilian Amazon and often progresses into clear-cut deforestation, especially if exploitation is unsustainable [4,5,6]. Nevertheless, the establishment of environmentally sustainable timber extraction can assist the social and economic development of this region without causing irreversible environmental degradation [7]. However, the growing predatory exploitation, in addition to the reduced competitiveness of companies intending to operate legally, is having a negative impact on the forest [5,7].
Because selective logging targets a limited number of marketable tree species, causing point-wise and spatially diffuse forest degradation, remote sensing monitoring systems are, to some extent, limited in detecting the impacted forests [4,6,8]. On the other hand, field inspections to monitor those activities are unrealistic because of the large territorial extension of the Brazilian Amazon and because of security issues [9]. Thus, monitoring selective logging using remotely sensed data and geoprocessing techniques remains the best alternative approach.
Several authors have used optical remotely sensed data to monitor vegetation suppression in tropical forests [10,11,12,13,14], although the use of optical images is limited by persistent cloud-cover conditions, mainly in the wet season [15]. The use of synthetic aperture radar (SAR) data can overcome this limitation since SAR systems can operate in most weather conditions; the main drawback is the higher complexity of image processing and interpretation [16]. The analysis of the potential of X-band SAR data to detect forest degradation by selective logging in the Brazilian Amazon is still limited despite their relatively good availability from the COSMO-SkyMed [17], Iceye [18], and TanDEM-X/TerraSAR-X [19] twin satellites. Although long-wavelength L-band SAR data have been intensively used for biomass estimation [20,21,22], previous studies have shown that shorter-wavelength X- and C-band SAR data present good potential for this purpose [23,24,25] and for forest disturbance mapping using interferometric techniques [26,27], combination with L-band data [28], and texture attributes [29]. However, extracting information from SAR images is not a trivial task because of the presence of speckle, the sensor’s side-viewing geometry, and the different backscattering processes, demanding specific and complex approaches for data processing and interpretation [16].
One of the first steps of SAR image processing for forest change detection is the application of spatial filters to reduce speckle. Several authors have explored different methodological approaches to reduce speckle without losing information [30,31,32,33]. Change detection based on machine learning techniques applied in SAR images has shown promising results [34,35,36,37]. Examples of these techniques are random forest [38,39,40], AdaBoost [41,42], multilayer perceptron artificial neural network (MLP-ANN) [43,44], and convolutional neural network (CNN) [45,46,47,48].
This study aimed to compare the performance of the machine-learning-based random forest, AdaBoost, and MLP-ANN classification techniques in identifying selective logging activities in a forest concession site located in the Jamari National Forest, Rondônia State, Brazil. The study was based on the multitemporal, 3 m spatial resolution, HH-polarized SAR images acquired by the COSMO-SkyMed constellation.

2. Materials and Methods

2.1. Study Area

The study area is located in the Jamari National Forest, Rondônia State, in the southwestern portion of the Brazilian Legal Amazon (Figure 1). The typical vegetation observed in the study site is dense rainforest with patches of open rainforest dominated by palm trees and lianas [49]. The Jamari National Forest has an area of approximately 220,000 hectares, of which 96,000 hectares have been managed and used as forest concessions since 2008 [50]. Three companies were selected by the Brazilian Forest Service (SFB), under the Ministry of Agriculture, Livestock, and Food Supply, to implement forest management plans for timber and non-timber forest products (latex, fruits, and leaves) within previously established forest management units (UMF I, II, and III) in that National Forest. UMF III was divided into different annual production units (UPAs). UPA 11 was explored in 2018 and, therefore, was selected as the test site in this analysis.
The timber companies need to present forest inventory data to obtain authorization for selective logging in those UMFs, including the number of trees and the list of tree species occurring at the site. They also need to provide a tree species census of the concession area, including species identification, diameter at breast height (DBH), circumference at breast height (CBH), estimated volume, and LiDAR point clouds from aerial surveys (before and after exploitation).

2.2. SAR Data

This study was based on SAR images acquired by the COSMO-SkyMed constellation of four satellites, which allows revisit intervals of a few hours at varying incidence angles. The main payload of each COSMO-SkyMed satellite is a multi-resolution, multi-polarized X-band imaging radar with spatial resolution ranging from 1 m to 100 m and nominal incidence angles between 20° and 59°. Table 1 shows the details of the images selected for this study: three overpasses from 8 January 2018, 5 June 2018, and 8 October 2018, spanning the logging period in the study area, which took place from April to September of the same year.
The image preprocessing included the following steps:
  • Download of the complex images in H5 format, which stores the data in hierarchical data format (HDF), containing the sensor’s scan metadata;
  • Multi-look filtering, defined as one look in range and azimuth, which resulted in a grid cell of 3 m, representing the best spatial resolution of the StripMap image acquisition mode, and conversion from slant range to ground range;
  • Co-registration for correction of relative translational and rotational deviations and scale difference between images;
  • Application of the GammaMAP filter [51], with a 3 × 3 window and an ENL of 1.1, which was pointed out by [33] as presenting the best results for multitemporal analyses in tropical forests (a minimal code sketch of this filter follows the list);
  • Geocoding using the digital elevation model produced from the Phased Array type L-band Synthetic Aperture Radar (PALSAR) sensor and conversion to the backscatter coefficients (σ°, units in dB).
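As an illustration of the filtering step above, the sketch below implements a simplified Gamma MAP pass following the standard formulation of Lopes et al. [51], with the 3 × 3 window and ENL of 1.1 used in this study. It is a minimal approximation for a single intensity band, not the exact processing chain applied here; the edge handling and the heterogeneity threshold are assumptions.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def gamma_map_filter(img, window=3, enl=1.1):
    """Simplified Gamma MAP speckle filter (after Lopes et al. [51]).

    img: 2D float array with a single intensity band; window and enl mirror
    the 3 x 3 window and ENL = 1.1 used in this study. Edge handling and the
    heterogeneity threshold are simplifying assumptions.
    """
    local_mean = uniform_filter(img, window)
    local_sqr = uniform_filter(img ** 2, window)
    local_var = np.maximum(local_sqr - local_mean ** 2, 0.0)

    cu = 1.0 / np.sqrt(enl)                     # speckle coefficient of variation
    cmax = np.sqrt(2.0) * cu                    # heterogeneity threshold (assumed)
    ci = np.sqrt(local_var) / np.maximum(local_mean, 1e-12)

    alpha = (1.0 + cu ** 2) / np.maximum(ci ** 2 - cu ** 2, 1e-12)
    b = alpha - enl - 1.0
    # MAP estimate of the radar cross-section under a Gamma prior
    rhat = (b * local_mean +
            np.sqrt((b * local_mean) ** 2 + 4.0 * alpha * enl * local_mean * img)) / (2.0 * alpha)

    out = np.where(ci <= cu, local_mean, rhat)  # homogeneous areas -> local mean
    out = np.where(ci >= cmax, img, out)        # strong scatterers -> keep the pixel
    return out
```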

2.3. Processing of LiDAR Point Clouds

The LiDAR point clouds acquired in 2018 and 2019 over a portion of UPA 11 were processed to generate a digital surface model (DSM) from the first-return points, i.e., the pulses with the shortest time between emission and return, which correspond to the outermost surface of the forest canopy. This procedure involves the generation of a matrix in which each cell, with a resolution defined by the analyst (1 m × 1 m in this research), receives the value of the first pulse recorded within that cell. The scanning process was set up by the company that carried out the survey to acquire approximately 21 pulses per square meter of land surface with the airborne LiDAR Optech ALTM Gemini sensor. The ratio between these two DSMs was also computed to highlight areas with tree extraction; pixels representing selectively logged forests show higher values in this ratio. Subsequently, the pixels representing selectively logged forests were manually delineated on a computer screen, converted into vector-based shapefile format, and considered as ground truth (Figure 2).
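A minimal sketch of this gridding step is given below. The variable names and the rule of keeping the highest first-return elevation per 1 m cell are assumptions for illustration, since the exact gridding software used by the survey company is not specified.

```python
import numpy as np

def first_return_dsm(x, y, z, cell=1.0):
    """Grid first-return LiDAR points into a 1 m DSM (illustrative sketch).

    x, y, z: 1D arrays with the planimetric coordinates and elevations of
    the first returns; each cell keeps the highest first-return elevation
    that falls inside it (an assumed gridding rule).
    """
    cols = np.floor((x - x.min()) / cell).astype(int)
    rows = np.floor((y.max() - y) / cell).astype(int)
    dsm = np.full((rows.max() + 1, cols.max() + 1), -np.inf)
    np.maximum.at(dsm, (rows, cols), z)   # tallest return per cell
    dsm[np.isinf(dsm)] = np.nan           # cells without returns
    return dsm

# Ratio between the pre- and post-logging DSMs; removed crowns lower the
# canopy surface, so selectively logged pixels show values above 1 in
# first_return_dsm(x18, y18, z18) / first_return_dsm(x19, y19, z19),
# where x18 ... z19 are hypothetical arrays for the 2018 and 2019 surveys.
```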

2.4. SAR Attribute Extraction

The first step of SAR attribute extraction consisted of generating the coefficient of variation (CV) between two SAR images; in this case, the images from 5 June 2018 and 8 October 2018 were chosen as Time 1 (T1) and Time 2 (T2), respectively (a minimal code sketch of this change layer is given at the end of this subsection). The CV has been used to detect changes in SAR images as an alternative to the normalized ratio, with good results [52,53,54]. The CV assumes real values within the 0–1 range, where 0 means no change. Several attempts were made to define the threshold separating selective logging from undisturbed forest; the threshold was chosen so that omission errors were close to zero, even at the cost of commission errors between selectively logged and undisturbed forests. This threshold was applied to the CV image, resulting in a binary image where 0 represents undisturbed forest and 1 represents logged forest. Pixels with a value of 1 were then converted into polygons. For each polygon, the following spatial attributes were calculated: area, perimeter, circularity, and shape factor [55]. A kernel density map was also generated from those polygons, using the area as the weight to define the centroids, so that dense clusters of polygons representing isolated pixels received smaller weights than groups of larger polygons, which are more characteristic of selective logging. For each polygon, several zonal statistics were generated (count, sum, mean, median, standard deviation, minimum, maximum, range, minority, majority, and variance [56]) for the following matrices:
  • Maximum ratio between T1 and T2 images;
  • CV between T1 and T2 images;
  • Minimum values between T1 and T2 images;
  • Gradient between T1 and T2 images;
  • 5 × 5 window average [57,58] of the CV image (item 2);
  • 5 × 5 window variance of the CV image (item 2);
  • 5 × 5 window homogeneity of the CV image (item 2);
  • 5 × 5 window contrast of the CV image (item 2);
  • 5 × 5 window dissimilarity of the CV image (item 2);
  • 5 × 5 window entropy of the CV image (item 2);
  • 5 × 5 window second moment of the CV image (item 2);
  • 5 × 5 window correlation of the CV image (item 2);
  • Kernel of polygons generated by thresholding the CV.
Each polygon received 154 spectral, textural, and spatial attributes. With the help of the selective logging samples obtained through the steps described in Section 2.3, each polygon was labeled as timber extraction or no-timber extraction. This label is essential for the machine learning process, since it is used by the algorithms to learn which attributes describe each class and to define their boundaries.
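The sketch below illustrates the CV change layer and the thresholding step mentioned above, assuming two co-registered, speckle-filtered backscatter images in linear scale; the variable names sigma0_t1 and sigma0_t2 are hypothetical.

```python
import numpy as np

def coefficient_of_variation(t1, t2):
    """Pixel-wise CV (standard deviation / mean) between two co-registered
    backscatter images, assumed here to be in linear scale; stable pixels
    give values near 0, changed pixels give larger values (at most 1 for a
    pair of non-negative intensities)."""
    stack = np.stack([t1, t2], axis=0)
    return stack.std(axis=0) / np.maximum(stack.mean(axis=0), 1e-12)

# cv = coefficient_of_variation(sigma0_t1, sigma0_t2)   # hypothetical inputs
# change_mask = cv > 0.4   # threshold adopted in this study (Section 3.1)
```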

2.5. Classification Tests through Machine Learning

Orange data mining [59], a platform for data analysis and visualization based on visual programming, with the possibility of implementation using Python 3 libraries, was used for testing the machine learning algorithms. Initially, to evaluate the predictive performance of the machine learning models [60], the dataset was randomly divided into training (70%) and validation (30%) sets. The split was fixed so that subsequent tests used the same dataset, making the results comparable with each other, and stratified to mimic the class composition of the input dataset (this split is reproduced in the random forest sketch below). The attributes were used as input to the supervised machine learning algorithms. The first test was conducted with the random forest classifier, which offers some advantages over most statistical modeling methods, including the ability to model nonlinear relationships, resistance to overfitting, and relative robustness to noise in the data [39]. Random forest is an ensemble classifier that uses several classification and regression trees to perform a prediction [61] (Figure 3).
Each decision tree is produced independently, and each node is split using a randomly selected, user-defined number of attributes (Mtry). By growing the forest to a user-defined number of trees (Ntree), the algorithm creates trees with high variance and low bias. The final decision can be made using different strategies, including the average class-assignment probability across all produced trees. A new unlabeled data entry is evaluated against all decision trees created in the ensemble, and each tree votes for a membership class; the class with the highest number of votes is selected.
Studies applying random forest to SAR data indicated that increasing Ntree beyond 70 brings no further gain in classification results [62,63]. The Mtry default is defined as the square root of the number of input attributes. In this study, the random forest classifier was analyzed considering several parameters, as defined in Table 2.
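For illustration, the sketch below reproduces the fixed, stratified 70/30 split described at the beginning of this subsection and a comparable random forest setup in scikit-learn (the study itself used the Orange toolbox). The placeholder arrays only mimic the shape of the attribute table; Ntree = 70 and the default Mtry follow the values discussed above, while the remaining settings are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Placeholder data standing in for the 2974 polygons x 154 attributes and
# their timber / no-timber extraction labels (shapes only, for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(2974, 154))
y = rng.integers(0, 2, size=2974)

# Fixed, stratified 70/30 split so that every classifier is compared on the
# same training and validation samples.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

# Ntree = 70 trees; Mtry left at its default, the square root of the number
# of input attributes.
rf = RandomForestClassifier(n_estimators=70, max_features="sqrt", random_state=42)
rf.fit(X_train, y_train)
print("validation accuracy:", rf.score(X_val, y_val))
```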
Boosted decision trees (DTs) (Figure 4) are also ensemble methods that rely on DTs; as the models are built, they are adjusted in an attempt to minimize the errors of the previous trees [64]. One type of boosted DT is Adaptive Boosting, or AdaBoost, which was also evaluated in this study. This method comprises three components: weak learners (individually poor predictors), a loss function that penalizes incorrect classifications, and an additive model that combines the individual weak learners in such a way that the loss function is minimized. The algorithm considers all trees, since the additive model is designed so that the combination of all trees provides an optimal solution. Tests with the AdaBoost classifier were performed considering different numbers of estimators (Table 3); the weak learners used by the classifier were DTs.
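A minimal AdaBoost counterpart in scikit-learn, reusing the split from the random forest sketch above, is shown below; the default weak learners are decision-tree stumps, and the number of estimators shown is illustrative rather than one of the exact values of Table 3.

```python
from sklearn.ensemble import AdaBoostClassifier

# AdaBoost with its default decision-tree stumps as weak learners; the
# number of estimators is the parameter varied in Table 3 (value shown is
# illustrative). Reuses X_train, y_train, X_val, y_val from the sketch above.
ab = AdaBoostClassifier(n_estimators=50, random_state=42)
ab.fit(X_train, y_train)
print("validation accuracy:", ab.score(X_val, y_val))
```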
Artificial neural networks (ANNs) have been used in various applications [65], including remote sensing [66]. The basic elements of an ANN are neurons, equivalent to biological axons, which are organized in layers (Figure 5).
An ANN has input and output layers, with, for example, one neuron for each input variable and one neuron for each output class. ANNs typically also have hidden neurons organized into one or more additional layers. In some models, all neurons in a layer are connected to all neurons in the adjacent layers [67]; these connections are fully connected and weighted. The weights on the connections, in combination with the typically nonlinear activation function, modify the values in each neuron and determine how the input values are mapped to values in the output nodes. The capacity to describe very complex decision boundaries can be improved by increasing the number of neurons in the hidden layer and, especially, by adding more hidden layers. Neural networks are normally trained by assigning initial random values to the weights, which are then iteratively adjusted while observing the effect on the output nodes [67].
In general, the challenges in using ANNs are as follows: the training stage can be slow and laborious; there can be network overfitting; there is a need for several parameters to be specified by the user [64]. Some advantages of neural approaches in remote sensing are as follows: more accurate performance than other techniques such as statistical methods, particularly when the feature space is complex and the source data have different statistical distributions; more rapid performance than other techniques such as statistical classifiers; incorporation of a priori knowledge and realistic physical constraints into the analysis; incorporation of different types of data (including those from different sensors) [68].
There are several architectures of ANNs applicable to problems in remote sensing [34]. The architecture of the ANN tested in this work was a feedforward network with multiple layers, called multilayer perceptron (MLP). Feedforward ANNs are the most commonly applied, where the flow is always toward the output layer. When there is more than one hidden layer, these networks are called deep neural networks or deep learning [69].
With the aim of finding the best ANN topology for the data considered in this work, several hyperparameters were tested (Table 4), varying the number of neurons in each hidden layer, the number of hidden layers, the activation function (ReLU and hyperbolic tangent (tanh)), the weight optimizer (L-BFGS-B [70], SGD [71], and ADAM [72]), the maximum number of iterations, and the stopping criteria. The tests from NN61 to NN70 repeated the five topologies from NN1 to NN60 that showed the best results, changing the activation function to hyperbolic tangent and increasing the maximum number of iterations to 2000.
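The sketch below reproduces, in scikit-learn, the single-hidden-layer configuration later reported as the best topology (50 neurons, ReLU activation, SGD weight optimizer, 1000 iterations), again reusing the split from the earlier sketch; it is an approximation of the Orange setup, not the exact implementation used in the tests.

```python
from sklearn.neural_network import MLPClassifier

# Single hidden layer with 50 neurons, ReLU activation and the SGD weight
# optimizer with up to 1000 iterations, the configuration reported as best
# (NN26); other rows of Table 4 would change these hyperparameters.
mlp = MLPClassifier(hidden_layer_sizes=(50,), activation="relu",
                    solver="sgd", max_iter=1000, random_state=42)
mlp.fit(X_train, y_train)
print("validation accuracy:", mlp.score(X_val, y_val))
```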
Test results were evaluated by applying the trained network to the validation set and calculating the following measures [73] (a short scikit-learn-based sketch follows the list):
  • Area under receiver operator curve (AUC): an AUC of 0.5 suggests no discrimination between classes; 0.7 to 0.8 is considered acceptable; 0.8 to 0.9 is considered excellent; more than 0.9 is considered exceptional.
  • Accuracy: proportion of correctly classified samples.
  • F1: weighted harmonic mean of precision and recall.
  • Precision: proportion of true positives among the samples classified as positive, for example, the proportion of detections correctly classified as selectively logged forest.
  • Recall: proportion of true positives among all positive instances in the data.
  • Training time (s).
  • Test time (s).
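These measures can be computed with scikit-learn as sketched below, applying the fitted MLP from the previous sketch to the validation set; training and test times, reported separately in the tables, would be measured around the fit and predict calls.

```python
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)

# Evaluation of the fitted MLP from the previous sketch on the validation
# set, using the measures listed above.
y_pred = mlp.predict(X_val)
y_prob = mlp.predict_proba(X_val)[:, 1]   # class-1 scores for the AUC
print("AUC      :", roc_auc_score(y_val, y_prob))
print("Accuracy :", accuracy_score(y_val, y_pred))
print("F1       :", f1_score(y_val, y_pred))
print("Precision:", precision_score(y_val, y_pred))
print("Recall   :", recall_score(y_val, y_pred))
```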
In order to evaluate the results obtained with each classifier and topology, the confusion matrices of the best results were compared, indicating the proportion of error in each class separately. To test the generalizability beyond the training set, the five topologies that presented the best classification results were applied to the complete COSMO-SkyMed scene, including the area that exceeds the limits of UPA 11 (Figure 1), and the results were validated by comparison with the records of extracted trees provided by the SFB. The generalization capability was also tested by applying the trained network to detect the selective logging that occurred between the COSMO-SkyMed acquisitions of 8 January 2018 and 5 June 2018.

3. Results

3.1. Exploratory Attribute Analysis

A threshold of 0.4 was obtained for the CV image computed from the June and October 2018 overpasses of the COSMO-SkyMed satellite, encompassing both changes (nonstationary processes) and speckle (stationary process) [53]. This procedure covered all pixels classified as timber extraction in the LiDAR images, plus 70% commission errors (mainly speckle). Among the 2974 polygons generated, 870 belonged to the timber extraction class and 2104 to the no-timber extraction class. By overlaying the samples selected as timber extraction in the LiDAR images, the coordinates of extracted trees provided by the SFB, and the polygons generated by thresholding the CV, we found 849 trees extracted in UPA 11. The SAR images were able to identify 96% of these trees, while the remaining 4% were related to extractions with little change in the canopy structure and, therefore, were not detected in the X-band SAR data. This limited ability to detect selective logging through canopy gaps was previously reported by [6,7].
The exploratory analysis of the attributes showed redundancy and overlapping of several attributes. Figure 6 exemplifies, through boxplots of four attributes randomly selected from the set, that the data are linearly inseparable, which points out the need to use classifiers capable of dealing with these situations [38,44], as proposed in this work. The training set resulted in a total of 2082 samples stratified into timber extraction and no-timber extraction. The samples, although randomly selected, were replicable; that is, all tests used the same set to avoid differences in results arising from different sample selections.

3.2. Tests with the Random Forest Classifier

Table 5 shows the results obtained from the variation of the parameters and topology of the random forest algorithm. All topologies based on random forest presented good results, since the accuracy and the precision obtained were close to 95% and the AUC values were high (near 0.95).
A common problem in machine learning is comparing classification results (topologies) applied to the same dataset. Corani and Benavoli [74] presented a Bayesian approach for the comparative statistical inference of two competing algorithms evaluated through cross-validation. The approach has two parts: first, a correlated Bayesian t-test for analyzing the cross-validation results on a single dataset, which accounts for the correlation due to the overlapping training sets; second, the merging of the posterior probabilities computed by the correlated Bayesian t-test on the different datasets, in order to make inferences over several datasets (or subsets) by adopting a Poisson binomial model. Inferences across multiple datasets account for the different uncertainties in the cross-validation results over the different datasets. Applying this test to the k-fold cross-validation with k = 10 showed that all random forest topologies were equivalent, with a probability of 0.03.
Figure 7 shows the confusion matrices, with the accuracy, omission, and commission errors of the nine selected random forest topologies. RF3 showed the best precision and accuracy, with accuracy rates of 87% and 97.2% for the selectively logged forest and undisturbed forest classes, respectively. The shortest training and validation times were obtained with RF1 and RF2, although RF3 presented very similar times.
Yu et al. [40] previously highlighted random forest as a good classifier of SAR images combined with optical images for forest type classification. Ghosh and Behera [38] reported good results with the same method applied to biomass estimation by combining optical and SAR images acquired by Sentinel satellites. Shiraishi et al. [75] compared several machine learning methods for land-use classification with SAR images, and the best results were also obtained with the random forest technique.

3.3. Tests with the AdaBoost Classifier

Table 6 shows the results obtained from the variation of the parameters and topology of the AdaBoost classifier. Varying the topology in terms of the number of estimators did not result in significant changes in any of the quality measures, with all topologies being considered equivalent.
The confusion matrices (Figure 8) showed that the timber extraction class had classification accuracy close to 80% in all topologies, while the undisturbed forest class varied around 94%, without significant gain with an increase in the number of estimators. AB2, AB3, and AB4 topologies had a higher accuracy rate for the selectively logged forest class, although slightly higher accuracy and precision values were obtained by AB5 and AB6.
In general, the AdaBoost classifier performed worse than the random forest. The same conclusion was reported by Shiraishi et al. [75], who evaluated different methodologies applied to SAR for classification of land use and land cover in a region of tropical forests.

3.4. Tests with MLP-ANN

The topologies used in the NN61 to NN70 tests, which replicated the topologies of the five best results obtained in the tests from NN1 to NN60, were NN23, NN26, NN31, NN22, and NN21, respectively. Table 7 shows the 10 best results obtained from the variation of the parameters and topology of the NNs, in relation to accuracy and precision.
Confusion matrices showing the accuracy and the omission and commission errors (as percentages) of the best results obtained with the ANNs are presented in Figure 9. The highest accuracy and precision were obtained with the NN26 topology, with the ReLU activation function, extensively applied in neural networks for image classification [76], and the SGD weight optimizer, contradicting the results obtained by [77] (although, for the tree class, the results obtained by [77] were very close for all optimizers), and showing no difference with an increase in the number of iterations. This topology resulted in 89% and 96.4% accuracy for the selectively logged and undisturbed forest classes, respectively. Its test time was also the lowest (0.161 s), and its training time (9456 s) was significantly lower than those of the NN29, NN32, and NN37 topologies.

3.5. Comparative Assessment of Machine Learning Techniques

All tests performed showed good results for detecting selective logging, as previously shown in land use and land cover classification studies [43]. This indicates that X-band SAR images, despite their shallow interaction with the canopy, have potential in relation to other SAR bands and other methodologies presented in the literature [25,27] when associated with machine learning techniques, especially given the high revisit capacity of the COSMO-SkyMed constellation and the need for only one multitemporal pair of images.
Among the different architectures tested on the set of attributes generated from the images before and after exploitation, the best result was obtained by the NN26 topology, which presented a precision and accuracy of 0.961, correctly classifying 89% of the validation samples of the selectively logged class and 96.4% of the undisturbed forest class. Despite exhaustive tests varying the parameters of the RF and neural network topologies, the results did not improve beyond those presented. This fact indicated the need for an analysis of the NN26 topology errors.
Comparing the boxplots of features correctly classified as timber extraction with those misclassified as no-timber extraction, it was observed that the center, amplitude, and symmetry of these features are different (Figure 10, which exemplifies some randomly selected attributes and presents the t-test, rejecting the null hypothesis and demonstrating that the means are statistically different). Likewise, comparing the features correctly classified as undisturbed forest with those misclassified as selectively logged forest, asymmetric data were found. In summary, the erroneously classified features are outliers, with attribute values differing from the rest of the samples.
Analyzing the LiDAR data available for the study area, it was observed that some of the extracted trees reported by the companies (11% of the trees cut) were not delimited in the CV thresholding stage and, therefore, were not included in the classification into timber extraction or no-timber extraction. These are trees with small crowns (between 6 m and 10 m in diameter) and corymbiform architecture [78] that are not emergent in relation to the neighboring crowns (Figure 11).
According to Locks [7], who identified canopy damage caused by selective logging based on LiDAR data, about 93.3% of felled trees result in canopy damage; that is, approximately 7% of the extractions remain unidentified in SAR images whose signal interacts mainly at the canopy surface, as is the case of the X-band.

3.6. Generalization Test

Generalization capability refers to the ability of the network to correctly classify data not used for its training [67]. In this work, the generalization ability of NN26 was tested; the first step consisted of thresholding, at a value of 0.4, the CV images generated between the COSMO-SkyMed overpasses of June–October and January–June 2018. This procedure resulted in 186,886 polygons for the June–October images and 224,250 polygons for the January–June images (Figure 12).
Figure 12 also presents the results of the NN26 classification for the extraction class. The data were validated by selecting all polygons contained in the area for which the forest inventory provided by the SFB is available, which includes the geographic positions of the cut trees, and comparing the classification result with the inventory data. For the result obtained between the COSMO-SkyMed overpasses of 5 June 2018 and 8 October 2018, on which the topologies were trained, the correctly classified selectively logged and undisturbed forests represented 89% and 95%, respectively. The overall accuracy was 91%.
An analysis of the forest inventory data showed that 11% of the trees reported as cut were unidentified and did not even participate in the classification process because they were not delimited in the first stage (CV thresholding). The evaluation of these undetected trees showed that 36% represent logging that did not result in canopy clearance and, therefore, could not be identified by high-frequency SAR sensors (Figure 13); those features were also unidentified in the digital surface model generated from the first LiDAR return. The remaining 64% are trees with small crowns or trees shaded by neighboring crowns located between them and the sensor in the imaging direction. According to [16], the detection of changes in the shadowing effect depends on factors such as polarization, incidence angle, and characteristics of the study area. In this study, three-dimensional features involving the target, its surroundings, and the sensor characteristics made the change detection partially difficult.
For the January–June COSMO-SkyMed images, correctly classified timber extractions and no-timber extractions represented 88% and 92%, respectively. The overall accuracy was 89%. The January image was noisy (Figure 14), changing the backscatter coefficient values and, consequently, generating high CV values between the images and a high incidence of false positives for the timber extraction class. It is therefore necessary to watch for occasional noise, which may result from atmospheric effects frequently present in X-band data [79] or from sensor problems, and which may degrade the quality of the results.
Although the machine learning algorithms were trained using data collected in areas of legal forest concession, the generalization test also highlighted logging in areas outside the studied National Forest. Overlaying the logging activities detected between June and October with the polygons identified as illegal logging in the same period by the Selective Logging Detection System (DETEX) [80], produced through a partnership between the SFB and the National Institute for Space Research (INPE), showed good agreement between our approach and that of INPE (Figure 15), demonstrating that our methodology may be useful for detecting illegal logging in the Brazilian Amazon. The advantage of the proposed method is that, because it is based on SAR data, it allows monitoring even under cloud cover; moreover, given the correlation between canopy gaps and log volume [7], it allows the intensity of logging to be estimated.
Deutscher et al. [54] reported that X-band SAR data from the TerraSAR-X sensor are able to identify disturbances in tropical forests larger than 0.5 ha with an accuracy of 98%. This research showed that X-band SAR data are also promising for the detection of smaller scars, such as those from selective logging, and that, in conjunction with machine learning techniques, they present high generalization capacity, resulting in accuracy greater than 85% for the extraction class.

4. Discussion

Our study is the first to investigate the potential of X-band SAR images acquired in StripMap mode for monitoring selective logging in tropical forests. Our results indicate good potential for this dataset, as logging causes changes in the backscattering processes because of changes in the structure of the forest canopy. However, about 7% of the extracted trees could not be identified because of the insufficient SAR spatial resolution and the arrangement of the extracted tree crowns in relation to their neighbors, as observed by Locks [7].
The large amount of data available for the Jamari National Forest (LiDAR images and the forest inventory) allowed the identification of the forest scars caused by selective logging, which was essential for training the machine learning algorithms tested in this study. Multitemporal changes in shading and illumination in the X-band SAR images, whose transmitted signals mostly interact with the top of the canopy, could be detected and delimited by defining a threshold on the coefficient of variation. These changes in shading and illumination were also reported by Bouvet et al. [16] as potential criteria for deforestation detection and mapping. The threshold based on the coefficient of variation, as reported by Koeniguer and Nicolas [53], is suitable for detecting changes because of its simple formulation and remarkable statistical properties.
Koeniguer and Nicolas [53] presented the first theoretical study demonstrating that the coefficient of variation is relevant for detecting changes even in areas with speckle, and that it has different statistical properties for at least three categories of temporal profiles: a permanent scatterer, a stable natural speckle area, and a nonstationary area that is generally interpreted as a change. In this study, forest areas without anthropogenic interference were considered natural areas with stable speckle, and the selective logging corresponded to changes. A threshold of 0.4 enabled the delimitation of the changed features (selective logging) together with some other changes related to speckle and atmospheric interference (false positives).
The machine learning techniques considered in this study showed good results in comparison with previous studies on the automatic detection of selective logging based on high-resolution optical or SAR images, semiautomated methods, and low temporal frequency [26,81,82]. The advantages over previous studies are as follows: (1) after training the algorithms, the method is fully automatic, reducing errors associated with human interpretation; (2) the possibility of constant monitoring even under adverse weather conditions; (3) high revisit capability of X-band COSMO-SkyMed constellation satellites; (4) high generalization capacity of pretrained networks, allowing the use of the same training set if the SAR images are acquired under the same image acquisition modes.
The generalization tests presented results close to those obtained by the algorithms during training and validation, demonstrating the high generalization of the trained networks. A frequently reported problem in studies based on machine learning classifiers is the specialization of the trained network (overfitting) and the problem of local minima, which limit the generalization capacity and demand new training samples for each scene to be classified [83]. A hypothesis for the good results obtained in this work is the stability of the SAR signal in the images used and the use of textural and spatial attributes, which are influenced by spectral changes on a smaller scale than the spectral attributes. In addition to ionospheric effects, climate and seasonal surface conditions can considerably affect SAR measurements, modifying time-series characteristics and limiting large-scale applications [84]. In the case of X-band SAR images, rain interception, for example, can add up to 3 dB to the backscattered signal. Several authors have studied these phenomena and pointed out solutions for stabilizing the multitemporal SAR data signal [84].
The image acquired in January 2018 used in this study showed radiometric problems related to the sensor, as can be seen in Figure 14, which did not negatively affect the generalizability of the pretrained network. However, the method still needs to be tested regarding the generalizability of the prediction according to factors that can affect the detection by the model, such as scenes covering distinct forests and obtained under adverse weather conditions. Convolutional neural network (CNN) tests are also suggested, which have great potential for generalization in areas not seen before, as they use contextual information and are not strongly affected by absolute pixel values [85]. Other suggested approaches to reduce the overfitting problem are endless learning [86] and self-taught learning [87], which can be tested in the future.
Although several tests were performed using deep neural networks, the best result was obtained using a single-layer neural network. Regardless of the depth of the network, an advantage of using artificial neural networks lies in the fact that learning is based on the characteristics of the data presented and not on the a priori knowledge of the interpreter [83]. Thus, an important step to obtain good results is the availability of reliable data for training algorithms. The adoption of neural networks for remote sensing problems has advantages in problems whose physical models are complex (nonlinear, for example), not yet well understood, or even difficult to be generalized [83]. The disadvantage of this type of approach is to find the appropriate hyperparameters for the dataset and problem considered, as well as the long training time (which increases with the number of hidden layers).
Machine-learning-based approaches, especially CNNs, are often treated as black boxes because of the difficulty in understanding the parameters that lead the network to a given decision. Kattenborn et al. [85] attributed this to users’ unfamiliarity with these techniques and to the incomparable depth and number of parameters of these models. However, most machine learning models have a clear and linear structure and basic operators, such as pooling and activation functions. Regarding the attributes used in this study, it is possible, through approaches such as genetic algorithms [88], to estimate the individual contribution of each attribute to the results by means of combinatorial optimization. This will be the subject of further investigation, in addition to tests using CNNs.

5. Conclusions

The images used in this study, whose acquisition dates coincided with the periods before and after selective logging, allowed us to identify the features resulting from this type of activity. Since all images were acquired with the same parameters (orbit direction, incidence angle, band, and polarization), the temporal changes refer to changes in land cover rather than to differences in acquisition parameters. Other factors that can cause multitemporal changes in X-band SAR images are severe atmospheric events and flaws in the image acquisition process.
To the best of our knowledge, no research has been published addressing the use of X-band SAR data acquired in StripMap mode for systematic monitoring of selective logging in a tropical forest. With the increasing availability of these data through the launch of new X-band SAR constellations, new information extraction methodologies are required for applications that demand high revisit rates.
The machine learning techniques tested showed good results from the attributes generated with images acquired before and after the exploitation of the area, showing the potential of X-band SAR data for detecting selective logging in tropical forests. The accuracy obtained for the selectively logged forest class was approximately 88% using an ANN with 50 neurons in a hidden layer, the ReLU activation function, and an SGD weight optimizer with 1000 iterations, even in the validation test applied to a dataset different from that used for training.
As the X-band interacts mostly at the top of the forest canopy, only selective extractions that cause canopy damage can be detected. As a result, 11% of the trees identified as being logged in UPA 11 could not be identified in the first stage of the proposed method, and those logged forests could not be detected using the digital surface model product derived from the first return of LiDAR data.
We recommend that future studies should better explore methods of suppression or minimization of errors related to systematic noise in images or caused by extreme meteorological events, which affect X-band backscatter signals, causing an increase in commission errors. Another suggestion is to extend the tests including the CNN machine learning algorithm and the development of a method based on the probability of being a logging event, rather than the binary classification adopted in this study.

Author Contributions

T.N.K. was primarily responsible for conceptualizing the method and writing the paper; E.E.S., P.d.C.B. and E.H.S. guided the experiments and reviewed the article; P.F.F.S.F. and E.A.T.M. reviewed the article as machine learning and Brazilian degradation monitoring experts, respectively. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available upon request from the author.

Acknowledgments

The authors would like to thank José Humberto Chaves, from the Brazilian Forest Service, for providing the data. The authors also thank Ricardo Dal’Agnol da Silva, from the National Institute for Space Research (INPE), for his important contributions to the review of this article.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. IPCC. Summary for Policymakers. In Global Warming of 1.5 °C. An IPCC Special Report on the Impacts of Global Warming of 1.5 °C above Pre-Industrial Levels and Related Global Greenhouse Gas Emission Pathways, in the Context of Strengthening the Global Response to the Threat of Climate Change; IPCC: Geneva, Switzerland, 2018.
  2. SEEG-Brasil. Sistema de Estimativa de Emissões de Gases de Efeito Estufa. Available online: http://plataforma.seeg.eco.br/total_emission# (accessed on 17 June 2021).
  3. Qin, Y.; Xiao, X.; Wigneron, J.P.; Ciais, P.; Brandt, M.; Fan, L.; Li, X.; Crowell, S.; Wu, X.; Doughty, R.; et al. Carbon Loss from Forest Degradation Exceeds that from Deforestation in the Brazilian Amazon. Nat. Clim. Chang. 2021, 11, 442–448.
  4. Matricardi, E.A.T.; Skole, D.L.; Costa, O.B.; Pedlowski, M.A.; Samek, J.H.; Miguel, E.P. Long-Term Forest Degradation Surpasses Deforestation in the Brazilian Amazon. Science 2020, 369, 1378–1382.
  5. Merry, F.; Soares-Filho, B.; Nepstad, D.; Amacher, G.; Rodrigues, H. Balancing Conservation and Economic Sustainability: The Future of the Amazon Timber Industry. Environ. Manag. 2009, 44, 395–407.
  6. Matricardi, E.A.T.; Skole, D.L.; Pedlowski, M.A.; Chomentowski, W. Assessment of Forest Disturbances by Selective Logging and Forest Fires in the Brazilian Amazon Using Landsat Data. Int. J. Remote Sens. 2013, 34, 1057–1086.
  7. Locks, C.J. Aplicações da Tecnologia LiDAR no Monitoramento da Exploração Madeireira em Áreas de Concessão Florestal. Master’s Thesis, Universidade de Brasília, Brasília, Brazil, 2017.
  8. Asner, G.P.; Knapp, D.E.; Broadbent, E.N.; Oliveira, P.J.C.; Keller, M.; Silva, J.N. Selective Logging in the Brazilian Amazon. Science 2005, 305, 480–482.
  9. Loureiro, V.R.; Pinto, J.N.A. A Questão Fundiária na Amazônia. Estud. Avançados 2005, 19, 77–98.
  10. INPE. Monitoramento do Desmatamento da Floresta Amazônica Brasileira por Satélite. Available online: http://www.obt.inpe.br/OBT/assuntos/programas/amazonia/prodes (accessed on 21 June 2021).
  11. Bem, P.P.; Carvalho, O.A.; Guimarães, R.F.; Gomes, R.A.T. Change Detection of Deforestation in the Brazilian Amazon Using Landsat Data and Convolutional Neural Networks. Remote Sens. 2020, 12, 901.
  12. Cabral, A.I.R.; Saito, C.; Pereira, H.; Laques, A.E. Deforestation Pattern Dynamics in Protected Areas of the Brazilian Legal Amazon Using Remote Sensing Data. Appl. Geogr. 2018, 100, 101–115.
  13. Fawzi, N.I.; Husna, V.N.; Helms, J.A. Measuring Deforestation Using Remote Sensing and its Implication for Conservation in Gunung Palung National Park, West Kalimantan, Indonesia. IOP Conf. Ser. Earth Environ. Sci. 2018, 149, 012038.
  14. Pickering, J.; Stehman, S.V.; Tyukavina, A.; Potapov, P.; Watt, P.; Jantz, S.M.; Bholanath, P.; Hansen, M.C. Quantifying the Trade-off between Cost and Precision in Estimating Area of Forest Loss and Degradation Using Probability Sampling in Guyana. Remote Sens. Environ. 2019, 221, 122–135.
  15. Asner, G.P. Cloud Cover in Landsat Observations of the Brazilian Amazon. Int. J. Remote Sens. 2001, 22, 3855–3862.
  16. Bouvet, A.; Mermoz, S.; Ballère, M.; Koleck, T.; Le Toan, T. Use of the SAR Shadowing Effect for Deforestation Detection with Sentinel-1 Time Series. Remote Sens. 2018, 10, 1250.
  17. Fiorentino, C.; Virelli, M. COSMO-SkyMed Mission and Products Description; Agenzia Spaziale Italiana (ASI): Rome, Italy, 2019. Available online: https://earth.esa.int/eogateway/documents/20142/37627/COSMO-SkyMed-Mission-Products-Description.pdf (accessed on 16 June 2021).
  18. Iceye. Iceye SAR Product Guide; Iceye: Espoo, Finland, 2021.
  19. ESA. TerraSAR-X/TanDEM-X Full Archive and Tasking. Available online: https://earth.esa.int/eogateway/catalog/terrasar-x-tandem-x-full-archive-and-tasking (accessed on 6 July 2021).
  20. Bispo, P.C.; Santos, J.R.; Valeriano, M.M.; Touzi, R.; Seifert, F.M. Integration of Polarimetric PALSAR Attributes and Local Geomorphometric Variables Derived from SRTM for Forest Biomass Modeling in Central Amazonia. Can. J. Remote Sens. 2014, 40, 26–42.
  21. Bispo, P.C.; Rodríguez-Veiga, P.; Zimbres, B.; Miranda, S.C.; Cezare, C.H.G.; Fleming, S.; Baldacchino, F.; Louis, V.; Rains, D.; Garcia, M.; et al. Woody Aboveground Biomass Mapping of the Brazilian Savanna with a Multi-Sensor and Machine Learning Approach. Remote Sens. 2020, 12, 2685.
  22. Santoro, M.; Cartus, O.; Carvalhais, N.; Rozendaal, D.; Avitabilie, V.; Araza, A.; Bruin, S.; Herold, M.; Quegan, S.; Veiga, P.R.; et al. The Global Forest Above-Ground Biomass Pool for 2010 Estimated from High-Resolution Satellite Observations. Earth Syst. Sci. Data Discuss. 2021, 13, 3927–3950.
  23. Schlund, M.; von Poncet, F.; Kuntz, S.; Schmullius, C.; Hoekman, D.H. TanDEM-X Data for Aboveground Biomass Retrieval in a Tropical Peat Swamp Forest. Remote Sens. Environ. 2015, 158, 255–266.
  24. Treuhaft, R.; Goncalves, F.; Santos, J.R.; Keller, M.; Palace, M.; Madsen, S.N.; Sullivan, F.; Graca, P.M.L.A. Tropical-Forest Biomass Estimation at X-Band from the Spaceborne TanDEM-X Interferometer. IEEE Geosci. Remote Sens. Lett. 2015, 12, 239–243.
  25. Treuhaft, R.; Lei, Y.; Gonçalves, F.; Keller, M.; Santos, J.R.; Neumann, M.; Almeida, A. Tropical-Forest Structure and Biomass Dynamics from TanDEM-X Radar Interferometry. Forests 2017, 8, 277.
  26. Deutscher, J.; Perko, R.; Gutjahr, K.; Hirschmugl, M.; Schardt, M. Mapping Tropical Rainforest Canopy Disturbances in 3D by COSMO-SkyMed Spotlight InSAR-Stereo Data to Detect Areas of Forest Degradation. Remote Sens. 2013, 5, 648–663.
  27. Lei, Y.; Treuhaft, R.; Keller, M.; Santos, M.; Gonçalves, F.; Neumann, M. Quantification of Selective Logging in Tropical Forest with Spaceborne SAR Interferometry. Remote Sens. Environ. 2018, 211, 167–183.
  28. Berninger, A.; Lohberger, S.; Stängel, M.; Siegert, F. SAR-Based Estimation of Above-Ground Biomass and its Changes in Tropical Forests of Kalimantan Using L- and C-Band. Remote Sens. 2018, 10, 831.
  29. Delgado-Aguilar, M.J.; Fassnacht, F.E.; Peralvo, M.; Gross, C.P.; Schmitt, C.B. Potential of TerraSAR-X and Sentinel 1 Imagery to Map Deforested Areas and Derive Degradation Status in Complex Rain Forests of Ecuador. Int. For. Rev. 2017, 19, 102–118.
  30. Macedo, C.R.; Ogashawara, I. Comparação de Filtros Adaptativos para Redução do Ruído Speckle em Imagens SAR. In Proceedings of the XVI Simpósio Bras. Sensoriamento Remoto, Foz do Iguaçu, Brazil, 13–18 April 2013; INPE: São José dos Campos, Brazil, 2013.
  31. Gomez, L.; Buemi, M.E.; Jacobo-Berlles, J.C.; Mejail, M.E. A New Image Quality Index for Objectively Evaluating Despeckling Filtering in SAR Images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2016, 9, 1297–1307.
  32. Mahdavi, S.; Salehi, B.; Moloney, C.; Huang, W.; Brisco, B. Speckle Filtering of Synthetic Aperture Radar Images Using Filters with Object-Size-Adapted Windows. Int. J. Digit. Earth 2018, 11, 703–729.
  33. Kuck, T.N.; Gomez, L.D.; Sano, E.E.; Bispo, P.C.; Honorio, D.D.C. Performance of Speckle Filters for COSMO-SkyMed Images from the Brazilian Amazon. IEEE Geosci. Remote Sens. Lett. 2021, 99, 1–5.
  34. Shi, W.; Zhang, M.; Zhang, R.; Chen, S.; Zhan, Z. Change Detection Based on Artificial Intelligence: State-of-the-Art and Challenges. Remote Sens. 2020, 12, 1688.
  35. Shi, J.; Liu, X.; Lei, Y. SAR Images Change Detection Based on Self-Adaptive Network Architecture. IEEE Geosci. Remote Sens. Lett. 2020, 18, 1204–1208.
  36. Liu, R.; Wang, R.; Huang, J.; Li, J.; Jiao, L. Change Detection in SAR Images Using Multiobjective Optimization and Ensemble Strategy. IEEE Geosci. Remote Sens. Lett. 2020.
  37. Yan, S.; Jing, L.; Wang, H. A New Individual Tree Species Recognition Method Based on a Convolutional Neural Network and High-Spatial Resolution Remote Sensing Imagery. Remote Sens. 2021, 13, 479.
  38. Ghosh, S.M.; Behera, M.D. Aboveground Biomass Estimation Using Multi-Sensor Data Synergy and Machine Learning Algorithms in a Dense Tropical Forest. Appl. Geogr. 2018, 96, 29–40.
  39. Seo, D.K.; Kim, Y.H.; Eo, Y.D.; Lee, M.H.; Park, W.Y. Fusion of SAR and Multispectral Images Using Random Forest Regression for Change Detection. ISPRS Int. J. Geo-Inf. 2018, 7, 401.
  40. Yu, Y.; Li, M.; Fu, Y. Forest Type Identification by Random Forest Classification Combined with SPOT and Multitemporal SAR Data. J. For. Res. 2018, 29, 1407–1414.
  41. Zhao, X.; Jiang, Y.; Stathaki, T. Automatic Target Recognition Strategy for Synthetic Aperture Radar Images Based on Combined Discrimination Trees. Comput. Intell. Neurosci. 2017, 7186120.
  42. Zhang, F.; Wang, Y.; Ni, J.; Zhou, Y.; Hu, W. SAR Target Small Sample Recognition Based on CNN Cascaded Features and AdaBoost Rotation Forest. IEEE Geosci. Remote Sens. Lett. 2020, 17, 1008–1012.
  43. Camargo, F.F.; Sano, E.E.; Almeida, C.M.; Mura, J.C.; Almeida, T. A Comparative Assessment of Machine-Learning Techniques for Land Use and Land Cover Classification of the Brazilian Tropical Savanna Using ALOS-2/PALSAR-2 Polarimetric Images. Remote Sens. 2019, 11, 1600.
  44. Lee, Y.S.; Lee, S.; Baek, W.K.; Jung, H.S.; Park, S.H.; Lee, M.J. Mapping Forest Vertical Structure in Jeju Island from Optical and Radar Satellite Images Using Artificial Neural Network. Remote Sens. 2020, 12, 797.
  45. Dong, H.; Ma, W.; Wu, Y.; Gong, M.; Jiao, L. Local Descriptor Learning for Change Detection in Synthetic Aperture Radar Images via Convolutional Neural Networks. IEEE Access 2019, 7, 15389–15403.
  46. Li, Y.; Peng, C.; Chen, Y.; Jiao, L.; Zhou, L.; Shang, R. A Deep Learning Method for Change Detection in Synthetic Aperture Radar Images. IEEE Trans. Geosci. Remote Sens. 2019, 57, 5751–5763.
  47. Cui, B.; Zhang, Y.; Yan, L.; Wei, J.; Wu, H. An Unsupervised SAR Change Detection Method Based on Stochastic Subspace Ensemble Learning. Remote Sens. 2019, 11, 1314.
  48. Jaturapitpornchai, R.; Matsuoka, M.; Kanemoto, N.; Kuzuoka, S.; Ito, R.; Nakamura, R. Newly Built Construction Detection in SAR Images Using Deep Learning. Remote Sens. 2019, 11, 1444.
  49. IBGE. Manual Técnico da Vegetação Brasileira; IBGE: Rio de Janeiro, Brazil, 2012.
  50. SFB. Serviço Florestal Brasileiro. Floresta Nacional do Jamari. Available online: http://www.florestal.gov.br/florestas-sob-concessao/92-concessoes-florestais/florestas-sob-concessao/101-floresta-nacional-do-jamari-ro (accessed on 16 June 2021).
  51. Lopes, A.; Nezry, E.; Touzi, R.; Laur, H. Maximum a Posteriori Speckle Filtering and First Order Texture Models in SAR Images. In Proceedings of the 10th Annual International Symposium on Geoscience and Remote Sensing, College Park, MD, USA, 20–24 May 1990; Volume 28, pp. 2409–2412.
  52. Koeniguer, E.; Nicolas, J.-M.; Janez, F. Worldwide Multitemporal Change Detection Using Sentinel-1 Images. In Proceedings of the BIDS—Conference on Big Data from Space, Munich, Germany, 19–21 February 2019.
  53. Koeniguer, E.C.; Nicolas, J.M. Change Detection Based on the Coefficient of Variation in SAR Time-Series of Urban Areas. Remote Sens. 2020, 12, 2089.
  54. Deutscher, J.; Gutjahr, K.; Perko, R.; Raggam, H.; Hirschmugl, M.; Schardt, M. Humid Tropical Forest Monitoring with Multi-Temporal L-, C- and X-Band SAR Data. In Proceedings of the 2017 9th International Workshop on the Analysis of Multitemporal Remote Sensing Images (MultiTemp), Bruges, Belgium, 27–29 June 2017; pp. 1–4.
  55. Liu, H.; Wang, L. Mapping Detention Basins and Deriving Their Spatial Attributes from Airborne LiDAR Data for Hydrological Applications. Hydrol. Process. 2008, 22, 2358–2369.
  56. Everitt, B.S.; Skrondal, A. The Cambridge Dictionary of Statistics; Cambridge University Press: New York, NY, USA, 2010.
  57. Anys, H.; Bannari, A.; He, D.C.; Morin, D. Cartographie des Zones Urbaines à l’aide des Images Aéroportées MEIS-II. Int. J. Remote Sens. 1998, 19, 883–894.
  58. Haralick, R.M.; Dinstein, I.; Shanmugam, K. Textural Features for Image Classification. IEEE Trans. Syst. Man Cybern. 1973, SMC-3, 610–621.
  59. Demšar, J.; Curk, T.; Erjavec, A.; Gorup, Č.; Hočevar, T.; Milutinovič, M.; Možina, M.; Polajnar, M.; Toplak, M.; Starič, A.; et al. Orange: Data Mining Toolbox in Python. J. Mach. Learn. Res. 2013, 14, 2349–2353.
  60. Xu, Y.; Goodacre, R. On Splitting Training and Validation Set: A Comparative Study of Cross-Validation, Bootstrap and Systematic Sampling for Estimating the Generalization Performance of Supervised Learning. J. Anal. Test. 2018, 2, 249–262.
  61. Belgiu, M.; Drăguţ, L. Random Forest in Remote Sensing: A Review of Applications and Future Directions. ISPRS J. Photogramm. Remote Sens. 2016, 114, 24–31.
  62. Du, P.; Samat, A.; Waske, B.; Liu, S.; Li, Z. Random Forest and Rotation Forest for Fully Polarized SAR Image Classification Using Polarimetric and Spatial Features. ISPRS J. Photogramm. Remote Sens. 2015, 105, 38–53.
  63. Topouzelis, K.; Psyllos, A. Oil Spill Feature Selection and Classification Using Decision Tree Forest on SAR Image Data. ISPRS J. Photogramm. Remote Sens. 2012, 68, 135–143.
  64. Maxwell, A.E.; Warner, T.A.; Fang, F. Implementation of Machine-Learning Classification in Remote Sensing: An Applied Review. Int. J. Remote Sens. 2018, 39, 2784–2817. [Google Scholar] [CrossRef] [Green Version]
  65. Abiodun, O.I.; Jantan, A.; Omolara, A.E.; Dada, K.V.; Mohamed, N.A.E.; Arshad, H. State-of-the-Art in Artificial Neural Network Applications: A Survey. Heliyon 2018, 4, e00938. [Google Scholar] [CrossRef] [Green Version]
  66. Cheng, G.; Xie, X.; Han, J.; Guo, L.; Xia, G.S. Remote Sensing Image Scene Classification meets Deep Learning: Challenges, Methods, Benchmarks, and Opportunities. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 3735–3756. [Google Scholar] [CrossRef]
  67. Haykin, S. Neural Networks and Learning Machines; Pearson Prentice Hall: New Jersey, NJ, USA, 2008. [Google Scholar]
  68. Atkinson, P.M.; Tatnall, A.R.L. Introduction Neural Networks in Remote Sensing. Int. J. Remote Sens. 1997, 18, 699–709. [Google Scholar] [CrossRef]
  69. Deng, L.; Yu, D. Deep Learning: Methods and Applications. Found. Trends Signal Process. 2014, 7, 197–387. [Google Scholar] [CrossRef] [Green Version]
  70. Byrd, R.H.; Lu, P.; Nocedal, J.; Zhu, C. A Limited Memory Algorithm for Bound Constrained Optimization. SIAM J. Sci. Comput. 1995, 16, 1190–1208. [Google Scholar] [CrossRef]
  71. Bottou, L. Large-Scale Machine Learning with Stochastic Gradient Descent. In Proceedings of the COMPSTAT the 19th International Conference on Computational Statistics, Keynote, Invited and Contributed Papers, Paris, France, 22–27 August 2010. [Google Scholar] [CrossRef] [Green Version]
  72. Kingma, D.P.; Ba, J.L. Adam: A Method for Stochastic Optimization. In Proceedings of the 3rd International Conference on Learning Representations, San Diego, CA, USA, 7–9 May 2015; pp. 1–15. [Google Scholar]
  73. Powers, D.M.W. Evaluation: From Precision, Recall and F-Measure to ROC, Informedness, Markedness and Correlation. arXiv 2020, arXiv:2010.16061. [Google Scholar]
  74. Corani, G.; Benavoli, A. A Bayesian Approach for Comparing Cross-Validated Algorithms on Multiple Data Sets. Mach. Learn. 2015, 100, 285–304. [Google Scholar] [CrossRef] [Green Version]
  75. Shiraishi, T.; Motohka, T.; Thapa, R.B.; Watanabe, M.; Shimada, M. Comparative Assessment of Supervised Classifiers for Land Use-Land Cover Classification in a Tropical Region Using Time-Series PALSAR Mosaic Data. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2014, 7, 1186–1199. [Google Scholar] [CrossRef]
  76. Parisi, L.; Candidate, M.; Ma, R.; RaviChandran, N.; Lanzillotta, M. Hyper-Sinh: An Accurate and Reliable Function from Shallow to Deep Learning in TensorFlow and Keras. Mach. Learn. Appl. 2021, 6, 100112. [Google Scholar]
  77. Bera, S.; Shrivastava, V.K. Analysis of Various Optimizers on Deep Convolutional Neural Network Model in the Application of Hyperspectral Remote Sensing Image Classification. Int. J. Remote Sens. 2019, 41, 2664–2683. [Google Scholar] [CrossRef]
  78. Saueressig, D. Manual de Dendrologia: O Estudo Das Árvores, 1st ed.; Plantas do Brasil Ltda: Irati, Brazil, 2018; p. 67. [Google Scholar]
  79. Marzano, F.S.; Mori, S.; Weinman, J.A. Evidence of Rainfall Signatures on X-Band Synthetic Aperture Radar Imagery over Land. IEEE Trans. Geosci. Remote Sens. 2010, 48, 950–964. [Google Scholar] [CrossRef]
  80. SFB. Serviço Florestal Brasileiro. DETEX. Available online: https://www.florestal.gov.br/monitoramento (accessed on 21 June 2021).
  81. da Costa, O.B.; Matricardi, E.A.T.; Pedlowski, M.A.; Miguel, E.P.; de Oliveira Gaspar, R. Selective Logging Detection in the Brazilian Amazon. Floresta Ambient. 2019, 26, e20170634. Available online: https://www.scielo.br/j/floram/a/pKkYCDd9bzFqrZxJpqqpfhc/abstract/?lang=en (accessed on 14 June 2021). [CrossRef] [Green Version]
  82. Bullock, E.L.; Woodcock, C.E.; Souza, C., Jr.; Olofsson, P. Satellite-Based Estimates Reveal Widespread Forest Degradation in the Amazon. Glob. Chang. Biol. 2020, 26, 2956–2969. [Google Scholar] [CrossRef] [PubMed]
  83. Zhu, X.X.; Tuia, D.; Mou, L.; Xia, G.S.; Zhang, L.; Xu, F.; Fraundorfer, F. Deep Learning in Remote Sensing: A Comprehensive Review and List of Resources. IEEE Geosci. Remote Sens. Mag. 2017, 5, 8–36. [Google Scholar] [CrossRef] [Green Version]
  84. Doblas, J.; Shimabukuro, Y.; Sant’anna, S.; Carneiro, A.; Aragão, L.; Almeida, C. Optimizing near Real-Time Detection of Deforestation on Tropical Rainforests Using Sentinel-1 Data. Remote Sens. 2020, 12, 3922. [Google Scholar] [CrossRef]
  85. Kattenborn, T.; Leitloff, J.; Schiefer, F.; Hinz, S. Review on Convolutional Neural Networks (CNN) in Vegetation Remote Sensing. ISPRS J. Photogramm. Remote Sens. 2021, 173, 24–49. [Google Scholar] [CrossRef]
  86. Mitchell, T.; Cohen, W.; Hruschka, E.; Talukdar, P.; Betteridge, J.; Carlson, A.; Dalvi, B.; Gardner, M.; Kisiel, B.; Krishnamurthy, J.; et al. Never-Ending Learning. Proc. Natl. Conf. Artif. Intell. 2015, 3, 2302–2310. [Google Scholar] [CrossRef]
  87. Feng, S.; Yu, H.; Duarte, M.F. Autoencoder Based Sample Selection for Self-Taught Learning. Knowl. Based Syst. 2020, 192, 105343. [Google Scholar] [CrossRef] [Green Version]
  88. Goldberg, D.E. Genetic Algorithms in Search, Optimization, and Machine Learning; Addison Wesley: Boston, MA, USA, 1989. [Google Scholar]
Figure 1. Location of the study area in Rondônia State, Brazil (A); the study site in the Jamari National Forest (B).
Figure 2. LiDAR processing steps used to create the ground-truth data.
Figure 3. Structure of the random forest classifier.
Figure 4. Structure of the AdaBoost classifier. Red dots and black crosses are the features to be classified. A single weak linear classifier cannot separate the classes correctly, as shown at the bottom of the figure (red and blue). On the right, the result of the AdaBoost classifier, which assigns weights to the weak classifiers and thus achieves a better separation of the classes.
Figure 5. Structure of an artificial neuron.
Figure 6. Boxplots showing the distribution of four attributes (normalized maximum ratio, minimum, shape, and gradient ranges) for the timber extraction and no-timber extraction classes.
Figure 7. Confusion matrices for the random forest topologies.
Figure 8. Confusion matrices for AdaBoost topologies.
Figure 9. Confusion matrices for MLP-ANN topologies.
Figure 10. Boxplots of the misclassified features, illustrating their differences for a randomly selected attribute.
Figure 11. Digital surface models generated from LiDAR data before and after exploitation, showing a tree undetected by the proposed method.
Figure 12. Polygons generated by slicing the coefficient of variation (in yellow) before and after classification, for the January–June and June–October pairs of images.
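The "slicing" of the coefficient of variation referred to in Figure 12 (and Figure 14) can be illustrated with a minimal sketch: the per-pixel temporal coefficient of variation (standard deviation divided by mean) of a co-registered, speckle-filtered image pair is computed and then thresholded into a binary change mask. The array names, threshold value, and NumPy-based implementation below are assumptions for illustration only and do not reproduce the study's exact parameters.

```python
import numpy as np

def coefficient_of_variation(stack: np.ndarray) -> np.ndarray:
    """Per-pixel temporal coefficient of variation (std / mean) of a SAR
    time series with shape (n_dates, rows, cols), in linear units."""
    mean = stack.mean(axis=0)
    std = stack.std(axis=0)
    # Avoid division by zero over no-data or very dark pixels
    return np.divide(std, mean, out=np.zeros_like(mean), where=mean > 0)

def slice_change_mask(cv: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Binary change mask obtained by 'slicing' the CV image at a threshold
    (illustrative value; the threshold used in the study may differ)."""
    return cv > threshold

# Hypothetical pair of co-registered, speckle-filtered backscatter images
before = np.random.gamma(shape=4.0, scale=0.02, size=(512, 512))
after = before.copy()
after[100:120, 200:220] *= 0.3   # simulate a backscatter drop at a logged patch
cv_image = coefficient_of_variation(np.stack([before, after]))
mask = slice_change_mask(cv_image, threshold=0.5)
print(f"Changed pixels: {mask.sum()}")
```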
Figure 13. COSMO-SkyMed images showing an extraction that was not identified. The green asterisk represents a tree recorded as extracted in the forest inventory.
Figure 14. Polygons generated by slicing the coefficient of variation (in yellow) from the January–June image pair, showing strips that indicate sensor/acquisition problems.
Figure 15. Pixels classified as selective logging (in yellow) overlaid on the polygons (in magenta) provided by the partnership between the Brazilian Forest Service (SFB) and the National Institute for Space Research (INPE), which delimit illegal logging areas; overlapping areas are shown.
Table 1. Characteristics of the COSMO-SkyMed scenes used in this study.
Parameter | Specification
Platform | COSMO-SkyMed
Launch | June 2007
Swath | 620 km
Wavelength | X-band
Polarization | HH
Number of satellites | 4
Year | 2018
Acquisition mode | Stripmap HIMAGE
Size | 40 km × 40 km
Incidence angle | ~55°
Spatial resolution | 3 m × 3 m
Table 2. Random forest test parameters considered in this study.
Identification | Number of Trees (Ntree) | Number of Attributes in Each Division (Mtry)
RF1 | 10 | 160
RF2 | 15 | 160
RF3 | 20 | 160
RF4 | 30 | 160
RF5 | 50 | 160
RF6 | 50 | 5
RF7 | 100 | 160
RF8 | 200 | 160
RF9 | 200 | 5
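To illustrate how the topologies listed in Table 2 translate into a concrete configuration, the sketch below sets up one of them (RF3: Ntree = 20, Mtry = 160). It assumes a scikit-learn implementation and synthetic placeholder data; the study itself used a different toolbox, so this is only an illustrative mapping of the table's parameters, not the code actually used.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification

# Placeholder data standing in for the per-segment SAR attributes and the
# timber extraction / no-timber extraction labels used in the study.
X, y = make_classification(n_samples=2000, n_features=160, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Example topology RF3 from Table 2: Ntree = 20 trees, Mtry = 160 attributes per split.
rf = RandomForestClassifier(
    n_estimators=20,    # Ntree
    max_features=160,   # Mtry: attributes evaluated at each division
    random_state=0,
)
rf.fit(X_train, y_train)
print(f"Test accuracy: {rf.score(X_test, y_test):.4f}")
```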
Table 3. AdaBoost test parameters considered in this study.
Identification | Number of Estimators | Learning Rate ¹ | Rank Algorithm ²
AB1 | 1 | 1.00 | SAMME.R
AB2 | 15 | 1.00 | SAMME.R
AB3 | 20 | 1.00 | SAMME.R
AB4 | 30 | 1.00 | SAMME.R
AB5 | 50 | 1.00 | SAMME.R
AB6 | 100 | 1.00 | SAMME.R
¹ The learning rate determines the extent to which newly acquired information replaces old information (0 = the agent learns nothing; 1 = the agent considers only the most recent information). ² Rank algorithm: SAMME (updates the base estimator weights with classification results) or SAMME.R (updates the base estimator weights with probability estimates).
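Analogously, the sketch below instantiates one AdaBoost configuration from Table 3 (AB5: 50 estimators, learning rate 1.00, SAMME.R). It assumes scikit-learn's AdaBoostClassifier with placeholder data; note that the SAMME.R option is available in older scikit-learn releases, while recent versions retain only SAMME. The sketch is illustrative and does not reproduce the study's actual pipeline.

```python
from sklearn.ensemble import AdaBoostClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Placeholder data standing in for the per-segment SAR attributes and labels.
X, y = make_classification(n_samples=2000, n_features=160, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Example topology AB5 from Table 3: 50 estimators, learning rate 1.00, SAMME.R.
ab = AdaBoostClassifier(
    n_estimators=50,
    learning_rate=1.0,
    algorithm="SAMME.R",  # weight updates based on class probability estimates
    random_state=0,
)
ab.fit(X_train, y_train)
print(f"Test accuracy: {ab.score(X_test, y_test):.4f}")
```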
Table 4. Artificial neural network test parameters.
Identification | Number of Neurons in Each Hidden Layer | Number of Hidden Layers | Activation Function | Weight Optimizer | α (Stop Criterion) | Maximum Number of Iterations
NN1 to NN5 | 10 | 1 to 5 | ReLu | L-BFGS-B | 0.00002 | 1000
NN6 to NN10 | 50 | 1 to 5 | ReLu | L-BFGS-B | 0.00002 | 1000
NN11 to NN15 | 100 | 1 to 5 | ReLu | L-BFGS-B | 0.00002 | 1000
NN16 to NN20 | 200 | 1 to 5 | ReLu | L-BFGS-B | 0.00002 | 1000
NN21 to NN25 | 10 | 1 to 5 | ReLu | SGD | 0.00002 | 1000
NN26 to NN30 | 50 | 1 to 5 | ReLu | SGD | 0.00002 | 1000
NN31 to NN35 | 100 | 1 to 5 | ReLu | SGD | 0.00002 | 1000
NN36 to NN40 | 200 | 1 to 5 | ReLu | SGD | 0.00002 | 1000
NN41 to NN45 | 10 | 1 to 5 | ReLu | Adam | 0.00002 | 1000
NN46 to NN50 | 50 | 1 to 5 | ReLu | Adam | 0.00002 | 1000
NN51 to NN55 | 100 | 1 to 5 | ReLu | Adam | 0.00002 | 1000
NN56 to NN60 | 200 | 1 to 5 | ReLu | Adam | 0.00002 | 1000
NN61 | First best result of NN1–NN60 | – | – | – | – | 2000
NN62 | First best result of NN1–NN60 | – | tanh | – | – | 2000
NN63 | Second best result of NN1–NN60 | – | – | – | – | 2000
NN64 | Second best result of NN1–NN60 | – | tanh | – | – | 2000
NN65 | Third best result of NN1–NN60 | – | – | – | – | 2000
NN66 | Third best result of NN1–NN60 | – | tanh | – | – | 2000
NN67 | Fourth best result of NN1–NN60 | – | – | – | – | 2000
NN68 | Fourth best result of NN1–NN60 | – | tanh | – | – | 2000
NN69 | Fifth best result of NN1–NN60 | – | – | – | – | 2000
NN70 | Fifth best result of NN1–NN60 | – | tanh | – | – | 2000
Note: for NN61–NN70, "–" indicates that the setting was kept as in the corresponding best-performing topology among NN1–NN60.
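As a concrete reading of Table 4, the sketch below builds the topology highlighted in the abstract (50 neurons in a single hidden layer, ReLu activation, SGD optimizer, up to 1000 iterations, i.e., an NN26-type configuration). scikit-learn's MLPClassifier and synthetic data are used only as stand-ins for the toolbox actually employed; in particular, mapping the table's α (stop criterion) to the tol stopping tolerance is an assumption.

```python
from sklearn.neural_network import MLPClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Placeholder data standing in for the per-segment SAR attributes and labels.
X, y = make_classification(n_samples=2000, n_features=160, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Topology NN26 from Table 4: one hidden layer of 50 neurons, ReLu, SGD, 1000 iterations.
mlp = MLPClassifier(
    hidden_layer_sizes=(50,),  # one hidden layer with 50 neurons
    activation="relu",
    solver="sgd",
    tol=0.00002,               # mapping the table's alpha to a stopping tolerance is an assumption
    max_iter=1000,
    random_state=0,
)
mlp.fit(X_train, y_train)
print(f"Test accuracy: {mlp.score(X_test, y_test):.4f}")
```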
Table 5. Random forest classification results. In bold, the best results obtained.
Identification | Training Time (s) | Testing Time (s) | AUC | Accuracy | F1 | Precision | Recall
RF1 | 0.349 | 0.1720 | 0.9661 | 0.9456 | 0.9461 | 0.9468 | 0.9456
RF2 | 0.418 | 0.0500 | 0.9721 | 0.9440 | 0.9441 | 0.9442 | 0.9440
RF3 | 0.550 | 0.0550 | 0.9742 | 0.9504 | 0.9506 | 0.9509 | 0.9504
RF4 | 0.991 | 0.0590 | 0.9737 | 0.9464 | 0.9467 | 0.9472 | 0.9464
RF5 | 1.423 | 0.0740 | 0.9740 | 0.9440 | 0.9443 | 0.9447 | 0.9440
RF6 | 0.898 | 0.0900 | 0.9740 | 0.9424 | 0.9423 | 0.9422 | 0.9424
RF7 | 4.962 | 0.2980 | 0.9772 | 0.9440 | 0.9442 | 0.9444 | 0.9440
RF8 | 5.369 | 0.4600 | 0.9792 | 0.9416 | 0.9417 | 0.9419 | 0.9416
RF9 | 1.769 | 0.1410 | 0.9759 | 0.9440 | 0.9440 | 0.9440 | 0.9440
Table 6. AdaBoost results.
Identification | Training Time (s) | Testing Time (s) | AUC | Accuracy, F1, Precision, and Recall
AB1 | 0.901 | 0.064 | 0.903 | 0.919
AB2 | 0.625 | 0.067 | 0.899 | 0.914
AB3 | 0.749 | 0.049 | 0.899 | 0.914
AB4 | 0.514 | 0.036 | 0.899 | 0.914
AB5 | 0.664 | 0.034 | 0.909 | 0.924
AB6 | 0.547 | 0.036 | 0.909 | 0.924
Table 7. MLP-ANN results. In bold, the best results obtained.
Identification | Training Time (s) | Testing Time (s) | AUC | Accuracy | F1 | Precision | Recall
NN22 | 3.723 | 0.208 | 0.987 | 0.959 | 0.958 | 0.958 | 0.959
NN26 | 9.456 | 0.161 | 0.988 | 0.961 | 0.961 | 0.961 | 0.961
NN29 | 102.091 | 0.222 | 0.983 | 0.957 | 0.957 | 0.957 | 0.957
NN31 | 12.075 | 0.224 | 0.986 | 0.956 | 0.956 | 0.956 | 0.956
NN32 | 117.475 | 0.236 | 0.986 | 0.954 | 0.954 | 0.954 | 0.954
NN36 | 28.229 | 0.195 | 0.988 | 0.960 | 0.960 | 0.960 | 0.960
NN37 | 209.572 | 0.259 | 0.986 | 0.954 | 0.954 | 0.954 | 0.954
NN64 | 8.518 | 0.210 | 0.986 | 0.956 | 0.956 | 0.956 | 0.956
NN66 | 4.238 | 0.173 | 0.983 | 0.955 | 0.955 | 0.955 | 0.955
NN70 | 3.672 | 0.174 | 0.986 | 0.960 | 0.959 | 0.959 | 0.960
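For reference, the accuracy, F1, precision, recall, and AUC values reported in Tables 5–7 can be computed from a classifier's predictions as sketched below; the labels and scores are arbitrary examples, and the use of weighted averaging over the two classes is an assumption.

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support, roc_auc_score

# Hypothetical ground-truth labels, predicted labels, and class-1 scores, standing
# in for the timber extraction (1) / no-timber extraction (0) test set.
y_true  = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
y_pred  = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1]
y_score = [0.9, 0.2, 0.8, 0.4, 0.1, 0.3, 0.7, 0.2, 0.95, 0.6]

accuracy = accuracy_score(y_true, y_pred)
# Weighted averaging over both classes (assumed to match the reported tables).
precision, recall, f1, _ = precision_recall_fscore_support(y_true, y_pred, average="weighted")
auc = roc_auc_score(y_true, y_score)
print(f"Accuracy={accuracy:.3f} F1={f1:.3f} Precision={precision:.3f} "
      f"Recall={recall:.3f} AUC={auc:.3f}")
```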
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
