
A Comparative Assessment of Machine-Learning Techniques for Land Use and Land Cover Classification of the Brazilian Tropical Savanna Using ALOS-2/PALSAR-2 Polarimetric Images

1
University of Brasília, Institute of Geosciences, CEP: 70297-400 Brasília, DF, Brazil
2
Embrapa Cerrados, BR-020 km 18, CEP: 73310-970 Planaltina, DF, Brazil
3
National Institute for Space Research, Av. Dos Astronautas, 1758, CEP: 12227-010 São José dos Campos, SP, Brazil
*
Author to whom correspondence should be addressed.
Remote Sens. 2019, 11(13), 1600; https://doi.org/10.3390/rs11131600
Received: 6 May 2019 / Revised: 29 June 2019 / Accepted: 29 June 2019 / Published: 5 July 2019
(This article belongs to the Section Environmental Remote Sensing)

Abstract

This study proposes a workflow for land use and land cover (LULC) classification of Advanced Land Observing Satellite-2 (ALOS-2) Phased Array type L-band Synthetic Aperture Radar-2 (PALSAR-2) images of the Brazilian tropical savanna (Cerrado) biome. The following LULC classes were considered: forestlands; shrublands; grasslands; reforestations; croplands; pasturelands; bare soils/straws; urban areas; and water reservoirs. The proposed approach combines polarimetric attributes, image segmentation, and machine-learning procedures. A set of 125 attributes was generated using polarimetric ALOS-2/PALSAR-2 images, including the van Zyl, Freeman–Durden, Yamaguchi, and Cloude–Pottier target decomposition components, incoherent polarimetric parameters (biomass indices and polarization ratios), and HH-, HV-, VH-, and VV-polarized amplitude images. These attributes were classified using the Naive Bayes (NB), DT J48 (DT = decision tree), Random Forest (RF), Multilayer Perceptron (MLP), and Support Vector Machine (SVM) algorithms. The RF, MLP, and SVM classifiers presented the most accurate performances. NB and DT J48 classifiers showed a lower performance in relation to the RF, MLP, and SVM. The DT J48 classifier was the most suitable algorithm for discriminating urban areas and natural vegetation cover. The proposed workflow can be replicated for other SAR images with different acquisition modes or for other types of vegetation domains.
Keywords: SAR; polarimetry; data mining; thematic mapping; Cerrado

1. Introduction

Land use and land cover (LULC) data are essential in several activities, including urban and regional planning [1,2], natural resources inventories [3,4], global environmental modeling processes [5], and monitoring of greenhouse gas emissions related to deforestation and forest degradation [6,7]. Although most LULC maps in Brazil have been produced using optical remote sensing data [8,9], such data are of limited use in tropical regions because of persistent cloud coverage. In this study, we considered the use of synthetic aperture radar (SAR) data, which are obtained by active systems operating in the microwave range of the electromagnetic spectrum. SAR images are highly sensitive to soil moisture content [10], surface roughness [11,12], and vegetation structure [13]; therefore, they are highly complementary to optical images, which depend strongly on, for example, the chlorophyll content of green vegetation or the organic matter and iron contents of minerals and rocks [14].
Some authors have analyzed SAR images for identifying different LULC classes in the Brazilian tropical savanna (Cerrado), which is an important hotspot for biodiversity conservation [15] and a major region of grain production for export [16]. Approximately 43% of the Cerrado biome (88 million hectares) has been converted into different land use types, especially pasturelands and croplands [9]. The Cerrado has been at the center of the great expansion of national grain production for export [17]. Sano et al. [18,19] evaluated the potential of Japanese Earth Resources Satellite-1 (JERS-1) data to discriminate Cerrado phytophysiognomies within the Brasília National Park. The L-band data presented an accurate performance that was mainly based on differences in the canopy structure of such phytophysiognomies.
Bitencourt et al. [20] also evaluated JERS-1 images, but with the purpose of estimating the Cerrado's woody biomass. They obtained a coefficient of determination (R²) of 0.87 between field biomass data and the corresponding L-band backscattering coefficients (σo, units in dB). Along the same lines, Ningthoujam et al. [21] assessed the relationship between S-band radar backscatter and aboveground biomass in different forest types, including temperate and tropical forest–savanna environments located in the UK, Brazil, Africa, and India, with the aid of an airborne SAR. Bouvet et al. [22], in turn, analyzed an Advanced Land Observing Satellite (ALOS) mosaic to estimate biomass throughout the African continent, and Odipo et al. [23] combined ALOS and laser-altimeter data to estimate biomass in South Africa. Cassol et al. [24] further explored this topic, retrieving secondary forest aboveground biomass from polarimetric Advanced Land Observing Satellite-2 (ALOS-2)/Phased Array type L-band Synthetic Aperture Radar-2 (PALSAR-2) data in the Brazilian Amazon, which comprises forest–savanna interfaces.
Concerning the particular topic of vegetation types classification, Sano et al. [25] presented a LULC map of the Federal District of Brazil with 10 natural and anthropogenic classes based on ALOS-1/ PALSAR-1 data. A validation of this map showed a Tau (τ) index of 0.70. Some authors have also evaluated ALOS data for vegetation classification in the African savanna. Braun and Hochschild [5] used the Random Forest (RF) classifier and ALOS time series to describe changes in the savanna ecosystem around the Djabal refugee camp in eastern Chad. Symeonakis et al. [26] conducted a RF classification of different savanna land cover types based on the use of ALOS and LANDSAT imagery of both dry and wet seasons along a transect extending over the border of South Africa and Botswana. Urbazaev et al. [27] also used the Japanese SAR satellite images to map the fractional woody cover in southern African savannas based on polarimetric attributes (van Zyl and Freeman–Durden decompositions) and the RF classifier. Recently, Mendes et al. [28] also mapped vegetation types in the Brazilian Cerrado by means of the RF classifier using Sentinel 2A, Sentinel 1A, ALOS-PALSAR 2 dual/full polarimetric, and TanDEM-X images related to dry and rainy seasons, and exploring target decomposition approaches (van Zyl, Freeman–Durden, and Yamaguchi decompositions), as well.
Except for Braun and Hochschild [5], Cassol et al. [24], and Mendes et al. [28], all the previously reported studies about savannas worldwide used only data from airborne SAR or from the JERS-1 and ALOS-1/PALSAR-1 satellites. In the experiment presented here, we considered the more recent and advanced ALOS-2/PALSAR-2 datasets. Compared with these previous savanna analyses, this article also deals with a more complex problem: it uses a more detailed legend comprising nine LULC classes and explores several machine-learning classifiers, as well as several polarimetric attributes extracted from different methods (eight incoherent polarimetric parameters and four incoherent target decomposition methods: van Zyl, Freeman–Durden, Yamaguchi, and Cloude–Pottier). This approach has not yet been explored in studies involving either Brazilian or African tropical savannas.
In this context, this paper evaluates different supervised machine-learning techniques for classifying several LULC categories found in the Brazilian Cerrado. The following image classification techniques were selected because they are currently among the most often used by the remote sensing community: Naive Bayes (NB), DT J48 (DT = decision tree), RF, Multilayer Perceptron (MLP), and Support Vector Machine (SVM). To the authors' knowledge, this is the first time the combined use of machine-learning techniques and polarimetric SAR data, including manifold target decomposition approaches, has been proposed to discriminate the major LULC classes of the Cerrado biome at a more refined legend level than in previous studies.

2. Materials and Methods

This section presents the methodological approach proposed in this study. Firstly, we present the description of the study site (location and major characteristics), and then the satellite data used in this paper. Next, the steps of preprocessing, image segmentation, attribute extraction, classification, and validation are detailed.

2.1. Study Area

The study area (3660 km² in surface area; 15°22′ south latitude; 47°32′ west longitude) is located in the eastern portion of the Goiás State and in the northeastern sector of the Federal District, Brazil (Figure 1). The area corresponds to the boundaries of an ALOS-2/PALSAR-2 scene acquired in the StripMap (SM2) image acquisition mode. This area was selected because it presents representative LULC classes found in the Cerrado biome. Native vegetation, croplands, and pasturelands occur predominantly in the central part of the scene, while urban areas are located in the middle-southern part of the image [29,30]. Croplands are mostly represented by soybeans and corn, although some vegetables are produced under the center-pivot irrigation system [31].
The forestlands are composed of gallery forests, dry forests, and Cerradão [32]. The shrublands (shrub Cerrado, Cerrado shrubland, and dense Cerrado) correspond to a mosaic of different proportions of shrubs and trees that occur over a grass-dominated layer, while grasslands are composed of native grass species. The topography is mostly flat at the central part of the study area; along the north-south direction, there is a relatively rough terrain (Serra Geral do Paranã) with dominant folded and faulted metasediments [33].

2.2. Materials

This study was based on the polarimetric ALOS-2/PALSAR-2 L-band images obtained on 14 May 2016 in the StripMap (SM2), polarimetric, and High Sensitive mode (quad-pol, pixel size of 6 m, ascending orbit, incidence angle of 27.8°, and Single Look Complex (SLC), 1.1 processing level). We used SNAP 6.0 software for the SLC data preprocessing, eCognition Developer 8.7 software for image segmentation and attribute extraction, and WEKA 3.8 software for image classification.
Multispectral- and panchromatic-pansharpened LANDSAT-8/Operational Land Imager (OLI) satellite images [34] obtained on 2 May 2016 and 3 June 2016, as well as the higher spatial resolution images available in the Google Earth and Bing platforms, were accessed by the QuickMapServices plugin available in QGIS 3.0 software and were then used for validation purposes. Other datasets utilized were: the annual and municipality-based agricultural production reports from 2015 (PAM–Produção Agrícola Municipal) [31]; the vector-based, LULC data produced by the MapBiomas [30] and TerraClass [29] projects; and the daily based rainfall data from the National Institute of Meteorology [35].

2.3. Approach

Figure 2 shows the main steps of this study, which involved ALOS-2/PALSAR-2 image preprocessing, a legend definition of the LULC map, image segmentation and classification, and validation.

2.3.1. Preprocessing

The ALOS-2/PALSAR-2 SLC data were converted into backscattering coefficients (σo) using the following equation (Equation (1)) [36]:
\sigma^{0}_{slc} = 10 \log_{10}\left(I^{2} + Q^{2}\right) + CF - A, \quad (1)
where I and Q are the real and imaginary parts of the SLC images. CF corresponds to the radiometric calibration factor (−83 dB), and A is the conversion factor (32 dB) [36]. For the speckle noise attenuation, we employed the Refined Lee polarimetric filter and an adaptive window size of 7 pixels × 7 pixels. This filter preserves the statistics and the linear features of the images [37].
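As a minimal sketch of Equation (1), the conversion of one SLC pixel to a backscattering coefficient could be written as follows. The function name and the sample digital numbers are hypothetical; only CF = −83 dB and A = 32 dB come from the text.

```python
import math

CF_DB = -83.0  # radiometric calibration factor (dB), as stated in the text
A_DB = 32.0    # conversion factor (dB), as stated in the text

def sigma0_db(i: float, q: float) -> float:
    """Backscattering coefficient (dB) of one SLC pixel, following Equation (1)."""
    power = i * i + q * q
    if power <= 0.0:
        return float("-inf")  # log10 is undefined for a zero-power pixel
    return 10.0 * math.log10(power) + CF_DB - A_DB

# Hypothetical real (I) and imaginary (Q) digital numbers, for illustration only.
print(round(sigma0_db(3000.0, 4000.0), 2))  # -41.02
```

In practice this step is applied to whole image arrays by the preprocessing software; the scalar form is used here only to keep the arithmetic visible.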
The incoherent polarimetric parameters are derived from the power measurements in σo [38]. In this research, these parameters were generated to compose the set of attributes used in the machine-learning phase. The following indices were generated: Radar Vegetation Index (RVI) [39]; Radar Forest Degradation Index (RFDI) [40]; Canopy Structure Index (CSI); Volume Scattering Index (VSI); and Biomass Index (BMI) [41]. Parallel (co-pol) and cross-polarization (cross-pol) ratios were also generated [38].
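The indices listed above can be sketched as below. The formulas are the definitions commonly cited in the literature for these indices and are assumed here, since the text gives only the references; the exact ratios used in the study are not spelled out, so the two ratios shown (HH/VV and HH/HV) are also assumptions. Inputs are linear-power backscatter values, not dB.

```python
def db_to_linear(value_db: float) -> float:
    """Convert a backscattering coefficient from dB to linear power."""
    return 10.0 ** (value_db / 10.0)

def polarimetric_indices(hh: float, hv: float, vh: float, vv: float) -> dict:
    """Incoherent polarimetric indices from linear-power backscatter (assumed definitions)."""
    cx = (hv + vh) / 2.0   # mean cross-polarized power
    bmi = (hh + vv) / 2.0  # Biomass Index
    return {
        "RVI": 8.0 * hv / (hh + vv + 2.0 * hv),  # Radar Vegetation Index
        "RFDI": (hh - hv) / (hh + hv),           # Radar Forest Degradation Index
        "CSI": vv / (vv + hh),                   # Canopy Structure Index
        "VSI": cx / (cx + bmi),                  # Volume Scattering Index
        "BMI": bmi,
        "co_pol_ratio": hh / vv,                 # assumed form of the parallel ratio
        "cross_pol_ratio": hh / hv,              # assumed form of the cross ratio
    }

# Hypothetical linear-power values for one pixel.
indices = polarimetric_indices(hh=0.20, hv=0.05, vh=0.05, vv=0.15)
print({name: round(value, 3) for name, value in indices.items()})
```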
Target decomposition aims to represent scattering processes as a sum of independent elements related to the physical scattering mechanisms [42]. The methods of target decomposition are classified into coherent and incoherent types [42,43]. Coherent decompositions assume the existence of deterministic scatterers and that the backscatter wave is polarized. In general, this type of target decomposition uses the Jones scattering matrix to represent the polarization states of the electromagnetic wave. Incoherent decompositions assume that scattering is not deterministic, so the backscattered wave is partially polarized. In this case, the power reflection matrices (covariance and coherence matrices) are used to characterize the backscattered wave [37,44].
In remote sensing applications, the assumption of the occurrence of pure deterministic targets is invalid [44], so the power reflection matrices are often used. In this study, we used only incoherent methods. The following algorithms were considered: van Zyl (three components) [45]; Freeman–Durden (three components) [46]; Yamaguchi (four components) [47]; and Cloude–Pottier (three components: entropy (H), anisotropy (A), and α angle) [42]. The decompositions were generated directly from the SLC images using the SNAP 6.0 application with a window of 5 pixels × 5 pixels. Filters were not applied to the power matrices used in the polarimetric decompositions.
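To illustrate how the three Cloude–Pottier components relate to each other, the sketch below computes H, A, and the mean α angle from the eigenvalues of a coherency matrix. It assumes the eigen-decomposition has already been done elsewhere (for instance by the decomposition software), so the eigenvalues and per-eigenvector α angles are plain inputs; the function name and sample values are hypothetical.

```python
import math

def h_a_alpha(eigenvalues, alphas_deg):
    """Entropy H, anisotropy A and mean alpha angle (degrees).

    eigenvalues: three non-negative eigenvalues of the coherency matrix.
    alphas_deg:  alpha angle associated with each eigenvector, in degrees.
    """
    ranked = sorted(zip(eigenvalues, alphas_deg), reverse=True)  # lambda1 >= lambda2 >= lambda3
    total = sum(lam for lam, _ in ranked)
    probs = [lam / total for lam, _ in ranked]  # pseudo-probabilities
    # Entropy uses a base-3 logarithm so that H lies in [0, 1].
    entropy = -sum(p * math.log(p, 3) for p in probs if p > 0.0)
    lam2, lam3 = ranked[1][0], ranked[2][0]
    anisotropy = (lam2 - lam3) / (lam2 + lam3) if (lam2 + lam3) > 0.0 else 0.0
    mean_alpha = sum(p * alpha for p, (_, alpha) in zip(probs, ranked))
    return entropy, anisotropy, mean_alpha

# Hypothetical eigenvalues and alpha angles for one pixel.
print(h_a_alpha([0.6, 0.3, 0.1], [20.0, 50.0, 80.0]))
```

Equal eigenvalues give H = 1 (fully random scattering), while a single non-zero eigenvalue gives H = 0 (one deterministic mechanism).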

2.3.2. Image Segmentation and Attribute Extraction

The calibrated polarized images in terms of backscattering coefficients, the incoherent polarimetric parameters, and the polarimetric decomposition parameters were orthorectified using the range Doppler model specific for SAR sensors. The 30-m spatial resolution digital elevation model (DEM) obtained by the Shuttle Radar Topography Mission (SRTM) was used in the orthorectification process.
For the segmentation of the SAR images (σoHH, σoHV, σoVH, and σoVV), a multiresolution algorithm based on region growing was used [48]. The algorithm is controlled by a scale parameter and by a homogeneity composition; the latter is divided into color and shape, and shape, in turn, is subdivided into compactness and smoothness. The scale defines the size of the segments of an image, and the homogeneity composition tests the similarity between segments [49]. Only one level of segmentation was generated, with parameters defined from several empirical tests. A scale parameter of 50 was selected, and weights were assigned to the homogeneity criteria (shape = 0.10; color = 0.90; smoothness = 0.50; and compactness = 0.50).
After segmentation, the segment attributes were extracted. Among the various existing categories of attribute metrics, we selected the layer values of mean, standard deviation, asymmetry, and the pixel-based, minimum and maximum values. Thus, for each of the 25 images (eight polarimetric parameters, 13 decomposition components, and four polarizations) available, the attributes of the two categories mentioned previously were extracted. Therefore, a set of 125 layers of attributes (five attributes for 25 images) was used in the machine-learning-based classifications.
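The attribute count can be checked with a short sketch. The function below computes four of the five per-segment metrics for one layer; asymmetry is left out because the text does not give its formula, and the use of the population standard deviation is an assumption. The pixel values are hypothetical.

```python
import statistics

def segment_attributes(pixels):
    """Mean, standard deviation, minimum and maximum of one layer within one segment."""
    return {
        "mean": statistics.fmean(pixels),
        "std": statistics.pstdev(pixels),  # population form; an assumption
        "min": min(pixels),
        "max": max(pixels),
    }

# Layer count from the text: 8 polarimetric parameters, 13 decomposition
# components (3 + 3 + 4 + 3) and 4 polarizations.
N_LAYERS = 8 + (3 + 3 + 4 + 3) + 4
N_METRICS = 5  # mean, standard deviation, asymmetry, minimum, maximum

print(segment_attributes([-12.1, -11.4, -13.0, -12.6]))  # hypothetical dB values
print(N_LAYERS, N_LAYERS * N_METRICS)  # 25 125
```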

2.3.3. Classification and Validation

The following machine-learning classification algorithms were analyzed: NB, DT J48, RF, MLP, and SVM. An NB classifier applies Bayes' theorem, which deals with conditional probability and the prediction of events, under strong (naive) independence assumptions. NB assumes that the presence (or absence) of a given feature of a class is not related to the presence (or absence) of any other feature. Depending on the precise nature of the probability model, NB classifiers can be trained very efficiently in a supervised learning framework. In many practical applications, parameter estimation for NB models uses the method of maximum likelihood; in other words, one can work with the NB model without assuming Bayesian probability or using any Bayesian methods. Despite their naive setting and apparently over-simplified assumptions, NB classifiers have performed quite satisfactorily in many complex real-world situations. An advantage of the NB classifier is that it requires only a small amount of training data to estimate the parameters (means and variances of the variables) necessary for classification. Because the variables are assumed to be independent, only the variances of the variables for each class need to be determined, not the entire covariance matrix [50,51,52].
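The point about estimating only per-class means and variances can be illustrated with a small Gaussian NB written from scratch. This is a sketch under the assumption of Gaussian class-conditional densities, not the WEKA implementation used in the study; the function names, the variance floor of 1e-9, and the two-feature samples are hypothetical.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(samples, labels):
    """Estimate, per class, the prior and the mean and variance of each feature."""
    grouped = defaultdict(list)
    for x, y in zip(samples, labels):
        grouped[y].append(x)
    model = {}
    for y, rows in grouped.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # Small floor avoids division by zero for constant features.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(columns, means)]
        model[y] = (n / len(samples), means, variances)
    return model

def predict_gaussian_nb(model, x):
    """Return the class with the highest log-posterior, treating features as independent."""
    def log_posterior(params):
        prior, means, variances = params
        total = math.log(prior)
        for v, m, var in zip(x, means, variances):
            total += -0.5 * math.log(2.0 * math.pi * var) - (v - m) ** 2 / (2.0 * var)
        return total
    return max(model, key=lambda y: log_posterior(model[y]))

# Hypothetical segments described by (mean HV backscatter in dB, RVI).
X = [[-20.0, 0.20], [-19.0, 0.25], [-8.0, 0.80], [-9.0, 0.75]]
y = ["pastureland", "pastureland", "forestland", "forestland"]
nb = fit_gaussian_nb(X, y)
print(predict_gaussian_nb(nb, [-8.5, 0.70]))  # forestland
```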
The DT classifier (DT J48) consists of a graph that employs the “divide-and-conquer” approach to test attributes and assign classes to independent instances [53]. Basically, DTs are a non-parametric supervised learning method used for classification and regression. DTs learn from data to approximate the target function with a set of if-then-else decision rules. The deeper the tree, the more complex the decision rules and the closer the fit of the model to the training data. A DT builds classification or regression models in the form of a tree structure. It breaks down a dataset into smaller and smaller subsets, while at the same time an associated DT is incrementally developed. The final result is a tree with decision nodes and leaf nodes. A decision node has two or more branches. A leaf node represents a classification or decision. The topmost decision node in a tree, which corresponds to the best predictor, is called the root node. DTs can handle both categorical and numerical data. There are several steps involved in the building of a DT. The first one is splitting, the process of partitioning the dataset into subsets. Splits are formed on a particular variable. The second one is pruning, which corresponds to the shortening of branches of the tree. Pruning is the process of reducing the size of the tree by turning some branch nodes into leaf nodes and removing the leaf nodes under the original branch. Pruning is useful because classification trees may fit the training data well but may do a poor job of classifying new values. A simpler tree often avoids over-fitting. Finally, the last step is tree selection, which is responsible for finding the smallest tree that fits the data. Usually this is the tree that yields the lowest cross-validation error [53].
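The splitting step can be sketched as a search for the threshold on one numeric attribute that best separates the classes. The sketch below scores candidate thresholds by plain information gain, which is a simplification: C4.5-style trees such as J48 normalize this quantity into a gain ratio. The function names and the sample values are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_threshold(values, labels):
    """Return (threshold, information gain) of the best binary split on one attribute."""
    base = entropy(labels)
    pairs = sorted(zip(values, labels))
    best = (None, 0.0)
    for k in range(1, len(pairs)):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # no threshold fits between identical values
        left = [label for _, label in pairs[:k]]
        right = [label for _, label in pairs[k:]]
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        gain = base - weighted
        if gain > best[1]:
            best = ((pairs[k - 1][0] + pairs[k][0]) / 2.0, gain)
    return best

# Hypothetical VSI values of four training segments.
vsi = [0.1, 0.2, 0.8, 0.9]
classes = ["grassland", "grassland", "forestland", "forestland"]
print(best_threshold(vsi, classes))  # (0.5, 1.0)
```

A full tree repeats this search over all attributes at each node and then recurses on the two resulting subsets.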
The RF algorithm was conceived by combining a large number of random DTs. Each tree contributes only one class vote for each instance, and the final classification is determined by the majority of the votes of all forest trees [54]. The trees in RF are created by drawing a subset of training data through a bagging approach. Bagging samples the training data at random with replacement, so that each tree is trained on about two-thirds of the distinct samples. This means that the same sample can be used several times in a training subset, while others may not be selected in a particular subset [55]. In the RF algorithm, there are two main parameters to be defined: the number of variables in the random subset at each node (mtry) and the number of trees (ntree). Rodriguez-Galiano et al. [56] conducted an empirical evaluation of the number of trees and reported that the differences in classification accuracy obtained with more than a hundred trees are not meaningful; hence, we opted to use 100 trees in this work. Concerning the mtry parameter, the default value was adopted, which corresponds to the square root of the total number of features used in each experiment [57]. Other authors, however, rely on optimization procedures to assess the values of ntree and mtry, as described in [58,59]. RF has been shown to have several advantages over other classifiers, since it is not based on strict parametric assumptions and is able to handle high-dimensional data and to deal with nonlinearity. However, RF has its limitations, such as longer computing time and higher algorithmic complexity compared with an individual DT [60].
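Three ingredients of this description lend themselves to a quick numerical check: the bootstrap draw used in bagging, the majority vote, and the default mtry. The sketch below is illustrative only; the helper names, the random seed, and the sample size of 1000 are hypothetical, while 125 is the attribute count given earlier in the text.

```python
import math
import random
from collections import Counter

def bootstrap_indices(n, rng):
    """Draw n sample indices with replacement, as in bagging."""
    return [rng.randrange(n) for _ in range(n)]

def majority_vote(tree_votes):
    """Final class is the most frequent vote among the trees."""
    return Counter(tree_votes).most_common(1)[0][0]

rng = random.Random(0)  # fixed seed, for repeatability only
n_samples = 1000
drawn = bootstrap_indices(n_samples, rng)
# Fraction of distinct samples in one bootstrap draw; approaches 1 - 1/e (about 0.632).
unique_fraction = len(set(drawn)) / n_samples

# Default mtry described in the text: square root of the number of features.
mtry = int(math.sqrt(125))

print(round(unique_fraction, 3), mtry)
print(majority_vote(["forestland", "shrubland", "forestland"]))  # forestland
```

The remaining third of the samples, left out of a given draw, is what RF implementations typically use for out-of-bag error estimates.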
The MLP is a forward-structure artificial neural network (ANN) trained by the backpropagation method, designed to map a set of input vectors to a set of output vectors [61]. ANN can be simply defined as a massively parallel distributed computational device consisting of processing units, also called neurons or nodes, which are organized in a couple of layers. The neurons are responsible for the storage of knowledge acquired within the system, which is then made available for further use [62]. MLPs learn fast with high generalization and have a strong self-learning ability [61,63]. They are composed of an input layer to receive the signal, an output layer that makes a decision or prediction about the input, and in between those two, an arbitrary number of hidden layers (or eventually none) that are the true computational engine of the MLP. MLPs with one hidden layer are capable of approximating any continuous function. These successive layers of processing units present connections running from every unit (neuron) in one layer to every unit in the next layer. The connections are responsible for passing information throughout the network, and they are characterized by weights, which are initially set in a random way and can be positive or negative [64]. All the neurons, except those belonging to the input layer, perform two simple processing functions—receiving the signal (activation) of the neurons in the previous layer and transmitting a new signal as the input to the next layer. The weights in the network can be updated from the errors calculated for each training example, and this is called online learning. Alternatively, the errors can be saved up across all of the training examples, and the network can be updated at the end. This is called batch learning and is often more stable. The learning should be stopped when the validation set error reaches its minimum. At this very point, the net is able to attain the best generalization [62,65]. 
If learning is not stopped, overtraining occurs, and the performance of the net is jeopardized. Once a neural network has been trained, it can be used to make predictions.
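The stopping rule described above can be sketched as a scan over the validation error recorded after each epoch. The `patience` parameter (how many epochs without improvement are tolerated) is a common practical device and is an assumption here, as are the function name and the error values.

```python
def early_stopping_epoch(val_errors, patience=3):
    """Return the index of the epoch whose weights would be kept.

    Training is considered stopped once the validation error has failed to
    improve for `patience` consecutive epochs.
    """
    best_epoch, best_error, waited = 0, float("inf"), 0
    for epoch, error in enumerate(val_errors):
        if error < best_error:
            best_epoch, best_error, waited = epoch, error, 0
        else:
            waited += 1
            if waited >= patience:
                break  # further training would only overfit
    return best_epoch

# Hypothetical validation errors per epoch: they fall, then rise again.
errors = [0.90, 0.60, 0.50, 0.52, 0.55, 0.60, 0.40]
print(early_stopping_epoch(errors))  # 2
```

The final value of 0.40 is never reached because training stops at the sixth epoch, which shows the trade-off involved in choosing the patience.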
The SVM is based on the concept of decision planes that define decision boundaries. A decision plane is one that separates sets of objects having different class memberships. According to [66], maximizing the margin on either side of the hyperplane that separates two data classes, i.e., creating the largest possible distance between the separating hyperplane and the samples on either side of it, has been acknowledged to reduce the upper bound of the expected generalization error. SVM supports both regression and classification tasks and can handle multiple continuous and categorical variables. As a classifier, SVM constructs hyperplanes in a multidimensional space that separate samples of different class labels, whether these are linearly or nonlinearly separable [67,68]. The classifier maximizes the distance between these hyperplanes and the class samples; the bordering samples are called support vectors. Multiclass problems are solved by pairwise classification. There are different algorithms to train an SVM, such as quadratic programming and the more efficient sequential minimal optimization (SMO), which uses heuristics to partition the training problem into smaller problems that can be solved analytically, aiming at minimizing an error function. The implementation also replaces all missing values, transforms nominal attributes into binary ones, and normalizes all attributes by default.
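Pairwise classification can be sketched independently of the SVM itself: one binary classifier is trained per pair of classes, and the class collecting the most votes wins. In the sketch below the binary classifiers are hypothetical stand-ins (a nearest-centre rule on a made-up one-dimensional feature), not trained SVMs; only the class list comes from the text.

```python
from collections import Counter
from itertools import combinations

CLASSES = ["forestland", "shrubland", "grassland", "reforestation", "cropland",
           "pastureland", "bare soil/straw", "urban area", "water reservoir"]

# One binary problem per unordered pair of classes: 9 * 8 / 2 = 36.
PAIRS = list(combinations(CLASSES, 2))

def pairwise_predict(x, binary_classifiers):
    """Each binary classifier votes for one of its two classes; the most voted class wins."""
    votes = Counter(decide(x) for decide in binary_classifiers.values())
    return votes.most_common(1)[0][0]

# Hypothetical stand-in for trained binary SVMs: made-up 1-D class centres.
CENTRES = {name: float(k) for k, name in enumerate(CLASSES)}

def make_toy_classifier(a, b):
    """Binary rule that picks whichever of the two classes has the closer centre."""
    return lambda x: a if abs(x - CENTRES[a]) <= abs(x - CENTRES[b]) else b

toy_classifiers = {(a, b): make_toy_classifier(a, b) for a, b in PAIRS}
print(len(PAIRS), pairwise_predict(4.2, toy_classifiers))  # 36 cropland
```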
According to [31], there are approximately 132,000 hectares of soybeans and 85,000 hectares of maize in the study area. Cultivated pastures, forestlands, and shrublands are the other major classes found in the study area [29,30]. Field surveys were carried out on 10–11 September 2017 along the BR-010 and GO-118 highways with the purpose of identifying the major LULC classes present in the study area (Figure 3). Thus, we considered the following representative LULC classes: forestlands; shrublands; grasslands; reforestations; croplands; pasturelands; bare soils/straws; urban areas; and water reservoirs.
Based on LANDSAT-8 and higher spatial resolution images available in the Google Earth and Bing platforms, 200 training samples of segments were selected for each LULC class (except for grasslands, reforestations, and water reservoirs—25 samples each because of their limited occurrence in the study area). Another set of 959 segments was selected for validation purposes, according to approaches reported by [69]. A set of 1000 random and nonstratified points designed previously for the field campaign was considered. Forty-one segments were disregarded since they were located in hilly regions affected by layover or foreshortening effects associated with the radar image acquisition processes.
Thus, seven shapefiles were generated: five for training the classifiers (5, 25, 50, 100, and 200 samples per class), one for validation, and one for a classification using all 39,254 segments generated in the segmentation step. Each classification algorithm was trained five times, and the respective validations were carried out with the same set of 959 segments. The validations were performed based on the error matrices generated with the segments of each classification. The following validation metrics were used: global accuracy, Kappa index, conditional producer's accuracy (PA), and user's accuracy (UA). Hypothesis tests based on the standard normal distribution were also carried out to compare Kappa indices and to evaluate the performances of the different classifications.
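The validation metrics named above can all be derived from one error matrix, as in the following sketch. The matrix layout (rows as classified labels, columns as reference labels), the function name, and the two-class example are assumptions for illustration.

```python
def accuracy_metrics(matrix):
    """Global accuracy, Kappa, user's and producer's accuracies from an error matrix.

    matrix[i][j] = number of segments of reference class j labelled as class i.
    """
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    row_totals = [sum(matrix[i]) for i in range(k)]                       # classified totals
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]  # reference totals
    p_o = sum(matrix[i][i] for i in range(k)) / n                         # observed agreement
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)    # chance agreement
    kappa = (p_o - p_e) / (1.0 - p_e)
    ua = [matrix[i][i] / row_totals[i] for i in range(k)]  # user's accuracy per class
    pa = [matrix[j][j] / col_totals[j] for j in range(k)]  # producer's accuracy per class
    return p_o, kappa, ua, pa

# Hypothetical two-class error matrix with 100 validation segments.
example = [[45, 5],
           [10, 40]]
print(accuracy_metrics(example))
```

For this example the global accuracy is 0.85 and Kappa is 0.70, since half of the agreement would be expected by chance.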

3. Results and Discussion

Based on INMET’s data analysis, we verified that there was no rainfall during the 43 days before the ALOS-2 overpass (14 May 2016). Therefore, there was likely little influence of the soil moisture and plant water contents in the SAR image considered in this study. Figure 4 shows Kappa indices according to different training sets and different classifiers, and Table 1 presents the mean and standard deviation of these Kappa indices. Overall, they increased gradually according to the increase in the number of training samples. The MLP, RF, and SVM classifiers presented the highest Kappa indices (Kappa > 0.50), regardless of the number of samples. The maximum Kappa value was 0.68 for the SVM algorithm with 200 samples. Shiraishi et al. [70] also verified this behavior in their experiment with different training sets. Belgiu and Dragut [55] pointed out that the RF performs accurately for studies that employ few training samples and large attribute space.
The NB classifier presented a higher performance in comparison with the DT J48 classifier when the number of samples was fewer than 50, and a similar or worse performance when the number of samples was larger than 50. The NB classifier, which is widely recommended in the literature [67,68], showed a relatively low accuracy that was probably due to the high landscape complexity of the study area. Because of its relatively low computational costs, the NB classifier can be appropriate for inventories of large areas with more homogeneous land cover patterns. The use of several DTs, as in RF, allowed the attainment of higher Kappa agreement indices compared to DT J48 in all training scenarios, which is in accordance with the results obtained by [55]. RF presented a performance similar to that of the more complex classifiers (MLP and SVM), regardless of the number of training samples. The same results were obtained by [70].
Figure 5 shows similar performances for the RF, MLP, and SVM algorithms in terms of UA and PA. This was also the case between the NB and DT J48 algorithms. The UA performance of the NB to discriminate reforestation and natural grasslands was high, with conditional Kappa indices of 0.65 and 0.70, respectively. These results were higher than those obtained by other classifiers.
Cerrado shrubland and shrub Cerrado presented a high degree of confusion with the forestlands and cultivated pastures, respectively. This confusion was somewhat expected, since the shrub Cerrado has landscape characteristics similar to those of the cultivated pastures in the study area in terms of biomass levels and vegetation structure (sparse shrubs over the grassland stratum); both had relatively low backscattering coefficients in the L-band images. Depending on the degree of preservation of the shrub Cerrado, confusion with Cerrado shrubland often occurs due to the more prominent volumetric scattering [22,23,27]. Confusion also occurred between cultivated pastures and bare soil/straws, which was also expected because of the similar landscape conditions (low moisture content, lack of vegetation, and relatively smooth soil surface).
Figure 6 shows the results of the Kappa index and the global accuracy of different classifiers, involving nine LULC classes and 200 training samples. Figure 7 presents the classification results obtained by the different classifiers. The NB classifier overestimated the urban areas, which can be ascribed to the presence of hilly terrain in the surrounding areas. The effects of layover and foreshortening on the ALOS-2 images from the study area probably generated some confusion in identifying urban areas. The DT J48 presented a more accurate identification of urban areas.
Table 2 shows the p-values of the Z tests performed for each pair of classifiers. The p-values highlighted in bold are higher than the level of significance adopted in the test (α = 0.05), which therefore indicates classifiers with the same level of performance (H0: Kappa A − Kappa B = 0; and H1: Kappa A − Kappa B < 0; A and B = classifiers). The RF, MLP, and SVM classifiers were statistically similar in terms of Kappa agreement indices. The NB and DT J48 classifiers were also statistically similar to each other; however, both were significantly different from, and showed a lower performance than, RF, MLP, and SVM. Similar tests involving five LULC classes presented similar results (Table 3).
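A pairwise comparison of this kind can be sketched as follows. The text does not state which variance estimator was used for Kappa, so the simplified large-sample approximation below is an assumption, as are the function names and the numerical inputs; the one-sided lower-tail p-value follows the hypotheses H0 and H1 stated above.

```python
import math

def kappa_variance(p_o, p_e, n):
    """Simplified large-sample variance of Kappa (assumed estimator).

    p_o: observed agreement, p_e: chance agreement, n: number of validation samples.
    """
    return p_o * (1.0 - p_o) / (n * (1.0 - p_e) ** 2)

def kappa_z_test(kappa_a, var_a, kappa_b, var_b):
    """Z statistic and one-sided p-value for H1: Kappa A - Kappa B < 0."""
    z = (kappa_a - kappa_b) / math.sqrt(var_a + var_b)
    p_value = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # standard normal P(Z <= z)
    return z, p_value

# Hypothetical Kappa indices and variances for two classifiers.
z, p = kappa_z_test(kappa_a=0.50, var_a=0.0004, kappa_b=0.65, var_b=0.0004)
print(round(z, 2), p < 0.05)  # -5.3 True
```

A p-value above 0.05 would mean that the two Kappa indices cannot be distinguished at the adopted significance level.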
The DT J48 classifier used all input layers in its classification procedure (eight polarimetric parameters, 13 decomposition components, and four polarizations). Only the metrics applied to the segment varied. In other words, the mean, standard deviation, minimum value, and all other parameters were used. In the DT J48 classification, the classes with the best classification performance were urban areas and forestlands. The node-leaf attributes were, respectively, the cross-polarization ratio (HH/HV) and the VSI index. The use of this index to separate the forest formation is logical since it estimates the volumetric scattering of forest canopies [41]. Volumetric scattering components (from van Zyl, Freeman–Durden, and Yamaguchi theorems) were also listed at the DT J48 nodes. This is a quite coherent result since these components are related to the structure of the vegetation canopy [18,20,22,23,27]. The H, A, and α components and the CSI, VSI, BMI, and RFDI indices were also listed. The cross-polarization ratios were less thoroughly involved, and the amplitude cross-polarizations (HV and VH) were used, as well.

4. Conclusions

The methodology used in this study proved to be feasible for mapping LULC in the Cerrado biome. According to [71], the results of the classification are considered as “good” (NB and DT J48) or “very good” (RF, MLP, and SVM). Two groups of classifiers were identified. The first group, which obtained the best results, comprises the RF, MLP, and SVM algorithms, which presented statistically similar Kappa indices. The second group, which had less accurate performances, is composed of the NB and DT J48 classifiers, which also presented statistically similar Kappa indices.
The RF classifier outperformed DT J48, which agrees with the results found in the literature [55]. DT J48 is more complex, but it did not present more accurate results in comparison with those obtained by the NB classifier. However, the high accuracy of DT J48 in discriminating urban areas can provide thematic maps with higher accuracy in cases where the urban area is the main target of classification.
As for the polarimetric attributes, we verified that the decompositions were important in the identification and classification of LULC classes. Volumetric scattering components (van Zyl, Freeman–Durden, and Yamaguchi theorems) were used in the DT J48 classification. These components are related to the structure of the vegetation canopy [18,19,20,22,23,27]. However, a more detailed study should be performed to understand which scattering mechanisms prevailed in the scene and which of them contributed the most to the classifications. The so-called incoherent parameters also played an important role, especially CSI, VSI, BMI, and RFDI. The cross-polarization ratios were less important; the cross-polarized amplitudes (HV and VH), however, should be highlighted.
For future studies, we suggest evaluating multitemporal SAR images and polarimetric and interferometric (PolInSAR) techniques. C-band SENTINEL-1 SAR data could also be tested separately or in combination with the ALOS-2/PALSAR-2 data. A prior unsupervised classification based on the H-α attribute space could also be tested to better understand the dominant scattering processes and, consequently, to define a better strategy for training the classifiers. A prior classification could also serve as the basis for stratified sampling in the thematic validation.
Finally, in considering the complexity of the landscape in the study area and the levels of accuracy obtained, we conclude that the procedures and attributes used in this research can be extended to other types of vegetation domains.

Author Contributions

Conceptualization, F.F.C. and E.E.S.; methodology, F.F.C.; software, F.F.C.; validation, F.F.C.; resources, E.E.S. and F.F.C.; writing—original draft preparation, F.F.C.; writing—review and editing, E.E.S., C.M.A., J.C.M. and T.A.

Funding

This research received no external funding.

Acknowledgments

The authors would like to express their gratitude to the Japanese Aerospace Exploration Agency (JAXA), specifically the Kyoto & Carbon Initiative, for providing the ALOS-2/PALSAR-2 images. The authors are also grateful to the three anonymous reviewers, who helped to improve the quality of this manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Gamba, P.; Aldrighi, M. SAR data classification of urban areas by means of segmentation techniques and ancillary optical data. IEEE J. Select. Top. Appl. Earth Obs. Remote Sens. 2012, 5, 1140–1148.
  2. Qi, Z.; Yeh, A.G.O.; Li, X.; Lin, Z. A novel algorithm for land use and land cover classification using RADARSAT-2 polarimetric SAR data. Remote Sens. Environ. 2012, 118, 21–39.
  3. Evans, T.L.; Costa, M. Landcover classification of the lower Nhecolândia subregion of the Brazilian Pantanal wetlands using ALOS/PALSAR, RADARSAT-2 and ENVISAT/ASAR imagery. Remote Sens. Environ. 2013, 128, 118–137.
  4. Reynolds, J.; Wesson, K.; Desbiez, A.L.J.; Ochoa-Quintero, J.M.; Leimgruber, P. Using remote sensing and random forest to assess the conservation status of critical Cerrado habitats in Mato Grosso do Sul, Brazil. Land 2016, 5, 12.
  5. Braun, A.; Hochschild, V. A SAR-based index for landscape changes in African savannas. Remote Sens. 2017, 9, 23.
  6. Miles, L.; Kapos, V. Reducing greenhouse gas emissions from deforestation and forest degradation: Global land-use implications. Science 2008, 320, 1454–1455.
  7. Haarpaintner, J.; Blanco, D.F.; Enssle, F.; Datta, P.; Mazinga, A.; Singa, C.; Mane, L. Tropical forest remote sensing services for the Democratic Republic of Congo inside the EU FP7 ‘Recover’ Project (Final Results 2000–2012). In Proceedings of the XXXVIth International Symposium on Remote Sensing of Environment, Berlin, Germany, 11–15 May 2015; pp. 397–402.
  8. Sano, E.E.; Rosa, R.; Brito, J.L.S.; Ferreira, L.G. Land cover mapping of the tropical savanna region in Brazil. Environ. Monit. Assess. 2010, 166, 113–124.
  9. Scaramuzza, C.A.; Sano, E.E.; Adami, M.; Bolfe, E.L.; Coutinho, A.C.; Esquerdo, J.; Maurano, L.; Narvaes, I.D.S.; de Oliveira Filho, F.J.B.; Rosa, R.; et al. Land-use and land-cover mapping of the Brazilian Cerrado based mainly on Landsat-8 satellite images. Rev. Bras. Cart. 2017, 69, 1041–1051.
  10. Rahman, M.M.; Moran, M.S.; Thoma, D.P.; Bryant, R.; Collins, C.D.H.; Jackson, T.; Orr, B.J.; Tischler, M. Mapping surface roughness and soil moisture using multi-angle radar imagery without ancillary data. Remote Sens. Environ. 2008, 112, 391–402.
  11. Duarte, R.M.; Wozniak, E.; Recondo, C.; Cabo, C.; Marquínez, J.; Fernández, S. Estimation of surface roughness and stone cover in burnt soils using SAR images. Catena 2008, 74, 264–272.
  12. Tollerud, H.J.; Fantle, M.S. The temporal variability of centimeter-scale surface roughness in a playa dust source: Synthetic aperture radar investigation of playa surface dynamics. Remote Sens. Environ. 2014, 154, 285–297.
  13. Bergen, K.M.; Goetz, S.J.; Dubayah, R.O.; Henebry, G.M.; Hunsaker, C.T.; Imhoff, M.L.; Nelson, R.F.; Parker, G.G.; Radeloff, V.C. Remote sensing of vegetation 3-D structure for biodiversity and habitat: Review and implications for lidar and radar spaceborne missions. J. Geophys. Res. 2009, 114, 1–13.
  14. Jensen, J.R. Remote Sensing of the Environment. An Earth Resource Perspective, 2nd ed.; Prentice Hall: Upper Saddle River, NJ, USA, 2007.
  15. Myers, N.; Mittermeier, R.A.; Mittermeier, C.G.; Fonseca, G.A.B.; Kent, J. Biodiversity hotspots for conservation priorities. Nature 2000, 403, 853–858.
  16. Strassburg, B.B.N.; Brooks, T.; Feltran-Barbiero, R.; Iribarrem, A.; Crouzeilles, R.; Loyola, R.; Latawiec, A.E.; Oliveira Filho, F.J.B.; Scaramuzza, C.A.M.; Scarano, F.R.; et al. Moment of truth for the Cerrado hotspot. Nat. Ecol. Evol. 2017, 1, 3.
  17. Rada, N. Assessing Brazil’s Cerrado agricultural miracle. Food Policy 2013, 38, 146–155.
  18. Sano, E.E.; Pinheiro, G.C.C.; Meneses, P.R. Assessing JERS-1 synthetic aperture radar data for vegetation mapping in the Brazilian savanna. J. Remote Sens. Soc. Jpn. 2001, 21, 158–167.
  19. Sano, E.E.; Ferreira, L.G.; Huete, A.R. Synthetic aperture radar (L-band) and optical vegetation indices for discriminating the Brazilian savanna physiognomies: A comparative analysis. Earth Interact. 2005, 9, 15.
  20. Bitencourt, M.D.; Mesquita H.N., Jr.; Kuntschik, G.; Rocha, H.R.; Furley, P.A. Cerrado vegetation study using optical and radar remote sensing: Two Brazilian case studies. Can. J. Remote Sens. 2007, 33, 468–480.
  21. Ningthoujam, R.K.; Balzter, H.; Tansey, K.; Feldpausch, T.R.; Mitchard, E.T.A.; Wani, A.A.; Joshi, P.K. Relationships of S-band radar backscatter and forest aboveground biomass in different forest types. Remote Sens. 2017, 9, 1116.
  22. Bouvet, A.; Mermóz, S.; Le Toan, T.; Villard, L.; Mathieu, R.; Naidoo, L.; Asner, G.P. An above-ground biomass map of African savannahs and woodlands at 25m resolution derived from ALOS PALSAR. Remote Sens. Environ. 2018, 206, 156–173.
  23. Odipo, V.O.; Nickless, A.; Berger, C.; Baade, J.; Urbazaev, M.; Walther, C.; Schmullius, C. Assessment of aboveground woody biomass dynamics using terrestrial laser scanner and L-band ALOS PALSAR data in South African savanna. Forests 2016, 7, 24.
  24. Cassol, H.L.G.; Carreiras, J.M.B.; Moraes, E.C.; Aragão, L.E.O.C.; Silva, C.V.J.; Quegan, S.; Shimabukuro, Y.E. Retrieving secondary forest aboveground biomass from polarimetric ALOS-2 PALSAR-2 data in the Brazilian Amazon. Remote Sens. 2019, 11, 59.
  25. Sano, E.E.; Santos, E.M.; Meneses, P.R. Análise de imagens do satélite ALOS PALSAR para o mapeamento de uso e cobertura da terra do Distrito Federal. Geociências 2009, 28, 441–451.
  26. Symeonakis, E.; Higginbottom, T.P.; Petroulaki, K.; Rabe, A. Optimisation of savannah land cover characterisation with optical and SAR data. Remote Sens. 2018, 10, 18.
  27. Urbazaev, M.; Thiel, C.; Mathieu, R.; Naidoo, L.; Levick, S.R.; Smit, I.P.J.; Asner, G.P.; Schmullius, C. Assessment of the mapping of fractional woody cover in southern African savannas using multi-temporal and polarimetric ALOS PALSAR L-band images. Remote Sens. Environ. 2015, 166, 138–153.
  28. Mendes, F.S.; Baron, D.; Gerold, G.; Liesenberg, V.; Erasmi, F. Optical and SAR remote sensing synergism for mapping vegetation types in the endangered Cerrado/Amazon ecotone of Nova Mutum—Mato Grosso. Remote Sens. 2019, 11, 1161.
  29. INPE. Projeto TerraClass Cerrado. Mapeamento do uso e Cobertura Vegetal do Cerrado. 2017. Available online: http://www.dpi.inpe.br/tccerrado/download.php (accessed on 1 July 2017).
  30. MapBiomas. Mapeamento Anual da Cobertura e uso do Solo no Brasil. 2017. Available online: http://mapbiomas.org (accessed on 15 June 2017).
  31. IBGE. Produção Agrícola Municipal. 2017. Available online: https://ww2.ibge.gov.br/home/estatistica/economia/pam/2016/default.shtm (accessed on 10 August 2017).
  32. Ribeiro, J.F.; Walter, B.M.T. As principais fitofisionomias do Cerrado. In Cerrado: Ecologia e Flora; Sano, S.M., Almeida, S.P., Ribeiro, J.F., Eds.; Embrapa Cerrados: Planaltina, Brazil, 2008; pp. 151–199.
  33. Latrubese, E.M.; Carvalho, T.M. Geomorfologia do Estado de Goiás e Distrito Federal; Superintendência de Geologia e Mineração do Estado de Goiás: Goiânia, Brazil, 2006; 128p.
  34. USGS. Global Visualization (GloVis) Viewer. 2017. Available online: https://glovis.usgs.gov/ (accessed on 5 February 2017).
  35. INMET. Estações Automáticas. DF—Águas Emendadas. 2018. Available online: http://www.inmet.gov.br/portal/index.php?r=estacoes/estacoesAutomaticas (accessed on 15 July 2018).
  36. JAXA. Calibration Results of Alos-2/Palsar-2 Jaxa Standard Products. 2018. Available online: https://www.eorc.jaxa.jp/ALOS-2/en/calval/calval_index.htm (accessed on 15 January 2018).
  37. Lee, J.; Pottier, E. Polarimetric Radar Imaging: From Basics to Applications; CRC Press: Boca Raton, FL, USA, 2009.
  38. Boerner, W.; Mott, H.; Lünenberg, E.; Livingstone, C.; Brisco, B.; Brown, R.J.; Paterson, J.S. Polarimetry in radar remote sensing: Basic and applied concepts. In Manual of Remote Sensing: Principles and Applications of Imaging Radars, 3rd ed.; Henderson, F.M., Lewis, A.J., Eds.; John Wiley & Sons: New York, NY, USA, 1998; pp. 271–356.
  39. Kim, Y.; van Zyl, J.J. A time-series approach to estimate soil moisture using polarimetric radar data. IEEE Trans. Geosci. Remote Sens. 2009, 47, 2519–2527.
  40. Mitchard, E.T.A.; Saatchi, S.S.; White, L.J.T.; Abernethy, K.A.; Jeffery, K.J.; Lewis, S.L.; Collins, M.; Lefsky, M.A.; Leal, M.E.; Woodhouse, I.H.; et al. Mapping tropical forest biomass with radar and spaceborne LIDAR in Lopé National Park, Gabon: Overcoming problems of high biomass and persistent cloud. Biogeosciences 2012, 9, 179–191.
  41. Pope, K.O.; Rey-Benayas, J.M.; Paris, J.F. Radar remote sensing of forest and wetland ecosystems in the central American tropics. Remote Sens. Environ. 1994, 48, 205–219.
  42. Cloude, S.R.; Pottier, E. A review of target decomposition theorems in radar polarimetry. IEEE Trans. Geosci. Remote Sens. 1996, 34, 498–518.
  43. Hellmann, M.P. SAR Polarimetry Tutorial. 2001. Available online: http://epsilon.nought.de/ (accessed on 1 February 2017).
  44. Richards, J.A. Remote Sensing with Imaging Radar; Springer: Berlin, Germany, 2009.
  45. van Zyl, J.J. Unsupervised classification of scattering behavior using radar polarimetry data. IEEE Trans. Geosci. Remote Sens. 1989, 27, 36–45.
  46. Freeman, A.; Durden, S.L. A three-component scattering model for polarimetric SAR Data. IEEE Trans. Geosci. Remote Sens. 1998, 36, 963–973.
  47. Yamaguchi, Y.; Moriyama, T.; Ishido, M.; Yamada, H. Four-component scattering model for polarimetric SAR image decomposition. IEEE Trans. Geosci. Remote Sens. 2005, 43, 1699–1706.
  48. Trimble. eCognition Developer 8.7. Reference Book; Trimble: Munich, Germany, 2011.
  49. Benz, U.C.; Hofmann, P.; Willhauck, G.; Lingenfelder, I.; Heynen, M. Multi-resolution, object-oriented fuzzy analysis of remote sensing data for GIS ready information. ISPRS J. Photogramm. Remote Sens. 2004, 58, 239–258.
  50. Zhang, H. The Optimality of Naive Bayes. Available online: http://www.cs.unb.ca/~hzhang/publications/FLAIRS04ZhangH.pdf (accessed on 13 June 2019).
  51. Caruana, R.; Niculescu-Mizil, A. An Empirical Comparison of Supervised Learning Algorithms. Available online: http://www.cs.cornell.edu/~caruana/ctp/ct.papers/caruana.icml06.pdf (accessed on 13 June 2019).
  52. John, G.H.; Langley, P. Estimating Continuous Distributions in Bayesian Classifiers. In Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence, Montreal, QC, Canada, 18–20 August 1995; Morgan Kaufmann: San Mateo, CA, USA; pp. 338–345. Available online: http://web.cs.iastate.edu/~honavar/bayes-continuous.pdf (accessed on 13 June 2019).
  53. Quinlan, J.R. Combining instance-based and model-based learning. In Proceedings of the Tenth International Conference on Machine Learning, Amherst, MA, USA, 27–29 June 1993; pp. 236–243.
  54. Hastie, T.J.; Tibshirani, R.J.; Friedman, J.H. The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Springer: New York, NY, USA, 2009.
  55. Belgiu, M.; Dragut, L. Random forest in remote sensing: A review of applications and future directions. ISPRS J. Photogramm. Remote Sens. 2016, 114, 24–31.
  56. Rodriguez-Galiano, V.F.; Chica-Rivas, M. Evaluation of different machine learning methods for land cover mapping of a Mediterranean area using multi-seasonal Landsat images and Digital Terrain Models. Int. J. Digit. Earth 2014, 7, 492–509.
  57. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
  58. Win, T.S.; Malik, A.A.; Prachayasittikul, V.; Wikberg, J.E.S.; Nantasenamat, C.; Shoombuatong, W. HemoPred: A web server for predicting the hemolytic activity of peptides. Future Med. Chem. 2017, 9, 275–291.
  59. Win, T.S.; Schaduangrat, N.; Prachayasittikul, V.; Nantasenamat, C.; Shoombuatong, W. PAAP: A web server for predicting antihypertensive activity of peptides. Future Med. Chem. 2018, 10, 1749–1767.
  60. Zhang, Q.; Gao, W.; Su, S.; Weng, M.; Cai, Z. Biophysical and socioeconomic determinants of tea expansion: Apportioning their relative importance for sustainable land use policy. Land Use Policy 2017, 68, 438–447.
  61. Hu, L.; He, S.; Han, Z.; Xiao, H.; Su, S.; Weng, M.; Cai, Z. Monitoring housing rental prices based on social media: An integrated approach of machine-learning algorithms and hedonic modeling to inform equitable housing policies. Land Use Policy 2019, 82, 657–673.
  62. Haykin, S.S. Neural Networks: A Comprehensive Foundation; Prentice-Hall: Upper Saddle River, NJ, USA, 1999.
  63. Lian, C.; Zeng, Z.; Yao, W.; Tang, H. Multiple neural networks switched prediction for landslide displacement. Eng. Geol. 2015, 186, 91–99.
  64. Bishop, C.M. Neural Networks for Pattern Recognition; Oxford University Press: Oxford, UK, 1995.
  65. Fischer, M.M.; Abrahart, R.J. Neurocomputing—Tools for Geographers. In GeoComputation; Openshaw, S., Abrahart, R.J., Eds.; Taylor & Francis: New York, NY, USA, 2000; pp. 187–217.
  66. Li, G.; Cai, Z.; Liu, X.; Liu, J.; Su, S. A comparison of machine learning approaches for identifying high-poverty counties: Robust features of DMSP/OLS night-time light imagery. Int. J. Remote Sens. 2019, 40, 5716–5736.
  67. Witten, I.H.; Frank, E. Data Mining: Practical Machine Learning. Tools and Techniques, 2nd ed.; Morgan Kaufmann: San Francisco, CA, USA, 2005.
  68. Han, J.; Kamber, M.; Pei, J. Data Mining: Concepts and Techniques, 3rd ed.; Elsevier: Waltham, MA, USA, 2012.
  69. Congalton, R.G.; Green, K. Assessing the Accuracy of Remotely Sensed Data: Principles and Practices; CRC Press: Boca Raton, FL, USA, 2009.
  70. Shiraishi, T.; Motohka, T.; Thapa, R.B.; Watanabe, M.; Shimada, M. Comparative assessment of supervised classifiers for land use-land cover classification in a tropical region using time-series PALSAR mosaic data. IEEE J. Select. Top. Appl. Earth Obs. Remote Sens. 2014, 7, 1186–1199.
  71. Landis, J.R.; Koch, G.G. The measurement of observer agreement for categorical data. Biometrics 1977, 33, 159–174.
Figure 1. Location of the study area in the Cerrado biome (a), in the border of the Goiás State and Federal District of Brazil (b), and in the municipalities of Planaltina and Formosa (Goiás State) and northern part of Federal District (c). The image corresponds to the RGB color composite of HH, HV, and VV polarizations from the Advanced Land Observing Satellite-2 (ALOS-2)/ Phased Array type L-band Synthetic Aperture Radar-2 (PALSAR-2) image (overpass: 14 May 2016).
Figure 2. Flow chart of the main steps of the study. NB = Naive Bayes; DT = Decision Tree; RF = Random Forest; MLP = Multilayer Perceptron; SVM = Support Vector Machine; PAM = Produção Agrícola Municipal.
Figure 3. Location of the sampling points for validation of the ALOS-2/PALSAR-2 image segmentation and classification. The panoramic field photos were obtained by the first author in September of 2017 (A, B, and D = harvested corn (bare soil/straws); C = shrub Cerrado; E and F = harvested corn with forestland in the back). The image corresponds to the RGB color composite of the LANDSAT-8/ Operational Land Imager (OLI) satellite (bands 4, 5, and 3).
Figure 4. Kappa indices for different training sets and different classifiers.
Figure 5. Conditional Kappa indices of the user’s accuracy (UA) and producer’s accuracy (PA) considering training sets of 200 samples for each classifier.
Figure 6. Kappa index and global accuracy per classifier with nine classes and 200 training samples.
Figure 7. Classification results obtained with NB (a), DT J48 (b), RF (c), MLP (d), and SVM (e) algorithms and nine land use and land cover (LULC) classes.
Table 1. Mean and standard deviation of Kappa indices for different training sets and different classifiers.
ML Classifier             Mean of Kappa Indices   Standard Deviation of Kappa Indices
Naive Bayes               0.53454                 0.050026823
J48                       0.51036                 0.095230893
Random Forest             0.6084                  0.054270802
Multilayer Perceptron     0.59344                 0.055937715
Support Vector Machine    0.63064                 0.051343188
Table 2. P-values of Z-tests comparing Kappa indices between classifiers in a pairwise manner. The legend has nine classes. Bolded values indicate that the p-value is higher than α = 0.05.
Classifier   NB       DT J48   RF       MLP      SVM
NB           -        0.0838   0.0000   0.0000   0.0000
DT J48       0.0838   -        0.0046   0.0038   0.0001
RF           0.0000   0.0046   -        0.4790   0.1565
MLP          0.0000   0.0038   0.4790   -        0.1673
SVM          0.0000   0.0001   0.1565   0.1673   -
Table 3. Final ranking of the performance of the five classifiers employed. The legend has nine classes.
Rank   Classifier   Kappa Index   Global Accuracy (%)
1st    SVM          0.68          74.18
1st    RF           0.66          73.20
1st    MLP          0.66          72.99
2nd    DT J48       0.59          65.57
2nd    NB           0.55          63.50