Optical and SAR Remote Sensing Synergism for Mapping Vegetation Types in the Endangered Cerrado / Amazon Ecotone of Nova

Mapping vegetation types through remote sensing images has proved to be effective, especially in large biomes, such as the Brazilian Cerrado, which plays an important role in the context of management and conservation at the agricultural frontier of the Amazon. We tested several combinations of optical and radar images to identify the four dominant vegetation types that are prevalent in the Cerrado area (i.e., cerrado denso, cerradão, gallery forest, and secondary forest). We extracted features from both sources of data such as intensity, grey level co-occurrence matrix, coherence, and polarimetric decompositions using Sentinel 2A, Sentinel 1A, ALOS-PALSAR 2 dual/full polarimetric, and TanDEM-X images during the dry and rainy season of 2017. In order to normalize the analysis of these features, we used principal component analysis and subsequently applied the Random Forest algorithm to evaluate the classification of vegetation types. During the dry season, the overall accuracy ranged from 48 to 83%, and during the dry and rainy seasons it ranged from 41 up to 82%. The classification using Sentinel 2A images during the dry season resulted in the highest overall accuracy and kappa values, followed by the classification that used images from all sensors during the dry and rainy season. Optical images during the dry season were sufficient to map the different types of vegetation in our study area.


Introduction
The Cerrado biome is considered among the most extensive and diverse ecosystems in the Neotropics and is a biodiversity hotspot [1]. It is also one of the most threatened ecosystems in South America, with over 40% of the biome converted to agriculture and the remainder highly fragmented [2]. Despite the threat to the Brazilian Cerrado, studies on this ecosystem are few and recent.
The Cerrado biome is the second largest vegetation complex in Brazil and occupies about 200 million hectares, of which the largest territory is in the state of Mato Grosso [3]. This large
Nova Mutum is located in the AOD; this area covers 256 municipalities with the most intensive deforestation activities in an area of approximately 1,700,000 km² and it plays an important role in the context of deforestation at the frontier between the Amazon and the Cerrado. The AOD accounts for 75% of the deforestation in the Brazilian Amazon and the largest agricultural area [17]. Legislation, soil, relief, climate conditions, and the subsidies offered by the government have encouraged agricultural activity since 1970. Recently, the Brazilian government has established policies to decrease the rates of deforestation in these areas, such as the "Soy Moratorium" [42], an agreement with the major soybean traders not to purchase soybean planted in areas of the Brazilian Amazon biome that were deforested after July 2006.
Remote Sens. 2019, 11, 1161
In general, the vegetation of the Cerrado in Brazil comprises three main formations: grassland, savanna, and forest. The forest formations consist of arboreal species in a continuous canopy and include the Gallery, Dry, and Open Forest. The savanna formation is characterized by a discontinuous herbaceous-shrub and tree canopy. The seven types of savanna formation are Dense Woodland, Woodland, Open Woodland, Park Woodland, Palm, Vereda, and Stone Woodland. The grassland formations include three vegetation types: Stone Grassland, Shrub Savanna, and Grassland. The first two types of grasslands are characterized by the large presence of shrubs with different types of soils. Figure 2 summarizes the distribution of the three different vegetation formations in the Cerrado biome. Each of these types has a high diversity, which is a consequence of the high variability of the soil and microclimates as well as the floristic evolution with plants from different Brazilian biomes [7].
In our study, we mapped the four dominant vegetation types in Nova Mutum: cerradão (Open Forest), cerrado denso (Dense Woodland), gallery forest, and secondary forest. Cerradão and cerrado denso are located within the transition area of the Amazon and Cerrado biomes. As mentioned before, this area has a high deforestation rate, which can explain the presence of secondary forest. Cerradão has a crown cover between 50 and 90%, and the height of the trees varies from 8 to 15 m. In general, the soils of cerradão are well drained, deep, and have medium-low fertility. Cerrado denso has a crown cover between 5 and 70%, and tree height varies from 5 to 8 m. Its layers of shrubs and herbs are less dense than those of cerradão. In general, the soils of cerrado denso have a medium to very clayey texture and are moderately well drained. The gallery forest has a crown cover between 70 and 95%, and the height of the trees varies from 20 to 30 m. Secondary forests are formed after clear-cutting and have different structures, depending on the age of the succession. At the beginning of succession, these secondary forests are poor in biodiversity and have a simple structure, whereas in later stages their structure depends on environmental factors such as soil, climate, and management [44].

Satellite Data
In order to analyze the use of optical and radar sensors to map the vegetation types in the Cerrado, we used a set of images from four sensors (three SAR and one optical): PALSAR-2 aboard ALOS-2 from the Japan Aerospace Exploration Agency (JAXA); TanDEM-X (TerraSAR-X add-on for Digital Elevation Measurement) from the German Aerospace Center (DLR) and Astrium GmbH; and Sentinel 1A and the optical Sentinel 2A from the European Union's Copernicus programme. Figure 3 shows the temporal coverage of each satellite image used in our analysis.
We selected the satellite image data according to two criteria. First, we selected images from 2017, since the field data were collected in 2017; the TanDEM-X image is the exception. Second, we selected radar images acquired on dates preceded by low precipitation. Table 1 shows the date, polarization, orbit, and accumulated precipitation over the three days before each acquisition.
Seven coverages (using ten bands altogether: Band 2 to Band 8A, Band 11, and Band 12) of the Multispectral Instrument (MSI) on board Sentinel-2A were processed using ESA's Sentinel-2 toolbox in the ESA Sentinel Application Platform (SNAP). First, we applied atmospheric correction using Sen2Cor, an L2A processor for Sentinel-2 data that creates Bottom-Of-Atmosphere (BOA) reflectance images from Top-Of-Atmosphere (TOA) data [45]. Second, we resampled all bands to a 10 m spatial resolution based on the geolocations obtained from the Level-1C metadata. Third, we created a subset of our study area to speed up processing. Finally, we mosaicked the images, as our study area lay between two different orbits of Sentinel 2A.
During the final step, we reduced the number of features by applying principal component analysis (PCA) to the spectral dataset, because some classification algorithms, such as Random Forest, do not work well with highly correlated data. Principal component analysis is a mathematical procedure that reduces a large number of variables to a smaller set of principal components. Its primary function is to determine the extent of the correlation between multispectral bands and to remove it through an appropriate mathematical transformation [46]. We used the first principal component (PC1) of each of the ten bands (e.g., PC1 of Band 2, PC1 of Band 3, and PC1 of Band 4) to aggregate only the information that was essential to the classification process, as PC1 explained most of the variance. Overall, this resulted in a set of ten variables as classification input for each scenario (dry season; dry and rainy seasons).
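As an illustration of the PC1 extraction described above, the following is a minimal sketch, not the processing chain used in the study. It assumes that one band is available as a stack of co-registered acquisitions, each flattened to a list of pixel values, and it finds the leading eigenvector of the date-by-date covariance matrix by power iteration. The function name `first_principal_component` and the input layout are assumptions made for this example.

```python
import math


def first_principal_component(stack, iterations=200):
    """Return (pc1_scores, explained_variance_ratio) for one band.

    stack: list of T acquisitions, each a flat list of N pixel values.
    This layout is an assumption made for the illustration.
    """
    t = len(stack)
    n = len(stack[0])
    # Center every acquisition on its own mean.
    means = [sum(img) / n for img in stack]
    centered = [[v - m for v in img] for img, m in zip(stack, means)]
    # T x T covariance matrix between acquisition dates.
    cov = [[sum(a * b for a, b in zip(centered[i], centered[j])) / (n - 1)
            for j in range(t)] for i in range(t)]
    # Power iteration converges to the leading eigenvector, provided the
    # start vector is not orthogonal to it.
    vec = [1.0 / math.sqrt(t)] * t
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(t)) for i in range(t)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break
        vec = [x / norm for x in nxt]
    # Rayleigh quotient gives the eigenvalue (variance carried by PC1).
    eigenvalue = sum(vec[i] * sum(cov[i][j] * vec[j] for j in range(t))
                     for i in range(t))
    total_variance = sum(cov[i][i] for i in range(t))
    # Project every pixel onto the leading eigenvector.
    scores = [sum(vec[k] * centered[k][p] for k in range(t)) for p in range(n)]
    ratio = eigenvalue / total_variance if total_variance > 0 else 0.0
    return scores, ratio
```

Calling the function once per band would yield one PC1 image per band, which corresponds to the ten input variables mentioned above.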

Sentinel 1A
Twenty-three coverages from the Sentinel-1A IW Ground Range Detected (GRD) Level-1 product were processed using the ESA Sentinel Application Platform toolbox. First, each image was radiometrically calibrated to radar brightness values (β0) [47]. Second, we applied terrain flattening to correct for terrain variations in the images. Terrain flattening is an important step for the mapping of land use. Without it, an additional error could be introduced into the coherency and covariance measurements, due to differences in the terrain and, consequently, in the brightness of the radar return [48]. During the third step, we coregistered the 23 images using the cross-correlation technique to guarantee that every pixel was located on the same target in all images [49]. After co-registration, we separated the processing into two sub-processes.
During the first process, we applied the grey level co-occurrence matrix (GLCM) to extract second-order statistical texture features. The GLCMs were extracted separately for every single date and polarization (VV and VH). These textures can improve land use classification because they capture intensity variations in the image using the information of neighboring pixels to identify specific clusters or objects [50]. Additionally, we applied the Refined Lee speckle filter (window size 5 × 5) after GLCM extraction. This process is necessary to reduce the noise caused by random constructive and destructive interference of the radar signal [51]. For the second branch of processing, we only applied the Refined Lee speckle filter to the backscatter images. We applied the Range Doppler terrain correction in the last step. This process is necessary to geocode the image and to correct the distortions caused by topographical variations and the tilt of the sensor [52]. The entire process is illustrated in Figure 4.
We applied the PCA to reduce the number of features in the same way as explained above for Sentinel 2A. For the backscattering images, we applied the PCA for the two different seasons and two polarizations, which resulted in four PCs: (a) images of VV polarization in the rainy and dry seasons; (b) images of VH polarization in the rainy and dry seasons; (c) images of VV polarization during the dry season; and (d) images of VH polarization during the dry season. The same was applied to the texture images (ten textures: ASM, contrast, correlation, dissimilarity, energy, entropy, homogeneity, MAX, mean, and variance). This resulted in a set of twenty-two inputs for the dry season (one PC of VV, one PC of VH, ten PCs of VV textures, and ten PCs of VH textures) and a set of forty-four inputs for the rainy and dry seasons (one PC of VV in the dry season; one PC of VV in the rainy and dry seasons; one PC of VH in the dry season; one PC of VH in the rainy and dry seasons; ten PCs of VV textures in the dry season; ten PCs of VV textures in the rainy and dry seasons; ten PCs of VH textures in the dry season; ten PCs of VH textures in the rainy and dry seasons).
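To make the GLCM texture step more concrete, here is a minimal sketch of a co-occurrence matrix and a few of the textures listed above. It assumes an image that has already been quantized to integer grey levels and a single pixel offset; the feature formulas are common textbook definitions and may differ in detail from those implemented in SNAP. The function names `glcm` and `texture_features` are hypothetical.

```python
import math


def glcm(image, levels, dx=1, dy=0, symmetric=True):
    """Normalized grey level co-occurrence matrix for one pixel offset.

    image: 2D list of integers in range(levels); dx, dy: neighbor offset.
    """
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1.0
                if symmetric:
                    # Count the pair in both directions.
                    counts[j][i] += 1.0
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]


def texture_features(p):
    """Second-order textures from a normalized GLCM (common definitions)."""
    q = range(len(p))
    asm = sum(p[i][j] ** 2 for i in q for j in q)
    return {
        "ASM": asm,
        "energy": math.sqrt(asm),
        "contrast": sum(p[i][j] * (i - j) ** 2 for i in q for j in q),
        "dissimilarity": sum(p[i][j] * abs(i - j) for i in q for j in q),
        "homogeneity": sum(p[i][j] / (1 + (i - j) ** 2) for i in q for j in q),
        "entropy": -sum(p[i][j] * math.log(p[i][j])
                        for i in q for j in q if p[i][j] > 0),
        "MAX": max(max(row) for row in p),
    }
```

In practice the matrix is computed in a moving window around every pixel, so that each texture becomes an image of its own.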

ALOS-PALSAR 2 (Dual and Full Polarimetric)
Four coverages of the dual polarization images were converted to the covariance matrix C2, and one coverage of the full polarization images was converted to the C3 matrix [53]. In this step, multilooking with 4 looks in azimuth and 1 in range was applied to convert the images from single look complex to ground range detected. We applied a speckle filter to all images to reduce speckle noise. For that, we used the Refined Lee adaptive filter (5 × 5 window), which is more efficient, results in less destructive averaging, and has been widely used in radar studies [51][52][53][54]. Here, we separated the images into two subprocesses. For the first, we kept the backscattering images. For the second, we calculated polarimetric decompositions. Polarimetric SAR decomposition is a useful method to map and discriminate the different targets on the surface, especially because the signal of a target is a combination of speckle noise and random vector scattering effects [55]. In our study, we chose the Freeman-Durden, Yamaguchi, and Van Zyl polarimetric decompositions. In general, these three decompositions are based on the covariance matrix, which is divided into three scattering mechanisms: volume, double bounce, and surface scatter [55]. Additionally, polarimetric decompositions have been used before in vegetation mapping and have improved vegetation classification in the Amazon and Cerrado [56]. We applied the Range Doppler terrain correction to all images.
As with the other images, we applied the PCA to reduce the number of features and thereby facilitate the subsequent classification. This resulted in a set of ten PCs. Four came from the dual polarimetric images: PC1 of backscattering in each polarization (HH and VH) and each orbit (ascending and descending). Six came from the full polarimetric image: PC1 of backscattering in each polarization (HH, VV, and VH) and PC1 of each scattering mechanism (volume, double bounce, and surface scatter).
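The multilooking step mentioned above (4 looks in azimuth, 1 in range) can be illustrated with a short sketch. It assumes the single look complex image is available as a 2D list of complex samples with azimuth along the rows, and it simply averages the intensity over non-overlapping blocks. The function name `multilook_intensity` is hypothetical, and operational processors apply additional steps such as ground range projection.

```python
def multilook_intensity(slc, az_looks=4, rg_looks=1):
    """Average the intensity |s|^2 over az_looks x rg_looks blocks.

    slc: 2D list of complex samples (rows = azimuth, columns = range).
    Trailing rows or columns that do not fill a block are dropped.
    """
    out_rows = len(slc) // az_looks
    out_cols = len(slc[0]) // rg_looks
    looks = az_looks * rg_looks
    result = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            total = 0.0
            for dr in range(az_looks):
                for dc in range(rg_looks):
                    sample = slc[r * az_looks + dr][c * rg_looks + dc]
                    total += abs(sample) ** 2
            row.append(total / looks)
        result.append(row)
    return result
```

Averaging independent looks trades spatial resolution for lower speckle variance, which is why the step precedes the speckle filter.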

TanDEM-X
The TanDEM-X mission operates two X-band satellites flying in close formation in order to acquire single-pass interferometric SAR data. The primary goal of the TanDEM-X mission was the generation of a global digital elevation model [57]. All TanDEM-X acquisitions are available to the science community on request in Coregistered Single look Slant range Complex (CoSSC) format. For the study area, we acquired one TanDEM-X scene in standard (bistatic) mode with horizontal polarization (HH). The processing was separated into two parts. In the first part, we estimated the magnitude of coherence. Coherence describes the degree of correlation between the two complex radar images [58]. It is a measure of the quality of the phase measurement in interferometric SAR analysis and is also used as a proxy for soil and vegetation structural parameters.
In the second part, we processed the intensity images. We multilooked both the coherence and intensity images, with 4 looks in azimuth and 3 in range, to reduce the noise. Additionally, as with the other images, we applied the Refined Lee speckle filter (window 5 × 5). For the last step, we applied the Range Doppler terrain correction. All images were processed to a final product with a spatial resolution of 10 m. The PCA was not applied to TanDEM-X because only a single date was used.
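The coherence magnitude described above is commonly estimated over a local window as the normalized cross-correlation of the two co-registered complex images. The sketch below assumes the window samples of both images are passed as flat lists of complex numbers; the function name `coherence_magnitude` is hypothetical, and this standard estimator is not necessarily identical to the one used by the processing software in the study.

```python
import math


def coherence_magnitude(s1, s2):
    """Sample coherence magnitude of two co-registered complex windows.

    s1, s2: equally long lists of complex samples from the same window.
    Returns a value between 0 (decorrelated) and 1 (fully coherent).
    """
    # Complex cross-correlation between the two images.
    cross = sum(a * b.conjugate() for a, b in zip(s1, s2))
    # Product of the two window powers, used for normalization.
    power = sum(abs(a) ** 2 for a in s1) * sum(abs(b) ** 2 for b in s2)
    if power == 0.0:
        return 0.0
    return abs(cross) / math.sqrt(power)
```

A constant phase offset between the two images leaves the magnitude unchanged, whereas random phase differences within the window drive it towards zero.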

Classification
The process of image classification was separated into two steps. First, a forest mask was generated as a result of a forest/non-forest classification. At the second step, we classified the forest type within this forest mask. The area under investigation is covered by all images used in this study.
We used the Random Forest (RF) algorithm implemented in R for image classification. Random Forest is a supervised classification algorithm that uses multiple decision trees to obtain an accurate classification and prediction. The classifier builds N trees, each of which votes for a class, and the most frequent class is assigned. The algorithm uses bagging to produce a random sample of the training set for each decision tree, so every tree uses a random subset of the original set. About two-thirds of the training samples are used to build each tree, and the remaining one-third, the out-of-bag (OOB) data, is used to obtain the OOB error estimate [59]. Random Forest can be used for classification and regression and is an efficient tool for measuring the relative importance of each feature. The variable importance measures the decrease in accuracy when a variable is removed from the classification: the higher a variable is ranked, the more it contributes to the accuracy. Additionally, RF has a lower probability of overfitting than other models if there are enough trees. The method has several advantages: it does not require any input preparation, it is more stable with big data since it works well with non-linear variables, it provides a pre-selection of features while building the trees, and it reduces the processing time. For remote sensing analysis, RF has proved to be a stable and accurate algorithm, especially when applied to different types of sensors and long time series. Its success can be seen in recent studies [59][60][61], which applied RF for vegetation mapping using different types of data. In this study, the RF models consisted of 1000 trees, and 70% of our samples were used for training the classifier and 30% for validating the classification results.
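The bagging and voting mechanics described above can be sketched without a full Random Forest implementation. The example below only illustrates how a bootstrap sample splits the training set into in-bag and out-of-bag parts and how tree votes are combined; it is not the R implementation used in the study, and the function names are hypothetical. For large training sets, the out-of-bag share of a bootstrap sample approaches 1/e, i.e., about 37%, which is the "roughly one-third" referred to in the text.

```python
import random
from collections import Counter


def bootstrap_split(n_samples, rng):
    """Draw one bootstrap sample and return (in_bag, out_of_bag) indices.

    in_bag is drawn with replacement and may contain repeated indices;
    out_of_bag holds every index that was never drawn.
    """
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n_samples) if i not in drawn]
    return in_bag, out_of_bag


def majority_vote(tree_predictions):
    """Return the class predicted by most trees for one pixel."""
    return Counter(tree_predictions).most_common(1)[0][0]
```

Each tree would be trained on its own in-bag sample, and the OOB error is obtained by letting every tree vote only on the samples it did not see.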

Forest and Non-Forest
In order to classify the vegetation types, we first needed to create a map of forest and non-forest areas. For accuracy assessment, we created 100 random points of 3.13 ha each and visually classified them using high-resolution imagery from Google Earth and the Sentinel 2A sensor (Figure 5). During the classification process, we used 70% of these points for training the classifier and 30% for validation of the classification results.

Forest Type
Forest-type mapping was only conducted in the areas masked as forest in the previous step. We created 24 reference areas equally distributed among four vegetation classes (cerradão, cerrado denso, gallery forest, and secondary forest). Each one had an area of 14.265 ha (Figure 5). The polygons were classified based on field data collection (July 2017) and high-resolution imagery from Google Earth and Sentinel 2A (26 July 2017). During the classification process using RF, we used 70% of the pixels in the 24 reference areas for training and 30% for validation.
To analyze the synergy of optical and radar data for mapping Cerrado vegetation types, all possible combinations of optical and radar sensors were tested in two different scenarios: dry season, and dry and rainy seasons (Table 2). In addition, we used the sensors separately and analyzed the SAR classifications. In total, 23 datasets were processed. For the classifications that combined two or more sensors, e.g., Sentinel 2A and ALOS-PALSAR 2, we did not use all the features of each sensor. Instead, we selected the three most important features of each sensor, based on the variable importance calculated during the RF classification of the corresponding single-sensor dataset. Variable importance ranks the variables/features in a hierarchy according to their contribution to the classification.
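The feature selection for multi-sensor combinations described above amounts to ranking each sensor's features by importance and keeping the top three. The sketch below assumes that the importance scores from the single-sensor classifications are available as dictionaries; the function names and the data layout are assumptions for this example, and any values passed in are placeholders rather than results from the study.

```python
def top_features(importance, k=3):
    """Return the names of the k features with the highest importance."""
    ranked = sorted(importance.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]


def combine_sensors(importance_by_sensor, k=3):
    """Build the feature list for a multi-sensor classification.

    importance_by_sensor: {sensor: {feature_name: importance}}, taken from
    the single-sensor runs. Feature names are prefixed with the sensor so
    that identically named features stay distinguishable.
    """
    combined = []
    for sensor, importance in importance_by_sensor.items():
        combined.extend(f"{sensor}:{name}" for name in top_features(importance, k))
    return combined
```

With two sensors and k=3, the combined classification would receive six input features.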
For both classifications, forest/non-forest and vegetation type, we used the confusion matrix to analyze the performance of the Random Forest classifications. The confusion matrix assesses the accuracy of the classification by showing the relation between the classification results and the sample sites. Column values correspond to the sample site results, rows to the classification results, and the diagonal to the correctly classified pixels. The general measure derived from a confusion matrix of q classes is the overall accuracy, which is obtained by dividing the number of correctly classified pixels by the total number of pixels. Additionally, we used the kappa coefficient to measure the accuracy of the classification. The values of the kappa coefficient range from 0 to 1, where 0 means no relation between the classification results and the sample site results, and 1 means that both are identical [62].
Finally, for detailed analysis, we calculated both the user's and producer's accuracy. User's accuracy ($U_i$) is the number of correctly identified pixels of a given class ($p_{ii}$) divided by the total number of pixels of that class in the classified image ($p_{i\cdot}$), i.e., $U_i = p_{ii} / p_{i\cdot}$.
Producer's accuracy ($P_j$), on the other hand, is the number of correctly identified pixels ($p_{jj}$) divided by the total number of pixels of that class in the reference image ($p_{\cdot j}$), i.e., $P_j = p_{jj} / p_{\cdot j}$. A detailed description of the classification assessment can be found in the literature [62,63].
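The accuracy measures defined above can be computed directly from a confusion matrix. The following sketch follows the convention stated in the text (rows are classification results, columns are reference results) and uses the standard definition of Cohen's kappa; the function name `accuracy_metrics` is hypothetical.

```python
def accuracy_metrics(cm):
    """Overall accuracy, kappa, user's and producer's accuracy.

    cm: square confusion matrix as a list of lists, where cm[i][j] is the
    number of pixels classified as class i that belong to reference class j.
    """
    q = len(cm)
    n = sum(sum(row) for row in cm)
    row_totals = [sum(cm[i]) for i in range(q)]                       # p_i.
    col_totals = [sum(cm[i][j] for i in range(q)) for j in range(q)]  # p_.j
    correct = sum(cm[i][i] for i in range(q))
    overall = correct / n
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(q)) / (n * n)
    kappa = (overall - expected) / (1 - expected) if expected < 1 else 1.0
    users = [cm[i][i] / row_totals[i] if row_totals[i] else 0.0
             for i in range(q)]
    producers = [cm[j][j] / col_totals[j] if col_totals[j] else 0.0
                 for j in range(q)]
    return {"overall": overall, "kappa": kappa,
            "users": users, "producers": producers}
```

User's accuracy reflects commission errors (pixels wrongly assigned to a class), whereas producer's accuracy reflects omission errors (reference pixels of a class that were missed).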

Forest and Non-Forest
The two combinations used for classification, Sentinel 2A with ALOS-PALSAR 2 dual polarimetric and Sentinel 2A with ALOS-PALSAR 2 full polarimetric, showed similarly high overall accuracies of 0.99 and 1, respectively. The variable importance showed similar results. In both cases, the PC1 of Band 11 and Band 5 of the Sentinel 2A images had the highest contribution to the Random Forest classifier.
Based on this result, we created a mask of forest and non-forest areas, where 34% was forest and 66% was non-forest (Figure 6). This mask was used in the next step for the forest type classification.

Dry Season
Table 3 shows the overall average accuracy (OAA), kappa, confidence interval (CI) values, and variable importance of the classifications during the dry season.
Table 3. Overall accuracy, kappa, 95% confidence interval, overall average accuracy (OAA), and the three most important variables for the classifications based on the Random Forest variable importance for Sentinel 2A (S2), ALOS PALSAR 2 full (A2f), ALOS PALSAR 2 dual (A2d), TanDEM X (TX), and Sentinel 1A (S1). The parameters listed in the variable importance are the PC1 derived from the PCA, except for the images from TanDEM-X, as only one acquisition was available. Only data acquisitions during the dry season were considered.

Classifications using only a single radar sensor (Sentinel 1A, TanDEM-X, and ALOS-PALSAR 2 dual) had lower overall accuracy and kappa values than the classifications that used two or more sensors. Sentinel 2A (S2) had the highest overall accuracy (82.60%) and kappa value (0.77). The variable importance shows that the PC1 of Bands 11 and 12 were the most important during the RF classification, followed by the PC1 of Bands 5, 4, and 2. Figure 7 shows the results of the S2 classification. A gradient is visible: the north, which is closest to the Amazon biome, mostly comprises areas of cerradão, while cerrado denso prevails in the south. Additionally, the map shows a large area of secondary forest in the northwest of the study area. Based on this map, cerrado denso covers 34.50% of the Cerrado area, cerradão 28.70%, gallery forest 28.14%, and secondary forest 8.66%.

Among the single-sensor classifications, Sentinel 1A (S1) had the lowest results, with an overall accuracy of 48.51% and a kappa value of 0.31. The PC1 of the entropy and mean images of VV polarization and the PC1 of the variance image of VH polarization were the most important variables for the RF classifier. The TanDEM-X classification also had low accuracy and kappa values, 58.22% and 0.44, respectively. The coherence was more important than the intensity.
The ALOS-PALSAR 2 dual and full polarimetric images showed different classification results. In our study, the dual polarimetric images had higher overall accuracy and kappa values (59.70% and 0.46, respectively) than the full polarimetric images. However, we used four different dates of dual polarimetric images and only one full polarimetric image. This difference in the number of acquisitions may explain the better accuracy of the dual polarimetric images.
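As an illustration of how several acquisition dates can be condensed into a single PC1 image, the following minimal sketch computes the first principal component of a small pixels-by-dates table. It is not the processing chain used in this study: the power iteration, the function name `first_principal_component`, and the backscatter values are assumptions chosen only to keep the example self-contained.

```python
import math


def first_principal_component(samples, iterations=200):
    """Return (direction, scores) of PC1 for `samples` (rows = pixels, columns = dates)."""
    n = len(samples)
    d = len(samples[0])
    # Centre each date (column) on its mean.
    means = [sum(row[j] for row in samples) / n for j in range(d)]
    centred = [[row[j] - means[j] for j in range(d)] for row in samples]
    # Sample covariance matrix between dates.
    cov = [[sum(r[a] * r[b] for r in centred) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Power iteration: repeated multiplication by the covariance matrix
    # converges to the eigenvector with the largest eigenvalue (PC1).
    v = [1.0 + 0.1 * j for j in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # all dates are constant, no variance to explain
        v = [x / norm for x in w]
    # PC1 score of each pixel: projection of the centred values on v.
    scores = [sum(r[j] * v[j] for j in range(d)) for r in centred]
    return v, scores


# Hypothetical backscatter values (dB) of five pixels on four dates.
stack = [
    [-12.1, -11.8, -12.4, -12.0],
    [-8.3, -8.0, -8.9, -8.4],
    [-15.2, -14.7, -15.5, -15.1],
    [-9.9, -9.6, -10.3, -10.0],
    [-13.4, -13.0, -13.8, -13.5],
]
direction, pc1_scores = first_principal_component(stack)
```

The resulting `pc1_scores` play the role of one PC1 image value per pixel. A sensor with a single acquisition, such as TanDEM-X here, has nothing to condense and is used directly.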
In general, combining two or more sensors improved the extraction of the target's information and, consequently, the classification. Among the combinations, the classification that used S2 and TanDEM-X showed the highest overall accuracy and kappa values, 81.91% and 0.76, respectively. Variable importance shows that the PC1 of Bands 11 and 12 of S2 and the coherence of TanDEM-X were the most important variables for the RF classifier.
The classification that combined S1 and S2 had an overall accuracy of 79.90% and a kappa value of 0.73. The PC1 of Bands 11 and 12 of S2 and the PC1 of the VH polarization contrast of S1 ranked high in the variable importance. The classification that used all images from the dry season had overall accuracy and kappa values similar to those of the S2 and TanDEM-X classification. The PC1 of Band 11, the PC1 of the ALOS-PALSAR 2 dual VH polarization descending orbit, and the PC1 of Band 12 had the highest importance.
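The overall accuracy and kappa values reported for each classification are both derived from a confusion matrix. The sketch below shows the standard calculation of overall accuracy and Cohen's kappa. The function name and the 4 x 4 matrix are hypothetical and do not reproduce the validation data of this study.

```python
def overall_accuracy_and_kappa(matrix):
    """matrix[i][j] = number of samples of reference class i mapped as class j."""
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(k)) / total
    # Chance agreement: product of the row and column marginals of each class.
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (total * total)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical confusion matrix; class order: cerrado denso, cerradão,
# gallery forest, secondary forest.
example = [
    [50, 6, 3, 1],
    [7, 45, 5, 3],
    [2, 6, 48, 4],
    [1, 2, 3, 14],
]
accuracy, kappa = overall_accuracy_and_kappa(example)
```

Kappa is lower than the overall accuracy because it discounts the agreement that would be expected by chance from the class proportions alone.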
The highest accuracy for each of the four forest classes was obtained with different classification inputs. The highest producer's accuracy for the cerrado denso class was achieved with the S2 and S1 classification, and the highest user's accuracy with the classification that used S2 images. The highest producer's accuracy for the cerradão class was reached with the classification that used all images, and the highest user's accuracy with the S2 and TanDEM-X classification. For the gallery forest, the highest producer's accuracy was obtained with the classification that used S1 images, and the highest user's accuracy with S2 images. The highest user's accuracy for the secondary forest class was again reached with S2 images, whereas the ALOS-PALSAR 2 dual polarimetric images gave the best producer's accuracy for this class (Figure 8).
Table 4 summarizes the results for the classifications during the dry and rainy seasons. The classification of S1 images from the dry and rainy seasons had higher overall accuracy and kappa values than the S1 classification of the dry season alone, with an improvement of 16% in overall accuracy and 33% in kappa. This result shows that combining images from the dry and rainy seasons improved the classification of S1 images. Here, the PC1 of the entropy and mean images of VV polarization and the PC1 of the VH polarization contrast image were the three most important variables. The ALOS-PALSAR 2 full polarimetric classification showed lower overall accuracy and kappa values than the ALOS-PALSAR 2 dual polarimetric classification during the dry season. Moreover, the volume polarimetric decomposition image was more important to the RF classifier.
Table 4. Overall accuracy, kappa values, confidence interval 95% OAA, and the three most important variables for the classifications according to the Random Forest variable importance for Sentinel 2A (S2), ALOS PALSAR 2 full (A2f), ALOS PALSAR 2 dual (A2d), TanDEM X (TX), and Sentinel 1A (S1). The parameters listed in the variable importance are the PC1 derived from the PCA, except for the images from TanDEM-X, as only one acquisition was available. All data acquisitions during the dry and rainy seasons were considered.
For the dry and rainy seasons, the classifications that combined radar and optical sensors were more accurate. From each classification that used more than one sensor, we selected the three images with the highest variable importance, totalling 15 images, and used these images as input for the classification of all images. This classification had the highest overall accuracy and kappa values (81.91% and 0.76) (Figure 9). The PC1 of Band 11 of S2, the PC1 of the ALOS-PALSAR 2 dual VH polarization descending orbit, and the PC1 of Band 12 of S2 were the images that contributed most to the classification of all images. The S2 and S1 classification showed higher overall accuracy and kappa values, 81.73% and 0.75, than the corresponding classification during the dry season. Variable importance showed that the PC1 of Bands 12 and 11 of S2 and the PC1 of the VH polarization contrast were the most important variables.

Dry and Rainy Season
The highest producer's accuracy for the cerrado denso class was achieved with S2 and ALOS-PALSAR 2 full polarimetric classification, and the highest user's accuracy was achieved with the classification that used all images. The highest producer's and user's accuracy for cerradão class was reached with the classification that used S2 and S1. For the gallery and secondary forest, the highest user's accuracy was obtained using S2 and S1 images. The highest producer's accuracy for the gallery forest was achieved with S1 images and for the secondary forest class with all images (Figure 10).
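Producer's and user's accuracy, which are compared per class above, can be read from the same confusion matrix as the overall accuracy. The following sketch assumes that rows hold the reference classes and columns the mapped classes. The function name and the matrix values are hypothetical.

```python
def producers_and_users_accuracy(matrix, class_names):
    """Return {class: (producer's accuracy, user's accuracy)}.

    Assumes matrix[i][j] = number of samples of reference class i mapped as class j.
    """
    result = {}
    for i, name in enumerate(class_names):
        correct = matrix[i][i]
        reference_total = sum(matrix[i])             # all reference samples of the class
        mapped_total = sum(row[i] for row in matrix)  # all samples mapped as the class
        # Producer's accuracy: share of the reference samples that were found.
        producers = correct / reference_total if reference_total else float("nan")
        # User's accuracy: share of the mapped samples that are correct.
        users = correct / mapped_total if mapped_total else float("nan")
        result[name] = (producers, users)
    return result


classes = ["cerrado denso", "cerradão", "gallery forest", "secondary forest"]
# Hypothetical confusion matrix, not the validation data of this study.
example = [
    [50, 6, 3, 1],
    [7, 45, 5, 3],
    [2, 6, 48, 4],
    [1, 2, 3, 14],
]
per_class = producers_and_users_accuracy(example, classes)
```

A class can therefore have a high producer's accuracy with one input and a high user's accuracy with another, which is the pattern described for several classes in this section.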

Radar Classification
We separately analyzed the radar classifications of Sentinel 1A, ALOS-PALSAR 2 dual/full polarimetric, and TanDEM-X (C Band, L Band, and X Band, respectively) for both seasons. Table 5 presents the results of these classifications during the dry season. The combination of TanDEM-X and ALOS-PALSAR 2 dual polarimetric images achieved the highest overall accuracy and kappa values, 66.96% and 0.56, respectively. The combination of S1 and TanDEM-X had the lowest overall accuracy and kappa values, 54.46% and 0.39. Furthermore, the PC1 of the ALOS-PALSAR 2 dual polarimetric VH descending orbit and HH descending orbit images and the coherence of the TanDEM-X images ranked highest in the Random Forest variable importance (Table 5).
Table 5. Overall accuracy, kappa values, confidence interval 95% OAA, and the three most important variables for the classifications based on the Random Forest variable importance for Sentinel 2A (S2), ALOS PALSAR 2 full (A2f), ALOS PALSAR 2 dual (A2d), TanDEM X (TX), and Sentinel 1A (S1). The parameters listed in the variable importance are the PC1 derived from the PCA, except for the images from TanDEM-X, as only one acquisition was available.

Dry Season
Combining the dry and rainy seasons, the S1 and ALOS-PALSAR 2 dual polarimetric classification achieved the highest overall accuracy and kappa values, 66.61% and 0.55, respectively. Here, the PC1 of the ALOS-PALSAR 2 dual polarimetric VH descending orbit and HH descending orbit images and the PC1 of the VH polarization contrast images were the most important variables. The classification that combined ALOS-PALSAR 2 dual polarimetric and ALOS-PALSAR 2 full polarimetric images showed the lowest overall accuracy and kappa values, 58.30% and 0.44, respectively.
For the dry season, the highest producer's and user's accuracy for the cerrado denso and cerradão classes were achieved with the TanDEM-X and ALOS-PALSAR 2 dual polarimetric classification. This sensor combination also had the highest user's accuracy and producer's accuracy, together with S1 and ALOS-PALSAR 2 dual polarimetric, for the secondary forest. For the gallery forest, the highest producer's and user's accuracies were achieved with the S1 and TanDEM-X classification (Table 6). The radar sensor combinations presented higher overall accuracy and kappa values than the single use of these sensors.
The dry and rainy seasons combined gave similar results. For the gallery forest, producer's and user's accuracy were again highest using S1 and TanDEM-X. For cerrado denso, the best user's accuracy was achieved with the S1 and TanDEM-X classification, and the highest producer's accuracy was obtained using S1 and ALOS-PALSAR 2 dual polarimetric images as input for the classification. This combination was also the best for the secondary forest, alongside the combination of ALOS-PALSAR 2 dual polarimetric and ALOS-PALSAR 2 full polarimetric images. Here, TanDEM-X and ALOS-PALSAR 2 full polarimetric images reached the highest user's accuracies. The highest producer's and user's accuracy for the cerradão class were achieved with the ALOS-PALSAR 2 dual polarimetric and ALOS-PALSAR 2 full polarimetric classification (Table 7).
Furthermore, the polarization of the radar sensors proved to be an important factor for the Random Forest classification. The PC1 of the cross-polarized HV intensity images was one of the most important variables in 60% of the classifications that used radar sensors.

Summary of the Classification
The three highest overall accuracies and kappa values belonged to S2, S2 with TanDEM-X, and the combination of all images for the dry and rainy seasons. Nevertheless, the ranges of the confidence intervals give a different ranking from the overall accuracy and kappa values. The three narrowest ranges, which indicate good precision, belong to the classifications of all images of the dry and rainy seasons, all images of the dry season, and S2 with S1 from the dry and rainy seasons (Table 5).
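To make the notion of a confidence interval range concrete, the sketch below computes a 95% interval for an overall accuracy with the normal approximation for a proportion. This approximation, the function name, and the sample sizes are assumptions made for illustration. The text above does not state how the intervals in the tables were derived.

```python
import math


def accuracy_confidence_interval(accuracy, n_samples, z=1.96):
    """Normal-approximation interval for an accuracy given as a fraction in [0, 1]."""
    # Standard error of a proportion estimated from n_samples validation samples.
    standard_error = math.sqrt(accuracy * (1.0 - accuracy) / n_samples)
    half_width = z * standard_error
    # Clip to the valid range of a proportion.
    return max(0.0, accuracy - half_width), min(1.0, accuracy + half_width)


# Overall accuracy of the S2 dry season classification (82.60%), combined with
# two hypothetical validation sample sizes.
low_small, high_small = accuracy_confidence_interval(0.8260, 300)
low_large, high_large = accuracy_confidence_interval(0.8260, 1200)
range_small = high_small - low_small
range_large = high_large - low_large
```

Under this approximation the range shrinks with more validation samples and with accuracies further from 50%, so a narrow range reflects the precision of the estimate and not necessarily a higher accuracy.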
The variable importance for the classifications that combined optical and radar images showed that PC1 of Bands 11, 12, and 5 from S2, PC1 of ALOS-PALSAR 2 dual polarimetric VH descending orbit, PC1 of ALOS-PALSAR 2 dual polarimetric HH descending orbit, coherence of TanDEM-X, and the PC1 of contrast VH from the rainy season of Sentinel 1A images were the most important variables during the Random Forest classification.

Discussion
The results showed the importance of integrating satellite images from different sensors to classify forest and non-forest areas. The Program for the Estimation of Amazon Deforestation (PRODES) is the most important project conducting satellite monitoring of deforestation in the Legal Amazon, producing annual deforestation rates for the region from Landsat images (30 m spatial resolution). Comparing the forest areas from the PRODES project with the results of our work reveals a strong underestimation of forest area in PRODES, mainly in the gallery forest and cerrado denso classes. PRODES estimated 12,702 ha of forest, whereas our work estimated 27,326 ha. This difference can be attributed to the different spatial resolutions used in PRODES (30 m) and in our study (10 m).
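The size of this gap, and the difference in pixel footprint behind it, can be checked with a few lines of arithmetic. The sketch below uses only the areas and pixel sizes quoted above. The variable and function names are illustrative.

```python
def pixel_area_ha(pixel_size_m):
    """Ground area of one square pixel in hectares (1 ha = 10,000 m²)."""
    return pixel_size_m ** 2 / 10_000.0


prodes_forest_ha = 12_702  # forest area estimated by PRODES
study_forest_ha = 27_326   # forest area estimated in this study

# Share of the forest area mapped in this study that PRODES does not report.
relative_gap = (study_forest_ha - prodes_forest_ha) / study_forest_ha

# Number of 10 m pixels covered by a single 30 m pixel.
pixels_per_coarse_pixel = (30 / 10) ** 2

area_10m = pixel_area_ha(10)  # 0.01 ha
area_30m = pixel_area_ha(30)  # 0.09 ha
```

PRODES thus reports a little less than half of the forest area mapped here, and each 30 m pixel aggregates nine 10 m pixels, which is consistent with the resolution explanation given above.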
Optical images are widely used to map vegetation types in the Cerrado biome. In our results, the S2 classifications showed the highest overall accuracy and kappa values. The application of S2 images to map vegetation types in the Cerrado biome is new. In general, Landsat is the most common sensor used to discriminate vegetation types in the Cerrado. Nascimento and Sano [23] obtained 85% overall accuracy for mapping vegetation types in this biome. The authors used Landsat 7 ETM+ images to discriminate the Rupestrian Cerrado (Savanna formation) in the Chapada dos Veadeiros National Park in Goias State, which can be difficult due to the spectral confusion with other types of Cerrado vegetation. The optical bands located in the red and NIR wavelengths showed high importance and contribution to the discrimination of vegetation type, as was visible in our results (Tables 3 and 4). Nascimento and Sano [23] also point out the importance of the VIS and NIR regions for characterizing forest areas, as vegetation has higher reflectance in this wavelength range and is thus more sensitive. Additionally, using several optical images, as in ours and other studies, increases the power to discriminate different vegetation types, because plants show distinct spectral signatures over the course of the year [64,65]. Optical data are certainly useful to map the vegetation types in the Cerrado; however, these images are usually not available during the rainy season, and optical data cannot extract information on the structure of the forest [66]. Moreover, the availability of images in the rainy season would allow for a higher temporal resolution, which is crucial to better discriminate the vegetation types in the Cerrado biome due to its high seasonality. Additionally, in dense areas of vegetation, the optical sensor is usually saturated due to the low optical depth penetration through these areas, which affects the mapping of the various vegetation types.
There are important projects assessing the land use of the Cerrado biome, such as the TerraClass Cerrado project, which produced a land use map of the biome. However, the project had great difficulty discriminating the different types of vegetation, which is important for the preservation of biodiversity in this region. Nevertheless, TerraClass represents another step toward meeting the challenge of mapping the different types of vegetation in the Cerrado [29].
The use of radar images can be a solution to overcome the lack of image availability in the rainy season and the high saturation of optical images in areas of great biomass density. Among our radar classification results from the dry and rainy seasons, the TanDEM-X (X Band) and ALOS-PALSAR 2 (L Band) dual polarimetric classification from the dry season showed the highest overall accuracy and kappa values. Vegetation scattering mechanisms depend strongly on the wavelength and polarization of the sensor. At short and intermediate wavelengths, such as the X and C Bands, backscattering represents the interaction of the radiation with the canopy, leaves, branches, and secondary branches, and part of the volumetric scattering (inside the crown). Longer wavelengths, such as the L and P Bands, penetrate deeper, and bigger vegetation components such as trunks, crown, ground, and branches interact with these longer wavelengths. According to the results for the dry season, the L Band dual polarimetric images had the highest overall accuracy and kappa values compared to the classifications that used a single sensor in the X and C Bands. The study area is mostly forested. In these areas, radar signals are more likely to be saturated in the X and C Bands than in the L Band [67]. The polarization controls the types of components that interact with the radiation. In our study, the L Band cross-polarized HV image was the most important variable that contributed to the Random Forest classifier in the best classification. This agrees with the fact that cross-polarized images are directly related to volumetric scattering and are therefore sensitive to forest structure [68]. There are few studies in the Cerrado biome using only radar images. Sano et al. [34] used L Band JERS-1 SAR data to map the different types of vegetation by analyzing the backscattering coefficient values. That study separated grassland, mixed grass/shrub/woodland, and woodland well in the Distrito Federal.
The results of the CI 95% OAA showed the importance of the fusion between optical and radar data to map vegetation types in the Cerrado biome, since the confidence interval with the narrowest range belonged to the classification that used all images from the dry and rainy seasons; the narrower the interval, the more precise the accuracy estimate. The Cerrado vegetation has one of the largest forest diversities; consequently, the combination of different sensors (optical and radar) and spatial resolutions (low, medium, and high) results in a great improvement in accuracy [32]. Of the three classifications that obtained the highest values of accuracy and kappa, two used radar and optical images. This showed the importance of integrating different sensors to improve the mapping of forest types in the Cerrado. A similar result was reported by Sano et al. [38], who combined optical and radar images to improve the classification of different vegetation types in the Cerrado biome. That study, which used both sensors in regions of savanna and grassland formations, achieved a high overall classification accuracy. Sano et al. [38] used data from the dry and rainy seasons and showed the importance of the time series in improving the classification of different types of vegetation. Additionally, Sano et al. [38] showed better performance of radar data (JERS-1 SAR) compared to optical data (Landsat). In contrast, our results showed that optical data performed better for classification than radar data. However, that study used a larger number of L Band radar images than ours, which increased the efficiency of vegetation mapping because of the sensitivity of the L Band to the various structures of the forest and, consequently, its ability to distinguish forest types, as reported by Lucas et al. [69], Garestier [70], and Santoro [71].
Carvalho et al. [37] used images from ALOS-PALSAR and Landsat to map the different types of vegetation, and their results agree with our findings. The highest overall accuracy and kappa values were from the S2 classification; therefore, in our results, the use of radar images did not reach the highest accuracy and kappa values. Carvalho et al. [37] showed that the use of radar data did not improve classification accuracy; however, that study used only a single radar dataset. Concerning GLCM textures, the same study showed similar results. Grey Level Co-occurrence Matrix (GLCM) texture images had a high variable importance during the Random Forest classification, in particular entropy, which describes the disorder of the GLCM elements. This may be related to the differences in the backscattering of the vegetation type classes.
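The GLCM entropy mentioned above can be illustrated on a small quantized image. The sketch below counts grey level pairs for one pixel offset, normalizes the counts to probabilities, and computes the entropy with the natural logarithm. The offset, the lack of symmetrization, the function name, and the two test patches are assumptions for illustration and are not the texture settings of this study.

```python
import math
from collections import Counter


def glcm_entropy(image, dx=1, dy=0):
    """Entropy of the normalized grey level co-occurrence matrix of `image`.

    `image` is a list of rows of integer grey levels; (dx, dy) is the offset
    between the two pixels of each pair (default: right-hand neighbour).
    """
    rows, cols = len(image), len(image[0])
    pairs = Counter()
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                # Count how often grey level a has neighbour grey level b.
                pairs[(image[r][c], image[r2][c2])] += 1
    total = sum(pairs.values())
    # Entropy: -sum(p * ln p) over all grey level pairs that occur.
    return -sum((n / total) * math.log(n / total) for n in pairs.values())


# Hypothetical 4 x 4 patches quantized to four grey levels.
homogeneous = [[2, 2, 2, 2]] * 4
heterogeneous = [
    [0, 3, 1, 2],
    [2, 0, 3, 1],
    [1, 2, 0, 3],
    [3, 1, 2, 0],
]
entropy_homogeneous = glcm_entropy(homogeneous)
entropy_heterogeneous = glcm_entropy(heterogeneous)
```

A patch with a single grey level pair has zero entropy, whereas a patch whose neighbouring values vary has a higher entropy, which is why this measure can respond to differences in backscattering texture between vegetation types.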
Regarding the user's accuracy, the secondary forest was better classified using optical images, whereas the other three classes were better classified using optical and radar images. The optical bands were the most important variables for the RF classifier, and the texture images were the second most important. Several authors reported results similar to those of this study [62,72,73]. All of these studies showed an improvement in the separability of land cover types when employing texture images. The coherence image from TanDEM-X was the third most important variable. Schlund et al. [72] and Baron and Erasmi [62] also showed an improvement in the discrimination of forest from other classes using coherence.
Other studies on the classification of vegetation types in the Cerrado biome, such as Mesquita et al. [35], were conducted in regions where the vegetation has a smaller gradient than in regions within the Arc of Deforestation, such as Distrito Federal, Minas Gerais, and São Paulo. The IBGE and the MMA mapped vegetation types for the whole Cerrado biome. These studies used Landsat images from the year 2004 at a scale of 1:250,000, which is not enough to detect the gradients of the Cerrado biome. The mapping of vegetation types in transition zones is still a challenge, because these zones do not have clear borders [74]. However, these regions, where 75% of the deforestation in the Amazon occurs, play an important role in the conservation of the Amazon and Cerrado biomes.

Conclusions
In this paper, we evaluated the use of optical and radar remote sensing for mapping different types of vegetation in the transitional area between the Cerrado and Amazon biomes. The method described in this study improved the mapping of vegetation types in the Arc of Deforestation in the Cerrado biome and can be applied to create accurate vegetation type maps. We evaluated four different sensors, one optical sensor (Sentinel 2) and three radar sensors (Sentinel 1, ALOS-PALSAR 2, and TanDEM-X), for better vegetation type identification and area discrimination, so that the results can be used for better calculations of biomass loss and carbon storage in the highly dynamic Arc of Deforestation in Brazil.
When applying a supervised Random Forest classification, the highest overall accuracy and kappa coefficient were obtained using only Sentinel 2A images. However, of the three classifications that obtained the highest overall accuracy and kappa values, two used radar and optical images. Bands 5, 11, and 12 of Sentinel 2A, texture images from Sentinel 1A cross-polarization, and the coherence of TanDEM-X were the most important images for separating the classes, as calculated by the Random Forest variable importance. The combination of optical and radar sensor data usually improves vegetation classification. Nevertheless, in our study, the single use of optical sensors was sufficient to discriminate the four forest classes in the study area, namely cerradão (Open Forest), cerrado denso (Dense Woodland), gallery forest, and secondary forest, in a highly fragmented biome of complex vegetation. Such information is relevant for the upcoming mapping of vegetation types in the endangered Cerrado/Amazon ecotone.