Geographic Object-Based Image Analysis Framework for Mapping Vegetation Physiognomic Types at Fine Scales in Neotropical Savannas

Abstract: Regional maps of vegetation structure are necessary for delineating species habitats and for supporting conservation and ecological analyses. A systematic approach that can discriminate a wide range of meaningful and detailed vegetation classes is still lacking for neotropical savannas. Detailed vegetation mapping of savannas is challenged by seasonal vegetation dynamics and substantial heterogeneity in vegetation structure and composition, but fine spatial resolution imagery (<10 m) can improve map accuracy in these heterogeneous landscapes. Traditional pixel-based classification methods have proven problematic for fine spatial resolution data due to increased within-class spectral variability. Geographic Object-Based Image Analysis (GEOBIA) is a robust alternative method to overcome these issues. We developed a systematic GEOBIA framework accounting for both spectral and spatial features to map Cerrado structural types at 5-m resolution. This two-step framework begins with image segmentation and a Random Forest land cover classification based on spectral information, followed by spatial contextual and topological rules developed in a systematic manner in a GEOBIA knowledge-based approach. Spatial rules were defined a priori based on descriptions of environmental characteristics of 11 different physiognomic types and their relationships to edaphic conditions represented by stream networks (hydrography), topography, and substrate. The Random Forest land cover classification resulted in 10 land cover classes with 84.4% overall map accuracy and was able to map 7 of the 11 vegetation classes. The second step resulted in mapping 13 classes with 87.6% overall accuracy, of which all 11 vegetation classes were identified. Our results demonstrate that 5-meter spatial resolution imagery is adequate for mapping land cover types of savanna structural elements.
The GEOBIA framework, however, is essential for refining land cover categories to ecological classes (physiognomic types), leading to a higher number of vegetation classes while improving overall accuracy.

Monitoring patterns and trends in tropical savannas still faces major uncertainties related to their definition and classification [1][2][3][4]. This uncertainty is reflected both in general land cover classification and maps featuring vegetation physiognomic types (e.g., life form, vegetation cover). For example, savannas are poorly defined in global land cover products, and variation in their physiognomic types is not well classified at local scales [2,5,6]. Most current technology featuring moderate to coarse spatial resolution (>10 m) fails to resolve the fine-scale heterogeneity of savannas. Major issues in their discrimination relate to growth patterns (associated with seasonality of contrasting dry and wet seasons) and to admixtures of life forms and land cover categories at operational sensor scales [7,8].
Savannas occupy a significant area of the tropics, covering approximately 20% of the world's land surface [3,9]. Tropical savannas, for example the Argentinian Chaco, the African Miombo, and the Brazilian Cerrado, are often intermixed with riparian forests, swamps, and marshes [9]. They are composed of a herbaceous stratum in a discontinuous tree and shrub cover of varying height and density [2,3,10]. The Cerrado, a neotropical savanna in Brazil, is the most floristically diverse savanna in the world, with more than 12,000 plant species [11], including numerous endemics [12,13]. Moreover, the Cerrado provides critical ecosystem services such as carbon storage [14] and plays a major role in provision of water resources by hosting the headwaters of the three largest watersheds in South America.
Land cover mapping of savannas has been conducted mostly at regional scales, using optical sensors available at moderate (10-500 m) to coarse (>500 m) spatial resolution, such as the Landsat series and the Moderate Resolution Imaging Spectroradiometer (MODIS). Most studies focus on multitemporal analysis for change detection [15][16][17][18], deforestation monitoring [19][20][21][22], and land surface phenology [23][24][25]. Specific challenges to savanna land cover classification are related to: (a) high sensitivity to sensor resolution due to discontinuous tree canopy cover [26]; (b) high seasonal variation in ecosystem properties, cloud cover, and data availability [27]; and (c) smoke and haze due to frequent fires in the dry season [28].
As for other savannas, discriminating spectrally similar shrubs from trees with moderate-to-coarse resolution imagery has proven challenging for the Brazilian Cerrado [23,29]. Sano et al. [30] used image segmentation and visual interpretation of Landsat to produce a map of natural and converted areas for the entire Cerrado region. Other Landsat-based studies have focused on local sites to investigate methods for mapping fractional woody cover, such as spectral unmixing [29], and Support Vector Machine classification of multi-year phenologic profiles based on the Tasseled Cap Transform [25]. Several studies took advantage of multi-temporal rather than single-date imagery to overcome spectral similarities in woody cover using characteristic phenological patterns [17,23,25,31,32]. Although these approaches are useful for broad-scale analyses, they depict coarse structural vegetation classes [25,33,34,35] and cannot resolve the structural heterogeneity essential for regional biodiversity and ecosystem assessments [36].
A critical problem in mapping Cerrado physiognomic types concerns the definition of classes. Most previous remote sensing studies considered a widely adopted vegetation nomenclature for the Cerrado physiognomies based on structural attributes and floristic composition (see Ribeiro and Walter [37]). Vegetation maps exhibiting the diverse structural variation in vegetation types are critical for representing fine-scale savanna habitat patterns. Spectrally based remote sensing analyses based on floristic classification systems (such as Ribeiro and Walter [37]) may not succeed in identifying structural differences in vegetation and may require extensive field work for species identification. Thus, they may not be suitable for regional scale mapping using multispectral imagery classification alone. Geographical characteristics related to edaphic conditions (e.g., topography, soils), however, can potentially help identify some physiognomic types not strictly based on species composition.
Remote sensing imagery at fine (<10 m) spatial resolution can better capture the diversity in vegetation structure of heterogeneous and complex savanna landscapes [36,38,39,40,41]. However, traditional pixel-based classification methods have proven problematic for fine spatial resolution data due to increased within-class spectral variability, potentially leading to inconsistent results [42,43]. Geographic Object-Based Image Analysis (GEOBIA) bridges remote sensing and Geographic Information Science by defining image objects as entities and focusing on the conceptual modeling of defined land cover classes at multiple scales. Thus, GEOBIA is a robust alternative approach to address within-class spectral variability issues in land cover classification of high spatial resolution imagery and heterogeneous landscapes [42].
Remote Sens. 2019, 11, x; doi: FOR PEER REVIEW www.mdpi.com/journal/remotesensing
GEOBIA is based on extracting information from Earth Observations using spectral, spatial, structural, and hierarchical properties of an image [44]. A fundamental step is to delineate objects of interest, which are strongly associated with image segmentation approaches that cluster relatively homogenous pixels into image objects. One of the significant advantages of the GEOBIA approach is that image objects provide not only diverse spectral information (e.g., mean values per band, standard deviation, mean ratios) but also additional spatial information, such as distance, neighborhood, and topological metrics [42,45]. The combination of spectral and spatial properties allows incorporation of contextual information of a given object using ontologies/semantics to create hierarchical conditional rules tailored to classify meaningful object definitions in a knowledge-based classification [42,46].
Efforts to map fine-scale structural variation in savanna ecosystems using GEOBIA have obtained encouraging results compared to moderate spatial resolution data and pixel-based methods [26,39,41,47,48]. GEOBIA is also increasingly used for Cerrado studies due to the recent availability of fine (5 m) spatial resolution imagery from the RapidEye sensor at no cost for Brazilian researchers. Initial efforts to evaluate the utility of high spatial resolution imagery for discriminating and mapping Cerrado physiognomic types have demonstrated improved discrimination of structural classes and higher map accuracy compared to coarser resolution imagery [49][50][51]. Such efforts include using supervised object-based classification with several input object features in a GEOBIA context, such as in Girolamo-Neto [49] and Girolamo-Neto et al. [41]; or strict knowledge-based classification by defining conditional rules based on shape and brightness parameters, such as in Teixeira et al. [51].
However, these studies were tested at sites with limited extent (< 50,000 ha) such as Brasilia National Park, which does not include some major vegetation structural types known for causing misclassification errors (i.e., semi-deciduous versus deciduous forest [51]), or featured coarse vegetation classes as opposed to detailed physiognomic types. A systematic approach that can discriminate a wide range of meaningful and detailed vegetation classes is still lacking for the Cerrado biome.
The primary goal of this study is to develop a systematic framework to discriminate detailed Cerrado physiognomic types in a semi-automatic manner using single-date high spatial resolution imagery. The rationale for mapping detailed physiognomic types at fine scales stems from the potential of such maps to (1) improve our understanding of species habitat requirements and conditions, as well as our ability to assess ecosystem services and biodiversity [36], and (2) provide improved inputs for fire modeling, carbon accounting [52], landscape restoration [53], and land-use management [54]. Our approach takes advantage of GEOBIA and semantics to combine land cover classes and edaphic conditional drivers in the definition of hierarchical contextual rules used to classify a wide range of Cerrado physiognomic types. The specific research questions we aim to answer with this framework are: What accuracies are achievable using spectral information alone? What accuracies are achievable adding spatial context information? How can the widely adopted Cerrado physiognomic types nomenclature be used in a remote sensing analysis? We address these questions using RapidEye imagery (5 m) in a two-step GEOBIA framework that begins with a supervised object-based land cover classification based on spectral information alone, followed by assignment of spectral land cover classes to more detailed physiognomic types using a novel hierarchical spatial and topological ruleset defined by semantics (i.e., descriptive assessment and knowledge). This approach takes advantage of ancillary information on hydrography, topography, and substrate as environmental conditional drivers in the semantic definition of hierarchical contextual rules. This GEOBIA framework was tested for two large study sites covering most major Cerrado physiognomic types. 
Its main advantages relate to its reproducibility across different areas of this heterogeneous biome, its capacity to discriminate a wide variety of physiognomic types that could not be distinguished in previous studies, and its adaptability to other physiognomic types and to other types of optical imagery.

Study Sites
The Cerrado has a tropical climate characterized by an October-April wet season and May-September dry season, when rainfall can be close to zero [55]. Plant distributions across the Cerrado are mostly determined by topography, soil texture, nutrient content and depth, fire regime, and water availability [56,57]. Spatial variation in these environmental conditions results in high beta diversity across the biome as well as large variability of physiognomic types over relatively small distances [37,58].
We chose two study sites (Figure 1) to test our classification framework and compare its accuracy in discriminating savanna vegetation with differing landscape composition and surface heterogeneity. The sites were chosen based on their ecological importance for conservation, their differences in composition and beta diversity, and a combination of imagery and ancillary data availability.
We initially tested our method at the Taquara site (Figure 1), a study site for which we had high-quality orthophotos (24 cm resolution) and a greater availability of ground reference and ancillary data that were important for testing our ability to visually identify physiognomic types using air photos when collecting training data. The Taquara site (15°54' S, 47°55' W to 15°58' S, 47°50' W) comprises an area of 67.3 km² located approximately 26 km from downtown Brasília, covering most of the Taquara watershed and its surroundings. This site also contains the Brazilian Institute of Geography and Statistics (IBGE) Ecological Reserve, a protected area created to act as a biodiversity control site for comparison to other Cerrado areas altered by human occupation. The IBGE Ecological Reserve served as one of the Cerrado sites included in the Large Scale Biosphere-Atmosphere Experiment in Amazonia (LBA) [59], and was the first International Long Term Ecological Research (ILTER) site in the Cerrado biome.
The watershed is located in the Cenozoic bed from the Paranoa Group and terrain is relatively flat, with elevation in the region varying between 1040 and 1196 meters. Mean annual precipitation is 1426 mm and mean annual temperature is 23 °C [60]. The site has considerable diversity of plants and soil types, representing most of the typical physiognomic types found across the Cerrado. Soils are mostly acidic, low fertility Oxisols (Latosols) supporting savanna ecosystems. Organic, nutrient-rich hydromorphic soils occur locally in the area and often support forest ecosystems. Most of the site is covered by savanna ecosystems, but grasslands located on small hills, and gallery forests with surrounding wetlands following small streams are also present.
We also tested whether our method could be applied to a larger and even more heterogeneous site with minimal ground reference and ancillary data. This study site comprises an area of 9409 km² and is located on the western side (11°43' S, 45°52' W to 12°36' S, 44°59' W) of the São Francisco River watershed, the largest river basin entirely located in Brazilian territory (Figure 1). The Western Bahia site is not only naturally heterogeneous but also has considerable complexity due to historical land use conversion of natural savanna to pasture and row crop agriculture [18]. Elevation ranges from a maximum of 808 m across karstic mesas/plateaus (known as Chapadões do São Francisco) to 433 m in the lowest point in the São Francisco Depression, with annual precipitation ranging from 800 mm at lower elevations to 1600 mm at highest elevations [54,62]. The plateaus are composed of Proterozoic rocks from the Bambui Group and Cretaceous beds from the Urucuia Group.
Diverse soils include deep well-drained Oxisols (Latosols) of medium texture in the highest parts of the plateau and sandy texture (sandy quartz) on irregular terrain, rocky soils (Lithosols) of sandy to medium texture on steep slopes and escarpments, and hydromorphic/organic soils across floodplains [62]. This variety of edaphic conditions supports diverse vegetation types mostly consisting of savanna ecosystems across the plateaus, wetlands and riparian vegetation along floodplains, and semi-deciduous forest restricted to cliffs and to the eastern part of the plateau, which is possibly due to local concentrations of calcium carbonate in the soil and higher moisture conditions [62]. In general, lower elevation sites have greater physiognomic diversity compared to the plateaus [63].
Common tree species within the savanna ecosystem include Anacardium occidentale and Miconia ferruginea, and the grass layer is dominated by annual species such as Ichnanthus hoffmannseggii [64]. Seasonally dry tropical forests are mostly found along escarpment slopes, and are composed of deciduous or semideciduous tree species such as Astronium urundeuva, Piptadenia macrocarpa, Chorisia speciosa, Tabebuia spp., Cavanillesia arborea, and Cedrela fissilis [65].

Methods Overview
The GEOBIA framework used in this study is divided into two major steps (Levels 1 and 2), in addition to pre- and post-processing stages: (a) pre-processing; (b) land cover classification (Level 1 processing); (c) physiognomic types classification (Level 2 processing); and (d) area estimates and accuracy assessment (statistics). These procedures are shown and described in detail in Figure 2.

Definition of Classes
The Cerrado physiognomic types have been defined by many authors such as Coutinho [66], Eiten [67], and Oliveira-Filho and Ratter [57]. The most recent vegetation terminology proposed by Ribeiro and Walter [37] has been widely adopted by the scientific community in Brazil. This scheme, however, is based on criteria such as environmental edaphic conditions and species composition that are not reliably detected by multispectral sensors. Thus, translating these on-the-ground Cerrado classification schemes to a land cover classification derived from remote sensing is a challenging task.
The United Nations Food and Agriculture Organization's (FAO) Land Cover Classification System (LCCS) is a flexible and systematic framework designed for land cover classification terminology at any given scale and for any data source [68]. The LCCS framework defines classes at different levels, starting with broad distinctions (e.g., Primarily Vegetated Areas, Primarily Non-Vegetated Area) within a dichotomous key and then adds specific attributes through a hierarchical framework (e.g., life form, cover, height). Classes are then defined as a function of the intended level of detail (scale) for the land cover classification based on a combination of the spatial and spectral resolution of the imagery, which makes the LCCS appropriate for object-based classification [69].
To allow for standardization among Cerrado classes, we used the LCCS as a reference in the first classification level. The classes were defined a priori, based on a literature review of major physiognomic types found across the Cerrado. In accordance with the RapidEye spatial and spectral characteristics, the quality of the images (e.g., off-nadir viewing angles), and the recommended scale for mapping Cerrado physiognomic types [37], the map scale was defined as 1:25,000. We followed the LCCS criteria based on dominant life form, vegetation cover, and structure, as well as water seasonality (Figure 3). In the second classification level (Figure 3), we followed the nomenclature described by Ribeiro and Walter [37] for our map legend of physiognomic types (Table 1).
Figure 3. Level 1 represents the Land Cover Classification System (LCCS) classes defined for the Cerrado biome, which is appropriate for mapping with multispectral imagery at fine spatial scales (<10 m). Level 2 represents the corresponding physiognomic types for each LCCS class. The arrows represent the corresponding physiognomic type category (Level 2) derived from the LCCS classification (Level 1).
Most RapidEye imagery used in this study was acquired in the 2011 dry season, representing the imagery with the best quality available. The commercial RapidEye sensor, launched in 2009, operates in a constellation of five satellites in the same orbit, providing multispectral images with a spatial resolution of 6.5 meters resampled to a 5-meter grid (at Level 3A), and a tile size of 25 km by 25 km. The sensor has a swath width of 77 km, daily off-nadir coverage, and radiometric resolution of 12 bits, scaled up to a 16-bit dynamic range. RapidEye's spectral resolution covers the visible and near-infrared bands ranging from 440 to 850 nm, including a red-edge band (690 to 730 nm). The imagery is available through the Ministry of Environment (MMA) Geocatalog and is accessible to Brazilian researchers at no cost.
The collection covers the entire country and is composed of varying off-nadir angles and temporal coverage, which is limited to inconsistent dates mostly available for the years 2011 through 2015, depending on the area of interest.
Although the orthorectified Level 3A RapidEye product is provided with radiometric, geometric, and terrain corrections, additional corrections were made for improving consistency in the product. Atmospheric corrections and reflectance retrieval were performed using ACORN 4.0 software for all individual imagery tiles. Calibration files corresponding to image spectral response, gain, and offset, as well as acquisition parameters, were created from the metadata provided. Water vapor and atmospheric visibility parameters were determined by a trial-and-error analysis of a dark object reflectance (e.g., pure water pixels) and following the ACORN user guide suggestions for areas of dry conditions.
In total, we used 17 RapidEye imagery tiles. One tile corresponds to most of the Taquara watershed, covering the IBGE Ecological Reserve and its surroundings (Figure 1). After applying the atmospheric correction and reflectance retrieval, the image was subset to the bounding box extent of  (Figure 1). The pre-processing steps were applied to individual imagery tiles, and tiles with the same acquisition date were mosaicked. The individual mosaics were processed separately, at both classification levels, and merged together to derive statistics for the complete study site (i.e., accuracy assessment and landscape composition). The Taquara tile was acquired on August 11th, 2013, and the Western Bahia tiles were acquired from June to October 2011. Details of imagery tiles, dates, and sensor viewing-angle characteristics are summarized in Table S1.
2.2.3. Level 1 Classification: Major Land Cover Types

1) Segmentation
Our GEOBIA approach starts with segmenting the images into homogeneous image objects to ensure neighboring pixel similarity at an adequate scale. The segmentation was performed through the multi-resolution segmentation (MRS) algorithm, proposed by Baatz and Schäpe [70] and implemented in eCognition Developer 8.0 software. For all images, the segmentation used all five spectral bands of the RapidEye image, with higher weights for near-infrared (NIR) (760-850 nm), red-edge (690-730 nm), and red (630-685 nm) bands due to their importance in discriminating vegetation types [23,71]. The multi-resolution algorithm implemented in eCognition uses a bottom-up merging approach as an optimization procedure to identify similar, homogeneous, neighboring pixels and cluster them into a single object. eCognition accounts for a scale parameter as a level of aggregation of image objects and uses a stop criterion in the optimization algorithm. Thus, the scale is a crucial part of GEOBIA, as it defines the size of image objects as well as their level of heterogeneity. The multi-resolution algorithm also accounts for shape and compactness parameters. We visually inspected multiple combinations of scale, shape, and compactness parameters and selected a combination of scale = 10, shape = 0.3, and compactness = 0.7. These parameters are in accordance with other GEOBIA studies that use high spatial resolution imagery and have small image object scale [72].
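To make the role of the scale, shape, and compactness parameters concrete, the block below writes out the merge criterion as it is commonly described for multi-resolution segmentation in the literature following Baatz and Schäpe [70]. It is a sketch under stated assumptions rather than a statement of the exact eCognition implementation: the symbols (n for object size in pixels, sigma for the per-band standard deviation, l for perimeter, b for bounding-box perimeter, subscripts 1 and 2 for the two candidate objects and m for their merge) are notation introduced here, and the per-band weights w_c stand for the higher NIR, red-edge, and red weights whose values are not given in the text.

```latex
% Assumed mapping of the reported parameters:
%   shape = 0.3        ->  w_color = 1 - 0.3 = 0.7
%   compactness = 0.7  ->  w_cmpct = 0.7
%   scale = 10         ->  merging stops once f exceeds s^2 = 100
\begin{align}
f &= w_{\mathrm{color}}\, h_{\mathrm{color}} + (1 - w_{\mathrm{color}})\, h_{\mathrm{shape}} \\
h_{\mathrm{color}} &= \sum_{c} w_c \left[ n_m \sigma_{c,m} - \left( n_1 \sigma_{c,1} + n_2 \sigma_{c,2} \right) \right] \\
h_{\mathrm{shape}} &= w_{\mathrm{cmpct}}\, h_{\mathrm{cmpct}} + (1 - w_{\mathrm{cmpct}})\, h_{\mathrm{smooth}} \\
h_{\mathrm{cmpct}} &= n_m \frac{l_m}{\sqrt{n_m}} - \left( n_1 \frac{l_1}{\sqrt{n_1}} + n_2 \frac{l_2}{\sqrt{n_2}} \right) \\
h_{\mathrm{smooth}} &= n_m \frac{l_m}{b_m} - \left( n_1 \frac{l_1}{b_1} + n_2 \frac{l_2}{b_2} \right)
\end{align}
```

Under this reading, two neighboring objects are merged only while the increase in heterogeneity f stays below the squared scale parameter, which is why a small scale value such as 10 yields small, internally homogeneous objects.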

2) Collection of training data and Random Forest model
The training data were collected through a process of visual interpretation assisted by orthophotos and Google Earth images covering both study sites. We defined standard parameters for visual interpretation of the classes, which were assisted by ancillary data (such as other vegetation maps available for the sites) and one field excursion conducted in the dry season of 2018 to each site for confirmation of class categories in areas that were still unclear after examining the available resources.
Training samples were collected in proportion to class abundance, with abundant classes having a higher number of training data compared to classes that were rare across the landscape (Table S2). All training was done at the object scale, in which each sample corresponds to an image object generated in the segmentation process. Spectral variables (e.g., statistics and indices), also known as object features in GEOBIA, were attributed to each training polygon (Table 2), including three indices: Normalized Difference Vegetation Index -NDVI [73], Normalized Difference Vegetation Index with red-edge band -NDVI-RE [71], and Normalized Difference Water Index -NDWI [74].
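As an illustration of how the three index features can be derived from an object's mean band reflectances, the following minimal Python sketch computes NDVI, NDVI-RE, and NDWI as normalized differences. The band key names and the example reflectance values are hypothetical, and the NDWI form used here (green versus NIR) is an assumption consistent with the bands available on RapidEye; the source does not spell out the formulas.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def object_indices(mean_reflectance):
    """Compute index features from per-object mean reflectances.

    `mean_reflectance` is a dict with the (hypothetical) keys
    'green', 'red', 'red_edge', and 'nir'.
    """
    green = mean_reflectance["green"]
    red = mean_reflectance["red"]
    red_edge = mean_reflectance["red_edge"]
    nir = mean_reflectance["nir"]
    return {
        "ndvi": normalized_difference(nir, red),          # NIR vs. red
        "ndvi_re": normalized_difference(nir, red_edge),  # NIR vs. red-edge
        "ndwi": normalized_difference(green, nir),        # assumed green vs. NIR form
    }


# Hypothetical mean reflectances for one vegetated image object.
example_object = {"blue": 0.04, "green": 0.06, "red": 0.05,
                  "red_edge": 0.15, "nir": 0.35}
indices = object_indices(example_object)
```

In practice these values would be computed for every segment and appended to the object feature table alongside the per-band statistics.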
All training samples and their respective statistical attributes were used as input to the 'randomForest' package for R (run in RStudio), developed by Liaw and Wiener [75] based on Breiman [76]. All parameters of the random forest classification algorithm were left at their defaults, which resulted in a model based on 500 decision trees used to classify all image objects derived from the multi-resolution segmentation. The result of this process is the Level 1 land cover classification based on the LCCS land cover classes.
In accordance with our goal of classifying physiognomic types in a semi-automatic manner, we developed a series of spatial contextual rules for each study site in order to refine the Level 1 land cover map (Table 3). We combined the Level 1 LCCS classification with hydrographic data (stream networks and hydromorphic soils) developed by Ribeiro [77] for the Taquara site (scale 1:10,000); and by the Laboratory of Spatial Information Systems-LSIE at the University of Brasília (Brazil), in partnership with the Inter-American Institute of Commerce and Agriculture and the Brazilian Ministry of National Integration, for the Western Bahia site (scale 1:2,000). Slope and elevation were derived from the NASA Digital Elevation Model-NASADEM [78] available at a resolution of 1 arcsec (approximately 30 m).
The spatial rules were developed based on environmental characteristics of the vegetation physiognomic types described in Pereira and Furtado [61], Nou and Costa [62], and Ribeiro and Walter [37], in addition to personal and expert knowledge of the study sites. Specific elevation and slope thresholds were based on recommendations from the Brazilian Agricultural Research Corporation-Embrapa [79]. The same contextual and topological rules were applied for both study sites, except that thresholds used for elevation and slope were adapted to each site's characteristics.
Table 3. Spatial contextual rules used to characterize LCCS land cover classes into physiognomic types; rules and classes with "*" were only applied for the Taquara watershed site, and rules with "**" were only applied for the Western Bahia site due to the absence of these classes in the other study site.

[Table 3 columns: Level 1 Classes; Spatial Rules; Level 2 Classes. First Level 1 entry: Closed Canopy.]

The combination of mountainous terrain (e.g., escarpments/cliffs) and fine spatial resolution resulted in a high presence of shadows in the Western Bahia site imagery. However, RapidEye's spectral resolution does not allow us to distinguish shadows from water, a well-known source of confusion in remote sensing and multispectral high resolution images [35]. We therefore developed a "shade" mask using a topological rule in eCognition. We also used visual interpretation to create a "cloud" and "shade from cloud" mask, and a "water body" mask was created for the Taquara site, given that this class was rare and small enough (<0.01%) to not be included in the model. Because we were exclusively interested in natural areas, a land use mask from our database (data from [18,77]) was used in each study site to exclude paved roads, agricultural, and urban areas from validation.
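The Level 2 refinement described above, in which a Level 1 land cover class is reassigned to a physiognomic type according to spatial context, can be sketched as a small set of conditional rules. The example below is purely illustrative: the attribute names, the distance and slope thresholds, and the specific class assignments are hypothetical placeholders and do not reproduce the rules in Table 3.

```python
from dataclasses import dataclass


@dataclass
class ImageObject:
    """Per-object attributes assumed to be available after Level 1."""
    level1_class: str           # LCCS land cover class from Random Forest
    dist_to_stream_m: float     # distance to nearest mapped stream
    on_hydromorphic_soil: bool  # overlaps the hydromorphic soil layer
    slope_pct: float            # mean slope derived from the DEM


# Hypothetical thresholds, for illustration only.
STREAM_BUFFER_M = 100.0
STEEP_SLOPE_PCT = 45.0


def assign_level2(obj):
    """Refine a Level 1 class into a Level 2 label with contextual rules."""
    if obj.level1_class == "closed canopy":
        # Rule 1 (illustrative): closed canopy near streams or on wet soils.
        if obj.dist_to_stream_m <= STREAM_BUFFER_M or obj.on_hydromorphic_soil:
            return "gallery forest"
        # Rule 2 (illustrative): closed canopy on steep escarpments.
        if obj.slope_pct >= STEEP_SLOPE_PCT:
            return "seasonally dry tropical forest"
        return "closed canopy (unrefined)"
    # Classes without a matching rule keep their Level 1 label.
    return obj.level1_class


riparian = ImageObject("closed canopy", 30.0, False, 5.0)
escarpment = ImageObject("closed canopy", 800.0, False, 60.0)
labels = [assign_level2(riparian), assign_level2(escarpment)]
```

Ordering the rules hierarchically, as here, means the first matching condition wins, so site-specific thresholds can be adjusted without changing the rule structure.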

Accuracy Assessment Procedures
Measuring thematic map accuracy is a crucial step to determine error sources and calculate producer and user map accuracies. Moreover, it is a way to analyze potential weaknesses and strengths of classification methods. However, it is not a straightforward task and can include many uncertainties [80][81][82][83][84]. In traditional pixel-based classification, thematic accuracy is assessed by estimating the proportion of correctly classified pixels for each class. This approach assumes that pixels have the same size, and thus, one can estimate the proportion of area correctly classified [83]. However, it is recommended that image objects are used as sampling units in object-based accuracy assessments, instead of the traditional point-sampling from pixel-based classification [85]. Image objects have variable areas across the landscape in GEOBIA-derived thematic maps, leading to a greater impact on error estimates from large misclassified objects than from small ones. Thus, object area should be accounted for in accuracy assessments of GEOBIA classification [86].
The Random Forest algorithm generates an out-of-bag (OOB) error estimate using subsampling and bootstrapping, accounting for samples not used as training in the model [76]. To minimize inflated accuracies from the OOB error due to spatial autocorrelation, we performed additional independent validation for both study sites. This independent validation was done by comparing randomly selected polygons from our classified images (excluding training samples) to a series of ancillary data, including orthophotos, Google Earth imagery, and digital photographs (taken on the ground), when available, following recommendations from Richards [83]. Given the lack of fine-scale time series imagery available for the entire landscape, experts with local knowledge of the sites were also consulted to validate the classes that are influenced by seasonality. For instance, the natural seasonality of seasonally dry tropical forest and semi-deciduous forest required local knowledge when the time series of Google Earth imagery was not available. We performed the sampling selection using the original segments/objects derived from eCognition, which contain information at both map levels. The number of samples for each category was determined based on the final map (physiognomic types), but accuracy estimates were performed for both levels using the same polygon. We used an equal proportion of randomly selected polygons for categories that were sparsely represented across the landscape (≤10%), which resulted in a total of 50 polygons per class. For classes with high landscape abundance (i.e., non-natural/barren, open savanna, savanna for both sites), we performed a stratified random selection based on a total number of samples of 225 for each site. The classes "semi-deciduous forest" and "shrub swamp" were not accounted for in the validation process for the Taquara site because they each represent less than 0.5% of the landscape. 
In total, we selected 725 polygons for the Western Bahia site and 525 polygons for the Taquara study site.
A common issue in estimating accuracy from thematic maps is the potential error that can be included in the reference data [81,82,87]. To minimize error and bias from the interpreter in the accuracy assessment, our validation procedures were performed by two authors trained in photointerpretation of the regions and with previous experience working in the Cerrado. Error matrices were generated for each map level using an area-weighted approach based on independent sampled image objects [69,86]. The traditional count-based accuracy assessment was also performed for comparison (Tables S5-S8). The error matrices were used to derive traditional statistical accuracy measures for both map levels, such as overall agreement, user's accuracy, and producer's accuracy.
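The difference between area-weighted and count-based accuracy can be illustrated with a minimal Python sketch that accumulates the area of each validation object into an error matrix and derives overall, user's, and producer's accuracy. This is a simplified illustration, not the full estimator of Radoux et al. [69]; the function name, class labels, and sample values are hypothetical.

```python
from collections import defaultdict


def area_weighted_accuracy(samples):
    """Build an area-weighted error matrix from validation objects.

    `samples` is an iterable of (mapped_class, reference_class, area_ha).
    Returns (overall, users, producers), where users and producers are
    dicts keyed by class name.
    """
    matrix = defaultdict(float)
    for mapped, reference, area in samples:
        matrix[(mapped, reference)] += area

    classes = sorted({c for pair in matrix for c in pair})
    total_area = sum(matrix.values())
    correct_area = sum(matrix.get((c, c), 0.0) for c in classes)
    overall = correct_area / total_area

    users, producers = {}, {}
    for c in classes:
        mapped_area = sum(matrix.get((c, r), 0.0) for r in classes)
        reference_area = sum(matrix.get((m, c), 0.0) for m in classes)
        agree = matrix.get((c, c), 0.0)
        users[c] = agree / mapped_area if mapped_area else None
        producers[c] = agree / reference_area if reference_area else None
    return overall, users, producers


# Hypothetical validation objects: (mapped, reference, area in hectares).
validation = [
    ("forest", "forest", 10.0),
    ("forest", "savanna", 2.0),
    ("savanna", "savanna", 6.0),
    ("savanna", "forest", 2.0),
]
overall, users, producers = area_weighted_accuracy(validation)
count_based = sum(m == r for m, r, _ in validation) / len(validation)
```

In this toy example the two correctly classified objects are the largest ones, so the area-weighted overall agreement (0.8) is higher than the count-based value (0.5); the reverse would hold if large objects were the misclassified ones.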

Segmentation Results
The MRS algorithm generated a different number of image objects (Figure 4) for each mosaic or image tile processed, which is expected since they have different extents. The imagery tiles acquired on September 16th and September 13th resulted in 384,011 and 389,664 objects, respectively. The August and October mosaics have a similar extent (4 image tiles each) and resulted in 1,157,419 and 979,083 objects, respectively. The June mosaic contains 6 image tiles and thus resulted in a much larger number of objects, a total of 2,804,338.

Accuracy Assessment
The OOB error for the Taquara site was 7.8%. Because the Random Forest classification was performed by mosaic for the Western Bahia site, OOB error estimates were generated for each mosaic. The September 13th image had the lowest OOB error (3.5%). The October mosaic had the second lowest OOB error (4.4%), followed by the September 16th imagery (6.3%), the August mosaic (7.0%), and the June mosaic (7.3%).
These relatively small differences in the OOB error estimates could be due to a combination of reasons, such as atmospheric conditions on a particular day (e.g., active fire was present in the June mosaic, and haze was present in the September 16th imagery), possible rain close to the imagery date, differences in sensor viewing angle, and particular characteristics of the classified image (e.g., one image can contain more disturbed areas and higher heterogeneity than another).
The mean decrease in accuracy is a percent estimate of variable importance in the Random Forest model (Figure 5). The NDVI was the most important variable in all models, except in the June mosaic, in which NDWI had the highest importance. This is likely due to the water content available in the soil during the early dry season (i.e., June), whereas later in the dry season, some physiognomies (i.e., grasslands and shrublands) are more impacted by water limitation. The other indices, NDVI (red-edge) and NDWI, also contributed significantly (>12%) in all models. Considering only the RapidEye spectral bands, the near-infrared (band 5) had a contribution above 13% in all models, whereas the red and red-edge bands had high importance (>15%) in the October and June mosaic models, respectively.
Additional accuracy estimates based on the independent randomly selected image objects were performed for the entire map and not for individual mosaics. Error matrices were developed for each map level (Tables 4 and 5) and were reported as the area (in hectares) of each image object, following best practices for object-based accuracy assessment proposed by Radoux et al. [69]. Statistical measures of overall agreement, as well as user's and producer's estimates, were derived from the error matrices (Tables 4 and 5). For comparison, we also generated error matrices and statistical measures based on the regular polygon count approach (Tables S5-S8). Accuracy assessments for the Taquara watershed are found in the supplementary material (Tables S3-S4).
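For readers less familiar with the spectral indices ranked in the variable importance analysis (Figure 5), the sketch below shows how they could be computed from mean object reflectance. The formulations are assumptions, since this section does not state them: NDVI uses the near-infrared and red bands, the red-edge NDVI replaces red with the red-edge band, and NDWI follows the green/near-infrared form. The reflectance values are invented.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    denominator = a + b
    return (a - b) / denominator if denominator != 0 else 0.0


def spectral_indices(green, red, red_edge, nir):
    """Compute the three indices from per-object mean reflectance.

    The band combinations are assumed here, not taken from the paper.
    """
    return {
        "NDVI": normalized_difference(nir, red),
        "NDVI_red_edge": normalized_difference(nir, red_edge),
        "NDWI": normalized_difference(green, nir),
    }


# Invented mean reflectance for a vegetated image object.
indices = spectral_indices(green=0.08, red=0.06, red_edge=0.15, nir=0.30)
for name, value in indices.items():
    print(f"{name}: {value:.3f}")
```

Each index is bounded between -1 and 1, which keeps the predictors on a comparable scale across image dates.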

Landscape Composition: Area Assessments
We estimated landscape composition (Table 6) for the total area of the Western Bahia site, as well as for each mosaic, based on the final (Level 2) physiognomic type map (Figure 6). The most abundant classes in the landscape are non-natural/barren areas, open savanna, and savanna. The rarest physiognomic types are seasonally dry tropical forest, shrub swamp, and palm swamp. Considering the individual mosaics, the June mosaic and the September 13th image have the highest amount of natural vegetation, whereas the October mosaic has the highest concentration of non-natural/barren areas. The Level 2 classification map and landscape composition of the Taquara watershed are found in the supplementary material (Figure S1, Table S9).
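Landscape composition of this kind is a simple aggregation of object areas by class. A minimal sketch, assuming each classified image object carries its Level 2 label and its area in hectares (the values below are invented and do not reproduce Table 6):

```python
from collections import Counter


def landscape_composition(objects):
    """Return the percentage of total mapped area covered by each class.

    objects: iterable of (physiognomic_type, area_ha).
    """
    area_by_class = Counter()
    for physiognomic_type, area_ha in objects:
        area_by_class[physiognomic_type] += area_ha
    total_area = sum(area_by_class.values())
    return {cls: 100.0 * area / total_area for cls, area in area_by_class.items()}


# Invented image objects: (Level 2 class, area in ha).
objects = [
    ("open savanna", 250.0),
    ("open savanna", 200.0),
    ("savanna", 300.0),
    ("non-natural/barren", 200.0),
    ("palm swamp", 50.0),
]
composition = landscape_composition(objects)
for cls, percent in sorted(composition.items(), key=lambda item: -item[1]):
    print(f"{cls}: {percent:.1f}%")
```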

Discussion
Land cover mapping in the Cerrado has generally used multispectral imagery with medium to coarse spatial resolution and pixel-based approaches, such as in Muller et al. [17], Schwieder et al. [25], Ferreira et al. [29], and Reynolds et al. [88]. As demonstrated by Sano et al. [30] and Schwieder et al. [25], these types of imagery do not capture the fine-scale heterogeneity present within the savanna ecosystem gradient and thus are not appropriate to discriminate differences in vegetation structure, often leading to low accuracy results (such as 71% and 63%, respectively). Distinguishing the fine-scale heterogeneity of Cerrado physiognomic types is crucial for identifying species habitats and estimating plant diversity. For instance, the Hyacinth Macaw (Anodorhynchus hyacinthinus) is an endangered species with a small population in the Cerrado that depends heavily on palm trees present in wetlands for breeding and foraging; discriminating the different structural types present in seasonal wetlands (i.e., palm swamp, shrub swamp, and marsh) can improve estimates of its occurrence and of habitat quality and availability. Given the challenges of mapping Cerrado physiognomic types with traditional pixel-based methods at medium to coarse spatial resolution, we developed a systematic GEOBIA framework that combines single-date high spatial resolution imagery with a novel environmental spatial ruleset developed to identify a wide range of Cerrado vegetation structural types. This framework was shown to be a robust method to differentiate a larger number of physiognomic types at a higher accuracy than previously reported in several studies of Cerrado land cover mapping.
Our results show an improvement in classification accuracy compared to studies using similar image characteristics and object-based methods to map Cerrado physiognomic types, such as in Girolamo-Neto et al. [41,49] and Orozco-Filho [50]. In the Level 1 LCCS classification, we mapped a total of 10 land cover classes (of which 7 correspond to vegetation types) and reached 84.4% overall accuracy, while others have discriminated 8 classes and reached an overall accuracy of 67.7% [41], or 81% accuracy while considering 7 classes [50]. Girolamo-Neto [49] used the RapidEye imagery in a method similar to ours (i.e., segmentation + RF classifier) in the Level 1 LCCS classification, but with a different class legend, and reached an overall accuracy of 74.3% to classify 5 land cover classes. Previous studies aiming to classify Cerrado physiognomic types used vegetation taxonomy based on structural parameters and species composition defined either by Coutinho [66], IBGE [89], Ribeiro and Walter [37], or a combination of them. The disparity in map accuracy and number of discriminated classes between our LCCS results and previous studies suggests that nomenclatures considering floristic composition (used in most previous Cerrado remote sensing studies) might not be appropriate for multispectral imagery alone, as we initially suspected. It is important to note that standardization of land cover classes, so that they are comparable across scales and appropriate to the imagery characteristics (e.g., spectral and spatial resolutions), is essential to produce accurate and meaningful results. As in other studies aiming to standardize land cover classes for remote sensing applications [90,91], we used the LCCS to define appropriate Cerrado land cover classes for the RapidEye imagery and tested whether the defined classes could be spectrally discriminated and mapped at high accuracy.
Given that there is no other study using LCCS classes for the Cerrado, we cannot compare accuracy assessments for specific classes from our Level 1 classification with other studies. However, our results demonstrate that accounting for RapidEye's spectral information alone accurately discriminates our defined LCCS land cover classes, distinguishing some structural variation within savanna (i.e., open shrubland; dense shrubland; open canopy) and grassland (i.e., herbaceous; shrub-herbaceous) ecosystems. Despite encouraging results, single-date RapidEye spectral properties alone were not able to discriminate variations within forest structural elements (i.e., riparian forest versus semi-deciduous forest) given that most closed-canopy classes (e.g., seasonally dry tropical forest and sclerophyll forest) are composed of broad-leaf semi-deciduous (or deciduous, for a subtype of seasonally dry tropical forest) trees. They could also not differentiate some variations between terrestrial ecosystems and seasonal wetlands. For instance, shrublands (i.e., dense and open shrub) could not be distinguished from wetland shrubs (i.e., shrub swamp). Despite that, grasslands (i.e., herbaceous) were distinguished from marsh (i.e., herbaceous-wet) based only on their spectral properties. This result is consistent with previous studies that demonstrated an improvement in classification accuracy of terrestrial and wetland ecosystems when using multispectral fine spatial resolution imagery [92]. Defining environmental contextual rules in addition to using spectral properties proved an effective strategy for discriminating within-class variations across ecosystems, as confirmed by high accuracy results for those classes. Other works also accounted for such classes but frequently merged similar physiognomic types into one class for higher accuracy estimates.
For instance, our classes "marsh", "palm swamp", and "shrub swamp", if merged, would be equivalent to the class "floodplains with palm trees" in Girolamo-Neto [41], and "veredas" in Orozco-Filho [50]. The same is true for our classes "riparian forest", "seasonally dry tropical forest", and "semi-deciduous forest", which are equivalent, if merged, to the class "forest" in Orozco-Filho [50]. It is known that map accuracy tends to decrease as a function of the number of classes [93]; however, our GEOBIA approach showed an inverse pattern, which is a major contribution of this study. Applying our environmental spatial ruleset to the Level 1 LCCS map resulted in a thematic map (Level 2 classification) with a larger number of classes and a higher overall agreement accuracy. Despite the fact that two classes of the Level 1 LCCS map were merged in the physiognomic types map (both "soil, NPV, impervious" classes became "non-natural/barren"), four new classes were added to the Level 2 map and the overall accuracy improved by around 3%. Considering both user's and producer's estimates for the physiognomic types classification (Table 5), the highest accuracies (>80%) are among the classes non-natural/barren, marsh, seasonally dry tropical forest, riparian forest, savanna, and open savanna. In general, all classes representing savanna and forest ecosystems resulted in a high (>80%) producer's accuracy.
Most studies aiming to test methodological approaches to map Cerrado physiognomic types were developed for one study area, usually of small extent (<50,000 ha) and not covering some major physiognomic types, such as in Ferreira et al. [29], Teixeira et al. [51], and Girolamo-Neto [41,49], which can be problematic for making portability assumptions to other Cerrado areas. Exceptions include Schwieder et al. [25], who tested methods for three study sites of similar vegetation composition, and Silva and Sano [94], who considered four small test sites of different composition to map three major vegetation classes (i.e., savanna, forest, grasslands). To bring a higher level of confidence in testing the portability of our GEOBIA framework to other regions within the core area of the Cerrado, our method was tested for two study sites: a control site with larger availability of datasets (i.e., the Taquara site), and another covering a larger extent (>900,000 ha) and supporting different composition and heterogeneity levels, covering a total of 11 major physiognomic types that are present across the Cerrado. The high accuracy results for both study sites indicate that this framework should be portable to other areas in the core Cerrado region. However, further analysis is necessary to adjust it for areas of transition to other biomes where unique local flora composes additional physiognomic types (e.g., carrasco, capão). In addition, we suggest that future studies explore adapting this framework to similar ecosystems on other continents, such as the African and Australian savannas.
Despite its robust ability to classify a wide range of physiognomic types, our method was not able to differentiate classes of similar structure for which edaphic conditional drivers were not available in our dataset. This is the case for classes that would be separable from each other with detailed information about soil types and/or species composition. For instance, seasonally dry tropical forest located in areas of flat terrain (plateaus/mesas) could not be differentiated from sclerophyll forest, which co-occurs in the same terrain type, so they were combined into a single semi-deciduous forest class. We could only identify seasonally dry tropical forests within steep slopes, which could be discriminated using a fine-scale Digital Elevation Model. Additionally, transitional enclaves of denser caatinga vegetation (a deciduous xerophyte type) could not be differentiated from savanna. Similarly, a rocky savanna type (cerrado rupestre), which is present in the Western Bahia region and structurally similar to open savanna, could not be discriminated.
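The kind of contextual rule discussed here can be illustrated with a small sketch that refines a spectrally classified closed-canopy object using terrain and hydrography. Everything specific in it is hypothetical and not taken from our ruleset: the Level 1 label "closed-canopy forest", the attribute names, the 20-degree slope threshold, the 100-m stream buffer, and the order in which the rules are evaluated.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen only for illustration.
STEEP_SLOPE_DEG = 20.0
RIPARIAN_BUFFER_M = 100.0


@dataclass
class ImageObject:
    level1_class: str        # land cover label from the spectral classification
    mean_slope_deg: float    # mean slope within the object, from a DEM
    dist_to_stream_m: float  # distance from the object to the nearest stream


def refine_forest_class(obj):
    """Assign a forest physiognomic type from contextual attributes."""
    if obj.level1_class != "closed-canopy forest":
        return obj.level1_class  # this rule only applies to forest objects
    if obj.dist_to_stream_m <= RIPARIAN_BUFFER_M:
        return "riparian forest"
    if obj.mean_slope_deg >= STEEP_SLOPE_DEG:
        return "seasonally dry tropical forest"
    # Flat-terrain forest cannot be split further without soil or species data.
    return "semi-deciduous forest"


examples = [
    ImageObject("closed-canopy forest", mean_slope_deg=3.0, dist_to_stream_m=40.0),
    ImageObject("closed-canopy forest", mean_slope_deg=28.0, dist_to_stream_m=900.0),
    ImageObject("closed-canopy forest", mean_slope_deg=2.0, dist_to_stream_m=1500.0),
    ImageObject("open shrubland", mean_slope_deg=25.0, dist_to_stream_m=30.0),
]
for obj in examples:
    print(obj.level1_class, "->", refine_forest_class(obj))
```

The final fallback mirrors the limitation described above: where no edaphic driver separates two structurally similar classes, the rule can only return the merged class.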
Recent advances in remote sensing, such as imaging spectroscopy and Light Detection and Ranging (LiDAR), are improving vegetation studies in savannas [95-97]. They have great potential to overcome gaps and uncertainties related to savanna patterns and processes, such as species discrimination [98-100] and plant community composition [95], as well as major drivers and impacts on woody structure [97,101]. A potential solution for overcoming remaining issues related to Cerrado structure and floristic composition would be the availability of a detailed (< 1:10,000) soil types map or a combination of LiDAR and hyperspectral imagery. Moreover, publicly available multispectral imagery, such as that from the Sentinel-2 MSI sensor, is also promising for improving discrimination of Cerrado physiognomic types due to its combination of fine spatial and spectral properties, free availability, and larger areal coverage allowing for regional scale analysis.

Conclusions
The semi-automatic method proposed here combines image spectral properties (mean reflectance, standard deviation) with standard spectral indices (e.g., NDVI, NDWI) in a Random Forest land cover classification, and uses a novel spatial contextual ruleset to classify land cover categories into physiognomic types in a systematic manner. Our study demonstrates that high spatial resolution imagery is appropriate for discriminating Cerrado land cover classes. The Random Forest algorithm was effective in mapping structural differences within savanna ecosystems, in addition to distinguishing wetlands from terrestrial ecosystems. A combination of ancillary data and spatial rules, however, allowed characterizing physiognomic types while increasing the number of classes and improving map accuracy. Despite the demonstrated success of our method, caveats include high computational costs for processing a large volume of data, the lack of automated methods to determine MRS initial parameters (scale, shape, and compactness), and the limited temporal availability of no-cost RapidEye data for monitoring purposes and for improving class discrimination.
Detailed maps differentiating physiognomic types are essential for conservation strategies, and a consistent classification method is currently lacking in the Cerrado. To the best of our knowledge, our study is the first to propose a systematic method to map Cerrado physiognomic types resulting in a high accuracy assessment and a large number of classes for areas of different heterogeneity. Thus, we conclude that the proposed framework is effective to accurately map physiognomic types across the Cerrado biome at fine spatial scales. Given the availability of RapidEye data for the entire Brazilian Cerrado, application of our framework could improve region-wide mapping in support of conservation and ecological analysis.
Supplementary Materials: The following are available online at www.mdpi.com/2072-4292/12/11/1721/s1, Table S1. RapidEye tiles, acquisition dates, and sensor viewing angles, for each study site; Table S2. Number of training data (polygons) collected for each image and study site; Table S3. Error matrix (reported as ha), overall accuracy, producer's accuracy, and user's accuracy for the Taquara watershed Level 1 classification. The number in parentheses corresponds to the number of independent testing samples used for validation; Table S4. Error matrix (reported as ha), overall accuracy, producer's accuracy, and user's accuracy for the Taquara watershed Level 2 classification. The number in parentheses corresponds to the number of independent testing samples used for validation; Table S5. Error matrix (reported as count of polygons), overall accuracy, producer's accuracy, and user's accuracy for the Taquara watershed Level 1 classification. The number in parentheses corresponds to the number of independent testing samples used for validation; Table S6. Error matrix (reported as count of polygons), overall accuracy, producer's accuracy, and user's accuracy for the Taquara watershed Level 2 classification. The number in parentheses corresponds to the number of independent testing samples used for validation; Table S7. Error matrix (reported as count of polygons), overall accuracy, producer's accuracy, and user's accuracy for the Western Bahia Level 1 classification. The number in parentheses corresponds to the number of independent testing samples used for validation; Table S8. Error matrix (reported as count of polygons), overall accuracy, producer's accuracy, and user's accuracy for the Western Bahia Level 2 classification. The number in parentheses corresponds to the number of independent testing samples used for validation; Table S9. Estimate of landscape composition for the Taquara site considering the proportion of the mapped area of each physiognomic type, reported in percentage, with respect to the total mapped area (i.e., entire study site extent); Figure S1. Map of physiognomic types (Level 2) for the Taquara watershed study site.